sample

geowombat.sample(data, method='random', band=None, n=None, strata=None, spacing=None, min_dist=None, max_attempts=10, num_workers=1, verbose=1, **kwargs)

Generates samples from a raster.

Parameters:
  • data (DataArray) – The xarray.DataArray to extract data from.

  • method (Optional[str]) – The sampling method. Choices are [‘random’, ‘systematic’].

  • band (Optional[int or str]) – The band name to extract from. Only required if method = ‘random’ and strata is given.

  • n (Optional[int]) – The total number of samples. Only required if method = ‘random’.

  • strata (Optional[dict]) –

    The strata to sample within. Each dictionary key-value pair should take the form {'conditional,value': sample size}, where the sample size is either a fraction of the total number of samples or an absolute count.

    E.g.,

    strata = {'==,1': 0.5, '>=,2': 0.5} … would draw 50% of the total samples within class 1 and 50% of the total samples in classes >= 2.

    strata = {'==,1': 10, '>=,2': 20} … would draw 10 samples within class 1 and 20 samples in classes >= 2.

  • spacing (Optional[float]) – The spacing (in map projection units) when method = ‘systematic’.

  • min_dist (Optional[float or int]) – A minimum distance allowed between samples. Only applies when method = ‘random’.

  • max_attempts (Optional[int]) – The maximum number of attempts to sample points > min_dist from each other.

  • num_workers (Optional[int]) – The number of parallel workers for dask.compute().

  • verbose (Optional[int]) – The verbosity level.

  • kwargs (Optional[dict]) – Keyword arguments passed to geowombat.extract.
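
The 'conditional,value' keys of strata can be read as a comparison operator followed by a threshold. The block below is a minimal sketch of that reading in plain Python, not geowombat's own implementation. The helper names (parse_stratum_key, stratum_sample_size), the set of supported operators, the rule that an int is a count while a float is a fraction of n, and the rounding are all assumptions made for illustration.

```python
import operator

# Comparison symbols that a 'conditional,value' key might use (assumed set).
OPERATORS = {
    '==': operator.eq,
    '!=': operator.ne,
    '>': operator.gt,
    '>=': operator.ge,
    '<': operator.lt,
    '<=': operator.le,
}


def parse_stratum_key(key):
    """Split a 'conditional,value' key into (comparison function, threshold)."""
    symbol, value = key.split(',')
    return OPERATORS[symbol.strip()], float(value)


def stratum_sample_size(size, n):
    """Assumed rule: an int is an absolute count, a float is a fraction of n."""
    if isinstance(size, int):
        return size
    return round(n * size)


strata = {'==,1': 0.5, '>=,2': 0.5}
pixels = [0, 1, 1, 2, 3, 1, 2]  # hypothetical class values, one per pixel

plan = {}
for key, size in strata.items():
    compare, threshold = parse_stratum_key(key)
    # Indices of the pixels that fall inside this stratum.
    candidates = [i for i, v in enumerate(pixels) if compare(v, threshold)]
    plan[key] = (candidates, stratum_sample_size(size, n=100))
```

Here plan maps each stratum key to the pixel indices that satisfy its condition and to the number of samples requested from it.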

Return type:

GeoDataFrame

Returns:

geopandas.GeoDataFrame

Examples

>>> import geowombat as gw
>>>
>>> # Sample 100 points randomly across the image
>>> with gw.open('image.tif') as src:
...     df = gw.sample(src, n=100)
>>>
>>> # Sample points systematically (with 10km spacing) across the image
>>> with gw.open('image.tif') as src:
...     df = gw.sample(src, method='systematic', spacing=10000.0)
>>>
>>> # Sample 50% of 100 in class 1 and 50% in classes >= 2
>>> strata = {'==,1': 0.5, '>=,2': 0.5}
>>> with gw.open('image.tif') as src:
...     df = gw.sample(src, band=1, n=100, strata=strata)
>>>
>>> # Specify a per-stratum minimum allowed point distance of 1,000 meters
>>> with gw.open('image.tif') as src:
...     df = gw.sample(src, band=1, n=100, min_dist=1000, strata=strata)
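
To illustrate how min_dist and max_attempts might interact, the block below sketches a simple rejection loop in plain Python. It is not geowombat's implementation: the function name sample_with_min_dist, the candidate grid, and the retry strategy are hypothetical and only show the general idea of redrawing until enough points lie more than min_dist apart or the attempts run out.

```python
import math
import random


def sample_with_min_dist(candidates, n, min_dist, max_attempts=10, seed=None):
    """Draw up to n points that are pairwise more than min_dist apart.

    Hypothetical helper: each attempt draws as many candidates as are still
    needed and keeps only those far enough from every point kept so far.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(max_attempts):
        if len(kept) >= n:
            break
        needed = n - len(kept)
        draw = rng.sample(candidates, min(needed, len(candidates)))
        for point in draw:
            far_enough = all(math.dist(point, other) > min_dist for other in kept)
            if far_enough and len(kept) < n:
                kept.append(point)
    return kept


# Hypothetical candidate coordinates on a 500-unit grid (map projection units).
grid = [(x, y) for x in range(0, 10000, 500) for y in range(0, 10000, 500)]
points = sample_with_min_dist(grid, n=20, min_dist=1000, max_attempts=10, seed=42)
```

With a sketch like this, the function can return fewer than n points when max_attempts is exhausted, which is why a larger max_attempts can help when min_dist is large relative to the sampled area.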