geowombat package
Subpackages
- geowombat.core package
- Submodules
- geowombat.core.api module
- geowombat.core.base module
- geowombat.core.conversion module
Converters
Converters.array_to_polygon()
Converters.bounds_to_coords()
Converters.coords_to_indices()
Converters.dask_to_xarray()
Converters.indices_to_coords()
Converters.lonlat_to_xy()
Converters.ndarray_to_xarray()
Converters.polygon_to_array()
Converters.polygons_to_points()
Converters.prepare_points()
Converters.xarray_to_xdataset()
Converters.xy_to_lonlat()
- geowombat.core.geoxarray module
GeoWombatAccessor
GeoWombatAccessor.apply()
GeoWombatAccessor.assign_nodata_attrs()
GeoWombatAccessor.avi()
GeoWombatAccessor.band_mask()
GeoWombatAccessor.bounds_overlay()
GeoWombatAccessor.calc_area()
GeoWombatAccessor.check_chunksize()
GeoWombatAccessor.clip()
GeoWombatAccessor.clip_by_polygon()
GeoWombatAccessor.compare()
GeoWombatAccessor.compute()
GeoWombatAccessor.data_are_separate
GeoWombatAccessor.data_are_stacked
GeoWombatAccessor.evi()
GeoWombatAccessor.evi2()
GeoWombatAccessor.extract()
GeoWombatAccessor.filenames
GeoWombatAccessor.gcvi()
GeoWombatAccessor.imshow()
GeoWombatAccessor.kndvi()
GeoWombatAccessor.mask()
GeoWombatAccessor.mask_nodata()
GeoWombatAccessor.match_data()
GeoWombatAccessor.moving()
GeoWombatAccessor.n_windows()
GeoWombatAccessor.nbr()
GeoWombatAccessor.ndvi()
GeoWombatAccessor.norm_brdf()
GeoWombatAccessor.norm_diff()
GeoWombatAccessor.read()
GeoWombatAccessor.recode()
GeoWombatAccessor.replace()
GeoWombatAccessor.sample()
GeoWombatAccessor.save()
GeoWombatAccessor.set_nodata()
GeoWombatAccessor.subset()
GeoWombatAccessor.tasseled_cap()
GeoWombatAccessor.to_netcdf()
GeoWombatAccessor.to_polygon()
GeoWombatAccessor.to_raster()
GeoWombatAccessor.to_vector()
GeoWombatAccessor.to_vrt()
GeoWombatAccessor.transform_crs()
GeoWombatAccessor.wi()
GeoWombatAccessor.windows()
- geowombat.core.io module
- geowombat.core.parallel module
- geowombat.core.properties module
DataProperties
DataProperties.affine
DataProperties.altitude
DataProperties.array_is_dask
DataProperties.avail_sensors
DataProperties.band_chunks
DataProperties.bottom
DataProperties.bounds
DataProperties.bounds_as_namedtuple
DataProperties.cellx
DataProperties.cellxh
DataProperties.celly
DataProperties.cellyh
DataProperties.central_um
DataProperties.chunk_grid
DataProperties.col_chunks
DataProperties.crs_to_pyproj
DataProperties.dtype
DataProperties.footprint_grid
DataProperties.geodataframe
DataProperties.geometry
DataProperties.has_band
DataProperties.has_band_coord
DataProperties.has_band_dim
DataProperties.has_time
DataProperties.has_time_coord
DataProperties.has_time_dim
DataProperties.left
DataProperties.meta
DataProperties.nbands
DataProperties.ncols
DataProperties.ndims
DataProperties.nodataval
DataProperties.nrows
DataProperties.ntime
DataProperties.offsetval
DataProperties.pydatetime
DataProperties.right
DataProperties.row_chunks
DataProperties.scaleval
DataProperties.sensor_names
DataProperties.time_chunks
DataProperties.top
DataProperties.transform
DataProperties.unary_union
DataProperties.wavelengths
Metadata
WavelengthsBGR
WavelengthsBGRN
WavelengthsL57
WavelengthsL57Pan
WavelengthsL57Thermal
WavelengthsL8
WavelengthsL8Thermal
WavelengthsL9
WavelengthsL9Thermal
WavelengthsMODSR
WavelengthsPan
WavelengthsRGB
WavelengthsRGBN
WavelengthsS2
WavelengthsS220
WavelengthsS2Cloudless
WavelengthsS2Full
get_sensor_info()
- geowombat.core.series module
BaseSeries
SeriesStats
SeriesStats.abs_slope_q1()
SeriesStats.abs_slope_q2()
SeriesStats.abs_slope_q3()
SeriesStats.abs_slope_q4()
SeriesStats.amp()
SeriesStats.calculate()
SeriesStats.cv()
SeriesStats.max()
SeriesStats.mean()
SeriesStats.mean_abs_diff()
SeriesStats.median()
SeriesStats.min()
SeriesStats.norm_abs_energy()
SeriesStats.percentile()
TimeModule
TimeModulePipeline
TransferLib
- geowombat.core.sops module
- geowombat.core.stac module
STACCollections
STACCollections.cop_dem_glo_30
STACCollections.io_lulc
STACCollections.landsat_c2_l1
STACCollections.landsat_c2_l2
STACCollections.landsat_l8_c2_l2
STACCollections.sentinel_3_lst
STACCollections.sentinel_s1_l1c
STACCollections.sentinel_s2_l1c
STACCollections.sentinel_s2_l2a
STACCollections.sentinel_s2_l2a_cogs
STACCollections.usda_cdl
STACNames
merge_stac()
open_stac()
- geowombat.core.util module
- geowombat.core.vi module
- geowombat.core.windows module
- Module contents
apply()
array_to_polygon()
avi()
bounds_to_coords()
calc_area()
clip()
clip_by_polygon()
coords_to_indices()
coregister()
dask_to_xarray()
evi()
evi2()
extract()
gcvi()
indices_to_coords()
kndvi()
lonlat_to_xy()
mask()
moving()
nbr()
ndarray_to_xarray()
ndvi()
norm_diff()
polygon_to_array()
polygons_to_points()
recode()
replace()
sample()
save()
sort_images_by_date()
subset()
tasseled_cap()
to_netcdf()
to_raster()
to_vrt()
transform_crs()
wi()
xy_to_lonlat()
- geowombat.ml package
- moving
- geowombat.radiometry package
- Submodules
- geowombat.radiometry.angles module
AngleInfo
estimate_cloud_shadows()
get_sentinel_angle_shape()
get_sentinel_crs_transform()
get_sentinel_sensor()
landsat_angle_prep()
landsat_pixel_angles()
open_angle_file()
parse_sentinel_angles()
postprocess_espa_angles()
relative_azimuth()
resample_angles()
run_espa_command()
scattering_angle()
sentinel_pixel_angles()
shift_objects()
transform_angles()
- geowombat.radiometry.brdf module
- geowombat.radiometry.mask module
- geowombat.radiometry.qa module
- geowombat.radiometry.sixs module
- geowombat.radiometry.sr module
- geowombat.radiometry.topo module
- Module contents
- geowombat.tasks package
- geowombat.util package
Submodules
geowombat.config module
- class geowombat.config.update(config={'bigtiff': 'NO', 'blockxsize': 512, 'blockysize': 512, 'compress': None, 'driver': 'GTiff', 'ignore_warnings': False, 'l57_angles_path': None, 'l8_angles_path': None, 'nasa_earthdata_code': None, 'nasa_earthdata_key': None, 'nasa_earthdata_user': None, 'nodata': None, 'offset': None, 'ref_bounds': None, 'ref_crs': None, 'ref_image': None, 'ref_res': None, 'ref_tar': None, 'scale_factor': None, 'sensor': None, 'tiled': True, 'with_config': False}, **kwargs)[source]#
Bases: object
Updates the global configuration parameters. See ‘config.ini’ for parameter options and defaults.
Note that nodata is only used/applied as a value setter during array warping. I.e., the nodata argument is not the input ‘no data’ value itself. Rather, it is used to replace ‘no data’ values in the opened, warped array.
Example
>>> with gw.config.update(sensor='l8'):
>>>     with gw.open('image.tif') as ds:
>>>         print(ds.gw.config)
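The override-then-restore behaviour of a configuration context manager can be illustrated with a small stand-alone sketch. This is not geowombat's implementation: the `_CONFIG` dictionary and the `update_config` function are hypothetical names, and only the general idea (override on entry, restore on exit) is taken from the description above.

```python
from contextlib import contextmanager

# Hypothetical global defaults, loosely modeled on the signature above.
_CONFIG = {"sensor": None, "nodata": None, "with_config": False}


@contextmanager
def update_config(**kwargs):
    """Temporarily override entries of the global configuration."""
    previous = dict(_CONFIG)          # remember the state on entry
    _CONFIG.update(kwargs)
    _CONFIG["with_config"] = True
    try:
        yield _CONFIG
    finally:
        _CONFIG.clear()               # restore the state on exit
        _CONFIG.update(previous)


with update_config(sensor="l8") as cfg:
    inside = dict(cfg)                # snapshot while the override is active

after = dict(_CONFIG)                 # the defaults are back after the block
```

Inside the `with` block the override is visible; once the block exits, the defaults are restored even if an exception was raised.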
geowombat.handler module
Module contents
- class geowombat.TimeModule[source]#
Bases: object
Methods
__call__(w, array, band_dict)
calculate(data) – Calculates the user function.
- abstract calculate(data)[source]#
Calculates the user function.
- Parameters:
data (numpy.ndarray | jax.Array | torch.Tensor | tensorflow.Tensor) – The input array, shaped [time x bands x rows x columns].
- Return type:
Any
- Returns:
numpy.ndarray | jax.Array | torch.Tensor | tensorflow.Tensor, shaped (time|bands x rows x columns)
- class geowombat.TimeModulePipeline(module_list)[source]#
Bases: object
Methods
__call__(w, array, band_dict)
- Parameters:
module_list (List[TimeModule]) –
- geowombat.apply(infile, outfile, block_func, args=None, count=1, scheduler='processes', gdal_cache=512, n_jobs=4, overwrite=False, tags=None, **kwargs)[source]#
Applies a function and writes results to file.
- Parameters:
infile (str) – The input file to process.
outfile (str) – The output file.
block_func (func) – The user function to apply to each block. The function should always return the window, the data, and at least one argument. The block data inside the function will be a 2d array if the input image has 1 band, otherwise a 3d array.
args (Optional[tuple]) – Additional arguments to pass to block_func.
count (Optional[int]) – The band count for the output file.
scheduler (Optional[str]) – The concurrent.futures scheduler to use. Choices are [‘threads’, ‘processes’].
gdal_cache (Optional[int]) – The GDAL cache size (in MB).
n_jobs (Optional[int]) – The number of blocks to process in parallel.
overwrite (Optional[bool]) – Whether to overwrite an existing output file.
tags (Optional[dict]) – Image tags to write to file.
kwargs (Optional[dict]) – Additional keyword arguments to pass to rasterio.open.
- Returns:
None, writes to outfile
Examples
>>> import geowombat as gw
>>>
>>> # Here is a function with no arguments
>>> def my_func0(w, block, arg):
>>>     return w, block
>>>
>>> gw.apply('input.tif',
>>>          'output.tif',
>>>          my_func0,
>>>          n_jobs=8)
>>>
>>> # Here is a function with 1 argument
>>> def my_func1(w, block, arg):
>>>     return w, block * arg
>>>
>>> gw.apply('input.tif',
>>>          'output.tif',
>>>          my_func1,
>>>          args=(10.0,),
>>>          n_jobs=8)
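The block-function contract (receive a window and its data, return the window and the processed data) can be tried without any raster files. The sketch below is a pure-Python stand-in, not geowombat code: `apply_blocks` is a hypothetical driver, and the window is simplified to a `(row_start, row_stop)` pair.

```python
def scale_block(w, block, factor):
    """User function: receives the window, the block data, and one argument."""
    return w, [[value * factor for value in row] for row in block]


def apply_blocks(image, block_func, args=(), block_rows=2):
    """Hypothetical driver: split `image` into row windows and apply `block_func`."""
    out = [None] * len(image)
    for start in range(0, len(image), block_rows):
        stop = min(start + block_rows, len(image))
        w = (start, stop)                       # window as (row_start, row_stop)
        w_out, result = block_func(w, image[start:stop], *args)
        out[w_out[0]:w_out[1]] = result         # write the block back at its window
    return out


image = [[1, 2], [3, 4], [5, 6]]
scaled = apply_blocks(image, scale_block, args=(10.0,))
```

Returning the window alongside the data lets the driver place each result correctly, regardless of the order in which blocks finish.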
- geowombat.array_to_polygon(data, mask=None, connectivity=4, num_workers=1)#
Converts an xarray.DataArray to a geopandas.GeoDataFrame.
- Parameters:
data (DataArray) – The xarray.DataArray to convert.
mask (Optional[str, numpy ndarray, or rasterio Band object]) – Must evaluate to bool (rasterio.bool_ or rasterio.uint8). Values of False or 0 will be excluded from feature generation. Note well that this is the inverse sense from Numpy’s, where a mask value of True indicates invalid data in an array. If source is a Numpy masked array and mask is None, the source’s mask will be inverted and used in place of mask. If mask is equal to ‘source’, then data is used as the mask.
connectivity (Optional[int]) – Use 4 or 8 pixel connectivity for grouping pixels into features.
num_workers (Optional[int]) – The number of parallel workers to send to dask.compute().
- Return type:
GeoDataFrame
- Returns:
geopandas.GeoDataFrame
Example
>>> import geowombat as gw
>>>
>>> with gw.open('image.tif') as src:
>>>
>>>     # Convert the input image to a GeoDataFrame
>>>     df = gw.array_to_polygon(
>>>         src,
>>>         mask='source',
>>>         num_workers=8
>>>     )
- geowombat.avi(data, nodata=None, mask=False, sensor=None, scale_factor=None)#
Calculates the advanced vegetation index
- Parameters:
data (DataArray) – The xarray.DataArray to process.
nodata (Optional[int or float]) – A ‘no data’ value to fill NAs with. If None, the ‘no data’ value is taken from the xarray.DataArray attributes.
mask (Optional[bool]) – Whether to mask the results.
sensor (Optional[str]) – The data’s sensor. If None, the band names should reflect the index being calculated.
scale_factor (Optional[float]) – A scale factor to apply to the data. If None, the scale value is taken from the xarray.DataArray attributes.
Equation:
\[AVI = {(NIR \times (1.0 - red) \times (NIR - red))}^{0.3334}\]
- Returns:
Data range: 0 to 1
- Return type:
xarray.DataArray
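As a quick check of the AVI equation, the following sketch evaluates it for a single pixel. It assumes reflectance values already scaled to 0-1 and NIR greater than or equal to red (a negative base under a fractional power would not give a real number); the function name is illustrative and unrelated to the library's internals.

```python
def avi_pixel(nir, red):
    """Advanced vegetation index for one pixel of 0-1 scaled reflectance."""
    return (nir * (1.0 - red) * (nir - red)) ** 0.3334


# (0.5 * 0.9 * 0.4) ** 0.3334, i.e. 0.18 raised to the power 0.3334
value = avi_pixel(nir=0.5, red=0.1)
```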
- geowombat.bounds_to_coords(bounds, dst_crs)#
Converts bounds from longitude and latitude to native map coordinates.
- Parameters:
bounds (tuple | rasterio.coords.BoundingBox) – The lat/lon bounds to transform.
dst_crs (str, object, or DataArray) – The CRS to transform to. It can be provided as a string, a CRS instance (e.g., pyproj.crs.CRS), or a geowombat.DataArray.
- Returns:
(left, bottom, right, top)
- Return type:
tuple
- geowombat.calc_area(data, values, op='eq', units='km2', row_chunks=None, col_chunks=None, n_workers=1, n_threads=1, scheduler='threads', n_chunks=100)#
Calculates the area of data values.
- Parameters:
data (DataArray) – The xarray.DataArray to calculate area.
values (list) – A list of values.
op (Optional[str]) – The value sign. Choices are [‘gt’, ‘ge’, ‘lt’, ‘le’, ‘eq’].
units (Optional[str]) – The units to return. Choices are [‘km2’, ‘ha’].
row_chunks (Optional[int]) – The row chunk size to process in parallel.
col_chunks (Optional[int]) – The column chunk size to process in parallel.
n_workers (Optional[int]) – The number of parallel workers for scheduler.
n_threads (Optional[int]) – The number of parallel threads for dask.compute().
scheduler (Optional[str]) – The parallel task scheduler to use. Choices are [‘processes’, ‘threads’, ‘mpool’].
mpool: process pool of workers using multiprocessing.Pool
processes: process pool of workers using concurrent.futures
threads: thread pool of workers using concurrent.futures
n_chunks (Optional[int]) – The chunk size of windows. If not given, equal to n_workers x 50.
- Return type:
DataFrame
- Returns:
pandas.DataFrame
Example
>>> import geowombat as gw
>>>
>>> # Read a land cover image with 512x512 chunks
>>> with gw.open('land_cover.tif', chunks=512) as src:
>>>
>>>     df = gw.calc_area(
>>>         src,
>>>         [1, 2, 5],          # calculate the area of classes 1, 2, and 5
>>>         units='km2',        # return area in kilometers squared
>>>         n_workers=4,
>>>         row_chunks=1024,    # iterate over larger chunks to use 512 chunks in parallel
>>>         col_chunks=1024
>>>     )
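The arithmetic behind an area calculation of this kind is a pixel count multiplied by the cell area, then converted to the requested unit. The sketch below shows that arithmetic only; it assumes a projected CRS with cell sizes in meters, and `class_area` is a hypothetical helper, not the library's implementation.

```python
def class_area(pixels, value, cellx, celly, units="km2"):
    """Area covered by `value`, assuming cell sizes are given in meters."""
    count = sum(1 for row in pixels for v in row if v == value)
    square_meters = count * cellx * celly
    divisor = {"km2": 1_000_000.0, "ha": 10_000.0}[units]
    return square_meters / divisor


land_cover = [[1, 1, 2],
              [5, 1, 2]]

# Three 30 m x 30 m pixels of class 1 cover 2700 square meters.
area_km2 = class_area(land_cover, 1, cellx=30.0, celly=30.0)
area_ha = class_area(land_cover, 1, cellx=30.0, celly=30.0, units="ha")
```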
- geowombat.clip(data, df, query=None, mask_data=False, expand_by=0)#
Clips a DataArray by vector polygon geometry.
Deprecated since version 2.1.7: Use geowombat.clip_by_polygon().
- Parameters:
data (DataArray) – The xarray.DataArray to subset.
df (GeoDataFrame or str) – The geopandas.GeoDataFrame or filename to clip to.
query (Optional[str]) – A query to apply to df.
mask_data (Optional[bool]) – Whether to mask values outside of the df geometry envelope.
expand_by (Optional[int]) – Expand the clip array bounds by expand_by pixels on each side.
- Return type:
DataArray
- Returns:
xarray.DataArray
Examples
>>> import geowombat as gw
>>>
>>> with gw.open('image.tif') as ds:
>>>     ds = gw.clip(ds, df, query="Id == 1")
>>>
>>> # or
>>>
>>> with gw.open('image.tif') as ds:
>>>     ds = ds.gw.clip(df, query="Id == 1")
- geowombat.clip_by_polygon(data, df, query=None, mask_data=False, expand_by=0)#
Clips a DataArray by vector polygon geometry.
- Parameters:
data (DataArray) – The xarray.DataArray to subset.
df (GeoDataFrame or str) – The geopandas.GeoDataFrame or filename to clip to.
query (Optional[str]) – A query to apply to df.
mask_data (Optional[bool]) – Whether to mask values outside of the df geometry envelope.
expand_by (Optional[int]) – Expand the clip array bounds by expand_by pixels on each side.
- Return type:
DataArray
- Returns:
xarray.DataArray
Examples
>>> import geowombat as gw
>>>
>>> with gw.open('image.tif') as ds:
>>>     ds = gw.clip_by_polygon(ds, df, query="Id == 1")
- geowombat.coords_to_indices(x, y, transform)#
Converts map coordinates to array indices.
- Parameters:
x (float or 1d array) – The x coordinates.
y (float or 1d array) – The y coordinates.
transform (object) – The affine transform.
- Returns:
(col_index, row_index)
- Return type:
tuple
Example
>>> import geowombat as gw
>>> from geowombat.core import coords_to_indices
>>>
>>> with gw.open('image.tif') as src:
>>>     j, i = coords_to_indices(x, y, src)
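For a north-up raster, converting a map coordinate to an array index amounts to measuring the offset from the upper-left corner in units of cell size. The sketch below shows that calculation under those assumptions (no rotation, positive cell sizes); `xy_to_col_row` and its parameters are illustrative names, not part of geowombat.

```python
import math


def xy_to_col_row(x, y, left, top, cellx, celly):
    """Map coordinates to (col, row) for a north-up grid without rotation."""
    col = math.floor((x - left) / cellx)   # columns increase eastward
    row = math.floor((top - y) / celly)    # rows increase southward
    return col, row


# A point 45 m east and 15 m south of the upper-left corner of a 30 m grid
j, i = xy_to_col_row(
    x=100045.0, y=199985.0, left=100000.0, top=200000.0, cellx=30.0, celly=30.0
)
```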
- geowombat.coregister(target, reference, band_names_reference=None, band_names_target=None, wkt_version=None, **kwargs)#
Co-registers an image, or images, using AROSICS.
While the required inputs are DataArrays, the intermediate results are stored as NumPy arrays. Therefore, memory usage is constrained to the size of the input data. Dask is not used for any of the computation in this function.
- Parameters:
target (DataArray or str) – The target xarray.DataArray or file name to co-register to reference.
reference (DataArray or str) – The reference xarray.DataArray or file name used to co-register target.
band_names_reference (Optional[list | tuple]) – Band names to open for the reference data.
band_names_target (Optional[list | tuple]) – Band names to open for the target data.
wkt_version (Optional[str]) – The WKT version to use with to_wkt().
kwargs (Optional[dict]) – Keyword arguments passed to arosics.
- Reference:
- Return type:
DataArray
- Returns:
xarray.DataArray
- Parameters:
target (str | Path | DataArray) –
reference (str | Path | DataArray) –
band_names_reference (Sequence[str] | None) –
band_names_target (Sequence[str] | None) –
wkt_version (str | None) –
Example
>>> import geowombat as gw
>>>
>>> # Co-register a single image to a reference image
>>> with gw.open('target.tif') as tar, gw.open('reference.tif') as ref:
>>>     results = gw.coregister(
>>>         tar, ref, q=True, ws=(512, 512), max_shift=3, CPUs=4
>>>     )
>>>
>>> # or
>>>
>>> results = gw.coregister(
>>>     'target.tif',
>>>     'reference.tif',
>>>     q=True,
>>>     ws=(512, 512),
>>>     max_shift=3,
>>>     CPUs=4
>>> )
- geowombat.evi(data, nodata=None, mask=False, sensor=None, scale_factor=None)#
Calculates the enhanced vegetation index
- Parameters:
data (DataArray) – The xarray.DataArray to process.
nodata (Optional[int or float]) – A ‘no data’ value to fill NAs with. If None, the ‘no data’ value is taken from the xarray.DataArray attributes.
mask (Optional[bool]) – Whether to mask the results.
sensor (Optional[str]) – The data’s sensor. If None, the band names should reflect the index being calculated.
scale_factor (Optional[float]) – A scale factor to apply to the data. If None, the scale value is taken from the xarray.DataArray attributes.
Equation:
\[EVI = 2.5 \times \frac{NIR - red}{NIR + 6 \times red - 7.5 \times blue + 1}\]
- Returns:
Data range: 0 to 1
- Return type:
xarray.DataArray
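The EVI equation can be evaluated for a single pixel as follows. This is a sketch of the formula only, assuming reflectance values already scaled to 0-1; the function name is illustrative.

```python
def evi_pixel(nir, red, blue):
    """Enhanced vegetation index for one pixel of 0-1 scaled reflectance."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Numerator: 2.5 * 0.4 = 1.0; denominator: 0.5 + 0.6 - 0.375 + 1.0 = 1.725
value = evi_pixel(nir=0.5, red=0.1, blue=0.05)
```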
- geowombat.evi2(data, nodata=None, mask=False, sensor=None, scale_factor=None)#
Calculates the two-band modified enhanced vegetation index
- Parameters:
data (DataArray) – The xarray.DataArray to process.
nodata (Optional[int or float]) – A ‘no data’ value to fill NAs with. If None, the ‘no data’ value is taken from the xarray.DataArray attributes.
mask (Optional[bool]) – Whether to mask the results.
sensor (Optional[str]) – The data’s sensor. If None, the band names should reflect the index being calculated.
scale_factor (Optional[float]) – A scale factor to apply to the data. If None, the scale value is taken from the xarray.DataArray attributes.
Equation:
\[EVI2 = 2.5 \times \frac{NIR - red}{NIR + 1 + 2.4 \times red}\]
- Reference:
See [JHDM08]
- Returns:
Data range: 0 to 1
- Return type:
xarray.DataArray
- geowombat.extract(data, aoi, bands=None, time_names=None, band_names=None, frac=1.0, min_frac_area=None, all_touched=False, id_column='id', time_format='%Y%m%d', mask=None, n_jobs=8, verbose=0, n_workers=1, n_threads=-1, use_client=False, address=None, total_memory=24, processes=False, pool_kwargs=None, **kwargs)#
Extracts data within an area or points of interest. Projections do not need to match, as they are handled ‘on-the-fly’.
- Parameters:
data (DataArray) – The xarray.DataArray to extract data from.
aoi (str or GeoDataFrame) – A file or geopandas.GeoDataFrame defining the areas or points to extract.
bands (Optional[int or 1d array-like]) – A band or list of bands to extract. If not given, all bands are used. Bands should be GDAL-indexed (i.e., the first band is 1, not 0).
band_names (Optional[list]) – A list of band names. Length should be the same as bands.
time_names (Optional[list]) – A list of time names.
frac (Optional[float]) – A fractional subset of points to extract in each polygon feature.
min_frac_area (Optional[int | float]) – A minimum polygon area to use frac. Otherwise, use all samples within a polygon.
all_touched (Optional[bool]) – The all_touched argument is passed to rasterio.features.rasterize().
id_column (Optional[str]) – The id column name.
time_format (Optional[str]) – The datetime conversion format if time_names are datetime objects.
mask (Optional[GeoDataFrame or Shapely Polygon]) – A shapely.geometry.Polygon mask to subset to.
n_jobs (Optional[int]) – The number of features to rasterize in parallel.
verbose (Optional[int]) – The verbosity level.
n_workers (Optional[int]) – The number of process workers. Only applies when use_client=True.
n_threads (Optional[int]) – The number of thread workers. Only applies when use_client=True.
use_client (Optional[bool]) – Whether to use a dask client.
address (Optional[str]) – A cluster address to pass to client. Only used when use_client=True.
total_memory (Optional[int]) – The total memory (in GB) required when use_client=True.
processes (Optional[bool]) – Whether to use process workers with the dask.distributed client. Only applies when use_client=True.
pool_kwargs (Optional[dict]) – Keyword arguments passed to multiprocessing.Pool().imap().
kwargs (Optional[dict]) – Keyword arguments passed to dask.compute().
- Return type:
GeoDataFrame
- Returns:
geopandas.GeoDataFrame
Examples
>>> import geowombat as gw
>>> from dask.distributed import LocalCluster
>>>
>>> with gw.open('image.tif') as src:
>>>     df = gw.extract(src, 'poly.gpkg')
>>>
>>> # On a cluster
>>> # Use a local cluster
>>> with gw.open('image.tif') as src:
>>>     df = gw.extract(src, 'poly.gpkg', use_client=True, n_threads=16)
>>>
>>> # Specify the client address with a local cluster
>>> with LocalCluster(
>>>     n_workers=1,
>>>     threads_per_worker=8,
>>>     scheduler_port=0,
>>>     processes=False,
>>>     memory_limit='4GB'
>>> ) as cluster:
>>>
>>>     with gw.open('image.tif') as src:
>>>         df = gw.extract(
>>>             src,
>>>             'poly.gpkg',
>>>             use_client=True,
>>>             address=cluster
>>>         )
- geowombat.indices_to_coords(col_index, row_index, transform)#
Converts array indices to map coordinates.
- Parameters:
col_index (float or 1d array) – The column index.
row_index (float or 1d array) – The row index.
transform (Affine, DataArray, or tuple) – The affine transform.
- Returns:
(x, y)
- Return type:
tuple
Example
>>> import geowombat as gw
>>> from geowombat.core import indices_to_coords
>>>
>>> with gw.open('image.tif') as src:
>>>     x, y = indices_to_coords(j, i, src)
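The reverse conversion, from array indices to map coordinates, follows from the same north-up grid geometry. The sketch below assumes no rotation and exposes a `center` flag to choose between the cell's upper-left corner and its center; which of the two the library returns is not stated above, so both are shown. `col_row_to_xy` is an illustrative name.

```python
def col_row_to_xy(col, row, left, top, cellx, celly, center=False):
    """(col, row) to map coordinates for a north-up grid without rotation."""
    shift = 0.5 if center else 0.0         # half-cell shift selects the cell center
    x = left + (col + shift) * cellx
    y = top - (row + shift) * celly
    return x, y


corner = col_row_to_xy(1, 0, left=100000.0, top=200000.0, cellx=30.0, celly=30.0)
middle = col_row_to_xy(
    1, 0, left=100000.0, top=200000.0, cellx=30.0, celly=30.0, center=True
)
```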
- geowombat.kndvi(data, nodata=None, mask=False, sensor=None, scale_factor=None)#
Calculates the kernel normalized difference vegetation index
- Parameters:
data (DataArray) – The xarray.DataArray to process.
nodata (Optional[int or float]) – A ‘no data’ value to fill NAs with. If None, the ‘no data’ value is taken from the xarray.DataArray attributes.
mask (Optional[bool]) – Whether to mask the results.
sensor (Optional[str]) – The data’s sensor. If None, the band names should reflect the index being calculated.
scale_factor (Optional[float]) – A scale factor to apply to the data. If None, the scale value is taken from the xarray.DataArray attributes.
Equation:
\[kNDVI = \tanh({NDVI}^2)\]
- Reference:
- Returns:
Data range: -1 to 1
- Return type:
xarray.DataArray
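The kNDVI equation builds on NDVI, so a single-pixel evaluation takes two steps. The sketch below only illustrates the formula; the function names are illustrative and the inputs are assumed to be reflectance values scaled to 0-1.

```python
import math


def ndvi_pixel(nir, red):
    """Normalized difference vegetation index for one pixel."""
    return (nir - red) / (nir + red)


def kndvi_pixel(nir, red):
    """Kernel NDVI: the hyperbolic tangent of the squared NDVI."""
    return math.tanh(ndvi_pixel(nir, red) ** 2)


# NDVI = 0.4 / 0.6, which is then squared and passed through tanh
value = kndvi_pixel(nir=0.5, red=0.1)
```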
- geowombat.load(image_list, time_names, band_names, chunks=512, nodata=65535, in_range=None, out_range=None, data_slice=None, num_workers=1, src=None, scheduler='ray')[source]#
Loads data into memory using xarray.open_mfdataset() and ray. This function does not check data alignments and CRSs. It assumes each image in image_list has the same y and x dimensions and that the coordinates align.
The load function cannot be used if dataclasses was pip installed.
- Parameters:
image_list (list) – The list of image file paths.
time_names (list) – The list of image datetime objects.
band_names (list) – The list of bands to open.
chunks (Optional[int]) – The dask chunk size.
nodata (Optional[float | int]) – The ‘no data’ value.
in_range (Optional[tuple]) – The input (min, max) range. If not given, defaults to (0, 10000).
out_range (Optional[tuple]) – The output (min, max) range. If not given, defaults to (0, 1).
data_slice (Optional[tuple]) – The slice object to read, given as (time, bands, rows, columns).
num_workers (Optional[int]) – The number of threads.
scheduler (Optional[str]) – The distributed scheduler. Currently not implemented.
- Returns:
Datetime list, array of (time x bands x rows x columns)
- Return type:
list, numpy.ndarray
Example
>>> import datetime
>>> import geowombat as gw
>>>
>>> image_names = ['LT05_L1TP_227082_19990311_20161220_01_T1.nc',
>>>                'LT05_L1TP_227081_19990311_20161220_01_T1.nc',
>>>                'LT05_L1TP_227082_19990327_20161220_01_T1.nc']
>>>
>>> image_dates = [datetime.datetime(1999, 3, 11, 0, 0),
>>>                datetime.datetime(1999, 3, 11, 0, 0),
>>>                datetime.datetime(1999, 3, 27, 0, 0)]
>>>
>>> data_slice = (slice(0, None), slice(0, None), slice(0, 64), slice(0, 64))
>>>
>>> # Load data into memory
>>> dates, y = gw.load(image_names,
>>>                    image_dates,
>>>                    ['red', 'nir'],
>>>                    chunks=512,
>>>                    nodata=65535,
>>>                    data_slice=data_slice,
>>>                    num_workers=4)
- geowombat.lonlat_to_xy(lon, lat, dst_crs)#
Converts from longitude and latitude to native map coordinates.
- Parameters:
lon (float) – The longitude to convert.
lat (float) – The latitude to convert.
dst_crs (str, object, or DataArray) – The CRS to transform to. It can be provided as a string, a CRS instance (e.g., pyproj.crs.CRS), or a geowombat.DataArray.
- Returns:
(x, y)
- Return type:
tuple
Example
>>> import geowombat as gw
>>> from geowombat.core import lonlat_to_xy
>>>
>>> lon, lat = -55.56822206, -25.46214220
>>>
>>> with gw.open('image.tif') as src:
>>>     x, y = lonlat_to_xy(lon, lat, src)
- geowombat.mask(data, dataframe, query=None, keep='in')#
Masks a DataArray by vector polygon geometry.
- Parameters:
data (DataArray) – The xarray.DataArray to mask.
dataframe (GeoDataFrame or str) – The geopandas.GeoDataFrame or filename to use for masking.
query (Optional[str]) – A query to apply to dataframe.
keep (Optional[str]) – If keep = ‘in’, mask values outside of the geometry (keep inside). Otherwise, if keep = ‘out’, mask values inside (keep outside).
- Return type:
DataArray
- Returns:
xarray.DataArray
Examples
>>> import geowombat as gw
>>>
>>> with gw.open('image.tif') as ds:
>>>     ds = ds.gw.mask(df)
- geowombat.moving(data, stat='mean', perc=50, w=3, nodata=None, weights=False)#
Applies a moving window function over Dask array blocks.
- Parameters:
data (DataArray) – The xarray.DataArray to process.
stat (Optional[str]) – The statistic to compute. Choices are [‘mean’, ‘std’, ‘var’, ‘min’, ‘max’, ‘perc’].
perc (Optional[int]) – The percentile to return if stat = ‘perc’.
w (Optional[int]) – The moving window size (in pixels).
nodata (Optional[int or float]) – A ‘no data’ value to ignore.
weights (Optional[bool]) – Whether to weight values by distance from window center.
- Return type:
DataArray
- Returns:
xarray.DataArray
Examples
>>> import geowombat as gw
>>>
>>> # Calculate the mean within a 5x5 window
>>> with gw.open('image.tif') as src:
>>>     res = gw.moving(src, stat='mean', w=5, nodata=32767.0)
>>>
>>> # Calculate the 90th percentile within a 15x15 window
>>> with gw.open('image.tif') as src:
>>>     res = gw.moving(src, stat='perc', w=15, perc=90, nodata=32767.0)
>>>     res.data.compute(num_workers=4)
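What a moving-window mean computes can be seen on a small grid. The sketch below is a plain-Python illustration, not the library's Dask-based implementation; in particular, truncating the window at the array edges is an assumption made here for simplicity, and `moving_mean` is a hypothetical name.

```python
def moving_mean(grid, w=3, nodata=None):
    """Mean of each w x w neighborhood, skipping `nodata` cells."""
    half = w // 2
    nrows, ncols = len(grid), len(grid[0])
    out = [[nodata] * ncols for _ in range(nrows)]
    for i in range(nrows):
        for j in range(ncols):
            # Collect valid neighbors; the window is clipped at the edges.
            values = [
                grid[r][c]
                for r in range(max(0, i - half), min(nrows, i + half + 1))
                for c in range(max(0, j - half), min(ncols, j + half + 1))
                if grid[r][c] != nodata
            ]
            if values:
                out[i][j] = sum(values) / len(values)
    return out


grid = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0]]
smoothed = moving_mean(grid, w=3)

# Same grid with the center flagged as 'no data'
flagged = [[1.0, 2.0, 3.0],
           [4.0, 32767.0, 6.0],
           [7.0, 8.0, 9.0]]
smoothed_flagged = moving_mean(flagged, w=3, nodata=32767.0)
```

The center of `smoothed` averages all nine cells, while the center of `smoothed_flagged` averages only the eight valid neighbors.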
- geowombat.nbr(data, nodata=None, mask=False, sensor=None, scale_factor=None)#
Calculates the normalized burn ratio
- Parameters:
data (DataArray) – The xarray.DataArray to process.
nodata (Optional[int or float]) – A ‘no data’ value to fill NAs with. If None, the ‘no data’ value is taken from the xarray.DataArray attributes.
mask (Optional[bool]) – Whether to mask the results.
sensor (Optional[str]) – The data’s sensor. If None, the band names should reflect the index being calculated.
scale_factor (Optional[float]) – A scale factor to apply to the data. If None, the scale value is taken from the xarray.DataArray attributes.
Equation:
\[NBR = \frac{NIR - SWIR2}{NIR + SWIR2}\]
- Returns:
Data range: -1 to 1
- Return type:
xarray.DataArray
- geowombat.ndvi(data, nodata=None, mask=False, sensor=None, scale_factor=None)#
Calculates the normalized difference vegetation index
- Parameters:
data (DataArray) – The xarray.DataArray to process.
nodata (Optional[int or float]) – A ‘no data’ value to fill NAs with. If None, the ‘no data’ value is taken from the xarray.DataArray attributes.
mask (Optional[bool]) – Whether to mask the results.
sensor (Optional[str]) – The data’s sensor. If None, the band names should reflect the index being calculated.
scale_factor (Optional[float]) – A scale factor to apply to the data. If None, the scale value is taken from the xarray.DataArray attributes.
Equation:
\[NDVI = \frac{NIR - red}{NIR + red}\]
- Returns:
Data range: -1 to 1
- Return type:
xarray.DataArray
- geowombat.norm_diff(data, b1, b2, sensor=None, nodata=None, mask=False, scale_factor=None)#
Calculates the normalized difference band ratio
- Parameters:
data (DataArray) – The xarray.DataArray to process.
b1 (str) – The band name of the first band.
b2 (str) – The band name of the second band.
sensor (Optional[str]) – The data’s sensor. If None, the band names should reflect the index being calculated.
nodata (Optional[int or float]) – A ‘no data’ value to fill NAs with. If None, the ‘no data’ value is taken from the xarray.DataArray attributes.
mask (Optional[bool]) – Whether to mask the results.
scale_factor (Optional[float]) – A scale factor to apply to the data. If None, the scale value is taken from the xarray.DataArray attributes.
Equation:
\[{norm}_{diff} = \frac{b2 - b1}{b2 + b1}\]
- Returns:
Data range: -1 to 1
- Return type:
xarray.DataArray
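The normalized difference equation generalizes the band-specific indices: with b1 set to red and b2 set to NIR it reduces to the NDVI equation, and with b1 set to SWIR2 and b2 set to NIR it reduces to NBR. The sketch below evaluates it on scalar reflectance values; the function takes numbers rather than band names, so it illustrates the formula and not the library call.

```python
def norm_diff_pixel(b1, b2):
    """Normalized difference of two band values: (b2 - b1) / (b2 + b1)."""
    return (b2 - b1) / (b2 + b1)


red, nir = 0.1, 0.5

# With b1 = red and b2 = NIR, this matches the NDVI equation.
ndvi_value = norm_diff_pixel(b1=red, b2=nir)
ndvi_direct = (nir - red) / (nir + red)
```

Swapping the two bands flips the sign of the result, so the argument order matters.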
- class geowombat.open(filename, band_names=None, time_names=None, stack_dim='time', bounds=None, bounds_by='reference', resampling='nearest', persist_filenames=False, netcdf_vars=None, mosaic=False, overlap='max', nodata=None, scale_factor=None, offset=None, dtype=None, scale_data=False, num_workers=1, **kwargs)[source]#
Bases:
object
Opens one or more raster files.
- Parameters:
filename (str or list) – The file name, search string, or a list of files to open.
band_names (Optional[1d array-like]) – A list of band names if
bounds
is given orwindow
is given. Default is None.time_names (Optional[1d array-like]) – A list of names to give the time dimension if
bounds
is given. Default is None.stack_dim (Optional[str]) – The stack dimension. Choices are [‘time’, ‘band’].
bounds (Optional[1d array-like]) – A bounding box to subset to, given as [minx, maxy, miny, maxx]. Default is None.
bounds_by (Optional[str]) –
How to concatenate the output extent if
filename
is alist
andmosaic
=False
. Choices are [‘intersection’, ‘union’, ‘reference’]. * reference: Use the bounds of the reference image. If aref_image
is not given, the first image inthe
filename
list is used.intersection: Use the intersection (i.e., minimum extent) of all the image bounds
union: Use the union (i.e., maximum extent) of all the image bounds
resampling (Optional[str]) – The resampling method if
filename
is alist
. Choices are [‘average’, ‘bilinear’, ‘cubic’, ‘cubic_spline’, ‘gauss’, ‘lanczos’, ‘max’, ‘med’, ‘min’, ‘mode’, ‘nearest’].persist_filenames (Optional[bool]) – Whether to persist the filenames list with the
xarray.DataArray
attributes. By default,persist_filenames=False
to avoid storing large file lists.netcdf_vars (Optional[list]) – NetCDF variables to open as a band stack.
mosaic (Optional[bool]) – If
filename
is alist
, whether to mosaic the arrays instead of stacking.overlap (Optional[str]) – The keyword that determines how to handle overlapping data if
filenames
is alist
. Choices are [‘min’, ‘max’, ‘mean’].nodata (Optional[float | int]) –
A ‘no data’ value to set. Default is
None
. Ifnodata
isNone
, the ‘no data’ value is set from the file metadata. Ifnodata
is given, then the file ‘no data’ value is overridden. See docstring examples for use ofnodata
ingeowombat.config.update
.Note
The
geowombat.config.update
overrides this argument. Thus, preference is always given in the following order:geowombat.config.update(nodata not None)
open(nodata not None)
file ‘no data’ value from metadata ‘_FillValue’ or ‘nodatavals’
scale_factor (Optional[float | int]) –
A scale value to apply to the opened data. The same rules used in
nodata
apply. I.e.,Note
The
geowombat.config.update
overrides this argument. Thus, preference is always given in the following order:geowombat.config.update(scale_factor not None)
open(scale_factor not None)
file scale value from metadata ‘scales’
offset (Optional[float | int]) –
An offset value to apply to the opened data. The same rules used in
nodata
apply. I.e.,Note
The
geowombat.config.update
overrides this argument. Thus, preference is always given in the following order:geowombat.config.update(offset not None)
open(offset not None)
file offset value from metadata ‘offsets’
dtype (Optional[str]) – A data type to force the output to. If not given, the data type is extracted from the file.
scale_data (Optional[bool]) – Whether to apply scaling to the opened data. Default is False. Scaled data are returned as:
scaled = data * gain + offset
See the arguments nodata, scale_factor, and offset for rules regarding how scaling is applied.
num_workers (Optional[int]) – The number of parallel workers for Dask if bounds is given or window is given. Default is 1.
kwargs (Optional[dict]) – Keyword arguments passed to the file opener.
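The order of preference documented for nodata, scale_factor, and offset can be sketched in plain Python. This is only an illustration of the rule, not geowombat's implementation; `resolve_value` and its argument names are hypothetical.

```python
def resolve_value(config_value, open_value, file_value):
    """Return the first value that is not None, in the documented order.

    Hypothetical helper: the configuration value wins over the value passed
    to open(), which wins over the value stored in the file metadata.
    """
    for candidate in (config_value, open_value, file_value):
        if candidate is not None:
            return candidate
    return None


# A configuration value of 0 is still "not None", so it takes priority.
from_config = resolve_value(0, 255, 65535)
# Without a configuration value, the open() argument is used.
from_open = resolve_value(None, 255, 65535)
# With neither, the file metadata value is used.
from_file = resolve_value(None, None, 65535)
```

The check is against None rather than truthiness, so a legitimate value of 0 is not skipped.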
- Returns:
xarray.DataArray or xarray.Dataset
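The scaling rule given for scale_data (scaled = data * gain + offset) can be sketched on a plain list of values. This sketch assumes that gain corresponds to the scale factor and that ‘no data’ cells are converted to NaN instead of being scaled; both are assumptions for illustration, and `scale_values` is a hypothetical helper.

```python
import math


def scale_values(values, scale_factor=1.0, offset=0.0, nodata=None):
    """Apply scaled = data * gain + offset to each value.

    Assumption: cells equal to ``nodata`` become NaN instead of being scaled.
    """
    scaled = []
    for value in values:
        if nodata is not None and value == nodata:
            scaled.append(math.nan)
        else:
            scaled.append(value * scale_factor + offset)
    return scaled


# Integer reflectance stored as [0, 10000], with 0 as the 'no data' value
raw = [0, 2500, 10000]
scaled = scale_values(raw, scale_factor=1e-4, offset=0.0, nodata=0)
```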
Examples
>>> import geowombat as gw
>>>
>>> # Open an image
>>> with gw.open('image.tif') as ds:
>>>     print(ds)
>>>
>>> # Open a list of images, stacking along the 'time' dimension
>>> with gw.open(['image1.tif', 'image2.tif']) as ds:
>>>     print(ds)
>>>
>>> # Open all GeoTiffs in a directory, stack along the 'time' dimension
>>> with gw.open('*.tif') as ds:
>>>     print(ds)
>>>
>>> # Use a context manager to handle images of different sizes and projections
>>> with gw.config.update(ref_image='image1.tif'):
>>>     # Use 'time' names to stack and mosaic non-aligned images with identical dates.
>>>     # The first two images were acquired on the same date
>>>     # and will be merged into a single time layer.
>>>     with gw.open(
>>>         ['image1.tif', 'image2.tif', 'image3.tif'],
>>>         time_names=['date1', 'date1', 'date2']
>>>     ) as ds:
>>>         print(ds)
>>>
>>> # Mosaic images across space using a reference
>>> # image for the CRS and cell resolution
>>> with gw.config.update(ref_image='image1.tif'):
>>>     with gw.open(['image1.tif', 'image2.tif'], mosaic=True) as ds:
>>>         print(ds)
>>>
>>> # Mix configuration keywords
>>> with gw.config.update(ref_crs='image1.tif', ref_res='image1.tif', ref_bounds='image2.tif'):
>>>     # The ``bounds_by`` keyword overrides the extent bounds
>>>     with gw.open(['image1.tif', 'image2.tif'], bounds_by='union') as ds:
>>>         print(ds)
>>>
>>> # Resample an image to 10m x 10m cell size
>>> with gw.config.update(ref_res=(10, 10)):
>>>     with gw.open('image.tif', resampling='cubic') as ds:
>>>         print(ds)
>>>
>>> # Open a list of images at a window slice
>>> from rasterio.windows import Window
>>> # Stack two images, opening band 3
>>> with gw.open(
>>>     ['image1.tif', 'image2.tif'],
>>>     band_names=['date1', 'date2'],
>>>     num_workers=8,
>>>     indexes=3,
>>>     window=Window(row_off=0, col_off=0, height=100, width=100),
>>>     dtype='float32'
>>> ) as ds:
>>>     print(ds)
>>>
>>> # Scale data upon opening, using the image metadata to get scales and offsets
>>> with gw.open('image.tif', scale_data=True) as ds:
>>>     print(ds)
>>>
>>> # Scale data upon opening, specifying scales and overriding metadata
>>> with gw.open('image.tif', scale_data=True, scale_factor=1e-4) as ds:
>>>     print(ds)
>>>
>>> # Scale data upon opening, specifying scales through the configuration
>>> with gw.config.update(scale_factor=1e-4):
>>>     with gw.open('image.tif', scale_data=True) as ds:
>>>         print(ds)
>>>
>>> # Open a NetCDF variable, specifying a NetCDF prefix and variable to open
>>> with gw.open('netcdf:image.nc:blue') as src:
>>>     print(src)
>>>
>>> # Open a NetCDF image without access to transforms by providing full file path
>>> # NOTE: This will be faster than the above method
>>> # as it uses ``xarray.open_dataset`` and bypasses CRS checks.
>>> # NOTE: The chunks must be provided by the user.
>>> # NOTE: Providing band names will ensure the correct order when reading from a NetCDF dataset.
>>> with gw.open(
>>>     'image.nc',
>>>     chunks={'band': -1, 'y': 256, 'x': 256},
>>>     band_names=['blue', 'green', 'red', 'nir', 'swir1', 'swir2'],
>>>     engine='h5netcdf'
>>> ) as src:
>>>     print(src)
>>>
>>> # Open multiple NetCDF variables as an array stack
>>> with gw.open('netcdf:image.nc', netcdf_vars=['blue', 'green', 'red']) as src:
>>>     print(src)
Methods
close
- geowombat.polygon_to_array(polygon, col=None, data=None, cellx=None, celly=None, band_name=None, row_chunks=512, col_chunks=512, src_res=None, fill=0, default_value=1, all_touched=True, dtype='uint8', sindex=None, tap=False, bounds_by='intersection')
Converts a polygon geometry to an xarray.DataArray.
- Parameters:
polygon (GeoDataFrame | str) – The geopandas.GeoDataFrame or file with polygon geometry.
col (Optional[str]) – The column in polygon you want to assign values from. If not set, creates a binary raster.
data (Optional[DataArray]) – An xarray.DataArray to use as a reference for rasterizing.
cellx (Optional[float]) – The output cell x size.
celly (Optional[float]) – The output cell y size.
band_name (Optional[list]) – The xarray.DataArray band name.
row_chunks (Optional[int]) – The dask row chunk size.
col_chunks (Optional[int]) – The dask column chunk size.
src_res (Optional[tuple]) – A source resolution to align to.
fill (Optional[int]) – Used as fill value for all areas not covered by input geometries to rasterio.features.rasterize.
default_value (Optional[int]) – Used as value for all geometries, if not provided in shapes to rasterio.features.rasterize.
all_touched (Optional[bool]) – If True, all pixels touched by geometries will be burned in. If False, only pixels whose center is within the polygon or that are selected by Bresenham’s line algorithm will be burned in. The all_touched value for rasterio.features.rasterize().
dtype (Optional[str | numpy data type]) – The output data type for rasterio.features.rasterize().
sindex (Optional[object]) – An instance of geopandas.GeoDataFrame.sindex.
tap (Optional[bool]) – Whether to target align pixels.
bounds_by (Optional[str]) – How to concatenate the output extent. Choices are [‘intersection’, ‘union’, ‘reference’].
- reference: Use the bounds of the reference image
- intersection: Use the intersection (i.e., minimum extent) of all the image bounds
- union: Use the union (i.e., maximum extent) of all the image bounds
- Return type:
DataArray
- Returns:
xarray.DataArray
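The three bounds_by choices can be illustrated with a small bounding-box helper. This is a sketch of the geometry only, not the library's code; `Bounds` and `combine_bounds` are hypothetical names, and bounds are assumed to be ordered (left, bottom, right, top).

```python
from typing import NamedTuple, Optional, Sequence


class Bounds(NamedTuple):
    left: float
    bottom: float
    right: float
    top: float


def combine_bounds(
    bounds_list: Sequence[Bounds],
    bounds_by: str = 'intersection',
    reference: Optional[Bounds] = None,
) -> Bounds:
    """Combine image bounds by intersection, union, or a reference extent."""
    if bounds_by == 'reference':
        if reference is None:
            raise ValueError('A reference extent is required.')
        return reference
    lefts, bottoms, rights, tops = zip(*bounds_list)
    if bounds_by == 'intersection':
        # Minimum extent shared by all images
        return Bounds(max(lefts), max(bottoms), min(rights), min(tops))
    if bounds_by == 'union':
        # Maximum extent covered by any image
        return Bounds(min(lefts), min(bottoms), max(rights), max(tops))
    raise ValueError(f'Unknown bounds_by choice: {bounds_by}')


a = Bounds(0.0, 0.0, 100.0, 100.0)
b = Bounds(50.0, 50.0, 150.0, 150.0)
shared = combine_bounds([a, b], bounds_by='intersection')
full = combine_bounds([a, b], bounds_by='union')
```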
Example
>>> import geowombat as gw
>>> import geopandas as gpd
>>>
>>> df = gpd.read_file('polygons.gpkg')
>>>
>>> # 100x100 cell size
>>> data = gw.polygon_to_array(df, cellx=100.0, celly=100.0)
>>>
>>> # Align to an existing image
>>> with gw.open('image.tif') as src:
>>>     data = gw.polygon_to_array(df, data=src)
- geowombat.polygons_to_points(data, df, frac=1.0, min_frac_area=None, all_touched=False, id_column='id', n_jobs=1, **kwargs)
Converts polygons to points.
- Parameters:
data (DataArray or Dataset) – The xarray.DataArray or xarray.Dataset.
df (GeoDataFrame) – The geopandas.GeoDataFrame containing the geometry to rasterize.
frac (Optional[float]) – A fractional subset of points to extract in each feature.
min_frac_area (Optional[int | float]) – A minimum polygon area to use frac. Otherwise, use all samples within a polygon.
all_touched (Optional[bool]) – The all_touched argument is passed to rasterio.features.rasterize.
id_column (Optional[str]) – The ‘id’ column.
n_jobs (Optional[int]) – The number of features to rasterize in parallel.
kwargs (Optional[dict]) – Keyword arguments passed to multiprocessing.Pool().imap.
- Returns:
geopandas.GeoDataFrame
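The interaction between frac and min_frac_area can be sketched as follows. The helper `subsample_points`, the seeded random generator, and the rounding of the subset size are assumptions made for illustration; the library's own sampling may differ.

```python
import random


def subsample_points(points, polygon_area, frac=1.0, min_frac_area=None, seed=0):
    """Keep a fraction of a polygon's points.

    Assumption: polygons smaller than ``min_frac_area`` keep all of their
    points, and larger polygons keep ``frac`` of them.
    """
    points = list(points)
    if min_frac_area is not None and polygon_area < min_frac_area:
        return points
    n_keep = int(round(len(points) * frac))
    rng = random.Random(seed)
    return rng.sample(points, n_keep)


cells = [(x, y) for x in range(5) for y in range(2)]  # 10 candidate points
large = subsample_points(cells, polygon_area=500.0, frac=0.5, min_frac_area=100.0)
small = subsample_points(cells, polygon_area=50.0, frac=0.5, min_frac_area=100.0)
```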
- geowombat.recode(data, polygon, to_replace, num_workers=1)
Recodes a DataArray with polygon mappings.
- Parameters:
data (DataArray) – The xarray.DataArray to recode.
polygon (GeoDataFrame | str) – The geopandas.GeoDataFrame or file with polygon geometry.
to_replace (dict) – How to find the values to replace. Dictionary mappings should be given as {from: to} pairs. If to_replace is an integer/string mapping, the ‘to’ string should be ‘mode’.
- {1: 5}: recode values of 1 to 5
- {1: ‘mode’}: recode values of 1 to the polygon mode
num_workers (Optional[int]) – The number of parallel Dask workers (only used if to_replace has a ‘mode’ mapping).
- Return type:
DataArray
- Returns:
xarray.DataArray
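The two kinds of to_replace mapping can be sketched on a small grid. Here `inside` is a hypothetical boolean mask marking the cells that fall within the polygon, and the mode is assumed to be computed over all cells inside the polygon; neither detail is taken from the library's implementation.

```python
from statistics import mode


def recode_cells(values, inside, to_replace):
    """Recode cells inside a polygon mask using {from: to} pairs.

    A 'mode' target is replaced by the most frequent value inside the mask
    (an assumption about how the polygon mode is defined).
    """
    inside_values = [
        v for row, mask_row in zip(values, inside)
        for v, m in zip(row, mask_row) if m
    ]
    polygon_mode = mode(inside_values) if inside_values else None
    recoded = []
    for row, mask_row in zip(values, inside):
        new_row = []
        for v, m in zip(row, mask_row):
            if m and v in to_replace:
                target = to_replace[v]
                new_row.append(polygon_mode if target == 'mode' else target)
            else:
                new_row.append(v)
        recoded.append(new_row)
    return recoded


grid = [[1, 2, 2], [1, 2, 3], [1, 1, 1]]
mask = [[True, True, True], [True, True, True], [False, False, False]]
fixed = recode_cells(grid, mask, {1: 5})
by_mode = recode_cells(grid, mask, {1: 'mode'})
```

Cells outside the mask (the last row) are left unchanged in both cases.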
Example
>>> import geowombat as gw
>>>
>>> with gw.open('image.tif', chunks=512) as ds:
>>>     # Recode 1 with 5 within a polygon
>>>     res = gw.recode(ds, 'poly.gpkg', {1: 5})
- geowombat.replace(data, to_replace)
Replaces the values given in to_replace.
- Parameters:
data (DataArray) – The xarray.DataArray to recode.
to_replace (dict) – How to find the values to replace. Dictionary mappings should be given as {from: to} pairs. If to_replace is an integer/string mapping, the ‘to’ string should be ‘mode’.
- {1: 5}: recode values of 1 to 5
- {1: ‘mode’}: recode values of 1 to the polygon mode
- Return type:
DataArray
- Returns:
xarray.DataArray
Example
>>> import geowombat as gw
>>>
>>> with gw.open('image.tif', chunks=512) as ds:
>>>     # Replace 1 with 5
>>>     res = gw.replace(ds, {1: 5})
- geowombat.sample(data, method='random', band=None, n=None, strata=None, spacing=None, min_dist=None, max_attempts=10, num_workers=1, verbose=1, **kwargs)
Generates samples from a raster.
- Parameters:
data (DataArray) – The xarray.DataArray to extract data from.
method (Optional[str]) – The sampling method. Choices are [‘random’, ‘systematic’].
band (Optional[int or str]) – The band name to extract from. Only required if method = ‘random’ and strata is given.
n (Optional[int]) – The total number of samples. Only required if method = ‘random’.
strata (Optional[dict]) – The strata to sample within. The dictionary key-value pairs should be {‘conditional,value’: sample size}. E.g.,
strata = {‘==,1’: 0.5, ‘>=,2’: 0.5} … would sample 50% of total samples within class 1 and 50% of total samples in class >= 2.
strata = {‘==,1’: 10, ‘>=,2’: 20} … would sample 10 samples within class 1 and 20 samples in class >= 2.
spacing (Optional[float]) – The spacing (in map projection units) when method = ‘systematic’.
min_dist (Optional[float or int]) – A minimum distance allowed between samples. Only applies when method = ‘random’.
max_attempts (Optional[int]) – The maximum number of attempts to sample points > min_dist from each other.
num_workers (Optional[int]) – The number of parallel workers for dask.compute().
verbose (Optional[int]) – The verbosity level.
kwargs (Optional[dict]) – Keyword arguments passed to geowombat.extract.
- Return type:
GeoDataFrame
- Returns:
geopandas.GeoDataFrame
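The ‘conditional,value’ keys of strata can be read as a comparison operator plus a threshold. The sketch below parses such a dictionary into (predicate, sample count) pairs. It assumes that float sizes are proportions of n and integer sizes are absolute counts, as the two examples above suggest; `parse_strata` is a hypothetical helper, not part of the library.

```python
import operator

OPERATORS = {
    '==': operator.eq,
    '!=': operator.ne,
    '>': operator.gt,
    '>=': operator.ge,
    '<': operator.lt,
    '<=': operator.le,
}


def parse_strata(strata, n=None):
    """Convert {'conditional,value': size} into (predicate, count) pairs."""
    parsed = []
    for key, size in strata.items():
        conditional, value = key.split(',')
        compare = OPERATORS[conditional.strip()]
        threshold = float(value)
        if isinstance(size, float):
            # Assumption: a float is a proportion of the total sample count
            if n is None:
                raise ValueError('n is required for proportional strata.')
            count = int(round(size * n))
        else:
            count = int(size)
        # Bind the operator and threshold for this stratum
        parsed.append((lambda v, c=compare, t=threshold: c(v, t), count))
    return parsed


proportional = parse_strata({'==,1': 0.5, '>=,2': 0.5}, n=100)
absolute = parse_strata({'==,1': 10, '>=,2': 20})
```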
Examples
>>> import geowombat as gw
>>>
>>> # Sample 100 points randomly across the image
>>> with gw.open('image.tif') as src:
>>>     df = gw.sample(src, n=100)
>>>
>>> # Sample points systematically (with 10km spacing) across the image
>>> with gw.open('image.tif') as src:
>>>     df = gw.sample(src, method='systematic', spacing=10000.0)
>>>
>>> # Sample 50% of 100 in class 1 and 50% in classes >= 2
>>> strata = {'==,1': 0.5, '>=,2': 0.5}
>>> with gw.open('image.tif') as src:
>>>     df = gw.sample(src, band=1, n=100, strata=strata)
>>>
>>> # Specify a per-stratum minimum allowed point distance of 1,000 meters
>>> with gw.open('image.tif') as src:
>>>     df = gw.sample(src, band=1, n=100, min_dist=1000, strata=strata)
- geowombat.save(data, filename, mode='w', nodata=None, overwrite=False, client=None, compute=True, tags=None, compress='none', compression=None, num_workers=1, log_progress=True, tqdm_kwargs=None, bigtiff=None)
Saves a DataArray to raster using rasterio/dask.
- Parameters:
data (DataArray) – The xarray.DataArray to save.
filename (str | Path) – The output file name to write to.
overwrite (Optional[bool]) – Whether to overwrite an existing file. Default is False.
mode (Optional[str]) – The file storage mode. Choices are [‘w’, ‘r+’].
nodata (Optional[float | int]) – The ‘no data’ value. If None (default), the ‘no data’ value is taken from the DataArray metadata.
client (Optional[Client object]) – A dask.distributed.Client client object to persist data. Default is None.
compute (Optional[bool]) – If True, compute and write to filename. If False, return the dask task graph. Default is True.
tags (Optional[dict]) – Metadata tags to write to file. Default is None.
compress (Optional[str]) – The file compression type. Default is ‘none’, or no compression.
compression (Optional[str]) – The file compression type. Default is ‘none’, or no compression.
Deprecated since version 2.1.4: Use ‘compress’ – ‘compression’ will be removed in >=2.2.0.
num_workers (Optional[int]) – The number of dask workers (i.e., chunks) to write concurrently. Default is 1.
log_progress (Optional[bool]) – Whether to log the progress bar during writing. Default is True.
tqdm_kwargs (Optional[dict]) – Keyword arguments to pass to tqdm.
bigtiff (Optional[str]) – A GDAL BIGTIFF flag. Choices are [“YES”, “NO”, “IF_NEEDED”, “IF_SAFER”].
- Returns:
None, writes to filename
Example
>>> import dask
>>> import geowombat as gw
>>>
>>> with gw.open('file.tif') as src:
>>>     result = ...
>>>     gw.save(result, 'output.tif', compress='lzw', num_workers=8)
>>>
>>> # Create delayed write tasks and compute later
>>> tasks = [gw.save(array, 'output.tif', compute=False) for array in array_list]
>>> # Write and close files
>>> dask.compute(tasks, num_workers=8)
- class geowombat.series(filenames, time_names=None, band_names=None, transfer_lib='jax', crs=None, res=None, bounds=None, resampling='nearest', nodata=0, warp_mem_limit=256, num_threads=1, window_size=None, padding=None)
Bases: BaseSeries
A class for time series concurrent processing on a GPU.
- Parameters:
filenames (list) – The list of filenames to open.
band_names (Optional[list]) – The band associated names.
transfer_lib (Optional[str]) – The library to transfer data to. Choices are [‘jax’, ‘keras’, ‘numpy’, ‘pytorch’, ‘tensorflow’].
crs (Optional[str]) – The coordinate reference system.
res (Optional[list | tuple]) – The cell resolution.
bounds (Optional[object]) – The coordinate bounds.
resampling (Optional[str]) – The resampling method.
nodata (Optional[float | int]) – The ‘no data’ value.
warp_mem_limit (Optional[int]) – The rasterio warping memory limit (in MB).
num_threads (Optional[int]) – The number of rasterio warping threads.
window_size (Optional[int | list | tuple]) – The concurrent processing window size (height, width) or -1 (i.e., entire array).
padding (Optional[list | tuple]) – Padding for each window. padding should be given as a tuple of (left pad, bottom pad, right pad, top pad). If padding is given, the returned list will contain a tuple of rasterio.windows.Window objects as (w1, w2), where w1 contains the normal window offsets and w2 contains the padded window offsets.
time_names (list) –
- Requirement:
> # CUDA 11.1
> pip install --upgrade "jax[cuda111]" -f https://storage.googleapis.com/jax-releases/jax_releases.html
- Attributes:
- band_dict
- blockxsize
- blockysize
- count
- crs
- height
- nchunks
- nodata
- transform
- width
- Parameters:
filenames (list) –
time_names (list) –
band_names (list) –
transfer_lib (str) –
crs (str) –
res (list | tuple) –
bounds (BoundingBox | list | tuple) –
resampling (str) –
nodata (float | int) –
warp_mem_limit (int) –
num_threads (int) –
window_size (int | list | tuple) –
padding (list | tuple) –
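The (w1, w2) window pairs described for padding can be sketched without rasterio. In this sketch, `Window` is a hypothetical named tuple standing in for rasterio.windows.Window, and clipping the padded window at the image edges is an assumption about the behavior.

```python
from typing import List, NamedTuple, Tuple


class Window(NamedTuple):
    row_off: int
    col_off: int
    height: int
    width: int


def padded_windows(
    height: int,
    width: int,
    window_size: Tuple[int, int],
    padding: Tuple[int, int, int, int],
) -> List[Tuple[Window, Window]]:
    """Return (w1, w2) pairs: normal windows and their padded counterparts."""
    win_h, win_w = window_size
    left_pad, bottom_pad, right_pad, top_pad = padding
    pairs = []
    for row_off in range(0, height, win_h):
        for col_off in range(0, width, win_w):
            h = min(win_h, height - row_off)
            w = min(win_w, width - col_off)
            w1 = Window(row_off, col_off, h, w)
            # Grow the window by the padding, clipped to the image extent
            top = max(0, row_off - top_pad)
            left = max(0, col_off - left_pad)
            bottom = min(height, row_off + h + bottom_pad)
            right = min(width, col_off + w + right_pad)
            w2 = Window(top, left, bottom - top, right - left)
            pairs.append((w1, w2))
    return pairs


pairs = padded_windows(100, 100, window_size=(50, 50), padding=(5, 5, 5, 5))
```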
Methods
apply(func, bands[, gain, offset, ...]) – Applies a function concurrently over windows.
group_dates(data, image_dates, band_names) – Groups data by dates.
read(bands[, window, gain, offset, pool, ...]) – Reads a window.
ndarray_to_darray
open
warp
- apply(func, bands, gain=1.0, offset=0.0, processes=False, num_workers=1, monitor_progress=True, outfile=None, bigtiff='NO', kwargs={})
Applies a function concurrently over windows.
- Parameters:
func (object | str | list | tuple) – The function to apply. If func is a string, choices are [‘cv’, ‘max’, ‘mean’, ‘min’].
bands (list | int) – The bands to read.
gain (Optional[float]) – A gain factor to apply.
offset (Optional[float | int]) – An offset factor to apply.
processes (Optional[bool]) – Whether to use process workers, otherwise use threads.
num_workers (Optional[int]) – The number of concurrent workers.
monitor_progress (Optional[bool]) – Whether to monitor progress with a tqdm bar.
outfile (Optional[Path | str]) – The output file.
bigtiff (Optional[str]) – Whether to create a BigTIFF file. Choices are [‘YES’, ‘NO’, ‘IF_NEEDED’, ‘IF_SAFER’]. Default is ‘NO’.
kwargs (Optional[dict]) – Keyword arguments passed to rasterio open profile.
- Returns:
If outfile is None: Window, array, [datetime, …]
If outfile is not None: None, writes to outfile
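The string choices for func reduce each pixel's time series to a single statistic. The sketch below shows that reduction for one pixel. It assumes that gain and offset are applied as value * gain + offset before the statistic, and that ‘cv’ is the population standard deviation divided by the mean; both are assumptions, and `temporal_stat` is a hypothetical helper.

```python
import math
from statistics import mean, pstdev


def temporal_stat(series, func='mean', gain=1.0, offset=0.0):
    """Reduce one pixel's time series to a single statistic."""
    scaled = [value * gain + offset for value in series]
    if func == 'mean':
        return mean(scaled)
    if func == 'max':
        return max(scaled)
    if func == 'min':
        return min(scaled)
    if func == 'cv':
        # Assumed definition: coefficient of variation = std / mean
        avg = mean(scaled)
        return pstdev(scaled) / avg if avg != 0 else math.nan
    raise ValueError(f'Unknown function: {func}')


# Three dates of integer reflectance for one pixel, scaled to [0, 3]
pixel = [1000, 2000, 3000]
stats = {name: temporal_stat(pixel, name, gain=0.001) for name in ('mean', 'max', 'min', 'cv')}
```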
Example
>>> import itertools
>>> import geowombat as gw
>>> import rasterio as rio
>>>
>>> # Import an image with 3 bands
>>> from geowombat.data import l8_224078_20200518
>>>
>>> # Create a custom class
>>> class TemporalMean(gw.TimeModule):
>>>
>>>     def __init__(self):
>>>         super(TemporalMean, self).__init__()
>>>
>>>     # The main function
>>>     def calculate(self, array):
>>>
>>>         sl1 = (slice(0, None), slice(self.band_dict['red'], self.band_dict['red']+1), slice(0, None), slice(0, None))
>>>         sl2 = (slice(0, None), slice(self.band_dict['green'], self.band_dict['green']+1), slice(0, None), slice(0, None))
>>>
>>>         vi = (array[sl1] - array[sl2]) / ((array[sl1] + array[sl2]) + 1e-9)
>>>
>>>         return vi.mean(axis=0).squeeze()
>>>
>>> with rio.open(l8_224078_20200518) as src:
>>>     res = src.res
>>>     bounds = src.bounds
>>>     nodata = 0
>>>
>>> # Open many files, each with 3 bands
>>> with gw.series([l8_224078_20200518]*100,
>>>                band_names=['blue', 'green', 'red'],
>>>                crs='epsg:32621',
>>>                res=res,
>>>                bounds=bounds,
>>>                nodata=nodata,
>>>                num_threads=4,
>>>                window_size=(1024, 1024)) as src:
>>>
>>>     src.apply(TemporalMean(),
>>>               bands=-1,             # open all bands
>>>               gain=0.0001,          # scale from [0,10000] -> [0,1]
>>>               processes=False,      # use threads
>>>               num_workers=4,        # use 4 concurrent threads, one per window
>>>               outfile='vi_mean.tif')
>>>
>>> # Import a single-band image
>>> from geowombat.data import l8_224078_20200518_B4
>>>
>>> # Open many files, each with 1 band
>>> with gw.series([l8_224078_20200518_B4]*100,
>>>                band_names=['red'],
>>>                crs='epsg:32621',
>>>                res=res,
>>>                bounds=bounds,
>>>                nodata=nodata,
>>>                num_threads=4,
>>>                window_size=(1024, 1024)) as src:
>>>
>>>     src.apply('mean',               # built-in function over single-band images
>>>               bands=1,              # open all bands
>>>               gain=0.0001,          # scale from [0,10000] -> [0,1]
>>>               num_workers=4,        # use 4 concurrent threads, one per window
>>>               outfile='red_mean.tif')
>>>
>>> with gw.series([l8_224078_20200518_B4]*100,
>>>                band_names=['red'],
>>>                crs='epsg:32621',
>>>                res=res,
>>>                bounds=bounds,
>>>                nodata=nodata,
>>>                num_threads=4,
>>>                window_size=(1024, 1024)) as src:
>>>
>>>     src.apply(['mean', 'max', 'cv'],    # calculate multiple statistics
>>>               bands=1,                  # open all bands
>>>               gain=0.0001,              # scale from [0,10000] -> [0,1]
>>>               num_workers=4,            # use 4 concurrent threads, one per window
>>>               outfile='stack_mean.tif')
- geowombat.subset(data, left=None, top=None, right=None, bottom=None, rows=None, cols=None, center=False, mask_corners=False)
Subsets a DataArray.
- Parameters:
data (DataArray) – The xarray.DataArray to subset.
left (Optional[float]) – The left coordinate.
top (Optional[float]) – The top coordinate.
right (Optional[float]) – The right coordinate.
bottom (Optional[float]) – The bottom coordinate.
rows (Optional[int]) – The number of output rows.
cols (Optional[int]) – The number of output columns.
center (Optional[bool]) – Whether to center the subset on left and top.
mask_corners (Optional[bool]) – Whether to mask corners (requires pymorph).
- Return type:
DataArray
- Returns:
xarray.DataArray
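When a subset is given as left, top, rows, and cols, the remaining bounds follow from the cell size. The sketch below shows that arithmetic, including a guess at how center=True shifts the window so that (left, top) becomes its midpoint. `subset_bounds` and its cellx/celly arguments are hypothetical and do not mirror the library's internals.

```python
def subset_bounds(left, top, rows, cols, cellx, celly, center=False):
    """Return (left, bottom, right, top) for a rows x cols subset.

    Assumption: with center=True, the given (left, top) coordinate is
    treated as the center of the subset.
    """
    if center:
        left -= (cols / 2.0) * cellx
        top += (rows / 2.0) * celly
    right = left + cols * cellx
    bottom = top - rows * celly
    return left, bottom, right, top


# 10 rows x 20 columns of 10 m cells anchored at the upper-left corner
corner = subset_bounds(0.0, 100.0, rows=10, cols=20, cellx=10.0, celly=10.0)
# The same subset centered on the coordinate instead
centered = subset_bounds(0.0, 100.0, rows=10, cols=20, cellx=10.0, celly=10.0, center=True)
```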
Example
>>> import geowombat as gw
>>>
>>> with gw.open('image.tif', chunks=512) as ds:
>>>     ds_sub = gw.subset(
>>>         ds,
>>>         left=-263529.884,
>>>         top=953985.314,
>>>         rows=2048,
>>>         cols=2048
>>>     )
- geowombat.tasseled_cap(data, nodata=None, sensor=None, scale_factor=None)
Applies a tasseled cap transformation.
- Parameters:
data (DataArray) – The xarray.DataArray to process.
nodata (Optional[int or float]) – A ‘no data’ value to fill NAs with. If None, the ‘no data’ value is taken from the xarray.DataArray attributes.
sensor (Optional[str]) – The data’s sensor. If None, the band names should reflect the index being calculated.
scale_factor (Optional[float]) – A scale factor to apply to the data. If None, the scale value is taken from the xarray.DataArray attributes.
- Return type:
DataArray
Examples
>>> import geowombat as gw
>>>
>>> with gw.config.update(sensor='qb', scale_factor=0.0001):
>>>     with gw.open('image.tif', band_names=['blue', 'green', 'red', 'nir']) as ds:
>>>         tcap = gw.tasseled_cap(ds)
- Returns:
xarray.DataArray
- Parameters:
data (DataArray) –
nodata (float | int | None) –
sensor (str | None) –
scale_factor (float | None) –
References
- geowombat.to_netcdf(data, filename, overwrite=False, compute=True, *args, **kwargs)
Writes an Xarray DataArray to a NetCDF file.
- Parameters:
data (DataArray) – The xarray.DataArray to write.
filename (str) – The output file name to write to.
overwrite (Optional[bool]) – Whether to overwrite an existing file. Default is False.
compute (Optional[bool]) – Whether to compute and write to filename. Otherwise, return the dask task graph. Default is True.
args (DataArray) – Additional DataArrays to stack.
kwargs (dict) – Encoding arguments.
- Returns:
None, writes to filename
Examples
>>> import geowombat as gw
>>> import xarray as xr
>>>
>>> # Write a single DataArray to a .nc file
>>> with gw.config.update(sensor='l7'):
>>>     with gw.open('LC08_L1TP_225078_20200219_20200225_01_T1.tif') as src:
>>>         gw.to_netcdf(src, 'filename.nc', zlib=True, complevel=5)
>>>
>>> # Add extra layers
>>> with gw.config.update(sensor='l7'):
>>>     with gw.open(
>>>         'LC08_L1TP_225078_20200219_20200225_01_T1.tif'
>>>     ) as src, gw.open(
>>>         'LC08_L1TP_225078_20200219_20200225_01_T1_angles.tif',
>>>         band_names=['zenith', 'azimuth']
>>>     ) as ang:
>>>         src = (
>>>             xr.where(
>>>                 src == 0, -32768, src
>>>             )
>>>             .astype('int16')
>>>             .assign_attrs(**src.attrs)
>>>         )
>>>
>>>         gw.to_netcdf(
>>>             src,
>>>             'filename.nc',
>>>             ang.astype('int16'),
>>>             zlib=True,
>>>             complevel=5,
>>>             _FillValue=-32768
>>>         )
>>>
>>> # Open the data and convert to a DataArray
>>> with xr.open_dataset(
>>>     'filename.nc', engine='h5netcdf', chunks=256
>>> ) as ds:
>>>     src = ds.to_array(dim='band')
- geowombat.to_raster(data, filename, readxsize=None, readysize=None, separate=False, out_block_type='gtiff', keep_blocks=False, verbose=0, overwrite=False, gdal_cache=512, scheduler='mpool', n_jobs=1, n_workers=None, n_threads=None, n_chunks=None, padding=None, tags=None, tqdm_kwargs=None, **kwargs)
Writes a dask array to a raster file.
Note
We advise using save() in place of this method.
- Parameters:
data (DataArray) – The xarray.DataArray to write.
filename (str) – The output file name to write to.
readxsize (Optional[int]) – The size of column chunks to read. If not given, readxsize defaults to Dask chunk size.
readysize (Optional[int]) – The size of row chunks to read. If not given, readysize defaults to Dask chunk size.
separate (Optional[bool]) – Whether to write blocks as separate files. Otherwise, write to a single file.
out_block_type (Optional[str]) – The output block type. Choices are [‘gtiff’, ‘zarr’]. Only used if separate = True.
keep_blocks (Optional[bool]) – Whether to keep the blocks stored on disk. Only used if separate = True.
verbose (Optional[int]) – The verbosity level.
overwrite (Optional[bool]) – Whether to overwrite an existing file.
gdal_cache (Optional[int]) – The GDAL cache size (in MB).
scheduler (Optional[str]) – The parallel task scheduler to use. Choices are [‘processes’, ‘threads’, ‘mpool’].
- mpool: process pool of workers using multiprocessing.Pool
- processes: process pool of workers using concurrent.futures
- threads: thread pool of workers using concurrent.futures
n_jobs (Optional[int]) – The total number of parallel jobs.
n_workers (Optional[int]) – The number of process workers.
n_threads (Optional[int]) – The number of thread workers.
n_chunks (Optional[int]) – The chunk size of windows. If not given, equal to n_workers x 50.
overviews (Optional[bool or list]) – Whether to build overview layers.
resampling (Optional[str]) – The resampling method for overviews when overviews is True or a list. Choices are [‘average’, ‘bilinear’, ‘cubic’, ‘cubic_spline’, ‘gauss’, ‘lanczos’, ‘max’, ‘med’, ‘min’, ‘mode’, ‘nearest’].
padding (Optional[tuple]) – Padding for each window. padding should be given as a tuple of (left pad, bottom pad, right pad, top pad). If padding is given, the returned list will contain a tuple of rasterio.windows.Window objects as (w1, w2), where w1 contains the normal window offsets and w2 contains the padded window offsets.
tags (Optional[dict]) – Image tags to write to file.
tqdm_kwargs (Optional[dict]) – Additional keyword arguments to pass to tqdm.
kwargs (Optional[dict]) – Additional keyword arguments to pass to rasterio.write.
- Returns:
None, writes to filename
Examples
>>> import geowombat as gw
>>>
>>> # Use 8 parallel workers
>>> with gw.open('input.tif') as ds:
>>>     gw.to_raster(ds, 'output.tif', n_jobs=8)
>>>
>>> # Use 4 process workers and 2 thread workers
>>> with gw.open('input.tif') as ds:
>>>     gw.to_raster(ds, 'output.tif', n_workers=4, n_threads=2)
>>>
>>> # Control the window chunks passed to concurrent.futures
>>> with gw.open('input.tif') as ds:
>>>     gw.to_raster(ds, 'output.tif', n_workers=4, n_threads=2, n_chunks=16)
>>>
>>> # Compress the output and build overviews
>>> with gw.open('input.tif') as ds:
>>>     gw.to_raster(ds, 'output.tif', n_jobs=8, overviews=True, compress='lzw')
- geowombat.to_vrt(data, filename, overwrite=False, resampling=None, nodata=None, init_dest_nodata=True, warp_mem_limit=128)
Writes a DataArray to a VRT file.
- Parameters:
data (DataArray) – The xarray.DataArray to write.
filename (str) – The output file name to write to.
overwrite (Optional[bool]) – Whether to overwrite an existing VRT file.
resampling (Optional[object]) – The resampling algorithm for rasterio.vrt.WarpedVRT. Default is ‘nearest’.
nodata (Optional[float or int]) – The ‘no data’ value for rasterio.vrt.WarpedVRT.
init_dest_nodata (Optional[bool]) – Whether or not to initialize output to nodata for rasterio.vrt.WarpedVRT.
warp_mem_limit (Optional[int]) – The GDAL memory limit for rasterio.vrt.WarpedVRT.
- Returns:
None, writes to filename
Examples
>>> import geowombat as gw
>>> from rasterio.enums import Resampling
>>>
>>> # Transform a CRS and save to VRT
>>> with gw.config.update(ref_crs=102033):
>>>     with gw.open('image.tif') as src:
>>>         gw.to_vrt(
>>>             src,
>>>             'output.vrt',
>>>             resampling=Resampling.cubic,
>>>             warp_mem_limit=256
>>>         )
>>>
>>> # Load multiple files set to a common geographic extent
>>> bounds = (left, bottom, right, top)
>>> with gw.config.update(ref_bounds=bounds):
>>>     with gw.open(
>>>         ['image1.tif', 'image2.tif'], mosaic=True
>>>     ) as src:
>>>         gw.to_vrt(src, 'output.vrt')
- geowombat.transform_crs(data_src, dst_crs=None, dst_res=None, dst_width=None, dst_height=None, dst_bounds=None, src_nodata=None, dst_nodata=None, coords_only=False, resampling='nearest', warp_mem_limit=512, num_threads=1)
Transforms a DataArray to a new coordinate reference system.
- Parameters:
data_src (DataArray) – The data to transform.
dst_crs (Optional[CRS | int | dict | str]) – The destination CRS.
dst_res (Optional[tuple]) – The destination resolution.
dst_width (Optional[int]) – The destination width. Cannot be used with dst_res.
dst_height (Optional[int]) – The destination height. Cannot be used with dst_res.
dst_bounds (Optional[BoundingBox | tuple]) – The destination bounds, as a rasterio.coords.BoundingBox or as a tuple of (left, bottom, right, top).
src_nodata (Optional[int | float]) – The source nodata value. Pixels with this value will not be used for interpolation. If not set, it will default to the nodata value of the source image if a masked ndarray or rasterio band, if available.
dst_nodata (Optional[int | float]) – The nodata value used to initialize the destination; it will remain in all areas not covered by the reprojected source. Defaults to the nodata value of the destination image (if set), the value of src_nodata, or 0 (GDAL default).
coords_only (Optional[bool]) – Whether to return transformed coordinates. If coords_only = True then the array is not warped and the size is unchanged. It also avoids in-memory computations.
resampling (Optional[str]) – The resampling method. Choices are [‘average’, ‘bilinear’, ‘cubic’, ‘cubic_spline’, ‘gauss’, ‘lanczos’, ‘max’, ‘med’, ‘min’, ‘mode’, ‘nearest’].
warp_mem_limit (Optional[int]) – The warp memory limit.
num_threads (Optional[int]) – The number of parallel threads.
- Returns:
xarray.DataArray
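The restriction that dst_width and dst_height cannot be combined with dst_res follows from the fact that, for fixed bounds, either one determines the other. A minimal sketch of that relationship is shown below; `destination_grid` is a hypothetical helper, and rounding the grid size up with ceil is an assumption.

```python
import math


def destination_grid(dst_bounds, dst_res=None, dst_width=None, dst_height=None):
    """Return (width, height, xres, yres) for a destination extent."""
    left, bottom, right, top = dst_bounds
    if dst_res is not None:
        if dst_width is not None or dst_height is not None:
            raise ValueError('dst_width and dst_height cannot be used with dst_res.')
        xres, yres = dst_res
        # Assumption: partial cells at the edge round the grid size up
        width = math.ceil((right - left) / xres)
        height = math.ceil((top - bottom) / yres)
        return width, height, xres, yres
    if dst_width is None or dst_height is None:
        raise ValueError('Give either dst_res or both dst_width and dst_height.')
    return dst_width, dst_height, (right - left) / dst_width, (top - bottom) / dst_height


from_res = destination_grid((0.0, 0.0, 100.0, 50.0), dst_res=(10.0, 10.0))
from_shape = destination_grid((0.0, 0.0, 100.0, 50.0), dst_width=20, dst_height=10)
```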
- geowombat.wi(data, nodata=None, mask=False, sensor=None, scale_factor=None)
Calculates the woody vegetation index.
- Parameters:
data (DataArray) – The xarray.DataArray to process.
nodata (Optional[int or float]) – A ‘no data’ value to fill NAs with. If None, the ‘no data’ value is taken from the xarray.DataArray attributes.
mask (Optional[bool]) – Whether to mask the results.
sensor (Optional[str]) – The data’s sensor. If None, the band names should reflect the index being calculated.
scale_factor (Optional[float]) – A scale factor to apply to the data. If None, the scale value is taken from the xarray.DataArray attributes.
Equation:
\[WI = \begin{cases} 0, & \text{if } red + SWIR1 \ge 0.5 \\ 1 - \frac{red + SWIR1}{0.5}, & \text{otherwise} \end{cases}\]
- Reference:
See [LWC+13]
- Returns:
Data range: 0 to 1
- Return type:
xarray.DataArray
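The piecewise WI equation can be checked for a single pixel with a few lines of Python. This is a scalar sketch of the formula only, assuming red and SWIR1 are already scaled reflectance values; `woody_index` is a hypothetical name and the function does not handle ‘no data’ or masking.

```python
import math


def woody_index(red: float, swir1: float) -> float:
    """Evaluate the woody vegetation index for one pixel."""
    total = red + swir1
    if total >= 0.5:
        return 0.0
    return 1.0 - total / 0.5


bright = woody_index(0.3, 0.3)    # red + SWIR1 >= 0.5
dark = woody_index(0.1, 0.15)     # red + SWIR1 = 0.25
```

Darker pixels in the red and SWIR1 bands give values closer to 1, which matches the documented data range of 0 to 1.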
- geowombat.xy_to_lonlat(x, y, dst_crs)
Converts from native map coordinates to longitude and latitude.
- Parameters:
x (float) – The x coordinate to convert.
y (float) – The y coordinate to convert.
dst_crs (str, object, or DataArray) – The CRS to transform to. It can be provided as a string, a CRS instance (e.g., pyproj.crs.CRS), or a geowombat.DataArray.
- Returns:
(longitude, latitude)
- Return type:
tuple
Example
>>> import geowombat as gw
>>> from geowombat.core import xy_to_lonlat
>>>
>>> x, y = 643944.6956113526, 7183104.984484519
>>>
>>> with gw.open('image.tif') as src:
>>>     lon, lat = xy_to_lonlat(x, y, src)