open
- geowombat.open(filename, band_names=None, time_names=None, stack_dim='time', bounds=None, bounds_by='reference', resampling='nearest', persist_filenames=False, netcdf_vars=None, mosaic=False, overlap='max', nodata=None, scale_factor=None, offset=None, dtype=None, scale_data=False, num_workers=1, **kwargs)
Opens one or more raster files.
- Parameters:
filename (str or list) – The file name, search string, or a list of files to open.
band_names (Optional[1d array-like]) – A list of band names if bounds is given or window is given. Default is None.
time_names (Optional[1d array-like]) – A list of names to give the time dimension if bounds is given. Default is None.
stack_dim (Optional[str]) – The stack dimension. Choices are [‘time’, ‘band’].
bounds (Optional[1d array-like]) – A bounding box to subset to, given as [minx, maxy, miny, maxx]. Default is None.
bounds_by (Optional[str]) – How to concatenate the output extent if filename is a list and mosaic=False. Choices are [‘intersection’, ‘union’, ‘reference’].
    * reference: Use the bounds of the reference image. If a ref_image is not given, the first image in the filename list is used.
    * intersection: Use the intersection (i.e., minimum extent) of all the image bounds.
    * union: Use the union (i.e., maximum extent) of all the image bounds.
resampling (Optional[str]) – The resampling method if filename is a list. Choices are [‘average’, ‘bilinear’, ‘cubic’, ‘cubic_spline’, ‘gauss’, ‘lanczos’, ‘max’, ‘med’, ‘min’, ‘mode’, ‘nearest’].
persist_filenames (Optional[bool]) – Whether to persist the filenames list with the xarray.DataArray attributes. By default, persist_filenames=False to avoid storing large file lists.
netcdf_vars (Optional[list]) – NetCDF variables to open as a band stack.
mosaic (Optional[bool]) – If filename is a list, whether to mosaic the arrays instead of stacking.
overlap (Optional[str]) – The keyword that determines how to handle overlapping data if filename is a list. Choices are [‘min’, ‘max’, ‘mean’].
nodata (Optional[float | int]) – A ‘no data’ value to set. Default is None. If nodata is None, the ‘no data’ value is set from the file metadata. If nodata is given, then the file ‘no data’ value is overridden. See the docstring examples for use of nodata in geowombat.config.update.
    Note: geowombat.config.update overrides this argument. Thus, preference is always given in the following order:
    1. geowombat.config.update(nodata not None)
    2. open(nodata not None)
    3. file ‘no data’ value from metadata ‘_FillValue’ or ‘nodatavals’
scale_factor (Optional[float | int]) – A scale value to apply to the opened data. The same rules used for nodata apply.
    Note: geowombat.config.update overrides this argument. Thus, preference is always given in the following order:
    1. geowombat.config.update(scale_factor not None)
    2. open(scale_factor not None)
    3. file scale value from metadata ‘scales’
offset (Optional[float | int]) – An offset value to apply to the opened data. The same rules used for nodata apply.
    Note: geowombat.config.update overrides this argument. Thus, preference is always given in the following order:
    1. geowombat.config.update(offset not None)
    2. open(offset not None)
    3. file offset value from metadata ‘offsets’
dtype (Optional[str]) – A data type to force the output to. If not given, the data type is extracted from the file.
scale_data (Optional[bool]) – Whether to apply scaling to the opened data. Default is False. Scaled data are returned as:
    scaled = data * gain + offset
    where gain is the scale factor. See the arguments nodata, scale_factor, and offset for rules regarding how scaling is applied.
num_workers (Optional[int]) – The number of parallel workers for Dask if bounds is given or window is given. Default is 1.
kwargs (Optional[dict]) – Keyword arguments passed to the file opener.
- Returns:
xarray.DataArray or xarray.Dataset
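The precedence rules for nodata, scale_factor, and offset, together with the scaling formula, can be sketched in plain Python. This is only an illustration under stated assumptions, not geowombat's implementation: resolve_option and apply_scaling are hypothetical helpers, and a plain list stands in for raster data.

```python
from typing import List, Optional, Sequence, Union

Number = Union[int, float]


def resolve_option(
    config_value: Optional[Number],
    open_value: Optional[Number],
    file_value: Optional[Number],
) -> Optional[Number]:
    """Hypothetical helper: return the first value that is not None.

    The order mirrors the documented preference: config context first,
    then the open() argument, then the file metadata.
    """
    for value in (config_value, open_value, file_value):
        # Compare against None explicitly so that 0 remains a valid choice
        if value is not None:
            return value
    return None


def apply_scaling(data: Sequence[Number], gain: Number, offset: Number) -> List[float]:
    """Hypothetical helper: apply scaled = data * gain + offset element-wise."""
    return [value * gain + offset for value in data]


# The config context sets no scale factor, so the open() argument is used
scale_factor = resolve_option(None, 1e-4, 1.0)
# Neither the config context nor open() sets an offset, so metadata is used
offset = resolve_option(None, None, 0.0)

scaled = apply_scaling([0, 5000, 10000], scale_factor, offset)
print(scale_factor, offset, scaled)
```

Testing against None rather than truthiness matters here because 0 is a common ‘no data’ value and a common offset.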
Examples
>>> import geowombat as gw
>>>
>>> # Open an image
>>> with gw.open('image.tif') as ds:
>>>     print(ds)
>>>
>>> # Open a list of images, stacking along the 'time' dimension
>>> with gw.open(['image1.tif', 'image2.tif']) as ds:
>>>     print(ds)
>>>
>>> # Open all GeoTiffs in a directory, stack along the 'time' dimension
>>> with gw.open('*.tif') as ds:
>>>     print(ds)
>>>
>>> # Use a context manager to handle images of different sizes and projections
>>> with gw.config.update(ref_image='image1.tif'):
>>>     # Use 'time' names to stack and mosaic non-aligned images with identical dates
>>>     with gw.open(
>>>         ['image1.tif', 'image2.tif', 'image3.tif'],
>>>         # The first two images were acquired on the same date
>>>         # and will be merged into a single time layer
>>>         time_names=['date1', 'date1', 'date2']
>>>     ) as ds:
>>>         print(ds)
>>>
>>> # Mosaic images across space using a reference
>>> # image for the CRS and cell resolution
>>> with gw.config.update(ref_image='image1.tif'):
>>>     with gw.open(['image1.tif', 'image2.tif'], mosaic=True) as ds:
>>>         print(ds)
>>>
>>> # Mix configuration keywords
>>> with gw.config.update(ref_crs='image1.tif', ref_res='image1.tif', ref_bounds='image2.tif'):
>>>     # The ``bounds_by`` keyword overrides the extent bounds
>>>     with gw.open(['image1.tif', 'image2.tif'], bounds_by='union') as ds:
>>>         print(ds)
>>>
>>> # Resample an image to 10m x 10m cell size
>>> with gw.config.update(ref_res=(10, 10)):
>>>     with gw.open('image.tif', resampling='cubic') as ds:
>>>         print(ds)
>>>
>>> # Open a list of images at a window slice
>>> from rasterio.windows import Window
>>> # Stack two images, opening band 3
>>> with gw.open(
>>>     ['image1.tif', 'image2.tif'],
>>>     band_names=['date1', 'date2'],
>>>     num_workers=8,
>>>     indexes=3,
>>>     window=Window(row_off=0, col_off=0, height=100, width=100),
>>>     dtype='float32'
>>> ) as ds:
>>>     print(ds)
>>>
>>> # Scale data upon opening, using the image metadata to get scales and offsets
>>> with gw.open('image.tif', scale_data=True) as ds:
>>>     print(ds)
>>>
>>> # Scale data upon opening, specifying scales and overriding metadata
>>> with gw.open('image.tif', scale_data=True, scale_factor=1e-4) as ds:
>>>     print(ds)
>>>
>>> # Scale data upon opening, specifying scales in the configuration context
>>> with gw.config.update(scale_factor=1e-4):
>>>     with gw.open('image.tif', scale_data=True) as ds:
>>>         print(ds)
>>>
>>> # Open a NetCDF variable, specifying a NetCDF prefix and variable to open
>>> with gw.open('netcdf:image.nc:blue') as src:
>>>     print(src)
>>>
>>> # Open a NetCDF image without access to transforms by providing the full file path
>>> # NOTE: This will be faster than the above method
>>> # as it uses ``xarray.open_dataset`` and bypasses CRS checks.
>>> # NOTE: The chunks must be provided by the user.
>>> # NOTE: Providing band names will ensure the correct order when reading from a NetCDF dataset.
>>> with gw.open(
>>>     'image.nc',
>>>     chunks={'band': -1, 'y': 256, 'x': 256},
>>>     band_names=['blue', 'green', 'red', 'nir', 'swir1', 'swir2'],
>>>     engine='h5netcdf'
>>> ) as src:
>>>     print(src)
>>>
>>> # Open multiple NetCDF variables as an array stack
>>> with gw.open('netcdf:image.nc', netcdf_vars=['blue', 'green', 'red']) as src:
>>>     print(src)
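The three bounds_by choices can also be illustrated without any raster files. The following is a minimal sketch, not geowombat's code: combine_bounds is a hypothetical helper, each extent is assumed to be a (left, bottom, right, top) tuple in a shared CRS (an ordering chosen for this sketch, not the one used by the bounds argument), and the first extent stands in for the reference image.

```python
from typing import List, Tuple

# Assumed ordering for this sketch: (left, bottom, right, top)
Bounds = Tuple[float, float, float, float]


def combine_bounds(bounds_list: List[Bounds], bounds_by: str = "reference") -> Bounds:
    """Hypothetical helper: combine image extents by the chosen rule."""
    if not bounds_list:
        raise ValueError("At least one extent is required.")

    lefts, bottoms, rights, tops = zip(*bounds_list)

    if bounds_by == "reference":
        # Assumption: the first extent plays the role of the reference image
        return bounds_list[0]

    if bounds_by == "intersection":
        # Minimum extent shared by every image
        left, bottom = max(lefts), max(bottoms)
        right, top = min(rights), min(tops)
        if left >= right or bottom >= top:
            raise ValueError("The extents do not overlap.")
        return (left, bottom, right, top)

    if bounds_by == "union":
        # Maximum extent covering every image
        return (min(lefts), min(bottoms), max(rights), max(tops))

    raise ValueError(f"Unknown bounds_by choice: {bounds_by!r}")


image1 = (0.0, 0.0, 10.0, 10.0)
image2 = (5.0, 5.0, 15.0, 15.0)

for choice in ("reference", "intersection", "union"):
    print(choice, combine_bounds([image1, image2], choice))
```

With two partially overlapping extents, the intersection shrinks to the shared area and the union grows to cover both, which is why the union choice typically introduces ‘no data’ regions where only one image has coverage.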