GeoDownloads

class geowombat.util.GeoDownloads

Bases: CloudPathMixin, DownloadMixin

Attributes:
get_gcp_results

Methods

download_aws(landsat_id, band_list[, outdir])

Downloads Landsat 8 data from Amazon AWS.

download_cube(sensors, date_range, bounds, bands)

Downloads a cube of Landsat and/or Sentinel 2 imagery.

download_gcp(sensor[, downloads, outdir, ...])

Downloads a file from Google Cloud Platform.

get_landsat_urls(scene_id[, bands, cloud])

Gets Google Cloud Platform COG URLs for Landsat.

get_sentinel2_urls(safe_id[, bands, cloud])

Gets Google Cloud Platform COG URLs for Sentinel 2.

list_gcp(sensor, query)

Lists files from Google Cloud Platform.

Attributes Documentation

get_gcp_results

Methods Documentation

download_cube(sensors, date_range, bounds, bands, bands_out=None, crs=None, out_bounds=None, outdir='.', resampling='bilinear', ref_res=None, l57_angles_path=None, l8_angles_path=None, subsample=1, write_format='gtiff', write_angle_files=False, mask_qa=False, lqa_mask_items=None, chunks=512, cloud_heights=None, sr_method='srem', earthdata_username=None, earthdata_key_file=None, earthdata_code_file=None, srtm_outdir=None, n_jobs=1, num_workers=1, num_threads=1, **kwargs)

Downloads a cube of Landsat and/or Sentinel 2 imagery.

Parameters:
  • sensors (str or list) – The sensor or sensors to download.

  • date_range (list) – The date range, given as [date1, date2], where the date format is yyyy-mm.

  • bounds (GeoDataFrame, list, or tuple) – The geometry bounds (in WGS84 lat/lon) that define the cube extent to download. If given as a GeoDataFrame, only the first DataFrame record will be used. If given as a tuple or a list, the order should be (left, bottom, right, top).

  • bands (str or list) –

    The bands to download.

    E.g.:

    Sentinel s2cloudless bands:

    bands = ['coastal', 'blue', 'red', 'nir1', 'nir', 'rededge', 'water', 'cirrus', 'swir1', 'swir2']

  • bands_out (Optional[list]) – The bands to write to file. This is useful when all bands are downloaded in order to mask clouds but only a subset of those bands is needed in the output.

  • crs (Optional[str or object]) – The output CRS. If bounds is a GeoDataFrame, the CRS is taken from the object.

  • out_bounds (Optional[list or tuple]) – The output bounds in crs. If not given, the bounds are taken from bounds.

  • outdir (Optional[str]) – The output directory.

  • ref_res (Optional[tuple]) – A reference cell resolution.

  • resampling (Optional[str]) – The resampling method.

  • l57_angles_path (Optional[str]) – The path to the Landsat 5 and 7 angles bin.

  • l8_angles_path (Optional[str]) – The path to the Landsat 8 angles bin.

  • subsample (Optional[int]) – The sub-sample factor when calculating the angles.

  • write_format (Optional[str]) – The data format to write. Choices are ['gtiff', 'netcdf'].

  • write_angle_files (Optional[bool]) – Whether to write the angles to file.

  • mask_qa (Optional[bool]) – Whether to mask data with the QA file.

  • lqa_mask_items (Optional[list]) – A list of QA mask items for Landsat.

  • chunks (Optional[int]) – The chunk size used when reading the data.

  • cloud_heights (Optional[list]) – The cloud heights, in kilometers.

  • sr_method (Optional[str]) – The surface reflectance correction method. Choices are ['srem', '6s'].

  • earthdata_username (Optional[str]) – The EarthData username.

  • earthdata_key_file (Optional[str]) – The EarthData secret key file.

  • earthdata_code_file (Optional[str]) – The EarthData secret passcode file.

  • srtm_outdir (Optional[str]) – The output SRTM directory.

  • n_jobs (Optional[int]) – The number of parallel download workers for joblib.

  • num_workers (Optional[int]) – The number of parallel workers for dask.compute.

  • num_threads (Optional[int]) – The number of GDAL warp threads.

  • kwargs (Optional[dict]) – Keyword arguments passed to to_raster.

Examples

>>> from geowombat.util import GeoDownloads
>>> gdl = GeoDownloads()
>>>
>>> # Download a Landsat 7 panchromatic cube
>>> gdl.download_cube(['l7'],
...                   ['2010-01-01', '2010-02-01'],
...                   (-91.57, 40.37, -91.46, 40.42),
...                   ['pan'],
...                   crs="+proj=aea +lat_1=-5 +lat_2=-42 +lat_0=-32 +lon_0=-60 +x_0=0 +y_0=0 +ellps=aust_SA +units=m +no_defs")
>>>
>>> # Download a Landsat 7, 8 and Sentinel 2 cube of the visible spectrum
>>> gdl.download_cube(['l7', 'l8', 's2a'],
...                   ['2017-01-01', '2018-01-01'],
...                   (-91.57, 40.37, -91.46, 40.42),
...                   ['blue', 'green', 'red'],
...                   crs={'init': 'epsg:102033'},
...                   readxsize=1024,
...                   readysize=1024,
...                   n_workers=1,
...                   n_threads=8)
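The bounds ordering and the string choices listed in the parameters can be checked before a download is started. The following is a minimal sketch, not part of geowombat: the helper names check_bounds and check_choice are hypothetical, and the choice tuples simply repeat the values documented above.

```python
from typing import Sequence, Tuple

# Choices copied from the parameter descriptions above.
WRITE_FORMATS = ("gtiff", "netcdf")
SR_METHODS = ("srem", "6s")


def check_bounds(bounds: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return bounds as (left, bottom, right, top) in WGS84 degrees.

    Hypothetical helper: raises ValueError if the values are out of range
    or not ordered as (left, bottom, right, top).
    """
    if len(bounds) != 4:
        raise ValueError("bounds must be (left, bottom, right, top)")
    left, bottom, right, top = (float(v) for v in bounds)
    if not (-180.0 <= left < right <= 180.0):
        raise ValueError("expected -180 <= left < right <= 180")
    if not (-90.0 <= bottom < top <= 90.0):
        raise ValueError("expected -90 <= bottom < top <= 90")
    return left, bottom, right, top


def check_choice(name: str, value: str, choices: Sequence[str]) -> str:
    """Return value if it is one of the documented choices (hypothetical helper)."""
    if value not in choices:
        raise ValueError(f"{name} must be one of {list(choices)}, got {value!r}")
    return value


bounds = check_bounds((-91.57, 40.37, -91.46, 40.42))
write_format = check_choice("write_format", "gtiff", WRITE_FORMATS)
sr_method = check_choice("sr_method", "srem", SR_METHODS)
```

The validated values can then be passed to download_cube as the bounds, write_format, and sr_method arguments. A tuple given in (bottom, left, top, right) order, such as (40.37, -91.57, 40.42, -91.46), fails the latitude check here because -91.57 is not a valid latitude.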

list_gcp(sensor, query)

Lists files from Google Cloud Platform.

Parameters:
  • sensor (str) – The sensor to query. Choices are ['l5', 'l7', 'l8', 's2a', 's2c'].

  • query (str) – The query string.

Examples

>>> from geowombat.util import GeoDownloads
>>> dl = GeoDownloads()
>>>
>>> # Query from a known directory
>>> dl.list_gcp('landsat', 'LC08/01/042/034/LC08_L1TP_042034_20161104_20170219_01_T1/')
>>>
>>> # Query a date for Landsat 5
>>> dl.list_gcp('l5', '042/034/*2016*')
>>>
>>> # Query a date for Landsat 7
>>> dl.list_gcp('l7', '042/034/*2016*')
>>>
>>> # Query a date for Landsat 8
>>> dl.list_gcp('l8', '042/034/*2016*')
>>>
>>> # Query Sentinel-2
>>> dl.list_gcp('s2a', '21/H/UD/*2019*.SAFE/GRANULE/*')

Returns:

dict
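The query strings in the examples above follow a regular pattern, so they can be assembled programmatically. The sketch below assumes that the Landsat queries are a zero-padded WRS-2 path and row followed by a wildcard year, and that the Sentinel-2 query is an MGRS tile split into zone, latitude band, and grid square. The helpers landsat_query and sentinel2_query are hypothetical and are not part of geowombat.

```python
def landsat_query(path: int, row: int, year: int) -> str:
    """Build a 'PPP/RRR/*YYYY*' query (assumed WRS-2 path/row layout)."""
    return f"{path:03d}/{row:03d}/*{year}*"


def sentinel2_query(mgrs_tile: str, year: int) -> str:
    """Build a Sentinel-2 granule query from a 5-character MGRS tile such as '21HUD'."""
    tile = mgrs_tile.upper()
    if len(tile) != 5 or not tile[:2].isdigit():
        raise ValueError("expected a tile like '21HUD' (2-digit zone + 3 letters)")
    zone, band, square = tile[:2], tile[2], tile[3:]
    return f"{zone}/{band}/{square}/*{year}*.SAFE/GRANULE/*"


# These reproduce the query strings used in the examples above.
l8_query = landsat_query(42, 34, 2016)
s2_query = sentinel2_query("21HUD", 2019)
```

The resulting strings can be passed as the query argument, for example dl.list_gcp('l8', l8_query).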
