Opening and Plotting Rasters with GeoWombat#

This notebook covers the main options for opening and visualizing raster data with gw.open() and gw.config.update(). Topics include:

  1. Opening a single file and inspecting properties

  2. Band naming (explicit and sensor-based)

  3. Opening specific bands

  4. Resampling and changing resolution

  5. Reprojecting to a different CRS

  6. Subsetting by bounds

  7. Setting nodata values

  8. Scaling data

  9. Stacking multiple files (stack_dim='time' and stack_dim='band')

  10. Mosaicking multiple files

  11. Plotting


Setup#

[37]:
import os
os.environ['CPL_LOG'] = '/dev/null'  # suppress GDAL warnings

import matplotlib.pyplot as plt
import geowombat as gw
from geowombat.data import (
    rgbn,
    rgbn_20160101,
    rgbn_20160401,
    rgbn_suba,
    rgbn_subb,
    l8_224078_20200518,
    l8_224077_20200518_B2,
    l8_224077_20200518_B3,
    l8_224077_20200518_B4,
)

1. Opening a Single File#

gw.open() is a context manager that opens rasters lazily using dask-backed xarray DataArrays. Data are not loaded into memory until you call .compute() or .gw.save().

[38]:
with gw.open(rgbn) as src:
    print(f'Shape: {src.shape}')
    print(f'Dims:  {src.dims}')
    print(f'CRS:   {src.crs}')
    print(f'Res:   {src.res}')
    print(f'Dtype: {src.dtype}')
    print(f'Bands: {src.band.values}')
    print(f'\nAttributes:')
    for k, v in src.attrs.items():
        print(f'  {k}: {v}')
Shape: (4, 403, 515)
Dims:  ('band', 'y', 'x')
CRS:   32618
Res:   (5.0, 5.0)
Dtype: uint8
Bands: [1 2 3 4]

Attributes:
  transform: (5.0, 0.0, 792988.0, 0.0, -5.0, 2050382.0)
  crs: 32618
  res: (5.0, 5.0)
  is_tiled: 1
  nodatavals: (nan, nan, nan, nan)
  _FillValue: nan
  scales: (1.0, 1.0, 1.0, 1.0)
  offsets: (0.0, 0.0, 0.0, 0.0)
  filename: /home/mmann1123/Documents/github/geowombat/src/geowombat/data/rgbn.tif
  resampling: nearest
  DataType: Generic
  AREA_OR_POINT: Area
  _data_are_separate: 0
  _data_are_stacked: 0

2. Band Naming#

By default, bands are numbered (1, 2, 3, …). You can assign meaningful names two ways.

2a. Explicit band names#

[39]:
with gw.open(rgbn, band_names=['red', 'green', 'blue', 'nir']) as src:
    print(f'Bands: {list(src.band.values)}')
    # Select by name
    print(f'NIR band shape: {src.sel(band="nir").shape}')
Bands: ['red', 'green', 'blue', 'nir']
NIR band shape: (403, 515)

2b. Sensor-based band names via gw.config.update(sensor=...)#

GeoWombat has built-in sensor configurations that auto-assign band names. Common sensors: 'rgbn', 'bgrn', 'l5', 'l7', 'l8', 's2', 'ps'.

[40]:
with gw.config.update(sensor='bgrn'):
    with gw.open(rgbn) as src:
        print(f'Bands: {list(src.band.values)}')
Bands: ['blue', 'green', 'red', 'nir']

3. Selecting Specific Bands#

Select bands after opening with .sel(band=...). The band_names parameter is for renaming bands, not selecting them.

[41]:
# Select a single band by index
with gw.open(rgbn) as src:
    band1 = src.sel(band=1)
    print(f'Single band shape: {band1.shape}')
    print(f'Dims: {band1.dims}')
Single band shape: (403, 515)
Dims: ('y', 'x')
[42]:
# Or open all bands and select after
with gw.open(rgbn, band_names=['red', 'green', 'blue', 'nir']) as src:
    rgb = src.sel(band=['red', 'green', 'blue'])
    print(f'RGB subset shape: {rgb.shape}')
    print(f'RGB bands: {list(rgb.band.values)}')
RGB subset shape: (3, 403, 515)
RGB bands: ['red', 'green', 'blue']

4. Resampling and Changing Resolution#

Use gw.config.update(ref_res=...) to change the output resolution. The resampling parameter controls the interpolation method.

Resampling choices: 'nearest', 'bilinear', 'cubic', 'cubic_spline', 'lanczos', 'average', 'mode', 'max', 'min', 'med', 'gauss'.

[43]:
# Original resolution
with gw.open(rgbn) as src:
    print(f'Original: {src.shape}, res={src.res}')

# Downsample to 20m
with gw.config.update(ref_res=20):
    with gw.open(rgbn, resampling='bilinear') as src:
        print(f'20m:      {src.shape}, res={src.res}')

# Non-square pixels
with gw.config.update(ref_res=(10, 20)):
    with gw.open(rgbn, resampling='cubic') as src:
        print(f'10x20m:   {src.shape}, res={src.res}')
Original: (4, 403, 515), res=(5.0, 5.0)
20m:      (4, 100, 128), res=(20.0, 20.0)
10x20m:   (4, 100, 257), res=(10.0, 20.0)

5. Reprojecting to a Different CRS#

Use gw.config.update(ref_crs=...) with an EPSG code or a reference image.

Note: When changing CRS units (e.g., meters to degrees), you must also set ref_res in the target CRS units — geowombat does not auto-convert resolution across unit systems.

[44]:
with gw.open(rgbn) as src:
    print(f'Original CRS: {src.crs}')
    print(f'Original res: {src.res}')

# Reproject to a different UTM zone (same unit system — meters)
with gw.config.update(ref_crs=32617):
    with gw.open(rgbn) as src:
        print(f'\nReprojected CRS: {src.crs}')
        print(f'Shape: {src.shape}')
        print(f'Res: {src.res}')

# Reproject to EPSG:4326 (degrees) — must set ref_res in degrees
# ~0.00005 degrees is roughly 5m at mid-latitudes
with gw.config.update(ref_crs=4326, ref_res=0.00005):
    with gw.open(rgbn) as src:
        print(f'\nWGS84 CRS: {src.crs}')
        print(f'Shape: {src.shape}')
        print(f'Res: {src.res}')
Original CRS: 32618
Original res: (5.0, 5.0)

Reprojected CRS: 32617
Shape: (4, 424, 533)
Res: (5.0, 5.0)

WGS84 CRS: 4326
Shape: (4, 370, 493)
Res: (5e-05, 5e-05)
[45]:
# Use a reference image to match CRS and resolution
with gw.config.update(ref_image=l8_224078_20200518):
    with gw.open(rgbn) as src:
        print(f'CRS (from ref): {src.crs}')
        print(f'Res (from ref): {src.res}')
CRS (from ref): 32621
Res (from ref): (30.0, 30.0)

6. Subsetting by Bounds#

Clip the output to a bounding box using bounds or gw.config.update(ref_bounds=...).

Bounds format: (left, bottom, right, top) in the output CRS.

[46]:
# Get the full extent first
with gw.open(rgbn) as src:
    print(f'Full extent: {src.shape}')
    print(f'Y range: {float(src.y.min()):.0f} to {float(src.y.max()):.0f}')
    print(f'X range: {float(src.x.min()):.0f} to {float(src.x.max()):.0f}')

# Subset with bounds (left, bottom, right, top)
with gw.open(rgbn, bounds=(793000, 2049000, 794000, 2050000)) as src:
    print(f'\nSubset: {src.shape}')
    print(f'Y range: {float(src.y.min()):.0f} to {float(src.y.max()):.0f}')
    print(f'X range: {float(src.x.min()):.0f} to {float(src.x.max()):.0f}')
Full extent: (4, 403, 515)
Y range: 2048370 to 2050380
X range: 792990 to 795560

Subset: (4, 200, 200)
Y range: 2049002 to 2049998
X range: 793002 to 793998
[47]:
# Or use ref_bounds from config (useful when opening multiple files)
with gw.config.update(ref_bounds=(793000, 2049000, 794000, 2050000)):
    with gw.open(rgbn) as src:
        print(f'Config bounds subset: {src.shape}')
Config bounds subset: (4, 200, 200)

7. Setting Nodata Values#

The nodata parameter overrides the file’s internal nodata value. This is useful when files have incorrect or missing nodata metadata.

[48]:
# Check the file's built-in nodata
with gw.open(l8_224078_20200518) as src:
    print(f'Default nodata: {src.attrs.get("nodatavals")}')

# Override with a specific nodata value
with gw.open(l8_224078_20200518, nodata=0) as src:
    print(f'Overridden nodata: {src.attrs.get("nodatavals")}')

# Or set via config (takes highest precedence)
with gw.config.update(nodata=0):
    with gw.open(l8_224078_20200518) as src:
        print(f'Config nodata: {src.attrs.get("nodatavals")}')
Default nodata: (nan, nan, nan)
Overridden nodata: (0, 0, 0)
Config nodata: (0, 0, 0)

8. Scaling Data#

Use scale_data=True to apply gain/offset: scaled = data * scale_factor + offset.

Scale values come from (in priority order):

  1. gw.config.update(scale_factor=..., offset=...)

  2. gw.open(scale_factor=..., offset=...)

  3. File metadata

[49]:
# No scaling (raw DN values)
with gw.open(rgbn) as src:
    print(f'Raw dtype: {src.dtype}')
    print(f'Raw range: {float(src.min()):.0f} - {float(src.max()):.0f}')

# Scale to 0-1 reflectance-like values
with gw.open(rgbn, scale_data=True, scale_factor=1/255.0) as src:
    print(f'Scaled dtype: {src.dtype}')
    print(f'Scaled range: {float(src.min()):.3f} - {float(src.max()):.3f}')
Raw dtype: uint8
Raw range: 0 - 255
Scaled dtype: float64
Scaled range: 0.000 - 1.000

9. Stacking Multiple Files#

When given a list of files, gw.open() stacks them along either 'time' or 'band' dimension.

9a. Time stacking (stack_dim='time', default)#

Each file becomes a time step. Output shape: (time, band, y, x).

[50]:
with gw.open(
    [rgbn_20160101, rgbn_20160401],
    stack_dim='time',
) as src:
    print(f'Shape: {src.shape}')
    print(f'Dims:  {src.dims}')
    print(f'Time:  {src.time.values}')
Shape: (2, 4, 403, 515)
Dims:  ('time', 'band', 'y', 'x')
Time:  [1 2]
[51]:
# With custom time names
with gw.open(
    [rgbn_20160101, rgbn_20160401],
    stack_dim='time',
    time_names=['2016-01-01', '2016-04-01'],
) as src:
    print(f'Time: {src.time.values}')
Time: ['2016-01-01' '2016-04-01']

9b. Band stacking (stack_dim='band')#

Bands from each file are concatenated into a single band dimension. Output shape: (band, y, x) with more bands.

[52]:
with gw.open(
    [rgbn_20160101, rgbn_20160401],
    stack_dim='band',
) as src:
    print(f'Shape: {src.shape}')
    print(f'Dims:  {src.dims}')
    print(f'Bands: {src.band.values}')
Shape: (8, 403, 515)
Dims:  ('band', 'y', 'x')
Bands: [1 2 3 4 5 6 7 8]

9c. Stacking single-band files#

Stack individual band files into a multi-band image.

[53]:
with gw.open(
    [l8_224077_20200518_B2, l8_224077_20200518_B3, l8_224077_20200518_B4],
    stack_dim='band',
    band_names=['blue', 'green', 'red'],
) as src:
    print(f'Shape: {src.shape}')
    print(f'Bands: {list(src.band.values)}')
Shape: (3, 1515, 2006)
Bands: ['blue', 'green', 'red']

9d. Controlling overlap extent with bounds_by#

When stacking files with different extents, bounds_by controls the output:

bounds_by

Behavior

'reference' (default)

Use bounds of the first (or ref_image) file

'intersection'

Use the overlapping area only

'union'

Use the combined extent of all files

[54]:
# Union: maximum extent
with gw.open(
    [rgbn_suba, rgbn_subb],
    stack_dim='band',
    bounds_by='union',
) as src:
    print(f'Union shape: {src.shape}')

# Intersection: minimum extent
with gw.open(
    [rgbn_suba, rgbn_subb],
    stack_dim='band',
    bounds_by='intersection',
) as src:
    print(f'Intersection shape: {src.shape}')
15:07:52:WARNING:548:xarray_.check_alignment:The stacked dimensions are not aligned. If this was not intentional, use gw.config.update to align coordinates. To suppress this message, use with gw.config.update(ignore_warnings=True):.
Union shape: (8, 431, 570)
15:07:52:WARNING:548:xarray_.check_alignment:The stacked dimensions are not aligned. If this was not intentional, use gw.config.update to align coordinates. To suppress this message, use with gw.config.update(ignore_warnings=True):.

Intersection shape: (8, 431, 570)

10. Mosaicking#

Use mosaic=True to spatially merge adjacent/overlapping images into a single raster instead of stacking them. The overlap parameter controls how overlapping pixels are resolved: 'max', 'min', or 'mean'.

[55]:
with gw.open(
    [rgbn_suba, rgbn_subb],
    mosaic=True,
    overlap='max',
) as src:
    print(f'Mosaic shape: {src.shape}')
    print(f'Dims: {src.dims}')
Mosaic shape: (4, 212, 276)
Dims: ('band', 'y', 'x')

11. Plotting#

Use src.gw.imshow() for quick visualization. It wraps xarray’s .plot.imshow() with tidier axes. Only 1-band or 3-band arrays can be plotted.

RGB plot#

[56]:
with gw.open(rgbn) as src:
    # Select bands 3, 2, 1 (RGB order) for display
    src.sel(band=[3, 2, 1]).gw.imshow()
_images/open_plot_39_0.png

Single band plot#

[57]:
fig, ax = plt.subplots(figsize=(6, 6))
with gw.open(rgbn, band_names=['red', 'green', 'blue', 'nir']) as src:
    src.sel(band='nir').plot(ax=ax, cmap='YlGn', robust=True)
    ax.set_title('NIR Band')
    ax.set_aspect('equal')
plt.tight_layout()
plt.show()
_images/open_plot_41_0.png

Plot with robust=True and nodata masking#

[58]:
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

with gw.open(l8_224078_20200518, nodata=0) as src:
    # Without masking
    src.sel(band=[3, 2, 1]).gw.imshow(ax=axes[0])
    axes[0].set_title('Without nodata mask')

    # With nodata masking (mask=True hides nodata pixels)
    src.sel(band=[3, 2, 1]).gw.imshow(mask=True, nodata=0, robust=True, ax=axes[1])
    axes[1].set_title('With nodata mask + robust stretch')

plt.tight_layout()
plt.show()
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0..24147].
_images/open_plot_43_1.png

Flip band order#

If your bands are in BGR order, use flip=True to reverse them for display.

[59]:
with gw.open(rgbn) as src:
    src.sel(band=[1, 2, 3]).gw.imshow(flip=True)
_images/open_plot_45_0.png

gw.config.update() Reference#

The config context manager controls global settings that override gw.open() arguments.

Parameter

Description

Example

sensor

Auto-assign band names

'l8', 's2', 'bgrn', 'rgbn'

ref_image

Reference file for CRS, res, and bounds

'ref.tif'

ref_crs

Output CRS (EPSG code or file path)

4326, 32618

ref_res

Output resolution in CRS units

10, (10, 20)

ref_bounds

Output bounding box (left, bottom, right, top)

(xmin, ymin, xmax, ymax)

nodata

Nodata value (highest precedence)

0, -9999

scale_factor

Scale factor for scale_data=True

1e-4

offset

Offset for scale_data=True

-0.2

Config settings can be combined:

with gw.config.update(ref_crs=4326, ref_res=0.001, nodata=0):
    with gw.open('image.tif', resampling='bilinear') as src:
        ...

gw.open() Reference#

Parameter

Default

Description

filename

(required)

File path, list of paths, or glob pattern ('*.tif')

band_names

None

Custom band names

time_names

None

Custom time labels when stacking

stack_dim

'time'

Stack dimension: 'time' or 'band'

bounds

None

Spatial subset: (left, bottom, right, top)

bounds_by

'reference'

Extent strategy: 'reference', 'intersection', 'union'

resampling

'nearest'

Resampling method

mosaic

False

Merge files spatially instead of stacking

overlap

'max'

Mosaic overlap handling: 'max', 'min', 'mean'

nodata

None

Override nodata value

scale_factor

None

Scale multiplier

offset

None

Additive offset

scale_data

False

Apply scaling: data * scale_factor + offset

dtype

None

Force output data type (e.g., 'float32')

num_workers

1

Dask workers for parallel I/O