Sentinel

class torchgeo.datasets.Sentinel(paths='data', crs=None, res=None, bands=None, transforms=None, cache=True, time_series=False)

Bases: RasterDataset

Abstract base class for all Sentinel datasets.

Sentinel is a family of satellites launched by the European Space Agency (ESA) under the Copernicus Programme.

If you use this dataset in your research, please cite it using the following format:

class torchgeo.datasets.Sentinel1(paths='data', crs=None, res=(10, 10), bands=['VV', 'VH'], transforms=None, cache=True, time_series=False)

Bases: Sentinel

Sentinel-1 dataset.

The Sentinel-1 mission comprises a constellation of two polar-orbiting satellites that operate day and night, performing C-band synthetic aperture radar (SAR) imaging. This lets them acquire imagery regardless of the weather.

Data can be downloaded from:

Product Types:

Polarizations:

  • HH: horizontal transmit, horizontal receive

  • HV: horizontal transmit, vertical receive

  • VV: vertical transmit, vertical receive

  • VH: vertical transmit, horizontal receive

Acquisition Modes:

  • SM: Stripmap

  • IW: Interferometric Wide swath

  • EW: Extra Wide swath

  • WV: Wave

Note

Currently, this dataset supports only the GRD product type. Data must be radiometrically terrain corrected (RTC). You can do this manually using a DEM, or download an On Demand RTC product from ASF DAAC.

Note

Mixing \(\gamma_0\) and \(\sigma_0\) backscatter coefficient data is not recommended. Similarly, power, decibel, and amplitude scale data should not be mixed, and TorchGeo does not attempt to convert all data to a common scale.
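To illustrate why scales should not be mixed, the following minimal sketch converts one backscatter value between power, amplitude, and decibel scale using the standard relations (power = amplitude², dB = 10 · log10(power)). The helper names are hypothetical and are not part of TorchGeo.

```python
import math


def power_to_db(power: float) -> float:
    """Convert a power-scale backscatter value to decibels."""
    return 10.0 * math.log10(power)


def power_to_amplitude(power: float) -> float:
    """Convert a power-scale backscatter value to amplitude scale."""
    return math.sqrt(power)


# The same physical measurement has a very different numeric value in each
# scale, so files stored in different scales cannot be compared directly.
power = 0.1
db = power_to_db(power)
amplitude = power_to_amplitude(power)
print(f"power={power}, amplitude={amplitude:.3f}, dB={db:.1f}")
```

If your files use different scales, convert them to a single scale yourself before loading them.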

Added in version 0.4.

filename_regex = '\n        ^S1(?P<mission>[A-D])\n        _(?P<mode>SM|IW|EW|WV)\n        _(?P<date>\\d{8}T\\d{6})\n        _(?P<polarization>[DS][HV])\n        (?P<orbit>[PRO])\n        _RTC(?P<spacing>\\d{2})\n        _(?P<package>G)\n        _(?P<backscatter>[gs])\n        (?P<scale>[pda])\n        (?P<mask>[uw])\n        (?P<filter>[nf])\n        (?P<area>[ec])\n        (?P<matching>[dm])\n        _(?P<product>[0-9A-Z]{4})\n        _(?P<band>[VH]{2})\n        \\.\n    '

Regular expression used to extract date from filename.

The expression should use named groups. The expression may contain any number of groups. The following groups are specifically searched for by the base class:

  • date: used to calculate mint and maxt for index insertion

  • start: used to calculate mint for index insertion

  • stop: used to calculate maxt for index insertion

When separate_files is True, the following additional groups are searched for to find other files:

  • band: replaced with requested band name
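As a rough illustration of how the named groups work, the sketch below applies the Sentinel-1 pattern to a filename using only the standard library. It assumes the pattern is compiled in verbose mode, which the embedded newlines and indentation suggest. The filename is hypothetical and was constructed only to satisfy the pattern.

```python
import re

# Pattern copied from Sentinel1.filename_regex. In verbose mode the
# newlines and indentation inside the pattern are ignored.
FILENAME_REGEX = r"""
    ^S1(?P<mission>[A-D])
    _(?P<mode>SM|IW|EW|WV)
    _(?P<date>\d{8}T\d{6})
    _(?P<polarization>[DS][HV])
    (?P<orbit>[PRO])
    _RTC(?P<spacing>\d{2})
    _(?P<package>G)
    _(?P<backscatter>[gs])
    (?P<scale>[pda])
    (?P<mask>[uw])
    (?P<filter>[nf])
    (?P<area>[ec])
    (?P<matching>[dm])
    _(?P<product>[0-9A-Z]{4})
    _(?P<band>[VH]{2})
    \.
"""

# Hypothetical filename, for illustration only.
filename = "S1A_IW_20200101T000000_DVP_RTC30_G_gpuned_ABCD_VV.tif"

match = re.match(FILENAME_REGEX, filename, re.VERBOSE)
assert match is not None, "filename does not follow the expected pattern"

# Each named group becomes a key in the resulting dictionary.
fields = match.groupdict()
print(fields["mission"], fields["mode"], fields["date"], fields["band"])
```

The date and band groups are the ones the description above singles out; the remaining groups simply record the other components of the filename.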

date_format = '%Y%m%dT%H%M%S'

Date format string used to parse date from filename.

Not used if filename_regex does not contain a date group or start and stop groups.
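For reference, this format string can be tried out with the standard library alone. The sketch below parses a hypothetical timestamp of the kind captured by a date group; how TorchGeo turns the result into mint and maxt is not shown here.

```python
from datetime import datetime

DATE_FORMAT = "%Y%m%dT%H%M%S"  # value of Sentinel1.date_format

# Hypothetical timestamp, as it might appear in the date group of a filename.
timestamp = datetime.strptime("20200101T123045", DATE_FORMAT)
print(timestamp.isoformat())
```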

all_bands: tuple[str, ...] = ('HH', 'HV', 'VV', 'VH')

Names of all available bands in the dataset

separate_files = True

True if data is stored in a separate file for each band, else False.

__init__(paths='data', crs=None, res=(10, 10), bands=['VV', 'VH'], transforms=None, cache=True, time_series=False)

Initialize a new Dataset instance.

Parameters:
  • paths (str | PathLike[str] | list[str | PathLike[str]]) – one or more root directories to search or files to load

  • crs (CRS | None) – coordinate reference system (CRS) to warp to (defaults to the CRS of the first file found)

  • res (float | tuple[float, float]) – resolution of the dataset in units of CRS in (xres, yres) format. If a single float is provided, it is used for both the x and y resolution. (defaults to (10, 10))

  • bands (Sequence[str]) – bands to return (defaults to ['VV', 'VH'])

  • transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample and returns a transformed version

  • cache (bool) – if True, cache file handle to speed up repeated sampling

  • time_series (bool) – if True, stack data along the time series dimension [T, C, H, W]. If False, merge data into a [C, H, W] mosaic.

Raises:

Added in version 0.9: The time_series parameter.

Changed in version 0.5: root was renamed to paths.

filename_glob = 'S1*{}.*'

Glob expression used to search for files.

This expression should be specific enough that it will not pick up files from other datasets. It should not include a file extension, as the dataset may be in a different file format than what it was originally downloaded as.
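A minimal sketch of how such a glob behaves, using only the standard library. It assumes the {} placeholder is replaced with a band name before the pattern is applied. The filenames are hypothetical.

```python
from fnmatch import fnmatchcase

FILENAME_GLOB = "S1*{}.*"  # value of Sentinel1.filename_glob

# Assumption: the placeholder is filled with a band name before globbing.
pattern = FILENAME_GLOB.format("VV")

# Hypothetical filenames, for illustration only.
candidates = [
    "S1A_IW_20200101T000000_DVP_RTC30_G_gpuned_ABCD_VV.tif",
    "S1A_IW_20200101T000000_DVP_RTC30_G_gpuned_ABCD_VH.tif",
    "T26EMU_20220414T110751_B04.jp2",
]

# Keep only the files that match the band-specific glob.
matches = [name for name in candidates if fnmatchcase(name, pattern)]
print(matches)
```

Only the VV file is selected. The trailing .* keeps the glob independent of the file extension, as the description above recommends.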

plot(sample, show_titles=True, suptitle=None)

Plot a sample from the dataset.

Parameters:
  • sample (dict[str, Any]) – a sample returned by RasterDataset.__getitem__()

  • show_titles (bool) – flag indicating whether to show titles above each panel

  • suptitle (str | None) – optional string to use as a suptitle

Returns:

a matplotlib Figure with the rendered sample

Return type:

Figure

class torchgeo.datasets.Sentinel2(paths='data', crs=None, res=10, bands=None, transforms=None, cache=True, time_series=False)

Bases: Sentinel

Sentinel-2 dataset.

The Copernicus Sentinel-2 mission comprises a constellation of two polar-orbiting satellites placed in the same sun-synchronous orbit, phased at 180° to each other. It aims to monitor variability in land surface conditions. Its wide swath width (290 km) and high revisit frequency (every 10 days at the equator with one satellite, and every 5 days with two satellites under cloud-free conditions, which results in 2-3 days at mid-latitudes) support monitoring of changes to Earth’s surface.

date_format = '%Y%m%dT%H%M%S'

Date format string used to parse date from filename.

Not used if filename_regex does not contain a date group or start and stop groups.

all_bands: tuple[str, ...] = ('B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B8A', 'B09', 'B10', 'B11', 'B12')

Names of all available bands in the dataset

rgb_bands: tuple[str, ...] = ('B04', 'B03', 'B02')

Names of RGB bands in the dataset, used for plotting
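As a sketch of how rgb_bands might be used when plotting, the snippet below looks up the channel index of each RGB band within the list of requested bands. The helper rgb_indices is hypothetical and TorchGeo's own plot method may work differently. In particular, this sketch raises a plain ValueError when a band is missing.

```python
from collections.abc import Sequence

RGB_BANDS = ("B04", "B03", "B02")  # value of Sentinel2.rgb_bands


def rgb_indices(bands: Sequence[str]) -> list[int]:
    """Return the channel index of each RGB band within `bands`."""
    missing = [band for band in RGB_BANDS if band not in bands]
    if missing:
        raise ValueError(f"bands is missing RGB bands: {missing}")
    return [bands.index(band) for band in RGB_BANDS]


# Bands requested in wavelength order: blue, green, red, near-infrared.
indices = rgb_indices(["B02", "B03", "B04", "B08"])
print(indices)
```

The indices come back in red, green, blue order, which is the channel order an RGB image expects.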

separate_files = True

True if data is stored in a separate file for each band, else False.

__init__(paths='data', crs=None, res=10, bands=None, transforms=None, cache=True, time_series=False)

Initialize a new Dataset instance.

Parameters:
  • paths (str | PathLike[str] | Iterable[str | PathLike[str]]) – one or more root directories to search or files to load

  • crs (CRS | None) – coordinate reference system (CRS) to warp to (defaults to the CRS of the first file found)

  • res (float | tuple[float, float]) – resolution of the dataset in units of CRS in (xres, yres) format. If a single float is provided, it is used for both the x and y resolution. (defaults to 10)

  • bands (Sequence[str] | None) – bands to return (defaults to all bands)

  • transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample and returns a transformed version

  • cache (bool) – if True, cache file handle to speed up repeated sampling

  • time_series (bool) – if True, stack data along the time series dimension [T, C, H, W]. If False, merge data into a [C, H, W] mosaic.

Raises:

DatasetNotFoundError – If dataset is not found.

Added in version 0.9: The time_series parameter.

Changed in version 0.5: root was renamed to paths.

filename_glob = 'T*_*_{}*.*'

Glob expression used to search for files.

This expression should be specific enough that it will not pick up files from other datasets. It should not include a file extension, as the dataset may be in a different file format than what it was originally downloaded as.

filename_regex = '\n        ^T(?P<tile>\\d{{2}}[A-Z]{{3}})\n        _(?P<date>\\d{{8}}T\\d{{6}})\n        _(?P<band>B[018][\\dA])\n        (?:_(?P<resolution>{}))?\n        \\..*$\n    '

Regular expression used to extract date from filename.

The expression should use named groups. The expression may contain any number of groups. The following groups are specifically searched for by the base class:

  • date: used to calculate mint and maxt for index insertion

  • start: used to calculate mint for index insertion

  • stop: used to calculate maxt for index insertion

When separate_files is True, the following additional groups are searched for to find other files:

  • band: replaced with requested band name
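The Sentinel-2 pattern differs from a plain regular expression in that its quantifier braces are doubled and it contains one empty {} placeholder, which suggests it is passed through str.format before being compiled. The sketch below works under that assumption. The value "10m" used to fill the placeholder is also an assumption, and the filenames are hypothetical.

```python
import re

# Pattern copied from Sentinel2.filename_regex. Doubled braces become literal
# regex quantifiers after formatting; the single {} is a format placeholder.
FILENAME_REGEX = r"""
    ^T(?P<tile>\d{{2}}[A-Z]{{3}})
    _(?P<date>\d{{8}}T\d{{6}})
    _(?P<band>B[018][\dA])
    (?:_(?P<resolution>{}))?
    \..*$
"""

# Assumption: the placeholder is filled with a resolution suffix such as "10m".
pattern = re.compile(FILENAME_REGEX.format("10m"), re.VERBOSE)

# Hypothetical filenames, with and without the optional resolution suffix.
filenames = [
    "T26EMU_20220414T110751_B04_10m.jp2",
    "T26EMU_20220414T110751_B8A.jp2",
]

parsed = []
for filename in filenames:
    match = pattern.match(filename)
    assert match is not None, f"{filename} does not follow the pattern"
    parsed.append(match.groupdict())

for fields in parsed:
    print(fields["tile"], fields["date"], fields["band"], fields["resolution"])
```

Because the resolution group is optional, it is None for filenames that carry no resolution suffix.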

plot(sample, show_titles=True, suptitle=None)

Plot a sample from the dataset.

Parameters:
  • sample (dict[str, Any]) – a sample returned by RasterDataset.__getitem__()

  • show_titles (bool) – flag indicating whether to show titles above each panel

  • suptitle (str | None) – optional string to use as a suptitle

Returns:

a matplotlib Figure with the rendered sample

Raises:

RGBBandsMissingError – If bands does not include all RGB bands.

Return type:

Figure

Changed in version 0.3: The method now takes a sample dict, not a Tensor. Additionally, it is now possible to show subplot titles and/or use a custom suptitle.