SEN12MS#

class torchgeo.datasets.SEN12MS(root='data', split='train', bands=('VV', 'VH', 'B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B8A', 'B09', 'B10', 'B11', 'B12'), transforms=None, checksum=False)[source]#

Bases: NonGeoDataset

SEN12MS dataset.

The SEN12MS dataset contains 180,662 patch triplets of corresponding Sentinel-1 dual-pol SAR data, Sentinel-2 multi-spectral images, and MODIS-derived land cover maps. The patches are distributed across the land masses of the Earth and spread over all four meteorological seasons. This is reflected by the dataset structure. All patches are provided in the form of 16-bit GeoTiffs containing the following specific information:

  • Sentinel-1 SAR: 2 channels corresponding to sigma nought backscatter values in dB scale for VV and VH polarization.

  • Sentinel-2 Multi-Spectral: 13 channels corresponding to the 13 spectral bands (B1, B2, B3, B4, B5, B6, B7, B8, B8a, B9, B10, B11, B12).

  • MODIS Land Cover: 4 channels corresponding to IGBP, LCCS Land Cover, LCCS Land Use, and LCCS Surface Hydrology layers.

If you use this dataset in your research, please cite the following paper:

Note

This dataset can be automatically downloaded using the following bash script:

for season in 1158_spring 1868_summer 1970_fall 2017_winter
do
    for source in lc s1 s2
    do
        wget "ftp://m1474000:[email protected]/ROIs${season}_${source}.tar.gz"
        tar xvzf "ROIs${season}_${source}.tar.gz"
    done
done

for split in train test
do
    wget "https://raw.githubusercontent.com/schmitt-muc/SEN12MS/3a41236a28d08d253ebe2fa1a081e5e32aa7eab4/splits/${split}_list.txt"
done

or manually downloaded from https://dataserv.ub.tum.de/s/m1474000 and schmitt-muc/SEN12MS. This download will likely take several hours.

__init__(root='data', split='train', bands=('VV', 'VH', 'B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B8A', 'B09', 'B10', 'B11', 'B12'), transforms=None, checksum=False)[source]#

Initialize a new SEN12MS dataset instance.

The bands argument allows for the subsetting of bands returned by the dataset. Integers in bands index into a stack of Sentinel 1 and Sentinel 2 imagery. Indices 0 and 1 correspond to the Sentinel 1 imagery where indices 2 through 14 correspond to the Sentinel 2 imagery.

Parameters:
  • root (str | PathLike[str]) – root directory where dataset can be found

  • split (str) – one of “train” or “test”

  • bands (Sequence[str]) – a sequence of band indices to use where the indices correspond to the array index of combined Sentinel 1 and Sentinel 2

  • transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes input sample and its target as entry and returns a transformed version

  • checksum (bool) – if True, check the MD5 of the downloaded files (may be slow)

Raises:
__getitem__(index)[source]#

Return an index within the dataset.

Parameters:

index (int) – index to return

Returns:

data and label at that index

Return type:

dict[str, Any]

__len__()[source]#

Return the number of data points in the dataset.

Returns:

length of the dataset

Return type:

int

plot(sample, show_titles=True, suptitle=None)[source]#

Plot a sample from the dataset.

Parameters:
  • sample (dict[str, Any]) – a sample returned by __getitem__()

  • show_titles (bool) – flag indicating whether to show titles above each panel

  • suptitle (str | None) – optional suptitle to use for figure

Returns:

a matplotlib Figure with the rendered sample

Raises:

RGBBandsMissingError – If bands does not include all RGB bands.

Return type:

Figure

Added in version 0.2.

__annotate_func__()#

The type of the None singleton.