SSL4EO-L Benchmark

class torchgeo.datasets.SSL4EOLBenchmark(root='data', sensor='oli_sr', product='cdl', split='train', classes=None, transforms=None, download=False, checksum=False)

Bases: NonGeoDataset

SSL4EO Landsat Benchmark Evaluation Dataset.

The dataset is intended for evaluating self-supervised learning (SSL) techniques. Each benchmark dataset consists of 25,000 images with corresponding land cover classification masks.

Dataset format:

Input Landsat image and single-channel mask
25,000 total samples split into train, val, test (70%, 15%, 15%)
NLCD dataset version has 17 classes
CDL dataset version has 134 classes

Each patch has the following properties:

264 x 264 pixels
Resampled to 30 m resolution (7920 x 7920 m)
Single multispectral GeoTIFF file
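
The figures above can be cross-checked with a little arithmetic. The sketch below derives the nominal number of samples per split and the ground footprint of one patch from the stated values; the constant names are illustrative, and the exact per-split file counts in the distributed data are not confirmed here.

```python
TOTAL_SAMPLES = 25_000
SPLIT_PERCENT = {"train": 70, "val": 15, "test": 15}
PATCH_SIZE_PX = 264   # patch width and height in pixels
RESOLUTION_M = 30     # ground sampling distance in metres

# Integer arithmetic avoids floating-point rounding in the split sizes.
split_sizes = {name: TOTAL_SAMPLES * pct // 100 for name, pct in SPLIT_PERCENT.items()}

# Side length of one patch on the ground, in metres.
footprint_m = PATCH_SIZE_PX * RESOLUTION_M

print(split_sizes)  # {'train': 17500, 'val': 3750, 'test': 3750}
print(footprint_m)  # 7920
```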

If you use this dataset in your research, please cite the following paper:

https://proceedings.neurips.cc/paper_files/paper/2023/hash/bbf7ee04e2aefec136ecf60e346c2e61-Abstract-Datasets_and_Benchmarks.html

Added in version 0.5.

__init__(root='data', sensor='oli_sr', product='cdl', split='train', classes=None, transforms=None, download=False, checksum=False)

Initialize a new SSL4EO Landsat Benchmark instance.

Parameters:

root (str | PathLike[str]) – root directory where dataset can be found
sensor (str) – one of [‘etm_toa’, ‘etm_sr’, ‘oli_tirs_toa’, ‘oli_sr’]
product (str) – mask target, one of [‘cdl’, ‘nlcd’]
split (str) – dataset split, one of [‘train’, ‘val’, ‘test’]
classes (list[int] | None) – list of classes to include, the rest will be mapped to 0 (defaults to all classes for the chosen product)
transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample (a dictionary holding the image and its target mask) and returns a transformed version
download (bool) – if True, download dataset and store it in the root directory
checksum (bool) – if True, check the MD5 after downloading files (may be slow)

Raises:

AssertionError – if any arguments are invalid
DatasetNotFoundError – if the dataset is not found and download is False
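
As a minimal sketch of how the constructor might be called, the snippet below first checks the string arguments against the values listed above and then builds a validation split. `check_args` and `build_dataset` are hypothetical helpers, not part of torchgeo; the dataset is constructed with `download=False`, so the data is assumed to already exist under `root`.

```python
SENSORS = ("etm_toa", "etm_sr", "oli_tirs_toa", "oli_sr")
PRODUCTS = ("cdl", "nlcd")
SPLITS = ("train", "val", "test")


def check_args(sensor: str, product: str, split: str) -> None:
    """Hypothetical helper mirroring the documented argument checks."""
    assert sensor in SENSORS, f"unknown sensor: {sensor!r}"
    assert product in PRODUCTS, f"unknown product: {product!r}"
    assert split in SPLITS, f"unknown split: {split!r}"


def build_dataset(root="data", sensor="oli_sr", product="nlcd", split="val"):
    """Hypothetical wrapper: validate the arguments, then build the dataset."""
    check_args(sensor, product, split)
    # Imported lazily so check_args can be used without torchgeo installed.
    from torchgeo.datasets import SSL4EOLBenchmark

    # With download=False, DatasetNotFoundError is raised if the data is missing.
    return SSL4EOLBenchmark(
        root=root, sensor=sensor, product=product, split=split, download=False
    )
```

Checking the arguments up front gives a readable error message before any file access takes place; the class performs its own validation as well.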

__getitem__(index)

Return the sample at the given index.

Parameters: index (int) – index of the sample to return
Returns: sample containing the image and its mask
Return type: dict[str, Any]
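
Because each sample is a plain dictionary, a transform passed as `transforms` receives that dictionary and returns a modified one. The sketch below shows one possible transform; the keys `"image"` and `"mask"` follow the usual torchgeo convention and are an assumption here, as is the 8-bit value range. `scale_image` is a hypothetical name.

```python
from typing import Any


def scale_image(sample: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical transform: rescale the image, leave every other key untouched."""
    out = dict(sample)  # shallow copy so the caller's dict is not modified
    out["image"] = out["image"] / 255.0  # assumes 8-bit pixel values
    return out
```

The same function works on tensors, arrays, or plain floats, since it relies only on division being defined for the image value.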

__len__()

Return the number of data points in the dataset.

Returns: length of the dataset
Return type: int

retrieve_sample_collection()

Retrieve paths to samples in data directory.


plot(sample, show_titles=True, suptitle=None)

Plot a sample from the dataset.

Parameters:

sample (dict[str, Any]) – a sample returned by __getitem__()
show_titles (bool) – flag indicating whether to show titles above each panel
suptitle (str | None) – optional string to use as a suptitle

Returns:

a matplotlib Figure with the rendered sample

Return type:

Figure
