Cloud Cover Detection

class torchgeo.datasets.CloudCoverDetection(root='data', split='train', bands=('B02', 'B03', 'B04', 'B08'), transforms=None, download=False)

Bases: NonGeoDataset

Sentinel-2 Cloud Cover Segmentation Dataset.

This training dataset was generated as part of a crowdsourcing competition on DrivenData.org and was later validated by a team of expert annotators. See this website for dataset details.

The dataset consists of Sentinel-2 satellite imagery and corresponding cloud labels stored as GeoTIFFs. There are 22,728 chips in the training data, collected between 2018 and 2020.

Each chip has:

  • Four multispectral bands from the Sentinel-2 L2A product: B02, B03, B04, and B08 (refer to the Sentinel-2 documentation for more information about the bands).

  • A label raster for the corresponding source tile, giving a binary classification of whether each pixel is cloud or not.

If you use this dataset in your research, please cite the following paper:

Note

This dataset requires the following additional library to be installed:

  • azcopy: to download the dataset from Source Cooperative.

Added in version 0.4.
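A minimal sketch of constructing the dataset with the signature documented above. It assumes torchgeo is installed and that the files already exist under the root directory; the helper name `build_dataset` and the `DEFAULT_KWARGS` dictionary are illustrative and not part of torchgeo.

```python
from typing import Any

# Keyword arguments mirroring the documented defaults of CloudCoverDetection.
DEFAULT_KWARGS: dict[str, Any] = {
    "root": "data",  # placeholder path; point this at your own data directory
    "split": "train",
    "bands": ("B02", "B03", "B04", "B08"),
    "transforms": None,
    "download": False,
}


def build_dataset(**overrides: Any):
    """Create a CloudCoverDetection dataset, overriding defaults as needed.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so this module can be imported without torchgeo.
    from torchgeo.datasets import CloudCoverDetection

    kwargs = {**DEFAULT_KWARGS, **overrides}
    return CloudCoverDetection(**kwargs)
```

Calling `build_dataset(split='test')` would then load the test split, provided the data is present under `root` or `download=True` is passed and azcopy is available.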

__init__(root='data', split='train', bands=('B02', 'B03', 'B04', 'B08'), transforms=None, download=False)

Initialize a CloudCoverDetection instance.

Parameters:
  • root (str | PathLike[str]) – root directory where dataset can be found

  • split (str) – ‘train’ or ‘test’

  • bands (Sequence[str]) – the subset of bands to load

  • transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample (including its target) as a dictionary and returns a transformed version

  • download (bool) – if True, download dataset and store it in the root directory
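To illustrate the `transforms` parameter, here is a sketch of a callable with the documented `dict[str, Any] -> dict[str, Any]` shape. The key name `"image"` and the scale factor of 10000 (a common convention for Sentinel-2 L2A reflectance) are assumptions, not something this page confirms; `scale_reflectance` is a hypothetical name.

```python
from typing import Any

# Assumed scale factor: Sentinel-2 L2A reflectance is commonly stored as
# integers scaled by 10000. Adjust if your data uses a different convention.
REFLECTANCE_SCALE = 10000.0


def scale_reflectance(sample: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the sample with the image divided by REFLECTANCE_SCALE."""
    out = dict(sample)  # shallow copy so the input dict is left unmodified
    if "image" in out:  # "image" is an assumed key name
        out["image"] = out["image"] / REFLECTANCE_SCALE
    return out
```

Such a function could be passed as `transforms=scale_reflectance` when constructing the dataset. The division works for any value that supports `/`, including tensors and plain floats.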

Raises:

__len__()

Return the number of items in the dataset.

Returns:

the length of the dataset as an integer

Return type:

int

__getitem__(index)

Return a sample from the dataset.

Parameters:

index (int) – index to return

Returns:

data and label at given index

Return type:

dict[str, Any]
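Since the dataset supports `len()` and integer indexing, the first few samples can be inspected with a small helper. This is a sketch only: `describe_samples` is a hypothetical name, and it assumes nothing about the sample keys beyond each sample being a dictionary.

```python
from typing import Any


def describe_samples(dataset: Any, limit: int = 3) -> list[dict[str, str]]:
    """Summarize the first `limit` samples as {key: shape-or-type} dicts."""
    summaries = []
    for index in range(min(limit, len(dataset))):
        sample = dataset[index]  # calls __getitem__(index)
        summary = {}
        for key, value in sample.items():
            # Tensors and arrays expose .shape; fall back to the type name.
            shape = getattr(value, "shape", None)
            if shape is not None:
                summary[key] = str(tuple(shape))
            else:
                summary[key] = type(value).__name__
        summaries.append(summary)
    return summaries
```

The helper works with any object that implements `__len__` and `__getitem__`, so it can be tried on a plain list of dictionaries before pointing it at the real dataset.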

plot(sample, show_titles=True, suptitle=None)

Plot a sample from the dataset.

Parameters:
  • sample (dict[str, Any]) – a sample returned by __getitem__()

  • show_titles (bool) – flag indicating whether to show titles above each panel


  • suptitle (str | None) – optional suptitle to use for figure

Returns:

a matplotlib Figure with the rendered sample

Raises:

RGBBandsMissingError – If bands does not include all RGB bands.

Return type:

Figure
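Because `plot()` raises RGBBandsMissingError when the RGB bands are not all loaded, it can help to check the band selection first. The sketch below assumes the usual Sentinel-2 mapping of red, green, and blue to B04, B03, and B02; `has_rgb_bands` and `save_sample_plot` are hypothetical helper names, and the output path is a placeholder.

```python
from typing import Any, Sequence

# Sentinel-2 red, green, and blue bands.
RGB_BANDS = ("B04", "B03", "B02")


def has_rgb_bands(bands: Sequence[str]) -> bool:
    """Return True if every RGB band is present in `bands`."""
    return all(band in bands for band in RGB_BANDS)


def save_sample_plot(dataset: Any, index: int, path: str) -> None:
    """Plot dataset[index] with dataset.plot() and save the figure to `path`."""
    fig = dataset.plot(
        dataset[index], show_titles=True, suptitle=f"Sample {index}"
    )
    fig.savefig(path)  # matplotlib Figure.savefig
```

For example, `has_rgb_bands(('B03', 'B08'))` is false, so a dataset built with only those bands would not be plottable.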
