DFC2022

class torchgeo.datasets.DFC2022(root='data', split='train', transforms=None, checksum=False)

Bases: NonGeoDataset

DFC2022 dataset.

The DFC2022 dataset is the benchmark dataset for the 2022 IEEE GRSS Data Fusion Contest and extends the MiniFrance dataset for semi-supervised semantic segmentation. It consists of a train set containing both labeled and unlabeled imagery and an unlabeled validation set. The dataset can be downloaded from the IEEE DataPort DFC2022 website.

Dataset features:

  • RGB aerial images at 0.5 m per pixel spatial resolution (~2,000x2,000 px)

  • DEMs at 1 m per pixel spatial resolution (~1,000x1,000 px)

  • Masks at 0.5 m per pixel spatial resolution (~2,000x2,000 px)

  • 16 land use/land cover categories

  • Images collected from the IGN BD ORTHO database

  • DEMs collected from the IGN RGE ALTI database

  • Labels collected from the UrbanAtlas 2012 database

  • Data collected from 19 regions in France
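The resolutions and tile sizes in the list above imply that an image tile and its DEM cover roughly the same ground area. The following arithmetic check is a small sketch using the approximate sizes as nominal values; `footprint_m` is a hypothetical helper, and real tiles vary somewhat in size.

```python
def footprint_m(pixels: int, metres_per_pixel: float) -> float:
    """Return the ground extent in metres along one axis of a tile."""
    return pixels * metres_per_pixel


# Nominal sizes from the feature list; actual tiles are only approximately this big.
image_extent = footprint_m(2000, 0.5)  # RGB image and mask
dem_extent = footprint_m(1000, 1.0)    # DEM

# Ratio of DEM pixel size to image pixel size: the factor by which the
# DEM grid is coarser than the image grid along each axis.
upsample_factor = 1.0 / 0.5

print(image_extent, dem_extent, upsample_factor)  # 1000.0 1000.0 2.0
```

Both rasters span about 1 km per side, so a DEM has to be resampled by roughly a factor of two per axis before it lines up pixel for pixel with the image and mask.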

Dataset format:

  • images are three-channel geotiffs

  • DEMs are single-channel geotiffs

  • masks are single-channel geotiffs whose pixel values represent the class

Dataset classes:

  1. No information

  2. Urban fabric

  3. Industrial, commercial, public, military, private and transport units

  4. Mine, dump and construction sites

  5. Artificial non-agricultural vegetated areas

  6. Arable land (annual crops)

  7. Permanent crops

  8. Pastures

  9. Complex and mixed cultivation patterns

  10. Orchards at the fringe of urban classes

  11. Forests

  12. Herbaceous vegetation associations

  13. Open spaces with little or no vegetation

  14. Wetlands

  15. Water

  16. Clouds and Shadows
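Since mask pixel values encode the class, a lookup table makes masks easier to inspect. The sketch below assumes that pixel values are zero-based indices into the class list in the order given above (so 0 would be "No information"); that mapping, and the helper `class_histogram`, are assumptions for illustration rather than something stated on this page.

```python
from collections import Counter
from typing import Iterable

# Class names in the order listed above. ASSUMPTION: pixel value 0 is the
# first entry, pixel value 15 is the last.
CLASSES = [
    "No information",
    "Urban fabric",
    "Industrial, commercial, public, military, private and transport units",
    "Mine, dump and construction sites",
    "Artificial non-agricultural vegetated areas",
    "Arable land (annual crops)",
    "Permanent crops",
    "Pastures",
    "Complex and mixed cultivation patterns",
    "Orchards at the fringe of urban classes",
    "Forests",
    "Herbaceous vegetation associations",
    "Open spaces with little or no vegetation",
    "Wetlands",
    "Water",
    "Clouds and Shadows",
]


def class_histogram(mask_values: Iterable[int]) -> dict[str, int]:
    """Count pixels per class name for a flat sequence of mask values."""
    counts = Counter(mask_values)
    return {CLASSES[value]: n for value, n in sorted(counts.items())}


print(class_histogram([0, 1, 1, 14, 14, 14]))
# {'No information': 1, 'Urban fabric': 2, 'Water': 3}
```

For a mask stored as a tensor or array, flattening it to a list of Python integers first (for example with `mask.flatten().tolist()`) would make it usable with this helper.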

If you use this dataset in your research, please cite the following paper:

Added in version 0.3.

__init__(root='data', split='train', transforms=None, checksum=False)

Initialize a new DFC2022 dataset instance.

Parameters:
  • root (str | PathLike[str]) – root directory where dataset can be found

  • split (str) – one of “train” or “test”

  • transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes a sample dictionary (the input and its target) and returns a transformed version

  • checksum (bool) – if True, check the MD5 of the downloaded files (may be slow)
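As an illustration of the `transforms` signature described above, here is a minimal sketch of a dictionary-to-dictionary transform and of passing it to the constructor. The helpers `scale_key` and `make_dataset` are hypothetical, and the `"image"` key is an assumption about the sample layout that this page does not confirm. The scaling itself is only a placeholder for a real normalization.

```python
from typing import Any, Callable

Sample = dict[str, Any]


def scale_key(key: str, factor: float) -> Callable[[Sample], Sample]:
    """Build a transform that multiplies sample[key] by factor (hypothetical helper)."""

    def transform(sample: Sample) -> Sample:
        out = dict(sample)  # shallow copy so the input dictionary is not modified
        out[key] = out[key] * factor
        return out

    return transform


def make_dataset(root: str = "data"):
    """Construct the training split; requires torchgeo and the data under root."""
    # Deferred import so that the transform above can be used without torchgeo.
    from torchgeo.datasets import DFC2022

    return DFC2022(
        root=root,
        split="train",
        # ASSUMPTION: samples store pixel data under the "image" key.
        transforms=scale_key("image", 1 / 255),
        checksum=False,
    )
```

Because the transform only relies on multiplication, it works on tensors as well as on plain numbers, which keeps it easy to test in isolation.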

Raises:
__getitem__(index)

Return the sample at the given index within the dataset.

Parameters:

index (int) – index to return

Returns:

data and label at that index

Return type:

dict[str, Any]
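Because the returned sample is a plain dictionary, a quick way to see what it contains is to summarise each entry. The helper below, `describe_sample`, is a hypothetical sketch that makes no assumption about which keys are present.

```python
from typing import Any


def describe_sample(sample: dict[str, Any]) -> dict[str, Any]:
    """Map each key to its shape (as a tuple) if it has one, else to its type name."""
    summary: dict[str, Any] = {}
    for key, value in sample.items():
        shape = getattr(value, "shape", None)
        summary[key] = tuple(shape) if shape is not None else type(value).__name__
    return summary
```

With a dataset instance `ds`, calling `describe_sample(ds[0])` would list the keys of the first sample together with the shape of each tensor.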

__len__()

Return the number of data points in the dataset.

Returns:

length of the dataset

Return type:

int
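The dataset length is handy for building index-based subsets, for instance to hold out part of the labeled training data for local evaluation. The following is a sketch under that assumption; `split_indices` is a hypothetical helper, not part of torchgeo.

```python
import random


def split_indices(
    n: int, holdout_fraction: float = 0.2, seed: int = 0
) -> tuple[list[int], list[int]]:
    """Shuffle range(n) reproducibly and return (train, holdout) index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    n_holdout = int(n * holdout_fraction)
    return indices[n_holdout:], indices[:n_holdout]


# In practice n would be len(dataset); 10 is used here for illustration.
train_idx, holdout_idx = split_indices(10, holdout_fraction=0.2, seed=0)
print(len(train_idx), len(holdout_idx))  # 8 2
```

The resulting index lists can be passed to `torch.utils.data.Subset` to obtain two views of the same dataset.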

plot(sample, show_titles=True, suptitle=None)

Plot a sample from the dataset.

Parameters:
  • sample (dict[str, Any]) – a sample returned by __getitem__()

  • show_titles (bool) – flag indicating whether to show titles above each panel

  • suptitle (str | None) – optional string to use as a suptitle

Returns:

a matplotlib Figure with the rendered sample

Return type:

Figure
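Since `plot` returns a matplotlib Figure, the result can be written to disk with `Figure.savefig`. The sketch below wraps the two steps; `save_sample_plot` is a hypothetical helper that accepts any object exposing `__getitem__` and a `plot` method with the signature documented above.

```python
from __future__ import annotations

from typing import Any


def save_sample_plot(
    dataset: Any, index: int, path: str, suptitle: str | None = None
) -> None:
    """Render the sample at index with dataset.plot and save the figure to path."""
    sample = dataset[index]
    fig = dataset.plot(sample, show_titles=True, suptitle=suptitle)
    fig.savefig(path)
```

With a dataset instance `ds`, `save_sample_plot(ds, 0, "sample0.png", suptitle="DFC2022 sample 0")` would write the first sample's panels to a PNG file.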
