QuakeSet

class torchgeo.datasets.QuakeSet(root='data', split='train', transforms=None, download=False, checksum=False)

Bases: NonGeoDataset

QuakeSet dataset.

QuakeSet is a dataset for earthquake change detection and magnitude estimation. It is used for the Seismic Monitoring and Analysis (SMAC) ECML-PKDD 2024 Discovery Challenge.

Dataset features:

  • Sentinel-1 SAR imagery

  • before/pre/post imagery of areas affected by earthquakes

  • 2 SAR bands (VV/VH)

  • 3,327 pairs of pre and post images with 5 m per pixel resolution (512x512 px)

  • 2 classification labels (unaffected / affected by earthquake)

  • pre/post image pairs represent earthquake affected areas

  • before/pre image pairs represent hard negative unaffected areas

  • earthquake magnitudes for each sample

Dataset format:

  • single HDF5 dataset containing images, magnitudes, hypocenters, and splits

Dataset classes:

  1. unaffected area

  2. earthquake affected area

If you use this dataset in your research, please cite the following paper:

Note

This dataset requires the following additional library to be installed:

  • h5py to load the dataset

Added in version 0.6.

__init__(root='data', split='train', transforms=None, download=False, checksum=False)

Initialize a new QuakeSet dataset instance.

Parameters:
  • root (str | PathLike[str]) – root directory where dataset can be found

  • split (str) – one of “train”, “val”, or “test”

  • transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample (a dict) and returns a transformed version of it

  • download (bool) – if True, download dataset and store it in the root directory

  • checksum (bool) – if True, check the MD5 of the downloaded files (may be slow)
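
The transforms parameter is typed as a callable that maps one sample dict to another. A minimal sketch of such a callable follows. The "magnitude" key, the added "magnitude_bin" key, and the bin thresholds are assumptions made for illustration and are not confirmed by this page, so inspect a real sample before relying on them.

```python
from typing import Any


def add_magnitude_bin(sample: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of ``sample`` with a coarse magnitude bin attached.

    Assumes the sample stores its earthquake magnitude under the key
    "magnitude" (hypothetical key name; check a real sample first).
    """
    out = dict(sample)  # shallow copy, so the caller's dict is not mutated
    magnitude = float(sample["magnitude"])  # float() also accepts 1-element tensors
    # Illustrative thresholds only; they are not part of the dataset definition.
    if magnitude < 5.0:
        out["magnitude_bin"] = 0
    elif magnitude < 6.0:
        out["magnitude_bin"] = 1
    else:
        out["magnitude_bin"] = 2
    return out
```

Under these assumptions, the function could be passed as QuakeSet(root='data', split='train', transforms=add_magnitude_bin).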

Raises:

__getitem__(index)

Return the sample at the given index of the dataset.

Parameters:

index (int) – index of the sample to return

Returns:

sample containing the image, its classification label, and the earthquake magnitude

Return type:

dict[str, Any]
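
As a rough illustration of how __getitem__() and __len__() combine, the sketch below counts samples per class in any dataset that supports indexing and len(). The helper names count_labels and demo are hypothetical, and the "label" key is an assumption about the sample dict rather than something this page states.

```python
from collections import Counter
from typing import Any


def count_labels(dataset: Any, key: str = "label") -> dict[int, int]:
    """Count how many samples fall into each class.

    Works with any object that supports ``len()`` and integer indexing and
    returns dict samples. The default key "label" is an assumption.
    """
    counts: Counter = Counter()
    for index in range(len(dataset)):
        sample = dataset[index]
        counts[int(sample[key])] += 1
    return dict(counts)


def demo(root: str = "data") -> dict[int, int]:
    """Hypothetical usage with QuakeSet; downloads the data if it is missing."""
    # Imported lazily so the helper above works without torchgeo or h5py.
    from torchgeo.datasets import QuakeSet

    dataset = QuakeSet(root=root, split="train", download=True)
    return count_labels(dataset)
```

Calling demo() requires torchgeo and h5py and may trigger a download, so it is left uncalled here.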

__len__()

Return the number of data points in the dataset.

Returns:

length of the dataset

Return type:

int

plot(sample, show_titles=True, suptitle=None)

Plot a sample from the dataset.

Parameters:
  • sample (dict[str, Any]) – a sample returned by __getitem__()

  • show_titles (bool) – flag indicating whether to show titles above each panel

  • suptitle (str | None) – optional suptitle to use for figure

Returns:

a matplotlib Figure with the rendered sample

Return type:

Figure
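
Because plot() returns a matplotlib Figure, the result can be written to disk with the figure's savefig() method. The helper below is a small sketch of that pattern. The name save_sample_figure and the suptitle text are hypothetical choices, not part of the library.

```python
from typing import Any


def save_sample_figure(dataset: Any, index: int, path: str) -> str:
    """Render one sample with ``dataset.plot`` and save the figure to ``path``."""
    sample = dataset[index]  # sample dict as returned by __getitem__()
    fig = dataset.plot(sample, show_titles=True, suptitle=f"sample {index}")
    fig.savefig(path)  # standard matplotlib Figure method
    return path
```

With a QuakeSet instance named ds, this could be called as save_sample_figure(ds, 0, 'sample0.png').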
