MMFlood

class torchgeo.datasets.MMFlood(root='data', crs=None, res=None, split='train', include_dem=False, include_hydro=False, transforms=None, download=False, checksum=False, cache=False, time_series=False)

Bases: IntersectionDataset

MMFlood dataset.

MMFlood is a multimodal flood delineation dataset. Sentinel-1 data is matched with masks and DEM data for all available tiles. If hydrography maps are loaded, only a subset of the dataset is available, since only 1,012 Sentinel-1 tiles have a corresponding hydrography map. Some Sentinel-1 tiles have missing data; these pixels are automatically set to 0. The corresponding mask pixels are set to 255 and should be ignored when computing performance metrics.
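As an illustration of ignoring the 255-valued mask pixels during evaluation, the following is a minimal sketch using NumPy. The function name `flood_iou` is hypothetical, and the sketch assumes flood pixels are encoded as 1 in the mask, which the text above does not state explicitly.

```python
import numpy as np

IGNORE_INDEX = 255  # mask value marking pixels with missing Sentinel-1 data
FLOOD_LABEL = 1     # assumption: flood pixels are encoded as 1


def flood_iou(pred: np.ndarray, mask: np.ndarray,
              ignore_index: int = IGNORE_INDEX,
              flood_label: int = FLOOD_LABEL) -> float:
    """Intersection over union of the flood class, skipping ignored pixels."""
    valid = mask != ignore_index
    pred_flood = (pred == flood_label) & valid
    true_flood = (mask == flood_label) & valid
    union = np.logical_or(pred_flood, true_flood).sum()
    if union == 0:
        return float("nan")  # flood class absent from both prediction and mask
    intersection = np.logical_and(pred_flood, true_flood).sum()
    return float(intersection / union)


# Toy example: the top-right pixel is ignored, so the prediction there is not scored.
mask = np.array([[0, 1, 255], [1, 1, 0]])
pred = np.array([[0, 1, 1], [0, 1, 1]])
print(flood_iou(pred, mask))  # 0.5
```

When training with PyTorch, the same effect is usually obtained by passing `ignore_index=255` to a loss such as `torch.nn.CrossEntropyLoss`.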

Dataset features:

  • 1,748 Sentinel-1 tiles of varying pixel dimensions

  • multimodal dataset

  • 95 flood events from 42 different countries

  • includes DEMs

  • includes hydrography maps (available for 1,012 tiles out of 1,748)

  • flood delineation maps (ground truth) are obtained from Copernicus EMS

Dataset classes:

  1. no flood

  2. flood

If you use this dataset in your research, please cite the following paper:

Added in version 0.7.

__init__(root='data', crs=None, res=None, split='train', include_dem=False, include_hydro=False, transforms=None, download=False, checksum=False, cache=False, time_series=False)

Initialize a new MMFlood dataset instance.

Parameters:
  • root (str | PathLike[str]) – root directory where dataset can be found

  • crs (CRS | None) – coordinate reference system (CRS) to warp to (defaults to the CRS of the first file found)

  • res (float | tuple[float, float] | None) – resolution of the dataset in units of CRS in (xres, yres) format. If a single float is provided, it is used for both the x and y resolution. (defaults to the resolution of the first file found)

  • split (str) – train/val/test split to load

  • include_dem (bool) – If True, DEM data is concatenated after Sentinel-1 bands.

  • include_hydro (bool) – If True, hydrography data is concatenated as the last channel. Only a smaller subset of the original dataset is loaded in this case.

  • transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample and its target as entry and returns a transformed version

  • download (bool) – if True, download dataset and store it in the root directory

  • checksum (bool) – if True, check the MD5 of the downloaded files (may be slow)

  • cache (bool) – if True, cache file handles to speed up repeated sampling

  • time_series (bool) – if True, stack data along the time series dimension [T, C, H, W]. If False, merge data into a [C, H, W] mosaic.

Raises:

Added in version 0.9: The time_series parameter.
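The effect of `include_dem` and `include_hydro` on the channel order can be sketched as follows. The helper names `channel_layout` and `build_mmflood` are hypothetical, and the sketch assumes the Sentinel-1 imagery contributes two polarisation bands (labelled VV and VH here); check the shape of a loaded sample to confirm.

```python
from typing import Any

# Assumption: Sentinel-1 contributes two polarisation bands, labelled VV and VH here.
S1_BANDS = ["VV", "VH"]


def channel_layout(include_dem: bool = False, include_hydro: bool = False) -> list[str]:
    """Return the expected channel order of the image tensor."""
    layout = list(S1_BANDS)
    if include_dem:
        layout.append("DEM")    # DEM is concatenated after the Sentinel-1 bands
    if include_hydro:
        layout.append("hydro")  # hydrography is concatenated as the last channel
    return layout


def build_mmflood(root: str, **kwargs: Any):
    """Construct the dataset (torchgeo is imported lazily, only when this is called)."""
    from torchgeo.datasets import MMFlood

    return MMFlood(root=root, **kwargs)


print(channel_layout(include_dem=True, include_hydro=True))
# ['VV', 'VH', 'DEM', 'hydro']
```

A call such as `build_mmflood("data", split="train", include_dem=True, download=True)` would then create the training split with the DEM appended. With `time_series=True`, the same channel order applies along the `C` axis of the `[T, C, H, W]` tensor.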

__getitem__(index)

Retrieve input, target, and/or metadata indexed by spatiotemporal slice.

Parameters:

index (slice | tuple[slice] | tuple[slice, slice] | tuple[slice, slice, slice]) – [xmin:xmax:xres, ymin:ymax:yres, tmin:tmax:tres] coordinates to index.

Returns:

Sample of input, target, and/or metadata at that index.

Raises:

IndexError – If index is not found in the dataset.

Return type:

dict[str, Any]
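The index accepted by `__getitem__` is a tuple of Python slices. The sketch below builds such a tuple; `make_index` is a hypothetical helper, and the sketch assumes the coordinates are expressed in the units of the dataset CRS and that the temporal bounds use whatever time type the installed torchgeo version expects.

```python
from typing import Any, Optional


def make_index(xmin: float, xmax: float, ymin: float, ymax: float,
               tmin: Optional[Any] = None, tmax: Optional[Any] = None,
               xres: Optional[float] = None,
               yres: Optional[float] = None) -> tuple:
    """Build an [xmin:xmax:xres, ymin:ymax:yres, tmin:tmax] index tuple."""
    x = slice(xmin, xmax, xres)
    y = slice(ymin, ymax, yres)
    if tmin is None and tmax is None:
        return (x, y)  # spatial-only form: tuple[slice, slice]
    return (x, y, slice(tmin, tmax))


# Equivalent to writing dataset[0:2560, 0:2560] directly.
index = make_index(0, 2560, 0, 2560)
print(index)
```

Given a constructed dataset, `sample = dataset[index]` would return the sample dictionary, raising `IndexError` when the requested region is not covered. In practice the index is usually generated by a sampler such as `torchgeo.samplers.RandomGeoSampler` rather than written by hand.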

plot(sample, show_titles=True, suptitle=None)

Plot a sample from the dataset.

Parameters:
  • sample (dict[str, Any]) – a sample returned by __getitem__()

  • show_titles (bool) – flag indicating whether to show titles above each panel

  • suptitle (str | None) – optional suptitle to use for figure

Returns:

a matplotlib Figure with the rendered sample

Return type:

Figure
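A typical use of `plot` is to render a sample and write the figure to disk. The sketch below shows one way to do this; `save_sample_plot` is a hypothetical helper, not part of torchgeo, and it relies only on the documented `plot` signature and on matplotlib's `Figure.savefig`.

```python
from typing import Any, Optional


def save_sample_plot(dataset: Any, sample: dict, path: str,
                     title: Optional[str] = None) -> Any:
    """Render a sample with dataset.plot() and save the resulting figure."""
    fig = dataset.plot(sample, show_titles=True, suptitle=title)
    fig.savefig(path)  # matplotlib Figure method; format is inferred from the extension
    return fig
```

For example, `save_sample_plot(dataset, dataset[index], "sample.png", title="MMFlood sample")` would write a PNG, where `dataset` is an `MMFlood` instance and `index` is a valid spatiotemporal index.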