So2Sat
- class torchgeo.datasets.So2Sat(root='data', version='2', split='train', bands=('S1_B1', 'S1_B2', 'S1_B3', 'S1_B4', 'S1_B5', 'S1_B6', 'S1_B7', 'S1_B8', 'S2_B02', 'S2_B03', 'S2_B04', 'S2_B05', 'S2_B06', 'S2_B07', 'S2_B08', 'S2_B8A', 'S2_B11', 'S2_B12'), transforms=None, checksum=False)
Bases: NonGeoDataset
So2Sat dataset.
The So2Sat dataset consists of corresponding synthetic aperture radar and multispectral optical image data acquired by the Sentinel-1 and Sentinel-2 remote sensing satellites, and a corresponding local climate zone (LCZ) label. The dataset is distributed over 42 cities across different continents and cultural regions of the world, and comes with a variety of different splits.
This implementation covers the 2nd and 3rd versions of the dataset as described in the authors' GitHub repository: zhu-xlab/So2Sat-LCZ42.
The different versions are as follows:
Version 2: This version contains imagery from 52 cities and is split into train/val/test as follows:
Training: 42 cities around the world
Validation: western half of 10 other cities covering 10 cultural zones
Testing: eastern half of the 10 other cities
Version 3: A version of the dataset with 3 different train/test splits, as follows:
Random split: every city 80% training / 20% testing (randomly sampled)
Block split: every city is split in a geospatial 80%/20%-manner
Cultural 10: 10 cities from different cultural zones are held back for testing purposes
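The version and split combinations described above can be summarized as a small lookup table. This is a minimal sketch, not torchgeo's own code: SPLITS_BY_VERSION and check_split are hypothetical names, and the assumption that the version 3 variants expose only "train" and "test" is inferred from the split descriptions above.

```python
# Hypothetical lookup table. The keys reuse the documented values of the
# `version` parameter and the values reuse the documented `split` names.
SPLITS_BY_VERSION = {
    "2": ("train", "validation", "test"),
    # Assumption: the version 3 variants only describe train/test splits.
    "3_random": ("train", "test"),
    "3_block": ("train", "test"),
    "3_culture_10": ("train", "test"),
}


def check_split(version: str, split: str) -> bool:
    """Return True if `split` is described for the given dataset `version`."""
    return split in SPLITS_BY_VERSION.get(version, ())
```

A check like this can run before constructing the dataset, so that an unsupported combination is reported with a clearer message than a bare assertion failure.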
Dataset classes:
Compact high rise
Compact middle rise
Compact low rise
Open high rise
Open mid rise
Open low rise
Lightweight low rise
Large low rise
Sparsely built
Heavy industry
Dense trees
Scattered trees
Bush, scrub
Low plants
Bare rock or paved
Bare soil or sand
Water
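For turning a predicted label into a readable name, the class list above can be stored as a Python list. This is a sketch under one loud assumption: that labels are zero-based integer indices following the order shown above. The names LCZ_CLASSES and class_name are illustrative and not part of torchgeo.

```python
# Class names copied from the list above, in the order they appear.
# Assumption: an integer label is a zero-based index into this order.
LCZ_CLASSES = [
    "Compact high rise",
    "Compact middle rise",
    "Compact low rise",
    "Open high rise",
    "Open mid rise",
    "Open low rise",
    "Lightweight low rise",
    "Large low rise",
    "Sparsely built",
    "Heavy industry",
    "Dense trees",
    "Scattered trees",
    "Bush, scrub",
    "Low plants",
    "Bare rock or paved",
    "Bare soil or sand",
    "Water",
]


def class_name(index: int) -> str:
    """Look up the class name for an integer label (hypothetical helper)."""
    return LCZ_CLASSES[index]
```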
If you use this dataset in your research, please cite the following paper:
Note
The version 2 dataset can be automatically downloaded using the following bash script:
for split in training validation testing
do
    wget ftp://m1483140:[email protected]/$split.h5
done
or manually downloaded from https://dataserv.ub.tum.de/index.php/s/m1483140. This download will likely take several hours.
The version 3 datasets can be downloaded using the following bash script:
for version in random block culture_10
do
    for split in training testing
    do
        wget -P $version/ ftp://m1613658:[email protected]/$version/$split.h5
    done
done
or manually downloaded from https://mediatum.ub.tum.de/1613658
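To verify that a download finished, it can help to list the relative paths the two scripts above would produce. The sketch below only mirrors the loop structure of those scripts. The function name downloaded_files is hypothetical, and whether torchgeo expects exactly this layout under root is an assumption that should be checked against the installed version.

```python
# Split and variant names taken from the bash loops in the note above.
V2_SPLITS = ("training", "validation", "testing")
V3_VARIANTS = ("random", "block", "culture_10")
V3_SPLITS = ("training", "testing")


def downloaded_files() -> list:
    """Relative paths written by the two download scripts (hypothetical helper)."""
    # Version 2: wget saves each file directly into the working directory.
    files = [f"{split}.h5" for split in V2_SPLITS]
    # Version 3: `wget -P $version/` saves each file into a per-variant folder.
    files += [
        f"{variant}/{split}.h5"
        for variant in V3_VARIANTS
        for split in V3_SPLITS
    ]
    return files
```

Comparing this list against the contents of the download directory gives a quick sanity check before constructing the dataset.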
Note
This dataset requires the following additional library to be installed:
h5py (https://pypi.org/project/h5py/) to load the dataset
- __init__(root='data', version='2', split='train', bands=('S1_B1', 'S1_B2', 'S1_B3', 'S1_B4', 'S1_B5', 'S1_B6', 'S1_B7', 'S1_B8', 'S2_B02', 'S2_B03', 'S2_B04', 'S2_B05', 'S2_B06', 'S2_B07', 'S2_B08', 'S2_B8A', 'S2_B11', 'S2_B12'), transforms=None, checksum=False)
Initialize a new So2Sat dataset instance.
- Parameters:
root (str | PathLike[str]) – root directory where dataset can be found
version (str) – one of “2”, “3_random”, “3_block”, or “3_culture_10”
split (str) – one of “train”, “validation”, or “test”
bands (Sequence[str]) – a sequence of band names to use where the indices correspond to the array index of combined Sentinel 1 and Sentinel 2
transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample and its target and returns a transformed version
checksum (bool) – if True, check the MD5 of the downloaded files (may be slow)
- Raises:
AssertionError – if split argument is invalid
DatasetNotFoundError – If dataset is not found.
DependencyNotFoundError – If h5py is not installed.
Added in version 0.3: The bands parameter.
Added in version 0.5: The version parameter.
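The bands parameter is described as a sequence of names whose indices refer to the combined Sentinel-1 and Sentinel-2 array. The sketch below illustrates that mapping with the default band order from the signature. ALL_BANDS and band_indices are hypothetical names, and the library's internal indexing may differ (for example, it may index the two sensors separately), so treat this as an illustration of the idea only.

```python
# Default band order copied from the documented signature.
ALL_BANDS = (
    "S1_B1", "S1_B2", "S1_B3", "S1_B4", "S1_B5", "S1_B6", "S1_B7", "S1_B8",
    "S2_B02", "S2_B03", "S2_B04", "S2_B05", "S2_B06", "S2_B07", "S2_B08",
    "S2_B8A", "S2_B11", "S2_B12",
)


def band_indices(bands):
    """Map band names to positions in the combined S1 + S2 stack.

    Hypothetical helper; assumes the combined array follows ALL_BANDS order.
    """
    unknown = [b for b in bands if b not in ALL_BANDS]
    if unknown:
        raise ValueError(f"unknown band names: {unknown}")
    return [ALL_BANDS.index(b) for b in bands]
```

For example, requesting ("S2_B04", "S2_B03", "S2_B02") selects positions 10, 9, and 8 of the combined stack under this assumption.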
- __len__()
Return the number of data points in the dataset.
- Returns:
length of the dataset
- Return type:
int
- plot(sample, show_titles=True, suptitle=None)
Plot a sample from the dataset.
- Parameters:
- Returns:
a matplotlib Figure with the rendered sample
- Raises:
RGBBandsMissingError – If bands does not include all RGB bands.
- Return type:
Figure
Added in version 0.2.
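Because plot raises RGBBandsMissingError when the selected bands lack the RGB bands, a pre-check can make the failure easier to diagnose. The sketch below assumes the RGB bands are the Sentinel-2 red, green, and blue bands (S2_B04, S2_B03, S2_B02). The names RGB_BANDS and missing_rgb_bands are hypothetical and not part of torchgeo.

```python
# Assumption: RGB corresponds to Sentinel-2 bands B04 (red), B03 (green),
# and B02 (blue), using the band names from the documented signature.
RGB_BANDS = ("S2_B04", "S2_B03", "S2_B02")


def missing_rgb_bands(bands):
    """Return the RGB band names absent from `bands` (hypothetical helper)."""
    return [b for b in RGB_BANDS if b not in bands]
```

An empty result suggests plot should be able to render an RGB composite; a non-empty result names the bands to add when constructing the dataset.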