Chesapeake Land Cover#

class torchgeo.datasets.Chesapeake(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)[source]#

Bases: RasterDataset, ABC

Abstract base class for all Chesapeake datasets.

Chesapeake Bay Land Use and Land Cover (LULC) Database 2022 Edition

The Chesapeake Bay Land Use and Land Cover Database (LULC) facilitates characterization of the landscape and land change for and between discrete time periods. The database was developed by the University of Vermont’s Spatial Analysis Laboratory in cooperation with Chesapeake Conservancy (CC) and U.S. Geological Survey (USGS) as part of a 6-year Cooperative Agreement between Chesapeake Conservancy and the U.S. Environmental Protection Agency (EPA) and a separate Interagency Agreement between the USGS and EPA to provide geospatial support to the Chesapeake Bay Program Office.

The database contains one-meter 13-class Land Cover (LC) and 54-class Land Use/Land Cover (LULC) data for all counties within or adjacent to the Chesapeake Bay watershed for 2013/14 and 2017/18, depending on the availability of National Agriculture Imagery Program (NAIP) imagery for each state. Additionally, the 54 LULC classes are generalized into 18 LULC classes for ease of visualization and communication of LULC trends. LC change between discrete time periods, detected by spectral changes in NAIP imagery and LiDAR, represents changes between the 12 land cover classes. LULC change uses LC change to identify where changes are happening, and LC is then translated to LULC to represent transitions between the 54 LULC classes. The LULC change (LULCC) data is represented as an LULC class change transition matrix, which gives users the acres of change between multiple classes. It is organized by 18x18 and 54x54 LULC classes. The Chesapeake Bay Watershed (CBW) label indicates that raster tabulations were performed only for areas that fall inside the CBW boundary. For example, a user interested in the CBW portion of a county would use LULC Matrix CBW; conversely, a user interested in change transitions across the entire county would use LULC Matrix.

If you use this dataset in your research, please cite the following:

date_format = '%Y'#

Date format string used to parse date from filename.

Not used if filename_regex does not contain a date group or start and stop groups.

is_image = False#

True if the dataset only contains model inputs (such as images). False if the dataset only contains ground truth model outputs (such as segmentation masks).

The sample returned by the dataset/data loader will use the “image” key if is_image is True, otherwise it will use the “mask” key.

For datasets with both model inputs and outputs, the recommended approach is to use 2 RasterDataset instances and combine them using an IntersectionDataset.
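The key selection described above can be sketched in a few lines. The helper `wrap_sample` is hypothetical and not part of torchgeo; it only illustrates how the flag decides which dictionary key carries the raster data.

```python
from typing import Any


def wrap_sample(data: Any, is_image: bool) -> dict[str, Any]:
    """Store raster data under "image" or "mask", depending on is_image."""
    key = "image" if is_image else "mask"
    return {key: data}


# Chesapeake sets is_image = False, so its rasters would land under "mask".
sample = wrap_sample([[11, 41], [42, 127]], is_image=False)
print(sorted(sample))
```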

abstract property md5s: dict[int, str]#

Mapping between data year and zip file MD5.
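A year-to-MD5 mapping like this is typically used to verify a downloaded archive. The following is a minimal sketch using only the standard library; the helper `md5_matches`, the file name, and the mapping values are illustrative assumptions, not torchgeo's implementation or real checksums.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_matches(path: Path, expected: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the MD5 hex digest of the file equals `expected`."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        # Read in chunks so a large zip file never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


# Stand-in archive and a hypothetical year -> MD5 mapping for illustration.
payload = b"placeholder archive contents"
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "example.zip"
    archive.write_bytes(payload)
    md5s = {2018: hashlib.md5(payload).hexdigest()}
    ok = md5_matches(archive, md5s[2018])
    bad = md5_matches(archive, "0" * 32)
```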

property state: str#

State abbreviation.

cmap: ClassVar[dict[int, tuple[int, int, int, int]]] = {11: (0, 92, 230, 255), 12: (0, 92, 230, 255), 13: (0, 92, 230, 255), 14: (0, 92, 230, 255), 15: (0, 92, 230, 255), 21: (0, 0, 0, 255), 22: (235, 6, 2, 255), 23: (89, 89, 89, 255), 24: (138, 138, 136, 255), 25: (138, 138, 136, 255), 26: (138, 138, 136, 255), 27: (115, 115, 0, 255), 28: (233, 255, 190, 255), 29: (255, 255, 115, 255), 41: (38, 115, 0, 255), 42: (56, 168, 0, 255), 51: (255, 255, 115, 255), 52: (255, 255, 115, 255), 53: (255, 255, 115, 255), 54: (170, 255, 0, 255), 55: (170, 255, 0, 255), 56: (170, 255, 0, 255), 62: (77, 209, 148, 255), 63: (77, 209, 148, 255), 64: (56, 168, 0, 255), 65: (38, 115, 0, 255), 72: (186, 245, 217, 255), 73: (186, 245, 217, 255), 74: (56, 168, 0, 255), 75: (38, 115, 0, 255), 83: (255, 211, 127, 255), 84: (255, 211, 127, 255), 85: (255, 211, 127, 255), 91: (0, 168, 132, 255), 92: (0, 168, 132, 255), 93: (0, 168, 132, 255), 94: (56, 168, 0, 255), 95: (38, 115, 0, 255), 127: (255, 255, 255, 255)}#

Color map for the dataset, used for plotting
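As a rough illustration of how such a color map can be applied, the sketch below looks up an RGBA tuple for every class value in a small mask. It uses plain nested lists and a hypothetical `colorize` helper; the four entries are copied from the mapping above, and the mask values are made up.

```python
# A few entries copied from cmap: class value -> (R, G, B, A).
cmap = {
    11: (0, 92, 230, 255),
    41: (38, 115, 0, 255),
    42: (56, 168, 0, 255),
    127: (255, 255, 255, 255),
}


def colorize(mask: list[list[int]]) -> list[list[tuple[int, int, int, int]]]:
    """Replace every class value in a 2D mask with its RGBA color."""
    return [[cmap[value] for value in row] for row in mask]


# A made-up 2x2 mask of class values.
mask = [[11, 41], [42, 127]]
rgba = colorize(mask)
```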

__init__(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)[source]#

Initialize a new Chesapeake instance.

Parameters:
  • paths (str | PathLike[str] | Iterable[str | PathLike[str]]) – one or more root directories to search or files to load

  • crs (CRS | None) – coordinate reference system (CRS) to warp to (defaults to the CRS of the first file found)

  • res (float | tuple[float, float] | None) – resolution of the dataset in units of CRS in (xres, yres) format. If a single float is provided, it is used for both the x and y resolution. (defaults to the resolution of the first file found)

  • transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample and returns a transformed version

  • cache (bool) – if True, cache file handle to speed up repeated sampling

  • download (bool) – if True, download dataset and store it in the root directory

  • checksum (bool) – if True, check the MD5 of the downloaded files (may be slow)

  • time_series (bool) – if True, stack data along the time series dimension [T, C, H, W]. If False, merge data into a [C, H, W] mosaic.

Raises:

DatasetNotFoundError – If dataset is not found and download is False.

Added in version 0.9: The time_series parameter.

Changed in version 0.5: root was renamed to paths.
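The rule for res (a single float applies to both axes) can be sketched as follows. The helper `normalize_res` is hypothetical; it returns None unchanged because, per the parameter description, the fallback is the resolution of the first file found, which this sketch does not model.

```python
from typing import Optional, Union


def normalize_res(
    res: Optional[Union[float, tuple[float, float]]],
) -> Optional[tuple[float, float]]:
    """Expand a scalar resolution into an (xres, yres) pair."""
    if res is None:
        # The caller would fall back to the first file's resolution.
        return None
    if isinstance(res, (int, float)):
        return (float(res), float(res))
    xres, yres = res
    return (float(xres), float(yres))


square = normalize_res(1.0)
rectangular = normalize_res((1.0, 2.0))
```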

filename_glob = '{state}_lulc_*_2022-Edition.tif'#

Glob expression used to search for files.

This expression should be specific enough that it will not pick up files from other datasets. It should not include a file extension, as the dataset may be in a different file format than what it was originally downloaded as.
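To show what the glob selects, the sketch below fills in the {state} placeholder and filters a few made-up file names with the standard library's fnmatch. Substituting a lowercase abbreviation such as "md" via str.format is an assumption for illustration.

```python
from fnmatch import fnmatch

filename_glob = "{state}_lulc_*_2022-Edition.tif"

# Assumed: the state placeholder is filled with a lowercase abbreviation.
pattern = filename_glob.format(state="md")

# Made-up candidate file names.
candidates = [
    "md_lulc_2018_2022-Edition.tif",
    "va_lulc_2018_2022-Edition.tif",
    "md_lulc_2018.tif",
]
matches = [name for name in candidates if fnmatch(name, pattern)]
```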

filename_regex = '^{state}_lulc_(?P<date>\\d{{4}})_2022-Edition\\.tif$'#

Regular expression used to extract date from filename.

The expression should use named groups. The expression may contain any number of groups. The following groups are specifically searched for by the base class:

  • date: used to calculate mint and maxt for index insertion

  • start: used to calculate mint for index insertion

  • stop: used to calculate maxt for index insertion

When separate_files is True, the following additional groups are searched for to find other files:

  • band: replaced with requested band name
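The interplay between filename_regex and date_format can be sketched with the standard library alone. The doubled braces in the pattern suggest that it is a str.format template; the state value "md" and the file name are assumptions, and the sketch stops at parsing the date rather than computing mint and maxt.

```python
import re
from datetime import datetime

filename_regex = r"^{state}_lulc_(?P<date>\d{{4}})_2022-Edition\.tif$"
date_format = "%Y"

# Fill in the state; "{{4}}" collapses to the regex quantifier "{4}".
pattern = re.compile(filename_regex.format(state="md"))

# A made-up file name that follows the documented pattern.
match = pattern.match("md_lulc_2018_2022-Edition.tif")
assert match is not None

# The named group "date" is parsed with date_format.
date = datetime.strptime(match.group("date"), date_format)
```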

plot(sample, show_titles=True, suptitle=None)[source]#

Plot a sample from the dataset.

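Parameters:
  • sample (dict[str, Any]) – a sample returned by __getitem__()

  • show_titles (bool) – flag indicating whether to show titles above each panel

  • suptitle (str | None) – optional string to use as a suptitle

Returns: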

a matplotlib Figure with the rendered sample

Return type:

Figure

Changed in version 0.3: Method now takes a sample dict, not a Tensor. Additionally, possible to show subplot titles and/or use a custom suptitle.

class torchgeo.datasets.ChesapeakeDC(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)[source]#

Bases: Chesapeake

This subset of the dataset contains data only for Washington, D.C.

class torchgeo.datasets.ChesapeakeDE(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)[source]#

Bases: Chesapeake

This subset of the dataset contains data only for Delaware.

class torchgeo.datasets.ChesapeakeMD(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)[source]#

Bases: Chesapeake

This subset of the dataset contains data only for Maryland.

class torchgeo.datasets.ChesapeakeNY(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)[source]#

Bases: Chesapeake

This subset of the dataset contains data only for New York.

class torchgeo.datasets.ChesapeakePA(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)[source]#

Bases: Chesapeake

This subset of the dataset contains data only for Pennsylvania.

class torchgeo.datasets.ChesapeakeVA(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)[source]#

Bases: Chesapeake

This subset of the dataset contains data only for Virginia.

class torchgeo.datasets.ChesapeakeWV(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)[source]#

Bases: Chesapeake

This subset of the dataset contains data only for West Virginia.

class torchgeo.datasets.ChesapeakeCVPR(root='data', splits=['de-train'], layers=['naip-new', 'lc'], transforms=None, cache=True, download=False, checksum=False)[source]#

Bases: GeoDataset

CVPR 2019 Chesapeake Land Cover dataset.

The CVPR 2019 Chesapeake Land Cover dataset contains two layers of NAIP aerial imagery, Landsat 8 leaf-on and leaf-off imagery, Chesapeake Bay land cover labels, NLCD land cover labels, and Microsoft building footprint labels.

This dataset was organized to accompany the 2019 CVPR paper, “Large Scale High-Resolution Land Cover Mapping with Multi-Resolution Data”.

The paper “Resolving label uncertainty with implicit generative models” added an additional layer of data to this dataset containing a prior over the Chesapeake Bay land cover classes generated from the NLCD land cover labels. For more information about this layer see the dataset documentation.

If you use this dataset in your research, please cite the following paper:

prior_color_matrix = array([[0.0, 0.77254902, 1.0, 1.0], [0.14901961, 0.45098039, 0.0, 1.0], [0.63921569, 1.0, 0.45098039, 1.0], [0.61176471, 0.61176471, 0.61176471, 1.0]])#

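One plausible reading of this matrix is that each row is an RGBA color for one of four prior classes. Under that assumption, a per-pixel prior (four probabilities) can be rendered as a probability-weighted blend of the rows. The `blend` helper below is hypothetical and is not taken from torchgeo's plotting code.

```python
prior_color_matrix = [
    [0.0, 0.77254902, 1.0, 1.0],
    [0.14901961, 0.45098039, 0.0, 1.0],
    [0.63921569, 1.0, 0.45098039, 1.0],
    [0.61176471, 0.61176471, 0.61176471, 1.0],
]


def blend(prior: list[float]) -> list[float]:
    """Mix the four row colors, weighted by a 4-class prior."""
    return [
        sum(p * row[channel] for p, row in zip(prior, prior_color_matrix))
        for channel in range(4)
    ]


# All probability mass on the first class reproduces the first row.
pure = blend([1.0, 0.0, 0.0, 0.0])
# An even split between the first two classes averages their colors.
mixed = blend([0.5, 0.5, 0.0, 0.0])
```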
__init__(root='data', splits=['de-train'], layers=['naip-new', 'lc'], transforms=None, cache=True, download=False, checksum=False)[source]#

Initialize a new Dataset instance.

Parameters:
  • root (str | PathLike[str]) – root directory where dataset can be found

  • splits (Sequence[str]) – a list of strings in the format “{state}-{train,val,test}” indicating the subset of data to use, for example “ny-train”

  • layers (Sequence[str]) – a list containing a subset of “naip-new”, “naip-old”, “lc”, “nlcd”, “landsat-leaf-on”, “landsat-leaf-off”, “buildings”, or “prior_from_cooccurrences_101_31_no_osm_no_buildings” indicating which layers to load

  • transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample and returns a transformed version

  • cache (bool) – if True, cache file handle to speed up repeated sampling

  • download (bool) – if True, download dataset and store it in the root directory

  • checksum (bool) – if True, check the MD5 of the downloaded files (may be slow)
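The documented formats for splits and layers can be checked with a short sketch. The helpers `parse_split` and `check_layers` are hypothetical, and raising ValueError is an assumption. State abbreviations are not validated here because this section does not list the allowed states.

```python
from collections.abc import Sequence

VALID_SUBSETS = {"train", "val", "test"}
VALID_LAYERS = {
    "naip-new",
    "naip-old",
    "lc",
    "nlcd",
    "landsat-leaf-on",
    "landsat-leaf-off",
    "buildings",
    "prior_from_cooccurrences_101_31_no_osm_no_buildings",
}


def parse_split(split: str) -> tuple[str, str]:
    """Split "{state}-{train,val,test}" into (state, subset)."""
    state, sep, subset = split.partition("-")
    if not state or sep != "-" or subset not in VALID_SUBSETS:
        raise ValueError(f"invalid split: {split!r}")
    return state, subset


def check_layers(layers: Sequence[str]) -> None:
    """Raise if any requested layer is not one of the documented names."""
    unknown = [layer for layer in layers if layer not in VALID_LAYERS]
    if unknown:
        raise ValueError(f"unknown layers: {unknown}")


state, subset = parse_split("ny-train")
check_layers(["naip-new", "lc"])  # the documented defaults pass
```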

Raises:
__getitem__(index)[source]#

Retrieve input, target, and/or metadata indexed by spatiotemporal slice.

Parameters:

index (slice | tuple[slice] | tuple[slice, slice] | tuple[slice, slice, slice]) – [xmin:xmax:xres, ymin:ymax:yres, tmin:tmax:tres] coordinates to index.

Returns:

Sample of input, target, and/or metadata at that index.

Raises:

IndexError – If index is not found in the dataset.

Return type:

dict[str, Any]
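The accepted index types follow from how Python passes subscripts: `ds[a:b]` arrives as a single slice, while `ds[a:b, c:d]` arrives as a tuple of slices. The toy class below (hypothetical, not a GeoDataset) normalizes all four documented forms into an (x, y, t) triple. Padding missing dimensions with `slice(None)` is an assumption of this sketch.

```python
from typing import Union

Index = Union[slice, tuple[slice, ...]]


class SliceEcho:
    """Toy object that returns the normalized (x, y, t) slices it receives."""

    def __getitem__(self, index: Index) -> tuple[slice, slice, slice]:
        if isinstance(index, slice):
            index = (index,)
        if len(index) > 3:
            raise IndexError("expected at most three slices")
        # Pad missing dimensions with an unrestricted slice.
        x, y, t = tuple(index) + (slice(None),) * (3 - len(index))
        return x, y, t


echo = SliceEcho()
x, y, t = echo[0:10, 5:15]  # spatial slices only; time left unrestricted
```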

plot(sample, show_titles=True, suptitle=None)[source]#

Plot a sample from the dataset.

Parameters:
  • sample (dict[str, Any]) – a sample returned by __getitem__()

  • show_titles (bool) – flag indicating whether to show titles above each panel

  • suptitle (str | None) – optional string to use as a suptitle

Returns:

a matplotlib Figure with the rendered sample

Return type:

Figure

Added in version 0.4.