Chesapeake Land Cover
- class torchgeo.datasets.Chesapeake(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)
Bases: RasterDataset, ABC
Abstract base class for all Chesapeake datasets.
Chesapeake Bay Land Use and Land Cover (LULC) Database 2022 Edition
The Chesapeake Bay Land Use and Land Cover Database (LULC) facilitates characterization of the landscape and land change for and between discrete time periods. The database was developed by the University of Vermont’s Spatial Analysis Laboratory in cooperation with Chesapeake Conservancy (CC) and U.S. Geological Survey (USGS) as part of a 6-year Cooperative Agreement between Chesapeake Conservancy and the U.S. Environmental Protection Agency (EPA) and a separate Interagency Agreement between the USGS and EPA to provide geospatial support to the Chesapeake Bay Program Office.
The database contains one-meter 13-class Land Cover (LC) and 54-class Land Use/Land Cover (LULC) for all counties within or adjacent to the Chesapeake Bay watershed for 2013/14 and 2017/18, depending on the availability of National Agriculture Imagery Program (NAIP) imagery for each state. Additionally, the 54 LULC classes are generalized into 18 LULC classes for ease of visualization and communication of LULC trends. LC change between discrete time periods, detected by spectral changes in NAIP imagery and LiDAR, represents changes between the 12 land cover classes. LULC change uses LC change to identify where changes are happening, and LC is then translated to LULC to represent transitions between the 54 LULC classes. The LULC change (LULCC) data is represented as an LULC class change transition matrix, which gives users the acres of change between multiple classes. It is organized by 18x18 and 54x54 LULC classes. The Chesapeake Bay Water (CBW) designation indicates that raster tabulations were performed only for areas that fall inside the CBW boundary. For example, a user interested in the CBW portion of a county would use LULC Matrix CBW. Conversely, a user interested in change transitions across the entire county would use LULC Matrix.
If you use this dataset in your research, please cite the following:
- date_format = '%Y'
Date format string used to parse date from filename.
Not used if filename_regex does not contain a date group or start and stop groups.
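As a rough illustration of what a year-only format string yields, the sketch below parses a date string with the standard library. The helper name `year_bounds` is hypothetical, and treating a year-only stamp as covering the whole calendar year is an assumption made for this example, not a statement of how the base class computes its time bounds.

```python
from datetime import datetime, timedelta

DATE_FORMAT = "%Y"  # same value as the date_format attribute above


def year_bounds(date_str: str) -> tuple[datetime, datetime]:
    """Hypothetical helper: return (start, end) of the year named by date_str."""
    # strptime with "%Y" fills the missing month and day with January 1st.
    start = datetime.strptime(date_str, DATE_FORMAT)
    # Assumption: the stamp covers the full year, ending one microsecond
    # before the next year begins.
    end = datetime(start.year + 1, 1, 1) - timedelta(microseconds=1)
    return start, end


mint, maxt = year_bounds("2018")
```

A string that does not fit the format, such as "2018-05", makes `datetime.strptime` raise `ValueError`.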
- is_image = False
True if the dataset only contains model inputs (such as images). False if the dataset only contains ground truth model outputs (such as segmentation masks).
The sample returned by the dataset/data loader will use the “image” key if is_image is True, otherwise it will use the “mask” key.
For datasets with both model inputs and outputs, the recommended approach is to use 2 RasterDataset instances and combine them using an IntersectionDataset.
- cmap: ClassVar[dict[int, tuple[int, int, int, int]]] = {11: (0, 92, 230, 255), 12: (0, 92, 230, 255), 13: (0, 92, 230, 255), 14: (0, 92, 230, 255), 15: (0, 92, 230, 255), 21: (0, 0, 0, 255), 22: (235, 6, 2, 255), 23: (89, 89, 89, 255), 24: (138, 138, 136, 255), 25: (138, 138, 136, 255), 26: (138, 138, 136, 255), 27: (115, 115, 0, 255), 28: (233, 255, 190, 255), 29: (255, 255, 115, 255), 41: (38, 115, 0, 255), 42: (56, 168, 0, 255), 51: (255, 255, 115, 255), 52: (255, 255, 115, 255), 53: (255, 255, 115, 255), 54: (170, 255, 0, 255), 55: (170, 255, 0, 255), 56: (170, 255, 0, 255), 62: (77, 209, 148, 255), 63: (77, 209, 148, 255), 64: (56, 168, 0, 255), 65: (38, 115, 0, 255), 72: (186, 245, 217, 255), 73: (186, 245, 217, 255), 74: (56, 168, 0, 255), 75: (38, 115, 0, 255), 83: (255, 211, 127, 255), 84: (255, 211, 127, 255), 85: (255, 211, 127, 255), 91: (0, 168, 132, 255), 92: (0, 168, 132, 255), 93: (0, 168, 132, 255), 94: (56, 168, 0, 255), 95: (38, 115, 0, 255), 127: (255, 255, 255, 255)}
Color map for the dataset, used for plotting.
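To show how a class-ID-to-RGBA mapping of this shape can be applied, the sketch below colors a tiny hand-made mask in pure Python. `CMAP_SUBSET` copies three entries from the value above; the `colorize` helper, the fallback color, and the sample mask are hypothetical and are not part of the dataset class.

```python
# Three entries copied from the cmap attribute above (class ID -> RGBA).
CMAP_SUBSET: dict[int, tuple[int, int, int, int]] = {
    11: (0, 92, 230, 255),
    41: (38, 115, 0, 255),
    127: (255, 255, 255, 255),
}


def colorize(mask, cmap, fallback=(0, 0, 0, 0)):
    """Hypothetical helper: map each class ID in a 2D list to an RGBA tuple."""
    # IDs missing from the map get the transparent fallback color.
    return [[cmap.get(value, fallback) for value in row] for row in mask]


# Made-up 2x2 mask; 99 is deliberately absent from the color map.
mask = [[11, 41], [127, 99]]
rgba = colorize(mask, CMAP_SUBSET)
```

With array libraries the same lookup is usually done through a lookup table rather than a per-pixel loop; the nested list version is only meant to make the mapping explicit.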
- __init__(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)
Initialize a new Chesapeake instance.
- Parameters:
paths (str | PathLike[str] | Iterable[str | PathLike[str]]) – one or more root directories to search or files to load
crs (CRS | None) – coordinate reference system (CRS) to warp to (defaults to the CRS of the first file found)
res (float | tuple[float, float] | None) – resolution of the dataset in units of CRS in (xres, yres) format. If a single float is provided, it is used for both the x and y resolution. (defaults to the resolution of the first file found)
transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample and returns a transformed version
cache (bool) – if True, cache file handle to speed up repeated sampling
download (bool) – if True, download dataset and store it in the root directory
checksum (bool) – if True, check the MD5 of the downloaded files (may be slow)
time_series (bool) – if True, stack data along the time series dimension [T, C, H, W]. If False, merge data into a [C, H, W] mosaic.
- Raises:
DatasetNotFoundError – If dataset is not found and download is False.
Added in version 0.9: The time_series parameter.
Changed in version 0.5: root was renamed to paths.
- filename_glob = '{state}_lulc_*_2022-Edition.tif'
Glob expression used to search for files.
This expression should be specific enough that it will not pick up files from other datasets. It should not include a file extension, as the dataset may be in a different file format than what it was originally downloaded as.
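The glob above contains a `{state}` placeholder. The sketch below fills it in and tests a few names with the standard library's `fnmatch`. It assumes the placeholder is replaced by a lowercase two-letter state code such as "de"; the candidate file names are made up for illustration.

```python
from fnmatch import fnmatch

FILENAME_GLOB = "{state}_lulc_*_2022-Edition.tif"  # value shown above

# Assumption: the placeholder is filled with a lowercase state code.
pattern = FILENAME_GLOB.format(state="de")

# Made-up file names: one match, one other state, one other product.
candidates = [
    "de_lulc_2018_2022-Edition.tif",
    "md_lulc_2018_2022-Edition.tif",
    "de_lc_2018_2022-Edition.tif",
]
matches = [name for name in candidates if fnmatch(name, pattern)]
```

Only the first candidate survives, which is the sense in which the expression is specific enough to avoid picking up files from other states or products.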
- filename_regex = '^{state}_lulc_(?P<date>\\d{{4}})_2022-Edition\\.tif$'
Regular expression used to extract date from filename.
The expression should use named groups. The expression may contain any number of groups. The following groups are specifically searched for by the base class:
date: used to calculate mint and maxt for index insertion
start: used to calculate mint for index insertion
stop: used to calculate maxt for index insertion
When separate_files is True, the following additional groups are searched for to find other files:
band: replaced with requested band name
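The doubled braces in the expression above matter once the `{state}` placeholder is filled with `str.format`. The sketch below shows that step and the extraction of the named `date` group. The state code "de" and the file names are assumptions for illustration.

```python
import re

# Same pattern as above, written as a raw string.
FILENAME_REGEX = r"^{state}_lulc_(?P<date>\d{{4}})_2022-Edition\.tif$"

# str.format fills {state} and collapses the doubled braces {{4}} to {4}.
regex = re.compile(FILENAME_REGEX.format(state="de"))

# A made-up file name for the assumed state code.
match = regex.match("de_lulc_2018_2022-Edition.tif")
date = match.group("date") if match else None

# A file for a different state does not match the formatted pattern.
other = regex.match("md_lulc_2018_2022-Edition.tif")
```

Without the doubled braces, `str.format` would try to treat `{4}` as a positional replacement field instead of leaving a regex quantifier behind.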
- plot(sample, show_titles=True, suptitle=None)
Plot a sample from the dataset.
- Parameters:
- Returns:
a matplotlib Figure with the rendered sample
- Return type:
Changed in version 0.3: Method now takes a sample dict, not a Tensor. Additionally, possible to show subplot titles and/or use a custom suptitle.
- class torchgeo.datasets.ChesapeakeDC(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)
Bases: Chesapeake
This subset of the dataset contains data only for Washington, D.C.
- class torchgeo.datasets.ChesapeakeDE(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)
Bases: Chesapeake
This subset of the dataset contains data only for Delaware.
- class torchgeo.datasets.ChesapeakeMD(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)
Bases: Chesapeake
This subset of the dataset contains data only for Maryland.
- class torchgeo.datasets.ChesapeakeNY(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)
Bases: Chesapeake
This subset of the dataset contains data only for New York.
- class torchgeo.datasets.ChesapeakePA(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)
Bases: Chesapeake
This subset of the dataset contains data only for Pennsylvania.
- class torchgeo.datasets.ChesapeakeVA(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)
Bases: Chesapeake
This subset of the dataset contains data only for Virginia.
- class torchgeo.datasets.ChesapeakeWV(paths='data', crs=None, res=None, transforms=None, cache=True, download=False, checksum=False, time_series=False)
Bases: Chesapeake
This subset of the dataset contains data only for West Virginia.
- class torchgeo.datasets.ChesapeakeCVPR(root='data', splits=['de-train'], layers=['naip-new', 'lc'], transforms=None, cache=True, download=False, checksum=False)
Bases: GeoDataset
CVPR 2019 Chesapeake Land Cover dataset.
The CVPR 2019 Chesapeake Land Cover dataset contains two layers of NAIP aerial imagery, Landsat 8 leaf-on and leaf-off imagery, Chesapeake Bay land cover labels, NLCD land cover labels, and Microsoft building footprint labels.
This dataset was organized to accompany the 2019 CVPR paper, “Large Scale High-Resolution Land Cover Mapping with Multi-Resolution Data”.
The paper “Resolving label uncertainty with implicit generative models” added a further layer to this dataset: a prior over the Chesapeake Bay land cover classes, generated from the NLCD land cover labels. For more information about this layer, see the dataset documentation.
If you use this dataset in your research, please cite the following paper:
- prior_color_matrix = array([[0. , 0.77254902, 1. , 1. ], [0.14901961, 0.45098039, 0. , 1. ], [0.63921569, 1. , 0.45098039, 1. ], [0.61176471, 0.61176471, 0.61176471, 1. ]])
- __init__(root='data', splits=['de-train'], layers=['naip-new', 'lc'], transforms=None, cache=True, download=False, checksum=False)
Initialize a new Dataset instance.
- Parameters:
root (str | PathLike[str]) – root directory where dataset can be found
splits (Sequence[str]) – a list of strings in the format “{state}-{train,val,test}” indicating the subset of data to use, for example “ny-train”
layers (Sequence[str]) – a list containing a subset of “naip-new”, “naip-old”, “lc”, “nlcd”, “landsat-leaf-on”, “landsat-leaf-off”, “buildings”, or “prior_from_cooccurrences_101_31_no_osm_no_buildings” indicating which layers to load
transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample and returns a transformed version
cache (bool) – if True, cache file handle to speed up repeated sampling
download (bool) – if True, download dataset and store it in the root directory
checksum (bool) – if True, check the MD5 of the downloaded files (may be slow)
- Raises:
AssertionError – if splits or layers are not valid
DatasetNotFoundError – If dataset is not found and download is False.
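The argument rules described above can be sketched as a small stand-alone check. The layer names come from the parameter description. The `check_arguments` helper is hypothetical, and `ASSUMED_STATES` lists only the two state codes that appear in this page's examples, so the real set of accepted states may be larger.

```python
# Layer names as listed in the layers parameter description.
VALID_LAYERS = (
    "naip-new",
    "naip-old",
    "lc",
    "nlcd",
    "landsat-leaf-on",
    "landsat-leaf-off",
    "buildings",
    "prior_from_cooccurrences_101_31_no_osm_no_buildings",
)
# Assumption: only the states named in the examples on this page.
ASSUMED_STATES = ("de", "ny")
SUBSETS = ("train", "val", "test")


def check_arguments(splits, layers):
    """Hypothetical check mirroring the documented AssertionError."""
    for split in splits:
        # "de-train" -> state "de", subset "train"
        state, _, subset = split.partition("-")
        if state not in ASSUMED_STATES or subset not in SUBSETS:
            raise AssertionError(f"invalid split: {split!r}")
    for layer in layers:
        if layer not in VALID_LAYERS:
            raise AssertionError(f"invalid layer: {layer!r}")


# The documented defaults pass the check without raising.
check_arguments(["de-train"], ["naip-new", "lc"])
```

Raising `AssertionError` explicitly, rather than using an `assert` statement, keeps the check active when Python runs with optimizations enabled.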
- __getitem__(index)
Retrieve input, target, and/or metadata indexed by spatiotemporal slice.
- Parameters:
index (slice | tuple[slice] | tuple[slice, slice] | tuple[slice, slice, slice]) – [xmin:xmax:xres, ymin:ymax:yres, tmin:tmax:tres] coordinates to index.
- Returns:
Sample of input, target, and/or metadata at that index.
- Raises:
IndexError – If index is not found in the dataset.
- Return type:
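The index forms listed above rely on ordinary Python slicing syntax. The sketch below uses a hypothetical stand-in class, not the dataset itself, to show what `__getitem__` receives for one slice and for a tuple of slices. The coordinate values are made up; real queries would use the units of the dataset's CRS and time axis.

```python
class SliceEcho:
    """Hypothetical stand-in that reports the bounds it was indexed with."""

    def __getitem__(self, index):
        # A single slice arrives bare; wrap it so both forms are handled alike.
        if isinstance(index, slice):
            index = (index,)
        bounds = {}
        # Slices map positionally onto the x, y, and t axes.
        for axis, s in zip(("x", "y", "t"), index):
            bounds[axis] = (s.start, s.stop, s.step)  # (min, max, res)
        return bounds


echo = SliceEcho()
xy_query = echo[0:100:1, 200:300:1]  # x and y bounds, each with a resolution
x_query = echo[0:100]                # x bounds only, resolution left unset
```

Omitted parts of a slice arrive as `None`, so a receiver can tell "no resolution given" apart from an explicit value.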