OpenStreetMap

class torchgeo.datasets.OpenStreetMap(bbox, classes, paths='data', res=(0.0001, 0.0001), transforms=None, download=False)

Bases: VectorDataset

OpenStreetMap dataset.

The OpenStreetMap dataset provides access to crowd-sourced geographic data. This implementation uses the Overpass API to query and download OSM data for a specified geographic bounding box at initialization, then allows spatial querying of the cached data.
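To make the Overpass step more concrete, here is a minimal sketch of how a bounding box and a few tag filters could be turned into an Overpass QL query string. It is not torchgeo's internal code: the function name `build_overpass_query`, the `timeout` parameter, and the exact query layout are assumptions for illustration. The one fixed detail is that Overpass QL expects bounding boxes as (south, west, north, east), while this dataset takes (xmin, ymin, xmax, ymax).

```python
def build_overpass_query(bbox, tag_filters, timeout=60):
    """Build an Overpass QL query string (hypothetical helper, not torchgeo's code).

    bbox: (xmin, ymin, xmax, ymax) in EPSG:4326.
    tag_filters: list of dicts such as {'building': '*'}; '*' means any value.
    """
    xmin, ymin, xmax, ymax = bbox
    # Overpass QL expects bounding boxes as (south, west, north, east).
    area = f"({ymin},{xmin},{ymax},{xmax})"
    statements = []
    for tag_filter in tag_filters:
        conditions = ""
        for key, value in tag_filter.items():
            if value == "*":
                conditions += f'["{key}"]'  # key exists, any value
            else:
                conditions += f'["{key}"="{value}"]'  # exact value match
        # nwr selects nodes, ways, and relations in one statement.
        statements.append(f"nwr{conditions}{area};")
    body = "".join(statements)
    return f"[out:json][timeout:{timeout}];({body});out geom;"


query = build_overpass_query(
    (-0.13, 51.50, -0.12, 51.51),
    [{"building": "*"}, {"landuse": "commercial"}],
)
print(query)
```

The resulting string is the kind of payload one would send to an Overpass endpoint; the dataset handles the request and caching itself, so this is only meant to show the shape of the query.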

Dataset features

  • Vector data (points, lines, polygons) for various geographic features

  • Flexible querying by class configuration (buildings, highways, amenities, etc.)

  • Data fetched once at initialization and cached locally

  • Class-based labeling with priority-based assignment

Class priority and label assignment

classes is a list of dicts defining feature classes. Each dict has a name (str) and a selector (list of OSM tag filters). Features are assigned labels based on the order of classes in this list:

  • First class gets label=1, second gets label=2, etc.

  • If a feature matches multiple classes, it receives the label of the first matching class

  • Features that don’t match any class get label=0 (background)

Example:

classes = [
    {'name': 'buildings', 'selector': [{'building': '*'}]},  # label=1
    {'name': 'roads', 'selector': [{'highway': '*'}]},  # label=2
    {'name': 'commercial', 'selector': [{'landuse': 'commercial'}]},  # label=3
]

# A feature with tags {'building': 'yes', 'landuse': 'commercial'}
# would get label=1 (buildings) because buildings comes first
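The first-match rule can be sketched in plain Python. This is an illustration of the documented behavior, not the library's implementation: the helpers `matches` and `assign_label` are hypothetical, and the sketch assumes that a feature must satisfy every key within a single tag filter, that any one filter in a selector list is enough to match the class, and that '*' accepts any value.

```python
def matches(tags, tag_filter):
    """True if tags satisfy every key in one filter; '*' accepts any value."""
    return all(
        key in tags and (value == "*" or tags[key] == value)
        for key, value in tag_filter.items()
    )


def assign_label(tags, classes):
    """Return the 1-based index of the first matching class, or 0 for background."""
    for label, cls in enumerate(classes, start=1):
        # Assumption: matching any one filter in the selector list is sufficient.
        if any(matches(tags, tag_filter) for tag_filter in cls["selector"]):
            return label
    return 0  # no class matched: background


classes = [
    {"name": "buildings", "selector": [{"building": "*"}]},
    {"name": "roads", "selector": [{"highway": "*"}]},
    {"name": "commercial", "selector": [{"landuse": "commercial"}]},
]

print(assign_label({"building": "yes", "landuse": "commercial"}, classes))  # 1
print(assign_label({"landuse": "commercial"}, classes))  # 3
print(assign_label({"natural": "water"}, classes))  # 0
```

Because the loop returns on the first hit, reordering the classes list changes which label an ambiguous feature receives, so more specific classes should come before more general ones.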

If you use this dataset in your research, please cite the following source:

Added in version 0.8.

__init__(bbox, classes, paths='data', res=(0.0001, 0.0001), transforms=None, download=False)

Initialize a new OpenStreetMap dataset instance.

Parameters:
  • bbox (tuple[float, float, float, float]) – bounding box for initial data fetch as (xmin, ymin, xmax, ymax) in EPSG:4326

  • classes (list[dict[str, Any]]) – list of dicts defining feature classes. Each dict must have:

      - ‘name’ (str): class name
      - ‘selector’ (list[dict[str, Any]]): list of OSM tag filters

    Features get labels 1-N based on class order, with first match taking priority.

  • paths (str | PathLike[str]) – directory where the dataset will be stored

  • res (float | tuple[float, float]) – resolution of the dataset in units of EPSG:4326 (degrees). Default is 0.0001°. For small AOIs, consider using a finer resolution to avoid pixelated plots. A good rule of thumb: res = min(bbox_width, bbox_height) / 400, which gives ~400 pixels along the shorter side of the bounding box.

  • transforms (Callable[[dict[str, Any]], dict[str, Any]] | None) – a function/transform that takes an input sample and returns a transformed version

  • download (bool) – if True, download the dataset and store it in the paths directory
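The rule of thumb for res can be written as a small helper. The function name `suggest_res` and its `target_pixels` parameter are hypothetical and only restate the formula given for the res parameter; the bounding box is an arbitrary example.

```python
def suggest_res(bbox, target_pixels=400):
    """Resolution in degrees giving ~target_pixels along the shorter bbox side."""
    xmin, ymin, xmax, ymax = bbox
    width = xmax - xmin
    height = ymax - ymin
    return min(width, height) / target_pixels


bbox = (-0.14, 51.50, -0.12, 51.51)  # (xmin, ymin, xmax, ymax) in EPSG:4326
res = suggest_res(bbox)
print(res)  # roughly 2.5e-05 degrees, finer than the 0.0001 default

# Passing the result to the constructor (requires torchgeo and network access;
# `classes` is a list of class dicts as described above):
# from torchgeo.datasets import OpenStreetMap
# ds = OpenStreetMap(bbox, classes, paths='data', res=res, download=True)
```

For this example the shorter side is 0.01°, so the suggested resolution is about four times finer than the default.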

Raises:
plot(sample, show_titles=True, suptitle=None)

Plot a sample from the dataset.

Parameters:
  • sample (dict[str, Any]) – a sample returned by VectorDataset.__getitem__()

  • show_titles (bool) – flag indicating whether to show titles above each panel

  • suptitle (str | None) – optional string to use as a suptitle

Returns:

a matplotlib Figure with the rendered sample

Return type:

Figure
