ChangeViT

class torchgeo.models.ChangeViT(backbone, img_size=256, in_channels=3, num_classes=1, pretrained=False, **kwargs)

Bases: Module

ChangeViT model for change detection.

ChangeViT implementation that uses a plain Vision Transformer (ViT) as the backbone, combined with a detail-capture module and a feature-injection mechanism.

If you use this model in your research, please cite the following paper:

Note

For best results on LEVIR-CD as reported in the paper, use:

  • Backbone: vit_large_patch16_dinov3.sat493m (DINOv3-Large pretrained on satellite imagery)

  • Loss: Combined BCE+Dice loss (not yet implemented in ChangeDetectionTask)

  • Training: 80k steps with batch size 48

  • Image size: 256x256 patches

Added in version 0.8.
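
The combined BCE+Dice loss mentioned in the note above can be sketched in plain PyTorch as follows. This is a minimal illustration, not the formulation from the paper or from torchgeo: the function name `bce_dice_loss`, the equal weighting (`bce_weight=0.5`), and the smoothing constant (`eps=1.0`) are all assumptions chosen for the example.

```python
import torch
import torch.nn.functional as F


def bce_dice_loss(logits, target, bce_weight=0.5, eps=1.0):
    """Weighted sum of BCE-with-logits and soft Dice loss for binary masks.

    logits: raw model outputs, shape [B, 1, H, W]
    target: binary change mask with the same shape, values in {0, 1}
    bce_weight and eps are illustrative defaults (assumptions).
    """
    target = target.float()

    # Pixel-wise binary cross-entropy computed directly on the logits.
    bce = F.binary_cross_entropy_with_logits(logits, target)

    # Soft Dice: overlap between predicted probabilities and the mask,
    # computed per sample and then averaged over the batch.
    probs = torch.sigmoid(logits)
    dims = (1, 2, 3)
    intersection = (probs * target).sum(dim=dims)
    denominator = probs.sum(dim=dims) + target.sum(dim=dims)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    dice_loss = 1.0 - dice.mean()

    return bce_weight * bce + (1.0 - bce_weight) * dice_loss


# Example with random logits and a random binary mask.
logits = torch.randn(2, 1, 256, 256)
target = torch.randint(0, 2, (2, 1, 256, 256))
loss = bce_dice_loss(logits, target)
```

The result is a scalar tensor that can be passed to `backward()` in a custom training loop. The Dice term is often added because changed pixels tend to be a small fraction of each image, which BCE alone can under-weight.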

__init__(backbone, img_size=256, in_channels=3, num_classes=1, pretrained=False, **kwargs)

Initialize ChangeViT model.

Parameters:
  • backbone (str) – Name of the timm ViT model to use as backbone (e.g., 'vit_small_patch14_dinov2', 'vit_tiny_patch16_224')

  • img_size (int) – Input image size (default: 256)

  • in_channels (int) – Number of input channels per temporal frame (default: 3)

  • num_classes (int) – Number of output classes (default: 1)

  • pretrained (bool) – Whether to load pretrained weights from timm (default: False)

  • **kwargs (Any) – Additional keyword arguments passed to timm backbone

forward(x)

Forward pass of ChangeViT.

Parameters:

x (Tensor) – Bitemporal input tensor of shape [B, T, C, H, W], where T indexes the two temporal frames

Returns:

Change detection logits of shape [B, 1, H, W]

Return type:

Tensor
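
The input and output shapes of `forward` can be illustrated with a short sketch that stacks a pre-change and a post-change image into a [B, T, C, H, W] tensor and thresholds the resulting logits into a binary mask. The helpers `stack_bitemporal` and `predict_change_mask`, the 0.5 threshold, and the `StandInModel` class are hypothetical and exist only for this example; `StandInModel` is not ChangeViT and merely mimics its documented shapes so the snippet runs without torchgeo or timm.

```python
import torch
from torch import nn


def stack_bitemporal(pre, post):
    """Stack two [B, C, H, W] images into one [B, T=2, C, H, W] tensor."""
    if pre.shape != post.shape:
        raise ValueError("pre and post images must have the same shape")
    return torch.stack([pre, post], dim=1)


@torch.no_grad()
def predict_change_mask(model, pre, post, threshold=0.5):
    """Run a bitemporal model and threshold its logits into a boolean mask.

    The 0.5 probability threshold is an assumption for the single-class case.
    """
    model.eval()
    logits = model(stack_bitemporal(pre, post))  # expected [B, 1, H, W]
    return torch.sigmoid(logits) > threshold


class StandInModel(nn.Module):
    """Hypothetical placeholder with the documented input/output shapes.

    It is NOT ChangeViT; it only lets the sketch run on its own.
    """

    def __init__(self, in_channels=3):
        super().__init__()
        # Merge the two frames along the channel axis, then predict one map.
        self.head = nn.Conv2d(2 * in_channels, 1, kernel_size=1)

    def forward(self, x):
        b, t, c, h, w = x.shape
        return self.head(x.reshape(b, t * c, h, w))


pre = torch.rand(2, 3, 256, 256)   # images before the change
post = torch.rand(2, 3, 256, 256)  # images after the change
mask = predict_change_mask(StandInModel(), pre, post)
```

Based on the signature documented above, an instance such as `ChangeViT(backbone='vit_tiny_patch16_224')` should be usable in place of `StandInModel()` when torchgeo 0.8 or later is installed.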