STeInFormer: Spatial–Temporal Interaction Transformer Architecture for Remote Sensing Change Detection
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10815617/ |
Summary: | Convolutional neural networks and attention mechanisms have greatly benefited remote sensing change detection (RSCD) because of their outstanding discriminative ability. Existing RSCD methods often follow a paradigm that uses a noninteractive Siamese neural network for multitemporal feature extraction and change detection heads for feature fusion and change representation. However, this paradigm does not account for the temporal and spatial characteristics of RSCD, and the resulting lack of spatial–temporal interaction hinders high-quality feature extraction. To address this problem, we present a spatial–temporal interaction Transformer architecture for multitemporal feature extraction, which is the first general backbone network specifically designed for RSCD. In addition, we propose a parameter-free multifrequency token mixer to integrate frequency-domain features that provide spectral information for RSCD. Experimental results on three datasets validate the effectiveness of the proposed method, which can outperform state-of-the-art methods and achieve the most favorable efficiency-accuracy tradeoff. |
---|---|
ISSN: | 1939-1404 2151-1535 |