DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic Segmentation

Bibliographic Details
Main Authors: Zirui Shen, Wanjie Liu, Sheng Xu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/10750398/
Description
Summary: In recent years, vision transformers, built on attention computation, have gradually displaced convolutional neural networks in the visual domain, making pure transformer networks a growing trend. Despite significant advances in semantic segmentation models for remote sensing, a critical gap remains in effectively capturing both local and global contextual information: existing models often excel at either fine-grained local detail or long-range dependencies, but not both. Our work addresses this gap by proposing DS-SwinUNet, a model that integrates convolutional operations with transformer-based attention through a novel DS-transformer block. The block consists of a two-scale attention mechanism incorporating convolutional computation and a modified feed-forward network (FFN), and it is placed in the skip connections of a Swin-UNet backbone. Experiments demonstrate that the proposed transformer module improves mIoU over the original Swin-UNet by 2.73% on the WHDLD dataset and by 0.41% on the OpenEarthMap dataset.
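
The abstract describes the DS-transformer block only at a high level. Below is a minimal, hypothetical PyTorch sketch of what a double-scale attention block in a skip connection might look like, assuming the two scales are a full-resolution token sequence and a 2x-downsampled one, with a depthwise convolution supplying local context. The class name, dimensions, fusion scheme, and FFN form are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch of a double-scale attention skip-connection block.
    # All module names, dimensions, and the branch-fusion scheme are assumptions
    # made for illustration; they do not reproduce the paper's DS-transformer.
    import torch
    import torch.nn as nn


    class DoubleScaleAttentionBlock(nn.Module):
        def __init__(self, dim: int, num_heads: int = 4):
            super().__init__()
            self.norm1 = nn.LayerNorm(dim)
            # Fine-scale self-attention over the full-resolution token sequence.
            self.attn_fine = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            # Coarse-scale attention over a 2x average-pooled sequence.
            self.pool = nn.AvgPool2d(kernel_size=2)
            self.attn_coarse = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            # Depthwise convolution injects local (convolutional) context.
            self.local_conv = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)
            self.norm2 = nn.LayerNorm(dim)
            # A plain MLP stands in here for the paper's modified FFN.
            self.ffn = nn.Sequential(
                nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (B, C, H, W) encoder feature map arriving on the skip connection.
            b, c, h, w = x.shape
            tokens = self.norm1(x.flatten(2).transpose(1, 2))      # (B, H*W, C)
            fine, _ = self.attn_fine(tokens, tokens, tokens)       # fine scale
            coarse_map = self.pool(x)                              # (B, C, H/2, W/2)
            coarse_tok = coarse_map.flatten(2).transpose(1, 2)     # (B, HW/4, C)
            # Cross-scale attention: fine tokens query the coarse tokens.
            cross, _ = self.attn_coarse(tokens, coarse_tok, coarse_tok)
            local = self.local_conv(x).flatten(2).transpose(1, 2)  # conv branch
            y = tokens + fine + cross + local                      # fuse branches
            y = y + self.ffn(self.norm2(y))                        # FFN residual
            return y.transpose(1, 2).reshape(b, c, h, w)


    # Usage: refine a skip-connection feature map before concatenating it with
    # the matching decoder feature in a Swin-UNet-style architecture.
    skip = torch.randn(1, 96, 56, 56)
    refined = DoubleScaleAttentionBlock(dim=96)(skip)
    print(refined.shape)  # torch.Size([1, 96, 56, 56])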
ISSN: 1939-1404, 2151-1535