DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic Segmentation

In recent years, vision transformers have gradually displaced convolutional neural networks in the visual domain through attention computation, making pure transformer networks a trend. Despite significant advances in semantic segmentation models for remote sensing, a critical gap remains in effectively capturing both local and global contextual information.


Bibliographic Details
Main Authors: Zirui Shen, Wanjie Liu, Sheng Xu
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Deep learning; semantic segmentation; skip connection; transformer
Online Access:https://ieeexplore.ieee.org/document/10750398/
author Zirui Shen
Wanjie Liu
Sheng Xu
collection DOAJ
description In recent years, vision transformers have gradually displaced convolutional neural networks in the visual domain through attention computation, making pure transformer networks a trend. Despite significant advances in semantic segmentation models for remote sensing, a critical gap remains in effectively capturing both local and global contextual information: existing models often excel at either fine-grained local detail or long-range dependencies, but not both. Our work addresses this gap by proposing the DS-SwinUNet model, which integrates convolutional operations with transformer-based attention mechanisms through the novel DS-transformer block. The block consists of a double-scale attention mechanism incorporating convolutional computation and a modified FFN, and it is placed in the skip-connection section with Swin-UNet as the backbone. Experiments demonstrate that the proposed transformer module improves mIoU over the original Swin-UNet by 2.73% on the WHDLD dataset and by 0.41% on the OpenEarthMap dataset.
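Since the description sketches an architecture (double-scale attention plus a modified FFN, applied on the skip connections of a Swin-UNet backbone), the following is a minimal PyTorch sketch of one plausible reading of that design. It is an illustration only: the module names (DoubleScaleAttention, ConvFFN, DSTransformerBlock), the strided-convolution downsampling for the second scale, and the concatenate-then-project fusion are all assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn

class ConvFFN(nn.Module):
    # A common "modified FFN": a depthwise convolution between the two
    # linear layers reintroduces local spatial mixing. (Assumed design.)
    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden_dim)
        self.dwconv = nn.Conv2d(hidden_dim, hidden_dim, 3, padding=1, groups=hidden_dim)
        self.act = nn.GELU()
        self.fc2 = nn.Linear(hidden_dim, dim)

    def forward(self, x, H, W):
        # x: (B, H*W, C) tokens taken from a skip connection
        x = self.fc1(x)
        B, N, C = x.shape
        x = x.transpose(1, 2).reshape(B, C, H, W)  # tokens -> feature map
        x = self.act(self.dwconv(x))               # local convolutional mixing
        x = x.flatten(2).transpose(1, 2)           # feature map -> tokens
        return self.fc2(x)

class DoubleScaleAttention(nn.Module):
    # Attention over the original tokens (fine scale) plus cross-attention
    # to convolutionally downsampled tokens (coarse scale), then fused.
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn_fine = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.down = nn.Conv2d(dim, dim, kernel_size=2, stride=2)  # 2x downsampling
        self.attn_coarse = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, x, H, W):
        # H and W are assumed even so the strided convolution divides cleanly.
        fine, _ = self.attn_fine(x, x, x)
        B, N, C = x.shape
        feat = x.transpose(1, 2).reshape(B, C, H, W)
        coarse_kv = self.down(feat).flatten(2).transpose(1, 2)  # (B, N/4, C)
        coarse, _ = self.attn_coarse(x, coarse_kv, coarse_kv)
        return self.fuse(torch.cat([fine, coarse], dim=-1))

class DSTransformerBlock(nn.Module):
    # Pre-norm transformer block intended to sit on a skip connection,
    # refining encoder tokens before they are merged into the decoder.
    def __init__(self, dim, num_heads=4, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = DoubleScaleAttention(dim, num_heads)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = ConvFFN(dim, dim * mlp_ratio)

    def forward(self, x, H, W):
        x = x + self.attn(self.norm1(x), H, W)
        x = x + self.ffn(self.norm2(x), H, W)
        return x

# Example: tokens from a hypothetical 32x32 encoder stage with 96 channels.
# block = DSTransformerBlock(dim=96)
# out = block(torch.randn(2, 32 * 32, 96), H=32, W=32)  # -> (2, 1024, 96)

In DS-SwinUNet's setting, one such block per encoder resolution would process the skip-connection tokens before they reach the corresponding decoder stage; the exact attention scales and fusion rule used by the authors may differ from this sketch.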
format Article
id doaj-art-26ad79b8f55148f7acd047723172e6be
institution Kabale University
issn 1939-1404
2151-1535
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling Record ID: doaj-art-26ad79b8f55148f7acd047723172e6be (updated 2025-01-31T00:00:24Z)
Published: IEEE, 2025-01-01, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 4382-4395, ISSN 1939-1404, 2151-1535
DOI: 10.1109/JSTARS.2024.3496725 (IEEE document 10750398)
Title: DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic Segmentation
Authors: Zirui Shen (https://orcid.org/0009-0003-6856-2480), Wanjie Liu (https://orcid.org/0009-0005-6296-4280), and Sheng Xu (https://orcid.org/0000-0002-9017-1510), all with the College of Information Science and Technology & College of Artificial Intelligence, Nanjing Forestry University, Nanjing, China
Online Access: https://ieeexplore.ieee.org/document/10750398/
Subjects: Deep learning; semantic segmentation; skip connection; transformer
title DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic Segmentation
topic Deep learning
semantic segmentation
skip connection
transformer
url https://ieeexplore.ieee.org/document/10750398/