DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic Segmentation
Saved in:
Main Authors: | Zirui Shen, Wanjie Liu, Sheng Xu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
Subjects: | Deep learning; semantic segmentation; skip connection; transformer |
Online Access: | https://ieeexplore.ieee.org/document/10750398/ |
_version_ | 1832576742490898432 |
author | Zirui Shen Wanjie Liu Sheng Xu |
author_facet | Zirui Shen Wanjie Liu Sheng Xu |
author_sort | Zirui Shen |
collection | DOAJ |
description | In recent years, vision transformers have gradually displaced convolutional neural networks in the visual domain by replacing convolution with attention computation, making pure transformer networks a growing trend. Despite significant advances in semantic segmentation models for remote sensing, a critical gap remains in effectively capturing both local and global contextual information: existing models often excel at either fine-grained local detail or long-range dependencies, but not both. Our work addresses this gap by proposing DS-SwinUNet, a model that integrates convolutional operations with transformer-based attention through a novel DS-transformer block. The block consists of a double-scale attention mechanism incorporating convolutional computation and a modified feed-forward network (FFN), and is placed in the skip connections with Swin-UNet as the backbone. Experiments demonstrate that the proposed transformer module improves mIoU by 2.73% and 0.41% over the original Swin-UNet on the WHDLD and OpenEarthMap segmentation datasets, respectively. |
format | Article |
id | doaj-art-26ad79b8f55148f7acd047723172e6be |
institution | Kabale University |
issn | 1939-1404 2151-1535 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
spelling | doaj-art-26ad79b8f55148f7acd047723172e6be2025-01-31T00:00:24ZengIEEEIEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing1939-14042151-15352025-01-01184382439510.1109/JSTARS.2024.349672510750398DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic SegmentationZirui Shen0https://orcid.org/0009-0003-6856-2480Wanjie Liu1https://orcid.org/0009-0005-6296-4280Sheng Xu2https://orcid.org/0000-0002-9017-1510College of Information Science and Technology & College of Artificial Intelligence, Nanjing Forestry University, Nanjing, ChinaCollege of Information Science and Technology & College of Artificial Intelligence, Nanjing Forestry University, Nanjing, ChinaCollege of Information Science and Technology & College of Artificial Intelligence, Nanjing Forestry University, Nanjing, ChinaIn recent years, the development of visual transformer has gradually replaced convolutional neural networks in the visual domain with attention computation, causing pure transformer networks to become a trend. Despite significant advancements in semantic segmentation models for remote sensing, a critical gap remains in effectively capturing both local and global contextual information. Existing models often excel in either fine-grained local detail or long-range dependencies, but not both. Our work addresses this research gap by proposing the DS-SwinUNet model integrating convolutional operations with transformer-based attention mechanisms through the novel DS-transformer block, which consists of a two-scale attention mechanism incorporating convolutional computation and a modified FFN, and the module is placed in the skip connection section with Swin-UNet as the backbone. 
Experiments demonstrate that the transformer module proposed in this article improves the mIoU by 2.73% and 0.41% over the original Swin-UNet when the WHDLD and OpenEarthMap dataset are used as the segmentation task.https://ieeexplore.ieee.org/document/10750398/Deep learningsemantic segmentationskip connectiontransformer |
spellingShingle | Zirui Shen Wanjie Liu Sheng Xu DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic Segmentation IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing Deep learning semantic segmentation skip connection transformer |
title | DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic Segmentation |
title_full | DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic Segmentation |
title_fullStr | DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic Segmentation |
title_full_unstemmed | DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic Segmentation |
title_short | DS-SwinUNet: Redesigning Skip Connection With Double Scale Attention for Land Cover Semantic Segmentation |
title_sort | ds swinunet redesigning skip connection with double scale attention for land cover semantic segmentation |
topic | Deep learning semantic segmentation skip connection transformer |
url | https://ieeexplore.ieee.org/document/10750398/ |
work_keys_str_mv | AT ziruishen dsswinunetredesigningskipconnectionwithdoublescaleattentionforlandcoversemanticsegmentation AT wanjieliu dsswinunetredesigningskipconnectionwithdoublescaleattentionforlandcoversemanticsegmentation AT shengxu dsswinunetredesigningskipconnectionwithdoublescaleattentionforlandcoversemanticsegmentation |
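The description field above mentions a double-scale attention mechanism fusing fine-grained local detail with long-range context. The paper's actual DS-transformer block is not reproduced here; as a rough, hypothetical NumPy sketch of the general idea, self-attention can be computed once over all tokens (fine scale) and once over pooled tokens (coarse scale), then fused. All function names and the pooling/fusion choices below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(x):
    # Plain scaled dot-product self-attention; identity Q/K/V
    # projections for brevity. x has shape (tokens, dim).
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    return softmax(scores) @ x

def double_scale_attention(x, pool=2):
    """Hypothetical double-scale attention sketch: fuse fine-scale
    attention over all tokens with coarse-scale attention over
    average-pooled tokens (the coarse path models longer-range
    context at lower resolution)."""
    n, d = x.shape
    fine = attention(x)                          # fine / local scale
    coarse_in = x.reshape(n // pool, pool, d).mean(axis=1)
    coarse = attention(coarse_in)                # coarse / global scale
    coarse_up = np.repeat(coarse, pool, axis=0)  # upsample back to n tokens
    return 0.5 * (fine + coarse_up)              # simple averaging fusion
```

In a skip connection, such a block would transform the encoder feature map before it is concatenated with the decoder features, rather than passing it through unchanged as in a vanilla U-Net.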