SCAT: Shift Channel Attention Transformer for Remote Sensing Image Super-Resolution

Bibliographic Details
Main Authors: Yingdong Kang, Xuemin Zhang, Shaoju Wang, Guang Jin
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/10886926/
Description
Summary: The quadratic increase in computational complexity caused by global receptive fields has been a persistent challenge when applying Transformer-based methods to remote sensing image super-resolution (RSISR), which involves high-resolution images. Channel attention (CA)-based Transformers offer an efficient approach with linear complexity by computing self-attention across the channel dimension. However, current CA-based Transformers suffer from performance degradation due to two main issues: a constrained receptive field in the channel dimension caused by the multihead strategy, and insufficient feature diversity resulting from the small size of the attention matrices. To address these drawbacks, a novel shift channel attention Transformer (SCAT) is proposed for RSISR in this article. The core innovation of SCAT lies in its shift channel attention block (SCAB), which expands the receptive field by facilitating cross-head communication through a shift channel strategy. This design enables parallel computation of self-attention across multiple heads while ensuring robust cross-channel connections, thereby enhancing the global context modeling capabilities of the network. In addition, an attention supplementation module using depthwise convolution (DWC) is incorporated into SCAB to improve feature diversity. Finally, the proposed gated feedforward network uses a gating mechanism and DWC to control the information flow and extract complementary spatial details. In the experiments, the effectiveness of the proposed modules is verified, and the SCAT model demonstrates superior quantitative and qualitative performance compared with several state-of-the-art RSISR methods.
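
Illustrative sketch (not taken from the article): the summary's two central ideas, self-attention computed across channels rather than pixels, and a channel shift that lets information cross head boundaries, can be pictured with the toy Python below. Everything in it is an assumption made for illustration: the function names, the use of the raw features as queries, keys, and values with no learned projections, and the cyclic-roll form of the shift. The record does not describe how SCAB actually implements its shift channel strategy, so this should be read as one plausible reading, not as the authors' method.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [v / total for v in exps]


def channel_attention(x, num_heads):
    """Toy multihead channel attention (hypothetical helper).

    x is a list of C channels, each a flat list of N pixel values.
    Channels are split into num_heads contiguous groups. Inside a group,
    the attention matrix is (C/heads) x (C/heads), so its size does not
    depend on N and the cost grows linearly with the number of pixels.
    Assumption: queries, keys, and values are the raw features.
    """
    per_head = len(x) // num_heads
    n = len(x[0])
    out = []
    for h in range(num_heads):
        head = x[h * per_head:(h + 1) * per_head]
        # Channel-to-channel similarity scores within this head only.
        scores = [
            [sum(a * b for a, b in zip(q, k)) / math.sqrt(n) for k in head]
            for q in head
        ]
        for weights in (softmax(r) for r in scores):
            # Each output channel is a weighted mix of the head's channels.
            out.append(
                [sum(w * ch[p] for w, ch in zip(weights, head)) for p in range(n)]
            )
    return out


def shift_channels(x, shift):
    """Cyclically roll the channel list; a negative shift rolls back."""
    return x[shift:] + x[:shift]


def shifted_channel_attention(x, num_heads, shift):
    """Roll channels, attend, then undo the roll (assumed shift scheme).

    After the roll, each head groups channels that sat in different heads
    before, so channels on either side of a head boundary can interact.
    """
    rolled = shift_channels(x, shift)
    return shift_channels(channel_attention(rolled, num_heads), -shift)
```

With four channels and two heads, the unshifted call groups channels as (0, 1) and (2, 3), so channel 1 never sees channel 2. A shift of 1 regroups them as (1, 2) and (3, 0), which is the kind of cross-head connection the summary attributes to the shift channel strategy. Alternating shifted and unshifted layers, as windowed spatial Transformers do, would be one way to combine the two groupings, though the record does not say whether SCAT does this.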
ISSN: 1939-1404, 2151-1535