Efficient Attention Transformer Network With Self-Similarity Feature Enhancement for Hyperspectral Image Classification

Bibliographic Details
Main Authors: Yuyang Wang, Zhenqiu Shu, Zhengtao Yu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/10964176/
Description
Summary: Recently, transformers have gained widespread application in hyperspectral image classification (HSIC) tasks due to their powerful global modeling ability. However, the inherent high-dimensional nature of hyperspectral images (HSIs) leads to a sharp increase in the number of parameters and expensive computational costs. Moreover, self-attention operations in transformer-based HSIC methods may introduce irrelevant spectral–spatial information and consequently degrade classification performance. To mitigate these issues, in this article we introduce an efficient deep network, named the efficient attention transformer network (EATN), for practical HSIC tasks. Specifically, we propose two self-similarity descriptors based on the original HSI patch to enhance spatial feature representations. The center self-similarity descriptor emphasizes pixels similar to the central pixel. In contrast, the neighborhood self-similarity descriptor explores the similarity relationship between each pixel and its neighboring pixels within the patch. We then embed these two self-similarity descriptors into the original patch for subsequent feature extraction and classification. Furthermore, we design two efficient feature extraction modules based on the preprocessed patches, called the spectral interactive transformer module and the spatial conv-attention module, to reduce the computational costs of the classification framework. Extensive experiments on four benchmark datasets show that our proposed EATN method outperforms other state-of-the-art HSI classification approaches.
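The center self-similarity descriptor described above can be illustrated with a minimal NumPy sketch. This is an assumption-based illustration, not the paper's exact formulation: it uses cosine similarity as the spectral similarity measure (the actual metric and normalization in EATN may differ), applied to a patch of shape (H, W, B) with B spectral bands.

```python
import numpy as np

def center_self_similarity(patch):
    """Hypothetical center self-similarity map for an (H, W, B) HSI patch.

    Computes the cosine similarity between each pixel's spectral vector
    and the spectrum of the central pixel, yielding an (H, W) map that
    emphasizes pixels spectrally similar to the center.
    """
    H, W, B = patch.shape
    center = patch[H // 2, W // 2]               # (B,) spectrum of the central pixel
    flat = patch.reshape(-1, B)                  # (H*W, B) all pixel spectra
    dots = flat @ center                         # dot product with the center spectrum
    norms = np.linalg.norm(flat, axis=1) * np.linalg.norm(center) + 1e-12
    return (dots / norms).reshape(H, W)          # cosine similarity in [-1, 1]

# Toy 7x7 patch with 30 spectral bands
patch = np.random.rand(7, 7, 30)
sim = center_self_similarity(patch)
```

The resulting map can then be stacked with the original patch as an extra channel before feature extraction, which is one plausible reading of "embedding the descriptor into the original patch"; the neighborhood descriptor would analogously compare each pixel with its local window rather than the single center pixel.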
ISSN:1939-1404
2151-1535