BinaryViT: Binary Vision Transformer for Hyperspectral Image Classification
| Main Authors: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11072278/ |
| Summary: | Vision transformers have demonstrated remarkable performance in hyperspectral image classification. However, their complex computational mechanisms and heavy parameterization severely restrict deployment on resource-constrained platforms such as FPGAs and embedded CPUs. As a key technique for lightweight deep models, binary quantization achieves substantial parameter compression and computational acceleration by binarizing activations and weights. In transformers, however, binary quantization faces challenges such as degraded feature representation once the self-attention mechanism is binarized and reduced efficiency in fusing multiscale spectral–spatial information, so progress in this area has lagged. To address these issues, this study proposes a novel binary vision transformer architecture tailored to hyperspectral image classification. Built upon the standard Transformer, the approach introduces a self-adaptive softmax binarization module that dynamically adjusts the binarization threshold distribution, effectively mitigating discretization errors in gradient propagation during binarization. In addition, a multibranch average pooling block is designed to hierarchically aggregate features across different spectral dimensions, significantly enhancing the model’s ability to represent complex spectral–spatial correlations in hyperspectral data. The proposed method binarizes over 99% of the parameters and reduces floating-point computation by more than 89% relative to full-precision counterparts, while maintaining competitive classification accuracy. Experiments on seven benchmark hyperspectral datasets demonstrate the effectiveness of the approach in balancing computational efficiency and classification performance. |
| ISSN: | 1939-1404, 2151-1535 |
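The parameter compression described in the summary rests on binary quantization of weights. The paper's own self-adaptive softmax binarization module is not detailed in this record, but the general idea can be illustrated with a minimal, generic sketch of scaled weight binarization (XNOR-Net style): each weight is replaced by its sign, scaled by the mean absolute value so overall magnitude is roughly preserved. The function name and example values below are illustrative, not from the paper.

```python
def binarize(weights):
    """Generic scaled binary quantization of a weight vector.

    Each weight becomes +scale or -scale, where scale is the mean
    absolute value of the vector. A binarized weight thus needs only
    1 bit plus one shared scale factor, which is the source of the
    large parameter compression cited in the abstract.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    return [scale if w >= 0 else -scale for w in weights]


# Illustrative values (not from the paper):
w = [0.3, -0.7, 1.2, -0.1]
wb = binarize(w)  # every entry has magnitude mean(|w|) = 0.575
```

During training, such quantizers are typically paired with a straight-through estimator so that gradients can pass through the non-differentiable sign operation; the paper's contribution, per the summary, is adapting the binarization thresholds to reduce exactly this discretization error.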