IMViT: Adjacency Matrix-Based Lightweight Plain Vision Transformer

Transformers are becoming the dominant deep learning backbones for both computer vision and natural language processing. While extensive experiments demonstrate their outstanding ability as large models, small transformers are not comparable with convolutional neural networks in various downstream t...

Bibliographic Details
Main Authors:	Qihao Chen, Yunfeng Yan, Xianbo Wang, Jishen Peng
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Image classification; non-hierarchical vision transformer; mask self-attention
Online Access:	https://ieeexplore.ieee.org/document/10849548/

Similar Items