IMViT: Adjacency Matrix-Based Lightweight Plain Vision Transformer

Transformers are becoming dominant deep learning backbones for both computer vision and natural language processing. While extensive experiments prove their outstanding ability in large models, small transformers are not comparable with convolutional neural networks in various downstream t...

Bibliographic Details
Main Authors: Qihao Chen, Yunfeng Yan, Xianbo Wang, Jishen Peng
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10849548/