Designing and Evaluating a Dual-Stream Transformer-Based Architecture for Visual Question Answering

In the realm of Visual Question Answering, accurate answers often hinge on the harmonious fusion of textual and visual elements. While these complex architectures are effective, they typically come with a hefty price tag: a large number of parameters that demand significant processing power and leng...

Bibliographic Details
Main Authors: Faheem Shehzad, Aniello Minutolo, Massimo Esposito
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10811881/