Improving Visual Question Answering by Image Captioning

Visual Question Answering (VQA) is a challenging task that bridges the computer vision and natural language processing communities. It provides natural language answers to questions about an associated image. Most existing VQA methods focus on the fusion and inference of visual features with the...

Bibliographic Details
Main Authors:	Xiangjun Shao, Hongsong Dong, Guangsheng Wu
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Deep learning; image captioning; multimodal learning; visual question answering
Online Access:	https://ieeexplore.ieee.org/document/10918635/

Similar Items