Improving Visual Question Answering by Image Captioning

Visual Question Answering (VQA) is a challenging task that bridges the computer vision and natural language processing communities. It provides natural language answers to questions related to an associated image. Most existing VQA methods focus on the fusion and inference of visual features with the...

Bibliographic Details
Main Authors: Xiangjun Shao, Hongsong Dong, Guangsheng Wu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10918635/