BVQA: Connecting Language and Vision Through Multimodal Attention for Open-Ended Question Answering

Visual Question Answering (VQA) is a challenging problem in Artificial Intelligence (AI) that requires an understanding of both natural language and computer vision to answer questions about the visual content of images. Research on VQA has gained immense traction due to its wide range of applicat...

Bibliographic Details
Main Authors:	Md. Shalha Mucha Bhuyan, Eftekhar Hossain, Khaleda Akhter Sathi, Md. Azad Hossain, M. Ali Akber Dewan
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Visual question answering; multimodal deep learning; large language model; natural language processing; multi-head attention mechanism
Online Access:	https://ieeexplore.ieee.org/document/10878995/

Similar Items

Multimodal representative answer extraction in community question answering
by: Ming Li, et al.
Published: (2023-10-01)

Enhancing Visual Question Answering for Multiple Choice Questions
by: Rashi Goel, et al.
Published: (2025-01-01)

Designing and Evaluating a Dual-Stream Transformer-Based Architecture for Visual Question Answering
by: Faheem Shehzad, et al.
Published: (2024-01-01)

Cross-Encoder-Based Semantic Evaluation of Extractive and Generative Question Answering in Low-Resourced African Languages
by: Funebi Francis Ijebu, et al.
Published: (2025-03-01)

ReceiptQA: A Question-Answering Dataset for Receipt Understanding
by: Mahmoud Abdalla, et al.
Published: (2025-05-01)

Medical Knowledge-Based Differential Image Visual Question Answering
by: Fangpeng Lu, et al.
Published: (2025-01-01)

MOODLE IN LANGUAGE TEACHING AND TESTING. THE EMBEDDED ANSWERS QUESTION TYPE
by: Ioana-Claudia Horea
Published: (2025-03-01)

Adaptive Conditional Reasoning for Remote Sensing Visual Question Answering
by: Yiqun Gao, et al.
Published: (2025-04-01)

Improving Visual Question Answering by Image Captioning
by: Xiangjun Shao, et al.
Published: (2025-01-01)

SHIFA: SBERT-Based Healthcare Information Focused Arabic Question Answering
by: Rahaf Alruwaithi, et al.
Published: (2025-01-01)

Assessing the performance of zero-shot visual question answering in multimodal large language models for 12-lead ECG image interpretation
by: Tomohisa Seki, et al.
Published: (2025-02-01)

Assessing the quality of automatic-generated short answers using GPT-4
by: Luiz Rodrigues, et al.
Published: (2024-12-01)

A large language model for multimodal identification of crop diseases and pests
by: Yiqun Wang, et al.
Published: (2025-07-01)

MusiQAl: A Dataset for Music Question–Answering through Audio–Video Fusion
by: Anna-Maria Christodoulou, et al.
Published: (2025-07-01)

An Empirical Evaluation of Large Language Models on Consumer Health Questions
by: Moaiz Abrar, et al.
Published: (2025-02-01)

Intelligent accounting question-answering robot based on a large language model and knowledge graph
by: Shi Shengyun, et al.
Published: (2025-04-01)

Envisioning Answers: Unleashing Deep Learning for Visual Question Answering in Artistic Images
by: Erfan Zolghadriha, et al.
Published: (2024-03-01)

ZPVQA: Visual Question Answering of Images Based on Zero-Shot Prompt Learning
by: Naihao Hu, et al.
Published: (2025-01-01)

A question-answering framework for geospatial data retrieval enhanced by a knowledge graph and large language models
by: Hao Li, et al.
Published: (2025-08-01)

An Image Grid Can Be Worth a Video: Zero-Shot Video Question Answering Using a VLM
by: Wonkyun Kim, et al.
Published: (2024-01-01)

The role of answer content and length when preparing answers to questions
by: Ruth Elizabeth Corps, et al.
Published: (2024-07-01)

Elimination-based reasoning with LLM for multiple-choice educational question answering
by: Qianli Zhao, et al.
Published: (2025-08-01)

Deep Memory Fusion Model for Long Video Question Answering
by: SUN Guanglu, et al.
Published: (2021-02-01)

Visual Question Answering in Robotic Surgery: A Comprehensive Review
by: Di Ding, et al.
Published: (2025-01-01)

Knowledge Graphs as a source of trust for LLM-powered enterprise question answering
by: Juan Sequeda, et al.
Published: (2025-05-01)

Enhancing pre-trained language model by answering natural questions for event extraction
by: Yuxin Zhang, et al.
Published: (2025-04-01)

Knowledge injection methods in question answering
by: D. V. Radyush
Published: (2025-06-01)

Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion
by: Junkai Zhang, et al.
Published: (2025-04-01)

A Region-based Approach to the Automated Marking of Short Textual Answers
by: Raheel Siddiqi
Published: (2011-12-01)

Adapting an English Corpus and a Question Answering System for Slovene
by: Uroš Šmajdek, et al.
Published: (2023-09-01)

DRKG: Faithful and Interpretable Multi-Hop Knowledge Graph Question Answering via LLM-Guided Reasoning Plans
by: Yan Chen, et al.
Published: (2025-06-01)

Generative Models for Multiple-Choice Question Answering in Portuguese: A Monolingual and Multilingual Experimental Study
by: Guilherme Dallmann Lima, et al.
Published: (2025-05-01)

VQABG: Vietnamese question/answers benchmark generator for field-specific chatbot ground-truth dataset using EMINI (Exact Match wIth Numeric Information) indicator
by: Anh-Khoa NGO-HO, et al.
Published: (2024-10-01)

Rhetorical questions as aggressive, friendly or sarcastic/ironical questions with imposed answers
by: Džemal Špago
Published: (2025-01-01)

Design of agricultural question answering information extraction method based on improved BILSTM algorithm
by: Ruipeng Tang, et al.
Published: (2024-10-01)

A lightweight knowledge graph-driven question answering system for field-based mineral resource survey
by: Mingguo Wang, et al.
Published: (2025-09-01)

Evaluating large language models as graders of medical short answer questions: a comparative analysis with expert human graders
by: Olena Bolgova, et al.
Published: (2025-12-01)

Methods of Asking and Answering Questions in Jadal Works Written by Fiqh Scholars
by: Abdurrahim Bilik
Published: (2021-10-01)

Automatic question-answering modeling in English by integrating TF-IDF and segmentation algorithms
by: Hainan Wang
Published: (2024-12-01)