Efficient knowledge distillation and alignment for improved KB-VQA

Abstract: Knowledge-based visual question answering (KB-VQA) often requires utilizing external knowledge to answer natural language questions about image content. Recent research has emphasized the importance of knowledge in answering questions by implicitly leveraging Large Language Models (LLMs). H...

Bibliographic Details
Main Authors: Xiaofei Qin, Ruiqi Pei, Changxiang He, Fan Li, Xuedian Zhang
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-07539-9

Similar Items