SumGPT: A Multimodal Framework for Radiology Report Summarization to Improve Clinical Performance
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10836737/ |
Summary: | Radiology report summarization plays a critical role in medical imaging, addressing the growing need for concise and accessible interpretation of complex radiology findings. However, existing models often fail to fully leverage multimodal data integration. In this study, we propose a novel model, SumGPT, which integrates T5 with a Vision Transformer (ViT) for radiology report summarization. The dataset comprises 1,952 radiology images with detailed textual reports for training and 488 images with reports for testing. SumGPT was evaluated against several baseline models, including BERT + EfficientNet, XLM-RoBERTa + ViT, T5 + CLIP, and VisualGPT (GPT-2 + ViT), using a dataset explicitly designed for this task. The experimental results indicate that SumGPT outperformed all baseline models on every metric, attaining a ROUGE-1 score of 0.8514, ROUGE-2 of 0.8471, ROUGE-L of 0.8514, and a BLEU score of 0.8470. These results suggest that SumGPT produces clear and accurate summaries of radiology reports, and that combining a ViT with a language model improves its ability to capture detailed information. The study also reports that SumGPT performs well across different types of reports and could be beneficial in other areas, such as pathology and cardiology. Future work could extend the approach to other medical domains and further optimize the model for real-time clinical use. |
---|---|
ISSN: | 2169-3536 |
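
The ROUGE-1 and ROUGE-2 scores quoted in the summary measure n-gram overlap between a generated summary and a reference summary. The following is a minimal sketch of a ROUGE-N F1 computation, not the evaluation code used in the article: the record does not state which ROUGE implementation or tokenization the authors used, so the function name `rouge_n_f1`, the whitespace tokenization, and the sample sentences are all assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """ROUGE-N F1 between two strings.

    Assumption: lowercased whitespace tokenization, no stemming.
    Published scores typically come from a dedicated ROUGE package
    whose preprocessing may differ from this sketch.
    """
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, not taken from the article's dataset.
reference = "no acute cardiopulmonary abnormality"
candidate = "no acute abnormality"
rouge1 = rouge_n_f1(reference, candidate, n=1)
rouge2 = rouge_n_f1(reference, candidate, n=2)
```

Here the candidate shares three of the reference's four unigrams but only one of its three bigrams, so the ROUGE-2 value is lower than the ROUGE-1 value for the same pair.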