ChestX-Transcribe: a multimodal transformer for automated radiology report generation from chest x-rays

Radiology departments are under increasing pressure to meet the demand for timely and accurate diagnostics, especially with chest x-rays, a key modality for pulmonary condition assessment. Producing comprehensive and accurate radiological reports is a time-consuming process prone to errors, particul...

Bibliographic Details
Main Authors: Prateek Singh, Sudhakar Singh
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Digital Health
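Subjects: medical report generation; multimodal transformers; swin transformer; DistilGPT; vision-language models; radiology workflow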
Online Access: https://www.frontiersin.org/articles/10.3389/fdgth.2025.1535168/full
author Prateek Singh
Sudhakar Singh
collection DOAJ
description Radiology departments are under increasing pressure to meet the demand for timely and accurate diagnostics, especially with chest x-rays, a key modality for pulmonary condition assessment. Producing comprehensive and accurate radiological reports is a time-consuming process prone to errors, particularly in high-volume clinical environments. Automated report generation plays a crucial role in alleviating radiologists' workload, improving diagnostic accuracy, and ensuring consistency. This paper introduces ChestX-Transcribe, a multimodal transformer model that combines the Swin Transformer for extracting high-resolution visual features with DistilGPT for generating clinically relevant, semantically rich medical reports. Trained on the Indiana University Chest x-ray dataset, ChestX-Transcribe demonstrates state-of-the-art performance across BLEU, ROUGE, and METEOR metrics, outperforming prior models in producing clinically meaningful reports. However, the reliance on the Indiana University dataset introduces potential limitations, including selection bias, as the dataset is collected from specific hospitals within the Indiana Network for Patient Care. This may result in underrepresentation of certain demographics or conditions not prevalent in those healthcare settings, potentially skewing model predictions when applied to more diverse populations or different clinical environments. Additionally, the ethical implications of handling sensitive medical data, including patient privacy and data security, are considered. Despite these challenges, ChestX-Transcribe shows promising potential for enhancing real-world radiology workflows by automating the creation of medical reports, reducing diagnostic errors, and improving efficiency. The findings highlight the transformative potential of multimodal transformers in healthcare, with future work focusing on improving model generalizability and optimizing clinical integration.
format Article
id doaj-art-5eef14aa6c6046f7afb78853757cc7a2
institution Kabale University
issn 2673-253X
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Digital Health
spelling doaj-art-5eef14aa6c6046f7afb78853757cc7a2; 2025-01-21T08:36:48Z; eng; Frontiers Media S.A.; Frontiers in Digital Health; 2673-253X; 2025-01-01; 7; 10.3389/fdgth.2025.1535168; 1535168; ChestX-Transcribe: a multimodal transformer for automated radiology report generation from chest x-rays; Prateek Singh; Sudhakar Singh; https://www.frontiersin.org/articles/10.3389/fdgth.2025.1535168/full
title ChestX-Transcribe: a multimodal transformer for automated radiology report generation from chest x-rays
title_sort chestx transcribe a multimodal transformer for automated radiology report generation from chest x rays
topic medical report generation
multimodal transformers
swin transformer
DistilGPT
vision-language models
radiology workflow
url https://www.frontiersin.org/articles/10.3389/fdgth.2025.1535168/full