Harnessing GPT-4 for automated error detection in pathology reports: Implications for oncology diagnostics

Bibliographic Details
Main Authors: Xiongwen Yang, Yun Zhang, Jinyan Jiang, Zhijun Chen, Rinasu Bai, Zihao Yuan, Longyan Dong, Yi Xiao, Di Liu, Huiyin Deng, Jian Huang, Huiyou Shi, Dan Liu, Maoli Liang, WeiJuan Tang, Chuan Xu
Format: Article
Language: English
Published: SAGE Publishing 2025-05-01
Series: Digital Health
Online Access: https://doi.org/10.1177/20552076251346703
Description
Summary:
Objective: Accurate pathology reports are crucial for the diagnosis and treatment planning of cancer patients. However, these reports are prone to errors due to time pressures, subjective interpretation, and inconsistencies among professionals. Addressing these errors is vital for improving oncology care outcomes. Artificial intelligence (AI) systems, such as GPT-4, offer the potential to enhance diagnostic accuracy and efficiency.
Methods: A total of 700 malignant tumor pathology reports were collected from four hospitals. Of these, 350 reports had deliberate errors introduced by a senior pathologist, mimicking real-world reporting challenges. Error detection performance was evaluated by comparing GPT-4 to six human pathologists (two seniors, two attending pathologists, and two residents). Key metrics included error detection rates with Wilson confidence intervals and processing time per report.
Results: GPT-4 detected 88% of errors (350/400; 95% CI: [84, 91]), compared to a 95% detection rate by the top senior pathologist (382/400; 95% CI: [93, 97]). GPT-4 significantly reduced the average processing time to 4.03 seconds per report, compared to 65.64 seconds for the fastest human pathologist. However, GPT-4 exhibited a higher rate of false positives (2.3%; 95% CI: [1.52, 3.01]) compared to the best-performing senior pathologist (0.3%; 95% CI: [0.01, 0.91]).
Conclusions: GPT-4 demonstrates substantial potential in improving the efficiency and accuracy of pathology error detection, which could accelerate clinical workflows and enhance cancer diagnostics. However, its higher false-positive rate emphasizes the need for human oversight to ensure safe implementation in clinical practice.
ISSN: 2055-2076
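Note: The summary reports detection rates with Wilson confidence intervals. As a rough illustration of how such an interval can be computed from a count of detected errors, here is a minimal Python sketch of the standard Wilson score interval (no continuity correction). The function name `wilson_interval` and the default `z` value are assumptions made for this example; the record does not state which software or variant the authors used, so published bounds may differ slightly from this sketch.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Hypothetical helper for illustration only; z = 1.959964 corresponds
    to a two-sided 95% interval. No continuity correction is applied.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")

    p_hat = successes / n
    z2 = z * z
    denom = 1.0 + z2 / n
    # Centre of the interval is pulled from p_hat toward 0.5.
    centre = (p_hat + z2 / (2.0 * n)) / denom
    # Half-width combines the binomial variance with a z-dependent term.
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z2 / (4.0 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Usage: the top senior pathologist's detection count from the summary.
low, high = wilson_interval(382, 400)
print(f"95% CI: [{low:.0%}, {high:.0%}]")
```

Applied to the 382/400 detection count, this sketch gives bounds that round to the [93, 97] interval quoted in the summary.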