Comparative Analysis of ChatGPT and Human Expertise in Diagnosing Primary Liver Carcinoma: A Focus on Gross Morphology
Objective: This study aims to compare the diagnostic accuracy of customized ChatGPT and human experts in identifying primary liver carcinoma using gross morphology. Materials and Methods: Gross morphology images of hepatocellular carcinoma (HCC) and cholangiocarcinoma (CCA) cases were assessed. T...
Main Authors: | Prakasit Sa-ngiamwibool, Thiyaphat Laohawetwanit |
---|---|
Format: | Article |
Language: | English |
Published: | Faculty of Medicine Siriraj Hospital, 2025-02-01 |
Series: | Siriraj Medical Journal |
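Subjects: | Artificial intelligence; ChatGPT; GPT-4; Liver cancer; Hepatocellular carcinoma; Cholangiocarcinoma |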
Online Access: | https://he02.tci-thaijo.org/index.php/sirirajmedj/article/view/271596 |
_version_ | 1832545213269147648 |
---|---|
author | Prakasit Sa-ngiamwibool; Thiyaphat Laohawetwanit |
author_facet | Prakasit Sa-ngiamwibool; Thiyaphat Laohawetwanit |
author_sort | Prakasit Sa-ngiamwibool |
collection | DOAJ |
description |
Objective: This study aims to compare the diagnostic accuracy of customized ChatGPT and human experts in identifying primary liver carcinoma using gross morphology.
Materials and Methods: Gross morphology images of hepatocellular carcinoma (HCC) and cholangiocarcinoma (CCA) cases were assessed. These images were analyzed by two versions of customized ChatGPT (i.e., with and without a scoring system), pathology residents, and pathologist assistants. The diagnostic accuracy and consistency of each participant group were evaluated.
Results: A total of 128 liver carcinoma images (62 HCC, 66 CCA) were analyzed, with the participation of 13 pathology residents (median experience of 1.5 years) and three pathologist assistants (median experience of 5 years). When augmented with a scoring system, ChatGPT's performance aligned closely with that of first- and second-year pathology residents and was significantly inferior to that of third-year pathology residents and pathologist assistants (p-values < 0.01). In contrast, the diagnostic accuracy of ChatGPT operating without the scoring system was significantly lower than that of all human participants (p-values < 0.01). Kappa statistics indicated that diagnostic consistency was slight to fair for both customized versions of ChatGPT and for the pathology residents. Interobserver agreement among the pathologist assistants was moderate.
Conclusion: The study highlights the potential of ChatGPT for augmenting diagnostic processes in pathology. However, it also emphasizes the current limitations of this AI tool compared to human expertise, particularly among experienced participants. This suggests the importance of integrating AI with human judgment in diagnostic pathology.
|
format | Article |
id | doaj-art-114bf4415b4d48d88257430e826016b4 |
institution | Kabale University |
issn | 2228-8082 |
language | English |
publishDate | 2025-02-01 |
publisher | Faculty of Medicine Siriraj Hospital |
record_format | Article |
series | Siriraj Medical Journal |
spelling | doaj-art-114bf4415b4d48d88257430e826016b4; 2025-02-03T07:37:10Z; eng; Faculty of Medicine Siriraj Hospital; Siriraj Medical Journal; ISSN 2228-8082; 2025-02-01; vol. 77, no. 2; DOI 10.33192/smj.v77i2.271596; Comparative Analysis of ChatGPT and Human Expertise in Diagnosing Primary Liver Carcinoma: A Focus on Gross Morphology; Prakasit Sa-ngiamwibool (0); Thiyaphat Laohawetwanit (1); (0) Department of Pathology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand; (1) Division of Pathology, Chulabhorn International College of Medicine, Thammasat University, Pathum Thani, Thailand; https://he02.tci-thaijo.org/index.php/sirirajmedj/article/view/271596; Artificial intelligence; ChatGPT; GPT-4; Liver cancer; Hepatocellular carcinoma; Cholangiocarcinoma |
spellingShingle | Prakasit Sa-ngiamwibool; Thiyaphat Laohawetwanit; Comparative Analysis of ChatGPT and Human Expertise in Diagnosing Primary Liver Carcinoma: A Focus on Gross Morphology; Siriraj Medical Journal; Artificial intelligence; ChatGPT; GPT-4; Liver cancer; Hepatocellular carcinoma; Cholangiocarcinoma |
title | Comparative Analysis of ChatGPT and Human Expertise in Diagnosing Primary Liver Carcinoma: A Focus on Gross Morphology |
title_sort | comparative analysis of chatgpt and human expertise in diagnosing primary liver carcinoma a focus on gross morphology |
topic | Artificial intelligence; ChatGPT; GPT-4; Liver cancer; Hepatocellular carcinoma; Cholangiocarcinoma |
url | https://he02.tci-thaijo.org/index.php/sirirajmedj/article/view/271596 |
work_keys_str_mv | AT prakasitsangiamwibool comparativeanalysisofchatgptandhumanexpertiseindiagnosingprimarylivercarcinomaafocusongrossmorphology AT thiyaphatlaohawetwanit comparativeanalysisofchatgptandhumanexpertiseindiagnosingprimarylivercarcinomaafocusongrossmorphology |
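
The description in this record reports diagnostic consistency as kappa statistics, using the labels "slight", "fair", and "moderate". As a rough illustration of how such a figure can be obtained, the sketch below computes Cohen's kappa for two raters and maps it to the commonly used Landis and Koch bands. The record does not say which kappa variant the authors used, so the choice of Cohen's kappa, the function names, and the sample ratings are all assumptions made for illustration, not the study's method or data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: two raters, nominal labels (e.g. "HCC" / "CCA").
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items where both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def landis_koch_band(kappa):
    """Map a kappa value to the Landis and Koch descriptive bands."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical ratings of four images, invented for this example only.
    reader_1 = ["HCC", "HCC", "CCA", "CCA"]
    reader_2 = ["HCC", "CCA", "CCA", "CCA"]
    k = cohen_kappa(reader_1, reader_2)
    print(f"kappa = {k:.2f} ({landis_koch_band(k)})")
```

With more than two raters, as with the three pathologist assistants mentioned in the description, a multi-rater statistic such as Fleiss' kappa would be the usual choice; the two-rater version is shown here only because it is the simplest to follow.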