Comparing the performance of ChatGPT and ERNIE Bot in answering questions regarding liver cancer interventional radiology in Chinese and English contexts: A comparative study
Main Authors: Xue-ting Yuan, Chen-ye Shao, Zhen-zhen Zhang, Duo Qian
Format: Article
Language: English
Published: SAGE Publishing, 2025-01-01
Series: Digital Health
Online Access: https://doi.org/10.1177/20552076251315511
_version_ | 1832590519028416512 |
author | Xue-ting Yuan; Chen-ye Shao; Zhen-zhen Zhang; Duo Qian
author_facet | Xue-ting Yuan; Chen-ye Shao; Zhen-zhen Zhang; Duo Qian
author_sort | Xue-ting Yuan |
collection | DOAJ |
description | Introduction This study aims to critically assess the appropriateness and limitations of two prominent large language models (LLMs), Enhanced Representation through Knowledge Integration (ERNIE Bot) and Chat Generative Pre-trained Transformer (ChatGPT), in answering questions about liver cancer interventional radiology. Through a comparative analysis, the performance of these models was evaluated based on their responses to questions about transarterial chemoembolization (TACE) and hepatic arterial infusion chemotherapy (HAIC) in both English and Chinese contexts. Methods A total of 38 questions were developed to cover a range of topics related to TACE and HAIC, including foundational knowledge, patient education, and treatment and care. The responses generated by ERNIE Bot and ChatGPT were rigorously evaluated by 10 professionals in liver cancer interventional radiology, and the final score for each response was determined by one seasoned clinical expert. Each response was rated on a five-point Likert scale, facilitating a quantitative analysis of the accuracy and comprehensiveness of the information provided by each language model. Results ERNIE Bot was superior to ChatGPT in the Chinese context (ERNIE Bot: 5, 89.47%; 4, 10.53%; 3, 0%; 2, 0%; 1, 0% vs ChatGPT: 5, 57.89%; 4, 5.27%; 3, 34.21%; 2, 2.63%; 1, 0%; P = 0.001). However, ChatGPT outperformed ERNIE Bot in the English context (ERNIE Bot: 5, 73.68%; 4, 2.63%; 3, 13.16%; 2, 10.53%; 1, 0% vs ChatGPT: 5, 92.11%; 4, 2.63%; 3, 5.26%; 2, 0%; 1, 0%; P = 0.026). Conclusions This study preliminarily demonstrated that ERNIE Bot and ChatGPT effectively address questions related to liver cancer interventional radiology. However, their performance varied by language: ChatGPT excelled in English contexts, while ERNIE Bot performed better in Chinese. We found that choosing the appropriate LLM can help patients obtain more accurate treatment information. In practical use, the output of both models requires manual review to ensure accuracy and reliability. |
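The Results above report per-model Likert-score distributions over the 38 questions together with P values, but the abstract does not name the statistical test used. The sketch below is illustrative only: it reconstructs the score vectors from the reported percentages (count = percentage × 38, rounded) and assumes a two-sided Mann-Whitney U test. It is not the authors' analysis code, and its p-values need not match those published.

```python
# Illustrative sketch only (assumption): rebuild the 38 Likert scores per model and
# language from the percentages reported in the abstract, then compare the two models
# with a two-sided Mann-Whitney U test. The test actually used in the study is unstated.
from scipy.stats import mannwhitneyu

ernie_zh   = [5] * 34 + [4] * 4                       # 89.47% fives, 10.53% fours
chatgpt_zh = [5] * 22 + [4] * 2 + [3] * 13 + [2] * 1  # 57.89%, 5.27%, 34.21%, 2.63%
ernie_en   = [5] * 28 + [4] * 1 + [3] * 5 + [2] * 4   # 73.68%, 2.63%, 13.16%, 10.53%
chatgpt_en = [5] * 35 + [4] * 1 + [3] * 2             # 92.11%, 2.63%, 5.26%

for label, a, b in [("Chinese context", ernie_zh, chatgpt_zh),
                    ("English context", ernie_en, chatgpt_en)]:
    stat, p = mannwhitneyu(a, b, alternative="two-sided")
    print(f"{label}: U = {stat:.1f}, p = {p:.4f}")
```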
format | Article |
id | doaj-art-1606368636b649d28b10a3283d00d953 |
institution | Kabale University |
issn | 2055-2076 |
language | English |
publishDate | 2025-01-01 |
publisher | SAGE Publishing |
record_format | Article |
series | Digital Health |
spelling | doaj-art-1606368636b649d28b10a3283d00d953; 2025-01-23T13:03:45Z; eng; SAGE Publishing; Digital Health; 2055-2076; 2025-01-01; 11; 10.1177/20552076251315511; Comparing the performance of ChatGPT and ERNIE Bot in answering questions regarding liver cancer interventional radiology in Chinese and English contexts: A comparative study; Xue-ting Yuan (Department of Interventional Radiology, Suzhou, China); Chen-ye Shao (School of Nursing, The First Affiliated Hospital of Soochow University, Suzhou, China); Zhen-zhen Zhang (Department of Interventional Radiology, Suzhou, China); Duo Qian (Department of Interventional Radiology, Suzhou, China); https://doi.org/10.1177/20552076251315511 |
spellingShingle | Xue-ting Yuan; Chen-ye Shao; Zhen-zhen Zhang; Duo Qian; Comparing the performance of ChatGPT and ERNIE Bot in answering questions regarding liver cancer interventional radiology in Chinese and English contexts: A comparative study; Digital Health
title | Comparing the performance of ChatGPT and ERNIE Bot in answering questions regarding liver cancer interventional radiology in Chinese and English contexts: A comparative study |
title_full | Comparing the performance of ChatGPT and ERNIE Bot in answering questions regarding liver cancer interventional radiology in Chinese and English contexts: A comparative study |
title_fullStr | Comparing the performance of ChatGPT and ERNIE Bot in answering questions regarding liver cancer interventional radiology in Chinese and English contexts: A comparative study |
title_full_unstemmed | Comparing the performance of ChatGPT and ERNIE Bot in answering questions regarding liver cancer interventional radiology in Chinese and English contexts: A comparative study |
title_short | Comparing the performance of ChatGPT and ERNIE Bot in answering questions regarding liver cancer interventional radiology in Chinese and English contexts: A comparative study |
title_sort | comparing the performance of chatgpt and ernie bot in answering questions regarding liver cancer interventional radiology in chinese and english contexts a comparative study |
url | https://doi.org/10.1177/20552076251315511 |
work_keys_str_mv | AT xuetingyuan comparingtheperformanceofchatgptanderniebotinansweringquestionsregardinglivercancerinterventionalradiologyinchineseandenglishcontextsacomparativestudy AT chenyeshao comparingtheperformanceofchatgptanderniebotinansweringquestionsregardinglivercancerinterventionalradiologyinchineseandenglishcontextsacomparativestudy AT zhenzhenzhang comparingtheperformanceofchatgptanderniebotinansweringquestionsregardinglivercancerinterventionalradiologyinchineseandenglishcontextsacomparativestudy AT duoqian comparingtheperformanceofchatgptanderniebotinansweringquestionsregardinglivercancerinterventionalradiologyinchineseandenglishcontextsacomparativestudy |