Comparing the performance of ChatGPT and ERNIE Bot in answering questions regarding liver cancer interventional radiology in Chinese and English contexts: A comparative study

Introduction This study aims to critically assess the appropriateness and limitations of two prominent large language models (LLMs), enhanced representation through knowledge integration (ERNIE Bot) and chat generative pre-trained transformer (ChatGPT), in answering questions about liver cancer inte...

Bibliographic Details
Main Authors: Xue-ting Yuan, Chen-ye Shao, Zhen-zhen Zhang, Duo Qian
Format: Article
Language: English
Published: SAGE Publishing 2025-01-01
Series: Digital Health
Online Access: https://doi.org/10.1177/20552076251315511
author Xue-ting Yuan
Chen-ye Shao
Zhen-zhen Zhang
Duo Qian
collection DOAJ
description Introduction: This study critically assesses the appropriateness and limitations of two prominent large language models (LLMs), ERNIE Bot (Enhanced Representation through Knowledge Integration) and ChatGPT (Chat Generative Pre-trained Transformer), in answering questions about liver cancer interventional radiology. The two models are compared on their responses to questions about transarterial chemoembolization (TACE) and hepatic arterial infusion chemotherapy (HAIC) in both English and Chinese contexts.
Methods: A total of 38 questions were developed to cover a range of topics related to TACE and HAIC, including foundational knowledge, patient education, and treatment and care. The responses generated by ERNIE Bot and ChatGPT were evaluated by 10 professionals in liver cancer interventional radiology, and the final score was determined by one seasoned clinical expert. Each response was rated on a five-point Likert scale, allowing a quantitative analysis of the accuracy and comprehensiveness of the information provided by each model.
Results: ERNIE Bot was superior to ChatGPT in the Chinese context (ERNIE Bot: 5, 89.47%; 4, 10.53%; 3, 0%; 2, 0%; 1, 0% vs ChatGPT: 5, 57.89%; 4, 5.27%; 3, 34.21%; 2, 2.63%; 1, 0%; P = 0.001). However, ChatGPT outperformed ERNIE Bot in the English context (ERNIE Bot: 5, 73.68%; 4, 2.63%; 3, 13.16%; 2, 10.53%; 1, 0% vs ChatGPT: 5, 92.11%; 4, 2.63%; 3, 5.26%; 2, 0%; 1, 0%; P = 0.026).
Conclusions: This study preliminarily demonstrated that ERNIE Bot and ChatGPT effectively address questions related to liver cancer interventional radiology. However, their performance varied by language: ChatGPT excelled in English contexts, while ERNIE Bot performed better in Chinese. Choosing the LLM suited to the language of the question helps patients obtain more accurate treatment information. Both models require manual review to ensure accuracy and reliability in practical use.
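The percentages in the Results can be turned back into response counts as a quick consistency check. The following is a minimal sketch, not the authors' analysis: it assumes each percentage is the share of the 38 questions that received a given final Likert score, and the names `reported` and `to_counts` are hypothetical. The abstract does not name the statistical test used, so the sketch does not attempt to reproduce the P values.

```python
N_QUESTIONS = 38  # number of questions stated in the Methods

# Reported percentage of responses at each Likert score, ordered 5, 4, 3, 2, 1.
reported = {
    ("Chinese", "ERNIE Bot"): [89.47, 10.53, 0.0, 0.0, 0.0],
    ("Chinese", "ChatGPT"): [57.89, 5.27, 34.21, 2.63, 0.0],
    ("English", "ERNIE Bot"): [73.68, 2.63, 13.16, 10.53, 0.0],
    ("English", "ChatGPT"): [92.11, 2.63, 5.26, 0.0, 0.0],
}


def to_counts(percentages, n=N_QUESTIONS):
    """Convert percentages of n responses to the nearest whole counts."""
    return [round(p / 100 * n) for p in percentages]


counts = {key: to_counts(pcts) for key, pcts in reported.items()}

for (language, model), c in counts.items():
    by_score = dict(zip([5, 4, 3, 2, 1], c))
    print(f"{language:8s} {model:10s} {by_score} total={sum(c)}")
```

Under that assumption, every row rounds to whole counts that sum to 38, which is consistent with one final score per question for each model and language.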
format Article
id doaj-art-1606368636b649d28b10a3283d00d953
institution Kabale University
issn 2055-2076
language English
publishDate 2025-01-01
publisher SAGE Publishing
record_format Article
series Digital Health
spelling doaj-art-1606368636b649d28b10a3283d00d953
2025-01-23T13:03:45Z
eng
SAGE Publishing
Digital Health
2055-2076
2025-01-01
11
10.1177/20552076251315511
Comparing the performance of ChatGPT and ERNIE Bot in answering questions regarding liver cancer interventional radiology in Chinese and English contexts: A comparative study
Xue-ting Yuan (0), Chen-ye Shao (1), Zhen-zhen Zhang (2), Duo Qian (3)
0: Department of Interventional Radiology, Suzhou, China
1: School of Nursing, , The First Affiliated Hospital of Soochow University, Suzhou, China
2: Department of Interventional Radiology, Suzhou, China
3: Department of Interventional Radiology, Suzhou, China
https://doi.org/10.1177/20552076251315511
title Comparing the performance of ChatGPT and ERNIE Bot in answering questions regarding liver cancer interventional radiology in Chinese and English contexts: A comparative study
url https://doi.org/10.1177/20552076251315511