Assessing the readability, quality and reliability of responses produced by ChatGPT, Gemini, and Perplexity regarding most frequently asked keywords about low back pain

Bibliographic Details
Main Authors: Erkan Ozduran, Volkan Hancı, Yüksel Erkin, İlhan Celil Özbek, Vugar Abdulkerimov
Format: Article
Language: English
Published: PeerJ Inc., 2025-01-01
Series: PeerJ
Subjects: Artificial intelligence; ChatGPT; Gemini; Low back pain; Online medical information; Perplexity
Online Access: https://peerj.com/articles/18847.pdf
collection DOAJ
description Background: Patients who are informed about the causes, pathophysiology, treatment and prevention of a disease are better able to participate in treatment procedures when they become ill. Artificial intelligence (AI), which has gained popularity in recent years, is defined as the study of algorithms that give machines the ability to reason and perform cognitive functions such as object and word recognition, problem solving and decision making. This study aimed to examine the readability, reliability and quality of the responses that three AI-based chatbots popular in online information presentation today (ChatGPT, Perplexity and Gemini) give to frequently asked keywords about low back pain (LBP). Methods: All three AI chatbots were asked the 25 most frequently searched keywords related to LBP, identified with Google Trends. To prevent possible bias from sequential processing of the keywords, each keyword was entered by a different user (EO, VH). Readability of the responses was measured with the Simple Measure of Gobbledygook (SMOG), Flesch Reading Ease Score (FRES) and Gunning Fog (GFG) readability scores. Quality was assessed using the Global Quality Score (GQS) and the Ensuring Quality Information for Patients (EQIP) score. Reliability was assessed with the DISCERN and Journal of the American Medical Association (JAMA) scales. Results: The top three keywords identified by the Google Trends search were "Lower Back Pain", "ICD 10 Low Back Pain", and "Low Back Pain Symptoms". The readability of the responses from all three chatbots was above the recommended 6th-grade reading level (p < 0.001). Perplexity scored significantly higher than the other chatbots on the EQIP, JAMA, modified DISCERN and GQS evaluations (p < 0.001).
Conclusion: The answers given by the AI chatbots to keywords about LBP were difficult to read and scored low on reliability and quality assessments. Newly introduced chatbots could provide better guidance to patients with improved clarity and text quality. This study may inspire future work on improving the algorithms and responses of AI chatbots.
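For readers unfamiliar with the readability indices named in the Methods (FRES, SMOG, Gunning Fog), the following minimal Python sketch shows how they are computed from sentence, word and syllable counts. The vowel-group syllable counter and the use of the polysyllable count as a proxy for Gunning Fog's "complex words" are simplifying assumptions made here for illustration; the study itself would have used validated scoring tools.

```python
import math
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: one syllable per contiguous vowel group, minimum 1.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> dict:
    # Split into sentences and words with simple regexes.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n_sent, n_words = len(sentences), len(words)
    syllables = [count_syllables(w) for w in words]
    n_syll = sum(syllables)
    # Words with 3+ syllables; used as SMOG polysyllables and as an
    # approximation of Gunning Fog "complex words".
    poly = sum(1 for s in syllables if s >= 3)

    # Flesch Reading Ease: higher = easier to read.
    fres = 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (n_syll / n_words)
    # SMOG grade (strictly defined for samples of 30+ sentences).
    smog = 1.0430 * math.sqrt(poly * 30 / n_sent) + 3.1291
    # Gunning Fog index: estimated years of schooling needed.
    fog = 0.4 * ((n_words / n_sent) + 100 * (poly / n_words))
    return {"FRES": fres, "SMOG": smog, "GunningFog": fog}
```

Note that FRES runs on an inverted scale relative to the other two: a higher FRES means easier text, while higher SMOG and Gunning Fog values correspond to higher grade levels, which is why chatbot responses scoring above the 6th-grade level are considered too difficult for patient materials.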
format Article
id doaj-art-922a019658e64036a359d8e7aa84c062
institution Kabale University
issn 2167-8359
language English
publishDate 2025-01-01
publisher PeerJ Inc.
record_format Article
series PeerJ
doi 10.7717/peerj.18847 (PeerJ 13:e18847, 2025-01-01)
affiliations Erkan Ozduran: Physical Medicine and Rehabilitation, Pain Medicine, Sivas Numune Hospital, Sivas, Turkey
Volkan Hancı: Anesthesiology and Reanimation, Critical Care Medicine, Dokuz Eylül University, Izmir, Turkey
Yüksel Erkin: Anesthesiology and Reanimation, Pain Medicine, Dokuz Eylül University, Izmir, Turkey
İlhan Celil Özbek: Physical Medicine and Rehabilitation, Health Science University, Derince Education and Research Hospital, Kocaeli, Turkey
Vugar Abdulkerimov: Anesthesiology and Reanimation, Central Clinical Hospital, Baku, Azerbaijan
title Assessing the readability, quality and reliability of responses produced by ChatGPT, Gemini, and Perplexity regarding most frequently asked keywords about low back pain
topic Artificial intelligence
ChatGPT
Gemini
Low back pain
Online medical information
Perplexity
url https://peerj.com/articles/18847.pdf