Assessing information provided via artificial intelligence regarding distal biceps tendon repair surgery

Bibliographic Details
Main Authors: Suhasini Gupta, Brett D. Haislup, Ryan A. Hoffman, Anand M. Murthi
Format: Article
Language: English
Published: Wiley 2025-04-01
Series: Journal of Experimental Orthopaedics
Subjects:
Online Access: https://doi.org/10.1002/jeo2.70281
Description
Summary: Abstract. Purpose: The purpose of this study was to analyze the quality, accuracy, reliability and readability of information provided by the Artificial Intelligence (AI) model ChatGPT (OpenAI, San Francisco) regarding distal biceps tendon repair surgery. Methods: ChatGPT 3.5 was used to answer 27 questions commonly asked by patients regarding ‘distal biceps repair surgery’. These questions were categorized using the Rothwell criteria into Fact, Policy and Value. The answers generated by ChatGPT were analyzed using the DISCERN scale, the Journal of the American Medical Association (JAMA) benchmark criteria, the Flesch Reading Ease Score (FRES) and the Flesch‐Kincaid Grade Level (FKGL). Results: The DISCERN score was 59 for Fact‐based questions, 61 for Policy and 59 for Value (all considered ‘good’ scores). The JAMA benchmark score was 0, the lowest possible score, for all three categories of Fact, Policy and Value. The FRES was 24.49 for Fact questions, 22.82 for Policy and 21.77 for Value; the FKGL was 14.96 for Fact, 14.78 for Policy and 15.00 for Value. Conclusion: By quality assessment, the answers provided by ChatGPT were a ‘good’ source compared with other online resources that do not offer citations as an option. The accuracy and reliability of these answers were low, and readability was at nearly a college‐graduate level. Physicians should therefore caution patients who search ChatGPT for information regarding distal biceps repair. ChatGPT is a promising source for patients to learn about their procedure, although its low reliability and difficult readability are disadvantages for the average patient using the software.
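
For context, the two readability metrics reported in the abstract have standard closed-form definitions. The sketch below (Python) shows how FRES and FKGL are typically computed; the vowel-group syllable counter is a rough heuristic and an assumption here, not the instrument the study's authors used.

```python
import re

def count_syllables(word: str) -> int:
    """Approximate English syllable count via vowel groups (heuristic)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1  # drop a silent trailing 'e'
    return max(count, 1)

def readability(text: str) -> tuple[float, float]:
    """Return (FRES, FKGL) for a block of English text."""
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text) or ["word"]  # avoid divide-by-zero
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / sentences
    syllables_per_word = syllables / len(words)
    # Standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas
    fres = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fres, fkgl

if __name__ == "__main__":
    sample = ("Distal biceps tendon repair reattaches the torn tendon to the "
              "radius. Most patients regain strength after rehabilitation.")
    print(readability(sample))
```

A FRES in the low 20s with an FKGL near 15, as reported in this study, corresponds to college-graduate-level text; commonly cited patient-education guidance targets roughly a sixth- to eighth-grade reading level.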
ISSN: 2197-1153