Using Large Language Models to Detect and Understand Drug Discontinuation Events in Web-Based Forums: Development and Validation Study
Main Authors: | William Trevena, Xiang Zhong, Michelle Alvarado, Alexander Semenov, Alp Oktay, Devin Devlin, Aarya Yogesh Gohil, Sai Harsha Chittimouju |
---|---|
Format: | Article |
Language: | English |
Published: | JMIR Publications, 2025-01-01 |
Series: | Journal of Medical Internet Research |
Online Access: | https://www.jmir.org/2025/1/e54601 |
_version_ | 1832577752898732032 |
---|---|
author | William Trevena; Xiang Zhong; Michelle Alvarado; Alexander Semenov; Alp Oktay; Devin Devlin; Aarya Yogesh Gohil; Sai Harsha Chittimouju |
author_facet | William Trevena; Xiang Zhong; Michelle Alvarado; Alexander Semenov; Alp Oktay; Devin Devlin; Aarya Yogesh Gohil; Sai Harsha Chittimouju |
author_sort | William Trevena |
collection | DOAJ |
description |
Background: The implementation of large language models (LLMs), such as BART (Bidirectional and Auto-Regressive Transformers) and GPT-4, has revolutionized the extraction of insights from unstructured text. These advances have extended into health care, enabling the analysis of social media for public health insights. However, the detection of drug discontinuation events (DDEs) remains underexplored. Identifying DDEs is crucial for understanding medication adherence and patient outcomes.
Objective: The aim of this study is to provide a flexible framework for investigating various clinical research questions in data-sparse environments. We illustrate the utility of this framework by identifying DDEs and their root causes in a publicly accessible web-based forum, MedHelp, and by releasing the first open-source DDE datasets to aid further research in this domain.
Methods: We used several LLMs, including GPT-4 Turbo, GPT-4o, DeBERTa (Decoding-Enhanced Bidirectional Encoder Representations from Transformers with Disentangled Attention), and BART, among others, to detect DDEs and determine their root causes in user comments posted on MedHelp. Our study design relied on zero-shot classification, which allows these models to make predictions without task-specific training. We split user comments into sentences and applied different classification strategies to assess how well the models identified DDEs and their root causes.
Results: Among the selected models, GPT-4o performed best at determining the root causes of DDEs, misclassifying only 12.9% of root-cause labels (Hamming loss). Among the open-source models tested, BART performed best at detecting DDEs, achieving an F1-score of 0.86, a false positive rate of 2.8%, and a false negative rate of 6.5%, all without any fine-tuning. The dataset contained only 10.7% (107/1000) DDEs, underscoring the models' robustness in an imbalanced data setting.
Conclusions: This study demonstrated the effectiveness of open- and closed-source LLMs, such as BART and GPT-4o, for detecting DDEs and their root causes in publicly accessible data through zero-shot classification. The robust and scalable framework we propose can help researchers address data-sparse clinical research questions, and the release of open-access DDE datasets has the potential to stimulate further research and novel discoveries in this field. |
format | Article |
id | doaj-art-86b31b83fe5040e1b9349428177d03a6 |
institution | Kabale University |
issn | 1438-8871 |
language | English |
publishDate | 2025-01-01 |
publisher | JMIR Publications |
record_format | Article |
series | Journal of Medical Internet Research |
spelling | doaj-art-86b31b83fe5040e1b9349428177d03a6; 2025-01-30T15:45:33Z; eng; JMIR Publications; Journal of Medical Internet Research; 1438-8871; 2025-01-01; 27; e54601; 10.2196/54601; Using Large Language Models to Detect and Understand Drug Discontinuation Events in Web-Based Forums: Development and Validation Study; William Trevena (https://orcid.org/0000-0001-7011-6867); Xiang Zhong (https://orcid.org/0000-0002-6214-5876); Michelle Alvarado (https://orcid.org/0000-0001-9649-214X); Alexander Semenov (https://orcid.org/0000-0003-2691-4575); Alp Oktay (https://orcid.org/0009-0007-2075-4896); Devin Devlin (https://orcid.org/0009-0005-2207-673X); Aarya Yogesh Gohil (https://orcid.org/0009-0008-3603-9480); Sai Harsha Chittimouju (https://orcid.org/0009-0002-4420-9220); [abstract as in the description field]; https://www.jmir.org/2025/1/e54601 |
spellingShingle | William Trevena; Xiang Zhong; Michelle Alvarado; Alexander Semenov; Alp Oktay; Devin Devlin; Aarya Yogesh Gohil; Sai Harsha Chittimouju; Using Large Language Models to Detect and Understand Drug Discontinuation Events in Web-Based Forums: Development and Validation Study; Journal of Medical Internet Research |
title | Using Large Language Models to Detect and Understand Drug Discontinuation Events in Web-Based Forums: Development and Validation Study |
title_full | Using Large Language Models to Detect and Understand Drug Discontinuation Events in Web-Based Forums: Development and Validation Study |
title_fullStr | Using Large Language Models to Detect and Understand Drug Discontinuation Events in Web-Based Forums: Development and Validation Study |
title_full_unstemmed | Using Large Language Models to Detect and Understand Drug Discontinuation Events in Web-Based Forums: Development and Validation Study |
title_short | Using Large Language Models to Detect and Understand Drug Discontinuation Events in Web-Based Forums: Development and Validation Study |
title_sort | using large language models to detect and understand drug discontinuation events in web based forums development and validation study |
url | https://www.jmir.org/2025/1/e54601 |
work_keys_str_mv | AT williamtrevena usinglargelanguagemodelstodetectandunderstanddrugdiscontinuationeventsinwebbasedforumsdevelopmentandvalidationstudy AT xiangzhong usinglargelanguagemodelstodetectandunderstanddrugdiscontinuationeventsinwebbasedforumsdevelopmentandvalidationstudy AT michellealvarado usinglargelanguagemodelstodetectandunderstanddrugdiscontinuationeventsinwebbasedforumsdevelopmentandvalidationstudy AT alexandersemenov usinglargelanguagemodelstodetectandunderstanddrugdiscontinuationeventsinwebbasedforumsdevelopmentandvalidationstudy AT alpoktay usinglargelanguagemodelstodetectandunderstanddrugdiscontinuationeventsinwebbasedforumsdevelopmentandvalidationstudy AT devindevlin usinglargelanguagemodelstodetectandunderstanddrugdiscontinuationeventsinwebbasedforumsdevelopmentandvalidationstudy AT aaryayogeshgohil usinglargelanguagemodelstodetectandunderstanddrugdiscontinuationeventsinwebbasedforumsdevelopmentandvalidationstudy AT saiharshachittimouju usinglargelanguagemodelstodetectandunderstanddrugdiscontinuationeventsinwebbasedforumsdevelopmentandvalidationstudy |
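The Methods summary above describes detecting DDEs and their root causes through zero-shot classification of sentences drawn from MedHelp comments, with models such as BART making predictions without task-specific training, and the Results report multi-label root-cause performance as Hamming loss. A minimal sketch of how such a pipeline might look is shown below, using the Hugging Face zero-shot classification pipeline; the checkpoint (facebook/bart-large-mnli), the candidate labels, the 0.5 decision threshold, and the example sentences and annotations are illustrative assumptions, not the authors' actual configuration or data.

```python
# Minimal sketch: zero-shot DDE detection and root-cause labeling on forum
# sentences, loosely following the zero-shot setup described in the Methods
# summary. Checkpoint, labels, threshold, sentences, and annotations below
# are illustrative assumptions, not the paper's configuration or data.
from transformers import pipeline
from sklearn.metrics import hamming_loss

# BART fine-tuned on NLI, a common backbone for zero-shot classification;
# no task-specific training is performed.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Hypothetical sentences (user comments would first be split into sentences).
sentences = [
    "I stopped taking the medication because of the side effects.",
    "My doctor increased my dose last week and I feel fine.",
]

# Step 1: DDE detection as a binary zero-shot decision between two labels.
dde_labels = ["stopped taking a medication", "still taking a medication"]
for sentence in sentences:
    result = classifier(sentence, candidate_labels=dde_labels)
    is_dde = result["labels"][0] == "stopped taking a medication"  # top-ranked label
    print(f"DDE={is_dde}: {sentence}")

# Step 2: root-cause identification as multi-label zero-shot classification
# over an assumed (not the paper's) cause taxonomy.
cause_labels = ["side effects", "cost", "doctor's advice", "felt better", "ineffective"]
result = classifier(sentences[0], candidate_labels=cause_labels, multi_label=True)
scores = dict(zip(result["labels"], result["scores"]))
predicted = [int(scores[label] >= 0.5) for label in cause_labels]  # assumed 0.5 threshold

# Step 3: evaluate multi-label predictions with Hamming loss, the metric the
# Results report for root causes (the annotation here is made up).
annotated = [1, 0, 0, 0, 0]  # annotation for sentences[0], in cause_labels order
print("Hamming loss:", hamming_loss([annotated], [predicted]))
```

Framing classification as entailment against natural-language label descriptions is what lets such models operate without fine-tuning, which is the property the study relies on for data-sparse questions and imbalanced data such as the 10.7% DDE prevalence reported above.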