Voice EHR: introducing multimodal audio data for health
Introduction: Artificial intelligence (AI) models trained on audio data may have the potential to rapidly perform clinical tasks, enhancing medical decision-making and potentially improving outcomes through early detection. Existing technologies depend on limited datasets collected with expensive reco...
Main Authors: | James Anibal, Hannah Huth, Ming Li, Lindsey Hazen, Veronica Daoud, Dominique Ebedes, Yen Minh Lam, Hang Nguyen, Phuc Vo Hong, Michael Kleinman, Shelley Ost, Christopher Jackson, Laura Sprabery, Cheran Elangovan, Balaji Krishnaiah, Lee Akst, Ioan Lina, Iqbal Elyazar, Lenny Ekawati, Stefan Jansen, Richard Nduwayezu, Charisse Garcia, Jeffrey Plum, Jacqueline Brenner, Miranda Song, Emily Ricotta, David Clifton, C. Louise Thwaites, Yael Bensoussan, Bradford Wood |
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2025-01-01 |
Series: | Frontiers in Digital Health |
Subjects: | AI for health; natural language processing; large language models (LLM); multimodal data; voice biomarkers |
Online Access: | https://www.frontiersin.org/articles/10.3389/fdgth.2024.1448351/full |
_version_ | 1832583439445917696 |
author | James Anibal; James Anibal; Hannah Huth; Ming Li; Lindsey Hazen; Veronica Daoud; Dominique Ebedes; Yen Minh Lam; Hang Nguyen; Phuc Vo Hong; Michael Kleinman; Shelley Ost; Christopher Jackson; Laura Sprabery; Cheran Elangovan; Balaji Krishnaiah; Lee Akst; Lee Akst; Ioan Lina; Iqbal Elyazar; Lenny Ekawati; Stefan Jansen; Richard Nduwayezu; Charisse Garcia; Jeffrey Plum; Jacqueline Brenner; Miranda Song; Emily Ricotta; Emily Ricotta; David Clifton; C. Louise Thwaites; Yael Bensoussan; Bradford Wood |
author_facet | James Anibal; James Anibal; Hannah Huth; Ming Li; Lindsey Hazen; Veronica Daoud; Dominique Ebedes; Yen Minh Lam; Hang Nguyen; Phuc Vo Hong; Michael Kleinman; Shelley Ost; Christopher Jackson; Laura Sprabery; Cheran Elangovan; Balaji Krishnaiah; Lee Akst; Lee Akst; Ioan Lina; Iqbal Elyazar; Lenny Ekawati; Stefan Jansen; Richard Nduwayezu; Charisse Garcia; Jeffrey Plum; Jacqueline Brenner; Miranda Song; Emily Ricotta; Emily Ricotta; David Clifton; C. Louise Thwaites; Yael Bensoussan; Bradford Wood |
author_sort | James Anibal |
collection | DOAJ |
description | Introduction: Artificial intelligence (AI) models trained on audio data may have the potential to rapidly perform clinical tasks, enhancing medical decision-making and potentially improving outcomes through early detection. Existing technologies depend on limited datasets collected with expensive recording equipment in high-income countries, which challenges deployment in resource-constrained, high-volume settings where audio data may have a profound impact on health equity. Methods: This report introduces a novel protocol for audio data collection and a corresponding application that captures health information through guided questions. Results: To demonstrate the potential of Voice EHR as a biomarker of health, initial experiments on data quality and multiple case studies are presented in this report. Large language models (LLMs) were used to compare transcribed Voice EHR data with data (from the same patients) collected through conventional techniques such as multiple-choice questions. Information contained in the Voice EHR samples was consistently rated as equally or more relevant to a health evaluation. Discussion: The HEAR application facilitates the collection of an audio electronic health record ("Voice EHR") that may contain complex biomarkers of health from conventional voice/respiratory features, speech patterns, and spoken language with semantic meaning and longitudinal context, potentially compensating for the typical limitations of unimodal clinical datasets. |
format | Article |
id | doaj-art-dfb6e033d1c7467db8d98021c3e6d383 |
institution | Kabale University |
issn | 2673-253X |
language | English |
publishDate | 2025-01-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Digital Health |
spelling | doaj-art-dfb6e033d1c7467db8d98021c3e6d383 2025-01-28T14:06:13Z eng Frontiers Media S.A. Frontiers in Digital Health 2673-253X 2025-01-01 vol. 6 10.3389/fdgth.2024.1448351 (article 1448351) Voice EHR: introducing multimodal audio data for health. Author affiliations: Center for Interventional Oncology, NIH Clinical Center, National Institutes of Health, Bethesda, MD, United States; Computational Health Informatics Lab, Oxford Institute of Biomedical Engineering, University of Oxford, Oxford, United Kingdom; Morsani College of Medicine, University of South Florida, Tampa, FL, United States; Social Science and Implementation Research Team, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam; College of Medicine, University of Tennessee Health Sciences Center, Memphis, TN, United States; Johns Hopkins Voice Center, Johns Hopkins University, Baltimore, MD, United States; Department of Otolaryngology-Head and Neck Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, United States; Geospatial Epidemiology Program, Oxford University Clinical Research Unit Indonesia, Jakarta, Indonesia; College of Medicine and Health Sciences, University of Rwanda, Kigali, Rwanda; King Faisal Hospital, Kigali, Rwanda; Epidemiology and Data Management Unit, National Institute of Allergy and Infectious Diseases, Bethesda, MD, United States; Department of Preventive Medicine and Biostatistics, Uniformed Services University, Bethesda, MD, United States |
spellingShingle | James Anibal; James Anibal; Hannah Huth; Ming Li; Lindsey Hazen; Veronica Daoud; Dominique Ebedes; Yen Minh Lam; Hang Nguyen; Phuc Vo Hong; Michael Kleinman; Shelley Ost; Christopher Jackson; Laura Sprabery; Cheran Elangovan; Balaji Krishnaiah; Lee Akst; Lee Akst; Ioan Lina; Iqbal Elyazar; Lenny Ekawati; Stefan Jansen; Richard Nduwayezu; Charisse Garcia; Jeffrey Plum; Jacqueline Brenner; Miranda Song; Emily Ricotta; Emily Ricotta; David Clifton; C. Louise Thwaites; Yael Bensoussan; Bradford Wood; Voice EHR: introducing multimodal audio data for health; Frontiers in Digital Health; AI for health; natural language processing; large language models (LLM); multimodal data; voice biomarkers |
title | Voice EHR: introducing multimodal audio data for health |
title_full | Voice EHR: introducing multimodal audio data for health |
title_fullStr | Voice EHR: introducing multimodal audio data for health |
title_full_unstemmed | Voice EHR: introducing multimodal audio data for health |
title_short | Voice EHR: introducing multimodal audio data for health |
title_sort | voice ehr introducing multimodal audio data for health |
topic | AI for health; natural language processing; large language models (LLM); multimodal data; voice biomarkers |
url | https://www.frontiersin.org/articles/10.3389/fdgth.2024.1448351/full |
work_keys_str_mv | AT jamesanibal voiceehrintroducingmultimodalaudiodataforhealth AT jamesanibal voiceehrintroducingmultimodalaudiodataforhealth AT hannahhuth voiceehrintroducingmultimodalaudiodataforhealth AT mingli voiceehrintroducingmultimodalaudiodataforhealth AT lindseyhazen voiceehrintroducingmultimodalaudiodataforhealth AT veronicadaoud voiceehrintroducingmultimodalaudiodataforhealth AT dominiqueebedes voiceehrintroducingmultimodalaudiodataforhealth AT yenminhlam voiceehrintroducingmultimodalaudiodataforhealth AT hangnguyen voiceehrintroducingmultimodalaudiodataforhealth AT phucvohong voiceehrintroducingmultimodalaudiodataforhealth AT michaelkleinman voiceehrintroducingmultimodalaudiodataforhealth AT shelleyost voiceehrintroducingmultimodalaudiodataforhealth AT christopherjackson voiceehrintroducingmultimodalaudiodataforhealth AT laurasprabery voiceehrintroducingmultimodalaudiodataforhealth AT cheranelangovan voiceehrintroducingmultimodalaudiodataforhealth AT balajikrishnaiah voiceehrintroducingmultimodalaudiodataforhealth AT leeakst voiceehrintroducingmultimodalaudiodataforhealth AT leeakst voiceehrintroducingmultimodalaudiodataforhealth AT ioanlina voiceehrintroducingmultimodalaudiodataforhealth AT iqbalelyazar voiceehrintroducingmultimodalaudiodataforhealth AT lennyekawati voiceehrintroducingmultimodalaudiodataforhealth AT stefanjansen voiceehrintroducingmultimodalaudiodataforhealth AT richardnduwayezu voiceehrintroducingmultimodalaudiodataforhealth AT charissegarcia voiceehrintroducingmultimodalaudiodataforhealth AT jeffreyplum voiceehrintroducingmultimodalaudiodataforhealth AT jacquelinebrenner voiceehrintroducingmultimodalaudiodataforhealth AT mirandasong voiceehrintroducingmultimodalaudiodataforhealth AT emilyricotta voiceehrintroducingmultimodalaudiodataforhealth AT emilyricotta voiceehrintroducingmultimodalaudiodataforhealth AT davidclifton voiceehrintroducingmultimodalaudiodataforhealth AT clouisethwaites voiceehrintroducingmultimodalaudiodataforhealth AT yaelbensoussan voiceehrintroducingmultimodalaudiodataforhealth AT bradfordwood voiceehrintroducingmultimodalaudiodataforhealth |
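The Results section of the abstract above describes using large language models to compare transcribed Voice EHR narratives against conventionally collected data (e.g., multiple-choice intake answers) and rate which is more relevant to a health evaluation. The sketch below is a minimal, hypothetical illustration of such a comparison, assuming an OpenAI-style chat completion API; the model name, prompt wording, rating scale, and JSON output format are assumptions for illustration and are not taken from the paper.

```python
# Hypothetical sketch (not the authors' code): ask an LLM to rate the clinical
# relevance of a transcribed Voice EHR narrative vs. multiple-choice intake data.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = """You are assisting with a clinical data-quality review.
Patient data, source 1 (transcribed free-speech Voice EHR):
{voice_ehr}

Patient data, source 2 (conventional multiple-choice intake form):
{intake}

On a 1-5 scale, rate how relevant each source is to a health evaluation,
then state which source is more informative. Respond as JSON with keys
"voice_ehr_score", "intake_score", and "more_informative"."""

def compare_sources(voice_ehr_text: str, intake_text: str) -> str:
    """Return the LLM's relevance ratings for the two data sources."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative choice; the paper does not name a model here
        messages=[{"role": "user",
                   "content": PROMPT.format(voice_ehr=voice_ehr_text,
                                            intake=intake_text)}],
        temperature=0,  # deterministic output for repeatable ratings
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(compare_sources(
        "I've had a dry cough for about two weeks, worse at night, with mild fever...",
        "Cough: yes. Fever: yes. Duration: 1-2 weeks.",
    ))
```

In practice, a speech-to-text step on the recorded Voice EHR audio would sit upstream of this call, and ratings would be aggregated across many patients rather than taken from a single example.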