Diagnosis of depression based on facial multimodal data

Introduction: Depression is a serious mental illness. Traditional scale-based diagnostic methods often suffer from strong subjectivity and high misdiagnosis rates, making automatic diagnostic tools based on objective indicators particularly important.

Methods: This study proposes a deep learning method that fuses multimodal facial video and audio data to diagnose depression automatically. A spatiotemporal attention module enhances visual feature extraction, and a Graph Convolutional Network (GCN) combined with a Long Short-Term Memory (LSTM) network analyzes the audio features. Through multimodal feature fusion, the model effectively captures distinct depression-related feature patterns.

Results: Extensive experiments on the Extended Distress Analysis Interview Corpus (E-DAIC), a publicly available clinical dataset, show robust accuracy: a Mean Absolute Error (MAE) of 3.51 in estimating PHQ-8 scores from recorded interviews.

Discussion: Compared with existing methods, the model performs strongly in multimodal information fusion and is well suited to early evaluation of depression.
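The paper's model is not reproduced here, but the fusion-and-evaluation pipeline the abstract describes can be sketched in outline. Everything below is an illustrative assumption: the feature dimensions, the concatenation-based late fusion, and the linear regression head are placeholders standing in for the pooled spatiotemporal-attention (visual) and GCN+LSTM (audio) encoders; only the MAE metric and the PHQ-8 score target (range 0-24) come from the record itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-interview embeddings; dimensions are illustrative,
# not taken from the paper.
n_interviews = 8
visual = rng.normal(size=(n_interviews, 16))  # stand-in for pooled spatiotemporal-attention features
audio = rng.normal(size=(n_interviews, 12))   # stand-in for pooled GCN+LSTM audio features

# Late fusion by concatenation, followed by a linear head mapping the
# fused vector to a PHQ-8 score, clipped to the valid range 0-24.
fused = np.concatenate([visual, audio], axis=1)  # shape (8, 28)
weights = rng.normal(size=fused.shape[1]) * 0.1
bias = 12.0
predicted = np.clip(fused @ weights + bias, 0.0, 24.0)

# Mean Absolute Error, the metric reported on E-DAIC (3.51 in the paper),
# computed here against synthetic ground-truth scores.
true_scores = rng.integers(0, 25, size=n_interviews).astype(float)
mae = np.abs(predicted - true_scores).mean()
print(fused.shape, float(mae))
```

In practice the fusion step and regression head would be trained jointly with the two encoders; the sketch only shows how the modality features meet and how the reported metric is computed.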

Bibliographic Details
Main Authors: Nani Jin, Renjia Ye, Peng Li
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Psychiatry
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fpsyt.2025.1508772/full
ISSN: 1664-0640
DOI: 10.3389/fpsyt.2025.1508772
Collection: DOAJ (Kabale University)
Author Affiliations:
Nani Jin: Materdicine Lab, School of Life Sciences, Shanghai University, Shanghai, China
Renjia Ye: Materdicine Lab, School of Life Sciences, Shanghai University, Shanghai, China
Peng Li: Research Department, Third Xiangya Hospital of Central South University, Changsha, China
Subjects: depression; multi-modal data; feature fusion; spatial-temporal attention; artificial intelligence