Diagnosis of depression based on facial multimodal data
Introduction: Depression is a serious mental health disorder. Traditional scale-based depression diagnosis is often highly subjective and prone to misdiagnosis, so it is particularly important to develop automatic diagnostic tools based on objective indicators. Methods: This study...
Main Authors: Nani Jin, Renjia Ye, Peng Li
Format: Article
Language: English
Published: Frontiers Media S.A., 2025-01-01
Series: Frontiers in Psychiatry
Subjects: depression; multi-modal data; feature fusion; spatial-temporal attention; artificial intelligence
Online Access: https://www.frontiersin.org/articles/10.3389/fpsyt.2025.1508772/full
author | Nani Jin; Renjia Ye; Peng Li |
collection | DOAJ |
description | Introduction: Depression is a serious mental health disorder. Traditional scale-based depression diagnosis is often highly subjective and prone to misdiagnosis, so it is particularly important to develop automatic diagnostic tools based on objective indicators. Methods: This study proposes a deep learning method that fuses multimodal data to diagnose depression automatically from facial video and audio. A spatiotemporal attention module enhances the extraction of visual features, and a Graph Convolutional Network (GCN) combined with a Long Short-Term Memory (LSTM) network analyzes the audio features. Through multimodal feature fusion, the model effectively captures the distinct feature patterns related to depression. Results: We conduct extensive experiments on a publicly available clinical dataset, the Extended Distress Analysis Interview Corpus (E-DAIC). The results show robust accuracy, with a Mean Absolute Error (MAE) of 3.51 in estimating PHQ-8 scores from recorded interviews. Discussion: Compared with existing methods, our model performs strongly at multimodal information fusion and is well suited to early evaluation of depression. |
format | Article |
id | doaj-art-fea593787b884542b3422317395492df |
institution | Kabale University |
issn | 1664-0640 |
language | English |
publishDate | 2025-01-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Psychiatry |
spelling | Nani Jin (Materdicine Lab, School of Life Sciences, Shanghai University, Shanghai, China); Renjia Ye (Materdicine Lab, School of Life Sciences, Shanghai University, Shanghai, China); Peng Li (Research Department, Third Xiangya Hospital of Central South University, Changsha, China). Frontiers Media S.A., Frontiers in Psychiatry 16 (2025-01-01), ISSN 1664-0640, doi: 10.3389/fpsyt.2025.1508772 |
title | Diagnosis of depression based on facial multimodal data |
topic | depression; multi-modal data; feature fusion; spatial-temporal attention; artificial intelligence |
url | https://www.frontiersin.org/articles/10.3389/fpsyt.2025.1508772/full |
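The pipeline summarized in the abstract above — an attention-pooled visual branch, a graph-convolutional audio branch, and feature-level fusion regressed to a PHQ-8 score — can be sketched roughly as follows. This is a minimal NumPy illustration under invented shapes and random weights, not the authors' implementation: the LSTM stage of the audio branch is replaced by a simple mean-pool for brevity, and every dimension and variable name here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention_pool(frames, w):
    # frames: (T, D) per-frame visual features; w: (D,) scoring vector.
    # Attention weights over time collapse the sequence into one vector.
    scores = softmax(frames @ w)          # (T,)
    return scores @ frames                # (D,)

def gcn_layer(X, A, W):
    # One graph-convolution step: symmetrically normalized adjacency
    # (with self-loops) times node features, then a linear map and ReLU.
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W, 0.0)

# Toy inputs: 20 video frames of 64-d features, and an audio graph
# of 10 nodes with 32-d features plus a symmetric adjacency matrix.
visual = rng.normal(size=(20, 64))
audio = rng.normal(size=(10, 32))
A = (rng.random((10, 10)) > 0.7).astype(float)
A = np.maximum(A, A.T)

v_feat = temporal_attention_pool(visual, rng.normal(size=64))    # (64,)
a_feat = gcn_layer(audio, A, rng.normal(size=(32, 16))).mean(0)  # (16,)

# Feature-level fusion, then a linear head clipped to the PHQ-8 range.
fused = np.concatenate([v_feat, a_feat])                         # (80,)
phq8 = float(np.clip(fused @ rng.normal(size=80) * 0.1 + 12.0, 0, 24))
print(round(phq8, 2))  # a PHQ-8 estimate in [0, 24]
```

The concatenate-then-regress step stands in for whatever fusion head the paper actually uses; the point is that both modalities are reduced to fixed-size vectors before being combined.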