Multimodal Data Fusion for Depression Detection Approach
Depression is one of the most common mental health disorders in the world, affecting millions of people. Early detection of depression is crucial for effective medical intervention. Multimodal networks can greatly assist in the detection of depression, especially in situations wherein patients are not always aware of or able to express their symptoms.

Main Authors: | Mariia Nykoniuk, Oleh Basystiuk, Nataliya Shakhovska, Nataliia Melnykova |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-01-01 |
Series: | Computation |
Subjects: | depression detection; multimodal networks; early fusion; late fusion; mental health; deep learning |
Online Access: | https://www.mdpi.com/2079-3197/13/1/9 |
author | Mariia Nykoniuk; Oleh Basystiuk; Nataliya Shakhovska; Nataliia Melnykova |
author_sort | Mariia Nykoniuk |
collection | DOAJ |
description | Depression is one of the most common mental health disorders in the world, affecting millions of people. Early detection of depression is crucial for effective medical intervention. Multimodal networks can greatly assist in the detection of depression, especially in situations wherein patients are not always aware of or able to express their symptoms. By analyzing text and audio data, such networks can automatically identify patterns in speech and behavior that indicate a depressive state. In this study, we propose two multimodal information fusion networks: early and late fusion. These networks were developed using convolutional neural network (CNN) layers to learn local patterns, a bidirectional LSTM (Bi-LSTM) to process sequences, and a self-attention mechanism to improve focus on key parts of the data. The DAIC-WOZ and EDAIC-WOZ datasets were used for the experiments. The experiments compared precision, recall, F1-score, and accuracy for early and late multimodal data fusion and found that the early-fusion multimodal network achieved the higher classification accuracy. On the test dataset, this network achieved an F1-score of 0.79 and an overall classification accuracy of 0.86, indicating its effectiveness in detecting depression. |
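The abstract above contrasts early and late multimodal fusion of audio and text features. As a minimal, hypothetical sketch of the two fusion strategies only (NumPy stand-in encoders with random weights; the paper's actual networks use CNN, Bi-LSTM, and self-attention layers, and all names and shapes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    """Stand-in per-modality encoder (the paper uses CNN + Bi-LSTM +
    self-attention); here just a linear layer with a tanh nonlinearity."""
    return np.tanh(x @ w)

def classify(h, w):
    """Stand-in classifier head: linear layer followed by a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(h @ w)))

# Toy audio and text feature batches for 4 interview segments.
audio = rng.normal(size=(4, 8))
text = rng.normal(size=(4, 16))
w_audio = rng.normal(size=(8, 4))
w_text = rng.normal(size=(16, 4))

# Early fusion: concatenate modality embeddings, then one shared head.
h_early = np.concatenate([encode(audio, w_audio), encode(text, w_text)], axis=1)
p_early = classify(h_early, rng.normal(size=(8,)))

# Late fusion: a separate head per modality, then average the probabilities.
p_audio = classify(encode(audio, w_audio), rng.normal(size=(4,)))
p_text = classify(encode(text, w_text), rng.normal(size=(4,)))
p_late = (p_audio + p_text) / 2.0
```

The structural difference is where the modalities meet: early fusion lets one classifier see a joint representation, while late fusion combines per-modality decisions.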
format | Article |
id | doaj-art-47b1906a369e45c8b5a040b25db7c6c3 |
institution | Kabale University |
issn | 2079-3197 |
language | English |
publishDate | 2025-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Computation |
spelling | doaj-art-47b1906a369e45c8b5a040b25db7c6c3 | indexed 2025-01-24T13:27:47Z |
doi | 10.3390/computation13010009 |
citation | Computation, Vol. 13, Iss. 1, Art. 9 (2025-01-01), MDPI AG, ISSN 2079-3197 |
author_affiliation | Department of Artificial Intelligence, Lviv Polytechnic National University, Stepan Bandera 12, 79013 Lviv, Ukraine (all four authors) |
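The abstract reports precision, recall, F1-score, and accuracy for the binary depression label. As a quick sketch of how these metrics are computed (the labels below are illustrative toy data, not the paper's results):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Precision, recall, F1-score, and accuracy for 0/1 label arrays."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # true positives
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false positives
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # false negatives
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # true negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / y_true.size
    return precision, recall, f1, accuracy

# Illustrative labels only: 1 = depressed, 0 = not depressed.
y_true = np.array([1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 0, 1, 1])
precision, recall, f1, accuracy = binary_metrics(y_true, y_pred)
```

For these toy arrays the function returns precision, recall, and F1 of 2/3 and accuracy of 2/3, since there are 2 true positives, 1 false positive, 1 false negative, and 2 true negatives.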
title | Multimodal Data Fusion for Depression Detection Approach |
topic | depression detection; multimodal networks; early fusion; late fusion; mental health; deep learning |
url | https://www.mdpi.com/2079-3197/13/1/9 |