Multimodal Data Fusion for Depression Detection Approach

Depression is one of the most common mental health disorders in the world, affecting millions of people. Early detection of depression is crucial for effective medical intervention. Multimodal networks can greatly assist in the detection of depression, especially in situations where patients are not always aware of, or able to express, their symptoms. By analyzing text and audio data, such networks can automatically identify patterns in speech and behavior that indicate a depressive state. In this study, we propose two multimodal information fusion networks: early and late fusion. These networks were developed using convolutional neural network (CNN) layers to learn local patterns, a bidirectional LSTM (Bi-LSTM) to process sequences, and a self-attention mechanism to improve focus on key parts of the data. The DAIC-WOZ and EDAIC-WOZ datasets were used for the experiments. The experiments compared precision, recall, F1-score, and accuracy for early and late multimodal data fusion and found that the early-fusion multimodal network achieved higher classification accuracy. On the test dataset, this network achieved an F1-score of 0.79 and an overall classification accuracy of 0.86, indicating its effectiveness in detecting depression.
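The early-fusion pipeline the abstract describes (concatenate per-timestep text and audio features, then CNN → Bi-LSTM → self-attention → classifier) can be sketched as below. This is a minimal illustrative sketch in PyTorch: all layer sizes, feature dimensions, head counts, and the mean-pooling choice are assumptions, not the authors' published hyperparameters.

```python
import torch
import torch.nn as nn

class EarlyFusionNet(nn.Module):
    """Hypothetical early-fusion depression classifier sketch."""

    def __init__(self, text_dim=64, audio_dim=40, hidden=64):
        super().__init__()
        # Early fusion: text and audio features are concatenated per timestep
        # before any modality-specific processing.
        self.conv = nn.Conv1d(text_dim + audio_dim, hidden,
                              kernel_size=3, padding=1)   # local patterns
        self.bilstm = nn.LSTM(hidden, hidden, batch_first=True,
                              bidirectional=True)          # sequence modelling
        self.attn = nn.MultiheadAttention(2 * hidden, num_heads=2,
                                          batch_first=True)  # self-attention
        self.head = nn.Linear(2 * hidden, 1)               # binary output

    def forward(self, text_feats, audio_feats):
        # text_feats: (B, T, text_dim); audio_feats: (B, T, audio_dim)
        x = torch.cat([text_feats, audio_feats], dim=-1)
        x = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        x, _ = self.bilstm(x)
        x, _ = self.attn(x, x, x)
        # Mean-pool over time, then map to a depression probability.
        return torch.sigmoid(self.head(x.mean(dim=1)))

model = EarlyFusionNet()
prob = model(torch.randn(2, 50, 64), torch.randn(2, 50, 40))
print(prob.shape)  # one probability per batch item
```

In the late-fusion variant the abstract contrasts with, each modality would instead pass through its own CNN/Bi-LSTM/attention branch, with the branch outputs merged only at the final classification stage.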

Bibliographic Details
Main Authors: Mariia Nykoniuk, Oleh Basystiuk, Nataliya Shakhovska, Nataliia Melnykova
Format: Article
Language:English
Published: MDPI AG 2025-01-01
Series:Computation
Subjects: depression detection; multimodal networks; early fusion; late fusion; mental health; deep learning
Online Access:https://www.mdpi.com/2079-3197/13/1/9
collection DOAJ
id doaj-art-47b1906a369e45c8b5a040b25db7c6c3
institution Kabale University
issn 2079-3197
doi 10.3390/computation13010009
affiliation Department of Artificial Intelligence, Lviv Polytechnic National University, Stepan Bandera 12, 79013 Lviv, Ukraine (all four authors)