Af-CAN: Multimodal Emotion Recognition Method Based on Situational Attention Mechanism

In the pursuit of developing an efficient and harmonious human-computer interaction interface, Emotion Recognition in Conversations (ERC) is particularly important. It requires the system to delicately capture and understand the nuances of human emotional fluctuations during the communication process.


Bibliographic Details
Main Authors: Xue Zhang, Mingjiang Wang, Xiao Zeng, Xuyi Zhuang
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Subjects: Emotion recognition; transfer learning; multimodal
Online Access:https://ieeexplore.ieee.org/document/10701560/
author Xue Zhang
Mingjiang Wang
Xiao Zeng
Xuyi Zhuang
collection DOAJ
description In the pursuit of an efficient and harmonious human-computer interaction interface, Emotion Recognition in Conversations (ERC) is particularly important: it requires a system to capture and understand the nuances of human emotional fluctuation during communication. Although emotional signals are present across conversational modalities such as audio, video, and text, multimodal ERC remains a challenging problem because of its inherent complexity. Previous research has tended to rely on a single modality, particularly text, neglecting the rich emotional cues present in audio and video. To address open challenges such as inadequate extraction of contextual emotional dynamics and data scarcity, a multimodal emotion recognition method called the Attention-based Fusion Contextual Attention Network (Af-CAN) is proposed to break through these limitations. Af-CAN combines a multimodal feature fusion mechanism, which extracts emotion-relevant features from the different information sources, with attention mechanisms that integrate those features, improving the comprehensiveness and accuracy of emotion recognition. In response to the emotional dynamics and context dependency of conversations, the framework also introduces a dedicated context modeling unit that tracks the evolution of emotional states across the conversation and the mutual influence of emotions between speakers. Experimental evaluations on multiple standard datasets show that Af-CAN outperforms existing ERC systems on various evaluation metrics, with particular advantages in handling complex emotional changes in conversations, laying a solid foundation for advancing the application of emotional intelligence in human-computer interaction.
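The attention-based fusion described in the abstract can be illustrated with a minimal sketch. The paper's actual architecture is not reproduced in this record, so the feature dimensions, the single learned query vector, and the softmax weighting below are illustrative assumptions, not Af-CAN's implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(modal_feats, query):
    """Fuse per-modality feature vectors into one utterance representation.

    modal_feats: (M, d) array, one d-dim vector per modality (e.g. audio,
                 video, and text encoder outputs projected to a common size).
    query:       (d,) learned query vector scoring each modality's relevance.
    Returns the per-modality attention weights (M,) and the fused vector (d,).
    """
    d = modal_feats.shape[1]
    scores = modal_feats @ query / np.sqrt(d)   # scaled dot-product scores
    weights = softmax(scores)                   # one weight per modality
    fused = weights @ modal_feats               # convex combination of modalities
    return weights, fused

# Toy example: three modalities, 8-dim features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 8))
query = rng.normal(size=8)
w, fused = attention_fuse(feats, query)
print(w)            # weights are positive and sum to 1
print(fused.shape)
```

In this sketch the softmax makes the fused vector a convex combination of the modality features, so modalities the query scores as more emotion-relevant dominate the fused representation; a trainable system would learn `query` (or per-utterance queries) end to end.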
format Article
id doaj-art-34c1e68b83704be2816e6e23fb554815
institution DOAJ
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling Af-CAN: Multimodal Emotion Recognition Method Based on Situational Attention Mechanism. IEEE Access, vol. 13, pp. 44858-44871, 2025-01-01. DOI: 10.1109/ACCESS.2024.3471613 (IEEE Xplore document 10701560). Record updated 2025-08-20T03:01:28Z.
Authors: Xue Zhang (https://orcid.org/0009-0003-8169-4453), Mingjiang Wang (https://orcid.org/0000-0002-4706-009X), Xiao Zeng, Xuyi Zhuang; all with the Key Laboratory for Key Technologies of IoT Terminals, Harbin Institute of Technology (Shenzhen), Shenzhen, China.
Online access: https://ieeexplore.ieee.org/document/10701560/. Subjects: Emotion recognition; transfer learning; multimodal.
title Af-CAN: Multimodal Emotion Recognition Method Based on Situational Attention Mechanism
topic Emotion recognition
transfer learning
multimodal
url https://ieeexplore.ieee.org/document/10701560/