A Social Media Dataset and H-GNN-Based Contrastive Learning Scheme for Multimodal Sentiment Analysis

Bibliographic Details
Main Authors: Jiao Peng, Yue He, Yongjuan Chang, Yanyan Lu, Pengfei Zhang, Zhonghong Ou, Qingzhi Yu
Format: Article
Language: English
Published: MDPI AG, 2025-01-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/2/636
Description
Summary: Multimodal sentiment analysis faces a number of challenges, including missing modalities, the modality heterogeneity gap, and incomplete datasets. Previous studies usually adopt schemes such as meta-learning or multi-layer structures; nevertheless, these methods lack interpretability for the interaction between modalities. In this paper, we construct a new dataset, SM-MSD, for sentiment analysis in social media (SAS) that differs significantly from conventional corpora, comprising 10K instances of diverse data from Twitter, encompassing text, emoticons, emojis, and text embedded in images. This dataset aims to reflect authentic social scenarios and varied emotional expressions, and it provides a meaningful and challenging evaluation benchmark for multimodal sentiment analysis in specific contexts. Furthermore, we propose a multi-task framework based on heterogeneous graph neural networks (H-GNNs) and contrastive learning. For the first time, heterogeneous graph neural networks are applied to multimodal sentiment analysis. When additional labeled data are available, the framework guides emotion prediction for the missing modality. We conduct extensive experiments on multiple datasets to verify the effectiveness of the proposed scheme. Experimental results demonstrate that it surpasses state-of-the-art methods by 1.7% and 0% in accuracy and by 1.54% and 4.9% in F1-score on the MOSI and MOSEI datasets, respectively, and that it is robust to missing-modality scenarios.
ISSN: 2076-3417
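
The abstract pairs a heterogeneous GNN with a contrastive objective but gives no implementation details. The minimal Python/PyTorch sketch below is not the authors' code; it only illustrates the two ingredients the abstract names: a typed message-passing layer over modality-specific nodes (HeteroLayer) and an InfoNCE-style contrastive loss that aligns two modality views of the same instance. All class names, dimensions, and the edge layout are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HeteroLayer(nn.Module):
    """One round of typed message passing: each (src -> dst) relation has its
    own linear projection, and incoming messages are summed at the target."""
    def __init__(self, dim, node_types, relations):
        super().__init__()
        self.proj = nn.ModuleDict({f"{s}->{d}": nn.Linear(dim, dim)
                                   for s, d in relations})
        self.update = nn.ModuleDict({t: nn.Linear(2 * dim, dim)
                                     for t in node_types})

    def forward(self, feats, edges):
        # feats: {node_type: (num_nodes, dim)}
        # edges: {(src_type, dst_type): (2, num_edges) index tensor}
        agg = {t: torch.zeros_like(x) for t, x in feats.items()}
        for (s, d), idx in edges.items():
            msg = self.proj[f"{s}->{d}"](feats[s][idx[0]])  # relation-specific transform
            agg[d] = agg[d].index_add(0, idx[1], msg)       # sum messages per target node
        return {t: F.relu(self.update[t](torch.cat([feats[t], agg[t]], dim=-1)))
                for t in feats}

def info_nce(z1, z2, tau=0.1):
    """Contrastive loss: row i of z1 and row i of z2 come from the same
    instance (positives); every other row in the batch is a negative."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau
    return F.cross_entropy(logits, torch.arange(z1.size(0)))

# Toy run: 8 instances, each with one "text" node and one "image" node,
# connected in both directions so the two modalities exchange information.
n, dim = 8, 32
feats = {"text": torch.randn(n, dim), "image": torch.randn(n, dim)}
pairs = torch.arange(n)
edges = {("text", "image"): torch.stack([pairs, pairs]),
         ("image", "text"): torch.stack([pairs, pairs])}
layer = HeteroLayer(dim, ["text", "image"], list(edges))
out = layer(feats, edges)
print(info_nce(out["text"], out["image"]).item())

In the paper's multi-task setting, a sentiment prediction head would presumably be trained jointly with such a contrastive objective; the sketch covers only the graph and alignment pieces.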