A Novel Hierarchical Multimodal Recommender With Enhanced Global Collaborative Signals

Bibliographic Details
Main Authors: Peng Yi, Lu Chen, Zhaoxian Li, Cheng Yang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11014073/
Description
Summary: Multimodal recommender systems leverage auxiliary item features, such as images and descriptions, to alleviate the data sparsity problem and facilitate the preference modeling process. Despite their potential, existing multimodal recommenders fail to exploit global collaborative signals and lack insights into the underlying interaction formation mechanism, resulting in suboptimal recommendation performance. To this end, we propose a hierarchical multimodal recommender named HMMGCF, which can capture crucial global collaborative signals through modality feature-enhanced hierarchical structures and a novel inter-modality alignment strategy. Specifically, modality features are first utilized to identify neighboring relationships, and similar users (items) are steadily merged together to form modality-specific hierarchical structures. Then, with the proper graph convolution operation on each hierarchy, the crucial global collaborative signals can be effectively extracted and integrated into the modality-specific user (item) embeddings. Moreover, a novel group-wise contrastive learning strategy is also proposed to align inter-modality preference information and further enhance the extraction of global collaborative signals. By conducting extensive experiments on three benchmark datasets, we empirically and theoretically demonstrate the superiority of HMMGCF, validating the importance of global collaborative signal extraction in multimodal recommender systems.
ISSN: 2169-3536
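
Illustrative sketch (not from the article): The summary describes merging similar users or items by their modality features and then folding group-level information back into individual embeddings. The toy Python below shows that general idea for a single hierarchy level only. It is not the HMMGCF algorithm: the greedy cosine-threshold grouping, the mean-pooling step, and the names `merge_similar`, `add_group_signal`, and `threshold` are all assumptions made for illustration, and the record does not specify the authors' merging rule, graph convolution, or contrastive loss.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_similar(features, threshold=0.9):
    """Greedy one-level grouping (hypothetical stand-in for the paper's merging).

    Each node joins the first existing group whose first member is at least
    `threshold`-similar in modality-feature space; otherwise it starts a new group.
    Returns one group id per node.
    """
    leaders = []      # index of the first member of each group
    assignment = []
    for i, feat in enumerate(features):
        for group_id, leader in enumerate(leaders):
            if cosine(feat, features[leader]) >= threshold:
                assignment.append(group_id)
                break
        else:
            leaders.append(i)
            assignment.append(len(leaders) - 1)
    return assignment


def add_group_signal(embeddings, assignment):
    """Add each group's mean embedding to its members (assumes non-empty input).

    This mean-pooling step is a simplification standing in for a graph
    convolution over the coarser hierarchy level.
    """
    n_groups = max(assignment) + 1
    dim = len(embeddings[0])
    sums = [[0.0] * dim for _ in range(n_groups)]
    counts = [0] * n_groups
    for emb, g in zip(embeddings, assignment):
        counts[g] += 1
        for k, x in enumerate(emb):
            sums[g][k] += x
    means = [[s / counts[g] for s in sums[g]] for g in range(n_groups)]
    return [[x + m for x, m in zip(emb, means[g])]
            for emb, g in zip(embeddings, assignment)]


# Toy data: three items with 2-d modality features and 2-d ID embeddings.
item_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
item_embeddings = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]

groups = merge_similar(item_features, threshold=0.9)
enhanced = add_group_signal(item_embeddings, groups)
```

With this toy data the first two items fall into one group and the third stays alone, so the first two embeddings each receive their shared group mean. A fuller sketch would repeat the grouping over several levels and per modality, which the summary indicates but does not detail.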