MM-HGNN: Multimodal Representation Learning Heterogeneous Graph Neural Network

Bibliographic Details
Main Authors: Khalil Bachiri, Ali Yahyaouy, Maria Malek, Nicoleta Rogovschi
Format: Article
Language: English
Published: Springer 2025-07-01
Series: International Journal of Computational Intelligence Systems
Online Access: https://doi.org/10.1007/s44196-025-00820-9
Description
Summary: Multimodal learning on heterogeneous graphs is challenging because of the diverse structures and data modalities involved. Existing graph neural networks cannot efficiently capture both the multimodality of the data and the inherent heterogeneity of such graphs. In this paper, we propose the Multimodal Representation Learning Heterogeneous Graph Neural Network (MM-HGNN) to tackle these challenges. MM-HGNN introduces a novel Modality Transferability Function to quantify the heterogeneity between different modalities, which allows the model to dynamically adjust attention scores and prioritize unique, non-redundant information. Additionally, it integrates modality-level attention that adaptively distributes attention over the modalities according to their relevance, enhancing feature representations for tasks such as node classification. To further improve representation learning, a splicing mechanism integrates outputs from multiple network layers, combining high-level features into more expressive node embeddings. We validate the effectiveness of MM-HGNN through extensive experiments on the IMDB and Amazon datasets. Our model outperforms several state-of-the-art methods by a large margin on the Macro-F1, Micro-F1, and AUC metrics, demonstrating its strong capability in handling challenging multimodal and heterogeneous data. Comprehensive ablation studies further highlight the contribution of each key component to the overall performance.
ISSN: 1875-6883
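
Illustrative sketch: The summary names two mechanisms that are easy to picture in code: modality-level attention, which weights each modality by its relevance, and splicing, which combines outputs from several layers. The Python below is a minimal sketch of those two ideas only, not the authors' implementation. The function names, the `query` vector, the dot-product relevance score, and the toy inputs are all assumptions made for illustration. The record does not define the Modality Transferability Function, so it is not sketched here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def modality_attention(modality_embs, query):
    """Fuse one node's per-modality embeddings with attention weights.

    modality_embs: dict mapping a modality name to a vector (equal lengths).
    query: hypothetical learned vector; relevance is assumed to be a dot product.
    Returns the fused vector and the weight given to each modality.
    """
    names = list(modality_embs)
    scores = [sum(q * x for q, x in zip(query, modality_embs[n])) for n in names]
    weights = softmax(scores)
    dim = len(query)
    fused = [
        sum(w * modality_embs[n][i] for w, n in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


def splice(layer_outputs):
    """Concatenate one node's outputs from several layers into one embedding."""
    return [value for output in layer_outputs for value in output]


# Toy node with two modalities; the query favours the text modality.
fused, weights = modality_attention(
    {"text": [1.0, 0.0], "image": [0.0, 1.0]},
    query=[1.0, 0.0],
)

# Assume a second layer produced [0.5, 0.5] for the same node.
embedding = splice([fused, [0.5, 0.5]])
```

In this sketch splicing is assumed to be plain concatenation, so the final embedding length is the sum of the per-layer output sizes. The article may combine layers differently.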