MM-HGNN: Multimodal Representation Learning Heterogeneous Graph Neural Network

Bibliographic Details
Main Authors: Khalil Bachiri, Ali Yahyaouy, Maria Malek, Nicoleta Rogovschi
Format: Article
Language: English
Published: Springer 2025-07-01
Series: International Journal of Computational Intelligence Systems
Online Access: https://doi.org/10.1007/s44196-025-00820-9
Collection: DOAJ
Description: Multimodal learning on heterogeneous graphs is challenging because of the diverse structures and data modalities involved. Existing graph neural networks cannot efficiently capture both the multimodality of the data and the inherent heterogeneity of such graphs. In this paper, we propose the Multimodal Representation Learning Heterogeneous Graph Neural Network (MM-HGNN) to tackle these challenges. MM-HGNN introduces a novel Modality Transferability Function that quantifies the heterogeneity between different modalities, which allows the model to dynamically adjust the attention scores and give precedence to unique, non-redundant information. Additionally, it integrates modality-level attention that adaptively distributes attention over the modalities according to their relevance, enhancing feature representations for tasks such as node classification. To further improve representation learning, a splicing mechanism integrates the outputs of multiple network layers, combining high-level features into more expressive node embeddings. We validate the effectiveness of MM-HGNN through extensive experiments on the IMDB and Amazon datasets. Our model outperforms several state-of-the-art methods by a large margin on the Macro-F1, Micro-F1, and AUC metrics, which demonstrates its strong capability in handling challenging multimodal and heterogeneous data. Comprehensive ablation studies further highlight the contribution of each key component to the overall performance.
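The description names modality-level attention and a splicing mechanism without giving formulas. The sketch below is only a rough illustration of those two ideas, not the authors' implementation: the scoring rule (a dot product between a hypothetical `query` vector and the tanh of each modality vector), the function names, and the toy inputs are all assumptions, and the Modality Transferability Function is left out because the record does not define it.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def modality_attention(modality_embeddings, query):
    """Fuse one node's per-modality vectors using softmax attention weights.

    Assumed scoring rule: dot product of `query` with tanh of each vector.
    The record does not specify how MM-HGNN scores modalities.
    """
    scores = [
        sum(q * math.tanh(x) for q, x in zip(query, vec))
        for vec in modality_embeddings
    ]
    weights = softmax(scores)
    dim = len(modality_embeddings[0])
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, modality_embeddings))
        for i in range(dim)
    ]
    return fused, weights


def splice(layer_outputs):
    """Concatenate a node's outputs from several layers into one embedding."""
    return [value for output in layer_outputs for value in output]


# Hypothetical toy inputs: a text vector and an image vector for one node.
text_vec = [0.2, -0.1, 0.4]
image_vec = [0.9, 0.3, -0.5]
query = [1.0, 0.5, -1.0]

fused, weights = modality_attention([text_vec, image_vec], query)
# Stand-ins for the outputs of two network layers.
embedding = splice([text_vec, fused])
```

Here the attention weights sum to one across modalities, and splicing two 3-dimensional layer outputs yields a 6-dimensional node embedding.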
Record ID: doaj-art-889c6fe1f17c4879949518218ba2f7da
Institution: Kabale University
ISSN: 1875-6883
Volume/Issue: 18 (1)
Author Affiliations:
Khalil Bachiri: ETIS Laboratory, ENSEA, UMR8051, CNRS, CY Cergy Paris University
Ali Yahyaouy: L3IA Laboratory, Faculty of Sciences Dhar El Mahraz, Sidi Mohamed Ben Abdellah University
Maria Malek: ETIS Laboratory, ENSEA, UMR8051, CNRS, CY Cergy Paris University
Nicoleta Rogovschi: LIPADE Laboratory, University Paris Cité
Subjects: Multimodal learning; Heterogeneous graph neural networks; Modality transferability; Attention mechanism; Graph representation learning; Node classification