MM-HGNN: Multimodal Representation Learning Heterogeneous Graph Neural Network
Abstract: Multimodal learning on heterogeneous graphs is very challenging because of their diverse structures and data modalities. Existing graph neural networks cannot efficiently capture both the multimodality of the data and the inherent heterogeneity of such graphs. In this paper, we propose Mult...
| Main Authors: | Khalil Bachiri, Ali Yahyaouy, Maria Malek, Nicoleta Rogovschi |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-07-01 |
| Series: | International Journal of Computational Intelligence Systems |
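| Subjects: | Multimodal learning; Heterogeneous graph neural networks; Modality transferability; Attention mechanism; Graph representation learning; Node classification |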
| Online Access: | https://doi.org/10.1007/s44196-025-00820-9 |
| _version_ | 1849234457008013312 |
|---|---|
| author | Khalil Bachiri; Ali Yahyaouy; Maria Malek; Nicoleta Rogovschi |
| collection | DOAJ |
| description | Abstract: Multimodal learning on heterogeneous graphs is very challenging because of their diverse structures and data modalities. Existing graph neural networks cannot efficiently capture both the multimodality of the data and the inherent heterogeneity of such graphs. In this paper, we propose the Multimodal Representation Learning Heterogeneous Graph Neural Network (MM-HGNN) to tackle these challenges. MM-HGNN introduces a novel Modality Transferability Function that quantifies the heterogeneity between modalities, allowing the model to dynamically adjust its attention scores and give precedence to unique, non-redundant information. It also integrates modality-level attention, which distributes attention adaptively over the modalities according to their relevance and thereby enhances feature representations for tasks such as node classification. To further improve representation learning, a splicing mechanism integrates the outputs of multiple network layers, combining high-level features into more expressive node embeddings. We validate the effectiveness of MM-HGNN through extensive experiments on the IMDB and Amazon datasets. Our model outperforms several state-of-the-art methods by a large margin on the Macro-F1, Micro-F1, and AUC metrics, demonstrating its strong capability in handling challenging multimodal and heterogeneous data. Comprehensive ablation studies further highlight the contribution of each key component to the overall performance. |
| format | Article |
| id | doaj-art-889c6fe1f17c4879949518218ba2f7da |
| institution | Kabale University |
| issn | 1875-6883 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Springer |
| record_format | Article |
| series | International Journal of Computational Intelligence Systems |
| spelling | doaj-art-889c6fe1f17c4879949518218ba2f7da; 2025-08-20T04:03:07Z; eng; Springer; International Journal of Computational Intelligence Systems; 1875-6883; 2025-07-01; 181126; 10.1007/s44196-025-00820-9; MM-HGNN: Multimodal Representation Learning Heterogeneous Graph Neural Network; Khalil Bachiri (ETIS Laboratory, ENSEA, UMR8051, CNRS, CY Cergy Paris University); Ali Yahyaouy (L3IA Laboratory, Faculty of Sciences Dhar El Mahraz, Sidi Mohamed Ben Abdellah University); Maria Malek (ETIS Laboratory, ENSEA, UMR8051, CNRS, CY Cergy Paris University); Nicoleta Rogovschi (LIPADE Laboratory, University Paris Cité); https://doi.org/10.1007/s44196-025-00820-9 |
| title | MM-HGNN: Multimodal Representation Learning Heterogeneous Graph Neural Network |
| topic | Multimodal learning; Heterogeneous graph neural networks; Modality transferability; Attention mechanism; Graph representation learning; Node classification |
| url | https://doi.org/10.1007/s44196-025-00820-9 |