GOMFuNet: A Geometric Orthogonal Multimodal Fusion Network for Enhanced Prediction Reliability
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Mathematics |
| Online Access: | https://www.mdpi.com/2227-7390/13/11/1791 |
| Summary: | Integrating information from heterogeneous data sources poses significant mathematical challenges, particularly in ensuring the reliability and reducing the uncertainty of predictive models. This paper introduces the Geometric Orthogonal Multimodal Fusion Network (GOMFuNet), a novel mathematical framework designed to address these challenges. GOMFuNet combines two core mathematical principles: (1) it uses geometric deep learning, specifically Graph Convolutional Networks (GCNs), within its Cross-Modal Label Fusion Module (CLFM) to perform fusion in a high-level semantic label space, thereby preserving inter-sample topological relationships and enhancing robustness to inconsistencies; and (2) it incorporates a novel Label Confidence Learning Module (LCLM), derived from optimization theory, which explicitly enhances prediction reliability by enforcing mathematical orthogonality among the predicted class probability vectors, directly minimizing output uncertainty. We demonstrate GOMFuNet’s effectiveness through comprehensive experiments, including confidence calibration analysis and robustness tests, and validate its practical utility via a case study on educational performance prediction using structured, textual, and audio data. Results show GOMFuNet achieves significantly improved performance (90.17% classification accuracy and 88.03% R² for regression) and enhanced reliability compared to baseline and state-of-the-art multimodal methods, validating its potential as a robust framework for reliable multimodal learning. |
| ISSN: | 2227-7390 |
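
For readers who want a concrete picture of the two mechanisms named in the summary, the sketch below illustrates them in plain PyTorch. It is an editorial illustration, not the paper's code: `SimpleGCNLayer`, `label_space_fusion`, `orthogonality_penalty`, and all shapes are hypothetical stand-ins for the CLFM's GCN-based fusion in the label space and the LCLM's orthogonality constraint on predicted class probability vectors.

```python
# Minimal sketch of the two ideas described in the summary. This is NOT the
# authors' implementation: all names, shapes, and the exact loss form are
# assumptions made for illustration.
import torch
import torch.nn as nn


class SimpleGCNLayer(nn.Module):
    """One graph convolution: symmetrically normalized adjacency, then a linear map."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # adj: (N, N) sample-graph adjacency with self-loops; x: (N, in_dim)
        deg = adj.sum(dim=1).clamp(min=1e-12)
        d_inv_sqrt = deg.pow(-0.5)
        norm_adj = d_inv_sqrt.unsqueeze(1) * adj * d_inv_sqrt.unsqueeze(0)
        return self.linear(norm_adj @ x)


def label_space_fusion(per_modality_logits, adj, gcn):
    """Fuse in the label space: concatenate each modality's class logits per
    sample, then propagate over the sample graph so predictions respect
    inter-sample topology (the CLFM idea, as described in the summary)."""
    fused = torch.cat(per_modality_logits, dim=1)  # (N, M * C)
    return gcn(fused, adj)                         # (N, C) fused class scores


def orthogonality_penalty(probs: torch.Tensor) -> torch.Tensor:
    """Penalize overlap between different samples' class-probability vectors.
    Off-diagonal entries of P @ P.T are inner products <p_i, p_j>, which vanish
    only when predictions are confident (near one-hot) and mutually orthogonal:
    one way to read the LCLM's orthogonality objective."""
    gram = probs @ probs.t()                       # (N, N)
    off_diag = gram - torch.diag(torch.diag(gram))
    n = probs.size(0)
    return off_diag.pow(2).sum() / max(n * (n - 1), 1)


# Toy usage with random per-modality logits for N samples, M modalities, C classes.
N, M, C = 8, 3, 4
adj = (torch.rand(N, N) > 0.7).float()
adj = ((adj + adj.t() + torch.eye(N)) > 0).float()  # symmetrize, add self-loops
gcn = SimpleGCNLayer(M * C, C)
logits = [torch.randn(N, C) for _ in range(M)]
fused_logits = label_space_fusion(logits, adj, gcn)
loss = orthogonality_penalty(torch.softmax(fused_logits, dim=1))
```

Because probability vectors are non-negative and sum to one, pairwise orthogonality can hold only when predictions are near one-hot with non-overlapping supports, which is one way such a penalty can be read as directly minimizing output uncertainty, as the summary claims for the LCLM.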