Dual-Layer Fusion Knowledge Reasoning with Enhanced Multi-modal Features

Bibliographic Details
Main Authors: JING Boxiang, WANG Hairong, WANG Tong, YANG Zhenye
Format: Article
Language: Chinese
Published: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press, 2025-02-01
Series: Jisuanji kexue yu tansuo
Subjects:
Online Access: http://fcst.ceaj.org/fileup/1673-9418/PDF/2312065.pdf
Description
Summary: Most existing multi-modal knowledge reasoning methods fuse the multi-modal features extracted by pre-trained models directly, through concatenation or attention, and thus often ignore the heterogeneity of the different modalities and the complexity of their interactions. To address this, a dual-layer fusion knowledge reasoning method with enhanced multi-modal features is proposed. The structural information embedding module uses an adaptive graph attention mechanism to filter and aggregate key neighbor information, enriching the semantic representation of entity and relation embeddings. The multi-modal information embedding module applies different attention mechanisms to the modality-specific features of each modality and to the features shared across modalities, and exploits the complementary information in the shared features for cross-modal interaction, reducing the heterogeneity gap between modalities. The multi-modal feature fusion module adopts a dual-layer strategy that combines low-rank multi-modal feature fusion with decision fusion, capturing the dynamic and complex intra- and inter-modal interactions while weighting each modality's contribution to inference, which yields more comprehensive predictions. To verify the effectiveness of the proposed method, experiments are carried out on the FB15K-237, DB15K and YAGO15K datasets. On the FB15K-237 dataset, MRR and Hits@1 improve by an average of 3.6% and 2.2%, respectively, over multi-modal reasoning methods, and by an average of 13.7% and 14.6%, respectively, over single-modal reasoning methods.
ISSN: 1673-9418
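
The dual-layer fusion strategy described in the summary (low-rank multi-modal feature fusion followed by decision fusion) can be illustrated with a minimal PyTorch sketch. The sketch below is a loose reconstruction of the general idea in the spirit of low-rank multimodal fusion, not the authors' implementation; the module names, dimensions, and rank hyperparameter are illustrative assumptions.

import torch
import torch.nn as nn


class LowRankFusion(nn.Module):
    """Layer 1: fuse structural, visual and textual embeddings with rank-R factors."""

    def __init__(self, dims, fused_dim, rank=4):
        super().__init__()
        # one factor tensor per modality; the +1 accounts for an appended bias term
        self.factors = nn.ParameterList(
            [nn.Parameter(torch.randn(rank, d + 1, fused_dim) * 0.02) for d in dims]
        )
        self.fusion_weights = nn.Parameter(torch.randn(1, rank) * 0.02)
        self.fusion_bias = nn.Parameter(torch.zeros(1, fused_dim))

    def forward(self, modal_embeddings):
        fused = None
        for z, w in zip(modal_embeddings, self.factors):
            ones = torch.ones(z.size(0), 1, device=z.device)
            z_aug = torch.cat([z, ones], dim=-1)             # (B, d+1)
            proj = torch.einsum("bd,rdf->brf", z_aug, w)     # (B, rank, fused_dim)
            # element-wise product realizes the low-rank multi-modal interaction
            fused = proj if fused is None else fused * proj
        # collapse the rank dimension with learned weights
        return torch.einsum("or,brf->bf", self.fusion_weights, fused) + self.fusion_bias


class DecisionFusion(nn.Module):
    """Layer 2: weight per-modality triple scores by a learned contribution degree."""

    def __init__(self, num_modalities):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_modalities))

    def forward(self, scores):                    # scores: (B, num_modalities)
        weights = torch.softmax(self.logits, dim=0)
        return scores @ weights                   # (B,) final prediction score

In such a design, LowRankFusion would produce a joint embedding from which a fused triple score can be computed, while DecisionFusion would combine the per-modality scores according to learned contribution weights, mirroring the abstract's point that each modality's contribution to inference is weighted in the final prediction.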