MKNNet: Knowledge-aligned multimodal transformer for information retrieval
With the rapid advancement of artificial intelligence and the Internet of Things, data collected from multiple sensing modalities is growing in both volume and complexity. In this paper, we propose a novel deep learning framework called MKNNet, which combines modality alignment, Transformer-...
| Main Authors: | Xiaoqin Lin, Chentao Han, Jian Yao, Yue Li, Xujun Wang, Shufeng Jia |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-08-01 |
| Series: | Alexandria Engineering Journal |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1110016825008051 |
Similar Items
- Uncertainty-aware coarse-to-fine alignment for text-image person retrieval
  by: Yifei Deng, et al.
  Published: (2025-04-01)
- Review on Key Techniques of Video Multimodal Sentiment Analysis
  by: DUAN Zongtao, HUANG Junchen, ZHU Xiaole
  Published: (2025-03-01)
- Cross modal recipe retrieval with fine grained modal interaction
  by: Fan Zhao, et al.
  Published: (2025-02-01)
- Survey of Multimodal Federated Learning: Exploring Data Integration, Challenges, and Future Directions
  by: Mumin Adam, et al.
  Published: (2025-01-01)
- DCLMA: Deep correlation learning with multi-modal attention for visual-audio retrieval
  by: Jiwei Zhang, et al.
  Published: (2025-09-01)