DCLMA: Deep correlation learning with multi-modal attention for visual-audio retrieval
The cross-modal retrieval task aims to retrieve audio modality information from the database that best matches the visual modality and vice versa. One of the key challenges in this field is the inconsistency of audio and visual features, which increases the complexity of capturing cross-modal inform...
| Main Authors: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-09-01 |
| Series: | Machine Learning with Applications |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2666827025000787 |