Multimodal Latent Representation Learning for Video Moment Retrieval
The rise of artificial intelligence (AI) has revolutionized the processing and analysis of video sensor data, driving advancements in areas such as surveillance, autonomous driving, and personalized content recommendations. However, leveraging video data presents unique challenges, particularly in t...
| Main Authors: | Jinkwon Hwang, Mingyu Jeon, Junyeong Kim |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Sensors |
| Online Access: | https://www.mdpi.com/1424-8220/25/14/4528 |
Similar Items
- Tencent Text-Video Retrieval: Hierarchical Cross-Modal Interactions With Multi-Level Representations
  by: Jie Jiang, et al.
  Published: (2025-01-01)
- A Comparative Analysis of the Zernike Moments for Single Object Retrieval
  by: Abu Bakar, et al.
  Published: (2019-06-01)
- A survey of multimodal composite editing and retrieval
  by: Suyan Li, et al.
  Published: (2025-07-01)
- Interactive Content Retrieval in Egocentric Videos Based on Vague Semantic Queries
  by: Linda Ablaoui, et al.
  Published: (2025-06-01)
- Exploring latent weight factors and global information for food-oriented cross-modal retrieval
  by: Wenyu Zhao, et al.
  Published: (2023-12-01)