Interactive Content Retrieval in Egocentric Videos Based on Vague Semantic Queries
Retrieving specific, often instantaneous, content from hours-long egocentric video footage based on hazily remembered details is challenging. Vision–language models (VLMs) have been employed to enable zero-shot text-based content retrieval from videos. However, they fall short if the textual query co...
| Main Authors: | Linda Ablaoui, Wilson Estecio Marcilio-Jr, Lai Xing Ng, Christophe Jouffrais, Christophe Hurter |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Multimodal Technologies and Interaction |
| Online Access: | https://www.mdpi.com/2414-4088/9/7/66 |
Similar Items
- Applying GA for Optimizing the User Query in Image and Video Retrieval
  by: Ehsan Lotfi
  Published: (2024-02-01)
- Crafting the Path: Robust Query Rewriting for Information Retrieval
  by: Ingeol Baek, et al.
  Published: (2025-01-01)
- Legal Query RAG
  by: Rahman S. M. Wahidur, et al.
  Published: (2025-01-01)
- Dialogue-to-Video Retrieval via Multi-Grained Attention Network
  by: Yi Yu, et al.
  Published: (2025-01-01)
- Comparison of zero-shot approach and retrieval-augmented generation for analyzing the tone of comments in the Ukrainian language
  by: M. Prytula, et al.
  Published: (2024-12-01)