Embedding-based pair generation for contrastive representation learning in audio-visual surveillance data
Smart cities deploy various sensors such as microphones and RGB cameras to collect data to improve the safety and comfort of the citizens. As data annotation is expensive, self-supervised methods such as contrastive learning are used to learn audio-visual representations for downstream tasks. Focusi...
| Main Authors: | Wei-Cheng Wang, Sander De Coninck, Sam Leroux, Pieter Simoens |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-01-01 |
| Series: | Frontiers in Robotics and AI |
| Online Access: | https://www.frontiersin.org/articles/10.3389/frobt.2024.1490718/full |
Similar Items
- The Impact of Using Audio-Visual Interactive Media in Learning Mathematics
  by: Linda Mardiani Setiawati, et al.
  Published: (2024-08-01)
- PENGGUNAAN MEDIA AUDIO VISUAL PADA MATA PELAJARAN PENDIDIKAN AGAMA ISLAM UNTUK MENINGKATKAN AKTIVITAS BELAJAR SISWA KELAS V SD N 09 PALEMBANG
  by: Ibrahim Ibrahim, et al.
  Published: (2024-01-01)
- Application of Audio-Visual Learning Media in Increasing Islamic Boarding School Students’ Tajweed Learning Outcomes
  by: Abdul Hamid, et al.
  Published: (2025-05-01)
- Audio-Language Datasets of Scenes and Events: A Survey
  by: Gijs Wijngaard, et al.
  Published: (2025-01-01)
- The Utilization of Audio-Visual Learning Media in Learning Islamic Cultural History for Grade 10 High School Students
  by: Wadza Umi Rosida, et al.
  Published: (2024-05-01)