Leveraging Multi-Modality and Enhanced Temporal Networks for Robust Violence Detection

In this paper, we present a novel model that improves violence detection performance by extending the dual-modality TEVAD model, which originally leveraged visual and textual information, into a multi-modal framework that integrates visual, audio, and textual data. Additionally, we refine the multi-scale temporal network (MT...

Bibliographic Details
Main Authors: Gwangho Na, Jaepil Ko, Kyungjoo Cheoi
Format: Article
Language: English
Published: MDPI AG 2024-10-01
Series: Machine Learning and Knowledge Extraction
Online Access: https://www.mdpi.com/2504-4990/6/4/119