Vision transformer embedded video anomaly detection using attention driven recurrence
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-09-01 |
| Series: | Array |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2590005625000980 |
| Summary: | Automated video anomaly detection (VAD) is a challenging task because anomalies are context-dependent and occur only sporadically. Recent deep learning advances, however, offer promising solutions. In this paper, we propose a novel framework for detecting anomalies in videos by analyzing spatial and temporal (spatio-temporal) features. We address challenges such as the processing of lengthy videos and the sparse occurrence of anomalies by segmenting and labeling the anomalous parts within videos. We employ a modified pre-trained vision transformer for video feature extraction, leveraging its ability to capture complex spatio-temporal patterns and the global context. Additionally, we incorporate a parameter-efficient recurrent model, the Simple Recurrent Unit Plus Plus (SRU++), which processes long sequences of video embeddings efficiently, reducing computational cost tenfold compared to traditional methods. To further enhance multiclass prediction performance, we develop a cluster-based weighting mechanism that assigns weights to classification scores based on feature similarity. We extensively evaluated our approach on three popular datasets, UCF-Crime, RWF-2000, and Smart City CCTV Violence Detection (SCVD), achieving superior performance compared to state-of-the-art methods, which makes it well-suited for real-world surveillance applications. |
| ISSN: | 2590-0056 |
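
The summary mentions a cluster-based weighting mechanism that weights classification scores by feature similarity, but this record does not say how the weights are computed. The following is only a minimal sketch of one plausible reading, not the authors' method. It assumes one centroid per class, cosine similarity between a clip embedding and each centroid, and a softmax with a temperature to turn similarities into weights. The function names, the `temperature` parameter, and the toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cluster_weighted_scores(feature, centroids, scores, temperature=0.1):
    """Reweight per-class scores by similarity to assumed per-class centroids.

    Assumption: centroids[i] is the cluster centre for class i, and
    scores[i] is the classifier's score for class i.
    """
    sims = [cosine(feature, c) for c in centroids]
    weights = softmax([s / temperature for s in sims])
    weighted = [w * p for w, p in zip(weights, scores)]
    total = sum(weighted)
    # Renormalise so the result sums to 1; fall back to the raw scores if all mass vanished.
    return [v / total for v in weighted] if total > 0 else list(scores)


# Toy example: three classes in a 2-D feature space (illustrative values only).
centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
scores = [0.40, 0.35, 0.25]   # raw classifier scores, class 0 narrowly ahead
feature = [0.1, 0.9]          # embedding that lies close to the class-1 centroid
reweighted = cluster_weighted_scores(feature, centroids, scores)
print(reweighted)
```

In this toy case the embedding sits nearest the second centroid, so the reweighting shifts the prediction from class 0 to class 1. A lower `temperature` makes the similarity term dominate the raw scores, and a higher one leaves them closer to unchanged.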