Design of an integrated model with temporal graph attention and transformer-augmented RNNs for enhanced anomaly detection

Bibliographic Details
Main Authors: Sai Babu Veesam, Aravapalli Rama Satish, Sreenivasulu Tupakula, Yuvaraju Chinnam, Krishna Prakash, Shonak Bansal, Mohammad Rashed Iqbal Faruque
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Reports
Subjects:
Online Access: https://doi.org/10.1038/s41598-025-85822-5
Description
Summary: Abstract Efficient anomaly detection in camera surveillance systems is increasingly important for improving public safety in complex environments. Most available methods fail to capture long-term temporal dependencies and spatial correlations, especially in dynamic multi-camera settings. Many traditional methods also rely heavily on large labeled datasets and generalize poorly when encountering unseen anomalies. We introduce a new framework that addresses these challenges by incorporating state-of-the-art deep learning models to improve temporal and spatial context modeling. We combine Recurrent Neural Networks (RNNs) with Graph Attention Networks (GATs) to model long-term dependencies across spatially distributed cameras. The Transformer-Augmented RNN improves on standard RNNs by using self-attention mechanisms for more robust temporal modeling. We employ a Multimodal Variational Autoencoder (MVAE) that fuses video, audio, and motion sensor information in a manner resistant to noise and missing samples. To address the scarcity of labeled anomalies, we apply Prototypical Networks for few-shot learning, enabling generalization from only a few examples. A Spatiotemporal Autoencoder then performs unsupervised anomaly detection by learning normal behavior patterns and treating deviations from them as anomalies. The proposed methods yield improvements of about 10% to 15% in precision, recall, and F1-score over traditional models. Furthermore, the framework generalizes to unseen anomalies, with gains of up to +20% on novel event detection, a major advancement for real-world surveillance systems.
ISSN: 2045-2322
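
The abstract combines graph attention across cameras with a transformer-augmented RNN over time. The sketch below is not the authors' released code; it is a minimal PyTorch illustration of one plausible arrangement, in which the layer sizes, tensor shapes, and the temporal-then-spatial ordering are assumptions, and a generic multi-head attention layer stands in for a full GAT (no adjacency matrix or edge features).

```python
# Minimal sketch (assumed shapes and sizes), not the paper's implementation.
import torch
import torch.nn as nn

class TransformerAugmentedRNN(nn.Module):
    """GRU for sequential context, followed by self-attention over time steps."""
    def __init__(self, feat_dim: int, hidden_dim: int = 128, num_heads: int = 4):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        self.attn = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=num_heads, batch_first=True)

    def forward(self, x):            # x: (cameras, time, feat_dim)
        h, _ = self.rnn(x)           # (cameras, time, hidden_dim)
        return self.attn(h)          # self-attention refines the temporal context

class CameraGraphAttention(nn.Module):
    """Attention across camera nodes at each time step (stand-in for a GAT)."""
    def __init__(self, hidden_dim: int = 128):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads=1, batch_first=True)

    def forward(self, h):            # h: (time, cameras, hidden_dim)
        out, _ = self.attn(h, h, h)  # each camera attends to every other camera
        return out

# Toy usage: 4 cameras, 16-frame clips, 64-dim per-frame features (all assumed).
temporal = TransformerAugmentedRNN(feat_dim=64)
spatial = CameraGraphAttention()
clips = torch.randn(4, 16, 64)       # (cameras, time, features)
h = temporal(clips)                  # per-camera temporal embeddings
h = spatial(h.transpose(0, 1))       # attention across cameras per frame
print(h.shape)                       # torch.Size([16, 4, 128])
```

In the full framework described in the abstract, embeddings of this kind would feed the downstream components (the MVAE fusion, Prototypical Networks, and the Spatiotemporal Autoencoder whose reconstruction error flags anomalies); those stages are omitted here.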