Design of an Improved Model for Anomaly Detection in CCTV Systems Using Multimodal Fusion and Attention-Based Networks
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/10876563/ |
| Summary: | Traditional video-analysis approaches often mischaracterize anomalies: they usually rely on single-modality input and handle complex temporal patterns poorly. This paper addresses these limitations by proposing a comprehensive scheme for multimodal Closed-Circuit Television (CCTV) video analysis. The scheme combines a Multimodal Deep Boltzmann Machine (MDBM), a Multimodal Variational Autoencoder (MVAE) and Attention-based Fusion Networks, all of which exploit the learned representations. The MDBM learns shared representations from heterogeneous data sources, the MVAE captures the underlying distribution of the modalities, and the attention mechanism in the fusion networks emphasizes important features. Finally, temporal context is modeled using Long Short-Term Memory (LSTM) networks, Temporal Convolutional Networks (TCNs) and Transformer networks with temporal encoding. LSTMs capture long-range dependencies in sequential data, TCNs model temporal patterns efficiently with convolutional layers, and Transformer networks weigh temporal features against one another through self-attention, which improves detection accuracy for anomalies that unfold over a long duration. The proposed models improve anomaly-detection performance. In particular, the reported results are a 5% improvement in accuracy with MDBM, a 15% reduction in the false positive rate with MVAE, an improvement of more than 10% in F1-score with the attentive fusion network, a 20% reduction in reconstruction error with a Deep Convolutional Autoencoder (DCA), a 12% improvement in detection precision with Adversarially Learned Inference (ALI) and an 8% gain in Area Under the Curve (AUC) with Deep InfoMax (DIM). |
|---|---|
| ISSN: | 2169-3536 |
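
The summary describes attention-based fusion as a way to emphasize important features across modalities. This record does not give the paper's actual architecture, so the following is only a minimal sketch of the general idea, assuming scaled dot-product scoring against a query vector. The function names, the modality names (`video`, `audio`) and the fixed query are hypothetical; in a trained model the query and the feature vectors would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse per-modality feature vectors into one vector.

    features: dict mapping a modality name to a list of floats
              (all vectors share the length of `query`).
    query:    list of floats; assumed here to be given, whereas a real
              network would learn it.
    Returns (weights, fused): a dict of attention weights per modality
    and the weighted sum of the feature vectors.
    """
    names = list(features)
    dim = len(query)
    # Scaled dot-product score of each modality against the query.
    scores = [
        sum(f * q for f, q in zip(features[name], query)) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    # Weighted sum of the modality vectors, one output value per dimension.
    fused = [
        sum(w * features[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return dict(zip(names, weights)), fused


# Hypothetical two-modality example: the query aligns with the video features.
example_features = {"video": [1.0, 0.0], "audio": [0.0, 1.0]}
example_weights, example_fused = attention_fuse(example_features, [1.0, 0.0])
```

In this example the video modality receives the larger weight because its features align with the query, which is the sense in which attention "stresses" some features over others.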