CABAD: A video dataset for benchmarking child aggression recognition


Bibliographic Details
Main Authors: Shehzad Ali, Md Tanvir Islam, Ik Hyun Lee, Mohammad Hijji, Khan Muhammad
Format: Article
Language: English
Published: Elsevier 2025-08-01
Series: Alexandria Engineering Journal
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S1110016825007859
Description
Summary: Recognizing aggressive behaviors in children is critical for early intervention and support in managing emotional and behavioral challenges. Despite the advancement of AI-based monitoring systems across various domains, research on child aggression recognition remains scarce due to the lack of available datasets. Consequently, traditional approaches rely on observation and subjective reporting, which hinders timely responses in critical situations. Addressing this gap, we introduce a new dataset, the Child Aggression Behavior Analysis Dataset (CABAD), comprising 900 videos collected from open-source platforms and covering six aggression classes: “Hitting,” “Throwing Object,” “Kicking Surface,” “Breaking Object,” “Slamming Door,” and “Pushing.” Leveraging CABAD, we propose CABA_Net, a multi-stage deep-learning framework that integrates MobileViT for spatial feature extraction, Temporal Convolutional Networks (TCNs) for sequential modeling, and an Attention LSTM for refined temporal attention over behavioral patterns. Experimental results show that CABA_Net significantly improves child aggression recognition. Compared to CNN-based models, it achieves up to 20.9% higher accuracy while maintaining a competitive frame rate. Against Transformer-based architectures, it offers a 5.8–9.9% gain in accuracy, 14–16.9% lower energy requirements, up to 14.8% less memory usage, and a 21–23% reduction in implementation cost. Compared to recent SOTA models such as VideoMamba and the Taylor-transformed skeleton approach, CABA_Net improves accuracy by 8.4% and 9.0%, respectively, while maintaining a balanced model size of 282.37 MB, improved scalability (+6.2%), and lower latency. These quantitative improvements reflect a strong balance between accuracy, stability, and computational efficiency, confirming CABA_Net’s effectiveness in recognizing aggressive activities in children.
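The three-stage pipeline described in the abstract — per-frame spatial features, causal temporal convolution, then attention-weighted recurrent pooling — can be illustrated with a minimal NumPy sketch. This is not the published CABA_Net: the real model uses a trained MobileViT backbone, stacked TCN blocks, and a learned Attention LSTM head, whereas here every component is a toy stand-in with random, untrained weights, and all dimensions (64-d features, 16-frame clips, 6 classes) are illustrative assumptions chosen only to show how the stages connect.

```python
import numpy as np

rng = np.random.default_rng(0)

def spatial_features(frames, dim=64):
    # Stand-in for the MobileViT backbone: each frame is mapped to a
    # feature vector via a fixed random projection of its pixels.
    T = frames.shape[0]
    flat = frames.reshape(T, -1)
    W = rng.standard_normal((flat.shape[1], dim)) * 0.01
    return flat @ W                               # (T, dim)

def tcn_layer(x, kernel=3, dilation=1):
    # One causal dilated 1D convolution over time (the core TCN idea):
    # output at t depends only on frames at t, t-d, t-2d, ...
    T, D = x.shape
    W = rng.standard_normal((kernel, D, D)) * 0.05
    out = np.zeros_like(x)
    for t in range(T):
        for k in range(kernel):
            idx = t - k * dilation
            if idx >= 0:
                out[t] += x[idx] @ W[k]
    return np.maximum(out, 0)                     # ReLU

def attention_lstm(x):
    # Simplified single-layer LSTM followed by additive attention
    # pooling over the hidden states.
    T, D = x.shape
    H = D
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    Wx = rng.standard_normal((D, 4 * H)) * 0.05
    Wh = rng.standard_normal((H, 4 * H)) * 0.05
    h, c = np.zeros(H), np.zeros(H)
    hs = []
    for t in range(T):
        z = x[t] @ Wx + h @ Wh
        i, f, o, g = np.split(z, 4)
        c = sig(f) * c + sig(i) * np.tanh(g)
        h = sig(o) * np.tanh(c)
        hs.append(h)
    hs = np.stack(hs)                             # (T, H)
    v = rng.standard_normal(H)                    # attention query
    scores = hs @ v
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                          # weights over time
    return alpha @ hs                             # weighted summary

def caba_net_sketch(frames, n_classes=6):
    feats = spatial_features(frames)              # stage 1: spatial
    feats = tcn_layer(feats, dilation=1)          # stage 2: temporal
    feats = tcn_layer(feats, dilation=2)
    pooled = attention_lstm(feats)                # stage 3: attention
    W_out = rng.standard_normal((pooled.shape[0], n_classes)) * 0.05
    return pooled @ W_out                         # class logits

# Toy clip: 16 frames of 32x32 grayscale video.
clip = rng.standard_normal((16, 32, 32))
logits = caba_net_sketch(clip)
print(logits.shape)  # (6,) — one logit per aggression class
```

The stacking order mirrors the abstract: convolutional temporal modeling first gives each time step local context, and the attention step then decides which moments of the clip matter most for the final prediction.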
ISSN: 1110-0168