A novel YOLO LSTM approach for enhanced human action recognition in video sequences

Bibliographic Details
Main Authors: Mahmoud Elnady, Hossam E. Abdelmunim
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-01898-z
Description
Summary: Human Action Recognition (HAR) is a critical task in computer vision with applications in surveillance, healthcare, and human–computer interaction. This paper introduces a novel approach combining the strengths of You Only Look Once (YOLO) for feature extraction and Long Short-Term Memory (LSTM) networks for temporal modeling to achieve robust and accurate action recognition in video sequences. The YOLO model efficiently identifies key features from individual frames, enabling real-time processing, while the LSTM network captures temporal dependencies to understand sequential dynamics in human movements. The proposed YOLO–LSTM framework is evaluated on multiple publicly available HAR datasets, achieving an accuracy of 96%, precision of 96%, recall of 97%, and F1-score of 96% on the UCF101 dataset; 99% across all metrics on the KTH dataset; 100% on the WEIZMANN dataset; and 98% on the IXMAS dataset. These results demonstrate the superior performance of our approach compared to existing methods in terms of both accuracy and processing speed. Additionally, this approach effectively handles challenges such as occlusions, varying illumination, and complex backgrounds, making it suitable for real-world applications. The results highlight the potential of combining object detection and recurrent architectures for advancing state-of-the-art HAR systems.
ISSN: 2045-2322
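
This record carries only the abstract, but the pipeline it describes (per-frame YOLO features feeding an LSTM for temporal modeling) can be illustrated with a minimal PyTorch sketch. The paper's exact backbone and layer sizes are not given here, so the small convolutional extractor below merely stands in for a YOLO backbone, and all dimensions (feat_dim, hidden, clip length, 112x112 frames) are illustrative assumptions, not the authors' configuration.

import torch
import torch.nn as nn

class FrameFeatureExtractor(nn.Module):
    # Stand-in for a YOLO backbone: maps one RGB frame to a feature vector.
    # (Hypothetical layers; the real system would reuse YOLO's detector features.)
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool -> (B, 64, 1, 1)
        )
        self.proj = nn.Linear(64, feat_dim)

    def forward(self, x):
        return self.proj(self.conv(x).flatten(1))

class YoloLstmClassifier(nn.Module):
    # Per-frame features -> LSTM over time -> action-class logits.
    def __init__(self, num_classes, feat_dim=256, hidden=128):
        super().__init__()
        self.extractor = FrameFeatureExtractor(feat_dim)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clip):
        # clip: (batch, time, channels, height, width)
        b, t, c, h, w = clip.shape
        feats = self.extractor(clip.view(b * t, c, h, w)).view(b, t, -1)
        _, (h_n, _) = self.lstm(feats)  # h_n: (num_layers, batch, hidden)
        return self.head(h_n[-1])       # classify from the last hidden state

# Example: a batch of 2 clips, 16 frames each, 101 classes as in UCF101.
model = YoloLstmClassifier(num_classes=101)
logits = model(torch.randn(2, 16, 3, 112, 112))
print(logits.shape)  # torch.Size([2, 101])

The LSTM-over-frame-features structure is the part the abstract actually describes; everything else is a placeholder to make the sketch self-contained and runnable.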