A New Efficient Hybrid Technique for Human Action Recognition Using 2D Conv-RBM and LSTM with Optimized Frame Selection

Bibliographic Details
Main Authors:	Majid Joudaki, Mehdi Imani, Hamid R. Arabnia
Format:	Article
Language:	English
Published:	MDPI AG 2025-02-01
Series:	Technologies
Subjects:	action recognition; convolutional restricted Boltzmann machine; long short-term memory; spatial–temporal feature extraction; video processing
Online Access:	https://www.mdpi.com/2227-7080/13/2/53

_version_	1850231558708920320
author	Majid Joudaki, Mehdi Imani, Hamid R. Arabnia
author_facet	Majid Joudaki, Mehdi Imani, Hamid R. Arabnia
author_sort	Majid Joudaki
collection	DOAJ
description	Recognizing human actions through video analysis has gained significant attention in applications like surveillance, sports analytics, and human–computer interaction. While deep learning models such as 3D convolutional neural networks (CNNs) and recurrent neural networks (RNNs) deliver promising results, they often struggle with computational inefficiencies and inadequate spatial–temporal feature extraction, hindering scalability to larger datasets or high-resolution videos. To address these limitations, we propose a novel model combining a two-dimensional convolutional restricted Boltzmann machine (2D Conv-RBM) with a long short-term memory (LSTM) network. The 2D Conv-RBM efficiently extracts spatial features such as edges, textures, and motion patterns while preserving spatial relationships and reducing parameters via weight sharing. These features are subsequently processed by the LSTM to capture temporal dependencies across frames, enabling effective recognition of both short- and long-term action patterns. Additionally, a smart frame selection mechanism minimizes frame redundancy, significantly lowering computational costs without compromising accuracy. Evaluation on the KTH, UCF Sports, and HMDB51 datasets demonstrated superior performance, achieving accuracies of 97.3%, 94.8%, and 81.5%, respectively. Compared to traditional approaches like 2D RBM and 3D CNN, our method offers notable improvements in both accuracy and computational efficiency, presenting a scalable solution for real-time applications in surveillance, video security, and sports analytics.
format	Article
id	doaj-art-d3b709e1bcab4971b49b2a5608ef606f
institution	OA Journals
issn	2227-7080
language	English
publishDate	2025-02-01
publisher	MDPI AG
record_format	Article
series	Technologies
spelling	doaj-art-d3b709e1bcab4971b49b2a5608ef606f; 2025-08-20T02:03:30Z; eng; MDPI AG; Technologies; ISSN 2227-7080; 2025-02-01; vol. 13, no. 2, art. 53; DOI 10.3390/technologies13020053; Majid Joudaki (Electrical and Computer Engineering, University of Kashan, Kashan 8731753153, Iran); Mehdi Imani (Department of Computer and System Sciences, Stockholm University, 10691 Stockholm, Sweden); Hamid R. Arabnia (School of Computing, University of Georgia, Athens GA 30602, USA); https://www.mdpi.com/2227-7080/13/2/53
title	A New Efficient Hybrid Technique for Human Action Recognition Using 2D Conv-RBM and LSTM with Optimized Frame Selection
title_sort	new efficient hybrid technique for human action recognition using 2d conv rbm and lstm with optimized frame selection
topic	action recognition; convolutional restricted Boltzmann machine; long short-term memory; spatial–temporal feature extraction; video processing
url	https://www.mdpi.com/2227-7080/13/2/53
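
Illustrative note on the frame selection step: the description above says a frame selection mechanism reduces frame redundancy before the 2D Conv-RBM and LSTM stages, but this record does not state the selection criterion. The sketch below is therefore an assumption, not the authors' method. It keeps a frame only when its mean absolute pixel difference from the last kept frame exceeds a threshold. The names `mean_abs_diff` and `select_frames` and the `threshold` parameter are hypothetical, and frames are represented as flat lists of grayscale intensities for simplicity.

```python
from typing import List, Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and have the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_frames(frames: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame.

    Hypothetical redundancy filter (an assumption, not taken from the paper):
    the first frame is always kept, and a later frame is kept only if its
    mean absolute difference from the most recently kept frame exceeds
    `threshold`.
    """
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


if __name__ == "__main__":
    # Toy clip of five 4-pixel frames; frames 1 and 3 are near-duplicates.
    clip = [
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.1, 0.0, 0.0],
        [1.0, 1.0, 0.0, 0.0],
        [1.0, 1.0, 0.0, 0.1],
        [0.0, 0.0, 1.0, 1.0],
    ]
    print(select_frames(clip, threshold=0.2))
```

In a pipeline of the kind the description outlines, only the frames at the returned indices would be passed to the spatial feature extractor, so the temporal model sees a shorter sequence. A larger `threshold` discards more frames at the risk of dropping brief motions.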