Human Pose Estimation and Event Recognition via Feature Extraction and Neuro-Fuzzy Classifier

Bibliographic Details
Main Authors: Muhammad Hanzla, Naif S. Alshammari, Shuaa S. Alharbi, Wasim Wahid, Nouf Abdullah Almujally, Ahmad Jalal, Hui Liu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10870209/
Description
Summary: This paper introduces an advanced framework for Human Pose Estimation (HPE) and Semantic Event Classification (SEC), addressing the growing demand for sophisticated human skeleton modeling, context-aware feature extraction, and machine learning techniques for accurate event recognition in real-world scenarios. HPE, a fundamental task in applications such as sports analytics and surveillance systems, involves predicting human joint locations from visual data. Despite substantial progress driven by deep learning advancements, particularly in handling occlusions and crowded environments, a comprehensive integration of these innovations remains limited in current literature. To bridge this gap, we propose a novel HPE and SEC system featuring a robust six-stage pipeline. The workflow begins with preprocessing steps, including video-to-image sequence conversion, floor removal to reduce noise, grayscale transformation, and human silhouette extraction via binary masks. Human detection is performed using the GrabCut algorithm. Full-body feature extraction incorporates 1-D Fourier Transform, 2-D energy-based motion analysis, 0–180° intensity mapping via the Radon Transform algorithm, and movable body part detection using Fourier-Based Correlation. Key point features are characterized through the degree of freedom, human landmark detection via the HSV algorithm, and angular point analysis using the MediaPipe framework. Features are then optimized using the Ray Optimizer and classified through a Neuro-Fuzzy classifier. The proposed system achieves classification accuracies of 94.82%, 92.33%, and 93.63% on the UCF Sports Actions, Human Action Recognition (HAR), and MPII Human Pose datasets, respectively, outperforming state-of-the-art methods and demonstrating its effectiveness in tackling real-world challenges.
ISSN: 2169-3536
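
Illustrative sketch (not part of the record, and not the authors' implementation): the summary mentions grayscale transformation, silhouette extraction via binary masks, and a 1-D Fourier Transform feature. The Python snippet below shows one way such steps could be chained on a single frame. It assumes NumPy is available, substitutes a plain intensity threshold for the GrabCut-based detection named in the summary, and uses hypothetical function names (`to_grayscale`, `silhouette_mask`, `fourier_profile_descriptor`) and a hypothetical descriptor definition chosen only for illustration.

```python
import numpy as np


def to_grayscale(rgb):
    """Convert an H x W x 3 RGB frame to grayscale with BT.601 luminance weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3] @ weights


def silhouette_mask(gray, threshold):
    """Binary silhouette mask: True where the pixel is brighter than the threshold.

    Assumption: a fixed threshold stands in for the GrabCut segmentation
    described in the summary.
    """
    return gray > threshold


def fourier_profile_descriptor(mask, n_coeffs=8):
    """1-D Fourier descriptor of the silhouette's column-wise pixel counts.

    Hypothetical feature: magnitudes of the first n_coeffs non-DC frequency
    bins, divided by the DC magnitude so the result does not depend on the
    overall silhouette area.
    """
    profile = mask.sum(axis=0).astype(float)   # foreground pixels per column
    spectrum = np.abs(np.fft.rfft(profile))    # magnitude spectrum
    dc = spectrum[0]
    if dc == 0:                                # empty mask: no silhouette found
        return np.zeros(n_coeffs)
    coeffs = spectrum[1:n_coeffs + 1] / dc
    # Pad with zeros if the frame is too narrow to supply n_coeffs bins.
    return np.pad(coeffs, (0, n_coeffs - coeffs.size))


if __name__ == "__main__":
    # Synthetic frame: a bright rectangle standing in for a person.
    frame = np.zeros((32, 32, 3))
    frame[8:24, 12:20] = 200.0
    mask = silhouette_mask(to_grayscale(frame), threshold=100.0)
    print(fourier_profile_descriptor(mask))
```

Because only magnitudes of the spectrum are kept, this particular descriptor is unchanged when the silhouette is shifted cyclically along the horizontal axis, which is one common reason to use Fourier magnitudes as shape features.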