InBRwSANet: Self-attention based parallel inverted residual bottleneck architecture for human action recognition in smart cities.

Human Action Recognition (HAR) has grown significantly because of its many uses, including real-time surveillance and human-computer interaction. Various variations in routine human actions make the recognition process of action more difficult. In this paper, we proposed a novel deep learning archit...

Bibliographic Details
Main Authors: Yasir Khan Jadoon, Muhammad Attique Khan, Yasir Noman Khalid, Jamel Baili, Nebojsa Bacanin, MinKyung Hong, Yunyoung Nam
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0322555
_version_ 1850159935123357696
author Yasir Khan Jadoon
Muhammad Attique Khan
Yasir Noman Khalid
Jamel Baili
Nebojsa Bacanin
MinKyung Hong
Yunyoung Nam
author_sort Yasir Khan Jadoon
collection DOAJ
description Human Action Recognition (HAR) has grown significantly because of its many applications, including real-time surveillance and human-computer interaction. The wide variation in routine human actions makes recognition difficult. In this paper, we propose a novel deep learning architecture, Inverted Bottleneck Residual with Self-Attention (InBRwSA). The architecture consists of two modules. The first module comprises six parallel inverted bottleneck residual blocks, each connected with a skip connection; these blocks aim to learn complex human actions across many convolutional layers. The second module is based on the self-attention mechanism: the learned weights of the first module are passed to self-attention, which extracts the most essential features so that complex human actions can be discriminated more easily. The architecture is trained on the selected datasets, with hyperparameters chosen by the particle swarm optimization (PSO) algorithm. In the testing phase, features are extracted from the self-attention layer of the trained model and passed to a shallow wide neural network classifier for the final classification. HMDB51 and UCF101 are widely used benchmark datasets for action recognition and were chosen to allow meaningful comparison with earlier research. UCF101 covers a wide range of activity classes, and HMDB51 contains varied real-world behaviors; these properties test the generalizability and flexibility of the presented model. Moreover, these datasets define the evaluation scope within a particular domain and ensure relevance to real-world circumstances. The proposed technique was tested on both datasets and achieved accuracies of 78.80% and 91.80%, respectively. In the ablation study, the margin of error at the 95% confidence level (1.960σx̄) was 70.1338 ± 3.053 (±4.35%) for HMDB51 and 82.7813 ± 2.852 (±3.45%) for UCF101. The training time for the highest accuracy was 134.09 seconds for HMDB51 and 252.10 seconds for UCF101. The proposed architecture is compared with several pre-trained deep models and existing state-of-the-art (SOTA) techniques; based on the results, it outperformed the existing techniques.
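The margin-of-error figures quoted in the description can be checked with a few lines of arithmetic. The sketch below is not taken from the article. It assumes that the percentage in parentheses is the margin divided by the mean, and that the margin equals 1.960 times the standard error of the mean (σx̄). The function names are illustrative only.

```python
# Sketch only: re-derives the relative margins quoted in the description.
# Assumption: "(±4.35%)" means margin / mean, and margin = 1.960 * standard error.

Z_95 = 1.960  # two-sided 95% critical value of the standard normal distribution


def relative_margin_percent(mean, margin):
    """Margin of error expressed as a percentage of the mean."""
    return 100.0 * margin / mean


def implied_standard_error(margin, z=Z_95):
    """Standard error of the mean implied by a margin of the form z * SE."""
    return margin / z


# (mean, margin) pairs exactly as quoted in the record
reported = {
    "HMDB51": (70.1338, 3.053),
    "UCF101": (82.7813, 2.852),
}

for name, (mean, margin) in reported.items():
    rel = relative_margin_percent(mean, margin)
    se = implied_standard_error(margin)
    print(f"{name}: {mean} ± {margin} (±{rel:.2f}%), implied standard error {se:.3f}")
```

Under these assumptions the relative margins round to the 4.35% and 3.45% given in the description, which suggests the parenthesised values are the margin relative to the mean.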
format Article
id doaj-art-5eca68138c064609bc4edfec396b9d59
institution OA Journals
issn 1932-6203
language English
publishDate 2025-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-5eca68138c064609bc4edfec396b9d59 | 2025-08-20T02:23:19Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2025-01-01 | 20 | 5 | e0322555 | 10.1371/journal.pone.0322555 | InBRwSANet: Self-attention based parallel inverted residual bottleneck architecture for human action recognition in smart cities. | Yasir Khan Jadoon; Muhammad Attique Khan; Yasir Noman Khalid; Jamel Baili; Nebojsa Bacanin; MinKyung Hong; Yunyoung Nam | https://doi.org/10.1371/journal.pone.0322555
title InBRwSANet: Self-attention based parallel inverted residual bottleneck architecture for human action recognition in smart cities.
url https://doi.org/10.1371/journal.pone.0322555