InBRwSANet: Self-attention based parallel inverted residual bottleneck architecture for human action recognition in smart cities.

Human Action Recognition (HAR) has grown significantly because of its many uses, including real-time surveillance and human-computer interaction. Wide variation in routine human actions makes recognition more difficult. In this paper, we propose a novel deep learning archit...


Bibliographic Details
Main Authors:	Yasir Khan Jadoon, Muhammad Attique Khan, Yasir Noman Khalid, Jamel Baili, Nebojsa Bacanin, MinKyung Hong, Yunyoung Nam
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2025-01-01
Series:	PLoS ONE
Online Access:	https://doi.org/10.1371/journal.pone.0322555

collection	DOAJ
description	Human Action Recognition (HAR) has grown significantly because of its many uses, including real-time surveillance and human-computer interaction. Wide variation in routine human actions makes recognition more difficult. In this paper, we propose a novel deep learning architecture, Inverted Bottleneck Residual with Self-Attention (InBRwSA). The architecture consists of two modules. The first module comprises six parallel inverted bottleneck residual blocks, each connected with a skip connection. These blocks aim to learn complex human actions across many convolutional layers. The second module is based on the self-attention mechanism. The learned weights of the first module are passed to the self-attention module, which extracts the most essential features and helps discriminate complex human actions. The proposed architecture is trained on the selected datasets, with hyperparameters chosen using the particle swarm optimization (PSO) algorithm. In the testing phase, the trained model is used to extract features from the self-attention layer, which are passed to a shallow wide neural network classifier for the final classification. HMDB51 and UCF101 are frequently used as standard action recognition datasets and were chosen to allow meaningful comparison with earlier research. UCF101 has a wide range of activity classes, and HMDB51 has varied real-world behaviors. These features test the generalizability and flexibility of the presented model. Moreover, these datasets define the evaluation scope within a particular domain and guarantee relevance to real-world circumstances. The proposed technique is tested on both datasets, and accuracies of 78.80% and 91.80% were achieved on HMDB51 and UCF101, respectively. In the ablation study, the values with their margins of error at the 95% confidence level (1.960σx̄) are 70.1338 ± 3.053 (±4.35%) for HMDB51 and 82.7813 ± 2.852 (±3.45%) for UCF101. The training time for the highest accuracy on HMDB51 and UCF101 is 134.09 and 252.10 seconds, respectively. The proposed architecture is compared with several pre-trained deep models and existing state-of-the-art (SOTA) techniques. Based on the results, the proposed architecture outperformed existing techniques.
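The confidence figures quoted in the description can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the reported margin equals 1.960 times the standard error of the mean (the abstract's "1.960σx̄") and that the "UCF" figure refers to UCF101. The names `reported` and `summarize` are illustrative and do not come from the article.

```python
# Reported centre values and 95% margins of error, copied from the abstract.
# Assumption: margin = 1.960 * standard error of the mean.
Z_95 = 1.960  # two-sided z critical value for a 95% confidence level

reported = {
    "HMDB51": (70.1338, 3.053),
    "UCF101": (82.7813, 2.852),  # the abstract says "UCF"; UCF101 is assumed
}


def summarize(mean, margin, z=Z_95):
    """Return (standard error, relative margin in %, lower bound, upper bound)."""
    std_error = margin / z                 # invert margin = z * SE
    relative_pct = 100.0 * margin / mean   # margin as a percentage of the mean
    return std_error, relative_pct, mean - margin, mean + margin


results = {name: summarize(mean, margin) for name, (mean, margin) in reported.items()}

for name, (se, rel, low, high) in results.items():
    print(f"{name}: SE={se:.3f}, margin=±{rel:.2f}%, 95% CI=[{low:.2f}, {high:.2f}]")
```

The relative margins computed this way round to the ±4.35% and ±3.45% given in the record, which suggests the percentages in parentheses are the margin divided by the centre value.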
id	doaj-art-5eca68138c064609bc4edfec396b9d59
institution	OA Journals
issn	1932-6203
title	InBRwSANet: Self-attention based parallel inverted residual bottleneck architecture for human action recognition in smart cities.


Similar Items