Crowd counting in domain generalization based on multi-scale attention and hierarchy level enhancement

Bibliographic Details
Main Authors: Jiarui Zhou, Jianming Zhang, Yan Gui
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-83725-5
collection DOAJ
description Abstract: To address the weak single-domain generalization of existing crowd counting methods, this study proposes a new crowd counting framework called Multi-scale Attention and Hierarchy level Enhancement (MAHE). First, by fusing channel attention and spatial attention, the model attends to both detailed features and the macro-level information of structural position changes. Second, a multi-head attention feature module helps the model capture complex dependencies between sequence elements. Third, a three-stage encoding and decoding scheme enables the model to represent crowd density information effectively. Finally, multi-scale hierarchy-level feature fusion further strengthens the fusion of multi-scale features derived from different receptive fields, so that the model learns both high-level semantic information and low-level multi-scale visual-field features. Together, these components improve the model’s capacity to capture key features even on datasets that differ substantially from one another, and thereby its single-domain generalization ability. Extensive experiments on different datasets demonstrate strong generalization. The study improves the accuracy of crowd counting and introduces a new research approach to single-domain generalization in crowd counting.
id doaj-art-282d22b3694c47239ef3787aa30d71c1
institution Kabale University
issn 2045-2322
affiliation School of Computer and Communication Engineering, Changsha University of Science and Technology (Jiarui Zhou, Jianming Zhang, Yan Gui)
volume 15
issue 1
pages 1-14
topic Crowd counting
Spatial attention
Channel attention
Multi-scale features
Domain generalization
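
The abstract in this record says that MAHE fuses channel attention and spatial attention but gives no architectural details. As a rough illustration of what such a fusion can look like, the following NumPy sketch applies a generic channel-then-spatial attention to a feature map. It is not the authors' implementation: the function names, the reduction ratio `R`, the random weights, and the scalar mixing used in place of a learned convolution in the spatial branch are all assumptions made for this example.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    Average- and max-pooled channel descriptors pass through a shared
    two-layer MLP (w1: (C // R, C), w2: (C, C // R)); the summed outputs
    are squashed to (0, 1) and used as per-channel weights.
    """
    avg = feat.mean(axis=(1, 2))  # (C,)
    mx = feat.max(axis=(1, 2))    # (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer

    weights = sigmoid(mlp(avg) + mlp(mx))    # (C,)
    return feat * weights[:, None, None]


def spatial_attention(feat, a=1.0, b=1.0):
    """Reweight the spatial positions of a (C, H, W) feature map.

    Assumption: the channel-wise mean and max maps are mixed with two
    scalars (a, b) instead of a learned convolution, to keep the sketch short.
    """
    avg = feat.mean(axis=0)  # (H, W)
    mx = feat.max(axis=0)    # (H, W)
    mask = sigmoid(a * avg + b * mx)
    return feat * mask[None, :, :]


def fused_attention(feat, w1, w2):
    """Apply channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(feat, w1, w2))


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W, R = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // R, C)) * 0.1
w2 = rng.standard_normal((C, C // R)) * 0.1

out = fused_attention(feat, w1, w2)
print(out.shape)  # (8, 4, 4)
```

Both branches only rescale the input by factors between 0 and 1, so the output keeps the input's shape and each value's magnitude never grows. In a trained network the weights would be learned rather than drawn at random.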