Spatial correlation guided cross scale feature fusion for age and gender estimation

Bibliographic Details
Main Authors: Shiyi Jiang, Qing Ji, Hukui Shi, Che Chen, Yang Xu
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-03081-w
collection DOAJ
description Abstract To address the challenges of age and gender recognition in uncontrolled scenarios where the face is absent or severely occluded, this paper proposes a Spatial Correlation Guided Cross Scale Feature Fusion Network (SCGNet). The method targets a limitation of existing approaches, which rely heavily on facial features that become unreliable under partial or complete occlusion. It integrates multi-granularity semantic features through a Cross-Scale Combination (CSC) module, enhances local detail representation with a Local Feature Guided Fusion (LFGF) module, and reorganizes features with a Spatial Correlation Composition Analysis (SCCA) module based on Getis-Ord Gi* statistics, which suppresses interference from non-informative regions. The SCCA module introduces a bipartite grouping mechanism that uses hotspot detection to preserve discriminative body features when facial cues are unavailable. Experiments show that SCGNet achieves state-of-the-art performance, with the lowest Mean Absolute Error (MAE) of 4.01% for age estimation on IMDB-Clean (a 2.9% improvement over VOLO-D1) and the highest gender classification accuracy on the IMDB-Clean, UTKFace, and Lagenda datasets. Notably, the method maintains 97.32% gender accuracy under complete facial occlusion on the Lagenda dataset, which validates spatial correlation modeling for non-facial feature reasoning. In cross-domain testing, the architecture reaches 73.30% CS@5 for age prediction, compared with 69.72% for VOLO and 71.27% for MiVOLO, indicating better cross-scene adaptability. Ablation studies confirm the individual contributions of the CSC, LFGF, and SCCA modules. This research provides new insights for robust identity analysis in human-computer interaction and intelligent security applications.
format Article
id doaj-art-6312957cf9d84c20af26a07f9ac9adee
institution Kabale University
issn 2045-2322
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-6312957cf9d84c20af26a07f9ac9adee
2025-08-20T03:45:28Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-07-01
151117
10.1038/s41598-025-03081-w
Shiyi Jiang, Guizhou Mobile Information Technology Co., Ltd
Qing Ji, Guizhou Mobile Information Technology Co., Ltd
Hukui Shi, Guizhou Mobile Information Technology Co., Ltd
Che Chen, Guizhou Mobile Information Technology Co., Ltd
Yang Xu, College of Big Data and Information Engineering, Guizhou University
title Spatial correlation guided cross scale feature fusion for age and gender estimation
topic Age estimation
Gender recognition
Vision transformer
Cross-scale fusion
Spatial correlation
url https://doi.org/10.1038/s41598-025-03081-w
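
The description above names Getis-Ord Gi* statistics and a bipartite (hotspot versus non-hotspot) grouping as the basis of the SCCA module, but this record does not give the formulation. As a rough illustration only, the sketch below computes the standard Gi* z-score on a 2D grid with NumPy and splits the cells into two groups. The binary 3x3 neighborhood, the 1.96 threshold, and the names `getis_ord_gi_star` and `bipartite_groups` are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def getis_ord_gi_star(x):
    """Standard Gi* z-scores on a 2D grid.

    Assumption: binary weights over a 3x3 window that includes the
    cell itself; cells outside the grid are simply not counted.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.mean()
    std = x.std()  # population standard deviation (ddof=0)
    if std < 1e-12:
        # A constant map has no hotspots; avoid dividing by zero.
        return np.zeros_like(x)

    h, w = x.shape
    padded = np.pad(x, 1)
    inside = np.pad(np.ones_like(x), 1)
    local_sum = np.zeros_like(x)
    count = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            local_sum += padded[dy:dy + h, dx:dx + w]
            count += inside[dy:dy + h, dx:dx + w]

    # With binary weights, sum(w) and sum(w^2) both equal `count`.
    numerator = local_sum - mean * count
    denominator = std * np.sqrt((n * count - count ** 2) / max(n - 1, 1))
    return np.divide(numerator, denominator,
                     out=np.zeros_like(x), where=denominator > 0)


def bipartite_groups(x, threshold=1.96):
    """Split cells into (hotspot, other) boolean masks.

    The threshold is a hypothetical choice for this sketch.
    """
    z = getis_ord_gi_star(x)
    hot = z > threshold
    return hot, ~hot


# Toy 6x6 "feature map" with a 2x2 block of strong responses.
feature_map = np.zeros((6, 6))
feature_map[2:4, 2:4] = 1.0
z_scores = getis_ord_gi_star(feature_map)
hot_mask, other_mask = bipartite_groups(feature_map)
```

In this toy map the cells around the strong block receive high positive z-scores and fall into the hotspot group, while the corners receive negative scores. A real network would apply such a statistic to learned feature or attention maps rather than to a hand-made grid.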