Spatial correlation guided cross scale feature fusion for age and gender estimation
Abstract To address the challenges of age and gender recognition in uncontrolled scenarios with facial absence or severe occlusion, this paper proposes a Spatial Correlation Guided Cross Scale Feature Fusion Network (SCGNet). The proposed method specifically tackles the limitations of existing appro...
| Main Authors: | Shiyi Jiang, Qing Ji, Hukui Shi, Che Chen, Yang Xu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
| Subjects: | Age estimation; Gender recognition; Vision transformer; Cross-scale fusion; Spatial correlation |
| Online Access: | https://doi.org/10.1038/s41598-025-03081-w |
| _version_ | 1849334764287295488 |
|---|---|
| author | Shiyi Jiang; Qing Ji; Hukui Shi; Che Chen; Yang Xu |
| author_sort | Shiyi Jiang |
| collection | DOAJ |
| description | To address the challenges of age and gender recognition in uncontrolled scenarios where the face is absent or severely occluded, this paper proposes a Spatial Correlation Guided Cross Scale Feature Fusion Network (SCGNet). The method targets a limitation of existing approaches, which rely heavily on facial features that become unreliable under partial or complete occlusion. It integrates multi-granularity semantic features through a Cross-Scale Combination (CSC) module, enhances local detail representation with a Local Feature Guided Fusion (LFGF) module, and reorganizes features with a Spatial Correlation Composition Analysis (SCCA) module based on Getis-Ord Gi* statistics, which suppresses interference from non-informative regions. The SCCA module introduces a bipartite grouping mechanism that uses hotspot detection to preserve discriminative body features when facial cues are unavailable. Experiments show that SCGNet achieves state-of-the-art performance, with a minimum Mean Absolute Error (MAE) of 4.01% for age estimation on IMDB-Clean (a 2.9% improvement over VOLO-D1) and the highest gender classification accuracy on the IMDB-Clean, UTKFace, and Lagenda datasets. Notably, the method maintains 97.32% gender accuracy under complete facial occlusion on the Lagenda dataset, validating the effectiveness of spatial correlation modeling for non-facial feature reasoning. In cross-domain testing, the architecture reaches 73.30% CS@5 for age prediction, showing better cross-scene adaptability than VOLO (69.72%) and MiVOLO (71.27%). Ablation studies confirm the individual contributions of the CSC, LFGF, and SCCA modules. This research provides new insights for robust identity analysis in human-computer interaction and intelligent security applications. |
| format | Article |
| id | doaj-art-6312957cf9d84c20af26a07f9ac9adee |
| institution | Kabale University |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-6312957cf9d84c20af26a07f9ac9adee; 2025-08-20T03:45:28Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-07-01; 15; 1; 1; 17; 10.1038/s41598-025-03081-w; Spatial correlation guided cross scale feature fusion for age and gender estimation; Shiyi Jiang (Guizhou Mobile Information Technology Co., Ltd); Qing Ji (Guizhou Mobile Information Technology Co., Ltd); Hukui Shi (Guizhou Mobile Information Technology Co., Ltd); Che Chen (Guizhou Mobile Information Technology Co., Ltd); Yang Xu (College of Big Data and Information Engineering, Guizhou University); https://doi.org/10.1038/s41598-025-03081-w; Age estimation; Gender recognition; Vision transformer; Cross-scale fusion; Spatial correlation |
| title | Spatial correlation guided cross scale feature fusion for age and gender estimation |
| topic | Age estimation; Gender recognition; Vision transformer; Cross-scale fusion; Spatial correlation |
| url | https://doi.org/10.1038/s41598-025-03081-w |
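
The description field above says the SCCA module builds on Getis-Ord Gi* statistics and uses hotspot detection for a bipartite grouping. This record does not give the module's details, so the following is only a minimal sketch of the standard Gi* z-score on a 2D grid of activations. It assumes binary weights inside a square window and a 1.96 cutoff for "hot" cells. The function names `window_sum`, `getis_ord_gi_star`, and `split_hot_cold` are hypothetical and are not taken from the paper.

```python
import math


def window_sum(grid, radius):
    """Sum each cell's (2r+1)x(2r+1) neighbourhood, clipped at the borders."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for y in range(max(0, i - radius), min(h, i + radius + 1)):
                for x in range(max(0, j - radius), min(w, j + radius + 1)):
                    out[i][j] += grid[y][x]
    return out


def getis_ord_gi_star(grid, radius=1):
    """Gi* z-score per cell (assumption: binary weights in a square window)."""
    h, w = len(grid), len(grid[0])
    n = h * w
    z = [[0.0] * w for _ in range(h)]
    if n < 2:
        return z
    values = [v for row in grid for v in row]
    mean = sum(values) / n
    # Population standard deviation S of all cell values.
    s = math.sqrt(max(sum(v * v for v in values) / n - mean * mean, 0.0))
    w_sum = window_sum([[1.0] * w for _ in range(h)], radius)
    wx_sum = window_sum(grid, radius)
    for i in range(h):
        for j in range(w):
            # With binary weights, sum(w_ij^2) equals sum(w_ij).
            spread = (n * w_sum[i][j] - w_sum[i][j] ** 2) / (n - 1)
            den = s * math.sqrt(max(spread, 0.0))
            if den > 0:  # leave z at 0 for flat maps or all-covering windows
                z[i][j] = (wx_sum[i][j] - mean * w_sum[i][j]) / den
    return z


def split_hot_cold(z, threshold=1.96):
    """Illustrative two-way grouping: cells above the cutoff vs. the rest."""
    hot, cold = [], []
    for i, row in enumerate(z):
        for j, v in enumerate(row):
            (hot if v > threshold else cold).append((i, j))
    return hot, cold


# Toy 8x8 activation map with one bright 3x3 patch.
fmap = [[0.0] * 8 for _ in range(8)]
for i in range(3, 6):
    for j in range(3, 6):
        fmap[i][j] = 1.0
hot_cells, cold_cells = split_hot_cold(getis_ord_gi_star(fmap))
```

In this sketch the two groups are plain coordinate lists. How SCGNet actually reorganizes features from the two groups is not stated in this record.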