A human pose estimation network based on YOLOv8 framework with efficient multi-scale receptive field and expanded feature pyramid network
Abstract Deep learning-based human pose estimation uses deep neural networks to accurately detect, estimate, and predict human body poses in images or videos. However, traditional multi-person pose estimation methods face challenges due to partial occlusions and overlaps between multiple human bodies and body parts. To address these issues, we propose EE-YOLOv8, a human pose estimation network based on the YOLOv8 framework, which integrates an Efficient Multi-scale Receptive Field (EMRF) module and an Expanded Feature Pyramid Network (EFPN). First, the EMRF module further enhances the model’s feature representation capability. Second, the EFPN optimizes cross-level information exchange and improves multi-scale feature integration. Finally, Wise-IoU replaces the traditional Intersection over Union (IoU) to improve detection accuracy through precise overlap measurement between predicted and ground-truth bounding boxes. We evaluate EE-YOLOv8 on the MS COCO 2017 dataset. Compared to YOLOv8-Pose, EE-YOLOv8 achieves an AP of 89.0% at an IoU threshold of 0.5 (an improvement of 3.3%) and an AP of 65.6% over the IoU range 0.5–0.95 (an improvement of 5.8%). Moreover, EE-YOLOv8 achieves the highest accuracy while maintaining the lowest parameter count and computational complexity among all compared algorithms. These results demonstrate that EE-YOLOv8 is more competitive than other mainstream methods.
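The abstract reports AP at an IoU threshold of 0.5 and AP averaged over the IoU range 0.5–0.95, and states that Wise-IoU replaces the plain IoU overlap measure; neither quantity is defined in this record. The sketch below illustrates the baseline IoU computation and the COCO-style threshold averaging in plain Python. The corner-coordinate box format, function names, and example values are assumptions for illustration only; the paper's Wise-IoU formulation is not reproduced here.

```python
# Minimal sketch (not the paper's implementation): axis-aligned IoU between a
# predicted and a ground-truth box, i.e. the overlap measure that Wise-IoU refines,
# plus COCO-style averaging of AP over IoU thresholds 0.50-0.95.

def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (zero area if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # Union = sum of the two areas minus the intersection.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def coco_style_ap(ap_at_threshold):
    """Average AP over IoU thresholds 0.50, 0.55, ..., 0.95 (AP@[0.5:0.95]).
    `ap_at_threshold` is a callable returning AP for a given IoU threshold."""
    thresholds = [0.50 + 0.05 * i for i in range(10)]
    return sum(ap_at_threshold(t) for t in thresholds) / len(thresholds)

if __name__ == "__main__":
    pred = (50, 50, 150, 150)  # hypothetical predicted box
    gt = (60, 60, 160, 160)    # hypothetical ground-truth box
    print(f"IoU = {iou(pred, gt):.3f}")  # ~0.681 for these boxes
    # Toy AP curve: AP falls linearly as the IoU threshold tightens (illustrative only).
    print(f"AP@[0.5:0.95] = {coco_style_ap(lambda t: 1.0 - t):.3f}")
```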
| Main Authors: | Shaobin Cai, Han Xu, Wanchen Cai, Yuchang Mo, Liansuo Wei |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-05-01 |
| Series: | Scientific Reports |
| Subjects: | Pose estimation; Attention mechanism; Feature pyramid network |
| Online Access: | https://doi.org/10.1038/s41598-025-00259-0 |
| _version_ | 1850284640395329536 |
|---|---|
| author | Shaobin Cai; Han Xu; Wanchen Cai; Yuchang Mo; Liansuo Wei |
| author_sort | Shaobin Cai |
| collection | DOAJ |
| description | Abstract Deep learning-based human pose estimation uses deep neural networks to accurately detect, estimate, and predict human body poses in images or videos. However, traditional multi-person pose estimation methods face challenges due to partial occlusions and overlaps between multiple human bodies and body parts. To address these issues, we propose EE-YOLOv8, a human pose estimation network based on the YOLOv8 framework, which integrates an Efficient Multi-scale Receptive Field (EMRF) module and an Expanded Feature Pyramid Network (EFPN). First, the EMRF module further enhances the model’s feature representation capability. Second, the EFPN optimizes cross-level information exchange and improves multi-scale feature integration. Finally, Wise-IoU replaces the traditional Intersection over Union (IoU) to improve detection accuracy through precise overlap measurement between predicted and ground-truth bounding boxes. We evaluate EE-YOLOv8 on the MS COCO 2017 dataset. Compared to YOLOv8-Pose, EE-YOLOv8 achieves an AP of 89.0% at an IoU threshold of 0.5 (an improvement of 3.3%) and an AP of 65.6% over the IoU range 0.5–0.95 (an improvement of 5.8%). Moreover, EE-YOLOv8 achieves the highest accuracy while maintaining the lowest parameter count and computational complexity among all compared algorithms. These results demonstrate that EE-YOLOv8 is more competitive than other mainstream methods. |
| format | Article |
| id | doaj-art-47c7fb1fc5a449199f3f178baff69f03 |
| institution | OA Journals |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| author affiliations | Shaobin Cai and Han Xu: College of Information Engineering, Huzhou University; Wanchen Cai: College of Management, National Taiwan University; Yuchang Mo: College of Mathematics, Huaqiao University; Liansuo Wei: College of Information Engineering, Suqian University |
| title | A human pose estimation network based on YOLOv8 framework with efficient multi-scale receptive field and expanded feature pyramid network |
| topic | Pose estimation; Attention mechanism; Feature pyramid network |
| url | https://doi.org/10.1038/s41598-025-00259-0 |