Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network
In this study, we propose a novel approach for human pose estimation (HPE) in occluded scenes by progressively fusing features extracted from RGB-D images, which contain RGB and depth images. Conventional bottom-up human pose estimation models that rely solely on RGB inputs often produce erroneous s...
| Main Authors: | Jae-hyuk Yoon, Soon-kak Kwon |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-08-01 |
| Series: | Applied Sciences |
| Subjects: | deep learning, computer vision, human pose estimation, RGB-D image |
| Online Access: | https://www.mdpi.com/2076-3417/15/15/8746 |
| _version_ | 1849407548850962432 |
|---|---|
| author | Jae-hyuk Yoon; Soon-kak Kwon |
| author_facet | Jae-hyuk Yoon; Soon-kak Kwon |
| author_sort | Jae-hyuk Yoon |
| collection | DOAJ |
| description | In this study, we propose a novel approach for human pose estimation (HPE) in occluded scenes that progressively fuses features extracted from RGB-D images, that is, paired RGB and depth images. Conventional bottom-up HPE models that rely solely on RGB inputs often produce erroneous skeletons when parts of a person’s body are obscured by another individual, because without 3D topological information they struggle to infer body connectivity accurately. To address this limitation, we modify OpenPose, a bottom-up HPE model, to take a depth image as an additional input, thereby providing explicit 3D spatial cues. Each input modality is processed by a dedicated feature extractor. In addition to the two existing modules in each stage (joint connectivity estimation and joint confidence map estimation for the color image), we integrate a new module that estimates joint confidence maps for the depth image into the initial few stages. The confidence maps derived from the depth and RGB modalities are then fused at each stage and forwarded to the next, so that 3D topological information from the depth image is used for both joint localization and body part association. Experimental results on the NTU 120+ RGB-D Dataset show that the proposed approach achieves a 13.3% improvement in average recall over the original OpenPose model. The proposed method can enhance the performance of bottom-up HPE models in occluded scenes. |
| format | Article |
| id | doaj-art-edd3b72ae01c47ddb8a17cd80e9ae969 |
| institution | Kabale University |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-edd3b72ae01c47ddb8a17cd80e9ae969; 2025-08-20T03:36:02Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-08-01; 15; 15; 8746; 10.3390/app15158746; Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network; Jae-hyuk Yoon (0); Soon-kak Kwon (1); Department of Computer Software Engineering, Dong-eui University, Busan 47340, Republic of Korea; Department of Computer Software Engineering, Dong-eui University, Busan 47340, Republic of Korea; https://www.mdpi.com/2076-3417/15/15/8746; deep learning; computer vision; human pose estimation; RGB-D image |
| spellingShingle | Jae-hyuk Yoon Soon-kak Kwon Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network Applied Sciences deep learning computer vision human pose estimation RGB-D image |
| title | Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network |
| title_full | Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network |
| title_fullStr | Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network |
| title_full_unstemmed | Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network |
| title_short | Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network |
| title_sort | robust human pose estimation method for body to body occlusion using rgb d fusion neural network |
| topic | deep learning; computer vision; human pose estimation; RGB-D image |
| url | https://www.mdpi.com/2076-3417/15/15/8746 |
| work_keys_str_mv | AT jaehyukyoon robusthumanposeestimationmethodforbodytobodyocclusionusingrgbdfusionneuralnetwork AT soonkakkwon robusthumanposeestimationmethodforbodytobodyocclusionusingrgbdfusionneuralnetwork |
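
The stage-wise fusion described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the fusion rule, so the weighted per-pixel average in `fuse_maps` is an assumption, and `run_stages`, `depth_weight`, `depth_stages`, and the stage callables are hypothetical names introduced only for illustration. The joint connectivity branch is omitted for brevity.

```python
from typing import Callable, List

# One confidence map: rows of per-pixel scores in [0, 1].
Map = List[List[float]]
Stage = Callable[[Map], Map]


def fuse_maps(rgb_map: Map, depth_map: Map, depth_weight: float = 0.5) -> Map:
    """Blend two same-sized confidence maps pixel by pixel.

    ASSUMPTION: a weighted average stands in for the fusion step;
    the source does not specify the actual operator.
    """
    same_rows = len(rgb_map) == len(depth_map)
    if not same_rows or any(len(r) != len(d) for r, d in zip(rgb_map, depth_map)):
        raise ValueError("confidence maps must have the same shape")
    return [
        [(1.0 - depth_weight) * r + depth_weight * d for r, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb_map, depth_map)
    ]


def run_stages(
    rgb_stage: Stage,
    depth_stage: Stage,
    initial: Map,
    num_stages: int,
    depth_stages: int,
) -> Map:
    """Run a multi-stage refinement loop.

    The depth branch is active only in the first `depth_stages` stages
    (the "initial few stages"); later stages use the RGB branch alone.
    Both stage callables are hypothetical placeholders for learned modules.
    """
    carried = initial
    for stage in range(num_stages):
        rgb_map = rgb_stage(carried)
        if stage < depth_stages:
            depth_map = depth_stage(carried)
            # The fused map is what gets forwarded to the next stage.
            carried = fuse_maps(rgb_map, depth_map)
        else:
            carried = rgb_map
    return carried
```

In a real network the stage callables would be convolutional blocks and the fusion might be a concatenation followed by a learned layer; the sketch only shows the data flow in which a fused map is forwarded from one stage to the next.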