ViT-Based Classification and Self-Supervised 3D Human Mesh Generation from NIR Single-Pixel Imaging
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/11/6138 |
| Summary: | Accurately estimating 3D human pose and body shape from a single monocular image remains challenging, especially under poor lighting or occlusions. Traditional RGB-based methods struggle in such conditions, whereas single-pixel imaging (SPI) in the Near-Infrared (NIR) spectrum offers a robust alternative. NIR penetrates clothing and adapts to illumination changes, enhancing body shape and pose estimation. This work explores an SPI camera (850–1550 nm) with Time-of-Flight (TOF) technology for human detection in low-light conditions. SPI-derived point clouds are processed using a Vision Transformer (ViT) to align poses with a predefined SMPL-X model. A self-supervised PointNet++ network estimates global rotation, translation, body shape, and pose, enabling precise 3D human mesh reconstruction. Laboratory experiments simulating night-time conditions validate NIR-SPI’s potential for real-world applications, including human detection in rescue missions. |
| ISSN: | 2076-3417 |