ViT-Based Classification and Self-Supervised 3D Human Mesh Generation from NIR Single-Pixel Imaging
Accurately estimating 3D human pose and body shape from a single monocular image remains challenging, especially under poor lighting or occlusions. Traditional RGB-based methods struggle in such conditions, whereas single-pixel imaging (SPI) in the Near-Infrared (NIR) spectrum offers a robust altern...
| Main Authors: | Carlos Osorio Quero, Daniel Durini, Jose Martinez-Carranza |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Applied Sciences |
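| Subjects: | single-pixel imaging (SPI); self-supervised; SMPL-X model; depth perception; vision transformers (ViT); 3D human model |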
| Online Access: | https://www.mdpi.com/2076-3417/15/11/6138 |
| _version_ | 1849330853594791936 |
|---|---|
| author | Carlos Osorio Quero; Daniel Durini; Jose Martinez-Carranza |
| collection | DOAJ |
| description | Accurately estimating 3D human pose and body shape from a single monocular image remains challenging, especially under poor lighting or occlusions. Traditional RGB-based methods struggle in such conditions, whereas single-pixel imaging (SPI) in the Near-Infrared (NIR) spectrum offers a robust alternative. NIR penetrates clothing and adapts to illumination changes, enhancing body shape and pose estimation. This work explores an SPI camera (850–1550 nm) with Time-of-Flight (TOF) technology for human detection in low-light conditions. SPI-derived point clouds are processed using a Vision Transformer (ViT) to align poses with a predefined SMPL-X model. A self-supervised PointNet++ network estimates global rotation, translation, body shape, and pose, enabling precise 3D human mesh reconstruction. Laboratory experiments simulating night-time conditions validate NIR-SPI’s potential for real-world applications, including human detection in rescue missions. |
| format | Article |
| id | doaj-art-dead3da1a96544ef8d186f0567567896 |
| institution | Kabale University |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-dead3da1a96544ef8d186f0567567896; 2025-08-20T03:46:48Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2025-05-01; vol. 15, no. 11, art. 6138; DOI 10.3390/app15116138; ViT-Based Classification and Self-Supervised 3D Human Mesh Generation from NIR Single-Pixel Imaging; Carlos Osorio Quero (INAOE Computer Science, Tonantzintla, Puebla 72840, Mexico); Daniel Durini (INAOE Electronics Department, Tonantzintla, Puebla 72840, Mexico); Jose Martinez-Carranza (INAOE Computer Science, Tonantzintla, Puebla 72840, Mexico); https://www.mdpi.com/2076-3417/15/11/6138; single-pixel imaging (SPI); self-supervised; SMPL-X model; depth perception; vision transformers (ViT); 3D human model |
| title | ViT-Based Classification and Self-Supervised 3D Human Mesh Generation from NIR Single-Pixel Imaging |
| topic | single-pixel imaging (SPI); self-supervised; SMPL-X model; depth perception; vision transformers (ViT); 3D human model |
| url | https://www.mdpi.com/2076-3417/15/11/6138 |