MonoDFNet: Monocular 3D Object Detection with Depth Fusion and Adaptive Optimization
Monocular 3D object detection refers to detecting 3D objects using a single camera. This approach offers low sensor costs, high resolution, and rich texture information, making it widely adopted. However, monocular sensors face challenges from environmental factors like occlusion and truncation, lea...
| Main Authors: | Yuhan Gao, Peng Wang, Xiaoyan Li, Mengyu Sun, Ruohai Di, Liangliang Li, Wei Hong |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-01-01 |
| Series: | Sensors |
| Subjects: | deep learning; monocular 3D detection; depth estimation |
| Online Access: | https://www.mdpi.com/1424-8220/25/3/760 |
| _version_ | 1850199957430075392 |
|---|---|
| author | Yuhan Gao; Peng Wang; Xiaoyan Li; Mengyu Sun; Ruohai Di; Liangliang Li; Wei Hong |
| author_facet | Yuhan Gao; Peng Wang; Xiaoyan Li; Mengyu Sun; Ruohai Di; Liangliang Li; Wei Hong |
| author_sort | Yuhan Gao |
| collection | DOAJ |
| description | Monocular 3D object detection refers to detecting 3D objects using a single camera. This approach offers low sensor costs, high resolution, and rich texture information, making it widely adopted. However, monocular sensors face challenges from environmental factors like occlusion and truncation, leading to reduced detection accuracy. Additionally, the lack of depth information poses significant challenges for predicting 3D positions. To address these issues, this paper presents a monocular 3D object detection method based on improvements to MonoCD, designed to enhance detection accuracy and robustness in complex environments. In order to effectively obtain and integrate depth information, this paper designs a multi-branch depth prediction with weight sharing module. Furthermore, an adaptive focus mechanism is proposed to emphasize target regions while minimizing interference from irrelevant areas. The experimental results demonstrate that MonoDFNet achieves significant improvements over existing methods, with AP3D gains of +4.09% (Easy), +2.78% (Moderate), and +1.63% (Hard), confirming its effectiveness in 3D object detection. |
| format | Article |
| id | doaj-art-28c219e88f9449d28a6abefb6dd9ab96 |
| institution | OA Journals |
| issn | 1424-8220 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Sensors |
| spelling | doaj-art-28c219e88f9449d28a6abefb6dd9ab96; 2025-08-20T02:12:29Z; eng; MDPI AG; Sensors; 1424-8220; 2025-01-01; 25; 3; 760; 10.3390/s25030760; MonoDFNet: Monocular 3D Object Detection with Depth Fusion and Adaptive Optimization; Yuhan Gao (0); Peng Wang (1); Xiaoyan Li (2); Mengyu Sun (3); Ruohai Di (4); Liangliang Li (5); Wei Hong (6); School of Electronics Information Engineering, Xi’an Technological University, Xi’an 710021, China; School of Electronics Information Engineering, Xi’an Technological University, Xi’an 710021, China; School of Electronics Information Engineering, Xi’an Technological University, Xi’an 710021, China; School of Optoelectronic Engineering, Xi’an Technological University, Xi’an 710021, China; School of Electronics Information Engineering, Xi’an Technological University, Xi’an 710021, China; School of Mechanical and Electrical Engineering, Xi’an Technological University, Xi’an 710021, China; School of Mechanical and Electrical Engineering, Xi’an Technological University, Xi’an 710021, China; Monocular 3D object detection refers to detecting 3D objects using a single camera. This approach offers low sensor costs, high resolution, and rich texture information, making it widely adopted. However, monocular sensors face challenges from environmental factors like occlusion and truncation, leading to reduced detection accuracy. Additionally, the lack of depth information poses significant challenges for predicting 3D positions. To address these issues, this paper presents a monocular 3D object detection method based on improvements to MonoCD, designed to enhance detection accuracy and robustness in complex environments. In order to effectively obtain and integrate depth information, this paper designs a multi-branch depth prediction with weight sharing module. Furthermore, an adaptive focus mechanism is proposed to emphasize target regions while minimizing interference from irrelevant areas. The experimental results demonstrate that MonoDFNet achieves significant improvements over existing methods, with AP3D gains of +4.09% (Easy), +2.78% (Moderate), and +1.63% (Hard), confirming its effectiveness in 3D object detection.; https://www.mdpi.com/1424-8220/25/3/760; deep learning; monocular 3D detection; depth estimation |
| spellingShingle | Yuhan Gao; Peng Wang; Xiaoyan Li; Mengyu Sun; Ruohai Di; Liangliang Li; Wei Hong; MonoDFNet: Monocular 3D Object Detection with Depth Fusion and Adaptive Optimization; Sensors; deep learning; monocular 3D detection; depth estimation |
| title | MonoDFNet: Monocular 3D Object Detection with Depth Fusion and Adaptive Optimization |
| title_sort | monodfnet monocular 3d object detection with depth fusion and adaptive optimization |
| topic | deep learning; monocular 3D detection; depth estimation |
| url | https://www.mdpi.com/1424-8220/25/3/760 |