MonoDFNet: Monocular 3D Object Detection with Depth Fusion and Adaptive Optimization

Bibliographic Details
Main Authors:	Yuhan Gao, Peng Wang, Xiaoyan Li, Mengyu Sun, Ruohai Di, Liangliang Li, Wei Hong
Format:	Article
Language:	English
Published:	MDPI AG 2025-01-01
Series:	Sensors
Subjects:	deep learning; monocular 3D detection; depth estimation
Online Access:	https://www.mdpi.com/1424-8220/25/3/760

author	Yuhan Gao, Peng Wang, Xiaoyan Li, Mengyu Sun, Ruohai Di, Liangliang Li, Wei Hong
author_sort	Yuhan Gao
collection	DOAJ
description	Monocular 3D object detection refers to detecting 3D objects using a single camera. This approach offers low sensor costs, high resolution, and rich texture information, making it widely adopted. However, monocular sensors face challenges from environmental factors like occlusion and truncation, leading to reduced detection accuracy. Additionally, the lack of depth information poses significant challenges for predicting 3D positions. To address these issues, this paper presents a monocular 3D object detection method based on improvements to MonoCD, designed to enhance detection accuracy and robustness in complex environments. To obtain and integrate depth information effectively, the paper designs a multi-branch depth prediction module with weight sharing. Furthermore, an adaptive focus mechanism is proposed to emphasize target regions while minimizing interference from irrelevant areas. The experimental results demonstrate that MonoDFNet achieves significant improvements over existing methods, with AP3D gains of +4.09% (Easy), +2.78% (Moderate), and +1.63% (Hard), confirming its effectiveness in 3D object detection.
format	Article
id	doaj-art-28c219e88f9449d28a6abefb6dd9ab96
institution	OA Journals
issn	1424-8220
language	English
publishDate	2025-01-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj-art-28c219e88f9449d28a6abefb6dd9ab96; 2025-08-20T02:12:29Z; eng; MDPI AG; Sensors; 1424-8220; 2025-01-01; 25; 3; 760; 10.3390/s25030760; MonoDFNet: Monocular 3D Object Detection with Depth Fusion and Adaptive Optimization; Yuhan Gao [0], Peng Wang [1], Xiaoyan Li [2], Mengyu Sun [3], Ruohai Di [4], Liangliang Li [5], Wei Hong [6]; [0] School of Electronics Information Engineering, Xi’an Technological University, Xi’an 710021, China; [1] School of Electronics Information Engineering, Xi’an Technological University, Xi’an 710021, China; [2] School of Electronics Information Engineering, Xi’an Technological University, Xi’an 710021, China; [3] School of Optoelectronic Engineering, Xi’an Technological University, Xi’an 710021, China; [4] School of Electronics Information Engineering, Xi’an Technological University, Xi’an 710021, China; [5] School of Mechanical and Electrical Engineering, Xi’an Technological University, Xi’an 710021, China; [6] School of Mechanical and Electrical Engineering, Xi’an Technological University, Xi’an 710021, China; https://www.mdpi.com/1424-8220/25/3/760; deep learning; monocular 3D detection; depth estimation
title	MonoDFNet: Monocular 3D Object Detection with Depth Fusion and Adaptive Optimization
topic	deep learning; monocular 3D detection; depth estimation
url	https://www.mdpi.com/1424-8220/25/3/760