A Survey of Deep Learning-Driven 3D Object Detection: Sensor Modalities, Technical Architectures, and Applications

Bibliographic Details
Main Authors: Xiang Zhang, Hai Wang, Haoran Dong
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Sensors
Subjects: 3D object detection; deep learning; LiDAR; multimodal fusion; autonomous driving
Online Access: https://www.mdpi.com/1424-8220/25/12/3668
author Xiang Zhang
Hai Wang
Haoran Dong
collection DOAJ
description This review presents a comprehensive survey on deep learning-driven 3D object detection, focusing on the synergistic innovation between sensor modalities and technical architectures. Through a dual-axis “sensor modality–technical architecture” classification framework, it systematically analyzes detection methods based on RGB cameras, LiDAR, and multimodal fusion. From the sensor perspective, the study reveals the evolutionary paths of monocular depth estimation optimization, LiDAR point cloud processing from voxel-based to pillar-based modeling, and three-level cross-modal fusion paradigms (data-level alignment, feature-level interaction, and result-level verification). Regarding technical architectures, the paper examines structured representation optimization in traditional convolutional networks, spatiotemporal modeling breakthroughs in bird’s-eye view (BEV) methods, voxel-level modeling advantages of occupancy networks for irregular objects, and dynamic scene understanding capabilities of temporal fusion architectures. The applications in autonomous driving and agricultural robotics are discussed, highlighting future directions including depth perception enhancement, open-scene modeling, and lightweight deployment to advance 3D perception systems toward higher accuracy and stronger generalization.
format Article
id doaj-art-b4e025b89e5d4401b641d3ca6c81ec75
institution Kabale University
issn 1424-8220
language English
publishDate 2025-06-01
publisher MDPI AG
record_format Article
series Sensors
doi 10.3390/s25123668
volume 25
issue 12
article_number 3668
affiliation School of Automotive and Traffic Engineering, Jiangsu University, Zhenjiang 212013, China (all three authors)
title A Survey of Deep Learning-Driven 3D Object Detection: Sensor Modalities, Technical Architectures, and Applications
topic 3D object detection
deep learning
LiDAR
multimodal fusion
autonomous driving