CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection
Three-dimensional object detection has been a key area of research in recent years because of its rich spatial information and superior performance in addressing occlusion issues. However, the performance of 3D object detection still lags significantly behind that of 2D object detection, owing to ch...
| Main Authors: | Jiahang Lyu, Yongze Qi, Suilian You, Jin Meng, Xin Meng, Sarath Kodagoda, Shifeng Wang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-12-01 |
| Series: | Remote Sensing |
| Subjects: | deep learning; 3D object detection; data fusion; point cloud data |
| Online Access: | https://www.mdpi.com/2072-4292/16/23/4593 |
| _version_ | 1850060157510221824 |
|---|---|
| author | Jiahang Lyu; Yongze Qi; Suilian You; Jin Meng; Xin Meng; Sarath Kodagoda; Shifeng Wang |
| author_facet | Jiahang Lyu; Yongze Qi; Suilian You; Jin Meng; Xin Meng; Sarath Kodagoda; Shifeng Wang |
| author_sort | Jiahang Lyu |
| collection | DOAJ |
| description | Three-dimensional object detection has been a key area of research in recent years because of its rich spatial information and superior performance in addressing occlusion issues. However, the performance of 3D object detection still lags significantly behind that of 2D object detection, owing to challenges such as difficulties in feature extraction and a lack of texture information. To address this issue, this study proposes a 3D object detection network, CaLiJD (Camera and Lidar Joint Contender for 3D object Detection), guided by two-dimensional detection results. CaLiJD creatively integrates advanced channel attention mechanisms with a novel bounding-box filtering method to improve detection accuracy, especially for small and occluded objects. Bounding boxes are detected by the 2D and 3D networks for the same object in the same scene as an associated pair. The detection results that satisfy the criteria are then fed into the fusion layer for training. In this study, a novel fusion network is proposed. It consists of numerous convolutions arranged in both sequential and parallel forms and includes a Grouped Channel Attention Module for extracting interactions among multi-channel information. Moreover, a novel bounding-box filtering mechanism was introduced, incorporating the normalized distance from the object to the radar as a filtering criterion within the process. Experiments were conducted using the KITTI 3D object detection benchmark. The results showed that a substantial improvement in mean Average Precision (mAP) was achieved by CaLiJD compared with the baseline single-modal 3D detection model, with an enhancement of 7.54%. Moreover, the improvement achieved by our method surpasses that of other classical fusion networks by an additional 0.82%. In particular, CaLiJD achieved mAP values of 73.04% and 59.86%, respectively, thus demonstrating state-of-the-art performance for challenging small-object detection tasks such as those involving cyclists and pedestrians. |
| format | Article |
| id | doaj-art-a5c3efed4e8b449bbc0eec7882bf201c |
| institution | DOAJ |
| issn | 2072-4292 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Remote Sensing |
| spelling | doaj-art-a5c3efed4e8b449bbc0eec7882bf201c; 2025-08-20T02:50:40Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2024-12-01; vol. 16, no. 23, art. 4593; DOI 10.3390/rs16234593; CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection; Jiahang Lyu, Yongze Qi, Suilian You, Jin Meng, Xin Meng, Shifeng Wang (School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China); Sarath Kodagoda (Faculty of Engineering & Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia); abstract as given in the description field; https://www.mdpi.com/2072-4292/16/23/4593; deep learning; 3D object detection; data fusion; point cloud data |
| spellingShingle | Jiahang Lyu; Yongze Qi; Suilian You; Jin Meng; Xin Meng; Sarath Kodagoda; Shifeng Wang; CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection; Remote Sensing; deep learning; 3D object detection; data fusion; point cloud data |
| title | CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection |
| title_full | CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection |
| title_fullStr | CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection |
| title_full_unstemmed | CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection |
| title_short | CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection |
| title_sort | calijd camera and lidar joint contender for 3d object detection |
| topic | deep learning; 3D object detection; data fusion; point cloud data |
| url | https://www.mdpi.com/2072-4292/16/23/4593 |
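
The description field mentions pairing 2D and 3D detections of the same object and filtering the pairs with a normalized object-to-sensor distance. The record does not say how either step is implemented, so the following is only a minimal sketch under stated assumptions: pairs are formed by image-plane IoU between a 2D box and the projected 3D box, and a pair is kept when its distance divided by a maximum range stays under a cutoff. All names (`Det2D`, `Det3D`, `associate`, `max_range`, `min_iou`, `max_norm_dist`) and the threshold values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box2D = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass
class Det2D:
    box: Box2D
    score: float


@dataclass
class Det3D:
    box_img: Box2D   # 3D box already projected into the image plane
    score: float
    distance: float  # range from the sensor to the object, in metres


def iou(a: Box2D, b: Box2D) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(
    dets2d: List[Det2D],
    dets3d: List[Det3D],
    max_range: float = 80.0,     # assumed normalisation range, metres
    min_iou: float = 0.5,        # assumed overlap threshold
    max_norm_dist: float = 1.0,  # assumed cutoff on normalised distance
) -> List[Tuple[int, int, float, float, float, float]]:
    """Return (i, j, iou, score2d, score3d, norm_dist) for each kept pair."""
    pairs = []
    for i, d2 in enumerate(dets2d):
        for j, d3 in enumerate(dets3d):
            overlap = iou(d2.box, d3.box_img)
            norm_dist = d3.distance / max_range
            # Assumption: both criteria must hold for a pair to be kept.
            if overlap >= min_iou and norm_dist <= max_norm_dist:
                pairs.append((i, j, overlap, d2.score, d3.score, norm_dist))
    return pairs
```

Each kept tuple could serve as one input row for a downstream fusion network; how CaLiJD actually encodes its pairs is not stated in this record.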