CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection

Bibliographic Details
Main Authors: Jiahang Lyu, Yongze Qi, Suilian You, Jin Meng, Xin Meng, Sarath Kodagoda, Shifeng Wang
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Remote Sensing
Subjects: deep learning; 3D object detection; data fusion; point cloud data
Online Access: https://www.mdpi.com/2072-4292/16/23/4593
_version_ 1850060157510221824
author Jiahang Lyu
Yongze Qi
Suilian You
Jin Meng
Xin Meng
Sarath Kodagoda
Shifeng Wang
collection DOAJ
description Three-dimensional object detection has been a key area of research in recent years because of its rich spatial information and superior performance in addressing occlusion issues. However, the performance of 3D object detection still lags significantly behind that of 2D object detection, owing to challenges such as difficulties in feature extraction and a lack of texture information. To address this issue, this study proposes CaLiJD (Camera and LiDAR Joint Contender for 3D Object Detection), a 3D object detection network guided by two-dimensional detection results. CaLiJD integrates channel attention mechanisms with a novel bounding-box filtering method to improve detection accuracy, especially for small and occluded objects. Bounding boxes detected by the 2D and 3D networks for the same object in the same scene are treated as an associated pair, and the detection results that satisfy the criteria are then fed into the fusion layer for training. The proposed fusion network consists of multiple convolutions arranged in both sequential and parallel forms and includes a Grouped Channel Attention Module for extracting interactions among multi-channel information. The bounding-box filtering mechanism incorporates the normalized distance from the object to the radar as a filtering criterion. Experiments on the KITTI 3D object detection benchmark showed that CaLiJD improved mean Average Precision (mAP) by 7.54% over the baseline single-modal 3D detection model, and that this improvement surpasses that of other classical fusion networks by an additional 0.82%. In particular, CaLiJD achieved mAP values of 73.04% and 59.86% for cyclists and pedestrians, respectively, demonstrating state-of-the-art performance on these challenging small-object detection tasks.
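The pairing and filtering steps in the description can be illustrated with a minimal sketch. The record does not state the actual criteria, so everything specific here is an assumption: the IoU threshold `min_iou`, the normalizing range `max_range`, the rule that a pair is kept only when its normalized distance is at most 1, and the four-value feature tuple passed on to a fusion layer. All names (`Det2D`, `Det3D`, `associate_and_filter`) are hypothetical, and each 3D detection is assumed to arrive with its box already projected into the image.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


@dataclass(frozen=True)
class Det2D:
    box: Box
    score: float


@dataclass(frozen=True)
class Det3D:
    projected_box: Box  # 3D box already projected into the image (assumed given)
    x: float            # object position in the sensor frame, metres
    y: float
    score: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned image boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def associate_and_filter(
    dets_2d: List[Det2D],
    dets_3d: List[Det3D],
    min_iou: float = 0.3,     # hypothetical association threshold
    max_range: float = 80.0,  # hypothetical normalizing range, metres
) -> List[Tuple[int, int, Tuple[float, float, float, float]]]:
    """Pair 2D and 3D detections and keep pairs that pass both criteria.

    Returns (index_2d, index_3d, features), where features is
    (IoU, 2D score, 3D score, normalized distance).
    """
    pairs = []
    for i, d2 in enumerate(dets_2d):
        for j, d3 in enumerate(dets_3d):
            overlap = iou(d2.box, d3.projected_box)
            dist = hypot(d3.x, d3.y) / max_range  # normalized sensor distance
            if overlap >= min_iou and dist <= 1.0:
                pairs.append((i, j, (overlap, d2.score, d3.score, dist)))
    return pairs
```

In this sketch the normalized distance serves both as a filter and as an input feature, so a downstream fusion layer could weight distant, sparsely sampled objects differently from nearby ones. Whether CaLiJD uses the distance in this way is not stated in the record.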
format Article
id doaj-art-a5c3efed4e8b449bbc0eec7882bf201c
institution DOAJ
issn 2072-4292
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj-art-a5c3efed4e8b449bbc0eec7882bf201c
2025-08-20T02:50:40Z
eng
MDPI AG
Remote Sensing
ISSN 2072-4292
2024-12-01
Volume 16, Issue 23, Article 4593
DOI 10.3390/rs16234593
CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection
Jiahang Lyu: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
Yongze Qi: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
Suilian You: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
Jin Meng: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
Xin Meng: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
Sarath Kodagoda: Faculty of Engineering & Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia
Shifeng Wang: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
https://www.mdpi.com/2072-4292/16/23/4593
deep learning
3D object detection
data fusion
point cloud data
title CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection
topic deep learning
3D object detection
data fusion
point cloud data
url https://www.mdpi.com/2072-4292/16/23/4593