CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection

Three-dimensional object detection has been a key area of research in recent years because of its rich spatial information and superior performance in addressing occlusion issues. However, the performance of 3D object detection still lags significantly behind that of 2D object detection, owing to ch...

Bibliographic Details
Main Authors:	Jiahang Lyu, Yongze Qi, Suilian You, Jin Meng, Xin Meng, Sarath Kodagoda, Shifeng Wang
Format:	Article
Language:	English
Published:	MDPI AG 2024-12-01
Series:	Remote Sensing
Subjects:	deep learning; 3D object detection; data fusion; point cloud data
Online Access:	https://www.mdpi.com/2072-4292/16/23/4593

author	Jiahang Lyu; Yongze Qi; Suilian You; Jin Meng; Xin Meng; Sarath Kodagoda; Shifeng Wang
author_sort	Jiahang Lyu
collection	DOAJ
description	Three-dimensional object detection has been a key area of research in recent years because of its rich spatial information and superior performance in addressing occlusion issues. However, the performance of 3D object detection still lags significantly behind that of 2D object detection, owing to challenges such as difficulties in feature extraction and a lack of texture information. To address this issue, this study proposes a 3D object detection network, CaLiJD (Camera and Lidar Joint Contender for 3D object Detection), guided by two-dimensional detection results. CaLiJD creatively integrates advanced channel attention mechanisms with a novel bounding-box filtering method to improve detection accuracy, especially for small and occluded objects. Bounding boxes are detected by the 2D and 3D networks for the same object in the same scene as an associated pair. The detection results that satisfy the criteria are then fed into the fusion layer for training. In this study, a novel fusion network is proposed. It consists of numerous convolutions arranged in both sequential and parallel forms and includes a Grouped Channel Attention Module for extracting interactions among multi-channel information. Moreover, a novel bounding-box filtering mechanism was introduced, incorporating the normalized distance from the object to the radar as a filtering criterion within the process. Experiments were conducted using the KITTI 3D object detection benchmark. The results showed that a substantial improvement in mean Average Precision (mAP) was achieved by CaLiJD compared with the baseline single-modal 3D detection model, with an enhancement of 7.54%. Moreover, the improvement achieved by our method surpasses that of other classical fusion networks by an additional 0.82%. In particular, CaLiJD achieved mAP values of 73.04% and 59.86%, respectively, thus demonstrating state-of-the-art performance for challenging small-object detection tasks such as those involving cyclists and pedestrians.
format	Article
id	doaj-art-a5c3efed4e8b449bbc0eec7882bf201c
institution	DOAJ
issn	2072-4292
language	English
publishDate	2024-12-01
publisher	MDPI AG
record_format	Article
series	Remote Sensing
spelling	doaj-art-a5c3efed4e8b449bbc0eec7882bf201c | 2025-08-20T02:50:40Z | eng | MDPI AG | Remote Sensing | 2072-4292 | 2024-12-01 | vol. 16, no. 23, art. 4593 | 10.3390/rs16234593 | CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection | Jiahang Lyu, Yongze Qi, Suilian You, Jin Meng, Xin Meng, Shifeng Wang: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China | Sarath Kodagoda: Faculty of Engineering & Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia | https://www.mdpi.com/2072-4292/16/23/4593 | deep learning; 3D object detection; data fusion; point cloud data
title	CaLiJD: Camera and LiDAR Joint Contender for 3D Object Detection
topic	deep learning; 3D object detection; data fusion; point cloud data
url	https://www.mdpi.com/2072-4292/16/23/4593


Similar Items