Infra-3DRC-FusionNet: Deep Fusion of Roadside Mounted RGB Mono Camera and Three-Dimensional Automotive Radar for Traffic User Detection


Bibliographic Details
Main Authors: Shiva Agrawal, Savankumar Bhanderi, Gordon Elger
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Sensors
Subjects: artificial intelligence; camera; deep learning; data processing; object detection; perception
Online Access: https://www.mdpi.com/1424-8220/25/11/3422
collection DOAJ
description Mono RGB cameras and automotive radar sensors provide complementary information, which makes them excellent candidates for sensor data fusion to obtain robust traffic user detection. This approach has been widely used in the vehicle domain and has recently been introduced in roadside-mounted smart-infrastructure-based road user detection. However, the performance of the most commonly used late fusion methods often degrades when the camera fails to detect road users in adverse environmental conditions. One solution is to fuse the data with deep neural networks at an early stage of the fusion pipeline, so that the complete data provided by both sensors are used. Research has been carried out in this area but is limited to vehicle-based sensor setups. Hence, this work proposes a novel deep neural network that jointly fuses RGB mono-camera images and 3D automotive radar point cloud data to obtain enhanced traffic user detection for a roadside-mounted smart infrastructure setup. Projected radar points are first used to generate anchors in image regions with a high likelihood of containing road users, including areas not visible to the camera. These anchors guide the prediction of 2D bounding boxes, object categories, and confidence scores. Valid detections are then used to segment the radar points by instance, and the results are post-processed to produce the final road user detections in the ground plane. The trained model is evaluated under different light and weather conditions using ground truth data from a lidar sensor. It achieves a precision of 92%, a recall of 78%, and an F1-score of 85%. The proposed deep fusion methodology yields absolute improvements of 33%, 6%, and 21% in precision, recall, and F1-score, respectively, compared to the object-level spatial fusion output.
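The anchor-generation step described in the abstract (projecting radar points into the image and seeding anchors there) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the projection matrix `P`, the point format, and the anchor sizes are all assumed for the example.

```python
import numpy as np

def project_radar_to_image(points_xyz, P):
    """Project Nx3 radar points (sensor frame, metres) into pixel
    coordinates using a 3x4 camera projection matrix P."""
    n = points_xyz.shape[0]
    homog = np.hstack([points_xyz, np.ones((n, 1))])  # Nx4 homogeneous points
    uvw = homog @ P.T                                 # Nx3 projected points
    return uvw[:, :2] / uvw[:, 2:3]                   # perspective divide -> (u, v)

def anchors_from_points(uv, sizes=((64, 64), (128, 128))):
    """Centre one anchor box of each size on every projected radar
    point; boxes are (x1, y1, x2, y2) in pixels."""
    boxes = []
    for u, v in uv:
        for w, h in sizes:
            boxes.append((u - w / 2, v - h / 2, u + w / 2, v + h / 2))
    return np.array(boxes)
```

Because the anchors come from radar returns rather than a dense grid, they can also cover regions where the camera sees little (glare, darkness), which is the motivation the abstract gives for early fusion.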
id doaj-art-e92df8a7adda465ca467d6d0409243e7
institution OA Journals
issn 1424-8220
doi 10.3390/s25113422
citation Sensors, vol. 25, no. 11, art. 3422
affiliation Institute of Innovative Mobility (IIMo), Technische Hochschule Ingolstadt, 85049 Ingolstadt, Germany (all three authors)
topic artificial intelligence
camera
deep learning
data processing
object detection
perception
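The precision, recall, and F1-score reported in the abstract follow the standard detection-metric definitions, sketched here for reference; the counts below are illustrative, not taken from the paper.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1-score from true-positive,
    false-positive, and false-negative detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

F1 is the harmonic mean of precision and recall, so the 33% precision and 6% recall gains reported over object-level spatial fusion combine into the 21% F1 improvement.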