RGB-to-Infrared Translation Using Ensemble Learning Applied to Driving Scenarios

Bibliographic Details
Main Authors: Leonardo Ravaglia, Roberto Longo, Kaili Wang, David Van Hamme, Julie Moeyersoms, Ben Stoffelen, Tom De Schepper
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Journal of Imaging
Online Access: https://www.mdpi.com/2313-433X/11/7/206
Description
Summary: Multimodal sensing is essential in order to reach the robustness required of autonomous vehicle perception systems. Infrared (IR) imaging is of particular interest due to its low cost and complementarity with traditional RGB sensors. However, the lack of IR data in many datasets and simulation tools limits the development and validation of sensor fusion algorithms that exploit this complementarity. To address this, we propose an augmentation method that synthesizes realistic IR data from RGB images using gradient-boosting decision trees. We demonstrate that this method is an effective alternative to traditional deep learning methods for image translation such as CNNs and GANs, particularly in data-scarce situations. The proposed approach generates high-quality synthetic IR, i.e., Near-Infrared (NIR) and thermal images from RGB images, enhancing datasets such as MS2, EPFL, and Freiburg. Our synthetic images exhibit good visual quality when evaluated using metrics such as $R^2$, PSNR, SSIM, and LPIPS, achieving an $R^2$ of 0.98 on the MS2 dataset and a PSNR of 21.3 dB on the Freiburg dataset. We also discuss the application of this method to synthetic RGB images generated by the CARLA simulator for autonomous driving. Our approach provides richer datasets with a particular focus on IR modalities for sensor fusion along with a framework for generating a wider variety of driving scenarios within urban driving datasets, which can help to enhance the robustness of sensor fusion algorithms.
ISSN: 2313-433X