Enhancing Cross-Modal Camera Image and LiDAR Data Registration Using Feature-Based Matching

Registering light detection and ranging (LiDAR) data with optical camera images enhances spatial awareness in autonomous driving, robotics, and geographic information systems. The current challenges in this field involve aligning 2D-3D data acquired from sources with distinct coordinate systems, ori...

Full description

Saved in:
Bibliographic Details
Main Authors: Jennifer Leahy, Shabnam Jabari, Derek Lichti, Abbas Salehitangrizi
Format: Article
Language:English
Published: MDPI AG 2025-01-01
Series:Remote Sensing
Subjects:
Online Access:https://www.mdpi.com/2072-4292/17/3/357
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Registering light detection and ranging (LiDAR) data with optical camera images enhances spatial awareness in autonomous driving, robotics, and geographic information systems. The current challenges in this field involve aligning 2D-3D data acquired from sources with distinct coordinate systems, orientations, and resolutions. This paper introduces a new pipeline for camera–LiDAR post-registration to produce colorized point clouds. Utilizing deep learning-based matching between 2D spherical projection LiDAR feature layers and camera images, we can map 3D LiDAR coordinates to image grey values. Various LiDAR feature layers, including intensity, bearing angle, depth, and different weighted combinations, are used to find correspondence with camera images utilizing state-of-the-art deep learning matching algorithms, i.e., SuperGlue and LoFTR. Registration is achieved using collinearity equations and RANSAC to remove false matches. The pipeline’s accuracy is tested using survey-grade terrestrial datasets from the TX5 scanner, as well as datasets from a custom-made, low-cost mobile mapping system (MMS) named Simultaneous Localization And Mapping Multi-sensor roBOT (SLAMM-BOT) across diverse scenes, in which both outperformed their baseline solutions. SuperGlue performed best in high-feature scenes, whereas LoFTR performed best in low-feature or sparse data scenes. The LiDAR intensity layer had the strongest matches, but combining feature layers improved matching and reduced errors.
ISSN:2072-4292