Efficient Stereo Visual Odometry on FPGA Featuring On-Chip Map Management and Pipelined Descriptor-Based Block Matching

Bibliographic Details
Main Authors: Yuki Ichikawa, Kazushi Kawamura, Masato Motomura, Thiem van Chu
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10689404/
Description
Summary: Due to its ability to capture natural light without emitting additional signals, a stereo camera is a promising option for low-power visual odometry (VO). However, achieving real-time processing of VO using a stereo camera on embedded CPUs and GPUs is challenging due to the high computational demands and irregular computational patterns. These irregular patterns arise from the dynamic and non-uniform nature of feature detection, stereo matching, and map management, which vary significantly with scene complexity and camera movement. Various hardware accelerators have been proposed to address this challenge, but they typically tackle individual aspects of stereo VO, leading to difficulties in combining them efficiently. This paper introduces a comprehensive stereo VO accelerator implemented on an AMD Kria KV260 FPGA board, aimed at improving processing efficiency and reducing energy consumption. Our accelerator features innovative techniques including on-FPGA map management and pipelined stereo matching with the use of descriptors instead of image patches. These advancements result in significant reduction in off-chip data transfer, memory savings, and faster processing times. Additionally, an adaptive bucketing-based feature selection method is employed to enhance system accuracy without significant increases in hardware resource usage. Evaluation using the KITTI dataset shows that our accelerator achieves speedups of up to 3.08× and 2.75× over CPU-only and CPU+GPU implementations of LVT, an optimized algorithm, on an NVIDIA Jetson Nano developer kit B01 4GB, with corresponding energy efficiency improvements of 3.55× and 1.95×. The results demonstrate the accelerator’s effectiveness for real-time applications, highlighting the benefits of FPGA-based solutions in complex visual processing tasks.
ISSN: 2169-3536
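
Illustrative note: the summary mentions stereo matching that compares descriptors instead of image patches. The abstract does not give the descriptor type, search range, or thresholds, so the following is only a minimal software sketch of the general idea, not the authors' hardware design. It assumes rectified images (a match lies on roughly the same row), binary descriptors stored as Python integers, and Hamming distance as the matching cost. The names `match_stereo`, `max_disparity`, `row_tol`, and `max_dist` are hypothetical.

```python
from typing import List, Tuple

# A feature is (x, y, descriptor); the descriptor is an assumed binary
# string packed into an int (the abstract does not specify its format).
Feature = Tuple[int, int, int]


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_stereo(
    left: List[Feature],
    right: List[Feature],
    max_disparity: int = 64,  # assumed search range in pixels
    row_tol: int = 1,         # assumed row tolerance for rectified images
    max_dist: int = 40,       # assumed acceptance threshold in bits
) -> List[Tuple[int, int, int]]:
    """Return (left_index, right_index, disparity) for each accepted match."""
    matches = []
    for li, (lx, ly, ld) in enumerate(left):
        best_j, best_d = None, max_dist + 1
        for rj, (rx, ry, rd) in enumerate(right):
            # Rectified stereo: candidates must lie on (nearly) the same row.
            if abs(ry - ly) > row_tol:
                continue
            # The right-image point is shifted left by a bounded disparity.
            disparity = lx - rx
            if disparity < 0 or disparity > max_disparity:
                continue
            # Compare compact descriptors rather than raw pixel patches.
            d = hamming(ld, rd)
            if d < best_d:
                best_j, best_d = rj, d
        if best_j is not None:
            matches.append((li, best_j, lx - right[best_j][0]))
    return matches
```

Comparing compact descriptors means only a few bytes per feature have to be stored and moved, rather than a full pixel patch, which is consistent with the reduced off-chip data transfer and memory savings reported in the summary.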