Optimizing Convolution Operations for YOLOv4-based Object Detection on GPU

Bibliographic Details
Main Authors: Guerrouj Fatima Zahra, Rodríguez Flórez Sergio, El Ouardi Abdelhafid, Abouzahir Mohamed, Ramzi Mustapha
Format: Article
Language: English
Published: EDP Sciences 2024-01-01
Series: ITM Web of Conferences
Online Access: https://www.itm-conferences.org/articles/itmconf/pdf/2024/12/itmconf_maih2024_04008.pdf
Description
Summary: Real-time object detection is crucial for autonomous vehicles, and YOLO (You Only Look Once) algorithms have demonstrated their effectiveness for this purpose. This study examines the performance of YOLOv4 [3] for real-time object detection on an embedded architecture. We focus on optimizing the computationally intensive convolution operations by employing the cuDNN library to achieve efficient inference. The evaluation assesses critical performance metrics, including object detection accuracy in terms of Mean Average Precision (mAP) and inference latency on the embedded architecture. We conduct a comparative analysis using the publicly available KITTI [7] database. The reported results establish a benchmark between the parallelized YOLOv4 model and the baseline implementation, assessing the advantages of cuDNN acceleration for real-time object detection on resource-constrained devices.
ISSN: 2271-2097
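
Illustrative note: the summary reports accuracy as Mean Average Precision (mAP). This record does not state the exact evaluation protocol, so the following is only a minimal sketch of how average precision for one class might be computed, assuming a single image, an IoU threshold of 0.5, and all-point interpolation of the precision-recall curve. The function names `iou` and `average_precision` and the sample boxes are hypothetical and do not come from the article; mAP would then be the mean of such per-class values.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: List[Tuple[float, Box]],
    ground_truth: List[Box],
    iou_thr: float = 0.5,  # assumed threshold, not taken from the article
) -> float:
    """AP for one class: area under the interpolated precision-recall curve."""
    if not ground_truth or not detections:
        return 0.0
    matched = [False] * len(ground_truth)
    hits = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(ground_truth):
            if matched[j]:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= iou_thr:
            matched[best_j] = True
            hits.append(1)  # true positive
        else:
            hits.append(0)  # false positive
    precisions, recalls, cum_tp = [], [], 0
    for k, hit in enumerate(hits, start=1):
        cum_tp += hit
        precisions.append(cum_tp / k)
        recalls.append(cum_tp / len(ground_truth))
    # Make precision non-increasing as recall grows (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
    dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
    print(round(average_precision(dets, gt), 4))
```

Benchmark suites often use class-specific IoU thresholds and sampled recall points instead, so values from this sketch are not directly comparable to published figures.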