A Benchmark Review of YOLO Algorithm Developments for Object Detection
| Main Authors: | , , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11072404/ |
| Summary: | You Only Look Once (YOLO) has established itself as a prominent object detection framework due to its excellent balance between speed and accuracy. This article provides a thorough review of the YOLO series, from YOLOv1 to YOLOv10, including YOLOX, emphasizing their architectural advancements, loss function improvements, and performance enhancements. We have benchmarked the officially released versions from YOLOv3 to YOLOv10, as well as YOLOX, using the widely recognized VOC07+12 and COCO2017 datasets on diverse hardware platforms: NVIDIA GTX Titan X, RTX 3060, and Tesla V100. The benchmark provides significant insights, such as YOLOv9-E achieving the highest mean average precision (mAP) of 76.0% on VOC07+12 and also showing superior detection accuracy on COCO2017 with an mAP of 56.6%, which is 1.2% higher than that of the latest YOLOv10-X. YOLOv9-E stands out for its superior detection accuracy, making it well suited to tasks that demand high accuracy, such as medical image analysis, while lightweight versions such as YOLOv5-S, YOLOv7-S, YOLOv8-S, and YOLOv10-S offer a strong balance of accuracy and speed, making them ideal for real-time applications; among these lightweight models, YOLOv7-S achieves the highest mAP. Inference benchmarks highlight lightweight YOLO models such as YOLOv10-S for their exceptional inference speed on all GPUs, and training-time results indicate that YOLOv9-E takes the longest to converge among all versions on both datasets. This study provides researchers and developers with strategies for choosing appropriate YOLO models based on accuracy requirements, resource availability, and application-specific needs. |
| ISSN: | 2169-3536 |
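
As a companion to the summary above, the following is a minimal sketch of how an accuracy benchmark of this kind might be reproduced with the Ultralytics YOLO package. It is not the article's benchmark code; the package, the `yolov8s.pt` checkpoint, and the `coco.yaml` dataset config are assumptions for illustration, not details taken from the record.

```python
# Minimal sketch (not the article's benchmark code): evaluate a pretrained YOLO
# model on a COCO-style dataset and report mAP, assuming the Ultralytics package
# is installed (`pip install ultralytics`) and the dataset YAML is available.
from ultralytics import YOLO

# "yolov8s.pt" is an assumed lightweight checkpoint; any comparable small model
# from the versions compared in the article could be substituted here.
model = YOLO("yolov8s.pt")

# Run validation on the dataset described by a YAML config such as "coco.yaml";
# Ultralytics resolves the dataset paths listed in that file.
metrics = model.val(data="coco.yaml")

# metrics.box.map is mAP@0.5:0.95 (the COCO-style metric);
# metrics.box.map50 is mAP@0.5 (the VOC-style metric).
print(f"mAP@0.5:0.95 = {metrics.box.map:.3f}")
print(f"mAP@0.5      = {metrics.box.map50:.3f}")
```

The same results object also reports per-image timing for preprocessing, inference, and postprocessing, which is one rough way to approximate the speed axis of such a comparison on a given GPU.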