DKA-YOLO: Enhanced Small Object Detection via Dilation Kernel Aggregation Convolution Modules


Bibliographic Details
Main Authors: Yicheng Qiu, Feng Sha, Li Niu
Format: Article
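Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Object detection; small object detection; convolution neural network; unmanned aerial vehicle; drone
Online Access: https://ieeexplore.ieee.org/document/10792910/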
_version_ 1850114061734248448
author Yicheng Qiu
Feng Sha
Li Niu
author_facet Yicheng Qiu
Feng Sha
Li Niu
author_sort Yicheng Qiu
collection DOAJ
description Small object detection is an important sub-domain of computer vision. Previous work on improving detection accuracy has augmented the head layer, refined multi-layer feature pooling techniques, incorporated attention mechanisms, and optimized loss functions. Despite these efforts, false negatives and classification ambiguities persist and lead to suboptimal results. To address these issues, DKA-YOLO is proposed as a new model that focuses on improving convolution kernel structures. We develop novel modules based on the concept of dilation kernel aggregation convolution, integrate them into the YOLOv8 framework, and apply the enhanced model to small object detection tasks. In the backbone layer, the large-size dilation kernel aggregation convolution combines large kernel sizes with a dilated convolution structure and uses its extensive receptive field to improve detailed feature extraction. In the neck layers, the multi-scale dilation kernel aggregation convolution uses a diverse set of kernels to enhance performance and efficiency. Finally, the head layer employs a multi-scale convolution kernel detection module to enhance the feature expression diversity, generalization ability, and computational efficiency of detection. Experimental validation on public datasets shows that our method improves mean average precision by 1.5% on VisDrone and 1.15% on UAVDT compared with advanced previous methods. Our method also surpasses other previous models in comparative experiments, which indicates its superiority and robustness.
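The description attributes the backbone module's large receptive field to combining large kernels with dilation and aggregating several kernels. The sketch below is not the authors' implementation. It only illustrates the standard relation between kernel size, dilation, and covered span, using a 1-D convolution in plain Python. The function names and the choice to sum the parallel branches are assumptions made for illustration; the record does not say how DKA-YOLO aggregates its branches.

```python
from typing import List, Sequence, Tuple


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def same_padding(kernel_size: int, dilation: int) -> int:
    """Zero padding per side that preserves length (stride 1, odd kernel size)."""
    return (effective_kernel_size(kernel_size, dilation) - 1) // 2


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """1-D dilated convolution (cross-correlation form) with 'same' zero padding."""
    k = len(kernel)
    pad = same_padding(k, dilation)
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def aggregate_branches(signal: Sequence[float],
                       branches: Sequence[Tuple[Sequence[float], int]]) -> List[float]:
    """Sum the outputs of parallel (kernel, dilation) branches.

    Summation is a hypothetical aggregation choice, not taken from the paper.
    """
    outputs = [dilated_conv1d(signal, kernel, d) for kernel, d in branches]
    return [sum(values) for values in zip(*outputs)]


if __name__ == "__main__":
    # A 5-tap kernel with dilation 3 spans 13 input positions.
    print(effective_kernel_size(5, 3))
    signal = [1.0, 2.0, 3.0, 4.0, 5.0]
    ones = [1.0, 1.0, 1.0]
    # Two branches: an ordinary 3-tap kernel and the same kernel with dilation 2.
    print(aggregate_branches(signal, [(ones, 1), (ones, 2)]))
```

A dilated branch widens the covered span without adding weights: the 3-tap kernel with dilation 2 reads positions two apart and so spans 5 inputs, which is why mixing dilation rates gives each output access to several context sizes at once.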
format Article
id doaj-art-e3e942825ed0448180277c8f635a5e5a
institution OA Journals
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-e3e942825ed0448180277c8f635a5e5a
2025-08-20T02:36:59Z
eng
IEEE
IEEE Access
2169-3536
2024-01-01
Volume 12, pages 187353-187366
DOI 10.1109/ACCESS.2024.3515201
Article number 10792910
DKA-YOLO: Enhanced Small Object Detection via Dilation Kernel Aggregation Convolution Modules
Yicheng Qiu (https://orcid.org/0009-0006-8841-8648), Research Center of Big Data Technology, Nanhu Laboratory, Jiaxing, China
Feng Sha (https://orcid.org/0000-0003-0005-3826), Research Center of Big Data Technology, Nanhu Laboratory, Jiaxing, China
Li Niu (https://orcid.org/0000-0001-9971-4641), Research Center of Big Data Technology, Nanhu Laboratory, Jiaxing, China
https://ieeexplore.ieee.org/document/10792910/
Object detection
small object detection
convolution neural network
unmanned aerial vehicle
drone
title DKA-YOLO: Enhanced Small Object Detection via Dilation Kernel Aggregation Convolution Modules
title_sort dka yolo enhanced small object detection via dilation kernel aggregation convolution modules
topic Object detection
small object detection
convolution neural network
unmanned aerial vehicle
drone
url https://ieeexplore.ieee.org/document/10792910/