ECAN-Detector: An Efficient Context-Aggregation Network for Small-Object Detection

Over the past decade, the field of object detection has advanced remarkably, especially in the accurate recognition of medium- and large-sized objects. Nevertheless, detecting small objects is still difficult because their low-resolution appearance provides insufficient discriminative features, and...

Bibliographic Details
Main Authors: Gaofeng Xing, Zhikang Xu, Yulong He, Hailong Ning, Menghao Sun, Chunmei Wang
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: AppliedMath
Subjects: small object detection; autonomous driving; contextual information; multiscale representation
Online Access: https://www.mdpi.com/2673-9909/5/2/58
author Gaofeng Xing
Zhikang Xu
Yulong He
Hailong Ning
Menghao Sun
Chunmei Wang
collection DOAJ
description Over the past decade, the field of object detection has advanced remarkably, especially in the accurate recognition of medium- and large-sized objects. Nevertheless, detecting small objects is still difficult because their low-resolution appearance provides insufficient discriminative features, and they often suffer severe occlusions, particularly in the safety-critical context of autonomous driving. Conventional detectors often fail to extract sufficient information from shallow feature maps, which limits their ability to detect small objects with high precision. To address this issue, we propose the ECAN-Detector, an efficient context-aggregation method designed to enrich the feature representation of shallow layers, which are particularly beneficial for small-object detection. The model first employs an additional shallow detection layer to extract high-resolution features that provide more detailed information for subsequent stages of the network, and then incorporates a dynamic scaled transformer (DST) that enriches spatial perception by adaptively fusing global semantics and local context. Concurrently, a context-augmentation module (CAM) embedded in the shallow layer complements both global and local features relevant to small objects. To further boost the average precision of small-object detection, we implement a faster method utilizing two reparametrized convolutions in the detection head. Finally, extensive experiments conducted on the VisDrone2012-DET and VisDrone2021-DET datasets verify that our proposed method surpasses the baseline model, achieving a significant improvement of 3.1% in AP and 3.5% in $AP_s$. Compared with recent state-of-the-art (SOTA) detectors, the ECAN-Detector delivers comparable accuracy yet preserves real-time throughput, reaching 54.3 FPS.
format Article
id doaj-art-b7f7a60720f841c797a98695d32fa7ff
institution Kabale University
issn 2673-9909
language English
publishDate 2025-05-01
publisher MDPI AG
record_format Article
series AppliedMath
spelling doaj-art-b7f7a60720f841c797a98695d32fa7ff
2025-08-20T03:26:57Z
eng
MDPI AG
AppliedMath
2673-9909
2025-05-01
5
2
58
10.3390/appliedmath5020058
ECAN-Detector: An Efficient Context-Aggregation Network for Small-Object Detection
Gaofeng Xing
Zhikang Xu
Yulong He
Hailong Ning
Menghao Sun
Chunmei Wang
School of Computing, Xi’an University of Posts and Telecommunications, No. 618, West Chang’an Street, Chang’an District, Xi’an 710121, China (all authors)
https://www.mdpi.com/2673-9909/5/2/58
small object detection
autonomous driving
contextual information
multiscale representation
title ECAN-Detector: An Efficient Context-Aggregation Network for Small-Object Detection
topic small object detection
autonomous driving
contextual information
multiscale representation
url https://www.mdpi.com/2673-9909/5/2/58