ECAN-Detector: An Efficient Context-Aggregation Network for Small-Object Detection
Over the past decade, the field of object detection has advanced remarkably, especially in the accurate recognition of medium- and large-sized objects. Nevertheless, detecting small objects is still difficult because their low-resolution appearance provides insufficient discriminative features, and...
| Main Authors: | Gaofeng Xing, Zhikang Xu, Yulong He, Hailong Ning, Menghao Sun, Chunmei Wang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | AppliedMath |
| Subjects: | small object detection; autonomous driving; contextual information; multiscale representation |
| Online Access: | https://www.mdpi.com/2673-9909/5/2/58 |
| _version_ | 1849433654261972992 |
|---|---|
| author | Gaofeng Xing, Zhikang Xu, Yulong He, Hailong Ning, Menghao Sun, Chunmei Wang |
| author_facet | Gaofeng Xing, Zhikang Xu, Yulong He, Hailong Ning, Menghao Sun, Chunmei Wang |
| author_sort | Gaofeng Xing |
| collection | DOAJ |
| description | Over the past decade, the field of object detection has advanced remarkably, especially in the accurate recognition of medium- and large-sized objects. Nevertheless, detecting small objects is still difficult because their low-resolution appearance provides insufficient discriminative features, and they often suffer severe occlusions, particularly in the safety-critical context of autonomous driving. Conventional detectors often fail to extract sufficient information from shallow feature maps, which limits their ability to detect small objects with high precision. To address this issue, we propose the ECAN-Detector, an efficient context-aggregation method designed to enrich the feature representation of shallow layers, which are particularly beneficial for small-object detection. The model first employs an additional shallow detection layer to extract high-resolution features that provide more detailed information for subsequent stages of the network, and then incorporates a dynamic scaled transformer (DST) that enriches spatial perception by adaptively fusing global semantics and local context. Concurrently, a context-augmentation module (CAM) embedded in the shallow layer complements both global and local features relevant to small objects. To further boost the average precision of small-object detection, we implement a faster method utilizing two reparametrized convolutions in the detection head. Finally, extensive experiments conducted on the VisDrone2019-DET and VisDrone2021-DET datasets verified that our proposed method surpasses the baseline model, achieving a significant improvement of 3.1% in AP and 3.5% in $AP_s$. Compared with recent state-of-the-art (SOTA) detectors, the ECAN-Detector delivers comparable accuracy yet preserves real-time throughput, reaching 54.3 FPS. |
| format | Article |
| id | doaj-art-b7f7a60720f841c797a98695d32fa7ff |
| institution | Kabale University |
| issn | 2673-9909 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | AppliedMath |
| spelling | doaj-art-b7f7a60720f841c797a98695d32fa7ff; 2025-08-20T03:26:57Z; eng; MDPI AG; AppliedMath; 2673-9909; 2025-05-01; 5; 2; 58; 10.3390/appliedmath5020058; ECAN-Detector: An Efficient Context-Aggregation Network for Small-Object Detection; Gaofeng Xing (0), Zhikang Xu (1), Yulong He (2), Hailong Ning (3), Menghao Sun (4), Chunmei Wang (5); (0-5) School of Computing, Xi’an University of Posts and Telecommunications, No. 618, West Chang’an Street, Chang’an District, Xi’an 710121, China; Over the past decade, the field of object detection has advanced remarkably, especially in the accurate recognition of medium- and large-sized objects. Nevertheless, detecting small objects is still difficult because their low-resolution appearance provides insufficient discriminative features, and they often suffer severe occlusions, particularly in the safety-critical context of autonomous driving. Conventional detectors often fail to extract sufficient information from shallow feature maps, which limits their ability to detect small objects with high precision. To address this issue, we propose the ECAN-Detector, an efficient context-aggregation method designed to enrich the feature representation of shallow layers, which are particularly beneficial for small-object detection. The model first employs an additional shallow detection layer to extract high-resolution features that provide more detailed information for subsequent stages of the network, and then incorporates a dynamic scaled transformer (DST) that enriches spatial perception by adaptively fusing global semantics and local context. Concurrently, a context-augmentation module (CAM) embedded in the shallow layer complements both global and local features relevant to small objects. To further boost the average precision of small-object detection, we implement a faster method utilizing two reparametrized convolutions in the detection head. Finally, extensive experiments conducted on the VisDrone2019-DET and VisDrone2021-DET datasets verified that our proposed method surpasses the baseline model, achieving a significant improvement of 3.1% in AP and 3.5% in $AP_s$. Compared with recent state-of-the-art (SOTA) detectors, the ECAN-Detector delivers comparable accuracy yet preserves real-time throughput, reaching 54.3 FPS.; https://www.mdpi.com/2673-9909/5/2/58; small object detection; autonomous driving; contextual information; multiscale representation |
| spellingShingle | Gaofeng Xing Zhikang Xu Yulong He Hailong Ning Menghao Sun Chunmei Wang ECAN-Detector: An Efficient Context-Aggregation Network for Small-Object Detection AppliedMath small object detection autonomous driving contextual information multiscale representation |
| title | ECAN-Detector: An Efficient Context-Aggregation Network for Small-Object Detection |
| title_full | ECAN-Detector: An Efficient Context-Aggregation Network for Small-Object Detection |
| title_fullStr | ECAN-Detector: An Efficient Context-Aggregation Network for Small-Object Detection |
| title_full_unstemmed | ECAN-Detector: An Efficient Context-Aggregation Network for Small-Object Detection |
| title_short | ECAN-Detector: An Efficient Context-Aggregation Network for Small-Object Detection |
| title_sort | ecan detector an efficient context aggregation network for small object detection |
| topic | small object detection; autonomous driving; contextual information; multiscale representation |
| url | https://www.mdpi.com/2673-9909/5/2/58 |
| work_keys_str_mv | AT gaofengxing ecandetectoranefficientcontextaggregationnetworkforsmallobjectdetection AT zhikangxu ecandetectoranefficientcontextaggregationnetworkforsmallobjectdetection AT yulonghe ecandetectoranefficientcontextaggregationnetworkforsmallobjectdetection AT hailongning ecandetectoranefficientcontextaggregationnetworkforsmallobjectdetection AT menghaosun ecandetectoranefficientcontextaggregationnetworkforsmallobjectdetection AT chunmeiwang ecandetectoranefficientcontextaggregationnetworkforsmallobjectdetection |