A path aggregation network with deformable convolution for visual object detection

One of the main challenges encountered in visual object detection is the multi-scale issue. Many approaches have been proposed to tackle this issue. In this article, we propose a novel neck that can perform effective fusion of multi-scale features for a single-stage object detector. This neck, named...

Bibliographic Details
Main Authors: Chengming Rao, Zunhao Hu, QiMing Zhao, Min Shan, Li Mao
Format: Article
Language: English
Published: PeerJ Inc. 2025-08-01
Series: PeerJ Computer Science
Subjects: DePAN architecture; Feature fusion; Object detection
Online Access: https://peerj.com/articles/cs-3083.pdf
author Chengming Rao
Zunhao Hu
QiMing Zhao
Min Shan
Li Mao
collection DOAJ
description One of the main challenges in visual object detection is the multi-scale issue, and many approaches have been proposed to tackle it. In this article, we propose a novel neck that performs effective fusion of multi-scale features for a single-stage object detector. This neck, named the deformable convolution and path aggregation network (DePAN), integrates a path aggregation network with a deformable convolution block added to the feature fusion branch to improve the flexibility of feature point sampling. The deformable convolution block is implemented by repeatedly stacking a deformable convolution cell. The DePAN neck is pluggable and can be easily applied to various object detection models. We apply the proposed neck to the baseline models YOLOv6-N and YOLOv6-T, and test the improved models on the COCO2017 and PASCAL VOC2012 datasets, as well as a medical image dataset. The experimental results verify the effectiveness and applicability of the proposed neck in real-world object detection.
format Article
id doaj-art-4d01f902ea1146aaab68ea05d0762494
institution Kabale University
issn 2376-5992
language English
publishDate 2025-08-01
publisher PeerJ Inc.
record_format Article
series PeerJ Computer Science
spelling doaj-art-4d01f902ea1146aaab68ea05d0762494
2025-08-20T15:05:19Z
eng
PeerJ Inc.
PeerJ Computer Science
2376-5992
2025-08-01
11
e3083
10.7717/peerj-cs.3083
A path aggregation network with deformable convolution for visual object detection
Chengming Rao (0): College of Internet of Things Technology, Wuxi Institute of Technology, Wuxi, Jiangsu, China
Zunhao Hu (1): School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu, China
QiMing Zhao (2): School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu, China
Min Shan (3): School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu, China
Li Mao (4): School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu, China
https://peerj.com/articles/cs-3083.pdf
DePAN architecture
Feature fusion
Object detection
title A path aggregation network with deformable convolution for visual object detection
topic DePAN architecture
Feature fusion
Object detection
url https://peerj.com/articles/cs-3083.pdf