MSFE-Net: Multi-Scale Feature Enhancement Network for Remote Sensing Object Detection


Bibliographic Details
Main Authors: Kai Yuan, Xing Li, Yaoyao Ren, Lianpeng Zhang, Wei Liu, Erzhu Li
Format: Article
Language: English
Published: Taylor & Francis Group 2025-12-01
Series: Applied Artificial Intelligence
Online Access: https://www.tandfonline.com/doi/10.1080/08839514.2025.2514324
Description
Summary: Dense object detection in remote sensing is challenging because neighboring objects have similar features, which leads to redundant boxes and localization errors. To address this, we propose MSFE-Net, a multi-scale feature enhancement network designed to suppress background interference and detect adjacent similar targets. Our Cascading Feature Fusion Module (CFFM) and Weighted Dilated Convolutional Pyramid (WDCP) enhance shallow texture features and deep semantic features, respectively. To further reduce redundant target boxes, the Weighted Feature Fusion Enhancement Module (WFFEM) learns differential and fused features across multiple branches, thereby enriching target contextual features and suppressing background noise. Finally, the Multi-scale Feature Stairstep-upsampling Fusion Module (MFSFM) refines high-resolution texture and semantic features for targets across scales, applying a stairstep-upsampling fusion strategy to the outputs of the CFFM, WDCP, and WFFEM. Experimental results on the NWPU VHR-10 dataset show that MSFE-Net achieves 92.8% mAP50 and 62.6% mAP75, outperforming state-of-the-art methods such as YOLOv6 and YOLOv7. Compared with other models, MSFE-Net balances parameter count against computational cost, with a parameter count slightly higher than that of YOLOv5s and YOLOv7-tiny and GFLOPs in a moderately high range. These results underscore MSFE-Net’s efficacy in balancing accuracy with computational demands, making it a practical option for dense object detection in remote sensing.
ISSN: 0883-9514, 1087-6545
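
The summary reports accuracy as mAP50 and mAP75. In common detection benchmarks these denote mean average precision with a predicted box counted as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5 or 0.75, respectively. The sketch below only illustrates that matching criterion; it is not taken from the article, and the function names and example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, truth, threshold):
    """True when the prediction overlaps the ground truth enough to count."""
    return iou(pred, truth) >= threshold


# Hypothetical boxes: the prediction is shifted 2 px to the right.
truth = (0, 0, 10, 10)
pred = (2, 0, 12, 10)

print(round(iou(pred, truth), 3))   # overlap 80 / union 120
print(is_match(pred, truth, 0.50))  # counted as correct at the mAP50 threshold
print(is_match(pred, truth, 0.75))  # rejected at the stricter mAP75 threshold
```

The same prediction passes at 0.5 and fails at 0.75, which is why mAP75 figures are typically lower than mAP50 figures and are more sensitive to localization errors between adjacent objects.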