MFCANet: Multiscale Feature Context Aggregation Network for Oriented Object Detection in Remote-Sensing Images

Bibliographic Details
Main Authors: Honghui Jiang, Tingting Luo, Hu Peng, Guozheng Zhang
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10478508/
Description
Summary: Rotated object detection in remote sensing images is highly challenging because of the extensive fields of view and complex backgrounds. While Convolutional Neural Networks (CNNs) and Transformer networks have made progress in this area, little research addresses extracting and fusing features for small targets in complex backgrounds. To address this gap, we extend the RTMDet framework with three modules: the Focused Feature Context Aggregation Module, the Feature Context Information Enhancement Module, and the Multi-scale Feature Fusion Module. The Focused Feature Context Aggregation Module replaces the Spatial Pyramid Pooling Bottleneck (SPPFBottleneck) and extracts small-target features more effectively by focusing on contextual information. The Feature Context Information Enhancement Module enhances the model’s perception of multi-dimensional temporal and spatial information. Finally, the Multi-scale Feature Fusion Module combines the original features with the fused ones to prevent the loss of specific features during fusion. The proposed model, named the Multi-scale Feature Context Aggregation Network (MFCANet), was evaluated on four challenging remote sensing datasets (MAR20, SRSDD, HRSC, and DIOR-R). The experimental results show that our method outperforms the baseline models, improving mAP by 2.13%, 10.28%, 1.46%, and 1.13% on the MAR20, SRSDD, HRSC, and DIOR-R datasets, respectively.
ISSN: 2169-3536
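
Illustrative note: the summary describes combining the original features with the fused ones so that level-specific detail is not lost during fusion. The record gives no implementation details, so the following is only a toy sketch of that general idea, not the authors' module. It assumes single-channel feature maps stored as nested lists, a two-fold nearest-neighbour upsampling, and plain element-wise addition; the names `upsample2x`, `add`, and `fuse_with_skip` are hypothetical, and a real detector would use learned convolutions on multi-channel tensors.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map (rows x cols)


def upsample2x(fm: Grid) -> Grid:
    """Nearest-neighbour upsampling by a factor of 2 in both dimensions."""
    out: Grid = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out


def add(a: Grid, b: Grid) -> Grid:
    """Element-wise sum of two equally sized maps."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse_with_skip(pyramid: List[Grid]) -> List[Grid]:
    """Top-down fusion followed by a skip connection from the originals.

    `pyramid` is ordered fine -> coarse; each level is assumed to be half
    the height and width of the previous one (an assumption of this sketch).
    """
    fused: List[Grid] = [[] for _ in pyramid]
    fused[-1] = pyramid[-1]  # the coarsest level has nothing above it
    for i in range(len(pyramid) - 2, -1, -1):
        # Pass coarse context down to the next finer level.
        fused[i] = add(pyramid[i], upsample2x(fused[i + 1]))
    # Skip connection: add each original map back onto its fused map.
    return [add(orig, f) for orig, f in zip(pyramid, fused)]


if __name__ == "__main__":
    fine = [[1.0, 2.0], [3.0, 4.0]]
    coarse = [[10.0]]
    for level in fuse_with_skip([fine, coarse]):
        print(level)
```

The skip connection in the last line of `fuse_with_skip` is the part that corresponds to the summary's description: whatever the fusion step does to a level, the unmodified original map is still added to the output.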