MHOE-DETR: A Ship Detection Method for Small and Fuzzy Targets Based on Satellite Remote Sensing Image Data

Bibliographic Details
Main Authors: Zhuhua Hu, Xiyu Fan, Yaochi Zhao, Wei Wu, Jie Liu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/11096611/
Description
Summary: Pinpointing elusive and minor target vessels from satellite-based images is recognized as a considerable obstacle in the specialized areas of computer vision and the examination of remote sensing imagery. The majority of existing methods are based on the YOLO architecture, which relies on manually designed anchor points and nonmaximum suppression (NMS) postprocessing. The detection of small targets in a single scene, the phenomenon of “catastrophic forgetting” due to the streaming of data, and the issue of an “information bottleneck” present significant challenges in this field. To address these issues, we propose the following solutions. A hybrid explicit spatial prior MH-Net network based on Manhattan distance is designed. By decomposing the self-attention matrix and the spatial attenuation matrix, the spatial correlations of different directions and positions are captured, thus effectively alleviating the problem of catastrophic forgetting. We propose an online convolutional reparameterization efficient layer aggregation networks cross-stage fusion network. Through equivalent transformations, the complex network architecture is compressed into a single linear layer. The network groups and processes input features in parallel, integrating both low-dimensional and high-dimensional features to alleviate the information bottleneck problem. The prediction head of the model uses the DINO decoder and applies contrastive denoising to remove useless prediction boxes. This allows the proposed MHOE-DETR model to avoid thresholding and NMS, reducing the model’s computational complexity. The experimental results demonstrate that the MHOE-DETR algorithm, designed for this purpose, markedly enhances the detection performance of small and indistinct targets in private remote sensing datasets. The average accuracy, recall, and AP50 reached 96.3%, 91.4%, and 95.4%, respectively, while maintaining a low GFLOPS value (54.4 G) and parameter count (77.3 M).
These findings offer substantial technical justification for the implementation of sea area management and maritime safety monitoring strategies.
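The summary mentions a spatial attenuation matrix based on Manhattan distance that is decomposed by direction, but it does not give the formula. The following is only a minimal sketch under an assumed form: an exponential decay `gamma ** distance`, where `gamma` and all function names are hypothetical and not taken from the paper. It illustrates why such a mask can be split into a vertical part and a horizontal part: because the Manhattan distance is a sum of per-axis distances, the 2D mask equals the Kronecker product of two 1D masks.

```python
def manhattan_decay(height, width, gamma):
    """Full decay mask over a height x width grid, flattened row-major.

    Entry [n][m] is gamma ** (Manhattan distance between cells n and m).
    The exponential form and gamma are assumptions for illustration.
    """
    coords = [(y, x) for y in range(height) for x in range(width)]
    return [
        [gamma ** (abs(y1 - y2) + abs(x1 - x2)) for (y2, x2) in coords]
        for (y1, x1) in coords
    ]


def axis_decay(length, gamma):
    """1D decay mask along a single axis: gamma ** |i - j|."""
    return [[gamma ** abs(i - j) for j in range(length)] for i in range(length)]


def combine_axes(row_mask, col_mask):
    """Kronecker product of the vertical and horizontal 1D masks."""
    h, w = len(row_mask), len(col_mask)
    return [
        [row_mask[y1][y2] * col_mask[x1][x2] for y2 in range(h) for x2 in range(w)]
        for y1 in range(h)
        for x1 in range(w)
    ]


# Usage: build the mask both ways for a small 3 x 4 grid.
full_mask = manhattan_decay(3, 4, 0.5)
factored_mask = combine_axes(axis_decay(3, 0.5), axis_decay(4, 0.5))
```

Under this assumption, the two small per-axis masks carry the same information as the full mask, which is one plausible reading of capturing correlations "of different directions" separately.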
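The summary states that, through equivalent transformations, a complex architecture is compressed into a single linear layer. The record does not describe the actual branch structure, so the sketch below uses a deliberately simplified stand-in: two parallel linear branches, the second with a scale factor, folded into one weight matrix and one bias. All names (`multi_branch`, `fuse_branches`, `scale`) are hypothetical and chosen for illustration only.

```python
def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def multi_branch(x, w1, b1, w2, b2, scale):
    """Training-time form: two parallel linear branches, summed.

    The second branch is multiplied by a scalar `scale` (an assumed,
    simplified stand-in for a per-branch scaling layer).
    """
    y1 = matvec(w1, x)
    y2 = matvec(w2, x)
    return [a + p + scale * (c + q) for a, p, c, q in zip(y1, b1, y2, b2)]


def fuse_branches(w1, b1, w2, b2, scale):
    """Fold both branches into a single equivalent weight and bias."""
    fused_w = [
        [a + scale * c for a, c in zip(row1, row2)] for row1, row2 in zip(w1, w2)
    ]
    fused_b = [p + scale * q for p, q in zip(b1, b2)]
    return fused_w, fused_b


def single_layer(x, weights, bias):
    """Inference-time form: one linear layer."""
    return [y + b for y, b in zip(matvec(weights, x), bias)]
```

Because every operation in the branches is linear, the fused layer gives the same output as the multi-branch form for any input. The same linearity argument applies to convolutions, but the specific branches used in the paper are not given in this record.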
ISSN: 1939-1404
2151-1535