Ship-DETR: A Transformer-Based Model for Efficient Ship Detection in Complex Maritime Environments

Bibliographic Details
Main Authors: Yi Wang, Xiang Li
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10960349/
Description
Summary: With the widespread application of ship detection in civilian and military domains, ship detection in optical remote sensing images has become an important research direction. The core task is to determine the presence of ship targets in remote sensing images and to perform detection, classification, and localization. However, ship targets in remote sensing images exhibit multi-scale characteristics, complex backgrounds, and varying meteorological conditions, leading to challenges such as low detection accuracy, false positives, and missed detections, particularly for small-scale ships. To address these issues, we propose an effective ship detection algorithm, the Ship Detection Transformer (Ship-DETR). First, we introduce the high-low frequency (HiLo) attention into the intra-scale feature interaction module to enhance the extraction of both high- and low-frequency features, reduce computational complexity, and improve detection performance. To strengthen the model's multi-scale fusion capability for ship features of different scales, we incorporate the bidirectional feature pyramid network (BiFPN) to optimize cross-scale feature fusion and add the $S_{2}$ features to further enhance the representation of small-scale ship features. Additionally, we replace traditional downsampling operations with the Haar wavelet-based downsampling (HWD) module, which reduces the spatial resolution of feature maps while preserving edge and texture details as much as possible, thereby minimizing feature information loss. Experimental results on the LEVIR-Ship dataset and the HRSC2016-MS dataset demonstrate that Ship-DETR improves the ship detection accuracy while maintaining a small number of parameters and computational load.
Ship-DETR achieved a mean average precision of 75.7% on LEVIR-Ship and 73.1% on HRSC2016-MS, improvements of 3.5% and 1.9%, respectively, over the baseline model. Compared with other mainstream object detection algorithms, the proposed model shows superior detection performance.
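
The Haar wavelet-based downsampling idea mentioned in the summary can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `haar_dwt2` and `haar_downsample`, the channel-first array layout, and the plain linear channel mixing (standing in for whatever learned layers the HWD module uses) are all assumptions made for illustration. The sketch shows only the core property: a single-level Haar transform halves height and width while moving the discarded spatial detail into extra channels instead of throwing it away.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar transform of an array shaped (C, H, W).

    H and W are assumed to be even. Returns four sub-bands, each (C, H/2, W/2).
    """
    a = x[:, 0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local average (low-frequency content)
    lh = (a + b - c - d) / 2.0  # top row minus bottom row
    hl = (a - b + c - d) / 2.0  # left column minus right column
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_downsample(x, weight):
    """Downsample x (C, H, W) by 2 and mix the 4*C sub-band channels.

    weight is a hypothetical (C_out, 4*C) matrix acting like a 1x1 convolution;
    in a trained network it would be learned.
    """
    bands = np.concatenate(haar_dwt2(x), axis=0)  # (4*C, H/2, W/2)
    return np.einsum("oc,chw->ohw", weight, bands)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.standard_normal((3, 8, 6))
    mix = rng.standard_normal((5, 12))
    print(haar_downsample(feature_map, mix).shape)
```

Because the transform used here is orthonormal, the four sub-bands together carry exactly the same energy as the input, and the input can be reconstructed from them. That is the sense in which this style of downsampling loses less information than strided convolution or pooling before the channel-mixing step.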
ISSN: 2169-3536