NST-YOLO11: ViT Merged Model with Neuron Attention for Arbitrary-Oriented Ship Detection in SAR Images

Bibliographic Details
Main Authors: Yiyang Huang, Di Wang, Boxuan Wu, Daoxiang An
Format: Article
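Language: English
Published: MDPI AG 2024-12-01
Series: Remote Sensing
Subjects: neural network; oriented detection; remote sensing; ship detection; synthetic aperture radar (SAR)
Online Access: https://www.mdpi.com/2072-4292/16/24/4760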
collection DOAJ
description Ships are distributed very differently in nearshore and offshore areas, vary widely in size, and appear at arbitrary orientations at sea. As a result, traditional computer vision detection models struggle to match, on SAR image ship detection, the performance they achieve on optical images. This paper proposes an oriented ship target detection model based on the YOLO11 algorithm, Neural Swin Transformer-YOLO11 (NST-YOLO11). The proposed model integrates an improved Swin Transformer module called Neural Swin-T and a Cross-Stage connected Spatial Pyramid Pooling-Fast (CS-SPPF) module. A unified spatial/channel attention mechanism with neuron suppression in the spatial domain suppresses the information redundancy generated by the local window self-attention module in the Swin Transformer Block. Furthermore, the idea of cross-stage partial (CSP) connections is applied to the fast spatial pyramid pooling (SPPF) module, improving the retention of information during multi-scale feature extraction. Experiments on the Rotated Ship Detection Dataset in SAR Images (RSDD-SAR) and the SAR Ship Detection Dataset (SSDD+), together with comparisons against other oriented detection models, demonstrate that the proposed NST-YOLO11 achieves state-of-the-art detection performance and shows strong generalization ability and robustness.
id doaj-art-038d3ad0d8154a38bba778f81ffe3faf
institution DOAJ
issn 2072-4292
spelling doaj-art-038d3ad0d8154a38bba778f81ffe3faf; 2025-08-20T02:51:07Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2024-12-01; 16; 24; 4760; 10.3390/rs16244760
Yiyang Huang, Di Wang, Boxuan Wu, Daoxiang An: College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China
topic neural network
oriented detection
remote sensing
ship detection
synthetic aperture radar (SAR)