MtAD-Net: Multi-Threshold Adaptive Decision Net for Unsupervised Synthetic Aperture Radar Ship Instance Segmentation
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-02-01 |
| Series: | Remote Sensing |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2072-4292/17/4/593 |
| Summary: | In synthetic aperture radar (SAR) images, pixel-level ground truth (GT) is scarce compared with bounding box (BBox) annotations. Exploring unsupervised instance segmentation methods that convert BBox-level annotations into pixel-level GT is therefore of great significance in the SAR field. However, previous unsupervised segmentation methods perform poorly on SAR images because of speckle noise, low imaging accuracy, and gradual pixel transitions at the boundaries between targets and background, which result in unclear edges. In this paper, we propose a Multi-threshold Adaptive Decision Network (MtAD-Net), which segments SAR ship images under unsupervised conditions and demonstrates good performance. Specifically, we design a Multiple CFAR Threshold-extraction Module (MCTM) to obtain a threshold vector from a false alarm rate vector. A Local U-shape Feature Extractor (LUFE) projects each pixel of the SAR image into a high-dimensional feature space, and a Global Vision Transformer Encoder (GVTE) extracts global features. The global features are then used to obtain a probability vector, which gives the probability of each CFAR threshold. We further propose a PLC-Loss to adaptively reduce the feature distance between pixels of the same category and increase the feature distance between pixels of different categories. Moreover, we design a label smoothing module to denoise the output of MtAD-Net. Experimental results on the dataset show that MtAD-Net outperforms traditional and existing deep learning-based unsupervised segmentation methods in terms of pixel accuracy, kappa coefficient, mean intersection over union, frequency weighted intersection over union, and F1-score. |
|---|---|
| ISSN: | 2072-4292 |
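
The summary describes mapping a false alarm rate vector to a CFAR threshold vector and then weighting the thresholds with a probability vector. The sketch below illustrates that idea only and is not the authors' MCTM. It assumes exponentially distributed clutter intensity, for which P(X > T) = exp(-T / mean) gives T = -mean * ln(Pfa). The function names, the toy image, the clutter mean, and the argmax decision rule are all hypothetical; the record does not state which clutter model or decision rule the paper uses.

```python
import math


def cfar_thresholds(clutter_mean, false_alarm_rates):
    """Return one threshold per false alarm rate.

    Assumption: clutter intensity is exponentially distributed, so
    P(X > T) = exp(-T / clutter_mean) = pfa  =>  T = -clutter_mean * ln(pfa).
    """
    thresholds = []
    for pfa in false_alarm_rates:
        if not 0.0 < pfa < 1.0:
            raise ValueError("false alarm rate must lie in (0, 1)")
        thresholds.append(-clutter_mean * math.log(pfa))
    return thresholds


def select_threshold(thresholds, probabilities):
    """Pick the threshold with the highest probability (hypothetical rule)."""
    if len(thresholds) != len(probabilities):
        raise ValueError("thresholds and probabilities must have equal length")
    best = max(range(len(thresholds)), key=lambda i: probabilities[i])
    return thresholds[best]


def segment(image, threshold):
    """Label a pixel 1 (target) if it exceeds the threshold, else 0."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


# Toy 3x3 intensity patch: a bright "ship" on a dark background.
image = [
    [0.2, 0.3, 0.1],
    [0.4, 9.0, 8.5],
    [0.3, 7.9, 0.2],
]
clutter_mean = 0.25               # assumed background mean, not estimated here
pfa_vector = [1e-1, 1e-2, 1e-3]   # hypothetical false alarm rate vector
probabilities = [0.2, 0.5, 0.3]   # stand-in for a network's probability vector

thresholds = cfar_thresholds(clutter_mean, pfa_vector)
chosen = select_threshold(thresholds, probabilities)
mask = segment(image, chosen)
```

Under this model, a lower false alarm rate always yields a higher threshold, so the threshold vector is ordered whenever the rate vector is. In the paper the probability vector comes from the global features; here it is hard-coded purely for illustration.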