CSNet: A Remote Sensing Image Semantic Segmentation Network Based on Coordinate Attention and Skip Connections
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Remote Sensing |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2072-4292/17/12/2048 |
| Summary: | In recent years, the continuous development of deep learning has significantly advanced its application in the field of remote sensing. However, the semantic segmentation of high-resolution remote sensing images remains challenging due to the presence of multi-scale objects and intricate spatial details, which often leads to the loss of critical information during segmentation. To address this issue and enable fast and accurate segmentation of remote sensing images, we improve upon SegNet and name the enhanced model CSNet. CSNet is built upon the SegNet architecture and incorporates a coordinate attention (CA) mechanism, which enables the network to focus on salient features and capture global spatial information, thereby improving segmentation accuracy and facilitating the recovery of spatial structures. Furthermore, skip connections are introduced between the encoder and decoder to transfer low-level features directly to the decoder. This promotes the fusion of semantic information at different levels, enhances the recovery of fine-grained details, and optimizes the gradient flow during training, effectively mitigating the vanishing gradient problem and improving training efficiency. Additionally, a hybrid loss function combining weighted cross-entropy and Dice loss is employed. To address class imbalance, several categories within the dataset are merged, and samples with an excessively high proportion of background pixels are removed. These strategies significantly enhance segmentation performance, particularly for small-sample classes. Experimental results on the Five-Billion-Pixels dataset demonstrate that, while introducing only a modest increase in parameters compared to SegNet, CSNet achieves superior segmentation performance in terms of overall classification accuracy, boundary delineation, and detail preservation, outperforming established methods such as U-Net, FCN, DeepLabv3+, SegNet, ViT, HRNet, and BiFormer. |
|---|---|
| ISSN: | 2072-4292 |
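The summary above names three ingredients that are straightforward to prototype: a coordinate attention block inserted into a SegNet-style encoder–decoder, skip connections carrying encoder features to the decoder, and a hybrid loss combining weighted cross-entropy with Dice loss. The sketch below is a minimal PyTorch illustration of the first and last of these; it follows the generic coordinate attention formulation (Hou et al., 2021) rather than the authors' exact CSNet code, and the module names, reduction ratio, and class count are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CoordinateAttention(nn.Module):
    """Generic coordinate attention: pool along height and width separately so the
    attention map keeps positional information in one axis while aggregating the other."""
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)          # reduction ratio is an assumed default
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        pooled_h = x.mean(dim=3, keepdim=True)                        # (N, C, H, 1)
        pooled_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)    # (N, C, W, 1)
        y = torch.cat([pooled_h, pooled_w], dim=2)                    # (N, C, H+W, 1)
        y = F.relu(self.bn1(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        att_h = torch.sigmoid(self.conv_h(y_h))                       # (N, C, H, 1)
        att_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))   # (N, C, 1, W)
        return x * att_h * att_w                                      # direction-aware reweighting


class HybridLoss(nn.Module):
    """Weighted cross-entropy plus Dice loss, as described in the abstract."""
    def __init__(self, class_weights=None, dice_weight: float = 1.0, eps: float = 1e-6):
        super().__init__()
        self.ce = nn.CrossEntropyLoss(weight=class_weights)
        self.dice_weight = dice_weight
        self.eps = eps

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        ce_loss = self.ce(logits, target)
        probs = torch.softmax(logits, dim=1)
        one_hot = F.one_hot(target, probs.shape[1]).permute(0, 3, 1, 2).float()
        inter = (probs * one_hot).sum(dim=(0, 2, 3))
        union = probs.sum(dim=(0, 2, 3)) + one_hot.sum(dim=(0, 2, 3))
        dice = 1.0 - ((2 * inter + self.eps) / (union + self.eps)).mean()
        return ce_loss + self.dice_weight * dice


if __name__ == "__main__":
    # Toy usage: reweight a decoder feature map, then score predictions for 6 merged classes
    # (the class count and tensor sizes here are illustrative, not taken from the paper).
    feat = torch.randn(2, 64, 128, 128)
    refined = CoordinateAttention(64)(feat)
    logits = torch.randn(2, 6, 128, 128)
    labels = torch.randint(0, 6, (2, 128, 128))
    loss = HybridLoss(dice_weight=1.0)(logits, labels)
    print(refined.shape, loss.item())
```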