Integrating unsupervised domain adaptation and SAM technologies for image semantic segmentation: a case study on building extraction from high-resolution remote sensing images

Bibliographic Details
Main Authors: Mengyuan Yang, Rui Yang, Min Wang, Haiyan Xu, Gang Xu
Format: Article
Language: English
Published: Taylor & Francis Group 2025-08-01
Series: International Journal of Digital Earth
Subjects:
Online Access: https://www.tandfonline.com/doi/10.1080/17538947.2025.2491108
Description
Summary: Deep learning (DL) has become the mainstream technique for extracting information from high-spatial-resolution (HSR) imagery because of its powerful feature representation capabilities. However, DL models rely heavily on accurate annotations, which limits their generalizability to new data. Recently, the Segment Anything Model (SAM) has significantly advanced image segmentation and shows great potential for remote sensing applications. To address this limitation and explore the potential of SAM for HSR imagery, we propose a novel semantic segmentation method that combines SAM with unsupervised domain adaptation (UDA) to enhance model performance on unlabeled HSR imagery. Specifically, we propose a pseudolabel refinement module that integrates SAM and UDA techniques. The refined pseudolabels are then used to train the proposed self-training and SAM-based network (STSAMNet) for semantic segmentation; this network embeds two types of adapter layers to adapt the capabilities of SAM to HSR imagery. During training, an iterative training strategy and a noise-weighted loss are applied to further improve the accuracy of the model on unlabeled images. Compared with other UDA methods, our method achieves the best performance in terms of F1 and mean intersection over union (mIoU) values.
ISSN: 1753-8947, 1753-8955
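
Illustrative note: the abstract mentions a noise-weighted loss used when training STSAMNet on refined pseudolabels. The Python sketch below shows one plausible form of such a loss, a per-pixel cross-entropy scaled by a pseudolabel confidence map. The class name NoiseWeightedCE, the confidence input pseudo_conf, and the weighting scheme are assumptions for illustration only, not the authors' implementation; see the article at the Online Access link for the actual formulation.

# Minimal sketch (PyTorch) of a noise-weighted cross-entropy loss:
# pixels whose pseudolabels are likely noisy (low confidence) contribute less.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseWeightedCE(nn.Module):
    def __init__(self, ignore_index: int = 255):
        super().__init__()
        self.ignore_index = ignore_index

    def forward(self, logits, pseudo_labels, pseudo_conf):
        # logits: (B, C, H, W) predictions; pseudo_labels: (B, H, W) long tensor;
        # pseudo_conf: (B, H, W) weights in [0, 1], e.g. agreement between UDA and SAM masks.
        loss = F.cross_entropy(
            logits, pseudo_labels, reduction="none", ignore_index=self.ignore_index
        )
        valid = (pseudo_labels != self.ignore_index).float()
        weighted = loss * pseudo_conf * valid
        return weighted.sum() / valid.sum().clamp(min=1.0)

if __name__ == "__main__":
    criterion = NoiseWeightedCE()
    logits = torch.randn(2, 2, 64, 64)            # binary building / background logits
    labels = torch.randint(0, 2, (2, 64, 64))     # refined pseudolabels
    conf = torch.rand(2, 64, 64)                  # hypothetical per-pixel confidence
    print(criterion(logits, labels, conf).item())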