Enhancing Cross-Domain Remote Sensing Scene Classification by Multi-Source Subdomain Distribution Alignment Network

Bibliographic Details
Main Authors: Yong Wang, Zhehao Shu, Yinzhi Feng, Rui Liu, Qiusheng Cao, Danping Li, Lei Wang
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/17/7/1302
Description
Summary: Multi-source domain adaptation (MSDA) in remote sensing (RS) scene classification has recently gained significant attention in the visual recognition community. It leverages multiple well-labeled source domains to train a model capable of achieving strong generalization on the target domain with little to no labeled data from the target domain. However, the distribution shifts among multiple source domains make it more challenging to align the distributions between the target domain and all source domains concurrently. Moreover, relying solely on global alignment risks losing fine-grained information for each class, especially in the task of RS scene classification. To alleviate these issues, we present a Multi-Source Subdomain Distribution Alignment Network (MSSDANet), which introduces novel network structures and loss functions for subdomain-oriented MSDA. By adopting a two-level feature extraction strategy, this model attains better global alignment between the target domain and multiple source domains, as well as alignment at the subdomain level. First, it includes a pre-trained convolutional neural network (CNN) as a common feature extractor to fully exploit the shared invariant features across one target and multiple source domains. Second, a dual-domain feature extractor is used after the common feature extractor, which maps the data from each pair of target and source domains to a specific dual-domain feature space and performs subdomain alignment. Finally, a dual-domain feature classifier is employed to make predictions by averaging the outputs from multiple classifiers. Alongside this network, two novel loss functions are proposed to boost classification performance. Discriminant Semantic Transfer (DST) loss is exploited to force the model to effectively extract semantic information among target and source domain samples, while Class Correlation (CC) loss is introduced to reduce feature confusion between different classes within the target domain. Notably, MSSDANet performs unsupervised domain adaptation: no label information from the target domain is required during training. Extensive experiments on four common RS image datasets demonstrate that the proposed method achieves state-of-the-art performance for cross-domain RS scene classification. Specifically, in the dual-source and three-source settings, MSSDANet outperforms the second-best algorithm in terms of overall accuracy (OA) by 2.2% and 1.6%, respectively.
ISSN: 2072-4292
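
The prediction path described in the summary (a common feature extractor, one dual-domain branch per source domain, and an average over the branch classifiers) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the random linear layers stand in for the pre-trained CNN and the trained branches, and all names and sizes (`W_common`, `W_dual`, `W_cls`, `predict_target`, `NUM_SOURCES`) are hypothetical choices made for illustration. The training losses (subdomain alignment, DST, CC) are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for illustration.
NUM_SOURCES = 2    # number of labeled source domains
IN_DIM = 32        # input feature size (stand-in for an image)
COMMON_DIM = 16    # output size of the common feature extractor
DUAL_DIM = 8       # output size of each dual-domain feature extractor
NUM_CLASSES = 5    # number of scene classes


def relu(x):
    return np.maximum(x, 0.0)


def softmax(z):
    # Subtract the row-wise maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Assumption: random weights replace the pre-trained CNN and the
# trained per-source branches; a real model would learn these.
W_common = rng.normal(size=(IN_DIM, COMMON_DIM))
W_dual = [rng.normal(size=(COMMON_DIM, DUAL_DIM)) for _ in range(NUM_SOURCES)]
W_cls = [rng.normal(size=(DUAL_DIM, NUM_CLASSES)) for _ in range(NUM_SOURCES)]


def predict_target(x):
    """Average class probabilities over all source-specific branches."""
    shared = relu(x @ W_common)            # common feature extractor
    branch_probs = []
    for W_d, W_c in zip(W_dual, W_cls):
        dual = relu(shared @ W_d)          # dual-domain feature extractor
        branch_probs.append(softmax(dual @ W_c))  # branch classifier
    return np.mean(branch_probs, axis=0)   # average over branches


x_target = rng.normal(size=(4, IN_DIM))    # four unlabeled target samples
probs = predict_target(x_target)
labels = probs.argmax(axis=1)
```

Averaging probabilities rather than logits keeps each branch's output on the same scale before combining; whether the paper averages probabilities or logits is not stated in the summary, so that choice is an assumption here.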