Reconstructing Domain-Specific Features for Unsupervised Domain-Adaptive Object Detection

Bibliographic Details
Main Authors: Shuai Dong, Kang Deng, Kun Zou
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Information
Online Access: https://www.mdpi.com/2078-2489/16/6/439
Description
Summary: Unsupervised domain adaptation (UDA) effectively transfers knowledge learned from a labeled source domain to an unlabeled target domain. The teacher–student framework, which generates pseudo-labels for target domain samples and uses them for pseudo-supervised training, enables self-training and improves generalization in UDA object detection. However, for one-stage detection models, pseudo-labels are unreliable when positive and negative samples are imbalanced. This may lead the model to overfit the source domain and overlook important target-domain information. In this work, we propose a novel domain-specific student–teacher framework to address this issue. The innovations of the proposed framework can be summarized in two aspects. First, we employ two domain-specific heads (DSHs) in the student model to handle inputs from the source domain and the target domain separately. These two heads are optimized independently with samples from their respective domains. This design reduces the impact of unreliable pseudo-labels and fully leverages information unique to the target domain. Second, we introduce an auxiliary reconstruction branch, named the multi-scale mask adversarial alignment (MMAA) module, into the teacher–student framework. The MMAA is tasked with reconstructing randomly masked multi-scale features of the source domain, which enhances the student model’s semantic representation capability and facilitates the generation of high-quality pseudo-labels. Experimental results on six diverse cross-domain scenarios demonstrate the effectiveness of our framework.
ISSN: 2078-2489
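The two mechanisms in the summary above can be sketched at a high level: routing source and target samples to separate heads so unreliable pseudo-labels never update the source head, and randomly masking multi-scale features as the MMAA reconstruction branch would before reconstructing them. This is a minimal illustrative sketch only; the class and function names are hypothetical and do not come from the authors' code, and real heads would be parameterized detection branches rather than sample buffers.

```python
import random

class DomainSpecificDetector:
    """Toy stand-in for a student model with two domain-specific heads (DSHs)."""

    def __init__(self):
        # Each head is represented here by its own list of training samples;
        # the key point is that the two heads never share updates.
        self.heads = {"source": [], "target": []}

    def train_step(self, sample, domain):
        # Route the sample to the head matching its domain, so each head
        # is optimized independently on its own domain's samples.
        self.heads[domain].append(sample)

def random_mask(features, mask_ratio=0.5, rng=None):
    """Randomly zero out a fraction of each scale's feature values,
    mimicking the masking that the MMAA branch is trained to reconstruct."""
    rng = rng or random.Random(0)
    return [
        [0.0 if rng.random() < mask_ratio else v for v in scale]
        for scale in features
    ]

det = DomainSpecificDetector()
det.train_step({"img": "s0", "label": "ground_truth"}, "source")  # labeled source
det.train_step({"img": "t0", "label": "pseudo"}, "target")        # pseudo-labeled target

multi_scale = [[1.0] * 8, [1.0] * 4]  # two feature scales (hypothetical sizes)
masked = random_mask(multi_scale, mask_ratio=0.5)
```

The masked features would then be fed to the reconstruction branch, whose loss encourages the student to recover semantic content from partial observations.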