FldtMatch: Improving Unbalanced Data Classification via Deep Semi-Supervised Learning with Self-Adaptive Dynamic Threshold

Bibliographic Details
Main Authors: Xin Wu, Jingjing Xu, Kuan Li, Jianping Yin, Jian Xiong
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Mathematics
Subjects: deep semi-supervised learning; unbalanced data; classification; dynamic threshold
Online Access: https://www.mdpi.com/2227-7390/13/3/392
author Xin Wu
Jingjing Xu
Kuan Li
Jianping Yin
Jian Xiong
collection DOAJ
description Among the many methods of deep semi-supervised learning (DSSL), the holistic method successfully combines ideas from other methods, such as consistency regularization and pseudo-labeling. This method typically introduces a threshold to make use of unlabeled data: if the highest predicted value for an unlabeled sample exceeds the threshold, the associated class is designated as the sample’s pseudo-label. However, current methods use fixed or dynamic thresholds that disregard the varying learning difficulty across categories in unbalanced datasets. To address this issue, we first design Cumulative Effective Labeling (CEL) to reflect the learning difficulty of a particular class. Unlike previous approaches, CEL uses both effective pseudo-labels and ground truth, which together influence the model’s capacity to acquire category knowledge. Based on CEL, we then propose a simple but effective way to compute the threshold, the Self-adaptive Dynamic Threshold (SDT). SDT requires a single hyperparameter to adapt to various scenarios, eliminating the need for a separate threshold-adjustment scheme for each case. It uses a mapping function to address the differing learning difficulty of categories in an unbalanced image dataset, which adversely affects dynamic thresholding. Finally, we propose FldtMatch, a deep semi-supervised method built on SDT. Theoretical analysis and extensive experiments show that FldtMatch can overcome the negative impact of unbalanced data. Regardless of the choice of backbone network, our method achieves the best results on multiple datasets. The maximum improvement in macro F1-score is about 5.6% on DFUC2021 and 2.2% on ISIC2018.
format Article
id doaj-art-ffbf8f0901ae48a5acde0e0e69976f9d
institution DOAJ
issn 2227-7390
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Mathematics
title FldtMatch: Improving Unbalanced Data Classification via Deep Semi-Supervised Learning with Self-Adaptive Dynamic Threshold
topic deep semi-supervised learning
unbalanced data
classification
dynamic threshold
url https://www.mdpi.com/2227-7390/13/3/392
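
The thresholding rule in the description (a class becomes the pseudo-label only when the highest predicted value exceeds the threshold) can be illustrated with a minimal sketch. This is not the authors' SDT: the record does not give the CEL statistic or the mapping function, so the function name `pseudo_labels`, the sample probabilities, and the per-class threshold values below are all hypothetical and chosen by hand, purely to show why one fixed threshold may starve harder classes of pseudo-labels.

```python
from typing import List, Optional, Sequence


def pseudo_labels(
    probs: Sequence[Sequence[float]],
    thresholds: Sequence[float],
) -> List[Optional[int]]:
    """Assign a pseudo-label to each unlabeled sample, or None if rejected.

    probs[i][c] is the predicted probability of class c for sample i.
    thresholds[c] is the confidence threshold for class c. The values used
    below are hypothetical; the paper's SDT mapping is not reproduced here.
    """
    labels: List[Optional[int]] = []
    for row in probs:
        best = max(range(len(row)), key=lambda c: row[c])  # argmax class
        # Keep the sample only if its top confidence exceeds that class's threshold.
        labels.append(best if row[best] > thresholds[best] else None)
    return labels


# Hypothetical predictions for three unlabeled samples over three classes.
probs = [
    [0.97, 0.02, 0.01],  # confident prediction for class 0
    [0.10, 0.80, 0.10],  # moderately confident prediction for class 1
    [0.40, 0.35, 0.25],  # uncertain prediction
]

fixed = pseudo_labels(probs, [0.95, 0.95, 0.95])      # one fixed threshold for all classes
per_class = pseudo_labels(probs, [0.95, 0.75, 0.75])  # hand-picked lower bar for classes 1 and 2

print(fixed)      # [0, None, None]
print(per_class)  # [0, 1, None]
```

With the fixed threshold, only the class-0 sample receives a pseudo-label; with the hand-set per-class thresholds, the class-1 sample is also used. A class-aware dynamic threshold of the kind the description outlines would set such values automatically rather than by hand.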