FldtMatch: Improving Unbalanced Data Classification via Deep Semi-Supervised Learning with Self-Adaptive Dynamic Threshold
Among the many methods of deep semi-supervised learning (DSSL), the holistic method combines ideas from other methods, such as consistency regularization and pseudo-labeling, with great success. This method typically introduces a threshold to utilize unlabeled data. If the highest predictive value f...
| Main Authors: | Xin Wu, Jingjing Xu, Kuan Li, Jianping Yin, Jian Xiong |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-01-01 |
| Series: | Mathematics |
| Subjects: | deep semi-supervised learning; unbalanced data; classification; dynamic threshold |
| Online Access: | https://www.mdpi.com/2227-7390/13/3/392 |
| _version_ | 1849717681743200256 |
|---|---|
| author | Xin Wu; Jingjing Xu; Kuan Li; Jianping Yin; Jian Xiong |
| author_sort | Xin Wu |
| collection | DOAJ |
| description | Among the many methods of deep semi-supervised learning (DSSL), the holistic method combines ideas from other methods, such as consistency regularization and pseudo-labeling, with great success. This method typically introduces a threshold to utilize unlabeled data. If the highest predictive value from unlabeled data exceeds the threshold, the associated class is designated as the data’s pseudo-label. However, current methods utilize fixed or dynamic thresholds, disregarding the varying learning difficulties across categories in unbalanced datasets. To overcome these issues, in this paper, we first designed Cumulative Effective Labeling (CEL) to reflect a particular class’s learning difficulty. This approach differs from previous methods because it uses effective pseudo-labels and ground truth, collectively influencing the model’s capacity to acquire category knowledge. In addition, based on CEL, we propose a simple but effective way to compute the threshold, Self-adaptive Dynamic Threshold (SDT). It requires a single hyperparameter to adjust to various scenarios, eliminating the necessity for a unique threshold modification approach for each case. SDT utilizes a clever mapping function that can solve the problem of differential learning difficulty of various categories in an unbalanced image dataset that adversely affects dynamic thresholding. Finally, we propose a deep semi-supervised method with SDT called FldtMatch. Through theoretical analysis and extensive experiments, we have fully proven that FldtMatch can overcome the negative impact of unbalanced data. Regardless of the choice of the backbone network, our method achieves the best results on multiple datasets. The maximum improvement of the macro F1-Score metric is about 5.6% in DFUC2021 and 2.2% in ISIC2018. |
| format | Article |
| id | doaj-art-ffbf8f0901ae48a5acde0e0e69976f9d |
| institution | DOAJ |
| issn | 2227-7390 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Mathematics |
| spelling | Record ID: doaj-art-ffbf8f0901ae48a5acde0e0e69976f9d; Record timestamp: 2025-08-20T03:12:35Z; Language: eng; Publisher: MDPI AG; Journal: Mathematics; ISSN: 2227-7390; Published: 2025-01-01; Volume 13, Issue 3, Article 392; DOI: 10.3390/math13030392; Title: FldtMatch: Improving Unbalanced Data Classification via Deep Semi-Supervised Learning with Self-Adaptive Dynamic Threshold; Authors and affiliations: Xin Wu, Jingjing Xu, and Jian Xiong (Department of Artificial Intelligence and Data Science, Guangzhou Xinhua University, 248 Yanjiangxi Road, Machong Town, Dongguan 523133, China); Kuan Li and Jianping Yin (School of Computer Science and Technology, Dongguan University of Technology, Dongguan 523808, China); URL: https://www.mdpi.com/2227-7390/13/3/392; Keywords: deep semi-supervised learning; unbalanced data; classification; dynamic threshold |
| title | FldtMatch: Improving Unbalanced Data Classification via Deep Semi-Supervised Learning with Self-Adaptive Dynamic Threshold |
| topic | deep semi-supervised learning; unbalanced data; classification; dynamic threshold |
| url | https://www.mdpi.com/2227-7390/13/3/392 |
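
The description states that an unlabeled sample receives a pseudo-label when its highest predicted value exceeds a threshold, and that a single fixed threshold ignores differences in learning difficulty between classes. The following is a minimal sketch of that selection rule only. The function name `pseudo_labels`, the sample probabilities, and the per-class threshold values are hypothetical illustrations. The sketch does not implement CEL or the SDT mapping function, neither of which is specified in this record.

```python
from typing import List, Optional, Sequence


def pseudo_labels(
    probs: Sequence[Sequence[float]],
    thresholds: Sequence[float],
) -> List[Optional[int]]:
    """Return one pseudo-label per sample, or None when confidence is too low.

    probs[i][c] is the predicted probability of class c for unlabeled sample i.
    thresholds[c] is the confidence threshold applied to class c.
    """
    labels: List[Optional[int]] = []
    for row in probs:
        # Index of the class with the highest predicted probability.
        best = max(range(len(row)), key=lambda c: row[c])
        # Accept the class only if its probability exceeds that class's threshold.
        labels.append(best if row[best] > thresholds[best] else None)
    return labels


# Hypothetical predictions for three unlabeled samples over three classes.
probs = [
    [0.97, 0.02, 0.01],
    [0.30, 0.65, 0.05],
    [0.40, 0.35, 0.25],
]

# One fixed threshold shared by all classes.
fixed = pseudo_labels(probs, [0.95, 0.95, 0.95])

# Illustrative per-class thresholds (made-up values, not computed by SDT):
# lower thresholds for classes assumed to be harder to learn.
per_class = pseudo_labels(probs, [0.95, 0.60, 0.60])

print(fixed)      # [0, None, None]
print(per_class)  # [0, 1, None]
```

With the shared threshold, the second sample is discarded. With a lower threshold for class 1, it is kept as a pseudo-label, which is the kind of class-dependent behavior a per-class dynamic threshold is meant to allow.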