Emerging SMOTE and GAN Variants for Data Augmentation in Imbalance Machine Learning Tasks: A Review
Class imbalance is a pervasive challenge in real-world machine learning (ML) applications, where the minority class, often the class of interest, is significantly underrepresented. This imbalance can degrade model performance, result in misleading evaluation metrics, and complicate validation proces...
Saved in:
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
IEEE
2025-01-01
|
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11062634/ |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Class imbalance is a pervasive challenge in real-world machine learning (ML) applications, where the minority class, often the class of interest, is significantly underrepresented. This imbalance can degrade model performance, result in misleading evaluation metrics, and complicate validation processes. Two prominent data-augmentation techniques to address class imbalance are the Synthetic Minority Oversampling Technique (SMOTE) and Generative Adversarial Networks (GAN). However, both techniques have inherent limitations, motivating the emergence of novel variants designed to overcome these challenges. While previous reviews have typically focused on specific domains, conventional methodologies, or broad strategy overviews, this review presents a unified taxonomy that outlines the causes, types, and implications of class imbalance across diverse ML tasks. It further examines emerging trends in the application of SMOTE and GAN techniques, their limitations, and hybrid adaptations. By categorising imbalance types and analysing models, metrics, datasets, and comparative approaches, this review provides actionable insights and identifies future research directions for practitioners and researchers working to address class imbalance in real-world ML tasks. |
|---|---|
| ISSN: | 2169-3536 |