Evaluating techniques from low-shot learning on traditional imbalanced classification tasks
Abstract: Recent advances in machine learning have resulted in techniques that are effective in complex scenarios, such as those with many rare classes or with multimodal data; in particular, low-shot learning (LSL) is a challenging task for which multiple strong approaches have been developed. We hypothesize that these techniques’ effectiveness against the data scarcity within LSL may translate to effectiveness against the data scarcity within more “traditional” supervised, imbalanced, binary classification tasks such as fraud detection; however, there has been relatively little research applying them in these contexts. In this paper, we aim to fill this gap by selecting two LSL papers from the prior literature (representing two major approaches to LSL: optimization-based and contrastive) and reevaluating their models on two highly imbalanced tabular fraud detection datasets, including a “big-data” Medicare dataset. To the best of our knowledge, our work is the first to directly compare optimization-based and contrastive approaches in any setting, and the first to examine either approach on a tabular big-data task. We find that the contrastive learning method we test, Siamese-RNN, performs on par with state-of-the-art non-LSL baseline learners on especially big and severely imbalanced data, and significantly outperforms them on smaller and less severely imbalanced data.
| Main Authors: | Preston Billion-Polak (Florida Atlantic University), Taghi M. Khoshgoftaar (Florida Atlantic University) |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | SpringerOpen, 2025-05-01 |
| Series: | Journal of Big Data |
| ISSN: | 2196-1115 |
| Subjects: | Low-shot learning; Contrastive learning; Meta-learning; Class imbalance; Big data; Fraud detection |
| Online Access: | https://doi.org/10.1186/s40537-025-01171-0 |
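For readers unfamiliar with the contrastive approach evaluated in the abstract, the sketch below shows a generic Siamese encoder trained with a contrastive loss on tabular feature vectors. This is a minimal illustration under assumed choices (an MLP encoder, a margin of 1.0, random toy pairs), not the paper's Siamese-RNN; the names, architecture, and hyperparameters here are placeholders.

```python
# Minimal sketch: Siamese network with contrastive loss for imbalanced
# binary tabular classification. Illustrative only -- NOT the paper's
# Siamese-RNN; the MLP encoder, margin, and pairing scheme are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps a tabular feature vector to an embedding."""
    def __init__(self, n_features: int, embed_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, x):
        return self.net(x)

def contrastive_loss(z1, z2, same_class, margin: float = 1.0):
    """Classic contrastive loss: pull same-class pairs together,
    push different-class pairs at least `margin` apart."""
    d = F.pairwise_distance(z1, z2)
    pos = same_class * d.pow(2)
    neg = (1 - same_class) * F.relu(margin - d).pow(2)
    return (pos + neg).mean()

# Toy usage: random feature rows paired for one training step.
torch.manual_seed(0)
enc = Encoder(n_features=10)
opt = torch.optim.Adam(enc.parameters(), lr=1e-3)

x1 = torch.randn(8, 10)                   # first item of each pair
x2 = torch.randn(8, 10)                   # second item of each pair
same = torch.randint(0, 2, (8,)).float()  # 1 = same class, 0 = different

opt.zero_grad()
loss = contrastive_loss(enc(x1), enc(x2), same)
loss.backward()
opt.step()
print(f"contrastive loss: {loss.item():.4f}")
```

At inference time, such an embedding is typically used by comparing a new record against labeled reference examples or class prototypes in embedding space (e.g., nearest-neighbor distance), which is why pair-based training can help when minority-class examples are scarce.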