Evaluating techniques from low-shot learning on traditional imbalanced classification tasks

Bibliographic Details
Main Authors: Preston Billion-Polak, Taghi M. Khoshgoftaar (Florida Atlantic University)
Format: Article
Language: English
Published: SpringerOpen, 2025-05-01
Series: Journal of Big Data
ISSN: 2196-1115
Subjects: Low-shot learning; Contrastive learning; Meta-learning; Class imbalance; Big data; Fraud detection
Online Access: https://doi.org/10.1186/s40537-025-01171-0

Abstract
Recent advances in machine learning have resulted in techniques that are effective in complex scenarios, such as those with many rare classes or with multimodal data; in particular, low-shot learning (LSL) is a challenging task for which multiple strong approaches have been developed. We hypothesize that these techniques’ effectiveness against the data scarcity within LSL may translate to effectiveness against the data scarcity within more “traditional” supervised, imbalanced, binary classification tasks such as fraud detection; however, there has been relatively little research applying them in these contexts. In this paper, we aim to fill this gap by selecting two LSL papers from prior literature (representing two major approaches to LSL, optimization-based and contrastive) and reevaluating their models on two highly imbalanced tabular fraud detection datasets, including a “big-data” Medicare dataset. To the best of our knowledge, our work is the first to directly compare optimization-based and contrastive approaches in any setting, and the first to examine either approach on a tabular big-data task. We find that the contrastive learning method we test, Siamese-RNN, performs on par with state-of-the-art non-LSL baseline learners on especially big and severely imbalanced data, and significantly outperforms them on smaller and less severely imbalanced data.
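
For readers unfamiliar with the contrastive approach the abstract refers to, the sketch below illustrates the general idea behind a Siamese network trained with a contrastive loss on pairs of tabular feature vectors. This is an illustrative sketch only, not the authors’ Siamese-RNN implementation; the architecture, dimensions, and hyperparameters are assumptions chosen for clarity.

```python
# Illustrative sketch (NOT the paper's Siamese-RNN): a Siamese network with a
# hinge-style contrastive loss on tabular feature pairs. Same-class pairs are
# pulled together in embedding space; cross-class pairs are pushed apart. One
# reason this can help under class imbalance is that each rare-class example
# contributes to many informative cross-class pairs.
import torch
import torch.nn as nn


class SiameseEncoder(nn.Module):
    """Shared encoder applied to both members of each pair."""

    def __init__(self, n_features: int, embed_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, x):
        return self.net(x)


def contrastive_loss(z1, z2, same_class, margin: float = 1.0):
    """Contrastive loss: same_class is 1.0 for positive pairs, 0.0 otherwise."""
    dist = torch.nn.functional.pairwise_distance(z1, z2)
    pos = same_class * dist.pow(2)
    neg = (1.0 - same_class) * torch.clamp(margin - dist, min=0.0).pow(2)
    return (pos + neg).mean()


# Toy usage on random "tabular" data; n_features is an arbitrary assumption.
torch.manual_seed(0)
n_features = 20
encoder = SiameseEncoder(n_features)
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

x1 = torch.randn(128, n_features)           # first member of each pair
x2 = torch.randn(128, n_features)           # second member of each pair
same = torch.randint(0, 2, (128,)).float()  # 1 if the pair shares a label

opt.zero_grad()
loss = contrastive_loss(encoder(x1), encoder(x2), same)
loss.backward()
opt.step()
print(f"contrastive loss: {loss.item():.4f}")
```

At inference time, such an encoder is typically used by embedding a query example and scoring it by its distance to labeled reference examples (e.g., the nearest class prototype in embedding space), rather than by a conventional softmax head.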