Enhancing credit card fraud detection: highly imbalanced data case

Abstract Fraud is a widespread challenge in today’s financial landscape, requiring innovative methods and technologies to detect and prevent losses from the sophisticated tactics used by fraudsters. This paper emphasizes the main issues in fraud detection and suggests...

Bibliographic Details
Main Authors: Dalia Breskuvienė, Gintautas Dzemyda
Format: Article
Language: English
Published: SpringerOpen 2024-12-01
Series: Journal of Big Data
Subjects: Feature selection; SOM; Imbalanced data; Classification; Fraud detection
Online Access: https://doi.org/10.1186/s40537-024-01059-5
_version_ 1850103167694405632
author Dalia Breskuvienė
Gintautas Dzemyda
author_facet Dalia Breskuvienė
Gintautas Dzemyda
author_sort Dalia Breskuvienė
collection DOAJ
description Abstract Fraud is a widespread challenge in today’s financial landscape, requiring innovative methods and technologies to detect and prevent losses from the sophisticated tactics used by fraudsters. This paper emphasizes the main issues in fraud detection and suggests a novel feature selection method called FID-SOM (feature selection for imbalanced data using SOM). Feature selection can significantly improve classification performance. Given the inherent imbalance in fraud detection data, feature selection must be done with particular care. To accomplish this task, we use Self-Organizing Maps (SOMs), a special type of artificial neural network. FID-SOM is designed to address the challenge of dimensionality reduction in scenarios characterized by highly imbalanced data. It is designed to efficiently process and analyze the vast and complex datasets commonly encountered in the financial sector, and it adapts to the dynamic nature of big data environments. The uniqueness of the proposed method is in forming a new dataset containing the Best-Matching Units of the trained SOM as vectors of attributes corresponding to the initial features. These attributes are sorted by variance in descending order. By keeping the required number of attributes that hold the highest percentage of variability, we select the features corresponding to those attributes for further analysis. The proposed FID-SOM method has demonstrated its ability to perform on par with, if not surpass, existing methodologies. It also shows innovative potential.
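The selection steps named in the abstract (train a SOM, replace each record by its Best-Matching Unit, rank the resulting attributes by variance, keep the attributes that hold the required share of variability) can be illustrated with a minimal sketch. This is not the authors' implementation: the small NumPy SOM, the function names `train_som`, `bmu_dataset` and `select_features`, the parameter `variance_share`, and all default values are assumptions made for illustration, and the abstract does not say how the class imbalance itself is handled, so that part is left out.

```python
import numpy as np

def train_som(X, rows=5, cols=5, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Initialise unit weights from randomly chosen records (an assumption).
    W = X[rng.integers(0, n, size=rows * cols)].astype(float)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total, t = epochs * n, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = t / total
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighbourhood radius
            bmu = np.argmin(((W - X[i]) ** 2).sum(axis=1))
            dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood on the grid
            W += lr * h[:, None] * (X[i] - W)
            t += 1
    return W

def bmu_dataset(X, W):
    """Replace every record by the weight vector of its Best-Matching Unit."""
    d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return W[d2.argmin(axis=1)]

def select_features(X, variance_share=0.9, **som_kwargs):
    """Return indices of the features whose BMU attributes hold `variance_share` of the variance."""
    B = bmu_dataset(X, train_som(X, **som_kwargs))
    var = B.var(axis=0)
    order = np.argsort(var)[::-1]                    # attributes by variance, descending
    cum = np.cumsum(var[order]) / var.sum()
    k = int(np.searchsorted(cum, variance_share)) + 1
    return order[:k]
```

Calling `select_features(X, variance_share=0.9)` on a standardized feature matrix `X` returns column indices ordered from highest to lowest BMU variance. The sketch assumes at least one attribute has non-zero variance, and `bmu_dataset` builds a full distance array, so it is only suitable for small inputs.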
format Article
id doaj-art-3617686a2a1f485cabe9ef45578cc45c
institution DOAJ
issn 2196-1115
language English
publishDate 2024-12-01
publisher SpringerOpen
record_format Article
series Journal of Big Data
spelling doaj-art-3617686a2a1f485cabe9ef45578cc45c; 2025-08-20T02:39:37Z; eng; SpringerOpen; Journal of Big Data; 2196-1115; 2024-12-01; 111124; 10.1186/s40537-024-01059-5; Enhancing credit card fraud detection: highly imbalanced data case; Dalia Breskuvienė (Institute of Data Science and Digital Technologies, Vilnius University); Gintautas Dzemyda (Institute of Data Science and Digital Technologies, Vilnius University); https://doi.org/10.1186/s40537-024-01059-5; Feature selection; SOM; Imbalanced data; Classification; Fraud detection
title Enhancing credit card fraud detection: highly imbalanced data case
title_sort enhancing credit card fraud detection highly imbalanced data case
topic Feature selection
SOM
Imbalanced data
Classification
Fraud detection
url https://doi.org/10.1186/s40537-024-01059-5
work_keys_str_mv AT daliabreskuviene enhancingcreditcardfrauddetectionhighlyimbalanceddatacase
AT gintautasdzemyda enhancingcreditcardfrauddetectionhighlyimbalanceddatacase