Statistical Data-Generative Machine Learning-Based Credit Card Fraud Detection Systems
This study addresses the challenges of data imbalance and missing values in credit card transaction datasets by employing mode-based imputation and various machine learning models. We analyzed two distinct datasets: one consisting of European cardholders and the other from American Express, applying...
| Main Authors: | Xiaomei Feng, Song-Kyoo Kim |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Mathematics |
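| Subjects: | credit card fraud; statistical data generation; machine learning; credit prediction; predictive modeling |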
| Online Access: | https://www.mdpi.com/2227-7390/13/15/2446 |
| _version_ | 1849405827161522176 |
|---|---|
| author | Xiaomei Feng; Song-Kyoo Kim |
| author_facet | Xiaomei Feng; Song-Kyoo Kim |
| author_sort | Xiaomei Feng |
| collection | DOAJ |
| description | This study addresses the challenges of data imbalance and missing values in credit card transaction datasets by employing mode-based imputation and various machine learning models. We analyzed two distinct datasets, one covering European cardholders and the other from American Express, applying multiple machine learning algorithms, including Artificial Neural Networks, Convolutional Neural Networks, and Gradient Boosted Decision Trees, among others. Notably, the Gradient Boosted Decision Tree demonstrated superior predictive performance, with accuracy increasing by 4.53%, reaching 96.92% on the European cardholders dataset. Mode imputation significantly improved data quality, enabling stable and reliable analysis of merged datasets with up to 50% missing values. Hypothesis testing confirmed that the performance difference between the merged dataset and the original datasets was statistically significant. This study highlights the importance of robust data handling techniques in developing effective fraud detection systems, setting the stage for future research on combining different datasets and improving predictive accuracy in the financial sector. |
| format | Article |
| id | doaj-art-8915171605f147d7906c1237efa2bf42 |
| institution | Kabale University |
| issn | 2227-7390 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Mathematics |
| spelling | doaj-art-8915171605f147d7906c1237efa2bf42; 2025-08-20T03:36:34Z; eng; MDPI AG; Mathematics; 2227-7390; 2025-07-01; vol. 13, no. 15, art. 2446; 10.3390/math13152446; Statistical Data-Generative Machine Learning-Based Credit Card Fraud Detection Systems; Xiaomei Feng (0), Song-Kyoo Kim (1); (0) Faculty of Applied Sciences, Macao Polytechnic University, R. de Luis Gonzaga Gomes, Macao SAR, China; (1) Faculty of Applied Sciences, Macao Polytechnic University, R. de Luis Gonzaga Gomes, Macao SAR, China; https://www.mdpi.com/2227-7390/13/15/2446; credit card fraud; statistical data generation; machine learning; credit prediction; predictive modeling |
| title | Statistical Data-Generative Machine Learning-Based Credit Card Fraud Detection Systems |
| topic | credit card fraud; statistical data generation; machine learning; credit prediction; predictive modeling |
| url | https://www.mdpi.com/2227-7390/13/15/2446 |
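
The description mentions mode-based imputation for missing values. As a rough illustration of that general technique only, here is a minimal sketch in Python. It is not the authors' implementation, which this record does not describe; the helper name `mode_impute`, the use of `None` as the missing-value marker, and the toy rows are all assumptions made for the example.

```python
from collections import Counter


def mode_impute(rows, missing=None):
    """Return a copy of rows with each missing entry replaced by the
    most frequent observed value (the mode) of its column.

    Hypothetical helper for illustration; `missing` marks an absent value.
    """
    n_cols = len(rows[0])
    modes = []
    for j in range(n_cols):
        # Collect only the observed (non-missing) values of column j.
        observed = [r[j] for r in rows if r[j] is not missing]
        # If a column has no observed values, leave it as missing.
        modes.append(Counter(observed).most_common(1)[0][0] if observed else missing)
    return [
        [modes[j] if r[j] is missing else r[j] for j in range(n_cols)]
        for r in rows
    ]


# Toy data (made up): two columns, with one gap in each.
rows = [
    [1, "a"],
    [None, "b"],
    [1, None],
    [2, "b"],
]
filled = mode_impute(rows)
print(filled)
```

In this sketch the first column's mode is `1` and the second column's mode is `"b"`, so both gaps are filled with those values while the input list is left unmodified. Ties are broken by first occurrence, which is one of several reasonable conventions.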