An AutoEncoder enhanced light gradient boosting machine method for credit card fraud detection

Online financial transactions bring convenience to people’s lives, but also present vulnerabilities for criminals to embezzle users’ accounts and trick users into credit card fraud. Although machine learning methods have been adopted to detect anomalous transactions, it’s hard for a single machine l...

Bibliographic Details
Main Authors:	Lianhong Ding, Luqi Liu, Yangchuan Wang, Peng Shi, Jianye Yu
Format:	Article
Language:	English
Published:	PeerJ Inc. 2024-10-01
Series:	PeerJ Computer Science
Subjects:	Anomaly detection, AutoEncoder, BCR, Credit card fraud, LightGBM, MCC
Online Access:	https://peerj.com/articles/cs-2323.pdf

author	Lianhong Ding, Luqi Liu, Yangchuan Wang, Peng Shi, Jianye Yu
collection	DOAJ
description	Online financial transactions bring convenience to people’s lives, but also present vulnerabilities for criminals to embezzle users’ accounts and trick users into credit card fraud. Although machine learning methods have been adopted to detect anomalous transactions, it is hard for a single machine learning method to achieve satisfactory results given the increasing scale and dimensionality of financial datasets. In addition, in anomaly detection on financial data there is an obvious imbalance between normal and abnormal records. In this situation, the experimental results cannot be objectively evaluated by traditional metrics alone, such as precision, recall, and accuracy. This paper proposes an AutoEncoder enhanced LightGBM method for credit card fraud detection. The method inherits the advantages of each component, using an AutoEncoder for feature reconstruction on the dataset and integrating the LightGBM algorithm, which improves on the GBDT (Gradient Boosting Decision Tree), to detect abnormal data more accurately and efficiently. Besides the traditional evaluation metrics, F-measure, area under curve (AUC), Matthews correlation coefficient (MCC), and balanced classification rate (BCR) are also adopted as evaluation metrics. Two financial datasets were used to validate the performance and robustness of the proposed model. Results obtained from the credit card fraud dataset containing 31 features indicate that our model significantly outperforms other models with a recall of 94.85%, representing a 10.70% improvement compared to the best detection performance model with a recall of only 86%. Additionally, our model’s BCR score is also significantly better than that of other models, with a BCR score of 97%, as opposed to the best detection performance model’s BCR score of 92%, representing a 5% improvement by our model. Various sampling methods and model combinations were considered in this study. It was found that the SMOTE algorithm combined with the proposed model produced the best results, with an AUC value of 96.83% and an F-measure score of 80.27%. The Santander bank transaction record dataset is a large, high-dimensional dataset containing 200 features. Experimental results on this dataset reveal that, compared to other models, our model significantly improved recall and F-measure results, raising the recall to 94.14% and improving the F-measure score by 11.51% over the second-best-performing model. Overall, these findings demonstrate the robustness and superiority of our model in detecting fraudulent transactions and highlight the effectiveness of the SMOTE algorithm in combination with the proposed model.
format	Article
id	doaj-art-d6fe3f80f3a443abac94cd307663d7d5
institution	OA Journals
issn	2376-5992
language	English
publishDate	2024-10-01
publisher	PeerJ Inc.
record_format	Article
series	PeerJ Computer Science
spelling	doaj-art-d6fe3f80f3a443abac94cd307663d7d5; 2025-08-20T02:17:57Z; eng; PeerJ Inc.; PeerJ Computer Science; 2376-5992; 2024-10-01; 10; e2323; 10.7717/peerj-cs.2323; An AutoEncoder enhanced light gradient boosting machine method for credit card fraud detection; Lianhong Ding (Beijing Wuzi University, Beijing, China); Luqi Liu (Beijing Wuzi University, Beijing, China); Yangchuan Wang (Beijing Wuzi University, Beijing, China); Peng Shi (University of Science and Technology Beijing, Beijing, China); Jianye Yu (Beijing Wuzi University, Beijing, China); https://peerj.com/articles/cs-2323.pdf; Anomaly detection; AutoEncoder; BCR; Credit card fraud; LightGBM; MCC
title	An AutoEncoder enhanced light gradient boosting machine method for credit card fraud detection
topic	Anomaly detection, AutoEncoder, BCR, Credit card fraud, LightGBM, MCC
url	https://peerj.com/articles/cs-2323.pdf
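
The description argues that precision, recall, and accuracy alone are not enough on imbalanced data and names MCC and BCR as additional metrics. The sketch below illustrates that point under stated assumptions: it uses the standard textbook definitions (BCR as the mean of recall and specificity, MCC from the four confusion-matrix counts), which may differ in detail from the article's own formulas, and the label vectors are made-up toy data, not results from the paper. All function names are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 marks fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def accuracy(tp, fp, tn, fn):
    return safe_div(tp + tn, tp + fp + tn + fn)


def bcr(tp, fp, tn, fn):
    """Balanced classification rate: mean of recall and specificity (assumed definition)."""
    recall = safe_div(tp, tp + fn)
    specificity = safe_div(tn, tn + fp)
    return 0.5 * (recall + specificity)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 by convention when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return safe_div(tp * tn - fp * fn, denom)


# Toy data (assumption): 990 normal records, 10 fraudulent ones,
# and a degenerate classifier that labels everything as normal.
y_true = [0] * 990 + [1] * 10
y_pred = [0] * 1000
counts = confusion_counts(y_true, y_pred)

print("accuracy:", accuracy(*counts))
print("BCR:", bcr(*counts))
print("MCC:", mcc(*counts))
```

On this toy data the degenerate classifier reaches 99% accuracy while detecting no fraud at all; BCR falls to 0.5 and MCC to 0, which is the kind of gap the description refers to when it says traditional metrics cannot evaluate results objectively under class imbalance.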

Similar Items