Hybrid feature selection framework for enhanced credit card fraud detection using machine learning models.

Electronic payment methods are increasingly prevalent worldwide, facilitating both in-person and online transactions. As credit card usage for online payments grows, fraud and payment defaults have also risen, resulting in significant financial losses. Detecting fraudulent transactions is challengin...

Full description

Saved in:
Bibliographic Details
Main Authors: Al Mahmud Siam, Pankaj Bhowmik, Md Palash Uddin
Format: Article
Language:English
Published: Public Library of Science (PLoS) 2025-01-01
Series:PLoS ONE
Online Access:https://doi.org/10.1371/journal.pone.0326975
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Electronic payment methods are increasingly prevalent worldwide, facilitating both in-person and online transactions. As credit card usage for online payments grows, fraud and payment defaults have also risen, resulting in significant financial losses. Detecting fraudulent transactions is challenging due to the highly imbalanced nature of transaction datasets, where fraudulent activities constitute only a small fraction of the data. To address this, we propose a novel hybrid feature selection framework designed to enhance the performance of machine learning models in credit card fraud detection. Our framework integrates three complementary feature selection techniques: Pearson correlation, information gain (IG), and random forest importance (RFI), each optimized for the dataset's characteristics. Pearson Correlation eliminates redundancy by removing highly correlated features, while IG and RFI evaluate the relevance of the remaining features. A union operation combines the most informative features from these methods, ensuring comprehensive and efficient feature selection. To validate the proposed approach, we test it on five diverse datasets with varying characteristics and imbalance levels, employing five state-of-the-art machine learning algorithms: Random Forest (RF), Extra Trees (ET), XGBoost (XGBC), AdaBoost, and CatBoost. We primarily propose this work for PCA-transformed datasets, but for the validation of our research, we also apply it to a real-world dataset. The results demonstrate that our methodology outperforms existing baseline approaches, achieving superior fraud detection performance across all datasets. Our findings highlight the robustness and adaptability of the proposed framework, offering a practical solution for real-world fraud detection systems. Additionally, we believe that our proposed framework can serve as a decision support system for the detection of fraudulent transactions in real-time credit cards, with the potential to make a substantial contribution to the business industry.
ISSN:1932-6203