Enhancing credit card fraud detection: the impact of oversampling rates and ensemble methods with diverse feature selection
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | National Aerospace University «Kharkiv Aviation Institute», 2025-02-01 |
| Series: | Радіоелектронні і комп'ютерні системи |
| Subjects: | |
| Online Access: | http://nti.khai.edu/ojs/index.php/reks/article/view/2777 |
| Summary: | This article examines how oversampling rates and ensemble methods built on diverse feature selection techniques affect credit card fraud detection. Credit card fraud has become a major problem in the financial sector, causing substantial losses for both financial institutions and consumers. As the volume of credit card transactions grows, accurately detecting fraudulent behavior becomes increasingly difficult. The goal of the study is to improve fraud detection by identifying the oversampling rate that works best for the highest-performing models and by applying ensemble techniques based on diverse feature selection approaches. The key tasks are to assess model performance by accuracy, recall, and AUC; to analyze the effect of oversampling with the Synthetic Minority Over-sampling Technique (SMOTE); and to propose an ensemble method that combines the strengths of different feature selection techniques and classifiers. The methods apply a range of machine learning techniques, including logistic regression, decision trees, random forests, and gradient boosting, to an imbalanced dataset in which legitimate transactions significantly outnumber fraudulent ones. To address the imbalance, the authors systematically varied the SMOTE oversampling rate. They also developed an ensemble model that integrates seven feature selection methods with the eXtreme Gradient Boosting (XGB) algorithm. The results show that SMOTE significantly improves the performance of the machine learning models, with an optimal oversampling rate of 20%. The XGB model performed best, with high accuracy, recall, and AUC scores. The proposed ensemble approach, which combines the diverse feature selection techniques with the XGB classifier, further improves detection accuracy and system performance compared with traditional methods. The conclusions contribute to the field of credit card fraud detection by clarifying the impact of oversampling and the benefits of ensemble methods with diverse feature selection. These insights can support the development of more effective and robust fraud detection systems, helping financial institutions and consumers protect themselves against the growing threat of credit card fraud. |
|---|---|
| ISSN: | 1814-4225 2663-2012 |
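
The Summary mentions varying the SMOTE oversampling rate. As a rough illustration only, the sketch below shows the core SMOTE idea in plain Python: each synthetic minority sample is interpolated between a real minority sample and one of its nearest minority neighbours. It is not the authors' implementation. The function name `smote_to_rate`, its parameters, and the toy data are hypothetical, and it assumes that "rate" means the target ratio of minority to majority samples, which this record does not specify. In practice a library such as imbalanced-learn would normally be used instead.

```python
import math
import random


def smote_to_rate(minority, n_majority, rate=0.2, k=5, seed=0):
    """Generate synthetic minority samples SMOTE-style.

    Assumption: `rate` is the desired minority/majority ratio after
    oversampling (e.g. 0.2 means 20 minority rows per 100 majority rows).
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    target = round(rate * n_majority)          # desired minority count
    n_new = max(0, target - len(minority))     # synthetic rows still needed
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()                     # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data: 5 fraudulent rows against a notional 100 legitimate rows.
fraud = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
synthetic = smote_to_rate(fraud, n_majority=100, rate=0.2, k=3)
print(len(fraud) + len(synthetic))  # minority count after oversampling
```

Sweeping `rate` over several values and retraining a classifier at each one would mirror the kind of comparison the Summary describes, although the rates, the dataset, and the evaluation protocol used in the study are not given here.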