Cost-Sensitive Credit Card Fraud Detection Using Extreme Gradient Boosting and SMOTE on Imbalanced Financial Datasets


International Research Journal of Economics and Management Studies
© 2026 by IRJEMS
Volume 5  Issue 4
Year of Publication : 2026
Authors : Md. Rasel Uddin, MD. Rahim, Md. Sadman Hafiz, Atika Rahman Ounte
irjems doi : 10.56472/25835238/IRJEMS-V5I4P122

Citation:

Md. Rasel Uddin, MD. Rahim, Md. Sadman Hafiz, Atika Rahman Ounte. "Cost-Sensitive Credit Card Fraud Detection Using Extreme Gradient Boosting and SMOTE on Imbalanced Financial Datasets," International Research Journal of Economics and Management Studies, Vol. 5, No. 4, pp. 176-182, 2026. Crossref. http://doi.org/10.56472/25835238/IRJEMS-V5I4P122

Abstract:

The rapid expansion of digital payment systems increases the risk of financial abuse such as credit card fraud. Standard classification models often perform poorly on skewed datasets with large class imbalances. This often produces false positives, which block genuine transactions and complicate operations. This study examines the friction-fraud trade-off by applying and refining a cost-sensitive Extreme Gradient Boosting (XGBoost) model integrated with the Synthetic Minority Over-sampling Technique (SMOTE) on a highly imbalanced financial dataset, focusing on a detection threshold that markedly improves on current benchmarks for balancing fraud detection against false positives. Model performance is evaluated with precision-recall metrics in order to optimize the classification threshold for real-world applications. Our findings show that the XGBoost design outperforms the other ensemble models, with a reported Area Under the Precision-Recall Curve (AUPRC) of 0.0156. The model uses an optimal detection threshold of 0.0321 to balance catching fraud against disrupting genuine user behaviour. Our feature-importance analysis shows that categorical infrastructure features, such as device type and merchant category, predict fraud much better than continuous numerical features. These findings support the implementation of tiered fraud triage systems, helping to balance robust financial security with seamless business continuity.
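The threshold-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the labels, scores, and the two cost values below are hypothetical, the function names are invented for illustration, and the SMOTE and XGBoost training steps are not reproduced. The sketch only shows how a precision-recall sweep over predicted fraud scores yields an AUPRC estimate (computed here as average precision) and a cost-minimizing threshold.

```python
from typing import List, Tuple

Row = Tuple[float, int, int, int]  # (threshold, true pos, false pos, false neg)


def sweep_thresholds(y_true: List[int], scores: List[float]) -> List[Row]:
    """Confusion counts at each distinct score, highest threshold first.

    A transaction is flagged as fraud when its score >= threshold.
    """
    total_pos = sum(y_true)
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    rows: List[Row] = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        thr = ranked[i][0]
        # Consume every transaction tied at this score before recording counts.
        while i < len(ranked) and ranked[i][0] == thr:
            tp += ranked[i][1]
            fp += 1 - ranked[i][1]
            i += 1
        rows.append((thr, tp, fp, total_pos - tp))
    return rows


def average_precision(rows: List[Row]) -> float:
    """Step-wise area under the precision-recall curve."""
    total_pos = rows[-1][1] + rows[-1][3]
    ap, prev_recall = 0.0, 0.0
    for _, tp, fp, _fn in rows:
        recall = tp / total_pos
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def min_cost_threshold(rows: List[Row], cost_fn: float, cost_fp: float) -> float:
    """Threshold minimizing cost_fn * (missed frauds) + cost_fp * (false alarms)."""
    return min(rows, key=lambda r: cost_fn * r[3] + cost_fp * r[2])[0]


# Hypothetical toy data: 1 = fraud, 0 = genuine; scores are model probabilities.
y_true = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1]
scores = [0.01, 0.02, 0.40, 0.03, 0.30, 0.25, 0.02, 0.05, 0.01, 0.04]

rows = sweep_thresholds(y_true, scores)
auprc = average_precision(rows)
# Assumed costs: a missed fraud is ten times as costly as a false alarm.
fraud_averse = min_cost_threshold(rows, cost_fn=10.0, cost_fp=1.0)
# Assumed costs: a false alarm is five times as costly as a missed fraud.
friction_averse = min_cost_threshold(rows, cost_fn=1.0, cost_fp=5.0)
print(round(auprc, 4), fraud_averse, friction_averse)
```

With these toy inputs, weighting missed fraud more heavily pushes the selected threshold down toward the low scores, while weighting false alarms more heavily pushes it up. This is the friction-fraud trade-off in miniature. The sweep only considers thresholds equal to observed scores, so the option of flagging nothing at all is not evaluated.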

Keywords:

Cost-Sensitive Learning, Credit Card Fraud, Extreme Gradient Boosting (XGBoost), Imbalanced Datasets, Precision-Recall Optimization.