Ensemble-based customer churn prediction in banking: a voting classifier approach for improved client retention using demographic and behavioral data
Abstract Customer churn is a critical issue in banking because sustained profitability depends on retaining clients. This work classifies customer churn in banks using an ensemble approach that combines several machine learning methods to improve churn prediction. Using a compr...
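Main Authors: | Ruchika Bhuria, Sheifali Gupta, Upinder Kaur, Salil Bharany, Ateeq Ur Rehman, Seada Hussen, Ghanshyam G. Tejani, Pradeep Jangir |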
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2025-01-01 |
Series: | Discover Sustainability |
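Subjects: | Machine learning models; Ensemble methods; Explainability; Challenges in statistical methods; Insights into customer behavior; Resilience |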
Online Access: | https://doi.org/10.1007/s43621-025-00807-8 |
_version_ | 1832595032793677824 |
---|---|
author | Ruchika Bhuria; Sheifali Gupta; Upinder Kaur; Salil Bharany; Ateeq Ur Rehman; Seada Hussen; Ghanshyam G. Tejani; Pradeep Jangir |
author_sort | Ruchika Bhuria |
collection | DOAJ |
description | Abstract Customer churn is a critical issue in banking because sustained profitability depends on retaining clients. This work classifies customer churn in banks using an ensemble approach that combines several machine learning methods to improve churn prediction. The dataset includes demographic, financial, and behavioral features, such as credit score, account balance, tenure, and activity level; the target variable indicates whether a customer has left the bank. The study begins with univariate, bivariate, and multivariate feature exploration and then applies the Interquartile Range (IQR) method to identify outliers, improving data quality. Five models, K-Nearest Neighbors (KNN), Support Vector Classifier (SVC), Random Forest, Decision Tree, and XGBoost, are combined in a Voting Classifier ensemble to predict churn. By building on the strengths of each model, the ensemble improves classification performance and provides a balanced, robust classifier. Without SMOTE (Synthetic Minority Over-sampling Technique), the Voting Classifier achieves accuracy 0.87, precision 0.87, recall 0.80, and F1-score 0.87. The proposed model, which extends the base model with SMOTE, achieves accuracy 0.90, precision 0.90, recall 0.90, and F1-score 0.90. This improvement indicates that SMOTE handles the class imbalance problem effectively, making churn prediction more balanced and reliable. The proposed approach offers a reliable basis for customer retention strategies in banking organisations. |
format | Article |
id | doaj-art-343f584eb4b84283be95920ec74654ee |
institution | Kabale University |
issn | 2662-9984 |
language | English |
publishDate | 2025-01-01 |
publisher | Springer |
record_format | Article |
series | Discover Sustainability |
spelling | doaj-art-343f584eb4b84283be95920ec74654ee; 2025-01-19T12:05:03Z; eng; Springer; Discover Sustainability; 2662-9984; 2025-01-01; 61128; 10.1007/s43621-025-00807-8; Ensemble-based customer churn prediction in banking: a voting classifier approach for improved client retention using demographic and behavioral data; Ruchika Bhuria (Chitkara University Institute of Engineering and Technology, Chitkara University); Sheifali Gupta (Chitkara University Institute of Engineering and Technology, Chitkara University); Upinder Kaur (Department of Computer Science and Engineering, Lovely Professional University); Salil Bharany (Chitkara University Institute of Engineering and Technology, Chitkara University); Ateeq Ur Rehman (School of Computing, Gachon University); Seada Hussen (Department of Electrical Power, Adama Science and Technology University); Ghanshyam G. Tejani (Jadara Research Center, Jadara University); Pradeep Jangir (Department of CSE, Graphic Era Deemed, To Be University); https://doi.org/10.1007/s43621-025-00807-8; Machine learning models; Ensemble methods; Explainability; Challenges in statistical methods; Insights into customer behavior; Resilience |
title | Ensemble-based customer churn prediction in banking: a voting classifier approach for improved client retention using demographic and behavioral data |
topic | Machine learning models; Ensemble methods; Explainability; Challenges in statistical methods; Insights into customer behavior; Resilience |
url | https://doi.org/10.1007/s43621-025-00807-8 |
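
The abstract describes combining five base models in a Voting Classifier but does not say whether the votes are hard (majority of predicted labels) or soft (averaged probabilities). The following is a minimal sketch of the hard-voting variant only, under that assumption; it is not the authors' implementation. The function name `hard_vote` and the toy predictions are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from typing import List, Sequence


def hard_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model class labels by majority vote.

    predictions[m][i] is the label that model m assigns to customer i
    (1 = churned, 0 = stayed). Hard voting is an assumption here; the
    abstract does not state which voting scheme the study uses.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    # zip(*predictions) yields, for each customer, the tuple of labels
    # produced by every model.
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical predictions from the five model types named in the abstract,
# for four customers. These numbers are made up for illustration.
knn = [1, 0, 0, 1]
svc = [1, 0, 1, 0]
random_forest = [0, 0, 1, 1]
decision_tree = [1, 1, 0, 1]
xgboost_model = [1, 0, 1, 1]

ensemble = hard_vote([knn, svc, random_forest, decision_tree, xgboost_model])
print(ensemble)
```

With an odd number of models and two classes, as in this sketch, a tie cannot occur, so no tie-breaking rule is needed.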