Ensemble-based customer churn prediction in banking: a voting classifier approach for improved client retention using demographic and behavioral data

Abstract Customer churn is a critical issue in banking because sustained profitability depends on retaining clients. This work classifies customer churn in banks using a new ensemble approach that combines several machine learning methods to improve churn prediction. Using a compr...

Bibliographic Details
Main Authors: Ruchika Bhuria, Sheifali Gupta, Upinder Kaur, Salil Bharany, Ateeq Ur Rehman, Seada Hussen, Ghanshyam G. Tejani, Pradeep Jangir
Format: Article
Language: English
Published: Springer 2025-01-01
Series: Discover Sustainability
Subjects:
Online Access: https://doi.org/10.1007/s43621-025-00807-8
_version_ 1832595032793677824
author Ruchika Bhuria
Sheifali Gupta
Upinder Kaur
Salil Bharany
Ateeq Ur Rehman
Seada Hussen
Ghanshyam G. Tejani
Pradeep Jangir
author_sort Ruchika Bhuria
collection DOAJ
description Abstract Customer churn is a critical issue in banking because sustained profitability depends on retaining clients. This work classifies customer churn in banks using a new ensemble approach that combines several machine learning methods to improve churn prediction. The study uses a dataset of demographic, financial, and behavioral features (such as credit score, account balance, tenure, and activity level), with a target variable indicating whether a customer has left the bank. It begins with univariate, bivariate, and multivariate feature exploration and then applies the Interquartile Range (IQR) method to identify outliers, improving data quality. Five models, K-Nearest Neighbors (KNN), Support Vector Classifier (SVC), Random Forest, Decision Tree, and XGBoost, are combined in a Voting Classifier ensemble to estimate churn. By drawing on the strengths of each model, the ensemble improves classification performance and yields a balanced, robust classifier. Without SMOTE (Synthetic Minority Over-sampling Technique), the Voting Classifier achieves an accuracy of 0.87, precision of 0.87, recall of 0.80, and F1-score of 0.87. The proposed model, which extends the base model with SMOTE, achieves an accuracy of 0.90, precision of 0.90, recall of 0.90, and F1-score of 0.90. This improvement indicates that SMOTE handles the class imbalance effectively, making churn prediction more balanced and reliable. The proposed approach offers a reliable basis for customer retention strategies in banking organisations.
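Two steps named in the abstract, IQR-based outlier detection and voting across base models, can be illustrated with a minimal sketch. The record does not give the authors' implementation, so everything here is an assumption: the function names are hypothetical, the fence multiplier of 1.5 is the conventional default rather than a value from the article, and hard (majority) voting is assumed because the abstract does not say whether hard or soft voting was used.

```python
from collections import Counter
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is the conventional choice, assumed here for illustration.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Mark each value True if it falls outside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v < lower or v > upper for v in values]


def hard_vote(predictions):
    """Combine per-model label lists into one majority label per sample.

    predictions[m][i] is the label model m assigns to sample i. With an
    odd number of models and binary labels there are no ties.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]
```

For example, `flag_outliers([1, 2, 3, 4, 5, 100])` marks only the last value, and `hard_vote` applied to five binary prediction lists returns, for each customer, the label that at least three of the five models agree on.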
format Article
id doaj-art-343f584eb4b84283be95920ec74654ee
institution Kabale University
issn 2662-9984
language English
publishDate 2025-01-01
publisher Springer
record_format Article
series Discover Sustainability
spelling doaj-art-343f584eb4b84283be95920ec74654ee; 2025-01-19T12:05:03Z; eng; Springer; Discover Sustainability; 2662-9984; 2025-01-01; 61128; 10.1007/s43621-025-00807-8
Ensemble-based customer churn prediction in banking: a voting classifier approach for improved client retention using demographic and behavioral data
Ruchika Bhuria (Chitkara University Institute of Engineering and Technology, Chitkara University)
Sheifali Gupta (Chitkara University Institute of Engineering and Technology, Chitkara University)
Upinder Kaur (Department of Computer Science and Engineering, Lovely Professional University)
Salil Bharany (Chitkara University Institute of Engineering and Technology, Chitkara University)
Ateeq Ur Rehman (School of Computing, Gachon University)
Seada Hussen (Department of Electrical Power, Adama Science and Technology University)
Ghanshyam G. Tejani (Jadara Research Center, Jadara University)
Pradeep Jangir (Department of CSE, Graphic Era Deemed, To Be University)
https://doi.org/10.1007/s43621-025-00807-8
Machine learning models; Ensemble methods; Explainability; Challenges in statistical methods; Insights into customer behavior; Resilience
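The SMOTE step named in the abstract can be illustrated with a simplified sketch of its core idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. This is not the authors' code and not the imbalanced-learn implementation; the function name, the neighbour count `k`, and the fixed seed are assumptions made for illustration, and the input must contain at least two minority samples.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class tuples.

    Each new point interpolates between a randomly chosen minority sample
    and one of its k nearest minority neighbours (Euclidean distance).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of the base sample within the minority class.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used to compute accuracy, precision, recall, and F1-score.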
title Ensemble-based customer churn prediction in banking: a voting classifier approach for improved client retention using demographic and behavioral data
topic Machine learning models
Ensemble methods
Explainability
Challenges in statistical methods
Insights into customer behavior
Resilience
url https://doi.org/10.1007/s43621-025-00807-8