Optimizing XGBoost Hyperparameters for Credit Scoring Classification Using Weighted Cognitive Avoidance Particle Swarm

Bibliographic Details
Main Authors: Atul Vikas Lakra, Sudarson Jena, Kaushik Mishra
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11071533/
Description
Summary: Decision trees in machine learning have achieved satisfactory performance in classification tasks. They offer the advantage of handling high-dimensional and complexly correlated data through feature combination and selection. Extreme Gradient Boosting (XGBoost) overcomes the overfitting problem of individual decision trees by integrating multiple tree models. A weighted cognitive avoidance particle swarm optimization for XGBoost (WCAPSO-XGB) model is proposed for credit score classification. The structure of the XGBoost (XGB) model is determined by its hyperparameters, which must be initialized prior to model evaluation. The optimal hyperparameter values for the XGBoost model can vary significantly depending on the specific problem at hand, and setting them manually is time-consuming. Therefore, to automate this process, weighted cognitive avoidance particle swarm optimization (WCAPSO) is employed for hyperparameter optimization. The novelty of the proposed work is a modified version of the particle swarm optimizer (PSO) that addresses the problem of becoming trapped in a local optimum. The proposed WCAPSO restricts the movement of particles toward local optima and enhances exploration of the search space for better solutions. Here, the hyperparameters of XGBoost are represented as particles in WCAPSO, whose state and movement are defined in a continuous search space. The proposed WCAPSO-XGB model tunes the hyperparameters of XGBoost and classifies credit scores. The experimental results are compared with various classifiers, such as Random Forest (RF), K-Nearest Neighbors (KNN), Gaussian Naive Bayes (NB), AdaBoost, Gradient Boosting, Logistic Regression (LR), Neural Network (NN), Decision Tree (DT), and Linear Discriminant Analysis (LDA), and with hyperparameter optimization methods, such as Grid Search (GS), Random Search (RS), Bayesian Optimization, Optuna Optimization, Hybrid Snake Optimizer Algorithm (HSOA), Exploratory Cuckoo Search, Island Cuckoo Search (iCSPM and iCSPM2), and Improved SSA (ISSA) with HDPM, on four datasets ranging from small to large numbers of instances. The experimental results and analysis show that the proposed model achieves accuracy of 76.75%, 82.43%, 88.98%, and 88.74%; F1-score of 66.52%, 89.44%, 93.53%, and 90.64%; precision of 78.79%, 84.31%, 88.46%, and 90.41%; and recall of 64.07%, 95.24%, 99.21%, and 90.87% on the UCI German, UCI Taiwan, P2P US Lending Club, and Kaggle credit scoring datasets, respectively. Further, the overall confidence interval of the proposed model ranges from 82% to 83% in accuracy, 73.1% to 74.2% in F1-score, and 71% to 72% in AUC, at a 95% confidence level measured over 20 evaluations on each dataset.
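
To make the abstract's tuning scheme concrete, the following Python sketch shows a generic PSO loop optimizing three XGBoost hyperparameters (max_depth, learning_rate, n_estimators) encoded as particle positions in a continuous search space, as the abstract describes. The record does not reproduce the paper's WCAPSO velocity update, so the "avoid" weighting on the cognitive term below is a hypothetical stand-in for the weighted cognitive avoidance mechanism; the synthetic dataset, search bounds, and PSO coefficients are likewise illustrative assumptions, not values from the paper.

    # Hypothetical sketch: PSO-style hyperparameter tuning for XGBoost.
    # The exact WCAPSO update rule is not given in this record; the
    # "avoid" factor damping the cognitive pull is an assumption meant
    # to illustrate restricting movement toward (local) personal bests.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from xgboost import XGBClassifier

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    # Each particle encodes (max_depth, learning_rate, n_estimators)
    # as a point in a continuous search space; bounds are illustrative.
    bounds = np.array([[2, 10], [0.01, 0.3], [50, 300]])

    def fitness(p):
        # Fitness = mean 3-fold cross-validated accuracy of the
        # XGBoost model built from the particle's hyperparameters.
        model = XGBClassifier(
            max_depth=int(round(p[0])),
            learning_rate=float(p[1]),
            n_estimators=int(round(p[2])),
            eval_metric="logloss",
        )
        return cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()

    n_particles, n_iter = 6, 5
    pos = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_particles, 3))
    vel = np.zeros_like(pos)
    pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmax()]

    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, social coefficients
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, 3))
        # Assumed avoidance weighting: randomly damp the cognitive term
        # so particles are not dragged straight into a personal best.
        avoid = rng.uniform(0.5, 1.0, size=(n_particles, 1))
        vel = w * vel + c1 * avoid * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, bounds[:, 0], bounds[:, 1])
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        gbest = pbest[pbest_fit.argmax()]

    print("best params (max_depth, lr, n_estimators):", gbest)
    print("best CV accuracy:", pbest_fit.max())

The integer hyperparameters are rounded from continuous particle coordinates rather than searched discretely, which keeps the whole search space continuous as the abstract requires; swapping in the paper's actual avoidance weighting would only change the velocity update line.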
ISSN:2169-3536