CatBoost Optimization Using Recursive Feature Elimination


Bibliographic Details
Main Authors: Agus Hadianto, Wiranto Herry Utomo
Format: Article
Language: English
Published: Department of Informatics, UIN Sunan Gunung Djati Bandung 2024-08-01
Series: JOIN: Jurnal Online Informatika
Online Access: https://join.if.uinsgd.ac.id/index.php/join/article/view/1324
Description
Summary: CatBoost is a powerful machine learning algorithm suited to both classification and regression applications. Many studies focus on applying it, but few address how to enhance its performance, especially when Recursive Feature Elimination (RFE) is used for feature selection. This study examines CatBoost optimization for regression tasks by using RFE for feature selection in combination with several regression algorithms. In addition, an Isolation Forest algorithm is employed during preprocessing to identify and eliminate outliers from the dataset. The experiment compares the performance of the CatBoost regression model with and without RFE feature selection. The results indicate that CatBoost with RFE, which selects features using Random Forests, performs better than the baseline model without feature selection. CatBoost-RFE outperformed the baseline with notable gains of over 48.6% in training time, 8.2% in RMSE score, and 1.3% in R² score. Furthermore, compared to AdaBoost, Gradient Boosting, XGBoost, and artificial neural networks (ANN), CatBoost-RFE demonstrated better prediction accuracy. This improvement has substantial implications for predicting the exhaust temperature in a coal-fired power plant.
ISSN: 2528-1682, 2527-9165
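The workflow described in the summary (Isolation Forest outlier removal, RFE driven by a Random Forest, then a CatBoost regressor compared with and without the selected features) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the synthetic dataset, the contamination rate, the number of retained features, the tree and iteration counts, and the helper names `make_model` and `run_pipeline` are all assumptions made for the example. If the `catboost` package is not installed, the sketch falls back to scikit-learn's `GradientBoostingRegressor` purely so that it still runs.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import IsolationForest, RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

try:
    from catboost import CatBoostRegressor

    def make_model():
        # Hypothetical settings; the record does not list hyperparameters.
        return CatBoostRegressor(iterations=200, verbose=0, random_seed=0)
except ImportError:
    from sklearn.ensemble import GradientBoostingRegressor

    def make_model():
        # Stand-in used only when catboost is unavailable.
        return GradientBoostingRegressor(random_state=0)


def run_pipeline(X, y, n_features=5, seed=0):
    """Compare a baseline regressor with one trained on RFE-selected features."""
    # Step 1: drop rows flagged as outliers (-1) by Isolation Forest.
    # The 5% contamination rate is an assumed value.
    iso = IsolationForest(contamination=0.05, random_state=seed)
    keep = iso.fit_predict(X) == 1
    X, y = X[keep], y[keep]

    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )

    # Step 2: RFE ranks features with a Random Forest and keeps n_features.
    rfe = RFE(
        RandomForestRegressor(n_estimators=50, random_state=seed),
        n_features_to_select=n_features,
    )
    rfe.fit(X_tr, y_tr)

    # Step 3: train the same model on all features and on the RFE subset.
    masks = {
        "baseline": np.ones(X.shape[1], dtype=bool),
        "rfe": rfe.support_,
    }
    results = {}
    for name, cols in masks.items():
        model = make_model()
        model.fit(X_tr[:, cols], y_tr)
        pred = model.predict(X_te[:, cols])
        results[name] = {
            "n_features": int(cols.sum()),
            "rmse": float(np.sqrt(mean_squared_error(y_te, pred))),
            "r2": float(r2_score(y_te, pred)),
        }
    return results


# Synthetic stand-in data; the study itself uses power-plant measurements.
X, y = make_regression(
    n_samples=300, n_features=12, n_informative=5, noise=10.0, random_state=0
)
results = run_pipeline(X, y)
for name, metrics in results.items():
    print(name, metrics)
```

The returned dictionary holds the feature count, RMSE, and R² for both variants, which is the kind of side-by-side comparison the summary reports. The numbers obtained on synthetic data say nothing about the gains reported in the article.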