A Cloud-Based Optimized Ensemble Model for Risk Prediction of Diabetic Progression—An Azure Machine Learning Perspective
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10836739/ |
Summary: | Machine learning has proven highly beneficial for predictive analysis in healthcare, particularly for diseases such as diabetes. This study introduces an optimized ensemble of Light Gradient-Boosting Machine (LightGBM) and K-Nearest Neighbour (KNN) classifiers that predicts the progression of Type 2 diabetes, classifying it as high or low risk from patient health parameters and serum measurements. LightGBM is a fast, efficient gradient boosting framework; KNN classifies a data point by its proximity to labelled neighbours. Both base models are tuned with techniques such as grid search and 10-fold cross-validation, and their outputs are combined by a voting classifier that uses soft voting to determine the final class, so the ensemble draws on the predictive strengths of both methods. The experiment is implemented on Microsoft’s Azure cloud using the Azure Machine Learning service, which offers the scalability and security of cloud computing and the potential for integration into IoT-based smart healthcare systems, including remote patient monitoring. The ensemble achieves an area under the receiver operating characteristic curve (AUC-ROC) of 83.2% and an accuracy of 75%, indicating good classification performance. In comparison with other classification and ensemble models, the proposed model performs better. The ensemble was also tested with several metaheuristic optimization methods, which produced comparable scores. The method is further validated on a second risk prediction dataset, which supports its reliability. The model’s predictions can help individuals understand their risk of disease progression and guide medical professionals in planning interventions. |
---|---|
ISSN: | 2169-3536 |