All-Cause Mortality Prediction in Subjects with Diabetes Mellitus Using a Machine Learning Model and Shapley Values

Background/Objectives: Diabetes mellitus (DM) is a prevalent disease with an increased risk of complications. Identifying risk factors for mortality in these patients is crucial, as early recognition can facilitate prompt therapeutic intervention. Machine learning (ML) models have proved to be valua...

Bibliographic Details
Main Authors: Oana Mirea, Mostafa Ghelich Oghli, Oana Neagoe, Mihaela Berceanu, Eugen Țieranu, Liviu Moraru, Victor Raicea, Ionuț Donoiu
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Diabetology
Subjects: diabetes mellitus; machine learning; all-cause mortality
Online Access: https://www.mdpi.com/2673-4540/6/1/5
_version_ 1832588716186533888
author Oana Mirea
Mostafa Ghelich Oghli
Oana Neagoe
Mihaela Berceanu
Eugen Țieranu
Liviu Moraru
Victor Raicea
Ionuț Donoiu
author_sort Oana Mirea
collection DOAJ
description Background/Objectives: Diabetes mellitus (DM) is a prevalent disease that carries an increased risk of complications. Identifying risk factors for mortality in these patients is crucial, as early recognition can facilitate prompt therapeutic intervention. Machine learning (ML) models have proved to be valuable tools in a range of healthcare decision-making scenarios. We aimed to develop and test an ML model to predict all-cause mortality in a large cohort of subjects with DM. Methods: We included 1969 consecutive patients with type 1 DM (T1DM, n = 255) and type 2 DM (T2DM, n = 1714). eXtreme Gradient Boosting (XGBoost) was used to predict all-cause mortality in this cohort, and Shapley additive explanations (SHAP) were used to assess the importance of each feature of the classifier. Missing values were imputed using the missForest method. Results: The all-cause mortality rate was 21% during 5.5 ± 1.1 years of follow-up. The ML model achieved 90% sensitivity and 87% specificity, with an AUC of 0.88 and an accuracy of 88%, for predicting all-cause mortality. The SHAP analysis identified a lower estimated glomerular filtration rate (eGFR), duration of insulin therapy, and a lower hemoglobin level as the three leading contributors to higher mortality. Conclusions: ML models can become valuable tools in clinical practice because they can simultaneously assess the cumulative influence of multiple parameters and discover high-order interactions. Applying such models in clinical practice could improve the early identification of subjects at risk of complications and mortality and prompt early therapeutic interventions.
format Article
id doaj-art-a2d176fd0dfd415a8c9babaf439a1a03
institution Kabale University
issn 2673-4540
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Diabetology
spelling doaj-art-a2d176fd0dfd415a8c9babaf439a1a03
2025-01-24T13:28:46Z
eng
MDPI AG
Diabetology
2673-4540
2025-01-01
Volume 6, Issue 1, Article 5
10.3390/diabetology6010005
All-Cause Mortality Prediction in Subjects with Diabetes Mellitus Using a Machine Learning Model and Shapley Values
Oana Mirea, Department of Cardiology, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
Mostafa Ghelich Oghli, Research and Development Department, Med Fanavaran Plus Co., Karaj 3187411213, Iran
Oana Neagoe, Department of Cardiology, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
Mihaela Berceanu, Department of Cardiology, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
Eugen Țieranu, Department of Cardiology, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
Liviu Moraru, Department of Anatomy, “George Emil Palade” University of Medicine, Pharmacy, Sciences and Technology, 540142 Târgu Mureș, Romania
Victor Raicea, Department of Cardiovascular Surgery, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
Ionuț Donoiu, Department of Cardiology, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
https://www.mdpi.com/2673-4540/6/1/5
diabetes mellitus
machine learning
all-cause mortality
title All-Cause Mortality Prediction in Subjects with Diabetes Mellitus Using a Machine Learning Model and Shapley Values
topic diabetes mellitus
machine learning
all-cause mortality
url https://www.mdpi.com/2673-4540/6/1/5