Clinical validation and optimization of machine learning models for early prediction of sepsis

Introduction: Sepsis is a global health threat with high incidence and mortality. Early prediction of sepsis onset can drive effective interventions and improve patient outcomes. Methods: Data were collected retrospectively from a cohort of 2,329 adult patients with positive bacterial cultures...

Bibliographic Details
Main Authors: Xi Liu, Meiyi Li, Xu Liu, Yuting Luo, Dong Yang, Hui Ouyang, Jiaoling He, Jinyu Xia, Fei Xiao
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-02-01
Series: Frontiers in Medicine
Subjects: sepsis; machine learning; artificial intelligence; prediction model; infectious disease
Online Access: https://www.frontiersin.org/articles/10.3389/fmed.2025.1521660/full
_version_ 1832096703922044928
author Xi Liu
Meiyi Li
Xu Liu
Yuting Luo
Dong Yang
Hui Ouyang
Jiaoling He
Jinyu Xia
Fei Xiao
author_facet Xi Liu
Meiyi Li
Xu Liu
Yuting Luo
Dong Yang
Hui Ouyang
Jiaoling He
Jinyu Xia
Fei Xiao
author_sort Xi Liu
collection DOAJ
description Introduction: Sepsis is a global health threat with high incidence and mortality. Early prediction of sepsis onset can drive effective interventions and improve patient outcomes. Methods: Data were collected retrospectively from a cohort of 2,329 adult patients with positive bacterial cultures at a tertiary hospital in China between October 1, 2019 and September 30, 2020. Thirty-six clinical features were selected as model inputs. We trained five machine learning (ML) models to predict sepsis: logistic regression, decision tree, random forest (RF), multi-layer perceptron, and light gradient boosting. Model performance was evaluated by area under the ROC curve (AUC), accuracy, F1-score, sensitivity, and specificity. Data from a second cohort of 2,286 patients, collected between October 1, 2020 and April 1, 2022, were used to validate the model that performed best on the internal validation set. The Shapley additive explanations (SHAP) method was applied to evaluate feature importance and explain the predictions of this model. Results: Of the five ML models, the RF model performed best in terms of AUC (0.818), F1-score (0.38), and sensitivity (0.746). The RF model also achieved a comparable AUC (0.771) on the external validation set. The SHAP method identified procalcitonin, albumin, prothrombin time, and sex as the important variables contributing to the prediction of sepsis. Discussion: The RF model showed the greatest potential for early prediction of sepsis in admitted patients and could aid clinicians in their decision-making. Our findings also suggest that male patients with bacterial infections and high procalcitonin levels, lower albumin levels, or prolonged prothrombin times were more likely to develop sepsis.
format Article
id doaj-art-ca578809801a48b4a58fe6aae008237e
institution Kabale University
issn 2296-858X
language English
publishDate 2025-02-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Medicine
spelling doaj-art-ca578809801a48b4a58fe6aae008237e
2025-02-05T12:25:09Z
eng
Frontiers Media S.A.
Frontiers in Medicine
2296-858X
2025-02-01
12
10.3389/fmed.2025.1521660
1521660
Clinical validation and optimization of machine learning models for early prediction of sepsis
Xi Liu (0): Department of Infectious Diseases, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
Meiyi Li (1): Department of Infectious Diseases, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
Xu Liu (2): Department of Infectious Diseases, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
Yuting Luo (3): Department of Infectious Diseases, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
Dong Yang (4): Guangzhou AID Cloud Technology, Guangzhou, China
Hui Ouyang (5): Department of Infectious Diseases, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
Jiaoling He (6): Department of Infectious Diseases, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
Jinyu Xia (7): Department of Infectious Diseases, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
Fei Xiao (8): Department of Infectious Diseases, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
Fei Xiao (9): Guangdong Provincial Key Laboratory of Biomedical Imaging, The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China
Fei Xiao (10): Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China
https://www.frontiersin.org/articles/10.3389/fmed.2025.1521660/full
sepsis
machine learning
artificial intelligence
prediction model
infectious disease
spellingShingle Xi Liu
Meiyi Li
Xu Liu
Yuting Luo
Dong Yang
Hui Ouyang
Jiaoling He
Jinyu Xia
Fei Xiao
Clinical validation and optimization of machine learning models for early prediction of sepsis
Frontiers in Medicine
sepsis
machine learning
artificial intelligence
prediction model
infectious disease
title Clinical validation and optimization of machine learning models for early prediction of sepsis
title_full Clinical validation and optimization of machine learning models for early prediction of sepsis
title_fullStr Clinical validation and optimization of machine learning models for early prediction of sepsis
title_full_unstemmed Clinical validation and optimization of machine learning models for early prediction of sepsis
title_short Clinical validation and optimization of machine learning models for early prediction of sepsis
title_sort clinical validation and optimization of machine learning models for early prediction of sepsis
topic sepsis
machine learning
artificial intelligence
prediction model
infectious disease
url https://www.frontiersin.org/articles/10.3389/fmed.2025.1521660/full
work_keys_str_mv AT xiliu clinicalvalidationandoptimizationofmachinelearningmodelsforearlypredictionofsepsis
AT meiyili clinicalvalidationandoptimizationofmachinelearningmodelsforearlypredictionofsepsis
AT xuliu clinicalvalidationandoptimizationofmachinelearningmodelsforearlypredictionofsepsis
AT yutingluo clinicalvalidationandoptimizationofmachinelearningmodelsforearlypredictionofsepsis
AT dongyang clinicalvalidationandoptimizationofmachinelearningmodelsforearlypredictionofsepsis
AT huiouyang clinicalvalidationandoptimizationofmachinelearningmodelsforearlypredictionofsepsis
AT jiaolinghe clinicalvalidationandoptimizationofmachinelearningmodelsforearlypredictionofsepsis
AT jinyuxia clinicalvalidationandoptimizationofmachinelearningmodelsforearlypredictionofsepsis
AT feixiao clinicalvalidationandoptimizationofmachinelearningmodelsforearlypredictionofsepsis
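
The description in this record names five evaluation metrics (AUC, accuracy, F1-score, sensitivity, specificity) without defining them. The following is a minimal Python sketch of how such metrics are commonly computed for a binary sepsis classifier. It is not the authors' code: the function names, the 0.5 threshold, and the toy labels and scores are all hypothetical, and the record does not state which threshold or software the study used.

```python
from __future__ import annotations

from typing import Sequence


def threshold_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> dict[str, float]:
    """Accuracy, sensitivity, specificity, precision and F1 at one cut-off.

    Labels are 1 (sepsis) or 0 (no sepsis); a score >= threshold counts as
    a positive prediction. The 0.5 default is an assumption for illustration.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on true cases
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random
    negative, with ties counted as one half (pairwise, O(n_pos * n_neg))."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def implied_precision(f1: float, sensitivity: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and recall R."""
    denom = 2 * sensitivity - f1
    if denom <= 0:
        raise ValueError("no valid precision for these inputs")
    return f1 * sensitivity / denom


# Hypothetical toy data, not taken from the study.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_score = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

if __name__ == "__main__":
    print(threshold_metrics(toy_true, toy_score))
    print("AUC:", roc_auc(toy_true, toy_score))
    print("Implied precision:", implied_precision(0.38, 0.746))
```

If the reported F1 value (0.38) and sensitivity (0.746) refer to the same operating point and to the sepsis class, which the record does not confirm, the last function gives a precision of roughly 0.25. That pattern is typical of a model tuned to catch most true cases on an imbalanced cohort.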