Clinical validation and optimization of machine learning models for early prediction of sepsis
Introduction: Sepsis is a global health threat with a high incidence and mortality rate. Early prediction of sepsis onset can drive effective interventions and improve patient outcomes. Methods: Data were collected retrospectively from a cohort of 2,329 adult patients with positive bacterial cultures...
Main Authors: | Xi Liu, Meiyi Li, Xu Liu, Yuting Luo, Dong Yang, Hui Ouyang, Jiaoling He, Jinyu Xia, Fei Xiao |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2025-02-01 |
Series: | Frontiers in Medicine |
Subjects: | sepsis; machine learning; artificial intelligence; prediction model; infectious disease |
Online Access: | https://www.frontiersin.org/articles/10.3389/fmed.2025.1521660/full |
_version_ | 1832096703922044928 |
---|---|
author | Xi Liu, Meiyi Li, Xu Liu, Yuting Luo, Dong Yang, Hui Ouyang, Jiaoling He, Jinyu Xia, Fei Xiao |
author_sort | Xi Liu |
collection | DOAJ |
description | Introduction: Sepsis is a global health threat with a high incidence and mortality rate. Early prediction of sepsis onset can drive effective interventions and improve patient outcomes. Methods: Data were collected retrospectively from a cohort of 2,329 adult patients with positive bacterial cultures from a tertiary hospital in China between October 1, 2019 and September 30, 2020. Thirty-six clinical features were selected as inputs for the models. We trained five machine learning (ML) models to predict sepsis: logistic regression, decision tree, random forest (RF), multi-layer perceptron, and light gradient boosting. We evaluated the five models using area under the ROC curve (AUC), accuracy, F1-score, sensitivity, and specificity. Data from another cohort of 2,286 patients, collected between October 1, 2020 and April 1, 2022, were used to validate the model that performed best on the internal validation set. The Shapley additive explanations (SHAP) method was applied to evaluate feature importance and explain the predictions of this model. Results: Of the five ML models developed, the RF model demonstrated the best performance in terms of AUC (0.818), F1-score (0.38), and sensitivity (0.746). The RF model also achieved a comparable AUC (0.771) on the external validation set. The SHAP method identified procalcitonin, albumin, prothrombin time, and sex as the important variables contributing to the prediction of sepsis. Discussion: The RF model we developed showed the greatest potential for early prediction of sepsis in admitted patients and could aid clinicians in their decision-making. Our findings also suggest that male patients with bacterial infections and high procalcitonin levels, lower albumin levels, or prolonged prothrombin times were more likely to develop sepsis. |
format | Article |
id | doaj-art-ca578809801a48b4a58fe6aae008237e |
institution | Kabale University |
issn | 2296-858X |
language | English |
publishDate | 2025-02-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Medicine |
spelling | doaj-art-ca578809801a48b4a58fe6aae008237e; 2025-02-05T12:25:09Z; eng; Frontiers Media S.A.; Frontiers in Medicine; 2296-858X; 2025-02-01; 12; 10.3389/fmed.2025.1521660; 1521660; Clinical validation and optimization of machine learning models for early prediction of sepsis; Xi Liu (0), Meiyi Li (1), Xu Liu (2), Yuting Luo (3), Dong Yang (4), Hui Ouyang (5), Jiaoling He (6), Jinyu Xia (7), Fei Xiao (8, 9, 10); affiliations: (0-3, 5-8) Department of Infectious Diseases, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China; (4) Guangzhou AID Cloud Technology, Guangzhou, China; (9) Guangdong Provincial Key Laboratory of Biomedical Imaging, The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China; (10) Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China; https://www.frontiersin.org/articles/10.3389/fmed.2025.1521660/full; sepsis; machine learning; artificial intelligence; prediction model; infectious disease |
title | Clinical validation and optimization of machine learning models for early prediction of sepsis |
topic | sepsis; machine learning; artificial intelligence; prediction model; infectious disease |
url | https://www.frontiersin.org/articles/10.3389/fmed.2025.1521660/full |
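
The description above names five evaluation metrics: AUC, accuracy, F1-score, sensitivity, and specificity. As a rough illustration of how these are commonly computed from binary labels and predicted scores, here is a minimal standard-library sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example, and the record does not say which threshold or software the study used.

```python
from typing import Dict, Sequence


def threshold_metrics(y_true: Sequence[int], y_score: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, F1, sensitivity, and specificity at one threshold.

    The 0.5 default is an assumption for illustration only.
    """
    # Turn scores into hard 0/1 predictions.
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
        # Sensitivity: share of true sepsis cases that were flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of non-sepsis cases correctly left unflagged.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half (rank-based formulation)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for this example (1 = sepsis, 0 = no sepsis).
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
    print(threshold_metrics(labels, scores))
    print("auc", roc_auc(labels, scores))
```

AUC is threshold-free, while the other four metrics change with the chosen cut-off, so a model can show a moderate AUC and sensitivity alongside a low F1-score when few flagged patients are true positives.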