Development and validation of an explainable machine learning model for predicting prognosis in sepsis patients with a history of cancer who were admitted to the intensive care unit

Bibliographic Details
Main Authors: Xiang Luo, Xiuji Kan, Dongliang Wang, Yu Shi, Siqi Zhu, Zhenyu Chen, Congcong Wang, Wenqi Zhu, Xiangtong Wang, Wenwen Sun
Format: Article
Language: English
Published: SAGE Publishing 2025-08-01
Series: Journal of International Medical Research
Online Access: https://doi.org/10.1177/03000605251362991
Description
Summary: Background: Sepsis is the leading cause of mortality in critically ill cancer patients; however, traditional prognostic models fail to capture the complexity of their immune and physiological interactions. Methods: This retrospective study analyzed electronic health records from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database for patients with sepsis who had a documented history of cancer and were admitted to the intensive care unit (ICU). A two-step feature selection approach, combining least absolute shrinkage and selection operator (LASSO) regression with recursive feature elimination, was used to identify key prognostic variables. Eight machine learning algorithms, including random forest and extreme gradient boosting, were trained and evaluated using five-fold cross-validation. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), Brier score, sensitivity, and specificity. SHapley Additive exPlanations, partial dependence plots, and Break Down analyses were applied to enhance model interpretability. Results: The final cohort included 3364 patients admitted to the ICU. Nonsurvivors had significantly higher illness severity scores (Acute Physiology Score III and Sequential Organ Failure Assessment) than survivors (p < 0.001). Among the tested models, the random forest model performed best, achieving the highest AUC (0.78; 95% confidence interval: 0.76–0.80) and the lowest Brier score (0.15), indicating strong predictive accuracy. Conclusions: This study developed machine learning models for predicting in-hospital mortality in sepsis patients with a history of cancer, leveraging the MIMIC-IV database for comprehensive risk assessment.
ISSN: 1473-2300
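
Illustrative note: the summary above names four evaluation metrics (area under the receiver operating characteristic curve, Brier score, sensitivity, and specificity) without defining them. The following is a minimal sketch of how these quantities are commonly computed from predicted probabilities. It is not the authors' code: the function names, the 0.5 classification threshold, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and observed outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Chance that a randomly chosen positive is scored above a randomly chosen negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for a small illustration but slow for large cohorts.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity (true positive rate) and specificity (true negative rate).

    The 0.5 threshold is an assumption; the source does not state which cutoff was used.
    """
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (1 = died in hospital, 0 = survived); not taken from the study.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.2]

auc = auroc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
sens, spec = sensitivity_specificity(y_true, y_prob)
```

In a five-fold cross-validation setting such as the one the summary describes, functions like these would typically be applied to the held-out fold of each split and the results then averaged across folds.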