Predictive modelling and identification of critical variables of mortality risk in COVID-19 patients

Abstract South Africa was the most affected country in Africa by the coronavirus disease 2019 (COVID-19) pandemic, where over 4 million confirmed cases of COVID-19 and over 102,000 deaths have been recorded since 2019. Aside from clinical methods, artificial intelligence (AI)-based solutions such as...

Bibliographic Details
Main Authors: Olawande Daramola, Tatenda Duncan Kavu, Maritha J. Kotze, Jeanine L. Marnewick, Oluwafemi A. Sarumi, Boniface Kabaso, Thomas Moser, Karl Stroetmann, Isaac Fwemba, Fisayo Daramola, Martha Nyirenda, Susan J. van Rensburg, Peter S. Nyasulu
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-023-46712-w
collection DOAJ
description Abstract: South Africa was the African country most affected by the coronavirus disease 2019 (COVID-19) pandemic, with over 4 million confirmed cases and over 102,000 deaths recorded since 2019. Aside from clinical methods, artificial intelligence (AI)-based solutions such as machine learning (ML) models have been employed in treating COVID-19 cases. However, limited application of AI for COVID-19 in Africa has been reported in the literature. This study aimed to investigate the performance and interpretability of several ML algorithms, including deep multilayer perceptron (Deep MLP), support vector machine (SVM), and extreme gradient boosting trees (XGBoost), for predicting COVID-19 mortality risk, with an emphasis on the effect of cross-validation (CV) and principal component analysis (PCA) on the results. For this purpose, a dataset with 154 features from 490 COVID-19 patients admitted to the intensive care unit (ICU) of Tygerberg Hospital in Cape Town, South Africa, during the first wave of COVID-19 in 2020 was retrospectively analysed. Our results show that Deep MLP had the best overall performance (F1 = 0.92; area under the curve (AUC) = 0.94) when CV and the synthetic minority oversampling technique (SMOTE) were applied without PCA. Using the Shapley Additive exPlanations (SHAP) model to interpret the mortality risk predictions, we identified Length of stay (LOS) in the hospital, LOS in the ICU, Time to ICU from admission, days discharged alive or death, D-dimer (blood clotting factor), and blood pH as the six most critical variables for mortality risk prediction. In addition, Age at admission, Pf ratio (PaO2/FiO2 ratio), troponin T (TropT), ferritin, ventilation, C-reactive protein (CRP), and symptoms of acute respiratory distress syndrome (ARDS) were associated with the severity and fatality of COVID-19 cases. The study shows how ML could assist medical practitioners in making informed decisions on handling critically ill COVID-19 patients with comorbidities. It also offers insight into the combined effect of CV, PCA, and SMOTE on the performance of ML models for COVID-19 mortality risk prediction, which has been little explored.
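The abstract names SMOTE combined with cross-validation but the record does not say how the authors implemented it. As a rough illustration only, the sketch below shows the core idea of SMOTE (synthetic minority samples interpolated between nearest minority neighbours) applied to a single training fold, so that the held-out fold is never oversampled. The function names, the toy data, the choice of label 1 as the minority class, and the value of `k` are all assumptions made for this example, not details from the study.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours."""
    if n_new > 0 and len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


def oversample_training_fold(X_train, y_train, minority_label=1, k=5, seed=0):
    """Balance one training fold; the held-out fold must stay untouched."""
    minority = [x for x, y in zip(X_train, y_train) if y == minority_label]
    n_majority = len(y_train) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_points = smote(minority, n_new, k=k, seed=seed)
    return X_train + new_points, y_train + [minority_label] * len(new_points)


# Hypothetical toy fold: 6 majority (label 0) and 3 minority (label 1) rows,
# two made-up features each. These values are not from the study's dataset.
X = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.3], [0.2, 0.4], [0.1, 0.0],
     [0.9, 0.8], [1.0, 0.9], [0.8, 1.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = oversample_training_fold(X, y, k=2)
print(len(X_bal), y_bal.count(0), y_bal.count(1))  # prints: 12 6 6
```

In a k-fold setting, a helper like this would be called once per fold on the training portion only. Oversampling before splitting would let synthetic copies of test patients leak into training and inflate scores such as F1 and AUC.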
id doaj-art-491af48889af4249b3345bdc0698c4a7
institution Kabale University
issn 2045-2322
Author affiliations:
Olawande Daramola: Department of Information Technology, Cape Peninsula University of Technology
Tatenda Duncan Kavu: Department of Information Technology, Cape Peninsula University of Technology
Maritha J. Kotze: Division of Chemical Pathology, Department of Pathology, Faculty of Medicine and Health Sciences, Stellenbosch University
Jeanine L. Marnewick: Applied Microbial and Health Biotechnology Institute, Cape Peninsula University of Technology
Oluwafemi A. Sarumi: Department of Mathematics and Computer Science, Philipps University of Marburg
Boniface Kabaso: Department of Information Technology, Cape Peninsula University of Technology
Thomas Moser: St Pölten University of Applied Sciences
Karl Stroetmann: School of Health Information Science, University of Victoria
Isaac Fwemba: Division of Epidemiology and Biostatistics, Faculty of Medicine, and Health Sciences, Stellenbosch University
Fisayo Daramola: Division of Epidemiology and Biostatistics, Faculty of Medicine, and Health Sciences, Stellenbosch University
Martha Nyirenda: Division of Epidemiology and Biostatistics, Faculty of Medicine, and Health Sciences, Stellenbosch University
Susan J. van Rensburg: Division of Chemical Pathology, Department of Pathology, Faculty of Medicine and Health Sciences, Stellenbosch University
Peter S. Nyasulu: Division of Epidemiology and Biostatistics, Faculty of Medicine, and Health Sciences, Stellenbosch University