Development of a Predictive Model of Occult Cancer After a Venous Thromboembolism Event Using Machine Learning: The CLOVER Study

Background and Objectives: Venous thromboembolism (VTE) can be the first manifestation of an underlying cancer. This study aimed to develop a predictive model to assess the risk of occult cancer between 30 days and 24 months after a venous thrombotic event using machine learning (...

Bibliographic Details
Main Authors: Anabel Franco-Moreno, Elena Madroñal-Cerezo, Cristina Lucía de Ancos-Aracil, Ana Isabel Farfán-Sedano, Nuria Muñoz-Rivas, José Bascuñana Morejón-Girón, José Manuel Ruiz-Giardín, Federico Álvarez-Rodríguez, Jesús Prada-Alonso, Yvonne Gala-García, Miguel Ángel Casado-Suela, Ana Bustamante-Fermosel, Nuria Alfaro-Fernández, Juan Torres-Macho
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Medicina
Subjects:
Online Access: https://www.mdpi.com/1648-9144/61/1/18
_version_ 1832587996261515264
author Anabel Franco-Moreno
Elena Madroñal-Cerezo
Cristina Lucía de Ancos-Aracil
Ana Isabel Farfán-Sedano
Nuria Muñoz-Rivas
José Bascuñana Morejón-Girón
José Manuel Ruiz-Giardín
Federico Álvarez-Rodríguez
Jesús Prada-Alonso
Yvonne Gala-García
Miguel Ángel Casado-Suela
Ana Bustamante-Fermosel
Nuria Alfaro-Fernández
Juan Torres-Macho
collection DOAJ
description Background and Objectives: Venous thromboembolism (VTE) can be the first manifestation of an underlying cancer. This study aimed to develop a predictive model to assess the risk of occult cancer between 30 days and 24 months after a venous thrombotic event using machine learning (ML). Materials and Methods: We designed a case–control study nested in a cohort of patients with VTE included in a prospective registry from two Spanish hospitals between 2005 and 2021. Both clinical and ML-driven feature selection were performed to identify predictors for occult cancer. XGBoost, LightGBM, and CatBoost algorithms were used to train different prediction models, which were subsequently validated in a hold-out dataset. Results: A total of 815 patients with VTE were included (51.5% male and median age of 59). During follow-up, 56 patients (6.9%) were diagnosed with cancer. One hundred and twenty-one variables were explored for the predictive analysis. CatBoost obtained the best performance metrics among the ML models analyzed. The final CatBoost model included, among the top 15 variables to predict hidden malignancy, age, gender, systolic blood pressure, heart rate, weight, chronic lung disease, D-dimer, alanine aminotransferase, hemoglobin, serum creatinine, cholesterol, platelets, triglycerides, leukocyte count and previous VTE. The model had an ROC-AUC of 0.86 (95% CI, 0.83–0.87) in the test set. Sensitivity, specificity, and negative and positive predictive values were 62%, 94%, 93% and 75%, respectively. Conclusions: This is the first risk score developed for identifying patients with VTE who are at increased risk of occult cancer using ML tools, obtaining a remarkably high diagnostic accuracy. This study’s limitations include potential information bias from electronic health records and a small cancer sample size. In addition, variability in detection protocols and evolving clinical practices may affect model accuracy. Our score needs external validation.
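The four diagnostic metrics reported in the description (sensitivity, specificity, and negative and positive predictive values) all derive from the counts in a binary confusion matrix. The sketch below shows the standard definitions in Python. The function name `diagnostic_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not the study's data, and the record does not say how the authors computed their figures.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute standard diagnostic metrics from confusion-matrix counts.

    tp/fn: patients with cancer who were flagged / missed by the model.
    tn/fp: patients without cancer who were cleared / wrongly flagged.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
        "ppv": tp / (tp + fp),          # probability of cancer given a positive flag
        "npv": tn / (tn + fn),          # probability of no cancer given a negative flag
    }


# Hypothetical counts for illustration only; NOT taken from the CLOVER study.
metrics = diagnostic_metrics(tp=6, fp=2, tn=90, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the model itself, whereas the predictive values also depend on how common cancer is in the evaluated sample, which is one reason external validation in other populations matters.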
format Article
id doaj-art-68485b073d1e465caede3a7eea38b9ac
institution Kabale University
issn 1010-660X
1648-9144
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Medicina
spelling doaj-art-68485b073d1e465caede3a7eea38b9ac
2025-01-24T13:40:16Z
eng
MDPI AG
Medicina
1010-660X
1648-9144
2024-12-01
61
1
18
10.3390/medicina61010018
Development of a Predictive Model of Occult Cancer After a Venous Thromboembolism Event Using Machine Learning: The CLOVER Study
Anabel Franco-Moreno (0): Department of Internal Medicine, Hospital Universitario Infanta Leonor–Virgen de la Torre, 28031 Madrid, Spain
Elena Madroñal-Cerezo (1): Department of Internal Medicine, Hospital Universitario de Fuenlabrada, 28942 Madrid, Spain
Cristina Lucía de Ancos-Aracil (2): Department of Internal Medicine, Hospital Universitario de Fuenlabrada, 28942 Madrid, Spain
Ana Isabel Farfán-Sedano (3): Department of Internal Medicine, Clínica Universidad de Navarra-Hospital, 31008 Pamplona, Spain
Nuria Muñoz-Rivas (4): Department of Internal Medicine, Hospital Universitario Infanta Leonor–Virgen de la Torre, 28031 Madrid, Spain
José Bascuñana Morejón-Girón (5): Department of Internal Medicine, Hospital Universitario 12 de Octubre, 28041 Madrid, Spain
José Manuel Ruiz-Giardín (6): Department of Internal Medicine, Hospital Universitario de Fuenlabrada, 28942 Madrid, Spain
Federico Álvarez-Rodríguez (7): Department of Anatomical Pathology, Hospital Universitario Infanta Leonor–Virgen de la Torre, 28031 Madrid, Spain
Jesús Prada-Alonso (8): Horus-ML, Alcalá Street 268, 28027 Madrid, Spain
Yvonne Gala-García (9): Horus-ML, Alcalá Street 268, 28027 Madrid, Spain
Miguel Ángel Casado-Suela (10): Department of Internal Medicine, Hospital Universitario Infanta Leonor–Virgen de la Torre, 28031 Madrid, Spain
Ana Bustamante-Fermosel (11): Department of Internal Medicine, Hospital Universitario Infanta Leonor–Virgen de la Torre, 28031 Madrid, Spain
Nuria Alfaro-Fernández (12): Department of Internal Medicine, Hospital Universitario Infanta Leonor–Virgen de la Torre, 28031 Madrid, Spain
Juan Torres-Macho (13): Department of Internal Medicine, Hospital Universitario Infanta Leonor–Virgen de la Torre, 28031 Madrid, Spain
https://www.mdpi.com/1648-9144/61/1/18
early detection of cancer
machine learning
occult malignancy
predictive model
venous thromboembolism
title Development of a Predictive Model of Occult Cancer After a Venous Thromboembolism Event Using Machine Learning: The CLOVER Study
topic early detection of cancer
machine learning
occult malignancy
predictive model
venous thromboembolism
url https://www.mdpi.com/1648-9144/61/1/18