Leveraging survival analysis and machine learning for accurate prediction of breast cancer recurrence and metastasis
Abstract Breast cancer, with its high incidence and mortality globally, necessitates early prediction of local and distant recurrence to improve treatment outcomes. This study develops and validates predictive models for breast cancer recurrence and metastasis using Recurrence-Free Survival Analysis...
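Main Authors: | Shahd M. Noman, Youssef M. Fadel, Mayar T. Henedak, Nada A. Attia, Malak Essam, Sarah Elmaasarawii, Fayrouz A. Fouad, Esraa G. Eltasawi, Walid Al-Atabany |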
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Scientific Reports |
Subjects: | Breast cancer, Recurrence prediction, Machine learning, Metastasis, Survival analysis |
Online Access: | https://doi.org/10.1038/s41598-025-87622-3 |
_version_ | 1832571641424510976 |
---|---|
author | Shahd M. Noman, Youssef M. Fadel, Mayar T. Henedak, Nada A. Attia, Malak Essam, Sarah Elmaasarawii, Fayrouz A. Fouad, Esraa G. Eltasawi, Walid Al-Atabany |
author_facet | Shahd M. Noman, Youssef M. Fadel, Mayar T. Henedak, Nada A. Attia, Malak Essam, Sarah Elmaasarawii, Fayrouz A. Fouad, Esraa G. Eltasawi, Walid Al-Atabany |
author_sort | Shahd M. Noman |
collection | DOAJ |
description | Abstract Breast cancer, with its high incidence and mortality globally, necessitates early prediction of local and distant recurrence to improve treatment outcomes. This study develops and validates predictive models for breast cancer recurrence and metastasis using Recurrence-Free Survival Analysis and machine learning techniques. We merged datasets from the Molecular Taxonomy of Breast Cancer International Consortium, Memorial Sloan Kettering Cancer Center, Duke University, and the SEER program, creating a comprehensive dataset of 272,252 rows and 23 columns. Our methodology utilized three predictive strategies: assessing recurrence risk, differentiating local from distant recurrences, and identifying potential metastatic sites. Key prognostic factors were identified through survival analysis. LightGBM, XGBoost, and Random Forest models were employed and validated against data from the Baheya Foundation. The models demonstrated strong performance; the survival analysis achieved a C-index of 0.837. The LightGBM model reached an AUC of 92% in predicting recurrences, while XGBoost and Random Forest models distinguished recurrence types with up to 86% accuracy, and they effectively differentiated between bone metastasis and all other locations combined (brain, liver, and lungs). This study highlights the significant potential of machine learning in advancing breast cancer management and sets a new benchmark for predictive analytics. Future research will integrate genetic data to further enhance these models. |
format | Article |
id | doaj-art-2ed292d836ed4cd0b96788a9849b1353 |
institution | Kabale University |
issn | 2045-2322 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj-art-2ed292d836ed4cd0b96788a9849b1353; 2025-02-02T12:24:24Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-01-01; 151116; 10.1038/s41598-025-87622-3; Leveraging survival analysis and machine learning for accurate prediction of breast cancer recurrence and metastasis; Shahd M. Noman [0], Youssef M. Fadel [1], Mayar T. Henedak [2], Nada A. Attia [3], Malak Essam [4], Sarah Elmaasarawii [5], Fayrouz A. Fouad [6], Esraa G. Eltasawi [7], Walid Al-Atabany [8]; [0-5, 8] Center for Informatics Science (CIS), School of Information Technology and Computer Science, Nile University, 26th of July Corridor; [6-7] Baheya Center for Early Detection and Treatment of Breast Cancer, Research Center; Abstract Breast cancer, with its high incidence and mortality globally, necessitates early prediction of local and distant recurrence to improve treatment outcomes. This study develops and validates predictive models for breast cancer recurrence and metastasis using Recurrence-Free Survival Analysis and machine learning techniques. We merged datasets from the Molecular Taxonomy of Breast Cancer International Consortium, Memorial Sloan Kettering Cancer Center, Duke University, and the SEER program, creating a comprehensive dataset of 272,252 rows and 23 columns. Our methodology utilized three predictive strategies: assessing recurrence risk, differentiating local from distant recurrences, and identifying potential metastatic sites. Key prognostic factors were identified through survival analysis. LightGBM, XGBoost, and Random Forest models were employed and validated against data from the Baheya Foundation. The models demonstrated strong performance; the survival analysis achieved a C-index of 0.837. The LightGBM model reached an AUC of 92% in predicting recurrences, while XGBoost and Random Forest models distinguished recurrence types with up to 86% accuracy, and they effectively differentiated between bone metastasis and all other locations combined (brain, liver, and lungs). This study highlights the significant potential of machine learning in advancing breast cancer management and sets a new benchmark for predictive analytics. Future research will integrate genetic data to further enhance these models.; https://doi.org/10.1038/s41598-025-87622-3; Breast cancer; Recurrence prediction; Machine learning; Metastasis; Survival analysis |
spellingShingle | Shahd M. Noman, Youssef M. Fadel, Mayar T. Henedak, Nada A. Attia, Malak Essam, Sarah Elmaasarawii, Fayrouz A. Fouad, Esraa G. Eltasawi, Walid Al-Atabany; Leveraging survival analysis and machine learning for accurate prediction of breast cancer recurrence and metastasis; Scientific Reports; Breast cancer; Recurrence prediction; Machine learning; Metastasis; Survival analysis |
title | Leveraging survival analysis and machine learning for accurate prediction of breast cancer recurrence and metastasis |
title_full | Leveraging survival analysis and machine learning for accurate prediction of breast cancer recurrence and metastasis |
title_fullStr | Leveraging survival analysis and machine learning for accurate prediction of breast cancer recurrence and metastasis |
title_full_unstemmed | Leveraging survival analysis and machine learning for accurate prediction of breast cancer recurrence and metastasis |
title_short | Leveraging survival analysis and machine learning for accurate prediction of breast cancer recurrence and metastasis |
title_sort | leveraging survival analysis and machine learning for accurate prediction of breast cancer recurrence and metastasis |
topic | Breast cancer; Recurrence prediction; Machine learning; Metastasis; Survival analysis |
url | https://doi.org/10.1038/s41598-025-87622-3 |