A novel perspective on survival prediction for AML patients: Integration of machine learning in SEER database applications

Objective: The purpose of this study is to explore the epidemiological characteristics of acute myeloid leukemia (AML) and establish a more accurate model for predicting the prognosis of AML patients based on machine learning. Methods: We obtained clinical data of a total of 87,090 AML patients betw...

Bibliographic Details
Main Authors: Zheng-yi Jia, Maierbiya Abulimiti, Yun Wu, Li-na Ma, Xiao-yu Li, Jie Wang
Format: Article
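Language: English
Published: Elsevier 2025-01-01
Series: Heliyon
Subjects: Acute myeloid leukemia; Machine learning; SEER database; Epidemiological characteristics; Prognosis prediction
Online Access: http://www.sciencedirect.com/science/article/pii/S2405844025004104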
_version_ 1832573062938099712
author Zheng-yi Jia
Maierbiya Abulimiti
Yun Wu
Li-na Ma
Xiao-yu Li
Jie Wang
author_sort Zheng-yi Jia
collection DOAJ
description Objective: This study explores the epidemiological characteristics of acute myeloid leukemia (AML) and establishes a more accurate machine learning model for predicting the prognosis of AML patients. Methods: We obtained clinical data for 87,090 AML patients from the SEER database, covering 1975 to 2019. First, we used Kaplan-Meier analysis to compare prognosis across patient strata. Then, we used univariate and multivariate Cox regression analysis to identify the independent factors influencing overall survival (OS). Finally, we used 11 machine learning algorithms to predict survival at 1, 2, and 3 years. We used five-fold cross-validation with 20 cycles to select the optimal parameters for each model and improve prediction accuracy. Results: The Kaplan-Meier analysis showed that patients diagnosed after 2010 had significantly higher survival than those diagnosed earlier. Older age, male gender, and non-Black race were associated with poor prognosis. Among the FAB subtypes, M3 AML had a better prognosis than the other subtypes; among the WHO subtypes, AML associated with Down syndrome had the best prognosis, followed by AML with eosinophilic abnormalities. The Cox regression analysis showed that gender, age, race, and family income were significantly associated with survival. Among the 11 machine learning models, the random forest classifier performed best on multiple evaluation metrics in predicting survival at 1, 2, and 3 years. The XGBoost and neural network classifiers also showed high accuracy and reliability at each prediction horizon.
Conclusion: This study provides a deeper understanding of the epidemiological characteristics of AML and establishes a machine learning prediction model that demonstrates good accuracy and reliability in predicting the prognosis of AML patients.
format Article
id doaj-art-280afa21b9a64bf4961c68979d3abd20
institution Kabale University
issn 2405-8440
language English
publishDate 2025-01-01
publisher Elsevier
record_format Article
series Heliyon
spelling doaj-art-280afa21b9a64bf4961c68979d3abd20
2025-02-02T05:28:49Z
eng
Elsevier
Heliyon
2405-8440
2025-01-01
Volume 11, Issue 2, e42030
A novel perspective on survival prediction for AML patients: Integration of machine learning in SEER database applications
Zheng-yi Jia (School of Pharmacy, Xinjiang Medical University, Urumqi, 830011, China)
Maierbiya Abulimiti (School of Pharmacy, Xinjiang Medical University, Urumqi, 830011, China)
Yun Wu (Department of General Medicine, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, 830011, China)
Li-na Ma (School of Pharmacy, Xinjiang Medical University, Urumqi, 830011, China)
Xiao-yu Li (School of Pharmacy, Xinjiang Medical University, Urumqi, 830011, China)
Jie Wang (Department of Pharmacy, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, 830011, China; corresponding author)
http://www.sciencedirect.com/science/article/pii/S2405844025004104
Acute myeloid leukemia
Machine learning
SEER database
Epidemiological characteristics
Prognosis prediction
title A novel perspective on survival prediction for AML patients: Integration of machine learning in SEER database applications
title_sort novel perspective on survival prediction for aml patients integration of machine learning in seer database applications
topic Acute myeloid leukemia
Machine learning
SEER database
Epidemiological characteristics
Prognosis prediction
url http://www.sciencedirect.com/science/article/pii/S2405844025004104