A novel perspective on survival prediction for AML patients: Integration of machine learning in SEER database applications

Bibliographic Details
Main Authors: Zheng-yi Jia, Maierbiya Abulimiti, Yun Wu, Li-na Ma, Xiao-yu Li, Jie Wang
Format: Article
Language: English
Published: Elsevier 2025-01-01
Series: Heliyon
Online Access: http://www.sciencedirect.com/science/article/pii/S2405844025004104
Description
Summary: Objective: This study explores the epidemiological characteristics of acute myeloid leukemia (AML) and establishes a more accurate machine learning model for predicting the prognosis of AML patients.
Methods: We obtained clinical data on 87,090 AML patients diagnosed between 1975 and 2019 from the SEER database. First, we used Kaplan-Meier analysis to compare prognosis across patient strata. Then, we used univariate and multivariate Cox regression analysis to identify independent factors influencing the overall survival (OS) of AML patients. Finally, we used 11 machine learning algorithms to predict the survival of AML patients at 1, 2, and 3 years. To improve prediction accuracy, we used five-fold cross-validation with 20 cycles to obtain the optimal parameters for each model.
Results: The Kaplan-Meier analysis showed that the survival rate of patients diagnosed after 2010 was significantly higher than that of patients diagnosed earlier. In addition, older age, male gender, and non-Black race were associated with poor prognosis. Among the FAB subtypes, M3 AML had a better prognosis than the other subtypes. Among the WHO subtypes, AML associated with Down syndrome had the best prognosis, followed by AML with eosinophilic abnormalities. The Cox regression analysis showed that gender, age, race, and family income were significantly related to the survival of AML patients. Among the 11 machine learning models, the random forest classifier performed best on multiple evaluation metrics in predicting survival at 1, 2, and 3 years. The XGBoost classifier and the neural network classifier also showed high accuracy and reliability at each prediction stage.
Conclusion: This study provides a deeper understanding of the epidemiological characteristics of AML and establishes a machine learning prediction model that demonstrates good accuracy and reliability in predicting the prognosis of AML patients.
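
Illustrative note: the Kaplan-Meier step named in the Methods can be sketched in a few lines of plain Python. This is not the authors' code and does not use SEER data. The function names `kaplan_meier` and `survival_at`, and the toy follow-up times, are invented here purely for illustration. Follow-up is assumed to be measured in months, with an event flag of 1 for an observed death and 0 for a censored patient.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    durations: follow-up time for each patient (assumed here to be months).
    events:    1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every patient leaves the risk set at their own time

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients at time t both drop out of the risk set.
        at_risk -= exits[t]
    return curve


def survival_at(curve, t):
    """Read the step function: estimated probability of surviving beyond time t."""
    prob = 1.0
    for time, s in curve:
        if time > t:
            break
        prob = s
    return prob


if __name__ == "__main__":
    # Hypothetical toy cohort (not SEER data): months of follow-up and event flags.
    months = [6, 7, 10, 15, 19, 25]
    died = [1, 0, 1, 1, 0, 1]
    curve = kaplan_meier(months, died)
    for horizon in (12, 24, 36):  # the 1-, 2-, and 3-year landmarks
        print(f"{horizon} months: {survival_at(curve, horizon):.3f}")
```

Reading the curve at 12, 24, and 36 months mirrors the 1-, 2-, and 3-year horizons mentioned in the Summary. Comparing strata, as the study does, would mean computing one such curve per group.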
ISSN: 2405-8440