Development and validation of a machine learning-based survival prediction model for Asian glioblastoma patients using the SEER database and Chinese data

Abstract Glioblastoma is an aggressive, malignant primary brain tumour and the most prevalent histological type of glioma. Our study attempted to investigate the independent predictors of overall survival (OS) and cancer-specific survival (CSS) in Asian patients with glioblastoma and establish predi...

Bibliographic Details
Main Authors: Denglin Li, Luxin Zhang, Lifei Xu, Renhe Zhai, Hanyu Gao, Junlan Gao, Minghai Wei, Ningwei Che, Yeting He
Format: Article
Language: English
Published: Nature Portfolio 2025-08-01
Series: Scientific Reports
Subjects: Glioblastoma; Survival; Risk factor; Machine learning; Predictive model
Online Access: https://doi.org/10.1038/s41598-025-15553-0
author Denglin Li
Luxin Zhang
Lifei Xu
Renhe Zhai
Hanyu Gao
Junlan Gao
Minghai Wei
Ningwei Che
Yeting He
collection DOAJ
description Abstract Glioblastoma is an aggressive, malignant primary brain tumour and the most prevalent histological type of glioma. This study investigated the independent predictors of overall survival (OS) and cancer-specific survival (CSS) in Asian patients with glioblastoma and established machine learning-based predictive models for OS and CSS in this population. Data from Asian patients with glioblastoma in the SEER database were retrieved and randomly divided into a training set (n = 845) and a validation set (n = 362); patients from our centre formed the test set (n = 172). Univariate and multivariate Cox regression analyses were performed to evaluate prognostic factors. Predictive models for OS and CSS were established based on eight machine learning algorithms, including Lasso Cox, random survival forest, CoxBoost, generalized boosted regression modelling (GBM), stepwise Cox, survival support vector machine, eXtreme Gradient Boosting, supervised principal component and partial least squares regression for Cox. The selected models were evaluated in the training, validation and test sets by the area under the ROC curve (AUC) with 95% confidence interval (CI), calibration curves and decision curve analyses. Age, tumour history, histologic type, surgery and chemotherapy were confirmed as independent predictors of both OS and CSS (p < 0.05).
The GBM-based model for OS showed strong predictive performance in the training set at 6 months (AUC = 0.837, 95% CI: 0.803–0.870), 12 months (AUC = 0.809, 95% CI: 0.780–0.839) and 24 months (AUC = 0.750, 95% CI: 0.717–0.783). This performance was confirmed in the validation and test sets: calibration curves showed good concordance between predicted and observed OS rates, and decision curve analyses indicated clinical utility. The GBM-based model for CSS also performed best in the training set at 6 months (AUC = 0.808, 95% CI: 0.770–0.847), 12 months (AUC = 0.755, 95% CI: 0.721–0.789) and 24 months (AUC = 0.692, 95% CI: 0.657–0.728), and its predictive performance was likewise confirmed in the validation and test sets, with good calibration and clinical utility. In conclusion, age, tumour history, histologic type, surgery and chemotherapy were independent prognostic factors for both OS and CSS in this retrospective study. The GBM-based models can be used to accurately predict OS and CSS in Asian patients with glioblastoma in clinical practice, which may help tailor personalized treatment regimens for these patients.
format Article
id doaj-art-01d4aab6e71a478ea4d5005affef3f18
institution Kabale University
issn 2045-2322
language English
publishDate 2025-08-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-01d4aab6e71a478ea4d5005affef3f18
Record updated: 2025-08-24T11:29:13Z
Language: eng
Publisher: Nature Portfolio
Journal: Scientific Reports
ISSN: 2045-2322
Published: 2025-08-01
Volume 15, Issue 1, Pages 1-19
DOI: 10.1038/s41598-025-15553-0
Title: Development and validation of a machine learning-based survival prediction model for Asian glioblastoma patients using the SEER database and Chinese data
Denglin Li, Department of Neurosurgery, Second Affiliated Hospital of Dalian Medical University
Luxin Zhang, Department of Urology, Second Affiliated Hospital of Dalian Medical University
Lifei Xu, Department of Neurosurgery, Second Affiliated Hospital of Dalian Medical University
Renhe Zhai, Department of Neurosurgery, Second Affiliated Hospital of Dalian Medical University
Hanyu Gao, Department of Neurosurgery, Second Affiliated Hospital of Dalian Medical University
Junlan Gao, Department of Emergency, Second Affiliated Hospital of Dalian Medical University
Minghai Wei, Department of Neurosurgery, Second Affiliated Hospital of Dalian Medical University
Ningwei Che, Department of Neurosurgery, Second Affiliated Hospital of Dalian Medical University
Yeting He, Department of Neurosurgery, Second Affiliated Hospital of Dalian Medical University
title Development and validation of a machine learning-based survival prediction model for Asian glioblastoma patients using the SEER database and Chinese data
topic Glioblastoma
Survival
Risk factor
Machine learning
Predictive model
url https://doi.org/10.1038/s41598-025-15553-0
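The abstract reports that the SEER cohort was randomly divided into a training set (n = 845) and a validation set (n = 362). The sketch below illustrates one way such a split could be made. It is not the authors' code: the 70/30 ratio is inferred from the reported counts, and the function name split_cohort, the integer patient IDs and the seed are hypothetical.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly partition patient IDs into training and validation lists.

    train_fraction=0.7 is an assumption inferred from the reported
    845/362 split; the seed is arbitrary and only makes the shuffle
    reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 845 + 362 = 1207 SEER patients, represented here by placeholder IDs.
seer_ids = range(1207)
train_ids, validation_ids = split_cohort(seer_ids)
print(len(train_ids), len(validation_ids))
```

With 1207 records and a 0.7 training fraction, the rounded training size matches the 845 patients reported in the abstract, leaving 362 for validation. The test set (n = 172) comes from a separate centre and is not part of this split.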