Development and validation of survival prediction tools in early and late onset colorectal cancer patients

Bibliographic Details
Main Authors: Wanling Li, Jinshan Liu, Yuntong Lan, Dongling Yu, Bingqiang Zhang
Format: Article
Language: English
Published: Nature Portfolio 2025-04-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-95385-0
Description
Summary: This study aims to develop online calculators that use machine learning models to predict survival probabilities for early- and late-onset colorectal cancer (EOCRC and LOCRC) over a 1- to 8-year period. We extracted data on 117,965 CRC patients from the published database spanning 2010 to 2021 and divided them into training and internal testing datasets. Data from 200 CRC patients at Chongqing Hospital of Jiangsu Province Hospital were used as the external testing dataset. We conducted univariate and multivariate regression analyses on the training dataset to identify key survival factors and develop predictive machine learning models. The models were evaluated on the internal and external testing datasets using AUC, accuracy, precision, recall, and F1 score. Web-based calculators were subsequently developed to predict survival curves for EOCRC and LOCRC patients under different treatment strategies. In the multivariate Cox regression analysis, 16 and 18 variables were independently significant survival factors for EOCRC and LOCRC, respectively. In the EOCRC group, the machine learning models achieved AUC values of 0.880 and 0.804 in the internal and external testing cohorts, respectively. In the LOCRC group, the models achieved AUC values of 0.857 and 0.823 in the internal and external testing cohorts, respectively. The online calculators, powered by the trained machine learning models, are accessible at https://eocrc-surv.streamlit.app/ and https://locrc-surv.streamlit.app/. These tools estimate survival probabilities for EOCRC and LOCRC patients under various treatment strategies and display the corresponding post-treatment survival curves over the 1- to 8-year period. In conclusion, this study developed online calculators that use machine learning algorithms to predict 1- to 8-year survival probabilities for EOCRC and LOCRC patients under various treatment strategies.
ISSN: 2045-2322