Construction of a risk prediction model for postoperative deep vein thrombosis in colorectal cancer patients based on machine learning algorithms

Background: Colorectal cancer is a prevalent malignancy of the digestive system, with an increasing incidence. Lower extremity deep vein thrombosis (DVT) is a frequent postoperative complication, occurring in up to 40% of cases. Objective: This research aims to develop and validate a machine learning mod...

Bibliographic Details
Main Authors: Xin Liu, Xingming Shu, Yejiang Zhou, Yifan Jiang
Format: Article
Language: English
Published: Frontiers Media S.A. 2024-11-01
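Series: Frontiers in Oncology
Subjects: colorectal cancer; venous thrombosis; machine learning; prediction model; postoperative complications
Online Access: https://www.frontiersin.org/articles/10.3389/fonc.2024.1499794/full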
author Xin Liu
Xingming Shu
Yejiang Zhou
Yifan Jiang
collection DOAJ
description Background: Colorectal cancer is a prevalent malignancy of the digestive system, with an increasing incidence. Lower extremity deep vein thrombosis (DVT) is a frequent postoperative complication, occurring in up to 40% of cases.
Objective: This research aims to develop and validate a machine learning (ML) model to predict the risk of lower limb deep vein thrombosis in patients with colorectal cancer, facilitating preventive and therapeutic measures to enhance recovery and ensure safety.
Methods: In this retrospective cohort study, we collected data from 429 colorectal cancer patients from January 2021 to January 2024. The medical records included age, blood test results, body mass index, underlying diseases, clinical staging, histological typing, surgical methods, and postoperative complications. We employed the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance and split the dataset into training and validation sets in a 7:3 ratio. Feature selection was performed using Random Forest (RF), XGBoost, and Least Absolute Shrinkage and Selection Operator (LASSO) algorithms. We then trained six machine learning models: Logistic Regression (LR), Naive Bayes (NB), Gaussian Process (GP), Random Forest, XGBoost, and Multilayer Perceptron (MLP). Model performance was evaluated using the area under the Receiver Operating Characteristic curve (AUC), accuracy, sensitivity, specificity, F1 score, and the confusion matrix. Additionally, SHAP and LIME were used to enhance the interpretability of the results.
Results: The study combined Random Forest, XGBoost, and LASSO regression with univariate regression analysis to identify significant predictive factors, including age, preoperative prealbumin, preoperative albumin, preoperative hemoglobin, operation time, PIKVA2, CEA, and preoperative neutrophil count. The XGBoost model outperformed the other ML algorithms, achieving an AUC of 0.996, an accuracy of 0.9636, a specificity of 0.9778, and an F1 score of 0.9576. Moreover, the SHAP method identified age and preoperative prealbumin as the primary determinants influencing ML model predictions. Finally, the study employed LIME for more precise interpretation of individual predictions.
Conclusion: The machine learning algorithms effectively predicted postoperative lower limb deep vein thrombosis in colorectal cancer patients. The XGBoost model demonstrated strong potential for improving early detection and treatment in clinical settings.
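The evaluation metrics named in the description (AUC, accuracy, sensitivity, specificity, F1 score, confusion matrix) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all hypothetical and are not study data. The AUC is computed here as the fraction of positive/negative pairs ranked correctly, which is one standard way to obtain it.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # DVT cases predicted as DVT
    fp: int  # non-DVT cases predicted as DVT
    tn: int  # non-DVT cases predicted as non-DVT
    fn: int  # DVT cases predicted as non-DVT


def confusion_matrix(y_true, y_pred):
    """Count the four outcome types for binary labels (1 = DVT, 0 = no DVT)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionMatrix(tp, fp, tn, fn)


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(cm):
    """Derive accuracy, sensitivity, specificity, and F1 from the counts."""
    sensitivity = safe_div(cm.tp, cm.tp + cm.fn)  # recall on the DVT class
    specificity = safe_div(cm.tn, cm.tn + cm.fp)
    precision = safe_div(cm.tp, cm.tp + cm.fp)
    return {
        "accuracy": safe_div(cm.tp + cm.tn, cm.tp + cm.fp + cm.tn + cm.fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


def roc_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in sample size, which is
    acceptable for a small illustration.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return safe_div(wins, len(pos) * len(neg))


# Hypothetical toy data: true labels and predicted DVT probabilities.
Y_TRUE = [1, 1, 1, 0, 0, 0, 0, 1]
Y_SCORE = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
THRESHOLD = 0.5  # assumed cut-off; the record does not state one

if __name__ == "__main__":
    y_pred = [1 if s >= THRESHOLD else 0 for s in Y_SCORE]
    cm = confusion_matrix(Y_TRUE, y_pred)
    print(cm)
    print(classification_metrics(cm))
    print("AUC:", roc_auc(Y_TRUE, Y_SCORE))
```

Sensitivity, specificity, and F1 depend on the chosen threshold, whereas AUC is computed from the raw scores and does not, so the two kinds of metric answer different questions about a model.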
format Article
id doaj-art-7b61a877657b4c7cae5fdc6afb3bc825
institution OA Journals
issn 2234-943X
language English
publishDate 2024-11-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Oncology
spelling doaj-art-7b61a877657b4c7cae5fdc6afb3bc825
2025-08-20T02:05:31Z
eng
Frontiers Media S.A.
Frontiers in Oncology
2234-943X
2024-11-01
14
10.3389/fonc.2024.1499794
1499794
Construction of a risk prediction model for postoperative deep vein thrombosis in colorectal cancer patients based on machine learning algorithms
Xin Liu (0)
Xingming Shu (1)
Yejiang Zhou (2)
Yifan Jiang (3)
Department of Clinical Medicine, Southwest Medical University, Luzhou, China
Department of Clinical Medicine, Southwest Medical University, Luzhou, China
Department of Gastrointestinal Surgery, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, China
Department of Gastrointestinal Surgery, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, China
https://www.frontiersin.org/articles/10.3389/fonc.2024.1499794/full
colorectal cancer
venous thrombosis
machine learning
prediction model
postoperative complications
title Construction of a risk prediction model for postoperative deep vein thrombosis in colorectal cancer patients based on machine learning algorithms
topic colorectal cancer
venous thrombosis
machine learning
prediction model
postoperative complications
url https://www.frontiersin.org/articles/10.3389/fonc.2024.1499794/full