Development and External Validation of Machine Learning-Based Models for Predicting Lung Metastasis in Kidney Cancer: A Large Population-Based Study

Bibliographic Details
Main Authors: Xinglin Yi, Yuhan Zhang, Juan Cai, Yu Hu, Kai Wen, Pan Xie, Na Yin, Xiangdong Zhou, Hu Luo
Format: Article
Language: English
Published: Wiley 2023-01-01
Series: International Journal of Clinical Practice
Online Access: http://dx.doi.org/10.1155/2023/8001899
collection DOAJ
description The accuracy of indices widely used to evaluate lung metastasis (LM) in patients with kidney cancer (KC) is insufficient. We therefore aimed to develop a model that estimates the risk of LM in KC using a large population-based cohort and machine learning algorithms. Demographic and clinicopathologic variables of patients diagnosed with KC between 2004 and 2017 were retrospectively analyzed. Univariate logistic regression was used to identify risk factors for LM in patients with KC. Six machine learning (ML) classifiers were built and tuned using ten-fold cross-validation. External validation was performed using clinicopathologic information from 492 patients from the Southwest Hospital, Chongqing, China. Algorithm performance was assessed using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, precision, recall, F1 score, clinical decision analysis (DCA), and clinical utility curve (CUC). A total of 52,714 eligible patients diagnosed with KC were enrolled, of whom 2,618 developed LM. Age, sex, race, T stage, N stage, tumor size, histology, and grade were identified as important for the prediction of LM. The extreme gradient boosting (XGB) algorithm outperformed the other models in both the internal validation (AUC: 0.913, sensitivity: 0.873, specificity: 0.809, and F1 score: 0.325) and the external validation (AUC: 0.904, sensitivity: 0.750, specificity: 0.878, and F1 score: 0.364). This study established an ML-based predictive model for LM in KC patients that showed high accuracy and practical applicability. A web-based predictor was built using the XGB model to help clinicians make more rational and personalized decisions.
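The reported F1 scores look low next to the AUC values, which is what one would expect when only about 5% of patients develop LM. The sketch below is an illustration, not the authors' code: it derives precision and F1 from sensitivity, specificity, and prevalence. It assumes the internal validation set has the same LM prevalence as the full cohort (2,618 of 52,714), which the abstract does not state, and the function name `precision_and_f1` is hypothetical.

```python
def precision_and_f1(sensitivity, specificity, prevalence):
    """Derive precision and F1 from sensitivity, specificity, and prevalence."""
    # Expected fractions of the whole population that are true / false positives.
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    precision = true_pos / (true_pos + false_pos)
    # F1 is the harmonic mean of precision and recall (recall == sensitivity).
    f1 = 2.0 * precision * sensitivity / (precision + sensitivity)
    return precision, f1


# Cohort-level prevalence from the abstract: 2,618 LM cases among 52,714 patients.
# ASSUMPTION: the internal validation set shares this prevalence.
prevalence = 2618 / 52714

# Internal-validation sensitivity and specificity reported for the XGB model.
precision, f1 = precision_and_f1(0.873, 0.809, prevalence)
print(f"prevalence={prevalence:.3f} precision={precision:.3f} F1={f1:.3f}")
```

Under that assumption, the derived F1 lands close to the reported internal-validation value of 0.325, which suggests the modest F1 reflects class imbalance rather than poor discrimination.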
id doaj-art-ab633ecbf989468e960f7e1ba0ca72b3
institution Kabale University
issn 1742-1241
affiliation Department of Respiratory and Critical Care Medicine (Xinglin Yi, Xiangdong Zhou, Hu Luo)
Department of Renal Dialysis Center (Yuhan Zhang, Juan Cai, Yu Hu, Kai Wen, Pan Xie, Na Yin)