Predictive model of malignancy probability in pulmonary nodules based on multicenter data

Bibliographic Details
Main Authors: Yuyan Huang, Yong Chen, Fang He, Li Jiang
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-05-01
Series: Frontiers in Oncology
Online Access: https://www.frontiersin.org/articles/10.3389/fonc.2025.1588147/full
Description
Summary:
Objectives: To study the characteristic factors associated with the occurrence of malignant nodules in patients presenting with pulmonary nodules, develop a predictive model, and evaluate its diagnostic performance.
Methods: This study analyzed the clinical and imaging data of 830 patients with pulmonary nodules from the Affiliated Hospital of North Sichuan Medical College. The Least Absolute Shrinkage and Selection Operator (LASSO) and multivariate logistic regression analysis were utilized to identify characteristic predictors. Multiple machine learning classification models were employed for analysis, with the optimal model ultimately selected. A Shapley Additive Explanations (SHAP) framework was developed for personalized risk assessment. Finally, external testing was performed using data from 330 pulmonary nodule patients at Guang’an People’s Hospital.
Results: The predictive factors for malignant pulmonary nodules included: age, gender, nodule diameter, spiculation, lobulation, calcification, vacuole, vascular convergence sign, air bronchogram sign, pleural traction, and density of the nodule. The Gradient Boosting Decision Tree (GBDT) classification model demonstrated optimal performance, with an area under the curve (AUC) of 0.873 (95% confidence interval [CI]: 0.840–0.906) on the internal test set and 0.726 (95% CI: 0.668–0.784) on the external test set. Both the calibration curve and clinical decision curve analysis (DCA) indicated excellent model calibration and substantial clinical benefits.
Conclusions: We developed a GBDT model that provides a basis for differentiating malignant pulmonary nodules, which may assist in the diagnosis and treatment of patients with pulmonary nodules.
ISSN: 2234-943X
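
Illustrative note: the summary reports each AUC with a 95% confidence interval but does not say how the intervals were computed. The sketch below shows one common approach, a percentile bootstrap over the test cases, using only the Python standard library. The function names, the toy labels and scores, and the choice of bootstrap are all assumptions made for illustration; none of them come from the article.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a randomly chosen malignant case (label 1)
    receives a higher score than a randomly chosen benign case (label 0).
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the article)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples containing only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: 1 = malignant, 0 = benign; scores are model outputs.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))  # 0.75
print(bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200))
```

With real data, `labels` would hold the pathology-confirmed outcomes and `scores` the predicted malignancy probabilities for one test set. The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of a few hundred patients such as those described in the summary.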