Predictive model of malignancy probability in pulmonary nodules based on multicenter data
Objectives: To study the characteristic factors associated with the occurrence of malignant nodules in patients presenting with pulmonary nodules, develop a predictive model, and evaluate its diagnostic performance. Methods: This study analyzed the clinical and imaging data of 830 patients with pulmonary...
| Main Authors: | Yuyan Huang, Yong Chen, Fang He, Li Jiang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-05-01 |
| Series: | Frontiers in Oncology |
| Subjects: | pulmonary nodules; malignancy; machine learning; prediction model; external test |
| Online Access: | https://www.frontiersin.org/articles/10.3389/fonc.2025.1588147/full |
| _version_ | 1849762000232513536 |
|---|---|
| author | Yuyan Huang; Yong Chen; Fang He; Li Jiang |
| author_facet | Yuyan Huang; Yong Chen; Fang He; Li Jiang |
| author_sort | Yuyan Huang |
| collection | DOAJ |
| description | Objectives: To study the characteristic factors associated with the occurrence of malignant nodules in patients presenting with pulmonary nodules, develop a predictive model, and evaluate its diagnostic performance. Methods: This study analyzed the clinical and imaging data of 830 patients with pulmonary nodules from the Affiliated Hospital of North Sichuan Medical College. The Least Absolute Shrinkage and Selection Operator (LASSO) and multivariate logistic regression analysis were used to identify characteristic predictors. Multiple machine learning classification models were evaluated, and the best-performing model was selected. A Shapley Additive Explanations (SHAP) framework was developed for personalized risk assessment. Finally, external testing was performed using data from 330 pulmonary nodule patients at Guang’an People’s Hospital. Results: The predictive factors for malignant pulmonary nodules included age, gender, nodule diameter, spiculation, lobulation, calcification, vacuole, vascular convergence sign, air bronchogram sign, pleural traction, and density of the nodule. The Gradient Boosting Decision Tree (GBDT) classification model demonstrated optimal performance, with an area under the curve (AUC) of 0.873 (95% confidence interval [CI]: 0.840–0.906) on the internal test set and 0.726 (95% CI: 0.668–0.784) on the external test set. Both the calibration curve and clinical decision curve analysis (DCA) indicated excellent model calibration and substantial clinical benefits. Conclusions: We developed a GBDT model that provides a basis for differentiating malignant pulmonary nodules, which may assist in the diagnosis and treatment of patients with pulmonary nodules. |
| format | Article |
| id | doaj-art-9e3cbe1a413e4055ade4f68a3f0abf1d |
| institution | DOAJ |
| issn | 2234-943X |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Oncology |
| title | Predictive model of malignancy probability in pulmonary nodules based on multicenter data |
| topic | pulmonary nodules; malignancy; machine learning; prediction model; external test |
| url | https://www.frontiersin.org/articles/10.3389/fonc.2025.1588147/full |
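
The description field reports each AUC together with a 95% confidence interval. As a rough illustration of how such a figure can be obtained, the sketch below computes AUC as the fraction of positive/negative pairs ranked correctly and then estimates a percentile bootstrap interval. The record does not say how the authors computed their intervals, so the bootstrap is an assumption, and the function names (`auc`, `bootstrap_auc_ci`, `make_synthetic`) and the synthetic data are hypothetical, not taken from the article.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs where the positive
    case has the higher score; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement, keeping label and score paired.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample has one class only
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


def make_synthetic(n=200, seed=1):
    """Synthetic labels and risk scores; NOT data from the study."""
    rng = random.Random(seed)
    labels = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    # Malignant cases (label 1) get scores shifted upward on average.
    scores = [rng.gauss(0.6 if y == 1 else 0.4, 0.15) for y in labels]
    return labels, scores


if __name__ == "__main__":
    y, s = make_synthetic()
    lower, upper = bootstrap_auc_ci(y, s)
    print(f"AUC = {auc(y, s):.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

With a real model, `scores` would be the predicted malignancy probabilities on a held-out test set. The interval width depends on the test set size, so a smaller external cohort tends to give a wider interval than a larger one.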