Identification of optimal biomarkers associated with distant metastasis in breast cancer using Boruta and Lasso machine learning algorithms

Bibliographic Details
Main Authors: Jia-ning Qin, Wen-bin Dai, Wen-hai Zhang, Bin-jie Chen, Ling Liang, Chun-feng Liang, Chun-guo Lu, Qi-xing Tan, Chang-yuan Wei, Yang Tan, Fang Wu
Format: Article
Language: English
Published: BMC 2025-08-01
Series: BMC Cancer
Online Access: https://doi.org/10.1186/s12885-025-14664-1
Description
Summary:
Objective: The aim of this study was to identify optimal biomarkers associated with distant metastasis in patients with breast cancer from among nutritional and inflammatory indicators using the Boruta and Least Absolute Shrinkage and Selection Operator (LASSO) machine learning algorithms, thereby improving the ability to identify distant metastasis.
Methods: A total of 348 patients newly diagnosed with breast cancer were included, comprising 185 patients with nonmetastatic breast cancer and 163 patients with distant metastatic breast cancer. The variables were initially screened using the Boruta algorithm, followed by further optimization through LASSO regression. The selected key indicators were evaluated for their association with distant metastasis risk using multivariate logistic regression analysis and restricted cubic spline functions. Discriminative performance was assessed through ROC curve analysis.
Results: Boruta and LASSO analyses identified five important indicators: the advanced lung cancer inflammation index (ALI), systemic inflammation response index (SIRI), monocyte-to-lymphocyte ratio (MLR), albumin-to-globulin ratio (AGR), and geriatric nutritional risk index (GNRI). Multivariate logistic regression analysis revealed that an elevated SIRI and MLR were associated with an increased risk of distant metastasis in patients with breast cancer, whereas a higher ALI, AGR, and GNRI were associated with a reduced risk. ROC analysis indicated moderate predictive performance for these indicators, with AUC values of approximately 0.65.
Conclusion: The ALI, SIRI, MLR, AGR, and GNRI are effective biomarkers for identifying the risk of distant metastasis in patients with breast cancer. These indicators could be incorporated into clinical practice to improve risk stratification, guide personalized treatment, and enhance patient outcomes.
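The summary names five composite indicators and an AUC-based assessment but does not state how they are computed. The following is a minimal Python sketch using the definitions commonly found in the literature; whether the article uses exactly these formulas, units, or ideal-weight convention is an assumption. The `Labs` record, its field names, and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Labs:
    """Hypothetical per-patient record; field names are illustrative."""
    neutrophils: float        # 10^9 cells/L
    lymphocytes: float        # 10^9 cells/L
    monocytes: float          # 10^9 cells/L
    albumin_g_l: float        # serum albumin, g/L
    total_protein_g_l: float  # serum total protein, g/L
    weight_kg: float
    height_m: float
    ideal_weight_kg: float    # supplied by the caller (assumption)


def compute_indices(p: Labs) -> dict[str, float]:
    """Compute the five indicators from commonly cited definitions.

    Assumed formulas (not confirmed by the record above):
      MLR  = monocytes / lymphocytes
      SIRI = neutrophils * monocytes / lymphocytes
      AGR  = albumin / (total protein - albumin)
      ALI  = BMI * albumin (g/dL) / NLR, with NLR = neutrophils / lymphocytes
      GNRI = 1.489 * albumin (g/L) + 41.7 * (weight / ideal weight),
             with the weight ratio capped at 1
    """
    nlr = p.neutrophils / p.lymphocytes
    bmi = p.weight_kg / p.height_m ** 2
    weight_ratio = min(p.weight_kg / p.ideal_weight_kg, 1.0)
    return {
        "MLR": p.monocytes / p.lymphocytes,
        "SIRI": p.neutrophils * p.monocytes / p.lymphocytes,
        "AGR": p.albumin_g_l / (p.total_protein_g_l - p.albumin_g_l),
        "ALI": bmi * (p.albumin_g_l / 10.0) / nlr,  # g/L converted to g/dL
        "GNRI": 1.489 * p.albumin_g_l + 41.7 * weight_ratio,
    }


def auc(scores_pos: list[float], scores_neg: list[float]) -> float:
    """Area under the ROC curve as the probability that a positive case
    scores higher than a negative case (ties count as one half)."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

With `auc`, a value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. For indicators where higher values accompany lower risk (ALI, AGR, and GNRI in the summary), the score would be negated before calling `auc` so that larger scores point toward the metastatic group.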
ISSN: 1471-2407