An interpretable machine learning model based on computed tomography radiomics for predicting programmed death ligand 1 expression status in gastric cancer
| Main Authors: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-03-01 |
| Series: | Cancer Imaging |
| Subjects: | |
| Online Access: | https://doi.org/10.1186/s40644-025-00855-3 |
| Summary: | Background: Programmed death ligand 1 (PD-L1) expression status, closely related to immunotherapy outcomes, is a reliable biomarker for screening patients who may benefit from immunotherapy. Here, we developed and validated an interpretable machine learning (ML) model based on contrast-enhanced computed tomography (CECT) radiomics for preoperatively predicting PD-L1 expression status in patients with gastric cancer (GC). Methods: We retrospectively recruited 285 GC patients who underwent CECT and PD-L1 detection from two medical centers. A PD-L1 combined positive score (CPS) of ≥ 5 was considered to indicate a high PD-L1 expression status. Patients from center 1 were divided into training (n = 143) and validation sets (n = 62), and patients from center 2 were considered a test set (n = 80). Radiomics features were extracted from venous-phase CT images. After feature reduction and selection, 11 ML algorithms were employed to develop predictive models, and their performance in predicting PD-L1 expression status was evaluated using areas under receiver operating characteristic curves (AUCs). SHapley Additive exPlanations (SHAP) were used to interpret the optimal model and visualize the decision-making process for a single individual. Results: Nine features significantly associated with PD-L1 expression status were ultimately selected to construct the predictive model. The light gradient-boosting machine (LGBM) model demonstrated the best performance for PD-L1 high expression status prediction in the training, validation, and test sets, with AUCs of 0.841 (95% CI: 0.773, 0.908), 0.834 (95% CI: 0.729, 0.939), and 0.822 (95% CI: 0.718, 0.926), respectively. The SHAP summary and bar plots illustrated that a feature’s value affected the feature’s impact attributed to the model. The SHAP waterfall plots were used to visualize the decision-making process for a single individual. Conclusion: Our CT radiomics–based LGBM model may aid in preoperatively predicting PD-L1 expression status in GC patients, and the SHAP method may improve the interpretability of this model. |
|---|---|
| ISSN: | 1470-7330 |
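
The summary reports each AUC with a 95% confidence interval and defines high PD-L1 expression as CPS ≥ 5. The following is a minimal sketch of how such figures might be computed, not the authors' code. The record does not say how the intervals were derived, so the percentile bootstrap here is an assumption. The function names and the eight-patient toy data are hypothetical and purely illustrative.

```python
import random

CPS_THRESHOLD = 5  # from the summary: CPS >= 5 indicates high PD-L1 expression


def label_from_cps(cps):
    """Return 1 for high PD-L1 expression (CPS >= 5), else 0."""
    return 1 if cps >= CPS_THRESHOLD else 0


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not from the source)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic toy data: hypothetical CPS values and model scores for 8 patients.
cps_values = [0, 1, 2, 4, 5, 8, 10, 20]
model_scores = [0.10, 0.30, 0.20, 0.60, 0.55, 0.70, 0.90, 0.80]

labels = [label_from_cps(c) for c in cps_values]
point = auc(labels, model_scores)
low, high = bootstrap_ci(labels, model_scores)
print(f"AUC = {point:.3f} (95% CI: {low:.3f}, {high:.3f})")
```

With real data, `model_scores` would be the predicted probabilities from the fitted classifier on one of the three patient sets. Intervals from a different method, such as DeLong's, would generally differ somewhat from bootstrap intervals.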