Pre-operative T-stage discrimination in gallbladder cancer using machine learning and DeepSeek-R1

Bibliographic Details
Main Authors: Joongwon Chae, Zhenyu Wang, Duanpo Wu, Lian Zhang, Alexander Tuzikov, Magrupov Talat Madiyevich, Min Xu, Dongmei Yu, Peiwu Qin
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-08-01
Series: Frontiers in Oncology
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fonc.2025.1613462/full
author Joongwon Chae
Zhenyu Wang
Duanpo Wu
Lian Zhang
Alexander Tuzikov
Magrupov Talat Madiyevich
Min Xu
Dongmei Yu
Peiwu Qin
author_sort Joongwon Chae
collection DOAJ
description Background: Gallbladder cancer (GBC) frequently exhibits non-specific early symptoms, delaying diagnosis. This study (i) assessed whether routine blood biomarkers can distinguish early T stages via machine learning and (ii) compared the T-stage discrimination performance of a large language model (DeepSeek-R1) when supplied with (a) radiology-report text alone versus (b) radiology-report text plus blood-biomarker values.
Methods: We retrospectively analyzed 232 pathologically confirmed GBC patients treated at Lishui Central Hospital between 2023 and 2024 (T1, n = 51; T2, n = 181). Seven blood variables, namely the neutrophil-to-lymphocyte ratio (NLR), monocyte-to-lymphocyte ratio (MLR), platelet-to-lymphocyte ratio (PLR), carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA19-9), carbohydrate antigen 125 (CA125), and alpha-fetoprotein (AFP), were used to train Random Forest, Support Vector Machine (SVC), XGBoost, and LightGBM models. The Synthetic Minority Over-sampling Technique (SMOTE) was applied only to the training folds in one setting and omitted in another. Model performance was evaluated on an independent test set (N = 47) by the area under the receiver-operating-characteristic curve (AUROC; 95% confidence interval [CI] estimated by 1,000-sample bootstrap); cross-validation (CV) accuracy served as a supplementary metric. DeepSeek-R1 was prompted in a zero-shot, chain-of-thought manner to classify T1 versus T2 using (a) the radiology report alone or (b) the report plus the patient's biomarker profile.
Results: Biomarker-based machine-learning models yielded uniformly poor T-stage discrimination. Without SMOTE, individual models such as XGBoost achieved an AUROC of 0.508 on the independent test set, while recall for the T1 class remained low (e.g., 14.3% for some models), indicating performance near random chance. Applying SMOTE to the training data produced statistically significant gains in CV accuracy for several models (e.g., XGBoost CV accuracy 0.71 → 0.80, p = 0.005; LightGBM CV accuracy [No-SMOTE] → [SMOTE], p = 0.004). However, these improvements did not translate to better discrimination on the independent test set; for instance, XGBoost's AUROC decreased from 0.508 to 0.473 after SMOTE application. Overall, the biomarker models failed to provide clinically meaningful T-stage differentiation. DeepSeek-R1 analyzing radiology text alone reached 89.6% accuracy on the full 232-patient cohort and consistently flagged T2 cases based on phrases such as "gallbladder wall thickening." Supplying biomarker values did not change accuracy (89.6%).
Conclusions: The evaluated blood biomarkers did not independently aid early T-stage discrimination, and SMOTE offered no meaningful performance gain. Conversely, a radiology-text-driven large language model delivered high accuracy with interpretable rationale, highlighting its potential to guide surgical strategy in GBC. Prospective multi-center studies with larger cohorts are warranted to confirm these findings.
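The Methods note that SMOTE was applied only to the training folds, never to the held-out data. The sketch below illustrates that fold-discipline on synthetic data with the same 51/181 class imbalance and seven features; the hand-rolled oversampler (interpolating between minority neighbours) is a stand-in for SMOTE, and LogisticRegression stands in for the paper's tree-based models. All names and values here are illustrative, not the study's code.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def smote_like(X, y, minority=0, n_new=100):
    """SMOTE-style oversampling: each synthetic point is interpolated
    between a random minority sample and its nearest minority neighbour."""
    Xm = X[y == minority]
    new = []
    for _ in range(n_new):
        i = rng.integers(len(Xm))
        d = np.linalg.norm(Xm - Xm[i], axis=1)
        d[i] = np.inf                      # exclude the point itself
        j = d.argmin()
        lam = rng.random()
        new.append(Xm[i] + lam * (Xm[j] - Xm[i]))
    return (np.vstack([X, new]),
            np.concatenate([y, np.full(n_new, minority)]))

# Toy cohort: 232 patients, 7 biomarker-like features, 51 T1 vs 181 T2.
X = rng.normal(size=(232, 7))
y = np.r_[np.zeros(51, int), np.ones(181, int)]

accs = []
for tr, te in StratifiedKFold(5, shuffle=True, random_state=0).split(X, y):
    # Oversample *inside* the training fold only; the test fold is untouched,
    # so CV estimates are not inflated by synthetic minority points.
    Xtr, ytr = smote_like(X[tr], y[tr], minority=0)
    clf = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    accs.append(clf.score(X[te], y[te]))
print(round(float(np.mean(accs)), 3))
```

In practice this fold-wise resampling is what an imblearn `Pipeline` wrapping `SMOTE` and a classifier automates; applying oversampling before the split would leak synthetic copies of test-fold patients into training.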
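The abstract reports test-set AUROC with a 95% CI from a 1,000-sample bootstrap. A minimal sketch of that estimator, using random scores for a hypothetical 47-patient hold-out set (the real labels and model scores are not reproduced here):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical hold-out set of 47 patients with binary T-stage labels.
y_true = rng.integers(0, 2, size=47)
y_true[:5] = 0
y_true[5:10] = 1                  # guarantee both classes are present
y_score = rng.random(47)          # stand-in for model probabilities

# 1,000-sample bootstrap: resample patients with replacement, recompute AUROC.
boots = []
while len(boots) < 1000:
    idx = rng.integers(0, len(y_true), len(y_true))
    if len(np.unique(y_true[idx])) < 2:   # AUROC needs both classes
        continue
    boots.append(roc_auc_score(y_true[idx], y_score[idx]))

lo, hi = np.percentile(boots, [2.5, 97.5])
point = roc_auc_score(y_true, y_score)
print(f"AUROC {point:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

Resamples that draw only one class are discarded and redrawn, a common convention for small test sets like this one.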
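The LLM arm prompted DeepSeek-R1 zero-shot with chain-of-thought, with and without biomarker values appended to the radiology text. A sketch of how such a prompt pair might be assembled; the wording, report text, and biomarker values are all hypothetical, not the authors' actual prompt.

```python
# Hypothetical radiology-report excerpt and biomarker profile.
report = ("Gallbladder wall thickening up to 5 mm; "
          "no evidence of liver invasion.")
biomarkers = {"NLR": 2.1, "CA19-9": 35.0}

def build_prompt(report_text, biomarker_profile=None):
    """Assemble a zero-shot, chain-of-thought prompt for T1-vs-T2
    classification, optionally appending blood-biomarker values."""
    lines = [
        "You are a hepatobiliary imaging assistant.",
        "Task: classify the tumour T stage as T1 or T2 (AJCC).",
        "Think step by step, then answer with exactly 'T1' or 'T2'.",
        "",
        f"Radiology report: {report_text}",
    ]
    if biomarker_profile:
        vals = "; ".join(f"{k} = {v}" for k, v in biomarker_profile.items())
        lines.append(f"Blood biomarkers: {vals}")
    return "\n".join(lines)

prompt_text_only = build_prompt(report)        # condition (a): report alone
prompt_with_labs = build_prompt(report, biomarkers)  # condition (b): report + labs
print(prompt_with_labs)
```

The two conditions differ only in the appended biomarker line, which is what lets the study attribute any accuracy change to the biomarker information itself.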
format Article
id doaj-art-d8b2b774c0b04e9b964a89abee8a95a9
institution DOAJ
issn 2234-943X
language English
publishDate 2025-08-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Oncology
spelling doaj-art-d8b2b774c0b04e9b964a89abee8a95a9 | 2025-08-20T03:15:56Z | eng | Frontiers Media S.A. | Frontiers in Oncology | 2234-943X | 2025-08-01 | vol. 15 | doi: 10.3389/fonc.2025.1613462
Author affiliations:
Joongwon Chae, Zhenyu Wang, Peiwu Qin: Institute of Biopharmaceutical and Health Engineering, Shenzhen International Graduate School, Tsinghua University, Shenzhen, Guangdong, China
Duanpo Wu: School of Communication Engineering and the Artificial Intelligence Institute, Hangzhou Dianzi University, Hangzhou, Zhejiang, China
Lian Zhang: The First Hospital of Hebei Medical University, Shijiazhuang, Hebei, China
Alexander Tuzikov: United Institute of Informatics Problems, National Academy of Sciences of Belarus, Minsk, Belarus
Magrupov Talat Madiyevich: Department of Biomedical Engineering, Tashkent State Technical University, Tashkent, Uzbekistan
Min Xu, Dongmei Yu: Affiliated Fifth Hospital, Wenzhou Medical University, Wenzhou, Zhejiang, China
title Pre-operative T-stage discrimination in gallbladder cancer using machine learning and DeepSeek-R1
topic gallbladder cancer
GBC
machine learning
large language model
DeepSeek-R1
staging
url https://www.frontiersin.org/articles/10.3389/fonc.2025.1613462/full