T-cell receptor dynamics in digestive system cancers: a multi-layer machine learning approach for tumor diagnosis and staging

Background: T-cell receptor (TCR) repertoires provide insights into tumor immunology, yet their variations across digestive system cancers are not well understood. Characterizing TCR differences between colorectal cancer (CRC) and gastric cancer (GC), as well as developing machine learning models to d...

Bibliographic Details
Main Authors: Changjin Yuan, Bin Wang, Hong Wang, Fang Wang, Xiangze Li, Ya’nan Zhen
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-04-01
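Series: Frontiers in Immunology
Subjects: T-cell receptor repertoire (TCR); colorectal cancer (CRC); gastric cancer (GC); multi-layer machine learning; diagnostic model
Online Access: https://www.frontiersin.org/articles/10.3389/fimmu.2025.1556165/full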
_version_ 1849702282205069312
author Changjin Yuan
Bin Wang
Hong Wang
Fang Wang
Xiangze Li
Ya’nan Zhen
collection DOAJ
description Background: T-cell receptor (TCR) repertoires provide insights into tumor immunology, yet their variations across digestive system cancers are not well understood. Characterizing TCR differences between colorectal cancer (CRC) and gastric cancer (GC), and developing machine learning models to distinguish cancer types, metastatic status, and disease stages, are crucial for guiding clinical practice.
Methods: A cohort study of 143 tumor patients (96 CRC, 47 GC) was conducted. High-throughput TCR sequencing was performed to capture TCR beta (TRB), delta (TRD), and gamma (TRG) chain data. Tissue-specific patterns in TCR repertoire features, such as V-J gene recombination, complementarity-determining region 3 (CDR3) sequences, and motif distributions, were analyzed. Multi-layer machine learning-based diagnostic models were developed by combining motif-based features with deep learning-based feature extraction using ProteinBERT from the 100 most abundant CDR3 sequences per sample. These models were used to differentiate CRC from GC, distinguish between primary and metastatic CRC lesions, and predict disease stages in CRC.
Results: TCR repertoires differed between CRC and GC, between primary and metastatic CRC lesions, and across CRC disease stages. Distinct V-J gene recombination patterns were identified, with CRC showing enrichment in TRBV*-TRBJ* combinations, while GC exhibited higher levels of γδ T-cell-related recombination. Primary and metastatic lesions of CRC patients displayed distinct V-J recombination preferences (e.g., TRBV7-9/TRBJ2-1 higher in metastatic lesions; TRBV20-1/TRBJ1-2 higher in primary lesions) and CDR3 sequence differences, with metastatic lesions having shorter TRG CDR3 lengths (p-value = 0.019). Across CRC stages, later stages (III–IV) showed higher clonal diversity (p-value < 0.05) and stage-specific V-J patterns, alongside distinct CDR3 amino acid preferences at N-terminal positions (1–2) and central positions (5–12). Multi-dimensional machine learning models achieved high diagnostic performance across all three classification tasks. For distinguishing CRC from GC, the model achieved an accuracy of 97.9% and an area under the curve (AUC) of 0.996. For differentiating primary from metastatic CRC, the model achieved 100% accuracy with an AUC of 1.000. In predicting CRC disease stages, the model attained an accuracy of 96.9% and an AUC of 0.993. Validation on simulated and publicly available datasets supported the robustness and reliability of the models, with consistent performance across datasets and experimental conditions.
Conclusions: Our investigation provides novel insights into TCR repertoire variations in digestive system tumors and highlights the potential of immune repertoire features as diagnostic tools for understanding cancer progression and potentially improving clinical decision-making.
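The abstract does not say how motifs were defined or how clonal diversity was measured, so the following is only a minimal sketch of what such features could look like, not the authors' pipeline. It assumes that motifs are overlapping k-mers (here k = 3) counted over the most abundant CDR3 sequences of a sample, and that diversity is Shannon entropy of the clone frequencies. The function names, the value of k, and the toy repertoire are all hypothetical; the ProteinBERT embedding step is not shown.

```python
from collections import Counter
import math


def top_cdr3(clone_counts, n=100):
    """Return the n most abundant CDR3 sequences (ties broken alphabetically)."""
    ranked = sorted(clone_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [seq for seq, _ in ranked[:n]]


def motif_frequencies(sequences, k=3):
    """Relative frequency of each overlapping k-mer across the sequences.

    Assumption: a 'motif' is a contiguous k-mer; the abstract does not define it.
    """
    counts = Counter(
        seq[i:i + k] for seq in sequences for i in range(len(seq) - k + 1)
    )
    total = sum(counts.values())
    return {motif: c / total for motif, c in counts.items()} if total else {}


def shannon_diversity(clone_counts):
    """Shannon entropy (natural log) of the clone-frequency distribution.

    Assumption: one common diversity index; the abstract does not name the one used.
    """
    total = sum(clone_counts.values())
    return -sum(
        (c / total) * math.log(c / total)
        for c in clone_counts.values()
        if c > 0
    )


# Toy repertoire: CDR3 amino-acid sequence -> read count (made-up values).
sample = {"CASSLGQ": 50, "CASSPGT": 30, "CAWSVGE": 20}
top = top_cdr3(sample, n=2)              # keep the two most abundant clones
features = motif_frequencies(top, k=3)   # k-mer frequency vector for the sample
diversity = shannon_diversity(sample)
```

A per-sample feature vector of this kind could be passed to any standard classifier; which classifier and which layers the study actually used is not stated in this record.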
format Article
id doaj-art-08f11bc0b0264f60bee63f3ee5283e8b
institution DOAJ
issn 1664-3224
language English
publishDate 2025-04-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Immunology
spelling doaj-art-08f11bc0b0264f60bee63f3ee5283e8b
2025-08-20T03:17:43Z
eng
Frontiers Media S.A.
Frontiers in Immunology
1664-3224
2025-04-01
volume 16
doi 10.3389/fimmu.2025.1556165
article 1556165
Changjin Yuan: Clinical Laboratory, Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, China
Bin Wang: Minimally Invasive Surgery, The Third Affiliated Hospital of Shandong First Medical University, Jinan, China
Hong Wang: Department of Gastrointestinal Surgery, Shandong Provincial Third Hospital, Shandong University, Jinan, China
Fang Wang: Department of Gastrointestinal Surgery, The Third Affiliated Hospital of Shandong First Medical University, Jinan, China
Xiangze Li: Department of Gastrointestinal Surgery, Shandong Provincial Third Hospital, Shandong University, Jinan, China
Ya’nan Zhen: Department of Gastrointestinal Surgery, Shandong Provincial Third Hospital, Shandong University, Jinan, China
title T-cell receptor dynamics in digestive system cancers: a multi-layer machine learning approach for tumor diagnosis and staging
topic T-cell receptor repertoire (TCR)
colorectal cancer (CRC)
gastric cancer (GC)
multi-layer machine learning
diagnostic model
url https://www.frontiersin.org/articles/10.3389/fimmu.2025.1556165/full