Identification of multiomics and immune infiltration-associated biomarkers for early gastric cancer: a machine learning-based diagnostic model development study

Abstract Background Gastric cancer (GC) is a leading cause of cancer-related deaths worldwide, with early diagnosis remaining a significant challenge. Available serum biomarkers lack specificity, making it difficult to accurately identify early non-metastatic GC cases. Reliable diagnostic biomarkers...

Bibliographic Details
Main Authors: Kewei Du, Wenfei Hu, Shan Gao, Jianxin Gan, Chongge You, Shangdi Zhang
Format: Article
Language: English
Published: BMC 2025-05-01
Series: BMC Cancer
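Subjects: Gastric cancer; Proteomics; Machine learning; Immune infiltration; Single-cell RNA sequencing; Diagnostic model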
Online Access: https://doi.org/10.1186/s12885-025-14396-2
_version_ 1850231449611927552
author Kewei Du
Wenfei Hu
Shan Gao
Jianxin Gan
Chongge You
Shangdi Zhang
author_sort Kewei Du
collection DOAJ
description Abstract
Background: Gastric cancer (GC) is a leading cause of cancer-related deaths worldwide, and early diagnosis remains a significant challenge. Available serum biomarkers lack specificity, making it difficult to accurately identify early non-metastatic GC cases. Reliable diagnostic biomarkers that can detect early GC are critical to improving prognosis.
Methods: We employed serum proteomics combined with bioinformatics to identify genes differentially expressed in the serum of non-metastatic GC patients. Single-cell RNA sequencing (scRNA-seq) and immune infiltration analysis were performed to evaluate the relationship between gene expression and immune cell function. We then evaluated 107 machine learning models for biomarker-based early GC diagnosis and developed a nomogram, which was validated for accuracy and clinical utility. We subsequently compared the performance of the potential biomarkers with that of traditional tumor markers in diagnosing early GC. Quantitative reverse transcription polymerase chain reaction (qRT-PCR) and immunohistochemical staining data from the Human Protein Atlas (HPA) database were used to validate the differential expression of candidate genes in GC tissues and adjacent non-cancerous tissues.
Results: The proteomic analysis identified several genes upregulated in the serum of GC patients compared to healthy controls. Single-cell RNA sequencing analysis further revealed that these upregulated genes were associated with altered immune cell infiltration in the tumor microenvironment. The glmBoost + XGBoost model incorporating B2M, CFL1, CTSD, and HSP90AB1 demonstrated strong diagnostic performance (mean AUC = 0.792), and 101 algorithm combinations achieved an average AUC > 0.7. A nomogram integrating gene expression and clinical data was developed and validated through calibration and decision curve analyses, highlighting its potential for early GC diagnosis. Additionally, four genes (TAGLN2, HSP90AB1, SH3BGRL3, and CFL1) were found to be highly expressed in non-metastatic GC tissues and were significantly correlated with immune infiltration, including CD8+ T cells, monocytes, and myeloid-derived suppressor cells. qRT-PCR and immunohistochemical analyses confirmed their elevated expression in GC tissues.
Conclusions: TAGLN2, HSP90AB1, SH3BGRL3, and CFL1 are potential diagnostic biomarkers for early-stage GC and are strongly associated with immune cell infiltration. The glmBoost + XGBoost machine learning model showed strong diagnostic performance (mean AUC = 0.792). These results provide a foundation for future studies to improve early diagnosis and individualized treatment strategies for GC.
format Article
id doaj-art-ebe5a20f702f4d0a8053029cef082c3a
institution OA Journals
issn 1471-2407
language English
publishDate 2025-05-01
publisher BMC
record_format Article
series BMC Cancer
spelling doaj-art-ebe5a20f702f4d0a8053029cef082c3a 2025-08-20T02:03:31Z eng BMC BMC Cancer 1471-2407 2025-05-01 25 1 1-23 10.1186/s12885-025-14396-2
Identification of multiomics and immune infiltration-associated biomarkers for early gastric cancer: a machine learning-based diagnostic model development study
Kewei Du (Laboratory Medicine Center, The Second Hospital & Clinical Medical School, Lanzhou University)
Wenfei Hu (Laboratory Medicine Center, The Second Hospital & Clinical Medical School, Lanzhou University)
Shan Gao (Laboratory Medicine Center, The Second Hospital & Clinical Medical School, Lanzhou University)
Jianxin Gan (Department of General Surgery, The Second Hospital & Clinical Medical School, Lanzhou University)
Chongge You (Laboratory Medicine Center, The Second Hospital & Clinical Medical School, Lanzhou University)
Shangdi Zhang (Laboratory Medicine Center, The Second Hospital & Clinical Medical School, Lanzhou University)
https://doi.org/10.1186/s12885-025-14396-2
Gastric cancer; Proteomics; Machine learning; Immune infiltration; Single-cell RNA sequencing; Diagnostic model
title Identification of multiomics and immune infiltration-associated biomarkers for early gastric cancer: a machine learning-based diagnostic model development study
topic Gastric cancer
Proteomics
Machine learning
Immune infiltration
Single-cell RNA sequencing
Diagnostic model
url https://doi.org/10.1186/s12885-025-14396-2