Saliva-derived transcriptomic signature for gastric cancer detection using machine learning and leveraging publicly available datasets

Abstract Saliva, a non-invasive, self-collected liquid biopsy, holds promise for early gastric cancer (GC) screening. This study aims to assess the potential of saliva as a proxy for malignant gastric transformation and its diagnostic value through transcriptomic profiling. Leveraging transcriptomic...

Bibliographic Details
Main Authors: Catarina Lopes, Andreia Brandão, Manuel R. Teixeira, Mário Dinis-Ribeiro, Carina Pereira
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
Subjects: Early screening; Salivaomics; Liquid biopsies; Biomarkers; Bioinformatics
Online Access: https://doi.org/10.1038/s41598-025-96864-0
collection DOAJ
description Abstract Saliva, a non-invasive, self-collected liquid biopsy, holds promise for early gastric cancer (GC) screening. This study aims to assess the potential of saliva as a proxy for malignant gastric transformation and its diagnostic value through transcriptomic profiling. Leveraging transcriptomic data from the Gene Expression Omnibus (GEO), we constructed and validated predictive models through machine learning algorithms within the tidymodels framework. Tissue-based models were validated on independent tissue datasets, and subsequently applied to saliva. Additionally, an independent saliva-derived model was created and evaluated using sensitivity, specificity, accuracy, area under the curve (AUC), and likelihood ratio (LR) metrics. Tissue-derived models demonstrated excellent performance, with AUC values exceeding 0.9, but did not translate effectively to saliva, suggesting distinct molecular landscapes between tissue and saliva in GC. The saliva-specific model using support vector machine (SVM) achieved the highest performance, with an AUC of 0.87 (95% CI 0.72–0.97), a sensitivity of 0.79 (95% CI 0.58–0.95) and a specificity of 0.70 (95% CI 0.40–0.90). While saliva may not mirror tissue gene expression profile, it represents a promising non-invasive predictive tool for the early detection of GC. Further research is warranted to optimize saliva-derived molecular signatures, increasing their sensitivity and specificity for early cancer detection and advance the use of liquid biopsies in personalized medicine for improved screening, diagnostic and prognostic capabilities.
format Article
id doaj-art-247a0de6048449df895a22d5e88eb75a
institution DOAJ
issn 2045-2322
language English
publishDate 2025-05-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-247a0de6048449df895a22d5e88eb75a; 2025-08-20T03:22:07Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-05-01; 15; 1; 1; 12; 10.1038/s41598-025-96864-0
Catarina Lopes: Precancerous Lesions and Early Cancer Management Group, Research Center of IPO Porto (CI-IPOP)/CI-IPOP@RISE (Health Research Group), Portuguese Institute of Oncology of Porto (IPO Porto)/Porto Comprehensive Cancer Center Raquel Seruca (Porto.CCC)
Andreia Brandão: Cancer Genetics Group, Research Center of IPO Porto (CI-IPOP)/CI-IPOP@RISE (Health Research Group), Portuguese Institute of Oncology of Porto (IPO Porto)/Porto Comprehensive Cancer Center Raquel Seruca (Porto.CCC)
Manuel R. Teixeira: ICBAS – School of Medicine and Biomedical Sciences, University of Porto
Mário Dinis-Ribeiro: Precancerous Lesions and Early Cancer Management Group, Research Center of IPO Porto (CI-IPOP)/CI-IPOP@RISE (Health Research Group), Portuguese Institute of Oncology of Porto (IPO Porto)/Porto Comprehensive Cancer Center Raquel Seruca (Porto.CCC)
Carina Pereira: Precancerous Lesions and Early Cancer Management Group, Research Center of IPO Porto (CI-IPOP)/CI-IPOP@RISE (Health Research Group), Portuguese Institute of Oncology of Porto (IPO Porto)/Porto Comprehensive Cancer Center Raquel Seruca (Porto.CCC)
title Saliva-derived transcriptomic signature for gastric cancer detection using machine learning and leveraging publicly available datasets
topic Early screening
Salivaomics
Liquid biopsies
Biomarkers
Bioinformatics
url https://doi.org/10.1038/s41598-025-96864-0
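
The abstract lists likelihood ratios (LR) among the evaluation metrics but gives only sensitivity and specificity for the saliva SVM model. As a minimal sketch, the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity can be applied to those rounded point estimates. The function name `likelihood_ratios` is hypothetical, and the resulting figures are illustrative values derived from the abstract, not the values reported in the article.

```python
import math


def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    # LR+: how much more likely a positive result is in cases than in controls.
    false_positive_rate = 1.0 - specificity
    lr_pos = sensitivity / false_positive_rate if false_positive_rate > 0 else math.inf

    # LR-: how much more likely a negative result is in cases than in controls.
    false_negative_rate = 1.0 - sensitivity
    lr_neg = false_negative_rate / specificity if specificity > 0 else math.inf

    return lr_pos, lr_neg


# Rounded point estimates quoted in the abstract for the saliva SVM model
# (assumption: used here only to illustrate the formulas).
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.79, specificity=0.70)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

With these inputs the sketch gives an LR+ of about 2.63 and an LR- of about 0.30. Because the confidence intervals quoted in the abstract are wide, ratios computed this way should be read as rough indications only.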