Exploring the potential of machine learning in gastric cancer: prognostic biomarkers, subtyping, and stratification

Bibliographic Details
Main Authors: Haniyeh Rafiepoor, Mohammad M. Banoei, Alireza Ghorbankhanloo, Ahad Muhammadnejad, Amirhossein Razavirad, Saeed Soleymanjahi, Saeid Amanpour
Format: Article
Language: English
Published: BMC 2025-04-01
Series: BMC Cancer
Online Access: https://doi.org/10.1186/s12885-025-14204-x
collection DOAJ
description Abstract
Background: Advances in the management of gastric cancer (GC) and innovative therapeutic approaches highlight the role of biomarkers in GC prognosis. Machine-learning (ML)-based methods can be applied to identify the most important predictors and unravel their interactions to classify patients, which might guide prioritized treatment decisions.
Methods: A total of 140 patients with histopathologically confirmed GC who underwent surgery between 2011 and 2016 were enrolled in the study. A model based on the statistically inspired modification of partial least squares (SIMPLS) was used to identify the most significant predictors and the interactions between variables. Predictive partition analysis was employed to build a decision tree model that prioritizes markers for clinical use. ML models were also developed to predict TNM stage and different subtypes of GC. Latent class analysis (LCA) and principal component analysis (PCA) were carried out to cluster the GC patients and to identify a subgroup of survivors at high risk of death.
Results: The SIMPLS method predicted the mortality of GC patients with high predictive ability (Q² = 0.45–0.70). The analysis identified MMP-7, P53, Ki67, and vimentin as the top predictors. Correlation analysis revealed different patterns of prognostic markers in the non-survivor and survivor cohorts and in different GC subtypes. The main prediction models were verified via other ML-based analyses, with high area under the curve (AUC) (0.84–0.99), specificity (0.82–0.99), and sensitivity (0.87–0.99). Patients were classified into three clusters of mortality risk, which highlighted the most significant mortality predictors. Partition analysis prioritized P53 ≥ 6, COX-2 > 2, vimentin > 2, and Ki67 ≥ 13 as the most significant predictors of patient mortality (AUC = 0.85–0.90).
Conclusion: The present study highlights the importance of considering multiple variables and their interactions when predicting mortality and stage in GC patients through ML-based techniques. The findings suggest that incorporating molecular biomarkers may improve prognostic prediction compared with relying solely on clinical factors. Furthermore, they demonstrate the potential for personalized medicine in GC treatment by identifying high-risk patients for early intervention and optimizing therapeutic strategies. The partition analysis technique offers a practical tool for identifying cutoffs and prioritizing markers for clinical application. Additionally, equipping Clinical Decision Support systems with predictive tools can assist clinicians and pathologists in identifying aggressive cases, thereby improving patient outcomes while minimizing unnecessary treatments. Overall, this study contributes to the ongoing efforts to improve patient outcomes by advancing our understanding of the intricate nature of GC.
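The marker cutoffs reported from the partition analysis can be written down as a small lookup, sketched below in Python. This is an illustrative sketch only, not the authors' model: the abstract does not say how the published decision tree orders or combines the cutoffs, so the code merely reports which individual cutoffs a set of marker scores meets. The function name `markers_meeting_cutoff`, the dictionary layout, and the example patient values are all hypothetical.

```python
import operator

# Cutoffs quoted in the abstract's partition analysis. How the published
# decision tree orders or combines them is NOT stated there, so this sketch
# only checks each cutoff on its own.
CUTOFFS = {
    "P53": (operator.ge, 6),       # P53 >= 6
    "COX-2": (operator.gt, 2),     # COX-2 > 2
    "vimentin": (operator.gt, 2),  # vimentin > 2
    "Ki67": (operator.ge, 13),     # Ki67 >= 13
}


def markers_meeting_cutoff(scores):
    """Return the markers whose score meets its cutoff, in CUTOFFS order.

    `scores` maps marker name to a numeric score. Markers missing from
    `scores` are skipped rather than treated as below the cutoff.
    """
    return [
        name
        for name, (compare, threshold) in CUTOFFS.items()
        if name in scores and compare(scores[name], threshold)
    ]


# Hypothetical patient; the values are invented for illustration only.
example = {"P53": 6, "COX-2": 2, "vimentin": 3, "Ki67": 12}
print(markers_meeting_cutoff(example))  # ['P53', 'vimentin']
```

Note that the strict and non-strict inequalities differ between markers (for example, a COX-2 score of exactly 2 does not meet its cutoff, while a P53 score of exactly 6 does), which is why each cutoff carries its own comparison operator.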
id doaj-art-cb6ff26573f5465db1dd83932e7008a7
institution OA Journals
issn 1471-2407
citation BMC Cancer, vol. 25, no. 1, pp. 1–15 (2025-04-01)
affiliations Haniyeh Rafiepoor: Cancer Biology Research Center, Cancer Institute, Tehran University of Medical Sciences
Mohammad M. Banoei: Department of Critical Care Medicine, University of Calgary
Alireza Ghorbankhanloo: Cancer Biology Research Center, Cancer Institute, Tehran University of Medical Sciences
Ahad Muhammadnejad: Cancer Biology Research Center, Cancer Institute, Tehran University of Medical Sciences
Amirhossein Razavirad: Cancer Biology Research Center, Cancer Institute, Tehran University of Medical Sciences
Saeed Soleymanjahi: Department of Internal Medicine, Department of Digital Health, Yale School of Medicine
Saeid Amanpour: Cancer Biology Research Center, Cancer Institute, Tehran University of Medical Sciences
topic Gastric cancer
Prediction model
Machine learning
Immunohistochemistry
Mortality