Machine learning on multiple epigenetic features reveals H3K27Ac as a driver of gene expression prediction across patients with glioblastoma.

Epigenetic mechanisms play a crucial role in driving transcript expression and shaping the phenotypic plasticity of glioblastoma stem cells (GSCs), contributing to tumor heterogeneity and therapeutic resistance. These mechanisms dynamically regulate the expression of key oncogenic and stemness-assoc...

Bibliographic Details
Main Authors: Yusuke Suita, Hardy Bright, Yuan Pu, Merih Deniz Toruner, Jordan Idehen, Nikos Tapinos, Ritambhara Singh
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-08-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1012272
collection DOAJ
description Epigenetic mechanisms play a crucial role in driving transcript expression and shaping the phenotypic plasticity of glioblastoma stem cells (GSCs), contributing to tumor heterogeneity and therapeutic resistance. These mechanisms dynamically regulate the expression of key oncogenic and stemness-associated genes, enabling GSCs to adapt to environmental cues and evade targeted therapies. Importantly, epigenetic reprogramming allows GSCs to transition between cellular states, including therapy-resistant mesenchymal-like phenotypes, underscoring the need for epigenetic-targeting strategies to disrupt these adaptive processes. Understanding these epigenetic drivers of gene expression provides a foundation for novel therapeutic interventions aimed at eradicating GSCs and improving glioblastoma outcomes. Using machine learning (ML), we perform cross-patient prediction of transcript expression in GSCs by combining epigenetic features from various sources, including ATAC-seq, CTCF ChIP-seq, RNAPII ChIP-seq, H3K27Ac ChIP-seq, and RNA-seq. We investigate different ML and deep learning (DL) models for this task and ultimately build our final pipeline using XGBoost. The model trained on one patient generalizes to the other 11 patients with high performance. Notably, H3K27Ac alone from a single patient is sufficient to predict gene expression in all 11 patients. Furthermore, the distribution of H3K27Ac peaks across the genomes of all patients is remarkably similar. These findings suggest that GSCs share a common distributional pattern of enhancer activity characterized by H3K27Ac, which can be used to predict gene expression in GSCs across patients. In summary, while GSCs are known for their transcriptomic and phenotypic heterogeneity, we propose that they share a common epigenetic pattern of enhancer activation that defines their underlying transcriptomic expression pattern. This pattern can predict gene expression across patient samples, providing valuable insights into the biology of GSCs.
id doaj-art-ea4aa1bcc07545649664e650120f4dc2
institution Kabale University
issn 1553-734X
1553-7358
spelling doaj-art-ea4aa1bcc07545649664e650120f4dc2; 2025-08-23T05:31:15Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X; 1553-7358; 2025-08-01; 21; 8; e1012272; 10.1371/journal.pcbi.1012272; https://doi.org/10.1371/journal.pcbi.1012272