Variant calling from RNA-Seq data reveals allele-specific differential expression of pathogenic cancer variants

Abstract. Background: Genetic variants play a pivotal role in the initiation and progression of many diseases, including cancer. Detecting these variants is the first step in understanding their contribution to disease mechanisms. RNA sequencing (RNA-Seq) has become a crucial assay in cancer research,...

Bibliographic Details
Main Authors: Audrey Bollas, Jeffrey Gaither, Kathleen M. Schieffer, Peter White, Elaine R. Mardis
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Communications Medicine
Online Access: https://doi.org/10.1038/s43856-025-00901-y
collection DOAJ
description Abstract. Background: Genetic variants play a pivotal role in the initiation and progression of many diseases, including cancer. Detecting these variants is the first step in understanding their contribution to disease mechanisms. RNA sequencing (RNA-Seq) has become a crucial assay in cancer research, offering insights beyond those provided by DNA sequencing. This study introduces VarRNA, a novel method that utilizes RNA-Seq data to classify single nucleotide variants and insertions/deletions from tumor transcriptomes. Methods: VarRNA distinguishes transcriptome variants as germline, somatic, or artifact using a combination of two XGBoost machine learning models. These models were trained and validated using a cohort of pediatric cancer samples with paired tumor and normal DNA exome sequencing data serving as ground truth. We performed additional validation on RNA-Seq data from two distinct cancer datasets, demonstrating that VarRNA outperforms existing RNA variant calling methods. Results: VarRNA identifies 50% of the variants detected by exome sequencing and detects unique RNA variants absent in paired tumor and normal DNA exome data. Some variants classified by VarRNA exhibit variant allele frequencies distinct from the corresponding DNA exome data. Strikingly, this phenomenon is prevalent in cancer-driving genes, where VarRNA analysis of the RNA-Seq data reveals the variant allele expression as much higher than expected based on the exome sequencing data. Conclusions: These findings highlight the potential of RNA-Seq not only to uncover clinically relevant genetic variants but also to offer a deeper understanding of disease-specific expression dynamics that influence cancer pathogenesis, with implications for prognosis and therapeutic strategies.
id doaj-art-f3a3d64e05e641beb15ba9e50e1aa805
institution OA Journals
issn 2730-664X
spelling doaj-art-f3a3d64e05e641beb15ba9e50e1aa805
2025-08-20T02:00:14Z
eng
Nature Portfolio
Communications Medicine
2730-664X
2025-05-01
51112
10.1038/s43856-025-00901-y
Variant calling from RNA-Seq data reveals allele-specific differential expression of pathogenic cancer variants
Audrey Bollas (The Office of Data Sciences, The Abigail Wexner Research Institute, Nationwide Children’s Hospital)
Jeffrey Gaither (The Office of Data Sciences, The Abigail Wexner Research Institute, Nationwide Children’s Hospital)
Kathleen M. Schieffer (The Steve and Cindy Rasmussen Institute for Genomic Medicine, The Abigail Wexner Research Institute, Nationwide Children’s Hospital)
Peter White (The Office of Data Sciences, The Abigail Wexner Research Institute, Nationwide Children’s Hospital)
Elaine R. Mardis (The Steve and Cindy Rasmussen Institute for Genomic Medicine, The Abigail Wexner Research Institute, Nationwide Children’s Hospital)
https://doi.org/10.1038/s43856-025-00901-y