Genome-wide association studies are enriched for interacting genes

Bibliographic Details
Main Authors: Peter T. Nguyen, Simon G. Coetzee, Irina Silacheva, Dennis J. Hazelett
Format: Article
Language: English
Published: BMC 2025-01-01
Series: BioData Mining
Online Access: https://doi.org/10.1186/s13040-024-00421-w
collection DOAJ
description Abstract. Background: With recent advances in single-cell technology, high-throughput methods provide unique insight into disease mechanisms and, more importantly, the cell type of origin. Here, we used multi-omics data to understand how genetic variants from genome-wide association studies influence the development of disease. We show in principle how to use genetic algorithms with normal, matching pairs of single-nucleus RNA- and ATAC-seq, genome annotations, and protein-protein interaction data to describe the genes and cell types collectively and their contribution to increased risk. Results: We used genetic algorithms to measure the fitness of gene-cell set proposals against a series of objective functions that capture data and annotations. The most informative objective function captured protein-protein interactions. We observed significantly greater fitness scores and subgraph sizes in foreground sets than in matching sets of control variants. Furthermore, our model reliably identified known targets and ligand-receptor pairs, consistent with prior studies. Conclusions: Our findings suggest that applying genetic algorithms to association studies can generate a coherent cellular model of risk from a set of susceptibility variants. Further, we showed, using breast cancer as an example, that such variants have a greater number of physical interactions than expected by chance.
id doaj-art-b6fa1c6555e24fb2a87ab260ac69f5c6
institution Kabale University
issn 1756-0381
spelling doaj-art-b6fa1c6555e24fb2a87ab260ac69f5c6
2025-01-19T12:12:45Z
eng
BMC
BioData Mining, ISSN 1756-0381
2025-01-01, volume 18, issue 1, pages 1-18
10.1186/s13040-024-00421-w
Genome-wide association studies are enriched for interacting genes
Peter T. Nguyen (The Department of Biomedical and Translational Sciences, Cedars-Sinai Medical Center)
Simon G. Coetzee (The Department of Computational Biomedicine, Cedars-Sinai Medical Center)
Irina Silacheva (The Department of Biomedical and Translational Sciences, Cedars-Sinai Medical Center)
Dennis J. Hazelett (The Department of Computational Biomedicine, Cedars-Sinai Medical Center)
https://doi.org/10.1186/s13040-024-00421-w
title Genome-wide association studies are enriched for interacting genes
topic GWAS
Genetic algorithms
Variant prioritization
Multi-omics
Breast cancer
Complex disease