SHEIB-AGM: A Novel Stochastic Approach for Detecting High-Order Epistatic Interactions Using Bioinformation With Automatic Gene Matrix in Genome-Wide Association Studies

Detecting epistatic interactions in GWAS (genome-wide association studies) data is of great significance in studying common and complex diseases; however, the ability to detect high-order epistatic interactions in GWAS data is still insufficient. Existing methods are usually used to identify two-ord...

Bibliographic Details
Main Authors: Liyan Sun, Guixia Liu, Rongquan Wang
Format: Article
Language:English
Published: IEEE 2020-01-01
Series:IEEE Access
Subjects: Epistasis; genome-wide association studies; single-nucleotide polymorphism
Online Access:https://ieeexplore.ieee.org/document/8970268/
author Liyan Sun
Guixia Liu
Rongquan Wang
collection DOAJ
description Detecting epistatic interactions in genome-wide association study (GWAS) data is of great significance in studying common and complex diseases; however, the ability to detect high-order epistatic interactions in GWAS data is still insufficient. Existing methods are mostly used to identify second-order interactions, and they cannot detect a large number of interactions. In this article, we propose a novel stochastic approach named SHEIB-AGM (stochastic approach for detecting high-order epistatic interactions using bioinformation with automatic gene matrix). SHEIB-AGM uses bioinformation to construct a gene matrix. In each iteration, it randomly generates a high-order SNP combination based on the gene matrix. It then uses the K2 score (the Bayesian network scoring criterion) and the G-test to detect epistasis in the generated combination, and automatically updates the gene matrix. We compared SHEIB-AGM with six other methods (DECMDR, SNPHarvester, MACOED, AntEpiSeeker, HS-MMGKG and SEE) on simulated data comprising 108 epistatic models and 17,600 files. The results demonstrate that SHEIB-AGM greatly outperforms these methods in terms of F-measure and power. We also used SHEIB-AGM, with and without bioinformation, to analyze a real GWAS dataset from the Wellcome Trust Case Control Consortium. The results indicate that SHEIB-AGM with bioinformation detects 33.94 to 3069.40 times more epistatic interactions than SHEIB-AGM without it. We found numerous genes and gene pairs that may play an important role in seven complex diseases. Some of them are already recorded in the Comparative Toxicogenomics Database (CTD).
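The two scoring criteria named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the log form of the K2 score commonly used for case/control SNP data (lower values suggest stronger association) and the G-test statistic, both computed for one candidate SNP combination. The function names, the 0/1/2 genotype coding, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter

def contingency(genotypes, labels, snps):
    """Count (genotype combination, class) pairs for the chosen SNP columns."""
    counts = Counter()
    for row, y in zip(genotypes, labels):
        combo = tuple(row[s] for s in snps)
        counts[(combo, y)] += 1
    return counts

def neg_log_k2(counts):
    """Negative log of the K2 score for a binary phenotype (lower is better).

    For each observed genotype combination with r0 controls and r1 cases,
    the term is log((r0 + r1 + 1)!) - log(r0!) - log(r1!).
    """
    combos = {combo for combo, _ in counts}
    score = 0.0
    for combo in combos:
        r0 = counts.get((combo, 0), 0)
        r1 = counts.get((combo, 1), 0)
        score += math.lgamma(r0 + r1 + 2) - math.lgamma(r0 + 1) - math.lgamma(r1 + 1)
    return score

def g_statistic(counts):
    """G-test statistic and degrees of freedom for the contingency table."""
    n = sum(counts.values())
    row_totals, col_totals = Counter(), Counter()
    for (combo, y), k in counts.items():
        row_totals[combo] += k
        col_totals[y] += k
    g = 0.0
    for (combo, y), k in counts.items():
        expected = row_totals[combo] * col_totals[y] / n
        g += 2.0 * k * math.log(k / expected)  # empty cells contribute 0
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return g, dof

# Hypothetical toy data: 8 samples x 3 SNPs, genotypes coded 0/1/2,
# labels 0 = control, 1 = case. SNP 0 separates the classes perfectly.
genotypes = [[0, 0, 1], [0, 0, 2], [1, 1, 0], [1, 1, 1],
             [0, 1, 0], [1, 0, 2], [0, 0, 0], [1, 1, 2]]
labels = [0, 0, 1, 1, 0, 1, 0, 1]

informative = contingency(genotypes, labels, snps=(0,))
uninformative = contingency(genotypes, labels, snps=(2,))
print(neg_log_k2(informative), g_statistic(informative))
print(neg_log_k2(uninformative), g_statistic(uninformative))
```

On the toy data the perfectly separating SNP gets the lower negative-log K2 value and the larger G statistic. Passing a longer tuple to `snps` scores a higher-order combination in the same way; turning G into a p-value would additionally require a chi-square survival function, which is left out here to keep the sketch dependency-free.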
format Article
id doaj-art-d34949bf4d7e41c78b99827d2d707b48
institution DOAJ
issn 2169-3536
language English
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-d34949bf4d7e41c78b99827d2d707b48
2025-08-20T02:52:19Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
volume 8, pages 21676-21693
10.1109/ACCESS.2020.2969465
8970268
Liyan Sun (0), https://orcid.org/0000-0003-2145-6341, Department of Computational Intelligence, College of Computer Science and Technology, Jilin University, Changchun, China
Guixia Liu (1), https://orcid.org/0000-0002-2456-1289, Department of Computational Intelligence, College of Computer Science and Technology, Jilin University, Changchun, China
Rongquan Wang (2), https://orcid.org/0000-0002-3375-9561, Department of Computational Intelligence, College of Computer Science and Technology, Jilin University, Changchun, China
https://ieeexplore.ieee.org/document/8970268/
Epistasis
genome-wide association studies
single-nucleotide polymorphism
title SHEIB-AGM: A Novel Stochastic Approach for Detecting High-Order Epistatic Interactions Using Bioinformation With Automatic Gene Matrix in Genome-Wide Association Studies
topic Epistasis
genome-wide association studies
single-nucleotide polymorphism
url https://ieeexplore.ieee.org/document/8970268/