Performance Comparison of Two Gene Set Analysis Methods for Genome-wide Association Study Results: GSA-SNP vs i-GSEA4GWAS

Gene set analysis (GSA) is useful in interpreting a genome-wide association study (GWAS) result in terms of biological mechanism. We compared the performance of two different GSA implementations that accept GWAS p-values of single nucleotide polymorphisms (SNPs) or gene-by-gene summaries thereof, GS...

Bibliographic Details
Main Authors: Ji-sun Kwon, Jihye Kim, Dougu Nam, Sangsoo Kim
Format: Article
Language: English
Published: BioMed Central 2012-06-01
Series: Genomics & Informatics
Subjects: gene set analysis; genome-wide association study; GSA-SNP; i-GSEA4GWAS; imputation
Online Access: http://genominfo.org/upload/pdf/gni-10-123.pdf
collection DOAJ
description Gene set analysis (GSA) is useful in interpreting a genome-wide association study (GWAS) result in terms of biological mechanism. We compared the performance of two different GSA implementations that accept GWAS p-values of single nucleotide polymorphisms (SNPs) or gene-by-gene summaries thereof, GSA-SNP and i-GSEA4GWAS, under the same settings of inputs and parameters. GSA runs were made with two sets of p-values from a Korean type 2 diabetes mellitus GWAS: 259,188 and 1,152,947 SNPs of the original and imputed genotype datasets, respectively. When Gene Ontology terms were used as gene sets, i-GSEA4GWAS produced 283 and 1,070 hits for the unimputed and imputed datasets, respectively. In contrast, GSA-SNP reported 94 and 38 hits for the unimputed and imputed datasets, respectively. Similar trends, though to a lesser degree, were observed with Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets. The huge number of hits by i-GSEA4GWAS for the imputed dataset was probably an artifact of the scaling step in the algorithm. The decrease in hits by GSA-SNP for the imputed dataset may be because it relies on Z-statistics, which are sensitive to variations in the background level of associations. Judicious evaluation of the GSA outcomes, perhaps based on multiple programs, is recommended.
format Article
id doaj-art-77b7c8d39cca4866aaf6688aa91e515c
institution Kabale University
issn 1598-866X
2234-0742
spelling doaj-art-77b7c8d39cca4866aaf6688aa91e515c; 2025-02-02T04:34:07Z; eng; BioMed Central; Genomics & Informatics; ISSN 1598-866X, 2234-0742; 2012-06-01; vol. 10, no. 2, pp. 123-127; doi:10.5808/GI.2012.10.2.123
Ji-sun Kwon: Department of Bioinformatics and Life Science, Soongsil University, Seoul 156-743, Korea.
Jihye Kim: Department of Bioinformatics and Life Science, Soongsil University, Seoul 156-743, Korea.
Dougu Nam: School of Nano-Bioscience and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 689-798, Korea.
Sangsoo Kim: Department of Bioinformatics and Life Science, Soongsil University, Seoul 156-743, Korea.
title Performance Comparison of Two Gene Set Analysis Methods for Genome-wide Association Study Results: GSA-SNP vs i-GSEA4GWAS
topic gene set analysis
genome-wide association study
GSA-SNP
i-GSEA4GWAS
imputation
url http://genominfo.org/upload/pdf/gni-10-123.pdf