Performance Comparison of Two Gene Set Analysis Methods for Genome-wide Association Study Results: GSA-SNP vs i-GSEA4GWAS
Gene set analysis (GSA) is useful in interpreting a genome-wide association study (GWAS) result in terms of biological mechanism. We compared the performance of two different GSA implementations that accept GWAS p-values of single nucleotide polymorphisms (SNPs) or gene-by-gene summaries thereof, GS...
Main Authors: | Ji-sun Kwon, Jihye Kim, Dougu Nam, Sangsoo Kim |
---|---|
Format: | Article |
Language: | English |
Published: | BioMed Central 2012-06-01 |
Series: | Genomics & Informatics |
Subjects: | gene set analysis, genome-wide association study, GSA-SNP, i-GSEA4GWAS, imputation |
Online Access: | http://genominfo.org/upload/pdf/gni-10-123.pdf |
_version_ | 1832573360402333696 |
---|---|
author | Ji-sun Kwon, Jihye Kim, Dougu Nam, Sangsoo Kim |
collection | DOAJ |
description | Gene set analysis (GSA) is useful in interpreting a genome-wide association study (GWAS) result in terms of biological mechanism. We compared the performance of two different GSA implementations that accept GWAS p-values of single nucleotide polymorphisms (SNPs) or gene-by-gene summaries thereof, GSA-SNP and i-GSEA4GWAS, under the same settings of inputs and parameters. GSA runs were made with two sets of p-values from a Korean type 2 diabetes mellitus GWAS: 259,188 and 1,152,947 SNPs of the original and imputed genotype datasets, respectively. When Gene Ontology terms were used as gene sets, i-GSEA4GWAS produced 283 and 1,070 hits for the unimputed and imputed datasets, respectively. In contrast, GSA-SNP reported 94 and 38 hits for the unimputed and imputed datasets, respectively. Similar trends, though to a lesser degree, were observed with Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets. The huge number of hits by i-GSEA4GWAS for the imputed dataset was probably an artifact of the scaling step in the algorithm. The decrease in hits by GSA-SNP for the imputed dataset may be because it relies on Z-statistics, which are sensitive to variations in the background level of associations. Judicious evaluation of the GSA outcomes, perhaps based on multiple programs, is recommended. |
format | Article |
id | doaj-art-77b7c8d39cca4866aaf6688aa91e515c |
institution | Kabale University |
issn | 1598-866X 2234-0742 |
language | English |
publishDate | 2012-06-01 |
publisher | BioMed Central |
record_format | Article |
series | Genomics & Informatics |
spelling | doaj-art-77b7c8d39cca4866aaf6688aa91e515c; 2025-02-02T04:34:07Z; eng; BioMed Central; Genomics & Informatics; ISSN 1598-866X, 2234-0742; 2012-06-01; vol. 10, no. 2, pp. 123-127; DOI 10.5808/GI.2012.10.2.123; Ji-sun Kwon (1), Jihye Kim (1), Dougu Nam (2), Sangsoo Kim (1); (1) Department of Bioinformatics and Life Science, Soongsil University, Seoul 156-743, Korea; (2) School of Nano-Bioscience and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 689-798, Korea; http://genominfo.org/upload/pdf/gni-10-123.pdf; keywords: gene set analysis, genome-wide association study, GSA-SNP, i-GSEA4GWAS, imputation |
title | Performance Comparison of Two Gene Set Analysis Methods for Genome-wide Association Study Results: GSA-SNP vs i-GSEA4GWAS |
topic | gene set analysis, genome-wide association study, GSA-SNP, i-GSEA4GWAS, imputation |
url | http://genominfo.org/upload/pdf/gni-10-123.pdf |
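
The description above attributes the drop in GSA-SNP hits to its reliance on Z-statistics, which are sensitive to the background level of associations. The following is a minimal sketch of that effect, not the GSA-SNP implementation: it assumes a generic one-sample Z-statistic (set mean against the mean and standard deviation of all scored genes), and the function name `gene_set_z`, the gene labels, and all scores are hypothetical.

```python
import math
from statistics import mean, pstdev


def gene_set_z(gene_scores, gene_set):
    """Generic Z-statistic of a gene set against all scored genes.

    Assumption: scores are -log10 gene-level p-values; this is an
    illustrative formula, not necessarily the one GSA-SNP uses.
    """
    members = [gene_scores[g] for g in gene_set if g in gene_scores]
    background = list(gene_scores.values())
    mu = mean(background)        # background mean score
    sigma = pstdev(background)   # background standard deviation
    return (mean(members) - mu) / (sigma / math.sqrt(len(members)))


# Hypothetical baseline: 90 weakly associated genes, 10 strongly associated.
baseline = {f"g{i}": 1.0 for i in range(90)}
baseline.update({f"g{i}": 3.0 for i in range(90, 100)})

# A gene set made of five of the strongly associated genes.
gene_set = [f"g{i}" for i in range(90, 95)]

# Same gene set, but 30 more background genes now score as strongly.
shifted = dict(baseline)
shifted.update({f"g{i}": 3.0 for i in range(60, 90)})

z_baseline = gene_set_z(baseline, gene_set)
z_shifted = gene_set_z(shifted, gene_set)

print(f"Z with low background:    {z_baseline:.2f}")
print(f"Z with raised background: {z_shifted:.2f}")
```

The gene set's own scores are identical in both runs; only the background changes, yet the Z-statistic falls. Under these assumptions, a dataset with a higher overall level of association would push fewer gene sets past a fixed significance cutoff.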