Effect of Combining Multiple CNV Defining Algorithms on the Reliability of CNV Calls from SNP Genotyping Data

In addition to single-nucleotide polymorphisms (SNP), copy number variation (CNV) is a major component of human genetic diversity. Among many whole-genome analysis platforms, SNP arrays have been commonly used for genomewide CNV discovery. Recently, a number of CNV defining algorithms from SNP genot...

Bibliographic Details
Main Authors: Soon-Young Kim, Ji-Hong Kim, Yeun-Jun Chung
Format: Article
Language: English
Published: BioMed Central 2012-09-01
Series: Genomics & Informatics
Subjects: CNV defining algorithm; DNA copy number variations; SNP array
Online Access: http://genominfo.org/upload/pdf/gni-10-194.pdf
author Soon-Young Kim
Ji-Hong Kim
Yeun-Jun Chung
collection DOAJ
description In addition to single-nucleotide polymorphisms (SNPs), copy number variation (CNV) is a major component of human genetic diversity. Among the many whole-genome analysis platforms, SNP arrays have been commonly used for genomewide CNV discovery. Recently, a number of algorithms for defining CNVs from SNP genotyping data have been developed; however, because SNP genotyping data are fundamentally limited as a measure of signal intensity, concerns remain about false discovery and low sensitivity in CNV detection. In this study, we aimed to verify the effect of combining multiple CNV calling algorithms and to set up the most reliable pipeline for CNV calling with Affymetrix Genomewide SNP 5.0 data. For this purpose, we selected the 3 algorithms most commonly used for CNV segmentation from SNP genotyping data: PennCNV, QuantiSNP, and BirdSuite. After defining the CNV loci with the 3 algorithms, we assessed how many of the loci overlapped with each other, and we also validated the CNVs by genomic quantitative PCR. Based on this analysis, we propose that for a reliable CNV-based genomewide association study using SNP array data, CNV calls must be performed with at least 3 different algorithms and that the CNVs consistently called from more than 2 algorithms must be used for association analysis, because they are more reliable than the CNVs called from a single algorithm. Our results will be helpful for setting up CNV analysis protocols for Affymetrix Genomewide SNP 5.0 genotyping data.
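The consensus step described in the abstract (keep only CNVs supported by several algorithms) can be illustrated with a short Python sketch. This is not the authors' pipeline: the function name `consensus_cnvs`, the tuple layout `(chrom, start, end, kind)`, the half-open coordinates, and the `"gain"`/`"loss"` labels are all hypothetical. The abstract does not state the overlap criterion, so the sketch assumes that any shared base pair of the same CNV type counts as overlap, and it defaults to a support threshold of 2 algorithms, which is one possible reading of "more than 2 algorithms"; pass `min_callers=3` for the stricter reading.

```python
from collections import defaultdict


def consensus_cnvs(calls_by_algorithm, min_callers=2):
    """Return regions covered by calls from at least `min_callers` algorithms.

    calls_by_algorithm: dict mapping an algorithm name to a list of
    (chrom, start, end, kind) tuples for one sample, using half-open
    [start, end) coordinates and kind in {"gain", "loss"} (assumed layout).
    """
    # Group interval boundaries by chromosome and CNV type.
    events = defaultdict(list)  # (chrom, kind) -> [(position, delta, algorithm)]
    for algo, calls in calls_by_algorithm.items():
        for chrom, start, end, kind in calls:
            if end <= start:
                raise ValueError(f"empty or inverted interval from {algo}")
            events[(chrom, kind)].append((start, +1, algo))
            events[(chrom, kind)].append((end, -1, algo))

    consensus = []
    for (chrom, kind), evs in sorted(events.items()):
        # At equal positions, process interval ends (-1) before starts (+1),
        # so abutting half-open intervals are not treated as overlapping.
        evs.sort(key=lambda e: (e[0], e[1]))
        open_calls = defaultdict(int)  # algorithm -> number of open intervals
        region_start = None
        for pos, delta, algo in evs:
            before = sum(1 for n in open_calls.values() if n > 0)
            open_calls[algo] += delta
            after = sum(1 for n in open_calls.values() if n > 0)
            if before < min_callers <= after:
                region_start = pos  # support just reached the threshold
            elif after < min_callers <= before:
                consensus.append((chrom, region_start, pos, kind))
                region_start = None
    return consensus


# Made-up example calls; the coordinates are illustrative, not real data.
example_calls = {
    "PennCNV": [("chr1", 100, 500, "loss")],
    "QuantiSNP": [("chr1", 300, 700, "loss")],
    "BirdSuite": [("chr1", 400, 450, "loss"), ("chr2", 10, 20, "gain")],
}
print(consensus_cnvs(example_calls))  # [('chr1', 300, 500, 'loss')]
```

A sweep over sorted interval boundaries is used here because it counts distinct supporting algorithms rather than raw calls, so two overlapping calls from the same algorithm do not inflate the support. A stricter criterion, such as a minimum reciprocal overlap fraction, would need a different comparison step.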
format Article
id doaj-art-e2117c5dcd604e18a50a40d2de7cbc0a
institution Kabale University
issn 1598-866X
2234-0742
language English
publishDate 2012-09-01
publisher BioMed Central
record_format Article
series Genomics & Informatics
spelling doaj-art-e2117c5dcd604e18a50a40d2de7cbc0a; 2025-02-02T02:21:56Z; eng; BioMed Central; Genomics & Informatics; 1598-866X; 2234-0742; 2012-09-01; vol. 10, no. 3, pp. 194-199; 10.5808/GI.2012.10.3.194; 14; Effect of Combining Multiple CNV Defining Algorithms on the Reliability of CNV Calls from SNP Genotyping Data; Soon-Young Kim, Ji-Hong Kim, Yeun-Jun Chung (all: Integrated Research Center for Genome Polymorphism, The Catholic University of Korea School of Medicine, Seoul 137-701, Korea); http://genominfo.org/upload/pdf/gni-10-194.pdf; CNV defining algorithm; DNA copy number variations; SNP array
title Effect of Combining Multiple CNV Defining Algorithms on the Reliability of CNV Calls from SNP Genotyping Data
topic CNV defining algorithm
DNA copy number variations
SNP array
url http://genominfo.org/upload/pdf/gni-10-194.pdf