Design and Statistical Analysis of Pooled Next Generation Sequencing for Rare Variants

Bibliographic Details
Main Authors: Tao Wang, Chang-Yun Lin, Yuanhao Zhang, Ruofeng Wen, Kenny Ye
Format: Article
Language: English
Published: Wiley 2012-01-01
Series: Journal of Probability and Statistics
Online Access: http://dx.doi.org/10.1155/2012/524724
collection DOAJ
description Next generation sequencing (NGS) is a revolutionary technology for biomedical research. One highly cost-efficient application of NGS is to detect disease association based on pooled DNA samples. However, several key issues need to be addressed for pooled NGS. One is the high sequencing error rate and its high variability across genomic positions and experiment runs, which, if not properly accounted for in the experimental design and analysis, could lead to either inflated false positive rates or a loss of statistical power. Another important issue is how to test the association of a group of rare variants. To address the first issue, we proposed a new blocked pooling design in which multiple pools of DNA samples from cases and controls are sequenced together on the same NGS functional units. To address the second issue, we proposed a testing procedure that does not require individual genotypes but instead takes advantage of multiple DNA pools. Through a simulation study, we demonstrated that our approach provides good control of the type I error rate and yields satisfactory power compared to the test based on individual genotypes. Our results also provide guidelines for designing an efficient pooled.
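The blocked pooling idea in the description can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `blocked_pool_layout`, the pool labels, and the rule of putting an equal number of case and control pools in each block are all assumptions made for illustration, with "block" standing in for one NGS functional unit (for example, a lane).

```python
def blocked_pool_layout(case_pools, control_pools, pools_per_block):
    """Group pools into blocks with equal numbers of case and control pools.

    Hypothetical helper for illustration only. Each returned block is a list
    of (pool_label, status) tuples meant to be sequenced together, so that
    run-specific error affects cases and controls in that block alike.
    """
    if pools_per_block <= 0 or pools_per_block % 2 != 0:
        raise ValueError("pools_per_block must be a positive even number")
    if len(case_pools) != len(control_pools):
        raise ValueError("need the same number of case and control pools")

    half = pools_per_block // 2
    if len(case_pools) % half != 0:
        raise ValueError("pools do not divide evenly into blocks")

    blocks = []
    for start in range(0, len(case_pools), half):
        # Take the next `half` case pools and `half` control pools.
        block = [(label, "case") for label in case_pools[start:start + half]]
        block += [(label, "control") for label in control_pools[start:start + half]]
        blocks.append(block)
    return blocks


if __name__ == "__main__":
    layout = blocked_pool_layout(
        ["A1", "A2", "A3", "A4"], ["C1", "C2", "C3", "C4"], pools_per_block=4
    )
    for index, block in enumerate(layout, start=1):
        print(f"block {index}: {block}")
```

Under this assumed layout, comparisons between cases and controls can be made within a block, which is one way a design can guard against error rates that vary from run to run.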
institution Kabale University
issn 1687-952X
1687-9538
author_affiliations Tao Wang: Department of Epidemiology and Population Health, Albert Einstein College of Medicine, New York, NY 10461, USA
Chang-Yun Lin: Department of Applied Mathematics and Institute of Statistics, National Chung Hsing University, Taichung 402, Taiwan
Yuanhao Zhang: Department of Applied Mathematics and Statistics, Stony Brook University, New York, NY 11794, USA
Ruofeng Wen: Department of Applied Mathematics and Statistics, Stony Brook University, New York, NY 11794, USA
Kenny Ye: Department of Epidemiology and Population Health, Albert Einstein College of Medicine, New York, NY 10461, USA