Inference of population splits and mixtures from genome-wide allele frequency data.

Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixture...

Bibliographic Details
Main Authors: Joseph K Pickrell, Jonathan K Pritchard
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-01-01
Series: PLoS Genetics
Online Access: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1002967&type=printable
author Joseph K Pickrell
Jonathan K Pritchard
collection DOAJ
description Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and "ancient" Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com.
format Article
id doaj-art-147fb7a1b9d8418c96e6c382fb9aa652
institution OA Journals
issn 1553-7390
1553-7404
language English
publishDate 2012-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS Genetics
spelling doaj-art-147fb7a1b9d8418c96e6c382fb9aa652; 2025-08-20T02:15:19Z; eng; Public Library of Science (PLoS); PLoS Genetics; 1553-7390; 1553-7404; 2012-01-01; 8; 11; e1002967; 10.1371/journal.pgen.1002967; Inference of population splits and mixtures from genome-wide allele frequency data.; Joseph K Pickrell; Jonathan K Pritchard; https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1002967&type=printable
title Inference of population splits and mixtures from genome-wide allele frequency data.
url https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1002967&type=printable