Impact of sampling schemes on demographic inference: an empirical study in two species with different mating systems and demographic histories

G3 (Bethesda). 2012 Jul;2(7):803-14. doi: 10.1534/g3.112.002410. Epub 2012 Jul 1.

Abstract

Most species have at least some level of genetic structure. Recent simulation studies have shown that it is important to consider population structure when sampling individuals to infer past population history. The relevance of the results of these computer simulations for empirical studies, however, remains unclear. In the present study, we use DNA sequence datasets collected from two closely related species with very different histories, the selfing species Capsella rubella and its outcrossing relative C. grandiflora, to assess the impact of different sampling strategies on summary statistics and the inference of historical demography. Sampling strategy did not strongly influence the mean values of Tajima's D in either species, but it had some impact on the variance. The general conclusions about demographic history were comparable across sampling schemes even when resampled data were analyzed with approximate Bayesian computation (ABC). We used simulations to explore the effects of sampling scheme under different demographic models. We conclude that when sequences from modest numbers of loci (<60) are analyzed, the sampling strategy is generally of limited importance. The same is true under intermediate or high levels of gene flow (4Nm > 2-10) in models in which global expansion is combined with either local expansion or hierarchical population structure. Although we observe a less severe effect of sampling than predicted under some earlier simulation models, our results should not be seen as an encouragement to neglect this issue. In general, a good coverage of the natural range, both within and between populations, will be needed to obtain a reliable reconstruction of a species's demographic history, and in fact, the effect of sampling scheme on polymorphism patterns may itself provide important information about demographic history.
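As an illustration of the summary statistic discussed above, the following is a minimal sketch of how Tajima's D could be computed for a set of aligned haplotypes and then compared between two sampling schemes. It is not the authors' pipeline: the function name `tajimas_d`, the helper `subsample`, the scheme labels "local" and "pooled", and the toy alignment are all assumptions made up for this example, and the formula used is the standard Tajima (1989) definition.

```python
from collections import Counter
from math import sqrt


def tajimas_d(haplotypes):
    """Return Tajima's D for equal-length allele sequences, or None if undefined.

    `haplotypes` is a list of strings (or lists), one per sampled chromosome.
    Hypothetical helper for illustration only.
    """
    n = len(haplotypes)
    if n < 4:
        # The variance term of D is zero for n = 2 and n = 3.
        return None
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")

    n_pairs = n * (n - 1) / 2
    seg_sites = 0  # S: number of segregating sites
    pi = 0.0       # mean number of pairwise differences
    for site in zip(*haplotypes):
        counts = Counter(site)
        if len(counts) < 2:
            continue  # monomorphic site
        seg_sites += 1
        # Number of differing pairs at this site, divided by all pairs.
        pi += (n * n - sum(c * c for c in counts.values())) / 2 / n_pairs

    if seg_sites == 0:
        return None

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if variance <= 0:
        return None
    return (pi - seg_sites / a1) / sqrt(variance)


def subsample(haplotypes, indices):
    """Pick the haplotypes at the given indices (one sampling scheme)."""
    return [haplotypes[i] for i in indices]


# Toy alignment, invented for illustration: rows 0-3 stand for one
# population, rows 4-7 for another.
toy = [
    "10010", "10000", "10010", "10100",
    "01001", "01001", "01000", "01001",
]
d_local = tajimas_d(subsample(toy, [0, 1, 2, 3]))   # one population only
d_pooled = tajimas_d(subsample(toy, [0, 1, 4, 5]))  # two per population
```

Comparing `d_local` and `d_pooled` across many loci, rather than at a single toy locus, is the kind of contrast the abstract refers to when it reports effects on the mean and variance of Tajima's D.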

Keywords: Capsella; Tajima’s D; frequency spectrum; population structure.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Capsella / genetics*
  • Computer Simulation
  • Genes, Plant
  • Genetic Loci
  • Genetic Variation
  • Genetics, Population*
  • Models, Genetic
  • Polymorphism, Genetic