    PLoS One. 2009;4(1):e4269. Epub 2009 Jan 27.

    Accurate inference of subtle population structure (and other genetic discontinuities) using principal coordinates.

    Source

    United States Department of Agriculture, Agricultural Research Service, National Center for Genetic Resources Preservation, Fort Collins, Colorado, United States of America.

    Abstract

    BACKGROUND:

    Accurate inference of genetic discontinuities between populations is an essential component of intraspecific biodiversity and evolution studies, as well as associative genetics. The most widely used methods for inferring population structure are model-based Bayesian MCMC procedures that minimize Hardy-Weinberg and linkage disequilibrium within subpopulations. These methods are useful, but they suffer from large computational requirements and depend on modeling assumptions that may not be met in real data sets. Here we describe the development of a new approach, PCO-MC, which couples principal coordinate analysis to a clustering procedure for the inference of population structure from multilocus genotype data.

    METHODOLOGY/PRINCIPAL FINDINGS:

    PCO-MC uses data from all principal coordinate axes simultaneously to calculate a multidimensional "density landscape", from which the number of subpopulations and the membership within each subpopulation are determined using a valley-seeking algorithm. Using extensive simulations, we show that this approach outperforms a Bayesian MCMC procedure when many loci (e.g. 100) are sampled, but that the Bayesian procedure is marginally superior with few loci (e.g. 10). When presented with sufficient data, PCO-MC accurately delineated subpopulations with population F(st) values as low as 0.03 (G'(st)>0.2), whereas the limit of resolution of the Bayesian approach was F(st) = 0.05 (G'(st)>0.35).
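
    The general idea of coupling principal coordinates to density-based clustering can be illustrated with a minimal sketch. This is not the authors' PCO-MC implementation: the function names `pcoa` and `density_clusters`, the `radius` parameter, the Gaussian kernel density, and the synthetic Euclidean data are all assumptions made for illustration, and the simple hill-climbing assignment stands in for the valley-seeking step described above.

```python
import numpy as np

def pcoa(dist):
    """Classical principal coordinate analysis of a square distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep every axis with a non-negligible positive eigenvalue.
    keep = eigvals > 1e-9 * eigvals.max()
    return eigvecs[:, keep] * np.sqrt(eigvals[keep])

def density_clusters(coords, radius):
    """Hypothetical stand-in for valley seeking: climb to local density peaks."""
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))
    # Kernel density estimate over all retained axes at once.
    density = np.exp(-(d / radius) ** 2).sum(axis=1)
    n = len(coords)
    parent = np.arange(n)
    for i in range(n):
        neighbors = np.flatnonzero(d[i] <= radius)
        best = neighbors[np.argmax(density[neighbors])]
        if density[best] > density[i]:
            parent[i] = best  # step uphill; density valleys are never crossed
    labels = np.empty(n, dtype=int)
    for i in range(n):
        j = i
        while parent[j] != j:  # follow uphill pointers to the peak
            j = parent[j]
        labels[i] = j
    # Relabel the peaks as consecutive integers 0..k-1.
    _, labels = np.unique(labels, return_inverse=True)
    return labels

# Synthetic example (assumed data): two well-separated groups in five dimensions.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.5, size=(20, 5))
group_b = rng.normal(10.0, 0.5, size=(20, 5))
points = np.vstack([group_a, group_b])
dist = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2))

coords = pcoa(dist)
labels = density_clusters(coords, radius=5.0)
print("inferred clusters:", len(np.unique(labels)))
```

    In this sketch the number of clusters is not specified in advance; it emerges as the number of density peaks, which mirrors the idea of reading both the number of subpopulations and their membership off the density landscape. With real genotype data the distance matrix would come from a genetic distance measure rather than the Euclidean distances assumed here.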

    CONCLUSIONS/SIGNIFICANCE:

    We draw a distinction between population structure inference for describing biodiversity as opposed to Type I error control in associative genetics. We suggest that discrete assignments, like those produced by PCO-MC, are appropriate for circumscribing units of biodiversity whereas expression of population structure as a continuous variable is more useful for case-control correction in structured association studies.

    PMID: 19172174 [PubMed - indexed for MEDLINE]
    PMCID: PMC2625398
    Free PMC Article
