A Markov chain Monte Carlo method for estimating population mixing using Y-chromosome markers: mixing of the Han people in China

Ann Hum Genet. 2007 May;71(Pt 3):407-20. doi: 10.1111/j.1469-1809.2006.00329.x. Epub 2006 Dec 5.

Abstract

We present a new approach for estimating mixing between populations based on non-recombining markers, specifically Y-chromosome microsatellites. A Markov chain Monte Carlo (MCMC) Bayesian statistical approach is used to calculate the posterior probability distribution of population parameters of interest, including the effective population size and the time to most recent common ancestor (MRCA). To test whether two populations are homogeneously mixed we introduce a "mixing" statistic defined for each coalescent event that weights the contribution of that ancestor's descendants to the two subpopulations, and an associated population "purity" statistic. Using simulated data with low levels of migration between two populations, we demonstrate that our method is more sensitive than other commonly used distance-based methods such as R(ST) and D(SW). To illustrate our method, we analysed mixing between 11 pre-defined Chinese ethnic/regional populations, using 5 microsatellite markers from the non-recombining region of the Y-chromosome (NRY), demonstrating a significant clustering of a subset of subpopulations with a high mutual relative degree of mixing (homogeneous mixing with support >0.99). Our analysis suggests that there is a strong correlation between effective population size and mixing with other subpopulations. Thus, despite considerable mixing between these groups, the purity statistic still identifies significant heterogeneity, suggesting that periods of historical isolation continue to leave a recoverable signal despite modern introgression.
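For readers unfamiliar with the distance-based baselines named in the abstract, the following is a minimal sketch of an R(ST)-style statistic for haploid microsatellite data. It uses a simplified variance-ratio form (pooled variance in repeat number against mean within-population variance, summed over loci). It is not the estimator used in the paper, and it is not the MCMC mixing or purity statistic, which the abstract does not define in enough detail to reproduce. The function name `rst_sketch` and the haplotype layout (one tuple of repeat counts per individual) are illustrative assumptions.

```python
from statistics import variance


def rst_sketch(pop_a, pop_b):
    """Simplified R(ST)-style differentiation between two populations.

    pop_a, pop_b: lists of haplotypes, each a tuple of microsatellite
    repeat counts (one entry per locus). Each population needs at least
    two individuals, because the sample variance is undefined otherwise.

    Assumption: both populations are weighted equally, and variance
    components are summed across loci before taking the ratio.
    """
    n_loci = len(pop_a[0])
    within_total = 0.0  # mean within-population variance, summed over loci
    pooled_total = 0.0  # variance of the pooled sample, summed over loci
    for locus in range(n_loci):
        a = [hap[locus] for hap in pop_a]
        b = [hap[locus] for hap in pop_b]
        within_total += (variance(a) + variance(b)) / 2
        pooled_total += variance(a + b)
    if pooled_total == 0:
        # No variation in repeat number at any locus: no signal either way.
        return 0.0
    return (pooled_total - within_total) / pooled_total
```

Values near 1 indicate that almost all variation in repeat number lies between the two populations. Values near or below 0 indicate no detectable differentiation, since this kind of estimator can be slightly negative in small samples.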

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Asian People / genetics
  • Bayes Theorem
  • China
  • Chromosomes, Human, Y / genetics*
  • Databases, Genetic
  • Ethnicity / genetics
  • Humans
  • Male
  • Markov Chains
  • Models, Genetic
  • Monte Carlo Method