Genome Res. Jun 1999; 9(6): 558–567.
PMCID: PMC310766

A View of Modern Human Origins from Y Chromosome Microsatellite Variation

Abstract

The idea that all modern humans share a recent (within the last 150,000 years) African origin has been proposed and supported on the basis of three observations. Most genetic loci examined to date have (1) shown greater diversity in African populations than in others, (2) placed the first branch between African and all non-African populations in phylogenetic trees, and (3) indicated recent dates for either the molecular coalescence (with the exception of some autosomal and X-chromosomal loci) or for the time of separation between African and non-African populations. We analyze variation at 10 Y chromosome microsatellite loci that were typed in 506 males representing 49 populations and every inhabited continent and find significantly greater Y chromosome diversity in Africa than elsewhere, find the first branch in phylogenetic trees of the continental populations to fall between African and all non-African populations, and date this branching with the (δμ)² distance measure to 5800–17,400 or 12,800–36,800 years BP depending on the mutation rate used. The magnitude of the excess Y chromosome diversity in African populations appears to result from a greater antiquity of African populations rather than a greater long-term effective population size. These observations are most consistent with a recent African origin for all modern humans.

For the last 10 years, human population genetics has focused intently on the question of modern human origins. Most geneticists have considered two opposing hypotheses. Both agree that Homo erectus was the first species in our lineage to leave Africa for Europe and Asia—sometime within the last 2 million years—but the models disagree about what happened next. One (the multiregional or candelabra theory) posits that modern humans evolved simultaneously from the descendants of Homo erectus throughout the Old World—synchronized, perhaps, by some amount of gene flow among archaic populations (Wolpoff et al. 1984). The other suggests that anatomically modern humans evolved in Africa within the last 150,000 years, before supplanting archaic populations in Europe and Asia. This has become known as the “out of Africa” hypothesis and is associated primarily with studies of mitochondrial DNA (mtDNA) and the so-called “African” or “Mitochondrial Eve.” Only recently has this view of a recent African origin gained full acceptance and independent verification from paleontologists and archeologists (Stringer and Andrews 1988; Lahr 1996; Foley 1998).

Three lines of evidence favoring a recent African origin have emerged from studies of most genetic systems. First is the observation of greater genetic diversity in Africa than elsewhere. It is reasonable to assume that older populations have had more time to accumulate genetic variation, although variation within populations is affected by many factors in addition to age—most notably, fluctuations in population size. With the exception of classical protein polymorphisms and restriction fragment length polymorphisms (RFLPs) (Cavalli-Sforza et al. 1994), elevated genetic diversity in African populations has been documented for most other genetic systems: mtDNA (Vigilant et al. 1991), autosomal microsatellites (Bowcock et al. 1994; Jorde et al. 1997), an autosomal minisatellite (Armour et al. 1996), and various other autosomal systems (Batzer et al. 1994; Tishkoff et al. 1996). The tendency for nuclear RFLPs and classical polymorphisms to display greater diversity in Europe has been adequately explained by an ascertainment bias (Mountain and Cavalli-Sforza 1994; Rogers and Jorde 1996); most of these markers were discovered in samples of European origin, ensuring that they would be maximally polymorphic in Europeans. A clear picture of the geographic distribution of Y-chromosomal variation has yet to emerge and be rigorously tested. Jorde et al. (1998) have observed greater Y-chromosomal variation in Asia. Hammer et al. (1997) have observed greater Y haplotypic diversity in Africa for five biallelic polymorphisms and observe greater Asian diversity in some Y chromosome lineages (Hammer et al. 1998). The availability of microsatellites has allowed the first tests of the significance of excess Y-chromosomal variation in African populations.

Phylogenetic analyses provide the second line of evidence for an African origin. Beginning with the study of Cann et al. (1987), nearly every study of human mtDNA has presented a tree whose first branch separates African from non-African populations—just as expected if all non-African populations are descendants of an African one. Similarly, trees constructed from autosomal markers—classical polymorphisms (Cavalli-Sforza et al. 1994), autosomal RFLPs (Bowcock et al. 1991), and autosomal microsatellites (Bowcock et al. 1994)—have been consistent in placing the first split between African and non-African populations and generally agree on the placement of subsequent branches as well.

However, the competing theories of modern human origins both posit an African origin, and estimates of the timing of our descent from Africa comprise the third, most discriminating line of genetic evidence. Dates for the mitochondrial coalescence time center around 150,000 years BP. Two useful estimates have been made for the Y chromosome (Hammer 1995; Whitfield et al. 1995). Both dates are recent, which would appear to exclude a multiregional origin.

mtDNA

The most compelling genetic evidence has come from the study of mtDNA, which has a high mutation rate and does not recombine. Under most reasonable models of population structure, the molecular coalescence of a nonrecombining molecule should antedate the actual diversification of populations (Cann et al. 1987). Multiregional evolution could be excluded if the mitochondrial coalescence occurred within the last 1,000,000 years or so. Although technical problems afflicted the earliest estimates, many subsequent studies have put the human mitochondrial coalescence to within the last 250,000 years (Cann et al. 1987; Vigilant et al. 1991; Ruvolo et al. 1993; Horai et al. 1995; Zischler et al. 1995).

An independent approach to the analysis of mtDNA data has been taken with the work of Rogers and Harpending (1992) and Harpending et al. (1993). Their method seeks to extract demographic information from the distribution of mtDNA mismatches within populations. The data suggest that the major human population groups split from each other ~100,000 years ago but did not begin to expand in size until several tens of thousands of years later. The reliability of these mismatch analyses is still a matter of concern (Marjoram and Donnelly 1994). However, they correspond closely with archeological evidence indicating that the modern human anatomy developed in Africa some tens of thousands of years before the major cultural changes that allowed the phenomenal expansion and spread of modern humans into the rest of the Old World only within the last 70,000 years or so (Klein 1995).

Further doubt has been cast on the theory of multiregional evolution (at least in Europe) by the determination of mtDNA D-loop sequence from a Neanderthal bone (Krings et al. 1997). The Neanderthal sequence is quite different from all known human sequences, suggesting that few (if any) Neanderthal mtDNA lineages will be found in modern European populations. Given the difficulty of analyzing DNA of this antiquity, however, we may never be able to exclude entirely the possibility that some Neanderthal gene lineages survive among modern humans (Nordborg 1998).

Autosomal Evidence

Considerable evidence in favor of a recent African origin has also been found in genes of the nucleus. However, the autosomes are more ambiguous in their support of a recent origin for all modern humans, because recombination complicates coalescent analyses of genetic variation; coalescence times are expected to be four times greater, on average, than for the mitochondrial genome; and variation is less abundant than for mtDNA. At the same time, there are many independent loci on the autosomes, and a composite view from several loci is likely to be more informative than the single loci of mtDNA and the Y chromosome. The earliest studies of nuclear DNA to support a recent African origin were the surveys of RFLPs performed by Luca Cavalli-Sforza and his collaborators (Mountain et al. 1993). Like some studies of classical markers before them, the RFLPs placed the deepest split between African and non-African populations (consistent with an African origin). The RFLP data also allowed the assignment of a crude date to the divergence of African and non-African populations by regressing genetic distances among populations onto archeologically derived dates for the time of the first arrival of modern humans on each continent. The fit of this regression is surprisingly linear and indicates (by extrapolation) a date of 100 kya for the split between Africans and non-Africans (Bowcock et al. 1991; Mountain et al. 1993). The weakness of this approach (as also for dates based on mtDNA) is the need for an outside reference to calibrate the rate of genetic divergence (archeological dates in the case of autosomal RFLPs and estimated human–chimp divergence times for mtDNA). More recent work with autosomal microsatellites has allowed an independent determination of the tree topology and the assignment of dates that rely only on estimates of the microsatellite mutation rate and not on comparisons with external events (Goldstein et al. 1995b). Using this approach, Goldstein et al. (1995b) dated the split between African and non-African populations at 156,000 years BP. Approaches based on genetic distances, though, may be afflicted by admixture among populations that would reduce the genetic distance between them and lead to underestimates of the divergence times.

Another approach has been taken by analyzing the geographic distribution of diversity at a minisatellite locus (Armour et al. 1996). The non-African populations contained a very restricted subset of the allelic diversity seen in Africa. Estimates of the age of the African versus non-African split from this locus, however, were around 15,000 years BP, far more recent than is plausible. Homoplasy (a problem with microsatellite and minisatellite loci) as well as mutation rate heterogeneity (e.g., the potential dependence of mutation rate on allele size) and genetic admixture can probably explain the disagreement between this date and others. Tishkoff et al. (1996) took a similar tack, based on the decay of linkage disequilibrium between two closely linked markers on chromosome 12. Again, much greater haplotypic diversity was found in Africa than outside that continent. By comparing levels of linkage disequilibrium in African and non-African populations, these authors estimated the age of chromosomes outside Africa. Their result was 102,000 years BP with an upper limit of 313 kya, although some details of their approach have been criticized (Pritchard and Feldman 1996; Slatkin and Rannala 1998).

A few coalescent analyses of autosomal DNA sequence variation are now appearing. Harding et al. (1997) have chosen to describe an analysis of 349 geographically diverse β-globin gene sequences as contradicting a recent African origin. Nevertheless, the estimated coalescence date for the β-globin locus of ~800,000 years ago is in excellent agreement with mitochondrial coalescence times, when we consider that the average coalescence time for autosomal loci should be about four times greater. These authors find some evidence for an Asian (as well as African) contribution to modern human allelic diversity, but this assertion requires closer examination. Natural selection in response to malaria has had a major impact in shaping β-globin diversity in Africa and parts of Asia and Europe. Clearly, more sequence data from many more autosomal genes will be required to evaluate the likelihood of admixture between populations of modern and archaic humans as the former spread from Africa.

The analysis of X-chromosomal loci has not greatly clarified the picture provided by nuclear genetic variation. Conflicting conclusions have been drawn from two recent studies (Zietkiewicz et al. 1998; Harris and Hey 1999). Nucleotide polymorphisms in an 8-kb segment surrounding exon 44 of the dystrophin gene appear to be evolving neutrally (Zietkiewicz et al. 1998). As a result, their allele frequencies are equivalent, in most cases, to their age. Alleles that appear >100,000–200,000 years old are generally found at similar frequencies in African and most non-African populations. Younger alleles display much more restricted geographic distributions, strongly supporting the notion that all modern populations descend from an ancestral population that existed as recently as 100,000 years ago. An estimate of the long-term effective population size is ~10,000, consistent with estimates from other loci. Zietkiewicz et al. (1998) suggest that a population with an effective size anywhere near 10,000 would have difficulty evolving independently (multiregionally) into modern Homo sapiens over such wide geographic expanses. Furthermore, the dystrophin gene displays greater African diversity, although much of this diversity appears to be somewhat recent. This relatively young, low-frequency variation might indicate that the African population size has been larger or began to expand earlier than other continental populations.

The geographic patterns of variation in the PDHA1 gene studied by Harris and Hey (1999) are quite unusual and deserve further analysis, although conclusions based on present data appear seriously premature. A fixed nucleotide difference between African and non-African populations was observed, which forms the basis for many of the authors’ conclusions. Because only 35 chromosomes were analyzed, however, most assertions are somewhat tentative. The picture of genetic variation presented by most loci examined to date suggests that modern human populations living outside Africa descend from a limited number of African populations that had already begun to diversify (Armour et al. 1996; Tishkoff et al. 1996). When only 16 African chromosomes have been sampled, it is hard to preclude the possibility that both alleles are present in Africa or other continents like Europe, which is represented by only 6 French X chromosomes. The action of natural selection (perhaps via “hitchhiking”) is detected in sweeping the non-African allele to apparent fixation, although the authors nevertheless attribute the extreme allele distributions to strongly subdivided ancestral populations. Although noting that admixture among populations will cause dates based on genetic distance among populations to underestimate the time of their actual fission, the authors fail to note that their estimates of coalescence times (which have very broad confidence intervals) must overestimate the time of population fission by an unknown amount. The events of interest are likely to fall within the dates estimated by coalescence times of alleles and genetic distances among populations. Because a fixed nucleotide difference among continents could not be maintained if admixture among the continents were high, the results of Harris and Hey (1999) might suggest that the time of population fission is more reliably estimated by genetic distance measures than by coalescence times.

The Y Chromosome

As the paternally transmitted counterpart to mtDNA, the Y chromosome has attracted great interest. It has provided a unique challenge as well. Unlike mtDNA, its mutation rate is very low, and variation has been exceedingly hard to find. Conventional searches for RFLPs encountered little success (Casanova et al. 1985; Lucotte and Ngo 1985). The first single nucleotide polymorphism to be identified on the Y chromosome was described only in 1994 (Seielstad et al. 1994). One study found no variation in a 729-bp intron of the ZFY gene in 38 human samples from throughout the world (Dorit et al. 1995). Other work has allowed estimates of the Y chromosome coalescence time to be made. Hammer’s (1995) study, based on three polymorphic sites assayed in 18 individuals, indicates a coalescence time of 188,000 BP with a confidence interval stretching from 51 kya to 411 kya. Whitfield et al. (1995) assayed a different set of three polymorphisms in five individuals and calculated a coalescence time of between 37,000 years and 49,000 years BP. As indicated by the large confidence interval and the discordance between the two Y chromosome studies, the number of polymorphic sites and individuals needs to be increased substantially. Since these studies were published, more efficient techniques for detecting polymorphisms have been developed (Underhill et al. 1997), and more definitive estimates of the Y chromosome coalescence time derived from nucleotide sequence information will soon be available.

In this paper we examine the ability of Y chromosome microsatellites to discriminate between the predictions of the two rival theories of modern human origins. The statistical significance of the observed excess of Y chromosome microsatellite diversity in African populations is tested following the approach of Jorde et al. (1997). Significantly greater diversity in African populations is observed. The magnitude of this excess is greater than reported for autosomal microsatellites using many of the same population samples (Jorde et al. 1997). As explained below, this is more likely to result from an early population expansion among African populations than from substantial differences in Ne among populations. A phylogenetic analysis based on the (δμ)² distance (among others) also supports an African origin by placing the first split between African and all non-African populations. However, dating the split between African and non-African populations, or estimating the length of time over which the observed levels of microsatellite diversity have been accumulating, is a more difficult task. (δμ)² is a linear distance measure, and knowledge of the microsatellite mutation rate allows the date of any split in the tree to be estimated (Goldstein et al. 1995b). Applying this method to the Y chromosome microsatellites results in very recent dates for the split between African and non-African populations. On the whole, Y chromosome microsatellites are consistent with a recent African origin for modern humans and tend to agree with the results of studies of mtDNA and the autosomes.

RESULTS

Table 2, A, B, and C (below) reports the average gene diversities and the results of the test for excess genetic diversity (described in Methods). The tests were performed on three different continental groupings. Initially, six regional groups were identified: Africa, Asia, Pakistan, Europe, Oceania, and America (Table 1). This scheme resulted in some groups with very few chromosomes and might have been biased toward dividing non-African populations into inappropriate subpopulations (e.g., considering Pakistan separately from Europe or Asia). For this reason, the test was repeated on an alternative grouping that was designed to minimize the differences in sample size, while increasing the variance of non-African populations in a way that made some geographic and historical sense. Ethiopians, who have experienced admixture with populations in Western Asia, were split off from sub-Saharan Africans along with the Beja from Sudan and Tuareg from Mali. These populations were grouped instead with Europeans and Pakistanis. A third group encompassing populations from Asia, the Americas, and Oceania was also formed. The results for this grouping are reported in Table 2B. The final arrangement (Table 2C) was designed to be extremely conservative and identified only two groups: sub-Saharan Africans versus all other populations (including Ethiopians). Raw data are available at http://www.stats.ox.ac.uk/~pritch/ydata.html.

Table 1
List of Populations for Which Y Chromosome Genotypes Were Determined at 10 Loci
Table 2
Y-Chromosomal Variation in Six Groups Based on Continent of Origin

Standard errors surrounding the estimates of gene diversity are fairly large [calculated according to Nei (1987)], reflecting the small number of loci currently available. The results of the first grouping reported in Table 2A indicate higher gene diversity (H) in Africans than in the other continents, although this difference is not significant given the large sampling variance. The observed S was 1.76, suggesting an excess of genetic diversity in Africa of 76% relative to the other continents (see Methods). As shown in Table 2B, sub-Saharan African populations continued to exhibit greater variance in repeat score in a more conservative classification scheme. Not unexpectedly, the African excess was lower (39%). The largest value for S in >100,000 random replicates was 1.24, indicating P < 10⁻⁵. Finally, Africans continue to show a significant excess of diversity in even the most conservative classification reported in Table 2C. A similar excess in gene diversity is not observed in either of the latter two classifications, although none of these differences is significant.
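The logic of a randomization test of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: it takes S to be the ratio of the variance in repeat number in the African group to that in the remaining chromosomes at a single locus, and it generates replicates by shuffling group labels. The actual statistic and procedure are those described in Methods and may differ. The repeat counts are synthetic toy values, not data from this study, and all function names are hypothetical.

```python
import random
from statistics import pvariance


def variance_ratio(african, others):
    """Assumed form of S: variance in repeat number, African / non-African."""
    return pvariance(african) / pvariance(others)


def permutation_p_value(african, others, replicates=10_000, seed=0):
    """Shuffle group labels and count replicates with S >= the observed S."""
    rng = random.Random(seed)
    observed = variance_ratio(african, others)
    pooled = list(african) + list(others)
    k = len(african)
    hits = 0
    for _ in range(replicates):
        rng.shuffle(pooled)
        if variance_ratio(pooled[:k], pooled[k:]) >= observed:
            hits += 1
    # Adding 1 to numerator and denominator keeps the estimate above zero;
    # with no hits in N replicates it is just under 1/N.
    return observed, (hits + 1) / (replicates + 1)


# Synthetic repeat counts for one locus (toy values for illustration only).
toy_african = [10, 12, 14, 16, 18, 11, 17, 13]
toy_others = [13, 14, 14, 15, 13, 14, 15, 14, 13, 14, 15, 14]

observed_s, p_value = permutation_p_value(toy_african, toy_others, replicates=2000)
print(f"S = {observed_s:.2f}, permutation P = {p_value:.4f}")
```

When no replicate reaches the observed value, the P value is bounded by roughly the reciprocal of the number of replicates, which is how more than 100,000 replicates with a maximum S of 1.24 lead to P < 10⁻⁵.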

The UPGMA tree depicted in Figure 1 places the root between African and non-African populations, like a great number of trees before it. Although it is frequently criticized, the reliability of average linkage in reconstructing evolutionary relationships is well established (Cavalli-Sforza et al. 1994), particularly when evolutionary rates are approximately constant among populations. One reason for its good performance may relate to its calculation of average distances between each taxon added to the tree and those that have already been added, an approach that may minimize the errors associated with genetic distances between particular pairs of populations.

Figure 1
Average linkage tree constructed from 10 Y chromosome microsatellite genes using the (δμ)² distance measure of Goldstein et al. (1995b).

The matrix of (δμ)² distances and standard errors estimated from 10,000 bootstrap replicates are reproduced in Table 3. The average distance of all non-African populations to the African population is 1.029 with a 95% confidence interval of 0.515–1.546. Applying equation 1 (in Methods) and a microsatellite mutation rate of 1.2 × 10⁻³ mutations per generation (Heyer et al. 1997; Bianchi et al. 1998) yields an estimate of 429 generations since the split of all non-African populations from the African population. If the human generation time is 27 years (Weiss 1973), this corresponds to a date of 11,600 years BP. The 95% confidence interval is 5800–17,400 BP. If a lower mutation rate of 5.6 × 10⁻⁴ is used as suggested by Weber and Wong (1993) (and still within the confidence interval of Heyer et al.’s estimate), a date of 24,800 ± 12,000 is derived.
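The conversion from distance to date can be reproduced with simple arithmetic. The sketch below assumes that equation 1 has the standard linear form for this distance, E[(δμ)²] = 2μt with t in generations (Goldstein et al. 1995b). The function and variable names are illustrative and are not taken from the paper.

```python
def split_time_generations(distance, mutation_rate):
    """Generations since the split, assuming E[(delta-mu)^2] = 2 * mu * t."""
    return distance / (2.0 * mutation_rate)


def split_time_years(distance, mutation_rate, generation_years=27):
    """Convert a (delta-mu)^2 distance into a divergence date in years."""
    return split_time_generations(distance, mutation_rate) * generation_years


MEAN_DISTANCE = 1.029            # average African vs. non-African distance
CI_LOW, CI_HIGH = 0.515, 1.546   # 95% confidence interval on the distance
RATE_HIGH = 1.2e-3               # Heyer et al. 1997; Bianchi et al. 1998
RATE_LOW = 5.6e-4                # Weber and Wong 1993

for label, rate in (("higher rate", RATE_HIGH), ("lower rate", RATE_LOW)):
    point = split_time_years(MEAN_DISTANCE, rate)
    low = split_time_years(CI_LOW, rate)
    high = split_time_years(CI_HIGH, rate)
    print(f"{label}: {point:,.0f} years BP (from CI on distance: {low:,.0f} to {high:,.0f})")
```

Because the date is inversely proportional to the mutation rate, roughly halving the rate roughly doubles every date, which is why the choice of rate dominates the uncertainty in these estimates.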

Table 3
(δμ)² Distance Matrix for Six Continental Populations

DISCUSSION

There are several possible reasons for an excess of genetic diversity in African populations: African populations are older and have been accumulating genetic variation for a longer period of time, African populations have maintained a higher long-term effective population size, gene flow into Africa has been higher than into other continents, and population subdivision has been greater outside Africa. These last two possibilities are easily discounted. If gene flow into Africa from the other continents was sufficient to elevate African genetic diversity, then the smallest genetic distances should be found among African and non-African populations. We observe the opposite phenomenon, with the greatest genetic distances occurring between African and all non-African populations. We would also anticipate a very short branch leading to the African population in a neighbor-joining tree, which is not the case (Fig. 2). Extensive population subdivision outside Africa can be excluded by noting that, although genetic diversity within any subpopulation may be reduced, the genetic variation across the entire collection of subdivided populations should be very high. The data of Table 2C grouping 177 African chromosomes against 329 chromosomes from all the other continents combined demonstrate a significantly higher variance in repeat number for African populations than for all others. Non-African populations appear as if they are but a sample of African genetic diversity. Even if greater genetic variation were eliminated in non-African populations through a demographic crisis, one would not expect to find similar variants eliminated from populations as widely separated as Basques in the Pyrenees and Quechua in the Andes. Natural selection could remove particular variants from populations, but why would it leave Africa alone unaffected?

Figure 2
Neighbor-joining tree constructed from the same (δμ)² distance matrix used to construct the tree in Fig. 1.

It is more difficult to distinguish between an older African population and a larger long-term effective population size. Doing so depends on whether the microsatellites have reached mutation-drift equilibrium. At mutation-drift equilibrium, the variation within a population is proportional only to n (the effective population size) and μ (the mutation rate, which is probably the same in all populations), so differences in levels of variation should depend only on the relative effective population sizes. Before mutation-drift equilibrium is reached, however, a population’s age will influence the amount of genetic variation it displays. Variation will accumulate in direct proportion to time and the mutation rate.

Equilibrium cannot be rejected with a small number of loci. The variance of the variance of repeat number among loci is high, requiring huge numbers of loci (or loci with low variances) to reject mutation-drift equilibrium (Goldstein et al. 1996). However, even if we cannot formally reject mutation-drift equilibrium with the limited data we have today, we should note that the attainment of equilibrium is very slow—roughly equal to the reciprocal of the mutation rate in generations. For the estimated mutation rates we have used, this would range from 22,500 years to 48,000 years, although this should be regarded as a minimum estimate because the time necessary to reach equilibrium would be lengthened by population subdivision and fluctuations in population size. Most non-African populations appear unlikely to be much older than 48,000 years, so we might be justified in assuming that differences in the age of populations have not been completely erased by the attainment of equilibrium.
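The range quoted above follows directly from the two mutation rates used in Results. As a minimal sketch, taking the approach to equilibrium to require about 1/μ generations and using the 27-year generation time from Results (the function name is illustrative):

```python
GENERATION_YEARS = 27  # generation time used in Results (Weiss 1973)


def years_to_equilibrium(mutation_rate, generation_years=GENERATION_YEARS):
    """Approximate time to mutation-drift equilibrium: 1/mu generations."""
    return generation_years / mutation_rate


for rate in (1.2e-3, 5.6e-4):
    print(f"mu = {rate:g}: about {years_to_equilibrium(rate):,.0f} years")
```

The lower mutation rate gives the longer time, so the slower the loci mutate, the longer differences in population age remain visible in the data.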

Two observations might lead us to conclude that the fact of greater genetic diversity in Africa is not purely the result of a larger long-term effective population size. The smaller a population, the greater the effect of genetic drift in changing gene frequencies. Drift would tend to differentiate populations, increasing genetic distances between them and lengthening branches in trees relating the populations. This is not what we observe (Figs. 1 and 2). If non-African populations were smaller and drifting to a greater extent, they would appear more dissimilar from each other rather than displaying the similarity that is a feature of most trees, most notably of trees constructed from mitochondrial DNA, which should experience increased drift as a result of their decreased effective population size relative to the autosomes.

The second argument, suggested by Jorde et al. (1997), receives further support from the study of Y chromosome microsatellites. Citing Li (1977) and Slatkin (1995), Jorde et al. (1997) note that diversity for an expanding population (and analyses of mitochondrial and Y-chromosomal DNA indicate that most major populations have been expanding) is proportional to n + 2t for mtDNA and the Y chromosome, whereas it is proportional to 2n + t for autosomal loci with a fourfold greater effective population size. t is the time following the population expansion. The relative excess of African diversity appears to be much greater for mtDNA and the Y chromosome than for autosomal microsatellites. From the equations above, this observation suggests that African populations began to expand before the others and suggests that African populations have not been larger than non-African ones. This is exactly the model of modern human origins for which Klein (1995) finds the strongest support. The modern human morphology appears in Southern and Eastern Africa ~150,000 years ago, whereas evidence for the behavioral transition to modern humans is not found until ~50,000 years ago when modern humans appear to have begun leaving Africa. Thus, African populations may have been expanding and accumulating genetic variation for tens of thousands of years before a subset of that continent’s population began to expand and colonize the rest of the world.
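A small numerical illustration may help show why the two proportionalities point toward an earlier African expansion. The sketch below uses the expressions exactly as quoted above (n + 2t for mtDNA and the Y chromosome, 2n + t for autosomal loci) and compares the African excess under two hypothetical scenarios. The values of n and t are arbitrary illustrative units, not estimates from any data set.

```python
def haploid_diversity(n, t):
    """Diversity proportional to n + 2t (mtDNA and Y chromosome), as quoted."""
    return n + 2 * t


def autosomal_diversity(n, t):
    """Diversity proportional to 2n + t (autosomal loci), as quoted."""
    return 2 * n + t


def african_excess(diversity, n_afr, t_afr, n_non, t_non):
    """Ratio of African to non-African diversity for one class of locus."""
    return diversity(n_afr, t_afr) / diversity(n_non, t_non)


# Scenario A (hypothetical): equal sizes, African expansion began earlier.
older = dict(n_afr=10, t_afr=10, n_non=10, t_non=5)
# Scenario B (hypothetical): equal expansion times, larger African size.
larger = dict(n_afr=20, t_afr=5, n_non=10, t_non=5)

for name, scenario in (("earlier expansion", older), ("larger size", larger)):
    hap = african_excess(haploid_diversity, **scenario)
    auto = african_excess(autosomal_diversity, **scenario)
    print(f"{name}: haploid excess {hap:.2f}, autosomal excess {auto:.2f}")
```

With these toy values, an earlier expansion inflates the excess more for the haploid systems than for the autosomes, whereas a larger size does the opposite. The observed pattern, a greater excess for mtDNA and the Y chromosome, therefore matches the first scenario.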

This view is supported by the application of (δμ)² to date the separation time between African and non-African populations. The dates calculated using this method would certainly seem to preclude multiregional evolution or an ancient ancestry for modern humans. However, the dates are too recent to be trusted entirely. Recent archeological results suggest that modern humans may not have attained behavioral modernity and left Africa until as recently as 50,000 years ago (Klein 1995), but the dates derived from Y chromosome microsatellites do not go beyond 40,000 years.

It is possible to imagine several explanations for the extreme recency of these dates. The small number of loci available (n = 10) and the fact that the Y chromosome itself is effectively a single locus may produce large stochastic errors around the estimate. Admixture among populations will reduce the genetic distance between them. Some such admixture has undoubtedly occurred, but it is probably not of sufficient magnitude to produce the entire effect. The application of (δμ)² to 30 autosomal microsatellites resulted in dates of 155,000 years, which is in very good agreement with other data and does not indicate the effect of significant admixture (Goldstein et al. 1995b).

(δμ)² is linear and proportional to time regardless of population size but only if the populations are at mutation-drift equilibrium. Because some of the populations used in the calculations may not have reached equilibrium (as noted in the previous discussion), this too may affect the accuracy of estimated dates. Homoplasy is another limitation with loci that have a high mutation rate, and it may affect the linearity of (δμ)². An average of six repeat units separated the extreme alleles for single-copy loci. This suggests (δμ)² will remain linear up to a maximum distance of 5.8 (Goldstein et al. 1995a).

Failure of these microsatellites to adhere to the stepwise mutation model assumed by the (δμ)² distance could also affect the estimates. We have direct evidence of a mutation rate dependence on repeat length and a constraint on maximal allele size (X. Xu, M.T. Seielstad, and X. Xu, unpubl.). With these caveats in mind, we should emphasize that dates estimated by (δμ)² are not coalescent dates. Coalescence dates are estimates of the time to a molecular ancestor, which must precede the diversification of populations. Estimates from (δμ)² come closer to estimating the actual divergence times, unless admixture among populations is extensive, in which case the date will underestimate the parameter.

A recent African origin for all humans is supported by a robust corpus of evidence. This evidence takes three major tracks: (1) increased genetic diversity in Africa versus the rest of the world, (2) phylogenetic analyses placing the deepest branches between African and non-African populations, and (3) indications that the age of the “genetic most recent common ancestor” is very young. The Y chromosome, like mtDNA and the autosomes, appears to support a recent African origin. Although no locus taken alone would be sufficient to demonstrate support for the out of Africa theory, the consistency of so many loci in supporting a recent African origin comprises a compelling case. Perhaps we can begin to use the genetic data that are so rapidly accumulating to answer more interesting questions about the forces that have generated and maintained the geographic patterns of genetic diversity we observe today.

METHODS

Populations

Samples from the Hadendowah and Beni Amer tribes of the Beja were collected in and around Kassala, Sudan, in January 1994, and samples from the Dinka were collected in settlements near Khartoum. Eleven populations from Ethiopia were sampled in March and April 1995: Konso from Konso town, Tsamako and Ongota near Weyto, Hamar near Turmi, Dasenech in Omorate, Dizi and Surma near Maji, Bume (or Nyangatom) at Kibish, Bench in Mizan Teferi, Majangir near Tepi, and Berta near Kurmuk. Collections from five populations in Mali were made in February 1996: Dogon and Peulh near Bandiagara (14.361°N, 3.647°W), Bozo in Mopti, Tuareg near Timbuktu (16.661°N, 3.240°W), and Songhai near Gao (16.287°N, 0.044°W). Australian Aborigine and New Guinean DNA samples are described by Stoneking et al. (1990). Samples from Nasioi Melanesians were collected by Dr. Jonathan Friedlaender in Bougainville, Solomon Islands. Dr. Judy Kidd and her collaborators provided DNA from several Taiwanese Aboriginal and Native American populations (primarily from South America): Karitiana, Mayan, Moskoke, Quechua, Surui, and Ticuna. DNA samples from South African populations (Khoisan, Pedi, Sotho, Swazi, Tswana, Xhosa, and Zulu) are described by Spurdle and Jenkins (1992), and samples from Pakistan were provided by Dr. Qasim Mehdi [Baluchi (30.5°N, 66.5°E), Brahui, Hunza (Burushaski speaking), Pathan (33.5°N, 70.5°E), and Sindhi (25°N, 69°E)]. The remaining DNA samples were extracted from EBV-transformed B cell lines as described in Bowcock et al. (1987). Samples from northern Italians are described in Matullo et al. (1994). Pygmy and Lissongo samples were collected from the Central African Republic and Zaire. Chinese, Japanese, and Northern European (mostly German) samples were collected from immigrants to the San Francisco Bay Area. Cambodian samples are from Khmer living in Santa Ana, California. Y chromosome microsatellite genotype data for the Basque and Catalan populations were published by Perez-Lezaun et al. (1997).

DNA Extraction

DNA for samples collected in Ethiopia, Sudan, and Mali was extracted from 5 ml of whole blood drawn in EDTA anticoagulant. Extraction was begun in the field as quickly as possible (but always <2 days after collection) using a “salting out” procedure modified from Miller et al. (1988). Initially, whole blood was centrifuged at 1200g for 10 min to separate the plasma, which was discarded before adding 10 ml of chilled (where possible) red cell lysis buffer (RCLB: 1 mM NH4HCO3 and 115 mM NH4Cl). In subsequent procedures, 10 ml of RCLB was added directly to 5 ml of whole blood without first discarding plasma. After gentle mixing, white cells were collected by centrifugation at 1200g for 10 min and washed as necessary with additional RCLB followed by centrifugation. When clean, the cell pellet was lysed in 3 ml of white cell lysis buffer [WCLB: 100 mM Tris-Cl at pH 7.6, 40 mM EDTA at pH 8.0, 50 mM NaCl, 0.2% SDS, and 0.05% NaN3 (to inhibit microbial growth)]. At this stage, DNA is sufficiently stabilized to allow storage at room temperature with little additional loss. Final extraction and purification in the laboratory continued with an optional proteinase K digestion (10 μl of 20 mg/ml of ProK). Proteins were precipitated by the addition of one-third volume of saturated NaCl (~6 M) followed by centrifugation at 12,000 rpm for 10 min. The supernatant was collected, and DNA was precipitated by adding two volumes of absolute ethanol and centrifuging at >12,000 rpm for 15 min. The DNA pellet was washed in 70% ethanol, dried, and resuspended in 100 μl of TE before fluorimetric or spectrophotometric quantitation and dilution.

Y Microsatellites

Eight sets of primers were used to amplify 10 microsatellite loci on the Y chromosome. A total of 506 chromosomes were analyzed in 49 populations from every inhabited continent. With the exception of DYS19, all primers are available with the indicated dye labels from Research Genetics (Huntsville, AL): DYS385-FAM, DYS388-FAM, DYS389-FAM, DYS390-FAM, DYS391-FAM, DYS392-HEX, and DYS393-HEX. Many of the primers can be coamplified, although no consistent multiplexing scheme was used in this study. Two of the primer sets typically produced two bands (DYS385 and DYS389). It is known that the two bands of DYS389 correspond to two discrete repeat units, whereas DYS385 produces two bands of similar size. In many of the analyses, the two bands of DYS385 and DYS389 were treated as if from separate loci. A complete description of all loci can be found in Kayser et al. (1997).

PCR Amplification and Allele Size Determination

Microsatellite loci were amplified by PCR in 5-μl reactions. Reactions were “hotstart” using Taqstart (Clontech) anti-Taq antibody. Reactions consisted of 0.5 μl of 10× PCR buffer (Boehringer Mannheim); 0.1 μl of each labeled primer (8.5–10 μM); 0.1 μl of each unlabeled primer (8.5–10 μM); 0.8 μl of a dNTP mix (1.25 mM dATP, 1.25 mM dCTP, 1.25 mM dTTP, and 1.25 mM dGTP); 0.05 μl of Taq DNA polymerase (5 U/μl) (Boehringer Mannheim); 0.05 μl of Taqstart anti-Taq polymerase antibody (5 U/μl) (Clontech); 0.2 μl of Taqstart antibody buffer (Clontech); 0.3 μl of MgCl2 (25 mM; for a final concentration of 1.5 mM); 1.0 μl of template DNA (10–50 ng/ml); and distilled water to a final volume of 5 μl.
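The final concentrations implied by this recipe follow from simple dilution arithmetic. The short Python sketch below checks the stated 1.5 mM MgCl2 and derives the per-dNTP and primer concentrations. The function and variable names are illustrative and are not part of the original protocol; the sketch assumes that the listed stock concentrations are exact and that the 10× buffer contributes no additional MgCl2.

```python
# Final concentration after dilution: C_final = C_stock * V_added / V_total.
TOTAL_UL = 5.0  # reaction volume stated in the text


def final_conc(stock, added_ul, total_ul=TOTAL_UL):
    """Return the final concentration, in the same unit as `stock`."""
    return stock * added_ul / total_ul


mgcl2_mM = final_conc(25.0, 0.3)       # 25 mM stock, 0.3 ul added
dntp_each_mM = final_conc(1.25, 0.8)   # each dNTP is 1.25 mM in the mix
primer_low_uM = final_conc(8.5, 0.1)   # lower bound of the primer stock
primer_high_uM = final_conc(10.0, 0.1) # upper bound of the primer stock

print(f"MgCl2:     {mgcl2_mM:.2f} mM")
print(f"each dNTP: {dntp_each_mM:.2f} mM")
print(f"primers:   {primer_low_uM:.2f}-{primer_high_uM:.2f} uM")
```

Under these assumptions each dNTP ends up at 0.2 mM and each primer at roughly 0.17 to 0.2 μM.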

A “touchdown” cycling regime was used, beginning with 14 cycles of successive 0.5°C decreases in annealing temperature—from 63°C to 56.5°C. Twenty cycles with a constant 56°C annealing temperature followed. Each step lasted for 30 sec, and a denaturing step of 94°C and extension step of 72°C were used for all cycles. Samples were incubated at 72°C for 4 min following completion of the final cycle.
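The annealing schedule described above can be written out explicitly. The following Python sketch lists the annealing temperature for every cycle; the function name and its parameters are hypothetical, and the sketch assumes that the 20 constant-temperature cycles start immediately after the 56.5°C cycle.

```python
def touchdown_schedule(start=63.0, stop=56.5, step=0.5,
                       plateau=56.0, plateau_cycles=20):
    """Return the annealing temperature (deg C) for each PCR cycle."""
    # 63.0 down to 56.5 in 0.5 degree steps gives 14 touchdown cycles.
    n_touchdown = int(round((start - stop) / step)) + 1
    temps = [start - i * step for i in range(n_touchdown)]
    # Constant-temperature cycles follow the touchdown phase.
    temps += [plateau] * plateau_cycles
    return temps


schedule = touchdown_schedule()
for cycle, temp in enumerate(schedule, start=1):
    print(f"cycle {cycle:2d}: anneal at {temp:.1f} C")
```

With the values given in the text this yields 34 cycles in total: 14 touchdown cycles followed by 20 cycles at 56°C.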

PCR products were diluted with water and run on an ABI373a DNA sequencer with GS-350 or GS-500 dye-labeled size standard. Allele sizes were determined with ABI’s GS Analysis software. Data are available at http://www.stats.ox.ac.uk/~pritch/ydata.html.

Diversity Among Populations

Several measures of genetic variation are available. Gene diversity (average heterozygosity) is one, although a more suitable quantity for microsatellites obeying a stepwise mutation model is the variance in repeat number. Gene diversity and its s.e. were calculated according to the method of Nei (1987). Jorde et al. (1997) have devised a test to identify differences in genetic diversity among populations based on the variance in repeat scores. A resampling strategy is used to assess the level of statistical significance. Jorde et al.’s test first calculates the variance within continental groupings, Vij, for each locus i and continental grouping j. The mean of the ratio of Vij over the mean of Vij among populations is the value Rj, reported in Table 2 (A, B, and C). S is the ratio of the largest Rj to the mean of the others and indicates the degree to which diversity is higher in that population relative to the others. The statistical significance of S was determined by taking random replicates of the haplotypes, equal in size to the original continental groupings, and determining the value of S in each replicate. The fraction of replicates that exceeds the value calculated from the original data is an indication of the statistical significance of the observed effect. Calculations and replicates were performed using a program (“strvar”) provided by Alan Rogers.
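The diversity statistics described in this section can be sketched in a few lines of Python. This is not the “strvar” program; it is a minimal illustration under stated assumptions: gene diversity uses Nei's unbiased estimator n/(n − 1) × (1 − Σp²), the “mean of Vij among populations” is read as the mean across groupings at each locus, replicates are drawn by shuffling haplotypes among groupings without replacement, and every locus is variable in at least one grouping. The toy haplotypes and all function names are invented for the example.

```python
import random
from collections import Counter
from statistics import mean, variance


def gene_diversity(alleles):
    """Nei's (1987) unbiased gene diversity for n sampled chromosomes."""
    n = len(alleles)
    sum_p2 = sum((c / n) ** 2 for c in Counter(alleles).values())
    return n / (n - 1) * (1 - sum_p2)


def s_statistic(groups):
    """groups: dict mapping group name to a list of haplotypes (tuples)."""
    names = list(groups)
    n_loci = len(groups[names[0]][0])
    # V[i][j]: variance in repeat number at locus i within group j.
    V = [[variance([float(h[i]) for h in groups[g]]) for g in names]
         for i in range(n_loci)]
    # R[j]: mean over loci of V[i][j] scaled by the across-group mean.
    R = [mean(V[i][j] / mean(V[i]) for i in range(n_loci))
         for j in range(len(names))]
    top = max(range(len(names)), key=lambda j: R[j])
    others = [r for j, r in enumerate(R) if j != top]
    return R[top] / mean(others), names[top]


def resampling_p(groups, n_rep=500, seed=1):
    """Fraction of shuffled replicates whose S exceeds the observed S."""
    rng = random.Random(seed)
    observed, _ = s_statistic(groups)
    pooled = [h for g in groups.values() for h in g]
    exceed = 0
    for _ in range(n_rep):
        rng.shuffle(pooled)
        start, replicate = 0, {}
        for name, g in groups.items():
            replicate[name] = pooled[start:start + len(g)]
            start += len(g)
        if s_statistic(replicate)[0] > observed:
            exceed += 1
    return observed, exceed / n_rep


# Invented two-locus haplotypes; group "A" is deliberately more variable.
toy = {
    "A": list(zip([8, 10, 12, 14, 16, 18, 11, 15],
                  [18, 20, 22, 24, 26, 28, 21, 25])),
    "B": list(zip([12, 13, 14, 12, 13, 14, 13, 12],
                  [22, 23, 24, 22, 23, 24, 23, 22])),
    "C": list(zip([13, 14, 15, 13, 14, 15, 14, 15],
                  [23, 24, 25, 23, 24, 25, 24, 25])),
}
s_obs, p_value = resampling_p(toy)
print(f"S = {s_obs:.2f}, p = {p_value:.3f}")
```

A small p value here only reflects how the toy data were built; it says nothing about the populations analyzed in this study.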

Phylogenetic Analysis

Figure 1 shows a tree relating the continental groupings listed in Table 1. (δμ)2 distances were calculated with the Microsat program written by Eric Minch (available at http://lotka.stanford.edu/microsat.html). Average linkage (UPGMA) trees (Sokal and Michener 1958) were constructed using PHYLIP (Felsenstein 1993). Average linkage was chosen because a suitable outgroup for the rapidly evolving Y chromosome microsatellites is not available. Figure 2 depicts an unrooted neighbor-joining tree constructed from the same distance matrix (Table 3). Trees were constructed using several other measures of genetic distance (e.g., DSW, proportion of shared alleles, absolute difference, FST and GST), and most agreed with the topology of the tree in Figure 1, although the placement of the American population group sometimes differed. As suggested by the low values of genetic diversity in the Americas (Table 2A), it appears that genetic drift has profoundly affected Y chromosome variation in Native American populations, contributing at least in part to their increased distance from the other populations.
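For readers unfamiliar with the distance itself, (δμ)2 between two populations is the squared difference in mean repeat number, averaged over loci. The Python sketch below illustrates that definition on invented data; it is not the Microsat program, and the function names and toy haplotypes are assumptions made for the example.

```python
def locus_means(pop):
    """Mean repeat number at each locus for a list of haplotypes."""
    n = len(pop)
    return [sum(h[i] for h in pop) / n for i in range(len(pop[0]))]


def delta_mu_squared(pop_a, pop_b):
    """Squared difference in mean repeat number, averaged over loci."""
    means_a, means_b = locus_means(pop_a), locus_means(pop_b)
    return sum((a - b) ** 2 for a, b in zip(means_a, means_b)) / len(means_a)


# Invented two-locus haplotypes (repeat counts) for two populations.
pop_a = [(10, 20), (12, 22)]  # locus means: 11 and 21
pop_b = [(13, 21), (15, 23)]  # locus means: 14 and 22
d = delta_mu_squared(pop_a, pop_b)
print(f"(delta mu)^2 = {d}")
```

A matrix of such pairwise values is the kind of input a clustering program such as PHYLIP takes to build UPGMA or neighbor-joining trees.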

The performance of a distance measure is primarily a function of its linearity (the duration over which it increases linearly with the time since population fission) and its coefficient of variation (Goldstein and Pollock 1997). Often, optimizing linearity results in increased variance, so that, in practice, the various distance measures perform differently in different circumstances. Some genetic distance measures work best only with closely related populations (displaying a low coefficient of variation but maintaining linearity over only short distances). (δμ)2 is among the more linear distance measures. As a result, it seems to perform well in a diverse array of circumstances.

Estimating Separation Times Among Populations

The same (δμ)2 distances used to construct the trees in Figures 1 and 2 were also used to estimate the time of the split between African and non-African populations. Following the approach of Goldstein et al. (1995b), the average of the distances between each non-African population and the African population was calculated. The following formula derived by Goldstein et al. (1995b) was used to estimate the number of generations since the split of African and non-African populations:

$$(\delta\mu)^2 = 2\beta\tau \tag{1}$$

β is the mutation rate per locus per generation, and τ is time in generations. Confidence intervals (95%) for the (δμ)2 distance were estimated by analyzing 10,000 bootstrap replications as implemented by Microsat.
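As a worked illustration of the dating step, the sketch below inverts the Goldstein et al. (1995b) relation, (δμ)2 = 2βτ, to obtain τ and converts generations to years. All numbers are hypothetical placeholders and are not the distances, mutation rates, or generation time used in this study. The bootstrap resamples loci with replacement and takes percentile limits, which is an assumption about the procedure and may not match the Microsat implementation.

```python
import random


def generations_since_split(distance, beta):
    """Solve distance = 2 * beta * tau for tau (in generations)."""
    return distance / (2.0 * beta)


def bootstrap_ci(per_locus_sq_diffs, n_rep=2000, seed=0):
    """Percentile 95% interval for the mean, resampling loci."""
    rng = random.Random(seed)
    k = len(per_locus_sq_diffs)
    reps = sorted(
        sum(rng.choice(per_locus_sq_diffs) for _ in range(k)) / k
        for _ in range(n_rep)
    )
    return reps[int(0.025 * n_rep)], reps[int(0.975 * n_rep) - 1]


# Hypothetical per-locus squared differences in mean repeat number.
sq_diffs = [0.2, 0.5, 1.0, 1.5, 1.8]
distance = sum(sq_diffs) / len(sq_diffs)

BETA = 1e-3                # hypothetical mutation rate per locus per generation
YEARS_PER_GENERATION = 25  # hypothetical generation time

tau = generations_since_split(distance, BETA)
years = tau * YEARS_PER_GENERATION
lo, hi = bootstrap_ci(sq_diffs)

print(f"distance = {distance:.2f} (95% interval {lo:.2f} to {hi:.2f})")
print(f"tau = {tau:.0f} generations, about {years:.0f} years")
```

Because τ is inversely proportional to β, halving the assumed mutation rate doubles the estimated date, which is why the choice of mutation rate shifts the estimated date range so strongly.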

Acknowledgments

We wish to thank all of the DNA donors who participated in this project. Laboratory work was funded by National Institutes of Health grant GM28428 to Luca Cavalli-Sforza, who also provided much helpful discussion. The collection of DNA samples in Ethiopia, Sudan, and Mali was supported by grants from the Arthur Green Fund of Harvard University and the L.S.B. Leakey Foundation to M.T.S. We thank David Goldstein for helpful advice and discussion and Trefor Jenkins and S. Qasim Mehdi for DNA samples from South African and Pakistani populations, respectively.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.

Footnotes

E-MAIL ude.dravrah.gpp@kram; FAX (617) 432-2956.

REFERENCES

  • Armour J, Anttinen T, May C, Vega E, Sajantila A, Kidd J, Bertranpetit J, Pääbo S, Jeffreys A. Minisatellite diversity supports a recent African origin for modern humans. Nat Genet. 1996;13:154–160.
  • Batzer M, Stoneking M, Alegria-Hartman M, Bazan H, Kass D, Shaikh T, Novick G, Ioannou P, Scheer W, Herrera R, et al. African origin of human-specific polymorphic Alu insertions. Proc Natl Acad Sci. 1994;91:12288–12292.
  • Bianchi NO, Catanesi CI, Bailliet G, Martinez-Marignac V, Bravi C, Vidal-Rioja L, Herrera R, Lopez-Camelo J. Characterization of ancestral and derived Y-chromosome haplotypes of New World native populations. Am J Hum Genet. 1998;63:1862–1871.
  • Bowcock A, Bucci C, Hebert J, Kidd J, Kidd K, Friedlaender J, Cavalli-Sforza L. Study of 47 DNA markers in five populations from four continents. Gene Geogr. 1987;1:47–64.
  • Bowcock AM, Kidd JR, Mountain JL, Hebert JM, Carotenuto L, Kidd KK, Cavalli-Sforza LL. Drift, admixture, and selection in human evolution: A study with DNA polymorphisms. Proc Natl Acad Sci. 1991;88:839–843.
  • Bowcock AM, Ruiz-Linares A, Tomfohrde J, Minch E, Kidd JR, Cavalli-Sforza LL. High resolution of human evolutionary trees with polymorphic microsatellites. Nature. 1994;368:455–457.
  • Cann R, Stoneking M, Wilson A. Mitochondrial DNA and human evolution. Nature. 1987;325:31–36.
  • Casanova M, Leroy P, Boucekkine C, Weissenbach J, Fellous M, Purrello M, Fiori G, Siniscalco M. A human Y-linked DNA polymorphism and its potential for estimating genetic and evolutionary distance. Science. 1985;230:1403–1406.
  • Cavalli-Sforza LL, Menozzi P, Piazza A. The history and geography of human genes. Princeton, NJ: Princeton University Press; 1994.
  • Dorit RL, Akashi H, Gilbert W. Absence of polymorphism at the ZFY locus on the human Y chromosome. Science. 1995;268:1183–1184.
  • Felsenstein J. PHYLIP Phylogeny Inference Package. Seattle, WA: University of Washington; 1993.
  • Foley R. The context of human genetic evolution. Genome Res. 1998;8:339–347.
  • Goldstein D, Ruiz-Linares A, Cavalli-Sforza L, Feldman M. An evaluation of genetic distances for use with microsatellite loci. Genetics. 1995a;139:463–471.
  • Goldstein D, Ruiz-Linares A, Cavalli-Sforza L, Feldman M. Genetic absolute dating based on microsatellites and the origin of modern humans. Proc Natl Acad Sci. 1995b;92:6723–6727.
  • Goldstein DB, Zhivotovsky LA, Nayar K, Linares A, Cavalli-Sforza L, Feldman M. Statistical properties of the variation at linked microsatellite loci: Implications for the history of human Y chromosomes. Mol Biol Evol. 1996;13:1213–1218.
  • Goldstein DB, Pollock DD. Launching microsatellites: A review of mutation processes and methods of phylogenetic inference. J Hered. 1997;88:335–342.
  • Hammer MF. A recent common ancestry for human Y chromosomes. Nature. 1995;378:376–378.
  • Hammer MF, Spurdle AB, Karafet T, Bonner MR, Wood ET, Novelletto A, Malaspina P, Mitchell RJ, Horai S, Jenkins T, Zegura SL. The geographic distribution of human Y chromosome variation. Genetics. 1997;145:787–805.
  • Hammer M, Karafet T, Rasanayagam A, Wood E, Altheide T, Jenkins T, Griffiths R, Templeton A, Zegura S. Out of Africa and back again: nested cladistic analysis of human Y chromosome variation. Mol Biol Evol. 1998;15:427–441.
  • Harding RM, Fullerton SM, Griffiths RC, Bond J, Cox M, Schneider J, Moulin D, Clegg J. Archaic African and Asian lineages in the genetic ancestry of modern humans. Am J Hum Genet. 1997;60:772–789.
  • Harpending H, Sherry S, Rogers A, Stoneking M. The genetic structure of ancient human populations. Curr Anthropol. 1993;34:483–496.
  • Harris E, Hey J. X chromosome evidence for ancient human histories. Proc Natl Acad Sci. 1999;96:3320–3324.
  • Heyer E, Puymirat J, Dieltjes P, Bakker E, de Knijff P. Estimating Y-chromosome specific microsatellite mutation frequencies using deep rooting pedigrees. Hum Mol Genet. 1997;6:799–803.
  • Horai S, Hayasaka K, Kondo R, Tsugane K, Takahata N. Recent African origin of modern humans revealed by complete sequences of hominoid mitochondrial DNAs. Proc Natl Acad Sci. 1995;92:532–536.
  • Jorde L, Rogers A, Bamshad M, Watkins W, Krakowiak P, Sung S, Kere J, Harpending H. Microsatellite diversity and the demographic history of modern humans. Proc Natl Acad Sci. 1997;94:3100–3103.
  • Jorde L, Bamshad M, Dixon M, Watkins W, Kere J, Lum J, Sung S, Rogers A. A worldwide comparison of genetic variation in Y chromosome, autosomal, and mitochondrial polymorphisms. Am J Hum Genet (Suppl.) 1998;63:A219.
  • Kayser M, Caglià A, Corach D, Fretwell N, Gehrig C, Graziosi G, Heidorn F, Herrmann S, Herzog B, Hidding M, et al. Evaluation of Y-chromosomal STRs: A multicenter study. Int J Leg Med. 1997;110:125–133.
  • Klein R. Anatomy, behavior, and modern human origins. J World Prehist. 1995;9:167–198.
  • Krings M, Stone A, Schmitz R, Krainitzki H, Stoneking M, Pääbo S. Neandertal DNA sequences and the origin of modern humans. Cell. 1997;90:19–30.
  • Lahr M. The evolution of modern human diversity: A study of cranial variation. Cambridge, UK: Cambridge University Press; 1996.
  • Li W-H. Distribution of nucleotide differences between two randomly chosen cistrons of a finite population. Genetics. 1977;85:331–337.
  • Lucotte G, Ngo NY. p49f, a highly polymorphic probe, that detects TaqI RFLPs on the human Y chromosome. Nucleic Acids Res. 1985;13:82–85.
  • Marjoram P, Donnelly P. Pairwise comparisons of mitochondrial DNA sequences in subdivided populations and implications for early human evolution. Genetics. 1994;136:673–683.
  • Matullo G, Griffo R, Mountain J, Piazza A, Cavalli-Sforza L. RFLP analysis on a sample from northern Italy. Gene Geogr. 1994;8:25–34.
  • Miller SA, Dykes DD, Polesky HF. A simple salting out procedure for extracting DNA from human nucleated cells. Nucleic Acids Res. 1988;16:1215.
  • Mountain JL, Lin AA, Bowcock AM, Cavalli-Sforza LL. Evolution of modern humans: Evidence from nuclear DNA polymorphisms. In: Aitken MJ, Stringer CB, Mellars PA, editors. The origin of modern humans and the impact of chronometric dating. Princeton, NJ: Princeton University Press; 1993.
  • Mountain JL, Cavalli-Sforza LL. Inference of human evolution through cladistic analysis of nuclear DNA restriction polymorphisms. Proc Natl Acad Sci. 1994;91:6515–6519.
  • Nordborg M. On the probability of Neanderthal ancestry. Am J Hum Genet. 1998;63:1237–1240.
  • Nei M. Molecular evolutionary genetics. New York, NY: Columbia University Press; 1987.
  • Perez-Lezaun A, Calafell F, Seielstad M, Mateu E, Comas D, Bosch E, Bertranpetit J. Population genetics of Y chromosome short tandem repeats in humans. J Mol Evol. 1997;45:265–270.
  • Pritchard J, Feldman M. Genetic data and the African origin of humans. Science. 1996;274:1548.
  • Rogers A, Harpending H. Population growth makes waves in the distribution of pairwise genetic differences. Mol Biol Evol. 1992;9:552–569.
  • Rogers AR, Jorde LB. Ascertainment bias in estimates of average heterozygosity. Am J Hum Genet. 1996;58:1033–1041.
  • Ruvolo M, Zehr S, von Dornum M, Pan D, Chang B, Lin J. Mitochondrial COII sequences and modern human origins. Mol Biol Evol. 1993;10:1115–1135.
  • Seielstad MT, Hebert JM, Lin AA, Underhill PA, Ibrahim M, Vollrath D, Cavalli-Sforza LL. Construction of human Y-chromosomal haplotypes using a new polymorphic A to G transition. Hum Mol Genet. 1994;3:2159–2161.
  • Slatkin M. A measure of population subdivision based on microsatellite allele frequencies. Genetics. 1995;139:457–462.
  • Slatkin M, Rannala B. Likelihood analysis of disequilibrium mapping and related problems. Am J Hum Genet. 1998;63:459–473.
  • Sokal R, Michener C. A statistical method for evaluating systematic relationships. Univ Kans Sci Bull. 1958;38:1409–1438.
  • Spurdle A, Jenkins T. Y chromosome probe p49a detects complex PvuII haplotypes and many new TaqI haplotypes in southern African populations. Am J Hum Genet. 1992;50:107–125.
  • Stoneking M, Jorde L, Bhatia K, Wilson A. Geographic variation in human mitochondrial DNA from Papua New Guinea. Genetics. 1990;124:717–733.
  • Stringer C, Andrews P. Genetic and fossil evidence for the origin of modern humans. Science. 1988;239:1263–1268.
  • Tishkoff S, Dietzsch E, Speed W, Pakstis A, Kidd J, Cheung K, Bonné-Tamir B, Santachiara-Benerecetti A, Moral P, Krings M. Global patterns of linkage disequilibrium at the CD4 locus and modern human origins. Science. 1996;271:1380–1387.
  • Underhill PA, Jin L, Lin AA, Mehdi SQ, Jenkins T, Vollrath D, Davis RW, Cavalli-Sforza LL, Oefner PJ. Detection of numerous Y chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography. Genome Res. 1997;7:996–1005.
  • Vigilant L, Stoneking M, Harpending H, Hawkes K, Wilson A. African populations and the evolution of human mitochondrial DNA. Science. 1991;253:1503–1507.
  • Weber JL, Wong C. Mutation of human short tandem repeats. Hum Mol Genet. 1993;2:1123–1128.
  • Weiss K. Human generation time. Am Antiq. 1973;38:1–186.
  • Whitfield LS, Sulston JE, Goodfellow PN. Sequence variation of the human Y chromosome. Nature. 1995;378:379–380.
  • Wolpoff M, Wu X, Thorne A. Modern Homo sapiens origins: A general theory of hominid evolution involving the fossil evidence from East Asia. In: Smith FH, Spencer F, editors. The origins of modern humans: A world survey of the fossil evidence. New York, NY: A. R. Liss; 1984.
  • Zietkiewicz E, Yotova V, Jarnik M, Korab-Laskowska M, Kidd K, Modiano D, Scozzari R, Stoneking M, Tishkoff S, Batzer M, et al. Genetic structure of the ancestral population of modern humans. J Mol Evol. 1998;47:146–155.
  • Zischler H, Geisert H, von Haeseler A, Pääbo S. A nuclear “fossil” of the mitochondrial D-loop and the origin of modern humans. Nature. 1995;378:489–492.

Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press