Copyright © 1997, The National Academy of Sciences of the USA

Evolution

An apportionment of human DNA diversity

*Department of Biology, University of Ferrara, via Borsari 46, I-44100 Ferrara, Italy; †Department of Statistical Sciences, University of Bologna, Italy; and ‡Department of Genetics, Stanford University, Stanford, CA 94305

L. Luca Cavalli-Sforza

Accepted February 27, 1997.

Abstract

It is often taken for granted that the human species is divided in rather homogeneous groups or races, among which biological differences are large. Studies of allele frequencies do not support this view, but they have not been sufficient to rule it out either. We analyzed human molecular diversity at 109 DNA markers, namely 30 microsatellite loci and 79 polymorphic restriction sites (restriction fragment length polymorphism loci) in 16 populations of the world. By partitioning genetic variances at three hierarchical levels of population subdivision, we found that differences between members of the same population account for 84.4% of the total, which is in excellent agreement with estimates based on allele frequencies of classic, protein polymorphisms. Genetic variation remains high even within small population groups. On the average, microsatellite and restriction fragment length polymorphism loci yield identical estimates. Differences among continents represent roughly 1/10 of human molecular diversity, which does not suggest that the racial subdivision of our species reflects any major discontinuity in our genome.

Keywords: genetic variation, microsatellite loci, restriction polymorphisms, racial classification

In 1972, Richard Lewontin analyzed allele frequencies at 15 protein loci and concluded that 85% of the overall human genetic diversity is represented by individual diversity within populations (1). Differences among seven racial groups accounted for less than 7% of the total.
Nei and Roychoudhury reached a similar apportionment of genetic diversity among populations from three continents (2). Although these results were repeatedly confirmed by studies of protein markers (3–5), the idea that the human species is deeply subdivided into races has not disappeared (6, 7). Reasons for this include some perceived discontinuity among populations, usually reported for quantitative traits (6), and the possibility that protein markers, including blood groups, may not exhaustively describe genetic variation, leaving open the possibility that the undetected variation might show a different pattern. In this study, we analyzed how DNA variation is distributed at 109 loci (Fig. 1).

Materials and Methods

Three largely independent sets of genetic data were used in this study. The microsatellite database comprises individual allele lengths for 29 repeats and one tetranucleotide repeat of chromosomes 13 and 15. This is the set of data used by Bowcock et al. (8) from which we excluded nonhuman primates. The map distances between adjacent loci, except eight of them, are such that linkage disequilibrium can hardly be considered a major disturbing factor. Fourteen populations are included, for an overall sample size of 148. However, for no marker was a complete set of 296 chromosomes available. The missing values never exceed 10% at any given locus, and in no case were they replaced by interpolated data.

The RFLP database includes frequencies of the alternative alleles (presence or absence of cut) at 79 autosomal loci in 1109 individuals from 10 populations on 4 continents (extended data set). For 321 individuals from 12 populations on 5 continents, we also had full individual multilocus genotypes at 16 loci (reduced data set). Both data sets came from the analysis of cell lines that were used in a variety of other studies, in which sampling procedures are described in detail (9–12). As usual with molecular data, sample sizes were small. On the other hand, previous results show that the number of markers considered, rather than sample sizes, is crucial for population discrimination (13). Therefore, if some bias exists in the data of this study, the genetic differences between populations will tend to be overly emphasized.

A nonparametric method, analysis of molecular variance (1, 14), was used for hierarchically partitioning genetic diversity. At each locus, each individual allele was compared: (i) with the other alleles of the same sample; (ii) with the alleles of the other samples within the same continent; and (iii) with all of the alleles from other continents.
This procedure was repeated independently for the multilocus genotypes of the reduced RFLP data set. The genetic variances thus computed reflected differences in allele frequencies and (for microsatellites) lengths. With a total sample size of N individuals from G continents and P populations, analysis of molecular variance generated a partition of the overall variance into G − 1 df for the variance among continents, P − G for that among populations within continent, and N − P for that among individuals within a population. The significance of the estimated variances was tested by randomly assigning individuals to populations (according to two different randomization schemes) or populations to continents and repeating the randomizations 1000 times, each time recalculating the relevant variance. The observed variances were finally compared with the empirical expected distributions thus obtained.

Results and Discussion

Diversity among individuals of the same population (Table 1) was significant at 28 of 30 microsatellite loci and at all RFLP loci. It explained, on the average, 84.5% of the overall microsatellite variance and 83.6–84.5% of the overall RFLP variance (for the reduced and the extended data sets, respectively). Populations of the same continent tended to resemble each other; at the microsatellite level, their differences accounted only for 5% of diversity, reaching significance at nine loci; for RFLP loci, although significant in 54 cases, the comparisons between samples of the same continent accounted for less than 4% of the total molecular variance. Differences among continents accounted for the remaining fraction of variance, i.e., between 8 and 11.7%, and were significant at 12 microsatellite loci and 50 RFLP loci.
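
The three-level partition and the randomization test described in Materials and Methods can be illustrated with a minimal sketch. It is not the authors' analysis of molecular variance: it assumes a single locus scored as one number per allele copy (for example, a microsatellite allele length), uses a classical nested sum-of-squares decomposition as a simplified stand-in, and implements only one assumed randomization scheme (shuffling individuals among populations within the same continent). The function names and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def mean(values):
    return sum(values) / len(values)


def nested_sums_of_squares(records):
    """Partition the total sum of squares at three hierarchical levels.

    records: list of (continent, population, value) tuples.
    Returns {level: (sum_of_squares, degrees_of_freedom)}.
    """
    by_pop = defaultdict(list)
    by_cont = defaultdict(list)
    for cont, pop, value in records:
        by_pop[(cont, pop)].append(value)
        by_cont[cont].append(value)

    grand = mean([v for _, _, v in records])
    cont_mean = {c: mean(vs) for c, vs in by_cont.items()}
    pop_mean = {key: mean(vs) for key, vs in by_pop.items()}

    ss_cont = sum(len(vs) * (cont_mean[c] - grand) ** 2
                  for c, vs in by_cont.items())
    ss_pop = sum(len(vs) * (pop_mean[key] - cont_mean[key[0]]) ** 2
                 for key, vs in by_pop.items())
    ss_within = sum((v - pop_mean[key]) ** 2
                    for key, vs in by_pop.items() for v in vs)

    n, p, g = len(records), len(by_pop), len(by_cont)
    return {
        "among_continents": (ss_cont, g - 1),                     # G - 1 df
        "among_populations_within_continents": (ss_pop, p - g),   # P - G df
        "within_populations": (ss_within, n - p),                 # N - P df
    }


def permutation_p_value(records, n_perm=1000, seed=0):
    """Assumed scheme: shuffle values among populations within each continent.

    Returns the fraction of shuffles whose among-population sum of squares
    is at least as large as the observed one.
    """
    rng = random.Random(seed)
    level = "among_populations_within_continents"
    observed = nested_sums_of_squares(records)[level][0]
    continents = sorted({c for c, _, _ in records})
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for cont in continents:
            rows = [r for r in records if r[0] == cont]
            values = [v for _, _, v in rows]
            rng.shuffle(values)
            shuffled.extend((c, p, v) for (c, p, _), v in zip(rows, values))
        if nested_sums_of_squares(shuffled)[level][0] >= observed - 1e-12:
            hits += 1
    return hits / n_perm


# Toy data (hypothetical): 2 continents, 4 populations, 12 allele copies.
toy = (
    [("A", "a1", v) for v in (1, 2, 3)]
    + [("A", "a2", v) for v in (2, 3, 4)]
    + [("B", "b1", v) for v in (5, 6, 7)]
    + [("B", "b2", v) for v in (6, 7, 8)]
)

for level, (ss, df) in nested_sums_of_squares(toy).items():
    print(level, ss, df)
print("p (among populations within continents):", permutation_p_value(toy))
```

In this toy example the degrees of freedom are 1, 2 and 8, which add up to N − 1 = 11, mirroring the G − 1, P − G and N − P partition given in the text. Turning sums of squares into variance components, as reported in the tables, requires further steps that this sketch leaves out.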
The results of this study of DNA and the results of comparable analyses of protein polymorphism are remarkably similar (Table 2). The similarity of within-population diversity estimates in our three DNA data sets may have been increased by the fact that many individuals were typed for both microsatellites and RFLPs; therefore, the results obtained for the two kinds of markers may not be completely independent. Also, this study was based on a limited number of small samples; wider assemblages of data and a more regular distribution in space may somewhat modify the picture. However, if the geographic dispersion of samples distorted our estimates, it was by enhancing diversity among populations and continents and not within populations. Thus, our results suggest that at least four-fifths of human genetic variation reflects individual differences, no matter whether the variation is inferred from allele frequencies for moderately polymorphic protein markers or from allele lengths and frequencies at highly polymorphic DNA loci.
A different class of problems may have arisen from the fact that the populations examined in our study have different complexities, ranging from a camp of a few dozens of hunter–gatherers to a large country, such as China, or a wide region, such as Northern Europe. Might the latter samples, presumably heterogeneous, have inflated artificially our overall estimates of within-population variances? One way to answer is to see whether small isolated groups also show greater DNA homogeneity. For that purpose, we classified the populations studied into three groups, from small communities to very large regions (see legend to Fig. 1).

Implications for the Existence of Races in Humans

But what do these results imply for the race concept? Although no consensus has ever been reached on how many races exist in our species, with proposed figures ranging from 3 to 200 (20), in general a species is divided in races when it can be regarded as an essentially discontinuous set of individuals (21). Studies on a limited number of populations, like ours, cannot exclude that there are true discontinuities in the distribution of some genetic markers all over the world. However, only for one of the 109 loci studied was the within-population component of variance less than 50% of the total. If loci showing a discontinuous distribution across continents exist, they have not been observed in this study, and so the burden of the proof is now on the supporters of a biological basis for human racial classification. Further support for the conclusions of this study comes from the observation that, almost without exception, gene frequencies form smooth clines over all continents (22). Zones of discontinuity in human gene frequency distributions are present, but the local gradients are so small that they can be identified only by simultaneously studying many loci using complex statistical techniques (23).
In addition, such regions of relatively sharp genetic change do not surround large clusters of populations, on a continental or nearly continental scale. On the contrary, they occur irregularly, within continents and even within single countries (24, 25), often overlapping with geographic and linguistic barriers (26–29). Genetic enclaves seem to be mostly limited to islands. Probably any two populations compared at a sufficient number of loci may be shown to differ, as suggested by the fact that several variances among populations, although low in relative terms, are statistically significant in this study. However, this has little to do with the subdivision of the human population into a small number of clearly distinct, racial or continental, groups. The existence of such broad groups is not supported by the present analysis of DNA.

Even with the present, limited sample sizes, this study shows that previous findings of large individual diversity within populations were not due to the particular nature of the markers chosen, normally frequencies of protein variants at biallelic loci. Microsatellite loci are among the most polymorphic in the genome, yet they yield variance estimates in excellent agreement with the previous ones and with variances estimated from other DNA markers. The differences among human groups, even very distant ones and no matter whether the groups are defined on a racial or on a geographical basis, represent only a small fraction of the global genetic diversity of our species.

Acknowledgments

We thank Giorgio Bertorelle and Ayse Ergüven for their comments on an earlier version of this manuscript. This study was supported by Italian Consiglio Nazionale delle Ricerche Grant 95-0889 to G.B.

ABBREVIATION

References

1. Lewontin R C. Evol Biol. 1972;6:381–398.
2. Nei M, Roychoudhury A K. Am J Hum Genet. 1974;26:421–443.
3. Latter B D H. Am Nat. 1980;116:220–237.
4. Nei M, Roychoudhury A K. Mol Biol Evol. 1993;10:927–943.
5. Ryman N, Chakraborty R, Nei M. Hum Hered. 1983;33:93–102.
6. Harrison G A, Tanner J M, Pilbeam D R, Baker P T. Human Biology. 4th Ed. Oxford: Oxford Univ. Press; 1989.
7. Stein P L, Rowe B M. Physical Anthropology. 4th Ed. New York: McGraw–Hill; 1989.
8. Bowcock A M, Ruiz-Linares A, Tomfohrde J, Minch E, Cavalli-Sforza L L. Nature (London). 1994;368:455–457.
9. Bowcock A M, Bucci C, Hebert J M, Kidd J R, Kidd K K, Friedlaender J S, Cavalli-Sforza L L. Gene Geogr. 1987;1:47–64.
10. Bowcock A M, Hebert J M, Mountain J L, Kidd J R, Rogers J, Kidd K K, Cavalli-Sforza L L. Gene Geogr. 1991;5:151–173.
11. Lin A A, Hebert J M, Mountain J L, Cavalli-Sforza L L. Gene Geogr. 1994;8:191–214.
12. Poloni E S, Excoffier L, Mountain J L, Langaney A, Cavalli-Sforza L L. Ann Hum Genet. 1995;59:43–61.
13. Pamilo P, Nei M. Mol Biol Evol. 1988;5:568–583.
14. Excoffier L, Smouse P E, Quattro J M. Genetics. 1992;131:479–491.
15. Goldstein D B, Ruiz-Linares A, Cavalli-Sforza L L, Feldman M. Proc Natl Acad Sci USA. 1995;92:6723–6727.
16. Stringer C B, Andrews P. Science. 1988;239:1263–1268.
17. Vigilant L, Stoneking M, Harpending H, Hawkes K, Wilson A C. Science. 1991;253:1503–1507.
18. Takahata N. Mol Biol Evol. 1993;10:2–22.
19. Mirazon Lahr M, Foley R. Evol Anthropol. 1994;3:48–60.
20. Armelagos G J. Am J Phys Anthropol. 1994;93:381–383.
21. Mayr E. Animal Species and Evolution. Cambridge, MA: Belknap; 1963.
22. Cavalli-Sforza L L, Menozzi P, Piazza A. History and Geography of Human Genes. Princeton: Princeton Univ. Press; 1994.
23. Barbujani G, Sokal R R. Proc Natl Acad Sci USA. 1990;87:1816–1819.
24. Barbujani G, Sokal R R. Am J Hum Genet. 1991;48:398–411.
25. Calafell F, Bertranpetit J. Am J Phys Anthropol. 1994;93:201–215.
26. Barbujani G, Nasidze I S, Whitehead G N. Hum Biol. 1994;66:639–668.
27. Sajantila A, Lahermo P, Anttinen T, Lukka M, Sistonen P, Savontaus M, Aula P, Beckman L, Tranebjaerg L, Gedde-Dahl T, Issel-Tarver L, DiRienzo A, Pääbo S. Genome Res. 1995;5:42–52.
28. Stenico M, Nigro L, Bertorelle G, Calafell F, Capitanio M, Corrain C, Barbujani G. Am J Hum Genet. 1996;59:1363–1375.
29. Excoffier L, Poloni E S, Santachiara-Benerecetti A S, Semino O, Langaney A. In: Molecular Biology and Human Diversity. Boyce A J, Mascie-Taylor C G N, editors. Cambridge: Cambridge Univ. Press; 1996. pp. 141–155.