Copyright © 2000 by The American Society of Human Genetics. All rights reserved.

mtDNA Variation in the South African Kung and Khwe—and Their Genetic Relationships to Other African Populations

1Center for Molecular Medicine and 2Department of Anthropology, Emory University, and 3Program in Biomedical Engineering, College of Computing, Georgia Institute of Technology, Atlanta; and 4Department of Human Genetics, University of Pretoria, Pretoria

Address for correspondence and reprints: Dr. Douglas C. Wallace, Director, Center for Molecular Medicine, Emory University School of Medicine, 1462 Clifton Road, N.E., Atlanta, GA 30322. E-mail: dwallace/at/gen.emory.edu

*Present affiliation: Department of Psychiatry, University of Chicago Medical School, Chicago.
†Present affiliation: Southwest Foundation for Biomedical Research, Department of Genetics, San Antonio.
‡Present affiliation: Laboratory of Genetics, Department of Biology, University of Turku, Turku, Finland.

Received May 19, 1999; Accepted October 17, 1999.

Abstract

The mtDNA variation of 74 Khoisan-speaking individuals (Kung and Khwe) from Schmidtsdrift, in the Northern Cape Province of South Africa, was examined by high-resolution RFLP analysis and control region (CR) sequencing. The resulting data were combined with published RFLP haplotype and CR sequence data from sub-Saharan African populations and then were subjected to phylogenetic analysis to deduce the evolutionary relationships among them. More than 77% of the Kung and Khwe mtDNA samples were found to belong to the major mtDNA lineage, macrohaplogroup L* (defined by a HpaI site at nucleotide position 3592), which is prevalent in sub-Saharan African populations. Additional sets of RFLPs subdivided macrohaplogroup L* into two extended haplogroups, L1 and L2, both of which appeared in the Kung and Khwe.
Besides revealing the significant substructure of macrohaplogroup L* in African populations, these data showed that the Biaka Pygmies have one of the most ancient RFLP sublineages observed in African mtDNA and, thus, that they could represent one of the oldest human populations. In addition, the Kung exhibited a set of related haplotypes that were positioned closest to the root of the human mtDNA phylogeny, suggesting that they, too, represent one of the most ancient African populations. Comparison of Kung and Khwe CR sequences with those from other African populations confirmed the genetic association of the Kung with other Khoisan-speaking peoples, whereas the Khwe were more closely linked to non–Khoisan-speaking (Bantu) populations. Finally, the overall sequence divergence of 214 African RFLP haplotypes defined in both this and an earlier study was 0.364%, giving an estimated age, for all African mtDNAs, of 125,500–165,500 years before the present, a date that is concordant with all previous estimates derived from mtDNA and other genetic data, for the time of origin of modern humans in Africa.

Introduction

The South African Kung San were among the first human groups in which mtDNA variation was analyzed by restriction analysis (Denaro et al. 1981; Johnson et al. 1983). In these studies, the mtDNA sequences of the Kung were surveyed for variation, by means of six rare-cutting restriction endonucleases (AvaII, BamHI, HaeII, HincII, HpaI, and MspI) and Southern blotting. Although this procedure screened only 2%–3% of the mtDNA sequence for variation, it revealed that ~90%–95% of the Kung mtDNAs were characterized by a HpaI site gain at nucleotide position (np) 3592. This marker was subsequently found at very high frequencies in mtDNAs of other sub-Saharan African populations (Scozzari et al. 1988, 1994; Soodyall and Jenkins 1992, 1993; Chen et al. 1995) but was not observed in populations of non-African origin, with a few exceptions (Cann et al.
1987; Soodyall 1993). Furthermore, the African mtDNAs defined by the HpaI np-3592 site gain formed a group of related mtDNA haplotypes (originally defined as haplogroup “L” and here redefined as “L*”), which was found to be the most divergent of those identified in human populations from around the world (Chen et al. 1995). These findings contributed to the hypothesis of an African origin of modern human mtDNAs (Johnson et al. 1983; Cann et al. 1987; Chen et al. 1995), although other interpretations of these data have also been put forward (Excoffier and Langaney 1989; Templeton 1992). Since 1981, relatively limited data have been collected on the RFLP variation in Kung or related Khoisan-speaking populations. Those studies that were conducted on Khoisan-speaking populations continued to use low-resolution (LR)–RFLP analysis with only the six rare-cutting restriction endonucleases listed above (Soodyall and Jenkins 1992, 1993). As a consequence, these data provided additional information about Khoisan groups but could not be entirely integrated with those obtained, on other populations, by high-resolution (HR)–RFLP analysis (Cann et al. 1987; Torroni et al. 1992). The value of using HR-RFLP analysis with African mtDNAs, for phylogenetic reconstructions of human mtDNA, has been shown primarily by two studies of African populations (Cann et al. 1987; Chen et al. 1995). In the first study, Cann et al. (1987) conducted genomic digestions of mtDNAs, by 12 enzymes (AluI, AvaII, DdeI, FnuDII, HaeIII, HhaI, HinfI, HpaI, HpaII, MboI, RsaI, and TaqI), with the resulting restriction fragments being resolved by PAGE. In the later study, the entire genome of African mtDNAs was PCR amplified in nine overlapping fragments, which were then subjected to digestion by 14 enzymes (AluI, AvaII, BamHI, DdeI, HaeII, HaeIII, HhaI, HincII, HinfI, HpaI, HpaII, MboI, RsaI, and TaqI), with the resulting restriction fragments being resolved by agarose gel electrophoresis. 
Both of these studies revealed that African haplotypes form the deepest branches of the human mtDNA phylogeny and that African groups are the most divergent of all world populations. Several additional aspects of mtDNA variation in African populations were shown in the Chen et al. (1995) study. First, two types of length polymorphisms were observed in some of the central-African mtDNAs (Mbuti [eastern] and Biaka [western] Pygmies), one of which was the COII/tRNALys intergenic 9-bp deletion that formerly had been thought to occur only in Asian populations. This length polymorphism had previously been noted in African individuals but had not been assigned to a particular haplogroup or mtDNA lineage (Cann and Wilson 1983). In addition, western-African populations from Senegal were found to have haplotypes that lacked the HpaI np-3592 site gain and, thus, appeared not to belong to haplogroup L*. Because nearly all of these “non-L” haplotypes had the DdeI np-10394 site gain, a polymorphism that was also present in almost every haplogroup L* mtDNA, the data suggested that the former haplotypes derived from the latter, although the exact subhaplogroup (L1 or L2) from which these non-L haplotypes originated was not clear. Alternatively, non-L mtDNAs could have evolved from haplotypes observed in Europeans, such as those belonging to haplogroups I–K (Torroni et al. 1994a), and could have been spread into Africa through population contractions/expansions since their origin. Such ambiguities clearly indicated that further analysis of non-L haplotypes was necessary in order to understand their phylogenetic relationships to other African and non-African mtDNAs. Similarly, until very recently, only a limited number of Kung mtDNAs had been surveyed for control region (CR) sequence variation. In earlier studies (Vigilant et al. 
1989, 1991), mtDNAs from Kung individuals consistently clustered at the deepest nodes of the phylogenies constructed from CR sequences, a result that indicated that they are some of the most divergent (hence, oldest) mtDNAs present in human populations. Soodyall et al. (1996) also studied CR sequence variation in southern-African populations and found that the COII/tRNALys intergenic 9-bp deletion had occurred multiple times in African populations, independent of those appearing in Asian mtDNAs, a trend first noted by Vigilant (1990). Interestingly, deletion mtDNAs were not observed in Khoisan-speaking populations and were rare in western- and southwestern-African groups, whereas they occurred frequently in Pygmy and so-called Negroid populations from central Africa and in Bantu-speaking peoples of southern Africa (Soodyall et al. 1996). This distribution suggested that deletion mtDNAs arose in central Africa and were disseminated into southern Africa via the recent Bantu expansion.

Although studies using HR-RFLP methods of analysis have provided new insights into the mtDNA variation of African populations, there has been almost no effort to relate the data sets generated by RFLP analysis to those generated by CR sequence analysis. To date, only the western-African Mandelaku (Mandenka) have been subjected to both LR-RFLP and CR sequencing analyses (Graven et al. 1995), whereas the linkage between the HR-RFLP data and the CR sequence data for the African samples analyzed by Cann et al. (1987) and Vigilant et al. (1989, 1991) has not been made explicitly clear. Thus, the relationship between HR-RFLP haplotypes and CR sequences in African populations has yet to be adequately addressed. In addition, because of the different methods of RFLP haplotyping used in the Cann et al. (1987) and Chen et al.
(1995) studies, there were some inconsistencies in the positioning of certain polymorphisms that were critical for phylogenetic reconstructions of the haplotypes defined in each study. Consequently, a more complete characterization of the distribution of genetic variation in coding (i.e., RFLP) versus noncoding (i.e., CR) regions of the mtDNA genome was needed.

Both to increase our knowledge of the genetic relationships between Khoisan (Kung San and Khoi) populations and other African peoples, and to better define the relationship between RFLP and CR sequence variation within these groups, we conducted HR-RFLP and CR sequence analyses of 74 Kung San and Khwe individuals from southern Africa and then compared the resulting data to those obtained for African populations in previous studies. This comparison revealed that many of the Kung mtDNAs clustered within a distinct sublineage (L1a2) of subhaplogroup L1a and that the Khwe are more closely genetically related to western-African (Bantu-speaking) populations than they are to the Kung San. Furthermore, the genetic relationships deduced from HR-RFLP analysis were found to be completely consistent with, and sometimes more detailed than, those obtained from CR sequencing alone. These findings implied the need for studies combining the two methods of analysis, in order to more fully understand the genetic relationships among human populations.

Subjects and Methods

Subjects

Khoisan-speaking populations can generally be divided into two distinct groups, the San and the Khoi. San populations consist of 10 different Kung groups (“!Xu” is pronounced “Kung”), as well as the //au//en, Nharo, G/wi, G//ana, !Xo, and G!aokxíte, who also speak click-languages. The Khoi consist of five populations: two Topanaar groups and the Kede, Hei//om, and Nama. A third set of southern-African populations, which coexist with Khoisan-speaking groups, are the so-called Negroids, who are largely Bantu-speakers.
These include the Kwisi, Kwadi, Cimba, Dama, Kgalagadi, and Denasena (Nurse et al. 1985). Although the two Khwe (Kwengo) groups physically appear to be “Negroid,” they speak a Khoisan language (de Almeida 1965). Consequently, they have been called “black Bushmen.” The geographic locations of the populations analyzed in this study, as well as the populations to which they are compared, are shown in figure 1.
A total of 181 Khoisan-speaking individuals of the northwestern Kalahari Desert in southern Africa formed the sample group for this study. Of this group, 144 were identified as Kung (Vasikela Kung) and 37 as Khwe (also known as “Barakwena”). Of the 181 southern-African mtDNA samples, 74 (43 Kung and 31 Khwe) were analyzed for mtDNA sequence variation. The individuals involved in this study were classified according to the ethnicity listed on either their identity documents or birth certificates. In the few cases in which this information was not available, both parents were independently questioned about their own ancestry before their child was classified as belonging to either of the two groups. Fortunately, in all cases, the family history furnished by both parents was identical, simplifying the classification of their children. Interpreters were used in the interviews and counseling sessions, and care was taken to verify the data and information gathered. Even though these two groups have been known to intermarry, no such instance was recorded for these particular individuals.

Sample Acquisition and Preparation

After informed consent was obtained, two 10-ml tubes of blood were collected from Kung and Khwe individuals, by venipuncture, in acid citrate–dextrose vacutainer tubes and then were stored at 4°C until they were shipped to Emory University. At Emory University, the blood samples from each individual were separated into their constituent cellular fractions, by low-speed centrifugation. The platelets were subsequently collected from the plasmas by centrifugation in 15-ml Corning tubes, at 5,000 g and 10°C for 20 min; all the mtDNAs used in the RFLP and CR sequence analyses were then extracted from these platelet pellets (Torroni et al. 1992).

Molecular Genetic Analysis

The entire mtDNA of each sample was subjected to HR-RFLP analysis using the primer pairs and PCR amplification conditions described by Torroni et al. (1992, 1993).
This HR-RFLP analysis defined the complete haplotype for each individual. The evolutionary relationships among the Kung and Khwe mtDNAs were further differentiated by the sequencing of both hypervariable segments (HVS-I and HVS-II) of the CR of each individual, by methods described by Schurr et al. (1999). Both hypervariable segments were PCR amplified by use of heavy-strand primer H15704 (np 15704–15723) and light-strand primer L770 (np 770–751), whereas two different sets of primers were used for sequencing: H15978 (np 15978–15997) and reverse primer L16501 (np 16501–16483) for HVS-I, and H1 (np 1–19) and L429 (np 429–412) for HVS-II. The resulting data were analyzed by SEQUENCHER 3.0 software (Gene Codes).

Phylogenetic Analyses

The phylogenetic relationships between the mtDNA haplotypes observed in the Kung and Khwe and those previously reported in the other sub-Saharan African populations (Chen et al. 1995) were inferred by parsimony analysis with PAUP 3.1.1 (Swofford 1994) and PAUP 4.0.2b (Swofford 1998). The samples included complete haplotypes of 62 Senegalese (AF01–AF24, AF26–AF36, AF45–AF59, AF64–AF65, and AF70–AF79), 17 Pygmy (AF25, AF37–AF44, AF60–AF63, and AF66–AF69), and 29 Kung and Khwe (AF46 and AF80–AF107). All dendrograms were rooted from a chimpanzee haplotype that was extrapolated from the whole mitochondrial genome sequence presented by Horai et al. (1995), by identification of all recognition sequences of the 14 enzymes used in the HR-RFLP analyses. The African haplotypes were also midpoint rooted without an outgroup. Maximum parsimony (MP) trees were generated via random addition of sequences, by the tree-bisection-and-reconnection (TBR) algorithm, with 10 MP trees being saved for each replication. Because of the numerous terminal taxa in this data set, a large number of MP trees were obtained, with 3,000 MP trees being generated after 1,388 replications. Although shorter trees could exist, none were observed in this analysis.
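The in silico step used to derive the chimpanzee outgroup haplotype (locating every enzyme recognition sequence in a published genome sequence) can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `find_sites`, the toy sequence, and the choice to treat the sequence as linear (the mitochondrial genome is circular) are assumptions made for illustration, and only a subset of the 14 enzymes is included.

```python
import re

# Recognition sequences (5'->3') for a subset of the enzymes named in the text.
# IUPAC codes used here: N = any base, W = A or T.
RECOGNITION = {
    "HpaI": "GTTAAC",
    "AluI": "AGCT",
    "HaeIII": "GGCC",
    "MboI": "GATC",
    "TaqI": "TCGA",
    "DdeI": "CTNAG",
    "HinfI": "GANTC",
    "AvaII": "GGWCC",
}

# Map each IUPAC symbol to a regular-expression fragment.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "W": "[AT]"}


def find_sites(sequence, enzymes=RECOGNITION):
    """Return {enzyme: [1-based start positions]} for a linear sequence.

    Hypothetical helper: positions are counted within the input string, not
    on the reference numbering used for the np coordinates in the text.
    """
    seq = sequence.upper()
    sites = {}
    for name, motif in enzymes.items():
        pattern = "".join(IUPAC[base] for base in motif)
        # A lookahead reports overlapping occurrences as well.
        sites[name] = [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", seq)]
    return sites


# Made-up toy sequence, not real mtDNA.
toy = "AAGTTAACGGCCTTAGCTGATC"
for enzyme, positions in find_sites(toy).items():
    print(enzyme, positions)
```

Comparing the site lists of two sequences enzyme by enzyme would then give the site gains and losses that make up an RFLP haplotype.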
Strict and 50%-majority-rule consensus trees encompassing all MP trees were also obtained, to test the consistency of the branching order in the MP trees. Similarly, the data were subjected to bootstrap analysis to test the statistical support for the branch structure of the MP trees. All RFLP haplotypes were bootstrapped through 10–500 replications, with resampling of characters (i.e., RFLPs) by the TBR algorithm, with a 50%-majority-rule bootstrap consensus tree being generated from the best trees saved after each independent analysis. Genetic distance/neighbor-joining (NJ) trees were also generated by PAUP 4.0.2b, on the basis of the mean character differences of the haplotypes. These were also subjected to bootstrap analysis (Swofford 1998).

In addition, the CR sequences obtained from the 43 Kung and 31 Khwe samples were combined with the 11 distinct CR sequences from the Botswana Kung (sequences 1, 5–9, and 11–15) reported by Vigilant et al. (1989) and were subjected to phylogenetic analysis. To distinguish between the CR sequences belonging to the two different Kung San groups, the population samples described in this study were called the “Vasikela Kung,” and those analyzed by Vigilant et al. (1989) were called the “Botswana Kung.” Phylogenies of these sequences were obtained with the NJ method (Saitou and Nei 1987) implemented in the MEGA 1.01 statistical package (Kumar et al. 1993). The evolutionary distances between pairs of CR sequences were estimated as p distances, the proportion (p) of nucleotide sites at which the pair of sequences being compared differed, as calculated by dividing the number of nucleotide differences (nd) by the total number of nucleotides compared (n). Similar genetic distances were obtained with the other algorithms available in MEGA, but they are not presented here. All p distances were used with the NJ method, to produce phylogenetic trees.
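The p distance defined above (p = nd/n) is simple enough to show as a short sketch. The function names and the toy sequences are hypothetical, and the decision to skip sites where either sequence has a gap or an ambiguous base is an assumption of this example, not a statement about how MEGA 1.01 handled such sites.

```python
def p_distance(seq_a, seq_b):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0     # n: sites where both sequences have an unambiguous base
    differences = 0  # nd: compared sites where the bases differ
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(sequences):
    """Pairwise p distances for a dict {name: aligned sequence}."""
    names = list(sequences)
    return {
        (x, y): p_distance(sequences[x], sequences[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Made-up aligned toy sequences, not real CR data.
toy = {
    "seq1": "ACGTACGTAC",
    "seq2": "ACGTACGTTT",
    "seq3": "ACGTACGTAC",
}
for pair, d in distance_matrix(toy).items():
    print(pair, d)
```

A matrix of this kind is the input that a neighbor-joining routine clusters into a tree.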
The NJ trees generated from HVS-I sequences were rooted by use of either a chimpanzee (Horai et al. 1995) or the Neanderthal (Krings et al. 1997) sequence, and those generated from both HVS-I and HVS-II data were rooted from only the chimpanzee sequence, because no HVS-II data were available for the Neanderthal mtDNA. Unrooted NJ trees were midpoint rooted without an outgroup sequence. The branching structure of these NJ trees was, in turn, tested by bootstrap analysis, which produced bootstrap confidence-level estimates for each interior branch.

Sequence-Divergence Estimates

To calculate the genetic divergence of the southern-African populations, as well as the divergence within specific haplogroups and their sublineages, we used the iterative maximum likelihood (ML) method of Nei and Tajima (1983). The interpopulation ML estimates were obtained by taking a weighted average of the number of individuals in each population. “Corrected” interpopulation values (Corr) were obtained by taking an average of two intrapopulation estimates and subtracting that value from the uncorrected interpopulation value; that is, Corr = δ_xy − [(δ_x + δ_y)/2], where x and y are the two populations being analyzed. The same method was used for estimation of the divergence of haplogroups and their sublineages. When calculating the divergence times of these haplogroups and of their sublineages, we used the mtDNA evolutionary rate of 2.2%–2.9% per million years (Torroni et al. 1994c).

Results

Southern-African RFLP Haplotypes

A total of 29 haplotypes (AF46 and AF80–AF107) defined by 47 restriction-site polymorphisms were observed among the 74 South African individuals analyzed (Appendix A). One of these haplotypes, AF46, was described in our previous study of African mtDNA (Chen et al. 1995), whereas haplotypes AF80–AF107 were observed for the first time in African populations.
A total of 18 haplotypes were detected in the Vasikela Kung, and 13 haplotypes were found in the Khwe, only 2 of which (i.e., AF85 and AF99) were shared between them (table 1). Of these southern-African haplotypes, 77% clustered into haplogroup L*, including 84% of the Vasikela Kung mtDNAs and 68% of the Khwe mtDNAs.
Haplotypes belonging to haplogroup L* have previously been observed in the western-African Senegalese (Chen et al. 1995; Graven et al. 1995) and Biaka and Mbuti Pygmy (Chen et al. 1995) populations, at very high frequencies, as well as at similar frequencies in the Bamileke of Cameroon (Scozzari et al. 1994) and in various southern-African groups (Soodyall and Jenkins 1992, 1993). However, no comparable haplotypes have been observed in European, Middle Eastern, or Asian populations (Ballinger et al. 1992; Torroni et al. 1994a, 1994b, 1994c, 1996), with the exception of those groups with a history of contact with African populations (Bonné-Tamir et al. 1986; De Benedictis et al. 1989; Semino et al. 1989; Ritte et al. 1993). Thus, these results substantiate the African origin of this mtDNA lineage, as well as its widespread distribution, albeit at varying frequencies, in all African populations.

Consistent with a previous study of Khoisan mtDNA variation (Soodyall et al. 1996), none of the Vasikela Kung or the Khwe haplotypes exhibited any length polymorphisms in the COII/tRNALys intergenic region. These results support the conclusion that deletion haplotypes from haplogroup L* originated not in Khoisan populations but, instead, in Bantu-speaking and/or Pygmy groups of western and central Africa.

Phylogenetic Analysis of RFLP Haplotypes

The phylogenetic relationships among the mtDNA haplotypes observed in the Vasikela Kung and Khwe and those previously reported in the other sub-Saharan African populations were assessed by both MP and genetic-distance/NJ analysis. The MP tree of African mtDNAs is presented in figure 2.
From the MP tree, it is clear that the African mtDNA phylogeny forms a successive series of branches, each one occupied by clusters of related mtDNA haplotypes (fig. 2).
The upper one-third of the mtDNA phylogeny encompasses mtDNAs that lack the np-3592 HpaI site, which, in our previous study, we had designated “non-L” (Chen et al. 1995). This group was subsequently redefined as “L3,” by Watson et al. (1997). When this nomenclature is adopted, L3 is subdivided into four haplogroups: L3a, associated with a MboI np-2349 site gain; L3b, associated with a MboI np-8616 site loss; L3c, associated with a TaqI np-10084 site gain; and L3d, associated with a DdeI np-10394 site loss (fig. 2).

L1, L2, and L3 each encompass a sequence diversity that is approximately equivalent to those of other continent-specific haplogroups (Ballinger et al. 1992; Torroni et al. 1994a, 1994b; Chen et al. 1995). Therefore, it is reasonable to designate these as haplogroups and to designate the combination of L1 and L2, defined by the presence of the HpaI site at np 3592, as a macrohaplogroup, “L*.” This then relegates L1a, L1b, L2a–L2c, and L3a–L3d to the level of subhaplogroups. Of our L3 subhaplogroups (i.e., L3a–L3d), L3a and L3b are identical to similarly named groupings recognized by Watson et al. (1997).

Within the African MP phylogeny (fig. 2)

To establish the reliability of the major branches of our African mtDNA MP tree, we subjected the MP tree to a bootstrap analysis. Because of the large number of taxa (haplotypes) and character states (restriction-site variants) in our data set, the 50%-majority-rule bootstrap tree was unable to provide confidence values for a number of the internal branches. To facilitate the analysis, we reduced the number of taxa from each of the major clusters seen in figure 2.

To further define the African mtDNA phylogeny, we performed a genetic distance/NJ analysis on the same data set, using PAUP 4.0.2b. The resulting NJ tree (fig. 3)
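The marker definitions given above lend themselves to a small lookup sketch. This is only an illustration of how the stated site gains and losses could be turned into a first-pass label; it is not the phylogenetic procedure used in the study. The function name, the representation of a haplotype as two sets of (enzyme, np) tuples, and the handling of haplotypes that carry no marker or more than one marker are assumptions. The additional RFLPs that split L* into L1 and L2 are not listed in this text, so the sketch stops at "L*".

```python
# Diagnostic sites for the L3 subdivisions, as stated in the text.
L3_MARKERS = [
    ("L3a", "gain", ("MboI", 2349)),
    ("L3b", "loss", ("MboI", 8616)),
    ("L3c", "gain", ("TaqI", 10084)),
    ("L3d", "loss", ("DdeI", 10394)),
]

# The HpaI site at np 3592 defines macrohaplogroup L* (L1 plus L2).
L_STAR_SITE = ("HpaI", 3592)


def classify(gains, losses):
    """First-pass label from sets of (enzyme, np) site gains and losses.

    Hypothetical helper: the HpaI np-3592 site is checked first, because the
    text notes that the MboI np-2349 gain also occurs in some L1b mtDNAs.
    """
    if L_STAR_SITE in gains:
        return "L*"
    matches = [
        name
        for name, kind, site in L3_MARKERS
        if site in (gains if kind == "gain" else losses)
    ]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        return "L3 (subhaplogroup unassigned)"
    return "L3 (ambiguous: " + ", ".join(matches) + ")"


# Made-up example haplotypes.
print(classify({("HpaI", 3592), ("MboI", 2349)}, set()))
print(classify({("MboI", 2349)}, set()))
print(classify(set(), {("DdeI", 10394)}))
```

A label produced this way is only as reliable as the single diagnostic site it rests on, which is why the tree-based analyses remain the basis for the assignments discussed here.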
To test the robustness of the African NJ phylogeny, we applied bootstrap analysis using PAUP 4.0.2b. Since the genetic distances used to construct the NJ trees represent single summaries of the genetic relationships among taxa (Saitou and Nei 1987), the number and complexity of the taxa were reduced. Consequently, bootstrap analysis of the NJ-tree genetic-distance data gave confidence values, for most branches, that were in the 51%–94% range. Again, however, some branches could not be assigned numerical values (fig. 3).

Since both the MP and the genetic distance/NJ methods produced nearly identical trees, with the major branches having high bootstrap values, it is apparent that the African mtDNA phylogenies shown in figures 2 and 3

At present, the relationship of the African mtDNA phylogeny to the mtDNA phylogenies of Europe and Asia can only partially be deduced. If included in the African phylogenetic analysis, all European mtDNAs would cluster outside haplogroups L1 and L2 and would form separate branches within haplogroup L3. This is because all European haplogroups are defined by distinct sets of RFLPs not usually present in African haplogroups (Torroni et al. 1996, 1998). Many of the L3 mtDNAs have the DdeI site at np 10394, a characteristic also seen in European haplogroups I–K; others, such as AF01, AF02, and AF03 of L3d, lack the DdeI np-10394 site, which is characteristic of European haplogroups H and T–X. Among the African mtDNAs lacking the DdeI np-10394 site, haplotypes AF01 and AF02 from the Senegalese have a HinfI site at np 12308, which characterizes haplogroup U in European populations (Torroni et al. 1996). This raises the possibility that the L3d lineage may have given rise to several of the European haplogroups; however, it is also possible that the haplogroup U mtDNAs observed in these African samples may be the product of back-migrations into Africa from Europe.

The Asian mtDNA phylogeny is subdivided into two macrohaplogroups, one of which is M.
M is delineated by a DdeI site at np 10394 and an AluI site at np 10397. The only African mtDNA found to have both of these sites is the Senegalese haplotype AF24. This haplotype branches off African subhaplogroup L3a (figs. 2 and 3).

Interestingly, the MboI np-2349 site gain is shared by the haplogroup L3a mtDNAs in the Vasikela Kung and the Khwe populations and a subset of subhaplogroup L1b mtDNAs from Senegalese populations, implying a possible linkage between these population groups. Unlike subhaplogroup L1b mtDNAs, however, these L3a mtDNAs do not have either the HinfI np-10806 site gain or the AluI np-7055 and RsaI np-2758 site losses that define haplogroup L1b. Thus, it appears that the haplogroup L3a mtDNAs acquired the MboI np-2349 site independently of the subhaplogroup L1b mtDNAs.

Haplogroup Distribution in African Populations

The overall distribution of mtDNA haplogroups in African populations, including the Vasikela Kung and the Khwe, is shown in table 2. Both the Mbuti and Biaka Pygmy populations have 100% macrohaplogroup L* mtDNAs. In addition, macrohaplogroup L* haplotypes constitute the large majority of the mtDNAs of the Vasikela Kung, Khwe, and Senegalese groups. However, these groups also encompass significant frequencies of L3 haplotypes, with subhaplogroups L3a–L3d being nonuniformly distributed among these populations. Furthermore, there is some degree of geographic partitioning of the clusters within macrohaplogroup L* (table 2), a trend also noted by Watson et al. (1997). The distribution of haplogroups among these African populations implies that they have experienced quite different genetic histories and that macrohaplogroup L* has undergone considerable regional diversification in Africa.

The Vasikela Kung and the Khwe encompass different arrays of mtDNA haplotypes. Of those present in the Vasikela Kung, 79% and 5% are found in haplogroups L1 and L2, respectively, with these haplotypes belonging to subhaplogroups L1a and L2b (table 2).
In addition, Vasikela Kung subhaplogroup L1a mtDNAs segregate into two distinct lineages: L1a2, which encompasses >51% of their mtDNAs (AF87–AF91 and AF93–AF95), and L1a1, which encompasses 28% of their mtDNAs (AF96–AF97 and AF99–AF101) (figs. 2 and 3).

Among the Khwe, 52% and 16% of the mtDNAs are found in haplogroups L1 and L2, respectively (table 2). With the exception of AF104, which is located in subhaplogroup L1b, within a cluster of related western-African mtDNAs, all the Khwe mtDNAs from haplogroup L1 cluster within subhaplogroup L1a. Within L1a, 32% of the Khwe mtDNAs (AF98–AF99 and AF102–AF103) belong to lineage L1a1, and only 16% occur in lineage L1a2. Lineage L1a1 is also present in the Vasikela Kung and Biaka Pygmies, but L1a2 is found almost exclusively in the Vasikela Kung, with only one Khwe haplotype (AF92) being found in this cluster. Hence, the presence of a Khwe haplotype in lineage L1a2 is most likely due to admixture with Khoisan-speakers.

Both the Vasikela Kung and the Khwe possess haplotypes belonging to subhaplogroup L2b (table 2). The two Vasikela Kung haplotypes from this subhaplogroup (i.e., AF106 and AF107) are closely related. The two Khwe haplotypes include one (AF46) that is the same as that occurring in three Wolof from Senegal (Chen et al. 1995) and another (AF105) that is found in three Khwe and that occupies the nodal position in haplogroup L2; hence, AF105 could be the founding haplotype for this subhaplogroup. This distribution raises the possibility that the L2b haplotypes found in the Vasikela Kung were acquired by admixture with the Khwe.

Both southern-African populations also have haplogroup L3 mtDNAs (table 2). Two of the L3 haplotypes in the Kung (i.e., AF85 and AF86) are quite distinctive and cluster on a distant branch of haplogroup L3b, with haplotype AF85 also occurring in the Khwe. Khwe haplotypes AF80–AF84 all cluster together within haplogroup L3a, with AF82 being shared with the Kung.
This pattern of haplotype sharing suggests that the L3 haplotypes in the Vasikela Kung and the Khwe are probably of Khwe origin. Hence, it seems probable that lineage L1a2 mtDNAs originated in the Vasikela Kung and were disseminated into the Khwe, whereas the L2 and L3 mtDNAs originated in the Khwe and were passed on to the Vasikela Kung. HR-RFLP analysis also reveals several distinct sublineages that appear to be population specific (figs. 2 and 3).
Sequence Divergence of African Haplogroups

On the basis of intrapopulation ML calculations, we estimated the divergence times for all major clusters of mtDNAs thus far observed in African populations (table 4 and Appendix B). The sequence-divergence value for all African HR haplotypes is 0.364%, an estimate that gives a maximum age, for the most recent common ancestor (MRCA), of ~126,000–166,000 years before the present (YBP). Similarly, macrohaplogroup L* shows a sequence divergence of 0.356%, which gives a divergence time of 123,000–162,000 YBP. These findings are concordant with earlier estimates of African mtDNA divergence, which gave coalescence values of 110,000–160,000 YBP (Chen et al. 1995; Graven et al. 1995; Horai et al. 1995; Vigilant et al. 1989, 1991; Watson et al. 1997), and confirm that macrohaplogroup L* is the most divergent of all haplogroups in human populations (Chen et al. 1995). These data further support the hypothesis that, in being the oldest mtDNAs in human populations, these African haplotypes form the root of the modern human mtDNA phylogeny (figs. 2 and 3).
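The arithmetic behind these ages, and behind the corrected interpopulation values defined in Subjects and Methods, can be sketched in a few lines. This assumes that an age is obtained by dividing the percent sequence divergence by the evolutionary rate of 2.2%–2.9% per million years, which reproduces the reported ranges to within rounding. The function names are hypothetical, and the interpopulation value used in the example is a made-up placeholder, because the table 5 values are not reproduced in this text.

```python
def age_range_years(divergence_pct, slow_rate=2.2, fast_rate=2.9):
    """Return (youngest, oldest) age in years for a percent divergence.

    Rates are in percent sequence divergence per million years.
    """
    youngest = divergence_pct / fast_rate * 1_000_000
    oldest = divergence_pct / slow_rate * 1_000_000
    return youngest, oldest


def corrected_divergence(inter_xy, intra_x, intra_y):
    """Corr = d_xy - (d_x + d_y) / 2, all values in percent."""
    return inter_xy - (intra_x + intra_y) / 2


# Divergence values quoted in the text.
for label, value in [("all African haplotypes", 0.364),
                     ("macrohaplogroup L*", 0.356)]:
    low, high = age_range_years(value)
    print(f"{label}: {low:,.0f} to {high:,.0f} years")

# Intrapopulation values quoted in the text for the Vasikela Kung (0.320)
# and the Khwe (0.277); the interpopulation value 0.350 is hypothetical.
print(corrected_divergence(0.350, 0.320, 0.277))
```

For the 0.364% value this gives roughly 125,500 and 165,500 years, in line with the range quoted in the abstract.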
Within macrohaplogroup L*, the major clusters of African mtDNAs show varying levels of sequence diversity (table 4). Haplogroup L1 is nearly as diverse as macrohaplogroup L*, whereas haplogroup L2 is half as diverse. This indicates that haplogroup L1 originated well before the emergence of haplogroup L2. Among the smaller haplotype clusters within macrohaplogroup L*, subhaplogroup L1a has the largest divergence value, whereas subhaplogroup L2c has the smallest value, implying that there are considerable differences in times of origin for these haplotype clusters within African populations. Haplogroup L3 is less divergent than L1, and the subhaplogroups of L3 are generally less divergent than those of L1. Only haplogroup L3b is comparable, in terms of sequence diversity, to the oldest subhaplogroups of L1 and L2; however, the L3b value could be inflated, because of the inclusion of the highly divergent haplotypes AF85 and AF86, since these southern-African haplotypes are at least five mutations different from any other haplotypes (AF04–AF06 and AF08–AF11) of this Senegalese-specific subhaplogroup (Chen et al. 1995). Therefore, it appears that the L3 subhaplogroups arose long after the radiation of haplogroup L1 and approximately at the time that haplogroup L2 split from haplogroup L1. Sequence-divergence estimates were also calculated for each of the population-specific sublineages (table 3). These results show that sublineages found in the Mbuti Pygmies (γ) and the Senegalese (δ) have approximately comparable divergence values (0.042%–0.052%). By contrast, the Vasikela Kung lineage (α) is twice as diverse (0.119%), and the Biaka Pygmy (β) is four times as diverse (0.225%), as the Mbuti Pygmies and Senegalese. 
These diversity levels suggest that the Biaka Pygmies of sublineage β may be one of the oldest distinct African populations and, hence, one of the oldest human populations in the world, although other demographic factors could have contributed to this level of diversity. The Vasikela Kung sublineage α may be the other truly ancient African mtDNA cluster. Not only is sublineage α of the Kung the second most diverse, but it is positioned at the deepest root of the African phylogenetic tree (figs. 2 …).

Genetic Divergence of African Populations

The intra- and interpopulation sequence divergences within and between African populations are shown in table 5. The Biaka Pygmies have the highest intrapopulation sequence divergence (0.342%), followed by the Vasikela Kung (0.320%) and then the Khwe (0.277%). In addition, all the interpopulation sequence-divergence values between Vasikela Kung and other African populations are higher than those between the Khwe and other African populations. Thus, the Vasikela Kung are more divergent from other African populations than are the Khwe. This finding is consistent both with the Vasikela Kung being genetically distinct from the Khwe (Nurse et al. 1977) and with the Khwe being more closely related to the Negroid populations of South Africa, who, in turn, show greater affinities with populations from western and central Africa.
These same relationships are seen quite clearly in the NJ tree based on these ML estimates (fig. 4).

Distribution of CR Sequences in Southern-African Populations

A total of 640 nucleotides from HVS-I and HVS-II of the CRs for all 74 southern-African individuals were sequenced. The sequence data from the Vasikela Kung and Khwe are shown in table 6. A total of 24 distinct CR sequences based on 65 variable nucleotides were detected. Of the 65 variable nucleotides found in this study, only 31 had previously been reported in the study of CR sequence variation in the Botswana Kung (Vigilant et al. 1989). The vast majority (84%) of these nucleotide substitutions are transitions, both in the HVS-I and in the HVS-II, a result that is consistent with patterns of variation observed in previous CR-sequencing studies of human populations (Horai and Hayasaka 1990; Vigilant et al. 1991).
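As a reminder of how a transition share such as the one reported above is tallied, the following minimal sketch classifies substitutions as transitions (purine to purine, or pyrimidine to pyrimidine) or transversions. The function names and the example substitutions are hypothetical and are not data from table 6.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    if ref == alt:
        raise ValueError("not a substitution")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def transition_fraction(substitutions):
    """Fraction of (ref, alt) pairs that are transitions."""
    flags = [is_transition(ref, alt) for ref, alt in substitutions]
    return sum(flags) / len(flags)


# Hypothetical substitutions, for illustration only (not data from table 6).
example = [("C", "T"), ("A", "G"), ("T", "C"), ("A", "C")]
print(transition_fraction(example))
```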
The distribution of CR sequences in the Vasikela Kung and the Khwe shows the two populations to be quite distinct from each other; indeed, there is virtually no overlap of their CR sequences (table 6). Therefore, each CR sequence has been given a two-letter prefix indicating its probable population of origin, as shown in figure 5.
Phylogenetic Analysis of CR Sequences

The phylogenetic relationships among the CR sequences of the Vasikela Kung, Khwe, and Botswana Kung (Vigilant et al. 1989) are shown in figure 5. All L1a subhaplogroup CR sequences (both those in L1a1 and those in L1a2) cluster together with one another and apart from those belonging to haplogroups L2 and L3 (fig. 5).

Although there is a good correspondence between the RFLP haplotypes and CR sequences, specific RFLP haplotypes have multiple CR sequences associated with them (e.g., AF93 and CR sequences VK3 and VK5). In other cases, the same CR sequence is associated with multiple RFLP haplotypes (e.g., CR sequence KW15 with AF80–82 and AF84). These observations indicate that both HR-RFLP and CR sequence data provide valuable phylogenetic information about human mtDNAs and are more valuable when used together than when employed separately.

With regard to the population-specific clusters, the CR sequences for the Kung San populations generally segregate into distinct clusters of related types. With one exception (BK12), all the Botswana Kung CR sequences (BK01–BK11 and BK13–BK15) cluster together within lineage L1a2 (fig. 5).

Discussion

The Origins of African Populations

Studies of African mtDNA sequence variation that have used HR-RFLP analysis (Chen et al. 1995; present study) have confirmed the ubiquity of macrohaplogroup L* in African populations. Moreover, we have documented that macrohaplogroup L* is the oldest African and human mitochondrial lineage, encompassing a sequence diversity of 0.364% and having an estimated age of ~126,000–165,000 YBP; as such, it must be considered the root of the human mtDNA phylogeny, from which all other haplogroups evolved. Macrohaplogroup L* is further subdivided into two major haplogroups: L1 and L2 (figs. 2 …).

Haplogroup L2 is considerably younger than haplogroup L1, being approximately half the age of the latter. It is subdivided into three subhaplogroups: L2a, L2b, and L2c. 
L2a contains the core population-specific haplotypes (γ) of the Mbuti Pygmies, whereas L2c contains the core haplotypes (δ) of the Bantu-speaking Senegalese (figs. (figs.22 The existence of four distinct L3 subhaplogroups (Chen et al. 1995) also has been confirmed. Of these, L3a and L3b are the oldest, whereas L3c and L3d are of more recent origin. In addition, L3 mtDNAs have been found to be much more closely related to those from L2 than either is to L1, by both RFLP and CR sequence analysis (figs. (figs.22 An unexpected result is the marked difference, in haplotype distribution, between the Mbuti and Biaka Pygmy populations. Not only do the mtDNAs of these two populations form distinct clusters by parsimony analysis (figs. (figs.22 The Vasikela Kung mtDNA data also suggest a relatively ancient origin for the Kung San. The population-specific haplotypes of the Biaka Pygmies (β) and the Vasikela Kung (α) are both located in the ancient African haplogroup L1. Moreover, the Vasikela Kung sublineage α haplotypes form the deepest root of the African tree, relative to the chimpanzee outgroup haplotype (figs. (figs.22 The Relationship between the Kung and the Khwe The Vasikela Kung and the Khwe both speak Khoisan languages. However, the Khwe have a physical appearance that is more similar to that seen in many Bantu-derived populations than to that of the Vasikela Kung. Therefore, the Khwe may have originated from the Bantu migration into southern Africa and subsequently may have adopted the Khoisan language. This possibility is supported both by the striking differences between the population-specific haplotypes of the Kung San (α) and those of the Bantu-speaking Senegalese (δ) and by the greater similarities between the Khwe and Senegalese haplotypes than between the Kung and the Khwe haplotypes. To most effectively address this question, we have compared the mtDNA variation in all of the southern-African populations that have been studied thus far. 
However, with the exception of our current and previous (Chen et al. 1995) studies, all African mtDNA RFLP variation has been analyzed by LR-RFLP methods rather than by HR-RFLP methods. As noted above, LR-RFLP analysis typically employs the five rare-cutting restriction endonucleases—HpaI, BamHI, HaeII, MspI, and AvaII—and, in some instances, HincII also has been included (Johnson et al. 1983; Scozzari et al. 1988; Soodyall and Jenkins 1992, 1993). Fortunately, the most informative restriction-site polymorphisms for African mtDNAs are recognized by HpaI, MspI, and AvaII, which are included in both LR- and HR-RFLP analysis. Hence, once nomenclatural differences between LR- and HR-RFLP haplotypes have been resolved, a constructive population comparison can be made for African populations, by use of both types of data. LR-RFLP analysis involves examination of restriction-fragment patterns by use of Southern blot analysis, with a unique restriction-endonuclease pattern being called a morph and being given a specific morph number. The sum of all of the individual morphs for a particular mtDNA is defined as its mtDNA type. The LR-RFLP types are numbered and described by listing the morph numbers of the five restriction endonucleases, in the order HpaI-BamHI-HaeII-MspI/HpaII-AvaII. Thus, type 1 is defined by morph-2 for HpaI, morph-1 for BamHI, morph-1 for HaeII, morph-1 for MspI/HpaII, and morph-1 for AvaII, which can be written, in shorthand form, as “2-1-1-1-1.” Later, when HincII was added to the enzyme set, its morph was appended to the type number, such that a type 1 mtDNA with a HincII morph-2 became type 1-2. Most of the variation represented by these morphs can now be equated with the presence or absence of specific restriction sites identified by HR-RFLP analysis. Thus, the LR-RFLP types can be equated with HR-RFLP haplogroups, at the restriction site—and, thus, at the nucleotide level. These associations can be further refined by use of CR sequence data. 
HpaI morph-3 is the most common African HpaI restriction pattern. By contrast, HpaI morph-2 delineates all L3 African mtDNAs and is found in all European and Asian mtDNAs. HpaI morph-3 differs from morph-2 by having the HpaI site gain at np 3592. Thus, HpaI morph-3, with its np-3592 site, defines macrohaplogroup L* and distinguishes L* from L3 and the remaining global mtDNAs. Accordingly, all African LR-RFLP types that start with a 3 (e.g., 3-1-1-1-1-2, or type 7-2) can be considered L* (L1 or L2) mtDNAs, whereas those that start with a 2 (e.g., 2-1-1-1-1-2, or type 1-2) can be considered L3 mtDNAs (table 7).
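The rule above can be sketched in a few lines: the shorthand is split into per-enzyme morphs in the order HpaI-BamHI-HaeII-MspI/HpaII-AvaII, with an optional sixth position for HincII, and the first position alone decides between L* and L3. The function names are illustrative, and the rule is meant only for African LR-RFLP types, as in the text.

```python
# Enzyme order of the LR-RFLP shorthand, as described in the text.
ENZYMES = ["HpaI", "BamHI", "HaeII", "MspI/HpaII", "AvaII"]


def parse_lr_rflp_type(shorthand):
    """Split a shorthand such as '3-1-1-1-1-2' into per-enzyme morphs."""
    morphs = shorthand.split("-")
    if len(morphs) not in (5, 6):
        raise ValueError(f"expected 5 or 6 morphs, got {len(morphs)}")
    parsed = dict(zip(ENZYMES, morphs[:5]))
    if len(morphs) == 6:  # optional sixth position: HincII morph
        parsed["HincII"] = morphs[5]
    return parsed


def classify_by_hpai(shorthand):
    """Assign an African type to L* (L1 or L2) or L3 from its HpaI morph."""
    hpai = parse_lr_rflp_type(shorthand)["HpaI"]
    if hpai == "3":
        return "L* (L1 or L2)"
    if hpai == "2":
        return "L3"
    return "unclassified"


print(classify_by_hpai("3-1-1-1-1-2"))  # type 7-2 in the text
print(classify_by_hpai("2-1-1-1-1-2"))  # type 1-2 in the text
```

Morphs are kept as strings rather than integers so that primed morph labels such as 3′ can be carried through unchanged.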
Similarly, the MspI/HpaII morphs delineate major subsets of African mtDNAs. These are represented by the fourth number in the LR-RFLP type designation (table 7). MspI-2 is defined by site losses at np 8112 and np 8150. These are the primary site changes that define the HR-RFLP lineage L1a2 and the Kung San–specific sublineage α (figs. 2 …).

Several of the AvaII morphs also correspond to RFLPs that define major sublineages of the African phylogeny. The first is AvaII-2, which is defined by a site gain at np 8249. This site appears in haplotypes from lineage L1a2 and from the Vasikela Kung–specific sublineage α (figs. 2 …).

Therefore, the LR-RFLP sites for HpaI-3, for MspI-2, -3, and -3′, and for AvaII-2 all help to define the lineage L1a2, which encompasses the Kung-specific mtDNA sublineage α. Consequently, mtDNA types 3-2, 4-2, 4-9, 4′-2, 5-2, 14-2, and 32-2 (all HpaI-3 and MspI-2 or 3) can be considered to be of Kung San origin (table 7). As a result, these mtDNA types can be used to determine the relative representation of Kung San–derived mtDNAs in other African populations that have been analyzed by LR-RFLP analyses (table 7). By contrast, mtDNAs that are HpaI-2 and MspI-1 belong to haplogroup L3, with type 1-2 (i.e., 2-1-1-1-1-2) belonging to subhaplogroup L3a and with type 21-2 (i.e., 2-1-1-1-2-2) belonging to subhaplogroup L3b. Since the Senegalese have a much higher frequency of L3 types than does the Kung population, the presence of these mtDNAs is more indicative of a Bantu origin than of a Kung origin (table 7).

Using these distinctions, we can now compare our current data on the Vasikela Kung and Khwe with those from other LR-RFLP studies of the Kung San and Bantu populations (table 7). The three Kung San groups that have been studied are the Vasikela Kung, the Botswana Kung, and the Sekele Kung. 
The single Khoi group that has been examined is the Nama, whereas the Bantu-speaking groups that have been analyzed include the Bantu, Herero, Dama, and Ambo. The Kung-associated mtDNA types (3-2, 4-2, 4-9, 4′-2, 5-2, 14-2, and 32-2) constitute 51% of the Vasikela, 88% of the Botswana, and 59% of the Sekele Kung mtDNAs (table 7). Of these, types 4-2, 4-9, and 4′-2 are Kung specific, whereas types 3-2 and 7-2 are present in all southern-African populations, although at generally higher frequencies in the Kung and Khoi (Nama) (table 7). By contrast, the Khwe lack virtually all the Kung San–specific L1a2 types and, instead, exhibit 29% of the Bantu-associated types 1-2 and 21-2 of haplogroup L3. This frequency of L3 mtDNAs in the Khwe is comparable to the 18%–83% seen in the Bantu-derived groups but is quite different from the 3%–12% seen in the Kung San populations (table 7). Thus, on the basis of these data, as well as on the basis of genetic-distance analysis (fig. 4) …

Since nearly all of the haplotypes in Bantu-speaking populations from western Africa and in the Pygmy populations from eastern and central Africa are MspI-1 (Chen et al. 1995; Graven et al. 1995; authors' unpublished data), MspI-2 must be of Khoisan origin. This further supports the conclusion that the MspI-2 mtDNAs in the Bantu-speaking populations have been acquired through genetic admixture with Khoisan-speaking populations since the Bantu expansion into southern Africa.

The population distribution of the individual LR-RFLP morphs can also permit deduction of their interrelatedness; for example, AvaII-5 is defined by the np-8249 site gain of AvaII-2 and by the np-16390 site loss of AvaII-3. Therefore, it could have derived from either of these morphs. However, both AvaII-2 and AvaII-5 occur primarily in the Kung San, whereas AvaII-3 does not. Therefore, it is most likely that AvaII-5 derived from AvaII-2 within the Kung population, through the loss of the np-16390 site. 
A parallel and independent mutation must then have created the np-16390 site loss seen in AvaII-3. This conclusion is supported by the fact that the np-16390 site is located in the hypervariable CR at the very end of HVS-I and, thus, is prone to repeated mutational events. Correlation between RFLP and CR Sequence Data In most respects, the phylogenetic relationships between the CR sequences of the Kung and Khwe correlated well with those seen for the combined LR- and HR-RFLP haplotype data (fig. 5 The conclusion that the deepest root of the African mtDNA tree occurs at the Vasikela Kung sublineage α—and that the Kung and the Biaka and Mbuti Pygmies harbor distinctive groups of mtDNA types—is supported by both RFLP and CR sequence data (figs. (figs.22 The importance of macrohaplogroup L*, with its component haplogroups L1 and L2, as opposed to haplogroup L3 (non-L), was also confirmed in another study, by Watson et al. (1997), of African mtDNA CR sequence variation. In that study, a total of 407 African samples, including Kung San, Mbuti and Biaka Pygmies, and multiple Bantu-derived populations, were analyzed, by CR sequencing and screening, for both the presence of the HpaI np-3592 site (delineating L*) and the absence of the AvaII np-16390 site (equal to the presence of the HinfI site at np 16389 and delineating L2) (figs. (figs.22 The primary question addressed by Watson et al. (1997) is the number and nature of African population expansions. To address this question, they examined the sequence diversity of these samples, partitioning their CR sequences into those which were observed in more than one population (87%) versus those which were found in only one population (13%). The multipopulation CR sequences were then used to construct a median network, which resulted in the separation of these sequences into haplogroups L1 (a and b), L2, and L3 (a–c). 
Estimation of the distribution of pairwise differences within and between haplogroups L1a, L1b, L2, L3a, and L3b revealed all of them to have unimodal distributions, a finding that is consistent with each haplogroup being associated with a different population expansion. The remaining population-specific CR sequences (termed “isolated lineages”) were then combined into a new category termed “L1i.” A median network of the L1i sequences converged on an MRCA having CR transition mutations at np 16129, 16187, 16189, 16223, 16230, 16278, and 16311; however, this L1i network was not starlike, and the pairwise distribution of CR sequences was not unimodal.

The nature of the L1i haplogroup of Watson et al. (1997) can be more fully explained if it is compared with our data (Chen et al. 1995; present study). By placing all of the population-specific CR sequences into L1i, Watson et al. (1997) combined mtDNAs from the Kung San–specific sublineage α from L1a, the Biaka Pygmy sublineage β from L1b, the Mbuti Pygmy sublineage γ from L2a, and the Senegalese sublineage δ from L2c. In addition, a comparison of the variant sites of their MRCA with the data from our study shows that it has the greatest similarities to our CR sequences 4, 5, 7, and 8 of lineage L1a2, sequence 9 of subhaplogroup L1b, and sequences 22–24 of lineage L1a1 (fig. 5).

Our estimates of the age of the African population and its haplogroups are also comparable to estimates made by other investigators. In our studies, the antiquity of African mtDNAs has been demonstrated by means of a sequence-divergence time of 125,500–165,500 YBP (Chen et al. 1995; present study). Similarly, the coalescence time for the MRCA, as calculated by Watson et al. (1997) on the basis of their L1i cluster, was 111,000 ± 5,700 YBP; that estimated by Horai et al. (1995) on the basis of whole mtDNA sequences was 143,000 ± 18,000 YBP; and that estimated by Cann et al. (1987) on the basis of RFLP haplotype data was 120,000 YBP. 
By contrast, the age of haplogroup L2 was much younger, being 59,000–77,700 YBP by our analysis, 56,000 ± 3,000 YBP according to Watson et al. (1997), and 70,333 ± 25,710 YBP according to Graven et al. (1995). Finally, the age of haplogroup L3 is 78,300–103,200 YBP by our analysis, 60,000 ± 2,400 YBP according to Watson et al. (1997), and 66,321 ± 24,965 YBP according to Graven et al. (1995). Thus, all of the most recent studies show consistent trends in the diversity and ages of the major clusters of mtDNAs in African populations. The other interesting question raised both by our analyses (Chen et al. 1995; present study) and by that of Watson et al. (1997) is the origin and dispersal of haplogroup L3 mtDNAs. These haplotypes are widely dispersed in eastern Africa but are less prevalent in western and southern Africa (table 2; also see Watson et al. 1997; Passarino et al. 1998). Moreover, there are several different subhaplogroups within L3 (figs. (figs.22 Since the subhaplogroups of L3 are the most likely precursors of modern European and Asian mtDNA haplotypes (Chen et al. 1995; Watson et al. 1997), their sequence variation and age are of considerable importance in the determination of the timing and process by which these mtDNAs were dispersed out of Africa. In this regard, subhaplogroups L3a and L3b appear to be the oldest of the L3 subhaplogroups, dating to 41,400–54,500 YBP and 77,600–102,300 YBP, respectively. These estimates are somewhat similar to the 60,000 ± 3,200 YBP and 44,000 ± 3,000 YBP coalescence times that Watson et al. (1997) calculated for the analogous subhaplogroups. In contrast, subhaplogroups L3c and L3d have somewhat younger divergence times—28,300–37,300 YBP and 17,600–23,200 YBP, respectively (table 4)—suggesting that they emerged after the evolution of L3a and L3b. We have also observed that subhaplogroup L3d lacks the DdeI np-10394 site, a polymorphism that it has in common with the vast majority of European haplogroups (H and T–X). 
Although our estimate of the age of subhaplogroup L3d is younger than those of most European haplogroups (Torroni et al. 1994a, 1996, 1998), the relatively young age of L3d in the present study might be due to the limited number of L3d mtDNAs that we sampled. Since two of the L3d haplotypes (i.e., AF01 and AF02) identified in our study possess the HinfI np-12308 site-gain marker for haplogroup U (Torroni et al. 1996), haplotype U could have arisen in Africa and migrated into Europe. Consistent with this hypothesis, the third haplotype in this subhaplogroup (i.e., AF03) lacks this haplogroup U marker but clusters with the haplogroup U mtDNAs. Hence, AF03 could be an African precursor to haplogroup U; alternatively, the haplogroup U mtDNAs in our sample may have been introduced into Africa by a back-migration/flow of European mtDNAs. Additional L3d mtDNAs, from other African populations, will need to be analyzed to further clarify the relationship of African haplogroup L3 and L3d mtDNAs to European mtDNA haplogroups. Similarly, L3a was found to have a close affinity to haplotype AF24, a mtDNA that has the DdeI np-10394 and AluI np-10397 site gains characteristic of Asian macrohaplogroup M (figs. (figs.22 On the basis of these observations, it is possible that mtDNA subhaplogroups L3a and L3d arose in sub-Saharan Africa and then moved upward into eastern Africa and out of eastern Africa into the Middle East, to yield Asian macrohaplogroup M and European haplogroup U. Such a hypothesis is supported by recent studies of eastern-African populations, which have revealed an unusually high percentage of L3 mtDNAs (Watson et al. 1997; Passarino et al. 1998). Interestingly, a number of these mtDNAs also have the DdeI np-10394 and AluI np-10397 site gains characteristic of “Asian” macrohaplogroup M (Passarino et al. 1998). Therefore, it is possible that subhaplogroups L3a and L3d radiated out of eastern Africa, to give rise to European and Asian mtDNAs. 
If so, then eastern Africa may still harbor the progenitor haplotypes from which European and Asian mtDNAs were derived. Acknowledgments The authors would like to thank Lorri Griffin and the Clinical Research Center of the Emory University School of Medicine, supported by National Institutes of Health (NIH) grant M01-RR-00039, for their assistance in the processing of blood samples. This work was supported by NIH grants AG13154, NS 21328, and HL45572 (to D.C.W.). Appendix A Table A1 Polymorphic Sites Defining RFLP Haplotypes in Southern-African Populations
aRestriction enzymes used in the analysis are designated by single-letter code, as follows: a =AluI; b = AvaII; c = DdeI; e = HaeIII, HhaI; g = HinfI; h = HpaII; i = MspI; j = MboI; k = RsaI; l = TaqI; m = BamHI; n = HaeII; and o = HincII. Sites separated by a slash (/) indicate either (a) simultaneous site gains or site losses for two different enzymes or (b) a site gain for one enzyme and a site loss for another, because of a single inferred nucleotide substitution; in the parsimony analysis and sequence-divergence calculations, these sites are considered to be only one restriction-site polymorphism. Sites marked with an asterisk (*) were found to be present or absent in all samples, contrary to the published sequence. bA “1” indicates the presence of a site; and a “0” indicates the absence of a site. Sites are numbered from the first nucleotide of the recognition sequence, according to the sequence published by Anderson et al. (1981). Appendix B Table B1 Percent Sequence Divergence Within and Between African Haplogroups
aIntrahaplogroup sequence-divergence values are shown on the diagonal (underlined), whereas the raw and corrected interhaplogroup values are shown above and below the diagonal, respectively. References Anderson S, Bankier AT, Barrell BG, De Bruijn MHL, Coulson AR, Drouin J, Eperon IC, et al (1981) Sequence and organization of the human mitochondrial genome. Nature 290:457–465 [PubMed] Ballinger SW, Schurr TG, Torroni A, Gan YY, Hodge JA, Hassan K, Chen KH, et al (1992) Southeast Asian mitochondrial DNA analysis reveals genetic continuity of ancient Mongoloid migrations. Genetics 130:139–152 [PubMed] Bonné-Tamir B, Johnson MJ, Natali A, Wallace DC, Cavalli-Sforza LL (1986) Mitochondrial DNA types in two Israeli populations—a comparative study at the DNA level. Am J Hum Genet 38:341–351 [PubMed] Cann RL, Stoneking M, Wilson AC (1987) Mitochondrial DNA and human evolution. Nature 325:31–36 [PubMed] Cann RL, Wilson AC (1983) Length mutations in human mitochondrial DNA. Genetics 104:699–711 [PubMed] Cavalli-Sforza LL, Menozzi P, Piazza A (1994) The history and geography of human genes. Princeton University Press, Princeton, NJ. Chen Y-S, Torroni A, Excoffier L, Santachiara-Benerecetti AS, Wallace DC (1995) Analysis of mtDNA variation in African populations identifies the most ancient of all human continent-specific haplogroups. Am J Hum Genet 57:133–149 [PubMed] De Almeida A (1965) Bushmen and other non-Bantu peoples of Angola. Witwatersrand University Press for the Institute for the Study of Man in Africa, Johannesburg. De Benedictis G, Rose G, Passarino G, Quagliarello C (1989) Restriction fragment length polymorphism of human mitochondrial DNA in a sample population from Apulia (south Italy). Ann Hum Genet 53:311–318 [PubMed] Denaro M, Blanc H, Johnson MJ, Chen KH, Wilmsen E, Cavalli-Sforza LL, Wallace DC (1981) Ethnic variation in HpaI endonuclease cleavage patterns of human mitochondrial DNA. 
Proc Natl Acad Sci USA 78:5768–5772 [PubMed] Excoffier L, Langaney A (1989) Origin and differentiation of human mitochondrial DNA. Am J Hum Genet 44:73–85 [PubMed] Graven L, Passarino G, Semino O, Boursot P, Santachiara-Benerecetti AS, Langaney A, Excoffier L (1995) Evolutionary correlation between control region sequence and restriction polymorphisms in the mitochondrial genome of a large Senegalese Mandenka sample. Mol Biol Evol 12:334–345 [PubMed] Greenberg JH (1963) The languages of Africa. Indiana University Press, Bloomington. Horai S, Hayasaka K (1990) Intraspecific nucleotide sequence differences in the major noncoding region of human mitochondrial DNA. Am J Hum Genet 46:828–842 [PubMed] Horai S, Hayasaka K, Kondo R, Tsugane K, Takahata N (1995) Recent African origin of modern humans revealed by complete sequences of hominoid mitochondrial DNAs. Proc Natl Acad Sci USA 92:532–536 [PubMed] Johnson MJ, Wallace DC, Ferris SD, Rattazzi MC, Cavalli-Sforza LL (1983) Radiation of human mitochondrial DNA types analyzed by restriction endonuclease cleavage patterns. J Mol Evol 19:255–271 [PubMed] Krings M, Stone A, Schmitz RW, Krainitzki H, Stoneking M, Pääbo S (1997) Neandertal DNA sequences and the origin of modern humans. Cell 90:19–30 [PubMed] Kumar S, Tamura K, Nei M (1993) MEGA: molecular evolutionary genetics analysis, version 1.01. Pennsylvania State University, University Park. Nei M, Tajima F (1983) Maximum likelihood estimation of the number of nucleotide substitutions from restriction site data. Genetics 105:207–217 [PubMed] Nurse GT, Botha MC, Jenkins T (1977) Sero-genetic studies of the San of south west Africa. Hum Hered 27:81–98 [PubMed] Nurse GT, Weiner JS, Jenkins T (1985) Research monographs on human population biology. Vol 3: The peoples of southern Africa and their affinities. Clarendon Press, Oxford. 
Passarino G, Semino O, Quintana-Murici L, Excoffier L, Hammer M, Santachiara-Benerecetti AS (1998) Different genetic components in the Ethiopian population, identified by mtDNA and Y-chromosome polymorphisms. Am J Hum Genet 62:420–434 [PubMed] Ritte U, Neufeld E, Prager FM, Gross M, Hakim I, Khatib A, Bonné-Tamir B (1993) Mitochondrial DNA affinities of several Jewish communities. Hum Biol 65:359–385 [PubMed] Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406–425 [PubMed] Schurr TG, Sukernik RI, Starikovskaya EB, Wallace DC (1999) Mitochondrial DNA diversity in Koryaks and Itel’men: population replacement in the Okhotsk Sea - Bering Sea region during the Neolithic. Am J Phys Anthropol 108:1–40 [PubMed] Scozzari R, Torroni A, Semino O, Cruciani F, Spedini G, Santachiara Benerecetti AS (1994) Genetic studies in Cameroon: mitochondrial DNA polymorphisms in Bamileke. Hum Biol 66:1–12 [PubMed] Scozzari R, Torroni A, Semino O, Sirugo G, Brega A, Santachiara Benerecetti AS (1988) Genetic studies on the Senegal population. I. Mitochondrial DNA polymorphisms. Am J Hum Genet 43:534–544 [PubMed] Semino O, Torroni A, Scozzari R, Brega A, De Benedictis G, Santachiara-Benerecetti AS (1989) Mitochondrial DNA polymorphisms in Italy. III. Population data from Sicily: a possible quantitation of maternal African ancestry. Ann Hum Genet 53:193–202 [PubMed] Soodyall H (1993) Mitochondrial DNA polymorphisms in Southern African populations. Ph.D. thesis, University of the Witwatersand, Johannesburg. Soodyall H, Jenkins T (1992) Mitochondrial DNA polymorphisms in Khoisan populations from southern Africa. Ann Hum Genet 56:315–324 [PubMed] Soodyall H, Jenkins T (1993) Mitochondrial DNA polymorphisms in Negroid populations from Namibia: new light on the origins of the Dama, Herero, and Ambo. 
Ann Hum Biol 20:477–485 [PubMed] Soodyall H, Vigilant L, Hill AV, Stoneking M, Jenkins T (1996) mtDNA control-region sequence variation suggests multiple independent origins or an “Asian-specific” deletion in sub-Saharan Africans. Am J Hum Genet 58:595–608 [PubMed] Swofford D (1994) Phylogenetic analysis using parsimony (PAUP), version 3.1.1. Illinois Natural History Survey, Champaign. Swofford D (1998) PAUP*: phylogenetic analysis using parsimony (*and other methods), version 4.0. Sinauer Associates, Sunderland, MA. Templeton AR (1992) Human origins and analysis of mitochondrial DNA sequences. Science 255:737 [PubMed] Torroni A, Bandelt H-J, D'Urbano L, Lahermo P, Moral P, Sellitto D, Rengo C, et al (1998) mtDNA analysis reveals a major late Paleolithic population expansion from southwestern to northeastern Europe. Am J Hum Genet 62:1137–1152 [PubMed] Torroni A, Huoponen K, Francalacci P, Petrozzi M, Morelli L, Scozzari R, Obinu P, et al (1996) Classification of European mtDNAs from an analysis of three European populations. Genetics 144:1835–1850 [PubMed] Torroni A, Lott MT, Cabell MF, Chen Y-S, Lavergne L, Wallace DC (1994a) Mitochondrial DNA and the origin of Caucasians: identification of ancient Caucasian-specific haplogroups, one of which is prone to a recurrent somatic duplication in the D-loop region. Am J Hum Genet 55:760–776 [PubMed] Torroni A, Miller JA, Moore LG, Zamudio S, Zhuang J, Droma T, Wallace DC (1994b) Mitochondrial DNA analysis in Tibet: implications for the origin of the Tibetan population and its adaptation to high altitude. Am J Phys Anthropol 93:189–199 [PubMed] Torroni A, Neel JV, Barrantes R, Schurr TG, Wallace DC (1994c) A mitochondrial DNA “clock” for the Amerinds and its implications for timing their entry into North America. 
Proc Natl Acad Sci USA 91:1158–1162 [PubMed] Torroni A, Schurr TG, Cabell MF, Brown MD, Neel JV, Larsen M, Smith DG, et al (1993) Asian affinities and continental radiation of the four founding Native American mitochondrial DNAs. Am J Hum Genet 53:563–590 [PubMed] Torroni A, Schurr TG, Yang C-C, Szathmary EJE, Williams RC, Schanfield MS, Troup GA, et al (1992) Native American mitochondrial DNA analysis indicates that the Amerind and the Nadene populations were founded by two independent migrations. Genetics 130:153–162 [PubMed] Vigilant L (1990) Control region sequences from African populations and the evolution of human mitochondrial DNA. PhD thesis, University of California, Berkeley. Vigilant L, Pennington R, Harpending H, Kocher TD, Wilson AC (1989) Mitochondrial DNA sequences in single hairs from a southern African population. Proc Natl Acad Sci USA 86:9350–9354 [PubMed] Vigilant L, Stoneking M, Harpending H, Hawkes K, Wilson AC (1991) African populations and the evolution of human mitochondrial DNA. Science 253:1503–1507 [PubMed] Watson E, Forster P, Richards M, Bandelt H-J (1997) Mitochondrial footprints of human expansions in Africa. Am J Hum Genet 61:691–704 [PubMed] |
[Ann Hum Biol. 1993]Am J Hum Genet. 1996 Mar; 58(3):595-608.
[Am J Hum Genet. 1996]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1997 Sep; 61(3):691-704.
[Am J Hum Genet. 1997]Genetics. 1992 Jan; 130(1):139-52.
[Genetics. 1992]Am J Hum Genet. 1994 Oct; 55(4):760-76.
[Am J Hum Genet. 1994]Am J Phys Anthropol. 1994 Feb; 93(2):189-99.
[Am J Phys Anthropol. 1994]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1997 Sep; 61(3):691-704.
[Am J Hum Genet. 1997]Mol Biol Evol. 1987 Jul; 4(4):406-25.
[Mol Biol Evol. 1987]Genetics. 1996 Dec; 144(4):1835-50.
[Genetics. 1996]Am J Hum Genet. 1998 May; 62(5):1137-52.
[Am J Hum Genet. 1998]Am J Hum Genet. 1997 Sep; 61(3):691-704.
[Am J Hum Genet. 1997]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Mol Biol Evol. 1995 Mar; 12(2):334-45.
[Mol Biol Evol. 1995]Proc Natl Acad Sci U S A. 1995 Jan 17; 92(2):532-6.
[Proc Natl Acad Sci U S A. 1995]Proc Natl Acad Sci U S A. 1989 Dec; 86(23):9350-4.
[Proc Natl Acad Sci U S A. 1989]Science. 1991 Sep 27; 253(5027):1503-7.
[Science. 1991]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Hum Hered. 1977; 27(2):81-98.
[Hum Hered. 1977]Proc Natl Acad Sci U S A. 1989 Dec; 86(23):9350-4.
[Proc Natl Acad Sci U S A. 1989]Am J Hum Genet. 1990 Apr; 46(4):828-42.
[Am J Hum Genet. 1990]Science. 1991 Sep 27; 253(5027):1503-7.
[Science. 1991]Proc Natl Acad Sci U S A. 1989 Dec; 86(23):9350-4.
[Proc Natl Acad Sci U S A. 1989]Mol Biol Evol. 1995 Mar; 12(2):334-45.
[Mol Biol Evol. 1995]Am J Hum Genet. 1990 Apr; 46(4):828-42.
[Am J Hum Genet. 1990]Science. 1991 Sep 27; 253(5027):1503-7.
[Science. 1991]Proc Natl Acad Sci U S A. 1989 Dec; 86(23):9350-4.
[Proc Natl Acad Sci U S A. 1989]Science. 1991 Sep 27; 253(5027):1503-7.
[Science. 1991]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]J Mol Evol. 1983; 19(3-4):255-71.
[J Mol Evol. 1983]Am J Hum Genet. 1988 Oct; 43(4):534-44.
[Am J Hum Genet. 1988]Ann Hum Genet. 1992 Oct; 56(Pt 4):315-24.
[Ann Hum Genet. 1992]Ann Hum Biol. 1993 Sep-Oct; 20(5):477-85.
[Ann Hum Biol. 1993]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Mol Biol Evol. 1995 Mar; 12(2):334-45.
[Mol Biol Evol. 1995]Proc Natl Acad Sci U S A. 1989 Dec; 86(23):9350-4.
[Proc Natl Acad Sci U S A. 1989]Science. 1991 Sep 27; 253(5027):1503-7.
[Science. 1991]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1997 Sep; 61(3):691-704.
[Am J Hum Genet. 1997]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1997 Sep; 61(3):691-704.
[Am J Hum Genet. 1997]Am J Hum Genet. 1997 Sep; 61(3):691-704.
[Am J Hum Genet. 1997]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1997 Sep; 61(3):691-704.
[Am J Hum Genet. 1997]Proc Natl Acad Sci U S A. 1995 Jan 17; 92(2):532-6.
[Proc Natl Acad Sci U S A. 1995]Nature. 1987 Jan 1-7; 325(6099):31-6.
[Nature. 1987]Mol Biol Evol. 1995 Mar; 12(2):334-45.
[Mol Biol Evol. 1995]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1997 Sep; 61(3):691-704.
[Am J Hum Genet. 1997]Am J Hum Genet. 1998 Feb; 62(2):420-34.
[Am J Hum Genet. 1998]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1997 Sep; 61(3):691-704.
[Am J Hum Genet. 1997]Am J Hum Genet. 1994 Oct; 55(4):760-76.
[Am J Hum Genet. 1994]Genetics. 1996 Dec; 144(4):1835-50.
[Genetics. 1996]Am J Phys Anthropol. 1994 Feb; 93(2):189-99.
[Am J Phys Anthropol. 1994]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Am J Hum Genet. 1997 Sep; 61(3):691-704.
[Am J Hum Genet. 1997]Am J Hum Genet. 1998 Feb; 62(2):420-34.
[Am J Hum Genet. 1998]Nature. 1981 Apr 9; 290(5806):457-65.
[Nature. 1981]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]Proc Natl Acad Sci U S A. 1995 Jan 17; 92(2):532-6.
[Proc Natl Acad Sci U S A. 1995]Nature. 1981 Apr 9; 290(5806):457-65.
[Nature. 1981]Proc Natl Acad Sci U S A. 1989 Dec; 86(23):9350-4.
[Proc Natl Acad Sci U S A. 1989]Proc Natl Acad Sci U S A. 1995 Jan 17; 92(2):532-6.
[Proc Natl Acad Sci U S A. 1995]Am J Hum Genet. 1995 Jul; 57(1):133-49.
[Am J Hum Genet. 1995]J Mol Evol. 1983; 19(3-4):255-71.
[J Mol Evol. 1983]Am J Hum Genet. 1986 Mar; 38(3):341-51.
[Am J Hum Genet. 1986]Ann Hum Genet. 1989 Oct; 53(Pt 4):311-8.
[Ann Hum Genet. 1989]Am J Hum Genet. 1988 Oct; 43(4):534-44.
[Am J Hum Genet. 1988]Hum Biol. 1994 Feb; 66(1):1-12.
[Hum Biol. 1994]J Mol Evol. 1983; 19(3-4):255-71.
[J Mol Evol. 1983]Ann Hum Genet. 1992 Oct; 56(Pt 4):315-24.
[Ann Hum Genet. 1992]Ann Hum Biol. 1993 Sep-Oct; 20(5):477-85.
[Ann Hum Biol. 1993]Ann Hum Genet. 1992 Oct; 56(Pt 4):315-24.
[Ann Hum Genet. 1992]