Am J Hum Genet. May 2004; 74(5): 827–845.
Published online Apr 7, 2004. doi:  10.1086/383236
PMCID: PMC1181978

Where West Meets East: The Complex mtDNA Landscape of the Southwest and Central Asian Corridor

Abstract

The southwestern and Central Asian corridor has played a pivotal role in the history of humankind, witnessing numerous waves of migration of different peoples at different times. To evaluate the effects of these population movements on the current genetic landscape of the Iranian plateau, the Indus Valley, and Central Asia, we have analyzed 910 mitochondrial DNAs (mtDNAs) from 23 populations of the region. This study has allowed a refinement of the phylogenetic relationships of some lineages and the identification of new haplogroups in the southwestern and Central Asian mtDNA tree. Both lineage geographical distribution and spatial analysis of molecular variance showed that populations located west of the Indus Valley mainly harbor mtDNAs of western Eurasian origin, whereas those inhabiting the Indo-Gangetic region and Central Asia present substantial proportions of lineages that can be allocated to three different genetic components of western Eurasian, eastern Eurasian, and south Asian origin. In addition to the overall composite picture of lineage clusters of different origin, we observed a number of deep-rooting lineages, whose relative clustering and coalescent ages suggest an autochthonous origin in the southwestern Asian corridor during the Pleistocene. The comparison with Y-chromosome data revealed a highly complex genetic and demographic history of the region, which includes sexually asymmetrical mating patterns, founder effects, and female-specific traces of the East African slave trade.

Introduction

The southwestern Asian corridor is a wide geographical area that extends from Anatolia and the trans-Caucasus area through the Iranian plateau to the Indo-Gangetic plains of Pakistan and northwestern India. This region is characterized by a patchwork of different physical-anthropology types with complex boundaries and gradients and by the coexistence of several language families (e.g., Indo-European, Turkic, and Sino-Tibetan) as well as relict linguistic outliers. The southwestern Asian corridor, located at the crossroads of major population expansions, was the first portion of Eurasia to be inhabited by the Homo sapiens sapiens population(s) that left Africa ~60,000 years before the present (YBP) (Tishkoff et al. 1996; Watson et al. 1997; Quintana-Murci et al. 1999), and from this region modern humans migrated to the rest of the world. Although Paleolithic and Mesolithic people left their mark in the area, major prehistorical and historical events with possible genetic consequences occurred during the Neolithic period and later. Important agricultural developments occurred in the eastern horn of the Fertile Crescent ~8,000 YBP, notably in Elam (southwestern Iran). The highly urban Elamite civilization had close contacts with Mesopotamians but exhibited an extensive differentiation from the rest of the Fertile Crescent populations, including a language that is thought to belong to the Dravidian family. It is hypothesized that the proto-Elamo-Dravidian language (McAlpin 1974, 1981), spoken by the Elamites in southwestern Iran, spread eastwards with the movement of farmers from this region to the Indus Valley and the Indian subcontinent (Cavalli-Sforza et al. 1994; Cavalli-Sforza 1996; Renfrew 1996). Starting ~5,000 YBP, animal domestication, particularly the horse, gave the inhabitants of the Central Asian steppes the opportunity to expand geographically in different directions (Zvelebil 1980). 
These Central Asian nomads, probably from the Andronovo and Srubnaya cultures, migrated through Iran and Afghanistan, reaching Pakistan and India, and their arrival is contemporaneous with the decline of the strong agricultural South Asian civilizations, such as the Harappans. Most likely, their arrival on the Iranian plateau ~4,000 YBP brought the Indo-Iranian branch of the Indo-European language family and, eventually, caused the replacement of Dravidian languages in Iran, Pakistan, and most of northern and central India (Renfrew 1987, 1996; Cavalli-Sforza 1996). Starting in the 3rd century b.c., the eastern part of the Eurasian steppes witnessed similar pastoral movements. By the time of the 3rd century a.d., Turkic-speaking peoples from the Altai region began to migrate westwards, replacing Indo-European languages in parts of Central Asia and, eventually, in what is now modern Turkey. Later, the Mongols also moved westward and, by the 13th century a.d., established their rule over a vast region, including parts of India, Pakistan, and Iran and reaching as far west as the Caucasus and Turkey (Cavalli-Sforza et al. 1994).

In the past decade, studies of mtDNA variation have provided a substantial contribution to the understanding of human origins and diffusion patterns. mtDNA surveys in worldwide populations have shown a continent-specific distribution of mtDNA lineages (Wallace et al. 1999; Ingman et al. 2000; Maca-Meyer et al. 2001; Herrnstadt et al. 2002; Mishmar et al. 2003). African populations are characterized by the oldest superhaplogroups, L1, L2, and L3 (Bandelt et al. 1995, 2001; Chen et al. 1995, 2000; Graven et al. 1995; Soodyall et al. 1996; Bandelt and Forster 1997; Watson et al. 1997; Alves-Silva et al. 2000; Torroni et al. 2001b; Salas et al. 2002), but it seems that only L3 radiated out of Africa, mainly in the form of haplogroups M and N, ~60,000 YBP, giving rise to the extant Eurasian variation (Watson et al. 1997; Quintana-Murci et al. 1999; Wallace et al. 1999). Most western Eurasians are characterized by clades within haplogroup N (Torroni et al. 1996; Macaulay et al. 1999; Richards et al. 2000), whereas N and M contributed almost equally to the current eastern Eurasian mtDNA pool (Stoneking et al. 1990; Ballinger et al. 1992; Torroni et al. 1993; Horai et al. 1996; Kolman et al. 1996; Comas et al. 1998; Starikovskaya et al. 1998; Redd and Stoneking 1999; Schurr et al. 1999; Derbeneva et al. 2002; Kivisild et al. 2002; Yao et al. 2002).

Despite the major role played by the transect between the Near East and India in human origin and population dispersals, the extent and nature of mtDNA variation in the populations of the area are still not well resolved. In this context, mtDNA studies have focused on the western and eastern extremities of the southwestern Asian corridor, including the Near East/Caucasus region (Macaulay et al. 1999; Comas et al. 2000; Richards et al. 2000; Tambets et al. 2000; Nasidze and Stoneking 2001) and India (Mountain et al. 1995; Kivisild et al. 1999a, 1999b; Bamshad et al. 2001; Roychoudhury et al. 2001; Kivisild et al. 2003). In addition, Central Asian mtDNA variation is poorly characterized and is based only on HVS-I sequence data (Comas et al. 1998). Some populations of the region have been also analyzed for Y-chromosome variation, including Iranian (Quintana-Murci et al. 2001), Pakistani (Qamar et al. 2002), and, especially, Central Asian populations (Pérez-Lezaun et al. 1999; Karafet et al. 2001; Wells et al. 2001; Zerjal et al. 2002). To obtain a global mtDNA perspective of the entire region, we have now analyzed 910 mtDNAs from 23 different populations, located mainly in the southwestern Asian corridor but also, for comparison, in Central Asia. As a first step in the study, we performed high-resolution RFLP analysis and control-region sequencing of 208 mtDNAs, 108 from the western part of the corridor (Anatolia and the Caucasus), and 100 mtDNAs from its southeastern counterpart (southeast Pakistan). This allowed a clear-cut definition of the haplogroups (and their diagnostic markers) existing in the area. The phylogenetic information retrieved from this initial data set, together with previously published RFLP and HVS-I data, was then used to classify an extended collection of 702 newly obtained HVS-I sequences from the Iranian plateau, the Indus Valley, and Central Asia. 
The observed patterns of variation revealed different genetic contributions from western and eastern Eurasians and South Asians and evince complex demographic processes in some specific populations, including sexually asymmetrical mating patterns, founder effects, and differential migration patterns.

Material and Methods

Population Samples

The approximate location of the 23 populations from which the 910 mtDNAs were sampled is shown in figure 1. Each sample comprises unrelated healthy donors from whom appropriate informed consent was obtained. For the preliminary part of the study, 208 individuals from three different geographic regions were analyzed: 58 individuals from the Caucasus, 50 from Turkey, and 100 from southeastern Pakistan. The three samples were heterogeneous; the sample from the Caucasus region was made up of three different ethnic groups: Georgians, Balkarians, and Chechens. The sample from Turkey was collected mainly in Konya (Anatolia). The Pakistani sample was collected in Karachi and comprised mainly Sindhis, who are a mix of tribes of different religions and ethnicities from the southeastern province of Sindh. The extended population sample included 702 individuals from 20 different populations living in the Iranian plateau, the Indus Valley, the Karakorum and Hindu Kush mountains, and Central Asia. Further details of the whole sample collection are reported in table 1 and in the work of Wells et al. (2001) and Qamar et al. (2002). The term “Makrani” refers to the so-called “Negroid Makrani” population living on the Makran coast of Baluchistan, distinct from the Makrani Baluch population, which is not considered in this study.

Figure  1
Map of the southwestern and Central Asian corridor, showing the samples analyzed in the present study. Population codes are as reported in table 1. Boxed populations are those used for the initial step of the study (see the “Materials And Methods ...
Table 1
Description of the Populations Included in the Study

mtDNA Analysis

High-resolution RFLP haplotypes were determined for the samples from the Caucasus region, Anatolia and Karachi. The entire mtDNA of each subject was PCR amplified using primer pairs and procedures previously described (Torroni et al. 1997). Each of the PCR segments was then digested with 14 restriction endonucleases (AluI, AvaII, BamHI, DdeI, HaeII, HaeIII, HhaI, HincII, HinfI, HpaI, MspI, MboI, RsaI, and TaqI). In addition, all mtDNAs were screened for the NlaIII sites at nucleotide positions (nps) 4216 and 4577. The presence/absence of the BstOI/BstNI site at np 13704, the AccI sites at nps 14465 and 15254, the BfaI site at np 4914, the XbaI site at np 7440, the MseI sites at nps 14766 and 16297, the MnlI site at np 10871, the MboII site at np 12703, and the HphI site at np 10237 were also analyzed in all the Pakistani-Karachi mtDNAs but only hierarchically in the mtDNAs from the Caucasus and Anatolia. Polymorphisms at nps 12308 and 11719 were also tested, the first by use of a mismatched primer that generates a HinfI site when the transition at 12308 is present (Torroni et al. 1996) and the second by use of a mismatched primer that generates a HaeIII site when the transition at 11719 is present (Saillard et al. 2000). The sequencing of the mtDNA control-region in the 208 individuals from the Caucasus region, Anatolia, and Karachi was performed as described elsewhere (Torroni et al. 2001a) and, in most cases, encompassed a large region (generally from np 16000 to nps 100–200). For the remaining 702 individuals, sequence data encompassed a shorter region (from np 16000 to np 16401), which includes the entire HVS-I, and variable positions were determined between nps 16024–16383, relative to the reference sequence (Anderson et al. 1981; Andrews et al. 1999).
The published RFLP data (Macaulay et al. 1999; Quintana-Murci et al. 1999; Richards et al. 2000) and the new data obtained from the high resolution RFLP analyses of the 208 mtDNAs (see appendix A [online only]) were used to identify the RFLP and HVS-I sites (fig. 2), which are diagnostic of the main haplogroups and subhaplogroups within the mtDNA phylogeny. These markers were then selectively assayed, on the basis of the HVS-I information, in the remaining 702 mtDNAs by PCR amplification of the appropriate fragment and digestion with the informative restriction enzyme.

Figure  2
Schematic phylogenetic tree of mtDNA haplogroups observed in the populations analyzed. The diagnostic mutations used to classify the whole data set are reported on the branches. Restriction enzyme sites are numbered from the first nucleotide of the recognition ...

Data Analysis

Descriptive statistical indexes, the Tajima’s D (Tajima 1989) and Fu’s FS (Fu 1997) neutrality tests, and the analysis of molecular variance (AMOVA) (Excoffier et al. 1992) were calculated using the Arlequin software, version 2.001 (Schneider et al. 2000). For the AMOVA analysis, we used the number of pairwise differences for the HVS-I sequence data and haplogroup frequencies for haplogroup data. We performed the AMOVA analyses either with all populations in a single group or divided into several groups, according to their geographic location or linguistic affiliation. For the geographic grouping, we divided populations into four regions: the Anatolian/Caucasus region (Anatolians and Caucasus populations), the Iranian plateau (Persians, Iranian Turks, Lurs, Iranian Kurds, Mazandarans, and Gilaks), the Indus Valley (Baluchi, Brahui, Parsi, Sindhi, Pakistani-Karachi, Pathans, Makrani, Hazara, and Gujarat) and Central Asia (Uzbeks, Turkmen, Kurds from Turkmenistan, Shugnan, Hunza Burusho, and Kalash). For the linguistic division, we grouped populations according to their linguistic affiliation: Indo-Europeans (Persians, Lurs, Iranian Kurds, Mazandarans, Gilaks, Baluchi, Parsi, Sindhi, Pakistani-Karachi, Pathans, Makrani, Hazara, Shugnan, Kalash, and Gujarat), Altaic (Anatolian, Iranian Turks, Turkmen, and Uzbek), Dravidian (Brahui), Caucasian (Caucasus), and language isolates (Burusho). The population genetic structure was also explored through the spatial analysis of molecular variance (SAMOVA) approach (Dupanloup et al. 2002), which defines groups of populations that are geographically homogeneous and maximally differentiated from each other. This method is based on a simulated annealing procedure that aims at maximizing the proportion of total genetic variance due to differences between groups of populations without any a priori definition of groups of populations that is based on geographic or linguistic features. 
The SAMOVA analyses were based on HVS-I sequence data and were done using the SAMOVA 1.0 software.

Median-joining networks (Bandelt et al. 1995, 1999) were constructed by hand and confirmed by the Network program (A. Röhl; Shareware Phylogenetic Network Software Web site). For network construction of some specific lineages, sequence data from other populations were taken from the literature. From the Anatolia/Caucasus region, we included Armenians (AM), Azerbaijanis (AZ), Turks (TR), and Kurds (KR) from Richards et al. (2000); Turks (TC) from Calafell et al. (1996); Turks (TT) from Tambets et al. (2000); and Kurds (KC) from Comas et al. (2000). From the Middle East/Arabian Peninsula, we included Iraqis (IQ), Syrians (SY), Yemenites (YM), Palestinians (PL), and Druze (DZ) from Richards et al. (2000); individuals from Dubai (DB) from A.T. (unpublished data); and Egyptians (EG) from Krings et al. (1999). From Pakistan/India, we included Pakistanis (PK) and Indians from Andhra Pradesh (AP), Gujarat (GK), Haryana (HY), Kashmir (KS), Maharashtra (MH), Punjab (PN), Rajasthan (RJ), Uttar Pradesh (UP), and Tamil Nadu (TN) from Kivisild et al. (1999a); and Indians (IN) from Mountain et al. (1995). From Central Asia, we included Kirghiz (KG), Uighur (UG), and Kazakh (KZ) samples from Comas et al. (1998). From western Eurasia, we included Basques (BS), Sicilians (SC), Bulgarians (BL), and Italians from Tuscany (TS) from Richards et al. (2000); Russians (RS) from Malyarchuk et al. (2002); Mansi (MN) from Derbeneva et al. (2002); and Sardinians (SD) from Di Rienzo and Wilson (1991). We also included Chinese (CH) from Yao et al. (2002). The time to the most recent common ancestor of some clades and their SEs were calculated by means of the estimator ρ, the averaged distance to a specified founder haplotype, and were determined as described by Forster et al. (1996) and Saillard et al. (2000). Time estimates were also calculated, using the Network program. 
Principal-components (PC) analyses were performed using SPSS version 10.0.7 software, with basal mtDNA haplogroup frequencies as input vectors. Admixture proportions (mY) and their SEs were calculated, using information from all haplogroups, by means of the program Admix 2.0 (Dupanloup and Bertorelle 2001), on the basis of 1,000 bootstraps. The parental populations used for the analysis were Iranian populations and Gujarati for the Parsi population, and Pakistani populations (excluding the Makrani) and a geographically dispersed set of sub-Saharan African samples (Krings et al. 1999; Brakez et al. 2001; Brehm et al. 2002, Salas et al. 2002) for the Makrani population.
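The ρ-based dating described above can be illustrated with a minimal sketch. It assumes the commonly used HVS-I calibration of one transition per 20,180 years (nps 16090–16365), which the text does not state explicitly, and it approximates the SE under a perfectly star-like phylogeny (variance of ρ equal to ρ/n) rather than using the full branch-weighted formula. The function name and the mutation counts are hypothetical.

```python
import math

# Assumption: one HVS-I transition per 20,180 years; the rate is not given in the text.
YEARS_PER_MUTATION = 20180

def rho_age(mutation_counts, years_per_mutation=YEARS_PER_MUTATION):
    """Return (rho, age in years, SE in years).

    mutation_counts holds, for each sampled mtDNA, the number of mutational
    steps separating it from the chosen founder haplotype.
    """
    n = len(mutation_counts)
    rho = sum(mutation_counts) / n
    # Star-like approximation: every mutation sits on its own terminal branch,
    # so var(rho) = rho / n. Real networks need the branch-weighted variance.
    se = math.sqrt(rho / n)
    return rho, rho * years_per_mutation, se * years_per_mutation

# Hypothetical clade of 10 mtDNAs (toy numbers, not data from this study).
rho, age, se = rho_age([0, 0, 0, 1, 1, 1, 2, 2, 3, 4])
print(f"rho = {rho:.2f}, age = {age:.0f} +/- {se:.0f} years")
```

Under this calibration, an age of about 38,000 years corresponds to ρ of roughly 1.9 mutational steps from the founder.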

Results

The Topology of the Southwest and Central Asian mtDNA Tree

The complete high-resolution RFLP haplotypes and HVS-I sequence data of the 208 individuals from the Caucasus region, Anatolia, and southeastern Pakistan and the detailed haplogroup classification and HVS-I sequence data of the extended database of 702 individuals are reported in the online-only material.

The phylogenetic relationships of the 51 different named haplogroups observed in the 910 samples, along with the diagnostic sites used for the mtDNA haplogroup classification, are shown in figure 2. The vast majority of the mtDNAs clustered into macrohaplogroups M, N, and R, but a limited number were found to belong to the sub-Saharan haplogroups L1a, L2a, L3b, and L3d. Five haplogroups, N1d, HV2, U9, R5, and R6, are defined here for the first time, whereas others (U2a, U2b, and U2c) represent newly identified subclades. Moreover, for some previously known haplogroups (R2 and U8b), we detected diagnostic coding-region markers that allow a better definition of the haplogroup topology within the tree.

Macrohaplogroup N in southwestern and Central Asia is partitioned into several branches: N1 (which also encompasses haplogroup I), N9a, A, W, X, and R. Within the N trunk, the new haplogroup N1d stems from the node of N1 and is defined by three characteristic RFLP sites (−951MboI, −5003DdeI, −8616MboI) and two HVS-I transitions (nps 16301 and 16356). The internal topology of superhaplogroup R has also been improved. The novel lineage R5 is defined by −8592MboI and transitions at nps 16266 and 16304, whereas the other new haplogroup, R6, is characterized by −12282AluI and transitions at nps 16129 and 16362. Moreover, the R2 mtDNAs, previously recognizable only by the HVS-I transition at np 16071, are now identifiable through the diagnostic coding-region motif +4216NlaIII, +4769AluI, −14304AluI. It is worth noting that +4216NlaIII is also one defining mutation of the lineage-cluster J-T (fig. 2). However, the comparison of entire mtDNA sequences belonging to both R2 and J-T (A.T., unpublished data) indicates that +4216NlaIII has indeed occurred independently on the two branches of the phylogeny. An improvement of the classification within HV was also obtained. A haplogroup, named HV2, was found to bear the HVS-I transition at np 16217 and most likely is also characterized by +9336RsaI, since 16 of the 20 mtDNAs with the HVS-I 16217 mutation harbor this coding region site. This lineage corresponds to an internal node of HV that Tambets et al. (2000) tentatively identified as P*. The improvement of the haplogroup U subclassification was even more extensive. This major western Eurasian haplogroup is also found in the Middle East and India and, at lower frequencies, in northern and eastern Africa, but the frequency distributions of its subclades appear to differ considerably among geographical regions (Kivisild et al. 1999a, 2003; Macaulay et al. 1999; Richards et al. 2000). 
Subclade U2 (characterized by the rather variable HVS-I transition at np 16051) was previously subdivided into two branches, the “European” U2e characterized by a further HVS-I transversion at np 16129, and the “Indian” U2i lacking such a transversion (Kivisild et al. 1999a). We show that U2e also harbors the distinguishing RFLP motif +13730HinfI, +15907RsaI, and U2i is indeed made up of three clusters, here termed “U2a,” “U2b,” and “U2c.” U2a is characterized by the rare and stable HVS-I transversion 16206C, U2b is defined by the diagnostic site −15047HaeIII, and U2c harbors the RFLP motif +5789TaqI, +8020MboI/+8022TaqI, −15060MboI (fig. 2). A subset of U that was already known but is now better defined is U8b. Finnilä et al. (2001) observed that, on the basis of the shared transition at np 9698, haplogroup U8 formed a sister clade with haplogroup K. Our data reveal that at least a subset of U8, here termed “U8b,” is also characterized by −9052HaeII, which is indeed also the diagnostic marker of haplogroup K. This observation strengthens the sister haplogroup status of U8b and K. Finally, our data reveal the presence of a new—and rare—previously undefined subgroup of U, termed “U9,” that is characterized by −6383HaeIII. This haplogroup does not correspond to the U9 of Herrnstadt et al. (2002), which, in reality, corresponds to a subset of the previously defined U3.

The comparison of the RFLP and HVS-I data obtained from our data set identified some pitfalls when classifying the internal lineages within some haplogroups (e.g., J and M) on the basis of the HVS-I sequence data alone. Thus, we classified all our J mtDNAs according only to their differential RFLP status (fig. 2), and, since an accurate RFLP classification of the South Asian branches of haplogroup M remains to be defined, we adopted a conservative classification and merged all South Asian M mtDNAs into M*.
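As an illustration of how the HVS-I motifs quoted in this section could be used to flag candidate haplogroups before the confirmatory RFLP assay, the following minimal sketch matches a sample's HVS-I variants against a small lookup table. The table contains only the HVS-I motifs named in the text; the function name and the input encoding (positions as strings, transversions with the derived base appended) are assumptions made for the example. Because HVS-I data alone can misclassify some lineages, as noted above, the output should be read as candidates only.

```python
# HVS-I motifs quoted in the text (nucleotide positions; "16206C" is a transversion).
HVS1_MOTIFS = {
    "N1d": {"16301", "16356"},
    "R5": {"16266", "16304"},
    "R6": {"16129", "16362"},
    "R2": {"16071"},
    "HV2": {"16217"},
    "U2a": {"16206C"},
}

def candidate_haplogroups(variants):
    """Return haplogroups whose complete HVS-I motif is present in `variants`.

    The result is only a list of candidates; each would still need the
    diagnostic coding-region RFLP site to be assayed.
    """
    observed = set(variants)
    return [hg for hg, motif in HVS1_MOTIFS.items() if motif <= observed]

# Hypothetical sample carrying 16129, 16223, and 16362.
print(candidate_haplogroups(["16129", "16223", "16362"]))
```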

Haplogroup Profile Distribution

The haplogroup repertoire present in the study populations is shaped mainly by the presence of lineages that can be attributed to eastern Eurasia, South Asia, and western Eurasia (fig. 1; table 2). Sub-Saharan African lineages, represented by haplogroups L1, L2, and L3A and their internal derivatives, are virtually absent from all populations analyzed except the Makrani from southern Pakistan, among whom they reach high frequencies (39%).

Table 2
mtDNA Haplogroup and Subcluster Frequencies for the 23 Study Populations

The eastern Eurasian component is represented by haplogroups A, B, F, and N9a, all of which belong to the major N trunk, and the East Asian branches of macrohaplogroup M, such as the C, D, G, and Z haplogroups. The latter lineages are particularly widespread among northern and East Asians and, to a lesser extent, Central Asians (Torroni et al. 1993, 1994a, 1994b; Kivisild et al. 2002; Yao et al. 2002; Kong et al. 2003). The eastern Eurasian lineage cluster shows, with some exceptions, a decreasing gradient of frequencies towards the west (fig. 1; table 2). The highest frequencies of these branches were found among the Central Asian populations, reaching their maximum in the Turkmen and Uzbeks (37% and 31%, respectively). Interestingly, Kurds from Turkmenistan showed the lowest frequencies of eastern Eurasian lineages (9%) in Central Asia, in sharp contrast to the local Turkmen population. These eastern Eurasian–specific lineages were absent—or at very low frequencies—in populations from the Anatolian/Caucasus region, the Iranian plateau, and the Indus Valley, with one exception: the Hazaras from northern Pakistan, among whom they reach 35%.

The South Asian influence is mainly represented by the nodal type of macrohaplogroup M (M*) and the three sister clades U2a, U2b, and U2c. The M* haplogroup is absent or infrequent in all the populations west of the Indus Valley and is present at low frequencies in our Central Asian populations (<12%). Conversely, it is present at high frequencies (30%–55%) in populations living on the southern coast of Pakistan and in northwestern India. The three sister clades U2a, U2b, and U2c show a geographic pattern similar to that of haplogroup M*, although their distribution is somewhat more restricted to the Indo-Pakistani region. Also, N1d and HV2 and some lineages within paragroup R* occur at higher frequencies in populations located east of the Iranian plateau, and this will be discussed in more detail below.

The proportion of western Eurasian lineages (HV, pre-HV, N1, J-T, U-K, I, W, and X) shows a pattern opposite to that exhibited by eastern Eurasian lineages (fig. 1; table 2). These lineages exhibit their highest frequencies in the Anatolian/Caucasus and Iranian regions, and their prevalence decreases eastwards. Despite this decreasing frequency cline towards the East, they are still present at relatively high frequencies in the Indus Valley and Central Asia. Indeed, the western Eurasian presence in the Kalash population reaches a frequency of 100%, the most prevalent haplogroups being U4, (pre-HV)1, U2e, and J2.

Phylogeography of Specific Haplogroups

The phylogeography of several haplogroups suggests that they are either autochthonous to the southwestern Asian corridor or that at least they underwent a major expansion in this region. Among these lineages, haplogroup U7 presents the most widespread distribution. U7 is virtually absent in western and eastern European populations and is present at low frequencies (2%–4%) in the Near East, the Caucasus region, Central Asia, and the Indian subcontinent (Kivisild et al. 1999a, 2003; Macaulay et al. 1999; Richards et al. 2000; Tambets et al. 2000; Malyarchuk and Derenko 2001; Malyarchuk et al. 2002). Our data show that this haplogroup is present in most of the populations linking the Near East with Central and South Asia, reaching its highest frequencies in some Iranian and Indus Valley populations (table 2), in agreement with recent data reporting a frequency of 9% in a composite Iranian sample (Kivisild et al. 2003). Figure 3 shows the median-joining network for this haplogroup. The topology of the network shows that this haplogroup is divided into two major well-defined star-like subclades separated by a transition at np 16309. The time-depth calculated for paragroup U7* (without 16309) is 35,100 ± 8,500 years, whereas that for U7a (with 16309) is 22,500 ± 5,400 years. These coalescence times support the idea that the 16318T mutation is indeed the ancestral feature of U7. The overall coalescence time calculated for U7 is 38,200 ± 13,900 years.

Figure  3
Network of the U7 lineage. Circle areas are proportional to haplotype frequency. Population codes are as reported in table 1 and in the “Materials And Methods” section. Mutated sites (−16,000) are indicated along the branches. ...

The phylogeography of haplogroups HV2 and R2 resembles that of U7 but has a more restricted geographic distribution. Both haplogroups are concentrated in southern Pakistan and India, with some overflow into adjacent areas, including the Near East/Caucasus region, the Iranian plateau, the Arabian Peninsula, and Central Asia, where most of the derived types are observed (fig. 4a and 4b). The coalescence times were estimated at 27,700 ± 9,600 years for HV2 and 31,200 ± 8,200 years for R2.

Figure  4
Networks of (a) HV2 and (b) R2 lineages

The distribution of the three sister clades within haplogroup U2 (U2a, U2b, and U2c) is essentially restricted to the Indo-Pakistani regions (fig. 5a–c). They have not been observed in Europe and the Near East and, according to our data, they are absent in the Iranian plateau and Central Asian populations. They are, however, common in populations from Pakistan and India. The estimated coalescence times for these haplogroups are 45,700 ± 14,400 years for U2a, 35,900 ± 9,000 years for U2b, and 45,200 ± 10,400 years for U2c. The R5 lineage showed a similar distribution to the U2 subclades (fig. 5d), but its root types are more concentrated in the Indus Valley region, with the derivatives in central and southern India. The estimated time depth of this lineage is 51,800 ± 13,800 years.

Figure  5
Networks of (a) U2a, (b) U2b, (c) U2c, and (d) R5 lineages

Finally, three small haplogroups (R6, N1d, and U9) have been observed so far only in south Pakistan. R6 was found in three individuals from the mixed sample from Karachi; N1d in one Baluchi, one Brahui, one Makrani, and three individuals from Karachi; and U9 in three Makrani, one Pathan, and one individual from Karachi.

Population Diversity and Demographic Regimes

HVS-I sequences have also been used to gain information on the internal population diversity (table 3). Most populations showed similar sequence diversity values, with the Kalash showing the lowest (0.830) and the Indian Gujarati the highest (0.998). The low diversity exhibited by the Kalash population is also evident in the low mean number of pairwise differences (3.857). This is the lowest value of all the populations studied, which otherwise ranged from 4.399 in the Baluchi and the Caucasus populations to 6.633 in the Makrani. As shown in table 3, most populations yielded significantly negative values for both Tajima’s D and Fu’s FS neutrality tests. The only exceptions were the Mazandarans, the Kurds from Turkmenistan, and the Kalash. The former two groups exhibited significantly negative Fu’s FS values and unimodal mismatch distributions (not shown), but the Tajima’s D statistic was not significantly different from 0. This contrasting pattern may be the result of mutation rate heterogeneity along the HVS-I region; this effect has been shown to confound the signature of population expansion in Tajima’s test, leading to higher D values (Aris-Brosou and Excoffier 1996). For the Kalash population, both neutrality tests gave negative but nonsignificant values (table 3), and the mismatch distribution was unequivocally multimodal (data not shown).
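For readers who wish to see how indices of this kind are obtained from aligned sequences, the sketch below computes haplotype (sequence) diversity, the mean number of pairwise differences, and Tajima's D using the standard textbook formulas. It is an illustration only and is not the Arlequin implementation used in the study; the function names and the four toy sequences are hypothetical, and no significance test is included.

```python
from collections import Counter
from itertools import combinations
import math

def haplotype_diversity(seqs):
    """Unbiased diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))

def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)

def tajimas_d(seqs):
    """Tajima's D (Tajima 1989) for aligned, equal-length sequences."""
    n = len(seqs)
    # Number of segregating sites: alignment columns with more than one state.
    seg = sum(len(set(column)) > 1 for column in zip(*seqs))
    if seg == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    diff = mean_pairwise_differences(seqs) - seg / a1
    return diff / math.sqrt(e1 * seg + e2 * seg * (seg - 1))

# Toy alignment (hypothetical, not data from this study).
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(haplotype_diversity(toy), mean_pairwise_differences(toy), tajimas_d(toy))
```

A negative D arises when the mean number of pairwise differences falls below the value expected from the number of segregating sites, which is the pattern usually read as a signal of population expansion.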

Table 3
Diversity Indices and Neutrality Tests for the Study Populations

Population Relationships

The basal mtDNA haplogroup frequencies of the 23 populations were used as input vectors to perform a PC analysis. Figure 6 shows the PC plot for the first two principal components, which account for 43% and 12% of the total variation, respectively. Leaving aside the two outliers, the Kalash and the Makrani, geographic groupings of populations are apparent in the diagram. The first PC (PC1) mainly reveals a west-to-east cline by separating a group of closely related populations from the Iranian plateau from those inhabiting the Indus Valley and northwest India. The Central Asian Uzbeks, Turkmen, and Shugnan tend to be closer to populations from the Anatolian/Caucasus/Iranian regions, rather than to Indus Valley populations, as a consequence of the high prevalence of western Eurasian lineages observed in these populations. The Hazara from Pakistan show an intermediate position between populations from the Indus Valley and those from Central Asia. PC2 essentially displays the outlier genetic position of the Makrani and the Kalash populations, who are separated from the rest of the populations of the Iranian plateau and the Indus Valley.

Figure  6
PC plot based on haplogroup frequencies for the 23 population samples (population codes are as in table 1).

Population Genetic Structure: AMOVA and SAMOVA Analyses

We investigated how the proportion of variance, based on haplogroup (main lineages) and haplotype (HVS-I sequences) frequencies, was distributed in a hierarchical mode by an AMOVA analysis (Excoffier et al. 1992). When the 23 populations were treated as a single group, populations turned out to show overall differentiation: the FST value for the haplogroup data was 0.067 (P<.001) and the ΦST for the sequence data was 0.032 (P<.001). The fraction of genetic variance due to differences among linguistic groups (see the “Materials and Methods” section) was not statistically different from 0, independently of the genetic system used (i.e., haplogroup or sequence data), indicating that genetic variance within any population or among populations within groups was larger than that between groups and, therefore, that the division by linguistic affiliation is not reflected in mtDNA variation. Finally, when populations were regrouped into four geographic groups (see the “Materials and Methods” section), a small but significant differentiation among groups was detected (FCT=0.043 and P<.001 for haplogroup data and ΦCT=0.016 and P<.001 for HVS-I data).
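The single-group case described above (all populations in one group) can be sketched as a one-level AMOVA in the sense of Excoffier et al. (1992): sums of squared deviations are computed from pairwise distances within populations and in the pooled sample, and the among-population variance component is expressed as a fraction of the total. The sketch below is a simplified illustration and not the Arlequin implementation; it omits the permutation test that yields the P values, and the function name, the default 0/1 identity distance (appropriate for haplogroup labels), and the toy populations are assumptions. For sequence data, a function counting pairwise differences can be passed as `dist`.

```python
from itertools import combinations

def phi_st(populations, dist=lambda a, b: float(a != b)):
    """One-level AMOVA fixation index for a list of populations.

    Each population is a list of haplogroup labels (or sequences, with a
    suitable `dist`). Raises ZeroDivisionError if no variation exists at all.
    """
    def ssd(sample):
        # Sum of squared deviations expressed through pairwise distances.
        return sum(dist(a, b) for a, b in combinations(sample, 2)) / len(sample)

    sizes = [len(p) for p in populations]
    total_n, n_pops = sum(sizes), len(populations)
    pooled = [x for p in populations for x in p]

    ssd_within = sum(ssd(p) for p in populations)
    ssd_among = ssd(pooled) - ssd_within

    # Average sample size corrected for unequal population sizes.
    n0 = (total_n - sum(n * n for n in sizes) / total_n) / (n_pops - 1)
    var_within = ssd_within / (total_n - n_pops)
    var_among = (ssd_among / (n_pops - 1) - var_within) / n0
    return var_among / (var_among + var_within)

# Toy example: two populations fixed for different haplogroups.
print(phi_st([["H", "H", "H"], ["M", "M", "M"]]))
```

Values near 0 (or slightly negative, because the estimator is unbiased) indicate no differentiation among populations, whereas a value of 1 indicates that all variation lies between them.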

To investigate in greater detail the genetic structure of the populations and the amount of genetic variation due to differences among population groups, we applied the SAMOVA algorithm (Dupanloup et al. 2002), on the basis of HVS-I data, searching for two, three, and four groups. The inclusion of the Kalash population, which is among the most differentiated (table 3; fig. 6), gave inconsistent results (data not shown), so this population was excluded from further analyses. A search for two significantly differentiated population clusters revealed one group consisting of all populations from the Anatolian/Caucasus region and the Iranian plateau (including the Kurds from Turkmenistan), and a second group made up of populations from the Indus Valley and Central Asia (FCT=0.021; P<.001). In the three-group search, the previous two remained unchanged, and a third group, represented by the Hazara, emerged from the analysis (FCT=0.021; P<.001). Finally, the search for four groups revealed the Makrani of southern Pakistan as the fourth most differentiated group (FCT=0.022; P<.001).

Discussion

This study provides the first comprehensive survey of mtDNA variation in a part of the world that was among the first regions to be inhabited after the “out of Africa” exit and that has subsequently experienced numerous waves of migration during the last 50,000 years. We now discuss the events, both ancient and modern, that are likely to have led to the current mtDNA distribution, and compare the mtDNA data with those from other loci, particularly the Y chromosome.

The mtDNA Landscape of the Southwestern Asian Corridor

A simple pattern underlies the mtDNA variation in this region: a west-to-east divide with a sharp boundary. Populations located west of the Indus basin, including those from Iran, Anatolia and the Caucasus, exhibit a common mtDNA lineage composition, consisting mainly of western Eurasian lineages, with a very limited contribution from South Asia and eastern Eurasia (fig. 1). Indeed, the different Iranian populations show a striking degree of homogeneity. This is revealed not only by the nonsignificant FST values and the PC plot (fig. 6) but also by the SAMOVA results, in which a significant genetic barrier separates populations west of Pakistan from those east and north of the Indus Valley (results not shown). These observations suggest a common origin of modern Iranian populations and/or extensive levels of gene flow amongst them. There is a virtual absence of both common South Asian lineages (M*, U2a, U2b, and U2c) and the more autochthonous U9, R*, R2, R5, R6, N1d, and HV2 lineages in the Anatolian/Caucasus region and Iranian plateau, whereas these lineages make up >50% of the haplogroup profile in the adjacent Indus Valley. Most of these lineages appear to be restricted to the eastern part of the corridor (fig. 1). Whereas geographical clustering and the coalescent age of U7 (~38,000 YBP; see table 2 and fig. 3) suggest that it is the most widespread local lineage of Pleistocene origin connecting the western and eastern extremes of the corridor, the Indus Valley and India show signals of an in situ differentiation of deep-rooting lineages (HV2, R2, R5, U2a, U2b, and U2c), the distribution of which appears to be limited to this region (figs. 4 and 5). All these lineages have high time depths (28,000–52,000 YBP), similar to haplogroup M* (32,000–53,000) in the region (Kivisild et al. 1999b; Quintana-Murci et al. 1999; Roychoudhury et al. 2001). 
Notably, haplogroup M* has also not penetrated west of the Indus Valley, although it is present at high frequencies in south Pakistani and Indian populations. Thus, the distribution and ages of these lineages suggest that they are the legacy of the first inhabitants of the southwestern Asian region who underwent important expansions during the Paleolithic period. It is interesting that the newly identified haplogroup U9, found in the Indus Valley, has also been observed in Ethiopia (A.T., unpublished data), supporting the link between East Africa and the southwestern and southern coasts of Asia (Kivisild et al. 1999a; Quintana-Murci et al. 1999). The absence of these lineages west of Pakistan may be due either to limited gene flow from the Indus basin or to important demographic expansions in the Fertile Crescent (including its eastern lobe, represented by present-day Iran), associated with a substantial increase in frequency and diversity of western Eurasian lineages. Geographical features such as the Dasht-e Kavir and Dasht-e Lut deserts in Iran could have acted as significant barriers to gene flow, and this is consistent with Y-chromosomal data (e.g., the distribution of lineage R-M17) from these regions (Quintana-Murci et al. 2001; Wells et al. 2001; Qamar et al. 2002).

Gene flow from the Fertile Crescent to India has, however, been more common than that from east to west (fig. 1). Eastern populations within the corridor mostly exhibit a rich variety of west Eurasian lineages at high frequencies (26%–57%), with a gradient towards the Indian subcontinent, with lower frequencies in caste (<10%) and tribal groups (<1%) (Kivisild et al. 1999a; 2003; Bamshad et al. 2001). The substantial western Eurasian presence in the Indus Valley and northwestern India may have been the result of repeated gene flow received from further west at different periods, including the first Paleolithic arrivals to the corridor region from the Middle East and subsequent dispersals associated with Neolithic urban civilizations, such as Mesopotamians and Elamites, who may have carried farming towards the eastern part of the corridor. The exact mode and tempo in which the different western Eurasian lineages reached the Indo-Gangetic plains remains to be elucidated. However, it appears that J, T1, and U3, which have been proposed as the main marker haplogroups for the Neolithic diffusion of agriculture from the Middle East to the West (Richards et al. 2000, 2002), did not play an equivalent role in the diffusion of farming toward the East. The eastern Eurasian contribution to the west, in contrast, is negligible (fig. 1), in agreement with HVS-I sequence data in Turkish populations (Calafell et al. 1996; Comas et al. 1996, 1998). This pattern may seem surprising in view of the historically documented repeated waves of Altaic-speaking nomads (e.g., Turks, Huns, and Mongols) starting in the 3rd century a.d., who traveled from east to west, imposing Altaic languages in some western regions (e.g., Anatolia and Azerbaijan), probably through an elite-dominance process. In this context, it is interesting that five of the seven individuals belonging to eastern Eurasian lineages west of the Indus Valley are Turkic-speaking. 
The east-to-west differences in the genetic contribution of the eastern invaders along the corridor may be due to the existing population densities in these regions at the time of arrival. The genetic contribution of the newcomers was strong in the sparsely populated arid lands of eastern Central Asia. Conversely, the eastern influence in western territories was much lower, since the invading eastern nomads probably found higher population densities. Our results reinforce recent Y-chromosome data from Central Asia, which show that the paternal genetic contribution of the eastern invaders is barely detectable west of Uzbekistan (Zerjal et al. 2002).

The Effects of Admixture and Drift: Demographic Events in Central Asia

Central Asians exhibit high frequencies of East Asian lineages, which are otherwise virtually absent in populations from the Indo-Gangetic region and westwards, concomitantly with a high prevalence of lineages of western Eurasian origin (fig. 1). Two explanations have been put forward: Central Asians could represent an early incubator of Eurasian variation, or their current genetic diversity could result from later admixture between western and eastern Eurasian populations. Y-chromosome data have been interpreted as indicating that Central Asian populations are amongst the oldest on the continent and were the source of at least three major migration events (Wells et al. 2001) but were also a receiver of migrations (Zerjal et al. 2002). mtDNA studies (Comas et al. 1998) based on HVS-I variation in four populations of Central Asia found that they contained both European and East Asian motifs. This was interpreted as evidence for admixture between Europeans and East Asians, a conclusion that is substantiated by our more thorough analysis. Indeed, if Central Asia had been the source of modern Eurasian diversity, one would expect to observe (i) substantial overlap between present-day western and eastern Eurasian haplogroups and (ii) extensive divergence between the HVS-I types found in Central Asia and those observed in western and eastern Eurasia. This is not the case. Our data, which take into consideration coding-region information and provide a more clear-cut phylogeography, show a major demarcation in the Eurasian landscape between European and East Asian mtDNA lineages within both the R and N branches, and with M playing virtually no role in western Eurasia. Moreover, most Central Asian HVS-I types match sequences that are observed today in either western or eastern Eurasians, suggesting recent arrival in Central Asia.

The complexity of the peopling of the region is well illustrated by the Kalash population from the Hindu Kush valleys, where western Eurasian mtDNAs reach fixation with no detectable East or South Asian lineages (fig. 1 and table 2). Their outlying genetic position is seen in all analyses (table 3 and fig. 6). Moreover, although this population is composed of western Eurasian lineages, the most prevalent (i.e., U4, (pre-HV)1, U2e, and J2) are rare or absent in the surrounding populations and usually characterize populations from Eastern Europe, the Middle East, and the Caucasus (Macaulay et al. 1999; Richards et al. 2000; Tambets et al. 2000). Also, the internal HVS-I sequence diversity within each of these haplogroups was surprisingly low: 12 of 15 samples belonging to U4 were associated with the motif 16134-16356, all (pre-HV)1 samples harbored 16362, all U2e samples were characterized by the motif 16051-16129C-16154-16248-16362, and all J2 mtDNAs showed the motif 16069-16126-16193-16274-16278. These sequence motifs are almost entirely restricted to the Kalash community, except for those associated with U4. All these observations bear witness to the strong effects of genetic drift on the Kalash population. This distinctive demographic scenario is supported by the nonsignificant negative values of Tajima’s D and Fu’s FS neutrality tests (table 3), which provide no evidence of population growth; by the clearly multimodal mismatch distribution (not shown); and by the small census size of the population (3,000–6,000). It has been suggested that this population descends from Greeks or from Slavic peoples, and they claim descent from a place called Tsyam, possibly in Syria (Robertson 1896; Decker 1992). The strong effects of drift and the small population size make genetic inference about the geographic origin of the Kalash difficult. 
However, a western Eurasian origin for this population is likely, in view of their maternal lineages, which can ultimately be traced back to the Middle East.

Correlation of Genes and Languages in the Southwestern Asian Corridor

The study of the mtDNA pool of present-day populations living in the southwest and Central Asian corridor shows that the linguistic differences in these regions (i.e., mainly Indo-European vs. Altaic) are not reflected in the patterns of mtDNA diversity. However, there are two linguistic outliers that merit further consideration: the Hunza Burusho and the Brahui. The Hunzas live mainly in the remote Hunza Valley of northern Pakistan and speak Burushaski, a language isolate of uncertain origin. Our analysis shows that the Hunza mtDNAs, like the Y haplotypes (Qamar et al. 2002), are shared with neighboring populations, particularly with southern Pakistanis (see PC plot in fig. 6). This genetic pattern could have been established before the linguistic differentiation took place, or there could have been substantial gene flow with neighboring populations. In any case, no distinctive genetic signature accompanies the linguistic and geographic isolation of the Hunza Burusho population, in agreement with recent data based on 182 autosomal microsatellites (Ayub et al. 2003).

The second linguistic outlier is the Brahui population, located in central Baluchistan, which represents a Dravidian-speaking enclave outside India. Historical records indicate that the Brahui are descendants of Turko-Iranian tribes from west Asia (Hughes-Buller 1991). Today, Dravidian languages are essentially restricted to south India and Sri Lanka, but the proto-Elamo-Dravidian hypothesis (McAlpin 1974, 1981) proposes that they originated in the Iranian province of Elam and were once spoken over a much larger area, including Iran, Pakistan, Afghanistan, and all India. The Brahui population is characterized by a high prevalence (55%) of western Eurasian mtDNAs and the lowest frequency in the region (21%) of haplogroup M*, which otherwise is common (~60%) among Dravidian-speaking Indian populations. As shown along PC1 (fig. 6), the Brahui lie in an intermediate position between Iranian and Indus Valley populations, far from the Gujaratis and even farther from Dravidian-speaking Indian groups (results not shown). These observations exclude the possibility that the Dravidian presence in Baluchistan has resulted from recent incursions of Dravidian speakers from India and show that the maternal gene pool of the Brahui is similar to that of Indo-Iranian speakers from the southwestern Asian corridor. Although the present Brahui population could represent an ancient Indian Dravidian-speaking population relocated to Pakistan, where they admixed with local populations, no historical record supports this hypothesis. The data therefore suggest that the Brahui are the last northern survivors of a larger Dravidian-speaking region predating the arrival of Indo-Iranian speakers, which reinforces the proto-Elamo-Dravidian hypothesis (McAlpin 1974, 1981).

Traces of Recent and Sexually Asymmetrical Events

The phylogeographical cross-comparison of mtDNA and Y-chromosomal data is very useful for tracing differential male and female histories. Some populations studied here (Iranian, Pakistani, and Central Asian) have been analyzed previously for Y-chromosomal variation (Quintana-Murci et al. 2001; Qamar et al. 2002; Zerjal et al. 2002). In most cases, mtDNA variation is in good agreement with the Y-chromosomal data, suggesting that the patterns reflect general population processes. A good, although surprising, example of concordance between the two systems is the Hazara, who claim to be the direct male-line descendants of Genghis Khan’s army. The presence and time depth of the Y-chromosomal haplogroup C* (xC3c) in the Hazara, along with its absence from neighboring populations, has been interpreted as the genetic legacy of Genghis Khan and his male relatives (Qamar et al. 2002; Zerjal et al. 2003). Our results indicate that the Hazara are also characterized by very high frequencies of eastern Eurasian mtDNAs (35%, table 2, fig. 1), which are virtually absent from bordering populations, suggesting that the male descendants of Genghis Khan, or other Mongols, were accompanied by women of East Asian ancestry.

In contrast to the parallelism between mtDNA and Y-chromosomal data in most populations, the Parsis and the Makrani both show a sharp contrast between these loci. The Parsis live in southeastern Pakistan, and historical records indicate an Iranian origin (Nanavutty 1997). These followers of the prophet Zoroaster started their migration from Iran in the 7th century a.d., settling in the northwestern Indian province of Gujarat around 900 a.d. and eventually moving to Mumbai in India and Karachi in Pakistan. Y-chromosome data show that they resemble Iranian populations rather than their neighbors in Pakistan: an admixture estimate of 100% from Iran was obtained (Qamar et al. 2002), supporting the historical records. However, when the Parsi mtDNA pool was compared with those of the Iranians and Gujaratis (their putative parental populations), a strong contrast with the Y-chromosomal data emerged. About 60% of their maternal gene pool belongs to South Asian haplogroups, which make up only 7% of the combined Iranian sample (table 2). The very high frequency of haplogroup M among the Parsis (55%), similar to those of Indian populations and much higher than that of the combined Iranian sample (1.7%), highlights their close affinities with India (fig. 6). Our results lead to an admixture estimate of 100% from Gujarat and provide a strong contrast between the maternal and paternal components of this population. Although the small population size of the Parsis (a few thousand) may have distorted haplogroup frequencies in this population, diversity of both Y-chromosome and mtDNA lineages remains high, making a strong drift effect unlikely. Our results therefore support a male-mediated migration of the ancestors of the present-day Parsi population from Iran to India, where they admixed with local females, or directional mating in Gujarat between Iranian males and local women, leading ultimately to the loss of mtDNAs of Iranian origin.

Another example of an unequal sex-specific contribution is seen in the so-called “Negroid” Makrani of Baluchistan. This population lives in the Makran coastal region and shows distinct African physical traits (Sultana 1995). We observed a high presence (39%) of lineages L3d, L3b, L2a, and L1a, generally restricted to sub-Saharan African populations (Chen et al. 1995, 2000; Salas et al. 2002) and otherwise present in only 4 of the remaining 877 individuals examined. The presence of African mtDNAs among the Makrani seems to be of recent origin, since the Makrani haplotypes are identical to those observed in modern sub-Saharan African populations (Salas et al. 2002), particularly in Bantu-speaking populations from Mozambique. Indeed, all but one of the Makrani L1, L2, and L3A types matched Mozambique sequences, and these were the most frequent haplotypes in the Mozambique samples (L1a2, L2a1a, and L2a1b) (Pereira et al. 2001; Salas et al. 2002). Our results contrast with the Makrani Y-chromosome profile, which is similar to that of other Pakistani populations and is dominated by western Eurasian lineages (Qamar et al. 2002). The sub-Saharan African male-specific contribution, represented primarily by Hg E-M2, occurred at only 9% in the Makrani and is also present in neighboring populations, although at a lower prevalence (2%–4%). We estimated the maternal and paternal contributions of sub-Saharan Africans to the current Makrani gene pool, using information from all haplogroups, at 12% (±7%) for the Y chromosome and 40% (±9%) for the mtDNA. These findings must be interpreted in the light of known historical data. Forced migration from Africa began in the 7th century and increased considerably during the Omani Empire. The latter formed a strong slave-trade connection between the Makran port of Gwadar, the principal ports of Oman, and ports located in East Africa, including Mozambique (Clarence-Smith 1989; Sultana 1995). 
In the 16th and 17th centuries, the Portuguese also traded between Mozambique and southwestern Asia. The African component in the Makrani community may therefore represent the genetic legacy of this slave trade. Whereas the Atlantic slave trade dealt mainly with male labor, the East African slave trade seemingly favored females over males (Lovejoy 2000). Slave women were mainly domestics and/or concubines, and children fathered by the master were freed. In addition, strong cultural barriers hindered male slaves from fathering children, a situation exacerbated by the proportion of slaves imported as eunuchs (Lovejoy 2000). As a consequence of these practices, the contribution of paternal African genes to the population is expected to be low. Indeed, the contrast between male and female African contributions observed among the Makrani strongly supports historical records of a female sex bias during the East African slave trade. Other factors, such as asymmetrical mating patterns between African women and autochthonous males during the process of genetic admixture, and/or unequal reproductive success among Makrani males, might have accelerated the loss of African Y chromosomes from the population. In this context, a similar pattern has been reported recently in the Yemeni Hadramawt population (Richards et al. 2003), geographically adjacent to East Africa, where the African maternal contribution has also been interpreted as the result of the East African slave trade. Our data not only confirm a female-biased slave trade towards the East but also show that this pattern, which includes differential mating patterns between the sexes, extended to the eastern limits of the East African slave trade.
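The logic behind such admixture estimates can be illustrated with a simple single-class calculation. The sketch below is not the estimator used in the study, which drew on information from all haplogroups; it applies the textbook two-source formula m = (p_mixed − p_other)/(p_source − p_other) to the frequencies quoted above. The function name is hypothetical, and setting the frequency of L lineages in the sub-Saharan African source to 1.0 is an assumption made for illustration.

```python
def admixture_proportion(p_mixed, p_source, p_other):
    """Two-source admixture estimate from the frequency of one lineage class."""
    if p_source == p_other:
        raise ValueError("source populations must differ in frequency")
    m = (p_mixed - p_other) / (p_source - p_other)
    # Clamp to the biologically meaningful range [0, 1].
    return min(1.0, max(0.0, m))

# Frequencies of African L lineages (L3d, L3b, L2a, L1a):
makrani_L = 0.39        # reported frequency in the Makrani
background_L = 4 / 877  # 4 of the remaining 877 individuals examined
african_L = 1.0         # assumption: L lineages fixed in the African source

m_mtdna = admixture_proportion(makrani_L, african_L, background_L)
print(f"Estimated African maternal contribution: {m_mtdna:.1%}")
```

Under these assumptions the single-class estimate falls close to the reported 40% (±9%) maternal contribution, although it carries no confidence interval and ignores lineage-level information.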

Conclusions

Our analysis of mtDNAs from the southwestern and Central Asian corridor shows that the highest variation is observed in populations located in the Indus Valley and Central Asia, highlighting this region as the place where western Eurasian lineages meet both the South Asian and eastern Eurasian genetic strata, respectively. The amalgamation of different genetic components in this area may have resulted from the successive and continuous waves of migration from diverse geographical sources at different time periods, from the early human settlements in the region after the “out of Africa” dispersal to migrations associated with the diffusion of new technologies, such as farming and/or pastoral nomadism, and accompanied by new languages, like the incursions of Indo-Iranian speakers from the northwest. In addition, the Indo-Gangetic region is characterized by the presence of autochthonous genetic footprints of Pleistocene origin and traces of recent historical events, such as the East African slave trade. This extraordinarily rich and heterogeneous genetic portrait testifies to the numerous and complex movements in the region and evinces more subtle demographic episodes in some populations, including founder effects and sexually asymmetrical events associated with differential migration patterns between males and females.

Acknowledgments

We warmly acknowledge Hans-Jürgen Bandelt, for stimulating remarks and quality check of the data; Francesca Luca, for help in data analysis; and two anonymous reviewers, for helpful and constructive criticisms. This work was supported by CNRS and a North Atlantic Treaty Organization collaborative linkage grant (LST.CLG.977507) (to L.Q.-M.). Financial support was also provided by The Wellcome Trust (to C.T.-S. and S.Q.M.), the Italian Ministry of the University (Progetti Ricerca Interesse Nazionale 2001, 2002, 2003) (to A.T., R.S., and A.C.), Progetto CNR-MIUR Genomica Funzionale-Legge 449/97 (to A.T.), Fondo Investimenti Ricerca di Base 2001 (to A.T.), Fondo d’Ateneo per la Ricerca 2002 dell’Università di Pavia (to A.T.), Progetto Finalizzato C.N.R. “Beni Culturali”(to A.S.S.-B.), Grandi Progetti di Ateneo (to R.S.), and the Istituto Pasteur Fondazione Cenci Bolognetti (to R.S.). N.A.-Z. was supported by The International Center for Genetic Engineering and Biology (Trieste) and University of Pavia fellowships.

Supplementary Material

Table A1

RFLP and Control Region Variation in Anatolian, Caucasus, and Pakistani-Karachi Samples

Haplotypeb
Haplogroup and SampleaRFLPcControl RegiondRead from Start np/End np
L2a:
 KAR04+3592h +4769e +10394c −10871z −12703t +13803e +16389g/−16390b16189 16223 16234 16249 16278 16294 16295 16390 73 143 146 152 19516000/00219
M*:
 KAR46−8005l −8783g +10394c +10397a −10871z −12703t +13366m/−13367b/+13367j −15925i +16389g/−16390b +16517e16048 16129 16223 16390 16519 7316033/00160
 KAR47−8005l −8783g +10394c +10397a −10871z −12703t +13366m/−13367b/+13367j −15925i +16389g/−16390b +16517e16048 16129 16223 16390 16519 7316001/00137
 KAR61−7859j +8249b/−8250e +10394c +10397a −10871z −12703t +15500e +16517e16223 16324 16357 16519 16527 7316004/00138
 KAR56−6850i −8882c +10394c +10397a −10871z −12703t +16517e16129 16223 16265C 16519 7316002/00136
 KAR50−5978a −7474a +9299a +10394c +10397a −10871z −12703t −16049k +16517e16051 16183C 16189 16293 16316 16519 73 147 15316001/00214
 KAR26−3337k −4990a +10394c +10397a −10598a −10871z −12703t +16517e16092 16223 16289 16342 16519 7316000/00132
 KAR16−4990a +8156k +10394c +10397a −10871z −12703t +16517e16129 16223 16519 7316034/00140
 KAR45−4990a +8156k +10394c +10397a −10871z −12703t +16517e16129 16223 16519 7316000/00130
 KAR05−4990a +8156k +10394c +10397a −10871z −12703t +16517e16129 16223 16519 7316033/00130
 KAR43−4577q +10394c +10397a −10871z −12703t −16310k +16517e16126 16223 16311 16519 7316026/00156
 KAR14−4577q +10394c +10397a −10871z −12703t +16517e16126 16223 16519 7316008/00142
 KAR22−4577q +10394c +10397a −10871z −12703t +16517e16126 16223 16519 7316000/00131
 KAR69−4577q +10394c +10397a −10871z −12703t +16517e16126 16223 16519 7316004/00138
 KAR82−4577q +10394c +10397a −10871z −12703t +15882b/−15883e −16310k +16517e16126 16223 16311 16519 7316000/00150
 KAR63−3534c/−3537a +10394c +10397a −10871z −12703t +13085k +16389g/−16390b +16517e16111 16223 16231 16272 16362 16390 16519 7316001/00232
 KAR68+9871a +10394c +10397a −10871z −11362a −12703t +16517e16129 16519 7316007/00141
 KAR86+8678a +8704k +10394c +10397a −10871z −12703t16111 16353 56+T 59del 60del 66+T 7316000/00225
 KAR90+8439a −8838e +9612k −9644a +10394c +10397a −10871z −12703t −13404l16129 16141 16223 16271 73 143 146 15216000/00160
 KAR11+8249b/−8250e +10394c +10397a −10871z +12234k (A12234G G12236A) −12703t +16517e16166del 16223 16519 7316000/00130
 KAR89+6618e −7859j +10394c +10397a −10871z −12703t −16310k +16517e16086 16145 16223 16261 16311 16519 7316000/00130
 KAR64+6618e −7859j +10394c +10397a −10871z −12703t −16310k +16517e16145 16223 16261 16266 16291 16311 16519 7316001/00134
 KAR18+6618e −7859j +10394c +10397a −10871z −12703t −13704p −16310k +16238l +16517e16145 16223 16240 16261 16311 16519 7316003/00137
 KAR85+6618e −7859j +10394c +10397a −10871z −12703t +15754c −16310k +16517e16145 16176 16223 16261 16311 16519 7316000/00144
 KAR91+4140l +10394c +10397a −10871z −12703t −16310k +16494i16129 16223 16311 16357 16497 73 146 15216000/00170
 KAR59+6333j −8616j −8882c +10394c +10397a −10871z −12703t +16517e16129 16223 16265C 16519 7316003/00128
 KAR37+3759o +10394c +10397a −10871z −12703t −15254s +16517e16223 16302 16519 7316000/00131
 KAR39+3759o +10394c +10397a −10871z −12703t −15254s +16517e16223 16302 16519 7316001/00130
 KAR13+3391e +10394c +10397a −10871z −12703t +16517e16158 16223 16234 16319 16348 16362 16519 7316003/00137
 KAR100+3391e +10394c +10397a −10871z −12703t +16389g/16390b +16517e16126 16162 16172 16223 16256 16342 16390 16519 7316000/00160
 KAR41+245l +9070l +10394c +10397a −10871z −12703t +14583c16223 16260 16318T 16325 7316001/00132
 KAR92+245l +10394c +10397a −10871z −12703t +16517e16093 16134 16223 16318C 16519 73 9316000/00131
 KAR83+245l +10394c +10397a −10871z −12703t +15487b +16517e16223 16319 16519 7316000/00133
 KAR77+10394c +10397a −10871z −12703t −16310k +16517e16129 16223 16255 16311 16519 7316000/00105
 KAR95+10394c +10397a −10871z −12703t −16310k +16389g/16390b +16517e16093 16129 16223 16311 16390 16519 7316000/00131
 KAR10+10394c +10397a −10871z −12703t +16517e16111 16223 16519 73 11416000/00130
 KAR20+10394c +10397a −10871z −12703t +16517e16129 16172 16223 16519 73 14616003/00220
 KAR48+10394c +10397a −10871z −12703t +16517e16129 16172 16223 16519 73 14616002/00210
 KAR19+10394c +10397a −10871z −12703t +16517e16129 16172 16223 16519 7316003/00135
 KAR65+10394c +10397a −10871z −12703t +16517e16145 16223 16519 7316004/00138
 KAR32+10394c +10397a −10871z −12703t +16517e16223 16230 16234 16519 7316003/00123
 KAR27+10394c +10397a −10871z −12703t +16517e16223 16234 16519 7316000/00130
 KAR70+10394c +10397a −10871z −12703t +16517e16223 16234 16519 7316002/00136
 KAR58+10394c +10397a −10871z −12703t +16517e16223 16289 16519 7316003/00137
 KAR15+10394c +10397a −10871z −12703t +16517e insertion of 11 Cs at np 590016223 16234 16519 7316000/00132
 KAR12+10394c +10397a −10871z −12703t +12810k +16517e16223 16270 16274 16292 16319 16352 16519 7316000/00130
 KAR97+10394c +10397a −10871z −12703t +12810k +16517e16223 16270 16319 16352 16519 7316000/00132
 KAR71+10394c +10397a −10871z +11001n/+11002f +16275k +16389g/16390b16223 16256 16275 16327A 16390 7316003/00137
 GEO15+10394c +10397a −10871z −12703t −13103g/+13104j16223 16362 16526 73 26316000/00280
D:
 ANA16−5176a +10394c +10397a −10871z −12703t16174 16223 16362 16468 7315997/00200
G:
 KAR36+4830n/+4831f +10394c +10397a −10871z −12703t −15925i16223 16274 16362 73 19516000/00250
 ANA53+4830n/+4831f +10394c +10397a −10871z −12703t +15494c16223 16227 16278 16362 73 152 207 26316000/00290
Z:
 KAR53+2349j −2776b −4360g +10394c +10397a −10871z −12703t −16297u −16310k +16517e16185 16223 16260 16298 16311 16519 73 151 15216004/00218
A:
 ANA07+663e −12703t16182C 16183C 16189 16223 16290 16319 16362 73 93 150 152 19516000/00215
 ANA50+663e −5823a −12703t16189 16223 16290 16319 16362 73 15216000/00205
N1b:
 BAL01−1715c +8249b/−8250e +10237v −11362a −12703t +16176j +16389g/−16390b+16517e16129 16145 16176G 16223 16309 16382 16390 1641316031/16496
 CHE14−1715c +8249b/−8250e +10237v −11362a −12703t +16176j +16389g/−16390b +16517e16145 16176G 16223 16291 16390 16519 7316003/00151
 ANA09−1715c +8249b/−8250e +10237v −11362a −12703t +16176j +16389g/−16390b +16517e16145 16176G 16223 16309 16390 16519 73 151 15216000/00200
 CHE17−1715c +8249b/−8250e +10237v −11362a −12703t +16176j +16389g/−16390b+16517e16145 16176G 16223 16390 16519 73 146 150 15215997/00200
 BAL121715c +8249b/−8250e +10237v −11362a −12703t +16176j +16389g/−16390b+16517e16145 16176G 16223 16390 16519 73 146 150 152 19915997/00200
N1d:
 KAR30−951j −1715c −5003c −8616j +10237v −12703t16223 16301 16356 7316011/00141
 KAR62−951j −1715c −5003c −8616j +10237v −12703t16223 16301 16356 7316004/00136
 KAR31−951j −1715c −5003c −8616j +10237v −12703t −16310k16223 16301 16311 16356 7316000/00130
I:
 GEO08−1715c +8249b/−8250e −8572e +10032a +10237v +10394c −12703t −16310k +16389m/+16390j/−16390b +16517e16129 16223 16259 16264 16311 16319 16362 16391 16519 73 199 20415997/00225
 ANA37−1715c +8249b/−8250e +10032a +10237v +10394c −12703t +16389m/+16390j/−16390b +16517e16086 16129 16223 16391 16519 73 152 19916000/00200
W
 KAR81+8249b/−8250e −8994e −10830g −12703t −13704p +16517e16183C 16189 16223 16292 16354 16519 73 143 189 194 195 196 204 20716000/00220
 GEO01+4092e +8249b/−8250e −8994e −12703t +16517e16173 16223 16292 16325 16352 16513+T 16519 7316000/00120
 GEO19+4092e +8249b/−8250e −8994e −12703t +16517e16192 16223 16292 16325 16362 16519 73 185 18915997/00190
 GEO20+4092e +8249b/−8250e −8994e −12703t +16517en.d-
X
 KAR29−1715c −12703t +14465s −16310k +16517e16172 16189 16278 16311 16519 73 19516001/00208
 BAL16−1715c −7853o +11329a −12703t +14465s −16310k +16517e16183C 16189 16223 16265 16278 16311 16519 73 153 19515997/00208
 GEO05−1715c −3944l −12703t +14465s +16517e16183C 16189 16223 16278 16519 73 153 19516000/00245
 ANA38−1715c −12703t +14465s+16517e16108 16183C 16189 16209 16278 16325 16519 73 153 19515996/00205
 BAL20−1715c −12703t +14465s +16517e16093 16183C 16189 16223 16278 16519 73 153 19516000/00208
 BAL19−1715c −12703t +14465s +16517e16104 16189 16223 16278 16519 73 153 19516000/00215
 ANA27−1715c −12703t +14465s +16517e16189 16218 16223 162781651916025/16519
 ANA32−1715c −12703t +14465s +16517e16189 16223 16278 16519 73 153 19516000/00200
 CHE18−1715c −12703t +14465s +16303k +16517e16129C 16189 16223 16278 16304 16519 7316000/00124
R*:
 KAR51−5584a +10143a +10289e −12406o/−12406h −13404l +13826g +16145e16086 16146 16260 16261 16319 16362 73 20716000/00250
 KAR80+16494i +16517e16136 16292 16497 16519 7316000/00132
R2:
 KAR72+4216q +4769a −14304a −16310k +16517e16071 16108 16111 16278 16311 16519 7316004/00138
 GEO11+4216q +4769a −14304a16071 16355 16357 73 8116003/00120
R5:
 KAR94−8592j +13633e −16303k −16310k +16517e16266 16304 16311 16356 16519 16524 73 15216000/00250
 KAR76+4051k −8150i −8592j −16303k +16517e16266 16304 16519 73 9316000/00132
R6:
 KAR54−12282a +16389g/−16390b +16517e16129 16213 16259A 16274 16319 16362 16390 16519 7316001/00115
 KAR52+10394c −12282a +15112a +16517e16129 16213 16362 16519 7316004/00138
 KAR73−12282a +16517e16114G 16129 16362 16519 7316004/00138
(pre-HV)1:
 ANA12+11718e16126 1636216023/16383
 KAR87+11718e +16517e16093 16126 16362 16519 57+C 64 19516000/00225
 KAR24+10727c +11718e16126 16362 60+T 64 152 15316000/00250
 ANA25−9266e +11718e −15812k16126 16168 16264 16295 16362 6415997/00200
 BAL11+4769a −5176a +11718e16126 16355 16362 58 64 14615997/00200
 CHE05+11074c +11718e +16517e16114 16126 1636216032/16420
HV*:
 ANA02+6582j +11718e −14766uCRS16000/00140
 KAR21+1973g +11718e −14766uCRS16000/00132
 KAR23+11718e −14766uCRS16003/00137
 KAR55+11718e −14766u1631916002/00136
 ANA19−5584a +11718e −14766u −16310k1631116023/16383
 ANA39+11718e −14766u −16208k16189 16209 16320 16356 44+C 57+C 21416000/00242
HV2:
 KAR93+11718e −13704p −14766u +16517e insertion of 11 Cs at np 590016217 16519 73 15216000/00160
H:
 KAR78−951j −7025a +11718e −14766u1635416000/00132
 KAR79−951j −7025a +11718e −14766u1635416000/00160
 BAL09−951j +4769a −5176a −7025a +11718e −14766u +16517e16292 16354 1651915997/00200
 KAR06−7025a +11718e −14766u16111 16256 1625716008/00142
 KAR02−7025a +11718e −14766u −14858n/−14859f −16297u +16517e16218 16297A 16519 7316000/00130
 KAR03−7025a +11718e −14766u −14858n/−14859f −16297u +16517e16218 16297A 16519 7316000/00130
 KAR60−5003c −7025a +11718e −14766uCRS16004/00138
 ANA51−7025a +11718e −14766u −14869j/+14869g16148 16256 1631916023/16383
 KAR49−1004o −7025a −11365a +11718e +13366m/−13367b/+13367j −14766u −14869j/+14869g +16517e1651916002/00125
 KAR08−1004o −7025a +11718e −14766u −14869j/+14869g +16517e16243 1651916008/00142
 KAR98−1004o −7025a +11718e +12946c/+12949n/+12950f −14766u −14869j/+14869g +16517e16243 1651916000/00160
 KAR96−1004o −6260e −7025a +11718e −14766u −14869j/+14869g +16517e 9bp del1651916000/00131
 ANA21+4793e −7025a −8729j −10364e +11718e −14766u +16517e16221 16519 152 26316000/00290
 KAR75+4793e −7025a +11718e −14766u +16517e1651916007/00141
 ANA23+4332b −5837e −7025a +11718e −14766u −16303k16304 73 15116011/00220
 KAR09+4332b −7025a −8150i +8439a +10407k −10631c/+10634e +11718e −14766u −16303k1630416000/00130
 ANA01−7025a −9753g +11718e +14279e −14766u16328A15997/00200
 ANA30−7025a −9380f +11718e −14766u16362 1648216000/00200
 CHE07−7025a −9380f +11718e −14766u +16517e16297 16362 1651916002/00124
 GEO12−7025a −9052n/−9053f +11718e −14766u −16303k16304 1633516003/00075
 ANA36−7025a +9157e +11718e −14766u +16517e16172 16266 16519 9315997/00200
 ANA31−7025a +11718e −14766u16232 16260 1627815997/00200
 BAL14−7025a +11718e −14766u16324 15015997/00200
 ANA52−7025a +11718e −14766u1636616023/16383
 CHE16−7025a +11718e −14766u55 5715997/00204
 GEO02−7025a +11718e −14766u −16310k16311 1652715997/00200
 BAL10−7025a +11718e −14766u +16517e16468 16519 18315997/00200
 CHE10−7025a +11718e −14766u +16517e1651915997/00200
 CHE03−7025a +11718e −14766u +16517e1651915997/00200
 ANA15−7025a +11718e −14766u +16517e16086 1651916024/16540
 ANA42−7025a +11718e −14766u +16517eCRS16023/16383
 ANA46−7025a +11718e −14766u +16517eCRS16023/16383
 ANA03−7025a +11718e −14766u +15317c +16517e16519 1654616026/00019
 ANA17−7025a −11326c +11718e −14766u −16303k +16517e16271 16304 1651915997/00200
 BAL15−5176a −7025a +11718e −14766u1619216000/00200
 GEO03+54k −7025a +11718e −14766u57 57+G15997/00200
 BAL05+5295a −7025a +11718e −14766u −16303k −16310k16304 16311 16400 16527het16000/00040
 GEO17+3744e +4732k/+4735k −7025a +11718e −14766u +16517e16244 16292 16519 15215997/00200
U1:
 ANA18−4990a +12308g −13103g/+13104j +14068l16008 16182C 16183C 16189 16249 7315997/00200
 ANA34−4990a +12308g −13103g/+13104j +14068l −16310k16166del 16183C 16189 16249 16271 16311 73 146 15216000/00200
 BAL02−4990a +12308g −13103g/+13104j +14068l16086 16183C 16189 16245 16249 7316000/00236
 CHE08−4990a +12308g −13103g/+13104j +14068l16183C 16189 16249 7316000/00218
 ANA04−4990a −7417l −7461l +12308g +8615g/−8616j −13103g/+13104j +14068l16183C 16189 16249 16269+A 7316018/00200
 BAL13−4990a +8911k/+8944k +12308g −13103g/+13104j +14068l16129 16183C 16189 16224 16249 16288 16295 16527 73 150 19516000/00208
U2a:
 KAR34−5552c −7055a +12308g −16049k −16303k −16310k +16517e16051 16206C 16230 16304 16311 16519 7316001/00132
 KAR42+5260b/−5261e +8569c/−8572e +12308g −16049k −16310k +16517e16051 16154 16206C 16230 16311 16519 7316026/00156
 KAR33+5260b/−5261e +8569c/−8572e +12308g −16049k −16310k +16517e16051 16154 16206C 16230 16311 16519 7316000/00250
 KAR84+5260b/−5261e +8569c/−8572e +10394c +12308g −16049k −16310k +16517e16051 16154 16206C 16230 16311 16519 7316000/00140
 KAR40+3391e +9135j +12308g +14465s −16049k −16310k16051 16206C 16242 16291 16311 7316001/00132
U2b:
 KAR35−8994e +12308g −15047e −16049k −16208k16051 16086 16209 16239 16352 16353 16355 7316001/00132
 KAR17+12308g −15047e −16049k16051 16086 16129 16259A 16267 16291 16326 7316001/00108
 KAR01+12308g −15047e −16049k −16208k16051 16209 16239 16244 16352 16353 73 146 152 23416000/00250
 KAR44+12308g −15047e −16049k −16208k16051 16209 16239 16352 16353 7316000/00130
U2c:
 KAR25−5789l +8020j/+8022l +8197a +12308g −15060j −16049k +16517e16051 16179 16234 16240C 16519 7316000/00132
 KAR07+4720b −5789l +8020j/+8022l +10806g +12308g −15060j −16049k16051 16189 16234 73 146 15216000/00250
U2e:
 ANA24−984c +12308g +13730g +15907k −16049k16051 16129C 16182C 16183C 16189 16256 16266 73 15216000/00208
 BAL18−984c +12308g +13730g +15907k −16049k +16517e16030 16051 16129C 16189 16256 16519 73 147 15216000/00280
 GEO10+12308g +13730g +15907k −16049k +16517e16051 16129C 16362 16519 7316000/00200
U3:
 ANA26−104i +12308g +16517e16189 16343 1651915997/00054
 CHE09+2293a +9189k/+9192k −9266e +12308g +14138j16193 16249 16343 16413A16023/16500
 GEO14+12308g +14138j16343 73 15015997/00200
U4:
 ANA45+4643k −4808c/−4810g +4984b +10907c +11329a +12308g +16517e16179 1635616023/16383
 CHE11+4643k +7702k −7853o +11329a +12308g +16517e16356 16519 73 19515997/00200
 CHE06+4643k +11329a +12308g +16517e16356 16519 73 152 19515997/00200
U5:
 ANA05+12308g −16310k16174 16189 16192 16270 16311 16368 73 15016021/00219
 ANA49+12308g +16398e16145 16183C 16189 16192 16256 16270 16399 73 197T16023/00200
 GEO06+12308g −16303k +16398e16169 16192 16256 16270 16304 16399 73 15015997/00200
 ANA28+12308g +16398e16256 16270 16399 7316000/00200
 ANA35+12308g +16143l16144 16189 1627016000/00070
U7:
 KAR66+12308g +16517e16086 16309 16318T 16519 73 151 15216003/00212
 KAR67+12308g +16517e16140 16214 16309 16318T 16362 16519 7316007/00141
 KAR28+12308g +16517e16173 16309 16318T 16362 16519 73 15216000/00250
 ANA10+12308g +16517e16309 16318T 16343 16519 73 151 15215997/00200
 KAR74+12308g +15754c +16517e16207 16309 16318C 16519 7316004/00138
 KAR38+12308g +12372a (G12372A A12373G) +16517e16172 16309 16318C 16519 7316001/00131
U8b:
 ANA33−9052n/−9053f +12308g −16208k16092 16169 16189 16192 16210 16234 16259 7315997/00191
U9:
 KAR57−1004o −6383e −11350a +12308g16242 73 19516000/00250
K:
 ANA14−8391e −9052n/−9053f +10394c +12308g −13957e −16310k +16517e16093 16224 16311 16519 7316000/00080
 GEO07−9052n/−9053f +10394c +12308g −16310k +16517e16145 16224 16311 16519 73 19815997/00200
 CHE02−9052n/−9053f +10394c +12308g −16310k +16517e16224 16261 16271 16311 16519 7315997/00200
 GEO09−9052n/−9053f +10394c +12308g −16310k +16517e16224 16298 16311 16519 7315997/00200
 ANA20−9052n/−9053f +10394c +12308g −16310k +16517e16224 1631116023/16383
 ANA11−9052n/−9053f +10394c +12308g −16208k −16310k +16517e16093 16145 16209 16224 1631116023/16383
 CHE13−6260e −9052n/−9053f +10394c +12308g −16310k +16517e16172 16224 16311 16352 1651916003/16550
 GEO04+8249b/−8250e −9052n/−9053f +10394c +12308g −16310k +16517e16224 16234 16260 16311 16318T 16519 73 15215997/00200
J1:
 BAL04+4216q +10394c −13704p −16065g +16517e16069 16126 16145 16261 16519 7316022/00120
 KAR99+4216q −7598f +10394c −13704p −16065g16069 16126 16145 16172 16261 73 146 15216000/00160
 ANA40+4216q +8466a/+8470a +10394c −13704p −16065g16069 16126 16145 16172 16183C 16189 16222 16235 16249 16261 7316000/00200
 BAL06+4216q +8466a/+8470a +10394c −13704p +15500j −16065g16069 16126 16145 16189 16222 1626115997/16436
 CHE12+4216q +10394c −13704p −16065g16069 7316011/00075
 ANA13+4216q +10394c −13704p −16065g −16310k16069 16126 16291 1631116023/16383
 ANA08+4216q +10394c −13704p −16065g +16517e16069 16126 16519 73 185 18815997/00200
 BAL07+4216q +10394c −13704p −16065g +16517e16069 16126 16193 16214 16519 7316000/00200
 BAL17+4216q +10394c −13704p −16065g +16517e16069 16126 16193 16519 64 7316023/00130
J2:
 ANA48+4216q −7474a +10394c +11001n/+11002f −13704p −15254s −16065g16069 16126 1616916023/16383
T*:
 KAR88+4216q +4914r −13259o +13366m/−13367b/+13367j −13704p +15606a −15925i +16517e16126 16294 16519 7316000/00130
 ANA44+4216q +4914r −6260e +13366m/−13367b/+13367j −13704p +14068c +15606a +15925i −16310k16126 16292 16294 1631116023/16383
 CHE04+4216q +4914r −12406h −13259o +13366m/−13367b/+13367j +15606a −15925i +16517e16126 16222 16294 16296 16519 7315997/00200
 ANA22+4216q +4914r +9253e +13366m/−13367b/+13367j +15606a −15925i +16517e16126 16163 16183C 16189 16243 16294 16519 73 15016023/00220
 GEO13+4216q +4914r +8093a +8270k +13366m/−13367b/+13367j +15606a −15925i +16517e16126 16183C 16189 16291 16294 16295 16296 16519 7316000/00204
 ANA43+3391e +4216q +4914r +13366m/−13367b/+13367j +14277c +15606a −15925i −15812k +15346k +16517e16126 16235 16294 1629616023/16383
 CHE15+4216q +4914r +13366m/−13367b/+13367j +15606a −15925i +16517e16126 16163 16172 16189 16243 16294 16519 7316008/00113
 GEO18+4216q +4914r +13366m/−13367b/+13367j +15606a −15925i +16517e16126 16294 16296 16519 73 20015997/00203
 GEO16+4216q +4914r +13366m/−13367b/+13367j +15606a −15925i +16517e16126 16163 16189 16243 16294 16519 7315997/00101
 ANA06+4216q +4914r +13366m/−13367b/+13367j +15606a −15925i +16517e16126 16294 16296 16519 7315997/00200
 CHE01+4216q +4914r +13366m/−13367b/+13367j +15606a −15925i +16517e16126 16294 16296 16519 7315997/00200
T1:
 BAL03+4216q +4914r +13366m/−13367b/+13367j +15606a −15925i −16310k +16517e16126 16163 16186 16189 16294 16311 1651916000/00019
 BAL08+4216q +4914r +13366m/−13367b/+13367j +15606a −15925i +16517e16126 16163 16186 16189 16294 16355 16356 1651916014/00030
a ANA = Anatolian; BAL = Balkarian; CHE = Chechens; GEO = Georgians; KAR = Pakistani from Karachi.
b Coding-region sites and control-region motifs diagnostic of each haplogroup are shown in boldface.
c Sites are numbered from the first nucleotide of the recognition sequence. A plus sign (+) indicates the presence of a restriction site, a minus sign (−) the absence of such a site. The explicit indication of the presence/absence of a site implies the absence/presence in haplotypes not so designated. The restriction enzymes used in the analysis are designated by the following single-letter codes: a, AluI; b, AvaII; c, DdeI; e, HaeIII; f, HhaI; g, HinfI; h, HpaI; i, MspI; j, MboI; k, RsaI; l, TaqI; m, BamHI; n, HaeII; o, HincII; p, BstOI; q, NlaIII; r, BfaI; s, AccI; t, MboII; u, MseI; v, HphI; z, MnlI. A slash (/) separating states indicates the simultaneous presence or absence of restriction sites that can be correlated with a single-nucleotide substitution.
d Only those nucleotide positions that differ from the Cambridge Reference Sequence (CRS) (Anderson et al. 1981; Andrews et al. 1999) are shown. Mutations are transitions, unless the base change is specified explicitly. A plus sign (+) indicates an insertion, and “del” indicates a deletion.
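The restriction-site notation described in footnote c can be read mechanically. The following is a minimal sketch, not part of the original analysis: the function name `parse_rflp` and the tuple layout are assumptions made for illustration, while the enzyme codes are copied from the footnote. It handles a single site entry such as `+8249b/−8250e` and accepts either the typographic minus sign (−) or an ASCII hyphen.

```python
import re

# Single-letter enzyme codes, copied from footnote c of the table.
ENZYMES = {
    "a": "AluI", "b": "AvaII", "c": "DdeI", "e": "HaeIII", "f": "HhaI",
    "g": "HinfI", "h": "HpaI", "i": "MspI", "j": "MboI", "k": "RsaI",
    "l": "TaqI", "m": "BamHI", "n": "HaeII", "o": "HincII", "p": "BstOI",
    "q": "NlaIII", "r": "BfaI", "s": "AccI", "t": "MboII", "u": "MseI",
    "v": "HphI", "z": "MnlI",
}

# Sign (+, ASCII hyphen, or typographic minus), position, enzyme letter.
SITE = re.compile(r"([+\-−])(\d+)([a-z])")


def parse_rflp(entry):
    """Parse one site entry into a list of (present, position, enzyme).

    Sites joined by a slash are returned together, because the footnote
    states that they reflect a single nucleotide substitution.
    (Hypothetical helper, for illustration only.)
    """
    sites = []
    for part in entry.split("/"):
        match = SITE.fullmatch(part.strip())
        if match is None or match.group(3) not in ENZYMES:
            raise ValueError(f"unrecognised site: {part!r}")
        sign, position, code = match.groups()
        sites.append((sign == "+", int(position), ENZYMES[code]))
    return sites


print(parse_rflp("+8249b/−8250e"))
```

For the entry shown, the sketch reports a gained AvaII site at 8249 together with a lost HaeIII site at 8250, which the footnote treats as one underlying substitution.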

Table A2

Haplogroup Affiliation and Sequence Variation of the 702 Individuals Analyzed in the Present Study

Columns: Haplogroup or Subcluster b; HVS-I Sequence Variation c; number of individuals in each Population a
Population a: PE TI GI MA KI LU BA BR PA SI PT MK HZ HU KL GJ UZ TK KT SH
Sample size: 42 40 37 21 20 17 39 38 44 23 44 33 23 44 44 34 42 41 32 44
L1a148 172 187 188G 189 223 230 311 3201
L2a182C 183C 189 223 278 290 294 3092
L2a223 274 278 286 294 3091
L2a223 278 286 294 3091
L3b093 124 223 278 3622
L3b124 223 278 311 3621
L3d124 212 223 3191
L3d124 223 318T 3194
L3d124 223 31912
M*038 178 223 2881
M*051 183C 189 223 3191
M*066 126 154 2231
M*086 209 223 2781
M*093 129 223 311 3571
M*093 223 23411
M*103C 2231
M*108 129 223 291 2981
M*111 168 189 192 223 264 275 300 3521
M*111 223 2391
M*126 147 2234
M*126 154 2232
M*126 162 172 189 223 2561
M*126 169 2231
M*126 221 223 3111
M*126 223123
M*126 223 266 3111
M*126 223 2891
M*126 223 311221
M*126 2892
M*127 223 266 3111
M*129 172 2231
M*129 189 223 249 3111
M*129 2234512
M*129 223 265C15
M*129 223 3041
M*129 223 3111
M*129 223 311 3621
M*145 176 223 261 266 291 31111
M*145 176 223 261 290 3111
M*145 176 223 261 31111
M*145 223 243 261 3111
M*166del 22311
M*168 2391
M*179 223 3111
M*179del 2231
M*179del 223 288A 3021
M*184 223 298 3191
M*184 223 3111
M*185 223 289 3621
M*188 223 231 3621
M*2231
M*223 2342
M*223 234 2741
M*223 234 295G 311 3201
M*223 270 319 3521121
M*223 275 327A 36211
M*223 289111
M*223 2951
M*223 3045
M*223 304 3592
M*223 311211
M*223 311 3621
M*223 318T1
M*223 324 3571
M*223 360 3631
C093 223 261 288 2983
C093 223 298 3271
C129 223 298 3271
C223 239 298 327 3573
C223 239 3271
C223 239 327 3573
C223 297 298 301 3271
C223 298 311 3271
C223 298 3271
C223 298 327 344A 3571
D042 093 214 223 256 3621
D092 170 223 265C 311 3621
D129 174 223 239 3621
D129 223 249 311 3621
D153 221 223 290 301 3621
D172 182C 183C 189 223 3621d
D223 224 245 292 3621
D223 242 3621
D223 245 3621
D223 245 362 36814
D223 3622
G223 227 265C 278 3621
G223 227 274 278 3621
Z111 136 223 260 2982
Z185 209 260 2981
N*086 172 187 189 217 2231
N*0921
N*093 2921
N*111 144 223 261 3111
N*148 2231
N1093 223 292 309 3111
N1a147A 172 223 248 320 3551
N1b075 145 148 176G 186 223 3111
N1b126 145 176A1
N1b145 176G 209 2231
N1b145 176G 2231
N1b145 176G 223 3111
N1c184 201 223 265 2861
N1d223 301 356111
N9a093 111 129 223 257A 2611
N9a111 129 223 257A 2612
N9a223 257A 2611
I129 168 172 173 2231
I129 178 223 296 3111
I129 183C 189 223 3111
I129 186 223 2711
I129 22322
I129 223 3111111
A125 223 274 290 311 319 3621
A223 242 278 290 3191
A223 242 290 31911
A223 290 319 36221
W093 192 223 292 325 3261
W129 223 29211
W145 183C 189 223 292 3201
W145 189 223 291
W172 223 2921
W183C 189 223 292 3201
W189 223 2921
W189 223 292 325 3551
W192 223 292 311 3251
W192 223 292 3251
W209 223 255 292 381G2
W212 223 265C 292 3031
W223 286 2921
W223 2921111
X086 189 223 278 3091
X162 179 183C 189 223 2781
X183C 189 223 27811
X189 223 2781
R*093 126 325 355 3621
R*093 179 227 245 266 278 3621
R*172 220C 265 298 3621
R*179 227 245 266 278 3621
R*3042
R*3621
R*CRS1
R1278 3113
R2071331
R2071 2341
R2071 355 3571
R5093 266 3041
R5266 304 311 35611
R5266 304 325 3561
B086 136 182C 183C 189 2172
B093 140 182C 183C 189 217 274 3351
B136 154 183C 193+C 217 2181
B136 183C 189 2171
B140 182C 183C 189 2431
B182C 183C 189 2171
F134 182C 183C 189 232A 249 3041
F172 183C 189 232A 249 304 3111
F183C 189 232A 249 304 3111
F183C 189 3041
(pre-HV)1093 126 355 3621
(pre-HV)1126 270 3622
(pre-HV)1126 287 3621
(pre-HV)1126 301 3622
(pre-HV)1126 355 3621
(pre-HV)136210
HV*0371
HV*086 183C 1891
HV*0921
HV*155 189 311 3161
HV*158T 3111
HV*17411
HV*184 3571
HV*18912
HV*189 2231
HV*193 3111
HV*223 3241
HV*230 3251
HV*2601
HV*2951
HV*311111
HV*3171
HV*3192
HV*327A 3621
HV*3561
HV*3622
HV*CRS211
HV1067 179 3551
HV1067 35511
HV2093 2171e
HV2129 217 271 3111
HV2130 217 2431
HV2168 189 217 2871
HV2189 217 3431
HV2214 217 335231
HV22171e1e
HV2217 243 30911
HV2217 3094
H0861
H086 3111
H093 1291
H093 183C 1891
H093 3111
H104 300 325 3621
H111 167 288 304 3621
H111 239 3621
H124 1842
H1341
H145 2271
H168 2391
H168 2881
H1721
H172 2091
H172 274 319 3621
H1781
H1841
H1881
H189 356 3621
H193 3541
H201 256 319 3621
H20921
H218 3281
H223 2601
H24336
H256 35211
H256G1
H2611
H2641
H291 3241
H294 30411
H300 325 3621
H30411
H304 3111
H311211
H318T1
H3241
H3541111
H3621111
H3662
HCRS273213221111423
V153 2981
U*093 1891
U*3621
U1129 169 183C 189 249 2741
U1145 182C 183C 189 2491
U1182C 183C 189 206 2491
U1182C 183C 189 249113
U1182C 183C 189 249 3111
U1183C 185 189 2491
U1183C 189 2491
U1189 2491
U2*051 093 184 189 234 294 3421
U2a051 093 206C1
U2a051 129 206C 3621
U2a051 145 206C 31911
U2a051 172 206C11
U2a051 181 206C1
U2a051 206C111
U2a051 206C 215 230 304 3111
U2a051 206C 230 304 3111
U2b0511
U2b051 168 2101
U2b051 189 209 239 3521
U2b051 209 239 244 352 3531
U2b051 209 239 311 352 3534
U2b051 209 239 352 3531
U2b051 2391
U2b051 239 352 3531
U2c051 126 189 234 247 264 3041
U2c051 189 2341
U2c051 234 2471
U2c051 234 304 317 343 360 3621
U2c051 247 254 3551
U2e051 092 129C 153 182C 183C 189 261 3621
U2e051 092 129C 182C 183C 189 3621
U2e051 092 129C 183C 189 248 3621
U2e051 129C 145 183C 235 3621
U2e051 129C 154 248 3627
U2e051 129C 183C 189 223 256 3621
U2e051 129C 183C 189 3111
U2e051 129C 183C 189 311 3621
U2e051 129C 189 3621
U3086 3431
U3093 3431
U3104 266A 311 3431
U3111 3431
U3129 3431
U3168 343 355 3621
U3172 245 3431
U3203 3431
U334311
U4092 311 3561
U4134 172 3561
U4134 310 311 355 3561
U4134 35612
U4134 356 3621
U4136 3562
U4154 356 3621
U4179 3561
U4287 3561
U4298 356 3621
U43566121
U4356 3621
U5093 189 2701
U5114A 192 256 270 2941
U5114A 256 270 2941
U5129 192 256 270 2921
U5192 256 2701
U5192 256 270 3041
U5192 256 270 3621
U5256 270111
U5256 270 3091
U7069 227 278 318C 3591
U7092 278 309 318T1
U7092 278 318T1
U7093 129 318T 3621
U7111 129 318T1
U7126 148 309 318T1
U7129 146 318T11
U7129 309 318T1
U7129 318T1
U7176 309 318T1
U7189 207 309 318C1
U7189 207 318C1
U7192 309 318T1
U7207 309 318T1
U7233 309 318C1
U7243 309 318T1
U7309 318C1
U7309 318T211111
U7318C11
U7318T1
U7318T 35211
U8b066 129 169 183C 189 234 3111
U8b111 172 183C 189 3111
U8b129 183C 189 2661
U8b172 189 234 3111
U9051 3553
U9263 3551
K093 176 213 224 3111
K093 209 224 3113
K093 224 295 3111
K093 224 31111
K093 224 311 3192
K111 224 3111
K129 224 3111
K172 224 301 3111
K216T 224 311 3201
K224 269 3111
K224 311211
K224 311 3191
K224 311 32011
J1051 069 126 1931
J1069 086 126 145 2611
J1069 086 145 2611
J1069 092 126 145 172 222 261 2711
J1069 111A 126 145 222 261 3111
J1069 1261
J1069 126 145 163 222 261 296 3111
J1069 126 145 172 189 222 2611
J1069 126 145 172 222 261111
J1069 126 145 172 222 261 2781
J1069 126 145 172 222 261 3621
J1069 126 145 172 2611
J1069 126 145 185 2612
J1069 126 145 222 235 26123
J1069 126 145 222 235 261 3111
J1069 126 145 222 256 261 278 3651
J1069 126 145 222 261121
J1069 126 145 222 261 278 2871
J1069 126 145 223 2611
J1069 126 145 256 2611
J1069 126 145 261111
J1069 126 145 261 263+A1
J1069 126 145 261 2741
J1069 126 145 261 2781
J1069 126 145 261 2901
J1069 126 185 2981
J1069 126 1931112
J1069 126 193 2661
J1069 126 193 274 3431
J1069 126 2611
J1069 145 2611
J1126 2211
J2069 126 145 183C 1891
J2069 126 148 193 3041
J2069 126 193 274 2784
J2069 126 193 278 2951
J2069 126 239 3661
T*086 126 294 2961
T*111A 126 292 294 2961
T*126 153 294 2961
T*126 163 209 2941
T*126 163 294 3091
T*126 172 183C 189 2941
T*126 189 2941
T*126 189 294 296 3041
T*126 274 292 2941
T*126 287 294 304 3601
T*126 292 2941
T*126 292 294 29611
T*126 294121
T*126 294 29612212
T*126 294 296 3041
T*126 294 296 3621
T*126 294 302 3421
T*126 294 3041
T*126 296 362122
T1093 126 163 186 189 2942
T1126 163 172 186 189 294 2981
T1126 163 186 189 193 2941
T1126 163 186 189 239 243 294 3621
T1126 163 186 189 2941111
T1126 163 186 189 294 32521
T1126 163 186 189 294 3601
a Population codes as in table 1.
b Haplogroup affiliation was determined according to the HVS-I motifs and RFLP diagnostic markers shown in figure 2.
c Mutation sites are given minus 16,000 (for example, 223 denotes nucleotide position 16223).
d This sample, assigned to haplogroup D and harboring the RFLP diagnostic mutation of this lineage (−5176 AluI), does not present the mutation +10397 AluI characterizing the macro-haplogroup M.
e These samples, assigned to haplogroup HV2 and harboring the HVS-I mutation 16217, which is diagnostic of this lineage, do not present the coding-region site +9336 RsaI.
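Because the HVS-I sites in this table are abbreviated by subtracting 16,000, it can help to expand a motif back to full nucleotide positions. The sketch below is illustrative only: `expand_hvs1` is a hypothetical helper, and it assumes the motif has already been separated from the haplogroup label and the per-population counts, which run together in the rows above. Suffixes are kept as written (an explicit base such as `G`, `del`, or an insertion such as `+C`); an empty suffix means that no base change was specified.

```python
import re

# Position digits, optionally followed by "del", an insertion (+C), or a base.
HVS_SITE = re.compile(r"(\d+)(del|\+[ACGT]+|[ACGT])?")


def expand_hvs1(motif, offset=16000):
    """Expand an abbreviated HVS-I motif into (position, change) pairs.

    "CRS" means identity with the reference sequence, so no sites are
    returned. (Hypothetical helper, for illustration only.)
    """
    if motif.strip() == "CRS":
        return []
    sites = []
    for token in motif.split():
        match = HVS_SITE.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised site: {token!r}")
        position = int(match.group(1)) + offset
        sites.append((position, match.group(2) or ""))
    return sites


print(expand_hvs1("148 172 187 188G 189 223"))
```

Leading zeros are handled by the integer conversion, so `051` expands to position 16051.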

Electronic-Database Information

References

Alves-Silva J, da Silva Santos M, Guimaraes PE, Ferreira AC, Bandelt HJ, Pena SD, Prado VF (2000) The ancestry of Brazilian mtDNA lineages. Am J Hum Genet 67:444–461 [PMC free article] [PubMed]
Anderson S, Bankier AT, Barrell BG, de Bruijn MH, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJ, Staden R, Young IG (1981) Sequence and organization of the human mitochondrial genome. Nature 290:457–465 [PubMed]
Andrews RM, Kubacka I, Chinnery PF, Lightowlers RN, Turnbull DM, Howell N (1999) Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nat Genet 23:147 [PubMed] [Cross Ref]10.1038/13779
Aris-Brosou S, Excoffier L (1996) The impact of population expansion and mutation rate heterogeneity on DNA sequence polymorphism. Mol Biol Evol 13:494–504 [PubMed]
Ayub Q, Mansoor A, Ismail M, Khaliq S, Mohyuddin A, Hameed A, Mazhar K, Rehman S, Siddiqi S, Papaioannou M, Piazza A, Cavalli-Sforza LL, Mehdi SQ (2003) Reconstruction of human evolutionary tree using polymorphic autosomal microsatellites. Am J Phys Anthropol 122:259–268 [PubMed] [Cross Ref]10.1002/ajpa.10234
Ballinger SW, Schurr TG, Torroni A, Gan YY, Hodge JA, Hassan K, Chen KH, Wallace DC (1992) Southeast Asian mitochondrial DNA analysis reveals genetic continuity of ancient mongoloid migrations. Genetics 130:139–152 [PMC free article] [PubMed]
Bamshad M, Kivisild T, Watkins WS, Dixon ME, Ricker CE, Rao BB, Naidu JM, Prasad BV, Reddy PG, Rasanayagam A, Papiha SS, Villems R, Redd AJ, Hammer MF, Nguyen SV, Carroll ML, Batzer MA, Jorde LB (2001) Genetic evidence on the origins of Indian caste populations. Genome Res 11:994–1004 [PMC free article] [PubMed] [Cross Ref]10.1101/gr.GR-1733RR
Bandelt HJ, Forster P, Sykes BC, Richards MB (1995) Mitochondrial portraits of human populations using median networks. Genetics 141:743–753 [PMC free article] [PubMed]
Bandelt HJ, Forster P (1997) The myth of bumpy hunter-gatherer mismatch distributions. Am J Hum Genet 61:980–983 [PMC free article] [PubMed]
Bandelt HJ, Forster P, Rohl A (1999) Median-joining networks for inferring intraspecific phylogenies. Mol Biol Evol 16:37–48 [PubMed]
Bandelt HJ, Lahermo P, Richards M, Macaulay V (2001) Detecting errors in mtDNA data by phylogenetic analysis. Int J Legal Med 115:64–69 [PubMed] [Cross Ref]10.1007/s004140100228
Brakez Z, Bosch E, Izaabel H, Akhayat O, Comas D, Bertranpetit J, Calafell F (2001) Human mitochondrial DNA sequence variation in the Moroccan population of the Souss area. Ann Hum Biol 28:295–307 [PubMed] [Cross Ref]10.1080/030144601300119106
Brehm A, Pereira L, Bandelt H-J, Prata MJ, Amorim A (2002) Mitochondrial portrait of the Cabo Verde archipelago: the Senegambian outpost of Atlantic slave trade. Ann Hum Genet 66:49–60 [PubMed] [Cross Ref]10.1017/S0003480001001002
Calafell F, Underhill P, Tolun A, Angelicheva D, Kalaydjieva L (1996) From Asia to Europe: mitochondrial DNA sequence variability in Bulgarians and Turks. Ann Hum Genet 60:35–49 [PubMed]
Cavalli-Sforza LL, Piazza A, Menozzi P (1994) The history and geography of human genes. Princeton University Press, Princeton, NJ
Cavalli-Sforza (1996) The spread of agriculture and nomadic pastoralism: insights from the genetics, linguistics and archaeology. In: Harris DR (ed) The origins and spread of Agriculture and Pastoralism in Eurasia. Smithsonian Institution Press, Washington, DC, pp 51–69
Chen YS, Torroni A, Excoffier L, Santachiara-Benerecetti AS, Wallace DC (1995) Analysis of mtDNA variation in African populations reveals the most ancient of all human continent-specific haplogroups. Am J Hum Genet 57:133–149 [PMC free article] [PubMed]
Chen YS, Olckers A, Schurr TG, Kogelnik AM, Huoponen K, Wallace DC (2000) mtDNA variation in the South African Kung and Khwe-and their genetic relationships to other African populations. Am J Hum Genet 66:1362–1383 [PMC free article] [PubMed]
Clarence-Smith WG (1989) The economics of the Indian Ocean slave trade in the nineteenth century. Frank Cass, London
Comas D, Calafell F, Mateu E, Perez-Lezaun A, Bertranpetit J (1996) Geographic variation in human mitochondrial DNA control region sequence: the population history of Turkey and its relationship to the European populations. Mol Biol Evol 13:1067–1077 [PubMed]
Comas D, Calafell F, Mateu E, Perez-Lezaun A, Bosch E, Martinez-Arias R, Clarimon J, Facchini F, Fiori G, Luiselli D, Pettener D, Bertranpetit J (1998) Trading genes along the silk road: mtDNA sequences and the origin of Central Asian populations. Am J Hum Genet 63:1824–1838 [PMC free article] [PubMed]
Comas D, Calafell F, Bendukidze N, Fananas L, Bertranpetit J (2000) Georgian and Kurd mtDNA sequence analysis shows a lack of correlation between languages and female genetic lineages. Am J Phys Anthropol 112:5–16 [PubMed] [Cross Ref]10.1002/(SICI)1096-8644(200005)112:1<5::AID-AJPA2>3.0.CO;2-Z
Decker KD (1992) Sociolinguistic survey of northern Pakistan. Vol 5, Languages of Chitral. National Institute of Pakistan Studies, Islamabad
Derbeneva OA, Starikovskaya EB, Wallace DC, Sukernik RI (2002) Traces of early Eurasians in the Mansi of northwest Siberia revealed by mitochondrial DNA analysis. Am J Hum Genet 70:1009–14. [PMC free article] [PubMed]
Di Rienzo A, Wilson AC (1991) Branching pattern in the evolutionary tree for human mitochondrial DNA. Proc Natl Acad Sci USA 88:1597–1601 [PMC free article] [PubMed]
Dupanloup I, Bertorelle G (2001) Inferring admixture proportions from molecular data: extension to any number of parental populations. Mol Biol Evol 18:672–675 [PubMed]
Dupanloup I, Schneider S, Excoffier L (2002) A simulated annealing approach to define the genetic structure of populations. Mol Ecol 11:2571–2581 [PubMed] [Cross Ref]10.1046/j.1365-294X.2002.01650.x
Excoffier L, Smouse PE, Quattro JM (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479–491 [PMC free article] [PubMed]
Finnilä S, Lehtonen MS, Majamaa K (2001) Phylogenetic network for European mtDNA. Am J Hum Genet 68:1475–1484 [PMC free article] [PubMed]
Forster P, Harding R, Torroni A, Bandelt HJ (1996) Origin and evolution of Native American mtDNA variation: a reappraisal. Am J Hum Genet 59:935–945 [PMC free article] [PubMed]
Fu YX (1997) Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925 [PMC free article] [PubMed]
Graven L, Passarino G, Semino O, Boursot P, Santachiara-Benerecetti S, Langaney A, Excoffier L (1995) Evolutionary correlation between control region sequence and restriction polymorphisms in the mitochondrial genome of a large Senegalese Mandenka sample. Mol Biol Evol 12:334–345 [PubMed]
Herrnstadt C, Elson JL, Fahy E, Preston G, Turnbull DM, Anderson C, Ghosh SS, Olefsky JM, Beal MF, Davis RE, Howell N (2002) Reduced-median-network analysis of complete mitochondrial DNA coding-region sequences for the major African, Asian, and European haplogroups. Am J Hum Genet 70:1152–1171 [PMC free article] [PubMed]
Horai S, Murayama K, Hayasaka K, Matsubayashi S, Hattori Y, Fucharoen G, Harihara S, Park KS, Omoto K, Pan IH (1996) mtDNA polymorphism in East Asian populations, with special reference to the peopling of Japan. Am J Hum Genet 59:579–590 [PMC free article] [PubMed]
Hughes-Buller R (1991) Imperial gazetteer of India: provincial series, Baluchistan. Sang-e-Meel, Lahore, Pakistan
Ingman M, Kaessmann H, Pääbo S, Gyllensten U (2000) Mitochondrial genome variation and the origin of modern humans. Nature 408:708–713 [PubMed] [Cross Ref]10.1038/35047064
Karafet T, Xu L, Du R, Wang W, Feng S, Wells RS, Redd AJ, Zegura SL, Hammer MF (2001) Paternal population history of East Asia: sources, patterns, and microevolutionary processes. Am J Hum Genet 69:615–628 [PMC free article] [PubMed]
Kivisild T, Bamshad MJ, Kaldma K, Metspalu M, Metspalu E, Reidla M, Laos S, Parik J, Watkins WS, Dixon ME, Papiha SS, Mastana SS, Mir MR, Ferak V, Villems R (1999a) Deep common ancestry of indian and western-Eurasian mitochondrial DNA lineages. Curr Biol 9:1331–1334 [PubMed] [Cross Ref]10.1016/S0960-9822(00)80057-3
Kivisild T, Kaldma K, Metspalu M, Parik J, Papiha SS, Cillems R (1999b) The place of the Indian mitochondrial DNA variants in the global network of maternal lineages and the peopling of the Old Word. In: Deka R, Papiha SS (eds) Genomic Diversity. Kluwer/Academic/Plenum Publishers, New York, pp 135–152
Kivisild T, Tolk HV, Parik J, Wang Y, Papiha SS, Bandelt HJ, Villems R (2002) The emerging limbs and twigs of the East Asian mtDNA tree. Mol Biol Evol 19:1737–1751 [PubMed]
Kivisild T, Rootsi S, Metspalu M, Mastana S, Kaldma K, Parik J, Metspalu E, Adojaan M, Tolk HV, Stepanov V, Golge M, Usanga E, Papiha SS, Cinnioglu C, King R, Cavalli-Sforza L, Underhill PA, Villems R (2003) The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations. Am J Hum Genet 72:313–332 [PMC free article] [PubMed]
Kolman CJ, Sambuughin N, Bermingham E (1996) Mitochondrial DNA analysis of Mongolian populations and implications for the origin of New World founders. Genetics 142:1321–1334 [PMC free article] [PubMed]
Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP (2003) Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet 73:671–676 [PMC free article] [PubMed]
Krings M, Salem AE, Bauer K, Geisert H, Malek AK, Chaix L, Simon C, Welsby D, Di Rienzo A, Utermann G, Sajantila A, Paabo S, Stoneking M (1999) mtDNA analysis of Nile River Valley populations: A genetic corridor or a barrier to migration? Am J Hum Genet 64:1166–1176 [PMC free article] [PubMed]
Lovejoy PE (2000) Transformations in slavery: a history of slavery in Africa. Cambridge University Press, Cambridge, United Kingdom
Maca-Meyer N, Gonzalez AM, Larruga JM, Flores C, Cabrera VM (2001) Major genomic mitochondrial lineages delineate early human expansions. BMC Genet 2:13 [PMC free article] [PubMed] [Cross Ref]10.1186/1471-2156-2-13
Macaulay V, Richards M, Hickey E, Vega E, Cruciani F, Guida V, Scozzari R, Bonne-Tamir B, Sykes B, Torroni A (1999) The emerging tree of west Eurasian mtDNAs: a synthesis of control-region sequences and RFLPs. Am J Hum Genet 64:232–249 [PMC free article] [PubMed]
Malyarchuk BA, Derenko MV (2001) Mitochondrial DNA variability in Russians and Ukrainians: implication to the origin of the Eastern Slavs. Ann Hum Genet 65:63–78 [PubMed] [Cross Ref]10.1046/j.1469-1809.2001.6510063.x
Malyarchuk BA, Grzybowski T, Derenko MV, Czarny J, Wozniak M, Miscicka-Sliwka D (2002) Mitochondrial DNA variability in Poles and Russians. Ann Hum Genet 66:261–283 [PubMed] [Cross Ref]10.1046/j.1469-1809.2002.00116.x
McAlpin DW (1974) Toward Proto-Elamo-Dravidian. Language 50:89–101
McAlpin DW (1981) Proto-Elamo-Dravidian: the evidence and its implications. Trans Am Phylosophical Soc 71:3–155
Mishmar D, Ruiz-Pesini E, Golik P, Macaulay V, Clark AG, Hosseini S, Brandon M, Easley K, Chen E, Brown MD, Sukernik RI, Olckers A, Wallace DC (2003) Natural selection shaped regional mtDNA variation in humans. Proc Natl Acad Sci USA 100:171–6. [PMC free article] [PubMed] [Cross Ref]10.1073/pnas.0136972100
Mountain JL, Hebert JM, Bhattacharyya S, Underhill PA, Ottolenghi C, Gadgil M, Cavalli-Sforza LL (1995) Demographic history of India and mtDNA-sequence diversity. Am J Hum Genet 56:979–992 [PMC free article] [PubMed]
Nanavutty P (1997) The Parsis. National Book Trust, New Delhi, India
Nasidze I, Stoneking M (2001) Mitochondrial DNA variation and language replacements in the Caucasus. Proc R Soc Lond B Biol Sci 268:1197–1206 [PMC free article] [PubMed] doi:10.1098/rspb.2001.1610
Pereira L, Macaulay V, Torroni A, Scozzari R, Prata MJ, Amorim A (2001) Prehistoric and historic traces in the mtDNA of Mozambique: insights into the Bantu expansions and the slave trade. Ann Hum Genet 65:439–458 [PubMed] doi:10.1046/j.1469-1809.2001.6550439.x
Pérez-Lezaun A, Calafell F, Comas D, Mateu E, Bosch E, Martinez-Arias R, Clarimon J, Fiori G, Luiselli D, Facchini F, Pettener D, Bertranpetit J (1999) Sex-specific migration patterns in Central Asian populations, revealed by analysis of Y-chromosome short tandem repeats and mtDNA. Am J Hum Genet 65:208–219 [PMC free article] [PubMed]
Qamar R, Ayub Q, Mohyuddin A, Helgason A, Mazhar K, Mansoor A, Zerjal T, Tyler-Smith C, Mehdi SQ (2002) Y-chromosomal DNA variation in Pakistan. Am J Hum Genet 70:1107–1124 [PMC free article] [PubMed]
Quintana-Murci L, Semino O, Bandelt HJ, Passarino G, McElreavey K, Santachiara-Benerecetti AS (1999) Genetic evidence of an early exit of Homo sapiens sapiens from Africa through eastern Africa. Nat Genet 23:437–441 [PubMed] doi:10.1038/70550
Quintana-Murci L, Krausz C, Zerjal T, Sayar SH, Hammer MF, Mehdi SQ, Ayub Q, Qamar R, Mohyuddin A, Radhakrishna U, Jobling MA, Tyler-Smith C, McElreavey K (2001) Y-chromosome lineages trace diffusion of people and languages in southwestern Asia. Am J Hum Genet 68:537–542 [PMC free article] [PubMed]
Redd AJ, Stoneking M (1999) Peopling of Sahul: mtDNA variation in aboriginal Australian and Papua New Guinean populations. Am J Hum Genet 65:808–828 [PMC free article] [PubMed]
Renfrew C (1987) Archaeology and language: the puzzle of Indo-European origins. Jonathan Cape, London
Renfrew C (1996) Language families and the spread of farming. In: Harris DR (ed) The origins and spread of agriculture and pastoralism in Eurasia. Smithsonian Institution Press, Washington, DC, pp 70–92
Richards M, Macaulay V, Hickey E, Vega E, Sykes B, Guida V, Rengo C, et al (2000) Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet 67:1251–1276 [PMC free article] [PubMed]
Richards M, Macaulay V, Torroni A, Bandelt H-J (2002) In search of geographical patterns in European mtDNA. Am J Hum Genet 71:1168–1174 [PMC free article] [PubMed]
Richards M, Rengo C, Cruciani F, Gratrix F, Wilson JF, Scozzari R, Macaulay V, Torroni A (2003) Extensive female-mediated gene flow from sub-Saharan Africa into near eastern Arab populations. Am J Hum Genet 72:1058–1064 [PMC free article] [PubMed]
Robertson GS (1896) The Kafirs of the Hindu-Kush. Oxford University Press, Karachi, Pakistan
Roychoudhury S, Roy S, Basu A, Banerjee R, Vishwanathan H, Usha Rani MV, Sil SK, Mitra M, Majumder PP (2001) Genomic structures and population histories of linguistically distinct tribal groups of India. Hum Genet 109:339–350 [PubMed] doi:10.1007/s004390100577
Saillard J, Forster P, Lynnerup N, Bandelt HJ, Norby S (2000) mtDNA variation among Greenland Eskimos: the edge of the Beringian expansion. Am J Hum Genet 67:718–726 [PMC free article] [PubMed]
Salas A, Richards M, De la Fe T, Lareu MV, Sobrino B, Sanchez-Diz P, Macaulay V, Carracedo A (2002) The making of the African mtDNA landscape. Am J Hum Genet 71:1082–1111 [PMC free article] [PubMed]
Schneider S, Roessli D, Excoffier L (2000) Arlequin ver 2.0: a software for population genetics data analysis. Genetics and Biometry Laboratory, University of Geneva, Geneva, Switzerland
Schurr TG, Sukernik RI, Starikovskaya YB, Wallace DC (1999) Mitochondrial DNA variation in Koryaks and Itel’men: population replacement in the Okhotsk Sea-Bering Sea region during the Neolithic. Am J Phys Anthropol 108:1–39 [PubMed] doi:10.1002/(SICI)1096-8644(199901)108:1<1::AID-AJPA1>3.0.CO;2-1
Soodyall H, Vigilant L, Hill AV, Stoneking M, Jenkins T (1996) mtDNA control-region sequence variation suggests multiple independent origins of an “Asian-specific” 9-bp deletion in sub-Saharan Africans. Am J Hum Genet 58:595–608 [PMC free article] [PubMed]
Starikovskaya YB, Sukernik RI, Schurr TG, Kogelnik AM, Wallace DC (1998) mtDNA diversity in Chukchi and Siberian Eskimos: implications for the genetic history of Ancient Beringia and the peopling of the New World. Am J Hum Genet 63:1473–1491 [PMC free article] [PubMed]
Stoneking M, Jorde LB, Bhatia K, Wilson AC (1990) Geographic variation in human mitochondrial DNA from Papua New Guinea. Genetics 124:717–733 [PMC free article] [PubMed]
Sultana F (1995) Gwat and Gwat-i-leb: Spirit healing and social change in Makran. In: Titus P (ed) Marginality and modernity: ethnicity and change in post-colonial Balochistan. Oxford University Press, Karachi, pp 28–50
Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595 [PMC free article] [PubMed]
Tambets K, Kivisild T, Metspalu E, Parik J, Kaldma K, Laos S, Tolk HV, Gölge M, Demirtas H, Geberhiwot T, De Stefano GP, Papiha SS, Villems R (2000) The topology of the maternal lineages of the Anatolian and Trans-Caucasus populations and the peopling of Europe: preliminary conclusions. In: Renfrew C, Boyle K (eds) Archaeogenetics: DNA and the population prehistory of Europe. McDonald Institute for Archaeological Research Monograph Series, Cambridge University, Cambridge, United Kingdom, pp 219–235
Tishkoff SA, Dietzsch E, Speed W, Pakstis AJ, Kidd JR, Cheung K, Bonne-Tamir B, Santachiara-Benerecetti AS, Moral P, Krings M (1996) Global patterns of linkage disequilibrium at the CD4 locus and modern human origins. Science 271:1380–1387 [PubMed]
Torroni A, Sukernik RI, Schurr TG, Starikovskaya YB, Cabell MF, Crawford MH, Comuzzie AG, Wallace DC (1993) mtDNA variation of aboriginal Siberians reveals distinct genetic affinities with Native Americans. Am J Hum Genet 53:591–608 [PMC free article] [PubMed]
Torroni A, Miller JA, Moore LG, Zamudio S, Zhuang J, Droma T, Wallace DC (1994a) Mitochondrial DNA analysis in Tibet: implications for the origin of the Tibetan population and its adaptation to high altitude. Am J Phys Anthropol 93:189–199 [PubMed]
Torroni A, Neel JV, Barrantes R, Schurr TG, Wallace DC (1994b) Mitochondrial DNA “clock” for the Amerinds and its implications for timing their entry into North America. Proc Natl Acad Sci USA 91:1158–1162 [PMC free article] [PubMed]
Torroni A, Huoponen K, Francalacci P, Petrozzi M, Morelli L, Scozzari R, Obinu D, Savontaus ML, Wallace DC (1996) Classification of European mtDNAs from an analysis of three European populations. Genetics 144:1835–1850 [PMC free article] [PubMed]
Torroni A, Petrozzi M, D’Urbano L, Sellitto D, Zeviani M, Carrara F, Carducci C, Leuzzi V, Carelli V, Barboni P, De Negri A, Scozzari R (1997) Haplotype and phylogenetic analyses suggest that one European-specific mtDNA background plays a role in the expression of Leber hereditary optic neuropathy by increasing the penetrance of the primary mutations 11778 and 14484. Am J Hum Genet 60:1107–1121 [PMC free article] [PubMed]
Torroni A, Bandelt HJ, Macaulay V, Richards M, Cruciani F, Rengo C, Martinez-Cabrera V, et al (2001a) A signal, from human mtDNA, of postglacial recolonization in Europe. Am J Hum Genet 69:844–852 [PMC free article] [PubMed]
Torroni A, Rengo C, Guida V, Cruciani F, Sellitto D, Coppa A, Luna Calderon F, Simionati B, Valle G, Richards M, Macaulay V, Scozzari R (2001b) Do the four clades of the mtDNA haplogroup L2 evolve at different rates? Am J Hum Genet 69:1348–1356 [PMC free article] [PubMed]
Wallace DC, Brown MD, Lott MT (1999) Mitochondrial DNA variation in human evolution and disease. Gene 238:211–230 [PubMed] doi:10.1016/S0378-1119(99)00295-4
Watson E, Forster P, Richards M, Bandelt HJ (1997) Mitochondrial footprints of human expansions in Africa. Am J Hum Genet 61:691–704 [PMC free article] [PubMed]
Wells RS, Yuldasheva N, Ruzibakiev R, Underhill PA, Evseeva I, Blue-Smith J, Jin L, et al (2001) The Eurasian heartland: a continental perspective on Y-chromosome diversity. Proc Natl Acad Sci USA 98:10244–10249 [PMC free article] [PubMed] doi:10.1073/pnas.171305098
Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP (2002) Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet 70:635–651 [PMC free article] [PubMed]
Zerjal T, Wells RS, Yuldasheva N, Ruzibakiev R, Tyler-Smith C (2002) A genetic landscape reshaped by recent events: Y-chromosomal insights into Central Asia. Am J Hum Genet 71:466–482 [PMC free article] [PubMed]
Zerjal T, Xue Y, Bertorelle G, Wells RS, Bao W, Zhu S, Qamar R, Ayub Q, Mohyuddin A, Fu S, Li P, Yuldasheva N, Ruzibakiev R, Xu J, Shu Q, Du R, Yang H, Hurles ME, Robinson E, Gerelsaikhan T, Dashnyam B, Mehdi SQ, Tyler-Smith C (2003) The genetic legacy of the Mongols. Am J Hum Genet 72:717–721 [PMC free article] [PubMed]
Zvelebil M (1980) The rise of the nomads in Central Asia. In: Sherratt A (ed) The Cambridge encyclopedia of archaeology. Crown, New York, pp 252–256

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics