Am J Hum Genet. Mar 2003; 72(3): 659–670.
Published online Feb 4, 2003.
PMCID: PMC1180241

The Pedigree Rate of Sequence Divergence in the Human Mitochondrial Genome: There Is a Difference Between Phylogenetic and Pedigree Rates

Abstract

We have extended our previous analysis of the pedigree rate of control-region divergence in the human mitochondrial genome. One new germline mutation in the mitochondrial DNA (mtDNA) control region was detected among 185 transmission events (generations) from five Leber hereditary optic neuropathy (LHON) pedigrees. Pooling the LHON pedigree analyses yields a control-region divergence rate of 1.0 mutation/bp/10^6 years (Myr). When the results from eight published studies that used a similar approach were pooled with the LHON pedigree studies, totaling >2,600 transmission events, a pedigree divergence rate of 0.95 mutations/bp/Myr for the control region was obtained with a 99.5% confidence interval of 0.53–1.57. Taken together, the cumulative results support the original conclusion that the pedigree divergence rate for the control region is ~10-fold higher than that obtained with phylogenetic analyses. There is no evidence that any one factor explains this discrepancy, and the possible roles of mutational hotspots (rate heterogeneity), selection, and random genetic drift and the limitations of phylogenetic approaches to deal with high levels of homoplasy are discussed. In addition, we have extended our pedigree analysis of divergence in the mtDNA coding region. Finally, divergence of complete mtDNA sequences was analyzed in two tissues, white blood cells and skeletal muscle, from each of 17 individuals. In three of these individuals, there were four instances in which an mtDNA mutation was found in one tissue but not in the other. These results are discussed in terms of the occurrence of somatic mtDNA mutations.

Introduction

The noncoding control region, or D-loop, of the human mitochondrial genome (mtDNA) continues to be widely used to “time” human evolution and population movements, both ancient and modern (e.g., see Richards et al. 2000; Bandelt et al. 2001; Torroni et al. 2001a). Many of these studies continue to rely on phylogenetic analyses of mtDNA-control-region haplotype trees and phylogenetically derived rates of divergence, despite the complications that arise from the effects of marked site variability in the control region (Excoffier and Yang 1999; Meyer et al. 1999; Heyer et al. 2001), a high rate of parallel mutations (homoplasy; e.g., see fig. 1 of article by Bandelt et al. [2001]), evidence for violations of clocklike evolution (Ingman et al. 2000; Torroni et al. 2001b) and of nonneutral evolution (Gerber et al. 2001; Rand 2001), and the failure of most phylogenetic approaches to factor in the effects of population dynamics, admixture, and migration (Rannala and Bertorelle 2001; Relethford 2001). One approach that should reduce the confounding effects of control-region homoplasy involves analysis of the more slowly diverging mtDNA coding region (Ingman et al. 2000), although reduced-median-network analysis of a large set of coding region sequences revealed parallel mutations at a substantial number of sites (Herrnstadt et al. 2002a).

Figure 1
The matrilineal pedigree of the USA2 11778 family with LHON. Filled symbols indicate that the family member is visually affected.

In an effort to estimate the human mtDNA-control-region divergence rate with a nonphylogenetic approach, we analyzed multiple members of a large matrilineal pedigree with Leber hereditary optic neuropathy (LHON [MIM #535000]) (Howell et al. 1996). At the same time, Parsons et al. (1997) published their analysis of control-region divergence in a large number of mother-offspring cohorts. The two studies obtained the same result, which was that the pedigree rate of control-region divergence was ~10-fold higher than phylogenetically derived rates. Several aspects of these pedigree studies have been challenged (Macaulay et al. 1997; Jazin et al. 1998) and defended (Howell and Mackey 1997; Parsons and Holland 1998). Subsequent studies have also estimated the pedigree divergence rate, but a consensus has not yet emerged, perhaps because most individual studies have analyzed relatively small numbers of transmission events. The same approach has been used for other genetic systems, such as the Y chromosome (e.g., see Heyer et al. 1997; Kayser et al. 2000), but the interpretation of those results has been more straightforward.

We address two issues here. The first is whether the pedigree rate of divergence differs from the phylogenetically derived rate. The second, if there is a difference, is what might explain the discrepancy. In the present study, we expanded our previous analysis of control-region divergence in multigeneration LHON pedigrees. The results of our analyses, as well as those from other studies, indicate that there is a difference in the two rates, and we discuss the factors that are most likely to produce this difference.

Subjects, Material, and Methods

Subjects and DNA Samples

Venous blood samples were obtained from family members and patients, with informed consent and with approval by the appropriate institutional oversight bodies. In one set of analyses, the complete mtDNA sequence was determined for two tissues from each of 17 individuals. For these individuals, DNA was isolated both from the white blood cell (WBC)/platelet fraction of venous blood samples and from skeletal muscle biopsy specimens. DNA was isolated from the WBC/platelet fraction by standard procedures of SDS/proteinase K digestion, phenol extraction, and ethanol precipitation. Frozen muscle biopsy specimens were pulverized in liquid nitrogen with a mortar and pestle, and DNA was extracted with QIAamp DNA kits (Qiagen). DNA was then precipitated with ethanol and resuspended in TE buffer (10 mM Tris-HCl [pH 7.5] and 1 mM EDTA). DNA concentrations were determined by use of UV absorption.

DNA Sequencing

All nucleotide changes are based on the L-strand sequence of the revised Cambridge Reference Sequence (rCRS) (Andrews et al. 1999). In the present report, we consider only single–base pair substitutions, and we omit contractions/expansions of simple repeat sequences. Sequence analysis of the control region (nt 16024–16569 and 1–576) for members of families with LHON was performed as described in our initial report (Howell et al. 1996). In brief, the control region was amplified by PCR, as a set of four overlapping fragments (~350 bp each). The fragments were ligated into M13 sequencing vectors and were transformed into Escherichia coli hosts, and single-stranded recombinant phage were used for manual sequencing. In the vast majority of experiments, 10–12 clones were informative for each control-region fragment, although there were instances (~5%) in which fewer clones were analyzed.

A site was classified as heteroplasmic if at least two M13 subclones carried the same sequence change that was not present in the pedigree’s consensus control-region sequence. To rule out a false-positive result, such as one that might arise from PCR amplification errors, the relevant site was independently reanalyzed. A heteroplasmic mtDNA mutation was classified as authentic only if reanalysis yielded a result similar to that of the first analysis, and the mutation was considered to have arisen in the germline only if at least one primary maternal relative also carried the sequence change.

We observed numerous instances in which a single clone carried a change from the consensus control-region sequence. These sequence changes most likely resulted from errors that arose during PCR amplification. To assess the likelihood that some instances of authentic mtDNA heteroplasmy had been overlooked, we analyzed six mtDNA fragments for which a single clonal difference was detected. The reanalysis involved independent PCR amplification and the sequencing of an additional 20–35 M13 subclones. None of these six instances was found to represent authentic mtDNA heteroplasmy, and our criteria appear adequate for the identification of newly arising mtDNA mutations in pedigrees. Our approach will not capture heteroplasmic mutations if the frequency of the minority allele is too low. When the sequencing approach is used, the “break-even point” occurs when the minority allele is ~30%, so mtDNA mutations below this value are unlikely to be detected.

Determination of complete mtDNA sequences was performed by use of an automated approach that was described in detail elsewhere (Herrnstadt et al. 2002a). For the complete mtDNA sequences of members of the TAS1 LHON pedigree, the presence of heteroplasmic sequence changes was assessed by visual inspection of the sequencing electropherograms. We have shown that this approach is sufficiently sensitive to detect heteroplasmy if the minor allele is present in 15%–20% of the mtDNA molecules (Herrnstadt et al. 2002b).

Estimates of Divergence Rates and Statistical Analysis

Divergence rates were calculated by the procedure used previously (e.g., see Parsons et al. 1997). Rates are derived from the number of “meioses” or transmission events, which is the number of cumulative generations tracing back to the most recent maternal ancestor. We limit our estimates to the number of newly arising germline mutations (those that show transmission through multiple generations). The divergence rate (twice the observed mutation rate) is expressed as mutations per base pair per million years (mutations/bp/Myr). In the analyses of other pedigree divergence rate studies, we make the conservative assumption that there are 1,122 bp in the noncoding control region, even in those studies that limit their sequencing analysis to HVR1 and/or HVR2, the two hypervariable regions. It is also assumed that there are 20 years per generation.
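
As a rough illustration of the calculation described above, the following sketch converts a count of newly arising germline mutations and a number of transmission events into a divergence rate. The function name and its parameter names are illustrative assumptions; the 1,122-bp length and the 20-year generation time are the values stated in the text.

```python
def divergence_rate(mutations, transmissions, bp=1122, years_per_generation=20):
    """Return the divergence rate in mutations/bp/Myr.

    The divergence rate is taken as twice the per-lineage mutation rate,
    as stated in the text. The defaults are the stated assumptions
    (1,122 bp in the control region, 20 years per generation).
    """
    mutation_rate_per_bp_per_year = mutations / (
        transmissions * years_per_generation * bp
    )
    return 2 * mutation_rate_per_bp_per_year * 1e6


# Pooled LHON counts quoted in the Results: two mutations in 88
# transmission events plus one mutation in 185 transmission events.
pooled = divergence_rate(2 + 1, 88 + 185)
print(f"pooled LHON rate: {pooled:.2f} mutations/bp/Myr")
```

With these inputs the sketch gives a value of roughly 1 mutation/bp/Myr. Values from such a sketch may differ slightly from the published figures, because this section does not spell out every detail of the original calculation.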

All newly arising substitutions, both heteroplasmic and homoplasmic, were included in our divergence-rate estimates. In contrast, Sigurðardóttir et al. (2000) calculated their divergence rate estimate on the basis of three homoplasmic mutations among 705 transmission events, although there were also three heteroplasmic control-region mutations in their pedigrees. The situation is complicated, and one of these heteroplasmic mutations (at nt 16111) was homoplasmic in another branch of the same pedigree. In the present study, we assume that there was a single origin of this mutation and a complicated pattern of segregation in the pedigree. To maintain consistency among the various studies, we include the other two heteroplasmic mutations (at nt 16239 and at nt 16257) in our calculations of the pedigree divergence rate. The inclusion of heteroplasmic mutations in the rate estimate of divergence is addressed further in the “Discussion” section.

Explanation of our utilization of the data from Bendall et al. (1996) is also necessary. They observed five heteroplasmic mutations among 85 DZ and 95 MZ twin pairs, and two of the mutations occurred within a single twin pair. On the basis of our analyses reported elsewhere (Howell and Smejkal 2000), we concluded that the heteroplasmy at nt 16192 is due to somatic hypermutation, not to a germline mtDNA mutation. Therefore, we assumed here that there are four newly arising germline mutations in this cohort. Although the actual number of meioses is 265, we have made the conservative assumption that these twin pairs represent 360 mitochondrial transmission events.
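
The two counts of transmission events quoted above can be reproduced with simple arithmetic. The sketch below assumes, as an interpretation of the text rather than a statement taken from it, that each DZ twin pair represents two separate transmissions, that each MZ pair represents one, and that the conservative figure counts two transmissions for every pair.

```python
DZ_PAIRS = 85  # dizygotic twin pairs (Bendall et al. 1996)
MZ_PAIRS = 95  # monozygotic twin pairs

# Assumption: DZ twins arise from two zygotes (two transmissions per pair),
# MZ twins from a single zygote (one transmission per pair).
actual_meioses = 2 * DZ_PAIRS + 1 * MZ_PAIRS

# Conservative count: treat every twin as a separate transmission event.
conservative_events = 2 * (DZ_PAIRS + MZ_PAIRS)

print(actual_meioses, conservative_events)  # prints: 265 360
```

Using the larger denominator lowers the estimated rate, which is why the text describes the assumption as conservative.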

Many studies use the normal approximation to calculate binomial confidence limits for the estimated divergence rate (e.g., see Cavelier et al. 2000). However, that approximation underestimates the confidence limits for the small sample sizes used in most individual studies. In the present study, exact 99.5% binomial confidence limits were calculated with either the binconlim software program provided by J. Rosenblatt or with a publicly available program. The two programs gave identical results. The higher confidence limits were used because of the comparison of multiple studies (Bonferroni correction).
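
One standard way to obtain exact binomial confidence limits is the Clopper-Pearson construction, sketched below with the Python standard library only. This is an illustration under stated assumptions, not necessarily the algorithm implemented in binconlim or in the other program mentioned above; the function names are hypothetical. The example counts (4 mutations in 170 transmission events) are the coding-region totals reported in the Results.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def exact_binomial_limits(x, n, confidence=0.995, iterations=100):
    """Clopper-Pearson limits on the per-transmission mutation probability."""
    tail = (1 - confidence) / 2

    def bisect(move_right):
        # Plain bisection on [0, 1]; move_right(p) is True while p is
        # still below the limit being sought.
        lo, hi = 0.0, 1.0
        for _ in range(iterations):
            mid = (lo + hi) / 2
            if move_right(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: p at which P(X >= x) equals the tail probability.
    lower = 0.0 if x == 0 else bisect(lambda p: 1 - binom_cdf(x - 1, n, p) < tail)
    # Upper limit: p at which P(X <= x) equals the tail probability.
    upper = 1.0 if x == n else bisect(lambda p: binom_cdf(x, n, p) > tail)
    return lower, upper


lower, upper = exact_binomial_limits(4, 170)
print(f"per-transmission limits: {lower:.4f} to {upper:.4f}")
```

The limits returned here are probabilities per transmission event; converting them to mutations/bp/Myr requires the same scaling by sequence length and generation time that is applied to the point estimate.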

Results

Control-Region Sequence Analysis of the TAS1 LHON Pedigree

The TAS1 LHON pedigree carries the 11778 pathogenic mutation, spans 12 generations, and comprises ~1,600 family members (Mackey and Buttery 1992). The control-region sequence for pedigree member VIII-219, which represents the consensus sequence for this matrilineal pedigree, carried three sequence changes from the rCRS: 152 T→C, 263 A→G, and 16093 T→C. The control-region sequence was determined for a total of 55 TAS1 LHON family members. There were no newly arising mutations in the control region that showed evidence of germline transmission and which would be included in our calculations of the divergence rate. However, we did observe an example of mtDNA “hypermutation” or “persistent heteroplasmy” (Howell and Smejkal 2000; Tully et al. 2000).

In family member VIII-219, the 16093 T→C polymorphism was classified as homoplasmic because it was present in all 11 clones sequenced and analyzed for this region. However, we detected low-level heteroplasmy at this site in 16 of the 55 (~30%) family members. For those 16 individuals, the typical result was that one or two clones carried a T at nt 16093 (rCRS allele) versus a C (non-rCRS allele). This heteroplasmy was reproduced in independent reanalysis. For example, in the first analysis of family member VIII-195, 1 of 12 clones carried the 16093T allele. The numbers in the first and second repeat analyses were 1/40 and 3/52, respectively. Overall, 5 of 104 clones analyzed (5%) carried the 16093T allele in this family member. In contrast, family member IX-158 was 1 of the 39 family members (of a total of 55) who were classified as homoplasmic for the 16093C allele in the initial sequencing experiments, because none of the 11 clones analyzed carried the 16093T allele. An additional 57 clones were sequenced, and 3 of these carried the 16093T allele. That is, this individual was actually heteroplasmic at this site, although this condition was “missed” initially. Whether heteroplasmy at this site occurs in all family members is not known.
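
To see why low-level heteroplasmy is easily missed when only ~11 clones are sequenced, one can treat each clone as an independent draw from the mtDNA pool. That independence is a simplifying assumption introduced here (it ignores PCR and cloning bias), and the function name is hypothetical. The minor-allele fraction of 5/104 is the pooled value reported for family member VIII-195.

```python
from math import comb


def prob_at_most(k_max, n_clones, minor_fraction):
    """P(at most k_max of n_clones carry the minor allele), binomial model."""
    return sum(
        comb(n_clones, k)
        * minor_fraction**k
        * (1 - minor_fraction) ** (n_clones - k)
        for k in range(k_max + 1)
    )


minor_fraction = 5 / 104  # pooled 16093T fraction in family member VIII-195

# Chance that none of 11 clones carries the minor allele.
p_none = prob_at_most(0, 11, minor_fraction)
# Chance that fewer than two clones carry it (below the two-clone criterion).
p_fewer_than_two = prob_at_most(1, 11, minor_fraction)

print(f"no minor-allele clone among 11: {p_none:.2f}")
print(f"fewer than two among 11:        {p_fewer_than_two:.2f}")
```

Under this simple model, a minor allele present at about 5% would go unseen in more than half of 11-clone experiments, which is consistent with the initial classification of IX-158 as homoplasmic.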

We have also confirmed persistent heteroplasmy at this site in other mtDNAs that carry the 16093C allele (data not shown). Taken together, these results indicate that, when an mtDNA molecule carries a C at nt 16093, there is a high probability of reversion or back-mutation to a T residue (hypermutation). The probability of reversion is sufficiently high that a large proportion of individuals, perhaps all, who carry mtDNA with the 16093 T→C polymorphism will be heteroplasmic. While the present studies were in progress, Tully et al. (2000) also obtained evidence for hypermutation of the 16093C allele.

Control-Region Sequence Analysis of the USA2 LHON Pedigree

The USA2 LHON pedigree is heteroplasmic for the 11778 LHON mutation (Howell et al. 1994). The pedigree spans three generations and comprises 16 family members (fig. 1). Previous analysis of this pedigree indicated hypermutation at nt 16192 (Howell and Smejkal 2000). The complete mtDNA sequence of family member II-1 has been determined, and the following coding region polymorphisms were observed: 750 A→G, 1438 A→G, 1721 C→T, 2706 A→G, 2757 A→G, 3197 T→C, 3212 C→T, 4732 A→G, 4769 A→G, 4843 C→T, 7028 C→T, 7768 A→G, 8860 A→G, 9477 G→A, 11467 A→G, 11719 G→A, 12308 A→G, 12372 G→A, 13617 T→C, 13637 A→G, 14182 T→C, 14766 C→T, 14956 T→C, and 15326 A→G. This mtDNA belongs to a subcluster of European haplogroup U. The control-region sequence includes the following sequence changes: 73 A→G, 150 C→T, 263 A→G, 16189 T→C, 16192 C→T, and 16398 G→A (sequence 0204 in fig. 3 of article by Howell and Smejkal [2000]).

We have subsequently determined the control-region sequences for all 16 family members. This analysis revealed that this family was heteroplasmic for site 16270 in HVR1 (table 1). In the matrilineal founder of this pedigree, 17 of 22 clones carried the 16270 T allele (non-rCRS), and 5 clones (23%) carried the 16270 C allele (rCRS). Low levels of the 16270T allele were detected in five of the six members of generation II but in only one of the nine members of generation III (fig. 1). For generation II family members, 61 of 80 clones (76%) carried the 16270 C allele, whereas 63 of 66 clones (95%) analyzed for generation III family members carried this allele. This difference between generations is statistically significant (P ≤ .01), as determined with Fisher’s exact test. The presence of the 16270T allele in phylogenetically related lineages (Howell and Smejkal 2000) indicates that the USA2 LHON mtDNA has undergone a back-mutation to the CRS allele at this site.
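
The comparison between generations can be illustrated with a small standard-library implementation of Fisher's exact test on the clone counts given above (61 C and 19 T clones in generation II; 63 C and 3 T clones in generation III). The two-sided convention used here, which sums the probabilities of all tables no more likely than the observed one, is an assumption; the text does not state which variant was applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    smallest = max(0, col1 - row2)
    largest = min(row1, col1)
    observed = table_prob(a)
    # Sum every table that is no more probable than the observed one.
    return sum(
        table_prob(x)
        for x in range(smallest, largest + 1)
        if table_prob(x) <= observed * (1 + 1e-9)
    )


# Rows: generation II, generation III; columns: 16270 C clones, 16270 T clones.
p_value = fisher_exact_two_sided(61, 19, 63, 3)
print(f"two-sided P = {p_value:.4f}")
```

Pooling clones across individuals within a generation, as done here, follows the way the counts are reported in the text.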

Table 1
Inheritance of Heteroplasmic Mutations in the USA2 LHON Pedigree

There was a noticeable correlation between the WBC mutation loads of the 16270 C allele and the 11778 A pathogenic allele among the USA2 family members (table 1). This correlation was also observed for multiple tissues from the same individual (data not shown). The simplest explanation of these results is that the 16270 T→C and 11778 G→A mutations occurred within the same “founder” mtDNA molecule.

Control-Region Sequence Analysis of the NSW8 14482 LHON Pedigree

The pedigree of the NSW8 14482 LHON family and the complete mtDNA sequence of family member VIII-20 were reported elsewhere (Howell et al. 1998). This mtDNA belongs to European haplogroup I, and the control region from this individual’s mtDNA contained the following sequence changes: 73 A→G, 199 T→C, 204 T→C, 250 T→C, 263 A→G, 16129 G→A, 16172 T→C, 16223 C→T, 16311 T→C, 16391 G→A, and 16519 T→C. We have analyzed control-region sequences for eight family members. No newly arising mtDNA changes were detected.

Control-Region Sequence Analysis of the QLD1 4160+14484 Pedigree

The QLD1 family expresses LHON, as well as a number of severe clinical abnormalities, including dysarthria, ataxia, tremor, skeletal deformities, and a fatal infantile encephalopathy (Wallace 1970; see fig. 1 of our earlier report [Howell et al. 1991] for a partial pedigree). The mtDNA from this family was found to carry the 14484 LHON mutation and a second pathogenic mutation at nt 4160 (Howell et al. 1991, 1995). A partial coding-region sequence for this mtDNA has been determined (Howell et al. 1991), and the presence of polymorphisms at nt 4646, 11332, and 12372 indicates that this mtDNA belongs to European haplogroup U4 (see table 1 of article by Herrnstadt et al. [2002a]). The following sequence changes in the control region were detected: 73 A→G, 152 T→C, 263 A→G, 16134 C→T, 16356 T→C, and 16519 T→C. Control-region sequence was determined for 11 QLD1 family members, and no newly arising mutations were detected.

Control-Region Sequence Analysis of the ENG1 11778 LHON Pedigree

This LHON family comprises ~75 maternally related individuals and spans six generations (fig. 2). In addition to this “main” pedigree, a small LHON family (designated “A” for reasons of geographical association) has been found that also carries the 11778 LHON mutation and that has the identical control-region sequence (fig. 3). A genealogical linkage of these two LHON families has not been established, but one is suspected (P.F.C., unpublished data). Molecular genetic evidence now establishes that the “A” family is an authentic branch of the ENG1 mtDNA lineage, and those results will be described below in the “Pedigree Divergence Rate of the Coding Region” section.

Figure 2
A partial matrilineal pedigree of the ENG1 11778 family with LHON. The complete pedigree contains too many members to be conveniently shown. As a result, the numbers used to denote individual family members are not consecutive within a generation. When ...
Figure 3
Pedigree of the “A” branch of the ENG1 family with LHON. Although this pedigree cannot be definitively linked genealogically to the main ENG1 pedigree, sequencing analyses of coding region polymorphisms (described in the text) show that ...

The ENG1 control-region sequence was published earlier (Howell et al. 1995), and it carries the following polymorphisms: 73 A→G, 150 C→T, 152 T→C, 263 A→G, 295 C→T, 489 T→C, 16069 C→T, 16126 T→C, 16193 C→T, and 16278 C→T. This array of polymorphisms indicates that this mtDNA belongs to European haplogroup J. A total of 15 family members have been analyzed here: 9 from the main branch and 6 from the “A” branch; a conservative total of 26 transmission events has been assumed. No newly arising mutations were detected in the control region.

Meta-Analysis of Pedigree Divergence Rates

In our initial study (Howell et al. 1996), we detected two new germline mtDNA mutations (at HVR2 nt 152 and 195) in the TAS2 LHON family (a total of 88 transmission events), which yielded a control-region divergence rate of 1.9 mutations/bp/Myr (table 2). In the present study, one new mutation was observed among a total of 185 transmission events for the five families with LHON analyzed here. This number yields a divergence rate of 0.45 mutations/bp/Myr, a value that is within the confidence interval of the initial study. When we pool these results, we obtain a pedigree divergence rate of 1.0 mutations/bp/Myr for the control region (table 2). In other words, one should detect ~1 mutation per 90 transmission events.

Table 2
Estimates of Pedigree Divergence Rates

A number of other groups have used a similar approach to estimate the pedigree divergence rate in the control region (table 2). The 99.5% CIs for all data sets overlap, and we have therefore pooled the results to obtain a control-region pedigree divergence rate of 0.95 mutations/bp/Myr, a result that is essentially identical to the rate obtained for the LHON pedigrees. We used 99.5% CIs because of the multiple comparisons but found that use of 95% CIs had remarkably little effect on the results. With the more relaxed statistical limits, only one of the pairwise comparisons (Cavelier et al. 2000 compared with Parsons and Holland 1998) had nonoverlapping CIs, but this is a result that would be expected by chance alone.

A control-region divergence rate of 0.95 mutations/bp/Myr is ~10 times higher than the phylogenetically derived divergence rates for this region of the human mitochondrial genome. For example, estimates of 0.118 mutations/bp/Myr (Stoneking et al. 1992), 0.087 mutations/bp/Myr (Tamura and Nei 1993), and 0.098 mutations/bp/Myr (Hasegawa et al. 1993) have been obtained, with 95% CIs of 0.056–0.18, 0.025–0.15, and 0–0.22, respectively.
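
The size of the discrepancy can be checked directly from the figures quoted above. The sketch below computes the fold difference against each phylogenetic estimate and notes whether the intervals overlap; the variable names are illustrative, and the overlap check is informal because the pedigree interval is a 99.5% CI whereas the phylogenetic intervals are 95% CIs.

```python
pedigree_rate = 0.95          # pooled pedigree rate, mutations/bp/Myr
pedigree_ci = (0.53, 1.57)    # 99.5% CI for the pooled pedigree rate

# Phylogenetic estimates: (rate, 95% CI), as quoted in the text.
phylogenetic = {
    "Stoneking et al. 1992": (0.118, (0.056, 0.18)),
    "Tamura and Nei 1993": (0.087, (0.025, 0.15)),
    "Hasegawa et al. 1993": (0.098, (0.0, 0.22)),
}

folds = {study: pedigree_rate / rate for study, (rate, _) in phylogenetic.items()}
overlaps = {
    study: ci[1] >= pedigree_ci[0] for study, (_, ci) in phylogenetic.items()
}

for study in phylogenetic:
    print(f"{study}: {folds[study]:.1f}-fold, CIs overlap: {overlaps[study]}")
```

All three ratios fall between roughly 8 and 11, and none of the phylogenetic upper limits reaches the lower limit of the pedigree interval.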

These estimates of the pedigree divergence rate will be biased conservatively because of three factors: (1) most studies limit their analyses to one or both of the hypervariable regions rather than the complete control region (table 2); (2) the approaches used will not capture newly arising heteroplasmic mutations whose allele proportions are ≤20%; and (3) the pedigree rate is calculated on the basis of the number of transmission events, and many family members are not directly analyzed (e.g., Heyer et al. [2001] analyzed 61 control-region sequences that represent a total of 508 transmission events). There will be a countervailing inflation of the pedigree rate if a large proportion of the mutations are somatic rather than germline. The issue of somatic mtDNA mutations is one that is further considered in the “Discussion” section.

Pedigree Divergence Rate of the Coding Region

The related issue of the pedigree divergence rate in the mtDNA coding region was raised in our initial analysis (Howell et al. 1996), and those fragmentary data suggested that this rate might also be higher than the phylogenetically derived rate. Other results do not support a higher pedigree rate of coding region divergence. Thus, Cavelier et al. (2000) sequenced a 500-bp fragment of the coding region, and they did not observe any newly arising mutations among a total of 194 individuals from 33 matrilineal pedigrees. To provide robust information on this issue will require much more sequence information than is currently available. Nevertheless, we present here some results of additional investigation of the coding-region divergence rate.

In the majority of branches of the TAS1 LHON family, penetrance attains a level in line with other Australian LHON pedigrees. However, there are branches that have a low penetrance and others that have atypically high penetrance (Howell and Mackey 1998). Coding region sequences were analyzed for the following family members: VIII-273, normal penetrance; X-22, low penetrance branch L1; VIII-137 and VIII-138, low penetrance branch L2; IX-149, high penetrance branch H1; and VIII-219, high penetrance branch H2 (the branches are designated according to criteria reported elsewhere [Howell and Mackey 1998]). The coding region for all six family members carried the following polymorphisms: 750 A→G, 1438 A→G, 4626 A→G, 4769 A→G, 8860 A→G, 11778 G→A, 11788 C→T, 15317 G→A, and 15326 A→G. There were no additional sequence changes in any of the TAS1 mtDNAs, homoplasmic or heteroplasmic, irrespective of the penetrance. Therefore, the different levels of penetrance in the branches of the 11778 TAS1 family are not caused by secondary mtDNA mutations in certain branches of the pedigree.

We have now completed the mtDNA sequence analysis of the ENG1 LHON pedigree for a family member from the main branch of the pedigree, and it carries the following coding region polymorphisms in addition to the LHON mutation at nt 11778: 750 A→G, 1438 A→G, 2706 A→G, 4216 T→C, 4769 A→G, 5471 G→A, 5633 C→T, 7028 C→T, 7476 C→T, 8860 A→G, 10172 G→A, 10398 A→G, 11251 A→G, 11719 G→A, 12612 A→G, 13708 G→A, 14569 G→A, 14766 C→T, 15257 G→A, 15326 A→G, 15452 C→A, and 15812 G→A. The polymorphism at nt 14569 is associated with a branch of haplogroup J2 (see the network in fig. 4 of the article by Herrnstadt et al. [2002a]). The mtDNA from family members of the “A” LHON family (fig. 3) also carry this polymorphism, a result that supports a close genetic relationship. In addition, we observed that the ENG1 mtDNA carries a polymorphism at nt 5471. More than 100 haplogroup J mtDNAs have been screened, including 6 mtDNAs from the 14569 subcluster, and only the ENG1 mtDNA carries the 5471 polymorphism (data not shown). Sequence analysis of two members of the “A” branch shows that they carry the 5471 polymorphism, and therefore this family is an authentic branch of the ENG1 LHON pedigree. One of the two visually affected members of the “A” branch family (fig. 3) was homoplasmic for both the 11778 and 5471 sequence changes. In contrast, his unaffected maternal great-aunt (fig. 3) is heteroplasmic at both sites, and the allele ratios are essentially identical. These results suggest that both coding region mutations arose simultaneously in the same mtDNA “founder” molecule. Analysis of other family members confirms that the 5471 mutation has arisen in the germline and is segregating in this maternal lineage (N.H., unpublished data).

The cumulative coding region data presented here can be combined with those published elsewhere (Howell et al. 1996), to derive a preliminary estimate of the pedigree divergence rate. Excluding the LHON mutations, the rate of newly arising germline mutations in the coding region is as follows: TAS1, 0 mutations/107 transmission events; ENG1, 1 mutation/26 transmission events; USA1, 1 mutation/11 transmission events; NWC1, 1 mutation/9 transmission events; and QLD1, 1 mutation/17 transmission events. Thus, there are 4 coding region mutations/170 transmission events, or ~0.15 mutations/bp/Myr (99.5% CI 0.02–0.49). This rate is ~100-fold higher than the phylogenetically derived rate (see also Howell et al. 1996). If we pool our results with those of Cavelier et al. (2000), we obtain 4 mutations/462 transmission events, or a rate of 0.06 mutations/bp/Myr (99.5% CI 0.008–0.20), which is ~30 times higher than the phylogenetic rate. This rate should be conservative because it is based on partial sequencing of the coding region, but there may be factors that bias the rate estimate in the opposite direction.
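
The coding-region point estimates above can be reproduced approximately from the per-pedigree counts. The sketch assumes a coding-region length of 15,447 bp, obtained here by subtracting the 1,122-bp control region from a 16,569-bp genome; that length is an assumption of this illustration and is not stated in the text. The 20-year generation time is the value given in the Methods.

```python
CODING_BP = 16569 - 1122   # assumed coding-region length (15,447 bp)
YEARS_PER_GENERATION = 20  # as stated in the Methods

# (new germline mutations, transmission events) per pedigree, from the text.
pedigrees = {
    "TAS1": (0, 107),
    "ENG1": (1, 26),
    "USA1": (1, 11),
    "NWC1": (1, 9),
    "QLD1": (1, 17),
}

mutations = sum(m for m, _ in pedigrees.values())
transmissions = sum(t for _, t in pedigrees.values())


def coding_rate(n_mutations, n_transmissions):
    """Divergence rate (twice the mutation rate) in mutations/bp/Myr."""
    return 2 * n_mutations / (n_transmissions * YEARS_PER_GENERATION * CODING_BP) * 1e6


lhon_rate = coding_rate(mutations, transmissions)
pooled_rate = coding_rate(4, 462)  # pooled with Cavelier et al. (2000)

print(mutations, transmissions)  # prints: 4 170
print(f"{lhon_rate:.2f} {pooled_rate:.2f}")
```

Under these assumptions the two estimates come out at roughly 0.15 and 0.06 mutations/bp/Myr, in line with the values quoted in the text.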

mtDNA Sequence in Different Tissues from the Same Individual

To address the issue of somatic versus germline mutations at the pedigree level, we obtained mtDNA sequence both from the WBC/platelet fraction of blood and from skeletal muscle for 17 individuals. The results for this analysis are presented in table 3. In 12 of these 17 individuals, there were no mtDNA sequence differences between muscle and blood, and there were no heteroplasmic sequence changes. However, in two individuals (sequences 145 and 149), each blood mtDNA carried a homoplasmic sequence change that was undetectable in muscle. Tissue-specific events may represent somatic mtDNA mutations, even if they are homoplasmic. A similar finding occurred for individual 140, who was also homoplasmic for a control-region polymorphism that could not be detected in the muscle of this individual. In addition, this individual carried a heteroplasmic sequence change in muscle that was not detected in blood. In a fourth individual (sequence 152), a control-region polymorphism was apparently homoplasmic in blood but heteroplasmic in muscle. Finally, individual 147 was heteroplasmic at two coding region sites, and the proportions of the non-rCRS alleles were clearly higher in blood than in muscle. Furthermore, the allele proportions for the two sites were similar (cosegregation), and the sites of the mutations were within 45 bp. These two polymorphisms apparently occurred simultaneously in a “founder” mtDNA molecule. In addition, we observed a smaller frequency of heteroplasmic individuals than did Tully et al. (2000), although the results are more similar if hypermutation at nt 16093 is omitted from their analysis.

Table 3
mtDNA Discrepancies between Blood and Muscle

The mitochondrial genomes of the USA2 LHON pedigree, the ENG1 LHON pedigree, and the individual who carries mtDNA 147 (table 3) all appear to have undergone “two-hit” mutational events. Two of these instances involve two coding-region mutations arising simultaneously, and the third involves one mutation in the coding region and one in the control region (table 1). The biological significance, if any, of these “two-hit” events is unclear at present, but they suggest that mtDNA mutations may not be independent in time.

Discussion

The results of the present study extend and support the previous contention that the pedigree rate of human mtDNA divergence in the control region is higher, by an order of magnitude, than the rate derived with phylogenetic approaches (Howell et al. 1996; Parsons et al. 1997). Although phylogenetic approaches underestimate, to some extent, the control-region divergence rate (see below), the main issue is the elevated rate of divergence in pedigrees. The available data also indicate that the pedigree rate of coding region divergence is elevated, but discussion of this issue is deferred until further data are obtained.

Let us first consider these two rates in terms of what we know about human mitochondrial genetics and what is being measured operationally. An important point is that mtDNA—to use the terminology of Rand (2001)—“exists in a nested hierarchy of populations.” That is, there are population effects at the levels of the mtDNA molecule, the mitochondrial organelle, the cell, the tissue, the individual, and the human population. Both selection and random genetic drift can operate at each of these population levels (Rand 2001). Starting at the level of the mtDNA molecule in the female germline, there is an overall mutation rate for substitutions in the control region (termed “ugerm” here). Both the pedigree (kped) and the phylogenetic (kphy) divergence rates must be defined operationally, and kped is defined here as the rate at which control-region mutations occur and segregate to a detectable level in a matrilineal pedigree. For reasons to be explained below, this terminology is different from that used by Sigurðardóttir et al. (2000).

Both heteroplasmic and homoplasmic mutations, in this study and in others, have been included in the pedigree rate estimate, with the practical limitation that only those heteroplasmic mutations that have reached a level of 20%–30% of the mtDNA population will be detectable. This definition raises the question of whether the kped/kphy rate difference can be resolved by simply “counting” only homoplasmic mutants, thereby presuming that most heteroplasmic mutations will not become homoplasmic and/or will not be transmitted at the level of the population. For example, Bendall et al. (1996) assume that one-half of heteroplasmic mutations detected in their pedigree analyses will not become homoplasmic for the “new” allele. That assumption, however, is not supported by any experimental evidence of which we are aware, and it does not recognize that the probability of becoming homoplasmic is almost certainly a function of allele load. Thus, heteroplasmic mutations that have reached a level of 20%–30%, the minimum levels for pedigree analysis of the divergence rate, should have a much higher probability of becoming homoplasmic than those that have reached, for example, only 1%–2%. Furthermore, even the omission of one-half of the heteroplasmic mutations would not resolve the pedigree/phylogenetic rate difference. The results of Heyer et al. (2001) are particularly germane. They have analyzed deep maternal lineages and have based their divergence rate estimate on control-region mutations that have arisen within these lineages, all of which had become homoplasmic. Their estimated divergence rate is ~30% lower than the rate obtained with the pooled results (table 2).
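The dependence of fixation on allele load can be illustrated with a minimal simulation sketch. It assumes purely neutral drift with repeated binomial resampling of a fixed number of segregating units; the unit count (50), the starting fractions, and the function name are hypothetical choices for illustration and are not estimates from this study. Each resampling round is an abstract segregation step, not necessarily one human generation.

```python
import random

def fixation_probability(start_freq, units=50, trials=2000, seed=1):
    """Estimate the chance that a neutral variant starting at start_freq
    drifts to 100% (homoplasmy) rather than to 0% (loss).

    units is a hypothetical number of segregating mtDNA units that is
    resampled with replacement in every round (an assumption).
    """
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        count = round(start_freq * units)
        # Resample until the variant is either lost or fixed.
        while 0 < count < units:
            p = count / units
            count = sum(1 for _ in range(units) if rng.random() < p)
        if count == units:
            fixed += 1
    return fixed / trials

# A variant near the detection threshold versus one at a very low level.
low = fixation_probability(0.02)
high = fixation_probability(0.26)
print(f"start 2%: {low:.3f}   start 26%: {high:.3f}")
```

Under these assumptions, the estimated fixation probability tracks the starting fraction, which is the neutral-drift expectation. Selection or a different segregation model would change the numbers but not the qualitative point that allele load matters.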

Sigurðardóttir et al. (2000) also raise the concern that many heteroplasmic mutations are somatic variants rather than true germline mutations. However, their data clearly show that the heteroplasmic mutations in their lineages were inherited through multiple generations and therefore that they cannot be somatic. The requirement for multigeneration transmission was a specific inclusion criterion for our analyses, and a survey of the other published studies indicates that inflation of the estimated kped (table 2) by somatic variants is untenable. Even in the study reported by Parsons et al. (1997), which analyzed small pedigrees, the newly arising mutations were detected in multiple family members and thus cannot be somatic. Somatic mtDNA variants do occur (table 3; see also Howell et al. 1996), but they will not limit pedigree analyses if the appropriate inclusion criteria are used. On balance, therefore, the use of heteroplasmic mutations in divergence rate estimates is appropriate, as long as one understands that the different divergence rates are complex functions of multiple evolutionary processes and not the measurements of a single, simple process.

Both kphy and kped will be less than ugerm, and we can identify some of the multiple factors that are involved in this “loss of mutations,” although it is not yet possible to quantify the relative magnitude of their effects. One major reason that kped will be less than ugerm is that a proportion of new mutations will not be transmitted because of the reduction—or bottleneck—in the units of mtDNA segregation that occurs during oogenesis (Chinnery et al. 2000; Howell et al. 2000; Jansen 2000). Selection can also reduce the number of mutations that are transmitted (e.g., mutations in the control region that impair replication). Because kped will be a function both of the mutation rate at the molecular level and the segregation/transmission processes at the molecular, organelle, cellular, and individual levels (Howell and Mackey 1997; Rand 2001), we use a term different from u, which was defined by Sigurðardóttir et al. (2000) as the mutation rate/individual/generation.
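As a rough sketch of how a bottleneck can remove new mutations, consider a single round in which a fixed number of segregating units is drawn independently from a large mtDNA pool. Under that simplifying assumption, the probability that a variant present at fraction p is absent from all n sampled units is (1 - p)^n. The function name and the values of p and n below are hypothetical and are not taken from the studies cited.

```python
def loss_probability(variant_fraction, segregating_units):
    """Probability that none of the sampled units carries the variant,
    assuming independent draws from a large pool (binomial sampling)."""
    return (1.0 - variant_fraction) ** segregating_units

# Hypothetical case: a new variant at 1% of the mtDNA pool.
for n in (1, 10, 100):
    print(n, round(loss_probability(0.01, n), 4))
```

A smaller number of segregating units makes immediate loss of a rare variant more likely, and it also makes large frequency shifts more likely among the variants that do pass through.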

The available results also indicate that kphy will be less than kped in mitochondrial lineages. Sigurðardóttir et al. (2000) defined ks, the rate at which the population becomes fixed for a mutation, and ka, the rate of mutation along an ancestral lineage. The terms “ka” and “kphy” refer to the same rate, but we prefer “divergence” to “mutation” rate, to allow for the influence of processes other than mutation. Sigurðardóttir et al. (2000) argue that, under certain conditions of neutral evolution, ka/kphy and u/kped are “trying to estimate the same thing” (see their discussion on p. 1607). However, kped will be larger than kphy as a consequence of several processes, including both selection and the relative magnitudes of the mutation rate and the inverse of the effective population size (see also Donnelly 1991). Those conditions, as we discuss here, are more likely to apply to the human mitochondrial genome. Before considering selection or nonneutral evolution, there are two other processes that will tend to reduce kphy relative to kped. In “real” populations, even homoplasmic mtDNA mutations will be lost through the extinction of maternal lineages, and this process will obviously be a function of the number of generations. Heyer (1995) has shown that mtDNA lineages are lost more frequently than nuclear genetic lineages, even in growing populations and over relatively short timescales. The second effect is the high mtDNA mutation rate and the relatively small effective population sizes of mitochondrial genes.
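The loss of maternal lineages over generations can be sketched with a simple branching-process calculation. This is an illustrative model only: it assumes that each woman leaves a Poisson-distributed number of daughters with a fixed mean, and the mean values used (1.0 and 1.2) are hypothetical rather than demographic estimates from Heyer (1995) or from this study.

```python
import math

def extinction_by_generation(mean_daughters, generations):
    """Cumulative probability that a single founder's maternal lineage
    has died out by each generation (Galton-Watson process with a
    Poisson number of daughters; an assumed model)."""
    q = 0.0  # a lineage cannot be extinct at generation 0
    history = []
    for _ in range(generations):
        # Poisson generating function evaluated at the previous value.
        q = math.exp(mean_daughters * (q - 1.0))
        history.append(q)
    return history

stable = extinction_by_generation(1.0, 20)    # constant population size
growing = extinction_by_generation(1.2, 200)  # growing population
print(f"constant size, 20 generations: {stable[-1]:.3f}")
print(f"growing, 200 generations:      {growing[-1]:.3f}")
```

Under these assumptions, the probability of lineage extinction rises with the number of generations, and a substantial fraction of founder lineages is lost even when the population is growing. Any homoplasmic mutation carried only by a lost lineage disappears with it.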

Sigurðardóttir et al. (2000) conclude that selection is unlikely to be a major factor that underlies the difference between phylogenetic and pedigree divergence rates, which is not the same thing as saying that selection does not act on mtDNA. There is substantial evidence that the evolution of mtDNA is not neutral (reviewed by Gerber et al. [2001] and Rand [2001]). It has also been shown that human mtDNA does not obey a molecular clock (Ingman et al. 2000; Torroni et al. 2001b; N.H., J. L. Elson, D.M.T., and C.H., unpublished data). However, it has not been possible to identify which specific sites in the control region, if any, are subject to selection. Meyer et al. (1999) have shown both that control-region domains with no known function can have low rates of divergence and that functional domains of the mtDNA control region do not have uniformly low divergence rates (see their fig. 2 and the accompanying discussion). Furthermore, site variability of divergence is not a perfect indicator of which sites are vulnerable to selection, because sequence context can act as a determinant of mutation rate (Zavolan and Kepler 2001). Sorting out the effects of selection on control-region sites is further complicated by the absence of recombination in human mtDNA, which “links” evolution of neutral sites in the control region to coding region sites that are subject to selection (e.g., see Rand 2001).

Some clue as to the relative roles of selection versus random genetic drift should come from the spectrum of the mutations that have been identified in pedigree analyses. On the basis of the published information (references in table 2), ~65% of these mutations involve C→T or T→C changes on the L-strand, whereas ~35% involve A→G or G→A changes. The number of mutations analyzed is small, but these proportions are compatible with the rate estimates for the human mtDNA control region derived from phylogenetic analyses (e.g., see fig. 2 of article by Tamura [2000]). At this point, the results suggest that selection has not preferentially distorted the pedigree mutation spectrum relative to the phylogenetic one, but more data are needed. In this regard, transmission of pathogenic mtDNA mutations appears to be determined more by random genetic drift than by selection in mother-offspring transmissions (Chinnery et al. 2000).

Heyer et al. (2001) have analyzed deep-rooting French-Canadian maternal lineages, and they derive a divergence rate (table 2) that is similar to the overall rate with pooled samples. Furthermore, they concluded that the pedigree divergence rate is higher than the phylogenetic rate because of the stronger effects of rapidly evolving sites on the former. This proposal has been made elsewhere (e.g., see Pääbo 1996; Macaulay et al. 1997; Jazin et al. 1998), but it is one that is not without problems (discussed in articles by Howell and Mackey [1997], Parsons and Holland [1998], and Meyer et al. [1999]). For example, because of the high frequency of homoplasy (parallel mutations and reversion), it is difficult for standard phylogenetic approaches to identify mutational hotspots (e.g., see table 4 in the article by Excoffier and Yang [1999]). In fact, one can turn the argument around and suggest that some fraction of the disparity in rate estimates results from this limitation of phylogenetic approaches (Howell et al. 1996; Sigurðardóttir et al. 2000). Heyer et al. (2001) attempt to avoid this limitation through the analysis of substitution-density distributions for the HVR1 and HVR2 segments of the control region among a set of European mtDNAs. However, their approach requires that the sequences have a “starlike” phylogeny, and this is not the case, except for the main subclusters of haplogroup H mtDNAs (see fig. 3 in article by Herrnstadt et al. [2002a]).

The issue of the mtDNA divergence rate has been studied in animals other than humans. Lambert et al. (2002) observed that the empirically derived rate of penguin mtDNA-control-region divergence was two to seven times higher than the phylogenetically derived rate. Denver et al. (2000) estimated the rate of mtDNA divergence in mutation accumulation lines of Caenorhabditis elegans to be 8.9 mutations/site/Myr, a rate that is ~100-fold higher than phylogenetically derived divergence rates. The authors obtained no evidence that mutational hotspots were causing the high rate of divergence. Furthermore, the accumulation lines do not show the bias toward synonymous substitutions that typifies the phylogenetic results (see table 1 in the article by Denver et al. [2000]). On the basis of this observation, they suggest that selection is a strong force that has acted, during the evolution of mtDNA, to remove a substantial proportion of coding region mtDNA mutations before they become fixed at the population level.

In conclusion, the results presented here indicate that the pedigree rate of control-region divergence is significantly higher than the phylogenetic divergence rate. The disparity is unlikely to be caused by a single factor or evolutionary process, and we suggest that mutational hotspots, random genetic drift, the inability of phylogenetic methods to adequately capture the high levels of control-region homoplasy, and selection are involved. Pedigree analyses provide a complementary approach to phylogenetic analyses that will allow us to more fully understand the processes that shape the evolution of the human mitochondrial genome.

Acknowledgments

This work was submitted by C.B.S. to The University of Texas Medical Branch (UTMB) in partial fulfillment of the requirements for the degree of Master of Science. This research was supported, in part, by National Science Foundation grant BCS-9910871 (to N.H.) and by a Wellcome Collaboration Grant (to N.H. and D.M.T.). The technical assistance of Iwona Kubacka (UTMB) is gratefully acknowledged. We also acknowledge the assistance of Drs. Ivan Bodis-Wollner (State University of New York [SUNY], Health Science Center at Brooklyn) and Jerome Sherman (SUNY College of Optometry) in obtaining samples and pedigree information for the USA2 family with LHON. Drs. Jerrold M. Olefsky (University of California, San Diego) and Christen Anderson (MitoKor) provided the matched blood and muscle samples. Finally, we thank Dr. Judah Rosenblatt (UTMB), for his assistance with the statistical analyses of divergence rates, and Dr. Joanna Elson (University of Newcastle), for her assistance with Fisher’s exact tests and for her comments on earlier versions of the manuscript.

Electronic-Database Information

The accession number and URL for data presented herein are as follows:

Exact Binomial and Poisson Confidence Intervals, http://members.aol.com/johnp71/confint.html.
Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/Omim (for LHON [MIM #535000])

References

Andrews RM, Kubacka I, Chinnery PF, Lightowlers RN, Turnbull DM, Howell N (1999) Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nat Genet 23:147. [PubMed]
Bandelt H-J, Alves-Silva J, Guimaraes PEM, Santos MS, Brehm A, Pereira L, Coppa A, Larruga JM, Rengo C, Scozzari R, Torroni A, Prata MJ, Amorim A, Prado VF, Pena SDJ (2001) Phylogeography of the human mitochondrial haplogroup L3e: a snapshot of African prehistory and Atlantic slave trade. Ann Hum Genet 65:549–563. [PubMed]
Bendall KE, Macaulay VA, Baker JR, Sykes BC (1996) Heteroplasmic point mutations in the human mtDNA control region. Am J Hum Genet 59:1276–1287. [PMC free article] [PubMed]
Cavelier L, Jazin E, Jalonen P, Gyllensten U (2000) mtDNA substitution rate and segregation of heteroplasmy in coding and noncoding regions. Hum Genet 107:45–50. [PubMed]
Chinnery PF, Thorburn DR, Samuels DC, White SL, Dahl H-HM, Turnbull DM, Lightowlers RN, Howell N (2000) The inheritance of mitochondrial DNA heteroplasmy: random drift, selection or both? Trends Genet 16:500–505. [PubMed]
Denver DR, Morris K, Lynch M, Vassilieva LL, Thomas WK (2000) High direct estimate of the mutation rate in the mitochondrial genome of Caenorhabditis elegans. Science 289:2342–2344. [PubMed]
Donnelly P (1991) Comment on the growth and stabilization of populations. Stat Sci 6:277–279.
Excoffier L, Yang Z (1999) Substitution rate variation among sites in mitochondrial hypervariable region I of humans and chimpanzees. Mol Biol Evol 16:1357–1368. [PubMed]
Gerber AS, Loggins R, Kumar S, Dowling TE (2001) Does nonneutral evolution shape observed patterns of DNA variation in animal mitochondrial genomes? Ann Rev Genet 35:539–566. [PubMed]
Hasegawa M, DiRenzo A, Kocher TD, Wilson AC (1993) Toward a more accurate estimate for the human mitochondrial DNA tree. J Mol Evol 37:347–354. [PubMed]
Herrnstadt C, Elson JL, Fahy E, Preston G, Turnbull DM, Anderson C, Ghosh SS, Olefsky JM, Beal MF, Davis RE, Howell N (2002a) Reduced-median-network analysis of complete mitochondrial DNA coding-region sequences for the major African, Asian, and European haplogroups. Am J Hum Genet 70:1152–1171. [PMC free article] [PubMed]
Herrnstadt C, Preston G, Andrews R, Chinnery PF, Lightowlers R, Turnbull DM, Kubacka I, Howell N (2002b) A high frequency of polymorphisms in HeLa cell sublines. Mutat Res 501:19–28. [PubMed]
Heyer E (1995) Mitochondrial and nuclear genetic contribution of female founders to a contemporary population in northeast Quebec. Am J Hum Genet 56:1450–1455. [PMC free article] [PubMed]
Heyer E, Puymirat J, Dieltjes P, Bakker E, de Knijff P (1997) Estimating Y chromosome specific microsatellite mutation frequencies using deep rooting pedigrees. Hum Mol Genet 6:799–803. [PubMed]
Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D (2001) Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet 69:1113–1126. [PMC free article] [PubMed]
Howell N, Bogolin C, Jamieson R, Marenda DR, Mackey DA (1998) mtDNA mutations that cause optic neuropathy: how do we know? Am J Hum Genet 62:196–202. [PMC free article] [PubMed]
Howell N, Chinnery PF, Ghosh SS, Fahy E, Turnbull DM (2000) Transmission of the human mitochondrial genome. Hum Reprod Suppl 15:235–245. [PubMed]
Howell N, Kubacka I, Halvorson S, Howell B, McCullough DA, Mackey D (1995) Phylogenetic analysis of the mitochondrial genomes from Leber hereditary optic neuropathy pedigrees. Genetics 140:285–302. [PMC free article] [PubMed]
Howell N, Kubacka I, Mackey DA (1996) How rapidly does the human mitochondrial genome evolve? Am J Hum Genet 59:501–509. [PMC free article] [PubMed]
Howell N, Mackey DA (1997) Reply to Macaulay et al. Am J Hum Genet 61:986–990.
——— (1998) Low-penetrance branches in matrilineal pedigrees with Leber hereditary optic neuropathy. Am J Hum Genet 63:1220–1224. [PMC free article] [PubMed]
Howell N, Smejkal CB (2000) Persistent heteroplasmy of a mutation in the human mtDNA control region: hypermutation as an apparent consequence of simple-repeat expansion/contraction. Am J Hum Genet 66:1589–1598. [PMC free article] [PubMed]
Howell N, Kubacka I, Xu M, McCullough DA (1991) Leber hereditary optic neuropathy: involvement of the mitochondrial ND1 gene and evidence for an intragenic suppressor mutation. Am J Hum Genet 48:935–942. [PMC free article] [PubMed]
Howell N, Xu M, Halvorson S, Bodis-Wollner I, Sherman J (1994) A heteroplasmic LHON family: tissue distribution and transmission of the 11778 mutation. Am J Hum Genet 55:203–206. [PMC free article] [PubMed]
Ingman M, Kaessmann H, Pääbo S, Gyllensten U (2000) Mitochondrial genome variation and the origin of modern humans. Nature 408:708–712. [PubMed]
Jansen RPS (ed) (2000) The bottleneck: gamete and embryo mitochondria in humans. Hum Reprod volume 15, supplement 2.
Jazin E, Soodyall H, Jalonen P, Lindholm E, Stoneking M, Gyllensten U (1998) Mitochondrial mutation rate revisited: hot spots and polymorphism. Nat Genet 18:109–110. [PubMed]
Kayser M, Roewer L, Hedman M, Henke L, Henke J, Brauer S, Kruger C, Krawczak M, Nagy M, Dobosz T, Szibor R, de Knijff P, Stoneking M, Sajantila A (2000) Characteristics and frequency of germline mutations at microsatellite loci from the human Y chromosome, as revealed by direct observation in father/son pairs. Am J Hum Genet 66:1580–1588. [PMC free article] [PubMed]
Lambert DM, Ritchie PA, Millar CD, Holland B, Drummond AJ, Baroni C (2002) Rates of evolution in ancient DNA from Adélie penguins. Science 295:2270–2273. [PubMed]
Macaulay VA, Richards MB, Forster P, Bendall KE, Watson E, Sykes BC, Bandelt H-J (1997) mtDNA mutation rates: no need to panic. Am J Hum Genet 61:983–986. [PMC free article] [PubMed]
Mackey DA, Buttery RG (1992) Leber hereditary optic neuropathy in Australia. Aust N Z J Ophthalmol 20:177–184. [PubMed]
Meyer S, Weiss G, von Haeseler A (1999) Pattern of nucleotide substitutions and rate heterogeneity in the hypervariable regions I and II of human mtDNA. Genetics 152:1103–1110. [PMC free article] [PubMed]
Mumm S, Whyte MP, Thakker RV, Buetow KH, Schlessinger D (1997) mtDNA analysis shows common ancestry in two kindreds with X-linked recessive hypoparathyroidism and reveals a heteroplasmic silent mutation. Am J Hum Genet 60:153–159. [PMC free article] [PubMed]
Pääbo S (1996) Mutational hot spots in the mitochondrial microcosm. Am J Hum Genet 59:493–496. [PMC free article] [PubMed]
Parsons TJ, Holland MM (1998) Mitochondrial mutation rate revisited: hot spots and polymorphism. Nat Genet 18:110. [PubMed]
Parsons TJ, Muniec DS, Sullivan K, Woodyatt N, Alliston-Grenier R, Wilson MR, Berry DL, Holland KA, Weedn VW, Gill P, Holland MM (1997) A high observed substitution rate in the human mitochondrial DNA control region. Nat Genet 15:363–368. [PubMed]
Rand DM (2001) The units of selection on mitochondrial DNA. Ann Rev Ecol Syst 32:415–448.
Rannala B, Bertorelle G (2001) Using linked markers to infer the age of a mutation. Hum Mutat 18:87–100. [PubMed]
Relethford JH (2001) Ancient DNA and the origin of modern humans. Proc Natl Acad Sci USA 98:390–391. [PMC free article] [PubMed]
Richards M, Macaulay V, Hickey E, Vega E, Sykes B, Guida V, Rengo C, et al (2000) Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet 67:1251–1277. [PMC free article] [PubMed]
Sigurðardóttir S, Helgason A, Gulcher JR, Stefansson K, Donnelly P (2000) The mutation rate in the human mtDNA control region. Am J Hum Genet 66:1599–1609. [PMC free article] [PubMed]
Soodyall H, Jenkins T, Mukherjee A, Du Toit E, Roberts DF, Stoneking M (1997) The founding mitochondrial DNA lineages of Tristan da Cunha. Am J Phys Anthropol 104:157–166. [PubMed]
Stoneking M, Sherry ST, Redd AJ, Vigilant L (1992) New approaches to dating suggest a recent age for the human mtDNA ancestor. Philos Trans R Soc London B Biol Sci 337:167–175. [PubMed]
Tamura K (2000) On the estimation of the rate of nucleotide substitution for the control region of human mitochondrial DNA. Gene 259:189–197. [PubMed]
Tamura K, Nei M (1993) Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 10:512–526. [PubMed]
Torroni A, Bandelt H-J, Macaulay V, Richards M, Cruciani F, Rengo C, Martinez-Cabrera V, et al (2001a) A signal, from human mtDNA, of postglacial recolonization in Europe. Am J Hum Genet 69:844–852. [PMC free article] [PubMed]
Torroni A, Rengo R, Guida V, Cruciani F, Sellitto D, Coppa A, Calderon FL, Simionati B, Valle G, Richards M, Macaulay V, Scozzari R (2001b) Do the four clades of the mtDNA haplogroup L2 evolve at different rates? Am J Hum Genet 69:1348–1356. [PMC free article] [PubMed]
Tully LA, Parsons TJ, Steighner RJ, Holland MM, Marino MA, Prenger VL (2000) A sensitive denaturing gradient-gel electrophoresis assay reveals a high frequency of heteroplasmy in hypervariable region 1 of the human mtDNA control region. Am J Hum Genet 67:432–443. [PMC free article] [PubMed]
Wallace DC (1970) A new manifestation of Leber’s disease and a new explanation for the agency responsible for its unusual pattern of inheritance. Brain 93:121–132. [PubMed]
Zavolan M, Kepler TB (2001) Statistical inference of sequence-dependent mutation rates. Curr Opin Genet Devel 11:612–615. [PubMed]

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics
