Genetics. 2010 Feb; 184(2): 529–545.
PMCID: PMC2828730

Gene Genealogies Strongly Distorted by Weakly Interfering Mutations in Constant Environments


Neutral nucleotide diversity does not scale with population size as expected, and this “paradox of variation” is especially severe for animal mitochondria. Adaptive selective sweeps are often proposed as a major cause, but a plausible alternative is selection against large numbers of weakly deleterious mutations subject to Hill–Robertson interference. The mitochondrial genealogies of several species of whale lice (Amphipoda: Cyamus) are consistently too short relative to neutral-theory expectations, and they are also distorted in shape (branch-length proportions) and topology (relative sister-clade sizes). This pattern is not easily explained by adaptive sweeps or demographic history, but it can be reproduced in models of interference among forward and back mutations at large numbers of sites on a nonrecombining chromosome. A coalescent simulation algorithm was used to study this model over a wide range of parameter values. The genealogical distortions are all maximized when the selection coefficients are of critical intermediate sizes, such that Muller's ratchet begins to turn. In this regime, linked neutral nucleotide diversity becomes nearly insensitive to N. Mutations of this size dominate the dynamics even if there are also large numbers of more strongly and more weakly selected sites in the genome. A genealogical perspective on Hill–Robertson interference leads directly to a generalized background-selection model in which the effective population size is progressively reduced going back in time from the present.

OBSERVED levels of apparently neutral nucleotide diversity (πn) are typically lower than expected under the assumptions of standard equilibrium theories, and they vary much less among species than do estimates of long-term effective population sizes (Nei and Graur 1984; Bazin et al. 2006; Nabholz et al. 2008). Many explanations have been proposed for the apparent shortfalls and the lack of proportionality with population size, including (1) complex demographic histories (e.g., recurring population bottlenecks), (2) adaptive selective sweeps (Maynard Smith and Haigh 1974; Gillespie 1999), and (3) selection against deleterious mutations (Charlesworth et al. 1993, 1995; McVean and Charlesworth 2000; Comeron et al. 2008). Of these three possibilities, bottlenecks and sweeps are by far the most frequently mentioned, even though deleterious mutations occur at high rates in all species, regardless of ecological circumstances (Eyre-Walker and Keightley 2007). Here we show that weakly deleterious mutations can distort genealogies in three different ways and dramatically reduce nucleotide diversities in large populations of nonrecombining chromosomes. The mitochondrial genealogies of several species of whale lice (Kaliszewska et al. 2005) are distorted in exactly these ways, and several lines of evidence suggest that bottlenecks and adaptive sweeps are not likely to be the primary causes.

Mitochondria have been proposed to be especially sensitive to selective sweeps. Animal mitochondrial genomes contain more than three dozen essential protein and structural RNA genes, so they are large targets for both mutation and selection (Ballard and Whitlock 2004). They do not undergo sexual recombination, so every advantageous mutation that fixes will reduce variation throughout the genome. Mitochondrial nucleotide diversity therefore could depend strongly on rates of environmental change, which could be similar for species with very different population sizes. Indeed, if rates of mitochondrial adaptation were mutation limited, then larger populations might actually experience higher rates of adaptive substitution and as a result show lower average levels of neutral diversity than smaller populations (Gillespie 2000, 2001). This idea was recently invoked to explain the remarkable similarity of average levels of mitochondrial nucleotide diversity among the major animal classes which appear to have very different average population sizes and substantially different average levels of nuclear nucleotide and amino acid diversity (Bazin et al. 2006).

Unconditionally deleterious mutations can also depress linked neutral diversity by reducing the effective population size either through (1) background selection against relatively strongly selected mutations (Charlesworth et al. 1993, 1995) or (2) Hill–Robertson interference (Hill and Robertson 1966) among large numbers of relatively weakly selected mutations (reviewed by Comeron et al. 2008). The second of these processes, called “weak-selection Hill–Robertson interference” (wsHRi) by McVean and Charlesworth (2000) and “interference selection” (IS) by Comeron and Kreitman (2002), can shorten genealogies, give them strongly nonneutral branch-length proportions, and skew their topologies (Higgs and Woodcock 1995; Maia et al. 2004).

To date, weak interference has mainly been studied by forward simulation, with the aim of assessing its possible effects on patterns of optimal synonymous codon use within eukaryotic nuclear genes and genomes, in the presence of recombination (Comeron and Guthrie 2005; Loewe and Charlesworth 2007; Comeron et al. 2008). In an attempt to understand the striking genealogical distortions seen in whale-louse mitochondria (Kaliszewska et al. 2005), we have developed a structured-coalescent algorithm that accurately models selection of arbitrary strength on a nonrecombining chromosome of finite length. All of the distortions seen in the whale-louse mitochondria are replicated under parameters that might plausibly apply to whale lice and many other animal species, and these distortions scale only weakly with population size.

Whale lice are permanent, obligate ectoparasites of cetaceans. They feed on the dead outer surface of their host's skin, and they appear to be harmless. They are amphipod Crustacea comprising a monophyletic family, Cyamidae, with ∼50 described species in several genera. Three of these species (Cyamus ovalis, C. gracilis, and C. erraticus) occur on right whales (Eubalaena spp.) but not regularly on any other hosts. Most adult right whales carry large populations of all three species.

Right whales in the North Pacific, the North Atlantic, and the southern hemisphere have been separated for ∼5 million years, and so have their cyamids (Rosenbaum et al. 2000; Gaines et al. 2005; Kaliszewska et al. 2005). For this reason the right whales in different ocean systems are now considered distinct species (Eubalaena japonica, E. glacialis, and E. australis), and we refer to their cyamids as North Pacific C. ovalis, North Atlantic C. ovalis, southern C. ovalis, and so on, in anticipation that a future revision of the genus Cyamus will recognize them as “triplet” sibling species (3 × 3 = 9 species in all). We studied their mitochondrial population genetics with the initial aim of quantifying patterns of genetic differentiation among the cyamid populations on individual whales within local populations (Kaliszewska et al. 2005).

We had reasoned (incorrectly) that the pattern of differentiation among whales might say something about their social interactions, since cyamids can transfer only between whales that are in direct physical contact with each other. We found very low levels of differentiation among whales and to our surprise literally no differentiation among the major southern hemisphere breeding aggregations that calve off the coasts of South America, South Africa, Australia, and New Zealand (Kaliszewska et al. 2005). This absence of population structure seems remarkable by terrestrial standards but is easily explained by modest rates of cyamid exchange among whales within local populations and between the major breeding aggregations, given the enormous sizes of cyamid populations. Right whales are highly gregarious (spending hours per day in social interactions), mobile (traveling thousands of kilometers per year on annual foraging migrations), and mortal (carrying their cyamid populations to the sea floor when they die). Thus cyamids have many opportunities to transfer between whales, and they might be expected to have evolved an inclination to do so when the opportunity presents itself (Hamilton and May 1977).

The well-defined ecology of right-whale cyamids allows their population sizes to be estimated directly. The number of adult cyamids per whale (∼500–10,000, varying by species) times the number of whales per ocean (∼50,000–200,000, prior to human exploitation) equals the number of cyamids per species (Kaliszewska et al. 2005). Thus for all three nominal species of right-whale cyamids, long-term census population sizes are expected to have been in the range 2.5 × 10^7–2 × 10^9. Given conservative estimates of the per-generation mitochondrial mutation rate, even the lower end of this range predicts levels of synonymous nucleotide diversity at least an order of magnitude larger than those actually seen in the cyamids, which are consistently modest and similar to those seen in typical terrestrial arthropods (Kaliszewska et al. 2005). The three North Atlantic and southern hemisphere sibling-species pairs are strongly reciprocally monophyletic, as illustrated for C. ovalis in Figure 1. This is not expected at mutation–drift equilibrium, given their very large population sizes.
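The arithmetic behind this range can be checked in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the mutation rate is an assumed placeholder (borrowed from the value used in the later simulations), not an estimate stated for cyamids in this passage. It uses the haploid definition θ = 2Nμ and ignores mutational saturation.

```python
def census_range(per_whale=(500, 10_000), whales=(50_000, 200_000)):
    """Return (low, high) cyamids per species: cyamids per whale x whales per ocean."""
    return per_whale[0] * whales[0], per_whale[1] * whales[1]


def neutral_theta(n, mu):
    """Haploid population mutation parameter, theta = 2 * N * mu."""
    return 2 * n * mu


low, high = census_range()
MU_ASSUMED = 2e-8  # assumed per-site, per-generation rate, for illustration only

print(f"census range: {low:.1e} to {high:.1e}")
print(f"theta at the lower bound: {neutral_theta(low, MU_ASSUMED):.2f}")
```

Even at the lower bound, the neutral expectation under these assumed values is far above the observed COI diversities of 0.007–0.015.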

Figure 1.
Mitochondrial gene genealogies for North Atlantic and southern hemisphere Cyamus ovalis, estimated by UPGMA from partial COI sequences. Left: The intraspecific genealogies coalesce globally at ∼0.5 and 1 MY and share an ancestor at ∼5 ...

These dramatic deficits of variation are not easily explained by population bottlenecks or adaptive selective sweeps. The bottleneck hypothesis is especially problematic because it requires the long-term near extinction of right whales in all three ocean systems. (Short bottlenecks such as those caused by human exploitation of right whales are not expected to have a noticeable effect on cyamid genetic diversity because cyamid populations remain large, and rates of genetic drift low, even when there are few whales.) That all three right whales have survived for millions of years suggests that they have maintained reasonably large population sizes, and the mitochondrial nucleotide diversity of southern right whales is consistent with this assumption (Kaliszewska et al. 2005), as is the nucleotide diversity of a cyamid nuclear gene (described below). Right whales eat copepods and krill, which are relatively close to the base of marine food webs, and right-whale populations are thought to be food limited. Thus a long-term, severe depression of their numbers would also seem to imply a collapse of marine ecosystems worldwide, for which there is no evidence.

Several features of cyamid mitochondrial nucleotide diversity are also inconsistent with the bottleneck model and with adaptive sweeps as well. The most obvious of these features is the uniformity of cyamid mitochondrial diversity among species (π = 0.007–0.015 for COI sequences in the seven species surveyed by Kaliszewska et al. 2005). Gene genealogies estimated from these sequences also seem remarkably uniform in total depth, with last common ancestors differing in age by only a factor of 3 and in six of the seven species by less than a factor of 2 (see Kaliszewska et al. 2005, Figure 4). Adaptive sweeps might be expected to occur at roughly random intervals and not to be well coordinated in time among seven species in three different ocean systems. Interspecific coordination of such sweeps (over the whole globe) would seem to be required if they were to be a plausible primary cause of the genealogical shortening.

Figure 4.
Apparent average effective sizes of ancestral populations, for models with different values of s. Values of Ne are given on the vertical axis (logarithmically scaled). They are estimated from the variance of expected contributions to the present, for ...

In addition to showing too little nucleotide variation, the cyamid mitochondrial genomes show strong and consistent excesses of rare nucleotide states, reflecting the “comb-like” or “star-like” shapes of the genealogies, in which deeper branches tend to be much too short relative to terminal branches (as if the trees had been “squished” from behind). This kind of distortion causes negative values of Tajima's (1989) D and related statistics. It can be caused by population expansion from a bottleneck or by lineage expansion under positive selection (Kaplan et al. 1989; Slatkin and Hudson 1991; Rogers and Harpending 1992; Bamshad and Wooding 2003). However, the form of branch-length distortion seen in the cyamid genealogies suggests a slow, steady, roughly exponential form of population or lineage growth, not the relatively sudden increases suggested by the bottleneck and selective-sweep hypotheses. Generalized skyline plots (Strimmer and Pybus 2001) describing the histories of population size implied by the shapes of the northern and southern C. ovalis gene genealogies are shown in Figure 1. They are remarkably similar, as are the growth rates and estimates of present-day θ (= 2Nfμn) obtained by fitting exponential growth models using the coalescent algorithms in LAMARC (Kuhner et al. 1998, 2004) or BEAST (Drummond and Rambaut 2007).
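To make the link between "star-like" genealogies and negative values of Tajima's D concrete, here is a minimal sketch of the statistic computed from aligned haploid sequences. It uses the standard constants from Tajima (1989). The function name and the toy sequences are hypothetical, and sites with more than two states are simply counted as one segregating site.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of equal-length haploid sequences (strings)."""
    n = len(seqs)
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites in the sample.
    seg = sum(1 for pos in range(length) if len({s[pos] for s in seqs}) > 1)
    if seg == 0:
        return 0.0
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))


# Toy star-like sample: every variant is a singleton, so D is negative.
star = ["AAAAA", "CAAAA", "ACAAA", "AACAA", "AAACA", "AAAAC"]
print(tajimas_d(star))
```

An excess of rare variants lowers the mean pairwise difference relative to the number of segregating sites, which is what drives D below zero.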

Cyamid populations cannot have grown in numbers as seemingly implied by these analyses. The number of cyamids on each whale appears to be set mainly by microhabitat limitations (e.g., by the area of rough callosity tissue on the head, where C. ovalis and C. gracilis live), and these features of their environment have hardly changed for millions of years, as demonstrated by the strong similarities of northern and southern right whales and their cyamids. Likewise, the numbers of right whales cannot have increased gradually from vanishingly small numbers over several hundred thousand years, for the reasons discussed above.

The genealogical signals of “growth” therefore seem likely to be caused by selection. Environmental change is the most obvious potential cause of selection, but the apparent rate of growth seen here is strangely slow—in fact, slower than glacial. The orbitally forced Plio-Pleistocene glacial climate cycles have a major period of ∼100,000 years (Lambert et al. 2008 and references therein), but all seven of the right-whale cyamids for which we have mitochondrial population samples appear to have been “expanding,” more or less continuously, through at least several such cycles. The seemingly fairly consistent rate of branch-length foreshortening seen in the genealogies therefore suggests the action of a process that is relatively homogeneous in time, in addition to being very slow overall.

The cyamid mitochondrial genealogies also appear to be topologically skewed, with sister clades too unequal in size, on average. In random bifurcating trees, the distribution of sister-clade sizes is uniform (Yule 1924; Heard 1992; Rogers 1994). Deviations from this null expectation can be quantified by statistics such as Colless's (1982) index of tree imbalance (Shao and Sokal 1990; Rogers 1996). Our estimates of the cyamid genealogies tend to be excessively imbalanced (Figure 1). Strong topological imbalance is not caused by classic adaptive sweeps or by population growth following a bottleneck, but previous theoretical work has indicated that it can be caused by selection (Higgs and Woodcock 1995; Maia et al. 2004).
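As an illustration of how topological imbalance can be quantified, the following sketch computes the raw Colless index (the sum, over internal nodes, of the absolute difference in tip numbers between the two daughter clades). The nested-tuple tree encoding and the function name are assumptions made for this example, and the standardization by the expected standard deviation is not included.

```python
def colless_index(tree):
    """Return (number_of_tips, Colless imbalance) for a binary tree.

    A tree is either a leaf (any non-tuple value) or a 2-tuple (left, right).
    """
    if not isinstance(tree, tuple):
        return 1, 0
    left, right = tree
    n_left, imb_left = colless_index(left)
    n_right, imb_right = colless_index(right)
    return n_left + n_right, imb_left + imb_right + abs(n_left - n_right)


balanced = (("a", "b"), ("c", "d"))
ladder = ((("a", "b"), "c"), "d")
print(colless_index(balanced)[1])  # perfectly balanced four-tip tree
print(colless_index(ladder)[1])    # maximally imbalanced four-tip tree
```

For n tips the index ranges from 0 (fully balanced, when n is a power of 2) to (n − 1)(n − 2)/2 (a pure "ladder").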

To summarize, the cyamid mitochondrial genealogies are consistently much too short, too squished, and too skewed, relative to neutral-theory expectations. Owing to several special features of cyamid and right-whale biology, selection seems to be the only plausible explanation for this set of distortions, but conventional adaptive sweeps do not seem likely to be the primary cause. We therefore asked whether weakly deleterious mutations might be sufficient to generate the observed combination of patterns, in the absence of environmental change. Previous work (mentioned above) showed that interference among weak mutations at many sites can strongly affect linked neutral variation, but this work did not fully explore the parameter space relevant to our system or connect all the patterns in a genealogical setting.

To address this question we first carried out forward simulations of populations of nonrecombining chromosomes with large numbers of nucleotide positions subject to forward and back mutations with unconditional fitness effects of size s. Large numbers of linked neutral sites were used to estimate genealogies and to calculate population statistics of interest. We found that for a range of intermediate values of s, considerable fitness variation was maintained and all three of the genealogical distortions (and the signal of apparent exponential growth) seen in the cyamid genealogies reached impressively large maxima. However, the computational burden of full forward simulation prevented us from considering realistic parameter values (i.e., large N and small μ), and it was not obvious that extrapolations based on compound parameters (e.g., Nμ and Ns) would work as hoped in all respects (see Comeron et al. 2008). We developed an equivalent coalescent algorithm that accurately reproduces all results of the forward simulations and allows for realistic parameter values. Under parameters relevant to cyamid mitochondria, the distortions of genealogical depth, proportions, and topology can be even more extreme than those seen in the cyamids, and the mean pairwise coalescence times (and resulting neutral nucleotide diversities) associated with maximally distorting intermediate values of s (Us/s ∼ 5, where Us is the total genomic mutation rate at sites with selection coefficients of size s) depend only weakly on N.

All parameters of this model (including those of the environment) remain constant over time, yet in some respects it displays apparently nonequilibrium behavior. Under weak to intermediate selection (Us/s > 10), the effective population size appears to become progressively smaller as time recedes into the past, giving rise to the illusion of growth. And in the maximally distorting range of intermediate selection coefficients, the distributions of deleterious mutation numbers and the shapes of genealogies show conspicuous dynamical instability of a form that could be taken to suggest “adaptive evolution” in response to episodes of environmental change. Adaptive mutations contribute importantly to this process, but they are reversions at some of the many sites previously mutated to mildly deleterious states. Subtle patterns of environmental change that converted previously optimal nucleotide states to slightly suboptimal states could give rise to a category of “virtual reversions” that would augment (or even outnumber) simple reversions, and the effects of such a process might well be consistent with the distortions seen in the cyamid mitochondrial genealogies. However, models with no environmental change of any kind appear to explain the observations surprisingly well.


The models described here all assume unstructured populations of fixed size N with discrete, nonoverlapping generations. Individuals are haploid and chromosomes are nonrecombining, with Ls selected sites at which one of the four nucleotides confers higher fitness than the others and Ln neutral sites where s = 0. Individual (i.e., chromosomal) fitness is multiplicative: W = (1 − s)^i, where i is the number of deleterious nucleotide states carried by a chromosome. Mutations strike all sites with probability μ per generation, changing the nucleotide at a mutated site to any of the other three with equal (Jukes–Cantor) probabilities, so back mutations to the fittest nucleotide state occur with probability μ/3.
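The basic ingredients of this model (multiplicative fitness and Jukes–Cantor mutation) can be sketched as follows. This is a minimal illustration, not the authors' implementation; all function names and the string representation of chromosomes are assumptions made here.

```python
import random

NUCLEOTIDES = "ACGT"


def fitness(i, s):
    """Multiplicative fitness of a chromosome carrying i deleterious states."""
    return (1.0 - s) ** i


def mutate_site(nuc, mu, rng):
    """With probability mu, replace nuc by one of the other three nucleotides.

    Each alternative is equally likely, so a specific target state
    (e.g., the fittest one) is reached with probability mu / 3.
    """
    if rng.random() < mu:
        return rng.choice([x for x in NUCLEOTIDES if x != nuc])
    return nuc


def deleterious_count(chrom, optimal):
    """Number of selected sites at which chrom differs from the optimal state."""
    return sum(a != b for a, b in zip(chrom, optimal))


rng = random.Random(1)
optimal = "ACGTACGT"
chrom = "".join(mutate_site(x, 0.25, rng) for x in optimal)  # exaggerated mu
i = deleterious_count(chrom, optimal)
print(chrom, i, fitness(i, 0.01))
```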

The population mutation parameter θ is often defined as 4Neμ at a diploid locus, where Ne is some measure of the effective population size. Here we treat Ne as an effect of selection, not a given parameter, and the unqualified θ refers to 2Nμ (because the models are haploid). Statistics that estimate the parameter θ have other names (e.g., π) or carry subscripts (e.g., θW, θg).

Forward simulations:

The model was first implemented as a full forward Monte Carlo simulation written in C and using various methods including integer operations wherever possible to achieve speeds that allow modestly large populations of long chromosomes to be followed for many generations. As a consequence of this “extreme-integer” strategy, most of the parameters except the mutation rate were scaled as integral powers of 2. This allowed random numbers to be used in their raw form (as unsigned integers) in most situations, simply by truncating them to the appropriate range by right shifting. Chromosome samples (n = 200) were collected and population statistics were reported every 25,000 generations for 5 × 10^6 generations (a total of 201 samples and reports) following initial burn-in periods of 3 × 10^6 generations. In all of the forward simulations described here, μ = 1.5 × 10^−6/site/generation.
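The right-shift trick can be illustrated with a short sketch. The original C code is not reproduced in this article, so the function names, the 32-bit word size, and the choice of Python here are assumptions. The idea is only that when a range or a probability is an integral power of 2, a raw random word can be truncated by shifting rather than by division or floating-point comparison.

```python
import random


def random_index(rng, power, word_bits=32):
    """Uniform integer in [0, 2**power), keeping the top `power` bits of a raw word."""
    raw = rng.getrandbits(word_bits)
    return raw >> (word_bits - power)


def event_pow2(rng, power, word_bits=32):
    """Return True with probability 2**-power (all of the top `power` bits are zero)."""
    return (rng.getrandbits(word_bits) >> (word_bits - power)) == 0


rng = random.Random(42)
# Pick a random parent in a population of N = 2**16 = 65,536 chromosomes.
parent = random_index(rng, 16)
# Decide whether an event with probability 2**-10 occurs.
hit = event_pow2(rng, 10)
print(parent, hit)
```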

Two forms of the model were considered. In model I, all selected sites had the same value of s that was varied among runs, holding other parameters constant. The case studied most thoroughly had chromosomes with 2048 selected and 6144 neutral sites, N = 65,536 (“64k”), and s varying from 2^−18 to 2^−6 at selected sites. In model II, each chromosome had blocks of sites with different values of s that spanned most of this range. Here a chromosome's fitness can be expressed as $W = \prod_j [1 - s(j)]^{m(j)}$, where m(j) is the number of deleterious nucleotide states in the block of sites with selection coefficient s(j), and the product is taken over all of the blocks. There was also a large block of neutral sites. The case described here had 11 selected blocks of 512 sites each, with s ranging from 2^−17 to 2^−7, and 2560 neutral sites. Runs were carried out at values of N from 2^14 to 2^17 (16,384 = “16k” to 131,072 = “128k”). Control runs were also carried out at the same values of N, but with selection turned off [s(j) = 0 in all blocks].
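The block-wise fitness of model II, a product over blocks of one minus the block's selection coefficient raised to the block's deleterious-state count, can be sketched as below. The function name is hypothetical, and the list of coefficients assumes that the 11 blocks take successive integer powers of 2, which is consistent with the stated range but is not spelled out in the text.

```python
def block_fitness(counts, coeffs):
    """Fitness as the product over blocks of (1 - s(j)) ** m(j)."""
    w = 1.0
    for m, s in zip(counts, coeffs):
        w *= (1.0 - s) ** m
    return w


# Assumed coefficients for the 11 selected blocks: 2^-17, 2^-16, ..., 2^-7.
MODEL_II_S = [2.0 ** -k for k in range(17, 6, -1)]

# Example: a chromosome carrying 10 deleterious states in every block.
print(block_fitness([10] * 11, MODEL_II_S))
# Control runs: selection turned off in all blocks gives fitness 1.
print(block_fitness([10] * 11, [0.0] * 11))
```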

Coalescent simulations:

Building on a model of background selection described by Hudson and Kaplan (1994), Gordo et al. (2002) developed a structured coalescent algorithm to study Muller's (1964) ratchet. In this approach, all chromosomes carrying i deleterious mutations are treated as members of a randomly mating mutational class (analogous to a deme) within which a coalescence may occur with the usual per-generation hazard

$$\frac{k_i (k_i - 1)}{2 N_i},$$

where ki is the number of lineages with i mutations in the sample at a given generation, and Ni is the size of mutational class i in the population as a whole. Mutations transfer chromosomes from class i to class i − 1 (going backward in time), but back mutations (to fitter states going forward or less fit states going backward) do not occur (i.e., the model assumes infinite sites). If the haploid genomic mutation rate U is low enough that a chromosome acquires at most one new mutation in any generation, then the probability that a chromosome with i mutations was derived from one that had i − 1 mutations in the previous generation can be expressed approximately as

equation M3

These probabilities of “migration” from one mutational class to another are functions of U and of the current distribution of mutation numbers. Gordo et al. (2002) used the deterministic (infinite-N) distribution of mutation numbers under this model, which is the Poisson distribution with parameter λ = U(1 − s)/s ≈ U/s. Thus when U = s (so U/s = 1), the mean and variance of the mutation-number distribution are both 1 and f(0) = f(1) = e^−1. As s declines relative to U, the mean of the distribution increases and the size of the unloaded chromosomal class decreases toward zero.
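A small numerical sketch may help fix these quantities. It computes the deterministic Poisson class frequencies with λ ≈ U/s and evaluates the standard pairwise coalescence hazard for k lineages in a haploid class of size N, namely k(k − 1)/(2N). The function names and parameter values are assumptions chosen for illustration.

```python
from math import exp, factorial


def poisson_class_freqs(lam, max_i):
    """Frequencies f(0..max_i) of chromosomes carrying i mutations (Poisson, mean lam)."""
    return [exp(-lam) * lam ** i / factorial(i) for i in range(max_i + 1)]


def coalescence_hazard(k_i, n_i):
    """Per-generation hazard that two of k_i lineages coalesce in a class of size n_i."""
    return k_i * (k_i - 1) / (2.0 * n_i)


N = 65_536
U, s = 0.003, 0.003  # U = s, so lam = U / s = 1
freqs = poisson_class_freqs(U / s, 6)
class_sizes = [N * f for f in freqs]
print(freqs[0], freqs[1])                       # both equal exp(-1)
print(coalescence_hazard(3, class_sizes[0]))    # three lineages in the unloaded class
```

Because lineages are confined to classes that are much smaller than N, coalescence within a class is faster than it would be in the population as a whole.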

The resulting coalescent algorithm works well in regimes where selection is strong enough that the unloaded class is occupied (N0 > 1), but its behavior begins to diverge from that of equivalent forward Monte Carlo simulations as selection becomes weak enough that all chromosomes carry at least one mutation and the ratchet begins to turn.

We generalized the approach of Gordo et al. (2002) to accommodate finite sites, four nucleotides, back mutations, and situations where all chromosomes may carry substantial numbers of deleterious mutations. At first we guessed that the problem might be solved by replacing the deterministic equilibrium distribution of mutation numbers with a distribution more appropriate to the finite-sites and finite-N assumptions of the model to be simulated (an idea inspired by the model of Rouzine et al. 2003, which works well for very weak selection). But we soon discovered that regimes involving intermediate values of s (those that give rise to the most extreme genealogical distortions) apparently cannot be modeled accurately with a fixed distribution of mutation numbers. We therefore resorted to simulating (in forward time) the full history of the mutation-number distribution which was then used (going backward) to guide the coalescent process. This hybrid approach is much more computationally intensive than classical coalescent simulation, but it is feasible for realistic parameter values and very accurate.

A simulation takes place in two phases, each carried out by a different program. First the forward program (WAVE) simulates a long history of the mutation-number distribution (the “wave” of Rouzine et al. 2003), by applying mutations to the current generation's distribution and then drawing a multinomial sample from it (weighted by relative fitness) to generate the next generation's distribution and so on for many generations. After the mutation–drift–selection process has reached steady state, a sequence of generations is saved to disk and then time reversed. These reversed-wave-history files can be huge (even using efficient schemes for representing each generation), depending on parameter values and the number of generations saved. The current implementation samples periodically from the mutation-number history rather than storing every generation to disk, because histories of ≥10^8 generations are needed for realistic parameter values and the files would be unmanageable if not compressed in this way. A sampling interval of 100 generations was used for the cases with realistic parameter values described here (e.g., Figures 3, 5, 6, and 7).
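One generation of the forward phase might be sketched as follows. This is not the WAVE program itself, whose source is not reproduced here. It assumes at most one mutation per chromosome per generation, forward mutation at the sites currently in the fittest state, back mutation at rate μ/3 per mutated site, and a plain `random.choices` draw in place of an efficient multinomial sampler, so it is practical only for small N. The function name is hypothetical.

```python
import random
from collections import Counter


def wave_step(counts, s, mu, n_sites, rng):
    """Advance the mutation-number distribution by one generation.

    counts[i] is the number of chromosomes carrying i deleterious states;
    len(counts) must be n_sites + 1. Assumes mu * n_sites is well below 1.
    """
    n = sum(counts)
    expected = [0.0] * len(counts)
    # 1. Apply mutation deterministically to the current distribution.
    for i, c in enumerate(counts):
        if c == 0:
            continue
        up = mu * (n_sites - i)   # forward (deleterious) mutations
        down = mu * i / 3.0       # back mutations to the fittest state
        expected[i] += c * (1.0 - up - down)
        if up:
            expected[i + 1] += c * up
        if down:
            expected[i - 1] += c * down
    # 2. Weight classes by relative fitness and draw N offspring (drift).
    weights = [e * (1.0 - s) ** i for i, e in enumerate(expected)]
    draws = rng.choices(range(len(counts)), weights=weights, k=n)
    tally = Counter(draws)
    return [tally.get(i, 0) for i in range(len(counts))]


rng = random.Random(0)
dist = [1000] + [0] * 64          # start with an unloaded population
for _ in range(200):
    dist = wave_step(dist, s=0.01, mu=1e-3, n_sites=64, rng=rng)
print(dist[:12])
```

Saving `dist` at a fixed interval, and reversing the saved sequence, would give a toy analog of the reversed-wave history described above.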

Figure 3.
Genealogy simulated under model I with parameters that could be realistic for right-whale cyamids. Mutations are shown on the branches; ticks to the right are forward (deleterious) mutations, and ticks to the left are back mutations. The number of nucleotide ...
Figure 5.
Average neutral nucleotide diversities (πn) as functions of the selection coefficient (s) for models with a realistically low mutation rate (μ = 2 × 10^−8), 2048 selected sites, and a range of realistically large ...
Figure 6.
Mutation-number distributions and genealogies during pseudoadaptive sweeps. The history depicted covers 3 million generations (from 15 million to 18 million, from a total of 100 million) in a simulation where N = 2.5 × 10^7, μ = ...
Figure 7.
Dynamics of fitness as a function of s. Each trace shows the population's mean mutation number (as deviations from its long-term mean, in units of the average within-generation standard deviation) over all 100 million generations of the run that was used ...

The backward program (“dynamically restructured coalescent,” DRC) then reads the reversed-wave history and uses it to guide a series of coalescent simulations. First a sample of chromosomes is drawn from the mutation-number distribution of the current generation. Then lineages are advanced, generation by generation, allowing coalescence and mutation to occur as described above and elaborated below. Because the wave file records the mutation-number distribution only every 100 generations (or at some other specified interval), the distribution is smoothly “evolved” by linear interpolation between the previous distribution retrieved from the wave file and the next one, during the generations between values stored in the wave file.
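The interpolation step between stored distributions can be sketched very simply. The function name is hypothetical, and whether the original program rounds the interpolated class sizes to integers is not stated, so this version leaves them as floats.

```python
def interpolate_distribution(prev, nxt, step, interval=100):
    """Linearly interpolate class sizes between two stored wave-file records.

    prev and nxt are class-size lists of equal length; step counts the
    generations elapsed since prev (0 <= step <= interval).
    """
    t = step / interval
    return [(1.0 - t) * a + t * b for a, b in zip(prev, nxt)]


stored_a = [100, 0]
stored_b = [0, 100]
print(interpolate_distribution(stored_a, stored_b, 25))  # a quarter of the way
```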

When global coalescence occurs, DNA sequences are generated for each chromosome in the initial sample, statistics of interest are calculated from those sequences and from the genealogy itself, an approximate genealogy is estimated from the sequences (to mimic the estimation of genealogies in the forward simulations), and all of this information is written to report files. Time is then advanced over a large random number of generations in the reversed-wave file, and the process is repeated beginning with the sampling of chromosomes from the now-current generation. This goes on, sample after sample, until the end of the wave file is reached. If <100 samples have been simulated (because the average global coalescence time is large under the current parameters), then the wave file is reopened and advanced a large random number of generations, and the sampling process continues as before. Reusing the wave file in this way gives samples that are not fully independent of each other, but it yields better estimates of overall average statistics than would stopping with just a few samples. When ≥100 samples have been simulated, the averages and variances of various statistics are calculated and written to the relevant report files.

In the coalescent algorithm, expressions for the forward and backward mutational probabilities conditioned on the mutation-number distribution (Qi,i−1 and Pi,i+1) are more complex than in the model of Gordo et al. (2002). The raw genomic mutation rates can be precalculated as lookup tables [forward rate Uf(i) = μ(Ls − i), backward rate Ub(i) = μi/3, where i is the genome's current deleterious mutation number]. However, the Qi,i−1 and Pi,i+1 must be recalculated in every generation (or at least in every generation where the mutation-number distribution is updated by reading the wave file or by interpolation). Thus

equation M4


equation M5


equation M6

Care must be taken with the treatment of boundary cases (where either Ni−1 or Ni+1 is zero) and with voids that can appear in the middle of the mutation-number distribution (where Ni goes from a finite number to zero while ki is still greater than zero), but otherwise the underlying logic of the method follows directly from that of Gordo et al. (2002). Source code and documentation for all of the programs (forward and coalescent) can be obtained from J. Seger.
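The precalculated lookup tables for the raw genomic mutation rates might look like the sketch below. It assumes that forward mutations can strike only the selected sites currently in the fittest state (rate μ times the number of such sites) and that back mutations occur at μ/3 per mutated site, as implied by the Jukes–Cantor mutation model. The function name is hypothetical, and the class-conditioned probabilities Qi,i−1 and Pi,i+1 are deliberately not attempted here.

```python
def mutation_rate_tables(mu, n_selected):
    """Return (forward, backward) rate tables indexed by mutation number i."""
    forward = [mu * (n_selected - i) for i in range(n_selected + 1)]
    backward = [mu * i / 3.0 for i in range(n_selected + 1)]
    return forward, backward


u_f, u_b = mutation_rate_tables(mu=1.5e-6, n_selected=2048)
print(u_f[0], u_b[0])        # unloaded genome: only forward mutations possible
print(u_f[2048], u_b[2048])  # fully loaded genome: only back mutations possible
```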

Cyamid DNA sequences:

The existing samples of mitochondrial COI sequences for North Atlantic and southern C. ovalis were increased to 128 and 105 individuals, respectively, by methods essentially identical to those previously described (Kaliszewska et al. 2005). The new sequences have been deposited in GenBank under accession nos. GU172208–GU172303.

Partial sequences of the nuclear EF-1α gene were obtained using “universal” arthropod primers and then extended by inverse PCR (Ochman et al. 1988, 1990). The gene has one small intron. A complete sequence from a North Atlantic C. ovalis individual (onVAsb04) is shown with its translation and heterozygosities in supporting information, Figure S1. Using this sequence, primers were designed to allow amplification and sequencing of most of the coding region and the intron (File S1). PCR products were sequenced directly on both strands for 7–15 individuals of North Atlantic and southern C. ovalis, C. gracilis, and C. erraticus. Heterozygous sites were identified by chromatogram traces showing well-defined peaks of approximately equal height for two different nucleotides at a given position. These are genuine allelic polymorphisms, as indicated by three facts: (1) almost all are synonymous, (2) almost all are close to Hardy–Weinberg genotype frequencies, and (3) some appear as shared polymorphisms between northern and southern hemisphere sibling species (Table 1 and Tables S1–S6). The sequences have been deposited in GenBank under accession nos. GU172304–GU172368.

Table 1.
Polymorphism and divergence of EF-1α in sibling species of cyamids on North Atlantic (NA) and southern hemisphere (SH) right whales (Eubalaena glacialis and E. australis)

Sequence analysis:

Mitochondrial genealogies for North Atlantic and southern C. ovalis were estimated as described previously (Kaliszewska et al. 2005) and also by BEAST (Drummond and Rambaut 2007). θ and the apparent exponential growth rate (r) were estimated by LAMARC (Kuhner et al. 1998). Topological skew was calculated as the negative of Colless's (1982) index after standardization by its expected standard deviation (Rogers 1994). This statistic (−IS) can be viewed as an analog of Tajima's (1989) D (DT), in that negative values indicate distortions of the kind expected under directional selection, and the units are standard deviations. Values were calculated by the program EIGENLAD (available from J. Seger), which also “eigenladderizes” trees for easier viewing (e.g., Figure 1). This way of rationally arranging the clades in a tree was developed by Gary Olsen (G. Olsen, personal communication) and used in a number of his articles (e.g., Olsen et al. 1994; Woese et al. 2000), but apparently it has never been described in print. Unaware of Olsen's work, we rediscovered the method and called it eigenladderization for reasons explained in File S1.
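The topological-skew statistic can be illustrated with a minimal Python sketch. It is not the EIGENLAD program: trees are represented as nested 2-tuples (a hypothetical encoding), and the mean and standard deviation used for standardization are estimated by simulating random-joining (neutral) topologies rather than taken from the analytical results of Rogers (1994). The function names are illustrative only.

```python
import random

def colless(tree):
    """Return (tip count, Colless imbalance) for a binary tree of nested 2-tuples."""
    if not isinstance(tree, tuple):
        return 1, 0  # a tip contributes one leaf and no imbalance
    n_left, i_left = colless(tree[0])
    n_right, i_right = colless(tree[1])
    # Each internal node adds the absolute difference of its two clade sizes.
    return n_left + n_right, i_left + i_right + abs(n_left - n_right)

def random_tree(n, rng):
    """Join randomly chosen pairs of lineages until one root remains."""
    nodes = list(range(n))
    while len(nodes) > 1:
        a = nodes.pop(rng.randrange(len(nodes)))
        b = nodes.pop(rng.randrange(len(nodes)))
        nodes.append((a, b))
    return nodes[0]

def standardized_skew(tree, reps=2000, seed=1):
    """Return -(I - mean) / sd, with mean and sd taken from simulated neutral trees.

    Assumption: Monte Carlo standardization stands in for the analytical
    expectations used in the paper.
    """
    rng = random.Random(seed)
    n, observed = colless(tree)
    null = [colless(random_tree(n, rng))[1] for _ in range(reps)]
    mean = sum(null) / reps
    sd = (sum((x - mean) ** 2 for x in null) / (reps - 1)) ** 0.5
    return -(observed - mean) / sd

balanced = (((0, 1), (2, 3)), ((4, 5), (6, 7)))
caterpillar = (((((((0, 1), 2), 3), 4), 5), 6), 7)
print(colless(balanced), colless(caterpillar))
print(standardized_skew(balanced), standardized_skew(caterpillar))
```

With this sign convention a maximally unbalanced ("caterpillar") tree gives a negative value and a perfectly balanced tree a positive one, matching the interpretation of −IS given above.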

Introns were removed from the edited EF-1α sequences because some contain small indels that frustrate accurate determination of both alleles. We inferred haplotypes using PHASE 2.0 (Stephens and Donnelly 2003), but with little success because linkage disequilibrium within the gene is very low, as indicated by plots of diploid pairwise linkage disequilibrium (r²) against the distance between polymorphic sites (Figure S2) (Rogers and Huff 2009). Standard diversity and divergence statistics were estimated using DnaSP 4.10 (Rozas et al. 2003) and several programs written by us.


Distortions maximized at intermediate values of s:

The sizes, shapes, and topologies of gene genealogies become substantially distorted as values of s decline from the “strong” end of the range (where Us/s < 1 and most chromosomes are mutation free) and enter the “intermediate” range (where Us/s is of order 5 and most chromosomes carry several mutations) (Figure 2). Mean values of πn, DT, and −IS all reach well-defined minima at about the same intermediate value of s, and the apparent population growth rate r reaches a maximum value that is about half as large as s. As s decreases even more, toward the “weak” range where Us/s > 10 and all chromosomes are substantially loaded, mean fitness reaches a minimum and then begins to recover, while πn, DT, and −IS increase and the apparent population growth rate declines. In this weak-selection regime, r is slightly larger than s, and θg (the population mutation parameter estimated jointly with the growth rate r) is close to the true value of θ. In the strong-selection regime, θg underestimates θ by an amount very close to that predicted by the classic background-selection model (Charlesworth et al. 1993).
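The three regimes can be made concrete with a small numerical sketch. It assumes the deterministic mutation-selection balance for a nonrecombining haploid genome without back mutation, in which mutation numbers are Poisson distributed with mean U/s and the classic background-selection prediction is Ne = N·exp(−U/s). The regime boundaries (1 and 10) are the approximate values quoted in the text, and the parameter values in the example are hypothetical.

```python
import math

def regime(U, s):
    """Classify by U/s using the approximate boundaries quoted in the text."""
    ratio = U / s
    if ratio < 1:
        return "strong"
    if ratio > 10:
        return "weak"
    return "intermediate"

def deterministic_balance(N, U, s):
    """Poisson equilibrium without back mutation (an idealization)."""
    ratio = U / s
    f0 = math.exp(-ratio)  # expected fraction of mutation-free chromosomes
    return {
        "mean_mutations": ratio,
        "unloaded_fraction": f0,
        "unloaded_count": N * f0,  # also the classic background-selection Ne
    }

N, U = 1e6, 2e-5  # hypothetical population size and genomic deleterious rate
for s in (2e-4, 4e-6, 1e-6):
    print(s, regime(U, s), deterministic_balance(N, U, s))
```

In this toy calculation the expected size of the unloaded class falls from about 9 × 10⁵ individuals at U/s = 0.1 to fewer than 7000 at U/s = 5 and to less than one individual at U/s = 20, the point at which the deterministic picture breaks down.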

Figure 2.
Polymorphism and tree-shape statistics for forward (black lines) and dynamically restructured coalescent simulations (red lines) of model I. Selection coefficients vary from weak (including s = 0) to strong along the horizontal axis. Other parameters ...

The values of these statistics as observed in the forward and backward (coalescent) simulations are extremely similar over the full range of s (Figure 2); even their standard deviations (over samples) are nearly identical. Thus the dynamically restructured coalescent algorithm is quantitatively equivalent to an exact forward simulation, for all practical purposes.

A well-defined point of maximum genealogical distortion at intermediate s does not form unless Ns ≫ 1 (i.e., the effectiveness of selection is high) at that point. Thus at smaller population sizes, the point of maximum distortion may occur at larger values of s (and hence smaller values of the ratio U/s). Maximum reductions of the mean pairwise difference and tree length appear clearly in the studies of Gordo et al. (2002; Figures 1 and 3) and Williamson and Orive (2002; Table 4, Figures 3 and 4), but at values of U/s between 4 and 0.5, corresponding to values of Ns between 50 and 10, owing to the small population sizes modeled in those studies. This effect can also be seen in Figure 5 in this article, where the minimum neutral polymorphism occurs at a larger s for N = 10⁶ (Ns ≈ 20) than for N = 10⁸ (Ns ≈ 500).

Tajima's D becomes strikingly less variable in the zone of intermediate s (and maximum distortion) than it is at the strong-selection and weak-selection ends of the range. The standardized tree imbalance (−IS) is more variable than DT at all values of s, and it becomes more variable in the intermediate-selection zone, as does the apparent growth rate r.
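For reference, Tajima's D can be computed from three sample summaries, as in the sketch below. It follows the standard formulas of Tajima (1989). The function name and argument names are illustrative, and the example values are those of the worked example in that paper.

```python
import math

def tajimas_d(n, S, pi):
    """Tajima's D from sample size n, segregating sites S, and mean pairwise differences pi."""
    if n < 2 or S == 0:
        return float("nan")  # D is undefined without variation
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its estimated SD.
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))

print(round(tajimas_d(10, 16, 3.888889), 3))
```

Negative values arise when the pairwise-difference estimator falls below the segregating-sites estimator, as happens when a genealogy has an excess of short internal branches relative to terminal ones.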

All of the statistics shown in Figure 2 were estimated from neutral sequences linked to the selected sites, not from the genealogies themselves, which are known without error only in the coalescent simulations. The coalescent results have been “dumbed down” to make them comparable to the forward simulations, for which the tree-based statistics −IS and r must be calculated from trees estimated with error from sequences. Even so, in the coalescent simulations there is good agreement between the statistics as calculated from true and inferred genealogies.

Skew and growth caused by expansion of less-loaded clades:

A typical sample genealogy for apparently realistic N and μ, and 1000 selected sites at an intermediate value of s (U/s = 7.4), is shown in Figure 3, with its complete mutational history at selected sites marked by ticks on the branches. The root and several of the deeper branches have slightly fewer deleterious mutations than the best of the sequences in the contemporary sample, and an average contemporary sequence is worse than the oldest nodes in the tree by several mutational steps. The density of mutations is lower deep in the tree than near the present, and a larger fraction of the older mutations are back mutations that presumably helped to make their lineages among the best then present in the population. The least-loaded contemporary chromosomes have what appear to be improbably mutation-free histories (augmented in a few cases by back mutations), and they tend to have relatively large numbers of close relatives that themselves have lower than average numbers of deleterious mutations. Conversely, the most heavily loaded sequences and clades tend to be relatively isolated at the ends of longer branches, indicating that they have been declining for some time. These patterns all appear to be consistent with previous work on the ages of mutations (see Slatkin and Rannala 2000; Slatkin 2008).

The relative growth rates of two sister clades (at any level in the genealogy) will be influenced by selection (that is, by a difference in mean fitness, caused by a difference in mean mutation number) as well as by drift. Thus the expansion of less-loaded clades and the contraction of more-loaded clades will tend to increase the topological skew (Higgs and Woodcock 1995; Maia et al. 2004). This process also gives rise to the overall illusion of growth, because lineages that have been expanding for some time are usually larger than those that have been contracting. Although extant lineages descend from ancestors that were among the least loaded of their generations, most will end up (in the future) more heavily loaded than average and doomed to displacement by lineages that appear to be almost miraculously mutation free, as they emerge from the endless flow of “allelic traffic” (Comeron and Kreitman 2002).

Ne not constant:

Going back in time, the ancestry of the current generation becomes progressively more restricted toward the high ends of the fitness distributions of earlier epochs (Figure 3). In the weak-selection regime, the apparent growth rate r is similar in value to the single-site selection coefficient s (Figure 2), but even here the apparent growth rate is not constant. This process can easily be modeled by projecting the distributions of parental mutation numbers backward in time, using the mutational probabilities conditioned on the distributions (Equations 3–5, above; see File S1 for additional details). This gives a sequence of probability density functions for the fitnesses of ancestors at any given time in the past (O'Fallon et al., in press). The apparent population size of a given generation t, with respect to the present, can then be estimated as equation M7, where equation M8 is the variance of expected ancestry among the adults of that generation. In the generation immediately prior to the present, equation M9 (the Poisson variance when all parents have the same chance to be ancestors), but equation M10 increases with t as the distribution of ancestry becomes increasingly concentrated in the less-loaded (more fit) tails of the mutation-number distributions and the expected number of descendants at t = 0 for most members of the population declines toward zero. As a consequence, the apparent population size (as seen from the present) becomes smaller at times farther in the past (Figure 4).
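The link between concentrated ancestry and a smaller apparent population size can be sketched numerically. The example below is an illustration under stated assumptions, not the authors' calculation: it uses the familiar haploid relation Ne = N/σ², treats the realized number of present-day descendants as Poisson around each adult's expected ancestry (so that σ² = 1 + variance of expected ancestry), and takes hypothetical class frequencies and ancestry weights as input.

```python
def apparent_size(N, freqs, weights):
    """Apparent population size N / sigma^2 for one past generation.

    freqs   : frequency of each mutation-number class (sums to 1)
    weights : expected ancestry per individual in each class (any scale)
    Assumption: descendants are Poisson around the expected ancestry, so
    sigma^2 = 1 + Var(expected ancestry), with ancestry rescaled to mean 1.
    """
    mean_w = sum(f * w for f, w in zip(freqs, weights))
    relative = [w / mean_w for w in weights]
    var_expected = sum(f * (r - 1.0) ** 2 for f, r in zip(freqs, relative))
    return N / (1.0 + var_expected)

N = 1000
# All adults equally likely to be ancestors: sigma^2 = 1, apparent size = N.
print(apparent_size(N, [0.5, 0.5], [1.0, 1.0]))
# Ancestry confined to the least-loaded 10% of adults: apparent size near N/10.
print(apparent_size(N, [0.1, 0.9], [10.0, 0.0]))
```

The second case shows the qualitative behavior described above: when only a small, lightly loaded fraction of a past generation contributes to the present, the apparent size of that generation shrinks roughly in proportion to that fraction.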

In the strong-selection regime, the contraction proceeds quickly and then stops at a fixed Ne < N, as expected under classic background selection (Charlesworth et al. 1993, 1995). At smaller values of s (in the intermediate- and weak-selection regimes), there is a shift to slower rates of contraction that continue for many more generations before they finally level off, again at values of Ne that can be vastly smaller than N. In all regimes the rate of apparent contraction (going backward) or of growth (going forward) first increases and then declines (Figure S3). This happens relatively abruptly for strong selection, but gradually for weak selection, in which case the apparent population growth is roughly exponential (Figures 1, 3, and 4).

A progressive reduction of Ne explains much but not all of the phenomenology. We simulated neutral coalescents for populations that grow (literally) according to the models of Ne(t) depicted in Figure 4. Genealogies are shortened (having reduced mean values of πn) and squished (with negative DT) in ways quantitatively similar to those seen in the corresponding models of deleterious mutation (Figure 2). But as expected, there is no consistent pattern of topological skewing (−IS ≈ 0). When θ is estimated from the genealogies, jointly with the apparent exponential growth rate r, the mean values of both estimates are similar to those from the corresponding models of selection, but the growth rates tend to be biased upward especially in the intermediate-selection regime (results not shown) (see Kuhner et al. 1998).

Levels of neutral variation nearly insensitive to N:

Most of the genealogical distortions depend only weakly on population size in the zone of intermediate s (where Us/s ∼ 5). This is the regime in which Muller's ratchet would begin to turn, in the absence of back mutations (Gordo et al. 2002). Figure 5 shows nucleotide diversity at neutral sites (πn) as a function of s, for four different population sizes spanning two orders of magnitude (N = 10⁶–10⁸, θ = 0.04–4). Minimum values of πn at these four values of θ occur over a narrow range of values of s (Us/s ∼ 1–5). All of these minimum values of πn are much smaller than θ. For the largest population size that we considered (N = 10⁸, θ = 4), πn < 1% of θ and absolutely less than the observed synonymous-site π for C. ovalis.

Models with different population sizes and mutation rates but the same values of θ (2Nμ) and Ls (the number of selected nucleotide positions) behave similarly, if the strength of selection is expressed in terms of the compound parameters Ns and U/s. Measures of neutral variation and tree shape take on nearly identical values, and the depression of mean fitness (mutational load) and apparent population growth rate are also very similar if expressed in units inversely proportional to N (Figure S4).
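The scaling argument can be checked with a few lines of arithmetic. The sketch below assumes that the genomic rate of mutation at selected sites is U = Ls × μ (per-site rate times number of selected sites) and uses θ = 2Nμ as defined above. The value of μ is the one implied by N = 10⁸ and θ = 4, and the value of s is hypothetical.

```python
def compound_parameters(N, mu, Ls, s):
    """Return theta, Ns, and U/s for one parameter set (U = Ls * mu is assumed)."""
    U = Ls * mu
    return {"theta": 2 * N * mu, "Ns": N * s, "U_over_s": U / s}

# Model A: N = 1e8 and theta = 4 imply mu = 2e-8; s is a hypothetical value.
model_a = compound_parameters(N=1e8, mu=2e-8, Ls=1000, s=4e-6)
# Model B: tenfold larger N with tenfold smaller mu and s.
model_b = compound_parameters(N=1e9, mu=2e-9, Ls=1000, s=4e-7)

print(model_a)
print(model_b)
```

Both parameter sets give θ = 4, Ns = 400, and U/s = 5 (up to floating-point rounding), so under this scaling they would be expected to show nearly the same neutral variation and tree shapes, with load and growth rate differing only by the factor of 10 in N.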

Dynamic instability maximized at intermediate s:

The transition from the “strong-selection” to the “intermediate-selection” regime occurs fairly suddenly, as s is decreased holding other parameters constant (Figures 2, 4, 5, and 7). At this transition, the size of the class of unloaded chromosomes falls to zero, which is to say, the least-loaded class becomes very rare and hence vulnerable to loss, no matter how large the population is (Gordo and Charlesworth 2000a,b, 2001; Rouzine et al. 2003). As mentioned above, this is the regime in which Muller's ratchet would turn in the absence of back mutations and in which the size of the subpopulation containing likely (future) last common ancestors becomes very small. The dynamics in this intermediate regime are strikingly different from those in the neighboring strong-selection regime, where the principal moments of the mutation-number distribution fluctuate very little. By contrast, even in the strong edge of the intermediate regime, the moments all make large, irregular excursions on a range of different timescales, and the shapes of gene trees may also vary dramatically (Figures 6 and 7).

The process giving rise to this dynamism is illustrated in Figure 6. When the least-loaded class is rare, the “even-less-loaded” class is empty but available to be repopulated by back mutation. Such mutations are advantageous in the classical sense, even though the environment has not changed. Of course in most cases they are lost simply because they are very rare. But occasionally one of them survives rarity and increases under positive selection, shifting the mutation-number distribution to a lower mean and thereby increasing the mean fitness. On even rarer occasions, two or three such “novel” back mutations will occur in the same lineage and escape immediate loss, leading to a relatively substantial and rapid increase of mean and maximum fitness. Two such events occur in the population history depicted in Figure 6, transiently changing the first three moments of the mutation-number distribution and strongly warping the shapes of genealogies that pass through these “mini-pseudo-adaptive sweeps.” The environment did not change, but the genealogies seem to suggest that it did.

These events tend to be most dramatic when s is as large as it can be while still being small enough that such events happen at all (because all chromosomes carry at least a few deleterious mutations, and “novel improvements” are therefore possible). The transition from the strong-selection-low-load regime to the intermediate-selection-high-load regime is a transition from quiet to noisy dynamics because at the strong/intermediate boundary, these “improvements” are rare but relatively dramatic. Figure 7 shows histories of the mean mutation number (equivalent to the mean fitness) for runs in which s takes values spanning the full range of relevant values. Each mutation-number history is plotted as the current deviation from the long-term mean, in units of the mean instantaneous standard deviation of mutation numbers. As s decreases from values in the strong range, the temporal variance of the standardized mean mutation number jumps from a very small value to something close to its maximum value, over just one halving of s.

Load and distortions caused mainly by sites with intermediate s:

The assumption that all selected sites have the same value of s is highly artificial. Real chromosomes (including mitochondrial chromosomes) must have sites with mutational effect sizes that vary more or less continuously over a very broad range. What happens when sites with different effect sizes exist together on the same asexual chromosome? To address this question we carried out forward simulations under model II, where each chromosome has a large number of sites with values of s that vary over several orders of magnitude from neutral and very weak through strong.

As expected, the favored nucleotide (arbitrarily defined as G) remained nearly fixed at strongly selected sites, regardless of population size, and the four nucleotides were equally frequent at neutral and very weakly selected sites (Figure 8). Nucleotide diversity at neutral and very weakly selected sites did vary with N, but only by a factor of slightly more than 2, for values of N that differed by a factor of 8. The genetic load was caused mainly by variation at sites with three intermediate values of s spanning just a factor of 4 (2⁻¹³ = 0.00012 to 2⁻¹¹ = 0.00049, Us/s = 6.3–1.6), almost independent of N. The distortions of tree shape indicated by DT and −IS, and the signals of apparently exponential population growth (r), were remarkably similar at all four population sizes (Figure 9).
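The arithmetic of the doubling series can be laid out explicitly. The sketch below lists the 11 block effect sizes of model II (2⁻¹⁷ through 2⁻⁷) and the corresponding Us/s values. The per-block mutation rate is not stated here, so it is back-calculated from the quoted Us/s = 6.3 at s = 2⁻¹³ and assumed to be the same for every block. The variable names are illustrative.

```python
# Effect sizes for the 11 selected blocks: a doubling series from 2^-17 to 2^-7.
exponents = list(range(-17, -6))
s_values = {e: 2.0 ** e for e in exponents}

# Assumption: per-block mutation rate inferred from Us/s = 6.3 at s = 2^-13,
# and taken to be identical across blocks (equal numbers of sites per block).
U_block = 6.3 * 2.0 ** -13
ratios = {e: U_block / s for e, s in s_values.items()}

for e in exponents:
    print(f"s = 2^{e} = {s_values[e]:.5f}   Us/s = {ratios[e]:.2f}")
```

Under these assumptions the three blocks that carry most of the load (exponents −13, −12, and −11) have Us/s of about 6.3, 3.2, and 1.6, which brackets the intermediate regime discussed above.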

Figure 8.
Statistics for chromosomal blocks with different values of s under model II. The chromosome has 11 blocks of 512 sites subject to mutations with effect sizes of s = 2⁻¹⁷ to 2⁻⁷ in a doubling series, plus a large number of ...
Figure 9.
Estimates of θ and tree shape for model II. Panels on the left show control runs where s = 0 in all blocks; panels on the right show the experimental runs of Figure 8. Curves in the top panels show mean values of three estimators of θ ...

These patterns indicate that sites with intermediate values of s will tend to dominate the dynamics, even where the genome contains equal or larger numbers of sites with stronger and weaker mutational effects. Selection coefficients at the sites contributing most to the load are similar to the apparent exponential growth rate of r ≈ 0.00035. θ estimated jointly with the apparent growth rate underestimates the true θ by 20–25%, as expected if Ne is reduced by background selection at the strongly selected sites (Charlesworth et al. 1993, 1995).

Cyamid nuclear polymorphism consistent with continuously large N:

If the reduced nucleotide diversity and strong reciprocal monophyly of cyamid mitochondrial genomes were caused by population bottlenecks that occurred after the separation of the northern and southern hemisphere sibling species, then nuclear genes should also show evidence of such events (e.g., Galtier et al. 2000). To look for such evidence we sequenced the nuclear EF-1α gene in samples of the three sibling species pairs. All three pairs show shared polymorphisms (at 7–12 sites, File S2 and Tables S1–S6), but they also show fixed or nearly fixed differences (at 1–7 sites) and modest values of FST (0.12–0.54, Table 1).

Such a pattern would be expected if the northern and southern hemisphere populations had been continuously large for the last 5 MY or so, with little or no gene flow during that time. The effective generation times of cyamids are not known but seem likely to be a few months. If there were four generations per year, then there would have been ∼2 × 10⁷ generations since separation. During this time, sibling species with effective population sizes of Ne ∼ 2.5 × 10⁷ would be expected to have accumulated only modest average levels of differentiation (FST ∼ 0.3) at neutral sites not strongly affected by directional selection at nearby loci, consistent with the retention of ancestral polymorphisms at many sites. However, the lack of ongoing gene flow would allow fixations to occur at other sites. No strong conclusions about the actual effective population sizes or levels of gene flow can be drawn from this small data set for just one locus, but similar patterns are seen in all three sibling species pairs, so these data appear to be inconsistent with models in which population bottlenecks play a major role.
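The numbers in this argument can be checked with a short calculation. The sketch assumes a common drift-only approximation for completely isolated diploid populations, FST ≈ 1 − exp(−t/(2Ne)), which ignores mutation and gene flow. Whether this is the expression the authors used is not stated, so it is offered only as a plausibility check. The function name is illustrative.

```python
import math

def expected_fst(generations, Ne):
    """Drift-only approximation for isolated diploid populations (an assumption)."""
    return 1.0 - math.exp(-generations / (2.0 * Ne))

years_separated = 5e6        # about 5 MY since separation
generations_per_year = 4     # assumed in the text
generations = years_separated * generations_per_year

fst = expected_fst(generations, Ne=2.5e7)
print(generations, round(fst, 2))
```

With 2 × 10⁷ generations and Ne = 2.5 × 10⁷, the exponent is −0.4 and the approximation gives FST of about 0.33, in line with the value of roughly 0.3 quoted above.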


Hitchhiking (Maynard Smith and Haigh 1974) has long been understood to strongly affect levels and configurations of linked neutral nucleotide diversity. Most theoretical and empirical effort to date has been aimed at understanding the effects of classic adaptive sweeps triggered by environmental change (Berry et al. 1991; Begun and Aquadro 1992; Gillespie 2000, 2001; Sabeti et al. 2006; Voight et al. 2006; Wang et al. 2006; Andolfatto 2007). Unconditionally deleterious mutations have received less attention in this context, but several studies have noted that deleterious mutations can reduce levels of neutral variation and distort the shapes and sizes of genealogies, especially in the absence of recombination (e.g., Kaplan et al. 1988; Hudson and Kaplan 1994, 1995; Charlesworth et al. 1995; Golding 1997; Nachman 1998; Rand and Kann 1998; Przeworski et al. 1999; McVean and Charlesworth 2000; Tachida 2000; Comeron and Kreitman 2002; Gordo et al. 2002; Williamson and Orive 2002; Barton and Etheridge 2004; Hughes 2005; Reed et al. 2005; Comeron et al. 2008). The present study shows that these effects can be dramatic for nonrecombining genomes with high mutation rates, many sites subject to mutations with suitably small values of s (Us/s ∼ 5), and large population sizes (Ns ≫ 1). Distortions of genealogical size and shape similar to those seen in the mitochondria of right-whale cyamids can be produced by ∼1000 sites with s of order 3 × 10⁻⁶, given realistic values of Ne and μ (Figure 3).

However, the fact that unconditionally deleterious mutations and back mutations seem able to explain a set of observations does not imply that they are the sole or even the main cause of those observations. Right-whale cyamids may have unusually constant environments, and their mitochondria may be relatively well insulated from any ecological changes the cyamids experience on relevant timescales. But the world of the cyamids' mitochondrially encoded proteins and structural RNAs cannot simply be assumed to be nearly unchanging. Instead, this assumption must be tested. The models of greatest interest are therefore ones in which the genome includes environmentally sensitive sites (where the fittest nucleotide state occasionally changes) in addition to neutral and environmentally insensitive sites (like those modeled here, where one nucleotide state is always fittest) (e.g., Hahn 2008). We recently added a block of such sites to the coalescent algorithm so that it can be used to study how environmentally triggered adaptive evolution interacts with strictly purifying selection of the kind modeled here.

Although the form of selection modeled here is strictly “purifying,” the genealogical distortions that appear at intermediate values of s are caused by “directional” selective sweeps. As in classical sweeps, relatively fit alleles that are initially rare replace common, less fit alleles, causing a variety of effects including a dramatic shortening of the average pairwise coalescence time (e.g., Gillespie 2000, 2001). Here, however, the fitness differences among “alleles” arise from many mutations of individually small effect (mostly deleterious), rather than from one or a few mutations, each of relatively large effect (and advantageous). In the tree shown in Figure 3, representing a sample of 100 genomes at potentially realistic parameter values, there are 616 forward and 37 back mutations. The contemporary genomes carry loads of 44–60 sites in deleterious nucleotide states, from a total of 1000 selected sites. Ancestral fitnesses are generally higher, and relatively homogeneous, but the histories of sequential mutation along different lines of descent give rise to substantial heterogeneity within and between clades at all levels in the tree. Compared to a classic adaptive sweep, ones like this could be described as “diffuse” or even “futile,” in that the population's mean fitness is no higher (in typical instances) at the end of the sweep than at the beginning (reminiscent of the Red Queen process in a continually deteriorating environment). Given sufficiently large and well resolved genealogies (which could be generated for right-whale cyamid mitochondrial genomes), it might be possible to fit models that estimate the rates and effect sizes of mutations affecting the relative growth rates of lineages within the tree. 
An inferred mutational history like that of Figure 3 would then suggest a predominant role for “purifying sweeps” as opposed to “directional adaptive sweeps,” owing to the many mutations of small effect scattered throughout the tree that increase the variance of fitness especially toward the tips.

One implication is that a significant fraction of the total nucleotide variation could be weakly selected, especially in nonrecombining genomes with high mutation rates. This idea has been discussed as a potential explanation for the apparent time dependency of substitution rates (Ho et al. 2005, 2008) and the higher density of nonsynonymous substitutions near the tips of intraspecific mitochondrial genealogies (Kivisild et al. 2006). Site-frequency spectra and other statistics of weakly selected polymorphisms can differ substantially from those of neutral polymorphisms, potentially allowing their relative abundances and other parameters to be estimated even in nuclear genomes, given samples of sufficient size and quality (Messer 2009).

Previous studies have only rarely called attention to the dynamical liveliness associated with intermediate values of s (Comeron and Kreitman 2002; Comeron et al. 2008). The unsteadiness of the mutation-number distribution (Figures 6 and 7) appears to contribute to the pattern of genealogical distortion in at least two ways. First, it gives rise to the Hill–Robertson interference that shrinks the effective population size going back in time, thereby weakening the effect of selection and allowing mean mutation numbers to rise to levels far higher than would occur with recombination (Keightley and Otto 2006) or infinite N. The effects can be illustrated by repeating the coalescent simulations of Figure 2 using deterministic equilibrium mutation-number distributions for the same values of U and s. Such an experiment is shown (in the format of Figure 2) in Figure S5. In the strong-selection regime, and at the strong end of the intermediate regime, all of the statistics are indistinguishable from those under full stochastic dynamics. But mean fitness is not depressed in the intermediate- and weak-selection regimes. The other statistics (describing aspects of neutral polymorphism and tree shape) vary with s in patterns qualitatively similar to those for the full dynamics, but they differ quantitatively in many respects, especially at smaller values of s.

Second, generation-to-generation fluctuations of the mutation-number distribution also seem to matter, independent of the distribution's overall average shape, although these effects are relatively subtle. This can be demonstrated by repeating the simulations of Figure 2 with static idealized empirical distributions generated by averaging a modest number of typical distributions from the wave file for a given set of parameters. Such an experiment is also shown in Figure S5. To idealize the waves for different values of s, we calculated the first four moments of all the distributions in a history, then selected just a few hundred with moments most similar to the joint mean, and averaged them. The reduction of mean fitness in the intermediate-selection regime is then identical to that for the full dynamics, as expected, and the other statistics are generally more similar as well, compared to those obtained with deterministic equilibrium mutation-number distributions. But the means of the polymorphism and tree-shape statistics still differ noticeably from those obtained by forward simulation or the dynamically restructured coalescent algorithm, and not surprisingly they vary less (among instances) in the intermediate- and weak-selection regimes. Extreme variations such as the mini-pseudo-adaptive sweeps illustrated in Figure 6 can never occur under a static mutation-number distribution.

Despite these shortcomings, models based on static mutation-number distributions potentially offer many advantages, including that the coalescent simulation algorithm can be made simpler and faster than the one described here, and inference methods based on likelihood calculations become feasible. The challenge is to develop an adequate approximate model of the steady-state mutation-number (fitness) distribution. Such a model would need to work well in the intermediate-selection regime, which is likely to be of biological importance in systems with large population sizes, high mutation rates, and little recombination (e.g., animal mitochondria and many microbes and viruses), because in the absence of external environmental influences, sites belonging to this regime are expected to dominate the dynamics.

The depression of mean fitness and the genealogical distortions that occur in the intermediate-s regime are consequences of Hill–Robertson interference between linked beneficial and deleterious allelic states. The linkage disequilibrium giving rise to interference is relieved by even low levels of recombination (Felsenstein 1974; Keightley and Otto 2006; Gordo and Campos 2008). This suggests that the effects of symmetrical weak interference (as modeled here) might not be easy to detect in recombining nuclear genomes at distances greater than those of individual genes (Comeron and Guthrie 2005; Loewe and Charlesworth 2007) and that it might be especially difficult to distinguish such effects from those of asymmetrical interference (where conventional adaptive sweeps increase the frequencies of linked weakly deleterious alleles) (McVean and Charlesworth 2000; Comeron et al. 2008). In addition, animal nuclear genomes usually have mutation rates that are much lower than those of mitochondria, so intermediate mutational effect sizes will be smaller in the nucleus than in the mitochondrion, other things being equal.

In summary, the models described here show that interference among weakly selected, unconditionally deleterious mutations and back mutations can severely distort the genealogies of nonrecombining chromosomes with parameters plausibly like those of mitochondrial genomes in many animal species. One prediction is that levels of neutral polymorphism in mitochondrial genomes might depend only weakly on N, as seems to be the case (Bazin et al. 2006). We do not know to what extent these models explain the “paradox of variation” (Lewontin 1974) for mitochondria, but they seem likely to provide some part of the explanation because the necessary assumptions are all plausible. The mitochondrion is an ancient organelle serving a specific, highly conserved function. Its genome is compact, with a high mutation rate, and most of its amino acid polymorphisms appear to be young and mildly deleterious (Nachman 1998; Rand and Kann 1998; Weinreich and Rand 2000; Kivisild et al. 2006; Popadin et al. 2007). The question is, How deleterious are these segregating amino acid mutations, and how many additional synonymous and noncoding sites affect mitochondrial gene expression or other processes and also carry mutations of small effect? If the answer is many, and if nuclear genes are qualitatively similar in this respect, then very mildly deleterious mutations might indeed be, collectively, major contributors to fitness variation and to the average genetic loads of most species (Crow 1970, 1993, 1999; Kondrashov 1995; Eyre-Walker et al. 2002, 2006; Chamary et al. 2006; Loewe 2006; Ellegren 2009).


We thank Reed Cartwright, Joe Felsenstein, Mary Kuhner, Brendan O'Fallon, Gary Olsen, Alan Rogers, Lucian Smith, Jeff Thorne, Marcy Uyenoyama, Jon Yamoto, and two anonymous reviewers for helpful advice, encouragement, and comments on the manuscript. This work was supported in part by awards from the James S. McDonnell Foundation to F.R.A. and from the John D. and Catherine T. MacArthur Foundation to J.S.


Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession nos. GU172208–GU172368.

Supporting information is available online at http://www.genetics.org/cgi/content/full/genetics.109.103556/DC1.


  • Andolfatto, P., 2007. Hitchhiking effects of recurrent beneficial amino acid substitutions in the Drosophila melanogaster genome. Genome Res. 17 1755–1762.
  • Ballard, J. W. O., and M. C. Whitlock, 2004. The incomplete natural history of mitochondria. Mol. Ecol. 13 729–744.
  • Bamshad, M., and S. P. Wooding, 2003. Signatures of natural selection in the human genome. Nat. Rev. Genet. 4 99–111.
  • Barton, N. H., and A. M. Etheridge, 2004. The effect of selection on genealogies. Genetics 166 1115–1131.
  • Bazin, E., S. Glémin and N. Galtier, 2006. Population size does not influence mitochondrial genetic diversity in animals. Science 312 570–572.
  • Begun, D. J., and C. F. Aquadro, 1992. Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356 519–520.
  • Berry, A. J., J. W. Ajioka and M. Kreitman, 1991. Lack of polymorphism on the Drosophila fourth chromosome resulting from selection. Genetics 129 1111–1117.
  • Chamary, J. V., J. L. Parmley and L. D. Hurst, 2006. Hearing silence: non-neutral evolution at synonymous sites in mammals. Nat. Rev. Genet. 7 98–108.
  • Charlesworth, B., M. T. Morgan and D. Charlesworth, 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134 1289–1303.
  • Charlesworth, D., B. Charlesworth and M. T. Morgan, 1995. The pattern of neutral molecular variation under the background selection model. Genetics 141 1619–1632.
  • Colless, D. H., 1982. Review of “Phylogenetics: the theory and practice of phylogenetic systematics.” Syst. Zool. 31 100–104.
  • Comeron, J. M., and T. B. Guthrie, 2005. Intragenic Hill-Robertson interference influences selection intensity on synonymous mutations in Drosophila. Mol. Biol. Evol. 22 2519–2530.
  • Comeron, J. M., and M. Kreitman, 2002. Population, evolutionary and genomic consequences of interference selection. Genetics 161 389–410.
  • Comeron, J. M., A. Williford and R. M. Kliman, 2008. The Hill–Robertson effect: evolutionary consequences of weak selection and linkage in finite populations. Heredity 100 19–31.
  • Crow, J. F., 1970. Genetic loads and the cost of natural selection, pp. 128–177 in Mathematical Topics in Population Genetics, edited by K.-I. Kojima. Springer-Verlag, New York.
  • Crow, J. F., 1993. Mutation, fitness, and genetic load. Oxf. Surv. Evol. Biol. 9 3–42.
  • Crow, J. F., 1999. The odds of losing at genetic roulette. Nature 397 293–294.
  • Drummond, A. J., and A. Rambaut, 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7 214.
  • Ellegren, H., 2009. A selection model of molecular evolution incorporating the effective population size. Evolution 63 301–306.
  • Eyre-Walker, A., and P. D. Keightley, 2007. The distribution of fitness effects of new mutations. Nat. Rev. Genet. 8 610–618.
  • Eyre-Walker, A., P. D. Keightley, N. G. C. Smith and D. Gaffney, 2002. Quantifying the slightly deleterious mutation model of molecular evolution. Mol. Biol. Evol. 19 2142–2149.
  • Eyre-Walker, A., M. Woolfit and T. Phelps, 2006. The distribution of fitness effects of new deleterious amino acid mutations in humans. Genetics 173 891–900.
  • Felsenstein, J., 1974. The evolutionary advantage of recombination. Genetics 78 737–756.
  • Gaines, C. A., M. P. Hare, S. E. Beck and H. C. Rosenbaum, 2005. Nuclear markers confirm taxonomic status and relationships among highly endangered and closely related right whales. Proc. R. Soc. Lond. Ser. B 272 533–542.
  • Galtier, N., F. Depaulis and N. H. Barton, 2000. Detecting bottlenecks and selective sweeps from DNA sequence polymorphism. Genetics 155 981–987.
  • Gillespie, J. H., 1999. The role of population size in molecular evolution. Theor. Popul. Biol. 55 145–156.
  • Gillespie, J. H., 2000. Genetic drift in an infinite population: the pseudohitchhiking model. Genetics 155 909–919.
  • Gillespie, J. H., 2001. Is the population size of a species relevant to its evolution? Evolution 55 2161–2169.
  • Golding, G. B., 1997. The effect of purifying selection on genealogies, pp. 271–285 in Progress In Population Genetics and Human Evolution (IMA Volumes In Mathematics and Its Applications, Vol. 87), edited by P. Donnelly and S. Tavaré. Springer-Verlag, New York.
  • Gordo, I., and P. R. A. Campos, 2008. Sex and deleterious mutations. Genetics 179 621–626.
  • Gordo, I., and B. Charlesworth, 2000a. The degeneration of asexual haploid populations and the speed of Muller's ratchet. Genetics 154 1379–1387.
  • Gordo, I., and B. Charlesworth, 2000b. On the speed of Muller's ratchet. Genetics 156 2137–2140.
  • Gordo, I., and B. Charlesworth, 2001. The speed of Muller's ratchet, with background selection and the degeneration of Y chromosomes. Genet. Res. 78 149–162.
  • Gordo, I., A. Navarro and B. Charlesworth, 2002. Muller's ratchet and the pattern of variation at a neutral locus. Genetics 161 835–848.
  • Hahn, M. W., 2008. Toward a selection theory of molecular evolution. Evolution 62 255–265.
  • Hamilton, W. D., and R. M. May, 1977. Dispersal in stable habitats. Nature 269 578–581.
  • Heard, S. B., 1992. Patterns in tree balance among cladistic, phenetic and randomly generated phylogenetic trees. Evolution 46 1818–1826.
  • Higgs, P., and G. Woodcock, 1995. The accumulation of mutations in asexual populations and the structure of genealogical trees in the presence of selection. J. Math. Biol. 33 677–702.
  • Hill, W. G., and A. Robertson, 1966. The effect of linkage on the limits to artificial selection. Genet. Res. 8 269–294.
  • Ho, S. Y. W., M. J. Phillips, A. Cooper and A. J. Drummond, 2005. Time dependency of molecular rate estimates and systematic overestimation of recent divergence times. Mol. Biol. Evol. 22 1561–1568.
  • Ho, S. Y. W., U. Saarma, R. Barnett, J. Haile and B. Shapiro, 2008. The effect of inappropriate calibration: three case studies in molecular ecology. PLoS ONE 3 e1615.
  • Hudson, R. R., and N. L. Kaplan, 1994. Gene trees with background selection, pp. 140–153 in Non-Neutral Evolution: Theories and Molecular Data, edited by B. Golding. Chapman & Hall, New York.
  • Hudson, R. R., and N. L. Kaplan, 1995. Deleterious background selection with recombination. Genetics 141 1605–1617.
  • Hudson, R. R., M. Slatkin and W. P. Maddison, 1992. Estimation of levels of gene flow from DNA sequence data. Genetics 132 583–589.
  • Hughes, A. L., 2005. Evidence for abundant slightly deleterious polymorphisms in bacterial populations. Genetics 169 533–538.
  • Kaliszewska, Z. A., J. Seger, V. J. Rowntree, S. G. Barco, R. Benegas et al., 2005. Population histories of right whales (Cetacea: Eubalaena) inferred from mitochondrial sequence diversities and divergences of their whale lice (Amphipoda: Cyamus). Mol. Ecol. 14 3439–3456.
  • Kaplan, N. L., T. Darden and C. H. Langley, 1988. The coalescent process in models with selection. Genetics 120 819–829.
  • Kaplan, N. L., R. R. Hudson and C. H. Langley, 1989. The “hitchhiking” effect revisited. Genetics 123 887–899.
  • Keightley, P. D., and S. P. Otto, 2006. Interference among deleterious mutations favours sex and recombination in finite populations. Nature 443 89–92.
  • Kivisild, T., P. Shen, D. P. Wall, B. Do, R. Sung et al., 2006. The role of selection in the evolution of human mitochondrial genomes. Genetics 172 373–387.
  • Kondrashov, A. S., 1995. Contamination of the genome by very slightly deleterious mutations: Why have we not died 100 times over? J. Theor. Biol. 175 583–594.
  • Kuhner, M. K., J. Yamato and J. Felsenstein, 1998. Maximum likelihood estimation of population growth rates based on the coalescent. Genetics 149 429–434.
  • Kuhner, M., J. Yamato, P. Beerli, L. Smith, E. Rynes et al., 2004. Lamarc v 1.2.1. http://evolution.gs.washington.edu/lamarc.html.
  • Lambert, F., B. Delmonte, J. R. Petit, M. Bigler, P. R. Kaufmann et al., 2008. Dust-climate couplings over the past 800,000 years from the EPICA Dome C ice core. Nature 452 616–619.
  • Lewontin, R. C., 1974. The Genetic Basis of Evolutionary Change. Columbia University Press, New York.
  • Loewe, L., 2006. Quantifying the genomic decay paradox due to Muller's ratchet in human mitochondrial DNA. Genet. Res. 87 133–159.
  • Loewe, L., and B. Charlesworth, 2007. Background selection in single genes may explain patterns of codon bias. Genetics 175 1381–1393.
  • Maia, L. P., A. Colato and J. F. Fontanari, 2004. Effect of selection on the topology of genealogical trees. J. Theor. Biol. 226 315–320.
  • Maynard Smith, J., and J. Haigh, 1974. The hitch-hiking effect of a favorable gene. Genet. Res. 23 23–35.
  • McVean, G. A. T., and B. Charlesworth, 2000. The effects of Hill-Robertson interference between weakly selected mutations on patterns of molecular evolution and variation. Genetics 155 929–944.
  • Messer, P. W., 2009. Measuring the rates of spontaneous mutation from deep and large-scale polymorphism data. Genetics 182 1219–1232.
  • Muller, H. J., 1964. The relation of recombination to mutational advance. Mutat. Res. 1 2–9.
  • Nabholz, B., J.-F. Mauffrey, E. Bazin, N. Galtier and S. Glémin, 2008. Determination of mitochondrial genetic diversity in mammals. Genetics 178 351–361.
  • Nachman, M. W., 1998. Deleterious mutations in animal mitochondrial DNA. Genetica 102/103 61–69.
  • Nei, M., and D. Grauer, 1984. Extent of protein polymorphism and the neutral mutation theory. Evol. Biol. 17 73–118.
  • Ochman, H., A. S. Gerber and D. L. Hartl, 1988. Genetic applications of an inverse polymerase chain reaction. Genetics 120 621–623.
  • Ochman, H., M. M. Medhora, D. Garza and D. L. Hartl, 1990. Amplification of flanking sequences by inverse PCR, pp. 219–227 in PCR Protocols: A Guide to Methods and Applications, edited by M. A. Innis, D. H. Gelfand, J. J. Sninsky and T. J. White. Academic Press, New York.
  • O'Fallon, B. D., J. Seger and F. R. Adler, 2010. A continuous-state coalescent and the impact of weak selection on the structure of gene genealogies. Mol. Biol. Evol. (in press).
  • Olsen, G. J., C. R. Woese and R. Overbeek, 1994. The winds of (evolutionary) change: breathing new life into microbiology. J. Bacteriol. 176 1–6.
  • Popadin, K., L. V. Polishchuk, L. Mamirova, D. Knorre and K. Gunbin, 2007. Accumulation of slightly deleterious mutations in mitochondrial protein-coding genes of large versus small mammals. Proc. Natl. Acad. Sci. USA 104 13390–13395.
  • Przeworski, M., B. Charlesworth and J. D. Wall, 1999. Genealogies and weak purifying selection. Mol. Biol. Evol. 16 246–252.
  • Pybus, O. G., and A. Rambaut, 2002. GENIE v3.0 User Manual. Department of Zoology, University of Oxford, Oxford. http://evolve.zoo.ox.ac.uk/software/Genie/.
  • Rand, D. M., and L. M. Kann, 1998. Mutation and selection at silent and replacement sites in the evolution of animal mitochondrial DNA. Genetica 102/103 393–407.
  • Reed, F. A., J. M. Akey and C. F. Aquadro, 2005. Fitting background-selection predictions to levels of nucleotide variation and divergence along the human autosomes. Genome Res. 15 1211–1221.
  • Rogers, A. R., and H. C. Harpending, 1992. Population growth makes waves in the distribution of pairwise genetic differences. Mol. Biol. Evol. 9 552–569.
  • Rogers, A. R., and C. Huff, 2009. Linkage disequilibrium between loci with unknown phase. Genetics 182 839–844.
  • Rogers, J. S., 1994. Central moments and probability distribution of Colless's coefficient of tree imbalance. Evolution 48 2026–2036.
  • Rogers, J. S., 1996. Central moments and probability distributions of three measures of phylogenetic tree imbalance. Syst. Biol. 45 99–110.
  • Rosenbaum, H. C., R. L. Brownell, Jr., M. W. Brown, C. Schaeff, V. Portway et al., 2000. World-wide genetic differentiation of Eubalaena: questioning the number of right whale species. Mol. Ecol. 9 1793–1802.
  • Rouzine, I. M., J. Wakeley and J. M. Coffin, 2003. The solitary wave of asexual evolution. Proc. Natl. Acad. Sci. USA 100 587–592.
  • Rozas, J., J. C. Sánchez-DelBarrio, X. Messeguer and R. Rozas, 2003. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19 2496–2497.
  • Sabeti, P. C., S. F. Schaffner, B. Fry, J. Lohmueller, P. Varilly et al., 2006. Positive natural selection in the human lineage. Science 312 1614–1620.
  • Shao, K. T., and R. Sokal, 1990. Tree balance. Syst. Zool. 39 226–276.
  • Slatkin, M., 2008. A Bayesian method for jointly estimating allele age and selection intensity. Genet. Res. 90 129–137.
  • Slatkin, M., and R. R. Hudson, 1991. Pairwise comparisons of mitochondrial DNA sequences in stable and exponentially growing populations. Genetics 129 555–562.
  • Slatkin, M., and B. Rannala, 2000. Estimating allele age. Annu. Rev. Genomics Hum. Genet. 1 225–249.
  • Stephens, M., and P. Donnelly, 2003. A comparison of Bayesian methods for haplotype reconstruction from population genotype data. Am. J. Hum. Genet. 73 1162–1169.
  • Strimmer, K., and O. G. Pybus, 2001. Exploring the demographic history of DNA sequences using the generalized skyline plot. Mol. Biol. Evol. 18 2298–2305.
  • Tachida, H., 2000. Molecular evolution in a multisite nearly neutral mutation model. J. Mol. Evol. 50 69–81.
  • Tajima, F., 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123 585–595.
  • Voight, B. F., S. Kudaravalli, X. Wen and J. K. Pritchard, 2006. A map of recent positive selection in the human genome. PLoS Biol. 4 0446–0458.
  • Wang, E. T., G. Kodama, P. Baldi and R. K. Moyzis, 2006. Global landscape of recent inferred Darwinian selection for Homo sapiens. Proc. Natl. Acad. Sci. USA 103 135–140.
  • Weinreich, D. M., and D. M. Rand, 2000. Contrasting patterns of nonneutral evolution in proteins encoded in nuclear and mitochondrial genomes. Genetics 156 385–399.
  • Williamson, S., and M. E. Orive, 2002. The genealogy of a sequence subject to purifying selection at multiple sites. Mol. Biol. Evol. 19 1376–1384.
  • Woese, C. R., G. J. Olsen, M. Ibba and D. Söll, 2000. Aminoacyl-tRNA synthetases, the genetic code, and the evolutionary process. Microbiol. Mol. Biol. Rev. 64 202–236.
  • Yule, G. U., 1924. A Mathematical Theory of Evolution: Based on the Conclusions of Dr. J. C. Willis, F.R.S. Harris & Sons, London.

Articles from Genetics are provided here courtesy of Genetics Society of America