
# The distribution of fitness effects caused by single-nucleotide substitutions in an RNA virus

^{*}Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València, P.O. Box 22085, 46071 Valencia, Spain; and

^{‡}Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas, Universidad Politécnica de Valencia, 46022 Valencia, Spain

^{†}To whom correspondence should be addressed. E-mail: rafael.sanjuan@uv.es.

## Abstract

Little is known about the mutational fitness effects associated with single-nucleotide substitutions on RNA viral genomes. Here, we used site-directed mutagenesis to create 91 single mutant clones of vesicular stomatitis virus derived from a common ancestral cDNA and performed competition experiments to measure the relative fitness of each mutant. The distribution of nonlethal deleterious effects was highly skewed and had a long, flat tail. As expected, fitness effects depended on whether mutations were chosen at random or reproduced previously described ones. The effect of random deleterious mutations was well described by a log-normal distribution, with -19% reduction of average fitness; the effects distribution of preobserved deleterious mutations was better explained by a β model. The fit of both models was improved when combined with a uniform distribution. Up to 40% of random mutations were lethal. The proportion of beneficial mutations was unexpectedly high. Beneficial effects followed a γ distribution, with expected fitness increases of 1% for random mutations and 5% for preobserved mutations.

Mutation is a double-edged sword. On the one hand, it is the ultimate source of genetic variation and the raw material for selection to act upon; a genotype with a null mutation rate would be sentenced to extinction because of its inability to respond to environmental perturbations. On the other hand, mutations typically lead to reduced fitness and are removed by purifying selection. It is generally assumed that mutation is a blind process, so that living beings cannot benefit from it without suffering its negative consequences, which is why the avoidance of the detrimental consequences of mutation may be as important to survival as the genesis of adaptive novelties. For example, recombination and sex are thought to be advantageous because they accelerate the fixation of beneficial mutations (1, 2) but also because they avoid the accumulation of deleterious mutations (2, 3). Therefore, the distribution of mutational effects on fitness is of fundamental importance for predicting evolutionary dynamics (4–6). Yet, surprisingly little quantitative information on the distribution of mutational effects exists. A few ambitious studies sought to measure the distribution of mutational effects in *Drosophila melanogaster* (7, 8), *Caenorhabditis elegans* (9), and *Escherichia coli* (10). However, these studies suffer from at least one of the following limitations: (*i*) they focused on phenotypic traits of unclear adaptive significance or on viability, which represents only one fitness component; (*ii*) they introduced an unknown number of mutations by chemical mutagenesis or by the accumulation of spontaneous mutations under conditions of relaxed selection; and/or (*iii*) they focused on particular types of mutations such as gene knock-outs caused by transposon insertion.

A key property of RNA viruses is their error-prone replication (11), which is believed to confer on them the advantage of great adaptability (12). In fact, RNA viral populations are usually described as molecular quasispecies that replicate near the maximum error rate compatible with the maintenance of the encoded genetic information (13). However, the nature of RNA viral populations does not depend only on mutation rate but also on the distribution of mutational fitness effects (14). Elena and Moya (15) analyzed fitness data for vesicular stomatitis virus (VSV) clones serially transferred through bottlenecks (16, 17), finding that the probability density function (pdf) that best fitted the data was a complex one in which a minority of clones had fitness values drawn from a [0, 1] uniform, whereas the majority had fitness values sampled from a γ distribution (15). Recently, Lázaro *et al*. (18) explored the effect of random mutations on the long-term survival of foot-and-mouth disease virus clones subjected to continuous bottlenecks of size one. They found that the distribution of mutational effects was well described by a Weibull pdf, whereas the distribution observed for large, nonevolving populations was best described by a log-normal pdf (18). Despite the ground-breaking importance of these studies for evolutionary virology, they suffer from one of the problems mentioned above: the number of mutations fixed per clone and their molecular nature are unknown. Therefore, inferences are only possible for the distribution of accumulated effects. Additionally, sequence analysis has revealed the difficulty of unambiguously establishing the relationship between multiple fixed mutations and fitness (19–21).

The goal of this work is to avoid this “black-box” process of mutagenesis by creating a collection of single-nucleotide substitution mutants by site-directed mutagenesis on an infectious VSV cDNA. Then we measure fitness for each member of the collection to infer the statistical properties of the distribution of mutational fitness effects.

## Materials and Methods

**Site-Directed Mutagenesis.** We created a collection of single-nucleotide substitution mutants of VSV. The collection constituted two different sets of mutations. The first contained 48 mutants for which both the site to be changed and the nucleotide to be introduced were chosen randomly. The second contained 43 substitutions already described in wild isolates (22, 23), laboratory populations (19, 20, 24–26), or laboratory clones (27–30). Mutations were distributed evenly along the genome. Table 3, which is published as supporting information on the PNAS web site, contains information about each mutant.

A full-length infectious cDNA clone (kindly provided by G. T. W. Wertz, University of Alabama at Birmingham, Birmingham) was used as template for creating the collection of mutants (31). Site-directed mutagenesis reactions were performed by using the high-fidelity *Pfu* DNA polymerase (Promega) to minimize the chance of appearance of undesired mutations (32). The products were digested with *Dpn*I (Stratagene) to remove the parental methylated strands and then transformed into ultracompetent XL-10 Gold cells (Stratagene). Sequencing of the cDNAs was done to confirm that each desired mutation was incorporated successfully.

As a first step, we introduced the substitution A-3853 → C in the plus strand (Asp-259 → Ala substitution in the G surface protein), which confers the ability of growing in the presence of the I_{1} mAb (MARM phenotype), at concentrations that inhibit wild-type growth (33). This cDNA clone, named MARM *RSV*, was used as template for the rest of mutagenesis.

**Virus Recovery from cDNA Clones.** Approximately 10^{5} (90–95% confluent) baby hamster kidney (BHK_{21}) cells (American Type Culture Collection) were infected with a recombinant vaccinia virus, vTF7-3 (American Type Culture Collection), which expressed the T7 RNA polymerase. After incubation, cells were cotransfected with the full-length mutant cDNA clone and three support plasmids that provided *in trans* the P, L, and N genes of VSV as described by Whelan *et al*. (31). Transfections were done by using Lipofectamine supplemented with Plus reagent (Invitrogen) and adding 25 μg/ml 1-β-d-arabinofuranosylcytosine to the cultures 6 h postinfection (hpi) to inhibit the replication of vaccinia virus vTF7-3. After 96 hpi, the cultures were frozen and thawed, and the supernatant was harvested. Dilutions (100- to 10^{4}-fold) were plated on a fresh monolayer with 0.4% agarose in the overlay DMEM (supplemented with 5% calf serum). The presence of plaque-forming units (PFU) 24 hpi indicated the successful recovery of infectious VSV particles, because vaccinia virus vTF7-3 is unable to produce PFU in such conditions (E. Martínez-Salas, personal communication). Any residual vaccinia virus vTF7-3 particles were removed by filtering the supernatant through 0.2-μm membranes (Millipore). Titers of successful transfections ranged between 10^{4} and 10^{6} PFU/ml. Preliminary experiments showed that the accuracy of fitness estimates depended on the titer obtained after the transfection. Therefore, to homogenize the titer of all mutants, 50 μl of the filtered supernatant was used to infect ≈10^{4} cells. After 48 h, cultures were harvested by freezing-thawing and stored in aliquots at -80°C. Titers, estimated in triplicate, were now ≈5 × 10^{6} PFU/ml. Failed transfection experiments were repeated until a positive result was obtained, with a maximum of 10 trials.

Transfection experiments were performed for the whole collection of mutants, the nonmutated wild type, and the MARM *RSV* clones. A large volume of wild type with a high titer was produced and kept at -80°C. This stock constituted our common competitor for fitness assays.

The MARM phenotype of all mutants, as well as the sensitivity of wild type to I_{1} mAb, was confirmed by plating assays in which the overlay medium was supplemented with 25% (vol/vol) of antibody.

**Relative Fitness Assays.** The fitness of each mutant relative to the nonmutated wild type was assessed by seeding ≈2.5 × 10^{3} PFU of each genotype into ≈10^{5} cells. To minimize the probability of fixation of new mutations during competition experiments, competitions were run for only 12 h. Preliminary assays showed that exponential growth occurred during this interval. Samples were taken at 6, 8, 10, and 12 hpi. The titer of both genotypes was determined by plating the appropriate dilution in the presence and absence of I_{1} mAb. The fitness of each mutant relative to wild type (ω) was estimated as the slope of the linear regression log[*N*_{M}(*t*)/*N*_{M}(*t*_{0})] = ω log[*N*_{W}(*t*)/*N*_{W}(*t*_{0})], where *N*_{M}(·) and *N*_{W}(·) represent the titer of mutant and wild type, respectively, at the beginning of the infection (*t*_{0}) and at *t* hpi. Under exponential growth, ω is equal to the ratio of intrinsic growth rates, *r*_{M}/*r*_{W}, of the mutant and the wild type, respectively. All assays were replicated in five independent blocks. For each block, fitness was also assayed for the MARM *RSV* progenitor in triplicate. Fitness estimates of each mutant relative to its progenitor (*W*) were adjusted by dividing the ω values obtained in each block by the fitness value of MARM *RSV* estimated in the same block. The average fitness value of MARM *RSV* relative to wild type was 0.859 ± 0.019 (±1 SEM).
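The estimator described above can be illustrated with a minimal sketch. It assumes that the regression is forced through the origin (the equation has no intercept term) and uses synthetic titers generated under pure exponential growth; the function names and all numerical values are hypothetical and are not taken from the experiments.

```python
import math

def relative_fitness(mutant_titers, wildtype_titers, mutant_t0, wildtype_t0):
    """Slope of log[N_M(t)/N_M(t0)] on log[N_W(t)/N_W(t0)].

    Assumption: least-squares regression through the origin.
    """
    x = [math.log(nw / wildtype_t0) for nw in wildtype_titers]
    y = [math.log(nm / mutant_t0) for nm in mutant_titers]
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def adjusted_fitness(omega_mutant, omega_progenitor):
    """W: fitness of a mutant relative to its progenitor, within one block."""
    return omega_mutant / omega_progenitor

# Synthetic example: both genotypes seeded at 2.5e3 PFU and sampled at
# 6, 8, 10, and 12 hpi; the growth rates are made up for illustration.
hours = [6, 8, 10, 12]
r_wildtype, r_mutant = 0.90, 0.72
n0 = 2.5e3
wildtype = [n0 * math.exp(r_wildtype * t) for t in hours]
mutant = [n0 * math.exp(r_mutant * t) for t in hours]

omega = relative_fitness(mutant, wildtype, n0, n0)
print(round(omega, 3))  # ratio of growth rates, r_M / r_W
```

Under exponential growth the slope recovers the ratio *r*_{M}/*r*_{W} exactly; with real titers the slope would also absorb plating error, which is why the assays were replicated in independent blocks.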

**Statistical Analyses.** Statistical analyses were performed by using SPSS 11.5. For the purpose of describing the distribution of mutational effects on fitness, each mutant was treated as an independent observation. The fit of the observed distribution to alternative pdf models was performed by least-squares nonlinear regression. The models chosen share the basic feature that mutations with small effects are more common than mutations with larger effects. Akaike's information criterion (AIC) was used to compare the log likelihood of nonnested models (34). The model that best explains the observations while requiring the fewest parameters is the one with the lowest AIC.
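As an illustration of how AIC trades goodness of fit against the number of parameters, the sketch below uses the common least-squares form AIC = *n* ln(RSS/*n*) + 2*k*, which assumes independent Gaussian residuals. Whether this exact form was used here is not stated, so it should be read as an assumption; the residual sums of squares are invented for the example.

```python
import math

def aic_least_squares(rss, n_points, n_params):
    """AIC for a least-squares fit, assuming independent Gaussian residuals."""
    return n_points * math.log(rss / n_points) + 2 * n_params

def best_model(fits, n_points):
    """fits maps model name -> (residual sum of squares, number of parameters)."""
    scores = {
        name: aic_least_squares(rss, n_points, k)
        for name, (rss, k) in fits.items()
    }
    return min(scores, key=scores.get), scores

# Hypothetical residual sums of squares for three candidate pdfs.
fits = {
    "exponential": (0.080, 1),
    "gamma": (0.078, 2),
    "beta": (0.050, 2),
}
name, scores = best_model(fits, n_points=22)
print(name)
```

In this made-up example the γ model has a slightly smaller residual sum of squares than the exponential but a larger AIC, because the improvement does not pay for the extra parameter.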

## Results

**Discarding Compensatory Mutations.** The study of the distribution of single-nucleotide substitution fitness effects strongly depends on whether each genotype carries only the desired mutation or whether additional mutations having a fitness effect arise during the early stages of replication and are common to most progeny of a transfection experiment. The number of generations, defined as cycles of cell infection and production of progeny (35), elapsed between the transfection and the beginning of the competition experiment is low enough (in the range of 1.96–6.13, with a median of 2.92) to prevent compensatory mutations from arising and distorting the fitness of single mutants. Nevertheless, to rule out this potential problem, we took a twofold strategy. First, we ran four independent transfection experiments for five genotypes and competed the resulting viruses against our reference wild type. These genotypes covered the whole distribution of fitness effects. As expected, fitness depended on the mutation introduced (nested ANOVA: *F*_{4,15} = 470.614; *P* < 0.001). If additional compensatory mutations had accumulated before fitness assays, we would also expect to detect differences between transfection experiments. However, there was no evidence supporting this hypothesis (nested ANOVA: *F*_{15,80} = 0.975; *P* = 0.489). Second, we determined the full-length RNA consensus sequence resulting from one transfection experiment for these five genotypes. Not a single unexpected change was observed in three of them. Two of them (originally having nonsynonymous mutations), however, presented one additional synonymous change that obviously has no fitness effect. In conclusion, compensatory mutations occurring before competition experiments do not take place at a noticeable rate.

**Assessing the Proportion of Deleterious, Neutral, and Beneficial Mutations.** We recovered infectious particles for 67 of 91 mutants. The fitness of each mutant was compared with the neutral value (*W* = 1) using a one-sample *t* test, and each mutation was subsequently classified into one of three categories: deleterious, neutral, or beneficial. Overall, 31 mutations had no significant fitness effect, 32 were deleterious, and 4 were beneficial (Table 1). Two kinds of statistical errors can affect these proportions: (*i*) rejecting the hypothesis of neutrality when it is actually true (type I error) and (*ii*) accepting it when it is actually false (type II error). If all mutations were neutral, we would expect to detect one or two (67 × 0.025) false-deleterious effects as well as one or two false-beneficial effects as a consequence of a type I error. Clearly, this would not be important for the estimated proportion of deleterious mutations. For beneficial mutations, we could apply a multiple test correction, but this enlarges type II errors. Instead, we performed five additional fitness assays for the 10 upper extreme fitness cases, in which the four putative beneficial mutants were included. After additional replication, these four cases remained statistically significant, and another four became so, adding up to a total of eight beneficial mutations. It is noteworthy that these estimates of the proportions of deleterious and beneficial mutations have to be considered as lower bounds, because some of the mutations classified as neutral could actually have a fitness effect too weak to be detected by our experimental method (type II error).
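The classification rule can be sketched as follows. The sketch assumes five fitness replicates per mutant (one per block), so that the one-sample *t* statistic has 4 degrees of freedom and the two-tailed 5% critical value is 2.776; the replicate values are invented, and the function name is hypothetical.

```python
import math
import statistics

# Two-tailed 5% critical value of Student's t with 4 degrees of freedom.
# Only valid for five replicates; other sample sizes need another value.
T_CRIT_DF4 = 2.776

def classify(fitness_replicates, t_crit=T_CRIT_DF4):
    """One-sample t test of mean fitness against the neutral value W = 1."""
    n = len(fitness_replicates)
    mean = statistics.mean(fitness_replicates)
    sem = statistics.stdev(fitness_replicates) / math.sqrt(n)
    t = (mean - 1.0) / sem
    if t > t_crit:
        return "beneficial"
    if t < -t_crit:
        return "deleterious"
    return "neutral"

# Made-up replicate fitness values for three hypothetical mutants.
print(classify([0.80, 0.82, 0.79, 0.81, 0.78]))
print(classify([1.01, 0.98, 1.02, 0.99, 1.00]))
print(classify([1.05, 1.06, 1.04, 1.07, 1.05]))

# Expected number of false positives in each tail if all 67 viable
# mutations were truly neutral (2.5% per tail).
expected_false_each_tail = 67 * 0.025
print(expected_false_each_tail)
```

The last line reproduces the "one or two" false-deleterious and false-beneficial calls expected from type I error alone.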

**Table 1. Proportion and number (in parentheses) of lethal, deleterious, neutral, and beneficial effects for random and previously described mutations**

**Dealing with the Existence of Lethal Mutations.** Lethal mutations and failed transfection experiments produce the same apparent result: an absence of infectious particles in the supernatant of the transfection. We failed to recover viral particles from the supernatant after 10 trials for 24 mutants. To rule out the possibility that these cases were failed transfection experiments rather than lethal mutations, we estimated our rate of transfection failure as follows. We ran 67 new, independent transfection experiments with either the MARM *RSV* or the wild-type cDNA. We recovered infectious particles in 39 of these experiments after one trial. Therefore, our rate of failure is 41.8% per transfection experiment. By using this figure, the likelihood of not recovering infectious particles because of recurrent experimental failure after 10 trials is 0.418^{10} = 1.63 × 10^{-4}. In a sample of 91 mutants, hence, we expect much less than one case (91 × 1.63 × 10^{-4} = 0.015) to be assigned erroneously to the category of lethal mutations. In conclusion, we are quite confident that the cases classified as lethal mutations are really so. This conclusion is further supported by considering the kind of mutations that were putatively lethal (Table 3): 19 produced nonsynonymous substitutions, 3 introduced stop codons, and 1 disrupted the initiation codon of the G gene. By contrast, there was only one case of lethal synonymous substitution, 53 nt before the end of the M gene. Among random mutations, 40% were putatively lethal. For preobserved mutations, although significantly reduced (Fisher's exact test, *P* < 0.010), this proportion was still 12% (Table 1).
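The arithmetic behind this argument can be reproduced in a few lines. The sketch assumes that successive transfection trials fail independently with the same probability; the function name is hypothetical, and the counts are those given in the text.

```python
def false_lethal_expectation(n_recovered, n_attempts, n_trials, n_mutants):
    """Expected number of viable mutants misclassified as lethal.

    Assumption: trials fail independently with a constant probability.
    """
    failure_rate = 1 - n_recovered / n_attempts
    p_all_fail = failure_rate ** n_trials
    return failure_rate, p_all_fail, n_mutants * p_all_fail

failure_rate, p_all_fail, expected_false_lethals = false_lethal_expectation(
    n_recovered=39, n_attempts=67, n_trials=10, n_mutants=91
)
print(f"{failure_rate:.3f}")            # per-trial failure rate
print(f"{p_all_fail:.2e}")              # probability that 10 trials all fail
print(f"{expected_false_lethals:.3f}")  # expected false lethals among 91
```

Working from the raw counts rather than the rounded 41.8% changes the result only in the third significant figure, so the conclusion that far fewer than one false lethal is expected is unaffected.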

**Distribution of Negative Fitness Effects.** The average fitness effect for the 51 mutations with effects that were <1.0 (not necessarily significant) but nonlethal was -0.139 ± 0.021. The distribution was highly and significantly skewed toward strongly negative values (*g*_{1} = -2.002; *t*_{50} = 6.005; *P* < 0.001), and consequently the median (-0.092) was well above the mean. The distribution was also strongly and significantly leptokurtic (*g*_{2} = 4.970; *t*_{50} = 7.578; *P* < 0.001), such that many values lie near the center and in the tail, whereas relatively few have intermediate values. These general properties are valid for both random and preobserved mutations. However, the analysis of fitness distribution needs to be done separately for random and preobserved mutations, because the biological meaning of both data sets is *a priori* different: the former group reflects pure mutational fitness effects, whereas the latter is influenced by the action of drift and natural selection. As expected, the mean negative fitness effect was larger for random than for preobserved nonsynonymous mutations (Fig. 1; Mann–Whitney test, *Z* = 2.098; one-tailed, *P* = 0.018). For synonymous mutations, fitness did not differ from 1 (*t* test, *t*_{8} = 1.197; *P* = 0.266).
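The *t* statistics attached to *g*_{1} and *g*_{2} can be approximately recovered from the sample size alone. The sketch below assumes the usual large-sample standard errors for sample skewness and kurtosis; the source does not state which formulas were used, so this is a consistency check under that assumption rather than a reconstruction of the original analysis.

```python
import math

def se_skewness(n):
    """Standard error of the sample skewness g1 for n observations."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n):
    """Standard error of the sample (excess) kurtosis g2 for n observations."""
    return math.sqrt(
        24 * n * (n - 1) ** 2 / ((n - 3) * (n - 2) * (n + 3) * (n + 5))
    )

n = 51                                  # nonlethal mutations with W < 1
t_skew = abs(-2.002) / se_skewness(n)   # reported g1 = -2.002
t_kurt = 4.970 / se_kurtosis(n)         # reported g2 = 4.970
print(round(t_skew, 2), round(t_kurt, 2))
```

Both values fall within rounding error of the reported *t*_{50} = 6.005 and *t*_{50} = 7.578.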

**Fig. 1.** (*A*) Random and (*B*) previously described mutations.

Because fitness effects are not distributed normally, it becomes necessary to determine which of several alternative models better describes our observations. Table 2 shows the statistics describing the fit of several models to the negative effects. The first model tested was the exponential distribution. Exponential pdfs have long been used for describing deleterious mutational effects (36), and more recently they have been proposed as a good model for describing beneficial effects as well (37–39). The only parameter, λ, is the inverse of the expected value. This model fitted significantly well to random (*F*_{1,21} = 2120.132; *P* < 0.001) and preobserved (*F*_{1,20} = 3327.380; *P* < 0.001) effects, explaining 95.8% and 96.4% of the observed variation, respectively.

**Table 2. Fit of the observed distribution of deleterious mutational effects to several models for random and preobserved mutations**

Then we tested several two-parameter models. The first model was the γ distribution (40). A γ distribution is characterized by a rate (inverse-scale) parameter, α, and a shape parameter, β, so that its expected value is β/α. Because the exponential is a particular case of the γ, it is possible to use a partial *F* test to compare the fit of both models. For preobserved mutations, the γ significantly improved over the exponential distribution (*F*_{1,20} = 11.394; *P* = 0.003). An alternative to the γ is the β distribution. It has a narrower range of values; whereas the domain of application of the γ is 0 ≤ *W* < +∞, the β is bounded in the range of 0 ≤ *W* ≤ 1. Therefore, it is especially well suited to model mutational effects. The β distribution is characterized by two shape parameters, α and β. The expected value of a β distribution is α/(α + β). This pdf scored the best fit for preobserved mutational effects. According to AIC, it was better than the γ and other alternative two-parameter models such as the Weibull and the log-normal. The least-squares parameter estimates for the β distribution were α = 0.742 ± 0.049 and β = 5.767 ± 0.526. The expected reduction in fitness was -11.4%, a value that is still 18.0% discrepant with the observed average reduction in fitness. The fit of the β model to the data is shown in Fig. 2*B*.
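The expected values quoted above follow directly from the parameter estimates. In the minimal sketch below, the γ mean is written as shape/rate, which is the reading of α implied by the stated mean β/α; the function names are hypothetical.

```python
def beta_mean(alpha, beta):
    """Expected value of a beta distribution with shape parameters alpha, beta."""
    return alpha / (alpha + beta)

def gamma_mean(rate, shape):
    """Expected value of a gamma distribution, reading alpha as a rate."""
    return shape / rate

# Least-squares estimates reported for preobserved deleterious mutations.
expected_reduction = beta_mean(0.742, 5.767)
print(f"{expected_reduction:.3f}")

# With shape = 1 the gamma reduces to an exponential with mean 1/lambda.
print(gamma_mean(rate=2.0, shape=1.0))
```

The first value corresponds to the 11.4% expected fitness reduction; the second line illustrates why the exponential is nested within the γ and can be compared with it by a partial *F* test.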

**Fig. 2.** (*A*) Mutations chosen randomly. The continuous line shows the predicted **...**

For random mutations, the γ did not improve the fit of the exponential distribution (*F*_{1,20} = 1.468; *P* = 0.240). Similarly, neither the β nor the Weibull was significantly better than the exponential (larger AIC values; Table 2). The best fit for random mutations was obtained for the log-normal distribution. This model is characterized by a scale parameter, *m*, and a shape parameter, σ. The least-squares parameter estimates were *m* = 0.092 ± 0.003 and σ = 1.206 ± 0.067. The expected value for the log-normal distribution, *m*e^{σ^{2}/2}, was a fitness reduction of -19.1%. The fit of this model to the data is shown in Fig. 2*A*.
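The log-normal expectation can be checked from the reported estimates. The sketch takes *m* as the scale parameter (the median of the distribution) and σ as the shape parameter, as defined above; the function name is hypothetical.

```python
import math

def lognormal_mean(m, sigma):
    """Expected value of a log-normal with scale m and shape sigma."""
    return m * math.exp(sigma ** 2 / 2)

# Least-squares estimates reported for random deleterious mutations.
expected_reduction = lognormal_mean(0.092, 1.206)
print(f"{expected_reduction:.3f}")
```

With the rounded estimates the result is about 0.19, in line with the reported 19.1%; the small residual difference is presumably due to rounding of the published parameters.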

Elena *et al*. (10) proposed that deleterious fitness effects should be explained better by more complex models intended to capture cases with large effects unexplained by simpler distributions. Thus, we tried to combine the above single-distribution models with a uniform pdf. For example, in the case of the exponential, the complex model was *p* × exp(*s*|λ) + (1 - *p*) × *Un*(*s*|0, *b*), with *Un*(*s*|0, *b*) being the uniform pdf in the range [0, *b*] and *p* indicating the fraction of mutations sampled from the exponential (the remaining 1 - *p* being sampled from the uniform). The fit of simple models was strongly improved when combined with the uniform distribution, according to partial *F* tests (all cases *P* ≤ 0.049). In combination with the uniform pdf, the β distribution again was the best descriptor for preobserved mutations (Table 2 and Fig. 2*B*), whereas the log-normal remained the best descriptor for random mutations (Table 2 and Fig. 2*A*). The consequence of adding a uniform term is to raise the probability of highly deleterious mutations. In fact, in the case of preobserved mutations, the uniform pdf accounted for >99% of the overall predicted probability for fitness effects beyond -8%, whereas the β pdf explained less deleterious effects. In the case of random mutations, this transition was shifted to a fitness effect of -15%. Under the compound models, the expected mean fitness effects are -10.5% for preobserved mutations and -15.4% for random mutations. However, these values are dominated by the uniform pdf and thus are strongly dependent on the upper bound of this distribution, which in turn is highly dependent on sampling error.
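The behavior of the compound model can be illustrated for the exponential-plus-uniform case. In this sketch *s* is the magnitude of the deleterious effect, and the parameter values (*p* = 0.8, λ = 50, *b* = 0.6) are invented purely for illustration; they are not the fitted estimates, and the function names are hypothetical.

```python
import math

def compound_pdf(s, p, lam, b):
    """p * Exp(s | lam) + (1 - p) * Un(s | 0, b), for effect magnitude s."""
    exp_part = lam * math.exp(-lam * s) if s >= 0 else 0.0
    uni_part = 1.0 / b if 0 <= s <= b else 0.0
    return p * exp_part + (1 - p) * uni_part

def compound_mean(p, lam, b):
    """Expected effect: weighted mean of the two component means."""
    return p / lam + (1 - p) * b / 2

def uniform_tail_share(s, p, lam, b):
    """Share of the probability of effects larger than s due to the uniform."""
    uniform_tail = (1 - p) * max(0.0, b - s) / b
    exponential_tail = p * math.exp(-lam * s)
    return uniform_tail / (uniform_tail + exponential_tail)

p, lam, b = 0.8, 50.0, 0.6  # hypothetical parameter values
print(round(compound_mean(p, lam, b), 3))
print(round(uniform_tail_share(0.15, p, lam, b), 3))
```

Even when the uniform component carries only a small fraction of the mutations, it accounts for nearly all of the probability beyond a moderate effect size and contributes most of the mean, which is why the expected values under the compound models depend so strongly on the upper bound *b*.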

**Distribution of Beneficial Fitness Effects.** For the 16 mutants showing beneficial effects, the average fitness effect was 0.044 ± 0.012, a value significantly greater than zero (*t*_{15} = 3.690; *P* = 0.002). The distribution was skewed toward small beneficial effects (*g*_{1} = 1.744; *t*_{15} = 3.091; *P* = 0.008), with median fitness effect (0.032) below the mean. The distribution was also significantly leptokurtic (*g*_{2} = 2.587; *t*_{15} = 2.358; *P* = 0.017). As expected, the mean positive fitness effect was stronger for preobserved mutations than for random mutations (Mann–Whitney test, *Z* = 2.315; one-tailed, *P* = 0.010).

Positive fitness effects are much rarer than deleterious ones (Fig. 1), which makes it difficult to infer complex distributions from the data. The exponential distribution provided a relatively poor fit to both preobserved and random data sets, leaving unexplained >10% of the total variance (*R*^{2} = 0.888 in both cases). The γ distribution provided better fits (*R*^{2} = 0.937 for preobserved and *R*^{2} = 0.953 for random mutations), although the benefit of including an additional parameter was barely significant (preobserved mutations: *F*_{1,7} = 5.532, *P* = 0.051; random mutations: *F*_{1,5} = 6.935; *P* = 0.046). Alternative two-parameter pdfs provided similar fits (data not shown). The mean beneficial effects according to a γ distribution were 4.6% for preobserved and 1.7% for random mutations. The fit of the γ model to the data is shown in Fig. 3.
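The partial *F* statistics comparing the exponential with the γ can be approximated from the reported *R*^{2} values. The sketch assumes that both *R*^{2} values refer to the same total sum of squares and that the number of fitted points is 9 for preobserved and 7 for random mutations; these sample sizes are inferred from the reported denominator degrees of freedom (7 and 5, with two parameters) and are not stated explicitly.

```python
def partial_f(r2_reduced, r2_full, n_points, k_reduced, k_full):
    """Partial F statistic for nested least-squares models, from R^2 values."""
    df1 = k_full - k_reduced
    df2 = n_points - k_full
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, df1, df2

# Exponential (1 parameter) versus gamma (2 parameters).
f_pre, df1_pre, df2_pre = partial_f(0.888, 0.937, n_points=9, k_reduced=1, k_full=2)
f_rand, df1_rand, df2_rand = partial_f(0.888, 0.953, n_points=7, k_reduced=1, k_full=2)
print(round(f_pre, 2), (df1_pre, df2_pre))
print(round(f_rand, 2), (df1_rand, df2_rand))
```

The results, roughly 5.4 and 6.9, are close to the reported *F*_{1,7} = 5.532 and *F*_{1,5} = 6.935; the remaining gap is compatible with rounding of the published *R*^{2} values.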

## Discussion

This work represents a study of the distribution of mutational effects on fitness for an RNA virus using explicit single-nucleotide substitutions. On average, mutations were deleterious even when lethals were ignored. Functional and structural analyses (41, 42) have shown that RNA viruses have a very narrow tolerance to accumulate mutations and still be functional, and thus it is not surprising to find that lethal and deleterious mutations are so common. Additionally, previous indirect approaches (15) estimated that the frequency of deleterious mutations in VSV was ≈34%, a value close to ours (Table 1).

On the other hand, we found that among 48 random mutations, two were apparently beneficial. It is generally accepted that beneficial effects are ≈1,000-fold less common than neutral and deleterious ones (6, 39, 43). Therefore, it is striking that two of 48 random mutations were beneficial. However, this result is not so surprising if we recall that we used a chimeric genome as template for our mutagenesis experiments. The template cDNA was assembled from clones of each of the VSV genes and intergenic sequences from two different sources. Whereas the N, P, M, and L genes were obtained from the San Juan strain of the Indiana serotype, the G gene was obtained from the Orsay strain of the same serotype (31). At the amino acid level, the divergence between the San Juan and the Orsay G proteins is ≈5%. The question is whether this difference precludes an efficient interaction between the Orsay G protein and the rest of the gene products from the San Juan strain. If this is the case, many different ways to optimize such genomes are available. Furthermore, the ratio of beneficial to deleterious mutations depends on the degree of adaptation of the virus to the laboratory conditions, which in this case is minimal.

As expected, the mean mutational effects as well as the proportion of lethals were different for the random and preobserved mutation sets. However, the effect of preobserved mutations was still deleterious on average, and in a few cases even lethal (Table 1). This result is not surprising for those changes reported in isolated clones, because RNA virus populations are in a dynamic equilibrium between the input of deleterious variants and purifying selection (13). Additionally, some of these variants could have been hidden from natural selection by genetic complementation, provided that multiplicity of infection was high enough (29, 44). However, 18 of the mutations introduced were not found in isolated clones but in consensus sequences characterized for laboratory populations. Novella *et al*. (19) sequenced half of the genome of viruses evolved in mammalian cells, insect cells, or alternating between both cell types. A total of 13 nt substitutions were detected, and 2 of them arose independently in viruses isolated from different evolutionary regimes. Interestingly, both convergent mutations conferred increased fitness when recreated in our experiments (Pro-120 → Ala and Leu-123 → Trp, both in the M gene), which made them good candidates for conferring a general nonspecific adaptive advantage. All three lineages harbored at least one mutation with a positive fitness effect, but on the other hand, all of them also contained at least one mutation with a negative effect, measured in our experimental setup. (The latter are good candidates for environment-specific mutations.) The rise in frequency of deleterious mutations can be explained by hitchhiking with beneficial mutations in a nonrecombining genome. Cuevas *et al*. (20) found 25 different mutations in 21 independently evolving populations of VSV undergoing adaptive evolution, most of them occurring recurrently in different populations, in a remarkable case of parallel evolution. 
Among them, we chose 12 nonsynonymous mutations. In at least four of these experimental populations, all the substitutions fixed had a negative fitness effect when introduced in our experiments, and one was even lethal. In contrast, we found only one beneficial mutation. It is therefore naive to expect a predominance of neutral and beneficial effects among preobserved mutations, because fitness effects strongly depend on genotype (epistasis) and environment (20, 45).

Much effort has gone into studying the distribution of deleterious mutational effects in biological systems such as *Caenorhabditis* (9, 46), *Drosophila* (40, 47–49), *E. coli* (10), and RNA viruses (15, 18). Using a set of random mutations, we have shown that mutational fitness effects in VSV are well described by a log-normal pdf. Many processes in life sciences such as latent periods of infectious diseases, microorganisms' sensitivity to drug treatments, survival times in medicine, presence of contaminants in the air, or the abundance of species in ecology have been described by using log-normal models (50). In general, this distribution arises when a given variable is determined by multiple multiplicative small effects. Recently, Lázaro *et al*. (18) showed that the pattern of titer fluctuations in nonevolving foot-and-mouth disease virus populations was log-normally distributed. Such a result was not unexpected, because numerous cellular factors participate in virus replication, each of them having a small effect on the viral yield. However, in their experimental system, these cellular factors could not be distinguished from mutational effects. In contrast, our results unravel the effect of explicit mutations on viral fitness. RNA viruses have a very compact genome such that a given genomic region may be involved in multiple functions, not only as mere carriers of genetic information but as regulatory elements or even ribozymes (21, 51). Consequently, a single-nucleotide change may have strong pleiotropic effects.

For the set of preobserved mutations, we found that deleterious effects were better described by a β pdf, although a γ also gave a very satisfactory fit. Similar distributions, with an exponential-like shape, have been reported previously for different kinds of DNA organisms and RNA viruses (9, 10, 15, 40, 46–48). Similarly, the variation of codon substitution rates across viral genomes has been modeled by using β and γ distributions (52, 53). This exponential-like shape, with most of the mutations having very small effects but a few having very large deleterious effects, is explained easily under the action of natural selection simply because mutations with small effects are more influenced by genetic drift and less efficiently eliminated from the population (54). When a uniform pdf was added to two-parameter pdfs, models fitted substantially better to the empirical deleterious fitness effects (Table 2). A compound model in which a proportion *p* of the mutants is drawn from a uniform distribution and a proportion 1 - *p* from a γ distribution was the best descriptor for the deleterious fitness effects associated with Tn*10* transposition mutations in *E. coli* (10) and with mutations accumulated by the action of Muller's ratchet in VSV (15).

Studies characterizing the statistical properties of beneficial effects are scarcer than those dealing with deleterious mutations, probably because of the difficulty of isolating beneficial mutations in sufficient numbers to allow reliable statistical inference. Thus far, only two studies using *E. coli* populations directly tackled this issue. Imhof and Schlötterer (37) reported an exponential distribution for the beneficial mutations that survived drift and reached a detectable frequency in the population. Rozen *et al*. (38) found an exponential-like distribution among the beneficial mutations that were fixed. However, neither of these studies provides information about the actual distribution of all possible beneficial effects. Using extreme value theory, Orr (39) showed that the distribution of beneficial effects has to be exponential independently of the fitness of the wild-type allele. Despite the limited number of mutations with positive effects, our results support the notion that the distribution of beneficial effects is skewed toward low effects, with a long tail of very large beneficial effects. However, the exponential distribution might be improved upon by more general two-parameter models such as the γ distribution, suggesting that, in analogy to deleterious mutations, the distribution of positive effects may not be so simple.

## Acknowledgments

We thank G. T. W. Wertz for kindly providing the VSV full-length infectious cDNA as well as the three support plasmids. We are indebted to A. V. Bordería, C. López-Galíndez, E. Martínez-Salas, and I. S. Novella for invaluable technical advice. This study was supported by Spanish Ministerio de Ciencia y Tecnología Grant BMC2001-3096 (to A.M.) and Generalitat Valenciana Grant GV01-65 (to S.F.E.). R.S. enjoyed a predoctoral fellowship from the Ministerio de Educación, Cultura y Deporte.

## Notes

This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: VSV, vesicular stomatitis virus; pdf, probability density function; hpi, hours postinfection; PFU, plaque-forming units; AIC, Akaike's information criterion.

## References

**,**118-138.

**,**372-387. [PubMed]

**/103,**3-19. [PubMed]

**,**337-370. [PubMed]

**,**11-21. [PubMed]

**,**683-685. [PubMed]

**,**829-837. [PMC free article] [PubMed]

**,**277-292. [PMC free article] [PubMed]

**,**3823-3827. [PMC free article] [PubMed]

**/103,**349-358. [PubMed]

**,**151-178. [PubMed]

**,**251-253. [PubMed]

**,**6881-6891.

**,**987-994. [PubMed]

**,**1078-1088.

**,**6015-6019. [PMC free article] [PubMed]

**,**222-228. [PMC free article] [PubMed]

**,**10830-10835. [PMC free article] [PubMed]

**,**459-465. [PubMed]

**,**533-542. [PMC free article] [PubMed]

**,**399-405. [PubMed]

**,**4873-4883. [PMC free article] [PubMed]

**,**171-181. [PubMed]

**,**655-659. [PMC free article] [PubMed]

**,**256-263. [PMC free article] [PubMed]

**,**3591-3595. [PubMed]

**,**454-464. [PMC free article] [PubMed]

**,**529-535. [PMC free article] [PubMed]

**,**505-514. [PMC free article] [PubMed]

**,**1571-1579. [PubMed]

**,**8388-8392. [PMC free article] [PubMed]

**,**2921-2928. [PubMed]

**,**312-325. [PubMed]

**,**716-723.

**,**3566-3571. [PMC free article] [PubMed]

**,**335-355. [PMC free article] [PubMed]

**,**1113-1117. [PMC free article] [PubMed]

**,**1040-1045. [PubMed]

**,**1519-1526. [PMC free article] [PubMed]

**,**1315-1322. [PMC free article] [PubMed]

**,**197-207. [PubMed]

**,**255-267. [PubMed]

**,**1745-1747. [PubMed]

**,**1-10.

**,**1193-1201. [PMC free article] [PubMed]

**,**1993-1999. [PMC free article] [PubMed]

**,**1130-1139.

**,**467-483. [PMC free article] [PubMed]

**,**341-352.

**,**5787-8794. [PMC free article] [PubMed]

**,**431-449. [PMC free article] [PubMed]

**,**807-814. [PubMed]

**,**1725-1737.

**National Academy of Sciences**
