- We are sorry, but NCBI web applications do not support your browser and may not function properly. More information

- Journal List
- Genetics
- v.180(3); 2008 Nov
- PMC2581975

# A Simple Formula for Obtaining Markedly Improved Mutation Rate Estimates

^{1}

^{1}

*Address for correspondence:*Instituto Gulbenkian de Ciencia, Rua da Quinta Grande, 6 P-2780-156 Oeiras, Portugal. E-mail: ude.mnu@hsirregp

## Abstract

In previous work by M. E. Jones and colleagues, it was shown that mutation rate estimates can be improved and corresponding confidence intervals tightened by following a very easy modification of the standard fluctuation assay: cultures are grown to a larger-than-usual final density, and mutants are screened for in only a fraction of the culture. Surprisingly, this very promising development has received limited attention, perhaps because there has been no efficient way to generate the predicted mutant distribution to obtain non-moment-based estimates of the mutation rate. Here, the improved fluctuation assay discovered by Jones and colleagues is made amenable to quantile-based, likelihood, and other Bayesian methods by a simple recursion formula that efficiently generates the entire mutant distribution after growth and dilution. This formula makes possible a further protocol improvement: grow cultures as large as is experimentally possible and severely dilute before plating to obtain easily countable numbers of mutants. A preliminary look at likelihood surfaces suggests that this easy protocol adjustment gives markedly improved mutation rate estimates and confidence intervals.

A common method for estimating mutation rates is the “fluctuation assay,” in which several bacterial cultures are grown from small, presumably mutant-free, inoculums to high-density cultures and subsequently screened for mutants by plating on selective media (Rosche and Foster 2000; Pope *et al.* 2008). On selective media, only selected mutants are able to form colonies, and the numbers of mutants counted on plates are an indicator of the mutation rate. To make an estimate of the mutation rate from these numbers, *a priori* knowledge of their distribution is required. The fundamental assumption upon which this distribution is derived is that no Lamarckian-like processes exist: mutations are assumed to occur spontaneously during cell division in a way that is completely independent of any detriment or benefit that the mutations might subsequently incur. Such mutations can occur at any time during the growth of a culture. If a mutation happens to occur early in the growth of a culture, it will leave a large number of mutant descendants in the culture at the time of plating. The nonnegligible probability of a mutation occurring early thus translates to nonnegligible probabilities in the right tail of the mutant distribution. The resulting heavy-tailed mutant distribution was first proposed by Luria and Delbrück (1943) and has since been dubbed the “Luria–Delbrück distribution” (LDD) in their honor. Its first rigorous derivation was given by Lea and Coulson (1949), and many subsequent derivations account for different biological and experimental details (Armitage 1952; Bailey 1964; Mandelbrot 1974; Bartlett 1978; Stewart *et al.* 1990; Stewart 1991; Pakes 1993; Zheng 1999; Kepler and Oprea 2001; Oprea and Kepler 2001; Dewanji *et al.* 2005).

Each of these derivations culminates in a version of the LDD expressed in probability-generating function (PGF) form. The great utility of PGFs is that (1) distributions can be easier to derive in PGF form, (2) distributions can be expressed compactly in PGF form, even in cases where there is no way to write down a general (nonderivative) expression for the individual probabilities of the distribution itself, and (3) from a distribution expressed in PGF form, the zero class and the moments of the distribution are immediate. This last feature suggests possible mutation rate estimators. The zero-class estimator is robust to different assumptions about growth rates, but its statistical power is mediocre and its use is restricted to cases in which the zero class exists but is not too big. Moment-based estimators can be very inaccurate as a result of the LDD's heavy tail (Armitage 1952, 1953); in fact, the simplest form of the distribution—due to Lea and Coulson (1949)—has no finite moments.

With limited prospects for the utility of moment-based methods, researchers naturally turned to the quantiles for information about the distribution and for obtaining mutation rate estimates. Quantile-based methods, however, require that distribution probabilities be extracted from the governing PGF. Until relatively recently, algorithms for extracting distribution probabilities have been quite complex, and their utility was therefore chiefly for the construction of tables from which researchers could look up quantile-based mutation rate estimates. The most widely used of these methods was the Lea and Coulson “median method” (Lea and Coulson 1949).

Likelihood and other Bayesian methods of mutation rate estimation were first made accessible to experimentalists by Stewart's discovery of a relatively efficient algorithm for generating the probabilities of the LDD (Stewart *et al.* 1990). The subsequent discovery of an even more efficient algorithm—a simple recursion formula (Ma *et al.* 1991, 1992; Sarkar *et al.* 1992) after Gurland (1958)—made likelihood methods routine, and consequently there has been increasing preference for these superior methods of mutation rate estimation (Rosche and Foster 2000). Stewart (1994) performed a systematic assessment of these superior methods and derived a formula for calculating confidence intervals.

Analysis of mutant-count data can sometimes be complicated by experimental artifacts that make a certain fraction of mutants unable to form observable colonies. Such artifacts can be especially pronounced in assays that use mammalian cells (Kendal and Frost 1988). The fraction of plated mutants that do ultimately form observable colonies was originally called the “plating efficiency,” but it is mathematically equivalent to sampling the culture by any means, such as intentionally plating only a fraction of the final culture by either direct aliquot or dilution. To account for such sampling, Jones (1993, 1994) and Stewart (1991) both model the LDD as an irreducible compound distribution: the total number of mutations that occur during the growth of a culture is a Poisson-distributed random variable, and the total number of mutant colonies that derive from each mutation event is an independent random variable whose distribution is the “clone-size” distribution derived by Lea and Coulson. Jones and Stewart both reworked the clone-size distribution with a sampling step added: they derived the probability that a mutation occurring during the growth of a culture gives rise to *k* visible colonies when only a fraction of mutants form visible colonies. To generate the probabilities of total numbers of mutant colonies on plates after growth and sampling, they both outlined different programming algorithms (Stewart *et al.* 1990; Stewart 1991; Jones 1993; Crane *et al.* 1996) to compose the primary Poisson distribution for numbers of mutations occurring during growth with their reworked clone-size distributions that take preplating sampling into account. Jones' algorithm is as follows.

We let denote the probability that a single new mutation gives rise to mutant colonies in the plated culture after dilution (the Jones clone-size distribution). We then define the conditional probability, to be the probability that or fewer mutant colonies are observed, given that new mutations occurred during the growth of the culture. These probabilities are calculated from iteration of the expression with initial condition . When this is done, then we can compute the desired probability, , that colonies are observed in the plated culture, given that is the mean number of new mutations occurring during growth and is the dilution factor that precedes plating (or plating efficiency), as follows: . One cannot sum numerically to infinity, and this sum must therefore be truncated. Jones truncates the sum at 100, which is adequate for values , but thus limits applicability of his tabulated median method results to protocols for which plating efficiency is greater than ~5%. A more elegant solution to the problem of infinite summation appears in Ma *et al.* (1992) and Jones *et al.* (1999).

The above numerical procedure for obtaining the probabilities of the mutant distribution, while more efficient than some, was still unable to accommodate likelihood calculations in an efficient way, especially when the plating efficiency was very small. Nevertheless, these methods paved the way for a rather striking discovery: Jones and colleagues (Crane *et al.* 1996) observed that mutation rate estimates and corresponding confidence intervals could actually be improved by adding a sampling (or dilution) step. If cultures were grown to larger-than-usual final densities and then diluted prior to plating, mutation rate estimates could be improved and corresponding confidence intervals tightened. This development (henceforth called the “Jones protocol”) was significant because it had the potential to introduce a new standard for reliability and resolution with which mutation rates could be reported. Despite the promise of better estimates, however, Jones' improvement on the standard fluctuation assay has not become standard practice. The gains made by the Jones protocol are countered by cumbersome numerical calculations and hence its reliance on a somewhat limited table-based median method for practical mutation rate estimation. On the other hand, the inferiority of the standard fluctuation protocol is compensated for by the superiority of likelihood and other Bayesian methods available to it for practical mutation rate estimation. The pros and cons of these two protocols thus achieved a rough balance and, faced with a choice between the two, many researchers have opted for the more fashionable likelihood approaches in conjunction with the standard fluctuation protocol.

It is hoped that the simple recursion formula derived below and plotted in Figure 1 will allow experimentalists to have their cake and eat it too, by allowing them to follow the Jones protocol for improved mutation rate estimates (Crane *et al.* 1996) and use maximum likelihood or other Bayesian methods (Asteris and Sarkar 1996) for practical mutation rate estimation. The result is a rather spectacular narrowing of likelihood peaks and increased accuracy of their modes (Figure 2). While several sample likelihood peaks are shown in Figure 2, the purpose of this short note is not to propose a detailed and evaluated methodology for employing the Jones protocol but simply to present the recursion formula that gives the mutant distribution.

**...**

**...**

## DERIVATION OF THE FORMULA

The PGF for numbers of mutants after growth of the cultures in a fluctuation assay (the LDD) as derived by Lea and Coulson (1949) is

where , is the mutation rate, and is the final population size of the cultures. Of the mutants present in the cultures after growth, only a fraction will give rise to observable mutant colonies; *i.e.*, is the plating efficiency. For each mutant in a culture just prior to plating, therefore, the stochastic sampling process due to plating (or, equivalently, due to preplating dilution) is a Bernoulli process, whose PGF is

Numbers of mutants after growth and sampling have a distribution whose PGF is

The Lea and Coulson PGF, , has no finite moments, and it is no surprise therefore that also has no finite moments. The Lea and Coulson PGF may be rewritten as

and the PGF for growth and sampling is therefore

or, expanding further,

Rearranging and collecting coefficients of like powers of gives

The coefficient of can be reduced to

where denotes the hypergeometric function as defined in Abramowitz and Stegun (1965) by their Equation 15.1.1. The PGF can now be rewritten as

where the are defined above.

With now in the appropriate form, we can appeal to Lemma 2 in Zheng (1999), after Ma *et al.* (1991) and Gurland (1958), to derive the probabilities of the distribution, which reduce to the recursion

and

where

Unfortunately, subroutines for computing the hypergeometric function can be computationally intensive and they can become unstable for very small values of , *i.e.*, for severe dilutions. To obtain a more useful expression, I expand the hypergeometric function about and obtain an extremely good approximation (Figure 1) for the case of severe dilutions (small ; see Figure 3):

The tail probabilities of this distribution are derived by inspection of the composition,

where the are the probabilities of the Lea and Coulson distribution. For large , we have (Pakes 1993), and as before; the *n*th term of the composition may thus be expanded as follows:

**...**

Writing out the probabilities of the composed distributions in the LDD tail gives

Rearranging and collecting coefficients of like powers of gives , where

Thus, not surprisingly, tail behavior is not changed by sampling (the relation still holds). Absolute tail probabilities, however, are reduced by the dilution factor, ; this reduction implies a corresponding increase in absolute probabilities elsewhere, *i.e.*, in the peak of the distribution. This observation may provide insight to better understand why mutation rate estimates are better and give tighter confidence intervals when the Jones protocol is used (Crane *et al.* 1996).

## DISCUSSION

The formula derived here makes the Jones protocol amenable to likelihood methods, which have the advantage that they use all the available data in an efficient manner. A disadvantage of likelihood methods, however, is that they are very “parametric” and their usefulness thus depends heavily on the correctness of the underlying model used to generate the distribution. Quantile and zero-class methods (Kendal and Frost 1988; Zheng 1999; Rosche and Foster 2000), on the other hand, are much less parametric and are thus more robust to departures from the underlying model. Put differently, quantile and zero-class methods have some advantage when there is some uncertainty about the underlying model or when there are known simplifying assumptions that can introduce error. The formula presented here derives from the simplest model for the LDD (Lea and Coulson 1949), which makes a number of simplifying assumptions, such as exponentially distributed times between replication events (a Markovian birth process) (Jones *et al.* 1999; Kepler and Oprea 2001), probabilities expressed in infinite (nontruncated) series thereby assigning nonzero (albeit miniscule) probabilities to unrealistically large mutant numbers (Armitage 1952; Bailey 1964; Stewart *et al.* 1990; Ma *et al.* 1992; Zheng 1999), and no difference in growth rates between wild-type and mutant subpopulations (Jones 1994; Jaeger and Sarkar 1995; Zheng 1999). In light of these simplifying assumptions, whose impact can sometimes be significant, it might ultimately be desirable to use the less accurate but more robust quantile or zero-class methods to complement the more accurate but less robust likelihood methods. The zero-class, the median, and other quantiles can be easily computed from the formula presented here. Additionally, Jones and colleagues (Jones *et al.* 1994) derive analytical expressions for zero-class and median estimators that are modified to include dilution. Evaluating the use of quantile and zero-class methods to complement likelihood methods is work in progress.

In the previous work by Jones and colleagues (Crane *et al.* 1996), numerical considerations put limits on the size of the final culture densities and hence the severity of the subsequent preplating dilution; *i.e.*, they put limits on how small can be. The approximation (2) derived above, on the other hand, works best when is small (Figure 3). This fact suggests an improvement on the Jones protocol: grow cultures as large as is experimentally possible and dilute the final cultures accordingly prior to plating to obtain easily countable numbers of mutants on the plates. This can only further reduce the impact of the small- assumption used to obtain the more useful approximate formula derived here, thereby increasing its accuracy, and further tighten corresponding confidence intervals due to the effect described by Jones and colleagues (Crane *et al.* 1996). In Figure 2, this modification is referred to as the “extended Jones protocol.” A systematic evaluation of different experimental protocols and the accuracy of mutation rate estimates to be expected is the subject of this article's sequel.

## Acknowledgments

This work was inspired by a data set sent to me by Dominique Schneider. I thank Paul Sniegowski for helpful discussions, and I thank M. E. Jones and an anonymous reviewer for helpful comments and suggestions. I was supported by National Institutes of Health grant R01 GM079483-01A2 and European Commission grant MEXT-CT-2004-14338.

## References

- Abramowitz, M., and I. A. Stegun, 1965.
*Handbook of Mathematical Functions.*Dover, New York. - Armitage, P., 1952. The statistical theory of bacterial populations subject to mutation. J. R. Stat. Soc. B 14 1–40.
- Armitage, P., 1953. Statistical concepts in the theory of bacterial mutation. J. Hygiene 51 162–184. [PMC free article] [PubMed]
- Asteris, G., and S. Sarkar, 1996. Bayesian procedures for the estimation of mutation rates from fluctuation experiments. Genetics 142 313–326. [PMC free article] [PubMed]
- Bailey, N. T. J., 1964.
*The Elements of Stochastic Processes With Applications to the Natural Sciences.*Wiley, New York. - Bartlett, M. S., 1978.
*An Introduction to Stochastic Processes.*Cambridge University Press, London. - Crane, G. J., S. M. Thomas and M. E. Jones, 1996. A modified Luria-Delbrück fluctuation assay for estimating and comparing mutation rates. Mutat. Res. Fundam. Mol. Mech. Mutagen. 354 171–182. [PubMed]
- Dewanji, A., E. G. Luebeck and S. H. Moolgavkar, 2005. A generalized Luria-Delbrück model. Math. Biosci. 197 140–152. [PubMed]
- Gurland, J., 1958. A generalized class of contagious distributions. Biometrics 14 229–249.
- Jaeger, G., and S. Sarkar, 1995. On the distribution of bacterial mutants: the effects of differential fitness of mutants and non-mutants. Genetica 96 217–223.
- Jones, M. E., 1993. An algorithm accounting for plating efficiency in estimating spontaneous mutation rates. Comput. Biol. Med. 23 455–461. [PubMed]
- Jones, M. E., 1994. Luria-Delbrück fluctuation experiments: accounting simultaneously for plating efficiency and differential growth rate. J. Theor. Biol. 166 355–363. [PubMed]
- Jones, M. E., S. M. Thomas and A. Rogers, 1994. Luria-Delbrück fluctuation experiments: design and analysis. Genetics 136 1209–1216. [PMC free article] [PubMed]
- Jones, M. E., S. M. Thomas and K. Clarke, 1999. The application of a linear algebra to the analysis of mutation rates. J. Theor. Biol. 199 11–23. [PubMed]
- Kendal, W. S., and P. Frost, 1988. Pitfalls and practice of Luria-Delbrück fluctuation analysis: a review. Cancer Res. 48 1060–1065. [PubMed]
- Kepler, T. B., and M. Oprea, 2001. Improved inference of mutation rates: I. An integral representation for the Luria-Delbrück distribution. Theor. Popul. Biol. 59 41–48. [PubMed]
- Lea, D. E., and C. A. Coulson, 1949. The distribution of the numbers of mutants in bacterial populations. J. Genet. 49 264–285. [PubMed]
- Luria, S. E., and M. Delbrück, 1943. Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28 491–511. [PMC free article] [PubMed]
- Ma, W. T., G. v. Sandri and S. Sarkar, 1991. Novel representation of exponential functions of power series which arise in statistical mechanics and population genetics. Phys. Lett. A 155 103–106.
- Ma, W. T., G. v. Sandri and S. Sarkar, 1992. Analysis of the Luria-Delbrück distribution using discrete convolution powers. J. Appl. Probab. 29 255–267.
- Mandelbrot, B., 1974. A population birth-and-mutation process, I: explicit distributions for the number of mutants in an old culture of bacteria. J. Appl. Probab. 11 437–444.
- Oprea, M., and T. B. Kepler, 2001. Improved inference of mutation rates: II. Generalization of the Luria-Delbrück distribution for realistic cell-cycle time distributions. Theor. Popul. Biol. 59 49–59. [PubMed]
- Pakes, A. G., 1993. Remarks on the Luria-Delbrück distribution. J. Appl. Probab. 30 991–994.
- Pope, C. F., D. M. O'Sullivan, T. D. McHugh and S. H. Gillespie, 2008. A practical guide to measuring mutation rates in antibiotic resistance. Antimicrob. Agents Chemother. 52 1209–1214. [PMC free article] [PubMed]
- Rosche, W. A., and P. L. Foster, 2000. Determining mutation rates in bacterial populations. Methods 20 4–17. [PMC free article] [PubMed]
- Sarkar, S., W. T. Ma and G. V. H. Sandri, 1992. On fluctuation analysis: a new, simple and efficient method for computing the expected number of mutants. Genetica 85 173–179. [PubMed]
- Stewart, F. M., 1991. Fluctuation analysis: the effect of plating efficiency. Genetica 84 51–55. [PubMed]
- Stewart, F. M., 1994. Fluctuation tests: How reliable are the estimates of mutation rates? Genetics 137 1139–1146. [PMC free article] [PubMed]
- Stewart, F. M., D. M. Gordon and B. R. Levin, 1990. Fluctuation analysis: the probability distribution of the number of mutants under different conditions. Genetics 124 175–185. [PMC free article] [PubMed]
- Wolfram Research, 2005.
*Mathematica Edition*, Version 5.2. Wolfram Research, Champaign, IL. - Zheng, Q., 1999. Progress of a half century in the study of the Luria-Delbrück distribution. Math. Biosci. 162 1–32. [PubMed]
- Zheng, Q., 2002. Statistical and algorithmic methods for fluctuation analysis with SALVADOR as an implementation. Math. Biosci. 176 237–252. [PubMed]

**Genetics Society of America**

## Formats:

- Article |
- PubReader |
- ePub (beta) |
- PDF (641K)

- A modified Luria-Delbrück fluctuation assay for estimating and comparing mutation rates.[Mutat Res. 1996]
*Crane GJ, Thomas SM, Jones ME.**Mutat Res. 1996 Jul 22; 354(2):171-82.* - Mutations and infinity: improved statistical methods for estimating spontaneous rates.[Environ Mol Mutagen. 1996]
*Nádas A, Goncharova EI, Rossman TG.**Environ Mol Mutagen. 1996; 28(2):90-9.* - New algorithms for Luria-Delbrück fluctuation analysis.[Math Biosci. 2005]
*Zheng Q.**Math Biosci. 2005 Aug; 196(2):198-214.* - An estimator of the mutant frequency in assays using transgenic animals.[Mutat Res. 1999]
*Delongchamp RR, Malling HV, Chen JB, Heflich RH.**Mutat Res. 1999 Mar 15; 440(1):101-8.* - Using linked markers to infer the age of a mutation.[Hum Mutat. 2001]
*Rannala B, Bertorelle G.**Hum Mutat. 2001 Aug; 18(2):87-100.*

- Unbiased Estimation of Mutation Rates under Fluctuating Final Counts[PLoS ONE. ]
*Ycart B, Veziris N.**PLoS ONE. 9(7)e101434* - Large population solution of the stochastic Luria-Delbr?ck evolution model[Proceedings of the National Academy of Scie...]
*Kessler DA, Levine H.**Proceedings of the National Academy of Sciences of the United States of America. 2013 Jul 16; 110(29)11682-11687* - The probability of evolutionary rescue: towards a quantitative comparison between theory and evolution experiments[Philosophical Transactions of the Royal Soc...]
*Martin G, Aguilée R, Ramsayer J, Kaltz O, Ronce O.**Philosophical Transactions of the Royal Society B: Biological Sciences. 2013 Jan 19; 368(1610)20120088* - Luria-Delbr?ck Estimation of Turnip Mosaic Virus Mutation Rate In Vivo[Journal of Virology. 2012]
*de la Iglesia F, Martínez F, Hillung J, Cuevas JM, Gerrish PJ, Daròs JA, Elena SF.**Journal of Virology. 2012 Mar; 86(6)3386-3388* - Mutator dynamics in sexual and asexual experimental populations of yeast[BMC Evolutionary Biology. ]
*Raynes Y, Gazzara MR, Sniegowski PD.**BMC Evolutionary Biology. 11158*

- A Simple Formula for Obtaining Markedly Improved Mutation Rate EstimatesA Simple Formula for Obtaining Markedly Improved Mutation Rate EstimatesGenetics. Nov 2008; 180(3)1773PMC

Your browsing activity is empty.

Activity recording is turned off.

See more...