- PLoS ONE
- v.2(3); 2007
- PMC1804098

# Theory versus Data: How to Calculate R_{0}?

^{#}Contributed equally.

Wrote the paper: SB RB RV. Other: Performed computer simulations: RB RV.

## Abstract

To predict the potential severity of outbreaks of infectious diseases such as SARS, HIV, TB and smallpox, a summary parameter, the basic reproduction number R_{0}, is generally calculated from a population-level model. R_{0} specifies the average number of secondary infections caused by one infected individual during his/her entire infectious period at the start of an outbreak. R_{0} is used to assess the severity of the outbreak, as well as the strength of the medical and/or behavioral interventions necessary for control. Conventionally, it is assumed that if R_{0}>1 the outbreak generates an epidemic, and if R_{0}<1 the outbreak becomes extinct. Here, we use computational and analytical methods to calculate the average number of secondary infections and to show that it does not necessarily represent an epidemic threshold parameter (as it has been generally assumed). Previously we have constructed a new type of individual-level model (ILM) and linked it with a population-level model. Our ILM generates the same temporal incidence and prevalence patterns as the population-level model; we use our ILM to directly calculate the average number of secondary infections (i.e., R_{0}). Surprisingly, we find that this value of R_{0} calculated from the ILM is very different from the epidemic threshold calculated from the population-level model. This occurs because many different individual-level processes can generate the same incidence and prevalence patterns. We show that obtaining R_{0} from empirical contact tracing data collected by epidemiologists and using this R_{0} as a threshold parameter for a population-level model could produce extremely misleading estimates of the infectiousness of the pathogen, the severity of an outbreak, and the strength of the medical and/or behavioral interventions necessary for control.

## Introduction

In epidemiology, it is essential to quantify the severity of actual (or potential) outbreaks of infectious diseases such as SARS [1], [2], HIV [3], TB [4], and smallpox [5]. The standard procedure is to calculate a parameter called *the basic reproduction number* (R_{0}) that characterizes the potential of an outbreak to cause an epidemic. R_{0} has been extensively used to assess transmissibility of pathogens, severity of outbreaks, and epidemiological control [1]–[6]. The established definition of R_{0}, as phrased by Anderson and May [6], is “*the average number of secondary infections produced when one infected individual is introduced into a host population where everyone is susceptible*”. They have stated that “*If R_{0} is greater than one then the outbreak will lead to an epidemic, and if R_{0} is less than one then the outbreak will become extinct*” [6]; thus they have assumed that R_{0} is a threshold parameter that establishes whether an outbreak yields an epidemic or not. Here we establish that the average number of secondary infections (i.e., R_{0}) is not always an epidemic threshold parameter.

Epidemiologists calculate R_{0} using individual-level contact tracing data obtained at the onset of the epidemic. Once an individual is diagnosed, his/her contacts are traced and tested. R_{0} is then computed by averaging over the number of secondary cases of many diagnosed individuals. This approach is based upon the definition of R_{0}, but it does not ensure that the calculated R_{0} is also an epidemic threshold parameter.
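The averaging step described above can be illustrated with a minimal sketch. The function name `mean_secondary_cases`, the case labels, and the toy transmission pairs are all hypothetical and serve only to show the arithmetic; real contact tracing data would need additional care (for example, incompletely traced cases).

```python
from collections import Counter

def mean_secondary_cases(cases, transmissions):
    """Average number of secondary cases per traced case.

    cases: ids of diagnosed individuals whose contacts were traced.
    transmissions: (infector, infectee) pairs found by tracing.
    """
    cases = list(cases)
    if not cases:
        raise ValueError("need at least one traced case")
    # Count how many secondary cases are attributed to each infector.
    counts = Counter(infector for infector, _ in transmissions)
    # Cases that infected nobody contribute zero to the average.
    return sum(counts[c] for c in cases) / len(cases)

# Hypothetical toy data: A infected B and C, B infected D.
traced = ["A", "B", "C", "D"]
pairs = [("A", "B"), ("A", "C"), ("B", "D")]
r0_estimate = mean_secondary_cases(traced, pairs)
print(r0_estimate)  # 0.75
```

Nothing in this calculation refers to a population-level model, which is why the resulting average is not automatically an epidemic threshold.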

Another approach (which is more commonly used) is to obtain R_{0} from population-level data, namely cumulative incidence data. Making certain individual-level modeling assumptions (e.g., the mass-action principle of infectious spread, time independent infection rates, etc.), theorists construct models (typically) based on Ordinary Differential Equations (ODEs) which describe the dynamics of the expected population size in different disease stages without tracking individuals. It is very important to note that the individual-level modeling assumptions cannot be verified using population-level data (i.e., they remain hypothetical). ODE models are formulated in terms of disease transmissibility and progression rates at the population level. These parameters are obtained by fitting the model to population-level data; their relation to the individual-level processes may be quite complex and is generally unknown. Bifurcation analysis of the ODE model yields a threshold parameter [7] that signals the epidemic as indicated by Anderson and May [6] and is formulated in terms of the population-level parameters. This threshold parameter is not usually checked against the value of R_{0} that has been calculated from contact tracing data.
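As a rough illustration of the population-level route, the sketch below fits an exponential growth rate to early prevalence data by least squares on the logarithm. It assumes the simplest linear form *dI/dt=(β-μ)I* discussed in the Methods, uses synthetic noise-free data, and treats *μ* as known from elsewhere; these are assumptions made for the example, not the fitting procedure of any particular study. The function names are hypothetical.

```python
import math

def fit_growth_rate(times, infected):
    """Least-squares slope of log(infected) against time."""
    logs = [math.log(y) for y in infected]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    return sxy / sxx

def threshold_from_growth(r, mu):
    """beta/mu for dI/dt = (beta - mu) * I, given the growth rate r = beta - mu."""
    return (r + mu) / mu

# Synthetic data generated with beta = 1.5 and mu = 1.0 (illustrative values only).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
infected = [math.exp(0.5 * t) for t in times]

r = fit_growth_rate(times, infected)
threshold = threshold_from_growth(r, mu=1.0)  # mu is assumed known here
print(r, threshold)
```

The growth rate alone identifies only the difference *β-μ*, so a second piece of information (here an assumed *μ*) is needed before a threshold value can be quoted.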

The individual-level and the population-level approaches may produce very different numbers as the first calculates the value of R_{0}, whilst the second calculates the value of a threshold parameter. The question of whether the R_{0} obtained by calculating the average number of secondary infections matches the threshold parameter obtained from fitting the epidemiological model to population-level data has been previously studied [8], [9]. In these two papers, the authors show that R_{0} values obtained from different individual-level models (ILMs) do not necessarily agree with those obtained from mean-field ODE models. However, in order to make this point, the modelers assume that the individual-level transmission dynamics occur on a *social contact network* with a structure that is different from the all-to-all network assumed by ODE models. An infected individual can only infect his/her neighbors in the network, who represent a small fraction of the total population. Thus, the R_{0} mismatch can be attributed to the model mismatch. In contrast, in our ILMs, we preserve the assumption that the contact network is all-to-all. However, our research focuses on the *transmission network*. This network is embedded in the social contact network and forms over time, as the disease spreads, by tracking who infected whom. We analyze two distinct ways in which the transmission network can be realized and directly compute R_{0}. We thus discuss two distinct ILMs whose prevalence and incidence can be described by an ODE model with an established threshold parameter. We calculate their R_{0} values through the definition and then compare these values with the epidemic threshold parameter. Our results address the question of whether or not an R_{0} (i.e., an average number of secondary infections) can be assigned to an ODE model (which only provides a population-level description of disease propagation) without having any knowledge of the underlying disease transmission network.

## Methods

A simple ODE model is the Susceptible-Infected (*SI*) model given by *dS/dt=π-βIS/(S+I)* and *dI/dt=βIS/(S+I)-μI*, where *β* and *μ* are, respectively, the inflow and the outflow of infectious individuals per infectious capita. We apply this model at disease invasion when virtually everyone is susceptible (i.e., *S/(S+I)* is approximately 1) and obtain *dI/dt=βI-μI*. The threshold parameter for the reduced model is *β/μ*; if *β/μ*>1 an outbreak develops into an epidemic, and if *β/μ*<1 an outbreak goes extinct. It is important to note that *β* and *μ* are obtained from fitting the model to population-level data, with no clear association to the causal individual-level processes. An individual-level model that is compatible with these dynamics is a branching process; see Fig. 1 and Mathematical Details S1. In this context, *β* is interpreted as the infection rate of an individual and *μ* is the recovery rate of an individual. In this branching process, an individual is expected to cause *β/μ* secondary infections, which is the R_{0} of this ILM. In this case, the average number of secondary infections R_{0}=*β/μ* is also a threshold parameter of the population-level dynamics.
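The value *β/μ* for the branching process can be checked numerically with a short Monte Carlo sketch. It assumes the standard continuous-time reading of the text: each individual stays infectious for an exponentially distributed time with mean *1/μ* and causes infections as a Poisson process with rate *β* during that time, so the expected count is *β·(1/μ)*. The function names and the parameter values (*β*=1.5, *μ*=1.0) are illustrative choices, not taken from the paper.

```python
import random

def sample_secondary_infections(beta, mu, rng):
    """Secondary infections caused by one individual in the branching process."""
    infectious_period = rng.expovariate(mu)  # exponential, mean 1/mu
    count = 0
    t = rng.expovariate(beta)  # time of the first infection event
    while t < infectious_period:
        count += 1
        t += rng.expovariate(beta)  # waiting time to the next infection event
    return count

def estimate_r0(beta, mu, n_individuals, seed=0):
    """Average number of secondary infections over many independent individuals."""
    rng = random.Random(seed)
    total = sum(sample_secondary_infections(beta, mu, rng)
                for _ in range(n_individuals))
    return total / n_individuals

estimate = estimate_r0(beta=1.5, mu=1.0, n_individuals=20000)
print(estimate)  # should be close to beta/mu = 1.5
```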

However, the branching process is not the only possible ILM that is compatible with the ODE model. Recently, we have shown that other plausible ILMs can be constructed [10] that yield the same ODE dynamics as the *SI* model at disease invasion. We have constructed a new class of ILMs [10]–[12]; see Fig. 2 and Mathematical Details S1. Since our example ILM generates the same prevalence and incidence as the *SI* ODE model (Fig. 3A), it would be expected, on the basis of conventional wisdom, to generate the same R_{0}. Starting from one infected individual, our simulations integrated the ILM and kept track of the number of secondary infections caused by each individual in the infectious and in the recovered pools. The dynamics were integrated to a certain final time and the collected data were stratified by date of infection. R_{0} was calculated as the average number of secondary cases generated by infectious individuals, according to the standard definition of Anderson and May [6]. This procedure ensures that each individual included in the calculation of R_{0} is no longer infectious and that there is no right censoring (see Mathematical Details S1). More importantly, it emulates the process of obtaining an R_{0} value from real-world contact tracing data.
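The bookkeeping described above (track who infected whom, stop at a final time, stratify by date of infection, and average only over individuals who are no longer infectious) can be sketched for the branching process alone. The new ILM is specified in [10] and Mathematical Details S1 and is not reproduced here. The sketch below uses a Gillespie-style event simulation; the function names, parameter values, bin width, and number of replicate runs are assumptions made for illustration and are not the authors' code.

```python
import random

def simulate_branching(beta, mu, t_final, rng, max_infectious=100000):
    """Simulate one outbreak; return (infection_time, n_secondary, recovered) per individual."""
    infection_time = [0.0]  # the index case is infected at time 0
    secondary = [0]
    recovered = [False]
    infectious = [0]        # ids of currently infectious individuals
    t = 0.0
    while infectious and len(infectious) < max_infectious:
        # Waiting time to the next event; total rate is (beta + mu) per infectious individual.
        t += rng.expovariate((beta + mu) * len(infectious))
        if t > t_final:
            break
        pos = rng.randrange(len(infectious))
        who = infectious[pos]
        if rng.random() < beta / (beta + mu):
            # Infection event: credit the infector and add the new case.
            secondary[who] += 1
            infection_time.append(t)
            secondary.append(0)
            recovered.append(False)
            infectious.append(len(infection_time) - 1)
        else:
            # Recovery event: remove the individual from the infectious pool.
            recovered[who] = True
            infectious[pos] = infectious[-1]
            infectious.pop()
    return list(zip(infection_time, secondary, recovered))

def r0_by_infection_date(records, bin_width):
    """Mean secondary infections of recovered individuals, binned by infection date."""
    sums, counts = {}, {}
    for t_inf, n_sec, rec in records:
        if not rec:
            continue  # still infectious at the final time: excluded
        b = int(t_inf // bin_width)
        sums[b] = sums.get(b, 0) + n_sec
        counts[b] = counts.get(b, 0) + 1
    return {b: sums[b] / counts[b] for b in sorted(counts)}

# Pool many replicate outbreaks (illustrative parameter values).
rng = random.Random(1)
records = []
for _ in range(200):
    records.extend(simulate_branching(1.5, 1.0, t_final=8.0, rng=rng))
averages = r0_by_infection_date(records, bin_width=1.0)
print(averages)
```

In a sketch like this, bins close to the final time contain only individuals who happened to recover quickly, so their averages tend to be biased low. The early bins, in which nearly everyone has already recovered, are the ones to compare with *β/μ*.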

## Results

The results (black dots) of the ILM simulation are presented in Fig. 3. For comparison with these results, we present the results (open circles) of a similar simulation for the branching process. The prevalence results for the branching process and the ILM agree very well; see Fig. 3A. For the branching process, R_{0} yields the expected value that agrees with the threshold parameter of the *SI* ODE model; see Fig. 3B. Surprisingly, for our ILM the graph of R_{0} versus the date of infection plateaus at a lower value than that for the branching process. It is thus evident, as supported by our numerics, that two individual-level models having exactly the same expectations of the corresponding population-level variables (i.e., incidence and prevalence) may yield different R_{0} values (as given by the definition). In the case of our second ILM (see Fig. 2), R_{0} is not the threshold parameter of the *SI* ODE model.

## Discussion

Our results have significant consequences for understanding the concept of R_{0}. We explicitly show that certain population-level dynamics, theoretically specified by an ODE model, can be the result of many distinct ILMs. We further demonstrate that the R_{0} obtained from the ILM, by applying the definition of Anderson and May [6], may be different from the epidemic threshold parameter provided by the ODE model. Therefore, population-level predictions based upon an ODE model that use the R_{0} value found by contact tracing as a threshold parameter may be inaccurate.

Our novel results have significant implications for understanding the dynamics of outbreaks of infectious diseases, particularly for the biological understanding of the transmission dynamics of the pathogen, estimating the severity of outbreaks, making health policy decisions, and designing epidemic control strategies. We have shown that the value of R_{0} may not be an accurate measure of the severity of an outbreak since R_{0} may fail to represent an epidemic threshold parameter. Thus, measuring R_{0} through contact tracing (as generally occurs during an outbreak investigation) may not help in predicting the severity of the outbreak and may not be a useful measure for determining the strength of the necessary control interventions. Only an epidemic threshold parameter can be used to design control strategies. This parameter can be obtained by fitting an ODE model to population-level data, as mentioned above, and will signal epidemic growth whether or not it is equal to the average number of secondary infections. However, obtaining an R_{0} value via contact tracing can be very useful in conjunction with population-level epidemic data to understand the possible transmission mechanisms of the epidemic at the individual level. We thus suggest that the role of R_{0} should be more carefully considered, and that a reevaluation of the role of R_{0} may lead to the development of more effective control strategies.

## Supporting Information

#### Mathematical Details S1

Here we give more details and references about the individual-level models presented in the main text. We also briefly discuss how the concept of right censoring manifests in our simulations.

(0.05 MB PDF)

## Acknowledgments

We thank Virginie Supervie and Justin Okano for stimulating and helpful discussions during the course of this research. We also thank Tiffany Head for assistance with the figures.

## Footnotes

**Competing Interests: **The authors have declared that no competing interests exist.

**Funding:** The authors gratefully acknowledge financial support from NIH (RO1 AI041935). The funders had no role in the study or in the preparation of the manuscript.

## References

_{0}. Journal of Theoretical Biology. 2000;203:51–61. [PubMed]

**Public Library of Science**
