Exploration of Superspreading Events in 2015 MERS-CoV Outbreak in Korea by Branching Process Models

South Korea learned a valuable lesson from the Middle East respiratory syndrome (MERS) coronavirus outbreak in 2015. The 2015 MERS-CoV outbreak in Korea was the largest outbreak outside the Middle East and was characterized as a nosocomial infection and a superspreading event. To assess the characteristics of a superspreading event, we analyze the behaviors and epidemiological features of superspreaders. Furthermore, we employ a branching process model to understand the significantly high level of heterogeneity in generating secondary cases. An existing branching process model (the Lloyd-Smith model) is used to incorporate individual heterogeneity, and the key epidemiological parameters (the reproduction number and the dispersion parameter) are estimated from the empirical transmission tree of the MERS-CoV data. We also investigate the impact of control intervention strategies on the MERS-CoV dynamics of the Lloyd-Smith model. Our results highlight the role of superspreaders in producing a high level of heterogeneity. This indicates that the conditions within hospitals as well as multiple hospital visits were the crucial factors for superspreading events of the 2015 MERS-CoV outbreak.


Introduction
The novel coronavirus emerged in December 2019 and spread to 214 countries with 16,897,243 confirmed cases and 663,470 fatalities as of 30 July 2020 [1]. South Korea was one of the countries that experienced the early stage of the COVID-19 pandemic [2]. In the absence of vaccines and treatments, South Korea implemented and maintained effective interventions such as large-scale epidemiological investigation, rapid diagnosis, social distancing, and prompt clinical classification of severe patients with appropriate medical measures. This was possible because the Korean government and health officials had learned valuable lessons from the Middle East respiratory syndrome coronavirus (MERS-CoV) outbreak in 2015 [3].
The index case (the first infected individual) of the 2015 MERS-CoV outbreak in South Korea was a man who returned from a business trip in the Middle East. The man visited several medical clinics and hospitals because of fever (being infectious and unidentified), causing the rapid spread of MERS-CoV in the hospitals [3,4]. As a result, there were a total of 186 infected cases, including 38 deaths, and this event recorded the largest number of total confirmed cases outside the Arabian Peninsula.
Notably, the Korea Centers for Disease Control and Prevention (KCDC) disclosed information to the public, including epidemiological surveillance, hospitals, case contact tracing, and superspreaders [3]. Therefore, we gathered relevant information from the KCDC website, WHO, and news/media reports [3,4,21].
First, Figure 1 shows the epidemic curves of the 2015 MERS-CoV outbreak according to generations (top panel) and intervention periods (bottom panel). There were four generations in total, and we classified unidentified cases as unknown cases. A total of 28 secondary cases were linked to the index patient in the first generation of the disease, 111 secondary cases were reported for the second generation, 22 cases were identified for the third generation, and one case was reported for the fourth generation. As seen in the bottom panel, we classified four periods according to interventions: period 1 (May 20-29) is the initial intervention period, during which we only included people who shared a room with the index patient or cared for the index patient; period 2 (May 30-June 7) is the second intervention period, during which we included further close contacts; period 3 (June 8-12) is the third intervention period after the government disclosed information regarding affected healthcare facilities on June 7; and period 4 (June 13-21) is the intervention period when the Republic of Korea and the World Health Organization (WHO) jointly announced the outbreak situation and stressed the need for awareness [4].

Superspreading Events
The 2015 MERS-CoV outbreak in Korea is characterized as a "superspreading event" (SSE). In general, these SSEs can be defined by a 20/80 rule, which means that 20% of infectious individuals are responsible for 80% of newly generated infections.
Transmission trees (infection tracing only) for a total of 186 MERS-CoV cases are displayed in Figure 2. The node size is proportional to the number of secondary cases (five superspreaders are identified). Figure 3 displays the distribution of the number of secondary cases (left panel) and the 20/80 rule (right panel) of the 2015 MERS-CoV outbreak. The distribution is based on the data excluding multiple-contact cases (17 patients who were infected by more than one patient) and two unknown cases. A total of 152 patients did not infect anyone, whereas one patient infected 79 people, indicating that the distribution is highly heterogeneous. Furthermore, an outbreak is called an SSE when the cumulative percentage of secondary cases passes to the left of the (20, 80) point, that is, through the dashed area in the right panel [5]. In the 2015 MERS-CoV outbreak, all infections originated from just 16 infected people, of whom five were superspreaders. In other words, 79% (147 patients) of the total patients were infected by approximately 3% (five patients) of the patients. Table 1 lists the characteristics of these five superspreaders [3,22]. Note that although superspreaders #14 and #15 had similar conditions such as age, number of hospitals visited, and duration of exposure, there were enormous differences in the number of secondary infections: patient #14 infected 78 people in B Hospital and 1 person in X Hospital, whereas patient #15 infected only 6 people in C Hospital.
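The 20/80 check described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `share_from_top` and the toy list of secondary-case counts are hypothetical and are not the MERS-CoV data.

```python
def share_from_top(offspring, top_fraction=0.2):
    """Fraction of all secondary cases caused by the most infectious cases.

    offspring: number of secondary cases caused by each case.
    top_fraction: share of cases (ranked by offspring count) to consider.
    """
    if not offspring:
        raise ValueError("offspring list is empty")
    ranked = sorted(offspring, reverse=True)
    n_top = max(1, round(top_fraction * len(ranked)))
    total = sum(ranked)
    return sum(ranked[:n_top]) / total if total else 0.0


# Hypothetical counts for ten cases (illustrative only, not outbreak data).
toy_offspring = [8, 1, 1, 0, 0, 0, 0, 0, 0, 0]
share = share_from_top(toy_offspring, 0.2)
print(f"top 20% of cases caused {share:.0%} of secondary infections")
```

If the returned share is at least 0.8, the cumulative curve passes to the left of the (20, 80) point in the sense used above.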

Nosocomial Infections
The 2015 MERS-CoV outbreak in Korea is characterized as a nosocomial outbreak (infections occurred only in the hospital setting, not in the community). The outbreak mainly occurred in Seoul, Gyeonggi-do, and Daejeon. The index patient visited Hospital A and other hospitals and then had MERS infection confirmed in Hospital B. During his stay in Hospital A, the index case spread MERS to 26 patients, including patients #14, #15, and #16. After contact with the index case, patient #14 stayed in the emergency room of Hospital B and infected 78 people, and patient #15 infected six people in Hospital C. Patient #16, who was infected in Hospital A, went to Daejeon and spread MERS to 13 and 10 patients in Hospitals D and E, respectively. Patient #76, after coming into contact with patient #14 in Hospital B, spread MERS to five and four patients in Hospitals F and G, respectively.
Previous studies have identified multiple hospital visits as a key factor for superspreaders [22,23]. This is critical because visiting several hospitals increases the probability of contact between infectious individuals and other patients. In addition, hospital conditions, such as the density of emergency rooms, are another crucial factor. Figure 4 displays the timeline and hospitals visited by the five superspreaders. The vertical dashed line shows the date on which the index case was confirmed, May 20 [24]. The orange circle indicates the index case, who came into contact with other patients during his stay in room 8014 in A Hospital. Patients #14, #15, and #16 are shown with blue, green, and yellow circles, respectively (upper line). Patient #76, who was infected in B Hospital, is shown with a dark gray circle (bottom line).
As mentioned previously, patients #14 and #15 caused very different numbers of secondary infections despite similar physical conditions. After they left Hospital A, where they were exposed to the index case, patient #14 visited the emergency room of Hospital B and stayed there for approximately 56 h, whereas patient #15 visited the emergency room of Hospital C for about 2 h and was then moved to a ward (isolated) [25]. Furthermore, Hospital B, which patient #14 visited, had an emergency overcrowding index of over 100% [26]. Thus, the condition of hospitals and health-care facilities is also considered an important factor in the spread of 2015 MERS-CoV.
Figure 4. Multiple visits of the five superspreaders are displayed: the index case is shown with an orange circle, patient #14 is shown with a blue circle, patient #15 is shown with a green circle, patient #16 is shown with a yellow circle, and patient #76 is shown with a dark gray circle. The vertical dashed line indicates the date that the index case was confirmed, May 20.

Branching Process Models
In this section, we consider the branching process proposed in previous work [20]. The random variable Z denotes the number of secondary cases caused by each infectious individual. The offspring distribution of Z is modeled by a Poisson process, Z ∼ Poisson(ν). The value of ν for a given individual's infectious history is the expected number of secondary cases they will cause, i.e., their individual reproductive number. Note that ν is an expectation and can take any positive real value, while Z is a non-negative integer (0, 1, 2, 3, ...). Three distinct scenarios for the individual reproductive number are considered, and therefore three candidate models for the offspring distribution are given as
• Branching process model 1 (BP1)
• Branching process model 2 (BP2)
• Lloyd-Smith (LS) branching process model
These models are classified by the distribution of the individual reproductive number ν. BP1 is a generation-based model that neglects individual variation, which means that all individuals have the same reproductive number, ν = R_0. Thus, the offspring distribution is Z ∼ Poisson(R_0). BP2 assumes a homogeneous transmission rate and a constant recovery rate (that is, exponentially distributed infectious periods), so that ν ∼ Exponential(1/R_0), with mean R_0. The offspring distribution is then Z ∼ Geometric(R_0), a geometric distribution with mean R_0. The LS model is a more general formulation in which ν is gamma-distributed with mean R_0 and dispersion parameter k. The offspring distribution is Z ∼ Negative Binomial(R_0, k). Note that in the conventional notation, the negative binomial distribution has success probability p = (1 + R_0/k)^(-1) together with k. When k → ∞, Negative Binomial(R_0, k) becomes Z ∼ Poisson(R_0), and when k = 1, it becomes Z ∼ Geometric(R_0). In the negative binomial distribution, smaller values of k indicate greater heterogeneity in the secondary cases (see more details in [20]). For the LS model, we use the maximum likelihood method to estimate the two model parameters (R_0 and k). Further details can be found elsewhere [20,27,28].
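The three offspring models above can be sketched as Poisson draws whose mean ν is fixed (BP1), exponentially distributed (BP2), or gamma-distributed (LS). The following is an illustrative sketch using only the Python standard library; the function names and the helper `poisson` are our own, and the parameter values are the estimates quoted in the text for the 167-case data set.

```python
import random


def poisson(lam, rng):
    """Draw from Poisson(lam) by counting unit-rate arrivals in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def offspring_bp1(r0, rng):
    """BP1: every individual has nu = R0, so Z ~ Poisson(R0)."""
    return poisson(r0, rng)


def offspring_bp2(r0, rng):
    """BP2: nu is exponential with mean R0, so Z is geometric with mean R0."""
    nu = rng.expovariate(1.0 / r0)  # expovariate takes the rate, 1/mean
    return poisson(nu, rng)


def offspring_ls(r0, k, rng):
    """LS: nu ~ Gamma(shape=k, scale=R0/k), so Z is negative binomial."""
    nu = rng.gammavariate(k, r0 / k)  # gammavariate(shape, scale)
    return poisson(nu, rng)


rng = random.Random(2015)
n = 20000
r0, k = 0.96, 0.063
samples = {
    "BP1": [offspring_bp1(r0, rng) for _ in range(n)],
    "BP2": [offspring_bp2(r0, rng) for _ in range(n)],
    "LS": [offspring_ls(r0, k, rng) for _ in range(n)],
}
for name, zs in samples.items():
    mean = sum(zs) / n
    zero_share = zs.count(0) / n
    print(f"{name}: mean={mean:.2f}, no secondary cases={zero_share:.2f}, max={max(zs)}")
```

All three samplers share the same mean R_0; they differ in how that mean is spread across individuals, which shows up as a larger share of cases with no secondary infections and a larger maximum under the LS model.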
A definition of a superspreader was proposed in [20]. It is based on a Poisson distribution with mean R_0 (the reproductive number), because a Poisson distribution represents stochasticity without individual variation. Lloyd-Smith et al. defined an SSE as any infected individual who infects more than Z^(n) others, where Z^(n) is the nth percentile of the Poisson distribution with mean R_0. That is, in a homogeneous population, a 99th-percentile SSE means any case causing more infections than would occur in 99% of infectious histories. For the 2015 MERS-CoV outbreak, the resulting 99th-percentile threshold is five cases: we define superspreaders as those who caused five or more secondary cases, which identifies five superspreaders. This is consistent with the definition of an SSE used by KCDC (an infector with more than four infectees).
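The percentile-based threshold can be checked numerically. The sketch below assumes the convention that the nth percentile is the smallest integer z with P(Z ≤ z) ≥ n/100; other percentile conventions exist, and the function name `poisson_percentile` is ours. Under this convention the 99th percentile of Poisson(0.96) is 4, so "more than Z^(99)" corresponds to five or more secondary cases, which agrees with the five-case threshold used in the text.

```python
import math


def poisson_percentile(mean, q):
    """Smallest integer z with P(Z <= z) >= q for Z ~ Poisson(mean)."""
    z = 0
    pmf = math.exp(-mean)  # P(Z = 0)
    cdf = pmf
    while cdf < q:
        z += 1
        pmf *= mean / z    # P(Z = z) from P(Z = z - 1)
        cdf += pmf
    return z


z99 = poisson_percentile(0.96, 0.99)
print(f"99th percentile: {z99}; superspreader if secondary cases >= {z99 + 1}")
```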

Parameter Estimation
In this section, we estimate the parameters of the LS model using the maximum likelihood method. As mentioned earlier, heterogeneity in the MERS-CoV outbreak can be described by both the reproduction number, R_0, and the dispersion parameter, k. The reproduction number represents the average number of secondary cases per infected case. The dispersion parameter k quantifies the degree of heterogeneity in the secondary cases. Smaller values of k imply a higher level of heterogeneity in secondary cases.
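For reference, the negative binomial offspring distribution with mean R_0 and dispersion parameter k can be written out explicitly. This is the standard mean-dispersion parameterization, given here as background rather than quoted from the source; the symbols z_i (the observed number of secondary cases of case i) and n (the number of cases) are our notation.

```latex
\begin{align}
P(Z = z) &= \frac{\Gamma(k+z)}{z!\,\Gamma(k)}
            \left(\frac{k}{k+R_0}\right)^{k}
            \left(\frac{R_0}{k+R_0}\right)^{z},
            \qquad z = 0, 1, 2, \dots \\
\mathbb{E}[Z] &= R_0, \qquad
\operatorname{Var}[Z] = R_0\left(1 + \frac{R_0}{k}\right), \\
\ell(R_0, k) &= \sum_{i=1}^{n} \log P(Z = z_i).
\end{align}
```

The variance formula makes the role of k explicit: as k decreases, the variance grows relative to the mean, and as k → ∞ it tends to the Poisson variance R_0.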
We fitted a negative binomial distribution (the LS model) to the number of secondary cases from the empirical transmission tree of the MERS-CoV data (as shown in Figure 2). Here, we consider two scenarios, depending on whether multiple contacts are included among the 186 cases. The first data set has a total of 167 cases (excluding 17 multiple contacts and 2 unknown cases), and the second data set has a total of 186 cases (including all cases). The estimated parameters under these two scenarios are given in Table 2, and Figure 5 presents the corresponding results. For the first data set of 167 cases, the basic reproduction number (R_0 = 0.96) is below 1, while for the second data set of 186 cases the basic reproduction number (R_0 = 1.06) is above 1. Moreover, the dispersion parameter is smaller for the first data set (k = 0.063, using 167 cases) than for the second data set (k = 0.12, using 186 cases). This shows that the estimated parameters are sensitive to the transmission chains of the MERS-CoV outbreak.
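A maximum likelihood fit of this kind can be sketched as follows. This is an illustration under our own assumptions, not the authors' estimation code: it uses the fact that the maximum likelihood estimate of the negative binomial mean is the sample mean, and then maximizes the profile log-likelihood over k with a golden-section search on log k. The function names, search bounds, and the toy counts are hypothetical; the toy counts are not the MERS-CoV data.

```python
import math


def nb_loglik(counts, r0, k):
    """Log-likelihood of counts under Negative Binomial(mean r0, dispersion k)."""
    p = k / (k + r0)  # conventional success probability, (1 + r0/k)^(-1)
    return sum(
        math.lgamma(k + z) - math.lgamma(k) - math.lgamma(z + 1)
        + k * math.log(p) + z * math.log(1.0 - p)
        for z in counts
    )


def fit_negative_binomial(counts, k_lo=1e-3, k_hi=1e3, iters=100):
    """Return (r0_hat, k_hat); k is searched on a log scale in [k_lo, k_hi]."""
    if not counts or sum(counts) == 0:
        raise ValueError("need at least one non-zero count")
    r0_hat = sum(counts) / len(counts)  # MLE of the mean is the sample mean

    def profile(log_k):
        return nb_loglik(counts, r0_hat, math.exp(log_k))

    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(k_lo), math.log(k_hi)
    c, d = b - phi * (b - a), a + phi * (b - a)
    fc, fd = profile(c), profile(d)
    for _ in range(iters):
        if fc > fd:  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - phi * (b - a)
            fc = profile(c)
        else:        # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + phi * (b - a)
            fd = profile(d)
    return r0_hat, math.exp((a + b) / 2.0)


# Hypothetical secondary-case counts (illustrative only).
toy_counts = [0] * 40 + [1] * 5 + [2] * 2 + [6, 20]
r0_hat, k_hat = fit_negative_binomial(toy_counts)
print(f"R0 estimate = {r0_hat:.3f}, k estimate = {k_hat:.3f}")
```

If the data are not overdispersed, the likelihood keeps increasing in k and the search returns a value near the upper bound `k_hi`, which should be read as "effectively Poisson" rather than as a point estimate.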
Our estimate of a subcritical R_0 (the basic reproduction number R_0 = 0.96 is less than 1) is consistent with previous work [8,16,29,30]. These studies also reported that the reproduction number for secondary cases during transmission chains of MERS-CoV in the Middle East has been estimated to lie below the epidemic threshold R_0 = 1 [16,29,30]. Another study of the 2015 MERS-CoV outbreak in South Korea estimated R_0 = 0.91 and k = 0.06 using the LS model [8].
Even though there are slight differences in the parameters estimated from the two data sets, both results indicate that the heterogeneity of the secondary cases caused by an SSE is high. Furthermore, the extinction probability, that is, the probability that the disease eventually dies out, is large in our results, 0.99 (or 0.98), which indicates that the outbreak should end eventually. Indeed, most MERS-CoV outbreaks have ended within a short period of time. Next, we illustrate the results for MERS-CoV incidence using the three branching process models given in the previous section. The R_0 estimated for the LS model is also used for the BP1 and BP2 models so that all three have the same R_0. Figure 6 shows the results of 5000 simulations with the parameters k = 0.063 and R_0 = 0.96. As seen in the upper panels, the LS model produces a wider range of outbreak sizes and outbreak durations than BP1 and BP2. This is because the LS model, which has high heterogeneity, has a greater probability of a large outbreak. The red diamond mark indicates the 2015 MERS-CoV outbreak, which was a fourth-generation outbreak with a total outbreak size of 186 cases. This again confirms that the LS model captures the actual 2015 MERS-CoV outbreak best.
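For a branching process started from a single case, the extinction probability is the smallest root q in [0, 1] of q = g(q), where g is the probability generating function of the offspring distribution; this root is 1 when R_0 ≤ 1 and below 1 otherwise. For the negative binomial, g(s) = (1 + (R_0/k)(1 − s))^(−k). The sketch below solves this by fixed-point iteration from q = 0. It is a textbook calculation under our own assumptions (function names, tolerance), and it need not reproduce how the values reported in the text were obtained.

```python
def nb_pgf(s, r0, k):
    """Probability generating function of Negative Binomial(mean r0, dispersion k)."""
    return (1.0 + (r0 / k) * (1.0 - s)) ** (-k)


def extinction_probability(r0, k, tol=1e-12, max_iter=1_000_000):
    """Smallest fixed point of q = g(q) in [0, 1], found by iterating from q = 0."""
    q = 0.0
    for _ in range(max_iter):
        q_next = nb_pgf(q, r0, k)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


for r0, k in [(0.96, 0.063), (1.06, 0.12)]:
    q = extinction_probability(r0, k)
    print(f"R0={r0}, k={k}: extinction probability = {q:.4f}")
```

Starting the iteration at 0 matters: the sequence g(0), g(g(0)), ... is the probability of extinction by generation 1, 2, ..., so it increases towards the smallest root rather than jumping to the trivial root q = 1.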
The middle panels of Figure 6 show, for each model, the five simulated outbreaks closest to the 2015 MERS-CoV outbreak (five red cross marks), and the bottom panels show the cumulative cases for these outbreaks. This demonstrates that the LS model has the largest peak size and the shortest outbreak duration owing to its highest level of heterogeneity (or superspreading events). This implies that there is a greater chance of a rapid increase in the outbreak size within a short time window (fewer generations). The epidemic outputs of the three models are compared in Table 3. The extinction probabilities of BP1 and LS are slightly below 1 but are almost 1 because R_0 < 1, which is consistent with previous research [8]. This confirms that MERS-CoV showed a significantly high level of heterogeneity in secondary cases due to the five superspreaders.
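A simulation of the kind summarized above can be sketched as a generation-by-generation branching process. This is a minimal illustration, not the authors' simulation code: the function names, the generation and case caps, the random seed, and the summary statistics are our own choices. Passing `k=None` gives Poisson offspring (BP1), `k=1` gives geometric offspring (BP2), and a small k gives the LS model.

```python
import random


def poisson(lam, rng):
    """Draw from Poisson(lam) by counting unit-rate arrivals in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_outbreak(r0, k, rng, max_generations=100, max_cases=10_000):
    """Return (total cases, generations after the index case).

    The run stops at extinction, after max_generations, or once the total
    reaches max_cases (checked at the start of each generation).
    """
    current, total, generations = 1, 1, 0
    while generations < max_generations and total < max_cases:
        new_cases = 0
        for _ in range(current):
            nu = r0 if k is None else rng.gammavariate(k, r0 / k)
            new_cases += poisson(nu, rng)
        if new_cases == 0:
            break
        generations += 1
        total += new_cases
        current = new_cases
    return total, generations


rng = random.Random(186)
n_sims = 5000
for label, k in [("BP1", None), ("BP2", 1.0), ("LS", 0.063)]:
    runs = [simulate_outbreak(0.96, k, rng) for _ in range(n_sims)]
    sizes = sorted(size for size, _ in runs)
    no_spread = sum(1 for size in sizes if size == 1) / n_sims
    p99 = sizes[int(0.99 * n_sims)]
    print(f"{label}: no onward spread={no_spread:.2f}, "
          f"99th percentile size={p99}, max size={sizes[-1]}")
```

Comparing the share of runs with no onward spread against the upper tail of the size distribution is one simple way to see the pattern described in the text: under the LS model most introductions die out immediately, while the rare outbreaks that take off grow quickly within a few generations.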
We present the impact of the basic reproduction number, R_0, and the dispersion parameter, k, on the outbreak size. The left panel of Figure 7 shows the average outbreak size for varying values of R_0 and k. The blue line indicates the total of 186 cases of the 2015 MERS-CoV outbreak. The result shows that the outbreak size increases as the dispersion parameter k decreases. In addition, the influence of k on the outbreak size becomes stronger as R_0 increases. When R_0 = 0.96 and k = 0.063 (red filled circle), the result is the closest to the actual 2015 MERS-CoV outbreak. The right panel of Figure 7 illustrates the results of the three models using R_0 = 0.96 and k = 0.063 from the LS model.

The Effect of Control Measures
We investigate the effects of three control measures on the MERS-CoV transmission dynamics. These control measures were proposed in [20]. Under population-wide control, the infectiousness of every individual is reduced by a factor c, so the individual reproduction number becomes ν_c^pop = (1 − c)ν. Under individual-specific control, a controlled individual causes no secondary cases, that is, Z_c^ind = 0 if the individual is controlled (with control effort c) and Z_c^ind = Z if not controlled. Individual-specific control is further divided into random individual-specific control, in which individuals are controlled at random, and targeting individual-specific control, in which more control effort is directed at the top 20% of individuals than at the bottom 80% (in our simulations, four times more effort is spent on the top 20% than on the bottom 80%). Targeting individual-specific control is aimed particularly at controlling superspreaders (the top 20%). Figures 8-10 illustrate the results of the control strategies for the three models: population-wide control, random individual-specific control, and targeting individual-specific control. To compare the results for different R_0, the simulations used two R_0 values (R_0 = 0.96 and R_0 = 1.06) and k = 0.06; all results are averages over 10,000 simulations of the total outbreak size over 100 generations. In all simulations, BP1 has no targeting individual-specific control because all individuals are identical in BP1, which means that random individual-specific control and targeting individual-specific control are the same.
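The difference between population-wide and random individual-specific control can be sketched at the level of a single individual's offspring draw. This is an illustrative sketch under our reading of the definitions above, with control effort c interpreted as a reduction factor (population-wide) or as the probability of being controlled (individual-specific). Function names are ours, and targeting individual-specific control is deliberately left out because its exact allocation rule is not fully specified here.

```python
import random


def poisson(lam, rng):
    """Draw from Poisson(lam) by counting unit-rate arrivals in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def offspring_population_control(nu, c, rng):
    """Population-wide control: everyone's nu is scaled to (1 - c) * nu."""
    return poisson((1.0 - c) * nu, rng)


def offspring_individual_control(nu, c, rng):
    """Random individual-specific control: with probability c, Z is set to 0."""
    if rng.random() < c:
        return 0
    return poisson(nu, rng)


rng = random.Random(20)
r0, k, c, n = 0.96, 0.063, 0.2, 20000
pop, ind = [], []
for _ in range(n):
    nu = rng.gammavariate(k, r0 / k)  # LS model: gamma-distributed nu
    pop.append(offspring_population_control(nu, c, rng))
    ind.append(offspring_individual_control(nu, c, rng))
print(f"mean offspring, population-wide: {sum(pop) / n:.2f}")
print(f"mean offspring, individual-specific: {sum(ind) / n:.2f}")
```

Both rules reduce the mean number of secondary cases to (1 − c)ν for an individual with reproductive number ν; they differ in how the reduction is distributed, since individual-specific control removes some individuals' transmission entirely while leaving others untouched.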
The left panel of Figure 8 shows the results for BP1; the results of population-wide control and random individual-specific control are similar. For BP2 (the middle panel) and the LS model (the right panel), although there is a slight difference between the models without control (c = 0), the outbreak size under targeting individual-specific control is lower than under the other controls (yellow solid line for R_0 = 1.06 and magenta dotted line for R_0 = 0.96). In particular, although the outbreak size is larger for the higher R_0 (= 1.06), the outbreak sizes for the two R_0 values show no significant difference beyond a control effort of 0.2. Furthermore, the reduction rates for all models remain similar once the control effort reaches 0.2.
Next, Figure 9 illustrates the reduction rates of the three models when the control effort is 0.2. Although there is no significant difference between the reduction rates of the control measures for BP1 when R_0 = 0.96, the reduction rates of targeting individual-specific control (yellow bars) for BP2 and LS are 89% and 94%, respectively, which indicates that targeting control is more effective than random individual-specific control (green bars) and population-wide control (blue bars). The results show that the reduction rate of all models is over 75% when R_0 is 0.96 and over 99% when R_0 is 1.06. Accordingly, targeting control is the most effective control measure even with a 20% control effort.

Discussion
We have clarified the characteristics of an SSE and superspreaders during the 2015 MERS-CoV outbreak in Korea. Our study suggests that although visiting multiple hospitals plays a key role in the characteristics of superspreaders, the condition of hospitals may also be associated with the number of superspreaders and the size of the 2015 MERS-CoV outbreak. Specifically, we compared patients #14 and #15 and found that, despite similar conditions, the significant difference in the number of infections (79 vs. 6) may be due to the length of time they stayed in the emergency room (56 h vs. 2 h).
We also analyzed the 2015 MERS-CoV outbreak with a branching process and presented the effect of control measures. The LS model provides a better fit than the other models (BP1 and BP2). The values of the dispersion parameter k in the LS model are very small, which means that the infectiousness of individuals in the 2015 MERS-CoV outbreak in Korea was highly heterogeneous, providing clear evidence that it was an SSE. Furthermore, the extinction probability is almost 1, which suggests that the 2015 MERS-CoV outbreak in South Korea would die out quickly. The results for the control measures indicate that targeting individual-specific control is the most effective and that even a 20% control effort is effective for all models. Moreover, a delay time of 20 best describes the 2015 MERS-CoV outbreak, and the disclosure of hospital lists occurred about 20 days after the index patient was confirmed. This result suggests that if the Korean government had implemented its interventions later, the 2015 MERS-CoV outbreak would have been about twice as large.
This research has several limitations. First, multiple-contact cases that were infected by several patients were excluded for clarity. Our findings may be different if the dataset included all multiple-contact cases. However, although 19 cases were not included, it may be better to use such partial data than data that are ambiguous and unclear. Second, a spatial structure has not been incorporated into our model. This can be addressed in future research by using a network structure model. Despite these limitations, our study is valuable in that we explore other characteristics of superspreaders and the effect of control measures in the branching process model of the 2015 MERS-CoV outbreak in South Korea. Our results highlight the role of superspreaders in producing a high level of heterogeneity. This indicates that the conditions within hospitals as well as multiple hospital visits were the crucial factors for the highly heterogeneous superspreading events of the 2015 MERS-CoV outbreak. Therefore, public health officials should take these factors into account in future intervention strategies for emerging infectious diseases.