Modelling the potential role of super spreaders on COVID-19 transmission dynamics

Superspreading phenomenon has been observed in many infectious diseases and contributes significantly to public health burden in many countries. Superspreading events have recently been reported in the transmission of the COVID-19 pandemic. The present study uses a set of nine ordinary differential equations to investigate the impact of superspreading on COVID-19 dynamics. The model developed in this study addresses the heterogeineity in infectiousness by taking into account two forms of transmission rate functions for superspreaders based on clinical (infectivity level) and social or environmental (contact level). The basic reproduction number has been derived and the contribution of each infectious compartment towards the generation of new COVID-19 cases is ascertained. Data fitting was performed and parameter values were estimated within plausible ranges. Numerical simulations performed suggest that control measures that decrease the effective contact radius and increase the transmission rate exponent will be greatly beneficial in the control of COVID-19 in the presence of superspreading phenomen


Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-COV-2) disease that originated in the Wuhan City of China is continuing to spread in most parts of the globe. COVID-19 has devastated many communities and countries and to date, it has resulted in more deaths than those caused by SARS and MERS combined. One of the major factor driving this pandemic is the super-spreading (SS) phenomenon. Wong et al. (2015) defined SS as an event that occurs when one individual spreads the pathogen to a disproportionate number of contacted individuals. The U.S. Center for Disease Control and Prevention categorised super spreaders as individuals that are responsible for greater than 10 secondary infections Mkhatshwa and Mummert (2010). For the case of SARS, a SS event was classified as one that occurs when an individual infects ≥ 8 individuals Shen et al. (2004). In a study investigating the impact of socio-environmental factors on TB transmission, Issarow et al. (2018) defined super spreaders as individuals exhaling ≥ 10 infectious respirable particles hr −1 . However, concern has been raised about the largely perceived meaning of this term about the current COVID-19 pandemic especially in social media which has been viewed to be problematic Cave (1984). A call has been made to provide a more concise scientific definition that takes into account the behavioural, environmental and biological factors linked to SS to avoid stigmatisation of those infected thereby deterring potential control efforts of the pandemic Cave (1984). The need to counteract super spreading events becomes paramount considering that SS is associated with an exponential increase in the early stages of the infection and subsequent sustained community transmissions Frieden (2020); Wong et al. (2015); Lau et al. (2017); Walling and Teunis (2004); Yin et al. (2020). SS has also been reported for other infectious diseases such as HIV, Typhoid, Ebola virus disease, gonorrhoea, tuberculosis (TB), measles, mumps and rubella, smallpox, monkey-pox, pneumonic plague, flu pandemic, SARS and MERS-COV disease Hui (2016); Stein (2011); Melsew et al. (2020); Frieden (2020); Mkhatshwa and Mummert (2010); Lloyd-smith et al. (2005). SS for typhoid was noted in the early 20th century where a cook Mary Mallon infected more than 50 individuals Kim et al. (2016.); Hui (2016); Stein (2011); Marineli et al. (2013). For the tuberculosis disease, SS was reported in a study by Riley et al. Riley et al. (1962) which showed that amongst many patients with smear-positive and cavitary tuberculosis only 3 out of 77 patients were responsible for 73% of tuberculosis infections. SS during the West Africa Ebola virus disease outbreak was reported to have sustained the epidemic with 3% of infected individuals associated with 61% of infections Lau et al. (2017). SS events were also reported in the spread of gonorrhoea where few individuals had higher infectivity as compared to others Hethcote and Yorke (1984). SS events have also been reported in the spread of animal infections in buffaloes, deer mice and cattle Maeno (2011); Mkhatshwa and Mummert (2010). SS has not been limited to the spread of infections only. In a study by Yin et al., Yin et al. (2020), the authors described some information super spreaders termed 'opinion leaders' who were responsible for the enormous propagation of important information on public health interventions for COVID-19 in Wuhan City, Hubei Province of China. The authors developed a mathematical model to examine the dynamics of information propagation in the Chinese Sina-microblog through the application of an opinion-leader susceptible-forwarding-immune (OL-SFI) model.
Information about the epidemiology of the COVID-19 disease is currently limited but there are reports of super spreading events linked to the disease Frieden (2020); Wang et al. (2020). In South Korea, one of the members (patient 31) of the Shincheonji church infected several other members resulting in an increased number of recorded deaths attributable to this church's gathering kasulis (2020). A 70-year old preacher who later succumbed to the COVID-19 disease was suspected to have led to the death of around 30 people and quarantine of approximately 40, 000 individuals in the state of Punjab, India BBC news (2020). In Zimbabwe, some reports indicated that some escapees who were unaware of their COVID-19 status infected their household members. This prompted community members to report any suspected cases of escapees in their territory. A variety of factors thought to drive the SS phenomenon include environmental, host, pathogen, and behavioural factors Stein (2011). Understanding the specific impact of these factors can facilitate informed prevention and control strategies. The risk of SS differs across regions based on the community's or country's cultural and socio-economic attributes Frieden (2020).
Various mathematical techniques and tools have been used for studying the SS phenomenon in infectious diseases. These include, Ordinary differential equations Kemper (1980); Mkhatshwa and Mummert (2010); Ndairou et al. (2020), stochastic differential equations Maeno (2011), agent-based modelling (ABM) Yunhwan et al. (2018), negative binomial distribution Lloyd-smith et al. (2005), branching process models in discrete and continuous time generations James et al. (2007); Garske and Rhodes (2008), contact network models Aihaira et al. (2004);zhang et al. (2019). We describe the pioneer ordinary differential equations models developed to study SS events. Kemper (1980) formulated an SI 1 I 2 R model with S(t) representing the susceptible population, I 1 (t) representing the first class of infectives, I 2 (t) representing the second class of infectives and R(t) representing the class of the removed individuals. It was suggested that this model can be suitable to model SS events by assuming different transmission rates r 1 and r 2 for I 1 and I 2 respectively, where the presence of SS events is captured by using higher than normal transmission rates. Mkhatshwa and Mummert (2010) modified the SI 1 I 2 R model in Kemper (1980) to come up with an SIP R model for the SARS outbreak where S(t) are the susceptibles, I(t) are the regularly infected individuals, P (t) are the super spreader individuals and R(t) are the recovered individuals. The SIP R model by Mkhatshwa and Mummert (2010) assumed the same transmission rate β for both the I(t) and P (t) classes but applied the 20/80 rule in Woolhouse et al. (1997) by apportioning b the probability that a newly infected will be regular and the remainder (1 − b) the probability that a newly infected will be a super spreader. Also, the authors captured SS events by assuming that the overall time spent outside quarantine and isolation for regularly infected individuals (ν 1 days) is less than that of super spreader individuals (ν 2 days), that is, ν 1 < ν 2 . This prolonged time outside quarantine and isolation for a super spreader individual means that a superspreader will have increased chances to have many contacts with susceptibles as compared to the regularly infected individual. One of the limitations of the SIP R model involved the assumption of the same transmission rate for both the regularly infected individuals and the super spreader individuals.
In this paper, we extend the work done on modelling the SS phenomenon. We adopt the transmission rate functions developed for the agent-based model in Yunhwan et al. (2018) that provides a general mathematical tool for outbreaks with SS events. The model developed in this study takes into account the heterogeneity of transmission among infected individuals not considered in previous SS models. Heterogeneity in infectiousness has . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2021. ; https://doi.org/10.1101/2021.08.30.21262341 doi: medRxiv preprint been classified in two forms namely; clinical (infectivity level) and social or environmental (contact level) Yunhwan et al. (2018). Understanding the role of these forms of transmission will be beneficial in informing public health policy. The model also takes into account the most effective documented control measures such as early discovery through contact tracing and diagnosis followed by intervention in the form of isolation and treatment Wong et al. (2015). Inclusion of such important control measures in our model will enable assessment of their relative contribution in curtailing SS.
The manuscript is organised as follows: In Section 2, a mathematical model that describes the effect of super-spreaders on the COVID-19 pandemic is formulated and its mathematical analysis given in Section 3. In Section 4, we present the numerical simulations confirming some of the analytical results. Finally, a brief discussion is given in Section 5.

Model formulation
To study the spread of COVID-19 driven by SS, we consider the human population subdivided into nine distinct compartments representing different infection statuses with respect to the disease, (see Table 1 for the description of the state variables and parameters). We consider transmission rates for normal or regular spreaders (λ n (t)) and super spreaders (λ pi (t), i = 1, 2). We take into account two forms of transmission rates for super spreaders based on infectivity level and frequency of contacts. The transmission rate for regular spreaders is given by Here, r 0 is the effective contact radius and b is the exponent satisfying b > 1. Take note that β n (r) is a decreasing function. We define r as the distance between individuals in the I, A or Q status and the susceptible individuals S. The parameter ε models the role of governmental action that includes use of sanitisers, wearing of face masks, social distancing, lockdowns etc. We now define the transmission rate function for super spreaders class P 1 who are assumed to have higher infectivity level as follows: Take note that for this case β p1 (r) takes the form of a constant function with respect to the distance r.
We now define the transmission rate function for super spreaders class P 2 who are assumed to have more contacts than regular spreaders as follows: . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2021. ; https://doi.org/10.1101/2021.08.30.21262341 doi: medRxiv preprint Individuals at risk of being infected (Susceptibles) E Individuals exposed to COVID-19 but not infectious (Exposed) A Regularly infected asymptomatic individuals I Regularly infected symptomatic individuals P 1 Super-spreaders with higher infectivity levels than regular spreaders P 2 Super-spreaders with more contacts than regular spreaders Transmission rate exponent θ Longer effective contact radius ε Role of governmental action β 0 Transmission rate parameter Infectious period for individuals in class I γ −1 Infectious period for individuals in class A δ −1 Infectious period for individuals in class P 1 α −1 Infectious period for individuals in class P 2 m 1 Proportion of individuals in class I who recover m 2 Proportion of individuals in class I who are quarantined f Proportion of individuals in class A who are quarantined q 1 Proportion of individuals in class P 1 who recover q 2 Proportion of individuals in class P 1 who are quarantined h 1 Proportion of individuals in class P 2 who recover h 2 Proportion of individuals in class P 2 who are quarantined φ −1 Average period one stays in quarantine g Proportion of individuals in class Q who recover r Distance between individuals in the I, A or Q status and the S individuals k 1 Proportion of E individuals joining the I class at the end of the incubation period k 2 Proportion of E individuals joining the A class at the end of the incubation period k 3 Proportion of E individuals joining the P 2 class at the end of the incubation period The transmission rate function for super spreaders class P 2 is modelled by considering a longer effective contact radius where r θ = θr 0 with θ > 1. This longer effective contact radius can be regarded as an equivalent of staying outside quarantine and isolation for a longer time as compared to normal or regular spreaders. The schematic diagram is given below. The system of non-linear ordinary differential . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2021. ; https://doi.org/10.1101/2021.08.30.21262341 doi: medRxiv preprint equations is obtained upon combining details given in the model description above together with model flow diagram (Figure 1), as follows: where all the model parameters are considered to be non-negative.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Invariant region
Let Γ be the biologically meaningful invariant region for system (1) which is contained in R 9 + . We prove that the solutions of the system exist and are bounded. This is ascertained by using the following theorem on the existence of solutions stated as follows.
Theorem 1: The feasible region defined by Γ ∈ R 9 + , given as is positively invariant with respect to system (1) with initial conditions (2).
Proof: The total change in human population obtained by summing all the nine equations of system (1) gives a constant population Thus, the lim t→∞ sup N ≤ N 0 . Clearly, we see that Γ is positively invariant with respect to system (1). Hence, all the solutions of system (1) exist, and are biologically meaningful, bounded, and remain in Γ for t ≥ 0.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2021. ; https://doi.org/10. 1101/2021 We determine if equation (3) satisfies the conditions given by Lemma 1 (see Yang et al. (1996)) by equating all the state variables for each class to zero which gives: Thus, we have that the solutions of system (1) are positive.

The model reproduction number and the steady states
Employing the next generation matrix approach by van den Driessche and Watmough van Driessche and Watmough (2002) we obtain in which Φ 1 = −Eσ (−κ 1 − κ 2 − κ 3 + 1) + P 1 δ and Φ 2 = −Af γ − Im 2 ρ − P 1 δq 2 − P 2 αh 2 + Qφ. Computing the Jacobian of F and V at the disease-free equilibrium leads to where the model basic reproduction number is the given as R 0 = ρ(F V −1 ), with Here, R I , R A , R Q , R P1 and R P2 represent the contributions of classes I, A, Q, P 1 and P 2 respectively.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Parameter estimation
In order to estimate model parameters, we use data from a high population density country where SS has been reported as a result of high human to human social contact rate. We use the data for Delhi, one of the most affected provinces in India as a result of the Tablighi Jamaat event which took place from the 1st of March 2020 up to the 21st of March 2020. This event has since been regarded as a superspreader event since many coronavirus cases across India have been attributed to attendees of this event who tested positive for coronavirus Wikipedia (2020). As of 31st of March 2020, India had about 24 people who tested positive among the SS attendees, 300 showed symptoms and 16 were quarantined Wikipedia (2020). According to the Union Health Ministry of India, as of the 18th of April 2020 the total number of confirmed cases were 4291, translating to 29.8% of India's total number of cases. These were all linked to the SS event which took place from the 1st of March 2020 up to the 21st of March 2020 Indiatimes (2020).
We tabulate Delhi's COVID-19 cumulative cases ( Table 2). The data used here was taken from Delhidata (2020). We used the data from within the time period 25 March 2020 up to 3 May 2020. This period covers the initial 21 day lockdown instituted on the 25th of March 2020 and later extended to the 17th of May 2020 Biswas et al. (2020). Thus, the choice of these dates is because the impact of the disease spread was felt in India immediately after the SS event which ended on the 21st of March 2020.

Model fitting
We fit system (1) to epidemiological data for Delhi from the 25th of March 2020 up to the 29th of April 2020. We make use of the Least Square (LS) fitting method which is a . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2021. ; https://doi.org/10.1101/2021.08.30.21262341 doi: medRxiv preprint mathematical regression analysis tool used in determining the best fit for any given data set.
The LS method provides the overall rationale for the placement of the best fit on the data set under study by giving the smallest sum of squares know as the variance Balcha (2020). One of LS's advantages is that it allows modelers to estimate unknown parameter values by giving a lower bound and an upper bound from which the set of estimated parameter values that produce the best fit is obtained. Thus, in this work, we validate system (1) by fitting it to the data in Table 2 using the LS regression technique. This technique minimises the sum of the squared residuals given by: Here, Ψ represents all the model parameters in Table 3, n is the total number of data points used for the fitting process, p ti (Ψ) andp ti are the cumulative COVID-19 cases of infected individuals by our model predictions and the actual reported data for the time frame under investigation respectively.  Table 2. This period covers the whole of the 1st lockdown period which was later on extended. The blue circles indicate the actual data and the solid red line indicates the model fit to the data.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Numerical results
We use the estimated initial conditions for the exposed individuals E(0), regularly infected asymptomatic class A(0), the regularly infected symptomatic class I(0) and the quarantined/hospitalised individuals Q(0) in Biswas et al. (2020). The data used in Biswas et al. Biswas et al. (2020) was for the whole of India. We choose the estimated initial conditions from the second data set considered in Biswas et al. (2020) which covers the period 25th of March 2020 up to the 24th of April 2020 as this was observed to be the best estimator with least standard error unlike the first data set covering the period 1st of March 2020 up to the 24th of March 2020. In order to apply these initial conditions in the present study, we consider Delhi's population as a fraction of the total population of India. Thus, we obtain the following initial conditions: S(0) = 10927986, E(0) = 0.00808 × 1131, I(0) = 0.00808 × 482, A(0) = 0.00808 × 506, P 1 (0) = 0.00808 × 300, P 2 (0) = 0.00808 × 400, Q(0) = 0.00808 × 647, R(0) = 0, D(0) = 0. The remaining parameter values are summarised in Table 3 below. Thus, we present numerical simulations of system (1) which was done using the set of parameter values given in Table 3. Some of the parameter values were taken from published literature and some were carefully estimated using the data fitting procedure. The numerical simulations performed in this section are for illustrative purposes only.
Data fit g 0.7166 0 − 1 Data fit . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Contour plots of parameter on R 0
Here, we plot the parameters b, ε and θ on the basic reproduction number, R 0 . In figures 4(a) and 4(b), we observe that an increase in the values of b and ε results in an decrease in the value of R 0 . Further, the simulation results in figure 4(b) illustrate that prolonging the impact of the effective contact radius in combination with increasing the value of the parameter b do not impact the value of R 0 .

Effects of parameters on R 0
In this subsection, we show the effects of parameters r, θ, b and ε on the basic reproduction number. We simulate these parameters as functions of R 0 . In figure 5(a), we observe that an increase in both parameters r and θ increases the value of R 0 while figure 5(b) shows that an increase in the values of the parameters ε and b reduces the value of R 0 . Thus, measures in the form of governmental role actions such as national lockdowns, enforcing effective mask usage, etc are beneficial in curtailing the spread of the disease.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2021. ; https://doi.org/10.1101/2021.08.30.21262341 doi: medRxiv preprint 4.6 Simulation results of parameters on A, I, P 1 and P 2 Figures 6-9 illustrate the effect of varying parameters r, θ, b and ε on the population classes A(t), I(t), P 1 (t) and P 2 (t). An arrow pointing upwards indicate that as the parameter increases then the number of individuals increases whereas an arrow pointing downwards indicate that as the parameter increases then the number of individuals decreases. It can be observed from figures 6 and 7 that an increase in parameters r and θ leads to an increase in the number of individuals in compartments A, I, P 1 (t) and P 2 (t). Figures 8 and 9 show that an increase in parameters b and ε results in a decrease in the number of individuals in compartments A, I, P 1 (t) and P 2 (t). Thus, efforts targeted at reducing parameters r and θ and increasing parameters b and ε will be beneficial in controlling the COVID-19 pandemic. For instance, a reduction in the value of parameter r can be achieved through enforcement of measures such as social distancing, curfews e.t.c. which are meant to limit social interactions of individuals in communities.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021  An application of the SS model was done through fitting the model to epidemiological data for Delhi in India where SS phenomenon was observed for COVID-19. The model was observed to fit well with the data and unknown epidemiological parameter values were estimated within plausible ranges. These parameter values together with some parameter . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2021. values obtained from the literature were used to carry out numerical simulations. Some few important parameters and important classes of individuals were chosen for numerical results in order to draw important results in line with the objective of the paper. It was noted that an increase of values in parameters r and θ resulted in an increase in the number of individuals in classes A(t), I(t), P 1 (t) and P 2 (t) whereas an increase of values in parameters b and ε resulted in an decrease in the number of individuals in classes A(t), I(t), P 1 (t) and P 2 (t). Thus, measures meant to increase the values of b and ε and decrease the values r and θ will be of great help in the control of COVID-19 in the presence of SS phenomenon. This is also supported by results shown in figure 4. Thus, control measures such as enforcement of social distancing, curfews, wearing of face masks, use of hand sanitizers etc are of great importance in the fight against COVID-19.
The model presented in this paper is not exhaustive. Considering the current intervention strategies taking place in many countries such as vaccination, the model can be extended to . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted September 2, 2021. study the impact of vaccination. Vital dynamics can as well be incorporated in the model to add realism.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted September 2, 2021. (d) Figure 9 Effects of varying the parameter ε on the population trajectories of (a) asymptomatic individuals A(t), (b) regularly infected individuals I(t), (c) superspreader individuals P1(t) and (d) superspreader individuals P2(t), starting from ε = 0 up to ε = 1 with a step size of 0.1 across all the population compartments.