Abstract
Many high-profile outbreaks are driven by super-spreading, including HIV, MERS, Ebola, and the SARS-Cov-2 pandemic. That super-spreading is a common feature of epidemics is immutable, however, the relative importance of 2super-spreaders to the outcome of an epidemic, and the individual-level traits that lead to super-spreading, is less clear. For example, an individual may contribute disproportionately to transmission by way of an extremely high contact rate or by way of low recovery, but how these two super-spreaders differ in their effect on epidemiological dynamics is unclear. Furthermore, epidemiological traits may often covary with one another in ways that promote or inhibit super-spreading. What patterns of covariation, and between what traits, are most likely to lead to large epidemics driven by super-spreading? Using stochastic individual-based simulations of an SIR epidemiological model, we explore how variation and covariation between transmission-related traits (contact rate and infectiousness) and duration-related traits (virulence and recovery) of infected individuals affects super-spreading and peak epidemic size. We show that covariation matters when contact rate and infectiousness covary: peak epidemic size is largest when they covary positively and smallest when they covary negatively. We did not see that more super-spreading always leads to larger epidemics, rather, we show that the relationship between super-spreading and peak epidemic size is dependent on which traits are covarying. This suggests that there may not necessarily be any general relationship between the frequency of super-spreading and the size of an epidemic.
Introduction
Throughout the SARS-CoV-2 pandemic, the importance of super-spreading has been imprinted upon our cultural consciousness due to the high visibility of such cases and events; however, in the field of epidemiology, the significance of certain individuals being responsible for a disproportionate number of infections is par for the course. Not only have epidemiologists long known about the existence of super-spreaders, but they have shown that a significant number of high-profile outbreaks throughout modern history, including HIV, MERS, Ebola, and SARS, in addition to SARS-CoV-2, have also been driven by super-spreaders (Lui et al. 2020, Kain et al. 2021, May & Anderson, 1987, Wong et al. 2015, Gani & Leach 2004). In some well-studied outbreaks, such as SARS, information about identified super-spreaders has been recorded in great detail, going so far as to find their respective secondary infection rates, contact rates and other relevant measurements (Stein 2011, Shen et al. 2004). Comparing the characteristics of super-spreaders across outbreaks, however, reveals that super-spreading can be caused by variation (and potentially covariation) in many epidemiologically relevant traits. For example, one of the earliest and most quintessential examples of super-spreading, Typhoid Mary, was a super spreader by way of having low virulence and low recovery, while one of the documented SARS super-spreaders experienced high virulence and high contact rate (Brooks 1996, Shen et al. 2004). This limits the generality of conclusions regarding the role of super-spreaders on overall epidemiological dynamics: does it matter to epidemiological dynamics whether super-spreading arises through variation in transmission-related traits, such as contact rate or infectiousness (Vanderwaal and Ezenwa 2016, White et al. 2018, McCallum et al. 2017), versus through variation in duration-related traits, such as recovery rate or virulence (Gou and Jin 2017)? Moreover, is variation and covariation in some traits more likely to produce super-spreaders than (co)variation in others?
To illustrate, consider the pattern of higher infection prevalence and intensity in males compared to females observed for some types of parasites and some types of hosts (Muehlenbein & Bribiescas 2005, Luis et al. 2012, Vanderwaal & Ezenwa 2016, Kelly et al. 2018). These are thought to be due in part to elevated levels of testosterone (an immunosuppressant that likely prolongs infections; Zuk 2009, Foo et al. 2016) but also due to increased home range sizes which can lead to more infectious contacts (Klein et al. 2000, Luis et al. 2012). Here, individual variation in physiology and behavior act together to shape a potential pathway that might lead some males to become super-spreaders. Other examples include variation in attractiveness to vectors (Allan 2010, Takken & Verhulst 2013) or variation in disposition (being bold or shy) (in Siberian Chipmunks, Tamias sibiricu; Boyer et al. 2010; Vanderwall and Ezenwa 2016). How these different iterations of super-spreading compare to one another regarding influence on epidemiological dynamics is unclear.
We can use an epidemiological model to help identify traits that are likely to both impact epidemiological dynamics and lead to super-spreading. For example, an individual may become disproportionately important to spreading if it has a high transmission rate (which could be caused by a high contact rate or by high infectiousness) or if it is infectious for a long time (which could be caused by a very low recovery rate or mortality rate). Recently, authors have explored the potential importance of, not only variation, but covariation among individual traits to epidemiological dynamics as well (White et al. 2018, Gou & Jin 2017, Hawley et al. 2011). However, this was first suggested in the evolution of virulence literature (Cressler et al. 2016) which presupposes mechanistic links between disease parameters via within-host parasite replication. For instance, a trade-off (negative covariation) between virulence and transmission results when increasing within-host replication increases transmission by increased infectiousness/shedding but comes with a cost of increased virulence (Anderson & May 1982, Ewald 1983). There may also be a trade-off between virulence and recovery if high within-host replication reduces the ability of the immune system to clear the infection but also increases virulence (Anderson & May 1982, Alizon 2008). Lastly, a trade-off between contact rates and infectiousness can arise if high within-host replication increases infectiousness but also leads to increased sickness behavior that decreases host contact rates (Ewald 1994). These trade-offs make a strong case for covariation being possible in nature but are examples of negative covariation exclusively. However, there are plausible reasons to suspect that, in some cases, these correlations may be positive (Hawley et al. 2011). For example, Brookes and Ward (2019) recently showed that canines infected with rabies have increased contact rates due to disease-induced behavioral changes and increased shedding and thus increased infectiousness.
Empirical investigations show that super-spreading may arise through variation in traits related to both transmission (contact rate and shedding rate) and infection duration (virulence and recovery). Additionally, evidence of individual variation and clear rationale for covariation in epidemiologically relevant traits are well-documented in the literature. Despite this, the importance of individual variation/covariation on epidemiological dynamics, and how that relates to the way in which one becomes a super-spreader remains unclear. In an analysis of individual variation on outbreak dynamics, Lloyd-Smith et al. (2005) showed that increased variation is associated with more super-spreading events, but also increased probability of extinction, suggesting that more super-spreading does not necessarily imply larger epidemics. More recently, the effects of covariation on population-scale dynamics have been explored by several papers (Gou & Jin 2017, White et al. 2018, Hawley et al. 2011), but these have focused exclusively on covariation in transmission-related traits (i.e., susceptibility, contact rate, infectiousness). Here we use stochastic individual-based simulations of a simple SIR epidemiological model to explore how variation and covariation in transmission-related traits (contact rate and infectiousness) and duration-related traits (virulence and recovery) of infected individuals affects super-spreading, peak epidemic size and the subsequent relationship between super-spreading and peak epidemic size. We tested all directions of covariation (positive, negative, and none), with three levels of variation (low, moderate, and high) and three different basic reproductive numbers (R0; low, moderate, and high).
Methods
Model Description and Assumptions
In the absence of any variation, the epidemiological dynamics are determined by a simple SIR model (Fig. 1; Eq. 1-3):
We assume that all individuals are born susceptible, and that birth rate is density-dependent (according to the parameter), with equal birth rates,, for all epidemiological classes. We decompose pathogen transmission rate into a contact rate process,, and an infection process (Fig. 1), with infectiousness determined by pathogen shedding rate () such that the probability of infection, given contact, is a saturating function of shedding rate. Infected hosts die due to infection at a per-capita rate, a, and recover into a permanently immune class at a per-capita rate γ. All hosts are assumed to die at the per-capita background rate, d.
Flow diagram of SIR model. Contact and shedding determine movement from susceptible to infected, recovery determines movement from infected to recovered, and virulence determines infected individuals that do not recover. Births and deaths are not included in this diagram but are included in the SIR model used.
Dynamics of infected individuals over time for intragroup trait pairings at a moderate R0 for all levels of variation and directions of covariation. For each simulation, the initial number of susceptible hosts is given by the expected equilibrium population size in the absence of infection, minus the initial number of infected hosts, which was always 5.
To allow individuals to vary in contact rate, shedding rate, virulence, and recovery, we simulated the above SIR model with an individual-based model (IBM) built on the standard Gillespie algorithm for exact stochastic simulation of differential equations (Gillespie 1977). As in the standard algorithm, the probability of any of the possible events (birth, death, infection, and recovery) occurring at a moment in time is proportional to its rate, relative to all of the other events. Following DeLong (2016), we extend the algorithm by allowing each individual to have a unique set of parameters. Here we assume that epidemiological variation comes only through infected hosts, so that, for example, variation in contact rate is due to variation in the behavior of infected hosts, rather than susceptible hosts. The epidemiological traits of newly infected individuals are drawn from a multivariate normal distribution with fixed mean and covariance structure. With this individual-based model, we can track unique fates of each individual in the population, including the number of secondary infections each causes, while allowing population dynamics to emerge out of the stochastic processes of birth, death, infection and recovery.
To develop intuition about how (co)varation is likely to affect epidemiological dynamics, we can analyze how variation is predicted to affect the net reproductive rate, R0. For this simple model,
where Ŝ is the equilibrium number of susceptible hosts in the absence of infection, if one or more epidemiological parameters (co)vary, then R0 becomes a random variable and we can estimate its moments using a Taylor expansion. Table 1 gives the resulting expectations for R0, assuming variation and covariation between all epidemiological parameters; for full details of this derivation, see the appendix.
The epidemiological effect of variation (along the diagonal) and covariation (off-diagonal) in parameters is reported relative to R0 evaluated at the mean parameter values and .
From this, we can make several predictions. In particular, variation in contact rate alone will have no impact on R0, whereas variation in either virulence or recovery will tend to increase R0. The effects of covariation are, of course, more complicated. If contact and shedding negatively covary (i.e., there is a trade-off between contact rate and infectiousness) then the expected R0 decreases. On the other hand, if contact rate negatively covaries with either virulence or recovery, then the expected R0 increases. If virulence and recovery positively covary, then the expected R0increases. In all other cases, it is impossible to make a general prediction, as the expected R0 depends on parameter values.
To confirm these predictions and more fully explore the effects of variation and covariation of transmission and duration-related traits on epidemiological dynamics, and, in particular, on the potential for super-spreading, we explored all possible combinations of contact rate (c), shedding rate (shed), virulence (alpha) and recovery (gamma), allowing parameters to covary with zero, positive, or negative covariation. We tested each model combination at three levels of variation (SD; 0.03, 0.15, 0.75) and three R0 values (R0; 1.5, 3.8, 8). The covariation was calculated from the correlation, set to either -0.5, 0, 0.5, and the variation. This resulted in a total of 54 model variants.
We carried out 50 stochastic simulations of each model, tracking the epidemiological dynamics and each infected individual’s trait values and number of secondary infections. We compared the peak epidemic size and the heterogeneity in the number of secondary infections for each model. To quantify heterogeneity, we use an estimate of the dispersion parameter (k) of a negative binomial model fit to the distribution of secondary infections for each simulation (Lloyd-Smith et al. 2005). We use the estimate of k as a measure of super-spreading: the larger the k value, the less heterogeneity there is in the number of secondary infections, and thus less super-spreading.
Results
We found that our results were strongly determined by whether covarying parameters were duration (virulence, recovery) or transmission (contact rate, shedding rate) related. When trait pairings include one duration parameter and one transmission parameter (i.e., virulence and contact rate) we classify them as intergroup trait pairings. When trait pairings include parameters from the same group (i.e., virulence and recovery), we classify them as intragroup trait pairings. These groupings provide a general framework in which we can understand our results because, for most of our measures, results from simulations of intergroup trait pairings tend to be similar, whereas results from simulations of intragroup trait pairings are distinct to the trait pairing. We also found that when the simulated R0 is increased or decreased the general patterns are the same, but emphasized or deemphasized, respectively. Given this, our figures show only results from our simulations with a moderate R0 (see appendix for different R0 result figures). Similarly, when only one of the intergroup trait pairings is shown in our figures, it implies that the omitted simulation results are qualitatively identical to those that are shown (see appendix for all trait pairings).
Effect of Variation on Peak Epidemic Size
When intergroup traits covary negatively (i.e., individuals with the largest recovery rate have the lowest shedding rate), regardless of which transmission and duration traits are covarying, increasing trait variation always increases the peak epidemic size (fig. 2&3). This agrees with the analytical results, which indicate that R0 is most likely to increase with variation if covariation is negative (Table 1). When traits vary independently or covary positively (i.e., there is a trade-off: individuals with the highest recovery rate have the highest shedding rate), the effect of increasing variation depends on which transmission traits are varying; in particular, if contact rate (c) varies, then increasing variation increases the peak with all directions of covariation; if shedding rate (shed) varies, however, increasing variation can increase, not affect, or even decrease the peak, depending on the direction of covariation (Fig. 3). For intragroup trait pairings, the effect of variation depends on the covarying traits (Fig. 3). When virulence (alpha) and recovery (gamma) covary, more variation always leads to a large epidemic peak regardless of covariation. This agrees with the analytical results (Table 1). When contact rate (c) and shedding rate (shed) vary, however, we see an increase in peak epidemic size when there is positive covariation, but no change in peak size with an increase in variation with no covariation. When these traits covary negatively, we see a decrease in peak epidemic size with an increase in variation. Interestingly, this does not agree with the analytical results, which indicate that increasing variation should always decrease R0 (Table 1).
Dynamics of infected individuals over time for intergroup trait pairings at a moderate R0 for all levels of variation and directions of covariation. Note that the trait pairings of contact with virulence (alpha) and recovery (gamma) are omitted but can be found in the appendix.
Distribution of peak epidemic sizes for intergroup trait pairings at all levels of variation and directions of covariation. Differences in violin width indicate the density of peak epidemic sizes at that value.
Effect of Covariation on Peak Epidemic Size
We see that, at moderate variation, peak epidemic size of intergroup trait pairings is largest when covariation is negative, and larger when traits vary independently than when they covary positively (i.e., there is a trade-off) (fig. 3). At high variation, however, both negative and positive covariation result in larger peak epidemic sizes than independent covariation (fig. 3).
For intragroup trait pairings, we see the same pattern at all levels of variation: the largest peak sizes result from positive covariation and the smallest from negative covariation (fig. 3).
Effect of Variation on Super-spreading (dispersion)
For all trait pairings, more variation leads to lower dispersion (k) which indicates more super-spreading (fig. 5). With negative covariation and the highest level of variation for our contact rate (c) and shedding rate (shed) simulations, however, no dispersion could be estimated due to extinctions.
Effect of Covariation on Super-spreading (dispersion)
For intergroup trait pairings, negative covariation leads to the smallest dispersion values while positive leads to the largest (fig. 5). This indicates that negative covariation leads to more super-spreading than positive or independent covariation. For intragroup trait pairings, there is no clear pattern. There is no effect of covariation on super-spreading when virulence (alpha) and recovery (gamma) covary. When contact rate (c) and shedding rate (shed) vary we see that negative covariation leads to the largest dispersion values and positive covariation leads to the smallest.
This indicates that positive covariation causes more super-spreading than negative or independent covariation.
Effect of super-spreading on peak epidemic size
We see that variation and covariation influence peak epidemic size and dispersion (super-spreading) (fig. 3&4); however, we do not find a direct interaction between dispersion and peak epidemic size (fig. 5). This indicates that the role of super-spreading during an epidemic is indirect and varied. For example, in the contact rate (c) and shedding rate (shed) panel of figure six, when looking at the moderate variation simulations (triangles), we see a trend of increasing peak epidemic size with a decrease in dispersion, but this trend is a function of the direction of covariation rather than only dispersion on peak epidemic size. Although we do not see a direct effect of super-spreading on epidemic size in our simulations, we do show under what conditions super-spreading should be an important driver of epidemiological dynamics. For example, when virulence (alpha) and recovery (gamma) vary, the effect of super-spreading on epidemiological dynamics should be negligible. In contrast, if intergroup traits are highly variable and covary negatively, we expect super-spreading to contribute to increasing the peak epidemic size because we see that despite approximately equal dispersion (k) for all directions of covariation at high variation, high dispersion with negative covariation leads to larger peak epidemic sizes.
Distribution of peak epidemic sizes for intragroup trait pairings at all levels of variation and directions of covariation. Differences in violin width indicate the density of peak epidemic sizes at that value.
Distribution of dispersion (K) values for each trait pairing, level of variation and direction of covariation. Lower dispersion values indicate more super-spreading.
Relationship between dispersion (K) and peak epidemic size for all trait pairings, level of variance and direction of covariation.
Discussion
In this study, we investigated how covariation in traits associated either with disease transmission (e.g., contact rate and infectiousness) or infection duration (e.g., recovery rate and virulence) influenced disease dynamics and individual heterogeneity in spreading. Previous investigations of this question have focused mainly on heterogeneity in traits associated with transmission and on epidemic dynamics, utilizing analytical approximation (Hawley et al. 2011), branching processes (Lloyd-Smith et al. 2005), and network-based approaches on static (Gou & Jin) and dynamic networks (White et al. 2018). In contrast, we built a stochastic, individual-based version of a classic SIR model with demography to understand how both epidemic and endemic dynamics are affected by heterogeneity (Fig. 1, 2). We also allowed for covariation between all combinations of transmission-and duration-related traits to investigate whether certain combinations of traits are more likely to give rise to larger epidemics or more super-spreading events.
Across simulations, we found that increasing variation almost always leads to an increase in the peak size of the epidemic (Fig. 3, 4). The only situation where this is less true is when contact and infectiousness covary, but this is driven by the fact that increasing variation increases the likelihood of epidemic fade-outs, as has been seen in other studies (Lloyd-Smith et al. 2005, White et al. 2018). When epidemics do take off in this case, they tend to be larger when there is more variability. Increasing variation also tends to increase individual heterogeneity in spreading, as evidenced by the decrease in the dispersion parameter, k (Fig. 5). The effect of variation is less pronounced when recovery and virulence covary – although there is increasing heterogeneity, the distributions of k are broadly overlapping suggesting that variation has less effect in this scenario. Interestingly, however, the observed values of k in our simulations are only comparable to existing empirical estimates in the high variation case. For example, Lloyd-Smith et al. 2005 estimated dispersion parameters to be less than 0.1 for all the pathogens they considered; this is in line with estimates for other pathogens (Melsew et al. 2019); we observed values of k that small only in the highest variation case, suggesting that there is considerable individual variability in epidemiological traits in most infectious disease systems.
The effects of covariation tended to be less pronounced, and to depend on the total amount of variation and whether the covarying traits affected transmission or duration (intragroup) or both transmission and duration (intergroup) (Fig. 3-5). For intergroup pairings, covariation only has an effect at the highest level of variation, and we observed larger epidemics and more super-spreading when traits covaried negatively (e.g., the most infectious individuals have the lowest recovery) or when they covaried positively (e.g., the most infectious individuals have the highest recovery). The negative covariation result is intuitive; why positive covariation (i.e., a trade-off) also leads to larger, more variable epidemics is less clear. For intragroup pairings, on the other hand, we see larger and more variable epidemics when covariation is positive (e.g., the most infectious individuals have the highest contact rates), as you might expect. Again, we see the largest effect of covariation sign when transmission traits covary, with epidemic fade-outs being much more likely under negative covariation than positive, especially at high variation.
Our observation that the effects of variation and covariation are largest when transmission traits covary, and smallest when duration traits covary, provides some validation for previous studies emphasis on these two aspects of the epidemiological process (Hawley et al. 2011, White et al. 2005). It is important to keep in mind that the expected R0 is identical across all trait pairings, and it is only the contact-infectiousness pairing that leads to frequent epidemic fade-outs, while also leading to the most super-spreading.
Another clear pattern that emerges from this study is that whether there is a relationship between super-spreading (measured by k) and the size of the epidemic, depends on the covarying traits. For example, when infectiousness and virulence covary (Fig. 5, shed-alpha), increasing variation increases super-spreading without any noticeable effect on peak size. However, when infectiousness and recovery covary (Fig. 5, shed-gamma), increasing variation increases both super-spreading and peak size. This suggests that there will not necessarily be any relationship between the frequency of super-spreading and the size of the epidemic, contrary to some previous work (Lloyd-Smith et al. 2005). Moreover, looking across stochastic simulations for a set of covarying traits and the sign and magnitude of covariation, we do not find that more super-spreading leads to larger peak.
In general, we find that it is critical to know which traits are covarying and how much they vary. Currently, there is little empirical evidence of covariation in nature. However, Hamilton et al. (2020) recently showed that in Tasmanian Devils (Sarcophilus harrisii) infected with devil facial tumor disease (DFTD), contact rates and infectiousness are negatively correlated: as the disease progresses and tumor surface area increases, Devils become more infectious (Obendorf & McGlashan 2008), but due to behavioral changes they have fewer contacts. Our simulations suggest that this negative covariation is likely important to the overall epidemiological dynamics given that, if these traits covaried positively we would expect larger epidemics and super-spreading, depending on the amount of individual variation. Much more common is empirical evidence of individual variation (VanderWaal & Ezenwa 2016, Ferrari et al. 2004, Melsew et al. 2019). For example, variation in pneumonia recovery rates have been documented in Bighorn sheep (Ovis canadensis) (Plowright et al. 2017). Our simulations show that, if recovery rate is covarying with another trait, epidemiological expectations may vary widely (Fig. 3). If recovery covaries with virulence (in any direction), we would expect minimal heterogeneity in number of secondary infections and only an increase in variation would lead to larger peak epidemic sizes (Fig. 3, 4). Conversely, if recovery covaries with contact rates, covariation (in any direction) would indicate an increase in peak epidemic size and super spreading being an important driver of epidemiological dynamics. The same expectations hold for individual heterogeneity in disease-induced mortality, which has been observed in a number of amphibian populations due to variation in parasite aggregation (Wilber et al. 2020).
Although documented cases of measured covariation are relatively rare, there is good reason to suspect that it is likely widespread and has the potential to largely influence population-scale disease dynamics. With this, our results highlight another nuance to understanding why certain individuals become super-spreaders. For example, an individual with a high contact rate may be considered a likely candidate for super-spreading, however, if contact rate covaries negatively with infectiousness (shedding rate), their influence on the epidemiological dynamics might be minimized, despite having many contacts. Conversely, if contact rate covaries in any direction with another trait, such as recovery, the high contact rate of an individual may lead such an individual to becoming a super-spreader in more cases since both positive and negative covariation increase peak epidemic size. Research to uncover when covariation between transmission and/or duration traits occurs will be important to furthering our ability to predict and manage epidemics and super-spreading.
Data Availability
All data produced in the present study are available upon reasonable request to the author