## Abstract

Individual variation in susceptibility and exposure is subject to selection by natural infection, accelerating the acquisition of immunity, and reducing herd immunity thresholds and epidemic final sizes. This is a manifestation of a wider population phenomenon known as “frailty variation”. Despite theoretical understanding, public health policies continue to be guided by mathematical models that leave out considerable variation and as a result inflate projected disease burdens and overestimate the impact of interventions. Here we focus on trajectories of the coronavirus disease (COVID-19) pandemic in England and Scotland until November 2021. We fit models to series of daily deaths and infer relevant epidemiological parameters, including coefficients of variation and effects of non-pharmaceutical interventions which we find in agreement with independent empirical estimates based on contact surveys. Our estimates are robust to whether the analysed data series encompass one or two pandemic waves and enable projections compatible with subsequent dynamics. We conclude that vaccination programmes may have contributed modestly to the acquisition of herd immunity in populations with high levels of pre-existing naturally acquired immunity, while being critical to protect vulnerable individuals from severe outcomes as the virus becomes endemic.

**Highlights**

Variation in susceptibility/exposure responds to selection by natural infection

Selection on susceptibility/exposure flattens epidemic curves

Models with incomplete heterogeneity overestimate intervention impacts

Individual variation lowered the natural herd immunity threshold for SARS-CoV-2

## 1. Introduction

Almost 100 years ago, Kermack and McKendrick [Kermack and McKendrick 1927] published a general and flexible model to study epidemics and fitted particular versions to observed epidemics [Kermack and McKendrick 1927, McKendrick 1939]. Already back then they alerted for the simplifying assumption “that all infected persons are equivalent, and that all susceptible persons are equally liable to acquire infection” [McKendrick 1939]. In their fittings they adjust not only transmission parameters but also the size of the susceptible and exposed populations at epidemic onset. Susceptible and exposed population sizes needed to be adjusted so their homogeneous models could fit the data.

Thirty years later, [Gart 1968] admited that “it is difficult to define exactly the size of the population of susceptible hosts. In this instance the difficulty is associated with the heterogeneous nature of the population”. The author divided the population in two groups, depending on their history of infection, and allowed much greater susceptibility in the group with no history. This did not seem sufficient to provide good fit to observed epidemics as the author adds “we assume that the first group is a homogeneous group of susceptibles, while the second is actually a mixture of immune and susceptible individuals”. In [Gart 1971] the author extended the model to several susceptibility groups, and, more than a decade later, [Ball 1985] compared a model with several susceptibility groups to the homogeneous version and described how homogeneity assumptions increase epidemic size. [Coutinho *et al*. 1999] expanded the formalisms to also account for individual variation in incubation period and recovery but concluded that “at present practical applications might be difficult”. [Pastor-Satorras and Vespignani 2001] developed related formalisms to describe epidemics on contact networks.

Meanwhile, frailty variation had been formalised in demography [Vaupel *et al*. 1979] and introduced in practical survival analysis [Hougaard 1984, Aalen 1988] and non-communicable disease epidemiology to improve model fits and interpretation (for a review see [Aalen *et al*. 2015] and references therein).

On the experimental side, [Dwyer *et al*. 1997] measured nonlinear relationships between transmission and densities of susceptible hosts, implying that the bilinear term in the classical susceptible-infected-recovered (SIR) model may not be appropriate. The authors attributed the nonlinearity in transmission to heterogeneity in host susceptibility to infection which they estimated from the shapes of dose-response curves. [Finkenstädt and Grenfell 2000] fitted a model with nonlinear relationships between transmission and density of susceptible hosts to an observed epidemic and estimated the exponent, which they interpreting as heterogeneity in mixing. [Novozhilov 2008] derived the expressions for the exponents from explicit gamma distributions of susceptibility and [Montalbán *et al*. 2020] provided a more tractable derivation that applies exactly to heterogeneity in susceptibility and approximately to heterogeneity in exposure to infection.

Here we build on this history and analyse the coronavirus disease (COVID-19) pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), with frailty variation models. The study is focused on England and Scotland, where the infection was first detected in early 2020, with major waves of the disease occurring in the spring of the same year and in the autumn-winter season of 2020-2021.

We use susceptible-exposed-infected-recovered (SEIR) models [Diekmann *et al*. 2013] incorporating continuous distributions of individual susceptibility or exposure to infection [Ball 1985, Pastor-Satorras and Vespignani 2001, Novozhilov 2008, Katriel 2012, Gomes 2020]. We then use Bayesian inference to estimate model parameters by fitting series of deaths while accounting for, and estimating the magnitude of, the combined effects of non-pharmaceutical interventions (NPIs), voluntary behavioural change, seasonality, viral evolution, and any other factors which might have contributed to time dependency in transmission potential. We estimate basic reproduction numbers (ℛ_{0}) consistent with early literature [Flaxman *et al*. 2020, Hilton *et al*. 2020, Kwok *et al*. 2020] and coefficients of variation in agreement with direct measurements based on contact-pattern studies [Mossong *et al*. 2008, Hens *et al*. 2009, Willem *et al*. 2012]. We show that individual variation in susceptibility or exposure to infection can significantly affect basic metrics, such as the herd immunity thresholds (HIT), as well as model projections and assessment of intervention impacts.

## 2. Mathematical models

The basic compartmental SEIR model [Diekmann *et al*. 2013] describing the transmission dynamics of SARS-CoV-2 is represented diagrammatically in Fig. 1. The model accounts for individual variation in susceptibility or exposure to infection.

### 2.1. Individual variation in susceptibility to infection

Let *x* denote the individual susceptibility to infection in relation to the mean, which we describe by a continuous distribution *q*(*x*) with mean *∫xq*(*x*)*dx* = 1 parametrised by a coefficient of variation (CV), . Susceptible individuals, *S*(*x*), become exposed at a rate that depends on their susceptibility, *x*, and on the average force of infection, *λ*, which accounts for the total number of infectious individuals in the population over time. Upon exposure, susceptible individuals enter an incubation phase, *E*(*x*), during which they gradually become infectious [Wei *et al*. 2020, To *et al*. 2020, Arons *et al*. 2020, He *et al*. 2020]. The average infectiousness in this phase is made to be half that in the following state (*ρ* = 0.5), to which individuals progress within an average of 5.5 days (*δ* = 1*/*5.5 per day) [McAloon *et al*. 2020, Lauer *et al*. 2020]. The fully infectious state is denoted by *I*(*x*). Infected individuals eventually become noninfectious, on average approximately 4 days after becoming fully infectious (*γ* = 1*/*4 per day) [Nishiura *et al*. 2020, Li *et al*. 2020]. A small fraction, *ϕ*(*x*), die due to COVID-19 while the remaining majority recover into *R*(*x*) where they are noninfectious and resistant to reinfection due to acquired immunity. The model is formalised mathematically by the infinite system of ordinary differential equations (ODEs):

The average force of infection upon susceptible individuals in a population of size N and transmission coefficient *β* is defined by

An epidemic is simulated by introducing a small seed of infectious individuals in a susceptible population. Initial growth of infected densities is near exponential but decelerates as individuals are removed from the susceptible pool by infection and immunity. With variation in susceptibility, highly susceptible individuals tend to be infected earlier, leaving behind a residual pool of lower mean susceptibility. This selective depletion intensifies the deceleration of epidemic growth and gives an efficient head start to the acquisition of population immunity. Eventually the epidemic will subside and the HIT, defining the percentage of the population that needs to be immune to reverse epidemic growth and prevent future waves, is lower when variation in susceptibility is higher.

The basic reproduction number, defined as the number of infections caused by an average infected individual in a totally susceptible population, is written for system (1)-(4) with force of infection (5) as

This is an indicator of early transmissibility but its use quickly becomes cumbersome. Several factors, such as NPIs, adaptive changes in human behaviour, seasonality and viral evolution, affect ℛ_{0} in a time-dependent manner. We denote the resulting quantity by ℛ_{c}(*t*) = *c*(*t*) · ℛ_{0}, where *c*(*t*) *>* 0 describes the basic risk of infection at time *t* in relation to baseline.

In the estimation of ℛ_{c}(*t*) we assume a stylised profile for *c*(*t*) as illustrated in Fig. 2: *T*_{0} is time when ℛ_{0} begins to show decrease due to behavioural change or seasonality; *L*_{1} is the period of maximal transmission reduction due to a first lockdown in spring 2020 (48 days in England, from 26 March to 12 May; and 66 days in Scotland, from 24 March to 28 May) and *c*_{1} ≤ 1 is the value of ℛ_{c}(*t*) during *L*_{1} in relation to the initial ℛ_{0}; *T*_{1} (*> T*_{0}) is the day first lockdown begins (transmission is allowed to decrease linearly between *T*_{0} and *T*_{1}). After *L*_{1}, contact restrictions are gradually relaxed and we allow transmission to begin a linear increase such that *c*(*t*) reaches the baseline (*c*(*t*) = 1) in *T*_{2} days, which may or may not be within the range of this study. Changes in non-behavioural factors that affect transmission (such as seasonality or viral evolution) are inseparable from contact changes in this framework and are also accounted for by *c*(*t*).

The model will be used to analyse COVID-19 deaths recorded over approximately one year. Mathematically the time-dependent *c*(*t*) is constructed as

Second and third lockdowns in the autumn and winter are implemented more simply as a further reduction in transmission (by a factor *c*_{2}) over the stipulated time periods. Specifically,

A finite version of system (1)-(4) with variable susceptibility force of infection (5) can be derived exactly, taking a previously obtained form when individual susceptibility is gamma distributed [Novozhilov 2008, Montalbán *et al*. 2020]:

Notice that individuals leave compartment I at a rate *γ* but only a fraction 1 − *ϕ* recovers from infection. The remaining fraction, *ϕ*, die due to COVID-19, *η* days after leaving compartment I. We refer to *ϕ* as the infection fatality ratio (IFR).

Had ℛ_{0} remained constant throughout study duration, the effective reproduction number would have been as in [Montalbán *et al*. 2020]. More typical, however, is for ℛ_{0} to vary with seasonal effects [Nickbakhsh *et al*. 2020, Kissler *et al*. 2020, Bacaër and Gomes 2009] and viral evolution [PHE 2021] leading to an effective reproduction number modified as

Here, nevertheless, we present HIT estimates elicited by natural infection in reference to the original ℛ_{0}, as derived in [Montalbán *et al*. 2020]
highlighting that these refer to the SARS-CoV-2 of early 2020; higher thresholds will be expected for more transmissible variants. Once reliable estimates become available for evolving transmissibility, HIT estimates can be updated.

As infection spreads, the susceptible compartment, S, is depleted and recovered individuals populate compartment R where they are protected by acquired immunity. In reality, they eventually lose that protection as immunity wanes or is evaded by new viral variants. In addition, the population slowly renews itself through all-cause death of individuals with various levels of acquired immunity and birth of new susceptibles. These processes are omitted in this version of the model given our purpose to analyse data reported over a 1-year period when the frequency of reinfection has been relatively low in our study setting [Hall *et al*. 2021] and population renewal considered negligible. Supplementary Material (Section S1) we formulate an extended model with reinfection and conduct some sensitivity analyses.

### 2.2. Individual variation in exposure to infection

In a directly transmitted infectious disease, such as COVID-19, variation in exposure to infection is primarily governed by patterns of connectivity among individuals. We incorporate this in system (1)-(4) assuming that individuals mix at random [Pastor-Satorras and Vespignani 2001, Miller *et al*. 2012]. In a related study we developed an assortative mixing version of the model [Aguas *et al*. 2020]. With random mixing and heterogeneous connectivity, the force of infection is written

The basic reproduction number is
and ℛ_{c}(*t*) = *c*(*t*) *·* ℛ_{0} as above.

Model (1)-(4) with variable exposure force of infection (15) can also be reduced to a finite system of ODEs when individual connectivity is gamma distributed, although, in contrast with variable susceptibility, the derivation is approximate in this case [Montalbán *et al*. 2020]:

The effective reproduction number is and the natural HIT

Notice that formulas (14) and (22) refer to herd immunity by natural infection and do not apply to herd immunity induced by vaccines (unless vaccination programmes were designed to mimic the selection exerted by natural infection which is not generally the case [Fine *et al*. 2011]). For a random vaccination programme, the HIT induced by the vaccine would be ℋ_{V} = [1*/*(1 − *σ*_{V})] *·* (1 − 1*/*ℛ_{0}), where *σ*_{V} is the susceptibility of a vaccinated individual relative to unvaccinated. As a corollary, the HIT can only be reached by random vaccination if the vaccines in use are more than 1 − 1*/*ℛ_{0} efficacious against infection. For example, achieving the random vaccination HIT requires a vaccine that is at least 67% efficacious if ℛ_{0} = 3, and 80% efficacious if ℛ_{0} = 5, assuming 100% coverage. With lower coverage the demand on vaccine efficacy increases. This paper addresses primarily the natural HIT (ℋ).

## 3. Data

We used publicly available epidemiological data from the UK coronavirus dashboard describing the unfolding of the SARS-CoV-2 epidemic [https://coronavirus.data.gov.uk/] to estimate relevant transmission parameters for the larger nations: England (56 million population) and Scotland (5.5 million). Spefically, we extracted datasets containing daily deaths (deaths within 28 days of positive test by date of death), , where *k* = 0 is the day when the cumulative rolling average of death numbers exceeded 5 *·* 10^{−8} of the population in both nations (10 March 2020). We also extracted from the same source data on the population considered fully vaccinated (two doses) at the time of this study.

Model outputs would then be fitted to the raw series of daily deaths between 10 March 2020 and 1 February 2021 (329 days in total) or 1 July 2020 (114 days) adopting an IFR of *ϕ* = 0.9% [Ward *et al*. 2021, Chen *et al*. 2021] throughout the fitted period. [Birrel *et al*. 2021] estimate that the IFR fluctuated between about 0.3% and 1.3% prior to vaccination, with a tendency to be low when infection spreads generally in the community and high when spread is predominantly in care homes and hospitals [McKeigue *et al*. 2021]. We assumed constant IFR but conducted analyses of sensitivity to its value (see Supplementary Material (Section S2) for results on *ϕ* = 0.7% and *ϕ* = 1.1%).

Initial conditions were set by solving the linear system of ODEs obtained by neglecting depletion of susceptibles in (9)-(12) (i.e., *d*E*/dt* = *β*(*ρ*E+I) −*δ*E, *d*I*/dt* = *δ*E−*γ*I, *d*R*/dt* = (1 − *ϕ*)*γ*I) for an early *R*_{0} = 2 [Li *et al*. 2020, Kissler *et al*. 2020], to find approximate relationships between variables (conditioning this on higher early transmission potential, such as *R*_{0} = 3, does not significantly change the results but leads to worse fits due to mismatch with early death counts). These approximations concerning the very early days of the epidemic enabled us to constrain the initial conditions as I(−*η*) = *y*_{0}*/*0.002, E(−*η*) = I(−*η*)*/*0.136 and S(−*η*) = N − E(−*η*) −I(− *η*), where *η* is the excess duration of a fatal infection relative to non-fatal, and *y*_{0} is the number of deaths in the first day of the study. These numerical values are in the order of I(−*η*) ≈ *y*_{0}*/ϕ*[1 −exp(−*γ*)] and E(−*η*) ≈ I(−*η*) */* [1 −exp(−*δ*)], which is expected early in the pandemic given the rates of progression from I to death and from E to I, respectively.

## 4. Model fitting and parameter estimation

In order to preserve identifiability, we made six main simplifying assumptions: (i) reinfection is negligible (see Section S1 for sensitivity analysis to reinfection); (ii) the IFR is constant throughout the fitted period (see Section S2 for sensitivity analysis to IFR) but lower due to vaccination in discussion of projections; (iii) vaccine effects are negligible during the fitted period, which in Section 5.1 ends 1 February 2021 (less than 1% fully vaccinated in the UK) and in Section 5.2 ends 1 July 2020 (no vaccines in use), but represented in projections beyond this phase as appropriate (see also Section 5.6); (iv) natural (seasonality, viral evolution and adaptive behavioural change) and interventional (NPI) modulators of the reproduction number are encapsulated in a single time varying parameter *c*(*t*) as illustrated in Fig. 2; (v) second and third lockdowns impact *c*(*t*) by the same factor; (vi) excess transmission from critically ill stages is negligible.

Parameter estimation was performed with software MATLAB (MathWorks, Natick, MA) using the PESTO (Parameter EStimation TOolbox) package [Stapor *et al*. 2018]. We assumed that numbers of SARS-CoV-2 infections are Poisson distributed.

We tried to reproduce the dynamics of COVID-19 deaths by estimating the set of parameters *θ* that maximises the log-likelihood (LL) of observing the daily numbers of reported deaths *Y* :
in which and are the model output numbers of COVID-19 deaths at day *k* in England and Scotland for the set of parameters *θ*, and are the numbers of daily reported deaths, and *n* is the total number of days included in the analysis.

The set of parameters to be estimated is when fitting longer time series (Section 5.1) and in fittings to shorter versions (Section 5.2).

To ensure that the estimated maximum is a global maximum, we performed 50 multi-start optimisations with initialisation parameters sampled from a Latin-Hypercube. The combination of parameters resulting in the maximal LL were used as a starting point for 100, 000 Markov chain Monte-Carlo (MCMC) iterations. From the resulting posterior distributions, we extracted the median estimates for each parameter and the respective 95% credible intervals. We used uniformly distributed priors with wide ranges.

We applied the outlined fitting procedure using both heterogeneity models (specifically individual variation in susceptibility and individual variation in exposure to infection) as well as a homogeneity model (obtained by setting the coefficient of variation to zero (*ν* = 0) in either model). The Akaike information criterion (AIC) was applied to select best fitting models.

## 5. Results

### 5.1. Estimated parameters and herd immunity thresholds

Variable susceptibility, variable connectivity and homogeneous models were fitted to series of COVID-19 deaths reported in England and Scotland until 1 February 2021. The fits are shown in Figs. 3, 4 and 5 (black solid curves; fitted data points in green), and the estimated parameters in Table 1. Maximum LL are also displayed, as well as AIC scores for model selection. We conclude that variable susceptibility and variable connectivity models are better supported by the data than the homogeneous model (lowest AIC in bold), as found in related studies [Aguas *et al*. 2020, Colombo *et al*. 2020]. Variable connectivity is slightly better supported in this case, although the reality most likely combines the two forms of heterogeneity.

Note that while we report CV in all fits, our method is not purposed to estimate levels of heterogeneity solely in susceptibility or connectivity as a goal. Because the two forms of heterogeneity have similar effects on epidemic trajectories, separating relative contributions would pose an identifiability problem that our approach cannot address. Each heterogeneity model we consider assumes that there is no heterogeneity of the other kind. From fitting these two extremes, however, we conclude that they are equally able to fit data and have similar consequences for metrics of interest and projections. Hence a decomposition is not necessary for our purposes.

Given the estimated values for ℛ_{0} and CV (*ν*) we derive the natural HIT through formulas (14) or (22) as appropriate, obtaining ℋ in the range 25 − 27% in England, and 26 − 29% in Scotland. With the homogeneous model, the inferences would be considerably higher, ℋ = 63% in England and in the range 66 − 67% in Scotland, as in early expectations [Kwok *et al*. 2020].

We then prolong model trajectories (dashed curves) for another 4 months (until 1 June 2021) to begin comparisons with data beyond fitted period (red). All models project more deaths than observed, as expected given that UK initiated a mass vaccination programme in late 2020 which would have started impacting the epidemic by February 2021. With reduced models adopted here, implementation of vaccination is not straightforward but in Supplementary Material we replicate the fits with explicit gamma distributions (Section S1) and simulate a crude approximation to the UK vaccination programme (Figs. S1, S2 and S3). In Section 5.6, we focus on England and implement vaccination in more detail according to data on fully vaccinated people in the nation. A relevant question at the time of these results was whether achieved population immunity had been enough to prevent an exit wave as contact restrictions were lifted from 8 March 2021 onwards culminating with the removal of most restrictions from 19 July 2021 (especially in England where this was dubbed “Freedom Day”). To assess how each of the models might answer this question we plotted the cumulative number of estimated infections (blue) to determine the percentage of the population infected by June 2021. We find that both nations remain below the natural herd immunity threshold ℋ (Scotland more than England). Hence, without vaccination, an exit wave might have been expected. To visualise its magnitude we include separate panels on the right where the model is run for approximately 12 months, using as initial conditions the end conditions from left panels and ℛ_{c}(*t*) = ℛ_{0}. Here this is done crudely but in Section S1 we explore a range of vaccination scenarios and in Section 5.6 we refine the projections in the case of England (the same could be done for Scotland). It is striking that heterogeneous models project exit waves one order of magnitude lower (or less) than when homogeneity is assumed, a feature that we explore more thoroughly in section 5.6.

In Supplementary Material we also show that these results are robust to including reinfection (with a risk of 0.1 relative to the average risk of first infection [Hall *et al*. 2021]) (Section S2), arriving at similar HITs, and changing IFR (to 0.7% or 1.0%) (Section S3). This agrees with the expectation that assuming a lower IFR results in higher HIT, and a higher IFR results in lower HIT. Less intuitive is that when we fit the models with a different IFR, all parameters readjust and exit waves appear relatively conserved (perhaps slightly slower with lower IFR (higher HIT) and higher with higher IFR (lower HIT), illustrating the fragility of taking HITs out of context). Similar conserveness is noticed when reinfection is included. Model selection by AIC continues to favour heterogeneous models by large in all scenarios.

### 5.2. Parameter estimation early in the pandemic

In a pandemic it is important to estimate model parameters early when data series are short. To test suitability of our methods for such task, we apply fitting procedures to series of COVID-19 deaths in England and Scotland until 1 July 2020, as Europe was recovering from the first wave. As an initial exploration we perform the fittings as in Section 5.1, estimating all parameters in (26) except *c*_{2} which had no effect prior to November 2020. The results, displayed in Supplementary Material (Section S4), provide similar estimates to those obtained with longer series in Section 5.1 except for parameter *T*_{2} that determines the slope of contact reactivation following lifting of first lockdown. This preliminary analysis suggests *T*_{2} to be very high (or equivalently, slope of contact reactivation close to zero), presumably due to absence of detectable effects of lifting restrictions in such short timeframe. Technically, the estimation of *T*_{2} becomes unsuitable in this framework since as *T*_{2} → ∞ so does uncertainty about its value. One way to circumvent this problem could be to reparameterise contact reactivation in terms of slope rather than time to reach baseline *c*(*t*) = 1. Meanwhile, here we present two scenarios by constraining *T*_{2} (Fig. 6).

Informed by preliminary analysis in Section S4, we first run a scenario where *c*(*t*) remains strictly horizontal until end of fitting period (*T*_{2} → ∞, Fig. 7). This assumption appears to align with stringency index, a composite measure based on government response indicators (school closures, workplace closures, travel bans, etc) developed by Oxford Covid-19 Government Response Tracker (OxCGRT) [Hale *et al*. 2020] (blue in ℛ_{c} panels, scaled to match lower and higher bounds in our stylised profiles). Support for heterogeneous models is again very significant according to AIC scores, and optimal parameter values result in similar HITs to when *T*_{2} was estimated: ℋ = 34% (95% CI, 25 − 57%) for heterogeneous models; and ℋ = 70% (95% CI, 68 − 71%) for homogeneous. Compare Tables 2 (*T*_{2} *→ ∞*) and S5 (*T*_{2} estimated). Second, we run a scenario with finite *T*_{2} to inform how estimated parameters might respond to assumptions about contact reactivation. Assuming *T*_{2} = 120 days (crudely motivated by mobility data [Google 2020], blue dots in Fig. 8), we find that support for heterogeneous models persists but estimated coefficients of variation become larger, reflecting in lower HITs (ℋ = 15% (95% CI, 12 − 17%), Table 3). In absence of further information we might have expected true parameter values to be somewhere between these two scenarios. This is supported by results on longer COVID-19 series in Section 5.1 (ℋ in the range 25 − 29%, Table 1), which is inside, but towards the low end, of the range motivated by stringency index.

### 5.3. Using government response indicators

In this section we use the stringency index, which tracks the strictness of government policies that restricted people’s behaviour during the COVID-19 pandemic [Hale *et al*. 2020], as a main factor capable to modify transmission over time. In Fig. 9 we show stringency indices for England (red) and Scotland (blue), both scaled to be 1 in February 2020 and to match an illustrative *c*_{1} = 0.2 during lockdown 1. We refer to this scaled time-dependent index as *u*(*t*).

Here we replicate model fittings to series of COVID-19 deaths until 1 February 2021 using *u*(*t*) to help determine the overall profile *c*(*t*). We assume that until the 1 July 2020 (end of series fitted in Section 5.2) the profile *c*(*t*) is entirely determined by *u*(*t*). From then on we include a multiplicative factor that increases linearly from 1 to reach some value *w* by the last day of the fitted series. Parameters *c*_{1} and *w* are estimated. In the case of Scotland this was not sufficient to allow acceptable fits. Looking at Fig. 9 we notice that the stringency index for Scotland shows no signal for the second lockdown which happened mostly throughout November 2020 (visible in the stringency index for England). Both death series, however, show a clear reversal in epidemic growth around that time. Presumably restrictions imposed by the government in England might have affected Scotland due to mobility across the border and behavioural influences. To enable fittings to the two nations we modify the Scottish stringency to resemble the English from the 1 July 2020 onwards. The results are provided in Fig. 10 and Table 4.

The quality of these fits is not as good as when we used a stylised transmission modifier *c*(*t*) in Section 5.1, presumably because it does not account as flexibility for behavioural factors other than those resulting from government restrictions. Nevertheless the results remain consistent also in this scenario. Using the stringency index as outlined in this section the heterogeneous models are again better supported by the data according to AIC, and we estimate ℋ in the range 27 − 31% in England and 31 − 36% in Scotland for heterogeneous models, while 58% is obtained for England and 65% for Scotland when homogeneity is assumed.

Overall this analysis suggests that in the eventuality of future pandemics our models, with more or less stylised mitigations, can be used consistently at differently stages as the epidemic unfolds. This study of England and Scotland demonstrates that although applications will always benefit from knowledge of time-dependent effects on transmission, considerable progress can be made by model-based inference before detailed measurements of such factors are available. It is also possible for sensibly designed stylised models to perform better than those constrained by incomplete information. Inferences should be subsequently tested, however, and assumptions improved as more data becomes available.

At this stage, our parameter inferences can be confronted with several types of empirical estimates. First, the basic reproduction numbers estimated here (in the order of ℛ_{0} = 3) align with those from numerous independent studies (e.g., [Flaxman *et al*. 2020, Hilton *et al*. 2020, Kwok *et al*. 2020]). Second, the transmission reduction due to first lockdown, 70 − 80% (i.e., *c*_{1} between 0.2 and 0.3) is in agreement with direct contact measurements [Jarvis *et al*. 2020]. Third, the estimated time from detectable infection to death (1*/γ* + *η*, between 10 and 20 days) agrees with [Verity *et al*. 2020, Wood 2021], more so when heterogeneity models are used. Forth, our time-dependent transmission profiles, *c*(*t*) (parameterised by *T*_{0} and *T*_{2}), align well with the stringency index [Hale *et al*. 2020] and might align even better with some hybrid indicator combining government measures with voluntary adaptive behavioural changes (captured by mobility patterns tracked by Google). Finally, the coefficient of variation (*ν*), the main novelty of this study, is assessed in a dedicated section (Section 5.4).

### 5.4. Coefficients of variation from empirical studies

#### 5.4.1. Contact surveys

Contact patterns provide one of the easiest sources of heterogeneity to study directly. One approach is to use large-scale diary experiments to collect self-reported logs of close or physical contact from study participants. In this section we show data from several of these contact-pattern studies, listed in Table 5. In Fig. 11 we show gamma-fits for the contact distribution of each study on a log scale, along with the CV for each empirical distribution. In Supplementary Material we show that lognormal distributions provide considerable worse fits (Fig. S15).

Irrespective of fits, the empirically measured contact distributions reveal CV between 0.7 and 1.5 (depending on study and setting; mean 0.9, standard deviation 0.2).

In addition to the magnitude of individual variation in connectivity, the time scale of that variation is another important determinant for the effect of selection in accelerating the acquisition of population immunity [Tkachenko *et al*. 2021]. The referenced studies typically report contact patterns for individuals over a very short (e.g., 1-day) period which is insufficient for assessing persistence of the measured variation. One of the studies [Hens *et al*. 2009] made an extra step and measured contact patterns for each individual on two different days (one weekday and one weekend day). In Fig. 12, we show fits for contact patterns by these two days individually, and for the average. The CV for the contact heterogeneity that persists over the two days is approximately 1.1 (in contrast with the larger 1.4 or 1.6 for each day alone).

For a heterogeneous model of an epidemic unfolding over a timescale of years, local dynamics are driven by assumptions about heterogeneity in short-term (e.g., week-long) averages in contact patterns, while global dynamics depend on assumptions about persistence of heterogeneity in those short-term averages over the timescale of the simulation. However, these long-term averages are frequently not evaluated directly by contact diary experiments.

One way to estimate persistent contact heterogeneity from below is to bin contact data by age groups, assuming for example that heterogeneity in contact patterns which persists in contact data after binning in 5-year age groups represents population-level heterogeneity in contact patterns that is persistent on multi-year time scales. Of course, this approach only captures heterogeneity mediated by age; i.e., it would only capture the full extent of heterogeneity in contact patterns if there was no within-age-group persistent variation in contact patterns. As such we should expect age-binned contact data to underestimate the level of persistent heterogeneity in contact patterns [Britton *et al*. 2020], perhaps quite substantially. In Fig. 13 we show the effect on CV of binning the data from the studies in Table 5 in 1- and 10-year age groups, respectively. Generally, binning by larger periods tends to reduce heterogeneity more, although 1-year binning is sufficient to bring the results of all studies to values below our model-based inference of persistent heterogeneity based on epidemic trajectories (CV = 1.1 for England and Scotland, as in Table 1, heterogeneous connectivity). When CV is artificially reduced by averaging processes such as these, models based on the resulting age-structured matrices are likely to overestimate the size of epidemics.

#### 5.4.2. Contact tracing

Contact-tracing data can also be used to describe heterogeneity in transmission by measuring directly the number of secondary cases attributed to each infected individual. Distributions of individual infectivity can then be characterised as in [Lloyd-Smith *et al*. 2005] who have analysed several pathogens including SARS-CoV-1. The authors found the measured distributions to be highly skewed around ℛ_{0}, with a CV of approximately 2.6 for SARS-CoV-1. [Adam *et al*. 2020] conducted one of the earlier such studies for SARS-CoV-2 (there are now many more but a review of this literature is beyond the scope of this work). The authors estimate parameters that correspond to a CV around 2.1. These studies traditionally assume a negative binomial model and parameterise the resulting distributions by a dispersion parameter *k*. From that we derived the corresponding CVs as the square root of the variance (calculated as ℛ_{c}(1 + ℛ_{c}*/k*)) divided by the mean ℛ_{c}.

Estimates of CV obtained from contact-tracing data are larger (roughly twice) those resulting from contact surveys (Section 5.4.1, Fig. S12), presumably because they capture variation in infectiousness as well as connectivity and, without further information, the two causes are inseparable.

### 5.5. Herd immunity thresholds and epidemic sizes

Fig. 14 shows the expected downward trends in the natural herd immunity threshold ℋ for SARS-CoV-2 as coefficients of variation of the gamma distribution increase (formulas (14) and (22)) considering an approximate basic reproduction number of ℛ_{0} = 3 (black and grey curves). Final sizes of the corresponding unmitigated epidemics (R^{*}) are shown in blue. In contrast with ℋ, the final size R^{*} cannot be derived explicitly but can be formulated implicitly as

for the variable susceptibility model (9)-(12),
for variable connectivity (17)-(20), and
for the homogeneous scenario [Diekmann *et al*. 2013].

Our inferences for CV from fittings to COVID-19 daily deaths until 1 February 2021 (Table 1) are marked as vertical grey lines. We use dark tones for heterogeneous susceptibility and light tones for heterogeneous connectivity.

We searched the literature and found no quantitative estimates of individual variation in susceptibility to infections although the evidence of its existence is documented [Carr *et al*. 2016]. Studies that quantify variation in connectivity are more prolific, as reviewed in Section 5.4 and marked by the vertical coloured lines in Fig. 14. Two types of studies were found: contact surveys (solid orange); and contact tracing (dashed red for SARS-CoV-1, and dashed orange for SARS-CoV-2). Our estimates (light grey) lie between those from these two study types.

The natural HIT resulting from our study (Section 5.1) can be visualised at the intersection between grey vertical lines and grey HIT curves. In both cases we obtain ℋ ≈ 25 − 27% in England, and ℋ ≈ 26 − 29% in Scotland, suggesting conserveness across the two models (variable susceptibility and variable connectivity). A combined formulation should result in intermediate CV estimates (currently 1.5 with variable susceptibility only, and 1.1 with variable connectivity) but similar HIT. We also find that once sufficient selection has occurred in the pandemic, our inferences become robust over different time scales (until 1 February 2021, to capture first and second waves, or until 1 July 2020, to capture the first pandemic wave only) contrasting with results from [Tkachenko *et al*. 2021] on the effects of time scales of individual variation on COVID-19 dynamics.

To explore sensitivity to the type of distribution we show equivalent plots to Fig. 14 with lognormal distributions in Supplementary Material (Section S5, Fig. S16). The curves for variable connectivity remain almost unaltered while the decline with increased CV becomes less steep for variable susceptibility.

### 5.6. Interpretation and counterfactuals

For over a century, mathematical epidemiologists have realised that individual variation in susceptibility and exposure to infection were key to determine the shape of epidemic curves. Models that underrepresent these forms of variation tend to overpredict epidemic sizes and consequently inflate the effects attributed to control measures. Fig. 15(a) depicts the epidemic curves that might have resulted from letting the COVID-19 pandemic run unmitigated in England (results for Scotland would be similar). Using parameters estimated in this study (Section 5.1), the homogeneous model (black) would have predicted almost half million deaths in total (460,000), while accounting for individual variation in susceptibility (red) would have brought these estimates down to less than a half (225,000) (individual variation in exposure leading to a similar result). The maximum number of deaths in a single day would have been around 16,000 by the homogeneous model and 6,000 when variability was considered. The figure also includes the real data for England (green dots), where the number of deaths in a single day has peaked at slightly over 1,200 in January 2021, a reduction attributed to adaptive behavioural changes and NPIs (but see [Wood 2021]). Although, in either case, it seems unimaginable that one would let a pandemic like this go unmitigated, the scenario appears more than twice as dramatic when individual variation is neglected.

Moving down Fig. 15, panel (b) simulates what would have happened according to both models with the first lockdown implemented (as it did in the spring 2020), but not the second and third (in the autumn-winter 2020-2021). In this scenario, the homogeneous model would have predicted around 320,000 deaths over an 8-month period around the peak (September 2020 - April 2021), while individual variation in susceptibility would have brought this projection down to around 135,000. The number of predicted COVID-19 deaths in a single day would have been around 4,800 by the homogeneous model and 1,800 when individual variation was considered. The gap between homogeneous and heterogeneous models appear to widen as the pandemic progresses but, more strikingly, the heterogeneous model (with first lockdown but without second and third) is not far from the real data (which included lockdowns 2 and 3).

Fig. 15(c) simulates all three lockdowns as estimated in Section 5.1, but without vaccines. The third lockdown is assumed to last until 7 March 2021, after which we let the controlled basic reproduction number ℛ_{c} initiate a linear increase towards normalcy (here assumed to reach the original ℛ_{0} by 19 July 2021), when most restrictions were removed in England (but see further discussion below in regards to increased transmission due to viral evolution). In the absence of vaccination, a large exit wave would have been expected under the homogeneous model but only a shallow hump is apparent when individual variation is considered.

Finally, the effects of vaccination are explored in the bottom panel (Fig. 15(d)). We simulate a vaccination programme according to the proportion *v*(*t*) of the population fully vaccinated each day (two doses was the standard full vaccination schedule at the time) as recorded in England (blue):
with force of infection

The vaccine is assumed to confer either *ξ* = 30% (solid curves) or *ξ* = 70% (dashed) protection against infection and 1 − (1 − *ξ*) *·* (1 − *ζ*) = 90% against death [Voysey *et al*. 2020, Bernal *et al*. 2021, Sheikh *et al*. 2021]. The implementation of a programme with these characteristics would have reduced the size of the third waves as visualised by comparing Figs. 15(c) and 15(d). According to the homogeneous model the third wave would have nevertheless remained large (black), while under heterogeneity the vaccination programme would have brought model trajectories (red) to the level of the data (green).

To better assess the dynamics governed by the heterogeneous model see Fig. 16(a),(b). We separate the scenarios of 30% and 70% vaccine efficacy against infection in two panels. We also allow ℛ_{c} to reach a value different from baseline after restrictions are lifted on the 19 July 2021. In each vaccine efficacy scenario we calculate the value that ℛ_{c} would have to reach for the number of COVID-19 deaths generated by the model between 1 February 2021 and 1 November 2021 to match the data (26, 129 deaths over the stipulated 9 months). We obtained ℛ_{c} = 1.23 *·* ℛ_{0} in the 30% scenario and ℛ_{c} = 1.41 *·* ℛ_{0} in the 70% scenario. Subtracting the number of deaths under vaccination from the corresponding simulations without vaccination would lead us to conclude that the vaccine would have prevented around 52, 000 and 88, 000 deaths in the 30% and 70% vaccine efficacy scenarios, respectively. This calculation is not possible with the homogeneous model because no vaccination programme in that model can fill the gap between projections in Fig. 15 (c) and the reported deaths (see Fig. 16(c),(d)). Scenarios equivalent to those provided for the heterogeneous model result in epidemic waves one order of magnitude larger when heterogeneity is omitted. Heterogeneous models, which performed better than homogeneous in statistical inferences in this paper (whether based on fittings until July 2020 or February 2021, with or without reinfection, assuming different infection fatality ratios, informed by government stringency indices and mobility data or not), also generate projections in closer agreement with data beyond the fitted period.

In the timeframe of this study, SARS-CoV-2 has undergone substantial evolution, with dominant genetic types being replaced by more transmissible variants. Specifically, the larger wave which peaked in January 2021 was caused by the alpha variant (B.1.1.7), estimated as up to 30% more transmissible than early viruses, while the flatter wave in the summer 2021 was mostly due to delta (B.1.617.2), estimated about 40% more transmissible than alpha [PHE 2021, Richard *et al*. 2021, Wang *et al*. 2021] and hence 80% more than early types. The crude analysis in Fig. 16 suggests that ℛ_{c} might have been 23 − 41% higher during the delta wave than in the early months of the pandemic (baseline ℛ_{0}). This is consistent with people retaining cautious behaviours which cancel some of the increases in transmissibility, e.g., a virus that is 80% more transmissible than the original (wild type) opposed by 22% reduced transmission due to behavioural change has ℛ_{c} = 1.4 *·* ℛ_{0}. In December 2021, delta was replaced by the omicron variant (B.1.1.529) [Elliott *et al*. 2022], assessed as even more transmissible but less pathogenic, a phenomenon that would be interesting to analyse with the models presented here but is beyond scope of the present study.

Throughout the COVID-19 pandemic, our heterogeneity models have systematically projected lower infection rates in the UK (more specifically, England and Scotland) than others conforming to more common approaches [Flaxman *et al*. 2020, Knock *et al*. 2021, Sonabend *et al*. 2021, Davies *et al*. 2020, Davies *et al*. 2021, Hilton *et al*. 2020,Keeling *et al*. 2021a, Keeling *et al*. 2021b, Moore *et al*. 2021]. This was also noticed in similar analyses featuring Spain and Portugal [Aguas *et al*. 2020]. When we disable individual variation in susceptibility and exposure to infection we obtain results similar to those presented elsewhere, suggesting that to be the distinguishing feature that prevents our models from overpredicting infection rates. When mitigations are applied the discrepancy between homogeneous and heterogeneous trajectories increases as the pandemic progresses, eventually causing the rejection of models that lack sufficient heterogeneity. By then, however, such models would have over-attributed impact to interventions, and may have induced governments to systematically make suboptimal decisions by over-weighting benefits in relation to the costs and collateral damages of control measures. More work is need to establish to what extent these might have happened in the COVID-19 pandemic.

## 6. Limitations of the study

We have presented an original approach to modelling pandemics that includes inference of individual variation from its response to selection and enables reliable projections based on series of daily counts of disease outcomes (deaths in this case, but it could have been hospitalisations, reported infections, or a combination of multiple data streams) complemented by minimal information about interventions (Sections 5.1 and 5.2). Future developments might integrate more detailed data about interventions, such as NPIs and vaccines, the emergence and spread of new variants, individual variation in vulnerability to infection due to age, comorbidities or other factors, seasonal dynamics, susceptibility replenishment through waning immunity or birth and deaths, and interactions among these and other processes. Based on explorations initiated here (Sections 5.3 and 5.6) we expect our results to be robust to such extensions given abounding agreements with empirical evidence from independent sources (Section 5.4 and 5.5). Specifying more detail into the models does not necessarily enable more accurate predictions, but there are often other reasons, such as informing the design of targeted interventions, for such developments [Mishra *et al*. 2020]. Although those extensions are beyond the scope of the current paper they are sometimes essential in specific circumstances.

Another aspect where this work may fall short of expectations, and we hope to stimulate further research, is in the treatment of uncertainty. Our inferences result in noticeably narrow credible intervals which may inadvertently suggest overconfidence. Factors that may have contributed to this include the lack of account for uncertainty around some input parameters (such as incubation period, recovery rate and infection fatality ratio) which we assume at fixed values taken from the literature, the absence of stochasticity along individual trajectories (although we consider between individual variability) and the choice of a particular Monte Carlo algorithm [Stapor *et al*. 2018] implemented in Matlab while alternatives are always worth exploring given the rapid development of MCMC methods (although we have replicated a sample of fittings with other algorithms in Stan and obtained similar results). Given the wide discrepancy between heterogeneous and homogeneous model outputs and the consistency across a spectrum of scenarios, however, methodological improvements along these lines are not expected to affect the conclusions of this study. They may nevertheless reduce the confidence on particular metrics being estimated.

## 7. Discussion and conclusion

The conclusions of this work apply much beyond COVID-19. Individual variation in the propensity to acquire infections has many factors. Individuals differ in genetic background, the local environment they live in, and social activity, in ways that affect their susceptibility and exposure to many infectious diseases. Immunological studies identify local environmental conditions as key in shaping the human immune system [Carr *et al*. 2016], while social inequality is probably a main driver [Cevik and Baral 2021]. Studies designed to identify specific factors and characterise their modes of action are key, not necessarily to inform the development of more predictive models but primarily to help focusing interventions on those individuals who need more protection. Notwithstanding these studies being conducted, there will always remain some unobserved variation and, for the sake of model predictability, all variation responding to selection by natural infection must be accounted for to prevent biased outputs. For this reason we took the inverse approach of quantifying selectable variation by fitting fairly generic models to epidemic curves. We used the COVID-19 pandemic in England and Scotland to test the approach. We estimate coefficients of variation for variable connectivity (around 1.1) in agreement with empirical studies of contact patterns, resulting in natural herd immunity thresholds around 25 − 28%. For variable susceptibility we are not aware of available data specific to human COVID-19 to immediately compare our inferences (CV around 1.5, leading to herd immunity thresholds around 27 − 29%).

The natural herd immunity threshold used here (ℋ) describes a ceiling for epidemic growth formulated in terms of the basic reproduction number (ℛ_{0}) and coefficients of individual variation in traits that are under selection by natural infection (namely, susceptibility and exposure to infection), filling a gap in mathematical epidemiology. While ℛ_{0} is indicative of the early growth of an epidemic, it says little about the final size and the height of the peak, which are strongly dependent on types of individual variation that decrease transmission over time (see also [Brauer 2018]). ℋ accounts for that variation enabling more accurate predictions of how large an epidemic might be, but it is a theoretical framework to the extent that ℛ_{0} is a theoretical framework. ℋ changes if the parameters that determine its value change. Most notably, natural changes in ℛ_{0} through time, which can happen due to seasonal forces or viral evolution, will manifest in ℋ. In particular, the percentage of the population immune required to prevent sustained epidemic growth may become higher than the initially estimated ℋ if we enter a high-transmission season or if the virus evolves towards higher transmissibility.

The UK experienced a first wave of the wild type virus in the spring 2020 (low-to-moderate transmission season but highly susceptible population) and initiated a second wave in the autumn of the same year (presumably driven by seasonality). The wave was subsiding (due to acquired immunity and contact restrictions) when a more transmissible variant (alpha) emerged, reaccelerating epidemic growth and becoming dominant by January 2021. By the end of January 2021, this wave was retracting again. Levels of infection and disease were especially low throughout April-May 2021 but this was somewhat altered by an even more transmissible variant (delta) that become dominant by the end of May 2021. The delta variant, which ravaged parts of the world (such as India, where it was first detected), maintained moderate levels of infection and disease in the UK throughout the summer and autumn 2021 (immunity due to previous infection and vaccination having a key role in this outcome). By the time this study was completed (November 2021), a new variant was being detected (omicron, associated with even higher transmissibility, lower incubation period and lower pathogenicity) but this is beyond the scope of this paper.

Although our models have not been developed to estimate variant-specific parameters (except for the wild type), they can give an indication of which parameter ranges are compatible with the transmission dynamics in each nation at the time a particular variant was dominant. In England, assuming an infection fatality ratio of 0.9%, our heterogeneous models estimate ℋ for wild type in the range 25 − 27% (ℛ_{0} around 2.7). Assuming alpha 30% more transmissible, and delta 40% more transmissible than alpha [PHE 2021], we obtain ℋ in the range 30 − 33% (ℛ_{0} around 3.6) for alpha, and 36 − 40% (ℛ_{0} around 5.0) for delta. The estimated cumulative infected percentage is close to 25% by November 2021, suggesting that the nation was then entering the range estimated for the natural herd immunity threshold for wild type SARS-CoV-2 (assuming acquired immunity to be effective across variants), but remained 5% and 11% below threshold for alpha and delta variants, respectively. Adding that almost 70% of the population had been fully vaccinated by then, it might have been reasonable to expect herd immunity to be achieved soon for the variants as well. This was happening just prior to the rapid rise of the omicron variant by December 2021.

In Scotland, we estimate ℋ in the range 26 − 29% (ℛ_{0} around 2.9) for the wild type, 31 − 34% (ℛ_{0} around 3.8) for alpha, and 37 − 41% (ℛ_{0} around 5.3) for delta, where the same infection fatality ratio of 0.9% would suggest a cumulative infected percentage close to 20% by November 2021. In this case, the nation remained 6% below the natural herd immunity threshold for the wild type, 11% for alpha, and 17% for the delta variant. As in England, though, it is plausible the addition of vaccination coverage would have brought the nation to the herd immunity threshold for the three SARS-CoV-2 types, but more research is needed to determine this with satisfactory confidence.

These figures are for the scenario where the IFR is 0.9% and should be adapted if different assumptions are made about this metric, but explorations reported in Supplementary Material suggest that general tendencies will not be affected. Also worth noting that although neither the wild type nor the alpha variant remained in circulation by November 2021, it is of both theoretical and practical interest to detail the mechanisms that resulted in their elimination. In particular, our analyses suggest that variant-specific naturally acquired immunity had a much larger role than what might have been expected according to models that take less account of individual variation in susceptibility and exposure to infection (recall that models that assume homogeneity inflate herd immunity thresholds to 60% and higher). More generally, heterogeneity affects competition mechanisms and measures of relative fitness between variants [Gomes *et al*. 2019], and efficacy of vaccines over time [Gomes *et al*. 2016, Gomes 2020], in ways that are yet to be considered for COVID-19 [Bushman *et al*. 2021, Althaus *et al*. 2021, Thomas *et al*. 2021].

Finally, the maintenance of herd immunity is a dynamic process [Gomes 2021]. Immune individuals lose their immunity or die while new susceptibles are born. Further infections and vaccinations replenish population immunity maintaining infection rates at an endemic equilibrium - the herd immunity threshold. This equilibrium is precarious in the sense that it is sensitive to viral evolution and demographic change, and may be temporarily eroded by seasonality [Bacaër and Gomes 2009] in a pattern exhibited by many common air-borne or vector-borne infectious diseases.

## Data Availability

All data referred to in the manuscript are publicly available.

## Acknowledgements

We thank Jorge Carneiro, José Ferreira Machado, Kate Langwig, Robert MacKay, Paul McKeigue, Antonio Montalbán, Joe Schoneman and Laurette Tuckerman for valuable discussions throughout this study. At the University of Strathclyde, Matthew Burns, Zhichun Jiang, Naithan McNeil, Lauren Schofield and Aidan West conducted their final year BSc projects on Communicating Mathematics and Statistics, supervised by M.G.M.G., on topics related to this study while this paper was being written. This has contributed clarity to our presentation. The models presented here were first submitted to *medRxiv* on 27 April 2020 and posted soon after. Applications to the COVID-19 pandemic and subsequent updates followed. We are grateful to the preprint server for making our work available to interested readers in real time. M.U.F. received funding from Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil.