Estimation of age-stratified contact rates during the COVID-19 pandemic using a novel inference algorithm

Christopher M. Pooley; Andrea B. Doeschl-Wilson; Glenn Marion

doi:10.1101/2022.01.12.22269157

Summary

Well parameterised epidemiological models including accurate representation of contacts, are fundamental to controlling epidemics. However, age-stratified contacts are typically estimated from pre-pandemic/peace-time surveys, even though interventions and public response likely alter contacts. Here we fit age-stratified models, including re-estimation of relative contact rates between age-classes, to public data describing the 2020-21 COVID-19 outbreak in England. This data includes age-stratified population size, cases, deaths, hospital admissions, and results from the Coronavirus Infection Survey (almost 9000 observations in all). Fitting stochastic compartmental models to such detailed data is extremely challenging, especially considering the large number of model parameters being estimated (over 150). An efficient new inference algorithm ABC-MBP combining existing Approximate Bayesian Computation (ABC) methodology with model-based proposals (MBP) is applied. Modified contact rates are inferred alongside time-varying reproduction numbers that quantify changes in overall transmission due to pandemic response, and age-stratified proportions of asymptomatic cases, hospitalisation rates and deaths. These inferences are robust to a range of assumptions including the values of parameters that cannot be estimated from available data. ABC-MBP is shown to enable reliable joint analysis of complex epidemiological data yielding consistent parametrisation of dynamic transmission models that can inform data-driven public health policy and interventions.

Introduction

Dynamic mathematical models, such as the Susceptible-Infectious-Recovered (SIR) transmission model and its numerous variants, have been widely applied over many decades to enhance understanding of infectious disease spread and control in populations (see e.g. [1]). They have been particularly prominent in guiding public health responses to the COVID-19 pandemic [2], and are increasingly seen as important tools for causal inference in epidemiology [3]. Model results used to inform public health decisions must be defensible, requiring models to be informed by, and be consistent with, the widest range of suitable data sources available [4]. However, as discussed below, the COVID-19 pandemic has shown that tools to enable reliable and routine inference for sufficiently realistic models of disease dynamics from multiple sources of data are currently lacking. Here we address this gap by introducing a novel algorithm that enables efficient inference for complex compartmental models, illustrating its potential by fitting an age-structured national-scale model to multiple data sources (demographic, operational and survey data) arising from the COVID-19 epidemic in England 2020-21. This enables simultaneous inference of many key parameters describing the outbreak, and the benefits of such an integrated approach are exemplified by updating pre-pandemic age-stratified contact rates to account for relative differences in contact between age groups.

Fitting complex compartmental models to multiple data sources, with required uncertainty estimates for the key model parameters, comes with computational challenges. Bayesian inference for stochastic models of disease dynamics has been implemented using a range of methodologies that provide exact or approximate approaches to drawing samples from the posterior distribution. Data augmentation approaches, typically combined with Markov chain Monte Carlo (MCMC), offer exact, albeit computationally expensive inference, but require explicit expressions for the complete-data likelihood (see e.g. [5] for an application to an individual level model of COVID-19). In contrast, approximate Bayesian computation (ABC) [6], including frequently used sequential Monte Carlo (SMC) versions [7], is likelihood-free, requiring only the ability to simulate from the model, but typically draws samples from a poor approximation to the posterior. Other SMC algorithms, such as particle MCMC (pMCMC) [8] are also likelihood-free, and in principle offer exact samples from the posterior by developing unbiased estimators for the likelihood via simulation. In many cases these perform well, however, their computational cost typically scales poorly with the dimensionality of the data [9], a serious drawback when applying to large datasets such as those used for COVID-19 here. A relevant example is [10], which applies pMCMC to a stochastic age-structured non-spatial model used to look at variation in the COVID-19 reproduction number separately for different regions of England (for comparison this contained 26 parameters and fitted around 1300 data points with an over-dispersed observation model to aid computational speed¹ [11]). As with others studies, a contact matrix based on the pan-European POLYMOD study [12] was assumed, so limiting the number of parameters to be estimated, and the model was fitted using non-age-structured (i.e. aggregated) data. Adding in age structure greatly increases the number of individual observations (by a factor of ∼17 here), and we found this makes applying a pMCMC algorithm computationally impractical (results not shown). This motivated development of the new ABC-MBP algorithm described below (see Methods) that combines model-based proposals (MBPs), derived from data-augmentation MCMC, with ABC methods.

Quantifying the effective contact structure of a population (i.e. how different groups intermix and transmit infection during an epidemic) is central to epidemiological modelling. For diseases with age as a risk factor, such as COVID-19, understanding how such contacts vary with age is vital because it helps quantify the likely impact of age targeted restrictions, e.g. school closures, care home restrictions or closing night time entertainment. Typically simulation studies are used to assess this impact [13, 14], but accuracy is limited by the degree to which model parameters can correctly be identified given the available data. Approximations of social mixing patterns have been constructed based on wearable proximity sensors [15, 16] or self-recording of contacts [17]. In terms of scale, two studies in the latter category stand out. Firstly, the 2005-6 POLYMOD study [12, 18] recruited 7,290 participants across eight European countries and asked each participant to fill in a diary that recorded the contacts they made on one particular day, along with estimated ages of those contacts. Secondly, the 2017-18 BBC Pandemic study [19] recruited 36,000 volunteers from the UK who used a self-reporting mobile phone app to inform about contacts made. Based on these surveys each study published (pre-pandemic) estimates for the so-called “contact matrix” C⁰, which captures the social mixing patterns between age groups [20] that have been widely used to model transmission dynamics of COVID-19. In this paper we infer the time-varying reproduction rate Rt to quantify the net impact of societal response (lockdowns, social distancing etc.) during the pandemic, and, importantly, estimate modifications to C⁰ that quantify relative changes in mixing between different age groups.

Age stratification in observed COVID-19 cases may be driven by age related variations in susceptibility, contacts, infectivity, probability of being asymptomatic and time spent in latent, infectious and other disease states. This complicates estimation of age-related changes in contacts, but [21] show that viral load does not vary greatly with age or between hospitalised and other cases and thus here we assume infectivity is constant across all age classes and hospitalised and non-hospitalised cases. Furthermore, this study found similar viral load trajectories for children and adults and so we also assume residency times in disease states are not age dependent. In contrast, previous studies have shown age-related differences in contacts [12, 18, 19], susceptibility and proportion of clinical (symptomatic) cases [22] as modelled here (see Methods). We show that use of PCR and seroprevalence survey results, in addition to case reports, allows estimation of age-related fractions of asymptomatic infections alongside Rt and age-stratified hospitalisation rates and deaths from hospitals. However, estimation of age-related susceptibility and (relative) contact rates, are confounded and therefore we estimate them in two separate analyses, in each case leading to models that perform similarly and are better predictors than models that don’t account for such heterogeneity. Results (see below) suggest relative overall contact rates are reduced and age-related contact patterns are different to pre-pandemic patterns with greater relative contact rates by younger age classes. We provide an estimate for these changes assuming susceptibility does not vary with age. Consistent with increasing case fatality rate with age, others have found increasing susceptibility with age [22] suggesting our estimates of increased relative contacts by younger groups is conservative. In contrast, our estimation of age-dependent susceptibilities (assuming pre-pandemic contact patterns) are not consistent with increasing age-related susceptibility, further indicating that pre-pandemic relative mixing matrices require re-estimation during the pandemic. In the main text we therefore focus on estimation of age-related contact rates, and defer a comparison to the age-dependent susceptibility model until the discussion.

Methods

We first describe the model, give a brief description of the data, and then outline the new ABC-MBP Bayesian inference algorithm used to infer model parameters.

The compartmental model

Figure 1 shows the non-spatial age-stratified compartmental model, representing the number of individuals in a given age-class and disease category. The population is assumed to be stratified into 18 age groups indexed by a: the first 16 represent five year age bands (“0-4”, “5-9”,…,”75-79”), then “80+” for all individuals 80 and over, and, finally, a group for care home residents denoted by “CH”². Once an individual in the susceptible compartment S becomes infected by COVID-19 they enter the exposed E compartment where they spend an exponentially distributed residency time mE. A branching probability determines whether they then enter an infectious I compartment³ or an asymptomatic A compartment. Individuals go on to trace a path in Fig. 1, undergoing exponentially distributed residency times in the various compartments and choosing transitions based on branching probabilities when alternative routes exist, until they eventually reach the terminal recovered R or dead D compartments. Note, the branching probabilities in this diagram all depend on age group a. These probabilities will later be estimated from data.

Figure 1: The compartmental model.

The compartments are defined as follows: S: Susceptible, E: Exposed, A: Asymptomatic, I: Infectious, T: PCR test-sensitive but non-infectious, C: Infected but non-infectious (due to self-isolation), H: Hospitalised (or potentially care home in the case of care home residents), R: Recovered, and D: Dead. Variable m_c gives the mean residency time in a compartment c (with individual residency times exponentially distributed about this mean) and gives the branching probability for an individual in age group a going from compartment A to B (note, these probabilities are constrained to add up to one when leaving a given compartment). The term λ_a,t refers to the time-dependent force of infection acting on individuals in age group a. The annotations also indicate how data from the COVID-19 epidemic in England inform inference. Survey data are taken to provide an unbiased estimate of proportions of the total population in the sum of classes A, I, T, C (PCR survey) and R (seroprevalence). Operational data is used to inform transition rates E→I (test positive cases for second pandemic wave), C→H (hospitalisations) and H→D (deaths in hospitals and care homes).

The force of infection λ_a,t in Fig. 1 defines the probability per unit time that an individual becomes infected: The various terms in this expression are as follows:

Reproduction number R_t

This time-varying quantity is defined as the expected number of secondary cases directly generated by a case in a population in which nearly all individuals are susceptible. Estimation of this quantity relies on the approach taken by Diekmann et al. [23], in which R_t is calculated from the highest eigenvalue of the next generation matrix (see Appendix A in the Supplementary Material for further details). R_t is incorporated into the model by means of a piecewise linear spline. The breakpoints for this spline are placed at roughly two-week intervals (with additional points near to lockdowns where it may be expected that R_t changes rapidly). The quantity r_t, which links R_t to the more conventionally used transmission rate, is derived from other model parameters (see Appendix A).

Age group population P_a

The total population size in age group a.

Pre-pandemic contact matrix C⁰

This square matrix captures disease relevant contacts between individuals in the 18 age groups. In particular, the element gives the average number of daily contacts for an individual in age group a with individuals in age group a’. Specification of this fixed matrix is based on data from the BBC Pandemic study [19] (details are provided in Appendix C). A pictorial representation of C⁰ is given in Fig. 2(a). Note the tri-diagonal structure results from frequent inter-generational contacts⁴.

Figure 2: The contact matrix.

(a) The pre-pandemic matrix C⁰ based on the BBC Pandemic study [8] (see Appendix C). For an individual represented by an age group on the x-axis this shows the estimated numbers of daily contacts they make with individuals in the same and different age groups on the y-axis (darker red colours indicate more frequent contacts). (b) The inferred age-adjusted contact matrix C based on COVID-19 data. (c) The factor difference in C over C⁰ (red colours indicate an increase and blue colours a decrease).

Age contact factors v

Each element of this vector corresponds to a specific age group and multiplies the corresponding columns and rows of the contact matrix C⁰. It effectively allows the model to enhance or reduce the relative rate of contacts for different age groups in order to generate an “age adjusted” estimate C for the contact matrix. These contact factors are constrained to have a weighted average of one, i.e. where a sums over all age groups and P is the total population size⁵.

Infectivity φ_c

This sets the relative compartmental infectivity. For simplicity we assume only two infectious compartments: I (for which φ_I=1) and A (for which φ_A=0.55, as estimated in Appendix K). All other compartment have φ_c=0.

Subpopulation size N_c,a,t

The total number of individuals in compartment c and age group a at time t.

External force of infection η_t

The time-varying probability per unit time an individual becomes infected from a source outside of England (see Appendix D for how this is derived from flight [24] and global COVID-19 death data [25]). For convenience this term is divided by a factor f=10⁵ such that it is expressed in units of infections per 100k individuals.

Three important points can be made about Eq.(1): Firstly, it corresponds to a non-spatial model, equivalent to assuming small spatial variation across England for demographic makeup, the contact matrix and external force of infection⁶. Secondly, individuals in all age groups are taken to be equally susceptible to disease (this is revisited later). Lastly, whilst age contact factors v adjust the relative rates of contact for different age groups compared to the pre-pandemic matrix C⁰, it is important to remember that the overall effective contact matrix is modulated by the reproduction number: where R₀ is the basic reproduction number. This accounts for changes in the overall rate of contacts (through government interventions and modification in social behaviour) as well as measures aimed at blocking transmission on contact (e.g. wearing face masks, washing hands, social distancing)⁷. Hence, time variation in R_t is effectively a proxy for time variation in the overall rate of effective contacts.

The stochastic dynamics of the model presented above are approximated by a τ-leaping algorithm [26] with a fixed time step of τ=0.5 days (see Appendix E for details).

Data sources

The time period used for the analysis in this study is between 1^st January 2020 (before the start of the COVID-19 pandemic in England) and 9th June 2021 (just after the second wave of infections). Figure 1 provides a synoptic illustration of how demographic, operational and survey information are interpreted in the context of the compartmental model to enable inference. Further details and information on how these datasets have been prepared for analysis are given in Appendix F (plots of the raw data can be seen by the black lines in the posterior Supplementary Results, with age-aggregated results in Fig. 6). In total, this data consists of almost 9000 individual observations.

Bayesian inference

Collectively the model parameters are referred to as θ (these determine the movement of individuals through the compartmental model in Fig. 1). When simulating from the model, individuals start in the susceptible S compartment⁸ and transitions between the compartmental states result in the subpopulations p_a,c,t within age group a and compartment c changing as a function of time t. These dynamics are collectively referred to as the system “state” (actually a set of states for each point in time), and denoted ξ. In reality ξ is unknown, and from a Bayesian perspective is considered a set of latent variables to be estimated.

The data is collectively denoted y. Application of Bayes’ theorem implies that the posterior probability distribution π(θ,ξ|y) is given by where various terms in this expression are:

The observation model π(y|ξ)

This gives the probability of the data given a system state based on statistical characterisation of the relevant observation processes. Often writing down a true observation model is complicated by the fact that we may not know the error associated with a set of measurements and the observation probability for a set of measurements down a time series may be highly correlated.

Rather than attempt this, this paper follows an approximate Bayesian computational (ABC) approach which relies on introducing a measure of fit between the data y and the state ξ called the “error function” (EF). A small error function implies a close correspondence between ξ and y. For ABC the observation model is defined to be non-zero when the error function is less than some specified cut-off EF_cutoff: The challenge for inference algorithms is to reduce this cut-off as much as possible to ensure a close correspondence between the system state ξ and the observed data y. How this is implemented in practice is discussed in the next section.

Several possibilities exist for the choice of EF⁹, but the following proved effective for the problem in this paper: where the sum i goes over all individual measurements made on the system (i.e. each of the data points on the age-stratified time series for cases, deaths, hospital admissions and the CIS results), y_i is the value of the ith measurement and Y_i is the equivalent value derived from the state ξ. The small constant (dependent on the time series g_i to which i belongs) is introduced to ensure validity even when y_i or Y_i are zero¹⁰.

When y_i and Y_i are equal the contribution to the error function in Eq.(6) is zero. When they are unequal the contribution is always positive (and for small deviations, close to the square of the fractional change in the quantity). The weights w_i in Eq.(6) set the relative importance of different measurements. In particular, they place greater weight on larger observed values¹¹ and overall give an approximately equal weighting to each of the time series in the data¹² (see Appendix I for details).

The latent process likelihood L(ξ|θ)

This gives the probability of simulating a state ξ given a set of model parameters θ. The MBPs used below are “likelihood-free”, so this quantity is not explicitly calculated.

The prior π(θ)

This captures the state of knowledge regarding parameter values before data y is considered. Prior specifications are given for all model parameters in Table R1 of the posterior Supplementary Results (see Appendix J for further explanation). Parameters for which the data is uninformative are fixed to plausible values taken from the literature (see Appendix K)¹³.

Approximate Bayesian Computation with model-based proposals (ABC-MBP)

In this section we explain the ABC-MBP approach used to perform inference. Note, a “particle” refers to the combination of a set of model parameters θ and a system state ξ. The algorithm below iterates over a series of generations G, and with each generation it successively improves the posterior accuracy:

Initialisation – A generation index g is set to 1. P particles¹⁴, indexed by p, are sampled from the model. This is achieved by first sampling from the prior π(θ) and then simulating from the model (using the τ-leaping algorithm described in Appendix E). The initial error function cut-off is set to infinity .
Particle culling and resampling – An error function cut-off is set such that exactly half the particles have an error function above this value and half below¹⁵. Particles with error functions below are directly copied to create new particles in the next generation . On the other hand, particles with error function above are discarded and replaced by a randomly selected particle from one of those whose error function is below this limit. The generation index g is incremented.
Particle mixing – To avoid degeneracy each particle undergoes a series of MBPs collectively referred to as an “update” (see appendix L for details) [9, 27]. These allow particles to explore parameter and state space subject to the condition that their error function must not exceed . MBPs are the key novelty in this algorithm because they generate efficient MCMC mixing¹⁶ by making joint proposals in parameter and state space possible whilst at the same time restricting large changes in the error function (this is in contrast to standard ABC schemes that rely on simulation¹⁷).
Jump to 2 if g is less than G. Particles in the final generation are used to approximate the posterior distribution.

This algorithm guarantees the error function cut-off monotonically decreases towards zero as the generation number increases. As it does so, particle states are forced to lie closer and closer to the data and posterior estimates similarly approach the true posterior. As the generation number g gets higher, however, so more and more MBPs are required in step 3 to avoid particle degeneracy. Therefore, for reasons of computationally practicality, the total generation number G is usually considered sufficiently large when posterior distribution estimates have apparently converged to steady state values (see posterior Supplementary Results Fig. R2).

Results

Inference protocol and validation

Inference was performed using the data and ABC-MBP approach outlined (see Methods). This was implemented using an open source software package called BEEPmbp (see Appendix M for details). To ensure robustness against lack of convergence, or multimodality in the posterior, each analysis was completed by running BEEPmbp K=16 times (chosen to check if randomly initialised starting states converge on the same posterior solution), with each run using P=16 particles (sufficiently large to adequately represent the posterior and avoid becoming stuck in metastable states) iterated over G=350 generations (large enough to get good estimates for parameters, see Fig. R2 in the posterior Supplementary Results). The results presented below are based on the K×P=256 samples generated over those runs¹⁸. The CPU time in total (over all 16 runs) was 4.5 hours when running on 256 cores. Whilst computationally expensive for challenging problems like this one, ABC-MBP is highly parallelisable and can be run relatively quickly given sufficiently powerful computing resources. Inference on representative simulated data correctly identified true model parameters demonstrating the validity of the algorithm for the problem at hand (see Appendix N for details).

Analysis of England 2020-21 COVID-19 outbreak data

The posterior Supplementary Results provide details of fitting the model (Fig.1) with equation (1) to the COVID-19 outbreak data as described above, including: inferred model parameter distributions (Table R1), estimates for the contact matrix C (Table R2), graphs showing fits between raw data and inferred states (Fig. R1)¹⁹, and convergence down generations (Fig. R2). Here we summarise the key results.

Posterior mean estimates for the contact matrix C

are shown in Figure 2(b) with factor differences between C and C⁰ in Fig. 2(c). Based on these, noticeable differences in the contact patterns occur in the majority of age groups. The substantially elevated mixing amongst younger age groups can be understood by examining Fig. 3(a), which shows posterior distributions for the age contact factors (vector v in Eq.(1)). For younger age groups (aged 24 and below) the values exceed one indicating increased contacts relative to what would be expected from C⁰. We note this is broadly consistent with [5], who find increased force of infection for those under 60 versus older individuals. Given the reduced severity of symptoms in these age groups it seems plausible that this results from higher contact rates, rather than increased rates of shedding. The pattern in older age groups is somewhat more complex; group 40-44 was found to reduce contacts relative to C⁰ the most, but interestingly relative rates go up again until the 55-59 age group before reducing for older individuals. It should be noted that, in contrast to other age-groups, absence of data on pre-pandemic contact rates for care-home residents means the estimate for v_CH (Fig.3(a)) should not be interpreted as an increase in contact rate. Rather, it is simply a reflection of the fact that contact rates in care homes are much higher than the rest of the population²⁰ (due to being communal environments with frequent contacts between residents and staff) which has led to a disproportionate number of cases, and sadly deaths, compared to the elderly in the general population.

Figure 3: Age contact factors and variation in susceptibility.

This shows two alternative models used to fit the data: (a) Age contact factors v modify the pre-pandemic contact matrix C⁰ for different age groups in the population (see Eq.(1)). (b) Age groups are given a relative susceptibility to acquiring infection (see Eq.(7)). The error bars show 95% credible intervals.

Time variation in the reproduction number

Figure 4(a) shows the posterior distribution for R_t with important events on the timeline shown by vertical lines (see Appendix O and [28]). As mentioned in Eq.(3), this is a proxy for the overall effective contact rate in the population. On the other hand the effective reproduction number (see Appendix A), as shown in Fig. 4(b), tracks disease progression (above one implies COVID-19 infections are increasing)²¹. These estimates capture the net impact of the many factors that determined the epidemic trajectory. The first lockdown clearly had a huge impact on reducing below one and ending the first wave. A gradual increase over the summer and autumn of 2020 (coinciding with restrictions being lifted) led to a second pandemic wave. Despite an already falling reproduction number in the autumn to winter of 2020, a second lockdown was instigated leading to fall below one once again. Emergence of the more transmissible alpha variant, along with increased mixing during Christmas festivities, led to a substantial rise in cases towards the end of 2020, ahead of a third lockdown. The vaccination program, appears to have helped to curb epidemic spread in early 2021, while emergence of the delta variant, along with easing restrictions, has led to again increasing in the summer of 2021. Detailed analysis of the impact of these events is likely to prove difficult due to the cofounding effects of multiple factors.

Figure 4: Time variation in reproduction number.

(a) The inferred value of the reproduction number R_t as a function of time (red line gives posterior mean and dashed lines indicate the 95% credible interval). This is proportional to the overall effective contact rate (see Eq.(3)). (b) The effective reproduction number (which accounts for the fact that a fraction of the population is not susceptible). Above/below the horizontal black lines shows where COVID-19 is increasing/decreasing. The vertical blue lines denote important milestones during the epidemic [29]. Arrows showing changes in the dominant COVID-19 variant are estimated from COG UK [35] (see Appendix O) and for vaccination come from CIS [24] (these indicate the time period in which between 5% and 95% of individuals are vaccinated with the first dose for selected age groups).

Age-dependence in branching probabilities

Figure 5(a) shows the posterior estimated asymptomatic branching probability for different age groups a. Overall, around 55% of exposed individuals are asymptomatic. This value is consistent with the large range of figures suggested in the literature (e.g. from 18%-75% [29-31]). Asymptomatic branching probabilities are highest for younger age groups, which has important implications, e.g. for interpreting school cases rates (which may significantly under-represent true infection levels). The proportion of asymptomatics declines through adulthood and rises again in older age groups (Fig. 5(a)) consistent with other studies based on data prior to vaccine availability (see e.g. Fig. 2(c) in [22]). The notable exception is in the 80+ and care home groups (which, by definition, are set equal²²), which have a significantly lower asymptomatic branching probability. This result stems from the fact that although many cases are observed in the 80+ group, measurements taken by the CIS near the end of 2020 show only a small proportion of randomly sampled 80+ individuals are found to be antibody positive. One possibility is that waning immunity is much faster in this group, or that a weaker immune response results in more false negative antibody test results.

Figure 5: Age-dependent branching probabilities and residency time in T.

The probability of: (a) becoming asymptomatic, (b) becoming hospitalised given a case (note, for care home patients “hospitalised” may mean going to hospital or becoming critically ill within a care home), and (c) of death given hospitalised. (d) Shows the residency time in the PCR test-sensitive T compartment. The error bars indicate 95% credible intervals.

Figure 6: Simulation results and data.

These plots show age-aggregated data (black) against 500 simulations (blue) performed using model parameters taken from the analysis posterior means (see Table R1 of the posterior Supplementary Results). The red lines indicate an average across simulations. (a) The total population in the infected I, C, A and T compartments (data from CIS), (b) daily cases, and (c) daily hospital admissions (not including care home patients), (d) daily deaths, (e) total recovered (excludes 0-14 age groups and care home residents, single data point from antibody results in CIS). Note, to aid visualisation of the data the first peak has been cut off.

The probability of hospitalisation (given a case) and dying in hospital (given hospitalised) are shown in Figs. 5(b) and 5(c), respectively. These detailed estimates are entirely consistent with the broad understanding that COVID-19 affects the elderly much more severely than the young.

Duration of PCR test sensitivity

Estimated residency times in the PCR test-sensitive T compartment are shown in Fig. 5(d). As both I and A compartments are also detectable by PCR test, adding a mean value from this plot of 11 days to m_E=m_A=4 days leads to an estimated overall test-sensitive period of around 15 days. This is in good agreement with [32], who estimated the time from symptoms to two negative PCR tests at 13 days, which when added to an estimated 2 days presymptomatic period [32] gives exactly our figure. It might be expected the test-sensitive period would increase with age (Fig. 5(d)), but importantly T represents non-hospitalised individuals with significantly milder symptoms than hospitalised cases.

Discussion

ABC-MBP has been shown to be reliable and efficient at producing informative posterior distributions of many key parameters of a complex epidemiological COVID-19 model that integrates age-stratified transitions and contact rates. Here we discuss the impact of key assumptions on the robustness of the estimates obtained.

Compartmental residency times in the model are based on evidence in the literature (see Appendix N), but a variety of alternative estimates exist. To understand the impact of such uncertainty consider the three key interrelated quantities that determine epidemiological dynamics: the reproduction number, the generation time²³, and the infection rate. As infection rate is fully determined by observed data, a shift in the generation time (brought about by making alterations to any of the assumed residency times) must be accompanied by a corresponding rescaling of the reproduction number around one to maintain the observed infection rate. Appendix N investigates this and concludes the results are robust to inaccuracies in fixed residency times in the compartmental model, excepting this rescaling in the profile for R_t. Results are also insensitive (Appendix N) to rescaling of the estimated external force of infection (Appendix D). Further support for our inferences comes from detailed analysis of the raw age-stratified data, which exhibits similar estimates and trends with age obtained for the various quantities reported here (Appendix P).

Our results suggest contact patterns during the COVID-19 outbreak in England are significantly different to pre-pandemic patterns. Although plausible and consistent with other results (see e.g. [5]), alternative explanations may also be consistent with the available data. As discussed above, there is the possibility for age-dependent residency times and/or infectiousness, but support for these hypotheses in the literature seems limited for COVID-19 [21]. We therefore now consider age-dependent susceptibility σ_a (whilst fixing the contact matrix to C⁰), for which the force of infection from Eq.(1) becomes: Estimates under this model are shown in Fig. 3(b), and, perhaps unsurprisingly, they exhibit a similar profile to the age contact factors in Fig. 3(a) (note, Appendix Q shows that other inferred parameter values undergo almost no change, which is further evidence of the robustness of these estimates).

So which model is correct, one which implies age-variation in susceptibility, or age-variation in expected contact rate, or perhaps both? One clue can be provided by [22], which suggests that susceptibility in individuals under 20 years of age is actually less than in older individuals (although this study is based on case data collected relatively early on in the pandemic, and is perhaps open to bias). This is contrary to what we find. Indeed, if contact patterns were to be estimated under this assumption, even larger increases in relative contact rates for younger age groups would be found, suggesting our estimates with Eq.(1) are actually conservative. Thus, we suggest observed increases in force of infection from younger age groups are more plausibly explained by changes in contact pattern. Nonetheless, there may be age-dependent effects in contact rate, susceptibility, infectivity and duration of disease states, but the public data considered here is not sufficient to estimate these simultaneously. Importantly, however, the model based on Eq.(1) is more accurate than one not containing such age contact or susceptibility factors, and so provides an important improvement for more accurate prediction, and highlights important avenues for further investigations regarding these confounding effects.

Figure 6 compares national-level age-aggregated data with simulations based on inferred parameters. We find relatively good agreement with simulation averages, but with large stochastic variability between runs. It is important to appreciate, however, that these results are contingent on historic lockdowns and restriction measures encoded in R_t. In reality extremes of the distribution of modelled outcomes are unlikely to be realised as government interventions and human behaviour are dynamically linked to epidemic severity and would likely reduce in the case of rapidly increasing case numbers.

Clearly the compartmental model in Fig. 1 does not capture all the features of the data, e.g. when examining posterior distributions for the state (in the posterior Supplementary Results) we see clear deviations between the model and the profiles for some of the data curves²⁴. These discrepancies may be due to a number of factors not considered here, including observed spatial heterogeneity in the outbreak, time variation in the contact matrix (e.g. caused by opening and closing schools), the introduction of vaccination, and emergence of different COVID-19 variants. In principle these various effects could be studied using the methodology presented in this paper (no doubt requiring significantly more computation and in some cases additional data), and these remain the subject of future work.

Conclusions

This paper has demonstrated the feasibility of fitting complex compartmental stochastic models containing a large number of parameters (over 150) to a large dataset (almost 9000 individual observations). This was achieved using a new ABC-MBP inference algorithm that is highly parallelisable and generic. Application to publicly available COVID-19 data demonstrated a good model fit based on the inferred parameter values, and yielded several new insights. Firstly, an indication that younger age groups come into contact more frequently than is suggested by a commonly used contact matrix taken from the pre-pandemic literature. Secondly, estimates for age variation in the probability individuals become asymptomatic revealed more asymptomatic young. Thirdly, estimation of the time-varying reproduction number which could inform intervention effectiveness and impacts of COVID-19 variants and vaccination. Finally, a consistently parameterised age-structured model is presented and available to support data-driven public health policy via simulation of near-term projections and of future scenarios.

Data Availability

All data produced in the present work are contained in the manuscript

Additional Information

Data Accessibility

The datasets supporting this article are available in GitHub here.

Details on how to run the analyses reported are provided in the supplementary materials (Appendix M) and the code is available on GitHub here.

Authors’ Contributions

CMP originated the idea for ABC-MBP and performed the cleaning of the raw data, the analysis, and drafted the manuscript. GM helped with epidemiological modelling, the inference algorithm and making improvements to the text. ADW helped with modelling and revising the text. All authors read and approved the manuscript.

Competing Interests

The authors declare that they have no competing interests.

Funding Statement

CMP and GM were funded by Scottish Government’s RESAS (Rural and Environment Science and Analytical Services Division) and EPIC (Epidemiology, Population health and Infectious disease Control). ADW’s contribution was also funded by the BBSRC Institute Strategic Programme Grants (BBS/E/D/20002172-4 (ISP2). We are also grateful for funding from the Data-Driven Innovation (DDI) SFC Beacon Programme.

Acknowledgments

We would like to acknowledge Helen Brown, Stella Mazeri, and Margo Chase-Topping for many useful discussions regarding COVID-19 modelling and data acquisition. Also research software engineers Ian Hinder and Robin Williams for their contributions in developing BEEPmbp (the software used to perform the analysis). Also we express our gratitude to DiRAC (Distributed Research utilising Advanced Computing) for allowing us to use their HPC facilities for free during the pandemic.

Footnotes

↵¹ Such over-dispersion allows the particle number to be reduced substantially. This study used 96 particles, but many more would be required with a realistic observation model.
↵² For brevity, we subsequently refer to these as “age groups”, even though one group actually relates specifically to care homes.
↵³ The I compartment includes both pre-symptomatic and symptomatic classifications of disease progression.
↵⁴ This matrix is asymmetric because populations in different age groups are not the same (notably the care home group has a significantly smaller population).
↵⁵ Such a constraint is necessary, otherwise rescaling v becomes confounded with rescaling r_t in Eq.(1).
↵⁶ Analysis of spatial data has shown this to be more or less true (results not show).
↵⁷ Disentangling these two effects is not possible from the data.
↵⁸ With age-structured populations set to be representative of England using ONS data.
↵⁹ For example, taking the difference between Y_i and y_i and squaring would result in something akin to a least-squares approach.
↵¹⁰ Its value is set to 2% of the highest peak from the time series g_i to which the measurement belongs. This was a simple way to suppress the effect of noise in measurements with low values.
↵¹¹ This reflects the fact that daily transitions, like case and death data, are expected to be Poisson distributed, and so larger values are expected to be less susceptible to stochastic variation than smaller values.
↵¹² Some time series have a daily time step whereas some are weekly. w_i helps to adjust for this discrepancy.
↵¹³ Sensitivity analysis is performed later to identify what affect changing these parameters has on the results.
↵¹⁴ P must be set to an even number.
↵¹⁵ In practice this is achieved by ordering the error function values for each particle, and setting the cut-off to the average of the middle two values on the list.
↵¹⁶ This is a property whereby a particle will explore the potential parameter and state space in a relatively small number of updates. So called “poor mixing” results from correlations in the posterior requiring MCMC to take many updates to generate a effectively new random sample from this space.
↵¹⁷ In stochastic models, simulation with a fixed parameter set can lead to a huge variation in the error function. On the other hand, MBPs ensure that a small change in parameter space is always accompanied by a small shift in the error function. Consequently MBPs can be effectively tuned to keep the system state error function within the ever decreasing cut-off limit.
↵¹⁸ Not all runs were found to converge on precisely the same solution, indicating some level of multimodality in the posterior (they were, however, all close, essentially exhibiting only small deviations in R₀).
↵¹⁹ Generally speaking they fit well with some notable exceptions discussed below.
↵²⁰ The CH column in Fig. 2(b) has a darker red shades than other columns.
↵²¹ accounts for depletion in the susceptible population as a result of infection-induced immunity.
↵²² With information coming solely from the 80+ age group through the CIS data.
↵²³ Defined as the interval between infection time in an individual and when that person transmits to another individual.
↵²⁴ Note, these cannot simply be the result of stochastic fluctuations, as that is already accounted for.

References

[1].↵
Heesterbeek, H., Anderson, R. M., Andreasen, V., Bansal, S., De Angelis, D., Dye, C., Eames, K. T., Edmunds, W. J., Frost, S. D. & Funk, S. 2015 Modeling infectious disease dynamics in the complex landscape of global health. Science 347. (DOI:10.1126/science.aaa4339).
OpenUrl Abstract/FREE Full Text
[2].↵
Brooks-Pollock, E., Danon, L., Jombart, T. & Pellis, L. 2021 Modelling that shaped the early COVID-19 pandemic response in the UK. (DOI:10.1098/rstb.2021.0001).
OpenUrl CrossRef
[3].↵
Ackley, S. F., Lessler, J. & Glymour, M. M. 2021 Dynamical Modeling as a Tool for Inferring Causation. American Journal of Epidemiology. (DOI:10.1093/aje/kwab222).
OpenUrl CrossRef
[4].↵
De Angelis, D., Presanis, A. M., Birrell, P. J., Tomba, G. S. & House, T. 2015 Four key challenges in infectious disease modelling using data from multiple sources. Epidemics 10, 83–87. (DOI:10.1016/j.epidem.2014.09.004).
OpenUrl CrossRef
[5].↵
Lau, M. S., Grenfell, B., Thomas, M., Bryan, M., Nelson, K. & Lopman, B. 2020 Characterizing superspreading events and age-specific infectiousness of SARS-CoV-2 transmission in Georgia, USA. Proceedings of the National Academy of Sciences 117, 22430–22435. (DOI:10.1073/pnas.2011802117).
OpenUrl Abstract/FREE Full Text
[6].↵
Sunnåker, M., Busetto, A. G., Numminen, E., Corander, J., Foll, M. & Dessimoz, C. 2013 Approximate bayesian computation. PLoS computational biology 9, e1002803. (DOI:10.1371/journal.pcbi.1002803).
OpenUrl CrossRef PubMed
[7].↵
Sisson, S. A., Fan, Y. & Tanaka, M. M. 2007 Sequential monte carlo without likelihoods. Proceedings of the National Academy of Sciences 104, 1760–1765. (DOI:10.1073/pnas.0607208104).
OpenUrl Abstract/FREE Full Text
[8].↵
Andrieu, C., Doucet, A. & Holenstein, R. 2010 Particle markov chain monte carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72, 269–342.
OpenUrl CrossRef Web of Science
[9].↵
Pooley, C., Bishop, S. & Marion, G. 2015 Using model-based proposals for fast parameter inference on discrete state space, continuous-time Markov processes. Journal of The Royal Society Interface 12, 20150225. (DOI:10.1098/rsif.2015.0225).
OpenUrl CrossRef PubMed
[10].↵
Knock, E., Whittles, L., Lees, J., Perez Guzman, P., Verity, R., Fitzjohn, R., Gaythorpe, K., Imai, N., Hinsley, W. & Okell, L. 2020 Report 41: The 2020 SARS-CoV-2 epidemic in England: key epidemiological drivers and impact of interventions. (DOI:10.25561/85146).
OpenUrl CrossRef
[11].↵
Wood, S. N. & Wit, E. C. 2021 Was R< 1 before the English lockdowns? On modelling mechanistic detail, causality and inference about Covid-19. medRxiv. (DOI:10.1101/2021.02.03.21251112).
OpenUrl Abstract/FREE Full Text
[12].↵
Mossong, J., Hens, N., Jit, M., Beutels, P., Auranen, K., Mikolajczyk, R., Massari, M., Salmaso, S., Tomba, G. S. & Wallinga, J. 2008 Social contacts and mixing patterns relevant to the spread of infectious diseases. PLoS medicine 5, e74. (DOI:10.1371/journal.pmed.0050074).
OpenUrl CrossRef PubMed
[13].↵
Panovska-Griffiths, J., Kerr, C. C., Stuart, R. M., Mistry, D., Klein, D. J., Viner, R. M. & Bonell, C. 2020 Determining the optimal strategy for reopening schools, the impact of test and trace interventions, and the risk of occurrence of a second COVID-19 epidemic wave in the UK: a modelling study. The Lancet Child & Adolescent Health 4, 817–827. (DOI:10.1016/S2352-4642(20)30250-9).
OpenUrl CrossRef PubMed
[14].↵
Ferguson, N., Laydon, D., Nedjati Gilani, G., Imai, N., Ainslie, K., Baguelin, M., Bhatia, S., Boonyasiri, A., Cucunuba Perez, Z. & Cuomo-Dannenburg, G. 2020 Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. (DOI:10.25561/77482).
OpenUrl CrossRef
[15].↵
Salathé, M., Kazandjieva, M., Lee, J. W., Levis, P., Feldman, M. W. & Jones, J. H. 2010 A high-resolution human contact network for infectious disease transmission. Proceedings of the National Academy of Sciences 107, 22020–22025. (DOI:10.1073/pnas.1009094108).
OpenUrl Abstract/FREE Full Text
[16].↵
Mastrandrea, R., Fournet, J. & Barrat, A. 2015 Contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys. PloS one 10, e0136497. (DOI:10.1371/journal.pone.0136497).
OpenUrl CrossRef
[17].↵
Edmunds, W. J., O’callaghan, C. & Nokes, D. 1997 Who mixes with whom? A method to determine the contact patterns of adults that may lead to the spread of airborne infections. Proceedings of the Royal Society of London. Series B: Biological Sciences 264, 949–957. (DOI:10.1098/rspb.1997.0131).
OpenUrl CrossRef PubMed Web of Science
[18].↵
Prem, K., Cook, A. R. & Jit, M. 2017 Projecting social contact matrices in 152 countries using contact surveys and demographic data. PLoS computational biology 13, e1005697. (DOI:10.1371/journal.pcbi.1005697).
OpenUrl CrossRef
[19].↵
Klepac, P., Kucharski, A. J., Conlan, A. J., Kissler, S., Tang, M. L., Fry, H. & Gog, J. R. 2020 Contacts in context: large-scale setting-specific social mixing matrices from the BBC Pandemic project. MedRxiv. (DOI:10.1101/2020.02.16.20023754).
OpenUrl Abstract/FREE Full Text
[20].↵
Jaouimaa, F.-Z., Dempsey, D., van Osch, S., Kinsella, S., Burke, K., Wyse, J. & Sweeney, J. 2021 An age-structured SEIR model for COVID--19 incidence in Dublin, Ireland with framework for evaluating health intervention cost. arXiv preprint arxiv:2106.06377.
[21].↵
Jones, T. C., Biele, G., Mühlemann, B., Veith, T., Schneider, J., Beheim-Schwarzbach, J., Bleicker, T., Tesch, J., Schmidt, M. L. & Sander, L. E. 2021 Estimating infectiousness throughout SARS-CoV-2 infection course. Science. (DOI:10.1126/science.abi5273).
OpenUrl Abstract/FREE Full Text
[22].↵
Davies, N. G., Klepac, P., Liu, Y., Prem, K., Jit, M. & Eggo, R. M. 2020 Age-dependent effects in the transmission and control of COVID-19 epidemics. Nature medicine 26, 1205–1211. (DOI:10.1038/s41591-020-0962-9).
OpenUrl CrossRef PubMed
[23].↵
Diekmann, O., Heesterbeek, J. A. P. & Metz, J. A. 1990 On the definition and the computation of the basic reproduction ratio R 0 in models for infectious diseases in heterogeneous populations. Journal of mathematical biology 28, 365–382. (DOI:10.1007/BF00178324).
OpenUrl CrossRef PubMed Web of Science
[24].↵
2021 Airport data. Civil Aviation Authority, https://www.caa.co.uk/Data-and-analysis/UK-aviation-market/Airports/Datasets/UK-Airport-data/Airport-data-2020-2001/.
[25].↵
2021 COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. Johns Hopkins University 2021, https://github.com/CSSEGISandData/COVID-19.
[26].↵
Gillespie, D. T. 2001 Approximate accelerated stochastic simulation of chemically reacting systems. The Journal of chemical physics 115, 1716–1733. (DOI:10.1063/1.1378322).
OpenUrl CrossRef Web of Science
[27].↵
Pooley, C., Bishop, S. C., Doeschl-Wilson, A. & Marion, G. 2019 Posterior-based proposals for speeding up Markov chain Monte Carlo. Royal Society open science 6, 190619. (DOI:10.1098/rsos.190619).
OpenUrl CrossRef
[28].↵
2021 Timeline of UK government coronavirus lockdowns. Institute for Government, https://www.instituteforgovernment.org.uk/charts/uk-government-coronavirus-lockdowns.
[29].↵
Mizumoto, K., Kagaya, K., Zarebski, A. & Chowell, G. 2020 Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020. Eurosurveillance 25, 2000180. (DOI:10.2807/1560-7917.ES.2020.25.10.2000180).
OpenUrl CrossRef PubMed
[30].
Day, M. 2020 Covid-19: identifying and isolating asymptomatic people helped eliminate virus in Italian village. BMJ: British Medical Journal (Online) 368. (DOI:10.1136/bmj.m1165).
OpenUrl FREE Full Text
[31].↵
Nishiura, H., Kobayashi, T., Miyama, T., Suzuki, A., Jung, S.-m., Hayashi, K., Kinoshita, R., Yang, Y., Yuan, B. & Akhmetzhanov, A. R. 2020 Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19). International journal of infectious diseases 94, 154. (DOI:10.1016/j.ijid.2020.03.020).
OpenUrl CrossRef PubMed
[32].↵
Byrne, A. W., McEvoy, D., Collins, A. B., Hunt, K., Casey, M., Barber, A., Butler, F., Griffin, J., Lane, E. A. & McAloon, C. 2020 Inferred duration of infectious period of SARS-CoV-2: rapid scoping review and analysis of available evidence for asymptomatic and symptomatic COVID-19 cases. BMJ open 10, e039856. (DOI:10.1136/bmjopen-2020-039856).
OpenUrl Abstract/FREE Full Text

View the discussion thread.

Posted January 13, 2022.

Download PDF

Supplementary Material

Data/Code

Citation Tools

Subject Area

Epidemiology

Subject Areas

All Articles

Addiction Medicine (316)
Allergy and Immunology (621)
Anesthesia (162)
Cardiovascular Medicine (2296)
Dentistry and Oral Medicine (280)
Dermatology (202)
Emergency Medicine (371)
Endocrinology (including Diabetes Mellitus and Metabolic Disease) (817)
Epidemiology (11621)
Forensic Medicine (10)
Gastroenterology (683)
Genetic and Genomic Medicine (3625)
Geriatric Medicine (340)
Health Economics (622)
Health Informatics (2330)
Health Policy (918)
Health Systems and Quality Improvement (871)
Hematology (336)
HIV/AIDS (758)
Infectious Diseases (except HIV/AIDS) (13201)
Intensive Care and Critical Care Medicine (760)
Medical Education (361)
Medical Ethics (101)
Nephrology (393)
Neurology (3389)
Nursing (193)
Nutrition (512)
Obstetrics and Gynecology (653)
Occupational and Environmental Health (654)
Oncology (1776)
Ophthalmology (526)
Orthopedics (211)
Otolaryngology (284)
Pain Medicine (226)
Palliative Medicine (66)
Pathology (441)
Pediatrics (1012)
Pharmacology and Therapeutics (423)
Primary Care Research (409)
Psychiatry and Clinical Psychology (3102)
Public and Global Health (6020)
Radiology and Imaging (1238)
Rehabilitation Medicine and Physical Therapy (719)
Respiratory Medicine (814)
Rheumatology (370)
Sexual and Reproductive Health (359)
Sports Medicine (319)
Surgery (390)
Toxicology (50)
Transplantation (171)
Urology (143)

[1] [1].↵
Heesterbeek, H., Anderson, R. M., Andreasen, V., Bansal, S., De Angelis, D., Dye, C., Eames, K. T., Edmunds, W. J., Frost, S. D. & Funk, S. 2015 Modeling infectious disease dynamics in the complex landscape of global health. Science 347. (DOI:10.1126/science.aaa4339).
OpenUrl Abstract/FREE Full Text

[2] [2].↵
Brooks-Pollock, E., Danon, L., Jombart, T. & Pellis, L. 2021 Modelling that shaped the early COVID-19 pandemic response in the UK. (DOI:10.1098/rstb.2021.0001).
OpenUrl CrossRef

[3] [3].↵
Ackley, S. F., Lessler, J. & Glymour, M. M. 2021 Dynamical Modeling as a Tool for Inferring Causation. American Journal of Epidemiology. (DOI:10.1093/aje/kwab222).
OpenUrl CrossRef

[4] [4].↵
De Angelis, D., Presanis, A. M., Birrell, P. J., Tomba, G. S. & House, T. 2015 Four key challenges in infectious disease modelling using data from multiple sources. Epidemics 10, 83–87. (DOI:10.1016/j.epidem.2014.09.004).
OpenUrl CrossRef

[5] [5].↵
Lau, M. S., Grenfell, B., Thomas, M., Bryan, M., Nelson, K. & Lopman, B. 2020 Characterizing superspreading events and age-specific infectiousness of SARS-CoV-2 transmission in Georgia, USA. Proceedings of the National Academy of Sciences 117, 22430–22435. (DOI:10.1073/pnas.2011802117).
OpenUrl Abstract/FREE Full Text

[6] [6].↵
Sunnåker, M., Busetto, A. G., Numminen, E., Corander, J., Foll, M. & Dessimoz, C. 2013 Approximate bayesian computation. PLoS computational biology 9, e1002803. (DOI:10.1371/journal.pcbi.1002803).
OpenUrl CrossRef PubMed

[7] [7].↵
Sisson, S. A., Fan, Y. & Tanaka, M. M. 2007 Sequential monte carlo without likelihoods. Proceedings of the National Academy of Sciences 104, 1760–1765. (DOI:10.1073/pnas.0607208104).
OpenUrl Abstract/FREE Full Text

[8] [8].↵
Andrieu, C., Doucet, A. & Holenstein, R. 2010 Particle markov chain monte carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72, 269–342.
OpenUrl CrossRef Web of Science

[9] [9].↵
Pooley, C., Bishop, S. & Marion, G. 2015 Using model-based proposals for fast parameter inference on discrete state space, continuous-time Markov processes. Journal of The Royal Society Interface 12, 20150225. (DOI:10.1098/rsif.2015.0225).
OpenUrl CrossRef PubMed

[10] [10].↵
Knock, E., Whittles, L., Lees, J., Perez Guzman, P., Verity, R., Fitzjohn, R., Gaythorpe, K., Imai, N., Hinsley, W. & Okell, L. 2020 Report 41: The 2020 SARS-CoV-2 epidemic in England: key epidemiological drivers and impact of interventions. (DOI:10.25561/85146).
OpenUrl CrossRef

[11] [11].↵
Wood, S. N. & Wit, E. C. 2021 Was R< 1 before the English lockdowns? On modelling mechanistic detail, causality and inference about Covid-19. medRxiv. (DOI:10.1101/2021.02.03.21251112).
OpenUrl Abstract/FREE Full Text

[12] [12].↵
Mossong, J., Hens, N., Jit, M., Beutels, P., Auranen, K., Mikolajczyk, R., Massari, M., Salmaso, S., Tomba, G. S. & Wallinga, J. 2008 Social contacts and mixing patterns relevant to the spread of infectious diseases. PLoS medicine 5, e74. (DOI:10.1371/journal.pmed.0050074).
OpenUrl CrossRef PubMed

[13] [13].↵
Panovska-Griffiths, J., Kerr, C. C., Stuart, R. M., Mistry, D., Klein, D. J., Viner, R. M. & Bonell, C. 2020 Determining the optimal strategy for reopening schools, the impact of test and trace interventions, and the risk of occurrence of a second COVID-19 epidemic wave in the UK: a modelling study. The Lancet Child & Adolescent Health 4, 817–827. (DOI:10.1016/S2352-4642(20)30250-9).
OpenUrl CrossRef PubMed

[14] [14].↵
Ferguson, N., Laydon, D., Nedjati Gilani, G., Imai, N., Ainslie, K., Baguelin, M., Bhatia, S., Boonyasiri, A., Cucunuba Perez, Z. & Cuomo-Dannenburg, G. 2020 Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. (DOI:10.25561/77482).
OpenUrl CrossRef

[15] [15].↵
Salathé, M., Kazandjieva, M., Lee, J. W., Levis, P., Feldman, M. W. & Jones, J. H. 2010 A high-resolution human contact network for infectious disease transmission. Proceedings of the National Academy of Sciences 107, 22020–22025. (DOI:10.1073/pnas.1009094108).
OpenUrl Abstract/FREE Full Text

[16] [16].↵
Mastrandrea, R., Fournet, J. & Barrat, A. 2015 Contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys. PloS one 10, e0136497. (DOI:10.1371/journal.pone.0136497).
OpenUrl CrossRef

[17] [17].↵
Edmunds, W. J., O’callaghan, C. & Nokes, D. 1997 Who mixes with whom? A method to determine the contact patterns of adults that may lead to the spread of airborne infections. Proceedings of the Royal Society of London. Series B: Biological Sciences 264, 949–957. (DOI:10.1098/rspb.1997.0131).
OpenUrl CrossRef PubMed Web of Science

[18] [18].↵
Prem, K., Cook, A. R. & Jit, M. 2017 Projecting social contact matrices in 152 countries using contact surveys and demographic data. PLoS computational biology 13, e1005697. (DOI:10.1371/journal.pcbi.1005697).
OpenUrl CrossRef

[19] [19].↵
Klepac, P., Kucharski, A. J., Conlan, A. J., Kissler, S., Tang, M. L., Fry, H. & Gog, J. R. 2020 Contacts in context: large-scale setting-specific social mixing matrices from the BBC Pandemic project. MedRxiv. (DOI:10.1101/2020.02.16.20023754).
OpenUrl Abstract/FREE Full Text

[20] [20].↵
Jaouimaa, F.-Z., Dempsey, D., van Osch, S., Kinsella, S., Burke, K., Wyse, J. & Sweeney, J. 2021 An age-structured SEIR model for COVID--19 incidence in Dublin, Ireland with framework for evaluating health intervention cost. arXiv preprint arxiv:2106.06377.

[21] [21].↵
Jones, T. C., Biele, G., Mühlemann, B., Veith, T., Schneider, J., Beheim-Schwarzbach, J., Bleicker, T., Tesch, J., Schmidt, M. L. & Sander, L. E. 2021 Estimating infectiousness throughout SARS-CoV-2 infection course. Science. (DOI:10.1126/science.abi5273).
OpenUrl Abstract/FREE Full Text

[22] [22].↵
Davies, N. G., Klepac, P., Liu, Y., Prem, K., Jit, M. & Eggo, R. M. 2020 Age-dependent effects in the transmission and control of COVID-19 epidemics. Nature medicine 26, 1205–1211. (DOI:10.1038/s41591-020-0962-9).
OpenUrl CrossRef PubMed

[23] [23].↵
Diekmann, O., Heesterbeek, J. A. P. & Metz, J. A. 1990 On the definition and the computation of the basic reproduction ratio R 0 in models for infectious diseases in heterogeneous populations. Journal of mathematical biology 28, 365–382. (DOI:10.1007/BF00178324).
OpenUrl CrossRef PubMed Web of Science

[24] [24].↵
2021 Airport data. Civil Aviation Authority, https://www.caa.co.uk/Data-and-analysis/UK-aviation-market/Airports/Datasets/UK-Airport-data/Airport-data-2020-2001/.

[25] [25].↵
2021 COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. Johns Hopkins University 2021, https://github.com/CSSEGISandData/COVID-19.

[26] [26].↵
Gillespie, D. T. 2001 Approximate accelerated stochastic simulation of chemically reacting systems. The Journal of chemical physics 115, 1716–1733. (DOI:10.1063/1.1378322).
OpenUrl CrossRef Web of Science

[27] [27].↵
Pooley, C., Bishop, S. C., Doeschl-Wilson, A. & Marion, G. 2019 Posterior-based proposals for speeding up Markov chain Monte Carlo. Royal Society open science 6, 190619. (DOI:10.1098/rsos.190619).
OpenUrl CrossRef

[28] [28].↵
2021 Timeline of UK government coronavirus lockdowns. Institute for Government, https://www.instituteforgovernment.org.uk/charts/uk-government-coronavirus-lockdowns.

[29] [29].↵
Mizumoto, K., Kagaya, K., Zarebski, A. & Chowell, G. 2020 Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020. Eurosurveillance 25, 2000180. (DOI:10.2807/1560-7917.ES.2020.25.10.2000180).
OpenUrl CrossRef PubMed

[30] [30].
Day, M. 2020 Covid-19: identifying and isolating asymptomatic people helped eliminate virus in Italian village. BMJ: British Medical Journal (Online) 368. (DOI:10.1136/bmj.m1165).
OpenUrl FREE Full Text

[31] [31].↵
Nishiura, H., Kobayashi, T., Miyama, T., Suzuki, A., Jung, S.-m., Hayashi, K., Kinoshita, R., Yang, Y., Yuan, B. & Akhmetzhanov, A. R. 2020 Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19). International journal of infectious diseases 94, 154. (DOI:10.1016/j.ijid.2020.03.020).
OpenUrl CrossRef PubMed

[32] [32].↵
Byrne, A. W., McEvoy, D., Collins, A. B., Hunt, K., Casey, M., Barber, A., Butler, F., Griffin, J., Lane, E. A. & McAloon, C. 2020 Inferred duration of infectious period of SARS-CoV-2: rapid scoping review and analysis of available evidence for asymptomatic and symptomatic COVID-19 cases. BMJ open 10, e039856. (DOI:10.1136/bmjopen-2020-039856).
OpenUrl Abstract/FREE Full Text