Abstract
We find that epidemic resurgence, defined as an upswing in the effective reproduction number (R) of the contagion from subcritical to supercritical values, is fundamentally difficult to detect in real time. Intrinsic latencies in pathogen transmission, coupled with the often smaller incidence across periods of subcritical spread, mean that resurgence cannot be reliably detected without significant delays, even if case reporting is perfect. This contrasts with epidemic suppression (where R falls from supercritical to subcritical values), which can be ascertained 5–10 times more rapidly. These innate limits on detecting resurgence only worsen when spatial or demographic heterogeneities are incorporated. Consequently, we argue that resurgence is more effectively handled proactively, at the expense of false alarms. Responses to recrudescent infections or emerging variants of concern will more likely be timely if informed by improved syndromic surveillance systems than by optimised mathematical models of epidemic spread.
Introduction
Real-time estimates of the transmissibility of an infectious disease [1,2] are crucial for informed outbreak responses. Timely detection of salient changes in the effective reproduction number (R) of the disease of interest, which measures the average number of secondary cases likely caused by a primary case, can provide important evidence for policymaking and public communication [3,4], as well as improve forecasts of disease burden [5] (e.g., hospitalisations and deaths). Two critical changes of interest are resurgence and control. Resurgence, which we define as an increase from subcritical (R ≤ 1) to supercritical (R > 1) transmissibility, can warn of imminent waves of infections, signify the emergence of pathogenic variants of concern and signal important shifts in the behavioural patterns of a population [6,7]. Alternatively, control (or suppression) describes a switch from supercritical to subcritical spread and can indicate the effectiveness of interventions and the impact of depleting susceptibility (including that due to vaccine-induced immunity) [8,9].
Identifying these transmissibility changes in real time, however, is an enduring challenge for statistical modelling and surveillance planning. Inferring a transition in R from stochastic time series of incident cases necessitates assumptions about the differences between meaningful variation (signal) and random fluctuations (noise) [10–12]. Modern approaches to epidemic modelling and monitoring aim to maximise this signal-to-noise ratio either by enhancing noise filtering and bias correction methods [13–15], or by amplifying signal fidelity through improving surveillance quality and diversity [16–18]. While both approaches have substantially advanced the field, there have been few attempts to explore what, if any, fundamental limits exist on the timely detection of these changes. Such limits can provide key benchmarks for assessing the effectiveness of modelling or data collection and deepen our understanding of what can and cannot be achieved by real-time outbreak response programmes, ensuring that model outputs are not overinterpreted and redirecting surveillance resources more efficiently [19–21].
While studies are examining intrinsic bounds on epidemic monitoring and forecasting [22–25], works on transmissibility have mostly probed how extrinsic surveillance biases might cause R misestimation [14,26–28]. Here we address these gaps in the literature by characterising and exposing fundamental limits to detecting resurgence and control from a perfectly ascertained incidence time series. This provides vital insights into the best real-time performance possible and blueprints for how outbreak preparedness might be improved. We analyse a predominant real-time epidemic model [1,2] and discover stark asymmetries in our innate ability to detect resurgence and control. While epidemic control or suppression change-points are inferred robustly and rapidly, inherent delays (5–10 times that for control) strongly inhibit real-time resurgence estimation from widely used incidence curves or data.
We show that these fundamental constraints on resurgence worsen with smaller epidemic size, steeper upswings in R and spatial or demographic heterogeneities. Given this bottleneck to timely outbreak analysis, which exists despite perfect case reporting and the use of optimal Bayesian detection algorithms [15,29], we argue that methodological improvements to existing models used to analyse epidemic curves (e.g., cases, hospitalisations or deaths) are less important than enhancing syndromic surveillance systems [30,31]. Such systems, which fuse multiple data sources (including novel ones, e.g., wastewater [32]) to triangulate possible resurgences, might minimise some of these fundamental limitations. We conclude that early responses to suspected resurging epidemics, at the expense of false alarms, might be justified in many settings, both from our analysis and a consensus that lags in implementing interventions can translate into severely elevated epidemic burden [33–36]. Using both theory and simulation, we explore and elucidate these conclusions in the next section.
Results
Epidemic resurgence is statistically more difficult to infer than control
We first provide intuition for why resurgence and control might present asymmetric difficulties when inferring transmissibility in real time. Consider an epidemic modelled using a renewal branching process [37] over times (usually in days) 1 ≤ s ≤ t. Such models have been widely applied to infer the transmissibility of numerous diseases including Ebola virus, COVID-19 and pandemic influenza. Renewal models postulate that the incidence of new cases at some time s, denoted Is, depends on the effective reproduction number, Rs, and the past incidence, as in Eq. (1) [2]. Here $I_a^b$ means the set {Ia, Ia+1, …, Ib} and ≡ indicates equality in distribution.

$$\mathbf{P}(I_s \mid I_1^{s-1}, R_s) \equiv \text{Pois}(R_s \Lambda_s), \qquad \Lambda_s = \sum_{u=1}^{s-1} w_u I_{s-u} \tag{1}$$
In Eq. (1), Pois represents Poisson noise and Λs is the total infectiousness, which summarises the weighted influence of past infections. The set of weights wu for all u defines the generation time distribution of the infectious disease with $\sum_u w_u = 1$ [38]. Applying Bayesian inference techniques (see [2,39] for derivations) under the assumption that transmissibility is constant over a past window of length m days, τ(s) = {s, s − 1, …, s − m + 1}, we can obtain a gamma (Gam) distributed posterior distribution for Rs as $\mathbf{P}(R_s \mid I_1^s) \equiv \text{Gam}(a + i_{\tau(s)},\, c + \lambda_{\tau(s)})$, with (a, c) as prior distribution (P(Rs)) parameters, iτ(s) = ∑u∈τ(s) Iu and λτ(s) = ∑u∈τ(s) Λu.
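The renewal model and window-based posterior described above can be sketched in a few lines. This is a minimal illustration, not the code used for the analyses: it assumes the standard renewal form Is ~ Pois(RsΛs) with Λs = ∑u wu Is−u, and it reads (a, c) as the shape and rate of the gamma prior. The generation time weights, prior values, seed incidence and R trajectory are all hypothetical.

```python
import math
import random

def poisson_draw(rng, mean):
    """Draw a Poisson variate by Knuth's method (adequate for the modest means used here)."""
    if mean <= 0:
        return 0
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def total_infectiousness(incidence, w, s):
    """Lambda_s = sum_{u=1}^{s-1} w_u I_{s-u}; incidence[0] is day 1 and w[0] is w_1."""
    return sum(w[u - 1] * incidence[s - 1 - u] for u in range(1, min(s - 1, len(w)) + 1))

def simulate(r_series, w, seed_cases=100, rng=None):
    """Simulate incidence I_s ~ Pois(R_s * Lambda_s) for days 2..t, given I_1 = seed_cases."""
    rng = rng or random.Random(0)
    incidence, lam = [seed_cases], [0.0]
    for s in range(2, len(r_series) + 1):
        lam_s = total_infectiousness(incidence, w, s)
        lam.append(lam_s)
        incidence.append(poisson_draw(rng, r_series[s - 1] * lam_s))
    return incidence, lam

def window_posterior(incidence, lam, s, m, a=1.0, c=0.2):
    """Gamma posterior (shape, rate) for R_s over the window tau(s) = {s, ..., s-m+1}.

    Assumption: (a, c) are the shape and rate of the gamma prior on R_s.
    """
    idx = range(max(s - m, 0), s)  # zero-based indices of days s-m+1..s
    shape = a + sum(incidence[u] for u in idx)
    rate = c + sum(lam[u] for u in idx)
    return shape, rate

# Hypothetical generation time weights (sum to 1) and a subcritical-to-supercritical R series.
w = [0.2, 0.5, 0.3]
r_true = [0.9] * 20 + [1.3] * 20
incidence, lam = simulate(r_true, w)
for s in (20, 40):
    shape, rate = window_posterior(incidence, lam, s, m=7)
    print(f"day {s}: posterior mean R = {shape / rate:.2f}, window infectiousness = {rate - 0.2:.1f}")
```

Longer windows m pool more incidence into iτ(s) and λτ(s), which tightens the posterior but slows its response to genuine changes in R.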
This posterior distribution only uses data up until time s and defines our real-time estimate of R at that time. We can analyse its properties (and the related likelihood function $\mathbf{P}(I_1^s \mid R_s)$) to obtain the Fisher information (FI) on the left side of Eq. (2). This FI (see [10,39] for derivation) captures how informative $I_1^s$ is for inferring Rs, with its inverse defining the smallest asymptotic variance of any Rs estimate [10,40]. Larger FI implies better statistical precision.

$$\text{FI}_{R_s} = \lambda_{\tau(s)} R_s^{-1}, \qquad \mathbf{P}(R_s > 1 \mid I_1^s) = \int_1^{\infty} \mathbf{P}(R_s \mid I_1^s)\, \mathrm{d}R_s \tag{2}$$
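The form of the FI can be sketched directly from the Poisson renewal likelihood. The steps below are a worked outline under stated assumptions, not the derivation of [10,39]: Rs is taken as constant over the window τ(s), the Λu are treated as known given past incidence, and the expectation uses E[Iu] = RsΛu.

```latex
\begin{align}
\log \mathbf{P}(I_{\tau(s)} \mid R_s) &= \sum_{u \in \tau(s)} \left[ I_u \log(R_s \Lambda_u) - R_s \Lambda_u - \log I_u! \right], \\
\frac{\partial^2}{\partial R_s^2} \log \mathbf{P}(I_{\tau(s)} \mid R_s) &= -\frac{i_{\tau(s)}}{R_s^2}, \\
\text{FI}_{R_s} = \mathbb{E}\left[ \frac{i_{\tau(s)}}{R_s^2} \right] &= \frac{R_s \lambda_{\tau(s)}}{R_s^2} = \frac{\lambda_{\tau(s)}}{R_s}.
\end{align}
```

The FI therefore grows with the epidemic size λτ(s) and shrinks as Rs increases, so its inverse (the best achievable asymptotic variance) is largest when incidence is low and transmissibility is rising.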
As resurgence will likely follow low incidence periods, we might expect λτ(s) to be small, while Rs rises. This effect will reduce the FI in Eq. (2), making these changes harder to detect. In contrast, the impact of interventions will be easier to infer since these are often applied when cases are larger and reduce Rs. We expand on this intuition, using the R posterior distribution to derive the real-time resurgence probability $\mathbf{P}(R_s > 1 \mid I_1^s)$, as on the right side of Eq. (2). We plot its implications in Figure 1, corroborating our intuition. In panel A we find that larger epidemic sizes improve our ability to detect shifts in transmissibility from fluctuations in incidence (the posterior distributions for Rs overlap less). Panel B bolsters this idea, showing that when λτ(s) is smaller (as is likely before resurgence) we need to observe larger relative epidemic size changes for some increase in $\mathbf{P}(R_s > 1 \mid I_1^s)$ than for an equivalent decrease when aiming to detect control (where λτ(s) would be larger).
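The dependence of the resurgence probability on epidemic size can be illustrated numerically. The sketch below assumes a gamma posterior with shape a + iτ(s) and rate c + λτ(s), restricts the shape to integer values so that the gamma tail can be written as a Poisson sum using only the standard library, and defines a hypothetical relative perturbation δ through iτ(s) = round((1 + δ)λτ(s)). The prior values and epidemic sizes are illustrative only.

```python
import math

def prob_resurgence(i_win, lam_win, a=1, c=0.2):
    """P(R_s > 1) under a Gam(shape=a+i, rate=c+lam) posterior with integer shape.

    For integer shape k, P(Gamma(k, rate=b) > 1) = P(Poisson(b) <= k - 1).
    """
    k = a + i_win
    b = c + lam_win
    log_terms = [n * math.log(b) - b - math.lgamma(n + 1) for n in range(k)]
    top = max(log_terms)
    return math.exp(top) * sum(math.exp(t - top) for t in log_terms)

# Same relative perturbations at a small and a large epidemic size (hypothetical values).
for lam_win in (20, 200):
    for delta in (-0.2, 0.0, 0.2, 0.4):
        i_win = round((1 + delta) * lam_win)
        p = prob_resurgence(i_win, lam_win)
        print(f"lambda = {lam_win:3d}, delta = {delta:+.1f}: P(R > 1) = {p:.3f}")
```

At the larger epidemic size the probability moves from near 0 to near 1 over a narrower range of δ, which is the steeper gradient described for panel B.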
Figure 1. Panel A plots posterior real-time distributions for the effective reproduction numbers, Rs, at different relative incidence perturbations (increasing from blue to red). The degree of separation, and hence our ability to uncover meaningful incidence fluctuations from noise, improves with the current epidemic size, λτ(s). Panel B shows how this sensitivity modulates our capacity to infer resurgence ($\mathbf{P}(R_s > 1 \mid I_1^s)$) and control ($\mathbf{P}(R_s \leq 1 \mid I_1^s)$). If epidemic size is smaller, larger relative incidence changes are needed to detect changes in Rs (curves have gentler gradient). Resurgence (likely closer to the blue line, top right quadrant) is appreciably and innately harder to detect than control (likely closer to the red line, bottom left quadrant).
Fundamental delays on detecting resurgence but not control
The intrinsic asymmetry in sensitivity to upward versus downward shifts in R (see Figure 1) implies that it is not equally simple to infer resurgence and control from incident cases. We investigate ramifications of this observation by comparing our real-time Rs-estimates to ones exploiting all the future incidence information available. We analyse two foundational posterior distributions, the filtered, ps, and smoothed, qs, distributions, defined in Eq. (3). Here ps considers information until time s and captures changes in Rs from $I_1^s$ in real time. In contrast, qs extracts all the information from the full incidence curve $I_1^t$, providing the best possible (in mean squared error) Rs-estimate [29]. The differential between ps and qs, summarised via the Kullback-Leibler divergence, D(ps|qs), measures the value of this future information.

$$p_s = \mathbf{P}(R_s \mid I_1^s), \qquad q_s = \mathbf{P}(R_s \mid I_1^t), \qquad D(p_s|q_s) = \int_0^{\infty} p_s \log \frac{p_s}{q_s}\, \mathrm{d}R_s \tag{3}$$
Bayesian filtering and smoothing are central formalisms across engineering, where real-time inference and detection problems are common [29,41]. We compute formulae from Eq. (3) via the EpiFilter package [15,28], which employs optimal forward-backward algorithms. This method improves on the window-based (τ(s)) formulation of the last section and maximises the signal-to-noise ratio in estimation. We also obtain filtered and smoothed probabilities of resurgence as $\mathbf{P}(R_s > 1 \mid I_1^s) = \int_1^{\infty} p_s\, \mathrm{d}R_s$ and $\mathbf{P}(R_s > 1 \mid I_1^t) = \int_1^{\infty} q_s\, \mathrm{d}R_s$. The probability that the epidemic is controlled (i.e., R ≤ 1) is the complement of these expressions. Our main results, which average the above quantities over many simulated Ebola virus and COVID-19 epidemics, are given in Figure 2 and Figure 5 (appendix), respectively. We uncover striking differences in the intrinsic ability to infer resurgence versus control in real time.
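To make the distinction between filtered and smoothed estimates concrete, the following is a minimal grid-based forward-backward sketch. It is not the EpiFilter implementation: it assumes a Gaussian random-walk state model for Rs with a fixed, hypothetical standard deviation, a uniform prior over a hypothetical R grid, and a deterministic toy incidence curve (expected renewal counts rounded to integers) in which R steps from 0.8 to 1.5.

```python
import math

def make_grid(r_min=0.05, r_max=4.0, n=80):
    step = (r_max - r_min) / (n - 1)
    return [r_min + k * step for k in range(n)]

def transition_matrix(grid, sd=0.1):
    """Row-normalised Gaussian random-walk kernel T[i][j] = P(R_s = grid[j] | R_{s-1} = grid[i])."""
    rows = []
    for ri in grid:
        row = [math.exp(-0.5 * ((rj - ri) / sd) ** 2) for rj in grid]
        total = sum(row)
        rows.append([x / total for x in row])
    return rows

def normalise(v):
    total = sum(v)
    return [x / total for x in v]

def filter_smooth(incidence, lam, grid, T):
    """Return filtered p_s and smoothed q_s distributions over the R grid for every time s."""
    n, t = len(grid), len(incidence)
    filt, pred = [], []
    for s in range(t):
        # Prediction step: uniform prior at s = 0, otherwise propagate the last filtered estimate.
        pr = [1.0 / n] * n if s == 0 else [sum(filt[-1][i] * T[i][j] for i in range(n)) for j in range(n)]
        pred.append(pr)
        if lam[s] > 0:
            # Poisson log-likelihood of I_s given R_s = r (terms constant in r are dropped).
            loglik = [incidence[s] * math.log(r * lam[s]) - r * lam[s] for r in grid]
            logpost = [math.log(p) + l if p > 0 else -math.inf for p, l in zip(pr, loglik)]
            top = max(logpost)
            filt.append(normalise([math.exp(x - top) for x in logpost]))
        else:
            filt.append(pr)  # no infectiousness, so the observation carries no information
    smooth = [None] * t
    smooth[-1] = filt[-1]
    for s in range(t - 2, -1, -1):
        ratio = [smooth[s + 1][j] / pred[s + 1][j] if pred[s + 1][j] > 0 else 0.0 for j in range(n)]
        smooth[s] = normalise([filt[s][i] * sum(T[i][j] * ratio[j] for j in range(n)) for i in range(n)])
    return filt, smooth

def prob_above_one(dist, grid):
    return sum(p for p, r in zip(dist, grid) if r > 1.0)

def kl_divergence(p, q):
    """Discrete D(p|q); terms with q_i = 0 are dropped, which understates D if any occur."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0 and qi > 0)

def toy_epidemic(w=(0.2, 0.5, 0.3), r_true=(0.8,) * 20 + (1.5,) * 20, seed_cases=50):
    """Deterministic toy curve: incidence is the rounded expected renewal count (hypothetical values)."""
    incidence, lam = [seed_cases], [0.0]
    for s in range(1, len(r_true)):
        lam_s = sum(w[u] * incidence[s - 1 - u] for u in range(min(len(w), s)))
        lam.append(lam_s)
        incidence.append(round(r_true[s] * lam_s))
    return incidence, lam

grid = make_grid()
T = transition_matrix(grid)
incidence, lam = toy_epidemic()
filt, smooth = filter_smooth(incidence, lam, grid, T)
for s in (15, 20, 22, 25, 30):
    print(f"s = {s}: filtered P(R>1) = {prob_above_one(filt[s], grid):.2f}, "
          f"smoothed P(R>1) = {prob_above_one(smooth[s], grid):.2f}, "
          f"D(p|q) = {kl_divergence(filt[s], smooth[s]):.2f}")
```

The smoothed distribution at time s conditions on incidence after s, so it can register the upswing before the filtered distribution does; the divergence between the two is a rough measure of how much that future information is worth.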
Figure 2. Using renewal models with the generation time from [42], we simulate 1000 realisations of Ebola virus epidemics (t = 300) with step (A panels) and seasonally (B panels) changing transmissibility. Top panels plot mean estimates from the filtered (Ep[Rs], blue) and smoothed (Eq[Rs], red) distributions from every realisation (computed using EpiFilter [15]). Middle panels average the Kullback-Leibler divergences from those simulations and bottom panels present overall filtered ($\mathbf{P}(R_s > 1 \mid I_1^s)$, blue) and smoothed ($\mathbf{P}(R_s > 1 \mid I_1^t)$, red) resurgence probabilities. We find fundamental and striking delays in detecting resurgence, often an order of magnitude longer than those for detecting control or suppression in transmission (see lags between red and blue curves).
Upward change-points are significantly harder to detect both in terms of accuracy and timing. Discrepancies between ps- and qs-based estimates (the latter benchmark the best realisable performance) are appreciably larger for resurgence than control. While decreases in R can be pinpointed reliably, increases seem fundamentally more difficult to detect. These limits appear to worsen with the steepness of the R upswing. We confirm these trends with a detailed simulation study across five infectious diseases in Figure 3. There we alter the steepness, θ, of transmissibility changes and quantify delays in detecting resurgence and control as the difference between the first times at which ps- and qs-based probabilities cross 0.5 (Δt50) and 0.95 (Δt95), normalised by the mean generation time of the disease. We find that lags in detecting resurgence are at least 5–10 times longer than for detecting control.
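The delay statistics Δt50 and Δt95 can be sketched as first-crossing times of two probability series. The functions below are an illustration under assumptions: the logistic parameters, the toy filtered and smoothed probability curves and the mean generation time are hypothetical, and a crossing is taken as the first time a series reaches or exceeds the threshold.

```python
import math

def logistic_r(s, r_low=0.5, r_high=1.5, theta=0.1, s_mid=150):
    """Logistic transition in R_s from r_low to r_high with steepness theta, centred at s_mid."""
    return r_low + (r_high - r_low) / (1.0 + math.exp(-theta * (s - s_mid)))

def first_crossing(probs, threshold):
    """Index of the first time a probability series reaches the threshold, or None if it never does."""
    for s, p in enumerate(probs):
        if p >= threshold:
            return s
    return None

def normalised_delay(filtered, smoothed, threshold, mean_gen_time):
    """Delta t = (first filtered crossing - first smoothed crossing) / mean generation time."""
    tf = first_crossing(filtered, threshold)
    ts = first_crossing(smoothed, threshold)
    if tf is None or ts is None:
        return None
    return (tf - ts) / mean_gen_time

# Toy probability series (hypothetical): the filtered curve lags the smoothed one by 12 days.
days = range(300)
smoothed = [1.0 / (1.0 + math.exp(-0.3 * (s - 150))) for s in days]
filtered = [1.0 / (1.0 + math.exp(-0.3 * (s - 162))) for s in days]
mean_gen_time = 15.0  # hypothetical mean generation time in days
for threshold in (0.5, 0.95):
    delay = normalised_delay(filtered, smoothed, threshold, mean_gen_time)
    print(f"threshold {threshold}: normalised delay = {delay:.2f} generation times")
```

Normalising by the mean generation time allows delays to be compared across diseases with very different transmission timescales.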
Figure 3. We characterise the discrepancies between detecting resurgence and control against the steepness or rate, θ, of changes in transmissibility (Rs), which we model with logistic functions (panel A). We compare differences in the times at which the probability of detecting resurgence (P(Rs > 1)) or control (P(Rs ≤ 1)) under filtered and smoothed estimates (see main text) first crosses thresholds of 0.5 (Δt50) and 0.95 (Δt95) for five infectious diseases (panel B plots their assumed generation time distributions from [2,42,43]). We simulate 1000 epidemics from each disease using renewal models and estimate Rs with EpiFilter [15]. Panels C and D (colours match panel B, Δt is normalised by the mean generation times of the diseases) show that delays in detecting resurgence (dots) are at least 5–10 times longer than for indicating control (diamonds). Our ability to infer even symmetrical transmissibility changes is fundamentally asymmetric.
Fundamental delays worsen with spatial or demographic heterogeneities
In previous sections we demonstrated that sensitivity to changes in R is asymmetric, and that intrinsic, restrictive limits exist on detecting resurgence in real time, which do not equally inhibit detecting control. While those conclusions apply generally (e.g., across diseases), they do not consider the influence of spatial or demographic heterogeneity. We examine this complexity through a simple but realistic generalisation of the renewal model. Often R-estimates can be computed at small scales (e.g., at the municipality level) via local incidence or more coarsely (e.g., countrywide), using aggregated case counts [3,13]. We can relate these differing scales with the weighted mean in Eq. (4), where the overall (coarse) R at time s, $\bar{R}_s$, is a convex sum of finer-scale R contributions from each group (Rs[j] for the jth of p groups) weighted by the epidemic size of that group (as in Eq. (2) we use windows τ(s) for analytic insight).

$$\bar{R}_s = \sum_{j=1}^{p} \alpha_j R_s[j], \qquad \alpha_j = \frac{\lambda_{\tau(s)}[j]}{\sum_{i=1}^{p} \lambda_{\tau(s)}[i]} \tag{4}$$
Our choice of groupings is arbitrary and can equally model demographic heterogeneities (e.g., age-specific transmission), where we want to understand how dynamics within the subgroups influence overall spread [7]. Our aim is to ascertain how grouping, which often occurs naturally due to data constraints or a need to succinctly describe the infectious dynamics over a country to aid policymaking or public communication [44], affects resurgence detection. Eq. (4) implies that $\bar{R}_s \leq \max_j R_s[j]$. Since resurgence will likely first occur within some specific (possibly high-risk) group and then propagate to other groups [7], this expression suggests that an initial signal (e.g., if some Rs[j] > 1) could be masked by non-resurging groups (which are from this perspective contributing background noise).
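A short numerical example shows how this masking can arise. The sketch assumes the weighted mean takes the form of a convex sum with weights αj proportional to each group's epidemic size λτ(s)[j]; the group values are hypothetical.

```python
def overall_r(r_groups, lam_groups):
    """Overall R as a convex sum of group R values, weighted by group epidemic size."""
    total = sum(lam_groups)
    weights = [lam / total for lam in lam_groups]
    return sum(a * r for a, r in zip(weights, r_groups)), weights

# Hypothetical setting: group 1 is small and resurging, group 2 is large and controlled.
r_groups = [1.4, 0.7]
lam_groups = [20.0, 180.0]
r_bar, weights = overall_r(r_groups, lam_groups)
print(f"weights = {weights}, overall R = {r_bar:.2f}, max group R = {max(r_groups):.2f}")
```

Here the resurging group carries only a tenth of the weight, so the overall R stays below 1 even though one group is clearly supercritical.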
As the epidemic size in a resurging group will likely be smaller than those of groups with past epidemics that are now being stabilised or controlled, this exacerbates the sensitivity bounds explored earlier via Eq. (2). We can verify this further loss of sensitivity by examining how the overall posterior distribution depends on those of the component groups as follows, with ⊛ as a repeated convolution operation and Ωj as the posterior distribution of the weighted contribution αjRs[j] from the jth group.

$$\mathbf{P}(\bar{R}_s \mid I_1^s) = \circledast_{j=1}^{p}\, \Omega_j \tag{5}$$
While Eq. (5) holds generally, we assume gamma posterior distributions, leading to statistics analogous to Eq. (2). We plot these sensitivity results at p = 2 and 3 in Figure 4, where group 1 features resurgence and other groups either contain stable or falling incidence. We find that as p grows (and additional distributions convolve to generate the posterior distribution of $\bar{R}_s$) we lose sensitivity (posterior distributions overlap more for a given relative change in incidence). Reductions in either the weight (α1), epidemic size (λτ(s)[1]) or other Rs[j ≠ 1] further desensitise the resurgence signals, i.e., decrease the gradient of detection probability curves. This is summarised by noting that if Rs[1] = maxj Rs[j], then the sensitivity from Eq. (2) is only matched when the resurging group dominates (α1 ≈ 1) or if other groups have analogous R, i.e., Rs[1] ≈ Rs[j]. Delays in detecting resurgence can therefore be severe. Heterogeneity on its own, however, does not force asymmetry between detecting control and resurgence.
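The loss of sensitivity under grouping can be approximated by Monte Carlo sampling rather than explicit convolution. The sketch below is an assumption-laden illustration: each group is given an independent gamma posterior with shape a + i and rate c + λ (with (a, c) read as prior shape and rate), the overall R is formed as the epidemic-size weighted sum of group draws, and all incidence and infectiousness values are hypothetical.

```python
import random

def grouped_resurgence_prob(i_groups, lam_groups, a=1.0, c=0.2, n_samples=20000, seed=1):
    """Monte Carlo estimate of P(overall R > 1) from independent gamma group posteriors."""
    rng = random.Random(seed)
    total = sum(lam_groups)
    weights = [lam / total for lam in lam_groups]
    hits = 0
    for _ in range(n_samples):
        # random.gammavariate takes (shape, scale), so the scale is the inverse posterior rate.
        r_bar = sum(w * rng.gammavariate(a + i, 1.0 / (c + lam))
                    for w, i, lam in zip(weights, i_groups, lam_groups))
        hits += r_bar > 1.0
    return hits / n_samples

# Group 1 resurges (about 1.5 cases per unit infectiousness); group 2 is larger and declining.
alone = grouped_resurgence_prob([60], [40.0])
masked = grouped_resurgence_prob([60, 96], [40.0, 120.0])
print(f"P(R[1] > 1) from group 1 alone = {alone:.3f}")
print(f"P(overall R > 1) with a controlled second group = {masked:.3f}")
```

Sampling the weighted sum is equivalent to drawing from the convolution of the scaled group posteriors, which avoids computing the convolution in closed form.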
Figure 4. We investigate how differences in transmissibility among groups (e.g., due to demographic or spatial factors) fundamentally limit the ability to detect resurgence from a specific group (in this example group 1). Panel A shows that the grouped posterior distribution becomes less sensitive to relative changes in group 1 incidence (increasing from blue to red). Posterior distributions over $\bar{R}_s$ are more overlapped (and tighter in variance) as p increases, for fixed Rs[1] (top). Panel B plots how overall resurgence detection probability $\mathbf{P}(\bar{R}_s > 1)$ depends on the weight (α1, top, 0.05–1) and epidemic size (λτ(s)[1], middle, 20–80, p = 2) as well as changes in Rs[3] (bottom, 0.5–1.2, p = 3). Decreases in α1 (red to blue) or λτ(s)[1] mean other groups mask the resurging dynamics in group 1, reducing sensitivity (curves become less steep). In the latter case $\mathbf{P}(\bar{R}_s > 1)$ (red with solid line at median of λτ(s)[1] range) is always more conservative than P(Rs[1] > 1) (blue, solid median line). As Rs[3] falls (red to blue) the ability to detect resurgence also lags relative to that from observing group 1 (black).
Discussion
Probing the performance limits of noisy biological systems has yielded important insights into the real-time estimation and control of parameters in biochemistry and neuroscience [45–47]. Although models from these fields share dynamic similarities with those in epidemiology, there has been relatively little investigation of how real-time estimates of pathogen transmissibility, parametrised by R, might be fundamentally limited. This is surprising since R is among the key parameters considered in initiatives aiming to better systematise real-time epidemic response [48,49]. Here we explored what limits may exist on our ability to reliably detect or measure the change-points in R that signify resurgence and control. By using a combination of Bayesian sensitivity analyses and minimum mean squared error filtering and smoothing algorithms, we discovered striking asymmetries in innate detection sensitivities.
We found that, arguably, the most crucial transitions in epidemic transmissibility are the most inherently difficult to detect. Specifically, resurgence, signified by an increase in R from below to above 1, can at earliest be detected 5–10 times later than an equivalent decrease in R that indicates control (Figure 2, Figure 3 and Figure 5). As this lag could be of the order of the mean generation time of the disease under study, even when case reporting is perfect and optimised detection algorithms are applied, this represents a potentially sharp bottleneck to real-time responses for highly contagious diseases. Intuition for this result came from observing that sensitivity to R change-points will weaken (due to noise masking the signal) with declining epidemic sizes and increasing ‘true’ R, both of which likely occur in resurgent settings (Eq. (2) and Figure 1). Furthermore, these latencies and sensitivity issues would only worsen when considering heterogeneous groupings across geography or demography (Eq. (4), (5) and Figure 4).
Figure 5. We repeat the simulations from Figure 2 but for 1000 realisations of COVID-19 epidemics (t = 300) using generation times from [43]. Top panels plot posterior mean estimates from filtered (Ep[Rs], blue) and smoothed (Eq[Rs], red) distributions from every realisation (computed via EpiFilter [15]). Middle panels average Kullback-Leibler divergences from those simulations and bottom panels show overall filtered ($\mathbf{P}(R_s > 1 \mid I_1^s)$, blue) and smoothed ($\mathbf{P}(R_s > 1 \mid I_1^t)$, red) resurgence probabilities. We again find fundamental and appreciable latencies in detecting resurgence, often an order of magnitude longer than those for detecting epidemic control (compare red and blue curves).
Practical real-time analyses would almost surely involve such groupings or data aggregations [9,13], in addition to being hindered by reporting and other latencies (e.g., if notification times, hospitalisations or deaths are used as proxies for incidence) [14,50]. Consequently, we argue that while case data provide robust signals for pinpointing when an epidemic is under control (and possibly disentangling the impact of interventions), they are insufficient, on their own, to sharply resolve resurgence timepoints. This does not invalidate the importance of approaches that do seek to better characterise real-time R changes [1,2,13,28], but instead adds context on how such inferences should be interpreted when informing policy. Given intrinsic delays in inferring resurgence, which can be associated with critical epidemiological changes, such as the emergence of variants of concern or important shifts in population behaviours [6,7], there are grounds for conservative approaches that enact interventions swiftly at the expense of false alarms. This might support, for example, the ongoing COVID-19 policies of New Zealand and Australia [51], and adds impetus to recent studies showing how lags in implementing interventions can induce drastic costs [33–36].
Moreover, our analysis suggests that enhancing syndromic surveillance systems, which can comprehensively engage diverse data sources [30,31], may be more important than improving models for processing case data. Fusing multiple and sometimes novel data sources, such as wastewater or cross-sectional viral loads [18,32], may present the only truly realistic means of minimising the innate limits to resurgence detection that we have demonstrated. Approaches aimed at enhancing case-based inference generally correct reporting biases or propose more robust measures of transmissibility, such as time-varying growth rates [14,49,52]. However, as our study highlights limits that persist at the gold standard of perfect case reporting and, further, it is known that under such conditions growth rates and R are equally informative [53], these lines of investigation are unlikely to minimise the delays we have exposed.
There are two main limitations of our results. First, as we considered renewal model epidemic descriptions, which predominate in real-time R studies, our work necessarily neglects the often-complex contact networks that mediate infection spread [54]. However, other analyses using somewhat different approaches to ours (e.g., Hawkes processes [55]) show apparently similar sensitivity asymmetries and there is evidence that renewal models can be as accurate as network models for inferring R [56], while being easier to run and fit in real time. Second, we did not include any explicit economic modelling. While this is outside the scope of this work, it is important to recognise that resurgence detection threshold choices (i.e., how we decide which fluctuations in incidence are actionable) imply some judgment about the relative cost of true positives (timely resurgence detections) versus false alarms [12]. Incorporating explicit cost structures could mean that delays in detecting resurgence are acceptable. We consider this the next investigative step in our aim to probe the limits of real-time performance.
Data Availability
This work contains no data but code for reproducing all analyses and figures is freely available at https://github.com/kpzoo/resurgence-detection
Funding
KVP and CAD acknowledge funding from the MRC Centre for Global Infectious Disease Analysis (reference MR/R015600/1), which is jointly funded by the UK Medical Research Council (MRC) and the UK Foreign, Commonwealth & Development Office (FCDO) under the MRC/FCDO Concordat agreement, and is also part of the EDCTP2 programme supported by the European Union. CAD thanks the UK National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Emerging and Zoonotic Infections in partnership with Public Health England (PHE) for funding (grant HPRU200907). The funders had no role in study design, data collection and analysis, decision to publish, or manuscript preparation.
Appendix
We provide simulations in Figure 5 for COVID-19, showing that significant delays in detecting resurgence but not epidemic control persist. These support Figure 2 of the main text, which examined Ebola virus. While figures plot ensembles of mean estimates, single simulations (where estimated credible intervals reflect noise from the incidence of that simulation) also display this asymmetry, confirming that real-time resurgence detection is innately hard.