An improved method to estimate the effective reproduction number of the COVID-19 pandemic: lessons from its application in Greece

Introduction: Monitoring the time-varying effective reproduction number Rt is crucial for assessing the evolution of the COVID-19 pandemic. We present an improved method to estimate Rt and its application to routine surveillance data from Greece. Methods: Our method extends that of Cori et al (2013), adding Bayesian imputation of missing symptom onset dates, imputation of infection times using an external estimate of the incubation period, and an adjustment for reporting delay. To facilitate its use, we provide an R software package named "bayEStim". We applied the method to COVID-19 surveillance data from Greece, and examined the resulting Rt estimates in relation to control measures applied, in order to assess their effectiveness. We also associated Rt, as a measure of transmissibility, to population mobility as recorded in Google data and to ambient temperature. We used a serial interval between 4 and 7.5 days, and a median incubation period of 5.1 days. Results: In Greece Rt fell rapidly as the first control measures were introduced, dropping below 1 at least a week before a full lockdown came into effect. In mid-July Rt started increasing again, as increased mobility associated with tourism activity was observed. Each 10% of increase in relative mobility increased Rt by 8.1% (95% CrI 6.1-10.2%), whereas each unit celsius of temperature increase decreased Rt by 4.6% (95% CrI 5.4-13.7%). Conclusions: Mobility patterns significantly affect Rt. Most of the reduction in COVID-19 transmissibility in Greece occurred already before the lockdown, likely as a result of decreased population mobility. Lower viral transmissibility in summer does not appear sufficient to counterbalance the increased mobility due to tourism. Monitoring Rt is an essential component of COVID-19 surveillance, and it is crucial for correctly assessing the effect of control measures.


Introduction
The coronavirus disease 2019  pandemic originated in December 2019 in the city of Wuhan, China, spreading across the globe and causing millions of cases and hundreds of thousands of deaths within a few months [1,2]. As the world braces for likely further pandemic waves in late 2020, disease surveillance is crucial in order to monitor the situation and guide public health action. An essential component of COVID- 19 surveillance is estimation of the time-varying effective reproduction number Rt, defined as the average number of secondary cases at time t produced by an infected individual over his/her infectious period. Rt reflects the real-world transmissibility of a pathogen, and is affected by the mobility patterns, social mixing, control measures, population immunity and other factors that are prevalent at each point in time. Rt>1 indicates exponential growth, Rt=1 indicates sustained transmission and Rt<1 suggests exponential decay of an epidemic.
Various methods to estimate Rt have been described in the literature [3][4][5]; we present an improvement that addresses the inherent limitations of surveillance data, such as incompleteness and reporting delays, allowing for almost real-time estimation of epidemic trends. We apply our method in the case of the COVID-19 pandemic in Greece, and examine how transmissibility of the SARS-CoV-2 virus has varied over time and in relation to control measures, population mobility and ambient temperature.

Estimation of the effective reproduction number Rt
Our way of estimating Rt extends the methods by Cori et al [3], implemented in a Bayesian framework. In brief, Rt is estimated as the ratio of new locally-acquired infections at time t, to the sum of already infected individuals (local or imported) weighted by an infectivity function ws that is approximated by the serial interval distribution [3,6]. We parametrically model the serial interval with a Gamma distribution, with userprovided mean and standard deviation (SD); a range can be specified for both parameters, from which values are randomly drawn, to include additional uncertainty in the serial interval estimates. As in the original method, in order to reduce noise in the Rt estimates a rolling time window can be specified over which transmission is assumed constant; this is often set at 7 days.
As dates of symptom onset are often partially missing in real-world surveillance data, we use the subset of observed durations between symptom onset and case ascertainment, assumed to also follow a Gamma distribution, to perform Bayesian imputation of any unknown symptom onset dates. This includes asymptomatic cases, for whom we impute an onset date assuming similar duration between infection and case detection as symptomatic cases. In similar fashion, we use a Gamma distribution for the incubation time (again with user-provided mean and SD) to impute infection times for all cases. Whereas using symptom onset dates results in exact but time-lagging estimates of Rt [3], using infection times produces time-accurate estimates, which are necessary in order to assess the effectiveness of control measures and other factors that potentially affect transmission. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 22, 2020. . https://doi.org/10.1101/2020.09. 19.20198028 doi: medRxiv preprint Bayesian imputation of infection times incorporates the appropriate uncertainty in the underlying epidemic curve of infections, reflecting the full range of curves that are compatible with the observed data. At the same time this smoothens out both the epidemic curve and the resulting Rt estimates, making it harder to detect abrupt changes in transmissibility and their possible relation to control measures [7]. Although a degree of smoothing is desirable, if this is deemed excessive, a shorter rolling time window can be used.
Delays in case diagnosis and reporting (case ascertainment) will result in right-truncation, i.e. artificially low infection counts for later days in the time series, biasing recent Rt estimates downwards. Several methods have been proposed to adjust for this, which are often referred to as "nowcasting" [8,9]. Our approach was to divide the later counts by the cumulative probability of ascertainment, as given by the distribution of the duration between infection and ascertainment. This simple method has the advantage of not requiring additional historical or other data, compared to more elaborate methods. As an alternative we used a "data augmentation" approach, whereby future cases were added in the dataset based on the reporting rate of the previous week. A comparison of the two approaches is included in the online supplement (Supplementary Figure 1).
The overall model is run using Markov Chain Monte Carlo (MCMC) in JAGS [10]. To facilitate its use, we have created an R software package named "bayEStim" (https://github.com/thlytras/bayEStim) providing a user-friendly interface, in similar fashion to the "EpiEstim" package for the original method [3]. At a minimum the package requires only a vector of symptom onset dates, an indicator of whether the case is local or imported, and a mean and SD for the serial interval. In addition, the dates of ascertainment and the mean and SD of the incubation period can be specified for optimal inference.

Data sources and analysis
We applied our method for estimating Rt on national-level COVID-19 surveillance data collected in Greece by the National Public Health Organization (NPHO) from 26 February (when the first case was identified) to 3 August 2020. All cases were laboratory-confirmed with RT-PCR testing, and were considered imported if they had arrived in Greece in the last 3 days before a swab was obtained. For the serial interval we specified ranges of 4.0-7.5 days for the mean and 2.0-5.0 days for the SD, in order to cover the estimates reported in the literature [11][12][13]. Similarly, the mean incubation period was specified as 5.1 days with an SD of 3 days [14]. A 7-day rolling time window was used for Rt estimation. Sensitivity analyses were undertaken with shorter time windows, as well as with the subsets of hospitalized cases and severe cases (defined as hospitalized in intensive care or dead). Dates when major control measures were implemented (or relaxed) were overlaid on the resulting time series of Rt estimates, in order to assess their effects on COVID-19 transmissibility.
In addition, we sought to examine the association between population mobility, ambient temperature and Rt during the study period. We downloaded the Google mobility data for Greece for the study period [15], and averaged the "retail and recreation", "parks", "transit stations" and "workplaces" categories into a single . CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 22, 2020. . https://doi.org/10.1101/2020.09.19.20198028 doi: medRxiv preprint relative mobility measure (compared to the baseline period of 3 January to 6 February 2020). As a sensitivity analysis we omitted "parks" from the mobility measure, as it comprises mostly outdoors activities. From the National Oceanic and Atmospheric Administration (NOAA) website [16], we downloaded mean daily temperatures at 48 weather stations across Greece and used a population-weighted average as the overall countrywide temperature of each day. Then, given the 7-day rolling window, we fitted a linear regression of the logRt estimates on the 7-day moving averages of mobility and temperature. To incorporate the uncertainty of the dependent variable (Rt), we ran the regressions separately on each Rt series obtained in each MCMC iteration and combined the regression coefficients using Rubin's rules [17]. Alternative specifications of this regression were explored as sensitivity analyses. All analyses were undertaken in the R software environment, version 4.0.2 [18].

Results
Until 3 August 2020 a total of 4,737 COVID-19 cases had been reported in Greece. Of those, 280 cases were from two large clusters neither linked to nor representative of the general population (a large cruiseferry arriving in Greece from abroad, and a migrant hosting facility); these were excluded, leaving 4,459 cases for the Rt estimation. Table 1 presents the age distribution of these cases and their breakdown by location of transmission and severity; approximately one quarter were imported, one third were hospitalized, and 8% were severe cases, i.e. were hospitalized in intensive care or died. Imported cases were on average younger than local cases, while hospitalized and severe cases were older than non-hospitalized (p<0.001, Table 1 March the food service industry and shopping centres were closed, and on 23 March all non-essential movement was restricted (i.e. a lockdown was implemented). On 4 March the lockdown was lifted, with most business restrictions gradually relaxed until the end of May. And on 1 July international airport connections were restored with most countries, effectively allowing the Greek tourism sector to operate.
Although the first peak in daily cases occurred at the time of the lockdown, infections appear to peak much earlier, specifically at the time of school closure. Similarly, a second peak on late August corresponds to an increase of infections since early July, coincidental with an influx of imported cases (Figure 1). is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 22, 2020. . https://doi.org/10.1101/2020.09.19.20198028 doi: medRxiv preprint late May was followed by an Rt consistently below 1 until early July, when the border opening and influx of imported cases was subsequently followed by a rise in local COVID-19 cases and Rt (Figures 1 and 2).
Our delay adjustment method performed relatively well, but consistently underestimated the Rt for the latest 7-10 days in the time series (Figure 2 and Supplementary Figure 1); we therefore omitted the last week of estimates from our regression model (after 27 July). A significant association was found between Rt and both mobility and temperature; each 10% of increase in relative mobility increased Rt by 8.1% (95% CrI 6.1-10.2%), whereas each unit celsius of increase in the 7-day temperature average decreased Rt by 4.6% (95% CrI 5.4-13.7%). The adjusted R-squared of the model was 54.3%, indicating a fairly good fit to the data, and had the lowest AIC of all alternative models explored as sensitivity analyses (Supplementary Table 2).
In sensitivity analyses using hospitalized and severe cases, the Rt time series was very similar to the one obtained using all cases, albeit with much reduced precision given the lower numbers of cases (Supplementary Figure 2).

Discussion
Monitoring the transmissibility of the SARS-CoV-2 virus during the COVID-19 pandemic is crucial in order to assess the effectiveness of control measures and guide public health policy. Rt is a useful and easily understandable metric for this purpose, but its reliable estimation presents considerable challenges [7], especially given the inherent limitations of surveillance data [19]. Our method to estimate Rt does not require structural assumptions other than the serial interval distribution, and introduces several improvements that make it appropriate for use in a surveillance context. It works with minimal or incomplete data, it largely adjusts for reporting delays, and can produce time-accurate estimates using an external estimate of the incubation period; the latter is essential for correctly assessing the effectiveness of control measures, as any intervention to reduce infection rates will only be reflected in case reporting rates after a substantial period of time. Our method is also very easy to use through our "bayEStim" package for the R software environment.
On the other hand, there are certain limitations with our approach. Imprecision can be high especially with low case counts, as our model incorporates all sources of uncertainty in Rt, serial interval, incubation period, missing symptom onset dates and delay distribution. Bayesian imputation of infection times also introduces smoothing, which is both appropriate and to an extent desirable, but can also blur abrupt changes in Rt [7]. This should be kept in mind when interpreting the results. Substantial changes in testing or ascertainment rates over time can bias the results, as can the presence of few large clusters compared to overall cases; down-weighting of cluster-related cases, for example according to their percentage positive compared to overall cases, may be a possible solution and has been implemented in "bayEStim". Ideally our method should be applied to hospitalized or severe case series, which can be less subject to underascertainment or non-random testing, although in our case Rt estimates were not appreciably different. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 22, 2020. . relationship between population mobility and viral transmissibility, indicating that social distancing and staying home can be very effective in bringing the Rt of COVID-19 below 1 [20]. Indeed both mobility and Rt started falling rapidly just as the first control measures were introduced, indicating that the multiple control measures introduced, combined with good public compliance, resulted in effective social distancing earlier than the full lockdown of 23 March. This was likely crucial, as the virus did not get enough time to spread widely among the population. Counterintuitively, the lockdown appears to not have substantially contributed to a further reduction in COVID-19 transmissibility, as Rt had already fallen significantly below 1 at least a week earlier and remained stable thereafter. It is possible that without the lockdown the decrease in mobility and Rt might not have been sustained, but this purely hypothetical.
As social distancing measures were relaxed, Rt in Greece remained largely below 1 until early July. Then a large influx of tourists (with 1.3 million arrivals during all of July [21], in a population of around 10.5 million), with hundreds of imported COVID-19 cases (Figure 1), was followed by a steep increase in locallyacquired cases and Rt climbing back to 1. This coincided with a continued increase in mobility beyond baseline levels ( Figure 2). It should be emphasized that the arrival of imported cases from abroad does not in itself raise Rt, as transmissibility depends entirely on local conditions of social mixing and mobility.
However, summer in Greece is defined by people, Greek and foreign, going on vacation in large numbers and in proximity to each other, especially in tourist hotspots; moreover, an estimated 850,000 local jobs are directly or indirectly linked to the tourism industry [22]. This situation creates the general conditions for sustained spread of the virus among the population, through increased mobility and crowding. It is the combination of increased case importations and increased transmissibility that results in more COVID-19 cases identified during the summer, a phenomenon which is expected to abate in September as the tourist season winds down. In the meantime, is is important to push Rt back below 1 as much as possible, with targeted social distancing measures and widespread use of other protective measures like masks and hand hygiene.
Finally, we found a relationship between ambient temperature and COVID-19 transmissibility; in fact the 18.3 degrees celsius difference between the lowest and highest temperature in our data corresponds to a 58.1% change in Rt (95% CrI 35.7-72.7%), which appears quite significant. This may be attributed to the effects of heat on the virus itself, but is likely more down to behavioural patterns in the population, with more social interaction happening outdoors as the weather gets warmer and vice versa. This effect however does not appear sufficient to counterbalance the increased mobility associated with the summer season in Greece, which is unsurprising considering the lack of population immunity to SARS-CoV-2 [23]. Moreover, with the higher transmissibility anticipated during the winter, a substantially higher degree of social distancing will likely be required to keep Rt below 1 and the pandemic under control.
In conclusion, we present an improved method and software for estimating Rt that is tailored to the realities of a disease surveillance context, and can be a valuable component of the surveillance activities during the current COVID-19 pandemic. Applying our method to the Greek case demonstrates the important conclusions that can be drawn, and that can guide the further management of the pandemic. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 22, 2020. .

Conflict of Interest statement
ST and VS serve on the Expert Advisory Group for COVID-19 of the Hellenic Ministry of Health, which is an unpaid position. TL and DP declare no competing interests.
. CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 22, 2020. . is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 22, 2020.  is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 22, 2020.