USING POPULATION BASED KALMAN ESTIMATOR TO MODEL COVID-19 EPIDEMIC IN FRANCE: ESTIMATING THE EFFECTS OF NON-PHARMACEUTICAL INTERVENTIONS ON THE DYNAMICS OF EPIDEMIC. BY ANNABELLE COLLIN1,2,3,BORIS P. HEJBLUM1,4,5,CAROLE VIGNALS1,4,5,6,

In response to the ongoing COVID-19 pandemic caused by SARS-CoV2, governments are taking a wide range of non-pharmaceutical interventions (NPI). These measures include interventions as stringent as strict lockdown but also school closure, bar and restaurant closure, curfews and barrier gestures i.e. social distancing. Disentangling the effectiveness of each NPI is crucial to inform response to future outbreaks. To this end, we first develop a multi-level estimation of the French COVID-19 epidemic over a period of one year. We rely on a global extended Susceptible-Infectious-Recovered (SIR) mechanistic model of the infection including a dynamical (over time) transmission rate containing a Wiener process accounting for modeling error. Random effects are integrated following an innovative population approach based on a Kalman-type filter where the log-likelihood functional couples data across French regions. We then fit the estimated time-varying transmission rate using a regression model depending on NPI, while accounting for vaccination coverage, apparition of variants of concern (VoC) and seasonal weather conditions. We show that all NPI considered have an independent significant effect on the transmission rate. We additionally demonstrate a strong effect from weather conditions which decrease transmission during the summer period, and also estimate increased transmissibility of VoCs.

In response to the ongoing COVID-19 pandemic caused by SARS-CoV-2, governments are taking a wide range of non-pharmaceutical interventions (NPI). These measures include interventions as stringent as strict lockdown but also school closure, bar and restaurant closure, curfews and barrier gestures i.e. social distancing. Disentangling the effectiveness of each NPI is crucial to inform response to future outbreaks. To this end, we first develop a multi-level estimation of the French COVID-19 epidemic over a period of one year. We rely on a global extended Susceptible-Infectious-Recovered (SIR) mechanistic model of the infection including a dynamical (over time) transmission rate containing a Wiener process accounting for modeling error. Random effects are integrated following an innovative population approach based on a Kalman-type filter where the log-likelihood functional couples data across French regions. We then fit the estimated time-varying transmission rate using a regression model depending on NPI, while accounting for vaccination coverage, apparition of variants of concern (VoC) and seasonal weather conditions. We show that all NPI considered have an independent significant effect on the transmission rate. We additionally demonstrate a strong effect from weather conditions which decrease transmission during the summer period, and also estimate increased transmissibility of VoCs. will need to be admitted to intensive care units (ICU), with a potential need of mechanical ventilation Docherty et al. (2020).
The COVID-19 pandemic has pushed modern health care systems to a breaking point all over the world. Indeed, SARS-CoV-2 being a newly emerging pathogen, means the entire human population is susceptible to infection. Surges in hospitalizations and especially ICU needs wherever SARS-CoV-2 outbreaks arise have been observed. Due to an infectious phase starting before any symptoms are visible and a significant proportion of a-or paucisymptomatic infections Jones et al. (2021), the spread of SARS-Cov-2 is extremely difficult to control Wu et al. (2021). In response, most governments have adapted drastic public health measures, also called Non-Pharmaceutical Interventions (NPI), in order to reduce the transmission of SARS-CoV-2 among their population, and consequently relax the pressure on their health-care system. In particular, the French government adopted the concept of a "graduated response" to the pandemic, deploying an arsenal of various NPI -some very stringent and others less so -in response to the COVID-19 national epidemic situation. Hale et al. Hale et al. (2021) built a stringency index that helps understand how strong the measures over time were. However, this indicator does not allow one to distinguish the effectiveness of each NPI, which is crucial to inform future preparedness response plans. Because NPI all have economic, psychological and social costs, it is paramount to evaluate their impact on the transmission of SARS-CoV-2, and on the dynamics of the COVID-19 epidemic.
Many studies relied on mechanistic models of epidemics in order to either predict its course Davies et al. (2020), evaluate vaccine prioritization strategies Bubar et al. (2021), or retrospectively measure NPIs impact. During the beginning of the epidemic, the focus was mostly on the timing of NPI initiation Pei, Kandula and Shaman (2020);  rather than the effect. However, disentangling the effect of each NPI is a complex problem as their allocation is not randomized and depends on the epidemic state. Many approaches aggregated data from multiple countries. Some worked on regression from time series based on incidence data Banholzer et al. (2021); ; Hsiang et al. (2020). Others used semi-mechanistic models and evaluated the percent reduction on the effective reproductive number Liu et al. (2021a); Haug et al. (2020); Flaxman et al. (2020). We preferably work at a country level so that the effect is not confounded by various behaviors and adherence levels of the population. Most of the work published at a country-level has focused on a on single aggregated NPI such as the Oxford COVID-19 Government Response Tracker Hale et al. (2021), very early epidemic Kraemer et al. (2020); Dehning et al. (2020), or a very limited set of interventions, see Brauner et al. (2021) for a review. Regarding France, Salje et al. (2020);Di Domenico et al. (2020) quantified the effect of various NPI. However, they based their results on the early epidemic and, they have a limited set of interventions. Our work rather focuses on the impact of each NPI on the transmission rate, a more valid indicator than direct epidemic curves or the reproductive number because it is independent of the infected proportion of the population. Finally, existing works do not account for vaccination roll out, introduction of variants, and the importance of weather while estimating the impact of NPI. The estimation of the independent effect of each of these factors leads to a challenging problem of estimation, including concerns about practical identifiability of each effect.
In this work, we propose a two-step approach. First, we estimate the transmission rate of SARS-CoV-2 and its variations in the 12 non-insular French regions over a period of more than a year -since March 2, 2020 until March 28, 2021. Second, we estimate, using linear regression, the effects of several NPI on the transmission rate while accounting for seasonal weather conditions throughout the pandemic, as well as the appearance of nonhistorical variants of concern (VoC) and an increasing proportion of vaccinated people. The first step consists of estimating the transmission rates in the 12 non-insular French regions. This is extremely challenging due to the sparsity and noisiness of the available data, and also because the parametric shape of the transmission rates is unknown. Using data assimilation across multiple geographical regions, and coupling public data with a dynamic mechanistic model, smooth transmission rates can be estimated through a Kalman filtering approach Simon (2006); Bensoussan (2018) -as already used in epidemiology for COVID-19 spread Arroyo-Marioli et al. (2021); Islam, Hoque and Amin (2020) or for other epidemics with regional variability Hu et al. (2020). More precisely, we develop an ingenious methodology to tackle this difficult problem, based on two important methodological innovations: (1) in the model with the introduction of a time-varying dynamics for the transmission rate including a Wiener process accounting for modeling error, and (2) in the way the population is integrated as we follow a new method Kalman-type filter, compatible with population approaches. This method in which the log-likelihood functional -estimated using for example the Unscented Kalman Filter Julier and Uhlmann (2002) -elegantly couples data across multiple geographical regions is presented in Collin, Prague and Moireau (2020). These two innovations are coupled in a strategy allowing the estimation of smooth transmission rates without knowledge of their shapes. This second step allows us to provide estimations alongside associated uncertainties for a) the transmission rate of SARS-CoV-2 throughout the COVID-19 pandemic in France, b) the effect of the principal NPI implemented in France on this transmission rate, and c) the effect of seasonal weather conditions and new VoC circulation, from observed hospitalization data in French regions.
Section 2 presents the data and the SEIRAH model. In Sections 2.4 and 2.5, the two-step strategy to estimate the transmission rate, and the regression to estimate the effects of NPI are presented. Our results are highlighted in Section 3; and their limitations are discussed in Section 4.

Material and Methods.
Open-data regarding the French COVID-19 epidemic, including hospitalization data, NPI implementations, VoC prevalence and the vaccination program are presented below. The dynamic model of the COVID epidemic, using an extended SIR type model, is then presented. Finally, we describe our strategies to estimate the transmission rate, using a population-based Kalman filter, to determine the impacts of NPI, seasonal weather conditions, and VoCs.

Available Data.
Hospitalization Data. Hospitalization data are extracted from the SI-VIC database (Système d'Information pour le suivi des VICtimes), a governmental system created in 2016 in order to identify and follow victims in exceptional circumstances (e.g. terror attacks). Since March 18 th 2020, the SI-VIC database provides, to Santé Publique France, the daily number of hospitalized COVID-19 patients, at multiple geographical scales. In this work, we focus on the 12 French non-insular regions.
An entry in the SI-VIC database corresponds to a patient hospitalized in connection with COVID-19. Specifically, it requires the presence of at least one of two criteria: i) a biologically confirmed diagnosis of COVID-19 (e.g. RT-PCR positive test result), or ii) a chest CT scan suggestive of COVID-19. In this analysis, we rely on the daily incident number of hospitalizations (Y Hin ), and the total number of individuals hospitalized daily (Y H ). The two data series are displayed in Figure 1 over a period of 391 days (from March 2, 2020 to March 28, 2021). Of note, in order to be able to compare the magnitude of epidemic in each region, we standardized the data by the size of the population in each region: direct interpretation would be the number of incident or prevalent hospitalization for 100,000 inhabitants. Non-Pharmaceutical interventions. Timing and modalities of the various NPI implemented in France over the course of the epidemic have been gathered from the French government's action summary website . In France, public health interventions have been highly multipronged. In our analysis, we considered the following summarized NPI occurring during the first year of the epidemic in France: i) first lockdown (with two phases of relaxation/reopening as described bellow), ii) second lockdown (with one phase of relaxation/reopening as described bellow), iii) 8PM curfew, iv) 6PM curfew, v) school closure, vi) bar and restaurant closure, vii) barrier gestures (including all mandatory sanitary protocols: physical distancing, hand washing, part-time remote work and, mask wearing in public spaces). Of note, NPI such as travel bans, enhanced testing, contact-trace-isolate, were ignored to ensure identifia-. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. ; https://doi.org/10.1101/2021.07.09.21260259 doi: medRxiv preprint bility, as they were either i) with a magnitude difficult to quantify, or ii) enforced in complete overlap with other NPI. In addition, partial interventions at a sub-regional level were not considered as an implementation of a measure. This resulted in a rather similar profile of interventions across regions as most of them were applied simultaneously in the 12 regions of interest. Figure 2 summarizes all 10 considered NPI over time.
The first and the second lockdowns have been disconnected due to their different modalities resulting in different behaviors, and thus possibly different impact on transmission. For example, during the first lockdown (from March 17, 2020 to May 11, 2020), the entire population had to work from home -the only exceptions were for workers in vital sectors, such as medical, security, or food sectors -and outings could not exceed one hour in a one kilometer perimeter. Whereas during the second lockdown (from October 29, 2020 to December 15, 2020), working on-site was authorized when working from home was not feasible, and outings were limited to only 3 hours in a 20 kilometers perimeter. In addition, the end of the first lockdown was gradual and separated by the government in three official phases (phase 1: May 11, 2020 to June 2, 2020, phase 2: June 2, 2020 to June 22, 2020 and phase 3: after June 22, 2020) with many evolving measures as the authorized distance of travels, the reopening of cultural places (i.e., museums), the reopening of non-essential stores, etc. A government campaign to raise awareness about barrier gestures started at the end of the first lockdown. Masks and hand washing were mandatory in many places such as public transports, schools and companies. We assume that barrier gestures start on May 11 th 2020, assuming that most of mandatory measures with a potential strong impact started to be implemented at this date. Concerning the second lockdown, a reopening of the non-essential stores happened 2 weeks prior to the end of the lockdown prior to Christmas. To account for this, we separate the second lockdown in two phases: a full lockdown until November 28 th 2020 and a reduced lockdown afterwards.
School closures were documented as of the holiday schedule, which can vary between regions. In France, schools were also closed during the first lockdown. Furthermore, the reopening of schools from May 11, 2020 (end of first lockdown) to July 4, 2020 (end of term) was very progressive with phased school and level-reopening and attendance rising slowly back to normal. We averaged the estimated school attendance to be 30% of school capacities during that transition. Complete closure of all of bars and restaurants happened twice over the study period: first a few days before the implementation of the first lockdown in March 2020 (as for schools closure), and along the beginning of the second lockdown in October 2020. The measure was lifted progressively in all regions after the end of the first lockdown in June 2020. At the end of the second lockdown, the measure was not lifted considering the epidemic situation was not good enough. It was still in place at the end of the study period.
Curfew measures started to be implemented October 17, 2020 in the 12 regions of interest in several major cities and in Île-de-France, with a curfew from 9PM to 6AM. It was extended on October 22 nd to 54 departments (sub-regional administrative unit). It was suspended during the second lockdown from October 30 th to December 15 th 2020, when a new one was implemented at 8PM at a national level. From January 2-12, 2021, the starting hour was changed to 6PM progressively at the departmental level until January 16 th when it switched to 6PM for all 12 regions of interest. Finally on March 20, 2021, the starting hour was changed again nationally to 7PM. To model the measures we considered only two variables, grouping curfews starting at 9PM or 8PM and those starting at 6PM or 7PM, and we set the value to 1 only when a whole region was under the curfew. Assuming that curfews and lockdowns induce different behaviors, curfews are not considered to be included in lockdowns, even if it is forbidden to go out in the evening during lockdowns. Similarly, 6PM curfew and 8PM curfew were considered as different interventions instead of nested ones as they are likely to induce different behaviors.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. Weather conditions. The role of weather conditions on the transmission of the SARS-CoV-2 remains disputed, and early publications were criticized for their inconsistency in results Zaitchik et al. (2020). Nonetheless, the potential impact of both temperature and humidity on aerosolized and fomite transmission routes based on comparisons with other respiratory infections is based on sound mechanistic arguments Rodó et al. (2021). Additionally, with the Northern Hemisphere going through a second winter season during the pandemic, evidence of the association of transmission with seasonal trends of temperature and humidity appears more robust in recent publications Rodó et al. (2021); Liu et al. (2021b). Daily weather data -namely temperature in Celsius degrees (T ), relative humidity in percentage (RH) and absolute humidity in g.m −3 (AH) -measured by meteorological stations were extracted from the National Oceanic and Atmospheric Administration database using R package worldmet. Regional daily means were estimated with all stations located in the region or at less than 10 km from the region border. To account for variations of population density across a region, a weighting was used based the on population living in a 10km buffer around each station, giving more importance to weather conditions around densely populated areas. We use the PREDICT index of weather transmissibility of COVID-19 (IPTCC), as defined by Bukhari and Jameel (2020) and Roumagnac et al. (2021): As for interpretation, this rough index ranges from 0% to 100%, the smaller the less favorable are the conditions for COVID-19 transmissions. For France, we observe a seasonality of IPTCC being small during summer months and higher otherwise, with a north-east/southwest gradient. From this index, we created a weather variable by normalizing (min-max range equal to 1), subtracting the global average value and by reversing it, see black curves in Figure 3. Finally, to focus on the seasonal variation, a loess smoothing with a span of 0.2 was applied before use, leading to a smooth weather variable, denoted by W in what follows. Resulting seasonal . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. ; https://doi.org/10.1101/2021.07.09.21260259 doi: medRxiv preprint variations of this variable for each of the 12 regions of interest over the study period are shown on Figure 3 (red curves). The lower the value, the closer temperature and humidity conditions were close to optimal transmission conditions defined by Bukhari and Jameel (2020). Again, two clear period appear, summer with higher values of this weather variable and winter with lower values. This variable is denoted W i (t) for the i th region. When taking W = 0, one considers the global average value over all French regions over the study period. VoC of SARS-CoV-2. Some variants of the SARS-CoV-2 virus have been classified by national and international health authorities as VoCs as they impact transmissibility or virulence, or decrease the effectiveness of measures World Health Organization. Since January 2021, French health authorities have conducted surveys to estimate the prevalence of three VoCs: 20I/501Y.V1 (Alpha), 20H/501Y.V2 (Beta) and 20J/501Y.V3 (Gamma). The survey of VoC Delta started after this study. We therefore included the cumulative proportion of cases infected by one of these VoCs as a potential covariate explaining the transmission. We used data from two cross-sectional "flash" surveys performed on January 7-8, 2021 and January 27, 2021, see Santé Publique France; Santé Publique France and the weekly estimation of VoC circulation provided by the SI-DEP database at the regional level from February 12, 2021 to March 28,2021 . Between January 8, 2021 and February 12, 2021, the estimated proportion of the sum of the three VoC have increased from a national mean of 3.3% to a national mean of 46%. To complete the missing data, we assume that the proportion before January 8, 2021 equals to 0% and that the evolution is linear between January 8 and January 27, 2021, and then between January 27 and February 12, 2021. Of note, a logistic and an exponential growth were also tested with no significant changes in conclusions (results not shown). With no data reported for Bourgogne-Franche-Comté region on January 27, 2021, only one slope was estimated on the time window. This variable is denoted V oC i (t) in what follows. Representations of the VoC proportion in each region over time are given in Section 1 of the Web Supplementary Material.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. ; https://doi.org/10.1101/2021.07.09.21260259 doi: medRxiv preprint Vaccination process. Vaccination started in France on December 27, 2020. Three COVID vaccines were at that time authorized: BNT162b2 mRNA (Pfizer), ChAdOx1 nCoV-19 (Az-traZeneca) and mRNA-1273 (Moderna). To take into account the vaccination process, we used the VAC-SI database which provides the cumulative percentage of the population who have been vaccinated with at least one dose of vaccine over time. Vaccination was first targeted to the elderly 75+ years of age. The proportion of vaccinated individuals, denoted here within as V , ramped up to about 12% by the end of study period. Representations of the vaccination coverage in the population over time are given in Section 1 of the Web Supplementary Material.
2.3. Model of the epidemic.
The mechanistic model:. We model the evolution of the COVID-19 epidemic using an extended SEIR type model Wang et al. (2020), called a SEIRAH model, adapted from Wang et al. (2020); Prague et al. (2020) where the population of size N is divided into 5 compartments: susceptible S, latent exposed E, symptomatic infectious I, asymptomatic/paucisymptomatic infectious A, hospitalized H, removed R (i.e. both recovered and deceased), see Figure 4. The number of vaccinated people denoted by V is assumed to be known, see Section 2.2. The dynamics of such model is given by where α, r E , D E , r I , D I , D Q , D H are time-independent parameters described in Table 1 while b is a function of time modeling the disease transmission rate.  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. ; https://doi.org/10.1101/2021.07.09.21260259 doi: medRxiv preprint Effective reproductive number and attack rates:. When individuals are homogeneous and mix uniformly, the effective reproductive ratio R eff (t) is defined as the mean number of infections generated during the infectious period of a single infectious case at time t. In this model, the effective reproductive ratio can be written as a function of model parameters (see Section 3 of the Web Supplementary Material for details), namely When neglecting the deaths, the proportion of infected individuals assuming no waning immunity -also called attack rates -among the population in each region at a given date is given by:

2.4.
A population-based Kalman filter to estimate the transmission rate.
Framing the problem.. In order to sustain the modeling choices behind the disease transmission rate b, and more generally the selected inference procedure, we used Kalman-based estimation strategies Simon (2006); Bensoussan (2018) applied to the epidemic model (1). We define the global transmission rate t → b(t), which accounts for the proportion of susceptible removed from the system thanks to vaccination: We propose to estimate b and ultimately to retrieve b using an estimation of vaccinated people V , see Section 2.2. In each region, we then introduce a dynamic equation for b of the form . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. ; https://doi.org/10.1101/2021.07.09.21260259 doi: medRxiv preprint where ν i consists in a Wiener process such as for all t, s ≥ 0, ν i (t) − ν i (s) ∼ N (0, (t − s)σ ν ) with σ 2 ν is known and constant in all regions and g i is a function describing the evolution of the global transmission rates b i in each region. After discretization using forward Euler time-scheme with small-enough time-step δt, we end up with a discrete-time dynamical system applied to the variable x = (E, I, R, A, H) ∈ R 5 , for each region i = 1, . . . , 12: where f accounts for the dynamics of (E, I, R, A, H) in (1) while S is reconstructed afterwards using S = N − (E + I + R + A + H) in each region. In the time discrete system, (ν i n ) n≥0 now represent independent random variables, normally distributed with 0 mean and a variance equal to σ 2 ν δt. Moreover, if constant parameters have to be estimated, the vector θ ∈ R Np gathers all of them. For the estimation in the following, we transform the variable z to account for biological constraints. Remarking that all state variables are positive and bounded by N , the total population size of the region, we transform the state variable using x → logit(x/N ). We also apply a similar transformation to b → logit(b/max b ). Of note, state and transmission rate variables are more likely to have Gaussian distributions in the transformation space. After calibration, we fixed max b = 1.5 and checked that other values did not substantially changed the results (result not shown).
Population approach.. To perform the estimation, we rely on an extension of the classical Unscented Kalman filter (UKF) Uhlmann (1997, 2002); Simon (2006). The particularity in this application, as described in Section 2.4, is that multiple series of data are observed jointly in multiple regions as we observe multiple realizations of the same epidemic. In order to take into account in our Kalman estimation that parameters in different regions are correlated as in a population approach, we follow a recently proposed population-based Kalman formulation Collin, Prague and Moireau (2020). As in mixed-effect models Lavielle (2014), each initial uncertainty variable z i 0 is assumed to be randomly distributed around a common population intercept z pop 0 with a Gaussian distribution of unknown covariance Q 0 , namely: . When we treat the population intercept as the empirical mean over the population members in the construction of the objective function, we recover a classical filtering problem Simon (2006) on the aggregated variable z = (z 1 · · · z Nr ) . The only difference is the formulation of the initial covariance priorP 0 which couples the observations across regions and can be written as:P where ⊗ indicates a Kronecker product,Q 0 is a prior of Q 0 and M a small penalization matrix that guarantees the overall invertibility ofP 0 . As a consequence, the matrixP 0 is not block diagonal with respect to the region i and thus all the region dynamics are coupled. The resulting time-discrete Kalman estimator couples all the regions together to produce a population-based estimation. Note that in such strategy, forcing a variable to be constant in the population is possible by simply choosingQ 0 such that Tr(Q −1 0 ) is very small with . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. ; https://doi.org/10.1101/2021.07.09.21260259 doi: medRxiv preprint USING POPULATION BASED KALMAN ESTIMATOR TO MODEL FRENCH COVID EPIDEMIC 11 respect to Tr(M ). Conversely, Tr(M ) small with respect to Tr(Q −1 0 ) in a large population N r 1 will encourage each region to remain decoupled from each others. Starting from a given prior knowledge, our Kalman implementation uses the available measurements (Y H ij , Y Hin ij ) 1≤i≤12,0≤j≤391 , to compute recursively in time the following estimates: Then exploiting thatẑ n gather the augmented state (x i n+1 , b i n+1 , θ i n+1 ), 1 ≤ i ≤ 12 of the 12 regions, a simple post-processing over the regions provides estimations of b and the state variable. We refer to Collin, Prague and Moireau (2020) for further details about this Kalmanbased population approach.
Estimation strategy.. As we want to inject as few information as possible on the shape of the transmission rate b, we will first consider that the Wiener process ν defined in Equation (3) is a time-dependent function which thus encompass the complete dynamics of b. This corresponds to choosing g ≡ 0, meaning we have no prior knowledge on the evolution of transmission rates, i.e. on the NPI effects. However, to avoid overfitting, our objective is to disentangle the latent trajectory of b from possible noise in the data which can be observed in Figure 1. We adopt a 3-step approach described below consisting in smoothing the trajectory of b: 1. Estimate a reasonable prior for the initial transmission rate b before the start of any NPI.
We use data before the first lockdown (10 days available) and assume the transmission rate b i (t) = b i init for t = 1 · · · 10 days to be a constant. In other words, we apply the population Kalman filter estimation described above with θ from Equation (4) being reduced to b init . 2. Estimate the shape of b with prior on initial value but without any prior on the dynamics.
We set the initial value of b i for times t = 1 · · · 10 days to b i init and take g(t) ≡ 0 such that Equation (3) rewrites db(t) = dν(t). We apply the population Kalman filter estimation described above. Of note, the parameter vector θ from Equation (4) is now empty. As there is no information on b, the model error is very important and could lead to an over-fitting of the data. We then build a prior for the dynamics of b by fitting a parametric shape based on sum of logistic functions on the weighted average trajectories of b over all regions with least-square method. Then, we build a prior for the dynamics of b by fitting a parametric shape based on sum of logistic functions on the weighted average trajectories of b over all regions with least-square method. We choose logistic functions as they seem well adapted to model variations due to for example stiff lockdowns or smooth unlocks. We set the number of logistic functions from the number of main changes of variations of the weighted average trajectories of b. This function will represent a prior g on the dynamics. 3. Estimate the shape of b with prior on initial value and informative prior on the dynamics.
We set the initial value of b i for times t = 1 · · · 10 days to b i init and take db(t) = g(t)dt + dν(t) such as in Equation (3). We apply the population Kalman filter estimation described above. As the dynamics of b still includes a modeling noise, the shape of b could be different from the prior transmission rate defined from Step 2. This is the result of this final analysis that is used for further description of the effect of NPI.

Explanatory model for the transmission rate.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. ; https://doi.org/10.1101/2021.07.09.21260259 doi: medRxiv preprint Mixed effects model. Using the function b obtained with population Kalman filter as described in Section 2.4, one can determine the impacts of NPIs, seasonal weather conditions, and VoCs on the transmission rate b = b (1−V /N ) . We followed the example of Flaxman et al. Flaxman et al. (2020) -in considering interventions effects to be multiplicative. Therefore, we fitted a linear mixed effect model on the logarithm transformation of the transmission. The model equations are given in Section 4.2 of the Web Supplementary material. It consists of an intercept, which represents the average transmission of COVID-19 across all regions without NPI without VoC circulation and for an average French weather condition (W=0), the sum of effect of the 10 NPIs described in Section 2.1, the effect of weather and the effect of the VoC percentage. It is known that the transmission of SARS-CoV-2 is very different indoors and outdoors Bulfone et al. (2021). Thus, we also added an interaction effect between bar and restaurant closure and the weather, accounting for the opening of outdoor sitting areas. Of note, this is the only interaction tested to avoid overfitting. We finally added random effects to account for heterogeneity between regions. We added a random intercept and random slopes for the effect of first, second lockdown and 6PM curfew as they may be highly variable between regions. We assume a full covariance matrix for the random effects, allowing effects to be correlated together in each region. In particular, we believe that they may be impacted by various factors not accounted for in the epidemic model, such as the density, the age distribution or the urbanization of the region.
Note that we added a 7-day delay to the lockdowns, which can be interpreted as a necessary time to allow people to organize themselves for adaptation (implementation of working from home, childcare etc ...). This choice has been motivated by the observed 7-day delay in the transmission rates obtained by Kalman filters and will be discussed. To ease interpretation of the estimated effects, parameters were transformed back and expressed as a decrease or an increase in transmission in percentage by applying the function x → 100(e x − 1). Classical 95% confidence intervals were obtained using 100(e x+/−1.96SE(x) − 1), were SE is the standard error obtained from the regression.
Interpretation of the impact of seasonal weather conditions. To facilitate understanding, we calculated some NPI effects during the summer and winter periods. To do so, we replace in the results W by its summer (resp. winter) average from June 21 st , 2020 to October 21 st , 2020 (resp. before June 21 st , 2020 and after October 21 st , 2020 ) over all regions.
Basic reproductive number. The intercept of the regression presented above represents the mean transmission rate over all regions when there is no NPI in place, no VoC and the weather condition is taken to be the average weather condition over a year in France. Thus, inserting it in Equation (2) directly provides the basic reproductive number and 95% confidence intervals can be computed using the standard error of this parameter.

Results.
Estimation of the transmission rate using a population-based Kalman filter.
Step 1 of estimation provided the initial values for the transmission rate and exhibited quite similar values between regions (average 0.78 sd 0.012, see Section 4.1 of the Web Supplementary Material for the values at the regional level). Although higher values are found for the regions with the higher first wave, the small variability between regions may indicate that the magnitude of the first wave was not fully driven by a higher transmission rate, but also by different initial epidemic states (i.e. number of exposed and infectious cases resulting in differences in virus introduction timelines or the occurrence of super-spreading events Roux, Massonnaud and Crépey (2020)). In Step 2, the transmission rates obtained without knowledge of their shapes are given in Figure 5 (top, left) for all the regions. As there is no information on b, . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. ; https://doi.org/10.1101/2021.07.09.21260259 doi: medRxiv preprint USING POPULATION BASED KALMAN ESTIMATOR TO MODEL FRENCH COVID EPIDEMIC 13 < l a t e x i t s h a 1 _ b a s e 6 4 = " m 1 u / n u 7 2 7 d B 4

8 C u u E 8 3 S n o z 3 V E = " > A A A B 7 X i c b V A 9 S w N B E J 3 z M 8 a v q K X N Y h C s w l 0 K t Z K A j W V E 8 w H J E f Y 2 e 8 m a v b 1 j d 0 I R / w N N h a K 2 P p / 7 P w 3 b p I r N P H B w O O 9 G W b m B Y k U B l 3 3 2 1 l Z X V v f 2 C x s F b d 3 d v f 2 S w e H T R O n m v E G i 2 W s 2 w E 1 X A r F G y h Q 8 n a i O Y 0 C y V v B 6 H r q t x 6 5 N i J W 9 z h O u B / R g R K h Y B S t 1 L x D n j x V e 6 W y W 3 F n I M v E y 0 k Z c t R 7 p a 9 u P 2 Z p x B U y S Y 3 p e G 6 C f k Y 1 C i b 5 p N h N D U 8 o G 9 E B 7 1 i q a M S N n 8 2 u n Z B T q / R J G G t b C s l M / T 2 R 0 c i Y c R T Y z o j i 0 C x 6 U / E / r 5 N i e O l n Q i U p c s X m i 8 J U E o z J 9 H X S F 5 o z l G N L K N P C 3 k r Y k G r K 0 A Z U t C F i y 8 v k 2 a 1 p 1 X q r f V c u 0 q j 6 M A x 3 A C Z + D B B d T g B u r Q A A Y P 8 A y v 8 O b E z o v z 7 n z M W 1 e c f O Y I / s D 5 / A F q v 8 D < / l a t e x i t >
Step 2

< l a t e x i t s h a 1 _ b a s e 6 4 = " k b g p W A r n E T F R I 0 a V z M R E K a u 5 g B 0 = " > A A A B 7 X i c b V D L S g N B E O y N r x h f U Y 9 e B o P g K e x G U E 8 S 8 O I x o n l A s o T Z y W w y Z n Z 2 m e k V Q o j f 4 M W D I l 7 9 H 2 / + j Z N k D 5 p Y 0 F B U d d P d F S R S G H T d b y e 3 s r q 2 v p H f L G x t 7 + z u F f c P G i Z O N
e N 1 F s t Y t w J q u B S K 1 1 G g 5 K 1 E c x o F k j e D 4 f X U b z 5 y b U S s 7 n G U c D + i f S V C w S h a q X G H P H k 6 6 x Z L b t m d g S w T L y M l y F D r F r 8 6 v Z i l E V f I J D W m 7 b k J + m O q U T D J J 4 V O a n h C 2 Z D 2 e d t S R S N u / P H s 2 g k 5 s U q P h L G 2 p Z D M 1 N 8 T Y x o Z M 4 o C 2 x l R H J h F b y r + 5 7 V T D C / 9 s V B J i l y x + a I w l Q R j M n 2 d 9 I T m D O X I E s q 0 s L c S N q C a M r Q B F W w I 3 u L L y 6 R R K X v n 5 c p t p V S 9 y u L I w x E c w y l 4 c A F V u I E a 1 I H B A z z D K 7 w 5 s f P i v D s f 8 9 a c k 8 0 c w h 8 4 n z 9 s Q 4 8 E < / l a t e x i t > Step 3 Step 2 (with ν = 0). Bottom -Time evolution of R eff in the 12 French regions when g corresponds to 7 logistic functions (with ν = 0). If R eff is superior to 1 (see the horizontal dashed black line), the infection will spread. The first (resp. second, third, fourth, fifth) vertical gray line corresponds to the first day of the first lockdown (resp. the last day of the first lockdown, the start of the academic year, the first day of the second lockdown, the last day of the second lockdown).
the model error is very important and leads to an over-fitting of the data. In particular, oscillations with zero mean and with period close to 7 days are visible for many regions. This is related to a known under-reporting during week-ends. The weighted average trajectory of b is displayed in Figure 5 (bottom, left). We approximated the function g by a sum of 7 logistic functions. Results of the least-square fitting are displayed in dashed line and constitute the prior for the following. Finally, in Step 3, the transmission rates b are obtained and displayed in Figure 5 (top, right). As the dynamics of b still includes a modeling noise, the shape of b is different from the prior transmission rate defined from Step 2 but still smoother.
Effective reproductive number. The resulting effective reproductive ratio R eff is given at the regional level in Figure 5 (bottom, right). It starts from a value ranging between 3.5 and 4 in all regions. This is partially driven by winter-like conditions during the first days of March 2020. The basic reproductive number will be later computed adjusting for weather. The time variation of R eff shows that it rapidly gets under the critical value of one after the first lockdown initiation and then oscillates around this value.
Attack rates. The attack rate represents one intrinsic feature of interest in any epidemic model (as described in Section 2.3). On top of adding an insight on the epidemic dynamics by adding extra knowledge on the number of possible hidden / unmeasured cases, attack rates are a good way to validate how realistic the model fitting is. Figure 6 represents the attack . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. ; https://doi.org/10.1101/2021.07.09.21260259 doi: medRxiv preprint rate at various important times: at the end of first lockdown (May 11, 2020), October 5, 2020 and at the end of our study period (March 28, 2021). The national French attack rates are respectively estimated to be 5.7%, 8.8%, and 25.3% at these dates.
Basic reproductive number. After adjusting for average weather over a period of one year in France, we estimate the national average basic reproductive number at 3.10 [2.95 ; 3.26].  Effects of NPI on transmission rate. Table 2 summarizes the estimation of the effect of NPI on the transmission rates derived from the fixed effects of the model provided in Section 2.5. Regression residuals, fixed and random effects are given in Section 4.2 of the Web Supplementary Materials. Figure 7 shows the corresponding individual fits. The very restrictive first lockdown (respectively the less restrictive second lockdown) decreases the transmission rate by ∼ 80% (respectively ∼ 55%). The impact of school closure is around 7% and the bar and restaurant closure around 10%. The barrier gestures decrease the transmission by ∼ 46%.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. One can see that the values between the curfew at 6PM and the curfew at 8PM (around 30%) are very close and not statistically different (3% [-2% ; 8%]). We demonstrate that all NPI in this analysis decrease the transmission and have an effect significantly different from zero.
Effect of weather. The seasonal weather conditions variable is unitless, so the interpretation of its estimated effect should be made by comparing a value to a reference, which is set to be the average weather in France. We find that transmission is on average significantly increased by 10% during the winter period, and significantly decreased by 22% during the summer period, compared to average weather. Figure 8 (top) presents the estimated effect of seasonal weather conditions over the period of study, by region. The estimated interaction between the closure of bars and restaurants and the seasonal weather conditions is statistically significant (p=0.037) but is also complex to interpret. We show that although closing bars and restaurants always has a significant effect, decreasing the transmission, it is slightly more effective in winter (11% [8%;14%] decrease) rather than in summer (8% [4%;11%] decrease). Figure 8-Middle presents the estimated effect of bar and restaurant closure for the 12 regions of interest over the study period and its interaction with weather conditions. The weather conditions impacts the estimation, with a stronger effect on transmission reduction in Northern regions compared to Southern ones, and also a stronger effect is observed in the Spring 2021 due to more favorable weather.
Effect of VoC. Concerning the effect of the three VoCs, our results show that an increase from 0 to 100% of the proportion would increase the transmission rate by an estimated 22% [15% ; 28%]. Figure 8 (bottom) presents how the increased transmissibility varies over time due to VoC proportion.
4. Discussion. In this paper, we use an innovative method to infer the transmission rate over time from hospitalization data, and estimate the effect of multiple NPI, weather and . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 Date (  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 16, 2021. ; https://doi.org/10.1101/2021.07.09.21260259 doi: medRxiv preprint VoC on it. We showed that all NPI considered have a significant and independent effect on the transmission rate. We additionally demonstrated a strong effect of weather conditions, decreasing the transmission in the summer period, and increasing it in the winter period, and we observed an increased transmissibility due to VoCs.
Concerning the transmission rate, interpretation of our results is conditional on the mechanistic model illustrated in Figure 4, and careful attention must be given to the parameters values set from the scientific literature which are detailed in Table 1. This model does not take into account the age-structure of the population on the contrary to the works of Salje et al. (2020);Di Domenico et al. (2020) using French data. However, we believe that using a population approach accounts for different intrinsic characteristics such as population density, age structure, transportation habits, etc. that influence disease spread through each regionspecific transmission rate. Another shortcoming of the SEIRAH model is that it does not take into account waning immunity or travel between regions. The model could be slightly modified to take these aspects into account but would necessitate further fixing of more parameters while no other data is available, especially on inter-regional travel. For this reason we did not proceed with this development. Concerning the available data, the drawback to use hospitalization data is its lack of exhaustiveness given that hospital services declaring cases may vary over time. However, one may think that this variation is less important than in other sources of data such as the daily number of new confirmed cases for which the testing policy has greatly changed during the course of the epidemic.
Interestingly, although the models were different and the data not fully identical, our results were comparable in terms of attack rates with existing modeling works Hozé et al. (2021) and sero-prevalence studies Warszawski et al. (2020); Carrat et al. (2020); Santé Publique France. A comparison is available in Web Supplementary material, Section 4.1. Overall, our estimates tend to be slightly higher by less than 5% than other estimation and sero-prevalence studies. This is presumably due to the strong assumption that there is no waning immunity in our model. Indeed, we assume that all individuals, once infected, have a high enough titer of antibody response to systematically test positive in sero-prevalence studies. In terms of reproductive numbers, we found 3.  Salje et al. (2020). These comparisons allow us to validate the estimation strategy, which has the great advantage of estimating transmission rates without assumptions on the shape of the transmission functions. Furthermore, the computational times are not excessive (a few minutes for the full estimation on a classical work station without code optimization).
We restricted our analysis to the estimation of ten NPI effects. More variables could have been added, e.g. partial interventions (applied, for example, only in large cities in regions highly affected by COVID-19) -but those would be inconsistent with our model defined at a regional level. Other potential variables suffer from inconsistent definitions. Working from home is a good example. First, the adherence to this intervention (which can vary due to employee fatigue, organization difficulties and the absence of legal constraint) is difficult to evaluate. Second, regarding quantitative indicators, the DARES carried out surveys for companies with 10 or more employees and showed that the mean percentage of remote work is around 29.6% [18.8%, 40.4%] with the max value during the two lockdowns . However, this indicator accounts for working from home and paid vacation which may be very different in terms of behaviors. Hence by omitting the working from home in the model, its effect is captured by the effect of lockdown and barrier gestures. Identifying the proper effect of working from home is very difficult, because injunctions and implementations of "working from home" fluctuate greatly over time and over geographical areas without a valid way of measuring adherence.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted July 16, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 We also had to make a certain number of modeling assumptions. For some interventions, such as lockdowns, we considered a 7-day delay of implementation (which is visible on the transmission rates obtained with the Kalman filter). This choice is validated by other studies such as Dehning et al. (2020), in which the delay can be up to 15 days. However, we investigated its impact on our regression adjustment. Not considering the delay of 7 days greatly deteriorates the fits, see Section 4.2 of the Web Supplementary Materials for more details. Considering our choice to model the effect of the partial opening of schools during the May 11 th to July 4 th 2020 period as equal to 70% of the effect of a full closure could be considered as an oversimplification. Over this period following the first lockdown, schools reopened very gradually according to three different phases and school attendance growth was even more progressive. Up to June 2 nd , strong disparities in openings and attendance existed among regions, with an average of only 30% of pupils under 12 years old attending. Other levels of secondary schools ("collèges" and "lycées") reopened in early June and progressively received more students until vacation, starting in early July. Not to risk facing identifiability issues, we chose not to differentiate regions or phases of reopening and a ratio of 0.7 of closure effect was applied for all regions. Regarding the effect of curfews, we failed at demonstrating a statistically significant difference between 6PM and 8PM. This is may be due to an identifiability issue as the 8PM curfew was in place for less than 3 weeks in many regions. An alternative explanation could be that both curfews were impacting globally social gathering the same way. As for example, both are preventing most private dinners and parties (or at least reducing drastically the number of guests).
In this article, we decided to take into account the weather condition using past works of Bukhari and Jameel (2020) and Bukhari and Jameel (2020). Using separately temperature, absolute, and relative humidity was not explored in this work. All in all, we saw a clear point in including the weather condition as a variable modifying the transmission. Section 4.2 of the Web Supplementary Materials explored simpler models. All deteriorated the fits. However, we remain extremely cautious about the interpretation of the estimated effects of weather conditions and, beyond, of associated mechanisms. Finally, we consider an interaction between seasonal weather conditions and bar and restaurant closure justified by the use of terraces (which have been expanded in many places since the beginning of the pandemic). Of course other interactions can be considered and more complex models can be written but it may lead to over-fitting.
Looking back at the original data, the variant 20I/501Y.V1 (Alpha) appears to have always been predominant (over 90% at any time) compared to the two other VoCs (Beta and Gamma) considered in the French non-insular territory . Widely variable estimations of transmissibility increase for the variant 20I/501Y.V1 (Alpha) have been proposed in the scientific literature, ranging from 29% to 90% Graham et al. (2021) Davies et al. (2021) the lower values being in line with our own findings. The higher estimates could be partially explained because new VoCs appeared at the beginning of winter in England and in the USA. This may lead to confusion between weather conditions and VoCs, which both increase weather conditions, as mentioned by Campbell et al. (2021). We tested this assumption by removing weather conditions from our model and found an increased transmissibility of 43%, see Section 4.2 of the Web Supplementary Material. We retained the weather condition in our model as it greatly improved the fit.
Altogether, this work is one of the first attempts to evaluate retrospectively the effect of multiple NPIs over a period of one-year of the COVID-19 epidemic. On top of applying a novel methodology to current and important application, this work could be extended to generate "what-if" scenarios and help determine appropriate NPI implementations for future waves of infection.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted July 16, 2021. ; https://doi.org/10.1101/2021.07.09.21260259 doi: medRxiv preprint