Skip to main content
medRxiv
  • Home
  • About
  • Submit
  • ALERTS / RSS
Advanced Search

Changes in Reproductive Rate of SARS-CoV-2 Due to Non-pharmaceutical Interventions in 1,417 U.S. Counties

View ORCID ProfileJie Ying Wu, View ORCID ProfileBenjamin D. Killeen, Philipp Nikutta, Mareike Thies, Anna Zapaishchykova, Shreya Chakraborty, View ORCID ProfileMathias Unberath
doi: https://doi.org/10.1101/2020.05.31.20118687
Jie Ying Wu
1Department of Computer ScienceJohns Hopkins University Baltimore, MD, United States
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Jie Ying Wu
  • For correspondence: jieying@jhu.edu
Benjamin D. Killeen
1Department of Computer ScienceJohns Hopkins University Baltimore, MD, United States
2Laboratory for Computational Sensing and Robotics, Johns Hopkins University Baltimore, MD, United States
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Benjamin D. Killeen
Philipp Nikutta
2Laboratory for Computational Sensing and Robotics, Johns Hopkins University Baltimore, MD, United States
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Mareike Thies
2Laboratory for Computational Sensing and Robotics, Johns Hopkins University Baltimore, MD, United States
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Anna Zapaishchykova
2Laboratory for Computational Sensing and Robotics, Johns Hopkins University Baltimore, MD, United States
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Shreya Chakraborty
1Department of Computer ScienceJohns Hopkins University Baltimore, MD, United States
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Mathias Unberath
1Department of Computer ScienceJohns Hopkins University Baltimore, MD, United States
2Laboratory for Computational Sensing and Robotics, Johns Hopkins University Baltimore, MD, United States
3Malone Center for Engineering in HealthcareJohns Hopkins University Baltimore, MD, United States
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Mathias Unberath
  • Abstract
  • Full Text
  • Info/History
  • Metrics
  • Data/Code
  • Preview PDF
Loading

Abstract

In response to the rapid spread of the novel coronavirus, SARS-CoV-2, the U.S. has largely delegated implementation of non-pharmaceutical interventions (NPIs) to local governments on the state and county level. This staggered implementation combined with the heterogeneity of the U.S. complicates quantification the effect of NPIs on the reproductive rate of SARS-CoV-2.

We describe a data-driven approach to quantify the effect of NPIs that relies on county-level similarities to specialize a Bayesian mechanistic model based on observed fatalities. Using this approach, we estimate change in reproductive rate, Rt, due to implementation of NPIs in 1,417 U.S. counties.

We estimate that as of May 28th, 2020 1,177 out of the considered 1,417 U.S. counties have reduced the reproductive rate of SARS-CoV-2 to below 1.0. The estimated effect of any individual NPI, however, is different across counties. Stay-at-home orders were estimated as the only effective NPI in metropolitan and urban counties, while advisory NPIs were estimated to be effective in more rural counties. The expected level of infection predicted by the model ranges from 0 to 28.7% and is far from herd immunity even in counties with advanced spread.

Our results suggest that local conditions are pertinent to containment and re-opening decisions.

1 Introduction

As of May 28th, 2020, the United States has reported more than 1,700,000 cases of novel coronavirus 2019 (COVID-19). This disease, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, has led to more than 100,000 deaths in the U.S.1 Non-pharmaceutical interventions (NPIs) are a critical component of the public health effort to slow the spread of COVID-19 as pharmaceutical interventions are not currently available. NPIs include guidelines for hand hygiene, cancellations of mass events, school closures, closure of non-essential business, and stay-at-home orders. These measures are designed to reduce transmission, buy time to expand healthcare capacity, develop effective testing and tracing mechanisms, and research pharmaceutical options, such as a vaccine.

Indeed, the implementation of NPIs caused a rapid decline of new cases and deaths, although at a very high economic and social cost. Understanding precisely how NPIs affect disease transmission is desirable in order to safely roll back NPIs and minimize adverse effects. Prior works have quantified the effects of NPIs on the reproductive rate Rt of SARS-CoV-2 in China,2 UK,3 Brazil,4 and 14 European countries, including Italy, Spain, and Germany.5,6 Except for some large urban areas,7 these effects have not yet been quantified for the U.S., which delegated NPI implementation to state and local governments rather than establishing a unified, federal approach. As a result, it is necessary to quantify the effect of NPIs on the county level, but the limited number of documented fatalities in many counties would limit this analysis to urban areas with sufficient data for epidemiological modeling.

We develop a data-driven approach that establishes county similarity based on factors known to affect community transmission of infectious diseases. Within groups of similar counties, we jointly optimize the parameters of a Bayesian mechanistic model to the observed deaths in every county under the assumption that the same NPI attains comparable effects across counties that are similar with respect to disease transmission factors. Using this approach, we estimate the change in Rt due to implementing NPIs in 1,417 U.S. counties that exhibit substantially different characteristics regarding population density, economy, demographics, and infrastructure. We fit a model to groups of counties with similar conditions which gives us a probable Rt trajectory over time and consequently, the probable number of people infected over time.

2 Conclusions

Based on the model, as of May 28th, 2020 1,177 out of the considered 1,417 U.S. counties are estimated to have reduced the reproductive rate of novel coronavirus to below 1.0 via the implementation of NPIs. The estimated effect of any individual NPI, however, may differ across counties. We observe that for metropolitan and urban counties, the most substantial reduction is attributed to stay-at-home orders while less restrictive NPIs were estimated to be effective in more rural counties. Further, the expected level of infection predicted by the model, ranging from 0 to 28.7%, is far from herd immunity even in counties with advanced spread.

While the model explains the observed trends in fatalities well, the rapid escalation of the COVID-19 situation and quick succession in the implementation of NPIs in most counties within few days from another complicates the disentanglement of the effects of any individual NPI. Further, even if accurate, the effect attributed to any NPI may not describe the consequences of roll back of the same NPI because social norms and individuals’ behavior may have changed, including the wearing of masks in public.

Despite these limitations, our results suggest that strategies for shutdown as well as re-opening require careful consideration of county conditions in addition to state and national trends.

3 Results

Characterizing Groups of Similar Counties as Clusters

Since our hypothesis is that local conditions affect the spread of COVID-19, we differentiate among groups of counties using a data-driven approach known as clustering. Each group – or cluster – of counties is characterized by having similar demographic and socioeconomic qualities. For instance, cluster 1 consists of low-population, mostly rural counties with little public transit capacity and the lowest median household income. Cluster 2 and 3 are similar in size, having a mean population of 45,000 and 52,000 respectively, but cluster 2 includes higher-income, suburban areas where-as cluster 3 has lower income areas with a large land area. Cluster 4 consists of the densest metropolitan areas with a high proportion of 18- to 65-year-olds, high household income, a high public transit score, and small land area. Finally, cluster 5 has the highest mean population besides cluster 4, but is less densely populated, has poor public transit, and a lower household income on average. It is important to note that although we refer to these clusters with numbers 1–5, these are merely labels returned by the clustering algorithm without any meaning inherent to the ordering Figure 1. shows our clustering for all U.S. counties with this data available, including those without any incidence of COVID-19.

Figure 1:
  • Download figure
  • Open in new tab
Figure 1:

Cluster labels based on demographic and socioeconomic conditions are used to aggregate data and specialize epidemiological models. Here, one can see how cluster 1 and 3 primarily cover rural areas, while clusters 5, 2, and 4 consist of increasingly urban counties.

Once we have identified the clusters, we infer the reproductive rate of SARS-CoV-2 over time. by jointly optimizing the parameters of a Bayesian mechanistic model on each cluster. This process is described in greater detail in Section 4. Figure 2 shows an example of this progression for counties ordered by their public transportation use.

Figure 2:
  • Download figure
  • Open in new tab
Figure 2:

Relationship between public transit capacity and the time-dependent reproductive rate of SARS-CoV-2 for U.S. counties. Colors correspond to clusters as in Figure 1. In the cluster of counties that rely least on public transport, shown in yellow, we observe a substantial drop in reproductive rate already on 3/15/20, possibly indicating a strong effect of advisory NPIs that were in effect at that time.

We observe that although the basic reproductive rate of SARS-CoV-2 starts at a similar level for all clusters, the speed at which counties in each cluster respond to the disease, and reduce its Rt differs. The reproductive rate in metropolitan counties (cluster 4), which tends to have higher reliance on public transportation decreases over the entire period. This is especially apparent going from March 15 to March 25. On the other hand, cluster with little transportation use, such as cluster 5, was estimated to have quickly reduced transmission rates. This suggests that if people have a higher reliance on public infrastructure, more stringent interventions are necessary to reduce transmission. The staggered drops in reproductive rates are attributed to both varied efficacy of NPIs (to be discussed in the next section) and counties implementing NPIs at different times. Comparisons with more features are shown in Section 4.

Estimates of Initial and Current Reproductive Rates and Number of Infected

From Table 1, we observe that most counties exhibited an initial reproductive rate R0 above 3. As of May 28th, however, most have successfully reduced the reproductive rate to Rt ≈ 1 after implementing NPIs. It is estimated that 18.4 to 42.7% of the population (95% confidence interval) has been infected in New York, NY, which has the highest number of cases. Based on the initial reproductive rates between 2 and 4, herd immunity is reached only after 50 – 70% of the population has recovered,8,9 suggesting that all U. S. counties are far from resilience. Consequently, easing restrictions is likely to result in subsequent waves of the epidemic. We state these findings for 15 heterogeneous counties in Table 1 and provide the same metrics for all 1,417 counties online at https://github.com/JieYingWu/npi-model, where we also provide code and data required for reproduction.

View this table:
  • View inline
  • View popup
Table 1:

Estimated initial and current reproductive rate as of May 28th, 2020, and the number of cases for select counties ordered by their reproductive rate as of May 28th. This is compared to the measured number of cases and fatality rates.

Learned Effects of NPIs

Using a cluster-specialized model, we quantify the effectiveness of NPIs, as shown below in Table 2 and Table 3.

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 2:

The learned weights of the interventions for each cluster-specialized model, showing mean with standard deviation in parentheses. Note how the interventions have different effects for the different clusters.

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 3: Without a cluster-specialized model, information about the effectiveness of local interventions is not as pronounced.

We note that our model estimates different behavior across counties. While all counties have implemented a similar set of interventions, their estimated effects are substantially different in each respective cluster. For example, metropolitan counties (cluster 4) were estimated to have a strong response to stay-at-home orders, while rural and suburban areas (clusters 1, 2, 3, and 5) responded more to national-level interventions according to our model.

One caveat is that the effects of interventions that were implemented close together are difficult to disentangle. Many local governments implemented formal NPIs in immediate response to the federal guidelines, leading a quick succession of NPIs coming into effect. Disentanglement of the effects of NPIs is explored more in Section 4.

Another limitation of our model is that the federal level interventions may have come before some counties have seen any cases. This introduces ambiguity between R0 estimates and the weight attributed to federal NPIs, i.e., federal guidelines and travel ban. Similar ambiguity is less prominent for other interventions, since they are not generally implemented before a county has seen cases. Additionally, since the epidemiological model used here is mechanistic, it only allows for changes of the transmission rate at the time of implementation of interventions. Other events, such as high-profile cases and cancellations of prominent festivals, likely contributed to increasing awareness of the disease and may also have effects on individual behavior, and therefore the Rt. These effects cannot be attributed to a specific date and may thus affect the weights of interventions that come into effect at around the same time.

4 Methods

Method Overview

Figure 3 shows an overview of the proposed method. We train a Bayesian mechanistic model (BMM) model in a cluster-specialized manner and compare with the baseline model trained on all counties. We show the stability of our model in two ways, first by withholding random days during training and second by withholding given counties. We compare the predictions based on incomplete data with those based on complete data.

Figure 3:
  • Download figure
  • Open in new tab
Figure 3:

The fitting process for our model. We compare the performance of the baseline model (a), which is fit to all eligible counties, with cluster-specialized models BHI 1–5. (b) BMM 1-5. (c) Two validation methods for our model. Validation A tests the stability of the model by withholding 3 random days and comparing the outputs. Validation B withholds a set of counties for comparison rather than days.

County-level Clustering

Quantifying changes in COVID-19’s reproductive rate is complicated when considering differences at the county – rather than the national – level. Parametric epidemiological models are optimized to describe fatality counts, the volume and reliability of which decreases outside of highly populated regions. To make up for this scarcity, we leverage a balanced clustering of U.S. counties to aggregate data from similar counties in the same state, treating them as a single “super-county,” if they have identical NPI implementation dates. This has the advantage of considering counties that would otherwise be excluded without assuming that the spread of the disease in those counties follows the same trend of more advanced regions in the same state or country. Figure 4 shows how for a given state-in this case Texas, and a given cluster, the counties with 1–49 cumulative deaths from COVID-19 are treated as a single entity. In addition to this data aggregation strategy, we fit a cluster-specialized model to each set of counties, quantifying the possibly disparate effects of the NPIs in each type of county, as detailed below.

Figure 4:
  • Download figure
  • Open in new tab
Figure 4:

(a) The total confirmed deaths caused by COVID-19 for counties in Texas, as of May 28th. (b) Cluster labels for each Texas county, based on demographics, education, density, and other factors. (c) Texas counties in cluster 1 having 1–49 cumulative deaths as of May 28th, 2020 and the same NPI implementation dates.. To enable robust epidemiological models, these counties are treated as a single “super-county.”

To generate this clustering, we partition 3,059 U.S. counties into five clusters based on variables which directly affect disease spread, using a Gaussian mixture model.10 Table 4 summarizes these variables, which include demographic, economic, and public transit capacities that we have gathered, processed for machine readability, and released in a publicly available dataset.11 Sources include the United States Census Bureau, the United States Department of Agriculture Economic Research Service and the Center for Neighborhood Technology. A full list of sources can be found on the corresponding website.11 To incorporate potential exposure, we consider county population density, housing density, and land area. Additionally, we consider portions of the population for age- and gender-based demographic categories, due to COVID-19’s disparate effects on these groups.12–16

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 4: Average values for each of the 16 variables considered in our clustering, capturing demographic and socioeconomic information as well as transit and health care capacity.

Our clustering is also based on socioeconomic variables, which may indicate behavioral traits relevant to the spread of COVID-19. For instance, workers with tertiary education are more likely to hold office-type jobs which can be done from home.17 At the same time, many secondary-education jobs have been deemed essential, resulting in a high contact rate, which in turn increases the likelihood of infection. Thus, our clustering considers college education, poverty, unemployment, and median household income for each county as a reflection of the overall job composition in the local area. Finally, we include a population-weighted transit score due to the likelihood of transmission in the enclosed, possibly crowded space that public transport entails.

Figure 5 and Figure 6 show the plotted over median income and density for counties and super-counties in the different clusters. Super-counties are visualized as a single point using their population-weighted average for that feature. The plots show the distribution of the cluster over the features and its correlation with how Rt changes.

Figure 5:
  • Download figure
  • Open in new tab
Figure 5:

Scatterplot and density distribution plot for counties and super-counties comparing Rt over time to median household income. Colors indicate which cluster the county or super-county belongs in, as indicated by Figure 4.

Figure 6:
  • Download figure
  • Open in new tab
Figure 6:

Scatterplot and density distribution plot for counties and super-counties comparing Rt over time to population density. Colors indicate which cluster the county or super-county belongs in, as indicated by Figure 4.

Notably, we exclude ethnic demographics from the variables considered during clustering. because we assume that no direct relationship exists between race and incidence of COVID-19. Rather, such a relationship would be mediated by socioeconomic factors that are already included for similarity assessment.

Modelling the Effects of NPIs

Data Processing

We use cumulative fatality and infection counts from the JHU CSSE COVID-19 Dashboard, which has been tracking COVID-19 since January.1 When fitting our model, we use measured fatality rates, which are generally considered more reliable than confirmed infections because of limited testing and the prevalence of asymptomatic cases. Thus, we use population-weighted fatality rates to estimate the true cases count. Obtaining a reasonable estimate for this ratio is crucial to realistically model the numbers of total infections. However, due to asymptomatic cases, undertesting and biased reporting, this parameter cannot be measured directly, but has to be inferred from observable data.19–21 Previous studies all report fatality rates with substantial uncertainty but agree on the fact that fatality for COVID-19 depends strongly on the age of the infected person. Therefore, we adapt the fatality rates per age group presented in Verity et al.19 for each county with respect to its demographic age distribution. Based on U.S. Census data, a per-county weighted fatality rate is computed using the share of each age group in the overall population.

Model

We estimate the effective reproductive rate using a semi-mechanistic Bayesian mechanistic model proposed in Flaxman et al.5,22, that infers the impact of a predefined set of interventions and estimates the number of infections over time. The model estimates a county-specific initial reproductive rate R0, m for the m-th county and intervention weights αi, the effect of which is assumed constant for all counties included in joint optimization. These effects are assumed multiplicative, modeled as Embedded Image where m is the county index and Embedded Image is a binary indicator for intervention i being in place at time t. The interventions we take into account here are summarized in Table 1.

The model assumes a normal distribution as the prior for the R0. We set the prior on R0∽N(3.28, ∣κ∣) where κ ∽ N+(0, 0.5), The value of 3.28 is in accordance with the analysis presented in Liu et al.23

Starting from the time-varying Rt, a latent function of daily infections is modeled depending on a number of factors: a generation distribution g with density g(τ) that models the time between spread of infection from an individual to the next (approximated as the serial interval distribution), the number of susceptible individuals left in the population, an infection-to-death distribution, and the county-specific time-varying reproduction number that models the average number of secondary infections at any given time. The parameters for the distributions are chosen in accordance to Flaxman et al.5

Model fitting is driven by the timeseries of observed daily deaths. These are linked to the modelled number of infections by the county specific weighted fatality rate. The sum of past infections, along with the weighted probability of death gives the number of deaths on a day for a given county. To ensure that the deaths accounted for are from locally acquired infections, we include observed deaths in a county only after the cumulative count has exceeded 10. The seeding of new infections is assumed to be a month prior to that.

All parameters are estimated jointly using an adaptive Hamiltonian Monte Carlo (HMC) sampler in the probabilistic programming language Stan.24

Validation

We propose three validation schemes to validate our approach. First, we show that our model is stable, so a small change in the input produces a small change in the output. Second, we separate our counties and super-counties into train and test sets. We fit models for each cluster as well as for all counties and super-counties on either, both the train and test set or with only the train set. We evaluate using the parameters of the “correct” cluster model to predict fatalities for the held-out regions. Comparing these predictions to those made when the regions are included during training confirms that the model is not over-reliant on each data point. Third, we evaluate how well our model discerns the effects of individual NPIs and discuss biases it may have to attribute more weight to certain NPIs.

Validating the Stability of the Model

To show stability in our model, we train the model, withholding three non-consecutive days chosen at random, and compare the predictions for these days to the baseline model fit on full data. Only days with non-zero death counts are selected as potential leave-out candidates. We set the threshold for data selection at counties with 50 cumulative deaths by May 28th, which results in 211 counties. (For model validation, we do not use super-counties.) We fit the model over 300 total iterations, with 150 of those as warmup iterations. From the uncertainty distributions obtained as results from the models, we use the expected values of the deaths of the held-out days to compare the two models and report our observations in Figure 7a. The clustering of points around the optimal fit visually confirms that the expected values of the baseline model fit those from the model with held-out days. In Figure 7b, we compare the average county-wise expected value of deaths of the withheld days and observe a more concentrated distribution around the line of optimal fit.

Using the Pearson-correlation coefficient to quantify the similarity between the expected values, we find r-values greater than 0.99 for the held-out days and the average held-out days which suggest that the model is robust against small input perturbations.

Figure 7:
  • Download figure
  • Open in new tab
Figure 7:

The predicted values of the validation and the baseline model. (a) Predictions for withheld days, as shown by the optimal fit, are remarkably close to the true values. (b) The three held-out days per county are averaged and compared.

Validating the Advantages of Clustering

We show the advantage of cluster-specialized models for quantifying the effects of NPIs by comparing their performance with models trained on different clusters or on the national level. While a clustering of U.S. counties may or may not be interesting, its value related to COVID-19 comes from its ability to identify epidemiologically meaningful differences among various regions in the country. When fitting each model, we withhold a validation region from each cluster and use the remaining counties to fit Equation(1), for both cluster-specialized and baseline models. We then use the learned weights αi to initialize a fixed-α BHI model for each withheld region.

With this scheme, we show the advantage of clustering U.S. counties in three ways. First, we show that within the same cluster, we obtain comparable predictions for a county whether or not it is included in training. Second, we observe that α values learned from a different cluster produce substantially different predictions. Finally, we apply the α values learned at the national level to similar effect. This demonstrates how cluster-specialized models can reveal trends in the spread of COVID-19 that are not apparent under the assumption that NPIs have universal effect at the national or state level.

Figure 8 shows the first and second validation strategy for cluster 3, which consists of less-populous counties with large land area. For such regions, which necessarily have fewer cases to support local models, it is vital to gain accurate insight from the entire training pool and verify these insights through the aforementioned validation process. Thus, in this illustrative example, we use data up to May 18th, at which point the continuing spread of cases was not immediately clear. Indeed, Figure 8a shows the predictions obtained for seven counties in Arizona (treated as a single super-county). One can readily observe the model takes into account random outliers while following a general upward trend. Figure 8b exhibits less awareness of outliers but still follows the same trend, despite using α values obtained from training on cluster 3 with these same counties withheld. On the other hand, Figure 8c shows the predictions made when using the wrong α values, from cluster 4, which consists of densely populated city centers. As can be seen, this results in a markedly different prediction for the trend of the disease than actually occurred up to May 28th in Figure 8b. This is a critical difference, which could result in very different decisions for implementing or removing NPIs.

Figure 8:
  • Download figure
  • Open in new tab
Figure 8:

The fatalities predicted by our model for seven counties in Arizona (treated as a single super-county), with fatalities data from Apache, Cochise, La Paz, Mohave, Navajo, Yavapai, and Yuma counties, using data up to May 18th. (a) When the super-county is included in training, the cluster-specialized BHI model incorporates outlier events in deaths from COVID-19 while still describing the overall trend. (b) When the super-county is withheld from training but uses α values from the same cluster, the model still captures the same overall trend of increasing deaths. (c) On the other hand, using values from a different cluster makes it appear as though the curve has flattened, even though (d) shows the continued upward trend based on data up to May 28th.

Finally, we compare the performance of our cluster-specialized model with the baseline model, which uses all eligible counties for training. In many cases, this difference is slight, and both the baseline and cluster-specialized models exhibit similar trends for a given region. However, for certain areas the difference can be profound. Take, for example, the District of Columbia, which has become a significant hot-spot for COVID-19. The baseline model, shown in Figure 9a, has only recently flattened, whereas the cluster-specialized model in Figure 9b reflects the reality that efforts to combat the disease have had much greater effect, significantly reducing the number of deaths. This is because when forced to accommodate counties across the U.S., the baseline model emphasizes federal guidelines with a mean α value of 0.804 (see Table 3). On the other hand, our cluster-specialized model found that stay-at-home orders were much more effective for counties in cluster 4, with a mean α value of 0.963. This difference illustrates the advantage of specializing a BHI model based county-level characteristics; it allows the model to include a greater number of counties within separate clusters while not being forced to overgeneralize and, in doing so, compromise some certainty.

Figure 9:
  • Download figure
  • Open in new tab
Figure 9:

Predicted fatalities in Washington D.C. up to May 28th, 2020. (a) The fatalities predicted by the baseline model, which observes all eligible counties during training. (b) The fatalities predicted by a cluster-specialized model, showing a very different trend in the course of the spread of COVID-19. Although this difference is more pronounced for the District of Columbia than for most regions, it underscores the potential for misleading predictions when the United States’ heterogeneity is not taken into account.

Disentanglement

One drawback of thes model is that it cannot disentangle implementations that came into effect at the same time. For example, states often closed public schools at the same time as federal guidelines were issued so it is difficult to discern the individual effect on reducing Rt. Additionally, in Table 5, we observe that banning gatherings of 50 or more people often occurs at the same time as banning gatherings 500 or more, and restaurants and entertainment venues are often closed together. This suggests that these pairs of interventions may not or be poorly disentangled.

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 5: How far apart on average each intervention is implemented from each other. The interventions are in the order defined in Table 2

To further investigate the model’s ability to disentangle intervention weights, we create simulated trajectories of counties’ deaths and cases counts based on their R0 and the dates on which the interventions came into effect. Using all counties that have more than 50 cumulative deaths on May 28th without super-counties, we seed each county with 200 cases in each of the first 6 days. To simulate county-specific trajectories, we construct a sets of generated timeseries. In the first set, we assign intervention weights αi to be randomly generated from a Gamma distribution, the same distribution as our prior on the Bayesian mechanistic model adjusted to be in the range of our learned weights. We then calculate what the Rt on each day must have been based on the R0 and the interventions in place. Once we have the seeded infection and the Rt trajectory for each county, we can calculate daily infections and thus expected fatalities. Using the simulated trajectories, we fit the model. Table 6 compares the weights used for generation with the weights that the model learned.

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 6: By setting the intervention weights, we can generate simulated timeseries of cases and deaths counts and have the model learn the weights. The learned values differ from the ground truth intervention weights, showing that the model does not disentangle the contribution of each intervention well in cases where interventions were implemented close together..

We observe that the effects of individual NPIs are not well disentangled in general. The model tends to attribute more weight to few NPIs rather than spread out the weight evenly. Specifically, the model tends to put more weight on stay-at-home orders. This may be because interventions I2-I8 are often implemented close together (see Table 5) and it is difficult to attribute effect to any single one of them on a national scale. While we can conclude that the trajectories the model predicts are reliable, due to their match to measured death, and therefore the overall decrease in Rt is reliable, attributing decreases to any individual NPI is challenging whenever the difference in implementation date is small. This observation seems to be in line with the similarly large confidence intervals reported in previous work on varied models and regions.2–7

Data Availability

All data used in the manuscript is available from public sources.

https://github.com/JieYingWu/COVID-19_US_County-level_Summaries

https://coronavirus.jhu.edu

Authors’ Contributions

Jie Ying Wu worked on the modeling, disentanglement, and coordinating efforts. Benjamin D Killeen worked on clustering and validating the epidemiological advantage of clustering. Philipp Nikutta worked on processing the data and validating the stability of the model. Mareike Thies did the background research and calculated the weighted fatalities. Anna Zapaishchykova created the methods overview diagram as well as the plots over the features. Shreya Chakraborty worked on processing the data and disentanglement. Mathias Unberath developed the initial concept and direction for this work. Jie Ying Wu, Benjamin D Killeen, and Mathias Unberath wrote the manuscript.

Competing Interests

The authors have no competing interests.

Acknowledgements

This article was supported in part by institutional funds provided by the Malone Center for Engineering in Healthcare.

Footnotes

  • jieying{at}jhu.edu

  • killeen{at}jhu.edu

  • pnikutta1{at}jhu.edu

  • mthies1{at}jhu.edu

  • azapais1{at}jhu.edu

  • schakr20{at}jhu.edu

  • mathias{at}jhu.edu

References

  1. 1.↵
    Dong, E., Du, H. & Gardner, L. An Interactive Web‐Based Dashboard to Track COVID‐19 in Real Time. Lancet Infect. Dis. 20, 533–534 (2020).
    OpenUrlCrossRefPubMed
  2. 2.↵
    Lai, S. et al. Effect of non‐pharmaceutical interventions for containing the COVID‐19 outbreak in China. medRxiv (2020) doi:10.1101/2020.03.03.20029843.
    OpenUrlAbstract/FREE Full Text
  3. 3.↵
    Davies, N. G. et al. The effect of non‐pharmaceutical interventions on COVID‐19 cases, deaths and demand for hospital services in the UK: a modelling study. medRxiv (2020) doi:10.1101/2020.04.01.20049908.
    OpenUrlAbstract/FREE Full Text
  4. 4.↵
    Mellan, T. A. et al. Report 21: Estimating COVID‐19 cases and reproduction number in Brazil. medRxiv (2020) doi:10.1101/2020.05.09.20096701.
    OpenUrlAbstract/FREE Full Text
  5. 5.↵
    Flaxman, S. et al. Estimating the Number of Infections and the Impact of Non‐ Pharmaceutical Interventions on COVID‐19 in European Countries: Technical Description Update. arXiv:2004.11342 [stat] (2020).
  6. 6.↵
    Vollmer, M. A. C. et al. A sub‐national analysis of the rate of transmission of COVID‐19 in Italy. medRxiv (2020) doi:10.1101/2020.05.05.20089359.
    OpenUrlAbstract/FREE Full Text
  7. 7.↵
    Fernández‐Villaverde, J. & Jones, C. I. Estimating and Simulating a SIRD Model of COVID‐ 19 for Many Countries, States, and Cities. (2020).
  8. 8.↵
    Randolph, H. E. & Barreiro, L. B. Herd Immunity: Understanding COVID‐19. Immunity 52, 737–741 (2020).
    OpenUrlPubMed
  9. 9.↵
    Kwok, K. O., Lai, F., Wei, W. I., Wong, S. Y. S. & Tang, J. W. T. Herd immunity ‐ estimating the level required to halt the COVID‐19 epidemics in affected countries. J. Infect. 80, e32–e33 (2020).
    OpenUrlCrossRef
  10. 10.↵
    Reynolds, D. Gaussian Mixture Models. in Encyclopedia of Biometrics (eds. Li, S. Z. & Jain, A.) 659–663 (Springer US, 2009). doi:10.1007/978-0-387-73003-5_196.
    OpenUrlCrossRef
  11. 11.↵
    Killeen, B. D. et al. A County‐Level Dataset for Informing the United States’ Response to COVID‐19. arXiv:2004.00756 [physics, q‐bio] (2020).
  12. 12.↵
    Bialek, S. et al. Severe Outcomes Among Patients with Coronavirus Disease 2019 (COVID‐19) — United States, February 12–March 16, 2020. MMWR. Morb. Mortal. Wkly. Rep. 69, 343–346 (2020).
    OpenUrlPubMed
  13. 13.
    Remuzzi, A. & Remuzzi, G. COVID‐19 and Italy: What Next? Lancet 395, 1225–1228 (2020).
    OpenUrlPubMed
  14. 14.
    Epidemiology Working Group for NCIP Epidemic Response & Chinese Center for Disease Control and Prevention. The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID‐19) in China. Zhonghua Liu Xing Bing Xue Za Zhi 41, 145–151 (2020).
    OpenUrlCrossRefPubMed
  15. 15.
    Lee, P.-I., Hu, Y.-L., Chen, P.-Y., Huang, Y.-C. & Hsueh, P.-R. Are Children Less Susceptible to COVID‐19? J. Microbiol. Immunol. Infect. (2020) doi:10.1016/j.jmii.2020.02.011.
    OpenUrlCrossRef
  16. 16.↵
    Ruan, Q., Yang, K., Wang, W., Jiang, L. & Song, J. Clinical Predictors of Mortality Due to COVID‐19 Based on an Analysis of Data of 150 Patients from Wuhan, China. Intensive Care Med. 46, 846–848 (2020).
    OpenUrlCrossRefPubMed
  17. 17.↵
    von Gaudecker, H.-M., Holler, R., Janys, L., Siflinger, B. & Zimpelmann, C. Labour Supply in the Early Stages of the COVID‐19 Pandemic: Empirical Evidence on Hours, Home Office, and Expectations. IZA Discuss. Pap. No. 13158 (2020).
  18. 18.
    van Dorn, A., Cooney, R. E. & Sabin, M. L. COVID‐19 Exacerbating Inequalities in the US. Lancet 395, 1243–1244 (2020).
    OpenUrlCrossRefPubMed
  19. 19.↵
    Verity, R. et al. Estimates of the severity of coronavirus disease 2019: a model‐based analysis. Lancet Infect. Dis. 20, 669–677 (2020).
    OpenUrlCrossRefPubMed
  20. 20.
    Russell, T. W. et al. Estimating the infection and case fatality ratio for coronavirus disease (COVID‐19) using age‐adjusted data from the outbreak on the Diamond Princess cruise ship, February 2020. Euro Surveill. 25, (2020).
  21. 21.↵
    Rinaldi, G. & Paradisi, M. An empirical estimate of the infection fatality rate of COVID‐19 from the first Italian outbreak. medRxiv (2020) doi:10.1101/2020.04.18.20070912.
    OpenUrlAbstract/FREE Full Text
  22. 22.↵
    Flaxman, S. et al. Report 13: Estimating the number of infections and the impact of non‐ pharmaceutical interventions on COVID‐19 in 11 European countries. (2020) doi:10.25561/77731.
    OpenUrlCrossRef
  23. 23.↵
    Liu, Y., Gayle, A. A., Wilder‐Smith, A. & Rocklöv, J. The reproductive number of COVID‐19 is higher compared to SARS coronavirus. J. Travel Med. 27, (2020).
  24. 24.↵
    Carpenter, B. et al. Stan : A Probabilistic Programming Language. J. Stat. Softw. 76, (2017).
Back to top
PreviousNext
Posted June 02, 2020.
Download PDF
Data/Code
Email

Thank you for your interest in spreading the word about medRxiv.

NOTE: Your email address is requested solely to identify you as the sender of this article.

Enter multiple addresses on separate lines or separate them with commas.
Changes in Reproductive Rate of SARS-CoV-2 Due to Non-pharmaceutical Interventions in 1,417 U.S. Counties
(Your Name) has forwarded a page to you from medRxiv
(Your Name) thought you would like to see this page from the medRxiv website.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Share
Changes in Reproductive Rate of SARS-CoV-2 Due to Non-pharmaceutical Interventions in 1,417 U.S. Counties
Jie Ying Wu, Benjamin D. Killeen, Philipp Nikutta, Mareike Thies, Anna Zapaishchykova, Shreya Chakraborty, Mathias Unberath
medRxiv 2020.05.31.20118687; doi: https://doi.org/10.1101/2020.05.31.20118687
Digg logo Reddit logo Twitter logo Facebook logo Google logo LinkedIn logo Mendeley logo
Citation Tools
Changes in Reproductive Rate of SARS-CoV-2 Due to Non-pharmaceutical Interventions in 1,417 U.S. Counties
Jie Ying Wu, Benjamin D. Killeen, Philipp Nikutta, Mareike Thies, Anna Zapaishchykova, Shreya Chakraborty, Mathias Unberath
medRxiv 2020.05.31.20118687; doi: https://doi.org/10.1101/2020.05.31.20118687

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
  • Tweet Widget
  • Facebook Like
  • Google Plus One

Subject Area

  • Infectious Diseases (except HIV/AIDS)
Subject Areas
All Articles
  • Addiction Medicine (216)
  • Allergy and Immunology (495)
  • Anesthesia (106)
  • Cardiovascular Medicine (1101)
  • Dentistry and Oral Medicine (196)
  • Dermatology (141)
  • Emergency Medicine (274)
  • Endocrinology (including Diabetes Mellitus and Metabolic Disease) (502)
  • Epidemiology (9785)
  • Forensic Medicine (5)
  • Gastroenterology (481)
  • Genetic and Genomic Medicine (2319)
  • Geriatric Medicine (223)
  • Health Economics (463)
  • Health Informatics (1563)
  • Health Policy (737)
  • Health Systems and Quality Improvement (606)
  • Hematology (238)
  • HIV/AIDS (507)
  • Infectious Diseases (except HIV/AIDS) (11657)
  • Intensive Care and Critical Care Medicine (617)
  • Medical Education (240)
  • Medical Ethics (67)
  • Nephrology (258)
  • Neurology (2149)
  • Nursing (134)
  • Nutrition (338)
  • Obstetrics and Gynecology (427)
  • Occupational and Environmental Health (518)
  • Oncology (1183)
  • Ophthalmology (366)
  • Orthopedics (129)
  • Otolaryngology (220)
  • Pain Medicine (148)
  • Palliative Medicine (50)
  • Pathology (313)
  • Pediatrics (698)
  • Pharmacology and Therapeutics (302)
  • Primary Care Research (267)
  • Psychiatry and Clinical Psychology (2188)
  • Public and Global Health (4674)
  • Radiology and Imaging (781)
  • Rehabilitation Medicine and Physical Therapy (457)
  • Respiratory Medicine (624)
  • Rheumatology (274)
  • Sexual and Reproductive Health (226)
  • Sports Medicine (210)
  • Surgery (252)
  • Toxicology (43)
  • Transplantation (120)
  • Urology (94)