## Abstract

The paper evaluates the dynamic impact of various policies adopted by US states on the growth rates of confirmed Covid-19 cases and deaths as well as social distancing behavior measured by Google Mobility Reports, where we take into consideration people’s voluntarily behavioral response to new information of transmission risks in a causal structural model framework. Our analysis finds that both policies and information on transmission risks are important determinants of Covid-19 cases and deaths and shows that a change in policies explains a large fraction of observed changes in social distancing behavior. Our main counterfactual experiments suggest that nationally mandating face masks for employees early in the pandemic could have reduced the weekly growth rate of cases and deaths by more than 10 percentage points in late April and could have led to as much as 19 to 47 percent less deaths nationally by the end of May, which roughly translates into 19 to 47 thousand saved lives. We also find that, without stay-at-home orders, cases would have been larger by 6 to 63 percent and without business closures, cases would have been larger by 17 to 78 percent. We find considerable uncertainty over the effects of school closures due to lack of cross-sectional variation; we could not robustly rule out either large or small effects. Overall, substantial declines in growth rates are attributable to private behavioral response, but policies played an important role as well. We also carry out sensitivity analyses to find neighborhoods of the models under which the results hold robustly: the results on mask policies appear to be much more robust than the results on business closures and stay-at-home orders. Finally, we stress that our study is observational and therefore should be interpreted with great caution. From a completely agnostic point of view, our findings uncover predictive effects (association) of observed policies and behavioral changes on future health outcomes, controlling for informational and other confounding variables.

## 1. Introduction

Accumulating evidence suggests that various policies in the US have reduced social interactions and slowed down the growth of Covid-19 infections^{1} An important outstanding issue, however, is how much of the observed slow down in the spread is attributable to the effect of policies as opposed to a voluntarily change in people’s behavior out of fear of being infected. This question is critical for evaluating the effectiveness of restrictive policies in the US relative to an alternative policy of just providing recommendations and information such as the one adopted by Sweden. More generally, understanding people’s dynamic behavioral response to policies and information is indispensable for properly evaluating the effect of policies on the spread of Covid-19.

This paper quantitatively assesses the impact of various policies adopted by US states on the spread of Covid-19, such as non-essential business closure and mandatory face masks, paying particular attention to how people adjust their behavior in response to policies as well as new information on cases and deaths.

We present a conceptual framework that spells out the causal structure on how the Covid-19 spread is dynamically determined by policies and human behavior. Our approach explicitly recognizes that policies not only directly affect the spread of Covid-19 (e.g., mask requirement) but also indirectly affect its spread by changing people’s behavior (e.g., stay- at-home order). It also recognizes that people react to new information on Covid-19 cases and deaths and voluntarily adjust their behavior (e.g., voluntary social distancing and hand washing) even without any policy in place. Our casual model provides a framework to quantitatively decompose the growth of Covid-19 cases and deaths into three components: (1) direct policy effect, (2) policy effect through behavior, and (3) direct behavior effect in response to new information.

Guided by the causal model, our empirical analysis examines how the weekly growth rates of confirmed Covid-19 cases and deaths are determined by (the lags of) policies and behavior using US state-level data. To examine how policies and information affect people’s behavior, we also regress social distancing measures on policy and information variables. Our regression specification for case and death growths is explicitly guided by a SIR model although our causal approach does not hinge on the validity of a SIR model.

As policy variables, we consider mandatory face masks for employees in public businesses, stay-at-home orders (or shelter-in-place orders), closure of K-12 schools, closure of restaurants except take out, closure of movie theaters, and closure of non-essential businesses. Our behavior variables are four mobility measures that capture the intensity of visits to “transit,” “grocery,” “retail,” and “workplaces” from Google Mobility Reports. We take the lagged growth rate of cases and deaths and the log of lagged cases and deaths at both the state-level and the national-level as our measures of information on infection risks that affects people’s behavior. We also consider the growth rate of tests, month dummies, and state-level characteristics (e.g., population size and total area) as confounders that have to be controlled for in order to identify the causal relationship between policy/behavior and the growth rate of cases and deaths.

Our key findings from regression analysis are as follows. We find that both policies and information on past cases and deaths are important determinants of people’s social distancing behavior, where policy effects explain more than 50% of the observed decline in the four behavior variables^{2} Our estimates suggest that there are both large policy effects and large behavioral effects on the growth of cases and deaths. Except for mandatory masks, the effect of policies on cases and deaths is indirectly materialized through their impact on behavior; the effect of mandatory mask policy is direct without affecting behavior.

Using the estimated model, we evaluate the dynamic impact of the following three counterfactual policies on Covid-19 cases and deaths: (1) mandating face masks, (2) allowing all businesses to open, and (3) not implementing a stay-at-home order. The counterfactual experiments show a large impact of those policies on the number of cases and deaths. They also highlight the importance of voluntary behavioral response to infection risks when evaluating the dynamic policy effects.

Our estimates imply that nationally implementing mandatory face masks for employees in public businesses on March 14th would have reduced the growth rate of cases and that of deaths by approximately 10 percentage points in late April. As shown in Figure 1, this leads to reductions of 21% and 34% in cumulative reported cases and deaths, respectively, by the end of May with 90 percent confidence intervals of [9, 32]% and [19, 47]%, which roughly implies that 34 thousand lives could have been saved. This finding is significant: given this potentially large benefit of reducing the spread of Covid-19, mandating masks is an attractive policy instrument especially because it involves relatively little economic disruption. These estimates contribute to the ongoing efforts towards designing approaches to minimize risks from reopening (Stock, 2020b).

Figure 2 illustrates how never closing any businesses (no movie theater closure, no nonessential business closure, and no closure of restaurants except take-out) could have affected cases and deaths. We estimate that business shutdowns have roughly the same impact on growth rates as mask mandates, albeit with more uncertainty. The point estimates indicate that keeping all businesses open could have increased cumulative cases and deaths by 40% at the end of May (with 90 percent confidence intervals of [17, 78]% for cases and [1,97]% for deaths).

Figure 3 shows that stay-at-home orders had effects of similar magnitude as business closures. No stay-at-home orders could have led to 37% more cases by the start of June with a 90 percent confidence interval given by 6% to 63%. The estimated effect of no stay- at-home orders on deaths is a slightly smaller with a 90 percent confidence interval of –7% to 50%.

We also conducted sensitivity analysis with respect to changes to our regression specification, sample selection, methodology, and assumptions about delays between policy changes and changes in recorded cases. In particular, we examined whether certain effect sizes can be ruled out by various more flexible models or by using alternative timing assumptions that define forward growth rates. The impact of mask mandates is more robustly and more precisely estimated than that of business closure policies or stay-at-home orders, and an undesirable effect of increasing the weekly death growth by 5 percentage points is ruled out by all of the models we consider^{3} This is largely due to the greater variation in the timing of mask mandates across states. The findings of shelter-in-place and business closures policies are considerably less robust. For example, for stay-at-home mandates, models with alternative timing and richer specification for information set suggested smaller effects. Albeit after application of machine learning tools to reduce dimensionality, the range of effects [0, 0.15] could not be ruled out. A similar wide range of effects could not be ruled out for business closures.

We also examine the impact of school closures. Unfortunately, there is very little variation across states in the timing of school closures. Across robustness specifications, we obtain point estimates of the effect of school closures as low as 0 and as high as -0.6. In particular, we find that the results are sensitive to whether the number of past national cases/deaths is included in a specification or not. This highlights the uncertainty regarding the impact of some policies versus private behavioral responses to information.

This figure shows the estimated relative change in cumulative cases and deaths if all states had mandated masks on March 14th. The thick blue line is the estimated change in cumulative cases or deaths relative to the observed number of cases or deaths. The shaded region is a pointwise 90% confidence band.

This figure shows the estimated relative change in cases and deaths if no states had ever implemented any business closure policies. The thick blue line is the estimated change in cumulative cases or deaths relative to the observed number of cumulative cases or deaths. The shaded region is a pointwise 90% confidence band.

This figure shows the estimated relative change in cases and deaths if no states had ever issued stay at home orders. The thick blue line is the estimated change in cumulative cases or deaths relative to the observed number of cumulative cases or deaths. The shaded region is a point wise 90% confidence band.

A growing number of other papers have examined the link between non-pharmaceutical interventions and Covid-19 cases.^{4} ,Hsiang et al. (2020) estimate the effect of policies on the growth rate of cases using data from the United States, China, Iran, Italy, France, and South Korea. In the United States, they find that the combined effect of all policies they consider on the growth rate is –0.347 (0.061). Courtemanche et al. (2020) use US county level data to analyze the effect of interventions on case growth rates. They find that the combination of policies they study reduced growth rates by 9.1 percentage points 16-20 days after implementation, out of which 5.9 percentage points are attributable to shelter in place orders. Both Hsiang et al. (2020) and Courtemanche et al. (2020) adopt a reduced- form approach to estimate the total policy effect on case growth without using any social distancing behavior measures.^{5}

Existing evidence for the impact of social distancing policies on behavior in the US is mixed. Abouk and Heydari (2020) employ a difference-in-differences methodology to find that statewide stay-at-home orders have strong causal impacts on reducing social interactions. In contrast, using data from Google Mobility Reports, Maloney and Taskin (2020) find that the increase in social distancing is largely voluntary and driven by information.^{6} Another study by Gupta et al. (2020) also found little evidence that stay-at-home mandates induced distancing by using mobility measures from PlaceIQ and SafeGraph. Using data from SafeGraph, Andersen (2020) shows that there has been substantial voluntary social distancing but also provide evidence that mandatory measures such as stay-at-home orders have been effective at reducing the frequency of visits outside of one’s home.

Pei, Kandula, and Shaman (2020) use county-level observations of reported infections and deaths in conjunction with mobility data from SafeGraph to conduct simulation of implementing all policies 1-2 weeks earlier and found that it would have resulted in reducing the number of cases and deaths by more than half. However, their study does not explicitly analyze how policies are related to the effective reproduction numbers.

Epidemiologists use model simulations to predict how cases and deaths evolve for the purpose of policy recommendation. As reviewed by Avery et al. (2020), there exists substantial uncertainty about the values of key epidimiological parameters (see also Atkeson, 2020a; Stock, 2020a). Simulations are often done under strong assumptions about the impact of social distancing policies without connecting to the relevant data. Furthermore, simulated models do not take into account that people may limit their contact with other people in response to higher transmission risks.^{7} When such a voluntary behavioral response is ignored, simulations would necessarily exhibit exponential spread and over-predict cases and deaths. In contrast, as cases and deaths rise, a voluntary behavioral response may possibly reduce the effective reproduction number below 1, potentially preventing exponential spread. Our counterfactual experiments illustrate the importance of this voluntary behavioral change.

Whether wearing masks in public place should be mandatory or not has been one of the most contested policy issues with health authorities of different countries providing contradictory recommendations. Reviewing evidence, Greenhalgh et al. (2020) recognize that there is no randomized controlled trial evidence for the effectiveness of face masks, but they state “indirect evidence exists to support the argument for the public wearing masks in the Covid-19 pandemic.”^{8} ,Howard et al. (2020) also review available medical evidence and conclude that “mask wearing reduces the transmissibility per contact by reducing transmission of infected droplets in both laboratory and clinical contexts.” The laboratory findings in Hou et al. (2020) suggest that the nasal cavity may be the initial site of infection followed by aspiration to the lung, supporting the argument “for the widespread use of masks to prevent aersol, large droplet, and/or mechanical exposure to the nasal passages.” He et al. (2020) examined temporal patterns of viral shedding in COVID-19 patients and found the highest viral load at the time of symptom onset; this suggests that a significant portion of transmission may have occurred before symptom onset and that universal face masks may be an effective control measure to reduce transmission.^{9} ,Ollila et al. (2020) provide a meta-analysis of randomized controlled trials of non-surgical face masks in preventing viral respiratory infections in non-hospital and non-household settings, finding that face masks decreased infections across all five studies they reviewed.^{10}

Given the lack of experimental evidence on the effect of masks in the context of COVID-19, conducting observational studies is useful and important. To the best of our knowledge, our paper is the first empirical study that shows the effectiveness of mask mandates on reducing the spread of Covid-19 by analyzing the US state-level data. This finding corroborates and is complementary to the medical observational evidence in Howard et al. (2020). Analyzing mitigation measures in New York, Wuhan, and Italy, Zhang et al. (2020b) conclude that mandatory face coverings substantially reduced infections. Abaluck et al. (2020) find that the growth rates of cases and of deaths in countries with pre-existing norms that sick people should wear masks are lower by 8 to 10% than those rates in countries with no pre-existing mask norms^{11} The Institute for Health Metrics and Evaluation at the University of Washington assesses that, if 95% of the people in the US were to start wearing masks from early August of 2020, 66,000 lives would be saved by December 2020 (IHME, 2020), which is largely consistent with our results. Our finding is also independently corroborated by a completely different causal methodology based on synthetic control using German data in Mitze et al. (2020).^{12}

Our empirical results contribute to informing the economic-epidemiological models that combine economic models with variants of SIR models to evaluate the efficiency of various economic policies aimed at the gradual “reopening” of various sectors of economy^{13} For example, the estimated effects of masks, stay-home mandates, and various other policies on behavior, and of behavior on infection can serve as useful inputs and validation checks in the calibrated macro, sectoral, and micro models (see, e.g., Alvarez, Argente, and Lippi (2020); Baqaee et al. (2020); Fernández-Villaverde and Jones (2020); Acemoglu et al. (2020); Keppo et al. (2020); McAdams (2020) and references therein). Furthermore, the causal framework developed in this paper could be applicable, with appropriate extensions, to the impact of policies on economic outcomes replacing health outcomes (see, e.g., Chetty et al. (2020); Coibion, Gorodnichenko, and Weber (2020)).

Finally, our causal model is framed using the language of structural equations models and causal diagrams of econometrics (Wright (1928); Haavelmo (1944); Tinbergen (1940); Wold (1954); Pearl (1995)) and genetics (Wright, 1923),^{14} with natural unfolding potential/structural outcomes representation (Rubin, 1974; Tinbergen, 1930; Neyman, 1925; Imbens and Rubin, 2015). The work on causal graphs has been modernized and developed by Pearl (1995); Greenland, Pearl, and Robins (1999); Pearl (2009); Pearl and Mackenzie (2018) and many others (e.g., Pearl and Mackenzie (2018); White and Chalak (2009); Robins, Richardson, and Shpitser (2020); Peters, Janzing, and Bernhard (2017); Bareinboim et al. (2020); Hernan and Robins (2020)), with applications in computer science, genetics, epidemiology, and econometrics (see, e.g., Heckman and Pinto (2013); Hiinermund and Bareinboim (2019); White and Chalak (2009) for applications in econometrics). The particular causal diagram we use has several “mediation” components, where variables affect outcomes directly and indirectly through other variables called mediators; these structures go back at least to Wright (1923, see Figure 6); see, e.g., Baron and Kenny (1986), Hines, Vansteelandt, and Diaz-Ordaz (2020), Robins, Richardson, and Shpitser (2020) for modern treatments.

## 2. The Causal Model for the Effect of Policies, Behavior, and Information on Growth of Infection

### 2.1. The Causal Model and Its Structural Equation Form

We introduce our approach through the Wright-style causal diagram shown in Figure 4. The diagram describes how policies, behavior, and information interact together:

The

*forward*health outcome,*Y*, is determined last, after all other variables have been determined;_{i,t+ℓ}The adopted policies,

*P*, affect health outcome_{it}*Y*either directly, or indirectly by altering human behavior_{i,t+ℓ}*B*;_{it}Information variables,

*I*such as lagged values of outcomes can affect human behavior and policies, as well as outcomes;_{it}The confounding factors

*W*, which vary across states and time, affect all other variables._{it}

The index *i* denotes observational unit, the state, and *t* and *t +ℓ* denotes the time, where *ℓ* is a positive integer that represents the time lag between infection and case confirmation or death.

Our main outcomes of interest are the growth rates in Covid-19 cases and deaths, behavioral variables include proportion of time spent in transit, shopping, and workplaces, policy variables include mask mandates, stay-at-home orders, and school and business closures, and the information variables include lagged values of outcome. We provide a detailed description of these variables and their timing in the next section.

The causal structure allows for the effect of the policy to be either direct or indirect – through behavior or through dynamics; all of these effects are not mutually exclusive. The structure also allows for changes in behavior to be brought by change in policies and information. These are all realistic properties that we expect from the contextual knowledge of the problem. Policies such as closures of schools, non-essential business, and restaurants alter and constrain behavior in strong ways. In contrast, policies such as mandating employees to wear masks can potentially affect the Covid-19 transmission directly. The information variables, such as recent growth in the number of cases, can cause people to spend more time at home, regardless of adopted state policies; these changes in behavior in turn affect the transmission of Covid-19. Importantly, policies can have the informational content as well, guiding behavior rather than constraining it.

The causal ordering induced by this directed acyclical graph is determined by the following timing sequence:

information and confounders get determined at

*t*,policies are set in place, given information and confounders at

*t*;behavior is realized, given policies, information, and confounders at

*t*;outcomes get realized at

*t*+*ℓ*given policies, behavior, information, and confounders.

The model also allows for direct dynamic effects of information variables on the outcome through autoregressive structures that capture persistence in growth patterns. As highlighted below, realized outcomes may become new information for future periods, inducing dynamics over multiple periods.

Our quantitative model for causal structure in Figure 4 is given by the following econometric structural (or potential) outcomes model:
which is a collection of functional relations with stochastic shocks, decomposed into observable part *δ*′*W* and unobservable part *ε*. The terms and are the centered stochastic shocks that obey the orthogonality restrictions posed below.

The policies can be modeled via a linear form as well,
although linearity is not critical.^{15}

The exogeneity restrictions on the stochastic shocks are as follows:
where we say that if E*VU* = 0^{16} This is a standard way of representing restrictions on errors in structural equation modeling in econometrics.^{17}

The observed variables are generated by setting *ı* = *I _{it}* and propagating the system from the last equation to the first:

The specification of the model above grasps one-period responses. The dynamics over multiple periods will be induced by the evolution of information variables, which include time, lagged and integrated values of outcome:^{18}
for each t ∊ {0,1, …,*T*}, where *g* is deterministic function of time, e.g., month indicators, assuming that the log of new cases at time *t* ≤ 0 is zero, for notational convenience.^{19} In this structure, people respond to both global information, captured by a function of time such as month dummies, and local information sources, captured by the local growth rate and the total number of cases. The local information also captures the persistence of the growth rate process. We model the reaction of people’s behavior via the term *γ*′*I _{t}* in the behavior equation. The lagged values of behavior variable may be also included in the information set, but we postpone this discussion after the main empirical results are presented.

With any structure of this form, realized outcomes may become new information for future periods, inducing a dynamical system over multiple periods. We show the resulting dynamical system in a diagram of Figure 5. Specification of this system is useful for studying delayed effects of policies and behaviors and in considering the counterfactual policy analysis.

Next we combine the above parts together with an appropriate initializations to give a formal definition of the model we use.

Structural Equations Model (SEM).

Let *i* ∊ {1, …, *N*} denote the observational unit, *t* be the time periods, and *ℓ* be the time delay. (1) For each *i* and *t* ≤ *–ℓ*, the confounder, information, behavior, and policy variables *W _{it}, I_{it}, B_{it}, P_{it}* are determined outside of the model, and the outcome variable

*Y*is determined by factors outside of the model for

_{i,t+ℓ}*t*≤ 0. (2) For each

*i*and

*t*≥ –

*ℓ*, confounders

*W*are determined by factors outside of the model, and information variables

_{it}*I*are determined by (I); policy variables

_{it}*P*are determined by setting

_{it}*ı= I*in (P) with a realized stochastic shock that obeys the exogeneity condition (E); behavior variables

_{it}*B*are determined by setting

_{it}*ı*. =

*I*and

_{it}*p = P*in (SO) with a shock that obeys (E); finally, the outcome

_{it}*Y*is realized by setting

_{i,t+ℓ}*ı = I*,

_{it}*p = P*, and

_{it}*b = B*in (SO) with a shock that obeys (E).

_{it}### 2.2. Main Testable Implication, Identification, Parameter Estimation

The system above together with orthogonality restrictions (E) implies the following collection of projection equations for realized variables:

Therefore the projection equation: should obey:

Without any exclusion restrictions, this equality is just a decomposition of total effects into direct and indirect components and is not a testable restriction. However, in our case we rely on the SIR model with testing to motivate the presence of change in testing rate as a confounder in the outcome equations but not in the behavior equation (therefore, a component of *δ _{B}* and

*δ*is set to 0), implying that (TR) does not necessarily hold and is testable. Furthermore, we estimate (PI→B) on the data set that has many more observations than the data set used to estimate the outcome equations, implying that (TR) is again testable. Later we shall also try to utilize the contextual knowledge that mask mandates only affect the outcome directly and not by changing mobility (i.e.,

_{P}*β*= 0 for mask policy), implying again that (TR) is testable. If not rejected by the data, (TR) can be used to sharpen the estimate of the causal effect of mask policies on the outcomes.

Validation of the model by (TR) allows us to check exclusion restrictions brought by contextual knowledge and check stability of the model by using different data subsets. However, passing the (TR) does not guarantee that the model is necessarily valid for recovering causal effects. The only fundamental way to truly validate a causal model for observational data is through a controlled experiment, which is impossible to carry out in our setting.

The parameters of the SEM are identified by the projection equation set above, provided the latter are identified by sufficient joint variation of these variables across states and time. We can develop this point formally as follows. Consider the previous system of equations, after partialling out the confounders:
where denotes the residual after removing the orthogonal projection of *V _{it}* on

*W*. The residualization is a linear operator, implying that (1) follows immediately from the above. The parameters of (1) are identified as projection coefficients in these equations, provided that residualized vectors appearing in each of the equations have non-singular variance, that is

_{it}Our main estimation method is the standard correlated random effects estimator, where the random effects are parameterized as functions of observable characteristic, *W _{it}*, which include both state-level and time random effects. The state-level random effects are modeled as a function of state level characteristics, and the time random effects are modeled by including month dummies and their interactions with state level characteristics (in the sensitivity analysis, we also add weekly dummies). The stochastic shocks are treated as independent across states

*i*and can be arbitrarily dependent across time

*t*within a state.

Another modeling approach is the fixed effects panel data model, where *W _{it}* includes latent (unobserved) state level confonders

*W*and and time level effects

_{i}*W*, which must be estimated from the data. This approach is much more demanding of the data and relies on long time and cross-sectional histories to estimate

_{t}*W*and

_{i}*W*, resulting in amplification of uncertainty. In addition, when histories are relatively short, large biases emerge and they need to be removed using debiasing methods, see e.g., Chen, Chernozhukov, and Fernández-Val (2019) for overview. We present the results on debiased fixed effect estimation with weekly dummies as parts of our sensitivity analysis. Our sensitivity analysis also considers a debiased machine learning approach using Random Forest in which observed confounders enter the model nonlinearly.

_{t}With exclusion restrictions there are multiple approaches to estimation, for example, via generalized method of moments. We shall take a more pragmatic approach where we estimate the parameters of equations separately and then compute
as the estimator of the total policy effect. Under standard regularity conditions, these estimators concentrate around their population analogues
with approximate deviations controlled by the normal laws, with standard deviations that can be approximated by the bootstrap resampling of observational units *i*. Under correct specification the target quantities reduce to
respectively.^{20}

### 2.3. Counterfactual Policy Analysis

We also consider simple counterfactual exercises, where we examine the effects of setting a sequence of counterfactual policies for each state:

We assume that the SEM remains invariant, except for the policy equation.^{21} The assumption of invariance captures the idea that counterfactual policy interventions would not change the structural functions within the period of the study. The assumption is strong but is necessary to conduct counterfactual experiments, e.g. Sims (1972) and Strotz and Wold (1960). To make the assumption more plausible we limited our study to the early pandemic period.^{22}

Given the policies, we generate the counterfactual outcomes, behavior, and information by propagating the dynamic equations:
with the same initialization as the factual system up to *t* ≤ 0. In stating this counterfactual system of equations, we assume that structural outcome equations (SO) and information equations (I) remain invariant and so do the stochastic shocks, decomposed into observable and unobservable parts. Formally, we record this assumption and above discussion as follows.

Counterfactual Structural Equations Model (CF-SEM).

Let *i* ∊ {1, …, *N*} be the observational unit, *t* be time periods, and *ℓ* be the time delay. (1) For each *i* and *t* ≤ 0, the confounder, information, behavior, policy, and outcome variables are determined as previously stated in SEM: , , , , . (2) For each *i* and *t* > 0, confounders *W* _{t} = Wu* are determined as in SEM, and information variables are determined by (I); policy variables are set in (CF-P); behavior variables are determined by setting and in (SO) with the same stochastic shock in (SO); the counterfactual outcome is realized by setting , , and in (SO) with the same stochastic shock in (SO).

Figures 6 and 7 present the causal path diagram for CF-SEM as well as the dynamics of counterfactual information in CF-SEM.

The counterfactual outcome and factual outcome *Y _{it+ℓ}* are given by:

In generating these predictions, we explore the assumption of invariance stated above. We can write the counterfactual contrast into the sum of three components: which describe the immediate indirect effect of the policy via behavior, the direct effect of the policy, and the dynamic effect of the policy. By recursive substitutions the dynamic effect can be further decomposed into a weighted sum of delayed policy effects via behavior and a weighted sum of delayed policy effects via direct impact.

All counterfactual quantities and contrasts can be computed from the expressions given above. For examples, given Δ*C _{i}*

_{0}> 0, new confirmed cases are linked to growth rates via relation (taking

*t*divisible by

*ℓ*for simplicity):

The cumulative cases can be constructed by summing over the new cases. Various contrasts are then calculated from these quantities. For example, the relative contrast of counterfactual new confirmed cases to the factual confirmed cases is given by:

We refer to the appendix for further details. Similar calculations apply for fatalities. Note that our analysis is conditional on the factual history and structural stochastic shocks.^{23}

The estimated counterfactuals are smooth functionals of the underlying parameter estimates. Therefore, we construct the confidence intervals for counterfactual quantities and contrasts by bootstrapping the parameter estimates. We refer to the appendix for further details.

### 2.4. Outcome and Key Confounders via SIRD model

We next provide details of our key measurement equations, defining the outcomes and key confounders. We motivate the structural outcome equations via the fundamental epidemiological model for the spread of infectious decease called the Susceptible-Infected-Recovered-Dead (SIRD) model with testing.

Letting *C _{it}* denote the cumulative number of confirmed cases in state

*i*at time

*t*, our outcome approximates the weekly growth rate in new cases from

*t*− 7 to t. Here A denotes the differencing operator over 7 days from t to

*t*− 7, so that Δ

*C*: =

_{it}*C*− Δ

_{it}*C*

_{it}_{−7}is the number of new confirmed cases in the past 7 days.

We chose this metric as this is the key metric for policy makers deciding when to relax Covid mitigation policies. The U.S. government’s guidelines for state reopening recommend that states display a “downward trajectory of documented cases within a 14-day period” (White House, 2020). A negative value of *Y _{it}* is an indication of meeting this criteria for reopening. By focusing on weekly cases rather than daily cases, we smooth idiosyncratic daily fluctuations as well as periodic fluctuations associated with days of the week.

Our measurement equation for estimating equations (BPI→Y) and (PI→Y) will take the form:
where *i* is state, *t* is day, *C _{it}* is cumulative confirmed cases,

*T*is the number of tests over 7 days, Δ is a 7-days differencing operator, and ϵ

_{it}*is an unobserved error term.*

_{it}*X*

_{i,t}_{−14}collects other behavioral, policy, and confounding variables, depending on whether we estimate (BPI→Y) or (PI→Y), where the lag of 14 days captures the time lag between infection and confirmed case (see the Appendix A.6). Here is the key confounding variable, derived from considering the SIRD model below. We are treating the change in testing rate as exogenous

^{24}We describe other confounders in the empirical section.

Our main measurement equation (M-C) is motivated by a variant of SIRD model, where we add confirmed cases and infection detection via testing. Let *S*, , *R*, and *D* denote the number of susceptible, infected, recovered, and deceased individuals in a given state. Each of these variables are a function of time. We model them as evolving as
where *N* is the population, *β*(*t*) is the rate of infection spread, *γ* is the rate of recovery or death, and *κ* is the probability of death conditional on infection.

Confirmed cases, *C*(*t*), evolve as
where *τ*(*t*) is the rate that infections are detected.

Our goal is to examine how the rate of infection *β*(*t*) varies with observed policies and measures of social distancing behavior. A key challenge is that we only observed *C*(*t*) and *D*(*t*), but not . The unobserved can be eliminated by differentiating (9) and using (6) as

We consider a discrete-time analogue of equation (10) to motivate our empirical specification by relating the detection rate *τ*(*t*) to the number of tests *T _{it}* while specifying as a linear function of variables

*X*

_{i,t}_{−14}. This results in which is equation (M-C), where

*X*

_{i,t}_{−14}captures a vector of variables related to

*β*(

*t*).

structural interpretation.

Early in the pandemic, when the number of susceptibles is approximately the same as the entire population, i.e. *S _{it}*/

*N*≈ 1, the component is the projection of infection rate

_{it}*β*(

*t*) on

*X*

_{i,t}_{−14}(policy, behavioral, information, and confounders other than testing rate), provided the stochastic component ϵ

*is orthogonal to*

_{it}*X*

_{i,t}_{−14}and the testing variables:

The specification for growth rate in deaths as the outcome is motivated by SIRD as follows. By differentiating (8) and (9) with respect to *t* and using (10), we obtain

Our measurement equation for the growth rate of deaths is based on equation (11) but accounts for a 21 day lag between infection and death as
where
approximates the weekly growth rate in deaths from *t* − 7 to t in state *i*.

## 3. Empirical Analysis

### 3.1. Data

Our baseline measures for daily Covid-19 cases and deaths are from The New York Times (NYT). When there are missing values in NYT, we use reported cases and deaths from JHU CSSE, and then the Covid Tracking Project. The number of tests for each state is from Covid Tracking Project. As shown in the lower right panel of Figure 17 in the appendix, there was a rapid increase in testing in the second half of March and then the number of tests increased very slowly in each state in April.

We use the database on US state policies created by Raifman et al. (2020). In our analysis, we focus on 6 policies: stay-at-home, closed nonessential businesses, closed K-12 schools, closed restaurants except takeout, closed movie theaters, and face mask mandates for employees in public facing businesses. We believe that the first four of these policies are the most widespread and important. Closed movie theaters is included because it captures common bans on gatherings of more than a handful of people. We also include mandatory face mask use by employees because its effectiveness on slowing down Covid-19 spread is a controversial policy issue (Howard et al., 2020; Greenhalgh et al., 2020; Zhang et al., 2020b). Table 1 provides summary statistics, where *N* is the number of states that have ever implemented the policy. We also obtain information on state-level covariates mostly from Raifman et al. (2020), which include population size, total area, unemployment rate, poverty rate, a percentage of people who are subject to illness, and state governor’s party affiliations. These confounders are motivated by Wheaton and Thompson (2020) who find that case growth is associated with residential density and per capita income.

We obtain social distancing behavior measures from“Google COVID-19 Community Mobility Reports” (LLC, 2020). The dataset provides six measures of “mobility trends” that report a percentage change in visits and length of stay at different places relative to a baseline computed by their median values of the same day of the week from January 3 to February 6, 2020. Our analysis focuses on the following four measures: “Grocery & pharmacy,” “Transit stations,” “Retail & recreation,” and “Workplaces.”^{25}

Figure 8 shows the evolution of “Transit stations” and “Workplaces,” where thin lines are the value in each state and date while thicker colored lines are quantiles conditional on date. The figures illustrate a sharp decline in people’s movements starting from mid-March as well as differences in their evolutions across states. They also reveal periodic fluctuations associated with days of the week, which motivates our use of weekly measures.

In our empirical analysis, we use weekly measures for cases, deaths, and tests by summing up their daily measures from day *t* to *t* − 6. We focus on weekly cases and deaths because daily new cases and deaths are affected by the timing of reporting and testing and are quite volatile as shown in the upper right panel of Figure 17 in the appendix. Aggregating to weekly new cases/deaths/tests smooths out idiosyncratic daily noises as well as periodic fluctuations associated with days of the week. We also construct weekly policy and behavior variables by taking 7 day moving averages from day *t* − 14 to *t* − 21 for case growth, where the delay reflects the time lag between infection and case confirmation. The four weekly behavior variables are referred to “Transit Intensity,” “Workplace Intensity,” “Retail Intensity,” and “Grocery Intensity.” Consequently, our empirical analysis uses 7 day moving averages of all variables recorded at daily frequencies. Our sample period is from March 7, 2020 to June 3, 2020.

Table 2 reports that weekly policy and behavior variables are highly correlated with each other, except for the“masks for employees” policy. High correlations may cause multicolinearity problems and could limit our ability to separately identify the effect of each policy or behavior variable on case growth. For this reason, we define the “business closure policies” variable by the average of closed movie theaters, closed restaurants, and closed non-essential businesses variables and consider a specification that includes business closure policies in place of these three policy variables separately.

Figure 9 shows the portion of states that have each policy in place at each date. For most policies, there is considerable variation across states in the time in which the policies are active. The one exception is K-12 school closures. About 80% of states closed schools within a day or two of March 15th, and all states closed schools by early April. This makes the effect of school closings difficult to separate from aggregate time series variation.

### 3.2. The Effect of Policies and Information on Behavior

We first examine how policies and information affect social distancing behaviors by estimating a version of (PI→B):
where represents behavior variable *j* in state *i* at time *t*. *P _{it}* collects the Covid related policies in state

*i*at

*t*. Confounders,

*W*, include state-level covariates, month indicators, and their interactions.

_{it}*I*is a set of information variables that affect people’s behaviors at

_{it}*t*. As information, we include each state’s growth of cases (in panel A) or deaths (in panel B), and log cases or deaths. Additionally, in columns (5)-(8) of Table 3, we include national growth and log of cases or deaths.

Table 3 reports the estimates with standard errors clustered at the state level. Across different specifications, our results imply that policies have large effects on behavior. Comparing columns (1)-(4) with columns (5)-(8), the magnitude of policy effects are sensitive to whether national cases or deaths are included as information. The coefficient on school closures is particularly sensitive to the inclusion of national information variables. As shown in Figure 9, there is little variation across states in the timing of school closures. Consequently, it is difficult to separate the effect of school closures from a behavioral response to the national trend in cases and deaths.

The other policy coefficients are less sensitive to the inclusion of national case/death variables. After school closures, business closure policies have the next largest effect followed by stay-at-home orders. The effect of masks for employees is small.^{26}

The row “∑ * _{j}* Policy

*” reports the sum of the estimated effect of all policies, which is substantial and can account for a large fraction of the observed declines in behavior variables. For example, in Figure 8, transit intensity for a median state was approximately -50% at its lowest point in early April. The estimated policy coefficients in columns imply that imposing all policies would lead to roughly 75% (in column 4) or roughly 35% (in column 8) of the observed decline. The large impact of policies on transit intensity suggests that the policies may have reduced the Covid-19 infection by reducing people’s use of public transportation.*

_{j}^{27}

In Table 3(B), estimated coefficients of deaths and death growth are generally negative. This suggests that the higher number of deaths reduces social interactions measured by Google Mobility Reports perhaps because people are increasingly aware of prevalence of Covid-19 (Maloney and Taskin, 2020). The coefficients on log cases and case growth in Table 3(A) are more mixed.^{28} In columns (5)-(8) of both panels, we see that national case/death variables have large, negative coefficients. This suggests that behavior responded to national conditions although it is also likely that national case/death variables capture unobserved aggregate time effects beyond information which are not fully controlled by month dummies (e.g., latent policy variables and time-varying confounders that are common across states).

### 3.3. The Direct Effect of Policies and Behavior on Case and Death Growth

We now analyze how behavior and policies together inuence case and death growth rates. We begin with some simple graphical evidence of the effect of policies on case and death growth. Figure 10 shows average case and death growth conditional on date and whether masks are mandatory for employees.^{29} The left panel of the figure shows that states with a mask mandate consistently have 0-0.2 lower case growth than states without. The right panel also illustrates that states with a mask mandate tend to have lower average death growth than states without a mask mandate.

Similar plots are shown for other policies in Figures 20 and 21 in the appendix. The figures for stay-at-home orders and closure of nonessential businesses are qualitatively similar to that for masks. States with these two policies appear to have about 0.1 percentage point lower case growth than states without. The effects of school closures, movie theater closures, and restaurant closures are not clearly visible in these figures. These figures are merely suggestive; the patterns observed in them may be driven by confounders.

We more formally analyze the effect of policies by estimating regressions. We first look at the direct effect of policies on case and death growth conditional on behavior by estimating equation (BPI→Y):
where the outcome variable, *Y _{i;t+ℓ}*, is either case growth or death growth.

For case growth as the outcome, we choose a lag length of *ℓ* = 14 days for behavior, policy, and information variables to reect the delay between infection and confirmation of case.^{30} is a vector of four behavior variables in state *i*. *P _{it}* includes the Covid-related policies in state

*i*that directly affect the spread of Covid-19 after controlling for behavior variables (e.g., masks for employees). We include information variables,

*I*, that include the past cases and case growths because the past cases may be correlated with (latent) government policies or people’s behaviors that are not fully captured by our observed policy and behavior variables. We also consider a specification that includes the past cases and case growth at the national level as additional information variables.

_{it}*W*is a set of confounders that includes month dummies, state-level covariates, and the interaction terms between month dummies and state-level covariates.

_{it}^{31}For case growth,

*W*also includes the test rate growth Δlog(

_{it}*T*)

*to capture the effect of changing test rates on confirmed cases. Equation (13) corresponds to (M-C) derived from the SIR model.*

_{it}For death growth as the outcome, we take a lag length of *ℓ* = 21 days. The information variables *I _{it}* include past deaths and death growth rates;

*W*is the same as that of the case growth equation except that the growth rate of test rates is excluded from

_{it}*W*as implied by equation (M-D).

_{it}Table 4 shows the results of estimating (13) for case and death growth rates.

Column (1) represents our baseline specification while column (2) replaces business closure policies with closed movie theaters, closed restaurants, and closed non-essential businesses. Columns (3) and (4) include past cases/deaths and growth rates at national level as additional regressors.

The estimates indicate that mandatory face masks for employees reduce the growth rate of infections and deaths by 9-15 percent, while holding behavior constant. This suggests that requiring masks for employees in public-facing businesses may be an effective preventive measure^{32} The estimated effect of masks on death growth is larger than the effect on case growth, but this difference between the two estimated effects is not statistically significant.

Closing schools has a large and statistically significant coefficient in the death growth regressions. As discussed above, however, there is little cross-state variation in the timing of school closures, making estimates of its effect less reliable.

Neither the effect of stay-at-home orders nor that of business closure policies is estimated significantly different from zero, suggesting that these lockdown policies may have little direct effect on case or death growth when behavior is held constant.

The row “∑* _{k} w_{k}*Behavior

*” reports the sum of estimated coefficients weighted by the average of the behavioral variables from April 1st-10th. The estimates of –0.80 and –0.84 for “∑*

_{k}*Behavior*

_{k}w_{k}*” in column (1) imply that a reduction in mobility measures relative to the baseline in January and February have induced a decrease in case and death growth rates by 80 and 84 percent, respectively, suggesting an importance of social distancing for reducing the spread of Covid-19. When including national cases and deaths in information, as shown in columns (3) and (4), the estimated aggregate impact of behavior is substantially smaller but remains large and statistically significant.*

_{k}A useful practical implication of these results is that Google Mobility Reports and similar data might be useful as a leading indicator of potential case or death growth. This should be done with caution, however, because other changes in the environment might alter the relationship between behavior and infections. Preventative measures, including mandatory face masks, and changes in habit that are not captured in our data might alter the future relationship between Google Mobility Reports and case/death growth.

The negative coefficients of the log of past cases or deaths in Table 4 is consistent with a hypothesis that higher reported cases and deaths change people’s behavior to reduce transmission risks. Such behavioral changes in response to new information are partly captured by Google mobility measures, but the negative estimated coefficient of past cases or deaths imply that other latent behavioral changes that are not fully captured by Google mobility measures (e.g., frequent hand-washing, wearing masks, and keeping 6ft/2m distancing) are also important for reducing future cases and deaths.

If policies are enacted and behavior changes, then future cases/deaths and information will change, which will induce further behavior changes. However, since the model includes lags of cases/deaths as well as their growth rates, computing a long-run effect is not completely straightforward. We investigate dynamic effects that incorporate feedback through information in section 5.

### 3.4. The Total Effect of Policies on Case Growth

In this section, we focus our analysis on policy effects when we hold information constant. The estimated effect of policy on behavior in Table 3 and those of policies and behavior on case/death growth in Table 4 can be combined to calculate the total effect of policy as well as its decomposition into direct and indirect effects.

The first three columns of Table 6 show the direct (holding behavior constant) and indirect (through behavior changes) effects of policy under a specification that excludes national information variables. These are computed from the specification with national cases or deaths included as information (columns (1)-(4) of Table 3 and column (1) of Table 4). The estimates imply that all policies combined would reduce the growth rate of cases and deaths by 0.69 and 0.90, respectively, out of which more than one-half to two-third is attributable to the indirect effect through their impact on behavior. The estimate also indicates that the effect of mandatory masks for employees is mostly direct.

We can also examine the total effect of policies and information on case or death growth, by estimating (PI→Y). The coefficients on policy in this regression combine both the direct and indirect effects.

Table 5 shows the full set of coefficient estimates for (PI→Y). The results are broadly consistent with what we found above. As in Table 3, the effect of school closures is sensitive to the inclusion of national information variables. Also as above, mask mandates have a significant negative effect on growth rates.

Similarly to Table 6, the first three columns of Table 7 report the estimated direct and indirect effects of policy but impose that masks for employees only affect cases/deaths directly without affecting behavior and that business closure policies only affect cases/deaths indirectly through their effects on behavior.^{33} The estimated total effect of masks for employees in the third column of Table 7 is higher than that of Table 6. Similarly, the total effect of business closure policies is estimated to be larger in Table 7 than in Table 6.

Column “Difference” in Tables 6 and 7 show the difference between the estimate of (PI→Y) in column “PI→Y Coefficient” and the implied estimate from (BPI→Y)-(PI→B) in column “Total.” Differences are generally small except for the coefficient of closed K-12 schools and the sum of all policies in Table 6. The large differences in school closures may be due to the difficulty in identifying the effect of school closures given a lack of cross-sectional variation. Imposing the coefficients of masks for employees and business closure policies to be zero in (BPI→Y) and (PI→B), respectively, does not increase differences between “Total” and “PI→Y Coefficient” as reported in the last column of Table 7.

Tables 8 and 9 present the results for a specification that includes national information variables, where the estimates on masks for employees are similar to those in Tables 6 and 7. Column “Difference” of Table 9 indicates that the restrictions of zero coefficients for masks for employees in (BPI→Y) and for business closure policies in (PI→B) are not statistically rejected.

Column “Average” of Tables 6 and 8 reports the average of “Total” and “PI→Y Coefficient” columns. The average is an appealing and simple way to combine the two estimates of the total effect: one relying on the causal structure and another inferred from a direct estimation of equation (PI → Y). We shall be using the average estimate in generating the counterfactuals in the next section. Turning to the results, the estimates of Tables 6 and 8 imply that all policies combined would reduce ΔlogΔD by 0.93 and 0.39, respectively. For comparison, the median of ΔlogΔ*D _{it}* reached its peak in mid-March of about 1.3 (see Figure 18 in the appendix). Since then it has declined to near 0. Therefore, -0.93 and -0.39 imply that policy changes can account for roughly one-third to two-third of the observed decrease in death growth. The remainder of the decline is likely due to changes in behavior from information.

## 4. Sensitivity Analysis

### 4.1. Specifications, Timings, and Flexible Controls via Machine Learning

In this section, we provide sensitivity analysis by estimating (PI→Y) with alternative specifications and methods. Figure 11 shows the 90% confidence intervals of coefficients of (A) masks for employees, (B) closed K-12 school, (C) stay-at-home, and (D) the average variable of closed movie theaters, closed restaurants, and closed non-essential businesses for the following specifications and estimation methods:

Exclude the state of New York from the sample because it may be viewed as an outlier in the early pandemic period.

Add the percentage of people who wear masks to protect themselves in March and April as a confounder for unobserved personal risk-aversion and initial attitude toward mask wearing.

^{34}Add the log of Trump’s vote share in the 2016 presidential election as a confounder for unobserved private behavioral response.

Add past behavior variables as information used to set policies. Under this specification, our causal interpretation is valid when policy variables are sufficiently random conditional on past behavior variables.

Include all additional controls in (3)-(5) with the sample that excludes New York as in (2).

Add weekly dummies to the baseline specification.

Baseline specification estimated by instrumenting Δlog

*T*with one week lagged logarithm value of the number of tests per 1000 people._{it}Estimated by Double Machine Learning (DML) (e.g., Chernozhukov et al., 2018) with Lasso to reduce dimensionality while including all additional controls in (3)-(5).

Estimated by DML with Random Forest to reduce dimensionality and capture some nonlinearities while including all additional controls in (3)-(5).

In Figure 11, “red”, “green”, “blue”, and “purple” indicate the regression models for case growth without national information variables, case growth with national information variables, death growth without national information variables, and death growth with national information variables, respectively. The left panel of “(i) baseline timing” assume that the times from exposure to case confirmation and death reporting are 14 and 21 days, respectively, while they are 7 and 24 days, respectively, in the right panel of “(ii) alternative timing.”^{35}

Panel (A) of Figure 11 illustrates that the estimated coefficients of mask mandates are negative and significant in most specifications, methods, and timing assumptions, confirming the importance of mask policy on reducing case and death growths.

In Panel (B) of Figure 11, many estimates of closures of K-12 schools suggest that the effect of school closures is large. The visual evidence on growth rates for states with and without school closures in Figure 12 also suggests that there may be a potentially large effect, though the history is very short. This evidence is consistent with the emerging evidence of prevalence of Covid-19 among children (Lee et al., 2020b; Szablewski et al., 2020). Davies et al. (2020) find that although children’s transmission and susceptibility rates are half that of ages 20-30, children’s contact rates are much higher. This type of evidence, as well as, evidence that children carry viral loads similar to older people (Jones et al. (2020)), led Germany to make the early decision of closing schools.

Our estimates of school closures substantially vary across specifications, however. In particular, the estimated effects of school closures on case/death growth become notably smaller once national cases/deaths or weekly dummies are controlled for. In the US state-level data, there is little variation across states in the timing of school closures. Consequently, its estimate is particularly sensitive to an inclusion of some aggregate variables such as national cases or weekly dummies. Given this sensitivity, there still exists a lot of uncertainty as to the magnitude of the effect of school closures. Any analyses of re-opening plans need to be aware of this uncertainty. An important research question is how to resolve this uncertainty using additional data sources.

Panel (C) of Figure 11 indicates that the estimated coefficients of stay-at-home orders are generally negative and often significant, providing evidence that stay-at-home orders reduce the spread of COVID-19, although the estimates are sometimes sensitive to timing assumptions. Panel (D) shows that the estimated coefficients of business closure policies substantially vary across specifications, providing a mixed evidence for the effect of closures of businesses on case/death growth.

### 4.2. Fixed Effects Specification

Table 10 presents the results of estimating (PI→Y) with state fixed effects and weekly dummies. Because fixed effects estimator could be substantially biased when the time dimension is relatively short, we report not only the standard fixed effects estimator in columns (1) and (3) but also the debiased fixed effects estimator (e.g., Chen, Chernozhukov, and Fernandez-Val, 2019) in columns (2) and (4), where the state-level clustered bootstrap standard errors are reported in all columns.

The estimated coefficients of masks for employees largely confirm our finding that mandatory mask policies reduce the case and death growths. The estimated coefficients of stay-at-home orders and business closures are negative in columns (1) and (2) but their magnitudes as well as statistical significance are somewhat sensitive to whether bias corrections are applied or not.

The results in Table 10 should be interpreted with caution. The fixed effects approach may not be preferred to random effects approach here because the former relies on long time and cross-sectional histories but, in our data, the effective time-dimension is short and the number of states is not large. Furthermore, the fixed effects approach could suffer more from measurement errors, which could be a concern for our behavior and policy variables.

## 5. Empirical Evaluation of Counterfactual Policies

We now turn our focus to dynamic feedback effects. Policy and behavior changes that reduce case and death growth today can lead to a more optimistic, riskier behavior in the future, attenuating longer run effects. We perform the main counterfactual experiments using the average of two estimated coefficients as reported in column “Average” of Table 7 under a specification that excludes the number of past national cases and deaths from information variables and constrains masks to have only direct effects and business policies only indirect effects.^{36} Details of the counterfactual computations are in appendix A.10.

The vertical text denotes the dates that Washington imposed each policy. To compute the estimated and counterfactual paths we use the average of two estimated coefficients as reported in column “Average” of Table 7. We set initial Δ log Δ*C* and log Δ*C* to their values first observed in the state we are simulating. We hold all other regressors at their observed values. The shaded region is a point-wise 90% confidence interval. See appendix A.10 for more details.

### 5.1. Business Mask Mandate

We first consider the impact of a nationwide mask mandate for employees beginning on March 14th. As discussed earlier, we find that mask mandates reduce case and death growth even when holding behavior constant. In other words, mask mandates may reduce infections with relatively little economic disruption. This makes mask mandates a particularly attractive policy instrument. In this section we examine what would have happened to the number of cases if all states had imposed a mask mandate on March 14th.^{37}

For illustrative purpose, we begin by focusing on Washington State. The left column of Figure 13 shows the change in growth rate from mandating masks on March 14th. The shaded region is a 90% pointwise confidence interval. As shown, mandating masks on March 14th lowers the growth of cases or deaths 14 or 21 days later by 0.07 or 0.14. This effect then gradually declines due to information feedback. Mandatory masks reduce past cases or deaths, which leads to less cautious behavior, attenuating the impact of the policy. The reversal of the decrease in growth in late May is due to our comparison of a mask mandate on March 14th with Washington’s actual mask mandate in early May. By late May, the counterfactual mask effect has decayed through information feedback, and we are comparing it to the undecayed impact of Washington’s actual, later mask mandate.

The middle column of Figure 13 shows how the changes in case and death growth translate into changes in daily cases and deaths. The estimates imply that mandating masks on March 14th would have led to about 70 fewer cases and 5 fewer deaths per day throughout April and May. Cumulatively, this implies 19% [9,29]% fewer cases and 26% [14,37]% fewer deaths in Washington by the start of June.

The results for other states are similar to those for Washington. Figure 14 shows the national effect of mask mandate on March 14th. The top panel shows the effect on cases and the bottom panel shows the effect on deaths. The left column shows the change in growth rates. Dots are the average change in growth rate in each state (i.e. the dots are identical to the solid line in the left panel of Figure 13, except for each state instead of just Washington). The solid blue line is the national average change in growth rate. The shaded region is a 90% pointwise confidence band for the national average change in growth rate.

The middle column shows the changes in cases and deaths per day. Dots are the expected change in each state. The thick line is the national total change in daily cases or deaths. The shaded region is a pointwise 90% confidence band for the national total change. The estimates that a mask mandate in mid March would have decreased cases and deaths by about 6000 and 600 respectively throughout April and May.

The right column of Figure 14 shows how these daily changes compare to the total cumulative cases and deaths. The right column shows the national relative change in cumulative cases or deaths. The point estimates indicate that mandating masks on March 14th could have led to 21% fewer cumulative cases and 34% fewer deaths by the end of May with their 90 percent intervals given by [9, 32]% and [19, 47]%, respectively. The result roughly translates into 19 to 47 thousand saved lives.

In the left column, the dots are the average change in growth in each state. The blue line is the average across states of the change in growth. The shaded region is a point-wise 90% confidence interval for the national average change in growth. The middle column shows the total national change in daily cases and deaths. Black dots are the changes in each state. The right column shows the change in cumulative cases or deaths relative to the baseline of actual policies.

### 5.2. Business Closure Policies

Business closures are particularly controversial. We now examine a counterfactual where there are no business closure policies (no restaurant, movie theater, or non-essential business closure). Figure 15 shows the national effect of never restricting businesses on cases and deaths. The point estimates imply that, without business closures, cases and deaths would be about 40% higher at the end of May. The confidence intervals are wide. A 90 percent confidence interval for the increase in cases at the end of May is [17, 78]%. The confidence interval for deaths is even wider, [1,97]%.

### 5.3. Stay-at-home orders

We next examine a counterfactual where stay-at-home orders had been never issued. Figure 16 shows the national effect of no stay-at-home orders. On average, without stay-at-home orders, case growth rate would have been nearly 0.075 higher in late April. This translates to 37% [6, 63]% more cases per week by the start of June. The estimated effect for deaths is a bit smaller, with no increase included in a 90 percent confidence interval, [–7, 50]%.

## 6. Conclusion

This paper assesses the effects of policies on the spread of Covid-19 in the US using state-level data on cases, tests, policies, and social distancing behavior measures from Google Mobility Reports. Our findings are summarized as follows.

First, our empirical analysis robustly indicates that mandating face masks has reduced the spread of Covid-19 without affecting people’s social distancing behavior measured by Google Mobility Reports. Our counterfactual experiment suggests that if all states had adopted mandatory face mask policies on March 14th of 2020, then the cumulative number of deaths by the end of May would have been smaller by as much as 19 to 47%, which roughly translates into 19 to 47 thousand saved lives.

Second, our baseline counterfactual analysis suggests that keeping all businesses open would have led to 17 to 78% more cases while not implementing stay-at-home orders would have increased cases by 6 to 63% by the end of May although we should interpret these numbers with some caution given that the estimated effect of business closures and stay-at-home orders vary across specifications in our sensitivity analysis.

Third, we find considerable uncertainty over the impact of school closures on case or death growth because it is difficult to identify the effect of school closures from the US state-level data due to the lack of variation in the timing of school closures across states.

Fourth, our analysis shows that people voluntarily reduce their visits to workplace, retail stores, grocery stores, and people limit their use of public transit when they receive information on a higher number of new cases and deaths. This suggests that individuals make decisions to voluntarily limit their contact with others in response to greater transmission risks, leading to an important feedback mechanism that affects future cases and deaths. Model simulations that ignore this voluntary behavioral response to information on transmission risks would over-predict the future number of cases and deaths.

Beyond these findings, our paper presents a useful conceptual framework to investigate the relative roles of policies and information on determining the spread of Covid-19 through their impact on people’s behavior. Our causal model allows us to explicitly define counterfactual scenarios to properly evaluate the effect of alternative policies on the spread of Covid-19. More broadly, our causal framework can be useful for quantitatively analyzing not only health outcomes but also various economic outcomes (Bartik et al., 2020; Chetty et al., 2020).

## Data Availability

All data used in this manuscript is available in public domain as stated in the data appendix of the manuscript.

## Appendix a. Data Construction

### A.1. Measuring Δ*C* and Δlog Δ*C*

We have three data sets with information on daily cumulative confirmed cases in each state. As shown in Table 11, these cumulative case numbers are very highly correlated. However, Table 12 shows that the numbers are different more often than not.

The upper left panel of Figure 17 shows the evolution of new cases in the NYT data, where daily changes in cumulative cases exhibit some excessive volatility. This is likely due to delays and bunching in testing and reporting of results. Table 13 shows the variance of log new cases in each data set, as well as their correlations. As shown, the correlations are approximately 0.9. The NYT new case numbers have the lowest variance^{38} In our subsequent results, we will primarily use the case numbers from The New York Times.

For most of our results, we focus on new cases in a week instead of in a day. We do this for two reasons as discussed in the main text. First, a decline in new cases over two weeks has become a key metric for decision makers. Secondly, aggregating to weekly new cases smooths out the noise associated with the timing of reporting and testing.

Table 14 reports the correlation and variance of weekly log new cases across the three data sets. The upper right panel of Figure 17 shows the evolution of weekly new cases in each state over time from the NYT data. The upper panel of Figure 18 shows the evolution of the weekly growth rate of new cases in each state over time.

Thin gray lines are the log of cases, death, and tests in each state and date. Thicker colored lines are their quantiles conditional on date.

#### A.2. Deaths

Table 15 reports the correlation and variance of weekly deaths in the three data sets. The lower left panel of Figure 17 shows the evolution of weekly deaths in each state from the NYT data. As with cases, we use death data from The New York Times in our main results. The lower panel of Figure 18 shows the evolution of the weekly growth rate of new deaths in each state over time.

Thin gray lines are case or death growth in each state and date. Thicker colored lines are quantiles of case or death growth conditional on date.

#### A.3. Tests

Our test data comes from The Covid Tracking Project. The lower right panel of Figure 17 shows the evolution of the weekly number of tests over time.

#### A.4. Social Distancing Measures

In measuring social distancing, we focus on Google Mobility Reports. This data has international coverage and is publicly available. Figure 19 shows the evolution of the four Google Mobility Reports variables that we use in our analysis.

#### A.5. Policy Variables

We use the database on US state policies created by Raifman et al. (2020). As discussed in the main text, our analysis focuses on six policies. For stay-at-home orders, closed nonessential businesses, closed K-12 schools, closed restaurants except takeout, and closed movie theaters, we double-checked any state for which Raifman et al. (2020) does not record a date. We filled in a few missing dates. Our modified data is available here. Our modifications fill in 1 value for school closures, 2 for stay-at-home orders, 3 for movie theater closure, and 4 for non-essential business closures. Table 16 displays all 25 dated policy variables in Raifman et al. (2020)‘s database with our modifications described above.

Figures 20-21 show the average case and death growth conditional on date and whether each of six policies is implemented or not.

#### A.6. Timing

There is a delay between infection and when a person is tested and appears in our case data. MIDAS (2020) maintain a list of estimates of the duration of various stages of Covid-19 infections. The incubation period, the time from infection to symptom onset, is widely believed to be 5 days. For example, using data from Wuhan, Li et al. (2020) estimate a mean incubation period of 5.2 days. Siordia (2020) reviews the literature and concludes the mean incubation period is 3-9 days.

Estimates of the time between symptom onset and case reporting or death are less common. Using Italian data, Cereda et al. (2020) estimate an average of 7.3 days between symptom onset and reporting. Zhang et al. (2020a) find an average of 7.4 days using Chinese data from December to early February, but they find this period declined from 8.9 days in January to 5.4 days in the first week of February. Both of these papers on time from symptom onset to reporting have large confidence intervals covering approximately 1 to 20 days.

Studying publicly available data on infected persons diagnosed outside of Wuhan, Linton et al. (2020) estimate an average of 15 days from onset to death. Similarly, using publicly available reports of 140 confirmed Covid-19 cases in China, mostly outside Hubei Province, Sanche et al. (2020) estimate the time from onset to death to be 16.1 days.

Based on the above, we expect a delay of roughly two weeks between changes in behavior or policies, and changes in reported cases while a corresponding delay of roughly three weeks for deaths.

#### A.7. Direct and Indirect Policy Effects with national case/death variables

Tables 8 and 9 present the estimates of direct and indirect effects of policies for the specification with past national case/death variables. The effects of school closures and the sum of policies are estimated substantially smaller in Table 8 when national case/death variables are included than in Table 6. This sensitivity reflects the difficulty in identifying the aggregate time effect—which is largely captured by national cases/deaths—given little cross-sectional variation in the timing of school closures across states. On the other hand, the estimated effects of policies other than school closures are similar between Table 6 and Table 8; the effect of other policies are well-identified from cross-sectional variations.

#### A.8. Double Machine Learning

To estimate the coefficient of the *j*-th policy variable, , using double machine learning (Chernozhukov et al. (2018)), we consider a version of (PI→Y) as follows:
where collects policy variables except for the *j*-th policy variables , behavior variables *B _{it}*, and information variables

*I*. The confounding factors

_{it}*W*affect the policy variable via the function

_{it}*m*

_{0}(

*W*) and the outcome variable

*Y*

_{i,t}_{+}

*via the function*

_{ℓ}*g*

_{0}(

*W*). We apply Lasso or Random Forests to estimate

*g*

_{0}(

*W*) for dimension reductions or for capturing non-linearity while the coefficients of

*X*are estimated under linearity without imposing any dimension reductions. We omit the details of estimation procedure here. Please see the discussion in Example 1.1 of Chernozhukov et al. (2018) for reference.

_{it}#### A.9. Debiased Fixed Effects Estimator

We apply cross-over Jackknife bias correction as discussed in Chen et al. (2020) in details. Here, we briefly describe our debiased fixed effects estimator.

Given our panel data with *N* = 51 states and *T* = 75 days for case growth equation (PI→Y), consider two subpanels as follows:
and
where ⎾·⏋ and ⎿·⏌ are the ceiling and floor functions. Each of these two subpanels includes observations for all cross-sectional units and time periods.

We form the debiased fixed effects estimator as
where is the standard fixed effects estimator while denotes the fixed effects estimator using the data set **S**_{1}∪ **S**_{2} but treats the states in **S**_{1} differently from those in **S**_{2} to form the fixed effects estimator; namely, we include approximately twice more state fixed effects to compute ^{39} We obtain bootstrap standard errors by using multipler bootstrap with state-level clustering.

#### A.10. Details for Computing Counterfactuals

We compute counterfactuals from the “total effect” version of the model, with behavior concentrated out.

We consider a counterfactual change of *P _{it}* to , while

*W*and are held constant. In response to the policy change,

_{it}*Y*and the part of

_{i,t+ℓ}*I*that contains

_{it}*Y*, change to and . To be specific, let

_{it}*C*denote cumulative cases in state

_{it}*i*on day

*t*. Our outcome is: where Δ is a 7 day difference operator. Writing the model in terms of Δ log Δ

*C*we have:

To simplify computation, we rewrite this model in terms of log Δ*C*:

This equation is used to iteratively compute log conditional on initial and the entire sequence of .

Weekly cases instead of log weekly cases are given simply by

Cumulative cases can be recursively computed as: given initial .

Note that log depends linearly on , so the residuals (and our decision to condition on versus integrate them out) do not matter when considering linear constasts of log Δ*C _{it}* or ΔlogΔ

*C*, or when considering relative contrasts of Δ

_{it}*C*.

_{it}Inference.

Let *S*(*θ*, *ε*) denote some counterfactual quantity or contrast of interest, where are the parameters, and *ε* is the vector of residuals. Examples of *S* that we compute include:

Contrasts of growth rates:

Relative contrasts of weekly cases:

Relative contrasts of cumulative cases:

The first two examples do not actually depend on *ε*, but the third one does. Inference is by simulation. Let denote our point estimates and the associated residuals. We draw from the asymptotic distribution of . Let denote the residuals associated with . We then compute for *j* = 1,…, 200 and plots the mean across *j* as a point estimate and quantiles across *j* for confidence intervals.

Department of Economics and Center for Statistics and Data Science, MIT, MA 02139

*Email address*: vchern{at}mit.edu

Vancouver School of Economics, UBC, 6000 Iona Drive, Vancouver, BC.

*Email address*: hkasahar{at}mail.ubc.ca

Vancouver School of Economics, UBC, 6000 Iona Drive, Vancouver, BC.

*Email address*: schrimpf{at}mail.ubc.ca

## Footnotes

We are grateful to Daron Acemoglu, V.V. Chari, Raj Chetty, Christian Hansen, Glenn Ellison, Ivan Fernandez-Val, David Green, Ido Rosen, Konstantin Sonin, James Stock, Elie Tamer, and Ivan Werning for helpful comments. We also thank Chiyoung Ahn, Joshua Catalano, Jason Chau, Samuel Gyetvay, Sev Chenyu Hou, Jordan Hutchings, and Dongxiao Zhang for excellent research assistance. We are especially grateful to C. Jessica E. Metcalf and an anonymous referee for detailed and insightful comments, which considerable improved the paper. All mistakes are our own.

↵1 See Courtemanche et al. (2020), Hsiang et al. (2020), Pei, Kandula, and Shaman (2020), Abouk and Heydari (2020), and Wright et al. (2020).

↵2 The behavior accounts for the other half. This is in line with theoretical study by Gitmez, Sonin, and Wright (2020) that investigates the role of private behavior and negative external effects for individual decisions over policy compliance as well as information acquisition during pandemics.

↵3 This null hypothesis can be generated by looking at the meta-evidence from RCTs on the efficacy of masks in preventing other respiratory cold-like deceases. Falsely rejecting this null is costly in terms of potential loss of life, and so it is a reasonable null choice for the mask policy from decision-theoretic point of view.

↵4 We refer the reader to Avery et al. (2020) for a comprehensive review of a larger body of work researching Covid-19; here we focus on few quintessential comparisons on our work with other works that we are aware of.

↵5 Using a synthetic control approach, Cho (2020) finds that the cases would have been lower by 75 percent had Sweden adopted stricter lockdown policies.

↵6 Specifically, they find that of the 60 percentage point drop in workplace intensity, 40 percentage points can be explained by changes in information as proxied by case numbers, while roughly 8 percentage points can be explained by policy changes.

↵7 See Atkeson (2020b) and Stock (2020a) for the implications of the SIR model for Covid-19 in the US. Fernández-Villaverde and Jones (2020) estimate a SIRD model in which time-varying reproduction numbers depend on the daily deaths to capture feedback from daily deaths to future behavior and infections.

↵8 The virus remains viable in the air for several hours, for which surgical masks may be effective. Also, a substantial fraction of individual who are infected become infectious before showing symptom onset.

↵9 Lee et al. (2020a) find evidence that viral loads in asymptomatic patients are similar to those in symptomatic patients. Aerosol transmission of viruses may occur through aerosols particles released during breathing and speaking by asymptomatic infected individuals; masks reduce such airborne transmission (Prather, Wang, and Schooley, 2020). An_nrud et al. (2020) provide visual evidence of speech-generated droplet as well as the effectiveness of cloth masks to reduce the emission of droplets. Chu et al. (2020) conduct a meta-analysis of observational studies on transmission of the viruses that cause COVID-19 and related diseases and find the effectiveness of mask use for reducing transmission.

↵10 Whether wearing masks creates a false sense of security and leads to decrease in social distancing is also a hotly debated topic. A randomized field experiment in Berlin, Germany, conducted by Seres et al. (2020) finds that wearing masks actually increases social distancing, providing no evidence that mandatory masks leads to decrease in social distancing.

↵11 Miyazawa and Kaneko (2020) find that country’s COVID-19 death rates are negatively associated with mask wearing rates.

↵12 Our study was first released in ArXiv on May 28, 2020 whereas Mitze et al. (2020) was released at SSRN on June 8, 2020.

↵13 Adda (2016) analyzes the effect of policies reducing interpersonal contacts such as school closures or the closure of public transportation networks on the spread of inuenza, gastroenteritis, and chickenpox using high frequency data from France.

↵14 The father and son, P. Wright (economist) and S. Wright (geneticist) collaborated to develop structural equation models and causal path diagrams; P. Wright’s key work represented supply-demand system as a directed acyclical graph and established its identification using exclusion restrictions on instrumental variables. We view our work as following this classical tradition.

↵15 Under some additional independence conditions, this can be replaced by an arbitrary non-additive function , such that the unconfoundedness condition stated in the next footnote holds.

↵16 An alternative useful starting point is to impose the Rosenbaum-Rubin type unconfoundedness condition: which imply, with treating stochastic errors as independent additive components, the orthogonal conditions stated above. The same unconfoundedness restrictions can be formulated using formal causal DAGs, and also imply orthogonality restrictions stated above, once stochastic errors are modeled as independent additive components.

↵17 The structural equations of this form are connected to triangular structural equation models, appearing in microeconometrics and macroeconometrics (SVARs), going back to the work of Strotz and Wold (1960).

↵18 Our empirical analysis also considers a specification in which information variables include lagged national cases/deaths as well as lagged behavior variables.

↵19 The general formula for

*I*_{i,t}_{−1}is , where*S*is the initial condition, the log of new cases at time −_{i,t,ℓ}*t*mod*ℓ*.↵20 This construction is not as efficient as generalized method of moments but has a nicer interpretation under possible misspecification of the model: we are combining predictions from two models, one motivated via the causal path, reflecting contextual knowledge, and another from a “reduced form” model not exploiting the path. The combined estimator can improve on precision of either estimator.

↵21 It is possible to consider counterfactual exercises in which policy responds to information through the policy equation if we are interested in endogenous policy responses to information. Counterfactual experiments with endogenous government policy would be important, for example, to understand the issues related to the lagged response of government policies to higher infection rates due to incomplete information.

↵22 Furthermore, we conducted several stability checks, for example, checking if the coefficients on mask policies remain stable (reported in the previous version) and also looking at more recent data during reopening, beyond early pandemic, to examine the stability of the model.

↵23 For unconditional counterfactual, we need to make assumptions about the evolution of stochastic shocks appearing in

*Y*. See, e.g, previous versions of our paper in ArXiv, where unconditional counterfactuals were calculated by assuming stochastic shocks are i.i.d. and resampling them from the empirical distribution. The differences between conditional and unconditional contrasts were small in our empirical analysis._{it}↵24 To check sensitivity to this assumption we performed robustness checks, where we used the further lag of Δlog(

*T*) as a proxy for exogenous change in the testing rate, and we also used that as an instrument for Δlog(_{it}*T*); this did not affect the results on policy effects, although the instrument was not sufficiently strong._{it}↵25 The other two measures are “Residential” and “Parks.” We drop “Residential” because it is highly correlated with “Workplaces” and “Retail & recreation” at correlation coefficients of -0.98 and -0.97, respectively. We also drop “Parks” because it does not have clear implication on the spread of Covid-19.

↵26 Similar to our funding, Kovacs, Dunaiski, and Tukiainen (2020) find no evidence that introduction of compulsory face mask policies affect community mobility in Germany.

↵27 Analyzing the New York City’s subway ridership, Harris (2020) finds a strong link between public transit and spread of infection.

↵28 Rewrite a regression specification after omitting other variables as

*B*=_{it}*γ*_{1}Δ log Δ*C*+_{it}*γ*_{2}log Δ*C*= (_{it}*γ*_{1}+*γ*_{2}) logΔCit −*γ*_{1}logΔ*C*_{i}_{;}_{t}_{−7}. In columns (1)-(4) of Table 3(A), the estimated values of both (*γ*_{1}+*γ*_{2}) and −*γ*_{1}are negative except for grocery. This suggests that a higher level of confirmed cases reduces people’s mobility in workplaces, retails, and transit. For grocery, the positive estimated coefficient of (*γ*_{1}+*γ*_{2}) may reect stock-piling behavior in early pandemic periods.↵29 We take 14 and 21 day lags of mask policies for case and death growths, respectively, to identify the states with a mask mandate because policies affect cases and deaths with time lags. See our discussion in the Appendix A.6.

↵30 As we review in the Appendix A.6, a lag length of 14 days between exposure and case reporting, as well as a lag length of 21 days between exposure and deaths, is broadly consistent with currently available evidence. Section 4 provides a sensitivity analysis under different timing assumptions.

↵31 Month dummies also represent the latent information that is not fully captured by the past cases and growths.

↵32 Note that we are

*not evaluating the effect of universal*mask-wearing for the public but that of mask-wearing for employees. The effect of*universal*mask-wearing for the public could be larger if people comply with such a policy measure. Tian et al. (2020) considered a model in which mask wearing reduces the reproduction number by a factor (1 −*e pm*)^{2}, where*e*is the efficacy of trapping viral particles inside the mask and*pm*is the percentage of mask-wearing population. Given an estimate of*R*_{0}= 2:4, Howard et al. (2020) argue that 50% mask usage and a 50% mask efficacy level would reduce the reproduction number from 2.4 to 1.35, an order of magnitude impact.↵33 These are computed from estimating the specification in columns (1)-(4) of Table 3 and column (1) of Table 4 but imposing that the coefficient of masks for employees is zero in (BPI→Y) and that the coefficient of business closure policies is zero in (PI→B).

↵34 The survey is conducted online by YouGov and is based on the interviews of 89,347 US adults aged 18 and over between March 26-April 29, 2020. The survey question is “Which, if any, of the following measures have you taken in the past 2 weeks to protect yourself from the Coronavirus (COVID-19)?”.

↵35 This alternative timing assumption is motivated by the lower bound estimate of median times from exposure to case confirmation or death reporting in Table 2 of https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html, which are based on data received by CDC through June 29, 2020. We thank C. Jessica E. Metcalf for recommending us to do a sensitivity analysis on the timing assumption while suggesting this reference to us.

↵36 Results (not shown) using the unconstrained estimates in Table 6 are very similar.

↵37 We feel this is a plausible counterfactual policy. Many states began implement business restrictions and school closures in mid March. In a paper made publicly available on April 1st, Abaluck et al. (2020) argued for mask usage based on comparisons between countries with and without pre-existing norms of widespread mask usage.

↵38 This comparison is somewhat sensitive to how you handle negative and zero cases when taking logs. Here, we replaced log(0) with −1. In our main results, we work with weekly new cases, which are very rarely zero.

↵39 Alternatively, we may form the cross-over jackknife corrected estimator as , where denotes the fixed effect estimator using the subpanel

**S**for_{j}*j*= 1; 2. In our empirical analysis, these two cross-over jackknife bias corrected estimators give similar result.