COVID AMP: An Open Access Database of COVID-19 Response Policies

As the COVID-19 pandemic unfolded in the spring of 2020, governments around the world began to implement policies to mitigate and manage the outbreak. Significant research efforts were deployed to track and analyse these policies in real-time to better inform the response. While much of the policy analysis focused narrowly on social distancing measures designed to slow the spread of disease, here, we present a dataset focused on capturing the breadth of policy types implemented by jurisdictions globally across the whole-of-government. COVID Analysis and Mapping of Policies (COVID AMP) includes nearly 50,000 policy measures across 152 countries, 124 intermediate areas, and 235 local areas between January 2020 and June 2022. With up to 40 structured and unstructured metadata fields per policy, as well as the original source and policy text, this dataset provides a uniquely broad capture of the governance strategies for pandemic response, serving as a critical data source for future work in legal epidemiology and political science.


Background & Summary
In response to the COVID-19 pandemic, governments around the world implemented a range of policies, regulations, and mandates to mitigate transmission, support the economy, and protect population health. Despite targeting similar goals, governments differed significantly in their policy strategies for the pandemic response, in part because of a dearth of prior policy evidence, an evolving understanding of which governmental actions might effectively protect populations, differing access to the resources required for specific policy actions, and mismatched expectations regarding adherence to stringent policies.

Most policy trackers deployed during the pandemic focused on social distancing measures, with an emphasis on assessing the effectiveness of these policies in limiting human movement, human-human interaction, and disease spread as quantified by reported cases, hospitalizations, and fatalities for different populations and subpopulations.[1] These efforts were critical in capturing data for rapid analysis of the relative value of different social distancing measures, but they did not capture the full breadth of policy measures implemented. This limited policymakers' ability to assess the impact of these "non-health" policies and the synergistic effects of a more integrated approach to pandemic response.[2] [...] management.

Each row in the dataset represents an individual directive, linked by a unique identifier to the original policy document and coded by the type of policy and the target of the policy, defined as the primary population, location, or entities impacted by the policy or law, along with more than 40 additional metadata fields per directive.

Event response and event-specific mitigation efforts are only one subset of the policies needed to effectively manage and respond to large-scale emergencies.[4] The National Response Framework (NRF) in the U.S. lists 12 emergency support functions and leans on a whole-of-government response framework to manage critical functions across transportation, military authorities, manufacturing, supply-chain management, first-responder housing, cross-border licensing issues for critical response personnel (e.g., nurses, electrical linemen), housing authorities, and economic support for those impacted.[5] Building on this cross-sector approach, previously identified and applied in the Georgetown Outbreak Activity Library (https://outbreaklibrary.org/), we identified five categories of policy relevant to the COVID-19 outbreak: (1) Social distancing, (2) Emergency declarations, (3) Travel restrictions, (4) Enabling and relief measures, and (5) Support for public health and clinical capacity. Over the course of data collection, five additional categories emerged from policy analysis and were added to the coding scheme to more accurately capture the range of policy actions available and to pull forward specific types of policies as they gained global traction (e.g., variation in vaccination policies): (6) Face mask, (7) Contact tracing and testing, (8) Military mobilization, (9) Authorization and enforcement, and (10) Vaccinations.

In addition, 71 subcategories were used to capture the type of policy action at a more granular level. For example, the social distancing category is composed of subcategories such as "Curfews", "Event delays or cancellations", "Alternative election measures", "Private sector closures", or "Stay at home." As the pandemic unfolded, the policies implemented by governments to manage the response and mitigate impacts evolved. Categories, subcategories, and targets were therefore adjusted over time to make the taxonomy of the dataset as exhaustive and usable as possible for secondary analysis. All updates to categories or subcategories were made by consensus of the research team and applied retroactively across existing data to ensure internal consistency. For a full Data Dictionary, see Appendix 1: Table 1.

Comparison to other COVID-19 policy trackers
The COVID-19 pandemic prompted over 200 research and government initiatives aimed at tracking the policies and measures implemented in response to the outbreak.[1] Given the extensive nature of these efforts, a comprehensive evaluation of each one is beyond the scope of this article. However, we summarize the key features of COVID AMP in the context of similar datasets, including OxCGRT [6] and CoronaNet [7], to underscore the unique contributions of this effort.

One of the most significant contributions of the COVID AMP dataset is the breadth of data collected over geography and time. CoronaNet shares a similarly broad scope, identifying 20 "broad policy types" including NPIs, declarations of emergency, travel restrictions, health communication, and some public and private restrictions.[7] OxCGRT has a more limited scope, with 19 indicators focused more specifically on containment, health, and economic support policies.[6] In the COVID AMP ontology, these "broad policy types" and "indicators" are equivalent to our concept of "Policy subcategory", for which we capture 71 unique policy types.

Importantly, in contrast to OxCGRT, we do not assign quantitative values to interpret policy stringency, but instead classify policies as either restricting or relaxing based on the intended effect of the directive on the policy environment at the time of enactment. In doing so, the dataset does not use static quantitative measures of implementation impact, but instead captures a breadth of policy types in the context in which they were issued. This is a marked divergence from other policy trackers, which rely on metrics defining the degree of restrictiveness rather than differentiating between restricting and relaxing policies.[6,7] Therefore, the COVID AMP dataset allows for in-depth analysis not only of the lockdown in response to COVID-19, but also of how reopening occurred across jurisdictions.

In addition to the type of policy captured, COVID AMP captures geographic, sector, and demographic targets for a more descriptive approach to policy. In OxCGRT, these variables are binary for each of the indicators, simply designating "targeted" or "general."[6] In CoronaNet, the geographic target can be specified with free text, and the demographic target aligns with 11 broad demographic targets or 25 special demographic targets; however, it does not capture sector targets.[7] In contrast to these databases, COVID AMP uses the terms "authorizing areas" and "affected areas" to define conditions in which a policy issued by one level of government applies to another geography. For example, the United Kingdom's travel restriction after the Omicron variant was identified in December 2021 targeted six countries.[8] In COVID AMP, this policy would be a single row where the United Kingdom would be the "Authorizing country" and South Africa, Botswana, Lesotho, Eswatini, Zimbabwe, and Namibia would each be listed in the "Affected country" field. Complementing the geographic targets, the "Policy target" field has 73 multi-select options to indicate the populations, sectors, or entities affected by the policy.

With nearly 50,000 policies and 40 metadata fields each, the COVID AMP dataset contains over 2 million pieces of information, all hand-coded without the use of machine learning software. The metadata fields were chosen to ensure a complete historical record of the policies identified and to support broad secondary analysis. To our knowledge, COVID AMP is currently the largest and most descriptive policy tracking dataset by number of policies and metadata fields collected.

COVID AMP's coverage of jurisdictions and dates is similar to that of OxCGRT and CoronaNet. COVID AMP captures policies issued from January 2020 to June 2022, whereas OxCGRT has longer temporal coverage, through December 2022, and CoronaNet has narrower coverage, through October 2021.[13,14] Although OxCGRT and CoronaNet have broader global coverage (184 and 195 countries, respectively) extending to some subnational areas, the COVID AMP dataset captures data from 152 countries, 124 intermediate areas (i.e., states, provinces, etc.), and 235 local areas (i.e., cities, counties, etc.).
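The authorizing/affected distinction described above can be illustrated with a minimal Python sketch. The `PolicyRow` class, its field names, and the identifier `example-1` are hypothetical simplifications for illustration, not the dataset's actual schema; the sketch only shows how one row with a multi-valued "Affected country" field could be expanded into country pairs for network-style analysis.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PolicyRow:
    """Hypothetical, simplified stand-in for a single COVID AMP row."""
    policy_id: str
    authorizing_country: str
    affected_countries: List[str] = field(default_factory=list)


def authorizing_affected_pairs(row: PolicyRow) -> List[Tuple[str, str]]:
    """Expand one row into (authorizing, affected) country pairs."""
    return [(row.authorizing_country, affected) for affected in row.affected_countries]


# The UK travel restriction example from the text, stored as a single row.
uk_travel_restriction = PolicyRow(
    policy_id="example-1",  # made-up identifier, for illustration only
    authorizing_country="United Kingdom",
    affected_countries=[
        "South Africa", "Botswana", "Lesotho",
        "Eswatini", "Zimbabwe", "Namibia",
    ],
)

pairs = authorizing_affected_pairs(uk_travel_restriction)
for authorizing, affected in pairs:
    print(f"{authorizing} -> {affected}")
```

Keeping the policy as one row preserves the link to a single source document, while the expansion step is left to the analyst who needs pairwise relationships.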

Data Records
The COVID AMP library currently contains more than 15,000 individual COVID-19-related documents issued or effective during the period January 2020 to June 2022 (and beyond for select jurisdictions), comprising nearly 50,000 unique policies from over 150 countries. Figure 1 shows the extent of policy data coverage globally and in the U.S.

This database was specifically designed to capture the breadth of measures applied by different jurisdictions to manage and mitigate the pandemic; therefore, each measure was captured as a separate policy target and categorized by the focus or intent of the policy. Thus, each row in the database represents a uniquely identified policy directive, meaning a single policy document may be captured across many rows. For example, an executive order can include a stay-at-home order for individuals and mandate non-essential business closures. These policy directives share a single policy name and PDF, since they are part of the same larger piece of legislation, but they are captured in different rows with different unique IDs. Extensions of previous policies are also captured as a new row and linked back to the previous policy using unique IDs. Each policy directive is tagged with a series of descriptive attributes based on a detailed review of the policy language, including the selection of fields listed in Table 1.

Table 1. [...] The date on which the directive specified in the policy was terminated, replaced, or extended.

A comprehensive Data Dictionary describing each of the 40 metadata fields is available in Appendix 1: Table 1. Figure 2 shows the distribution of policies by category over time, globally and within the U.S. Both globally and in the U.S., the greatest number of policies were initiated in April 2020; "Social distancing" was the predominant category of policies enacted by governments over the course of the pandemic, followed by "Enabling and relief measures" and "Support for public health and clinical capacity."

In conducting analysis, users should expect to see a broad, representative sample of the heterogeneous policies implemented over the course of the COVID-19 pandemic. However, the COVID AMP dataset is not a complete global historical record. To the best of our knowledge, the dataset contains a comprehensive set of policies implemented for each of the U.S. states, Puerto Rico, and Guam from January 2020 to June 2022, with over 19,000 policies captured at the national and state level in these jurisdictions. Researchers also documented many policies for U.S. counties (approximately 8,000) and tribal areas (approximately 1,400), with comprehensive coverage for California, Washington D.C., Maryland, Nevada, and Virginia. Of the additional 150+ countries for which we have collected data, more than 40 have 100+ policies coded, though this does not necessarily imply complete or comprehensive coverage for those countries. For jurisdictions other than the U.S. and its territories, policy coverage is generally better at the national level than at the intermediate area level (e.g., state, province, etc.).

Due to variations in legal systems, the significance of the total number of policies varies by state and country. For example, many entities regularly renewed the emergency authority of the health department or governor, thus renewing the same policies regularly. Policy totals tend to reflect variation in governance structure and method more than the stringency or effectiveness of the policy response.
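The row structure described in this section (one document captured across several rows, and extensions linked back to earlier rows by unique ID) can be sketched in Python as follows. The field names `id`, `policy_name`, `subcategory`, and `prior_policy_id`, and the sample rows themselves, are assumptions made for illustration; the actual column names are given in the Data Dictionary.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical rows: two directives from one executive order, plus a later
# order that extends the stay-at-home directive.
rows = [
    {"id": "P1", "policy_name": "Executive Order A",
     "subcategory": "Stay at home", "prior_policy_id": None},
    {"id": "P2", "policy_name": "Executive Order A",
     "subcategory": "Private sector closures", "prior_policy_id": None},
    {"id": "P3", "policy_name": "Executive Order B",
     "subcategory": "Stay at home", "prior_policy_id": "P1"},
]


def directives_by_document(rows: List[dict]) -> Dict[str, List[str]]:
    """Group row IDs by the policy document they were extracted from."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for row in rows:
        grouped[row["policy_name"]].append(row["id"])
    return dict(grouped)


def extension_chain(rows: List[dict], policy_id: str) -> List[str]:
    """Follow prior-policy links back from a row; return IDs oldest first."""
    by_id = {row["id"]: row for row in rows}
    chain: List[str] = []
    current: Optional[str] = policy_id
    # Stop on a missing link or a repeated ID so malformed data cannot loop.
    while current is not None and current in by_id and current not in chain:
        chain.append(current)
        current = by_id[current]["prior_policy_id"]
    return chain[::-1]


print(directives_by_document(rows))
print(extension_chain(rows, "P3"))
```

Because renewals appear as new rows, counting rows per jurisdiction overstates the number of distinct measures; collapsing each chain to its earliest row is one possible way to count a renewed directive once.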

Technical Validation
Given the frequency and scale of data collection, the research team implemented a combination of manual and automated quality assurance and control (QA/QC) processes to check and correct the data. The manual QA/QC process involved a lead researcher who reviewed data for typographical errors, ensured inter-coder reliability, and confirmed record completion. Completed records included all fields specified in the "Data Records" section; records with missing fields were flagged for review and excluded until corrected. The fields "Anticipated end date" and "Actual end date" were exceptions, as many policies did not specify an intended end date or were ongoing during data collection. Policies for which no end date is provided may still be in place, may be permanent, or may have an end date that was not publicly documented. As of writing, approximately 50% of policies have an "Anticipated end date" and 70% have an "Actual end date."

In addition to manual review, automated QA/QC was applied to clean and standardize the data. Drop-down lists, with easily accessible data definitions and glossaries, were used to standardize coding selections, prevent typos, and reduce discrepancies. The Dedupe extension in Airtable was used to find and manage duplicate records based on policies with identical issued/effective start dates, authorizing/affected areas, and data sources. Where a duplicate was identified, the lead researcher merged the information from the two records, selecting the correct information from each if discrepancies in coding existed. Python (3.7.0) was used to filter incomplete records out of the final dataset view, assign policy numbers, and standardize the format of dates and country names for ease of use in secondary analysis.
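The automated steps described above (filtering incomplete records, standardizing dates, assigning policy numbers, and matching duplicates on shared keys) could look roughly like the following Python sketch. This is not the team's actual script: the field names, the accepted date formats, and the sample records are assumptions, and the simple key match stands in for Airtable's Dedupe extension, which the authors used together with manual merging.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# Assumed field names; end-date fields are deliberately not required.
REQUIRED_FIELDS = ("policy_name", "issued_date", "effective_start_date",
                   "authorizing_area", "affected_area", "data_source")
DEDUPE_KEY = ("issued_date", "effective_start_date",
              "authorizing_area", "affected_area", "data_source")


def is_complete(record: Dict[str, str]) -> bool:
    """A record is complete when every required field is non-empty."""
    return all(record.get(name) not in (None, "") for name in REQUIRED_FIELDS)


def to_iso_date(value: str) -> str:
    """Normalize a date string to YYYY-MM-DD (two assumed input formats)."""
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {value!r}")


def clean(records: List[Dict[str, str]]) -> List[dict]:
    """Drop incomplete records, standardize dates, assign policy numbers."""
    cleaned = []
    for record in records:
        if not is_complete(record):
            continue  # in the paper, such records are flagged for review
        fixed = dict(record)
        for name in ("issued_date", "effective_start_date"):
            fixed[name] = to_iso_date(fixed[name])
        fixed["policy_number"] = len(cleaned) + 1
        cleaned.append(fixed)
    return cleaned


def find_duplicates(records: List[dict]) -> List[Tuple[int, int]]:
    """Return (first, later) policy-number pairs that share the dedupe key."""
    seen: Dict[tuple, int] = {}
    duplicates = []
    for record in records:
        key = tuple(record[name] for name in DEDUPE_KEY)
        if key in seen:
            duplicates.append((seen[key], record["policy_number"]))
        else:
            seen[key] = record["policy_number"]
    return duplicates


base = {"policy_name": "Order A", "authorizing_area": "Maryland",
        "affected_area": "Maryland", "data_source": "https://example.org/a"}
raw = [
    {**base, "issued_date": "03/23/2020", "effective_start_date": "03/23/2020"},
    {**base, "issued_date": "2020-03-23", "effective_start_date": "2020-03-23"},
    {**base, "issued_date": "2020-04-01", "effective_start_date": "2020-04-01",
     "data_source": ""},  # incomplete: missing data source
]
cleaned = clean(raw)
duplicates = find_duplicates(cleaned)
print(len(cleaned), duplicates)
```

Standardizing dates before matching matters here: the first two sample records only share a dedupe key once both dates are in the same format. The sketch only flags candidate pairs, leaving the merge decision to a reviewer, as in the manual process described above.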

Usage Notes
From the implementation of mask mandates to the reduction of prison populations and alternate measures of voting during elections, the COVID AMP library contains a wide array of policy documents that historians, legal experts, economists, and epidemiologists can analyze to assess and compare the effectiveness of COVID-19 outbreak responses around the world. Pairing this tool with epidemiological data supports the evaluation of policy effectiveness and how that success relates to the affected population, authorizing entity, health infrastructure, and other extenuating factors. We also hope that this library will support policymakers in future outbreaks by providing canonical examples of policies from countries and states that had different outcomes during the pandemic.

Example Analysis
The data collected in COVID AMP provide researchers with valuable insights into how policy is used to respond to a global pandemic and can inform the policy response to the next one. Figure 3 offers a visual representation of the progression of restricting and relaxing policy types over time in comparison to caseload for the U.S. We identify two informative patterns about the pandemic response in the U.S. First, despite initial concerns about the Omicron variant, the corresponding response was commensurate with that of the early days of the pandemic, where policies were primarily restrictive and gradually relaxed. Second, Figure 3 shows that restrictive policies typically preceded outbreaks by 1-2 months, highlighting the importance of global early warning systems in curbing transmission. These initial insights emphasize the importance of early and effective policy implementation, and support further use of the COVID AMP dataset to strengthen the evidence base for decision-making.

The granularity of the COVID AMP database allows for disaggregation and analysis by broad category of policy enacted, as shown in the bottom panel of Figure 3. This analysis highlights the need for nuanced policy making that includes both restricting and relaxing measures, not only for policies such as face masks and social distancing, but also for authorization and enforcement policies. This analysis also highlights the degree to which different categories are not clearly structured around restrictions, but instead are focused on a whole-of-government, coordinated response. Using "issued date", "effective start date", "anticipated end date", and "actual end date", as coded within COVID AMP, policymakers can use this type of analysis to evaluate when decisions were made to initially implement a policy, when the policy took effect, and the pattern by which policies were relaxed, renewed, or terminated.

COVID AMP also supports analysis of the intended targets of each policy implemented, as shown in Figure 4. For example, economists could use the data to assess specific policy types, such as "Regulatory relief", to analyse which sectors received which types of support, and compare effects across jurisdictions and sectors. Using the date each policy was issued and became effective, analysts could, for example, assess how stock prices reacted to regulatory relief announcements. For education officials, the COVID AMP data could be combined with school test scores to understand how the timing of school closures, reopening, and distance learning impacted students' educational performance. Public health researchers could use the data to identify a specific population, such as "Homeless shelters and individuals", and determine which policy types were (or were not) targeted toward the population, and whether they met community needs. With the ability to cross-reference policy subcategories and targets, COVID AMP enables researchers from various fields to conduct more nuanced analyses of the impact of policies on their area of interest, whether that is a sector, population, or policy type, and encourages policy innovation for the future.

Published Research Using COVID AMP
As a library of policies collected in near real-time and continuously evolving throughout the pandemic, COVID AMP allows users to identify and access policies of interest in addition to the original text of the policy as a raw text file or PDF stored in an Amazon S3 bucket. These data can then be used to perform secondary data transformation as needed for derivative analysis. The COVID AMP dataset does not impose researcher-side assumptions, such as policy stringency or policy levels, on the data, with the specific intent of supporting broader cross-disciplinary downstream use. The value of this approach is demonstrated by Page-Tan & Corbin (2021), who used COVID AMP data to define unique parameters of restrictiveness to test four different policy scenarios in states and localities with high social vulnerability scores using propensity score matching.[9] Additional studies used COVID AMP to validate parameter assumptions for models about the timing of intervention implementation.[10,11] Others have used COVID AMP to analyse global differences in response strategies to the Omicron variant through a specific focus on travel restrictions [8]; to access archived public health measures from governments, trace the progression of policy, and evaluate the role of institutions [12,13]; and to assess the benefits of mask-wearing at the county level.[14] These studies highlight the ease of use of the database and suggest that significant future work could continue to make use of COVID AMP to ask new questions about the COVID-19 pandemic.
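The restricting-versus-relaxing time series described in the Example Analysis can be approximated with a short aggregation. The Python sketch below is only an illustration under assumed names: the keys `effective_start_date` and `relaxing_or_restricting`, and the four sample policies, are hypothetical and are not drawn from the dataset.

```python
from collections import Counter
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical policies; real rows carry many more fields.
policies = [
    {"effective_start_date": date(2020, 3, 16), "relaxing_or_restricting": "Restricting"},
    {"effective_start_date": date(2020, 3, 30), "relaxing_or_restricting": "Restricting"},
    {"effective_start_date": date(2020, 5, 15), "relaxing_or_restricting": "Relaxing"},
    {"effective_start_date": date(2020, 5, 20), "relaxing_or_restricting": "Restricting"},
]


def monthly_counts(policies: List[dict]) -> Dict[Tuple[str, str], int]:
    """Count policies per (YYYY-MM, restricting/relaxing) pair,
    using the effective start date to assign the month."""
    counts: Counter = Counter()
    for policy in policies:
        month = policy["effective_start_date"].strftime("%Y-%m")
        counts[(month, policy["relaxing_or_restricting"])] += 1
    return dict(counts)


counts = monthly_counts(policies)
for (month, direction), n in sorted(counts.items()):
    print(month, direction, n)
```

Joining these monthly counts to case counts for the same jurisdiction and month would give the kind of comparison described for Figure 3; filtering the input by category first would give the disaggregated view.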

Figures

Fig. 1. Geographic policy data coverage of the COVID AMP database from January 2020 to June 2022. (A) Total number of policies captured for each country. As of date of submission, 152 countries have at least 1 policy coded. (B) Total number of policies captured for each U.S. state. As of date of submission, all 50 states and territories were coded comprehensively.

Fig. 2. (A) Distribution of policies by category and month, globally. (B) Distribution of policies by category and month for the United States. Each policy is assigned to the month of its effective start date.

Table 1. Required data fields and definitions