Characterizing the incidence of adverse events of special interest for COVID-19 vaccines across eight countries: a multinational network cohort study

Summary Background As large-scale immunization programs against COVID-19 proceed around the world, safety signals will emerge that need rapid evaluation. We report population-based, age- and sex-specific background incidence rates of potential adverse events of special interest (AESI) in eight countries using thirteen databases. Methods This multi-national network cohort study included eight electronic medical record and five administrative claims databases from Australia, France, Germany, Japan, Netherlands, Spain, the United Kingdom, and the United States, mapped to a common data model. People observed for at least 365 days before 1 January 2017, 2018, or 2019 were included. We based study outcomes on lists published by regulators: acute myocardial infarction, anaphylaxis, appendicitis, Bell’s palsy, deep vein thrombosis, disseminated intravascular coagulation, encephalomyelitis, Guillain-Barre syndrome, hemorrhagic and non-hemorrhagic stroke, immune thrombocytopenia, myocarditis/pericarditis, narcolepsy, pulmonary embolism, and transverse myelitis. We calculated incidence rates stratified by age, sex, and database. We pooled rates across databases using random effects meta-analyses. We classified meta-analytic estimates into Council of International Organizations of Medical Sciences categories: very common, common, uncommon, rare, or very rare. Findings We analysed 126,661,070 people. Rates varied greatly between databases and by age and sex. Some AESI (e.g., myocardial infarction, Guillain-Barre syndrome) increased with age, while others (e.g., anaphylaxis, appendicitis) were more common in young people. As a result, AESI were classified differently according to age. For example, myocardial infarction was very rare in children, rare in women aged 35–54 years, uncommon in men and women aged 55–84 years, and common in those aged ≥85 years. Interpretation We report robust baseline rates of prioritised AESI across 13 databases. Age, sex, and variation between databases should be considered if background AESI rates are compared to event rates observed with COVID-19 vaccines.


Summary (275 words)
Background As large-scale immunization programs against COVID-19 proceed around the world, safety signals will emerge that need rapid evaluation. We report population-based, age-and sexspecific background incidence rates of potential adverse events of special interest (AESI) in eight countries using thirteen databases.

Methods
This multi-national network cohort study included eight electronic medical record and five administrative claims databases from Australia, France, Germany, Japan, Netherlands, Spain, the United Kingdom, and the United States, mapped to a common data model. People observed for at least 365 days before 1 January 2017, 2018, or 2019 were included. We based study outcomes on lists published by regulators: acute myocardial infarction, anaphylaxis, appendicitis, Bell's palsy, deep vein thrombosis, disseminated intravascular coagulation, encephalomyelitis, Guillain-Barre syndrome, hemorrhagic and non-hemorrhagic stroke, immune thrombocytopenia, myocarditis/pericarditis, narcolepsy, pulmonary embolism, and transverse myelitis. We calculated incidence rates stratified by age, sex, and database. We pooled rates across databases using random effects meta-analyses. We classified metaanalytic estimates into Council of International Organizations of Medical Sciences categories: very common, common, uncommon, rare, or very rare.

Findings
We analysed 126,661,070 people. Rates varied greatly between databases and by age and sex. Some AESI (e.g., myocardial infarction, Guillain-Barre syndrome) increased with age, while others (e.g., anaphylaxis, appendicitis) were more common in young people. As a result, AESI were classified differently according to age. For example, myocardial infarction was very rare in children, rare in women aged 35-54 years, uncommon in men and women aged 55-84 years, and common in those aged ≥85 years.

Introduction
On 11 March 2020, the World Health Organization (WHO) declared the outbreak of COVID-19, caused by the SARS-CoV-2 virus, a global pandemic. As of March 2021, over 100 million confirmed cases and 2·7 million deaths have been reported worldwide. 1 Vaccines for COVID- 19 have been developed at unprecedented speed, with phase 3 clinical efficacy trials reporting resulted for some vaccines less than a year after the WHO declared the pandemic. Several vaccines have been authorised by regulators since December 2020, such as the European Medicines Agency (EMA), the Food and Drug Administration (FDA) in the US, and the UK Medicines and Healthcare Products Regulatory Agency (MHRA). Large-scale immunization programs are ongoing worldwide.
While this scientific achievement should be celebrated, we must recognize that emergency use is accompanied by residual uncertainty about the safety and effectiveness of vaccines in all populations of interest. As with all medical products reaching the milestone of regulatory authorization, vaccine safety must continue to be monitored to complement what was initially learned during clinical development. Spontaneous adverse event reporting has served as a foundational component of post-approval pharmacovigilance activities to ensure the safe and appropriate use of medical products. Observational healthcare data captured during the routine course of clinical care, such as electronic health records and administrative claims, can augment pharmacovigilance by providing real-world context about potential adverse events and their rates in populations of interest. Background rates of adverse events have historically played an important role in monitoring the safety of vaccines by serving as a baseline comparator for observed rates among those vaccinated. 2,3 Each new vaccine has potential adverse events of special interest (AESI) that warrant focused evaluation, based on historical precedent set by prior vaccines and knowledge acquired during its development.
As COVID-19 vaccines have received authorization for emergency use, regulatory agencies around the world have been preparing safety surveillance strategies. The US FDA Center for Biologics Evaluation and Research published a protocol on "Background Rates of Adverse Events of Special Interest for COVID-19 Vaccine Safety Monitoring." 4 The European Medicines Agency-funded vACCine covid-19 monitoring readinESS (ACCESS) project also included estimation of background AESI rates in their protocol. 5 Recognizing the global burden that COVID-19 represents and following the WHO Council for International Organizations of Medical Sciences (CIOMS) guidance that the most valid data for comparison in a particular area are the background rates from the local population, 6 the Observational Health Data Sciences and Informatics (OHDSI) community collaborated to design and execute an international open science study to characterize background rates of COVID-19 AESI. In this paper, we provide descriptive epidemiology context for COVID-19 AESIs using observational data from Australia, France, Germany, Japan, Netherlands, Spain, the UK, and the US.

Study design
A multinational, multi-database population-based network cohort study.

Data sources
We included thirteen databases from eight countries, of which eight were electronic health record data sources and five were administrative claims data sources. A detailed description of the databases can be found in Appendix Table 1.
All datasets were previously mapped to the Observational Medical Outcomes Partnership common data model, which is maintained by the Observational Health Data Sciences and Informatics (OHDSI) network. The analysis code could therefore be distributed across all contributing centers without sharing patient-level data. 7,8 Population/study participants In our primary analysis, we defined the target at-risk population as people who were observed on 1 January 2017, 1 January 2018, or 1 January 2019 and were observed for at least 365 days before this observation date. We defined 1 January each year as the index date.

Events of interest
The events of interest in this study were the AESIs that might need evaluation following COVID-19 vaccination. This outcome list was developed primarily based on the "Background Rates of Adverse Events of Special Interest for COVID-19 Vaccine Safety Monitoring" protocol published by the FDA Center for Biologics Evaluation and Research, the prioritiszd COVID-19 vaccine AESI list by the Brighton Collaboration, and other previous studies. 4,9 Fifteen events were included: non-hemorrhagic stroke, hemorrhagic stroke, acute myocardial infarction, deep vein thrombosis, pulmonary embolism, anaphylaxis, Bell's palsy, myocarditis/pericarditis, narcolepsy, appendicitis, immune thrombocytopenia, disseminated intravascular coagulation, encephalomyelitis (including acute disseminated encephalomyelitis), Guillain-Barre syndrome, and transverse myelitis. 4 Events were identified by condition occurrence records (e.g., diagnosis codes from claims or diagnosis codes and problem lists from electronic health records). Encephalomyelitis, nonhemorrhagic stroke, hemorrhagic stroke, and acute myocardial infarction definitions also required the record to occur within an inpatient setting (any position), while the Guillain-Barre syndrome definition required the condition to be recorded in an inpatient setting in the primary position.
Qualifying events could not be previously observed in a 'clean window' period before the index date. In keeping with the FDA protocol, we applied a clean window of 365 days for all events except anaphylaxis (30 days) and facial nerve palsy and encephalomyelitis (183 days). 4 The full specifications of all phenotype definitions, including source codes and standard . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
As the CPRD-GOLD (UK), IQVIA (France, Germany, and Australia), and IPCI (the Netherlands) databases only included primary care data, we did not use them for events whose definition required an inpatient diagnosis.

Analysis
We defined the time-at-risk as a 365-day period following the index date. People contributed time-at-risk from 1 January to 31 December for each qualifying year in 2017 to 2019, but time was censored during the clean window following an event and at the end of a person's observation period. One person could contribute more than one event, with outcome-specific pre-specified clean periods of 30 to 365 days used to avoid duplicate counts.
Incidence rates were estimated as the total number of events divided by the person-time at risk per 100,000 person-years. We calculated the age-by-sex specific incidence rates in each database and report all rates where the event counts exceeded a minimum cell count of 5. Age was calculated as year of index date minus year of birth and was partitioned into eight mutually exclusive age groups: 1-5 years old, 6-17, 18-35, 36-55, 56-64, 65-74, 75-84, and 85 years and older. We then pooled age-sex specific rates of each AESI across all databases using random-effects models with the DerSimonian-Laird method to estimate between-database variance. 10 We estimated 95% predicted intervals (95% confidence intervals) using the R package 'meta'. 11 Meta-analytic age and sex-specific rates were classified using the World Health Organization Council for International Organizations of Medical Sciences (CIOMS) thresholds: very common (≥1/10), common (<1/10 to ≥1/100), uncommon (<1/100 to ≥1/1,000), rare (<1/1,000 to ≥1/10,000), and very rare (<1/10,000). 12 All statistical analyses were performed in R software. 13 The study protocol and analysis code are available at: https://github.com/ohdsistudies/Covid19VaccineAesiIncidenceCharacterization

Ethical approval
The protocol for this research was approved by the Independent Scientific Advisory Committee (ISAC) for MHRA Database Research (protocol number 20_000211), the IDIAPJGol Clinical Research Ethics Committee (project code: 21/007-PCV), the IPCI governance board (application number 3/2021), and the Columbia University Institutional Review Board (AAAO7805).

Results
From 13 databases, 126,661,070 people contributed 227,043,370 person-years of follow-up. Each database captured important different population demographics, as shown in Table 1, and collectively represented all age and sex subgroups from eight countries. Table 2 summarises the incidence rates of the 15 AESIs, stratified by age and sex, based on prediction intervals from a meta-analysis of the database estimates. Each age/sex subgroup was also classified using the CIOMS adverse event frequency system (very common, common, uncommon, rare, or very rare). The incidence of several outcomes varied substantially by age. For example, acute myocardial infarction was very rare (<1/10,000) in women under 35 years, rare (1/1,000 to 1/10,000) in women aged 35 to 54 years, uncommon (1/100 to 1/1,000) in both men and women aged 55 to 84 years, and common (1/10 to 1/100) in men and women aged 85 years and older. For both men and women, deep vein thrombosis was rare in those . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 17, 2021. ; under 18 years, uncommon in those aged 35 to 84 years, and common in those 85 years and older. Women aged 18 to 34 years had a higher rate of deep vein thrombosis than men of the same age.
The AESIs studied spanned the continuum of possible frequencies. Non-hemorrhagic stroke, acute myocardial infarction, and deep vein thrombosis were all common in those aged 85 years and older and uncommon in those aged 55 to 74 years. Anaphylaxis, Bell's palsy, appendicitis, and immune thrombocytopenia were largely rare in all age groups, although appendicitis was uncommon in those aged 6 to 34 years. Guillain-Barre syndrome and transverse myelitis were very rare in nearly all subgroups.
The prediction intervals for each age-sex subgroup were notably wide, reflecting the substantial population-level heterogeneity observed across sources. Figure 1 shows the database-specific incidence rates that were used to calculate the meta-analytic estimates. The rates recorded for deep vein thrombosis highlight this variation. For women aged 35 to 54 years, the incidence rates ranged from 159/100,000 person-years in Spain (SIDIAP) to 866/100,000 person-years in the US (MDCD). We obtained 13 database estimates for the incidence in women aged 65 to 74 years, ranging from 387/100,000 to 1,443/100,000 personyears. Eight databases gave rates under 650/100,000 person-years (CPRD-GOLD in the UK; CUIMC in the US; IPCI in the Netherlands; IQVIA in Australia, France, and Germany; JMDC in Japan; and SIDIAP-H in Spain), while three databases gave rates more than twice as high, at over 1,300/100,000 person-years (MDCD, MDCR, and OPTUM-SES in the US). Among women aged 75 to 84 years, the lowest incidence rate was 585/100,000 person-years (CPRD-GOLD in the UK) and the highest was 2,167/100,000 person-years (MDCR in the US). We could not find any consistent patterns to explain which databases yielded higher or lower rates across outcomes. Appendix Table 4 lists all database-specific incidence rates.

Discussion
To our knowledge, this is the largest study to date on the descriptive epidemiology of the AESIs prioritised for post-marketing surveillance of COVID-19 vaccines. We report background rates of deep vein thrombosis, pulmonary embolism, stroke, immune thrombocytopenia, and disseminated intravascular coagulation, which are particularly relevance for COVID-19 vaccines given the observed effects of the SARS-COV-2 virus on coagulopathy. [14][15][16][17] Our study provides a comprehensive, detailed assessment of the incidence rates of 15 AESIs across 13 databases, eight countries, and four continents. We found considerable heterogeneity between geographies and databases, suggesting that caution is needed when interpreting the difference between any observed and expected rates. We observed considerable variability with age and sex, emphasizing the need for standardization if background rates are used for surveillance purposes.
Many countries and organizations, such as the US and EU, have passive spontaneous adverse event reporting systems, such as the Vaccine Adverse Event Reporting System (VAERS). However, these systems cannot be used to calculate epidemiological estimates of the burden of disease, including incidence rates, because they lack denominators, which are the total number of persons or person-times being observed. 18,19 Background incidence rates have been used to estimate the expected number of events in the general population. 3,19 They are often obtained from the literature, but methods, case and population definitions, data, and calendar time can vary across publications. In contrast, we calculated all of the presented estimates using precisely the same setting and common analysis procedures, phenotyping algorithms, and data model. However, we still found substantial heterogeneity, suggesting that using a single overall estimate inadequately represents the true uncertainty around event . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
Population-level heterogeneity across data sources was substantial for all events, even after standardizing outcome definitions and stratifying by age and sex. While potentially concerning when seen together in one analysis, our findings are consistent with what has been observed in the literature. For example, in our study, the meta-analysis estimated incidence rates of transverse myelitis ranging from 1 to 4 per 100,000 person-years depend on age-sex strata. Previous studies have reported overall incidence rates of transverse myelitis ranging from 0·4 to 4·6 per 100,000 person-years. 19,20 The rates of Bell's palsy among those over 65 years old has ranged from 4·6 per 100,000 person-years in European data to 174 per 100,000 personyears in US data. 21,22 Rates of narcolepsy from 1 to over 30 per 100,000 person-years have been reported among those aged 25 to 44 years. 21,22 Recently reported data from the ACCESS project also showed heterogeneity in background rates. 5 This heterogeneity needs to be taken into account when comparing rates across populations.
Most of the studied outcomes also had considerable within-source patient-level heterogeneity that followed age and sex patterns. For example, rates of cardiovascular diseases such as acute myocardial infarction, hemorrhagic and non-hemorrhagic stroke, deep vein thrombosis, and pulmonary embolism increased with age. The incidence of Guillain-Barre syndrome and Bell's palsy also increased with age. Narcolepsy and appendicitis were more common in younger populations. The patterns observed in our study were generally comparable with previous reports. 2,3,21-25 Stratification by age and sex or standardization are likely to be useful analytic strategies to reduce confounding when comparing incidence rates across populations. However, the magnitude of heterogeneity across sources within age-sex subgroups suggest that residual patient-level differences will remain, including the differences in the distributions of other risk factors such as comorbidities and medication use.
Comparing published results can be limited by differing study methodology, including the time-at-risk definition, study period, event definitions, population coverage, calendar year, and geographic location. 26 Different subgroup definitions also make direct comparison difficult. For example, previous studies of Guillain-Barre syndrome have used age strata that do not fully overlap with each other. 2,5,21,25,27,28 In contrast, because we applied the same definitions, data model, and analysis to all of the databases within our study, the heterogeneity observed cannot be attributed to analysis variability. Instead, it may have been due to differences in the underlying populations, healthcare systems, and data capture processes. Although some variability may have been due to systematic error, selection bias or differential outcome measurement error between databases, some may reflect true population differences such as socioeconomic status and comorbidities.
As we observed notable differences between age, sex, and databases, caution should be exercised when using incidence rates as a basis for comparison. Comparisons between incidence rates from different sources may be subject to substantial systematic error. We observed large variations between electronic health records and claims data sources when using the same analysis and outcome definitions. Comparisons with rates derived from randomised trials or spontaneous reporting data may have even greater variability. If observational databases are to be used to inform safety surveillance activities, withindatabase analyses (such as self-controlled case designs or propensity-score adjusted comparative cohort designs) may help reduce study bias for any given comparison. Demonstrating consistent effects across databases may further strengthen confidence in results. If observational data are used to derive historical 'expected' rates and compared against observed rates of events from another source, then the uncertainty in the background rate must be appropriately integrated to avoid misleading conclusions.
Our large number of participating databases, geographical coverage, and sizable study . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 17, 2021. ; population allowed us to provide a comprehensive assessment of the background incidence rates across different healthcare systems and regions across the globe. Our study took advantage of the Observational Medical Outcomes Partnership common data model, which allowed us to use the same study design and analytical code and to gather results from participating data partners rapidly and without transferring patient-level data. All outcome definitions, clinical codes and phenotype algorithms have been made open source and are available online for review and to maximise reproducibility and reuse.
The primary limitation of this study is that all outcomes may have been subject to measurement error. As the outcome definitions were based on the presence of specific diagnostic codes and were not validated further, they may have had imperfect sensitivity or specificity. The analysis relied on data from 2017 to 2019 using a target population of all people in each database with >365 days of observation indexed on 1 January, 365 days' timeat-risk, and outcome-specific clean windows to allow for recurrent events. The impact of these design decisions should be explored further.

Conclusion
This study comprehensively assessed the descriptive epidemiology of potential AESIs for COVID-19 vaccines. It highlighted the wide range of adverse effects being monitored, from very rare neurological disorders to more common thromboembolic conditions. We reported large variations in the observable rates of AESI by age group and sex, demonstrating the need to account for stratification or standardization before using background rates for safety surveillance. We also found significant population-level heterogeneity in AESI rates between databases, implying that individual study estimates should be interpreted with caution and systematic error associated with database choice should be incorporated into any analysis. These background rates should provide useful real-world context to inform public health efforts aimed at ensuring patient safety while promoting the appropriate use of vaccines worldwide.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)   is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 17, 2021. ; https://doi.org/10.1101/2021.03.25.21254315 doi: medRxiv preprint Contributors DAP, XL, PR, and GH conceived the idea of this study. XL, AO, GH, PR, and DAP were responsible for interpreting the results and writing the manuscript. RM, AZ, GR, and AS were responsible for implementing the study. XL, AO, PR, RM, KV, PRR, TDS, and MAS Contributed to the study execution (Data holders). XL, AO, GH, and PR Contributed to the study design. All the co-authors contributed to writing the manuscript. All authors approved the final version and had final responsibility for the decision to submit for publication.

Declaration of interests
DPA's research group has received research grants from the European Medicines Agency, from the Innovative Medicines Initiative, from Amgen, Chiesi, and from UCB Biopharma; and consultancy or speaker fees from Astellas, Amgen and UCB Biopharma. GH and AO receive funding from the US National Institutes of Health and the US Food and Drug Administration. KV and PR work for a research group who receives/received unconditional research grants from Yamanouchi, Pfizer-Boehringer Ingelheim, Novartis, GSK, Amgen, Chiesi none of which relates to the content of this paper. PBR, RM, AS, GR, AGS are employee of Janssen Research and Development, and shareholder in Johnson & Johnson. MAS receives grants and contracts from the US Food & Drug Administration and the US Department of Veterans Affairs within the scope of this research, and grants and contracts from the US National Institutes of Health, IQVIA and Private Health Management outside the scope of this research. Funders had no role in the conceptualization, design, data collection, analysis, decision to publish nor preparation of the manuscript. TDS has no conflicts of interest to declare. EM has no conflicts of interest to declare.
DPA receives funding from NIHR in the form of a Senior Research Fellowship and the Oxford NIHR Biomedical Research Centre. XL receives funding from the Clarendon Fund (University of Oxford) to support her DPhil study.
We acknowledge English language editing by Dr Jennifer A. de Beyer, Centre for Statistics in Medicine, University of Oxford.

Data sharing
Patient-level data cannot be shared without approval from data custodians due to local information governance and data protection regulations. Aggregated data, analytical code, and detailed definitions of algorithms for identifying the events are available in a GitHub repository (https://github.com/ohdsi-studies/Covid19VaccineAesiIncidenceCharacterization).

Supplement materials
Appendix Table 1: database information Appendix Table 2: adverse events of special interest phenotype definition and links Appendix Table 3: source clinical codes used in phenotype definition Appendix Table 4: age-sex specific crude incidence rates for each events of each database . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 17, 2021. ; https://doi.org/10.1101/2021.03.25.21254315 doi: medRxiv preprint