## Abstract

The individual-level effectiveness of vaccines against clinical disease caused by SARS-CoV-2 is well-established. However, few studies have directly examined the effect of COVID-19 vaccines on transmission. We quantified the effectiveness of vaccination with BNT162b2 (Pfizer-BioNTech mRNA-based vaccine) against household transmission of SARS-CoV-2 in Israel. We fit two time-to-event models – a mechanistic transmission model and a regression model – to estimate vaccine effectiveness against susceptibility to infection and infectiousness given infection in household settings. Vaccine effectiveness against susceptibility to infection was 80-88%. For breakthrough infections among vaccinated individuals, the vaccine effectiveness against infectiousness was 41-79%. The overall vaccine effectiveness against transmission was 88.5%. Vaccination provides substantial protection against susceptibility to infection and slightly lower protection against infectiousness given infection, thereby reducing transmission of SARS-CoV-2 to household contacts.

**One-Sentence Summary** Vaccination reduced both the rate of infection with SARS-CoV-2 and transmission to household contacts in Israel.

## Main Text

The COVID-19 pandemic caused by SARS-CoV-2 has led to unprecedented disruptions worldwide. The rapid development and deployment of vaccines against the virus has provided an opportunity to control the outbreak in populations with access to vaccination. Multiple vaccines against SARS-CoV-2 have been demonstrated to be effective in preventing clinical disease and reducing disease severity in those who do become infected [1-4]. This direct protection against disease is critical. However, additional population-level benefits can be derived if vaccines also reduce transmission of the virus, thereby providing protection to those who are still vulnerable to infection [1, 5].

To date, there is little direct real-world evidence about the effects of vaccination on SARS-CoV-2 transmission. A few studies have investigated the reduction in transmission in households and amongst healthcare workers [4, 6]. Other studies have indirectly found evidence for a likely effect of the vaccine on transmission by demonstrating reduced viral load in the upper respiratory tract of infected individuals [7-11].

Households are an ideal setting for evaluating transmission of the virus and the effects of vaccination due to the high rate of secondary infection among household members [4, 12]. Detailed data on household structure and timing of infections can be used to quantify the risk of transmission. We aimed to assess the effectiveness of vaccination against susceptibility to infection and against infectiousness given infection with SARS-CoV-2 following vaccination with BNT162b2 (Pfizer-BioNTech mRNA-based vaccine). We accomplished this using two different analytic approaches applied to data from the second-largest healthcare organization in Israel. The rapid and early rollout of mass vaccination in Israel provides a unique opportunity to evaluate the effectiveness of the vaccine against transmission.

We used data from Maccabi Healthcare Services (MHS) centralized database, which captures all data on members’ demographics and healthcare-related interactions. MHS is a nationwide 2.5 million-member state-mandated, not-for-profit sick fund in Israel, representing a quarter of the Israeli population, and is a representative sample of the Israeli population. The full dataset, covering the period from June 15, 2020 to March 24, 2021, included information on 2,305,704 individuals from 1,275,015 households. Among these, 1,276,311 individuals received two doses of BNT162b2 as of March 24, 2021. There were 191,138 detected infections caused by SARS-CoV-2 (8.3% of the total population), with 4,141 infections following the second dose of the vaccine and 73,582 infections in unvaccinated individuals (naïve risk ratio = 5.6%).

Most of the households (60.7% of the total) had a single household member; this individual was infected in 59,552 (7.7%) of the 774,003 households. Information on the number of households and proportion of infections occurring in households of varying size can be found in table S1. We focused our analysis on households with at least one infected individual and two or more household members, for a total of 65,624 households and 253,564 individuals (see supplementary materials, materials and methods).

To infer transmission rates, it is necessary to estimate when each individual within a household was infected and the period when they were infectious. We therefore used a data augmentation approach to impute when a person with a positive PCR test was infected and infectious. This was accomplished using random samples from three different Gamma distributions representing the delay between onset of infectiousness and the date of the PCR test, the date of infection and the onset of infectiousness (i.e., latent period), and the onset of infectiousness to the end of infectiousness (i.e., infectious period) (Fig. 1 and table S2; supplementary materials, materials and methods).

We developed two discrete time-to-event data models of household transmission to estimate vaccine effectiveness against susceptibility to infection and against infectiousness given infection. In both approaches, we model the infection status for person *j* in household *i* on study date *t* (*Y*_{ijt}) using conditionally independent Bernoulli distributions with corresponding probability of infection *π*_{ijt}. These probabilities are then defined based on personal demographics, community risk, vaccination status, and characteristics of household transmission, with the approaches differing in how transmission is described. Both models were fit 100 times with the different draws from the delay distributions to assess uncertainty in the results due to uncertainty in the infection timeline.

Using the model of household transmission, we estimated that receipt of two doses of the vaccine was associated with an age-adjusted vaccine effectiveness against susceptibility to infection (*VE*_{S}) of 80.5% (95% confidence interval (CI): 78.9%, 82.1%) and a vaccine effectiveness against infectiousness given infection (*VE*_{I}) of 41.3% (95% CI: 9.5%, 73.0%). The vaccine effectiveness against transmission (*VE*_{T}), which combines the reduction in the risk of infection and the risk of infectiousness given infection among vaccinated individuals, was estimated to be 88.5% (95% CI: 82.3%, 94.8%). Vaccine effectiveness estimates across age groups are shown in Table S3 and coefficients from the primary model are shown in Table S4.

Using the alternative infection-hazard approach, in the absence of infected household members, vaccination was associated with a reduction in the hazard of infection of *VE*_{S,0} = 87.9% (95% CI: 86.7%, 89.0%). If exposed to an infected, unvaccinated household member, vaccination was associated with a *VE*_{S,u} = 92.3% (95% CI: 90.2%, 94.5%) reduction in the hazard of infection, whereas if exposed to an infected, fully vaccinated household member, vaccination reduced the individual hazard of infection by *VE*_{S,v} = 64.9% (95% CI: 35.2%, 94.5%).

Amongst unvaccinated individuals, there was a *VE*_{I,u} = 78.6% (95% CI: 74.5%, 82.7%) reduction in the hazard of infection when exposed to a fully vaccinated versus unvaccinated infected household contact. However, the vaccination status of infected household contacts was not significantly associated with the hazard of infection amongst fully vaccinated individuals (*VE*_{I,v} = 3.23%; 95% CI: -0.87%, 15.7%).

We observed limited variability in the vaccine effectiveness estimates across 100 iterations of the delay distributions (Fig. 2 and 3), suggesting that our results are robust to the unobserved time-course of infection within individuals. Furthermore, we found robust results when including both vaccine doses in the two models (table S5 and S6).

To date, there is limited published evidence with which to compare our estimates of vaccine effectiveness against infectiousness and transmission. A recent study of over 550,000 households in England showed vaccination with both the ChAdOx1 nCoV-19 and BNT162b2 vaccines reduced the odds of transmission from a vaccinated and infected household member by 40-50% compared to unvaccinated index cases [1, 4]. In previous studies, the index case in each household was defined as the earliest case of laboratory-confirmed COVID-19, by diagnosis date, and all secondary infections in the household were attributed to the index case [4]. In contrast, by inferring the date of infection, we do not assume that the index case in the household was necessarily the first individual to be diagnosed, and we account for the risk of transmission from other infected household members and from the community. Other studies investigating the reduction in infection risk among household members of vaccinated versus unvaccinated healthcare workers were conducted in Scotland and Finland, and provide indirect evidence of a lower risk of infection among household contacts of vaccinated individuals [1, 6, 13].

The two modeling approaches used in this study have different strengths and weaknesses. Both models adjusted for age, time-varying risk from the community, and the vaccination status of both the individual and other infected household members. However, the models differ in how they account for the contribution of other infected household members and the time-varying risk from the community. The vaccine effectiveness measures are naturally derived from the household transmission model, with a straightforward interpretation of the vaccine effectiveness against susceptibility to infection and against infectiousness given infection. The alternative model instead provides a case-by-case description of the reduction in risk depending on the vaccination status of the source(s) of exposure (for measures of the vaccine effectiveness against susceptibility to infection) and of the individual (for measures of vaccine effectiveness against infectiousness given the infection of other household members). Thus, the two models provide different perspectives of the reduction in risk following vaccination, which cannot be directly compared. Nevertheless, both approaches estimate a considerable reduction in both susceptibility to infection and infectiousness given infection following vaccination.

This study has several important limitations. Information on the true infection times (and duration of infectiousness) of infected household members is missing. To overcome this limitation, we sampled from three delay distributions parameterized from the literature to determine the potential infection status of each individual through time. Also, individuals who were infected but did not receive a SARS-CoV-2 test would be misclassified in our dataset. However, this is likely to have only a minor impact on our estimates (see supplementary material, table S7 and S8). We restricted our analysis to households with at least one infected individual and two or more household members (including those who never tested positive), which could bias estimates of the community force of infection [14]. However, since our primary goal was to determine the reduction in the relative risk of transmission following vaccination, and not to estimate the probability of transmission from the community versus infected household members, the decision to exclude households with no confirmed SARS-CoV-2 infections and/or fewer than two members is unlikely to bias our results. We nevertheless conducted a sensitivity analysis including 10,000 randomly select households with no infections and found that the results were robust to the inclusion of these households (table S9 and S10).

The ability of widespread vaccination to confer population-level protection through herd immunity depends on the vaccine effectiveness against transmission. Vaccination can prevent transmission by both providing protection against infection and reducing the infectiousness of vaccinated individuals who do become infected. Neither of these are typically directly measured in vaccine trials. By analyzing data on confirmed SARS-CoV-2 infections among household members in Israel, we provide measures of effectiveness of BNT162b2 against susceptibility to infection and against infectiousness given infection using two different approaches. Both models show evidence of a reduction in the infectiousness of vaccinated individuals who become infected in addition to protection against susceptibility to infection, leading to an overall reduction in the risk of transmission. Evidence of a high vaccine effectiveness against transmission confirms the importance of vaccinating both individuals at high and low risk of severe complications due to COVID-19 in order to maximize the population-level impact of vaccination and potentially achieve herd immunity.

## Data Availability

According to the IMOH (Israeli Ministry of Health) regulations, individual-level data cannot be shared openly. Specific requests for remote access to deidentified data should be referred to the Maccabi Institute for Research & Innovation.

## Funding

National Institutes of Health grant R01AI137093 (JLW, DMW, VEP)

National Institutes of Health grant 1DP2HD091799 (FWC)

National Institutes of Health grant R01AI112970 (VEP)

## Author contributions

Each author’s contribution(s) to the paper should be listed [we encourage you to follow the CRediT model]. Each CRediT role should have its own line, and there should not be any punctuation in the initials.

Conceptualization: SG, TP, DMW, VEP

Methodology: OP, JLW, FWC, DMW, VEP

Investigation: OP, JLW, FWC, DMW, VEP

Visualization: OP, VEP

Funding acquisition: JLW, DMW, VEP

Project administration: SG, DMW, VEP

Supervision: DMW, VEP

Writing – original draft: OP

Writing – review & editing: OP, JLW, FWC, SG, TP, DMW, VEP

## Competing interests

DMW has received consulting fees from Pfizer, Merck, GSK, and Affinivax for topics unrelated to this manuscript and is Principal Investigator on a research grant from Pfizer on an unrelated topic. VEP is a member of the WHO Immunization and Vaccine-related Research Advisory Committee (IVIR-AC) and has received reimbursement from Merck and Pfizer for travel expenses to Scientific Input Engagements unrelated to the topic of this manuscript. All other authors declare that they have no competing interests.

## Data and materials availability

*According to the IMOH (Israeli Ministry of Health) regulations, individual level data cannot be shared openly. Specific requests for remote access to deidentified data should be referred to the Maccabi Institute for Research & Innovation*.

## Materials and Methods

### Setting

Vaccination in Israel began on December 20, 2020, mainly using the BioNTech-Pfizer BNT162b2 vaccine, with a few individuals receiving the vaccine earlier. The vaccination campaign first targeted high-risk individuals, including those 60 years of age and older, medical personnel, workers at nursing homes, and individuals with comorbidities. After this phase, which lasted until January 21, age restrictions were lowered. By February 6, 2021, every Israeli citizen above 16 years old was eligible for the vaccine [10]. By the beginning of April 2021, 61% of the population had received at least one dose of the BNT162b2 vaccine [15].

The vaccination roll-out coincided with Israel’s third and largest wave of SARS-CoV-2 registered cases [16]. Consequently, a third national lockdown was issued in Israel starting December 24, 2020, with more severe restrictions (e.g., schools closures) issued starting January 8. These restrictions were progressively lifted starting on February 7, 2021.

### Data sources

We used data from Maccabi Healthcare Services (MHS) centralized computerized database, which captures all data on members’ healthcare-related interactions (including demographics, inpatient and outpatient visits, diagnoses, procedures etc). MHS is a nationwide 2.5 million-member state-mandated, not-for-profit sick fund in Israel, representing a quarter of the Israeli population, and is a representative sample of the Israeli population. The individual-level data for cases and household contacts include demographic information (i.e. age, sex), date of any polymerase chain reaction (PCR) tests for SARS-CoV-2 and the result of the test (considering that all such tests of MHS members are recorded centrally), and date of receipt of the first and second doses of the vaccine (if received). Individuals were defined as unvaccinated if they had not received any doses of BNT162b2 and fully vaccinated if at least 10 days had passed since receiving the second dose of the vaccine.

Due to computational constraints, we focused on households with at least one infected individual and two or more members; it was not possible to include all households with no infections. Sensitivity analyses were conducted using a randomly-selected subset of households with no infections to evaluate possible biases of this approach. We restricted our analysis to data from June 15, 2020 to March 24, 2021, since viral testing was not widely available prior to this date.

### Data augmentation for inferring transmission rates

For each individual, we observed the date at which the viral test was performed and the outcome of the test, but we do not have data on date of infection. To infer transmission rates, it is necessary to estimate when each individual within a household was infected and the period when each person was infectious. We therefore used a data augmentation approach to impute when a person with a positive PCR test was infected and infectious. This was accomplished using random samples from three different Gamma distributions representing the delay between onset of infectiousness and the date of the PCR test (i.e., *τ*_{report}), between the date of infection and the onset of infectiousness (i.e., latent period, *τ*_{latent}), and the time from the onset of infectiousness to the end of infectiousness (i.e., infectious period, *τ*_{infections}). The values for these distributions were based on prior knowledge derived from observational studies on the latent period, times to seeking a test, and the duration of infectiousness (table S2) [17, 18]. For each individual, a random draw was taken from each of these distributions, and this process was repeated 100 times. For clarity, we refer to these Gamma distributions as the *delay* distributions.

For each person *j* in household *i* with a positive PCR test, let *T*_{ij} be the (imputed) time of infection in days since the beginning of the study:, with being the number of days after the start of the study until the PCR test date. We set the beginning of the study as May 29, 2020, since we allow for infections occurring up to 17 days prior to the start of the data (i.e., June 15, 2020). If no infection occurred for person *j, T*_{ij} is censored and equal to the total number of days in the study (i.e., *t*_{max}). For the purposes of model fitting, we define *Y*_{ijt} as a binary variable equal to zero for each day of the study up until the time of infection (*t* = 1, …, *T*_{ij} − 1), equal to one for *T*_{ij} (assuming the person is infected), and censored from that point onwards (i.e., that person is relevant only in terms of transmitting to other household members). For people who are never infected, *Y*_{ijt} is equal to zero for all days of the study.

### Statistical modeling

Using the augmented data, we developed two discrete time-to-event data models of household transmission to estimate vaccine effectiveness against susceptibility to infection and against infectiousness given infection. In both approaches, we model the infection status for person *j* in household *i* on study day *t* (i.e., *Y*_{ijt}) using conditionally independent Bernoulli distributions with corresponding probability of infection *π*_{ijt}. These probabilities are then defined based on personal demographics, community risk, vaccination status, and characteristics of household transmission, with the approaches differing in how transmission is described. Both models were fit 100 times with different draws from the delay distributions to assess uncertainty in the results due to uncertainty in the unobserved dates of infection and infectiousness.

### Household transmission model

For the primary transmission model, we define the probability of infection on a given day as
where *n*_{i} is the number of members in household *i, p*_{ij0t} is the probability that person *j* in household *i* is infected by the community on study day *t* (i.e., community risk of infection), *p*_{ijkt} is the probability that person *j* is infected by household member *k* (i.e., household risk of infection), and *d*_{ikt} is an indicator of whether person *k* can transmit to *j* on day *t*. In other words, , with and equal to the time of onset and end of infectiousness, respectively, for person *k*, and 1(.) representing the indicator function taking the value of one if the input condition is true and the value of zero otherwise.

The probability that individual *j* never tested positive for SARS-CoV-2 is given as
whereas the probability that individual *j* is infected on day *t*^{*} (before the end of the study) is given as
(i.e., the probability of escaping infection up to time (*t*^{*} − 1) multiplied by the probability of not escaping infection at time *t*^{*}).

Conveniently (with respect to computation), the likelihood function can be written in terms of the introduced binary variables such as
with *N* being the total number of households.

We define the per-person, per-day community risk of infection using the logit link function as
where *δ*_{0} is the baseline risk of infection from the community, *vax*_{ijt} is a binary variable equal to one if at least 10 days has passed since person *j* received the second dose of the vaccine, *age*_{1,ij} is a binary variable equal to one if the person is between 10 and 60 years old at the start of the study (reference category is the ≤10-year-old age group), *age*_{2,ij} is similarly defined for those aged ≥60 years old, and *cases*_{t} describes the time-varying risk from the community and is computed as the standardized number of positive PCR tests on day *t* in the data.

Similarly, the per-person, per-day risk of transmission from an infectious individual *k* to a susceptible household member *j* is defined (for *k≠j*) as
where *α*_{0} is the baseline risk of infection from an infected household member and *vax*_{ikt} is the vaccination status of household member *k*. All other terms have been previously described.

Vaccine effectiveness is expressed as a percentage and computed as 100*(1-*RR)*, with *RR* defined as a risk ratio comparing vaccinated and unvaccinated individuals respectively. Vaccine effectiveness against susceptibility stratified by age group was defined as
with *a*_{j} being a categorical variable equal to 0 for ages ≤10-year-old, 1 for ages between 10 and 60, and 2 for ages ≥60 years old; *vax*_{ijt} and *vax*_{ikt} are as previously defined. The explicit equation for age group 0 would then be
Including *γ*_{1} in both the numerators and denominators of the *VE*_{S} provides the VE with respect to age groups 1 (*VE*_{S,1}) and similarly with *γ*_{2} for the VE with respect to age group 2 (*VE*_{S,2}).

Vaccine effectiveness against infectiousness given infection is defined as
which is based on the vaccination status of infected household member *k*. Again, for age group 0, this is equivalent to
similarly, with *γ*_{1} and *γ*_{2} included for the other age groups.

We also estimated the vaccine effectiveness against transmission as
(i.e., the reduction in risk associated with vaccine-derived protection against both infection and infectiousness given infection). This is equivalent to
The age-adjusted vaccine effectiveness measures can be derived as , with *prop*(*a*_{j}) being the fraction of people in each age group.

We initially applied the household transmission model to only households with a single occupant to inform the values of the baseline risk (*δ*_{0}) and the time-varying risk from the community (*δ*_{1}). We found *δ*_{0}=-8.25 and *δ*_{1}=0.79 and used these values as initial conditions for our analysis on households with at least one infected individual and two or more household members.

### Infection-hazard regression model

We compared the primary household transmission model defined by equations (1-3) with an alternative infection-hazard model. The models differ in their definition of the probability of infection, *π*_{ijt}. We note that similar to the primary transmission model, the likelihood can be defined using the introduced binary variables, *Y*_{ijt}. In this analysis, we use the complementary log-log-link function to connect the probabilities to individual-, household-, and community-level risk factors, such that
and
where *vax*_{ijt}, *age*_{1,ij}, and *age*_{2,ij} are as previously defined, *λ*_{0} is the intercept parameter, *h*_{t} is a smooth function of study time (modeled using splines) that describes the time-varying community risk, *m*_{ijt} is a binary variable equal to 1 if at least one other *unvaccinated* household member is infectious during study time *t* (not including person *j*), and *z*_{ijt} is a binary variable equal to 1 if at least one other *vaccinated* household member is infectious (not including person *j*). Interactions between person *j*’s vaccination status and the household risk variables are also included.

Use of the complementary log-log-link function leads to a hazard ratio interpretation for the exponentiated regression parameters. We define multiple susceptibility vaccine effects using this model output based on comparing different within-individual and within-household scenarios.

First, we compute
as the vaccine effectiveness against susceptibility to infection given there are no other infectious household members at that time (i.e., *m*_{ijt} = *z*_{ijt} = 0) and interpret it as the decrease in the hazard of infection due to vaccination. Similarly, we define
and
as the vaccine effectiveness against susceptibility to infection given there is at least one unvaccinated (*VE*_{S,u}) or vaccinated (*VE*_{S,v}) member in the household.

Vaccine effectiveness against infectiousness given the infection of household contacts is defined as
and
which can be interpreted as the percent reduction in the hazard of infection for individual *j* when exposed to a vaccinated versus unvaccinated infectious household member *k*, given individual *j* is unvaccinated (*VE*_{I,u}) or vaccinated (*VE*_{I,v}).

For both models, we summarized the vaccine effectiveness estimates by taking the mean over the 100 samples of the delay distributions. We derived the 95% confidence intervals (CI) using the law of total variance. All analyses were carried out in the R statistical software [19].

### Sensitivity analysis: including the first vaccine dose

For the primary household transmission model, we included information on the first vaccine dose status by defining community risk as
where *vax*_{1,ijt}, and *vax*_{2,ijt}represent mutually exclusive binary variables of vaccination status (i.e., individual *j* in household *i* is partially or fully vaccinated at time *t*, respectively), and all other terms have been previously described.

Similarly, household risk is defined (for *k*≠*j*) as:
where *vax*_{1,ikt} is the first-dose vaccination status of household member *k* at time *t* and *vax*_{2,ikt} is the second-dose vaccination status of household member *k* at time *t*. Vaccine effectiveness estimates are the same as for the primary analysis.

For the infection-hazard model, we now define the variable *ψ*_{ijt} from equations 4-5 as
where , and represent mutually exclusive binary variables of vaccination level (i.e., partially and fully vaccinated, respectively), *m*_{0,ijt} is a binary variable equal to one if at least one other *unvaccinated* household member is infectious during study time *t* (not including person *j*), *z*_{1,ijt} is a binary variable equal to one if at least one other *partially vaccinated* household member is infectious (not including person *j*), and *z*_{2,ijt} is a binary variable equal to one if at least one other *fully vaccinated* household member is infectious (not including person *j*). Interactions between person *j*’s vaccination status and household risk are also included.

The vaccine effectiveness against susceptibility to infection given there are no other infectious household members at that time (i.e., *m*_{0,ijt} = *z*_{1,ijt} = *z*_{2,ijt} = 0) is given as:
and is interpreted as the decrease in the hazard of infection due to full vaccination.

Similarly, we define
as the vaccine effectiveness against susceptibility to infection given there is at least one unvaccinated (*VE*_{S,u}), partially vaccinated (*VE*_{S,pv}), or fully vaccinated (*VE*_{S,v}) member in the household.

Vaccine effectiveness against infectiousness given the infection of household contacts is defined as
which can be interpreted as the effect of being exposed to a fully vaccinated versus unvaccinated infectious household member given individual *j* is unvaccinated (*VE*_{I,u}), partially vaccinated (*VE*_{I,pv}), or fully vaccinated (*VE*_{I,v}). Results from both models are reported in tables S5 and S6.

### Sensitivity analysis: including a subset of households with no infections

We randomly selected 10,000 households with at least two household members and no detected infections and included them along with our original set of households with at least two household members and at least one infection. We compared the vaccine effectiveness results from both models for one iteration of the delay distributions (tables S9 and S10).

### Sensitivity analysis: testing the robustness of the results to misclassification of cases

We run a sensitivity analysis to test the robustness of the results to misclassification of individuals who were infected but did not receive a SARS-CoV-2 test. For each individual with a negative PCR test, we randomly selected a new PCR test date. On this new PCR test date, each individual could now have a positive PCR test based on a Bernoulli distribution with probability *p* = *α* * *prob*_{test}. We define *prob*_{test} = (total number of positive PCR cases in the dataset)/(total population in the dataset) = 0.08. With this new dataset, we ran the delay distribution process for one iteration, and we estimated the vaccine effectiveness from both models. We tested two scenarios: (a) *α* = 0.01 and (b) *α* = 0.10 (tables S7 and S8).