## ABSTRACT

Randomized controlled trials (RCTs) have shown high efficacy of multiple vaccines against SARS-CoV-2 disease (COVID-19), but evidence remains scarce about vaccines’ efficacy against infection with, and ability to transmit, the virus. We describe an approach to estimate these vaccines’ effects on viral positivity, a prevalence measure which under reasonable assumptions forms a lower bound on efficacy against transmission. Specifically, we recommend separate analysis of positive tests triggered by symptoms (usually the primary outcome) and cross-sectional prevalence of positive tests obtained regardless of symptoms. One minus the odds ratio of carriage for vaccine vs. placebo provides an unbiased estimate of vaccine effectiveness against viral positivity, under certain assumptions, and we show through simulations that likely departures from these assumptions will only modestly bias this estimate. Applying this approach to published data from the RCT of the Moderna vaccine, we estimate that one dose of vaccine reduces the potential for transmission by at least 61%, possibly considerably more. We describe how these approaches can be translated into observational studies of vaccine effectiveness.

**Highlights**

SARS-CoV-2 vaccine trials did not directly estimate vaccine efficacy against transmission.

We describe an approach to estimate a lower bound of vaccine efficacy against transmission.

We estimate one dose of the Moderna vaccine reduces the potential for transmission by at least 61%.

We recommend approaches for analyzing data from trials and observational studies.

## INTRODUCTION

While randomized controlled trials (RCTs) have shown high efficacy of multiple vaccines against SARS-CoV-2 disease (COVID-19) [1–3], evidence remains scarce about the effect of each of these vaccines on infection with, and ability to transmit, the virus.

It is important to understand the impact of vaccination on infection, shedding and transmission of the virus [4]. This information can inform personal decisions about resuming contact once one has been vaccinated (or one’s contact has), prioritization decisions [5], and models of the impact of vaccination [6].

Hypothetically, it is possible that the 70-95% protection offered by these vaccines against symptomatic disease could (i) be purely protection against symptoms with no impact on infection or transmission; (ii) be largely or entirely due to protection against infection, suggesting an impact on transmission similar to the efficacy against symptomatic infection; or (iii) be 70-95% protective against infection and moreover reduce the shedding of virus by those who do become infected, in which case protection against transmission could be even greater than that against symptomatic disease. The primary endpoint of RCTs to date, however, sheds little light on the magnitude of protection the vaccines could offer against transmission.

The impact of a vaccine on transmission is a composite of its effect on becoming infected (because someone not infected cannot transmit) and its effect on the infectiousness of those who get infected despite vaccination: these components have been called the vaccine efficacy for susceptibility to infection and vaccine efficacy for infectiousness [7]. Under plausible assumptions, the efficacy of a vaccine in preventing transmission can be defined as:

$$VE_T = 1 - (1 - VE_S)(1 - VE_I) \tag{1}$$

where *VE*_{S} and *VE*_{I} are the vaccine efficacy against susceptibility (acquiring viral infection) and against infectiousness, respectively [7,8].
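As a rough illustration of how the two components combine, the sketch below evaluates the multiplicative form of eq. 1 for three scenarios. The function name `ve_transmission` and the scenario values are illustrative assumptions, not estimates from any trial.

```python
def ve_transmission(ve_s: float, ve_i: float) -> float:
    """Multiplicative combination of efficacy against susceptibility (ve_s)
    and against infectiousness (ve_i), as in eq. 1."""
    return 1.0 - (1.0 - ve_s) * (1.0 - ve_i)


# Hypothetical scenarios (values chosen only for illustration).
scenarios = {
    "protects against symptoms only": (0.0, 0.0),
    "protects against infection only": (0.9, 0.0),
    "protects against infection and halves infectiousness": (0.9, 0.5),
}

for label, (ve_s, ve_i) in scenarios.items():
    print(f"{label}: VE_T = {ve_transmission(ve_s, ve_i):.2f}")
```

In this form, any additional effect on infectiousness can only raise the efficacy against transmission above the efficacy against susceptibility.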

RCTs of the Moderna, Astra-Zeneca, and Janssen vaccines have provided some evidence about vaccine effects on the probability that a trial participant will harbor detectable virus by swabbing participants irrespective of symptoms at one or more time points during the trial and testing the swabs by RT-PCR to detect virus [2,3,9]. News reports indicate that those still in placebo-controlled trials will provide ongoing samples that can yield similar data over time [10]. In each case, reduced prevalence of viral positivity in vaccine vs. placebo recipients may be interpreted as a reduction in acquisition or duration or both, with potentially direct relevance to transmission. However, some reports from the original trials present rather complex interpretations of composite measures involving infections detected by screening of non-symptomatic individuals combined with those detected by swabbing of symptomatic individuals.

Here we describe the results of simulations of randomized trials that are designed to clarify what information is gained by swabbing individuals for viral infection, how this relates to other measures of vaccine efficacy, and what information is present in measures combining different reasons for sampling (no symptoms vs. symptoms).

We first show that the vaccine effect on viral positivity (*VE*_{V}) in individuals swabbed at random, regardless of symptoms, is closely approximated by a vaccine efficacy measure previously defined for bacterial carriage, despite several departures from the assumptions underlying the prior work [11]. This measure captures the product of the vaccine’s efficacy in reducing acquisition and its impact in shortening infection duration. We describe how these departures affect the estimates under varying trial conditions. We show that under plausible assumptions, this measure is a lower bound on the vaccine’s efficacy against transmission. We recommend that samples taken to assess vaccine effects on viral positivity be taken in a cross-section of the population irrespective of symptoms, and that this outcome be analyzed separately from the outcome of a positive test where the test was triggered by symptoms (the primary endpoint in most RCTs for SARS-CoV-2 vaccines).

## METHODS

We simulate follow-up of 100,000 individuals for 300 days. For each person each day, we conduct a Bernoulli trial to determine if they will be infected that day, with a probability based on an external force of infection. We assume in our baseline simulations that this probability remains constant at 0.001 and also examine a higher force of infection of 0.003 in a sensitivity analysis. We compare scenarios in which individuals immediately become susceptible again after recovery (“SEIS”) to scenarios in which prior infection confers full protective immunity for the duration of follow-up (“SEIR”).

We vary the proportion of cases that are symptomatic (Table S1), with symptom onset occurring five days after infection [12]. After a three day latent period [13], infected individuals shed virus for a period drawn from a uniform distribution of 15-21 days [14,15]. We make the simplifying assumption that individuals will test positive on any day they are shedding virus.

On day 100, we randomize half of the individuals to receive a two-dose vaccine, with the doses given 28 days apart. We assume the vaccine confers 50% of its full two-dose efficacy after the first dose and that there is a seven day delay after each dose for immunity to take effect. We model three types of vaccine efficacy (Table S1). First, the vaccine multiplies the probability of infection each day by a factor 1 − *VE*_{S}. Second, the vaccine multiplies the duration of shedding by a factor 1 − *VE*_{D}. Third, the vaccine multiplies progression to symptoms among those infected by a factor 1 − *VE*_{P}. We calculate *VE*_{P} based on the value of *VE*_{S} and the assumption that the vaccine reduces symptomatic disease by 95% (*VE*_{SP}) [1,2], using the equation:

$$VE_P = 1 - \frac{1 - VE_{SP}}{1 - VE_S} \tag{2}$$

We then simulate testing and estimation of three measures of vaccine efficacy:

### Vaccine efficacy for viral positivity (VE_{V})

We assume all individuals are tested regardless of symptoms on day *t*. Those who are shedding virus on day *t* are counted as positive (i.e. perfect test sensitivity and specificity). We then calculate *VE*_{V} from the prevalence odds ratio comparing vaccinated to unvaccinated individuals, using eq. 6 below.
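A minimal sketch of this estimator, assuming per-participant boolean records of test result and trial arm (the function name and the toy data are hypothetical), is:

```python
def ve_viral_positivity(positive, vaccinated):
    """One minus the prevalence odds ratio (vaccinated vs. unvaccinated).

    positive, vaccinated: equal-length sequences of booleans, one entry per
    participant swabbed on the cross-sectional testing day.
    """
    pos_v = sum(1 for p, v in zip(positive, vaccinated) if p and v)
    pos_u = sum(1 for p, v in zip(positive, vaccinated) if p and not v)
    n_v = sum(1 for v in vaccinated if v)
    n_u = len(vaccinated) - n_v
    odds_v = pos_v / (n_v - pos_v)  # odds of positivity, vaccine arm
    odds_u = pos_u / (n_u - pos_u)  # odds of positivity, placebo arm
    return 1.0 - odds_v / odds_u


# Toy data (made up): 1,000 per arm, 5 vs. 18 positives on the testing day.
vaccinated = [True] * 1000 + [False] * 1000
positive = [True] * 5 + [False] * 995 + [True] * 18 + [False] * 982
print(f"Estimated VE_V = {ve_viral_positivity(positive, vaccinated):.3f}")
```

The function is undefined when either arm has no positives or no negatives; a real analysis would need to handle those cases and report uncertainty.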

### Vaccine efficacy for non-symptomatic infection (VE_{non-symptomatic})

We estimate vaccine efficacy for non-symptomatic infection by calculating the prevalence odds ratio of PCR positivity among individuals who are not symptomatic on day *t* in the vaccinated vs. unvaccinated groups.

### Vaccine efficacy estimated from a combination of symptoms and routine tests (VE_{combined})

For this measure of vaccine efficacy, we count as positive any individuals who test positive on day *t* in cross-sectional testing as well as those who were symptomatic and tested positive on or before day *t*. We then calculate the odds ratio comparing vaccinated to unvaccinated.

Code is available: https://github.com/rek160/InterpretingVaccineEfficacy.
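Independently of the repository above, the core of the simulation can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions, not the authors' implementation: it omits symptoms, allocates arms by alternation rather than at random, rounds shortened shedding durations to whole days, applies the 50% first-dose fraction to both *VE*_{S} and *VE*_{D}, and all function and parameter names are hypothetical.

```python
import random

LATENT_DAYS = 3          # days from infection to start of shedding
RANDOMIZATION_DAY = 100  # day of first dose
DOSE_GAP = 28            # days between doses
IMMUNITY_LAG = 7         # delay before each dose takes effect


def efficacy_fraction(day):
    """Share of full two-dose efficacy in effect on a given day."""
    if day >= RANDOMIZATION_DAY + DOSE_GAP + IMMUNITY_LAG:
        return 1.0
    if day >= RANDOMIZATION_DAY + IMMUNITY_LAG:
        return 0.5
    return 0.0


def simulate_positives(n, ve_s, ve_d, sample_days, foi=0.001, days=300,
                       reinfection=True, seed=1):
    """Count individuals shedding virus in each arm on each sampling day.

    reinfection=True corresponds to the "SEIS" scenario, False to "SEIR".
    """
    rng = random.Random(seed)
    counts = {arm: {t: 0 for t in sample_days} for arm in (True, False)}
    sizes = {True: 0, False: 0}
    for i in range(n):
        vaccinated = i % 2 == 0  # assumption: alternate allocation, 1:1
        sizes[vaccinated] += 1
        day = 0
        while day < days:
            frac = efficacy_fraction(day) if vaccinated else 0.0
            # Daily Bernoulli trial for infection.
            if rng.random() >= foi * (1.0 - ve_s * frac):
                day += 1
                continue
            # Shedding lasts 15-21 days, shortened by the vaccine.
            duration = max(1, round(rng.randint(15, 21) * (1.0 - ve_d * frac)))
            first = day + LATENT_DAYS
            last = first + duration - 1
            for t in sample_days:
                if first <= t <= last:
                    counts[vaccinated][t] += 1
            if not reinfection:
                break
            day = last + 1  # susceptible again after shedding ends
    return counts, sizes


def ve_from_counts(pos_v, n_v, pos_u, n_u):
    """One minus the prevalence odds ratio, vaccinated vs. unvaccinated."""
    return 1.0 - (pos_v / (n_v - pos_v)) / (pos_u / (n_u - pos_u))


if __name__ == "__main__":
    t = 250
    counts, sizes = simulate_positives(4000, 0.6, 0.3, [t], foi=0.003)
    est = ve_from_counts(counts[True][t], sizes[True],
                         counts[False][t], sizes[False])
    print(f"Estimated VE_V on day {t}: {est:.2f}")
```

With only a few thousand participants a single-day estimate is noisy; the 100,000 participants used in the paper, or pooling several well-separated testing days, gives estimates much closer to the value predicted from *VE*_{S} and *VE*_{D}.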

## RESULTS

### For a vaccine that reduces both incidence and duration of viral carriage, vaccine efficacy against carriage can be interpreted as the product of these two effects

Prior work (concerning a bacterial pathogen, though in this exposition we refer to the pathogen as virus) showed that under certain assumptions, for a vaccine that reduces incidence but not duration of infection, the reduction in incidence rate caused by the vaccine (termed in the original paper the vaccine efficacy against acquisition [11], but which we call vaccine efficacy against susceptibility to infection, for consistency with most of the literature [7]) can be defined as

$$VE_S = 1 - \frac{\lambda_v}{\lambda_u} \tag{3}$$

where *λ*_{v} is the incidence rate in the vaccinated and *λ*_{u} is the incidence rate in the placebo arm, and can be estimated as

$$\widehat{VE}_S = 1 - \frac{p_v/(1-p_v)}{p_u/(1-p_u)} \tag{4}$$

where *p*_{u} and *p*_{v} are the prevalence of the virus in the placebo and vaccine arm respectively, so the estimator is just one minus the odds ratio for carrying the pathogen for vaccine vs. placebo recipients. This was shown by [11]. A simple extension of their reasoning shows that if the vaccine does reduce duration of detectable infection, the quantity

$$VE_V = 1 - \frac{\lambda_v D_v}{\lambda_u D_u} \tag{5}$$

(vaccine efficacy for viral positivity), where *D*_{v} and *D*_{u} denote the average duration of viral positivity in the vaccine and placebo arms, can be defined as the combined effect of the vaccine on incidence and duration, and can be estimated identically

$$\widehat{VE}_V = 1 - \frac{p_v/(1-p_v)}{p_u/(1-p_u)} \tag{6}$$

again from the prevalence odds ratio for the vaccine. With effects on both duration and incidence, we have the algebraic relationship

$$VE_V = 1 - (1 - VE_S)(1 - VE_D) \tag{7}$$

where *VE*_{V} is defined in eq. 5, *VE*_{S} is defined in eq. 3, and *VE*_{D} is the reduction in average duration of viral positivity due to the vaccine. Thus, to generalize from reference [11], *VE*_{V} is an upper bound on *VE*_{S}: *VE*_{V} ≥ *VE*_{S} with equality for the special case where *VE*_{D} = 0.
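
The step from the prevalence odds ratio to the product of the two effects can be sketched as follows. This assumes, as in the setting of [11], a steady state in which individuals who clear infection return to the susceptible pool; the symbols *D*_{u} and *D*_{v} for the mean duration of positivity in each arm are notation introduced here for illustration. At steady state, new infections balance clearances in each arm:

$$
\lambda\,(1-p) = \frac{p}{D}
\quad\Longrightarrow\quad
\frac{p}{1-p} = \lambda D
$$

so the prevalence odds ratio is

$$
\frac{p_v/(1-p_v)}{p_u/(1-p_u)}
= \frac{\lambda_v}{\lambda_u}\cdot\frac{D_v}{D_u}
= (1 - VE_S)(1 - VE_D)
$$

For example, with *VE*_{S} = 0.6 and *VE*_{D} = 0.3 (one of the combinations simulated below), the predicted value is *VE*_{V} = 1 − 0.4 × 0.7 = 0.72.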

### VE_{V} is a lower bound on the vaccine’s efficacy against transmission, under plausible assumptions

Equations 1 and 7 show that *VE*_{V} and *VE*_{T} are similar though not identical; in particular, they differ in only one term: the substitution of *VE*_{D} in the definition of *VE*_{V} as opposed to *VE*_{I} in the definition of *VE*_{T}. If all virus-positive, vaccinated individuals contributed equally to the force of infection, then these two terms would be identical, and we would have *VE*_{V} = *VE*_{T}: that is, the reduction in transmission thanks to the vaccine would be the combination of reduced probability of infection and reduced duration of shedding in those infected despite vaccination. If we assume that for every day of being virus positive, a vaccinated infected person is on average no more infectious (and perhaps less due to lower viral loads) than an unvaccinated infected person with the same exposure, then we can conclude that

$$VE_V \le VE_T \tag{8}$$

Under this plausible assumption, *VE*_{T}, which cannot be directly estimated from available trial data, is at least as large as *VE*_{V}, which can. We therefore proceed to discuss how to estimate *VE*_{V} for SARS-CoV-2 vaccines.
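
One way to make the bound explicit is to introduce a hypothetical factor *ρ* ≤ 1 for the average per-day infectiousness of a vaccinated, virus-positive person relative to an unvaccinated one. This symbol is not part of the original notation, and the sketch ignores correlation between duration and infectiousness across individuals. Under this simplification, the infectiousness term factors into duration and per-day infectiousness:

$$
1 - VE_I = (1 - VE_D)\,\rho
$$

$$
VE_T = 1 - (1 - VE_S)(1 - VE_D)\,\rho = 1 - (1 - VE_V)\,\rho \;\ge\; VE_V
\qquad (\rho \le 1)
$$

As an illustrative calculation, *VE*_{V} = 0.61 with *ρ* = 1 gives *VE*_{T} = 0.61, whereas *ρ* = 0.5 would give *VE*_{T} = 1 − 0.39 × 0.5 ≈ 0.81.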

### Simulated trials show that these estimators applied to a single cross-sectional swab approximately recover the simulated impacts of a vaccine on viral positivity, incorporating effects on acquisition and duration, with visible downward bias just after and long after vaccines are administered

Fig. 1 shows results of 300-day simulations of a trial of 100,000 participants randomized 1:1 to vaccine or placebo on day 100. These participants have been exposed to a constant incidence of infection since day 0. The different panels represent (left to right) simulations with *VE*_{S} = 0, 0.3, 0.6, 0.9 and (top to bottom) *VE*_{D} = 0, 0.3, 0.6, 0.9. We simulate a 2-dose regimen, 28 days apart with the first dose giving half the full efficacy and the effect of each dose starting one week after it is given, that is, on days 107 and 135 of the simulation. The solid black lines give the dose-1 and dose-2 predicted values for *VE*_{V} based on eq. 7, while the curves show the estimates obtained from the simulated data using eq. 6. Fig. 1A shows the situation under the assumption that individuals naturally infected who recover (clear infection) become once again susceptible to reinfection. This is unrealistic for SARS-CoV-2 but follows the assumptions made in the above equations following [11]. Fig. 1B makes the opposite assumption, that individuals naturally infected (whatever their vaccine status) are completely protected against reinfection for the duration of the simulation.

In Fig. 1A, there is close agreement between the simulated curves and the predicted ones, after about day 150. During the first-dose period, estimated efficacy is noisy at the beginning but is typically below the predicted level, reflecting holdover of infections that occurred before (randomization+7 days), which are by assumption therefore unaffected by the vaccine. In contrast, toward the right-hand side of each panel, the agreement is nearly perfect apart from sampling error, because such holdover infections are vanishingly rare and the assumptions underlying eqs. 6 and 7 are met.

Fig. 1B shows a similar pattern, with the important exception that over time, as many individuals in the population are immune, the protection estimated from eq. 6 declines toward the null value. This is because both groups have fewer people at risk as immunity builds up, but when the vaccine has an effect, the placebo group is depleted of susceptible individuals faster than the vaccine group, rendering the two groups more similar and the apparent efficacy lower. This effect, which is a known complexity of randomized [16–19] and observational [20,21] studies of vaccine efficacy/effectiveness, is greater when there are longer times of follow up, higher forces of infection (Fig. S1), and greater heterogeneities in infection risk among the study population.

### Separate analyses of infections detected by testing those with symptoms and infections detected by testing cross sections of participants irrespective of symptoms improve interpretability of VE estimates

All trials of which we are aware for SARS-CoV-2 vaccines have had a primary endpoint of symptomatic disease, ascertained by asking every participant who experiences a defined profile of symptoms to get tested, and counting the outcome of COVID-19 when such a test is positive. As noted, some trials also test a subset of participants irrespective of symptoms, either at the visit for the second vaccine dose [2] or at defined intervals during follow up [3]. The primary endpoint measures vaccine efficacy against symptomatic infection, which has been called *VE*_{SP} for vaccine efficacy against susceptibility or progression (that is, protection from symptomatic infection that could be preventing infection or preventing symptoms if an individual becomes infected), and is related to *VE*_{S} and *VE*_{P} by eq. 2 above.
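To illustrate how these quantities trade off, the sketch below assumes the multiplicative relationship *VE*_{SP} = 1 − (1 − *VE*_{S})(1 − *VE*_{P}) and solves it for *VE*_{P} at a fixed *VE*_{SP} of 95%, the value assumed in the simulations. The function name is hypothetical.

```python
def ve_progression(ve_s: float, ve_sp: float = 0.95) -> float:
    """Efficacy against progression to symptoms implied by a given efficacy
    against infection (ve_s) and against symptomatic infection (ve_sp),
    assuming VE_SP = 1 - (1 - VE_S)(1 - VE_P)."""
    return 1.0 - (1.0 - ve_sp) / (1.0 - ve_s)


for ve_s in (0.0, 0.3, 0.6, 0.9):
    print(f"VE_S = {ve_s:.1f} -> VE_P = {ve_progression(ve_s):.3f}")
```

For a fixed *VE*_{SP}, a lower *VE*_{S} implies a higher *VE*_{P}, which is why the two are discussed together when interpreting tests of non-symptomatic participants.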

Fig. 2 shows simulations similar to those above, but now with a virus assumed to cause symptoms in 1% (red) or 80% (blue) of infected individuals. In these simulations, all symptomatic individuals are assumed to be tested for the primary outcome on the day of symptom onset, and all asymptomatic individuals are not tested for the primary outcome. In addition, all individuals who have not yet experienced symptoms are tested for viral positivity and the combined VE is estimated. When only 1% of infected individuals are symptomatic, *VE*_{V} (solid) and *VE*_{combined} (dashed lines) are nearly identical. However, when 80% are symptomatic [22], *VE*_{combined} increases over time but falls below the expected *VE*_{SP}. If analysis is restricted to only non-symptomatic individuals (Fig. S2), when there is high *VE*_{P} (i.e. low *VE*_{S}), *VE*_{non-symptomatic} is lower than *VE*_{V}.

### Application to Moderna data

Table 1 shows data from the published RCT of the Moderna vaccine [2], in which participants returning for their second vaccine dose were tested by RT-PCR for SARS-CoV-2, with 39 and 15 testing positive without symptoms, respectively, in the placebo and vaccine group. If we assume that everyone in the modified intent-to-treat population not infected prior to the second dose was tested (this is not documented in the paper), this corresponds (Table 1) to an estimate of *VE*_{V} = 61%, by eq. 6. Taken at face value, this implies that one dose reduces virus positivity by 61%; our simulations suggest this may be an underestimate for several reasons. A modest underestimate could occur due to holdover of individuals infected before the first dose took effect and still positive at the time of the second dose. Figure S2 shows that if only individuals not symptomatic at the time of swabbing were included, as in the Moderna study, there could be additional underestimation of *VE*_{V} because vaccinated individuals without symptoms may disproportionately contribute to the non-symptomatic group. To resolve this potential bias, the data could be reanalyzed to include anyone who was symptomatic and tested positive on the day of the second dose.

Finally, as noted above, one expects that *VE*_{V} ≤ *VE*_{T} (eq. 8), so we conclude that the Moderna data from the second-dose swab provide evidence of at least a 61% (95% CI 31-79%) reduction in transmissibility due to a single dose of Moderna vaccine.
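The calculation can be sketched as below, using the reported counts of 15 and 39 positives. The number tested per arm is not stated in this section, so `N_PER_ARM` is a hypothetical placeholder; because positives are rare, the estimate is insensitive to its exact value. The Wald interval on the log odds ratio is one common choice and is an assumption here, not necessarily the method used for the interval quoted above.

```python
import math


def ve_with_wald_ci(pos_v, n_v, pos_u, n_u, z=1.96):
    """One minus the prevalence odds ratio, with a Wald confidence interval
    computed on the log-odds-ratio scale."""
    a, b = pos_v, n_v - pos_v  # vaccine arm: positive, negative
    c, d = pos_u, n_u - pos_u  # placebo arm: positive, negative
    log_or = math.log((a / b) / (c / d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ve = 1.0 - math.exp(log_or)
    ve_lower = 1.0 - math.exp(log_or + z * se)  # upper OR bound -> lower VE
    ve_upper = 1.0 - math.exp(log_or - z * se)  # lower OR bound -> upper VE
    return ve, ve_lower, ve_upper


N_PER_ARM = 14_000  # hypothetical denominator, for illustration only

ve, lower, upper = ve_with_wald_ci(15, N_PER_ARM, 39, N_PER_ARM)
print(f"VE_V = {ve:.0%} (95% CI {lower:.0%} to {upper:.0%})")
```

Under these assumptions the point estimate and interval come out close to the figures quoted above; small differences would be expected from the true denominators and the interval method.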

The VE estimate combining cases ascertained by symptoms and those ascertained by this testing protocol in Table S18 of [2] is 89.5% (85.1%-92.8%). As described above, this combined measure is an underestimate of the *VE*_{SP} estimated in the study, 94.1% (89.3-96.8%).

## DISCUSSION

We have shown that if analyzed correctly, data from randomized trials that test a cross-section of vaccine and control recipients irrespective of symptoms on a given day for virus can estimate the vaccine efficacy against viral positivity. While a complete estimate of *VE*_{T} would require estimates of both *VE*_{V} and of the daily infectiousness of a vaccinated, infected individual compared to an unvaccinated, infected one, and their correlation across individuals, it is very likely in practice that *VE*_{V} is a lower bound on *VE*_{T}: that is, an estimate from trial data of *VE*_{V} provides strong evidence that *VE*_{T} is at least as high.

Our main findings are as follows: first, that a single cross-sectional comparison of PCR positivity odds between individuals in vaccine vs. control groups provides a relatively accurate estimate, subject to sampling error, of vaccine effectiveness against viral positivity, which is a composite of effects in reducing susceptibility to infection and in reducing duration as described in Eq. 7. This can be shown analytically under certain assumptions. Second, we show by simulation that plausible deviations from these assumptions do not dramatically change results and, when they do, tend to bias toward the null hypothesis of no efficacy. A combined analysis of viral positivity detected due to symptoms and those detected by routine screening of non-symptomatic persons will be some combination of efficacy against viral positivity *VE*_{V} and against symptomatic infection *VE*_{SP} with no clear interpretation in terms of elementary quantities of interest. Thus separate analysis is recommended. Finally, if the cross-sectional sampling is restricted to those who are not symptomatic, it may underestimate *VE*_{V}, especially for vaccines which are highly protective against symptoms (high *VE*_{P}). We therefore recommend that the cross-sectional sample include those who are symptomatic. If this is infeasible (for example, if individuals are instructed not to come for a vaccine dose if they are symptomatic, and the testing happens at the vaccine dose), then we recommend that those who are tested because they are symptomatic and test positive on a particular day be included among the positives in the cross-section, constituting a partial exception to our recommendation of separate analyses.

Our results have been described in the setting of a randomized trial. They also apply to observational studies insofar as those studies are designed to mimic a target trial [23] and achieve adequate control of confounding and other sources of bias.

In observational studies of vaccine effectiveness to date, cases have often been identified in whoever gets tested, for whatever reason [24–26]. These probably constitute a mix of (i) those tested because symptomatic, (ii) those tested because they are contacts of a known or suspected case (for example in a contact tracing investigation), and (iii) those tested without either reason, for example those who get tested in a regular program by their employer or those who get tested to comply with a travel restriction that requires a negative test before travel. Those positive in group (i) are approximately equivalent, in the observational setting, to those who meet the primary outcome of confirmed COVID-19 from randomized trials. Those positive in group (iii) are perhaps equivalent, in the observational setting, to those who test positive in the routine follow-up of persons in a randomized trial. Group (ii) does not have a clear equivalent in the randomized trials, which typically do not gather information on contacts.

For observational studies, our results therefore imply that it would be ideal to analyze symptomatic cases separately from those routinely tested, and if possible to distinguish those tested due to possible exposure (group ii) from those tested for other reasons, such as for travel clearance (group iii). Those tested because they are symptomatic (group i) should be analyzed analogously to the trials, as the reduction in incidence rate. Those tested for exposure (group ii) are a group in which the efficacy measure is conditioned on exposure, and thus should be analyzed using methods to estimate the secondary attack rate, a risk measure. These recommendations follow standard approaches described in the landmark paper of Halloran et al. 1997 [27]. Finally, those tested for neither reason (group iii) should be analyzed using the odds ratio approach described in this paper, extending others’ prior work [11].

We have not considered another approach that has been used in COVID-19 trials [9] to estimate the impact on asymptomatic infections: serologic testing of participants at the middle or end of the trial [19]. This can contribute to an estimate of *VE*_{S} and thus provide a lower bound on *VE*_{T}, but does not address the duration of infectiousness or the viral shedding of the detected asymptomatic infection. Nevertheless, this is an important additional way to obtain evidence relevant to bounding the vaccine’s efficacy against transmission.

In summary, with careful analysis, data from swabs of individuals in vaccine and comparator arms can yield estimates of a key quantity, the vaccine’s efficacy in reducing viral positivity, likely a lower bound on the vaccine’s efficacy in reducing transmission. Future work should consider how quantitation of virus in both symptomatic and non-symptomatic individuals who do test positive may further refine these estimates.

## Funding

This work was supported by the Morris-Singer fund, by US National Cancer Institute Seronet cooperative agreement U01CA261277, and by the UK Department of Health and Social Care using UK Aid funding managed by the NIHR.

## Competing interests

Dr. Lipsitch reports consulting/honoraria from Bristol Myers Squibb, Sanofi Pasteur, and Merck, as well as a grant through his institution, unrelated to COVID-19, from Pfizer. He has served as an unpaid advisor related to COVID-19 to Pfizer, One Day Sooner, Astra-Zeneca, Janssen, and COVAX (United Biomedical). Dr. Kahn discloses consulting fees from Partners In Health.