## Abstract

Ongoing SARS-CoV-2 vaccine trials assess vaccine efficacy against disease (VE_{DIS}), the ability of a vaccine to block symptomatic COVID-19. They will only partially discriminate whether VE_{DIS} is mediated by preventing infection as defined by the detection of virus in the airways (vaccine efficacy against infection defined as VE_{SUSC}), or by preventing symptoms despite breakthrough infection (vaccine efficacy against symptoms or VE_{SYMP}). Vaccine efficacy against infectiousness (VE_{INF}), defined as the decrease in secondary transmissions from infected vaccine recipients versus from infected placebo recipients, is also not being measured. Using mathematical modeling of data from King County Washington, we demonstrate that if the Moderna and Pfizer vaccines, which have observed VE_{DIS}>90%, mediate VE_{DIS} predominately by complete protection against infection, then prevention of a fourth epidemic wave in the spring of 2021, and associated reduction of subsequent cases and deaths by 60%, is likely to occur assuming rapid enough vaccine roll out. If high VE_{DIS} is explained primarily by reduction in symptoms, then VE_{INF}>50% will be necessary to prevent or limit the extent of this fourth epidemic wave. The potential added benefits of high VE_{INF} would be evident regardless of vaccine allocation strategy and would be enhanced if vaccine roll out rate is low or if available vaccines demonstrate waning immunity. Finally, we demonstrate that a 1.0 log vaccine-mediated reduction in average peak viral load might be sufficient to achieve VE_{INF}=60% and that human challenge studies with 104 infected participants, or clinical trials in a university student population could estimate VE_{SUSC}, VE_{SYMP} and VE_{INF} using viral load metrics.

## Introduction

The endpoint for SARS-CoV-2 vaccine efficacy trials targeting licensure is vaccine efficacy against disease (VE_{DIS}) which is defined by a reduction in symptomatic disease, confirmed with polymerase chain reaction (PCR) testing for viral RNA, in vaccine recipients relative to placebo recipients (*1, 2*). The FDA benchmark for licensure is a point estimate of VE_{DIS}>50% with lower alpha-adjusted 95% confidence limit exceeding 30% (*3*). Two mRNA vaccines have shown high levels of protection (>90%) at interim analyses (*4, 5*).

Once VE_{DIS} is established and a vaccine is licensed, mathematical modeling is useful for projecting a roll out strategy that affords maximal reductions in deaths and cases, and to prevent the need for future lockdowns (*6-8*). Yet VE_{DIS} does not provide sufficient information to fully inform these models. High VE_{DIS} is determined by a combination of two distinct phenomena which will only be partially captured in these trials: vaccine efficacy against susceptibility (VE_{SUSC}) which is defined as the vaccine-induced reduction in the rate of infection – as evidenced by detection of virus by PCR - and vaccine efficacy against symptoms (VE_{SYMP}) which is defined as the reduction in the presence of symptoms conditional on infection under vaccine versus placebo **(Table 1, Figure 1)** (*1, 2, 9*). If a vaccine mediates VE_{DIS} primarily through reduction in symptoms, the extent to which people, who convert from symptomatic to asymptomatic infection as a result of receiving the vaccine, can still transmit the virus, remains unknown. A vaccine that achieves high VE_{DIS} via VE_{SYMP} could theoretically contribute less to overall herd immunity than a vaccine that achieves high VE_{DIS} via VE_{SUSC}, as the former may not block ongoing chains of transmission from vaccine recipients.

A third vaccine effect, efficacy against infectiousness (VE_{INF}) is defined as reduction in secondary transmissions from either symptomatic or asymptomatic infected vaccine versus placebo recipients and could also have significant effects on the trajectory of viral epidemics (*10*). Reduced VE_{INF} anticipates that symptomatic breakthrough infections in vaccine recipients may be associated with fewer secondary transmissions than in placebo recipients, and that people who develop asymptomatic rather than symptomatic infection due to vaccination (VE_{SYMP}) may also be less likely to transmit. This latter observation would be expected if a vaccine mediates reduction in both symptoms and secondary transmission potential by lowering the quantity of viral shedding (*11*). While high VE_{DIS} guarantees a high likelihood of individual benefit, protection of unvaccinated members of the population will also depend on VE_{SUSC} and VE_{INF}, as well as the velocity of a vaccination roll out program (*8, 12*).

SARS-CoV-2 serology is being used in the Pfizer and Moderna trials to capture asymptomatic infections and thereby estimate VE_{SUSC} and VE_{SYMP} (*1*). Yet, waning SARS-CoV-2 humoral responses could limit the sensitivity of this approach. Results from the ChAdOx1 vaccine trial show a trend towards lower protection against infection by viral nucleic acid detection than against symptomatic COVID-19, highlighting the potential importance of this approach, though low frequency of sampling could limit the accuracy and precision of these estimates (*13*). VE_{INF} is also not being directly assessed in ongoing vaccine trials. While all current studies will ultimately measure viral load at presentation among symptomatic infected persons only, this approach misses all asymptomatic people who may or may not continue to secondarily transmit SARS-CoV-2. It also does not capture the critical pre-symptomatic phase of symptomatic infection when viral load and transmissibility are highest (*14-16*).

The inability to fully discriminate VE_{SUSC} from VE_{SYMP}, and to directly measure VE_{INF}, in the current slate of promising vaccines limits our ability to forecast vaccine impacts in the population. Specifically, there is uncertainty regarding the threshold of vaccinated people required to achieve herd immunity, where the effective reproductive number (R_{eff}) is maintained below 1 and new cases contract. It is similarly challenging to optimize vaccine allocation to different sectors of the population. For instance, it may be best to target a vaccine with high VE_{SUSC} or VE_{INF}, which breaks secondary chains of transmission, towards essential workers and young people. Alternatively, a vaccine with high VE_{SYMP} but limited effects on secondary transmission may be best prioritized towards populations with highest risk of severe disease, such as the elderly (*8*).

Several possible methods exist to estimate VE_{INF}. One is to measure secondary attack rate among household contacts of infected vaccine recipients versus infected placebo recipients (*17*). Alternatively, cluster randomized trials can assess for indirect protection of unvaccinated persons in vaccinated versus unvaccinated communities (*18*). While both of these trial designs are attractive, they have high operational complexity and will need to be implemented and completed rapidly to impact the course of the pandemic.

Another option is to use a viral load metric as a surrogate endpoint. VE_{INF} is likely to be mediated via a reduction in viral load among recipients of vaccine versus placebo, particularly early during pre-symptomatic or asymptomatic infection when nasal and saliva viral loads are highest (*16, 19-21*). It is possible that VE_{SYMP} is also driven by viral load reduction, though it has yet to be proven beyond association whether any specific viral load metric predicts development of symptoms or severe COVID-19 (*22*). Moreover, only a few studies captured critical early peak viral load kinetics, and in too few people to perform correlate analyses (*20*). Viral load in infected vaccine versus placebo recipients could be measured in large clinical trials in which enrolled participants undergo frequent self-sampling after enrollment, or in smaller highly controlled human challenge studies (*23*).

Here we use a mathematical modeling approach using data from King County Washington to demonstrate the potential effects of VE_{INF} at the population level given multiple vaccine profiles. In contrast to existing models of vaccine prioritization (*7, 8*), including our own (*24*), the model accounts for the likely need for recurrent lockdowns once cases and hospitalizations exceed a certain threshold. We next estimate reduction in peak viral load required to achieve various VE_{INF} and outline animal viral graded challenge experiments required to verify the relationship between viral load and likelihood of transmission. Finally, we propose human challenge study design, and describe a potential clinical trial in university students, to rapidly estimate VE_{SUSC}, VE_{SYMP} and VE_{INF} for a vaccine.

## Results

### Overview

We use a mathematical model of COVID-19 in King County Washington to project the impact of different vaccine profiles on incident cases, hospitalizations and deaths throughout 2021. Our goal is to identify vaccine efficacy profiles under which high VE_{INF} would or would not reduce a large number of infections and deaths, as well as the need for prolonged lockdowns. Because we identify that VE_{INF} could determine whether or not a large fourth wave of infections occurs in the region during the spring of 2021, and because VE_{INF} has not yet been measured for vaccines under study, we next focus on approaches to rapidly estimate its value. Using an intra-host transmission model (*14, 25, 26*), we consider reduction in peak viral load as a potential surrogate endpoint for VE_{INF}, discuss graded challenge studies in animal models to validate these predictions, and provide human viral challenge study designs which might provide actionable VE_{INF} estimates within relevant timeframes for the pandemic.

### Vaccine efficacy definitions

We include vaccines defined by four different types of efficacy **(Table 1)**. VE_{DIS} is the primary endpoint in all ongoing phase 3 vaccine trials and is defined as the proportion of vaccine versus placebo recipients who do not develop symptomatic COVID-19 (*1*). VE_{DIS} represents a composite of efficacy against infection (VE_{SUSC}) and disease (VE_{SYMP}). VE_{SUSC} is the proportion of people in a vaccine arm relative to the placebo arm of a trial who are fully protected against both asymptomatic and symptomatic virologically confirmed SARS-CoV-2 infection. The effect of high VE_{SUSC} (90%) is shown in **Vaccines 1 & 2** in **Fig 1**. Complete protection against infection also implies no possibility of secondary transmission. VE_{SYMP} is conditional on infection and is defined as the proportion of people in the vaccine arm relative to the placebo arm of a trial who remain asymptomatic despite being infected. A vaccine in which VE_{DIS} is mediated entirely by VE_{SYMP} will not lower the absolute number of people infected **(Vaccines 3 & 4** in **Fig 1**).

The proportion of secondary contacts who are symptomatically or asymptomatically infected (red and orange figures respectively in **Fig 1**) by infected vaccine recipients relative to secondary contacts of placebo recipients determines the vaccine efficacy against infectiousness (VE_{INF}, **Table 1**). Unlike VE_{SUSC} and VE_{SYMP}, VE_{INF} is not a determinant of VE_{DIS}. Vaccines with no VE_{INF} **(Vaccines 1 & 3**) and moderate VE_{INF} **(Vaccines 2 & 4**) are shown in **Fig 1**. Notably, the addition of VE_{INF}=50% prevents a much larger number of secondary infections in a scenario where high VE_{DIS} is mediated entirely by VE_{SYMP} **(Vaccine 4)** versus a scenario where high VE_{DIS} is mediated entirely by VE_{SUSC} **(Vaccine 2)**.

The possible relevance of this concept for the Moderna and Pfizer vaccines is evident by comparing extreme scenarios in which VE_{DIS}=90%. A vaccine with VE_{DIS}=90% due to VE_{SUSC}=90% / VE_{SYMP}=0% / VE_{INF}=0% would have very similar population level effects as a vaccine with VE_{SUSC}=0% / VE_{SYMP}=90% / VE_{INF}=100%. In the latter scenario, all vaccinated and subsequently infected people would be asymptomatic and could also not secondarily transmit the virus. On the other hand, a vaccine with VE_{SUSC}=0% / VE_{SYMP}=90% / VE_{INF}=0% would protect vaccinated people against symptoms but have no impact on downstream transmissions.

### Projection of incident SARS-CoV-2 infections and deaths in King County Washington in 2021 assuming no vaccine

We modified a previously developed compartmental model to reproduce the ongoing COVID-19 epidemic in King County Washington (*24*). The model includes uninfected, exposed, asymptomatic infected, pre-symptomatic infected, symptomatic infected, diagnosed asymptomatic, diagnosed symptomatic, hospitalized, dead and recovered compartments, all stratified into four age cohorts **(Sup fig 1)**. We calibrated the model to daily cases **(Sup fig 2a)**, daily hospitalizations **(Sup fig 2b)**, daily deaths **(Sup fig 2c**), age-stratified cases **(Sup fig2d)**, age-stratified hospitalizations **(Sup fig2e)**, age-stratified deaths **(Sup fig2f)**, cumulative cases **(Sup fig 2g)**, and cumulative deaths **(Sup fig 2h)** through November, 2020. Model equations are in the **Supplement** with parameters and their values in **Supplementary tables 1-4**.

Extending beyond the calibration period, we attempt to capture realistic approximations of local pandemic management to date. First, based on experience in other U.S. states and current Washington state policies, we assume that in the absence of a vaccine, numbers of cases and hospitalizations are likely to fluctuate due to the community response to the epidemic (*27, 28*). When number of new infections remains below a certain threshold, physical distancing measures are assumed to relax allowing greater contact between susceptible and infected people. The effective reproductive number (R_{e}) eventually exceeds one and cases start growing in number. Eventually a threshold is surpassed that necessitates reinforcement of physical distancing restrictions: R_{e} drops below one and cases contract. Based on timing of Washington state reinforcement of social distancing (*29*), we set this threshold at a 2-week daily case average of 300 new cases per 100,000. Physical distancing is set on a scale of 0 to 1 where 0 represents the level of pre-pandemic interactivity in the population and 1 implies no physical interactions.

Without a vaccine (black line, **Fig 2**), we projected the ongoing recurrent third wave (numbered in top row) of infections **(Fig 2a)**, diagnosed cases **(Fig 2b)**, hospitalizations **(Fig 2c)** and deaths **(Fig 2d)** between 11/2020 and 3/2021 **(Fig 2, top row)**. We also anticipated the need to re-enforce physical distancing to achieve 40% interactivity relative to pre-pandemic levels **(Fig 2e)**, in order to lower R_{e} below 1 **(Fig 2f)**. The model projects that this third wave would have a significantly higher number of infections, diagnosed cases and hospitalizations then the first wave in the spring of 2020 and the second wave in the summer of 2020. Daily deaths were projected to peak at similar levels to the first wave due to a higher proportion of younger individuals becoming infected with lower death rates. By January 2021, despite physical distancing, ∼15% of the population was projected to have been infected with ∼7000 hospitalizations and ∼1500 deaths **(Fig 2, middle row**).

In the absence of a vaccine, our model projected a substantial fourth wave of infections **(Fig 2a)**, diagnosed cases **(Fig 2b)**, hospitalizations **(Fig 2c)** and deaths **(Fig 2d)** between April and October 2021, necessitating a fourth cycle of increased physical distancing **(Fig 2e)**. At the end of this fourth wave, we forecast that ∼25% of the population will have been infected and ∼5% diagnosed with ∼10,000 hospitalizations and ∼2000 deaths **(Fig 2, middle row**).

### Vaccine model scenarios

We consider scenarios in which VE_{SUSC}, VE_{SYMP}, and VE_{INF} each have either low (10%), medium (50%) or high (90%) efficacy. Each possible parameter combination allows for 3^{3} (27) vaccine scenarios. Five VE_{SUSC} and VE_{SYMP} combinations (15 scenarios when considering the 3 values of VE_{INF}): VE_{SUSC}=90% / VE_{SYMP}=10%; VE_{SUSC}=10% / VE_{SYMP}=90%; VE_{SUSC}=90% / VE_{SYMP}=50%; VE_{SUSC}=50% / VE_{SYMP}=90%; VE_{SUSC}=90% / VE_{SYMP}=90%) would be roughly compatible with current projections for the Moderna and Pfizer mRNA vaccines which had estimated VE_{DIS} = 95% and 90% respectively at interim analyses (*4, 5*). Three combinations or 9 scenarios (VE_{SUSC}=50% / VE_{SYMP}=50%; VE_{SUSC}=50% / VE_{SYMP}=10%; VE_{SUSC}=10% / VE_{SYMP}=50%) would be realistic if there is a slight decrease in VE_{DIS} in the final analysis of these studies and would still meet criteria for licensure. These lower vaccine estimates may be relevant for other vaccines in development, or after a single dose of the Moderna or Pfizer product. One combination or 3 scenarios (VE_{SUSC}=10% / VE_{SYMP}=10%) which would not meet licensure requirements are included as controls to independently assess the effect of increasing VE_{INF}.

We initially assumed 5000 vaccinations per day with the goal of covering 50% of the population of 2.2 million people with a start day for vaccination on January 1, 2021 allowing 220 days until completion in mid-August. In our simulations, both susceptible and recovered persons were vaccine eligible. We assumed that the vaccine start date represented timing of the second shot for the mRNA vaccines such that efficacy accrues at the defined time of vaccination. Initially, we imputed no loss of vaccine efficacy over time.

In one scenario **(Fig 2)**, we assumed disproportionate initial targeting of the cohorts aged > 70 initially (80% of vaccines with 20% to those older than 20 years old). In a second scenario **(Sup fig 3)**, we first vaccinated the age cohorts with greatest inter-connectivity (20-45 and 45-69 years old received 80% of vaccines). We also imputed slow relaxation of social distancing during the vaccination program when cases remained below a certain threshold.

### High VE_{SUSC} without VE_{INF} is sufficient for prevention of a fourth wave of SARS-CoV-2 cases, deaths, and lockdown in spring 2021

We first considered scenarios in which elderly cohorts were vaccinated first and VE_{DIS} was mediated mostly by VE_{SUSC} rather than VE_{SYMP} (VE_{SYMP}=10%). Vaccines with high VE_{SUSC} (90%, blue lines) or VE_{INF} (90% darkest blue, green and red lines) resulted in the greatest reduction in peak **(Fig 2 top row)** and cumulative **(Fig 2 middle row)** infections **(Fig 2a)**, diagnosed cases **(Fig 2b)**, hospitalizations **(Fig 2c)** and deaths **(Fig 2d)**. All vaccines with VE_{SUSC} = 50% (green lines) or 90% (blue lines), or VE_{INF} = 50% (medium blue, green and red) or 90% (dark blue, green and red), prevented a large fourth wave of infections and deaths and allowed 20% physical distancing starting in April 2021 **(Fig 2e)** while maintaining R_{e} less than 1 **(Fig 2f)**. Notably, only VE_{SUSC}=90% vaccines (blue lines) are compatible with Moderna and Pfizer results and VE_{INF} had little impact on these projections.

All vaccines with at least 90% VE_{SUSC} or VE_{INF}, or both 50% VE_{SUSC} and 50% VE_{INF}, lead to a reduction in approximately 200,000 infections, 45,000 diagnosed infections, 3500 hospitalizations and 600 deaths since the start of the vaccination period **(Fig 2 bottom row)**, though fewer deaths were prevented than would have already occurred by April 2021 (∼1500). A vaccine with VE_{SUSC} = 10%, VE_{INF} = 10% and VE_{SYMP} = 10% (pink line) was predicted to only delay the peak of infections, hospitalizations and deaths **(Fig 2 top row)** with a small percentage reduction in these outcomes over time **(Fig 2 bottom row)** and a requirement for a fourth phase of increased physical distancing **(Fig 2e)**.

Nearly equivalent results were noted when younger and middle-aged cohorts with higher inter-connectivity were vaccinated with higher priority **(Sup fig 3)**.

### High VE_{INF} is a requirement for prevention of a fourth wave of SARS-CoV-2 cases, deaths, and lockdown in spring 2021 for low VE_{SUSC}, high VE_{SYMP} vaccines

We next considered a scenario in which VE_{DIS} was mediated mostly by VE_{SYMP} rather than VE_{SUSC} (10%) with vaccine prioritization to the elderly. Once again, for all conditions with VE_{INF} = 90% (darkest purple, darkest orange lines), we observed a substantial decrease in infections **(Fig 3a)**, diagnosed cases **(Fig 3b)**, hospitalizations **(Fig 3c)** and deaths **(Fig 3d)** with no large fourth wave observed **(Fig 3, top row)**, a levelling off in cumulative incidence of all outcomes **(Fig 3, middle row)**, and a 60-70% reduction in all outcomes starting at the time of vaccination **(Fig 3, bottom row)**. There were no visible effects of increasing VE_{SYMP} when VE_{INF} = 90%.

Under the high VE_{SYMP}, low VE_{INF} scenario compatible with the Moderna and Pfizer vaccine trial results (light purple line), a fourth protracted wave of ∼60,000 infections and ∼200 deaths lasting from April 2021 through January 2022 occurred which did not meet a threshold that necessitated further resumption of lockdown measures **(Fig 3e**,**f)**. At moderate VE_{INF} (50%, medium purple and medium orange) and in particular at lower VE_{INF} (10%, light purple and light orange), we observed a beneficial effect of increased VE_{SYMP} **(Fig 3)** with further reduction in all outcomes at high (90%, light purple) versus moderate (50%, light orange) VE_{SYMP}. This result likely relates to the fact that VE_{SYMP} converts symptomatic to asymptomatic infection which in our model is predicted to be associated with a 44% decrease in overall infectivity.

### Lowering of cases and deaths under vaccine scenarios with low or moderate VE_{SUSC} and high VE_{INF}

We explored the impact of varying VE_{INF} under several plausible vaccine scenarios with VE_{DIS} = 90%. We generated heat maps for total post-vaccine diagnosed cases **(Fig 4a)** and deaths **(Fig 4b)** and identified that for scenarios when VE_{DIS} =90% is mediated mostly by VE_{SYMP} (90%, black circles), increasing VE_{INF} from 10% to 90% resulted in substantial further reductions in diagnosed cases (∼20,000) and deaths (∼60). When VE_{DIS}=90% was mediated entirely (90%, white circles) or mostly (70%, grey circles) by VE_{SUSC}, then increasing VE_{INF} from 10% to 90% resulted in very limited further reductions in diagnosed cases and or deaths.

We next identified that VE_{SUSC} and VE_{INF} had nearly equivalent effects on number of post-vaccine diagnosed cases **(Sup fig 4a)** and deaths **(Sup fig 4b)**. For VE_{SYMP} = 10% and 50% (**Sup fig 4, left and middle columns**), at values of VE_{SUSC}<50%, increases in VE_{INF} lead to further reductions in cases and deaths. However, at high values of VE_{SUSC}, increases in VE_{INF} added little benefit. There was moderate effect modification by VE_{SYMP}: particularly for deaths, there was less added benefit of increasing VE_{INF}, when VE_{SYMP} = 90% (**Sup fig 4, right column**) highlighting that VE_{SYMP} prevents deaths more efficiently than infections.

### VE_{INF} as a determinant of the severity of a fourth wave in the event of slow vaccine roll out

The distribution and acceptability of vaccines to the public remains uncertain. We therefore simulated the vaccine scenarios in **Fig 2** assuming half (2500 vaccines / day with completion in March 2022, **Fig 5, left column)** the roll out speed. Here we assumed VE_{SYMP}=90% such that all nine considered scenarios had efficacies compatible with the Moderna and Pfizer results. Nevertheless, at slower roll out, a fourth wave of infections **(Fig 5a)**, diagnosed cases **(Fig 5b)**, hospitalization **(Fig 5c)** and deaths **(Fig 5b)** occurred among all scenarios. The peak **(Fig 5 top row)** was blunted and delayed under scenarios with VE_{SUSC}=90% (blue lines) or VE_{INF}=90% (dark red, green and blue) with ∼100,000 fewer cumulative cases and ∼300 fewer deaths **(Fig 5 middle and bottom rows)** indicating that VE_{INF} would take on added importance under less optimal roll out scenarios with low VE_{SUSC}. Only the scenario with limited effects on secondary transmission VE_{SUSC}=10% and VE_{INF}=10% allowed a severe enough wave to necessitate another round of physical distancing **(Fig 5e,f)**.

Vaccine uptake at 5000 per day resulted in R_{eff}<1 in April to June at 20% social distancing relative to pre-pandemic conditions (SD=0.2), a rough proxy for herd immunity assuming some degree of residual social distancing and masking **(Fig 6a)**. Vaccine uptake at 2500 per day achieved R_{eff}<1 in August to September **(Fig 6b)**. The more rapid roll out scenario achieved this functional herd immunity before occurrence of the fourth wave whereas under the slower roll out scenario, a surge in cases in July contributed to development of R_{eff}<1 (**Fig 6, bottom row)**. Under both roll out rate scenarios, vaccines that had both low VE_{SUSC} and VE_{INF} (light green and pink), had a one-to-two month delay to reach R_{eff}<1, thereby allowing more infections.

### VE_{INF} as a determinant of whether a fourth wave of cases occurs in the event of waning vaccine efficacy

The duration of SARS-CoV-2 vaccine induced immunity also remains uncertain. We simulated the vaccine scenarios in **Fig 2** but with VE_{SYMP}=90% assuming slow (1% per month, **Sup fig 5, left column)**, moderate (5% per month, **Sup fig 5, middle column)**, and rapid (10% per month, **Sup fig 5, right column)** reversion from vaccine immune to susceptible. At high loss of efficacy, a fourth wave of cases **(Sup fig 5a)** and deaths **(Sup fig 5b)** eventually occurred among all vaccine scenarios, but was delayed under scenarios with VE_{SUSC}=90% (blue lines) or VE_{INF}=90% (dark red, green and blue) indicating that VE_{INF} may also take on critical importance in the event of waning vaccine efficacy.

### A virologic mathematical model of SARS-CoV-2 transmission

The above results suggest that the potential severity of a fourth wave can only be estimated with accurate estimates for VE_{SUSC}, VE_{SYMP} and VE_{INF} among relevant vaccines, as well as rates of vaccine roll out and duration of protection. It is therefore a priority to identify the true values for these vaccine characteristics.

Based on experience from multiple other viruses that exposure dose predicts transmission (*30, 31*), we hypothesize that VE_{INF} is likely to be mediated by reduction in viral load among infected people **(Fig 7)**. We therefore employed our existing intra-host model described in the **Methods** that links SARS-CoV-2 viral load dynamics in an infected person with the potential for transmission. The model reproduces two key distributions: the heterogeneous number of people secondarily infected by each person (termed *individual R0*) and the serial intervals associated with each transmission pair (*16, 32, 33*). The resulting model output explains observed super-spreader phenomenon, and predicts a transmission dose response curve which captures the probability of transmission given an exposure viral load (*14, 26*). The model was compatible with existing data showing that most transmissions occur during the pre-symptomatic phase of infection (*16*).

### Small reduction in peak viral load required for lowering VE_{INF}

We next considered methods to estimate VE_{INF} using viral load as a potential surrogate. To simulate the effect of a vaccine on viral load, we assumed the presence of a tissue-resident population of immune cells with rapid ability to recognize and kill infected cells, as well as proliferate *in situ*, as a necessary condition to lower peak viral load in infections such as SARS-CoV-2 with rapid initial growth kinetics (*34*). We first established a relationship between the initial number of tissue resident immune cells and peak viral load **(Fig 8a, b)** during individual simulated infections. We then assumed vaccination trials consisting of 1000 people in which vaccine recipients generated a certain number of these immune cells while placebo recipients did not. By estimating the reduction in number of transmissions, we then were able to estimate VE_{INF} for each vaccine. The model predicted a sigmoidal relationship between reduction in peak viral load and VE_{INF}: a 0.6 log or 4-fold reduction in peak viral load resulted in VE_{INF} =50% and a 2.5 log or ∼300-fold reduction resulted in VE_{INF} = 90% **(Fig 8c)**.

### Proposed animal challenge experiments for establishing transmission dose required for SARS-CoV-2

Our model derived prediction that vaccine induced reduction in exposure viral load could serve as a surrogate endpoint for VE_{INF} requires experimental validation. While a transmission dose response relationship has been demonstrated for SARS-CoV-1 in mice (*35*), and infection of SARS-CoV-2 between golden hamsters occurred with direct viral load exposure at 10^{11} RNA copies but not 10^{9} viral RNA copies (*36, 37*), formal dose response experiments are still needed for SARS-CoV-2. We performed mock experiments in which 30 animals (either mice, hamsters or non-human primates) are challenged with 3 or 5 graded doses of viruses selected from experience with SARS-CoV-1. This method demonstrated that given the existence of a true transmission dose response curve, we can identify the slope and ID50 (dose at which 50% transmission occurs) >99% of the time if 5 doses (selected based on the TD50 and slope from SARS-CoV-1 challenge in mice) are used in the experiments **(Sup table 5)**. If the initial estimate for ID50 is mis-specified 20-fold, then the failure percentage increases to 7% and the 95% confidence range for slope and ID50 of the curve are wider **(Sup table 6)**. However, if we employ a dynamic sample size allocation procedure where we determine the challenge dose based on the number of infections in a single first dose group (subsequent doses are increased if all mice remain uninfected or vice versa) then the uncertainty range for both parameters decreases and the failure percentage is miniscule **(Sup table 7)**. Overall, this result suggests that the transmission dose response curve predicted by our mathematical model could be verified quickly in graded challenge experiments in animals, adding further veracity to viral load as a reasonable proxy for transmission risk in clinical trials.

### Proposed study design for randomized placebo controlled double blinded human challenge vaccines trials with viral load as endpoints

One method to directly assess VE_{SUSC} and VE_{SYMP,} and to indirectly estimate VE_{INF}, using projections from our model **(Fig 8c)** is a human challenge study in which healthy participants are challenged with SARS-CoV-2 one month after receiving a final dose of a vaccine or placebo schedule in a blinded fashion. Following challenge, viral load would be sampled daily for 2 weeks to capture true peak.

Under this study design, we would expect all placebo recipients to develop virologically confirmed infection and for most to be symptomatic. Assuming VE_{DIS}=90%, then only ∼10% of vaccine recipients would develop symptoms with a confirmatory PCR test. If VE_{DIS} is mediated entirely by VE_{SUSC}, then vaccine recipients would shed no virus as in **Fig 7** and the study would not be able to assess VE_{INF}. However, as shown conceptually in **Fig 1** and in projected model data in **Figs 4**, VE_{INF} has little added benefit when VE_{SUSC} is high making its estimation unnecessary for population projections.

If VE_{DIS} is mediated entirely by VE_{SYMP}, then vaccine recipients will shed virus as in **Fig 7** and useful comparisons can then be made between infected vaccine and placebo recipients. We demonstrate that with this approach, 80% power can be achieved to detect differences in peak log_{10} viral load between vaccine and placebo arms with as few as 10 infected participants per arm if peak viral reduction is 2.5 log_{10} (model estimated VE_{INF}∼90%), 52 participants per arm if peak viral reduction is 1.0 log_{10} (model estimated VE_{INF}∼60%), and 143 participants per arm if peak viral reduction is 0.6 log_{10} (model estimated VE_{INF}=50%) **(Sup fig 6)**.

## Discussion

An optimal vaccine program would prevent the maximum numbers of cases and deaths, without the need for further lockdown periods. The first component of such a program will be testing and licensing of vaccines that provide at least partial protection from symptomatic disease (VE_{DIS}). Initial data from the Pfizer and Moderna trials suggest that these products may have 90% or higher VE_{DIS}. The second step is to consider the proportion of the population that will need to be vaccinated in order to provide herd immunity. This threshold will depend critically on indirect effects that protect unvaccinated members of the population. Indirect effects occur when VE_{DIS} is mediated by VE_{SUSC} rather than VE_{SYMP}, but may also be augmented by a vaccine product with high VE_{INF} in the scenario where VE_{SUSC} is low. Given rapid enough roll out, our results suggest that vaccines with either high VE_{SUSC} or high VE_{INF,} or both moderate VE_{SUSC} and moderate VE_{INF}, could prevent a fourth wave of cases and deaths in the spring and summer.

Unfortunately, VE_{SUSC} can only be partially discriminated from VE_{SYMP} in current clinical trials using serologic assays which may miss infection due to waning humoral responses (*38*). Moreover, VE_{INF} is not being directly assessed. While viral load is being compared between symptomatic infected vaccine and placebo recipients, in most cases, sampling will occur several days after the peak viral load when transmission risk is highest (*1*). Viral load in asymptomatic cases also will not be captured. Overall, VE_{SUSC}, VE_{SYMP} and VE_{INF} are particularly challenging to measure, leaving policy makers with incomplete information for projecting the impact of a given vaccine.

We identify that under any scenario in which VE_{SUSC} is low, a vaccine with VE_{INF} >50% would add substantial protection at the population level. A vaccine with this profile would exert maximal benefit whether given to younger or elderly cohorts first if rolled out quickly enough. If VE_{SYMP} is driving observed results, then high VE_{INF} would be vital for preventing thousands of cases and saving hundreds of lives in King County, with much larger benefits when considered across the larger US population. In scenarios where a fourth spring wave is inevitable, such as slow vaccine roll-out or waning vaccine induced protection, VE_{INF} could potentially delay and blunt the peak number of cases and deaths, thereby preventing the need for re-enforcement of physical distancing while also preventing many deaths.

Therefore, it is an urgent research priority to identify reasonable estimates for VE_{SUSC}, VE_{SYMP} and VE_{INF} for vaccines that will be given to the population at large. Trial strategies which attempt to directly measure secondary infections in households (*39, 40*), or to assess the degree of protection afforded to unvaccinated members of communities with partial vaccination relative to communities with less vaccination, would potentially be conclusive. They will need to be performed quickly to obtain actionable results prior to spring 2021.

Our analyses suggest that peak viral load could serve as a surrogate endpoint for secondary transmission and allow for rapid, complementary studies. We estimated the relationship between viral load and transmission probability for SARS-CoV-2 based on model fitting to observed serial intervals and individual R0 values (*14*). The emergent transmission response curve took on a similar sigmoidal shape to empirically derived curves for SARS-CoV-1 in a controlled set of murine experiments (*35*), and also resembled the relationship between quantitative viral PCR and probability of culture positivity in humans infected with SARS-CoV-2 (*41*).

As a first step, it is necessary to formally test the hypothesis that exposure viral load is predictive of transmission risk (*42*). A valid viral load surrogate cannot currently be inferred from human cohorts as the exposure viral load is never documented between transmission pairs, though formal surrogate endpoint analysis will ultimately be necessary if sufficient data emerges. We suggest that animal models of infection are ideal for this purpose and that necessary transmission dose can be inferred with a relatively small number of non-human primates or mice.

Human trials using reduction in peak viral load or viral area under the curve as correlates for reduction in VE_{INF} could take one of two forms. The first would involve prospective nasal sampling of virus in all enrolled participants with virologic endpoints compared between those who become infected in vaccine and placebo arms. An ideal study population would be university students due to their high incidence rate and low overall infection morbidity. The advantages of this approach would be real-world validation of biologic vaccine effects in which participants experience natural variability in potentially critical factors such as viral exposure dose, time between vaccination and infection, and route of transmission. The relationship between viral load and symptoms would also be clarified with this study design. Challenges would be operational including large samples size and a massive number of prospective samples.

Human challenge studies are a potentially rapid method to directly measure VE_{SUSC} and VE_{SYMP,} and to indirectly estimate VE_{INF} using viral load, as each participant would contribute to the study endpoints. This approach could potentially be completed in at minimum 104 participants within 2-3 months, depending on the selected time between vaccination and viral challenge. While challenge studies are most efficient, there are important ethical considerations regarding potential harm to study participants. Moreover, it will be uncertain whether results can be generalized to the wider population, particularly those in different age cohorts. Nevertheless, even crude estimates of VE_{SUSC}, VE_{SYMP} and VE_{INF} could add critical knowledge to influence vaccine implementation policies.

Our approach has limitations. We combine several scales of models which reflect population conditions unique to King County Washington and virologic findings from across the globe. The models are not equipped to make precise vaccine schedule assessments for different locations and are not meant as predictions. Rather, we intend to make the conclusion that VE_{INF} could theoretically provide substantial population level benefits and to provide a framework for most rapid evaluation of this metric. The scope of the ongoing third wave is difficult to forecast and will depend on changes in human behavior over the next several weeks. The number of cases and deaths during a possible fourth spring wave may be somewhat dependent on current events.

In conclusion, in the situation where observed high VE_{DIS} is predominately due to reduction in symptoms rather than absolute protection against infection, VE_{INF} will be vital to measure as it may determine whether a severe fourth wave of cases and deaths is imminent in the spring. Using peak viral load as a proxy measure in human challenge studies is an efficient way to complement other clinical trial designs to assess VE_{INF}.

## Methods

### King Country transmission model

We modified a previously developed deterministic compartment model (*43*), which captures the epidemic dynamics in King County, WA between January 2020 and October 2020 and projects the trajectory of the local pandemic through the end of 2021 in the absence and presence of vaccines. Vaccination is simulated with a starting date of January 1, 2021. Our model stratifies the population by age (0-19 years, 20-49 years, 50-69 years, and 70+ years), infection status (susceptible, exposed, asymptomatic, pre-symptomatic, symptomatic, recovered), clinical status (undiagnosed, diagnosed, hospitalized) and vaccination status.

In our main scenario we assume that 20% of infections are asymptomatic and that asymptomatic people are as infectious as symptomatic individuals but missing the highly infectious pre-symptomatic phase. As a result, the relative infectiousness of individuals who never develop symptoms is 56% of the overall infectiousness of individuals who develop symptomatic COVID-19. This conservative estimate falls between the 35% relative infectiousness estimated in recent review based on 79 studies (*44*) and the current best estimate of 75% suggested by the CDC in their COVID-19 pandemic planning scenarios (*45*).

The forces of infection, representing the risk of the susceptible individuals to acquire infection (transition from susceptible to exposed), are differentiated by age of the susceptible individual, the contact matrix (proportion of contacts with each age group), infection and treatment status (asymptomatic, pre-symptomatic, symptomatic, diagnosed and hospitalized cases) of the infected contacts as described in the **Supplement**.

The model is parameterized with local demographic and contact data from King County, WA and calibrated to local case and mortality data using transmission parameters ranges informed from published sources (*15, 46-48*).

A critical parameter in the model is the social distancing metric which estimates the amount of potential infection contacts between members of the population. This parameter is intended to capture physical contact reduction due to physical distancing policies, but also decreased number of transmission contacts due to masking. The parameter varies between 0, which represent pre-pandemic level of interactivity, and 1, which represents complete physical distancing with no interactivity.

In the current version, we incorporate fluctuating values of this parameter retrospectively to calibrate the model to observed infection, hospitalization and death data through the end of October 2021, as well as prospectively to capture likely enhanced physical distancing in response to present and future cases. Our benchmarks for increasing physical distancing to 0.6 was when 2-week average number of cases exceeded 300 per 100,000. We allowed relaxation of the parameter to 0.2 when 2-week average number of cases fell below 100 per 100,000. For elderly populations, we assume greater restrictions to 0.8 and lessen relaxation to 0.4. This approach reproduces the waves of infection which have defined the United States pandemic to date.

### Vaccine simulations in King County, Washington

We consider several vaccine efficacy profiles as described in the **Results** with different efficacies as defined in **Table 1**. Implementation of these efficacies is described in the **Supplement**.

### SARS-CoV-2 within-host model

We next employed a within-host model describing SARS-CoV-2 infection from our previous study to generate viral loads to assess transmission risk (*21*). The viral load generating model is included in the **Supplement**.

### Intra-host transmission model

We employed our previously described model linking transmitter viral load with probability of transmission (*21*). The details of this model are described in the **Supplement**.

### Intra-host model vaccination simulations

We simulated the impact of the vaccination by assuming that a vaccine generates a certain number of SARS-CoV-2 specific acquired immune cells that are ready to proliferate (with no need for precursor compartments *M*_{1} and *M*_{2}) and act to quickly eliminate the ongoing infection. We thereby modify our intra-host model in the **Supplement** as,
Here, *I*_{50} denotes the level of infected cell that allows proliferation of immune cells at 50% maximal. We assume it to be 10 cells/mL. We further fix ω = 2 days^{-1}cells^{-1} and *m* = 0.01 days^{-1}cells^{-1}.

Each vaccine trial consists of selecting a starting condition of parameter E (E_{0}) that leads to a predictable reduction in peak viral load **(Fig 6b)**. We simulate individual trials with 1000 vaccine participants and 1000 placebo recipients, and then assess the relative reduction in transmissions to estimate VE_{INF} as in **Table 1**.

### Graded challenge sample size estimates

We assume that the dose response model is given by where *P _{t}*[

*V*(

*t*)] is the infectiousness based on viral loads V (dose) at the time t of viral challenge, λ is the infectivity parameter that represents the viral load that corresponds to 50% infectiousness and α is the Hill coefficient that controls the sharpness in the dose-response curve.

We consider the trial design given this initial estimation of parameters: α = 0.8 (from our human model and λ = 100 pfu from murine experiments with SARS-CoV-1 (*35*). We estimate α and λ under the given total sample size (N = 30) with different combination of dose assignment. Particularly, we consider five does: V_{1} = ID10 = 6, V_{2} = ID30 = 35, V_{3} = ID50 = 100, V_{4} = ID70 = 290, and V_{5} = ID90 = 1600 pfu. We consider three different settings to allocate sample size: (1) equal sample size among the five dose (6 animals each dose group); (2) equal sample size among V_{1}, V_{3}, and V_{5}, (10 animals each dose group; and (3) equal sample size among V_{2}, V_{3}, and V_{4}. We simulate the data from the two-parameter dose-response model and use the function ‘drm’ in the R package ‘drc’ to fit the model.

We consider sensitivity analysis when ID50 is poorly specified, especially if ID50 is smaller than expected. We also we consider a dynamic sample size allocation procedure. Particularly, we first allocate n1 = N/5 animals to V_{1} and determine the following dose allocation based on the number of infections in this V1 dose group, denoted as m1. If more than half of the animals allocated to V_{1} are infected, i.e., m1 ≥ n1/2, we set λ∗ = V1 and assign the rest of the animals equally to the doses V_{2}^{∗} =1pfu, V_{3}^{∗} =2pfu, V_{4}^{∗} =18pfu and V_{5}^{∗} =100pfu, which reflect this new λ^{∗}. Based on this dynamic sample size allocation procedure, the proportion of replicates that fail to produce an estimate is smaller and we are able to identify the values of α and ID50 with increased accuracy.

### Human challenge study sample size estimates

Sample size calculations were based on a 2-sample t-test to compare mean peak log_{10} viral load between the vaccine and placebo arm with 80% power. We conservatively assumed a common standard deviation of 1.8 log_{10} for the peak log_{10} viral load in both arms, which is based on estimates from natural infection (*49*), and likely an upper limit for our proposed human challenge trial where the challenge dose and anatomic site would be equivalent. We also assumed equal number of participants in each arm, and a 2-sided type I error of 0.05. We considered a range of values for the difference between peak log_{10} viral loads in the vaccine and placebo arm that corresponded to projected values of VE_{INF} ranging from 30% to 100%.

## Data Availability

Code for our mathematical model will be provided immediately upon request.

## Supplementary materials

### King County Mathematical Model of SARS-CoV-2

Our mathematical model consists of a series of differential equations (*1*), which describe a series of compartments described in **Sup fig 1** and in the **Results**. Equations are listed here:

### King County model parameters

Model parameters are listed as follows with values listed in the **Supplementary Table 1:**

p_{i} – proportion of the infections which become symptomatic by age in absence of a vaccine γ_{1,} γ_{2}-progression rates from exposed (E) to infectious (A and P) to symptomatic (I)

h_{i} – hospitalization rate among severe cases by age in absence of a vaccine

– hospitalization rate among diagnosed by age (calculated)

r_{1-3} - recovery rate of the asymptomatic, mild symptomatic and hospitalized cases

- recovery rate of the diagnosed symptomatic cases by age (calculated)

f_{i} – fatality rate among hospitalized by age

v_{i} – vaccination rate by age including prioritization strategy

v_{s} – vaccine efficacy in reducing the risk of symptomatic infection upon acquisition

v_{h} – vaccine efficacy in reducing the risk of hospitalization upon symptomatic infection

die_{i} – death rate without hospitalization by age.

d_{i} – diagnostic rate by age.

The diagnostic rates are estimated using testing data from the Washington State Department of Health (DOH). The equation uses the average daily tests for the time period in question divided by an estimate of the number of people desiring tests. The rates are divided across age groups according to the fraction of tests in each age group during that month. To get the test demand estimate, we fit a parameter rho_S that represents to likelihood of non-infected individual seeking a test compared to a symptomatically infected individual.

The forces of infection (λ_{i}), representing the risk of the susceptible individuals by age to acquire infection (transition from susceptible to exposed), are differentiated by age of the susceptible individual, a contact matrix (proportion of contacts with each age group), infection and treatment status (asymptomatic, pre-symptomatic, symptomatic, diagnosed and hospitalized cases) of the infected contacts, and the time-dependent reduction of transmission due to physical distancing measures (work from home, closing non-essential businesses, banning large gathering, etc.) applied in the area (scaled up starting March 8 and fully taking effect March 29) and later relaxed during the reopening after May 15 to values that are fit monthly until Oct 31^{st}, 2020.
where β_{a}, β_{p}, β_{s}, β_{d}, β_{h} are the transmission rates from contacts with asymptomatic, pre-symptomatic, symptomatic, diagnosed and hospitalized infections (before the start of COVID measures at t= δ_{1}), c_{ij} is contact matrix (proportion of the contact with other age groups), N_{i} is population size by age, v_{a} is vaccine efficacy in reducing the acquisition risk (reduction of susceptibility), v_{t} is vaccine efficacy in reducing the transmission risk (reduction of infectiousness).

R_{sd} (t) is the reduction of transmission due to physical distancing and other preventive measures which is applied uniformly to all age groups. It is scaled up linearly from 0 to between t= δ_{1} and t= δ_{2}). Later it is calibrated monthly to match the King County epidemic through October of 2020 and then controlled dynamically based on the bi-weekly case rates per 100k of the population. For age groups 1-3 the highest value of is 0.6 (i.e. interactions at 40% of pre-COVID levels) and the lowest allowed is 0.2 (social interactions at 80% of pre-COVID levels). These limits are each 0.2 higher for the oldest age group. The triggers for increasing or decreasing social distancing levels are given in the parameter table.

### King County model calibration

The model is calibrated to 3 “targets” based on local data (**Supplementary fig 2**), namely: the age-wise number of confirmed daily cases (**S2d**), daily deaths (**S2e**) and daily hospital admissions (**S2f**) reported in King County over time since the start of the epidemic outbreak through October 31^{st}, 2020. We used the BFGS optimization algorithm to estimate the best parameter values for the time period being fitted. We defined thresholds for each parameter and proceeded with the best set reported by the routine. Calibration was divided into multiple periods. The first was from the start of the epidemic through the initial lockdown period ending in early May. Subsequent fits were by month, but sharde the initial fits for start date, β_{∗} (overall infectivity) and β_{d} (adjustment to infectivity for diagnosed individuals).

### Intra-host model of SARS-CoV-2 kinetics

This model assumes SARS-CoV-2 (*V*) infects susceptible cells (*S*) at rate *β* producing infected cells (*I*) that then generate new virus at a per-capita rate *π*. In the model, the death of infected cells is mediated by (1) the innate responses *δI*^{k} which is dependent on the infected cell density and the exponent *k*, (2) the acquired immune responses by SARS-CoV-2-specific effector cells (*E*). The acquired responses are non-linear and are captured by the Hill coefficient *r* that allows rapid saturation of the killing of infected cells whereas the parameter *ϕ* determines the level of SARS-CoV-2-specific effector cells at which the saturation occurs. Moreover, we describe the rise of SARS-CoV-2-specific effector cells in a two-stage manner. The first stage defines the proliferation of the first precursor cell compartment (*M*_{1}) at rate *ω* and differentiation into a second precursor cell compartment (*M*_{2}) at a per capita rate *q*. Finally, second precursor cells differentiate into effector cells at the same per capita rate *q* and are cleared at rate *δ*_{E:}. More details on the model can be found in the ref (*2*).

The model is expressed as a system of ordinary differential equations:
The initial conditions for the model were assumed as *S*(0) = 10^{7} cells/mL, *I*(0) = 1 cells/mL, , *M*_{1} (0) = 1, *M*_{2} (0) = 0 and *E*_{0} = 0. For simulations, we sampled parameter values from a nonlinear mixed-effect model (*3*), with the following fixed effects and standard deviation of the random effects (in parenthesis): Log10 *β*: −7.23 (0.2) virions^{-1} day^{-1}; *δ*: 3.13 (0.02) day^{-1} cells^{-k}; *k*: 0.08 (0.02); Log10(*π*): 2.59 (0.05) day^{-1}; *m*: 3.21 (0.33) days^{-1}cells^{-1}; Log10(*ω*): −4.55 (0.01) days^{-1}cells^{-1}. We also assumed *r* = 10; *δ*_{E:} = 1 day^{-1}; *q* = 2.4 × 10^{−5} day^{-1} and *c* = 1*5* day^{-1}.

### Intra-host transmission model

To estimate SARS-CoV-2 infectiousness *P*_{t} [*V*(*t*)] we employed the function, . Here, *V*(*t*) is the viral load of the transmitter obtained from our previously proposed within host model and estimates (*2*). *λ* is the infectivity parameter that represents the viral load that corresponds to 50% infectiousness and 50% contagiousness, and *α* is the Hill coefficient that controls the slope of the dose-response curve. Our transmission model assumes that only some contacts of an infected individual with viral load dependent infectiousness are physically exposed to the virus (defined as exposure contacts), that only some exposure contacts have virus passaged to their airways (contagiousness) and that only some exposed contacts with virus in their airways become secondarily infected (successful secondary infection). Contagiousness and infectiousness are then treated as viral load dependent multiplicative probabilities with transmission risk for a single exposure contact being the product. Contagiousness is considered to be viral load dependent based on the concept that a transmitter’s dispersal cloud of virus is more likely to prove contagious at higher viral load, which is entirely separate from viral infectivity within the airway once a virus contacts the surface of susceptible cells.

We assumed that the total exposed contacts within a time step is gamma distributed, i.e. , using the average daily contact rates (*θ*) and the dispersion parameter (*ρ*). To obtain the true number of exposure contacts with airway exposure to virus, we multiply the contagiousness of the transmitter by the total exposed contacts within a time step (i.e., ). Transmissions within a time step are simulated stochastically using time-dependent viral load to determine infectiousness (*p*_{t}). Successful transmission is modelled stochastically by drawing a random uniform variable (*U*(0,1)) and comparing it with infectiousness of the transmitter. In the case of successful transmission, the number of secondary infections within that time step is obtained by the product of the infectiousness (*p*_{t}) and the number of exposure contacts drawn from the gamma distribution (*ζ*_{t}). In other words, the number of secondary infections for a time step is . We obtain the number of secondary infections from a transmitter on a daily basis noting that viral load, and subsequent risk, does not change substantially within a day. We then summed up the number of secondary infections over 30 days since the time of exposure to obtain the individual reproduction number, i.e., .

We further assume that upon successful infection, it takes j days for the virus to move within-host, reach the infection site and produce the first infected cell. To calculate serial interval (time between the onset of symptoms of transmitter and secondarily infected person), we sample the incubation period in the transmitter and in the secondarily infected person from a gamma distribution (*4, 5*). In cases in which symptom onset in the newly infected person precedes symptom onset in the transmitter, the serial interval is negative; otherwise, serial interval is non-negative.

The model was fit to distributions of individual R0 (secondary transmissions per person) and serial interval as previously described (*6-10*). We then arrived at parameter estimates for *λ, τ, α* and *θ* and identified that a skewed distribution of daily exposure contacts explains the virus super-spreader property. This model was used to obtain baseline levels of secondary transmission for simulated placebo recipients.