## Abstract

There is no agreed methodology for pharmacometric assessment of candidate antiviral drugs in COVID-19. The most widely used measure of virological response in clinical trials so far is the time to viral clearance assessed by qPCR of viral nucleic acid in eluates from serial nasopharyngeal swabs. We posited that the rate of viral clearance would have better discriminatory value. Using a pharmacodynamic model fit to individual SARS-CoV-2 virus clearance data from 46 uncomplicated COVID-19 infections in a cohort of prospectively followed adults, we simulated qPCR viral load data to compare type 2 errors when using time to clearance and rate of clearance under varying antiviral effects, sample sizes, sampling frequencies and durations of follow-up. The rate of viral clearance is a uniformly superior endpoint as compared to time to clearance with respect to type 2 error, and it is not dependent on initial viral load or assay sensitivity. For greatest efficiency pharmacometric assessments should be conducted in early illness and daily qPCR samples should be taken over 7 to 10 days in each patient studied. Adaptive randomisation and early stopping for success permits more rapid identification of active interventions.

## Background

SARS-CoV-2 infection can be characterised roughly as two overlapping clinical stages. The first stage comprises uncontrolled viral replication. In individuals who have symptomatic COVID-19 illness, timing of peak viral load in the nasopharynx corresponds approximately to the time of symptom onset [1]. The second stage comprises a decrease in the viral load resulting initially from host defence mechanisms. During this second stage a small subset (*<* 5%, strongly age dependent [2]) of infected individuals progress to severe pneumonia and some of these die. While low dose dexamethasone has been shown to reduce mortality at this late stage by approximately 30% [3], no antiviral drugs have shown unequivocal benefit. New effective therapeutics are urgently needed to treat COVID-19 earlier in the course of infection and thereby prevent hospitalisation and death.

It has been hypothesised that effective antiviral drugs administered during the first stage of infection would accelerate virus clearance and thereby reduce the overall viral load. In theory this should reduce the probability of progression to severe COVID-19 illness. Accelerated viral clearance has been used in several preliminary reports of therapeutic interventions as a proxy measure of benefit (i.e. a reduced chance of developing severe COVID-19) [4, 5, 6]. By contrast, if measures of clinical response or clinical progression are used as the primary endpoint then large sample sizes are required to detect meaningful effects as only a minority of symptomatic individuals progress to severe disease. If accelerated viral clearance is indeed predictive of clinical benefit then measurement of viral clearance is much more efficient as a method of ‘phase 2’ candidate drug screening and dose-finding.

Viral clearance can be summarised in different ways, the most common summaries being the time-to-clearance (the time to reach the lower limit of detection for a given assay) and the rate-of-clearance (the slope of the decline of the log viral load). To optimise the design of a phase 2 trial that uses a summary statistic of viral clearance as the primary endpoint, it is necessary to assess viral clearance dynamics in untreated individuals and to characterise the main sources of variation. We use prospectively gathered data from 46 individuals who had frequent qPCR sampling before and after peak viral load to simulate data and determine optimal trial design parameters [7]. We use a pharmacodynamic model fit to these data to simulate viral clearance data in untreated and treated individuals with varying effect sizes for the hypothetical treatments. The simulations demonstrate that rate of viral clearance is a uniformly better endpoint as compared to time to viral clearance (where better is quantified by the resulting type 2 error), and necessitates shorter follow-up for accurate characterisation. We propose an adaptive randomisation phase 2 trial design for antiviral drugs in which rate of viral clearance is the primary endpoint.

## Results

The simulation results presented here are based on the posterior predictive distribution of viral dynamics from a Bayesian pharmacodynamic model fitted to cycle threshold (Ct) data from 46 individuals prospectively followed over time with frequent qPCR measurements [7]. In general the decline in qPCR estimated viral loads in nasopharyngeal swab samples from peak values was first order and monoexponential (and therefore characterised adequately by a rate constant), although in some individuals a second slower phase of reduction was evident. Thus the clearance rate can be estimated from the slope of a simple linear regression onto the qPCR Ct values (the Ct value is proportional to the log viral load). The posterior predictive distribution takes into account measurement variability (note that this could be reduced by using extra information such as urea correction for extracellular fluid volume in nasopharyngeal or oropharyngeal swabs, or host DNA to account for cell content). We simulate only from the posterior predictive distribution for symptomatic individuals, as enrolment in pharmacometric assessments is usually from passive case detection and symptomatic patients represent the main target group for therapeutics. In addition, these data showed that symptomatic individuals have slower viral clearance than asymptomatic individuals [7]. Figure 1A shows example Ct clearance curves for 10 individuals simulated at random. If the antiviral drug reduces viral replication, then the simplest mechanistic model of an effective antiviral treatment effect is parameterised as a multiplicative effect on the clearance rate. Therefore, the effect of an antiviral drug is parameterised as increasing the slope of the clearance curve (inversely related to the clearance half-life). Figure 1B shows how effect sizes (in terms of % increases) translate into average clearance curves. For an effect size of 15%, Figure 1B and 1C show how this translates into endpoint distributions when there is daily sampling for 21 days after the peak viral load. We make the simplifying assumption that all individuals are recruited into the study at peak viral load (although when fitting to real data this assumption would need verification). The effect sizes used can be put into perspective with respect to recent results from monoclonal antibody treatments. The trial of the neutralising monoclonal antibody LY-CoV555 [4] estimated an approximate 15% increase in viral clearance (rate of viral clearance was estimated in the phase 2 trial using the baseline viral load and a single time point at day 11); the trial of the human IgG1 antibody cocktail REGN-COV2 estimated an approximate 20% increase in viral clearance [5].

### Rate-of-clearance results in uniformly lower type 2 error than time-to-clearance

We simulated individual patient viral clearance data for varying sample sizes and varying durations of follow-up and compared the power (1 - type 2 error) to detect effects using either time-to-clearance (defined as the first time at which the qPCR Ct value reaches 40) or rate-of-clearance (defined as the slope coefficient estimated from a linear regression of time onto the Ct values *<* 40). Figure 2 shows that in all scenarios, for the three durations of follow-up considered here (7, 10 or 14 days), rate-of-clearance results in a lower type 2 error (greater power) than time-to-clearance. While the 7 days follow-up misses the clearance time in nearly all individuals, 7 days is sufficient to obtain a reasonable estimate of the clearance slope. Daily qPCR measurements for 10 instead of 7 days can substantially reduce the type 2 error at moderate sample sizes for intermediate effects, but extending this to 14 days results in very small further gains in terms of power.

As there are no unequivocally effective antiviral drugs at present, the pharmacometric properties of an effective treatment are uncertain. Assuming that there is not viral density dependence in clearance rates then the higher the initial viral load the longer it will take for virus to become undetectable. The dependency between baseline viral load and time to clearance can bias results, particularly in small studies (where by chance one group may have higher starting viral burdens). Rate-of-clearance is independent of baseline viral load and so removes this finite sample bias.

### Duration of follow-up and frequency of sampling

Figure 2 shows that the marginal gain between 14 versus 10 days of follow-up when using rate-of-clearance as the trial endpoint is small. We looked at marginal gains in more detail, simulating for an effect size of 7.5% acceleration in clearance (increase in slope) and a sample size of 50 individuals per arm. Figure 3 shows that the marginal gains tail off after 9-10 days for the rate-of-clearance endpoint, and tail off at 12-13 days for the time-to-clearance endpoint. We note that these values assume that individuals are recruited at peak viral load.

In SARS-CoV-2 viral clearance rate estimations, more detailed sampling reduces the variation from measurement error and so results in more accurate estimates. Measuring viral densities more than once daily adds logistical difficulties in a clinical trial, but could be worth it if the information gained was substantial. Figure 4 shows small marginal gains in statistical power for twice versus once daily qPCR, suggesting that once daily measurement of qPCR is likely to be sufficient (i.e. an acceptable trade-off between statistical error and resource costs). However, large gains are made by having at least one interim follow-up measurement (estimating the rate from 3 samples versus 2, Figure 4. For small effect sizes (e.g. 7.5% increase in viral clearance rate), estimating clearance rates from daily qPCRs over 10 days considerably increases power over using a single day 10 measurement (*>* 80% power compared to less than 50% power for 100 patients per arm).

### Adaptive randomisation to identify active antiviral drugs efficiently

Our model simulations suggest that the rate of viral clearance can be estimated from a short duration of follow-up (e.g. 7 to 10 days). For a trial aiming to compare the efficacy of multiple candidate antiviral drugs, the most efficient design to determine the best candidate for a subsequent phase 3 trial is adaptive randomisation (including a control arm), whereby the randomisation ratios are adjusted as the trial progresses, based on the estimated drug effects. Adaptive randomisation favours the most promising candidates by increasing the number of individuals who are randomised to that drug. For example, the randomisation probability of the control arm could remain fixed throughout the trial, but the randomisation probabilities of the intervention arms *T*_{k}, *k* = 1..*K* could be set as proportional to *P* [clearance rate of *T*_{k} *>* clearance rate of control]. These probabilities can be estimated from a hierarchical model that accounts for inter-individual variability in both peak viral load and rate of clearance with independent additive terms on the clearance rate for each intervention arm. For example, this could specified as:
where *y*_{i,t} is the observed Ct value in individual *i* at time *t*; *T*_{i} is the randomised treatment given to individual *i*; *α*_{i} is the random intercept for individual *i* (Ct value at time 0); is the slope estimated for individual *i* which decomposes as the population mean slope *β*, a treatment specific random effect (we set *β*_{control} = 0) and an individual random effect *β*_{i}; *σ*_{Ct} is the standard deviation that determines the independent and identically distributed model error. If the individuals included in the trial are all symptomatic, then we can assume that all observed Ct values are above the nadir (i.e., viral loads below the peak) and thus we are estimating a single slope (as opposed to [7] where viral dynamics before and after the peak are inferred). In reality this assumption may not hold and either an ad hoc approach could be used (for example using only the Ct values post observed nadir), or a more complicated piecewise linear model could be fit to the data (as done in [7]).

Adjustment for right truncation should be applied for Ct values above or equal to the limit of quantification (usually set at 40). In addition, the adaptive randomisation design can be improved by specifying a minimum effect size *λ >* 0 that is the lower bound of clinically relevant increases in viral clearance. Denoting the probability of being assigned treatment arm *k* as *π*(*k*), we propose an adaptive randomisation scheme whereby , keeping the control arm randomisation ratio fixed (e.g. for *K* intervention arms). is estimated from all data accrued so far. As qRT-PCR is widely available, a rapid “turn-around” in reporting clearance rates allowing timely adaption should be feasible.

Figure 5 shows the results from simulation of two trial scenarios which used adaptive randomisation with four intervention arms (5 arms in total including the control). In scenario 1 (top two panels), one of the treatment arms increased the viral clearance rate by 10%, whereas the other three arms had no effect relative to control. In scenario 2 (bottom two panels), all four intervention arms were ineffective (identical to control). We specified a stopping rule for success, defined as a probability greater than 99% that one of the intervention arms had a clearance rate more than 1% faster than the control arm (i.e. *λ* = 0.01). Stopping for futility occurred if none of the intervention arms had greater than 10% probability of having more than a 1% faster clearance rate compared to control. Randomisation ratios for the four intervention arms were proportional to the posterior probability that the viral clearance was more than 1% faster than the control arm. These randomisation ratios were updated in batches of 30 patients with an initial ‘burn-in’ of 50 patients. The maximum trial size was set at 400.

For scenario 1, the median trial size was 140 patients, with a median of 43 patients randomised to the active arm (30% of patients across trials) and 21 patients to each of the inactive arms. The trial identified the active arm in 93% of cases; did not reach the success or futility thresholds by 400 patients in 3.4% of cases, and stopped for futility or identified the wrong intervention arm in 3.7% of cases (from a total of 1000 independent simulations). This gives a type 2 error of 7% for an effect size of 10% increase in viral clearance rate. For scenario 2, stopping for futility by 400 patients occurred in only 2% of cases, no decision was reached in 87% of cases, and one of the four interventions arms was classified wrongly as meeting the success probability threshold in 11% of cases. This gives a type 1 error of 11%. The type 1 and type 2 errors are partly controlled by the value of *λ* (the clinical significance threshold). Varying *λ* trades type 1 versus type 2 errors. We chose a low value of *λ* = 0.01 in order to have high power but with the drawback of higher type 1 error. This corresponds to the usual goal of a phase 2 trial, which is not to miss effective therapies (as compared to a phase 3 trial, which is not to recommend ineffective therapies).

## Discussion

An effective, well tolerated, safe and affordable treatment of COVID-19 that prevented progression to severe disease would be of enormous global health benefit. There have been a large number of small clinical trials assessing the efficacy of repurposed candidate antivirals, based usually on moderate activity in virus cell cultures, but actionable evidence has come only from the relatively few large randomised controlled trials such the RECOVERY and SOLIDARITY platform trials. Among the small molecule antiviral candidates the greatest interest has surrounded remdesivir, a viral RNA-dependent RNA polymerase inhibitor. Remdesivir was associated with a shorter duration of hospitalisation in a randomised controlled trial [8], but did not reduce mortality in the large SOLIDARITY trial [9], nor was there a trend to faster viral clearance in the initial clinical trial in China [10]. This has led the WHO to recommend against its use in COVID-19 [11]. Indeed, no antiviral small molecule drug has yet shown life saving benefit in COVID-19 infections, nor is there any convincing evidence yet that any antiviral drugs do accelerate viral clearance [12]. In most studies describing viral clearance ‘rates’, it is time to a negative nasopharyngeal swab that is being described without adjustment for the initial viral load. For example, a recent study in Brazil reported faster viral clearance after treatment with nitazoxanide (an antiparasitic drug) compared to placebo [13]. However, the reported comparison did not adjust for the large difference in baseline viral loads between the two groups (median viral densities in nasopharyngeal samples were 0.5 log_{10} copies per mL higher in the placebo group compared to the nitazoxanide group). The difference in clearance rates estimated from the median values implies only an approximate 1.15% increase in viral clearance rate, which is of uncertain clinical significance.

The most promising evidence for accelerated viral clearance from a therapeutic currently comes from antibody therapies, two of which have recently been given US FDA emergency use authorisations [14, 15]. The monoclonal antibody LY-CoV555 (trade name bamlanivimab) reduced the viral load by day 11 in a phase 2 trial of 452 patients (effect size of approximately a 15% increase in viral clearance rate for the best observed dosing subgroup) [4]. The antibody cocktail REGN-COV2 (a combination of casirivimab and imdevimab) increased viral clearance by approximately 20% in an extended phase 2 trial of 275 patients, with the largest effect on clearance seen in patients with the highest viral loads. Our simulation results show that these large effect sizes could have been characterised with much smaller patient sample sizes using viral clearance as the primary endpoint. There are a large number of both repurposed and novel antiviral drugs either in development or under consideration for COVID-19 prophylaxis or treatment, but there is no agreed methodology for testing them *in vivo* or for determining the optimum dosage. Although COVID-19 is a systemic infection, viral clearance from the upper respiratory tract is the only readily accessible measure of a patient’s virological response. Viral clearance reflects both host-defence and any additional contribution from an antiviral therapeutic. Thus, the pattern of clearance will depend on the stage of disease and the host’s response. Peak viral densities in nasopharyngeal or oropharyngeal fluids occur around the time of COVID-19 illness onset [1, 12]. Deterioration in oxygenation (reflecting pneumonitis) necessitating hospital admission occurs in a minority of patients approximately one week later at a time when viral densities are usually much lower. Thus assessment of antiviral activity is best performed in subjects at initial presentation of uncomplicated illness. In some individuals, studied at a very early stage of illness, viral densities will rise but once densities fall the pattern is generally first order (log-linear), as it is for most pathogens. In some patients a second slower phase of elimination is evident. In extreme cases there is long lasting viral persistence [16].

The slope of the initial log-linear decline in viral densities in nasopharyngeal or oropharyngeal secretions (i.e. the rate constant) is the key measurement of antiviral effect. The difference between the slopes with the putative antiviral and without represents the drug effect. The time to viral clearance has been reported widely in clinical trials, but this pharmacodynamic measure has substantial disadvantages. It is strongly dependent both on the peak or baseline viral densities and the limit of qPCR detection and, by definition, requires follow up until PCR negativity. Evaluations of therapeutic interventions which do not report these values and use time to viral clearance as the primary endpoint are difficult to interpret (e.g. [17]). Time to viral clearance results in a substantially higher type 2 error (lower power) than the clearance rate, and therefore requires much larger sample sizes, at least daily sampling, and a longer duration of follow up. Additionally, unless there is rapid reporting, sampling must continue for much longer than necessary. Thus, use of time to clear as a primary endpoint in pharmacometric assessments is imprecise, inefficient, and expensive. Furthermore, if baseline viral loads differ between treatment groups, then time to viral clearance is biased. Importantly any comparison or meta-analysis of studies which used different sampling techniques, or had different qPCR sensitivities, or different cut-offs, is confounded systematically if time to clearance is used as an endpoint, but not if virus clearance rate is measured.

A swab taken from the nasopharynx or oropharynx contains a variable number of cells and a variable volume of extracellular fluid and nasal secretions. Virus is both intracellular and extracellular. In theory, these factors can be adjusted for by comparison with a human DNA marker and adjusting for urea content respectively. This should reduce intersample variance and improve clearance rate assessments. Nevertheless, even without these adjustments, satisfactory clearance rate estimates can be obtained over a week with as few as six samples. Further studies to examine virus clearance rates at higher peak nasopharyngeal viral densities, and in different patient groups to characterise better the determinants of viral clearance will facilitate interpretation and refine inclusion criteria in phase 2 pharmacometric assessments.

Our model based simulations suggest that pharmacometric assessment of candidate antivirals for COVID-19 should measure virus clearance rate [18], and not the much more widely used time to clearance [19], as their primary endpoint. If performed in early uncomplicated illness, reasonable precision can be obtained with daily qPCR samples over 10 days from each studied patient. Adaptive randomisation using the viral clearance rate can rapidly identify active intervention arms.

## Methods

### Model of viral dynamics

The pharmacodynamic model of Ct viral load over time is based upon the prospective longitudinal SARS-CoV-2 RT-qPCR testing performed for players, staff, and vendors during the resumption of the 2019-20 National Basketball Association (NBA) season and has been described previously in detail [7]. The code and data are openly accessible on github at https://github.com/gradlab/CtTrajectories. The original description of the model had three main individual parameters (using the stan notation): *dp, wp, wr. dp[i]* is the peak viral load for individual *i*; *wp[i]* is the time until peak viral load; *wr[i]* is the time to recovery (time from peak viral load to time at which viral density is unmeasurable, i.e. Ct ≥ 40). We re-parameterised the stan model (for sampling from the posterior distribution) so that *wr* is replaced by the slope coefficient (equal to -dp/wr). This removes any dependency in the posterior predictive distribution between the clearance rate and the peak viral load. To simulate viral trajectories under the posterior predictive, we draw at random from the top level of the hierarchical model (population distribution of the peak viral load and rate of clearance) and then simulated Ct values from time 0 (time of peak viral load) until a specified maximum time after the peak.

### Adaptive randomisation

To simulate trials with adaptive randomisation we again used the same model of viral clearance dynamics to simulate trial data, and then used a hierarchical linear model to estimate drug effects. This used *stan lmer* from the R package *rstanarm* whereby individual random effects were specified for both peak viral loads and viral clearance slopes. This was fitted directly to the simulated Ct values (R formula specification: *y* ∼ 1 + *t* * *Trt* + *t* − *Trt* + (1 + *t*|*id*), where *y* is the Ct value, *t* is the timepoint, *Trt* is the treatment arm, and *id* is the patient identifier). After a burn-in of 50 patients, randomisation probabilities for the intervention arms were set to be proportional to the posterior probability that the treatment effect was greater than a 1% increase in viral clearance. The posterior distribution over the model parameters was estimated from 1000 draws from 1 chain (to minimise running time).

For scenario 1 we simulated 1000 independent trials (approximately 2 days computation on 8 cores) and for scenario 2 we simulated 100 independent trials (also around 2 days on 8 cores). Trials under scenario 2 (no active drugs) are more computationally expensive because the large majority do not meet any early stopping rules (median trial size is 410, the maximum trial size) whereas under scenario 1 (one active treatment), the majority of trials stop early (median trial size is 140). The posterior fitting is proportional to the number of individuals and so takes longer at each interim analysis.

## Code

All the code is available at https://github.com/jwatowatson/phase2sims

## Data Availability

All data and code are available on the github repositories linked to this manuscript.