Modeling the dynamics of SARS-CoV-2 immunity waning, antigenic drifting, and population serology patterns

Reinfection with SARS-CoV-2 can result from either waning immunity, a drift in the virus that escapes previously stimulated immunity, or both. The nature of such reinfection risks will affect the choice of control tactics and vaccines. We constructed an SIR transmission model of waning and drifting that can be fitted to cross-neutralization serological data. In this model, waning occurs in individuals who have recovered from previous infections while drifting occurs during transmission to a previously infected individual. Interactions at the population level generate complex dynamics that cause drifting to occur in unanticipated but explainable ways across waning and drifting parameter sets. In particular, raising the fraction of transmissions where drifting occurs slows the rise of drifted strains to high levels and changes the equilibrium distribution of strains from {cup} shaped (extreme strains dominate) to {cap} shaped (central strains dominate). In {cup} shaped parameter regimes, endemic infection levels can rise after many years to above the original epidemic peak. The model simulates results from cross-neutralization assays given sera from previously infected individuals when multiple drifted strains are used in the assays. Fitting the model to such assay data can estimate waning and drifting parameters. Given the parameters, the model predicts infection patterns. We propose a process for using fits of our model to serological and other data called Decision Robustness and Identifiability Analysis (DRIA). This can inform decisions about vaccine options such as whether to prepare for changes in vaccine composition because the virus is changing to escape immunity.


Introduction
Many aspects of SARS-CoV-2 dynamics remain unknown, including the risks of and reasons for reinfection. At least one case of reinfection has apparently been well documented (Kupferschmidt 2020). SARS-CoV-2 reinfections might arise either because host immunity wanes after recovery from infection or because the virus evolves to escape immunity stimulated by prior infections. Reinfection is common and all age groups are repeatedly infected with endemic coronaviruses (Monto, DeJonge et al. 2020, Nickbakhsh, Ho et al. 2020. But no studies of any coronaviruses have quantified the roles of immunity waning or antigenic drifting contributing to reinfections. High variation in coronavirus genomes at attachment sites (Andersen, Rambaut et al. 2020)suggest that virus variation might explain some reinfections. When waning of immunity allows for reinfections, those reinfections could generate forces that generate antigenic drift to new virus variants. Such interactions between waning and drifting could also affect the risks of infection after vaccination.
To explore such interactions, and to create a basis for extracting information about drifting and waning by fitting models to data, we present a model that captures the separate effects of waning and drifting and their interactions. Our model opens new paths for relating serological data to population patterns of infection and estimating waning and drifting parameters from serological data.
The model we examine is simple. Yet it generates complexities we did not anticipate. The understanding we gained in how such complexities emerge will help develop more detailed models with greater capacity to guide both research and control decisions. Accordingly, we lay out a path for building on our model to ensure the validity of scientific or public health decisions. We call this strategy Decision Robustness and Identifiability Analysis (DRIA). Although the model presented here does not include vaccination, adding it is straight forward.
It was polio that first led us to develop this model of combined waning and drifting. Besides polio, influenza and pertussis will especially benefit from model elaborations beyond the skeleton we present here.
We first describe our model and the behaviors it generates. We then describe how it enables population serological analyses to estimate waning and drifting parameters. Finally, we describe how the DRIA strategy will facilitate decisions about vaccine use.

I: The Model I.1: Model Structure
We use a continuous SIR (Susceptible, Infectious, Recovered) compartmental model. The S state represents never-infected susceptibles. The infectious state I is divided into M+1 levels of antigenic drift: I0, I1, …, IM. We write B for the effective contact rate per week between those in S and those in any Ih. An infected individual in state Ih recovers at rate g per week and upon recovery enters recovered state Rh0. The second subscript in Rh0 represents how much the recovered individual's immune system has waned from its maximum effectiveness. As his immune system wanes, the individual newly reinfected with drift level h moves from Rh0 to Rh1 to Rh2 and eventually to RhP (maximal waning) at constant rate w. So, for each drift level h, there are P+1 waning states Rhk, k=0,1,…,P.
New infections can occur when an infected in state Ih meets a recovered individual in state Rjk, with the probability of transmission increasing: 1) as the waning level k increases, and 2) with increases in the difference |h-j| between the drift level h of the infected and last former infection level j of the susceptible in Rjk. For example, there is no transmission when an Ih encounters an Rh0 and the highest probability of transmission when an I0 encounters someone in RMP. To quantify this probability of transmission between an Ih and an Rjk, we combine the risk of infection due to waning $ One logic for our choice of this model structure is that drifting of the virus to escape host immunity is a process that takes place during infection and transmission. In an infected individual, diverse viruses are generated. Upon transmission, those viruses that escape a new host's immunity are more likely to cause infection.
The model is presented in detail in the supporting material (SM). It makes a number of simplifying assumptions. For example, it assumes that all infections have the same unchanging recovery rate g and the same weekly effective contact rate B (modified, of course, by h,j,k). It assumes a constant birth and death rate for the whole population, as well as random mixing. It uses a single waning rate w for the P transitions between the waning states. In SM we list other simplifying assumptions that can be realistically relaxed. But we argue that one should not make a model more realistic just to do so because that results in the model becoming less identifiable. Parameter identifiability can be quickly lost even with one or two realistic model elaborations. That is why we emphasize Decision Identifiability which we describe later.

I.2: Model Behavior
We simulated the model delineated in the Supporting Material using the Berkeley Madonna Software (Madonna 2020). For the simulations in this report we set effective weekly contact rate B=1 and weekly rate of recovery g=0.5, so that the underlying basic reproduction number is R0=2. The birth and death rates were set at 1/(75 times 52). All time scales were set to a week. We introduced one infection per 10 million into a continuous population denoted as having 1000 individuals with no immunity or control effects. We worked with 7 drift levels and 7 stages of immunity, so that M=P=6. We varied only the drift rate dr and the waning rate w to understand better the interactions between drifting and waning. Numerical solution of the model used Runge-Kutta 4; the stability of inferences made were evaluated across smaller time steps.
Figure 1: Patterns of total infections and infections in the first, third, and fifth drifting states across waning parameters from 0 to 0.1 per week, and drift fraction parameters from 0.01 to 0.5 per week.
We have only begun to explore the rich system complexities that emerge from this simple model. We discuss only the interactions between waning of immunity and drifting to escape immunity. These create phenomena that could influence the stability and change dynamics of infectious agents. This simple model needs elaboration to inform theoretical science or public policy decisions. But the dynamics of this simple model are so rich that model elaborations need to proceed with an understanding of what is generating those dynamics. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 I.2a: A sparse overview across a broad spectrum of waning and drifting To analyze how changes in dr and w affect the dynamics of our system, we ran simulations with seven representative values of each --from no waning and no drifting to high levels of both. We fixed w and let dr vary over these seven values and then fixed dr and let w vary. We present four of these graphs in the SM Figure S1. Figure 1 gives a sparse overview of these observations. (We will go into more detail below.) In Figure 1, we vary waning across the columns and drifting across the rows. The dark curve in each subfigure is the graph of the number of infected I(t) from t=0 to t=500 weeks. We also include graphs of I1(t), I3(t), and I5(t) to understand some of the details of the spread. We present some observations gleaned from or implied by Figure 1  1) Note that the first epidemic is virtually unchanged as w and dr vary in the 12 graphs of Figure 1, peaking around 153.4 infections at week 32. This invariance is expected since drifting in the model arises from reinfections, which are rare in the first epidemic (nearly all the infected are in I0), and the epidemic ends before waning begins. In fact, up to the 0.1 rate of waning per week, the size of the original epidemic stays the same.
2) Without waning, there is no drifting; drifting can only occur as a result of the immunologically selective forces of reinfection. When w=0 in our model (column 1), epidemics are dispersed in time with virtually no infection between them. They occur when cumulative births generate a large enough (never-infected) susceptible population to raise the effective reproduction number above 1. It takes 75 years for the first epidemic to appear and another 42 years after that for the second epidemic to appear.
3) More waning leads to more infection.

4)
In particular, as w rises, so does the timing and peaks of epidemics after the first one.
5) For small values of w, infections occur in discrete waves, with virtually no infection in between.
6) By the time w reaches 0.1, total infection oscillates upwards in time, eventually reaching a positive equilibrium.
7) The size of this equilibrium decreases as the drift fraction dr increases. See supplemental material Figure S1   To provide a feel for the waning parameter values in Figure 2, we present in row 2 of Table 1 the waning levels in terms of the time it takes for half of recently infected individuals to lose all of their immunity. This is the time when half the previously infected population is at waning level 6 (the sum of Rj6 across j). In Row 3, we present the waning levels in terms of the time for the sum of all susceptibility across the whole population to reach a level that has half the immunity it had right after infection. Figure 2 presents the prevalence of infections at different drifting levels. As in Figure 1, the first epidemic is unchanged by the waning or drifting parameter values. However, at each positive waning level, the timing, size, and drifting level composition of the second and third epidemics changes as dr increases from 0.01 to 0.5.
At waning level 0.01, the period between these first two epidemics shortens gradually as the drifting parameter increases to dr=0.03, after which it lengthens again. The second and third epidemics are increasingly populated by more drifted viruses as dr is increased up to a value of 0.1. But by drifting fraction 0.5, there is a marked change as the second and third epidemic infections are mainly at drift levels zero and one.
At waning level 0.05, the second epidemic appears quickly and subsequently there are epidemic waves rather than distinct epidemics. This is even more the case for waning level 0.1. For example, consider the case w=0.05, dr=0.01 in Row 1, Column 2. We blow up this subfigure in section 2.1 (Page4) of the SM Figure S1. As seen in this supplementary figure, after the initial . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 epidemic, the total infection fluctuates upwards through a series of six waves. As usual, the I0s dominate the initial wave. But then, as seen in Figure 2, I1 dominates the next wave, I2 the next, I3 the next, ..., I6 the last, after which total infection eventually equilibrates at ≈ 56.5/1000. Throughout this process, because of the small drift fraction 0.01, I0 never goes away. In fact, it is the second most active drift level in each wave after the first. Its encounters with neverinfected susceptibles further sustains the numbers of these persistent I0s. The resulting R0ks have the highest probability of reinfection when they encounter each new Ij because 0 is the furthest index from j. Soon after the I6s arise, the I0s and I6s become the dominant strains. All this required a low drifting fraction. As the drift fraction dr increases, the role of the I0s decreases. When dr =0.1, each Ih still plays a major role in wave h, but each becomes more persistent over time. When the drifting fraction reaches 0.5, the impact of the early drift levels I0 and I1 diminishes and the intermediate levels, especially I3 eventually take over. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 There is less drifting to the extremes as the drifting fraction parameter dr is increased because: 1) There is less amplification of the extreme drifting levels since the fraction with potential to stay at the same level is 1 − , which decreases as dr increases; 2) There is more drifting away from the high and low extreme levels as the dr increases, 3) As the infection level goes down for the first two reasons, there are more uninfected individuals that have an absolutely flat susceptibility to infections at all drift levels. That further reduces the forces driving drifting to the highest and lowest drifting levels.
By the end of week 500, drifting levels are converging to equilibrium levels in Figure 2. Figure 3 presents the numbers of infective of each drift level when the system finally equilibrates. Extreme levels I0 and I6 dominate for small dr, intermediate levels I2, I3, and I4 dominate for dr=0.5. Our model assumption that transmission probabilities are highest when the strains are furthest apart leads to higher infection levels for smaller dr's.
Waning level patterns are a determinant of reinfection potential. Figure S2 in SM tracks the waning levels for the parameters in Figure 2. Waning cascades from level 0 to 6 where it accumulates. The population level force of infection is a major determinant of what waning levels dominate across time. At the very low waning rate of 0.01 per week, waning cascades slowly and never accumulates much in level 6 except at the very high drifting fraction. The dual role of the drifting fraction dr in initiating waning more quickly but causing less growth after initiation explains this pattern. At the higher waning rates there is more accumulation in w6 (Max Wane) during the first four years. But then as the drifting levels of viruses drift apart and cause an increasing force of infection, w6 is drained by new infections and these cause increasing levels of w0 that cascade down. Equilibrium levels of w6 decrease thus decrease more than lower waning levels. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 We explored different model formulations where drift levels went in circles rather than ending at the extremes. These formulations produced similar results. See SM 2.3.

II. Model generation of serological data for estimating parameters and choosing vaccines
We have demonstrated interactions between waning and drifting that affect the population patterns of infection. The patterns produced by these interactions provide a way to estimate waning and drifting parameters by fitting the model to observed data. But we don't want to wait many years for patterns to emerge like those seen in figures 1 & 2 to estimate those parameters. We want a methodology that can use the serosurveys currently being conducted to help make decisions quickly regarding how the challenges of waning and drifting should be addressed through vaccination programs. Those serosurveys include 1) prospective follow up of infected individuals to estimate waning as a function of time since infection, and 2) serial cross sectional samples of sera to characterize population patterns of infection. The methods we present in this section will efficiently use the second sets of data to describe the combined effects of waning and drifting. Then using the DRIA approach we will present in the next section, vaccine decisions can be informed in a more timely and valid manner by using both types of data.

II.1
How model generated data is turned into cross-neutralization analysis data Our model generates tables of cross-neutralization data. The real world, through the work of adept serologists, also generates tables of cross-neutralization data -with the same information. That means that we can adjust model parameters or otherwise modify our model until there is correspondence between our model and the real world. To do that well, we first need to understand how our model generates the data and how changes in model parameters alter the data. Providing the needed understanding is the goal of this section.
In the SM we provide a tutorial on interpreting population cross-neutralization data (SM3.1 page 9). We give a brief overview here. Suppose there are V virus types under consideration; they may differ, for example, by strain, year, or location. Choose a sample of uninfected individuals and take a blood sample of each to measure neutralizing antibody in the serum. Dilute each serum by a fraction T times for each individual in the sample. Assay each dilution to see if the serum neutralizes the growth of each of the V virus types. Record for each virus type the most diluted serum that neutralized it. This measure of immune response to each of the V virus types yields an entry in a × ⋯ × (V times) table. Then count the number of individuals whose entry fell into that slot to generate a cross-neutralization table. We use neutralization assays as the serological method only because neutralization is widely thought of as being the most direct assessment of immunity. But more informative assays can be developed and related to immunity using the DRIA process to be discussed later.
In the case of V=2 virus types, if all the entries are on the diagonal, then the viruses are equivalent with regard to drifting status. On the other hand, if there has been drifting so that . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint each viruse can escape some of the immunity stimulated by the other, table entries will be on the sub-or super-diagonals.
Our model has 7 virus drift levels Ih and 49 categories Rjk of previously infected individuals (7 drifting levels j by 7 waning levels k). To measure from our model output the neutralization level of virus type Ih by the immune system of an individual in an Rjk, we use the transmissibility measure Z(h,j,k), as formulated by function (1.1) in our model (without the B factor). In effect, the Z function corresponds to the inverse of the neutralizing level. To make that number correspond to a titer, we divide the interval [0,1] into 10 equal subintervals that could correspond to 10 sequential dilutions for the neutralization assay. Both waning and drifting determine the neutralizing antibody levels according to equation (1.1).
In this exposition, we work with two or three virus types at a time. The cross-neutralization table for 2 viruses is a 10 by 10 matrix. To get a particular value for each of the 100 entries in this cross-neutralization matrix, we first determine a susceptibility level for each of the 49 Rjk subgroups. Then we sum up the total number of individuals in the population that have horizontal axis level of susceptibility in the table corresponding to h1 and the vertical axis level of susceptibility corresponding to h2. Each entry thus has identical people with regard to neutralizing antibodies but it can be composed of individuals with different waning and drifting levels.
To describe this process more analytically, let Π(z) denote the subinterval to which z in [0,1] belongs; analytically. Π(z) is 1 + the integer part of the decimal expansion of 10·z. Then, the (m,n)th entry of the cross-reaction matrix is the total number of individuals in all the compartments Rjk for which Π(Z(h1,j,k)) =m and Π(Z(h2,j,k)) =n.

II.2 How model parameters change cross-neutralization patterns
We numerically solved the model after introduction of 1 case per 10 million population and took cross sections of the uninfected population at 60, 110, and 162 weeks after that introduction. In the SM section 3.2 we examine the cross-neutralization patterns produced by four drifting fraction parameters, two waning rate parameters, and cross-neutralization using viruses at drifting levels 0 and 1 for all three time points and levels 0 and 2 for times 110 and 162.
During the first three years, Figures 2-3 and S2 show small or no differences between patterns with different drifting fraction parameters but the same waning level. The most valuable information is not captured in those figures because they do not break down the previously infected population by both drifting level and waning level. That breakdown is where the information on drifting lies. Cross-neutralization titers can perceive that breakdown, as can be appreciated across the range of parameters examined in Figure S3. For illustration purposes we present 60, 110, and 162 week cross-neutralization patterns for drift levels 0 and 1 at 60 and 110 weeks and 0 and 2 at 162 weeks given a waning rate of w=0.1 and a drifting fraction of dr=0.1.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint Figure 4: Expected cross-neutralization assay results given model parameter values for waning rate = 0.1 and drifting fraction = 0.1 with a population cross sections taken at 60, 110, and 162 weeks and viruses at drift levels 0 and 1 used in the earlier assays and levels 0 and 2 are used at 162 weeks. Note that blank cells had no population in them. Those with zeros had less than 0.05 population. All of those with the lowest titers on both were never infected individuals highlighted in yellow. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 At most parameter settings there was little evidence of drifting at 60 weeks. At 60 weeks 25 out of 628 individuals with a titer greater than level 5 had a higher titer to the strain at drifting level 1 than to the strain with a drifting level of 0. But by week 110, 351 out of 667 did. In the third panel of Figure 4 we see how at 162 weeks, drifting to level 2 and very slightly beyone is detected. It also shows how intermediate levels of drifting might appear when there has been further drifting from an initially drifted virus. Note that this third panel uses drifting levels 0 and 2. If the world corresponded to this oversimplified model, if the world was deterministic like our model rather than stochastic, and if we could have measured the world perfectly just like we measured our model output, a decision at 60 weeks to adjust the vaccine might have been made.
Using the DRIA approach to be discussed in the next section, we don't need our model to perfectly correspond to the real world or our measurements to be perfect in order to make a good policy decision. But we do need real world data and we almost certainly would need to realistically relax many of the simplifying assumptions in our model. Additionally, we might need to insure the identifiability of our decision with additional data like that discussed in SM section 4.4. But serology data should be a great asset to making a valid decision because our model can generate it and thus the parameters of the model can be estimated by fitting the model to the data.
Note that by making inferences about what fraction of the population has been infected by drifted or undrifted strains, cross-neutralization assays not only provide information about drifting and waning parameter values. They also make inferences about what fraction of the population has been infected with differently drifted strains. This information, together with waning and drifting parameter values, and a DRIA process that ensures robustness and identifiability is what enables our model to predict the future.
The SM in section 3.3 considers additional things that can be gleaned from the crossneutralization data and explores issues of why the data turns out as they do. The biggest determinant of the shape of the cross-neutralization data is the past history of infection. Higher waning leads to more reinfection, which leads to more drifting, which in turn leads to more reinfection. There can be no perfect separation the waning influences from the drifting influences. But drifting effects show up in the off diagonal cells and are dependent upon having viruses in the assay where genetic changes have immunological effects. One can improve their feel for model behavior by seeking to understand why the shape of the cross-neutralization patterns changes as parameter values change in terms of the past history of the population.
To get the full information out of population level cross-neutralization assays, one must fit a model to actual cross-neutralization data from appropriately designed sero-surveys. One gets that information out by treating the model as a partially observed Markov process to fit the model to data (Funk and King 2019).
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint

III. Model Development and Decision Validity Assessment Processes (DRIA)
The model we have presented is a first step in a process needed to make valid scientific or policy decisions. The DRIA process we propose, as shown in the figure below has a number of distinct steps that will make subsequent investigations more productive. The enumerated steps are as follows: 1) formulate the decision to be made in terms acceptable to policy makers, 2) construct a simple model like the one in this paper, 3) fit the model to data, 4) assess whether a decision is identifiable given the data used. This requires two tasks to be completed. First, decision boundaries must be mapped out in parameter space. Second, the model must be fit to data and the parameter space that is consistent with the data must be mapped out. This key step is addressed by using one of the modern approaches to fitting the model to data (Funk and King 2019). A decision is identifiable when the parameter space consistent with the data falls entirely into one decision parameter space or another. 5) If the decision is identifiable, then one proceeds with inference robustness loops that put the decision on firmer grounds because it is less likely that unrealistic model assumptions could be determining the decision. 6) If the decision is not identifiable one proceeds to a decision identifiability loop and seeks more informative data or model changes which better use of available data. These make the decision more identifiable. These steps are taken in an iterative process involving two loops as shown in Figure 6 above. Those loops address the two major sources of errors when using models to make decisions about complex dynamic systems: 1) Some aspect of model structure that does not correspond to reality leads to a wrong decision.
2) The fit of the model to data which leads to a decision is not the only fit possible and other fits could lead to a different decision. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint

Decision Robustness
The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint Decision robustness loops can add model complexity until identifiability is lost. Then even if new data is found, more inference robustness loops can put one back into identifiability loops. If one is pursuing a decision about a scientific theory formulation, decision robustness and identifiability loops could go on infinitely. Policy decisions, in contrast, must be made before things get out of control. Therefore, they require judgements that could be improved by collaborations between scientists and responsible administrators.
A decision about scientific theory that DRIA should address is whether immunity driven drifting in pandemic SARS-CoV-2 strains are sufficient for it to sustain transmission through reinfections. This scientific decision should help later focus on public health control decisions related to drifting. For example, a decision is needed as to whether to invest the many billions of dollars in vaccine development that can handle antigenic drifting as influenza vaccines do.
IV. Discussion of how this model can improve pandemic coronavirus control This model opens three new ways to improve pandemic control. 1) It provides a framework to think about the causal systems where waning and drifting contribute to repeat infections. 2) It provides a path to use serological data to inform transmission system analyses more fully.
3) It provides a way of observing whether the way we think about interactions between waning and drifting is occurring in the real world by fitting the model to serological data.
In our analysis of our simple model, we observed concerning levels of drifting after one year only at high levels of both drifting fraction and waning rate parameters. At lower parameter values we still observed notable drifting over longer intervals. Our analysis, however, requires fitting the model to data and following out DRIA procedures before any action inferences are justified other than gathering the data and carrying out the analysis.
Hopefully we will have a vaccine soon and our major concern by the time significant drifting has occurred will be assessing whether the virus can escape not only the immunity it stimulated, but also the immunity provided by vaccines. The immediate use of this model should be to guide the collection and analysis of serological and epidemiological data that, together with further model elaborations, should inform decisions about future waning and drifting risks and their effects on vaccine choices and vaccine administration strategies.
Kupferschmidt, K. (2020). "Some people can get the pandemic virus twice, a study suggests. That is no reason to Panic." Science 369(6507).
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020.

Waning
An infected individual in state Ih recovers at rate g per week and upon recovery enters recovered state Rh0. The second subscript in Rh0 represents how much the recovered individual's immune system has waned from its maximum effectiveness. For each drift level h, there are P+1 waning states Rh0, Rh1,…,RhP. At each time step a fraction w of those in any Rhk (k<P) move to Rh,k+1. We use M+1=P+1=7 in our simulations.
We write B for the effective contact rate per week between those in S and those in any Ih. For the susceptibles in the recovered states Rhk, new infections can occur when an infected in state Ih meets a recovered individual in state Rjk, with the probability of transmission increasing: 1) as the waning level k increases, and 2) with increases in the difference |h-j| between the drift level h of the infected and last former infection level j of the susceptible in Rjk. For example, there is no transmission when an Ih encounters an Rh0 and the highest probability of transmission when an I0 encounters someone in RMP.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint

Transmission
To quantify this probability of transmission between an Ih and an Rjk, we combine the risk of infection due to waning with the risk due to drifting so that the total probability of transmission is This formulates almost additive joint effects of waning and drifting on infection risk and implies that drifting and waning operate on the same scale. Since there are only P waning steps and M drifting steps and we divide by P+1 and M+1, waning alone, or drifting alone, or the two in combination cannot reach a total risk of 1. That makes the joint effects of waning and drifting slightly greater than additive.

Drifting
When an infected in Ih encounters and infects a susceptible in Rjk, there is a probability dr of antigenic drift to the closest different drift level. In particular, with probability dr, an individual in Rjk newly infected by an infective in Ih will transit to state Ih', where h' = h+1 if j>h and h' = h-1 if j<h (with appropriate modifications at the extremes h=0 or h=M). If j=h, so that an Ih infects an Rhk, then the Rhk individual moves to Ih with probability 1-dr, to Ih-1 with probability dr/2, or to Ih+1 with probability dr/2. We need a modification if j=h=0 or M. If an I0 meets an R0k, the R0k individual moves to I0 with probability 1-dr or to I1 with probability dr (not dr/2). If an IM meets an RMk, the RMk individual moves to IM with probability 1-dr or to IM-1 with probability dr (not dr/2).

Population
We use a system of ordinary differential equations to model a continuous population of size N. We use N=1,000 in our simulations. But the population is continuous so we introduce infection in 1 out of 10,000,000 people. We assume a constant birth rate m and death rate m, and no extra disease-related deaths, so that population is constant.

The Equations
(0.1) CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint (0.2) Equation (0.1) and the first line of (0.2) describe the infection of the never-previously infected susceptibles (and the background birth and death rates). In the second line of (0.2), new IJs arise through drifting when IjJ1s transmit to Rhks with h>J-1. In the third line, new IJs arise through drifting when IJ+1s transmit to Rhks with h<J+1. The fourth and fifth lines describe the transmission of IJs to Rhks when there is no drifting (J≠h in line 4, hence the εJh, and J=h in line 5). In the sixth line new IJs arise through drifting when IJ-1s transmit to RJ-1,ks. In the seventh line new IJs arise through drifting when IJ+1s transmit to RJ+1,ks. The p0 and pM are needed because all drifting is one way at the top and bottom of the drift scale.
For R JK , drift level J and waning level K, . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint The first term in (0.3) describes the recovery of IJs to RJ0; the second term the waning from RJ,K-1 into RJ,K; the third term movement out of RJK because of transmission from an infective in some Ih; the last term the waning out of RJK into Rj,K+1 for K<P.
We explored using a circular but still directionally symmetric drifting formulation that maintained an ordered drifting scale that kept the same order of drifting states as the distance from the first drifting level of zero increased. We found that behavior of that model identical to that of the model presented here. At the extremes, drifting could be up or down one side of the circle. The two sides of the circle remain symmetric so that drifting down from the top or up from the bottom is doubled.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint 2 Model Behavior

Patterns of total infections by w and dr
To analyze how changes in dr and w affect the dynamics of our system, we ran simulations with seven representative values of each --from no waning and no drifting to high levels of both. We fixed w and let dr vary over these seven values and then fixed dr and let w vary. We present four of these graphs here: for w=0.01 and w=0.1, varying dr in each case, and for dr=0.01 and dr =0.1, varying w in each case. The second graph (w=0.1) clearly shows a case where increasing the drift fraction dr decreases the long run infection level. The last two graphs, fixing dr, show that increased waning leads to increased infection.
Note in the last row that when waning rate w reaches the unrealistically high value of 0.3, the first epidemic begins to change -accelerating instead of dying out just after it reaches its peak.
We also carried this out for B=4, quadrupling the basic reproduction number R0. The corresponding graphs had similar shapes, except the infection level kept rising more intensely. Finally, we carried out this analysis for seasonal contact structure, letting B oscillate about B=1. Of course, the graphs were similar but with more seasonal oscillation. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint 2.2 Expansion of one entry from Figure 2 in the main text. This entry for the waning rate parameter equal to 0.1 and the drifting fraction parameter equal to 0.05 is the same as in Figure 2 in the main text except that the total number of infections has been added. If we number the waves 0,1,2,…,6, then drifting level Ih is the dominant virus in wave h. Levels I0 and I6 are the dominant levels in the long run, the backdrop for the highest transmission. Figure S2: A more extensive presentation of data found in figure 2 for waning rate 0.1 and drifting fraction 0.05.

2.3
Effects of the Drifting Fraction Formulation: There are three parameters in our model that we have yet to fully explore. These are B, the effective contact rate; M, the number of drift levels; and P, the number of waning levels. M is likely to be an important parameter with regard to our serological cross reaction modeling. We see it affecting both the balance between the dual effects of changing the drifting fraction parameter and the detail that can be captured in serological cross-reaction assays. We have examined many runs for B=4 and found the patterns in the graphs relatively unchanged but at much higher levels of infection.
One unexpected behavior of our model is the effect of increasing the drifting fraction parameter. That both increases the rate of appearance of new strains and decreases the growth of new strains. Increasing the number of drift levels should modulate this dual effect as follows.
If the same total change in drift levels is expanded into a larger number of drift levels, the drift fractions must be increased for the emergence of strains with the same effects on susceptibility . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10. 1101 to appear in the same time range. That increase will decrease the growth of strains at new drift levels. But there will be more drift levels growing. Given these complexities, a thorough exploration of the effects of changing M is advisable. Increasing the number of drift levels will change the relationship between the rate of appearance of drifted strains and the growth in size of the new drifted strains. That in turn will change the shape of the cross-neutralization titers over time.
It was difficult to explore the effect of the number of drifting levels under the constraints of the Berkeley Madonna software we used. These considerations should provide some priority for such explorations. This third parameter of waning and drifting, namely M, should make changes in the combined effects of w and dr that allow for more precise fitting of model parameters to data.

2.4
Effects of formulating drifting in a circle The phenomenon of increasing dr values causing a switch from equilibrium patterns that tend to the highest and lowest levels to a pattern that is centered on the intermediate level caused us to wonder whether that would change if we formulated drifting in a circle. We explored two different circle drifting formulations. In one there are two different states at the bottom and the top of the circle. This formulation corresponds to the extreme level causing only half of its drifting to a higher drifting level and half of its drifting to a drifting value identical to its own but on the other half of the circle. In a second formulation, we had only one state at the extreme levels. In this case half of the drifting goes to the next level up on one half of the circle and half goes up the other. We observed that this later formulation is conceptually and practically the same as the model formulation in the paper. The former formulation is conceptually the same as our current formulation but with only half as much drifting up or down from the extremes.
There were minor differences in the formulations in terms of quantitative values of drifting levels. But there were no differences in the qualitative behavior caused by increasing dr values as seen in figure 3 of the main text.

Waning level patterns
The patterns in the following figure were described in the main text. Both past and current infection patterns determine the level of waning in those who are uninfected but were previously infected. Only limited insights can be gained, however, by looking at infection patterns as seen in Figure 2 in the main text and waning levels as seen in the figure below. That is because it is the joint distribution of both waning levels and immunity to drifting levels that determines population level susceptibility. That population level susceptibility, moreover, is a function of what drift level viruses are circulating. Thus, we need a population level measure that captures the joint distribution of waning and immunity to different drift level viruses. The way we can get that in the real world is through population level cross-neutralization assays as discussed later in section 3.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint

Serological Prediction of Drifting
We propose to use population level cross-neutralization assays to predict which virus strains will be taking off in the population. Our model assumes all possible strains fall in a straight line equidistant from each other. We know that is unrealistic. But it provides a framework for beginning to understand how and where and why we will need to realistically relax our simplifying assumptions. To make valid decisions on which drifted strains should provide information for new vaccine modifications, we will need to understand and predict the effects on both virus circulation dynamics and on population level cross neutralization assays of the many potential changes we could make in our model. Since we propose population level crossneutralization assays as a key source of data for model fitting, in this section we want to improve thinking about such assays and the effects they reflect on changing immunity and transmission dynamics in the population. We want to set up an environment where epidemiologist modelers, serologists, and immunologists can gain and share insights on the task of understanding and predicting the drifting and waning of the virus and population levels of immunity to drifted viruses.
The approach we take to prediction is a causal theory approach. We run the model from the present to predict the future. An essential task in making valid decisions about the future based on model behavior is to decide which of the myriad realistic complexities of the real world make enough of an impact on the decisions we want our model to address so that pursuing realistic relaxation of simplifying assumptions is justified.
To help the reader see the potential for population level serological analysis of crossneutralizations, three issues are addressed. First, for modelers who may not be familiar with interpreting such assays and for serologists who work mostly on individual level phenomenon, we provide a simplified view of the population level serology effects of waning and drifting. Second, we present a series of 7 different simulated population level cross-neutralization assays for each of 8 different parameter settings. These can be studied at different levels and we try to guide the reader to the level they want to pursue. Third, since we chose a very simple structure for the drifting structure which we know is unrealistic but which provides a good first step in understanding drifting, we discuss a simple way of expanding the dimensions of drifting.

Thinking about cross neutralization assays
Some modelers may not be familiar with cross neutralization assays. Even some serologists may not be used to thinking about population level patterns for cross-neutralization assays. Therefore, we present some overly simplified effects seen in population level patterns to help the reader see signals in population patterns of cross-neutralization assays.
The tables below provide a tutorial on perceiving population level immunity drifting from a cross-neutralization analysis on a cross section of a population. In Tables S1a-d, the current circulating virus is Virus A; viruses B and C are not circulating in the population from whom a cross sectional sample of sera was drawn. Twenty-six individuals provide sera samples of sera for neutralization assays. Each sample is divided into 6 titers by dilution. Titer 1 is undiluted; titer 2 is diluted 1-1; titer 3 is titer 2 diluted 1-1, and so on to the most diluted titer 6. Each of . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint these titers is mixed with a sample of virus A. For each individual j, record the highest numbered (weakest) titer that neutralizes the virus; call it TA(j). These same serum samples are also mixed with virus B and the highest numbered neutralizing titer is recorded for individual j, call it TB(j). The joint distribution of the two is recorded in the light blue area. More concretely, Tot 4 6 7 5 3 1 A/B 1 2 3 4 5 6 A/C 1 2 3 4 5 6 Figure S3: Cross-reaction analysis results for an idealized situation where waning occurs all at once just before time 2 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint the element in row r and column c in the Table is the number of individuals j for whom TA(j) =r and TB(j) = c. The fact that all the entries for this analysis lie on the diagonal in Table S1a indicates that virus B is serologically equivalent to virus A with regard to neutralization. Also note that those that required a stronger (lower numbered) titer for neutralization could have experienced some earlier waning or maybe might just have weaker immune systems for the viruses.
The exact same procedure with the same individuals is repeated one time period later and reported in Table S1c. We see there that it takes a titer one level stronger to neutralize both virus A and virus B. One concludes that the subjects' immune systems for viruses And B have waned during that time period. The scaling of waning would have to correspond perfectly with the scaling of titers for this to have worked out so neatly. Reality would never be so orderly as this.
In Tables S1b and S1d, twenty-six individuals underwent the same procedure as in Tables S1a and S1c, but with non-circulating virus C replacing virus B. In Table S1b, for each individual, it took a one-unit stronger titer (lower dilution) of serum to neutralize virus C than it did to neutralize virus A. With nearly all the non-zero entries lying on the super-diagonal, one concludes that virus A, the virus actually circulating in the population, has experienced a titer dilution of antigenic drift from virus C as it escaped immunity stimulated by that virus. If C had drifted to escape immunity stimulated by A, the line above the diagonal would be below the diagonal. Again, the scaling of drifting would have to exactly correspond to the scaling of titers for this to work out so neatly.
Once again, the next lower Table S1d summarizes samples taken from these same individuals one time period later. Compared to Table S1b, each individual required a stronger titer of serum (lower dilution) to neutralize virus A or virus C. One concludes that there has been a waning of immunity during this time period and that the distribution of different strains has not changed. Note that this last aspect is unrealistic and how the circulation has changed is one of the major pieces of information that will inform us about the dynamics of drifting. For example, the scaling of waning, drifting, and titers have to be identical to get such neat results.
In Table S1e, virus A is circulating at the same level as virus B in the population sampled. In Table S1f virus B is replaced with virus C, from which A has drifted. In Table S1e, the neutralizing titers against each virus were identical for each individual. One concludes that A and B are serologically equivalent. Drifting is not affecting the circulation of viruses in this population.
In Table S1f, 10 individuals required a one-level higher titer to neutralize virus A than to neutralize virus C, and 10 individuals had the opposite outcome. The observation that all the non-zero entries were on the super-diagonal or the sub-diagonal of Table 1f suggests that there has been some drifting but not waning between the two viruses. This indicates that there are an equal number of individuals in the population infected by viruses A and C. There are six people with low titers of antibody that did not show any differences between titers A and C. It . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint is possible that these individuals had waned immunity and did not have enough antibody to generate a one titer difference between the two viruses.

Simulated population cross-neutralization patterns
We turn now to the construction of cross-reaction tables from our model. We present a series of 8 tables (Tables 2-9) that correspond to the parameter settings in the 8 subgraphs of Figure 2 in the main text with waning rates of 0.05 and 0.1. Each table has five sub-tables labelled A-E. They present the expected cross-neutralization assay results given model parameter values for waning rate = 0.05 and 0.1 and drifting fraction = 0.01, 0.05, 0.1 or 0.5. The sub-tables are A: a cross section of uninfected population taken at 60 weeks after seeding one case per 10 million individuals. These use virus levels 0 and 1 in the cross-neutralization, B: a cross section at 110 weeks using drift level 0 and 1 viruses, C: a cross section at 110 weeks using drift level 0 and 2d viruses, D: a cross section at 162 weeks using viruses at drift levels 0, and E: a cross section at 162 weeks using levels 0 and 2.
Note that in these tables all individuals at the lowest titer levels against both viruses are individuals who have never been infected. They are colored yellow. Entries with a zero had fewer than 0.05 cases.
We do not want the reader to be confused by an artifact in our model that has nothing to do with the concepts to be gained from these tables. There is a gap at titer level 7 for the drift level 0 (DL0) virus being neutralized. Along the drift level 1 (DL1) virus titers, this gap is at titer level 6. Along the drift level 2 (DL2) virus titers, this gap is at titer level 5. These gaps are due to there being 7 drift levels in the model and 10 titer levels in our simulation of the cross-neutralization table. The gap titer levels had simulated immunity levels in the serologically surveyed population that fell just above and below the gap levels.
The patterns in Table S2A reveal that the population was infected almost exclusively with a DL 0 virus. The very small fraction of the population on the sub-diagonal provides a tiny indication that a drift level one virus has been circulating in the population. The fact that almost all the population is on the supra-diagonal is a strong indication, however, that the DL1 virus has drifted from the DL0 virus. Also note that at 60 weeks the population is divided between recently infected individuals never infected individuals with only a few previously infected but now highly waned individuals. That indicates that the sample was drawn as the epidemic was reaching its end.
At week 110 ( Figure 2B), the cross-neutralization assay has a very different shape. As seen in Figure 2 of the main text, at 110 weeks the population had just entered a second epidemic wave. Consequently, the population of individuals infected in the first epidemic wave is clustered at the bottom end of the titer scales rather than the top end. Table S2B also shows a little more evidence of DL1 infections having occurred in the population. Further comments on Table S2 are made at the end of the table. Only those who want to assess the potential for parameter estimation need to examine all eight tables. Informative table sets for others are Tables  1, 4, and 6. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted September 11, 2020. In Table S2C we see that the shape of the cross-neutralization table changes when we use a more drifted virus in the neutralization assay. Here there are three diagonal rows. The supradiagonal one corresponds to DL0 infected individuals, the diagonal corresponds to DL1 infected individuals, and a few DL2 infected individuals are seen in the sub-diagonal row. If a higher number of waning states had been modelled, we speculate that patterns would be more spread out along the diagonals.
Note from Tables S2D and S2E that by 162 weeks the majority of infected individuals have been infected with a DL1 virus but the frequency of individuals with immunity to DL2 viruses is minimal. This last observation is clearest in Table S2E.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted September 11, 2020. As we increase dr to 0.05, Table S3 shows little drifting at 60 and 110 weeks. But at 162 weeks, we actually see a smaller ratio of previously infected individuals with DL1 infections compared to drift DL0 infections. We write this ratio as DL1/DL0. This ratio is a little greater at the top end of the scale than the middle. There is almost no difference, however, in the high waning level individuals. This is consistent with the conclusions we made about the effects of increasing the drifting fraction dr. An increase in dr increases the introduction of waning levels but it decreases the growth of waning levels after introduction.
In contrast, DL2/DL0 in Tables 2E and 3E show more DL2 individuals with d= 0.05 than with d = 0.01.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted September 11, 2020. .

4D
Drift Level 0 vs. As we increase the dr to 0.1 in Table S4, we see an even further decrease the ratio of DL1/DL0 at 162 weeks. But we see a very small increase in the DL1 at 110 weeks. Likewise, there is a minimal increase in DL2 at 162 weeks. There is waning down the diagonal of DL1 infected individuals between 60 and 110 weeks. By week 162 new infections have moved individuals up the waning scale. Overall, the changes from dr=0.01 to dr=0.1 have been small.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted September 11, 2020. .

5D
Drift Level 0 vs. As the dr is raised to 0.5 for W = 0.5, the number of DL2 individuals at 165 weeks in Table S5E has decreased from S4E. But the number of DL1 individuals at 110 weeks has increased. The slower growth of DL1 individuals in the population has affected the growth of DL2 individuals.
Overall, the changes the changes between Tables S2 and S3 induced by raising the dr are moderate.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. .
Comparing Tables S2B and S6B, note an increase in DL1 at 110 weeks. Increased waning has accelerated drifting. This gets lost when using the drift level 2 virus S6C. At 162 weeks, there are actually fewer DL1 given w=0.1 than there are given w=0.05. But there are quite a few more DL2. The smaller number of DL1 appears to be due to waning to a level where the DL of the last infection is not distinguished by cross-neutralization.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. .

7D
Drift Level 0 vs. As we increase dr to 0.05 at W = 0.1, we get a bit of a greater increase in DL1 at 60 weeks than we did for the same change given W = 0.05 (comparing S3A to S7A). But the increase in DL1 at 110 weeks is truly remarkable: from 41 to 370 (comparing S3B to S7B). The waning got the drifting over a threshold. There was a more modest increase in DL2. By week 162 the difference is less because the DL1 population grew given the dr = 0.01. The increased w also led to a notable increase in DL2 at 162 weeks.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. .

8D
Drift Level 0 vs. This continues the picture seen in the comparison of w=0.05 and 0.1 at the last drift level of 0.05. Comparing S4A to S8A, there is some increase in DL1 at 60 weeks and more drifting down to levels where there is not enough immunity to distinguish DL. Comparing S4B with S8B there is a marked increase DL1 at 110 weeks as well as an increase in DL2 at time 110 (Comparing S4C with S8C). At 162 weeks there is more DL2 and more immunity levels that have waned beyond the point where DL can be distinguished.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. .

9D
Drift Level 0 vs. The transition from dr= 0.1 to dr = 0.5 at a waning rate of 0.1 is similar to the transition at w = 0.05. At 110 weeks, there is evidence of a previous large DL1 epidemic that did not happen at w=0.05. Both DL1 and DL2 populations seem to go down as waning makes older infections not distinguishable by serology. At 162 weeks w similar relationships, but with a rise in DL2 that is greater than for w=0.05.
Viewing the pattern of waning and drifting at a small number of single points in time over short time periods, as in the graphs just presented, provides a different perspective than viewing the patterns produced by numerically solving ODE equations over longer time periods, as in Figures  1 and 2 in the main text or Figures S1 and S2. Examining the simulated cross-neutralization tables is more like the real-world experience where we are trying to figure out what has happened by looking at serology results. How far individuals have gone down the waning scale depends upon how far up the waning scale they went upon infection, how much time has . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 passed since their infection, and how fast they have waned. So just reading one graph at one point in time is not so informative with regard to waning. There are two ways to get more information. One is to conduct separate surveys following the serology of infected individuals. Another is to repeat a cross neutralization survey at a later time. Each provides different information.
Follow up serology on infected individuals allows more precise estimation of waning. Since waning is a determinant of drifting, it in turn allows more precise estimation of drifting. The amount of drifting between two times does not provide a direct measure of drifting. The change across time is due to a combination of increase in drifting events during transmission and transmission from already drifted individuals. Multiple time points contribute power to the joint estimation of waning and drifting parameters. If waning is estimated more precisely, multiple time points will contribute even more to the estimation of drifting parameters. How many recent infections there are at different drifting levels is reflected by how far down they appear on the diagonal. The longer it has been since infection, the further down the diagonal they will be. For example, look at DL0 vs. DL1 at waning rate = 0.05 and drift fraction = 0.1 week = 110 in S3B. Although there are only 11 DL1 infected individuals, the fraction at each waning level that were DL1 goes down markedly. But there is enough force of infection from each drift level so that at week 162 in SBD, the fraction of transmissions goes down less quickly. That fraction at different time points offers power to estimate transmission rates, drifting, and waning separately.
Interpreting differences between serology relationships at different points in time is important not because it predicts what the parameter estimates for waning and drifting might be from a given set of data. But because during DRIA one must consider what the potential of realistically relaxing simplifying assumptions might be. As one becomes more adept at interpreting the possible sources of serology data patterns, one will gain insights that direct one to what realistic relaxations of simplifying assumptions will contribute most to a decision that one is facing.

Expanding the dimensions of drifting and predicting the emergence of new strains
Changing the integer value of M and the value of the drifting fraction parameter dr at each level will make the model more flexible with regard to fitting data. That, in turn, will decrease the identifiability of model-based decision inferences. Opening the model so that it can capture new knowledge about epitope roles and antibody or T-cell effects at epitopes might recapture identifiability. Now we consider a simple approach to changing the number of drifting dimensions. In the model presented in this paper, we made the convenient simplifying assumption that the drift levels of the virus lie at the integer points 0,1,…,M of a line. This allowed the working assumptions that: 1) when an infected Ih contacted a newly recovered Ik in Rk0, the probability of transmission was |h-k|/(M+1), and 2) when an Ih infected an Rkj, the newly infected could drift to a nearby level Ih-1 or an Ih+1 . This can all be accomplished in a more general drift space.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 Suppose there are M+1 drift levels in some high dimensional space X. Label these levels as 0,1,..,M -possibly in an arbitrary order. One need only specify: 1) the probability of infection phk when an Ih meets an Rk0, and 2) the probability qh,h' that when an Ih transmits to any Rjk, the new infection will be in Ih' -some measure of nearness in X. These are two (M+1)x(M+1) matrices, which may have some underlying structure. For example, the phk may be symmetric; it will have 0s down its diagonal. Basically, one needs to estimate a distance measure or metric on the space of virus drift levels, related to transmissibility and drift. The Hamming distance on the virus genome could provide a metric see, for example (Girvan, Callaway et al. 2002). However, (Koelle, Cobey et al. 2006) argues that distance measures that relate the degree of crossimmunity with similarity across sequence strains are not realistic. The shape space methods of (Lapedes and Farber 2001) could be used in this task. But a more meaningful dimensionality could arise from actual spatial descriptions of epitopes and antibodies given that the technology for this now exists. But that technology is not easy or practical for use at a population level.
Multiple laboratory assays could contribute to measures of distance. These include genetic sequencing, neutralization assays, cell sorted B-cell and T-cell sequences specific for different antigens, avidity profiles of antibodies to specific antigens that ideally could be epitope specific, and immunity profiles to synthetic antigens as in (Doran, Gao et al. 2015) and (Kodadek and McEnaney 2016). All of these could contribute to the construction of cross-reaction tables similar to the neutralization tables we presented in the paper but with more laboratory defined dimensions.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101

Decision Robustness and Identifiability Analysis
The ability of our model to produce population serological data generates opportunities for fitting models to data as just discussed. Giving free reign to theory speculation and imagination of data possibilities, as was done in the last section, is useful for developing ideas. But it may be less helpful for answering burning questions. DRIA creates a set of constraints for relating theory to data. DRIA constrains questions addressed to a framework where answers are more likely to be found. DRIA, however, does not present a set of rules to be followed that has a clear endpoint. If the decision sought is about scientific theory, science never ends so one can conceivably always be pursuing robustness and identifiability assessment loops. If the decision is about an urgent public health issue, like making an investment in vaccines for next year, the decision will have to be made by making judgments about the seriousness of either identifiability or robustness problems. DRIA presents a context that can protect against two obvious sources of error discussed in the main text.
We separate loops into robustness assessment and identifiability assessment. These loops can intertwine. That happens when realistic relaxation of simplifying model assumptions increases identifiability because it better captures the effects reflected in the data being analyzed or opens up new data that can inform the model.

Questions to be addressed by DRIA
As stated in the main text, a scientific question to be addressed is whether pandemic SARS-CoV-2 strain will drift enough to sustainably circulate through reinfections. The following is a list of related questions that narrow this question to some practical issues. Some are policy questions and some are scientific questions. Each question should start a new DRIA. The first DRIA model for each question could be a more detailed model than that presented in this paper. Progress in understanding and elaborating drifting and waning models should be rapid. With each advance in understanding, the most efficient starting model should become clearer. For each of these questions, the inference robustness and inference identifiability steps of DRIA will be different. The following questions are not meant to be answered by performing a purely model analysis. They are meant to be answered by fitting models to data in progressive inference robustness or inference identifiability loops. Questions relevant to vaccines will require model elaborations that capture vaccine effects. That should be a straightforward task.
1. When administering SARS-CoV-2 vaccines, will drifting emerge when the coverage achieved can only reach X? (X to be decided in consultation with Public Health Experts.) 2. Will vaccines that stimulate immunity through diverse epitopes require less coverage to eliminate drifting than vaccines that target a single epitope? 3. Will vaccines that stimulate more diverse B-cells to a single epitope be subject to a lower drifting force? 4. Will drifting emerge when a coverage of only X can be achieved using an improved vaccine Y that covers more epitopes with more diverse antibodies?
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 5. Will vaccinating the age groups or other subgroups most involved in transmission make a vaccine less subject to drifting? 6. Will failure to vaccinate a specific high-risk group cause drifting that spreads outside of that group and contributes to sustaining drifting in other groups? 7. Will high transmission conditions increase the potential for drifting? 8. Will decreasing transmission through better ventilation eliminate the risk of drifting associated with only being able to achieve a vaccine coverage of X. 9. Will having diverse subpopulations with higher or lower transmission potential increase the potential for drifting?
4.2 DRIA robustness assessment loops The decision robustness assessment loop assesses whether realistically relaxing an assumption will change or make unidentifiable an inference that is identifiable in the simpler model. Since we have not fitted the model we presented to data, we are not ready to proceed with decision robustness assessment steps. To help understand DRIA, we present a list of simplifying assumptions that might be relaxed. These steps could be applicable to our main text question or the above listed questions.
The following numbered "realistic relaxations of simplifying assumptions" are likely to vary with regard to their chances for changing the above 9 listed decisions or to make a decision about them unidentifiable. Not all of these items below will be applicable to all of the 9 decisions above.
1. "The population is homogeneous with no age specific mixing or age specific histories of infection." This assumption could be relaxed by adding age groups and age specific contact patterns. 2. "The population is of infinite size in every category." This is an assumption that is intrinsic to all ODE models and can be relaxed by instituting an identical model in an agent based discrete individual format. 3. "The waning scale is linear across compartments with equal sized steps and equal sized differences in infection susceptibility." Theory suggests that relationships might follow a power law rather than being linear as in the model in this paper. Data on neutralization titers as a function of time might describe another pattern. Realistically relaxing the linear function might be pursued somewhat differently for agent based or ODE models: ODE -lengthening the time spent in each sequential compartment and decreasing the loss of susceptibility across each sequential compartment could give more power law relationships. AGENT BASED -formulating the increase in susceptibility by time as a power law. If good follow up neutralization data is available, integrating the observed relationships into the model might be the best alternative. An agent-based model makes it easier to adopt the uncertainties intrinsic to cross-reaction data. 4. "The drifting scale is ordinal and unidirectional." This can be relaxed as discussed in section III.3 on page 12. 5. "The natural history of infection is unchanged by the waning or drifting state of individuals being re-infected." This clearly unrealistic assumption was beneficial in . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 helping to work out what aspects of the system were causing new phenomena to appear. In a more detailed model, the effects of small drifting fractions at higher waning levels might have been attributed to assumptions about how different levels of immunity affect contagiousness. But now that we understand why our model formulation led to the dynamics it did, realistically relaxing assumptions about how immunity will affect the duration of infection can be done without causing as much confusion. 6. "Immunity is acquired from infection only and not from a vaccine." This can be relaxed by assigning population to specified drifting statuses in a manner that logically corresponds to vaccine administration. 7. "Immunity is directed only to the most recent infection." This is most easily relaxed in an agent-based formulation of the model. One approach could be to formulate B-cell survival and stimulation upon a new infection in each individual. A less computationally intensive approach might be to formulate a set of immunity dimensions corresponding to multi-dimensional drifting formulations. 8. "Boosting of immunity from first infections ad reinfection is to the same level. That level has the characteristic that the next step below that level creates some susceptibility." There are two separate things in that simplification that can be separately relaxed. The level of boosting could be made a function of the past history of infections, the exposure dose, or both. What level implies susceptibility could be made a function of the same things. Thus, one function could address both issues. 9. "Immunity is lost as soon as one leaves the first waning or drifting level." There could be redundant immunity such that several compartments must be traversed before infection becomes possible. 10. "Drifting occurs by short steps and not leaps." Recombination is a characteristic of coronaviruses. To capture some of that, one could add recombination across different antigenic dimensions to the model. Such a model elaboration would be especially useful if data on recombination were available. But it could be pursued without such data and notable effects in the model could stimulate collection of data. 11. "Waning and drifting generate escape from immune protection according to a simple additive protective effects formulation." A more realistic assumption might be that early waning is related to the loss of a small number of B cell clones and the probability of drifting from these will be high as well. Under that assumption, the joint distributions of waning and drifting will no longer be additive. The model could be elaborated by having different strengths of immunity with different loss rates.

DRIA decision identifiability concepts
Decision inference robustness is more widely understood and addressed than is decision identifiability. Understanding the distinction between parameter identifiability and decision identifiability may help to see how the DRIA we propose works. Many readers will be familiar with parameter identifiability. Parameters are identifiable by a set of data when only one set of parameters will generate the data being fit. There are distinct practical and theoretical approaches to parameter identifiability. Few infection transmission system models are . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 practically identifiable by the type of data to which infection transmission system models are usually fit.
Given that DRIA focuses on decisions rather than on model parameters, there is a better chance of achieving identifiability. A decision is identifiable when all of the different parameter sets consistent with the data to which the model is fit lead to one decision. Decision identifiability is easier to assess than parameter identifiability. The key steps are to first be precise in defining the decision to be made. Then the model must be able to specify two different boundaries.
The first model defined boundary is the decision boundary in parameter space. Criteria must be established that define for each possible parameter set what the decision should be. That might be done by some cost or benefit assessment of model output given. The key step of establishing criteria is discussion between public health officials, content area scientists, and complex systems scientist modelers.
The second boundary that must be defined by the model is the parameter space that is consistent with the data. To do that, one must have a statistical analysis procedure that defines the likelihood of the data given the model. There are various Partially Observed Markov Process (POMP) methods that can be employed to that end. The "pomp" software () can be employed. A pomp analysis that shows lack of decision identifiability is what gets one into the decision identifiability loop.

4.4
Pursuing decision identifiability A task of step six as outlined in the main text is to find new data that will allow one to narrow the parameter space that is consistent with the data. The new data to which the model is fit needs to be an integral part of the model. A major contribution of the model presented in this paper is that it opens up serological cross reaction data given different strain variants data for fitting transmission system models that entail drifting.
New data increases identifiability when the parameter space consistent with available data is made smaller. The new data may yet to be gathered, it may be more precise data, or it could be existing data that has not yet been used.
Data to assess antigenic drifting could come from epidemiological surveillance, outbreak investigations, data and specimens collected to investigate other infections like influenza, newly designed studies stimulated by model developments, follow up data after infection, virus isolations, virus sequencing, or detailed immune response characterizations. Data from all these sources could be enhanced by fitting waning and drifting models to them. Even when the model does not generate the data analyzed in the manner that our model generates crossneutralization data, it can be useful as it might still constrain parameter space.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 Surveillance data A promising set of data for the model we have presented could come from cross sectional surveys like those conducted to determine what fraction of the population has evidence of prior infection. Those sera together with virus isolations should be examined to generate the cross-reaction tables we have discussed. More widely available data might be temporally dispersed sera and isolated virus collections where viruses are not linked to sera collection populations. Fitting a model to such data requires assumptions about the transmission system dynamics generating the data. But in a DRIA context, such data could still be powerful.

Outbreak data
Outbreaks can represent natural experiments. For example, the cruise ship outbreaks of SARS-CoV-2 exhibit constrained and measurable contact patterns and allow for more detailed data collection. The cruise ship outbreaks during the first epidemic wave of SARS-CoV-2 cannot provide information on drifting forces since all infections were first infections. They might, however, provide information on cross-reactive immune protection from other coronaviruses. In the future, they could information on drifting if adequate serology specimens are gathered.

Specimens from studies designed to investigate other pathogens
In the first phase of this pandemic, studies of influenza have been productively leveraged to gain insights into SARS-CoV-2 infections. They provide data on exposure risks to other infections in relationship to risks for SARS-CoV-2. As we enter the time when drifting may appear, these studies could be even more valuable. Sequential sera with follow up that documents new infections can best come from such studies.

Newly designed studies stimulated by model developments
Sometimes the model can direct us to gather new data. When an inference robustness step shows lack of robustness to realistic relaxation of a simplifying assumption, the realism added to the model may imply that data on that aspect of realism may be needed. For example, when adding age specific contact patterns to a model, the inference could change across different age specific contact patterns. In that case, getting data on those age patterns should help narrow the parameter space. But there may be other alternatives. For example, age specific genetic sequence patterns of the virus may reflect aspects of the contact patterns that make a difference for the inference being pursued.

Follow up data after infection
Data on waning can be acquired through follow up of individuals after they have been identified as infected. That data addresses waning without any influence of drifting in the individuals followed up serologically since the presence of any intervening infection can be detected serologically. Given the relationships we showed in the main text between waning and drifting parameters, adding follow up data on waning has considerable potential to increase decision identifiability.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101 Independent data on drifting is not so readily available. Characterizing viruses isolated over time across different populations, age groups, and other risk groups does present information to which the model can be fit. But SARS-CoV-2 isolations are less likely to be representative of the overall population if infected individuals given variations in severity and the frequency of asymptomatic infections.

Genetic sequences
Genetic sequences are a type of data that is used to inform drifting assessments for influenza. The classic work of (Smith, Lapedes et al. 2004) Derek Smith et al. (), the analyses by (Koelle, Cobey et al. 2006) and the more recent work of (Castro, Bedford et al. 2020) as well as the work by (Luksza and Lassig 2014) offer ways to predict virus evolution without modeling a drifting process in the course of reinfections as we do. The genetic information may be useful even when it does not directly reflect viral pathogenesis. For example, the burden of harmful genetic changes can alter the effects of immune escape directed genetic changes in a virus in unintuitive ways (Koelle and Rasmussen 2015). Developing a model that handles genetic data within causal system framework that informs transmission system parameter estimations seems possible. The path to such a formulation, however, is yet to be articulated. Even for influenza, we suspect that using a population transmission system model that integrates waning, boosting, and drifting like we have presented could increase the value of genetic sequence data.

Laboratory data on SARS-CoV-2
Recent studies show what can be obtained by in-depth examination of a few isolates (Banerjee, Nasir et al. 2020) (Liu, Wang et al. 2020) (Premkumar, Segovia-Chumbez et al. 2020) (Noy-Porat, Makdasi et al. 2020) (Zost, Gilchuk et al. 2020) or the in-depth characterization of immune responses from a few individuals (Brouwer, Caniels et al. 2020). Such data could refine and make more specific the laboratory data produced by cross-neutralization or related serologic assays.

Detailed immune response characterizations
Laboratory studies such as those showing immunological cross reactions between coronaviruses (Mateus, Grifoni et al. 2020) or measurement of protective effects for specific epitopes and antibodies to them (Brouwer, Caniels et al. 2020) can also contribute to identifiability by suggesting more appropriate model formulations or more informative serology.
Another source of more detailed immune response data is to conduct antibody quantity vs. avidity profiles against the major epitopes on samples from a cross-section of a population. An ELISA or similar assay could be used. Assessing avidity just by measuring how much antibody attachment is lessened by urea or some similar reagent may not accurately describe the avidity profile because the competitive avidity of different antibodies is not assessed. A more complete description of the volume of low avidity to high avidity antibody that assesses competitive attachment could be assessed by varying the incubation time and the washing times in an ELISA . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101/2020.09.10.20192153 doi: medRxiv preprint from a minute to two weeks. Such long incubation or washing times would not increase the total time for examining collected sera since there is no increase in anything but the time in which plates are left to sit. A model of the data could then be used to provide a profile of avidity vs. quantity that is model translated into a susceptibility measure. That model would in turn inform the waning and drifting transmission system model.
Another way to get more detailed immune response characterizations is to chemically construct epitopes, screen millions of them at a time to identify epitope like constructs that distinguish different sera sets defined by time, age, and infection history, then further elaborate shape space for the chemically constructed epitopes around those that are positive until one has a thorough description of immune response characteristics by age, time, and infection history. Methods for this have been developed by (Doran, Gao et al. 2015, Kodadek andMcEnaney 2016) Fitting the model to this data should allow estimation of waning and drifting parameters of considerable complexity.
Using multi-scale models to increase identifiability Combining within host models and population models is an area of model analysis that has potential to increase identifiability. Our model made such a combination. But it did so by eliminating the virus growth and diversification dynamics within both the source and recipient individuals. It is possible that getting data on either of these within host processes could improve identifiability of waning and drifting parameters. An example where adding within-host process data increased the identifiability of population processes comes from (Tuncer, Marctheva et al. 2018) (Tuncer, Gulbudak et al. 2016). They have clarified the conceptual ideas involved and showed in a Rift Valley Fever model that more population model parameters were identifiable when they added within host processes to the model for which they could establish some scaling.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 11, 2020. . https://doi.org/10.1101