## Abstract

Infectious disease outbreaks are expected to grow exponentially in time but their initial dynamics is less known. Here I derive analytical expressions for the infectious disease dynamics with a gamma distribution of generation intervals. Excluding the exponential distribution, the outbreak grows as a power law at short times. At long times the dynamics is exponential with a growth rate determined by the basic reproductive number and the parameters of the generation interval distribution. These analytical expressions can be deployed to do better estimates of infectious disease parameters.

Infectious diseases can spread to a significant fraction of the population causing an epidemic. The chance for that to happen is determined by the infectious dynamics within each individual and the disease transmission between individuals. The within individual dynamics determines the generation interval, the time interval between a primary case being infected to the disease transmission to a secondary case. The transmission between individuals is reflected in the reproductive number, the average number of secondary infectious caused by a primary case.

The networks underlying proximity and sexual transmission of infectious diseases have wide connectivity distributions and exhibit the small-world property [1, 2]. In these networks the basic reproductive number is proportional to the ratio between the second and first moments of the connectivity distribution [3–6], which can be very large. Less attention has been given to the interplay between the contact network and the generation interval distribution.

Before entering the main calculations, let’s have a look at standard models of infectious disease dynamics. In the susceptible, infected and removed (SIR) model, susceptible individuals get infected at rate *β* and infected individuals get removed at a rate *γ*. The basic reproductive number, the average number of new infections generated by a single infected individual, is given by
The dynamics of the infected individuals at the population level is given by
where *S* and *I* are the number of susceptible and infected individuals and *N* is the population size. For short times we can assume t*S* ≈ *N* and *I* ≪ *N*. In this limit we can integrate equation (7) obtaining
where
is the population reproductive number of the SIR model. According to the SIR model, the basic and population reproductive numbers are identical.

In the SEIR model susceptible individuals get infected at rate *β* without becoming infectious (exposed), exposed individuals become infectious at a rate *α* and infectious individuals get removed at a rate *γ*. The infected compartment is now split into exposed, with number *E*, and infectious, with number *I*. Note that once an individual become infectious it behaves exactly as an infected individual in the SIR model. Therefore we obtain the same basic reproductive number
The dynamics of exposed/infectious individuals reads
For short times *S* ≈ *N* and *I* ≪ *N*. In this limit we can integrate equation (7) using an eigenvalue approach. Focusing on the largest eigenvalue, we obtain
Where
is the population reproductive number of the SEIR model. The introduction of the exposed compartment carries as a consequence a discrepancy between the local and population reproductive numbers.

The SIR and SEIR models neglect key aspects of real populations. First, there is a variable contact rate across the population, which translates to a variable rate of disease transmission across infectious individuals. Second, while the SEIR model accounts for the incubation period of infectious diseases, it is still too restrictive. Using well stablished mathematics from the theory of age-dependent branching processes, I have previously calculated the expected number of infected individuals of infectious disease outbreaks on heterogeneous populations and any given time interval distribution [7, 8]. However, the applications were limited to an exponential distribution of generation intervals. Here I use this formalism to calculate the population reproductive number for a broader class of generation interval distributions.

The branching process approach maps contact processes mediating disease transmission into causal trees of disease transmission. The average reproductive number of patient zero, the expected number of secondary cases generated by patient zero, is given by
where ⟨ *β* ⟩ is the average infectious contact rate in the population and *γ* is the rate of recovery from the infection. For infected cases other than patient zero one needs to take into account that the disease spreading biases the disease transmission to individuals with a higher contact rate. The patient zero can be thought as an individual selected at random from the population. Any other infected individual will not be selected at random, but it will be found with a probability proportional to its contact rate: *β/N* ⟨ *β* ⟩, where *N* is the population size. Once infected, the individual found by contact will engage in new contacts at a rate *β*. Therefore, the average reproductive number of patients other than patient zero is
*R*_{0} gives the average number of infectious at the first generation, those generated by patient zero. *R*_{0}*R* gives the average number of infections at the second generation and *R*_{0}*R*^{d−1} gives the average number of infections at the *d* generation. The actual time when an infected case at generation *d* becomes infected equals the sum of *d* generation intervals and it has a probability density function *g*⋆ ^{d}(*t*), where the symbol ⋆ denotes convolution. Therefore, the average number of new infected individuals at time *t* is given by
where *N*_{0} is the number of patients zero and *D* is the maximum generation, when the disease transmission ends. When *D* is small, due to lockdown for example, the spreading dynamics follows a polynomial or power-law growth [7, 9]. Here, I focus on the case *D* → ∞, or more precisely *t* ≪ ∫ *D dtg*(*t*)*t*. In this limit
Now we will focus on the shape of the generation interval distribution. In the SIR model an infected individual is infectious right from the time it becomes infected until removed. Let *t*_{R} be a specific realization of the removal time. Since removal takes place at a rate *γ, t*_{R} has the exponential probability density function *γe*^{−γt}. Assuming that each individual has a time-independent contact rate, the generation intervals will be uniformly distributed between 0 and the recovery time, resulting in the same exponential distribution of generation intervals
In the SEIR model the generation interval can be decomposed into the sum *t*_{S} = *t*_{I} + *t*_{R}, where *t*_{I} is the time interval from exposed to infectious and *t*_{R} is the removal time. Since the transition from exposed to infectious takes place at a constant rate *α*, then *t*_{I} has the exponential probability density function *αe*^{−αt}. Therefore
There are two key differences between the generation interval distributions for the SIR and SEIR model. First, the mode for *g*_{SIR}(*t*) is zero and that for *g*_{SEIR}(*t*) is non-zero. Second, around *t* = 0, *g*_{SIR}(*t*) ∼ *γ*, while *g*_{SEIR}(*t*) ∼ *αγt*. Both differences are a consequence of introducing an incubation state between being infected and becoming infectious.

The case *α* ≠ *γ* in (15) makes the calculation of convolutions difficult. A more suitable functional form is the gamma distribution
where *s* ≥ 1 is a shape parameter quantifying the convexity around *t* = 0 (Fig. 1). The case *g*(*t*, 1) corresponds

with the exponential distribution of the SIR model (14). For *s* = 2 we obtain the case *α* = *γ* of the SEIR model (15). More generally, the values of *s* and *γ* could be derived from the fitting to empirical data. For example, for COVID-19, the inspection of inferred generation interval distributions suggest *γ >* 1 [10] and a fitting to a gamma distribution resulted in *γ* = 2.5 [11]. Besides covering many possible scenarios, the gamma distribution is amenable to convolution. Using the Laplace transform method one obtains
The convolution of a gamma distribution is itself a gamma distribution, with the exponent scaled by the order of the convolution (*d* → *ds*).

Coming back to the outbreak dynamics, substituting (17) into (13) we obtain
Equations (18)-(19) provide a series representation for the expected number of new infections. For short times, neglecting *d >* 1 terms we obtain (*R*^{1/s}*γt* ≪ 1)
The case *s* = 1 is not representative. For *s* = 1 we have *I*(*t, s*) ∼ 1, while *I*(*t, s*) ∼ *t*^{s−1} for *s >* 1. That is, except for the exponential distribution, the outbreak grows as a power law *t*^{s−1} for short times.

For *s* = 1 equation (19) is the series expansion of the exponential
For *s* = 2 the series expansion of the hyperbolic sine
For *s* = 4 the series of sinh(*x*) minus sin(*x*)
Using this analytical representations we uncover the full impact of the time generation distribution on the infection dynamics (Fig. 2).

More generally, when *s* is a natural number
is a subseries of the exponential function. This series has been calculated [12, 13], resulting in
where *ω* = *e*^{2πi/s}. For large *x*, since ℜ (*ω*^{n}) *<* 1 for all *n* = 1, …, *s* − 1, we obtain (*x* ≫ 1)
Based on this asymptotic behaviour, for long times equation (18) can be approximated by (*R*^{1/s}*γt* ≫ 1)
where
is the population reproductive number. Equation (28) coincides with the result from Wallinga and Lipsitch based on the Lotka-Euler equation [14].

The equation for the population reproductive number (28) can be used to estimate the basic reproductive number from empirical data for the generation interval distribution and the doubling time. According to equation (27), the disease doubling time is given by
*t*_{D} is estimated from the plot of the number of new cases as a function of time. *s* and *γ* are estimated from a fit to the generation interval data. Then, using equations (28) and (29) we can estimate *P* and *R*.

Equation (28) is definitive proof that the shape of the generation interval distribution determines the relationship between the basic (*R*) and population (*P*) reproductive numbers. For *s* = 1 we recover the SIR model equation (4), when the individual and population reproductive numbers coincide. For *s* = 2 we recover the case *α* = *β* of the SEIR model in equation (9). Figure 3 shows the population reproductive number as a function of *s* for two different local reproductive numbers. When *R >* 1, *P* decreases monotonically with increasing *s*. When *R <* 1, *P* increases monotonically with increasing *s*. In either case, *P* approaches 1 for large *s*. Note, however, that the shape parameter *s* does not change the fact that if *R >* 1 then *I*(*t*) grows exponentially, while *I*(*t*) decays exponentially when *R <* 1.

The analysis of a gamma distribution of generation intervals can be extended to calculate the population reproductive number when there are heterogenous mixing patterns between individuals according to certain types [15, 16]. The outcome is similar to equation (28), after replacing *R* by the largest eigenvalue *ρ* of the mixing matrix of reproductive numbers (equation 12 in Ref [16]). After making this substitution, we obtain the population reproductive number for the multi-type generalization
where Λ_{i} is the largest eigenvalue of the *i*th mixing matrix (*e*.*g*., age, mask use, etc) [16].

In conclusion, the branching process formalism allows for a flexible description of infectious disease outbreaks that can be fully based on empirical distributions. At short times the outbreak grows as a power law, *t*^{s−1}, where the exponent is determined by the shape parameter of the gamma distribution. Therefore a power law growth is not incompatible with the homogeneous mixing approximation as previously claimed [17]. Furthermore, this power law should not be confused with the long-time power law induced by the truncation of the disease transmission at a maximum generation, as expected from an imposed lockdown for example [7, 9].

## Data Availability

All data are contained within the submission.

## ACKNOWLEDGEMENTS

Supported by Cancer Research UK C596/A21140.

## Footnotes

Change of notation from serial interval to generation interval. Acknowledgement of relevant work by Wallinga and Lipsicth.