Near-exact explicit asymptotic solution of the SIR model well above the epidemic threshold

A simple and explicit expression of the solution of the SIR epidemiological model of Kermack and McKendrick is constructed in the asymptotic limit of large basic reproduction numbers $R_0$. The proposed formula yields good qualitative agreement already when $R_0\geq3$ and rapidly becomes quantitatively accurate as larger values of $R_0$ are assumed. The derivation is based on the method of matched asymptotic expansions, which exploits the fact that the exponentially growing phase and the eventual recession of the outbreak occur on distinct time scales. From the newly derived solution, an analytical estimate of the time separating the first inflexion point of the epidemic curve from the peak of infections is given.


Introduction
The COVID-19 pandemic has impacted all aspects of our daily lives. It has thus triggered an extraordinary world-wide response across all fields of science, from virology and pharmacology to sociology. Within the scientific community, applied mathematicians and theoretical physicists have also contributed, notably by applying their mathematical expertise to epidemiological modelling [1,2,3,4,5,6,7,8,9,10,11,12]. Deterministic compartment models allow one, in principle, to develop an accurate global view of the contagion dynamics within large populations. Compartments can be used to divide the population into age categories as well as according to the evolution of the illness, for those who have been infected [13,8,10]. These multiple levels of description are useful and necessary, and can be made to fit the data closely. However, they contain a plethora of fitting parameters and can therefore be complex to interpret. Fortunately, it turns out that the simplest of all compartment models, SIR, faithfully reproduces the global dynamics of COVID-19 at the level of a country or a large city [9,14]. As a result, the SIR model remains a useful tool at such a global level of description. We write it as follows:
$$\frac{dS}{dt} = -\frac{\beta S I}{N}, \qquad \frac{dI}{dt} = \frac{\beta S I}{N} - \gamma I, \qquad \frac{dR}{dt} = \gamma I,$$
where $S$, $I$, and $R$ respectively denote the number of susceptible, infected and removed individuals, with constant sum $N = S + I + R$. The population $R$ includes both those who have recovered from the illness and those who have died. Above, $\beta$ is the contact rate, $\gamma$ is the recovery rate, and their ratio $R_0 = \beta/\gamma$ is the basic reproduction number. Despite the long history of this model [15], there continue to be many efforts to solve it analytically [16,17,18,19,20,21,22,23,24,25]. In their seminal paper, Kermack and McKendrick offered an approximate solution in the limit of a "small epidemic", i.e. when the reproduction number $R_0$ is just above 1 [15].
So far, this has remained the only explicit approximate analytical solution available, in the sense that it can be justified as an asymptotic limit of the true solution. Well above the epidemic threshold, an analytical expression of the solution can be written in parametric form [26]. Unfortunately, it is only given implicitly, such that time is expressed in terms of one of the dynamical variables through an integral. Hence, this solution is not directly interpretable and its analysis can be quite involved [23]. Alternatives have been proposed in the form of converging series [17,20], but these can involve expansions with anywhere from 15 to 60 terms and, hence, are again impractical for analysis. More down-to-earth is the approach by which the solution is fitted by a well-chosen ansatz with a few parameters [27,24]. However, such ansätze are not derived from the model, and their parameters must be fitted to the data rather than deduced from the model.
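As a point of reference for the analytical results that follow, the SIR equations can be integrated numerically in a few lines. The sketch below is not part of the original paper: it assumes a classical fourth-order Runge-Kutta scheme with a fixed step, works with the fractions s = S/N, i = I/N, r = R/N, and uses illustrative parameter values. The function names `sir_rhs`, `rk4_step` and `integrate_sir` are hypothetical.

```python
import math

def sir_rhs(state, beta, gamma):
    """Right-hand side of the SIR model written for the fractions (s, i, r)."""
    s, i, r = state
    new_infections = beta * s * i
    return (-new_infections, new_infections - gamma * i, gamma * i)

def rk4_step(state, dt, beta, gamma):
    """One classical Runge-Kutta step (an assumed scheme, not the paper's)."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))
    k1 = sir_rhs(state, beta, gamma)
    k2 = sir_rhs(shifted(state, k1, 0.5 * dt), beta, gamma)
    k3 = sir_rhs(shifted(state, k2, 0.5 * dt), beta, gamma)
    k4 = sir_rhs(shifted(state, k3, dt), beta, gamma)
    return tuple(
        x + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate_sir(beta, gamma, a, t_end, dt=0.01):
    """Return a list of (t, s, i, r), starting from i(0) = a and r(0) = 0."""
    state = (1.0 - a, a, 0.0)
    trajectory = [(0.0,) + state]
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        state = rk4_step(state, dt, beta, gamma)
        trajectory.append((step * dt,) + state)
    return trajectory

if __name__ == "__main__":
    beta, gamma, a = 3.0, 1.0, 1e-3   # illustrative values, R0 = beta/gamma = 3
    t, s, i, r = integrate_sir(beta, gamma, a, t_end=40.0)[-1]
    # Along any trajectory s = (1 - a) * exp(-R0 * r), a cheap accuracy check.
    drift = abs(s - (1.0 - a) * math.exp(-(beta / gamma) * r))
    print(f"t = {t:.1f}: s = {s:.4f}, i = {i:.2e}, r = {r:.4f}, drift = {drift:.1e}")
```

The relation s = (1 - a) exp(-R0 r) used in the last lines follows from dividing the first SIR equation by the third, and provides a convenient check that the step size is small enough.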

Statement of results
The aim of this paper is to show that the removed fraction of the population, $n_r(t) = R(t)/N$, is given asymptotically by the explicit expression Eq. (1), where $a$ is the infected fraction of the population at $t = 0$, while $n_{r,\infty}$ and $t_*$ are given by Eq. (2). Given $n_r(t)$, one may infer the infected fraction of the population exactly as
$$n_i = 1 - n_r - (1-a)\,e^{-R_0 n_r}. \tag{3}$$
Eq. (1) is remarkably simple and compact. The "∼" sign in it indicates that it holds asymptotically in the limit $R_0 \gg 1$. However, a good qualitative agreement with the numerical solution is already found for $R_0 = 3$ and the approximate formula rapidly becomes as good as exact with increasing $R_0$, see Fig. 1. Hence, Eq. (1) complements the classical "small epidemic" formula derived by Kermack and McKendrick [15], which is valid in the limit $0 < R_0 - 1 \ll 1$. When, in 1956, D. G. Kendall presented the implicit analytical solution of the SIR model, he noted: "It is curious that the Kermack and McKendrick approximation should have been accepted without comment for nearly thirty years; the exact solution is easily obtained and the difference between the two can be of practical significance." Similarly, the reader will find that the derivation of the present asymptotic formula is quite simple and could have been established much earlier with well-established asymptotic techniques.
As an illustration of the usefulness of the present analytical expression, let us compute the time elapsed between two important moments of the epidemic, namely the time $t_1$ of first inflexion of the curve $n_i(t)$ and the time $t_2$ of the peak of the epidemic. In the appendix, we show that the former happens when $n_r$ is equal to $n_1$, solution of Eq. (4), while the latter happens when
$$n_r \sim \frac{\ln R_0}{R_0} \tag{5}$$
in the limit $a \to 0$. It turns out that $t_1 < t_*$ and that $t_2 > t_*$. Hence, using Eq. (1), we have Eq. (6) for $t_1$. Meanwhile, for $t_2$, we have Eq. (7). In the latter, we may assume in first approximation that $t_2$ is sufficiently large that the logarithmic term is negligible. Thus, we obtain the more manageable expression Eq. (8) and, from it, the estimate of $t_2 - t_1$ in Eq. (9). A comparison between this approximate formula and the numerics is given in Fig. 2. Collecting data in real time allows one to detect when the slope of the epidemic curve has reached a maximum; $t_2 - t_1$ is then the time left before the epidemic starts to recede.
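The two instants $t_1$ and $t_2$ can also be located numerically, which gives a reference against which closed-form estimates may be compared. The sketch below is an illustration rather than the procedure of the paper: it integrates the single equation dn_r/dt = γ n_i, with n_i expressed through the exact relation n_i = 1 − n_r − (1 − a) exp(−R0 n_r), using a fixed-step Runge-Kutta scheme (an assumption), and records where the slope of n_i and n_i itself are largest. The names `infected_fraction` and `inflexion_and_peak`, as well as the parameter values, are hypothetical.

```python
import math

def infected_fraction(n_r, r0, a):
    """Exact SIR relation giving n_i as a function of n_r."""
    return 1.0 - n_r - (1.0 - a) * math.exp(-r0 * n_r)

def inflexion_and_peak(beta, gamma, a, t_end=30.0, dt=1e-3):
    """Return (t1, t2, n_r at t2): first inflexion and peak of n_i(t)."""
    r0 = beta / gamma

    def rate(n_r):
        # dn_r/dt = gamma * n_i(n_r)
        return gamma * infected_fraction(n_r, r0, a)

    def slope(n_r):
        # dn_i/dt = (dn_i/dn_r) * dn_r/dt
        dni_dnr = r0 * (1.0 - a) * math.exp(-r0 * n_r) - 1.0
        return dni_dnr * rate(n_r)

    n_r, t = 0.0, 0.0
    best_slope, t1 = slope(n_r), 0.0
    best_ni, t2, n_r_peak = infected_fraction(n_r, r0, a), 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        # classical fourth-order Runge-Kutta step for the scalar equation
        k1 = rate(n_r)
        k2 = rate(n_r + 0.5 * dt * k1)
        k3 = rate(n_r + 0.5 * dt * k2)
        k4 = rate(n_r + dt * k3)
        n_r += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
        current_slope = slope(n_r)
        if current_slope > best_slope:      # steepest growth: first inflexion
            best_slope, t1 = current_slope, t
        n_i = infected_fraction(n_r, r0, a)
        if n_i > best_ni:                   # largest n_i: epidemic peak
            best_ni, t2, n_r_peak = n_i, t, n_r
    return t1, t2, n_r_peak

if __name__ == "__main__":
    t1, t2, n_peak = inflexion_and_peak(beta=4.0, gamma=1.0, a=1e-4)
    print(f"t1 = {t1:.3f}, t2 = {t2:.3f}, t2 - t1 = {t2 - t1:.3f}")
    print(f"n_r at the peak = {n_peak:.4f}")
```

At the peak, the recorded value of n_r can be compared with ln[R0 (1 − a)]/R0, which is what the peak condition gives before the limit a → 0 is taken.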

Preliminaries
Before embarking on the asymptotic analysis of the SIR model, we first recall a few well-known facts. Eliminating $S$ thanks to the relation $S(t)+I(t)+R(t) = N$, the evolution equations for the fractions $n_i = I/N$ and $n_r = R/N$ are
$$\frac{dn_i}{dt} = \beta\,(1 - n_i - n_r)\,n_i - \gamma n_i, \tag{10}$$
$$\frac{dn_r}{dt} = \gamma n_i. \tag{11}$$
They should be solved with the initial condition
$$n_i(0) = a, \qquad n_r(0) = 0. \tag{12}$$
At the start of the outbreak, $n_i, n_r \ll 1$ and the linearised evolution equations immediately yield
$$n_i \sim a\,e^{(\beta-\gamma)t}, \qquad n_r \sim \frac{a}{R_0-1}\left(e^{(\beta-\gamma)t} - 1\right). \tag{13}$$
On the other hand, dividing Eq. (10) by Eq. (11), one obtains
$$\frac{dn_i}{dn_r} = R_0\,(1 - n_i - n_r) - 1, \tag{14}$$
which immediately yields Eq. (3). Thanks to the explicit dependence $n_i[n_r(t)]$, Eq. (11) eventually becomes a first-order nonlinear ODE:
$$\frac{dn_r}{dt} = \gamma\left[1 - n_r - (1-a)\,e^{-R_0 n_r}\right]. \tag{15}$$
As $t \to \infty$, $n_r$ tends to the value $n_{r,\infty}$ that makes the right-hand side vanish:
$$n_{r,\infty} = 1 - (1-a)\,e^{-R_0 n_{r,\infty}}. \tag{16}$$
In the large-$R_0$ limit, the second term in the right-hand side above is small, so that a recursive resolution can be set up. An excellent approximation is given by
$$n_{r,\infty} \approx 1 - (1-a)\exp\left\{-R_0\left[1 - (1-a)\,e^{-R_0}\right]\right\}, \tag{17}$$
with an absolute error of less than 0.002 for $R_0 > 3$. Alternatively, one may express $n_{r,\infty}$ in terms of the tabulated Lambert function [23]. It is immediate to see that Eq. (15) can formally be integrated as
$$\gamma t = \int_0^{n_r} \frac{dn}{1 - n - (1-a)\,e^{-R_0 n}}. \tag{18}$$
This expression has been known and studied for a long time [26]. Note that if both $\gamma$ and $\beta$ are functions of time, such that their ratio $R_0 = \beta/\gamma$ remains constant, then Eq. (18) remains valid, with $\gamma t$ replaced by the effective time $\tau = \int \gamma(t)\,dt$ [22]. Recently, a thorough analysis of the integrand of Eq. (18) was carried out, yielding upper and lower bounds for the function $t(n_r)$ [23]. In addition, series representations of the integral, based either on a Taylor expansion of the integrand near the origin or on its poles in the complex-$n$ plane, were given in that work. The first of these approaches effectively amounts to assuming that $n$ is small and thus generalizes the approximate solution of Kermack and McKendrick. However, the formulas in [23] are rather involved, making the inversion of the relation $t(n_r)$ laborious.
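The "recursive resolution" for the final removed fraction lends itself to a short numerical check. The sketch below is illustrative: it solves n = 1 − (1 − a) exp(−R0 n), the condition that dn_r/dt vanishes, by plain fixed-point iteration, and compares the result with the explicit estimate obtained after only two substitutions starting from n = 1. The function names, the starting guess and the choice of two substitutions are assumptions made for this example.

```python
import math

def final_size(r0, a, tol=1e-14, max_iter=10_000):
    """Solve n = 1 - (1 - a) * exp(-r0 * n) by fixed-point iteration."""
    n = 1.0
    for _ in range(max_iter):
        n_next = 1.0 - (1.0 - a) * math.exp(-r0 * n)
        if abs(n_next - n) < tol:
            return n_next
        n = n_next
    raise RuntimeError("fixed-point iteration did not converge")

def final_size_two_steps(r0, a):
    """Explicit estimate: two substitutions starting from n = 1."""
    first = 1.0 - (1.0 - a) * math.exp(-r0)
    return 1.0 - (1.0 - a) * math.exp(-r0 * first)

if __name__ == "__main__":
    a = 1e-4  # illustrative initial infected fraction
    for r0 in (3.0, 4.0, 6.0, 10.0):
        exact = final_size(r0, a)
        approx = final_size_two_steps(r0, a)
        print(f"R0 = {r0:4.1f}: n_inf = {exact:.6f}, "
              f"two-step error = {abs(approx - exact):.2e}")
```

The iteration converges quickly well above threshold because the slope of the map, R0 (1 − a) exp(−R0 n), is small near the solution; close to R0 = 1 it would converge much more slowly.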

Matched asymptotic expansions
The method consists in deriving separate approximations at the early stage and at the later stage of the outbreak. On the one hand, the outer solution has a characteristic time $\gamma^{-1}$ and describes the decaying phase of the epidemic. On the other hand, the inner solution applies to the growing phase and evolves on a shorter time scale. In the language of asymptotics, the $R_0 \gg 1$ limit produces a boundary layer problem [28]. The two approximations are constrained by matching conditions at intermediate stages of the outbreak. Thanks to the fact that they have an overlapping range of validity, one can then construct a composite solution, Eq. (1), that is uniformly valid in time.

Outer solution
When $n_r = O(1)$, $e^{-R_0 n_r}$ is exponentially small and can at first be neglected in Eq. (15). We thus have
$$n_r \sim n_r^{(o)} = 1 - e^{-\gamma(t - t_o)}, \tag{19}$$
where the superscript $(o)$ refers to the "outer" approximation and $t_o$ is an arbitrary constant. However, knowing the actual limiting value of $n_r$ as $t \to \infty$, a better approximation is
$$n_r^{(o)} = n_{r,\infty} - e^{-\gamma(t - t_o)}. \tag{20}$$
Since $n_{r,\infty}$ and 1 only differ by an exponentially small quantity, they are asymptotically equivalent as $R_0 \to \infty$. However, the correction brought about by using $n_{r,\infty}$ in Eq. (20) considerably improves the approximation for moderately large $R_0$ and thus expands the range of usability of the final result, Eq. (1). Note that $n_r^{(o)}$ is only defined for $t > t_o$, since $n_r$ must be a positive number.

Inner solution
At earlier stages of the epidemic, $n_r$ is $O(R_0^{-1})$. More precisely, Eq. (13) suggests introducing a rescaled variable $\nu$ through Eq. (21), where $(i)$ refers to the "inner" approximation. Then Eq. (15) becomes Eq. (22). Unlike Eq. (15), this last equation can be integrated explicitly, Eq. (23), where we used the fact that $\nu(0) = 0$. Hence, $\nu(t)$ and, in turn, the inner approximation $n_r^{(i)}(t)$, Eq. (26), follow explicitly.
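The steps of this subsection can be sketched explicitly under one assumption about the scaling that is not recoverable from the text as it stands: that the inner variable is defined by $n_r = \nu/(R_0-1)$, the natural choice given the linearised growth rate $\beta-\gamma$. With that assumption, and neglecting terms of relative order $1/(R_0-1)$, the first-order equation for $n_r$ reduces to a separable equation that integrates in closed form. Equation labels are omitted because the correspondence with the numbering of the paper is not certain.

```latex
% Starting point: dn_r/dt = \gamma [ 1 - n_r - (1-a) e^{-R_0 n_r} ], n_r(0) = 0.
% Assumed inner scaling: n_r = \nu / (R_0 - 1), with \nu = O(1).
\begin{align*}
\frac{d\nu}{dt} &= (\beta-\gamma)\left[1-(1-a)\,e^{-\nu}\right]
   + O\!\left(\frac{\beta-\gamma}{R_0-1}\right), \\
(\beta-\gamma)\,t &= \int_0^{\nu}\frac{e^{\nu'}\,d\nu'}{e^{\nu'}-(1-a)}
   = \ln\frac{e^{\nu}-1+a}{a}, \\
\nu(t) &= \ln\!\left[1-a+a\,e^{(\beta-\gamma)t}\right], \\
n_r^{(i)}(t) &= \frac{1}{R_0-1}\,\ln\!\left[1-a+a\,e^{(\beta-\gamma)t}\right].
\end{align*}
```

For $a\,e^{(\beta-\gamma)t} \ll 1$ the last line reduces to $a\,(e^{(\beta-\gamma)t}-1)/(R_0-1)$, which is the linearised solution for $n_r$ at the start of the outbreak.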

Matching
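So far, we have derived two approximations that efficiently describe the solution at different moments of the epidemic. However, the constant $t_o$ in Eq. (20) is still unknown. To determine it, we must impose that (20) and (26) agree at intermediate stages of the outbreak. To formalise this statement more precisely, let us define the matching region as $t = t_* + \eta\Delta t$, where $\eta$ is such that $\gamma\eta\Delta t \ll 1$, while $(\beta-\gamma)\,\eta\Delta t \gg 1$. This double asymptotic constraint is possible since $(\beta-\gamma)/\gamma = R_0 - 1 \gg 1$ by assumption. On the one hand, expanding the outer solution (20) in the matching region, we have
$$n_r^{(o)} \sim n_{r,\infty} - e^{-\gamma(t_* - t_o)} + e^{-\gamma(t_* - t_o)}\,\gamma\eta\Delta t.$$
On the other hand, the inner solution (26) is evaluated in the same region. Comparing the two expansions, we have $t_o = t_*$.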
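To see how the matching fixes $t_o$, one may compare the limiting forms of the two approximations in the intermediate region. The sketch below rests on an assumption: it takes the inner approximation in the form $n_r^{(i)} = \ln[1-a+a\,e^{(\beta-\gamma)t}]/(R_0-1)$, obtained by integrating the leading-order inner equation with $n_r = \nu/(R_0-1)$, together with the outer approximation $n_r^{(o)} = n_{r,\infty}-e^{-\gamma(t-t_o)}$. The resulting expression for $t_*$ follows from that assumption and is not quoted from the source.

```latex
% Inner limit: (\beta-\gamma)(t - t_*) \gg 1, so that a e^{(\beta-\gamma)t} \gg 1-a.
% Outer limit: \gamma (t - t_o) \ll 1.
\begin{align*}
n_r^{(i)} &\sim \frac{\ln a + (\beta-\gamma)\,t}{R_0-1}
           = \gamma\,(t-t_*),
\qquad t_* \equiv \frac{1}{\beta-\gamma}\,\ln\frac{1}{a}, \\
n_r^{(o)} &\sim n_{r,\infty}-1+\gamma\,(t-t_o).
\end{align*}
% n_{r,\infty} - 1 is exponentially small for R_0 \gg 1, so the two limits
% agree provided t_o = t_*.
```

Both limits are linear in time with the same slope $\gamma$, so the only freedom left is the time origin of the outer solution, which is what the matching determines.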

Composite solution
As indicated above, $n_r^{(o)}$ is only defined for $t > t_o = t_*$. For $t < t_*$, only the inner solution holds, $n_r \sim n_r^{(i)}$. On the other hand, for $t > t_*$, we may construct a uniformly valid approximation as
$$n_r(t > t_*) \sim n_r^{(i)} + n_r^{(o)} - \text{common part},$$
where the common part is the time dependence that $n_r^{(i)}$ and $n_r^{(o)}$ have in common in the matching region, and which should be removed to avoid double counting. The common part follows from (35). Hence Eqs. (38) and (41) yield Eq. (1), which completes the derivation.
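The construction can be tried out numerically. The sketch below is an illustration under stated assumptions, not the formula of the paper quoted verbatim: it takes the inner approximation as ln[1 − a + a e^{(β−γ)t}]/(R0 − 1), the outer approximation as n_{r,∞} − e^{−γ(t − t*)} with t* = ln(1/a)/(β − γ), and the common part as γ(t − t*), all of which follow from a leading-order treatment with n_r = ν/(R0 − 1). The result is compared with a fixed-step Runge-Kutta integration of dn_r/dt = γ[1 − n_r − (1 − a) e^{−R0 n_r}]. All function names and parameter values are hypothetical.

```python
import math

def final_size(r0, a):
    """Fixed-point solution of n = 1 - (1 - a) * exp(-r0 * n)."""
    n = 1.0
    for _ in range(200):
        n = 1.0 - (1.0 - a) * math.exp(-r0 * n)
    return n

def composite(t, beta, gamma, a):
    """Assumed composite approximation of n_r(t): inner + outer - common."""
    r0 = beta / gamma
    growth = beta - gamma
    t_star = math.log(1.0 / a) / growth
    if t < t_star:
        # before t_star only the inner approximation is used
        return math.log(1.0 - a + a * math.exp(growth * t)) / (r0 - 1.0)
    # inner minus common part, rearranged so no large exponential is formed
    log_term = math.log1p((1.0 - a) * math.exp(-growth * (t - t_star)))
    outer = final_size(r0, a) - math.exp(-gamma * (t - t_star))
    return outer + log_term / (r0 - 1.0)

def numerical(beta, gamma, a, t_end, dt=1e-3):
    """RK4 integration of the scalar equation for n_r; returns (t, n_r) pairs."""
    r0 = beta / gamma

    def rate(n):
        return gamma * (1.0 - n - (1.0 - a) * math.exp(-r0 * n))

    n, samples = 0.0, [(0.0, 0.0)]
    for step in range(1, int(round(t_end / dt)) + 1):
        k1 = rate(n)
        k2 = rate(n + 0.5 * dt * k1)
        k3 = rate(n + 0.5 * dt * k2)
        k4 = rate(n + dt * k3)
        n += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        samples.append((step * dt, n))
    return samples

def max_abs_error(beta, gamma, a, t_end=10.0):
    """Largest gap between the assumed composite formula and the numerics."""
    return max(abs(composite(t, beta, gamma, a) - n)
               for t, n in numerical(beta, gamma, a, t_end))

if __name__ == "__main__":
    for r0 in (3.0, 5.0, 10.0):
        err = max_abs_error(beta=r0, gamma=1.0, a=1e-3)
        print(f"R0 = {r0:4.1f}: max |composite - numerical| = {err:.4f}")
```

With this form the two branches differ at t* only by n_{r,∞} − 1, which is exponentially small in R0, so the approximation is continuous for practical purposes.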

Conclusion
We have shown that an explicit approximate solution, Eq. (1), can be derived in the limit of large $R_0$. The interest of this solution is that it is already effective when the reproduction number is only moderately large. Given that SIR is in itself a crude approximation of reality, it makes little sense from an applied perspective to look for an approximation that is exact to within 1%. Taking the time between inflexion and maximum of the epidemic curve as a measure of accuracy, the present theory, Eq. (9), yields an answer with less than 10% error for $R_0$ just above 4. With influenza, a reproduction number exceeding the value of 3 appears rare but possible [29], as was the case with the 1918 influenza pandemic [30].
In the case of the COVID-19 pandemic, estimates yield $R_0$ in the range 1.5 to 6.49 [31], and particularly around 3 in France [32] and between 3.5 and 4 in Korea and in the Hubei province (China) during the first wave [33]. For measles, $R_0$ can be on the order of 10 or even larger [34]. Hence, the large-$R_0$ limit considered here is relevant, even though, of course, the assumption of a constant ratio $\beta/\gamma$ does not apply to the COVID-19 pandemic. While the formulas derived in this paper appear accurate enough, it is to be noted that they can be improved. Only leading-order expressions have been used for the inner and outer solutions that apply to the ascending and receding phases of the outbreak, respectively. The inner solution can be improved by keeping the $O(1/(R_0 - 1))$ corrections in Eq. (22).

A Derivation of formulas (4) and (5)
From Eq. (10), $dn_i/dt$ vanishes when $n_i = 1 - 1/R_0 - n_r$. By substitution into (3), one immediately obtains (5). Next, let us enquire when the second time derivative of $n_i$ vanishes. Writing $n_i = n_i[n_r(t)]$, let us denote by a prime the derivative with respect to $n_r$, $n_i' = dn_i/dn_r$. Then, the vanishing of $d^2 n_i/dt^2$ yields a first relation between $n_i$, $n_i'$ and $n_i''$, Eq. (47). Dividing (10) by (11), one has $n_i' = R_0\,(1 - n_i - n_r) - 1$. Hence, $n_i'' = -R_0\,(n_i' + 1)$, which leads to Eq. (49). Equating (47) with (49) yields (4).
medRxiv preprint doi: https://doi.org/10.1101/2021.03.24.21254226; this version posted March 29, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.