Mathematical Relationship between Reproduction Number and Epidemic Curve of Daily Cases

The strict mathematical relationship between $R_t$ and the curve of daily cases $f(t)$ is shown. An up-to-date and statistically robust $R_t$ can be estimated from the curve of daily cases as soon as new cases are added to the curve. This is equivalent to estimating $R_t$ by averaging over all detected cases of infection, without any distortion induced by the difficulty of following and weighting trees of secondary cases from original ones, and without needing to wait for secondary cases to manifest infection. With this method, if $R_t$-scaled numbers are of interest, only the average duration of subjects' infectivity has to be estimated directly, and this can be done independently of linking secondary cases to primary ones. A new index, the Instantaneous Reproduction Number $R_{ist}$, is introduced, which is mathematically tied to $R_t$ and does not depend on the duration of infectivity of subjects. $R_{ist}$, $R_t$ and the doubling/halving time of the epidemic may be estimated by simple computations at the very time new daily cases are detected. Any smoothed curve of daily cases gives smooth $R_t$ and $R_{ist}$. No phase lag in the $R_t$ estimate is introduced by this method.

sub-population; the reaction of immune systems; the reactive behaviours of host and pathogen populations; the mobility pattern of vector particles, like infecting particles in turbulent air; etc).
This writing shows how the definition of $R_t$ is strictly tied to the curve of daily cases by mathematical equations. The two are essentially the same thing expressed in different words. $R_t$ is a sort of first derivative of the curve of daily cases with respect to time $t$.
The difficulty of directly estimating $R_t$ in a reliable way is the same as that of predicting the evolution of an epidemic in a reliable way. Indeed it is even much harder, since one has to face the further uncertainty of estimating and weighting the trees of secondary cases implied by this process. It is very similar to estimating the distance traveled by measuring acceleration with very inaccurate accelerometers, but very much harder and more error prone.
The excellent articles by Cori et al. [1] and Dietz [2] clearly show this difficulty.

Epidemiological definition of $R_t$
The epidemiological definition of $R_t$ states: $R_t$ is the number of secondary infections caused by a single case of disease during its period of infectivity in a completely susceptible population, on average (see [3], [1], [2] and many other sources).
According to this epidemiological definition, $R_t$ is analogous to the multiplier of the initial unit capital after 1 period, in a compound capitalization process.
This analogy allows the estimation of $R_t$ from the epidemic curve of daily cases by introducing the concept of Instantaneous Reproduction Number $R_{ist}$, similar to the instantaneous capitalization rate in actuarial mathematics.

Definition of $R_{ist}$
The epidemiological definition of $R_t$ indicates an exponential expansion. So does $R_0$, as the limit at the beginning of an epidemic in an uninfected population. On average, an infected individual will, by the end of the period of infectious capacity, have infected one new individual plus (or minus) some further number of new infected. This is equivalent to the amount of a compound capitalization at the interest rate $r$, where $R_t$ is the amount after period 1. In general:
• after period 1: $1 \cdot (1 + r) = R_t$;
• after period 2: $1 \cdot (1 + r) \cdot (1 + r) = 1 \cdot (1 + r)^2$;
• after period $p$: $1 \cdot (1 + r)^p$.
To obtain the interest rate $r$ that should be used for a continuous compound capitalization over $n$ fractions of a period, such that it gives the amount $R_t$ after 1 period, we can write:

$$R_t = \left(1 + \frac{r}{n}\right)^{n}$$

equivalent to:

$$R_t = \left[\left(1 + \frac{r}{n}\right)^{n/r}\right]^{r}$$

Passing to the limit for $n \to \infty$, and noting that $\lim_{n \to \infty} \left(1 + \frac{r}{n}\right)^{n/r} = e$, we get:

$$R_t = e^{r}$$

hence:

$$r = \ln(R_t) = R_{ist} \cdot 1$$

In other terms, $r$ is the exponent to be given to $e$ to obtain $R_t$ after a period of infectious duration equal to 1.
If we want to express $R_{ist}$ in a unit of time other than the dimensionless unit period, for example the days (or hours) in which we measure the duration $g_i$ of the infectivity period of an infectious subject and the progress of the epidemic, we can write:

$$R_{ist} = \frac{\ln(R_t)}{g_i}$$

In this way we have the parameter $R_{ist}$, which characterizes the exponential growth (as per the definition of $R_t$) generated at the point in time $t$ by the increase (or decrease) of daily cases.
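The capitalization analogy can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names and the 7-day infectious period are assumptions chosen only for illustration.

```python
import math

def compound(r: float, n: int) -> float:
    """Amount of a unit capital after 1 period, compounded in n fractions."""
    return (1.0 + r / n) ** n

def r_ist_from_rt(r_t: float, g_i: float) -> float:
    """R_ist per time unit, from R_t and infectious duration g_i (same unit)."""
    if r_t <= 0 or g_i <= 0:
        raise ValueError("r_t and g_i must be positive")
    return math.log(r_t) / g_i

def rt_from_r_ist(r_ist: float, g_i: float) -> float:
    """Inverse conversion: R_t = exp(R_ist * g_i)."""
    return math.exp(r_ist * g_i)

R_T = 1.5   # hypothetical reproduction number
G_I = 7.0   # hypothetical infectious duration, in days (an assumption)

r = math.log(R_T)  # continuous rate over the dimensionless unit period
for n in (1, 10, 1000, 10**6):
    # The compounded amount approaches R_T as n grows.
    print(n, compound(r, n))

r_ist = r_ist_from_rt(R_T, G_I)       # growth rate per day
print(r_ist, rt_from_r_ist(r_ist, G_I))
```

As the number of compounding fractions grows, the compounded amount converges to the chosen $R_t$, and the two conversion functions are inverses of each other.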
Connecting $R_{ist}$ to the epidemic curve of daily cases
Whenever an exponential function $y = e^{ax}$ is represented in logarithmic scale, $\ln(y) = ax$, it becomes a straight line. Its shape factor $a$ becomes the slope of the straight line (the angular coefficient).
If we represent the curve of the daily cases $f(t)$ in logarithmic scale, $h(t) = \ln(f(t))$, the slope of the tangent of $h(t)$ at point $t$ is the slope $R_{ist}$, corresponding to the exponential growth of the epidemiological definition of the effective reproduction number $R_t$, represented in logarithmic scale, at time $t$, and scaled in the time units of the curve of daily cases. But the slope of the tangent of $h(t)$ at point $t$ is also the first derivative of $h(t)$. That is:

$$R_{ist} = \frac{dh(t)}{dt} = \frac{d \ln(f(t))}{dt}$$

A different line of reasoning perhaps better illustrates the concept of estimating $R_t$ from epidemic curves.
Since all infected have been infected by someone, $R_t$ is basically the ratio of the daily cases at time $t + 1$ to the cases at time $t$, where 1 is the infecting period. Given a point $a$ on the curve of daily cases that precedes a point $b$, differentiating the curve of daily cases, expressed in logarithmic scale with base $e$, means taking the difference between the two values, spaced by a unit period of time tending to zero, that is: $\ln(b) - \ln(a)$. This expression is equivalent to $\ln(b/a)$, as those who have used slide rules [4] easily remember: $\ln(b) - \ln(a) = \ln(b/a)$.
By doing the inverse operation of extracting a logarithm from a number, i.e. raising the base of the logarithm to the power of the value of the logarithm in question, one obtains the ratio $b/a$ in the scale of daily cases of infection: $e^{\ln(b/a)} = b/a$. This ratio represents the rate of increase (if > 1), or decrease (if < 1), of the infections averaged over all the infections observed, including all the information on the overall average resistance to the spread of the infection that may have formed meanwhile, for whatever known or unknown reason it was formed. It also takes into properly weighted account all the overlaps of the infection trees defined by $R_t$, and the hosts' varying susceptibility.
Furthermore, the value obtained in this way is a very accurate value of the $R_t$ acting at the current time of $b$, that is, at the very moment in which the current value of the infected cases is known. The passage to the limit of a period that tends to the instant, implicit in the differentiation with respect to $t$, makes it possible to have a curve of the $R_t$ trend that is always updated in real time.
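The log-derivative described in this section can be approximated by finite differences. The sketch below is only an illustration under stated assumptions: the daily cases are a synthetic, noise-free exponential with a known rate of 0.05 per day, and `r_ist_series` is a hypothetical helper name, not something defined in the original text.

```python
import math

def r_ist_series(daily_cases, dt=1.0):
    """Finite-difference slope of ln(f(t)): central inside, one-sided at ends."""
    if len(daily_cases) < 2:
        raise ValueError("need at least two points")
    if any(c <= 0 for c in daily_cases):
        raise ValueError("daily cases must be positive to take logarithms")
    h = [math.log(c) for c in daily_cases]   # curve in logarithmic scale
    last = len(h) - 1
    slopes = []
    for i in range(len(h)):
        lo, hi = max(i - 1, 0), min(i + 1, last)
        slopes.append((h[hi] - h[lo]) / ((hi - lo) * dt))
    return slopes

TRUE_RATE = 0.05  # assumed growth rate per day of the synthetic curve
cases = [100.0 * math.exp(TRUE_RATE * t) for t in range(10)]
estimates = r_ist_series(cases)

# Difference of logarithms versus ratio of consecutive daily cases.
a, b = cases[3], cases[4]
ratio_from_logs = math.exp(math.log(b) - math.log(a))
print(estimates[0], ratio_from_logs, b / a)
```

On real data the curve would first need smoothing, since raw daily counts can be zero or noisy; the sketch deliberately rejects non-positive values instead of guessing how to treat them.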
According to the epidemiological definition of $R_t$, the following notable cases follow as a direct consequence:
• $R_{ist} > 0$: the daily cases increase and the epidemic is expanding; therefore the associated $e^{R_{ist} \cdot g_i} = R_t > 1$.
• $R_{ist} = 0$: the daily cases remain constant and the epidemic is stationary; therefore the associated $e^{R_{ist} \cdot g_i} = R_t = 1$. In this case the curve of daily cases has a minimum or a maximum; $R_{ist}$ crosses 0; $R_t$ crosses 1.
• $R_{ist} < 0$: the daily cases decrease and the epidemic is contracting; therefore the associated $e^{R_{ist} \cdot g_i} = R_t < 1$.
Since these cases derive from the epidemiological definition of $R_t$, they also serve as a criterion for checking whether an estimate of $R_t$ is correct. A value of $R_t$ that contradicts the epidemic curve indicates that either $R_t$ or the epidemic curve is wrong.
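The consistency criterion can be written as a small check. This is a hedged sketch: the function names, the phase labels and the tolerance `tol` are assumptions introduced here for illustration, and the only relations used are $R_t = e^{R_{ist} \cdot g_i}$ with $g_i > 0$.

```python
import math

def epidemic_phase(r_ist: float, tol: float = 1e-9) -> str:
    """Classify the epidemic from the sign of R_ist (tol is an assumed dead band)."""
    if r_ist > tol:
        return "expanding"
    if r_ist < -tol:
        return "contracting"
    return "stationary"

def rt_agrees_with_curve(r_t: float, r_ist_from_curve: float,
                         tol: float = 1e-9) -> bool:
    """True when an externally estimated R_t and the log-slope of the
    daily-case curve indicate the same phase. Because g_i > 0, the sign
    of ln(R_t) equals the sign of R_ist, so g_i itself is not needed."""
    if r_t <= 0:
        raise ValueError("r_t must be positive")
    return epidemic_phase(math.log(r_t), tol) == epidemic_phase(r_ist_from_curve, tol)

print(epidemic_phase(0.03))               # rising daily cases
print(rt_agrees_with_curve(1.2, 0.03))    # R_t > 1 with a rising curve
print(rt_agrees_with_curve(1.2, -0.03))   # R_t > 1 with a falling curve
```

A `False` result does not say which of the two inputs is wrong, only that they cannot both be right.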

Summary of conversion formulas
The curve of daily cases $f(t)$ expressed in logarithmic scale with base $e$ is given by:

$$h(t) = \ln(f(t))$$

$R_{ist}$ is given by the first derivative (numerically or analytically determined) of any smoothed curve of daily cases, given in logarithmic scale with base $e$:

$$R_{ist} = \frac{dh(t)}{dt}$$

Please notice that if the smoothing procedure applied to the curve of daily cases introduces any phase lag, as happens with moving averages or FIR/IIR filters, the same phase lag appears in the estimates of $R_{ist}$ and $R_t$. If instead some form of static averaging is used, such as a least squares fitting procedure, no phase lag is introduced.

$R_t$ is given by:

$$R_t = e^{R_{ist} \cdot g_i}$$

$R_{ist}$ is also equivalent to:

$$R_{ist} = \frac{\ln(R_t)}{g_i}$$

The doubling or halving time of infection $g_{d \lor h}$ is given by imposing 2.0 as $R_t$ and computing the number of resulting days (negative numbers represent halving time):

$$g_{d \lor h} = \frac{\ln(2)}{R_{ist}}$$
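The conversion formulas and the remark on phase lag can be tried on synthetic data. The sketch below makes several assumptions that are not in the original text: a bell-shaped daily curve peaking at day 50, a 7-day window, a 7-day infectious period `G_I`, and helper names invented for the example. It compares a trailing (causal) moving average with a centered one by locating the peak, which is where $R_{ist}$ crosses 0.

```python
import math

def trailing_mean(xs, w):
    """Causal moving average over the last w samples (shorter at the start)."""
    out = []
    for i in range(len(xs)):
        window = xs[max(0, i - w + 1): i + 1]
        out.append(sum(window) / len(window))
    return out

def centered_mean(xs, w):
    """Moving average centered on each sample (shorter at the edges)."""
    half = w // 2
    out = []
    for i in range(len(xs)):
        window = xs[max(0, i - half): min(len(xs), i + half + 1)]
        out.append(sum(window) / len(window))
    return out

def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)

def log_slope(f, i):
    """Central difference of ln(f) at interior index i, per day."""
    return (math.log(f[i + 1]) - math.log(f[i - 1])) / 2.0

def doubling_time(r_ist):
    """ln(2) / R_ist; negative values are halving times."""
    return math.inf if r_ist == 0 else math.log(2.0) / r_ist

G_I = 7.0  # hypothetical infectious duration in days (an assumption)
daily = [1000.0 * math.exp(-((t - 50) ** 2) / 200.0) for t in range(101)]

peak_raw = argmax(daily)
peak_centered = argmax(centered_mean(daily, 7))
peak_trailing = argmax(trailing_mean(daily, 7))
print(peak_raw, peak_centered, peak_trailing)

day = 30
r_ist = log_slope(daily, day)      # slope of ln(f) at day 30
r_t = math.exp(r_ist * G_I)        # R_t = exp(R_ist * g_i)
print(r_ist, r_t, doubling_time(r_ist), doubling_time(log_slope(daily, 70)))
```

With these assumed inputs the trailing average moves the peak three days later than the raw and centered curves, which is the lag that would then be inherited by $R_{ist}$ and $R_t$.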

Some generable charts
The following charts show how $R_t$ may be calculated from a fitting of the curve of cumulative cases. Obviously, the curve of daily cases is the first derivative of cumulative cases. The second derivative is the one obtained by differentiating the curve of daily cases taken in logarithmic scale to generate $R_{ist}$. Taken together, they form a sort of derivative of order 2 of the curve of cumulative cases.
The fitting is primarily done on cumulative cases because they automatically compensate for some kinds of errors (for example, a case missed one day may be detected in the following days, etc.). Model and fitting techniques [7] used for the following figures are outside the scope of this writing. Here the model is simply used as a source of a smoothed daily data set. The other formulas used to generate the following charts are summarized in the section above. The data source used for this fitting is the official COVID-19 one for Italy [6].
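The chain from cumulative cases to daily cases to $R_{ist}$ can be illustrated with a closed-form curve. The sketch below assumes a single logistic curve for cumulative cases purely as a stand-in for a smoothed data set; it is not the model of [7], and the parameters `K`, `k`, `t0` are hypothetical values. For a logistic curve the log-derivative of daily cases has the closed form $-k \tanh(k (t - t_0) / 2)$, which the code checks against a numerical derivative.

```python
import math

# Hypothetical logistic parameters (assumptions, not fitted to any data).
K = 100000.0   # final size of cumulative cases
k = 0.15       # steepness, per day
t0 = 40.0      # day of the inflection point (peak of daily cases)

def cumulative(t: float) -> float:
    """Assumed smoothed cumulative cases: logistic curve."""
    return K / (1.0 + math.exp(-k * (t - t0)))

def daily(t: float) -> float:
    """First derivative of the logistic curve: daily cases."""
    c = cumulative(t)
    return k * c * (1.0 - c / K)

def r_ist(t: float) -> float:
    """Closed-form derivative of ln(daily(t)) for the logistic curve."""
    return -k * math.tanh(k * (t - t0) / 2.0)

def r_ist_numeric(t: float, h: float = 1e-4) -> float:
    """Central finite difference of ln(daily), used as a cross-check."""
    return (math.log(daily(t + h)) - math.log(daily(t - h))) / (2.0 * h)

for t in (20.0, 40.0, 60.0):
    print(t, daily(t), r_ist(t), r_ist_numeric(t))
```

Under this assumption $R_{ist}$ is positive before the peak of daily cases, zero at the peak and negative afterwards, and it stays bounded between $-k$ and $+k$.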

Conclusion
The reproduction number $R_t$ and the epidemic curve of daily cases are essentially the same thing expressed in different words. The conversion formulas between the two are shown.
The strict mathematical relationship between the reproduction number $R_t$ and the epidemic curve of daily cases makes it possible to replace complex and error-prone statistical estimations with deterministic computations from the count of daily cases. It is still necessary to directly estimate the average duration of the infectivity of subjects, but this can be done independently of linking secondary cases to primary ones. Moreover, the resulting estimate of $R_t$ is obtained by averaging over all detected cases, instead of a possibly biased sample of them.
The method also completely eliminates, by mathematics, the uncertainty introduced by the process of identifying secondary cases from primary ones along overlapping trees of infectious relationships. Since all detected daily cases are infected by someone, the resulting curve of daily cases is the perfectly balanced action of $R_t$, and it is detected in full. This method automatically incorporates that balance.
Also, the $R_t$ estimate performed by this method is always up to date, as soon as new cases are added to the epidemic curve, without the need to wait for secondary cases to manifest after the period of infection.
Just a glance at the dispersion of an ample set of daily data around a good fit of these data makes it easy to imagine how difficult and unreliable any attempt would be to directly estimate a trend of the epidemic from small samples of cases, relying on considerations of the spread of these samples over overlapping trees of secondary cases. But this is exactly what the epidemiological definition of $R_t$ asks us to do.
A more general conclusion may be drawn by observing how different the evolution of an epidemic of the same pathogen is in different regions or times. The ample difference between the $R_t$ curves computed by this method for different zones [8] suggests that the dynamics of an epidemic follow unpredictable and chaotic behavior, very much like weather.
We are used to thinking of populations involved in an epidemic as an isotropic material, like steel, which behaves equally in all directions with respect to stress and strain. SIR models and the epidemiological definition of $R_t$ seem to implicitly assume this isotropy.
Perhaps an epidemic may be better depicted as acting on many different fabrics of relationships entangled together. A burst of infections occurs when two or more entangled fabrics mix and new connections merge them into a new, more extended fabric. Some of these entangled fabrics may reach a stable infectious condition that eventually becomes saturated, and others may not.
If this is a plausible landscape of a population's infection, not every link in this entanglement of networks has the same infection capacity and not all nodes of these networks are isotropically connected.
In other words, there may be several networks that may have poor connections with each other, while having strong connection among the members of each network. For example, the network of families with children that go to the same school may have strong links between families of teachers and classmates, but may have weak connections with other unrelated networks of parents-children-teachers. Some of these networks may saturate eventually, while others may not have even been infected. The same thing happens with other types of relational networks. This is a very anisotropic environment.
This landscape shows a very challenging non linear object to investigate. Maybe, it has some emerging regularities at the macroscopic level, like sequences of overlapping sigmoidal shapes in the curve of cumulative cases, just as a complex sound emerges from the combination of elementary sinusoids. World [5] and regional [8] COVID-19 data of cumulative cases show everywhere overlapping sigmoids.
Maybe sophisticated tools, like the emerging field of "dynamical networks", will provide interesting insights. However, building reliable models based on the dynamics of networks is very difficult. The risk is always that the model needs a one-to-one mapping to the unknown phenomenon being modelled, which leads to the conclusion that the proof may only be in the pudding.

medRxiv preprint (which was not certified by peer review), this version posted February 12, 2021. doi: https://doi.org/10.1101/2021.01.24.21250405. The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.