Extrapolation of Infection Data for the CoVid-19 Virus in 21 Countries and States and Estimate of the Efficiency of Lock Down.

Predictions about the further development of the Corona pandemic are of great public interest, but many approaches demand a large number of country-specific parameters and are not easily transferable. A special interest of simulations of the pandemic is to trace the effect of political measures for reducing the virus spread, since these measures have had an enormous impact on the economy and daily life. Here a simple yet powerful algorithm is introduced for fitting the infection numbers by simple analytic functions. In this way, the increase of the case numbers in periods with different regulations can be distinguished, and by extrapolating the fit functions, a forecast of the maximum numbers and time scales is possible. The effect of restraints such as lock down is demonstrated by comparing the resulting infection history with the likely unconstrained virus spread, and it is shown that a delay of 1-4 weeks before imposing measures aiming at social distancing could have led to a complete infection of the respective populations. The approach is easily transferable to many different states. Here data from six E.U. countries, the UK, Russia, two Asian countries, the USA and ten states inside the USA with significant case numbers are analyzed, and striking qualitative similarities are found.


Issue
At the beginning of the pandemic in January, February, and early March 2020, the total infection numbers rose dramatically and grew nearly exponentially worldwide, and it seemed possible that a significant part of the population would finally be infected. By attaining herd immunity in this way, the virus should finally disappear.
It turned out that the mortality due to the virus was rather high. In the worst case, a death rate of e.g. 2% during infection of 2/3 of the population would result, e.g. in Germany, in 1.1 million victims before herd immunity was reached. At the same time, many health systems were overburdened and hospitals could no longer provide adequate treatment to severely affected persons. As a consequence, herd immunity by rapid spreading of the virus infection was not an acceptable option, and many countries shut down public life, schools and universities, shops and factories, mostly in March. These measures extended over some weeks and slowed down infection rates, but caused enormous damage to the economy and great social problems.
In the next stage, pressure grew in the societies to release constraints step by step from May on, but parallel to that, concern is growing that this might lead to a second wave of infection spreading and that the pandemic might last for a long time and influence our daily life for years.
The details of shutdown and gradual release were subject to political decisions and to the specific situations in each state, and different ways have been followed in countries all over the world. It is thus of great interest to trace the virus spreading as a result of these measures quantitatively and within a simple model, which makes data from different countries easily comparable. Obviously, it is also of great interest to have some forecast of the duration of the corona crisis.

Calculation methods
Beyond comparing data with an exponential increase, there are three approaches for understanding and predicting the development of case numbers: Compartment models describe the kinetics of infection spreading by dividing the population into different groups of non-infected (N), susceptible (S), exposed (E), carrier (C), infected (I), recovered (R) and dead (D). The definition of the compartments is somewhat arbitrary, and different models are in use such as SIR (1) (2), SIRD (3), SECIR (4). Even small models need a fairly large number of parameters (4). It is demanding to define these parameters with sufficient precision, and sophisticated statistical approaches are necessary (2) (1). Compartment models were applied to follow daily infection rates $\Delta N(t)/1\,\mathrm{d}$. For several reasons, these data are strongly fluctuating, and compartment models may have to introduce additional assumptions to fit this fluctuation (1).
Sophisticated statistical models are used in epidemiology for evaluating the reproduction rate (5), but a recent application to data from European states showed that the data fluctuation also affects the model predictions (6).
Neural networks are a method of fitting data very precisely [ZHU], but the problem of this approach is that the number of parameters may be high. These parameters are hidden in the weights of the network, and their values usually do not have a well-defined physical meaning. It may thus be difficult to fit data from different countries and compare the parameters, and also to distinguish between noise and significant data. The authors of (7) conclude that this approach does not show advantages over analytic curve fitting.
A fit of the infection data by an analytic function is tempting for three reasons: (i) An analytical function can be chosen which depends on only a few parameters with a physical meaning. The fits provided here aim at determining two values: the predicted number of total infections $N(t \to \infty)$ at the end of the pandemic, when the infection rate has come down again, and the time scale on which this will happen.
(ii) A fit of the infection data shall make it possible to distinguish between periods with different regulations such as before and during lock down, and the later release of shutdowns and social distancing. This is very difficult with fits trying to follow the strongly fluctuating daily infection rates $\Delta N(t)/1\,\mathrm{d}$.
Walter Langel
Mathematically, the total number of confirmed cases, $N(t)$, is in turn the integral of $\Delta N(t)/1\,\mathrm{d}$ and thus behaves much more smoothly. It is, however, nearly impossible to distinguish between different exponential increase functions and the deviations from them by plotting $N(t)$ on a linear y-scale.
Analytical functions are not restricted to fitting the infection rate itself. Here the decadic logarithm $\log_{10}(N(t))$ of the total infection numbers is fitted by a smooth analytical function of time (dashed lines in Figure 1). Exponential behavior and deviations from it are easily interpreted. From the fit of $\log_{10}(N(t))$, calculated curves for $N(t)$ and $\Delta N(t)/1\,\mathrm{d}$ are obtained, which can be interpreted directly, unlike the original zigzag data for the infection rate.
(iii) Due to the transparent use of a small number of parameters, analytical functions are easily transferable between data sets from different countries and entities, showing convergence as well as differences. In this work the development in 21 countries and states is compared and a systematic classification is presented. The choice comprises two Asian countries, which very quickly suppressed the virus spreading, six countries in the E.U., which were considered to be in similar conditions before the crisis broke out but handled it quite differently, and Russia, the U.K. and the USA. In (8), ten states inside the USA are mentioned which have high numbers of cases and thus could provide statistically significant data. According to (9), the data for states inside the USA indeed strongly diverge. Thus, data from the USA were not only treated as a whole, but the numbers for ten states were also analyzed separately.
The following results are used here for characterizing the virus spread:
-The maximum number of infected cases $N(t \to \infty)$, which is attained asymptotically by the fit function, characterizes the efficiency of the lock down.
-For comparison between different countries, the data for $N(t \to \infty)$ are also normalized to 100,000 inhabitants of the respective country/state, yielding the "cumulative incidence". The term "incidence" is used here for describing normalization to 100,000 inhabitants, and "cumulative" means summed up till the end.
-The day at which the infection rate has its maximum value.
The length of the pandemic is characterized by two numbers:
-The fit parameter $\Delta t_{10\%}$ gives the time in days until the infection rate has dropped from its maximum value to 10% of it.
-Another instructive result is the date at which the infection rate per day has dropped so far that the incidence per day is smaller than one, independently of the previous maximum value. For Germany this corresponds e.g. to 830 new cases per day in 83 million inhabitants, which means that the pandemic is not yet over, but controllable.
-A new time parameter $\Delta t_h$ is introduced here with the following meaning: The lock down (fortunately) stopped the exponential increase before the full populations in the respective states were infected, and the case numbers then developed on a much lower level. A delay of the lock down would have led to higher infection numbers, and finally the whole population would have been infected, i.e. lock down would have had no more effect. Here, the time between the actual lock down and the date at which any measure would have lost its effect is estimated. It is seen that this time was only a few weeks long, and this figure demonstrates the importance of quick action for stopping the unrestricted spread of the virus.
medRxiv preprint, doi: https://doi.org/10.1101/2020.06.17.20134254; this version posted June 19, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
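The normalization to 100,000 inhabitants used in the list above can be illustrated with a short sketch. The function name `incidence_per_100k` and the value `hypothetical_total` are assumptions made for illustration; only the German figures of 830 cases per day and 83 million inhabitants come from the text.

```python
def incidence_per_100k(cases: float, population: float) -> float:
    """Normalize a case count to 100,000 inhabitants."""
    return cases / population * 100_000


# Figures quoted in the text for Germany:
# 830 new cases per day among 83 million inhabitants.
daily_incidence = incidence_per_100k(830, 83_000_000)
print(f"incidence per day: {daily_incidence:.2f}")

# Hypothetical extrapolated total N(t -> infinity), for illustration only.
hypothetical_total = 200_000
cumulative_incidence = incidence_per_100k(hypothetical_total, 83_000_000)
print(f"cumulative incidence: {cumulative_incidence:.0f}")
```

The same function serves for the daily incidence (new cases per day) and for the cumulative incidence (extrapolated total cases); only the numerator differs.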
At least two approaches have been presented so far. A power law approach has been presented. In March 2020, at a time when estimates of the length of the shutdown and of the pandemic were not yet provided elsewhere, a method was presented (11) which approximated the natural logarithm of the total case numbers by a logistic function. Very good fits were obtained, but as the number of data points grew, it turned out that the method did not reproduce obvious breaks between different periods sufficiently well. The present paper is thus based on the genuine logistic function (12), again with only three parameters. Even though the earlier predictions were based on much smaller data sets and used a different fit function, the main conclusions are still consistent with the present calculation (Table 1). After reevaluating $\Delta t_{10\%}$ for the old fit, it turned out that the time scale was already fairly precise. The final number of total cases came out well for Germany, but was overestimated for Italy. Still, it was already obvious from these fits that the maximum case numbers were orders of magnitude lower than the total population.

Method
For eleven countries, the total case numbers were taken from (13), (14), and the infection rates were calculated as differences between the total numbers of two consecutive days. For the states of the USA, the infection rates per day were taken from (15), and the total numbers were obtained by summing up to the respective date.
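The two conversions described above (totals to daily rates, and daily rates to totals) can be sketched as follows. The function names and the sample numbers are hypothetical and serve only to show that the two operations are inverse to each other.

```python
def daily_rates(totals):
    """New cases per day as differences between totals of consecutive days."""
    return [later - earlier for earlier, later in zip(totals, totals[1:])]


def running_totals(rates, start=0):
    """Total cases obtained by summing the daily rates up to each date."""
    totals, total = [], start
    for rate in rates:
        total += rate
        totals.append(total)
    return totals


# Hypothetical total case numbers on four consecutive days.
totals = [100, 130, 170, 230]
rates = daily_rates(totals)
print(rates)

# Summing the rates again, starting from the first total, restores the totals.
restored = running_totals(rates, start=totals[0])
print(restored)
```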
A well-known formula for virus spread in a community provides the logistic function (12):

$$N(t) = \frac{N(t \to \infty)}{1 + \left(\frac{N(t \to \infty)}{N(t=0)} - 1\right) e^{-t/\tau}} \qquad (1)$$

This formula only contains three parameters: the starting value $N(t=0)$ at $t = 0$, the final value $N(t \to \infty)$, which is attained at long times, and a time constant $\tau$, which describes the transition from the exponential increase to saturation. This time parameter has a simple meaning, since $\Delta t_{10\%} = 3.7 \cdot \tau$ is the time in which the infection rate in a single fit decreases from its maximum to 10% of the maximum value.
The function correctly reproduces the limiting asymptotic cases:
-At $t = 0$ it yields the starting value $N(t=0)$. This starting point cannot be taken directly from some scattering initial data but is used as a fit parameter. Its actual value depends on the time $t = 0$ at which the data set starts.
-For short times, while $N(t) \ll N(t \to \infty)$, the formula reproduces a nearly exponential increase,

$$N(t) \approx N(t=0)\, e^{t/\tau},$$

with the time for increasing by a factor of two given as

$$t_2 = \ln(2) \cdot \tau.$$

-At long times, for $t \to \infty$, the function approaches asymptotically a maximum value, $N(t) \approx N(t \to \infty)$.
This saturation behavior is generated by the exponential function in the first denominator in eq. (1) approaching zero for long times $t$. The length of the transition range from the initial exponential increase to final saturation is described by the third parameter $\tau$, which is characteristic for the rise time of the number of infections during the initial exponential increase and also for the saturation behavior.
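The limiting cases described above can be checked numerically. The following is a minimal sketch, assuming the standard three-parameter logistic form N(t) = N(t→∞) / (1 + (N(t→∞)/N(t=0) − 1)·exp(−t/τ)); the function names and the parameter values are hypothetical and are not taken from the fits in the paper.

```python
import math


def logistic(t, n0, n_inf, tau):
    """Logistic function with N(0) = n0 and N(t -> infinity) = n_inf."""
    return n_inf / (1.0 + (n_inf / n0 - 1.0) * math.exp(-t / tau))


def doubling_time(tau):
    """Doubling time during the initial exponential phase."""
    return math.log(2.0) * tau


def delta_t_10_percent(tau):
    """Time from the maximum of dN/dt until it has dropped to 10% of it.

    For this form, dN/dt is proportional to 1/cosh^2(x/2) with
    x = (t - t_peak)/tau, so the factor is 2*acosh(sqrt(10)), about 3.64,
    close to the value 3.7 quoted in the text.
    """
    return 2.0 * math.acosh(math.sqrt(10.0)) * tau


# Hypothetical parameters, for illustration only.
n0, n_inf, tau = 100.0, 1_000_000.0, 5.0

print(logistic(0.0, n0, n_inf, tau))      # starting value n0
print(logistic(5.0, n0, n_inf, tau))      # close to n0 * e during early growth
print(logistic(1000.0, n0, n_inf, tau))   # saturates at n_inf
print(doubling_time(tau), delta_t_10_percent(tau))
```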
Usually formula (1) is applied with the saturation value given by the size of the population. Here, $N(t \to \infty)$ is now a fit parameter having values of only a small fraction of the population, such as several tens or hundreds of thousands of people.
For the fit of equation (1) to the data points, the standard solver in Microsoft Excel 2016 was applied to the logarithm of the number $N(t)$ of confirmed infections as saved in steps of one day, and the least squares error with respect to equation (1) was minimized by varying the three parameters $N(t=0)$, $N(t \to \infty)$, and $\tau$. The data were weighted by $\sqrt{N(t)}$ to account for poor statistics of small numbers.
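The quantity minimized in such a fit can be sketched as follows. This is not the Excel workbook used for the paper: it assumes the standard three-parameter logistic form for equation (1), and it assumes that the weight $\sqrt{N(t)}$ multiplies each squared residual of the logarithm. The data are synthetic and generated from hypothetical parameters.

```python
import math


def logistic(t, n0, n_inf, tau):
    """Assumed form of eq. (1): logistic growth from n0 to n_inf."""
    return n_inf / (1.0 + (n_inf / n0 - 1.0) * math.exp(-t / tau))


def weighted_log_error(params, days, cases):
    """Weighted sum of squared residuals of log10(cases) against the model."""
    n0, n_inf, tau = params
    total = 0.0
    for t, n in zip(days, cases):
        residual = math.log10(n) - math.log10(logistic(t, n0, n_inf, tau))
        total += math.sqrt(n) * residual ** 2  # assumed weighting by sqrt(N)
    return total


# Synthetic, noise-free data from hypothetical parameters (not from the paper).
true_params = (50.0, 150_000.0, 10.0)
days = list(range(0, 120))
cases = [logistic(t, *true_params) for t in days]

# Scan the time constant with the other two parameters held fixed.
candidates = [8.0, 9.0, 10.0, 11.0, 12.0]
errors = {tau: weighted_log_error((50.0, 150_000.0, tau), days, cases)
          for tau in candidates}
best_tau = min(errors, key=errors.get)
print(best_tau)
```

In practice all three parameters would be varied together by a general-purpose minimizer (the Excel solver in the paper, or for example `scipy.optimize.minimize`); the scan over one parameter only shows that the error function has its minimum at the generating value.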

General features
There are important similarities between the fits for all countries and states considered here (Figure 1).
In the log-linear plots, a sharp concave bending is seen somewhere between late March and early April.
This effect is obvious without any fitting model, and one could essentially even draw straight lines with different slopes through the data points before and after this turning point. This bending is ascribed to the effects of social distancing, which interrupted the originally exponential increase of infection numbers at around the same time in many countries. For some plots in Figure 1 (Germany, California, Michigan, New York, Russia, UK), vertical lines indicate the known dates of lock downs. In Germany, the UK, Michigan and New York, the bend can be assigned to the respective lock down, since the observed total case numbers start to fall below the orange fit shortly afterwards. In California and Russia, the situation is less clear, since strict lock downs took place already on March 19th (16) and 15th (17), respectively.
The fitting periods were handled as follows:
(1) Data during the time before intervention rose very quickly in all states. Orange lines were fitted to only a few points from the first days in this time interval, and an essentially linear increase is seen on the logarithmic scale. This means that the increase was still exponential at that time, and probably the whole population would have been infected within a short time without precautions. In the range of exponential increase, the slope $m$ in the log-linear plot, the initial doubling time $t_2$, and the parameter $\tau$ of the fit function are related by $m = 1/(\ln(10) \cdot \tau)$ and $t_2 = \ln(2) \cdot \tau$ (7). For $t_2$, small values between only two and six days were found (Table 2), and this has largely contributed to the fear which the appearance of CoVid-19 has raised. The final saturation value for the case number without intervention cannot be extracted reliably from this fit. Here an estimate of 20% of the respective population is used. This is significantly below 100%, since there is probably a high number of unreported cases, and herd immunity is attained at an infection of significantly less than the full population (18). As the saturation value is not known with any precision, the horizontal parts of the orange lines are considered as guides to the eye. On the other hand, it is of little relevance to speculate about the saturation value of the infection numbers without retarding measures.
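The relation between the slope in the log-linear plot, the time constant and the doubling time can be sketched with an ordinary least-squares line through the logarithm of a few early data points. The function names and the synthetic data (cases doubling every three days) are assumptions for illustration and are not values from Table 2.

```python
import math


def fit_initial_slope(days, cases):
    """Least-squares slope m of log10(cases) versus time, per day."""
    logs = [math.log10(n) for n in cases]
    mean_t = sum(days) / len(days)
    mean_y = sum(logs) / len(logs)
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    var = sum((t - mean_t) ** 2 for t in days)
    return cov / var


def tau_from_slope(m):
    """Time constant from the slope: m = 1 / (ln(10) * tau)."""
    return 1.0 / (math.log(10.0) * m)


def doubling_time(tau):
    """Doubling time: t2 = ln(2) * tau."""
    return math.log(2.0) * tau


# Synthetic early-phase data: 100 cases on day 0, doubling every 3 days.
days = list(range(6))
cases = [100 * 2 ** (t / 3) for t in days]

m = fit_initial_slope(days, cases)
tau = tau_from_slope(m)
t2 = doubling_time(tau)
print(f"slope {m:.4f} per day, tau {tau:.2f} days, doubling time {t2:.2f} days")
```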
(2) In all cases considered here, significant deviations from the initial exponential increases have been found already after a few days.
The full green lines (Figure 1) describe the increase of total cases after the lock down and release from the beginning of May up to June 6th. At first glance they look like straight lines with a much smaller slope than the orange lines in their exponential regime. This would still correspond to an exponential increase, which is just slower than before lock down.
Closer inspection shows that the green lines are slightly concave and thus increase more slowly than exponentially. Here, from the deviation of the case numbers from a strictly exponential increase during lock down and especially for the present release period, well-defined fits are obtained. They permit quantifying the reactions to limitations such as social distancing. The fits yield predictions of the end of the infection period and of the total case numbers $N(t \to \infty)$ (Table 2). These fits with a single broad logistic function (green in Figure 1), with maxima between the end of April and the end of July and spreading over a somewhat longer time, are denoted as F1. The time $\Delta t_{10\%}$ between the maximum of the infection rate and its decrease to 10% of the maximum is as long as 60-100 days.
The vertical positions of the green fits directly depend on the dates when interventions took place. Lock down at earlier times resulted in a lower lying line and a smaller maximum number of total cases, whereas imposing the lock down later led to an increase of $N(t \to \infty)$. After waiting another time $\Delta t_h$, the green line would have converged to the horizontal orange line, meaning full infection. As was explained above, herd immunity is estimated to be reached when the total number of reported cases attains 20% of the population in the respective country. We thus obtain [...]. The doubling times $t_2$ of about 1.5-4 days for the exponential increases before lock down were taken from the slopes of the full orange lines in Figure 1. The estimate for $\Delta t_h$ was made as follows, e.g. for France [...]. In all cases, $\Delta t_h$ is only between one and less than four weeks (Table 2). This demonstrates how important it was to rapidly decide on appropriate measures in the respective countries.
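A minimal sketch of how such an estimate could be computed is given below. It rests on an assumption that is one plausible reading of the argument above and not a formula quoted from the paper: each doubling time of delay before lock down doubles the plateau $N(t \to \infty)$, so the time until the plateau reaches 20% of the population is the doubling time multiplied by the base-2 logarithm of the ratio of the two levels. All input values are hypothetical.

```python
import math


def delta_t_h(doubling_time, n_inf, population, saturation_fraction=0.2):
    """Delay (in days) after which the plateau would reach the saturation level.

    Assumes that the plateau n_inf grows like 2 ** (delay / doubling_time)
    when the lock down is postponed (an assumption, see the lead-in).
    """
    saturation = saturation_fraction * population
    return doubling_time * math.log2(saturation / n_inf)


# Hypothetical inputs: doubling time 3 days, extrapolated plateau of
# 200,000 cases, population of 83 million.
delay = delta_t_h(doubling_time=3.0, n_inf=200_000, population=83_000_000)
print(f"{delay:.1f} days")
```

With these inputs the delay comes out at roughly 19 days, which lies within the range of one to four weeks stated in the text.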
Eq. (1) was fitted to the logarithm of the total numbers, but additional information is obtained from the linear plots (Figure 1, right column). The full green lines fit major parts of the data points for $N(t)$, mostly since the beginning of May, and are equivalent to the dashed green lines for the new infections $\Delta N(t)/1\,\mathrm{d}$, which match especially the infection rates in this time interval (light blue lines in Figure 1). Unlike with the directly observed noisy data, one obtains a time scale from the fit and sees clearly when the infections rise to a maximum and decrease again.
In many plots, the intermediate data points between the initial free virus spread and the slower infection rates after gradual release are not satisfactorily fitted by the green lines. The data from many countries in this intermediate range in the log-linear plots are well fitted by a third set of parameters (dotted blue lines), which describes the curvature of the data between steep initial growth and slower increase after the lock down. These dotted lines predict lower total case numbers and a shorter time scale than the green lines. The difference between the fit types F1, F2 and F3 is how much this long time fit (green lines in Figure 1) contributes relative to the short time fit (dotted blue lines). The following parameters characterize the time scale (Table 2).
(i) The date of maximum infection rate, which was already passed in many countries at the beginning of April, in China even in February.
(ii) The date at which the infection rate has dropped below 1 case per 100,000 per day. It is seen that the infections will, at least in some states (Florida, North Carolina), persist until the end of 2020.
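Both dates can be derived from the parameters of a single logistic fit. The sketch below assumes the standard three-parameter logistic form for equation (1), whose derivative peaks at $t = \tau \ln(N(t \to \infty)/N(t=0) - 1)$; the function names and the parameter values are hypothetical and not results from Table 2.

```python
import math


def peak_day(n0, n_inf, tau):
    """Day of maximum infection rate for the assumed logistic form."""
    return tau * math.log(n_inf / n0 - 1.0)


def infection_rate(t, n0, n_inf, tau):
    """Derivative dN/dt of the logistic function, in cases per day."""
    u = (n_inf / n0 - 1.0) * math.exp(-t / tau)
    return n_inf / tau * u / (1.0 + u) ** 2


def first_day_below(threshold_per_100k, n0, n_inf, tau, population,
                    max_days=2000):
    """First whole day after the peak with daily incidence below threshold."""
    start = math.ceil(peak_day(n0, n_inf, tau))
    for day in range(start, max_days):
        rate = infection_rate(day, n0, n_inf, tau)
        if rate / population * 100_000 < threshold_per_100k:
            return day
    return None  # threshold not reached within max_days


# Hypothetical fit parameters and population, for illustration only.
n0, n_inf, tau, population = 100.0, 200_000.0, 14.0, 83_000_000

t_peak = peak_day(n0, n_inf, tau)
day_below_one = first_day_below(1.0, n0, n_inf, tau, population)
print(f"peak on day {t_peak:.1f}, incidence per day < 1 from day {day_below_one}")
```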

Results for individual countries and states
As this paper mainly aims at demonstrating the value of a simple analytical fit method, no deeper discussion of individual countries will be given, and only a few remarks are made.

Asia
As is well known, the Asian countries China and South Korea have extremely low infection numbers as compared to their large populations, and the extrapolations result in small cumulative incidences of six and 21 cases, respectively. Both countries obviously have reacted strictly to the pandemic, and the fit types are F2 with a strong short time component. Neither of them ever reported incidences per day above one.
In the case of China it has to be considered that localized strong outbreaks are normalized to a very large population. South Korea obviously was very quick in reacting to the pandemic, which is demonstrated by the longest time found here for any country, $\Delta t_h = 25$ days (Table 2).

European Union
Even though the EU countries were closely connected before the pandemic and reacted on a similar time scale, the development in Europe diverged. For the four countries France, Germany, Italy, and Spain, similar parameters were found (Table 2), and the fit type was F2, reflecting a strong impact of the lock down measures in these countries. In Italy the long time contribution to the case number is rather high, even though the country had at least as strict contact bans as the other three. The extrapolated cumulative incidence ranges from 228 to 532, i.e. it is higher in Spain than in France by a factor of two. The incidence per day dropped below one in May, which makes the opening of these countries for tourism plausible.
The severe problems which Spain, France and Italy experienced during the initial phase of the pandemic were not due to high average cumulative incidences, but to localized very high infection numbers and initial problems in the local health systems.
In the two further EU countries included in this study, Poland and Sweden, the impact of the measures was less obvious, and the fit type F3 indicates an important long time contribution to the cumulative incidence. The time scales and absolute case numbers are quite similar in both countries.
$\Delta t_{10\%}$ is as long as 120 and 141 days, as compared to 40-60 days in the four EU countries mentioned above, and the infection rate will drop to 10% in October rather than in May. As Sweden is much smaller than Poland, its extrapolated cumulative incidence (840) is very high, whereas it is very low in Poland (140), and the barrier of 1 new case per 100,000 inhabitants per day will be reached much earlier in Poland than in Sweden (end of July vs. October). It may be speculated whether the liberal contact ban policy in Sweden and the early strict closing of the borders in Poland, respectively, have caused these diverging results.
Due to the long time scale, both countries are still at the beginning of the pandemic, and predictions have to be taken with care. Especially in Sweden, the high case numbers from June 1st to 11th are also compatible with a new uncontrolled exponential rise.

Russia and UK
The number of total cases $N(t \to \infty)$ may reach 800,000 in Russia vs. more than 300,000 in the UK, but the extrapolated cumulative incidences will then be in the same range around 500 for both countries (Table 2). Both infection rates have passed their maxima and have started to decrease, even if this is partially hidden by fluctuations. Both data sets have a short time contribution, which is more pronounced for Russia (F2) than for the UK (F3), and strong long time contributions have to be expected. Apart from that, the time histories are very different: The total case numbers in the UK showed a clear reaction to the lock down starting March 23rd, which is still at least in part maintained (Figure 1). The number of new infections per day will decrease rather fast, and the incidence of 1 might be reached already at the beginning of June in the UK, but only in August in Russia.
The log-linear plot for Russia shows a significant curvature, suggesting that very early on the number of confirmed infection cases was increasing significantly less than exponentially. A continuously concave dependence of $\log_{10}(N(t))$ on $t$ is observed (Figure 1). This indicates the superposition of the effects of successive measures starting with a lock down after March 30th. The digital pass system for Moscow starting on April 15th is indicated in the plot, suggesting that this measure was the most efficient. No effect of release is visible in the data plotted.

USA
The total incidence for the whole USA is around 800, which is at the upper end of the European states.
A closer inspection of ten states in the USA showed that the numbers for the whole country are averaged over states with similar case numbers but very different time histories. Even the fit types are different, and all three defined types occur (Table 2).
The extrapolated cumulative incidences for ten states inside the USA with high case numbers are compared with the policy in the respective state (8) in Table 3. The three states which "monitor vulnerable populations" have a significantly lower number of cases to be expected than the others.
Even Pennsylvania with its large cities still has a lower value than Indiana. A very rough estimate is that this monitoring reduces the case number in comparable states by a factor of two to three. This would be consistent with the experience in Europe that residences of elderly people are very severely affected by the virus.
Two extremely diverging cases are California and North Carolina with 470 and 2040 extrapolated total incidence, respectively.
Case numbers in California first showed the exponential increase (orange in Figure 1) as seen everywhere. Around March 19th a lockdown became effective, and the case number was significantly lower than the initial exponential curve would have predicted. A single fit (F1) resulted in 186,000 extrapolated total cases and an incidence of 470, which is well in the range of many European countries. By May 12th (16) the lockdown was released to some extent. As a striking result, the new cases per day started to increase strongly again on a nearly exponential line well beyond the original fit. On the other hand, data for North Carolina yielded a high number of infections from the beginning.
The maximum of daily infections is only reached at the end of July, and one has to wait until the beginning of November for the infection rate to decrease to 10% of the peak value. North Carolina is described as a more agricultural state (19), and should not have to fight infections in large industrial centers as do New York, Illinois, California and Pennsylvania. The high number of infections in this state thus seems remarkable and may be related to the fact that North Carolina is the only state among the ten not listed as "containing new cases" in (8).

Conclusion
The intention of the present manuscript is to present a model which permits tracing the Corona virus spread as a function of interventions and quantifying effects on the basis of empirical data and a few significant fit parameters. Such models are of great relevance for giving a justification for imposing or releasing measures such as contact ban or lockdown, which have a huge worldwide impact on the economy and daily life. This work is complementary to extended studies using epidemiological models with a large overhead, which are often difficult to apply and to transfer. It has been shown that the approach followed here permits the straightforward comparison of data from different countries.
I want to make the following points:
(6) This analysis permits a forecast for the final incidence and the duration of the pandemic by extrapolation of fit functions. Moreover, one rapidly notices effects such as the uncontrolled increase of case numbers by monitoring sudden deviations from the fit.
(7) For ten states inside the USA, a correlation between the criteria for reopening and the calculated cumulative incidence was found. California, Pennsylvania and Washington, which monitor vulnerable populations, had lower predicted values. North Carolina is not listed as "containing new cases" and has a very high calculated value.
Table 2: Simulation results for eleven countries including the USA and ten states inside the USA. Data were extracted from the fits in Figure 1 and arranged according to the fit type only, disregarding political correlations and the scale of the infection in each country. The time unit is days. The columns contain:
-Sizes of populations (20): Possible uncertainties in these numbers are probably small as compared to errors in the fits and the resulting maximum numbers of cases.
-The initial doubling time $t_2$ during uncontrolled virus spreading, which is related to the initial slope of the orange fit.
-$N(t \to \infty)$, the extrapolated maximum case number (cf. full green line in Figure 1).
-$\Delta t_h$, the efficient time, by which the virus spreading was slowed down before large scale virus spreading would have taken place with final herd immunity, but with an uncontrollable death number.
-The fit types, denoted F1, F2 and F3 according to the importance of the fitting by the short time (dotted blue) or long time (full and dashed green) functions (see text and Figure 1).
-The time constants $\tau$ for the fits by blue and green lines, respectively. The importance of these constants is readily seen by inspecting the respective results for the infection rates in the linear plots.
-New cases per day: Figures in the next four columns reflect the importance and duration of the pandemic in each country: (1) Maximum new infections per day, (2) respective date, (3) date when the rate has decreased to 10% of its maximum, and (4) time between these two dates.
-Incidence per day <1: An estimate for the duration of the pandemic is given by evaluating the date when the incidence has dropped below 1 case per day.