COVID-19: Time-Dependent Effective Reproduction Number and Sub-notification Effect Estimation Modeling

Background: Since the beginning of the COVID-19 pandemic, researchers and health authorities have sought to identify the different parameters that govern their infection and death cycles, in order to be able to make better decisions. In particular, a series of reproduction number estimation models have been presented, with different practical results. Objective: This article aims to present an effective and efficient model for estimating the Reproduction Number and to discuss the impacts of sub-notification on these calculations. Methods: The concept of Moving Average Method with Initial value (MAMI) is used, as well as a model for Rt, the Reproduction Number, is derived from experimental data. The models are applied to real data and their performance is presented. Results: Analyses on Rt and sub-notification effects for Germany, Italy, Sweden, United Kingdom, South Korea, and the State of New York are presented to show the performance of the methods here introduced. Conclusions: We show that, with relatively simple mathematical tools, it is possible to obtain reliable values for time-dependent Reproduction Numbers (Rt), as well as we demonstrate that the impact of sub-notification is relatively low, after the initial phase of the epidemic cycle has passed.


1) Introduction
The attempt to describe the different epidemic cycles that represent the current COVID-19 pandemic often comes up against the quality of the publicized data. For a number of practical reasons registration of deaths and of infections are inevitably imprecise, although can be corrected over time, as we discuss in [1]. This previous work suggests that one way around this effect is to apply the so-called Moving Average Method (MAM), in which the daily value of deaths is replaced by the sum of the current daily value with the values for the following six days, divided by seven. In other words, in a 7-day week, starting on Sunday and ending on Saturday, the average of the seven values will be computed in the Sunday value. This is the case with the Moving Average Method with Initial Value (MAMI). In the series analysis, the MAMI is seen as a moving window in time, starting from the first event (infection or death) to the last one. While in [1] we dealt with finding patterns in the number of deaths cycles, in order to predict the end of these cycles, in this paper we introduce a method to estimate the transmission rate or, in technical terms, the Basic Reproduction Number (R 0 ), which in fact, we consider that varies in time, therefore it should be considered as a Transmission Function. Many works focused and continue to focus on establishing R 0 , the most recent listed on PubMed database (https://pubmed.ncbi.nlm.nih.gov/), with direct relationship with our work, are briefly commented as follows.
Musa et al. [2] estimate the exponential growth rate and basic reproduction number (R 0 ) of COVID-19 in Africa to show the potential of the virus to spread. The authors analyzed the initial phase of the epidemic in Africa by using the simple exponential growth model. The Poisson likelihood framework is adopted for data fitting and parameter estimation, and the distribution of COVID-19 generation interval is modeled as Gamma distributions to compute the basic reproduction number. With . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint a calculated R 0 , the authors quantified the instantaneous transmissibility of the outbreak by the time-varying effective reproductive number to show the potential of COVID-19 to spread across African region.
Hao et al. [3] reconstruct the full-spectrum dynamics of the COVID-19 outbreak between January 1 and March 8, 2020 in Wuhan, China, across five time periods based on key events and interventions. Following specific assumptions, the authors assumed that the transmission rate and the ascertainment rate did not change in the first two of the periods, and estimated rates across periods by Markov Chain Monte Carlo (MCMC) and further converted the transmission rate into the effective reproduction number R e . Using confirmed cases exported from Wuhan to Singapore, the authors conservatively estimated the ascertainment rate during the early outbreak in Wuhan. Sensitivity analyses to test the robustness was applied to an specific outlier data point, varying lengths of latent and infectious periods, duration from symptom onset to isolation, ratio of transmissibility of unascertained cases to ascertained cases, and the initial ascertainment rate.
Ogden et al. [4] introduce a two-folded approach with an agent-based model and a deterministic compartmental model. Both are Susceptible-Exposed-Infectious-Recovered (SEIR) type models with elements to model COVID-19 and impacts of Non-pharmaceutical measures (NPIs). Parameters in the models are calibrated daily by literature searches, to ensure that evolving knowledge is captured by the models.
The main objective of the modeling approaches was to compare the impacts of different NPIs, by adjusting Gaussian-shaped curves according to the parameters obtained daily and checking the R e .
Fawad et al. [5] aim at obtaining precise estimates of COVID-19 Pandemic current and future trends, by retrieving data from the Health Commission of Hubei, China, using Logistic-S curve model to estimate the current and future trends among 16 cities of Hubei, China from the 11 th of January to the 24 th of February, 2020. Mean Absolute Percentage Error (MAPE), Mean Absolute Deviation (MAD), and Mean Squared Deviation (MSD) were used to evaluate the predicted models for each city.
Lin et al. [6] aim to summarize the mathematical models that have been developed to understand and predict the infectiousness of COVID-19 to inform efforts to manage the outbreak in China. The authors searched various article bases for . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint relevant studies published between 1 st December, 2019 and 21 st , February, 2020, as well as built a Gaussian simulation for the evolution of the epidemic in Wuhan specifically. When R 0 and R c were stratified by region, differences among the two parameters remained statistically significant across all four studied Chinese regions.
Although the works listed above are quite recent, they are based on data from the beginning of the pandemic, and as we noted in [1], these may have led to later models that failed to make predictions. In fact, other authors identify deficiencies in the prevalent pandemic prediction models. Hon and Li [7] work on top of the Susceptible-Infectious-Removed (SIR) model and its variants for modeling the pandemic. The author however list that time-independent parameters in the classical models may not capture the dynamic transmission and removal processes, governed by virus containment strategies taken at various phases of the epidemic; and few models account for possible inaccuracies of the reported cases. The authors in [7] propose a Poisson model with time-dependent transmission and removal rates to account for possible random errors in reporting and estimate a time-dependent disease reproduction number, which may reflect the effectiveness of virus control strategies.
Similar problems with the early models on COVID-19 cycles were also reported by [8]. The authors present an analysis of these early models, mostly based on the city of Wuhan original outbreak. They present a statistical framework for comparing and combining different estimates of R 0 (the transmission rate or reproductive number) across a wide range of models by decomposing the basic reproductive number into three key quantities: the exponential growth rate, the mean generation interval and the generation-interval dispersion. They apply their framework to early estimates of R 0 for the SARS-CoV-2 outbreak, showing that many R 0 estimates were overly confident. The results emphasize the importance of propagating uncertainties in all components of R 0 , including the shape of the generation-interval distribution, in efforts to estimate R 0 at the outset of an epidemic. In general, the authors in [8] conclude that the models derived from the initial Chinese data resulted in low quality later predictions, in the same way that we point out in [1].
Therefore, there is room for proposing new forms of estimating R 0 more precisely. Thus, this article, while recognizing the important contribution made by the . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint other authors who preceded it, seeks simultaneously to propose a model that corrects their deficiencies and at the same time is simple to apply. Simplicity, combined with efficiency, is essential for its use to be disseminated by the health authorities of micro-regions and cities, since in [1] we propose that it is through the analysis of regional data that a better understanding of the patterns of the epidemic cycles is achieved.

2) Data Conditioning and Epidemic Cycle Definition
In the same way that was done in [1], in this work certain definitions have to be established before going further. Initially, we submit daily infection cases figures to the previously discussed Moving Average Method with Initial value (MAMI), and the critical epidemic cycle duration is defined by the time spent between its beginning, the peak and its closure. Also, it is considered that the cycle does start when the number of daily infections start to climb continuously past a threshold of 5% of the MAMI peak. The same criteria is used to define the end of the cycle, however, in this case when figures descent past 5% of the peak value. A False Peak is a data value that does not fit in the general cycle tendency. A Peak, or a True Peak, is the point where the cycle reverses itself from climbing to descending [1]. Finally, there is the question of terminology: we consider that there is an effective transmission rate that is dependent of time, represented by a function R e (t), however, for the sake of simplicity, it will be referenced using the usual terminology and representation of the area, which is simply R t . The terms Transmission Rate and Reproduction Number will be used as synonyms.

2.1) MAMI Effect Over Reproduction Numbers
The impact of MAMI applied to registered numbers can be better understood by analyzing Figure 1. It is easily seems that MAMI bears greatest effect at the very beginning of the epidemic cycle, after a brief time, average and actual data tend to yield to the same value as cycles progress. It will be shown along this paper that the Reproduction Number varies most at the early stages and the use of MAMI is plainly justified to avoid numbers that are registered in batches and not into a smooth, daily, fashion.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Figure 1 -MAMI effect over Reproduction Numbers expressed for two different countries, South Korea and Italy. South Korea: Blue line is R t by MAMI applied to registered data, red line is R t determined for registered data. For Italy: yellow line is R t for registered data and green for MAMI applied to registered data.

2.2) The Progressive Sum of Infection Cases
The use of Moving Average Method, fixing the average of the seven days cycle value at the first day, as explained in [1], is used along this work. Daily values for total cases were collected from JHU's site on the 22 sd of July, 2020. The number of infected should not be used straight as reported daily by local authorities, for reasons previously explained [1], but before, submitted to MAMI, and then applied to the described method. This method may be used for Progressive Sum of Infection Cases (PSIC), where the next day is always greater than the previous (or at least equal). In this case it is assumed that the whole number of infected is capable of contaminating and not only those listed on the day before, therefore is more realistically to use the summation of cases than the daily cases solely.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

2.3) Reported Data
Germany is a West country reported as exemplary in terms of application of NPIs, therefore it is going to be used as a starting point for this work. In Figure 1, the black line represents the number of daily cases, as reported by local authorities and listed in John Hopkins University's website (https://coronavirus.jhu.edu/map.html). Blue bars are these numbers submitted to MAMI and the red line represents the total cases to date and uses the right hand axis as reference. The same data source, calculation method, and display procedure is used for representing data from Italy, Sweden, UK, South Korea, and the State of New York, as presented in Figures 3 to 7. All data was collected on the 22 nd of July, 2020.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Figure 6 -Number of COVID-19 cases reported for South Korea. The black line represents the daily reported numbers, blue bars their MAMI, and red line the total cases to date and uses the right hand axis as reference.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint represents the daily reported numbers, red bars their MAMI, and blue line the total cases to date and uses the right hand axis as reference. Table 1 displays the basic critical cycles numbers presented in this work. Note that Sweden and NY State still have not closed their critical cycles, if a 5% percent limit criteria is used. It does contain the dates for beginning and end of the critical cycle, its duration, the peak day. Maximum value of daily cases after MAMI was applies to registered daily cases and the 5% limit used for determine the cycle boundaries.

3) Daily Number of Infected
This paper proposes to approach the Reproduction Number varying in time and calculating its effective value, based on experimental data. Therefore, we seek to develop a mathematical model that is efficient and at the same time is fast computed.
To achieve this objective, the cycle data were studied in detail and the following experimental formulations were developed.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint The total number of infected daily (I d ), during a period of time t, can be described as a function of the daily increase rate factor (1 + b) multiplied by a scale factor, as described below: Or it could be also be written as: Where I d is the total number of infected at some given time t, R t is the Time-Dependent Effective Reproduction Number, obtained from experimental data, "a" is the scale factor and "b" the absolute daily increase rate, or instantaneous rate and is defined as: Where I d,n+1 is the present day and I d,n is yesterday.
For the Reproduction Number determination, it is necessary to determine the scale factor "a". Therefore, it is assumed the following function: And finally: In order to map the interpretation proposed here for the classical mathematical interpretation for, the equivalence transformations will be described below. Let: Where β is infection-producing contacts per unit time (instantaneous rate), with a mean infectious period of τ .
Equation (6) can be transformed into: From equation (5) we have: . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint In (8) all dimensional units are compatible, therefore our transformations to obtain R t in (5) are valid. Since Formula (5) was obtained from experimental data, it will be used from this point onwards in this work, and R t must be understood as R e (t) as explained beforehand.
It is noted that the daily increase rate factor, 1 + b, is not enough to describe the number of contaminated cases registered at one given day, because it just informs the absolute increase ratio occurred from one day to the next. The Reproduction Number coefficient needs more numerical information to be able to express correctly the magnitude of daily numbers. It needs the scale factor "a" to bring information. Figure 7 shows that while the (1 + b) factor varies rapidly from one day to the next, C drops steadily, changing slowly as the exponential time grows. The same behavior is displayed by the total daily registered number of deaths, which keeps growing smoothly. This is the numerical evidence that the factor (1+b) alone cannot describe the total number of deaths.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Figure 8 -Behaviors of (1 + b) and R t factors, for the first 20 days in the German epidemic cycle.

4.1) Germany
In Figure 9 three distinct zones are formed. Zone "a" is in the very beginning of the cycle and the Reproduction Number varies from 1.10 to 1.48 from one day to the next, this probably is just the reflection of large initial variation in numbers, but only if we limit this zone to no more than 5% of the MAMI peak value. It is easy to notice that the figures bear small influence in the overall disease behavior. Zone "b" describes transmissivity during the critical disease cycle (from March, the 6 th to June the 7 th ), where a rapid increase in daily cases stops only around the peak than drops steadily towards the end. This is the most lethal period of the epidemic cycle and it is considered over once a 5% peak level is reached again. The remaining time, Zone "c", is the residual cycle that appears in all countries and places facing the COVID-19 crisis. In absolute values the Reproduction Number for the critical period starts with a 1.30 value and drops continuously towards 1.00, although never quite reaches it (as for the time of this paper was written).
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Figure 9 -Total epidemic cycle in Germany, using the number of infected people daily.

4.2) Italy
It can be seem in Picture 10 that three distinct zones are formed. Zone "a" is in the beginning of the cycle and the Reproduction Number varies from 1.78 to 1.44 from one day to the next, once again this probably is just the reflection of large initial variation in number, but this zone is limited to no more than 5% of the MAMI peak value. It is easy to notice that the figures bear small influence in the overall disease behavior. Zone "b" describes transmissivity during the critical disease cycle (from February 25 th to June 15 th ). This is the most lethal period of the epidemic cycle and it is considered over once a 5% peak level is reached again. The remaining time, Zone "c", is the residual cycle. In absolute values the Reproduction Number for the critical period starts with a 1.44 value and drops continuously towards 1.12.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Picture 10 -Total epidemic cycle in Italy, using the number of infected people daily.

4.3) Sweden
It can be seem in Picture 11 that two distinct zones are formed, once Sweden is considered, by the 5% criteria an "ongoing" epidemic cycle, although in the present date, close to the end. Zone "a" is in the beginning of the cycle and the Reproduction Number varies from circa 1.33 to 1.16 from one day to the next, once again this probably is just the reflection of large initial variation in number, but this zone is limited to no more than 5% of the MAMI peak value. It is easy to notice that the figures bear small influence in the overall disease behavior. Zone "b" describes transmissivity during the critical disease cycle (from March 4 th onwards). This is the most lethal period of the epidemic cycle and it is considered over once a below 5% peak level is reached again. In absolute values the Reproduction Number for the critical period starts with a 1.16 value and drops continuously towards 1.07.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Figure 11 -Ongoing epidemic cycle in Sweden, using the number of infected people daily.

4.4) United Kingdom
In Figure 12, two distinct zones are formed, because the UK critical cycle was considered over exactly at the data collection day. Zone "a" is in the beginning of the cycle and the Reproduction Number oscillates between 1.48 and 1.37 from one day to the next, once again this probably is just the reflection of large initial variation in number, but this zone is limited to no more than 5% of the MAMI peak value. Zone "b" describes transmissivity during the critical disease cycle (from March 6 th onwards, up to July, the 21 st ). This is the most lethal period of the epidemic cycle and it is considered over once a below 5% peak level is reached again. In absolute values the Reproduction Number for the critical period starts with a 1.37 value and drops continuously towards 1.08.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Figure 12 -Total epidemic cycle in the UK, using the number of infected people daily.

4.5) South Korea
Data presented in Figure 13 shows two distinct zones, given that the registered cases starts exactly at the critical cycle. Zone "b" is in the beginning of the cycle and the Reproduction Number starts at 2.07 and drops very fast to 1.02. This is the critical disease cycle (from February 19 th up to April, 9 th ). After the limit date for the critical cycle, it is considered that Zone "c" is reached, which is the residual cycle.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Figure 13 -Total epidemic cycle in the South Korea, using JHU database as source for number of infected people daily.

4.6) New York State
In figure 14 two distinct zones are formed, because for the NY State, the critical cycle is still ongoing at the data collection day. Zone "a" is in the beginning of the cycle and the Reproduction Number oscillates between 3.94 and 3.30 from one day to the next, once again this probably is just the reflection of large initial variation in number, but this zone is limited to no more than 5% of the MAMI peak value. Zone "b" describes transmissivity during the critical disease cycle (from March 16 th onwards). This is the most lethal period of the epidemic cycle and it is considered over once a below 5% peak level is reached again. In absolute values the Reproduction Number for the critical period starts with a 3.30 value and drops continuously towards 1.08.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Figure 14 -Total epidemic cycle in New York State, using JHU database as source for number of infected people daily.

5) Sub Notification Effect Over Reproduction Number
When it comes to analyzing the number of cases of infection in the COVID-19 epidemic, an issue that always arises is underreporting or sub-notification and its importance in predicting the behavior of the epidemic cycle. Thus, the second part of this work is dedicated to the study of sub-notification and its effects on prediction.
Sub-notification (SN) is here understood as the fact that the COVID-19 true number of infected are only guessed by public health authorities. Once most of the of the people exposed to the virus does not display any sign of infection or the symptoms are very mild ones so that no medical attention is required, therefore going unnoticed and unregistered by local bureaus of health statistics, the development of evaluation tools is necessary.
If it is assumed that SN is a constant factor (eg. 10 times the registered number of cases) during the whole epidemic cycle, it does not change the absolute daily . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint increase rate "b" or the (1 + b) factor. However, in fact it does affect the scale factor "a", therefore changing the transmission rate or reproduction number (R t ).

5.1) Sub Notification Impact Estimation Method
The impact of SN over R t may be estimated by the following method. At first it is assumed that the actual registered numbers for daily contaminated people are no longer their actual values, but new ones multiplied by a factor (the SN factor). After that the scale factor "a" is calculated. The term (1 + b) remains constant, once the ratio (see term definition in Formula 3) remains constant. Then "a" and (1 + b) are applied to Formula 5, thus recalculating R t , now reflecting the effect of the imposed SN factor. This new R t value would have been the correct one in case all sub notified cases were suddenly registered. The percentage difference between this new, recalculated R t and the actual one provides an estimate for the impact of SN over the reproduction number for a given population. Therefore, multiplying the values for registered cases by a factor of 10 will not cause a tenfold increase in R t . The true impact must be, therefore, calculated as described. It is also observed that SN affects the greatest at the very beginning of the critical cycle. After a certain amount of time, error drops to insignificant values (below 5%). In reverse, a sudden change of trajectory. Practical applications of this method are presented as follows.

5.2) Sub Notification Estimates for Germany
An arbitrary threshold line representing a 5% error was drawn in Figure 15. This limit tells that after the 50 th day into the German critical cycle (the one between 5% of the peak value, before and after it), regardless the amount of sub notification, the error over the calculated Reproduction Number is no greater than 5%. At the other extreme, a 3x sub notification basically does not induce errors larger than 5% over the Reproduction Number, in any time during the critical cycle. A maximum error of 16.84% is estimated for the worst case scenario here simulated, a 40x sub notification, and the first day into the cycle. In overall, sub notification appears to have no significant impact in Germany official infected numbers. Sub notification also seems to have more impact in the very beginning of a given cycle, but turns itself irrelevant towards the end.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

5.3) Sub Notification Estimates for South Korea
The same threshold line representing 5% error was drawn in Figure 16. This limit tells that after the 48 th day into the South Korean critical cycle, regardless the amount of sub notification, the error over the calculated Reproduction Number is no greater than 5%. At the other extreme, a 3x sub notification basically does not induce errors larger than 5% over the Reproduction Number, in any time during the critical cycle. A maximum error of 14.25% is estimated for the worst case scenario here simulated, a 40x sub notification, and the first day into the cycle. In overall, sub notification appears to have no significant impact in South Korea, as in Germany, official infected numbers. Sub notification also have more impact in the very beginning of a given cycle, but turns itself irrelevant towards the end of it.  is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020.

5.4) Sub Notification Estimates for Italy
Italy case is presented in Figure 17. The 5% limit tells that after the 44 th day into the Italian critical cycle, regardless the amount of sub notification, the error over the calculated Reproduction Number is no greater than 5%. At the other extreme, a 3x sub notification basically does not induce errors larger than 5% over the Reproduction Number, in any time during the critical cycle and 5x barely disturbs it. A maximum error of 12.34% is estimated for the worst case scenario here simulated, a 40x sub notification, and the first day into the cycle. In overall, sub notification appears to have no significant impact in Italy, as in the previous two cases, official infected numbers. Sub notification also have more impact in the very beginning of a given cycle, but turns itself irrelevant towards the end of it.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020.

5.5) Sub Notification Estimates for Sweden
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Sweden SN effect is presented in Figure 18. The calculated limit tells that after the 54 th day into the Swedish critical cycle, regardless the amount of sub notification, the error over the calculated Reproduction Number is no greater than 5%. At the other extreme, a 3x sub notification basically does not induce errors larger than 5% over the Reproduction Number, after the fourth day during the critical cycle. A maximum error of 18.53% is estimated for the worst case scenario here simulated, a 40x sub notification, and the first day into the cycle. In overall, sub notification appears to have no significant impact in Sweden. Sub notification also have more impact in the very beginning of a given cycle, but turns itself irrelevant towards the end of it.  is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020.

5.6) Sub Notification Estimates for New York State
NY State case is presented in Figure 19. Its limit tells that after the 63 rd day into the New York critical cycle, regardless the amount of sub notification, the error over the calculated Reproduction Number is no greater than 5%. At the other extreme, even at a 3x sub notification, the error induced is much larger than 5% over the Reproduction Number, and takes 13 days to reach values below 5%. A maximum error of 33.63% is estimated for the worst case scenario here simulated, a 40x sub notification, and the first day into the cycle. In overall, sub notification appears to have some impact in NY State. Sub-notification also have more impact in the very beginning of a given cycle, but turns itself irrelevant towards the end of it.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Figure 19 -Sub notification effect over Transmissivity Rate in NY State, during the critical epidemic cycle.

5.7) Sub Notification Estimates for the United Kingdon
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint UK case is presented in Figure 20. The 5% limit tells that after the 51 st day into the British critical cycle, regardless the amount of sub notification, the error over the calculated Reproduction Number is no greater than 5%. At the other extreme, a 3x sub notification basically does not induce errors larger than 5% over the Reproduction Number, in any time during the critical cycle and 5x barely disturbs it and only during the first 10 days. A maximum error of 16.11% is estimated for the worst case scenario here simulated, a 40x sub notification, and the first day into the cycle. In overall, sub notification appears to have no significant impact in Italy, as in the previous 2 cases, official infected numbers. Sub notification also have more impact in the very beginning of a given cycle, but turns itself irrelevant towards the end of it.  Table 7 lists the errors associated with ignoring the existence of Sub Notification into the epidemic cycle -the UK.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

6) Most Lethal Cycle of Epidemics Control Performance
This item expresses the epidemic behavior during the Most Lethal Cycle of Epidemics (MLCE) [1], which is defined as the period of time when the number of registered cases grows fast towards a peak and then declines towards an almost constant daily rate. The frequency shape resembles a triangular form, where the climbing side is smaller than the descending side [1].
In order to allow a comparison between different MLCEs all listed places had their date normalized. The interval was selected between the agreed 5% limits, and the non-dimensional time was determined dividing each day by the total MLCE duration, as listed in Table 1. For the Reproduction Number, all the values was divided by the largest one found in the MLCE interval. All these transformations allow to confront how efficient was the disease spreading control used. Sweden and New York state were considered as still having an open MLCE, therefore the end of the cycle considered was the day of the data collection (July, 23 rd , 2020).
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint Figure 21 -Non-dimensional critical epidemic cycle for Germany, Sweden, UK and Italy. Figure 21 shows that Italy was, relatively, the most unsuccessful place in reducing Reproduction Numbers, although not for a large margin. Germany and the UK have the same performance and kept R t falling slowly but steadily. South Korea and New York state achieved a large drop in the early stages of the critical cycle, but after that R t were basically constant.

7) Conclusions
In an attempt to quickly predict the evolution of local episodes of COVID-19 in countries and cities, several prediction models for epidemic cycles were presented, most of them, unfortunately, were not successful, as the pandemic progressed and consequent access to updated data, as well as the literature showed.
As a natural evolution of this task of prediction, new models emerged, adding complexity in the search for success in forecasts. Although some models have been successful, we understand that, in order to become usable by a significant number of . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted July 30, 2020. . https://doi.org/10.1101/2020.07.28.20164087 doi: medRxiv preprint health authorities, in addition to the necessary efficacy, they also need to be efficient, which in our view, is related to simplicity of use. Models that are too complex, even when supported by software, still need to be fed by several parameters, which makes these tools often inaccessible to decision makers in locations with less technicalscientific resources.
In this sense, this article complements the previous one [1], which dealt with prediction in the number of deaths, in two fold way: by bringing an effective and efficient model of prediction in the number of infection cases, and also seeking to show the impact of sub-notification on calculations, shedding light on this discussion, which is still frequent, especially in countries where there is no mass testing.
We show that, with relatively simple mathematical tools, and even in the face of data with quality problems, it is possible to obtain reliable values for time-dependent Reproduction Numbers (R t ), as well as we demonstrate that the impact of subnotification, when provided with the proper statistical treatment, is relatively low, after the initial phase of the epidemic cycle has passed. In this way, we believe to have contributed to the studies of the epidemic cycles of COVID-19.