Undocumented infectives in the Covid-19 pandemic
================================================

* Maurizio Melis
* Roberto Littera

## Abstract

**Background** A crucial role in epidemics is played by the number of undetected infective individuals who continue to circulate and spread the disease. Epidemiological investigations and mathematical models have revealed that the rapid diffusion of Covid-19 can mostly be attributed to the large percentage of undocumented infective individuals who escape testing.

**Methods** The dynamics of an infection can be described by the SIR model, which divides the population into susceptible (*S*), infective (*I*) and removed (*R*) subjects. In particular, we exploited the Kermack and McKendrick epidemic model, which can be applied when the population is much larger than the number of infected subjects.

**Results** We proved that the fraction of undocumented infectives, in comparison to the total number of infected subjects, is given by ![Graphic][1] where *R*0 is the basic reproduction number. Its mean value *R*0 = 2.10 (2.09 − 2.11) in three Italian regions for the Covid-19 epidemic yielded a percentage of undetected infectives of 52.4% (52.2% - 52.6%) compared to the total number of infectives.

**Conclusions** Our results, straightforwardly obtained from the SIR model, highlight the role played by undetected carriers in the transmission and spread of the SARS-CoV-2 infection. Such evidence strongly supports careful monitoring of the infective population and ongoing adjustment of preventive measures for disease control until a vaccine becomes available.

## Introduction

A critical issue in the control of an epidemic is to know the exact number of infective subjects. Current estimates of SARS-CoV-2 infection are significantly hampered by the difficulty of performing large-scale diagnostic tests, despite a growing awareness that the spread of the Covid-19 pandemic is mostly caused by undetected carriers.
The dynamics of an epidemic can be described by an epidemiological model known as the SIR model, which divides the whole population into three classes of subjects: susceptible (*S*), infective (*I*) and removed (*R*) individuals. Kermack and McKendrick [1] developed a SIR model for the study of epidemics in populations much larger than the number of infected subjects. Under this assumption, which is fully verified in the Covid-19 epidemic, we proved that the total number of infectives, when an epidemic occurs, is approximately *R*0 · *R*, where *R*0 > 1 is the basic reproduction number of the infection and *R* is the number of infectives who have been removed because of recovery, isolation, hospitalisation or death. The number of undocumented infectives is then (*R*0 − 1) · *R*. The fractions of removed and undetected infectives, in comparison to the total number of infectives, are ![Graphic][2] and ![Graphic][3], respectively.

By applying the aforesaid model to the data available on the Covid-19 epidemic in Italy, we obtained that the mean value of the basic reproduction number in three Italian regions was *R*0 = 2.10 (95% confidence interval, 2.09 – 2.11). Consequently, the number of undocumented cases turned out to be about *R*0 − 1 = 1.1 times the number of removed cases. More specifically, we found that the percentage of undocumented infectives was about ![Graphic][4] (95% confidence interval, 52.2% – 52.6%) of the total number of infectives.

Previous investigations found that the percentages of asymptomatic infectives (i.e. subjects without fever, cough or any other symptoms) were: 43.2% (32.2% - 54.7%) in Vo’, a small town near Padua in Italy [2]; 50.5% (46.5% - 54.4%) on board the Diamond Princess cruise ship in Yokohama, Japan [3] and 47% (38% - 56%) within China [4].
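As a worked check, the relations quoted earlier in this section can be written out explicitly. The block below is a reconstruction from the surrounding text (the typeset formulas of the paper are images and are not reproduced here); the subscripted symbols are the ones defined in the text.

```latex
\begin{align}
I_{tot} &\approx R_0 \, R, \\
U &= I_{tot} - R \approx (R_0 - 1)\, R, \\
\frac{R}{I_{tot}} &\approx \frac{1}{R_0}, \qquad
\frac{U}{I_{tot}} \approx 1 - \frac{1}{R_0}, \\
R_0 = 2.10 \;&\Rightarrow\; \frac{U}{I_{tot}} \approx 1 - \frac{1}{2.10} \approx 0.524 .
\end{align}
```

Evaluating the same expression at the ends of the confidence interval of the basic reproduction number, 2.09 and 2.11, gives about 0.522 and 0.526, which matches the interval 52.2% – 52.6% reported for the undocumented fraction.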
The speed at which an epidemic grows cannot be explained if we only take into account the number of recorded infected patients who, supposedly, are immediately removed from the circulating population by hospitalisation or isolation at home. Undocumented infectives are largely responsible for the rapid increase of the epidemic and can be classified into three classes: paucisymptomatic or asymptomatic individuals who never develop overt symptoms during the course of infection; presymptomatic subjects who will eventually develop symptoms; symptomatic infective individuals who have clinical symptoms but for different reasons (such as the shortage of nasopharyngeal swabs) are not diagnosed as positive and capable of transmitting the disease. The main fraction of undocumented infectives is represented by the asymptomatic carriers, who continue to circulate and spread the disease. To reliably detect their presence, it would be necessary to test the entire population and not just the suspect cases.

The data provided by the Italian Ministry of Health and the Civil Protection Department up to the 3rd of June 2020 [5] reported about 233800 removed cases in Italy, including patients who were hospitalised, isolated at home, recovered or dead. Based on the result found in the present study, the total number of infectives must have been almost 491000. This means that about 257200 individuals were not diagnosed as infected although they continued to circulate and spread the virus.

This study confirms that undocumented infectives can be considered the key culprits for the rapid spread of SARS-CoV-2 within the population. Consequently, interventions to control the infection will need to be maintained until the complete disappearance of the epidemic.

Further details on the SIR model and the numerical fit of the data are reported in the Appendices, where evaluation of the basic and effective reproduction numbers *R*0 and *R*eff(*t*) is also discussed.
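The national figures quoted in this Introduction can be reproduced with a few lines of arithmetic. The following is only a minimal sketch, assuming the approximations stated in the text (total infectives ≈ basic reproduction number times removed cases); the function name `undocumented_estimates` is hypothetical and not part of the study.

```python
def undocumented_estimates(removed, r0):
    """Return (total, undocumented, undocumented_fraction), assuming I_tot ~ R0 * R."""
    if r0 <= 1:
        # The approximation in the text is stated for an epidemic, i.e. R0 > 1.
        raise ValueError("the approximation assumes R0 > 1")
    total = r0 * removed             # I_tot ~ R0 * R
    undocumented = total - removed   # U ~ (R0 - 1) * R
    fraction = undocumented / total  # U / I_tot ~ 1 - 1/R0
    return total, undocumented, fraction


# Removed cases in Italy up to 3 June 2020 and the mean R0 quoted in the text.
total, undocumented, fraction = undocumented_estimates(removed=233_800, r0=2.10)
print(round(total), round(undocumented), round(100 * fraction, 1))
# prints: 490980 257180 52.4
```

Rounded to the nearest hundred, these values correspond to the 491000 total and 257200 undocumented infectives quoted above.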
## Methods

In the SIR epidemic model the population is divided into three distinct classes [6]: the susceptible subjects, *S*, who can catch the disease; the unremoved infectives, *I*, who have the disease and can transmit it; and the removed infected subjects, *R*, namely those with a laboratory diagnosis who are either hospitalised, isolated at home, dead or recovered.

We assume that all individuals diagnosed as infected (either by nasopharyngeal swab or serological test) are immediately isolated, thus passing from the class of the infectives *I* to that of the removed infectives *R*. By contrast, the infected subjects who are not diagnosed as positive pass from the class of the infectives *I* to that of the undocumented infectives *U*. The total number of infected subjects *I**tot*(*t*) is the sum of the number of removed infectives *R*(*t*) and undocumented infectives *U*(*t*): *I**tot*(*t*) = *R*(*t*) + *U*(*t*).

As discussed in the Introduction, the undocumented infective individuals (i.e. unremoved or undetected) can be paucisymptomatic or asymptomatic, presymptomatic, or symptomatic without a positive test for Covid-19. The progression of an individual from the susceptible class to the removed or the undocumented compartment is represented by the flow diagram in Figure 1.

![Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/07/11/2020.07.09.20149682/F1.medium.gif)

Figure 1. Flow chart representing the progression of an individual from the susceptible class to the removed or the undocumented compartment. The fractions of *I**tot* in the *R* and *U* compartments are ![Graphic][5] and ![Graphic][6], respectively.
If *S*(*t*) is the number of susceptible individuals at time *t* and *N* is the size of the population, the total number of infected subjects *I**tot*(*t*) turns out to be

![Formula][7]

By manipulating the differential equations which define the SIR model (Appendix A) and assuming that the initial number *S*0 of susceptible individuals is close to *N*, i.e. *S*0 ≅ *N*, one obtains

![Formula][8]

where *R*0 is the basic reproduction number (discussed in Appendix B). Under the assumption *R*0 · *R*(*t*)/*N* ≪ 1 (a condition which is certainly verified if the population size is much larger than the number of infected subjects) we can approximate *S*(*t*) in the following form:

![Formula][9]

The total number of infected subjects *I**tot*(*t*) at time *t* then becomes

![Formula][10]

while the unremoved infectives *U*(*t*) at time *t* turn out to be

![Formula][11]

The ratio between the removed infected subjects *R*(*t*) and *I**tot*(*t*) at time *t* is

![Formula][12]

while the ratio between the unremoved infectives *U*(*t*) and *I**tot*(*t*) at time *t* is

![Formula][13]

Since *R*0 · *R*(*t*)/*N* ≪ 1, the previous four equations can be approximated as

![Formula][14]

These results are obtained under the assumption *S*0 ≅ *N*, which implies *R*0 > 1, i.e. that an epidemic ensues. The fraction of undocumented infectives *U*, in comparison to the total infectives *I**tot*, has been derived straightforwardly from the SIR epidemic model and only depends on the basic reproduction number *R*0.

## Results

The data provided by the Italian Ministry of Health and the Civil Protection Department in Italy [5], updated to the 3rd of June 2020, were fitted for three Italian regions by means of a specific code written with Wolfram Mathematica 12.1 [7] and based on the Kermack-McKendrick model [1].
Lombardy, in the north of Italy, has been the region with the highest number of Covid-19 infections, followed by Emilia-Romagna (in second place from the 29th of February to the 24th of April, in third place in the other periods of the epidemic). By contrast, the island of Sardinia, in the south of Italy, was one of the regions with the lowest number of documented Covid-19 infections and deaths. The population sizes *N* in these regions, updated to the 1st of January 2019, were: Lombardy *N* = 10060574, Emilia-Romagna *N* = 4459477, Sardinia *N* = 1639591 (data from ISTAT, Italian National Institute of Statistics).

In these three Italian regions our epidemiological model yielded the mean value *R*0 = 2.10 (2.09 − 2.11) for the basic reproduction number *R*0 (Appendix B). Time *t* was expressed in days since *t*0 (*t* = 0), the day before the date of the first diagnosed patient: 19th of February in Lombardy, 20th of February in Emilia-Romagna and 2nd of March in Sardinia. At any time *t*, the mean percentage of removed infectives *R*(*t*) in comparison to the total number of infectives *I**tot*(*t*) was about ![Graphic][15], while the mean percentage of unremoved infectives *U*(*t*) was about ![Graphic][16]. The assumption that the population size must be much larger than the number of infected subjects corresponds to a relative error ![Graphic][17] on the undocumented fraction of infectives, i.e. a percent error lower than 0.9% in Lombardy, 0.6% in Emilia-Romagna and 0.1% in Sardinia.

Figure 2 represents, on the basis of the data provided by the Italian Ministry of Health [5], the number of removed infectives *R*(*t*) in Lombardy, Emilia-Romagna and Sardinia, fitted by the equation *R*(*t*) = *c*1 · [tanh(*c*2*t* − *c*3) + tanh(*c*3)] (Appendix C).

![Figure 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/07/11/2020.07.09.20149682/F2.medium.gif)

Figure 2. Fit of the number of removed infectives *R*(*t*) according to the Kermack-McKendrick model in three Italian regions.

Table 1 reports the main parameters of the Covid-19 epidemic in Lombardy, Emilia-Romagna and Sardinia:

* the basic reproduction number *R*0;
* the final numbers (for *t* → ∞) of the removed (*R*), unrecorded (*U*) and total (*I**tot*) infectives, with the percentages *U*/*I**tot* and *R*/*I**tot*;
* the day *t*0 when the epidemic started;
* the time *t*peak (both in days since *t*0 and as a calendar date) of the maximum rate *R*′(*t*peak) of new cases per day, with the corresponding number of removed infectives *R*(*t*peak);
* the constants *c*1, *c*2, *c*3 in the equation *R*(*t*) = *c*1 · [tanh(*c*2*t* − *c*3) + tanh(*c*3)], determined by fitting the data on the Covid-19 epidemic with Wolfram Mathematica 12.1 [7].

[Table 1.](http://medrxiv.org/content/early/2020/07/11/2020.07.09.20149682/T1) Main epidemic parameters from the fit of the removed infectives *R*(*t*) in Lombardy, Emilia-Romagna and Sardinia. The 95% confidence intervals are reported in brackets.

Figure 3 shows the number of newly recorded infectives per day in Lombardy, Emilia-Romagna and Sardinia. These curves plot the equation ![Graphic][18] (Appendix C), which yields the rate of new removed infectives in the Kermack-McKendrick model.

![Figure 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/07/11/2020.07.09.20149682/F3.medium.gif)

Figure 3. Rate ![Graphic][19] of new removed infectives per day according to the Kermack-McKendrick model in three Italian regions.

Figure 4 compares the percentages of asymptomatic infectives found in three previous investigations, conducted in Vo’ (Italy) [2], Japan [3] and China [4], with the percentage of undocumented infectives in Lombardy, Emilia-Romagna and Sardinia obtained in this study through the SIR model.
![Figure 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/07/11/2020.07.09.20149682/F4.medium.gif)

Figure 4. Comparison between the percentage of unrecorded infectives obtained using the SIR model and the percentages of asymptomatic infectives in three previous investigations conducted in China, Japan and Vo’ (Italy). The error bars represent the 95% confidence intervals.

The result obtained with the SIR model appears to be affected by a relatively small error in comparison to the errors in the other studies (Figure 4). The reason is that the 95% confidence interval associated with our finding only represents the uncertainty intrinsic to the mathematical model, excluding the error on the number, provided by the Italian Ministry of Health [5], of the removed infectives *R*(*t*) at time *t*. This number was probably underestimated because of the difficulty of administering swabs or serological tests to all the suspect cases or even to subjects with overt symptoms. However, we only considered the errors associated with the statistical goodness of fit in our model, being unable to evaluate the uncertainty of the data on removed infectives.

Figure 5 shows, in three Italian regions, the numbers of removed (*R*), unremoved (*U*) and total (*I**tot*) infectives, related by the equations *U* = (*R*0 − 1) · *R* and *I**tot* = *R*0 · *R*.

![Figure 5.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/07/11/2020.07.09.20149682/F5.medium.gif)

Figure 5. Fits of the number of removed, unrecorded and total infectives in Lombardy, Emilia-Romagna and Sardinia according to the Kermack-McKendrick model.
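The three curves of Figure 5 follow from the fitted *R*(*t*) once the basic reproduction number is known. The sketch below illustrates this in Python rather than in the Mathematica code used for the study. The constants `c1`, `c2`, `c3` are hypothetical placeholders, not the fitted values of Table 1, and the function names are chosen here for illustration only.

```python
import math


def removed(t, c1, c2, c3):
    """Removed infectives R(t) = c1 * [tanh(c2*t - c3) + tanh(c3)]."""
    return c1 * (math.tanh(c2 * t - c3) + math.tanh(c3))


def removed_rate(t, c1, c2, c3):
    """Derivative dR/dt = c1 * c2 * sech^2(c2*t - c3), written with tanh to avoid overflow."""
    return c1 * c2 * (1.0 - math.tanh(c2 * t - c3) ** 2)


def infective_curves(t, c1, c2, c3, r0):
    """Return (R, U, I_tot) at time t, using U = (R0 - 1) * R and I_tot = R0 * R."""
    r = removed(t, c1, c2, c3)
    return r, (r0 - 1.0) * r, r0 * r


# Hypothetical fit constants (NOT taken from Table 1) and the mean R0 of the text.
c1, c2, c3, r0 = 40_000.0, 0.05, 2.0, 2.10
t_peak = c3 / c2  # time of the maximum daily rate, as stated in Appendix C

for day in (0, t_peak, 2 * t_peak):
    r, u, i_tot = infective_curves(day, c1, c2, c3, r0)
    print(f"t = {day:5.1f}  R = {r:9.0f}  U = {u:9.0f}  I_tot = {i_tot:9.0f}")
```

With this parametrisation *R*(0) = 0 by construction, the daily rate is largest at *t*peak = *c*3/*c*2, and the final number of removed infectives tends to *c*1 · [1 + tanh(*c*3)].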
In Appendix D, the Kermack-McKendrick model was also used to compute the effective reproduction number *R*eff(*t*) and to evaluate the time corresponding to the threshold *R*eff(*t*) = 1 at which the epidemic starts to decline.

## Discussion

The speed at which an infection spreads is strongly influenced by the number of undocumented infected individuals, who contribute to disseminating the virus without being diagnosed as positive. This study proved that in any epidemic the fraction of unrecorded infectives, compared to the total number of infections, is given by the approximated expression ![Graphic][20], which only depends on the basic reproduction number *R*0.

The analytical expression of *R*0 found in Appendix B was exploited to compute the basic reproduction number in three Italian regions (Lombardy, Emilia-Romagna and Sardinia); the corresponding mean value *R*0 = 2.10 (95% confidence interval, 2.09 – 2.11) overlaps well with the result *R*0 = 2.2 (1.4 − 3.9) found in China [8] and the result *R*0 = 2.28 (2.06 – 2.52) obtained in Japan on board a cruise ship [9]. In Appendix D, the Kermack-McKendrick model was also used to compute the effective reproduction number *R*eff(*t*), defined e.g. in [10].

By exploiting the aforesaid mean value of *R*0, we found that the percentage of unrecorded infectives was about ![Graphic][21] (95% confidence interval, 52.2% – 52.6%) of the total infectives. The assumption that the population size must be much larger than the number of infected subjects corresponds to a percent error lower than 1% on the undocumented fraction of infectives.

As shown in Figure 4, the percentage of undocumented infectives obtained in this study overlaps well with the percentages of asymptomatic infectives found in previous investigations [2, 3, 4], confirming that the fraction of unremoved infectives is considerable and may have a strong influence on the dynamics of the epidemic.
In a study conducted in Vo’ [2], a small town in Veneto (Italy), most inhabitants were tested through nasopharyngeal swabs in two consecutive surveys; the mean percentage of asymptomatic infectives corresponded to 43.2% (32.2% - 54.7%) of the total of SARS-CoV-2 infections. Important findings in this study were also that the viral load in asymptomatic infections did not significantly differ from that of symptomatic infections and that asymptomatic infectives can transmit the virus [2].

An investigation performed on the passengers of the Diamond Princess [3], a cruise ship in Yokohama (Japan), revealed that from the start of the epidemic the percentage of asymptomatic infectives on board the ship was 50.5% (46.5% - 54.4%) of the total infectives.

One of the first studies [4] to reveal the crucial role of undocumented infections in the Covid-19 pandemic estimated the undocumented fraction of infectives on the basis of a mathematical model connecting mobility data and observations of reported infections within China. The percentage of undocumented infectives turned out to be *U* = 86.2% (81.6%– 89.8%) of the total number of positive cases. However, in that study the transmission rate for undocumented infectives was assumed to be *μ* = 55% (46% − 62%) of the transmission rate in symptomatic infectives [4]. By contrast, we assumed that all infected subjects, with or without symptoms, have the same viral load, as confirmed by the investigation in Vo’ [2], and can transmit the virus at the same rate. Under this assumption, the effective percentage *U*eff of undocumented infectives is given by *U*eff = *μ* · *U* = 47% (38% − 56%).

Another study [11] investigated 350 attendees of a wedding in Jordan, 76 of whom tested positive for SARS-CoV-2. Among them, 36 individuals were asymptomatic, i.e. 47.4% (35.8% - 59.2%) of the total number of infected subjects.
The studies [2, 3, 11] were based on laboratory tests performed in small communities (the inhabitants of Vo’ in Italy, the passengers of a cruise ship in Japan and the attendees of a wedding in Jordan, respectively) where the Covid-19 infection had spread. By contrast, the study in China [4] was based on a mathematical model comparing mobility data and infection diffusion within China after the start of the Covid-19 epidemic.

A Review [12] of the available evidence on asymptomatic SARS-CoV-2 infectives found that asymptomatic subjects accounted for approximately 40% to 45% of the total number of infections and could transmit the virus to others. The authors of the Review also pointed out that the high frequency of asymptomatic infections could at least partly explain the rapid spread of the virus, since infected subjects who feel and look well are likely to have more interaction with others than symptomatic infectives.

The results obtained in the aforementioned investigations [2, 3, 11, 12] concerned *asymptomatic* infected subjects, while the results found in our study included all the *undocumented* infectives: asymptomatic subjects as well as presymptomatic individuals and symptomatic subjects who had not tested positive for several reasons (e.g. the shortage of nasopharyngeal swabs). This can explain why the percentages of asymptomatic infected subjects in those studies [2, 3, 11, 12] turned out to be somewhat lower than the percentage we found for all the undocumented infectives.

The 95% confidence intervals of the epidemiological parameters reported in Table 1 were only associated with the error intrinsic to the mathematical model considered in this study. The uncertainty on the data concerning the removed infectives was not included, although the number of recorded positive cases was probably underestimated as a consequence of the low frequency of swabs and serological tests administered to the population in most Italian regions.
## Conclusions

Our derivation of the percentage of undocumented infectives only relied on the SIR model, a cornerstone in the study of infectious disease dynamics. Despite its simplicity, the SIR model describes the global dynamics of an epidemic and allows several epidemiological parameters to be evaluated. However, more complex and realistic generalisations of the SIR model could be introduced to give a better picture of real epidemics.

The general expression of the percentage of undocumented infectives found in this study only requires the knowledge of the basic reproduction number *R*0. Other methods involve numerous variables in order to provide a more accurate description of the epidemic; however, they also require specific assumptions on unknown parameters of the underlying mathematical framework.

The main conclusion which can be drawn from the results obtained in this study is that unrecorded infections play a key role in the transmission of SARS-CoV-2. The high percentage of undocumented infections poses a major challenge for the control of Covid-19 and highlights the necessity to carefully monitor and adjust social distancing and other preventive measures until a vaccine is found.

## Data Availability

All the data about the Covid-19 epidemic in Italy are available from the database of the Italian Ministry of Health and Civil Protection Department.

## Acknowledgments

We are grateful to Anna Maria Koopmans for translations, professional writing assistance and preparation of the manuscript.

## Appendices

### A. SIR model

The equations describing the SIR model are:

![Formula][22]

where *r* > 0 is the infection rate and *a* > 0 is the removal rate of infectives.
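To make the role of the two rates concrete, the system can be integrated numerically. The sketch below assumes the standard form of the SIR equations given in [6], d*S*/d*t* = −*r*·*S*·*I*, d*I*/d*t* = *r*·*S*·*I* − *a*·*I*, d*R*/d*t* = *a*·*I*, and uses a classical fourth-order Runge-Kutta step. The function name `simulate_sir` and all parameter values are hypothetical and chosen only for illustration; they are not the values fitted in this study.

```python
def simulate_sir(n, i0, r, a, dt=0.05, t_max=400.0):
    """Integrate the SIR equations with classical RK4.

    Returns a list of (t, S, I, R) tuples. Assumes R(0) = 0 and S(0) = n - i0.
    """
    def deriv(s, i):
        new_infections = r * s * i
        return -new_infections, new_infections - a * i, a * i

    s, i, rem = float(n - i0), float(i0), 0.0
    history = [(0.0, s, i, rem)]
    steps = int(round(t_max / dt))
    for k in range(1, steps + 1):
        # The right-hand side depends only on S and I, so R is advanced alongside.
        k1 = deriv(s, i)
        k2 = deriv(s + 0.5 * dt * k1[0], i + 0.5 * dt * k1[1])
        k3 = deriv(s + 0.5 * dt * k2[0], i + 0.5 * dt * k2[1])
        k4 = deriv(s + dt * k3[0], i + dt * k3[1])
        s += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        i += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        rem += dt * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]) / 6.0
        history.append((k * dt, s, i, rem))
    return history


# Hypothetical parameters: population, initial infectives, removal rate per day.
N, I0, A = 1_000_000, 10, 0.1
S0 = N - I0
R_RATE = 2.10 * A / S0  # infection rate chosen so that S0 / (a / r) = 2.10

history = simulate_sir(N, I0, R_RATE, A)
t_end, s_end, i_end, rem_end = history[-1]
print(f"after {t_end:.0f} days: S = {s_end:.0f}, I = {i_end:.2f}, R = {rem_end:.0f}")
```

The three right-hand sides sum to zero, so the integration preserves *S* + *I* + *R* up to rounding error, which gives a simple sanity check on any implementation.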
At any time *t* the sum of *S*(*t*), *I*(*t*) and *R*(*t*) is equal to *N*, the population size:

![Formula][23]

The initial conditions are:

![Formula][24]

By dividing the first and third equations of the SIR model and introducing the relative removal rate *ρ* = *a*/*r*, one obtains

![Formula][25]

Following the Kermack-McKendrick model, if the population size is much larger than the number of infectious subjects, ![Graphic][26] is small and *S*(*t*) can be approximated by

![Formula][27]

From the constraint *N* = *S* + *R* + *I* it follows that the number of infectives *I*(*t*) can be expressed as *I*(*t*) = *N* − *S*(*t*) − *R*(*t*). The third equation of the SIR model then becomes

![Formula][28]

By integrating the previous equation, one obtains

![Formula][29]

where

![Formula][30]

Finally, the rate ![Graphic][31] of new removed infectives per unit of time is given by

![Formula][32]

The SIR model assumes that the removal rate of infectives *a* and the infection rate *r* do not vary during the epidemic; consequently, the calculated curves conform only roughly to the observed data. Conclusions concerning the true values of the constants *a*, *r* and *S*0, as well as the basic reproduction number ![Graphic][33], should not be drawn from their direct relationships with the parameters of the numerical fit.

The total number of infected subjects *I**tot*(*t*) at time *t* can be computed by adding the initial infectives *I*0 to the integral of the infectives *I*(*t*) from *t* = 0 to *t*:

![Formula][34]

From the third differential equation of the SIR model, the number of infectives *I*(*t*) at time *t* can be written in terms of the rate of new removed infections: ![Graphic][35], where *a* is the removal rate of infective subjects to the removed class. If we assume that *a* is constant in time and that *S*0 ≅ *N*, i.e. *I*0 ≅ 0, the total number of infectives *I**tot*(*t*) turns out to be

![Formula][36]

By comparing this expression of *I**tot*(*t*) with the equation *I**tot*(*t*) ≅ *R*0 · *R*(*t*) obtained in the Methods section, one finally obtains ![Graphic][37].

The SIR model provides an oversimplified description of epidemic dynamics. Generalisations of it may be necessary to obtain a more accurate picture of real epidemics.

### B. Basic reproduction number *R*0

From the differential equations of the SIR model, it turns out that an epidemic occurs if *S*0 > *ρ*, where *ρ* is the relative removal rate and *S*0 is the initial value of the number *S* of susceptible subjects. The critical parameter *R*0 = *S*0/*ρ* is the basic reproduction number; it represents the number of secondary infections from one primary infection in a wholly susceptible population. If *R*0 > 1 an epidemic ensues, if *R*0 < 1 no epidemic can occur.

From the definition of the basic reproduction number *R*0, it follows that each primary contagious case produces |*R*0 − 1| new secondary cases in a completely susceptible population. In a neighbourhood of the initial time *t* = 0, the basic reproduction number *R*0 can be assumed to be constant. The number of new infectives *I*(*t*) ≡ *I**t*, at any time *t* in the neighbourhood of *t* = 0, is |*R*0 − 1| times the number of infectives *I**t*−1 at the previous time *t* − 1, i.e. *I**t* = |*R*0 − 1| · *I**t*−1. By iterating this procedure up to the initial time *t* = 0, when *I*(0) = *I*0, one gets:

![Formula][38]

By inverting the equation *I**t* = |*R*0 − 1|^*t* · *I*0 and assuming *t* in a neighbourhood of *t* = 0, the basic reproduction number turns out to be:

![Formula][39]

where the plus and minus signs correspond to a growing or a declining epidemic, with *R*0 > 1 or *R*0 < 1, respectively.
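For readers without access to the typeset formulas, the iteration and its inversion can be written out as follows. This is a reconstruction from the relations stated in the text of this Appendix, under the same assumption of a constant basic reproduction number near the initial time; it is not a copy of the authors' typeset equations.

```latex
\begin{align}
I_t &= |R_0 - 1| \, I_{t-1} = |R_0 - 1|^{2} \, I_{t-2} = \dots = |R_0 - 1|^{t} \, I_0 , \\
|R_0 - 1| &= \left( \frac{I_t}{I_0} \right)^{1/t} , \\
R_0 &= 1 \pm \left( \frac{I_t}{I_0} \right)^{1/t} .
\end{align}
```

The plus sign applies to a growing epidemic and the minus sign to a declining one, as stated above.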
Since ![Graphic][40] (as follows from the third equation of the SIR model, Appendix A), the previous expression of *R*0 can be written as

![Formula][41]

where ![Graphic][42] is the rate of new removed infectives per unit of time.

### C. Data fit

The number *R*(*t*) of removed infectives against time *t* can be fitted by the curve

![Formula][43]

where the parameters *c*1, *c*2, *c*3 are related to four epidemiological characteristics: the removal rate of infectives *a*, the infection rate *r*, the initial number of susceptible subjects *S*0 and the population size *N*.

The initial number of removed infectives is *R*(*t* → 0) = 0, while their final number *R*(*t* → ∞) is

![Formula][44]

The rate of new removed infectives per unit of time is

![Formula][45]

The time *t*peak corresponding to the maximum of *dR*/*dt*, i.e. the inflection point of the *R*(*t*) curve, is:

![Formula][46]

The maximum rate of new detected infections per unit of time is

![Formula][47]

The number of removed infectives *R*(*t*) at time *t*peak turns out to be

![Formula][48]

The basic reproduction number *R*0 is given by

![Formula][49]

The best fit of the Covid-19 data in Lombardy, Emilia-Romagna and Sardinia was obtained through the “NonlinearModelFit” function of Wolfram Mathematica 12.1, which also provided the 95% confidence intervals of the epidemiological parameters. The adjusted *R*-squared, measuring the goodness of fit, turned out to be about 0.999 in all the Italian regions considered in this study.

### D. Effective reproduction number

The time course of an epidemic can be described by the effective reproduction number *R*eff(*t*), which is defined as the average number of new secondary infected cases per primary case at time *t*. *R*eff(*t*) represents the time development of the basic reproduction number *R*0 due to the decrease of susceptible individuals and the implementation of control measures.
If *R*eff(*t*) < 1, the epidemic is declining and can be considered as under control; the opposite occurs if *R*eff(*t*) > 1. The effective reproduction number *R*eff(*t*) is given by

![Formula][50]

The assumption *I*0 ≅ 0 yields *N* ≅ *S*0 and *S*(*t*) ≅ *S*0 − *I**tot*(*t*). The effective reproduction number can then be expressed as

![Formula][51]

The minimum number of initial susceptible individuals *S*0 cannot be less than the final number of total infectives *I**tot*(*t* → ∞) for an infection rate *r* equal to one (*r* = 1); analogously, the maximum value of *S*0 cannot exceed the population size *N*: *I**tot*(*t* → ∞) ≤ *S*0 ≤ *N*. If we require that the limit of *R*eff(*t*) as *t* → ∞ is zero, then *S*0 must be assumed equal to its lower bound, *I**tot*(*t* → ∞), and the effective reproduction number becomes

![Formula][52]

As discussed in the Methods section, *I**tot*(*t*) ≅ *R*0 · *R*(*t*) if the population size is much larger than the number of infected subjects; in this case the previous equation can be written as:

![Formula][53]

The following Figure represents the effective reproduction number *R*eff(*t*) against time *t* in three Italian regions: Lombardy, Emilia-Romagna and Sardinia.
![Figure6](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/07/11/2020.07.09.20149682/F6.medium.gif)

The Kermack-McKendrick model of *R*(*t*) discussed in Appendix A can be linearized in the neighbourhood of the time *t*peak corresponding to the maximum of the rate ![Graphic][54] of new removed cases per unit of time: *R*(*t*) − *R*(*t*peak) = *R*′(*t*peak) · (*t* – *t*peak), where ![Graphic][55]

By substituting *t* with *t*1, corresponding to the threshold value *R*eff(*t*1) = 1, one can compute the time difference Δ*t* = *t*1 − *t*peak:

![Formula][56]

The number of removed infectives *R*(*t*1) at time *t* = *t*1 can be obtained from the equation expressing *R*eff(*t*) in terms of *R*(*t*):

![Formula][57]

By substituting *R*(*t*1) into the equation of Δ*t* and expressing *R*(*t*peak), *R*′(*t*peak) and *R*(*t* → ∞) in terms of the parameters *c*1, *c*2, *c*3 of the *R*(*t*) fit discussed in Appendix C, the difference Δ*t* between times *t*1 and *t*peak becomes:

![Formula][58]

Since *t*peak = *c*3/*c*2 (Appendix C), the time *t*1 corresponding to *R*eff(*t*1) = 1 turns out to be

![Formula][59]

The following Table reports the 95% confidence interval for the threshold value *R*eff(*t*1) = 1 of the effective reproduction number and the corresponding time *t*1 (both in days since the start of the epidemic and as a calendar date) in the Italian regions considered in this study.

[Table2](http://medrxiv.org/content/early/2020/07/11/2020.07.09.20149682/T2)

* Received July 9, 2020.
* Revision received July 9, 2020.
* Accepted July 11, 2020.
* © 2020, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), CC BY-NC 4.0, as described at [http://creativecommons.org/licenses/by-nc/4.0/](http://creativecommons.org/licenses/by-nc/4.0/)

## References

1. Kermack WO and McKendrick AG. Contributions to the Mathematical Theory of Epidemics. Proc R Soc Lond A 1933; 141:94–122. Doi: [https://doi.org/10.1098/rspa.1933.0106](https://doi.org/10.1098/rspa.1933.0106).
2. Lavezzo E, Franchin E, Ciavarella C, Cuomo-Dannenburg G, Barzon L, Del Vecchio C, et al. Suppression of COVID-19 outbreak in the municipality of Vo’, Italy. MedRxiv preprint. Doi: [https://doi.org/10.1101/2020.04.17.20053157](https://doi.org/10.1101/2020.04.17.20053157).
3. Mizumoto K, Kagaya K, Zarebski A and Chowell G. Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020. Euro Surveill. 2020;25(10):pii=2000180. Doi: [https://doi.org/10.2807/1560-7917](https://doi.org/10.2807/1560-7917).
4. Li R, Pei S, Chen B, Song Y, Zhang T, Yang W, et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science 2020 (Epub 2020 Mar 16); 368(6490):489–493. Doi: [https://doi.org/10.1126/science.abb3221](https://doi.org/10.1126/science.abb3221).
5. Data about Covid-19 epidemic in Italian regions. Italian Ministry of Health and Civil Protection Department. [https://github.com/pcm-dpc/COVID-19/tree/master/schede-riepilogative/regioni](https://github.com/pcm-dpc/COVID-19/tree/master/schede-riepilogative/regioni).
6. Murray JD. Mathematical Biology. I: An Introduction (Third Edition, 2002). New York: Springer-Verlag.
7. Wolfram Research, Inc. Mathematica 12.1 (Trial Version). Champaign, Illinois, US (2020).
8. Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, et al. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus–Infected Pneumonia. N Engl J Med 2020; 382:1199–1207. Doi: [https://doi.org/10.1056/NEJMoa2001316](https://doi.org/10.1056/NEJMoa2001316).
9. Zhang S, Diao MY, Yu W, Pei L, Lin Z and Chen D. Estimation of the reproductive number of novel coronavirus (COVID-19) and the probable outbreak size on the Diamond Princess cruise ship: A data-driven analysis. International Journal of Infectious Diseases 93 (2020) 201–204. Doi: [https://doi.org/10.1016/j.ijid.2020.02.033](https://doi.org/10.1016/j.ijid.2020.02.033).
10. Nishiura H and Chowell G. The Effective Reproduction Number as a Prelude to Statistical Estimation of Time-Dependent Epidemic Trends. In: Chowell G, Hyman JM, Bettencourt LMA, Castillo-Chavez C (eds). Mathematical and Statistical Estimation Approaches in Epidemiology. Springer, Dordrecht (2009). Doi: [https://doi.org/10.1007/978-90-481-2313-1_5](https://doi.org/10.1007/978-90-481-2313-1_5).
11. Yusef D, Hayajneh W, Awad S, Momany S, Khassawneh B, Samrah S, et al. Large outbreak of coronavirus disease among wedding attendees, Jordan. Emerg Infect Dis. 2020 Sep [Online Publication Date: 20 May 2020]. Doi: [https://doi.org/10.3201/eid2609.201469](https://doi.org/10.3201/eid2609.201469).
12. Oran DP and Topol EJ. Prevalence of Asymptomatic SARS-CoV-2 Infection. A Narrative Review. Ann Intern Med [3 June 2020]. Doi: [https://doi.org/10.7326/M20-3012](https://doi.org/10.7326/M20-3012).

[1]: /embed/inline-graphic-1.gif
[2]: /embed/inline-graphic-2.gif
[3]: /embed/inline-graphic-3.gif
[4]: /embed/inline-graphic-4.gif
[5]: F1/embed/inline-graphic-5.gif
[6]: F1/embed/inline-graphic-6.gif
[7]: /embed/graphic-2.gif
[8]: /embed/graphic-3.gif
[9]: /embed/graphic-4.gif
[10]: /embed/graphic-5.gif
[11]: /embed/graphic-6.gif
[12]: /embed/graphic-7.gif
[13]: /embed/graphic-8.gif
[14]: /embed/graphic-9.gif
[15]: /embed/inline-graphic-7.gif
[16]: /embed/inline-graphic-8.gif
[17]: /embed/inline-graphic-9.gif
[18]: /embed/inline-graphic-10.gif
[19]: F3/embed/inline-graphic-11.gif
[20]: /embed/inline-graphic-12.gif
[21]: /embed/inline-graphic-13.gif
[22]: /embed/graphic-15.gif
[23]: /embed/graphic-16.gif
[24]: /embed/graphic-17.gif
[25]: /embed/graphic-18.gif
[26]: /embed/inline-graphic-14.gif
[27]: /embed/graphic-19.gif
[28]: /embed/graphic-20.gif
[29]: /embed/graphic-21.gif
[30]: /embed/graphic-22.gif
[31]: /embed/inline-graphic-15.gif
[32]: /embed/graphic-23.gif
[33]: /embed/inline-graphic-16.gif
[34]: /embed/graphic-24.gif
[35]: /embed/inline-graphic-17.gif
[36]: /embed/graphic-25.gif
[37]: /embed/inline-graphic-18.gif
[38]: /embed/graphic-26.gif
[39]: /embed/graphic-27.gif
[40]: /embed/inline-graphic-19.gif
[41]: /embed/graphic-28.gif
[42]: /embed/inline-graphic-20.gif
[43]: /embed/graphic-29.gif
[44]: /embed/graphic-30.gif
[45]: /embed/graphic-31.gif
[46]: /embed/graphic-32.gif
[47]: /embed/graphic-33.gif
[48]: /embed/graphic-34.gif
[49]: /embed/graphic-35.gif
[50]: /embed/graphic-36.gif
[51]: /embed/graphic-37.gif
[52]: /embed/graphic-38.gif
[53]: /embed/graphic-39.gif
[54]: /embed/inline-graphic-21.gif
[55]: /embed/inline-graphic-22.gif
[56]: /embed/graphic-41.gif
[57]: /embed/graphic-42.gif
[58]: /embed/graphic-43.gif
[59]: /embed/graphic-44.gif