The serial interval of COVID-19 from publicly reported confirmed cases
======================================================================

* Zhanwei Du
* Lin Wang
* Xiaoke Xu
* Ye Wu
* Benjamin J. Cowling
* Lauren Ancel Meyers

## Abstract

As a novel coronavirus (COVID-19) continues to emerge throughout China and threaten the globe, its transmission characteristics remain uncertain. Here, we analyze the serial intervals, defined as the time between symptom onset in an index (infector) case and symptom onset in a secondary (infectee) case, of 468 infector-infectee pairs with confirmed COVID-19 cases reported by health departments in 18 Chinese provinces between January 21, 2020, and February 8, 2020. The reported serial intervals range from -11 days to 20 days, with a mean of 3.96 days (95% confidence interval: 3.53-4.39), a standard deviation of 4.75 days (95% confidence interval: 4.46-5.07), and 12.1% of reports indicating pre-symptomatic transmission.

Keywords

* Wuhan
* coronavirus
* epidemiology
* serial interval

A new coronavirus (COVID-19) emerged in Wuhan, China in late 2019 and was declared a public health emergency of international concern by the World Health Organization (WHO) on January 30, 2020 (1). As of February 19, 2020, the WHO has reported over 75,204 COVID-19 infections and over 2,009 COVID-19 deaths (2), while key aspects of the transmission dynamics of COVID-19 remain unclear (3). The serial interval of COVID-19 is defined as the time between a primary case (infector) developing symptoms and a secondary case (infectee) developing symptoms (4,5). Robust estimates of the distribution of COVID-19 serial intervals are a critical input for determining the reproduction number, which can indicate the extent of interventions required to control an epidemic (6). However, this quantity cannot be inferred from daily case count data alone (7).
To obtain reliable estimates of the serial interval, we compiled data on 468 COVID-19 transmission events reported in mainland China outside of Hubei Province between January 21, 2020, and February 8, 2020. Each report consists of a probable date of symptom onset for both the infector and infectee as well as the probable locations of infection for both cases. The data include only confirmed cases that were compiled from online reports from 18 provincial centers for disease control and prevention.

Notably, 59 of the 468 reports indicate that the infectee developed symptoms earlier than the infector. Thus, pre-symptomatic transmission may be occurring, i.e., infected persons may be infectious before their symptoms appear. In light of these negative-valued serial intervals, we assume that COVID-19 serial intervals follow a normal distribution rather than the more commonly assumed gamma or Weibull distributions that are limited to strictly positive values (8,9). We estimate a mean serial interval for COVID-19 of 3.96 days [95% CI 3.53-4.39] with a standard deviation of 4.75 days [95% CI 4.46-5.07], which is considerably lower than reported mean serial intervals of 8.4 days for SARS (9) and 12.6 days (10) to 14.6 days (11) for MERS. The mean serial interval is slightly but not significantly longer when the index case is imported (4.06 days [95% CI 3.55-4.57]) versus locally infected (3.66 days [95% CI 2.84-4.47]).

Combining these findings with published estimates for the early exponential growth rate of COVID-19 in Wuhan (12,13), we estimate a basic reproduction number (*R*) of 1.33 (6), which is lower than published estimates that assume a mean serial interval exceeding seven days (13–15). These estimates reflect reported symptom onset dates for 752 cases from 93 Chinese cities, who range in age from 1 to 90 years (mean 45.2 years and SD 17.21 years).

We note three key caveats of the analysis.
First, the data are restricted to online reports of confirmed cases and therefore may be biased towards more severe cases in areas with a high-functioning healthcare and public health infrastructure. Second, the distribution of serial intervals varies throughout an epidemic, with the time between successive cases contracting around the epidemic peak (16). To provide intuition, a susceptible person is likely to become infected more quickly if they are surrounded by two infected people rather than just one. Since our estimates are based primarily on transmission events reported during the early stages of outbreaks, we do not explicitly account for such compression and interpret the estimates as *basic* serial intervals at the outset of an epidemic. If some of the reported infections occurred amidst growing clusters of cases, our estimates may instead reflect effective serial intervals that would be expected during a period of epidemic growth. Finally, rapid isolation of symptomatic cases in some locations may have prevented longer serial intervals, potentially biasing our estimate downwards compared to serial intervals that might be observed in an uncontrolled epidemic.

Given the heterogeneity in type and reliability of these sources, we caution that our findings should be interpreted as working hypotheses regarding the infectiousness of COVID-19 requiring further validation as more data become available. The potential implications for COVID-19 control are mixed. While our lower estimates for *R* suggest easier containment, the large number of reported asymptomatic transmission events is concerning.

## Supplementary Appendix

### Data

We collected publicly available online data on 6,903 confirmed cases from 271 cities of mainland China that were available as of February 8, 2020. The data were extracted in Chinese from the websites of provincial public health departments and translated to English.
We then filtered the data for clearly indicated transmission events consisting of: (i) a known *infector* and *infectee*, (ii) reported locations of infection for both cases, and (iii) reported dates and locations of symptom onset for both cases. We thereby obtained 468 infector-infectee pairs from 93 Chinese cities between January 21, 2020 and February 8, 2020 (Figure S1). The index cases (infectors) for each pair are reported as either importations from the city of Wuhan (*N* = 239), importations from cities other than Wuhan (*N* = 106) or local infections (*N* = 122). The cases included 752 unique individuals, with 98 index cases who infected multiple people and 17 individuals who appear as both infector and infectee. They range in age from 1 to 90 years and include 386 females, 363 males and 3 cases of unreported sex.

![Figure.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/02/23/2020.02.19.20025452/F1.medium.gif)

[Figure.](http://medrxiv.org/content/early/2020/02/23/2020.02.19.20025452/F1)

Figure. Estimated serial interval distribution for COVID-19 based on 468 reported transmission events in China between January 21, 2020 and February 8, 2020. Bars indicate the number of infection events with the specified serial interval and blue lines indicate fitted normal distributions for (a) all infection events (*N* = 468) reported across 93 cities of mainland China by February 8, 2020, and (b) the subset of infection events (*N* = 122) in which both the infector and infectee were infected in the reporting city (i.e., the index case was not an importation from another city). Negative serial intervals (left of the vertical dotted lines) suggest the possibility of COVID-19 transmission from asymptomatic or mildly symptomatic cases.

![Figure S1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/02/23/2020.02.19.20025452/F2.medium.gif)

[Figure S1.](http://medrxiv.org/content/early/2020/02/23/2020.02.19.20025452/F2)

Figure S1.
Geographic composition of the infection report data set. The data consist of 468 infector-infectee pairs reported by February 8, 2020 across 93 cities in mainland China. Colors represent the number of reported events per city, which range from 1 to 72, with an average of 5.03 (SD 8.54) infection events. The 71 cities with fewer than five events are colored in blues; the 22 cities with at least five events are colored in shades of orange.

## Inference Methods

### Estimating serial interval distribution

For each pair, we calculated the number of days between the reported symptom onset date for the infector and the reported symptom onset date for the infectee. Negative values indicate that the infectee developed symptoms before the infector. We then used the fitdist function in MATLAB (17) to fit a normal distribution to all 468 observations. It finds unbiased estimates of the mean and standard deviation, with 95% confidence intervals. We applied the same procedure to estimate the means and standard deviations with the data stratified by whether the index case was imported or infected locally.

### Estimating the basic reproduction number (*R*)

Given an epidemic growth rate *r* and a normally distributed serial interval with mean (*μ*) and standard deviation (*σ*), the basic reproduction number is given by (6)

$$R = \exp\left(r\mu - \tfrac{1}{2} r^{2} \sigma^{2}\right)$$

Assuming our point estimates for the mean and standard deviation of the serial interval distribution (Table S1) and a recently published estimate for the exponential growth rate of COVID-19 infections in Wuhan of 0.10 per day (13), we estimate an *R* of 1.33.

## Supplementary Analysis

To facilitate interpretation and future analyses, we summarize key characteristics of the COVID-19 infection report data set.

### Age distribution

Of the 737 unique cases in the data set, 1.7%, 3.5%, 54.1%, 26.1% and 14.5% were ages 0-4, 5-17, 18-49, 50-64, and over 65 years, respectively.
Across all transmission events, approximately one third occurred between adults ages 18 to 49, ∼92% had an adult infector (over 18), and over 99% had an adult infectee (over 18) (Table S2).

### Secondary case distribution

Across the 468 transmission events, there were 301 unique infectors. The mean number of transmission events per infector is 1.55 (Figure S2), with a maximum of 16 secondary infections reported from a 40-year-old male in Liaocheng city of Shandong Province.

![Figure S2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/02/23/2020.02.19.20025452/F3.medium.gif)

[Figure S2.](http://medrxiv.org/content/early/2020/02/23/2020.02.19.20025452/F3)

Figure S2. Number of infections per unique index case in the infection report data set. There are 301 unique infectors across the 468 infector-infectee pairs. The number of transmission events reported per infector ranges from 1 to 16, with ∼55% having only one.

### Geographic distribution

The 468 transmission events were reported from 93 Chinese cities in 17 Chinese provinces and Tianjin (Figure S3). There are 22 cities with at least five infection events and 71 cities with fewer than five infection events in the sample. The maximum number of reports from a city is 72 for Shenzhen, which reported 339 cumulative cases as of February 8, 2020.

## Data Availability

All data used in this study are from open-access sources.

## Author Bio

Dr. Du is a postdoctoral researcher in the Department of Integrative Biology at the University of Texas at Austin. He develops mathematical models to elucidate the transmission dynamics, surveillance, and control of infectious diseases.

View this table: [Table S1.](http://medrxiv.org/content/early/2020/02/23/2020.02.19.20025452/T1)

Table S1. Estimated serial interval distributions based on location of index infection.
We assume that the serial intervals follow normal distributions and report the estimated means and standard deviations for (a) all 468 infector-infectee pairs reported from 93 cities in mainland China by February 8, 2020, (b) a subset of 122 infection events in which the index case was infected locally, and (c) a subset of 346 infection events in which the index case was an importation from another city. The rightmost column provides the proportion of infection events in which the secondary case developed symptoms prior to the index case.

View this table: [Table S2.](http://medrxiv.org/content/early/2020/02/23/2020.02.19.20025452/T2)

Table S2. Age distribution for the 457 of 468 infector-infectee pairs. Each value denotes the number of infector-infectee pairs in the specified age combination. Age was not reported for the remaining 11 pairs.

## Acknowledgments

We acknowledge the financial support from NIH (U01 GM087719) and the National Natural Science Foundation of China (61773091).

* Received February 19, 2020.
* Revision received February 19, 2020.
* Accepted February 23, 2020.
* © 2020, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/)

## References

1. WHO | Pneumonia of unknown cause – China. 2020 Jan 30 [cited 2020 Feb 18]; Available from: [https://www.who.int/csr/don/05-january-2020-pneumonia-of-unkown-cause-china/en/](https://www.who.int/csr/don/05-january-2020-pneumonia-of-unkown-cause-china/en/)
2. World Health Organization. Coronavirus disease 2019 (COVID-19): situation report, 30.
2020; Available from: [https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200219-sitrep-30-covid-19.pdf?sfvrsn=6e50645_2](https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200219-sitrep-30-covid-19.pdf?sfvrsn=6e50645_2)
3. Cowling BJ, Leung GM. Epidemiological research priorities for public health control of the ongoing global novel coronavirus (2019-nCoV) outbreak. Euro Surveill [Internet]. 2020 Feb 13; Available from: [http://dx.doi.org/10.2807/1560-7917.ES.2020.25.6.2000110](http://dx.doi.org/10.2807/1560-7917.ES.2020.25.6.2000110)
4. Giesecke J. Modern infectious disease epidemiology. CRC Press; 2017.
5. Svensson A. A note on generation times in epidemic models. Math Biosci. 2007 Jul;208(1):300–11. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.mbs.2006.10.010&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=17174352&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F02%2F23%2F2020.02.19.20025452.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000248196400017&link_type=ISI)
6. Wallinga J, Lipsitch M. How generation intervals shape the relationship between growth rates and reproductive numbers. Proc Biol Sci. 2007 Feb 22;274(1609):599–604. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1098/rspb.2006.3754&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=17476782&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F02%2F23%2F2020.02.19.20025452.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000243354200019&link_type=ISI)
7. Vink MA, Bootsma MCJ, Wallinga J. Serial Intervals of Respiratory Infectious Diseases: A Systematic Review and Analysis [Internet]. Vol. 180, American Journal of Epidemiology. 2014. p. 865–75. Available from: [http://dx.doi.org/10.1093/aje/kwu209](http://dx.doi.org/10.1093/aje/kwu209) [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/aje/kwu209&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25294601&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F02%2F23%2F2020.02.19.20025452.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000344600900001&link_type=ISI)
8. Kuk AYC, Ma S. The estimation of SARS incubation distribution from serial interval data using a convolution likelihood. Stat Med. 2005 Aug 30;24(16):2525–37. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.2123&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=16013037&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F02%2F23%2F2020.02.19.20025452.atom)
9. Lipsitch M, Cohen T, Cooper B, Robins JM, Ma S, James L, et al. Transmission dynamics and control of severe acute respiratory syndrome. Science. 2003 Jun 20;300(5627):1966–70. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6Mzoic2NpIjtzOjU6InJlc2lkIjtzOjEzOiIzMDAvNTYyNy8xOTY2IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjAvMDIvMjMvMjAyMC4wMi4xOS4yMDAyNTQ1Mi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=)
10. Cowling BJ, Park M, Fang VJ, Wu P, Leung GM, Wu JT. Preliminary epidemiological assessment of MERS-CoV outbreak in South Korea, May to June 2015 [Internet]. Vol. 20, Eurosurveillance. 2015. Available from: [http://dx.doi.org/10.2807/1560-7917.es2015.20.25.21163](http://dx.doi.org/10.2807/1560-7917.es2015.20.25.21163)
11. Park SH, Kim Y-S, Jung Y, Choi SY, Cho N-H, Jeong HW, et al. Outbreaks of Middle East Respiratory Syndrome in Two Hospitals Initiated by a Single Patient in Daejeon, South Korea. Infect Chemother. 2016 Jun;48(2):99–107.
12. Jung S-M, Akhmetzhanov AR, Hayashi K, Linton NM, Yang Y, Yuan B, et al. Real time estimation of the risk of death from novel coronavirus (2019-nCoV) infection: Inference using exported cases [Internet]. Available from: [http://dx.doi.org/10.1101/2020.01.29.20019547](http://dx.doi.org/10.1101/2020.01.29.20019547)
13. Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, et al. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia. N Engl J Med [Internet]. 2020 Jan 29; Available from: [http://dx.doi.org/10.1056/NEJMoa2001316](http://dx.doi.org/10.1056/NEJMoa2001316)
14. Tuite AR, Fisman DN. Reporting, Epidemic Growth, and Reproduction Numbers for the 2019 Novel Coronavirus (2019-nCoV) Epidemic [Internet]. Annals of Internal Medicine. 2020. Available from: [http://dx.doi.org/10.7326/m20-0358](http://dx.doi.org/10.7326/m20-0358)
15. Wu JT, Leung K, Leung GM. Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study. Lancet [Internet]. 2020 Jan 31; Available from: [http://dx.doi.org/10.1016/S0140-6736(20)30260-9](http://dx.doi.org/10.1016/S0140-6736(20)30260-9)
16. Kenah E, Lipsitch M, Robins JM. Generation interval contraction and epidemic data analysis. Math Biosci. 2008 May;213(1):71–9. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.mbs.2008.02.007&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=18394654&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F02%2F23%2F2020.02.19.20025452.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000256231600008&link_type=ISI)
17. Fit probability distribution object to data - MATLAB fitdist [Internet]. [cited 2020 Feb 19]. Available from: [https://www.mathworks.com/help/stats/fitdist.html](https://www.mathworks.com/help/stats/fitdist.html)

[1]: /embed/graphic-1.gif
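
As a worked illustration of the arithmetic in the Inference Methods section, the following Python sketch recomputes the reported *R* from the point estimates stated in the text (growth rate 0.10 per day, mean 3.96 days, standard deviation 4.75 days) and checks the reported confidence interval for the mean. It is a minimal sketch, not the original analysis, which used MATLAB's fitdist. The function names `reproduction_number` and `mean_confidence_interval` are hypothetical, and the interval uses a large-sample normal approximation, which is an assumption made here for simplicity.

```python
import math
from statistics import NormalDist


def reproduction_number(r: float, mu: float, sigma: float) -> float:
    """R for a normally distributed serial interval (hypothetical helper).

    Uses R = exp(r*mu - 0.5 * r^2 * sigma^2), the relationship between the
    exponential growth rate r and R when the interval has mean mu and
    standard deviation sigma.
    """
    return math.exp(r * mu - 0.5 * (r * sigma) ** 2)


def mean_confidence_interval(mean: float, sd: float, n: int, level: float = 0.95):
    """Approximate CI for the mean (assumption: normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Point estimates quoted in the text.
GROWTH_RATE = 0.10   # per day, growth rate in Wuhan
MEAN_SI = 3.96       # days
SD_SI = 4.75         # days
N_PAIRS = 468        # infector-infectee pairs

R = reproduction_number(GROWTH_RATE, MEAN_SI, SD_SI)
ci_low, ci_high = mean_confidence_interval(MEAN_SI, SD_SI, N_PAIRS)

print(f"R = {R:.2f}")
print(f"95% CI for mean serial interval: {ci_low:.2f}-{ci_high:.2f} days")
```

Under these inputs the sketch gives an *R* of about 1.33 and an interval of roughly 3.53 to 4.39 days, matching the values reported above. Because the variance term is subtracted in the exponent, a wider serial interval distribution lowers *R* for a fixed growth rate and mean.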