Estimation of the COVID-19 Average Incubation Time: Systematic Review, Meta-analysis and Sensitivity Analyses

Objectives : We aim to provide sensible estimates of the average incubation time of COVID-19 by capitalizing available estimates reported in the literature and explore diﬀerent ways to accommodate heterogeneity involved with the reported studies. Methods : We search through online databases to collect the studies about estimates of the average incubation time and conduct meta-analyses to accommodate heterogeneity of the studies and the publication bias. Cochran’s heterogeneity statistic Q and Higgin’s & Thompson’s I 2 statistic are employed. Subgroup analyses are conducted using mixed eﬀects models and publication bias is assessed using the funnel plot and Egger’s test. Results : Using all those reported mean incubation estimates, the average incubation time is estimated to be 6.43 days with a 95% conﬁdence interval (CI) (5.90, 6.96), and using all those reported mean incubation estimates together with those transformed


Introduction
The coronavirus disease 2019 (COVID-19) has presented tremendous impact on public health as well as economy.Much research has been conducted to understand various clinical characteristics of COVID-19.One interesting question concerns the COVID-19 incubation time which is defined as the time from infection of SARS-CoV-2 to the onset of clinical symptoms (Giesecke, 2017).As the incubation time varies from patient to patient, it is useful to estimate the average incubation time of the population.
Understanding the average incubation time is of great significance for a multitude of reasons.Most obviously, knowing the average incubation time gives us a critical metric in developing strategies for isolation or quarantine.Setting a reasonable quarantine time is directly based on the distribution of incubation times and having a sensible estimate of the average incubation time helps us come up with effective intervention steps.Moreover, in developing epidemic models such as SEIR, the average incubation time is taken as an important parameter to model transmission features of SARS-CoV-2, and different estimates of this parameter may greatly affect the outcomes (Brookmeyer, 2014).
Due to its importance, many studies have been carried out to estimate the average incubation time for COVID-19.However, available studies do not reveal comparable estimates of the mean incubation time, and they vary considerably from 1.8 days in China (Leung, 2020) to 14 days in India (Gupta et al., 2020).Additionally, it is difficult to assess which estimate more reasonably reflects the average incubation time of the population because different studies are carried out for different subjects under different conditions.In this article, we aim to provide synthetic estimates of the average incubation time of COVID-19 by capitalizing on the report estimates in the literature and explore different ways to accommodate heterogeneity involved with the reported studies on COVID-19.
While a number of meta-analyses have offered synthetic estimates, those studies concentrated on early reports which are prior to June 2020, and some of them included a small number of studies.To overcome those limitations, in our analyses here, we conduct a thorough search covering a long study period from January 1, 2020 to May 20, 2021, which allows our study to include more diverse information on estimates of the average incubation estimate.We carry out meta-analyses for those reported mean estimates reported as well as those transformed from estimates of the median incubation time.Subgroup analyses and sensitivity analyses are conducted to investigate heterogeneity among the reported studies and the stability of the produced synthetic estimates of the mean incubation time of COVID-19.
The rest of the manuscript is organized as follows.In Section 2 we present the data collection and extraction methods and the basic characteristics of the data.In Section 3 we describe the general procedures for meta-analyses.In Section 4 we analyze the data and report the results.We conclude the article with discussions presented in the last section.Collabovid comprises publications from Elsevier, PubMed, medRxiv, bioRxiv, and arXiv.

Search Strategy and Selection Criteria
We start with an automatic search process using the pairwise combinations of phrases each from one of the following categories: (1).'incubation', 'incubation period ', and 'incubation time'; (2).' COVID-19 ', 'SARS-CoV-2 ', '2019-nCoV ', '2019nCoV ', and'Novel Coronavirus'.This process identifies 611 articles.Next, using those keywords, we manually check if the references of the resulting articles should be included in our collection of articles, yielding 17 additional articles.Then we remove 93 duplicated articles from this collection of 628 COVID-19 studies.Further, we examine each study manually by checking first the abstract and then the full text to see whether the study is about the COVID-19 incubation time.The abstract checking process excludes 375 articles.We now further examine the full text for the remaining 160 studies and retain only those studies with the information of the sample size as well as the information on one of the following categories: (a) having an estimate of the mean incubation time together with its standard error (SE) or a 95% confidence interval (CI); (b) having an estimate of the median incubation time together with a 95% CI, an interquartile range (IQR), or a range.
This step excludes 51 studies for not reporting an estimate of the mean or median incubation time, 2 studies for not reporting dispersion estimates associated with mean or median estimates, and 3 studies for not reporting the sample size.All these procedures finally lead to 104 papers which discuss estimates of the mean or median incubation time for COVID-19.
We summarize this process of gathering the COVID-19 studies about incubation information in Figure 1, which is prepared using the flow chart template developed for systematic review and meta-analysis, available at the website www.prisma-statement.org.

Data Extraction
We now report those 104 papers searched in Section 2.1 by displaying the information about the last name of the first author, the study period, the region of study subjects, and the methodology, together with the sample size, the estimate of the mean or median COVID-19 incubation time, and the associated standard error reported in the article or converted by us from using the reported 95% CI.
In Figure 2 we further display the information of those selected 104 papers, in which   In Tables 2 and 3, only information on estimates of the mean incubation time is included; and in Table 4, we show 35 studies with 36 reported estimates of median incubation time, together with the computed estimates of the mean and standard deviation (SD) using the methods described in Section 4.3 and Section S2 of the Supplementary Material.
Among the papers on meta-analysis reported in Table 2, the size of studies in each paper varies from 5 to 99, and the estimates (in days) of the mean incubation time range from 5.08 (He et al., 2020) to 6.71 (Banka and Comiskey, 2020).Of all those 16 meta-analyses, 14 are conducted for worldwide studies, one is for patients in China, and one is for patients in Asia.In terms of the distributional assumption for the incubation time, 2 papers assume a log-normal distribution, 1 paper assumes a gamma distribution, and 13 papers make other assumptions.) reports one synthetic mean incubation estimate derived from multiple studies using meta-analysis and one mean incubation estimate obtained from a single sample.The number in brackets under the heading "Sample Size" represent number of total sample size within all meta-analyses.(if the author provided.) Among the studies reported in Table 3, the sample size varies from 6 to 11,545, and the estimates of the mean incubation time range from 1.8 to 14 days.Forty-one (74.55%) studies are conducted inside China, in which 9 (16.36%)estimates are obtained from study subjects inside Hubei province, China.In terms of the methodology, 14 (25.45%)analyses are descriptive, 37 (67.27%)studies are derived from parametric models, and the rest are from non-parametric models.For those studies not reporting the standard deviation (SD) but reporting a 95% confidence interval (CI) of the mean incubation time, we use the length of the 95% confidence interval L to estimate SD: where t 0.975,n−1 is the 97.5th percentile of the student's t distribution with (n − 1) degrees of freedom, and n is the sample size of the study (Higgins et al., 2019).Reported and estimated SDs are shown in the last column of Table 3.
Among the studies reported in Table 4, the reported sample size varies from 6 to 2907, and the estimates of the median incubation time range from 2.87 to 10.00 days.The computed estimates of the mean incubation time vary from 2.87 to 9.33 days.Twenty-three (64.86%) studies are conducted inside China, in which 4 (11.11%) of them are inside Hubei province, China.In terms of the methodology, 24 (66.67%)analyses are descriptive, 11 (30.56%)studies are derived from parametric models, and one study (2.78%) uses non-parametric models.To estimate the SD, we apply (1) to those studies with 95% CI reported.For studies with only interquartile range (IQR) or range, we transform those quantities to obtain estimates of the mean incubation time and SD (Wan et al., 2014) using the formulas displayed in S2 of the Supplementary Materials.

Random Effects Model and Fixed Effect Model
Let µ denote the true mean incubation time for the population, and suppose that K independent studies are available to report an estimate of µ.For i = 1, ..., K, let y i denote the estimate of µ, and let σ i denote the associated standard error.
In conducting the meta-analysis under the fixed effect model (Jackson and White, 2018), we use a pooled average of the estimates from the K studies to obtain a synthetic estimate of µ, given by and the associated variance for M is calculated as: where w i = 1/σ 2 i is the weight for study i.
Similarly, when conducting meta-analysis under the random effects model, a synthetic estimate of µ and its associated variance, denoted M * and V M * , respectively, are still given by ( 2) and (3), except that the weight w i for study i is now replaced by w * i = 1/(σ 2 i + T 2 ) (Borenstein et al., 2009, pp.77-87),where ; Q is also Cochran's heterogeneity statistic.

Heterogeneity Test
To assess the heterogeneity among different studies, we are interested in testing the null hypothesis: H 0 : all studies target to estimate the same average incubation time. (4) One approach is to use Cochran's heterogeneity statistic Q to calculate the p-value, P (χ 2 (K− 1) > Q), with χ 2 (K − 1) representing a random variable following the χ 2 distribution of (K − 1) degrees of freedom (Cochran, 1954).
Another useful approach is based on the so-called I 2 statistic (Higgins and Thompson, 2002), defined as If I 2 > 50%, a random effect model is preferred; otherwise, a fixed effect model is suggested.
Both Q and I 2 statistics do not depend on the scale of measurements, but their performance differently depends on K.The Q statistic is more sensitive to small values of K than the I 2 statistic does.When K is smaller than 10, the Q statistic may not perform reliably.
I 2 explores the between-study variance in a relative scale whereas Q statistic explains the variance in the absolute scale.

Forest Plot
To visualize the results from meta-analysis in contrast to the results reported by individual studies, one may employ the forest plot (Lalkhen and McCluskey, 2008).Here we use the package meta (Balduzzi et al., 2019) in R version 4.1.0to construct forest plots, which displays the information for each study, including the last name of the first author and the estimate of the mean incubation time with a 95% CI, together with the pooled average incubation estimate and a 95% CI as well as the I 2 statistics, Q statistics, and its p-value.

Subgroup Analyses
If the random effects model is suggested by the test in Section 3.2, one may further investigate the source of heterogeneity among the K studies by conducting subgroup analyses with different groupings introduced (Borenstein and Higgins, 2013).The mixed effects model is taken to reflect the differences in the mean incubation times between subgroups, where the subgroups are assumed to be fixed and within each subgroup, the true effect size (i.e., the mean incubation time here) varies across different studies.The method of testing for the between-group difference is conducted by first taking the pooled results of random effects model within each subgroup as single estimates and then applying the Cochran's heterogeneity statistic Q described in Section 3.1 (Borenstein and Higgins, 2013).
To be specific, suppose we divide all the studies into S subgroups.For i = 1, ..., S, we use the procedures for obtaining V * M and M * under the random effects model described in Section 3.1 to calculate the estimate of the mean incubation time and the associated standard error for subgroup i, denoted s i and τ i , respectively.By replacing y i with s i and the σ i with τ i in the Q statistic in Section 3.1, we obtain the Cochran's heterogeneity statistic, denoted Q * , for heterogeneity among the S subgroups.Then we apply the test procedure described in Section 3.2 to assess if the mean incubation time estimates differ significantly among the subgroups.

Risk of Bias Assessment
To evaluate the quality of the studies, it is important to assess the risk of bias, defined as the systematic error or deviation from the truth (Viswanathan et al., 2012).We adapt the risk of bias tool developed by Hoy et al. (2012) and use a 10-point checklist to assess the risk of bias for each study.In particular, in items 9 and 10 of the checklist of Hoy et al. (2012) which are about disease prevalence, we change the descriptions to reflect the information on the COVID-19 incubation times as suggested by Quesada et al. (2021).The checklist includes both external and internal bias assessments related to the sampling method, data collection, case definition, the validity of methodology, and reporting bias as suggested by Wassie et al. (2020).The answer to each question in the list is ranked as low risk (1 point) or high risk (0 point).A total score over 8 is regarded as an overall indication of low risk of bias, a total score below 5 indicates high risk of bias, and otherwise moderate risk of bias.
The details of the checklist are displayed in Section S1 of the Supplementary Material.
To display results of assessing the risk of bias, one may use the function rob.summary() in package dmetar (Harrer et al., 2019) in R version 4.1.0which outputs two summary tables.One summary table reports the proportion of studies with high/low risk of bias for each checklist question, and the other table, called RevMan risk of bias table (Higgins and Thomas, 2011), displays a more detailed view by showing the risk of bias results associated with each study for each question, where the rows correspond to the risk assessment items and the columns refer to the studies; in the display, color red or blue is used to show high and low risk of bias, respectively.

Publication Bias
To understand the results produced by the meta-analysis, we assess potential publication bias incurred in individual studies.To this end, we use the funnel plot and Egger's test (Egger et al., 1997).
The funnel plot displays the standard error against the effect size for each study.If publication bias is present, the funnel would look asymmetrical.To measure asymmetry of the funnel plot, one may employ Egger's test which involves a linear regression equation (Egger et al., 1997): for i = 1, .., K, where a and b are the intercept and slope, and i is the noise term with mean zero.Then assessing no publication bias is reflected by the null hypothesis: and the test statistic is calculated as: where â refers to the estimates of a and se(â) is the associated standard error by applying the least squares method to fit model ( 6) to the data {(y i ,σ i ): i = 1, ..., K}.Then the p-value of testing ( 7) is given by 2 • P (t(K − 2) > |t * |), where t(K − 2) represents a random variable having the t distribution with (K − 2) degrees of freedom.A small p-value indicates the presence of the publication bias.

Data Analysis
Now we apply the procedures described in Section 3 to analyze the data described in Section 2. We carry out four different analyses.Analysis 1 is conducted to those studies with the information about estimates of the mean incubation time only, whereas Analysis 2 are based on the studies with the information about estimates of the median incubation time.
Analysis 3 combines the studies in both Analyses 1 and 2, where a transformation described in Section S2 of the Supplementary Material is used to convert the estimates of the median incubation time to those of the mean incubation time.Analysis 4 is conducted to those studies which merely report meta-analysis results.
Further, using the methods in Section 3.4, we perform subgroup analyses for different regions (China and outside China, Hubei and outside Hubei), different methodologies (parametric models, non-parametric models, and descriptive analysis), and different bias levels (low risk, moderate risk, and high risk).These analyses allow us to evaluate estimates with certain heterogeneities controlled.Risk of bias assessment and publication bias assessment are conducted using methods described in Sections 3.5 and 3.6, respectively.Figure 3 shows an overall summary for all the 95 studies in Tables 3 and 4 with those 16 studies about meta-analysis in

Results of Analysis 1
Table 3 contains 4 studies (marked as an asterisk) with highly right screwed 95% CIs, in the sense that the mean is a lot closer to the lower bound than the upper bound, suggesting that the derived standard deviations using (1) are unreliable.Thus, we exclude those studies from the 59 studies summarized in Table 3 and then apply the test procedures described in Section 3.2 to the remaining 55 studies.The p-value for Cochran's test is less than 0.01 and I 2 = 99%, suggesting that the random effects model is preferred when conducting metaanalysis.Figure 5 displays the forest plot of the meta-analysis, showing that the pooled mean incubation estimate for Analysis 1 is 6.43 days with a 95% CI (5.90,6.96).By applying the method in Section 3.6, we obtain the p-value 0.33 for the Egger's test, suggesting no evidence for the presence of asymmetry in the funnel plot, displayed in Figure 6.

Results of Analysis 2
Using the test procedures described in Section 3.2, we assess the 36 transformed results shown in the last 2 columns of Table 4.We obtain that the p-value for Cochran's test is less than 0.01 and I 2 = 95%, suggesting that the random effects model is preferred when conducting meta-analysis.Using the method in Section 3.1 gives us a synthetic approximate mean incubation estimate to be 5.52 days with a 95% CI (5.06, 5.99).Applying the method in Section 3.6 yields the p-value 0.43 for the Egger's test, showing no evidence for the presence of publication bias.

Results of Analysis 3
Combining the 55 studies in Analysis 1 and 36 studies in Analysis 2 and applying the test procedures described in Section 3.2 to those 91 studies, we obtain that the p-value for Cochran's test is less than 0.01 and I 2 = 98%, suggesting that the random effects model is preferred when conducting meta-analysis.Figure 7 displays the forest plot of the metaanalysis, showing that the pooled mean incubation estimate for Analysis 3 is 6.08 days with a 95% CI (5.71,6.46).By applying the method in Section 3.6, we obtain the p-value 0.32 for the Egger's test, indicating no evidence for the presence of asymmetry in the funnel plot, displayed in Figure 8.

Results of Analysis 4
Treating each of the 16 meta-analysis reports in Table 2 as an individual study and using the test procedures described in Section 3.2, we obtain that the p-value for Cochran's test is less than 0.01 and I 2 = 65%, suggesting that the random effects model is preferred when conducting meta-analysis.Using the method in Section 3.1 gives us a synthetic estimate of the mean incubation time for Analysis 4 to be 5.81 days with a 95% CI (5.57, 6.06).
Applying the method in Section 3.6 yields the p-value 0.21 for the Egger's test, suggesting no evidence for the publication bias.

Results of Subgroup Analyses
Applying the test procedures described in Section 3.4 to the 55 studies considered in Analysis 1, we further conduct four subgroup analyses, where three grouping strategies are applied based on regions and methodologies as described in Section 2, and another grouping strategy is suggested in Section 4.1 by classifying studies into groups of low, moderate, and high risk of bias.The results are reported in Table 5, where LBCI and UBCI stand for the lower and upper bounds of the 95% confidence intervals related to the mean incubation estimate, respectively.
Finally, for Subgroup Analysis 4, according to Section 4.1, 29 studies are of high risk of bias, 24 studies are of moderate risk, and 2 studies are of low risk of bias.The studies with low risk of bias give a synthetic estimate of 5.70 days (95% CI (4.64, 6.76)), the studies of moderate risk produce an estimate of the pooled mean incubation time to be 6.95 days (95% CI (6.30, 7.60)), and the records from the subgroup of the high risk result in an estimate of 6.03 days (95% CI (5.25, 6.82)).
Further, applying the method of testing the differences between subgroups in Section 3.5, we obtain p-values 0.48, 0.15, 0.61, and 0.07 for Subgroup Analyses 1, 2, 3, and 4, respectively, suggesting no significant difference among the subgroups in all four subgroup analyses at the level of 0.05.

Results of Sensitivity Analyses
To further understand the performance of the meta-analysis, we conduct two sensitivity analyses using the same procedure as in Analysis 1-4.First, we repeat Analyses 1 and 3 by adding back those four studies with highly right screwed CIs.Using the methods in Section 3.1, we then respectively obtain synthetic estimates of the mean incubation time to slightly decrease to 6.37 days with a 95% CI (5.86,6.89)and 6.06 days with a 95% CI (5.69,6.42).
Secondly, we exclude all the studies that have asymmetric CIs in Analysis 1 and conduct another meta-analysis on the 13 remaining studies.The resultant synthetic estimate of the time is 6.06 days with 95% CI (5.27,6.85),which is smaller than that of Analysis 1 and almost identical to the estimate from Analysis 3 with a wider CI.

Conclusions and Discussion
To provide a sensible understanding of the average incubation time for COVID-19, in this article we take different angles to examine the reported estimates in the literature that were obtained from different studies.With the 55 estimates of the mean incubation time of COVID-19, we obtain that the synthetic estimate from meta-analysis is 6.43 days (95% CI (5.90, 6.96)).Further combined with 36 estimates transformed from the reported estimates of the median incubation time of COVID-19, the meta-analysis yields a synthetic estimate of the mean incubation time decreased to 6.08 days (95% CI (5.71, 6.46)).
Subgroup analyses suggest that the estimate of the mean incubation time is 7.18 days (95% CI (5.55, 8.80)) and 6.71 days (95% CI (6.07, 7.35)), respectively, for patients outside China and outside Hubei province.For different risk levels, studies with low risk of bias yield the smallest pooled mean estimate of 5.70 days (95% CI (4.64, 6.76)), compared to moderate and high risk groups.Notably, none of the between-subgroup differences is significant.
Sensitivity analyses suggest that if including or excluding studies with highly skewed CIs considerably changes estimates.While it is difficult to determine exactly the average incubation time of COVID-19, our study here provides insights into understating of this unknown quantity by incorporating various features of the available estimates, including heterogeneity, varying sample sizes, study bias, differences in estimation methods, and so on.
This being said, there are limitations in the analyses here, just like other available studies.
Although our search of the literature spans the period of between January 1, 2020, and May 20, 2021, the reported estimates of the mean incubation time of COVID-19 are mainly obtained from the studies of those infected cases prior to March 31, 2020.The results thereby do not reflect the feature that the average incubation time may change with the emerging variants of the virus.The normality assumption in the meta-analysis may not be valid since the distribution of incubation times is assumed to be right-skewed in some analyses.
Many studies did not give individual characteristics such as age, the sex ratio, and medical conditions of patients, which hinders us from further exploring the heterogeneity of the studies.Most studies using parametric models assumed a distribution such as Gamma, Weibull, or log-normal to describe COVID-19 incubation times.Such distributional assumptions, however, are basically not testable.

Figure 1 :
Figure 1: Flow diagram for gathering studies about estimation of the mean or median COVID-19 incubation time

Figure
Figure 2: Nested Structures in the collected papers

Figure 3 :
Figure 3: Summary of risk of bias

Figure 8 :
Figure 8: Funnel plot for Analysis 3 69 (N 1) studies merely report the information about estimates of the mean incubation time and 35 (N 2 ) articles report only the information about estimates of the median incubation

Table 1 :
The number of papers and estimates reported in Tables2-4

Table 2 :
A summary of 16 papers reporting meta-analysis results about estimation of the mean incubation time † This paper(N 111

Table 3 :
A summary of 59 estimates of the mean incubation time from 54 papers 1The SD is transformed from the reported 95% CI. a These papers (N 122 ) each reports two mean incubation estimates.b This paper (N 121 ) reports three mean incubation estimates.c This paper (N 111 ) reports one synthetic mean incubation estimate derived from multiple studies using meta-analysis and one mean incubation estimate obtained from a single sample.* These studies have highly right skewed 95% CIs of (0.4,15.8), (0.3,8.2), (1.2,12.5),and (0.1, 15.6), respectively.

Table 4 :
A summary of 36 estimates about the median incubation time from 35 papers, together with the derived entries reported in the last two columns Table 2 excluded, and Figure 4 displays the RevMan risk of bias table, where the risk status (high or low) for 10 questions in the checklist and 95 studies is displayed by rows and columns, respectively.

Table 5 :
Subgroup analysis results K Mean LBCI UBCI I 2