Abstract
Background Estimating the prevalence of severe or critical illness and case fatality of COVID-19 outbreak in December, 2019 remains a challenge due to biases associated with surveillance, data synthesis and reporting. We aimed to address this limitation in a systematic review and meta-analysis and to examine the clinical, biochemical and radiological risk factors in a meta-regression.
Methods PRISMA guidelines were followed. PubMed, Scopus and Web of Science were searched using pre-specified keywords on March 07, 2020. Peer-reviewed empirical studies examining rates of severe illness, critical illness and case fatality among COVID-19 patients were examined. Numerators and denominators to compute the prevalence rates and risk factors were extracted. Random-effects meta-analyses were performed. Results were corrected for publication bias. Meta-regression analyses examined the moderator effects of potential risk factors.
Results The meta-analysis included 29 studies representing 2,090 individuals. Pooled rates of severe illness, critical illness and case fatality among COVID-19 patients were 15%, 5% and 0.8% respectively. Adjusting for potential underreporting and publication bias, increased these estimates to 26%, 16% and 7.4% respectively. Increasing age and elevated LDH consistently predicted severe / critical disease and case fatality. Hypertension; fever and dyspnea at presentation; and elevated CRP predicted increased severity.
Conclusions Risk factors that emerged in our analyses predicting severity and case fatality should inform clinicians to define endophenotypes possessing a greater risk. Estimated case fatality rate of 7.4% after correcting for publication bias underscores the importance of strict adherence to preventive measures, case detection, surveillance and reporting.
Introduction
A novel corona virus, first identified in Wuhan, China in late 2019, resulted in a pandemic by the first quarter of 2020, contributed by the prolonged survival of the virus in the environment and extended length of pre-or post-symptomatic and potential asymptomatic shedding. 1-4 While the virus is known to cause only a mild illness in a majority, severe illness characterized by respiratory distress requiring hospital admission is not uncommon. 5 Furthermore, the virus has the potential to precipitate a life-threatening critical illness, characterized by respiratory failure, circulatory shock, sepsis or other organ failure, requiring intensive care. 6, 7
An extensive body of literature that emerged since the outset of the epidemic in China, have examined the rates of severe and critically severe illness as well as case fatality associated with COVID-19. However, the literature on COVID-19 has several limitations. First, due to lack of awareness and limited availability of training and resources to confirm the diagnosis, failure to recognize and code COVID-19 as the potential cause of morbidity and mortality may have contribute to under estimation of the effects of COVID-19. 8 In fact, a recent estimate suggested that approximately 86% cases of COVID-19 were not documented prior to January 23, 2020. 4 On the contrary, screening of only those who are at high risk may lead to over-reporting of morbidity and mortality. Second, given that most datasets and publications are derived from retrospective chart review, as opposed to prospective methods, high measurement error is inevitable. 9 Third, the literature on the outcomes are originating mainly from tertiary care settings, distorting the overall clinical picture. 10 Finally, including the same patients in multiple reports examining the same research question without clearly indicating is a major lapse in methodological and ethical standards. 11
As such, we aimed to conduct a systematic review of the available literature to identify publications with minimal potential overlap, in order to estimate the prevalence of severe illness, critical illness and case fatality among individuals with COVID-19 in random-effects meta-analyses to enhance generalizability. More importantly, we aimed to adjust our prevalence estimates by correcting for publication bias and underreporting. We further aimed to examine the effects of clinical, biochemical and radiological risk factors moderating the between-study heterogeneity of the severity and case fatality rates.
Methods
Search strategy and selection criteria
All procedures were conducted in accord with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines. PubMed, Scopus and Web of Science databases were searched on March 7, 2020 with the aim of identifying studies that have been published in year 2020 examining the prevalence of severe illness, critically severe illness and mortality associated with COVID-19 infection using pre-determined keyword combinations (Table S1 in Appendix). No language restrictions were applied. Duplicate records were removed and titles and abstracts were screened for pre-defined eligibility criteria (Figure 1) by two independent raters (CD and CK or SC). Records published in Chinese were translated to American English using Google translator and a native Chinese speaker (SC) examined the original records in Chinese. Full-text manuscripts of records that were deemed eligible in the initial screening were re-examined. All eligible full-text manuscripts were examined using the 9-item Quality Assessment Tool for Case Series Studies of the National Heart, Lung and Blood Institute by two study personnel.
Data were extracted from the eligible manuscripts into pre-defined data-fields. The data fields included the total sample size and number of participants with severe illness, critically severe illness and mortality. Severe illness was operationally defined as having either respiratory distress with RR ≥ 30 /min, resting peripheral oxygen saturation of ≤ 93% or arterial partial pressure of oxygen ≤ 300 mmHg, requiring hospitalization. Critical illness was defined as having respiratory failure, circulatory shock, end-organ failure or any combination of the above requiring intensive care. In addition, the following variables were extracted as potential covariates of the above outcomes. Central tendency (i.e. mean or median) and dispersion (i.e. SD, SE, 95%CI, IQR or range) of age were extracted. When not reported, study level means and standard deviations for age were imputed from the available statistics (i.e. median, IQR or range). 12 Proportions of the following variables within a study sample were extracted: age ≤ 18 years, age ≥60 years, female sex, diabetes mellitus, hypertension, heart disease, chronic liver disease, chronic kidney disease, chronic obstructive pulmonary disease, malignancy, immunosuppression (e.g. HIV), smoking and pregnancy. Proportions of patients with specific presenting symptoms (i.e. fever, cough, sore throat, shortness of breath, headache, diarrhea), asymptomatic cases, specific laboratory parameters (i.e. positive nucleic acid test for COVID-19, leukopenia, leukocytosis, thrombocytopenia, lymphopenia, elevated lactate dehydrogenase (LDH), elevated C-reactive protein (CRP), elevated erythrocyte sedimentation rate (ESR), high procalcitonin and high D-dimer based on reference ranges considered in each study) and radiographic features (i.e. no lesions on CT, patchy consolidation, ground glass opacities, peripheral distribution, and bilateral lung involvement or involvement of ≥ 3 lobes).
Data analysis
Three separate DerSimonian-Laird random-effects meta-analyses were performed using the ‘meta’ package (version 4.11-0) in R statistical software (version 3.6.2) to examine three primary outcomes: the prevalence of a) combined severe or critical COVID-19 infection, b) critically severe COVID-19 infection, and c) COID-19-associated mortality. 13 Studies with both zero or 100% proportions were not excluded to ensure incorporation of all available data, which is also known to ensure analytic consistency and minimize bias. 14 Consistency of the findings of the meta-analyses were confirmed by leave-one-out sensitivity analyses. 15 Given that under-reporting and publication bias could result in biased (i.e. smaller) prevalence estimates, publication bias was examined using funnel plots and the effect-sizes were imputed for estimated missing (i.e. unpublished / unreported) studies via the trim-and-fill method. 16, 17 The meta-analyses were repeated including the effect-size estimates of these potentially missing studies in order to obtain unbiased estimates of the three primary outcomes. Heterogeneity of effect-sizes was quantified by calculating the Higgins’ I2 statistic for each meta-analysis. 18, 19 To explain the heterogeneity of the studies 20, exploratory univariate random-effects meta-regression analyses were performed to examine the moderator effects of each of the covariates described above.
Results
Results of database search, subsequent screening and eligibility assessment are summarized in a PRISMA flow diagram (Figure 1). Out of 1,470 records identified in the initial database search, 29 studies including data of 2,090 patients with COVID-19 were deemed eligible. The proportions of females in the study samples ranged from 27.59-100.00%. The mean age of the participants included in the studies ranged from 2-66 years. Four studies recruited entirely on children and one study exclusively recruited pregnant women. The included studies and their quality ratings are summarized in Table S2 with their references. The summary statistics of all covariates are summarized in Table S3 in Appendix.
Pooled prevalence of severe or critically severe illness among individuals with COVID-19 infection was estimated to be 14.6% (95%CI, 8.9%-23.1%) in the random-effects meta-analysis (Figure 2a). Excluding any single study from the meta-analysis (i.e. leave-one-out sensitivity analyses) did not significantly change the pooled prevalence estimate. However, funnel plot of the effect-sizes was severely asymmetric, suggesting substantial underreporting or publication bias (Figure 2b). Seven effect-sizes were imputed to correct for the publication bias. When the random-effects meta-analysis was performed including these imputed effect-sizes (i.e. after correcting for publication bias), the prevalence of severe or critical illness increased to 25.8% (95%CI, 17.2%-36.8%) (Figure 2c).
Significant heterogeneity was observed among the prevalence estimates of severe illness (τ2 = 1.679; I2 = 94%, p < 0.001). Correcting for publication bias decreased this heterogeneity (I2 = 85%, 95%CI, 80%-89%), however, heterogeneity remained significant (p < 0.001). Exploratory univariate random-effects meta-regression analyses conducted with the aim of explaining the heterogeneity using the moderator effects of the considered covariates suggested that each of increasing mean age (p = 0.006) and prevalence of age ≥ 60 years (p < 0.001), hypertension (p < 0.001), chronic kidney disease (p = 0.038), malignancy (p = 0.023) and chronic obstructive pulmonary disease (p = 0.025) were associated with a greater risk of severe or critical illness associated with COVID-19, while the prevalence of age ≤ 18 years (p = 0.007) within a sample was associated with a reduced risk of severe or critical illness. Prevalence of the presenting clinical features of fever (p < 0.001), dyspnea (p = 0.028) and diarrhea (p = 0.026); laboratory findings of lymphocytopenia (p = 0.003), elevated LDH (p < 0.001), CRP (p < 0.001) and D-dimer levels (p < 0.001); and bilateral lung involvement or involvement of ≥ 3 lung lobes (p = 0.006) was associated with increased risk of severe or critically severe illness, while having no radiological features on chest CT was associated with decreased risk of severe illness (p = 0.003). The results of all univariate meta-regression analyses examining the moderator effects of the covariates on the prevalence of severe or critical illness in COVID-19 infection and their effects on heterogeneity are summarized in Table S4 in Appendix.
The random effects meta-regression analyses revealed a pooled estimate of 4.8% (95%CI, 2.4%-9.5%) for the prevalence of critical illness in COVID-19 infection (Figure 3a). Leave-one-out meta-regression analyses did not significantly change this estimate. The funnel plot of effect-sizes (Figure 3b) was highly suggestive of underreporting or publication bias. Eleven effect-sizes had to be imputed to statistically correct for this bias and after correction, the pooled prevalence of critical illness in COVID-19 infection increased to 16.3% (95%CI, 9.8%-25.7%) (Figure 3c).
Significant heterogeneity of effect-sizes was a concern for the meta-analysis of prevalence of critical illness as well (τ2 = 1.994; I2 = 92%, p < 0.001). Correcting for publication bias decreased this heterogeneity (I2 = 78%, 95%CI, 69%-84%), however, heterogeneity remained significant (p < 0.001). Univariate meta-regression analyses suggested increased risk of critical illness associated with sample characteristics of increasing mean age (p = 0.002), prevalence of age ≥ 60 years (p < 0.001), comorbid hypertension (p < 0.01), cardiac disease (p = 0.023) and malignancy (p = 0.041). Similarly, prevalence of fever (p = 0.044), dyspnea (p = 0.042) and fatigue (p = 0.036) on presentation; prevalence of increased LDH (p = 0.003), CRP (p = 0.008) and D-dimer (p = 0.021) were associated with a greater risk of critical illness (Table S5 in Appendix).
Prevalence of COVID-19 associated mortality was 0.8% (95%CI, 0.2%-2.9%) based on the random-effects meta-analysis (Figure 4a) and this estimate was minimally affected by leave-one-out sensitivity analyses. However, as with severe and critically severe illness, publication bias / potential underreporting was apparent on the funnel plot of effect sizes (Figure 4b). Thirteen effect-sizes were imputed to account for missing / unreported effects in an attempt to statistically correct for the publication bias and conducting the meta-regression analyses with the addition of these effect-sizes revealed a pooled estimate of 7.4% (95%CI, 4.5%-11.9%) for the mortality rate associated with COVID-19 infection (Figure 4c).
Heterogeneity of effect-sizes on prevalence of mortality in COVID-19 was also a concern (τ2 = 2.996; I2 = 86%, p < 0.001). Correction for publication bias decreased the heterogeneity (I2 = 61%, 95%CI, 45%-73%), yet the heterogeneity remained significant (p < 0.001). Univariate meta-regression analyses modeling heterogeneity indicated increased mortality risks associated with increasing mean age (p < 0.001), prevalence of age ≥ 60 years (p = 0.011), presenting with fatigue (p = 0.048), leukocytosis (p =0.007), high LDH (p = 0.030) and low albumin (p < 0.001). Prevalence of age ≤ 18 years (p = 0.036) was associated with a decreased risk of COVID-19-associated mortality (Table S6 in Appendix).
Discussion
In this systematic review and meta-analysis, we comprehensively and systematically examined the available literature to estimate the prevalence of morbidity and mortality associated with SRAS-CoV-2 infection. Despite the large number of articles that were reviewed, our quantitative synthesis was limited to 29 studies representing data of 2,090 individuals. Our literature-based estimates of severe illness, critical illness and case fatality rates among patients with COVID-19 were 15%, 5% and 0.8% respectively. After adjusting for underreporting and publication bias, COVID-19-associated prevalence of severe illness, critical illness and case fatality increased to 26%, 16% and 7.4% respectively.
Our unadjusted random-effects estimates of severe illness requiring hospitalization (15%) and critical illness requiring intensive care admission (5%), are consistent with the estimates of COVID-19-associated morbidity based on large individual-level datasets. 21As such, the unadjusted findings of our meta-analysis regarding severity of illness corroborate the inferences made based on current surveillance systems. However, the unadjusted mortality rate observed in our analysis (0.8%, 95% CI, 0.2%-2.9%) is lower than the COVID-19-associated mortality rates in China (3.6%) or globally (3.4%) at the end of February, 2020 (i.e. the time represented in the reviewed publications). 22
As we have noted in the introduction, retrospective patient data and the literature derived from such data could be systematically biased towards both overestimating and / or underestimating morbidity and mortality. The reviewed studies are largely representing tertiary care settings, of which the capacity may have been overridden minimally, if at all, despite the high reproductive number (R0) at the time of sampling. As such, the outcomes of our unadjusted random-effects meta-analyses, which accounts for the random variability of effects between studies, can be inferred to be a generalizable representation of morbidity and mortality rates applicable to a well-trained and equipped healthcare setting of which resources are not overwhelmed. This estimate therefore represents COVID-19 associated mortality in regions with a low R0.
However, reverse-causation bias caused by failing to capture the deaths that may not reach healthcare facilities / be diagnosed prior to death may contribute to underestimation. 23 The effect-sizes imputed to correct for publication bias may in fact represent severity and case fatality rates of settings where the demand exceeds the available resources. 24 In fact, our data substantiate the recent forecasts and recommendations to suppress the epidemic growth. 25 While optimized utilization of healthcare facilities by maintaining a low R0 may reduce the mortality rates to as low as 0.8%, overwhelming healthcare resources may increase the overall case fatality rate to 7.4%, or even greater as represented by the effect-sizes we have imputed. 26
In our meta-regression analyses that examined risk factors, increasing age and age ≥ 60 years consistently stood out as a risk factor, while age ≤ 18 years consistently remained a protective factor, being consistent with the current literature. Angiotensin converting enzyme (ACE)-2 receptors are generally upregulated in patients with hypertension or heart failure, who receive ACE-inhibitors and angiotensin-II receptor blockers. 27 COVID-19 is known to enter cells by binding to ACE-2 receptors, increasing their risk of infection and development of severe clinical illness. 27 Consistently, we observed an increased risk of severity with hypertension and cardiovascular disease. Presenting with fever emerged as a risk factor for severe and critical illness as well as mortality, underscoring the importance body temperature as a screening tool. 28
Our systematic review has some limitations. First, while we eliminated studies with overlapping samples by screening for overlaps of institutes, study dates and authors, we cannot be 100% certain. Second, high degree of heterogeneity was a concern. However, this should be expected in any meta-analysis due to the variability in methodology and study samples and we used heterogeneity to explore covariates in meta-regression analyses. 18, 20 Third, funnel plots of all three meta-analyses indicating substantial publication bias, limiting the generalizability of uncorrected random-effects meta-analyses. Finally, the protocol was not pre-registered. Use of a systematic search strategy; use of random-effects meta-analyses and meta-regression analyses assuming high heterogeneity of effect-sizes; exploring the etiology of heterogeneity in meta-regression analyses, which also identified risk factors of morbidity and mortality; exploring the validity of our findings in sensitivity analyses; and statistically correcting for publication / underreporting bias are notable strengths of our systematic review and meta-analysis.
In conclusion, after correcting for publication bias, COVID-19 associated overall rates of requirement for hospitalization, intensive care and case fatality could be as high as 26%, 16% and 7.4% respectively. This underscores the importance of strict adherence to preventive measures, case detection, surveillance and reporting. Hypertension; fever and dyspnea at presentation; and elevated CRP seem to predict increased disease severity, while increasing age and elevated LDH seem to consistently predict severity and case fatality. These risk factors should inform clinicians to define endophenotypes possessing a greater risk.
Data Availability
The data presented in the manuscript is based entirely on data extracted from peer-reviewed, published research. The authors are willing to share detailed records of all steps including searching databases, systematically screening the literature, quality assessment, data extraction and meta-analysis (including the scripts).
Contributors
CK, CD and SC conceptualized the study; CK and CD conducted the literature search; CK, CD and SC were involved in screening the literature; CK and CD extracted the data; CK performed the meta-analyses and meta-regression analyses; CK and CD wrote the manuscript; all authors read and agreed to the content of the manuscript.
Declaration of interests
The study was not funded. The authors have no potential conflicts of interest to declare.