Abstract
Background COVID-19 case fatality rate in hospitalized patients varies across countries and studies, but reliable estimates specific for age, sex, and comorbidities are needed to design trials for COVID-19 interventions. Aim of this study is to provide population-based survival curves of hospitalized COVID-19 patients.
Methods A cohort study was conducted in Lombardy, Veneto, and Reggio Emilia using COVID-19 registries linked to hospital discharge databases containing patient clinical histories. All patients with positive SARS-CoV-2 RT-PCR test on oral/nasopharyngeal swabs hospitalized from 21st February to 21st April 2020 were identified. Kaplan Meier survival estimates were calculated at 14 and 30 days for death in any setting, stratifying by age, sex and Charlson Index.
Findings Overall 42,926 hospitalized COVID-19 patients were identified. Patients’ median age was 69 years (IQR: 57–79), 62·6% were males, 69·4% had a Charlson Index of 0. In total, 11,205 (26·1%) patients died over a median follow-up of 24 days (IQR: 10–35). Survival curves showed that 22·0% of patients died within 14 days and 27·6% within 30 days of hospitalization. Survival was higher in younger patients and in females. Younger patients with comorbidities had a lower survival than older ones with comorbidities.
Interpretation Over 27% of hospitalized COVID-19 patients died within one month in three areas of Northern Italy that were heavily affected by SARS-CoV-2 infection. Such a high fatality rate suggests that trials should focus on survival and have follow-up of at least one month.
Funding The study did not receive any external funding.
Evidence before this study Two recent systematic reviews with meta-analyses report case fatality rates of three to four percent in COVID-19 patients. Most studies on hospitalized cohorts report only slightly higher figures. These figures do not correspond to those derived from routinely collected clinical data in most European countries, reporting a 10% case fatality rate which has been increasing over time since the epidemic started.
Robust and precise survival estimates of hospitalized COVID-19 patients which take into account prognostic factors such as age, sex and burden of comorbidities are needed to design appropriate phase II and phase III clinical studies of drugs targeting COVID-19.
Added value of this study In this study we present the first survival estimates by age, sex and Charlson index for a large population-based cohort of Italian hospitalized COVID-19 patients.
Implications of all the available evidence Over 27% of COVID-19 patients died within one month from hospital admission. Such a high fatality rate suggests that studies should prioritize mortality as primary outcome. Furthermore, we found that the fatality rate reaches a plateau 30 days after hospitalization, suggesting that studies should have at least one month of follow up to observe deaths; shorter follow-up could lead to overestimation of treatment benefits.
Introduction
The novel SARS-CoV-2 has caused a pandemic in early 2020. The virus has shown a high reproduction number and has spread rapidly on a global scale,1–2 with more than three million cases of coronavirus disease (COVID-19) being diagnosed worldwide.3 Italy was one of the first countries to face the epidemic outside of China.4–5 As of the end of April 2020, Italy is second only to the USA as the country with the highest number of COVID-19-related deaths, with a total of around 30,000 deaths and a case fatality rate close to 13·6%.3
The spectrum of COVID-19 disease ranges from asymptomatic to severe mixed interstitial-alveolar pneumonia, that may lead to severe acute respiratory distress syndrome and death.6 While several important epidemiological findings have emerged about the pandemic, the COVID-19 case-fatality rate (CFR) remains unknown. Reports from different countries show an enormous heterogeneity, ranging from less than 1% to approximately 12% CFR.3,7 The accuracy of this assessment, however, is limited by disease ascertainment challenges, bias towards symptomatic and very sick patients, and variability in testing accuracy.8 An important source of this heterogeneity is very likely the difference in case detection and reporting (denominator) and also how COVID-19-related deaths are defined (numerator). Furthermore, the length of observation time is key to accurately measure CFR.9–11 Where widespread screening was performed in the general population (e.g., in South Korea), the overall CFR is obviously lower, because the denominator includes many mild or asymptomatic cases. However, in countries where mainly people presenting to the emergency department are being screened (e.g., in the Italian region of Lombardy, during the strongest wave of the epidemic), CFRs are higher, because the denominator will include predominantly severe cases.12
Two recent systematic reviews of trials and observational studies, including studies mainly from China, found a CFR pooled estimate of 3–4%;13,14 however, most reports from European countries show CFRs ranging from 10% to 20% when a cohort approach with adequate follow up was conducted.9,15 Differences in CFR are also reported in studies including only hospitalized COVID-19 patients.16–20 Reliable population-based estimates of the CFR for hospitalized patients are essential for providing public health standards to health care providers that have to monitor the impact of the epidemic and disease management. CFRs can also be useful to provide a reference standard to accurately design trials of interventions targeting COVID-19 with adequate statistical power. Having a reliable expected number of deaths in a cohort of hospitalized COVID-19 patients with known age and comorbidity burden is critical to design adequately-powered experimental studies and, in particular, to design phase two, single-arm studies, that are currently very frequently employed in ongoing COVID-19 intervention trials. In particular, 729 interventional studies on COVID-19 patients are registered in https://clinicaltrials.gov as of 30th April 2020 (supplementary figure 1 and table 1). Of these, 37·9% are phase II (including phase I/II and II/III) trials and one third of those studies are aimed at exploring survival as outcome (supplementary figure 2).
Flowchart of data sources used in the present study. COVID-19 registries at the regional level collect, for all SARS-CoV-2 RT-PCR positive patients, data on date of diagnosis, symptoms, hospitalizations, death, and recovery. These registries were linked to the resident population registries. We report the percentage of patients successfully linked; in some cases, deterministic linkage could not be carried out, likely due to patients not residing in the region or to inaccurate individual data reporting in the COVID-19 registry. Patients confirmed to be residents in the region of interest were found using data from prior contact with the NHS. This data was used to identify comorbidities.
Given the rapid spread of the pandemic and the absence of any effective therapy, it is likely that new therapies, if also proven to be promising in phase II non-randomized studies, will be directly transferred to clinical practice without more rigorous testing of their efficacy in phase III studies. The aim of this study is therefore to provide population-based survival curves of hospitalized COVID-19 patients that can be used to monitor the impact of epidemic and disease management, to design phase II trials and to correctly calculate the required sample size of phase III studies.
Methods
Setting
The study included patients living in the Lombardy region, first and the hardest hit Italian region (10 million inhabitants), the Veneto region, where the second epidemic wave broke out (4·9 million inhabitants,) and in Reggio Emilia Local Health Unit, which is part of the Emilia-Romagna region, ranking second for the spread of the epidemic during the study period (0·5 million inhabitants). Together, the areas included in the study represent 25% of Italian population.
In Italy the National Health System (NHS) provides all testing activities and acute care free of charge for all residents. All RT-polymerase chain reaction (PCR) SARS-CoV-2 tests performed in Italy must be recorded in a dedicated COVID-19 surveillance registry. Regional health systems are organized in different ways and testing strategies differ based on local protocols and logistic constraints during the study period: in Veneto contact tracing and tests among asymptomatic contacts as well as patients with mild symptoms was carried out since the epidemic began and continued over time. Lombardy had the fastest progression of the epidemic and testing outside the hospital setting quickly became soon untenable, while Reggio Emilia faced an intermediate situation. In all the study catchment areas, patients with symptoms potentially suggesting COVID-19, who were admitted to the Emergency Room (ER) or were admitted to hospital, were always tested.
Study population
Data were retrieved from the COVID-19 surveillance registry coordinated by the National Institute of Health and implemented in each catchment area. This registry collects information about symptoms, diagnoses, hospitalizations, intensive care unit (ICU) admission, death, and recovery concerning patients testing positive for SARS-CoV-2 RNA by RT-PCR on nasopharyngeal or throat swab samples. Information was collected by the public health departments of the local health authorities from different sources, including molecular laboratories, hospitals, and death certificates and reported to the regional surveillance system within 48 hours (figure 1).
From the COVID-19 registry all patients aged ≥18 years with SARS-CoV-2 infection who were hospitalized from 21st February (date of the first Italian COVID-19 patient hospitalization) to 21st April 2020 were identified in Lombardy. Patients were identified from 21st February to 31st March 2020 in the other two catchment areas. The admission date was considered the index date (ID). Patients were followed up until death or the end of available data, whichever came first. Hospital discharge databases were linked to the COVID-19 registry to identify hospital admissions in the 10 years preceding COVID-19 hospitalizations in order to calculate the Charlson index for each patient.21
An R-based tool for distributed analyses developed by the National Institute of Health (The ShinISS) was used by each study center to locally analyze COVID-19 patient data using a common data model, sharing only an anonymized dataset for central analysis, in compliance with EU-GDPR regulations.
Outcome measure
The end-point was death occurring during follow-up in an inpatient or outpatient setting for any cause, as reported in COVID-19 registry. The outcome measure was time to event from COVID-19 hospitalization to death or the end of available follow-up, whichever came first.
Covariates of interest
Besides sex, age and catchment area, Charlson index was calculated in order to take into account the impact of the comorbidity burden.
Statistical analysis
Univariate survival measures, with corresponding 95% confidence intervals (95% CI), at a fixed follow-up time(14 and 30 days) were computed with the Kaplan Meier estimator, for each covariate: age (<50, 50–59, 60–69, 70–79, 80–89, and ≥90 years), sex (males and females) and Charlson index (0, 1–2, and ≥3). Furthermore, we also present data for sex and Charlson index, stratified by age category. A Cox proportional hazard model was used to show the effect of sex, adjusted for age and comorbidities, with an interaction term between age and Charlson index.
Ethics
The study protocol was approved by Ethics Committee of the Italian National Institute of Health on 18th March 2020 and subsequently by local Institutional Review Board or Ethics Committee.
Data sharing statement
Data will be updated periodically. Aggregate data are available upon submission of a request which describes the research objectives and a protocol for analyses to the local competent authorities: for Lombardy, queries should be submitted to a specific COVID-19 Committee, identified by the Directorate General for Health (under DDG n. 3019/2020), e-mail accessodatic19{at}regione.lombardia.it; for Veneto to the Health Direction of the Azienda Zero, e-mail: direzione.sanitaria{at}azero.veneto.it; for Reggio Emilia to the Comitato Etico Area Vasta Emilia Nord, Reggio Emilia office, e-mail: mariafrancesca.paolo{at}ausl.re.it. Data will be available at least for seven years from the approval of the study.
Results
The three regional COVID-19 registries captured data from 79,882 patients with at least one positive PCR result over the study period; 35,671 patients were not hospitalized or were below 18 years old. Finally, 1,243 patients were excluded because their data could not be deterministically linked to the resident population registries, resulting in a 97·2% success of record linkage. Overall, 42,926 hospitalized COVID-19 patients were included in the study, of which 38,715 were from Lombardy, 3,229 were from Veneto and 982 were from Reggio Emilia (figure 1). Patients’ median age was 69 years (IQR: 57–79) and 62·6% were males. In total, 11,205 (26·1%) patients died over a median follow-up of 24 days (IQR: 10–35). The median time from hospitalization to death was 6 days (IQR: 3–12); 69·4% of patients did not have any prior hospitalization reporting the comorbidities included in the Charlson Index, while 6·0% had a Charlson index ≥3 (table 1).
The survival curves show that 22·0% (95% CI 21·6%–22·4%) of patients die within the second week of hospitalization and 27·6% (95% CI 27·2%–28·1%) die within 30 days. The curve then reaches a plateau with a few deaths occurring beyond one month from the start of COVID-19 hospitalization. After this point differences among regions almost disappear (tables 2 and 3). The curve had similar trajectory for all age groups, although the survival rate was higher in patients aged <50 years (2·8% died at 30 days post-hospitalization) and those aged 50–59 years (6·7% died at 30 days post-hospitalization). On the contrary, patients aged 80–89 years and ≥ 90 years had a lower survival rate (52·5% and 64·9% died at 30 days post-hospitalization, respectively). Survival rates were also higher among females both at 14- and 30-days post-hospitalization (18·7% and 23·7%, respectively). The proportion of deaths occurring at 30 days was also higher in patients with comorbidities: 20·7%, 40·2%, and 58·1% for those with a Charlson index of 0, 1–2 and ≥3, respectively (figure 2, table 3).
Kaplan Meier survival curves for hospitalized COVID-19 patients in Northern Italy, 21st February to 21st April 2020. Panel a: overall; panel b: stratified by age groups; panel c: stratified by sex at birth; panel d: stratified by Charlson index
The difference in survival between males and females was similar across all ages (tables 2 and 3, figure 3), with the HR for females vs. males ranging from 0·77 (95%CI 0·74–0·80) to 0·66 (95%CI 0·63–0·68), when adjusting for age and Charlson index (table 4). The effect of comorbidities on survival was more pronounced in younger patients, while gradually decreasing with age: in patients below the age of 50 the HR for those with Charlson index of ≥3 as compared with a Charlson index of 0 was 15·2 (95%CI 7·1–32·7), while in those over 90 it was only 1·1 (95%CI 0·93–1·4) (table 4). In younger patients (<50 years), the corresponding estimated proportion of deaths occurring at 30 days was 2·3% in those with a Charlson index of 0 and 24·6% in patients with Charlson index ≥3, while in patients over 90, these estimates were 51·3% and 76·6% in patients with a Charlson index of 0 and of ≥3, respectively (table 3, figure 4).
Kaplan Meier survival curves for hospitalized COVID-19 patients in Northern Italy stratified by age and sex at birth, from 21st February to 21st April 2020
Kaplan Meier survival curves for hospitalized COVID-19 patients in Northern Italy stratified by age and Charlson index, from 21st February to 21st April 2020
Discussion
In a large population-based cohort of hospitalized COVID-19 patients we observed a 27·6% fatality rate at one month after hospitalization, which reached a plateau thereafter. The fatality rate was slightly higher in males (29·9%) than in females (25·7%), increasing dramatically to 52% in persons aged 80–90 and 65% in persons aged over 90. The burden of comorbidities, measured using the Charlson index, is an important prognostic factor in younger patients, but after the age of 80 the impact of this on fatality rate is smaller, as probably other factors, such as frailty, become more important risk factors for death.
Compared to previous studies and current statistics, we observed a much higher CFR in hospitalized cohorts, where deaths rates rarely reach 20%.13,14,17 A recent study on 16,749 UK patients hospitalized with COVID-19 found a CFR of 33%.22 The differences between the UK study and the present study could be explained by the different criteria used for deciding which patients should be hospitalized, but our data are in line with the very high fatality rate observed in all settings in Italy, not only in hospitalized patients.9,15 Several hypotheses have been proposed to explain differences in CFR in various countries, including different viral strains that could cause more or less severe disease, the different capacity of a healthcare system to respond to the COVID-19 crisis, as well as different case definitions, with some countries including probable cases in the overall death toll. The CFR can increase in some areas if there is a surge of infected patients, which adds to the strain on the healthcare system and can overwhelm its medical resources.23 Data from hospitalized COVID-19 patients in New York City show a CFR of 20% at approximately 6 days of follow-up;19 although this figure seems much higher than our observed survival in the first week after hospital admission, if we consider that the average Charlson index in the New York population was over 4, the fatality rate is consistent with our estimate for the population with a Charlson index ≥3, reported in figure 2, which ranges between 20 and 25%. An relatively long follow-up and the very old age of the hospitalized population may partly explain the high fatality observed in the Italian cohort.
Our results confirm a lower survival in males, even after adjusting for comorbidities.24 Interestingly, males had a higher probability of having COVID-19 as well as of being hospitalized15,25 and dying after hospitalization due to COVID-19, according to our data. Therefore, the excess morbidity and mortality is underestimated when we observe only hospitalized patients. Different hypotheses involving the role of ACE2 as principal receptor of SARS-CoV-2 for its entry into cells have been proposed to explain the higher susceptibility to infection and the lower survival in males.26 The role of the TMPRSS2, a protease involved in virus binding and up-regulated by androgens, has been also proposed.27
The impact of age on the risk of death after developing COVID-19 was dramatic in our cohort, in line with all previous studies.7,15 Again, the impact of age on survival could be partially underestimated when studying hospitalized cohorts, as older patients, at least until the age of 80, also have a higher risk of being hospitalized. The excess risk cannot be entirely attributed to the underlying comorbidities that are usually more frequent in older patients, as can be deduced comparing the survival curves of people with the same Charlson index in different age groups (figure 4). On the other hand, severe symptoms in children and adolescents seem extremely rare, and very few cases of death have been reported for children aged less than 10 years so far.8,28
The burden of comorbidities play an important role in determining the risk of death in COVID-19 patients in our cohort, confirming the results of all previous studies.17,18 The large number of cases identified allowed us to study the effect of comorbidity burden across age groups, giving a precise survival estimate by comorbidity in each age group. The effect of comorbidities on survival is much more pronounced in younger patients than in older ones. In our study design, the assessment of the pre-existing comorbidities is independent from the outcome being collected through the analysis of patient use of healthcare services in the years preceding the epidemic. Therefore, even if we may underestimate the presence of comorbidities, this misclassification is not differential in severe or fatal cases and in non-severe cases, thus the hazard ratios should be unbiased.
The availability of life-sustaining therapies or lack thereof should be considered when interpreting the COVID-19 mortality rate: poor outcomes may be due to known risk factors such as old age, frailty, comorbidities, profound disability, or because of effects of logistic limitations associated with lack of medical staff and medical staff burn-out, lack of beds and/or medical supplies. Moreover, although patients might indeed have SARSCoV-2 infection, the infection itself may not necessarily be the cause of death. Extreme examples include patients with metastatic cancer or terminal organ failure.12
Due to the regional structure of the Italian NHS, the responses to the COVID-19 epidemic vary from region to region. The Lombardy region is home to a sixth of the Italian population (10·08 million inhabitants) and accounts for 37% of cases and 53% of deaths of the country, as of 21st April, 2020. The COVID-19 outbreak initially hit Italy in two small towns, Codogno and Vò Euganeo, in Lombardy and Veneto, respectively. The two outbreaks developed differently: in Lombardy the spread of infection was quick and several other clusters emerged rapidly;4 in Veneto, the first cluster had a limited spread until the diffusion of cases in all Northern Italian regions occurred. Finally, in Reggio Emilia the outbreak was related to the expansion of the first cluster occurring in the neighboring area of Codogno. Differences in the pattern of spread of the infection across the three areas are probably due to the different size of the initial clusters when they emerged, i.e. much larger in Codogno than in Vò Euganeo, but also on the epidemic control strategies which were implemented initially: Veneto opted for strict containment of the outbreak and piloted mass testing in selected areas, whereas Lombardy expanded hospital services to meet a massively increased need for hospitalization and beds in ICUs.29 As result the proportion of positive tests on the total number of swabs performed was 20% in Lombardy, 5·1% in Veneto and 13·9% in Emilia-Romagna. Clinical criteria which must be met for COVID-19 hospital admission may vary among regions, principally due to a different organization of primary care and the availability of hospital beds, which varied in different places and moments of the epidemic, and availability of other options for managing less severe patients, such as home-based treatment with monitoring. Where special units for home-based care for COVID-19 patients were activated, patients were treated and followed-up at home, when possible. Nevertheless, differences in the overall CFRs among the three catchment areas at 30 days post-hospitalization in our cohort were modest, suggesting the generalizability of the results once the plateau is reached.
Implications for practice
The aim of this study was to provide a precise estimate of survival over time after hospitalization for COVID-19 patients. Survival and CFR are very useful also for the design of phase II clinical trials and to calculate the adequate sample size for phase III clinical trials. We found a very high CFR, particularly in older people. Since this endpoint is very common, we suggest that future experimental studies investigating potential therapeutic interventions in COVID-19 patients should prioritize survival as a main outcome. Furthermore, we found that the fatality rate reaches a plateau 30 days after hospitalization, thus implying that studies should have at least one month of follow-up after COVID-19 hospitalization to properly observe deaths; studies with shorter followup may not capture all deaths related to COVID-19, ultimately leading to the overestimation of the potential benefit of COVID-19 interventions in phase II studies in absence of a comparator.
Hard outcomes such as the CFR have a crucial role and, together with survival estimates, should guide health-care leaders and policy makers in developing public health strategies at national and international levels.23
Data Availability
Data will be updated periodically. Aggregate data are available upon submission of a request which describes the research objectives and a protocol for analyses to the local competent authorities: for Lombardy, queries should be submitted to a specific COVID-19 Committee, identified by the Directorate General for Health (under DDG n. 3019/2020), e-mail accessodatic19{at}regione.lombardia.it; for Veneto to the Health Direction of the Azienda Zero, e-mail: direzione.sanitaria{at}azero.veneto.it; for Reggio Emilia to the Comitato Etico Area Vasta Emilia Nord, Reggio Emilia office, e-mail: mariafrancesca.paolo{at}ausl.re.it. Data will be available at least for seven years from the approval of the study.
Authors and contributors
PGR, EF, MMas, GT and SSA contributed to study design, data collection, data analysis, data interpretation, the literature search, and first draft writing of the manuscript.
JS edited the final version of the manuscript, contributed to literature search and data visualization.
GP, OL, DC contributed to study design data, data collection, interpretation and, the literature search.
MMas, MMar, MP contributed to data collection, software programming, data analysis and data interpretation. All authors reviewed and approved the final version of the report.
The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.
Declaration of interests
Paolo Giorgi Rossi, Eliana Ferroni, Stefania Spila Alegiani, Gisella Pitter, Olivia Leoni, Danilo Cereda, Massimiliano Marino, Michele Pellizzari, and Marco Massari declare no competing interests.
Gianluca Trifirò reports grants from Novartis, from Italian Drug Agency, during the conduct of the study; has participated in advisory boards within the last five years on topics not related to this manuscript and organized by Sandoz, Hospira, Sanofi, Biogen, Ipsen, Shire and is consultant for Otsuka. He is the principal investigator of observational studies funded by several pharmaceutical companies (e.g. Amgen, AstraZeneca, Daiichi Sankyo, IBSA) to the University of Messina as well as scientific coordinator of the Master’s program “Pharmacovigilance, pharmacoepidemiology and pharmacoeconomics: real world data evaluations” at University of Messina which is receives unconditional funding from several pharmaceutical companies.
Acknowledgments
No funding was received for the conduct of the study.
Footnotes
ITA-COVID19 working group Italian National Institute of Health (Italy): Stefania Spila Alegiani, Francesca Menniti Ippolito, Roberto Da Cas, Marco Massari, Graziano Onder, Patrizio Pezzotti; University of Messina – Department of Biomedical and Dental Sciences and Morphofunctional Imaging, Messina (Italy) & Italian Society of Pharmacology, Milan (Italy): Gianluca Trifirò, Janet Sultana, Valentina Ientile, Salvatore Crisafulli; Azienda Unità Sanitaria Locale – IRCCS, Reggio Emilia (Italy): Paolo Giorgi Rossi, Roberto Grilli, Giulio Formoso, Massimiliano Marino, Massimo Vicentini; Lombardy Region – Directorate General for Health, Milan (Italy): Olivia Leoni, Danilo Cereda, Monica Ludergnani, Annalisa Bodina, Ambra Castrofino, Francesca Grosso, Gabriele del Castillo, Marcello Tirani; Veneto Region – Azienda Zero, Padova (Italy): Eliana Ferroni, Gisella Pitter, Michele Pellizzari, Manuel Zorzi, Elena Narne, Ugo Fedeli, Mario Saia.