Introduction

As of 28th April 2020, the number of confirmed COVID-19 cases had surpassed 2.9 million worldwide, and the number of deaths due to the disease had reached 200,000 [1]. In Sweden, diagnosed COVID-19 cases had surpassed 18,000 and related deaths had reached 2200 on the same date. Guidelines from the World Health Organization and the European Centre for Disease Prevention and Control suggest that individuals aged 70 years and older, or with an underlying medical condition such as cardiovascular disease, high blood pressure, cancer, chronic obstructive pulmonary disease (COPD), asthma, or diabetes, are at high risk of developing severe symptoms of COVID-19 that require in-hospital care [1, 2]. These recommendations are mainly based on studies from China and Italy, which generally show that, once infected, individuals with at least one of these prognostic factors are more likely to develop severe disease, requiring hospitalization and carrying a higher risk of mortality [3,4,5,6,7,8]. Governments around the world have therefore recommended that individuals with at least one of these factors self-isolate for prolonged periods, not only to reduce their risk of contracting severe COVID-19 but also to prevent any sudden increase in demand for critical care in hospitals, which could overwhelm health systems. If the pandemic developed to affect a large proportion of the population, critical care capacity could become saturated. However, the prevalence of these prognostic factors for severe COVID-19 is to a large extent unknown in many countries. Knowledge of the distribution of individuals considered to be at high risk of severe COVID-19, coupled with knowledge of the capacity of the health care system, would allow clear strategic planning.

Several models have been produced to support COVID-19 planning in countries across the world [9,10,11,12]. Many of these models are based on the assumption that disease severity increases with age, but they do not account for an increased risk of severe disease in individuals with underlying medical conditions. This is usually because the age-stratified burden of disease at a local level is rarely available. Even when this information is available, the data from which it originates may come from a sample of the population rather than from the whole population. If the sample is not representative of the population at large, results may be biased. In order to build clear, robust models that will provide trustworthy estimates of the extent to which the infection will impact populations, we need reliable estimates of the underlying prevalence of medical conditions suggesting high risk of severe disease.

The unified Swedish healthcare and register system provides a unique opportunity to calculate the burden and prevalence of prognostic factors for severe COVID-19. This knowledge will both help healthcare capacity planning and provide further data that can be applied to the underlying assumptions of models that support planning worldwide. We therefore aimed to use Swedish register data to describe the prevalence of prognostic factors for severe COVID-19 at national and county level in Sweden.

Methods

Data sources

We used data from the Swedish national health care and population registers, linked at an individual level using the unique personal identification number of all residents in Sweden [13]. Disease burden was based on diagnoses and dates of hospitalizations or visits from the National Inpatient Register and Outpatient Specialist Care Register, and sociodemographic characteristics such as age, sex, and county of residence were obtained from the Total Population Register [14, 15]. We also used the Cancer Register to identify malignant tumors, and the Swedish Prescribed Drug Register to identify prescriptions dispensed to individuals and further our understanding of disease burden [16, 17]. These data were originally aggregated as part of a study on comorbidities in cancer risk and survival.

Study population

We identified all people living in Sweden on 31st December 2014 and alive on 1st January 2016.
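The two inclusion criteria can be expressed as a simple filter. The following is a minimal sketch only: the `Resident` record, its field names, and the example rows are hypothetical and do not reflect the structure of the Swedish registers, and counting a death on 1st January 2016 itself as an exclusion is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

ALIVE_DATE = date(2016, 1, 1)  # individuals must be alive on this date


@dataclass
class Resident:
    person_id: str                 # hypothetical pseudonymized identifier
    resident_on_2014_12_31: bool   # registered as living in Sweden on 31st December 2014
    death_date: Optional[date]     # None when no death is recorded


def in_study_population(person: Resident) -> bool:
    # Assumption: a death recorded on 1st January 2016 itself excludes the person.
    alive = person.death_date is None or person.death_date > ALIVE_DATE
    return person.resident_on_2014_12_31 and alive


# Invented example rows, for illustration only.
residents = [
    Resident("A", True, None),
    Resident("B", True, date(2015, 7, 14)),
    Resident("C", False, None),
]
study_population = [p.person_id for p in residents if in_study_population(p)]
```

In this toy example only person A satisfies both criteria; B died before 1st January 2016 and C was not resident on 31st December 2014.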

Identification of prognostic factors for severe COVID-19

We based our choice of prognostic factors for severe COVID-19 on the guidelines from the World Health Organization and European Centre for Disease Prevention and Control [1, 2]. The factors were age 70 years and older, cardiovascular disease, cancer, COPD, severe asthma, and diabetes. Age was calculated at 31st December 2015. An individual was identified as having an underlying medical condition if they had a diagnosis in either the Inpatient or Outpatient Register (as primary or secondary diagnosis) or the Cancer Register within 3 years prior to 1st January 2016. If data were available, we also identified related dispensations of prescriptions from the Prescribed Drug Register within the same time period. We used International Statistical Classification of Diseases and Related Health Problems version 10 codes (ICD-10) to identify a diagnosis and Anatomical Therapeutic Chemical codes (ATC) to identify the dispensation of a prescription. The ICD-10 and ATC codes for underlying medical conditions were: cardiovascular disease (I20–I99), cancer (C00–C75, combined with morphology codes indicating malignant behavior), COPD (J41–J44), severe asthma (J45), and diabetes (E10, E11, E13, E14, O24; ATC: A10).
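The code-based definitions above can be illustrated with a short sketch. It is not the code used in the study: the function names, the input layout (pairs of code and date), and the treatment of the window boundaries (start date included, 1st January 2016 excluded) are assumptions made for illustration. Cancer is omitted because its definition also depends on morphology codes.

```python
from datetime import date

INDEX_DATE = date(2016, 1, 1)

# Inclusive ICD-10 category ranges taken from the text. Cancer is left out
# because it additionally relies on morphology codes from the Cancer Register.
ICD10_RANGES = {
    "cardiovascular disease": [("I20", "I99")],
    "COPD": [("J41", "J44")],
    "severe asthma": [("J45", "J45")],
    "diabetes": [("E10", "E11"), ("E13", "E14"), ("O24", "O24")],
}
ATC_PREFIXES = {"diabetes": ["A10"]}


def window_start(lookback_years):
    # INDEX_DATE is 1st January, so shifting the year never hits 29th February.
    return INDEX_DATE.replace(year=INDEX_DATE.year - lookback_years)


def icd10_category(code):
    # "I21.0" -> "I21": compare on the three-character category only.
    return code.replace(".", "").upper()[:3]


def conditions_for(diagnoses, dispensations=(), lookback_years=3):
    """diagnoses: (ICD-10 code, date) pairs; dispensations: (ATC code, date) pairs."""
    start = window_start(lookback_years)
    found = set()
    for code, when in diagnoses:
        if start <= when < INDEX_DATE:
            category = icd10_category(code)
            for name, ranges in ICD10_RANGES.items():
                if any(low <= category <= high for low, high in ranges):
                    found.add(name)
    for code, when in dispensations:
        if start <= when < INDEX_DATE:
            for name, prefixes in ATC_PREFIXES.items():
                if any(code.upper().startswith(p) for p in prefixes):
                    found.add(name)
    return found
```

For example, `conditions_for([("I21.0", date(2014, 5, 2))], [("A10BA02", date(2015, 11, 3))])` would flag cardiovascular disease and diabetes, whereas a J45 diagnosis from 2010 would only be picked up with a longer look-back period.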

Analysis

We initially calculated the burden (raw number) and prevalence (proportion) of all individuals living in Sweden in relation to their sex, age (1–9, 10–19, 20–29, 30–39, 40–49, 50–59, 60–69, 70–79, 80+), county of residence (21 counties), whether they had each of the predetermined prognostic factors for severe COVID-19, and whether they had at least one, two, or three of these prognostic factors. We then calculated the burden and prevalence of the five underlying medical conditions across each age group, and the burden and prevalence of each of the six prognostic factors individually in each of the 21 counties across the whole of Sweden. We also calculated the age-group-stratified burden and prevalence of each of the five underlying medical conditions in each county. Finally, we calculated the burden and prevalence of individuals with at least one, two, and three prognostic factors for severe COVID-19 in each county.
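As an illustration of the last of these calculations, the sketch below counts prognostic factors per person (age 70 years and older plus each underlying condition) and derives the burden and prevalence of having at least k factors in each county. The record layout and the toy data are hypothetical; only the definitions of burden (raw number) and prevalence (proportion) come from the text.

```python
from collections import Counter


def n_factors(person):
    # Age 70 years and older counts as one factor; each condition counts as one more.
    return len(person["conditions"]) + (1 if person["age"] >= 70 else 0)


def at_least_k_by_county(people, k):
    """Return {county: (burden, prevalence)} for individuals with >= k factors."""
    totals = Counter(p["county"] for p in people)
    burden = Counter(p["county"] for p in people if n_factors(p) >= k)
    return {c: (burden[c], burden[c] / totals[c]) for c in totals}


# Invented records, for illustration only.
people = [
    {"county": "Kalmar", "age": 75, "conditions": {"diabetes"}},
    {"county": "Kalmar", "age": 40, "conditions": set()},
    {"county": "Stockholm", "age": 30, "conditions": {"severe asthma"}},
    {"county": "Stockholm", "age": 50, "conditions": set()},
    {"county": "Stockholm", "age": 82, "conditions": set()},
    {"county": "Stockholm", "age": 20, "conditions": set()},
]

one_or_more = at_least_k_by_county(people, 1)
two_or_more = at_least_k_by_county(people, 2)
```

The same grouping could be applied to age group or sex instead of county by changing the key used in the two counters.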

Sensitivity analyses

We repeated all of the above analyses using look back periods of 1, 5 and 10 years prior to 1st January 2016 to define occurrence of disease in the registers, rather than 3 years. We also repeated the main analysis with cardiovascular disease and cancer diagnoses restricted to more severe disease (cardiovascular disease restricted to ICD-10 codes I20.0, I21–22, I24–28, I30–46, I50, I60–69, I71–72; cancer codes restricted to exclude non-melanoma skin cancers).
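The effect of varying the look back period can be sketched as follows. This is an illustrative example rather than the study code: the input (each person's most recent qualifying diagnosis date), the population size, and the boundary handling (window start included, 1st January 2016 excluded) are assumptions.

```python
from datetime import date

INDEX_DATE = date(2016, 1, 1)
LOOKBACK_YEARS = (1, 3, 5, 10)  # main analysis (3 years) plus sensitivity analyses


def prevalence_by_lookback(latest_diagnosis, population_size):
    """latest_diagnosis: {person_id: date of most recent qualifying diagnosis}."""
    result = {}
    for years in LOOKBACK_YEARS:
        # INDEX_DATE is 1st January, so shifting the year is always a valid date.
        start = INDEX_DATE.replace(year=INDEX_DATE.year - years)
        n = sum(1 for d in latest_diagnosis.values() if start <= d < INDEX_DATE)
        result[years] = n / population_size
    return result


# Invented data: three diagnosed individuals in a population of ten.
latest = {
    "a": date(2015, 6, 1),
    "b": date(2012, 3, 1),
    "c": date(2008, 1, 1),
}
prevalence = prevalence_by_lookback(latest, population_size=10)
```

Because each longer window contains the shorter ones, the estimated prevalence can only stay the same or increase as the look back period is extended.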

Patient involvement

No patients were involved in setting the research question or the outcome choices, nor were they involved in developing plans for design or implementation of the study. No patients were asked to advise on interpretation or writing up of results.

Results

Table 1 shows the characteristics of the study population. The mean age was 41 years and around 50% of the 9.6 million individuals lived in three of the 21 counties (Stockholm, Västra Götaland and Skåne). Over 22% of the study population had at least one prognostic factor for severe COVID-19 (2,131,319 individuals), and 1.6% had at least three factors (154,746 individuals).

Table 1 Characteristics of the study population

Burden and prevalence of underlying medical conditions suggesting high risk for severe COVID-19 by age group

Table 2 shows the burden and prevalence of each medical condition by age group in the study population. The prevalence of each condition generally increased as age increased; however, there was a higher prevalence of severe asthma in the youngest groups compared with all other age groups.

Table 2 Burden and prevalence of underlying medical conditions suggesting high risk for severe COVID-19 by age group

Burden and prevalence of prognostic factors for severe COVID-19 by Swedish county

Table 3 shows the burden and prevalence of each of the six prognostic factors in each Swedish county, and Fig. 1 visualizes the ratio of the county-specific prevalence of each factor to the prevalence of that factor overall in the study population. The burden and prevalence of all prognostic factors were as follows (Table 3): the proportion of people aged 70 years and older ranged from 11.1% in Stockholm (242,208 individuals) to 17.6% in Kalmar (40,872 individuals); cardiovascular disease ranged from 6.4% in Uppsala and Stockholm (22,086 and 140,165 individuals) to 8.8% in Dalarna (24,205 individuals); cancer ranged from 1.1% in Norrbotten (2790 individuals) to 1.6% in Halland (5067 individuals); COPD ranged from 0.6% in Västerbotten (1708 individuals) to 1.1% in Kalmar (2448 individuals); severe asthma ranged from 1.8% in Västra Götaland and Örebro (29,353 and 5243 individuals) to 2.6% in Norrbotten and Stockholm (6407 and 56,251 individuals); and diabetes ranged from 4.0% in Stockholm (86,195 individuals) to 4.7% in Uppsala (16,311 individuals). The burden of each of the five underlying medical conditions stratified by age group in each county is also presented in online supplementary material “Appendix 1”.

Table 3 Burden and prevalence of prognostic factors for severe COVID-19 by each Swedish county
Fig. 1 Maps showing the ratio of the county-specific prevalence of each prognostic factor for severe COVID-19 compared with the overall prevalence of that factor in Sweden

Burden and prevalence of at least one, two, and three prognostic factors for severe COVID-19 by Swedish county

Table 4 shows the burden and prevalence of individuals with at least one, two, and three of the six prognostic factors for severe COVID-19 in each county in Sweden, and Fig. 2 visualizes the ratio of the county-specific prevalence of people living with at least one, two, and three factors compared with the overall prevalence in the study population. The burden and prevalence of prognostic factors were as follows: at least one prognostic factor ranged from 19.2% in Stockholm (416,988 individuals) to 25.9% in Kalmar (60,005 individuals); at least two prognostic factors ranged from 5.5% in Stockholm (119,057 individuals) to 8.5% in Kalmar (19,699 individuals); and at least three prognostic factors ranged from 1.3% in Stockholm (28,162 individuals) to 2.1% in Kalmar (4839 individuals).
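The ratio mapped in the figures is the county-specific prevalence divided by the overall prevalence. As a small worked example, the sketch below applies this to the two extremes reported above for at least one prognostic factor, using the overall prevalence of 22.1% given for the whole study population. The function and variable names are illustrative, and the rounding to two decimals is a presentation choice rather than something stated in the text.

```python
OVERALL_PCT = 22.1  # % of the study population with at least one factor

# County-specific prevalence (%) of at least one factor, from the text above.
COUNTY_PCT = {"Stockholm": 19.2, "Kalmar": 25.9}


def prevalence_ratio(county_pct, overall_pct):
    # A ratio below 1 means the county is below the national prevalence,
    # and a ratio above 1 means it is above.
    return county_pct / overall_pct


ratios = {
    county: round(prevalence_ratio(pct, OVERALL_PCT), 2)
    for county, pct in COUNTY_PCT.items()
}
```

Under these inputs the ratio is about 0.87 for Stockholm and about 1.17 for Kalmar, so Kalmar's prevalence is roughly 17% above the national figure.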

Table 4 Burden and prevalence of at least one, two, or three prognostic factors for severe COVID-19 in each Swedish county
Fig. 2 Maps showing the ratio of the county-specific prevalence of people living with at least one, two, and three prognostic factors for severe COVID-19 compared with the overall prevalence in Sweden

Sensitivity analyses

All analyses with a look back period of 1, 5, and 10 years are shown in online supplementary material “Appendices 2–10”. There was generally a greater burden of prognostic factors as the look back period increased. Across all of Sweden, the overall prevalence of individuals with at least one prognostic factor ranged from 19.4% with a 1 year look back to 27.6% with a 10 year look back, and the overall prevalence of individuals with at least three prognostic factors ranged from 0.5% with a 1 year look back to 2.9% with a 10 year look back (online supplementary material “Appendix 2”).

When the cardiovascular disease and cancer definitions were restricted to more severe disease, the prevalence of these conditions in the whole study population was 4.4% and 1.2%, respectively. The prevalence of individuals with at least one prognostic factor was 20.7%, and with at least three prognostic factors was 1.2%. Results for these analyses by age and county are available in online supplementary material “Appendices 11–13”.

Discussion

Using data from the whole Swedish population, we show that over 2 million individuals (22.1%) have at least one of six prognostic factors for severe COVID-19, as defined by the World Health Organization and the European Centre for Disease Prevention and Control (cardiovascular disease, cancer, COPD, severe asthma, diabetes, or age 70 years and older). More than 150,000 individuals (1.6%) have at least three of these prognostic factors, which identifies the most vulnerable population. We also show that the distribution of the prognostic factors is heterogeneous across Sweden, with Kalmar county having the highest proportion of inhabitants with at least one factor (25.9%). However, due to its large population in comparison with other counties, Stockholm county has the highest number of individuals with at least one prognostic factor (416,988 individuals). We also present age- and county-specific prevalence of each prognostic factor to facilitate capacity planning and to provide underlying data for assumptions made in mathematical modelling of the current pandemic.

Comparison to other studies

The number of people living with cardiovascular disease in Sweden in 2015 was 492,943 according to the Global Burden of Disease study, which is higher than the 389,774 we reported in the 1 year look back estimate in online supplementary material “Appendix 2” [18]. However, previous studies suggest that the sensitivity and specificity of specific cardiovascular diagnoses, such as acute myocardial infarction, heart failure and atrial fibrillation, in the National Patient Register are high [14]. The National Cancer Register reported that there were 214,000 individual tumors reported in the 3 years prior to 31st December 2016, while we report 129,155 individuals in our study population with at least one tumor in the 3 years prior to 1st January 2016 [19]. The prevalence of COPD is estimated at around 4–10% in Sweden, which is higher than the 0.8% we calculated. However, only 30% of COPD cases are diagnosed by healthcare professionals, and these are often the most severe cases. The low observed prevalence of COPD in our study may therefore be because we only captured severe disease that is recorded in the Patient Register, and were unable to capture moderate and mild COPD diagnosed and treated exclusively in primary care [20,21,22]. We have also underestimated the prevalence of asthma in Sweden, which is known to be between 8 and 10%, as we were only able to identify severe cases that required hospital admission [14, 23, 24]. However, more severe asthma is likely to be associated with more severe COVID-19, meaning we have identified those at the greatest risk. Finally, a study calculated a diabetes prevalence of 4.6% in Stockholm using survey data from the Stockholm Public Health Cohort, which is slightly higher than our estimate of 4.0% [25]. However, a report from the National Diabetes Register suggests that 22.6% of diabetes cases do not require pharmaceutical therapies and can only be identified from primary care or quality registers, to which we did not have access [26, 27].

Strengths and limitations

Sweden is one of the few countries in the world where the study population for an analysis is the whole country. It is therefore possible to accurately calculate prevalence of disease for the whole population, without sampling and risking selection bias if sampling is non-random.

We were only able to identify the burden of prognostic factors on 1st January 2016. However, it is unlikely that the structure of the Swedish population has changed enough in 4 years to considerably change the prevalence estimates we calculated. The population of Sweden has increased by 476,572 inhabitants between 1st January 2016 and 1st January 2020 [28].

We could not identify all underlying medical conditions that the World Health Organization and the European Centre for Disease Prevention and Control suggest are prognostic factors for severe COVID-19. Given the data we had available, we were not able to identify individuals with hypertension or high blood pressure because these conditions are usually diagnosed in primary care, and we only had access to data from specialized outpatient care and hospitalizations [29]. The Prescribed Drug Register could identify individuals with hypertension or high blood pressure, as it includes information on individuals who were dispensed a medication regularly used to treat these conditions (diuretics, beta-blockers, ACE inhibitors etc.) [29]. However, the data we had available from the Prescribed Drug Register did not include information on these medications. Given that hypertension and high blood pressure are precursors of clinical cardiovascular disease, it is likely that those with the most severe disease are captured in our cardiovascular disease estimates. Furthermore, other health agencies around the world (Centers for Disease Control and Prevention, United States; National Health Service, United Kingdom) have suggested additional prognostic factors for severe COVID-19 such as chronic kidney disease, liver disease, immunosuppression, and severe obesity. We decided to ground our choice of prognostic factors on recommendations from the World Health Organization and the European Centre for Disease Prevention and Control to give an overview of factors deemed important by multinational organizations, and because these are likely the guidelines that individuals in Sweden and Europe are currently following.

There is little available evidence on the prognostic factors that contribute the most to severe COVID-19 in different populations across the world. Our raw measure of the cumulative number of prognostic factors for severe COVID-19 may therefore not represent those at the highest risk if one factor contributes more to severe disease than the others. Data from the World Health Organization-China Joint Mission on Coronavirus Disease suggest that the case-fatality rate is highest in those with cardiovascular disease (13.2%) compared with cancer (7.6%), chronic respiratory disease (8.0%), diabetes (9.2%), and those with no comorbid conditions (1.4%) [30]. If this is similar in all populations, then the individual prevalence of each of the prognostic factors at national and county level in Sweden that we have also presented may give better information on the populations at highest risk of severe COVID-19. Furthermore, the current definitions of what constitutes a prognostic factor are broad. It is likely that severe forms of cardiovascular disease (e.g. acute myocardial infarction) contribute more to the risk of severe COVID-19 than less severe cardiovascular disease (e.g. stable angina). As current evidence is scarce, we need a better understanding of the specific conditions that contribute to a poorer prognosis of COVID-19.

All calculations from the main analyses rest on the assumption that any medical conditions were diagnosed within 3 years prior to 1st January 2016. We have also presented the same analyses when the look back period is changed to 1, 5 and 10 years. The primary look back period was defined as 3 years because it is a reasonable time frame to capture individuals with active disease. We believe this gives an accurate overview of the burden of disease in the population for all diseases apart from cancer. It has been suggested that only those with active cancer are truly at a high risk of severe COVID-19, and a definition of active cancer can take many forms [31]. For some types of cancer, 3 years can be considered a long time after diagnosis, and if the individual has survived, it is likely they will be considered to no longer have active cancer at 3 years after diagnosis. Therefore, for cancer, the analysis with a 1 year look back period may be a better estimation of individuals with active disease. Furthermore, all calculations require just one occurrence of a diagnosis to be counted as a confirmed condition. It is possible to increase confidence that an individual has been truly diagnosed with certain specific conditions by requiring more than one record indicating that diagnosis. However, this is highly disease specific and would require further breakdown into the conditions that require more than one record of diagnosis to confirm disease, which is outside the scope of the current project.

This study gives an accurate overview of the burden and prevalence of individuals in Sweden with the prognostic factors for severe COVID-19. We have not made any attempt to model the transmission of the disease, but rather provide clear calculations of the number of vulnerable individuals based on current guidelines. These numbers will allow authorities to optimally plan healthcare resources, by comparing the number of individuals at risk of severe disease with the critical care capacity. Results can also be applied to underlying assumptions of disease burden in modelling efforts to support COVID-19 planning. Overall, this information is crucial when deciding appropriate strategies to mitigate the pandemic and reduce both the direct mortality burden from the disease itself, and the indirect mortality burden from potentially overwhelmed health systems.