Abstract
Background With increasing rates of SARS-CoV-2 infections and the intention to avoid a lock-down, the risks for the working population are of great interest. No large studies have been conducted which allow risk assessment for this population.
Methods DKMS is a non-profit donor center for stem cell donation and reaches out to registered volunteers between 18 and 61 years of age. To identify risk factors for severe COVID-19 courses in this population we performed a cross-sectional study. Self-reported data on oro- or nasopharyngeal swabs, risk factors, symptoms and treatment were collected with a health questionnaire and linked to existing genetic data. We fitted multivariable logistic regression models for the risk of contracting SARS-CoV-2, risk of severe respiratory infection and risk of hospitalization.
Findings Of 4,440,895 contacted volunteers 924,660 (20.8%) participated in the study. Among 157,544 participants tested, 7,948 reported SARS-CoV-2 detection. Of those, 947 participants (11.9%) reported an asymptomatic course, 5,014 (63.1%) mild/moderate respiratory infections, and 1,987 (25%) severe respiratory tract infections. In total, 286 participants (3.6%) were hospitalized for respiratory tract infections. The risk of hospitalization in comparison to a 20-year old person of normal weight was 2.1-fold higher (95%-CI, 1.2-3.69, p=0.01) for a person of same age with a BMI between 35-40 kg/m2, it was 5.33-fold higher (95%-CI, 2.92-9.70, p<0.001) for a 55-year old person with normal weight and 11.2-fold higher (95%-CI, 10.1-14.6, p<0.001) for a 55-year old person with a BMI between 35-40 kg/m2. Blood group A was associated with a 1.15-fold higher risk for contracting SARS-CoV-2 (95%-CI 1.08-1.22, p<0.001) than blood group O but did not impact COVID-19 severity.
Interpretation In this relatively healthy population, the risk for hospitalizations due to SARS-CoV-2 infections was moderate. Age and BMI were major risk factors. These data may help to tailor risk-stratified preventive measures.
Funding DKMS initiated and conducted this study. The Federal Ministry of Education and Research (BMBF) supported the study by a research grant (COVID-19 call (202), reference number 01KI20177).
Introduction
With numbers of new SARS-CoV-2 infections reaching alarming levels many countries discuss and implement focused measures to reduce the spread of the virus and to avoid full lock-downs. The elderly and individuals who are at greater risk of COVID-19 related mortality due to comorbidity are in the center of this debate. Risk factor analyses from hospital-based or health insurance-based studies1-5 provide comprehensive data for these subpopulations. Yet, the working population may also face considerable risks, especially when lock-down measures and stay-at-home orders to reduce the spread of the virus are avoided. Owing to the design of most studies published so far, little information is available on risk factors for the working population, in spite of an increasing share of younger COVID-19 patients in the second half of 20206-7.
DKMS gemeinnützige GmbH (DKMS) is a major stem cell donor registry and reaches out to more than ten million volunteers who registered as potential stem cell donor8. Volunteers may only register if their general health condition does not preclude them from a stem cell donation. Registered donors are requested to report relevant medical information, which may affect their suitability as stem cell donors and may subsequently be excluded from the active donor file. DKMS excludes volunteers from the registry at the age of 61 years. As a consequence, volunteers registered with DKMS represent a subset of the working population with slightly less health issues9.
Since the severity of SARS-CoV-2 infections varies from asymptomatic courses to acute respiratory failure, information on risk factors is critical. Besides age, comorbidity, and risk behaviour, genetic data may partly explain this variation. As a donor registry, DKMS administrates immunogenetic data relevant for donor-patient matching in the context of stem cell transplantation. In response to the pandemic, DKMS therefore launched a large population-based study to identify risk factors, which might help individual risk assessment and tailor preventive measures.
Blood group A has been associated with an increased risk of contracting the virus and/or severe courses10-13. While the pathophysiology is unclear, it was speculated that A epitopes facilitate virus entry. ABO epitopes modify many proteins, lipids and sphingolipids. Individuals with at least one A1 allele have a higher glycosyltransferase activity and thus ubiquitously expose much higher numbers of A epitopes compared to individuals who inherited A2 alleles but no A1 allele14. Therefore, we tested the hypothesis that individuals with A1 phenotype have a greater risk of infection compared to individuals with A2 phenotype.
Here, we present first results of this immunogenetic study. Age and body mass index (BMI) were major risk factors for severe respiratory tract infections and hospitalization for respiratory distress. In our data, ABO blood group (BG) was a minor factor for the risk of infection but not for COVID-19 severity. Comparable rates of infection among participants with BG A1 versus A2, as observed in our study, strongly argue against the hypothesis that A epitopes facilitate virus entry into host cells.
Methods
Study design
The project was designed as a registry-based cross-sectional study. Existing immunogenetic data were linked to self-reported COVID-19 specific data collected with a standardized health questionnaire. The responsible Institutional Review Board of the Technische Universität Dresden (IRB00001473) approved the study. We registered the study with the trial registry of the German Center for Infection Research (https://dzif.clinicalsite.org/de/cat/2099/trial/4361). Data privacy of the participating individuals was protected in accordance with the General Data Protection Regulation of the European Union. We conducted the study in compliance with the principles of the Declaration of Helsinki. All participants provided explicit consent that COVID-19 specific data were linked to immunogenetic data in the DKMS donor registry file.
Study population
Study invitations were sent by e-mail to volunteers registered with DKMS in Germany who were between 18 and 61 years old, had previously provided an e-mail address, and had not objected to being contacted for issues other than stem cell donation requests. At registration with DKMS, volunteers provide basic health information with respect to weight, height, and serious health conditions. As a consequence, volunteers registered with DKMS represent a subset of the working population characterized by the exclusion of individuals with serious chronic illnesses at the time of registration. Volunteers who develop serious acute or chronic health conditions stay in the registry unless they proactively contact DKMS or get contacted for a stem cell donation request.
Health questionnaire
In total, 4,440,895 donors were contacted via e-mail for study participation between August 17th, 2020 and August 26th, 2020. Participants who provided informed consent completed the health questionnaire through a web-based password protected log-in. The database was locked on September 14th, 2020. The brief health questionnaire contained questions on the testing procedure, known risk factors, COVID-19 self-reported symptoms, and the severity of the course indicated by questions about hospitalization, need for supplemental oxygen, and mechanical ventilation. A translated version of the health questionnaire is provided in the supplement (Supplemental Material 1).
Data definitions and processing
Case definitions
Participants who reported at least one positive test for SARS-CoV-2 on an oro- or nasopharyngeal swab were considered as positive cases. Participants who had been tested at least once but never tested positive for SARS-CoV-2 were assigned as negative cases.
Phenotype definitions
Based on self-reported symptoms and treatment information, participants who reported a positive test were classified in three groups: i) severe respiratory tract infections (RTI) were defined by the combination of at least fever and cough, dyspnea and cough, dyspnea and fever, or dyspnea and myalgia5. This group included respiratory hospitalizations, defined by in-patient care with supplemental oxygen or mechanical ventilation or hospitalization for dyspnea or cough; ii) mild/moderate courses, if self-reported symptoms did not meet the definition of severe RTI. These participants reported any other of the following symptoms fever, myalgia, sore throat, loss of taste/smell, cough, or dyspnea; and iii) participants without symptoms.
Definition of risk factors
Obesity was classified into categories as suggested by the Centers of Disease Control and Prevention (https://www.cdc.gov/obesity/adult/defining.html). A history of arterial hypertension and diabetes mellitus was considered only when medication to treat this condition was reported for 2019.
Immunogenetic data
Volunteers who registered with DKMS were typed upfront with a panel of genes potentially relevant for donor selection. This panel evolved over years and includes ABO and Rh blood groups. Genotyping of DNA extracted from buccal swabs is performed at high resolution with an amplicon-based approach using Illumina devices15-16.
Based on information for ABO alleles, heterozygous alleles were grouped into ABO BG phenotypes by assuming dominant expression for A1 over A2 and BG A or B over BG O. Blood group A1 was coded for individuals with A1+0, A1+A1, A1+A2, A1+A3, A1+Aw, or A1+Ax and BG A2 was coded for individuals with A2+0, A2+A2, A2+Ax, or A2+Ael. Ambiguous genotypes were imputed by the most frequent allele according to Lang et al 2016 16.
Statistical methods
Analysis populations
We defined two analysis populations: 1) The population to analyze the risk of contracting SARS-CoV-2 comprised all positive and negative cases. 2) The population to analyze risk factors for COVID-19 severity included all positive cases but not individuals with positive tests in August or September, because they might not have experienced the maximum severity of COVID-19 at the time of reporting. Details on participant withdrawals and exclusions and on population sizes are provided in supplemental Figure S1.
Statistical analysis
In the descriptive part we used frequencies and percentages to describe the data. The risk of contracting SARS-CoV-2 was specified as primary endpoint for the immunogenetic evaluation of the impact of ABO blood groups. Five a priori defined hypotheses (A1 vs A2, A vs AB vs B vs O, A vs non-A, O vs non-O, and AB vs non-AB) were tested. For each single test a one-sided significance level of 1% was set in order to control of type I error. The power for these tests exceeded 96%. Binary logistic regression models were used to investigate potential factors associated with the risk of infection and the severity of COVID-19. Odds ratios (ORs) and their 95% confidence intervals were used to describe the associations. For exploratory and confirmatory analyses, statistical testing was based on Wald tests for logistic regression coefficients and likelihood ratio tests. All models were adjusted for age, sex, BMI, diabetes mellitus, arterial hypertension, and smoking status. Age, BMI, and month of positive test were modeled as categorical variables. All analyses were carried out using R Statistical Software version 3.5.1.
Results
Study participation
In total 4,440,895 e-mails were sent and 924,660 participants consented to study participation (20.8% return rate). Response rates of female and male email recipients were 21.8% and 16.5% (below 40 years) and 24.5% and 20.0% (above 40 years), respectively (chi-square test, p<0.001). Regional response rates ranged between 17% and 25%.
Descriptive analysis
Altogether 157,544 participants reported known results from a test for SARS-CoV-2 and 7,948 participants reported a positive test. Heat maps for regional rates of positive tests reported in this survey mirrored regional incidences of SARS-CoV-2 infections reported to the Robert Koch Institute (RKI) (Supplemental Figure 2). According to this source, the cumulative incidence of confirmed SARS-CoV-2 infections until August 31st among individuals aged between 15 and 59 years ranged between 0.33% and 0.38%. The overall rate of SARS-CoV-2 positive participants among all study participants was 0.86% (7,948/924,561).
We show demographic information of SARS-CoV-2 positive and -negative participants in Table 1. Female predominance among study participants also reflects the fact that more women than men are registered with DKMS (61% vs. 39%, respectively). Among participants who tested SARS-CoV-2 positive, 947 individuals (11.9%) reported no symptoms related to the infection, 5,014 individuals (63.1%) reported mild/moderate symptoms and 1,987 individuals (25.0%) reported symptoms of severe RTI. Respiratory hospitalization was reported from 286 participants (3.6% out of all SARS-CoV-2 positive individuals). Of those, 161 participants (2.0%) needed supplemental oxygen and 22 participants (0.28%) needed mechanical ventilation. Participants who reported respiratory support had a median age of 50 years (range: 24 to 58 years). Twenty participants were hospitalized, but did not meet the criteria for respiratory hospitalization.
Rates of positive tests
We assessed risk factors for contracting SARS-CoV-2 by comparing the distribution of baseline characteristics between participants who tested positive and those who reported a negative test in the same period. The rate of positive tests by age group showed a bimodal distribution with peak rates among individuals aged 25 to 29 years (5.6%) and 50 to 61 years (6.2%). It was lowest among participants aged 35 to 39 years (4.1%) (see Figure 1). Men (5.6%) reported higher rates of positive tests compared to women (4.6%) (adjusted OR 1.27, 95%-CI: 1.21-1.34; p<0.001). Active smokers (3.7%) reported lower rates of positive tests than non-smokers (5.5%) (adjusted OR 0.63, 95%-CI: 0.59-0.68; p<0.001). Results of multivariable logistic regression models are shown in Table 2.
Rates of positive cases were also associated with the month of testing (Figure 1, Panel B and C). The rates of positive cases declined from 19.1% in March to 1.7% in August in our study. In parallel, the percentage of positive tests in Germany reported to the RKI peaked in late March/early April and declined in the subsequent four months until September 202017.
Risk factors contributing to COVID-19 severity
Age, BMI, and a history of diabetes or arterial hypertension had a significant impact on severe RTI and respiratory hospitalizations (Figure 2). Increasing age and BMI were associated with increased rates of respiratory hospitalizations due to SARS-CoV-2. Participants aged 55 to 61 years had a 5.3-fold greater risk (95%-CI: 2.9-9.7; p<0.001) for respiratory hospitalizations compared to participants aged 18 to 24 years. Participants with class 3 obesity (BMI ≥ 40) had a 3.2-fold greater risk (95%-CI: 1.6 to 6.7; p=0.002) of respiratory hospitalizations compared to individuals with normal weight. Tests for interactions between sex and BMI were not significant at the 5% level. According to the multivariable regression model (Table 2) the risk of respiratory hospitalization, for example for a 55-year old individual with a BMI of 35-40 kg/m2 was 11.2-fold higher compared to a 20-year old individual with normal weight (observed rates were 29% and 1.3%, respectively).
The risks for respiratory hospitalizations for participants with a history of diabetes or arterial hypertension were reflected by odds ratios of 1.62 (95%-CI: 0.82-3.20, p=0.16) and 1.26 (95%-CI: 0.87-1.81, p=0.22) with 95%-confidence intervals including 1, respectively. Past smokers and active smokers did not show an increased risk of respiratory hospitalizations (odds ratios 1.12, 95%-CI: 0.84-1.49, p=0.45 and 0.88, 95%-CI 0.59-1.31, p=0.53, respectively) compared to non-smokers.
Broad testing of asymptomatic individuals and contact persons of infected individuals started in Germany only in summer 2020. The rate of asymptomatic infections correlated with cases reported later in the year and with a lower percentage of positive tests. Participants aged 24 years or lower showed the highest incidence of asymptomatic infections (14.4%), but with large differences and a clear trend, e.g. 26.8%, 20.6% and 10.4% for individuals aged 18, 19, and 24 years, respectively.
Impact of ABO/Rhesus phenotypes and genotypes
The impact of ABO genotype was analyzed in 125,125 participants with ABO genotypes, including 102,426 participants with allele-level resolution. We hypothesized that epitope density of ABO antigens is associated with the risk of infection. Participants with A1 and A2 phenotypes, however, had comparable rates of positive tests (5.3% and 5.4%, adjusted OR 1.03, 95%-CI 0.93-1.15; p=0.58) (see Figure 3A). Next, we analyzed the impact of ABO phenotypes on the risk of contracting SARS-CoV-2 and the risk of severe RTI (Figure 3A-D). Participants with BG A had a significantly higher rate of positive tests (adjusted OR 1.15, 95%-CI 1.08-1.22; p<0.001) compared to participants with BG O (see Figure 3C). Participants with BG AB (adjusted OR 1.19, 95%-CI 1.06-1.35; p=0.004) had significantly more infections as well, but no difference was found for participants with BG B compared to BG O.
In contrast, BG A did not show an impact on COVID-19 severity. But BG B was associated with a higher risk of severe RTI (OR 1.24, 95%-CI 1.01-1.53; p=0.038) and respiratory hospitalizations (OR 1.78, 95%-CI 1.18-2.68; p=0.006) compared to BG O (Figures 3B and 3D).
Finally, we performed exploratory analyses on the impact of ABO genotypes (Table 3). Allele level information was available for 102,426 participants. Participants who were homozygous for B had the lowest percentage of positive tests and participants homozygous for A2 had the highest percentage of positive tests. The adjusted odds ratio (OR 1.07, 95%-CI 0.96-1.19; p=0.23) for a positive test for participants with A2 alleles (A2+A2, A2+O, A2+B) versus A1 alleles (A1+A1, A1+O, A1+B) was not statistically significant. We provide additional information on exploratory analyses for blood group subtypes and rare genotypes in the supplement. Rhesus positivity was not associated with either phenotype in our study (Supplemental Table S1).
Discussion
Individual risk assessment for adults aged 18 to 61 years without serious health issues remains uncertain. More information on risk factors for this part of the working population may inform decision making throughout the course of the pandemic. We present data of one of the largest studies in this population which have been conducted so far. In comparison to data for the general population collected with the European Health Interview Survey, our study population showed lower rates of diabetes, arterial hypertension, and nicotine abuse (see supplemental Table S2)18. Notably, women are overrepresented in our cohort, which partly reflects the preponderance of women among registered volunteers. The pattern of our results suggests that exposition to the virus in the environment, captured by numbers of infections per month, dominated other risk factors for contracting SARS-CoV-2. Host susceptibility appeared to explain little variation and the bimodal distribution of the percentage of positive tests across age groups (Figure 1 and Table 2) most likely reflected age-group specific exposition patterns and behavior.
Our data are compatible with a protective effect of active smoking to contract SARS-CoV-2 which has been observed in other large studies also2-3, 12. Mechanistic explanations have been proposed, e.g. that nicotine might exert anti-inflammatory effects or that nitric oxide inhibits viral replication and invasion of target cells in the respiratory tract19-20. However, owing to the character of smoking as sensitive self-reported information we and others cannot exclude a reporting bias. For example, smokers could ask more often for tests driven by mild respiratory symptoms than non-smokers. If so, any inference on the percentage of positive tests by smoking status would be biased.
In line with previous reports and meta analyses, we found that BG A is a significant, albeit weak, risk factor for contracting SARS-CoV-210, 12-13. Notably, the investigation of ABO subgroups did not provide a consistent explanation for this association. Participants with A1 versus A2 phenotype showed comparable rates of infections. Since the A-epitope density is more than four-fold higher in BG A1 versus A2, our data therefore argue against the hypothesis that A-epitopes facilitate SARS-CoV-2 entry into cells14. Genetic variance in ABO genes has been associated with the risk for venous thromboembolism, stroke, coronary artery disease, and acute respiratory distress syndrome regardless of COVID-1921-23. The impact of glycosylation patterns on the clearance of plasma proteins (e.g. of the von Willebrand Factor) may result for example in differences in end-stage coagulation and thrombosis24-25. This would also explain a negative impact of BG A in high-risk populations but not in low-risk populations like ours10-11, 26. In exploratory analyses, BG B was associated with a greater risk of severe respiratory infections and respiratory hospitalization. However, genotype-specific analyses did not reveal a conclusive pattern of associations (Table 3 and Table S2) and BG B has not been described consistently as risk factor for severe COVID-19 courses10, 12-13. Therefore, we interpret this association as incidental finding.
Only 3.6% of participants who had been tested positive, reported hospitalizations due to respiratory infections. This rate is substantially lower than the 14% average hospitalization rate for COVID-19 for Germany7. Yet, the hospitalization rate of 1.8% among infected crew members of an aircraft carrier with a median age of 28 years was even lower26. These apparent discrepancies can partly be explained by differences between the study populations with respect to two major risk factors for respiratory hospitalizations, age, and BMI. Based on multivariable regression modeling, the risk of hospitalization due to RTI was 5.3-fold higher for individuals aged 55-61 years compared to 18-24 years. This impact of age on the risk of hospitalization due to SARS-CoV-2 among 18 to 61-year-old adults is remarkable and appears to be more pronounced than for influenza. The influenza burden by age group changes from season to season, but in contrast to SARS-CoV-2, influenza-associated rates of hospitalizations are typically higher for the very young and the very old27. The second major risk factor is BMI. In multivariable regression modeling, overweight (BMI 25 to <30 kg/m2) increased the risk of hospitalization (odds ratio 1.35, p=0.048) and obesity class II (BMI 35 to <40 kg/m2) approximately doubled the risk of hospitalization (odds ratio 2.1, p=0.01). The prevalence of obesity in a population may therefore be a major factor determining the burden of COVID-19. While our data stand out with respect to the granularity, others reported already that overweight and obesity might be strong risk factors for COVID-19 hospitalization and mortality3, 12, 28. Yet, obesity also has a strong impact on the respiratory hospitalizations for influenza and might thus not reflect COVID-19 specific pathology 29.
Only 22 out of 7,948 participants (0.28%) needed mechanical ventilation in our cohort. This low number fits to a weighted case fatality ratio of 0.11% calculated for a population with comparable sex- and age-distribution based on national data (Supplemental Table S3). It is also in line with rates of ICU admissions in smaller studies on healthy populations and larger studies on COVID-19-related mortality 3-4, 26.
Our study has several limitations: First, we analyzed self-reported data. Test results from oro- or nasopharyngeal swabs were not verified. Further, gender-specific or smoker-specific behavior in response to the pandemic may have biased our data 30. Second, the phenotype definitions based on self-reported symptoms without objective measurements. Therefore, we introduced respiratory hospitalization as a more stringent – but non-standard - definition for severity in order to exclude hospitalizations for isolation or social indications. Third, negative cases had no confirmed COVID-19 infection during the past 6 months. However, this did not preclude asymptomatic or mild to moderate SARS-CoV-2 infections, which had not been diagnosed. Forth, rates of positive tests must not be interpreted as incidences because our study population is not a random sample of the general population. Fifth, by design, we could not collect data on events, which incapacitated volunteers to respond. The infection-fatality-rate of adults below 61 years of age is low, but the rate of respiratory hospitalization calculated from self-reported data systematically underestimates the true rate of hospitalizations by this margin.
Taken together, the data show a moderate risk of respiratory hospitalizations due to SARS-CoV-2 infection in this cohort of mostly healthy participants ranging between 18 and 61 years of age. The risk for hospitalization, however, varied substantially depending on age and BMI. Blood group A had no impact on this endpoint. Our results may be used for individual counseling and risk-stratified preventive measures.
Data Availability
The authors confirm that the data supporting the results of this study are available in the article [and/or] its supplementary materials.
Author contributions
A.H. Schmidt, S.N. Bernas, J. Schetelig, and H. Baldauf designed the study.
S. Wendler, F. Heidenreich, J. Schetelig, R. Real, and S.N. Bernas performed systematic literature searches addressing different aspects of design and risk factors for this study.
S. Wendler, S.N. Bernas, R. Real, H. Baldauf, J. Schetelig, and A.H. Schmidt developed the health questionnaire.
C. Bunzel, D. Endert, R. Barth, S.N. Bernas, J. Sauter, and H. Baldauf designed and implemented the data protection concept.
J. Sauter, R. Barth, J. Hofmann, J. Markert, S.N. Bernas, D. Endert designed the database, validated data entry and export, and verified the data download.
H. Baldauf and J. Schetelig performed the statistical analysis.
H. Baldauf, S.N. Bernas, and J. Sauter had access to the underlying data and verified their integrity.
All authors contributed to the interpretation of the data.
J. Schetelig wrote the first draft of the manuscript. All authors reviewed the paper and approved the final version of the manuscript.
Acknowledgements
We are grateful to all registered DKMS donors who participated in this study. Further, we would like to acknowledge the dedicated work of many of our coworkers in different departments of DKMS who facilitated this study. Especially, we would like to name Alois Grathwohl, Heike Fischer, Stefanie Diesch, Michaela Jaeger, Rolando Silva, Nina Belser, Sandra Fritschi, Julia Chatelain, Larissa Wolf, Jonas Wolf, and Marcel Eggert (HLA Service), Theodora Fouki, Clemens Fritschi, Philipp Rau-Endrulat, Philipp Krempel, Jochen Schnitzer, Kai Höfler, Julia Pingel and Mike Bretz (IT), Nina Louis, Sonja Krohn, Kay Beutling, and Konstanze Burkard (Corporate Communications), Alessandro Haemmerle, Claus Haibt, and Ralf Neumann (Data Management), and Daniel Schefzyk (Bioinformatics). Finally, we would like to acknowledge a research grant from BMBF (reference number 01KI20177) which in parts facilitated this study.