Abstract
Introduction Cardiovascular disease and diabetes have been shown to be predictive of clinical deterioration towards critical illness or death in the hospitalised COVID-19 patient population. The aim of this study was to determine the incremental value of cardiovascular vulnerability - defined by the number of cardiovascular diseases and/or diabetes - in predicting the risk of escalation of care towards hospital referral in primary care patients with clinically suspected or confirmed COVID-19.
Methods Data were retrospectively collected from three large Dutch primary care registries with routine care data of ±850,000 people. A prognostic prediction model was developed in two databases to assess the incremental value of cardiovascular vulnerability. Data from the ‘first wave’ of COVID-19 infections in the Netherlands (March 1 2020 to June 1 2020) were used for derivation. A multivariable logistic regression model was fitted to predict hospital referral within 90 days of follow-up after the first consultation in consecutive adult patients seen in primary care for COVID-19 symptoms. Age, sex, the interaction between age and sex, and the number of underlying cardiovascular diseases and/or diabetes (0, 1, or ≥2) were pre-specified as predictors. The model was (i) compared to a simpler model without the predictor number of cardiovascular diseases and/or diabetes and (ii) externally validated in patients with confirmed COVID-19 during the ‘second wave’ (June 1 2020 to April 15 2021) in all three databases.
Results There were 5,475 patients included for model development and 6.8% had the primary outcome hospital referral. The model with number of cardiovascular diseases included as a predictor performed better than a model without this predictor (likelihood ratio test p<0.001). Older male patients with multiple cardiovascular diseases and/or diabetes had the highest predicted risk of hospital referral, reaching risks above 15-20% in these patients. The model was externally validated in a population of 16,693 COVID-19 patients. The observed risk was lower in this temporal validation cohort (4.7% versus 6.8%). The temporally validated c-statistic was 0.747 (95%CI 0.729-0.764) and the model showed good calibration.
Conclusion In this general population study, risk of clinical deterioration after suspected or confirmed COVID-19 was on average 5.1% in the development and validation cohorts combined. This risk increased with age and was higher in males compared to females. Importantly, patients with concurrent cardiovascular disease and/or diabetes had higher predicted risks. Identifying those at risk for hospital referral could have clinical implications for COVID-19 early disease management in primary care.
Introduction
Coronavirus disease 2019 (COVID-19) has caused a global pandemic since the first cases were described in late 2019. To direct the flow of patients according to the expected course of the disease, the need arose to identify the patients at highest risk of severe COVID-19. Since then, a fast-growing number of prediction models has been developed for prognosticating COVID-19 patients. However, most of these models focus on an in-hospital population, with only a few prediction models focusing on the community.(1) This is unfortunate, as (suspected) COVID-19 typically starts with mild to moderate symptoms in the first week of illness, and only some patients progress to hypoxemia requiring hospital (or even ICU) admission, typically in the second week of illness.(2) If such deterioration occurs, the primary care physician is often the first to decide on the need and optimal timing for more impactful measures, such as intensified monitoring or ultimately hospitalisation. In hospitalised COVID-19 patients, underlying cardiovascular diseases have been identified as strong predictors of further deterioration towards ICU admission or death.(3–7) The incremental value of cardiovascular diseases and/or diabetes in predicting escalation of care in primary care COVID-19 patients has, however, yet to be determined.
The aim of this study was to determine the incremental value of cardiovascular vulnerability, defined by the number of cardiovascular diseases and/or diabetes, in predicting the risk of hospital referral in primary care patients with clinically suspected or confirmed COVID-19. Results from this analysis are intended to quantify the prognostic value of concurrent cardiovascular disease in community COVID-19 patients, and to support clinical decision making on early COVID-19 disease management by primary care physicians.
Methods
Study design
This study is an analysis of observational data from community-dwelling patients registered with a primary care physician who had confirmed or clinically suspected COVID-19. We assessed the incremental value of cardiovascular disease and/or diabetes by developing a prognostic prediction model in a cohort of patients from the ‘first wave’ of COVID-19 infections in the Netherlands (March 1 2020 to June 1 2020) that was temporally validated in a cohort of patients from the ‘second wave’ of infections in the Netherlands (June 1 2020 to April 15 2021). Where appropriate for this study, we adhered to the TRIPOD guideline for reporting prediction models.(8)
Databases
Patients were included from three similar ongoing and dynamic Dutch primary care registries containing pseudo-anonymous medical data of approximately 850,000 patients: the Julius General Practitioner’s Network (JGPN) Utrecht, the ‘Academisch Netwerk Huisartsgeneeskunde VU’ (ANH) Amsterdam, and the ‘Academisch Huisartsennetwerk AMC’ (AHA) Amsterdam.(9–11) Two databases (JGPN and ANH) were used to identify patients for the development of the prediction model (i.e. development cohort) and all three databases (JGPN, ANH and AHA) were used to identify patients for the temporal validation (i.e. the validation cohort).
Study population and data collection
Patients for the development cohort were included from March 1 2020 to June 1 2020 (the ‘first wave’ of the COVID-19 pandemic in the Netherlands). During this period, polymerase chain reaction (PCR) testing for COVID-19 was very limited and mainly restricted to more severe, hospitalised cases. Consequently, many patients with symptoms that were highly suggestive of COVID-19 were not tested. To create a representative patient sample, we therefore included all consecutive adult patients aged 18 years or older who visited their primary care physician with confirmed COVID-19, as well as those whose symptoms led to a clinical suspicion of COVID-19.
For identification of the study population and data collection, the same methods were applied in all three databases. Dutch primary care physicians link diagnoses and clinical symptoms to the electronic medical records as diagnostic codes using the International Classification of Primary Care (ICPC) coding system. The primary care physicians supplying clinical data to the JGPN, AHA and ANH databases are trained in and experienced with using ICPC codes. For the development cohort, COVID-19 suspected patients were identified using the ICPC codes R74 (acute upper respiratory infection), R81 (pneumonia) and R83 (other respiratory infection). At the time of reporting, primary care physicians were recommended to use R81 and R83 for indicating COVID-19 suspected and COVID-19 confirmed cases respectively. Records of patients labelled with ICPC R74 (unspecified acute upper respiratory infection) were manually screened for COVID-19 suspicion in the consultation text by three clinical scientists (FSvR, LPTJ, and SvD) and cases of doubt were discussed until agreement was reached. Patients with ICPC R74 and not having a synonym of or reference to COVID-19 suspicion or related symptoms in this text were excluded. Of all included patients, baseline characteristics (i.e. age; sex; comorbidities; and where available Body Mass Index (BMI), oxygen saturation and C-Reactive Protein (CRP)) were collected. Comorbidities (i.e. cardiovascular disease, pulmonary disease, cancer), and history of relevant diseases were identified using ICPC. Supplementary table 1 contains all ICPC codes that were used for this study.
For the validation cohort, consecutive adult JGPN patients were included from September 1 2020 until April 15 2021, and consecutive adult AHA and ANH patients were included from June 1 2020 until December 31 2020 (the ‘second wave’ of COVID-19 infections). At this point in time, PCR COVID-19 tests were available for all symptomatic patients in the Netherlands, and confirmed cases were routinely coded by primary care physicians using ICPC coding in medical records (R83 and R83.03). Thus, only confirmed COVID-19 cases (i.e. with ICPC R83 and R83.03) were included during the ‘second wave’ of the COVID-19 pandemic in the Netherlands for validation of the model.
Outcome
The primary outcome of this study and the outcome of the prediction model was referral to an emergency ward for intended hospital admission. This was defined as any escalation of care resulting in hospital referral by the primary care physician and as such recorded in the consultation text of the medical record. To capture the full spectrum of complications of COVID-19 resulting in hospitalisation, follow-up lasted 90 days after first consultation for COVID-19 suspected symptoms. For this, consultation texts were manually screened for any emergency hospital referral and its synonyms by (primary care) clinical scientists (FSvR, LPTJ, SvD, and GJG) and cases of doubt were discussed, again until consensus was reached.
Candidate predictors
Based upon existing literature from hospitalised COVID-19 patients, we pre-specified the following candidate predictors before the analysis phase: age, sex, the interaction between age and sex, and the number of cardiovascular diseases. Cardiovascular diseases were defined as (history of) type 2 diabetes mellitus, heart failure, coronary artery disease, peripheral arterial disease, stroke/transient ischemic attack (TIA), venous thromboembolism (pulmonary embolism or deep venous thrombosis; VTE), and/or atrial fibrillation (AF). These cardiovascular diseases were deemed present if the primary care physician had used the corresponding ICPC codes (supplementary table 1) at any point before the index date in the patient’s medical record. In the absence of an ICPC code, the disease was assumed to be absent. The number of cardiovascular diseases was counted per patient and categorised into one of three categories: no cardiovascular disease, one cardiovascular disease, or two or more cardiovascular diseases.
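The counting and categorisation rule described above can be illustrated with a minimal sketch. The condition labels and the function name are hypothetical placeholders chosen for this illustration; the actual ICPC codes are those listed in supplementary table 1 and are not reproduced here.

```python
# Hypothetical labels for the seven conditions counted towards
# cardiovascular vulnerability; the real definitions are ICPC-based.
CARDIOVASCULAR_CONDITIONS = frozenset({
    "type_2_diabetes",
    "heart_failure",
    "coronary_artery_disease",
    "peripheral_arterial_disease",
    "stroke_tia",
    "venous_thromboembolism",
    "atrial_fibrillation",
})


def cardiovascular_category(recorded_conditions):
    """Return '0', '1' or '>=2' for one patient.

    recorded_conditions: iterable of condition labels found in the record
    before the index date. A condition without a code counts as absent.
    """
    count = len(set(recorded_conditions) & CARDIOVASCULAR_CONDITIONS)
    if count == 0:
        return "0"
    if count == 1:
        return "1"
    return ">=2"


if __name__ == "__main__":
    print(cardiovascular_category([]))
    print(cardiovascular_category(["heart_failure", "asthma"]))
    print(cardiovascular_category(["heart_failure", "type_2_diabetes"]))
```

Conditions outside the predefined set (here the placeholder "asthma") do not contribute to the count, and repeated codes for the same condition are counted once.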
Sample size
The model development cohort in JGPN and ANH yielded 5,475 eligible patients with an event fraction of 0.068 (6.8%, n=373) for the primary outcome referral to the hospital. Prior to the prediction analysis, the maximum number of candidate predictors was determined. Based on the sample size calculation for prediction modelling proposed by Riley et al.(12), the maximum number of candidate predictors that could be modelled was 30, assuming a Cox-Snell R2 (R2cs) of 0.0495. As this R2cs was an estimate in the absence of a known value, we varied R2cs from 0.0395 to 0.0595, which yielded a minimum of 24 and a maximum of 37 candidate predictors, including interaction terms. With the candidate predictors age, sex, the interaction between age and sex, and the number of cardiovascular diseases with three categories, the number of model parameters is considerably lower than this maximum, and the sample size of 5,475 eligible patients was therefore deemed sufficient for model development.
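As an illustration of this calculation, the sketch below rearranges the shrinkage criterion of Riley et al., n = p / ((S - 1) ln(1 - R2cs / S)), to give the maximum number of candidate predictor parameters p for a given sample size. The target shrinkage factor S = 0.9 and the function name are assumptions made for this sketch; the text does not state which criterion or settings were used.

```python
import math


def max_candidate_predictors(n, r2_cs, shrinkage=0.9):
    """Largest number of predictor parameters p satisfying the shrinkage
    criterion n >= p / ((S - 1) * ln(1 - R2cs / S)).

    n: sample size; r2_cs: anticipated Cox-Snell R-squared;
    shrinkage: target shrinkage factor S (0.9 is an assumption here).
    """
    if not 0.0 < r2_cs < shrinkage:
        raise ValueError("r2_cs must lie between 0 and the shrinkage factor")
    per_parameter = (shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage)
    return math.floor(n * per_parameter)


if __name__ == "__main__":
    for r2 in (0.0395, 0.0495, 0.0595):
        print(r2, max_candidate_predictors(5475, r2))
```

Under these assumptions, the sketch gives 24, 30 and 37 parameters for R2cs values of 0.0395, 0.0495 and 0.0595 with n = 5,475, in line with the numbers reported above.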
Missing data
The candidate predictors age, sex and cardiovascular disease had no missing data. Missing values in the baseline measurements of CRP, BMI and oxygen saturation were not imputed, as these variables were not used in the predictive modelling.
Statistical analyses
Baseline characteristics were summarised using descriptive statistics, with categorical variables as numbers with percentages and continuous variables as means with standard deviations or medians with interquartile ranges (IQR). A multivariable logistic regression modelling approach was used. All included patients were entered in a fixed model (full model) with the predictors age, sex, the interaction between age and sex, and the number of cardiovascular diseases with three categories. To determine the incremental value of the predictor number of cardiovascular diseases, a second model (simple model) was fitted using only the predictors age, sex, and the interaction between age and sex. Age was considered as a continuous variable and was modelled using a restricted cubic spline function with 4 knots at the percentiles 0.05, 0.35, 0.65 and 0.95 to account for possible non-linearity.(13) Number of cardiovascular diseases was entered as dummy variables with ‘no cardiovascular disease’ as the reference category. The incremental value of number of cardiovascular diseases was studied by comparing the two models’ c-statistics (ΔAUC) and Cox-Snell R2cs (ΔR2cs), and with a likelihood ratio test. To determine incremental value, an alpha of 0.05 was used for the likelihood ratio test. The full model was internally validated using Harrell’s bootstrapping with 100 repetitions to obtain optimism-corrected estimates of the c-statistic, R2 and calibration slope. For temporal external validation of the full model, calibration and discrimination were evaluated: observed and predicted events are shown in calibration plots, and for discrimination areas under the curve (AUC/c-statistic) are given. Other performance measures reported for temporal external validation are: calibration slope, calibration intercept, calibration in the large, R2cs, and Brier score. The Brier score assesses the overall goodness of fit of the model, with smaller numbers indicating better performance.
Confidence intervals for c-statistics were obtained using the Delong method. For R2cs and Brier score confidence intervals, bootstrapping was used with repetitions set at 1000. Validation was done in the whole validation dataset as well as separately in the JGPN, ANH, and AHA validation datasets. All statistical analyses were performed in R version 4.0.3 with R base, rms, pROC, DescTools, and rmda packages.(14–18)
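The restricted cubic spline for age described in the statistical analyses can be illustrated with a minimal sketch of the spline basis in the truncated-power form commonly attributed to Harrell. The analyses themselves were performed in R; this Python function, its name and the knot values in the example are hypothetical and serve only to show how 4 knots lead to one linear and two non-linear terms for age.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x.

    Returns [x, s_1, ..., s_{k-2}] for k knots: the linear term plus k-2
    non-linear terms that are zero below the first knot and linear above
    the last knot. Terms are scaled by (last knot - first knot) squared.
    """
    t = sorted(float(k) for k in knots)
    if len(t) < 3:
        raise ValueError("at least 3 knots are required")

    def pos_cubed(u):
        # Truncated cubic: (u)+ cubed.
        return max(u, 0.0) ** 3

    scale = (t[-1] - t[0]) ** 2
    gap = t[-1] - t[-2]
    terms = [float(x)]
    for j in range(len(t) - 2):
        term = (
            pos_cubed(x - t[j])
            - pos_cubed(x - t[-2]) * (t[-1] - t[j]) / gap
            + pos_cubed(x - t[-1]) * (t[-2] - t[j]) / gap
        )
        terms.append(term / scale)
    return terms


if __name__ == "__main__":
    # Hypothetical knot locations (ages in years); in the study the knots
    # were placed at the 5th, 35th, 65th and 95th percentiles of age.
    example_knots = [22, 41, 56, 78]
    for age in (20, 50, 85):
        print(age, rcs_basis(age, example_knots))
```

In a logistic regression, these three age terms (and their products with sex for the interaction) would enter the linear predictor as ordinary covariates.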
Ethics
This research was conducted in accordance with Dutch law and the European Union General Data Protection Regulation and according to the principles of the Declaration of Helsinki. The need for formal ethical reviewing was waived by the local medical research ethics committee of the University Medical Center Utrecht, the Netherlands as the research did not require direct patient or physician involvement.
Results
Patient characteristics
Patient characteristics of the development cohort are described in table 1. There were 5,475 patients included in this cohort: 2,825 from JGPN and 2,650 from ANH. In JGPN, the median age was 48 (IQR 34-62) years and 40.1% were male. 64.6% of the patients were coded as R74, 14.7% as R81, and 21.3% as R83. In ANH, the median age was 49 (IQR 36-62) years and 40.3% were male. 71.5% were coded as R74, 10.7% as R81, and 19.2% as R83. Differences between both datasets in the development cohort were minor. Around a quarter of patients suffered from one or more cardiovascular diseases, most often type 2 diabetes and coronary artery disease.
Patient characteristics of the COVID-19 confirmed patients in the validation cohort are described in table 2. The total of 16,693 patients for validation comprised 5,420 from JGPN, 4,989 from ANH, and 6,284 from AHA. In JGPN, the median age was 43 (IQR 30-56) years and 44.3% were male. In ANH, the median age was 47 (IQR 34-59) years and 42.5% were male, and in the AHA dataset, the median age was 49 (IQR 36-60) years and 39.2% were male. The differences between these three datasets in the validation cohort were also minor. Around 15-20% suffered from one or more cardiovascular diseases, again most often type 2 diabetes and coronary artery disease.
Model development and internal validation
All 5,475 patients were used for model development. Of these, 373 patients (6.8%) had the outcome hospital referral. The regression coefficients of the predefined predictors in the two models (i.e. age, sex, interaction between age and sex, and the number of cardiovascular diseases with three categories), with confidence intervals, are shown in table 3. The apparent c-statistic of the full model was 0.693 (95%CI 0.665-0.721) and the internally validated c-statistic was 0.688 (95%CI 0.660-0.716). The apparent c-statistic of the simple model was 0.681 (95%CI 0.653-0.710) and the internally validated c-statistic was 0.680 (95%CI 0.652-0.708). The full and the simple model are compared in table 4. The full model performed significantly better than the simple model (likelihood ratio test: χ2 = 19.5, df = 2, p<0.001). Figure 1 gives a visual representation of the full prediction model, showing the predicted risks for hospitalisation as a function of (increasing) age, stratified by sex and by the number of underlying cardiovascular diseases and/or diabetes. Overall, risks are higher for male patients and increase with age. Furthermore, a higher risk is observed in patients with underlying cardiovascular disease.
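The likelihood ratio test reported above can be checked with a short worked sketch. The test has 2 degrees of freedom because the three-category predictor adds two dummy variables to the simple model, and for 2 degrees of freedom the chi-square upper tail probability has the closed form exp(-x/2). The function names are chosen for this illustration only; the log-likelihoods of the fitted models are not reported in the text, so only the reported test statistic is used.

```python
import math


def likelihood_ratio_statistic(loglik_simple, loglik_full):
    """Test statistic: twice the gain in log-likelihood of the full model."""
    return 2.0 * (loglik_full - loglik_simple)


def chi2_upper_tail_df2(statistic):
    """P(X > statistic) for a chi-square distribution with exactly 2
    degrees of freedom, using the closed form exp(-statistic / 2)."""
    if statistic < 0:
        raise ValueError("statistic must be non-negative")
    return math.exp(-statistic / 2.0)


if __name__ == "__main__":
    p_value = chi2_upper_tail_df2(19.5)  # reported chi-square statistic
    print(f"p = {p_value:.1e}")
```

For the reported statistic of 19.5, this gives a p-value of roughly 6 × 10^-5, consistent with the reported p<0.001.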
Temporal external validation
Predicted risks were overall slightly higher than the observed risk (6.2% versus 4.6%) and the calibration slope was 1.36. Overall discrimination showed an AUC of 0.747 (95%CI 0.729-0.764). All combined and individual database validation performance measures are shown in table 5. Figure 2 shows the overall calibration plot and the calibration plots per database separately are given in the supplementary materials. The outcome prevalence was lower in the validation datasets than in the development datasets (4.7% versus 6.8%), probably explaining the overestimation in predicted risks by the model.
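To make the validation measures concrete, the following minimal sketch computes the c-statistic as the proportion of concordant event/non-event pairs, the Brier score, and the ratio of observed to mean predicted risk. It is written in plain Python for illustration and is not the code used in the study (which relied on R packages); the function names and the toy data are hypothetical.

```python
def c_statistic(predicted, outcomes):
    """Probability that a random patient with the event has a higher
    predicted risk than a random patient without it (ties count 0.5)."""
    events = [p for p, y in zip(predicted, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for risk_event in events:
        for risk_non_event in non_events:
            if risk_event > risk_non_event:
                concordant += 1.0
            elif risk_event == risk_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def brier_score(predicted, outcomes):
    """Mean squared difference between predicted risk and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(predicted, outcomes)) / len(outcomes)


def observed_expected_ratio(predicted, outcomes):
    """Observed event fraction divided by the mean predicted risk."""
    return (sum(outcomes) / len(outcomes)) / (sum(predicted) / len(predicted))


if __name__ == "__main__":
    # Toy data: four hypothetical patients, one of whom was referred.
    risks = [0.02, 0.05, 0.10, 0.20]
    referred = [0, 0, 1, 0]
    print(c_statistic(risks, referred))
    print(brier_score(risks, referred))
    print(observed_expected_ratio(risks, referred))
```

Applied to the aggregate figures reported above, an observed risk of 4.6% against a mean predicted risk of 6.2% corresponds to an observed/expected ratio of about 0.74, i.e. overestimation of risk on average.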
Discussion
Cardiovascular vulnerability was a predictor of escalation of care in a population of 5,475 consecutive adult patients in primary care with confirmed or clinically suspected COVID-19 in the ‘first wave’ of infections in the Netherlands. This finding was confirmed by external validation in a population of 16,693 confirmed COVID-19 primary care patients in the ‘second wave’, supporting the robustness of our inferences. On average, 5.1% of the 22,168 primary care patients were referred to the hospital for consideration of admission. A model including the number of concurrent cardiovascular conditions and/or diabetes (0, 1, or ≥2) in addition to age and sex showed moderate to good performance and demonstrated consistent and good discrimination and calibration upon temporal external validation. The model showed a c-statistic of 0.747 (95%CI 0.729-0.764).
Although most COVID-19 patients have a favourable prognosis without the need for referral for hospital care, studies on COVID-19 have mainly focussed on those seen in the hospital setting. While on average the risk of hospitalisation in community-dwelling adults with COVID-19 is low (5.1%), it is much higher than the hospitalisation rate for other lower respiratory infections, which is estimated at approximately 1% of the adult population affected.(19) In our study, age, sex and the number of concurrent cardiovascular conditions and/or diabetes identified specific patients at far greater risk of hospitalisation. In fact, for female patients without cardiovascular comorbidity, the risk of clinical deterioration for which hospital referral was deemed necessary remains well below 10% even in the oldest patients (aged 80+), while in the presence of cardiovascular diseases and/or diabetes, patients, notably males, experience higher risks already at younger ages. For instance, a male patient with two or more underlying cardiovascular diseases reaches a predicted risk of 15% at the age of around 57 years, and this predicted risk increases further to above 20% from the age of 80 onwards. This indicates the incremental effect of cardiovascular diseases and/or diabetes in addition to age and sex in predicting the risk of complicated COVID-19 disease trajectories in primary care patients.
Our findings overall confirm those from previous studies done in the hospital setting, where age and male sex are important predictors of disease progression towards the endpoints ICU admission or death.(1,20–22) Social, behavioural, comorbidity and biological differences (ACE2 expression, sex hormones, X-chromosome exposure) between the male and female sexes all might contribute to the higher risks of COVID-19 progression observed in males, although probably not all mechanisms have been fully elucidated yet.(23,24) Also, in hospitalised patients, it has been demonstrated that there is an association between cardiovascular disease and complicated COVID-19 disease trajectories, with a higher prevalence of cardiovascular disease and diabetes described in those with critical illness.(3–7,25) Our study shows that this prognostically unfavourable effect is already present much earlier in the COVID-19 disease course, at the start of symptoms in primary care.
This is in line with previous research, where this additive effect of (cardiovascular) comorbidities was also described by the 4C Mortality Score.(26) In that study, the authors demonstrated that the number of comorbidities, importantly including cardiovascular comorbidities, was more predictive of in-hospital mortality in COVID-19 patients than individual comorbidities considered alone.(26) Furthermore, two large community-based prediction studies also highlight the importance of cardiovascular comorbidities as a predictor in the community COVID-19 population. The QCOVID model, developed in the UK, was based on data from primary care and showed excellent discrimination (c-statistic >0.9) for the primary outcome time to death from COVID-19. The domain of that study, however, covered the whole general population regardless of COVID-19 diagnosis, and its predictions can therefore best be interpreted as the risk of getting infected with COVID-19 and subsequently having complications from COVID-19. Thus, the aim of this model was to inform UK health policy and support interventions to manage COVID-19 related risks, rather than to inform medical decision making during patient consultations in confirmed or clinically suspected COVID-19 cases.(27) With only 0.07% with the outcome death, and thus very low a priori chance, the c-statistic ‘misleadingly’ moves towards 1.0. Another similar public health based UK study in patients with and without COVID-19 identified determinants that were associated with COVID-19 related death in the OpenSAFELY primary care database by linking primary care records to reported COVID-19 related deaths.
It found the most predictive clinical determinants to be increasing age, male sex, type 2 diabetes mellitus, and cardiovascular disease, similar to our findings.(28) While the domain notably differs between patients seeking care for COVID-19 symptoms in our study and the adult community as a whole in these studies from the UK, all draw similar conclusions on the increased risk of clinical deterioration in patients with cardiovascular disease.
Strengths and limitations
This research contributes to the evidence-based prognostication of community COVID-19. We were able to use a large and representative database capturing both the ‘first’ and ‘second’ wave of the COVID-19 pandemic in the Netherlands. We used state-of-the-art methodology including external temporal validation to predict clinical deterioration in a patient population that is currently understudied. For full appreciation of our findings, however, some limitations also need to be addressed. First, the model was developed in a dataset with a low event fraction of the outcome hospital referral, yet the actual number of hospital referral events did allow us to perform robust multivariable regression techniques. Second, there are limitations to using routine care registry data that could have resulted in misclassification of the study population, predictors and outcome, and that most importantly carry the risk of missing values. For example, uncertainty concerning COVID-19 infection status may exist, as COVID-19 PCR test results are not automatically linked to the primary care electronic medical records. However, the model proved its transportability in primary care patients in a different time period with satisfactory calibration and discrimination, during a time window in which PCR testing was widely performed. Furthermore, the outcome hospital referral was based upon a rigorous manual extraction of medical records by pairs of researchers, yet not upon actual linkage to hospital records. Additionally, there are differences between our development and validation populations: the patients from the ‘first wave’ are all symptomatic patients who visited their primary care physician for symptoms suggestive of COVID-19, while the patients from the ‘second wave’ (due to the government recommendation for individuals to get tested even with only mild symptoms) also include healthier people who merely informed their primary care physician of their positive COVID-19 PCR status.
This could also explain the lower event fraction in the validation set (4.7% versus 6.8% in the development population). Finally, the incremental value of cardiovascular disease in prognosticating COVID-19 was assessed in different ways; although the likelihood ratio test was highly significant, the change in c-statistic and R2cs was only small to modest. Possible reasons for this include the overall low risk of hospitalisation in most patients in our cohort and the fact that most patients in our cohort (80.2%) did not suffer from concurrent cardiovascular diseases and/or diabetes; it has been widely acknowledged that, notably in such scenarios, a change in e.g. the c-statistic is difficult to achieve.
Clinical implications
The simplicity of the chosen predictors and their clinical applicability may provide great advantages for risk profiling patients with suspected or confirmed COVID-19 in the primary care and community setting. This can have several important clinical implications. First, it may be possible to identify patients who will benefit from closer monitoring and frequent follow-up at home by predicting the risk of clinical deterioration early in the COVID-19 disease course. By intensified monitoring of higher risk patients, critical illness may be detected earlier, potentially improving prognosis. Second, risk prediction could also support advance care planning. Informing both patients and physicians about the risk of severe illness may help in anticipating a more stringent or more lenient management. Last, risk profiling may be used for targeting treatment. Vaccination strategies to prevent COVID-19, for instance, may focus on those with cardiovascular disease and/or diabetes first. Additionally, experimental regimens to treat COVID-19 may be directed at high-risk patients, who may benefit most. Examples include treatment with budesonide or colchicine; both treatment options likely benefit patients most at higher prior probability of having an adverse prognosis.(29,30) Nevertheless, in the end, risk prediction in primary care has to prove its value in daily practice against the background of the changing characteristics of this challenging COVID-19 pandemic and the influence of virus mutations. We do, however, hope that prognostic studies like ours may aid physicians in making informed, evidence-based decisions and thereby improve patient outcomes.
Conclusion
In this general population study, risk of clinical deterioration after suspected or confirmed COVID-19 was on average 5.1%. This risk increased with age and was higher in males compared to females. Importantly, patients with concurrent cardiovascular disease and/or diabetes had higher predicted risks. Identifying those at risk for hospital referral could have clinical implications for COVID-19 early disease management in primary care.
Data Availability
The data used for this study are available from the Dutch routine primary care registries JGPN, ANH and AHA. Restrictions apply to the availability of these data, which are used under license for the current study, and so are not publicly available. Data will however be available from the authors upon reasonable request and with permission of the individual registries.