Abstract
Background A wide range of predictive models exist that predict risk of common lifestyle conditions. However, these have not focused on identifying pre-clinical higher risk groups that would benefit from lifestyle interventions and do not include genetic risk scores.
Objective To develop, validate, and compare the performance of three decision rule algorithms including biomarkers, physical measurements and genetic risk scores for incident coronary artery disease (CAD), diabetes (T2D), and hypertension in the general population against commonly used clinical risk scoring tools.
Methods We identified 60782 individuals in the UK Biobank study with available follow-up data. Three decision rules models were developed and tested for an association with incident disease. Hazard ratios (with 95% confidence interval) for incident CAD, T2D, and hypertension were calculated from survival models. Model performance in discriminating between higher risk individuals suitable for lifestyle intervention and individuals at low risk was assessed using the area under the receiver operating characteristic curve (AUROC).
Results We ascertained 500 incident CAD cases, 1005 incident T2D cases, and 2379 incident cases of hypertension. The higher risk group in the decision rules model had a 40-, 40.9-, and 21.6-fold increase in risk of CAD, T2D, and hypertension, respectively (P < 0.001 for all). Risk increased significantly between the three strata for all three conditions (P < 0.05). Risk stratification based on decision rules identified both a low-risk group (only 1.3% incident disease across all models), as well as a high-risk group where at least 72% of those developing disease within 8 years would have been recommended lifestyle intervention. Based on genetic risk alone, we identified not only a high-risk group, but also a group at elevated risk for all health conditions.
Conclusion We found that decision rule models comprising blood biomarkers, physical measurements, and polygenic risk scores are superior at identifying individuals likely to benefit from lifestyle intervention for three of the most common lifestyle-related chronic health conditions compared to commonly used clinical risk scores. Their utility as part of digital data or digital therapeutics platforms to support the implementation of lifestyle interventions in preventive and primary care should be further validated.
Introduction
Developed countries have seen a consistent rise in life expectancy and overall improving trends in chronic disease outcomes [1]. In just six decades, this has translated to a global increase in life expectancy of over 20 years for both men and women [1]. Yet, longer life expectancy has been accompanied by an increase in the prevalence of common chronic diseases, such as coronary artery disease (CAD), type 2 diabetes (T2D), and hypertension, which pose a significant burden to societies and limit healthy life expectancy (HALE) [2,3]. Preventive strategies which allow for earlier lifestyle intervention are a solution to tackle the growing burden of lifestyle-related health conditions. Indeed, lifestyle interventions such as weight loss, limiting (saturated) fat intake, and 30 minutes of exercise per day are recommended across multiple guidelines to reduce cardiovascular disease risk and the progression from prediabetes to T2D [4,5]. Yet, the sustainable implementation of lifestyle interventions faces several challenges, and cannot be achieved with one-size-fits-all approaches [6]. Rather, adherence and maintenance of health behaviour change requires personalized lifestyle recommendations.
To be able to provide such targeted lifestyle recommendations, the first step is to adequately stratify risk in individuals in a pre-clinical state and prioritize which aspects of their health they ought to focus on. For the three prevalent chronic health conditions mentioned above, several risk assessment tools have been made available to primary care physicians, including the Framingham risk scores [7,8,9]. These risk scores incorporate clinical and laboratory parameters, and have been shown to perform comparably well in European populations to other risk scores [7,10]. However, over two thirds of models for cardiovascular risk are restricted to a mixture of demographics, medical history, blood pressure and lipid profile, and a limited set of lifestyle factors, such as smoking [11]. To date, these models have not included physical measurements or genetic susceptibility, although these health conditions are known to be multifactorial in nature; for instance, progression from prediabetes to T2D is accelerated by even modest increases in adiposity in individuals at higher genetic risk [6]. This omission is notable given that several studies have shown that the addition of genetic risk scores, as well as scores combining physical measurements and lifestyle factors, to demographic and biomarker data can improve risk stratification in preventive and primary care settings [11,12,13,14,15,16,17].
This study aimed to evaluate decision rules models incorporating other routine biomarkers, physical measurements, and genetic information in addition to established risk factors and investigate whether these improve risk stratification for three prevalent lifestyle-related health conditions in the large population-based UK Biobank cohort.
Methods
Study population
The UK Biobank is a longitudinal population-based cohort of 502,503 participants aged 37 to 73 years, recruited between 2006 and 2010. For this study, we included only participants without coronary artery disease, T2D, and hypertension diagnosed by a physician at recruitment, in whom extensive follow-up data were available. In addition, individuals without diagnosed disease but who at baseline crossed a “clinical threshold” for any of the health conditions were also excluded from further analysis. These were individuals with a systolic blood pressure between 140 and 180 mmHg or a diastolic blood pressure between 90 and 120 mmHg for hypertension [18], a fasting glucose value above 7.0 mmol/L for T2D [19], and significantly impaired kidney function for cardiovascular disease [20,21]. Individuals for whom any of the variables in Table 1 were missing were also excluded. This study was conducted under UK Biobank application 55495, and followed TRIPOD reporting guidelines (Appendix S1). Local Institutional Review Board ethics approval was not necessary for this study.
Biomarkers, physical measurements and polygenic risk scores
To define the risk factors for each of the health conditions, a literature search was conducted in accordance with the 2009 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [21]. We searched for meta-analyses indexed in PubMed that were published between January 2014 and the end of 2019 (additional details on the search strategy are available in table S1, and PRISMA flowcharts in Figs S1-3). We also searched relevant national and international clinical guidelines not originally identified by the search. Based on the findings of the literature review, rules were defined to stratify individuals as high, elevated, and no elevated risk. These rules are described in detail in Appendix S2, and briefly below. Data on biomarkers were retrieved from the blood biochemistry category in the UKB, physical measurements from the body size measurements and abdominal composition categories, and smoking status was ascertained based on the self-reported smoking status registered at recruitment. Family and medical history were retrieved from the respective categories.
Coronary artery disease
For coronary artery disease, the literature and additional guideline search identified total cholesterol [22], HDL cholesterol [7,23,24], LDL cholesterol [24], triglycerides [23,25,26,27], and high-sensitivity C-reactive protein (hs-CRP) [28,29,30] as relevant blood biomarkers. The Framingham Risk Score for 10-year coronary heart disease risk was used, which included information on treatment for hypertension and smoking status [7]. A polygenic risk score for coronary artery disease was calculated as described below. Individuals were classified as high risk, for which intervention is advised, if they met any of the following rules, all weighted equally: total cholesterol above 8 mmol/L, systolic blood pressure above 180 mmHg, LDL cholesterol above 4.9 mmol/L, the incidence risk according to Framingham was high and the genetic susceptibility score was “high”, or if biomarkers other than LDL were out of range. The no elevated risk profile was defined as no biomarkers being out of range, the genetic susceptibility score below the eighth decile, and negative family history. All others for whom at least one risk factor was elevated were classified as intermediate risk.
Diabetes
Glycaemic variables (fasting glucose and HbA1c), blood lipids, markers of body composition, blood pressure, family history, gender, and smoking were identified as risk factors [8,31,32,33,34,35]. The Framingham Risk Score for diabetes was used, and a polygenic risk score for diabetes was calculated [8]. Participants were placed in the highest stratum if they met any of the following rules, all weighted equally: HbA1c was above 6.5% and fasting glucose was below 6.1 mmol/L, fasting glucose was above 6.1 mmol/L, either of the glycaemic variables was elevated (HbA1c between 5.5% and 6.4% or fasting glucose between 5.6 mmol/L and 6.1 mmol/L) and they were overweight/obese, their clinical risk was high, their glucose was unregulated and their genetic susceptibility was high, or if their clinical risk was elevated, they were older than 45, had an HDL cholesterol below 0.9 mmol/L, and triglycerides above 2.8 mmol/L [19]. Participants were classified as not being at elevated risk if all biomarkers were within normal range, the genetic susceptibility score was below the eighth decile, and clinical risk was not elevated. All others with at least one marker out of range were considered at intermediate risk.
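The diabetes rules above can be sketched as a small classifier. This is an illustrative reading rather than the study's implementation (the authoritative definitions are in Appendix S2), and several details are assumptions made here: "overweight/obese" is taken as BMI of 25 or above, "unregulated glucose" is taken to mean either glycaemic variable in the elevated range, "elevated" clinical risk is mapped to the intermediate Framingham stratum, a "high" genetic susceptibility is the top PRS decile, and the names T2DProfile and classify_t2d are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class T2DProfile:
    hba1c_pct: float          # HbA1c in %
    fasting_glucose: float    # mmol/L
    bmi: float                # kg/m^2
    age: float                # years
    hdl: float                # mmol/L
    triglycerides: float      # mmol/L
    clinical_risk: str        # "low", "intermediate" or "high" (Framingham stratum)
    prs_decile: int           # 1..10
    other_biomarkers_normal: bool = True  # assumption: summarises the remaining markers


def glycaemia_elevated(p: T2DProfile) -> bool:
    # Elevated ranges as quoted in the text.
    return 5.5 <= p.hba1c_pct <= 6.4 or 5.6 <= p.fasting_glucose <= 6.1


def classify_t2d(p: T2DProfile) -> str:
    # Any single rule places the participant in the highest stratum.
    high_rules = [
        p.hba1c_pct > 6.5 and p.fasting_glucose < 6.1,
        p.fasting_glucose > 6.1,
        glycaemia_elevated(p) and p.bmi >= 25.0,        # assumption: BMI >= 25
        p.clinical_risk == "high",
        glycaemia_elevated(p) and p.prs_decile == 10,   # assumption: "unregulated" glucose
        p.clinical_risk == "intermediate" and p.age > 45
        and p.hdl < 0.9 and p.triglycerides > 2.8,
    ]
    if any(high_rules):
        return "high"
    no_elevated = (
        p.other_biomarkers_normal
        and p.hba1c_pct < 5.5
        and p.fasting_glucose < 5.6
        and p.prs_decile < 8
        and p.clinical_risk == "low"
    )
    return "no elevated" if no_elevated else "intermediate"
```

Because the rules are weighted equally and combined with a logical OR, a single out-of-range glycaemic value in an overweight participant is enough for the highest stratum, whereas the "no elevated" label requires every condition to hold at once.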
Hypertension
For hypertension, the literature and additional guideline search identified age [36], systolic and diastolic blood pressure [36,37,38,39,40], body mass index (BMI) [41,42], gender [43], and smoking status as relevant markers. The Framingham Risk Score for hypertension risk [9] was used, and a PRS for systolic blood pressure was calculated. Participants were classified as high risk if their systolic blood pressure was between 130 and 140 mmHg, or their diastolic blood pressure was between 80 and 90 mmHg. Equally, those with a high clinical risk, or an elevated clinical risk and a high PRS, were stratified as high risk. The no elevated risk profile was defined as all biomarkers being within normal range, the genetic susceptibility score being below the eighth decile, and incidence risk according to the clinical score not being elevated. All others with at least one marker out of range were considered at intermediate risk.
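As a minimal sketch, the hypertension rules can be written as one function. The function name, the argument names, and the half-open treatment of the blood pressure ranges (140 mmHg systolic and 90 mmHg diastolic were baseline exclusion thresholds) are assumptions made for illustration, as is the mapping of "elevated" clinical risk to the intermediate Framingham stratum and of a "high" PRS to the top decile.

```python
def classify_hypertension(sbp, dbp, clinical_risk, prs_decile, biomarkers_normal=True):
    """Return "high", "intermediate" or "no elevated" for one participant.

    sbp, dbp: systolic and diastolic blood pressure in mmHg.
    clinical_risk: Framingham stratum, "low", "intermediate" or "high".
    prs_decile: decile (1..10) of the systolic blood pressure PRS.
    biomarkers_normal: hypothetical flag summarising the remaining markers.
    """
    # Assumption: ranges are treated as [130, 140) and [80, 90).
    prehypertensive = 130 <= sbp < 140 or 80 <= dbp < 90
    if (
        prehypertensive
        or clinical_risk == "high"
        or (clinical_risk == "intermediate" and prs_decile == 10)
    ):
        return "high"
    if biomarkers_normal and prs_decile < 8 and clinical_risk == "low":
        return "no elevated"
    return "intermediate"
```

For example, a participant with 118/76 mmHg, low clinical risk, and a PRS in the ninth decile would land in the intermediate stratum under this reading, because the genetic score alone is out of range.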
Polygenic risk scores
Polygenic risk scores (PRS) were calculated using an additive model for CAD, T2D and hypertension. Individuals were binned into deciles based on their PRS and the average disease incidence was calculated for each decile. The differences between individuals in the tenth risk decile, those in the ninth and eighth deciles, and all other deciles were assessed. The 1000 Genomes dataset was used as reference panel for the LD calculations [44]. The genotyping data and data containing the tested phenotype outcomes were downloaded from the UKB. All variants with an imputation R2 < 0.4, determined with the minimac3 algorithm, were removed from the downloaded genotyping files [45]. Summary statistics files from three large genome-wide association studies (GWAS) conducted in other cohorts were used to calculate PRS for CAD, T2D, and hypertension [46,47,48]. These publicly available summary statistics were reformatted where necessary to be consistent with the format required by LDpred. A rho of 1 was used, and all variants with a GWAS significance p-value below 0.01 were selected based on previous studies showing marginal differences between this and other stringency cutoffs (table S4) [13]. In total, the T2D, CAD, and hypertension PRS included 199120, 139885 and 400016 SNPs, respectively. The PRS were also computed with and without adjustment for the following variables: genotyping array, first four principal components, age and sex. To assess the added predictive value of PRS over sex and age alone, we also added the predictions of a logistic regression model including only sex and age (table S2). Individuals were binned into deciles based on their PRS and the average disease incidence at each age was calculated for each decile (Fig S1). Additional methods are available in Appendix S3.
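The additive scoring and decile binning described above can be sketched with NumPy. This is not LDpred: the per-variant weights are assumed to have been derived already (for instance, LD-adjusted effect sizes), and the function names and the mapping of deciles to "high", "elevated", and "no elevated" groups are illustrative assumptions based on the decile comparisons described in the text.

```python
import numpy as np


def additive_prs(dosages, weights):
    """Additive score: sum over variants of effect weight times allele dosage.

    dosages: array of shape (n_individuals, n_variants), values in [0, 2].
    weights: array of shape (n_variants,), assumed pre-computed effect sizes.
    """
    return np.asarray(dosages, dtype=float) @ np.asarray(weights, dtype=float)


def assign_deciles(scores):
    """Rank-based deciles, 1 (lowest scores) to 10 (highest scores)."""
    scores = np.asarray(scores, dtype=float)
    ranks = scores.argsort(kind="stable").argsort(kind="stable")  # 0..n-1
    return ranks * 10 // len(scores) + 1


def genetic_stratum(decile):
    """Assumed grouping: decile 10, deciles 8 and 9, and deciles 1 to 7."""
    if decile == 10:
        return "high"
    if decile >= 8:
        return "elevated"
    return "no elevated"


def incidence_by_decile(deciles, events):
    """Mean incidence (share of individuals with an event) per decile."""
    deciles = np.asarray(deciles)
    events = np.asarray(events, dtype=float)
    return {d: float(events[deciles == d].mean())
            for d in range(1, 11) if np.any(deciles == d)}
```

Ranking before binning keeps the decile sizes equal even when scores are tied or heavily skewed, which is one reasonable design choice among several.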
Ascertainment of disease incidence
Information regarding the variables used to calculate incidence for each of the health conditions at 8 years after study enrollment can be found in table S2. In short, we used International Statistical Classification of Diseases and Related Health Problems (ICD) codes, and the self-reported diagnoses collected at recruitment and follow-up questionnaires.
Statistical analysis
Similar to the three strata of risk for the decision rules model, three risk strata (“low”, “intermediate” and “high”) were defined for the Framingham scores. For the coronary artery disease risk score, the bottom, middle, and top tertiles were used as risk categories [7]. For diabetes, categories were based on <3%, 3% to 8%, and >8% incidence risk at eight years [8]. For hypertension, this was <5%, 5% to 10%, and >10% incidence risk [9].
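A minimal sketch of this categorisation follows. The cutoffs come from the text; the function names are hypothetical, and computing the CAD tertiles within the analysed sample is an assumption, since the text does not state the reference distribution.

```python
T2D_CUTS = (3.0, 8.0)            # % incidence risk at eight years
HYPERTENSION_CUTS = (5.0, 10.0)  # % incidence risk


def stratum_from_risk(risk_pct, cuts):
    """Map a predicted risk in % to "low", "intermediate" or "high"."""
    low_cut, high_cut = cuts
    if risk_pct < low_cut:
        return "low"
    if risk_pct > high_cut:
        return "high"
    return "intermediate"


def tertile_strata(scores):
    """Assign bottom, middle and top tertiles by rank within the sample (assumption)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [""] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = ("low", "intermediate", "high")[rank * 3 // len(scores)]
    return labels
```

With these cutoffs, a predicted diabetes risk of exactly 3% or 8% falls in the intermediate stratum, matching the "3% to 8%" wording.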
To evaluate the ability to discriminate higher risk individuals who would be suggested lifestyle intervention from those at no elevated risk, we used the area under the receiver operating characteristic curve (AUROC) computed from 2000 bootstrap iterations. Sensitivity and specificity for each model are also presented. Cox proportional hazards models were used to test the association of risk strata defined by the decision rules model and the clinical scores with incident events of CAD, T2D, and hypertension. Hazard ratios (HRs) with 95% confidence intervals were calculated between risk strata and the reference group (those not at elevated risk for the decision rules model, and low risk for Framingham).
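The discrimination metrics can be illustrated with a short NumPy sketch. It assumes a percentile bootstrap for the confidence interval and a pairwise (Mann-Whitney) AUROC estimate; the text does not specify either choice, and the analysis itself was run in R, so the function names and defaults below are illustrative only.

```python
import numpy as np


def auroc(y_true, y_score):
    """Share of (case, non-case) pairs in which the case scores higher; ties count half."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos, neg = y_score[y_true], y_score[~y_true]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))


def bootstrap_auroc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (assumed CI method)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)  # resample individuals with replacement
        if y_true[idx].all() or not y_true[idx].any():
            continue  # AUROC is undefined without both cases and non-cases
        stats.append(auroc(y_true[idx], y_score[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


def sensitivity_specificity(y_true, flagged):
    """Sensitivity and specificity of a binary 'recommend intervention' flag."""
    y_true = np.asarray(y_true, dtype=bool)
    flagged = np.asarray(flagged, dtype=bool)
    sens = (flagged & y_true).sum() / y_true.sum()
    spec = (~flagged & ~y_true).sum() / (~y_true).sum()
    return float(sens), float(spec)
```

The pairwise AUROC builds a cases-by-non-cases comparison matrix, which is simple to read but memory-hungry; a rank-based formula would be preferable for cohorts of this size.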
We considered a p-value <0.01 as statistically significant for differences in AUROC determined by DeLong’s nonparametric test and p-value <0.05 significant for differences in risk between strata [49]. All data analyses were performed using R software v4.0.3 and the “survival”, “survminer” and “ggplot2” packages were used to conduct the survival analysis and generate graphs.
Results
Population characteristics
In total, 60 782 unique participants had follow-up data available, of which 42 978, 36 913, and 33 541 were included in the analyses for T2D, CAD, and hypertension, respectively. Table 1 shows the baseline characteristics of the study population. At follow-up, there were 500 incident CAD cases, 1005 incident T2D cases, and 2379 incident hypertension cases. Participants were aged 56.3 years on average, and slightly more participants were female (51.2%). Average values for all lipid markers were above general recommendations [21]. Similarly, physical measurements of BMI and waist circumference were above the existing thresholds for abdominal obesity, and both average systolic and diastolic blood pressure values crossed the stage 1 hypertension threshold [50,51].
Polygenic risk scores
For all three health conditions, a higher PRS was strongly associated with a higher incidence rate (Fig 1). For the highest risk stratum compared to the rest of the population, this translated to a hazard ratio (HR) of 4.6 (95% CI 3.8-5.6), 2.9 (2.5-3.4), and 1.9 (1.7-2.1) for CAD, T2D, and hypertension, respectively (table 2). When comparing the highest risk individuals to those in the first seven deciles, the HRs were 7 (95% CI 5.7-8.7), 3.8 (3.2-4.4), and 2.2 (2.0-2.5) (table 3). The risk for individuals in the ninth and eighth deciles was also significantly higher compared to individuals in the first seven deciles, with HRs of 3.4 (2.7-4.2) for CAD, 2.3 (2.0-2.7) for T2D, and 1.8 (1.7-2) for hypertension (table 3).
Sensitivity analysis
The Framingham scores achieved an AUROC of 0.67 (95% CI 0.63-0.71) for women and 0.60 (0.58-0.63) for men, 0.72 (0.71-0.73), and 0.60 (0.59-0.60) for CAD, T2D, and hypertension, respectively. Sensitivity and specificity for these models were 59.3% and 74.4%, 49.5% and 70.8%, 72.2% and 71.6%, and 97.4% and 22.1%. The performance of the decision rules model was better than Framingham for CAD in men, T2D, and hypertension with an AUROC of 0.66 (0.64-0.68) for CAD, 0.75 (0.74-0.76) for T2D, and 0.70 (0.69-0.71) for hypertension (table 2, P < 0.01 for all). There was no difference in performance between the decision rules model and Framingham for CAD in women. The discriminatory power of the decision rules model was also superior to PRS alone (table 3). Specificity was higher for T2D (68.2%, 67.7%-68.6%) and hypertension (65.5%, 65%-66%), with sensitivity also higher for T2D (81.5%, 79.1%-84%) and lower for hypertension (73.9%, 72.7%-75.7%). For CAD, the decision rules model achieved higher sensitivity (72.0%, 68.0%-76.0%) but lower specificity (59.8%, 59.3%-60.3%) than both Framingham models. For the decision rules models, positive predictive values were higher for T2D, hypertension and CAD in women, but lower than for the men’s model (table 4). Negative predictive values were extremely high for all models, with the highest being 99.58% (99.50%-99.65%) for the Framingham for CAD in women and the lowest 97.05% (96.85%-97.24%) for the decision rules for hypertension (table 4).
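The link between sensitivity, specificity, and predictive values can be made concrete with a short calculation. The sketch below uses the sensitivity and specificity quoted above for the CAD decision rules model together with the number of incident cases and participants in the CAD analysis; the function name is hypothetical, and because the inputs are rounded, the derived values are approximate and need not match table 4 exactly.

```python
def predictive_values(sensitivity, specificity, n_total, n_cases):
    """Derive flagged count, PPV and NPV from sensitivity, specificity and case counts."""
    n_non_cases = n_total - n_cases
    tp = sensitivity * n_cases          # cases flagged for intervention
    fn = n_cases - tp                   # cases missed
    tn = specificity * n_non_cases      # non-cases correctly not flagged
    fp = n_non_cases - tn               # non-cases flagged
    return {
        "flagged": tp + fp,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# CAD decision rules model: sensitivity 72.0%, specificity 59.8%,
# 500 incident cases among 36 913 participants (figures from the text).
cad = predictive_values(0.720, 0.598, n_total=36913, n_cases=500)
print(f"flagged share: {cad['flagged'] / 36913:.1%}")
print(f"PPV: {cad['ppv']:.2%}, NPV: {cad['npv']:.2%}")
```

With a low eight-year incidence, even a model with good sensitivity yields a small positive predictive value and a very high negative predictive value, which is the pattern reported for all models here.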
Risk stratification and lifestyle advice recommendations
The observed absolute risk for each health condition differed between the high, intermediate, and low risk strata for the decision rules model, but not for the clinical risk score (Fig 2). In terms of absolute risk, being classified as high risk by the clinical score translated to a 2.6% and 1.4% difference in absolute risk compared to not being at elevated risk for CAD in men (HR 3.8, 2.8-5.1) and in women (HR 6.8, 4.3-10.8), 2.1% for T2D (HR 3.7, 2.1-6.7), and 7.4% for hypertension (HR 14.1, 9.36-21.3). For the intermediate risk stratum, there was a risk difference for CAD in men and women, but not for T2D or hypertension. In comparison, the high-risk group in the decision rules model showed a 2.34% increase in absolute risk for CAD (HR 40, 5.6-283), 5.64% for T2D (HR 40.9, 23.7-70.8), and 12.4% for hypertension (HR 21.6, 13.4-34.8). For the intermediate risk group, these differences were 0.62% (HR 4, 1.5-77), 0.69% (HR 5.6, 3.2-9.9), and 2.4% (HR 4.5, 2.8-7.3), respectively. If all individuals in the higher risk group were recommended lifestyle intervention as a consequence of their baseline measurements, 40.6%, 33%, and 37.2% of all individuals would be recommended lifestyle intervention for CAD, T2D and hypertension with the decision rules model. For T2D and hypertension, this is 41.6% and 53% less than if the clinical risk scores were used, while detecting as many cases for T2D and only 561 fewer for hypertension. For CAD, 14980 (40.6%) of individuals would have been recommended intervention by the decision rules, compared to 10111 (27.4%) for Framingham. This represents a difference of detecting and advising intervention to 72% of all those who eventually developed disease, as opposed to 53.2%. In addition, 15.4% of those who ended up developing CAD were classified as low risk by Framingham, compared to 0.2% for the decision rules model.
Discussion
We investigated the association of different risk categories of three decision rule models incorporating blood biomarkers, physical measurements, and genetic information, with incident disease for three common lifestyle-related health conditions and compared their performance to currently used clinical risk calculators in 60782 returning participants in the population-based UK Biobank study. Individuals classified as high risk who would be recommended lifestyle intervention by the decision rules model had a 40-, 41- and 22-fold higher 8-year risk of CAD, T2D, and hypertension compared to those who were classified as not having elevated risk. All decision rules models either outperformed the respective Framingham clinical score or showed improvement in the detection of cases likely to benefit from lifestyle intervention.
We showed that adding other biomarkers, physical measurements, and genetic risk to traditional clinical risk scores leads to moderate improvement in predictive performance. From the many clinical risk scores for risk estimation of cardiometabolic health conditions available, we chose to compare our decision rules models to the Framingham risk scores due to their extensive validation across multiple cohorts [10,11]. In this sub-population of the UKB cohort, the Framingham scores performed comparably to studies in North American and Dutch populations (0.63 to 0.67), but slightly worse than reports from other studies, including lower than in the best original validation studies (0.66 to 0.83) [58,59]. For hypertension, specifically, the inferior performance of the Framingham model compared to other studies likely comes from the substantially lower number of prehypertensive individuals and mean blood pressure values in these cohorts (below 120 mmHg systolic and 75 mmHg diastolic) compared to the UKB, leading to as many as 79.3% of individuals being classified as high risk [56,57]. Compared to the Framingham scores, all three decision rules models improved either the detection of cases likely to benefit from lifestyle intervention or of those least likely to do so [52,53]. The slight improvement in performance of the rules model for diabetes is not surprising, as unregulated HbA1c is a risk factor for disease development in prediabetics, and specific insulin resistance phenotypes are linked to central adiposity. Similarly, there is growing evidence for genetics playing a more central role in the diabetes burden than previously thought [54,55]. For hypertension, the addition of genetic data also likely explains the improved performance of the rules model.
While modest in magnitude, the differences in performance between the different models could have significant practical implications. Preventive health programs should consider the health risks of individuals holistically across a spectrum of mental and physical health. By improving the precision to detect those who eventually developed the disease and would be recommended intervention and minimizing the number of individuals who did not develop the disease and would have nonetheless been advised to take action, these models have the potential to increase the impact of such programs in two ways. On the one hand, it could increase their effectiveness, since the number of prevented cases if the interventions were successfully implemented would be higher. On the other hand, cardiometabolic health issues are highly prevalent. By also accurately identifying individuals less likely to benefit from a cardiometabolic health intervention in the short-term, these models can be combined with models for other physical and mental health conditions and help low risk individuals to prioritize lifestyle changes in other aspects of their health. With recent studies showing that programs as short as three to five months can trigger diabetes remission and improve cardiovascular risk factors [60,61,62], the use of these stratification mechanisms for a periodic risk assessment across varied lifestyle conditions would be a valuable tool for optimizing return on investment in personalized preventive medicine programs.
With regard to the addition of genetic risk to clinical scores, our findings support recent studies that suggested adding genetic susceptibility scores to clinical scores for CAD and T2D, as well as stroke or cardiovascular disease, led to improvements in risk prediction [12,63,64]. Based on genetic risk alone, we identified a group of high risk individuals with hazard ratios of 4.6, 2.9, and 1.9 for CAD, T2D and hypertension. However, we also encountered differences between the top risk decile and the ninth and eighth deciles, and between these and the rest of the population. In comparison, Khera et al. identified a similar risk increase only in the top 8% and 3.5% of individuals in the UKB, for CAD and T2D respectively, and the top 5% of individuals in the Finnish cohort of Mars et al. were at 2.62-fold increased risk of CAD and 3.28-fold for T2D (table S3, Fig S3) [12,13]. This effectively demarcates not only a “high risk”, but also an “elevated risk” group in these two deciles, compared to the “no elevated risk” group comprising the rest of the population.
One of the barriers to the implementation of risk models in preventive and primary care has been the belief that such algorithms have a low actual impact on decision-making in apparently healthy individuals, and mostly generate demand for “unnecessary” care [65,66,67]. In this study, we make two significant contributions to help overcome this issue. First, we showed that easily interpretable decision rules models including genetic risk can better identify individuals at low risk unlikely to benefit from lifestyle interventions in the short-term than traditional clinical scores. Models based on risk factor burden are easy to interpret and communicate, and a simple metric such as the absence or presence of more than one risk factor is associated with substantial differences in lifetime risk of cardiometabolic health conditions [68]. By including genetic risk in risk factor burden calculations in an additive way, we can identify individuals at genetically elevated or high risk with normal demographic and blood risk factors. There remain substantial financial and technical challenges in conducting GWAS, and in correctly calculating and interpreting PRS, for different health conditions. In individuals for whom routinely collected medical and biomarker data clearly identify a higher risk, or for monogenic conditions, the addition of polygenic risk is unlikely to bring additional useful information. However, as the GWAS and PGS catalogues keep expanding their currently limited repertoire of traits and conditions, this approach could be especially meaningful for implementation in preventive care, where risk stratification targets a younger, usually healthier population [69,70,71].
Second, the large sample size of the UK Biobank, even after exclusion of individuals without follow-up, allows us to extrapolate the potential impact of these models for preventive lifestyle intervention at large scale. In the Netherlands, more than 16000 people enrolled themselves in a combined lifestyle intervention program between January 2019 and April 2020 alone. In a UKB population at least twice that, 9000 and 14000 fewer people would have been recommended lifestyle intervention by the decision rules compared to the clinical risk scores for T2D and hypertension. With a growing number of digital medical data and digital therapeutics platforms available to support clinicians and empower individuals to proactively act on their health, it is becoming easier to collect, process and analyse data from different sources such as the blood, body composition, and genetic markers evaluated in this study. When integrated with such platforms, the models developed in this study represent a viable, potentially less resource intensive framework for lifestyle interventions in preventive and primary care.
This study also has some limitations. Firstly, the list of risk factors included is not exhaustive, due to both the high level of evidence required for inclusion in the model (most studies considered were meta-analyses) as well as the non-availability of other relevant variables in the UK Biobank data repository. Secondly, being a decision rules model, our proposed model does not provide individual risk predictions. While this increases the interpretability and applicability of the model (especially in a primary and preventive care setting), individuals within the same stratum may have different actual risk. Thirdly, we conducted the analysis with the assumption that all individuals classified as high risk who would have been recommended lifestyle intervention would not only have started it, but also achieved some degree of success. With a growing offer of consumer health and wellbeing programs, as well as employer-sponsored health programs, it is easier than ever before for individuals to preventively implement lifestyle changes [72]. However, many factors not accounted for here play a role in determining the actual effectiveness of these programs, so prospective validation in a study setting as well as in the market is required to assess the actual impact of these models on the effectiveness of preventive health interventions. Lastly, both the GWAS for the three PRS used in this study, as well as the UK Biobank cohort itself, are very ethnically homogeneous, with more than 90% of total participants being of white ethnicity and European descent. Therefore, the PRS results for UK Biobank participants of other ethnicities may be sub-optimal, and PRS and model validation will be required in cohorts with more diverse ethnic backgrounds.
In conclusion, in this prospective population-based cohort study of 60782 people, we developed and validated three risk stratification models for three prevalent chronic conditions. Adding other blood markers, physical measurements and genetic susceptibility scores to currently used clinical risk scoring tools resulted in moderate improvements in performance and in the identification of individuals likely to benefit from lifestyle intervention. When integrated with digital data or digital therapeutics platforms that enable the collection and analysis of these data, these algorithms can be used to support the successful adoption of lifestyle interventions in preventive and primary care.
Data Availability
The data that support the findings of this study are available from the UK Biobank project site, subject to registration and application process. Further details can be found at https://www.ukbiobank.ac.uk.
Funding
Tom J. de Koning, Bruce Wolffenbuttel, Sipko van Dam and Pytrik Folkertsma were funded by the Dutch Top Sector Life Sciences and Health Public-Private Partnership Allowance. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Conflict of interest
I have read the journal’s policy and the authors of this manuscript have the following competing interests: all authors except Tom J. de Koning and Bruce Wolffenbuttel are employed by Ancora Health B.V. Tom J. de Koning and Bruce Wolffenbuttel sit on the medical advisory board of Ancora Health B.V. Additionally, Jose Castela Forte, Rahul Gannamani, Sridhar Kumaraswamy, and Sipko van Dam own shares of Ancora Health B.V. The funder provided support in the form of salaries for all employees but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author contributions
JCF: main contributor to all aspects of the manuscript. PF: main contributor to the data analysis and the methods and results sections of the manuscript. RG: original ideation of the manuscript, and significant contributions to the drafting of the manuscript. SK: original ideation. SM: significant contributor to the methods and discussion sections of the manuscript. TdK: co-supervisor in the clinical aspects of the manuscript, and significant contributor to the discussion and interpretation of the findings. SvD: significant contributor to the data analysis and the methods and results sections of the manuscript. Responsible for the data access request to UKB. BW: main supervisor in the clinical aspects of the manuscript, and significant contributions to the introduction and discussion. All authors gave input towards and approved the final manuscript.
Acknowledgements
We thank the participants in the UK Biobank, as well as all clinical, academic and administrative staff involved in data collection and storage.
Footnotes
Added 2 references
References
- 1).
- 2).
- 3).
- 4).
- 5).
- 6).
- 7).
- 8).
- 9).
- 10).
- 11).
- 12).
- 13).
- 14).
- 15).
- 16).
- 17).
- 18).
- 19).
- 20).
- 21).
- 22).
- 23).
- 24).
- 25).
- 26).
- 27).
- 28).
- 29).
- 30).
- 31).
- 32).
- 33).
- 34).
- 35).
- 36).
- 37).
- 38).
- 39).
- 40).
- 41).
- 42).
- 43).
- 44).
- 45).
- 46).
- 47).
- 48).
- 49).
- 50).
- 51).
- 52).
- 53).
- 54).
- 55).
- 56).
- 57).
- 58).
- 59).
- 60).
- 61).
- 62).
- 63).
- 64).
- 65).
- 66).
- 67).
- 68).
- 69).
- 70).
- 71).
- 72).