Abstract
An inverse correlation between stature and risk of coronary artery disease (CAD) has been observed in several epidemiologic studies, and recent Mendelian randomization (MR) experiments have suggested causal association. However, the extent to which the effect estimated by MR can be explained by established cardiovascular risk factors is unclear, with a recent report suggesting that lung function traits could fully explain the height-CAD effect. To clarify this relationship, we utilized a well-powered set of genetic instruments for human stature, comprising >1,800 genetic variants for height and CAD. In univariable analysis, we confirmed that a one standard deviation decrease in height (∼6.5 cm) was associated with a 12.0% increase in the risk of CAD, consistent with previous reports. In multivariable analysis accounting for effects from up to 12 established risk factors, we observed a >3-fold attenuation in the causal effect of height on CAD susceptibility (3.7%, p = 0.02). However, multivariable analyses demonstrated independent effects of height on other cardiovascular traits beyond CAD, consistent with epidemiologic associations and univariable MR experiments. In contrast with published reports, we observed minimal effects of lung function traits on CAD risk in our analyses, indicating that these traits are unlikely to explain the residual association between height and CAD risk. In sum, these results suggest the impact of height on CAD risk beyond previously established cardiovascular risk factors is minimal and not explained by lung function measures.
Introduction
Epidemiological evidence suggests that shorter height is associated with an increased risk of coronary artery disease (CAD), even after adjustment for known risk factors such as smoking status, lipid levels, body mass index (BMI), systolic blood pressure, and alcohol consumption (1–3). Efforts to ascertain true effects of genetically influenced height on human health are important. Determinants of height include modifiable nutritional and socio-economic characteristics (4), as well as closely correlated cardiovascular or anthropometric traits (e.g., wider arteries or larger lungs). Mendelian randomization (MR) is a technique that can estimate causal relationships between traits by using genetic variants and their corresponding effects identified and measured through large-scale genetic analyses such as genome-wide association studies (GWAS). Studies using Mendelian randomization have provided evidence for a univariable causal association between shorter stature and risk of CAD (5,6). However, it is still unclear how much of this effect can be explained by established, conventional CAD risk factors. Previous work has suggested that the effect of height on CAD is nearly entirely explained by lung function, specifically forced expiratory volume in 1 second (FEV1) and forced vital capacity (FVC) (7). This finding is backed by epidemiological evidence (8,9), but results from other published univariable MR analyses are inconclusive (10).
Multivariable Mendelian randomization (MVMR) allows for the investigation of causal exposure-outcome trait relationships after accounting for additional factors that may complicate simple, direct exposure-outcome associations. Here, we reaffirm the univariable causal relationship between height and CAD risk using recent GWAS for height and CAD with increased power. Using MVMR methods, we then investigated direct effects of height on CAD risk after accounting for the effects of 12 established risk. We also expanded these experiments to consider effects of height additional cardiovascular traits within this MVMR framework. Lastly, we investigated the effects of lung function on CAD risk using univariable and multivariable MR analyses, including repeating the exact procedures used in previous work with updated and legacy datasets.
Results
We provide an overview of the experiments performed in Figure 1. We began by assembling GWAS data for height and CAD to perform experiments using the conventional, two-sample MR design. We utilized large, well-powered genetic studies for height and CAD, with mean sample size of 693,529 individuals per variant for height (11) and 122,733 CAD cases (12) (Table S1). For height, additional low frequency (1-5% MAF) or rare (0.1-1% MAF) genome-wide significant SNPs with effects up to 2 cm/allele were also included (13). After quality control and filtering (Methods), the final set of height instruments comprised of 2,037 variants, nearly two and a half times that of previous efforts (7) (Tables S2 and S3), explaining a theoretical 26.4% of the variance in height (Table S4). The set of variants had a Cragg-Donald F-Statistic of 121.5 (14) (based on the theoretical variance explained assuming Hardy-Weinberg equilibrium), minimizing any concerns regarding weak instrument bias (15,16).
Schematic flowchart outlining our experimental setup. We collected genetic instrument for height and effect sizes for 12 additional CAD risk factors as well as for lung function. These data were used in two-sample Mendelian randomization in univariable and multivariable investigation.
Genetically influenced height is inversely associated with CAD risk
We next utilized these summary statistics to conduct univariable, two-sample MR of height on CAD risk (Methods). Using the inverse-variance weighted (IVW) method, we observed that a one standard deviation (SD) decrease in height (∼6.5cm) (17) was associated with a 12.0% increase in the risk of CAD (OR = 1.13, 95% CI = 1.10 – 1.16, p = 1.6 × 10−20, Figure 2), a result that was consistent with previous reports (5–7). This effect had persistent and robust association in sensitivity MR analyses, including weighted median (WM) (OR = 1.12, 95% CI = 1.09 – 1.15, p = 4.2 × 10−14) and MR-Egger (OR = 1.10, 95% CI = 1.04 – 1.16, p = 2.7 × 10−4) as well as other approaches (Table S5). We further noted that the intercept estimated from MR-Egger was not significant, suggesting no direct statistical evidence supporting the presence of horizontal pleiotropy (p = 0.36). Taken collectively, these results confirm the previously observed statistical effect of shorter stature with increased susceptibility to CAD. However, these results do not themselves address the extent to which height modulates CAD risk beyond the set of conventional established risk factors for CAD, which includes alcohol use (18), birth weight (19), body mass index (20), blood pressure (21), educational attainment (22), lipid levels (23), physical activity (24), smoking behavior (25), type 2 diabetes (26), or waist-hip ratio (27).
a) Univariable height-CAD analyses using inverse-variance weighted, weighted median, and Egger MR methods. b) Multivariable MR results modeling all risk factors together. Effect direction indicates direction of the exposure that increases CAD risk – effects of all risk factors were flipped to OR>1.
Multivariable analysis substantially attenuates the association between genetically influenced height and CAD risk
We next sought to quantify the extent to which the catalog of established CAD risk factors could explain the association between lower stature and elevated risk of CAD. For this, we utilized multivariable Mendelian randomization (MVMR) to account for effects of 12 risk factors with established or compelling associations with CAD (Methods). After obtaining recent, large-scale GWAS for all traits (11,28–35), 1,906 of our 2,037 (93.6%) of variants were present in GWAS summary data across all traits and were utilized for MVMR analysis (Table S6). We note that a reanalysis of this slightly smaller (N variants = 1,906) instrument returned virtually identical results to the above univariable results (IVW OR = 1.13, CI= 1.10 – 1.16, p=4.7 × 10−19). Furthermore, the Sanderson-Windmeijer F statistic of the model was 40.9, indicating low bias due to weak instruments (36). After including all 12 risk factors jointly with height, we observed that one SD shorter stature was associated with a 3.7% increase in the risk of CAD (OR = 1.04, 95% CI = 1.01 – 1.07, P = 0.021, Figure 2, Table S7), representing a 68% attenuation in the effect compared to univariable MR. These results indicate a small, but significant, residual effect of height on susceptibility to CAD after accounting for known risk factors.
We next quantified the effects of each risk factor individually to explore the risk factors attenuating the effects of height on CAD risk in more detail. These effects were considered independent of the effects observed when all risk factors were jointly analyzed. In a series of MVMR models for height with each risk factor separately, we observed that 9 risk factors reduced height’s effect on CAD risk, ranging between 0.06 – 3.65% of the total effect (Table S8), somewhat consistent with previous work reporting 1-3% attenuation in height effect for each risk factor (7). In contrast, 3 risk factors (smoking initiation, physical activity, and waist-hip ratio) increased the effect of height on CAD susceptibility, between 0.09% – 1.03%. These results indicate that the attenuation of the effect of height of CAD in the full joint model is not entirely explained by a single factor, but instead by a combination of risk factors in aggregate.
Lung function measures are not genetically causally associated with CAD susceptibility
Recent work suggested that lung function measures are inversely associated with cardiovascular events (8–10). Furthermore, Mendelian randomization analyses suggested that the bulk of the causal effect of height on CAD susceptibility is complicated by FEV1 and FVC (7). Given these reports, we next focused our efforts on (i) characterizing what relationship exists between these measures and CAD susceptibility, and (ii) if these measures could explain the residual putative causal effect estimated between height and CAD.
To develop sets of FEV1 and FVC genetic instruments and perform MVMR analyses, we utilized a large, recently published GWAS study for both traits (37). After quality control and filtering analogous to what we applied to our height genetic instrument (Methods), the final sets of genetic variants comprised 171 for FEV1 and 128 for FVC, which explained a theoretical 3.26% and 2.32% of the genetic variance to each trait, respectively (Tables S9-S12). These sets of instruments had Cragg-Donald F-Statistics of 78.8 and 74.1, respectively, arguing against weak instrument bias (14–16).
We then performed 5 sets of MR experiments to examine the effects of FEV1 and FVC, and their roles as attenuators of height’s effect on susceptibility to CAD. Analyses were conducted using both univariable and multivariable methods, as well as recently published and legacy data sets. Collectively, we observed little attenuation of height’s effect on CAD risk due to lung function factors, as well as scant univariable association between either lung function measure and CAD risk (Figure 3, Tables S13-S20).
a) Univariable MR analyses of lung function and CAD. b) Multivariable MR results modeling all risk factors together, using Shrine et al. lung function GWAS. c) Multivariable MR results modeling height and lung function factors only using Shrine et al. lung function GWAS, on CAD risk. d) Multivariable MR results modeling all risk factors together, but using Neale Lab lung function GWAS. e) Multivariable MR results modeling height and lung function factors only but using Neale Lab lung function GWAS, on CAD risk.
First, we performed univariable MR analysis for each lung function measure on CAD susceptibility (Figure 3a). We observed no significant effect of FEV1 on CAD risk using inverse-variance weighted, weighted median, or MR-Egger methods (pIVW = 0.08, pWM = 0.78, pEgger = 0.44). Effects were directionally inconsistent, with increased FEV1 increasing CAD risk by MR-Egger, but increased FEV1 associated with decreased CAD risk by IVW and WM methods, though we note that all results do not reject the null hypothesis. For FVC, results were also not significant, except for those from the IVW method (pIVW = 0.034, pWM = 0.061, pEgger = 0.61). However, these results were directionally consistent across methods, with decreased FVC associated with increased CAD risk, consistent with epidemiological evidence (8,9).
Second, we performed MVMR with FEV1 and FVC, both with and without the previously described 12 risk factors using the same 1,906 variants as were used in MVMR analysis. After including FEV1 and FVC, we observed a virtually identical, attenuated association between decreased height and CAD susceptibility as we observed from our prior experiments considering those 12 established risk factors without FEV1 and FVC (OR = 1.04, 95% CI = 1.00 - 1.07, p = 0.024, Figure 3b).
Neither FEV1 or FVC were associated with CAD susceptibility in this joint model (pFEV1 = 0.41 and pFVC = 0.39). Similarly, when we excluded the 12 established risk factors and included only FEV1 and FVC as additional factors with height, we did not observe attenuation versus univariable height-CAD analysis (OR = 1.13, 95% CI = 1.10 - 1.16, p = 2.8 × 10−18, Figure 3c). Again, the associations for either lung function factor were, at best, modest (pFEV1 = 0.39 and pFVC = 0.025).
Third, we performed additional MVMR analyses that instead utilized the lung function GWAS considered in prior analyses, which had suggested a lung function complicating effect (URLs). Again, we did not observe substantial attenuation in height’s effect on CAD risk, (OR = 1.04, 95% CI = 1.01 - 1.08, p = 0.014, Figure 3d). When we modeled FEV1 and FVC exclusively with height, the effect of height on CAD risk did not appear to be reduced by lung function (OR = 1.13, 95% CI = 1.10 - 1.16, p = 3.2 × 10−17, Figure 3e), although the effects for both lung function measures were nominally significant at best (pFEV1 = 0.026 and pFVC = 0.052)
Fourth, we created a restricted set of height genetic instruments that excluded variants nominally associated with FEV1 or FVC (37) (p < 0.05) to more directly compare between results from the published work suggesting lung function’s effect on height (7), and only retained variants present in lung function, height, and CAD GWAS. Of 2,037 SNPs in our original set of height instruments, 1,112 remained and were used in univariable MR experiments analyzing the effect of height on CAD. We continued to observe a robust association between decreased height and increased CAD risk.
Estimated effects, using WM (OR = 1.12, CI = 1.07 - 1.17, p = 1.0×10−7) and IVW methods (OR = 1.13, CI = 1.09 - 1.17, p = 3.3×10−11), were similar to our univariable analyses (Figure S1a), and the estimated effect from MR-Egger increased slightly (OR = 1.13, CI = 1.04 - 1.23, p = 2.7×10−3).
Fifth, we filtered nominally associated FEV1 and FVC variants from our height IVs using older summary statistics for lung function (URLs) rather than Shrine et al (37), leaving 747 variants. In this case, we do observe an attenuation of association for height on CAD risk, consistent with the previous reports (6) (Figure S1b). Estimates of the effect of height on CAD risk also decreased (IVW estimate 12.0% to 8.9%, p = 1.6×10−20 to p = 1.4×10−3; WM estimate 11.3% to 3.7%, p = 4.2×10−14 to p = 0.22; MR-Egger estimate 9.8% to 7.9%, p = 2.7×10−4 to p = 0.24).
The effect of height on coronary artery disease risk and ischemic stroke risk, but not other cardiovascular disease risks, is attenuated by established risk factors
Finally, we extended our analyses to consider other cardiovascular disease risks. Consistent with our prior results, the effect of height on coronary artery disease risk was attenuated in a MVMR context adjusting for up to 12 established cardiovascular risk factors, negating the significant effects seen in epidemiologic and univariable studies (Fig 4). Although this trend was consistent for the effect of height on ischemic stroke risk, the effects of height on other cardiovascular disease risks (venous thromboembolism, abdominal aortic aneurysm, atrial fibrillation) were not affected by related established causal risk factors, remaining consistent with epidemiologic observations and univariable MR estimations (Fig 4).
Multivariable MR including 12 additional cardiovascular disease risk factors as exposures attenuates the effect of height on coronary artery disease risk and ischemic stroke, but does not impact the effects of height on venous thromboembolism, abdominal aortic aneurysm, or atrial fibrillation.
Discussion
Using well-powered GWAS, we recapitulated previous evidence that lower stature increases susceptibility to CAD, estimating a 12% increased CAD risk per 1 standard deviation unit decrease in height. This effect was substantially attenuated to 3.7% after accounting for a collection of 12 CAD risk factors. In contrast with prior work, we demonstrate – through univariate and multivariable MR experiments directly – that lung function explains little, if any effect, of height on CAD susceptibility. Our findings refute the notion that height, height-related nutritional status, or socioeconomic status meaningfully contributes to CAD risk (4).
Several factors could be driving differences between our work and previously published findings of lung function on CAD risk. Certain phenotype-specific covariates, namely height, were not included in the lung function GWAS results used in Marouli et al. (7). By contrast, the GWAS we used included height as a covariate in the models used for association testing. It is known that including heritable covariate-adjusted data can induce collider biases in MR studies (38). While this might cloud the interpretation of the results using those data for univariate and multivariable experiments about the relationship between FEV1 and/or FVC and CAD risk, our major conclusion contrasts previous reports. In our experiments, lung function did not explain the relationship of height to CAD. In multivariable analyses, a small effect of height persisted (Figure 3d). One reasonable interpretation of these findings is that lung function may be a causal risk factor for CAD as reported previously (39), even when considered jointly with other cardiometabolic risk factors.
We note that the procedure used in Marouli et al., in which variants nominally associated (i.e., p < 0.05) with lung function were removed from height instruments, inherently reduces power. This reduction could have influenced the apparent attenuation of lung function on height’s effect on CAD risk. However, we did not observe a similar attenuation when we utilized their procedure using a larger genetic instrument for height. These findings occurred despite strong similarities between lung function GWAS size (Shrine et al. N=400,102 vs. Neale Lab N=361,194) and high degree of sample overlap, with both GWAS being primarily conducted in the UK Biobank. The collection of these results provide evidence against the hypothesis that lung function substantially attenuates height’s genetically causal effect on CAD risk. Instead, other established CAD risk factors attenuated the relationship between height and CAD.
One exposure we considered in the multivariable analysis but did not include was high-density lipoprotein (HDL) cholesterol levels. While epidemiological evidence has suggested an inverse relationship between HDL and CAD risk (40), randomized controlled trials and MR studies have largely not supported this relationship (41,42), and newer evidence has suggested the role of HDL on CAD risk is nonlinear and/or context specific (43). Moreover, HDL is correlated with triglyceride levels, which does have a growing body of evidence to support causality and which we did include in our multivariable analysis. Thus, we opted to not include HDL in these analyses.
Furthermore, while not significantly different from OR=1, the relationship between alcohol exposure and CAD is complex and potentially non-linear (18), and thus we advise strong caution against strong interpretations of the directionality of alcohol exposure in multivariable analyses.
Our study had some limitations. First, there is some sample overlap between exposure and outcome traits, which may induce bias in effect estimates (44). However, given the samples sizes underlying these data sets, we expect the effect of biases from sample overlap to be minimal. Second, we were limited to the extent that we could address residual population stratification and/or assortative mating not accounted for in the primary GWAS. Future MR studies using siblings could potentially address this limitation (45), with prior work demonstrating stronger effects of height on CAD risk using this design (46). Finally, how well these findings may translate to non-European ancestries remains unclear as our analyses were conducted almost exclusively in individuals of predominantly European descent (47–49). Even within ancestry groups, stratification effects – such as population structure and stratification across Europe (50) – or context-specificity (51), evident in BMI (52,53), may cause genetic effects to vary. Future work to explore differences in risk by ancestry would be an important consideration as such data become more readily available.
In conclusion, we observed a limited effect of height on CAD after accounting for established risk factors. We additionally provide evidence that lung function has a minimal role in attenuating the effect of height on CAD, which contrasts recently published work. These findings help explain the clinical and epidemiological significance of height as a risk factor for CAD and other cardiovascular conditions.
Methods
Variants used as instruments for height
We begin with the set of conditionally independent genome-wide significant (p < 5×10−8) SNPs (54) from a recent height GWAS (11) as the starting point for the construction of our genetic instruments for height. For robustness across exposures without conditionally independent results, only SNPs with genome-wide significance in both the marginal and conditional analysis were retained. To this initial pool of SNPs, we included an additional set of low frequency (1% to 5% MAF) or rare (0.1 - 1% MAF) genome-wide significant associations for height SNPs (13), where the SNP with lowest p-value was retained in case of duplicates. Proxies for palindromic SNPs (i.e., variants with A/T or C/G alleles) with r2 ≥ 0.95 were obtained from the primary height GWAS using 1000 Genomes EUR as LD reference (S Table 3) (55,56) – all variants were lifted over to hg19 (URLs). Finally, SNPs in networks of LD with r2 > 0.05 were pruned using a graph-based approach with a greedy approximation algorithm (URLs) using 1000 Genomes Phase 3 EUR as the LD reference. This filtering resulted in an overall set of instruments which included 2,041 genome-wide significantly associated, non- palindromic, statistically independent (r2 < 0.05) SNPs, including variants down to 0.1% allele frequency, nearly 2.5-fold more than previous efforts (7). Of these 2,041 SNPs, 2,037 were present in the CAD GWAS and used in further analyses. Alleles and effect sizes were oriented to the height-increasing allele. This set of instruments had 80% power to detect an odds ratio of 1.018 at alpha .05 (URLs) (57), assuming a CAD GWAS sample size of 547,261, case ratio of 0.224, and R2 of 0.264 in height using the height instruments.
Established cardiovascular risk factors
We obtained a collection of established risk factors for CAD from the literature to evaluate potentially indirect effects of height on CAD. Summary statistics from GWAS were acquired for body mass index (BMI) (11), plasma lipid levels (low density lipoprotein and triglycerides) (28), systolic and diastolic blood pressure (29), waist-hip ratio (30), smoking initiation (31), alcohol consumption (drinks per week (31)), educational attainment (32), birth weight (33), physical activity (34), and type 2 diabetes (35). Units for systolic blood pressure were converted to 5 mmHg from 1 mmHg (29,58); the remaining GWAS were already expressed in units of standard deviations. For each trait, all alleles were harmonized to correspond to the height-increasing effect allele.
Generation of additional instrument sets for FEV1 and FVC
GWAS summary statistics were obtained for FEV1 and FVC (37). Lists of conditionally independent genome-wide significant lead variants were obtained from their respective GWAS, with palindromic SNPs and linkage disequilibrium addressed in an analogous way to the set of height instruments were filtered, as described above. 171 and 128 SNPs were included in the final set of instruments for FEV1 and FVC, respectively.
Alleles and effect sizes were harmonized to the height-increasing effect allele.
Epidemiologic associations
Observed epidemiologic effects were manually curated from publicly available sources, including Wormser et al. (59) for height on ischemic stroke risk, Carter et al. (60) for height on abdominal aortic aneurysm risk, and Lai et al. (61) for all other outcomes.
Mendelian randomization methods
Univariable two-sample Mendelian randomization analyses were carried out using the inverse-variance weighted, weighted median (62), and MR-Egger (63) methods with the R package MendelianRandomization (64). For completeness, we also present the results from eight additional methods (and note qualitatively similar results as to what we highlighted in the main text). In the univariable analysis, 2,037 of 2,041 SNPs for the height instrument were present in the CAD GWAS results and were used for analysis.
Evidence of horizontal pleiotropy was evaluated using the intercept from MR-Egger. Outcome GWAS summary statistics were obtained from studies listed in S Table 1 (28– 34,65–68).
Multivariable Mendelian randomization analysis was conducted using the R package WSpiller/MVMR (36). Across all exposures, 1,906 SNPs of the 2,037 were present in every GWAS and were used for analysis. Forest plots were constructed using the forestplot R package. Where indicated, odds ratios were converted to true effects by exponentiation (69).
Data Availability
All data produced in the present study are available upon reasonable request to the authors.
Author Contributions
S.M.D., T.L.A., and B.F.V. conceived of the project. D.H., K.L., and B.F.V. designed the experiments. D.H. and E.S. curated the data and performed the analysis. D.H., C.T., and B.F.V. prepared the initial draft of the manuscript. All authors provided critical reviews and edits to the manuscript. B.F.V. supervised and administered all aspects of the work.
Conflicting Interest Statement
D.H., E.S., K.L., T.L.A., C.S.T., and B.F.V. have no conflicting interest to report. S.M.D receives research support from RenalytixAI and personal consulting fees from Calico Labs, outside the scope of the current research.
URLs
UCSC Liftover: https://genome.ucsc.edu/cgi-bin/hgLiftOver
Select analysis code: https://github.com/daniel-hui/Height-CAD_MVMR
Neale Lab UK Biobank GWAS results: http://www.nealelab.is/uk-biobank
MR power calculation: https://shiny.cnsgenomics.com/mRnd
Supplementary Figures
a) Shrine et al. (N SNPs = 1,112) and b) Neale Lab (N SNPs = 747).
Acknowledgements
For data from Klarin et al, approved dbGAP access to phs001672.v6.p1 was provided to B.F.V. (dbGAP project ID: 27398). The authors thank the Million Veteran Program (MVP) staff, researchers, and volunteers, who have contributed to MVP, and especially participants who previously served their country in the military and now generously agreed to enroll in the study (see https://www.research.va.gov/mvp/ for more details). This research is based on data from the Million Veteran Program, Office of Research and Development, Veterans Health Administration, and was supported by the Veterans Administration (VA) Cooperative Studies Program (CSP) award #G002. B.F.V. acknowledges support from the National Institutes of Health (DK126194 and DK101478) and Linda Pechenik Montague Investigator Award. C.S.T acknowledges support from the National Institutes of Health (HL156052). S.M.D. is supported by the U.S. Department of Veterans Affairs award IK2-CX001780. This publication does not represent the views of the Department of Veterans Affairs or the United States Government. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Footnotes
Updated manuscript 1.5 years later after a few rounds of revisions.