The impact of lung function on cardiovascular diseases and cardiovascular risk factors: a two sample bi-directional Mendelian randomization study

Background: Observational studies suggested lung function is inversely associated with cardiovascular disease (CVD) although these studies could be susceptible to residual confounding. We conducted a 2 sample Mendelian randomization study using summary statistics from genome wide association studies (GWAS) to clarify the role of lung function in CVD and its risk factors, and conversely the role of CVD in lung function. Methods: We obtained genetic instruments for forced expiratory volume in 1 second (FEV1) and forced vital capacity (FVC) from publicly available UK Biobank summary statistics (n=421,986). We applied these genetic instruments for FEV1 (260) and FVC (320) to publicly available GWAS summary statistics for coronary artery disease (CAD) (n=184,305), stroke and its subtypes (n=446,696), atrial fibrillation (n=1,030,836), and heart failure (n=977,320) and cardiovascular risk factors. Inverse variance weighting was used to assess the impact of lung function on these outcomes. Sensitivity analyses included MR-Egger, weighted median, MR-PRESSO, and multivariable Mendelian randomization. We also conducted bi-directional Mendelian randomization to assess whether CVD affects lung function. Results: FEV1 and FVC were inversely associated with CAD (odds ratio (OR) per standard deviation (SD) increase, 0.72 (95% confidence interval (CI) 0.63 to 0.82) and 0.70 (95%CI 0.62 to 0.78)), overall stroke (0.87 (95%CI 0.77 to 0.97), 0.90 (0.82 to 1.00)), ischemic stroke (0.87 (95%CI 0.77 to 0.99), 0.90 (95%CI 0.80 to 1.00)), small vessel stroke (0.78, (95%CI 0.61 to 1.00), 0.74 (95%CI 0.61 to 0.92)), and large artery stroke (0.69 (95%CI 0.54 to 0.89), 0.72 (95%CI 0.57 to 0.91)). FEV1 and FVC were inversely associated with type 2 diabetes (0.75 (95%CI 0.62 to 0.90), 0.67 (95%CI 0.58 to 0.79)) and systolic blood pressure. Sensitivity analyses produced similar direction for most outcomes although the magnitude sometimes differed. Adjusting for height attenuated results for CAD (e.g. OR for 1SD FEV1 0.95 (0.76 to 1.20), but this may reflect weak instrument bias. This adjustment did not attenuate effects for stroke or type 2 diabetes. No strong evidence was observed for CVD affecting lung function. Conclusion: Higher lung function likely protect against CAD and stroke.


Introduction
pressure (UK Biobank associations obtained from IEU GWAS database), 16 17 LDL, HDL cholesterol and triglycerides (GLGC), 18 type 2 diabetes (DIAGRAM), 23 fasting glucose, insulin, and glycated hemoglobin (MAGIC), 21 22 were obtained from publicly available online GWAS summary data repositories, with majority of them retrieved via MR Base. 17 Table 1 summaries the numbers (including the cases and controls where relevant) included in these GWAS, population (including ethnicity) and the sample size in the GWAS. Additional details, including quality control, imputation methods, and any covariates adjusted for in each GWAS and how the outcomes were defined are provided in Supplemental Table 1.

Selecting genetic instruments from the exposure GWAS, identifying these in outcome GWAS and harmonizing instruments across studies
We identified genetic instruments for lung function at genome wide significant p value (<5x10 8 ), and we excluded instruments which were in high linkage disequilibrium (LD) with other instruments (r 2 <0.001). We searched for the genetic instruments in the outcome datasets. For genetic instruments not available for an outcome, a proxy instrument in high LD with the original instrument (r 2 >0.8) was identified via MR-Base based on 1000 Genomes catalog (CEU reference population). 17 No proxy instruments were identified for outcomes not available in MR-Base. We aligned each genetic association for exposure and outcome on the same effect allele. We used effect allele frequency to ensure palindromic genetic instruments were aligned properly where that was possible.

Statistical analysis
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. . https: //doi.org/10.1101//doi.org/10. /2020 We estimated the r 2 of each genetic instrument and summed them up to compute the overall r 2 and F statistics using the sample size which the instruments for lung function was derived from (n=421,986). Higher r 2 and F statistic values suggest lower risk of weak instrument bias. 26 27 For our main Mendelian randomization analyses, we used inverse variance weighting (IVW) with multiplicative random effects to obtain the causal effect of FEV 1 and FVC on CVD outcomes and their risk factors.
IVW assumes there is no unbalanced horizontal pleiotropy. 28 To assess the presence of potentially invalid instruments, we checked for evidence of heterogeneity across instrument-specific Wald ratios (i.e ratio of instrument-outcome association to instrument-exposure association) using Cochrane Q test. Heterogeneity may result from horizontal pleiotropy.
Additional analyses to explore potential violation of Mendelian randomization assumptions MR assumes that genetic instruments do not affect confounders of the exposureoutcome association. We assessed the plausibility of this assumption by exploring whether the FEV 1 and FVC genetic instruments were related to key confounders defined as plausibly affecting lung function and CVDs (body mass index, height, socioeconomic position, alcohol and smoking). 11 This was done using publicly available summary genome wide association study data for: body mass index, 20 education (as a measure of socioeconomic position), 24 number of drinks of alcohol per week and number of cigarette smoked per day, 19 and height. 25 Specifically, we meta analyzed the estimate of each SNP-outcome association (per lung function increasing allele) using standard inverse variance weighting method with additive random effects. For confounders which are associated with the genetic instruments, . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. . https://doi.org/10.1101/2020.06.01.20118919 doi: medRxiv preprint we then assessed whether these associations were a sign of vertical pleiotropy (causal role of lung function on confounders) or horizontal pleiotropy (causal role of confounders on lung function) using bi-directional Mendelian randomization. 11 We used multivariable Mendelian randomization to control for potential horizontal pleiotropy by including estimates from the relevant confounders which were strongly associated with genetic instruments for lung function. 11 29 We also approximated the conditional F statistics to evaluate potential weak instrument bias, 30 presented Cochrane Q test to evaluate heterogeneity, and also provided estimates for multivariable MR-Egger to assess robustness of findings due to pleiotropy. 31 To assess unbalanced horizontal pleiotropy bias we undertook the following sensitivity analyses: MR-Egger, 32 weighted median analyses, 34 that are more robust than IVW to the assumption that there is no effect of the genetic instrument on outcome other than through the exposures of interest (i.e. no horizontal pleiotropy paths). 28 Like IVW, MR-Egger and MR-PRESSO require that the InSIDE ('instrument strength is independent of direct effect') assumption is not violated. Specifically, InSIDE assumes there is no correlation between the strength of the instrument (association of FEV 1 or FVC on their respective genetic instruments) and the strength of any association of the genetic instruments on the outcome not via the exposures of interest (e.g. any horizontal pleiotropy paths). The weighted median analyses assumed at least 50% of the weight of the genetic instruments is via the exposure of interest. Further details of these methods and their additional (and differing) assumptions are available in the supplemental material.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 3, 2020. . https://doi.org/10.1101/2020.06.01.20118919 doi: medRxiv preprint

Bi-directional Mendelian randomization
To assess potential reverse causation, i.e. predisposition to CVDs affecting variation in FEV 1 or FVC , we also conducted a Mendelian randomization using genetic predictors of risk of CAD, overall stroke, heart failure, and atrial fibrillation as instruments. [12][13][14][15] We used IVW to explore evidence of effects of genetic predispositions to these CVDs on FEV 1 or FVC.

Results
We included up to 260 SNPs for FEV 1 , which explained 3.5% of its variance and had an overall F statistic of 58.8. For FVC, we included up to 320 SNPs, which explained 4.8% of its variance and had an overall F statistic of 66.4. Supplemental Tables 2-3 shows the summary statistics of the lung function instruments used in this study. Figure 1 shows the IVW MR causal effect estimates of FEV 1 and FVC on the risk of CVD and type 2 diabetes. Higher FEV 1 was associated with lower risk of CAD, overall stroke and some subtypes (i.e. ischemic stroke, small vessel stroke, large artery stroke), and type 2 diabetes, but increased risk of atrial fibrillation. FVC showed similar relation with these outcomes. There was no strong evidence that either FEV 1 or FVC influenced heart failure (both point estimates were close to the null). Figure 2 shows the IVW MR causal effect estimates of FEV 1 and FVC on cardiovascular risk factors using IVW. Higher FEV 1 was mainly associated with lower systolic blood pressure and triglycerides. FVC showed similar relations with these . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) 1 0 outcomes and FVC was also associated with lower glucose and insulin.
Supplemental Figures 1-4 show that between SNP Wald ratio heterogeneity was high for most of the outcomes.
We found that genetic instruments for lung function were related to height, BMI, and education, with the strongest associations seen for height. There was no strong evidence of association with alcohol or tobacco phenotypes (Supplemental Table 4) .
Subsequent bi-Mendelian randomization analyses suggested there might be bidirectional effect between lung function and BMI, height, and education (Supplemental Table 5). Based on these findings we undertook multivariable MR adjusting for height only. This was done on the basis of the strong association of the lung function genetic instruments with height and knowledge that height is an established cause of lung function (Supplemental Table 4 Similar findings were observed when we used multivariable MR-Egger method. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) 1 1 However, these analyses need to be treated with some caution as the conditional F statistics were small, ranging from 7.4 to 9.3 for FEV 1 and 9.7 to 11.4 for FVC.
Supplemental figures 1-4 show the MR-Egger and weighted median effect estimates of FEV 1 and FVC on CVD and cardiovascular risk factors. For FEV 1 , the direction and magnitudes of the estimates from MR-Egger and weighted median were consistent with those from IVW, with the exception of large artery stroke. The MR-Egger analysis suggested that higher FEV 1 increased the risk of large artery stroke, albeit with wide confidence intervals overlapping null. However, the MR-Egger intercept test did not provide strong evidence for overall horizontal pleiotropy, except for atrial fibrillation. For FVC, the direction and magnitudes of the estimates from MR-Egger and weighted median were consistent with those from IVW. Similar findings to those seen from IVW analyses were observed when we used MR-PRESSO to control for pleiotropy by outlier exclusion (Supplemental Tables 6-7). Figure 3 shows the association of predisposition to cardiovascular disease with FEV 1 and FVC, using genetic instruments for CVD (Supplemental Table 8). We found no strong evidence for an association between CVD predisposition and lung function.

Discussion
In this Mendelian randomization study, we have assessed the impact of lung function on the risk of CVDs and risk factors which may mediate any observed effects on CVDs (blood pressure, lipids, glycemic traits). We also undertook several sensitivity analyses to explore assumption violations and explored possible confounding by height and smoking. We found convincing evidence that lung function is . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. . https://doi.org/10.1101/2020.06.01.20118919 doi: medRxiv preprint cardioprotective against CAD, consistent with our previous Mendelian randomization regarding the effect magnitude (OR FEV1 : 0.78 and OR FVC : 0.82). 7 These findings were consistent across a range of sensitivity analyses used to explore possible bias due to horizontal pleiotropy, however they did attenuate notably with multivariable Mendelian randomization adjustment for height (to an OR of 0.95 for FEV 1 and 0.85 for FVC). We also found novel evidence that better lung function protects against stroke and type 2 diabetes, with these effects being consistent across sensitivity analyses, including adjustment for height. The positive association between lung function and atrial fibrillation in our main Mendelian randomization is in the opposite direction to that seen for other CVD outcomes but attenuated to the null with adjustment for height. The impact of lung function on heart failure is also less clear than that seen for CAD, stroke and type 2 diabetes, as the protective effect is only evident when adjusting for height in the multivariable Mendelian randomization.
Lastly, our study also suggested that genetic susceptibility to CVD is unlikely to affect lung function.
Previous observational studies have suggested poorer FEV 1 and FVC increase CVD. 1-5 35 Our study suggests these observed associations are likely causal for CAD and stroke. Multiple pathways may explain the protective effect, with our study suggesting systolic blood pressure is a possible mechanism, 36 but with weaker evidence of an effect of lung function on diastolic blood pressure. This may imply poorer lung function primarily acts on arterial stiffness. 37 We also found better lung function associated with lower triglycerides, lower insulin and type 2 diabetes. As previous Mendelian randomization studies support a causal effect of hypertriglyceridemia, hyperglycemia, type 2 diabetes and higher insulin on CAD, [38][39][40][41] . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) 1 3 these are possibly also mediators of the effect of lung function on CAD. With the exception of an inverse association with triglycerides, we did not find strong evidence for a causal effect of lung function on lipids. Importantly, we cannot rule out that the effect of lung function on CAD is largely explained by confounding due to height given the attenuation of the effect estimates in multivariable Mendelian randomization adjustment for height. These results have to be treated with some caution as the conditional F-statistics suggest possible weak instrument bias.
In our main analyses we found evidence of a protective effect of lung function on overall, ischemic, small vessel and large artery stroke, but not cardioembolic stroke.
The underlying mechanisms may include reduced blood pressure, 42 reduced risk of type 2 diabetes, 43 whilst the role of lipid is not as clear. 41 The lack of large GWAS with publicly available data on haemorrhagic stroke mean that we were not able to explore this outcome and it is not included in the overall stroke outcome. With the exception of MR Egger results sensitivity analyses were largely supportive of these causal effects. The magnitude of the effect estimates for these stroke outcomes were smaller than those for CAD and confidence intervals were wider, suggesting that lung function may be a weaker risk factor for stroke than CAD. However, it is also possible that the weaker effect of lung function on stroke (compared with that observed for CAD) is an underestimation because of selection bias. For example, if lung function is also related to survival then those who survive to have a stroke (which tend to occur at older ages than CAD) will likely be the ones with better lung function. Stroke shares several other risk factors, such as hypertension with CAD, but commonly occurs after CAD and this competing risk may attenuate or even reverse the estimates. 8 44 Similar paradoxical findings have also been observed in . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. . https://doi.org/10.1101/2020.06.01.20118919 doi: medRxiv preprint observational studies, such as the association of cigarette smoking with dementia where the relation is attenuated, or even reversed with increasing age. 45 Further replication of our findings for stroke in Mendelian randomization studies ideally in cohorts with younger age at recruitment may better help elucidate the role of lung function in stroke. In contrast to CAD and stroke, results for heart failure in the main analysis were very close to the null, which could suggest lung function is not causally related to heart failure. However, it remains possible that the lack of association is due to biases arising from covariable adjustments on survival and/or competing risk of CAD. 8 44 The positive relation of lung function with atrial fibrillation in our main IVW analyses is somewhat unexpected given the majority of the findings from observational studies suggest an inverse relation. 46 However, the positive association disappeared upon adjustment for height using multivariable Mendelian randomization. Whilst the reason underpinning these changes is unclear, it is possible that height may be a stronger confounder for atrial fibrillation than any other forms of CVD, or adjustment for height may have reduced biases arising competing risk of CAD. This could also imply substantial horizontal pleiotropy due to height could have made other pleiotropy robust methods (e.g. MR-Egger, MR-PRESSO) less efficient which gave the same findings with IVW in the atrial fibrillation analyses. Nevertheless, the results from multivariable Mendelian randomization should be interpreted with caution due to wider confidence intervals and the possibility of weak instrument bias.
Although we used Mendelian randomization which is less susceptible to confounding by key traits, such as smoking, height and socio-economic position, 47 there are some . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. . https://doi.org/10.1101/2020.06.01.20118919 doi: medRxiv preprint limitations. First, the validity of our study depends on the 3 main instrumental variable assumptions. 6 The F-statistics and proportion of variation in FEV 1 and FVC make a major impact of weak instrument bias on our main IVW results unlikely. We chose SNPs identified from the UK Biobank summary statistics as instruments for lung function. Whilst this avoids the introduction of invalid SNPs due to bias arising from adjustment for height and smoking (as done in previous GWAS), 48 the lack of replication of the UK Biobank GWAS results may mean that some of the genetic instrument-FEV 1 and FVC associations are exaggerated and hence the Mendelian randomization results presented here attenuated. However, the broad similarity of our findings here for CAD with those from our previous paper (using GWAS summary data from a collaboration of studies not only including UK Biobank) suggest this has not caused major bias (OR FEV1 : 0.78 and OR: 0.82 FVC in our previous study versus OR: 0.72 FEV1 and OR: 0.70 FVC in this study). 7 We also explored whether the lung function genetic instruments were associated with potential confounders of the relation of lung function with CVDs risk (i.e. BMI, smoking, alcohol and education, and height), which if present could bias our results through horizontal pleiotropy. We found limited evidence that the lung function instruments relate strongly to these risk factors, apart from height. Multivariable Mendelian randomization to adjust for height did notably attenuate the effects on CAD, though not those for stroke or type 2 diabetes. However, additional exploratory analyses suggested the magnitude, but not direction, of some multivariable Mendelian randomization estimates were sensitive to the choice of GWAS where height instruments were extracted from, such as directly from GIANT (Supplemental weak instrument bias and it would be valuable to repeat our analyses using data from even larger summary GWAS. Whilst we found evidence for between SNP heterogeneity, which might occur because of horizontal pleiotropy resulting in violation of the third assumption, sensitivity analyses that explore bias due to unbalanced horizontal pleiotropy (i.e. MR-Egger, weighted median and MR-PRESSO analyses) supported our main analyses. In addition to these three key assumptions, two-sample MR assumes that the two samples are from the same underlying population and that the two samples are independent. With the exception of summary data for CAD (23% non-European), all of our samples were restricted to European populations. There was overlap between the lung function and blood pressure (100% -i.e. both were undertaken in UK biobank), atrial fibrillation (38% of this outcome GWAS participants were in UK biobank), and heart failure (40% in UK biobank). Such overlap can mean that any weak instrument bias could be towards confounded estimates, rather than towards the null (as in two sample MR with independent samples). This would particularly influence blood pressure and could mean associations are exaggerated by in the presence of weak instruments. Second, we were unable to explore the possibility of non-linear effects between lung function and CVD, which can only be explored in one sample Mendelian randomization in large biobanks with individual level data. 49 In conclusion, our Mendelian randomization study provides some evidence concerning the protective role of lung function on CAD, stroke, type 2 diabetes and lower systolic blood pressure. Future studies should explore the underlying mechanisms, including the role of height in these relationships, and hence help identify additional targets of intervention for cardiovascular disease prevention.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. . https://doi.org/10.1101/2020.06.01.20118919 doi: medRxiv preprint statistics for type 2 diabetes, which have been downloaded from https://www.diagram-consortium.org. We thank GIANT consortium for providing the summary statistics for body mass index and height, which have been downloaded from . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. Funding . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. The funders had no role in the design, analyses, interpretation of results or writing of the paper. The views expressed in this paper are those of the authors and not necessarily any of the funders. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted June 3, 2020. . https://doi.org/10.1101/2020.06.01.20118919 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 3, 2020. . https://doi.org/10.1101/2020.06.01.20118919 doi: medRxiv preprint Figure 1: The impact of forced expiratory volume in 1 second (FEV 1 , per SD) and forced vital capacity (FVC, per SD) on cardiovascular disease and type 2 diabetes using Mendelian randomization . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 3, 2020. . https://doi.org/10.1101/2020.06.01.20118919 doi: medRxiv preprint Figure 2: The impact of forced expiratory volume in 1 second (FEV 1 , per SD) and forced vital capacity (FVC, per SD) on continuously measured cardiovascular risk factors using Mendelian randomization . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 3, 2020. . https://doi.org/10.1101/2020.06.01.20118919 doi: medRxiv preprint Figure 3: The impact of higher predisposition to cardiovascular outcome on forced expiratory volume in 1 second (FEV 1 ) and forced vital capacity (FVC) using Mendelian randomization . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 3, 2020. . https://doi.org/10.1101/2020.06.01.20118919 doi: medRxiv preprint