Abstract
Background Inverse-variance weighted two-sample Mendelian randomization (IVW-MR) is the most widely used approach for inferring the existence and strength of the causal effect of an exposure on an outcome from genome-wide association study (GWAS) summary statistics. Estimates from this approach can be biased by weak instruments and winner’s curse, and the resulting bias changes as a function of the overlap between the exposure and outcome samples.
Methods We developed a method (MRlap) that simultaneously accounts for weak instrument bias, winner’s curse, and potential sample overlap. Assuming a spike-and-slab genomic architecture and leveraging LD-score regression and other techniques, we analytically derived the bias of IVW-MR, which can then be reliably estimated and corrected for using association summary statistics only.
Results We tested our approach on simulated data covering a wide range of realistic settings. In all explored scenarios, our correction reduced the bias, in some situations by as much as 30-fold. When applied to real data on obesity-related exposures, we observed significant differences between IVW-based and corrected effects, both for non-overlapping and fully overlapping samples. Our results are also consistent with the expectation that the biases weaken as the sample size increases. Finally, we showed that the overall bias depends on the genetic architecture of the exposure: traits with low heritability and/or high polygenicity are more strongly affected.
Conclusions Our method not only reduces bias in causal effect estimation but also enables the use of much larger GWAS sample sizes, by allowing for potentially overlapping samples.
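For orientation, the uncorrected IVW estimator that the Background refers to can be sketched as follows. This is a minimal illustration of the standard fixed-effect IVW formula with first-order weights, not the MRlap correction itself. The function name `ivw_estimate` and the input values are hypothetical, and the inputs are assumed to be per-variant effect estimates for independent instruments.

```python
import numpy as np

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from GWAS summary statistics.

    Hypothetical helper for illustration only; it applies no correction
    for weak instruments, winner's curse, or sample overlap.
    """
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    # First-order inverse-variance weights based on the outcome standard errors
    w = 1.0 / np.asarray(se_outcome, dtype=float) ** 2
    denom = np.sum(w * bx ** 2)
    # Weighted regression of outcome effects on exposure effects through the origin
    estimate = np.sum(w * bx * by) / denom
    std_error = np.sqrt(1.0 / denom)
    return estimate, std_error

# Made-up effect sizes in which every outcome effect is half the exposure effect
est, se = ivw_estimate([0.1, 0.2, 0.3], [0.05, 0.10, 0.15], [0.01, 0.01, 0.01])
print(est, se)
```

Because the estimate is a ratio built from noisy exposure effects, instruments with small or upwardly selected effects distort it, which is the bias the abstract describes.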
Competing Interest Statement
The authors have declared no competing interest.