Habitual Coffee Consumption Increases Risks for Metabolic Diseases: Genome-wide Association Studies and a Phenotype-wide Two Sample Mendelian Randomization Analysis

Abstract Background and aims: Coffee is one of the most widely consumed beverages in the world and has received considerable concerns regarding its impact on human health. Mendelian randomization (MR) could be valuable to explore the potential health effects of coffee via instrumental variables. In this study, we aim to identify novel genetic loci associated with habitual coffee consumption using genome-wide meta-analysis (GWMA) and to evaluate the broad impact of coffee consumption on human health and disease risk via a large-scale, phenotype-wide, two sample Mendelian randomization (TSMR) analysis. Methods: We conducted a genome-wide association study (GWAS) among 283,926 coffee consumers of European ancestry in the UK Biobank (UKBB) to identify single nucleotide polymorphisms (SNPs) associated with the amount of coffee consumption (cups/day, GWAS 1), caffeine intake (GWAS 2) as well as the intake of non-caffeine substance in coffee (GWAS 3). The GWAS 1 results were further combined with the published results from the Coffee and Caffeine Genetics Consortium (CCGC) for a GWMA. TSMR were performed to evaluate the causal-relationship between coffee/caffeine/non-caffeine substance consumption and 1,101 diseases and health traits. Results: The GWMA identified 50 lead SNPs among 19 genomic regions for habitual coffee consumption. Nine out of the 19 loci were novel, including ADAMTSL4-AS1, CACNA2D2, LINC02123-ADCY2, UBD-SNORD32B, SEMA4D-GADD45G, LOC101929457-LINGO1, RAI1, HCN2, and BRWD1. The GWAS 2 and 3 identified 2 (SORCS2 and SLC39A8) and 5 (LINC02060-LINC00461, AGR3-AHR, PRR4-TAS2R14, CYP1A1-CYP1A2, and FTO) genomic regions, respectively. TSMR analysis indicated that coffee consumption increased the risk of high blood lipids, obesity, and diabetes. Meanwhile, intake of caffeine and non-caffeine coffee components decreased and increased some of the blood lipids levels, respectively. Conclusions: Our study provided evidence that habitual coffee consumption could increase the risk of metabolic perturbations. The bioactive components in coffee, other than caffeine, may be more harmful to human health. Our findings have significant implications for global public health given the increasing burden of metabolic diseases.


Introduction
Coffee is one of the most widely consumed beverages worldwide. The research community has been long debating whether coffee consumption is beneficial or harmful to health. Overall, observational studies favored the beneficial effect of coffee consumption in reducing the risk of metabolic syndrome, obesity, type 2 diabetes (T2D), cardiovascular disease, and several specific cancers 1-4 . However, evidence from randomized controlled trials (RCT) have shed light on the detrimental effects of coffee consumption, such as elevated blood lipids, blood glucose level, and fasting insulin level 5-8 .
The differences in the health outcomes of coffee consumption between the observational studies and RCT highlighted the necessity for further clarification. Mendelian randomization (MR) studies, which were considered to be advantageous to observational studies as they were similar to the RCT design but with comparable length of duration as those of the observational studies. They provided a potentially cost-effective strategy to examine the causal relationship between coffee consumption and health outcomes in human populations. A recent MR study found that coffee consumption increased the risk of osteoarthrosis and obesity 9 , while no significant effect on blood lipids or T2D was identified. Our recent MR analysis did not reveal a significant causal impact of coffee intake on nonalcoholic fatty liver disease (NAFLD) 10 . However, many of these MR studies on coffee consumption were limited in several aspects, with the limited power as a major issue, given the moderate effect of genetic alleles on coffee consumption habit 10 . Therefore, the identification of more genetic susceptibility alleles underlying the coffee drinking habit among larger populations will increase the power for MR analysis. As such, the causal impact of coffee consumption on a broad range of health outcomes can be further elucidated.
In this study, we performed a genome-wide meta-analysis of coffee consumption among 375,388 individuals of European ancestry, which led to the identification of additional novel loci for habitual coffee consumption. Utilizing these expanded list of genetic risk alleles as instrument for coffee consumption, we further conducted two-sample MR (TSMR) analyses between coffee consumption and a large number of health outcomes previously studied among various GWAS. Furthermore, the GWASs

Study design and cohorts
The study was conducted using the UK Biobank data resources under application number 53536. As shown in Error! Reference source not found., we conducted a GWAS of coffee consumption (GWAS 1) in the UK Biobank cohort (UKBB) and a genome-wide meta-analysis by combining the published results from the Coffee and Caffeine Genetics Consortium (CCGC) 11 . The UKBB cohort included over 500,000 adult individuals recruited from the UK population between 2006 and 2010. The extensive phenotypic and genotypic data were collected among all participants 12 . The quality control on the genotype data followed the procedure recommended by the UKBB 13 . Our GWAS analysis was restricted to the coffee drinkers of European-ancestry. The participants who did not drink coffee were excluded. Finally, 283,926 participants were available for association testing. The genotypic data were further imputed based on the Haplotype Reference Consortium and the UK10K + 1000 genomes reference panel 14 . The CCGC study investigated two phenotypes, the quantitative coffee consumption (phenotype 1) and the comparison between high and low coffee consumers (phenotype 2). In the current study, phenotype 1, which included up to 91,462 coffee consumers of European ancestry, was used in the meta-analysis. This combination resulted in a total sample size of 375,388. Functional annotation of GWAS summary was performed using FUMA GWAS 15 . To investigate the causal effect of coffee consumption, we employed a two-sample MR (TSMR) analysis to screen a broad spectrum of phenotypes in MR-Base 16 . After the enrichment analysis, the significant causal effects were further examined using a one-sample MR (OSMR) analysis using the phenotype defined in UKBB. To distinguish the effects of caffeine and other non-caffeine substances consumption on human health, we carried out the caffeine GWAS (GWAS 2) and other substances GWAS (GWAS 3) using the individual-level data of UKBB. The two phenotypes were defined as below. TSMR was also conducted after FUMA annotation.
questionnaire. The amount of coffee consumption was determined by the question 'How many cups of coffee do you drink each day (include decaffeinated coffee)?'. If participants reported that they drink less than one cup of coffee per day, their cups per day were set as 0. Participants with very high coffee consumption (> 8 cups/day) were excluded. Notably, the first phenotype of the CCGC study used in the subsequent meta-analysis study was also the number of cups of predominantly regular-type coffee consumed per day among coffee consumers 11 . Therefore, our new GWAS 1 is consistent with this previously conducted GWAS on coffee consumption. The coffee type was also coded and included as a covariate to be adjusted in the GWAS analysis. The coffee type was surveyed by the question "What type of coffee do you usually drink?" The options included decaffeinated coffee, instant coffee, ground coffee and others which were coded as 1, 2, 3, and 4, respectively.
GWAS 2 aimed to identify genetic variants associated with caffeine consumption, in which 283,926 consumers with intake of any type of coffee were included. The consumers of decaffeinated coffee were then coded as category 0, with the consumers of instant coffee, ground coffee, and other types of coffee coded as category 1. GWAS 3 aimed to identify genetic variants associated with other non-caffeine substances contained in coffee consumption, in which "non-coffee drinkers" were coded as 0, with the drinkers consuming decaffeinated coffee as category 1.

The genome-wide association study and meta-analysis
To identify genetic variants associated with the daily amount of coffee consumption (cups/day), caffeine consumption, and non-caffeine substance consumption, three GWAS were performed using mixed linear model adjusted for age, sex, body mass index, smoking status (never, previous, current), and the first 5 genetic principal components. Coffee type was adjusted for coffee consumption GWAS in addition. The genetic principal components were calculated from the linkage disequilibrium (LD) pruned (r 2 <0.1) array genotype data of the participants of European ancestry. The autosomal SNPs with minor allele frequency (MAF) > 0.01, imputation INFO score > 0.8, missing rate <0.05, and HWE-Pval >1×10 -6 were used in the genome-wide association study and meta-analysis. The meta-analysis was performed by combining the GWAS 1 results with that of the CCGC phenotype 1 (cups/day) GWAS using a fixed-effects inverse-variance weighted model 17 .

Functional annotation of genome-wide association study and meta-analysis
We used the web-based tool FUMA GWAS to define genomic risk loci and obtained functional information of relevant SNPs in these loci 15 . First, lead SNPs were defined using a genome-wide significant P value (5×10 −8 ) and LD r 2 <0.05. All SNPs with significant P value (5×10 −8 ) in LD (r 2 ≥ 0.05) with one of the lead SNPs were candidate SNPs. Further, genomic risk loci were identified by merging LD blocks if they were less than 250kb apart.
Gene-mapping was based on two strategies. Firstly, positional mapping was performed by selecting exonic and splicing-site SNPs with CADD score ≥ 12.37 18 . Secondly, expression quantitative trait locus (eQTL) mapping was used to map SNPs to genes that show a significant eQTL association with these SNPs. The eQTL mapping was conducted using data generated in GTEx v8 19 , and only cis-eQTLs (SNPs within 1Mb of a gene of interest) were included. The Benjamini-Hochberg false discovery rate (FDR) 20 of 0.05 was used to define significant eQTL associations. Gene enrichment and tissue specificity expression analysis were conducted using FUMA 15 and TSEA (http://genetics.wustl.edu/jdlab/tsea/), respectively. We used PhenoScanner to identify the pleiotropic effects of top lead SNPs 21; 22 .

Mendelian randomization study
The TSMR was performed in an inverse variance weighted (IVW) approach using lead SNPs associated with exposure as instrumental variables (IVs). For coffee consumption as the exposure, we used all 50 lead SNPs reaching the genome-wide level significance as IVs. For both caffeine and non-caffeine coffee intake as exposures, given the small number of SNPs reaching the genome-wide significance level, we used SNPs with a p value <10 -5 as IVs to reduce potential pleotropic effects. This results 38 and 83 IV for caffeine and non-caffeine substance intake, respectively. For all candidate IVs, only when the p-value of the IVs in an outcome is at least greater than 0.001, can it be used to infer the causality . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) between the exposure and outcomes. The outcomes were the phenotypes available in MR-Base (N=1,101). MR-Egger regression and Cochran's Q test were used to detect the pleiotropic effect and the heterogeneity of the IVs. The causal effects estimated using the IVs with MR-egger intercept p value ≤ 0.05 or Cochran's Q p value ≤ 0.05 were considered to be biased. The enrichment analysis of the significant causal effects (IVW p ≤ 0.05) in the categories defined by MR-Base was conducted using hypergeometric distribution test (https://systems.crump.ucla.edu/hypergeometric/index.php).
To reduce the false positive rate, the significant causal effects of coffee consumption (cups/day) on health outcomes were further validated using OSMR analysis using individual level UKBB data. In our OSMR study, the causal effect of coffee consumption on an outcome was estimated by the association between the coffee polygenic risk score (coffee-PRS) and the outcome. For each participant in the UKBB, the coffee PRS was calculated by adding together the allele dosages of the instrumental variables, weighted by their association effects with coffee consumption. The association with a dichotomous or continuous outcome was estimated using a logistic regression or linear regression model, respectively, without an adjustment. For random blood glucose, the linear model was adjusted for the self-report fasting time.

Statistical analysis
The linear mixed model was estimated using the Genome-wide Complex Trait Analysis (GCTA) 23 . The genome-wide meta-analysis was performed using METAL 17 . The TSMR analysis was conducted using the TwoSampleMR package of R (version 0.4.25). The polygenic risk score (PRS) and the explanation of the coffee consumption by IVs ( R 2 ) were evaluated by PRSice-2 24 . The F-statistic was calculated by the following formula to estimate the statistical power of lead SNPs: N was the sample size, and k was the number of IVs.
Linear regression and logistic regression were conducted using R software ( version 4.0.2 (https://www.r-project.org/). . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)  Table 2). After the meta-analysis by further combining the data of the CCGC GWAS, we were able to identify an additional significant locus (rs1571536, SEMA4D-GADD45G) ( (LOC101929457-LINGO1), and rs3945 (BRWD1) were nominally significant in the CCGC GWAS study at a significant level of 0.05 with the same direction for the associations (Table 1). For the locus 9

GWAS of coffee consumption (cups/day), caffeine consumption, and non-caffeine coffee
(MLXIPL-VPS37D), rs7800944 was identified as an index SNP in CCGC. In our results, the lead SNP rs799166 is in LD (r 2 =0.36, among European Caucasian population) with rs7800944, and was predicted to be located in a SMAD2 binding site in JASPAR 26 . For the loci 13 and 17, the lead SNPs (rs2667773 and rs56113850) were not present in the CCGC GWAS findings. However, their LD proxy SNPs (rs2667768 and rs1496402) (r 2 >0.6 with the aforementioned lead SNPs among Caucasian population) were observed as the corresponding lead SNPs, respectively. Therefore, of the 50 lead SNPs, 35 were . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) nominally validated in the previous CCGC study (p<0.05), while 15 were newly observed as lead SNPs only in the current meta-analysis (Supplementary Table 4).
The participants and SNPs used in the GWAS 2 were the same as the GWAS 1. At this stage, we found rs112764911 (SORCS2) and rs13107325 (SLC39A8) to be associated with caffeine consumption at the genome-wide level (Table 1, Figure 2B). Regional plots are available in the online resources (Supplementary Figure S3-4).
Regional plots are available in the online resources (Supplementary Figure S5-6).
We also examined whether the identified SNPs are also eQTLs for nearby genes. The results were included in Supplementary Table 6. We found that 1941, 4 and 219 SNPs that are associated with the three phenotypes of GWAS 1, 2 and 3 are also significant eQTLs (FDR<0.05) for 180 genes in at least one tissue, respectively. For example, rs56113850-C associated with coffee consumption, which is also . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) significantly associated with increased expression of both CYP1A6 and CYP1A7 (Supplementary Table   7). Novel risk alleles for coffee consumption, rs6655975-A, rs1571536-C, rs2667773-A, and rs3945-G, are associated with increased expression of ADAMTSL4-AS1, GADD45G, LINGO1, and BRWD1, respectively (Supplementary Table 7). Rs12898397-C, a missense variant with deleterious impact (CADD score=24.2) on ULK3, is associated with increased coffee consumption while a decreased expression of ULK3 in multiple tissues (Supplementary Table 7). Rs11642015-T, associated with non-caffeine substances consumption, is associated with increased expression of FTO.
The gene enrichment analysis showed that these 180 GWAS SNP-associated genes were mainly involved in small molecule metabolic process, xenobiotic metabolic process, oxygenase p450 pathway, and generation of precursor metabolites and energy ( Figure 3). Enrichment for tissue-specific expression of these genes showed that they were significantly overrepresented in heart and liver (FDR <0.1) ( Figure   4). Some of the lead SNPs alleles associated with the amount of coffee consumption were also associated with other traits in previously published GWA studies (Supplementary Tables 8). For instance, the alleles of rs2867110-G and rs476828-C were associated with increased BMI 27 ; rs1260326-C, rs3792253-C and rs799166-G were associated with lowered triglycerides 28 ; rs1467913-C， rs351237-G， rs12902040-T，rs9783698-G，and rs12914012-T were associated with lowered height 29; 30 . In addition, several alleles associated with higher coffee consumption were associated with decreased impedance of body (e.g. rs11078398-G), lowered creatinine in the urine (e.g. rs2472297-T), and increased age at menarche (e.g. rs3945-G) (http://www.nealelab.is/uk-biobank). The lead SNP rs13107325-C associated with increased caffeine consumption was also associated with decreased body mass index 27 . The lead SNPs rs2067919-C and rs11642015-T associated with increased non-caffeine substances consumption were also associated with decreased alcohol intake frequency and increased risk of diabetes, respectively 30; 31 .
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 8, 2021. ;

Causal relationship between coffee consumption and the health consequences: TSMR
We used the 50 lead SNPs provided by our meta-analysis (Supplementary Table 4) as the IVs in our TSMR analysis. TSMR analyses involved 1,101 phenotypes in MR-Base as outcomes. For each outcome, only when the p-value of the IVs in the outcome is at least greater than 0.001, can it be used to infer the causality between coffee consumption and outcomes.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 8, 2021. ; To reveal the diseases categories or traits which are mostly affected by coffee consumption, we conducted an enrichment analysis of the significant causal effects. The results showed that the health outcomes causally driven by the coffee consumption were significantly enriched in several MR-base categories, i.e. blood lipids, fatty acids or amino acids in plasma/serum, and anthropometric measurements ( Figure 5).

Validation of the associations between coffee consumption and health outcomes: OSMR
We used the IVs actually used in each outcome in the TSMR to construct the PRS score separately and inferred the relationship with the corresponding outcomes.

Causal relationship between the consumption of caffeine and non-caffeine components in coffee and metabolic traits
We set out to examine which components in coffee may lead to the potential detrimental effects of coffee consumption on metabolic perturbations. We conducted TSMR using 38 SNPs (Supplementary  Table 11) from GWAS 3 with p<10 -5 as a liberal cut-off for selecting the IVs, to examine the effects of caffeine or other non-caffeine components in coffee on coffee-consumption associated metabolic traits identified in the aforementioned TSMR analysis. We found that caffeine exposure was negatively associated with concentration of chylomicrons . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Discussion
We performed the largest-to-date GWA and meta-analysis on coffee consumption. We further for the first time conducted GWA studies on caffeine intake and decaffeinated coffee intake. Our analyses identified novel loci associated with each of the three phenotypes, which provide new insights into the genetic basis underlying the coffee consumption behavior among human populations. Moreover, by leveraging these genetic findings, we performed large-scale MR analyses to assess the causal relationship between different coffee intake behavior and the health outcomes. Our study indicated that, unlike what have found in many observational studies 1-4; 32 , coffee consumption may causally lead to increased risks for metabolic diseases, and the more coffee consumed, the worse the outcomes. This finding is consistent with several studies based on RCTs 5-9 , having important implications for public health.
In our GWA studies, we confirmed all 10 known 11 and newly identified 9 genetic loci associated with coffee consumption among a large, combined European population. A recent GWAS also using UKBB samples identified fewer loci associated with coffee consumption 25 . The discrepancy between this GWAS 25 and our coffee consumption GWAS 1 may be attributed to the sample selection and inclusion of different covariates. The former study included both coffee and non-coffee drinkers in UKBB European population 25 , while our GWAS 1 was restricted to the coffee drinkers of European-ancestry.
Furthermore, statistical models of t h e former adjusted for age, sex, BMI and top 20 principal components 25 , while our statistical models additionally adjusted for coffee type and smoking. These . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 8, 2021. ; settings allow us to combine the CCGC GWAS study to perform the meta-analysis. In addition, we also identify 2 and 5 novel loci significantly associated with caffeine and non-caffeine coffee components.
The enrichment analysis of the genes whose transcription is associated with these SNPs revealed pathways related to small molecule metabolic process, xenobiotic metabolic process, oxygenase p450 pathway, etc., highlighted that genetic variants altering the metabolism of caffeine and related active xenobiotic compounds in coffee are likely the major determinants for coffee consumption behaviors. This is consistent with the previous identification 11 . Meanwhile, these GWAS SNP-regulated genes are enriched in the liver and heart tissue, with also a significant enrichment related to the generation of precursor metabolites and energy (Figure 4), suggesting a potentially overlap between the coffee consumption and energy metabolism. In addition, our genome-wide meta-analysis of coffee consumption identified candidate SNPs that may have deleterious impacts (CADD score >12.37) on gene functions. In particular, rs12898397-C (CADD score =24.2), a missense variant with deleterious impact on ULK3 (Unc-51 like kinase 3), was associated with increased coffee consumption but decreased expression of ULK3. ULK3 is a serine/threonine protein kinase that acts as a regulator of sonic hedgehog (SHH) signaling and autophagy. ULK3 low expression may induce the dysregulation of autophagy, which participates in controlling the metabolic functions of liver via multiple ways 33 .
We performed GWA studies on caffeine consumption and non-caffeine substances consumption, which have never been investigated at the genome-wide level. We found that rs112764911 (SORCS2) and rs13107325 (SLC39A8) were associated with caffeine consumption, and the latter may be also associated with coffee consumption (p=2.48×10 -4 ). Interestingly both genes were identified among a number of GWA studies to be associated with multiple phenotypes especially neuropsychiatric diseases and traits, e.g. SNPs in SORCS2 are highly expressed in the brain tissue, and previously was associated with attention function in attention deficit hyperactive disorder 34 , general risk tolerance and risk behavior 35 , alcohol withdrawal symptom 36 , depressive and manic episodes in bipolar disorder 37 , etc.; while SNPs in SLC39A8 were also associated with schizophrenia 38 , bipolar disorder 39 , and intelligence 40 , etc., indicating the impact of caffeine on central nerve system and the potential connection between . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 8, 2021. ; caffeine intake and neuropsychiatric reactions. In addition, we identified 5 loci (LINC02060, AHR, PRR4, CYP1A1, and FTO) associated with drinking of decaffeinated coffee. Two of them (AHR and CYP1A1) are also associated with coffee consumption in GWAS. AHR is known to be activated by many xenobiotic compounds, e.g. polycyclic aromatic hydrocarbons (PAHs) in coffee 41 . AHR response elements reside in the bidirectional promoter region located at chromosome 15q24, which associated with transcriptional activation of both CYP1A1 and CYP1A2 42-44 . While CYP1A1 plays an important role in metabolizing and the detoxification of PAHs, CYP1A2 directly metabolizes caffeine 45 , which may explain the overlapping identification between the two GWA studies. In addition, the FTO gene is known to be associated with BMI and involved in energy metabolism, further indicated the close connection between coffee intake and energy intake/homeostasis. Our study warrants continued investigations for the detailed mechanism underlying how these genes determining the caffeine or decaffeinated coffee intake.
In order to investigate the potential impact of regular consumption of coffee, caffeine or non-caffeine coffee constituents on human health, we performed MR analyses and found that coffee consumption may causally lead to altered risks for multiple clinical outcomes which are enriched in metabolic perturbations, especially increased risks for dyslipidemia, obesity, and diabetes. Meanwhile, non-caffeine substances consumption increased the risks for high blood lipids, while caffeine consumption decreased the risks for high blood lipids. The impact of coffee on human health has been long speculated to be attributed to the various bioactive components contained in coffee, such as caffeine, chlorogenic acid, diterpenoids, PAHs, etc. In general, due to the bitter taste of coffee, coffee consumption is more likely to be associated with increased intake of sugary, thereby increase the risk of diabetes and obesity 46; 47 . Meanwhile, after consumers switch from caffeinated coffee to decaffeinated coffee, LDL cholesterol and apolipoprotein B increase, suggesting that other coffee components other than caffeine may be responsible for the high blood lipids 48 . Consistent with our findings, caffeine has been demonstrated to have a beneficial impact on lipid metabolism, which reduces intrahepatic lipid content and stimulates β -oxidation in hepatic cells and liver via regulating the autophagy-lysosomal . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 8, 2021. ; pathway signaling 49 . While, diterpenoids contained in coffee may be an important factor leading to the increase of blood lipid level, and its impact on increasing blood lipid level may be related to its impact on the activity of serum lipid transporters 50 . A study tested the effect of cafestol, a diterpenoid in coffee, by giving to 10 healthy male volunteers for 28 days. Relative to baseline values, cafestol raised the activity of cholesterylester transfer protein by 18 +/-12% and of phospholipid transfer protein by 21 +/-14% (both P < 0.001), which may be associated with elevated serum VLDL and LDL cholesterol 50 .
Furthermore, filtered coffee containing a relative lower amount of diterpenoids does not increase blood lipid levels 51; 52 . In addition, an observational study showed that C-peptide, a marker of insulin secretion, decreased with every additional cup of decaffeinated coffee (0.063 ng/ml; P = 0.0003), which indicated the potential function of non-caffeine substances for diabetic risk 53 . Taken together, our data suggest that the detrimental effects of coffee consumption on health outcomes may be due to the intake of non-caffeine components. While even though caffeine may exert some beneficial impact on lipids homeostasis, this impact may not be sufficient to compensate the deleterious effects of the non-caffeine components in the coffee. The detailed mechanism underlying the association between coffee consumption and metabolic perturbations remains to be further investigated.
It is noteworthy that our MR analyses also suggested a potential causal relationship between coffee consumption and ovarian cancer, with an opposing risk for different subtypes of the disease. While the detailed mechanism underlying this association remains unclear. Also, caffeine has also been demonstrated to decrease estrogen but increase progesterone, which has been weakly associated with ovarian cancer risk 54; 55 . More studies are needed to further clarify this relationship.

Limitations
The current study was conducted in the participants of European ancestry. Whether our findings can be generalized to other ethnic groups remains to be validated in future studies. Moreover, although the study has identified 19 genomic loci (9 of which are novel loci) associated with coffee consumption, they only explain a small proportion of the phenotype variance, indicating that our MR results were . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 8, 2021. ; driven by a limited proportion of genetic susceptibility. Our conclusion may not represent the full image of the health impact of coffee consumption. Meanwhile, the findings of GWAS 2 and 3 studies need further independent validations as well.

Conclusions
This study identified novel genetic loci associated with multiple coffee consumption behaviors and provided evidence that coffee consumption increases the risk for metabolic diseases. This could have significant implications for global public health given the increasing burden of metabolic diseases. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 8, 2021. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 8, 2021. ; . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 8, 2021. ; a  n  d  G  i  o  v  a  n  n  u  c  c  i  ,  E  .  (  2  0  1  9  ) . C  o  f  f  e  e  c  o  n  s  u  m  p  t  i  o  n  a  n  d  a  l  l  -c  a  u  s  e  a  n  d  c  a  u  s  e  -s  p  e  c  i  f  i  c  m  o  r  t  a  l  i  t  y  :  a   m  e  t  a  -a  n  a  l  y  s  i  s  b  y  p  o  t  e  n  t  i  a  l  m  o  d  i  f  i  e  r  s  .  E  u  r  o  p  e  a  n  j  o  u  r  n  a  l  o  f  e  p  i  d  e  m  i  o  l  o  g  y  3  4  ,  7  3  1  -7  5 2 . . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.    . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 8, 2021. ;

Figure 1.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 8, 2021. ; . . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 8, 2021. ; . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 8, 2021. ; . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 8, 2021. ;

Figure 5.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 8, 2021. ;  .

CC-BY 4.0 International license
It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint  Footnotes: The outcomes causally associated with coffee consumption in our phenotype-wide TSMR analysis, which were categorized using the definitions in MR-Base, with a slight modification by merging the insulinemic outcomes into the "diabetes" category. The Outcome is the descriptive phenotype with a unique id assigned by MR-Base; Nsnp: the number of IVs used in TSMR; BETA: the causal effect of coffee consumption on outcome; SE: the standard error; P: the p-value of TSMR analysis; FDR: false discovery rate; Heterogeneity: the p-value of Cochran's Q value; Pleiotropy: the p-value of MR-Egger intercept. .

CC-BY 4.0 International license
It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint