Genetic analyses of gynecological disease identify genetic relationships between uterine fibroids and endometrial cancer, and a novel endometrial cancer genetic risk region at the WNT4 1p36.12 locus

Endometriosis, polycystic ovary syndrome (PCOS) and uterine fibroids have been proposed as endometrial cancer risk factors; however, disentangling their relationships with endometrial cancer is complicated due to shared risk factors and comorbidities. Using genome-wide association study (GWAS) data, we explored the relationships between these non-cancerous gynecological diseases and endometrial cancer risk by assessing genetic correlation, causal relationships and shared risk loci. We found significant genetic correlation between endometrial cancer and PCOS, and uterine fibroids. Adjustment for genetically predicted body mass index (a risk factor for PCOS, uterine fibroids and endometrial cancer) substantially attenuated the genetic correlation between endometrial cancer and PCOS but did not affect the correlation with uterine fibroids. Mendelian randomization analyses suggested a causal relationship between only uterine fibroids and endometrial cancer. Gene-based analyses revealed risk regions shared between endometrial cancer and endometriosis, and uterine fibroids. Multi-trait GWAS analysis of endometrial cancer and the genetically correlated gynecological diseases identified a novel genome-wide significant endometrial cancer risk locus at 1p36.12, which replicated in an independent endometrial cancer dataset. Interrogation of functional genomic data at 1p36.12 revealed biologically relevant genes, including WNT4 which is necessary for the development of the female reproductive system. In summary, our study provides genetic evidence for a causal relationship between uterine fibroids and endometrial cancer. It further provides evidence that the comorbidity of endometrial cancer, PCOS and uterine fibroids may partly be due to shared genetic architecture. Notably, this shared architecture has revealed a novel genome-wide risk locus for endometrial cancer.


Introduction
Endometriosis, polycystic ovary syndrome (PCOS) and uterine fibroids are three common non-cancerous gynecological diseases affecting 10-15% (Parasar et al. 2017), 6-9% (Azziz et al. 2011) and 5-69% (Stewart et al. 2017) of women of reproductive age, respectively; however, their prevalence is likely underestimated because of underdiagnosis (Agarwal et al. 2019; De La Cruz and Buchanan 2017). Although these non-cancerous gynecological diseases primarily affect premenopausal women and endometrial cancer is largely a postmenopausal malignancy, many risk factors are shared with endometrial cancer (e.g. chronic estrogen exposure, inflammation, insulin resistance and obesity (Harris and Terry 2016; Li et al. 2019; Wise et al. 2016)), suggesting some shared biological relationship.
Dylan M. Glubb and Tracy A. O'Mara contributed equally to the work.
A number of studies have used observational data to assess associations between the three non-cancerous gynecological diseases and endometrial cancer risk, the findings of which have been heterogeneous (Harris and Terry 2016;Johnatty et al. 2020;Li et al. 2019;Wise et al. 2016). Indeed, the use of observational studies to evaluate these associations could be confounded by: (i) the failure to adequately account for potential confounders that are associated with risk of endometrial cancer and/or gynecological disease e.g. oral contraceptive use; (ii) the reliance of disease status classification on self-reported data which are subject to misclassification bias from asymptomatic undiagnosed cases; (iii) misdiagnosis of early stage endometrial cancer as uterine fibroids due to shared clinical presentation (Wise et al. 2016); (iv) detection bias in cohort studies as a result of an increased surveillance for endometrial cancer among patients with non-cancerous gynecological diseases; and (v) the comorbidity of non-cancerous gynecological diseases (Choi et al. 2017;Johnatty et al. 2020;Matalliotaki et al. 2018;Nagai et al. 2015;Uimari et al. 2011;Wise et al. 2007). Thus, it remains difficult to determine from observational studies the precise nature of the relationships between endometrial cancer and these non-cancerous gynecological diseases.
Genome-wide association study (GWAS) data have demonstrated genetic correlation between endometrial cancer and endometriosis (Masuda et al. 2020;Painter et al. 2018), and uterine fibroids (Masuda et al. 2020), which may partly explain the comorbidities of these diseases; whether these comorbidities are due to causal relationships or shared genetic etiology remains to be explained. In this study, we have used a variety of approaches to analyze GWAS data and elucidate relationships between endometrial cancer and non-cancerous gynecological disease (summarized in Supplementary Fig. 1). First, we have performed genetic correlation analysis, using the largest currently available datasets to elucidate the degree of shared genetic architecture between the non-cancerous gynecological diseases and endometrial cancer. As inherited genetic variants are less influenced by confounding inherent in observational studies, we have performed genetic causal inference analyses using gynecological disease-associated variants to investigate causal relationships. It is possible that these diseases may not be genetically correlated or causally related to endometrial cancer but share overlapping genetic risk regions. To assess this possibility, we have performed gene-based association analyses. Lastly, we have performed multi-trait GWAS, leveraging genetic correlation between endometrial cancer and non-cancerous gynecological diseases to discover novel GWAS risk loci.

GWAS data
GWAS summary data publicly available for PCOS (Day et al. 2018) (https://doi.org/10.17863/CAM.27720) and uterine fibroids (Gallagher et al. 2019) (ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GallagherCS_31649266_GCST009158), and via collaboration for endometriosis (Sapkota et al. 2017), were used for all analyses except Mendelian randomization. For PCOS and uterine fibroids, GWAS summary data from the 23andMe, Inc., cohort had been excluded because of restrictions related to data sharing agreements (Day et al. 2018; Gallagher et al. 2019). For Mendelian randomization analyses, risk estimates and respective standard errors of genome-wide significant variants were accessed from the largest published GWAS for each disease (Day et al. 2018; Gallagher et al. 2019; Rahmioglu et al. 2018). Details of studies and sample sizes used in each analysis are shown in Supplementary Table 1. Detailed descriptions of the quality control procedures and GWAS analysis can be found in the corresponding publications.
GWAS summary data for endometrial cancer were available from O'Mara et al. (2018). As the GWAS for endometrial cancer (O'Mara et al. 2018), endometriosis (Rahmioglu et al. 2018), and uterine fibroids (Gallagher et al. 2019) included participants from the UK Biobank (https://www.ukbiobank.ac.uk/), we re-analyzed the endometrial cancer dataset, excluding these participants to avoid sample overlap bias in the two-sample Mendelian randomization analysis. This also allowed us to use the UK Biobank endometrial cancer dataset as part of the replication set to confirm multi-trait GWAS results. This revised endometrial cancer GWAS meta-analysis consisted of 12,270 cases and 46,126 controls of European descent. Genetic variants with minor allele frequency (MAF) < 1% and imputation information score < 0.4 were excluded, leaving ~9 million genetic variants in the revised endometrial cancer GWAS. The revised endometrial cancer GWAS data were used only in Mendelian randomization and multi-trait GWAS analyses, while the published endometrial cancer GWAS data (O'Mara et al. 2018) were used in all other analyses. Prior to genetic correlation analysis, genetic variants in the extended human major histocompatibility complex region (26-34 Mb on chromosome 6) were removed due to the complex linkage disequilibrium (LD) structure in this region.

Genetic correlation between endometrial cancer and non-cancerous gynecological diseases
We used LD Score regression (Bulik-Sullivan et al. 2015) to estimate the genetic correlation between endometrial cancer and each non-cancerous gynecological disease. Genetic correlation analyses were restricted to common HapMap3 variants (MAF > 0.01). To reduce bias from potential residual confounding in genetic correlation analyses, including bias from unknown sample overlap, we used the estimated genetic covariance intercept, obtained without constraint. Genetic correlation values range from −1 to 1; positive values indicated that shared genetic variants have concordant effects across the genome, whereas negative values indicated divergent effects.
Obesity is a major risk factor for endometrial cancer, and is prevalent amongst women with PCOS and uterine fibroids (Ilaria and Marci 2018;Sam 2007). For genetic correlation analysis between endometrial cancer and PCOS or uterine fibroids, we thus additionally corrected for the effect of obesity, as measured by genetically predicted BMI. PCOS and uterine fibroids GWAS were conditioned using summary data from a large GWAS of BMI (Yengo et al. 2018) in GCTA-mtCOJO analysis (Zhu et al. 2018) before performing LD score regression analysis.

Genetic causal inference tests
We performed two-sample Mendelian randomization analysis to explore potential causal relationships between non-cancerous gynecological diseases and endometrial cancer. Independent (LD r² < 0.01) genetic variants associated with the non-cancerous gynecological diseases at genome-wide significance (P < 5 × 10⁻⁸) were used as genetic instruments. The list of genetic instruments and the respective risk association estimates were extracted from the largest GWAS of endometriosis (Rahmioglu et al. 2018), PCOS (Day et al. 2018) and uterine fibroids (Gallagher et al. 2019). We excluded independent genetic variants with ambiguous alleles and intermediate frequencies (i.e., variants with A/T or C/G alleles and minor allele frequency of more than 0.42), leaving 26 variants as genetic instruments for endometriosis, 14 for PCOS and 23 for uterine fibroids.
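The exclusion rule for strand-ambiguous instruments can be illustrated with a short sketch. It is a minimal example only: the function names, the tuple layout and the placeholder variant identifiers are hypothetical and are not taken from the study's pipeline.

```python
# Hypothetical sketch of the strand-ambiguity filter described above.
# A/T and C/G variants read the same on both strands, so their effect
# allele cannot be aligned reliably when the MAF is close to 0.5.
PALINDROMIC = {frozenset("AT"), frozenset("CG")}


def is_ambiguous(allele1, allele2, maf, maf_cutoff=0.42):
    """Return True for A/T or C/G variants with MAF above the cutoff."""
    pair = frozenset((allele1.upper(), allele2.upper()))
    return pair in PALINDROMIC and maf > maf_cutoff


def keep_instruments(variants):
    """Keep variants that pass the filter.

    variants: iterable of (variant_id, allele1, allele2, maf) tuples.
    """
    return [v for v in variants if not is_ambiguous(v[1], v[2], v[3])]


# Placeholder identifiers, not real variants.
example = [("var_a", "C", "G", 0.48), ("var_b", "A", "C", 0.48)]
print(keep_instruments(example))
```

Only palindromic variants with intermediate frequencies are dropped here; palindromic variants with lower MAF are retained because allele frequency can still be used to infer the strand.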
As the three non-cancerous gynecological diseases mostly affect premenopausal women and endometrial cancer primarily affects postmenopausal women, we performed a unidirectional Mendelian randomization analysis, assessing the effect of genetic predisposition to non-cancerous gynecological disease on endometrial cancer risk. We used inverse variance weighted (IVW) analysis as the primary analysis by regressing the genetic variant-endometrial cancer association on the genetic variant-non-cancerous gynecological disease association, weighted by the inverse of their variance. This method has the most power to detect associations although it has a strong assumption of no heterogeneity (potentially resulting from pleiotropy) amongst genetic variants; thus, this method assumes all genetic variants for the exposure of interest have a proportional effect on outcome risk.
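As an illustration of the IVW estimator, the following sketch computes a fixed-effect IVW estimate as a weighted regression through the origin, with weights equal to the inverse variance of the variant-outcome associations. This is an assumption-laden simplification: the function name is hypothetical, and the standard error handling in the software actually used for the analysis may differ.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-variant summary statistics.

    Regresses variant-outcome effects on variant-exposure effects through
    the origin, weighting each variant by 1 / se_outcome**2.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    beta = numerator / denominator
    se = math.sqrt(1.0 / denominator)
    return beta, se


# Toy numbers for illustration only (not study data).
beta, se = ivw_estimate([0.1, 0.2], [0.05, 0.10], [0.01, 0.01])
print(beta, se)
```

Because the regression has no intercept, every variant is assumed to act on the outcome only in proportion to its effect on the exposure, which is the assumption the sensitivity analyses relax.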
We also performed several sensitivity analyses for Mendelian randomization that are more robust to heterogeneity amongst genetic variants: MR-Egger, weighted median, weighted mode and Generalized Summary-data based Mendelian randomization (GSMR) analysis. MR-Egger analysis regresses the genetic variant-outcome association on the genetic variant-exposure association, without constraining the regression intercept (Bowden et al. 2015). If the MR-Egger regression intercept is non-zero, it provides evidence that directional horizontal pleiotropy amongst genetic variants is driving the causal estimates (i.e. genetic variation influences the outcome through a pathway other than the exposure and indicates that the ratio of genetic variants with positive and negative pleiotropic effects is not balanced). The MR-Egger regression slope represents a valid effect estimate after adjustment for pleiotropic effects, provided the Instrument Strength Independent of Direct Effect (InSIDE) assumption is met (i.e. the association of a genetic variant with the exposure of interest is independent from its direct effect on outcome) (Bowden et al. 2015). We also performed weighted median (Bowden et al. 2016) and weighted mode (Hartwig et al. 2017) analyses, which are more robust to violation of the InSIDE assumption. Weighted median analysis relies on the assumption that more than 50% of the weights come from valid genetic instruments (Bowden et al. 2016), while weighted mode analysis relies on the assumption that most of the weights come from valid genetic instruments (Hartwig et al. 2017). We also performed GSMR analysis which filters out heterogeneous genetic instruments (through the HEIDI outlier test) that may demonstrate pleiotropic effects (Zhu et al. 2018). We used a P value threshold of 0.01 for the HEIDI outlier test (as recommended by the authors) (Zhu et al. 2018).
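The MR-Egger idea of leaving the intercept unconstrained can be sketched as a weighted least-squares fit. The example below is a simplified illustration under stated assumptions: the function name is hypothetical, variants are first oriented so that exposure effects are positive, and standard errors and tests of the intercept are omitted.

```python
def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Return (intercept, slope) from a weighted regression with intercept.

    Weights are 1 / se_outcome**2. Variants are oriented so that the
    variant-exposure effect is positive before fitting.
    """
    bx, by = [], []
    for x, y in zip(beta_exposure, beta_outcome):
        sign = -1.0 if x < 0 else 1.0
        bx.append(sign * x)
        by.append(sign * y)
    w = [1.0 / se ** 2 for se in se_outcome]
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, bx)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, by)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, bx))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, bx, by))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy numbers for illustration only: outcome = 0.01 + 0.5 * exposure.
print(mr_egger([0.1, -0.2, 0.3], [0.06, -0.11, 0.16], [0.02, 0.02, 0.02]))
```

In this toy input a constant offset of 0.01 is added to every oriented outcome effect, so the fit recovers a non-zero intercept (the average pleiotropic effect) alongside the slope.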
Cochran's Q statistic was used to assess the heterogeneity in the effects of variants (a potential indicator of horizontal pleiotropy) (Bowden et al. 2018), and leave-one-out analysis was used to assess whether a single variant drives the causal association.
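Cochran's Q for a set of instruments can be sketched as the weighted sum of squared deviations of the per-variant ratio estimates from the IVW estimate. The function below is a hypothetical illustration using first-order weights; it returns the statistic and its degrees of freedom and leaves the chi-square P value to a statistics library.

```python
def cochran_q(beta_exposure, beta_outcome, se_outcome):
    """Return (Q, degrees_of_freedom) for per-variant ratio estimates.

    Each variant contributes a ratio estimate beta_outcome / beta_exposure
    with first-order weight beta_exposure**2 / se_outcome**2.
    """
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - ivw) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1


# Toy numbers for illustration only (not study data).
print(cochran_q([0.1, 0.1], [0.0, 0.1], [0.1, 0.1]))
```

With n instruments the statistic has n − 1 degrees of freedom under the null hypothesis of homogeneous effects.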
Two-sample Mendelian randomization analysis was performed using the "TwoSampleMR" package in R. Unless stated otherwise, results with a Bonferroni-corrected P value for testing the effects of the three non-cancerous gynecological diseases (P < 0.05/3 = 0.017) on endometrial cancer risk were considered statistically significant.

Gene-based association analysis
To identify genetic risk regions shared between the non-cancerous gynecological diseases and endometrial cancer, we performed gene-based analysis using the fast and flexible set-based association test (fastBAT) (Bakshi et al. 2016). fastBAT was used to perform an enrichment analysis on GWAS risk variants, located within 50 kb of gene regions, for the non-cancerous gynecological diseases and endometrial cancer. A random sample of 10,000 unrelated participants from the UK Biobank was used as the reference panel in these analyses. We applied a false discovery rate (FDR) < 0.05 for the gene-based analysis, and adjacent risk-associated genes were considered a single risk region if within 1 Mb of each other. Colocalization analysis was performed on the genetic risk regions shared between the non-cancerous gynecological diseases and endometrial cancer using COLOC (Giambartolomei et al. 2014). This method estimates posterior probabilities for five different hypotheses, and can disentangle associations with two traits that are due to a shared causal variant (Hypothesis 4; colocalization) from those due to distinct causal variants (Hypothesis 3; pleiotropy) (Giambartolomei et al. 2014). Genetic variants located within 1 Mb of the lead variant for each gene identified from the fastBAT analysis of endometrial cancer were used in each colocalization analysis. A larger posterior probability value for Hypothesis 4 indicates a higher probability of co-occurrence of endometrial cancer GWAS signals and non-cancerous gynecological disease GWAS signals at the region of interest.
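The rule for collapsing adjacent risk-associated genes into a single risk region can be sketched as interval merging. This is an illustrative assumption about how "within 1 Mb" might be operationalized (gap between the end of the current region and the start of the next gene); the function name, gene labels and coordinates are hypothetical.

```python
def merge_into_regions(genes, max_gap=1_000_000):
    """Group genes into risk regions.

    genes: list of (name, chromosome, start, end) tuples.
    Genes on the same chromosome join the current region when they start
    no more than max_gap base pairs after the region's current end.
    Returns a list of regions, each a list of gene names.
    """
    regions = []
    current, current_chrom, current_end = [], None, None
    for name, chrom, start, end in sorted(genes, key=lambda g: (g[1], g[2])):
        if current and chrom == current_chrom and start - current_end <= max_gap:
            current.append(name)
            current_end = max(current_end, end)
        else:
            if current:
                regions.append(current)
            current, current_chrom, current_end = [name], chrom, end
    if current:
        regions.append(current)
    return regions


# Placeholder genes and coordinates for illustration only.
genes = [
    ("G1", "1", 100, 200),
    ("G2", "1", 500_000, 600_000),
    ("G3", "1", 5_000_000, 5_100_000),
    ("G4", "2", 150, 300),
]
print(merge_into_regions(genes))
```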

Multi-trait analysis of GWAS (MTAG)
MTAG (Turley et al. 2018) was used to improve endometrial cancer risk loci discovery through joint analysis of endometrial cancer and non-cancerous gynecological diseases that showed evidence of genetic correlation with endometrial cancer (i.e. PCOS and uterine fibroids). GWAS summary statistics were used as input and bivariate LD score regression was used to account for sample overlap. Using pre-computed LD scores for Europeans, MTAG analysis was performed on common variants (MAF > 0.01). Alleles of genetic variants were aligned across GWAS, and only variants present across included studies were assessed by MTAG. The final number of included variants for MTAG was 4,734,443. Summary statistics were produced for each trait where effect sizes and standard error estimates could be interpreted as the output from a single-trait GWAS.

Replication of novel endometrial cancer GWAS risk loci
MTAG assumes that the variance-covariance matrix across traits is homogeneous across the genome, but it is likely some variants are null for one trait and not null for another trait(s). Violation of this assumption could increase false positive discovery in MTAG (discussed in Turley et al. (2018)). To address this issue, we tested the replication of novel genome-wide significant endometrial cancer risk variants from MTAG in an independent GWAS meta-analysis using data from the Finnish Biobank Study (FinnGen; https://www.finngen.fi/en) and the UK Biobank. Endometrial cancer GWAS summary statistics for 566 cases and 75,822 controls were downloaded directly from FinnGen (data freeze 2; http://r2.finngen.fi/). Quality control procedures for the FinnGen GWAS data are described in https://finngen.gitbook.io/documentation/. For UK Biobank, we performed an endometrial cancer GWAS using genotype and phenotype data obtained under the application number 25331. Endometrial cancer cases were defined based on ICD10 code (C54) in data fields 40006, 41270 and 41202. Controls were selected randomly from unrelated women participants (π̂ < 0.1) with no history of any cancers. GWAS was performed on 1,866 cases and 18,660 controls using REGENIE (Mbatchou et al. 2020) to implement a logistic mixed model, adjusting for genotyping array and the top 10 principal components. A genetic relationship matrix was included in the model as a random effect to account for cryptic relatedness and population stratification. As recommended by REGENIE, we excluded genetic variants with MAF < 0.01, minor allele count below 100, genotype missingness above 10% and variants which deviated from Hardy-Weinberg equilibrium (P value < 1 × 10⁻¹⁵). After quality control exclusion, a total of 9,789,172 SNPs remained in the GWAS analysis.
To create a replication set, the FinnGen and UK Biobank GWAS results were meta-analyzed by a fixed-effect inverse variance weighted model using "meta" software in R. Novel endometrial cancer genome-wide significant variants identified by MTAG were considered to have replicated if they had the same effect direction, and a P value < 0.05 for association in the replication set.
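The fixed-effect inverse variance weighted meta-analysis and the replication criterion can be sketched as follows. This is an illustrative stand-alone calculation rather than a reproduction of the R software used; the function names are hypothetical and the P value comes from a two-sided normal approximation.

```python
import math
from statistics import NormalDist


def fixed_effect_meta(betas, ses):
    """Combine per-study log odds ratios with inverse-variance weights.

    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return beta, se, p


def replicated(discovery_beta, replication_beta, replication_p, alpha=0.05):
    """Same effect direction as discovery and P < alpha in replication."""
    same_direction = discovery_beta * replication_beta > 0
    return same_direction and replication_p < alpha


# Toy numbers for illustration only (not study data).
beta, se, p = fixed_effect_meta([0.2, 0.1], [0.1, 0.1])
print(beta, se, p, replicated(0.1, beta, p))
```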

Identification of candidate target genes at the 1p36.12 endometrial cancer risk locus
We performed fine-mapping of the 1p36.12 region using the summary statistics version of the Sum of Single Effects (SuSiE) v0.11.42 R package to identify credible sets that are likely to contain causal variants (i.e. credible causal risk variants) through an iterative Bayesian stepwise selection procedure, using recommended values for all parameters. Genetic variants 1 Mb flanking the lead variant at 1p36.12 (rs3820282) were used in the analysis, and LD matrices were calculated based on the 1000 Genomes Phase 3 European reference panel using plink v1.90 (Chang et al. 2015). We used previously generated promoter-associated HiChIP chromatin looping data from endometrial (one immortalized and three tumor) cell lines (O'Mara et al. 2019) to explore potential regulatory interactions between credible causal risk variants identified by fine-mapping and gene promoters at the 1p36.12 locus. We also explored the candidate target genes through overlap of credible causal variants with lead cis-eQTLs from GTEx v8 (https://gtexportal.org/home/index.html) and the Blood eQTL Browser data (Munz et al. 2020). Colocalization analysis was performed if there was overlap of credible causal variants and lead cis-eQTLs (Giambartolomei et al. 2014). As colocalization analysis requires full summary statistics for both traits and full summary statistics for the peripheral blood cis-eQTL results were not publicly available, colocalization analysis for CDC42 was only performed using the currently available peripheral blood eQTL results limited to eQTL associations detected at FDR < 0.05 (Munz et al. 2020).

Results
We found endometrial cancer to be significantly genetically correlated with PCOS (rG = 0.36, se = 0.12, P value = 1.6 × 10⁻³) and uterine fibroids (rG = 0.24, se = 0.09, P value = 5.4 × 10⁻³) but not with endometriosis (Table 1). After adjusting for genetically predicted BMI, the genetic correlation between PCOS and endometrial cancer was no longer statistically significant, indicating that the initial genetic correlation was, at least partly, mediated by genetically predicted BMI (Table 1). In contrast, there was no material difference in the genetic correlation between uterine fibroids and endometrial cancer after adjusting for genetically predicted BMI (Table 1), consistent with a previous observation of no significant differences in BMI for endometrial cancer cases with or without uterine fibroids (Johnatty et al. 2020).
IVW Mendelian randomization and the GSMR sensitivity analyses provided evidence that, of the non-cancerous gynecological diseases, only genetic predisposition to uterine fibroids affects endometrial cancer risk (Table 2, Fig. 1).
Although the other sensitivity analyses were not statistically significant, the directionality of the associations between uterine fibroids and endometrial cancer was consistent with the IVW and GSMR results (Table 2, Fig. 1). The MR-Egger intercept did not significantly differ from zero (Table 2), providing no evidence for confounding by directional horizontal pleiotropy amongst genetic instruments. However, Cochran's Q statistic indicated evidence of heterogeneity between causal estimates based on individual variants (Cochran's Q statistic = 42.1, degrees of freedom = 22, P = 6 × 10⁻³), suggesting that some variants may be associated with endometrial cancer risk through pathways other than uterine fibroids. Leave-one-out analysis showed that no single variant was driving the causal association revealed by the IVW analysis (Supplementary Fig. 2).
While genetic correlation analysis assesses the average genetic concordance across the genome for two traits, it does not reveal common genomic regions that harbor trait-associated variation. Further, a lack of evidence for genetic correlation may reflect opposing pleiotropic effects across the genome. Thus, we performed gene-based analyses to identify common risk regions across endometrial cancer and the non-cancerous gynecological diseases. The initial analysis revealed 24 genetic regions associated with endometrial cancer risk, 28 regions with endometriosis risk and 41 regions with uterine fibroids (Supplementary Table 2). No associations with PCOS passed FDR < 0.05, potentially reflecting a lack of power due to the small sample size of this cohort. We found that four genetic risk regions (3q21.3, 9p21.3, 15q15.1 and 17q21.32), containing seven shared candidate susceptibility genes, were shared between endometriosis and endometrial cancer (Table 3). Three of these regions (9p21.3, 15q15.1 and 17q21.32) have independently been associated with the risks of endometrial cancer (O'Mara et al. 2018) and endometriosis through GWAS (Rahmioglu et al. 2018). The LD of lead risk variants at each gene was compared and only one region (17q21.32) demonstrated evidence of a shared genetic risk signal (r² > 0.9; Table 3). Additionally, we found that two genetic risk regions (5p15.33 and 11p13), containing five shared candidate susceptibility genes, were shared between uterine fibroids and endometrial cancer (Table 3). 5p15.33 has been associated with uterine fibroids risk through GWAS (Gallagher et al. 2019) while 11p13 has independently been associated with uterine fibroids and endometrial cancer risk in GWAS (Gallagher et al. 2019; O'Mara et al. 2018). The LD of lead risk variants at each gene was compared but there was no strong correlation at either 5p15.33 or 11p13 (r² ≤ 0.4; Table 3), suggesting that the genetic risk signals may be independent.
Incorporation of the two gynecological diseases genetically correlated with endometrial cancer (uterine fibroids and PCOS) in MTAG revealed 10 genome-wide significant risk loci for endometrial cancer (Table 4, Fig. 2). We observed an inflation of median test statistics in the MTAG result (λ = 1.06), which was likely due to a polygenic signal (LD score regression intercept = 0.98, se = 0.01) rather than population stratification. Two of the risk loci (5p15.33 and 1p36.12) were novel endometrial cancer genome-wide risk loci. We assessed both these risk loci in an independent endometrial cancer dataset and found that only the association at the 1p36.12 locus replicated (Table 4). Fine-mapping of the replicated novel endometrial cancer GWAS risk locus 1p36.12 identified one association signal which contained eight credible causal risk variants (Supplementary Table 3). To identify candidate target genes at 1p36.12, we intersected credible causal risk variants with promoter-associated chromatin loops from four endometrial (immortalized and tumor) cell lines (O'Mara et al. 2019). We identified five candidate target genes through chromatin looping, including WNT4 for which a candidate causal risk variant was revealed as a lead eQTL in lung tissue (Supplementary Tables 4 and 5; Fig. 3). However, there was limited evidence of colocalization between endometrial cancer GWAS risk and WNT4 eQTL associations in lung tissue, with posterior probability for colocalization of 0.14. Additionally, we identified CDC42 as a candidate target gene through a blood eQTL, with posterior probability for colocalization between endometrial cancer GWAS and CDC42 eQTL associations of 0.74 (Supplementary Table 4).
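The inflation statistic λ reported above is conventionally the median observed association test statistic divided by its expected median under the null. The sketch below assumes that conventional definition with 1-degree-of-freedom chi-square statistics derived from two-sided P values; the function name is hypothetical and the clipping of extreme P values is an implementation convenience.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Median chi-square (1 df) statistic over its expected null median."""
    nd = NormalDist()
    chi2 = []
    for p in p_values:
        # Clip so that 1 - p / 2 stays strictly inside (0, 1).
        p = min(max(p, 1e-15), 1.0)
        z = nd.inv_cdf(1.0 - p / 2.0)
        chi2.append(z * z)
    # Null median of a 1 df chi-square equals the squared 75th normal percentile.
    expected_median = nd.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median


# Evenly spaced P values mimic a null distribution, so lambda is close to 1.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
print(genomic_inflation(null_like))
```

A value modestly above 1 can arise from true polygenic signal as well as from confounding, which is why the LD score regression intercept is used to distinguish the two.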

Discussion
Using large-scale genome-wide datasets, we observed evidence of positive genetic correlation between endometrial cancer and PCOS, and uterine fibroids, but not endometriosis. The observed genetic correlation between endometrial cancer and PCOS was at least partly mediated by genetically predicted BMI, consistent with the role of BMI as a risk factor for both PCOS and endometrial cancer. Mendelian randomization analysis suggested a causal relationship only between genetic predisposition to uterine fibroids and endometrial cancer risk. Gene-based analyses revealed several genetic risk regions shared between endometrial cancer and endometriosis, and uterine fibroids. This included one apparent joint genetic risk signal, for endometrial cancer and endometriosis at 17q21.32. Multi-trait GWAS analysis, including endometrial cancer and the genetically correlated gynecological diseases, identified two novel genome-wide significant risk loci for endometrial cancer, one of which (1p36.12) replicated in an independent endometrial cancer dataset. Lastly, functional analyses highlighted CDC42 and WNT4 as candidate target genes at the 1p36.12 endometrial cancer risk locus.
Fig. 1 Association between genetic predisposition to non-cancerous gynecological diseases and endometrial cancer, obtained from two-sample Mendelian randomization analysis. The boxes represent the risk of endometrial cancer (odds ratio) per standard deviation increment in genetic predisposition to non-cancerous gynecological disease. Error bars represent 95% confidence intervals
Two previous studies have reported a positive genetic correlation between endometriosis and endometrial cancer (Masuda et al. 2020; Painter et al. 2018), but we found no evidence for such genetic correlation. This discrepancy may be related to: (i) the smaller sample sets used by the prior studies; (ii) the ethnicity studied (Masuda et al. (2020) analyzed a Japanese population); or (iii) the different genetic correlation analysis approaches used. For example, unlike Painter et al. (2018), we used an unconstrained LD score regression intercept to account for potential residual confounding, resulting in a conservative estimate of genetic correlation. Indeed, we found the estimated genetic covariance intercept to be significantly different from zero, suggesting the presence of bias from population stratification and/or sample overlap. The null results from the genetic causal inference analyses of endometriosis and endometrial cancer are concordant with observational studies that observed no associations after controlling for ascertainment bias by excluding recent endometriosis diagnosis (Melin et al. 2007; Olson et al. 2002; Rowlands et al. 2011).
We found PCOS and endometrial cancer to be genetically correlated but no association was observed in genetic causal inference analyses, concordant with observational studies that account for the effect of obesity (Fearnley et al. 2010; Zucchetto et al. 2009). These findings are consistent with our observation of substantial attenuation in genetic correlation between PCOS and endometrial cancer after adjusting for genetic components of BMI. We detected evidence of positive genetic correlation between uterine fibroids and endometrial cancer risk, consistent with observational studies (Fortuny et al. 2009; Rowlands et al. 2011; Wise et al. 2016). IVW Mendelian randomization analysis and GSMR analysis provided evidence of a causal relationship between genetic predisposition to uterine fibroids and endometrial cancer risk. However, Cochran's Q statistic showed evidence that variants used in the IVW analysis had heterogeneous effects, suggesting that not all variants that increase uterine fibroids risk are expected to increase endometrial cancer risk. Although results from subsequent sensitivity analyses that are robust to the presence of varying levels of pleiotropy were not statistically significant, they showed concordant effect directions with the IVW result. It is important to note that the Mendelian randomization sensitivity analyses have lower power to detect causal relationships compared with IVW analysis. As genetic causal inference tests rely on the statistical power of the GWAS used, future larger GWAS are required to provide more accurate causal estimates and thus greater confidence with regard to the nature of the relationship between uterine fibroids and endometrial cancer.
Gene-based analysis revealed two genetic risk regions (5p15.33 and 11p13) that were shared by endometrial cancer and uterine fibroids. The 11p13 shared risk region has been associated with the risks of uterine fibroids (Gallagher et al. 2019) and endometrial cancer (O'Mara et al. 2018) in GWAS. From the gene-based analysis, we identified WT1 and WT1-AS as candidate susceptibility genes for both uterine fibroids and endometrial cancer at 11p13. Consistent with this finding, we had previously established both genes as candidate targets of endometrial cancer risk GWAS variation through promoter-associated chromatin looping studies (O'Mara et al. 2019) and WT1 had also been identified through chromatin looping as a candidate target of uterine fibroids risk variants (Rafnar et al. 2018). WT1 encodes a transcription factor that is essential for urogenital development (reviewed by Roberts (2005)) and in the GTEx database of tissue gene expression it is most highly expressed in the uterus (https://gtexportal.org/home/). These observations suggest that alteration of uterine WT1 expression by endometrial cancer and uterine fibroids genetic risk variation may affect susceptibility to these diseases.
Fig. 3 The upper panel depicts a regional association plot for the 1p36.12 novel endometrial cancer risk locus. Genetic variants at the locus are plotted by their genomic position (hg19) and MTAG −log10(P) for association with endometrial cancer risk is on the left y-axis. Recombination rate (cM/Mb) is on the right y-axis and plotted as blue lines. The color of the circles indicates the level of linkage disequilibrium between each variant and the lead variant, rs3820282 (purple diamond), from the 1000 Genomes 2014 EUR reference panel (see legend, inset). The lower panel shows promoter-associated chromatin looping at 1p36.12 identified from HiChIP analysis of the ARK-1 endometrial cancer cell line. Promoter-associated loops that intersect with candidate causal variants (shown as red vertical lines) are shown as purple arcs
The 5p15.33 region was found to associate with endometrial cancer risk through both the gene-based analysis and the multi-trait GWAS. However, the multi-trait GWAS association did not replicate in the independent endometrial cancer dataset, with discordant effect directions and non-overlapping confidence intervals. Previously, this region has been associated with uterine fibroids risk in a GWAS (Gallagher et al. 2019), and with endometrial cancer risk in a candidate locus study (Carvajal-Carmona et al. 2015) and in a cross-cancer GWAS meta-analysis of endometrial cancer and ovarian cancer (Glubb et al. 2021). The gene-based analysis at this region revealed three candidate risk genes that were shared between uterine fibroids and endometrial cancer. The most biologically relevant of these genes is TERT, which encodes telomerase reverse transcriptase and maintains chromosomal stability by elongating the telomere (Rubtsova et al. 2012). Relevantly, chromosomes in uterine fibroids (Bonatz et al. 1998; Rogalla et al. 1995) and in endometrial tumors (reviewed by Alnafakh et al. (2019)) have been shown to have shorter telomeres. Indeed, a recent Mendelian randomization study found genetically predicted telomere length to be strongly associated with endometrial cancer risk (Telomeres Mendelian Randomization et al. 2017).
The novel 1p36.12 endometrial cancer risk locus, revealed by the multi-trait GWAS, replicated in the independent endometrial cancer GWAS dataset. Genetic variation at this region has been associated with traits that are genetically correlated with or causally related to endometrial cancer (e.g. heel bone mineral density (Morris et al. 2019), body mass index (Pulit et al. 2019), diabetes (Vujkovic et al. 2020), age at menarche (Kichaev et al. 2019) and ovarian cancer (Kuchenbaecker et al. 2015)). Furthermore, genetic variation at 1p36.12 has been associated with endometriosis, and the lead endometrial cancer risk variant from the multi-trait GWAS also represents a GWAS risk signal for pelvic organ prolapse (Olafsdottir et al. 2020). Promoter-associated chromatin looping data from endometrial cell lines highlighted five candidate target genes, one of which, CDC42, was supported by risk variation that colocalised with a blood CDC42 eQTL. The lead candidate causal risk variant at 1p36.12 (rs3820282) and two candidate causal variants (rs61768001 and rs12037376) have previously been associated with expression of CDC42 in blood and a long non-coding RNA (LINC00339) in blood and the endometrium (Mortlock et al. 2020). LINC00339 has been found to promote oncogenesis in several different cancer types (Gao et al. 2020; Ye et al. 2020; Zhao et al. 2020), although not specifically endometrial cancer.
Semi-quantitative chromatin looping analysis in an endometrial cancer cell line demonstrated evidence of an interaction between a region containing rs3820282 and a ~15 kb region containing the promoter of LINC00339 (Powell et al. 2016). However, the quantitative chromatin looping data from the HiChIP analysis of the normal immortalized and tumoral endometrial cell lines (O'Mara et al. 2019), which also have much greater resolution (Lareau and Aryee 2018), did not provide evidence for a physical interaction between LINC00339 and candidate causal endometrial cancer risk variants.
Of the candidate target genes at the 1p36.12 locus, CDC42 and WNT4 have the biological functions most relevant to endometrial cancer. CDC42 encodes a small GTPase of the Rho-subfamily that regulates cell cycle, cell-cell adhesion, cell migration and cancer progression (Qadir et al. 2015). Notably, CDC42 binds to PAK6 (encoded by an endometrial cancer GWAS risk candidate target gene (O'Mara et al. 2019)) and this complex, which localizes to cell-cell adhesions, is correlated with epithelial colony escape (Morse et al. 2016). WNT4 encodes a protein that activates WNT/β-catenin signaling and appears to be crucial for the development of the female reproductive system, including the uterus (reviewed in Biason-Lauber and Konrad (2008)). Moreover, genes belonging to the WNT/β-catenin pathway are frequently mutated in cancer, including the gene encoding β-catenin, which is mutated in ~25% of endometrial tumors (Cancer Genome Atlas Research et al. 2013). As with CDC42, there are also links between WNT4 and other genes located at endometrial cancer GWAS risk loci, such as WT1 and RSPO1 (O'Mara et al. 2018). For example, in the ovary, there is evidence of WNT4 regulation by proteins encoded by both of these genes (Biason-Lauber 2012; Gao et al. 2014) and RSPO protein activity potentiates WNT signaling (Bugter et al. 2021). There also appears to be some connection between CDC42 and WNT4: both genes have been found to be differentially expressed in the endometrium during the menstrual cycle (Powell et al. 2016).
To reduce the confounding inherent in observational comorbidity studies of endometrial cancer and gynecological disease, prospective studies with long follow-up, large sample sizes and case identification using surgical confirmation would ideally be performed. Nevertheless, our study has demonstrated the utility of genetic causal inference analysis as a cost-effective alternative approach for unraveling relationships while reducing bias from unmeasured confounding. However, a limitation of our study is that the sample size of the PCOS GWAS (the largest publicly available) was relatively small, reducing power to identify shared genetic risk regions or a causal relationship between PCOS and endometrial cancer. Consequently, these analyses should be revisited when more genome-wide significant variants are revealed in future PCOS GWAS.
In conclusion, our study has provided insights into the comorbidity of non-cancerous gynecological diseases and endometrial cancer by revealing shared genetic risk architecture, a potential causal relationship between uterine fibroids and endometrial cancer, and shared candidate risk regions and genes. Furthermore, our study has leveraged this shared genetic architecture to identify a novel risk locus for endometrial cancer, uncovering biologically relevant candidate target genes and furthering our understanding of endometrial cancer etiology.