Rho GTPase transcriptional activity and breast cancer risk: A Mendelian randomization analysis

Background: Rho GTPases are a family of 20 intracellular signalling proteins that influence cytoskeletal dynamics, cell migration and cell cycle progression. Rho GTPases are implicated in breast cancer progression but their role in breast cancer aetiology is unknown. As aberrant Rho GTPase activity could be associated with breast cancer, we aimed to determine the potential for a causal role of Rho GTPase gene expression in breast cancer risk, using two-sample Mendelian randomization (MR).
Methods: MR was undertaken in 122,977 breast cancer cases and 105,974 controls, including 69,501 estrogen receptor positive (ER+) cases and 105,974 controls, and 21,468 ER negative (ER-) cases and 105,974 controls. Single nucleotide polymorphisms (SNPs) underlying expression quantitative trait loci (eQTLs) obtained from normal breast tissue, breast cancer tissue and blood were used as genetic instruments for Rho GTPase expression. Colocalisation was performed as a sensitivity analysis to examine whether findings reflected shared causal variants or genomic confounding.
Results: We identified genetic instruments for 14 of the 20 human Rho GTPases. Using eQTLs obtained from normal breast tissue and normal blood, we identified evidence of a causal role of RHOD in overall and ER+ breast cancers (overall breast cancer: odds ratio (OR) per standard deviation (SD) increase in expression level 1.06 (95% confidence interval (CI): 1.03, 1.09) and OR 1.22 (95% CI: 1.11, 1.35) in normal breast tissue and blood respectively). The direction of association was consistent for ER- breast cancer, although the effect estimate was imprecise. Using eQTLs from breast cancer tissue and normal blood, there was some evidence that CDC42 was inversely associated with overall and ER+ breast cancer risk. The evidence from colocalisation analyses strongly supported the MR results, particularly for RHOD.
Conclusions: Our study suggests a potential causal role of increased RHOD gene expression, and a potential protective role for CDC42 gene expression, in overall and ER+ breast cancers. These findings warrant validation in independent samples and further biological investigation to assess whether RHOD and CDC42 may be suitable drug targets.

medRxiv preprint doi: https://doi.org/10.1101/2020.12.01.20241034; this version posted December 2, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

assumptions are met: i) the genetic instrument is robustly associated with the exposure or metabolic trait of interest; ii) there is no confounding of the instrument-outcome relationship; and iii) there is no alternative pathway through which an instrument influences the outcome except through the exposure (8). The feasibility, statistical power and precision of MR analysis can be increased by employing a "two-sample MR" framework in which summary genetic association data from independent samples representing genetic variant-exposure and genetic variant-outcome associations are combined in order to estimate causal effects (10).

The aim of our study was to assess whether a potential association exists between the expression of genes encoding Rho GTPases and risk of overall, estrogen receptor-positive (ER+) and estrogen receptor-negative (ER-) breast cancer.

Instrument Construction
To generate genetic instruments to proxy for Rho GTPase gene expression, we used a multi-step approach. First, single nucleotide polymorphisms (SNPs) marking expression quantitative trait loci (eQTLs) underlying expression of the genes encoding 20 Rho GTPases were obtained from normal breast tissue, breast cancer tissue and blood from patients without breast cancer. We obtained normal breast tissue specific eQTLs by searching for the expression of each gene in the Genotype-Tissue Expression (GTEx) project (v8) (https://gtexportal.org/home/) (12) and selected the top SNP associated with the gene expression (defined by the smallest P-value). Second, we extracted eQTLs (the top SNP; smallest P-value) from the eQTLGen consortium (https://www.eqtlgen.org/), which has performed cis- and trans-eQTL analysis in blood from 31,684 individuals (13) of largely European ancestry. The consortium defined cis-eQTLs as every SNP-gene combination with a distance <1 Mb from the center of the gene that was tested in at least 2 cohorts. Third, we obtained breast cancer tissue eQTLs from The Cancer Genome Atlas (TCGA) (https://albertlab.shinyapps.io/tcga_eqtl/), which has systematically performed eQTL analyses across 24 human cancer types (14). Finally, we selected cis-SNPs (within 1 Mb of the target gene) that were associated with gene expression level at a P-value threshold of $P < 5 \times 10^{-8}$ based on summary data available on the three platforms. To retain only independent SNPs, we used linkage disequilibrium (LD) clumping with a threshold of $r^2 \leq 0.01$ based on the 1000 Genomes European ancestry reference panel data (15).
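The selection steps described above (cis window, genome-wide significance, LD clumping) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `Eqtl` record, the `select_instruments` function and the `ld_r2` lookup are hypothetical names introduced here, and the greedy loop is a simplified stand-in for clumping against a reference panel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eqtl:
    """Hypothetical record for one SNP-expression association."""
    rsid: str
    position: int    # base-pair position on the gene's chromosome
    p_value: float   # P-value for association with expression


def select_instruments(eqtls, gene_position, ld_r2,
                       p_threshold=5e-8, cis_window=1_000_000,
                       r2_threshold=0.01):
    """Return independent cis-eQTLs, strongest association first.

    ld_r2(rsid_a, rsid_b) is an assumed callable returning pairwise r^2,
    for example looked up from a reference panel.
    """
    # Keep cis-SNPs that pass the significance threshold.
    candidates = [
        e for e in eqtls
        if e.p_value < p_threshold
        and abs(e.position - gene_position) <= cis_window
    ]
    candidates.sort(key=lambda e: e.p_value)

    # Greedy clumping: keep a SNP only if it is independent
    # (r^2 <= threshold) of every SNP already kept.
    kept = []
    for snp in candidates:
        if all(ld_r2(snp.rsid, k.rsid) <= r2_threshold for k in kept):
            kept.append(snp)
    return kept
```

With a single top SNP per gene, as in the primary analyses, the function simply returns the most strongly associated cis-SNP that passes the threshold.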

Two-sample MR analysis
We extracted the following information for each selected eQTL: effect allele, other allele, beta coefficient and standard error. We then calculated the proportion of variance in gene expression explained by the SNP. $R^2$ and F-statistics were calculated to assess the strength of the genetic instruments and to examine for weak instrument bias using previously reported methods (16). Exposure and outcome data were harmonised such that the effect of each SNP on the outcome and exposure was relative to the same allele (17).

For our primary analyses using the top eQTL, we used the Wald ratio method, equivalent to $\beta_{YG}/\beta_{XG}$ (where Y = outcome [overall, ER+ and ER- breast cancer], G = germline genetic variant, and X = Rho GTPase gene expression). In secondary analyses, when the genetic instrument consisted of multiple SNPs ('a multi-allelic instrument'), we used the inverse-variance weighted (IVW) method, which performs an inverse variance weighted meta-analysis.
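As an illustration of the two estimators named above, the following Python sketch computes a Wald ratio for a single SNP and a fixed-effect IVW estimate across several SNPs. It assumes harmonised summary statistics and uses the first-order delta-method standard error for the ratio; the function names are introduced here for illustration and are not taken from the study's analysis code.

```python
import math


def wald_ratio(beta_xg, se_xg, beta_yg, se_yg):
    """Wald ratio beta_YG / beta_XG with a first-order standard error.

    se_xg is accepted for a uniform interface but is ignored by the
    first-order approximation used here.
    """
    estimate = beta_yg / beta_xg
    se = abs(se_yg / beta_xg)
    return estimate, se


def ivw(snps):
    """Fixed-effect inverse-variance weighted mean of per-SNP Wald ratios.

    snps is an iterable of (beta_xg, se_xg, beta_yg, se_yg) tuples.
    """
    ratios = [wald_ratio(*s) for s in snps]
    weights = [1.0 / se ** 2 for _, se in ratios]
    total = sum(weights)
    estimate = sum(w * r for w, (r, _) in zip(weights, ratios)) / total
    return estimate, math.sqrt(1.0 / total)


def odds_ratio_ci(estimate, se, z=1.96):
    """Convert a log-odds estimate to an odds ratio with a 95% CI."""
    return (math.exp(estimate),
            math.exp(estimate - z * se),
            math.exp(estimate + z * se))
```

With one SNP, `ivw` reduces to the Wald ratio, which matches the relationship between the primary and secondary analyses described above.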
Of the 20 genes of the Rho GTPase family, 14 could be analysed using MR because they had at least one SNP that was strongly associated with their expression in at least one of the three databases that were searched.

The direction of association was consistent for ER- breast cancer but the evidence of association was only suggestive (OR per SD increase in expression level 1.06 (95% CI: 1.01, 1.12; P=0.03) and 1.19 (95% CI: 0.99, 1.42; P=0.06) in normal breast tissue and blood respectively). As we obtained only one eQTL SNP for RHOD from each of breast tissue and blood, and none from breast cancer tissue, we were unable to perform multiple SNP sensitivity analyses.

There was some suggestive evidence that increased expression of cell division cycle 42 (CDC42) was inversely associated with overall and ER+ breast cancer risk.

the outcome dataset against overall, ER+ and ER- breast cancer risk, as described previously (16). The power to detect an odds ratio of 1.2 (or equally 0.80) was >99% for RHOD in overall, ER+ and ER- breast cancer using eQTL obtained from breast tissue and ≥30% using eQTL obtained from blood (Table 1). The power was >99% for CDC42 in overall, ER+ and ER- breast cancer using eQTL obtained from blood and ≥27.19% using eQTL obtained from breast cancer tissue. The results for $R^2$, F-statistic and power calculations are provided in Table 1.
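The instrument-strength and power quantities reported above can be sketched in Python as follows. The formulas are commonly used approximations and are assumptions here: the study cites previously described methods (16) without giving the exact expressions, so its calculations may differ. The per-SNP $R^2$ assumes an additive model, and the power formula is a normal approximation for a binary outcome that ignores the opposite tail.

```python
import math
from statistics import NormalDist


def r_squared(beta, maf, exposure_sd=1.0):
    """Proportion of exposure variance explained by one SNP.

    beta is the per-allele effect on the exposure and maf the minor
    allele frequency; exposure_sd is 1 for standardised expression.
    """
    return 2.0 * maf * (1.0 - maf) * beta ** 2 / exposure_sd ** 2


def f_statistic(r2, n, k=1):
    """F-statistic for k instruments measured in n exposure samples."""
    return r2 * (n - k - 1) / ((1.0 - r2) * k)


def mr_power(r2, n_cases, n_controls, odds_ratio, alpha=0.05):
    """Approximate power to detect a causal odds ratio per SD of exposure."""
    n = n_cases + n_controls
    case_fraction = n_cases / n
    # Non-centrality parameter of the test statistic.
    ncp = (r2 * n * case_fraction * (1.0 - case_fraction)
           * math.log(odds_ratio) ** 2)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(math.sqrt(ncp) - z_crit)


# Sample sizes are those stated for overall breast cancer in the study;
# the R^2 value is a hypothetical illustration, not a reported result.
example_power = mr_power(r2=0.1, n_cases=122_977, n_controls=105_974,
                         odds_ratio=1.2)
```

Because the approximation depends on the squared log odds ratio, it gives the same power for an odds ratio and its reciprocal.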

In sensitivity analyses, using multiple SNPs obtained from breast cancer tissue, the direction of association for ER+ and ER- together was consistent with single SNP analyses but the

(Figures 3 and 4). We did not find evidence of heterogeneity and pleiotropy across the individual causal effects.

In colocalisation analyses for RHOD (using summary data from the GTEx platform), the posterior probability of colocalisation (i.e. exposure and outcome are associated and share the same causal variant) was 84% for overall breast cancer risk and 98% for ER+ breast cancer risk, suggesting that breast cancer risk and RHOD expression are associated and share a single causal variant (Table 3). However, the evidence of colocalisation was weak for ER- breast cancer risk and RHOD eQTLs (9%). There was less substantial evidence of colocalisation of the CDC42 expression (using summary data from the eQTLGen platform) and breast cancer risk signals (Table 3).
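To make the posterior probability of colocalisation concrete, the sketch below implements one widely used formulation: Wakefield approximate Bayes factors per SNP, combined under a single-causal-variant assumption into posteriors for five hypotheses (H0 no association, H1 and H2 association with one trait only, H3 two distinct causal variants, H4 one shared causal variant). The text above does not state which software or priors were used, so the method choice, the prior probabilities and the prior standard deviations are all assumptions chosen for illustration.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_diff_exp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference vanishes."""
    if b >= a:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))


def log_abf(beta, se, prior_sd):
    """Wakefield log approximate Bayes factor for one SNP."""
    z = beta / se
    shrink = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - shrink) + shrink * z ** 2)


def coloc_posteriors(trait1, trait2, sd1=0.15, sd2=0.2,
                     p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 for two traits in one region.

    trait1 and trait2 are lists of (beta, se) for the same SNPs in the
    same order. All priors are illustrative assumptions.
    """
    l1 = [log_abf(b, s, sd1) for b, s in trait1]
    l2 = [log_abf(b, s, sd2) for b, s in trait2]
    both = [a + b for a, b in zip(l1, l2)]
    s1, s2, s12 = log_sum_exp(l1), log_sum_exp(l2), log_sum_exp(both)
    log_h = {
        "H0": 0.0,
        "H1": math.log(p1) + s1,
        "H2": math.log(p2) + s2,
        # Pairs of different SNPs: all pairs minus same-SNP pairs.
        "H3": math.log(p1) + math.log(p2) + log_diff_exp(s1 + s2, s12),
        "H4": math.log(p12) + s12,
    }
    total = log_sum_exp(list(log_h.values()))
    return {h: math.exp(v - total) for h, v in log_h.items()}
```

The value reported as the posterior probability of colocalisation corresponds to H4 in this formulation; a high H3 would instead indicate distinct causal variants in LD, the genomic confounding scenario that colocalisation is meant to rule out.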

CDC42 pathway (4). Despite this, the frequency of mutations in CDC42-related genes is low, at between 0.1 and 1.7%, and the elevated CDC42 expression in breast cancer is thought to be due to activation of oncogenes or cell surface receptors such as epidermal growth factor receptor (EGFR) that lead to CDC42 upregulation (30). Differences in tissue-specific CDC42 expression are unlikely to have biased the result, as the effects of SNPs for CDC42 were derived from both breast cancer tissue and blood and were concordant. A more plausible explanation for the apparent protective effect of CDC42 is that CDC42 maintains epithelial polarity (31,32) and hence protects against cancer initiation. At later stages of breast cancer development, increased CDC42 expression could promote cancer progression via its effects on cell cycle progression and invasion (3). The activity of most Rho GTPases, including RHOD and CDC42, is also controlled by over 70 guanine nucleotide exchange factors (GEFs), 60 GTPase-activating proteins (GAPs) and 3 Rho GDI proteins (guanine-nucleotide-dissociation inhibitors) that switch the Rho GTPases between active and inactive forms (26). It will therefore be interesting to investigate whether the expression of any of these regulators has a causal association with breast cancer risk.

We performed sensitivity analyses to disentangle the causal effects of gene expression from associations driven by horizontal pleiotropy, genetic confounding through linkage disequilibrium and reverse causation. We found strong evidence of colocalisation for RHOD, suggesting that our MR findings are unlikely to be driven by genetic confounding through LD between eQTLs and other disease-causal SNPs, which strengthens the evidence of causality. Evidence of colocalisation thus served as a complementary approach to reinforce the MR finding for RHOD.

This study has several strengths: firstly, we tested the effects of the GTPases in human samples rather than cell lines or animal models; secondly, we used MR, which is less susceptible to problems of measurement error, confounding and reverse causation in

In conclusion, we found evidence that RHOD may be causally and positively related to breast cancer risk, and CDC42 may be causally and inversely related to breast cancer risk. Given that the activity of RHOD and CDC42 proteins is regulated by a variety of other proteins, it will be interesting to determine whether any of the genes encoding these regulators is causally associated with breast cancer risk. RHOD warrants further biological investigation to assess its role in breast carcinogenesis.

Availability of data and materials
All data analysed during this study were previously generated. Data availability repository links are given below:

The authors would like to thank the participants of the individual studies contributing to the BCAC for their participation in these studies along with the principal investigators of BCAC


Table 1


Figure 1. Twenty human Rho GTPase family members. A ClustalW alignment using the amino acid sequences of the 20 human Rho GTPases was used to generate the phylogenetic tree. *, 14 genes able to be analysed by MR (see Table 1).
