Genetic analyses on the health impacts of testosterone highlight effects on female-specific diseases and sex differences

Testosterone (T) is linked with diverse characteristics of human health, yet, whether these associations reflect correlation or causation remains debated. Here, we provide a broad perspective on the role of T on complex diseases in both sexes leveraging genetic and health registry data from the UK Biobank and FinnGen (total N=625,650). We find genetically predicted T affects sex-biased and sex-specific traits, with a particularly pronounced impact on female reproductive health. We show T levels are intricately involved in metabolism, sharing many associations with sex hormone binding globulin (SHBG), but report lack of direct causality behind most of these associations. Across other disease domains, including behavior, we find little evidence for a significant contribution from normal variation in T levels. Highlighting T's unique biology, we show T displays antagonistic effects on stroke risk and reproduction in males and females. Overall, we underscore the involvement of T in both male and female health, and the complex mechanisms linking T levels to disease risk and sex differences.


Introduction
Testosterone (T) is the male sex hormone responsible for regulating the development of primary and secondary male sexual characteristics. Individual variation in T levels has been suggested to shape human physiology broadly, including effects on disease risk in both males and females (1-3). Epidemiological studies and randomized clinical trials for T replacement therapy have observed associations between serum T levels and various traits ranging from type 2 diabetes (T2D) and cardiovascular disease to body composition and behavior (1-11). Yet, these studies have yielded partly mixed results, and, in many instances, the proposed relationships between T, complex traits and disease remain elusive (2, 7-11).

Besides the disease links, T is a known driver of sex differences. After puberty, males and females differ extensively with respect to their average T levels, with males showing roughly 7-15-fold higher serum total T concentrations (1, 12). This difference largely results from the testicular T production in males that far exceeds the amount of T produced in the ovaries and the adrenal gland in females, and is known to directly contribute to variation in, for instance, body composition between the sexes (1, 12).

In the human body, the majority of T is bound to a carrier molecule, whereas only a small fraction (1-3%) of this total T exists as free T, considered to represent the most potent form of T in terms of biological activity (13-15). Most of the remaining T in circulation is tightly bound by sex hormone-binding globulin (SHBG), and the bulk of the rest remains attached to carrier proteins like serum albumin (13, 14). Non-SHBG bound T is often approximated with free androgen index (FAI) (Figure 1A) (4, 13, 14).

In addition to processes affecting T production, circulating T levels are determined in both sexes by factors

Results
We utilized the rich biochemical and health information available in two population-scale genetic datasets and analysis methods building on GWAS discovery (Figure 1). In brief, we conducted sex-stratified GWAS for T, SHBG, FAI and free T, using data available in the UK Biobank (Figure 1A), from which we built sex-specific PGS for these four traits. The PGSs capture the combined genetic effects on T and SHBG levels, and therefore serve as a proxy for cumulative post-pubertal T exposure. Using an external dataset (Young Finns Study; YFS), we validated the performance of the PGSs. We then investigated the effects of the PGSs on a wide range of diseases across diverse clinical entities using the FinnGen study (Figure 1B). Lastly, we evaluated causal relationships and genetic correlations between the studied T traits and complex traits, leveraging publicly available GWAS summary statistics.

Figure 1. Illustration of the studied traits and study overview. A) Sex-specific distributions for serum total T and SHBG and calculated FAI and free T levels for the UK Biobank participants included in the genome-wide association study (GWAS). The box plots show median (black line), lower and upper quartiles (colored area of the box) and the error bars indicate 5% and 95% quantiles. B) Overview of the study design to assess the contribution of T to health and disease using genetic approaches and biobank data. We conducted the discovery GWAS in the UK Biobank, built sex-specific PGSs for the four T-related traits, validated the PGSs in the Young Finns Study (YFS), and performed complex disease and trait associations in FinnGen (release 5) and using publicly available GWAS data.
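As a small illustration of one of the calculated traits, the sketch below computes FAI from total T and SHBG. It assumes the conventional definition (100 × total T / SHBG, both in nmol/L); the text above does not spell out the formula, and the function name and input values are hypothetical.

```python
def free_androgen_index(total_t_nmol_l: float, shbg_nmol_l: float) -> float:
    """Free androgen index under the conventional definition (an assumption here):
    100 * total testosterone / SHBG, with both concentrations in nmol/L."""
    if shbg_nmol_l <= 0:
        raise ValueError("SHBG concentration must be positive")
    return 100.0 * total_t_nmol_l / shbg_nmol_l


# Hypothetical example values, not taken from the study data.
example_fai = free_androgen_index(total_t_nmol_l=12.0, shbg_nmol_l=40.0)
print(example_fai)  # 30.0
```

Because FAI is a ratio, a higher SHBG level lowers FAI at a fixed total T, which is one reason the two traits may share genetic signal in opposite directions.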

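medRxiv preprint (which was not certified by peer review), this version posted April 26, 2021; doi: https://doi.org/10.1101/2021.04.23.21255981. The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.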

GWAS and polygenic scores for testosterone traits
We identified more than a hundred genome-wide significant (p<5e-08) loci for all testosterone traits (up to 263 loci for SHBG in males) in the UK Biobank GWAS (Methods; Supplementary Tables 1-9). In both sexes, all traits displayed significant contribution from common variants (allele frequency >1%) to the trait variability, with the SNP heritability (h²) estimates ranging from 10% for total T in females to 28% for SHBG in males (Supplementary Table 9). The associated loci were enriched for genes affecting steroid hormone biosynthesis, metabolism and excretion, with preferential expression in the liver for all the studied traits, in line with recent findings (Supplementary Figures 1&2, Supplementary …).

The loci affecting SHBG were largely shared between the sexes (genetic correlation (rg)=0.88, p=9.7e-197), for FAI sharing was intermediate (rg=0.54, p=5.8e-26), and we observed near-zero genetic correlation estimates for both serum T and free T between males and females (rg=0.08 and 0.05, respectively, p>0.05), indicating sex-specific genetic determinants, as previously reported (22, 23). Co-localization analyses between the male and female GWAS further confirmed the widespread sex-specificity of the genetic loci (Methods, Supplementary Figure 3 and Supplementary Tables 11-19) (23).
Reflecting the sex-specific genetic architecture, we observed strong genetic correlation between total T and SHBG only in males (rg=0.78 in males vs. 0.05 in females, Supplementary Figure 3 and Supplementary Table 9). In females, instead, the genetic determinants for FAI and free T were shared with SHBG (e.g. rg=-0.80 with FAI). Further highlighting the close connection between T and SHBG, we detected evidence for SHBG being causal for total T levels in males (genetic causality proportion (GCP)=0.80, p=5.8e-05, Methods), whereas in females SHBG appeared to control especially FAI and free T fractions (GCP=0.83, p=3.8e-07 for free T, Supplementary Table 9).

To study the impacts of T in datasets where T measurements are not directly available, we next constructed sex-specific genetic predictors for T levels, PGS, for each trait applying the LDpred algorithm (28) to the sex-specific GWASs (Methods). We tested the predictive ability of the PGS in the YFS, where the phenotypic variance explained by the PGS (R²) ranged from 2.5% (male free T) to 9.1% (male SHBG), indicating the PGS predict T and SHBG levels in an independent cohort (Supplementary Table 20). Notably, the sex-specific PGSs for T and free T had no predictive value in the opposite sex (Supplementary Table 20 and Supplementary Figure 4).

Studying the links between cumulative T exposure and disease
We continued by using the PGS to study how post-pubertal T exposure associates with disease risk, the associations potentially implying causal relationships (28).
To this end we used the FinnGen data, consisting of 217,464 (94,478 males, 122,986 females) Finnish participants with genotypes linked to up to 46 years of follow-up within nationwide healthcare registries. At present, the study includes roughly 5% of the Finnish adult population (29). We studied 36 diseases with potential links to hormones from the following categories: 1) endocrine and metabolic problems, 2) sex-specific endpoints (many specific to females, e.g. postmenopausal bleeding (PMB)), 3) cardiovascular and circulatory system, 4) nervous system disease, 5) behavioral and neurological diagnoses and 6) other endpoints like injury risk (Supplementary Table 21). The number of cases ranged from 229 individuals diagnosed with hirsutism to 68,774 statin users.

We first tested whether the PGS are associated with disease risk in a sex-specific manner, observing 32 associations (p<0.0014, after Bonferroni correction for 36 independent tests; Supplementary Table 21 and Supplementary Figure 5). The vast majority of these associations involved endocrine, metabolic and sex-specific disorders, highlighting in particular female-specific endpoints (Figure 2A&B, and Supplementary Table 21). Further, multiple PGSs often associated with the same endpoint. In males, both total T and SHBG PGSs often associated with reduced disease risk, whereas we saw few associations to free T. In females, higher bioavailable T (FAI and free T) PGSs were linked with an increased risk for multiple diseases, often showing inverse associations to SHBG. Given these shared associations with SHBG, we additionally included the SHBG PGS as a covariate in the analyses to distinguish true T-driven effects.
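The significance threshold quoted above follows from dividing a nominal level by the number of independent tests. A minimal sketch, assuming a nominal alpha of 0.05 (the alpha is not stated explicitly in the text; the function name is illustrative):

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# 36 disease endpoints treated as independent tests, as in the text;
# alpha = 0.05 is an assumed nominal level.
threshold_36 = bonferroni_threshold(0.05, 36)
print(f"{threshold_36:.4f}")  # 0.0014
```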
Underscoring T's involvement in metabolism, larger PGS values for T and SHBG were consistently associated with reduced T2D risk and statin use in both sexes (for total T in males, HR=0.94, […]; for total T in females, HR=0.94, p=9.4e-12 and HR=0.97, p=5.9e-06 per 1SD increase in PGS, for T2D and statin use, respectively). However, the SHBG-adjusted analyses suggested these associations were not primarily attributable to androgen action, as the effect of T was substantially attenuated in all cases (e.g. HR=1.00, p=0.93 for T and T2D in females, Supplementary Table 21 and Supplementary Figure 6). In females, higher T PGS was additionally associated with lower hypothyroidism risk (HR=0.97, p=3.0e-05), an association that remained after SHBG adjustment (HR=0.97, p=3.0e-04). We also detected suggestive links to bone strength and injury risk in both sexes, but with the exception of SHBG associating with osteoporosis in females (HR=1.08, p=0.00015), none of these findings reached statistical significance after Bonferroni correction.

In the sex-specific category, we replicated the known associations of free T to PCOS and breast cancer risk in females (22) (HR=1.02, p=2.8e-06 and HR=1.04, p=0.0001 for free T PGS) (Figure 2B and Supplementary Table 20). Of the novel endpoints, we robustly linked free T with hirsutism and post-menopausal bleeding (HR=1.45, p=2.7e-08 and HR=1.05, p=0.00032). The association to hirsutism (excessive hair growth in a male-type fashion) appeared particularly pronounced, with the risk of this condition roughly doubling with a 2SD change in free T PGS. With the exception of PCOS, all these associations strengthened upon SHBG adjustment (Supplementary Table 21 and Supplementary Figure 6). Higher FAI and free T PGSs associated also with increased infertility risk in females (HR=1.04, p=0.00076 and HR=1.04, p=0.0050, respectively), but here we instead observed confounding by SHBG.
In contrast to FAI and free T, higher PGS for SHBG associated with positive effects on several of the female reproductive health endpoints (e.g. HR=0.98, p=0.00116 for irregular menstruation). In males, higher PGSs for male free T and FAI were linked with increased prostate cancer risk in a nominally significant fashion (HR=1.03, p=0.0083 for free T, HR=1.03, p=0.0058 for FAI).

We observed no statistically significant associations to other diseases (Figure 2A). For example, we detected no associations to any of the 13 neurological/behavioral endpoints studied, including Alzheimer's disease, alcohol use, and conduct and anxiety disorders (all p>0.0014), in either sex. However, the PGSs did show several nominal associations to cardiovascular disease (HR=1.02, p=0.0077 for FAI and coronary heart disease (CHD) risk in males; HR=0.97, p=0.017 for T and stroke risk in males; HR=1.04, p=0.0080 for T and stroke risk in females; Figure 2, and Supplementary Table 21).
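The hazard ratios in this section are reported per 1SD increase in PGS. To see what a larger shift implies, the sketch below rescales a per-SD hazard ratio, assuming the log-hazard is linear in the PGS (the usual proportional-hazards form; the exact model specification is an assumption here, and the function name is illustrative).

```python
import math


def hazard_ratio_for_sd_change(hr_per_sd: float, n_sd: float) -> float:
    """Hazard ratio implied by a shift of n_sd standard deviations,
    assuming the log-hazard is linear in the score."""
    return math.exp(n_sd * math.log(hr_per_sd))


# Free T PGS and hirsutism in females: HR=1.45 per 1SD (from the text).
hr_hirsutism_2sd = hazard_ratio_for_sd_change(1.45, 2.0)
print(round(hr_hirsutism_2sd, 2))  # 2.1
```

Under this assumption, modest per-SD effects such as HR=1.04 stay modest over 2SD (about 1.08), whereas the hirsutism estimate compounds to roughly a doubling of risk.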

Evaluation of causal relationships between T and disease
Although the PGS associations may imply direct causality of one trait on another, they are also prone to confounding, including by genetic pleiotropy (30). We estimated causal relationships between T and the studied endpoints using two complementary MR methods that correct for potential pleiotropy: LCV (31), representing a genome-wide approach, and MR-Egger (32), which uses significantly associated SNPs. For the sex-specific PGS data, in 23/252 instances one or both methods suggested evidence for a causal relationship (p<0.0014) between a PGS and a disease (… Table 22). Finally, we distinguished between the effects of T and SHBG by including the latter as a covariate in multivariable MR Egger analyses (33).

The causality analyses supported the role of T in the regulation of female reproductive health. Under both MR models, we observed causality between total and free T and postmenopausal bleeding (β=0.61, p=8.6e-05 and β=0.51, p=0.00034). We also observed nominally significant evidence for expected causal relationships between female free T, hirsutism (β=1.68, p=0.0095) and PCOS (GCP=0.54, p=0.0017) (Figure 3 and Supplementary Table 22). Additionally, both approaches indicated causality between T and hormonal cancers (e.g. GCP=-0.55, p=0.041 and β=0.34, p=0.0031 for female free T and breast cancer; β=0.32, p=0.014 for male free T and prostate cancer). Notably, genetically predicted free T was linked with increased cancer risk also in the opposite sex (GCP=-0.37, p=6.6e-19 for male free T and breast cancer, and GCP=-0.74, p=0.0041 for female free T and prostate cancer). Although SHBG appeared causal to irregular menstruation (GCP=0.83, p=1.8e-06, β=-0.11, p=0.022), adjusting for the effects of SHBG in the multivariable MR Egger analyses seemed to rather strengthen most causality estimates for T and free T (Supplementary Table 22).
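To make the MR-Egger idea concrete, the following is a minimal sketch of univariable MR-Egger as an inverse-variance weighted regression of SNP-outcome effects on SNP-exposure effects with a free intercept. It is an illustration under simplifying assumptions, not the pipeline used in this study: standard errors and p-values are omitted, the function name is hypothetical, and the input numbers are synthetic.

```python
def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Return (intercept, slope) of a weighted regression of SNP-outcome
    effects on SNP-exposure effects, with weights 1 / se_outcome**2.
    The slope is the causal effect estimate; a non-zero intercept
    suggests directional pleiotropy."""
    # Orient every variant so that its exposure effect is positive.
    rows = [
        (abs(bx), by if bx >= 0 else -by, 1.0 / se ** 2)
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    ]
    total_w = sum(w for _, _, w in rows)
    mean_x = sum(w * x for x, _, w in rows) / total_w
    mean_y = sum(w * y for _, y, w in rows) / total_w
    sxx = sum(w * (x - mean_x) ** 2 for x, _, w in rows)
    sxy = sum(w * (x - mean_x) * (y - mean_y) for x, y, w in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic, noise-free effects constructed as outcome = 0.02 + 0.5 * exposure
# after orientation; these are not estimates from the study.
bx = [0.10, 0.20, -0.30, 0.40]
by = [0.07, 0.12, -0.17, 0.22]
se = [0.01, 0.02, 0.01, 0.02]
egger_intercept, egger_slope = mr_egger(bx, by, se)
```

The multivariable variant mentioned in the text would add the SNP effects on SHBG as a second regressor, so that the slope for T is estimated conditional on SHBG.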
Despite the many associations to metabolism-related endpoints (Figure 2), here the evidence for causality was limited to nominally significant links between female FAI and obesity (GCP=-0.63, p=0.014) and male T and SHBG with statin use (β=-0.22, p=0.018 and β=-0.14, p=0.016). The support for T levels being causal for metabolic traits was thus sparse, and for example the link between total T and statins appeared confounded by SHBG (the effect being close to zero in the adjusted model, Supplementary Table 22). Notably, we also consistently saw no causality between T and T2D under LCV and both univariate and multivariate MR-Egger models (p>0.05 for all), a result that remained with both sex-combined and sex-specific T2D GWAS from …

Finally, evidence for a causal relationship between T levels and disease risk was detected in some instances where the PGS were not associated with a given endpoint. Both T and SHBG were for example linked with injury risk in both sexes (e.g. GCP=0.53, p=0.0018 and β=0.17, p=1.3e-05 for SHBG increasing forearm/elbow injuries in males), potentially aligned with known hormonal contribution to bone strength (34). Notably, nominal causality between male total T and osteoporosis risk vanished when adjusting for SHBG (β=0.28, p=0.019 to β=0.034, p=0.85; Figure 3 and Supplementary Table 22). The multivariable MR Egger analyses also highlighted the potential protective effect of T on seropositive rheumatoid arthritis (RA) in females (β=-0.43, p=0.0079 for 1SD increase in total T, β=-0.61, p=4.7e-05 for FAI and β=-0.48, p=0.0025 for free T). Also, while the PGSs did not associate with any of the neurological endpoints, LCV suggested a causal relationship between higher SHBG and ADHD in both sexes, and higher free T and increased risk for conduct disorder but decreased risk for emotional instability in males (p<0.0014, Supplementary Table 22), pointing to potential hormonal involvement in the regulation of neuronal processes.

Results from the cross-sex PGS associations in FinnGen
Owing to the unique genetic architecture, the sex-specific PGS for T and free T do not predict the corresponding hormone levels in the opposite sex (Supplementary Table 20). Given this, we reasoned that cross-sex analyses, i.e., analysis of the effect of a sex-specific PGS in the opposite sex, would provide us with an additional means to assess if the original associations stem from T action, and to detect potential antagonistic effects for the PGSs between the sexes.

We first concentrated on replicating the associations of total and free T PGSs on endpoints common to both sexes. Here, aligning with the results from the MR analyses, 15/20 of the nominally significant (p<0.05) associations remained similar (Z-test p>0.05 for difference in PGS effects) also in the cross-sex analyses, pointing to shared genetic etiology rather than T action as the likely cause for these associations. For example, both female and male total T PGSs, with and without SHBG adjustment, associated with hypothyroidism risk with a similar effect size in the opposite sex, and all PGS associations to T2D were replicated in the other sex (Figure 4A, and Supplementary Tables 23&24).

The sex-specific associations were not replicated (Z-test p<0.05) in the opposite sex for total T PGS and stroke in both sexes, and the male free T PGS, head injury risk and osteoporosis (Figure 4A, and Supplementary Tables 23&24). In addition, the effect of male total T PGS on statin use was attenuated in females. Intriguingly, while both male and female total T PGSs associated with increased risk for stroke in females (HR=1.03, p=0.017 and HR=1.04, p=0.0079, respectively), the same PGSs rather associated with a reduced stroke risk in males (HR=0.97, p=0.017 and HR=0.98, p=0.073). This was the only endpoint for which we detected statistically significant evidence for a PGS having antagonistic effects depending on sex. The antagonistic effect of male total T PGS on stroke was enhanced with SHBG adjustment (HR=0.96, p=0.011 in males and HR=1.07, p=0.0010 in females). In contrast, the female total T PGS showed no association with stroke in males after the SHBG adjustment (HR=0.99, p=0.65), suggesting that not all the PGS associations to stroke are fully androgen-driven. The results nevertheless indicate the genetic effects on stroke risk may be partly sex-specific (35), with potential interplay from sex hormones.

Finally, the cross-sex analyses further implied increased androgen load as a direct contributor to poorer reproductive health in females, agreeing with the MR-based causality assessments. For the reproductive endpoints that were associated with female free T PGS, i.e., infertility, PMB, PCOS and hirsutism, the male free T PGS had no predictive power (all p<0.05, Figure 4B, Supplementary Tables 23&24). Yet, this approach did not fully support exclusive sex-specific causality of free T for breast cancer in females and prostate cancer in males. Here, we observed that the effect sizes of the PGS associations were attenuated, but not in a statistically significant manner (p>0.05, Figure 4B). Echoing the findings from the causality analyses, these results thus suggest some degree of shared genetic risk, irrespective of T levels, between male and female hormonal cancers.
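The cross-sex comparisons above rest on a Z-test for the difference between two PGS effects. The sketch below shows one way such a test can be computed from reported hazard ratios and p-values. It assumes the two estimates come from independent samples and that each reported p-value is a two-sided Wald p-value, so that the standard error can be back-calculated; the helper names are hypothetical, and the study's own computation may differ.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def se_from_hr_and_p(hr: float, p_two_sided: float) -> float:
    """Back-calculate the standard error of log(HR) from a two-sided Wald p-value."""
    z = _STD_NORMAL.inv_cdf(1.0 - p_two_sided / 2.0)
    return abs(math.log(hr)) / z


def z_test_difference(hr_a: float, p_a: float, hr_b: float, p_b: float):
    """Z-test for a difference between two independent log hazard ratios.
    Returns (z statistic, two-sided p-value)."""
    beta_a, beta_b = math.log(hr_a), math.log(hr_b)
    se_a = se_from_hr_and_p(hr_a, p_a)
    se_b = se_from_hr_and_p(hr_b, p_b)
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))
    return z, p


# Male total T PGS and stroke: HR=1.03 (p=0.017) in females versus
# HR=0.97 (p=0.017) in males, as reported in the text.
z_stroke, p_stroke = z_test_difference(1.03, 0.017, 0.97, 0.017)
```

With these inputs the difference is significant at the 0.05 level, in line with the antagonistic stroke effect described in the text.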


Extending the FinnGen discoveries
We next sought to validate and refine the FinnGen discoveries in additional datasets, extending our analysis to include quantitative traits not available in large numbers in FinnGen. To this end, we used genetic correlation analysis, allowing for estimation of the extent to which two traits are affected by the same genetic factors (36, 37), followed by causality estimations.

We selected 44 traits with publicly available GWAS summary statistics, identical to (e.g., T2D, breast and prostate cancers) or closely reflecting the studied disease phenotypes (heel bone mineral density (HBMD), mood swings) from FinnGen, adding anthropometric traits to the analyses (Supplementary … the FinnGen PGS associations. We observed significant genetic correlations to traits related to metabolism, including many biomarkers and anthropometrics, but detected only few correlations to behavioral traits, and no significant correlations to neurological or temperamental traits (Figure 5, Supplementary Table 25). Notably, in all cases where we observed genetic correlation to behavioral traits, these had clear links to metabolism (smoking, sleep duration and exercise).

The genetic factors increasing serum total T and SHBG appeared to promote a favorable metabolic profile in males, supporting the PGS findings. Despite correlating with increased BMI, total T and SHBG were positively correlated to adiponectin, high-density lipoprotein (HDL) and lower waist-to-hip ratio (WHR) (rg>0.20, p<0.0011) whilst lowering triglycerides and T2D incidence in males (rg<-0.25, p<0.0011). In contrast, higher FAI and free T fractions in females associated with negative metabolic effects, including higher WHR (rg=0.25, p=1.6e-22) and lower HDL cholesterol (rg=-0.18, p=4.2e-05), whilst free T showed no significant correlations to these metabolism-related traits in males. Strong correlations were observed also between SHBG and metabolic traits in females, including negative associations to markers of liver damage (alanine transaminase (ALAT), rg=-0.21, p=1.7e-08, and gamma-glutamyl transferase (GGT), rg=-0.16, p=0.0001).

We observed significant genetic correlations to hormonal cancers in both sexes, in line with recent findings (22, 25). In males, the genetic factors increasing FAI and free T promoted prostate cancer (rg=0.12, p=0.0004), whereas in females these increased especially the risk of estrogen receptor (ER) positive breast cancer (rg=0.15, p=9.1e-09). Finally, consistent with existing genetic, epidemiological and experimental data (1, 25), the genetic correlation analyses pointed towards shared genetic background for T, hemoglobin levels and body fat in both sexes (e.g. rg=0.15, p=1.0e-07 and rg=-0.14, p=0.0002 for hemoglobin and body fat, respectively, with male free T).

In the LCV and MR-Egger analyses we found statistically significant evidence (p<0.0011) of a causal relationship in 7% (26/354) of instances across the 44 traits (Figure 5 and Supplementary …

As examples of expected causal relationships (1, 25, 38), our analyses supported the contribution of free T levels to male-pattern baldness (MPB, GCP=0.44, …) and hemoglobin levels (GCP=0.64, p=0.00085) (Figure 5 and Supplementary Table 26). In addition, despite the lack of significant genetic correlations between these traits, higher T in males was linked to increased number of children fathered (NCF, GCP=0.62, p=1.2e-15 for total T, GCP=0.42, p=0.011 for free T), raising speculation about the potential evolutionary benefits of males maintaining adequate T levels.

The MR Egger analyses supported the causality of female total T and free T to ER+ breast cancer (β=0.282, p=0.00014 and β=0.250, p=0.002, respectively).
Instead of T, in these analyses prostate cancer risk was linked to SHBG levels in males (GCP=-0.64, p=0.00019) (39). Additionally, SHBG increased lymphocyte count in both sexes (GCP=0.39, p=1.6e-07 in males, GCP=0.42, p=0.00014 in females) (40), and in males, we linked SHBG also with reduced risk of erectile dysfunction (GCP=0.42, p=0.00032). In females, we observed causality between SHBG and age at menopause (β=-0.756, p=0.0010), suggesting that inherited differences in SHBG levels likely modify reproductive phenotypes across female lifespan.

In most instances, we however found no evidence of a significant causal relationship between T levels and the studied traits (Supplementary Table 26). Yet, emphasizing the intricate relationship between hormone levels and metabolism (41), some connections to metabolism-related biomarkers emerged: for example, it appeared that triglycerides may causally influence T (GCP=-0.48, p=1.3e-30) and free T (GCP=-0.80, p=1.4e-06) levels in females. In combination with the PGS analyses, these results thus further suggest that whilst some sex-biased phenotypes may be directly related to T levels, in most instances T's relationship to complex traits and diseases is not straightforward.

Discussion
Since its discovery in the early 20th century, testosterone (T) has been proposed to modify phenotypes and diseases that differ between the sexes, due to the extensive male-female differences in circulating T levels. To provide a broad and systematic perspective into the involvement of T as a regulator of health and disease in males and females, we have used genetic data as an anchor allowing us to address the potential causal role of post-pubertal T exposure for a wide range of diseases and traits thought to be under hormonal control.

We leveraged the UK Biobank resource to construct PGSs, predictors for genetically determined T levels, for both males and females, which we then associated with a uniquely rich collection of disease endpoints from 217,464 participants in FinnGen. This combination allowed us to extend and refine recent efforts that have utilized genetic data to understand the disease impacts of T but concentrated on a limited set of phenotypes or only on males (22, 23, 25-27). For instance, we included many novel endpoints from FinnGen, including registry-based phenotypes like statin use, female-specific endpoints such as hirsutism and post-menopausal bleeding, and neurological diagnoses like conduct disorder, where hormonal contribution has not been previously studied using genetic approaches. In addition, by taking advantage of the sex-specific genetic determinants of T in cross-sex analyses, and through careful Mendelian Randomization strategies, we could pinpoint T's causal effects on adult health.

Post-pubertal T exposure and disease risk
Based on our analyses, three major themes emerged regarding T's contribution to disease. First, contrary to T's established role as a male androgen, we report that the studied PGSs associated with disease risk especially in females. Secondly, we highlight distinct association profiles for total T and free T in both sexes, consistent with proposed divergent biological effects for the bound and unbound T fractions (13-15). Underscoring the potential role of SHBG as a confounder when studying the action of T, the former closely correlates with SHBG levels in males, and the latter in females (13-15). Thirdly, in many, but not all instances, the associations with the T and free T PGSs truly seemed to reflect androgen action. We observed causal relationships between a genetic predisposition to higher free T levels and several sex-specific and sex-biased phenotypes with clear biological links to T, but less contribution to most other phenotypes, echoing experimental data and findings from recent MR studies (22, 25).

Overall, our study highlights T's intricate connections to reproduction and metabolism. Besides the causal links to several female-specific reproductive endpoints, many of the significant associations involved diseases and traits from the metabolic and endocrine categories. We stress the complexity of these relationships, as for many metabolic traits SHBG, either directly or potentially through its action on other hormones, appeared to confound T's associations and causality estimates. Beyond this T/SHBG relationship, we speculate that some of the observed associations to metabolic health, including T2D, may reflect even more widespread genetic pleiotropy and thus overall complex shared genetic etiology rather than T action.

Given the lack of significant associations and causality for many other traits, we nevertheless speculate that normal variation in T levels, contrary to popular belief, has only modest effects on most phenotypes. In particular, the grounds for explaining some temperamental and neurological phenotypes like anxiety and emotional instability with heritable differences in adult T levels (42, 43) appear unsubstantiated in the light of our study. Although we cannot exclude a causal relationship between T and some of these phenotypes, our data suggest that, especially without larger sample sizes or refined phenotyping, efforts to establish relationships between T levels and e.g. behavior will likely be unproductive.
Taken together, supporting 379 recent recommendations, our data thus suggests that for example the risks and benefits of using T as a 380 medical treatment should be carefully weighted, given T's complex and indirect relationship to most 381 phenotypes and potential adverse and beneficial outcomes in both sexes (7, 25). 382

T as a contributor to sex differences
Having comprehensively mapped the impacts of T across diverse complex diseases and traits, we can start drawing inferences on the role of T as a contributor to male-female differences. We reason that should T causally contribute to within-sex variability of a trait, T should similarly account for a fraction of the between-sex difference detected in the same trait. Indeed, in instances where causality of T was implicated for a sex-shared trait, the effect estimates often aligned with the direction of the phenotypic sex bias in the given trait, i.e., higher T levels associated with typical male characteristics. For instance, our work highlighted causal connections between increasing free T, higher hemoglobin and higher bone strength, backed up by previous experimental observations (1, 34, 44), and thus suggests the involvement of T as a mediator of the established sex differences in these characteristics. Extrapolating from the MR estimates, we estimate that ~10-20% of the mean difference in hemoglobin levels between males and females may result from average differences in free T levels, consistent with the notion that T directly affects male-female differences in, e.g., athletic capacity (1). Moreover, we find that higher free T levels are causal to masculine external features like hirsutism in females and baldness in males, further implicating direct involvement of T in defining sex differences. Additionally, in females higher bioavailable T levels correlated genetically with a shift of metabolism in a male-like direction (e.g., increased BMI, WHR, and poorer blood lipids, yet reduced risk for obesity and reduced body fat levels).

Overall, the limited evidence for the causal involvement of T in most of the studied traits suggests that most phenotypic sex differences are not attributable to a linear relationship between T levels and a phenotype. Instead, the impact of T may be mediated through a threshold effect, potentially at a given developmental time window, acting as a switch that results in more global rewiring of biological processes and thereby in systematic male-female differences. Under this model, the within-sex variability in T may have non-existent or very subtle effects on many phenotypes, complicating the detection of potential causality of T in a study setting like ours that utilizes normal physiological ranges of adult T levels.

Connection between reproductive success and distinct genetic architecture of T between sexes?
Finally, our study provides some unique insight into the potential causes behind the sex differences in T levels. In the cross-sex PGS analyses, we found only one case where there was clear evidence of antagonistic effects for the T PGS between the sexes, which in theory might promote sex differences in T levels. Both male- and female-specific total T PGSs protected from stroke in males, whereas the same PGSs had exactly the opposite effects on stroke risk in females. Although we cannot conclude that the action of T truly drives all of these associations, the result agrees with a degree of sex-specificity in the genetic disease mechanisms for stroke (35).

The observation that the genetic variants responsible for regulating T levels are largely distinct between males and females nevertheless raises speculation about the evolutionary forces maintaining sex differences and shaping the biology of T. The cross-sex genetic correlations for traits related to fitness (e.g., reproductive success) are generally expected to be low, due to potentially conflicting evolutionary pressures (45). We indeed associated T positively with reproductive success in males (T increasing the number of children fathered), but negatively with both pre- and postmenopausal reproductive health in females, with evidence for a causal role of T behind these associations. We may thus speculate that there exists a selective advantage in relation to reproductive success to maintain higher T levels in males, whereas the opposite may be true for females, potentially promoting the widespread sex differences across multiple traits.

Challenges for genetic analyses in addressing T function
In this study, the combination of data from two independent biobanks should result in reduced confounding, allowing robust inferences about the contribution of T to human phenotypes. Although based on extensive data sets, our study still has some limitations. We stress that studying T levels differs drastically from studying T action, the serum T levels serving only as a proxy for the latter. Theoretically, variants increasing T activity may include 1) variants increasing T production or decreasing T metabolism and breakdown, resulting in consistently higher T levels, and 2) variants increasing T uptake in peripheral tissues or the sensitivity of the body to T, lowering circulating T levels. This severely limits the potential of standard genetic methods, which expect linear relationships between two endpoints, to find links between T levels and complex traits. In light of the established model of regulation of T levels via the hypothalamic-pituitary-gonadal (HPG) axis (high T levels leading to downregulation of gonadotropin secretion, ultimately lowering T levels), this limitation may apply especially to studying T action in males (16,17).

Moreover, like other recent studies (22, 25), we emphasize the challenges that the close ties between SHBG and T may pose. Earlier studies have addressed this issue by attempting to distinguish genetic variants with more specific effects for testosterone by using clustering-based approaches (22,25). Besides accounting for the effect of SHBG by calculating the level of bioavailable T, we additionally examined how SHBG adjustment influences the results of the testosterone PGS analyses and causality estimates, allowing for the detection of some traits with clear SHBG confounding.

Besides this specific example, genetic studies are also prone to confounding by pleiotropy in a more general sense, whereby a gene influences multiple traits via independent biological pathways (30). Although we opted for adjusting for the effect of BMI in our GWAS, obesity being a known confounder for testosterone levels (Supplementary Figure 8) (46, 47), it remains possible that this connection still affects our findings. Genetic pleiotropy may also confound MR analyses, and despite the vast potential of MR in establishing causal relationships (48), we generally propose caution in interpreting these findings. In our case, the MR models did not always agree on causality (Supplementary Tables 22 and 27 [...]

Finally, our setting does not allow for assessing the effects of fetal T exposure, which may be critical, e.g., for neurological traits (49). We also emphasize that our results are based on normal variation in T levels, not on the effects of supraphysiological T injections. Additionally, many of T's effects depend on its conversion to estradiol, also in males, and we cannot rule this out as a potential confounder in our study. Finally, the data used in our study do not allow for assessing the effects of acute changes in hormone secretion, and personal differences in the response to such fluctuations may be crucial for some phenotypes.

Despite these challenges, we were able to highlight several novel, albeit often expected, relationships between genetically determined T levels, human health, and sex differences. Overall, we show the power of biobank-scale genetic analyses to extend and clarify the results obtained from epidemiological and experimental studies, leading to improved understanding of how human phenotypes are related to sex hormone levels.

Conclusion
Here, by combining PGSs and electronic health records, we have assessed the link between T, disease risk and complex traits, including many understudied phenotypes. We provide a broad genetic perspective on disease impacts and phenotypic effects of post-pubertal T exposure, extending previous experimental and MR studies. We shed light on the interplay between T and complex traits in both sexes, and the role of T in driving sex differences. Finally, we underscore some critical factors that should be taken into account when assessing these relationships, providing a reference point for future genetic and epidemiological studies of the action of T.

Methods
Genotype and phenotype data from the UK Biobank
The genetic association analysis was based on data from the UK Biobank, a population-based biobank consisting of 502,637 subjects (aged 37-73 years) (50 [...]

We restricted our study to 408,186 individuals from the white British subset. We removed outliers for genotype heterozygosity and missingness, as well as samples with sex chromosome aneuploidies, mismatches between reported and inferred sex, and samples that UK Biobank did not use in relatedness calculations (50). We did not exclude related samples since our analysis method (BOLT-LMM) allows for their inclusion.

For the GWASs, we used biochemically measured testosterone and SHBG, and calculated FAI and free T. For calculation of FAI, we used the formula 100*Testosterone/SHBG (nmol/ml). Calculated free T was derived with the Vermeulen equation, using directly measured albumin values for each participant, as described in (51, 52).

All four traits were separated by sex and log transformed. A linear regression model was fit for each trait with BMI and age as covariates, as well as menopause status for females. Subjects with residual values beyond ±5 SD from the mean were excluded from the analyses, serving as a further QC step to exclude outliers potentially reflecting medical conditions or drug use affecting androgen levels. Inverse-normalized values of the remaining residuals were used as phenotype values for the GWAS analyses. After the QC steps, our study included altogether 177,499 males and 205,141 females.
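As an illustration of the two calculated traits, the following Python sketch computes FAI from the formula above and solves the mass-action binding model that underlies the Vermeulen equation. The association constants (3.6e4 L/mol for albumin, 1e9 L/mol for SHBG) and the albumin molar mass (69,000 g/mol) are assumed values commonly quoted for this equation and are not stated in the text. The function names are hypothetical, and this is not the code used in the study.

```python
import math

# Assumed constants (not given in the text above).
K_ALBUMIN = 3.6e4               # albumin-T association constant, L/mol
K_SHBG = 1.0e9                  # SHBG-T association constant, L/mol
ALBUMIN_MOLAR_MASS = 69_000.0   # g/mol


def free_androgen_index(total_t, shbg):
    """FAI = 100 * total T / SHBG, with both inputs in the same molar unit."""
    return 100.0 * total_t / shbg


def vermeulen_free_t(total_t_nmol_l, shbg_nmol_l, albumin_g_l=43.0):
    """Calculated free T (nmol/L) from total T, SHBG and albumin."""
    tt = total_t_nmol_l * 1e-9                    # mol/L
    shbg = shbg_nmol_l * 1e-9                     # mol/L
    albumin = albumin_g_l / ALBUMIN_MOLAR_MASS    # mol/L
    # Free plus albumin-bound T per unit of free T.
    n = 1.0 + K_ALBUMIN * albumin
    # Mass balance: TT = n*FT + SHBG*K*FT / (1 + K*FT),
    # which rearranges to a*FT^2 + b*FT - TT = 0.
    a = n * K_SHBG
    b = n + K_SHBG * (shbg - tt)
    free_t = (-b + math.sqrt(b * b + 4.0 * a * tt)) / (2.0 * a)
    return free_t * 1e9


if __name__ == "__main__":
    print(free_androgen_index(15.0, 40.0))
    print(vermeulen_free_t(15.0, 40.0, albumin_g_l=43.0))
```

The default albumin value of 43 g/l mirrors the fixed concentration used where measured albumin is unavailable. With per-participant albumin, the measured value would be passed instead.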

Genetic association analysis and definition of the lead SNPs
The GWAS analyses were performed using BOLT-LMM (v2.3.2) (51, 52). Imputed SNPs were restricted to variants with MAF ≥ 0.1% and imputation quality ≥ 0.7 (50). 1000 Genomes European data were used as reference LD scores for calibrating the BOLT-LMM statistic. The first 10 principal components were used as quantitative covariates in the runs. A linear regression model was fit for each trait with BMI and age as covariates, as well as menopause status for females. Genetic correlation analyses, heritability estimates and the number of loci found implied that the results remained consistent with different covariate configurations (Supplementary Figure 8), but we chose to include body mass index (BMI), known to associate with T levels (28,29), as a covariate in our GWAS. Including up to 127 covariates (based on (53)), e.g., assay center, dilution factors, blood draw time, and socioeconomic status indicators, or excluding related individuals from the analysis all showed negligible effects on the genetic findings we report here (Supplementary Figure 8).

Independent lead SNPs were selected for each chromosome by recursively taking the SNP with the lowest p-value (until none below the p-value threshold 5e-08 were left) from the GWAS summary statistics and removing all SNPs within 500 kb on each side of it from the next round. The chromosomal positions of these 1 Mb windows were stored, and overlapping windows were merged into the final list of loci. The SNP with the lowest p-value in each of these windows was selected as the lead SNP.
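The window-based locus definition described above can be sketched as follows for a single chromosome. This is a minimal illustration under assumptions: the input is taken to be a list of (SNP id, base-pair position, p-value) tuples, and the function name define_loci is hypothetical rather than taken from the study's code.

```python
P_THRESHOLD = 5e-8   # genome-wide significance threshold from the text
FLANK = 500_000      # 500 kb on each side of a selected SNP


def define_loci(snps):
    """Return merged loci as (start, end, lead_snp_id) for ONE chromosome.

    snps: list of (snp_id, position, p_value) tuples (assumed format).
    """
    # Only SNPs below the threshold can ever be picked as a window centre.
    remaining = [s for s in snps if s[2] < P_THRESHOLD]
    windows = []
    while remaining:
        top = min(remaining, key=lambda s: s[2])
        start, end = top[1] - FLANK, top[1] + FLANK
        windows.append((start, end))
        # Drop every SNP inside the window before the next round.
        remaining = [s for s in remaining if not (start <= s[1] <= end)]

    # Merge overlapping windows into the final loci.
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    # The lead SNP is the lowest p-value SNP inside each merged locus.
    loci = []
    for start, end in merged:
        inside = [s for s in snps if start <= s[1] <= end]
        lead = min(inside, key=lambda s: s[2])
        loci.append((start, end, lead[0]))
    return loci


if __name__ == "__main__":
    example = [
        ("rs1", 1_000_000, 1e-20),
        ("rs2", 1_300_000, 1e-10),
        ("rs3", 1_700_000, 1e-9),
        ("rs4", 5_000_000, 1e-8),
        ("rs5", 9_000_000, 0.5),
    ]
    print(define_loci(example))
```

In the example, rs3 lies outside the first window around rs1 and therefore seeds its own window. The two windows overlap and are merged into one locus whose lead SNP is rs1.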

Pathway, tissue enrichment and co-localisation analyses
Tissue and gene set enrichment analyses were carried out with SNP2GENE and GENE2FUNC implemented in FUMA using default settings (54). To test in which tissues the genes residing in the GWAS loci were preferentially expressed, we used the full distribution of SNP p-values and the 30 general tissue types of GTEx v6. For pathway analysis, to assess whether the genes in the GWAS loci are overrepresented in pre-defined gene sets via hypergeometric tests, we selected manually curated [...]. For co-localisation analyses to assess whether the genetic loci showed evidence for shared genetic effects between males and females, and to estimate the maximum posterior probability (MAP) for the loci being shared, we used gwas-pw (56).
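For readers unfamiliar with the hypergeometric test mentioned above, the following Python sketch shows the upper-tail probability that underlies such an over-representation analysis. It is a generic illustration, not FUMA's implementation. The function and argument names are hypothetical, and details such as the choice of background gene universe and multiple-testing correction are left out.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_set, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, of which n_in_set belong to the gene set."""
    total = comb(n_background, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 background genes, 4 in the gene set,
    # 3 genes in GWAS loci, 2 of which fall in the set.
    print(hypergeom_enrichment_p(10, 4, 3, 2))
```

A small p-value indicates that the overlap between the locus genes and the gene set is larger than expected by chance under random sampling from the background.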

Replication in the Young Finns Cohort and calculation of PGS
The Cardiovascular Risk in Young Finns Study (YFS) is a longitudinal follow-up of 3,596 subjects at baseline. The baseline survey was conducted in 1980 and subsequent follow-ups involving the whole sample were held in 1983, 1986, 2001, 2007, 2011, and 2017. For testosterone and SHBG, we used data from the 2001 follow-up (subjects aged 24-39 years). A venous blood sample was drawn from the antecubital vein after a 12-hour overnight fast. Serum was aliquoted and stored at -70 °C until analysis. In males, total testosterone quantification was performed in 2009 with a competitive radioimmunoassay (Spectria Testosterone kit, Orion Diagnostica, Espoo, Finland), and Bio-Rad Lyphocheck control serums 1, 2, and 3 were used in quality control. Before quantification, serum aliquots had been thawed three times. Total testosterone was quantified first and aliquots were re-frozen before SHBG quantification. In females, total testosterone quantification was performed in 2011. SHBG quantification for males was done in 2009 and for females in 2011 with the Spectria SHBG IRMA kit (Orion Diagnostica, Espoo, Finland). Free testosterone was estimated using Vermeulen's formula. As albumin concentration was not available, we used a fixed albumin concentration of 43 g/l.

In YFS, genotyping was performed at the Wellcome Trust Sanger Institute (UK) using a customized Illumina Human Map 670k bead array. The custom content on the 670k array replaced some poorly performing probes on Human610 and added more CNV content. As quality control, we excluded individuals and probes with over 5% missingness (--geno and --mind filters in Plink). Variants deviating from Hardy-Weinberg equilibrium (p < 1x10^-6) and variants with minor allele frequency below 1% were excluded. Related samples were excluded (n=51) with a pi-hat cut-off of 0.2. A total of 2,442 individuals and 546,674 variants passed the quality control measures. The mean call rate across all included markers after the quality control was 0.9984. Next, imputation was performed using the population-specific Sequencing Initiative Suomi (SISu) as reference panel. We examined the association of the UK Biobank GWAS lead SNPs with the corresponding T trait in YFS. If the annotated lead SNP was not available in YFS, we used LDstore2 (v2.0b) (57) to calculate LD in a 100 kb window around the lead SNP in the UK Biobank imputed data and selected the closest SNP with R2 > 0.8 with the lead SNP as a proxy.

To construct PGSs, we applied the LDpred (28) method to the sex-specific GWAS results from the UK Biobank for total T, SHBG, FAI and free T, using 1000 Genomes Europeans as LD reference and the default LD radius to account for LD. We then used the weights from the LDpred infinitesimal model to construct genome-wide PGSs for each individual in the YFS with Plink 2.0 (11 Feb 2018). Only variants imputed with high confidence (imputation INFO > 0.8) were included in the PGS calculation. Variants in chromosomes 1-22 and chrX were included. In males, an allele dosage of 2 was used for X-chromosomal haploid variants. To evaluate the PGS prediction accuracy in the YFS, we calculated the R2 for each trait using linear regression with the z-score normalized PGS as predictor, age and 10 PCs as covariates, and the z-score normalized T trait as outcome.
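Conceptually, the PGS described above is a weighted sum of effect-allele dosages. The sketch below illustrates this with plain Python dictionaries. It is not the Plink scoring routine, the variant identifiers and weights are invented, and coding a haploid male chrX allele as dosage 2 reflects one reading of the sentence above. All function names are hypothetical.

```python
from statistics import mean, pstdev


def male_x_dosage(haploid_allele_count):
    """Put a haploid male chrX genotype (0 or 1 copies) on the 0/2 scale."""
    return 2.0 * haploid_allele_count


def polygenic_score(dosages, weights):
    """Sum of weight * dosage over variants present in both dictionaries."""
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)


def z_normalize(values):
    """Centre to mean 0 and scale to unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


if __name__ == "__main__":
    # Hypothetical per-variant weights (e.g., posterior effect sizes).
    weights = {"rs1": 0.1, "rs2": -0.2, "chrX:1": 0.5}
    female = {"rs1": 2.0, "rs2": 1.0, "chrX:1": 1.0}
    male = {"rs1": 1.0, "rs2": 0.0, "chrX:1": male_x_dosage(1)}
    scores = [polygenic_score(female, weights), polygenic_score(male, weights)]
    print(scores, z_normalize(scores))
```

The z-normalized scores would then enter a regression as the predictor, alongside covariates such as age and principal components.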

Estimation of heritability and calculation of the genetic correlations by LDSC
SNP-based heritability for the studied T traits and genetic correlations between these and with 44 additional phenotypes were estimated using linkage disequilibrium score regression (LDSC) (36). The summary statistics for the 44 traits were downloaded directly from the source repositories and analysed locally; for the original sources, please see the references in Supplementary Table 25. For the genetic correlation analyses, pre-computed LD Scores from 1000 Genomes Europeans excluding the HLA region were used. For 23 traits we performed the analyses using sex-specific GWAS results and compared these to data from sex-combined GWAS (Supplementary Table 27). Generally, genetic correlation results using either sex-specific or sex-combined GWAS data were highly similar.

Disease associations in FinnGen
To assess whether the PGSs for the studied traits associate with disease risk, we utilized the FinnGen study (data freeze 5), consisting of 217,464 individuals (94,478 males, 122,986 females) (29). FinnGen comprises Finnish prospective epidemiological and disease-based cohorts and voluntary biobank samples collected by hospital biobanks. The genotypes have been linked to national hospital discharge (available from 1968), death (1969-), cancer (1953-) and medication reimbursement (1964-) registries as well as the registry on medication purchases (1995-). The samples were genotyped with Illumina and Affymetrix arrays (Illumina Inc., San Diego, and Thermo Fisher Scientific, Santa Clara, CA, USA). The genotypes have been imputed using the SISu v3 population-specific reference panel developed from high-quality data for 3,775 high-coverage (25-30x) whole-genome sequences of Finns. The detailed genotype imputation workflow can be found at https://dx.doi.org/10.17504/protocols.io.xbgfijw. The dataset uses genome build 38 (hg38).

For PGS analyses, we used the same variant weights (LDpred infinitesimal model) as for YFS, and calculated genome-wide PGSs for each individual with PLINK2 (v2.00a2.3LM). Variants in chromosomes 1-22 and chromosome X (imputed with high confidence, imputation INFO ≥0.7) were included (total number of variants ranging from 6,535,263 for female total T to 6,536,405 for female SHBG) and we used genotype dosages to incorporate imputation uncertainty. In males, an allele dosage of 2 was used for X-chromosomal haploid variants. We studied the PGS associations with 36 disease endpoints with potential links to androgens, representing six loosely defined disease categories. For details of the studied phenotypes see Supplementary Table 21, www.finngen.fi and risteys.finngen.fi. Cox proportional hazards models were used for estimating hazard ratios (HRs) and 95% CIs, with age as the time scale and the first 10 principal components of ancestry and genotyping batch as covariates. The proportionality assumption for Cox models was assessed with Schoenfeld residuals and log-log plots. For the cross-sex analyses, we took the sex-specific PGSs and checked whether these would associate with the studied endpoints in the other sex, using the z-test to compare the original and cross-sex associations. We additionally performed SHBG-adjusted PGS associations for all endpoints to control for potential confounding by SHBG of total and free T. [...]

Causality analyses
MR analyses treat genetic variants as instrumental variables, and their reliability depends on two key assumptions: 1) alleles are randomly assigned, and 2) alleles that influence the exposure do not influence the outcome via any other means. The first assumption is controlled by using BOLT-LMM as our model in the primary GWAS analysis, but the second is harder to control when using a large number of SNPs as instrumental variables.
In line with the observed widespread genetic pleiotropy affecting most complex traits, we noted that the GWAS loci contained many genes associated with pleiotropic effects on human phenotypes (for example, LIN28B (58), GCKR (56) and TYK2 (59)). Therefore, given the vast polygenicity of the studied T traits, we chose latent causal variable (LCV) (31) and MR-Egger (32) as our primary MR methods, designed to take into account pleiotropy-induced confounding when assessing causal relationships. LCV has been proposed to provide more unbiased causality estimates than conventional MR approaches, whereas MR-Egger should provide accurate causality estimates under the InSIDE assumption (the genetic variants have pleiotropic effects that are independent in magnitude and are thus not mediated by a single confounder exposure), besides its recommended use as a sensitivity analysis for conventional MR (31,32). For comparison we also ran conventional MR analyses (inverse-variance weighted (IVW)), which remain more sensitive to confounding by genetic correlation and pleiotropy (31). To extend the basic MR-Egger analysis and to tease out the potential effects of SHBG on causality estimates of total and free T, we used multivariable MR-Egger (33). LCV reports the genetic causality proportion (GCP) as an estimate of causality, under a model where genetic correlation between two traits is mediated by a latent variable having a causal effect on each trait. GCP=1 means trait 1 is fully correlated with the latent variable, and hence fully causal to trait 2. A high GCP value and a statistically significant effect support partial genetic causality between the traits, and suggest that interventions targeting trait 1 are likely to affect trait 2. The p-value obtained in the analysis refers to the null hypothesis that GCP=0. A highly significant p-value does not require a high GCP.
A positive GCP value indicates causality of trait 1 to trait 2, whereas a negative value indicates support for causality of trait 2 to trait 1. LCV also estimates the genetic correlation between the traits. To estimate how much T could explain sex differences in hemoglobin, we calculated ((FT_m-FT_f/SD_FT_m)*β*SD_H_m)/((H_m-H_f)/SD_H_m), where FT = mean free testosterone, m = males, f = females, SD = standard deviation, β = MR estimate, and H = hemoglobin in UK Biobank based on (60). We applied the LCV, MR-Egger, multivariable MR-Egger and IVW models locally using R 4.0.2. The MR analyses were run using the TwoSampleMR (v0.5.2) (61) and MendelianRandomization (v0.5.0) R packages (62). For the traits from public GWAS included in the genetic correlation analyses, in 16 out of 44 instances we could perform two-sample MR (phenotype data not based on UK Biobank samples, Supplementary Table 24). FinnGen represents an independent research cohort from the UK Biobank, and thus all FinnGen causality analyses were two-sample MR analyses.
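To make the contrast between the IVW and MR-Egger estimators described above concrete, the following Python sketch computes both point estimates from per-variant summary statistics. It is a simplified illustration and not the code of the TwoSampleMR or MendelianRandomization packages: standard errors, heterogeneity statistics and the multivariable extension are omitted, first-order weights 1/se_out^2 are assumed, and all names and example values are hypothetical.

```python
def _weights(se_out):
    """Inverse-variance weights from the outcome standard errors."""
    return [1.0 / (s * s) for s in se_out]


def ivw_estimate(beta_exp, beta_out, se_out):
    """IVW slope: weighted regression of outcome on exposure effects
    constrained to pass through the origin."""
    w = _weights(se_out)
    num = sum(wi * bx * by for wi, bx, by in zip(w, beta_exp, beta_out))
    den = sum(wi * bx * bx for wi, bx in zip(w, beta_exp))
    return num / den


def mr_egger_estimate(beta_exp, beta_out, se_out):
    """MR-Egger: the same weighted regression with a free intercept.

    Returns (intercept, slope). Variants are first oriented so that the
    exposure effect is positive.
    """
    w = _weights(se_out)
    x = [abs(bx) for bx in beta_exp]
    y = [by if bx >= 0 else -by for bx, by in zip(beta_exp, beta_out)]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return my - slope * mx, slope


if __name__ == "__main__":
    # Invented data: true slope 0.5 plus a constant pleiotropic offset 0.02.
    beta_exp = [0.1, 0.2, -0.3, 0.4]
    beta_out = [0.07, 0.12, -0.17, 0.22]
    se_out = [0.01, 0.01, 0.01, 0.01]
    print(ivw_estimate(beta_exp, beta_out, se_out))
    print(mr_egger_estimate(beta_exp, beta_out, se_out))
```

In this toy example the constant offset mimics directional pleiotropy. The Egger intercept absorbs it and the slope recovers 0.5, whereas the IVW estimate, which forces the line through the origin, is pulled upward.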