The genetic architecture of changes in adiposity during adulthood

Obesity is a heritable disease, characterised by excess adiposity that is measured by body mass index (BMI). While over 1,000 genetic loci are associated with BMI, less is known about the genetic contribution to adiposity trajectories over adulthood. We derive adiposity-change phenotypes from 1.5 million primary-care health records in over 177,000 individuals in UK Biobank to study the genetic architecture of weight-change. Using multiple BMI measurements over time increases power to identify genetic factors affecting baseline BMI. In the largest reported genome-wide study of adiposity-change in adulthood, we identify novel associations with BMI-change at six independent loci, including rs429358 (a missense variant in APOE). The SNP-based heritability of BMI-change (1.98%) is 9-fold lower than that of BMI, and higher in women than in men. The modest genetic correlation between BMI-change and BMI (45.2%) indicates that genetic studies of longitudinal trajectories could uncover novel biology driving quantitative trait values in adulthood.

Introduction

[...] 0.040 SD decrease (95% CI=0.021-0.049, $P = 2.3 \times 10^{-5}$) in expected WC slope over time and 0.031 SD decrease (0.012-0.050, $P = 1.1 \times 10^{-3}$) in expected WHR slope over time, independent of baseline values (Figure 3D and Supp. Table 6). While the effect direction remains consistent, these associations are no longer significant upon adjustment for BMI (all $P > 0.1$), suggesting that the observed loss in abdominal adiposity over time may represent a reduction in overall adiposity.

The APOE locus is a highly pleiotropic region that is associated with lipid levels 56,57, Alzheimer's disease 58,59, and lifespan 60,61, among other traits 62. Excluding the 242 individuals with diagnoses of dementia or Alzheimer's disease in our replication datasets did not alter associations of rs429358 with any of the longitudinal obesity traits (Supp. Fig. 2), indicating that they are unlikely to be driven solely by weight loss that accompanies dementia. We additionally performed a longitudinal phenome-wide scan to test for the associ[...] Table 8).

Genome-wide architecture of change in adiposity over time is sex-specific and distinct from baseline adiposity

We identify six independent genetic loci associated with distinct longitudinal trajectories of obesity traits by performing GWAS on individuals' posterior probabilities of membership in the high gain cluster (k1), high and moderate gain clusters (k1 and k2), or no loss (clusters k1, k2, and k3), adjusted for baseline obesity trait (Table 2). This included the APOE locus and five signals in intergenic regions. rs9467663 (OR=1.011 for membership in the high-gain weight cluster, $P = 1.6 \times 10^{-9}$) and chr6:26076446 (OR=1.012 for membership in [...] lead SNPs associated with baseline obesity traits, which is expected given the 7- to 9-fold lower heritability of adiposity change.
The heritability explained by genotyped SNPs ($h^2_G$) 66 of the posterior probability of belonging to an adiposity-gain cluster ranges from 1.38% in men to 2.82% in women, while the $h^2_G$ of baseline obesity traits varies from 21.6% to 29.0% across strata (Figure 4). Furthermore, we observe that the heritability of BMI and weight trajectories is higher in women than in men (2.89% vs 1.05% for BMI slopes, $P_{sexhet} = 0.012$; and 3.42% vs 1.69% for weight slopes, $P_{sexhet} = 9.9 \times 10^{-3}$). We do not observe a corresponding difference in the $h^2_G$ of baseline BMI or weight between the sexes ($P_{sexhet} > 0.1$). Finally, baseline and change in obesity traits are genetically correlated, with $r_G$ ranging from 0.35 (95% CI=0.24-0.45) for weight in women to 0.91 (0.59-1.23) for BMI in men (Figure 4). While the genetic correlation between baseline adiposity and adiposity change appears to be higher in men than in women, these estimates have wide confidence intervals (overlapping 1) and $P_{sexhet} > 0.05$ for both BMI and weight.
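
The sex-heterogeneity P-values reported above compare a female-specific estimate with a male-specific one. A minimal sketch of one common form of such a test follows, assuming the two estimates are independent and approximately normally distributed. The function name and the standard errors in the example are hypothetical and are not taken from the study; only the two point estimates (2.89% and 1.05%) come from the text.

```python
import math
from statistics import NormalDist


def sex_heterogeneity_test(est_f, se_f, est_m, se_m):
    """Two-sided Z-test for a difference between two estimates.

    Assumes the female and male estimates are independent, so the
    variance of their difference is the sum of the two variances.
    """
    z = (est_f - est_m) / math.sqrt(se_f ** 2 + se_m ** 2)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Point estimates from the text; standard errors are made up for illustration.
z, p = sex_heterogeneity_test(est_f=0.0289, se_f=0.005, est_m=0.0105, se_m=0.005)
print(f"Z = {z:.2f}, P = {p:.4f}")
```

If the two strata shared samples, a covariance term would be needed in the denominator; this sketch leaves it out.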

Throughout this study, we evaluate both BMI and weight as obesity traits, and expect these to track closely [...]

medRxiv preprint doi: https://doi.org/10.1101/2023.01.09.23284364; this version posted January 11, 2023. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

[...] change metrics, derived from regression models incorporating linear and non-linear temporal trends, are better suited to identify the genetic component of BMI and weight trajectories, and are robust to the manner in which this is defined. For example, many of the lead SNPs from our obesity-change GWASs are also associated with self-reported weight change, despite self-report being an imprecise metric 87.

In particular, rs429358 (missense variant in APOE) is robustly associated with loss in BMI and weight, independent of baseline obesity, across men and women, in individuals of various ancestral groups. APOE codes for apolipoprotein E, which is a core component of plasma lipoproteins that is essential for cholesterol transport and homeostasis in several tissues across the body, including the central nervous system, muscle, heart, liver, and adipose tissue 88,89. The precise pathway by which this variant affects weight change is difficult to pinpoint, as APOE is a highly pleiotropic locus associated with hundreds of biomarkers and diseases 62.

Here too, we find an association between rs429358 and changes in 11 biomarkers over time. Obesity is cross-[...] associations remain to be established. As rs429358 is also the strongest genetic risk factor for Alzheimer's disease 58,59, which is preceded by weight loss 95, we ensured that our findings were robust to the exclusion of individuals with dementia. We hypothesise that the APOE effect on weight loss may act through cholesterol- and lipid-metabolism pathways that partly determine response to dietary and environmental factors, as seen in mouse models 96,97. Indeed, it has recently been suggested that APOE-mediated cholesterol dysregulation in the brain may influence the onset and severity of Alzheimer's disease 98, raising the possibility that ageing-associated systemic aberrations in cholesterol homeostasis could have far-ranging consequences from weight loss to cognitive decline.

Patterns of weight change in mid-to-late adulthood have been observed to be sex-specific, particularly as women undergo significant changes in weight and body fat distribution around menopause 99. Here, we find that the heritability of changes in obesity traits is significantly higher in women than in men, supporting a previous finding that obesity polygenic scores are more strongly associated with weight-change trajectories in women than in men 68. This is in contrast to baseline obesity, which is equally heritable in men and women, both in our study and as previously reported 43. The lower genetic correlation between baseline obesity and obesity-change in women as compared to men, while not statistically significant, may nevertheless indicate sex-differential genome-wide contributions to these phenotypes. We hypothesise that sex hormones could explain some of this sex-specificity, particularly through their role in altering overall obesity and fat distribution around menopause 100,101. We were under-powered to study the genome-wide architecture of change in adult WC and WHR (ten-fold fewer observations than BMI and weight), whose cross-sectional levels are genetically sex-specific with higher heritability in women 43, so more work is needed to disentangle the genetic contribution to changes in adult body fat distribution over time.

While the EHR-linked UKBB cohort has driven genetic discovery for a vast array of human traits in populations of European ancestry 102, sample sizes remain too small to detect genome-wide associations in other ancestral groups.
We were thus limited to replicating European-ancestry associations in other populations, without the ability to discover ancestry-specific variants associated with adult adiposity trajectories. Furthermore, despite the inclusion of >200,000 individuals in the first release of the UKBB EHR data, sample sizes remain too low to analyse the genetics of longitudinal trajectory metrics, which have lower heritability than the [...] EHRs and difficult to ascertain the direction of causality by which these covariates may be associated with weight change. For example, the use of statins to lower blood pressure may be connected to weight gain, mediated indirectly by change in appetite 104, but high blood pressure may itself be a consequence of weight gain 105. Inappropriate adjustments along this causal pathway may lead to unexpected collider biases 106. In general, despite the longitudinal nature of EHRs, it is challenging to assign causality to the associations between weight change and covariates or disease diagnoses from EHR observations alone, as there is no prospective study design to follow 107. Advances in emulating randomised controlled trials from longitudinal EHR are beginning to overcome these challenges 108,109, and in the future, it will be critical to incorporate information on genetic risk into these simulated studies.

To the best of our knowledge, this is the largest study to date that characterises the genome-wide architecture of adult adiposity trajectories, and the first to identify specific variants that alter BMI and weight in mid- to late-adulthood. We add evidence to support the growing utility of EHRs in genetics research, and particularly highlight opportunities for incorporating longitudinal information to boost power and identify novel associations. In particular, the APOE-associated weight loss identified here contributes to a growing body of evidence on the ageing-associated effects of cholesterol dysregulation. Heterogeneity between men and women in the genome-wide architecture of obesity-change and genetic correlation with baseline obesity highlights the importance of distinguishing between the genetic contributions to mean and lifetime trajectories of phenotypes in sex-specific analyses. In the future, the growing integration of EHR with genetic data in large biobanks will allow us to assess the time-varying associations of rare variants with outsize effects on quantitative traits, as well as to establish genetic and phenotypic relationships among the trajectories of multiple correlated biomarkers across adulthood.

Identification and quality control of longitudinal obesity records

UK Biobank. This study was conducted using the UKBB resource, which is a prospective UK-based cohort study with approximately 500,000 participants aged 40-69 years at recruitment, on whom a range of medical, [...] i.e. $t_{i,1} < \ldots < t_{i,J_i}$, a "jump" $P_{i,j}$ for $j = 1, \ldots, J_i - 1$ was defined as: [...]

It is made available under a CC-BY 4.0 International license.
1. Fit a random-slope, random-intercept mixed model with the maximum likelihood estimation procedure in the lme4 113 package in R 114. We target two quantities: the baseline value of each individual's clinical trait (the $\beta_0 + u_{i,0}$ below); and the linearly approximated rate of change in the trait during each individual's measurement window (the $\beta_1 + u_{i,1}$ below): [...] where individual-specific covariates $x_i$ comprise: baseline age, (baseline age)$^2$, data provider, year of birth, and sex. Variance parameters $\sigma^2_{u,k}$ and $\sigma^2_{\varepsilon}$ are estimated. Fitting model (1) [...] where the intercept-adjusting covariates $x_{0,i}$ in (2) [...] (2) and (3). For example, the intercept trait for individual $i$ taken forward to GWAS is [...]

where: the $n_{df}$-vector $b_i$ contains the $i$th individual's spline basis coefficients; $X_B$ is the $(T + 1) \times n_{df}$ matrix of spline basis functions evaluated at days $0, \ldots, T$ post-baseline; and $Z_i$ is a $J_i \times (T + 1)$ matrix whose $j$th row extracts day $t_{i,j} - t_{i,1}$ post-baseline, i.e., [...]

We specify an order-1 autoregressive (AR(1)) model as a smoothing prior on spline coefficients, $b_i$, which vary smoothly around an individual-specific mean value, $\mu_i$. On $\mu_i$ we specify a non-informative prior: [...] where: $\Sigma_{AR(1)}$ is the $n_{df} \times n_{df}$ autocovariance matrix implied by an AR(1) model with lag-1 autocorrelation $\phi \in [0, 1)$ and scale parameter $\sigma^2_{AR(1)} > 0$; and $\vec{1}$ is an $n_{df} \times n_{df}$ matrix of ones.
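
To make the AR(1) smoothing prior more concrete, the sketch below builds the autocovariance matrix of a stationary AR(1) process. It is an illustration only: it assumes the scale parameter acts as the innovation variance, so that entry $(j, k)$ equals $\sigma^2_{AR(1)} \phi^{|j-k|} / (1 - \phi^2)$. The parameterisation used in the study may differ, and the function name is hypothetical.

```python
def ar1_autocovariance(n_df, phi, scale):
    """Autocovariance matrix of a stationary AR(1) process.

    Entry (j, k) is scale * phi**|j - k| / (1 - phi**2), where `scale`
    is treated as the innovation variance (an assumption of this sketch).
    """
    if not 0.0 <= phi < 1.0:
        raise ValueError("phi must lie in [0, 1)")
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    marginal = scale / (1.0 - phi ** 2)
    return [[marginal * phi ** abs(j - k) for k in range(n_df)]
            for j in range(n_df)]


# Small example: neighbouring spline coefficients are more strongly
# correlated than distant ones, which is what produces the smoothing.
sigma_ar1 = ar1_autocovariance(n_df=4, phi=0.5, scale=1.0)
for row in sigma_ar1:
    print(["%.3f" % value for value in row])
```

Larger values of `phi` tie adjacent coefficients together more tightly and so give smoother fitted trajectories, while `phi = 0` reduces the prior to independent coefficients.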

The prior at (6) and likelihood at (5) are a specific case of the Bayes linear model 117, for which the posterior is available in closed form: [...] The posterior at (7) [...] Fig. 4). We additionally compared cluster allocations for 5,000 randomly selected individuals across [...] Fig. 8).

For each trait separately, we set $\sigma^2$ to the median of its individual-specific maximum likelihood estimates (MLEs), i.e., $\sigma^2 := \mathrm{median}\left\{ \tfrac{1}{J_i} \lVert y_i - Z_i X_B m_i \rVert_2^2 : i = 1, \ldots, n \right\}$, where each MLE is calculated from (5) after substituting for $b_i$ its maximum a posteriori estimate, $m_i$ from (7) (Supp. Table 12).
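
The pooled noise variance described above can be sketched as follows. This is a toy illustration rather than the study's code: the function names are hypothetical, and `fitted` stands in for the vector $Z_i X_B m_i$ (the fitted spline evaluated at an individual's visit days), which is supplied directly here instead of being computed from a spline basis.

```python
from statistics import median


def residual_variance_mle(observed, fitted):
    """Per-individual MLE of the noise variance: the mean squared residual."""
    if len(observed) != len(fitted) or not observed:
        raise ValueError("observed and fitted must be non-empty and of equal length")
    return sum((y - f) ** 2 for y, f in zip(observed, fitted)) / len(observed)


def pooled_sigma2(individuals):
    """Median of the individual-specific variance MLEs.

    `individuals` is a list of (observed, fitted) pairs, one per person.
    """
    return median(residual_variance_mle(obs, fit) for obs, fit in individuals)


# Three made-up individuals with 3, 2 and 4 measurements respectively.
toy = [
    ([25.0, 26.0, 27.0], [25.5, 26.0, 26.5]),
    ([30.0, 29.0], [30.0, 30.0]),
    ([22.0, 22.4, 22.8, 23.0], [22.0, 22.4, 22.8, 23.0]),
]
sigma2 = pooled_sigma2(toy)
print(sigma2)
```

Taking the median rather than the mean keeps a few individuals with very noisy records from dominating the shared variance estimate.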

The measurements $y_i$ inputted into the likelihood for the regularised spline model at (5) [...] with $m_i$ and $V_i$ defined at (7).
where $m_i$ and $\sigma^2 V_i$ are the posterior mean and covariance of individual $i$'s spline coefficients $b_i$ taken from (7). For each spline coefficient $k$ in (9), the squared difference between the mean coefficients of individuals $i$ and $i'$ is standardised by the sum of the corresponding variances.

[...] selected subset of 80% of individuals in each analysis stratum. We filter individuals in the training set to retain only those with at least $L = 2$ observations. For a fixed number of clusters, $K = 4$, we initialise cluster membership according to bins $B_{1:K}$ demarcated by the $0, \tfrac{1}{K}, \tfrac{2}{K}, \ldots, 1$ empirical quantiles of the estimated fold change in obesity trait between baseline and year $M = 2$: [...] To ensure robustness, we run the clustering algorithm $S = 10$ times, each on a random sub-sample of size [...] For each clustering $s$, we observe all trajectories $c_{s,1:K}$ to be monotonic and non-overlapping (Supp. Fig. 6). We can therefore define ordered cluster means $c_{(k),s}$, [...] Fig. 7). Finally, we compared cluster allocations over each of the 10 random training runs for a set of 5,000 randomly sampled individuals held out of the training splits (Supp. Fig. 9).

[...] belonging to cluster $k$ as the posterior probability of being closest in Euclidean distance to cluster $k$'s centroid: [...] where the second term in the integrand is the posterior from (7), and we approximate the integral in (12) using 100 Monte Carlo samples from the posterior. [...] Table 13).
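
The Monte Carlo approximation of soft cluster membership described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the posterior is taken to be multivariate normal with mean `mean` and covariance `cov` (standing in for $m_i$ and $\sigma^2 V_i$), distances are computed directly between coefficient vectors and centroids, and all function names and toy values are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def soft_membership(mean, cov, centroids, n_samples=100, seed=0):
    """Estimate P(nearest centroid is k) for draws from N(mean, cov)."""
    rng = random.Random(seed)
    low = cholesky(cov)
    dim = len(mean)
    counts = [0] * len(centroids)
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Transform standard normal draws into a draw from N(mean, cov).
        draw = [mean[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                for i in range(dim)]
        dists = [sum((d - c) ** 2 for d, c in zip(draw, cen)) for cen in centroids]
        counts[dists.index(min(dists))] += 1
    return [count / n_samples for count in counts]


# Toy example with two coefficients and K = 4 made-up centroids.
probs = soft_membership(
    mean=[0.0, 0.1],
    cov=[[0.04, 0.01], [0.01, 0.09]],
    centroids=[[1.0, 1.0], [0.0, 0.0], [-1.0, -1.0], [-2.0, -2.0]],
)
print(probs)
```

Because the probabilities are sample proportions, their resolution with 100 draws is 0.01; more draws would give finer estimates at higher cost.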
BMI and weight slope-trait genetic associations. We created adiposity slope phenotypes for the 17,006 individuals with multiple observations of BMI and 17,035 individuals with multiple observations of weight from repeat assessment centre visits (Supp. Fig. 3) with best linear unbiased predictors (BLUPs) from linear mixed-effects models as described in the slope trait modelling section above. We tested for association of this slope trait with genome-wide significant (GWS) variants associated with adiposity change in our discovery analyses, adjusted for the first 21 genetic PCs and genotyping array, via the linear regression framework implemented in PLINK 123. As PLINK does not account for family structure, we compared each pair of second-degree or closer related individuals (kinship coefficient > 0.0884) 110 and excluded the individual in the pair having higher genotyping missingness. We repeated the same protocol within each self-identified ethnic group of individuals not of white British ancestry.
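
The relatedness filter described above can be sketched as a short routine. It is a minimal illustration, assuming that pairs are processed in the order given and that a pair is skipped once one of its members has already been excluded; the text does not specify how overlapping pairs or ties in missingness were handled. The function name, identifiers and toy values are hypothetical; only the kinship threshold comes from the text.

```python
KINSHIP_THRESHOLD = 0.0884  # second-degree or closer, as stated in the text


def prune_related(pairs, missingness, threshold=KINSHIP_THRESHOLD):
    """Return the set of individuals to exclude.

    pairs: iterable of (id_a, id_b, kinship_coefficient)
    missingness: dict mapping id -> genotype missingness rate
    Within each related pair, the member with higher missingness is
    dropped; on a tie the second member is dropped (an arbitrary choice).
    """
    excluded = set()
    for id_a, id_b, kinship in pairs:
        if kinship <= threshold:
            continue  # not closely related enough to require pruning
        if id_a in excluded or id_b in excluded:
            continue  # the pair is already broken by an earlier exclusion
        drop = id_a if missingness[id_a] > missingness[id_b] else id_b
        excluded.add(drop)
    return excluded


pairs = [("A", "B", 0.25), ("B", "C", 0.12), ("D", "E", 0.05)]
missingness = {"A": 0.01, "B": 0.03, "C": 0.02, "D": 0.0, "E": 0.0}
excluded = prune_related(pairs, missingness)
print(sorted(excluded))
```

In this toy input, dropping one individual resolves both related pairs, and the D-E pair is left alone because its kinship coefficient is below the threshold.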

Genetic associations with BMI and weight cluster probabilities. We fit regularised splines as detailed above to the 17,006 individuals with multiple observations of BMI and 17,035 individuals with multiple observations of weight from repeat assessment centre visits (Supp. Fig. 3). Soft cluster membership probabilities for these individuals were calculated, and the three logit-transformed $\pi_i$ traits were carried forward for association testing with GWS variants associated with adiposity change in our discovery analyses. As above, we pruned out second-degree or closer related individuals and performed association analysis, adjusted for baseline [...]

[...] year weight change coded as an ordinal categorical variable with three levels: "loss", "no change", and "gain" in 301,943 individuals (described in the data section above). All models were adjusted for BMI, age, sex, year of birth, data provider, assessment centre, first 21 genetic PCs and genotyping array. We repeated the same [...]

[...] to determine the boost in power over that expected from the sample size difference between the two studies. The following analyses were all conducted in female-specific, male-specific, and sex-combined strata. [...]

[...] 125) by integrating UKBB assessment centre measurements with the interim release of primary care records provided by GPs, with QC performed as described above for obesity traits. Slope changes in each of these phenotypes were calculated using linear mixed-effects models described in (1).
A deterministic rank-based inverse normal transformation 115, as described in (4) [...]

We tested for sex heterogeneity in the effects of adiposity-change lead SNPs by calculating Z-statistics and corresponding P-values for the difference in female-specific and male-specific effects as: [...] A similar statistic and test were used to determine heterogeneity between $h^2_G$ of all traits in males and females, and $r_G$ between obesity-intercepts and obesity-change traits in males and females.

Figure 2: Genome-wide novel and refined SNP associations with baseline obesity estimated over the measurement window for each individual. (A) Combined Manhattan plot displaying genome-wide SNP associations with obesity trait (BMI or weight) across female, male, and sex-combined analysis strata.
Each point represents a SNP, with GWS SNPs ($P < 5 \times 10^{-8}$) coloured in: green for previously published obesity associations, blue for SNPs in linkage disequilibrium (LD) ($r^2 > 0.1$) with published associations, yellow for refined SNPs that represent conditionally independent ($P_{conditional} < 0.05$) and stronger associations with baseline obesity than published SNPs in the region, and pink for novel associations (see Methods 44). Novel SNPs are annotated to their nearest gene. (B) Proportion of variance in baseline BMI and weight that can be explained by the fine-mapped independent lead SNPs in each stratum. In green is the proportion of variance explained by previously published obesity-associated variants (and those in LD with these variants), while that explained by novel and refined variants is in pink. The numbers represent the number of lead SNPs in each of these categories (published / refined and novel).

Table 2: Lead SNPs identified from genome-wide association studies (GWAS) of posterior probability of membership in an adiposity-change cluster (high gain k1, high/moderate gain k1/k2, or high/moderate gain and steady k1/k2/k3), independent of baseline obesity. MAF = minor allele frequency (European-ancestry), TSS = transcription start site, SE = standard error, OR = odds ratio, CI = confidence interval.

Figure 3: Association of minor C allele of rs429358, missense variant in APOE, with various longitudinal phenotypes. (A) Effect size (beta) and 95% CI for associations of rs429358 with BMI and weight intercepts or linear slope change over time estimated from linear mixed-effects models in all analysis strata. (B) Left: OR and 95% CI for association of rs429358 with posterior probability of membership in the BMI and weight high-gain clusters (k1). Right: Modelled trajectories of standardised (std.) covariate-adjusted (adj.) BMI in carriers of the different rs429358 genotypes. (C) Proportion of individuals who self-report weight gain, weight loss, or no change in weight over the past year for carriers of each rs429358 genotype. (D) Effect size and 95% CI for associations of rs429358 with slopes over time of waist circumference (WC) and waist-to-hip ratio (WHR), adjusted for BMI (-adjBMI), estimated from linear mixed-effects models. (E) Effect size and 95% CI for associations of rs429358 with linear slope change in quantitative biomarkers over time, estimated from linear mixed-effects models.
Across all panels, estimates of trait change are adjusted for baseline trait values, and significance is controlled at 5% across the number of tests performed via the Bonferroni method. n.s. = non-significant.
Figure 4: Genotyped SNP-based heritability of, and genetic correlation between, baseline obesity trait and obesity-change phenotypes. Left column: heritability ($h^2_G$) estimates and 95% CI, calculated using the LDSC software 66 on a subset of 1 million HapMap3 SNPs 67 for the following traits: baseline BMI and weight, estimated from intercepts of linear mixed-effects models of obesity traits over time (u0), linear slope change in obesity traits over time (u1 adj. u0), adjusted for intercepts, and posterior probability of membership in a high-gain BMI or weight cluster, adjusted for baseline trait value (prob(k1) adj. u0). Right column: Genetic correlation, $r_G$, and 95% CI between the two obesity-change phenotypes and corresponding baseline obesity traits. In all panels, circles represent BMI, triangles represent weight; points are coloured by analysis strata (pink: female-specific, green: male-specific, grey: sex-combined). P-values display the level of significance of heterogeneity between the female- and male-specific estimates in each panel.
