Characterization of the genetic architecture of BMI in infancy and early childhood reveals age-specific effects and implicates pathways involved in Mendelian obesity

To elucidate the role of common genetic variation on infant and child weight development, we performed genome-wide association studies across 12 time points from birth to eight years in 28,681 children and their parents (27,088 mothers and 26,239 fathers) in the Norwegian Mother, Father and Child Cohort Study (MoBa). We identify 46 distinct loci associated with early childhood BMI at specific ages, matching different child growth phases, and representing four major trajectory patterns. Among these loci, 30 are independent of known birth weight and adult BMI loci, and 21 show peak effect between six months and three years, making these discoverable only at early age. Several of the 21 variants reside in/near genes previously implicated in severe forms of early-onset obesity, and monogenic obesity genes are enriched in the vicinity of the 46 loci. Four loci demonstrate evidence of several independent association signals as key drivers for BMI development near LEPR, GLP1R, PCSK1, and KLF14, all central to appetite and energy balance. At the KLF14 locus, we detect significant associations for maternally inherited alleles only, consistent with imprinting effects. Finally, we demonstrate how the BMI distribution stratified by different polygenic risk scores transitions from birth to adult profile throughout early childhood, and how age-specific polygenic risk scores improve the prediction of childhood obesity, outperforming scores based on adult BMI. In conclusion, our results offer a fine-grained characterization of the rapidly changing genetic association landscape sustaining early growth.

But how the genetics of BMI develops from birth to adiposity rebound, where the genetic signature of an adult-like obesity emerges, remains unknown.
Studies among children with severe obesity have identified rare genetic variants that cause early-onset monogenic and syndromic forms of obesity 19,20 . A recent investigation of severe childhood obesity found an excess burden of rare, predicted deleterious, variants involving genes near adult obesity loci 21 . Variants with different penetrance were detected in genes in the leptin/melanocortin pathway, a major determinant of satiety and energy expenditure.
Interestingly, GWASs suggest that the LEP-LEPR axis is also central to BMI development during infancy and childhood 17,18 .
In this study, we investigated the association of common variation with BMI from birth to eight years of age through a longitudinal analysis in the Norwegian Mother, Father and Child Cohort Study (MoBa) 22 . Using this unique pregnancy-based open-ended cohort with dense harmonized phenotypes and genotypes from both parents and child, we here present a detailed characterization of the rapidly changing genetic landscape of BMI during the first years of life.
Results BMI from 28,681 children was measured at birth, 6 weeks, 3, 6, 8 months, and 1, 1.5, 2, 3, 5, 7, and 8 years of age (Supplementary Table 1). At each time point we conducted linear mixed model regression analyses on standardized BMI under an additive genetic model, followed by approximate conditional and joint multiple single-nucleotide polymorphism (SNP) analyses to identify independent signals 23 , resulting in 46 independent loci reaching genome-wide significance (p < 5 × 10 -8 ) for at least one time point (Table 1). Of these, 30 are independent of the birth weight and adult BMI loci reported in the latest meta-analyses 13,24 .
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

Four major association trajectory clusters emerge
We investigated the dynamics of the association for the 46 loci by projecting their effect sizes over time onto a basis of reference profiles (Figure 1). The variants displayed different trajectories ( Figure 1C), demonstrating how the genetics of early childhood BMI is an agedependent combination of interweaved signals. Four major clusters of profiles emerged ( Figure 1E), which we hypothesize to represent distinct biological processes.
The Birth cluster represents nine loci previously associated with birth weight 24 . Our longitudinal analysis showed that the association near SH2B3, CCNL1, GPSM1, GCK, and DLG4 quickly vanishes after birth, indicating that these loci are conferring pure prenatal influences, while loci near ESR1, DLK1, and HHEX seem to influence growth also postnatally ( Figure 1 and Figure 2). The trajectory of ADCY5, known primarily as a type 2 diabetes (T2D) locus, is remarkable in presenting a strong association at birth which rapidly disappears, followed by a steady increase during infancy and childhood, but almost no association with adult BMI 13 ( Figure 1 and Figure 2).
The Transient cluster represents 21 independent signals with no effect at birth, peak association during infancy or early childhood, and little or no effect after the adiposity rebound.
None of the SNPs in this cluster reach genome-wide significance (p < 5 × 10 -8 ) in the largest adult BMI meta-analysis to date 13 (Figure 1 and Figure 2), and among SNPs that reach p < 1 × 10 -5 , three out of four have opposite direction of effect on BMI in adults compared to infancy (LEPR(rs10493377), MLXIPL(rs17145750) and KLF14(rs287621)). Conversely, of the variants previously implicated in birth weight, only one (PTCH1(rs28457693)) is present in this cluster. Thus, this cluster represents biological mechanisms with distinct effects on BMI development in infancy and childhood. The other phenotypes associated with the loci in this cluster are primarily anthropometric traits (Figure 1), yet the majority (11 of 21) are not known to be associated with adult traits.
The Early Rise cluster represents 12 loci showing a gradually stronger association with BMI from infancy into childhood. In contrast with the variants from the Transient cluster, the association levels plateau around adiposity rebound and maintain some effect until age seven to eight years. This cluster includes variants associated with self-estimated comparative height and size at age ten years in the UK Biobank, as well as traits related to adult body composition, which supports the hypothesis of a more persisting effect. However, only two SNPs in this cluster (ADCY3(rs11676272) and TNNI3K(rs10493544)) reach genome-wide significance in the largest adult BMI study 13 , and one (AC105393.2(rs77165542)) with no proxy in Yengo et al 13 showed an association with BMI in the parents in MoBa, the nine others show no association with adult BMI per se ( Figure 1 and Figure 2).
The Late Rise cluster, represents four loci (FTO(rs17817288), MC4R(rs78263856), SEC16B(rs545608), and FAIM2(rs7132908)) that show little to no association prior to adiposity rebound where they exhibit a rapid increase contrasting with the other clusters. The variants in the Late Rise cluster are in high LD with loci reported in a previous study on childhood BMI consisting mainly of children measured at age six to ten years 14 and with adult BMI 13 (Figure 2).
The observed upward trajectory therefore yields effects that seem to remain significant into adulthood. Figure 1H shows the density of the overall distribution of trajectory profiles in MoBa for all previously detected birth weight and adult BMI SNPs 13,24 along with the density of the 46 early growth loci detected in this study: the trajectories for birth weight and adult BMI segregate to the left and right sides of the space defined by the reference profiles, respectively, while the early . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) growth BMI is dominated by transient profiles. In contrast to the association profiles in our Birth cluster, the birth weight variants mostly display trajectories persisting or rising throughout childhood. Conversely, variants associated with adult BMI present a strong concentration of late rising profiles, suggesting that better power at late ages would provide a higher number of variants in this cluster.

SNP heritability and genetic correlation
We estimated SNP-based heritability and genetic correlation between various traits and BMI at all time points using LD score regression. The heritability estimates vary with age in a pattern mirroring childhood BMI curves (Supplementary Figure 1). Overall, the phenotypes assessed display age-dependent genetic correlation patterns with BMI, with lower correlation from six months to three years ( Figure 3). Birth weight adjusted for maternal effect presents a high correlation with BMI at birth (rg: 0.89, se: 0.061, p < 1 × 10 -47 ) that decreases quickly in infancy and throughout childhood, whereas for indirect maternal effects, the correlation is initially lower but increases from one year onwards. While obesity-related traits in general show constant correlation levels before accelerating at three years, comparative body size at age 10 in the UK Biobank, in which participants reported being thinner or plumper than average at age ten years, presents a rapid linear increase throughout development from birth to seven years (rg: 0.86, se: 0.06, p < 9 × 10 -53 ), which is in line with the observed overlap of this phenotype with the Early and Late Rise clusters. Higher childhood BMI correlates with younger age of menarche, taller stature in early puberty but not during adulthood, indicating a strong genetic correlation between childhood BMI and early pubertal development. The well-known inverse relationship of T2D with fetal growth vanishes quickly after birth and the genetic correlation of BMI with glycaemic traits varies rapidly throughout childhood, suggesting a continuous interplay between genetic variation influencing insulin metabolism and growth.

Monogenic obesity and the importance of the leptin/melanocortin pathway
We further investigated the relationship between our results and childhood obesity by evaluating whether genes involved in monogenic obesity are overrepresented in the vicinity of the loci.
Seven out of 42 genes used in routine testing for monogenic and severe early onset obesity reside within 250 kb of one of the 46 top hits (overrepresentation p < 1.01 × 10 -7 ). Six of these seven genes encode proteins participating in the leptin/melanocortin pathway (LEP, LEPR (3 variants), PCSK1 (two variants), POMC, ADCY3, and MC4R) providing compelling support for the importance of this pathway also in normal growth (Supplementary Table 2). The remaining gene, INPP5E, is implicated in MORM Syndrome (OMIM #610156), a ciliopathy presenting early-onset central obesity 25,26 . Apart from MC4R, the associated variants belong to the Transient and Early Rise clusters, showing that mechanisms at play act very early after birth, some of which in a narrow age window.

Key roles for variants in the LEP and LEPR loci
The strongest association with BMI across all time points is the intronic variant rs2767486 with peak association at six months in the LEPR locus (Transient cluster, eaf: 16%, β: 0.14, se: 0.012, p < 6.4 × 10 -34 ), presenting a transient association profile that peaks at six months, in agreement with 17,18 . We also identified two novel independent signals in this locus: the intronic rs10889551 and rs10493377, located 85 kb and 112 kb downstream of rs2767486, respectively.
Conditional and joint multiple-SNP analysis and different timings of association confirmed the independence of the three signals (Supplementary Table 3) (Figure 4). The previously described association with rs10487505 in LEP 17 showed a later and broader association profile, which resulted in its assignment to the Early Rise cluster. Its child BMI-increasing allele is associated with lower plasma leptin levels adjusted for BMI in adults 27 , and our results suggest that the association with BMI is specific to childhood.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)  Figure 2).

Novel variants near PCSK1
We identified two independent loci near the monogenic obesity gene PCSK1 31,32 (Figure 4). The strongest association was found for rs6899303, downstream of PCSK1 with peak association at six months (Transient cluster, β: 0.057, se: 0.0090, p < 5.3 × 10 -11 ). PCSK1 encodes the prohormone convertase 1/3 (PC1/3), highly expressed in the hypothalamic arcuate nucleus regulating food intake and body weight 33 . No previous phenotypic associations are reported for rs6899303, but the variant is a strong pQTL for PC1/3 34 . The second signal, tagged by rs263377, displays its strongest association at one year and resides between PCSK1 and CAST (Transient cluster, β: 0.054, se: 0.0095, p < 2.9 × 10 -8 ). rs263377 associates with multiple adult anthropometric traits including fat-free body mass in the UK Biobank (p < 1.84 × 10 -9 ). None of the two variants are in LD with the PCSK1 missense variant rs6235 associated with insulin and adult BMI-related traits 35 . The hypothalamic PC1/3 expression is high in two leptin-sensitive neuronal populations: proopiomelanocortin (POMC)-expressing neurons, and neuropeptide Y (NPY) and agouti-related peptide (AgRP)-expressing neurons. In the periphery, PC1/3 is highly . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 7, 2021. ; https://doi.org/10.1101/2021.05.04.21256508 doi: medRxiv preprint expressed in specific ghrelin-expressing endocrine cells in the stomach, the α-and β-cells of the islets of Langerhans in the pancreas, and various intestine enteroendocrine cells. These play an important role in appetite, glucose homeostasis, and nutrient assimilation by secreting several PC1/3 products including ghrelin, insulin, and proglucagon-derived peptides such as the hormone glucagon-like peptide-1 (GLP-1).

Three novel variants in GLP1R with different timing and effect on infant BMI
GLP-1 is released in the small intestines in response to food intake. It interacts with GLP1R abundant in hypothalamic regions regulating feeding behavior 36 , hereby inducing satiety. It is an incretin with insulinotropic effects in response to oral food intake. GLP-1 induces insulin secretion by interacting with beta cell GLP1R. We identified three independent signals at the GLP1R locus belonging to the Transient cluster ( Figure 4): (1) an LD region spanning the promoter and first exon of GLP1R with peak association at three months with lead SNP rs2268657; (2) one intergenic LD block spanning the gene itself with lead SNP rs2268647, peaking at one year; and (3) a 20 kb region with lead SNP rs1820721, 45 kb downstream GLP1R, between the SAYSD1 and KCNK5 genes, with peak association at six months.
Conditional and joint analysis revealed a complex relationship and timing between the signals.
The strength of association increased for all three variants when analysed together, and in particular for rs1820721 (at six months pcojo < 5.3 × 10 -21 ), suggestive of allelic heterogeneity at this locus. None of the three SNPs have been associated with childhood or adult BMI. However, the BMI-increasing alleles at rs2268657 and rs2268647 are both associated with lower GLP1R expression in stomach, pancreas, and adipose tissues (GTEx). Interestingly, rs2268657 has previously been associated with gastric emptying rate 37 , where individuals homozygous for the BMI-increasing allele expressed faster gastric emptying rates, suggesting that GLP1R variants may affect childhood BMI through higher digestion rate, in line with its function in the treatment of T2D.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 7, 2021. ; https://doi.org/10.1101/2021.05.04.21256508 doi: medRxiv preprint

Rapidly decreasing maternal influences at birth for SH2B3, HHEX, and ADCY5
For each of the 46 independent loci, we extended the association model using the parental genotypes, and conducted child-mother-father trio-and haplotype-resolved analyses. For most loci, the child effect at peak remains after conditioning on the maternal and paternal genotypes, with no noticeable parental effect ( Figure 5). However, for five variants, different patterns emerged: three loci from the Birth cluster SH2B3 (rs7310615), HHEX (rs11187129), and ADCY5 (rs11708067), and two from the Transient cluster near KLF14 (rs287621 and rs12672489).
For the ADCY5 and HHEX loci, associated with T2D and birth weight, the trio analysis demonstrated opposing fetal and maternal effects, as already observed for birth weight 24 , and no effect from the father (Supplementary Table 4). This differs from the SH3B2 locus, where the trio analysis indicated a dual and directionally consistent effect from both maternal and fetal alleles on birth BMI. The association trajectory of these three birth weight loci illustrates how the maternal genome provides heterogeneous indirect effects on fetal growth that vanish after birth with different dynamics ( Figure 5 and Supplementary Table 4).

Age-dependent imprinting patterns near KLF14
We identified two variants associated with childhood BMI upstream KLF14. rs287621 shows peak association at six months (Transient cluster, β: 0.064, se: 0.010, p < 3.7 × 10 -10 ) before sharply declining to a plateau from one to five years and then vanishing ( Figure 1 and Figure 2). Separated by an upstream recombination hotspot, rs12672489 displays a different association pattern, assigned to the Early Rise cluster, which does not persist into adulthood ( Figure 1 and  CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 7, 2021. ; https://doi.org/10.1101/2021.05.04.21256508 doi: medRxiv preprint haplotype analysis revealed that the association for both variants is driven by the maternally inherited allele throughout infancy, with little to no contribution from the paternal alleles and the maternal non-transmitted alleles ( Figure 5, Supplementary Table 4), consistent with imprinting effects. While rs287621 is associated with several adult phenotypes, the strongest known association for rs12672489 is comparative body size at age 10 in the UK Biobank (p < 3.5 × 10 -7 ), showing that this variant influences childhood growth despite residing outside of the region critical for adult traits. eQTL studies have linked variants to the abundance of KLF14 transcript in adipose tissue 40 and a variant near KLF14 has been associated with lower plasma leptin levels 41 , offering a mechanistic hypothesis and yet another putative link between leptin regulation and weight gain in infancy.

Polygenic transition across infancy and childhood
Individual-level polygenic risk scores (PRS) were created using PRSice-2 42 for each time point using summary statistics from the largest meta-analyses on birth weight 24 , childhood BMI 14 , childhood obesity 16 , adult BMI 13 , and T2D 43 . We then stratified the study population by PRS decile for each of the traits across all time points to investigate their relationship with BMI over time. Strong age-dependent gradients were found with opposing patterns for birth weight and BMI-related traits ( Figure 6 and Supplementary Table 5).
For the birth weight-based PRS, the difference in standardized BMI between the 1 st and 10 th decile is 0.7 at birth ( Figure 6), declines considerably already at six weeks, and then stabilizes.
This residual and lasting association of the birth weight PRS supports an overlap between genetic variants influencing birth weight and BMI development in infancy and childhood.
Furthermore, the top risk score decile captures an elevated and consistent share of obese children, even until seven to eight years, where it performs similarly to scores trained on childhood BMI and obesity ( Figure 6).
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021.
The PRS based on adult BMI displays a shift from three to eight years, where the difference in standardized BMI between the 1 st and 10 th decile rapidly grows from 0.36 to 0.80 ( Figure 6) and variance explained increases from 0.4 to 5.3% ( Figure 6). In the top risk decile, 13% of children were obese at age eight years, corresponding to a 2.6 times higher risk compared to the median at this age, and a 7.  Table 5).

Age stratified PRS improves prediction of childhood BMI
None of the PRS models above capture the BMI development during infancy and the first years of childhood. To evaluate the improvement in predictive performance of PRS models when based on time resolved GWAS-results generated in this study compared to models trained on adult BMI using 1,096 independent children in MoBa. Age-specific modelling vastly improved the variance explained by the PRS during infancy, especially around the adiposity peak at six months, where R 2 increased from 1.5% using results from adult BMI to 6.4% using age-specific results ( Figure 6, Supplementary Table 5). This corresponds to a doubling of the distance between the mean standardized BMI of the top and bottom PRS quintiles, from 0.4 to 0.8 ( Figure 6). We also tested the predictive ability of the 21 variants in the Transient cluster, which peaked between six months and 1.5 years (p < 1 × 10 -5 ), and explained between 3.0 and 4.5% . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. ; of the variance during this age span. Hence, the identified variants in the Transient cluster alone explain a substantial proportion of the variance in BMI around the adiposity peak. Tracking the share of children in the different risk score strata at each time point yielded interweaved trajectories illustrative of the dramatic changes in the genetics of BMI ( Figure 6).

Discussion
The genetics of BMI changes rapidly during infancy and early childhood, which are stages of life characterized by rapid development and drastic changes in the environment, body composition, and metabolism. From the 46 independent loci that we associate with childhood BMI, 30 are not associated with birth weight or adult BMI in large meta-analyses. We propose to group the genetic association with early BMI into four main clusters that align well with the phases of early growth ( Figure 1): the Birth cluster, characterized by loci mainly acting on fetal growth; the Transient and Early Rise clusters that affect BMI development during the key transitions around adiposity peak and rebound; and finally the Late Rise cluster of loci that come into play later in childhood and have persisting influence on BMI into adult life. Most of the variants that we discovered show age-specific transient effects and thus would not be identified from GWASs in other age groups. Conversely, early rising loci display gradually stronger effects after birth lasting into pre-pubertal age. These loci may be particularly important for processes preceding puberty onset, which is supported by the LD score regression profiles that show gradually increasing genetic correlation between BMI at three to eight years of age and early puberty, higher stature at age 10-12 years, and shorter relative length increase after age 12 years. The age-specific association patterns demonstrate a major change in the underlying genetic architecture of childhood BMI pre and post adiposity rebound, where a shift in association trajectories, genetic correlations, PRS prediction power, and heritability occurs. This is further underlined by the large overlap between variants identified in adult BMI and late childhood, but lower overlap with earlier childhood.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. ; In addition to replicating the association with variants in LEP/LEPR 17,18 , we identify two novel variants in LEPR and variants near PCSK1, ADCY3 and MC4R, all known monogenic obesity genes and central to the hypothalamic signalling pathway. All show age-dependent influences during early childhood. Thus, our findings are highly suggestive for energy intake and expenditure being central to controlling BMI during early childhood. As more genes implicated in monogenic obesity are found to harbour common variants associated with BMI, the notion that monogenic and polygenic obesity share underlying etiologies is strengthened.
The bidirectional gut-brain-axis connecting the enteric with the central nervous system plays a vital role in informing the brain of peripheral energy status. However, relatively few genetic variants associated with genes with direct or indirect roles in gastrointestinal functions have been associated with childhood obesity. Thus, identifying a complex signal within GLP1R is particularly intriguing considering its biological prior. First, the discovery of three novel independent associations in GLP1R not picked up in the much bigger meta-analyses on adult BMI is advocating for distinctly different underlying biology driving early BMI development.
Second, it iterates on the importance of hypothalamic signalling and further establishes the importance of common variation in genes related to the gut-brain-axis in development of early childhood BMI. Finally, increased understanding of GLP-1 signalling in early childhood BMI development is particularly important as GLP1R is a proven pharmaceutical target for treating adult obesity 45 and recently showed promising results for treating obesity in adolescence 46 . As the reluctance to medically treat polygenic childhood obesity might at some point be outweighed by the strong evidence of early childhood obesity persisting into adulthood 5 , GLP1R agonist could prove promising. A study of patients treated with the GLP1R agonist liraglutide found alterations in brain activity related to highly desirable food cues and reduced activity in areas of the brain involved in the reward system 47 . Mice injected with liraglutide show increased energy expenditure through stimulation of brown adipocyte thermogenesis acting through hypothalamic processes 48 . GLP1R expression in adipose tissue has also been linked to insulin sensitivity 49 .
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. ; https://doi.org/10.1101/2021.05.04.21256508 doi: medRxiv preprint Child-mother-father trio analyses revealed that the association for two independent loci near KLF14 is driven by the maternal transmitted allele only, suggesting that the paternal allele is silenced. Maternal imprinting for variants in KLF14 has previously been identified for T2D and one of our variants tag the same signal, while the other appears as a novel second imprinting effect acting on KLF14 in early childhood. Additionally, our PRS analysis using a T2D reference study finds persistently low BMI during childhood for children in the highest decile of BMIadjusted T2D PRS, and it is tempting to ascribe these late effects on childhood BMI to mechanisms acting through insulin and glucose metabolism given the numerous studies associating KLF14 with T2D. However, alleles in high LD with the infant BMI and T2D risk increasing alleles were recently associated with lower plasma leptin levels adjusted for BMI 41 , offering a mechanistic hypothesis and yet another putative link between leptin regulation and weight gain in infancy.
Polygenic risk prediction provides opportunities to estimate an individual-level genetic liability and may potentially be used for early identification of children with considerable risk for developing obesity. Here, we show striking differences in BMI between children in the top and bottom deciles of an adult-BMI based PRS concurrent with timing of the adiposity rebound.
Notably, the 0.8 effect difference in standardized BMI between top and bottom decile at age eight in the Norwegian MoBa population is almost identical to what was previously described for British children from ALSPAC 10 , suggesting that this score is transferable between Scandinavian and British children. We also show that the PRS can identify children at considerably higher risk of being obese already from five years of age. As much as 13 % of children in the top decile could be defined as obese at age eight years, corresponding to a seven fold higher risk compared to the bottom risk decile ( Figure 5).
The shift in genetic architecture before age five years renders PRSs based on adult BMI inferior to age-resolved scores during infancy. The testing in our independent sample demonstrates that BMI in the earlier years of life is shaped by a complex interplay and transitions from both age . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. ; https://doi.org/10.1101/2021.05.04.21256508 doi: medRxiv preprint restricted and more long-term genetic influences that have to be taken into consideration when evaluating a child's growth pattern and the potential for targeted interventions.
Our study sample consists of a single cohort, all of Northern European ancestry. How generalizable the results are to other populations remains to be evaluated. However, the larger size of the current MoBa release, the availability of parental data and the homogeneous phenotyping allowed us to perform much more detailed time-resolved analyses than typically possible in a meta-analysis involving studies performed under different protocols and data collection timepoints. The age-dependent association patterns identified here illustrate the importance of early age sampling, and the need for unifying data collection and measurements across cohorts to balance the putative benefit from increased sample size without introducing considerable variance in the phenotyping.
In conclusion, our results provide a fine-grained understanding of the changing genetic landscape regulating BMI from birth to eight years. The identified loci represent clusters of association trajectories that reflect various phases of growth and highlight a fundamental role of pathways involved in appetite regulation and energy metabolism in both normal growth and rare syndromic obesity. These results demonstrate a strong genetic drive ensuring that children gather the energy necessary to sustain healthy growth.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Pre-phasing and imputation
Prior to imputation, insertions and deletions were removed to make the dataset congruent with Haplotype Reference Consortium (HRC) v.1.1 imputation panel using HRC Imputation preparation tool by Will Rayner version 4.2.5 (see URLs). Allele, marker position, and strand orientation were updated to match the reference panel. Pre-phasing was conducted locally using Shapeit v2.790 50 . Imputation was performed at the Sanger Imputation Server (see URLs) with positional Burrows-Wheeler transform 51 and HRC version 1.1 as reference panel.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Outlier detection and missing value imputation
For all children in MoBa (n > 100,000), length and weight curves were inspected for outlying values, missing values were imputed, and artefacts causing the length of kids to decrease were corrected. Outlier detection, missing value imputation, and length decrease correction was conducted as described previously 17 . In brief, length and weight values presenting an extreme peak or an extreme gap were removed. Missing values preceded and followed by at least two measurement points were imputed by interpolating over the growth curve. Length curves were adjusted to prevent peaks to cause length decrease. These steps were conducted iteratively until no data point was changed, as detailed in Supplementary Figure 1. Finally, for all children and all time points presenting both length and weight values, the BMI was computed.

Sample selection
From the total set of growth curves, only the genotyped children were retained. In addition, the following pregnancies were excluded: 1) pregnancies strictly shorter than 37 full gestational . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. ; https://doi.org/10.1101/2021.05.04.21256508 doi: medRxiv preprint weeks (259 days); 2) plural pregnancies; 3) ADHD excess cases. The set of ADHD excess were defined as the additional cases included by the ADHD case/control study. The resulting set of 28,681 children of Norwegian ancestry was used in genetic association and is referred to in the following as the full set of children. From this, we built a set of child-mother-father trios by selecting children who had both parents genotyped, with parents passing genotype QC and

Phenotypes standardization
For the full set of children, at each time point, the BMI was standardized using the generalized additive model for location, scale and shape (GAMLSS) v5.1-7 (gamlss.com) in R v. 3.6.1 (2019-07-05) --"Action of the Toes". Two GAMLSS models based on a Log Normal distribution were fitted separately for boys and girls, using pregnancy duration as covariate, as detailed in Table Phenotypes Standardization. Note that the models of early BMI include a non-linear dependency on pregnancy duration, but the non-linear terms had to be removed after six months to ensure the convergence of GAMLSS. GAMLSS models were fitted solely on children from the set of unrelated trios. The models obtained were used to compute standardized BMI values for the full set of children, i.e. including related children, using the 'centiles.pred' function of GAMLSS. Throughout the text, all effect sizes are expressed relative to the standardized phenotypes. Please note that standardization approaches vary between the present study and the meta analyses on birth weight 24 and adult BMI 13 , notably with regards to scale and covariates, which might impair the comparability of association results. A child was considered . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021.

Genetic association
The association between the genotypes and the standardized phenotypes using linear mixed models was conducted using BOLT-LMM v2.3.4 52  Throughout all analyses, the age at peak association refers to the age of lowest p-value in the association with BMI, and the effect allele refers to the BMI-increasing allele at age at peak association.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Obesity gene enrichment analysis
The gene enrichment analysis around the 46 top hits was conducted using the union of two panels of genes implicated in monogenic and severe early onset obesity: Blueprint Genetics Monogenic Obesity Panel (test code KI1701) (blueprintgenetics.com/tests/panels/endocrinology/monogenic-obesity-panel) consisting of 36 genes, and Genomics England severe early-onset obesity panel v.2.2 consisting of 32 genes (panelapp.genomicsengland.co.uk/panels/130). The union of the two resulted in 42 genes used in analysis. A list containing gene locations for hg19 were obtained from PLINK 1.9 resources (cog-genomics.org/plink/1.9/resources). The list contained 25,303 unique genes used in the analysis. A 500 kb window was used to identify genes in the vicinity of the top hits. The significance for the enrichment of monogenic genes compared to random sampling was estimated using the distribution function of the Hypergeometric distribution via the function phyper from the R package stats.

Comparison with adult BMI in MoBa
Pre-pregnancy BMI values were computed using self-reported height and weight for the parents who were genotyped, passed QC, and of core Norwegian ancestry (27,088 mothers and 26,239 fathers), yielding 26,062 and 22,719 values for mothers and fathers, respectively. As detailed in Table Parent Phenotypes Standardization, BMI values were standardized using GAMLSS for mothers and fathers separately, using their birth year as covariate, as a proxy for age. Like for children, GAMLSS models were fitted solely on parents from the set of unrelated trios, and used to compute standardized values for all parents, i.e. including related parents.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. The association between parent BMI and genotypes was computed for mothers and fathers separately, using BOLT-LMM v2.3.4 52 as done for the children. The covariates used were the genotyping batch, birth year, and ten principal components.

Clustering of association profiles
For each of the 46 independent genome-wide significant variants, alleles were aligned so that the association with standardized BMI is positive at the age of peak association. Effect sizes for all time points were then combined into an association profile for this variant, i.e. a vector = ( ℎ , 6 , . . ., 8 )). Reference effect size over time profiles corresponding to an association at birth waning afterwards, and an increasing association after one year of age towards adulthood were built using equations 1 and 2, respectively. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. ; https://doi.org/10.1101/2021.05.04.21256508 doi: medRxiv preprint The resulting projection is shown in Figure 1B Each variant was plotted on Figure 1B using the sum of the absolute values of the effect size over time as radial coordinate, and the relative association with x1 and x2 to define the angular coordinate, as described in equations 4 and 5, respectively.
Where represents the radial coordinate, the angular coordinate, 1 and 2 the association between the genetic association profile and the reference profiles x1 and x2, respectively, and 0 a constant. Each association profile was plotted after normalization to the association level at age at peak in Figure 1C using the angular coordinate as baseline on the ordinate.
A cardinal or intercardinal cluster was defined for each of the eight reference profiles corresponding to the cardinal and intercardinal directions in Figure 1B. Every cardinal and intercardinal cluster was assigned a first element chosen to be the variant with the angular coordinate closest to the direction (i.e. most correlated to that profile). The other variants were then assigned to a cluster based on their angular nearest neighbour, yielding the clustering displayed by the dendrogram of Figure 1D. Finally, as illustrated in Figure 1E . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Mapping to pathways
The lead SNP of the 46 independent loci were submitted to the Ensembl Variant Effect Predictor (VEP) 54 . All proteins coded by genes reported with a consequence other than downstream_gene_variant, upstream_gene_variant, or intergenic_variant were retained as potentially affected by a given variant. If no such gene was found, the protein coded by the closest gene within 500 kb was retained. Proteins were matched to Reactome 55 using PathwayMatcher 56 . Then, for each of the four main clusters, we built the smallest set of top-level pathways that explained the protein set returned by the VEP analysis, counted the number of variants in this cluster affecting a protein in one of these top-level pathways ( Figure 1F).

Mapping to other traits
For each SNP, other associated traits were extracted using PhenoScanner 57,58 . PhenoScanner was queried using EUR and an R 2 threshold of 0.8 for proxies and 5e-8 as p-value threshold.
Synonymous terms were grouped, and, for each of the four main clusters, the number of variants mapping to a given trait relative to the number of variants in the cluster was plotted in Figure 1G.

Comparison with birth weight and adult BMI
Summary statistics on birth weight and adult BMI were obtained from Warrington et al. 24 and Yengo et al. 13 , respectively. Variants were matched by rsid. For the variants with no match, proxies were sought using LDproxy (ldlink.nci.nih.gov) using a window of 500 kb, CEU as reference population, and an R 2 threshold of 0.2, and alleles were aligned. From the 46 top hits, variants were considered independent of the hits in Warrington et al. 24 and Yengo et al. 13 if the matched SNP had a p-value higher than 5 × 10 -8 . In addition, for each of the 46 top hits, the variant in Warrington et al. 24 and Yengo et al. 13 with the lowest p-value with an LD R 2 value . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. higher or equal to 0.2 was extracted. Summary statistics for all variants in the three data sets are available in Supplementary Table 6.
Subsequently, for all variants associated with own birth weight in Warrington et al. 24 , and all variants associated with adult BMI in Yengo et al. 13 , the association profile in MoBa was extracted and the angular coordinate of Figure 1B was computed by projecting onto the reference profiles as before. The angular density of each study was subsequently computed using sliding windows over , normalized to the number of variants in each study, and plotted in Figure 1H.

Child-mother-father trio and haplotype analysis
At all time points, for all 46 independent genome-wide significant variants, the association with the children genome was conditioned on the genomes of the parents in the set of unrelated trios using the linear model described in equation 6. bmi ~ child + mother + father + 1 (6) Where bmi refers to the standardized BMI of the child at a given time point, and child, mother, and father the number of tested alleles for this variant in the child, mother, and father genomes, respectively. Note that the number of alleles were taken from genotype hard calls and not genotyping probabilities.
Taking advantage of the phasing of the children genotypes, we could infer the parent-of-origin of the genotyped alleles as done by Chen et al. 59 . This results in an alternative model that allows studying the association per haplotype in the set of unrelated trios, as detailed in equation 7.
bmi ~ MnT + MT + FnT + FT + 1 (7) Where MnT and MT refer to the number of tested alleles non-transmitted and transmitted by the mother to the child, respectively. Similarly, FnT and FT refer to the number of tested alleles nontransmitted and transmitted by the father to the child, respectively.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. ; For a given variant, the share of Mendelian errors in the set of unrelated trios was estimated using trios presenting a homozygous parent. Then, a Mendelian error results in a value of -1 or +2 in the non-transmitted allele count. The share of Mendelian error was estimated by comparing the number of such erroneous genotypes to the number of trios with a homozygous parent expected from the tested allele frequency. When the estimated share of Mendelian errors was over 50%, the alleles of the children were swapped.
For the chromosome X, no filtering was done based on ploidy, when only one chromosome was found the allele was assumed to be inherited from the mother. Note that the chromosome X was not phased, yielding a high share of Mendelian errors, approximately 50%, indicative of a random assignment of children alleles. Haplotype analysis was therefore not possible for the variant on chromosome X, while trio analysis is unaffected by this.
For both models, the same covariates were used as for the genetic association analysis using BOLT-LMM, i.e. genotyping batch, sex, gestational age, and ten principal components, and both phenotypes and genotypes were adjusted for covariates in the same way as BOLT-LMM does.

LD score regression
LD score regression was performed with LD Hub v.1.9.0 using LDSC v.1.0.027 using all markers remaining after filtering on the provided SNP-list as recommended by the LD Hub authors. A total of 1,215,001 markers remained after filtering. All available phenotypes were selected for correlation analyses. Results for all variables along with heritability and QC reports are available in Supplementary Table 7. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Polygenic risk scores (PRS)
PRSs were calculated using PRSice-2 v. 2.3.0 (prsice.info). For scores based on study results from previous meta-analyses, the results were obtained from EGG (egg-consortium.org) for birth weight, childhood BMI and childhood obesity, GIANT for adult BMI (portals.broadinstitute.org/collaboration/giant), and DIAGRAM (diagram-consortium.org) for type 2 diabetes (T2D). PRSs were calculated separately for all time points per phenotype using ten principal components, sex, gestational age, and genotyping batch as covariates. The target dataset included all markers available after imputation. From the full set of children, one in each pair of samples with PI_HAT > 0.1 was removed at random, leaving 25,113 samples for the PRS analyses. Time-resolved scores used age-specific summary results from the primary analyses as base with the independent set of 1,062 samples from MoBa as target. Here, ten principal components, sex, and gestational age were used as covariates. Defaults were used for all other parameters.

Figures
All figures in the manuscript were generated in R version 3.6.1 (2019-07-05) --"Action of the Toes" (R-project.org). In addition to the base packages, the following packages were used: tidyr . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. Health approved the study protocol. The establishment of MoBa and initial data collection was based on a license from the Norwegian Data Protection Agency and approval from The Regional Committee for Medical Research Ethics. The MoBa cohort is currently regulated by the Norwegian Health Registry Act. The study was approved by The Regional Committee for Medical Research Ethics (#2012/67).
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.   24,13 , and this study, respectively, and processed as in 1B. See methods for details.  Figure   . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. 1C. The estimated effect size for association with birth weight Warrington et al. 24 (column 1), BMI during early growth (this study, column [2][3][4][5][6][7][8][9][10][11][12], and adult BMI (mothers and fathers in this study (column [13][14] and Yengo et al. 13 24 and adult BMI 13 plotted against the estimated effect size at the age of peak association during early growth (this study). Dashed line indicates equal effect sizes in both studies. The colour represents the age of peak association, as defined as the age with lowest pvalue. Variants are grouped according to their profile cluster as defined in Figure 1E. Error bars represent one standard error estimate on each side of the effect size estimate. Note that for the sake of readability, GCK at birth is plotted in an insert with a different scale.

(last column)) is displayed in each cell. The cell colour represents the estimated effect size and the text colour represents the association p-value. Empty cells indicate that no proxy could be found for the given variant in the given study. See methods for details. (B) Scatter plot of the estimated effect size for association with birth weight
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. ;  is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. ; https://doi.org/10.1101/2021.05.04.21256508 doi: medRxiv preprint SNP, regional plots are displayed at the age at peak association, highlighting the lead SNPs with red diamonds and SNPs coloured according to the LD R 2 , with the exons of the gene according to Ensembl at the bottom and recombination rates in blue.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. ; Figure 5 -Trio-and haplotype-resolved association profiles. (A) Effect size estimate for the conditional allelic association of the child, mother, and father with child standardized BMI for each of the 46 variants at age at peak association. The age at peak association is annotated to the right. Error bars represent 95% confidence intervals. Note that in this analysis child, mother, and father are conditioned on each other, see methods for details. (B) Association profile for birth weight loci known to present both maternal and fetal effect on birth weight. Effect size estimates of the association with child standardized BMI are represented for the child and the mother from birth to two years of age (fathers were included in the model, but data are not displayed for the sake of readability). p-values represent the significance of the association with the number of effect alleles in the child, mother, and father in a joint model using unrelated trios, and thus differ from the p-values of the GWAS. Ribbons represent one standard error estimate on each side of the effect size estimate. (C) Association profiles with child standardized BMI . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021. from birth to eight years of age for two variants upstream of KLF14 in a model combining the child, mother, and father alleles into four haplotypes: (MnT) allele non-transmitted from the mother to the child; (MT) allele transmitted from the mother to the child; (FT) allele transmitted from the father to the child; and (FnT) allele non-transmitted from the father to the child. Note that the FnT allele is not represented here for the sake of readability but results are available in Supplementary Table 4  CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 7, 2021 Figure 1D in inserts. All error bars/ribbons represent one standard error estimate on each side of the mean estimate.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 7, 2021. ; https://doi.org/10.1101/2021.05.04.21256508 doi: medRxiv preprint Tables . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 7, 2021. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 7, 2021.