A genome-wide association study of total child psychiatric problems scores

Substantial genetic correlations have been reported across psychiatric disorders and numerous cross-disorder genetic variants have been detected. To identify the genetic variants underlying general psychopathology in childhood, we performed a genome-wide association study using a total psychiatric problem score. We analyzed 6,844,199 common SNPs in 38,418 school-aged children from 20 population-based cohorts participating in the EAGLE consortium. The SNP heritability of total psychiatric problems was 5.4% (SE = 0.01) and two loci reached genome-wide significance: rs10767094 and rs202005905. We also observed an association of SBF2, a gene associated with neuroticism in previous GWAS, with total psychiatric problems. The genetic effects underlying the total score were shared with common psychiatric disorders only (attention-deficit/hyperactivity disorder, anxiety, depression, insomnia) (rG > 0.49), but not with autism or the less common adult disorders (schizophrenia, bipolar disorder, or eating disorders) (rG < 0.01). Importantly, the total psychiatric problem score also showed at least a moderate genetic correlation with intelligence, educational attainment, wellbeing, smoking, and body fat (rG > 0.29). The results suggest that many common genetic variants are associated with childhood psychiatric symptoms and related phenotypes in general instead of with specific symptoms. Further research is needed to establish causality and pleiotropic mechanisms between related traits.


Introduction
Psychiatric disorders are moderately heritable: on average, about 30-50% of the variability in symptoms can be explained by genetic differences between individuals. [1] The joint effect of common single nucleotide polymorphisms (SNP heritability) explains 5% to 30% of the variance in psychiatric disorders in adults. [2] Similar levels have been reported for behavioral and emotional symptoms in children, although there is large variability depending on child age and informant. [3,4] A focus on childhood problems is particularly important, as many adult disorders can be traced back to problems in childhood. [5] Recent family and molecular genetic studies demonstrated that many of the genetic effects underlying psychiatric disorders are not unique to particular diagnoses, but rather shared across several psychiatric diagnoses and symptoms. [2,6-10] This phenomenon is known as cross-phenotype association and suggests pleiotropy, i.e. the influence of a genetic variant on multiple traits, [11] and may be an explanation for the extensive co-occurrence of mental disorders. [12] Several lines of evidence support this notion. First, the SNP-based genetic correlations between disorders from different domains, such as major depression, attention-deficit/hyperactivity disorder (ADHD), bipolar disorder and schizophrenia, are moderate to high, [2] averaging 0.41 [9]. Second, measures of global psychopathology in children showed a common SNP heritability between 16% and 38%. [8,13] Third, a genome-wide association meta-analysis (GWAS) of eight psychiatric disorders (ADHD, anorexia, autism, bipolar disorder, depression, obsessive-compulsive disorder, schizophrenia and Tourette's) identified 23 loci associated with at least four of these disorders. [14] GWAS-derived polygenic risk scores (PRS) for single disorders are good predictors of general psychopathology. For instance, a PRS for ADHD was more strongly associated with a general psychopathology factor than with specific hyperactivity or attention problems adjusted for general psychopathology. [15] In another study, a composite PRS based on eight GWAS was associated with general psychopathology in childhood. [16] These cross-phenotype associations present a challenge in interpreting GWAS results that typically target a single disorder, raising the question of whether a multi-disorder approach would be more informative.
Previous GWAS of childhood disorders, such as autism spectrum disorders, ADHD, aggression and internalizing disorders, [4,17-19] have provided insights into the genetic architecture of child psychiatric problems and into the genetic correlations between childhood psychiatric problems. However, with the notable exceptions of a large recent ADHD study [20] and a GWAS on autism spectrum disorder [17], these studies mostly failed to identify individual genome-wide significant loci. Besides increasing the sample size, some researchers propose the inclusion of related phenotypes in analyses to increase power. [21,22] Genetic loci with pleiotropic effects may be missed in a GWAS of single psychiatric disorders. If a variant only modestly increases the risk of symptoms from different domains, any association with a specific disorder may be too weak to be detected. A focus on global psychopathology increases the power to detect unspecific genetic loci, which are associated with global psychiatric vulnerabilities. A previous GWAS [14] examined multiple disorders simultaneously, but analyses of multiple dimensional measures of psychiatric problems in childhood are lacking. This approach is arguably particularly promising in childhood given the less clearly expressed symptoms and the low homotypic but high heterotypic stability of problems, [23] i.e. the changing of symptoms from one domain to another.
Our aim was to identify genetic loci associated with a total psychiatric problem score representing a variety of psychiatric problems including internalizing, externalizing, attention, neurodevelopmental and other psychiatric problems. To identify these genetic variants, we performed a GWAS meta-analysis within the EArly Genetics and Lifecourse Epidemiology (EAGLE) consortium (https://www.eagle-consortium.org/). Finally, we estimated genetic correlations of the total psychiatric problem score with various single child and adult psychiatric, psychological, neurological and lifestyle or educational characteristics.

medRxiv preprint doi: https://doi.org/10.1101/2020.06.04.20121061; this version posted June 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Participants
Cohorts from the EAGLE consortium with parent-rated measures of psychiatric symptoms in the age range 5-16 years were invited to participate in the project. Twenty cohorts from Europe, the US and Australia contributed data to this meta-analysis. See Table 1 and supplementary materials for cohort descriptions. Parents provided informed consent and study protocols were approved by local ethics committees. We restricted the analysis to children of European ancestry to avoid population stratification bias. In total, data from 38,418 participants with a mean age of 9.9 years (SD=2.02) were meta-analyzed. This study was originally planned with a discovery-replication design. However, the obtained sample size was not sufficiently large to split the sample, and we opted for maximizing power in discovery analyses.

Outcome
Psychiatric problems were assessed with parent-rated questionnaires at the assessment wave closest to age 10 years. All items of a broad psychiatric questionnaire were summed into a single total psychiatric sum score. In all cohorts internalizing, externalizing and attention problems were assessed; in some questionnaires items on sleep, thought, eating problems, and pervasive developmental disorders were included in the total problem score (Table 1). Instruments included the Child Behavior Checklist (CBCL) [24], Strengths and Difficulties Questionnaire (SDQ) [25], parental version of the Multidimensional Peer Nomination Inventory (MPNI) [26], Rutter Children's Behaviour Questionnaire [27], the Autism-Tics, AD/HD and other Comorbidities inventory (A-TAC) [28], and items derived from the Health Examination Survey [29].

We applied a log(x + 1) transformation to the sum scores to avoid bias due to non-normal residuals and influential observations. Because different scales were used, the log-transformed scores were converted to z-scores within cohorts to make units comparable across cohorts.
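The transformation described above can be illustrated with a minimal sketch. The function name, the example scores and the use of the sample standard deviation (ddof=1) are assumptions made for illustration; the text does not specify how each cohort implemented this step.

```python
import numpy as np

def standardize_within_cohort(scores, cohort_ids):
    """Return log(score + 1), z-scored separately within each cohort."""
    scores = np.asarray(scores, dtype=float)
    cohort_ids = np.asarray(cohort_ids)
    logged = np.log1p(scores)  # log(score + 1); defined for a sum score of 0
    z = np.empty_like(logged)
    for cohort in np.unique(cohort_ids):
        mask = cohort_ids == cohort
        values = logged[mask]
        # Sample SD (ddof=1) is an assumption; the text does not state it.
        z[mask] = (values - values.mean()) / values.std(ddof=1)
    return z

# Hypothetical sum scores from two cohorts that used different scales
scores = [0, 3, 10, 25, 1, 2, 4, 8]
cohorts = ["A", "A", "A", "A", "B", "B", "B", "B"]
z = standardize_within_cohort(scores, cohorts)
```

After this step each cohort has mean 0 and standard deviation 1 on the transformed scale, so effect sizes are expressed in SD units regardless of the instrument used.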

Genotyping and QC
Genotyping was performed using genome-wide arrays. Cohort-specific pre-imputation quality control (QC) was performed using established protocols. In all cohorts, SNPs were imputed to the 1000 Genomes Phase 1 or Phase 3 reference panel. [30] Each cohort performed a GWAS and summary results were collected for meta-analysis. We omitted the X-chromosome from further analysis as most cohorts had no information available on X-linked SNPs. Pre-meta-analysis QC was performed with EasyQC and QCGWAS. [31][32][33] The QC steps are summarized in Figure S1. After meta-analysis, we excluded SNPs with low minor allele frequency (MAF < 5%), low sample size (n < 5,000), or with data from a small number of cohorts (<5). Finally, we checked the pooled results for spurious inflation by examining QQ-plots of the p-value distribution and by examining the LD score regression intercept (see statistical analysis). Full genetic methods and quality control per cohort can be found in Table S1 and Table S2.
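The post-meta-analysis exclusion criteria listed above amount to a simple filter. The sketch below is illustrative only: the record layout, function name and SNP labels are hypothetical, and only the three thresholds are taken from the text.

```python
from dataclasses import dataclass

@dataclass
class PooledSnp:
    label: str        # hypothetical identifier, not a real rsID
    maf: float        # minor allele frequency (0 to 0.5)
    n: int            # pooled sample size
    n_cohorts: int    # number of cohorts contributing data

def passes_post_meta_qc(snp, min_maf=0.05, min_n=5000, min_cohorts=5):
    """Keep a SNP unless MAF < 5%, n < 5,000 or fewer than 5 cohorts."""
    return (snp.maf >= min_maf
            and snp.n >= min_n
            and snp.n_cohorts >= min_cohorts)

snps = [
    PooledSnp("snp_a", maf=0.48, n=8216, n_cohorts=8),    # kept
    PooledSnp("snp_b", maf=0.03, n=30000, n_cohorts=18),  # MAF too low
    PooledSnp("snp_c", maf=0.20, n=4200, n_cohorts=6),    # n too low
    PooledSnp("snp_d", maf=0.20, n=9000, n_cohorts=4),    # too few cohorts
]
kept = [s.label for s in snps if passes_post_meta_qc(s)]
```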

Single SNP associations and meta-analysis
The z-scores of the total psychiatric problems scores were related to the SNP dosages in a linear model. Covariates included gender, age at assessment and principal components of ancestry. The number of dimensions (1-10) was specified by each cohort. CATSS and TCHAD additionally used a random effect to account for familial relatedness.
FinnTwin12 and NTR applied a mixed model with two random effects to control for population stratification and relatedness. We pooled the results from the individual cohorts using an inverse-variance weighted fixed-effects meta-analysis. R 3.4.3 was used for QC, data preparation and analysis of results. [34] Meta-soft 2.0.1 was used for the meta-analysis of single SNP associations. [35] The individual cohort results after quality control were examined and meta-analyzed independently by the first and second author with consistent results. Genome-wide significance was set at p<5E-08.
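To make the pooling step concrete, the following is a minimal sketch of an inverse-variance weighted fixed-effects meta-analysis for a single SNP, together with the I² heterogeneity statistic reported in the results. It is not the software used in the study; the function name and the per-cohort estimates are hypothetical.

```python
import math

def fixed_effects_meta(betas, ses):
    """Pool per-cohort effect estimates with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    # Cochran's Q and I^2 (percentage of variation due to heterogeneity)
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return beta, se, p, i2

# Hypothetical per-cohort estimates (in SD units) for one SNP
beta, se, p, i2 = fixed_effects_meta([0.10, 0.06, 0.08], [0.02, 0.02, 0.02])
```

With equal standard errors the pooled estimate is simply the mean of the cohort estimates, and the pooled standard error shrinks with the square root of the number of cohorts.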
We also used the FUMA web tool [36] to explore potential functional implications of any identified variants. We reviewed positional mapping, eQTL analyses and chromatin interactions with all available databases (date: 2019-06-30). We also performed a lookup in the mQTL [37] database, to check for potential influences on gene expression via DNA methylation.

Gene-based and expression analysis
First, we performed gene-based tests using MAGMA [38] in FUMA. MAGMA estimates the joint effect of all SNPs within a gene, while accounting for the LD structure and gene size. We tested 18,168 protein-coding genes and thus the p-value significance threshold was set at 3e-6 based on Bonferroni correction. Second, we tested whether the results from the gene-based tests were related to gene expression in several tissues. Specifically, we used MAGMA to test whether the strength of association between genes and the total psychiatric problem score was related to the mean gene expression level in a specific tissue, while considering average expression levels. Given that we expected gene variants to act via brain pathways, we tested expression in 13 brain regions (Table S3). However, as gene effects may impact the brain indirectly via other tissues, we also investigated gene expression levels on an organ level (Table S4). Gene expression levels were obtained from the GTEx 7 database. [39] Third, we further examined whether the predicted gene expression of selected genes was related to total psychiatric problems. We selected genes that were (functionally) annotated to genome-wide hits, or that were genome-wide significant according to gene-based tests. To correlate gene expression with total psychiatric problems, we used a transcriptome-wide association study (TWAS) approach. [40] In short, gene expression in a tissue is imputed based on expression information from the GTEx 7 database for a specific tissue and then correlated with a phenotype, as inferred from GWAS summary statistics. We chose to examine expression in the basal ganglia post hoc, as genes most strongly associated with total psychiatric problems tended to be expressed in this brain region. We also performed a lookup on TWAS hub to examine whether expression of a gene identified in this study has previously been associated with other phenotypes. [41,42]
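As a worked check of the gene-based significance threshold, the Bonferroni correction divides the family-wise error rate by the number of tests. The error rate of 0.05 is an assumption here, since the text reports only the resulting threshold; the function name is illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

# 18,168 protein-coding genes as stated in the methods; alpha = 0.05 is assumed
threshold = bonferroni_threshold(0.05, 18168)
print(f"{threshold:.2e}")  # 2.75e-06, which rounds to the reported 3e-6
```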

SNP heritability and genetic correlations
We estimated the SNP heritability of total psychiatric problem scores with LD score regression. [43] We used the online tool LD Hub [44] to estimate common SNP heritability and genetic correlations with various psychiatric, psychological, neurological and lifestyle or educational characteristics. To compute the genetic correlations we selected published GWAS summary statistics available on LD Hub, except genetic correlations with anxiety symptoms [45], which were computed locally with ldsc 1.0.0.

Spurious inflation and SNP Heritability
We tested 6,844,199 SNPs after quality control. The QQ-plot (Figure 1) showed some inflation; however, the LD score regression intercept was close to 1 (β0 = 1.01, SE = 0.01), suggesting that the inflation was due to a true signal rather than spurious associations.
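For readers unfamiliar with QQ-plots, the sketch below shows how the plotted coordinates can be derived from a vector of p-values. The plotting position (i - 0.5)/n is one common convention and is an assumption here; the function name and the example p-values are hypothetical.

```python
import numpy as np

def qq_points(p_values):
    """Return (expected, observed) -log10 p-values for a QQ-plot."""
    p = np.sort(np.asarray(p_values, dtype=float))  # smallest p first
    n = p.size
    # Expected quantiles of a uniform distribution under the null
    expected = -np.log10((np.arange(1, n + 1) - 0.5) / n)
    observed = -np.log10(p)
    return expected, observed

# Hypothetical p-values; points above the diagonal indicate inflation
expected, observed = qq_points([0.5, 0.001, 0.2, 0.9, 0.04])
```

Plotting observed against expected gives the QQ-plot; points on the diagonal are compatible with chance, whereas upward deviations indicate more significant p-values than expected.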

SNP based tests
Two loci on chromosome 11 were genome-wide significant (Figure 2). One locus is located around lead SNP rs10767094, which showed an increase of 0.08 SD in total psychiatric problems per A allele (SE=0.01, p=3E-09, n=8,216) (Figure S2). The A allele is very common, with an average frequency of 48% across the cohorts, but the SNP's average imputation quality was a moderate 50% (Info/R²). Information on this locus was only available in 27% of participants (8 cohorts). The SNP showed a moderate amount of effect heterogeneity (I²=47.6%). An insertion/deletion variant (InDel), also on chromosome 11, was genome-wide significant as well. A deletion of the A allele at rs202005905 was associated with an increase of 0.08 SD in total psychiatric problems (SE=0.01, p=4E-08, n=15,886, Figure S3). Deletion prevalence was on average 16%, but again the imputation quality was modest at 52%; information was available in 41% of participants (9 cohorts) and the genetic variant showed moderate effect heterogeneity (I²=59.6%).
The SNP rs10767094 lies in the intron of an uncharacterized gene and rs202005905 lies in an intergenic region with no nearby genes. A FUMA eQTL and chromatin interaction analysis did not reveal any interactions with genes. The mQTL database did not list any associations with DNA methylation.
The third top locus did not reach genome-wide significance, but is of interest because it lies in a gene previously implicated in neuroticism [46,47] and is very close to genome-wide significance. The SNP rs72854494 lies within the gene SBF2. The T allele was associated with 0.05 SD lower total psychiatric problems (SE=0.01, p=5E-08, n=38,330) (Figure S4). This association showed no heterogeneity (I²=0.0%) among the cohorts. The T allele had an average frequency of 14% across cohorts, with a very good imputation quality of 96%. FUMA eQTL and chromatin interaction analysis, as well as a lookup in mQTL DB, did not reveal any further information on functional association.
Results for all SNPs with genome-wide suggestive p-values (p<5E-06) can be found in Table 2.

Gene-based test
Next we tested the association of 18,290 protein-coding genes with the child total psychiatric problem score. None of the genes reached genome-wide significance (Table S3, Figure S5 and Figure S6). We also looked up the association of SBF2 post hoc. The aggregate of 1,508 SNPs in SBF2 showed a nominal significance of p=0.0004 (n=35,736).
The full summary results can be found as supplementary data.

Gene expression
We performed a MAGMA tissue expression analysis in 13 specific brain tissues (Table S4). Genes more strongly associated with total psychiatric problems tended to be expressed particularly in four subcortical structures: caudate, putamen, anterior cingulate cortex and amygdala. However, these associations were not significant after correction for multiple testing. In addition, we analyzed tissue expression for 30 tissues on an organ level (Table S5). None of the organs showed a statistically significant association; however, expression in the brain showed the strongest association (p=0.06).

The top two genome-wide significant loci were not linked to a characterized gene; thus we decided to perform a TWAS analysis only for SBF2. We found that higher predicted levels of SBF2 in the basal ganglia were related to higher scores of total psychiatric symptoms (Z=+2.33, p=0.02) based on the best linear unbiased predictions (BLUP) of a random variable representing 489 SNPs. A lookup in the TWAS Hub database revealed that predicted levels of SBF2 gene products are most strongly associated with the following phenotypes: neuroticism, body fat measures, red blood cell count, nervous feelings and worrying (http://twas-hub.org/genes/SBF2/).
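The relation between the reported TWAS Z statistic and its p-value can be checked with a short sketch, assuming a two-sided test against a standard normal distribution (the text does not state the sidedness explicitly). The function name is illustrative.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

p_sbf2 = two_sided_p(2.33)  # roughly 0.0198, consistent with the reported p=0.02
```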

Genetic correlation
Next we quantified the extent to which the genetic associations of child psychiatric problems scores were shared with other phenotypes. After adjustment for false discovery rate, insomnia, depressive symptoms, neuroticism, cigarettes smoked per day, body fat, body mass index, number of children, and age of smoking initiation all showed positive genetic correlations between 0.29 and 0.60 with the total psychiatric problem score (Table 3) based on the results of independent GWAS in adults. The highest correlation of total psychiatric problems was with ADHD, but this association did not survive multiple testing correction (rG=0.86, SE=0.39, p=0.03, q=0.06). Subjective wellbeing, childhood IQ, college completion, years of schooling, intelligence and age of smoking initiation showed significant negative correlations with the total psychiatric problem score, ranging from -0.66 to -0.42. Of the psychiatric phenotypes tested, the less common psychiatric disorders like schizophrenia, bipolar disorder, autism spectrum disorder, and anorexia were not genetically correlated with the total psychiatric problem score (rG < 0.01).
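The text does not specify which false discovery rate procedure produced the q-values. Assuming the widely used Benjamini-Hochberg procedure, the adjustment can be sketched as follows; the function name and input p-values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by m / rank
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q

# Hypothetical p-values for four genetic correlations
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A correlation is then declared significant when its q-value falls below the chosen false discovery rate, which explains how a nominally significant p-value (such as p=0.03) can correspond to a non-significant q-value.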

Discussion
The current study reports the first GWAS examining global psychopathology in children. Two genetic loci were genome-wide significant in the total sample. Additionally, we found support for the involvement of gene SBF2 in the development of psychopathology. The genetic effects underlying global psychopathology were shared with common psychiatric disorders (ADHD, anxiety, depression, insomnia), but not with less common and on average more severe ones (schizophrenia, bipolar disorder, autism, eating disorders).
The two genome-wide significant variants are one SNP (rs10767094) and one InDel (rs202005905). To the best of our knowledge these variants have not been associated with psychiatric traits before. It is unclear how exactly these variants or tagged causal variants may affect general psychopathology, as functional annotation for these loci is sparse. The modest imputation quality possibly affected study results, as both variants failed quality control in most cohorts. Measurement error of the genotypes could explain the relatively high heterogeneity of the estimates. An important next step would therefore be to replicate these SNPs using direct genotyping or denser arrays.
While just short of genome-wide significance, the evidence for an involvement of SBF2 with the lead SNP rs72854494 in total psychiatric problems is more convincing. This locus has been implicated in neuroticism based on two GWAS. In a GWAS of neuroticism, [46] rs1557341, located in SBF2, showed genome-wide significance. In a second, larger independent GWAS of 449,484 participants, SBF2 showed a genome-wide significant effect for both neuroticism and worry in gene-based tests. [47] Furthermore, according to TWAS hub, the predicted gene products of SBF2 correlate with neuroticism based on several GWAS. Neuroticism describes a disposition to experience negative emotions and a higher stress reactivity. It robustly and substantially associates with general psychopathology in children [8,48], adolescents [49] and adults [50] (between r=0.13 and r=0.81). A twin study suggested that this correlation arises partly due to shared genetic causes [51] and in this GWAS the genetic correlation between total psychiatric problems and neuroticism was substantial as well, similar to the phenotypic association (rG=0.41).
These results suggest that SBF2 pleiotropically affects neuroticism and psychopathology, but the mechanisms need to be explored further. Neuroticism has been hypothesized to contribute strongly to general psychopathology [52]; thus it may mediate the effect of genetic variants on total psychiatric problems, but both phenotypes may also be independently affected. With regard to biology, human and mouse studies point towards abnormal myelination as one of the consequences of SBF2 alterations. [53,54] We recently reported an association between lower global white matter integrity and higher levels of general psychopathology in school-aged children. [55] Thus, one may speculate that SBF2 affects psychiatric problems via white matter development.
We additionally tested whether genetic variants associated with total psychiatric problems were associated with gene expression in the brain. Association with gene expression in the limbic system of the brain showed the most support, but did not survive multiple testing correction. The findings are thus compatible with the possibility of a chance finding, but strong theoretical support for a major role of the limbic system exists.
The limbic system includes evolutionarily conserved regions responsible for emotion regulation and motivation [56], which were previously implicated in affective disorders, ADHD and OCD, [57,58] and are a potential intervention target [59].
In this study we observed 5% SNP heritability, which is similar to the LD score estimated SNP heritability of continuously measured ADHD [4], depression [46] and anxiety symptoms [45] in population-based cohorts. The total psychiatric problem scores were based on various instruments, which all included items for common psychiatric internalizing, attention, and externalizing symptoms. Therefore, it is not surprising that common psychiatric symptoms and disorders such as ADHD and depression shared 36% or more of the genetic variation with the total psychiatric problem score. The extent to which the questionnaires used in this study covered other less common problems, such as psychotic, bipolar or autistic symptoms, varied greatly by instrument. Furthermore, age of onset for schizophrenia and bipolar disorder is typically in late adolescence and early adulthood. [60][61][62] For autism spectrum disorder, the age of onset is early, but the prevalence in the cohorts was low. Thus the total psychiatric problem score covered broad symptomatology but was not representative of severe psychiatric disorders with lower prevalence rates or emergence at later ages. The differential genetic correlations with common and relatively rare disorders suggest a continuum of genetic effects varying from very specific variants, to variants which underlie either common or less common disorders, to variants which underlie most psychiatric problems. The presence of these universal variants is supported by genetic correlations between common and less common disorders, such as ADHD and schizophrenia. [2,63] The latter set of variants may be better detected with measures of global psychopathology in older children, when thought disorders such as schizophrenia and bipolar disorder occur.
A limitation of this study is the large heterogeneity in measures of psychopathology.
On the one hand, this variety of methods is an advantage, since any associations detected are expected to be more generalizable. On the other hand, it might limit the detectability of less robustly associated variants. The lack of power to identify more loci in this study probably stems mostly from insufficient sample size, but also from measurement error and the low number of participating children with high levels of psychiatric problems, as all cohorts were population-based.

Finally, as in any other GWAS, it is difficult to determine the extent to which the associations found can be interpreted causally. Due to linkage disequilibrium it is unclear whether the two top variants have a causal influence on psychopathology or are markers for other causal variants. The same is true for the association of SBF2 with total psychiatric problems. However, the association of predicted SBF2 gene products with neuroticism and psychiatric problems, as well as the influence on myelination in an experimental mouse model, suggest a causal role.
In conclusion, this GWAS of total psychiatric problem scores suggests that common genetic variants exist that are associated simultaneously with internalizing, externalizing, attention and other psychiatric problems in childhood. The pleiotropy was not restricted to psychiatric phenotypes, but also included intelligence, educational attainment, wellbeing, smoking, body fat and number of children in adulthood. Interestingly, we did not find shared genetic effects with autism, schizophrenia and bipolar disorder. Two novel loci were genome-wide significant, although the low sample size and modest imputation quality necessitate replication before firm conclusions can be drawn about whether they influence total psychiatric problems. Furthermore, we found evidence that the gene SBF2, which was previously known to be associated with neuroticism, is also implicated in general psychopathology in children. Our results merit further investigation for confirmation and exploration of potential causal mechanisms.

ALSPAC
We are extremely grateful to all the families who took part in this study, the midwives for their help in recruiting them, and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists and nurses. The UK Medical Research Council and Wellcome (Grant ref: 102215/2/13/2) and the University of Bristol provide core support for ALSPAC. This publication is the work of the authors and they will serve as guarantors for the contents of this paper. GWAS data was generated by Sample Logistics and Genotyping Facilities at Wellcome Sanger Institute and LabCorp (Laboratory Corporation of America) using support from 23andMe.

BREATHE
The research leading to these results has received funding from the European Research Council under the ERC Grant Agreement number 268479 - the BREATHE project. We are [...]

[...] participants and their families, and to the Raine Study research staff for cohort coordination and data collection. The authors gratefully acknowledge the NH&MRC for their long term funding to the study over the last 25 years and also the following institutes for providing funding for Core Management of the [...]

[...] MULTIEPIGEN project); and The Wihuri Foundation. We thank the teams that collected data at all measurement time points; the persons who participated as both children and adults in these longitudinal studies; and biostatisticians Irina Lisinen, Johanna Ikonen, Noora Kartiosuo, Ville Aalto, and Jarno Kankaanranta for data management and statistical advice.

Figure 1: Quantile-quantile plot of observed -log10 p values vs expected -log10 p values assuming chance findings in single SNP analysis. The diagonal line indicates a p value distribution compatible with chance findings. Upward deviations indicate p values more significant than expected.

Table S1: Genes with genome-wide suggestive (p<3e-4) results