Abstract
Objective The first two years of life are a critical period of rapid brain development. Since early neurodevelopment is influenced by prenatal risk factors and genetics, neonatal biomarkers can potentially provide the opportunity to detect early signs of neurodevelopmental delay. We analyzed associations between DNA methylation levels from cord blood, neonatal MRI imaging data, and neurodevelopmental delay at two years of age.
Methods Neurodevelopment was assessed in 161 children from the South African Drakenstein Child Health Study at two years of age using the Bayley Scales of Infant and Toddler Development. We performed an epigenome-wide association study of neurodevelopmental delay using DNA methylation levels from cord blood. A mediation analysis was conducted to analyze if associations between differential methylation and neurodevelopmental delay were mediated by changes in neonatal brain volumes.
Results We found epigenome-wide significant associations between severe neurodevelopmental delay and differential methylation in the SPTBN4 locus (cg26971411, p-value=3.10×10−08), an intergenic region on chromosome 11 (cg00490349, p-value=2.41×10−08) and in the NBPF8 locus (FDR p-value for the region=9.06×10−05). While these associations were not mediated by neonatal brain volume, neonatal caudate volumes were independently associated with severe neurodevelopmental delay, particularly with language development (p-value=0.0443) and delay in motor function (p-value=0.0082).
Conclusion Differential methylation levels from cord blood and increased neonatal caudate volumes were associated with severe neurodevelopmental delay at two years of age. If confirmed by future studies, our findings could have implications for early detection of developmental delay and timely implementation of interventions, which are essential for positive child development.
Introduction
The first two years of life are a critical period of rapid growth and brain development. Child cognitive development is influenced by genetic and environmental factors which interact to determine how the brain develops and functions 1. Multiple factors have been shown to affect neurodevelopment; these include poverty, maternal education, maternal physical and psychological health, substance use, and nutrition 2–6. As a result, low- and middle-income countries in general and in particular sub-Saharan Africa have the highest proportion of young children at risk of developmental delay 3. New research in early human development shows that epigenetic, immunological, physiological, and psychological adaptations to the environment occur from conception, and that these adaptations affect development throughout the life course 6, which highlights the importance of early identification of developmental delay and timely implementation of interventions 4. Studies have shown that children who have received early intervention services experience improvements in cognitive and academic performance and engage less in risky behaviors such as alcohol, tobacco, and drug use, and high-risk sexual activity 7–9. However, screening for neurodevelopmental delay is only conducted at nine months of age at the earliest 10. Therefore, exploring neonatal biomarkers that can potentially be used to detect early signs of severe neurodevelopmental delay is a nascent area of research.
DNA methylation levels from cord blood could potentially be used as such neonatal biomarkers. DNA methylation can be altered by psychological, environmental and genetic factors and is one of the most studied modifications of the genome. Multiple studies show that epigenetic modifications are associated with the risk of cognitive disability or impairment 11. However, the role of epigenetic variation in the cognitive development of infants remains largely unknown. Neuropsychiatric outcomes that are known to be associated with altered DNA methylation in newborns include attention-deficit/hyperactivity disorder (ADHD) 12 and amygdala:hippocampal volume ratio, which is a marker for anxiety and aggression 13. Some evidence that altered methylation levels in neonates can also be used as a biomarker for neurodevelopmental delay comes from a study of 238 Mexican-American children 14. However, this study only found suggestive (non-significant) associations between two methylation sites and working memory and processing speed.
Over the last decades, structural magnetic resonance imaging (MRI) studies in children have increasingly been used to ascertain relations between brain volume and various measures of atypical behavioral development 15,16. Changes in brain volume have been associated with neurodevelopmental disorders such as autism 17,18, ADHD 19, and specific language impairment 20. However, most of these studies have been conducted in children older than two years of age, and information regarding structural and functional brain development in the first two years of life is very limited 21. More recent studies used MRI brain scans from neonates and from infants at 6 or 7 months of age to predict cognitive development, suggesting involvement of the maturation of subcortical brain regions 16,22,23. These early changes in brain volume can potentially be used as an early biomarker for neurodevelopmental delay. Furthermore, given their association with prenatal environmental and psychological risk factors and genetics 6, we argue that early changes in brain volume can potentially mediate the association between DNA methylation and neurodevelopmental delay.
In this study we estimated associations between DNA methylation levels from cord blood, subcortical volumes derived from neonatal MRI imaging data and neurodevelopmental delay at two years of age in the South African Drakenstein Child Health Study (DCHS).
Methods
Study design and study population
The DCHS, a population-based birth cohort, has been described previously 24–26. Mothers were enrolled prenatally in their second trimester and followed through pregnancy at two primary care clinics serving two distinct populations (predominantly black African ancestry or predominantly mixed ancestry). Mother-child pairs were followed from birth and infants enrolled in the DCHS are followed until at least five years of age 24. All births occurred at a single, central facility, Paarl Hospital. The sample included in the present study comprised 161 children who had all measurements available, including cognitive Bayley scores at 2 years of age, methylation data from cord blood and genotyping data.
Ethical approval for human subjects’ research was obtained from the Human Research Ethics Committee of the Faculty of Health Sciences of the University of Cape Town (HREC UCT REF 401/2009; HREC UCT REF 525/2012). Each mother signed informed consent on behalf of herself and her infant for participation in this study.
Assessment of neurodevelopmental delay
At 24 months, children were assessed using the Bayley Scales of Infant and Toddler Development, third edition (BSID-III) 27. The BSID-III is a gold standard assessment of child development used internationally, and validated in South Africa 28,29. The assessment generates scores for cognitive, language and motor development. Trained assessors administered the BSID-III using direct observation to score children on their cognitive, language and motor development 30. BSID specialized software was used to produce normed and age-adjusted scores calculated using data from a US reference group. Composite scores for cognitive, motor and language scales were scaled to have a mean of 100 and standard deviation of 15. The scores were categorized into severe delay if they were at least two standard deviations below the BSID-III reference mean, and mild delay if they were at least one standard deviation below the mean 27.
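The categorization described above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring procedure of the BSID software: it assumes that "delay" refers to scores below the reference mean, that the cutoffs are inclusive, and that the mild and severe categories are mutually exclusive. The function name `classify_delay` is hypothetical.

```python
# Illustrative sketch only. Assumptions (not stated in the text): delay means
# scoring BELOW the reference mean, cutoffs are inclusive, and the categories
# are mutually exclusive.
REFERENCE_MEAN = 100.0  # composite score mean, as given in the text
REFERENCE_SD = 15.0     # composite score standard deviation, as given in the text


def classify_delay(composite_score: float) -> str:
    """Return 'severe', 'mild', or 'none' for one composite score."""
    z = (composite_score - REFERENCE_MEAN) / REFERENCE_SD
    if z <= -2.0:
        return "severe"
    if z <= -1.0:
        return "mild"
    return "none"


if __name__ == "__main__":
    # Hypothetical scores, chosen only to exercise each branch.
    for score in (65, 80, 100):
        print(score, classify_delay(score))
```

Under these assumptions a composite score of 70 or lower would count as severe delay and a score between 71 and 85 as mild delay.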
DNA methylation
DNA was isolated from cord blood samples that were collected at time of delivery 31. DNA methylation was assessed with the Illumina Infinium HumanMethylation450 BeadChips (n=156) and the MethylationEPIC BeadChips (n=160). Pre-processing and statistics were done using R 3.5.1 32. Raw IDAT files were imported into RStudio, where intensity values were converted into beta values. The 450K and EPIC datasets were then combined using the minfi package 33,34, resulting in 316 samples and 453,158 probes. Background subtraction, color correction and normalization were performed using the preprocessFunnorm function 35. After sample and probe filtering, 273 samples and 409,033 probes remained for the downstream analyses (see supplementary methods for details). Batch effects were removed using ComBat from the R package sva 36. Cord blood cell type composition was predicted using the most recent cord blood reference data set 37 and the IDOL algorithm and probe selection 38 based on the previous methods 39.
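For readers unfamiliar with beta values, the conversion from probe intensities can be sketched as follows. The study performed this step in R. The Python function below only illustrates the standard definition of a beta value as the methylated share of the total signal. The `offset` argument and its default of 100 are assumptions for illustration, since the setting used in the study is not stated, and the name `beta_value` is hypothetical.

```python
def beta_value(methylated: float, unmethylated: float, offset: float = 100.0) -> float:
    """Methylated share of total probe signal at one CpG site (between 0 and 1).

    `offset` is a small constant that stabilizes the ratio when both
    intensities are low. Its default here is an assumption, not the
    study's documented setting.
    """
    if methylated < 0 or unmethylated < 0 or offset < 0:
        raise ValueError("intensities and offset must be non-negative")
    total = methylated + unmethylated + offset
    if total == 0:
        raise ValueError("total signal is zero; beta value is undefined")
    return methylated / total


if __name__ == "__main__":
    # Hypothetical intensities for a mostly methylated site.
    print(round(beta_value(4500.0, 400.0), 3))
```

A beta value near 0 indicates an unmethylated site and a value near 1 a fully methylated site, which is why the regression coefficients reported for single CpG sites are small in absolute terms.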
Genotype data
Offspring samples were selected for genotyping analysis based on a number of criteria relevant to the DCHS as a whole – including (but not limited to) maternal psychosocial risk/stressors and/or availability of offspring lung function data. DNA was isolated from cord blood samples that were collected at time of delivery 31. Genome-wide genotyping was performed in 270 newborns using the Illumina Infinium PsychArray (n=119) and the Illumina Infinium Global Screening Array, GSA (n=151). After quality control (QC) SNPs were imputed on the 1000 Genomes reference panel (Phase III) using the Michigan Imputation Server 40. Imputed genotypes reaching an R2≥0.3 in both arrays were used in analyses. Principal components were calculated using PLINK v1.90b4 64-bit 41,42.
MRI imaging data from neonates
Neonatal imaging was performed on a subgroup of newborns from the DCHS at the Cape Universities Brain Imaging Centre, Tygerberg Hospital 2. Newborns were excluded if they were found to have medical comorbidities or had neonatal intensive care admission at birth, were premature (<36 weeks’ gestation), had an APGAR score of <7 at 5 minutes, or their mothers used illicit drugs in pregnancy. Structural T2-weighted images were acquired during natural sleep on a Siemens Magnetom 3T Allegra MRI scanner (Erlangen, Germany) using a head coil with a wet clay inlay 43,44. Statistical parametric mapping software SPM8 (www.fil.ion.ucl.ac.uk/spm/software/spm8) was used to process the T2-images, using a well-established neonatal brain template 45 for image normalization to standard space and segmentation of images into three tissue types using probabilistic maps as priors. Volumetric data were extracted for the following subcortical regions: caudate, pallidum, putamen, thalamus, amygdala and hippocampus. In addition, total grey matter, total white matter, and total CSF volumes were extracted and summed to obtain a measure of intracranial volume. More details regarding image acquisition and processing can be found elsewhere 46. Since an appreciable number of scans needed to be discarded, largely due to movement artifacts 46, only 51 out of the 161 samples could be included in the analysis of neonatal MRI imaging data.
Statistical analysis
To identify DNA methylation patterns in cord blood that are associated with severe delay in neurodevelopment at two years of age, we conducted an epigenome-wide association study (EWAS) on single CpG sites as well as an analysis of differentially methylated regions (DMRs). For the EWAS, we ran a multivariate robust linear regression model with empirical Bayes from the R package limma (version 3.40.6) 47 using severe neurodevelopmental delay at the age of two years as the independent variable and each CpG methylation as a dependent variable, adjusting for sex, preterm birth, maternal smoking, household income, the first five genetic PCs to control for population stratification and the first three PCs from the estimated cell type proportions (after centered log-ratio transformation), which explain >90% of the cell type heterogeneity 48. We applied a Bonferroni threshold to correct for multiple testing based on the number of tested CpG sites (threshold: 0.05/403933=1.24 × 10−07). We conducted the following sensitivity analyses: 1) We validated our findings by using mild delay in neurodevelopment and the continuous Bayley scores as independent variables, 2) We confirmed our associations using linear regression with p-values obtained from normal theory (lm() function in R) as well as from a permutation test. To identify plausible pathways associated with severe delay in neurodevelopment, we performed an over-representation analysis based on the CpGs with p-values < 0.001 for the association with severe delay in neurodevelopment. We used the R Bioconductor package missMethyl (version 1.18.0 gometh function), which performs one-sided hypergeometric tests taking into account and correcting for any bias derived from the use of differing numbers of probes per gene interrogated by the array 49. 
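The Bonferroni threshold quoted above can be reproduced with a short calculation. The sketch below uses the number of tested CpG sites exactly as it appears in the threshold formula. The helper name `is_epigenome_wide_significant` is hypothetical, and the strict inequality is an assumption.

```python
ALPHA = 0.05
N_TESTED_CPGS = 403_933  # number of tested CpG sites, as written in the threshold formula

# Bonferroni correction: divide the family-wise alpha by the number of tests.
THRESHOLD = ALPHA / N_TESTED_CPGS


def is_epigenome_wide_significant(p_value: float) -> bool:
    """True if a single-CpG p-value falls below the Bonferroni threshold."""
    return p_value < THRESHOLD


if __name__ == "__main__":
    print(f"Bonferroni threshold: {THRESHOLD:.2e}")
```

Any single-CpG association with a p-value below this threshold is treated as epigenome-wide significant. Associations with p-values above it but below 0.05 are only nominally significant.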
Differentially methylated regions (DMRs) in severe neurodevelopmental delay were identified using DMRcate, which identifies DMRs by tunable kernel smoothing of association signals 50. Input files were our single-CpG EWAS results on severe neurodevelopmental delay, including regression coefficients, standard deviations and uncorrected p-values. DMRs were defined based on the following criteria: a) a DMR should contain more than one probe; b) regional information can be combined from probes within 1,000 bp; c) the region showed an FDR-corrected p-value < 0.05.
Finally, we analyzed whether associations between differential methylation and neurodevelopmental delay were mediated by neonatal brain volume as follows: (1) We analyzed associations between CpG sites that were associated with neurodevelopmental delay and MRI imaging data from neonates (total grey matter, total white matter and subcortical brain volumes) using linear regression models adjusted for the same covariates as the EWAS analyses plus age at MRI scan, child sex and intracranial volume. (2) We analyzed associations between MRI imaging data and severe delay in neurodevelopment using linear regression models adjusted for age at MRI scan, child sex and intracranial volume. These associations were validated in two sensitivity analyses: first, associations were additionally adjusted for preterm birth, maternal smoking, household income and the first five genetic PCs to control for population stratification; second, the findings were validated by using mild neurodevelopmental delay and the continuous Bayley scores as independent variables. Lastly, if the associations in (1) and (2) were significant, a formal mediation analysis was conducted 51.
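Criteria a) and b) can be illustrated with a simple grouping routine. This is only a sketch of the idea of combining neighbouring probes, not DMRcate's algorithm, which additionally applies kernel smoothing and FDR correction. The function name `group_probes`, the example positions, and the choice to treat a gap of exactly 1,000 bp as still belonging to the same region are assumptions.

```python
from typing import Iterable, List


def group_probes(positions: Iterable[int],
                 max_gap: int = 1000,
                 min_probes: int = 2) -> List[List[int]]:
    """Group probe positions on one chromosome into candidate regions.

    Consecutive probes stay in the same region while the distance to the
    previous probe is at most `max_gap` base pairs. Regions with fewer
    than `min_probes` probes are dropped.
    """
    regions: List[List[int]] = []
    current: List[int] = []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            # Gap too large: close the current region and start a new one.
            if len(current) >= min_probes:
                regions.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_probes:
        regions.append(current)
    return regions


if __name__ == "__main__":
    # Hypothetical probe positions (base pairs) on a single chromosome.
    print(group_probes([100, 600, 1500, 5000, 9000, 9400]))
```

In this hypothetical example the isolated probe at position 5000 does not form a region, because a region must contain more than one probe.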
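The stepwise logic of this mediation strategy can be written as a simple decision rule. The sketch below only encodes the gate described above, namely that a formal mediation analysis is run when both the methylation-to-volume and the volume-to-delay associations are significant. It does not estimate any mediation effect. The function name, the significance level of 0.05, and the example p-values are assumptions for illustration.

```python
def mediation_warranted(p_methylation_to_volume: float,
                        p_volume_to_delay: float,
                        alpha: float = 0.05) -> bool:
    """Return True only if both association steps are significant.

    Step (1): CpG methylation -> neonatal brain volume.
    Step (2): neonatal brain volume -> neurodevelopmental delay.
    `alpha` is an assumed significance level.
    """
    for p in (p_methylation_to_volume, p_volume_to_delay):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie between 0 and 1")
    return p_methylation_to_volume < alpha and p_volume_to_delay < alpha


if __name__ == "__main__":
    # Hypothetical p-values: step (2) is significant, step (1) is not,
    # so no formal mediation analysis would follow.
    print(mediation_warranted(p_methylation_to_volume=0.40, p_volume_to_delay=0.01))
```

With this rule, a significant association in only one of the two steps is not sufficient to proceed to a formal mediation analysis.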
Results
Description of Study Participants
The study sample included 161 children with DNA methylation and genotype data, BSID-III scores at two years of age as well as information on all relevant covariates (Table 1). Of these, neonatal MRI imaging data were available in 51 children. Almost half of our study sample was female, half of them were of African ancestry and the other half were of mixed ancestry, and a quarter of the population was exposed to maternal smoking during pregnancy. Cognitive development at two years of age was severely delayed in 4 to 12% of the study sample (depending on the tested domain). Most of the cases had a severe delay in language development (8% in the whole study sample and 12% in the subsample with imaging data), followed by severe neurodevelopmental delay (7-8%) and severe delay in motor function (4%). MRI scans were conducted in neonates at three weeks of age.
Differential Methylation as an Early Sign of Neurodevelopmental Delay
Differentially methylated CpG sites in the SPTBN4 locus (cg26971411) and in an intergenic region on chromosome 11 (cg00490349) were associated with severe neurodevelopmental delay at the epigenome-wide significance level (Bonferroni-adjustment) after adjusting for sex, preterm birth, maternal smoking, household income, the first five genetic PCs and cell type proportions (Table 2, Figure 1, Figure S1). Differential methylation in cg26971411 was significantly associated with severe delay in motor function (beta = −0.024, p-value = 3.10 × 10−08) and nominally significant for cognitive development (beta = −0.013, p-value = 3.93 × 10−05) and language development (beta = −0.014, p-value = 2.57 × 10−06). Differential methylation in cg00490349 was significantly associated with severe delay in language development (beta = −0.036, p-value = 2.41 × 10−08) and nominally significant for cognitive development (beta = −0.031, p-value = 1.94 × 10−05) and motor function (beta = −0.050, p-value = 2.73 × 10−07). The significant associations were confirmed in a sensitivity analysis using linear regression models as well as permutation tests (Table S1). Associations with differential methylation in cg26971411 and cg00490349 were weaker, but still nominally significant, for mild neurodevelopmental delay (p-values < 0.05, Table S2). Associations with the continuous Bayley scores were only nominally significant for cognitive function, but not for language development or motor function (Table S2).
In addition to the two significant single CpG sites, we identified one significant DMR from our EWAS results on severe delay in language development that is located in the NBPF8 locus (Figure 1, Table 2, minimum FDR p-value for the region = 9.06 × 10−05).
No significantly enriched pathway was found among the most significant CpG sites (p-values < 0.001) from the EWAS of severe neurodevelopmental delay (Tables S3-S5).
Changes in Neonatal Subcortical Brain Volumes Associated with Neurodevelopmental delay
Increased caudate volumes were associated with severe neurodevelopmental delay at two years of age (Table 3, Figure 2), particularly for language development (beta = 165.30, p-value = 0.0443) and delay in motor function (beta = 365.36, p-value = 0.0082). The association between caudate volumes and delay in motor function was also significant for mild delay in motor function (beta = 156.62, p-value = 0.0148) and when using the continuous Bayley score (beta = −4.94, p-value = 0.0150) (Table S6). Furthermore, the associations were robust to additional adjustment for preterm birth, maternal smoking, household income and the first five genetic PCs to control for population stratification (beta = 349.97, p-value = 0.0144, Table S7). Associations with language development were only significant for a severe delay (Table S6) and not significant after including additional covariates (Table S8). There were no associations with other subcortical brain volumes or with total grey or white matter (Table 3).
However, since we did not find associations between methylation in the significant CpG sites from the EWAS of neurodevelopmental delay (cg26971411 and cg00490349) and changes in brain volumes (Table S9), our data did not support a mediating effect of altered brain volumes on the association between methylation and neurodevelopmental delay.
Discussion
In this study of infants from a poor peri-urban community in South Africa, we showed that differential methylation levels from cord blood as well as larger neonatal caudate volumes were associated with severe neurodevelopmental delay at two years of age. These results suggest that methylation levels and brain imaging data from newborns can potentially be useful as neonatal biomarkers for neurodevelopmental delay and highlight the need to understand the biological pathways through which prenatal environmental and psychological risk factors and genetics affect neurodevelopment. If confirmed by future studies, our findings could have implications for early detection of neurodevelopmental delay and consequently timely implementation of interventions 4, which are essential for positive cognitive functioning throughout the life course 7–9.
Differential Methylation in SPTBN4 and NBPF8 as an Early Sign of Neurodevelopmental Delay
Differential methylation in three loci (SPTBN4, NBPF8, and an intergenic region on chromosome 11) was significantly associated with severe neurodevelopmental delay at two years of age. Differential methylation in the SPTBN4 locus (cg26971411) showed the strongest association with severe delay in motor function. SPTBN4 (Spectrin Beta, Non-Erythrocytic 4) is a protein-coding gene which, according to the Human Protein Atlas, is primarily expressed in brain tissue and particularly in the cerebellum, the part of the brain that functions as a co-processor of movement in concert with the cortex and basal ganglia 52. In line with this, mutations and loss-of-function variants in SPTBN4 were reported in association with arthrogryposis, which is a neuromuscular condition 53, and with a severe neurological syndrome that includes congenital hypotonia, intellectual disability, and motor axonal and auditory neuropathy 54. To the best of our knowledge, only one study has identified differential DNA methylation in SPTBN4 in association with neurocognitive outcomes (Alzheimer’s disease) and this study was conducted in mice 55. Therefore, our study extends the current literature by providing, to our knowledge, the first report of an association between neonatal differential methylation in SPTBN4 and neurodevelopment, highlighting that not only genetic variants or methylation levels from brain tissue, but also methylation levels from cord blood are linked to cognitive outcomes.
The second locus for which we found an association with severe neurodevelopmental delay was NBPF8. NBPF8 is a member of the neuroblastoma breakpoint family (NBPF). Members of this gene family are characterized by tandemly repeated copies of DUF1220 protein domains. Copy-number variations in the 1q21.1 region, where most DUF1220 sequences map, have been implicated in an increasing number of human diseases, including autism, schizophrenia, microcephaly/macrocephaly, and neuroblastoma 56. However, to the best of our knowledge, this is the first study showing differential methylation in this region in association with cognitive outcomes.
Increased Neonatal Caudate Volumes Associated with Severe Neurodevelopmental Delay
While we did not find indications that altered neonatal brain volumes mediate the association between methylation and neurodevelopmental delay, we showed that increased neonatal caudate volumes were associated with neurodevelopmental delay at two years of age, particularly in motor function. This finding suggests the potential use of brain imaging data as a neonatal biomarker for neurodevelopmental delay in addition to methylation. The caudate nucleus is one of the structures that make up the corpus striatum, which is a component of the basal ganglia. The caudate nucleus plays a prominent role in motor processes, and caudate nucleus dysfunction has been found in Parkinson’s disease, Huntington’s chorea, dyskinesias, obsessive–compulsive disorder and other movement and cognitive disorders 57. The human brain undergoes rapid change in the first year of life. The growth rates of subcortical grey-matter structures are similar to cortical grey-matter growth rates, with the amygdala, thalamus, caudate, putamen, and pallidum growing about 105% in the first year and roughly 15% in the second 58. Our study showed that increased neonatal caudate volumes were associated with severe neurodevelopmental delay at two years of age, suggesting that risk factors for neurodevelopmental delay impact brain development in utero, resulting in deviations in maturation that can already be detected soon after birth. Here, we assume that any extreme deviation from the normal trajectory can result in adverse cognitive outcomes.
Strengths & Limitations
Our study has a number of limitations. First, our findings were based on a relatively small sample size, particularly the analyses of neonatal brain imaging data, because an appreciable number of scans needed to be discarded largely due to movement artifacts. Furthermore, although cortical brain regions are expected to contribute to neurodevelopmental delay 16, cortical segmentation in the present dataset was hindered by low rates of white matter myelination and the absence of T1-weighted images that have been shown to enhance segmentation quality. While the infants whose scans were included and excluded did not differ significantly on background variables, the loss of this large amount of data may have impacted our results and limited the statistical power to detect mediation effects. Another limitation of our analyses was the small number of cases of severe neurodevelopmental delay in our sample. However, to reduce the risk of false positive findings due to the imbalanced study design, we validated our findings by using different modelling approaches in our EWAS analyses (limma, linear regression, permutation tests, DMR analysis) and different variables to measure neurodevelopmental delay in all our analyses (severe neurodevelopmental delay, mild neurodevelopmental delay, continuous Bayley scores).
Despite these limitations, this is a very well characterized group of infants recruited as part of a population-based prospective study design, and the selection of the infant age group is an additional strength. For this study, 2- to 4-week-old infants were chosen for imaging because cerebral changes are particularly intense during the last weeks of gestation and the first postnatal months. Furthermore, imaging during the early postnatal period may more accurately reflect the effects of prenatal environmental and psychological risk factors and genetics on brain structure.
Conclusions
We have presented evidence that differential neonatal methylation in SPTBN4, NBPF8, and an intergenic region on chromosome 11 as well as increased neonatal caudate volumes were associated with severe neurodevelopmental delay at two years of age. If confirmed by future studies, our findings could have implications for early detection of developmental delay and timely implementation of interventions, which are essential for positive child development.
Data Availability
Data are available upon request from the Steering Committee (Prof. Heather Zar, heather.zar@uct.ac.za).
Funding
The Drakenstein Child Health Study was funded by the Bill & Melinda Gates Foundation (OPP 1017641), Discovery Foundation, Medical Research Council South Africa, National Research Foundation South Africa, CIDRI Clinical Fellowship and Wellcome Trust (204755/2/16/z). AH was supported by a research fellowship from the Deutsche Forschungsgemeinschaft (DFG; HU 2731/1-1) and by the HERCULES Center (NIEHS P30ES019776). MPE was supported by NIH grant R01 GM117946. NAG was supported by a Claude Leon Fellowship. DJS, HJZ, KAD and NK are supported by the SA Medical Research Council (SAMRC). PDS is supported by the National Health and Medical Research Council, Australia. CJW is supported by the Wellcome Trust through a Research Training Fellowship [203525/Z/16/Z]. Support for the neuroimaging was also received by KAD from an ABMRF young investigator grant, the South African Medical Association and the Harry Crossley Foundation. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of manuscript.
Competing financial interests declaration
All authors declare they have no actual or potential competing financial interest.
Acknowledgments
The authors thank the study and clinical staff at Paarl Hospital, Mbekweni and TC Newman clinics, as well as the CEO of Paarl Hospital, and the Western Cape Health Department for their support of the study. The authors thank the families and children who participated in this study.