Abstract
Children of older fathers have higher risk for certain diseases. Nevertheless, how ageing specifically affects male germ cells is so far not completely understood. In a cohort of 197 healthy men (18-84 years), we found that semen and reproductive parameters remained normal over six decades. Along with an age-dependent increase in telomere length in sperm (r=0.41, p>0.001), we found accelerated DNA fragmentation, more prominent after the sixth decate of life, and with around 60% of men older than 66 showing abnormal levels of DNA breaks. At the epigenetic level, by whole genome bisulfite sequencing we identified 236 sperm-specific differentially methylated regions between the youngest and oldest group, affecting mostly regions associated with homeobox genes and nervous system development. Therefore, we propose that during ageing, male germ cells are affected by an intrinsic and specific ageing process, distinguishable from the soma. These age-dependent changes might have consequences for fertility and offspring of older men.
Introduction
In developed countries, couples postpone parenthood mainly for socio-political and cultural reasons (1–3). The reproductive implications of delaying fatherhood, such as a decline of sperm parameters (e.g. sperm count, motility, and morphology) and disruptions in reproductive hormone secretion (4), have long been undervalued. In fact, if and how ageing itself causes a decline in men’s reproductive health is unclear because selection criteria for clinical studies are inconsistent and some studies indicate only a subtle change during ageing (5, 6) which might in part be due to unhealthy lifestyles or age-related somatic diseases (7). Observational studies have indicated that the partners of elderly men take longer to become pregnant and that these pregnancies have higher rates of spontaneous abortion (8–10). Recently, advanced paternal age has been shown to have adverse consequences for maternal and offspring health (11). In spite of this, knowledge of how ageing specifically affects the quality of male germ cells remains limited.
The increasing number of pre-meiotic cell divisions in the human male germline leads to an increase in de novo mutation rates in children of older men (12, 13). In line with this, there is an association between paternal age and incidence of so called paternal age effect (PAE) disorders (14). Still, the mechanism by which PAE arise is not completely understood. Among the hypotheses put forward to explain the detrimental effects of age on male fertility and offspring health, epigenetic changes to the DNA stand out. Changes in DNA methylation accumulate in the male germline with age (15–17), however only a small portion of all CpG sites in the genome have so far been interrogated.
Our unique cross-sectional observation study on healthy ageing men [Fertility and Ageing in Healthy Men (FAMe)] was conducted in order to exclude the effects of age-related mobidities and evaluate the impacts of age to male reproductive health. Our ultimate goal was to identify the fundamental processes involved in male germline ageing which are caused exclusively by age.
Results
Reproductive parameters in healthy ageing men
We evaluated reproductive parameters in 197 healthy men, who were stratified into six age groups (Table 1). We did not observe any age-related changes in testicular volume or semen parameters beyond or below the normal ranges given by the WHO (18). Subtle, but significant age-dependent processes such as a decrease of ejaculate volume, changes in sperm motility, and a decrease in markers for accessory gland function, e.g. zinc were clearly visible, however all changes remained within the given physiological range (Table 2).
Similar observations were made regarding reproductive endocrine parameters. Although there was a slight age-dependent increase in FSH, LH, and SHBG and a decrease in free testosterone, all hormone levels stayed within the normal range (Table 3). A similar scenario was observed when evaluating anthropometric parameters and questionnaires for (sexual) wellbeing (Table 4).
Molecular signatures of male germ cells in healthy ageing men
We investigated to what extent hallmarks of ageing previously established in somatic cells, such as telomere attrition, loss of genome stability, and epigenetic alterations, are affected in human sperm from ageing men.
We found an inverse association between age and blood rTL (r=-0.41, p=3.27×10-9), indicating the shortening of telomeres with age in somatic cells (Figure 1A). In contrast, there was a positive association of similar strength between age and sperm rTL (r=0.41, p=3.37×10-8), demonstrating that telomeres in the germline become longer with increasing age. Moreover, the rate at which telomeres shorten in blood DNA (0.0055 relative units per year) is slower than the rate that they lengthen in sperm (0.0077 relative units per year).
Genome instability was assessed in the paternal genome by measuring DNA fragmentation, in particular single- and double-strand DNA breaks (19). In the sperm of healthy men, we observed a highly significant increase in DFI with increasing age (ρ=0.57; p=1.04⨯10−15). Possible confounding parameters, such as abstinence time and sperm vitality, had little or no effect on this association (Table S1). While we found that the DFI continuously and steadily increased over the different age groups until the age of 55 years (Groups 1-4), a marked acceleration in genome instability was noted in the two oldest age groups (groups 5 and 6; Figure 1B), with almost 60% of men over the age of 66 showing DFI above 30%, which is considered the threshold for normal DFI (20).
Another hallmark of ageing seen in somatic cells is epigenetic drift, in which DNA methylation changes occur with age. As a proxy for assessing global DNA methylation in tissues or cells, we measured the DNA methylation of LINE-1 repetitive elements in bisulfite-converted blood DNA from our cohort, but did not find any age-related methylation changes (Figure S1).
The stability of LINE-1 methylation suggests that any age-dependent changes to the methylome might occur at unique regions in the genome. Using deep bisulfite sequencing (DBS), we assessed the sperm DNA methylation levels of regions previously found to be aberrantly methylated and associated with reduced fertility (21). These included two imprinted genes (H19 and MEST) and four spermatogenesis-specific genes (FGFR3, VASA, RHOXF1, and RHOXF2/2B). In parallel, we also analysed blood DNA to exclude a general age-related tissue-independent effect. There were only minor differences detected between the younger (n=14, age range 18-25) and older (n=14; age range 61-71) age groups in DNA methylation levels of H19 and VASA in blood DNA and in FGFR3 in sperm DNA (Figure S2). It is therefore likely that age-dependent changes to the methylome are located at genomic loci not previously associated with male infertility.
In order to explore age-dependent methylation changes in sperm in an unbiased and genome-wide manner, we performed whole genome bisulfite sequencing (WGBS) on 6 samples each of sperm DNA from the two extreme age groups (Group 1 vs. Groups 5+6). For comparison, blood DNA from the same men were prepared and sequenced. There were no differences in mean DNA methylation between age groups or between tissues (Table S2; confirmation of purity of samples shown in Figure S3).
Differentially methylated regions (DMRs; with at least 4 CpGs and 30% mean methylation difference) between the extreme groups were identified using two different algorithms (BSmooth and Metilene (22, 23)). We identified a total of 236 DMRs (Supplementary File 1) in sperm and 36 in blood. Of the sperm DMRs, 121 increased and 115 decreased methylation with age. Notably, there was no overlap between the age-associated sperm DMRs and blood DMRs, consequently only the sperm DMRs were analysed further by bioinformatic and analytical tools. We did not find any chromosomal preference or hotspot for the observed sperm methylation changes (Figure S4).
Analysing the identified sperm DMRs, we noticed a significant overrepresentation of DNA transposons (p=9.59×10-4) and LINE elements (p=6.21×10-3), but no significant over- or underrepresentation of either LTRs or SINE (Figure S5).
A subset of 11 sperm DMRs was chosen for further in-depth analysis and used to confirm findings in samples from all of the age groups using DBS (Table S3). In addition to the 12 samples originally analysed by WGBS, 30-63 sperm samples, equally distributed throughout the six age groups, were randomly selected. By DBS analyses, we confirmed that seven out of eleven DMRs (BS21, BS74, BS134, ML15, BS71, and BS174) had a statistically significant correlation between DNA methylation levels and age. Two DMRs (BS46 and ML12) showed a similar trend as identified by WGBS, but the correlation with age was not statistically significant. Two other DMRs (BS88 and BS130) displayed a non-linear pattern of age-related methylation change (Figure 2).
The correlation between age and methylation of the different DMRs was adjusted for the possible confounder MTHFR C677T genotype, previously shown to influence DNA methylation. However, we could not observe an overall significant effect of this polymorphism on DNA methylation of the regions analysed.
We used six DMRs with p<0.001 (BS74, BS134, ML15, BS22, BS71 and BS174) and a regression analysis of donor-wise average methylation levels to derive an age predictor for sperm. By leave-one-out cross-validation (24), we found a strong correlation between predicted and chronological ages for the 42 initially analysed samples. An additional set of 33 samples were independently analysed, confirming this strong correlation (Figure 3; R script available in Supplementary File 2). Taking the two sample sets together, we found a strong correlation (r=0.78; p=3.83×10-16) between chronological and predicted age and a mean absolute error of 8.7 years.
In order to analyse the functional importance and potential biological impact of the 236 sperm DMRs, a list with the upstream and downstream neighbouring genes (distance from TSS ≤1000 kb) was obtained. Gene ontology (GO) analysis of the DMR-associated gene list showed a significant enrichment for Homeobox genes (16 genes; padj=2.0×10−2) and other DNA binding protein gene families (58 genes; padj=4.3×10−2). Gene set enrichment analysis showed enrichment in several gene sets (FDR<0.05), of which the ten most significant candidates (for the GO Biological Process Ontology) are highlighted in Figure 4.
Discussion
The FAMe study of healthy ageing men delineates for the first time an inherent ageing process that take place in germ cells during spermatogenesis by minimizing the effects of confounding factors. Considering the hallmarks of ageing already known for other tissues and cells, it is evident that male germ cells age, but the ageing process is different from that of somatic cells.
Good general health seems to be an essential requirement for good reproductive health, regardless of age, while severe male infertility appears to be associated with increased morbidities (25). Our andrological evaluation revealed that sperm production is not affected by age in a clinically relevant way. These findings contrast with several studies which showed a decline in endocrine and spermatogenic function with age (4, 6, 26). This is likely due to the strict health criteria applied during our volunteer recruitment, which were not used in most previous studies. Our results are in complete agreement with a previous study which also applied strong selection criteria (27). A possible limitation of this study is that we do not have longitudinal data, and therefore we cannot exclude an influence of the year of birth on the data.
We have found a negative correlation between age and blood rTL and a positive correlation of similar strength between age and sperm rTL, demonstrating an extension of telomeres in the germline with increasing age. This is presumably due to telomerase activity in the germline (28). The increased rate of telomere lengthening in sperm than telomere shortening in blood is in accordance with Aston et al. (29). There is evidence that very long sperm telomeres are detrimental to fertility (30). The age-dependent increase of telomeres in sperm could also result in increased telomere length in the progeny of elderly fathers; however, how this might affect their general health is currently unclear (31).
In our cohort, DFI increased with age, indicating increased genomic instability in the germline. Previous studies have shown that an increase in DFI above 30% is associated with male infertility (20). Consequently, the fertility of elderly men might be impaired by increased DNA instability, contributing to previously reported longer time to pregnancy and higer miscarriage rates irrespective of female partner age (32, 33). The number of men with DFI levels above 30% increased more dramatically in groups 5 and 6 (>56 years; Figure 1B), affecting close to 60% of the volunteers in the oldest group, pointing to an accelerated increase in DNA damage, a pattern of DNA instability hitherto unknown.
To our knowledge, this is the first study investigating age-related methylation changes in all CpG dinucleotides across the human sperm genome. We identified over 200 DMRs in sperm, none of which overlaped with DMRs identified in blood in this study or with any of the 353 CpGs from Horvath’s epigenetic clock (34). This strongly supports the notion that epigenetic ageing in the male germline differs from somatic cell ageing.
For a majority of the DMRs validated by DBS, a significant linear pattern of hyper- or hypomethylation with age was evident. The range of methylation changes observed was remarkable. For example, the mean methylation decreased from 87% to 46% (for ML15) and increased from 32% to 62% (for BS71) with age. Two candidate DMRs (BS88 and BS139) showed a non-linear pattern which might indivate a strong impact of other factors (e.g. genetic background). DNA methylation in sperm has been previously shown to be affected by genetic variants, such as MTHFR polymorphisms, however we could not find any influence of this variation on the DNA methylation levels of the analysed DMRs.
The inter-individual variation in DNA methylation was very low in most DMRs validated, which allowed us to devide an epigenetic age-predictor for sperm based on six of these regions. Recently, an epigenetic clock for sperm has been published by Jenkins et al. (16), which is more precise than our own. One reason for this difference may be the narrower age range in their cohort (20-60 years) and also that our predictor is based on only six regions. We think the two epigenetic clocks might complement each other, as they likely reflect different factors. Jenkins’ study cohort includes clinically heterogeneous sperm samples, including infertile patients, and therefore their clock may reflect not only the effect of age but also other factors such as age-related morbidities. In contrast, our data are based on a highly selected population of healthy men and therefore can be used for tracking the influence of health, fertility, lifestyle, and treatment on the ageing status of the germline.
Gene ontology analyses of genes in the vicinity of the identified sperm DMRs indicate an enrichment of biological processes involved in nervous system and head development (Figure 4). A similar finding was reported previously (15). Although it is tempting to speculate that there might be a link between this finding and the increased frequency of neuropsychiatric disorders in the offspring of older men, there is currently no experimental support for such a hypothesis.
A crucial question emerges from our study: what is the origin of the epigenetic changes in sperm? It is unlikely that DNA methylation is altred during spermiogenesis, where DNA is tightly packed with protamines and therefore inaccessible for transcription factor binding and modification by enzymes. The spermatogenial stem cells migh be a source of the observed changes. At 60 years of age, it is estimated that a spermatogonia has undergone approximately 920 mitotic cell divisions (35), making them prone to replication errors and resulting in increased mutation rates. It is therefore likely that male germ cells are also prone to increased DNA methylation maintenance errors as they age. Moreover, to maintain sperm production levels, testes of older men require require activation and recruitment of additional, otherwise quiescent, spermatogonia (36). This changed spermatogonial activity pattern could cause the observed methylation, and even DNA instability, changes. In addition, similarly to what happens with mutations (35), certain DNA methylation changes might confer a selective advantage to some spermatogonia, which would contribute more to the spermatogenic output with increasing age. This would indicate that two further hallmarks of ageing, spermatogonial stem cell replication stress and clonal drift, could contribute significantly to the ageing of the male germline.
In summary, healthy ageing is strongly associated with good reproductive health, with normal semen parameters being maintained over a six decades-long period. However, this starkly contrasts with intrinsic ageing processes occurring in the male germline, such as telomere lengthening, increasing genome instability, and DNA methylation changes at specific sites. These molecular changes might contribute to the reduced fertility and fecundity in older men.
It has been previously advised that genetic counselling should be provided for couples undergoing ART when the male partner is over 49 years of age (37, 38). Our data support this recommendation, but further work will be necessary in the future to understand whether the molecular changes to sperm DNA have consequences to progeny health, to set precise age thresholds, and ultimately whether fertility preservation options (e.g. sperm cryopreservation) should be offered to men while they are still young.
Materials and Methods
Ethics
All participants gave informed written consent for performing the examinations, evaluation of clinical data and genetic analysis of the DNA samples according to protocols approved by the Ethics Committee of the Medical Faculty in Münster (2013-255-f-S).
Participants
Participants were recruited through advertisements in local newspapers, internet, and hospitals. The recruitment and clinical evaluation/selection of the participants took place between October of 2014 and April of 2016. Prospective participants (above the age of 18) were first asked to fill an online questionnaire. Exclusion criteria in the questionnaire included smoking within the previous year, illegal drug consumption, regular use of medication (except for the treatment of mild hypertension, hypothyroidism, and dyslipoproteinemia), hospitalization within the previous month, current or former cancer/cancer treatment, severe chronic renal failure, chronic viral infection, urogenital malformations and surgeries, prior diagnosis or treatment for fertility impairment, chromosomal alterations, and participation in clinical trials within the previous year. 265 selected volunteers were invited to the outpatient clinic of the CeRA for clinical evaluation and sample collection. 65 volunteers dropped out for unwillingness or inability to make it to the appointment. 200 men underwent a thorough anthropometric, endocrine, and andrological evaluation, which was accompanied by several questionnaires addressing their wellbeing and sexual health. Two participants were further excluded due to the discovery of assymptomatic tumours during the clinical examination. 197 participants were ultimately included in the study.
Physical examination included measurement of body weight, height, blood pressure, pulse, and electrocardiogram. Blood samples were collected under fasting conditions and used for DNA isolation, hormone measurements, haematological analysis including metabolic parameters as blood lipids, HbA1c and blood glucose. Ultrasound of the thyroid, carotid arteries, kidneys, bladder, testicles, epididymis, prostate, and seminal vesicles (transrectal) in addition to palpation of breast, external genitalia, and digital rectal examination were performed. The volunteers were also asked to provide at least 2 semen samples in two separate visits, with an abstinence time of 2 -7 days. The semen samples were analysed in our andrological lab in accordance with the WHO criteria (18). A microbiological analysis of semen was performed and swim-up sperm was prepared (18) and stored for further analyses. All collected data were stored in our in-house clinical database, Androbase (39).
Grouping was done in 10 year intervals, except for groups 1 (18-24 years) and 6 (> 65 years; range 65 - 82 years), the latter due to the difficulties in recruiting sufficient numbers of healthy men above 75 years of age (Table 1).
Questionnaires
General wellbeing including personal medical history, life time smoking habits, socio-economic factors, and family medical history were analysed using four different questionnaires. Clinical symptoms of age-related hypogonadism were evaluated using the Aging Males Symptoms (AMS) rating scale (40). Erectile function was assessed with the international index of erectile function (IIEF) (41, 42). The evaluation of clinical symptoms caused by an enlargement of the prostate was measured by the International Prostate Symptom Score questionnaire (43). Quality of daily sleep was assessed by the Epworth sleepiness scale (44).
Hormone measurements
Hormonal parameters (LH, FSH, T, SHBG, prolactin, estradiol, PSA) were measured using a chemiluminescent microparticle immunoassay (Architect i1000; Abbott Diagnostics, Wiesbaden, Germany). DHT was measured using an in-house radio-immunoassay. All blood samples were obtained in a fasting state after 15 minutes of rest. Methods are validated quarterly against LC-MS/MS.
DNA Fragmentation Index (DFI) determination
Measurement of DNA fragmentation was performed as described previously (45). For the flow cytometer setup and calibration, a reference sample was used from a normal donor. Data were analysed using the FCS 3.0 software package (DeNovo software, Los Angeles, CA, USA). A DNA fragmentation index (DFI) for every detected sperm was calculated according to the formula: red/(red + green) fluorescence. DFI values were plotted in a histogram and sperm with denatured DNA (high DFI values) were those located to the right of the main, normal population.
DNA isolation and bisulfite conversion
Blood DNA was isolated using EDTA-blood using the FlexiGene DNA kit (Qiagen) according to the instructions supplied by the manufacturer and as previously described (46). Sperm DNA was isolated using MasterPure DNA purification kit (Epicentre Biotechnologies, Madison, WI, USA) and a modified protocol (46). DNA concentration was measured by spectrophotometry (NanoDrop ND-1000, Peqlab, Erlangen, Germany). DNA samples were stored at -20°C until further use.
For LINE-1 DNA methylation determination, DNA was bisulfite converted using the EpiTect kit (Qiagen). Briefly, 200 ng blood DNA were bisulfite converted using Qiagen EpiTect according to the manufacturer’s instructions. The converted DNA was eluted in 20 μL Buffer EB.
For Deep Bisulfite Sequencing (DBS), 200 ng blood and sperm DNA were bisulfite converted using the EZ DNA Methylation Gold kit (Zymo Research, Freiburg, Germany) according to the provided instructions, and the DNA was eluted in 10 μL.
Relative Telomere Length Determination
Mean relative telomere length (rTL) was measured on blood and swim-up sperm DNA by quantitative PCR (qPCR) using a previously described method (47), modified from (48).
Briefly, the cycle threshold (Ct) value for telomeric repeats (T) and a single-copy gene [S; haemoglobin subunit gamma (HBG)] were determined for each sample and the rTL was calculated as the ratio between T and S. A standard curve was generated using reference DNA ranging from 0.37 to 90 ng from 10 randomly selected subjects. Measurements for subjects and reference samples were carried out in triplicates. Amplification mixture contained 10 µl QuantiTect SYBR® green PCR Master Mix (2x; Qiagen), 2 µl blood DNA (12,5 ng/μl), 1 µl primer either for telomeric repeats (forward 2 µM, reverse 18 µM; Eurofins MWG) or HBG (forward 6 µM, reverse 14 µM; Eurofins MWG; Table S4, 2 µl DTT (25 mM), and nuclease free water to a total volume of 20 µl.
LINE-1 methylation determination
The DNA methylation of LINE-1 repetitive elements was quantified using PyroMark Q24 CpG LINE-1 kit (Qiagen) according to the procedure advised by the manufacturer.
Library preparation and sequencing
Sperm and blood DNA, respectively, from six “young” men (18-24 years) and six “old” men (61-71 years) were pooled at equimolar ratios. We used pooled DNA to minimize the effect of inter-individual DNA methylation differences (49–51). For each of the 4 pools, two shotgun libraries were generated. Standard pre-bisulfite libraries were prepared from 2 μg of DNA according to the original Illumina protocol essentially as described by Rademacher et al. (52). Tagmentation libraries were prepared from 20 ng of DNA using the Nextera transposase (Illumina, San Diego, CA, USA) and a modified protocol (personal communication from G. Gasparoni and J. Walter, University of Saarland). Bisulfite conversion was performed with the EZ DNA Methylation-Gold Kit (Zymo Research, Irvine, CA, USA), followed by 10 and 12 cycles of PCR, respectively. For each sample, we sequenced two lanes of the Illumina library and one lane of the tagmentation library on an Illumina HiSeq 2500, again as described previously (52).
Whole genome bisulfite sequencing (WGBS) data was deposited at the European Nucleotide Archive under the accession number PRJEB28044.
Read mapping and methylation calling
Raw reads were aligned with bwa-meth (v0.2.0) (53) to the human reference genome (hs37d5) with default parameters. Aligned reads were sorted and indexed, duplicates were detected, and alignment metrics were extracted by sambamba (v.0.6.6) (54). Methylation levels at CpG dinucleotides were called by an in-house script, which considers only CpGs located on the reference genome.
DMR detection
Differentially methylated regions (DMRs) were detected by Metilene (v.0.2-7) (23) and by an in-house script based on Bsmooth (22) as they detect preferentially shorter and longer DMRs, respectively. We removed the smoothing and detect differentially methylated CpGs (DMCs) by t-test as previously described (22). Adjacent differentially methylated cytosines (DMCs) are merged into a DMR if they are separated by not more than two non-DMCs. DMRs containing four DMCs and a minimum of 0.3 (i.e. 30%) average group methylation difference between these DMCs were selected.
Deep Bisulfite Sequencing (DBS)
Primer sets (Table S4) were designed for a selection of the detected DMRs using MethPrimer (55) (http://www.urogene.org/cgi-bin/methprimer/methprimer.cgi) and MethPrimer 2.0 (http://www.urogene.org/cgi-bin/methprimer2/MethPrimer.cgi). Only DMRs not overlaping with repetitive sequences were selected, in order to generate highly specific primer sequences. DMR-specific libraries were constructed by two rounds of PCR as previously described (46, 56). Sample preparation and sequencing were performed as previously described (56) using the Roche/454 GS Junior system (Roche Diagnostics, Manheim, Germany) and yield of reads was increased by applying a special filter setting (57). A power analysis was performed to find the minimum number of samples to be analysed (p<0.05 and β<0.2, large effect size).
Methylation analysis was performed using Amplikyzer (58), and average methylation for each amplicon and each CpG site, as well as methylation plots, were retrieved for each sample.
Gene Ontology Analysis
Annotation of the DMRs was performed by searching the nearest genes within 100 kb. The resulting gene list was submitted to DAVID (59) was used for gene ontology, using the whole human genome as backgroun. GSEA (60) was used for gene set enrichment analysis using the Molecular Signatures Database (MsigDB).
Overlap of DMRs with repeats
Repeat tracks produced by RepeatMasker (61) were retrieved by the UCSC Table Browser (62). Separated BED files for the four different types of repeats were generated and intersected with CpG positions. This results in four BED files, each containing CpGs located in the respective repeat regions. CpGs of DMRs located in repeat regions were then identified by intersecting the DMRs in BED format with each of the four files. All operations on BED files are performed by BEDTools (v2.25.0) (63).
We estimated the p-value for over- and underrepresented repeat overlaps by simulating 1 million datasets, consisting of 236 regions each and equal size distribution compared to our DMRs, while each region contains at least 4 CpGs. For each of our 236 DMRs and each region of the simulated datasets, the fraction of repeat-overlapping CpGs was detected. We then compared the mean fraction f for our DMRs and each of the 1 million random sets F’. The empirical p-value for overrepresentation is the fraction of random sets f’ ∈ F’ with f > f’ (symmetrically, for underrepresentation). This was performed independently for each type of repeat.
Epigenetic age-predictor
Let X be the set of Donors, D our six selected DMRs, mx(d) be the average methylation level (if available) for DMR d ∈ D and donor x ∈ X and a(x) its age.
We use the set Ti,j = {(mx(di), mx(dj), a(x)) for x ∈ X} to separately train a ridge regression (64) predictor pi,j for each pair (di, dj) ∈ D2 with 0 <= i < j < |D|, while elements with at least one missing methylation value for x are removed from Ti,j.
Each pi,j is able to predict an age a’i,j(x) = pi,j(mx(di), mx(dj)) given average methylation of x for di and dj. Let then a’(x) be the average of all single predictions for x.
In a second step we use all pairs T’i,j = {(a’(x), a(x)) for x ∈ X} to train an additional simple linear regression model q. The final predicted age for x is is q(a’(x)).
Statistics
For each analysed variable, normality of the distribution and homogeneity of variance were tested before determining the most adequate statistical test to use. Two-sided t-tests were used to compare betweens between two groups. Pearson’s test was used to test correlations between age and normally distributed variables. Spearman’s rank correlation was used when the variables were not normally distributed or presented outliers. Adjustment for confounders was done by partial correlation analysis. All statistical analysis and graph plotting were performed using R 3.3.1 and suitable R packages.
Data Availability
WGBS data was deposited at the European Nucleotide Archive under the accession number PRJEB28044.
Funding
This work was supported by internal funds of the Department of Clinical and Surgical Andrology (CeRA, Münster University Hospital), by the Klinische Forschergruppe e.V., the German Society of Andrology (to JFC), and the German Research Foundation (GR 1547/19-1) and Clinical Research Unit ‘Male Germ Cells’ (CRU326).
Author contributions
S.L. did the experimental study design, sample and data analyses, evaluated clinical parameters, and wrote the manuscript. J.F.C. did the clinical study design, proband recruitment, and data analyses. B.H. did the epigenetic study design and WGBS analyses, data evaluation, and contributed to the manuscript. F.T. contributed to the experimental and clinical study design and the manuscript. K.C. worked on the study design, online questionaire, the ethical vote, and proband recruitment. M.Z. worked on the study design, proband recruitment and examination. E.P. did experimental work, was involved in data interpretation, and worked on the manuscript. S.R. performed the bioinformatics analyses of the WGBS data. C.S. set up the epigenetic clock and did WGBS analyses. S.B. performed DMR analyses. K.R. did the DFI analysis. C.K. worked on the study design, proband recruitment and examination. St.S. worked on the study design, analysed the data and worked on the manuscript. S.K. was responsible for the clinical study design, the ethical permission, and proband examination. J.G. developed the concept of the study, performed clinical and experimental data analyses, and wrote the manuscript.
Competing interests
Authors declare no competing interests.
Data availability
WGBS data was deposited at the European Nucleotide Archive under the accession number PRJEB28044.
Acknowledgments
We would like to thank all men who contributed as probands in this study. We thank Sabine Forsthoff, Claudia Haak, Daniela Hanke, Jolanta Körber, Raphaele Kürten, Elisabeth Lahrmann, Susanne Orlowski, Sonja Barkhaus, Elena Plester, Reinhild Sandhowe, Sabine Strüwing, and Nicole Terwort for technical assistance, Ludger Klein-Hitpass for Illumina sequencing and Elsa Leitão and Michael Zeschnigk for helpful discussions.
Footnotes
↵§ Shared last-authorship