Abstract
In this paper, I build deep neural networks with various structures and hyperparameters to predict human chronological age from open-access biochemical indicators and their specifications in the NHANES database. In total, 1152 neural networks are trained and tested. The networks are trained and tested on incomplete data: missing values in data records are imputed with the mean or median value of the corresponding parameter. I select the best neural networks by validation accuracy, measured by the coefficient of determination and the mean absolute error. The most accurate results are delivered by multilayer networks (6 layers) that include recurrent layers. Neural network types are selected by trial and error. The best networks reached a coefficient of determination (R²) of 0.78 and a mean absolute error of 6.5 years. I also list empirically determined features of neural networks that increase accuracy in the task of chronological age prediction. The obtained results can be considered an approximation of human biological age. The training datasets use the broadest possible selection of parameters: all potentially relevant parameters (926) from the NHANES database are included. Although the networks are trained on incomplete data, they are able to make reasonable predictions (R² > 0.7) from no more than 100 biochemical indicators. Hence, for practical purposes, complete data on each of the 926 indicators are not required, although analyzing the impact of each indicator is useful for theoretical developments.
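The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the author's code, which is available only upon request: the function names `impute_columns`, `r2_score`, and `mean_absolute_error` are hypothetical, missing values are assumed to be encoded as `None`, and the numbers are toy values rather than NHANES data.

```python
from statistics import mean, median


def impute_columns(rows, strategy="mean"):
    """Fill missing values (None) with the per-column mean or median.

    Hypothetical helper: assumes every column has at least one observed
    value; otherwise statistics.mean/median raises StatisticsError.
    """
    fill_fn = {"mean": mean, "median": median}[strategy]
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        # Compute the fill value from the observed entries only.
        observed = [row[j] for row in rows if row[j] is not None]
        fills.append(fill_fn(observed))
    return [[fills[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy records: two indicators per person, None marks a missing value.
records = [[1, None], [2, 4], [9, 8], [None, 6]]
filled_mean = impute_columns(records, "mean")
filled_median = impute_columns(records, "median")

# Toy ages (years) and hypothetical model predictions.
ages = [30, 40, 50]
predicted = [32, 40, 48]
r2 = r2_score(ages, predicted)
mae = mean_absolute_error(ages, predicted)
```

The mean and median fills differ whenever a column is skewed (here the first column), which is presumably why both variants are tried. In practice the fill values would be computed on the training split only and reused for the validation split, so that no validation information leaks into training.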
Data availability
The author confirms that the data supporting the findings of this study are available within the article and in open databases.
Code availability
The code is available upon request.
Notes
This statement is not obviously true for functions representing complex, nonlinear, non-monotonic relationships and must be proved separately.
Strictly speaking, the following must hold: \(Var\left(b\right)\le 2(\sum {cov}_{{y}_{k},b}\frac{\partial N\left(\bar{x}\right)}{\partial {y}_{k}}-\sum {cov}_{{y}_{k},c}\frac{\partial N\left(\bar{x}\right)}{\partial {y}_{k}})+Var(c)\). However, the difference of covariances may be difficult to estimate. Since \(2(\sum {cov}_{{y}_{k},b}\frac{\partial N\left(\bar{x}\right)}{\partial {y}_{k}}-\sum {cov}_{{y}_{k},c}\frac{\partial N\left(\bar{x}\right)}{\partial {y}_{k}})\) should always be positive, this inequality holds whenever \(Var\left(b\right)\le Var(c)\), regardless of the exact value of the difference of covariances.
Also, if Var(ε) is negligibly small (almost all variance of biomarkers is due to aging), then the predictions of neural networks will be almost equal to biological age.
Acknowledgements
I thank Anton Piankov for checking mathematical derivations and for his comments which made them more rigorous and understandable.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflicts of interest
The author declares that there is no conflict of interest.
Cite this article
Kendiukhov, I. AI-based investigation of molecular biomarkers of longevity. Biogerontology 21, 731–744 (2020). https://doi.org/10.1007/s10522-020-09890-y
DOI: https://doi.org/10.1007/s10522-020-09890-y