RT Journal Article
SR Electronic
T1 Estimating youth diabetes risk using NHANES data and machine learning
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 19007872
DO 10.1101/19007872
A1 Nita Vangeepuram
A1 Bian Liu
A1 Po-hsiang Chiu
A1 Linhua Wang
A1 Gaurav Pandey
YR 2020
UL http://medrxiv.org/content/early/2020/08/12/19007872.abstract
AB Background: Prediabetes and diabetes mellitus (preDM/DM) have become alarmingly prevalent among youth in recent years. However, simple questionnaire-based screening tools to reliably assess diabetes risk are only available for adults, not youth. Methods: As a first step in developing such a tool, we used a large-scale dataset from the National Health and Nutrition Examination Survey (NHANES) to examine the performance of a published pediatric clinical screening guideline in identifying youth with preDM/DM based on American Diabetes Association diagnostic biomarkers. We assessed the agreement between the clinical guideline and biomarker criteria using established evaluation measures (sensitivity, specificity, positive/negative predictive value, F-measure for the positive/negative preDM/DM classes, and Kappa). We also compared the performance of the guideline to that of machine learning (ML)-based preDM/DM classifiers derived from the NHANES dataset. Results: Approximately 29% of the 2858 youth in our study population had preDM/DM based on biomarker criteria. The clinical guideline had a sensitivity of 43.1% and specificity of 67.6%, positive/negative predictive values of 35.2%/74.5%, positive/negative F-measures of 38.8%/70.9%, and Kappa of 0.1 (95% CI: 0.06-0.14). The performance of the guideline varied across demographic subgroups. Some ML-based classifiers performed comparably to or better than the screening guideline, especially in identifying youth with preDM/DM (p=5.23×10⁻⁵). Conclusions: We demonstrated that a recommended pediatric clinical screening guideline did not perform well in identifying preDM/DM status among youth.
Additional work is needed to develop a simple yet accurate screener for youth diabetes risk, potentially by using advanced ML methods and a wider range of clinical and behavioral health data.
Key Messages: As a first step in developing a youth diabetes risk screening tool, we used a large-scale dataset from the National Health and Nutrition Examination Survey (NHANES) to examine the performance of a published pediatric clinical screening guideline in identifying youth with prediabetes/diabetes based on American Diabetes Association diagnostic biomarkers. In this cross-sectional study of youth, we found that the screening guideline correctly identified 43.1% of youth with prediabetes/diabetes, the performance of the guideline varied across demographic subgroups, and machine learning based classifiers performed comparably to or better than the screening guideline in identifying youth with prediabetes/diabetes. Additional work is needed to develop a simple yet accurate screener for youth diabetes risk, potentially by using advanced ML methods and a wider range of clinical and behavioral health data.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: The research presented in this manuscript was supported by a National Institutes of Health grant (R01GM114434) and an IBM Faculty award to author GP and by a Cigna Foundation grant (10005177) awarded to author NV.
Author Declarations: All relevant ethical guidelines have been followed and any necessary IRB and/or ethics committee approvals have been obtained. Yes. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. Any clinical trials involved have been registered with an ICMJE-approved registry such as ClinicalTrials.gov and the trial ID is included in the manuscript. Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant Equator, ICMJE or other checklist(s) as supplementary files, if applicable. Yes.
Only publicly available NHANES data were used in this study. These data are available from https://wwwn.cdc.gov/nchs/nhanes/.
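
The evaluation measures named in the abstract (sensitivity, specificity, positive/negative predictive value, per-class F-measure, and Kappa) can all be derived from a 2×2 table that cross-classifies the guideline's screening result against the biomarker-based status. The following is a minimal sketch of those textbook definitions, not the authors' code. The names `ConfusionTable` and `screening_metrics` are hypothetical, the counts in the example are invented for illustration and are not the study's data, and the F-measure is assumed to be the usual harmonic mean of precision and recall for each class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionTable:
    """Counts of guideline result vs. biomarker status (hypothetical container)."""
    tp: int  # guideline positive, biomarker positive
    fn: int  # guideline negative, biomarker positive
    fp: int  # guideline positive, biomarker negative
    tn: int  # guideline negative, biomarker negative


def screening_metrics(t: ConfusionTable) -> dict:
    """Return standard agreement measures for a binary screening test."""
    n = t.tp + t.fn + t.fp + t.tn

    sensitivity = t.tp / (t.tp + t.fn)  # recall for the positive class
    specificity = t.tn / (t.tn + t.fp)  # recall for the negative class
    ppv = t.tp / (t.tp + t.fp)          # precision for the positive class
    npv = t.tn / (t.tn + t.fn)          # precision for the negative class

    # Per-class F-measure: harmonic mean of that class's precision and recall.
    f_pos = 2 * ppv * sensitivity / (ppv + sensitivity)
    f_neg = 2 * npv * specificity / (npv + specificity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (t.tp + t.tn) / n
    expected = (
        (t.tp + t.fp) * (t.tp + t.fn) + (t.tn + t.fn) * (t.tn + t.fp)
    ) / (n * n)
    kappa = (observed - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_pos": f_pos,
        "f_neg": f_neg,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Invented counts for illustration only; NOT taken from the study.
    example = ConfusionTable(tp=40, fn=60, fp=30, tn=170)
    for name, value in screening_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

This sketch works on unweighted counts. NHANES is a complex survey, so an analysis of the actual data may need to account for sampling weights; the abstract does not say whether the reported values are weighted.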