RT Journal Article
SR Electronic
T1 Identifying Sequential Complication and Mortality Patterns in Diabetes Mellitus: Comparisons of Machine Learning Methodologies
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2020.12.21.20248646
DO 10.1101/2020.12.21.20248646
A1 Jiandong Zhou
A1 Sharen Lee
A1 Wing Tak Wong
A1 Tong Liu
A1 Leonardo Roever
A1 Kamalan Jeevaratnam
A1 William KK Wu
A1 Ian Chi Kei Wong
A1 Gary Tse
A1 Qingpeng Zhang
YR 2020
UL http://medrxiv.org/content/early/2020/12/26/2020.12.21.20248646.abstract
AB Background: Diabetes mellitus-related complications adversely affect the quality of life. Better risk-stratified care through mining of sequential complication patterns is needed to enable early detection and prevention. Methods: Univariable and multivariate logistic regression was used to identify significant variables that can predict mortality. A sequence analysis method termed Prefixspan was applied to identify the most common couple, triple, quadruple, quintuple and sextuple sequential complication patterns in the directed comorbidity pathology network. A knowledge enhanced CPT+ (KCPT+) sequence prediction model is developed to predict the next possible outcome along the progression trajectories of diabetes-related complications. Findings: A total of 14,144 diabetic patients (51% males) were included. Acute myocardial infarction (AMI) without known ischaemic heart disease (IHD) (odds ratio [OR]: 2.8, 95% CI: [2.3, 3.4]), peripheral vascular disease (OR: 2.3, 95% CI: [1.9, 2.8]), dementia (OR: 2.1, 95% CI: [1.8, 2.4]), and IHD with AMI (OR: 2.4, 95% CI: [2.1, 2.6]) are the most important multivariate predictors of mortality.
KCPT+ shows high accuracy in predicting mortality (F1 score 0.90, AUC 0.88), osteoporosis (F1 score 0.86, AUC 0.82), ophthalmological complications (F1 score 0.82, AUC 0.82), IHD with AMI (F1 score 0.81, AUC 0.85) and neurological complications (F1 score 0.81, AUC 0.83) with a particular prior complication sequence. Interpretation: Sequence analysis identifies the most common pattern characteristics of disease-related complications efficiently. The proposed sequence prediction model is accurate and enables clinicians to diagnose the next complication earlier, provide better risk-stratified care, and devise efficient treatment strategies for diabetes mellitus patients. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: None. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study was approved by The Joint Chinese University of Hong Kong - New Territories East Cluster Clinical Research Ethics Committee and Institutional Review Board of the University of Hong Kong/Hospital Authority Hong Kong West Cluster. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The anonymized dataset has been deposited in the following repository's URL: https://zenodo.org/record/4382440#.X-DRTNgzaUl