PT - JOURNAL ARTICLE
AU - Surya Krishnamurthy
AU - KS Kapeleshh
AU - Erik Dovgan
AU - Mitja Luštrek
AU - Barbara Gradišek Piletič
AU - Kathiravan Srinivasan
AU - Yu-Chuan Li
AU - Anton Gradišek
AU - Shabbir Syed-Abdul
TI - Machine Learning Prediction Models for Chronic Kidney Disease using National Health Insurance Claim Data in Taiwan
AID - 10.1101/2020.06.25.20139147
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.06.25.20139147
4099 - http://medrxiv.org/content/early/2020/07/30/2020.06.25.20139147.short
4100 - http://medrxiv.org/content/early/2020/07/30/2020.06.25.20139147.full
AB - Background and Objective: Chronic kidney disease (CKD) represents a heavy burden on the healthcare system because of the increasing number of patients, the high risk of progression to end-stage renal disease, and the poor prognosis for morbidity and mortality. The aim of this study is to develop a machine-learning model that uses comorbidity and medication data, obtained from Taiwan's National Health Insurance Research Database, to forecast whether an individual will develop CKD within the next 6 or 12 months, and thus to forecast the prevalence in the population. Methods: Data from 18,000 people with CKD and 72,000 people without a CKD diagnosis, matched by propensity score, together with their medication and comorbidity data from the preceding two years, were used to build a prediction model. A series of approaches were tested, including convolutional neural networks (CNN). 5-fold cross-validation was used to assess the performance metrics of the algorithms. Results: For both the 6-month and 12-month models, the CNN approach performed best, with AUROCs of 0.957 and 0.954, respectively.
The most prominent features in the tree-based models were identified; they included diabetes mellitus, age, gout, and medications such as sulfonamides and angiotensins, which had an impact on the progression of CKD. Conclusions: The model proposed in this study can be a useful tool for policy-makers, helping them predict CKD trends in the population over the next 6 to 12 months. Information provided by this model can enable close monitoring of people at risk, early detection of CKD, better allocation of resources, and patient-centric management.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: The authors acknowledge the financial support from the Slovenian Research Agency (research core funding No. P2-0209). This work is part of the CrowdHEALTH project that has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement no. 727560 (JSI) and the Ministry of Science and Technology under project no. 106-3805-018-110 (TMU).
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: This study has been exempted by the Institutional Review Board of Taipei Medical University beforehand. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes.
The data that support the findings of this study are available on request. The data were taken from Taiwan's National Health Insurance Research Database (NHIRD).