%0 Journal Article
%A Esra Zihni
%A Vince Istvan Madai
%A Michelle Livne
%A Ivana Galinovic
%A Ahmed A. Khalil
%A Jochen B. Fiebach
%A Dietmar Frey
%T Opening the Black Box of Artificial Intelligence for Clinical Decision Support: A Study Predicting Stroke Outcome
%D 2019
%R 10.1101/19010975
%J medRxiv
%P 19010975
%X Background: State-of-the-art machine learning (ML) artificial intelligence methods are increasingly leveraged in clinical predictive modeling to provide clinical decision support systems to physicians. Modern ML approaches such as artificial neural networks (ANNs) and tree boosting often perform better than more traditional methods like logistic regression. On the other hand, these modern methods yield a limited understanding of the resulting predictions. However, in the medical domain, understanding of applied models is essential, in particular, when informing clinical decision support. Thus, in recent years, interpretability methods for modern ML methods have emerged to potentially allow explainable predictions paired with high performance. Methods: To our knowledge, we present in this work the first explainability comparison of two modern ML methods, tree boosting and multilayer perceptrons (MLPs), to traditional logistic regression methods using a stroke outcome prediction paradigm. Here, we used clinical features to predict a dichotomized 90 days post-stroke modified Rankin Scale (mRS) score. For interpretability, we evaluated clinical features’ importance with regard to predictions using deep Taylor decomposition for MLP, Shapley values for tree boosting and model coefficients for logistic regression. Results: With regard to performance as measured by AUC values on the test dataset, all models performed comparably: Logistic regression AUCs were 0.82, 0.82, 0.79 for three different regularization schemes; tree boosting AUC was 0.81; MLP AUC was 0.81.
Importantly, the interpretability analysis demonstrated consistent results across models by rating age and stroke severity consecutively amongst the most important predictive features. For less important features, some differences were observed between the methods. Conclusions: Our analysis suggests that modern machine learning methods can provide explainability which is compatible with domain knowledge interpretation and traditional method rankings. Future work should focus on replication of these findings in other datasets and further testing of different explainability methods. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This work has received funding from the German Federal Ministry of Education and Research through (1) the grant Centre for Stroke Research Berlin and (2) a Go-Bio grant for the research group PREDICTioN2020 (lead: DF). Author Declarations: All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript: Yes. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived: Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance): Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable: Yes. Data cannot be shared publicly because of data protection laws.
Data might be available from the institutional ethics committee of Charité Universitätsmedizin Berlin (contact via ethikkommission{at}charite.de) for researchers who meet the criteria for access to confidential data.
%U https://www.medrxiv.org/content/medrxiv/early/2019/11/04/19010975.full.pdf