%0 Journal Article
%A Sundreen Asad Kamal
%A Changchang Yin
%A Buyue Qian
%A Ping Zhang
%T An Interpretable Risk Prediction Model for Healthcare with Pattern Attention
%D 2020
%R 10.1101/2020.07.26.20162479
%J medRxiv
%P 2020.07.26.20162479
%X Background: The availability of massive amounts of data makes clinical predictive tasks possible, and deep learning methods have achieved promising performance on them. However, most existing methods suffer from three limitations: (i) Real-valued events have many missing values; many methods impute the missing values and then train on the imputed values, which may introduce imputation bias and makes model performance highly dependent on imputation accuracy. (ii) Many existing studies take only Boolean-valued medical events (e.g., diagnosis codes) as inputs and ignore real-valued medical events (e.g., lab tests and vital signs), which are more important for acute disease (e.g., sepsis) and mortality prediction. (iii) Existing interpretable models can show which medical events contribute to the output, but cannot give the contributions of patterns among medical events. Methods: In this study, we propose a novel interpretable Pattern Attention model with Value Embedding (PAVE) to predict the risks of certain diseases. PAVE takes the embeddings of various medical events, their values, and the corresponding occurrence times as inputs and leverages a self-attention mechanism to attend to meaningful patterns among medical events for risk prediction tasks. Because only the observed values are embedded into vectors, we do not need to impute the missing values and thus avoid imputation bias.
Moreover, the self-attention mechanism aids model interpretability: the proposed model can indicate which patterns lead to high risk. Results: We conduct sepsis onset prediction and mortality prediction experiments on the publicly available MIMIC-III dataset and our proprietary EHR dataset. The experimental results show that PAVE outperforms existing models. Moreover, by analyzing the self-attention weights, our model outputs meaningful medical event patterns related to mortality. Conclusions: PAVE learns effective medical event representations by incorporating values and occurrence times, which can improve risk prediction performance. Moreover, the presented self-attention mechanism not only captures patients’ health state information but also outputs the contributions of various medical event patterns, which paves the way for interpretable clinical risk prediction. Availability: The code for this paper is available at: https://github.com/yinchangchang/PAVE. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: Not applicable. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: This study uses the MIMIC-III dataset. We are using the MIMIC IRB. This study was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA, USA), the Massachusetts Institute of Technology (Cambridge, MA, USA) and Institute for Infocomm Research (Singapore). Requirement for individual patient consent was waived because the study did not impact clinical care and all protected health information was de-identified.
De-identification was performed in compliance with Health Insurance Portability and Accountability Act (HIPAA) standards in order to facilitate public access to MIMIC-III. Deletion of protected health information (PHI) from structured data sources (e.g., database fields that provide patient name or date of birth) was straightforward. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. We used the publicly available dataset MIMIC-III: https://mimic.physionet.org/
%U https://www.medrxiv.org/content/medrxiv/early/2020/07/29/2020.07.26.20162479.full.pdf