Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records

Riccardo Miotto; Li Li; Brian A Kidd; Joel T Dudley

doi:10.1038/srep26094

Sci Rep. 2016 May 17:6:26094. doi: 10.1038/srep26094.

Authors

Riccardo Miotto¹ ² ³, Li Li¹ ² ³, Brian A Kidd¹ ² ³, Joel T Dudley¹ ² ³

Affiliations

¹ Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
² Harris Center for Precision Wellness, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
³ Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Abstract

Secondary use of electronic health records (EHRs) promises to advance clinical research and better inform clinical decision making. Challenges in summarizing and representing patient data prevent widespread practice of predictive modeling using EHRs. Here we present a novel unsupervised deep feature learning method to derive a general-purpose patient representation from EHR data that facilitates clinical predictive modeling. In particular, a three-layer stack of denoising autoencoders was used to capture hierarchical regularities and dependencies in the aggregated EHRs of about 700,000 patients from the Mount Sinai data warehouse. The result is a representation we name "deep patient". We evaluated this representation as broadly predictive of health states by assessing the probability that patients would develop various diseases. We performed the evaluation using 76,214 test patients and 78 diseases from diverse clinical domains and temporal windows. Our results significantly outperformed those achieved using representations based on raw EHR data and alternative feature learning strategies. Prediction performance for severe diabetes, schizophrenia, and various cancers was among the highest. These findings indicate that deep learning applied to EHRs can derive patient representations that offer improved clinical predictions, and could provide a machine learning framework for augmenting clinical decision systems.
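
To make the "three-layer stack of denoising autoencoders" concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not give the training details, so everything specific here is an assumption chosen for illustration: masking noise as the corruption, sigmoid units, tied encoder/decoder weights, cross-entropy reconstruction loss, plain stochastic gradient descent, the layer sizes, and the synthetic binary "patient" vectors. The names `DenoisingAutoencoder`, `train_stack`, and `represent` are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class DenoisingAutoencoder:
    """One layer: corrupt the input, encode it, and reconstruct the clean input."""

    def __init__(self, n_in, n_hidden, noise=0.05, lr=0.1, rng=None):
        self.rng = rng or random.Random(0)
        self.n_in, self.n_hidden = n_in, n_hidden
        self.noise, self.lr = noise, lr  # assumed values, not from the abstract
        self.W = [[self.rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b = [0.0] * n_hidden  # encoder bias
        self.c = [0.0] * n_in      # decoder bias

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
                for row, bj in zip(self.W, self.b)]

    def decode(self, h):
        # Tied weights (an assumption): the decoder reuses W transposed.
        return [sigmoid(sum(self.W[j][i] * h[j] for j in range(self.n_hidden))
                        + self.c[i])
                for i in range(self.n_in)]

    def train_step(self, x):
        # Masking noise: each input is set to zero with probability `noise`.
        x_tilde = [0.0 if self.rng.random() < self.noise else xi for xi in x]
        h = self.encode(x_tilde)
        z = self.decode(h)
        # Cross-entropy with a sigmoid output: the gradient at the decoder
        # pre-activation is (reconstruction - clean input).
        d_out = [zi - xi for zi, xi in zip(z, x)]
        d_hid = [h[j] * (1.0 - h[j])
                 * sum(self.W[j][i] * d_out[i] for i in range(self.n_in))
                 for j in range(self.n_hidden)]
        for j in range(self.n_hidden):
            for i in range(self.n_in):
                # Tied weights receive both encoder and decoder gradients.
                grad = d_hid[j] * x_tilde[i] + d_out[i] * h[j]
                self.W[j][i] -= self.lr * grad
            self.b[j] -= self.lr * d_hid[j]
        for i in range(self.n_in):
            self.c[i] -= self.lr * d_out[i]


def train_stack(data, hidden_sizes, epochs=20, seed=0):
    """Greedy layer-wise training: each layer learns from the previous layer's codes."""
    rng = random.Random(seed)
    layers, current = [], data
    for n_hidden in hidden_sizes:
        dae = DenoisingAutoencoder(len(current[0]), n_hidden, rng=rng)
        for _ in range(epochs):
            for x in current:
                dae.train_step(x)
        layers.append(dae)
        # Uncorrupted encodings become the training data for the next layer.
        current = [dae.encode(x) for x in current]
    return layers


def represent(layers, x):
    """Pass one raw patient vector through every encoder in order."""
    for dae in layers:
        x = dae.encode(x)
    return x


# Synthetic stand-in for aggregated EHR features (purely illustrative).
data_rng = random.Random(1)
patients = [[float(data_rng.random() < 0.3) for _ in range(12)]
            for _ in range(40)]

stack = train_stack(patients, hidden_sizes=[8, 8, 8])
deep = represent(stack, patients[0])
print(len(stack), len(deep))
```

In this sketch the output of the last encoder plays the role of the patient representation; a separate supervised classifier would then be trained on such vectors to estimate disease probabilities, which is outside the scope of the example.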

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Biostatistics
Electronic Data Processing*
Electronic Health Records*
Humans
Machine Learning*
Prognosis*
