TY  - JOUR
T1  - Improving Pre-eclampsia Risk Prediction by Modeling Individualized Pregnancy Trajectories Derived from Routinely Collected Electronic Medical Record Data
JF  - medRxiv
DO  - 10.1101/2021.03.23.21254178
SP  - 2021.03.23.21254178
AU  - Shilong Li
AU  - Zichen Wang
AU  - Luciana A. Vieira
AU  - Amanda B. Zheutlin
AU  - Boshu Ru
AU  - Emilio Schadt
AU  - Pei Wang
AU  - Alan B. Copperman
AU  - Joanne Stone
AU  - Susan J. Gross
AU  - Eric E. Schadt
AU  - Li Li
Y1  - 2021/01/01
UR  - http://medrxiv.org/content/early/2021/03/24/2021.03.23.21254178.abstract
N2  - Preeclampsia (PE) is a heterogeneous and complex disease associated with rising morbidity and mortality in pregnant women and newborns in the US. Early recognition of patients at risk is a pressing clinical need to significantly reduce the risk of adverse pregnancy outcomes. We assessed whether information routinely collected and stored on women in their electronic medical records (EMR) could enhance the prediction of PE risk beyond what is achieved in standard of care assessments today. We developed a digital phenotyping algorithm to assemble and curate 108,557 pregnancies from EMRs across the Mount Sinai Health System (MSHS), accurately reconstructing pregnancy journeys and normalizing these journeys across different hospital EMR systems. We then applied machine learning approaches to a training dataset from Mount Sinai Hospital (MSH) (N = 60,879) to construct predictive models of PE across three major pregnancy time periods (ante-, intra-, and postpartum). The resulting models predicted PE with high accuracy across the different pregnancy periods, with areas under the receiver operating characteristic curves (AUC) of 0.92, 0.83 and 0.89 at 37 gestational weeks, intrapartum and postpartum, respectively. We observed comparable performance in two independent patient cohorts with diverse patient populations (MSH validation dataset N = 38,421 and Mount Sinai West dataset N = 9,257).
While our machine learning approach identified known risk factors of PE (such as blood pressure, weight and maternal age), it also identified novel PE risk factors, such as complete blood count related characteristics for the antepartum time period and ibuprofen usage for the postpartum time period. Our model not only has utility for earlier identification of patients at risk for PE, but given the prediction accuracy substantially exceeds what is achieved today in clinical practice, our model provides a path for promoting personalized precision therapeutic strategies for patients at risk.
Competing Interest Statement: The authors have declared no competing interest. Funding Statement: No funding support. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: This data usage is approved by the institutional review board (IRB) of the Icahn School of Medicine at Mount Sinai: IRB-17-01245. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The data used for this study are available from Mount Sinai Genomics Inc dba Sema4, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are, however, available from the authors upon reasonable request and with the permission of Mount Sinai Genomics Inc dba Sema4.
ER  -