Structured Abstract
Background Preeclampsia (PE) is one of the leading factors in maternal and perinatal mortality and morbidity worldwide. Delivery timing is key to balancing the risk between severe maternal and neonatal morbidities in pregnancies complicated by PE.
Method In this study, we constructed and validated first-of-their-kind deep learning models that can forecast the time to delivery among patients with PE using electronic health records (EHR) data. The discovery cohort consisted of 1,533 preeclamptic pregnancies, including 374 cases of early-onset preeclampsia (EOPE), that were delivered at University of Michigan Health System (UM) between 2015 and 2021. The validation cohort contained 2,172 preeclamptic pregnancies (including 547 EOPE) from University of Florida Health System (UF) in the same period. Using Cox-nnet, a neural network-based prognosis prediction algorithm, we built baseline models of all PE patients and of the subset of EOPE patients, using 47 features on demographics, medical history, comorbidities, the severity of PE, and gestational age of initial PE diagnosis. We also built full models using 62 features, combining those in baseline models and additional features on lab tests and vital signs, on the same PE patients and EOPE subset. The models were re-trained and re-validated using reduced sets of the most important features, to improve their interpretability and clinical applicability.
Findings The 7-feature baseline model on all PE patients reached C-indices of 0·73, 0·74 and 0·73 on the UM training, hold-out testing and UF validation datasets respectively, whereas the 12-feature full model had improved C-indices of 0·78, 0·79 and 0·74 on the same datasets. For the EOPE cases, the 6-feature baseline model achieved C-indices of 0·67, 0·68 and 0·63 on the training, hold-out testing and UF validation datasets respectively, while its 13-feature full model counterpart reached C-indices of 0·74, 0·76 and 0·67 on the same datasets. Besides confirming the prognostic importance of gestational age at the time of diagnosis and of sPE status, all four models identified parity and PE in prior pregnancies as important features, which are not in the current guidelines for PE delivery timing. Laboratory results and vital signs such as platelet count, the standard deviation of respiratory rate within a 5-day observation window, and mean diastolic blood pressure are critical for increasing the accuracy of predicting time to delivery, in addition to testing aspartate aminotransferase and creatinine levels. For EOPE time to delivery prediction, comorbidities such as pulmonary circulation disorders and coagulopathy as defined in the Elixhauser Comorbidity Index are important to consider.
Interpretation We set up a user-friendly web interface to allow personalized PE time to delivery prediction. The app is available at http://garmiregroup.org/PE-prognosis-predictor/app. These actionable models may help providers to plan antepartum care in these pregnancies and significantly improve the management and clinical outcomes of pregnancies affected by PE.
Funding This study is funded by the National Institutes of Health.
Evidence before this study Determining the optimal delivery time is essential in preeclampsia management to balance the risk of maternal and neonatal morbidities. Current clinical guidelines for delivery timing in preeclampsia, according to the American College of Obstetricians and Gynecologists (ACOG), mainly depend on the gestational age at diagnosis and the severity of PE. However, the current knowledge does not provide a quantitative prediction of patients’ risk of delivery, nor does it discuss the effect of some important phenotypic factors (e.g., patients’ demographics, lifestyles and comorbidities) on delivery time. Rather, according to a systematic review published in 2021, 18 prior studies predicted the timing of delivery for preeclampsia using biomarkers, which are yet to be implemented in routine checkups in pregnancy. On the other hand, EHR data are routinely collected but often overlooked, and have huge potential for addressing challenging time to delivery prediction problems such as those in PE.
Added value of this study To our knowledge, these are the first deep-learning-based time to delivery prediction models for PE and EOPE patients using routine clinical and demographic variables. We enlist the quantitative values of critical EHR features informative of delivery time among PE patients, many of which are newly reported clinical features. We disseminate these models by the web tool “PE time to delivery Predictor”.
Implications of all the available evidence All models are externally validated with a large EHR dataset from the University of Florida Health System. Adopting these models may provide clinicians and patients with valuable management plans to predict and prepare for the best delivery times of pregnancies complicated by PE, especially for EOPE cases in which consequences of early delivery are more significant. Further prospective investigation of these models’ performance is necessary to provide feedback and potential improvement of these models.
Introduction
Preeclampsia (PE) is a pregnancy complication affecting 2% to 8% of all pregnancies worldwide and is a leading cause of maternal, fetal, and neonatal mortality and morbidity1,2. PE is defined by new-onset hypertension after 20 weeks of gestation and the presence of proteinuria, and/or other signs of end organ damage. PE is a diverse syndrome with various subtypes along the spectrum of gestational hypertensive disorders.3 It can be divided into early-onset PE (EOPE, diagnosed before 34 weeks of pregnancy) or late-onset PE (diagnosed after 34 weeks of pregnancy), and into PE with severe features (sPE) or PE without severe features (nsPE)4,5. Failure to properly manage PE can lead to severe maternal morbidities, long-term adverse health outcomes, and even maternal death, and the only known cure for PE is delivery of the placenta6,7.
This, especially in cases of EOPE8, creates a dilemma as earlier delivery can potentially prevent severe morbidities including maternal seizure, stroke, organ dysfunction and intrauterine fetal demise, but may lead to premature birth and subsequent neonatal complications9,10. To balance the risks to both mother and baby, current clinical management of PE includes supportive blood pressure management and prophylaxis for maternal seizures, and a two-dose intramuscular course of betamethasone to augment fetal lung maturation11.
Clinical guidelines for delivery timing in preeclampsia mainly depend on gestational age at diagnosis and disease severity; generally, delivery is immediate for patients past 37 weeks of gestation or past 34 weeks with sPE. For patients at less than 34 weeks of gestation who are diagnosed with sPE, pregnancy should continue only if intensive care is available and no severe maternal morbidity is suspected11–13. In cases in which delivery is not immediately indicated at the time of PE diagnosis, there is currently no way to know when patients might escalate to a state that requires immediate delivery. Additionally, known risk factors for PE, such as the patient’s demographics, social status, lifestyle, and other comorbidities may also influence the timing of delivery, but they are not discussed in the management guidelines. A comprehensive quantitative model using patient-level data to assess time to delivery among PE patients would help clinicians to make management decisions, particularly among the challenging EOPE cases.
Towards this goal, we conducted the first study to predict patient delivery time after the diagnosis of PE using electronic health records (EHR) data. We utilized the state-of-the-art deep learning-based prognosis prediction model, Cox-nnet, which we previously developed14–16. Cox-nnet methods have consistently shown better predictive performances than the conventional Cox-PH models under a variety of conditions, including on EHR data14. Our objectives were: (1) to predict the time to delivery interval among PE patients, and an EOPE sub-cohort, from the time of initial diagnosis by constructing and validating deep-learning models utilizing EHR data; and (2) to assess the quantitative contributions of critical EHR features informative of delivery time among PE patients, including those EOPE patients.
Methods
Data Source
We obtained the discovery cohort from Michigan Medicine (UM), the academic health care system of the University of Michigan, Ann Arbor. Data usage was approved by the Institutional Review Board (IRB) of the University of Michigan Medical School (HUM#00168171). The validation cohort was obtained from the Integrated Data Repository database at the University of Florida (UF). Data usage was approved by the IRB of the University of Florida (#IRB201601899). We extracted all obstetric records with at least one PE diagnosis based on ICD-10 diagnosis codes (Supplementary Table 1). We excluded patients with the following conditions: Hemolysis, Elevated Liver Enzymes, and Low Platelet (HELLP) syndrome and eclampsia, for which iatrogenic delivery is ubiquitously induced within 48 hours of diagnosis; chronic hypertension with superimposed PE, whose onset may occur before week 20 and whose diagnostic criteria are less clear; and postpartum PE, which develops only after delivery. To confirm the accuracy of the initial PE diagnosis time, we also removed patients transferred from other institutions by excluding those with no visit record within 180 days prior to the first diagnosis of PE.
EHR Feature Engineering
The EHR provided baseline individual features, vital signs, and lab values obtained after PE diagnosis. Baseline features included age, race, ethnicity, smoking and drug use status, medical history, pregnancy characteristics, and comorbidities at the earliest PE diagnosis. Pregnancy characteristics included parity, number of fetuses, gestational age, and PE severity at initial diagnosis. Comorbidities were grouped into 29 categories using the Elixhauser Comorbidity Index17. The observational window for lab results and vital signs was 5 days before the day of the initial PE diagnosis. Only the first results of repeated tests were used to avoid intervention/treatment effect, resulting in 13 lab test features (10 hematological, 2 liver function, 1 urine). Vital signs were collected within the 5-day window, and summary statistics of systolic blood pressure (SBP), diastolic blood pressure (DBP), and respiratory rate (RR) measures were included (max, min, mean, standard deviation), as done in previous work18. As a result, 62 features were kept for initial analysis (Supplementary Table 2). Details of feature selection and cleaning are described in Supplementary Methods.
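As a rough illustration of the windowing described above, the following Python sketch computes the four summary statistics for one vital sign and picks the first lab result inside the window. The function names, the (timestamp, value) input layout, the choice to include the diagnosis time itself in the window, and the use of the sample standard deviation are all assumptions made for illustration; the actual feature cleaning is described in Supplementary Methods.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def _in_window(records, diagnosis_time, window_days):
    """Keep (timestamp, value) pairs from the window ending at diagnosis."""
    start = diagnosis_time - timedelta(days=window_days)
    return [(t, v) for t, v in records if start <= t <= diagnosis_time]


def summarize_vital(measurements, diagnosis_time, window_days=5):
    """Return max, min, mean and SD of one vital sign, or None if no data."""
    values = [v for _, v in _in_window(measurements, diagnosis_time, window_days)]
    if not values:
        return None
    return {
        "max": max(values),
        "min": min(values),
        "mean": mean(values),
        # stdev needs two or more points; report 0.0 for a single reading
        "sd": stdev(values) if len(values) > 1 else 0.0,
    }


def first_lab_result(results, diagnosis_time, window_days=5):
    """Return the earliest lab value in the window, or None if there is none."""
    kept = sorted(_in_window(results, diagnosis_time, window_days))
    return kept[0][1] if kept else None


# Hypothetical diastolic blood pressure readings around a diagnosis time
diagnosis = datetime(2020, 1, 10, 12, 0)
dbp = [
    (datetime(2020, 1, 3, 9, 0), 70.0),   # before the window, ignored
    (datetime(2020, 1, 8, 9, 0), 80.0),
    (datetime(2020, 1, 9, 9, 0), 90.0),
    (datetime(2020, 1, 11, 9, 0), 100.0),  # after diagnosis, ignored
]
dbp_summary = summarize_vital(dbp, diagnosis)
```

Taking only the earliest value for labs mirrors the stated aim of avoiding intervention or treatment effects on repeated tests.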
Fully-connected Cox-nnet neural network models
We developed 4 models to predict the time to delivery of PE patients: the PE baseline, PE full, EOPE baseline and EOPE full models. The baseline models include demographics, medical history, comorbidities, the severity of PE, and the gestational age of initial PE diagnosis. The full models incorporated all features from the baseline models, with additional lab results and vital signs collected in the observation window. EOPE models were built and tested on patients with PE onset time before 34 weeks of gestation. We constructed all models using the Cox-nnet v2 algorithm (Supplementary Figure 1)14. In this study, we adopted the model to predict the time from PE diagnosis to delivery. To ensure the stability of the models, the discovery dataset was divided into a training set (80%) and a hold-out testing set (20%).
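For readers unfamiliar with Cox-based neural networks, the objective can be sketched as follows. This assumes the standard Cox partial log-likelihood with a single hidden layer; the symbols (hidden-layer weights W and bias b, activation G, output coefficients β, event indicator δ) are our notation for illustration, and the regularization and training details of Cox-nnet v2 are not reproduced here.

```latex
% Prognosis index (log hazard ratio) of patient i with feature vector x_i
\theta_i = \boldsymbol{\beta}^{\top} G(\mathbf{W}\mathbf{x}_i + \mathbf{b})

% Partial log-likelihood, summed over patients with an observed delivery
\ell(\boldsymbol{\beta}, \mathbf{W}, \mathbf{b}) =
  \sum_{i:\,\delta_i = 1}
  \left[ \theta_i - \log \sum_{j:\, t_j \ge t_i} \exp(\theta_j) \right]
```

Here t_i is the time from PE diagnosis to delivery, and a larger θ_i corresponds to a higher hazard of delivery, that is, a shorter expected time to delivery.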
Reduced feature representation from the Cox-nnet models
To derive a subset of clinically significant and easily interpretable features, we reduced Cox-nnet features based on both their importance scores and significance levels. To do so, we first selected the top 15 most important features based on their average permutation importance scores generated by Cox-nnet models19. Then we fit each of these 15 features individually with a univariate Cox-PH model and kept those features with statistical significance (log-rank p-value <0·05). We rebuilt the clinically informative Cox-nnet models with the reduced set of features, in exactly the same way as the models using all initial input features.
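The two-step reduction can be sketched in Python as below. This is a minimal illustration that takes the permutation importance scores and univariate p-values as precomputed inputs; the function name, the input dictionaries, the toy values, and the choice to rank by the absolute value of the mean importance (since scores can be negative) are assumptions, not the authors' code.

```python
from statistics import mean


def reduce_features(importance_runs, univariate_p, top_k=15, alpha=0.05):
    """Keep the top_k features by mean importance, then filter by p-value.

    importance_runs: feature name -> list of permutation importance scores
    univariate_p:    feature name -> p-value from a univariate Cox-PH fit
    """
    avg = {name: mean(scores) for name, scores in importance_runs.items()}
    # Assumption: rank by magnitude so strongly negative scores are retained
    ranked = sorted(avg, key=lambda name: abs(avg[name]), reverse=True)
    return [name for name in ranked[:top_k] if univariate_p[name] < alpha]


# Toy inputs with hypothetical feature names and values
toy_importance = {
    "ga_at_diagnosis": [0.30, 0.34],
    "spe": [0.20, 0.22],
    "platelet": [-0.10, -0.12],
    "noise": [0.01, -0.01],
}
toy_p = {"ga_at_diagnosis": 0.001, "spe": 0.01, "platelet": 0.20, "noise": 0.50}
selected = reduce_features(toy_importance, toy_p, top_k=3)
```

In this toy case the third-ranked feature passes the importance cut but is dropped by the significance filter, which is the behavior the second step is meant to provide.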
Interactive Web Application for Easy Model Validation
To disseminate the models for public use, we containerized the pre-trained Cox-nnet model into a Docker-based web application using R shiny20. This allows the users to access the models easily through a local web interface and get prediction results quickly. This app contains two main panels: the individual prediction panel and the group prediction panel. Using pre-trained models, the individual prediction panel calculates the prognosis index (PI) score of a single new patient, marking its positions and percentiles in a distribution plot of PIs within the UM discovery cohort. The group panel takes in a group of new patients and returns predicted PIs and percentiles of their PIs in a table. The shiny app is available at http://garmiregroup.org/PE-prognosis-predictor/app
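The percentile reported for a new patient can be illustrated with a short Python sketch. It assumes the percentile is the share of discovery-cohort PI scores at or below the new score; the app's exact definition is not stated here, and the function name and toy scores are hypothetical.

```python
from bisect import bisect_right


def pi_percentile(new_pi, reference_pis):
    """Percent of reference PI scores that are less than or equal to new_pi."""
    ordered = sorted(reference_pis)
    return 100.0 * bisect_right(ordered, new_pi) / len(ordered)


# Toy reference distribution standing in for the discovery cohort
reference = [-1.0, -0.5, 0.0, 0.5, 1.0]
example_percentile = pi_percentile(0.0, reference)
```

A group prediction is then just this function applied to each patient's PI in turn.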
Role of the funding source
The sponsor of the study had no role in the study design, data collection, data analysis, data interpretation, writing of the manuscript, or decision to submit the manuscript for publication.
Results
Cohort characteristics
The discovery cohort contains EHR records from 1,533 unique preeclamptic pregnancies, including 374 pregnancies complicated by EOPE, from Michigan Medicine between 2015 and 2022. Patients with HELLP syndrome, chronic hypertension with superimposed PE, and postpartum PE were removed from the cohort. Additionally, patients transferred from clinics outside of Michigan Medicine were excluded from the cohort. Following the same inclusion/exclusion criteria, 2,249 unique preeclamptic pregnancies (547 with EOPE) from the University of Florida (UF) between 2015 and 2022 were identified as the validation cohort (Figure 1). Summaries of the patient characteristics of these cohorts are shown in Tables 1 and 2.
The baseline prediction model of time to delivery interval among PE patients
We built the baseline model using 47 variables including patient demographics, medical history, comorbidities, PE diagnosis time, and severity (Supplementary Table 2). We randomly split the discovery dataset into a training set (80%) and a hold-out testing set (20%). We then built the survival prediction model using the Cox-nnet (version 2) algorithm14 (Supplementary Figure 1). Cox-nnet (version 2) is a multilayer perceptron prognosis prediction model based on Cox Proportional Hazards regression, suitable for EHR data prediction (Methods). The resulting model is predictive with C-indices of 0·73, 0·72, and 0·71 in the UM training, UM hold-out testing, and UF validation cohorts, respectively (Figure 2A).
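For reference, the C-index reported here measures how often the model ranks a pair of patients correctly: the patient with the higher prognosis index should deliver sooner. The following Python sketch implements Harrell's C-index in its simplest pairwise form; the function name and toy data are illustrative, and it is not the evaluation code used in the study.

```python
def concordance_index(times, risk_scores, events=None):
    """Harrell's C-index: higher risk score should pair with shorter time.

    events marks observed deliveries (1) versus censored records (0);
    when omitted, every record is treated as an observed event.
    """
    n = len(times)
    if events is None:
        events = [1] * n
    concordant = 0.0
    comparable = 0
    for i in range(n):
        for j in range(n):
            # A pair is comparable if i has an observed event before time j
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in score count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: days from diagnosis to delivery and hypothetical PI scores
days = [2, 5, 9, 14]
pi_scores = [1.8, 0.9, 1.1, -0.4]
c_index = concordance_index(days, pi_scores)
```

In the toy data five of the six comparable pairs are ordered correctly, so the C-index is 5/6; a value of 0·5 corresponds to random ranking and 1·0 to perfect ranking.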
To enhance the clinical utility of the Cox-nnet model, we reduced the number of predictive features. We selected the top 15 most important features based on their average permutation importance scores in the Cox-nnet model19, followed by univariate Cox-PH fitting for each to keep only the features with statistical significance (log-rank p-value <0·05)19. This procedure resulted in 7 significant features, which we used to rebuild the “clinically informative Cox-nnet baseline model”. It reaches high predictability of the time to delivery, with median C-index scores of 0·73, 0·74, and 0·73 on the UM training, hold-out testing, and UF validation datasets respectively (Figure 2A). We stratified patients into 3 risk groups by the quartiles of predicted PI scores from the reduced model: high-risk (upper quartile), intermediate-risk (interquartile), and low-risk (lower quartile) groups. The survival curves of the time to delivery interval for these three risk groups display significant differences (log-rank p-value < 0·0001) on both the hold-out testing set (Figure 2B) and validation set (Figure 2C), confirming the strong discriminatory power of the PI score.
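The quartile-based stratification can be sketched in Python as follows. The function name, the group labels, the option to take cut-offs from a separate reference set, and the inclusive quantile method are assumptions for illustration; the text does not state which dataset supplied the cut-offs or how ties at a cut-off were handled.

```python
from statistics import quantiles


def risk_groups(pi_scores, reference=None):
    """Label each PI score as low, intermediate or high risk by quartile.

    Cut-offs come from `reference` if given, otherwise from pi_scores.
    """
    source = pi_scores if reference is None else reference
    q1, _, q3 = quantiles(source, n=4, method="inclusive")
    labels = []
    for pi in pi_scores:
        if pi > q3:
            labels.append("high")          # upper quartile
        elif pi < q1:
            labels.append("low")           # lower quartile
        else:
            labels.append("intermediate")  # between the two cut-offs
    return labels


# Toy PI scores; real scores would come from the trained model
toy_scores = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
toy_labels = risk_groups(toy_scores)
```

Passing the training-set scores as `reference` would apply fixed cut-offs to a hold-out or external cohort instead of recomputing them per dataset.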
The seven features in the clinically informative baseline model included those that shorten the time to delivery and those that extend it (Figure 2D; Table 3). In descending order of importance scores, the features that shorten the time to delivery are: gestational age at diagnosis, sPE, uncomplicated pregestational diabetes mellitus, and parity. Conversely, features extending the time to delivery are: PE in a prior pregnancy, increasing maternal age, and comorbid valvular disease. To demonstrate the associations of these important features with time to delivery, we dichotomized patient survival in the hold-out testing set by the median value of each feature (Supplementary Figure 2). All features, except maternal age, show significant differences (log-rank p-value < 0·05) between the dichotomized survival groups. We further examined the relationship of the top 3 features (gestational age at diagnosis, sPE, and history of PE in prior pregnancy) with the gestational age at delivery and time to delivery (days) using the UM discovery set (Figure 2E-2J). Later gestational age at diagnosis is associated with a later gestational age at delivery (Figure 2E), but a shorter time to delivery (Figure 2H). A higher percentage of patients with earlier gestational age at delivery (Figure 2F) and shorter time to delivery (Figure 2I) are diagnosed with sPE. Among deliveries at earlier (<32 weeks) gestational ages, the percentages of patients with PE in prior pregnancies are significantly higher (Figure 2G). However, the percentages of prior PE fluctuate with respect to time to delivery (Figure 2J).
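The median dichotomization and log-rank comparison can be illustrated with the Python sketch below. It assumes every delivery is observed (no censoring), assigns values equal to the median to the lower group, and uses the standard two-group log-rank statistic with one degree of freedom; the function names and toy data are hypothetical and this is not the authors' analysis code.

```python
from math import erfc, sqrt
from statistics import median


def split_by_median(feature_values, times):
    """Split delivery times into groups at or below vs above the median."""
    cut = median(feature_values)
    low = [t for v, t in zip(feature_values, times) if v <= cut]
    high = [t for v, t in zip(feature_values, times) if v > cut]
    return low, high


def logrank_p(times_a, times_b):
    """Two-group log-rank p-value, assuming no censored observations."""
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted(set(times_a) | set(times_b)):
        at_risk_a = sum(1 for x in times_a if x >= t)
        at_risk_b = sum(1 for x in times_b if x >= t)
        n = at_risk_a + at_risk_b
        d_a = sum(1 for x in times_a if x == t)
        d = d_a + sum(1 for x in times_b if x == t)
        observed_minus_expected += d_a - d * at_risk_a / n
        if n > 1:  # hypergeometric variance of events in group a
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 1.0
    chi2 = observed_minus_expected ** 2 / variance
    return erfc(sqrt(chi2 / 2.0))  # chi-square survival function, 1 df


# Toy example: a feature whose high values go with short times to delivery
feature = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
days = [14, 13, 12, 11, 10, 5, 4, 3, 2, 1]
low_group, high_group = split_by_median(feature, days)
p_value = logrank_p(low_group, high_group)
```

With real EHR data a survival analysis library that handles censoring would be the safer choice; the hand-written version is only meant to make the calculation explicit.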
The full model of time to delivery among PE patients
We next investigated the contribution to time to delivery from all 62 variables, including the 47 baseline variables above and an additional 15 laboratory testing results and vital signs obtained in the 5-day observation window before the time of diagnosis (Supplementary Table 2). We performed model construction, validation, and feature reduction for clinical use in the same way as for the baseline model. As a result, 12 top features were kept in the clinically informative full model (Table 4). This model shows higher predictive accuracy of time to delivery compared to the seven-feature baseline model, with median C-index scores of 0·78, 0·79, and 0·74 in the training, testing, and validation datasets respectively. The difference between survival curves of the high-, intermediate- and low-risk groups stratified by predicted PI scores from the reduced model has log-rank p-values close to 0 in the hold-out test set (Figure 3B) and validation set (Figure 3C), even more significant than those from the baseline model (Figure 2B and 2C).
Further examination of the 12 important features in the full model (Figure 3D, Table 4) shows good consistency with the 7-feature baseline model (Figure 2D, Table 3). Five out of seven features in the baseline model also exist in the full model with similar importance scores: gestational age at diagnosis, sPE, parity, maternal age, and PE in prior pregnancies. Gestational age at PE diagnosis and sPE continued to be the two most important features in the full model. New features associated with shorter intervals from diagnosis to delivery were also identified. They were, in descending order of feature importance: aspartate aminotransferase (AST) value, the standard deviation of diastolic blood pressure (DBP), the standard deviation of respiratory rate (RR), creatinine value, mean DBP and white blood cell count (Figure 3D). Conversely, platelet count was identified as a new feature with a negative importance score, which means it is associated with a longer time to delivery. All dichotomized survival plots using median stratification on each of the 12 important features have log-rank p-values smaller than 0·05, confirming their associations with time to delivery in the discovery set (Supplementary Figure 3). We examined the 3 top lab/vital sign features (AST, the standard deviation of DBP, and the standard deviation of RR) for their association with the duration of time between diagnosis and delivery. These values show negative trends with time to delivery, particularly for AST value and the standard deviation of DBP (Figure 3E-G). These 3 features are roughly uniform across delivery gestational ages, except AST, which shows slightly higher values in deliveries at less than 32 weeks of gestational age (Supplementary Figure 4).
Time to delivery prediction of EOPE patients
Accurate prediction of EOPE patients’ time to delivery is crucial, given that delivery of a premature infant has more significant neonatal consequences. Using similar modeling techniques, we trained two additional EOPE-specific Cox-nnet (version 2) models (baseline vs. full model), using the same features described earlier (Supplementary Table 2), on a subset of 374 EOPE patients from the UM discovery cohort.
The C-indices for the clinically informative EOPE baseline model are 0·67, 0·68, and 0·63 on the UM training, testing, and UF validation sets, respectively (Figure 4A). In the UM hold-out testing set and UF validation set, the high-, intermediate-, and low-risk EOPE patient groups stratified by the PI scores from the reduced model show significant differences, with log-rank p-value < 0·001 (Figure 4B and 4C). This baseline model consists of the six most important features: gestational age at diagnosis, sPE, PE in a past pregnancy, parity, pulmonary circulatory disorders, and coagulopathies (Figure 4D; Table 5). All survival plots, dichotomized using the median stratification on each of the 6 features, have log-rank p-values smaller than 0·05 in the discovery dataset (Supplementary Figure 5).
The clinically informative EOPE full model reached much higher accuracy compared to the EOPE baseline model, with median C-indices of 0·74, 0·76, and 0·67 on the training, testing, and validation sets (Figure 4E). The 3 risk-stratified groups within the EOPE patient cohort also showed significant (log-rank p-value<0·001) differences in the hold-out testing set and validation set (Figure 4F, 4G). This model contains 13 important features selected from the original 62 features (Figure 4H; Table 6).
Gestational age at diagnosis continued to be the most important feature. Several other features (e.g., PE with severe symptoms, PE in a past pregnancy, parity, and coagulopathy) were of significant importance as well, similar to the EOPE baseline model. Many additional features in the vital signs and lab test categories were also significant, including creatinine value, mean DBP and mean SBP, standard deviation of RR, AST, and platelet counts. Among these 13 features, parity, PE in a prior pregnancy, and higher platelet counts were protective against early delivery (Figure 4H).
We created dichotomized survival curves based on creatinine value and platelet counts, two new features relative to the EOPE baseline model. Both of them show strong distinctions between the risk groups (Figure 4I, 4L), similar to all other selected features (Supplementary Figure 6). These two features also revealed systematic trends in associations with the gestational age at delivery and time from diagnosis to delivery. Patients with high creatinine levels were more likely to be delivered within 3 days or less of diagnosis and more likely to deliver preterm (Figure 4J, 4K). Lower platelet counts were also associated with shorter time to delivery (Figure 4N), even though the platelet levels were not strongly associated with gestational age at delivery among all EOPE patients (Figure 4M).
PE time to delivery predictor graphic user interface (GUI)
To disseminate our models publicly, we packaged the pre-trained clinically informative models above into an interactive, user-friendly web application using R shiny23. We named this app “PE time to delivery predictor”. The app contains two main panels: the single-patient prediction panel and the group prediction panel (Supplementary Figure 7). The single-patient prediction panel calculates the prognosis index (PI) of a single patient if provided the required clinical variables. The PI score describes the patient’s risk of delivery at the time of the diagnosis of PE, relative to the population. The panel also provides the percentile of the PI score among the training data, and displays the results in a histogram figure and a table. The group prediction panel calculates the PI and PI percentile of multiple patients simultaneously and also displays them in a table, below the histogram built on the training data. The app is available at http://garmiregroup.org/PE-prognosis-predictor/app
Discussion
The implementation of predictive models of delivery time may provide clinicians with critical, patient-specific information for time-sensitive PE management. Here, we present the first study to precisely predict time to delivery among PE patients using electronic health records (EHR). Our work is distinct from most other studies of risk prediction or classification21,22, in that the patients’ time to delivery is explicitly modeled as a continuum by the neural network model. Our models confirmed key factors in current PE management, including gestational age at the time of diagnosis, sPE, and the use of creatinine and AST as biomarkers in clinical decision-making (Figure 5). An important observation of our study is the identification of parity and PE in prior pregnancies as important predictors in all models tested, although they are not included in current guidelines to manage PE (Figure 5). For EOPE time to delivery prediction, we found that consideration of comorbidities such as pulmonary circulatory disorders and coagulopathies is important. We expect that these new findings will accelerate the improvement of PE management guidelines. Most importantly, the work here provides clinicians with a practical management tool through personalized prognosis score prediction in a user-friendly manner. This nuanced clinical calculator may help clinicians both provide more accurate expectations for families and make decisions about the urgency to transfer a patient to a higher level of care.
There are several notable strengths of this study. Unlike the majority of previous studies, which are not validated with external data21,23,24, our models are validated with an external and independent EHR dataset from UF Health System, despite the differences between the populations in the two cohorts (Table 1). These models also addressed clinical interpretability by providing importance scores with directionality for each included feature. Furthermore, the models are designed for accessibility by utilizing fewer than 15 commonly collected demographic and clinical variables and can be readily utilized through a user-friendly shiny application. Our approach contrasts with previous studies of PE delivery timing, which relied extensively on nonstandard biomarkers such as uterine artery pulsatility index (UtA-PI) or placental growth factor (PLGF)23,25,26. Measurement of these biomarkers is rare in routine prenatal checkups, particularly in lower-income regions, posing significant challenges to the widespread adoption of these biomarker-based models. This study also has a few limitations. As a retrospective study, it was not clear if a patient was delivered at the most optimal time, or what the clinical considerations of their care might have been. Further prospective investigations of the models’ performance would be necessary to confirm the findings. Also, the quality of EHR data is influenced by clinicians’ previous judgment and use of billing codes, changes in hospital protocol, communication between patient and provider, intensive care resource availability, and each patient’s intentions. These factors cannot be quantified or corrected for in these models, but can potentially affect the timing of delivery in PE and/or EOPE patients. Additionally, our data came from two medical centers with high levels of obstetrics care and therefore testing the models in other settings may provide additional insights.
In summary, we have developed the first accurate, deep-learning-based, time to delivery prediction models for PE and EOPE patients. The models are disseminated with an easy-to-use web app. Adoption of these models could provide clinicians and patients with valuable management plans to predict and prepare for the best delivery time of each PE pregnancy. Further prospective investigation of the performance of these models is necessary to provide feedback and potential improvement of these models.
Data Availability
All data produced in the present study are available upon reasonable request to the authors.
Declaration of Interest
The authors declare no conflict of interest.
Authors’ Contributions
LG conceived this project and supervised the study. XY conducted data analysis, implemented the Shiny app and wrote the manuscript. HKB, ADM, KX, and DJL collaborated on validation using the UF cohort. ESL and ADM provided clinical assessments and assistance. DG assisted with Shiny app editing and troubleshooting. All authors have read, revised and approved the manuscript.
Acknowledgment
LXG was supported by grants K01ES025434 awarded by NIEHS through funds provided by the trans-NIH Big Data to Knowledge (BD2K) initiative (www.bd2k.nih.gov), R01 LM012373 and LM012907 awarded by NLM, R01 HD084633 awarded by NICHD. DJL was supported by the National Institute of Diabetes and Digestive and Kidney Diseases (K01DK115632) and the University of Florida Clinical and Translational Science Institute (UL1TR001427). AM is supported by the National Center for Advancing Translational Science (5TL1TR001428).
We thank Anisa Driscoll and Kate Smith from the University of Michigan Precision Health for providing technical support when extracting data used in this study. We acknowledge the University of Florida Integrated Data Repository (IDR) and the UF Health Office of the Chief Data Officer for providing the analytic data set for this project.
Footnotes
The version has been revised to update figures, validation results, and add a new user-friendly web application.
Abbreviations
- PE: preeclampsia
- EOPE: early-onset preeclampsia
- LOPE: late-onset preeclampsia
- EHR: electronic health record
- SBP: systolic blood pressure
- DBP: diastolic blood pressure
- RR: respiratory rate
- HELLP: hemolysis, elevated liver enzymes, low platelet count
- AST: aspartate aminotransferase
- PI: prognosis index
- UM: University of Michigan
- UF: University of Florida
- ICD-10: The International Classification of Diseases, Tenth Revision
- MAP: mean arterial pressure
- UtA-PI: uterine artery pulsatility index
- PLGF: placental growth factor
- ACOG: American College of Obstetricians and Gynecologists