RT Journal Article
SR Electronic
T1 New and Increasing Rates of Adverse Events Can be Found in Unstructured Text in Electronic Health Records using the Shakespeare Method
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2021.01.12.21249674
DO 10.1101/2021.01.12.21249674
A1 Bright, Roselie A.
A1 Dowdy, Katherine
A1 Rankin, Summer K.
A1 Blok, Sergey V.
A1 Palmer, Lee Anne
A1 Bright-Ponte, Susan J.
YR 2021
UL http://medrxiv.org/content/early/2021/01/16/2021.01.12.21249674.abstract
AB Background: Text in electronic health records (EHRs) and big data tools offer the opportunity for surveillance of adverse events (patient harm associated with medical care) (AEs) in the unstructured notes. Writers may explicitly state an apparent association between treatment and adverse outcome (“attributed”) or state the simple treatment and outcome without an association (“unattributed”). We chose to study EHRs from 2006-2008 because of known heparin contamination during this timeframe. We hypothesized that the prevalence of adulterated heparin may have been widespread enough to manifest in EHRs through symptoms related to heparin adverse events, independent of clinicians’ documentation of attributed AEs. Objective: Use the Shakespeare Method, a new unsupervised set of tools, to identify attributed and unattributed potential AEs using the unstructured text of EHRs. Methods: We studied 21,287 adult critical care admissions divided into three time periods. Comparisons of period 3 (7/2007 to 6/2008) to period 2 (7/2006 to 6/2007) were used to find admissions notes to review for new or increased clinical events by generating Latent Dirichlet Allocation topics among words in period 3 that were distinct from period 2.
These results were further explored with frequency analyses of periods 1 (7/2001 to 6/2006) through 3. Results: Topics represented unattributed heparin AEs, other medical AEs, rare medical diagnoses, and other clinical events; all were verified with EHRs notes review and frequency analysis. The heparin AEs were not attributed in the notes, diagnosis codes, or procedure codes. Somewhat different from our hypothesis, heparin AEs increased in prevalence from 2001 through 2007, and decreased starting in 2008 (when heparin AEs were being published). Conclusions: The Shakespeare Method could be a useful supplement to AE reporting and surveillance of structured EHRs data. Future improvements should include automation of the manual review process. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: The only funding came from US Food and Drug Administration (FDA) in two forms. FDA supplied the salaries and computing resources for Drs. Bright, Bright-Ponte, and Palmer. FDA paid for contracts with Booz Allen Hamilton that supported salaries and research computing resources for Ms. Dowdy and Drs. Rankin and Blok. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Our use of the data was approved by the governing board for the data administrators: Massachusetts Institute of Technology IRB. Our study was deemed to not be human subjects research by the Food and Drug Administration IRB. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The study is not a clinical trial. The data are available from the Massachusetts Institute of Technology at MIMIC-III Critical Care Database. https://mimic.physionet.org/about/mimic/. Abbreviations: AE: Adverse events; AF: Atrial fibrillation; BIDMC: Beth Israel Deaconess Medical Center; CABG: Coronary artery bypass graft; CCU: Critical (or Intensive) Care Unit; CPR: Cardiopulmonary resuscitation; DMII: Diabetes mellitus, type 2; DVT: Deep vein thrombosis; EHRs: Electronic healthcare records; FDA: Food and Drug Administration; HD: Hospital day; HIT: Heparin induced thrombocytopenia; IABP: Intra-aortic balloon pump; IPH: Intraparenchymal hemorrhage; IV: Intravenous; LDA: Latent Dirichlet Allocation algorithm for topic modeling; LR: Logistic regression supervised learning algorithm; MCA: Middle cerebral artery; MIMIC-III: Medical Information Mart for Intensive Care III; MRI: Magnetic resonance image; MVA: Motor vehicle accident; MVC: Motor vehicle collision; NB: Naïve Bayes supervised learning algorithm; NLP: Natural language processing; O2: Oxygen; OR: Operating room; PAE: Potential adverse event; PICC: Peripherally inserted central catheter; POD: Post-operative day; tPA: Tissue plasminogen activator; UTI: Urinary tract infection