New and Increasing Rates of Adverse Events Can be Found in Unstructured Text in Electronic Health Records using the Shakespeare Method

Roselie A. Bright; Katherine Dowdy; Summer K. Rankin; Sergey V. Blok; Lee Anne Palmer; Susan J. Bright-Ponte

doi:10.1101/2021.01.12.21249674

ABSTRACT

Background Text in electronic health records (EHRs) and big data tools offer the opportunity for surveillance of adverse events (AEs; patient harm associated with medical care) in the unstructured notes. Writers may explicitly state an apparent association between treatment and adverse outcome (“attributed”) or state the simple treatment and outcome without an association (“unattributed”). We chose to study EHRs from 2006-2008 because of known heparin contamination during this timeframe. We hypothesized that the prevalence of adulterated heparin may have been widespread enough to manifest in EHRs through symptoms related to heparin adverse events, independent of clinicians’ documentation of attributed AEs.

Objective Use the Shakespeare Method, a new unsupervised set of tools, to identify attributed and unattributed potential AEs using the unstructured text of EHRs.

Methods We studied 21,287 adult critical care admissions divided into three time periods. Comparisons of period 3 (7/2007 to 6/2008) to period 2 (7/2006 to 6/2007) were used to find admissions notes to review for new or increased clinical events by generating Latent Dirichlet Allocation topics among words in period 3 that were distinct from period 2. These results were further explored with frequency analyses of periods 1 (7/2001 to 6/2006) through 3.

Results Topics represented unattributed heparin AEs, other medical AEs, rare medical diagnoses, and other clinical events; all were verified with EHRs notes review and frequency analysis. The heparin AEs were not attributed in the notes, diagnosis codes, or procedure codes. Somewhat different from our hypothesis, heparin AEs increased in prevalence from 2001 through 2007, and decreased starting in 2008 (when heparin AEs were being published).

Conclusions The Shakespeare Method could be a useful supplement to AE reporting and surveillance of structured EHRs data. Future improvements should include automation of the manual review process.

INTRODUCTION

Avoidable patient harm continues to be a significant problem [1]. To learn of patient harm known as adverse events (AEs) related to products it regulates, FDA relies on spontaneous reports from manufacturers, healthcare providers, and the general public. Published deficiencies of these reports [2-10] include well known biases in reporting. Now that electronic healthcare records (EHRs) are very common [11] and seen as more informative than billing codes from payment claims [6, 12, 13] we have an opportunity to leverage them for automated surveillance of AEs [6, 14, 15].

Many methods for finding AEs in text [6, 7, 9, 16-38] rely on predefining the possible AEs. Writers may explicitly state an apparent association between treatment and adverse outcome (“attributed”) or state the simple treatment and outcome without an association (“unattributed”). More critically, attributed and unattributed potential AEs (PAEs) may not necessarily be captured in structured data (e.g., diagnosis and procedure codes) [14, 23, 39].

Many medical care AEs occur at higher frequency in hospital critical care settings, related to complex illnesses, invasive procedures, and relatively long lists of treatments [40, 41]. In previous work, we performed a comparison of transfused to non-transfused admissions to critical care at a major teaching hospital [42] that successfully found potential blood transfusion adverse events, while addressing many published challenges (such as synonyms, overlapping meanings, and nonstandard terms) with using unstructured EHRs text [5, 11, 14, 19, 23].

We hoped the Shakespeare Method [42] would overcome the challenges of EHRs text to detect not only clinical and administrative changes but also trending potential AEs (PAEs), including heparin contamination PAEs which were first reported early in 2008 [43].

METHODS

The Shakespeare strategy is to find unusual, significant words that were new or increased in the most recent time period, use topic analysis to find words that tended to occur together, examine admissions that were prominent for topics of interest, and then evaluate how well the topics performed [42].

Study Population

We used EHRs for critical care admissions at an adult hospital, Beth Israel Deaconess Medical Center, Boston, MA, drawn from the Medical Information Mart for Intensive Care III (MIMIC-III) [35, 44]. The hospital used one medical record system in 2001-2008 and another afterwards. We received the real dates, within several weeks, for the earlier data. MIMIC-III is publicly available to those meeting human subjects research requirements. The research was designated not human subjects research by the FDA Institutional Review Board under the Code of Federal Regulations [45].

We wanted to simulate real-time analysis to find new or increasing events in the most recent time period. MIMIC-III data were collected with two sequential EHRs, so we selected the longer, earlier period of exclusive use of the earlier EHRs (7/2001-6/2008). We restricted the admissions to patients > 16 years old because this was a hospital for adults.

Preprocessing

We concatenated in chronological order all text notes for a hospital admission into a document. We removed the personally identifying information mask string and lowercased the text, and retained punctuation, numerals and stop words (because they convey clinical information and are sometimes components of abbreviations).
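The preprocessing step can be illustrated with a minimal sketch. It assumes notes arrive as (admission_id, timestamp, text) tuples, a hypothetical layout, and that the de-identification mask is a bracketed placeholder of the form `[** ... **]`; both are assumptions for illustration rather than details stated here.

```python
import re
from collections import defaultdict

# Assumed mask format: bracketed placeholders such as "[**Name 1**]".
MASK = re.compile(r"\[\*\*.*?\*\*\]")

def build_documents(notes):
    """Concatenate each admission's notes in chronological order.

    `notes` is an iterable of (admission_id, timestamp, text) tuples
    (a hypothetical layout). Returns {admission_id: document_text}.
    """
    by_admission = defaultdict(list)
    for admission_id, timestamp, text in notes:
        by_admission[admission_id].append((timestamp, text))

    documents = {}
    for admission_id, items in by_admission.items():
        items.sort(key=lambda pair: pair[0])  # chronological order
        joined = "\n".join(text for _, text in items)
        # Remove mask strings and lowercase; punctuation, numerals and
        # stop words are deliberately left untouched.
        documents[admission_id] = MASK.sub("", joined).lower()
    return documents

notes = [
    (1, "2008-01-02T08:00", "Pt [**Name 1**] given Heparin 5000 units."),
    (1, "2008-01-01T20:00", "Admitted to CCU; BP 90/60."),
]
docs = build_documents(notes)
```

Sorting on ISO-formatted timestamps keeps the example free of date parsing; real data would need whatever timestamp type the source tables provide.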

Since our methods would be based on the frequencies of words, we eliminated duplicate sentences because they do not represent additional information and give weight to variable personal duplication practices. We removed widespread duplicated sentences and lists within the notes, using Bloatectomy [46].
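The idea behind duplicate removal can be shown with a plain-Python sketch. This is not Bloatectomy's interface or algorithm; it is a simplified illustration that assumes a naive sentence splitter and keeps only the first occurrence of each sentence.

```python
import re

def drop_repeated_sentences(text):
    """Keep only the first occurrence of each sentence, preserving order."""
    # Naive splitter (an assumption): break after ., ! or ? followed by
    # whitespace, or at newlines.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+|\n+", text)]
    seen = set()
    kept = []
    for sentence in sentences:
        if not sentence or sentence in seen:
            continue  # skip empty fragments and repeats
        seen.add(sentence)
        kept.append(sentence)
    return " ".join(kept)

note = ("pt stable. plan: cont heparin gtt. pt stable. "
        "plan: cont heparin gtt. bp dropped to 80/40.")
deduped = drop_repeated_sentences(note)
```

Copy-forward text in daily notes would otherwise inflate the counts of whichever words a clinician habitually repeats, which is the bias this step is meant to limit.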

Word Extraction

We utilized scikit-learn’s CountVectorizer [47, 48] to convert each document into a bag-of-words vector in which each dimension holds the frequency of one n-gram present in the document (see Figure 1a and 1b). Details are in Table 1.
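A minimal sketch of the vectorization step follows. The `ngram_range` and `lowercase` values are illustrative assumptions, not the settings in Table 1, and the default tokenizer used here drops punctuation and one-character tokens, so reproducing the retained punctuation described above would require a custom `token_pattern`.

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "pt given heparin bolus . bp dropped",
    "heparin gtt continued . bp stable",
]

# ngram_range=(1, 2) is an illustrative choice, not the paper's setting.
vectorizer = CountVectorizer(ngram_range=(1, 2), lowercase=False)
X = vectorizer.fit_transform(docs)  # sparse matrix: documents x n-grams

# vocabulary_ maps each n-gram to its column index in X.
col = vectorizer.vocabulary_
heparin_count_doc0 = X[0, col["heparin"]]
bigram_count_doc1 = X[1, col["heparin gtt"]]
```

Each row of `X` is one admission document and each column is one n-gram, so column sums give corpus-wide frequencies.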

Table 1. Preprocessing/Model Parameters.

Figure 1. Word selection and topic modeling process with truncated examples.

We then divided the study population into three cohorts: (Period 1) admissions starting between 7/1/2001 and 6/30/2006 (14,410 documents); (Period 2) 7/1/2006-6/30/2007 (3,581 documents), and (Period 3) 7/1/2007-6/30/2008 (3,296 documents).
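The cohort boundaries above translate directly into a date lookup. The sketch below assumes each admission has a start date available as a `datetime.date`; the function name is hypothetical.

```python
from datetime import date

# (label, first day, last day), taken from the period definitions above.
PERIODS = [
    (1, date(2001, 7, 1), date(2006, 6, 30)),
    (2, date(2006, 7, 1), date(2007, 6, 30)),
    (3, date(2007, 7, 1), date(2008, 6, 30)),
]

def assign_period(admit_date):
    """Return 1, 2 or 3 for an admission start date, or None if out of range."""
    for label, start, end in PERIODS:
        if start <= admit_date <= end:
            return label
    return None

# The three cohort sizes add up to the full study population.
total_documents = 14410 + 3581 + 3296
```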

To focus on new or increasing AEs, we reduced the number of words to analyze by filtering by whether they were unusual and increasing (or new) in period 3 compared to period 2 (see Figures 1c, 1d and 2a). We adopted two parallel approaches, shown in Figure 2: through binary classification of the notes, and analysis of term frequency between periods 3 and 2.

Figure 2. Feature extraction flowchart.

This demonstrates the two parallel processes for extracting relevant features prior to topic modeling on the notes: term frequency analysis and binary classification of notes.

For the binary classification, we fit two classification models: logistic regression (LR) with L2 / ridge regularization [49] and multinomial naïve Bayes (NB) [50, 51]. Model evaluation found LR outperformed NB (with a weighted average F1 score of 0.76 compared to NB’s weighted average F1 of 0.69), but that NB more effectively identified completely new terms in the target time period.

After evaluating the models, we re-fit both models without a train-test split on the entire 24-month dataset and combined the top 5,000 features from LR (those with the highest positive coefficient, associated with the positive target class) and the top 5,000 features from NB (those with the lowest log probability ratio). Combining lists resulted in a set of 9,896 terms.
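The classification-based selection can be sketched on a toy corpus as follows. The function name and toy notes are hypothetical, the value of k is tiny compared with the 5,000 used in the study, and the direction of the naive Bayes log probability ratio (period 2 minus period 3, so that the lowest values point to period 3) is an assumption about how the ratio was defined.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

def candidate_terms(period2_docs, period3_docs, k):
    """Union of the top-k period-3 terms from LR and from NB."""
    docs = list(period2_docs) + list(period3_docs)
    y = np.array([0] * len(period2_docs) + [1] * len(period3_docs))  # 1 = period 3

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(docs)
    vocab = vectorizer.vocabulary_
    terms = np.array(sorted(vocab, key=vocab.get))  # terms in column order

    # Logistic regression; scikit-learn applies L2 regularization by default.
    lr = LogisticRegression(max_iter=1000).fit(X, y)
    lr_top = terms[np.argsort(lr.coef_[0])[::-1][:k]]  # largest positive coefficients

    # Naive Bayes: log P(term | period 2) - log P(term | period 3) (assumed direction).
    nb = MultinomialNB().fit(X, y)
    log_ratio = nb.feature_log_prob_[0] - nb.feature_log_prob_[1]
    nb_top = terms[np.argsort(log_ratio)[:k]]  # lowest ratios favour period 3

    return set(lr_top.tolist()) | set(nb_top.tolist())

p2 = ["pt stable heparin", "pt stable bp ok", "heparin gtt pt stable"]
p3 = ["pt hypotension newdrug", "newdrug given hypotension", "pt newdrug bp low"]
selected = candidate_terms(p2, p3, k=2)
```

Because the two lists overlap, the union is smaller than 2k, which mirrors how 5,000 plus 5,000 features yielded 9,896 terms.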

We used frequency analysis to find emerging rare clinical events. We identified two groups of terms: those which appeared in fewer than 10% of documents in period 2 and saw a 30% increase in raw frequency in period 3, and any terms that never appeared in period 2 and did appear in period 3. For those new terms appearing in period 3, we filtered out digit-only terms (a large number of terms in this group).
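The two frequency rules can be written as a short filter. The sketch below assumes each document is already a list of tokens, reads "a 30% increase" as "at least 30%", and compares raw counts without adjusting for the different number of documents in each period; those readings are assumptions.

```python
from collections import Counter

def emerging_terms(p2_docs, p3_docs, max_doc_share=0.10, min_increase=0.30):
    """Return (increased, new) term sets; documents are lists of tokens."""
    p2_freq = Counter(t for doc in p2_docs for t in doc)           # raw counts
    p3_freq = Counter(t for doc in p3_docs for t in doc)
    p2_doc_freq = Counter(t for doc in p2_docs for t in set(doc))  # documents containing t

    increased, new = set(), set()
    for term, n3 in p3_freq.items():
        n2 = p2_freq.get(term, 0)
        if n2 == 0:
            if not term.isdigit():  # drop digit-only new terms
                new.add(term)
        elif (p2_doc_freq[term] / len(p2_docs) < max_doc_share
              and n3 >= (1 + min_increase) * n2):
            increased.add(term)
    return increased, new

p2 = [["pt"] for _ in range(19)] + [["pt", "rash"]]
p3 = [["pt"] for _ in range(17)] + [
    ["pt", "rash"], ["pt", "rash", "newterm"], ["pt", "12345"],
]
increased, new = emerging_terms(p2, p3)
```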

For the final feature set, we took the intersection of terms identified from the binary classification and frequency analysis processes. This resulted in 6,122 significant terms identified from the initial 117,049 unique terms in documents from period 3 (5.2% of terms). We re-vectorized (Figure 1e) the 12-month corpus from period 3 using the combined feature list as our vocabulary (which has the effect of filtering the notes to only include terms in the vocabulary).
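Intersecting the two term lists and re-vectorizing with a fixed vocabulary might look like the sketch below. The term sets and notes are toy values; passing `vocabulary=` to CountVectorizer is what restricts the counts to the selected terms.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy stand-ins for the outputs of the two selection processes.
classifier_terms = {"newdrug", "hypotension", "low"}
frequency_terms = {"newdrug", "hypotension", "rash"}

# Keep only terms flagged by both processes.
final_terms = sorted(classifier_terms & frequency_terms)

period3_docs = ["pt hypotension newdrug newdrug", "pt stable"]

# A fixed vocabulary means every other word is ignored at transform time.
vectorizer = CountVectorizer(vocabulary=final_terms)
X3 = vectorizer.transform(period3_docs)
counts = X3.toarray().tolist()

# Share of period-3 terms retained in the study: 6,122 of 117,049.
retained_share = 6122 / 117049
```

A document with none of the selected terms, such as the second toy note, becomes an all-zero row and contributes nothing to the topic model.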

Topic Modeling and Interpretation

The co-occurrence of words in documents in the last time period was analyzed with Latent Dirichlet Allocation (LDA) topic analysis [52]. We chose the final number of topics (20) based on a balance of large and small topics and at least one topic with no substantive words. We used the words with the highest scores of their relationship to topics (Figure 1f), as well as the topic document scores that indicate the probability of the topic fit for a document (Figure 1g), to explore topic meanings. We manually read the three top-scoring documents for each topic (Figure 1h).
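A minimal sketch of the topic-modeling step with scikit-learn's LatentDirichletAllocation is shown below. The corpus, the two-topic setting, and the helper names are illustrative assumptions; the study used 20 topics and the hyperparameters in Table 1.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "heparin cath hypotension heparin",
    "cath heparin hypotension fluids",
    "trauma fracture fall trauma",
    "fall fracture trauma splint",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)
vocab = vectorizer.vocabulary_
terms = np.array(sorted(vocab, key=vocab.get))  # terms in column order

# n_components=2 suits this toy corpus; the study settled on 20 topics.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(X)  # rows: documents, columns: topic match scores

def top_words(topic, n=3):
    """Highest-scoring words for a topic."""
    order = np.argsort(lda.components_[topic])[::-1][:n]
    return terms[order].tolist()

def top_documents(topic, n=3):
    """Indices of the documents with the highest match score for a topic."""
    return np.argsort(doc_topic[:, topic])[::-1][:n].tolist()
```

The `top_documents` helper corresponds to picking the three top-scoring admissions per topic for manual reading.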

Statistical Analysis of Words and Codes Suggested by Manual Review of the Topics

Documents from selected individual admissions, as well as summary data from 7/2001 to 6/2008 were used to evaluate whether any topics formed around AEs. Most topics inspired time plots of selected words, diagnosis codes, or procedure codes through periods 1, 2, and 3. Slopes were analyzed for changes [53, 54].
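As an illustration of a slope analysis on quarterly proportions, the sketch below fits an ordinary least-squares line and reports an approximate 95% interval. Ordinary least squares and the normal-approximation multiplier of 1.96 are assumptions made for this example; the cited methods [53, 54] may differ.

```python
import math

def slope_with_ci(y, z=1.96):
    """OLS slope of y against quarter index 0..n-1, with an approximate 95% interval."""
    n = len(y)
    x = list(range(n))
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(residual_ss / (n - 2) / sxx)  # standard error of the slope
    return slope, slope - z * se, slope + z * se

# 28 quarters (7/2001 to 6/2008) of a made-up, perfectly linear proportion.
proportions = [0.10 + 0.003 * q for q in range(28)]
slope, low, high = slope_with_ci(proportions)
```

With real quarterly data the interval widens with the scatter around the line, and an interval that excludes zero corresponds to a trend that is unlikely to be flat.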

For this report, out of concern for patient privacy we substituted generic words (such as “condition01”, “condition02”, etc.) for rare conditions, drugs, events, and languages because the year of admission is being presented. Related substitute words (e.g., “condition09a”, “condition09b”) were used for synonyms.

RESULTS

Table 2 shows the statistics for each topic. The strength of the maximum word score in a topic roughly corresponded with the number of admissions that had strong matches with the topic. The words in many of the topics seem to readily suggest interpretations, for example: long complex stay (topic 18), heart problem (3), trauma (19), cardiac catheterization (7), brain (1), cardiac catheterization (17), abdomen (12), uterus (16), and a foreign language (2). The other topics seemed broad.

Table 2. Topics sorted by the maximum word score in the topic, with the top 20 substantive words, the maximum topic match score among admissions, and distribution of the topic match scores among admissions.

“Substantive” words had topic scores above the minimum topic score. “Max” means maximum.

Common topics

For the most common topics, the admissions with the top three topic match scores are summarized in Table 3. For the topics with words that suggested an interpretation, the records supported the interpretations. For the other topics, the records suggested interpretations that were consistent with the top words. Within each topic, the three top-scoring admissions were quite similar to one another, with the exception of the third admission in topic 3; this similarity indicates that the topics were coherent and the model was working correctly.

Table 3. Summaries of the admissions with the top three topic match scores, for the most common topics.

“HD” is hospital day. “Intubated” and “extubated” refer to starting and ending mechanical ventilation.

The top three scoring documents for topic 18 described long complex stays, which included large numbers of notes. The general words in the topic (“for”, “hr”, “plan”, “cont”, “today”, “skin”, and “are”) are nearly ubiquitous in periods 2 and 3. The words indicating mechanical ventilation (“vent”, “intubated”, and “trach”) were present in between 51% and 58% of the admissions per quarter in periods 2 and 3, with a slight, not clinically significant increase for period 3. The lengths of stay and numbers of notes also did not vary between periods 2 and 3.

We noticed that among the five records in Table 3 that mentioned cardiac catheterization, all mentioned explicit or implied dosing with heparin followed the same day by hypotension that required treatment (heparin is generally involved with cardiovascular procedures) [55].

Topics 3 and 7 both have cardiac catheterization for heart problems in common; for five out of six instances, the procedure or heparin administration was followed by hypotension (four instances) that needed to be treated or heart rhythm deterioration (one instance). To investigate whether these potential heparin AEs were increasing 7/2001-6/2008, we plotted two measures of exposure (invasive cardiac procedure code and “heparin”) and a measure of AE (“hypotension”). The proportion of admissions that had invasive cardiovascular procedure codes (see Figures 3a and b) declined overall (see Figure 3a), but had a local increase in period 3, compared to period 2. In contrast to the procedures, the words “heparin” and “hypotension” showed an overall rough increase over the entire timeframe. We also noticed that the proportion of admissions with invasive cardiology codes that had the word “hypotension” increased gradually over time (Figures 3a and b), followed by a drop in the last quarter; the pattern was similar and weaker for the proportion of admissions with “heparin” that also had “hypotension”. There was a decrease in “hypotension” in the last quarter, both as a proportion of all admissions, and as a proportion of either indicator of having been exposed to heparin.
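The two kinds of proportion plotted in Figure 3 can be computed with a short tally. The sketch below assumes each admission is a (quarter, document text) pair and uses whitespace tokenization with a single exposure word; the study's exposure measures also included procedure codes, which are not modeled here.

```python
from collections import defaultdict

def quarterly_proportions(admissions, exposure_word, outcome_word):
    """Per quarter: (share of admissions with the exposure word,
    share of exposed admissions that also have the outcome word)."""
    totals = defaultdict(int)
    exposed = defaultdict(int)
    both = defaultdict(int)
    for quarter, text in admissions:
        tokens = set(text.split())
        totals[quarter] += 1
        if exposure_word in tokens:
            exposed[quarter] += 1
            if outcome_word in tokens:
                both[quarter] += 1
    return {
        q: (exposed[q] / totals[q],
            both[q] / exposed[q] if exposed[q] else None)
        for q in sorted(totals)
    }

admissions = [
    ("2007Q3", "heparin given hypotension noted"),
    ("2007Q3", "heparin gtt"),
    ("2007Q3", "pt stable"),
    ("2007Q4", "heparin bolus hypotension"),
    ("2007Q4", "no anticoagulation"),
]
result = quarterly_proportions(admissions, "heparin", "hypotension")
```

The first value in each pair corresponds to the style of Figure 3a (proportion of all admissions) and the second to Figure 3b (proportion of presumed exposed admissions).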

Figure 3. Heparin and hypotension.

Figure 3a. Invasive cardiology-, heparin-, and hypotension-related criteria as proportion of all admissions.

Invasive cardiology is presumed to involve heparin treatment. The definitions are listed in “Results eFigures 2 to 6 and figures info”. For invasive cardiovascular procedure code, slope = −0.0053 (95% CI −0.0069 to −0.0037, p<0.0001); for heparin word, slope = 0.0039 (95% CI 0.0025 to 0.0054, p<0.0001); and for hypotension word, slope = 0.0029 (95% CI 0.0017 to 0.0040, p<0.0001).

Figure 3b. “Hypotension” word as proportion of presumed heparin exposure. For proportion of any invasive cardiovascular procedure code (presumed to involve heparin), slope = 0.0055 (95% CI 0.0038 to 0.0072, p<0.0001). For proportion of those with “heparin”, slope = 0.0013 (95% CI −0.00036 to 0.0030, p=0.12).

Figure 3 notes:

• Invasive cardiovascular code:

• 3891 Arterial catheterization

• 3961 Extracorporeal circulation auxiliary to open heart surgery

• [3965 to 3966]

• “Heparin” word

• “Hypotension” word

Less common topics

Summaries of admissions with topic matching scores for the less common topics are shown in Table 4. We examined the top scoring admissions matched to topic 11 and all admissions matched to the others. All admissions in this Table had topic match scores for the index topic of <0.15 (column 2). Despite each admission in Table 4 having at least one strong topic match score for at least one of the strong topics in Table 3, the topics in Table 4 are distinct from those in Table 3. Some of the topics have admissions that have common aspects (topics 11, 10, 2, 9).

Table 4. Summaries of admissions (top scoring for 20-11 and all for the other topics) with topic matching scores for the less common topics.

“Unusual” means there were a few or some instances in period 1. “Rare” means there were no instances in period 1.

Fourteen PAEs evident in the notes were distributed among the less common topics: 13 related to medical therapy (6 medications, 3 medical devices, 2 procedures, and 2 combinations) and 2 non-medical. Five drug and all of the medical device PAEs are published in the product labels and/or medical literature. Nine of the PAEs occurred outside the hospital and were related to the reason for admission. The diagnosis and procedure codes generally did not give enough information to understand the specific cause and associated potential AE. Figure 7 shows that while the proportions over the 7 years of admissions with allergy and anaphylaxis words steadily decreased, the diagnosis codes for drug AEs and for surgical or procedure AEs increased slightly over time.

Figure 7. Allergy, anaphylaxis, and AE as proportion of admissions by quarter.

For allergy or anaphylaxis word, slope = −0.0022 (95% CI −0.0027 to −0.0018), p<0.0001. For drug AE code, slope = 0.00031 (95% CI −0.000079 to 0.00070), p=0.12. For surgery or medical AE code, slope = 0.00049 (95% CI −0.00022 to 0.0012), p=0.18.

Figure 7 notes:

• Allergy or anaphylaxis word: “allerg*” or “anaphyl*”

• Drug AE code: 960** to 979** Poisoning By Drugs, Medicinals And Biological Substances

• Surgery or medical AE code: 996** to 999** Complications Of Surgical And Medical Care, Not Elsewhere Classified
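The wildcard word criteria and code ranges in these notes can be expressed as simple matchers. The sketch below assumes ICD-9-style diagnosis codes stored as strings with or without a decimal point, and matches a range on the first three digits; the function names are hypothetical.

```python
import re

# "allerg*" or "anaphyl*": any word starting with either stem.
ALLERGY = re.compile(r"\b(?:allerg|anaphyl)\w*")

def has_allergy_word(text):
    """True if the text contains a word beginning with allerg or anaphyl."""
    return bool(ALLERGY.search(text.lower()))

def code_in_range(code, low, high):
    """True if the code's first three digits fall within [low, high]."""
    prefix = code.replace(".", "")[:3]
    # Codes that begin with a letter are never in a numeric range.
    return prefix.isdigit() and low <= int(prefix) <= high

is_drug_ae = code_in_range("9694", 960, 979)       # 960** to 979**
is_surgical_ae = code_in_range("99662", 996, 999)  # 996** to 999**
```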

The other rare and infrequent terms, related diagnosis or procedure codes, and foreign language sentences were rare throughout all three time periods and increased during period 3.

DISCUSSION

We succeeded in our expectation of finding increases in clinical events and our hope of finding increases in AEs, especially AEs that would not have been reported because they were not attributed. We found increases in hypotension following heparin or presumed-heparin exposure. Hypotension occurring in the cardiac catheterization lab could be a vasovagal reaction [56]. However, vasovagal reaction generally does not respond to fluids and drugs for raising blood pressure, and all our observed patients’ hypotension did respond to treatment. Hypotension can occur as anaphylaxis begins and, alone, may reflect mild anaphylaxis. We note that the nurses and physicians who described the sequence of events did not link sudden hypotension to heparin, and the diagnosis codes did not reflect any awareness of a link. The warnings from FDA and the Centers for Disease Control and Prevention about heparin in the winter of 2007-2008 were for anaphylaxis due to adulterated heparin [57, 58]. Knowledge of the extent of the distribution of adulterated heparin products was not specific, so it may have been in the hospital’s stock at the time. We had expected to see increases starting in 2006 because a few articles indicate heparin may have been adulterated before 2007 [59-61], but were surprised that the increases had started before 2006. The reduction in the last quarter coincided with recalls of contaminated heparin products and lends credibility to the idea that contaminated heparin was in slowly increasing use at this hospital for many years. We are struck that such a high proportion of the invasive cardiac catheter patients in the last two years experienced hypotension following heparin exposure (either as explicitly documented administration or implicitly in the catheter coating).

The types of clinical event changes we detected from period 2 to period 3 were: increases in patients with common conditions (heart disease, brain injuries, trauma, and complex conditions associated with long hospital stays), increases in rare conditions, change in administration (foreign language portion), and adverse events of concern.

The increases in common conditions may have reflected hospital marketing [62].

The increases in rare conditions could have reflected chance, or marketing as a referral center.

Nine of the adverse events happened outside the hospital and illustrate the utility of hospital records for monitoring severe reactions that occur in other health facilities or outside the healthcare system. Our method was useful for detecting words that are rare in hospital records, partly reflecting events that normally occur outside the hospital.

The topic with the highest document score exhibited typical behavior of a topic containing words that are common to most documents. The filter that removed words composed only of digits also stripped digits from some other words, which let some high-frequency words into the vocabulary. During topic modeling, these common words received high scores in the topics where they were correlated (as expected, this happened in several topics) and formed a common-word topic (topic 18). This is a noise topic: the LDA model places words that are low scoring and not correlated with other topics into their own topic in order to deal with noise and frequent words. Because this topic included words that were frequent in almost all documents, the document topic scores for it were high, as expected [63]. We dealt with this by looking at the other, more coherent topics assigned to each document (essentially ignoring this common-noise topic, which captures what most documents have in common). The general top-scoring words in this topic survived the ensemble filtering method as an artifact of the digit-removal step. For future work, we recommend removing this step from the filtering process and relying on the classification terms to filter out irrelevant variations of terms.

Our method worked despite:

• the known challenges posed by clinical text notes

• restriction to one major hospital

• lack of all surgical and non-CCU nursing notes, and variable lack of physician, nursing, or discharge summary notes, probably reflecting hospital policy of gradually converting types of notes to EHRs [64]

• errors up to several weeks in dates.

Different, and hopefully improved, results may be derived from EHRs databases that are more complete and have actual dates.

Much of our manual work to evaluate topics could be reduced with a combination of natural language processing and dictionaries of clinical terms. Dictionaries should include standard acronyms and common abbreviations, and should try to account for context when the meaning of a term could be ambiguous. The ability to decipher ongoing care notes will be important for noticing unrecognized signals of AEs.

CONCLUSIONS

We suggest that heparin contamination may have occurred earlier, and at a lower rate, than was recognized in the winter of 2007-2008.

Our method successfully aided in the detection of a variety of medical product AEs that were not attributed in clinicians’ notes, suggesting that this method could be a useful supplement to existing post-marketing surveillance programs at local as well as national levels. The method also found other changes in clinical care experiences. The method is easy to execute and understand and could be adopted by subnational public and private entities. It finds potential adverse events that are candidates for causality assessment with epidemiology or other clinical studies.

Our method enabled manual review of key EHRs by narrowing interest from the original large volume of words used in notes. Future improvements could include automation of the manual review process.

Data Availability

The study is not a clinical trial. The data are available from the Massachusetts Institute of Technology at MIMIC-III Critical Care Database. https://mimic.physionet.org/about/mimic/.


Conflict of interest and disclaimer

The research was done with FDA support and under contract HHSF223201510027B between FDA and Booz Allen Hamilton Inc. None of the authors have other relevant financial interests. The opinions are those of the authors and do not represent official policy of either the FDA or Booz Allen Hamilton.

ACKNOWLEDGEMENTS

We thank our FDA and Booz Allen Hamilton supervisors, the Department of Health and Human Services innovation programs (Ignite Accelerator and Data Science CoLab), and Alistair Johnson, DPhil, of the MIMIC-III program, Massachusetts Institute of Technology, for their enthusiastic support. George Plopper, PhD, of Booz Allen Hamilton, provided project and consultation support. Many FDA colleagues offered ideas and feedback regarding the selection of the case and the final paper. All authors had access to the data. All authors are responsible for the study topic, design, and interpretation. Dr. Bright, Ms. Dowdy, Dr. Rankin, and Dr. Blok are responsible for data processing and analysis.

Footnotes

We made minor corrections to Figure 1 to use generic terms for words (Word1, Word2, etc.).

ABBREVIATIONS USED MORE THAN ONCE

AE: Adverse events
AF: Atrial fibrillation
BIDMC: Beth Israel Deaconess Medical Center
CABG: Coronary artery bypass graft
CCU: Critical (or Intensive) Care Unit
CPR: Cardiopulmonary resuscitation
DMII: Diabetes mellitus, type 2
DVT: Deep vein thrombosis
EHRs: Electronic healthcare records
FDA: Food and Drug Administration
HD: Hospital day
HIT: Heparin induced thrombocytopenia
IABP: Intra-aortic balloon pump
IPH: Intraparenchymal hemorrhage
IV: Intravenous
LDA: Latent Dirichlet Allocation algorithm for topic modeling
LR: Logistic regression supervised learning algorithm
MCA: Middle cerebral artery
MIMIC-III: Medical Information Mart for Intensive Care III
MRI: Magnetic resonance image
MVA: Motor vehicle accident
MVC: Motor vehicle collision
NB: Naïve Bayes supervised learning algorithm
NLP: Natural language processing
O2: Oxygen
OR: Operating room
PAE: Potential adverse event
PICC: Peripherally inserted central catheter
POD: Post-operative day
tPA: Tissue plasminogen activator
UTI: Urinary tract infection

REFERENCES

1.↵
Brewer T, Colditz GA. Postmarketing surveillance and adverse drug reactions: current perspectives and future needs. JAMA. 1999 Mar 3;281(9):824–9. PMID:10071004. DOI: 10.1001/jama.281.9.824.
OpenUrl CrossRef PubMed Web of Science
2.↵
Scott HD, Thacher-Renshaw A, Rosenbaum SE, et al. Physician reporting of adverse drug reactions: Results of the Rhode Island Adverse Drug Reaction Reporting Project. JAMA. 1990;263:1785–1788. PMID:2313850. doi:10.1001/jama.1990.03440130073028.
OpenUrl CrossRef PubMed Web of Science
3.
Bright RA, Nelson RC. Automated support for pharmacovigilance: a proposed system. Pharmacoepidemiol Drug Saf. 2002; 11(2):121–125. PMID:11998536. DOI:10.1002/pds.684.
OpenUrl CrossRef PubMed
4.
Samore MH, Evans RS, Lassen A, et al. Surveillance of medical device-related hazards and adverse events in hospitalized patients. JAMA. 2004; 291:325–34. PMID:14734595 DOI:10.1001/jama.291.3.325.
OpenUrl CrossRef PubMed Web of Science
5.↵
Bright RA. Strategy for surveillance of adverse drug events. Food Drug Law J. 2007; 62(3):605–615. PMID:17915403.
OpenUrl PubMed
6.↵
Hoang T, Liu J, Pratt N, Zheng VW, et al. Authenticity and credibility aware detection of adverse drug events from social media. Int J Med Inform. 2018 Dec;120:101–115. PMID:30409335. doi:10.1016/j.ijmedinf.2018.09.002.
OpenUrl CrossRef PubMed
7.↵
Classen D, Li M, Miller S, Ladner D. An electronic health record-based real-time analytics program for patient safety surveillance and improvement. Health Aff (Millwood). 2018 Nov;37(11):1805–1812. PMID:30395491. DOI:10.1377/hlthaff.2018.0728.
OpenUrl CrossRef PubMed
8.
Wang L, Rastegar-Mojarad M, Ji Z, et al. Detecting pharmacovigilance signals combining electronic medical records with spontaneous reports: A case study of conventional disease-modifying antirheumatic drugs for rheumatoid arthritis. Front Pharmacol. 2018 Aug 7;9:875. PMID:30131701. DOI:10.3389/fphar.2018.00875.
OpenUrl CrossRef PubMed
9.↵
Alghamdi AA, Keers RN, Sutherland A, Ashcroft DM. Prevalence and nature of medication errors and preventable adverse drug events in paediatric and neonatal intensive care settings: A systematic review. Drug Saf. 2019 Dec;42(12):1423–1436. PMID:31410745. DOI:10.1007/s40264-019-00856-9.
OpenUrl CrossRef PubMed
10.↵
Molina FJ, Rivera PT, Cardona A, et al. Adverse events in critical care: Search and active detection through the Trigger Tool. World J Crit Care Med. 2018 Feb 4;7(1):9–15. PMID:29430403. DOI:10.5492/wjccm.v7.i1.9.
OpenUrl CrossRef PubMed
11.↵
Report to Congress: Update on the adoption of health information technology and related efforts to facilitate the electronic use and exchange of health information. Office of the National Coordinator for Health Information Technology, US Department of Health and Human Services. 2016 Feb. https://www.healthit.gov/sites/default/files/Attachment_1_-_2-26-16_RTC_Health_IT_Progress.pdf.
12.↵
Taggart M, Chapman WW, Steinberg BA, et al. Comparison of 2 natural language processing methods for identification of bleeding among critically ill patients. JAMA Netw Open. 2018 Oct 5;1(6):e183451. PMID:30646240. DOI:10.1001/jamanetworkopen.2018.3451.
OpenUrl CrossRef PubMed
13.↵
Jin Y, Li F, Vimalananda VG, Yu H. Automatic detection of hypoglycemic events from the electronic health record notes of diabetes patients: Empirical study. JMIR Med Inform. 2019 Nov 8;7(4):e14340. PMID:31702562. DOI:10.2196/14340.
OpenUrl CrossRef PubMed
14.↵
Melton GB, Hripcsak G. Automated detection of adverse events using natural language processing of discharge summaries. J Am Med Inform Assoc. 2005; 12:448–457. PMID:15802475. DOI:10.1197/jamia.M1794.
OpenUrl CrossRef PubMed
15.↵
Patadia VK, Schuemie MJ, Coloma PM, Herings R, van der Lei J, Sturkenboom M, Trifiro G. Can electronic health records databases complement spontaneous reporting system databases? A historical-reconstruction of the association of rofecoxib and acute myocardial infarction. Front Pharmacol. 2018 Jun 6;9:594. PMID:29928230. DOI:10.3389/fphar.2018.00594.
OpenUrl CrossRef PubMed
16.↵
Young IJB, Luz S, Lone N. A systematic review of natural language processing for classification tasks in the field of incident reporting and adverse event analysis. I J Med Inf. 2019; 132: 103971. PMID:31630063. DOI:10.1016/j.ijmedinf.2019.103971.
OpenUrl CrossRef PubMed
17.
Fortenberry M, Odinet J, Shah P, et al. Development of an electronic trigger tool at a children’s hospital within an academic medical center. Am J Health Syst Pharm. 2019 Nov 13;76(Suppl_4):S107–S113. PMID:31724037. DOI:10.1093/ajhp/zxz222.
OpenUrl CrossRef PubMed
18.
Zhou L, Siddiqui T, Seliger SL, et al. Text preprocessing for improving hypoglycemia detection from clinical notes - A case study of patients with diabetes. Int J Med Inform. 2019 Sep;129:374–380. PMID:31445280. DOI:10.1016/j.ijmedinf.2019.06.020.
OpenUrl CrossRef PubMed
19.↵
Mesfin YM, Cheng A, Lawrie J, Buttery J. Use of routinely collected electronic healthcare data for postlicensure vaccine safety signal detection: A systematic review. BMJ Glob Health. 2019 Jul 8;4(4):e001065. PMID:31354969. DOI:10.1136/bmjgh-2018-001065.
OpenUrl Abstract/FREE Full Text
20.
Morel M, Bacry E, Gaiffas S, Guilloux A, Leroy F. ConvSCCS: convolutional self-controlled case series model for lagged adverse event detection. Biostatistics. 2019 Mar 8. pii: kxz003. PMID:30851046. DOI:10.1093/biostatistics/kxz003.
OpenUrl CrossRef PubMed
21.
Dandala B, Joopudi V, Devarakonda M. Adverse drug events detection in clinical notes by jointly modeling entities and relations using neural networks. Drug Saf. 2019 Jan;42(1):135–146. PMID:30649738. DOI:10.1007/s40264-018-0764-x.
OpenUrl CrossRef PubMed
22.
Wunnava S, Qin X, Kakar T, Sen C, Rundensteiner EA, Kong X. Adverse drug event detection from electronic health records using hierarchical recurrent neural networks with dual-level embedding. Drug Saf. 2019 Jan;42(1):113–122. PMID:30649736. DOI:10.1007/s40264-018-0765-9.
OpenUrl CrossRef PubMed
23.↵
Bagattini F, Karlsson I, Rebane J, Papapetrou P. A classification framework for exploiting sparse multi-variate temporal features with application to adverse drug event detection in medical records. BMC Med Inform Decis Mak. 2019 Jan 10;19(1):7. PMID:30630486. DOI:10.1186/s12911-018-0717-4.
OpenUrl CrossRef PubMed
24.
Rafter N, Finn R, Burns K, et al. Identifying hospital-acquired infections using retrospective record review from the Irish National Adverse Events Study (INAES) and European point prevalence survey case definitions. J Hosp Infect. 2019 Mar;101(3):313–319. PMID:30590090. DOI:10.1016/j.jhin.2018.12.011.
25.
Li F, Liu W, Yu H. Extraction of information related to adverse drug events from electronic health record notes: Design of an end-to-end model based on deep learning. JMIR Med Inform. 2018 Nov 26;6(4):e12159. PMID:30478023. DOI:10.2196/12159.
26.
Jeong E, Park N, Choi Y, Park RW, Yoon D. Machine learning model combining features from algorithms with different analytical methodologies to detect laboratory-event-related adverse drug reaction signals. PLoS One. 2018 Nov 21;13(11):e0207749. PMID:30462745. DOI:10.1371/journal.pone.0207749.
27.
Santiso S, Perez A, Casillas A. Exploring joint AB-LSTM with embedded lemmas for adverse drug reaction discovery. IEEE J Biomed Health Inform. 2019 Sep;23(5):2148–2155. PMID:30403644. DOI:10.1109/JBHI.2018.2879744.
28.
Chu J, Dong W, He K, Duan H, Huang Z. Using neural attention networks to detect adverse medical events from electronic health records. J Biomed Inform. 2018 Nov;87:118–130. PMID:30336262. DOI:10.1016/j.jbi.2018.10.002.
29.
Wang SV, Maro JC, Baro E, et al. Data mining for adverse drug events with a propensity score-matched tree-based scan statistic. Epidemiology. 2018 Nov;29(6):895–903. DOI:10.1097/EDE.0000000000000907.
30.
Martins RR, Silva LT, Bessa GG, Lopes FM. Trigger tools are as effective as non-targeted chart review for adverse drug event detection in intensive care units. Saudi Pharm J. 2018 Dec;26(8):1155–1161. DOI:10.1016/j.jsps.2018.07.003.
31.
Whalen E, Hauben M, Bate A. Time series disturbance detection for hypothesis-free signal detection in longitudinal observational databases. Drug Saf. 2018 Jun;41(6):565–577. PMID:30074538. DOI:10.1007/s40264-018-0640-8.
32.
Zhou X, Douglas IJ, Shen R, Bate A. Signal detection for recently approved products: Adapting and evaluating self-controlled case series method using a US claims and UK electronic medical records database. Drug Saf. 2018 May;41(5):523–536. PMID:29327136. DOI:10.1007/s40264-017-0626-y.
33.
Nydert P, Unbeck M, Härenstam KP, et al. Drug Use and Type of adverse drug events-identified by a trigger tool in different units in a Swedish pediatric hospital. Drug Healthc Patient Saf. 2020 Jan 31;12:31–40. PMID:32099481. DOI:10.2147/DHPS.S232604. eCollection 2020.
34.
Chen L, Gu Y, Ji X, Sun Z, Li H, Gao Y, Huang Y. Extracting medications and associated adverse drug events using a natural language processing system combining knowledge base and deep learning. J Am Med Inform Assoc. 2020 Jan 1;27(1):56–64. PMID:31591641. DOI:10.1093/jamia/ocz141.
35.
Ju M, Nguyen NTH, Miwa M, Ananiadou S. An ensemble of neural models for nested adverse drug events and medication extraction with subwords. J Am Med Inform Assoc. 2020 Jan 1;27(1):22–30. PMID:31197355. DOI:10.1093/jamia/ocz075.
36.
Griffey RT, Schneider RM, Todorov AA. Adverse events present on arrival to the emergency department: The ED as a dual safety net. Jt Comm J Qual Patient Saf. 2020 Apr;46(4):192–198. PMID:32007399. DOI:10.1016/j.jcjq.2019.12.003.
37.
Pandya AD, Patel K, Rana D, et al. Global Trigger Tool: Proficient adverse drug reaction autodetection method in critical care patient units. Indian J Crit Care Med. 2020 Mar;24(3):172–178. PMID:32435095. DOI:10.5005/jp-journals-10071-23367.
38.
McIsaac DI, Hamilton GM, Abdulla K, et al. Validation of new ICD-10-based patient safety indicators for identification of in-hospital complications in surgical patients: A study of diagnostic accuracy. BMJ Qual Saf. 2020 Mar;29(3):209–216. PMID:31439760. DOI:10.1136/bmjqs-2018-008852. Epub 2019 Aug 22.
39.
de Vos MS, Hamming JF, Chua-Hendriks JJC, Marang-van de Mheen PJ. Connecting perspectives on quality and safety: patient-level linkage of incident, adverse event and complaint data. BMJ Qual Saf. 2019 Mar;28(3):180–189. PMID:30032125. DOI:10.1136/bmjqs-2017-007457.
40.
Bates DW, Cullen DJ, Laird N, et al. Incidence of adverse drug events and potential adverse drug events. Implications for prevention. JAMA. 1995;274:29–34. PMID:7791255. DOI:10.1001/jama.1995.03530010043033.
41.
Kane-Gill SL, Kirisci L, Verrico MM, Rothschild JM. Analysis of risk factors for adverse drug events in critically ill patients. Crit Care Med. 2012;40(3):823–828. PMID:22036859. DOI:10.1097/CCM.0b013e318236f473.
42.
Bright RA, Rankin SK, Dowdy K, et al. Potential Blood Transfusion Adverse Events Can be Found in Unstructured Text in Electronic Health Records using the “Shakespeare Method”. medRxiv. 2021;2021.01.05.21249239. DOI:10.1101/2021.01.05.21249239.
43.
Baxter issues urgent nationwide voluntary recall of heparin 1,000 units/ml 10 and 30ml multi-dose vials NDC NUMBERS 0641-2440-45, 0641-2440-41, 0641-2450-45 and 0641-2450-41; LOTS: 107054, 117085, 047056, 097081, 107024, 107064, 107066, 107074, 107111. Food and Drug Administration. 2008 January 25. http://wayback.archive-it.org/7993/20170111131710/http://www.fda.gov/Safety/Recalls/ArchiveRecalls/2008/default.htm?Page=5.
44.
Johnson AEW, Pollard TJ, Shen L, et al. MIMIC-III, a freely accessible critical care database. Sci Data. 2016; 3:160035. https://doi.org/10.1038/sdata.2016.35.
45.
Code of Federal Regulations Title 45 Part 46 Protection of Human Subjects, Subpart A—Basic HHS Policy for Protection of Human Research Subjects, §46.101 (b) (4). 2000 Oct 1. https://www.govinfo.gov/content/pkg/CFR-2000-title45-vol1/pdf/CFR-2000-title45-vol1-part46.pdf.
46.
Rankin SK, Bright R, Dowdy K. Bloatectomy (Version v0.0.12). Zenodo. 2020, June 26. http://doi.org/10.5281/zenodo.3909030.
47.
Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011; 12: 2825–2830. https://www.jmlr.org/papers/volume12/pedregosa11a/pedregosa11a.pdf.
48.
Sklearn.feature_extraction.text.CountVectorizer. Scikit-learn Machine Learning in Python. Scikit-learn developers. 2020. http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html.
49.
Marafino BJ, Boscardin WJ, Dudley RA. Efficient and sparse feature selection for biomedical text classification via the elastic net: Application to ICU risk stratification from nursing notes. J Biomed Inform. 2015; 54: 114–120. PMID: 25700665. DOI:10.1016/j.jbi.2015.02.003.
50.
Witten IH, Frank E, Hall MA, Pal CJ. Data Mining: Practical machine learning tools and techniques, 4 ed. Elsevier. 2016. Paperback ISBN: 9780128042915. eBook ISBN: 9780128043578.
51.
Tang B, Kay S, He H. Toward optimal feature selection in naive Bayes for text categorization. arXiv. 2016;1602.02850. DOI:10.1109/TKDE.2016.2563436.
52.
Blei D, Ng A, Jordan M. Latent Dirichlet Allocation. J Mach Learn Res. 2003;3:993–1022. https://jmlr.org/papers/volume3/blei03a/blei03a.pdf.
53.
LINEST function. Microsoft Support. 2020. https://support.microsoft.com/en-us/office/linest-function-84d7d0d9-6e50-4101-977a-fa7abf772b6d.
54.
Altman DG, Bland JM. How to obtain the P value from a confidence interval. BMJ. 2011;343:d2304. PMID: 22803193. DOI:10.1136/bmj.d2304.
55.
Heparin sodium-heparin sodium injection, solution: Drug label information. DailyMed. U.S. National Library of Medicine. 2020. https://dailymed.nlm.nih.gov/dailymed/drugInfo.cfm?setid=cb1c1e7a-c9ca-4a07-8833-e45ce436d287.
56.
Bassareo PP, Cocco D, Bassareo V, et al. Pharmacological treatment of vagal hyperactivity, a rare but potentially fatal cause of sudden cardiac death. Mini Rev Med Chem. 2018;18(6):483–489. PMID:28685699. DOI:10.2174/1389557517666170707102040.
57.
Information on heparin. Food and Drug Administration. 2017. https://wayback.archive-it.org/7993/20170722214801/https://www.fda.gov/Drugs/DrugSafety/PostmarketDrugSafetyInformationforPatientsandProviders/UCM112597.
58.
Acute allergic-type reactions among patients undergoing hemodialysis — Multiple states, 2007– 2008. MMWR. 2008, February 8; 57(5): 124–125. PMID: 18256585. https://www.cdc.gov/mmwr/preview/mmwrhtml/mm5705a4.htm.
59.
Lyn TE. China pig disease caused by new strain: experts. Reuters. 2007 June 26. https://www.reuters.com/article/us-china-disease-pig-idUSHKG26819620070626.
60.
Barboza D. Virus Spreading Alarm and Pig Disease in China. New York Times. 2007, August 16. http://www.nytimes.com/2007/08/16/business/worldbusiness/16pigs.html.
61.
Tian K, Yu X, Zhao T, et al. Emergence of fatal PRRSV variants: unparalleled outbreaks of atypical PRRS in China and molecular dissection of the unique hallmark. PLoS ONE. 2007;2(6):e526. PMID: 17565379. DOI:10.1371/journal.pone.0000526.
62.
Levy P. The Harvard medical system. Not Running a Hospital. 2007, January 14. http://runningahospital.blogspot.com/2007/01/harvard-medical-system.html.
63.
Schofield A, Magnusson M, Mimno D. Pulling out the stops: rethinking stopword removal for topic models. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Valencia, Spain, April 3-7, 2017. Association for Computational Linguistics. 2017; 432–436. https://www.aclweb.org/anthology/E17-2069.
64.
Halamka J. What will keep me up at night. Dispatch from the digital health frontier. 2007 Nov 1, 2, 19, 20. http://geekdoctor.blogspot.com/2007/11/.

Posted January 16, 2021.

Subject Area: Epidemiology

[1] 1.↵
Brewer T, Colditz GA. Postmarketing surveillance and adverse drug reactions: current perspectives and future needs. JAMA. 1999 Mar 3;281(9):824–9. PMID:10071004. DOI: 10.1001/jama.281.9.824.
OpenUrl CrossRef PubMed Web of Science

[2] 2.↵
Scott HD, Thacher-Renshaw A, Rosenbaum SE, et al. Physician reporting of adverse drug reactions: Results of the Rhode Island Adverse Drug Reaction Reporting Project. JAMA. 1990;263:1785–1788. PMID:2313850. doi:10.1001/jama.1990.03440130073028.
OpenUrl CrossRef PubMed Web of Science

[3] 3.
Bright RA, Nelson RC. Automated support for pharmacovigilance: a proposed system. Pharmacoepidemiol Drug Saf. 2002; 11(2):121–125. PMID:11998536. DOI:10.1002/pds.684.
OpenUrl CrossRef PubMed

[4] 4.
Samore MH, Evans RS, Lassen A, et al. Surveillance of medical device-related hazards and adverse events in hospitalized patients. JAMA. 2004; 291:325–34. PMID:14734595 DOI:10.1001/jama.291.3.325.
OpenUrl CrossRef PubMed Web of Science

[5] 5.↵
Bright RA. Strategy for surveillance of adverse drug events. Food Drug Law J. 2007; 62(3):605–615. PMID:17915403.
OpenUrl PubMed

[6] 6.↵
Hoang T, Liu J, Pratt N, Zheng VW, et al. Authenticity and credibility aware detection of adverse drug events from social media. Int J Med Inform. 2018 Dec;120:101–115. PMID:30409335. doi:10.1016/j.ijmedinf.2018.09.002.
OpenUrl CrossRef PubMed

[7] 7.↵
Classen D, Li M, Miller S, Ladner D. An electronic health record-based real-time analytics program for patient safety surveillance and improvement. Health Aff (Millwood). 2018 Nov;37(11):1805–1812. PMID:30395491. DOI:10.1377/hlthaff.2018.0728.
OpenUrl CrossRef PubMed

[8] 8.
Wang L, Rastegar-Mojarad M, Ji Z, et al. Detecting pharmacovigilance signals combining electronic medical records with spontaneous reports: A case study of conventional disease-modifying antirheumatic drugs for rheumatoid arthritis. Front Pharmacol. 2018 Aug 7;9:875. PMID:30131701. DOI:10.3389/fphar.2018.00875.
OpenUrl CrossRef PubMed

[9] 9.↵
Alghamdi AA, Keers RN, Sutherland A, Ashcroft DM. Prevalence and nature of medication errors and preventable adverse drug events in paediatric and neonatal intensive care settings: A systematic review. Drug Saf. 2019 Dec;42(12):1423–1436. PMID:31410745. DOI:10.1007/s40264-019-00856-9.
OpenUrl CrossRef PubMed

[10] 10.↵
Molina FJ, Rivera PT, Cardona A, et al. Adverse events in critical care: Search and active detection through the Trigger Tool. World J Crit Care Med. 2018 Feb 4;7(1):9–15. PMID:29430403. DOI:10.5492/wjccm.v7.i1.9.
OpenUrl CrossRef PubMed

[11] 11.↵
Report to Congress: Update on the adoption of health information technology and related efforts to facilitate the electronic use and exchange of health information. Office of the National Coordinator for Health Information Technology, US Department of Health and Human Services. 2016 Feb. https://www.healthit.gov/sites/default/files/Attachment_1_-_2-26-16_RTC_Health_IT_Progress.pdf.

[12] 12.↵
Taggart M, Chapman WW, Steinberg BA, et al. Comparison of 2 natural language processing methods for identification of bleeding among critically ill patients. JAMA Netw Open. 2018 Oct 5;1(6):e183451. PMID:30646240. DOI:10.1001/jamanetworkopen.2018.3451.
OpenUrl CrossRef PubMed

[13] 13.↵
Jin Y, Li F, Vimalananda VG, Yu H. Automatic detection of hypoglycemic events from the electronic health record notes of diabetes patients: Empirical study. JMIR Med Inform. 2019 Nov 8;7(4):e14340. PMID:31702562. DOI:10.2196/14340.
OpenUrl CrossRef PubMed

[14] 14.↵
Melton GB, Hripcsak G. Automated detection of adverse events using natural language processing of discharge summaries. J Am Med Inform Assoc. 2005; 12:448–457. PMID:15802475. DOI:10.1197/jamia.M1794.
OpenUrl CrossRef PubMed

[15] 15.↵
Patadia VK, Schuemie MJ, Coloma PM, Herings R, van der Lei J, Sturkenboom M, Trifiro G. Can electronic health records databases complement spontaneous reporting system databases? A historical-reconstruction of the association of rofecoxib and acute myocardial infarction. Front Pharmacol. 2018 Jun 6;9:594. PMID:29928230. DOI:10.3389/fphar.2018.00594.
OpenUrl CrossRef PubMed

[16] 16.↵
Young IJB, Luz S, Lone N. A systematic review of natural language processing for classification tasks in the field of incident reporting and adverse event analysis. I J Med Inf. 2019; 132: 103971. PMID:31630063. DOI:10.1016/j.ijmedinf.2019.103971.
OpenUrl CrossRef PubMed

[17] 17.
Fortenberry M, Odinet J, Shah P, et al. Development of an electronic trigger tool at a children’s hospital within an academic medical center. Am J Health Syst Pharm. 2019 Nov 13;76(Suppl_4):S107–S113. PMID:31724037. DOI:10.1093/ajhp/zxz222.
OpenUrl CrossRef PubMed

[18] 18.
Zhou L, Siddiqui T, Seliger SL, et al. Text preprocessing for improving hypoglycemia detection from clinical notes - A case study of patients with diabetes. Int J Med Inform. 2019 Sep;129:374–380. PMID:31445280. DOI:10.1016/j.ijmedinf.2019.06.020.
OpenUrl CrossRef PubMed

[19] 19.↵
Mesfin YM, Cheng A, Lawrie J, Buttery J. Use of routinely collected electronic healthcare data for postlicensure vaccine safety signal detection: A systematic review. BMJ Glob Health. 2019 Jul 8;4(4):e001065. PMID:31354969. DOI:10.1136/bmjgh-2018-001065.
OpenUrl Abstract/FREE Full Text

[20] 20.
Morel M, Bacry E, Gaiffas S, Guilloux A, Leroy F. ConvSCCS: convolutional self-controlled case series model for lagged adverse event detection. Biostatistics. 2019 Mar 8. pii: kxz003. PMID:30851046. DOI:10.1093/biostatistics/kxz003.
OpenUrl CrossRef PubMed

[21] 21.
Dandala B, Joopudi V, Devarakonda M. Adverse drug events detection in clinical notes by jointly modeling entities and relations using neural networks. Drug Saf. 2019 Jan;42(1):135–146. PMID:30649738. DOI:10.1007/s40264-018-0764-x.
OpenUrl CrossRef PubMed

[22] 22.
Wunnava S, Qin X, Kakar T, Sen C, Rundensteiner EA, Kong X. Adverse drug event detection from electronic health records using hierarchical recurrent neural networks with dual-level embedding. Drug Saf. 2019 Jan;42(1):113–122. PMID:30649736. DOI:10.1007/s40264-018-0765-9.
OpenUrl CrossRef PubMed

[23] 23.↵
Bagattini F, Karlsson I, Rebane J, Papapetrou P. A classification framework for exploiting sparse multi-variate temporal features with application to adverse drug event detection in medical records. BMC Med Inform Decis Mak. 2019 Jan 10;19(1):7. PMID:30630486. DOI:10.1186/s12911-018-0717-4.
OpenUrl CrossRef PubMed

[24] 24.
Rafter N, Finn R, Burns K, et al. Identifying hospital-acquired infections using retrospective record review from the Irish National Adverse Events Study (INAES) and European point prevalence survey case definitions. J Hosp Infect. 2019 Mar;101(3):313–319. PMID:30590090. DOI:10.1016/j.jhin.2018.12.011.
OpenUrl CrossRef PubMed

[25] 25.
Li F, Liu W, Yu H. Extraction of information related to adverse drug events from electronic health record notes: Design of an end-to-end model based on deep learning. JMIR Med Inform. 2018 Nov 26;6(4):e12159. PMID:30478023. DOI:10.2196/12159.
OpenUrl CrossRef PubMed

[26] 26.
Jeong E, Park N, Choi Y, Park RW, Yoon D. Machine learning model combining features from algorithms with different analytical methodologies to detect laboratory-event-related adverse drug reaction signals. PLoS One. 2018 Nov 21;13(11):e0207749. PMID:30462745. DOI:10.1371/journal.pone.0207749.
OpenUrl CrossRef PubMed

[27] 27.
Santiso S, Perez A, Casillas A. Exploring joint AB-LSTM with embedded lemmas for adverse drug reaction discovery. IEEE J Biomed Health Inform. 2019 Sep;23(5):2148–2155. PMID:30403644. DOI:10.1109/JBHI.2018.2879744.
OpenUrl CrossRef PubMed

[28] 28.
Chu J, Dong W, He K, Duan H, Huang Z. Using neural attention networks to detect adverse medical events from electronic health records. J Biomed Inform. 2018 Nov;87:118–130. PMID:30336262. DOI:10.1016/j.jbi.2018.10.002.
OpenUrl CrossRef PubMed

[29] 29.
Wang SV, Maro JC, Baro E, et al. Data mining for adverse drug events with a propensity score-matched tree-based scan statistic. Epidemiology. 2018 Nov;29(6):895–903. doi: 10.1097/EDE.0000000000000907.
OpenUrl CrossRef PubMed

[30] 30.
Martins RR, Silva LT, Bessa GG, Lopes FM. Trigger tools are as effective as non-targeted chart review for adverse drug event detection in intensive care units. Saudi Pharm J. 2018 Dec;26(8):1155–1161. doi: 10.1016/j.jsps.2018.07.003.
OpenUrl CrossRef PubMed

[31] 31.
Whalen E, Hauben M, Bate A. Time series disturbance detection for hypothesis-free signal detection in longitudinal observational databases. Drug Saf. 2018 Jun;41(6):565–577. PMID:30074538. DOI:10.1007/s40264-018-0640-8.
OpenUrl CrossRef PubMed

[32] 32.
Zhou X, Douglas IJ, Shen R, Bate A. Signal detection for recently approved products: Adapting and evaluating self-controlled case series method using a US claims and UK electronic medical records database. Drug Saf. 2018 May;41(5):523–536. PMID:29327136. DOI:10.1007/s40264-017-0626-y.
OpenUrl CrossRef PubMed

[33] 33.
Nydert P, Unbeck M, Härenstam KP, et al. Drug Use and Type of adverse drug events-identified by a trigger tool in different units in a Swedish pediatric hospital. Drug Healthc Patient Saf. 2020 Jan 31;12:31–40. PMID:32099481. DOI:10.2147/DHPS.S232604. eCollection 2020.
OpenUrl CrossRef PubMed

[34] 34.
Chen L, Gu Y, Ji X, Sun Z, Li H, Gao Y, Huang Y. Extracting medications and associated adverse drug events using a natural language processing system combining knowledge base and deep learning. J Am Med Inform Assoc. 2020 Jan 1;27(1):56–64. PMID:31591641. DOI:10.1093/jamia/ocz141.
OpenUrl CrossRef PubMed

[35] 35.↵
Ju M, Nguyen NTH, Miwa M, Ananiadou S. An ensemble of neural models for nested adverse drug events and medication extraction with subwords. J Am Med Inform Assoc. 2020 Jan 1;27(1):22–30. PMID:31197355. DOI:10.1093/jamia/ocz075.
OpenUrl CrossRef PubMed

[36] 36.
Griffey RT, Schneider RM, Todorov AA. Adverse events present on arrival to the emergency department: The ED as a dual safety net. Jt Comm J Qual Patient Saf. 2020 Apr;46(4):192–198. PMID:32007399. DOI:10.1016/j.jcjq.2019.12.003.
OpenUrl CrossRef PubMed

[37] 37.
Pandya AD, Patel K, Rana D, et al. Global Trigger Tool: Proficient adverse drug reaction autodetection method in critical care patient units. Indian J Crit Care Med. 2020 Mar;24(3):172–178. PMID:32435095. DOI:10.5005/jp-journals-10071-23367.
OpenUrl CrossRef PubMed

[38] 38.↵
McIsaac DI, Hamilton GM, Abdulla K, et al. Validation of new ICD-10-based patient safety indicators for identification of in-hospital complications in surgical patients: A study of diagnostic accuracy. BMJ Qual Saf. 2020 Mar;29(3):209–216. PMID:31439760. DOI:10.1136/bmjqs-2018-008852. Epub 2019 Aug 22.
OpenUrl Abstract/FREE Full Text

[39] 39.↵
de Vos MS, Hamming JF, Chua-Hendriks JJC, Marang-van de Mheen PJ. Connecting perspectives on quality and safety: patient-level linkage of incident, adverse event and complaint data. BMJ Qual Saf. 2019 Mar;28(3):180–189. PMID:30032125. doi:10.1136/bmjqs-2017-007457.
OpenUrl Abstract/FREE Full Text

[40] 40.↵
Bates DW, Cullen DJ, Laird N, et al. Incidence of adverse drug events and potential adverse drug events implications for prevention. JAMA.1995; 274: 29–34. PMID:7791255. DOI:10.1001/jama.1995.03530010043033.
OpenUrl CrossRef PubMed Web of Science

[41] 41.↵
Kane-Gill SL, Kirisci L, Verrico MM, Rothschild JM. Analysis of risk factors for adverse drug events in critically ill patients. Crit Care Med. 2012; 40(3): 823–828. PMID:22036859. doi:10.1097/CCM.0b013e318236f473.
OpenUrl CrossRef PubMed Web of Science

[42] 42.↵
Bright RA, Rankin SK, Dowdy K, et al. Potential Blood Transfusion Adverse Events Can be Found in Unstructured Text in Electronic Health Records using the “Shakespeare Method”. MedRxiv 2021;2021.01.05.21249239. DOI:10.1101/2021.01.05.21249239.
OpenUrl Abstract/FREE Full Text

[43] 43.↵
Baxter issues urgent nationwide voluntary recall of heparin 1,000 units/ml 10 and 30ml multi-dose vials NDC NUMBERS 0641-2440-45, 0641-2440-41, 0641-2450-45 and 0641-2450-41; LOTS: 107054, 117085, 047056, 097081, 107024, 107064, 107066, 107074, 107111. Food and Drug Administration. 2008 January 25. http://wayback.archive-it.org/7993/20170111131710/http://www.fda.gov/Safety/Recalls/ArchiveRecalls/2008/default.htm?Page=5.

[44] 44.↵
Johnson AEW, Pollard TJ, Shen L, et al. MIMIC-III, a freely accessible critical care database. Sci Data. 2016; 3:160035. https://doi.org/10.1038/sdata.2016.35.
OpenUrl PubMed

[45] 45.↵
Code of Federal Regulations Title 45 Part 46 Protection of Human Subjects, Subpart A—Basic HHS Policy for Protection of Human Research Subjects, §46.101 (b) (4). 2000 Oct 1. https://www.govinfo.gov/content/pkg/CFR-2000-title45-vol1/pdf/CFR-2000-title45-vol1-part46.pdf.

[46] 46.↵
Rankin SK, Bright R, Dowdy K. Bloatectomy (Version v0.0.12). Zenodo. 2020, June 26. http://doi.org/10.5281/zenodo.3909030.

[47] 47.↵
Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011; 12: 2825–2830. https://www.jmlr.org/papers/volume12/pedregosa11a/pedregosa11a.pdf.
OpenUrl CrossRef PubMed

[48] 48.↵
Sklearn.feature_extraction.text.CountVectorizer. Scikit-learn Machine Learning in Python. Scikit-learn developers. 2020. http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html.

[49] 49.↵
Marafino BJ, Boscardin WJ, Dudley RA. Efficient and sparse feature selection for biomedical text classification via the elastic net: Application to ICU risk stratification from nursing notes. J Biomed Inform. 2015; 54: 114–120. PMID: 25700665. DOI:10.1016/j.jbi.2015.02.003.
OpenUrl CrossRef PubMed

[50] 50.↵
Witten IH, Frank E, Hall MA, Pal CJ. Data Mining: Practical machine learning tools and techniques, 4 ed. Elsevier. 2016. Paperback ISBN: 9780128042915. eBook ISBN: 9780128043578.

[51] 51.↵
Tang B, Kay S and He H. Toward optimal feature selection in naive Bayes for text categorization.arXiv. 2016; 1602: 02850. DOI:10.1109/TKDE.2016.2563436.
OpenUrl CrossRef

[52] 52.↵
Blei D, Ng A Jordan M. Latent Dirichlet Allocation. J Mach Learn Res. 2003;3:993–1022. https://jmlr.org/papers/volume3/blei03a/blei03a.pdf.
OpenUrl CrossRef Web of Science

[53] 53.↵
LINEST function. Microsoft Support. 2020. https://support.microsoft.com/en-us/office/linest-function-84d7d0d9-6e50-4101-977a-fa7abf772b6d.

[54] 54.↵
Altman DG, Bland JM. How to obtain the P value from a confidence interval. BMJ. 2011;343:d2304. PMID: 22803193. DOI:10.1136/bmj.d2304.
OpenUrl CrossRef PubMed

[55] 55.↵
Heparin sodium-heparin sodium injection, solution: Drug label information. DailyMed. U.S. National Library of Medicine. 2020. https://dailymed.nlm.nih.gov/dailymed/drugInfo.cfm?setid=cb1c1e7a-c9ca-4a07-8833-e45ce436d287.

[56] 56.↵
Bassereo PP, Cocco D, Bassareo V, et al. Pharmacological treatment of vagal hyperactivity, a rare but potentially fatal cause of sudden cardiac death. Mini Rev Med Chem. 2018;18(6):483–489. PMID:28685699. DOI:10.2174/1389557517666170707102040.
OpenUrl CrossRef PubMed

[57] 57.↵
Information on heparin. Food and Drug Administration. 2017. https://wayback.archive-it.org/7993/20170722214801/https://www.fda.gov/Drugs/DrugSafety/PostmarketDrugSafetyInformationforPatientsandProviders/UCM112597.

[58] 58.↵
Acute allergic-type reactions among patients undergoing hemodialysis — Multiple states, 2007– 2008. MMWR. 2008, February 8; 57(5): 124–125. PMID: 18256585. https://www.cdc.gov/mmwr/preview/mmwrhtml/mm5705a4.htm.
OpenUrl PubMed

[59] 59.↵
Lyn TE. China pig disease caused by new strain: experts. Reuters. 2007 June 26. https://www.reuters.com/article/us-china-disease-pig-idUSHKG26819620070626.

[60] 60.
Barboza D. Virus Spreading Alarm and Pig Disease in China. New York Times. 2007, August 16. http://www.nytimes.com/2007/08/16/business/worldbusiness/16pigs.html.

[61] 61.↵
Tian K, Yu X, Zhao T, et al. Emergence of fatal PRRSV variants: unparalleled outbreaks of atypical PRRS in China and molecular dissection of the unique hallmark. PLoS ONE. 2007;2(6):e526. PMID: 17565379. DOI:10.1371/journal.pone.0000526.
OpenUrl CrossRef PubMed

[62] 62.↵
Levy P. The Harvard medical system. Not Running a Hospital. 2007, January 14. http://runningahospital.blogspot.com/2007/01/harvard-medical-system.html.

[63] 63.↵
Schofield A, Magnusson M, Mimno D. Pulling out the stops: rethinking stopword removal for topic models. IN: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Valencia, Spain, April 3-7, 2017. Association for Computational Linguistics. 2017; 432–436. https://www.aclweb.org/anthology/E17-2069.

[64] 64.↵
Halamka J. What will keep me up at night. Dispatch from the digital health frontier. 2007 Nov 1, 2, 19, 20. http://geekdoctor.blogspot.com/2007/11/.
