Sampling ADEs: A Guide for Corpus Creation

Electronic health record systems with clinical decision support capabilities are increasingly prevalent in the ambulatory setting. However, the impacts of these systems on patient care are mixed with respect to medication safety.

Approaches to identifying adverse drug events (ADEs) have improved with the use of natural language processing tools and other methods; however, most relevant research has been conducted in inpatient or oncology settings.

ADEs are not uncommon in the ambulatory setting. However, the prevalence and distribution of the ADEs documented in ambulatory visits are unclear. Our task is to identify a representative corpus of ambulatory visit notes with which to train and validate an ADE detection tool.

In this talk, I will discuss our process for identifying a corpus of notes that appropriately captures the wide diversity of ADEs present but is manageable for a small team of annotators to annotate and de-identify.

About the speaker

Joseph Plasek

Postdoctoral Research Fellow at Mass General Brigham

Joseph M. Plasek, Ph.D. is a health data scientist, clinical informatician, and dynamic systems theorist. Joseph is currently a Postdoctoral Research Fellow at Brigham and Women’s Hospital.

Joseph is a member of the MTERMS lab, a group of researchers who develop natural language processing tools that have been used to study medication reconciliation, allergies, and adverse reactions, among other topics.