|
[1.] Identifying Suicide Attempt and Ideation Events from EHRs (ongoing).
Suicide is an important public health concern and one of the leading causes of death worldwide. Suicidal behaviors, including suicide attempts (SA) and suicidal ideation (SI), are leading risk factors for death by suicide. Information related to patients' previous and current SA and SI is frequently documented in electronic health record (EHR) notes. Accurate detection of such documentation may help improve surveillance and prediction of patients' suicidal behaviors and alert medical professionals for suicide prevention efforts. In this work, we first built the Suicide Attempt and IdeatioN Events (ScAN) dataset, a subset of the publicly available MIMIC III dataset spanning 12k+ EHR notes with 19k+ annotated SA and SI events. The annotations also contain attributes such as the method of suicide attempt. We also built a strong baseline model, ScANER, a multi-task RoBERTa-based model with a retrieval module that extracts all the relevant suicidal-behavior evidence from the EHR notes of a hospital stay, and a prediction module that identifies the type of suicidal behavior (SA or SI) concluded during the patient's stay at the hospital (a minimal code sketch of this setup is shown below).
Coming soon: Paper, Leaderboard. Work accepted at NAACL'22.
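A minimal sketch of the kind of setup described above, not the released ScANER code: a shared RoBERTa encoder, a retrieval head that scores each note paragraph as suicidal-behavior evidence, and a prediction head that labels the hospital stay. The class name, label set, and pooling choice are illustrative assumptions.

```python
# Sketch only: shared RoBERTa encoder with a per-paragraph evidence (retrieval)
# head and a stay-level prediction head. Label sizes are assumptions, not the
# paper's exact configuration.
import torch
import torch.nn as nn
from transformers import RobertaModel, RobertaTokenizerFast

class ScanerSketch(nn.Module):
    def __init__(self, model_name="roberta-base", num_stay_labels=4):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.evidence_head = nn.Linear(hidden, 2)            # evidence / not evidence
        self.stay_head = nn.Linear(hidden, num_stay_labels)  # e.g. SA, SI, both, neither (illustrative)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]                    # one vector per paragraph
        evidence_logits = self.evidence_head(cls)            # per-paragraph retrieval scores
        # aggregate paragraph representations (here: mean pool) for the stay-level label
        stay_logits = self.stay_head(cls.mean(dim=0, keepdim=True))
        return evidence_logits, stay_logits

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
paragraphs = ["Patient admitted after intentional overdose.",
              "Denies current suicidal ideation."]
batch = tokenizer(paragraphs, padding=True, truncation=True, return_tensors="pt")
model = ScanerSketch()
evidence_logits, stay_logits = model(batch["input_ids"], batch["attention_mask"])
```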
|
|
[2.] Associations Between TBI, PTSD, and Intentional Self-Harm among US Veterans (ongoing). Veterans with a history of traumatic brain injury (TBI) and/or posttraumatic stress disorder (PTSD) may be at increased risk of suicide attempts and other forms of intentional self-harm compared with Veterans without TBI or PTSD. Using administrative data from the United States (US) Veterans Health Administration (VHA), we studied associations between TBI and PTSD diagnoses and hospitalizations due to intentional self-harm among US Veterans who used VHA healthcare.
Coming soon: Paper. Work accepted at Annual Research Meeting (ARM), Academy Health (2022).
|
|
[3.] Extracting SDOH concepts from UMLS.
Social Determinants of Health (SDOH) are the conditions in which people are born, live, work, and age. The Unified Medical Language System (UMLS) incorporates SDOH concepts, but few studies have evaluated its coverage and quality. Using 15,649 expert-annotated SDOH mentions from 3,176 randomly selected electronic health record (EHR) notes, we found that 100% of SDOH mentions could be mapped to at least one UMLS concept, indicating good coverage of SDOH. However, we also identified several challenges in the UMLS's representation of SDOH. Next, we developed a multi-step framework to identify SDOH concepts from the UMLS, and a clinical BERT-based classification algorithm to assign each identified SDOH concept to one of six general categories (a minimal sketch of this classification step is shown below). Our multi-step framework extracted a total of 198,677 SDOH concepts from the UMLS, and the SDOH category classification system attained an accuracy of 91%. We also built EASE, an open-source tool to Extract SDOH from EHRs.
Coming soon: Papers. Work accepted at AMIA Clinical Informatics 2022 (poster and system-demo) and Annual Research Meeting (ARM), Academy Health (2022).
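A minimal sketch of the concept-to-category step, assuming standard Hugging Face APIs: a clinical BERT classifier that assigns a UMLS concept name to one of six SDOH categories. The checkpoint, category names, and function below are illustrative, not the exact labels or code from the paper.

```python
# Sketch only: classify a UMLS concept string into one of six SDOH categories.
# Category names are placeholders; the classifier head here is untrained.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CATEGORIES = ["cat_1", "cat_2", "cat_3", "cat_4", "cat_5", "cat_6"]  # placeholder labels

model_name = "emilyalsentzer/Bio_ClinicalBERT"  # a public clinical BERT checkpoint (assumption)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=len(CATEGORIES))

def categorize(concept_name: str) -> str:
    """Assign a UMLS concept string (e.g. 'Housing instability') to an SDOH category."""
    inputs = tokenizer(concept_name, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return CATEGORIES[int(logits.argmax(dim=-1))]

print(categorize("Housing instability"))  # output is illustrative only (untrained head)
```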
|
|
[4.] Inferring Causality between Adverse Drug Reactions (ADRs) and a medication via Naranjo Scale.
In this work we focused on identifying causality between a medication and its adverse drug reactions (ADRs) using a clinically standardized assessment technique, the Naranjo scale. The Naranjo scale provides a list of ten questions that are answered against a patient's notes to assess drug-ADR causality. We developed a multi-task learning framework that takes a question from the Naranjo scale along with a patient's note, identifies the relevant evidence sentences and paragraphs in the note, and predicts the answer (a minimal sketch of this setup is shown below). Such automated causality assessments are essential for pharmacovigilance and drug safety surveillance and would reduce the manpower required for manual chart reviews.
See our KDD 2019, MLHC 2019, and AMIA 2020 papers.
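A rough sketch of the multi-task idea, not the published system: encode a Naranjo question paired with each note sentence, score each sentence as evidence, and classify the answer for that question. The encoder choice, head names, and the yes/no/unknown answer set are assumptions.

```python
# Sketch only: joint evidence scoring and answer classification for one Naranjo
# question over the sentences of a patient's note.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class NaranjoMultiTask(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_answers=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.evidence_head = nn.Linear(hidden, 2)           # is this sentence evidence?
        self.answer_head = nn.Linear(hidden, num_answers)   # yes / no / unknown (assumed labels)

    def forward(self, input_ids, attention_mask):
        cls = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state[:, 0]
        # answer predicted from the mean of all (question, sentence) representations
        return self.evidence_head(cls), self.answer_head(cls.mean(0, keepdim=True))

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
question = "Did the adverse event appear after the suspected drug was administered?"
sentences = ["Rash developed two days after starting amoxicillin.",
             "Patient has a history of seasonal allergies."]
batch = tokenizer([question] * len(sentences), sentences,
                  padding=True, truncation=True, return_tensors="pt")
model = NaranjoMultiTask()
evidence_logits, answer_logits = model(batch["input_ids"], batch["attention_mask"])
```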
|
|
[5.] Entity-Enriched Neural Models for Clinical Question Answering.
We explore state-of-the-art neural models for question answering on electronic medical records and improve their ability to generalize to previously unseen (paraphrased) questions at test time. We enable this by learning to predict logical forms as an auxiliary task alongside the main task of answer span detection. The predicted logical forms also serve as a rationale for the answer. Further, we incorporate medical entity information into these models via the ERNIE (Zhang et al., 2019a) architecture. We train our models on the large-scale emrQA dataset and observe that our multi-task entity-enriched models generalize to paraphrased questions ∼5% better than the baseline BERT model (a minimal sketch of the multi-task setup is shown below).
See our BioNLP, ACL 2020 paper. This work was done with Dr. Preethi Raghavan.
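A simplified sketch of the multi-task setup: a BERT-style QA model that predicts the answer span in the note and, as an auxiliary task, classifies the question's logical-form template. The encoder, the number of logical-form templates, and the head names are illustrative assumptions; the entity-enriched (ERNIE) variant is not reproduced here.

```python
# Sketch only: answer-span detection with an auxiliary logical-form classification head.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class EmrQAMultiTask(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_logical_forms=30):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.span_head = nn.Linear(hidden, 2)                # start / end logits per token
        self.lf_head = nn.Linear(hidden, num_logical_forms)  # auxiliary logical-form class

    def forward(self, input_ids, attention_mask):
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        start_logits, end_logits = self.span_head(states).split(1, dim=-1)
        lf_logits = self.lf_head(states[:, 0])               # predicted from the [CLS] token
        return start_logits.squeeze(-1), end_logits.squeeze(-1), lf_logits

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer("What medication was prescribed for hypertension?",
                  "The patient was started on lisinopril 10 mg daily.",
                  return_tensors="pt")
model = EmrQAMultiTask()
start_logits, end_logits, lf_logits = model(batch["input_ids"], batch["attention_mask"])
```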
|
|
All the remaining projects are listed here: Projects
|