LREC Int Conf Lang Resour Eval. 2016 May;2016:3772-3778.

Annotating Logical Forms for EHR Questions.

Author information

1. School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston TX, USA.
2. Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, Bethesda MD, USA.

Abstract

This paper discusses the creation of a semantically annotated corpus of questions about patient data in electronic health records (EHRs). The goal is to provide the training data necessary for semantic parsers to automatically convert EHR questions into a structured query. A layered annotation strategy is used which mirrors a typical natural language processing (NLP) pipeline. First, questions are syntactically analyzed to identify multi-part questions. Second, medical concepts are recognized and normalized to a clinical ontology. Finally, logical forms are created using a lambda calculus representation. We use a corpus of 446 questions asking for patient-specific information. From these, 468 specific questions are found containing 259 unique medical concepts and requiring 53 unique predicates to represent the logical forms. We further present detailed characteristics of the corpus, including inter-annotator agreement results, and describe the challenges automatic NLP systems will face on this task.
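
The three annotation layers described above can be pictured with a small Python sketch. It is a toy illustration only, not the authors' annotation tooling or schema: the lexicon, the placeholder concept identifiers, the `has_concept` predicate, and every function name are assumptions made for this example, and the naive split on " and " stands in for real syntactic analysis.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Toy lexicon: surface phrase -> placeholder concept ID.
# These IDs are invented for illustration; they are NOT real ontology codes.
TOY_LEXICON: Dict[str, str] = {
    "diabetes": "CONCEPT_DIABETES",
    "blood pressure": "CONCEPT_BLOOD_PRESSURE",
}


@dataclass
class AnnotatedQuestion:
    """One sub-question together with its three annotation layers."""
    text: str
    concepts: Dict[str, str] = field(default_factory=dict)
    logical_form: str = ""


def split_question(question: str) -> List[str]:
    """Layer 1 (toy): split a multi-part question on the word 'and'."""
    parts = [part.strip(" ?") for part in question.split(" and ")]
    return [part + "?" for part in parts if part]


def normalize_concepts(text: str) -> Dict[str, str]:
    """Layer 2 (toy): map known phrases to placeholder concept IDs."""
    lowered = text.lower()
    return {phrase: cid for phrase, cid in TOY_LEXICON.items() if phrase in lowered}


def build_logical_form(concepts: Dict[str, str]) -> str:
    """Layer 3 (toy): wrap each concept in a hypothetical predicate
    inside a lambda-calculus-style expression."""
    if not concepts:
        return ""
    body = " & ".join(f"has_concept(x, {cid})" for cid in concepts.values())
    return f"lambda x.{body}"


def annotate(question: str) -> List[AnnotatedQuestion]:
    """Run the three toy layers in order, one record per sub-question."""
    annotated = []
    for sub_question in split_question(question):
        concepts = normalize_concepts(sub_question)
        annotated.append(
            AnnotatedQuestion(sub_question, concepts, build_logical_form(concepts))
        )
    return annotated


if __name__ == "__main__":
    for item in annotate("Does she have diabetes and what is her blood pressure?"):
        print(item.text, "|", item.concepts, "|", item.logical_form)
```

The sketch keeps each layer's output on the same record so that a later layer can be checked against the earlier ones, which is one plausible reading of how a layered strategy mirrors an NLP pipeline.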

KEYWORDS:

electronic health records; question answering; semantic parsing

PMID: 28503677
PMCID: PMC5428549