J Biomed Inform. 2016 Apr;60:14-22. doi: 10.1016/j.jbi.2016.01.003. Epub 2016 Jan 13.

Classification of clinically useful sentences in clinical evidence resources.

Author information

1 Department of Operations and Information Systems, David Eccles School of Business, University of Utah, Salt Lake City, UT, USA.
2 Lister Hill Center, National Library of Medicine, Bethesda, MD, USA.
3 Department of Preventive Medicine, Division of Health and Biomedical Informatics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
4 Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA. Electronic address: guilherme.delfiol@utah.edu.

Abstract

Most patient care questions raised by clinicians can be answered by online clinical knowledge resources. However, important barriers still challenge the use of these resources at the point of care.

OBJECTIVE:

To design and assess a method for extracting, from synthesized online clinical resources, the sentences most useful for directly answering clinicians' information needs.

MATERIALS AND METHODS:

We developed a Kernel-based Bayesian Network classification model based on different domain-specific feature types extracted from sentences in a gold standard composed of 18 UpToDate documents. These features included UMLS concepts and their semantic groups, semantic predications extracted by SemRep, patient population identified by a pattern-based natural language processing (NLP) algorithm, and cue words extracted by a feature selection technique. Algorithm performance was measured in terms of precision, recall, and F-measure.
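The three evaluation measures named above can be illustrated with a minimal sketch. It assumes binary sentence labels (useful or not useful) and the balanced F-measure (F1); the abstract does not state which F-measure variant was used, and the function name and sample labels below are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall, and balanced F-measure (F1).

    gold, predicted: parallel lists of booleans, where True marks a
    sentence as clinically useful. The balanced F1 is an assumption;
    the abstract only says "F-measure".
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # true positives
    fp = sum(1 for g, p in pairs if not g and p)    # false positives
    fn = sum(1 for g, p in pairs if g and not p)    # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels for six sentences (not data from the study).
gold = [True, True, True, False, False, False]
predicted = [True, True, False, True, False, False]
precision, recall, f1 = precision_recall_f1(gold, predicted)
```

With these made-up labels there are two true positives, one false positive, and one false negative, so precision, recall, and F1 each equal 2/3.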

RESULTS:

The feature-rich approach yielded an F-measure of 74% versus 37% for a feature co-occurrence method (p<0.001). Excluding predication, population, semantic concept or text-based features reduced the F-measure to 62%, 66%, 58% and 69% respectively (p<0.01). The classifier applied to Medline sentences reached an F-measure of 73%, which is equivalent to the performance of the classifier on UpToDate sentences (p=0.62).

CONCLUSIONS:

The feature-rich approach significantly outperformed both general baseline methods and classifiers based on a single type of feature. Each type of semantic feature made a unique contribution to overall classification performance. The classifier's model and features used for UpToDate generalized well to Medline abstracts.

KEYWORDS:

Clinical decision support; Machine learning; Natural language processing; Text summarization

PMID: 26774763
PMCID: PMC4836984
DOI: 10.1016/j.jbi.2016.01.003
[Indexed for MEDLINE]
Free PMC Article
