Synonym, topic model and predicate-based query expansion for retrieving clinical documents

AMIA Annu Symp Proc. 2012:2012:1050-9. Epub 2012 Nov 3.

Abstract

We developed and tested three query expansion methods for retrieving clinical documents, a challenging task when the documents sit in a large clinical data warehouse. First, we implemented a synonym expansion strategy that used a few selected vocabularies. Second, we trained a topic model on a large set of clinical documents and used it to identify related terms for query expansion. Third, we obtained related terms for query expansion from a large predicate database derived from Medline abstracts. The three expansion methods were tested on a set of clinical notes. All three achieved higher average recall and average F-measure than the baseline method. Average precision and precision at 10, however, decreased with all expansions. Among the three expansion methods, the topic model-based method performed best in terms of recall and F-measure.
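The trade-off described above, where expansion raises recall while lowering precision, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the synonym table, the five toy notes, the relevance judgments, and the helper names (`expand_query`, `retrieve`, `evaluate`, `precision_at_k`) are all hypothetical, and matching is a simple whole-word lookup rather than the retrieval system the authors used.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical synonym table (illustrative only; not the study's vocabularies).
SYNONYMS: Dict[str, Set[str]] = {"dyspnea": {"breathlessness", "sob"}}

# Toy notes and relevance judgments, invented for this sketch.
DOCS: Dict[str, str] = {
    "d1": "Patient reports dyspnea on exertion",
    "d2": "Complains of breathlessness at night",
    "d3": "SOB noted after climbing stairs",
    "d4": "Family history of breathlessness, patient denies symptoms",
    "d5": "No acute distress",
}
RELEVANT: Set[str] = {"d1", "d2", "d3"}


def expand_query(term: str, synonyms: Dict[str, Set[str]]) -> Set[str]:
    """Return the query term together with any listed synonyms."""
    term = term.lower()
    return {term} | synonyms.get(term, set())


def retrieve(terms: Set[str], docs: Dict[str, str]) -> List[str]:
    """Return ids of documents containing any query term as a whole word."""
    hits = []
    for doc_id, text in docs.items():
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        if tokens & terms:
            hits.append(doc_id)
    return hits


def evaluate(retrieved: List[str], relevant: Set[str]) -> Tuple[float, float, float]:
    """Compute precision, recall, and balanced F-measure for one query."""
    true_pos = len(set(retrieved) & relevant)
    precision = true_pos / len(retrieved) if retrieved else 0.0
    recall = true_pos / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k results that are relevant (divides by k)."""
    return sum(1 for doc_id in ranked[:k] if doc_id in relevant) / k


baseline = retrieve({"dyspnea"}, DOCS)
expanded = retrieve(expand_query("dyspnea", SYNONYMS), DOCS)
print("baseline:", baseline, evaluate(baseline, RELEVANT))
print("expanded:", expanded, evaluate(expanded, RELEVANT))
```

In this toy setup the expanded query finds all three relevant notes but also pulls in one irrelevant note that mentions a synonym in a family-history context, so recall and F-measure rise while precision falls. Averaging these per-query values over a query set would give average figures of the kind the abstract reports; how the study itself computed them is not stated here.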

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Abstracting and Indexing
  • Clinical Medicine
  • Information Storage and Retrieval / methods*
  • MEDLINE
  • Medical Records*
  • Natural Language Processing*
  • Unified Medical Language System
  • Vocabulary, Controlled*