A general method for sifting linguistic knowledge from structured terminologies

Proc AMIA Symp. 2000:310-4.

Abstract

Morphological knowledge is useful for medical language processing, information retrieval, and terminology or ontology development. We show how a large volume of morphological associations between words can be learnt from existing medical terminologies by exploiting the semantic relations already encoded between their terms: synonymy, hierarchy and transversal relations. The proposed method relies on no a priori linguistic knowledge. Because it can work with different relations between terms, it can be applied to any structured terminology. Tested on SNOMED and ICD in French and English, it identifies fairly reliable morphological relations (precision > 90%) with good coverage (over 88% compared to the UMLS lexical variant generation program). For English words with a stem longer than 3 characters, recall reaches 98.8% for inflection and 94.7% for derivation.
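
The abstract does not spell out the matching procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that two words are morphologically associated when they occur in semantically related terms and share an initial string of at least four characters (one reading of "a stem longer than 3 characters"). The function names, the threshold constant and the toy term pairs are hypothetical and are not taken from SNOMED or ICD.

```python
from itertools import product

# Assumed threshold: a shared initial string "longer than 3 characters".
MIN_STEM = 4


def common_prefix_len(a, b):
    """Return the length of the longest common initial string of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def morphological_pairs(related_terms, min_stem=MIN_STEM):
    """Collect word pairs sharing a long initial string across related terms.

    related_terms is an iterable of (term, term) pairs already linked in the
    terminology (for example by synonymy). No linguistic knowledge is used.
    """
    pairs = set()
    for term_a, term_b in related_terms:
        words_a = term_a.lower().split()
        words_b = term_b.lower().split()
        for w1, w2 in product(words_a, words_b):
            if w1 == w2:
                continue  # identical words carry no morphological information
            if common_prefix_len(w1, w2) >= min_stem:
                pairs.add(tuple(sorted((w1, w2))))
    return pairs


def suffix_rule(w1, w2):
    """Return the pair of endings left after removing the shared initial string."""
    k = common_prefix_len(w1, w2)
    return (w1[k:], w2[k:])


# Hypothetical toy data standing in for synonym pairs from a terminology.
related = [
    ("aortic stenosis", "stenosis of aorta"),
    ("cardiac disease", "disease of heart"),
    ("abdominal pain", "pain in abdomen"),
]

found = morphological_pairs(related)
for w1, w2 in sorted(found):
    print(w1, w2, suffix_rule(w1, w2))
```

On this toy input the sketch pairs "aorta" with "aortic" and "abdomen" with "abdominal", while "cardiac" and "heart" are left unpaired because they share no initial string. Restricting the comparison to words from related terms is what keeps chance resemblances between unrelated words out of the result.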

MeSH terms

  • Linguistics*
  • Semantics
  • Vocabulary, Controlled*