Display Settings:


Send to:

Choose Destination
See comment in PubMed Commons below
Int J Med Inform. 2009 Dec;78(12):e7-12. doi: 10.1016/j.ijmedinf.2009.02.005. Epub 2009 Mar 18.

Towards automated processing of clinical Finnish: sublanguage analysis and a rule-based parser.

Author information

  • 1Department of Information Technology, University of Turku, Turku, Finland. first.last@utu.fi



In this paper, we present steps taken towards more efficient automated processing of clinical Finnish, focusing on daily nursing notes in a Finnish Intensive Care Unit (ICU). First, we analyze ICU Finnish as a sublanguage, identifying its specific features facilitating, for example, the development of a specialized syntactic analyser. The identified features include frequent omission of finite verbs, limitations in allowed syntactic structures, and domain-specific vocabulary. Second, we develop a formal grammar and a parser for ICU Finnish, thus providing better tools for the development of further applications in the clinical domain.


The grammar is implemented in the LKB system in a typed feature structure formalism. The lexicon is automatically generated based on the output of the FinTWOL morphological analyzer adapted to the clinical domain. As an additional experiment, we study the effect of using Finnish constraint grammar to reduce the size of the lexicon. The parser construction thus makes efficient use of existing resources for Finnish.


The grammar currently covers 76.6% of ICU Finnish sentences, producing highly accurate best-parse analyzes with F-score of 91.1%. We find that building a parser for the highly specialized domain sublanguage is not only feasible, but also surprisingly efficient, given an existing morphological analyzer with broad vocabulary coverage. The resulting parser enables a deeper analysis of the text than was previously possible.

[PubMed - indexed for MEDLINE]
PubMed Commons home

PubMed Commons

How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for Elsevier Science
    Loading ...
    Write to the Help Desk