[The analysis of CIRSmedical.de using Natural Language Processing]

Z Evid Fortbild Qual Gesundhwes. 2022 Apr:169:1-11. doi: 10.1016/j.zefq.2021.12.002. Epub 2022 Feb 17.
[Article in German]

Abstract

Background: CIRSmedical.de is a publicly accessible, cross-institutional reporting and learning system run by the German Agency for Quality in Medicine (ÄZQ). It has existed since 2005 and has published more than 6,000 event reports. Until now, these reports have either been analysed individually in detail or evaluated systematically with a focus on specific topics. A systematic evaluation of all case reports has not yet been conducted. Natural Language Processing (NLP) is an analysis strategy from the field of Artificial Intelligence for indexing texts. Here, the case reports were examined using NLP in order to describe the characteristics of event reports and comments.

Materials and methods: For this analysis, 6,480 case reports from CIRSmedical.de (as of December 10, 2019) were provided by the ÄZQ as Excel files. Several free-text fields were included in the analysis, as well as the feedback of the CIRS team (expert commentary). Text lengths, reporting behaviour, sentiment values and keywords were examined. The analysis algorithms were developed in Python using the libraries NLTK and spaCy.
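The three text measures named above (text length, sentiment value, keywords) can be illustrated with a minimal sketch. It is not the authors' implementation: the study used NLTK and spaCy, whereas this example uses only the Python standard library, and the word lists, the function names and the length-normalised sentiment score are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon and stop-word list, for illustration only.
NEGATIVE = {"error", "wrong", "missing", "delayed"}
POSITIVE = {"correct", "safe", "resolved"}
STOPWORDS = {"the", "a", "was", "and", "to", "of", "in", "is", "were", "not"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def word_count(text):
    """Text length measured as the number of word tokens."""
    return len(tokenize(text))


def sentiment_score(text):
    """Positive minus negative lexicon hits, divided by the token count.

    Dividing by the token count is an assumption; it keeps long and
    short texts comparable.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    return hits / len(tokens)


def top_keywords(texts, n=3):
    """Most frequent non-stop-word tokens across a collection of texts."""
    counts = Counter(
        t for text in texts for t in tokenize(text) if t not in STOPWORDS
    )
    return counts.most_common(n)


if __name__ == "__main__":
    # Invented example texts, not taken from CIRSmedical.de.
    reports = [
        "The wrong dose was given.",
        "Dose label missing, dose delayed.",
    ]
    for report in reports:
        print(word_count(report), sentiment_score(report))
    print(top_keywords(reports, n=1))
```

A real pipeline would replace the hand-written word lists with a German sentiment lexicon and a proper stop-word list, and would typically lemmatise tokens before counting keywords.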

Results: Comparing report lengths across subject groups yielded a heterogeneous picture, in terms of both the number of reports and the number of words. More than 4,000 reports come from the field of anaesthesiology, and their text lengths vary particularly widely, following a right-skewed distribution. Only a few reports come from the field of psychotherapy, and these are also very short. The different professional groups (nurses, doctors, other staff) write reports of about the same length. Reports and expert commentaries also differ in their sentiment values: the comments are more negative in sentiment, which is attributed to their greater length. Keywords can be identified but are highly heterogeneous.
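A right-skewed length distribution means that most texts are short while a few are very long, which shows up as a positive skewness coefficient. The sketch below computes the Fisher-Pearson moment coefficient of skewness per subject group. The abstract does not say which statistic the authors used, so the choice of coefficient, the function names and the sample word counts are assumptions for illustration.

```python
from statistics import mean


def skewness(values):
    """Fisher-Pearson moment coefficient g1 = m3 / m2**1.5.

    Uses population (biased) central moments. A positive result
    indicates a right-skewed distribution.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mu = mean(values)
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # all values identical: no asymmetry to measure
    return m3 / m2 ** 1.5


def skew_by_group(lengths_by_group):
    """Map each subject group to the skewness of its text lengths."""
    return {group: skewness(v) for group, v in lengths_by_group.items()}


if __name__ == "__main__":
    # Invented word counts, not data from the study.
    lengths = {
        "anaesthesiology": [20, 25, 30, 35, 400],
        "psychotherapy": [10, 12, 14],
    }
    for group, g1 in skew_by_group(lengths).items():
        print(group, round(g1, 2))
```

With the invented numbers, the single very long text in the first group produces a positive coefficient, while the symmetric second group yields zero.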

Discussion: Systematic analysis using NLP makes it possible to describe text properties in event reports and comments, and thereby to draw conclusions about the reporters' intention, focus and mood when they report in CIRS. The sentiment analysis indicates the mood that the texts convey, both as reports and as commentaries. Text length analysis draws attention to different problems and tendencies. Event reports are usually much shorter than comments, and texts that are too short run the risk that their information will not be readily usable for analysis. Comments are often longer, which raises the opposite problem: texts that are too long may not be read. Examining texts by means of NLP helps to rethink the reason for and the form of input, both when reporting and when commenting. It is a first step towards automatic, supportive classification of texts and towards improving the interaction between reporters and the system.

Keywords: CIRS; Natural Language Processing; Patient safety.

MeSH terms

  • Artificial Intelligence*
  • Attitude
  • Germany
  • Humans
  • Language
  • Natural Language Processing*