Source
Health Services Research Centre, Akershus University Hospital, Lørenskog, Norway. stkr@sus.no
Abstract
PURPOSE:
We evaluated and compared the interrater reliability of the Simplified Acute Physiology Score (SAPS) II and SAPS 3 to measure how consistently different raters score the same cases.
METHOD:
Ten junior doctors working at two general ICUs were trained in the use of SAPS II and SAPS 3 in a 2.5-h training program. After training, they scored 24 cases in both systems. Scores were analyzed using the intraclass correlation coefficient (ICC). To identify variables with low reliability, subscores were analyzed using the ICC, and individual variables were compared with a template score using weighted kappa statistics.
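As an illustration of the ICC analysis described above, the following is a minimal sketch that computes a two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1), from a cases-by-raters table. The abstract does not state which ICC form was used, so this choice is an assumption; the function name `icc_2_1` and the example scores are hypothetical and are not study data.

```python
def icc_2_1(scores):
    """Two-way random-effects, absolute-agreement, single-rater ICC.

    scores: list of rows, one row per case, one column per rater.
    The ICC form is an assumption; the abstract does not specify it.
    """
    n = len(scores)        # number of cases
    k = len(scores[0])     # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    case_means = [sum(row) / k for row in scores]
    rater_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_cases = k * sum((m - grand) ** 2 for m in case_means)
    ss_raters = n * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_cases - ss_raters

    # Mean squares.
    ms_cases = ss_cases / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_cases - ms_error) / (
        ms_cases + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


# Hypothetical example: 4 cases scored by 3 raters (not study data).
example_scores = [
    [30, 32, 29],
    [45, 47, 50],
    [60, 58, 61],
    [22, 25, 24],
]
print(round(icc_2_1(example_scores), 2))
```

Because this form measures absolute agreement, a rater who scores every case systematically higher than the others lowers the coefficient, which matches the concern about differing scoring practices between raters.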
RESULTS:
The ICC (95% CI) of the scores was 0.84 (0.74, 0.91) for SAPS II and 0.80 (0.68, 0.89) for SAPS 3, which is considered adequate for both systems. The raters' mean mortality predictions spanned a range of 0.12 in SAPS II and 0.19 in SAPS 3. Administrative data, including age, had high reliability, whereas variables based on diagnostic information had only moderate reliability. Laboratory data had consistently higher reliability than variables based on the interpretation of charts.
CONCLUSION:
Both SAPS II and SAPS 3 have adequate interrater reliability, but the standardized mortality ratios are still likely to be influenced by the rater's scoring practice.
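To illustrate why the standardized mortality ratio (SMR) depends on the rater, here is a minimal sketch that takes the SMR as observed deaths divided by the sum of predicted mortality risks. The cohort size, the number of deaths, and the per-patient risks are hypothetical; only the 0.12 gap between the two raters' mean predictions mirrors the range reported for SAPS II.

```python
def standardized_mortality_ratio(observed_deaths, predicted_risks):
    """SMR = observed deaths / expected deaths (sum of predicted risks)."""
    expected_deaths = sum(predicted_risks)
    return observed_deaths / expected_deaths


# Hypothetical cohort: 100 patients, 20 observed deaths (not study data).
observed = 20
rater_a_risks = [0.20] * 100  # mean predicted mortality 0.20
rater_b_risks = [0.32] * 100  # mean predicted mortality 0.12 higher

smr_a = standardized_mortality_ratio(observed, rater_a_risks)
smr_b = standardized_mortality_ratio(observed, rater_b_risks)
print(round(smr_a, 3), round(smr_b, 3))
```

Under these assumed numbers, the same patients and outcomes yield an SMR near 1.0 for one rater and well below 1.0 for the other, purely because of the difference in scoring.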