Int J Clin Pharmacol Ther. 1997 Mar;35(3):93-5.

Interobserver agreement: Cohen's kappa coefficient does not necessarily reflect the percentage of patients with congruent classifications.

Author information

Department of Biometry, Byk Gulden Pharmaceuticals, Konstanz, Germany.


A widely accepted approach to evaluating interrater reliability for categorical responses involves the rating of n subjects by at least 2 raters. Frequently, there are only 2 response categories, such as a positive or negative diagnosis. The same approach is commonly used to assess the concordant classification by 2 diagnostic methods. Depending on whether one uses the raw percent agreement or the agreement corrected for that expected by chance (i.e., Cohen's kappa coefficient), one can obtain quite different values. This short communication demonstrates that Cohen's kappa coefficient of agreement between 2 raters or 2 diagnostic methods based on binary (yes/no) responses does not parallel the percentage of patients with congruent classifications. Therefore, it may be of limited value in assessing increases in interrater reliability due to an improved diagnostic method. The percentage of patients with congruent classifications is easier to interpret clinically; however, it does not account for the percentage of agreement expected by chance. We therefore recommend presenting both the percentage of patients with congruent classifications and Cohen's kappa coefficient with 95% confidence limits.
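The divergence between percent agreement and kappa can be illustrated with a minimal Python sketch for a 2x2 table. The function name `agreement_stats`, the cell labels `a`, `b`, `c`, `d`, and the two example tables are hypothetical and are not taken from the paper. The confidence limits use a simple large-sample approximation of the standard error, which is an assumption here; the abstract does not state which interval method the authors had in mind.

```python
import math


def agreement_stats(a, b, c, d, z=1.96):
    """Percent agreement and Cohen's kappa for a 2x2 table of two raters.

    a: both raters say yes      b: rater 1 yes, rater 2 no
    c: rater 1 no, rater 2 yes  d: both raters say no
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("table is empty")

    # Observed proportion of subjects with congruent classifications.
    p_obs = (a + d) / n

    # Agreement expected by chance, from the marginal totals of each rater.
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (p_obs - p_exp) / (1 - p_exp)

    # Assumption: simple large-sample standard error approximation.
    se = math.sqrt(p_obs * (1 - p_obs) / (n * (1 - p_exp) ** 2))

    return {
        "percent_agreement": p_obs,
        "kappa": kappa,
        "ci_low": kappa - z * se,
        "ci_high": kappa + z * se,
    }


# Two hypothetical tables with identical percent agreement (90 of 100).
balanced = agreement_stats(45, 5, 5, 45)  # positives and negatives equally common
skewed = agreement_stats(85, 5, 5, 5)     # positives dominate

for name, result in (("balanced", balanced), ("skewed", skewed)):
    print(
        f"{name}: agreement={result['percent_agreement']:.2f} "
        f"kappa={result['kappa']:.2f} "
        f"95% CI=({result['ci_low']:.2f}, {result['ci_high']:.2f})"
    )
```

In both hypothetical tables 90% of subjects are classified congruently, yet kappa is about 0.80 for the balanced table and about 0.44 for the skewed one, because chance agreement rises from 0.50 to 0.82 when one category dominates. Reporting both figures side by side, as the abstract recommends, makes this visible.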

[Indexed for MEDLINE]