Phys Ther. 2005 Mar;85(3):257-68.

The kappa statistic in reliability studies: use, interpretation, and sample size requirements.

Author information

  • Primary Care Sciences Research Centre, Keele University, Keele, Staffordshire ST5 5BG, United Kingdom. j.sim@keele.ac.uk



This article examines and illustrates the use and interpretation of the kappa statistic in musculoskeletal research.


The reliability of clinicians' ratings is an important consideration in areas such as diagnosis and the interpretation of examination findings. Often, these ratings lie on a nominal or an ordinal scale. For such data, the kappa coefficient is an appropriate measure of reliability. Kappa is defined, in both weighted and unweighted forms, and its use is illustrated with examples from musculoskeletal research. Factors that can influence the magnitude of kappa (prevalence, bias, and non-independent ratings) are discussed, and ways of evaluating the magnitude of an obtained kappa are considered. The issue of statistical testing of kappa is considered, including the use of confidence intervals, and appropriate sample sizes for reliability studies using kappa are tabulated.
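
As a rough illustration of the unweighted and weighted forms of kappa described above, the following Python sketch computes Cohen's kappa for two raters. The function name `cohen_kappa`, its parameters, and the example ratings are hypothetical and are not taken from the article. The linear and quadratic weights shown are two commonly used schemes, assumed here for illustration, and may differ from those the article presents.

```python
def cohen_kappa(ratings_a, ratings_b, categories=None, weights=None):
    """Cohen's kappa for two raters (illustrative sketch, not the article's code).

    weights: None for unweighted kappa, or "linear" / "quadratic" for
    weighted kappa on an ordinal scale. `categories` fixes the category
    order; if omitted, the sorted set of observed ratings is used.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both raters must rate the same non-empty set of subjects")
    if weights not in (None, "linear", "quadratic"):
        raise ValueError("weights must be None, 'linear', or 'quadratic'")
    if categories is None:
        categories = sorted(set(ratings_a) | set(ratings_b))

    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Cross-tabulate the paired ratings as proportions of all subjects.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Penalty for a rating pair in cell (i, j); 0 means full agreement.
        if i == j:
            return 0.0
        if weights is None:
            return 1.0
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed_dis = sum(disagreement(i, j) * observed[i][j] for i, j in cells)
    # Disagreement expected by chance if the raters were independent.
    expected_dis = sum(disagreement(i, j) * row[i] * col[j] for i, j in cells)
    if expected_dis == 0:
        raise ValueError("Kappa is undefined when chance agreement is complete")

    return 1.0 - observed_dis / expected_dis


# Hypothetical ratings of four patients by two clinicians.
nominal = cohen_kappa(["pos", "pos", "neg", "neg"],
                      ["pos", "neg", "neg", "neg"])

# Hypothetical ordinal ratings (0 = none, 1 = mild, 2 = severe).
ordinal_unweighted = cohen_kappa([0, 1, 2, 2], [0, 1, 1, 2])
ordinal_linear = cohen_kappa([0, 1, 2, 2], [0, 1, 1, 2], weights="linear")
```

In this sketch the weighted form gives partial credit to near-misses on the ordinal scale, so `ordinal_linear` comes out higher than `ordinal_unweighted` for the same ratings.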


The article concludes with recommendations for the use and interpretation of kappa.

[PubMed - indexed for MEDLINE]
Free full text