Assessing interrater agreement on binary measurements via intraclass odds ratio

Biom J. 2016 Jul;58(4):962-73. doi: 10.1002/bimj.201500109. Epub 2016 Mar 14.

Abstract

Interrater agreement on binary measurements is usually assessed via Scott's π or Cohen's κ, which are known to be difficult to interpret. One reason for this difficulty is that these coefficients can be defined as a correlation between two exchangeable measurements made on the same subject, that is, as an "intraclass correlation", a concept originally defined for continuous measurements. However, to measure an association between two binary variables, it is more common to calculate an odds ratio than a correlation. For assessing interrater agreement on binary measurements, we therefore suggest calculating the odds ratio between two exchangeable measurements made on the same subject, yielding the concept of an "intraclass odds ratio". Since it is interpretable as a ratio of probabilities of (strict) concordance and discordance (between two raters rating two subjects), an intraclass odds ratio might be easier to understand for researchers and clinicians than an intraclass correlation. It might thus be a valuable descriptive measure (summary index) for evaluating the agreement among a set of raters, without having to refer to arbitrary benchmark values. To facilitate its use, an explicit formula for a confidence interval for the intraclass odds ratio is also provided.
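As a concrete illustration (not taken from the paper itself): with cell probabilities p11, p10, p01, p00 for a pair of exchangeable binary ratings, exchangeability forces p10 = p01, so the intraclass odds ratio p11·p00/(p10·p01) can be estimated by splitting the observed discordant count evenly between the two discordant cells. The Python sketch below computes this plug-in estimate; the confidence interval is a generic Wald-type interval on the log scale with a 0.5 continuity correction, an assumption standing in for the paper's explicit formula, which the abstract does not reproduce.

```python
# Hypothetical sketch, not the authors' code: intraclass odds ratio from
# exchangeable paired binary ratings, with a generic log-scale Wald interval.
import math

def intraclass_odds_ratio(pairs, z=1.96):
    """pairs: list of (rating_a, rating_b) tuples with values 0/1.

    Under exchangeability the two discordant cells share one probability,
    so the observed discordant count is split evenly between them.
    """
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)
    n00 = sum(1 for a, b in pairs if a == 0 and b == 0)
    nd = sum(1 for a, b in pairs if a != b)  # total discordant pairs
    # Symmetrized 2x2 counts with a 0.5 continuity correction to avoid
    # division by zero in sparse tables (a common convention, assumed here).
    c11, c00, cd = n11 + 0.5, n00 + 0.5, nd / 2 + 0.5
    or_hat = (c11 * c00) / (cd * cd)
    # Wald standard error of the log odds ratio applied to the symmetrized
    # table (an assumption; the paper's own interval formula may differ).
    se = math.sqrt(1 / c11 + 1 / c00 + 2 / cd)
    lo = math.exp(math.log(or_hat) - z * se)
    hi = math.exp(math.log(or_hat) + z * se)
    return or_hat, (lo, hi)

# Example: 100 subjects, each rated by two raters.
pairs = [(1, 1)] * 40 + [(0, 0)] * 45 + [(1, 0)] * 8 + [(0, 1)] * 7
print(intraclass_odds_ratio(pairs))
```

An estimate well above 1 indicates that concordant ratings are more probable than discordant ones, which is the direct reading the abstract advertises, without reference to benchmark values.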

Keywords: Binary measurements; Concordances and discordances; Interrater agreement; Intraclass correlation; Intraclass odds ratio.

MeSH terms

  • Biometry / methods*
  • Data Interpretation, Statistical*
  • Humans
  • Observer Variation
  • Odds Ratio
  • Reproducibility of Results