A Statistical Test for Differential Item Pair Functioning

Timo M Bechger; Gunter Maris

doi:10.1007/s11336-014-9408-y

Psychometrika. 2015 Jun;80(2):317-40. doi: 10.1007/s11336-014-9408-y. Epub 2014 Sep 16.

Authors

Timo M Bechger¹, Gunter Maris

Affiliation

¹ Cito, Amsterdamseweg 13, Arnhem, The Netherlands, timo.bechger@cito.nl.

PMID: 25223228

Abstract

This paper presents an IRT-based statistical test for differential item functioning (DIF). The test is developed for items conforming to the Rasch model (Rasch, Probabilistic models for some intelligence and attainment tests, The Danish Institute of Educational Research, Copenhagen, 1960), and we outline its extension to more complex IRT models. It differs from existing procedures in that DIF is defined in terms of the relative difficulties of pairs of items rather than the difficulties of individual items. The argument is that the difficulty of an individual item is not identified from the observations, whereas the relative difficulties are. This leads to a test that is closely related to Lord's test for item DIF (Lord, Applications of item response theory to practical testing problems, Erlbaum, Hillsdale, 1980), albeit with a different and more correct interpretation. Illustrations with real and simulated data are provided.
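
The identification argument in the abstract can be illustrated with a small numerical sketch. It is not the authors' test procedure. It only assumes the standard Rasch response function, P(correct) = 1 / (1 + exp(-(theta - delta))), and uses made-up ability and difficulty values; the names `rasch_prob`, `pairwise_differences`, `thetas`, `deltas`, and `shift` are hypothetical. Adding the same constant to every ability and every difficulty leaves all response probabilities unchanged, so the data cannot pin down an individual difficulty, whereas the differences between item difficulties are unaffected by the shift.

```python
import math


def rasch_prob(theta, delta):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def pairwise_differences(deltas):
    """Matrix of relative difficulties delta_i - delta_j for all item pairs."""
    return [[d_i - d_j for d_j in deltas] for d_i in deltas]


# Hypothetical values, chosen only for illustration.
thetas = [-1.0, 0.0, 1.5]   # person abilities
deltas = [-0.5, 0.2, 1.0]   # item difficulties
shift = 3.0                 # arbitrary constant added to both scales

# Response probabilities before and after shifting both scales.
probs = [[rasch_prob(t, d) for d in deltas] for t in thetas]
probs_shifted = [
    [rasch_prob(t + shift, d + shift) for d in deltas] for t in thetas
]

# The probabilities agree, so individual difficulties are not identified.
same_probs = all(
    math.isclose(p, q, abs_tol=1e-12)
    for row_p, row_q in zip(probs, probs_shifted)
    for p, q in zip(row_p, row_q)
)

# The relative difficulties of item pairs do not depend on the shift.
diffs = pairwise_differences(deltas)
diffs_shifted = pairwise_differences([d + shift for d in deltas])
same_diffs = all(
    math.isclose(a, b, abs_tol=1e-12)
    for row_a, row_b in zip(diffs, diffs_shifted)
    for a, b in zip(row_a, row_b)
)

print("probabilities unchanged by shift:", same_probs)
print("pairwise differences unchanged by shift:", same_diffs)
```

In a DIF setting, this is why comparing the difficulty of a single item across two groups depends on an arbitrary normalization chosen within each group, while comparing the relative difficulty of an item pair across groups does not.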

MeSH terms

Algorithms
Data Interpretation, Statistical*
Humans
Models, Statistical*
Models, Theoretical
Psychometrics / methods*