Beyond Likert ratings: Improving the robustness of developmental research measurement using best-worst scaling

Nichola Burton; Michael Burton; Carmen Fisher; Patricia González Peña; Gillian Rhodes; Louise Ewing

doi:10.3758/s13428-021-01566-w

Behav Res Methods. 2021 Oct;53(5):2273-2279. doi: 10.3758/s13428-021-01566-w. Epub 2021 Apr 5.

Authors

Nichola Burton¹, Michael Burton², Carmen Fisher³, Patricia González Peña³, Gillian Rhodes¹, Louise Ewing⁴

Affiliations

¹ ARC Centre of Excellence in Cognition and its Disorders, School of Psychology, University of Western Australia, Crawley, Australia.
² School of Agriculture and Environment, University of Western Australia, Crawley, Australia.
³ School of Psychology, University of East Anglia, Research Park, Norwich, NR4 7TJ, UK.
⁴ School of Psychology, University of East Anglia, Research Park, Norwich, NR4 7TJ, UK. l.ewing@uea.ac.uk.

Abstract

Some of the 'best practice' approaches to ensuring reproducibility of research can be difficult to implement in the developmental and clinical domains, where sample sizes and session lengths are constrained by the practicalities of recruitment and testing. For this reason, an important area of improvement to target is the reliability of measurement. Here we demonstrate that best-worst scaling (BWS) provides a superior alternative to Likert ratings for measuring children's subjective impressions. Seventy-three children aged 5-6 years rated the trustworthiness of faces using either Likert ratings or BWS over two sessions. Individual children's ratings in the BWS condition were significantly more consistent from session 1 to session 2 than those in the Likert condition, a finding we also replicate with a large adult sample (N = 72). BWS also produced more reliable ratings at the group level than Likert ratings in the child sample. These findings indicate that BWS is a developmentally appropriate response format that can deliver substantial improvements in reliability of measurement, which can increase our confidence in the robustness of findings with children.
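As a rough illustration of the comparison described in the abstract, the sketch below scores BWS responses and computes a test-retest correlation for one participant. The abstract does not state how BWS responses were scored or how consistency was quantified, so the count-based score (times chosen best minus times chosen worst, divided by times shown) and the Pearson correlation are assumptions, as are the function names `bws_scores` and `retest_consistency` and the toy face labels and trials.

```python
from collections import Counter
from math import sqrt


def bws_scores(trials):
    """Count-based BWS score per item: (best - worst) / times shown.

    `trials` is a list of (items_shown, best_item, worst_item) tuples.
    This scoring rule is an assumption, not necessarily the paper's method.
    """
    shown, best, worst = Counter(), Counter(), Counter()
    for items, best_item, worst_item in trials:
        shown.update(items)
        best[best_item] += 1
        worst[worst_item] += 1
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}


def retest_consistency(scores_1, scores_2):
    """Pearson correlation between two sessions over the items they share."""
    items = sorted(set(scores_1) & set(scores_2))
    x = [scores_1[i] for i in items]
    y = [scores_2[i] for i in items]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one participant judging four faces (A-D) in sets of three.
session_1 = [
    (("A", "B", "C"), "A", "C"),
    (("A", "B", "D"), "A", "D"),
    (("A", "C", "D"), "A", "D"),
    (("B", "C", "D"), "B", "D"),
]
session_2 = [
    (("A", "B", "C"), "A", "C"),
    (("A", "B", "D"), "A", "D"),
    (("A", "C", "D"), "A", "D"),
    (("B", "C", "D"), "C", "D"),  # only this trial differs from session 1
]

scores_1 = bws_scores(session_1)
scores_2 = bws_scores(session_2)
consistency = retest_consistency(scores_1, scores_2)
```

Under these assumptions, a higher `consistency` value for a participant means that their face scores were more stable across the two sessions. The same correlation could be computed on per-face Likert ratings from the two sessions to compare the response formats.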

Keywords: Best-worst scaling; Children; Development; Face perception; Measurement; Trust.

MeSH terms

Adult
Attitude*
Child
Humans
Reproducibility of Results