Preliminary testing of the reliability and feasibility of SAGE: a system to measure and score engagement with and use of research in health policies and programs

Steve R Makkar; Anna Williamson; Catherine D'Este; Sally Redman

doi:10.1186/s13012-017-0676-7

Implement Sci. 2017 Dec 19;12(1):149. doi: 10.1186/s13012-017-0676-7.

Authors

Steve R Makkar¹, Anna Williamson², Catherine D'Este³, Sally Redman²

Affiliations

¹ The Sax Institute, Level 13, Building 10, 235 Jones Street, Ultimo, New South Wales, 2007, Australia. steve.makkar@saxinstitute.org.au.
² The Sax Institute, Level 13, Building 10, 235 Jones Street, Ultimo, New South Wales, 2007, Australia.
³ National Centre for Epidemiology and Population Health (NCEPH), Research School of Population Health, The Australian National University, 62 Mills Road, Acton, Australian Capital Territory, 0200, Australia.

Abstract

Background: Few measures of research use in health policymaking are available, and the reliability of such measures has yet to be evaluated. A new measure called the Staff Assessment of Engagement with Evidence (SAGE) incorporates an interview that explores policymakers' research use within discrete policy documents and a scoring tool that quantifies the extent of policymakers' research use based on the interview transcript and analysis of the policy document itself. We aimed to conduct a preliminary investigation of the usability, sensitivity, and reliability of the scoring tool in measuring research use by policymakers.

Methods: Nine experts in health policy research and two independent coders were recruited. Each expert used the scoring tool to rate a random selection of 20 interview transcripts, and each independent coder rated 60 transcripts. The distribution of scores among experts was examined, and interrater reliability was then tested within and between the experts and independent coders. Average- and single-measure reliability coefficients were computed for each SAGE subscale.
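
The difference between single-measure and average-measure reliability can be illustrated with a minimal sketch. The abstract does not state which reliability coefficient or model was used, so the code below assumes an intraclass correlation under a two-way random-effects, absolute-agreement model (the Shrout and Fleiss ICC(2,1) and ICC(2,k) forms). The function name and the toy ratings are hypothetical and are not study data.

```python
def icc_two_way_random(ratings):
    """Return (single_measure, average_measure) ICCs.

    ratings: list of rows, one row per rated item (e.g. a transcript),
    one column per rater. Assumes a complete table with no missing scores.
    Model assumed here: two-way random effects, absolute agreement.
    """
    n = len(ratings)        # number of rated items
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-item mean square
    ms_cols = ss_cols / (k - 1)                 # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    # Single-measure: reliability of one rater's score
    single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    # Average-measure: reliability of the mean score across all k raters
    average = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return single, average


# Hypothetical subscale scores: 4 transcripts rated by 3 raters
toy_ratings = [
    [3, 4, 3],
    [1, 2, 1],
    [5, 5, 4],
    [2, 3, 3],
]
single, average = icc_two_way_random(toy_ratings)
print(f"single-measure ICC:  {single:.2f}")
print(f"average-measure ICC: {average:.2f}")
```

Under this assumed model, the average-measure coefficient describes the reliability of the mean score across raters, whereas the single-measure coefficient describes the reliability expected from one rater scoring alone.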

Results: Experts' scores ranged from the limited to extensive scoring bracket for all subscales. Experts as a group also exhibited at least a fair level of interrater agreement across all subscales. Single-measure reliability was at least fair except for three subscales: Relevance Appraisal, Conceptual Use, and Instrumental Use. Average- and single-measure reliability among independent coders was good to excellent for all subscales. Finally, reliability between experts and independent coders was fair to excellent for all subscales.

Conclusions: Among experts, the scoring tool was comprehensible, usable, and sensitive enough to discriminate between documents with varying degrees of research use. The scoring tool also yielded scores with good reliability among the independent coders. There was greater variability among experts, although the tool was fairly reliable when experts were considered as a group. The alignment between experts' and independent coders' ratings indicates that the independent coders were scoring in a manner comparable to health policy research experts. If the present findings are replicated in a larger sample, end users (e.g. policy agency staff) could potentially be trained to use SAGE to reliably score research use within their agencies, which would provide a cost-effective and time-efficient approach to utilising this measure in practice.

MeSH terms

Evidence-Based Practice / standards
Health Policy*
Humans
Interviews as Topic / standards*
Policy Making*
Reproducibility of Results
Research / organization & administration*
Research / standards

Grants and funding

APP1001436/National Health and Medical Research Council