Deriving SF-12v2 physical and mental health summary scores: a comparison of different scoring algorithms

John A Fleishman; Alfredo J Selim; Lewis E Kazis

doi:10.1007/s11136-009-9582-z

Qual Life Res. 2010 Mar;19(2):231-41. doi: 10.1007/s11136-009-9582-z. Epub 2010 Jan 22.

Authors

John A Fleishman¹, Alfredo J Selim, Lewis E Kazis

Affiliation

¹ Center for Cost and Financing Studies, Agency for Healthcare Research and Quality, Rockville, MD 20852, USA. john.fleishman@ahrq.hhs.gov

PMID: 20094805

Abstract

Purpose: Summary scores for the SF-12, version 2 (SF-12v2) health status measure are based on scoring coefficients derived for version 1 of the SF-36, despite changes in item wording and response scales and despite the fact that SF-12 scales contain only a subset of SF-36 items. This study derives new summary scores based directly on SF-12v2 data from a recent U.S. sample and compares the new summary scores to the standard ones. Because the methods for developing scoring coefficients for the summary scores are controversial, we also compare summary scores produced by different methods.

Methods: We analyzed nationally representative U.S. data, which provided 53,399 observations for the SF-12v2 in 2003-2005. In addition to the standard SF-12v2 scoring algorithm, summary scores were generated using exploratory factor analysis (EFA), principal components analysis (PCA), and confirmatory factor analysis (CFA), with orthogonal and oblique rotation. We examined correlations among different summary scores, their associations with demographic and clinical variables, and the consistency between changes in scale scores and in summary scores over time.
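
As a rough illustration of one of the approaches named in the Methods (PCA with orthogonal rotation), the sketch below derives two summary scores from eight scale scores. It is not the authors' procedure: the data are synthetic, the loading pattern is invented, varimax is assumed as the orthogonal rotation, and the rescaling to mean 50 and SD 10 is assumed as the norm-based metric. The names `varimax`, `summary_scores`, and `pattern` are hypothetical, and the scale abbreviations (PF, RP, BP, GH, VT, SF, RE, MH) are used only as labels.

```python
import numpy as np

# Labels for the eight scales (physical functioning ... mental health).
SCALES = ["PF", "RP", "BP", "GH", "VT", "SF", "RE", "MH"]


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal (varimax) rotation of a p x k loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ (rotated ** 3 - rotated @ col_ss / p))
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break  # criterion stopped improving
        criterion = new_criterion
    return loadings @ rotation, rotation


def summary_scores(scale_scores):
    """Two PCA-based summary scores on a mean-50, SD-10 metric (assumed)."""
    # Standardize each scale within the sample (population SD).
    z = (scale_scores - scale_scores.mean(axis=0)) / scale_scores.std(axis=0)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    top = np.argsort(eigvals)[::-1][:2]  # keep the two largest components
    loadings = eigvecs[:, top] * np.sqrt(eigvals[top])
    rotated, _ = varimax(loadings)
    rotated = rotated * np.sign(rotated.sum(axis=0))  # fix arbitrary signs
    # Component score coefficients: solve corr @ W = rotated loadings.
    weights = np.linalg.solve(corr, rotated)
    return 50 + 10 * (z @ weights), weights


# Synthetic data: two correlated latent dimensions and an invented pattern.
rng = np.random.default_rng(0)
n = 5000
latent = rng.multivariate_normal([0, 0], [[1, 0.4], [0.4, 1]], size=n)
pattern = np.array([[0.8, 0.1], [0.8, 0.2], [0.7, 0.2], [0.6, 0.3],
                    [0.3, 0.6], [0.3, 0.7], [0.2, 0.8], [0.1, 0.8]])
scales = latent @ pattern.T + 0.5 * rng.standard_normal((n, len(SCALES)))

t_scores, weights = summary_scores(scales)
print(np.corrcoef(t_scores, rowvar=False).round(3))
```

Because the rotation is orthogonal, the two summary scores come out uncorrelated by construction even though the simulated latent dimensions are correlated. An oblique rotation would relax that constraint.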

Results: The 8 scale means in the current data were similar to the 1998 SF-12v2 means, with the exception of the vitality scale. Correlations among the scales based on SF-12v2 data differed slightly from the corresponding correlations based on SF-36 data. Correlations among summary scores derived using different methods were high (≥0.84). However, changes in summary scores derived using orthogonal rotation of components or factors were not consistent with changes in sub-scales, whereas changes in summary scores derived using oblique rotation were more consistent with patterns of change in sub-scales.
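
One mechanism often discussed for this kind of inconsistency, which the abstract itself does not spell out, is that orthogonal scoring can assign negative weights to mental health scales in the physical summary score. The arithmetic sketch below assumes that mechanism. Both weight vectors are made up for illustration and are not coefficients from this study; only their sign pattern matters. The names `PCS_ORTHOGONAL`, `PCS_OBLIQUE`, and `pcs_change` are hypothetical.

```python
import numpy as np

SCALES = ["PF", "RP", "BP", "GH", "VT", "SF", "RE", "MH"]

# Hypothetical physical-summary weights (illustrative values only).
# Orthogonal-style: mental scales (RE, MH) carry negative weights.
PCS_ORTHOGONAL = np.array([0.40, 0.35, 0.30, 0.25, 0.05, 0.00, -0.15, -0.20])
# Oblique-style: every scale carries a non-negative weight.
PCS_OBLIQUE = np.array([0.30, 0.28, 0.25, 0.20, 0.08, 0.05, 0.02, 0.01])


def pcs_change(delta_z, weights):
    """Change in the physical summary score (SD = 10 metric assumed)
    implied by changes in standardized scale scores."""
    return 10.0 * float(delta_z @ weights)


# Scenario: the two mental scales improve by half a standard deviation,
# and the physical scales do not change at all.
delta_z = np.zeros(len(SCALES))
delta_z[SCALES.index("RE")] = 0.5
delta_z[SCALES.index("MH")] = 0.5

change_orthogonal = pcs_change(delta_z, PCS_ORTHOGONAL)
change_oblique = pcs_change(delta_z, PCS_OBLIQUE)
print(f"orthogonal-style weights: {change_orthogonal:+.2f}")
print(f"oblique-style weights:    {change_oblique:+.2f}")
```

Under these assumed weights, the orthogonal-style physical summary score falls although no physical scale got worse, while the oblique-style score moves slightly upward, in line with the direction of the sub-scale changes.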

Conclusions: Although the basic structure of the SF-12 is stable, summary scores derived from oblique rotation are preferable and more consistent with changes in individual scales. On empirical and conceptual grounds, we suggest using summary scores based on oblique CFA.

Publication types

Comparative Study
Validation Study

MeSH terms

Adult
Aged
Aged, 80 and over
Algorithms*
Cohort Studies
Factor Analysis, Statistical
Female
Health Status Indicators*
Health Surveys
Humans
Male
Mental Health*
Middle Aged
Principal Component Analysis
Psychometrics / standards*
Regression Analysis
Surveys and Questionnaires