NIH Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Ann Rheum Dis. Author manuscript; available in PMC Dec 1, 2010.
Published in final edited form as:
PMCID: PMC2950749

Use of “spydergrams” to present and interpret SF-36 health-related quality of life data across rheumatic diseases


The Medical Outcomes Study Short Form-36 (SF-36) is a generic measure of health-related quality of life (HRQOL), validated and cross-culturally translated, which has been extensively utilised in rheumatology. In randomised controlled trials and observational studies, SF-36 provides rich data regarding HRQOL; but as typically portrayed, patterns of disease and treatment-associated effects can be difficult to discern. “Spydergrams” offer a simplified means to visualise complex results across all domains of SF-36 in a single figure: depicting disease and population-specific patterns of decrements in HRQOL compared with age and gender-matched normative data, as well as providing a tool for interpreting complex treatment-associated or longitudinal changes. Utilising spydergrams as a standard format to illustrate and report changes in SF-36 across different rheumatic diseases can greatly facilitate analyses and interpretations of clinical trial results, as well as providing patients an accessible means to compare baseline scores and treatment-associated improvements with normative data from individuals without arthritis. Furthermore, SF-6D utility scores based on mean changes across all eight domains of SF-36 are suggested as a quantitative means of summarising changes illustrated by spydergrams, offering a universal metric for cost-effectiveness analyses of therapeutic interventions.

The Medical Outcomes Study Short Form-36 (SF-36) was developed to measure self-reported health-related quality of life (HRQOL): 36 questions combined into eight domains reflecting different dimensions of health,1, 2 grouped into composite physical and mental component summary (PCS and MCS) scores.3 SF-36 has been cross-culturally translated4 and is widely used for clinical research, health policy evaluations as well as general population surveys. A US Veterans Affairs version has also been derived and validated.5

Extensively validated in randomised controlled trials (RCT) and longitudinal observational studies, this generic instrument has demonstrated sensitivity to treatment effects and reflects the impact of various rheumatic diseases upon HRQOL, including rheumatoid arthritis (RA),6, 7 systemic lupus erythematosus (SLE),8 psoriatic arthritis,9 ankylosing spondylitis,10 gout,11 systemic sclerosis (SSc),12 fibromyalgia13 and osteoarthritis.14 SF-36 scores correlate well with improvements in physical function measured by health assessment questionnaire (HAQ-DI) in RA.15 Over the past decade, RCT with new disease-modifying antirheumatic drugs have documented significant treatment-associated changes, including “improvement in physical function and HRQOL”, which have become established labelling claims for approved therapies.15


Differences in the way individuals perceive and report HRQOL can be better interpreted by viewing baseline and change scores across domains, scored from 0 to 100, without z-transformation and normalisation as recommended in version 2 of SF-36, both of which reduce the magnitude of possible change. In contrast to the current practice of displaying SF-36 as eight-columned bar charts, “spydergrams” offer the ability to view changes more easily across all domains as a pattern recognition profile, depicting disease and population-specific “patterns” of decrements in baseline values compared with matched normative data, as well as treatment-associated or longitudinal changes. These “irregularly formed octagons or polygons” can be informative, reflecting different patterns of HRQOL and the impact of underlying disease on “multidimensional function”.
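As an illustration of how norm-based scoring can compress change, the following minimal sketch applies a z-transformation followed by rescaling to a mean of 50 and SD of 10, the usual form of norm-based scoring. The normative mean and SD below are hypothetical placeholders chosen for illustration; they are not taken from the SF-36 manuals.

```python
def norm_based_score(raw, norm_mean, norm_sd):
    """Convert a raw 0-100 domain score to a norm-based score (mean 50, SD 10)."""
    z = (raw - norm_mean) / norm_sd  # z-transformation against the norm
    return 50.0 + 10.0 * z           # rescale so the norm has mean 50, SD 10


# Hypothetical normative mean and SD for a single domain (illustrative only).
NORM_MEAN = 80.0
NORM_SD = 25.0

# Hypothetical raw domain scores on the 0-100 scale.
baseline_raw = 40.0
followup_raw = 60.0

raw_change = followup_raw - baseline_raw
nbs_change = (norm_based_score(followup_raw, NORM_MEAN, NORM_SD)
              - norm_based_score(baseline_raw, NORM_MEAN, NORM_SD))

print(f"raw change: {raw_change:.1f}, norm-based change: {nbs_change:.1f}")
```

Whenever the normative SD exceeds 10 points, as assumed here, the same improvement appears smaller after transformation: with these placeholder values a 20-point raw change corresponds to 8 norm-based points.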

For heuristic or analytic purposes, SF-36 domain score bar graphs are presented as line graphs to aid the viewer in perceiving effects or trends. Similarly, in spydergrams categorical changes are connected (linked) to facilitate visual recognition of patterns, with the disclaimer that this is not intended to imply these are continuous scales. It is not unusual to see figures presented as line graphs that colour in the area below the line, not for significance as an “area under the curve” analysis, but to facilitate visual recognition of differences further. Spydergrams are an evolution from these standard practices, whereby the axis is simply rotated to connect with itself; bar graphs of baseline and changes in domain scores are connected with lines, and areas below the lines are shaded to facilitate pattern recognition.

To compare across disease states, the order of domains presented should be consistent, whether or not they reflect a certain sequence or priority of importance. Convention has dictated that the four physical domains are presented from left to right in a bar chart, then mental domains; thus in a “spydergram” physical function (PF) is at the top, 12 o’clock, followed clockwise by role physical (RP), bodily pain (BP) and general health perceptions (GH), with vitality (VT) at the 6 o’clock position, and then social functioning (SF), role emotional (RE) and mental health index (MH) (fig 1A, B).7, 15 Domain scores are plotted from 0 (worst) at the centre to 100 (best) at the outside; demarcations along axes of the domains present changes of 10 points, representing one to two times minimum clinically important differences (MCID). Changes in shape and thickness of these irregular octagonal rings offer a single graphic representation to: (1) compare baseline decrements with age and gender-matched normative values; (2) assess treatment-associated or longitudinal improvements in HRQOL and (3) compare and contrast scores across protocols and disease states. As spydergrams allow visualisation of these values simultaneously, they may be presented on an individual basis with norms as a “treatment goal” and were recently utilised in a patient-assessed programme of therapy.16
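The layout described above can be sketched in a few lines of code. The following is a minimal illustration, not the software used to produce the figures: it maps eight domain scores to polygon vertices with PF at 12 o'clock, the remaining domains following clockwise at 45 degree intervals, 0 at the centre and 100 at the outer edge. The function names and the example scores are hypothetical.

```python
import math

# Domain order as described in the text: PF at 12 o'clock, then clockwise.
DOMAINS = ["PF", "RP", "BP", "GH", "VT", "SF", "RE", "MH"]


def spydergram_vertices(scores):
    """Map 0-100 domain scores to (x, y) vertices of the spydergram polygon.

    A score of 0 lies at the centre and 100 at the outer edge. Axes are
    45 degrees apart, starting at the top and proceeding clockwise.
    """
    vertices = []
    for i, domain in enumerate(DOMAINS):
        score = scores[domain]
        if not 0 <= score <= 100:
            raise ValueError(f"{domain} score must be between 0 and 100")
        # 90 degrees is 12 o'clock; subtracting 45 per domain moves clockwise.
        theta = math.radians(90 - 45 * i)
        vertices.append((score * math.cos(theta), score * math.sin(theta)))
    return vertices


def tick_radii(step=10):
    """Radii of the demarcations along each axis (every 10 points by default)."""
    return list(range(0, 101, step))


# Hypothetical baseline scores, for illustration only.
example_scores = {"PF": 40, "RP": 30, "BP": 35, "GH": 50,
                  "VT": 40, "SF": 60, "RE": 55, "MH": 65}

for domain, (x, y) in zip(DOMAINS, spydergram_vertices(example_scores)):
    print(f"{domain}: x={x:7.2f}, y={y:7.2f}")
```

Plotting baseline, follow-up and normative polygons from the same function and shading between them would give the "rings" discussed in the text; any plotting library that draws closed polygons could be used for that step.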

Figure 1
Rheumatoid arthritis. Data from the PREMIER randomised controlled trial (RCT): adalimumab plus methotrexate (ADA+MTX) versus methotrexate (MTX) in methotrexate-naive subjects with disease duration of 7–9 months. (A) Baseline scores from PREMIER ...


Although the shaded areas of a spydergram are suggestive of an area under the curve analysis, interpreting them in this way would be misleading. The typical eight-column bar graphs have been linked into a single graphic for ease of interpretation (pattern recognition), but technically represent categorical data, not a continuum. Nevertheless, a summary metric that combines data from all eight domains into a single score is important for quantifying changes in these patterns.

Statistical analyses require a primary outcome measure and PCS and MCS scores are often chosen as a single metric for analysis of SF-36 within rheumatology. However, PCS and MCS scores do not fully reflect patterns of change within the domains as they are derived from z-transformed and norm-based domain scores. The model for their derivation assumes that physical and mental health constructs are independent,17 but in the Swedish SF-36 normative database, Taft et al18 illustrated significant correlations between PCS and MCS scores. Farivar et al19 showed there were fewer negative factor scoring coefficients using an oblique factor than standard orthogonal solutions. Hann and Reeves20 recently tested several models in two large databases, again observing correlations between PCS and MCS scores and that the relationship between domain and PCS and MCS scores varied significantly by medical condition, supporting the argument against the orthogonal derivation of scores. Furthermore, Ware and Kosinski21, 22 have argued that: “one of the best defenses against inappropriate conclusions based on the summary measures is the thorough comparison with results based on the 8 SF-36 subscales (domains)”.

An alternative approach to summarise SF-36 domain scores quantitatively could be health state preferences, or utilities valuing health from “0” death, to “1” perfect health, an economic measure critical for evaluation of cost-effectiveness of therapeutic interventions. Ara and Brazier,23, 24 Brazier et al25 and Marra et al26 developed a new calculation of SF-6D, which utilises mean scores across all eight SF-36 domains to yield a single utility measure, which has been validated in longitudinal databases and against EQ-5D within a rheumatic disease population. This single valuation may be used to represent baseline decrements and change scores portrayed by spydergrams.

The use of spydergrams to compare and contrast the impact of multiple rheumatic diseases upon HRQOL, measured by SF-36, has made their value apparent in clarifying decrements in HRQOL compared with matched normative data, as well as treatment-associated improvements.


Figures 1–3 illustrate SF-36 data, available from published reports and abstract presentations, analysed from RCT in RA, SLE, gout, SSc and osteoarthritis. Age and gender-matched normative data specific to each population were generated based on US norms published in SF-36 manuals and updates.27 Spydergrams were configured for each study, and utility scores were generated following the approach of Ara and Brazier.24 These figures reveal different “polygonal” patterns for each rheumatic disease.

Figure 3
Gout and osteoarthritis. (A) Data from the Vet-QOL survey from veterans with gout and comorbidities (red) versus US norms (light purple polygon), “treatment failure gout” patients enrolled in the longitudinal observational Natural History ...

In early and later disease, RA appears to impact all domains of HRQOL, especially RP, PF and BP, but also RE (figs 1A, C).7, 28 Treatment-associated changes are large in all, not just physical domains, and are greatest in those with the largest decrements at baseline. In SLE (fig 2A), baseline SF-36 scores were low across all domains compared with matched norms.29 In contrast to RA, large decrements in any one domain do not stand out, reflecting the broad impact of active disease on mental as well as physical domains. When baseline as well as treatment-associated changes are viewed as spydergrams, SF-36 data reflected clinical responses defined by the British Isles Lupus Assessment Group (BILAG) disease activity score, patient and physician global scores and decreases in prednisone dose, despite small sample sizes and loss of balanced randomisation.30–32

Figure 2
Systemic lupus erythematosus and systemic sclerosis. (A) In a combined analysis of two prematurely discontinued randomised controlled trials (RCT) in systemic lupus erythematosus, treatment-associated changes with active treatment are large; despite low ...

In SSc (fig 2B), SF-36 scores from a failed RCT with relaxin33 reveal patterns different from the previous examples, with markedly lower PF and RP scores than those with SLE and more decrements in VT, BP and GH domains, despite the heterogeneity and multi-organ involvement shared by both conditions.

In the Vet-QOL survey (fig 3A), veterans with gout reported statistically more medical and arthritic comorbidities, hospitalisations and utilisation of outpatient services than those without, and large decrements across all HRQOL domains compared with matched norms.11 In comparison with a “treatment failure gout” population enrolled in an observational Natural History Study,34 baseline scores in both groups were low and remarkably similar, as were SF-36 scores reported by treatment failure gout patients enrolled in two phase 3 protocols comparing pegloticase versus placebo.35 Values reflect decrements in HRQOL comparable to those reported by subjects with longstanding, debilitating RA, or active SLE.36 In contrast, osteoarthritis appears exclusively to impact PF, RP and BP domains, with preservation of other scores, including VT (fatigue)37 (fig 3B).


As we have attempted to demonstrate, spydergrams can provide an effective tool to perceive more quickly patterns of change in complex sets of data. They are designed to illustrate differences between baseline and normative data, and portray treatment-associated changes in the context of age and gender-matched norms specific to the population studied. Due to the inclusion of baseline values and comparisons with matched norms as lower and upper bounds, spydergrams allow visual comparisons of the thickness of “rings” exactly proportional to the degree of changes from baseline. Although the shape of the octagon or polygon would change according to the order of presentation of the domains, perceived effects would still remain proportional along each axis, reflecting the impact of disease, facilitating comparisons across conditions.38 Baseline values and treatment-associated changes, in terms of clinical meaningfulness relative to MCID, can easily be discerned by examination of changes along individual domain axes. SF-36 manuals provide data from which age and gender-matched US norms can be generated. Importantly, normative data are available for Great Britain, Denmark, Norway and Sweden, The Netherlands and Turkey, among others.4, 39
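The quantities that a reader takes from the "rings" can be made explicit with a short calculation. The sketch below is illustrative only: for each domain it computes the change from baseline (the ring thickness along that axis), the baseline decrement from the matched norm, the gap remaining after treatment, and the change expressed in 10-point axis demarcations. The function name and all scores are hypothetical.

```python
def ring_summary(baseline, followup, norm):
    """Summarise per-domain changes as read along each spydergram axis."""
    summary = {}
    for domain in baseline:
        change = followup[domain] - baseline[domain]
        summary[domain] = {
            # Thickness of the ring along this axis.
            "change": change,
            # Distance from baseline to the matched norm (upper bound).
            "baseline_decrement": norm[domain] - baseline[domain],
            # Distance still separating follow-up from the norm.
            "remaining_gap": norm[domain] - followup[domain],
            # Change expressed in 10-point axis demarcations.
            "demarcations": change / 10,
        }
    return summary


# Hypothetical scores for two domains (illustrative only).
baseline = {"PF": 40, "BP": 35}
followup = {"PF": 65, "BP": 55}
norm = {"PF": 80, "BP": 70}

for domain, values in ring_summary(baseline, followup, norm).items():
    print(domain, values)
```

Because each axis is scaled identically from 0 to 100, these per-domain values do not depend on the order in which the domains are drawn, even though the overall shape of the polygon does.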

Data presented here demonstrate that the pattern of baseline domain scores as well as longitudinal and treatment-associated changes appear to be unique to each rheumatic disease. They also show that improvements tend to occur in domains with the largest decrements at baseline compared with age and gender-matched norms. It is evident that comparing data across all eight domains offers a richness of information not available when solely evaluating PCS and MCS scores or utilising norm-transformed domain scores. Importantly, findings derived from RCT and longitudinal observational studies are similar, supporting the robustness of these observations.

Utilising the recently derived SF-6D utility score to summarise data across all eight domains in a single metric offers a numeric comparison across disease states, in addition to the shape and thickness comparisons offered by spydergrams. The use of SF-6D also facilitates economic evaluations as baseline and change scores can be transformed into utilities for the calculation of quality-adjusted life years, a universal metric in cost-effectiveness analyses. Combining spydergrams with a single metric that generates health utility measures, SF-6D, allows both quantitative and qualitative assessment of the impact of disease and its treatment upon multidimensional function.
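As a minimal sketch of how utilities feed into quality-adjusted life years, the example below weights time by utility using the trapezoidal rule. It does not implement the SF-6D algorithm; it assumes utilities have already been derived by whatever method is chosen, and the utility values, time points and function name are hypothetical.

```python
def qalys(times_years, utilities):
    """Quality-adjusted life years: utility integrated over time (trapezoidal rule)."""
    if len(times_years) != len(utilities) or len(times_years) < 2:
        raise ValueError("need matching lists with at least two time points")
    total = 0.0
    for i in range(1, len(times_years)):
        interval = times_years[i] - times_years[i - 1]
        mean_utility = (utilities[i] + utilities[i - 1]) / 2
        total += interval * mean_utility
    return total


# Hypothetical utilities at baseline, 6 months and 12 months (illustrative only).
times = [0.0, 0.5, 1.0]
treated = [0.60, 0.70, 0.70]
control = [0.60, 0.60, 0.60]

qaly_gain = qalys(times, treated) - qalys(times, control)
print(f"QALY gain over one year: {qaly_gain:.3f}")
```

With these placeholder values the treated group accrues 0.675 QALYs over one year against 0.600 for the control group, a gain of 0.075; dividing an incremental cost by such a gain yields the cost per QALY used in cost-effectiveness analyses.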


Funding: DK was supported by a National Institutes of Health Award (NIAMS K23 AR053858-01A1) and the Scleroderma Foundation (New Investigator Award).


Competing interests: VS is a consultant to the following: Abbott Immunology, Alder, Allergan, Almirall, Amgen Corporation, AstraZeneca, Bexel, BiogenIdec, CanFite, Centocor, Chelsea, Crescendo, Cypress Biosciences, Eurodiagnostica, Fibrogen, Forest Laboratories, Genentech, Human Genome Sciences, Idera, Incyte, Jazz Pharmaceuticals, Lexicon Genetics, Logical Therapeutics, Lux Biosciences, Medimmune, Merck Serono, Novartis Pharmaceuticals, NovoNordisk, Nuon, Ono Pharmaceuticals, Pfizer, Procter and Gamble, Rigel, Roche, Sanofi-Aventis, Savient, Schering Plough, SKK, UCB, Wyeth and Xdx. The other authors declare no conflicts of interest.

Provenance and peer review: Not commissioned; externally peer reviewed.


1. Ware JE, Sherbourne CD. The MOS 36-item short-form health survey (SF–36), I. Conceptual framework and item selection. Med Care. 1992;30:473–83. [PubMed]
2. Ware J, Kosinski M. SF–36 Physical and Mental Health Summary Scales: a manual for users of version 1. Boston, MA: The Health Institute. New England Medical Center; 2001.
3. McHorney CA, Ware JE, Raczek AE. The MOS 36-item Short Form Health Survey (SF-36): II. Psychometric and clinical tests of validity in measuring physical and mental health constructs. Med Care. 1993;31:247–63. [PubMed]
4. Ware JE, Keller S, Bentler PM, et al. Comparisons of health status measurement models and the validity of the SF-36 in Great Britain, Sweden and the USA. Qual Life Res. 1994;3:68.
5. Selim AJ, Berlowitz D, Fincke G, et al. Use of risk-adjusted change in health status to assess the performance of integrated service networks in the Veterans Health Administration. Int J Qual Health Care. 2006;18:43–50. [PubMed]
6. Kosinski M, Zhao SZ, Dedhiya S, et al. Determining minimally important changes in generic and disease-specific health-related quality of life questionnaires in clinical trials of rheumatoid arthritis. Arthritis Rheum. 2000;43:1478–87. [PubMed]
7. Strand V, Singh JA. Improved health-related quality of life with effective disease-modifying antirheumatic drugs: evidence from randomized controlled trials. Am J Manag Care. 2008;14:234–54. [PubMed]
8. Thumboo J, Strand V. Health-related quality of life in patients with systemic lupus erythematosus: an update. Ann Acad Med Singapore. 2007;36:115–22. [PubMed]
9. Gladman D, Mease PJ, Strand V, et al. Consensus on a core set domains for psoriatic arthritis. J Rheumatol. 2007;34:1167–70. [PubMed]
10. Singh JA, Strand V. Spondyloarthritis is associated with poor function and physical health related quality of life. J Rheumatol. 2009;36:1012–20. [PMC free article] [PubMed]
11. Singh JA, Strand V. Gout is associated with more comorbidities, poorer health related quality of life and higher health care utilization in US Veterans. Ann Rheum Dis. 2008;67:1310–16. [PubMed]
12. Khanna D, Furst DE, Clements PJ, et al. Responsiveness of the SF-36 and the Health Assessment Questionnaire Disability Index in a systemic sclerosis clinical trial. J Rheumatol. 2005;32:832–40. [PubMed]
13. Hoffman DL, Dukes EM. The health status burden of people with fibromyalgia: a review of studies that assessed health status with the SF-36 or the SF-12. Int J Clin Pract. 2008;62:115–26. [PMC free article] [PubMed]
14. Kosinski M, Keller SD, Ware JE, Jr, et al. The SF-36 health survey as a generic outcome measure in clinical trials of patients with osteoarthritis and rheumatoid arthritis: relative validity of scales in relation to clinical measures of arthritis severity. Med Care. 1999;37:MS23–39. [PubMed]
15. Strand V, Singh J. Health related quality of life in rheumatoid arthritis (chapter 9C) In: Hochberg MC, Silman A, Smolen J, et al., editors. Rheumatoid arthritis. 1. Philadelphia, PA: Mosby Elsevier; 2008. pp. 237–59.
16. Aventis Sanofi. Welcome to Arrive – your Arava© care programme! Jan 2009. [accessed 29 Sep 2009]. www.arrive-online.org.
17. Ware JE, Kosinski M, Bayliss MS, et al. Comparison of methods for the scoring and statistical analysis of SF-36 health profile and summary measures: summary of results from the Medical Outcomes Study. Med Care. 1995;33:AS264–79. [PubMed]
18. Taft C, Karlsson J, Sullivan M. Do SF-36 summary component scores accurately summarize subscale scores? Qual Life Res. 2001;10:395–404. [PubMed]
19. Farivar SS, Cunningham WE, Hays RD. Correlated physical and mental health summary scores for the SF-36 and SF-12 health survey, V1. Health Qual Life Outcomes. 2007;5:1–8. [PMC free article] [PubMed]
20. Hann M, Reeves D. The SF-36 scales are not accurately summarised by independent physical and mental component scores. Qual Life Res. 2008;17:413–23. [PubMed]
21. Ware JE, Kosinski M. Interpreting SF-36 summary health measures: a response. Qual Life Res. 2001;10:405–13. [PubMed]
22. Ware JE, Kosinski M. Interpreting SF-36 summary health measures: a response supplemental documentation. [accessed 29 Sep 2009]. www.sf-36.org/news/responsetotaft.pdf.
23. Ara R, Brazier J. Predicting the Short Form-6D preference-based index using the eight mean Short Form-36 health dimension scores: estimating preference-based health-related utilities when patient level data are not available. Value in Health. 2009;12:346–53. [PubMed]
24. Ara R, Brazier J. Deriving an algorithm to convert the eight mean SF-36 dimension scores into a mean EQ-5D preference-based score from published studies (where patient level data are not available) Value in Health. 2008;11:1131–43. [PubMed]
25. Brazier JE, Roberts J, Deverill M. The estimation of a preference based measure of health from the SF–36. J Health Econ. 2002;21:271–92. [PubMed]
26. Marra CA, Woolcott JC, Kopec JA, et al. A comparison of generic, indirect utility measures (the HUI2, HUI3, SF–6D, and the EQ–5D) and disease-specific instruments (the RAQoL and the HAQ) in rheumatoid arthritis. Soc Sci Med. 2005;60:1571–82. [PubMed]
27. Ware JE, Jr, Snow KK, Kosinski M, et al. SF-36 health survey: manual and interpretation guide. Boston, MA: The Health Institute. New England Medical Center; 1993.
28. Strand V, Keininger DL, Tahari-Fitzgerald E. Certolizumab pegol results in clinically meaningful improvements in physical function and health-related quality of life in patients with active rheumatoid arthritis despite treatment with methotrexate. Arthr Rheum. 2007;56:S393.
29. Strand V, Petri M, Buyon J, et al. Baseline data from 5 RCTs demonstrate that SLE impacts all domains of HRQOL. Arthr Rheum. 2006;54:S277.
30. Strand V, Gordon C, Kalunian K, et al. Meaningful improvements in health-related quality of life with epratuzumab (anti-CD22 mAb targeting B-cells) in patients with systemic lupus erythematosus with high disease activity: results from 2 randomized controlled trials (RCTs) Arthr Rheum. 2008;58:S570–1.
31. Petri M, Hobbs K, Gordon C, et al. Clinically meaningful improvements with epratuzumab (anti-CD22 mAb targeting B-cells) in patients (pts) with moderate/severe systemic lupus erythematosus (SLE) flares: results from 2 randomized controlled trials. Arthr Rheum. 2008;58:S571.
32. Wallace D, Hobbs K, Houssiau F, et al. Epratuzumab (anti-CD22 mAb targeting B-cells) provides clinically meaningful reductions in corticosteroid (CS) use with a favorable safety profile in patients with moderate/severe flaring systemic lupus erythematosus (SLE): results from randomized controlled trials (RCTs) Arthr Rheum. 2008;58:S571–2.
33. Khanna D, Clements PJ, Furst DE, et al. Recombinant human relaxin in the treatment of systemic sclerosis with diffuse cutaneous involvement. Arthr Rheum. 2009;60:1102–11. [PMC free article] [PubMed]
34. Becker MA, Schumacher HR, Benjamin KL, et al. Quality of life and disability in patients with treatment-failure gout. J Rheumatol. 2009;36:1041–8. [PubMed]
35. Edwards NL, Baraf HSB, Becker MA, et al. Improvement in health-related quality of life (HRQL) and disability index in treatment failure gout (TFG) after pegloticase therapy: pooled results from GOUT1 and GOUT2, phase 3, randomized, double-blind, placebo controlled trials. Arthr Rheum. 2008;58:S178.
36. Strand V, Singh JA, Sundy J, et al. HRQOL of patients with refractory gout and US veterans with gout and comorbidities is poor, and comparable to that in other severe conditions. Arthr Rheum. 2009;58:S177.
37. Baraf HS, Strand V, Hosokawa H, et al. Effectiveness and safety of a single intraarticular injection of Gel-200, a new cross linked formulation of hyaluronic acid in the treatment of symptomatic osteoarthritis of the knee. Proceedings of the OARSI World Congress on Osteoarthritis; 10–13 September 2009; Montreal, Canada. OsteoArthritis Research Society International; Poster 326.
38. Slatkowsky-Christiansen B, Mowinckel P, Loge JH, et al. Health related quality of life in women with symptomatic hand osteoarthritis: a comparison with rheumatoid arthritis patients, healthy controls and normative data. Arthr Rheum. 2007;57:1404–9. [PubMed]
39. Kvien TK, Kaasa S, Smedstad LM. Performance of the Norwegian SF-36 health survey in patients with rheumatoid arthritis. II. A comparison of the SF-36 with disease-specific measures. J Clin Epidemiol. 1998;51:1077–86. [PubMed]