Donahue KE, Jonas DE, Hansen RA, et al. Drug Therapy for Rheumatoid Arthritis in Adults: An Update [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2012 Apr. (Comparative Effectiveness Reviews, No. 55.)

Appendix G. Clinical and Self-Reported Scales and Instruments Commonly Used in Studies of Drug Therapy for Rheumatoid Arthritis and Psoriatic Arthritis


This appendix provides a brief overview of the various scales and self-reported measures that investigators used to assess outcomes in all the studies reviewed in this systematic review. The main outcome categories involve radiologic assessments of joint damage (erosion or narrowing) and various instruments that patients or subjects used to report on functional capacity or quality of life; the latter fall into two groups, one related to general health measures and one related to condition- or disease-specific instruments. General measures used in rheumatoid and psoriatic arthritis studies are described first; then the disease-specific measures used in rheumatoid and psoriatic arthritis studies are described separately. The new 2010 American College of Rheumatology ACR criteria are presented at the end of the document.

Radiographic Measures

Radiographic assessment of joint damage in the hands (including wrists) or in both hands and feet is critical to clinical trials in rheumatoid arthritis. The damage can comprise both joint space narrowing and erosions, and the underlying construct is sometimes referred to as radiographic progression (i.e., changes, whether positive or negative, as detected by radiography and interpretation). Several approaches exist, but the two most commonly used are the Sharp Score (and variants) and the Larsen Score. These and other scoring methods have recently been reviewed by Boini and Guillemin;1 additional citations or sources are given in the brief descriptions below.

Sharp Score and Sharp/van der Heijde Score

The Sharp Score is a means of evaluating joint damage in joints of the hands, including both erosion and joint space narrowing.2 Although it has undergone modifications since its introduction, the version proposed in 1985 has become the standard approach. In this method, 17 joint areas in each hand are scored for erosions; 18 joint areas in each hand are scored for joint space narrowing. The score per single joint for erosions ranges from 0 to 5 and for joint space narrowing from 0 to 4. In both cases, a higher score is worse. Erosion scores range from 0 to 170 and joint space narrowing scores range from 0 to 144. Thus, the “total Sharp Score” is the sum of the erosion and joint space narrowing scores, or 0 to 314.
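The ranges above follow directly from the joint counts and per-joint maxima, as the following minimal sketch shows. The function name and the flat-list input layout are illustrative assumptions, not part of the Sharp method itself.

```python
# Joint-area counts and per-joint maxima as described in the text.
EROSION_AREAS_PER_HAND = 17
EROSION_MAX_PER_JOINT = 5
JSN_AREAS_PER_HAND = 18
JSN_MAX_PER_JOINT = 4
HANDS = 2


def total_sharp_score(erosion_scores, jsn_scores):
    """Sum erosion and joint space narrowing (JSN) scores over both hands.

    erosion_scores: 34 integers (17 areas x 2 hands), each 0-5.
    jsn_scores: 36 integers (18 areas x 2 hands), each 0-4.
    The flat-list layout is an assumption made for this sketch.
    """
    if len(erosion_scores) != EROSION_AREAS_PER_HAND * HANDS:
        raise ValueError("expected 34 erosion scores")
    if len(jsn_scores) != JSN_AREAS_PER_HAND * HANDS:
        raise ValueError("expected 36 JSN scores")
    if any(not 0 <= s <= EROSION_MAX_PER_JOINT for s in erosion_scores):
        raise ValueError("erosion scores must be between 0 and 5")
    if any(not 0 <= s <= JSN_MAX_PER_JOINT for s in jsn_scores):
        raise ValueError("JSN scores must be between 0 and 4")
    return sum(erosion_scores) + sum(jsn_scores)


# Maximum possible component scores implied by the counts above.
max_erosion = EROSION_AREAS_PER_HAND * HANDS * EROSION_MAX_PER_JOINT
max_jsn = JSN_AREAS_PER_HAND * HANDS * JSN_MAX_PER_JOINT
```

Passing the maximum value for every joint area reproduces the upper bound of the total Sharp Score.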

The Sharp/van der Heijde (SHS) method, introduced in 1989, overcame one drawback of the Sharp Score, namely its focus on only the hands, given that feet can also be involved early in rheumatoid arthritis. The SHS method was therefore developed to take account of erosions and joint space narrowing in both hands and feet.3,4 As with the Sharp Score, higher scores reflect worse damage. Erosion is assessed in 16 joints in each hand and 6 joints in each foot. Each hand joint is scored from 0 to 5 and each foot joint from 0 to 10, giving a maximal erosion score of 160 in the hands and 120 in the feet. Joint space narrowing and subluxation are assessed in 15 joints in each hand and 6 joints in each foot. Each joint is scored from 0 to 4, giving a maximal score of 120 in the hands and 48 in the feet. The erosion and joint space narrowing scores are combined to give a total SHS score with a maximum of 448 (weighted toward hands because more joints are scored).
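As a quick check of the SHS maxima, the sketch below multiplies out the joint counts. It assumes that foot erosions are scored from 0 to 10 per joint, which is what the stated foot erosion maximum of 120 across 12 foot joints implies; the dictionary layout and names are illustrative only.

```python
# (joints per side, number of sides, maximum score per joint) per component.
# Assumption: foot erosions are scored 0-10 per joint, which is what the
# stated foot erosion maximum of 120 across 12 joints implies.
SHS_COMPONENTS = {
    "hand_erosion": (16, 2, 5),
    "foot_erosion": (6, 2, 10),
    "hand_jsn": (15, 2, 4),
    "foot_jsn": (6, 2, 4),
}


def shs_maxima(components=SHS_COMPONENTS):
    """Return the maximum possible score for each SHS component."""
    return {
        name: joints * sides * per_joint_max
        for name, (joints, sides, per_joint_max) in components.items()
    }


maxima = shs_maxima()
shs_total_max = sum(maxima.values())
```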

Numerous variants of the Sharp or SHS scores have been developed, differing subtly in the numbers of joints measured and other details.5 Generally, all the Sharp methods are very detailed assessments, and the approach, although reliable and sensitive to change, is considered time-consuming and tedious. Larsen and colleagues developed a simpler, faster method.

Larsen Scale for Grading Radiographs

The Larsen Scale is an overall measure of joint damage, originally devised in the 1970s and updated most recently in the late 1990s.6–10 It produces both a score for each joint (hands and feet) and an overall score that reflects measurement and extent of joint damage. Scores range from 0 (“normal conditions,” i.e., intact bony outlines and normal joint space) to 5 (“mutilating abnormality,” i.e., original bony outlines have been destroyed), so higher scores reflect greater damage. Scores can range from 0 to 250.

General Health Measures

Health Assessment Questionnaire

The Health Assessment Questionnaire (HAQ) is a widely used self-report measure of functional capacity; it is a dominant instrument in studies of patients with arthritis (particularly trials of drugs in patients with rheumatoid arthritis), but it is considered a generic (not disease-specific) instrument. Detailed information on its variations, scoring, etc., can be found at www.chcr.brown.edu/pcoc/EHAQDESCRSCORINGHAQ372.PDF (accessed for this purpose 1/18/2007) or www.hqlo.com/content/1/1/20 (accessed for this purpose 1/18/2007) and in the seminal reports by Fries et al.11 and Ramey et al.12

The full HAQ covers five dimensions: four domains (disability, discomfort and pain, toxicity, and dollar costs) plus death (obtained through other sources). More commonly, “the HAQ” as used in the literature refers to the shorter version encompassing the HAQ Disability Index (HAQ-DI), the HAQ pain measure, and a global patient outcome measure. The HAQ-DI is sometimes used alone.

The HAQ-DI, with the past week as the time frame, focuses on whether the respondent “is able to…” do the activity and covers eight categories in 20 items: dressing and grooming, arising, eating, walking, hygiene, reach, grip, and common daily activities. The four responses for the HAQ-DI questions are graded as follows: without any difficulty = 0; with some difficulty = 1; with much difficulty = 2; and unable to do = 3. The highest score for any component question in a category determines the category score. The HAQ-DI also asks about the use of aids and devices to help with various usual activities. Two composite scores can be calculated, one with and one without the aids/devices element; both range from 0 to 3.
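The category rule above (the highest item score determines the category score) can be sketched as follows. Two points are assumptions not stated in the text: the composite is taken as the mean of the eight category scores, and use of an aid or device raises a category score to at least 2. Both follow common HAQ scoring practice, but the instrument's own scoring guide should be checked before use. Function and parameter names are hypothetical.

```python
HAQ_CATEGORIES = (
    "dressing and grooming", "arising", "eating", "walking",
    "hygiene", "reach", "grip", "common daily activities",
)


def haq_di(item_scores, aids_used=()):
    """Compute a HAQ-DI composite score in the range 0-3.

    item_scores: dict mapping each category to a list of item scores (0-3).
    aids_used: categories for which an aid or device is used; pass an empty
        collection to obtain the composite without the aids/devices element.
    Assumptions (not stated in the text): the composite is the mean of the
    eight category scores, and an aid raises a category score to at least 2.
    """
    category_scores = []
    for category in HAQ_CATEGORIES:
        items = item_scores[category]
        if not items or any(score not in (0, 1, 2, 3) for score in items):
            raise ValueError(f"invalid item scores for {category!r}")
        score = max(items)  # highest item score determines the category score
        if category in aids_used:
            score = max(score, 2)  # assumed aids/devices adjustment
        category_scores.append(score)
    return sum(category_scores) / len(category_scores)


# Illustrative responses: difficulty only with walking and grip.
responses = {category: [0, 0] for category in HAQ_CATEGORIES}
responses["walking"] = [1, 3]
responses["grip"] = [1]

without_aids = haq_di(responses)
with_aids = haq_di(responses, aids_used={"grip"})
```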

The HAQ pain domain is measured on a doubly-anchored horizontal visual analog scale (VAS) of 15 cm in length; one end is labeled “no pain” (score of 0) and the other is labeled “very severe pain” (score of 100). Patients mark a spot on the VAS, and scores are calculated as the length from “no pain” in centimeters (cm) multiplied by 0.2 to yield a value that can range between 0 and 3.
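The conversion described above amounts to a single multiplication. The sketch below also shows the equivalent 0 to 100 rescaling implied by the anchor labels; the function names are illustrative assumptions.

```python
VAS_LENGTH_CM = 15.0
CM_TO_SCORE = 0.2  # 15 cm x 0.2 = 3, the top of the 0-3 range


def haq_pain_score(mark_cm):
    """Convert the distance from the 'no pain' anchor (cm) to a 0-3 score."""
    if not 0 <= mark_cm <= VAS_LENGTH_CM:
        raise ValueError("mark must lie on the 15 cm scale")
    return mark_cm * CM_TO_SCORE


def haq_pain_score_0_100(mark_cm):
    """Express the same mark on the 0-100 scale given by the anchor labels."""
    if not 0 <= mark_cm <= VAS_LENGTH_CM:
        raise ValueError("mark must lie on the 15 cm scale")
    return 100.0 * mark_cm / VAS_LENGTH_CM
```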

With respect to interpretation, HAQ-DI scores of 0 to 1 are generally considered to represent mild to moderate disability, 1 to 2 moderate to severe disability, and 2 to 3 severe to very severe disability.

The HAQ global health status scale measures quality of life (essentially, how the patient is feeling) with a 15 cm doubly-anchored horizontal VAS scored from 0 (very well) to 100 (very poor).

Medical Outcomes Study Short Form 36 Health Survey

The Medical Outcomes Study Short Form 36 Health Survey (SF-36) is an internationally known generic health survey instrument. Information can be found at www.sf-36.org/tools/sf36.shtml (accessed for this purpose 2/18/2007) and in a large number of articles documenting its psychometric properties.13–19 It comprises 36 items in eight independent domains tapping functioning and well-being: physical functioning, role-physical, bodily pain, and general health in one grouping (physical health) and vitality, role-emotional, social functioning, and mental health in another grouping (mental health). The SF-36 provides a separate scale score for each domain (yielding a profile of health) and two summary scores, one for physical health and one for mental health. Each scale is scored from 0 to 100 where higher scores indicate better health and well-being.

A “version 2” of the SF-36 was introduced in the late 1990s to correct some drawbacks in formatting, wording, and other issues and to update the norm-based scoring with 1998 data. It can be fielded in two versions varying by recall period: 4-week recall (the usual approach) and 1-week recall (acute). More recently, it has been tested and used for computer adaptive testing according to item response theory principles.

EuroQol EQ-5D Quality of Life Questionnaire

A third generic quality-of-life instrument is the EuroQol EQ-5D Quality of Life Questionnaire, typically known just as the EQ-5D. More information can be found at http://www.euroqol.org/ (accessed for this purpose 1/18/2007) and in key descriptive articles,20 one of which is about patients with rheumatoid arthritis.21

The EQ-5D covers health status in five domains (one question each): mobility, self-care, usual activities, pain or discomfort, and anxiety or depression. It is intended for self-response but can be used in other administration modes. Each item can take one of three response levels (no problems, some or moderate problems, extreme problems), identified as level 1, 2, or 3, respectively. This yields a profile of one level for each of the five domains; the profile is essentially a five-digit number, and no arithmetic properties attach to these values. Users can convert health states in the five-dimensional descriptive system into a weighted health state index by applying scores from EQ-5D “value sets” elicited from general population samples to the profile pattern (e.g., 1, 2, 3, 3, 1).
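The profile-to-index step can be sketched as a simple lookup. The code below only builds the five-digit state label and looks it up in a value set supplied by the caller; the numbers in the example value set are placeholders for illustration, not a published EQ-5D value set, and the function names are hypothetical.

```python
EQ5D_DOMAINS = (
    "mobility", "self-care", "usual activities",
    "pain or discomfort", "anxiety or depression",
)


def eq5d_state(profile):
    """Return the five-digit health state label (e.g. '12331') for a profile.

    profile: dict mapping each domain to a response level (1, 2, or 3).
    The label is an identifier only; it has no arithmetic meaning.
    """
    levels = []
    for domain in EQ5D_DOMAINS:
        level = profile[domain]
        if level not in (1, 2, 3):
            raise ValueError(f"invalid level for {domain!r}: {level}")
        levels.append(str(level))
    return "".join(levels)


def eq5d_index(profile, value_set):
    """Look up the weighted index for a profile in a caller-supplied value set."""
    return value_set[eq5d_state(profile)]


# Placeholder weights for illustration only; NOT a published value set.
placeholder_value_set = {"11111": 1.0, "12331": 0.25}

example_profile = dict(zip(EQ5D_DOMAINS, (1, 2, 3, 3, 1)))
example_state = eq5d_state(example_profile)
example_index = eq5d_index(example_profile, placeholder_value_set)
```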

The EQ-5D also has a global health VAS scale (20 cm) scored from 0 to 100.

Rheumatoid Arthritis Measures

American College of Rheumatology 20/50/70

The American College of Rheumatology (ACR) criteria are concerned with improvement in counts of tender and swollen joints and several domains of health.22 A principal aim of these criteria is use in studies (particularly trials) of drugs for rheumatoid arthritis. More information can be found at www.rheumatology.org/publications/response/205070.asp and www.hopkins-arthritis.som.jhmi.edu/edu/acr/acr.html#remis_rheum (both accessed for this purpose 1/18/2007). Originally, these domains involved patient assessment, physician assessment, erythrocyte sedimentation rate, a pain scale, and a functional questionnaire.

Today, based on work done in the mid 1990s,23 response in clinical trial patients is defined as improvement in both tender and swollen joint counts and in at least three of the following: the patient’s assessment of pain, the patient’s global assessment of disease activity, the patient’s assessment of physical function (sometimes referred to as physical disability), the physician’s global assessment of disease activity, and an acute phase reactant (C-reactive protein, or CRP). The 20, 50, or 70 designations (sometimes called the ACR Success Criteria) refer to improvements of 20 percent, 50 percent, or 70 percent in the relevant dimensions. A physician’s global assessment of 70 percent improvement is considered remission.

Thus, patients are said to meet ACR 20 criteria when they have at least 20 percent reductions in tender and swollen joint counts and in at least three of the domains. ACR 50 and ACR 70 criteria are defined in a manner similar to that for ACR 20, but with improvement of at least 50 percent and 70 percent in the individual measures, respectively. The table illustrates, in a study context, how a patient might be said to have an ACR 50 response.

Outcomes Measured                                 Baseline   Endpoint
Tender joints count*                              12         6
Swollen joints count*                             8          3
Patient’s pain score*                             60         20
Patient’s physical function (disability) score    80         60
Physician’s global activity score*                50         20
C-reactive protein*                               3.6        1.4

* At least 50 percent improvement between baseline and endpoint measurements.
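The ACR 50 determination for this example can be reproduced with a short sketch. It uses the baseline and endpoint values from the illustrative table (tender joints 12 to 6, swollen joints 8 to 3, pain 60 to 20, physical function 80 to 60, physician's global 50 to 20, CRP 3.6 to 1.4). The function names are hypothetical, and the sketch assumes that lower values are better for every measure and that all baselines are greater than zero.

```python
def percent_improvement(baseline, endpoint):
    """Percent reduction from baseline (positive values mean improvement)."""
    return 100.0 * (baseline - endpoint) / baseline


def meets_acr(threshold, tender, swollen, other_measures):
    """Check one ACR threshold (20, 50, or 70).

    tender, swollen: (baseline, endpoint) pairs for the joint counts.
    other_measures: dict of name -> (baseline, endpoint) for the remaining
        domains; at least three must improve by the threshold.
    """
    joints_ok = all(
        percent_improvement(*pair) >= threshold for pair in (tender, swollen)
    )
    domains_improved = sum(
        percent_improvement(*pair) >= threshold
        for pair in other_measures.values()
    )
    return joints_ok and domains_improved >= 3


def acr_response(tender, swollen, other_measures):
    """Return the highest ACR level met (70, 50, or 20), or None."""
    for threshold in (70, 50, 20):
        if meets_acr(threshold, tender, swollen, other_measures):
            return threshold
    return None


# Values from the illustrative table.
tender = (12, 6)
swollen = (8, 3)
others = {
    "pain": (60, 20),
    "physical function": (80, 60),
    "physician global": (50, 20),
    "crp": (3.6, 1.4),
}
best_response = acr_response(tender, swollen, others)
```

In this example the tender joint count improves by exactly 50 percent, so the patient meets ACR 50 but not ACR 70.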

Ritchie Articular Index

This is a long-standing approach to doing a graded assessment of the tenderness of 26 joint regions, based on summation of joint responses after applying firm digital pressure.24 Four grades can be used: 0, patient reported no tenderness; +1, patient complained of pain; +2, patient complained of pain and winced; and +3, patient complained of pain, winced, and withdrew. Thus, the index ranges from 0 to 3 for individual measures and 0 to 78 overall, with higher scores being worse tenderness.

Certain joints are treated as a single unit, such as the metacarpal-phalangeal and proximal interphalangeal joints of each hand and the metatarsal-phalangeal joints of each foot. For example, the maximum score for the five metacarpal-phalangeal joints of the right hand would be 3, not 15. No weights are used for different types of joints (e.g., by size), because the issue is one of measuring changes (improvements) in tenderness; this is especially relevant for rheumatoid arthritis.
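Because grouped joints count as one unit, the index is simply the sum of one grade per unit. The sketch below assumes the caller supplies exactly one grade for each of the 26 units; the unit names used in the example are placeholders, not the index's actual joint list.

```python
RITCHIE_UNITS = 26
VALID_GRADES = (0, 1, 2, 3)


def ritchie_index(grades):
    """Sum tenderness grades over the 26 joint units (range 0-78).

    grades: dict mapping each joint unit to a single grade from 0 to 3.
    A grouped unit (e.g. the metacarpal-phalangeal joints of one hand)
    appears once, so it contributes at most 3 to the total.
    """
    if len(grades) != RITCHIE_UNITS:
        raise ValueError("expected one grade for each of the 26 units")
    if any(grade not in VALID_GRADES for grade in grades.values()):
        raise ValueError("grades must be 0, 1, 2, or 3")
    return sum(grades.values())


# Placeholder unit names for illustration only.
example_grades = {f"unit_{i}": 0 for i in range(1, 27)}
example_grades["unit_1"] = 3
example_grades["unit_2"] = 1
example_total = ritchie_index(example_grades)
```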

Disease Activity Score

The Disease Activity Score (DAS) is an index of disease activity first developed in the mid 1980s. The history of its development and current definitions, scoring systems, and other details can be found at http://www.das-score.nl/ (accessed for this purpose 1/19/2007) and in recent articles.4,25 The DAS originally included the Ritchie Articular Index (see above), the 44 swollen joint count, the erythrocyte sedimentation rate, and a general health assessment on a VAS. A DAS cut-off of 1.6 is considered equivalent to remission.

More recently, an index of RA disease activity using only 28 joints – the DAS 28 – has been developed, focusing on joint counts for both tenderness (TJC) and swelling (SJC). It also uses either the patient’s or a physician’s global assessment (PGA) of disease activity (on a 100 mm VAS) and the erythrocyte sedimentation rate (ESR) or C-reactive protein. The formula for calculating a DAS 28 score is as follows:

$$\text{DAS 28} = (0.56 \times \sqrt{\text{TJC}}) + (0.28 \times \sqrt{\text{SJC}}) + (0.7 \times \ln[\text{ESR}]) + (0.014 \times \text{PGA [in mm]})$$

Numerous formulas to calculate a variety of DAS and DAS 28 scores exist (see the website above), such as when a global patient assessment of health is unavailable.

The DAS 28 yields a score on a scale ranging from 0 to 10. A DAS 28 below 2.6 is considered to correspond to remission; a DAS 28 of 3.2 or less indicates low disease activity; and a DAS 28 of more than 5.1 is considered high disease activity.
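A minimal sketch of the ESR-based DAS 28 calculation and the thresholds above follows. The function names are hypothetical. Two details are assumptions rather than statements from the text: scores between the low and high thresholds are labeled "moderate", and values below 2.6 (rather than exactly 2.6) are treated as remission.

```python
import math


def das28_esr(tjc, sjc, esr, pga_mm):
    """DAS 28 from 28-joint counts, ESR, and global assessment.

    tjc, sjc: tender and swollen joint counts (0-28).
    esr: erythrocyte sedimentation rate; must be > 0 for the logarithm.
    pga_mm: global assessment of disease activity on a 100 mm VAS.
    """
    if not (0 <= tjc <= 28 and 0 <= sjc <= 28):
        raise ValueError("joint counts must be between 0 and 28")
    if esr <= 0:
        raise ValueError("ESR must be positive")
    if not 0 <= pga_mm <= 100:
        raise ValueError("PGA must be between 0 and 100 mm")
    return (
        0.56 * math.sqrt(tjc)
        + 0.28 * math.sqrt(sjc)
        + 0.70 * math.log(esr)  # natural logarithm
        + 0.014 * pga_mm
    )


def das28_category(score):
    """Map a DAS 28 score to a disease activity category.

    Boundary handling and the 'moderate' label are assumptions.
    """
    if score < 2.6:
        return "remission"
    if score <= 3.2:
        return "low"
    if score > 5.1:
        return "high"
    return "moderate"


example_score = das28_esr(tjc=4, sjc=2, esr=20, pga_mm=50)
example_category = das28_category(example_score)
```

For a patient with 4 tender joints, 2 swollen joints, an ESR of 20, and a global assessment of 50 mm, the score is about 4.31.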

EULAR Response Criteria

The European League Against Rheumatism (EULAR) response criteria classify patients as good, moderate, or nonresponders based on both change in disease activity and current disease activity, using either the DAS or the DAS28 (see description above).26 For example, to be classified as a good responder a patient must have relevant change in DAS (≥1.2) and low current disease activity (≤2.4), while a nonresponder must have ≤0.6 change in DAS and high disease activity (>3.7).27

The EULAR criteria have been validated in multiple clinical trials and confirmed in an analysis of nine clinical trials, which found a high level of agreement and equal validity between the ACR and EULAR improvement classifications.28 Good and moderate responders showed significantly more improvement in functional capacity and significantly less progression of joint damage than patients classified as nonresponders.28

Psoriatic Arthritis Measures

Psoriatic Arthritis Response Criteria

The psoriatic arthritis response criteria (PsARC) were initially designed for use in a clinical trial that compared sulphasalazine to placebo in the setting of the Veterans Administration.29 They have since been used as the primary or secondary outcome in all the studies that examined biologics versus placebo in the treatment of psoriatic arthritis (PsA). A PsARC response requires improvement in at least two of the following measures, one of which must be a joint count, and no worsening of any measure: tender or swollen joint count improvement of at least 30%, patient global improvement by one point on a five-point Likert scale, or physician global improvement on the same scale.29
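The response rule can be sketched as follows. Several details are assumptions not given in the text: the tender and swollen joint counts are treated as two separate measures, lower Likert values are taken to mean better status, and worsening is defined symmetrically to improvement (a joint count increase of at least 30% or a one-point deterioration on the Likert scale). Function names are hypothetical.

```python
def _count_change(baseline, endpoint):
    """Return 1 if a joint count improved, -1 if it worsened, else 0.

    Assumed thresholds: at least a 30% decrease is improvement and at
    least a 30% increase is worsening.
    """
    if baseline == 0:
        return -1 if endpoint > 0 else 0
    change = (baseline - endpoint) / baseline
    if change >= 0.30:
        return 1
    if change <= -0.30:
        return -1
    return 0


def _global_change(baseline, endpoint):
    """Return 1, -1, or 0 for a five-point Likert global assessment.

    Assumes lower values are better; a one-point shift counts as a change.
    """
    difference = baseline - endpoint
    if difference >= 1:
        return 1
    if difference <= -1:
        return -1
    return 0


def psarc_responder(tender, swollen, patient_global, physician_global):
    """Apply the PsARC rule to four (baseline, endpoint) pairs."""
    joint_changes = [_count_change(*tender), _count_change(*swollen)]
    global_changes = [
        _global_change(*patient_global),
        _global_change(*physician_global),
    ]
    changes = joint_changes + global_changes
    return (
        changes.count(1) >= 2      # at least two measures improved
        and 1 in joint_changes     # one of them is a joint count
        and -1 not in changes      # no measure worsened
    )
```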

American College of Rheumatology 20

The ACR 20 (American College of Rheumatology 20 percent response) is the other outcome that is used as the primary outcome in clinical trials of biologics. The measurement is similar to that of the ACR 20 used for rheumatoid arthritis, with modifications that increased the number of joints tested from 68 tender and 66 swollen to 78 and 76, respectively, with the addition of distal interphalangeal joints of the feet and carpometacarpal joints of the hands.29 The outcomes from the ACR 20 are generally poorer than those from the PsARC because of the differences in items measured; this is due in part to the need to see an improvement in both tender and swollen joint counts in the ACR 20 versus an improvement in tender or swollen joint counts in the PsARC.

2010 Rheumatoid Arthritis Criteria

Target population (Who should be tested?)
Patients who
  • have at least 1 joint with definite clinical synovitis (swelling)
    • Criteria aimed at classification of newly presenting patients; patients with erosive disease typical of RA with a history compatible with prior fulfillment of the 2010 criteria should be classified as having RA; patients with longstanding disease, including those whose disease is inactive (with or without treatment) who, based on retrospectively available data, have previously fulfilled the 2010 criteria should be classified as having RA
  • with the synovitis not better explained by another disease
    • Differential diagnoses vary among patients with different presentations, but may include conditions such as systemic lupus erythematosus, psoriatic arthritis, and gout. If it is unclear about the relevant differential diagnoses to consider, an expert rheumatologist should be consulted
Classification criteria for RA
Score-based algorithm:
  • Add score of categories: joint involvement, serology, acute-phase reactants, and duration of symptoms
  • Score of ≥6/10 needed for classification of a patient as having definite RA
    • Although patients with a score of <6/10 are not classifiable as having RA, their status can be reassessed and the criteria might be fulfilled cumulatively over time
Joint involvement
  • Joint involvement refers to any swollen or tender joint on examination, which may be confirmed by imaging evidence of synovitis; distal interphalangeal joints, first carpometacarpal joints, and first metatarsophalangeal joints are excluded from assessment; categories of joint distribution are classified according to the location and number of involved joints, with placement into the highest category possible based on the pattern of joint involvement
 1 large joint: score 0
  • “Large joints” refers to shoulders, elbows, hips, knees, and ankles
 2–10 large joints: score 1
 1–3 small joints (with or without involvement of large joints): score 2
  • “Small joints” refers to the metacarpophalangeal joints, proximal interphalangeal joints, second through fifth metatarsophalangeal joints, thumb interphalangeal joints, and wrists.
 4–10 small joints (with or without involvement of large joints): score 3
 >10 joints (at least 1 small joint): score 5
  • In this category, at least 1 of the involved joints must be a small joint; the other joints can include any combination of large and additional small joints, as well as other joints not specifically listed elsewhere (e.g., temporomandibular, acromioclavicular, sternoclavicular, etc.)
Serology (at least 1 test result is needed for classification)
  • Negative refers to IU values that are less than or equal to the upper limit of normal (ULN) for the laboratory and assay; low-positive refers to IU values that are higher than the ULN but ≤3 times the ULN for the laboratory and assay; high-positive refers to IU values that are >3 times the ULN for the laboratory and assay; where rheumatoid factor (RF) information is only available as positive or negative, a positive result should be scored as low-positive for RF. ACPA = anti-citrullinated protein antibody
 Negative RF and negative ACPA: score 0
 Low-positive RF or low-positive ACPA: score 2
 High-positive RF or high-positive ACPA: score 3
Acute-phase reactants (at least 1 test result is needed for classification)
  • Normal/abnormal is determined by local laboratory standards. CRP = C-reactive protein; ESR = erythrocyte sedimentation rate
 Normal CRP and normal ESR: score 0
 Abnormal CRP or abnormal ESR: score 1
Duration of symptoms
  • Duration of symptoms refers to patient self-report of the duration of signs or symptoms of synovitis (e.g., pain, swelling, tenderness) of joints that are clinically involved at the time of assessment, regardless of treatment status
 <6 weeks: score 0
 ≥6 weeks: score 1

Adapted from: 2010 Rheumatoid arthritis classification criteria: An American College of Rheumatology/European League Against Rheumatism collaborative initiative. Arthritis & Rheumatism. 2010 Sep; 62(9): 2569–2581
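The score-based algorithm can be sketched as a sum over the four categories. The joint involvement scores of 0, 1, 2, 3, and 5 used below are those of the published 2010 ACR/EULAR criteria; function and parameter names are hypothetical, and the sketch assumes the target population conditions (definite synovitis not better explained by another disease) have already been checked.

```python
SEROLOGY_SCORES = {"negative": 0, "low-positive": 2, "high-positive": 3}


def joint_involvement_score(n_large, n_small, n_other=0):
    """Score joint involvement, placing the patient in the highest category.

    n_large, n_small: numbers of involved large and small joints.
    n_other: involved joints not listed elsewhere (e.g. temporomandibular),
        which count only toward the >10 joints category.
    """
    total = n_large + n_small + n_other
    if total > 10 and n_small >= 1:
        return 5
    if n_small >= 4:
        return 3
    if n_small >= 1:
        return 2
    if n_large >= 2:
        return 1
    return 0


def ra_2010_score(n_large, n_small, serology, abnormal_reactant,
                  symptom_weeks, n_other=0):
    """Return (total score, classified as definite RA) under the 2010 criteria.

    serology: 'negative', 'low-positive', or 'high-positive', taking the
        higher of the RF and ACPA results.
    abnormal_reactant: True if CRP or ESR is abnormal.
    symptom_weeks: self-reported duration of symptoms in weeks.
    """
    score = (
        joint_involvement_score(n_large, n_small, n_other)
        + SEROLOGY_SCORES[serology]
        + (1 if abnormal_reactant else 0)
        + (1 if symptom_weeks >= 6 else 0)
    )
    return score, score >= 6


example_score, example_is_ra = ra_2010_score(
    n_large=0, n_small=5, serology="low-positive",
    abnormal_reactant=True, symptom_weeks=8,
)
```

A patient with five involved small joints, low-positive serology, an abnormal acute-phase reactant, and eight weeks of symptoms scores 3 + 2 + 1 + 1 = 7 and is classified as having definite RA.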

The Psoriasis Area and Severity Index

The Psoriasis Area and Severity Index (PASI) was developed to measure the effect of treatments in clinical trials of psoriasis and is used to capture the psoriasis component of psoriatic arthritis. The scale was originally published in 1978 in a trial of 27 patients with severe chronic generalized psoriasis who were treated with Ro 10-9359, a retinoic acid derivative.30 The PASI is a composite index of disease severity incorporating measures of scaling, erythema, and induration, and it is weighted by severity and affected body surface area. A PASI >12 defines severe, PASI 7–12 moderate, and PASI <7 mild psoriasis.


