Aim
To summarize evidence concerning the reliability, validity, scoring, and minimal clinically important difference (MCID) of the following scales used to assess changes in ulcerative colitis (UC) disease activity and to measure outcomes in clinical trials: the Mayo scoring system, the Inflammatory Bowel Disease Questionnaire (IBDQ), and the Short Form (36) Health Survey (SF-36).
Findings
Mayo scoring system
The Mayo score is one of the most commonly used disease activity indices in placebo-controlled trials in UC. In its complete form, it is composed of four parts: rectal bleeding, stool frequency, physician assessment, and endoscopic appearance. Each part is rated from 0 to 3, giving a total score of 0 to 12. A score of 3 to 5 points indicates mildly active disease, a score of 6 to 10 points indicates moderately active disease, and a score of 11 to 12 points indicates severely active disease. Two abridged versions have been developed and validated: the partial Mayo score, which excludes the endoscopy subscore, and the non-invasive six-point score, which comprises only the rectal bleeding and stool frequency portions.23 The Mayo score and the partial Mayo score have been demonstrated to correlate with patient assessment of change in UC activity.23 Lewis et al. reported that a reduction of ≥ 3 points on the Mayo score and the partial Mayo score constitutes a clinically meaningful change.23 Lewis et al. also recommended that clinical remission of UC be defined using a Mayo score of ≤ 2 points.23
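The scoring rules described above can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for illustration; the cut-offs are those stated in this section, and labelling totals of 0 to 2 as "remission" assumes the definition recommended by Lewis et al.

```python
def _check(subscore):
    # Each Mayo component is rated on an integer scale from 0 to 3.
    if subscore not in (0, 1, 2, 3):
        raise ValueError("each subscore must be an integer from 0 to 3")
    return subscore


def full_mayo(rectal_bleeding, stool_frequency, physician_assessment, endoscopy):
    # Complete Mayo score: four components, total range 0 to 12.
    parts = (rectal_bleeding, stool_frequency, physician_assessment, endoscopy)
    return sum(_check(s) for s in parts)


def partial_mayo(rectal_bleeding, stool_frequency, physician_assessment):
    # Partial Mayo score: endoscopy subscore excluded, total range 0 to 9.
    parts = (rectal_bleeding, stool_frequency, physician_assessment)
    return sum(_check(s) for s in parts)


def six_point_mayo(rectal_bleeding, stool_frequency):
    # Non-invasive six-point score: rectal bleeding and stool frequency only.
    return _check(rectal_bleeding) + _check(stool_frequency)


def activity_category(full_score):
    # Categories for the complete (0 to 12) Mayo score as given in the text.
    if not 0 <= full_score <= 12:
        raise ValueError("the complete Mayo score ranges from 0 to 12")
    if full_score <= 2:
        return "remission"  # assumption: definition recommended by Lewis et al.
    if full_score <= 5:
        return "mild"
    if full_score <= 10:
        return "moderate"
    return "severe"


def clinically_meaningful_change(baseline, follow_up):
    # A reduction of at least 3 points (Mayo or partial Mayo score).
    return baseline - follow_up >= 3
```

For example, subscores of 2, 2, 2, and 3 give a complete Mayo score of 9 (moderately active disease); a follow-up score of 6 would meet the 3-point threshold for a clinically meaningful change.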
Although the Mayo score is a widely recognized UC activity index and is accepted by regulatory bodies, including Health Canada and the US FDA, it may not be optimal. Cooney et al. argue that two components of the Mayo score, the Physician Global Assessment (PGA) and the sigmoidoscopy subscore, are subjective and introduce variability and a lack of precision into the index. In addition, because the PGA itself incorporates a sigmoidoscopy score, some elements are double-counted.24
Inflammatory Bowel Disease Questionnaire
The IBDQ was developed by Guyatt et al.25 as a physician-administered questionnaire, and it is widely used to assess health-related quality of life (HRQoL) in patients with inflammatory bowel disease (IBD) (UC and Crohn disease).26 It is a 32-item Likert-based questionnaire divided into four dimensions: bowel symptoms (10 items), systemic symptoms (5 items), emotional function (12 items), and social function (5 items). Responses to each question are graded from 1 to 7 (1 being the worst situation and 7 the best). Therefore, the total IBDQ score ranges between 32 and 224, with higher scores representing better quality of life. The scores of patients in remission usually range from 170 to 190. An increase in IBDQ score of 16 to 32 points constitutes the lower and upper bounds of the clinically meaningful improvement in HRQoL in patients with Crohn disease.21 Whether this relationship between score change and level of clinical improvement applies directly to UC could not be determined from the literature search for this summary.
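The arithmetic behind the IBDQ total can be sketched as follows. This is an illustrative sketch only: the dimension keys and function name are hypothetical, and because this summary does not state which questionnaire items belong to which dimension, the responses are assumed to be supplied already grouped by dimension.

```python
# Number of items per dimension, as stated in the text (10 + 5 + 12 + 5 = 32).
IBDQ_ITEMS_PER_DIMENSION = {
    "bowel": 10,
    "systemic": 5,
    "emotional": 12,
    "social": 5,
}


def ibdq_scores(responses):
    """Return dimension subtotals and the total IBDQ score.

    `responses` maps each dimension name to a list of item responses,
    each graded from 1 (worst) to 7 (best).
    """
    scores = {}
    for dimension, n_items in IBDQ_ITEMS_PER_DIMENSION.items():
        answers = responses[dimension]
        if len(answers) != n_items:
            raise ValueError(f"{dimension}: expected {n_items} responses")
        if any(a not in range(1, 8) for a in answers):
            raise ValueError(f"{dimension}: responses must be integers from 1 to 7")
        scores[dimension] = sum(answers)
    # Total ranges from 32 (all items scored 1) to 224 (all items scored 7).
    scores["total"] = sum(scores.values())
    return scores
```

With every item answered 1 the total is 32, and with every item answered 7 it is 224, which matches the range given above.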
A systematic review21 of nine validation studies on the IBDQ for UC reported that, in seven of the studies, the IBDQ was able to differentiate clinically important differences between patients with disease remission and patients with disease relapse, by demonstrating significant differences in score.26 The IBDQ can also discriminate changes in the social and emotional state of patients; however, the correlation of these dimensions with disease activity is not as high as that of the bowel symptoms dimension.26 The IBDQ also demonstrated high test-retest reliability across all four IBDQ dimension scores. Six studies evaluated the IBDQ for sensitivity to change, and all suggested that it is a sensitive instrument for quantifying changes in HRQoL relative to changes in clinical activity in UC.26
Short Form (36) Health Survey: Medical Outcomes Study
The SF-36 is a 36-item, general health status instrument that has been used extensively in clinical trials in many disease areas.27 The SF-36 consists of eight health domains: physical functioning, role-physical, bodily pain, general health, vitality, social functioning, role-emotional, and mental health.28 For each of the eight categories, a subscale score can be calculated. The SF-36 also provides two component summaries, the Physical Component Summary (PCS) and the Mental Component Summary (MCS). The PCS and MCS scores range from 0 to 100, with higher scores indicating better health status. The summary scales are scored using norm-based methods, with regression weights and constants derived from the general US population. Both the PCS and MCS scales are transformed to have a mean of 50 and a standard deviation (SD) of 10 in the general US population. Therefore, all scores above/below 50 are considered above/below average for the general US population. In patients with either UC or Crohn disease, the SF-36 showed good discriminant ability and had satisfactory reliability.29 While reliability was satisfactory, the authors did observe substantial floor effects within the role-physical and role-emotional dimensions, underscoring the lack of sensitivity of the scale to detect small changes in certain groups of patients.29 For patients with UC and patients with Crohn disease, high ceiling effects along with low responsiveness scores (obtained using the Guyatt statistic) indicate some limitations associated with the ability of SF-36 to detect either deterioration or improvement over periods of time, particularly in longitudinal studies.29
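The norm-based transformation described above (mean of 50, SD of 10 in the reference population) can be sketched as below. This shows only the final standardization step; the regression weights and constants used to build the PCS and MCS are not given in this summary and are not reproduced here. The function names and the example population values are hypothetical.

```python
def norm_based_score(raw, population_mean, population_sd):
    # Rescale a raw score so that the reference population has
    # a mean of 50 and a standard deviation of 10.
    if population_sd <= 0:
        raise ValueError("population_sd must be positive")
    z = (raw - population_mean) / population_sd
    return 50 + 10 * z


def relative_to_population(score):
    # Interpretation used in the text: 50 is the reference population average.
    if score > 50:
        return "above average"
    if score < 50:
        return "below average"
    return "average"
```

For instance, under the hypothetical assumption of a population mean of 50 and SD of 20 on the raw scale, a raw value of 60 lies half an SD above the mean and maps to a norm-based score of 55.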
The MCID for either the PCS or MCS of the SF-36 is typically between 2.5 and 5 points,18–20 while in Crohn disease the MCIDs were estimated to range from 1.6 to 7.0 for the PCS and from 2.3 to 8.7 for the MCS, using various distribution- and anchor-based approaches.17 No MCID was identified in UC.
Summary
The Mayo score and the partial Mayo score are commonly used disease activity indices in placebo-controlled trials in UC. Both have demonstrated correlation with patient assessment of change in UC activity. Mild, moderate, and severe disease activity are indicated by score ranges of 3 to 5 points, 6 to 10 points, and 11 to 12 points, respectively. Lewis et al. reported that a reduction of ≥ 3 points on the Mayo score and the partial Mayo score reflects a clinically meaningful change.23
The IBDQ is a physician-administered, 32-item questionnaire used to assess HRQoL in patients with IBD (UC and Crohn disease).26 It evaluates bowel and systemic symptoms, as well as emotional and social functions. Responses to each question are graded from 1 to 7, with the overall score ranging from 32 (very poor HRQoL) to 224 (perfect HRQoL). Patients in symptomatic remission usually have a score of 170 or greater. An increase in IBDQ score of 16 to 32 points constitutes the lower and upper bounds of the clinically meaningful improvement in HRQoL in patients with Crohn disease.
The SF-36 is a 36-item, general health status instrument that has been used extensively in clinical trials in many disease areas,27 with an MCID generally ranging between 2.5 and 5 points.18–20 With regard to Crohn disease, however, the estimates of MCID for the PCS and MCS ranged between 1.6 and 8.7, while no MCID was identified in UC. The SF-36 was found to have good discriminant ability and satisfactory reliability; however, due to high floor and ceiling effects and low scores obtained for responsiveness using the Guyatt statistic, the SF-36 might be limited in its ability to detect either deterioration or improvement over time in patients with UC and Crohn disease.