NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Clinical Review Report: Dupilumab (Dupixent): (Sanofi-Aventis Canada Inc.): Indication: Moderate-to-severe atopic dermatitis (AD) [Internet]. Ottawa (ON): Canadian Agency for Drugs and Technologies in Health; 2018 Jul.

Appendix 5: Validity of Outcome Measures


This appendix summarizes the validity of the following end point measures:

  • Eczema Area and Severity Index (EASI)
  • Investigator’s Global Assessment scale (IGA)
  • Scoring Atopic Dermatitis (SCORAD)
  • Patient Global Assessment of Disease Status (PGADS)
  • Pruritus Numerical Rating Scale (NRS)
  • Dermatology Life Quality Index (DLQI)
  • EuroQol 5-Dimensions questionnaire (EQ-5D)
  • Hospital Anxiety and Depression Scale (HADS)
  • Patient-Oriented Eczema Measure (POEM).


Table 17: Validity and Minimal Clinically Important Difference of Outcome Measures

| Instrument | Type | Evidence of Validity | MCID | References |
|---|---|---|---|---|
| EASI | A scale used in clinical trials to assess the severity and extent of AD | Yes | 6.6 points | 8 |
| IGA | A scale that provides a global clinical assessment of AD by investigators | Yes | Unknown | 12,22 |
| SCORAD | A tool used in clinical research to standardize the evaluation of the extent and severity of AD | Yes | 8.7 points | 8 |
| PGADS | A scale used for global assessment of AD by patients | Unknown | Unknown | 4 |
| Pruritus NRS | A tool for patients with AD to report the intensity of their itch | Yes | 3 points^a | 4,39,40 |
| DLQI | A questionnaire used to assess six different aspects that may affect quality of life | Yes | 2.2 to 6.9 | 31,32 |
| EQ-5D | A generic QoL instrument that has been applied to a wide range of health conditions and treatments | Yes | Unknown for AD | 4,5,34,41-43 |
| HADS | A patient-reported questionnaire designed to identify anxiety disorders and depression in patients at non-psychiatric medical institutions | Unknown | Unknown | 35,44,45 |
| POEM | A questionnaire used in clinical trials to assess disease symptoms in children and adults with eczema | Yes | 3.4 points | 8 |

AD = atopic dermatitis; DLQI = Dermatology Life Quality Index; EASI = Eczema Area and Severity Index; EQ-5D = EuroQol 5-Dimensions questionnaire; HADS = Hospital Anxiety and Depression Scale; IGA = Investigator’s Global Assessment; MCID = minimal clinically important difference; NRS = Numerical Rating Scale; PGADS = Patient Global Assessment of Disease Status; POEM = Patient-Oriented Eczema Measure; QoL = quality of life; SCORAD = Scoring Atopic Dermatitis.


a A reduction of three points in the Pruritus NRS was considered a clinically meaningful improvement.39,40

Eczema Area and Severity Index

The EASI is a scale used in clinical trials to assess the severity and extent of atopic dermatitis (AD).8,20-22 In the EASI, four disease characteristics of AD (erythema, infiltration/papulation, excoriations, and lichenification) are assessed for severity by the investigator on a scale of 0 (absent) to 3 (severe). The scores are added up for each of the four body regions (head, arms, trunk, and legs). The assigned percentages of body surface area (BSA) for each section of the body are 10% for head, 20% for arms, 30% for trunk, and 40% for legs. Each subtotal score is multiplied by the BSA represented by that region. In addition, an area score of 0 to 6 is assigned for each body region, depending on the percentage of AD-affected skin in that area: 0 (none), 1 (1% to 9%), 2 (10% to 29%), 3 (30% to 49%), 4 (50% to 69%), 5 (70% to 89%), or 6 (90% to 100%). Each weighted regional subtotal is then multiplied by the area score for that region, and the four regional results are summed. The resulting EASI score ranges from 0 to 72 points, with higher scores indicating more severe AD.20 It has been suggested that the severity of AD based on EASI score should be categorized as follows: 0 = clear; 0.1 to 1.0 = almost clear; 1.1 to 7.0 = mild; 7.1 to 21.0 = moderate; 21.1 to 50.0 = severe; 50.1 to 72.0 = very severe.46 EASI-75 indicates ≥ 75% improvement from baseline.4
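The EASI arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only, not a validated scoring tool: the function and variable names are hypothetical, and the handling of involvement between 0% and 1% (scored as 1 here) is an assumption, because the text lists the lowest non-zero band as 1% to 9%.

```python
# Illustrative sketch of the EASI arithmetic described in the text.
# All names are hypothetical; this is not a validated scoring tool.

# Share of body surface area per region, in percent (from the text).
REGION_WEIGHT_PCT = {"head": 10, "arms": 20, "trunk": 30, "legs": 40}
SIGNS = ("erythema", "infiltration_papulation", "excoriations", "lichenification")


def area_score(percent_affected):
    """Map the percentage of affected skin in a region to the 0-6 area score."""
    if not 0 <= percent_affected <= 100:
        raise ValueError("percent_affected must be between 0 and 100")
    if percent_affected == 0:
        return 0
    # Assumption: any non-zero involvement below 10% scores 1.
    for upper_bound, score in ((10, 1), (30, 2), (50, 3), (70, 4), (90, 5)):
        if percent_affected < upper_bound:
            return score
    return 6


def easi_score(sign_scores, percent_affected):
    """Sum over regions of: BSA weight x sum of four sign scores x area score."""
    total = 0
    for region, weight_pct in REGION_WEIGHT_PCT.items():
        severities = [sign_scores[region][sign] for sign in SIGNS]
        if any(not 0 <= s <= 3 for s in severities):
            raise ValueError("each sign is scored from 0 (absent) to 3 (severe)")
        total += weight_pct * sum(severities) * area_score(percent_affected[region])
    return total / 100  # weights were held as whole percentages


def easi_severity(score):
    """Severity strata suggested in the text for a total EASI score."""
    if not 0 <= score <= 72:
        raise ValueError("EASI ranges from 0 to 72")
    if score == 0:
        return "clear"
    for upper_bound, label in ((1.0, "almost clear"), (7.0, "mild"),
                               (21.0, "moderate"), (50.0, "severe")):
        if score <= upper_bound:
            return label
    return "very severe"


def achieved_easi_75(baseline, current):
    """True when the score improved by at least 75% from a non-zero baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be greater than 0")
    return (baseline - current) / baseline >= 0.75
```

With every sign scored 3 and every region at least 90% affected, the sketch returns the stated maximum of 72, which is a quick check that the weights and area scores are combined as described.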

The validity and reliability of the EASI were examined in several studies.8,20-22 Correlation coefficients between the EASI and SCORAD were estimated to assess validity.21 A moderate to high correlation between the EASI and SCORAD (r = 0.84 to 0.93) was reported.21 Intra- and inter-rater reliability were also examined (r = 0.8 to 0.9).21 The authors concluded that EASI is a validated scale and can be used reliably in the assessment of severity and extent of AD.12,20 In one study,8 it was reported that the overall minimal clinically important difference (MCID) was 6.6 points when a one-point improvement in IGA was used as the anchor. However, the reported MCID was not relevant for interpreting the EASI data (such as EASI-75) reported in the pivotal studies.

Investigator’s Global Assessment Scale

The IGA is a five-point scale that provides a global clinical assessment of AD severity ranging from 0 to 4, where 0 indicates clear, 1 is almost clear, 2 is mild, 3 is moderate, and 4 indicates severe AD.4 A decrease in score corresponds to an improvement in signs and symptoms. However, it was indicated that the IGA was designed for, and is commonly used in, clinical trials and is rarely used in clinical practice.12 The clinical expert consulted for this review explained that, in practice, a physician would assess a patient’s AD more subjectively (evaluating inflammatory lesions or erythema) without using the IGA. It was reported that the intra-class correlation coefficient (intra-rater reliability by investigator) for the IGA (0.54)22 is below what would typically be considered acceptable (0.70). A review of the literature found no information on the validity of the IGA scale in patients with AD. Similarly, no information was found on what would constitute an MCID in patients with AD.

Patient Global Assessment of Disease Status

The PGADS is measured using a five-point Likert scale. A higher score indicates a better overall condition. In the pivotal clinical studies,46 patients rated their overall well-being based on five response choices ranging from poor to excellent. Patients were asked: “Considering all the ways in which your eczema affects you, indicate how well you are doing.” Response choices were: “Poor,” “Fair,” “Good,” “Very Good,” and “Excellent.”4 No information on the validity, reliability, or MCID of the PGADS in AD was found in the literature reviewed.

Scoring Atopic Dermatitis

The SCORAD is a tool used in clinical research that was developed to standardize the evaluation of the extent and severity of AD.4,23 It assesses three components of AD: the affected BSA, severity of clinical signs, and symptoms. The extent of AD is assessed as a percentage of each defined body area and reported as the sum of all areas. The maximum score is 100%. The severity of six specific symptoms of AD (redness, swelling, oozing/crusting, excoriation, skin thickening/lichenification, dryness) is assessed using a four-point scale (i.e., none = 0, mild = 1, moderate = 2, severe = 3) with a maximum possible total of 18 points. The symptoms (itch and sleeplessness) are recorded by the patient or caregiver on a visual analogue scale, where 0 is no symptoms and 10 is the worst imaginable symptom, with a maximum possible score of 20. The SCORAD is calculated based on the three components of the AD discussed previously. The maximum possible SCORAD score is 103; higher scores indicate poorer or more severe condition.4 The intra-class correlation coefficient (ICC) was calculated to assess intra-rater reliability; the coefficient of variation was used to assess inter-rater variability.22 It was reported that the ICC for SCORAD was 0.66, indicating fair-to-good reliability in patients with AD.22 Based on the analysis of the data from three randomized controlled trials (RCTs) with patients with atopic eczema, the MCID was estimated using the mean change in SCORAD scores among patients who showed a relevant improvement based on IGA, defined as an “improvement” or “decline” of ≥ 1 point in Physician’s Global Assessment and IGA. A difference of 8.7 points in SCORAD was estimated as the MCID for the patients with atopic eczema (also known as AD).8
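The paragraph above gives the three SCORAD components and the maximum of 103 but not how they are combined. A minimal sketch follows, assuming the conventional weighting A/5 + 7B/2 + C (A = extent in percent, B = summed intensity of the six items, C = itch plus sleeplessness). That weighting is not stated in this appendix, although it does reproduce the stated maximum: 100/5 + 7 × 18/2 + 20 = 103. The function name and argument names are hypothetical.

```python
# Illustrative sketch of a SCORAD total.
# Assumption: the conventional weighting A/5 + 7B/2 + C, which is not stated
# in this appendix but reproduces the stated maximum of 103.

def scorad(extent_pct, intensity_scores, itch_vas, sleeplessness_vas):
    """Combine extent (A), intensity (B), and subjective symptoms (C)."""
    if not 0 <= extent_pct <= 100:
        raise ValueError("extent is a percentage from 0 to 100")
    if len(intensity_scores) != 6 or any(not 0 <= s <= 3 for s in intensity_scores):
        raise ValueError("six intensity items, each scored 0 to 3")
    if not (0 <= itch_vas <= 10 and 0 <= sleeplessness_vas <= 10):
        raise ValueError("each visual analogue scale runs from 0 to 10")

    a = extent_pct                      # maximum 100
    b = sum(intensity_scores)           # maximum 18
    c = itch_vas + sleeplessness_vas    # maximum 20
    return a / 5 + 7 * b / 2 + c


# Example: 30% extent, intensity items summing to 6, itch 6, sleeplessness 4.
example_total = scorad(30, [1, 1, 0, 2, 1, 1], 6, 4)  # 6 + 21 + 10
```

A change between two such totals could then be compared with the 8.7-point MCID reported above.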

Pruritus Numerical Rating Scale

The Pruritus NRS is a tool that patients used to report the intensity of their itch during a daily recall period using an interactive voice response system. Patients were asked to rate their overall (average) and maximum intensity of itch experienced during the past 24 hours on a scale from 0 to 10 (0 = no itch and 10 = worst itch imaginable).4 The proportion of patients with improvement (reduction ≥ 3 or ≥ 4 points) in the weekly average of the peak daily Pruritus NRS from baseline to week 16 was reported in the pivotal studies.4 Additional information provided by the manufacturer reported the validity and reliability of the Pruritus NRS based on three phase III and one phase IIb RCTs.39,40 In the aforementioned RCTs, the Pruritus NRS was completed daily from baseline through week 16, and weekly from week 17 to week 52.39,40 Patient data from weeks 15 and 16 were used to examine the test–retest reliability, and ICCs were computed. The pooled ICC from the three RCTs was 0.96, and the ICC from the phase IIb study ranged from 0.95 to 0.97.39,40 The ICC values indicated that the Pruritus NRS scores were stable over a period of time when the patients’ disease was stable. To assess the validity of the Pruritus NRS, a priori hypotheses were evaluated using correlational analyses and three known-groups analysis of variance (ANOVA) models (“absent/mild” group based on the Pruritus Categorical Scale [PCS]; “poor” disease group based on the PGADS, and “no impact” on the skin-related quality-of-life group based on DLQI total score). 
Results for all three known groups were in the anticipated direction and were statistically significant, and the effect sizes for the differences between the extreme categories for each known group were all above Cohen’s threshold of 0.80 for large effect sizes (Cohen, 1998).39,40 Based on the data from the phase IIb study, using the EASI and IGA as anchors, estimates of the Pruritus NRS responder definition reportedly ranged between 2.2 and 4.2 points, with the highest estimates based on the most stringent clinical criteria (EASI 90–100 and IGA 0 or 1). Using the PCS as an anchor, the responder definition was estimated as 2.6 points. These analyses suggested that the most appropriate definition of a responder on the Pruritus NRS is a reduction in the range of 3 to 4 points.39,40
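The responder definition used in the pivotal studies (a reduction of ≥ 3 or ≥ 4 points in the weekly average of the peak daily Pruritus NRS) can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the source does not describe how missing diary days were handled, so the sketch simply averages whatever daily values are supplied.

```python
# Illustrative sketch of a Pruritus NRS responder check; names are hypothetical.
# Assumption: missing diary days are not handled; supplied values are averaged.
from statistics import mean


def weekly_average_peak(daily_peak_scores):
    """Average the peak daily itch scores (0 to 10) recorded over one week."""
    if not daily_peak_scores:
        raise ValueError("at least one daily score is required")
    if any(not 0 <= s <= 10 for s in daily_peak_scores):
        raise ValueError("each daily score must be between 0 and 10")
    return mean(daily_peak_scores)


def is_responder(baseline_week, week_16, threshold=4):
    """True when the weekly average fell by at least `threshold` points."""
    reduction = weekly_average_peak(baseline_week) - weekly_average_peak(week_16)
    return reduction >= threshold


baseline = [8, 7, 9, 8, 8, 7, 9]   # weekly average 8
week_16 = [5, 5, 5, 5, 5, 5, 5]    # weekly average 5, a 3-point reduction
```

In this example the patient counts as a responder under the ≥ 3-point definition but not under the ≥ 4-point definition.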

Dermatology Life Quality Index

The DLQI is a widely used dermatology-specific quality-of-life instrument. It is a 10-item questionnaire that assesses six different aspects that may affect quality of life.31,32 These aspects are symptoms and feelings, daily activities, leisure, work and school performance, personal relationships, and treatment.31,32 The maximum score per aspect is either 3 (with a single question) or 6 (with two questions), and the scores for each can be expressed as a percentage of either 3 or 6. Each of the 10 questions is scored from 0 (not at all) to 3 (very much) and the overall DLQI is calculated by summing the score of each question, resulting in a numeric score between 0 and 30 (or a percentage of 30).31,32 The higher the score, the more quality of life is impaired. The meaning of the DLQI scores on a patient’s life is as follows:26

  • 0 to 1 = no effect
  • 2 to 5 = small effect
  • 6 to 10 = moderate effect
  • 11 to 20 = very large effect
  • 21 to 30 = extremely large effect.
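The DLQI total and the interpretation bands listed above can be expressed as a short sketch. The function names are hypothetical, and the mapping of individual questions to the six aspects is not included because it is not given in this appendix.

```python
# Illustrative sketch of DLQI scoring and banding; names are hypothetical.

def dlqi_total(item_scores):
    """Sum ten items, each scored 0 (not at all) to 3 (very much)."""
    if len(item_scores) != 10:
        raise ValueError("the DLQI has 10 items")
    if any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("each item is scored 0, 1, 2, or 3")
    return sum(item_scores)


def dlqi_band(total):
    """Map a total score (0 to 30) to the effect on the patient's life."""
    if not 0 <= total <= 30:
        raise ValueError("the DLQI total ranges from 0 to 30")
    for upper_bound, label in ((1, "no effect"), (5, "small effect"),
                               (10, "moderate effect"), (20, "very large effect")):
        if total <= upper_bound:
            return label
    return "extremely large effect"
```

For example, ten items each scored 1 give a total of 10, which falls in the "moderate effect" band.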

The DLQI has shown good test–retest reliability (correlation between overall DLQI scores was 0.99 [P < 0.0001], and for individual question scores was 0.95 to 0.98 [P < 0.001]),32 internal consistency reliability (with Cronbach’s alpha coefficients ranging from 0.75 to 0.92 when assessed in 12 international studies),26 construct validity (37 separate studies have mentioned a significant correlation between the DLQI and either generic or dermatology-specific and disease-specific measures),26 and responsiveness (the DLQI being able to detect changes before and after treatment in patients with psoriasis in 17 different studies).26

Estimates of the minimal important difference (the smallest difference a patient would regard as beneficial) have ranged from 2.2 to 6.9.26,31 It should be noted that some of the anchors that were used to obtain the DLQI MCID were not patient-based (i.e., Basra et al.26 derived estimates from the Psoriasis Area and Severity Index and Physician’s Global Assessment anchors, as well as a distribution-based approach).

Limitations associated with the DLQI are as follows:

  • Concerns have been identified regarding its unidimensionality and the behaviour of items of the DLQI in different psoriatic patient populations with respect to their cross-cultural equivalence and age and gender; however, these concerns were identified in only two citations out of the 12 international studies identified.26
  • The patient’s emotional responses to their disease may be underrepresented and this may be one reason for unexpectedly low DLQI scores in patients with more emotionally disabling diseases, such as vitiligo. To overcome this, it is suggested that the DLQI be combined with more emotionally oriented measures, such as the mental component of the Short Form (36) Health Survey (SF-36) or HADS.26
  • There are no available benchmarks for the MCID in DLQI scores in general dermatological conditions, although there have been some attempts to determine these differences for specific conditions, such as psoriasis.26
  • The DLQI may lack sensitivity in detecting change from mild to severe psoriasis.47

No validity information or MCID information for the DLQI was found specifically for patients with AD.

EuroQol 5-Dimensions Questionnaire

The EQ-5D41,42 is a generic quality-of-life instrument that has been applied to a wide range of health conditions and treatments, including AD. The first of two parts of the EQ-5D is a descriptive system that classifies respondents (aged 12 years or older) into one of 243 distinct health states. The descriptive system consists of the following five dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Each dimension has three possible levels (1, 2, or 3) representing “no problems,” “some problems,” and “extreme problems,” respectively. Respondents are asked to choose the one level that reflects their own health state for each of the five dimensions. A scoring function can be used to assign a value (EQ-5D index score) to self-reported health states from a set of population-based preference weights.41,42 The second part is a 20 cm visual analogue scale (EQ-VAS) with end points labelled 0 and 100, anchored at “worst imaginable health state” and “best imaginable health state,” respectively. Respondents are asked to rate their own health by drawing a line from an anchor box to the point on the EQ-VAS that best represents their own health on that day. Hence, the EQ-5D produces three types of data for each respondent:

  • a profile indicating the extent of problems on each of the five dimensions represented by a five-digit descriptor, such as 11121 or 33211
  • a population preference-weighted health index score based on the descriptive system
  • a self-reported assessment of health status based on the EQ-VAS.
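The descriptive system can be made concrete with a short sketch: five dimensions with three levels each give 3^5 = 243 possible profiles, and a five-digit descriptor such as 11121 decodes to one level per dimension. The names below are hypothetical, and no utility weights are included because those depend on the population value set applied.

```python
# Illustrative sketch of the EQ-5D (three-level) descriptive system.
# Names are hypothetical; no preference weights are included.
from itertools import product

DIMENSIONS = ("mobility", "self_care", "usual_activities",
              "pain_discomfort", "anxiety_depression")
LEVEL_LABELS = {1: "no problems", 2: "some problems", 3: "extreme problems"}


def all_health_states():
    """Enumerate every five-digit profile, from 11111 to 33333."""
    return ["".join(str(level) for level in levels)
            for levels in product((1, 2, 3), repeat=len(DIMENSIONS))]


def describe_profile(profile):
    """Decode a five-digit descriptor into a level label per dimension."""
    if len(profile) != len(DIMENSIONS) or any(ch not in "123" for ch in profile):
        raise ValueError("a profile is five digits, each 1, 2, or 3")
    return {dim: LEVEL_LABELS[int(ch)] for dim, ch in zip(DIMENSIONS, profile)}
```

Decoding 11121, for instance, yields "some problems" for pain/discomfort and "no problems" for the other four dimensions.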

The EQ-5D index score is generated by applying a multi-attribute utility function to the descriptive system. Different utility functions are available that reflect the preferences of specific populations (e.g., US or UK). The lowest possible overall score (corresponding to severe problems on all five attributes) varies depending on the utility function that is applied to the descriptive system (e.g., −0.59 for the UK algorithm and −0.109 for the US algorithm). Scores lower than 0 represent health states that are valued by society as being worse than dead, while scores of 0 and 1 are assigned to the health states “dead” and “perfect health,” respectively.

The MCID for the EQ-5D index score in general use ranges from 0.033 to 0.074.34 The EQ-5D index utility score and EQ-VAS score were reported in the pivotal studies.46 No additional validity information or MCID information for the EQ-5D in AD was found from the literature search.

Hospital Anxiety and Depression Scale

The HADS is a widely used patient-reported questionnaire designed to identify anxiety disorders and depression in patients at non-psychiatric medical institutions. Repeated administration also provides information about changes in a patient’s emotional state.35,44,45 The HADS questionnaire contains 14 items that assess symptoms experienced in the previous week; among these, seven items are related to anxiety and seven are related to depression. Patients provide responses to each item based on a four-point Likert scale. Each item is scored from 0 (the best) to 3 (the worst); thus, a person can score between 0 and 21 for each subscale (anxiety and depression). A high score is indicative of a poor state. Scores of 11 or more on either subscale are considered to be a “definite case” of psychological morbidity, while scores of 8 to 10 represent “probable case” and 0 to 7 “not a case.”35 No additional information about the validity of or MCID for the HADS in AD was found from the literature search.
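The subscale scoring and case thresholds described above can be sketched as follows. The names are hypothetical, and the sketch takes the anxiety and depression items as two separate lists because the order of items on the questionnaire is not described in this appendix.

```python
# Illustrative sketch of HADS subscale scoring; names are hypothetical.

def hads_subscale_score(item_scores):
    """Sum seven items, each scored 0 (best) to 3 (worst); range 0 to 21."""
    if len(item_scores) != 7:
        raise ValueError("each HADS subscale has 7 items")
    if any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("each item is scored 0, 1, 2, or 3")
    return sum(item_scores)


def hads_category(subscale_score):
    """Apply the thresholds given in the text to one subscale score."""
    if not 0 <= subscale_score <= 21:
        raise ValueError("a subscale score ranges from 0 to 21")
    if subscale_score >= 11:
        return "definite case"
    if subscale_score >= 8:
        return "probable case"
    return "not a case"


def score_hads(anxiety_items, depression_items):
    """Return the score and category for each subscale."""
    result = {}
    for name, items in (("anxiety", anxiety_items), ("depression", depression_items)):
        score = hads_subscale_score(items)
        result[name] = (score, hads_category(score))
    return result
```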

Patient-Oriented Eczema Measure

The POEM is a seven-item questionnaire used in clinical trials to assess disease symptoms in children and adults.24 Based on frequency of occurrence during the past week, the seven items (dryness, itching, flaking, cracking, sleep loss, bleeding, and weeping) are assessed using a five-point scale. The possible scores for each question are: 0 (no days), 1 (1 to 2 days), 2 (3 to 4 days), 3 (5 to 6 days), and 4 (every day). The maximum total score is 28; a high score is indicative of poor quality of life (0 to 2 indicates clear or almost clear skin, 3 to 7 indicates mild eczema, 8 to 16 indicates moderate eczema, 17 to 24 indicates severe eczema, and 25 to 28 indicates very severe eczema).24 In one study,8 it was reported that the overall mean MCID of the POEM was 3.4 points (standard deviation = 4.8) when a one-point improvement in IGA was used as the anchor.
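The POEM scoring rules above translate directly into a short sketch: each item's number of affected days in the past week maps to a score of 0 to 4, and the seven scores are summed and banded. The function names are hypothetical.

```python
# Illustrative sketch of POEM scoring and banding; names are hypothetical.

ITEMS = ("dryness", "itching", "flaking", "cracking",
         "sleep_loss", "bleeding", "weeping")


def item_score(days_affected):
    """Map days affected in the past week (0 to 7) to an item score of 0 to 4."""
    if days_affected not in range(8):
        raise ValueError("days_affected must be a whole number from 0 to 7")
    if days_affected == 0:
        return 0
    if days_affected <= 2:
        return 1
    if days_affected <= 4:
        return 2
    if days_affected <= 6:
        return 3
    return 4  # every day


def poem_total(days_by_item):
    """Sum the seven item scores; the total ranges from 0 to 28."""
    return sum(item_score(days_by_item[item]) for item in ITEMS)


def poem_band(total):
    """Map a total score to the severity band given in the text."""
    if not 0 <= total <= 28:
        raise ValueError("the POEM total ranges from 0 to 28")
    for upper_bound, label in ((2, "clear or almost clear"), (7, "mild"),
                               (16, "moderate"), (24, "severe")):
        if total <= upper_bound:
            return label
    return "very severe"
```

A change in total between two visits could then be compared with the 3.4-point MCID reported above.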


The IGA, EASI, and SCORAD are the most commonly used tools in clinical trials to evaluate disease severity in patients with AD. Among them, the IGA is widely accepted and considered a “validated” scale. The MCID for EASI, SCORAD, and POEM was estimated to be 6.6, 8.7, and 3.4 points, respectively, for the patients with AD. Additional information provided by the manufacturer suggested that a reduction of 3 to 4 points in the Pruritus NRS was a reasonable threshold for treatment response. Although the PGADS and HADS are commonly used in clinical practice to assess AD, no validity information and no MCID information were found for AD. The DLQI and EQ-5D (3-Levels questionnaire) are commonly used tools to assess health-related quality of life in patients with AD; however, no information about the validity of or MCID for the EQ-5D and DLQI in AD was found from the literature search.

Copyright © 2018 Canadian Agency for Drugs and Technologies in Health.

The copyright and other intellectual property rights in this document are owned by CADTH and its licensors. These rights are protected by the Canadian Copyright Act and other national and international laws and agreements. Users are permitted to make copies of this document for non-commercial purposes only, provided it is not modified when reproduced and appropriate credit is given to CADTH and its licensors.

Except where otherwise noted, this work is distributed under the terms of a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International licence (CC BY-NC-ND), a copy of which is available at http://creativecommons.org/licenses/by-nc-nd/4.0/

Bookshelf ID: NBK539234

