NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Smith B, Carson S, Fu R, et al. Drug Class Review: Disease-modifying Drugs for Multiple Sclerosis: Final Update 1 Report [Internet]. Portland (OR): Oregon Health & Science University; 2010 Aug.


Inclusion Criteria

Populations


  • Adult outpatients (age ≥18 years) with multiple sclerosis16, 17
    • Relapsing-remitting multiple sclerosis
    • Secondary progressive multiple sclerosis
    • Primary progressive multiple sclerosis
    • Progressive relapsing multiple sclerosis
  • Adult outpatients with a clinically isolated syndrome (also known as “first demyelinating event”, first clinical attack suggestive of multiple sclerosis, or monosymptomatic presentation).17

Interventions (all formulations)

The following 7 drugs are available in the United States and Canada. Black box warnings associated with each drug are listed in Appendix B.

  • Glatiramer acetate (Copaxone®)
  • Interferon beta-1a (Avonex®, Rebif®)
  • Interferon beta-1b (Betaseron®, Extavia® [not available in Canada])
  • Mitoxantrone (Novantrone®)
  • Natalizumab (Tysabri®)

Effectiveness outcomes

Multiple sclerosis
  • Disability
  • Clinical exacerbation/relapse
  • Quality of life
  • Functional outcomes (e.g. wheel-chair use, time lost from work)
  • Persistence (discontinuation rates)
Clinically isolated syndrome
  • Disability
  • Clinical exacerbation/relapse of symptoms
  • Quality of life
  • Functional outcomes (e.g. wheel-chair use, time lost from work)
  • Persistence (discontinuation rates)
  • Progression to multiple sclerosis diagnosis

Note: Magnetic resonance imaging findings are not included, as they are intermediate or surrogate outcomes.

Harms


  • Overall rate of adverse effects
  • Withdrawals due to adverse effects
  • Serious adverse events
  • Specific adverse events (cardiovascular, hepatotoxicity, progressive multifocal leukoencephalopathy, secondary cancers, etc.)

Study designs

  • For effectiveness, controlled clinical trials and good-quality systematic reviews. Observational studies with 2 concurrent arms of at least 100 patients each and duration ≥1 year are included (e.g. cohort, case-control).
  • For harms, in addition to controlled clinical trials, observational studies are included.

Literature Search

To identify relevant citations, we searched Ovid MEDLINE® (1966 to December 2009), the Cochrane Database of Systematic Reviews® (4th quarter 2009), the Cochrane Central Register of Controlled Trials® (4th quarter 2009), and the Database of Abstracts of Reviews of Effects (4th quarter 2009) using terms for included drugs, indications, and study designs (see Appendix C for complete search strategies). We attempted to identify additional studies through hand searches of reference lists of included studies and reviews. In addition, we searched the websites of the US Food and Drug Administration Center for Drug Evaluation and Research, the Canadian Agency for Drugs and Technology in Health, and the National Institute for Health and Clinical Excellence for medical and statistical reviews and technology assessments. Finally, we requested dossiers of published and unpublished information from the relevant pharmaceutical companies for this review. All received dossiers were screened for studies or data not found through other searches. All citations were imported into an electronic database (EndNote® XI, Thomson Reuters).

Study Selection

Selection of included studies was based on the inclusion criteria created by the Drug Effectiveness Review Project participants, as described above. Two reviewers independently assessed titles and/or abstracts of citations identified from literature searches for inclusion, using these criteria. Full-text articles of potentially relevant abstracts were retrieved, and a second review for inclusion was conducted by reapplying the inclusion criteria. Results published only in abstract form were generally not included because inadequate details were available for quality assessment; however, if we were provided with enough information to conduct quality assessment, we did include the study. Additional results from fully published studies (e.g. relating to secondary outcome measures) found only in abstract form were included because the study quality could be assessed through the complete publication.

Data Abstraction

The following data were abstracted from included trials: study design, setting, population characteristics (including sex, age, ethnicity, diagnosis), eligibility and exclusion criteria, interventions (dose and duration), comparisons, numbers screened, eligible, enrolled, and lost to follow-up, method of outcome ascertainment, and results for each outcome. Data were abstracted by one reviewer and checked by a second. We recorded intention-to-treat results when reported. If true intention-to-treat results were not reported, but loss to follow-up was very small, we considered these results to be intention-to-treat results. In cases where only per-protocol results were reported, we calculated intention-to-treat results if the data for these calculations were available.

Validity Assessment

We assessed the internal validity (quality) of trials based on predefined criteria. These criteria are based on the US Preventive Services Task Force and the National Health Service Centre for Reviews and Dissemination (United Kingdom) criteria.18, 19 We rated the internal validity of each trial based on the methods used for randomization, allocation concealment, and blinding; the similarity of compared groups at baseline; maintenance of comparable groups; adequate reporting of dropouts, attrition, crossover, adherence, and contamination; loss to follow-up; and the use of intention-to-treat analysis. Trials that had fatal flaws were rated “poor-quality”; trials that met all criteria were rated “good-quality”; the remainder were rated “fair-quality.” As the fair-quality category is broad, studies with this rating vary in their strengths and weaknesses: the results of some fair-quality studies are likely to be valid, while others are only probably valid. A poor-quality trial is not valid in that the results are at least as likely to reflect flaws in the study design as the true difference between the compared drugs. A fatal flaw is reflected by failing to meet combinations of items of the quality assessment checklist.

A particular randomized trial might receive 2 different ratings: 1 for effectiveness and another for adverse events. The overall strength of evidence for a particular key question reflects the quality, consistency, and power of the set of studies relevant to the question.

The criteria for observational studies of adverse events reflect aspects of the study design that are particularly important for assessing adverse event rates. We rated observational studies as good quality for adverse event assessment if they adequately met 6 or more of the 7 predefined criteria, fair quality if they met 3 to 5 criteria, and poor quality if they met 2 or fewer criteria.
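The thresholds in the preceding paragraph can be written as a small lookup. This is only an illustrative sketch: the function name `rate_observational_study` and its parameters are hypothetical, and the sketch assumes each of the 7 criteria is scored simply as met or not met.

```python
def rate_observational_study(criteria_met: int, total_criteria: int = 7) -> str:
    """Map the number of criteria adequately met to a quality rating.

    Thresholds follow the text: 6 or more of 7 is good, 3 to 5 is fair,
    2 or fewer is poor. The function name and signature are illustrative.
    """
    if not 0 <= criteria_met <= total_criteria:
        raise ValueError("criteria_met must be between 0 and total_criteria")
    if criteria_met >= 6:
        return "good"
    if criteria_met >= 3:
        return "fair"
    return "poor"


if __name__ == "__main__":
    for n in range(8):
        print(n, rate_observational_study(n))
```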

Included systematic reviews were also rated for quality (see Appendix C) using predefined criteria: a clear statement of the question(s), inclusion criteria, adequacy of the search strategy, validity assessment, adequacy of detail provided for included studies, and appropriateness of the methods of synthesis.

Grading the Strength of Evidence

We graded strength of evidence based on the guidance established for the Evidence-based Practice Center Program of the Agency for Healthcare Research and Quality.20 Developed to grade the overall strength of a body of evidence, this approach incorporates 4 key domains: risk of bias (includes study design and aggregate quality), consistency, directness, and precision of the evidence. It also considers other optional domains that may be relevant for some scenarios, such as a dose-response association, plausible confounding that would decrease the observed effect, strength of association (magnitude of effect), and publication bias.

Table 2 describes the grades of evidence that can be assigned. Grades reflect the strength of the body of evidence to answer key questions on the comparative effectiveness, efficacy and harms of disease-modifying drugs for multiple sclerosis. Grades do not refer to the general efficacy or effectiveness of pharmaceuticals. Two reviewers independently assessed each domain for each outcome and differences were resolved by consensus.

Table 2. Definitions of the grades of overall strength of evidence.

We chose outcomes related to relapse and disease progression. Magnetic resonance imaging findings were considered intermediate outcomes and were not assessed.

Data Synthesis

We constructed evidence tables showing the study characteristics, quality ratings, and results for all included studies. We reviewed studies using a hierarchy of evidence approach, where the best evidence is the focus of our synthesis for each question, population, intervention, and outcome addressed. Studies that evaluated 1 disease-modifying drug for multiple sclerosis against another provided direct evidence of comparative effectiveness and adverse event rates. Where possible, these data are the primary focus. Direct comparisons were preferred over indirect comparisons; similarly, effectiveness and long-term safety outcomes were preferred to efficacy and short-term tolerability outcomes.

In theory, trials that compare a disease-modifying drug for multiple sclerosis to placebo can also provide evidence about effectiveness.22, 23 This is known as an indirect comparison and can be difficult to interpret for a number of reasons, primarily issues of heterogeneity between trial populations, interventions, and assessment of outcomes. Data from indirect comparisons are used to support direct comparisons, where they exist, and are also used as the primary comparison where no direct comparisons exist. Such indirect comparisons should be interpreted with caution.

Meta-analyses were conducted to summarize data and obtain more precise estimates on outcomes for which studies were homogeneous enough to provide a meaningful combined estimate. In order to determine whether meta-analysis could be meaningfully performed, we considered the quality of the studies and heterogeneity across studies in study design, patient population, interventions, and outcomes. When meta-analysis could not be performed, the data were summarized qualitatively.

Random-effects models were used to estimate pooled effects.24 The Q statistic and the I² statistic (the proportion of variation in study estimates due to heterogeneity) were calculated to assess heterogeneity in effects between studies.25, 26 Meta-analysis was performed using StatsDirect (CamCode, United Kingdom) and the meta package in R.27
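To make the quantities in this paragraph concrete, the following is a minimal sketch of a random-effects pooled estimate together with the Q and I² statistics. It assumes the DerSimonian-Laird moment estimator for the between-study variance, which is a common choice but is not confirmed by the text as the exact method used. The report's analyses were run in other software, and the function name `random_effects_pool` and the input numbers are purely illustrative.

```python
import math


def random_effects_pool(effects, variances):
    """Pool study effects (e.g. log relative risks) under a random-effects model.

    Assumes the DerSimonian-Laird estimator of between-study variance (tau2).
    Returns the pooled effect, its standard error, Q, I2 (percent), and tau2.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least 2 studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"pooled": pooled, "se": se, "q": q, "i2": i2, "tau2": tau2}


if __name__ == "__main__":
    # Made-up effect sizes and variances, for illustration only.
    print(random_effects_pool([0.0, 2.0], [0.5, 0.5]))
```

When Q does not exceed its degrees of freedom, the estimated between-study variance is zero and the random-effects result coincides with the fixed-effect one.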

If necessary, indirect meta-analyses were done to compare interventions for which there were no head-to-head comparisons and where there was a common comparator intervention across studies. We used the method described by Bucher et al to perform indirect analyses.23 Indirect comparisons usually agree with direct comparisons, though large discrepancies have been reported in some cases.28, 29 In addition, indirect comparisons result in less precise estimates of treatment effects compared with the same number of similarly sized head-to-head trials because methods for indirect analyses incorporate additional uncertainty from combining different sets of trials.22, 23 Because of this, we pursued an exploratory analysis combining the indirect and direct pooled estimates using a Bayesian approach. Data from indirect comparisons were synthesized with data from direct, head-to-head studies when possible. Using a Bayesian data analytical framework, the effect size estimated from the indirect analysis was used as the prior probability distribution in a meta-analysis of the data from the direct head-to-head studies. Bayesian analysis was conducted using OpenBUGS and the BRugs package in R.27, 30
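The arithmetic behind an adjusted indirect comparison, and the idea of using its result as a prior for the direct evidence, can be sketched as follows. This is a simplified illustration, not the report's analysis. It assumes effects are expressed on a log scale (for example log odds ratios) with known standard errors, and it replaces the OpenBUGS model with a closed-form normal-normal update. The function names `bucher_indirect` and `combine_with_prior` and all numbers are hypothetical.

```python
import math


def bucher_indirect(effect_ac, se_ac, effect_bc, se_bc):
    """Adjusted indirect comparison of A vs B through common comparator C.

    Effects are assumed to be on a log scale (e.g. log odds ratios).
    The indirect estimate is the difference of the two effects, and its
    variance is the sum of their variances, so precision is always lost.
    """
    estimate = effect_ac - effect_bc
    se = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return estimate, se


def combine_with_prior(prior_mean, prior_se, direct_mean, direct_se):
    """Normal-normal update: indirect estimate as prior, direct estimate as data.

    A closed-form simplification that treats both variances as known; a full
    Bayesian meta-analysis would model study-level data and heterogeneity.
    """
    w_prior = 1.0 / prior_se ** 2
    w_direct = 1.0 / direct_se ** 2
    mean = (w_prior * prior_mean + w_direct * direct_mean) / (w_prior + w_direct)
    se = math.sqrt(1.0 / (w_prior + w_direct))
    return mean, se


if __name__ == "__main__":
    # Made-up inputs: drug A vs placebo and drug B vs placebo (log scale).
    indirect, indirect_se = bucher_indirect(-0.5, 0.3, -0.2, 0.4)
    # Made-up direct head-to-head estimate of A vs B.
    posterior, posterior_se = combine_with_prior(indirect, indirect_se, -0.1, 0.35)
    print(indirect, indirect_se, posterior, posterior_se)
```

Because the variances add in the indirect step, the indirect standard error is always larger than either input, which is the loss of precision the paragraph describes.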

Peer Review and Public Comment

We requested and received peer review of the report from 2 content and methodology experts. Their comments were reviewed and, where possible, incorporated into the final document. A draft of this report was also posted to the Drug Effectiveness Review Project website for public comment. We received comments from 6 pharmaceutical companies. All comments and the authors’ proposed actions were reviewed by representatives of the participating organizations of the Drug Effectiveness Review Project before finalization of the report. Names of peer reviewers for the Drug Effectiveness Review Project are listed at

Copyright © 2010 by Oregon Health & Science University, Portland, Oregon 97239. All rights reserved.
Bookshelf ID: NBK50571