Inclusion Criteria


Adults with neuropathic pain, including:

  • Painful diabetic neuropathy
  • Postherpetic neuralgia
  • Trigeminal neuralgia
  • Cancer-related neuropathic pain
  • HIV-related neuropathic pain
  • Central/poststroke neuropathic pain
  • Neuropathy associated with low back pain
  • Peripheral nerve injury pain
  • Phantom limb pain
  • Guillain-Barré syndrome
  • Polyneuropathy
  • Spinal cord injury-related pain
  • Complex Regional Pain Syndrome (also known as Reflex Sympathetic Dystrophy)


Effectiveness Outcomes

  • Response (including patient-reported pain relief, patient-reported global impression of clinical change, any other pain-related measure)
  • Use of rescue analgesics
  • Speed and duration of response
  • Relapse
  • Functional capacity (quality of life, work productivity)

Harms Outcomes

  • Overall adverse effects
  • Withdrawals
  • Withdrawals due to adverse effects
  • Serious adverse events (including mortality, arrhythmias, seizures, overdose)
  • Specific adverse events or withdrawals due to specific adverse events (including, but not limited to, hepatic, renal, hematologic, dermatologic, sedation/drowsiness, and other neurologic side effects)

Study Designs

For effectiveness

  • Controlled clinical trials
  • Recent, good-quality systematic reviews
  • Comparative observational studies of at least 1 year’s duration, reporting functional outcomes

For harms

  • Controlled clinical trials
  • Comparative observational studies (cohort or case-control) with a well-defined neuropathic pain population
  • Noncomparative observational studies only if the duration is 1 year or longer, and if serious harms are reported; a serious harm is one that results in long-term health effects or mortality

Literature Search

To identify relevant citations, we searched Ovid MEDLINE® (1966 to November Week 3 2010), the Cochrane Database of Systematic Reviews® (4th Quarter 2010), the Cochrane Central Register of Controlled Trials® (4th Quarter 2010), and the Database of Abstracts of Reviews of Effects (4th Quarter 2010), using terms for included drugs, indications, and study designs (see Appendix C for complete search strategies). Electronic database searches were supplemented by hand searches of reference lists of included studies and reviews. In addition, we searched the US Food and Drug Administration Center for Drug Evaluation and Research, the Canadian Agency for Drugs and Technology in Health, and the National Institute for Health and Clinical Excellence web sites for medical or statistical reviews and technology assessments. Finally, we searched dossiers of published and unpublished studies submitted by pharmaceutical companies. All citations were imported into an electronic database (Endnote® v.X2).

Study Selection

All citations were reviewed for inclusion using the prespecified criteria detailed above. Two reviewers independently assessed titles and abstracts of citations identified from literature searches. Full-text articles of potentially relevant citations were retrieved and again assessed for inclusion by 2 reviewers. Disagreements were resolved by consensus. Results published only in abstract form (e.g., as a conference proceeding) were not included because they typically provide insufficient detail to perform adequate quality assessment. In addition, results of studies can change substantially between initial presentation at a conference and final journal publication.17 We also did not include the IMMPACT recommendations18–25 because these articles, although important in the field of chronic pain for providing guidance for future research, represent consensus statements rather than controlled trials.

Data Abstraction

We constructed evidence tables showing the study characteristics, quality ratings, and results for all included studies. The following data were abstracted by 2 independent reviewers from included trials: population characteristics (including gender, age, ethnicity, diagnosis); eligibility; interventions (dose and duration); comparisons; numbers enrolled, lost to follow-up, and analyzed; results for each outcome; and funding. We recorded intent-to-treat results when reported. We considered methods to meet criteria for intent-to-treat analysis if outcomes for at least 95% of participants were analyzed according to the group to which they were originally assigned. In cases where only per-protocol results were reported, we calculated intent-to-treat results if the data to perform these calculations were available. For crossover trials, we abstracted results from both crossover periods.26 If these data were not available, we abstracted results from the first intervention period.
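The 95% rule for intent-to-treat analysis described above can be sketched as a simple check. This is a minimal illustration, not the review's actual tooling; the function and parameter names are hypothetical, and it assumes the counts of randomized participants and of participants analyzed in their originally assigned group can be read from the trial report.

```python
def meets_itt_criterion(n_randomized: int,
                        n_analyzed_as_assigned: int,
                        min_percent: int = 95) -> bool:
    """Return True if at least `min_percent` percent of randomized
    participants were analyzed in the group they were assigned to.

    Names and signature are hypothetical; only the 95% threshold
    comes from the text.
    """
    if n_randomized <= 0:
        raise ValueError("n_randomized must be positive")
    if not 0 <= n_analyzed_as_assigned <= n_randomized:
        raise ValueError("n_analyzed_as_assigned must be between 0 and n_randomized")
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return 100 * n_analyzed_as_assigned >= min_percent * n_randomized


# Example: 200 randomized, 190 analyzed as assigned (exactly 95%).
print(meets_itt_criterion(200, 190))
# Example: 200 randomized, 189 analyzed as assigned (94.5%).
print(meets_itt_criterion(200, 189))
```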

For included systematic reviews, we abstracted the databases searched, study eligibility criteria, number of studies and patients represented, characteristics of included studies, data synthesis methods, main efficacy and safety results, and any subgroup analyses.

Validity Assessment

We assessed the internal validity (quality) of trials using predefined criteria. These criteria are based on the U.S. Preventive Services Task Force and the National Health Service Centre for Reviews and Dissemination (U.K.) criteria.27, 28 We rated the internal validity of each trial based on use of adequate methods for randomization, allocation concealment, and blinding; similarity of compared groups at baseline; maintenance of comparable groups; adequate reporting of dropouts, attrition, crossover, adherence, and contamination; absence of high or differential loss to follow-up; and use of intent-to-treat analysis. We also rated whether trials adequately described methods and criteria for identifying and classifying adverse events. Trials that had a “fatal flaw” were rated “poor-quality”; trials that met all criteria were rated “good-quality”; the remainder were rated “fair-quality.” We defined a “fatal flaw” as a very serious methodological shortcoming, or a combination of methodological shortcomings, that is highly likely to lead to biased or uninterpretable results. As the fair-quality category is broad, studies with this rating vary in their strengths and weaknesses: the results of some fair-quality studies are likely to be valid, while others are only probably valid. A poor-quality trial is not valid: the results are at least as likely to reflect flaws in the study design as the true difference between the compared drugs.

External validity of trials was assessed based on whether the publication adequately described the study population, how similar patients were to the target population in whom the intervention will be applied, and whether the treatment received by the control group was reasonably representative of standard practice. We also recorded the role of the funding source. Overall quality ratings for an individual study were based on the internal and external validity ratings for that trial. A particular randomized trial might receive 2 different ratings: one for effectiveness and another for adverse events.
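The three-level rating rule described above (fatal flaw gives poor, all criteria met gives good, otherwise fair) can be written as a short decision function. The sketch below is illustrative only: the class, field, and criterion names are hypothetical, and in practice judging whether a criterion is met or whether a flaw is "fatal" is a reviewer decision that code cannot make.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TrialAssessment:
    # Hypothetical structure: maps each internal validity criterion
    # to the reviewers' judgment of whether the trial met it.
    criteria_met: Dict[str, bool]
    # Reviewer judgment that a very serious shortcoming (or combination
    # of shortcomings) is highly likely to bias the results.
    fatal_flaw: bool = False


def rate_internal_validity(assessment: TrialAssessment) -> str:
    """Apply the poor / good / fair rule from the text."""
    if assessment.fatal_flaw:
        return "poor-quality"
    if all(assessment.criteria_met.values()):
        return "good-quality"
    return "fair-quality"


# Example with illustrative criterion names taken from the list above.
trial = TrialAssessment(
    criteria_met={
        "randomization": True,
        "allocation_concealment": False,  # e.g., method not reported
        "blinding": True,
        "baseline_similarity": True,
        "intent_to_treat": True,
    }
)
print(rate_internal_validity(trial))  # prints "fair-quality"
```

Because a trial may be rated separately for effectiveness and for adverse events, the same function could be applied to two assessments of one trial.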

We assessed the internal validity of systematic reviews using predefined criteria developed by Oxman and Guyatt.29 These included adequacy of literature search and study selection methods, methods of assessing validity of included trials, methods used to combine studies, and validity of conclusions.

Grading the Strength of Evidence

We graded strength of evidence based on the guidance established for the Evidence-based Practice Center Program of the Agency for Healthcare Research and Quality.30 Developed to grade the overall strength of a body of evidence, this approach incorporates 4 key domains: risk of bias (includes study design and aggregate quality), consistency, directness, and precision of the evidence. It also considers other optional domains that may be relevant for some scenarios, such as a dose-response association, plausible confounding that would decrease the observed effect, strength of association (magnitude of effect), and publication bias.

Table 2 describes the grades of evidence that can be assigned. Grades reflect the strength of the body of evidence to answer key questions on the comparative effectiveness, efficacy, and harms of drugs for neuropathic pain. Grades do not refer to the general efficacy or effectiveness of pharmaceuticals.

Table 2. Definitions of the grades of overall strength of evidence.


We rated the strength of evidence for the outcomes that we judged to be the most clinically important and reliable: patient-reported change in pain score, response defined as a 50% or 30% reduction in pain, quality of life, and withdrawals due to adverse events.
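As an illustration of the response definitions above, the following sketch computes the percent reduction in pain from baseline and classifies a patient as a responder at the 30% or 50% threshold. The function names are hypothetical, and it assumes pain is recorded on a scale where higher scores mean more pain and the baseline score is greater than zero (for example, a 0 to 10 numeric rating scale); the text does not specify the scale.

```python
def percent_pain_reduction(baseline: float, follow_up: float) -> float:
    """Percent reduction in pain score from baseline to follow-up.

    Assumes higher scores mean more pain and baseline > 0.
    """
    if baseline <= 0:
        raise ValueError("baseline pain score must be greater than zero")
    return (baseline - follow_up) / baseline * 100.0


def is_responder(baseline: float, follow_up: float, threshold_percent: float) -> bool:
    """True if the reduction in pain meets or exceeds the threshold (30 or 50)."""
    return percent_pain_reduction(baseline, follow_up) >= threshold_percent


# Example: pain falls from 8 to 5, a 37.5% reduction.
print(percent_pain_reduction(8, 5))          # 37.5
print(is_responder(8, 5, 30))                # True: meets the 30% definition
print(is_responder(8, 5, 50))                # False: does not meet the 50% definition
```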

Data Synthesis

We assigned an overall strength of evidence (good, fair, or poor) for a particular body of evidence based on the quality, consistency, and power of the set of studies. A body of evidence consisting of multiple good-quality, consistent, head-to-head trials with at least some studies evaluating larger sample sizes would generally be rated good quality. A body of evidence consisting of a few poor-quality, small trials with inconsistent results would be rated poor quality. Such evidence is unreliable for drawing conclusions about benefits or harms. Other factors that could result in downgrading of a body of evidence from good to fair (or poor) include high likelihood of publication bias or selective outcomes reporting bias, unexplained statistical heterogeneity, or primary reliance on indirect evidence (i.e., lack of head-to-head trials).

Meta-analytic Methods

We reviewed studies using a hierarchy of evidence approach, where the best evidence is the focus of our synthesis for each question, population, intervention, and outcome addressed. Studies that evaluated one drug for neuropathic pain against another provided direct evidence of comparative effectiveness and adverse event rates. Where possible, these data were the primary focus. Direct comparisons were preferred over indirect comparisons; similarly, effectiveness and long-term safety outcomes were preferred to efficacy and short-term tolerability outcomes.

In theory, trials that compare an included drug for neuropathic pain with any other nonincluded treatment or with placebos can also provide evidence about effectiveness. This is known as an indirect comparison and can be difficult to interpret for a number of reasons, primarily heterogeneity of trial populations, interventions, and outcomes assessment. Data from indirect comparisons are used to support direct comparisons, where they exist, and are used as the primary comparison where no direct comparisons exist. Indirect comparisons should be interpreted with caution.

Meta-analyses were conducted to summarize data and obtain more precise estimates on outcomes for which studies were homogeneous enough to provide a meaningful combined estimate. To determine whether meta-analysis could be meaningfully performed, we considered the quality of the studies and the heterogeneity among studies in design, patient population, interventions, and outcomes. When meta-analysis could not be performed, the data were summarized qualitatively.

For continuous outcomes, we used the mean difference between treatment and placebo groups as the effect measure, which we estimated from the mean change scores from baseline to follow-up, and their standard errors, for each group in each study. For dichotomous outcomes, relative risk was used as the effect measure. All combined effects were estimated using random-effects models.32 The Q statistic and the I² statistic (the proportion of variation in study estimates due to heterogeneity) were calculated to assess heterogeneity in effects between studies.33, 34 We conducted sensitivity analyses to check the impact of dosage on the results.
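The pooling and heterogeneity statistics named above can be sketched as follows. The text does not state which random-effects estimator was used, so this sketch assumes the DerSimonian-Laird method-of-moments estimator, a common choice; the function name and return structure are hypothetical. Inputs are per-study effect estimates and their standard errors (mean differences, or log relative risks for dichotomous outcomes).

```python
import math
from typing import Dict, Sequence


def random_effects_pool(effects: Sequence[float],
                        std_errors: Sequence[float]) -> Dict[str, float]:
    """Pool study effects with a DerSimonian-Laird random-effects model
    (an assumed estimator) and report Cochran's Q and I-squared."""
    k = len(effects)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least 2 studies with matching inputs")

    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance (tau^2), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I^2: percentage of variation in estimates attributable to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    return {"pooled": pooled, "se": se_pooled, "Q": q, "I2": i2, "tau2": tau2}


# Example with two illustrative (made-up) mean differences.
result = random_effects_pool([0.0, 2.0], [1.0, 1.0])
print(result)
```

For relative risks, the pooled value would be exponentiated to return it to the ratio scale.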

Because head-to-head evidence was sparse, we used the method described by Bucher et al.35 to perform indirect comparison meta-analysis, evaluating the difference between drugs based on data from placebo-controlled trials; we judged this appropriate because the trials were generally comparable in patient population and in clinical and methodological characteristics. The magnitude of difference was characterized using the relative risk ratio for relative risks and the difference of mean differences for mean differences. A negative difference of mean differences was interpreted as suggesting that drug A is associated with a greater reduction in neuropathic pain than drug B. Relative risk ratios greater than 1.0 were interpreted as suggesting that drug A is associated with a higher relative benefit than drug B for efficacy outcomes and a higher relative risk for adverse events. All analyses were performed using Stata 11.0 (StataCorp, College Station, TX, 2009) or StatsDirect (Version 2.7.8, StatsDirect Ltd, 9 Bonville Chase, Altrincham, Cheshire WA14 4QA, UK).
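An adjusted indirect comparison of this kind can be sketched as below. It assumes each drug's pooled effect versus placebo and its standard error are already available, that the two sets of trials are independent (so variances add), and that a normal approximation (z = 1.96) gives the 95% confidence interval. Function and parameter names are hypothetical, and the example inputs are made up for illustration.

```python
import math
from typing import Tuple

Z_95 = 1.96  # normal approximation for a 95% confidence interval (assumption)


def indirect_mean_difference(md_a: float, se_a: float,
                             md_b: float, se_b: float
                             ) -> Tuple[float, float, Tuple[float, float]]:
    """Difference of mean differences (drug A vs placebo minus drug B vs placebo).

    A negative value suggests a greater pain reduction with drug A.
    """
    diff = md_a - md_b
    se = math.sqrt(se_a ** 2 + se_b ** 2)  # variances of independent estimates add
    return diff, se, (diff - Z_95 * se, diff + Z_95 * se)


def indirect_relative_risk_ratio(rr_a: float, se_log_rr_a: float,
                                 rr_b: float, se_log_rr_b: float
                                 ) -> Tuple[float, float, Tuple[float, float]]:
    """Relative risk ratio (RR of A vs placebo divided by RR of B vs placebo).

    Computed on the log scale; returns the ratio, the standard error of its
    logarithm, and a 95% confidence interval on the ratio scale.
    """
    log_rrr = math.log(rr_a) - math.log(rr_b)
    se = math.sqrt(se_log_rr_a ** 2 + se_log_rr_b ** 2)
    ci = (math.exp(log_rrr - Z_95 * se), math.exp(log_rrr + Z_95 * se))
    return math.exp(log_rrr), se, ci


# Illustrative inputs only (not results from the review).
diff, diff_se, diff_ci = indirect_mean_difference(-1.5, 0.3, -1.0, 0.4)
rrr, rrr_se, rrr_ci = indirect_relative_risk_ratio(2.0, 0.2, 1.0, 0.2)
print(diff, diff_se, diff_ci)
print(rrr, rrr_se, rrr_ci)
```

Because the standard errors combine, an indirect estimate is always less precise than either placebo comparison it is built from.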

Peer Review

We requested and received peer review of the report from 3 experts. Their comments were reviewed and, where possible, incorporated into the final document. All comments and the authors’ proposed actions were reviewed by representatives of the participating organizations of the Drug Effectiveness Review Project before finalization of the report. Names of peer reviewers for the Drug Effectiveness Review Project are listed at

Public Comment

This report was posted to the Drug Effectiveness Review Project website for public comment. We received comments from 2 pharmaceutical companies.


