Inclusion Criteria


Adult outpatients with one of the following diseases or conditions:

  • Bipolar disorder (any) diagnosed according to Diagnostic and Statistical Manual of Mental Disorders criteria.1
  • Fibromyalgia or fibromyalgia syndrome diagnosed according to the American College of Rheumatology’s diagnostic criteria for fibromyalgia.
  • Migraine including any level of severity (mild, moderate, severe), with or without aura. Other types of headache (such as tension headache) were excluded.
  • Chronic pain defined as continuous or recurring pain of at least 6 months’ duration. Neuropathic pain was excluded.


Only oral formulations of the drugs listed in Table 1 (above) were included: carbamazepine, divalproex sodium, ethotoin (not available in Canada), gabapentin, lamotrigine, levetiracetam, oxcarbazepine, phenytoin, pregabalin, tiagabine (not available in Canada), topiramate, valproic acid, and zonisamide (not available in Canada). In this report we referred to divalproex sodium and valproic acid collectively as “valproate,” except in the evaluation of adverse events and where extended-release formulations were used.

Effectiveness outcomes

Bipolar Disorder

  • Danger to self (suicide attempts and completions, suicidal ideation)
  • Functional capacity (quality of life, work productivity)
  • Hospitalization rates or duration
  • Response (rate, degree, speed of onset, duration). Response reported as defined by studies’ protocols.
  • Remission (rate, speed of onset, duration). Remission reported as defined by studies’ protocols.
  • Maintenance of response or remission (rate of recurrence or relapse, time to recurrence or relapse). Both reported as defined by studies’ protocols.
  • Use of other medications for acute episodes

Fibromyalgia and Chronic Pain

  • Functional capacity (quality of life, work productivity)
  • Response (pain intensity and pain relief, change from baseline and proportion achieving relief)
  • Relapse
  • Speed and duration of response
  • Use of rescue medications

Migraine prophylaxis

  • Quality of life
  • Functional outcome (for example, change in days of work lost)
  • Attack frequency
  • Days with migraine
  • Response (intensity, duration, proportion of patients achieving)
  • Use of acute treatments

Safety Outcomes

  • Overall adverse effect reports
  • Withdrawals due to adverse effects
  • Serious harms. A serious harm is one that results in death or long-term health effects. An increase in rates of suicide or suicidal ideation was considered here as a serious harm. Reduction in these rates was considered with other effectiveness outcomes.
  • General adverse effects or withdrawals due to specific adverse events (for example, dizziness, drowsiness/sedation, rash, hepatotoxicity, thrombocytopenia, hyperammonemia)

Study Designs

For effectiveness, controlled clinical trials and good-quality systematic reviews directly comparing one antiepileptic drug with another were preferred. If none existed, trials comparing an included antiepileptic drug with placebo or another drug were considered.

For safety, in addition to controlled clinical trials, observational studies were included. Observational studies were defined as comparative cohort and case-control studies. Studies without a control group were included only if the duration of follow-up was 1 year or longer and serious harms were reported. Studies investigating potential harm to fetuses as a result of exposure to an antiepileptic drug were included only if the exposed population included women who did not have epilepsy; studies that included only women with epilepsy were not reviewed.

Literature Search

The Original and Update 1 versions of this report, previously produced by the Southern California Evidence-based Practice Center at RAND, provided the basis for identification of included studies in bipolar disorder and fibromyalgia patients through 2005. Their searches included the Cochrane Central Register of Controlled Trials and the Database of Abstracts of Reviews of Effects, MEDLINE/PubMed (1966–2005), and Embase (1974–2005). For Update 2, for bipolar disorder and fibromyalgia we searched PsycINFO from 1806 to week 2 of March 2008 and searched MEDLINE, the Cochrane Central Register of Controlled Trials, and the Cochrane Database of Systematic Reviews only back to 2005. For chronic pain and migraine, we searched MEDLINE (1996 to week 1 of June 2008), the Cochrane Central Register of Controlled Trials (2nd Quarter 2008), and the Cochrane Database of Systematic Reviews (2nd Quarter 2008). We also checked reference lists of included review articles. In electronic searches for efficacy trials, we combined terms for antiepileptic drugs, bipolar or mood disorder, fibromyalgia, migraine, chronic pain, randomized clinical trials, systematic reviews, and meta-analyses. For adverse event studies, we combined terms for antiepileptic drugs, adverse effects, and various types of observational studies. All searches were limited to English language and human studies. (See Appendix C for complete search strategy.) Pharmaceutical manufacturers were invited to submit dossiers, including citations. All citations were imported into an electronic database (EndNote® X1, Thomson Reuters).

Study Selection

Selection of included studies was based on the inclusion criteria created by the Drug Effectiveness Review Project participants, as described above. Two reviewers independently assessed titles and abstracts of citations identified through literature searches for inclusion against these criteria. Full-text articles of potentially relevant citations were retrieved and again assessed for inclusion by 2 reviewers. Disagreements were resolved by consensus. Results published only in abstract form were not included because lack of detail prevented quality assessment.

Data Abstraction

The following data were abstracted from included trials: study design; setting; population characteristics (including sex, age, ethnicity, diagnosis); eligibility and exclusion criteria; interventions (dose and duration) and comparisons; numbers screened, eligible, enrolled, and lost to follow-up; method of outcome ascertainment; and results for each outcome. We recorded intention-to-treat results when reported. If true intention-to-treat results were not reported, but loss to follow-up was very small, we considered these results to be intention-to-treat results. In cases where only per-protocol results were reported, we calculated intention-to-treat results if the data for these calculations were available.

Quality Assessment

We assessed the internal validity (quality) of trials based on the predefined criteria listed in Appendix D. These criteria were based on the US Preventive Services Task Force and the National Health Service Centre for Reviews and Dissemination (United Kingdom) criteria for assessing study quality.6, 7 In rating the internal validity of each trial we assessed the methods used for randomization, allocation concealment, and blinding; the similarity of compared groups at baseline; maintenance of comparable groups; adequate reporting of dropouts, attrition, crossover, adherence, and contamination; loss to follow-up; and the use of intention-to-treat analysis. Trials that had a fatal flaw were rated poor quality; trials that met all criteria were rated good quality. The remainder were rated fair quality. As the fair-quality category was broad, studies with this rating varied in their strengths and weaknesses; the results of some fair-quality studies were likely to be valid, while others were only possibly valid. Poor-quality trials were not valid: The results were at least as likely to reflect flaws in the study design as a true difference between the compared drugs. A fatal flaw is reflected by failure to meet combinations of items on the quality assessment checklist.

External validity of trials was assessed based on whether the publication adequately described the study population, whether patients were similar enough to the target population in whom the intervention would be applied, and whether the treatment received by the control group was reasonably representative of standard practice. We also recorded the role of the funding source.

Appendix D also shows the criteria we used to rate observational studies of adverse events. These criteria reflect aspects of study design that are particularly important for assessing adverse event rates. We rated observational studies as good-quality for adverse event assessment if they adequately met 6 or more of the 7 predefined criteria, fair-quality if they met 3 to 5 criteria, and poor-quality if they met 2 or fewer criteria.
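The rating thresholds for observational studies described above can be written as a small rule. The following is a minimal sketch in Python; the function name and argument names are hypothetical and are not taken from the report or from Appendix D.

```python
def rate_observational_study(criteria_met: int, total_criteria: int = 7) -> str:
    """Map the number of adequately met criteria to a quality rating.

    Thresholds follow the text: 6 or more of 7 -> good, 3 to 5 -> fair,
    2 or fewer -> poor. The function itself is illustrative only.
    """
    if not 0 <= criteria_met <= total_criteria:
        raise ValueError("criteria_met must be between 0 and total_criteria")
    if criteria_met >= 6:
        return "good"
    if criteria_met >= 3:
        return "fair"
    return "poor"


if __name__ == "__main__":
    for n in range(8):
        print(n, rate_observational_study(n))
```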

Included systematic reviews were also rated for quality based on predefined criteria (see Appendix D), which assessed the research question(s) and inclusion criteria, adequacy of search strategy and validity assessment, adequacy of detail provided for included studies, and appropriateness of the methods of synthesis.

The overall strength of evidence for a particular Key Question or outcome reflected the risk of bias of the studies (based on quality and study design) and the consistency, directness, and precision of the studies relevant to the question. Strength of evidence was graded as insufficient, low, moderate, or high.

Data Synthesis

We constructed evidence tables showing the study characteristics, quality ratings, and results for all included studies. We reviewed studies using a hierarchy-of-evidence approach, in which the best evidence was the focus of our synthesis for each question, population, intervention, and outcome addressed. Studies that evaluated one antiepileptic drug against another provided direct evidence of comparative effectiveness and adverse event rates. Where possible, these data (from direct comparisons) were the primary focus; direct comparisons were preferred over indirect comparisons. Similarly, effectiveness and long-term safety outcomes were preferred to efficacy and short-term tolerability outcomes.

In theory, trials that compare antiepileptic drugs to other drug classes or placebos can also provide evidence about effectiveness. This approach is known as an indirect comparison. Indirect comparisons can be difficult to interpret for a number of reasons, mainly heterogeneity between trial populations, interventions, and assessments of outcomes. Data from indirect comparisons are used to support direct comparisons, where they exist, and also are used as the main comparison where no direct comparisons exist. Such indirect comparisons should be interpreted with caution.

In addition to qualitative discussion of studies’ findings, this report contains quantitative analyses: meta-analyses were conducted on outcomes reported by a sufficient number of studies that were homogeneous enough that combining their results could be justified. To determine whether meta-analysis could be meaningfully performed, we considered the quality of the studies and their heterogeneity in design, patient population, interventions, and outcomes.

Random-effects models were used to estimate pooled effects.8 Forest plots are presented to graphically summarize the study results and the pooled results.9 The Q-statistic and the I² statistic (the proportion of variation in study estimates due to heterogeneity) were calculated to assess heterogeneity between the effects from the studies.10, 11 Heterogeneity was examined with subgroup analysis by factors such as study design, study quality, variations in interventions, and patient population characteristics.
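To illustrate how a pooled random-effects estimate, the Q-statistic, and the I² statistic relate to one another, the sketch below assumes the DerSimonian-Laird method-of-moments estimator, which is a common choice; this section does not state which estimator was actually used. The function name, inputs (per-study effect estimates such as log odds ratios, with their within-study variances), and returned keys are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool study effects under an assumed DerSimonian-Laird model.

    effects:   per-study effect estimates (e.g., log odds ratios)
    variances: within-study variances of those estimates
    Returns the pooled effect, its standard error, Q, I^2 (as a
    proportion between 0 and 1), and the between-study variance tau^2.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least 2 studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Method-of-moments estimate of between-study variance, truncated at 0.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    return {"pooled": pooled, "se": se, "q": q, "i2": i2, "tau2": tau2}


if __name__ == "__main__":
    # Two made-up studies with equal variances and different effects.
    print(random_effects_pool([0.0, 1.0], [0.1, 0.1]))
```

When the studies agree closely, Q falls at or below its degrees of freedom, tau² is truncated to zero, and the result matches a fixed-effect analysis.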

Meta-Analysis of Specific Adverse Events

We aggregated the more commonly documented (or expected) adverse events using patient-level data (Appendix E). We included only trials that specifically reported events at the patient level. Use of patient-specific data can underestimate prevalence and/or eliminate low-level signals of events that occur rarely, because the inclusion criteria for the studies are narrower than in the general population with any given disease.

Data for the adverse events, such as diarrhea, headache, nausea, and rash, were extracted, and an odds ratio was calculated for subgroups that had only 1 trial. For subgroups of events that had at least 2 trials, at least 1 event in the medication group, and at least 1 event in the placebo group, we performed a meta-analysis to estimate the pooled odds ratio and its associated 95% confidence interval. Because many of the events were rare, we used exact conditional inference to either estimate an odds ratio for a single study or to perform the pooling if meta-analysis was warranted, rather than apply the usual asymptotic methods that assume normality. Asymptotic methods require correction if zero events are observed, and generally half an event is added to all cells in the outcome-by-treatment (two-by-two) table in order to allow estimation, because these methods assume continuity of effects. Such corrections can have a major impact on the results when the outcome event is rare. Exact methods do not require such corrections. We conducted the meta-analysis using the statistical software package StatXact (Cytel).21
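The sensitivity of asymptotic estimates to the half-event correction can be seen with a small calculation. The sketch below is not the exact conditional method used in the report (which was run in StatXact); it only computes a simple sample odds ratio from a two-by-two table, with and without adding 0.5 to every cell. The function name and the example counts are hypothetical.

```python
import math


def odds_ratio(a, b, c, d, correction=0.0):
    """Sample odds ratio for a two-by-two table.

    a: events in the medication group      b: non-events in that group
    c: events in the comparison group      d: non-events in that group
    correction: value added to every cell (0.5 is the usual choice).
    """
    a, b, c, d = (x + correction for x in (a, b, c, d))
    numerator = a * d
    denominator = b * c
    if denominator == 0:
        # No finite estimate: infinite if the numerator is positive,
        # otherwise undefined.
        return math.inf if numerator > 0 else math.nan
    return numerator / denominator


if __name__ == "__main__":
    # Made-up rare-event tables: (events, non-events) per group.
    tables = {
        "3/100 vs 0/100": (3, 97, 0, 100),
        "2/100 vs 1/100": (2, 98, 1, 99),
    }
    for label, cells in tables.items():
        raw = odds_ratio(*cells)
        corrected = odds_ratio(*cells, correction=0.5)
        print(f"{label}: uncorrected={raw:.3f}, corrected={corrected:.3f}")
```

With counts this small, adding half an event to each cell shifts the estimate noticeably, which is the reason given above for preferring exact methods.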

Any significant pooled odds ratio greater than 1 indicated that the odds of the adverse event associated with an antiepileptic drug (the intervention group) were larger than the odds associated with the comparison (placebo, lithium, or other antiepileptic drug). If no events were observed in the comparison group, but events were observed in the intervention group, the odds ratio was infinity and the associated confidence interval was bounded from below only. We report the lower bound of this confidence interval. If no events were observed in either group, the odds ratio was undefined, which we denote as “Not calculated” (NC) in the results tables. We did not observe any subgroups of studies for which no events were reported for the intervention group but events were observed in the comparison group.
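The reporting conventions in the preceding paragraph amount to a case distinction on the event counts. A minimal sketch follows; the function name and the returned labels other than "NC" are hypothetical and chosen only for illustration.

```python
def odds_ratio_reporting_case(events_intervention: int, events_comparison: int) -> str:
    """Classify how an odds ratio would be reported, following the text.

    Returns "NC" when neither group has events, "infinite" when only the
    intervention group has events (only a lower confidence bound is
    reported), "zero" when only the comparison group has events (a case
    the report says was not observed), and "finite" otherwise.
    """
    if events_intervention < 0 or events_comparison < 0:
        raise ValueError("event counts must be non-negative")
    if events_intervention == 0 and events_comparison == 0:
        return "NC"
    if events_comparison == 0:
        return "infinite"
    if events_intervention == 0:
        return "zero"
    return "finite"
```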

Since only 1 bipolar disorder trial directly compared adverse events between antiepileptic drugs, for bipolar disorder we assessed only 2 comparisons, antiepileptic drug compared with placebo and antiepileptic drug compared with lithium. We looked for overlap between the confidence intervals of the pooled odds ratios (or single study odds ratio if only 1 trial was available) for each antiepileptic drug. If the confidence intervals overlapped, then we could not conclude that the odds between antiepileptic drugs were significantly different.
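The overlap check described above can be sketched as follows, treating each confidence interval as a (lower, upper) pair and using infinity for an interval that is bounded from below only. The function name and the example intervals are hypothetical, not results from the report.

```python
import math


def intervals_overlap(ci_a, ci_b):
    """Return True if two closed confidence intervals share any value.

    Each interval is a (lower, upper) tuple; upper may be math.inf when
    the interval is bounded from below only.
    """
    lower_a, upper_a = ci_a
    lower_b, upper_b = ci_b
    return lower_a <= upper_b and lower_b <= upper_a


if __name__ == "__main__":
    # Made-up intervals for two drugs, each compared with placebo.
    print(intervals_overlap((1.2, 3.4), (2.0, 5.0)))
    print(intervals_overlap((4.0, math.inf), (1.0, 3.0)))
```

Overlap means only that a difference between the two drugs could not be concluded from this comparison; it is not evidence that the drugs are equivalent.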

Peer and Public Review

The Original report underwent a review process that involved solicited peer review from 3 clinical experts. Their comments were reviewed and, where possible, incorporated into the final document. The comments received and the author’s proposed actions were reviewed by the representatives of the participating organizations of the Drug Effectiveness Review Project prior to finalization of the report. Names of peer reviewers for Drug Effectiveness Review Project reports are listed at



We excluded trials that included heterogeneous patient populations unless data were presented separately for patients with bipolar disorder.