NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Sumamo Schellenberg E, Dryden DM, Pasichnyk D, et al. Acute Migraine Treatment in Emergency Settings [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2012 Nov. (Comparative Effectiveness Reviews, No. 84.)

  • This publication is provided for historical reference only and the information may be out of date.


The methods section reflects the protocol that was developed a priori as part of the topic development and refinement stages of this comparative effectiveness review (CER).

Topic Refinement and Review Protocol

The University of Alberta Evidence-based Practice Center (EPC) was commissioned to conduct a preliminary literature review to gauge the availability of evidence and to draft key research questions for a CER. Investigators from the EPC developed the Key Questions (KQs) in consultation with the Agency for Healthcare Research and Quality (AHRQ) EPC Program, the Scientific Resource Center, and a panel of key informants. AHRQ posted the KQs on its website for public comment for 1 month. The EPC revised the KQs based on the public feedback received, and AHRQ approved the final KQs.

A technical expert panel was assembled to provide content and methodological expertise throughout the development of the CER. The technical experts are identified in the front matter of this report.

Literature Search Strategy

A research librarian systematically searched the following bibliographic databases: MEDLINE®, Embase, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, Database of Abstracts of Reviews of Effectiveness, International Pharmaceutical Abstracts, PASCAL, Biosis Previews, Science Citation Index Expanded, and Conference Proceedings Citation Index-Science. Databases were searched from inception to January 5, 2012. The search strategy did not employ any study design search filters, nor were language restrictions applied. See Appendix A for the detailed search strategies.

Search terms were selected by scanning search strategies of systematic reviews on similar topics and examining index terms of potentially relevant studies. The search terms were adapted to accommodate the controlled vocabulary and search languages of each database. Key search concepts and text words were related to migraine, headache, emergency or acute care setting, and adults.

The reference lists of included studies and relevant systematic reviews were screened to identify additional studies. The following online trial registries were searched to identify unpublished and ongoing trials:, metaRegister of Controlled Trials, WHO International Clinical Trials Registry Platform, and CenterWatch. The U.S. Food and Drug Administration documents related to the drugs of interest were reviewed for additional data. The Scientific Resource Center contacted drug manufacturers to request published and unpublished study data. Hand searches of conference proceedings (from 2008 to 2011) were completed for the following scientific meetings that were identified by clinical experts: American College of Emergency Physicians, Society for Academic Emergency Medicine, American Headache Society, International Headache Society, American Neurological Association, Canadian Neurological Association, European College of Neuropsychopharmacology, International Neuropsychological Society, American Pain Society, Canadian Pain Society, and International Association for the Study of Pain. As well, the Web sites of key organizations in emergency medicine, pain, headache, neuropharmacology, and neurology were searched for relevant research. When necessary, study authors were contacted to obtain additional data or clarification.

Reference Manager© for Windows version 11.0 (2004–2005 Thomson ResearchSoft) bibliographic database was used to manage the results of all literature searches.

Inclusion and Exclusion Criteria

The eligibility criteria were developed in consultation with the technical expert panel and are provided in Table 2. The population of interest was adults ≥18 years of age with severe acute migraine headache presenting to an ED or equivalent setting. Equivalent settings included headache or pain clinics, neurology departments, physician offices and public health centers. Studies that enrolled children or adolescents were included only when at least 80 percent of patients were ≥18 years of age, or when subgroup analyses for adult patients were provided. Studies that predominantly enrolled patients with non-migraine headaches (e.g., cluster headaches, tension headaches) were excluded. Studies that included a mixed cohort of patients with migraine and non-migraine headaches were included only if they reported data separately for migraine headaches or had a predominance of migraine headache patients. Studies that were excluded on the basis of population (i.e., headache type) were reviewed by a clinician (BHR).

Table 2. Eligibility criteria for this review.

Study Selection

Eligibility of studies was assessed in two phases. First, two reviewers independently screened titles and abstracts (where available) to determine whether an article met broad inclusion criteria. Each article was rated as “include,” “exclude,” or “unclear.” Second, a single reviewer screened U.S. Food and Drug Administration reports, conference proceedings, and grey literature for potential relevance. The full text of each article rated as “include” or “unclear” by at least one reviewer was retrieved. Finally, two reviewers independently assessed the full text of each study using a detailed form (Appendix B). Disagreements were resolved by consensus or third-party adjudication.
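The title and abstract screening rule described above can be sketched in a few lines. This is only an illustration: the function name `needs_full_text` and the list-of-ratings input are hypothetical and do not come from the review's own tooling.

```python
VALID_RATINGS = {"include", "exclude", "unclear"}


def needs_full_text(ratings):
    """Return True if at least one reviewer rated the record
    'include' or 'unclear', i.e. the full text should be retrieved.

    `ratings` is a hypothetical list with one rating per reviewer.
    """
    for rating in ratings:
        if rating not in VALID_RATINGS:
            raise ValueError(f"unknown rating: {rating!r}")
    return any(rating in {"include", "unclear"} for rating in ratings)
```

Under this rule a record is dropped at the first phase only when every reviewer rates it “exclude.”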

Data Extraction

Data were extracted using a standardized electronic form created in Microsoft Excel™ 2007 (Microsoft Corp., Redmond, WA) (Appendix B). One reviewer extracted data, and a second reviewer verified the data for accuracy and completeness. Any discrepancies were resolved by consensus or third-party adjudication. The data extraction form was pilot tested on three studies, and revisions were made to address errors and inconsistencies among reviewers prior to proceeding with the remaining studies.

The following data were extracted: study and participant characteristics (including inclusion and exclusion criteria, age, sex, ethnicity, and diagnosis), intervention details (including dose, frequency, and duration), and outcomes including adverse effects. Information regarding the need for and use of rescue medications in the event of treatment failure was also extracted.

Outcome data were extracted only if quantitative data were presented or could be derived from graphs or figures. Outcomes that were only described qualitatively (e.g., statements that there was no difference between groups) were not included. Non-response was evaluated independently by two reviewers using two definitions: 1) non-response as defined by the authors; and 2) any patient who did not achieve complete resolution of pain (visual analogue scale [VAS] = 0) before discharge or the end of the study. Where data were available only in graphs, the graphs were enlarged and the data were estimated by two people. For abstracts and foreign language publications, non-response could not be adjudicated accurately.

It is recognized that many drugs have various effects (e.g., a neuroleptic can be used for the antiemetic treatment of nausea and vomiting). In consultation with the technical expert panel, the research team organized drugs by the classes outlined in Table 1. For each drug class (e.g., neuroleptics), the intervention monotherapy is presented compared with placebo, followed by trials in which the intervention monotherapy is compared with another active treatment (e.g., neuroleptics versus metoclopramide). Combination therapies versus an active comparator (e.g., metoclopramide plus DHE versus ketorolac) were considered as a separate category. For the pain-related outcomes, drugs that were added to the pain intervention specifically to manage side effects were grouped with the main drug class (e.g., prochlorperazine plus antihistamine versus metoclopramide was included in the neuroleptics versus metoclopramide category).

We extracted drug-related adverse effects as they were reported by the authors of each study. The terminology used to describe adverse effect outcomes varied across studies. The adverse effects of interest were determined a priori in consultation with the technical expert panel and were classified as outlined in Table 3. For each adverse effect, the number of patients in each treatment, active comparator, or placebo group, and the number of patients experiencing an adverse effect were recorded. We counted each event as if it corresponded to a unique individual. Because an individual patient may have experienced more than one event during the course of the study, this assumption may have overestimated the number of patients experiencing adverse effects. Only quantitative adverse effect data describing the number of patients who experienced an event were extracted; that is, studies that reported only p-values or reported one arm to have fewer events than another were not included in these analyses.

Table 3. Adverse effects and associated terms.

Quality (Risk of Bias) Assessment of Individual Studies

We assessed the internal validity of randomized controlled trials (RCTs) and nonrandomized controlled trials (NRCTs) using the Cochrane Collaboration risk of bias tool (Appendix B).30 This tool comprises six domains of potential bias (sequence generation, concealment of allocation, blinding, incomplete outcome data, selective outcome reporting, and “other” sources of bias). Each separate domain was rated as having “high,” “low,” or “unclear” risk of bias. Both blinding and incomplete outcome data were assessed separately for subjective outcomes (e.g., pain severity) and objective outcomes (e.g., blood pressure). For “other” sources of bias, baseline imbalances between groups, carryover in cross-over trials, and early stopping for benefit were assessed. In addition, the funding source for each study was extracted.

The overall assessment was based on the responses to individual domains. If one or more individual domains were assessed as having a high risk of bias, the overall score was rated as high risk of bias. The overall risk of bias was considered low only if all components were rated as having a low risk of bias. The risk of bias for all other studies was rated as unclear.
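The overall rating rule in the preceding paragraph can be written as a short function. This is a minimal sketch: the function name and the dictionary input (domain name mapped to its rating) are hypothetical, and the rule itself is taken directly from the text above.

```python
def overall_risk_of_bias(domain_ratings):
    """Combine per-domain ratings ('high', 'low', 'unclear') into an
    overall rating.

    `domain_ratings` is a hypothetical dict such as
    {"sequence generation": "low", "blinding": "unclear", ...}.
    """
    ratings = list(domain_ratings.values())
    if not ratings:
        raise ValueError("at least one domain rating is required")
    if any(r == "high" for r in ratings):
        return "high"      # one or more high-risk domains
    if all(r == "low" for r in ratings):
        return "low"       # every domain rated low
    return "unclear"       # all other combinations
```

For example, a trial rated “low” on five domains and “unclear” on one would receive an overall rating of “unclear.”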

Two reviewers independently assessed the risk of bias of the studies and resolved discrepancies through consensus. A priori decision rules were developed regarding application of the risk of bias tool and pilot tested on a sample of trials.

Data Analysis

The following assumptions were made and the following imputations were performed to transform reported data into the form required for analysis. Data from graphs were extracted using the measurement tool of Adobe Acrobat 9 Pro (Adobe Systems Inc., California, U.S.) when data were not reported in text or tables. If necessary, means were approximated by medians, and 95% confidence intervals (CI) were used to calculate approximate standard deviations. We calculated p-values when they were not reported. Change from baseline data were used wherever possible for continuous outcomes. As needed, change from baseline was calculated for studies that reported baseline and endpoint data, and a correlation of 0.5 was used to calculate the appropriate standard deviation.31 Where change from baseline could not be calculated, we used the reported endpoint data. One study32 used a cross-over design; however, there was no washout period between administrations of the interventions, so only the first period data were used.
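Two of the imputations above can be illustrated with commonly used approximations. The report does not state the exact formulas, so the following is a sketch under assumptions: the standard deviation is recovered from a 95% CI around a mean using a normal approximation (z = 1.96), and the standard deviation of the change from baseline uses the usual formula for correlated measurements with the stated correlation of 0.5. The function names are hypothetical.

```python
import math


def sd_from_ci(lower, upper, n, z=1.96):
    """Approximate the SD of a group from the 95% CI of its mean.

    Assumes a normal-approximation CI: mean +/- z * SD / sqrt(n).
    """
    if n <= 0 or upper < lower:
        raise ValueError("need n > 0 and upper >= lower")
    return math.sqrt(n) * (upper - lower) / (2 * z)


def sd_change(sd_baseline, sd_endpoint, r=0.5):
    """Approximate the SD of the change from baseline, given the SDs
    at baseline and endpoint and an assumed correlation r between them.
    """
    variance = (sd_baseline ** 2 + sd_endpoint ** 2
                - 2 * r * sd_baseline * sd_endpoint)
    return math.sqrt(variance)
```

With equal baseline and endpoint standard deviations and r = 0.5, the change standard deviation equals the common standard deviation, which is one reason 0.5 is often treated as a neutral default.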

The majority of studies used the VAS to assess pain. When pain scores were reported in any format other than VAS (mm), they were converted to VAS (mm) by multiplying results by a conversion factor. While using a standardized mean difference (SMD) is an alternative approach to dealing with varying scales across a single outcome, we chose the more direct conversion for two reasons. First, we believe that using VAS as a common scale would be less confusing than the “effect size” or SMD units of standard deviation. Second, since all pain scales used in the studies were subjective and numerical and anchored by severe and none (zero) extremes, a simple conversion to a 100 point scale was felt to be more consistent than a conversion using standard deviations when dealing with differences in pain among intervention groups.
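As an illustration of the direct conversion, the sketch below rescales a score from a scale anchored at zero to the 0 to 100 mm VAS range. The report does not list the conversion factors used; the factor `100 / scale_max` is an assumption that follows from the description of a simple conversion to a 100-point scale, and the function name is hypothetical.

```python
def to_vas_mm(score, scale_max):
    """Rescale a pain score from a 0..scale_max scale to VAS (mm).

    Assumes both scales are anchored at 0 (no pain), so the conversion
    is a single multiplication. The same factor applies to standard
    deviations and to mean differences.
    """
    if scale_max <= 0:
        raise ValueError("scale_max must be positive")
    factor = 100.0 / scale_max
    return score * factor
```

For instance, a score of 7 on an 11-point (0 to 10) numeric rating scale would map to 70 mm under this assumption.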

For all studies, qualitative data are presented in the results section and in evidence tables. When appropriate, meta-analyses were performed to synthesize the available data. Studies were considered appropriate for pooled analyses if they were sufficiently similar in terms of their population, interventions, comparators, and outcomes.

The evidence for efficacy was summarized separately for each intervention category (e.g., neuroleptics, metoclopramide). Within each intervention category, data are presented both by individual drug comparison and across the drug class (e.g., all neuroleptics).

A traditional pair-wise meta-analysis of adverse effects was not performed since we did not identify multiple studies with the same comparisons (e.g., prochlorperazine versus MgSO4) that reported common adverse effects. Instead, we present a summary of adverse effects by treatment arm that allows us to provide an overall picture of which interventions had a high risk of specific adverse effects. For each adverse effect category, risks (i.e., incidence rates) were pooled using a random effects model to obtain a summary estimate and 95 percent CI.

Review Manager Version 5.0 (The Cochrane Collaboration, Copenhagen, Denmark) was used to perform meta-analyses. For continuous variables, mean differences (MDs) were calculated for individual studies. For dichotomous outcomes, risk ratios (RR) or odds ratios (OR) were computed to estimate between-group differences. If no events were reported in one treatment arm, a correction factor of 0.5 was added to each cell of the two-by-two table in order to obtain estimates of the RR or OR. All results are reported with 95 percent CIs. All meta-analyses used a random effects model. We quantified statistical heterogeneity using the I-squared (I²) statistic.
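The calculations in this paragraph were carried out in Review Manager; the sketch below only illustrates the general technique and is not a reproduction of that software. It assumes an inverse-variance DerSimonian-Laird random effects model on the log RR scale, which may differ in detail from the method options selected by the authors. All function names are hypothetical.

```python
import math


def log_rr(events_t, n_t, events_c, n_c):
    """Log risk ratio and its standard error for one study.

    Adds 0.5 to every cell of the two-by-two table when either arm
    has zero events, as described in the text.
    """
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    if events_t == 0 or events_c == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    rr = (a / (a + b)) / (c / (c + d))
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return math.log(rr), se


def dersimonian_laird(effects, ses):
    """Pool study effects with a DerSimonian-Laird random effects model.

    Returns (pooled effect, SE of pooled effect, tau^2, I^2 in percent).
    """
    w = [1 / se ** 2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se_pooled, tau2, i2
```

A 95 percent CI for the pooled RR can then be obtained as `exp(pooled ± 1.96 * se_pooled)`.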

Where there were more than 10 studies for the primary outcome (pain severity), publication bias was assessed visually using a funnel plot and quantitatively using the Egger graphical test.33
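The quantitative test can be sketched as a simple linear regression, assuming the usual formulation of the Egger test: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests funnel plot asymmetry. The function name is hypothetical, and the p-value is omitted because it requires a t distribution that is not in the Python standard library.

```python
import math


def egger_intercept(effects, ses):
    """Ordinary least squares fit of (effect / SE) on (1 / SE).

    Returns (intercept, SE of intercept, t statistic, degrees of freedom).
    The t statistic is NaN when the fit is exact (zero residual variance).
    """
    n = len(effects)
    if n < 3 or n != len(ses):
        raise ValueError("need at least 3 studies with matching SEs")
    x = [1 / se for se in ses]                       # precision
    y = [e / se for e, se in zip(effects, ses)]      # standardized effect
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid_var = sum((yi - intercept - slope * xi) ** 2
                    for xi, yi in zip(x, y)) / (n - 2)
    se_int = math.sqrt(resid_var * (1 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t_stat, n - 2
```

The t statistic would be compared against a t distribution with n − 2 degrees of freedom to obtain a p-value.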

For two outcomes, pain relief (VAS) and akathisia, a mixed treatment analysis was conducted using a Bayesian network model to compare all interventions simultaneously and to use all available information on treatment effects in a single analysis.34–36 The studies that were included in these analyses represented similar populations, outcomes, and designs, and the research team judged that clinical heterogeneity was sufficiently low. MDs or log ORs were modeled using non-informative prior distributions. A normal prior distribution with mean 0 and large variance (10,000) was used for each of the trial means or log ORs, whereas their between-study variance had a uniform prior with range 0 to 2 (akathisia) or 0 to 100 (VAS). These priors were checked for influence with sensitivity analyses. Markov Chain Monte Carlo simulations using WinBUGS software were carried out to obtain simultaneous estimates of all interventions compared with placebo, as well as estimates of which interventions were the best.37 A burn-in sample of 20,000 iterations was followed by 200,000 iterations used to compute estimates. Results are reported with 95 percent credibility intervals. We checked the analyses for consistency using cross validation of all contrasts that had direct evidence.38


Applicability

The assessment of applicability distinguishes effectiveness studies from efficacy studies. Effectiveness studies are conducted in primary care settings, use less stringent eligibility criteria, assess health outcomes, and have longer followup periods than most efficacy studies.39 The results of effectiveness studies are more applicable to the spectrum of patients in the community than those of efficacy studies, which usually involve highly selected populations. The applicability of the body of evidence was assessed following the PICOTS (population, intervention, comparator, outcomes, timing of outcome measurement, and setting) format used to assess study characteristics. Specific factors that were considered included sex, age, race or ethnicity, baseline headache severity, clinical setting (e.g., non-ED), and geographic setting (e.g., countries outside North America).

Grading the Strength of a Body of Evidence

Two independent reviewers graded the strength of the evidence for key outcomes and comparisons using the EPC GRADE approach40 and resolved disagreements by consensus. For each key outcome, the following four major domains were assessed: risk of bias (rated as low, moderate, or high), consistency (rated as consistent, inconsistent, or unknown), directness (rated as direct or indirect), and precision (rated as precise or imprecise). No additional domains were used.

The key effectiveness outcomes for grading (KQs 1, 2, 5, 6) were pain-related outcomes and headache recurrence. For KQ 3, we did not grade outcomes because there were no comparative effectiveness analyses. For KQ 4, the key outcome was the development of akathisia. Based on the individual domains, the following overall evidence grades were assigned for each outcome for each comparison of interest: high, moderate, or low confidence that the evidence reflects the true effect. When no studies were available or when only a single study was available, the strength of evidence was rated as insufficient.

To determine the overall strength of evidence score, the risk of bias domain was first considered. RCTs with a low risk of bias were initially considered to have a “high” strength of evidence, whereas RCTs with high or unclear risk of bias received an initial grade of “moderate” strength of evidence. The strength of evidence was then unchanged or downgraded depending on the assessments of that body of evidence on the consistency, directness, and precision domains.40 In cases where results were not pooled, the overall strength of evidence rating was not downgraded. We did not make estimates regarding precision when it was inappropriate to pool results from studies. Single trials, particularly those with small sample sizes, were graded as having insufficient strength of evidence despite being precise and having low risk of bias.

