NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

McDonagh M, Peterson K, Carson S, et al. Drug Class Review: Atypical Antipsychotic Drugs: Final Update 3 Report [Internet]. Portland (OR): Oregon Health & Science University; 2010 Jul.

  • This publication is provided for historical reference only and the information may be out of date.


Literature Search

To identify relevant citations, we searched the Cochrane Central Register of Controlled Trials (1st Quarter 2010), Cochrane Database of Systematic Reviews (4th Quarter 2009), MEDLINE (1950 to week 4 January 2010), and PsycINFO (1806 to February week 1 2010) using terms for included drugs, indications, and study designs (see Appendix D for complete search strategies). We attempted to identify additional studies through searches of reference lists of included studies and reviews. In addition, we searched the US Food and Drug Administration Center for Drug Evaluation and Research website for medical and statistical reviews of individual drug products. Finally, we requested dossiers of published and unpublished information from the relevant pharmaceutical companies for this review. All received dossiers were screened for studies or data not found through other searches. All citations were imported into an electronic database (EndNote XI, Thomson Reuters).

Study Selection

Selection of included studies was based on the inclusion criteria created by the Drug Effectiveness Review Project participants, as described above. Two reviewers independently assessed titles and abstracts of citations identified through literature searches for inclusion using the criteria below. Full-text articles of potentially relevant citations were retrieved and again were assessed for inclusion by both reviewers. Disagreements were resolved by consensus. Publications in languages other than English were not reviewed for inclusion and results published only in abstract form were not included because inadequate details were available for quality assessment.

Data Abstraction

The following data were abstracted from included trials: study design, setting, population characteristics (including sex, age, ethnicity, diagnosis), eligibility and exclusion criteria, interventions (dose and duration), comparisons, numbers screened, eligible, enrolled, and lost to follow-up, method of outcome ascertainment, and results for each outcome. We recorded intention-to-treat results when reported. If true intention-to-treat results were not reported, but loss to follow-up was very small, we considered these results to be intention-to-treat results. In cases where only per-protocol results were reported, we calculated intention-to-treat results if the data for these calculations were available.

Quality Assessment

We assessed the internal validity (quality) of trials using predefined criteria based on those of the US Preventive Services Task Force and the National Health Service Centre for Reviews and Dissemination (United Kingdom).8, 9 We rated the internal validity of each trial based on the methods used for randomization, allocation concealment, and blinding; the similarity of compared groups at baseline; maintenance of comparable groups; adequate reporting of dropouts, attrition, crossover, adherence, and contamination; loss to follow-up; and the use of intention-to-treat analysis. Trials that had a fatal flaw were rated poor quality; trials that met all criteria were rated good quality; the remainder were rated fair quality. A fatal flaw is reflected by failing to meet combinations of items of the quality assessment checklist. As the fair-quality category is broad, studies with this rating vary in their strengths and weaknesses: the results of some fair-quality studies are likely to be valid, while others are only possibly valid. A poor-quality trial is not valid: the results are at least as likely to reflect flaws in the study design as a true difference between the compared drugs. External validity of trials was assessed based on whether the publication adequately described the study population, whether patients were similar enough to the target population in whom the intervention would be applied, and whether the treatment received by the control group was reasonably representative of standard practice. We also recorded the role of the funding source.

The criteria we used to rate observational studies of adverse events reflected aspects of the study design that were particularly important for assessing adverse event rates (patient selection methods, degree to which all patients were included in analysis, a priori specification and definition of adverse events, method of identification and ascertainment of events, adequate duration of follow-up for identifying specified events, and degree to which and methods used to control for potentially confounding variables in analyses). We rated observational studies as good-quality for adverse event assessment if they adequately met 6 or more of the 7 predefined criteria, fair-quality if they met 3 to 5 criteria, and poor-quality if they met 2 or fewer criteria.
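The count-based rule for observational studies can be written down as a small function. This is only an illustrative sketch: the function name and its argument are hypothetical, and only the thresholds (6 or more, 3 to 5, 2 or fewer, out of 7) come from the text above.

```python
def rate_observational_study(criteria_met: int, total_criteria: int = 7) -> str:
    """Map the number of adequately met criteria to a quality rating.

    Thresholds follow the text: >= 6 is good, 3 to 5 is fair, <= 2 is poor.
    """
    if not 0 <= criteria_met <= total_criteria:
        raise ValueError("criteria_met must be between 0 and total_criteria")
    if criteria_met >= 6:
        return "good"
    if criteria_met >= 3:
        return "fair"
    return "poor"


if __name__ == "__main__":
    for met in range(8):
        print(met, rate_observational_study(met))
```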

Included systematic reviews were also rated for quality based on predefined criteria: clear statement of the question(s), inclusion criteria, adequacy of search strategy, validity assessment, adequacy of detail provided for included studies, and appropriateness of the methods of synthesis.

Overall quality ratings for an individual study were based on internal and external validity ratings for that trial. A particular randomized trial might receive 2 different ratings, 1 for effectiveness and another for adverse events. The overall strength of evidence for a particular key question reflected the quality, consistency, and power of the set of studies relevant to the question.

Grading the Strength of Evidence

We graded strength of evidence based on the guidance established for the Evidence-based Practice Center Program of the Agency for Healthcare Research and Quality.10 Developed to grade the overall strength of a body of evidence, this approach incorporates 4 key domains: risk of bias (includes study design and aggregate quality), consistency, directness, and precision of the evidence. It also considers other optional domains that may be relevant for some scenarios, such as a dose-response association, plausible confounding that would decrease the observed effect, strength of association (magnitude of effect), and publication bias.

Table 2 describes the grades of evidence that can be assigned. Grades reflect the strength of the body of evidence to answer key questions on the comparative effectiveness, efficacy and harms of atypical antipsychotic drugs. Grades do not refer to the general efficacy or effectiveness of pharmaceuticals. Two reviewers independently assessed each domain for each outcome and differences were resolved by consensus.

Table 2. Definitions of the grades of overall strength of evidence.

Data Synthesis

We constructed evidence tables showing the study characteristics, quality ratings, and results for all included studies. Trials that evaluated one atypical antipsychotic against another provided direct evidence of comparative effectiveness and adverse event rates. Where possible, these data were the primary focus. In theory, trials that compare these drugs to other antipsychotic drugs or placebos can also provide evidence about effectiveness. This is known as an indirect comparison and can be difficult to interpret for a number of reasons, primarily heterogeneity between trial populations, interventions, and assessment of outcomes. Indirect data are used to support direct comparisons where they exist, and are also used as the primary comparison where no direct comparisons exist. Such indirect comparisons should be interpreted with caution.

We reviewed studies using a hierarchy of evidence approach, where the best evidence was the focus of our synthesis for each question, population, intervention, and outcome addressed. As such, direct comparisons were preferred over indirect comparisons, but indirect comparisons were used when no direct evidence was available. Similarly, effectiveness and long-term safety outcomes were preferred to efficacy and short-term tolerability outcomes. For each drug pair, the hierarchy of evidence was applied as follows for effectiveness, efficacy, and safety:

  • Direct comparisons
      • Head-to-head trials
      • Head-to-head observational studies with effectiveness outcomes
  • Indirect comparisons
      • Active-control or placebo-controlled trials
      • Other observational studies, such as active-controlled, before-after, and descriptive epidemiologic studies

In this review, a head-to-head study was defined as any study that includes 2 or more atypical antipsychotics where the sample sizes are similar and the outcomes reported and aspects of study design are the same across the drug groups. This definition may not be the same as that applied by the authors of the study. Active-control studies are those that compare an atypical antipsychotic to another drug (for example, a conventional antipsychotic).

To estimate differences between groups in trials that reported continuous data, we used the weighted mean difference and the 95% confidence intervals. The relative risk or risk difference and 95% confidence intervals were used to estimate differences in trials that reported dichotomous outcomes.
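As an illustration of these effect measures, the sketch below computes a relative risk, a risk difference, and a mean difference for a single two-arm trial, each with a 95% confidence interval. It assumes large-sample (Wald) intervals and nonzero event counts; the report does not state which interval formulas were used, and all function names are hypothetical. The mean difference shown is the per-trial quantity that a weighted mean difference pools across trials.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def relative_risk(events_a, n_a, events_b, n_b):
    """Relative risk of group A versus group B, with a 95% CI on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - Z_95 * se_log_rr)
    upper = math.exp(math.log(rr) + Z_95 * se_log_rr)
    return rr, lower, upper


def risk_difference(events_a, n_a, events_b, n_b):
    """Risk difference (A minus B) with a 95% Wald CI."""
    risk_a, risk_b = events_a / n_a, events_b / n_b
    rd = risk_a - risk_b
    se = math.sqrt(risk_a * (1 - risk_a) / n_a + risk_b * (1 - risk_b) / n_b)
    return rd, rd - Z_95 * se, rd + Z_95 * se


def mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference in means (A minus B) with a 95% CI, for a continuous outcome."""
    md = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return md, md - Z_95 * se, md + Z_95 * se


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(relative_risk(20, 100, 10, 100))
    print(risk_difference(20, 100, 10, 100))
    print(mean_difference(10.0, 2.0, 50, 8.0, 2.0, 50))
```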

In order to assess dose comparisons we identified the section of the dosing range that included the mean dose of each drug. By using the divisions below midrange, midrange, and above midrange we were able to compare the mean dose of each drug in relative terms. In identifying the midpoint dose for each drug, we realized that the approved US Food and Drug Administration dosing range might not reflect actual practice. The American Psychiatric Association practice guidelines for schizophrenia12 cite the dosing ranges identified in Schizophrenia Patient Outcomes Research Team treatment recommendations.13–16 We created a range of midpoint doses for each drug using the midpoint of the range approved by the US Food and Drug Administration and the range recommended by the Schizophrenia Patient Outcomes Research Team, thereby allowing for greater variability and more realistic dose comparisons. Based on this, midrange daily dosing is as follows: aripiprazole 20 mg, clozapine 375 to 600 mg, olanzapine 15 to 20 mg, quetiapine 450 to 550 mg, risperidone 4 to 5 mg, and ziprasidone 100 to 160 mg. For newer drugs, we only used dosing approved by the US Food and Drug Administration to determine midpoint daily dose ranges: asenapine 5 mg, iloperidone 12 to 24 mg, and extended-release paliperidone 6 mg.
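The classification of a trial's mean dose can be sketched as a lookup against the midrange values listed above. The dictionary and function names are hypothetical, and two choices here are assumptions rather than rules stated in the report: a dose exactly on a range boundary is treated as midrange, and a drug with a single midpoint value is treated as a range of zero width.

```python
# Midrange daily doses in mg, taken from the text above.
# Single-value midpoints are stored as (value, value).
MIDRANGE_MG_PER_DAY = {
    "aripiprazole": (20, 20),
    "clozapine": (375, 600),
    "olanzapine": (15, 20),
    "quetiapine": (450, 550),
    "risperidone": (4, 5),
    "ziprasidone": (100, 160),
    "asenapine": (5, 5),
    "iloperidone": (12, 24),
    "paliperidone ER": (6, 6),
}


def classify_mean_dose(drug: str, mean_dose_mg: float) -> str:
    """Place a mean daily dose below, within, or above the drug's midrange."""
    low, high = MIDRANGE_MG_PER_DAY[drug]
    if mean_dose_mg < low:
        return "below midrange"
    if mean_dose_mg > high:
        return "above midrange"
    return "midrange"  # assumption: boundary doses count as midrange


if __name__ == "__main__":
    print(classify_mean_dose("olanzapine", 12.5))
    print(classify_mean_dose("risperidone", 4.5))
    print(classify_mean_dose("quetiapine", 600))
```

Expressing each arm's dose in these relative terms is what allows, for example, a below-midrange dose of one drug to be flagged when it is compared with an above-midrange dose of another.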

Statistical Analysis

Meta-analyses were conducted where possible. To determine whether meta-analysis could be meaningfully performed, we considered the quality of the studies and heterogeneity across studies in design, patient population, interventions, and outcomes. For each meta-analysis, we conducted a test of heterogeneity and applied both a random and a fixed effects model. Unless the results of these 2 methods differed in significance, we reported the random effects model results. If meta-analysis could not be performed, we summarized the data qualitatively. All meta-analyses were weighted using the variance. These analyses were created using StatsDirect (Cam Code, Altrincham, UK) software.
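A minimal sketch of variance-weighted pooling under both models follows. It assumes inverse-variance weights, Cochran's Q as the heterogeneity statistic, and the DerSimonian-Laird estimator for the between-study variance; the report does not name the specific estimators used, and the function name is hypothetical.

```python
import math


def pool_effects(effects, variances):
    """Pool per-study effects with fixed- and random-effects models.

    effects:   per-study effect estimates (for example, log relative risks)
    variances: per-study variances of those estimates
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least 2 studies and matching variances")

    # Fixed effects: weight each study by the inverse of its variance.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effects estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random effects: add tau2 to each study's variance before weighting.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    random_est = sum(w * y for w, y in zip(re_weights, effects)) / sum_re

    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / sum_w),
        "random": random_est,
        "random_se": math.sqrt(1.0 / sum_re),
        "q": q,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Made-up effects and variances, for illustration only.
    print(pool_effects([0.0, 1.0], [0.01, 0.01]))
```

When the studies disagree more than their variances predict, the between-study variance is positive and the random-effects standard error exceeds the fixed-effects one, which is why the two models can differ in statistical significance.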

Due to the complexity of the body of literature for these drugs, a mixed treatment comparisons analysis was employed.17, 18 This type of analysis is similar to a network analysis.19 The focus of a more traditional meta-analysis is on paired comparisons between 2 drugs by either a direct, head-to-head comparison or, if such studies are not available, by indirect comparison.20 However, our goal was to quantitatively compare 7 drugs using both direct and indirect evidence from all available studies. The literature does not include all 21 possible head-to-head comparisons between pairs of these drugs, so our analysis needed to incorporate indirect evidence. However, when direct evidence was available we did not want to ignore the indirect evidence available. The mixed treatment comparisons model utilizes both sources of data. We also wanted to control, or adjust, for treatment-arm characteristics, such as dose level. We adapted the model to do so.
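The figure of 21 follows from counting unordered pairs of 7 drugs (7 × 6 / 2). The sketch below checks that count and also shows the simplest form of an indirect comparison, in which two drugs are compared through a shared comparator. This adjusted indirect comparison is offered only to illustrate what indirect evidence contributes; it is not the mixed treatment comparisons model used in the report, and the drug labels and function name are placeholders.

```python
import math
from itertools import combinations

# Placeholder labels: the 7 drugs in the network are not enumerated here.
drugs = [f"drug_{i}" for i in range(1, 8)]
pairs = list(combinations(drugs, 2))  # all unordered head-to-head pairings


def indirect_comparison(effect_a_vs_c, se_a_vs_c, effect_b_vs_c, se_b_vs_c):
    """Estimate A versus B from A-versus-C and B-versus-C results.

    Effects must be on an additive scale (for example, mean differences or
    log relative risks). The variances add, so the indirect estimate is
    less precise than either of its inputs.
    """
    effect = effect_a_vs_c - effect_b_vs_c
    se = math.sqrt(se_a_vs_c ** 2 + se_b_vs_c ** 2)
    return effect, se


if __name__ == "__main__":
    print(len(pairs))
    # Made-up inputs, for illustration only.
    print(indirect_comparison(-0.5, 0.3, -0.2, 0.4))
```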

Peer Review

We requested and received peer review of the report from 4 content and methodology experts. Their comments were reviewed and, where possible, incorporated into the final document. All comments and proposed actions by authors were reviewed by representatives of the participating organizations of the Drug Effectiveness Review Project before finalization of the report. Names of peer reviewers for the Drug Effectiveness Review Project are listed at

Public Comment

This report was posted to the Drug Effectiveness Review Project website for public comment. We received comments from 6 pharmaceutical companies.

Copyright © 2010 by Oregon Health & Science University, Portland, Oregon 97239. All rights reserved.
Bookshelf ID: NBK50584

