Executive Summary

This future research needs (FRN) report is based on an Agency for Healthcare Research and Quality (AHRQ) comparative effectiveness review (CER) titled “First- and Second-Generation Antipsychotics for Children and Young Adults.”1 The purpose of the CER was to review and synthesize the evidence regarding the benefits and harms of first- and second-generation antipsychotics (FGAs and SGAs) (see Tables A and B) for the treatment of various psychiatric and behavioral conditions in individuals 24 years of age or younger. Table C shows the key questions from this CER.

Table A. Food and Drug Administration-approved first-generation antipsychotics.

Table B. Food and Drug Administration-approved second-generation antipsychotics.

Table C. Key Questions from the CER.

For Key Question (KQ) 1, with few exceptions, the CER1 reported that the evidence comparing FGAs with SGAs was insufficient to allow for conclusions. Where an interclass difference was noted, the strength of evidence (SOE) grade was reported as low. With several exceptions, the CER found too few studies to allow within-class efficacy/effectiveness comparisons. In these exceptions, the finding, whether of no difference or of a difference, was supported by low SOE. The CER reported low to moderate SOE to support SGAs as a class over placebo for certain outcome-disorder pairs, as summarized in Table D.

Table D. Summary SOE for SGAs vs. placebo.

Regarding adverse events for FGAs compared with SGAs (KQ 2), SGAs were significantly favored over haloperidol for extrapyramidal symptoms (low SOE). Haloperidol was favored over olanzapine for body composition (low SOE). For all other adverse events, differences were not significant (low SOE) or the evidence was insufficient. For all comparisons of different FGAs or FGA with placebo, there was insufficient evidence to draw a conclusion for adverse events.

For KQ 3, the evidence was rated as insufficient to draw conclusions for health-related quality of life; legal interactions; and other patient-, parent-, or care provider–reported outcomes for all conditions. Short- and long-term outcomes were reported in nine studies examining pervasive developmental disorders and in eight studies examining ADHD and disruptive behavior disorders.

Evidence and conclusions for KQ 4 are based on studies that compared outcomes across various patient subpopulations. Few studies identified differences in the results across subpopulations. Few associations between the patient or clinical variables and outcomes were supported by more than one study. Few studies reported on key health outcomes, and the duration of most studies was short, limiting conclusions about outcomes such as health-related quality of life, social and occupational functioning, and other long-term effectiveness outcomes, parent- or care provider–reported outcomes, and long-term effects of acute adverse events.


Identifying Evidence Gaps and Developing PICOTS

We developed a preliminary list of evidence gaps based on SOE and other information gleaned from the results and limitations sections of the CER. Our main focus was on capturing topics with insufficient information. We then applied the PICOTS from the CER1 inclusion/exclusion criteria and developed an analytic framework (Figure A) to show the relationships between the evidence gaps, PICOTS, and key questions.

Analytic Framework (Figure A): To the left of center in the figure, a box represents the populations that are the focus of the evidence gaps. These are children and young adults ages 24 years or younger with a diagnosis of any of the included disorders. Included disorders are: pervasive developmental disorder (PDD), attention deficit hyperactivity disorder (ADHD), disruptive behavior disorder (DBD), bipolar disorder (BPD), schizophrenia/schizophrenia-related psychosis (Sz), tic disorders (Tourette syndrome) (TS), obsessive compulsive disorder (OCD), post-traumatic stress disorder (PTSD), eating disorders (anorexia nervosa, bulimia nervosa, eating disorder not otherwise specified) (AN), or nondisorder-specific severe behavioral issues (e.g., aggression) (BS). Below this box, KQ 1 and KQ 4 are specifically noted in parentheses because they refer to targeted populations of either people with a single diagnosis or clinical or demographic subsets of patients with any of the disorders. Dashed lines with arrows pointing toward the box connect this box to two circles located to the right and above the box. The borders of the circles are dashed lines. The text within the circles describes population characteristics that might influence responses to treatment and methods needed to assess prevalence and treatment outcomes. In the first circle, the text reads “Patient selection criteria that matches typical practice.” In the second circle, the text is “Sociodemographic characteristics: sex, race/ethnicity, cotreatment, history of psychosis, duration of illness.” A thick solid line with an arrow leads to the right from the box representing the populations. This line represents the interventions of interest for the evidence gaps. The interventions are listed under the line. These are: first and second generation antipsychotics. 
Below the line, KQ 2, KQ 3, and KQ 4 are noted in parentheses to indicate the association between these key questions and the interventions and outcomes of interest. Below the list of interventions is a circle with a dashed line border enclosing the phrase “Dosage variations.” A dashed line with an arrow leads from the circle to the phrase “first and second generation antipsychotics.” A curved solid line with an arrow leads down and to the left from the interventions line. The curved line leads to an oval with a solid line border. The adverse effects that could result from some of the interventions are listed inside this oval. The adverse effects are categorized into major and general. The major adverse effects are mortality, cerebrovascular disease-related events, development of diabetes mellitus, diabetic ketoacidosis, neuroleptic malignant syndrome, seizures, extrapyramidal effects, cardiomyopathies, cardiac arrhythmias, and agranulocytosis. The general adverse events are weight gain, agitation, constipation, sedation, elevated cholesterol, elevated transaminases, adverse events related to prolactin elevations, galactorrhea, exercise intolerance, and precocious puberty. Outside the border and just above the oval, the term “KQ 2” is noted in parentheses to show the association between key question 2 and the potential adverse effects. To the right of the adverse effects oval are two circles, each with dashed line borders and dashed lines with arrows leading from them to the adverse effects oval. The text within the first circle reads “Standardized pediatric side-effect scales.” The text in the second circle reads “Direct and indirect comparison of adverse effects.” The thick solid line leading from the population box on the left ends at two large boxes on the right side of the figure. These boxes contain lists, respectively, of the intermediate and long-term health outcomes of interest. The intermediate outcomes are listed in the first box.
These are core illness symptom response rates with corresponding dose and duration of response, acute school performance/attendance, and acute legal/justice system interaction (i.e., arrests, detention). The long-term outcomes are listed in the second box. These are: long-term symptom response rates with corresponding dose, duration of response, remission, relapse, speed of response, time of discontinuation of medication; growth and maturation; cognitive and emotional development; suicide-related behaviors, death by suicide; medication adherence and persistence; school performance/attendance; work-related functional capacity; patient insight into illness; patient/care provider-reported outcomes (e.g., levels of physical activity/inactivity, diet, food preferences); health-related quality of life (HRQoL); legal/justice system interaction; health care system utilization; direct or indirect impacts from medication use (e.g., development of diabetes mellitus, weight gain, delayed sexual maturation/fertility, bone density, cardiovascular effects, life expectancy, metabolic effects). Located above these boxes are three circles with dashed line borders. Dashed lines with arrows lead from the circles to one or both boxes. The text within the first circle on the left reads “Consensus on clinically meaningful differences.” The dashed lines lead from this circle to both boxes listing the intermediate and long-term outcomes. The text in the middle circle with the dashed border reads “Adequate blinding of study participants, concealment of treatment allocation, handling of missing data.” One dashed line leads from this circle to the long-term outcomes box. The text in the third circle on the right reads “Standardized and comparable measurement for outcomes.” One dashed line leads from this circle to the long-term outcomes box.

Figure A. Analytic framework depicting relationships between key questions, populations, interventions, outcomes, and components of evidence gaps.

We identified a broad range of potential stakeholders, who represented one or more perspectives, including patient and family advocacy groups; health care providers, including diagnosticians and treatment experts; educators of preschool and school-age children; researchers, including those with experience in pharmacology, psychiatry, education, epidemiology, and screening tools; state policymakers and payers of services; professional provider and educator organizations; individuals with knowledge of health services delivery systems; and research funders. The stakeholders contributed to this project via email, conference calls, and online prioritization activities. We scheduled two rounds of conference calls using GoToMeeting® and two rounds of an online prioritization with the stakeholder group.

The stakeholders received a preliminary list of evidence gaps and an analytic framework showing the relationships between the key questions, PICOTS elements, and components of the evidence gaps as part of their orientation materials. During the first call, we invited stakeholders to comment on and make contributions to the list of evidence gaps. We also reviewed a list of ongoing research studies, developed by the project team through searching online research registries, to help identify new data that might be pertinent to evidence gaps. After receiving stakeholder input, project investigators revised the list of evidence gaps and applied the PICOTS elements to the new and revised gaps.

Criteria for Prioritizing Evidence Gaps

The project team developed an online prioritization tool and invited the stakeholders to rank the revised list of evidence gaps in order of priority to produce an upper tier of evidence gaps. To complement the stakeholders’ own perspectives during the prioritization process, we provided the stakeholders with a modified version of the Effective Health Care (EHC) Program Selection Criteria.

Engaging Stakeholders to Prioritize Evidence and Develop Research Needs

During the second call, we reviewed and discussed the results of the prioritization exercise, finalized the upper tier of evidence gaps, and asked stakeholders for feedback on the PICOTS and thoughts on potential research designs for these upper-tier gaps.

Following this discussion, we applied the updated PICOTS framework to the upper-tier evidence gaps and translated them into research questions. We then invited the stakeholders to reprioritize only the upper tier of the evidence gaps using the online prioritization tool to create a final list of prioritized research needs. The final list of prioritized research needs was not shared with the stakeholders until the public comment period of this report.

Developing Research Questions and Determining Potential Research Designs

We applied study design considerations including issues of validity; resources required; ability to recruit subjects or obtain data; and potential ethical, legal, or social issues, to the top-ranked research needs. We also performed sample power analyses to help identify pragmatic barriers of the potential designs. We did not ask stakeholders to rank study designs or provide input to the proposed study designs.
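As a rough illustration of the kind of sample power analysis mentioned above, the following minimal sketch estimates the number of participants needed per arm to detect a difference between two outcome proportions using the standard normal-approximation formula. The function name and all input values are hypothetical and chosen for illustration only; they are not taken from the CER or from the analyses performed for this report.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-sided comparison of two
    proportions (normal approximation, equal allocation).

    p1, p2 : assumed outcome proportions in the two arms (hypothetical inputs)
    alpha  : two-sided type I error rate
    power  : desired probability of detecting the assumed difference
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ to compute a sample size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    p_bar = (p1 + p2) / 2                          # pooled proportion under H0
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical scenarios: a common outcome versus a rare adverse event.
common = n_per_group(0.50, 0.60)  # e.g., response rates of 50% vs. 60%
rare = n_per_group(0.01, 0.02)    # e.g., event rates of 1% vs. 2%
print(common, rare)
```

Under these assumed inputs, the rare-event comparison requires several times as many participants per arm as the common-outcome comparison, which is consistent with the pragmatic barrier that sample size poses for trials of uncommon safety outcomes.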


From the original 16 evidence gaps, of which the stakeholders prioritized 14, the stakeholders deemed 6 to be the highest-priority research needs after 2 rounds of prioritization. In this executive summary, we present each research need and the research team’s initial views of the potential study designs that could be used to address the priority research need. A discussion of the potential study design considerations may be found in the full FRN report along with a table describing additional characteristics and study design considerations. The six upper-tier research needs in Table E are not ranked.

Table E. Six high-priority research needs identified by stakeholders.


We worked with a group of stakeholders to identify six high-priority research needs in the area of antipsychotic use in youth. The stakeholders prioritized general medication safety and effectiveness issues across disorders over disorder-specific medication effectiveness gaps.

Although randomized controlled trials (RCTs) may be an ideal study design for many effectiveness questions, they are not viable for most of the high-priority research needs because of sample size and length of followup demands. Prospective cohort designs that follow youth with a range of mental health disorders could be valuable for comparative effectiveness and safety research but are hampered by cost and logistical concerns. Secondary data analysis techniques are limited by population heterogeneity, short length of followup, and lack of appropriate measures in source trials. In some situations where raw data could be shared among investigators, meta-analysis of individual patient data could be considered. Lastly, for some questions, registries with linkages to clinical data sets could be a lower-cost approach, allowing both prospective and retrospective evaluations, but such efforts are nascent in mental health and face confidentiality and methodological barriers.

Despite the aim of the CER1 to evaluate both antipsychotic classes, stakeholders pointed out that future attempts to use observational designs to compare FGAs with SGAs would be hampered by the relatively low rate at which FGAs are used to treat youth. Even in the context of RCTs, feasibility and applicability issues may be strong barriers for FGA evaluation for most disorders.

There are limitations and challenges related to the future research needs process. Stakeholder input was essential, but scheduling challenges led to incomplete participation from some members. Further, conference call time constraints may have led to certain opinions not being expressed. To accommodate these challenges, we provided opportunities for stakeholders to provide feedback by email, but we did not speak with stakeholders by telephone individually. Inherent in the process is a challenging tension between the need to develop a list of digestible evidence gaps for a diverse group of stakeholders and the need to remain faithful to the purpose, findings, and intent of the original CER on antipsychotics in youth.


Overall, the stakeholders demonstrated engagement in our discussions of research challenges in the field and were able to perform the ranking process without difficulty. The six high-priority research needs included a broad range of issues cutting across disorders, key clinical outcomes, safety outcomes, and methodological concerns. PICOTS development aided our consideration of study design issues, and our sample power analyses demonstrated the pragmatic barriers that many of the potential designs will present. Although large long-term multisite clinical trials may be the gold standard to assess many of the questions of importance, issues of feasibility have greatly limited the number of such large pragmatic trials in mental health to date. Large prospective cohort studies of youth exposed to antipsychotics may be viable and offer considerable analytic flexibility, but they are also costly. Patient registries with linkages to clinical datasets may allow for more efficient evaluation of some questions with advanced analysis methods, but the infrastructure for this needs considerable investment, and its development may face considerable hurdles relating to information privacy. Meta-analysis of existing trials data and of individual patient data may prove helpful but will likely be limited to evaluation of specific shorter-term outcomes. Despite its limitations, the structured process used in this project may prove to be an effective way of reaching relative consensus on research priorities in this broad and complex topic area.


1. Seida JC, Schouten JR, Mousavi SS, et al. First- and Second-Generation Antipsychotics for Children and Young Adults. Comparative Effectiveness Review No. 39. (Prepared by the University of Alberta Evidence-based Practice Center under Contract No. 290-2007-10021.) AHRQ Publication No. 11(12)-EHC077-EF. Rockville, MD: Agency for Healthcare Research and Quality; February 2012.