Topic Development

The topic of this report and preliminary Key Questions (KQs) were developed through a participatory process involving the public, the Scientific Resource Center for the Effective Health Care program of the Agency for Healthcare Research and Quality (AHRQ), and various stakeholder groups. When formulating the research questions, we communicated with eight key informants who represented psychiatrists, primary care practitioners, consumer representatives, and researchers in the area. Additional study, patient, intervention, and eligibility criteria, as well as outcomes, were refined and agreed upon through discussions between the McMaster University Evidence-based Practice Center, the Technical Expert Panel (TEP) members, the AHRQ Task Order Officer (TOO), a patient representative, and comments received from the public. Upon completion of the topic refinement, the key questions were posted for public comment; the comments were then summarized and discussed with the TEP. Relevant modifications (additions or clarifications) were incorporated.

Analytic Framework

Following consultation with key informants, the AHRQ TOO, and the investigative team, the key research questions were developed. Figure 1 shows a flow diagram indicating the relationship between research questions in this CER. The first box in the figure shows the last question (KQ4), where clinical practice guidelines (CPGs) are evaluated. The other research questions are related to interventions used following an inadequate response to a selective serotonin reuptake inhibitor (SSRI) for the index episode of depression. The treatment options following a failed response include the seven options (defined as interventions) for KQ1. Harms associated with any of these interventions are evaluated in KQ2 and can include suicide, sexual dysfunction, gastrointestinal effects, and neuropsychiatric effects. The study effects are evaluated in KQ1, KQ2, and KQ3, with the latter question considering subgroups related to different populations with depressive symptoms and other related factors potentially impacting treatment response. We note that intermediate outcomes, such as response and remission, may precede quality of life or societal outcomes.

Figure 1 is the analytic framework integrating the four Key Questions addressed in this review, which evaluates treatment options following failed response to SSRI antidepressants as first-line therapy. There are seven treatment options: 1) dose and duration optimization, 2) change to a different SSRI, 3) change to a different non-SSRI antidepressant, 4) change to non-pharmacological therapies, 5) addition of an augmenting agent, 6) addition of a second SSRI or non-SSRI antidepressant, and 7) addition of non-pharmacological therapies. The potential benefits (Key Question 1) and harms (Key Question 2) of these therapies are addressed by evaluating intermediate outcomes (partial or full response and remission) and final outcomes (such as improved quality of life, improved global function, and increased return to work). Key Question 4 evaluates all current clinical practice guidelines and recommended clinical actions for patients who have failed to respond to an SSRI as first-line therapy.

Figure 1

Analytic framework. CPG = clinical practice guideline; GI = gastrointestinal; KQ = Key Question; SSRI = selective serotonin reuptake inhibitor

Search Strategy

For the primary studies, the search strategy was limited to studies published from 1980 to April 13, 2011, as SSRIs first became available for treatment of depression in the early 1980s. The following electronic bibliographic databases were searched: MEDLINE®, Cochrane Central®, PsycINFO, Cochrane Database of Systematic Reviews, Embase®, CINAHL®, and AMED. The strategies used combinations of controlled vocabulary (medical subject headings, keywords) and text words. Appendix A details the strategies used to capture relevant citations. For the CPGs, the search was limited to those published from 2004 to April 2011.

A grey literature search was undertaken by the AHRQ Scientific Resource Center and identified potentially relevant citations or information by searching the Web sites as follows:

  1. Health Technology Assessment agencies (Hayes Inc. Health Technology Assessment),
  2. Regulatory information (United States Food and Drug Administration [FDA], Health Canada, Authorized Medicines for European Community),
  3. Clinical trial registries (, Current Controlled Clinical Trials, Clinical Study Results, WHO Clinical Trials),
  4. Grants and federally funded research (National Institutes of Health, Health Services Research Projects in Progress [HSRProj]),
  5. Abstracts and conference proceedings (Conference Papers Index, Scopus), and,
  6. The New York Academy of Medicine’s Grey Literature Index.

Additionally, the sites of specialty organizations for CPGs were searched, and members of the TEP were queried for potentially relevant guidelines.

Review of reference lists of systematic reviews published from 2005 forward was also undertaken. Similarly, the reference lists of eligible studies at full text screening were reviewed for relevant references. Any potentially relevant citations were cross-checked with our citation database and any that were new were retrieved and screened at full text.
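The cross-check of hand-searched references against the citation database can be pictured with a minimal sketch. This is not the review team's actual procedure or software: the matching key (normalized title plus publication year), the function names, and the placeholder titles are all assumptions made for illustration.

```python
import re


def citation_key(title: str, year: int) -> tuple:
    """Build a hypothetical matching key: lowercase title with punctuation removed, plus year."""
    normalized = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    return (normalized, year)


def new_citations(candidates, database):
    """Return candidates whose key is absent from the database, dropping repeats.

    Both arguments are iterables of (title, year) pairs; this record shape is an assumption.
    """
    known = {citation_key(title, year) for title, year in database}
    seen = set()
    fresh = []
    for title, year in candidates:
        key = citation_key(title, year)
        if key not in known and key not in seen:
            seen.add(key)
            fresh.append((title, year))
    return fresh


# Placeholder records only; these are not real citations.
database = [("Example Title A: A Trial.", 2006)]
candidates = [
    ("example title a a trial", 2006),  # already in the database after normalization
    ("Example Title B", 2008),          # new, would go to full-text screening
    ("Example title B!", 2008),         # repeat of the previous candidate
]
to_screen = new_citations(candidates, database)
```

A real workflow would more likely match on identifiers such as a DOI or PubMed ID where available; title matching is used here only because it needs no external data.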

Study Selection

Types of Participants

Subjects who are classified as having failed treatment or as having an “inadequate response” were eligible for this review. Treatment failure subjects would ideally be defined as those subjects who are currently on SSRI treatment for the index episode at the time of entry into the study. At that point these subjects have been judged to have had an “inadequate response” at the time of entry into the study or just prior to randomization. An “inadequate response” is typically established using a standardized instrument, where the scores relative to baseline reflect an improvement of less than 50 percent.3,62 The term “inadequate response” is therefore synonymous with terms such as “nonresponse,” “failure to respond,” and “treatment failure.” These terms primarily reflect the perspective of the clinician or researcher. Partial response refers to a change in baseline score from 25 to 49 percent. “Nonresponse” is defined as a change in baseline score of less than 25 percent. For this CER, the term “unsatisfactory response” was used to reflect the patient’s perception of their response to the intervention to treat their depression.

Specific eligibility is as follows: the study populations were eligible if they included adults (≥18 years) or adolescents (12 to 18 years) with major depressive disorder (MDD), dysthymia, or subsyndromal depression, who meet the following criteria:

  • Currently on SSRI treatment for the index episode at the time of entry into the study,
  • Have been judged to have had an “inadequate response” at the time of entry into the study (by any method),
  • The SSRIs that patients did not respond to as a first-line therapy include the following: fluoxetine, citalopram, fluvoxamine, sertraline, escitalopram, and paroxetine,


  • The subjects who are recruited for entry into the study are to be placed on an SSRI for purposes of monitoring prospectively the adequacy of their response; subsequent evaluation includes an intervention for those that have been shown to not respond adequately to the SSRI.


The study populations were not eligible if adults (≥18 years) or adolescents (12 to 18 years) with MDD, dysthymia, or subsyndromal depression met the following criteria:

  • Are not receiving an SSRI at the time of entry into the study (including studies that included antidepressants but were not stratified for an SSRI subgroup),
  • Are not recruited to evaluate the adequacy of response prospectively,
  • Have post-partum depression, bipolar depression, depressive psychosis, dysphoria, mourning syndrome, postoperative depression, premenstrual dysphoric disorder, pseudodementia, puerperal depression, seasonal affective disorder,


  • Populations for whom the pathophysiological mechanism of depression is not comparable to that of patients diagnosed with MDD, including patients who initially sustained a cerebrovascular accident or who suffer from dementias (including Alzheimer’s disease, vascular dementia, mild cognitive impairment), Parkinson’s disease, hypothyroidism, or Cushing’s syndrome.

Types of Interventions

For KQs 1 to 4, the pharmacological and nonpharmacological interventions of interest are as follows:

  • Selective Serotonin Reuptake Inhibitors (SSRIs): Fluoxetine (Fluoxetine Hydrochloride, Prozac, Prozac Weekly, Sarafem, Symbyax), Citalopram (Celexa, Citalopram Hydrobromide), Fluvoxamine (Fluvoxamine Maleate, Luvox, Luvox CR), Sertraline (Sertraline Hydrochloride, Zoloft), Paroxetine (Paroxetine Hydrochloride, Paxil, Paxil CR, Pexeva), Escitalopram (Escitalopram, Escitalopram Oxalate, Lexapro).
  • NonSSRI Antidepressants: Duloxetine Hydrochloride (Cymbalta), Venlafaxine (Effexor, Effexor XR, Pristiq), Desvenlafaxine Succinate (Pristiq), Phenelzine Sulfate (Nardil), Tranylcypromine Sulfate (Parnate), Emsam (Selegiline), Moclobemide (Manerix), Doxepin (Sinequan, Zonalon, Doxepin Hydrochloride), Clomipramine (Anafranil, Clomipramine Hydrochloride), Amitriptyline (Amitid, Amitril, Elavil, Endep, Etrafon 2–10, Etrafon 2–25, Etrafon-a, Etrafon-Forte, Limbitrol, Limbitrol DS, Perphenazine and Amitriptyline Hydrochloride combinations - Triavil 2–10, Triavil 2–25, Triavil 4–10), Maprotiline (Ludiomil), Desipramine (Norpramin, Pertofrane), Trimipramine (Surmontil, Trimipramine Maleate), Imipramine (Imipramine Hydrochloride, Imipramine Pamoate, Janimine, Pramine, Presamine, Tofranil, Tofranil-pm), Protriptyline Hydrochloride (Vivactil), Agomelatine (Valdoxan), Reboxetine (Edronax, Vestra), Norvale (Mianserin, Bolvidon, Tolvan), Trazodone (Desyrel, Trazodone Hydrochloride, Trialodine), Mirtazapine (Remeron, Remeron Soltab), Nefazodone (Nefazodone Hydrochloride, Serzone), Bupropion (Aplenzin, Bupropion Hydrochloride, Wellbutrin, Wellbutrin SR, Wellbutrin XL, Zyban).
  • Non-pharmacological and complementary and alternative medicine (CAM) therapies: cognitive behavioral therapy (CBT), interpersonal therapy (IPT), and other psychotherapies (behavior therapy, counseling, problem-solving therapy, psychodynamic therapy, bibliotherapy, guided self-help, distraction therapy), light therapy, exercise (any type cardiovascular or strengthening or stretching and including yoga, hydrotherapy), CAM including whole body systems (e.g., acupuncture), mind-body medicine (e.g., meditation), manipulative and body-based practices (e.g., massage), energy medicine (e.g., reiki), biologically based practices (dietary supplements and herbal products (e.g., amino acids, vitamins and minerals, Inositol, herbs, methyl-folate (Deplin), omega-3 fatty acids, SAMe)).
  • Augmenters (no formal indication for use as an antidepressant): Buspirone (Buspar), Gepirone (Ariza), Tandospirone (Sediel), Atypical Antipsychotics (Risperidone (Risperdal), Olanzapine (Zyprexa), Quetiapine (Seroquel), Aripiprazole (Abilify), Ziprasidone (Geodon)), Psychostimulants (Amphetamine (Adderall), Methylphenidate (Ritalin)), Dopamine agonists (Bromocriptine (Parlodel), Cabergoline (Dostinex), Pergolide (Permax), Pramipexole (Mirapex), Ropinirole (Requip), Apomorphine (Apokyn), Rotigotine (Neupro)), Other drugs (Lithium, Pindolol, Tryptophan), Anticonvulsants (Carbamazepine (Tegretol), Sodium Valproate, Lamotrigine (Lamictal)), Antiprogestational agents (Mifepristone (Mifeprex)), Sex Hormones (Androgens (e.g., Testosterone), Estrogens, Progesterone), Thyroid medications (tri-iodothyronine (T3)), Amisulpride (Solian), Phenytoin (Dilantin, Phenytek), Modafinil (Provigil, Alertec, Modavigil, Modiodal, Modafinil, Carim, Armodafinil, Nuvigil), N-methyl-D-aspartate (NMDA) NR2B subunit selective agonist CP-101606, mecamylamine hydrochloride (Inversine), Atomoxetine (Strattera).

Studies that used electroconvulsive therapy, vagal nerve stimulation, or repetitive transcranial nerve stimulation as the intervention were excluded.

For KQ4, we evaluated CPGs developed at a national level or by key professional organizations and published in English; guidelines were not limited to any country.

Types of Comparators

We identified and included studies with comparative intervention groups. From a design hierarchy perspective, comparative group designs provide stronger evidence for efficacy and effectiveness than noncomparative designs.

The interventions (either alone or in combination) may be compared with any of the following:

  1. Placebo
  2. Same SSRI dose but different MDD population (for example, mild vs. severe MDD)
  3. Same SSRI of different dose or duration
  4. Other SSRI
  5. Other antidepressant (from a different drug class)
  6. Nonpharmacological or CAM therapies as described above
  7. Adjunct therapy: combination of an augmenter plus SSRI
  8. Adjunct therapy: combination of nonpharmacological or CAM therapy plus SSRI
  9. Adjunct therapy: combination of augmenter and nonpharmacological or CAM therapy

Types of Outcomes

Primary outcomes include the following:

  1. Adequate Response: response to treatment is defined as a minimum of 50 percent change relative to baseline using a standardized instrument.3,62
  2. Remission: remission is defined as being free or nearly free of symptoms. It is typically established by achieving a threshold score using a standardized instrument.
  3. Partial and Nonresponse: partial response refers to a change in baseline score from 25 to 49 percent. Nonresponse is defined as less than 25 percent change relative to baseline. We recognize that some of the studies will vary in their definition and this will be noted when detail is provided within the original study.
  4. Speed of Response.
  5. Relapse: relapse is defined as a return of symptoms satisfying the full syndrome criteria for an episode which occurs following a period of remission but before recovery. Relapse is the point at which recurrent symptoms are severe enough that the clinician determines an intervention is warranted. Relapse is related to but distinct from recurrence. Recurrence is defined as the return of the disease after its apparent cessation (symptoms return after a period of remission).
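The percentage thresholds for response, partial response, and nonresponse described above can be illustrated with a minimal sketch. It assumes a standardized instrument on which lower scores mean fewer symptoms, so that improvement is the percent reduction from baseline; the function names are hypothetical, and values between 49 and 50 percent are assigned to partial response, a boundary the definitions above do not specify. Individual studies may use different definitions.

```python
def percent_improvement(baseline: float, followup: float) -> float:
    """Percent reduction in symptom score relative to baseline (assumes lower score = better)."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (baseline - followup) / baseline * 100.0


def classify_response(baseline: float, followup: float) -> str:
    """Apply the review's thresholds: >=50% response, 25% to <50% partial, <25% nonresponse."""
    pct = percent_improvement(baseline, followup)
    if pct >= 50:
        return "response"
    if pct >= 25:
        return "partial response"
    return "nonresponse"


# Hypothetical scores on an unnamed depression rating scale.
example = classify_response(baseline=24, followup=18)  # 25 percent improvement
```

Remission is not covered by this sketch because it is defined by an instrument-specific threshold score rather than by percent change.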

Secondary outcomes include the following:

  1. Quality of life
  2. Adherence
  3. Return to work
  4. Global change as measured by global assessment scales
  5. External service utilization

Additional Eligibility Criteria

Study Design


Included study designs:

  1. Experimental studies with comparator groups (randomized and quasirandomized trials)
  2. Observational studies with comparator groups (retrospective and prospective cohort, case control, and interrupted time series with comparison group)
  3. Letters with study data and abstracts

Excluded study designs:

  1. All other study designs (e.g., case series, qualitative studies)
  2. Editorials, commentaries, and notes

Language of Publication

Non-English language publications were excluded.

Contacting Authors for Additional Data

For studies that included populations that had failed to respond to antidepressants that included SSRIs, study authors were contacted via email requesting additional stratified outcome data. Studies where the authors did not respond or contact could not be established were excluded.


There were no restrictions on study eligibility with respect to a minimum treatment interval.


Studies that recruited patients from primary care, outpatient, and inpatient mental health settings were included. There were no exclusions for study setting.

Clinical Practice Guideline Selection

We defined CPG as “systematically developed statements about specific clinical problems intended to assist practitioners and patients in making decisions about appropriate health care.”68 We included full guidelines and consensus statements, but we excluded algorithms with no background or description of the process by which the algorithm was developed.

Data Extraction

Relevant fields of information were extracted from individual studies by trained data extractors using standardized forms and a reference guide. Prior to performing the data extraction, a calibration exercise was undertaken using a convenience sample of five included studies. Key study elements were reviewed by a second person (study investigator) with respect to study outcomes, seminal population characteristics (past psychiatric history elements and definition of prior “treatment failure”), and characteristics of the intervention. Disagreements were resolved by consensus.

Extracted data included:

  • Study characteristics: first author, country of research origin, study design, sample size (e.g., sample size calculation, power estimate), clinical indications, and study duration or length of followup.
  • Patient population: age, gender, racial composition, socioeconomic status (e.g., income, education), sleeping disturbances or levels, comorbidities (e.g., psychiatric and medical histories, use of CAM treatments concurrently or historically), definition of treatment failure, and severity and duration of the depressive disorder.
  • Study interventions and comparators: type of intervention/comparator (e.g., pharmacological, nonpharmacological), dosage of intervention/comparator (e.g., type, dose, method of administration), frequency and treatment fidelity for psychotherapy related interventions, treatment duration (e.g., total duration of care), duration of followup, and characteristics of treatment providers.
  • Outcomes: type of instrument or scale, primary or secondary outcome status, type of effect measure (e.g., endpoint or change score, measure of variance), definition of “adequate” treatment response, and type of statistical analysis (e.g., intention to treat).

Assessment of Methodological Quality of Individual Studies

We interpret methodological quality to include primarily elements of risk of bias related to the design and conduct of the study. In addition, we evaluated the presence of other key biases, such as the funding bias, and a specific form of selection bias related to “treatment failure” being determined prospectively.

We selected the Risk of Bias Tool by the Cochrane Collaboration69 to assess randomized controlled trials (RCTs). The tool contains 12 items that include evaluation of the domains of randomization, blinding, cointervention, and selective outcome reporting biases. Criteria for evaluation are standardized for these domains. Inconsistency amongst raters was minimized by providing adequate training and standardized instructions; disagreements were resolved by consensus.70 We had selected the Newcastle Ottawa Quality Assessment Tool71 to assess risk of bias for observational studies but no study of this design was eligible. Additionally, we evaluated studies for adequacy of collecting and reporting harms using the McHarm Tool.72,73 This tool has been specifically designed for adverse events and captures domains related to the classification of harms, method of collection (active versus passive), and also the level of withdrawals due to adverse events. We used the AGREE II to assess the methodological quality of the CPG.74 All tools can be viewed in Appendix B.

A study with low risk of bias was defined as a clinical trial fulfilling six or more of the 12 methodological quality criteria in the Risk of Bias Tool. A study with high risk of bias was defined as fulfilling fewer than six criteria. The classification of individual studies into categories of study limitations (high or low) was used to group studies for grading the strength of the evidence.
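The six-of-12 cutoff can be written as a short rule. The sketch below is illustrative only: the function name and argument names are hypothetical, and it simply counts fulfilled criteria without modeling the individual Risk of Bias Tool items.

```python
def risk_of_bias_category(criteria_met: int, total_criteria: int = 12, threshold: int = 6) -> str:
    """Return "low" when at least `threshold` of the criteria are fulfilled, else "high"."""
    if not 0 <= criteria_met <= total_criteria:
        raise ValueError("criteria_met must be between 0 and total_criteria")
    return "low" if criteria_met >= threshold else "high"


# A hypothetical trial fulfilling 6 of the 12 criteria falls in the low-risk group.
category = risk_of_bias_category(6)
```

Under this rule a trial meeting exactly six criteria is grouped with low risk of bias, which follows from the "six or more" wording.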


We determined a priori the key attributes of applicability of our key research questions with respect to the population, intervention, comparator, and outcome in the context of a wider spectrum of patients (especially in primary care settings) that would likely benefit from these interventions in “real world” conditions.

Population characteristics to which these findings are applicable include:

  • Men and women older than 18 years of age and male and female adolescents aged 12 to 18 years
  • People with a wide spectrum of previous episodes and variation in the course, including a first time episode of depression or several recurrences of MDD, dysthymia, or subsyndromal depression
  • People with a complete spectrum of depression severity (mild to severe MDD and dysthymia)
  • People with a wide spectrum of previous failures to SSRIs, from a first failed response for the current episode, to more than three failed responses to an SSRI for the current episode
  • People with a wide spectrum of failed responses to previous antidepressant exposures for previous episodes of MDD, dysthymia, and subsyndromal depression

Population characteristics to which the findings of this review are not applicable include:

  • Adults or adolescents of either gender who have a primary diagnosis of bipolar disorder, schizophrenia, or major anxiety disorder

Intervention characteristics that these findings are applicable to include:

  • For switches to new monotherapy treatment, antidepressant doses consistent with current recommended therapeutic dose ranges (as a minimum dose) applied for a minimum of 4 weeks,
  • For combined therapy, there is variation in the doses for the added or augmenting agents; there is no clear trend for what ranges are applicable in this context.

The comparator treatments to which the research questions could ideally apply include those detailed in the comprehensive list of comparator treatments. Similarly, the outcomes selected for this review would be applicable to those domains listed in the eligibility criteria; however, we would expect that these outcomes would be assessed using standardized instruments.

Data Synthesis

Qualitative Synthesis

For each trial, information on population characteristics (including history of treatment(s) for any previous episodes of depression, age of first diagnosis, etc.), study outcomes (both of benefit and of harm), sample sizes, settings, funding sources, treatments (type, dose, duration, and provider), methodological limitations, statistical analyses, and any important confounders is summarized in text and summary tables. We have stratified the presentation of results based on the type of depressive disorder (MDD, dysthymia, or subsyndromal depression) and by age (adolescent or adult).

Additionally, we grouped study results: (1) according to the index treatment categories (monotherapy or combined therapies) and the corresponding comparator treatment; (2) the specific grouping of the pharmacological treatment (SSRI, nonSSRI, augmenting agents); and (3) nonpharmacological treatment. Forest plots and summary tables were generated to display primary study outcomes of response and remission.

Summary tables were created for CPGs stratified by country of origin, where possible.

Quantitative Synthesis

The decision to pool individual study results was based on clinical judgment with regard to the comparability of study populations, treatments, and outcome measures. Specifically, methodological quality (high risk of bias vs. low risk of bias), clinical diversity (characteristics of the study population, gender, disease severity), treatment (pharmacological, nonpharmacological), intervention duration (2 weeks vs. 12 months), and outcome characteristics (different measuring scales) of individual studies were considered. The extent of heterogeneity was based on the clinical appropriateness of the populations and interventions.

After the final set of eligible studies was extracted, a decision was made not to undertake meta-analyses due to the clinical heterogeneity, predominantly due to the different types of interventions and comparators. We presented data in forest plots to visually demonstrate comparative effects across the differing drug and nonpharmacological interventions but did not estimate summary effects. STATA (Version 10, StataCorp, College Station, Texas, United States) software was used to estimate the relative risks (RR) (using a random effects model) for the outcomes of response and remission.
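For readers unfamiliar with the effect measure, the following sketch shows how a relative risk and its 95 percent confidence interval are commonly computed for a single two-arm trial using the log-scale standard error. It is not the authors' Stata code and does not reproduce their random effects settings; the function name, argument names, and event counts are hypothetical.

```python
import math


def relative_risk(events_tx: int, n_tx: int, events_ctrl: int, n_ctrl: int, z: float = 1.96):
    """Return (RR, lower, upper) for one trial; z = 1.96 gives an approximate 95% interval."""
    if min(events_tx, events_ctrl) <= 0:
        raise ValueError("zero-event arms need a continuity correction, which is not handled here")
    if events_tx > n_tx or events_ctrl > n_ctrl:
        raise ValueError("events cannot exceed the number of participants in an arm")
    rr = (events_tx / n_tx) / (events_ctrl / n_ctrl)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(1 / events_tx - 1 / n_tx + 1 / events_ctrl - 1 / n_ctrl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Made-up counts: 30 of 100 responders on the intervention vs. 20 of 100 on the comparator.
rr, lower, upper = relative_risk(30, 100, 20, 100)
```

With these made-up counts the interval spans 1, so the hypothetical trial on its own would not show a statistically significant difference in response.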

Subgroup and Sensitivity Analysis

No meta-analyses were undertaken in this CER, as study populations, interventions, and comparators were not deemed sufficiently similar. However, we considered specific factors in the qualitative presentation of the review findings. Our search yielded only two eligible studies that did not include subjects with MDD, and, as such, the impact of the type of depressive disorder could not be explored. Primary studies and guidelines applicable to adults and adolescents were identified, and the results were presented stratified by these two age groups. Factors that had the potential to impact study outcomes or account for the clinical heterogeneity, such as gender, number of previous failures, method of determining treatment failure, dose and duration characteristics of the intervention, and type of treatment provider, were extracted and explored. We summarized these features within the clinical groupings of study interventions: monotherapy versus monotherapy, monotherapy versus combined therapy, and combined therapies versus combined therapies. Methodological heterogeneity was also explored within each of these intervention groupings.

Rating the Body of Evidence

We assessed the overall strength of the evidence (SOE) across the literature using the rating approach specified by the AHRQ.75 The SOE can be classified into four grades based on the AHRQ approach: high, moderate, low, or insufficient. Grading of the SOE is applied to individual outcomes, which in this CER are the primary outcomes of benefit (response and remission) and harm (suicidality, weight gain, and sexual dysfunction); partial response and nonresponse were either omitted or poorly reported in most studies and as such were not included in the GRADE tables. A grading of “high” would reflect high confidence that the evidence shows the true effect, and that further research is very unlikely to change confidence in the estimate of the effect. A grading of “low” would reflect low confidence that the evidence shows the true effect, and that further research is likely to change confidence in, or the magnitude of, the estimate of the effect. A grading of “moderate” reflects a moderate level of confidence and that additional research may change confidence. A grading of “insufficient” reflects that the evidence is not available, or what evidence is available does not permit a conclusion of substance.

There are several factors that may decrease the overall grading of the SOE, and these include: (1) study limitations (predominantly risk of bias criteria) and the type of study design (experimental versus observational); (2) consistency of results (degree to which study results for an outcome are similar (variability across studies is easily explained, range of results is narrow)); (3) directness of the evidence (assesses whether interventions can be linked directly to the health outcomes); and (4) precision (degree of certainty surrounding an effect estimate for a specific outcome). Additional factors that can be considered when evaluating the SOE include: (1) dose response; (2) plausible confounding that would decrease the effect; (3) magnitude of the effect; and (4) publication bias and other factors related to relevance to intended populations.

The AHRQ approach to rating the SOE considers the link between the intervention and the outcomes with respect to the domain of directness. In the context of this CER, the links between intervention and the outcomes are all direct, thus, this domain does not assist in discriminating studies from each other. We have accounted for this by considering directness as per the GRADE approach,76 regarding directness to the population, intervention, and comparator treatments as part of other considerations affecting the SOE. All of these factors were considered when grading the SOE and the overall ratings are detailed in summary tables.

Publication Bias

Although our search strategy is comprehensive and includes a grey literature search covering sources for unpublished trials, there is always the potential for publication bias. Publication bias is important to assess in reviews involving drugs, as there is evidence to suggest that industry sponsorship may lead to negative trials not being published,77 that reporting of adverse events is more favorable to the funder,78 and that there may be delay in publication of negative findings.78

Our grey literature search was undertaken by the AHRQ Scientific Resource Center research librarian. Part of this extensive search included a large number of citations from regulatory databases, such as the FDA, and clinical trial registries. These sources were searched to identify unpublished or ongoing trials in an attempt to minimize publication bias. Since there were fewer than 10 studies focusing on any single intervention, no funnel plots were produced, nor was a meta-analysis undertaken.