NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Gartlehner G, Hansen RA, Thieda P, et al. Comparative Effectiveness of Second-Generation Antidepressants in the Pharmacologic Treatment of Adult Depression [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2007 Jan. (Comparative Effectiveness Reviews, No. 7.)

  • This publication is provided for historical reference only and the information may be out of date.


General Conclusions

This report provides a comprehensive summary of the comparative efficacy, effectiveness, and harms of 12 second-generation antidepressants for the treatment of major depressive disorder (MDD), dysthymia, and subsyndromal depression. They include bupropion, citalopram, duloxetine, escitalopram, fluoxetine, fluvoxamine, mirtazapine, nefazodone, paroxetine, sertraline, trazodone, and venlafaxine in three classes: selective serotonin reuptake inhibitors (SSRIs), serotonin and norepinephrine reuptake inhibitors (SNRIs and SSNRIs), and other second-generation antidepressants. Table 32 briefly summarizes our findings from evidence for all five key questions and their subquestions and notes the strength of evidence in each case.

Table 32. Summary of findings with strength of evidence.

Most of the relevant trials were conducted in patients with MDD. Therefore, we can draw some conclusions regarding the use of second-generation antidepressants for MDD. Evidence is insufficient, however, to draw firm conclusions about comparative efficacy, effectiveness, and harms of second-generation antidepressants for dysthymia and subsyndromal depression.

For MDD, our findings indicate that the existing evidence does not warrant the choice of one second-generation antidepressant over another based on greater efficacy and effectiveness. We could not find any substantial differences in efficacy and effectiveness for either treating the acute depressive phase or maintaining remission. Furthermore, no differences in efficacy and effectiveness are apparent in subgroups based on age and sex, although evidence within subgroups is more limited.

More than 50 percent of patients treated with second-generation antidepressants for acute-phase depression did not achieve remission, the goal of depression treatment. Almost 40 percent of patients failed to respond, a less rigorous outcome. Currently, the evidence is insufficient to determine patient factors that can reliably predict response or nonresponse to an individual drug.

Although limited evidence indicates that second-generation antidepressants are also similar in efficacy for treating patients who had failed to respond to a first-line agent, a substantial proportion of these patients do not achieve response or remission with second-line treatment. Multiple treatment options, therefore, are required for patients who do not respond to first- or second-line treatment.

Clinically, numerous physical and psychological symptoms accompany depressive disorders. Clinicians sometimes recommend using individual second-generation antidepressants for these problems, assuming differences in efficacy to treat these accompanying symptom clusters. The current evidence does not support the selection of one second-generation antidepressant over another for specific accompanying symptoms. The best comparative evidence suggests no difference in efficacy for anxiety symptoms. For other symptom clusters such as melancholia, psychomotor change, pain, and somatization, the evidence is limited to few comparisons. For other common symptoms, such as fatigue and loss of energy, evidence is lacking.

Although second-generation antidepressants are similar in efficacy, they cannot be considered identical drugs. Evidence of moderate strength supports some differences among individual drugs with respect to onset of action, adverse events, and some measures of health-related quality of life; these are of modest magnitude but statistically significant. Specifically, consistent evidence from multiple trials demonstrates that mirtazapine has a faster onset of action than citalopram, fluoxetine, paroxetine, and sertraline60, 61, 72, 73, 76 and that bupropion has fewer sexual side effects than fluoxetine, paroxetine, and sertraline.79, 80, 88–90

Some of these differences are small and might be offset by other adverse events. For example, the faster onset of action of mirtazapine must be weighed against possible decreased adherence because of long-term weight gain. Nonetheless, some of these differences may be clinically significant and influence the choice of a medication for specific patients. For example, patients who have a history of nausea or who dread sexual dysfunction might be more adherent to a choice of treatment that takes these factors into consideration. Past treatment experiences may also frame decisions regarding medications to either select or avoid, but no evidence exists to verify these inferences.

A considerable limitation of our conclusions is that they have been derived primarily from efficacy trials. Although findings from effectiveness studies are generally consistent with those from efficacy trials, the generalizability of some of our conclusions may be limited. Furthermore, the pharmaceutical industry funded a large percentage of these studies, and selective reporting is conceivable, although we had no way to account for missing information.

Our report is the first to assess statistically each of 66 possible drug comparisons of second-generation antidepressants. For comparative efficacy, we employed direct analyses for four comparisons and 62 indirect statistical analyses.
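The figure of 66 is the number of unordered pairs that can be formed from the 12 drugs named in this report, and the split into 4 direct and 62 indirect analyses accounts for all of them. A minimal arithmetic check follows; the variable names are illustrative and are not taken from the report.

```python
from itertools import combinations
from math import comb

# The 12 second-generation antidepressants named in this report.
drugs = [
    "bupropion", "citalopram", "duloxetine", "escitalopram",
    "fluoxetine", "fluvoxamine", "mirtazapine", "nefazodone",
    "paroxetine", "sertraline", "trazodone", "venlafaxine",
]

# Every unordered pair of distinct drugs is one possible comparison.
pairs = list(combinations(drugs, 2))

n_direct = 4                          # comparisons analyzed directly (from the text)
n_indirect = len(pairs) - n_direct    # remaining comparisons, analyzed indirectly

print(len(pairs), comb(len(drugs), 2), n_indirect)
```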

In the following sections we discuss major findings for individual key questions in more detail.

Results for Efficacy and Effectiveness in Major Depressive Disorders

For MDD, direct evidence from head-to-head trials and indirect comparisons using placebo-controlled trials indicate that, overall, the efficacy and effectiveness of second-generation antidepressants do not differ substantially for the treatment of adults. We rated the strength of this evidence as moderate. These findings are consistent with prior systematic reviews and meta-analyses.8, 241

In some of our meta-analyses, results of pooled response rates indicate statistically significant differences in efficacy between some drugs. Specifically, for response, escitalopram is more efficacious than citalopram, sertraline more than fluoxetine, and venlafaxine more than fluoxetine. Accompanying meta-analyses of effect sizes, however, suggest that the actual differences in the mean treatment effects are small and most likely not clinically significant.

For example, a relative risk (RR) meta-analysis of response rates indicates that significantly more patients receiving escitalopram than receiving citalopram achieved treatment response (RR, 1.14; 95% CI, 1.04–1.26). An effect-size meta-analysis yielded a mean difference of 1.3 points on the Hamilton Depression Rating Scale (HAM-D), which represents about one-fifth to one-quarter of a standard deviation. Therefore, this difference most likely does not represent a minimal clinically significant difference. A recent methods study concluded that a change of about one-half of a standard deviation reflects a minimal important difference for a patient.103 In this case, dichotomizing a continuous scale such as the HAM-D appears to overestimate the actual difference in effect sizes.
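The arithmetic behind this comparison can be sketched as follows. The standard deviation is not reported in this section; the range used here is inferred from the statement that 1.3 points is about one-fifth to one-quarter of a standard deviation, so the resulting thresholds are approximations only.

```python
# Mean HAM-D difference, escitalopram vs. citalopram (from the text).
mean_diff = 1.3

# Implied standard deviation bounds (an inference, not a reported value):
# if 1.3 points is 1/4 SD, then SD = 1.3 * 4; if it is 1/5 SD, then SD = 1.3 * 5.
sd_low = mean_diff * 4
sd_high = mean_diff * 5

# Minimal important difference taken as about one-half of a standard deviation.
mid_low = 0.5 * sd_low
mid_high = 0.5 * sd_high

# The observed difference falls short of even the smaller threshold.
reaches_threshold = mean_diff >= mid_low

print(round(sd_low, 2), round(sd_high, 2))
print(round(mid_low, 2), round(mid_high, 2))
print(reaches_threshold)
```

Under these inferred bounds, a difference of roughly 2.6 to 3.25 HAM-D points would be needed to reach one-half of a standard deviation, about twice the observed 1.3 points.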

Similarly, sertraline and venlafaxine had statistically significantly greater response rates than fluoxetine. Effect size meta-analyses, however, yielded no clinically significant mean differences on HAM-D scales.

Findings from indirect comparisons yielded no statistically significant differences in response rates among other potential comparisons. The precision of some of these estimates was low, leading to inconclusive results with wide confidence intervals. Nevertheless, point estimates of treatment effects consistently indicate no substantial differences in efficacy among comparisons.

Although response and remission rates are similar among second-generation antidepressants, 54 percent of patients in these trials did not achieve remission and 34 percent did not respond. Many of these patients will require a second-line treatment. Results from the Sequenced Treatment Alternatives to Relieve Depression (STAR-D) trial—an effectiveness study that randomized patients to bupropion SR, sertraline, or venlafaxine XR after they had failed treatment with citalopram146—indicate that, even with second-line treatments, a substantial proportion of patients do not achieve remission.

Effectiveness trials have greater generalizability of findings than efficacy studies; we found only three such trials. Two of these effectiveness trials were conducted in French primary care settings and one was performed in the United States. Findings were generally consistent with efficacy trials—they did not detect any substantial differences in effectiveness. However, differences between French and US health systems may limit the applicability of results from French effectiveness trials to US patients.

No evidence exists on adherence in effectiveness studies. Although adherence was similar in efficacy trials, the generalizability of such findings may be limited. Most likely, dosing regimens, adverse events, and costs substantially influence adherence of patients in everyday practice. Given similar efficacy and effectiveness, such factors need to be considered when choosing a medication.

Results for Maintaining Response or Remission

The majority of studies included in this report involved treating patients with major depression in its acute phase; for this phase, the goal is reducing signs and symptoms of depression to achieve remission. Patients who achieve remission with acute-phase treatment should be followed to maintain that response and remission. That is, they should be managed in a continuation phase to prevent relapse and, if necessary, in a longer-term maintenance phase to prevent recurrence. (See Figure 1 in the introduction for clarification of these treatment cycles.)

Although evidence was sparse on the comparative efficacy and effectiveness for maintaining response or remission, treating recurrent depression, or treating depression that does not respond to first-line treatment, our findings are consistent with results from acute-phase trials. Overall, no substantial differences among second-generation antidepressants were apparent, but comparisons are limited to a few drugs.

Moderate strength evidence from three efficacy trials47, 96, 116, 117 suggests that no substantial differences in efficacy exist between fluoxetine and sertraline, fluvoxamine and sertraline, and trazodone and venlafaxine for preventing relapse or recurrence. Although results are consistent across these studies, evidence for other drug comparisons is not available; hence, these results are not generalizable to other second-generation antidepressants.

Additionally, trials differed in their design and conduct, further limiting the applicability (generalizability) of this evidence. For example, criteria used to define relapse and recurrence differed considerably across trials. As cases in point with respect to relapse: In the three head-to-head studies, one defined relapse as an increase in the lowest HAM-D or Montgomery-Asberg Depression Rating Scale (MADRS) score of at least 50 percent for 2 weeks, a HAM-D greater than 18 for 2 weeks, and a Clinical Global Impressions - Severity (CGI-S) score greater than 4;47 a second study defined relapse as a HAM-D score greater than 15 with functional impairment;116, 117 and the third simply assessed discontinuation rates.96 Eligibility for continuation- or maintenance-phase treatment also varied considerably.
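To illustrate how much these definitions can diverge, the sketch below encodes the first two as simple rules and applies them to one hypothetical patient. It is a simplified reading of the criteria as summarized above: it uses HAM-D only (the first study also allowed MADRS), treats the first study's conditions as jointly required, and all field names and patient values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    ham_d: float                 # current HAM-D total score
    lowest_ham_d: float          # lowest HAM-D score recorded earlier
    cgi_s: int                   # Clinical Global Impressions - Severity score
    weeks_sustained: int         # weeks the current scores have persisted
    functional_impairment: bool  # whether functional impairment is present


def relapse_first_study(a: Assessment) -> bool:
    """At least a 50 percent increase over the lowest score for 2 weeks,
    HAM-D greater than 18 for 2 weeks, and CGI-S greater than 4."""
    increased = a.ham_d >= 1.5 * a.lowest_ham_d
    sustained = a.weeks_sustained >= 2
    return increased and a.ham_d > 18 and sustained and a.cgi_s > 4


def relapse_second_study(a: Assessment) -> bool:
    """HAM-D greater than 15 with functional impairment."""
    return a.ham_d > 15 and a.functional_impairment


# Hypothetical patient: HAM-D of 17, up from a low of 6, with impairment.
patient = Assessment(ham_d=17, lowest_ham_d=6, cgi_s=4,
                     weeks_sustained=3, functional_impairment=True)

print(relapse_first_study(patient), relapse_second_study(patient))
```

The same hypothetical patient counts as relapsed under the second rule but not under the first, which is why relapse rates from trials using different definitions are hard to compare.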

We advise that, in future studies, investigators try to build on past and current work by employing definitions of relapse that are similar to those commonly found in the published literature to date. In our view, convergence on standard, accepted definitions of recurrence would be useful as well.

A related question may be how long to continue treatment intended to prevent relapse and recurrence. Although we did not set out to answer this question, we believe that some evidence suggests that the risk of relapse decreases over time. For example, one placebo-controlled study compared 14 weeks, 38 weeks, and 50 weeks of continuation treatment with fluoxetine or placebo.122 Relapse rates were significantly lower for patients on fluoxetine than for those on placebo at 14 and 38 weeks, but not at 50 weeks. This finding implies some degree of diminishing returns for longer treatment, although more work is needed to address this question.

Results for Managing Treatment-Resistant or Recurrent Depression

Overall, approximately 40 percent of patients do not achieve clinical response with initial treatment; approximately 10 percent to 15 percent of patients discontinue treatment because of adverse events. Three studies addressed the comparative efficacy or effectiveness among second-generation antidepressants in patients with treatment-resistant depression. These studies came to inconsistent conclusions, although some of these inconsistencies may be partially explained by variations in the quality and applicability (i.e., internal and external validity) of these investigations. We rated the strength of evidence as moderate.

The best evidence comes from the STAR-D trial.146 Although this was an open-label study, an interviewer blinded to the treatment arm did the outcomes assessment. Among patients who did not have a remission or could not tolerate citalopram, the investigators reported that bupropion SR, sertraline, and venlafaxine XR had similar effectiveness and tolerability as second-line treatment. Although the ARGOS study, another effectiveness study, found venlafaxine to be superior to citalopram, fluoxetine, mirtazapine, paroxetine, and sertraline as a second-step treatment,140 we could not determine whether raters were blinded to treatment allocation, potentially limiting the ARGOS conclusions.

No study specifically compared one antidepressant with another in patients experiencing a depressive relapse (i.e., loss of response during continuation-phase treatment) or recurrence (i.e., loss of response during maintenance-phase treatment). Although STAR-D included patients with a history of recurrent depressive episodes at study entry, the analyses involved patients whose acute-phase treatment of the current episode had been unsuccessful; it did not include patients who initially responded and then lost response.

Results for Treating Patients with Depression and Accompanying Symptoms

The range of physical and psychological symptoms that accompany depressive disorders is wide. We found limited information for many accompanying symptom clusters; however, various symptoms may not have the same importance for clinical care. Our analyses concerned the efficacy and effectiveness of these pharmaceuticals for treating depression in patients with such symptoms and treating the accompanying symptoms in patients with depression. Generally, the strength of evidence for anxiety was moderate; for all other symptom clusters, either the strength of evidence was low or no evidence was found.

The most common and distressing accompanying symptoms can be considered the highest priority for further studies. Research involving depressed populations that may be more generalizable suggests that common presenting symptom clusters in both primary care and psychiatric clinics are fatigue and loss of energy (for which no studies were identified), anxiety, insomnia, and pain and other somatic symptoms.147

Anxiety. Although anxiety is not a discrete MDD subtype,242 evidence suggests that it may present as a distinctive cluster243 and be associated with more persistent depression.244–246 For patients with high anxiety associated with MDD, we found no difference in patients' depression treatment response by either antidepressant class or specific medication. These findings are consistent with a recent nonsystematic review sponsored by a pharmaceutical manufacturer.247 Although all the included studies identified a high anxiety group, the definitions employed by investigators varied markedly.

In addition, for patients with anxiety symptoms associated with depression, we found no identifiable difference in anxiety response by either antidepressant class or specific medication. Therefore, the current evidence suggests that improvement in both depressive and anxiety symptoms is likely with adequate dosing of antidepressant treatment, but evidence of clear benefit for one antidepressant over another is lacking.

Insomnia. For patients with depression and accompanying insomnia, we found no clear evidence of differences in depressive response or insomnia response by antidepressant class or specific medication.

Indirect evidence from studies that did not identify insomnia subgroups82, 96 provides results that are consistent with improved sleep quality for trazodone compared with fluoxetine82 and venlafaxine.96 Higher quality, direct evidence, however, was limited. Among the three studies that identified an insomnia group, only one trial involved one of these three antidepressants; it suggested greater benefit for nefazodone than fluoxetine.81 The two other studies, which compared SSRIs, produced mixed results.41, 151

Studies were limited by varying and incomplete assessment of insomnia and by insensitive outcome measures. Most studies used a sleep measure that is a part of HAM-D, with three items producing a total sleep score ranging from 0 to 6. The clinical meaningfulness of the small reported differences in this outcome measure is unclear.

Melancholia. Information about outcomes in the melancholic subgroup was limited to three comparative trials; they addressed only the effect on depressive outcomes. Evidence did not consistently support a difference in outcome by either class or medication.

Pain. Patients with depression commonly experience physical symptoms; the majority are pain symptoms. In addition, depression is prevalent among patients with chronic pain disorders.248 We identified few trials addressing the use of second-generation antidepressants for treatment of pain accompanying depression. All the trials we identified tested duloxetine, an SSNRI; two compared duloxetine with paroxetine, and the other three were placebo-controlled trials.

Studies were limited by exclusion of patients with common chronic pain conditions, failure to analyze subgroups with moderate to severe pain, and failure to report outcomes in a clinically meaningful way. No study included patients with comorbid depression and chronic pain, probably the group of most interest to clinicians. The only study that required patients to have pain of at least mild intensity for inclusion excluded those with a history of any diagnosed painful condition, including common pain disorders such as migraine and arthritis.154

The difference in mean pain scores between duloxetine and placebo groups was statistically significant, but probably not clinically meaningful, in three studies; all used a 100 mm pain intensity visual analog scale (VAS) as the outcome measure.153, 155, 156 Prior research has produced different estimates of the minimum clinically important difference on the VAS, ranging from 9 mm to 30 mm.109, 249–251 No study included in this review reported the proportion of patients achieving a clinically important improvement in pain scores.
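A responder analysis of the kind described here could be computed as in the sketch below. The patient scores are invented purely for illustration and do not come from any trial in this review; only the 9 mm and 30 mm thresholds are taken from the text.

```python
def responder_proportion(baseline_mm, followup_mm, threshold_mm):
    """Share of patients whose VAS pain score fell by at least threshold_mm."""
    improvements = [b - f for b, f in zip(baseline_mm, followup_mm)]
    responders = sum(1 for change in improvements if change >= threshold_mm)
    return responders / len(improvements)


# Hypothetical 100 mm VAS scores for six patients (illustrative only).
baseline = [62, 55, 70, 48, 66, 59]
followup = [50, 47, 35, 40, 60, 28]

# The proportion of responders depends strongly on which threshold is chosen.
at_9_mm = responder_proportion(baseline, followup, 9)
at_30_mm = responder_proportion(baseline, followup, 30)

print(round(at_9_mm, 2), round(at_30_mm, 2))
```

Because the published thresholds span 9 mm to 30 mm, reporting the proportion at a prespecified threshold conveys more than a difference in group means alone.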

Psychomotor changes. The evidence addressing depression outcomes in patients with psychomotor changes is limited to a single trial. It found that sertraline was more efficacious than fluoxetine in patients with psychomotor agitation but not in those with psychomotor retardation.149

Somatization. The evidence directly addressing treatment of somatization in patients with depression is limited to a single trial that found similar effectiveness for three SSRIs.49 Conclusions from this study are limited because the investigators did not analyze information for a subgroup with high somatization.

Results for Harms (Adverse Events) and Adherence

On average, 61 percent of patients experienced at least one adverse event during the course of the studies we reviewed. Nausea, headache, diarrhea, fatigue, dizziness, sweating, tremor, dry mouth, and weight gain were commonly reported adverse events.

Although the spectrum of adverse events is similar among second-generation antidepressants, the frequencies of specific adverse events differ among individual drugs. For example, venlafaxine had a higher rate of nausea and vomiting than the SSRIs as a class. Also, compared with other second-generation antidepressants, paroxetine frequently led to higher rates of sexual side effects, mirtazapine and paroxetine to greater weight gain, and sertraline to a higher rate of diarrhea. Such differences did not lead to substantial differences in discontinuation rates.

For some patients, these differences might well be clinically important. For example, the choice of an agent with a low rate of sexual side effects might increase adherence in patients who consider sexual dysfunction an intolerable adverse event.

The evidence on the comparative risk for rare but severe adverse events such as suicidality, hyponatremia, seizures, or serotonin syndrome was insufficient to draw firm conclusions. The risk of such harms should be kept in mind during any course of treatment with a second-generation antidepressant.

Efficacy studies did not indicate any differences in adherence across agents. One observational study indicated that extended-release formulations might have a better adherence rate than immediate-release medications. This finding, however, is likely more attributable to differences in dosing regimens than to differences in efficacy and harms. The evidence is insufficient to draw any conclusions about differences in adherence in effectiveness studies.

Results for Population Subgroups

In efficacy and effectiveness studies, treatment effects were similar between different age groups and between males and females. Despite the importance of the harms of second-generation antidepressants, especially in the elderly, little evidence is available on this topic. We found very limited head-to-head evidence assessing potential differences in efficacy in different racial groups or in patients with common comorbidities. Specifically for different racial groups and for patients with common comorbidities, the evidence is sparse and mainly limited to placebo-controlled trials assessing the general efficacy of second-generation antidepressants in such subgroups. Some of these studies indicate that the general efficacy of second-generation antidepressants in patients with serious comorbidities (e.g., cancer, substance abuse) is limited.

Many of these studies had serious methodological flaws, and those that found no meaningful differences may not have been powered to detect them. Differences in study populations, cutoff points on scales, and drug dosages do not allow analysts to compare initial treatment effects across individual placebo-controlled trials to assess differences in subgroups other than those defined by age and sex.

Results for Dysthymia and Subsyndromal Depression

The evidence is sparse (strength of evidence for comparative efficacy is low for dysthymia and subsyndromal depression). No conclusions can be drawn on comparative efficacy or effectiveness.

For the treatment of dysthymia, the evidence on general efficacy is limited to fluoxetine, paroxetine, and sertraline; for subsyndromal depression, the evidence covers only citalopram, fluoxetine, and paroxetine. Results are mixed. For dysthymia, the two largest placebo-controlled studies did not detect any differences between fluoxetine or paroxetine and placebo for treating patients younger than 60 years.100, 113 Similarly, the evidence on the general efficacy in subsyndromal depression is limited to few studies with mixed results.

Future Research

We identified multiple areas that require additional research to enable clinicians and researchers to draw firm conclusions about the comparative efficacy, effectiveness, and harms of second-generation antidepressants.

Efficacy and Effectiveness

Future research has to establish reliably the general efficacy of second-generation antidepressants for the treatment of dysthymia and subsyndromal depression. Ideally, multiple-arm, head-to-head trials, including placebo groups, should evaluate the general and comparative efficacy of second-generation antidepressants in patients with these conditions.

Effectiveness studies with a high degree of applicability to primary care populations are generally lacking for most drugs. Effectiveness trials with less stringent eligibility criteria, health outcomes, long study durations, and a primary care population would be valuable to determine whether existing differences among second-generation antidepressants are clinically meaningful in “real world” settings. These trials should be powered to assess minimal clinically significant differences. Furthermore, they could provide valuable information on differences in adherence among second-generation antidepressants.

Future research should also focus on differences in efficacy and effectiveness in subgroups such as the very elderly or patients with various common comorbidities.

Prevention of Relapse and Recurrence

More evidence is needed regarding the most appropriate duration of antidepressant treatment for maintaining remission. Such studies should also evaluate whether different formulations (i.e., controlled release vs. immediate release) lead to differences in adherence and subsequently to differences in relapse or recurrence.

Additionally, although most trials maintained the dose used in acute-phase treatment throughout continuation and maintenance treatment, little is known about the effect of drug dose on the risk of relapse or recurrence.

Management of Treatment-Resistant or Recurrent Depression

Given the fact that approximately 40 percent of patients do not respond to initial treatment, an important future research agenda is to explore whether combinations of antidepressants at treatment initiation lead to better response rates than single agents alone. Furthermore, additional head-to-head evidence is needed to resolve whether one second-generation antidepressant is better than another in patients who either did not respond or could not tolerate a first-line treatment.

Likewise, evidence is lacking to determine whether one antidepressant is better than another in patients who cannot maintain remission during continuation- or maintenance-phase therapy. The role of other depression treatments, such as psychotherapy, vagal nerve stimulation, light therapy, and alternative medicines as substitutes or complements to pharmaceutical management also needs to be better understood.

Accompanying Symptoms

More research is needed to evaluate differences between second-generation antidepressants in populations with accompanying symptoms such as anxiety, insomnia, pain, and fatigue. Given that outcomes for depression treatment do not differ substantially between specific antidepressants, information about treatment of accompanying symptoms is key for clinicians who must select among many antidepressant drugs.

Study questions must be based on a clinically meaningful metric that gives preference to symptoms of high frequency or those that cause a high level of distress. Each subgroup must be clearly and consistently defined (e.g., a high anxiety group should be identified with a consistent definition). Analyses should then be done in such subgroups, using similarly defined outcomes to allow results to be compared across studies and across subgroups. Investigators should report the proportions of patients who reach a predefined threshold for clinically meaningful improvement.

The absence of any trials conducted in a population with fatigue or loss of energy presents a clinically important void in the literature. In addition, future studies of depression with accompanying pain and other somatic symptoms should identify clinically relevant subgroups of patients with moderate to severe pain or other symptoms.

Adverse Events

Large, well-conducted observational studies are needed to assess reliably the comparative risks of second-generation antidepressants with respect to rare but serious adverse events such as suicidality, hyponatremia, hepatotoxicity, seizures, cardiovascular adverse events, and serotonin syndrome. Furthermore, these studies need to evaluate whether very elderly patients have an excess risk of severe adverse events with any second-generation antidepressant.


As this report was going to press, a relevant study addressing sequential treatment steps among patients who did not obtain remission with initial acute-phase treatment was published. We were unable to incorporate this study fully into this report, but we found its results important in light of the general lack of high-quality evidence for treating patients who do not obtain remission with initial treatments.

The Sequenced Treatment Alternatives to Relieve Depression (STAR-D) trial - described in detail in Key Question 2b - consisted of a series of RCTs examining sequential treatment steps in patients who did not obtain remission or could not tolerate previous treatments. Key Question 2b detailed the medication switch arms of the second-step treatment in which all patients in the analysis had failed initial treatment with citalopram and were randomized to second-step treatment with bupropion SR (N = 239), sertraline (N = 238), or venlafaxine XR (N = 250); this analysis found no statistically significant differences in remission rates between second-step treatments.146

The more recently published study describes the acute and longer-term outcomes associated with all four treatment steps.252 Patients not achieving remission or unable to tolerate a treatment step were encouraged to move to the next step; patients achieving acceptable benefit could enter a 12-month follow-up phase. All patients (N = 3,671) received citalopram in Step 1. Step 2 and Step 3 treatments were randomly assigned using an equipoise stratified randomized design. Under this design, 1,439 patients were randomized in Step 2, which included seven possible treatment alternatives (bupropion SR, sertraline, venlafaxine XR, cognitive therapy, citalopram plus bupropion, citalopram plus buspirone, or citalopram plus cognitive therapy). Step 3 randomized 390 patients to switch to mirtazapine or nortriptyline or to receive augmentation with lithium or triiodothyronine (T3). Step 4 used only a single randomization; 123 patients were randomized to tranylcypromine or venlafaxine XR plus mirtazapine.

Overall, 67 percent of patients achieved remission. Remission rates were 36.8 percent for Step 1, 30.6 percent for Step 2, 13.7 percent for Step 3, and 13.0 percent for Step 4. For patients achieving acceptable benefits who continued on in the 12-month follow-up study, relapse rates were 40.1 percent, 55.3 percent, 64.6 percent, and 71.1 percent for those achieving benefit in Steps 1, 2, 3, and 4, respectively. In all steps, patients achieving remission (Quick Inventory of Depressive Symptomatology-Self Report [QIDS-SR-16] ≤ 5) were less likely to relapse than patients not achieving remission (acceptable benefit but QIDS-SR-16 > 5).
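One way to see how the per-step remission rates relate to the 67 percent overall figure is to compound them across the four steps. The sketch below assumes that every patient who does not remit proceeds to the next step and that no one drops out; this is an assumption of the sketch, not something stated in this section, and the function name is illustrative.

```python
# Per-step remission rates for Steps 1 through 4 (from the text).
step_remission = [0.368, 0.306, 0.137, 0.130]


def cumulative_remission(rates):
    """Cumulative remission if all non-remitters move on to the next step."""
    still_not_remitted = 1.0
    for rate in rates:
        still_not_remitted *= 1.0 - rate
    return 1.0 - still_not_remitted


overall = cumulative_remission(step_remission)
print(round(overall * 100))
```

Under that no-dropout assumption the compounded rate rounds to 67 percent, which matches the overall figure reported above.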

