Figure 1. Analytic framework for evidence report
The Agency for Healthcare Research and Quality (AHRQ), through its Evidence-Based Practice Centers (EPCs), sponsors the development of evidence reports and technology assessments to assist public- and private-sector organizations in their efforts to improve the quality of health care in the United States. The Centers for Disease Control and Prevention (CDC) requested and provided funding for this report. The reports and assessments provide organizations with comprehensive, science-based information on common, costly medical conditions and new health care technologies. The EPCs systematically review the relevant scientific literature on topics assigned to them by AHRQ and conduct additional analyses when appropriate prior to developing their reports and assessments.
To bring the broadest range of experts into the development of evidence reports and health technology assessments, AHRQ encourages the EPCs to form partnerships and enter into collaborations with other medical and research organizations. The EPCs work with these partner organizations to ensure that the evidence reports and technology assessments they produce will become building blocks for health care quality improvement projects throughout the Nation. The reports undergo peer review prior to their release.
AHRQ expects that the EPC evidence reports and technology assessments will inform individual health plans, providers, and purchasers as well as the health care system as a whole by providing important information to help improve health care quality.
We welcome comments on this evidence report. They may be sent by mail to the Task Order Officer named below at: Agency for Healthcare Research and Quality, 540 Gaither Road, Rockville, MD 20850, or by e-mail to epc@ahrq.gov.
Carolyn M. Clancy, M.D.
Director
Agency for Healthcare Research and Quality
Julie Louise Gerberding, M.D., M.P.H.
Director
Centers for Disease Control and Prevention
Jean Slutsky, P.A., M.S.P.H.
Director, Center for Outcomes and Evidence
Agency for Healthcare Research and Quality
Beth A. Collins Sharp, Ph.D., R.N.
Director, EPC Program
Agency for Healthcare Research and Quality
Gurvaneet Randhawa, M.D., M.P.H.
EPC Program Task Order Officer
Agency for Healthcare Research and Quality
The authors gratefully acknowledge Jennifer Farmer and Cara O'Brien for assistance with abstract screening; Georgette De Jesus for help with abstract screening and for over-reading of data abstractions; Udita Patel and R. Julian Irvine for assistance with project management; Greg Samsa for reading and commenting on portions of the draft report; Linda Bradley and Glenn Palomaki for assistance with the material on analytic validity; and Gurvaneet Randhawa, AHRQ Task Order Officer, for overall support.
Objectives: To determine if testing for cytochrome P450 (CYP450) polymorphisms in adults entering selective serotonin reuptake inhibitor (SSRI) treatment for non-psychotic depression leads to improvement in outcomes, or if testing results are useful in medical, personal, or public health decisionmaking.
Data Sources: We searched MEDLINE®, the Cochrane Database of Abstracts of Reviews of Effects, PsychInfo, HealthSTAR, and CINAHL, and reviewed the reference lists of included articles and relevant review articles and meta-analyses for eligible studies. We also included documents from the U.S. Food and Drug Administration (FDA) that could be publicly accessed.
Review Methods: We developed an analytic framework and identified key questions to guide the review process. Project-specific inclusion/exclusion criteria were also developed and were used by paired researchers independently to review both abstracts and full-text articles; both researchers were required to agree on inclusion status at the full-text stage. Abstractors evaluated each included article for factors affecting internal and external validity.
Results: A review of 1,200 abstracts led to the final inclusion of 37 articles. The evidence indicates the existence of tests with high sensitivity and specificity for detecting only a few of the more common known polymorphisms of 2D6, 2C19, 2C8, 2C9, and 1A1. There is mixed evidence regarding the association between CYP450 genotypes and SSRI metabolism, efficacy, and tolerability in the treatment of depression, mainly from a series of heterogeneous studies in small samples. There are no data regarding: (a) if testing for CYP450 polymorphisms in adults entering SSRI treatment for non-psychotic depression leads to improvement in outcomes versus not testing, or if testing results are useful in medical, personal, or public health decisionmaking; (b) if CYP450 testing influences depression management decisions by patients and providers in ways that could improve or worsen outcomes; or (c) if there are direct or indirect harms associated with testing for CYP450 polymorphisms or with subsequent management options.
Conclusions: There is a paucity of good-quality data addressing the questions of whether testing for CYP450 polymorphisms in adults entering SSRI treatment for non-psychotic depression leads to improvement in outcomes, or whether testing results are useful in medical, personal, or public health decisionmaking.
Major depressive disorder (MDD) is widely distributed in the population and is associated with substantial symptom severity and role impairment. It is the fourth leading cause of disease burden, accounting for 4.4 percent of total disability-adjusted life years in the year 2000, and it causes the largest amount of non-fatal burden, accounting for almost 12 percent of all total years lived with disability worldwide. In naturalistic studies of followup of depression, almost 60 percent of patients show either residual symptoms or no response to treatment at the end of 1 year.
Selective serotonin reuptake inhibitors (SSRIs) have become first-line drugs in the treatment of depression partly because of their better tolerability and relative safety in overdose compared with older tricyclic antidepressants. The response rate to SSRIs in short-term trials is approximately 50 to 60 percent. As with other antidepressants, a primary limitation of SSRIs is time to response, with most SSRIs showing a benefit only after 2 to 4 weeks of adequate dosing. In addition, even this class of drugs is associated with intolerable adverse effects necessitating discontinuation of medication in 12 to 15 percent of patients in short-term studies. Because of variable efficacy and tolerability among patients, SSRIs are usually titrated through a process of trial and error, potentially further lengthening the time to response.
The cytochrome P450 (CYP450) enzymes are an isoenzyme superfamily that catalyzes the oxidation of many drugs and chemicals. The CYP450 enzymes - primarily CYP2D6, CYP2C19, and CYP2C9 - are involved in the metabolism of all of the SSRIs. Genetic polymorphisms have been identified for some of the CYP450 enzyme genes, with inactivating alleles that may decrease or eliminate enzyme activity, or multiple copies of functional genes that may increase enzyme activity. There has been increasing interest in the role of genetic polymorphisms of CYP450 enzymes in the metabolism of SSRIs, and several tests for CYP450 polymorphisms are now available. A significant recent development was the approval by the U.S. Food and Drug Administration (FDA) of the Roche AmpliChip® CYP450 Test for this purpose. This product delivers the results of testing for CYP2D6 and CYP2C19 polymorphisms in the form of “predicted phenotypes” - poor metabolizers (PMs), intermediate metabolizers (IMs), extensive metabolizers (EMs), and ultra-rapid metabolizers (UMs). The availability of these tests has brought the field of pharmacogenetics to the threshold of influencing clinical practice.
The Agency for Healthcare Research and Quality (AHRQ), on behalf of the Centers for Disease Control and Prevention (CDC) Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Project, requested the development of the present evidence report, which will be used to inform the EGAPP Working Group's formulation of evidence-based recommendations.
A team at the Duke Evidence-based Practice Center, composed of experienced investigators in health policy, psychiatry, and pharmacogenetics, developed the report, which provides a clear view of the current state of the science in CYP450 polymorphism testing in depression and, where research is now insufficient for policy decisionmaking, proposes a list of rational research priorities.
Working with AHRQ, the CDC, and members of the project's technical expert panel, we developed the following key research questions:
Question 1: Does testing for CYP450 polymorphisms in adults entering SSRI treatment for non-psychotic depression lead to improvement in outcomes, or are testing results useful in medical, personal, or public health decisionmaking? (overarching question)
Question 2: What is the analytic validity of tests that identify key CYP450 polymorphisms?
Question 3a: How well do particular CYP450 genotypes predict metabolism of particular SSRIs? Do factors such as race/ethnicity, diet, or other medications affect this association?
Question 3b: How well does CYP450 testing predict drug efficacy? Do factors such as race/ethnicity, diet, or other medications affect this association?
Question 3c: How well does CYP450 testing predict adverse drug reactions? Do factors such as race/ethnicity, diet, or other medications affect this association?
Question 4a: Does CYP450 testing influence depression management decisions by patients and providers in ways that could improve or worsen outcomes?
Question 4b: Does the identification of the CYP450 genotypes in adults entering SSRI treatment for non-psychotic depression lead to improved clinical outcomes compared to not testing?
Question 4c: Are the testing results useful in medical, personal or public health decisionmaking?
Question 5: What are the harms associated with testing for CYP450 polymorphisms and subsequent management options?
We also developed a project-specific analytic framework that provides an explicit link between CYP450 testing and various health outcomes of importance to decisionmakers.
We searched MEDLINE® (1966-May 2006), the Cochrane Database of Abstracts of Reviews of Effects (DARE), PsychInfo, HealthSTAR, and CINAHL. Searches of these databases were supplemented by reviews of the reference lists contained in all included articles and in relevant review articles. Documents from the FDA that could be publicly accessed were also included. The searches yielded a total of 1,200 citations. Pairs of researchers independently reviewed each abstract and selected 140 for full-text review. Project-specific inclusion/exclusion criteria were developed, and both researchers were required to agree on inclusion status at the full-text stage. A total of 37 articles were included for data abstraction.
Evidence tables were developed, and data abstraction was carried out by one investigator and checked for accuracy and completeness by another. At the data abstraction stage, researchers were asked to evaluate each included article for factors affecting internal and external validity, using the ACCE criteria for analytic validity (for Question 2) and guidelines from the Oxford Centre for Evidence-based Medicine (for all other key questions).
The draft version of this report was reviewed by a panel of experts vetted by AHRQ, and reviewer comments and suggestions have been incorporated into the final report.
Results are summarized below by key question.
No studies were identified that directly addressed any aspect of Question 1.
For Question 2 (analytic validity), we identified 12 published articles and two documents from the FDA website (on performance of the Roche AmpliChip®) that described methods for genotyping various CYP450 enzymes (nine pertaining to CYP2D6, three to CYP2C19, two to CYP2C8, and one each to CYP2C9 and CYP1A1). Of the studies of CYP450 enzymes most relevant to SSRI metabolism (CYP2D6, CYP2C19, and CYP2C9), only four used the gold standard comparison (DNA sequencing), while the others were method comparisons. Notably, very few of the known polymorphisms of the CYP enzymes were tested. Sensitivity and specificity were high (in the range of 94 to 100 percent) for these studies, but confidence intervals for analytic sensitivity for most genotypes were very wide because of the relatively few samples tested. Gene deletion and duplication studies had lower sensitivity and specificity, further compounded by the limitation that there is no accepted gold standard for such tests.
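The effect of sample size on the width of a confidence interval can be illustrated with a short calculation. The sketch below uses the Wilson score interval for a binomial proportion; the interval method and the sample counts are assumptions chosen for illustration, since the summary does not state how the included studies computed their intervals or how many samples each tested.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts: every variant-positive sample is detected correctly,
# so the observed analytic sensitivity is 100 percent in each case.
for n in (10, 50, 500):
    lo, hi = wilson_interval(n, n)
    print(f"{n}/{n} detected: 95% CI {lo:.3f} to {hi:.3f}")
```

When every sample is detected, the lower bound reduces to n / (n + z²), which is roughly 0.72 for 10 samples and 0.99 for 500. An observed sensitivity of 100 percent in a handful of samples is therefore compatible with a much lower true sensitivity.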
For Question 3a (SSRI metabolism), sixteen studies met our inclusion criteria, of which five were conducted in healthy adults after a single dose of an SSRI. Of these, three showed that, compared with EMs, CYP2C19 PMs have significantly higher area under the curve (AUC), longer half-life, and reduced oral clearance of the parent drug, together with significantly lower AUC and lower maximum plasma concentration (Cmax) of the metabolite of each drug (drugs studied were sertraline, fluoxetine, and citalopram). Similar results were found in a study of CYP2D6*10 (associated with PM status) in healthy volunteers after a single dose of paroxetine, while another study of CYP2D6 using multiple doses of paroxetine found no significant difference between PMs and EMs. The remaining 11 studies were in clinical patients in treatment with SSRIs, were heterogeneous, had small sample sizes, and showed mixed results with respect to the association between CYP2D6/CYP2C9/CYP2C19 polymorphisms and SSRI blood levels.
For Question 3b (drug efficacy), we identified only five studies, three of which involved cohorts of depressed patients in antidepressant treatment. Of these, one found no differences in the proportion of responders among CYP2D6 EMs, IMs, and PMs treated with fluvoxamine. The second found that although plasma concentrations varied significantly between groups (with respect to 2D6 and 2C9 metabolizer status), levels above or below the lower limit of presumed therapeutic levels did not predict response. The third found no differences in depression scores between two groups, CYP2D6 UMs + EMs versus PMs + IMs, treated with paroxetine. The other two studies found significantly higher proportions of CYP2D6 PMs in non-responders to CYP2D6-metabolized SSRIs compared to the general population. The studies had several limitations, including non-randomized designs, inadequate power, studying several SSRIs together as a group, and not accounting for other genetic factors that may influence SSRI efficacy (e.g., genetic variations in serotonin transporter proteins or serotonin receptor proteins).
For Question 3c (adverse drug reactions), we identified nine studies, three of which reported adverse effects in CYP PMs only as a secondary finding. Of the other six, three reported no differences in rates of adverse effects between CYP2D6 PMs and EMs, while a fourth reported no differences in adverse effects between the combined PM + IM and EM + UM groups. One study found a greater prevalence of gastrointestinal adverse effects in PMs compared to EMs. This study also found that the combination of CYP2D6 polymorphism and serotonin receptor 5HT2A polymorphism predicted gastrointestinal adverse effects. Two studies found a significantly higher prevalence of PMs in depressed patients with adverse effects than in the general population. The studies had several limitations, including non-randomized design, inadequate power, and not accounting for other genetic factors that may influence SSRI tolerability (e.g., genetic variations in serotonin receptor proteins).
No studies were identified that directly addressed any aspect of Questions 4a, 4b, 4c, or 5.
As a complement to the evidence review, we constructed a basic decision model to consider the circumstances under which testing for CYP polymorphisms could improve clinical outcomes, or favorably impact costs. We examined four strategies: (1) use a non-CYP metabolized SSRI without testing; (2) test and choose a non-CYP or CYP metabolized SSRI based on the result; (3) test and choose the dose of a CYP metabolized SSRI based on the result; and (4) use a CYP metabolized SSRI without testing. In no plausible scenario was a testing strategy predicted to improve expected outcomes of treatment at 6 weeks. The efficacy of a test strategy could approach the efficacy of use of a non-CYP metabolized drug, although this required the condition that a high correlation exist between genotype and phenotype (metabolizer status), as well as between phenotype and clinical outcomes. Current evidence does not support the conclusion that such high correlations apply. Moreover, the cost of testing is not offset by treatment savings if treatment duration is less than approximately 9 months.
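The cost statement above rests on a break-even comparison between a one-time test cost and savings that accrue over the duration of treatment. The inputs to the decision model are not given in this summary, so the following is only a minimal sketch of that reasoning. It assumes a constant monthly saving among tested patients, and the function names and dollar figures are hypothetical placeholders rather than values from the model.

```python
def breakeven_months(test_cost: float, monthly_saving: float) -> float:
    """Months of treatment needed before cumulative savings cover the test cost."""
    if test_cost < 0:
        raise ValueError("test_cost must be non-negative")
    if monthly_saving <= 0:
        # With no saving per month, testing never pays for itself.
        return float("inf")
    return test_cost / monthly_saving


def net_cost_of_testing(test_cost: float, monthly_saving: float, months: float) -> float:
    """Positive result: testing still costs more than it has saved after `months`."""
    return test_cost - monthly_saving * months


# Hypothetical placeholder inputs, NOT taken from the report's decision model.
TEST_COST = 300.0
MONTHLY_SAVING = 25.0

print(breakeven_months(TEST_COST, MONTHLY_SAVING))
print(net_cost_of_testing(TEST_COST, MONTHLY_SAVING, months=6))
```

Under this simplification, any treatment course shorter than the break-even duration leaves testing as a net cost. That is the form of the conclusion stated above, although the actual model may weigh additional factors.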
Our literature review revealed a paucity of high-quality clinical studies addressing the key questions. We did not find a single prospective study of CYP450 genotyping and its relationship to clinical outcomes. General limitations of the available evidence include:
Most studies were small, poor-quality studies examining prevalence rates of certain genotypes in a sample or examining the correlation between various genotypes and limited clinical outcomes, such as response or adverse effects.
There were no randomized studies of alternative testing strategies.
Many reports did not take into account concurrent medications. No studies examining interactions between CYP polymorphisms and CYP inhibiting or CYP inducing drugs were identified.
Several studies looked at limited genotypes and did not account for the fact that more than one CYP enzyme may be involved in the metabolism of a specific SSRI.
Several studies grouped together multiple SSRIs, or SSRIs with other antidepressants such as tricyclics.
Genetic factors affecting serotonin receptor proteins, membrane transporters, and signal transduction molecules have important pharmacodynamic effects that could affect SSRI efficacy or tolerability. These were not taken into account in any of the studies.
The rated quality of data did not improve even when we were generous in our inclusion criteria and included studies examining SSRI treatment of conditions other than depression, or when we included studies that covered other antidepressants in addition to SSRIs.
The available data indicate good analytic validity for testing for CYP2D6 and CYP2C19 polymorphisms, but for a limited number of variants, with rare variants being tested infrequently. The data fail to support a clear correlation between CYP polymorphisms and SSRI levels, SSRI efficacy, or tolerability. There are no data regarding whether testing leads to improved outcomes versus not testing in the treatment of depression; whether testing influences medical, personal, or public health decisionmaking; or whether any harms are associated with testing itself or with subsequent management options.
We propose the following conceptual model to guide future research in cytochrome P450 (CYP450) polymorphism testing for depression management. Broadly speaking, the rationale behind CYP450 testing in patients with non-psychotic depression is as follows:
(a) Major depressive disorder is a significant public health problem.
(b) While SSRIs are the first-line treatment for depression, they are associated with a high rate of non-response to treatment; improving response rates to SSRI treatment is therefore a potential opportunity to improve public health.
(c) One factor that makes identification of the optimal SSRI treatment difficult in a specific clinical situation is the CYP polymorphism-associated differences between patients in the rate of metabolism of SSRIs.
(d) CYP450 testing can be used to predict the rate of SSRI metabolism (i.e., to classify patients as PMs, IMs, EMs, or UMs) and, thus, potentially can reduce the amount of trial and error required to select the optimal SSRI in a specific clinical situation.
(e) The better CYP450 testing predicts metabolizer status, the greater the potential of CYP450 testing to improve the process of identifying the optimal SSRI treatment.
(f) However, the more that factors other than CYP450 enzymes affect the metabolism of SSRIs, the less useful CYP450 testing will be.
(g) Because depression is not often acutely life-threatening and SSRIs are rarely associated with life-threatening adverse effects, the main impact of CYP450 testing is likely to be in reducing the time to find the optimal SSRI, and in reducing the likelihood of adverse effects that would have been expected to occur with a suboptimal SSRI that might have been prescribed in the absence of CYP450 testing, thereby potentially reducing disease-management costs.
(h) Finally, the impact of reducing the time to find the optimal SSRI and reducing the likelihood of SSRI-related adverse effects during the initial dosing period is strong enough to be important to patients.
Although some information regarding the above rationale exists, as a whole it is not sufficient to draw firm conclusions about whether this rationale, while intuitively reasonable, is in fact true. Based on this model, two types of studies are proposed. The first type would better elucidate individual points in the rationale. For example, regarding points (e), (f), and (g), the suggested study design would be a properly sized (likely to be large) randomized trial of CYP genotyping-guided treatment versus treatment as usual. The second type of study would encompass multiple steps in the above rationale. Examples include a study that would involve linking a specific genotype to SSRI type and dose, or a “practical clinical trial,” which would involve randomizing clusters (e.g., clinicians, practices, or regions) rather than patients to have genotyping available or not available. This would provide a test of the overarching question, “What difference does having genotyping available make in clinical practice?”
The short list of papers addressing the key questions clearly demonstrates the lack of sufficient evidence for incorporation of any of these tests into guidelines for clinical practice in depression management. There is a critical need to carry out research to answer the key questions in this report. If shown to be useful, CYP450 genotyping will make the most impact by reducing the trial and error currently inherent in SSRI treatment, thereby decreasing morbidity and improving quality of life in patients with non-psychotic depression.
Major depressive disorder (MDD) is widely distributed in the population and is usually associated with substantial symptom severity and role impairment. The lifetime prevalence of MDD by recent population study estimates is as high as 16 percent, with an annual prevalence rate of approximately six percent.1 The condition is twice as common in females as in males. MDD is the leading cause of disability in the United States and is predicted to become the second leading cause of disability worldwide in the next 15 years.2 Depression is the fourth leading cause of disease burden, accounting for 4.4 percent of total disability-adjusted life years in the year 2000, and it causes the largest amount of non-fatal burden, accounting for almost 12 percent of all total years lived with disability worldwide.3 The suicide rate associated with MDD is approximately four percent.4
The course of MDD differs a great deal among affected individuals. The average age of onset of major depression is in the mid-20s, but the first episode may occur at any age. The disease course is highly variable, and generally the number of previous episodes predicts the likelihood of having another episode. For example, 50 to 60 percent of patients with a first episode of depression will have a second episode, and those with two episodes have a 70 percent chance of having a third. After the third episode, the chance of having a fourth is 90 percent.5 Data for over 15,000 employees of a major U.S. corporation showed that depressive illness was associated with a mean of 9.86 annual sick days, significantly more than any of the other medical conditions examined.6 In a naturalistic study of followup of depression (in which treatment was not controlled by the investigators), 20 percent of patients continued to show no evidence of achieving remission, 40 percent showed partial remission, and 40 percent had no evidence of mood disorder at the end of 1 year.7 In the recently completed STAR*D trial, the response rate (rate of improvement in symptoms) was 47 percent and the remission rate (rate of substantial improvement, with only minimal residual symptoms) only 33 percent after 14 weeks of treatment with a selective serotonin reuptake inhibitor (SSRI).8 The high rate of non-response in MDD is one of the biggest challenges in psychiatry as it impacts disease burden.
The advent of the SSRI class of drugs has dramatically changed the landscape of depression treatment. SSRIs have quickly superseded the older tricyclic antidepressants to become first-line drugs in the treatment of depression. The SSRIs currently available on the market include fluoxetine, paroxetine, fluvoxamine, sertraline, citalopram, and escitalopram. Of the top 25 prescription drugs in the U.S. in 2004, two were SSRIs: Zoloft® (sertraline), with over 29 million prescriptions, and Lexapro® (escitalopram), with over 22 million prescriptions.9 Of the SSRIs, fluoxetine and (more recently) citalopram are available in generic forms. Fluoxetine is the only SSRI with an active metabolite (in the form of norfluoxetine) that is more potent in serotonin reuptake inhibition than the parent compound and which is thought to play a significant role in therapeutic effect.10 Moreover, fluoxetine is a racemic mixture of S- and R-fluoxetine, with both enantiomers being approximately equipotent in serotonin reuptake inhibition. However, of the enantiomers of their respective metabolites, S-norfluoxetine has significant serotonin reuptake inhibition and is 20 times more potent than R-norfluoxetine.11
The popularity of SSRI drugs has been attributed to their better tolerability and relative safety in overdose, which is an important consideration when treating depressed patients who may become suicidal. However, SSRIs are not without drawbacks. In addition to the high rates of non-response described above, another limitation of SSRI treatment of depression is the time to response, with most SSRIs starting to show benefit only after 2 to 4 weeks of adequate dosing. In the STAR*D trial, the majority of patients who achieved response or remission did so after 8 weeks of SSRI treatment.8 In addition, even this class of drugs is associated with intolerable adverse effects (such as nausea, diarrhea, or headaches) necessitating discontinuation of treatment in 12 to 15 percent of patients in short-term studies.12, 13 Because of variable efficacy and tolerability among patients, the SSRIs are generally titrated by trial and error, potentially further lengthening the time to response. Additionally, when a drug is discontinued as a result of intolerability, it can result in a “lost opportunity” to treat a condition such as depression that is associated with stigma.
In general, no clear relationship has been found between blood concentration and clinical response with SSRIs at usual doses, nor has any threshold been identified that defines toxic concentrations. Citalopram showed no significant correlation between steady-state plasma concentration and final Montgomery-Åsberg Depression Rating Scale (MADRS) scores (measure of response) in two studies, with 13 to 16 patients and doses ranging from 5 to 60 mg/day.14, 15 Paroxetine studies have found no statistically significant differences in plasma levels of paroxetine between responders and non-responders.16 No correlation has been found between Hamilton Rating Scale for Depression (HAM-D) scores (measure of response) and plasma levels of paroxetine. These studies enrolled 16 to 44 subjects at doses of 20 to 60 mg/day.17–19 Similarly, studies of fluoxetine with small numbers of patients have suggested either no relationship between plasma concentration of the drug and clinical response,20, 21 or a curvilinear relationship between clinical response and plasma concentrations.22–24 The limitation of most of these studies is that they may not have been adequately powered. Perhaps the largest study of plasma concentration and response was a multicenter study of fluoxetine,25 in which plasma concentrations were available for 615 patients receiving 20 mg/day. No apparent relationship was observed between plasma concentration and drug response, and plasma concentrations of fluoxetine, norfluoxetine, and active moiety, as well as the fluoxetine/norfluoxetine ratio, did not differ between responders and non-responders. This is probably the only study with adequate power to be meaningful. However, one limitation of this study is that it was a fixed-dose study of fluoxetine at 20 mg/day, which leaves open the possibility that a dose-response relationship exists at higher doses or a threshold effect at lower doses.
Adverse effects of SSRIs, although not generally life-threatening, are typically dose-related. Therapeutic drug monitoring is not routinely recommended for SSRI treatment, but is thought to be of value for ascertaining compliance and for patients who do not respond to multiple SSRIs or who tolerate them poorly.26
| Metabolizer status | Genotype | Expected drug effects |
|---|---|---|
| UM (ultra-rapid) | More than two copies of active enzyme gene alleles | Usual doses may not lead to therapeutic drug concentration, possible non-response |
| EM (extensive) | Two copies of active enzyme gene alleles | Usual doses lead to expected drug concentrations and response |
| IM (intermediate) | Homozygous for two reduced activity enzyme gene alleles or are heterozygous for an inactive allele and a reduced activity allele | Drug effects between those of EMs and PMs |
| PM (poor) | Homozygous or compound heterozygous for deficiency alleles | Usual doses may lead to higher than expected drug concentrations and possibly adverse reactions |
Abbreviations: EM(s) = extensive metabolizer(s); IM = intermediate metabolizer; PM(s) = poor metabolizer(s); UM = ultra-rapid metabolizer
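The classification rules in the table above can be written as a small lookup. The sketch below is illustrative only: the category labels, the function name, and the decision to group alleles by activity category rather than by specific allele are assumptions, and it does not represent how any commercial test assigns predicted phenotypes. Combinations the table does not describe (for example, one active and one inactive allele) are reported as not covered rather than guessed.

```python
from collections import Counter
from typing import Sequence

# Hypothetical activity categories, one entry per gene copy carried.
ACTIVE, REDUCED, INACTIVE = "active", "reduced", "inactive"
NOT_COVERED = "not covered by the table"


def predicted_phenotype(alleles: Sequence[str]) -> str:
    """Map allele activity categories to the metabolizer status in the table."""
    counts = Counter(alleles)
    unknown = set(counts) - {ACTIVE, REDUCED, INACTIVE}
    if unknown:
        raise ValueError(f"unknown allele categories: {sorted(unknown)}")
    n = len(alleles)
    if counts[ACTIVE] > 2:
        return "UM"  # more than two copies of active alleles
    if n == 2 and counts[ACTIVE] == 2:
        return "EM"  # two copies of active alleles
    if n == 2 and counts[ACTIVE] == 0 and counts[REDUCED] >= 1:
        # Two reduced-activity alleles, or one inactive plus one reduced.
        # Simplification: the table says "homozygous" for the first case,
        # whereas this treats any two reduced-activity alleles alike.
        return "IM"
    if n == 2 and counts[INACTIVE] == 2:
        return "PM"  # homozygous or compound heterozygous for deficiency alleles
    return NOT_COVERED


print(predicted_phenotype([ACTIVE, ACTIVE, ACTIVE]))
print(predicted_phenotype([INACTIVE, REDUCED]))
print(predicted_phenotype([ACTIVE, INACTIVE]))
```

Returning an explicit "not covered" value keeps the sketch faithful to the table, which leaves several genotype combinations unassigned.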
| CYP2D6 variant | Predicted enzymatic function | Caucasian (Europe)27 | Caucasian (U.S.)27 | African-American27 | Swedish28 |
|---|---|---|---|---|---|
| *1 | Normal | 33–36% | 27–40% | 29–35% | 36.7% |
| *2 (35%) | Normal | 22–33% | 26–34% | 18–27% | 32.4% |
| *3 | Deficient | 1–4% | 1–1.4% | < 1% | 1.4% |
| *4 | Deficient | 12–23% | 18–23% | 6–9% | 24.4% |
| *5 | Deficient | 2–7% | 2–4% | 6–7% | 4.3% |
| *6 | Deficient | 1–1.4% | 1% | < 1% | 0.9% |
| *9 | Decreased activity | 0–2.6% | 2–3% | < 1% | - |
| *10 | Decreased activity | 1.4–2% | 2–8% | 3–8% | - |
| *17 | Decreased activity | < 1% | < 1% | 15–26% | - |
| *41 | Decreased activity | 20% | - | - | - |
| *1×N | Increased activity | < 1% | < 1% | 1.3% | - |
| *2×N | Increased activity | 1.5% | < 1% | 1.3% | - |
| *4×N | Deficient | < 1% | < 1% | 2.3% | - |
The CYP450 enzymes - primarily CYP2D6, CYP2C19, and CYP2C9 - are involved in the metabolism of all of the SSRIs.29 It is important to note that enzymes other than CYP are also involved in SSRI metabolism,30, 31 and for a given SSRI, more than one CYP enzyme may be involved in its metabolism.32, 33 Additionally, it is noteworthy that CYP2D6 with identical pharmacologic and molecular properties has been identified in microsomal fractions in the brain. Hence, CYP2D6 may potentially contribute to local clearance of psychotropics at the site of action. Differences in personality traits between extensive metabolizers (EMs) and PMs were noted in both Swedish and Spanish healthy white subjects, also suggesting that there may be an endogenous substrate for CYP2D6 in the brain.34
| CYP enzyme | Citalopram | Fluoxetine | Fluvoxamine | Paroxetine | Sertraline |
|---|---|---|---|---|---|
| CYP1A2 | +/- | + | +++ | + | +/- |
| CYP2C9/10 | ? | ? | ? | ? | + |
| CYP2C19 | ? | ++ | +++ | + | |
| CYP2D6 | + | +++ | + | +++ | + |
| CYP3A4 | ? | ++ | ++ | +/- | +/- |
Key to symbols: +/- = unlikely; ? = unknown; + = mild; ++ = moderate; +++ = substantial
SSRI inhibition of a CYP enzyme can raise serum concentrations of drugs metabolized by that enzyme. Because SSRIs are commonly prescribed to patients with medical comorbidities who may be on multiple other medications, CYP polymorphisms may increase the likelihood or severity of such drug-drug interactions.
Currently there are no well-defined strategies regarding SSRI selection in individual patients, and this may contribute to low efficacy and an increased risk of side effects. Knowledge about CYP polymorphisms could potentially aid the selection of a specific SSRI and/or guide decisions about appropriate dosing to optimize efficacy and tolerability for individual patients.
Several companies offer genetic testing for CYP450 polymorphisms using different test formats. These have mainly supported clinical trials and to a smaller extent patient management. The Blue Cross and Blue Shield Association Technology Evaluation Center report on CYP450 genotyping39 offers the most current compilation of such tests. Additionally, laboratories may develop and validate their own tests for CYP450 genotyping that are required to meet Clinical Laboratory Improvement Amendment (CLIA) standards. A significant recent development was the approval by the U.S. Food and Drug Administration (FDA) of the Roche AmpliChip® CYP450 Test for this purpose.40, 41 The AmpliChip® delivers the results of testing for CYP2D6 and CYP2C19 polymorphisms in the form of “predicted phenotypes,” classifying test subjects as PMs, IMs, EMs, or UMs. There are currently no guidelines regarding how testing for polymorphisms, and the knowledge such testing yields about predicted phenotypes, can be incorporated into clinical practice, and little information about whether such testing produces any real benefits at all.
There has been increasing interest in the role of genetic polymorphisms of CYP450 enzymes and metabolism of SSRIs in relation to clinical practice.29, 42, 43 The availability of an FDA-approved test for identifying CYP450 polymorphisms has brought the field of pharmacogenetics to the threshold of influencing clinical practice, as advertising in leading journals exposes physicians to the availability of tests. Given the prevalence of MDD and the prevalence of SSRI treatment of MDD, there is an urgent need to critically review the available literature using standard methods of evidence-based medicine to inform the future use of genetic testing in the treatment of MDD with SSRIs, as well as to guide research priorities in service to optimal patient care.
The Agency for Healthcare Research and Quality (AHRQ), on behalf of the Centers for Disease Control and Prevention (CDC) Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Project, requested the development of the present evidence report on “Testing for Cytochrome P450 Polymorphisms in Adults with Non-Psychotic Depression Treated with Selective Serotonin Reuptake Inhibitors (SSRIs).” The report will be used to inform the EGAPP Working Group's deliberations in a process similar to that used by the U.S. Preventive Services Task Force (USPSTF) to formulate evidence-based recommendations.
A team of investigators at the Duke Evidence-based Practice Center, with experience in health policy, psychiatry, and pharmacogenetics, developed the report. The approach included developing an analytic framework concerning testing for CYP450 polymorphisms and treatment related to depression and performing a comprehensive literature review linked to this framework. The report provides a clear view of the current state of the science in CYP450 polymorphism testing in depression, and - where research is now insufficient for policy decisionmaking - proposes a list of rational research priorities. Further, the report provides a framework for evaluating the general issue of genetic testing for decisionmaking in depression treatment.
This section of the report describes the basic methodology used to develop the evidence report, including topic assessment and refinement, analytic framework, literature search strategies and results, literature screening, quality assessment, data abstraction methods, and quality control procedures.
The two study sponsors, the Agency for Healthcare Research and Quality (AHRQ) and the Centers for Disease Control and Prevention (CDC), originally identified five key questions to be addressed by the report. The Duke research team clarified and refined the overall research objectives and key questions by first consulting with these sponsors and then by convening a national panel of technical experts to serve as advisors to the project. These experts were selected to represent relevant specialties, including genomics and neuropsychiatry. Members of the technical expert panel were:
Kathryn A. Phillips, Ph.D., University of California, San Francisco, CA (member of the CDC Evaluation of Genomic Applications in Practice and Prevention [EGAPP] Working Group)
Margaret Piper, Ph.D., M.P.H., B.C.B.S.A., Atlanta, GA (EGAPP Working Group member)
Ora Strickland, Ph.D., Emory University, Atlanta, GA (EGAPP Working Group member)
Dan G. Blazer, M.D., Ph.D., Duke University Medical Center, Durham, NC
Stephen Stahl, M.D., Ph.D., Neuroscience Education Institute, Carlsbad, CA
The Duke research team refined the key questions as follows:
Question 1 (overarching question): Does testing for cytochrome P450 (CYP450) polymorphisms in adults entering selective serotonin reuptake inhibitor (SSRI) treatment for non-psychotic depression lead to improvement in outcomes, or are testing results useful in medical, personal, or public health decisionmaking?
Question 2: What is the analytic validity of tests that identify key CYP450 polymorphisms?
Question 3a: How well do particular CYP450 genotypes predict metabolism of particular SSRIs? Do factors such as race/ethnicity, diet, or other medications affect this association?
Question 3b: How well does CYP450 testing predict drug efficacy? Do factors such as race/ethnicity, diet, or other medications affect this association?
Question 3c: How well does CYP450 testing predict adverse drug reactions? Do factors such as race/ethnicity, diet, or other medications affect this association?
Question 4a: Does CYP450 testing influence depression management decisions by patients and providers in ways that could improve or worsen outcomes?
Question 4b: Does the identification of the CYP450 genotypes in adults entering SSRI treatment for non-psychotic depression lead to improved clinical outcomes compared to not testing?
Question 4c: Are the testing results useful in medical, personal or public health decisionmaking?
Question 5: What are the harms associated with testing for CYP450 polymorphisms and subsequent management options?
The methodological approach to this review was designed to inform the EGAPP Working Group's deliberations in formulating evidence-based recommendations for the use of genetic testing in depression treatment decisionmaking. With input from the EGAPP Working Group, we developed a project-specific analytic framework (Figure 1).
Question 1 poses the overarching question of whether testing for CYP450 polymorphisms before SSRI treatment in non-psychotic depressed adults improves outcomes. Any evidence relating to this question would be “direct” evidence for the purpose of decisionmaking. In the absence of compelling direct evidence of this type, it is relevant to consider the component questions (Questions 2 through 5).
Question 2 examines the ability of clinically available tests for CYP450 polymorphisms to detect genetic variations in the CYP450 genes. This is a question of analytic validity that compares available tests to the gold standard of DNA sequencing. Issues related to harms due to misclassification are addressed in Question 5, below.
Questions 3a, 3b, and 3c concern the relationship between CYP genotypes or their predicted phenotypes and metabolism of individual SSRIs, efficacy of SSRIs in depression treatment, and adverse effects associated with SSRIs, respectively. These questions relate to clinical validity. Additionally, they address surrogate outcomes in depression management. Efficacy of SSRIs is a surrogate outcome measured by change in depression scores on depression rating scales such as the Hamilton Rating Scale for Depression (HAM-D)44 or the Montgomery-Åsberg Depression Rating Scale (MADRS).45
Questions 4a and 4c examine the influence of CYP genotyping on management decisions by patients or providers, and on medical, personal, or public health decisionmaking, respectively. Both of these are surrogate outcomes. Question 4b addresses whether such testing improves outcomes in depression management versus not testing. Examples of health outcomes of depression include health associated quality of life measured by the Medical Outcomes Study 36-Item Short Form Health Survey (SF-36),46 the Sheehan Disability Scale,47 or the Quality of Life Enjoyment and Satisfaction Questionnaire (QLESQ).48 Economic outcomes may include healthcare utilization or absenteeism related to depression. These questions concern decisionmaking at both individual and societal levels. These questions relate to clinical utility and raise the most important aspects of Question 1.
Question 5 addresses the potential harms associated with CYP testing itself and with subsequent management options. Potential harms could include labeling of patients as “treatment resistant” if they are found to be ultra-rapid metabolizers of relevant drugs, or harms could result from basing treatment decisions on inaccurate test results. As such, this question relates to both surrogate and health outcomes.
The primary source of literature was MEDLINE® (1966-May 2006). Additional databases searched included the Cochrane Database of Abstracts of Reviews of Effects (DARE), PsycINFO, HealthSTAR, and CINAHL. Searches of these databases were supplemented by reviews of the reference lists contained in all included articles and in relevant review articles. We also included data from the U.S. Food and Drug Administration (FDA) website describing the operating characteristics of the Roche AmpliChip® CYP450 Test.40, 41 On the advice of our technical expert panel, we did not undertake a comprehensive search of the grey literature.
The basic search strategy used the National Library of Medicine's Medical Subject Headings (MeSH) key word nomenclature developed for MEDLINE®. Searches were limited to articles published in English. The exact search string used is given in Appendix A.* The searches yielded a total of 1,200 citations, whose records are maintained in a ProCite (Thomson ISI ResearchSoft, Berkeley, CA) database.
Paired researchers from the Duke research team independently reviewed all abstracts and classified each as “included” or “excluded” according to project-specific criteria, which they developed. The exclusion criteria were:
Single case.
SSRI inhibition of CYP enzymes (unless the study examines how this is related to genotype).
Outside the scope of the report.
An abstract was included for further review if at least one of the paired reviewers recommended that it be included. A total of 140 abstracts were included for review at the full-text stage. Inter-rater reliability for include/exclude decisions at the abstract stage was tested by having five pairs of readers review 862 abstracts. Agreement (kappa statistic) ranged from -0.037 to 0.613.49
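The kappa statistic reported above measures agreement between two reviewers after correcting for the agreement expected by chance. The following is a minimal sketch of Cohen's kappa for one reviewer pair, assuming each reviewer's decisions are recorded as a list of labels. The function name and the ten example decisions are hypothetical and are not data from this report.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical include/exclude decisions for ten abstracts (illustration only).
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "include", "include", "exclude",
              "exclude", "exclude", "exclude", "exclude", "exclude"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.3f}")
```

A value near 0 indicates agreement no better than chance, and a negative value, such as the lower bound reported above, indicates agreement below the chance level.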
At the full-text review stage, paired researchers independently reviewed the articles and indicated a decision to “include” or “exclude” the article for data abstraction. When two reviewers returned different decisions about whether to include or exclude an article, they were asked to reconcile the difference. Detailed full-text exclusion criteria are listed immediately below.
Studies were excluded at the full-text screening stage if any of the following applied:
Single case.
Patient age < 18 years.
No gold standard comparison or methods comparison (for articles on analytic validity).
Study falls outside study scope (e.g., there were several good reviews, including one that made pharmacogenetics-based therapeutic recommendations,50 that did not answer any of the key questions directly).
At the full-text stage, studies were further identified as addressing one or more of the following criteria:
A. Clinical tests for polymorphisms. These include studies of commercial (e.g., AmpliChip®) and other tests that may be used for determining genetic polymorphisms in a clinical setting.
B. Gold standard. DNA sequencing is the accepted gold standard for genotyping. Because very few studies used a gold standard comparison, a decision was made also to include studies that used methods comparisons (e.g., polymerase chain reaction and restriction fragment length polymorphism [PCR-RFLP]). In keeping with the clinical diagnostic test literature, these methods are referred to here as a reference standard, acknowledging that they provide a lower level of evidence than gold standard comparisons.
C. Predicted metabolism of SSRIs. This includes metabolizer status of an individual with respect to a particular SSRI, e.g., “poor metabolizer” (PM) or “ultra-rapid metabolizer” (UM), and is distinct from PM or UM of a probe drug for a given CYP enzyme. Because an SSRI may not be exclusively metabolized by a certain CYP enzyme, its metabolism may vary from that of the probe drug for that enzyme in a person carrying a function-altering mutation of that CYP enzyme.
D. Decisionmaking. This includes decisionmaking by patients and providers; medical, personal, and public health decisionmaking.
E. Health outcomes of interest. Health outcomes included: drug efficacy, adverse drug reactions, and other outcomes such as improved prognosis and quality of life.
F. Harms. Harms associated with testing or with subsequent management decisions.
Studies were then classified as addressing one or more of the key questions. For example:
Question 2 (analytic validity): A + B
Question 3a (metabolism of SSRIs): (A or B) + C
Question 4b (improved outcomes versus not testing): (A or B) + E
Please note that although (A or B) + E would apply to all health outcomes questions, we did not expect to find many studies addressing these, and therefore we did not break down E further.
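The three example rules above can be expressed as simple set logic. The sketch below assumes that the letters A through F refer to the six criteria in the order they are listed (A = clinical tests, B = gold or reference standard, C = predicted metabolism, E = health outcomes). It covers only the three rules given as examples, and the function name is hypothetical.

```python
def key_questions_addressed(criteria):
    """Return the example key questions a study addresses, given its criteria letters."""
    c = set(criteria)
    # "(A or B)" in the text: the study meets criterion A, criterion B, or both.
    has_test_or_reference = bool(c & {"A", "B"})
    questions = []
    if {"A", "B"} <= c:                        # Question 2: A + B
        questions.append("2")
    if has_test_or_reference and "C" in c:     # Question 3a: (A or B) + C
        questions.append("3a")
    if has_test_or_reference and "E" in c:     # Question 4b: (A or B) + E
        questions.append("4b")
    return questions


# A hypothetical study meeting criteria A, B, and C.
print(key_questions_addressed(["A", "B", "C"]))
```

Because the rules are not mutually exclusive, a single study can be classified under more than one key question.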
| Articles identified | 1,200 |
| Abstracts reviewed | 1,200 |
| Abstracts included | 140 |
| Abstracts excluded | 1,060 |
| Full-text articles reviewed | 140 |
| Full-text articles included | 37 |
| Full-text articles excluded | 103 |
| Question 1 (overarching question) | 0 |
| Question 2 (analytic validity) | 14 |
| Question 3a (effects on metabolism) | 16 |
| Question 3b (effects on drug efficacy) | 5 |
| Question 3c (adverse drug reactions) | 9 |
| Question 4a (effects on disease management) | 0 |
| Question 4b (effects on outcomes) | 0 |
| Question 4c (testing usefulness) | 0 |
| Question 5 (testing and management harms) | 0 |
| Total | 37* |
* The sum across questions exceeds the total because some articles were included for more than one question.
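The counts in the table above can be checked with a few lines of arithmetic. This sketch uses only the numbers reported in the table; the variable and function names are illustrative.

```python
# Counts taken from the screening table above.
abstracts = {"reviewed": 1200, "included": 140, "excluded": 1060}
full_text = {"reviewed": 140, "included": 37, "excluded": 103}
per_question = {"1": 0, "2": 14, "3a": 16, "3b": 5, "3c": 9,
                "4a": 0, "4b": 0, "4c": 0, "5": 0}


def stage_is_consistent(stage):
    """Included plus excluded should equal the number reviewed at each stage."""
    return stage["included"] + stage["excluded"] == stage["reviewed"]


# Question assignments beyond one per article; this is positive when some
# articles were counted under more than one question.
extra_assignments = sum(per_question.values()) - full_text["included"]

print(stage_is_consistent(abstracts), stage_is_consistent(full_text))
print(sum(per_question.values()), extra_assignments)
```

The per-question counts sum to 44 against 37 included articles, which is consistent with the footnote that some articles were included for more than one question.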
The Duke research team developed data abstraction forms/evidence table templates for abstracting data for the various key questions (Appendix C *). Based on clinical expertise, a pair of researchers was assigned to the research questions to abstract data from the eligible articles. One of the pair abstracted the data, and the second researcher over-read the article and the accompanying abstraction to check for accuracy and completeness. The completed evidence tables are provided in Appendix D.*
At the data abstraction stage, the abstracting researcher was asked to evaluate each included article for methodological quality. For Question 2 regarding analytic validity, we assessed quality of studies based on questions in the Analytic validity, Clinical validity, Clinical utility and associated Ethical, legal and social implications (ACCE) model for evaluation of genetic testing (Appendix E *). For all other questions for which we could identify data, we intended to use the quality assessment criteria developed by the Tufts-New England Medical Center Evidence-based Practice Center for an evidence report on “Effects of Omega-3 Fatty Acids on Cardiovascular Disease.”51 However, these criteria require the study to be a randomized controlled trial, a longitudinal cohort study, or a case-control study, and none of the studies identified for our report had these study designs. Therefore, we elected to use criteria developed by the Oxford Centre for Evidence-based Medicine52 (Appendix E *) to evaluate individual studies based on type of the study (therapy vs. prognosis vs. prevalence) and strength of study design, with numerical scores ranging between 1 and 5 (including 1a, 1b, 1c, 2a, 2b, 2c, 3a, 3b, 4, 5). The overall strength of recommendation for each question was then graded as A, B, C, or D according to criteria that take into account the quality of the individual studies identified for that question. The quality assessment scores for individual studies are reported in the relevant evidence tables. Because a numerical score may not convey the details of a quality assessment, methodological issues pertaining to studies relevant to individual questions are addressed in the discussion of results for each question.
In addition to conducting the literature review described above, we also developed a decision model of the decision to test for genotype or not, with the primary outcome of interest being success of initial treatment (resolution of depression without adverse effects). The goal of this exercise was to examine the relationships between the intermediate steps described above and outcomes of importance to patients and physicians. Results are discussed in Chapter 3.
We employed internal and external quality-monitoring checks through every phase of the project to reduce bias, enhance consistency, and verify accuracy. Examples of internal monitoring procedures include: three progressively stricter screening opportunities for each article (abstract screening, full-text article review, data abstraction review); involvement of three individuals (two investigators and a copy-editor) in each data abstraction; and agreement of at least two investigators on all included studies.
Our principal external quality-monitoring device is the peer-review process. Nominations for peer reviewers were solicited from several sources, including the technical expert panel and interested federal agencies. The list of nominees was forwarded to AHRQ for vetting and approval. A list of peer reviewers submitting comments is provided in Appendix F.*
Question 1 is: Does testing for cytochrome P450 (CYP450) polymorphisms in adults entering selective serotonin reuptake inhibitor (SSRI) treatment for non-psychotic depression lead to improvement in outcomes, or are testing results useful in medical, personal, or public health decisionmaking?
To address this question, we sought to identify studies in which patients treated with SSRIs were tested for CYP450 genetic polymorphisms, and in which investigators reported on the impact of such testing on outcomes or on medical, personal, or public health decisionmaking. Even after relaxing our inclusion criteria to include all methods used for genotyping and all indications for SSRI treatment, we were unable to identify any studies that directly addressed this question.
Question 2 is: What is the analytic validity of tests that identify key CYP450 polymorphisms?
For purposes of this report, we adopted the definition of analytic validity and its components from the Analytic validity, Clinical validity, Clinical utility and associated Ethical, legal and social implications (ACCE) model (Appendix E *), which reads:
The analytic validity of a genetic test defines its ability to accurately and reliably measure the genotype of interest. This aspect of evaluation focuses on the laboratory component. The four specific elements of analytic validity include analytic sensitivity (or the analytic detection rate), analytic specificity, laboratory quality control, and assay robustness. Analytic sensitivity defines how effectively the test identifies specific mutations that are present in a sample. Analytic specificity defines how effectively the test correctly classifies samples that do not have specific mutations (although the term “mutation” is used here, the terms “polymorphism” or “variant” may be more appropriate for certain situations). Quality control assesses the procedures for ensuring that results fall within specified limits. Robustness measures how resistant the assay is to changes in pre-analytic and analytic variables.
It is notable that the definitions of sensitivity and specificity above are most directly applicable to tests with dichotomous results (mutation present or absent). Because there are multiple CYP450 polymorphisms that can be assessed, and each study may provide information on only a subset of polymorphisms, we defined analytic sensitivity operationally as the proportion of known genotype challenge samples that are correctly identified by the test under evaluation. Similarly, analytic specificity was defined operationally as the proportion of known wild-type challenge samples that are correctly identified by the test under evaluation.
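The operational definitions above can be sketched as follows, assuming each challenge sample is recorded as a pair of known genotype and genotype called by the test under evaluation. The wild-type label `*1/*1`, the function name, and the example samples are assumptions made for illustration and are not data from the included studies.

```python
def analytic_validity(samples, wild_type="*1/*1"):
    """Return (analytic sensitivity, analytic specificity) as proportions.

    samples: iterable of (known_genotype, called_genotype) pairs.
    Sensitivity: share of known variant-genotype samples called correctly.
    Specificity: share of known wild-type samples called correctly.
    Either value is None when no samples of that kind were tested.
    """
    samples = list(samples)
    variant = [(known, called) for known, called in samples if known != wild_type]
    wild = [(known, called) for known, called in samples if known == wild_type]

    def proportion_correct(pairs):
        if not pairs:
            return None
        return sum(known == called for known, called in pairs) / len(pairs)

    return proportion_correct(variant), proportion_correct(wild)


# Hypothetical challenge samples (illustration only).
challenge = [
    ("*1/*1", "*1/*1"), ("*1/*1", "*1/*1"), ("*1/*1", "*1/*1"), ("*1/*1", "*4/*1"),
    ("*4/*1", "*4/*1"), ("*4/*4", "*4/*1"), ("*5/*1", "*5/*1"),
]
sensitivity, specificity = analytic_validity(challenge)
print(sensitivity, specificity)
```

Under these definitions a variant sample counts as correct only when the exact genotype is called, so a homozygote called as a heterozygote lowers sensitivity.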
Our assessment of analytic validity focuses on tests that are actually used, or are likely to be used, in clinical settings. The gold standard method for CYP450 genotyping is unequivocally the bidirectional sequencing of the specific genetic region of the gene of interest. However, many reference methods exist due to the complexity and high costs involved with sequencing of large populations. To date, there is only one technology approved by the U.S. Food and Drug Administration (FDA) specifically for CYP450 genotype testing (the Roche AmpliChip®), and one technology approved for genetic testing of a different gene target (Invader Assay for UGT1A1 genotyping) which has been employed in one of the studies for CYP2D6 genotyping.53 Other laboratories currently performing CYP450 tests in clinical settings generally employ traditional methods, including polymerase chain reaction and restriction fragment length polymorphism (PCR-RFLP) or allele-specific polymerase chain reaction (AS-PCR, also referred to as allele-specific amplification, or ASA).
In the absence of a substantial number of studies comparing the test under evaluation to the gold standard (bidirectional DNA sequencing), we decided to include studies that used a traditionally accepted methods comparison, typically PCR-RFLP or AS-PCR, acknowledging that a methods comparison would be a lower level of evidence regarding analytic sensitivity and specificity than a gold standard comparison. Consequently, we refer to the comparator tests as a “reference standard.” It should be noted that in most cases even DNA sequencing for the purpose of assay validation may not have been done bidirectionally (not reported), but is referred to as a gold standard nonetheless.
Few studies reported the ethnic makeup of the tested sample populations. Even when details were provided, studies followed no standard format and did not describe the source of the ethnicity data (e.g., self-report, medical records, or other documentation). We therefore summarize all studies by the common denominator of general ethnic group (e.g., Caucasian).
Some studies provided information about test performance in assessing individual alleles rather than genotypes. Although these are less clinically relevant, they are included to complement the information about genotypes.
| Study | Genotype | Roche Molecular Systems, Inc., 200440 | Schaeffeler et al., 200357 | Neville et al., 200253 | Soderback et al., 200558 | Stamer et al., 200259 | Genotype-specific analytic sensitivity | 95% CI | Test for homogeneity |
|---|---|---|---|---|---|---|---|---|---|
| Test evaluated | | AmpliChip® | RT-PCR | Long-range PCR and ASA | Pyro-sequencing | RT-PCR | | | |
| Reference standard | | Sequencing and ASA# | Long-range PCR | Long-range PCR | Long-range PCR | Long-range PCR and ASA | | | |
| Analytic sensitivity | Del/Del | 2/2 | 1/1 | 0/0 | 0/0 | 1/1 | 100% | 42.5 – 100 | 0.97 |
| | Del/SC | 41/41 | 13/13 | 16/16 | 24/24 | 11/11 | 100% | 97.2 – 100 | 0.99 |
| | Dup/Del | 3/3 | 5/5 | 0/0 | 0/0 | NR | 100% | 67.1 – 100 | 0.87 |
| | Dup/SC | 31/33 | 0/3‡ | 11/11 | 13/13 | NR | 91.67% | 82.4 – 97.7 | 0.06 |
| Analytic specificity | SC/SC | 425/426 | 43/43 | NR | 3/3 | NR | 99.79% | 99.0 – 100 | 0.46 |
‡ Real-time polymerase chain reaction (RT-PCR) detects gene copy number and uses an algorithm for genotype assignment based on the single nucleotide polymorphism-genotype analysis. When 2 alleles are detected, the most likely genotype is wild type (2 active alleles), with a less likely result of a combination between duplication and deletion.
# Results of genotype calls for the AmpliChip method comparison are pooled for all method validation tests performed, as the report does not specify genotype calls by each method specifically.
Abbreviations: ASA = allele-specific amplification; CI = confidence interval; Del = deletion (*5 allele); Dup = duplication (more than a single gene copy); NR = not reported; PCR = polymerase chain reaction; RT-PCR = real-time polymerase chain reaction; SC = single gene copy
In all studies analytic sensitivity and specificity for each tested genotype ranged from 94.12 to 100 percent, with the exception of Schaeffeler et al.,57 which reported sensitivity of 91.67 percent to detect the duplication/(single copy) genotype and specificity of 99.79 percent. However, only 26 of approximately 100 known CYP2D6 polymorphisms (www.cypalleles.ki.se) were evaluated in the included studies, with most studies focusing on only a handful of these variants.
CYP2D6 gene copy number methods exhibit relatively high sensitivity and specificity, although two of the four studies reporting results on duplication variants reported failures, resulting in sensitivity of 91.67 percent to identify duplication/(single copy) genotypes, and a homogeneity p-value of 0.06. It should be noted that traditional assays designed to identify the deletion variant *5 fail to detect some rearrangement-deletion alleles (the most common of which in Caucasians are *13 and *16 [0.5 to 1 percent], not tested in any of the studies above). These are non-functional alleles and result in the same metabolic phenotype as *5 (i.e., poor metabolizer [PM]).
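The pooled figures quoted above follow from summing correct calls and challenge samples across studies. The sketch below reproduces the 91.67 percent sensitivity for duplication/(single copy) genotypes and the 99.79 percent specificity from the per-study counts in the table. It does not attempt to reproduce the confidence intervals or the homogeneity test, whose methods are not described in this section, and the function name is illustrative.

```python
# (correct calls, samples tested) per study, taken from the copy-number table above.
dup_sc = {
    "Roche Molecular Systems": (31, 33),
    "Schaeffeler et al.": (0, 3),
    "Neville et al.": (11, 11),
    "Soderback et al.": (13, 13),
}
sc_sc = {
    "Roche Molecular Systems": (425, 426),
    "Schaeffeler et al.": (43, 43),
    "Soderback et al.": (3, 3),
}


def pooled_percent(results):
    """Pool counts across studies and return percent correctly identified."""
    correct = sum(c for c, _ in results.values())
    total = sum(t for _, t in results.values())
    return 100 * correct / total


pooled_sensitivity = pooled_percent(dup_sc)   # 55 of 60 samples
pooled_specificity = pooled_percent(sc_sc)    # 471 of 472 samples
print(round(pooled_sensitivity, 2), round(pooled_specificity, 2))
```

Simple pooling of this kind weights each study by its sample count, so the AmpliChip® evaluation dominates both estimates.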
| Study | Genotype | Roche Molecular Systems, Inc., 200541 | Eriksson et al., 200254 | Mizugaki et al., 200361 | Genotype-specific analytic sensitivity | 95% CI | Test for homogeneity |
|---|---|---|---|---|---|---|---|
| Test evaluated | | AmpliChip® | Pyro-sequencing | ASA and TaqMan | | | |
| Reference standard | | Sequencing | PCR-RFLP | PCR-RFLP | | | |
| Analytic sensitivity | *2/*1 | 101/101 | 24/24 | 45/45 | 100% | 98.25 – 100 | 0.9 |
| | *2/*2 | 14/15 | 5/5 | 8/8 | 96.43% | 83.96 – 100 | 0.78 |
| | *2/*3 | 6/6 | NR | 9/9 | 100% | 81.33 – 100 | 0.9 |
| | *3/*1 | 6/6 | NR | 29/29 | 100% | 91.68 – 100 | 0.62 |
| | *3/*3 | 1/1 | NR | 2/2 | 100% | 30.17 – 100 | 0.84 |
| | *4/*1 | NR | 1/1 | NR | 100% | 0.25 – 100 | NA |
| Analytic specificity | *1/*1 | 270/270 | 108/108 | 51/51 | 100% | 99.3 – 100 | 0.87 |
Abbreviations: ASA = allele-specific amplification; CI = confidence interval; NA = not applicable; NR = not reported; PCR-RFLP = polymerase chain reaction and restriction fragment length polymorphism
CYP2C9. We identified one report that compared clinical methods for genotyping CYP2C9 enzyme polymorphisms to a reference standard.54 This study did not use the gold standard, DNA sequencing. Investigators reported lack of detection of homozygotes for the *2 and/or the *3 alleles. They also stated that compound heterozygotes (*2/*3 genotype) were identified, but they provided no genotype counts, preventing calculation of genotype analytic sensitivity and specificity.
Analytic sensitivity calculated from allele counts was 100 percent. Because the number of compound heterozygotes was not reported, confidence intervals for assay specificity cannot be calculated, but the report implies that specificity was 100 percent.
No measures of robustness were reported. Quality control featured interrogation of the surrounding sequence along with the variable positions tested, which provided internal controls.
| Study | Genotype | Muthiah et al., 200462 | Weise et al., 200463 | Genotype-specific analytic sensitivity | 95% CI | Test for homogeneity |
|---|---|---|---|---|---|---|
| Test evaluated | | Multiplex PCR | RT-PCR | | | |
| Reference standard | | Sequencing | PCR-RFLP | | | |
| Analytic sensitivity | *2/*1 | 2/2 | 2/2 | 100% | 42.49 – 100 | 1 |
| | *3/*1 | 3/3 | 16/16 | 100% | 85.05 – 100 | 0.6 |
| | *3/*4 | NR | 1/1 | 100% | 0.25 – 100 | NA |
| | *4/*1 | NR | 8/8 | 100% | 67.07 – 100 | NA |
| Analytic specificity | *1/*1 | 52/52 | 95/95 | 100% | 97.98 – 100 | 0.85 |
Abbreviations: CI = confidence interval; NA = not applicable; NR = not reported; PCR = polymerase chain reaction; PCR-RFLP = polymerase chain reaction and restriction fragment length polymorphism; RT-PCR = real-time polymerase chain reaction
| Study | Genotype | Wu et al., 200264 | Genotype-specific analytic sensitivity | 95% CI |
|---|---|---|---|---|
| Test evaluated | | Mismatch hybridization | | |
| Reference standard | | PCR-RFLP | | |
| Analytic sensitivity | m1/*1 | 8/8 | 100% | 67.07 – 100 |
| | m1/m1 | 20/20 | 100% | 85.76 – 100 |
| Analytic specificity | *1/*1 | 22/22 | 100% | 86.94 – 100 |
| Analytic sensitivity | m2/*1 | 5/5 | 100% | 51.39 – 100 |
| | m2/m2 | 21/21 | 100% | 86.4 – 100 |
| Analytic specificity | *1/*1 | 24/24 | 100% | 86.4 – 100 |
Abbreviations: CI = confidence interval; PCR-RFLP = polymerase chain reaction and restriction fragment length polymorphism
All studies reporting assay performance for the detection of CYP2C8 and CYP1A1 exhibit 100 percent analytic sensitivity and specificity. Calculations based on allele calls reflect the same findings. Quality control procedures employed include the incorporation of positive and negative controls into the genotyping process.62 Robustness was assessed only by Wu et al.,64 by means of inter- and intra-assay variability. The intra-assay coefficients of variation were reported to be lower than 11.2 percent for both CYP1A1 assays, and the inter-assay coefficients of variation were lower than 14.3 percent. Weise et al.63 implied 100 percent inter-assay reproducibility of results obtained by four different investigators.
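Intra- and inter-assay variability of this kind is commonly summarized as a coefficient of variation: the standard deviation of replicate measurements expressed as a percentage of their mean. The sketch below shows that calculation under the assumption that the sample standard deviation is used. The replicate values are hypothetical and are not taken from Wu et al.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Percent coefficient of variation: 100 * sample SD / mean."""
    if len(values) < 2:
        raise ValueError("at least two replicate measurements are required")
    return 100 * stdev(values) / mean(values)


# Hypothetical replicate signal readings from one assay run (illustration only).
replicates = [100, 110, 90, 105, 95]
cv = coefficient_of_variation(replicates)
print(f"CV = {cv:.2f}%")
```

Applying the same function to replicates within one run gives an intra-assay value, and applying it to results from separate runs gives an inter-assay value.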
Based on emerging standards, analytic validity of genetic tests includes not only the ability of the test to accurately identify challenge genotypes (as assessed by a gold standard test), but also quality control and robustness. We identified only a few studies of test performance relative to the gold standard of DNA sequencing (bidirectionally or unidirectionally), applied to a limited number of samples (as reflected by the wide confidence intervals calculated for analytic sensitivity and specificity), and covering but a small set of possible genetic variants. Many studies appear to be in the realm of preclinical evaluations and are not clearly relevant to the domain of clinical practice.
These data do suggest that the analytic sensitivity and specificity of available tests are generally high. One concern may be that in the evaluation of gene deletions and duplications, assessing the magnitude of the potential problem is limited by the lack of an established gold standard for gene copy number. Another concern is that few CYP450 variants are included in the studies we identified, which focused particularly on the more common variants in Caucasians and African-Americans. However, variants that are rare in these populations may be more frequent, and thus more clinically relevant, in other populations. In the same context, it should be noted that most studies focus on developing reliable methods for the genotyping of CYP2D6 variants known to be non-functional (PM). Of these, the most common in Caucasians and African-Americans are *3, *4, *5, and *6, and the majority of studies target their assays at capturing these variants. Even the AmpliChip®, which targets the largest set of CYP2D6 variants (n = 26), fails to capture a large set of rare variants leading to deficient enzyme activity.
Although these results suggest that analytic validity for detecting some of the CYP450 genotypes more frequently encountered in the Caucasian population is good, overall the data are limited, with relatively small numbers of samples and a relatively narrow range of polymorphisms tested. In addition to studies addressing these limitations, research should include closer examination of the issue of deletions and duplications. Furthermore, practical concerns of quality control and robustness deserve greater investigation based on emerging standards for such studies.
Question 3a is: How well do particular CYP450 genotypes predict metabolism of particular SSRIs? Do factors such as race/ethnicity, diet, or other medications affect this association?
There is definitive literature supporting the association between certain CYP450 genotypes and their predicted phenotypes (i.e., how they would metabolize probe drugs specific for that CYP enzyme). Our question sought to address how well a certain genotype (or its corresponding predicted phenotype) predicts metabolism of particular SSRIs. For example, does a CYP2D6 *5*5 or predicted poor metabolizer (PM) of the probe drug dextromethorphan also metabolize fluoxetine poorly? To address this question, we sought to identify all studies in which patients on SSRIs were tested for CYP450 genetic polymorphisms. Studies were included irrespective of the method used for genotyping. Because of the overall paucity of data, we included studies that had diagnoses other than non-psychotic depression as an indication for SSRI treatment, as clinical outcomes in such scenarios may be indicative of genotype effects. We also included studies in which only a subgroup of patients was treated with SSRIs, while others were treated with other antidepressants, including tricyclics.
Note: Here, as throughout Chapter 3, the terms “poor metabolizer (PM),” “extensive metabolizer (EM),” etc., refer to general phenotypes (for a probe drug) as predicted by genotyping.
| Study | Subjects (n, ethnicity) | SSRI | Genotypes | Results |
|---|---|---|---|---|
| Liu et al., 200165 | 14 Chinese | Fluoxetine | 2C19 *1, *2, *3 | Increased AUC, t½, and Cmax, decreased oral clearance in PMs vs. EMs |
| Wang et al., 200143 | 12 Chinese | Sertraline | 2C19 *1, *2, *3 | Increased AUC, t½, and Cmax, decreased oral clearance in PMs vs. EMs |
| Yu et al., 200367 | 13 Chinese | Citalopram | 2C19 *1, *2, *3 | Increased AUC, t½, and Cmax, decreased oral clearance in PMs vs. EMs |
| Yoon et al., 200068 | 16 Koreans | Paroxetine | CYP2D6 *1, *2, *10B | Heterozygotes/homozygotes for *10B showed lower volume of distribution, oral clearance, and higher AUC vs. homozygous for wild type. No difference in Cmax, t½, or renal clearance between groups |
| Ozdemir et al., 199966 | 17 Caucasians | Paroxetine | 2D6*1, *3, *4, *5 | Heterozygous EMs had twofold higher median steady-state concentration than homozygous EMs, but difference not statistically significant |
Abbreviations: AUC = area under the curve; Cmax = maximum plasma concentration; EMs = extensive metabolizers; PMs = poor metabolizers; t½ = terminal elimination half-life
All these studies used standard measures such as area under the curve (AUC, which is an assessment of bioavailability of the drug), half-life (time taken to eliminate half the total ingested quantity of the drug from the body), and oral clearance (pertains to distribution and elimination of drug) as measures of rate of metabolism. Three of the five studies43, 65, 67 included young, healthy, male, non-smoking, Chinese subjects who were free of medications and alcohol for at least 2 weeks prior to the study. These studies looked at the effect of CYP2C19 genotypes and predicted phenotypes (EM vs. PM) on the metabolism of three different SSRIs, namely, fluoxetine, sertraline, and citalopram. All three studies found significantly higher AUC, longer half-life, and reduced oral clearance of the parent drug, and significantly lower AUC and lower maximum plasma concentration (Cmax) of the metabolite of each drug, in PMs as compared to EMs. The fluoxetine and citalopram studies also found a gene dose effect such that heterozygous EMs showed values between homozygous EMs and PMs.
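As a rough illustration of how the measures named above relate to one another, the following sketch computes AUC by the linear trapezoidal rule, a terminal half-life from two late samples, and apparent oral clearance as dose divided by AUC. The concentration-time values and the 20 mg dose are hypothetical and are not taken from any of the included studies; the function names are likewise illustrative.

```python
import math

def auc_trapezoid(times_h, conc):
    """Area under the concentration-time curve (linear trapezoidal rule)."""
    return sum(
        (t1 - t0) * (c0 + c1) / 2
        for t0, t1, c0, c1 in zip(times_h, times_h[1:], conc, conc[1:])
    )

def terminal_half_life(t_a, c_a, t_b, c_b):
    """Half-life from two points on the log-linear terminal phase."""
    k_el = math.log(c_a / c_b) / (t_b - t_a)  # elimination rate constant (1/h)
    return math.log(2) / k_el

def oral_clearance(dose, auc):
    """Apparent oral clearance (CL/F) = dose / AUC."""
    return dose / auc

# Hypothetical single-dose profile: hours after dosing, concentration in microg/L.
times_h = [0, 1, 2, 4, 8]
conc = [0.0, 10.0, 8.0, 4.0, 1.0]

auc = auc_trapezoid(times_h, conc)                          # microg*h/L
t_half = terminal_half_life(times_h[-2], conc[-2], times_h[-1], conc[-1])
cl_f = oral_clearance(20_000, auc)                          # 20 mg = 20,000 microg; result in L/h
```

Under these definitions, a slower metabolizer given the same dose would be expected to show a larger AUC, a longer half-life, and a smaller oral clearance, which is the pattern the single-dose studies report for PMs relative to EMs.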
Of the remaining two studies, the first68 was carried out in 16 healthy, young, Korean subjects and examined the effect of the CYP2D6*10 allele (predictive of poor metabolism) on paroxetine metabolism. Investigators found that homozygotes and heterozygotes for *10 alleles showed significantly greater volume of distribution, greater AUC, and lower oral clearance of paroxetine than wild type homozygotes.
Thus, all studies in healthy adults using a single dose of an SSRI found that PMs, as predicted by genotyping, metabolized the SSRI more slowly than EMs, irrespective of particular SSRI.
The other study66 was a multiple-dose study that looked at paroxetine pharmacokinetics in 17 healthy, young, non-smoking Caucasian subjects who received paroxetine 20 mg/day for at least 5 days (range, 5 to 15 days). It found that heterozygous EMs had twofold higher median paroxetine steady-state concentrations than homozygous wild type EMs (n = 10); the difference was not statistically significant (p = 0.2).
| SSRI | CYP enzyme | Drug concentration findings |
|---|---|---|
| Paroxetine69,70,74,76,77,79 n = 14 to 124 | 2D6 | PM mother had highest concentration; her infant had undetectable level; a UM infant had undetectable level69 |
| | | PM > EM70 |
| | | (PM + IM) = (EM + UM)74 |
| | | PM > EM only at 10-mg dose, not at higher doses76 |
| | | Trough concentration in lower half of reference range for PM (n = 1) and EM77 |
| | | IM > PM and EM in 30 mg/d dose group only79 |
| Fluoxetine70,71,73,78 n = 11 to 78 | 2D6 | PM > EM70 |
| | | PM > EM (S isomer only)71 |
| | | PM = EM (active moiety)73 |
| | | PM = EM (active moiety)78 |
| | 2C19 | PM = EM (active moiety)78 |
| | 2C9 | Heterozygous EM > homozygous EM (active moiety)73 |
| | | Heterozygous EM > homozygous EM (active moiety)78 |
| | | Heterozygous EM > homozygous EM (R isomer only)78 |
| Fluvoxamine75 n = 46 | 2D6 | PM = EM75 |
| Citalopram69 n = 14 | 2C19 | PM mother had highest citalopram concentration, five *1*2 infants had higher concentration than five *1*1 infants (3 vs. 0.8 nmol/L). 3 of 4 infants with undetectable level were *1*169 |
Abbreviations: EM = extensive metabolizer; IM = intermediate metabolizer; PM = poor metabolizer; UM = ultra-rapid metabolizer
| SSRI/CYP enzyme | Study | Mean drug concentration, EM group | Mean drug concentration, comparator group (PM, heterozygous EM, etc.) | P-value | Confidence interval80 for difference in mean drug concentration | Dose | Comments |
|---|---|---|---|---|---|---|---|
| Paroxetine/ 2D6 | Charlier et al., 200370 | 20.97 ± 21.17 microg/L (n = 30) | 72.50 ± 29.65 microg/L (n = 6) | 0.00001 | 31.40 to 71.66 | 20 mg/d | - |
| | Sawamura et al., 200476 | 2.99 ± 3.52 ng/mL (n = 16) | 7.30 ± 6.11 ng/mL (*1*10 or *10*10) (n = 35) | 0.019 | 1.04 to 7.58 | 10 mg/d | No difference at higher doses, data not provided |
| | Murphy et al., 200374 | 71.65 ± 52.55 ng/mL (n = 105) (EM + UM) | 99.51 ± 37.35 ng/mL (IM + PM) (n = 15) | NR | -0.15 to 55.87 | Mean 30.21 (EM), 26.67 (PM) | (EM + UM), (IM + PM) groups combined to increase power |
| | Ueda et al., 200679 | 150.9 ± 20.6 ng/mL/mg/kg (n = 17) | 76.7 ± 6.1 ng/mL/mg/kg (n = 12) | NR | -86.45 to 61.95 | 30 mg/d | IM level greater than EM or PM, no difference at other doses |
| Fluoxetine/ 2D6 | Charlier et al., 200370 | 49.4 ± 40.7 microg/L (n = 10) | 178.5 ± 68.6 microg/L (n = 2) | 0.004 | 60.83 to 197.37 | 20 mg/d | Reported fluoxetine only |
| | Eap et al., 200171 | 55 ± 30 ng/mL (n = 6) | 104 ± 8 ng/mL (n = 3) | NR | 12.82 to 85.18 | 20 mg/d | Reported fluoxetine only |
| | LLerena et al., 200473 | 13.0 ± 7.6 nmol/L/mg (n = 41) | 16.7 nmol/L/mg (n = 1) | NR | -11.61 to 19.01 | Dose-corrected | Reported fluoxetine only. “No significant correlation found between plasma concentration of active moiety and number of active genes” |
| Fluoxetine/ 2C9 | LLerena et al., 200473 | 25.1 ± 10.1 nmol/L/mg (n = 19) | 35.5 ± 18.5 nmol/L/mg (*1*2) (n = 11) | < 0.05 | 0.07 to 20.73 | Dose-corrected | Active moiety (all subjects were 2D6 EM) |
| | | | 38.6 ± 22.1 nmol/L/mg (*1*3) (n = 8) | < 0.01 | 1.34 to 25.66 | | |
| Fluvoxamine/ 2D6 | Ohara et al., 200375 | 312.7 ± 195.3 ng/mL/mg/kg (n = 13) | 321 ± 422.1 ng/mL/mg/kg (n = 15) | 0.984 | -245.79 to 262.39 | Dose-corrected | PM defined as 2D6 *10*10; EM defined as no *10 (any allele which was not *3, *4, *5 or *10 was defined as wild-type) |
Abbreviations: EM = extensive metabolizer; IM = intermediate metabolizer; NR = not reported; PM = poor metabolizer; UM = ultra-rapid metabolizer
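For readers who want to see how an interval for a difference in mean concentrations can be derived from the summary statistics tabulated above, the sketch below uses a pooled standard deviation with a normal approximation. This is an assumption made for illustration: the method actually used for the tabulated intervals is the one cited as reference 80, so the output is expected to be close to, but not identical with, the published values. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def diff_in_means_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, level=0.95):
    """Approximate CI for (mean_b - mean_a) from group means, SDs, and sizes.

    Assumes equal variances (pooled SD) and a normal critical value;
    this is an illustrative choice, not the report's stated method.
    """
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = mean_b - mean_a
    return diff - z * se, diff + z * se

# Paroxetine/2D6 row for Charlier et al.: EM group vs. PM group (microg/L).
low, high = diff_in_means_ci(20.97, 21.17, 30, 72.50, 29.65, 6)
```

With only six subjects in the comparator group, the interval is wide even though the point estimate of the difference is large, which is the pattern seen throughout the table.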
The 16 included studies provide mixed evidence regarding the first part of Question 3a (on possible correspondences between CYP450 genotypes and metabolism of particular SSRIs). In multiple dose studies of SSRIs, inconsistent results were obtained for individual SSRIs, and also for individual CYP enzymes.
Methodological issues in studies addressing this question include:
Single-dose studies in healthy volunteers:43, 65, 67, 68 Clinical situations may be very different from single-dose studies, because of the possible effects of the medication on CYP enzymes over time. Data from single-dose experiments cannot be extrapolated to long-term drug therapy, as saturation pharmacokinetics, irreversible enzyme blockade, or enzyme up- or down-regulation might change the outcome with multiple dosing.3, 81, 82
Small sample sizes: All the studies had very small samples of the PM or ultra-rapid metabolizer (UM) groups, and thus may not have been powered adequately to detect significant differences, as shown by wide confidence intervals.
Heterogeneity: The studies were quite variable in terms of the population of interest, specific SSRIs considered, and specific CYP450 polymorphisms.
Not accounting for multiple CYP enzymes that may be involved in metabolism of a certain SSRI: Only one study73 took into account the possibility that more than one CYP enzyme might be involved in the metabolism of a certain SSRI and therefore controlled for polymorphisms in another enzyme.
Not accounting for active metabolites of certain SSRIs like fluoxetine: Two studies measured active moiety rather than parent drug alone,73, 78 whereas two others did not.70, 71
Most studies accounted for co-medications that may be inhibitors or substrates for the enzyme being studied; one did not.69 Benzodiazepines were typically allowed in these psychiatric cohorts, as these drugs are metabolized mainly by CYP3A4 and have no influence on the enzymes studied.
Diet was not taken into account in any study.
One study72 combined SSRIs and other antidepressants and examined effects of polymorphisms of various CYP enzymes. Combining various SSRIs, and moreover SSRIs with other antidepressant medications, may have confounded results because of variability in the contribution of different CYP enzymes to metabolism of different SSRIs and other antidepressants, and variability in CYP inhibition by different SSRIs.
The quality assessment criteria we applied to individual studies in this report52 (Appendix E *) yielded a range of scores between “3b” and “4.” For the suggestion that the genotypes affect metabolism of SSRIs, the grade of recommendation based on available data would be “C.”
In depressed patients treated with SSRIs, the existing data (a series of heterogeneous studies in small samples) do not support a clear correlation between CYP metabolizer status as predicted by genotyping and SSRI concentrations.
Question 3b is: How well does CYP450 testing predict drug efficacy? Do factors such as race/ethnicity, diet, or other medications, affect this association?
To address this question, we sought to identify all studies in which patients treated with SSRIs were tested for CYP450 genetic polymorphisms. Studies were included irrespective of the method used for genotyping. Because of the overall paucity of data, we included studies that had diagnoses other than non-psychotic depression as an indication for SSRI treatment, as clinical outcomes in such scenarios may be indicative of genotype effects. We also included studies in which only a subgroup of patients was treated with SSRIs, while others were treated with other antidepressants, including tricyclics.
Note: Here, as throughout Chapter 3, the terms “poor metabolizer (PM),” “extensive metabolizer (EM),” etc., refer to general phenotypes (for a probe drug) as predicted by genotyping.
| Study/ design | Patient characteristics | SSRI(s) | Alleles of interest | Predicted phenotypes | Results |
|---|---|---|---|---|---|
| Gerstenberg et al., 200383 (cross-sectional study) | 49 Japanese patients with depression | Fluvoxamine (50 mg 1st week, 100 mg 2nd week, and 200 mg in remaining 4 weeks) | 2D6 *1, *3, *4, *5, *10 | EMs = 25%; IMs = 55%; PMs = 20% | Final MADRS score, % improvement, amelioration score, and proportion of responders not significantly different in the 3 groups (EMs, IMs, PMs). Raw data and p-values NR |
| Grasmader et al., 200472 (cross-sectional study) | 136 depressed patients (70 on SSRIs), ethnicity NR (refers to Caucasians in conclusion) | Fluvoxamine, paroxetine, sertraline, citalopram | CYP2C9 *1 to *3, CYP2C19 *1 and *2, 2D6 *1 to *9 and gene duplication | NR | Plasma concentration above or below lower limit of presumed therapeutic levels did not predict response (p = 0.082 for CGI, p = 0.982 for HAM-D) |
| Murphy et al., 200374 (cross-sectional study) | 246 with depression, ethnicity NR | Paroxetine (n = 120) (and mirtazapine) | 2D6: 16 alleles, deletion, duplication, and *41 allele | PMs = 6.5%; IMs = 10.5%; UMs = 4%; EMs = 79%. For paroxetine, PM + IM (n = 15, 12.5%) vs. EM + UM (n = 105, 87.5%) | No differences between PM + IM vs. EM + UM groups in depression measures (p-values NR) |
Abbreviations: CGI = Clinical Global Impressions Scale; EM(s) = extensive metabolizer(s); HAM-D = Hamilton Rating Scale for Depression; IM(s) = intermediate metabolizer(s); MADRS = Montgomery-Åsberg Depression Rating Scale; NR = not reported; PMs = poor metabolizer(s); SSRI(s) = selective serotonin reuptake inhibitor(s); UM(s) = ultra-rapid metabolizer(s)
| Study/ design | Patient characteristics | SSRI(s) | Alleles of interest | Results |
|---|---|---|---|---|
| Rau et al., 200484 (cross-sectional prevalence study) | 16 patients with non-response to SSRIs (n = 5), SNRIs, ethnicity NR (alludes to white) | Various SSRIs | 2D6 *3, *4, *6, *2, *8, *10, *14, *41, *5 | 18% were UMs (3/16), compared to 2.5 to 3% in the general German population (5-fold increase; p = 0.0013) |
| Kawanishi et al., 200485 (cross-sectional prevalence study) | 108 Nordic Caucasians with depression and non-response to > 2 treatments | Various SSRIs, plus other classes of antidepressants | 2D6 gene duplication, and *2, *3, *4, *5 | Frequency of PM genotype was 0.028 (95% CI 0 to 0.058), less than in general population (0.068). Frequency of UMs in the subgroup of 81 subjects treated with CYP2D6 substrates was 9.9% (95% CI 3.4 to 16.4%), significantly greater than in the general Swedish (1%)/Danish (0.8%) populations (95% CI 0.2 to 1.4%) |
Abbreviations: CI = confidence interval; EM(s) = extensive metabolizer(s); HAM-D = Hamilton Rating Scale for Depression; NR = not reported; PMs = poor metabolizer(s); SNRI(s) = serotonin/norepinephrine reuptake inhibitors; SSRI(s) = selective serotonin reuptake inhibitor(s); UM(s) = ultra-rapid metabolizer(s)
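To make the prevalence comparison in the table above more concrete, the following sketch computes a simple normal-approximation (Wald) confidence interval for a proportion. Two points are assumptions rather than facts from the source: the count of 8 UMs is inferred from the reported 9.9 percent of 81 subjects, and the Wald method is chosen for illustration because the interval method used by the original authors is not stated here.

```python
import math
from statistics import NormalDist

def wald_ci(successes, n, level=0.95):
    """Normal-approximation CI for a proportion, truncated to [0, 1]."""
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Assumed count: 8 of 81 subjects (about 9.9%) classified as UMs.
um_low, um_high = wald_ci(8, 81)

# General-population UM frequencies quoted in the table (Danish, Swedish).
population_rates = (0.008, 0.01)
exceeds_population = all(rate < um_low for rate in population_rates)
```

Under these assumptions the lower bound lies above both population frequencies, consistent with the table's description of the difference as significant. The sketch does not address the design concern that non-responders are being compared with the general population rather than with responders.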
Based on the available evidence, a definitive association of CYP450 (2D6, 2C9, 2C19) genotypes and efficacy of SSRIs cannot be inferred.
Methodological issues in studies addressing this question include:
Study design and power: None of the studies was a prospective randomized trial. Three72, 74, 83 were observational or correlational studies, and two84, 85 were pilot studies of prevalence of CYP polymorphisms in non-responders to antidepressant treatment. All the studies had very small numbers of patients in the UM groups.
Only two studies74, 83 studied individual SSRIs (paroxetine and fluvoxamine, respectively), while the others grouped the SSRIs together or with groups of other antidepressants. Combining various SSRIs, and moreover SSRIs with other antidepressant medications, may have confounded results because of variability in contribution of different CYP enzymes to metabolism of different SSRIs and other antidepressants, and variability in CYP inhibition by different SSRIs.
The two prevalence studies considered84, 85 have the obvious shortcoming of comparing CYP2D6 UM prevalence in depressed non-responder patients to the UM prevalence in the general population. It is possible that CYP2D6 UM phenotype itself is associated with presence of severe depression that is treatment-resistant, which may have accounted for high prevalence of this phenotype in non-responders to antidepressant treatment. It would be more meaningful to compare prevalence rates between responders and non-responders to a given SSRI, which would require a very large sample. In addition, neither of these studies specified exclusion criteria.
The data considered do not lead to any conclusions about the possible impact of race/ethnicity, diet, or other medications on the association between CYP450 genotypes and SSRI efficacy.
Genetic factors affecting serotonin receptor proteins, membrane transporters, and signal transduction molecules could also have important pharmacodynamic effects that could affect SSRI efficacy.86 Thus, examining the impact of pharmacokinetic variability resulting from CYP enzyme polymorphisms on SSRI efficacy in isolation may not be optimal.
The quality assessment criteria we applied to individual studies in this report52 (Appendix E *) yielded a range of scores between “3b” and “4.” For the suggestion that CYP450 genotypes do not affect SSRI efficacy, the grade of recommendation based on available data would be “C.”
Because of the poor quality of relevant data that could be identified to address the question, no firm conclusions can be drawn about the relationship between CYP450 genotypes and efficacy of SSRI treatment in patients with non-psychotic depression.
Question 3c is: How well does CYP450 testing predict adverse drug reactions? Do factors such as race/ethnicity, diet, or other medications, affect this association?
To address this question, we sought to identify all studies in which patients treated with SSRIs were tested for CYP450 genetic polymorphisms. Studies were included irrespective of the method used for genotyping. Because of the overall paucity of data, we included studies that had diagnoses other than non-psychotic depression as an indication for SSRI treatment, as clinical outcomes in such scenarios may be indicative of genotype effects. We also included studies in which only a subgroup of patients was treated with SSRIs, while others were treated with other antidepressants, including tricyclics. Studies that specifically examined adverse effects were particularly sought.
Note: Here, as throughout Chapter 3, the terms “poor metabolizer (PM),” “extensive metabolizer (EM),” etc., refer to general phenotypes (for a probe drug) as predicted by genotyping.
| Study/design | Patient characteristics | SSRI(s) | Alleles of interest | Predicted phenotypes | Results |
|---|---|---|---|---|---|
| Chen et al., 199688 (cross-sectional prevalence study) | 74 patients, ethnicity NR | Various, including paroxetine, fluoxetine, sertraline, fluvoxamine (also TCAs) | 2D6 - A, B, D, E, and T alleles | NR | PM phenotype was significantly more frequent in depressed patients (n = 18; 44%) reporting adverse effects to substrate of 2D6 compared to a random group (n = 56; 21%) of depressed patients (p < 0.05), or compared to the general population |
| Rau et al., 200484 (cross-sectional prevalence study) | 28 patients with adverse effects to SSRIs (9 patients), SNRIs, ethnicity NR (alludes to white) | Various SSRIs | 2D6 *3, *4, *6, *2, *8, *10, *14, *41, *5 | PM: 29%; IM: 7%; EM: 64%; UM: 0 | 29% PMs compared to 7% in the German population (p < 0.0001). There were no differences between PM, IM, and EM groups in frequency of dose reduction (p = 0.14), stopping treatment (p = 0.51), reducing or terminating antidepressant (p = 0.39), or number of adverse effects (p = 0.12) |
| Gerstenberg et al., 200383 (cross-sectional study) | 49 Japanese | Fluvoxamine (50 mg 1st week, 100 mg 2nd week, and 200 mg in remaining 4 weeks) | 2D6 *1, *3, *4, *5, *10 | PM: 20%; EM: 25%; IM: 55% | Incidence of adverse effects (nausea) was not significantly different between the 3 groups (raw data and p-value NR) |
| Murphy et al., 200374 (cross-sectional study) | 246 patients, ethnicity NR | Paroxetine (and mirtazapine, not reported here) | 2D6: 16 alleles, deletion, duplication, and *41 allele | PM: 6.5%; IM: 10.5%; UM: 4%; EM: 79% | No differences between PM + IM vs. EM + UM groups in severity of adverse effects or frequency of discontinuation (p-values NR) |
| Roberts et al., 200489 (cross-sectional study) | 125 patients, ethnicity NR | Fluoxetine n = 65 (randomized to fluoxetine or nortriptyline) | 2D6 alleles *1 to *16, *19, *20 | PM: 9%; EM: 91% | PMs were no more likely to experience adverse effects than EMs (17% of PMs vs. 41% of EMs) and were no more likely to drop out of the study than EMs (PMs 33% vs. EMs 14%) (p-values NR) |
| Suzuki et al., 200690 (cross-sectional study) | 97 Japanese | Fluvoxamine (25–200 mg) | 2D6 alleles *5, *10 | PM: 22.7%; EM: 77.3% | Greater prevalence of GI side effects in PMs compared to EMs (p = 0.043; CI 1.019 to 3.254). Discontinuation rates similar between PMs and EMs (p = 0.310) |
Abbreviations: CI = confidence interval; DSM-IV = Diagnostic and Statistical Manual of Mental Disorders, 4th edition; EM(s) = extensive metabolizer(s); GI = gastrointestinal; HAM-D = Hamilton Rating Scale for Depression; IM(s) = intermediate metabolizer(s); MADRS = Montgomery-Åsberg Depression Rating Scale; NR = not reported; PMs = poor metabolizer(s); SNRI(s) = serotonin/norepinephrine reuptake inhibitors; SSRI(s) = selective serotonin reuptake inhibitor(s); TCAs = tricyclic antidepressants; UM(s) = ultra-rapid metabolizer(s)
All six studies examined CYP2D6 polymorphisms only. Three of the six studies reported no differences in rates of adverse effects between PMs and EMs,83, 84, 89 while a fourth74 reported no differences in adverse effects between the combined PM + IM and EM + UM groups. One study found a greater prevalence of gastrointestinal (GI) adverse effects in PMs compared to EMs.90 This study also found that the combination of CYP2D6 polymorphism and serotonin receptor 5HT2A polymorphism predicted GI adverse effects, such that PM + GG and PM + AG had a significantly greater risk of developing GI side effects compared to EM + AA.
Two studies84, 88 found a significantly higher prevalence of PMs in depressed patients with adverse effects than in the general population. One of these88 also found the PM phenotype to be more frequent in depressed patients with adverse effects than in a random group of depressed patients. Studies that reported types of adverse effects reported a range of typical SSRI adverse effects including but not limited to anxiety, agitation, restlessness, nausea, GI upset, headache, sleep disturbance, and sexual dysfunction.83, 84, 89 The most common adverse effect reported in studies was nausea.
Although four studies did not find any differences in adverse effects in PMs versus EMs, these studies are heterogeneous, with major methodological problems, including:
Study design and power: None of the studies was a prospective randomized trial. Four74, 83, 89, 90 were observational or correlational studies, and two84, 88 were pilot studies of the prevalence of CYP polymorphisms in patients who had adverse effects with antidepressant treatment. All the studies had very small numbers of patients in the PM groups.
Three studies examined individual SSRIs74, 83, 89 (paroxetine, fluvoxamine, and fluoxetine, respectively), whereas the other two grouped the SSRIs together or with groups of other antidepressants. Combining various SSRIs, and moreover SSRIs with other antidepressant medications, may have confounded results because of variability in contribution of different CYP enzymes to metabolism of different SSRIs and other antidepressants, and variability in CYP inhibition by different SSRIs.
The two prevalence studies considered84, 88 did not specify exclusion criteria. Moreover, comparing a group of patients with adverse effects to a particular SSRI to a group of patients with no adverse effects to that SSRI may have been more meaningful, but will require a large number of patients.
The data considered do not lead to any conclusions about the possible impact of race/ethnicity, diet, or other medications, on the association between CYP450 genotypes and adverse effects to SSRIs.
Genetic factors affecting serotonin receptor proteins, membrane transporters, and signal transduction molecules could also have important pharmacodynamic effects that could affect SSRI tolerability.86 Thus, examining the impact of pharmacokinetic variability resulting from CYP enzyme polymorphisms on SSRI tolerability in isolation may not be optimal. Only one study90 addressed this issue and did in fact show combined effects of CYP2D6 and 5HT2A polymorphisms on GI adverse effects, further supporting this point.
The quality assessment criteria we applied to individual studies in this report52 (Appendix E *) yielded a range of scores between “2b” and “4.” For the suggestion that CYP450 genotypes do not affect SSRI tolerability, the grade of recommendation based on available data would be “C.”
Because of the poor quality of relevant data that could be identified to address the question, no firm conclusions can be drawn about the relationship between CYP450 genotypes and tolerability of SSRI treatment in patients with non-psychotic depression.
Question 4 is:
(a) Does CYP450 testing influence depression management decisions by patients and providers in ways that could improve or worsen outcomes?
(b) Does the identification of the CYP450 genotypes in adults entering SSRI treatment for non-psychotic depression lead to improved clinical outcomes compared to not testing?
(c) Are the testing results useful in medical, personal, or public health decisionmaking?
To address this question, we sought to identify studies in which patients treated with SSRIs were tested for CYP450 genetic polymorphisms, and in which investigators reported on the impact of such testing on outcomes or on medical, personal, or public health decisionmaking. Even after relaxing our inclusion criteria to include all methods used for genotyping and all indications for SSRI treatment, we were unable to identify any studies that directly addressed any aspect of this question. In addition, we did not find any studies examining the effect of CYP genotypes on SSRI inhibition of CYP enzymes, leading to adverse effects associated with concurrent medications.
Question 5 is: What are the harms associated with testing for CYP450 polymorphisms and subsequent management options?
To address this question, we sought to identify studies in which patients treated with SSRIs were tested for CYP450 genetic polymorphisms, and in which investigators reported on harms or negative outcomes associated with testing or with subsequent management options. It may be hypothesized that, like other genetic tests, CYP genotyping could raise issues of labeling (“treatment-resistant” in the case of UMs) in the minds of providers, patients, or third-party payers that may negatively impact outcomes. This question of harm therefore is very relevant as we consider feasibility of CYP genotyping in practice.
Even after relaxing our inclusion criteria to include all methods used for genotyping and all indications for SSRI treatment, we were unable to identify any studies that directly addressed any aspect of this question.
This section explores the potential clinical impact of CYP450 genotype testing as a guide to therapy of patients newly diagnosed with depression.
In deciding whether to use CYP450 genotype testing to guide depression therapy, it would be ideal to have direct scientific studies demonstrating that use of genotype testing leads to improved clinical outcomes. In the absence of such direct evidence, decision modeling can be used to provide indirect evidence based, for example, on the relationship between genotype and the metabolism of a specific selective serotonin reuptake inhibitor (SSRI) (phenotype), and the relationship between phenotype and responsiveness to therapy. Examining these clinical relationships is of paramount importance. Genetic testing, like all forms of diagnostic tests, should only be promoted if the potential benefits (such as improved response to treatment for depression) outweigh the potential harms (such as increased adverse effects).
Decision analysis is a tool that provides a mechanism for inferring the likely outcomes of competing options by modeling the relationship between each option and the outcome of interest. Such decision models provide a framework for linking information from multiple sources (e.g., epidemiological studies, test performance studies, treatment efficacy studies, and surveys of patient preferences and quality of life). In addition to providing a “best guess” about the impact of a particular decision, decision models can offer insight into the dynamic relationship between various clinical inputs and decision-relevant outcomes: under what circumstances is one decision preferred over others? This use of a decision model is especially valuable when the input data are not particularly strong, as is the case here.
We constructed and evaluated a decision model to address the question: Under what circumstances would genetic testing for CYP isoenzymes during the initial evaluation of an individual with non-psychotic major depression lead to a better clinical outcome, when compared to empiric SSRI therapy?
Population. The population of interest for the model was treatment-naïve adults who met the DSM-IV criteria for major depression. They were otherwise generally healthy and not taking medications that could interact with SSRIs.
Model structure. The model is a simple tree structure (Figure 2).
For each strategy, an individual could have one of three phenotypes: ultra-rapid metabolizer, extensive metabolizer/intermediate metabolizer, or poor metabolizer, with a probability based on the distribution of phenotypes in the population. We combined the extensive and intermediate metabolizers into a single phenotype to simplify the model since there was little data to support a difference in response to therapy for these two groups. For the first model strategy (use of a non-CYP metabolized SSRI without testing), the likelihood of treatment success is assumed to be the same for all phenotypes. For the second option (use of genetic testing to select SSRI), patients with genotypes that correspond to phenotypes with a high probability of treatment failure (ultra-rapid and poor metabolizers) would receive the more expensive non-CYP metabolized medication, while those not at high risk (extensive and intermediate metabolizers) would receive the less expensive CYP metabolized one. For the third option (use of genetic testing to select dose of SSRI), results of the genetic test are used to adjust the dose of the CYP metabolized medication. A low dose would be used for poor metabolizers, a standard dose for extensive and intermediate metabolizers, and a high dose for ultra-rapid metabolizers. For the fourth option (use of a CYP metabolized SSRI without testing), the likelihood of treatment success depends upon phenotype.
The model was created as a decision tree using TreeAge ProSuite 2006 (TreeAge Software Inc, Williamstown, MA).
| Description | Value | Source |
|---|---|---|
| Prevalence of ultra-rapid metabolizers in general depressed population | 0.03 | Grasmader et al., 2004 (ref. 68); Charlier et al., 2003 (ref. 66) |
| Prevalence of extensive metabolizers in general depressed population | 0.86 | Grasmader et al., 2004 (ref. 68); Charlier et al., 2003 (ref. 66) |
| Prevalence of poor metabolizers in general depressed population | 0.11 | Grasmader et al., 2004 (ref. 68); Charlier et al., 2003 (ref. 66) |
| Utility of untreated depression | 0.32 | Bennett et al., 2000 (ref. 88) |
| Utility of treated depression | 0.99 | Expert opinion |
| Probability of responding to sertraline | 0.56 | Rossini et al., 2005 (ref. 89) |
| Cost of medication primarily metabolized by CYP450 (fluoxetine), $ | 12 | Anonymous (ref. 90) |
| Cost of medication not primarily metabolized by CYP450 (sertraline), $ | 130 | Anonymous (ref. 90) |
| Cost of genetic testing, $ | 1000 | Palylyk-Colwell, 2006 (ref. 91) |
| Description | High correlation | Low correlation |
|---|---|---|
| Probability phenotype poor will have genotype poor | 0.58 | 0.35 |
| Probability phenotype poor will have genotype extensive | 0.37 | 0.39 |
| Probability phenotype poor will have genotype ultra-rapid | 0.05 | 0.26 |
| Probability phenotype extensive will have genotype poor | 0.2 | 0.23 |
| Probability phenotype extensive will have genotype extensive | 0.45 | 0.35 |
| Probability phenotype extensive will have genotype ultra-rapid | 0.35 | 0.42 |
| Probability phenotype ultra-rapid will have genotype poor | 0.14 | 0.13 |
| Probability phenotype ultra-rapid will have genotype extensive | 0.49 | 0.36 |
| Probability phenotype ultra-rapid will have genotype ultra-rapid | 0.5 | 0.38 |
| Probability of responding to high dose fluoxetine if phenotype ultra-rapid | 0.61 | 0.56 |
| Probability of responding to high dose fluoxetine if phenotype extensive | 0.5 | 0.45 |
| Probability of responding to high dose fluoxetine if phenotype poor | 0.4 | 0.21 |
| Probability of responding to medium dose fluoxetine if phenotype ultra-rapid | 0.5 | 0.45 |
| Probability of responding to medium dose fluoxetine if phenotype extensive | 0.61 | 0.56 |
| Probability of responding to medium dose fluoxetine if phenotype poor | 0.5 | 0.45 |
| Probability of responding to low dose fluoxetine if phenotype ultra-rapid | 0.4 | 0.21 |
| Probability of responding to low dose fluoxetine if phenotype extensive | 0.5 | 0.45 |
| Probability of responding to low dose fluoxetine if phenotype poor | 0.61 | 0.56 |
In clinical decisionmaking, a key question is the probability that any particular genotype will correspond to a particular level of drug metabolism (phenotype). This question is paramount since the phenotype is purported to affect the likelihood of treatment success, in terms of both effectiveness and adverse effects. However, the available literature presents limited data on these essential probability estimates. In the absence of data, we used the technique of bootstrapping to back-calculate probabilities that were consistent with two correlation coefficients (0.2 and 0.8). Specifically, we created a series of tables (genotype x phenotype) in which synthetic patient samples were assigned to cells with the target correlation coefficient; the cells were divided by the row totals, and the resulting elements were the estimated probabilities that a specific genotype would be associated with a specific phenotype. We repeated this exercise for both levels of correlation on each genotype-to-phenotype pair. For example, when the correlation between genotype and phenotype is 0.8 (high), the estimated probability that an ultra-rapid phenotype will have an ultra-rapid genotype is 0.5; if the correlation is 0.2 (low), the estimated probability is only 0.38.
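The row-normalization step described above can be sketched as follows. This is only an illustration: the counts are hypothetical (chosen so that the two rows shown reproduce the high-correlation values listed for the poor and extensive phenotypes), rows are taken to be phenotypes because the table entries are phrased as "probability phenotype X will have genotype Y", and the function name `row_normalize` is ours. It does not reproduce the bootstrapping used to generate the synthetic samples.

```python
def row_normalize(counts):
    """Divide each cell by its row total so that every row sums to 1."""
    probs = {}
    for row_label, row in counts.items():
        total = sum(row.values())
        if total == 0:
            raise ValueError(f"row {row_label!r} has no observations")
        probs[row_label] = {col: n / total for col, n in row.items()}
    return probs


# Hypothetical synthetic sample: rows are phenotypes, columns are genotypes.
# Only two rows are shown for brevity.
counts = {
    "poor":      {"poor": 58, "extensive": 37, "ultra-rapid": 5},
    "extensive": {"poor": 20, "extensive": 45, "ultra-rapid": 35},
}

probs = row_normalize(counts)
for phenotype, row in probs.items():
    cells = ", ".join(f"{genotype}={p:.2f}" for genotype, p in row.items())
    print(f"phenotype {phenotype}: {cells}")
```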
The clinical predictive value of phenotype is reflected in the model as the probability that an individual with a specific phenotype will respond to a specific SSRI. These estimates were based upon expert opinion; however, their clinical plausibility was verified by comparing calculated overall population response rates (using the estimates and known prevalence rates) to published response rates.
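A plausibility check of this kind can be sketched as a prevalence-weighted average. The sketch below assumes that everyone receives the medium (standard) dose of fluoxetine and uses the prevalence and response values from the input tables; the dictionary and function names are ours, and the calculation is not a reconstruction of the full decision model.

```python
# Phenotype prevalence from the model input table.
prevalence = {"ultra-rapid": 0.03, "extensive": 0.86, "poor": 0.11}

# Probability of responding to medium-dose fluoxetine by phenotype, for the
# high- and low-correlation columns of the input table.
response_medium_dose = {
    "high": {"ultra-rapid": 0.50, "extensive": 0.61, "poor": 0.50},
    "low":  {"ultra-rapid": 0.45, "extensive": 0.56, "poor": 0.45},
}


def population_response(prevalence, response_by_phenotype):
    """Prevalence-weighted overall response rate across phenotypes."""
    return sum(prevalence[p] * response_by_phenotype[p] for p in prevalence)


for scenario, response in response_medium_dose.items():
    rate = population_response(prevalence, response)
    print(f"{scenario} correlation: overall response = {rate:.4f}")
```

Under these assumptions the overall rates come out at roughly 0.59 and 0.54, which can then be set beside published response rates.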
As a practical strategy for examining the impact of variations in the relationship between genotype and phenotype, and between phenotype and clinical response, we created four scenarios for levels of linkage between genotype and clinical outcome. These four scenarios corresponded to the four possible combinations of level of correlation between genotype and phenotype (high or low), and correlation between phenotype and clinical response (high or low).
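The four scenarios are simply the cross product of two binary choices, which can be enumerated as below. The key names are illustrative labels of ours, not terms from the model software.

```python
from itertools import product

LEVELS = ("high", "low")

# One scenario per combination of the two correlation levels.
scenarios = [
    {"genotype_phenotype": gp, "phenotype_response": pr}
    for gp, pr in product(LEVELS, repeat=2)
]

for s in scenarios:
    print(f"genotype-phenotype: {s['genotype_phenotype']:<4}  "
          f"phenotype-response: {s['phenotype_response']}")
```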
Response rates for the non-CYP metabolized medication were assumed to be the same for all three genotypes since metabolism is not affected significantly by any one of the polymorphisms. For the purposes of this model we assumed that the analytic sensitivity and specificity of the genetic testing used in the field compared to a gold standard genetic testing was 100 percent.
In order to understand the impact of each strategy on quality of life, patient outcomes were adjusted by a quality of life multiplier. This multiplier is intended to represent patient preferences for a given health state as a utility. We used the utility of moderate depression to represent those individuals who did not respond to medication by 6 weeks, and a utility very close to that of non-depressed healthy individuals for those who did.88
Outcomes. We estimated two different clinical outcomes at 6 weeks: percent response to medical therapy and cumulative quality-adjusted survival at 6 weeks (in years). Response to medical therapy was defined as a 50 percent or greater improvement as measured by the HAM-D scale. We chose to measure these outcomes at 6 weeks, since response to an initial 6-week trial predicts both ultimate success with a medication and adherence to it. Longer time frames do not improve the response to initial therapy, and since adverse effects are rarely serious, the greatest potential benefit of genetic testing will be to improve initial response rates. In addition, we calculated the average cost for each strategy over a single trial of therapy (6 weeks).
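One conventional way to turn a response rate into a utility-weighted outcome is sketched below. The report does not state the exact formula used, so both the weighting and the scaling by 6/52 of a year are assumptions, and the sketch is not intended to reproduce the report's figures. The utilities are the values from the input table; the function names are ours.

```python
WEEKS_PER_YEAR = 52.0


def expected_utility(p_response, u_responder=0.99, u_nonresponder=0.32):
    """Average utility when a fraction p_response of patients respond."""
    if not 0.0 <= p_response <= 1.0:
        raise ValueError("p_response must be between 0 and 1")
    return p_response * u_responder + (1.0 - p_response) * u_nonresponder


def quality_adjusted_years(p_response, weeks=6.0):
    """Assumption: the utility is applied uniformly over the whole period."""
    return expected_utility(p_response) * weeks / WEEKS_PER_YEAR


p = 0.56  # probability of responding to sertraline, from the input table
print(f"expected utility: {expected_utility(p):.4f}")
print(f"quality-adjusted years over 6 weeks: {quality_adjusted_years(p):.4f}")
```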
Analyses. In decision modeling it is typical to create a best-guess or “base case” estimate of outcomes. Given the lack of high-quality data permitting a credible point estimate for model inputs, we chose to provide results for each of the four levels of linkage described above. For each of these levels, we also performed one-way sensitivity analysis on all other model inputs (that is, other than probabilities related to the levels of linkage).
For each of the four scenarios, treating with a non-CYP metabolized SSRI without testing was the most effective strategy, while treating with a CYP metabolized SSRI without testing was the least effective. Of the two testing strategies examined, using testing to guide use of a CYP versus a non-CYP metabolized SSRI was superior to using testing to guide the dose of a CYP metabolized SSRI, both in terms of response rates and quality-adjusted life. However, as the level of linkage between genotype and phenotype increased, the difference in efficacy between the two testing strategies and between the testing strategies and the dominant strategy narrowed, such that at the high linkage level both testing strategies approached the efficacy of the optimal strategy of using a non-CYP metabolized SSRI. For example, in the low linkage scenario, the difference between the two testing strategies was 7.92% in response rate and 0.04 years for cumulative quality-adjusted survival, while in the high linkage scenario the difference was only 0.78% in response rate and 0.005 years for cumulative quality-adjusted survival at 6 weeks.
One-way sensitivity analyses were performed for the following variables: prevalence of each phenotype, utility of depression, probability of responding to sertraline, cost of fluoxetine, cost of sertraline, and cost of genetic testing. The results of these analyses (not shown) were robust, with the relationship between the various options remaining similar at all levels of linkage between genotype and clinical response.
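The mechanics of a one-way sensitivity analysis can be sketched as follows: each input is moved to a low and a high value in turn while all others stay at their base values. The base values come from the input table, but the low/high bounds are hypothetical, the outcome (utility-weighted result of empiric non-CYP therapy) is a simplification chosen for illustration, and the function names are ours.

```python
BASE = {"p_response": 0.56, "u_responder": 0.99, "u_nonresponder": 0.32}

# Hypothetical low/high bounds for the inputs being varied.
RANGES = {
    "p_response": (0.40, 0.70),
    "u_nonresponder": (0.20, 0.45),
}


def expected_utility(params):
    """Utility-weighted outcome for empiric therapy with one drug."""
    p = params["p_response"]
    return p * params["u_responder"] + (1.0 - p) * params["u_nonresponder"]


def one_way(outcome, base, ranges):
    """Evaluate the outcome at each bound of one input, others at base."""
    table = {}
    for name, (low, high) in ranges.items():
        table[name] = tuple(outcome({**base, name: value}) for value in (low, high))
    return table


results = one_way(expected_utility, BASE, RANGES)
print(f"base case: {expected_utility(BASE):.4f}")
for name, (at_low, at_high) in results.items():
    print(f"{name}: {at_low:.4f} to {at_high:.4f}")
```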
Because of the non-trivial cost of testing, 6-week costs are always greater for the testing strategies (results not shown), even when compared to the strategy of using a non-CYP metabolized drug without testing. For example, using genetic testing to guide medication choice cost $909 more than empiric therapy with a non-CYP medication, while using genetic testing to guide CYP dosing cost $882 more. The least effective strategy was also the least expensive: empiric treatment with a CYP metabolized medication cost $118 less than empiric treatment with a non-CYP medication. However, if the length of treatment is expected to exceed approximately 9 months, the costs of the test strategies break even.
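The cost differences for three of the four strategies follow directly from the input table, as the sketch below shows. Two things are assumptions: that the drug cost does not change with dose, and the share of tested patients who are assigned the CYP metabolized drug, which is not reported here and is therefore a placeholder parameter. All names are ours.

```python
# Input costs for a single 6-week trial of therapy, in dollars.
COST_CYP_DRUG = 12.0       # fluoxetine
COST_NON_CYP_DRUG = 130.0  # sertraline
COST_TEST = 1000.0


def strategy_costs(fraction_given_cyp_drug):
    """Return the 6-week cost per patient for each strategy.

    fraction_given_cyp_drug is a hypothetical input: the share of tested
    patients who receive the CYP metabolized drug under test-guided choice.
    """
    if not 0.0 <= fraction_given_cyp_drug <= 1.0:
        raise ValueError("fraction_given_cyp_drug must be between 0 and 1")
    drug_mix = (fraction_given_cyp_drug * COST_CYP_DRUG
                + (1.0 - fraction_given_cyp_drug) * COST_NON_CYP_DRUG)
    return {
        "empiric_non_cyp": COST_NON_CYP_DRUG,
        "test_then_choose_drug": COST_TEST + drug_mix,
        # Assumption: dose-adjusted fluoxetine costs the same at any dose.
        "test_then_dose_cyp": COST_TEST + COST_CYP_DRUG,
        "empiric_cyp": COST_CYP_DRUG,
    }


costs = strategy_costs(fraction_given_cyp_drug=0.5)  # 0.5 is a placeholder
reference = costs["empiric_non_cyp"]
for name, cost in costs.items():
    print(f"{name}: {cost:.0f} ({cost - reference:+.0f} vs empiric non-CYP)")
```

With these inputs, test-guided dosing comes out $882 above empiric non-CYP therapy and empiric CYP therapy $118 below it, matching the figures in the text; the figure for test-guided drug choice depends on the placeholder fraction.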
In this analysis of the potential impact of CYP450 genotype testing on treatment outcomes in a trial of SSRI therapy, use of a non-CYP metabolized SSRI without testing was always the most effective strategy, and use of a CYP metabolized SSRI without testing was always the least effective. The two genetic testing strategies considered (testing to guide the choice between a CYP and a non-CYP metabolized SSRI, and testing to guide the dose of a CYP metabolized SSRI) had intermediate efficacy. The degree of efficacy depended primarily on the linkage between genotype and clinical outcome. At relatively low levels of linkage, testing provides little benefit over use of a CYP metabolized SSRI without testing. Testing approached the optimal efficacy only at the highest levels of linkage between genotype and clinical outcome. Further, the modeling exercise suggests that the most important element of the link is the ability of genotype to predict phenotype. It is notable that these results apply even though it was assumed that the analytic validity of the test used (ability of the test to discern true genotype) was perfect.
Given the lack of evidence regarding many of the model inputs, it is important not to overstate the specific numerical results. However, the analysis does provide insight into the reasons why various strategies may or may not be clinically desirable. What is easiest to explain is the superiority of the strategy using a non-CYP drug without testing. The reason is that we assume that non-CYP medications do not have increased adverse event rates or reduced response rates in the poor and ultra-rapid metabolizers, respectively, and the CYP drug was assumed to never be superior for any phenotype. What may be less evident is why neither testing strategy was optimal for any combination of plausible model inputs. The explanation is that an imperfect genetic test (i.e., one that provides less than perfect guidance to metabolism, efficacy, or adverse effects) can lead to worse outcomes for misclassified individuals. When ultra-rapid or poor metabolizers are misclassified as extensive metabolizers, they are mistakenly managed with higher risk treatments. In the strategy in which testing is used to guide use of a CYP or non-CYP metabolized SSRI, misclassified individuals are given a CYP metabolized SSRI at standard doses, increasing their risk for adverse effects or lowering the probability of responding. In a strategy in which genetic testing is used to adjust the dose of a CYP metabolized SSRI, misclassified individuals are offered either very high or very low doses of the CYP metabolized SSRI, effectively doubling their risk of a poor outcome.
This basic analysis suggests that when non-CYP metabolized SSRIs are available, they should be used. When this approach is not feasible, CYP genotyping may provide similar patient outcomes if the test results can be shown to be highly predictive of clinical response. A difficulty in supporting the use of CYP450 genotype testing is the lack of evidence regarding the ability of CYP genotyping to guide treatment; if the correlation between genotype and outcomes is only modest, testing strategies are unlikely to be much more effective than treating with a CYP metabolized SSRI without testing. Also, since testing has its own cost, testing strategies do not save costs, even for the optimistic “high correlation” scenario, unless expected treatment duration exceeds approximately 9 months.
Clearly, studies of the relationship between genotype and clinical outcomes present a high-value target for future research. Additional modeling that includes variable lengths of treatment and the possibility of treatment changes would help clarify the likely impact of CYP450 genotype testing on long-term benefits, risks, and costs.
The cytochrome P450 (CYP450) enzyme system is prominently involved in the metabolism of each of the currently available selective serotonin reuptake inhibitors (SSRIs). Pharmacokinetic variability resulting from CYP polymorphisms can potentially impact metabolism of SSRIs. It has been proposed that genotyping may provide information to guide selection and dosing of SSRI therapy, leading to improved efficacy and reduced adverse effects.
In this report we identified and evaluated published research and publicly available U.S. Food and Drug Administration (FDA) reports related to the use of CYP genotyping as it relates to the clinical care of individuals with severe non-psychotic depression, focusing on five key questions: (1) the impact of CYP450 genotyping on outcomes in the treatment of depression, and on medical, personal, and public health decisionmaking (overarching question); (2) the analytic validity of tests available for CYP450 genotyping; (3) the impact of CYP genotypes on SSRI metabolism, efficacy, and tolerability (i.e., clinical validity); (4) the impact of CYP testing on management decisions, clinical outcomes (vs. not testing), and decisionmaking (i.e., clinical utility); and (5) the potential harms associated with testing and with subsequent management options.
We identified moderately good-quality evidence regarding the operating characteristics of clinical tests used for CYP genotyping (Question 2). However, there was a paucity of high-quality clinical studies addressing the other key questions. In particular, there was no evidence for Questions 1, 4a, 4b, 4c, and 5, and evidence for questions 3a, 3b, and 3c was of limited quality.
Methodological issues identified include the following:
We did not find a single prospective study of CYP450 genotyping and its relationship to clinical outcomes. Most studies were small, poor-quality cross-sectional studies examining prevalence rates of certain genotypes in the sample, or examining the differences between various genotypes and limited clinical outcomes, such as response or adverse effects.
There were no randomized studies of alternative testing strategies.
Almost all of the studies identified as reporting on a novel technique for CYP genotyping failed to report key measurements attesting to the robustness, repeatability, and quality control of their proposed methods. Rarely was it possible to calculate the positive and negative predictive values of the tests and fully evaluate all aspects relevant to analytic validity. Additionally, researchers often reported allele frequencies rather than genotype frequencies, preventing assessment of specificity and sensitivity at the clinically relevant level. Moreover, the small sample sizes used in most of these studies severely diminish the reliability of the proposed tests, as reflected in wide confidence intervals.
Many reports did not take into account concurrent medications. Medications that inhibit or induce certain CYP enzymes, including SSRIs themselves, can affect metabolism of CYP metabolized drugs. Additionally, we did not identify any studies that examined effects of CYP inhibition/induction together with genetic polymorphisms of CYP enzymes (e.g., is there an additive effect of a CYP2D6 inhibitor medication in a CYP2D6 poor metabolizer [PM] subject such that SSRI levels are higher than the levels without such an inhibitor medication in a CYP2D6 PM subject?).
Several studies looked at limited genotypes and did not account for the fact that more than one CYP enzyme may be involved in the metabolism of a specific SSRI.
Many studies examining the clinical outcomes of efficacy or adverse effects did not comment on blinding between treating clinicians and those responsible for interpreting results of genetic testing, or patient blinding.
Many studies grouped together multiple SSRIs, or SSRIs and other antidepressants. This approach can potentially confound results because of variability in contribution of different CYP enzymes to metabolism of different SSRIs and other antidepressants, and variability in CYP inhibition by different SSRIs.
We found only one study that examined the combined effect of a CYP450 polymorphism and a polymorphism in the serotonin 2A receptor.90 Genetic factors affecting serotonin receptor proteins, membrane transporters, and signal transduction molecules have important pharmacodynamic effects that could affect SSRI efficacy or tolerability.50, 86, 97–108 Thus, genetic factors other than pharmacokinetic factors can impact SSRI outcomes, and it may be suboptimal to examine effects of CYP polymorphisms on SSRI outcomes in isolation. Multivariable pathway analysis studies are now starting to emerge, and may provide more information regarding the proportion of risk for poor outcomes in SSRI treatment of depression that may be attributable to a certain factor, such as CYP polymorphisms. A recent study109 searched for genetic predictors of treatment outcome in 1953 patients with non-psychotic major depression treated with the SSRI citalopram. Sixty-eight candidate genes were genotyped, with 768 single-nucleotide polymorphism markers chosen to detect common genetic variation. A significant association was found between treatment outcome and the HTR2A gene, which encodes the serotonin 2A receptor. Genes primarily involved in drug metabolism were excluded from this study, but are under study by another group using the same DNA samples. These forthcoming results may be particularly relevant to some of the questions posed in this report.
The rated quality of data did not improve even when we were generous in our inclusion criteria and included studies examining SSRI treatment of conditions other than depression, or when we included other antidepressants in addition to SSRIs.
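One of the issues listed above is that positive and negative predictive values could rarely be calculated. For reference, the sketch below shows how they follow from sensitivity, specificity, and prevalence. The sensitivity and specificity are hypothetical, the prevalence is the poor-metabolizer value from the model input table, and the function name is ours.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test at a given prevalence."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0 or true_neg + false_neg == 0.0:
        raise ValueError("predictive value undefined for these inputs")
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics; prevalence of poor metabolizers is 0.11.
ppv, npv = predictive_values(sensitivity=0.95, specificity=0.95, prevalence=0.11)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```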
We did not find any data to address directly the overarching question of whether testing for CYP450 polymorphisms in adults entering SSRI treatment for non-psychotic depression leads to improvement in outcomes, or whether testing results are useful in medical, personal, or public health decisionmaking.
We identified only a few studies of test performance relative to the gold standard of DNA sequencing, applied to a limited number of genetic variants. Many studies appear to be in the realm of preclinical evaluations and are not clearly relevant to the domain of clinical practice.
These data do suggest that the analytic sensitivity and specificity of available tests are generally high. One concern may be that in the evaluation of gene deletions and duplications, assessing the magnitude of the potential problem is limited by the lack of an established gold standard for gene copy number. Another concern is that few CYP450 variants are included in the studies we identified, particularly less common variants.
In healthy CYP2C19 PMs, there is evidence of slower metabolism of SSRIs after a single dose, whereas in CYP2D6 PMs, the evidence is weaker. In depressed patients who have reached a steady-state concentration of an SSRI, the existing data (a series of heterogeneous studies in small samples) do not support a clear correlation between CYP metabolizer status and SSRI concentrations.
In depressed patients, the existing data (a series of heterogeneous studies in small samples) do not support a clear correlation between CYP metabolizer status and the efficacy of SSRIs.
In depressed patients, the existing data (a series of heterogeneous studies in small samples) do not support a clear correlation between CYP metabolizer status and the tolerability of SSRIs.
We did not identify any studies that addressed whether CYP450 testing influences depression management decisions by patients and providers in ways that could improve or worsen outcomes, or whether testing for CYP450 polymorphisms in adults entering SSRI treatment for non-psychotic depression leads to improved clinical outcomes compared to not testing. Also, there were no data examining whether testing results are useful in medical, personal, or public health decisionmaking.
There were no data on possible direct or indirect harms associated with testing for CYP450 polymorphisms and subsequent management options.
As a complement to the evidence review, we constructed a basic decision model to consider the circumstances under which testing for CYP polymorphisms could improve clinical outcomes, or favorably impact costs. We examined four strategies: (1) use a non-CYP metabolized SSRI without testing; (2) test and choose a non-CYP or CYP metabolized SSRI based on the result; (3) test and choose the dose of a CYP metabolized SSRI based on the result; and (4) use a CYP metabolized SSRI without testing. In no plausible scenario was a testing strategy predicted to improve expected outcomes of treatment at 6 weeks. The efficacy of a test strategy could approach the efficacy of use of a non-CYP metabolized drug, although this required the condition that a high correlation exist between genotype and phenotype (metabolizer status), as well as between phenotype and clinical outcomes. Current evidence does not support the conclusion that such high correlations apply. Moreover, the cost of testing is not offset by treatment savings if treatment duration is less than approximately 9 months.
This report has two potentially significant limitations:
First, we included only articles published in English. While this could lead to missing important studies, we suspect the likelihood of such exclusion is low, as we identified only one study that met the inclusion criteria at the abstract screening stage that was excluded at the full-text screening stage because the full report was in another language.110
A second potential limitation is that we only included peer-reviewed publications and data publicly available from the FDA. This inclusion criterion was based on the judgment of the technical expert panel that it would be difficult to assess the quality of information from other sources (for example, data from manufacturer websites may be biased in favor of the product, or data from scientific meetings may be subject to change when published in peer-reviewed journals).
We propose the following conceptual model to guide future research in cytochrome P450 (CYP450) polymorphism testing for depression management. Broadly speaking, the rationale behind CYP450 testing in patients with non-psychotic depression is as follows:
(a) Major depressive disorder is a significant public health problem.
(b) While selective serotonin reuptake inhibitors (SSRIs) are the first-line treatment for depression, they are associated with a high rate of non-response to treatment, which presents an opportunity to improve public health by improving response rates to SSRI treatment.
(c) SSRI treatment efficacy involves modulation of brain levels of neurotransmitters and consequent adjustments of related pathways, processes that require several weeks to achieve a new steady state. One factor that possibly makes identification of the optimal SSRI treatment (i.e., specific SSRI and/or optimal dose) difficult in a specific clinical situation is the CYP polymorphism-associated differences between patients in the rate of metabolism of SSRIs.
(d) CYP450 testing can potentially be used to predict the rate of SSRI metabolism (i.e., to classify patients as poor, intermediate, extensive, or ultra-rapid metabolizers) and, thus, potentially can reduce the amount of trial and error required to select the optimal SSRI in a specific clinical situation.
(e) The better the operating characteristics of CYP450 testing in predicting metabolizer status, the greater the potential of CYP450 testing to improve the process of identifying the optimal SSRI treatment.
(f) However, the more that factors other than CYP450 enzymes affect the metabolism of SSRIs (e.g., environmental effects, concomitant medications) or SSRI-associated outcomes (e.g., genetic factors associated with the pharmacodynamics of SSRIs, including genetic variability in serotonin receptor proteins or transporter proteins), the less useful CYP450 testing will be.
(g) Because depression is not often acutely life-threatening (except in severe cases with suicidal ideation) and SSRIs are rarely associated with life-threatening adverse effects, the main impact of CYP450 testing is likely to be in reducing the time to find the optimal SSRI, and in reducing the likelihood of adverse effects that would have been expected to occur with a suboptimal SSRI that might have been prescribed in the absence of CYP450 testing, thereby potentially reducing disease-management costs.
(h) Finally, the impact of reducing the time to find the optimal SSRI and reducing the likelihood of SSRI-related adverse effects during the initial dosing period is strong enough to be important to patients (e.g., by improving their quality of life or decreasing absenteeism from work).
The eight elements described above can be specifically matched to our key questions as follows:
Question 1: Points (a) through (h).
Question 2: Point (e).
Question 3a, 3b, 3c: Points (c), (d), (e), and (f).
Question 4a, 4b, 4c: Points (g) and (h).
Question 5: Points (c) through (h).
This report reviewed the literature pertaining to the above rationale and found that, although some information exists, as a whole it is not sufficient to draw firm conclusions about whether this rationale, while intuitively reasonable, is in fact true. Nevertheless, this rationale can be used to help classify the future research that we recommend would be helpful. In particular, two types of studies can be envisioned.
The first type of study would better elucidate individual steps in the above rationale. For example, although we do not believe that additional studies are needed for points (a) and (b), the other points need additional studies that could be designed as follows:
Regarding point (c), studies could be designed to better describe the CYP polymorphism-associated differences between patients in the rate of metabolism of individual SSRIs. These should overcome the limitations of the current literature addressing this issue: they should be adequately powered, address individual SSRIs, and account for diet and co-medications, particularly CYP inhibiting or inducing drugs.
Regarding point (d), there is a need to perform studies of CYP genotyping in a large variety of populations to ascertain the sensitivity and specificity of genotyping as applicable in real-world settings. It is essential that such studies explore a large range of the known polymorphisms functionally affecting each enzyme, rather than focusing solely on the detection of the major alleles relevant to Caucasians and African-Americans. To assess the performance of these tests reliably, sample sizes must be large enough to yield narrow confidence intervals, and the tests must repeatedly and consistently produce identical genotype calls.
Regarding points (e) and (f), multivariable pathway analysis studies underway may provide guidance regarding extent of variation in depression treatment response attributable to CYP enzymes, albeit this may reflect only a subset of patients treated with citalopram.109
Regarding points (e), (f), and (g), studies that could better ascertain the predictive value of CYP genotyping in depression treatment outcomes, and its impact on medical or personal decisionmaking, could be designed. The suggested study design would be a properly sized (likely to be large) randomized trial of CYP genotyping-guided treatment versus treatment as usual. Such a trial should be in keeping with design standards aimed at minimizing bias (e.g., using intent-to-treat analysis, blinding of physicians and patients), maximizing generalizability (e.g., representative of individuals with severe non-psychotic depression), and including meaningful outcomes (e.g., short-term treatment success, satisfaction, resource utilization). Such a study would provide answers about rates of dropouts/non-response in individuals who were genotyped versus those who were not. It would also provide data about treatment decisions by providers and patients, based on genotyping, and the outcome of such genotyping-guided treatment (e.g., higher starting doses in ultra-rapid metabolizers or lower doses in poor metabolizers) in comparison to the current practice of “trial and error.” It may also provide valuable information about harms.
Regarding point (h), studies that could better examine the importance to patients of potential outcomes, such as time to response or quality of life during the early treatment of depression, could be designed. A suggested study would be a utility or a “willingness-to-pay” model to determine value of these outcomes to patients.
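The remark under point (d) about sample size and confidence intervals can be illustrated with a Wilson score interval for an observed proportion such as analytic sensitivity. The observed proportion (95 percent) and the sample sizes below are hypothetical, and the choice of the Wilson method is ours; the point is only that the interval narrows as the sample grows.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z**2 / n
    center = (p + z**2 / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z**2 / (4.0 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical studies that each observe 95% sensitivity.
for successes, n in [(19, 20), (95, 100), (950, 1000)]:
    low, high = wilson_interval(successes, n)
    print(f"n={n:>4}: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```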
The second type of study would encompass multiple steps in the above rationale. In particular, recognizing that having evidence in favor of all of the steps in the rationale only supports, but does not prove, the thesis that adopting CYP450 testing will improve patient outcomes, various randomized trials could be considered that would test this linkage directly. The simplest study would involve linking a specific genotype to SSRI type and dose. This would provide a direct test of the rationale provided by the foundational studies described above (i.e., when clinicians treat in a way indicated by evidence, does it make a difference?). However, such a study would not be a direct test of the utility of genotyping in clinical practice if the utility of testing is highly patient-specific and not suitable to being described by an algorithm. In an alternative design, patients would be randomized to being genotyped, without mandating that treatment be based on the results. The most pragmatic, but also the most difficult, type of study would be a “practical clinical trial.”111 Rather than randomizing by patient, such a study would involve randomizing clusters (e.g., clinicians, practices, or regions) to have genotyping available (or perhaps reimbursed) or not. This would provide a test of the overarching question, “What difference does having genotyping available make in clinical practice?”
With pharmacogenetics and personalized medicine becoming everyday terms in medicine, answering questions about the utility of genotyping as it relates to clinical practice has become vital. The practice of medicine in general, and psychiatry in particular, involves many challenges, and as knowledge about the biological basis of diseases evolves, those diseases have to be redefined in the light of this new understanding; this redefinition, in turn, guides drug development for conditions such as depression. As we struggle to understand the different variables that influence response to antidepressant treatment, we need every answer that will take us closer to our goal of optimizing treatment for individual patients.
The evidence reviewed in this report demonstrates the high analytic sensitivity and specificity of tests for cytochrome P (CYP) genotyping, but for few of the known variants. The short list of papers addressing the key questions clearly demonstrates the lack of sufficient evidence for incorporation of any of these tests into guidelines for clinical practice. Moreover, the nature of most pharmacogenetic evidence is of rather low positive and negative predictive values, given the functional relevance of each variant and the genetic and biological context in which it is examined for each disease and drug scenario. As outlined in Chapter 5, there is a critical need to carry out research in ways that would help us answer as many questions as we can. We anticipate that the issue will not be one of safety, but rather one of decreasing morbidity and thereby improving quality of life in patients with non-psychotic depression. Considering the high prevalence of depressive disorders and the length of time required to determine whether a given antidepressant is successful or not, there may be a perceivable impact at the population level if even a small benefit can be demonstrated at the individual level.
Another reason for studying this question further is that, as newer treatments for depression become available, resolving the question of CYP genotyping may help us apply what is learned to those emerging treatments.
In conclusion, we recommend prospective studies of CYP450 genotyping in the treatment of non-psychotic depression with selective serotonin reuptake inhibitors (SSRIs) to examine the utility of such genotyping in clinical practice.
| ACCE | Analytic validity, Clinical validity, Clinical utility and associated Ethical, legal and social implications |
| AHRQ | Agency for Healthcare Research and Quality |
| ASA | Allele-specific amplification |
| AS-PCR | Allele-specific polymerase chain reaction |
| AUC | Area under the curve |
| CDC | Centers for Disease Control and Prevention |
| CGI | Clinical Global Impressions Scale |
| CI | Confidence interval |
| CLIA | Clinical Laboratory Improvement Amendments |
| Cmax | Maximum plasma concentration |
| CYP | Cytochrome P |
| DARE | Cochrane Database of Abstracts of Reviews of Effects |
| Del | Deletion (*5 allele) |
| DSM-IV | Diagnostic and Statistical Manual of Mental Disorders, 4th edition |
| Dup | Duplication (more than a single gene copy) |
| EGAPP | Evaluation of Genomic Applications in Practice and Prevention |
| EM(s) | Extensive metabolizer(s) |
| FDA | U.S. Food and Drug Administration |
| GI | Gastrointestinal |
| HAM-D | Hamilton Rating Scale for Depression |
| IM(s) | Intermediate metabolizer(s) |
| MADRS | Montgomery-Åsberg Depression Rating Scale |
| MDD | Major depressive disorder |
| MeSH | Medical Subject Headings |
| NA | Not applicable |
| NR | Not reported |
| PCR | Polymerase chain reaction |
| PCR-RFLP | Polymerase chain reaction and restriction fragment length polymorphism |
| PM(s) | Poor metabolizer(s) |
| QLESQ | Quality of Life Enjoyment and Satisfaction Questionnaire |
| RT-PCR | Real-time polymerase chain reaction |
| SC | Single gene copy |
| SF-36 | Medical Outcomes Study 36-Item Short Form Health Survey |
| SNRI(s) | Serotonin/norepinephrine reuptake inhibitor(s) |
| SSRI(s) | Selective serotonin reuptake inhibitor(s) |
| t1/2 | Terminal elimination half-life |
| TCA(s) | Tricyclic antidepressant(s) |
| UM(s) | Ultra-rapid metabolizer(s) |
| USPSTF | U.S. Preventive Services Task Force |
Database: Ovid MEDLINE(R) <1966 to November Week 3 2005> [last updated May Week 2 2006]
Search Strategy:
1. cytochrome p-450 enzyme system/ or aryl hydrocarbon hydroxylases/ or cytochrome p-450 cyp2d6/
2. (cyp2c19 or cyp2c9 or cyp2cd6 or cyp 2c19 or cyp 2c9 or cyp 2d6).mp.
3. amplichip.mp.
4. microarray analysis/ or oligonucleotide array sequence analysis/
5. or/1–4
6. serotonin uptake inhibitors/ or citalopram/ or fluoxetine/ or fluvoxamine/ or paroxetine/ or sertraline/
7. (escitalopram or citalopram or fluoxetine or fluvoxamine or paroxetine or sertraline).mp.
8. (celexa or lexapro or prozac or luvox or paxil or zoloft).mp.
9. or/6–8
10. 5 and 9
11. limit 10 to humans
12. limit 11 to english language
13. “Sensitivity and Specificity”/
14. “REPRODUCIBILITY OF RESULTS”/
15. 13 or 14
16. 5 and 15
17. limit 16 to humans
18. limit 17 to english language
19. 18 not 12
20. (3 or 4) and 15
21. limit 20 to humans
22. limit 21 to english language
23. (1 or 2) and (3 or 4)
24. 1 or (2 and 4) or 3
25. 24 and 15
26. limit 25 to humans
27. limit 26 to english language
28. 22 not 27
29. from 27 keep 1–219
30. cyp2d6.mp.
31. 30 and 9
32. 31 not 10
33. limit 32 to (humans and english language)
34. 30 and 15
35. 30 not 16
36. limit 35 to (humans and english language)
37. Reference Standards/
38. Quality Control/
39. Reference Values/
40. 30 or 5
41. or/37–39
42. 40 and 41
43. limit 42 to (humans and english language)
44. 33 or 36
45. from 44 keep 1–42
46. from 43 keep 1–481
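The way the numbered search statements combine can be read as ordinary set algebra: "or" is a union, "and" is an intersection, and "not" is a set difference over the records retrieved by earlier statements. The sketch below illustrates this for a few statements only. The record IDs are invented for illustration, and the "limit ... to humans" and "limit ... to english language" steps are deliberately ignored, so this is not a reproduction of the actual search results.

```python
# Hypothetical record IDs retrieved by three of the combined statements.
results = {
    5: {101, 102, 103, 104},   # statement 5: any CYP450 or microarray term (or/1-4)
    9: {103, 104, 105},        # statement 9: any SSRI term (or/6-8)
    15: {102, 104, 106},       # statement 15: sensitivity/specificity or reproducibility
}

# Statement 10: "5 and 9" keeps records that match both concepts.
results[10] = results[5] & results[9]

# Statement 16: "5 and 15" keeps CYP450 records that also address test performance.
results[16] = results[5] & results[15]

# Simplified analogue of statement 19 ("18 not 12"), with the limits omitted:
# test-performance records that were not already found by the SSRI search.
results[19] = results[16] - results[10]

print(sorted(results[10]))
print(sorted(results[16]))
print(sorted(results[19]))
```

Reading the strategy this way makes clear why later statements such as "18 not 12" exist: they isolate records that an earlier, already screened set did not capture, so the same citation is not reviewed twice.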
All excluded studies listed below were reviewed in their full-text version. Following each reference, in italics, is the reason for exclusion. Reasons for exclusion signify only the usefulness of the articles for this study and are not intended as criticisms of the articles.
The Duke Evidence-based Practice Center is grateful to the following peer reviewers who read and commented on a draft version of this report:
Shashi Amur, Ph.D., Senior Staff Fellow, Genomics Group, Office of Clinical Pharmacology and Biopharmaceutics, Center for Drug Evaluation and Research (CDER), U.S. Food and Drug Administration (FDA), Rockville, Maryland
Dan Blazer, M.D., Ph.D., JP Gibbons Professor, Department of Psychiatry, Duke University Medical Center, Durham, North Carolina
Linda Bradley, Ph.D., Geneticist, Office of Genomics and Disease Prevention, Centers for Disease Control and Prevention (CDC), Atlanta, Georgia
Wylie Burke, M.D., Ph.D., Professor and Chair, Department of Medical History and Ethics, University of Washington, Seattle, Washington
Stephen Crystal, Ph.D., Institute for Health, Health Care Policy and Aging Research, Research Professor, School of Social Work, Rutgers, State University of New Jersey, New Brunswick, New Jersey
Julia Kirchheiner, M.D., Professor of Clinical Pharmacology, Department of Pharmacology of Natural Products and Clinical Pharmacology, University of Ulm, Ulm, Germany
William Lawrence, M.D., M.S., Center for Outcomes and Evidence, Agency for Healthcare Research and Quality (AHRQ), Rockville, Maryland
Dennis J. O'Kane, Ph.D., FACB, Department of Lab Medicine and Pathology, Mayo Clinic, Rochester, Minnesota
Kathryn Phillips, Ph.D., Professor of Health Economics and Health Services Research, School of Pharmacy, Institute for Health Policy Studies and UCSF Comprehensive Cancer Center, University of California, San Francisco, San Francisco, California
Margaret Piper, Ph.D., M.P.H., Associate Director, Blue Cross/Blue Shield Association, Technology Evaluation Center, Atlanta, Georgia
Gurvaneet Randhawa, M.D., M.P.H., Center for Outcomes and Evidence, Agency for Healthcare Research and Quality (AHRQ), Rockville, Maryland
Matthew Rudorfer, M.D., Assistant Chief, Division of Services and Intervention Research, National Institute of Mental Health, Bethesda, Maryland
Stephen Stahl, M.D., Ph.D., Chair, Neuroscience Education Institute, Adjunct Professor of Psychiatry, University of California, San Diego, San Diego, California
Combined comments from the Evaluation of Genomic Applications in Practice and Prevention (EGAPP)/Centers for Disease Control and Prevention (CDC) Discussion Group
Comments from the Editorial Staff of the Agency for Healthcare Research and Quality (AHRQ), Rockville, Maryland
Comments from the National Institute of General Medical Sciences (NIGMS)/National Institutes of Health (NIH) Discussion Group
Nominations for peer reviewers were solicited from several sources, including the project's technical expert panel and interested federal agencies. The list of nominees was vetted and approved by the Agency for Healthcare Research and Quality (AHRQ).
Appendixes cited in this report are provided electronically at www.ahrq.gov/clinic/tp/cyp450tp.htm.
Available at: www.cdc.gov/genomics/gtesting/ACCE.htm. Accessed September 11, 2006.
Available at: http://www.cebm.net/levels_of_evidence.asp. Accessed September 11, 2006.