NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Marchionni L, Wilson RF, Marinopoulos SS, et al. Impact of Gene Expression Profiling Tests on Breast Cancer Outcomes. Rockville (MD): Agency for Healthcare Research and Quality (US); 2008 Jan. (Evidence Reports/Technology Assessments, No. 160.)


Impact of Gene Expression Profiling Tests on Breast Cancer Outcomes.



Breast Cancer

Breast cancer is the most commonly diagnosed cancer in women.1 This tumor is currently the second leading cause of cancer-related deaths in women in the U.S., with approximately 178,000 new cases and 40,000 deaths expected among U.S. women in 2007.1 Treatment for breast cancer usually involves surgery to remove the tumor and involved lymph nodes. Frequently, surgery is followed by radiation therapy (in cases of breast conservation or in women with large tumors or many involved lymph nodes), endocrine therapy (for essentially all women with tumors that are estrogen receptor (ER)-positive; see Appendix A for a list of acronyms), and/or chemotherapy (for women having a high risk for a poor outcome, such as those with large tumors, involved lymph nodes, advanced disease, or inflammatory breast cancer). Chemotherapy administered in addition to surgery is called “adjuvant” chemotherapy. More than three-quarters of all patients are expected to survive with this multi-modality approach.

One major challenge in breast cancer treatment relates to the decision about whether or not to use adjuvant chemotherapy. Although adjuvant chemotherapy can reduce the annual odds of recurrence and death for many women with breast cancer, especially those with ER-negative tumors,2 it has considerable adverse effects. Even though most women with early-stage breast cancer are advised to undergo chemotherapy, not all will benefit from it and some may remain free of disease recurrence at 10 years without it, especially those with small tumors and ER-positive disease. Decisionmaking protocols have been proposed with the intent of guiding clinicians involved in breast cancer treatment. Examples include the National Institutes of Health (NIH) Consensus Development criteria,3,4 the St. Gallen expert opinion criteria,5 the National Comprehensive Cancer Network (NCCN) guideline,6 and the computer-based algorithm Adjuvant! Online,7,8 which produces risk assessment and recommendations based on patient information, clinical data, tumor staging, and tumor characteristics (including age, menopausal status, comorbidity, tumor size, number of positive axillary nodes, and ER status). In addition, measurement of the human epidermal growth factor receptor 2 (HER-2) is now established as another predictive marker and has been incorporated into some of these indices,9 as it serves to identify candidates for adjuvant therapy with the monoclonal antibody trastuzumab (Herceptin®; Genentech, Inc., San Francisco, CA). Such patients may also be candidates for adjuvant treatment with other new agents such as the anti-HER-2 tyrosine kinase inhibitor lapatinib (Tykerb®; GSK, PA) and the anti-vascular endothelial growth factor (VEGF) antibody bevacizumab (Avastin®; Genentech), which are being studied in trials now in progress.
With the proliferation of treatment advances in breast cancer, treatment decisions have become more complex, thereby increasing the demand for tests and predictive models that could help identify those patients most likely to benefit from specific therapies.

Breast cancer is increasingly understood as a broad umbrella label, with various tumor subtypes exhibiting different prognoses and different responses to the various treatment options available for use in the adjuvant setting. Evidence from large randomized trials, and systematic reviews, forms the basis of the various treatment algorithms and nomograms described above. These tools help caregivers determine the risk of recurrence and death and the chances of benefiting from a specific therapy within a tumor subtype (e.g., anti-estrogens alone for ER-positive disease, trastuzumab for HER-2-positive disease). Unfortunately, the predictive utility of these tools for an individual patient within a specific tumor subset is quite limited, and a large number of patients with ER-positive disease or HER-2-positive disease still experience tumor recurrence and die from their disease despite having received adjuvant anti-estrogen therapy or trastuzumab, respectively. Therefore, there is great interest in developing, testing, and validating strong predictive markers that can be used in daily clinical practice to accurately identify those patients most likely to benefit from specific therapy options such as chemotherapy, endocrine therapy, and anti-HER-2 therapy, alone or in combination.

Gene Expression Profiling

Gene expression profiling (see Glossary, Appendix B) is an emerging technology for identifying genes whose activity may be helpful in assessing disease prognosis and guiding therapy. Gene expression profiling examines the composition of cellular messenger ribonucleic acid (mRNA) populations. The identity of the RNA transcripts (see Glossary, Appendix B) that make up these populations and the number of these transcripts in the cell provide information about the global activity of the genes that give rise to them. The number of mRNA transcripts derived from a given gene is a measure of the “expression” of that gene. Given that mRNA molecules are translated into proteins, changes in mRNA levels are ultimately related to changes in the protein composition of the cells, and consequently to changes in the properties and functions of tissues and cells in the body. However, only 2 percent of the genome (see Glossary, Appendix B) is translated into proteins, and little is known about how the expression of this 2 percent is controlled. The key intermediate is the transcriptome (see Glossary, Appendix B), which is made up of all the individual transcripts produced by the cell (see Figure 1).

Figure 1. Increasing complexity of information from genome to transcriptome and proteome: gene expression profiling focuses on the analysis of the transcriptome.



Investigators have developed approaches to gene expression analysis that have led to substantial advances in our understanding of basic biology. Gene expression profiling has been applied to numerous mammalian tissues, as well as plants, yeast, and bacteria.10–14 These studies have examined the effects of treating cells with chemicals and the consequences of overexpression of regulatory factors in transfected cells. Studies also have compared mutant strains with parental strains to delineate functional pathways. In cancer research, such investigation has been used to find gene expression changes in transformed cells and metastases, to identify diagnostic markers, and to classify tumors based on their gene expression profiles (see Glossary, Appendix B).15–18 The use of this approach for specific clinical problems, however, is relatively recent and poses several challenges related to the validity, reproducibility, and reliability required for use in diagnostic or predictive testing.

In recent years, gene expression profiling has been successfully used in breast cancer research. For instance, distinct subtypes of breast tumors (such as tumors expressing HER-2) have been identified as having distinctive gene expression profiles, representing diverse biologic entities associated with differences in clinical outcome.19–23 Other investigators24 have found gene expression signatures (see Glossary, Appendix B) associated with the ER and lymph node status of patients, thus identifying subgroups of patients with different clinical outcomes after therapy. From such studies, investigators have proposed a number of gene expression profiles that could be used to classify prognosis. In a case-control study from the Netherlands Cancer Institute (Amsterdam, the Netherlands), one such gene profile, consisting of 70 genes, was developed using archived frozen tissue from 78 young, node-negative women with breast cancer.21 In this study, tumors from patients who suffered rapid relapses after primary therapy had gene expression profiles that were quite distinct from those who remained disease-free. These gene expression profiles were then applied to a second validation set of 295 frozen tissue specimens collected from young women (including 61 patients from the previous cohort), yielding very similar results.25 Indeed, it appeared that this 70-gene profile more accurately predicted outcomes than did the traditional clinical criteria. Results from these preliminary studies further suggested that gene expression profiling may provide a powerful tool for estimating prognosis and the likelihood of benefit from selected therapeutic agents.

Breast Cancer Assays on the Market

Three breast cancer gene expression profiling-based assays are now available in the U.S. These assays investigate the expression of specific panels of genes by measuring their RNA levels in breast cancer specimens using one of two techniques, real-time reverse transcription-polymerase chain reaction (RT-PCR)26 or DNA microarrays27 (see Glossary, Appendix B):


The Oncotype DX™ Breast Cancer Assay (Genomic Health, Redwood City, CA) quantifies gene expression for 21 genes in breast cancer tissue by RT-PCR.28 This test is intended to predict the likelihood of recurrence in women of all ages with newly diagnosed Stage I or II breast cancer, lymph node-negative and ER-positive, who will be treated with tamoxifen, an anti-estrogen agent.


The MammaPrint® Test is based on microarray technology, uses the 70-gene expression profile developed by van't Veer and colleagues,21,25 and is marketed by Agendia (Amsterdam, the Netherlands). This is a prognostic test for women 61 years of age or younger with primary invasive breast cancer who are lymph node-negative and ER-positive or negative. The company voluntarily submitted this test to the U.S. Food and Drug Administration for approval under proposed new guidelines for such tests, and received such approval in February 2007. These guidelines were finalized in July 2007.


The Breast Cancer Profiling Test is based on the expression ratio of the two genes HOXB13 and IL17RB, and for this reason is also known as the H/I ratio test. The assay was developed by AviaraDX and licensed to Quest Diagnostics, Inc. (Lyndhurst, NJ). This assay is based on RT-PCR and is offered to treatment-naïve women with ER-positive, lymph node-negative breast cancer.

All three tests have defined protocols for evaluating the tumor content of the specimens to be analyzed, preparing the RNA samples, normalizing the raw expression measurements, and computing summary indices which are related to patient prognosis. The characteristics of the assays, the gene panels used, and the procedures involved in the analysis are summarized in Table 1. Detailed descriptions of the genes can be found in Appendix C. These differences between tests must be taken into account in the evaluation of the available evidence about such tests. In the following section, we provide a brief description of the technologies that are used. A more detailed description is presented in Appendix D.

Table 1. Description of the three gene expression profile assays.



RT-PCR is a molecular biology technique that combines reverse transcription with real-time PCR (see Glossary, Appendix B). This methodology allows the quantification of a defined RNA molecule. It is accomplished by reverse transcription of the specific RNA into its complementary DNA, followed by amplification of the resulting DNA using PCR. The quantification of the DNA produced after each round of amplification is accomplished by the use of fluorescent dyes that intercalate with double-stranded DNA, or by modified DNA oligonucleotide probes (see Glossary, Appendix B) that fluoresce when hybridized with complementary DNA.

During a PCR reaction, the relative ratios of template, product, and reagents change. At the beginning of the reaction, reagents are in excess, and template and product are present in low concentrations and do not compete with primer binding, so that the amplification proceeds at a constant, exponential rate. After this initial phase, the process enters a linear phase of amplification, and then in the late reaction cycles, the amplification reaches a plateau phase and no more product accumulates. To achieve accuracy and precision, it is necessary to collect quantitative data during the exponential phase of amplification, since in this phase the reaction is extremely reproducible. In real-time RT-PCR, this process is automated, and measurements are made at each cycle. Finally, several implementations of this technique allow multiple DNA species to be measured in the same sample (multiplex PCR), since fluorescent dyes with different emission spectra may be attached to the different probes. Multiplex PCR allows internal controls to be co-amplified with the target transcripts (see Glossary, Appendix B) and permits allele discrimination in single-tube, homogeneous assays (Figure 2).

Figure 2. Quantitative RT-PCR. Panel A: PCR reaction using sets of quenched primers and probes. Panel B: binding of fluorescent probe molecules to double-stranded DNA. Panel C: fluorescence intensity curves for different dyes and samples: on the x-axis, the number of PCR cycle is shown, and on the y-axis, the corresponding fluorescence detected is indicated; the dashed line is used to calculate the cycle threshold for each sample. Panel D: computation of the relative levels of expression.



This technique is extremely sensitive. The development of novel chemistries and instrumentation platforms has led to widespread adoption of real-time RT-PCR as the method of choice for quantifying absolute changes in gene expression. Moreover, this technique has become the preferred method for validating results obtained from microarray analyses and other techniques that evaluate gene expression changes on a global scale.
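The cycle-threshold idea described above (and outlined in Panels C and D of Figure 2) can be illustrated with a minimal Python sketch. It is not the procedure of any marketed assay. The function names, the idealized doubling curve, and the assumption of perfect amplification efficiency (exactly one doubling per cycle) are all illustrative assumptions introduced here.

```python
import math


def simulate_curve(start, cycles=40, plateau=1.0):
    """Idealized fluorescence after each cycle: doubling until a plateau.

    `start` is a hypothetical positive signal level for the initial template.
    """
    return [min(start * 2 ** n, plateau) for n in range(1, cycles + 1)]


def cycle_threshold(fluorescence, threshold):
    """Fractional cycle at which the signal first rises across `threshold`.

    fluorescence[i] is the (positive) signal measured after cycle i + 1.
    The crossing point is interpolated on a log scale, which is exact while
    amplification is exponential. Returns None if no upward crossing is found.
    """
    for i in range(1, len(fluorescence)):
        below, above = fluorescence[i - 1], fluorescence[i]
        if below < threshold <= above:
            frac = math.log(threshold / below) / math.log(above / below)
            return i + frac  # `below` was measured after cycle i
    return None


def relative_expression(ct_target, ct_reference):
    """Target level relative to a reference transcript.

    Assumes perfect doubling per cycle, so each cycle of difference in the
    threshold cycle corresponds to a two-fold difference in starting amount.
    """
    return 2.0 ** -(ct_target - ct_reference)
```

In this sketch a transcript that crosses the threshold three cycles later than the reference is estimated to be eight-fold less abundant. Real assays estimate amplification efficiency and use several reference genes rather than assuming ideal doubling.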


The analysis of gene expression by microarray technology is based on the Watson-Crick pairing of complementary nucleic acid molecules. In this technique, a collection of DNA sequences, called probes (see Glossary, Appendix B), is “arrayed” on a miniaturized solid support (microarray) and used to detect the concentration of the corresponding complementary RNA sequences, called targets (see Glossary, Appendix B), present in a sample of interest. The advancements made in attaching or synthesizing nucleic acid sequences to solid supports and robotics have allowed investigators to miniaturize the scale of the reactions, and it is now possible to assess the expression of thousands of different genes in a single reaction.29–31

In the basic microarray experiment, RNA harvested from the sample of interest is labeled with a fluorescent dye and hybridized to the microarray, then incubated in the presence of RNA from a different sample labeled with a different fluorescent dye. In this two-color experimental design, samples can be directly compared to one another or to a common reference RNA, and their relative expression levels can be quantified. After hybridization, gray-scale images corresponding to fluorescent signals are obtained by scanning the microarray with dedicated instruments, and the fluorescence intensity corresponding to each gene investigated is quantified by specific software. After normalization, the intensity of the hybridization signals can be compared to detect differential expression by using sophisticated computational and statistical techniques (Figure 3).

Figure 3. Schematic model for microarray hybridizations. Panel A: two-color scheme design. Panel B: single-color design.
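The comparison of normalized signals in a two-color design can be sketched in a few lines of Python. This is a simplified illustration only. It uses global median centering of log ratios, one of several possible normalization approaches, and it assumes that most probes are not differentially expressed. The function names are hypothetical and the sketch does not reproduce the pipeline of any specific platform.

```python
import math
from statistics import median


def log_ratios(sample, reference):
    """Per-probe log2(sample / reference) for paired positive intensities."""
    if len(sample) != len(reference):
        raise ValueError("sample and reference must have the same length")
    return [math.log2(s / r) for s, r in zip(sample, reference)]


def median_center(values):
    """Subtract the median so that the typical probe has a log ratio of 0.

    This removes a global offset between the two dye channels, under the
    assumption that most probes are unchanged between the two samples.
    """
    m = median(values)
    return [v - m for v in values]


def normalized_log_ratios(sample, reference):
    """Median-centered log2 ratios for one two-color hybridization."""
    return median_center(log_ratios(sample, reference))
```

After centering, a value of 1 indicates a probe whose signal is roughly two-fold higher in the sample than in the reference, relative to the typical probe on the array.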



Sources of Variability in Gene Expression Analysis

Gene expression analysis poses several general challenges that can affect the reproducibility and reliability of the measurements obtained. The control of such sources of variability is clearly a concern when such technologies are used to make decisions about the clinical management of patients. Given the complexity of the procedures used in this type of investigation, the sources of uncertainty are multiple, from the preparation of tissue specimens to the computational analysis used to quantify expression levels.

The first source of variability relates to the various types of specimens that can be used to prepare the RNA to be used in gene expression analysis, including tissue specimens obtained in vivo. In this case, the resulting RNA template will be a mixture of the RNA content of all the cells contained in the specimen, and the relative content of the different cell populations (malignant vs. normal) present in the specimen processed is a major source of variability in gene expression. For this reason, special care must be taken when tumors are sampled for gene expression analysis. In general, macro- or micro-dissection of the samples is performed to ensure that the specimens contain a sufficient percentage of cancer cells.
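The effect of specimen composition can be illustrated with a deliberately simple linear mixing model, in which the measured level of a transcript is the average of its levels in malignant and normal cells, weighted by the fraction of each cell type. The model and the function names below are illustrative assumptions, not a description of how any assay corrects for tumor content.

```python
def mixed_expression(tumor_level, normal_level, tumor_fraction):
    """Expected measured level in a specimen containing both cell types.

    Assumes a linear mixture: each cell population contributes RNA in
    proportion to its share of the specimen (a simplifying assumption).
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    return tumor_fraction * tumor_level + (1.0 - tumor_fraction) * normal_level


def observed_fold_change(tumor_level, normal_level, tumor_fraction):
    """Apparent tumor-versus-normal fold change measured in the mixed specimen."""
    return mixed_expression(tumor_level, normal_level, tumor_fraction) / normal_level
```

Under this model, a transcript that is truly eight-fold overexpressed in malignant cells appears less overexpressed as the tumor fraction of the specimen falls, which is why the tumor content of a specimen matters for the resulting measurement.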

A second major source of variability is related to the protocols used to prepare the specimens, since several alternatives have been used in the field, including the use of formalin-fixed, paraffin-embedded (FFPE) tumor specimens or laser-captured, micro-dissected (see Glossary, Appendix B) specimens and fresh or snap-frozen samples. Other factors likely to affect RNA quality include storage time and the particular reagents and reagent batches used. Unlike DNA, RNA is very unstable. The degradation of RNA can be triggered by pH changes as well as by specific enzymes called ribonucleases (see Glossary, Appendix B) that are present in cells and that can remain active in the RNA preparation if the RNA isolation is not properly carried out.

Watson-Crick hybridization of complementary nucleic acid moieties is the fundamental principle that forms the basis of any gene expression analysis. For this reason, sequence selection and gene annotation (see Glossary, Appendix B) are among the most relevant factors that can contribute to variability in the analysis of gene expression.

As in any other laboratory investigation, the use of different platforms (see Glossary, Appendix B), protocols, and reagents can also affect the variability of the obtained measurements, and thus the reproducibility within and across laboratories. Indeed, numerous platforms exist to perform both RT-PCR and microarray-based gene expression analyses. Moreover, within each technique, the same procedure can be performed using different instruments, each with its own different operational characteristics and performance.

Finally, since gene expression measures are virtually never used as raw output but rather undergo sequential steps of mathematical transformation, another source of variability is data pre-processing and analysis. Moreover, the levels of gene expression can be further processed and combined according to complex algorithms to obtain composite summary measurements that are associated with the phenotypes investigated.
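The sequence of transformations described above (normalization of raw measurements, followed by combination into a composite summary measure) can be sketched generically in Python. The gene names, weights, and cutoffs are hypothetical placeholders supplied by the caller. None of them is taken from any of the three assays, whose actual algorithms are summarized in Table 1.

```python
from bisect import bisect_right


def reference_normalize(expression, reference_genes):
    """Express each target gene relative to the mean of the reference genes.

    `expression` maps gene name to a log-scale measurement; the reference
    genes themselves are dropped from the result.
    """
    ref_mean = sum(expression[g] for g in reference_genes) / len(reference_genes)
    return {
        gene: value - ref_mean
        for gene, value in expression.items()
        if gene not in reference_genes
    }


def composite_score(normalized, weights):
    """Weighted sum of normalized expression levels (hypothetical weights)."""
    return sum(weight * normalized[gene] for gene, weight in weights.items())


def categorize(score, cutoffs, labels):
    """Map a score to a label; `cutoffs` must be sorted in ascending order.

    A score equal to a cutoff falls into the higher category (an arbitrary
    convention chosen for this sketch).
    """
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("need exactly one more label than cutoffs")
    return labels[bisect_right(cutoffs, score)]
```

Because every step (choice of reference genes, weights, cutoffs, and boundary conventions) changes the final category, each is a potential source of variability between implementations of the same signature.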

International standards have been developed to address the quality of microarray-based gene expression analysis, focusing on documentation of experimental design, details, and results (see MIAME in Glossary, Appendix B).32 Several publications also have addressed the levels of reproducibility across platforms and laboratories.33,34 Such efforts emphasize the importance of trying to control the many described sources of variability in gene expression analysis and of ensuring that the information derived from such analyses is specific and does not represent accidental associations.

Objectives of the Evidence Report

The overall purpose of this evidence report is to review and synthesize the available evidence concerning the analytic and clinical validity of breast cancer gene expression profiling in predicting disease recurrence, as well as its efficacy and effectiveness in improving chemotherapy choices and subsequent outcomes (clinical utility) in women newly diagnosed with early-stage breast cancer. The report was prepared by the Evidence-based Practice Center (EPC) at the Johns Hopkins University (JHU) Bloomberg School of Public Health in response to a task order issued by the Agency for Healthcare Research and Quality (AHRQ) on behalf of the Centers for Disease Control and Prevention (CDC) Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Project. The key questions we were charged with addressing in this evidence report were:


What is the direct evidence that gene expression profiling tests in women diagnosed with breast cancer (or any specific subset of this population) lead to improvement in outcomes?


What are the sources of and contributions to analytic validity in these gene expression-based prognostic estimators for women diagnosed with breast cancer?


What is the clinical validity of these tests in women diagnosed with breast cancer?


How well does this testing predict recurrence rates for breast cancer when compared to standard prognostic approaches? Specifically, how much do these tests add to currently known factors or combination indices that predict the probability of breast cancer recurrence (e.g., tumor type or stage, ER and HER-2 status)?


Are there any other factors, which may not be components of standard predictors of recurrence (e.g., race/ethnicity or adjuvant therapy), that affect the clinical validity of these tests and thereby the generalizability of the results to different populations?


What is the clinical utility of these tests?


To what degree do the results of these tests predict the response to chemotherapy, and what factors affect the generalizability of that prediction?


What are the effects of using these two tests and the subsequent management options on the following outcomes: testing- or treatment-related psychological harms, testing- or treatment-related physical harms, disease recurrence, mortality, utilization of adjuvant therapy, and medical costs?


What is known about the utilization of gene expression profiling in women diagnosed with breast cancer in the United States?


What projections have been made in published analyses about the cost-effectiveness of using gene expression profiling in women diagnosed with breast cancer?

This task is of particular relevance, since the National Cancer Institute (NCI) recently announced its sponsorship of a clinical trial to be conducted by The North American Breast Cancer Intergroup (TBCI) assessing individualized options for breast cancer treatment: the Trial Assigning Individualized Options for Treatment (TAILORx). In this trial, tumors of patients with ER-positive and lymph node-negative breast cancer (and who will be treated with tamoxifen) will be tested using the Oncotype DX assay, and patients will be divided into groups according to the recurrence scores derived from the use of the assay. Patients showing low recurrence scores will receive endocrine therapy alone, while patients with high recurrence scores will receive endocrine therapy and adjuvant chemotherapy. Patients with mid-range scores will receive endocrine therapy and be randomly assigned to chemotherapy or no chemotherapy. This trial is designed to evaluate the treatment implications of Oncotype DX results in a large representative patient population, focusing primarily on patients with intermediate recurrence scores. The trial will also allow for generation of new data on patients with recurrence scores near the ends of the spectrum. Patients at the low end of the recurrence score spectrum will be compared to a pre-specified target of 95 percent recurrence-free survival. It should be noted that the cutoff values used in the TAILORx trial are different than those delineated in other studies of Oncotype DX. The results of the TAILORx trial will not be available for some time (around 2013) and with growing interest in and use of these tests (particularly Oncotype DX) in the oncology community, this evidence review could have an impact on clinical practice in the interim.35
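The assignment logic of the trial as described above can be summarized in a short Python sketch. The recurrence score cutoffs are left as caller-supplied parameters because their values are not stated in this section. The treatment of scores that fall exactly on a cutoff, the simple equal-probability randomization, and the function name are assumptions made for illustration and do not represent the trial protocol.

```python
import random

ENDOCRINE_ALONE = "endocrine therapy alone"
ENDOCRINE_PLUS_CHEMO = "endocrine therapy plus chemotherapy"


def assign_arm(recurrence_score, low_cutoff, high_cutoff, rng=random):
    """Assign treatment from a recurrence score, following the design above.

    Scores below `low_cutoff` receive endocrine therapy alone, scores above
    `high_cutoff` also receive chemotherapy, and mid-range scores are
    randomized between the two. Scores equal to a cutoff are treated as
    mid-range here, which is an assumption of this sketch.
    """
    if low_cutoff > high_cutoff:
        raise ValueError("low_cutoff must not exceed high_cutoff")
    if recurrence_score < low_cutoff:
        return ENDOCRINE_ALONE
    if recurrence_score > high_cutoff:
        return ENDOCRINE_PLUS_CHEMO
    return rng.choice([ENDOCRINE_ALONE, ENDOCRINE_PLUS_CHEMO])
```

Passing a seeded `random.Random` instance as `rng` makes the mid-range assignments reproducible, which is convenient when experimenting with the sketch.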

A separate trial (MINDACT, or Microarray in Node-negative Disease may Avoid ChemoTherapy) has recently been activated by TRANSBIG (Translating molecular knowledge into early breast cancer management: building on the Breast International Group (BIG)), a research network of 39 institutions in 21 countries. The trial will compare two different ways of assessing the risk of cancer recurrence and making therapeutic decisions: a “traditional method” using Adjuvant! Online versus the MammaPrint assay. The rationale for this study is that many women who actually have “low risk” tumors are currently classified as “average” or “high risk” and are therefore advised to receive adjuvant chemotherapy that ultimately may be of no benefit. The investigators estimate that 12–20 percent of women with early-stage breast cancer fall into this category.36

Structured Approach to Assessment of the Questions

The EPC team used a structured approach to assess the evidence regarding the key questions listed above. The structured approach was based on the following questions:


What was tested? One fundamental concept is the distinction between the investigated gene expression signatures (see Glossary, Appendix B) and the actual gene expression-based tests. The gene “signature” is the collection of genes whose expression levels are measured in a given test, together with the algorithm that combines those levels into a prognostic index, akin to a test's “recipe.” But just as a recipe can be implemented in subtly different ways with different results, this signature can be measured using a variety of technologies and procedures which may not be identical to those used in the actual marketed test being offered to patients. This distinction is important because clinicians' decisions, patients' choices, and the resulting benefits and harms will ultimately depend on the performance of marketed tests rather than on the more general gene expression signatures, although they typically track closely. Information about the signatures is highly relevant to the assessment of the marketed test, but is not identical.


What population was tested? This question required consideration of whether the study involved a representative sample of patients, from a clinical series or from a clinical trial subject to detailed eligibility criteria. This also required consideration of whether the population was clinically homogeneous enough for the implications of risk prediction to be clear and similar for every member of the study population (or for each subgroup). For example, predicting the relapse of patients on tamoxifen therapy may be different than predicting outcomes for untreated patients. The latter tests “intrinsic tumor aggressiveness,” which may not be the same as the factors that determine resistance to tamoxifen.


Was the study a developmental or validation study? Developmental studies were defined as the original reports in which new gene expression signatures were first described or in which previously developed gene expression signatures were first proposed to have a use different from the original use (e.g., the use on different subsets of patients with different purposes). Validation studies were defined as those that confirmed results in independent populations (with approximately the same characteristics as the population of the corresponding development study). If a developmental study, were appropriate statistical methods used to adjust for multiplicities, and was internal validation done? If a validation study, were all the test procedures, cutoffs, definitions, and measurements predefined?


Is it clear, from a clinical decisionmaking perspective, what is the incremental value of the test over and above standardized clinical predictors? It was not sufficient to simply insert clinical predictors into regression equations since this does not properly quantify the numerical consequences of decisions made with and without the new test.


Were the ways in which the tests had been evaluated optimal for clinical decisionmaking? This question required consideration of the choice of cutoffs, definition of categories, and combinations (or lack thereof) with other predictors.


What was the strength of the study design used to estimate clinical utility? Randomized controlled trials in which all samples were taken concurrently, including trials that took place in the past, provide the strongest evidence of utility.


For studies of clinical utilization, what specific information was provided to patients and their physicians? Such studies are informative only if they are specific about the information that was given and how it informed decisionmaking.
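The question of incremental value raised above concerns how often a new test would change the risk category, and hence potentially the decision, that standard clinical predictors alone would give. One simple way to tabulate this is a cross-classification of the two sets of categories, sketched below in Python. The function names and category labels are hypothetical, and this is an illustration of the general idea rather than the analysis used in any study reviewed in this report.

```python
from collections import Counter


def reclassification_table(clinical_categories, test_categories):
    """Count patients for each (clinical category, test category) pair."""
    if len(clinical_categories) != len(test_categories):
        raise ValueError("both inputs must describe the same patients")
    return Counter(zip(clinical_categories, test_categories))


def fraction_reclassified(clinical_categories, test_categories):
    """Share of patients whose category differs between the two approaches."""
    table = reclassification_table(clinical_categories, test_categories)
    total = sum(table.values())
    if total == 0:
        raise ValueError("no patients supplied")
    changed = sum(n for (clin, test), n in table.items() if clin != test)
    return changed / total
```

A table of this kind shows only how many classifications change. Judging whether the changes are improvements additionally requires the observed outcomes within each cell.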

Using this structured approach, the EPC team evaluated the evidence regarding the key questions of analytic validity, clinical validity, and clinical utility of each test, evaluated separately. The EPC team then used the review of the evidence to formulate both test-specific and general conclusions.



Appendixes cited in this report are provided electronically at: http://www​​.htm

