NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Marchionni L, Wilson RF, Marinopoulos SS, et al. Impact of Gene Expression Profiling Tests on Breast Cancer Outcomes. Rockville (MD): Agency for Healthcare Research and Quality (US); 2008 Jan. (Evidence Reports/Technology Assessments, No. 160.)

This publication is provided for historical reference only and the information may be out of date.



The CDC submitted a request for an evidence report on the “Impact of Gene Expression Profiling Tests on Breast Cancer Outcomes” to the AHRQ on behalf of the EGAPP. This evidence report will be used to inform the CDC's Working Group as part of their work in formulating evidence-based recommendations. Our project consisted of recruiting technical experts, formulating and refining the specific questions, performing a comprehensive literature search, summarizing the state of the literature, constructing evidence tables, and submitting the evidence report for peer review.

Recruitment of Technical Experts and Peer Reviewers

At the beginning of the project, we assembled a core team of experts from JHU who had strong expertise in medical oncology, clinical trials, and biostatistics as well as a special interest in gene expression profiling tests. We also recruited external technical experts from diverse professional backgrounds, including academic, clinical, and corporate settings. The core team asked the technical experts and members of the EGAPP working group to give input regarding key steps of the process, including the selection and refinement of the questions to be examined. Peer reviewers were recruited from professional societies with an interest in breast cancer and gene expression profiling tests. Representatives from Agendia (MammaPrint®), Genomic Health, Inc. (Oncotype DX™), and Quest Diagnostics, Inc.® (BCP or H/I ratio) were also asked to review the report (see Appendix Ea).

Key Questions

The core team worked with the technical experts and representatives of the EGAPP and AHRQ to develop the Key Questions that are presented in the Specific Aims section of Chapter 1 (Introduction). The Key Questions apply to any gene expression profiling test, but they were focused primarily on two tests, Oncotype DX and MammaPrint, because these were the tests expected to be commercially available in 2007. During the course of this review, a third gene expression profiling test, the Breast Cancer Profiling (BCP, or H/I ratio) Test (AviaraDX through Quest Diagnostics, Inc.), came to our attention. Although the BCP test was not included in our initial consideration of the Key Questions, we added studies regarding this test as an example of the types of gene expression profiling tests that are likely to become available in the coming years.

Literature Search Methods

Searching the literature involved identifying reference sources, formulating a search strategy for each source, and executing and documenting each search. For the electronic database searches, we used medical subject heading (MeSH) terms relevant to breast cancer and gene expression profiling. To minimize the risk of bias in selecting articles, we used a systematic approach with explicitly defined eligibility criteria for inclusion in the review; this approach was also intended to help identify gaps in the published literature.

This strategy was used to identify all the relevant literature that applied to our Key Questions. The team specifically looked for articles that would provide information about the gene expression profiling tests identified in the Key Questions. We also looked for eligible studies by reviewing the references in eligible studies and pertinent reviews, by querying our experts, by contacting the manufacturers of the two tests, and by reviewing abstracts from relevant professional conferences.


Our comprehensive search plan included electronic and hand searching. On January 9, 2007, we ran searches of the MEDLINE® and EMBASE® databases, and on February 7, 2007, we searched CINAHL® and the Cochrane databases, including Cochrane Reviews and The Cochrane Central Register of Controlled Trials (CENTRAL). All searches were limited to articles published in 1990 or later. This cut-off was based on the introduction dates of the MeSH headings "gene expression profiling" (2000) and "gene expression" (1990); test searches of earlier dates also returned limited and irrelevant results.

“Gray” literature was searched following a protocol that was reviewed and approved by EGAPP and the technical expert panel:


Conference abstracts were reviewed using the same criteria as journal articles but were included only if we felt we had a sufficient understanding of the underlying study and the reported data were critical enough to merit inclusion.


Web sites for the gene profiling tests included in this review, Agendia (MammaPrint®) and Genomic Health (Oncotype DX™), were searched for additional information not available in the peer-reviewed literature.


Agendia and Genomic Health, Inc. were contacted directly with requests for the following information:


A listing of articles that applied to the analytic validity or clinical utility of the gene profiling test,


Marketing materials on the gene profiling test, and


Any pertinent unpublished data.


We searched the Web site of the Food and Drug Administration (FDA) Center for Devices and Radiological Health for additional publicly available, unpublished information.37–39


A request was sent to the Center for Medical Technology Policy (CMTP) Gene Expression Profiling for Early Stage Breast Cancer Work Group to provide all background materials available on our study topic.

Search Terms and Strategies

Search strategies specific to each database were designed to enable the team to focus available resources on articles most likely to be relevant to the Key Questions. We developed a core strategy for MEDLINE, accessed via PubMed, based on an analysis of the MeSH terms and text words of key articles identified a priori. The PubMed strategy formed the basis for the strategies developed for the other electronic databases (see Appendix F).
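As an illustration of how such a core strategy combines MeSH headings and text words with boolean operators (the actual strategy used in the report appears in Appendix F), the sketch below assembles a PubMed-style query string; the specific terms and field tags shown are hypothetical examples, not the report's strategy.

```python
# Hypothetical sketch: assemble a PubMed-style boolean query from
# MeSH headings and free-text words. The report's actual strategy
# is given in Appendix F; this is illustrative only.
def build_pubmed_query(mesh_terms, text_words, date_from="1990"):
    """Combine MeSH headings and text words into one boolean query
    restricted to publications from date_from onward."""
    mesh = " OR ".join(f'"{t}"[MeSH]' for t in mesh_terms)
    words = " OR ".join(f'"{w}"[Title/Abstract]' for w in text_words)
    return f'({mesh} OR {words}) AND ("{date_from}"[PDAT] : "3000"[PDAT])'

query = build_pubmed_query(
    ["gene expression profiling", "breast neoplasms"],
    ["Oncotype", "MammaPrint"],
)
```

A query built this way can be pasted into the PubMed search box or reused, with adjusted field tags, as the basis for the other databases' strategies.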

Organization and Tracking of the Literature Search

The results of the searches were downloaded into ProCite® version 5.0.3 (ISI ResearchSoft, Carlsbad, CA). Duplicate articles retrieved from the multiple databases were removed prior to initiating the review. We then reviewed the citations by scanning the titles, abstracts, and the full articles as described below (Figure 4).
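The duplicate-removal step can be sketched as follows, assuming each citation record carries a title and author list and that duplicates are matched on a normalized (title, first author) key; ProCite's actual matching rules may differ.

```python
# Minimal sketch of removing duplicate citations retrieved from
# multiple databases. Matching on normalized title + first author
# is an assumption for illustration, not ProCite's algorithm.
def dedupe_citations(records):
    """Keep the first occurrence of each citation, matched on a
    case-insensitive (title, first author) key."""
    seen, unique = set(), []
    for rec in records:
        key = (rec["title"].strip().lower(),
               rec["authors"][0].strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```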

Figure 4. Summary of literature search and review process (number of articles).

Title Review

To efficiently identify citations that were obviously not relevant, paired reviewers first independently scanned the article titles. For a title to be eliminated at this level, both reviewers had to indicate that it was clearly ineligible (see Appendix G, Title Review Form).

Abstract Review

Inclusion and Exclusion Criteria

The abstract review phase was designed to identify articles that reported on the analytic validity, clinical validity, and/or clinical utility of the gene expression profiling tests of interest. Abstracts were reviewed independently by two investigators and were excluded only if both investigators agreed that the article met one of the following exclusion criteria:


The study applied only to breast cancer biology;


The study did not involve Oncotype DX or MammaPrint;


The study did not involve original data or original data analysis;


The study did not involve women;


The study did not involve breast cancer patients;


The study was not in the English language; or


The study did not apply to the key questions.

We excluded letters to the editor and editorials when they did not present original data (for letters, original data usually appeared in electronic supplements). When a letter or editorial did cite some original data, the data generally were not sufficiently original for consideration in this report. As mentioned earlier, the initial scope of this project did not include the H/I ratio test, and thus this test was not identified on the abstract review form (Appendix G, Abstract Review Form).

Abstracts were promoted to the article review level if both reviewers agreed that the abstract could apply to one or more of the key questions. Differences of opinion regarding abstract eligibility were resolved through consensus adjudication.
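The dual-review decision rules described above (eliminate a title only when both reviewers agree it is clearly ineligible; promote an abstract only when both reviewers agree it could apply to a Key Question, with disagreements resolved by consensus adjudication) can be sketched as:

```python
# Sketch of the paired-reviewer decision rules; function and
# label names are illustrative, not taken from the report.
def title_decision(r1_ineligible, r2_ineligible):
    """A title is eliminated only if BOTH reviewers mark it
    clearly ineligible; otherwise it advances to abstract review."""
    return "exclude" if (r1_ineligible and r2_ineligible) else "keep"

def abstract_decision(r1_applies, r2_applies):
    """An abstract is promoted only if both reviewers agree it could
    apply to a Key Question; disagreement triggers adjudication."""
    if r1_applies and r2_applies:
        return "promote"
    if r1_applies != r2_applies:
        return "adjudicate"
    return "exclude"
```

Note the asymmetry: at the title level a single "possibly eligible" vote keeps a citation in, which biases the screen toward inclusion.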

Article Inclusion/Exclusion

Full articles selected for review during the abstract review phase underwent another independent review by paired investigators to determine whether they should be included in the full data abstraction. At this phase of review, investigators determined which of the Key Questions each article addressed (see Appendix G, Article Inclusion/Exclusion Form). If articles were deemed to have applicable information, they were included in the final data abstraction. Differences of opinion regarding article eligibility were resolved through consensus adjudication. A list of articles excluded at this level is included in Appendix H.

Data Abstraction

The purpose of the article review was to confirm the relevance of each article to the research questions and to collect evidence that addressed the questions. Articles eligible for full review had to address one or more of the Key Questions. Because of the heterogeneous nature of the applicable literature, we used a loosely structured approach for extracting data from the studies. Reviewers were given a standard matrix in which to enter data from each article (Appendix G, Data abstraction tables).

For all the data abstracted from the studies, we used a sequential review process. In this process, the primary reviewer completed all data abstraction forms. The second reviewer checked the first reviewer's data abstraction forms for completeness and accuracy. Reviewer pairs were formed to include personnel with both clinical and methodological expertise. Reviewers were not masked to the articles' authors, institutions, or journal.40 In most instances, data were directly abstracted from the article. If possible, relevant data were also abstracted from the figures. A number of articles provided links to supplemental data, and these resources were used during the data abstraction process. Differences of opinion were resolved through consensus adjudication.

For all articles, reviewers extracted information on general study characteristics, such as study design, study participants, and sample size (see Appendix G, Data abstraction tables). Data abstracted regarding participants' characteristics were: information on intervention arms, age, menopausal status, race, diagnoses, methods of diagnosis, exclusion criteria, treatments, and treatment outcomes.

An analytic validity (Key Question 2) data abstraction matrix was developed by the team (see Appendix G, Data abstraction tables). Our data abstraction was designed to capture data in the following general areas: tumor specimen processing validity; annotation validity; within- and across-laboratory validity; and validity associated with gene expression data preprocessing and analysis.

Studies addressing clinical validity (Key Questions 3a and 3b) and clinical utility (Key Questions 4a, 4b, and 4c) were approached in a similar manner (see Appendix G, Data abstraction tables). The free-form tables developed for these questions were designed to capture details regarding a study's context, the methods used to analyze the data collected, the results of the study, and the conclusions drawn by the study authors.

Only three articles addressed the cost-effectiveness of the gene expression profiling tests. Therefore, the reviewers did not use standardized data abstraction forms for these studies; instead, they extracted information directly into the table presented as Evidence Table 5. Please refer to Philips, 2004,41 for a detailed explanation of why these domains and their sub-domains are important.

Quality Assessment

We used a synthesis of the general principles of the REporting recommendations for tumour MARKer prognostic studies (REMARK)42 and Standards for Reporting of Diagnostic Accuracy (STARD)43 guidelines. The REMARK guidelines were developed to encourage transparent and relevant reporting of study design, preplanned hypotheses, patient and specimen characteristics, assay methods, and statistical analysis methods, in order to help others judge the usefulness of the data presented.42 STARD was developed to improve the accuracy and completeness of studies reporting diagnostic accuracy, in order to allow readers to assess the potential for bias in a study and to evaluate the generalizability of the results43 (Appendix G, Quality Assessment Matrix).

Because of the extreme variability of the articles included in this report, we did not systematically apply the general principles to them. The strengths and weaknesses of each study were also dependent on the question(s) to which it applied. These strengths and weaknesses are highlighted in the Results section and the Discussion.

The EPC team appraised economic analyses using published guidelines for good practice in decision-analytic modeling in health technology assessment (Philips, 2004).41 The appraisal took into consideration the domains of structure, data, and consistency (see Evidence Table 5 for details).

Data Synthesis

We created a set of detailed evidence tables containing all the information extracted from eligible studies and stratified the tables according to the gene expression profile test. The investigators reviewed the tables and eliminated items that were rarely reported. They then used the resulting versions of the evidence tables to prepare the text of the report and selected summary tables.

Data Entry and Quality Control

Initial data were abstracted by the investigators and entered directly into the data abstraction tables. Second reviewers were generally more experienced members of the research team, and one of their main priorities was to check the quality and consistency of the first reviewers' answers. A senior investigator also examined all reviews to identify problems with the data abstraction; any problems identified were discussed at a meeting with the reviewers. In addition, research assistants used a system of random data checks to assure data abstraction accuracy.
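A random data check of the kind described above can be sketched as drawing a random sample of completed abstraction forms for re-verification; the sampling fraction and seed below are illustrative assumptions, not parameters stated in the report.

```python
import random

def sample_for_audit(abstraction_forms, fraction=0.1, seed=0):
    """Randomly select a fraction of completed data abstraction
    forms for accuracy checking. The 10% fraction and fixed seed
    are hypothetical; the report does not specify either."""
    rng = random.Random(seed)
    n = max(1, round(len(abstraction_forms) * fraction))
    return rng.sample(abstraction_forms, n)
```

Fixing the seed makes the audit sample reproducible, which is useful when the same sample must be re-examined after corrections.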

Grading of the Evidence

After reviewing the available evidence on the Key Questions, the core team concluded that it would be inappropriate to grade the overall body of evidence using any of the published schemes for grading evidence. None of the grading schemes fit the nature of the data in these studies about gene expression profiling tests. The team therefore decided that it was more appropriate to focus on the specific strengths and weaknesses of the studies on each Key Question.

Peer Review

Throughout the project, the core team sought feedback from the external technical experts and the EGAPP Working Group through ad hoc and formal requests for guidance. A draft of the report was sent to the technical experts and peer reviewers, as well as to representatives of AHRQ, the CDC, the NIH, and the FDA. In response to the comments from the technical experts and peer reviewers, we revised the evidence report and prepared a summary of the comments and their disposition that was submitted to the AHRQ.

