NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Seidenfeld J, Samson DJ, Rothenberg BM, et al. HER2 Testing to Manage Patients With Breast Cancer or Other Solid Tumors. Rockville (MD): Agency for Healthcare Research and Quality (US); 2008 Nov. (Evidence Reports/Technology Assessments, No. 172.)

  • This publication is provided for historical reference only and the information may be out of date.

HER2 Testing to Manage Patients With Breast Cancer or Other Solid Tumors.

2. Methods

This report reviews and synthesizes available evidence on outcomes of using HER2 test results to manage patients with breast cancer or other solid tumors. Five Key Questions are addressed (see “Introduction”). After extensive consideration, we concluded that because myriad technical, biologic, and performance factors influence HER2 diagnostic performance, these variables could not be adequately captured in a systematic review. Thus, Key Question 1 will be addressed by a narrative review and Key Questions 2 through 5 will be addressed by systematic review.

This chapter describes the search strategies used to identify literature; criteria and methods used for selecting eligible articles; methods for data abstraction; methods for quality assessment; and, finally, the process for technical expert advice and peer review.

The methods of this review are generally applicable to all Key Questions except Key Question 1. However, as noted, there were variations in specific aspects of the methods as necessary to satisfy requirements of each question.

Peer Review

A technical expert panel provided consultation for the systematic review and reviewed the draft report. The draft report was also reviewed by 12 external reviewers, including invited clinical experts and stakeholders (Appendix D *). Revisions were made to the draft report based on reviewers' comments.

Study Selection Criteria

Types of Participants

For Key Questions 1–4, populations of interest are patients with breast cancer, with separate analyses for early stage patients receiving adjuvant therapy and those undergoing treatment for metastatic disease.

For Key Question 5, populations of interest are patients with cancers of the lung, ovary, prostate, and head and neck.

Types of Outcomes

In general, outcomes should be standard, valid, reliable, and clinically meaningful. Two types of outcomes are relevant to Key Question 1:

  • Diagnostic accuracy (e.g., analytic sensitivity, specificity, and reliability); and
  • Concordance between assay methods.

Multiple levels of outcomes will be addressed for Key Questions 2 through 5:

  • Lead time for detection of progression, recurrence or metastasis.
  • Patient management decisions, which may be altered by test results.
  • Primary (health) outcomes, which may be affected through management changes guided by test results, such as:
    • Duration of survival, disease-free survival, progression-free survival, and/or time to failure or progression.
    • Quality of life.
    • Palliation of measurable symptoms.
    • Treatment-related adverse effects.
  • Secondary (intermediate) outcomes include:
    • Objective clinical response rates (complete and partial responses; separately and summed).
    • Pathologic complete response rates in patients undergoing neoadjuvant therapy followed by surgery.
    • Response durations.

Health outcomes will be given greatest emphasis. However, it will likely be necessary to construct causal pathways to connect assay results to health outcomes through patient management decisions.

Types of Interventions

The interventions of interest for Key Questions 1, 2, 3, and 5 are tissue assays to evaluate tumor HER2 status by:

  • Immunohistochemistry;
  • Fluorescence in-situ hybridization;
  • Chromogenic in-situ hybridization;
  • Polymerase chain reaction; or
  • Other methods.

The interventions of interest for Key Question 4, and also of interest for parts of Key Question 5, are assays to measure serum concentration of the HER2 extracellular domain.

Practice Settings

Interventions relevant to Key Questions 1–5 are used in the following settings:

  • Pathology and laboratory medicine.
  • Hospitals.
  • Outpatient surgery facilities.
  • Office-based practices.

Types of Studies

Following are study selection criteria specific to each key question.

Key Question 1 is addressed by narrative review rather than by formal study selection criteria. HER2 assay results are influenced by multiple biologic, technical, and performance factors. Because many aspects of HER2 assays were not standardized until very recently, we could not isolate the effects of these disparate influences on assay results and patient classification.

This challenged the validity of using systematic review methods to compare available assay technologies. For that reason, we provide a narrative review of the following factors influencing HER2 test results and their use to classify patients: biologic processes, assay methods, and sources of variability.

Key Question 2. For patients who are not unequivocally HER2-positive, what is the evidence on outcomes of treatment targeting the HER2 molecule (trastuzumab, etc.), or on differences in outcomes of a common chemotherapy or hormonal therapy regimen with versus without additional treatment targeting the HER2 molecule, in:

a) Breast cancer patients characterized by discrepant HER2 results from different tissue assay methods performed adequately; and

b) Those with HER2-negative breast cancer?

Inclusion criteria

  • Randomized trials, or non-randomized studies (prospective or retrospective) on patients given a uniform chemotherapy regimen or hormonal treatment; that
  • Directly compare outcomes of treatment with versus without trastuzumab (or other HER2-targeted therapy); and also
  • Compare outcomes separately for one or more groups whose HER2 assay results are:
    a) equivocal, or discordant by IHC and ISH, with results separately reported for IHC 2+ and 3+ cases (IHC 0 and 1+ cases may be pooled); or
    b) unequivocally negative by both IHC and ISH.

Key Question 3. For breast cancer patients, what is the evidence on clinical benefits and harms of using HER2 assay results to guide selection of:

a) Chemotherapy regimen; or

b) Hormonal therapy?

Inclusion criteria

  • Randomized trials, prospective or retrospective studies on identically treated patients, including:
    • Identical hormonal therapy for all patients in studies on chemotherapy; and
    • Identical chemotherapy for all patients in studies on hormonal therapy; or
    • Separate reporting on identically treated groups.
  • Report outcomes of a breast cancer treatment regimen separately by HER2 status;
  • Report outcomes separately for patients undergoing treatment in the neoadjuvant, adjuvant, or advanced (recurrent, refractory, or metastatic) settings;
  • Report:
    • Pathologic response (i.e. objective tumor regression) rates for studies on neoadjuvant therapy;
    • Disease-free, relapse-free, recurrence-free or progression-free survival for studies on adjuvant therapy; and
    • Progression-free or overall survival for advanced disease.
  • Define HER2 positivity consistently with the algorithm recommended in the ASCO/CAP guideline.
  • Include at least 20 HER2-positive patients.

Separate evidence tables and analyses will focus on:

  • Treatment setting (neoadjuvant, adjuvant or for advanced disease);
  • Chemotherapy regimens (e.g., anthracycline-based regimens, or a taxane); and
  • Hormonal therapies (e.g., tamoxifen versus aromatase inhibitors).

Key Question 4. What is the evidence that monitoring serum or plasma concentrations of HER2 extracellular domain in patients with HER2-positive breast cancer predicts response to therapy, or detects tumor progression or recurrence, and if so, what is the evidence that decisions based on serum or plasma HER2 assay results improve patient management and outcomes?

Inclusion criteria

  • Randomized trials, prospective single-arm studies, or retrospective series of identically treated patients; that
  • Measure serum or plasma HER2 concentrations in breast cancer patients, either at baseline or at multiple time points; and either:
    • Associate baseline values or changes in HER2 concentration with one or more outcomes of interest (primary or secondary); or
    • Compare outcomes of treatment decisions based on assay results with outcomes of decisions made in absence of assay results.

Key Question 5. In patients with ovarian, lung, prostate, or head and neck cancers, what is the evidence that using tumor tissue HER2 status, or monitoring serum or plasma concentrations of HER2, predicts response to therapy or detects tumor progression or recurrence?

Inclusion criteria

  • Randomized trials, prospective single-arm studies, or retrospective series of identically treated patients; that
  • Measure HER2 in tumor tissue, serum, or plasma from patients with ovarian, lung, prostate, or head and neck cancers, and either:
    • Associate HER2 status from tissue assays, or baseline values or changes in serum or plasma HER2 concentration, with one or more outcomes of interest (primary or secondary; see above); or
    • Compare outcomes of treatment decisions based on tumor HER2 status, or serum or plasma assay results, with outcomes of decisions made in absence of test results.

Search Strategy and Review

Search Strategy

Electronic databases. The following databases were searched for citations. The full search strategy is displayed in Appendix A *. The search was not limited to English-language references; however, foreign-language references without abstracts were disregarded.

The MEDLINE® search was performed through 2/23/07. The EMBASE® search was performed through 2/23/07. The Cochrane Controlled Clinical Trials Register search was performed through 2/23/07. Search updates limited by the Cochrane clinical trial filter were performed for all 3 databases on 4/25/08.

Additional sources of evidence. The Technical Expert Panel and individuals and organizations providing peer review were asked to inform the project team of any studies relevant to the key questions that were not included in the draft list of selected studies.

We also examined the bibliographies of all retrieved articles for citations to any relevant study that was missed in the database searches. In addition, we sought studies published in conference proceedings and abstracts from the American Association for Clinical Chemistry (AACC), American Society of Clinical Oncology (ASCO), College of American Pathologists (CAP), and the San Antonio Breast Cancer Symposium (SABCS) over the past two years.

Search Screen

Search results were stored in a ProCite® database. Using the study selection criteria for screening titles and abstracts, a single reviewer marked each citation as either: 1) eligible for review as full-text articles; 2) ineligible for full-text review; or 3) uncertain. Citations marked as uncertain were reviewed by a second reviewer and resolved by consensus opinion, with a third reviewer to be consulted if necessary. Using the final study selection criteria, review of full-text articles was conducted in the same fashion to determine inclusion in the systematic review. Of 6,337 citations, 666 articles were retrieved and 70 selected for inclusion (Figure 1). Records of the reason for exclusion for each paper retrieved in full-text, but excluded from the review, were kept in the ProCite® database (see Appendix B, Excluded Studies).
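
The screening rule and the citation flow described above can be sketched in code. This is a minimal illustration only, not the project's tooling (the report says only that citations were tracked in ProCite); the function name `resolve_screening`, its parameters, and the label strings are assumptions introduced for this example.

```python
from typing import Optional

ELIGIBLE, INELIGIBLE, UNCERTAIN = "eligible", "ineligible", "uncertain"
FINAL = (ELIGIBLE, INELIGIBLE)


def resolve_screening(first_mark: str,
                      consensus: Optional[str] = None,
                      third_opinion: Optional[str] = None) -> str:
    """Return the final screening decision for one citation.

    first_mark    -- mark given by the single initial reviewer
    consensus     -- decision agreed with the second reviewer, if any
    third_opinion -- decision of a third reviewer, consulted only if needed
    (All three parameter names are hypothetical.)
    """
    if first_mark in FINAL:
        return first_mark              # clear-cut citations need no second look
    if first_mark != UNCERTAIN:
        raise ValueError(f"unknown mark: {first_mark!r}")
    if consensus in FINAL:
        return consensus               # first and second reviewer agreed
    if third_opinion in FINAL:
        return third_opinion           # third reviewer settles the question
    raise ValueError("uncertain citation has not been resolved yet")


# Flow counts as reported in the text (Figure 1).
citations, retrieved, included = 6337, 666, 70
not_retrieved = citations - retrieved          # screened out on title/abstract
excluded_at_full_text = retrieved - included   # retrieved but not included

print(f"retrieved in full text: {retrieved / citations:.1%} of citations")
print(f"included: {included / retrieved:.1%} of retrieved articles")
```

Passing the third reviewer's decision as an optional argument mirrors the text, where a third reviewer is consulted only if the first two cannot reach consensus.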

Figure 1. QUOROM Diagram.

Data Extraction and Analysis

Data Elements

The data elements below were abstracted, or recorded as not reported, from included studies. Data elements to be abstracted were defined in consultation with the Technical Expert Panel.

Data elements from intervention studies (randomized, controlled trials, prospective single-arm studies, and retrospective consecutive series of identically treated patients) were:

  • Critical features of the study design (for example, patient inclusion/exclusion criteria, number of subjects, use of blinding)
  • Patient characteristics, including:
    • Age
    • Gender
    • Race/ethnicity
    • Disease and stage
    • Disease duration
    • Performance status
    • Other prognostic characteristics (e.g., estrogen or progesterone receptor status)
  • HER2 assay techniques (tissue versus serum, IHC, FISH, PCR, ELISA, scoring methods, cutoffs);
  • Treatment protocols (for example, regimen, dose, frequency, duration)
  • Patient monitoring procedures (for example, followup duration and frequency, outcome assessment methods) and
  • The specified key outcomes and data analysis methods (including techniques for assessing associations between HER2 findings and outcomes and methods for assessing treatment effect interactions)

Evidence Tables

Templates for evidence tables were created in Microsoft Excel® and Microsoft Word®. One reviewer performed primary data abstraction of all data elements into the evidence tables, and a second reviewer reviewed articles and evidence tables for accuracy. Disagreements were resolved by discussion, and if necessary, by consultation with a third reviewer. When small differences occurred in quantitative estimates of data from published figures, the values obtained by the two reviewers were averaged.
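
The averaging rule for values read from published figures can be illustrated as follows. This is a sketch under stated assumptions: the report does not define how small a "small difference" is, so the `max_relative_gap` threshold and the function name are hypothetical, and the readings are made-up numbers.

```python
from typing import Optional


def reconcile_figure_estimate(reviewer_a: float,
                              reviewer_b: float,
                              max_relative_gap: float = 0.05) -> Optional[float]:
    """Average two readings taken from a published figure.

    Returns the mean when the readings differ by at most `max_relative_gap`
    (relative to the larger reading); returns None to flag the pair for
    discussion otherwise. The 5 percent default is an assumption, not a
    value taken from the report.
    """
    scale = max(abs(reviewer_a), abs(reviewer_b))
    if scale == 0:
        return 0.0                     # both reviewers read zero
    if abs(reviewer_a - reviewer_b) / scale <= max_relative_gap:
        return (reviewer_a + reviewer_b) / 2
    return None                        # too far apart: resolve by discussion


# Hypothetical readings of a survival proportion from a Kaplan-Meier curve.
close_pair = reconcile_figure_estimate(0.62, 0.64)
distant_pair = reconcile_figure_estimate(0.50, 0.70)
print(close_pair, distant_pair)
```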

Assessment of Study Quality

For this systematic review we constructed a hierarchy of evidence quality for studies assessing HER2 status in predicting outcome. As addressed below, the continuum ranged from more informative specially designed randomized trials to less informative single-arm studies using univariate analyses. In addition to the hierarchy of evidence, we adapted acknowledged frameworks for evaluating the quality of prognostic or predictive studies. For assessing the quality of randomized trials, the general approach to grading evidence developed by the U.S. Preventive Services Task Force (Harris, Helfand, Woolf, et al., 2001) was applied. To assess the quality of predictive studies, we adapted the “Reporting Recommendations for Tumor Marker Prognostic Studies” (REMARK) statement (McShane, Altman, Sauerbrei, et al., 2005). The quality of included prospective, single-arm intervention studies and retrospective consecutive series of identically treated patients was assessed based on a set of study characteristics proposed by Carey and Boden (2003). The quality of the abstracted studies was assessed by two independent reviewers. Discordant quality assessments were resolved with input from a third reviewer, if necessary.

Evidence Hierarchy

Table 3 shows the framework for evaluating how informative different designs and analytic strategies would be for predicting outcomes according to HER2 status. The most informative scenario would be a trial in which randomized assignment to treatment groups was stratified by HER2 status, or in which patients were randomized to receive treatment guided by HER2 results or not (Conley and Taube, 2004). An adequately powered stratified randomization would allow valid inferences about treatment-by-HER2 interactions. Randomized trials generally are preferred because they make it possible to determine differences in the relative efficacy of two treatments, whereas single-arm studies can only assess the association between HER2 status and outcomes after a single treatment regimen. Subgroup analyses in randomized trials should ideally assess the significance of treatment effect interactions. Prespecified subgroup analyses guard against the problems of data dredging.

Table 3. Hierarchy of study design and conduct for assessing HER2 status prediction of outcome.

Post-hoc subgroup analyses may generate hypotheses, but may not support strong inferences about differential effectiveness. Multivariate subgroup analyses in randomized trials may be useful if the subgroup variable introduces imbalances among the variable-by-treatment combinations, particularly when only a subset of patients have tumor or serum specimens available. An alternative to multivariate subgroup analysis is cross-tabulation of treatment by HER2 level. The weakness of this approach is its failure to control for imbalances in important prognostic factors, particularly if the patients analyzed are a subset of those randomized. A formal test of interaction is preferred for any trial subgroup analysis. In single-arm (identically treated) studies, multivariate analyses may identify whether a variable is a significant independent predictor of treatment outcome while taking into account the separate influences of other predictors. The least informative situation would be a single-arm study that presents univariate comparisons of HER2 groups.
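
To make the idea of a formal test of interaction concrete, the sketch below compares treatment effects estimated separately in two non-overlapping subgroups using a z-test on the difference of log hazard ratios. This is one common, simple approach and is not necessarily the test used in any included trial; many trials instead fit a regression model with a treatment-by-marker product term. The function name and all numeric inputs are hypothetical.

```python
import math
from typing import Tuple


def interaction_z_test(log_hr_a: float, se_a: float,
                       log_hr_b: float, se_b: float) -> Tuple[float, float]:
    """Test whether a treatment effect differs between two independent subgroups.

    Inputs are log hazard ratios (treatment vs. control) and their standard
    errors, estimated separately in subgroup A and subgroup B.
    Returns (z statistic, two-sided p-value under a normal approximation).
    """
    difference = log_hr_a - log_hr_b
    se_difference = math.sqrt(se_a ** 2 + se_b ** 2)  # independence assumed
    z = difference / se_difference
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))    # equals 2 * (1 - Phi(|z|))
    return z, p_two_sided


# Hypothetical example: hazard ratio 0.50 in HER2-positive patients and
# 0.90 in HER2-negative patients, with made-up standard errors.
z, p = interaction_z_test(math.log(0.50), 0.20, math.log(0.90), 0.15)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

A significant treatment effect in one subgroup and a non-significant effect in the other does not by itself establish an interaction; the comparison has to be made directly, as in this sketch.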

Assessment of Study Quality

As stated, to assess the quality of predictive studies, we adapted the REMARK statement (McShane, Altman, Sauerbrei, et al., 2005). A checklist based on portions of REMARK and other sources (Gould Rothberg and Bracken, 2006; Altman and Riley, 2005; Altman, 2001a, 2001b; Altman and Lyman, 1998; Brocklehurst and French, 1998; Altman, Lausen, Sauerbrei, et al., 1994; Simon and Altman, 1994) was developed. Table 4 identifies good quality characteristics that we looked for in predictive studies, including: prospective design; prespecified hypotheses about the relation of marker to outcome; large, well-defined, representative study population; well-described marker assay methods; blinded assessment of marker in relation to outcome; homogeneous treatment(s), either randomized or rule-based selection; low rate of missing data (≤15 percent); sufficiently long followup; and well-described, well-conducted multivariate analysis of outcome. Decision rules for evaluating each quality item are described in the table.

Table 4. Interpretation rules for assessing quality of predictive studies.

For assessing the quality of randomized trials, the general approach to grading evidence developed by the U.S. Preventive Services Task Force (Harris, Helfand, Woolf, et al., 2001) was applied.

a. The quality of randomized, controlled trials will be assessed on the basis of the following criteria:

  • Initial assembly of comparable groups: adequate randomization, including concealment and whether potential confounders (e.g., other concomitant care) were distributed equally among groups.
  • Maintenance of comparable groups (includes attrition, crossovers, adherence, contamination).
  • Important differential loss to followup or overall high loss to followup.
  • Measurements: equal, reliable, and valid (includes masking of outcome assessment).
  • Clear definition of interventions.
  • All important outcomes considered.
  • Analysis: Adjustment for potential confounders, intention-to-treat analysis.

Definition of ratings based on above criteria:

  • The rating of intervention studies encompasses the three quality categories described here.

  • Good: Meets all criteria: Comparable groups are assembled initially and maintained throughout the study (followup at least 80 percent); reliable and valid measurement instruments are used and applied equally to the groups; interventions are spelled out clearly; all important outcomes are considered; and appropriate attention is given to confounders in analysis. In addition, for randomized, controlled trials, intention to treat analysis is used.
  • Fair: Studies will be graded “fair” if any or all of the following problems occur, without the fatal flaws noted in the “poor” category below: In general, comparable groups are assembled initially but some question remains whether some (although not major) differences occurred with followup; measurement instruments are acceptable (although not the best) and generally applied equally; some but not all important outcomes are considered; and some but not all potential confounders are accounted for. Intention to treat analysis is done for randomized, controlled trials.
  • Poor: Studies will be graded “poor” if any of the following fatal flaws exists: Groups assembled initially are not close to being comparable or maintained throughout the study; unreliable or invalid measurement instruments are used or not applied at all equally among groups (including not masking outcome assessment); and key confounders are given little or no attention. For randomized, controlled trials, intention to treat analysis is lacking.
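
The three rating definitions above can be read as a simple decision rule. The sketch below is a deliberately simplified illustration, not the reviewers' actual instrument: the `TrialAssessment` fields and the `rate_trial` function are hypothetical names, and each field stands for a judgment that a reviewer would still have to make by reading the trial report.

```python
from dataclasses import dataclass


@dataclass
class TrialAssessment:
    """Reviewer judgments for one randomized, controlled trial (hypothetical schema)."""
    comparable_groups_assembled: bool
    comparable_groups_maintained: bool
    followup_fraction: float            # proportion of patients followed up, 0.0 to 1.0
    measurements_reliable_and_equal: bool
    interventions_clearly_defined: bool
    all_important_outcomes_considered: bool
    confounders_addressed: bool
    intention_to_treat: bool
    fatal_flaw: bool = False            # reviewer judged that a "poor" flaw is present


def rate_trial(a: TrialAssessment) -> str:
    """Map reviewer judgments to "good", "fair", or "poor"."""
    # A fatal flaw, or no intention-to-treat analysis in a randomized trial,
    # places the study in the "poor" category.
    if a.fatal_flaw or not a.intention_to_treat:
        return "poor"
    meets_all_criteria = (
        a.comparable_groups_assembled
        and a.comparable_groups_maintained
        and a.followup_fraction >= 0.80  # followup of at least 80 percent
        and a.measurements_reliable_and_equal
        and a.interventions_clearly_defined
        and a.all_important_outcomes_considered
        and a.confounders_addressed
    )
    # Any shortfall without a fatal flaw is graded "fair".
    return "good" if meets_all_criteria else "fair"


example = TrialAssessment(True, True, 0.92, True, True, False, True, True)
print(rate_trial(example))
```

The check for fatal flaws comes first so that a study cannot be graded "fair" or "good" on the strength of its other characteristics.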

b. The quality of included prospective single-arm intervention studies and retrospective consecutive series of identically treated patients was assessed based on a set of study characteristics proposed by Carey and Boden (2003), as follows:

  • Clearly defined question.
  • Well-described study population.
  • Well-described intervention.
  • Use of validated outcome measures.
  • Appropriate statistical analyses.
  • Well-described results.
  • Discussion and conclusion supported by data.
  • Funding source acknowledged.

Footnotes

* Appendixes cited in this report are provided electronically at http://www.ahrq.gov/downloads/pub/evidence/pdf/her2/her2.pdf.
