The Agency for Healthcare Research and Quality (AHRQ), through its Evidence-Based Practice Centers (EPCs), sponsors the development of evidence reports and technology assessments to assist public- and private-sector organizations in their efforts to improve the quality of health care in the United States. The reports and assessments provide organizations with comprehensive, science-based information on common, costly medical conditions and new health care technologies. The EPCs systematically review the relevant scientific literature on topics assigned to them by AHRQ and conduct additional analyses when appropriate prior to developing their reports and assessments.
To bring the broadest range of experts into the development of evidence reports and health technology assessments, AHRQ encourages the EPCs to form partnerships and enter into collaborations with other medical and research organizations. The EPCs work with these partner organizations to ensure that the evidence reports and technology assessments they produce will become building blocks for health care quality improvement projects throughout the Nation. The reports undergo peer review prior to their release.
AHRQ expects that the EPC evidence reports and technology assessments will inform individual health plans, providers, and purchasers as well as the health care system as a whole by providing important information to help improve health care quality.
We welcome comments on this evidence report. They may be sent by mail to the Task Order Officer named below at: Agency for Healthcare Research and Quality, 540 Gaither Road, Rockville, MD 20850, or by e-mail to epc@ahrq.gov.
| Carolyn M. Clancy, M.D. Director Agency for Healthcare Research and Quality | Jean Slutsky, P.A., M.S.P.H. Director, Center for Outcomes and Evidence Agency for Healthcare Research and Quality |
| Beth A. Collins Sharp, Ph.D., R.N. Director, EPC Program Agency for Healthcare Research and Quality | Gurvaneet Randhawa, M.D., M.P.H. EPC Program Task Order Officer Agency for Healthcare Research and Quality |
The research team would like to acknowledge the efforts of Maxine A. Gere, M.S., for general project management and editorial assistance; Elizabeth De La Garza and Joyce Gonzalez for administrative support; Ariel Katz, M.D., M.P.H., for study selection and data abstraction; Thomas Ratko, Ph.D., for fact-checking; and Gurvaneet Randhawa, M.D., M.P.H., for advice as our Task Order Officer.
Objectives: Systematic review of trastuzumab outcomes among breast cancer patients who have negative, equivocal, or discordant HER2 assay results; use of HER2 assay results to predict outcomes of chemotherapy or hormonal therapy regimen for breast cancer; use of serum HER2 to monitor treatment response or disease progression in breast cancer patients; and use of HER2 testing to manage patients with lung, ovarian, prostate, or head and neck tumors. Also, narrative review of concordance of HER2 assays.
Data Sources: We abstracted data from: three articles plus one conference abstract on negative, equivocal, or discordant HER2 results; 26 studies on selection of chemotherapy or hormonal therapy; 15 studies on serum HER2; and 26 studies on ovarian, lung, prostate, or head and neck tumors. Foreign-language studies were included.
Review Methods: We sought randomized trials or single-arm series (prospective or retrospective) of identically treated patients that presented relevant outcome data associated with HER2 status.
Results: HER2 assay results are influenced by multiple biologic, technical, and performance factors. Many aspects of HER2 assays were standardized only recently, so inconsistencies confound the literature comparing different methods. The evidence is weak on outcomes of trastuzumab added to chemotherapy for HER2-equivocal, -discordant, or -negative patients. Evidence comparing chemotherapy outcomes in HER2-positive and HER2-negative patient subgroups may generate hypotheses, but is too weak to test hypotheses. Only a rigorous test can resolve whether HER2-positive patients (but not HER2-negative patients) benefit from an anthracycline regimen. Evidence is available only from uncontrolled series on whether HER2 status predicts complete pathologic response to neoadjuvant chemotherapy. Evidence also is weak regarding differences by HER2 status for outcomes of chemotherapy for advanced or metastatic disease, with most studies lacking statistical power. Data from studies of tamoxifen and aromatase inhibitors suggest that future studies should examine whether HER2 status predicts response to specific hormonal therapies among estrogen-receptor-positive patients. The evidence is weak on whether serum HER2 predicts outcome after treatment with any regimens in any setting, as is the evidence on use of serum or tissue HER2 testing for malignancies of lung, ovary, head and neck, or prostate.
Conclusions: Overall, few studies directly investigated the key questions of this systematic review. Going forward, cancer therapy trial protocols should incorporate elements to facilitate robust analyses of the use of HER2 status and other biomarkers for managing treatment.
The human epidermal growth factor receptor-2 (HER2) gene is amplified and the HER2 protein overexpressed in approximately 18–20 percent of breast cancer cases. Amplification or overexpression of HER2 is associated with poor prognosis. Evidence from randomized trials demonstrates that adding trastuzumab, a therapeutic monoclonal antibody that targets HER2, to adjuvant chemotherapy regimens for HER2-positive breast cancer improves survival. HER2 also is overexpressed in other epithelial malignancies such as ovarian, thyroid, lung, salivary gland/head and neck, stomach, colon, and prostate cancers.
This report is a systematic review of the evidence on other applications of HER2 testing to the management of cancer patients including: potential for response to trastuzumab among breast cancer patients who have negative, equivocal, or discordant HER2 assay results; use of HER2 assay results to guide selection of breast cancer treatments other than trastuzumab (i.e., chemotherapy regimen or hormonal therapy regimen); the use of serum HER2 to monitor treatment response or disease progression in breast cancer patients; and use of HER2 testing to manage patients with ovarian, lung, prostate, or head and neck tumors. The concordance and discrepancy of HER2 measurement methods are discussed in a narrative review.
The review methods were defined prospectively in a written protocol. A technical expert panel provided consultation. The draft report was also reviewed by other experts and stakeholders.
A narrative review was conducted on Key Question 1, which addressed concordance and discrepancy among HER2 assays in breast cancer. HER2 assay results are influenced by multiple biologic, technical, and performance factors. Since many aspects of HER2 assays were standardized only recently, we could not isolate effects of these disparate influences on assay results and patient classification. This challenged the validity of using systematic review methods to compare available assay technologies.
For Key Questions 2–5, we sought randomized trials or single-arm series (prospective or retrospective) of identically treated patients that presented relevant outcome data associated with HER2 status. Primary outcomes were: overall survival (OS); disease-free survival (DFS); progression-free survival (PFS); time to failure (TTF) or progression; quality of life; palliation of symptoms; and treatment-related adverse effects.
Our search had no language restrictions and used these electronic databases:
MEDLINE® (through February 2007)
EMBASE® (through February 2007)
Cochrane Controlled Trials Register (through February 2007)
The searches were updated in April 2008, using the Cochrane clinical trial filter.
Additional sources were the past two years of conference proceedings of the American Association for Clinical Chemistry (AACC), American Society of Clinical Oncology (ASCO), College of American Pathologists (CAP), and the San Antonio Breast Cancer Symposium (SABCS).
Of 6,337 citations, 666 articles were retrieved and 70 were selected for inclusion:
Three articles plus one abstract on use of trastuzumab among HER2-negative or -discordant breast cancer patients;
26 articles on chemotherapy or hormonal therapy for breast cancer patients;
15 articles on plasma or serum HER2 in patients treated for breast cancer; and
26 articles on serum or tissue HER2 in patients with lung cancer, ovarian cancer, head and neck cancer, and prostate cancer.
A single reviewer screened citations for article retrieval; citations judged as “uncertain” were reviewed by a second reviewer. The same procedure was used to select articles for inclusion in the review. A single reviewer performed data abstraction and a second reviewed the evidence tables for accuracy. However, study quality was appraised by dual independent review. All disagreements were resolved by consensus.
The quality of predictive studies was assessed using the general approach described in the “Reporting Recommendations for Tumor Marker Prognostic Studies” (REMARK) statement (McShane, Altman, Sauerbrei, et al., 2005). In addition, we used a hierarchical framework for evaluating how informative different designs and analytic strategies would be to predictions of outcomes according to HER2 status. Most informative is a trial that randomizes patients to receive treatment guided by HER2 results or not; or, alternatively, a trial that stratifies randomized assignment to treatment groups by HER2 status (Conley and Taube, 2004). Other types of studies, in decreasing order of information value, include: randomized trials using prespecified multivariate subgroup analyses, randomized trials using post-hoc multivariate subgroup analyses, randomized trials presenting HER2 by treatment subgroup analyses, single-arm studies using prespecified multivariate analyses, single-arm studies using post-hoc multivariate analyses, and single-arm studies using univariate analyses.
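The ordering of study designs described above can be sketched as a simple ranked enumeration. This is only an illustration of the hierarchy as stated in the text; the names `DesignRank` and `rank_studies` are hypothetical and are not part of the review's methods.

```python
from enum import IntEnum
from typing import List, Tuple


class DesignRank(IntEnum):
    """Study designs in decreasing order of information value.

    Lower number = more informative, following the order given in the text.
    The member names are illustrative labels, not terms from the report.
    """
    RCT_HER2_GUIDED_OR_STRATIFIED = 1
    RCT_PRESPECIFIED_MULTIVARIATE = 2
    RCT_POSTHOC_MULTIVARIATE = 3
    RCT_HER2_BY_TREATMENT_SUBGROUP = 4
    SINGLE_ARM_PRESPECIFIED_MULTIVARIATE = 5
    SINGLE_ARM_POSTHOC_MULTIVARIATE = 6
    SINGLE_ARM_UNIVARIATE = 7


def rank_studies(studies: List[Tuple[str, DesignRank]]) -> List[Tuple[str, DesignRank]]:
    """Return studies sorted from most to least informative design."""
    return sorted(studies, key=lambda study: study[1])


if __name__ == "__main__":
    # Hypothetical study labels, used only to show the ordering.
    example = [
        ("Study A", DesignRank.SINGLE_ARM_UNIVARIATE),
        ("Study B", DesignRank.RCT_POSTHOC_MULTIVARIATE),
        ("Study C", DesignRank.RCT_PRESPECIFIED_MULTIVARIATE),
    ]
    for name, design in rank_studies(example):
        print(design.value, design.name, name)
```

Encoding the hierarchy as ordered values makes it straightforward to sort an evidence table by design before weighing findings.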
HER2 assay results are influenced by multiple biologic, technical and performance factors. Since many aspects of HER2 assays were standardized only recently, these disparate influences confound the existing literature that compares results of different methods. Discordances between immunohistochemistry (IHC) and fluorescent in situ hybridization (FISH) results might arise in one of three ways. They may be artifacts of one accurate and one inaccurate test or of two inaccurate tests, as preanalytic, analytic, and postanalytic practices can vary among laboratories within a study, as well as among studies. Interobserver variability can play a role. Alternatively, discordances may reflect a threshold issue, either related to changes in threshold definitions over time, or an inherent problem of using a continuous measure to classify patients dichotomously. Finally, discordant test results might accurately reflect a variation among patients with respect to the biologic mechanisms that can increase membrane levels of the HER2 protein. This clearly affects the interpretation of evidence on the use of “HER2 status” to predict treatment or disease outcomes, which presumes accurate classification by tissue assays.
Notably, there is no recognized gold standard to determine the HER2 status of tumor tissue, which also precludes consensus on one “best” HER2 assay. Recent guidelines acknowledge present uncertainty, permit clinicians and laboratories to choose an initial well-validated and properly performed HER2 assay method, and recommend confirming results with an alternative assay when initial tests are equivocal. The ASCO/CAP expert panel (Wolff, Hammond, Schwartz, et al., 2007a) defines equivocal HER2 assay results as IHC 2+, or HER2 gene copy number from 4.0 to 6.0, or HER2/CEP17 ratio from 1.8 to 2.2, if ISH is the first or only assay.
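The equivocal ranges quoted above can be expressed as a small classification sketch. The equivocal boundaries (IHC 2+, copy number 4.0 to 6.0, HER2/CEP17 ratio 1.8 to 2.2) come from the text. Labeling values above a range as "positive" and below it as "negative" is an assumption made here for illustration, and the function names are hypothetical.

```python
def classify_ihc(score: str) -> str:
    """Classify an IHC score. Only '2+' = equivocal is taken from the text;
    the other mappings are assumptions for illustration."""
    mapping = {"0": "negative", "1+": "negative", "2+": "equivocal", "3+": "positive"}
    if score not in mapping:
        raise ValueError(f"unrecognized IHC score: {score!r}")
    return mapping[score]


def classify_ish_ratio(ratio: float) -> str:
    """Classify a HER2/CEP17 ratio; 1.8 to 2.2 (inclusive) is equivocal."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    if ratio < 1.8:
        return "negative"   # assumption: below the equivocal range
    if ratio <= 2.2:
        return "equivocal"  # range stated in the text
    return "positive"       # assumption: above the equivocal range


def classify_ish_copy_number(copies: float) -> str:
    """Classify mean HER2 gene copies per nucleus; 4.0 to 6.0 is equivocal."""
    if copies < 0:
        raise ValueError("copy number must be non-negative")
    if copies < 4.0:
        return "negative"   # assumption: below the equivocal range
    if copies <= 6.0:
        return "equivocal"  # range stated in the text
    return "positive"       # assumption: above the equivocal range


def needs_confirmatory_assay(category: str) -> bool:
    """Equivocal initial results are confirmed with an alternative assay."""
    return category == "equivocal"


if __name__ == "__main__":
    for value in (1.5, 2.0, 2.5):
        category = classify_ish_ratio(value)
        print(value, category, needs_confirmatory_assay(category))
```

Because the underlying measures are continuous, small changes near a boundary flip the category, which is the threshold issue noted earlier as one source of discordance.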
Currently available evidence on outcomes of trastuzumab added to chemotherapy for most HER2-equivocal, -discordant, or -negative patients may generate hypotheses, but is too weak to test hypotheses. Most of this evidence is from post-hoc analyses on subgroups not directly randomized or stratified by HER2 status. Scant but intriguing evidence suggests the hypothesis that some patients currently classified as HER2 negative may benefit from adjuvant trastuzumab. Data reported from a post-hoc subgroup analysis of one adjuvant trial (NSABP B31) showed significantly longer DFS and relapse-free interval (RFI) in FISH-negative IHC ≤2+ patients given trastuzumab than in patients managed without trastuzumab, whether the analysis did or did not include those who were IHC 0. However, analysis of data from another similar adjuvant trial (NCCTG N9831) found no significant differences. Both were interim analyses of trials in which fewer than 25 percent of subjects had reached a failure event. Followup analyses from these trials will be of interest.
CALGB 9840 investigators also analyzed a subgroup of metastatic FISH-negative patients that either had (n=38) or did not have (n=103) polysomy 17; overall response rate (ORR) was significantly higher with versus without trastuzumab for those with polysomy 17, but was identical with or without trastuzumab for those without polysomy 17. However, a study in the adjuvant setting (Reinholz, Jenkins, Hillman, et al., 2007) reports no impact of polysomy 17 on benefit from trastuzumab. Additionally, other studies report conflicting data on association of polysomy 17 with overexpression of HER2 protein.
For Question 3a, across all three treatment settings (adjuvant, neoadjuvant, or advanced/metastatic), currently available evidence comparing chemotherapy outcomes in HER2-positive and HER2-negative patient subgroups may generate hypotheses, but is too weak to test hypotheses. In the only study that prespecified multivariate subgroup analysis by HER2 status, interaction of assigned adjuvant treatment (with or without paclitaxel) with HER2 status to predict outcome was not statistically significant (ratio of hazard ratios [HRs]=0.85; p=.41). All other evidence is from post-hoc analyses on subgroups not directly randomized, selected, or stratified by HER2 status, and used data from secondary or correlative analysis on patient subgroups with archived tissue samples. It is uncertain whether these subgroups were well balanced. No studies for Question 3a used trastuzumab for HER2-positive patients.
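For readers unfamiliar with the "ratio of hazard ratios" statistic, the following sketch shows one common approximation for comparing treatment effects between two subgroups. It assumes independent subgroups and standard errors recovered from 95 percent confidence limits on the log scale. It is not the multivariate model used in the trial cited above, the direction of the ratio (HER2-positive over HER2-negative) is an assumption, and all input values in the usage example are hypothetical.

```python
import math
from typing import NamedTuple, Tuple


class InteractionResult(NamedTuple):
    ratio_of_hrs: float
    z: float
    p_two_sided: float


def se_log_hr(ci: Tuple[float, float], z_crit: float = 1.96) -> float:
    """Standard error of ln(HR), recovered from a 95 percent CI."""
    lower, upper = ci
    if not 0 < lower < upper:
        raise ValueError("confidence limits must satisfy 0 < lower < upper")
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def ratio_of_hazard_ratios(
    hr_pos: float, ci_pos: Tuple[float, float],
    hr_neg: float, ci_neg: Tuple[float, float],
) -> InteractionResult:
    """Approximate test of interaction between treatment and HER2 status.

    Assumes the two subgroup estimates are independent, so the variances
    of the log hazard ratios add.
    """
    log_ratio = math.log(hr_pos) - math.log(hr_neg)
    se = math.sqrt(se_log_hr(ci_pos) ** 2 + se_log_hr(ci_neg) ** 2)
    z = log_ratio / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return InteractionResult(math.exp(log_ratio), z, p)


if __name__ == "__main__":
    # Hypothetical subgroup estimates, for illustration only.
    result = ratio_of_hazard_ratios(0.5, (0.25, 1.0), 1.0, (0.5, 2.0))
    print(result)
```

A ratio near 1 with a nonsignificant p-value, as in the prespecified analysis described above, means the data do not show that the treatment effect differs by HER2 status; it does not show that the effects are the same.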
Available evidence focuses on three types of adjuvant chemotherapy: cyclophosphamide plus methotrexate plus fluorouracil (CMF), regimens with an anthracycline, and paclitaxel after or with doxorubicin (Adriamycin®) plus cyclophosphamide (AC). Evidence from two studies (one randomized, controlled trial and one series) suggests HER2-positive patients may benefit less from CMF (smaller improvements in OS and DFS) than HER2-negative patients. Only one of four randomized, controlled trials reports a statistically significant interaction that suggests HER2-positive patients (but not HER2-negative patients) benefit from including an anthracycline in their treatment regimen. Given the highly statistically significant result favoring anthracycline therapy for the entire population (N=14,000) of breast cancer patients included in the Early Breast Cancer Trialists' Collaborative Group (EBCTCG 2005) patient-level meta-analysis, a rigorous test of this hypothesis is necessary before one can conclude that omitting anthracyclines from adjuvant chemotherapy regimens would not worsen outcome for HER2-negative patients.
Two trials compared different doses or frequencies of anthracycline-based regimens. One reported statistically significant interaction of cyclophosphamide, doxorubicin, and fluorouracil (CAF) dose with HER2 status to predict treatment outcome, but the second showed no relationship. One study found that adding paclitaxel after AC improves OS and DFS for HER2-positive patients, but may not improve these outcomes for HER2-negative patients. In contrast, the only randomized, controlled trial with a prespecified multivariate subgroup analysis found no difference by HER2 status in outcomes of concurrently added paclitaxel. Thus, for each of the adjuvant chemotherapy regimens compared, available evidence is too weak to rule out the possibility that HER2-negative patients may benefit from using the added drug or higher dose.
Evidence on whether HER2 status predicts pathologic complete response (pCR) to neoadjuvant chemotherapy is limited to four uncontrolled series (retrospective analysis in three). Data are lacking to directly compare any neoadjuvant regimens. There is also limited evidence on differences by HER2 status for outcomes of chemotherapy for advanced or metastatic disease, with most studies lacking statistical power.
For Question 3b, four studies addressed use of tamoxifen in various breast cancer patient populations, and two compared tamoxifen with aromatase inhibitors. None of these studies included trastuzumab. There were no trials that stratified randomization by HER2 status or randomization to therapy directed by HER2 results or not. Less informative designs were used, including post-hoc multivariate analyses in five randomized trials and one post-hoc multivariate analysis in a single-arm study. Data are too weak to reach new conclusions about differences between subgroups based on HER2 status in effects of specific hormone therapies for patients who are hormone-receptor positive.
Of 13 included studies, three were randomized trials and 11 were single-arm designs. The evidence is weak on whether sHER2 predicts outcome after treatment with any regimens in any setting. Evidence primarily focused on first-line or second- and subsequent-line treatment of metastatic disease using a variety of regimens. Studies used different thresholds for a positive sHER2 result and varied on whether patient selection required positive tissue HER2 status. One randomized and two single-arm studies performed multivariate analysis, although reporting lacked sufficient detail. Univariate analyses provide very limited information value, suggesting candidate variables for future multivariate analyses. Overall, the evidence is too weak to assess whether sHER2 predicts disease progression, treatment response, or outcomes of any specific treatment regimen.
With respect to use of serum or tissue HER2 testing for malignancies of lung, ovary, head and neck, or prostate, the evidence is quite weak. Studies were heterogeneous regarding treatment regimens and thresholds for positive HER2 test results. Of 22 studies addressed for the four types of malignancies, there were no randomized trials that could have analyzed HER2 by treatment effect interactions. Six multivariate analyses in single-arm designs were performed, all of which were poorly described; it is unclear if they were well conducted. Data from these exploratory analyses did not consistently find that HER2 status predicts treatment results. Univariate analyses provide very limited information value, at best suggesting candidate variables for future multivariate analyses.
Overall, few trials directly investigated the key questions of this systematic review. Going forward, cancer therapy trial protocols should incorporate elements to facilitate robust analyses of the potential of HER2 to improve treatment management. These elements include:
Detailed reporting of how HER2 status was ascertained.
Stratified randomization by HER2 status or prospectively specified HER2 subgroup analysis of outcomes.
Detailed recording of relevant data and archiving of tissue samples for all participants, with both made accessible to other researchers, to permit future subgroup analyses of outcomes by HER2 status.
The rationale is strongest for breast cancer therapy trials, as many therapeutic agents, classes, and regimens have been and will be tested. This approach can be generalized to other tumors, to promising biomarkers other than HER2, and to serial collection of serum samples for sHER2 levels. Maximizing data collection in trials planned for other purposes offers an opportunity to screen for potential applications of HER2 and other biomarkers.
For Key Question 2, potential for response to trastuzumab among breast cancer patients who have equivocal, discordant, or negative HER2 assay results, evidence is scant but intriguing. Whether other markers might predict response to trastuzumab for these subgroups could be explored using tissue samples from completed trials.
For Key Question 3, the most compelling question is whether anthracyclines benefit HER2-negative patients. A pragmatic approach for future research is to use individual patient data from the Early Breast Cancer Trialists' Collaborative Group (EBCTCG) meta-analysis, which compared survival with anthracyclines versus CMF in 14,000 patients. However, this approach may be limited by availability of sufficient tumor samples. Also of interest is evidence to clarify whether aromatase inhibitors are more effective than tamoxifen in HER2-positive patients.
For Key Questions 4 and 5, evidence does not support conclusions about use of serum HER2 for any treatment setting within breast cancer or about any use of serum or tissue HER2 for cancer of the lung, ovary, head and neck, or prostate. Future exploratory studies in these areas using preserved or prospectively collected specimens should be designed with attention to study quality concerns.
Since many technical and performance aspects of HER2 assays were not standardized until very recently, differences in preanalytic, analytic, and postanalytic practices confound the existing literature. Available evidence supports hypothesis generation but is too weak to test hypotheses. Scant but intriguing evidence suggests the hypothesis that some patients currently classified as HER2 negative may benefit from adjuvant trastuzumab. Future research should focus on biomarkers that might select such patients. Evidence suggests HER2-positive, but not HER2-negative, patients may benefit from chemotherapy regimens with an anthracycline, but rigorous testing of this hypothesis is necessary. Also worth additional testing is the hypothesis that aromatase inhibitors may be more beneficial than tamoxifen for HER2-positive, hormone-receptor-positive breast cancer patients. Overall, few trials directly investigate the key questions of this systematic review.
Going forward, cancer therapy trial protocols should incorporate elements to facilitate robust analyses of the use of HER2 status and other biomarkers for managing treatment. Given the human and financial cost of cancer therapy trials, the limited resources available, and the long duration of followup needed to assess outcomes, particularly for early stage or slowly growing cancers, it is imperative that tumor tissue blocks be collected, optimally fixed, saved, and made available for correlative tumor marker studies from all randomized patients. Agreement to share blocks with investigators should be made a condition for institutions seeking to participate in cooperative group trials.
| Cancer Type | Estimated New Cases | Estimated Deaths |
|---|---|---|
| Breast cancer (female) | 182,460 | 40,480 |
| Ovarian cancer | 21,650 | 15,520 |
| Thyroid cancer | 37,340 | 1,590 |
| Lung cancer | 215,020 | 161,840 |
| Head and neck | | |
| • oral cavity/pharynx | 35,310 | 7,590 |
| • larynx | 12,250 | 3,670 |
| Stomach | 21,500 | 10,880 |
| Colon | 108,070 | 49,960 |
| Prostate | 186,320 | 28,860 |
Laboratory assays for the HER2 gene and protein in tumor tissue are used to determine the HER2 status of patients with breast cancer (positive if either HER2 gene amplification or HER2 protein overexpression is present; negative if neither is present). As outlined in guideline recommendations for HER2 testing in breast cancer from the American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP; Wolff, Hammond, Schwartz, et al., 2007a), and in a report from a task force of the National Comprehensive Cancer Network (NCCN; Carlson, Moench, Hammond, et al., 2006), information regarding a patient's HER2 status can contribute to treatment and other patient management decisions in several ways. HER2 overexpression has been associated with clinical outcomes in patients with breast cancer (Press, Pike, Chazin, et al., 1993; Press, Bernstein, Thomas, et al., 1997; Yamauchi, Stearns, Hayes, 2001). Because HER2 positivity is associated with a worse prognosis in patients with newly diagnosed breast cancer who do not receive systemic adjuvant chemotherapy, HER2 status may be incorporated along with other prognostic factors into decision making regarding such therapy (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006).
HER2 positivity also appears to be associated with relative, but not absolute, resistance to certain endocrine therapies (e.g., tamoxifen; less so for aromatase inhibitors) and lower benefit from nonanthracycline, nontaxane-containing chemotherapy regimens (Konecny, Pauletti, Pegram, et al., 2003; Ellis, Coop, Singh, et al., 2001; Menard, Valagussa, Pilotti, et al., 2001). HER2 status is also used to determine whether a patient is eligible to receive biologic therapy specifically targeted to HER2 activity, e.g., trastuzumab (Herceptin®, Genentech, San Francisco, CA) or lapatinib (Tykerb®, GlaxoSmithKline, Research Triangle Park, NC).
Additionally, therapies have been developed that specifically target the HER2 protein (Dinh, de Azambuja, Piccart-Gebhart, et al., 2007; Pal and Pegram, 2007; Viani, Afonso, Stefano, et al., 2007; Lin and Rugo, 2007). Evidence from multiple randomized trials demonstrates that trastuzumab, a therapeutic monoclonal antibody that targets HER2, decreases the risk of recurrence and mortality when added to adjuvant chemotherapy regimens for resected HER2-positive breast cancer. A recent meta-analysis (five trials; pooled N=9,117) reported an odds ratio (OR) for mortality with versus without trastuzumab of 0.52 (95 percent CI: 0.44–0.62; p<.00001), while the OR for recurrence was 0.53 (95 percent CI: 0.46–0.60; p<.00001) (Viani, Afonso, Stefano, et al., 2007). In patients with metastatic HER2-positive breast cancer, trastuzumab alone or with chemotherapy increases time to disease progression and improves survival. Thus, there is increased emphasis on accurately determining the HER2 status of patients with newly diagnosed or recurrent breast cancer.
| Assay | Mfr | Methodology | Scoring Criteria | FDA Status |
|---|---|---|---|---|
| A. IHC Assays: measure HER2 protein overexpression in tissue | | | | |
| Clinical Trials Assay | Developed by independent laboratory | CB11 and 4D5 MAb | 0 and 1+ negative, 2+ weakly positive, 3+ strongly positive | Research assay used in trials of trastuzumab in metastatic breast cancer |
| HercepTest™ | DAKO* | A0485 polyclonal antibody | Weakly positive (2+): weak to moderate complete membrane staining in >10% of tumor cells; strongly positive (3+): strong complete membrane staining in >10% of tumor cells* | U.S. Food and Drug Administration (FDA) approved as an aid in the assessment of patients for whom Herceptin™ (trastuzumab) treatment is being considered |
| PATHWAY™ | Ventana† | CB11 MAb | Positive (2+): weak complete staining of the membrane, >10% of cancer cells; positive (3+): intense complete staining of the membrane, >10% of cancer cells† | FDA approved as an aid in the assessment of patients for whom Herceptin™ (trastuzumab) treatment is being considered |
| B. In-Situ Hybridization (ISH) Assays: measure HER2 gene amplification in tissue | | | | |
| PathVysion® HER2 DNA Probe Kit (FISH) | Abbott‡ | Hybridization of fluorescent DNA probes to HER2 gene (orange) and chromosome 17 centromere (green) | HER2 amplification: HER2/CEP17 ratio ≥2 on average for 60 cells; results at or near the cut off point (1.8–2.2) should be interpreted with caution (Persons, Tubbs, Cooley, et al., 2006; Dal Lago, Durbecq, Desmedt, et al., 2006) | FDA approved as an aid in the assessment of patients for whom Herceptin™ (trastuzumab) treatment is being considered |
| INFORM HER2/neu Probe (FISH) | Ventana§ | Hybridization of biotin-labeled DNA probe to HER2 gene and fluorescently labeled avidin | HER2 amplification: average of >6 HER2 gene copies/nucleus; an average of >4.0 <6.0 gene copies/nucleus for 60 cells described as equivocal in one publication (Dal Lago, Durbecq, Desmedt, et al., 2006; Vera-Roman and Rubio-Martinez, 2004) | FDA approved as an adjunct to existing clinical and pathologic information currently used as prognostic indicators in the risk stratification of breast cancer in patients with a primary, invasive, localized, node-negative tumor |
| HER2 FISH pharmDx™ Kit | Dako | Hybridization of fluorescent DNA probes to HER2 gene (red) and PNA probes to chromosome 17 centromere (CEN-17; green) | Count 20 nuclei per tissue specimen, when possible from distinct tumor areas. Specimens with a HER2/CEN-17 ratio ≥2 should be considered HER2 gene amplified (Kallioniemi, Kallioniemi, Kurisu, et al., 1992; Ellis, Dowsett, Bartlett, et al., 2000; Hanna, 2001; Tsuda, Akiyama, Terasaki, et al., 2001). Results at or near the cut-off (1.8–2.2) should be interpreted with caution. If the ratio is borderline (1.8–2.2), count an additional 20 nuclei and recalculate the ratio for the 40 nuclei | FDA approved as an adjunct to clinicopathologic information currently used for estimating prognosis in stage II, node-positive breast cancer patients and as an aid in assessment of patients being considered for Herceptin™ (trastuzumab) treatment |
| SPoT-Light (CISH) | Invitrogen/Zymed¶ | Hybridization of digoxigenin-labeled DNA probe to HER2 gene; detection via mouse antidigoxigenin antibody followed by antimouse-peroxidase | High HER2 amplification defined as >10 dots, or large clusters, (low if >5 dots to 10 dots, or small clusters) or mixture of multiple dots and large clusters of the HER2 gene present per nucleus in >50% tumor cells (Hanna and Kwok, 2006) | DNA probe kit not available in the U.S. |
| EnzMet GenePro (SISH) | Ventana | Hybridization of dinitrophenol-labeled DNA probe to HER2 gene; detection via peroxidase-labeled multimer followed by enzyme metallography | Amplification defined as six or more dots, or large clusters of dots, in 30% or more of invasive tumor cells (Downs-Kelly, Pettay, Hicks, et al., 2005) | DNA probe kit not available in the U.S. |
| C. HER2 Extracellular Domain (ECD) Assays: detect HER2 ECD in serum | | | | |
| Immuno 1®/ADVIA Centaur® | Bayer | Enzyme immunoassay (EIA); primary MAbs NB-3 and TA-1 (one is labeled with fluorescein and the other is either linked to an enzyme or a chemiluminogenic molecule) specific for the ECD of HER2 added to sera; detection via binding of immunocomplex to antifluorescein antibodies in the solid phase, followed by addition of substrate in case of Immuno 1 assay | Elevated ECD concentrations often defined as >15 ng/mL (Payne, Allard, Anderson-Mauser, et al., 2000; Esteva, Cheli, Fritsche, et al., 2005) | FDA approval for followup and monitoring patients with metastatic breast cancer only |
CISH: chromogenic in situ hybridization; ECD: extracellular domain; IHC: immunohistochemistry; FISH: fluorescent in situ hybridization; MAb: monoclonal antibody; Mfr: manufacturer; SISH: silver enhanced in situ hybridization.
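The counting procedure listed in the table for the HER2 FISH pharmDx kit (count 20 nuclei, recount with 20 more if the ratio is borderline) can be sketched as follows. This is a simplified illustration only: computing the ratio as total HER2 signals divided by total CEN-17 signals is an assumption, the function names are hypothetical, and the per-nucleus counts in the usage example are invented.

```python
from typing import Dict, List, Optional, Tuple

# Each nucleus is recorded as (HER2 signal count, CEN-17 signal count).
Nucleus = Tuple[int, int]


def her2_cen17_ratio(nuclei: List[Nucleus]) -> float:
    """Ratio of summed HER2 signals to summed CEN-17 signals (assumed method)."""
    her2_total = sum(her2 for her2, _ in nuclei)
    cen17_total = sum(cen17 for _, cen17 in nuclei)
    if cen17_total == 0:
        raise ValueError("no CEN-17 signals counted")
    return her2_total / cen17_total


def score_specimen(first_count: List[Nucleus],
                   second_count: Optional[List[Nucleus]] = None) -> Dict[str, object]:
    """Apply the table's rule: ratio >= 2 is amplified; a borderline ratio
    (1.8 to 2.2) triggers a recount over the combined set of nuclei."""
    nuclei = list(first_count)
    ratio = her2_cen17_ratio(nuclei)
    if 1.8 <= ratio <= 2.2:
        if second_count is None:
            return {"ratio": ratio, "nuclei": len(nuclei),
                    "status": "borderline, count additional nuclei"}
        nuclei += list(second_count)
        ratio = her2_cen17_ratio(nuclei)
    status = "amplified" if ratio >= 2.0 else "not amplified"
    return {"ratio": ratio, "nuclei": len(nuclei), "status": status}


if __name__ == "__main__":
    # Invented counts: 20 nuclei with 4 HER2 and 2 CEN-17 signals each.
    first = [(4, 2)] * 20
    print(score_specimen(first))
    # Recount with 20 further nuclei showing 6 HER2 and 2 CEN-17 signals.
    print(score_specimen(first, [(6, 2)] * 20))
```

Even after a recount, the table advises that results at or near the cutoff be interpreted with caution, so the final label here should not be read as a definitive classification.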
This systematic review will address five key questions regarding HER2 testing to manage patients with breast cancer or other solid tumors:
1. What is the evidence on concordance and discrepancy rates for methods (e.g., FISH, IHC) used to analyze HER2 status in breast tumor tissue?
2. For patients who are not unequivocally HER2 positive, what is the evidence on outcomes of treatment targeting the HER2 molecule (e.g., trastuzumab), or on differences in outcomes of a common chemotherapy or hormonal therapy regimen with versus without additional treatment targeting the HER2 molecule, in:
a. Breast cancer patients characterized by discrepant HER2 results from different tissue assay methods performed adequately; and
b. Patients with HER2-negative breast cancer?
3. For breast cancer patients, what is the evidence on clinical benefits and harms of using HER2 assay results to guide selection of:
a. Chemotherapy regimen; or
b. Hormonal therapy?
4. What is the evidence that monitoring serum or plasma concentrations of HER2 extracellular domain in patients with HER2-positive breast cancer predicts response to therapy, or detects tumor progression or recurrence, and if so, what is the evidence that decisions based on serum or plasma HER2 assay results improve patient management and outcomes?
5. In patients with ovarian, lung, prostate, or head and neck cancers, what is the evidence that:
a. Testing tumor tissue for HER2; or
b. Monitoring serum or plasma concentrations of HER2
either predicts response to therapy, or detects tumor progression or recurrence; and if so, what is the evidence that decisions based on HER2 assay results improve patient management and outcomes?
The first Key Question will be dealt with via a narrative review of the recent ASCO/CAP guidelines and evidence published subsequently.
This report reviews and synthesizes available evidence on outcomes of using HER2 test results to manage patients with breast cancer or other solid tumors. Five Key Questions are addressed (see “Introduction”). After extensive consideration, we concluded that, because a myriad of technical, biologic and performance matters influence HER2 diagnostic performance, these variables could not be adequately captured in a systematic review. Thus, Key Question 1 will be addressed by a narrative review and Key Questions 2 through 5 will be addressed by systematic review.
This chapter describes the search strategies used to identify literature; criteria and methods used for selecting eligible articles; methods for data abstraction; methods for quality assessment; and, finally, the process for technical expert advice and peer review.
The methods of this review are generally applicable to all Key Questions except Key Question 1. However, as noted, there were variations in specific aspects of the methods as necessary to satisfy requirements of each question.
A technical expert panel provided consultation for the systematic review and reviewed the draft report. The draft report was also reviewed by 12 external reviewers, including invited clinical experts and stakeholders (Appendix D *). Revisions were made to the draft report based on reviewers' comments.
For Key Questions 1–4, populations of interest are patients with breast cancer, with separate analyses for early stage patients receiving adjuvant therapy and those undergoing treatment for metastatic disease.
For Key Question 5, populations of interest are patients with cancers of the lung, ovary, prostate, and head and neck.
In general, outcomes should be standard, valid, reliable, and clinically meaningful. Two types of outcomes are relevant to Key Question 1:
Diagnostic accuracy (e.g., analytic sensitivity, specificity, reliability, etc.); and
Concordance between assay methods.
Multiple levels of outcomes will be addressed for Key Questions 2 through 5:
Lead time for detection of progression, recurrence or metastasis.
Patient management decisions, which may be altered by test results;
Primary (health) outcomes, which may be affected through management changes guided by test results, such as:
Duration of survival, disease-free survival, progression-free survival, and/or time to failure or progression.
Quality of life.
Palliation of measurable symptoms.
Treatment-related adverse effects.
Secondary (intermediate) outcomes include:
Objective clinical response rates (complete and partial responses; separately and summed).
Pathologic complete response rates in patients undergoing neoadjuvant therapy followed by surgery.
Response durations.
Health outcomes will be given greatest emphasis. However, it will likely be necessary to construct causal pathways to connect assay results to health outcomes through patient management decisions.
The interventions of interest for Key Questions 1, 2, 3, and 5 are tissue assays to evaluate tumor HER2 status by:
Immunohistochemistry;
Fluorescence in-situ hybridization;
Chromogenic in-situ hybridization;
Polymerase chain reaction; or
Other methods.
The interventions of interest for Key Question 4, and also of interest for parts of Key Question 5, are assays to measure serum concentration of the HER2 extracellular domain.
Interventions relevant to Key Questions 1–5 are used in the following settings:
Pathology and laboratory medicine.
Hospitals.
Outpatient surgery facilities.
Office-based practices.
Following are study selection criteria specific to each key question.
HER2 assay results are influenced by multiple biologic, technical and performance factors. Since many aspects of HER2 assays were not standardized until very recently, we could not isolate effects of these disparate influences on assay results and patient classification.
This challenged the validity of using systematic review methods to compare available assay technologies. For that reason, we provide a narrative review of the following factors influencing HER2 test results and their use to classify patients: biologic processes, assay methods, and sources of variability.
Key Question 2. For patients who are not unequivocally HER2-positive, what is the evidence on outcomes of treatment targeting the HER2 molecule (trastuzumab, etc.), or on differences in outcomes of a common chemotherapy or hormonal therapy regimen with versus without additional treatment targeting the HER2 molecule, in:
Breast cancer patients characterized by discrepant HER2 results from different tissue assay methods performed adequately; and
For those with HER2-negative breast cancer?
Randomized trials, or non-randomized studies (prospective or retrospective) on patients given a uniform chemotherapy regimen or hormonal treatment; that
Directly compare outcomes of treatment with versus without trastuzumab (or other HER2-targeted therapy); and also
Compare outcomes separately for one or more groups whose HER2 assay results are:
equivocal, or discordant by IHC and ISH, with results separately reported for IHC 2+ and 3+ cases (IHC 0 and 1+ cases may be pooled); or
unequivocally negative by both IHC and ISH.
Key Question 3. For breast cancer patients, what is the evidence on clinical benefits and harms of using HER2 assay results to guide selection of:
Chemotherapy regimen; or
Hormonal therapy?
Randomized trials, prospective or retrospective studies on identically treated patients, including:
Identical hormonal therapy for all patients in studies on chemotherapy; and
Identical chemotherapy for all patients in studies on hormonal therapy; or
Separate reporting on identically treated groups.
Report outcomes of a breast cancer treatment regimen separately by HER2 status;
Report outcomes separately for patients undergoing treatment in the neoadjuvant, adjuvant or advanced (recurrent, refractory, or metastatic) settings
Report:
Pathologic response (i.e. objective tumor regression) rates for studies on neoadjuvant therapy;
Disease-free, relapse-free, recurrence-free or progression-free survival for studies on adjuvant therapy; and
Progression-free or overall survival for advanced disease.
Defined HER2 positivity consistently with the algorithm recommended in the ASCO/CAP guideline.
Included at least 20 HER2-positive patients.
Separate evidence tables and analyses will focus on:
Treatment setting (neoadjuvant, adjuvant or for advanced disease);
Chemotherapy regimens (e.g., anthracycline-based regimens, or a taxane); and
Hormonal therapies (e.g., tamoxifen versus aromatase inhibitors).
Key Question 4. What is the evidence that monitoring serum or plasma concentrations of HER2 extracellular domain in patients with HER2-positive breast cancer predicts response to therapy, or detects tumor progression or recurrence, and if so, what is the evidence that decisions based on serum or plasma HER2 assay results improve patient management and outcomes?
Randomized trials, prospective single-arm studies, or retrospective series of identically treated patients; that
Measure serum or plasma HER2 concentrations in breast cancer patients, either at baseline or at multiple time points; and either:
Associate baseline values or changes in HER2 concentration with one or more outcomes of interest (primary or secondary); or
Compare outcomes of treatment decisions based on assay results with outcomes of decisions made in absence of assay results.
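Key Question 5. In patients with ovarian, lung, prostate, or head and neck cancers, what is the evidence that testing tumor tissue for HER2, or monitoring serum or plasma concentrations of HER2, either predicts response to therapy or detects tumor progression or recurrence; and if so, what is the evidence that decisions based on HER2 assay results improve patient management and outcomes? Inclusion criteria: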
Randomized trials, prospective single-arm studies, or retrospective series of identically treated patients; that
Measure HER2 in tumor tissue, serum, or plasma from patients with ovarian, lung, prostate, or head and neck cancers, and either:
Associate HER2 status from tissue assays, or baseline values or changes in serum or plasma HER2 concentration, with one or more outcomes of interest (primary or secondary; see above); or
Compare outcomes of treatment decisions based on tumor HER2 status, or serum or plasma assay results, with outcomes of decisions made in absence of test results.
Electronic databases. The following databases were searched for citations. The full search strategy is displayed in Appendix A *. The search was not limited to English-language references; however, foreign-language references without abstracts were disregarded.
The MEDLINE® search was performed through 2/23/07. The EMBASE® search was performed through 2/23/07. The Cochrane Controlled Clinical Trials Register search was performed through 2/23/07. Search updates limited by the Cochrane clinical trial filter were performed for all 3 databases on 4/25/08.
Additional sources of evidence. The Technical Expert Panel and individuals and organizations providing peer review were asked to inform the project team of any studies relevant to the key questions that were not included in the draft list of selected studies.
We also examined the bibliographies of all retrieved articles for citations to any relevant study that was missed in the database searches. In addition, we sought studies published in conference proceedings and abstracts from the American Association for Clinical Chemistry (AACC), American Society of Clinical Oncology (ASCO), College of American Pathologists (CAP) and the San Antonio Breast Cancer Symposium (SABCS) over the past two years.
Search results were stored in a ProCite® database. Using the study selection criteria for screening titles and abstracts, a single reviewer marked each citation as either: 1) eligible for review as full-text articles; 2) ineligible for full-text review; or 3) uncertain. Citations marked as uncertain were reviewed by a second reviewer and resolved by consensus opinion, with a third reviewer to be consulted if necessary. Using the final study selection criteria, review of full-text articles was conducted in the same fashion to determine inclusion in the systematic review. Of 6,337 citations, 666 articles were retrieved and 70 selected for inclusion (Figure 1).
The data elements below were abstracted, or recorded as not reported, from included studies. Data elements to be abstracted were defined in consultation with the Technical Expert Panel.
Data elements from intervention studies (randomized, controlled trials, prospective single-arm studies, and retrospective consecutive series of identically treated patients) were:
Critical features of the study design (for example, patient inclusion/exclusion criteria, number of subjects, use of blinding)
Patient characteristics, including:
Age
Gender
Race/ethnicity
Disease and stage
Disease duration
Performance status
Other prognostic characteristics (e.g., estrogen or progesterone receptor status)
HER2 assay techniques (tissue versus serum, IHC, FISH, PCR, ELISA, scoring methods, cutoffs);
Treatment protocols (for example, regimen, dose, frequency, duration)
Patient monitoring procedures (for example, followup duration and frequency, outcome assessment methods) and
The specified key outcomes and data analysis methods (including techniques for assessing associations between HER2 findings and outcomes and methods for assessing treatment effect interactions)
Templates for evidence tables were created in Microsoft Excel® and Microsoft Word®. One reviewer performed primary data abstraction of all data elements into the evidence tables, and a second reviewer reviewed articles and evidence tables for accuracy. Disagreements were resolved by discussion, and if necessary, by consultation with a third reviewer. When small differences occurred in quantitative estimates of data from published figures, the values obtained by the two reviewers were averaged.
For this systematic review we constructed a hierarchy of evidence quality for studies assessing HER2 status in predicting outcome. As addressed below, the continuum ranged from more informative specially designed randomized trials to less informative single-arm studies using univariate analyses. In addition to the hierarchy of evidence, we adapted acknowledged frameworks for evaluating the quality of prognostic or predictive studies. For assessing the quality of randomized trials, the general approach to grading evidence developed by the U.S. Preventive Services Task Force (Harris, Helfand, Woolf, et al., 2001) was applied. To assess the quality of predictive studies, we adapted the “Reporting Recommendations for Tumor Marker Prognostic Studies” (REMARK) statement (McShane, Altman, Sauerbrei, et al., 2005). The quality of included prospective, single-arm intervention studies and retrospective consecutive series of identically treated patients was assessed based on a set of study characteristics proposed by Carey and Boden (2003). The quality of the abstracted studies was assessed by two independent reviewers. Discordant quality assessments were resolved with input from a third reviewer, if necessary.
| Continuum | Study design and analysis |
|---|---|
| More informative ↑ | Randomized trial, randomization stratified on HER2 status OR patients randomized to HER2-guided treatment or non-HER2-guided treatment |
| ↑ | Randomized trial, prespecified multivariate subgroup analysis |
| ↑ | Randomized trial, post-hoc multivariate subgroup analysis |
| Continuum | Randomized trial, treatment by HER2 subgroup analysis |
| ↓ | Single-arm study, prespecified multivariate analysis |
| ↓ | Single-arm study, post-hoc multivariate analysis |
| Less informative ↓ | Single-arm study, univariate analysis |
Post-hoc subgroup analyses may generate hypotheses, but may not support strong inferences about differential effectiveness. Multivariate subgroup analyses in randomized trials may be useful if the subgroup variable introduces imbalances between different variable-by-treatment combinations, particularly when only a subset of patients have tumor or serum specimens available. An alternative to multivariate subgroup analysis is cross-tabulation of results by treatment and HER2 level. The weakness of this approach is failure to control for imbalances in any important prognostic factors, particularly if the patients analyzed are a subset of those randomized. A formal test of interaction is preferred for any trial subgroup analysis. In single-arm (identically treated) studies, multivariate analyses may identify whether a variable is a significant independent predictor of treatment outcome while taking into account the separate influences of other predictors. The least informative situation would be a single-arm study that presents univariate comparisons of HER2 groups.
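As a rough illustration of what a formal test of interaction involves, the sketch below compares a treatment effect (expressed as a log odds ratio for response) between two HER2 subgroups, using a z-test on the difference between the subgroup estimates. This is one common approach and is not drawn from any study in this review. The function names and all counts are hypothetical, every cell of each 2×2 table is assumed to be nonzero, and in practice the interaction would more often be tested within a multivariate regression model.

```python
import math


def log_odds_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Log odds ratio (treated vs. control) and its standard error.

    Assumes every cell of the 2x2 table is nonzero.
    """
    a, b = events_trt, n_trt - events_trt  # treated: responders, nonresponders
    c, d = events_ctl, n_ctl - events_ctl  # control: responders, nonresponders
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def interaction_z_test(subgroup_1, subgroup_2):
    """z statistic and two-sided p-value for the difference in log odds ratios."""
    lor_1, se_1 = log_odds_ratio(*subgroup_1)
    lor_2, se_2 = log_odds_ratio(*subgroup_2)
    z = (lor_1 - lor_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_two_sided


# Hypothetical counts: (responders treated, n treated, responders control, n control)
her2_positive = (60, 100, 40, 100)
her2_negative = (45, 100, 40, 100)

z, p = interaction_z_test(her2_positive, her2_negative)
print(f"interaction z = {z:.2f}, two-sided p = {p:.3f}")
```

The point of the sketch is that the comparison of interest is between the two subgroup effects themselves. Observing a significant effect in one subgroup and a nonsignificant effect in the other does not by itself establish a differential effect.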
| Quality Criterion | Rule |
|---|---|
| Prospective design | Applies to original study design, whether predictive aspect was part of original focus or not. |
| Prespecified hypotheses about relation of marker to outcome | Article must clearly state that investigation of relation of marker to outcome was prespecified primary or secondary objective of study. Must be coded no if original study design is retrospective. Retrospective analysis of originally prospective design is not a prespecified analysis (e.g., use of banked specimens). |
| Large, well-defined, representative study population | At least 100 participants and must have at least 10 events (not participants) per candidate predictor variable. |
| Marker assay methods well-described | Details or references available for detailed assay protocol including reagents or kits used, quality control procedures, reproducibility assessments, quantitation methods, scoring and reporting. |
| Blinded assessment of marker in relation to outcome | Were individuals assessing assay results blinded to outcomes? |
| Homogeneous treatment(s), either randomized or rule-based selection | All patients within a study arm must be given the same treatment regimen (no differences in type and number of modalities). Exceptions made for members of a class within a modality or combinations that have been shown to have comparable efficacy. Heterogeneity of treatment regimens allowable up to 5% of patient population. |
| Low rate of missing data (≤15%) | Refers to number of participants originally enrolled. |
| Sufficiently long followup | Depends on natural history of disease for patient population defined by stage and other prognostic factors. |
| Well-described, well-conducted multivariate analysis of outcome: | |
| 1) clear candidate variable selection | Methods for selecting candidate variables should be clearly described. |
| 2) clear, appropriate model-building guidelines | Model building strategies should be based on previous evidence of predictive factors, not on arbitrary univariate significance levels or stepwise procedures. |
| 3) assumptions tested | Mention should be made, for example, that the proportional hazards assumption of the Cox regression was tested. |
| 4) standard prognostic variables included | A final model should include standard prognostic/predictive variables regardless of significance in univariate analysis. |
| 5) continuous variables well handled | Arbitrary cutoffs should be avoided, optimal cutoffs should be clearly explained, multiple analytic methods explored including keeping variable continuous and more than 2 categories. |
| 6) validation | Was a validation procedure mentioned? |
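The study-population rule in the table above (at least 100 participants and at least 10 events per candidate predictor variable) reduces to simple arithmetic. The sketch below shows one way to apply it; the function name, parameter names, and example counts are hypothetical and serve only as illustration.

```python
def meets_population_criterion(n_participants, n_events, n_candidate_predictors,
                               min_participants=100, min_events_per_variable=10):
    """Apply the 'large, well-defined study population' rule from the table.

    Events (not participants) are counted per candidate predictor variable.
    """
    if n_candidate_predictors < 1:
        raise ValueError("at least one candidate predictor is required")
    events_per_variable = n_events / n_candidate_predictors
    return (n_participants >= min_participants
            and events_per_variable >= min_events_per_variable)


# Hypothetical studies: (participants, events, candidate predictors)
print(meets_population_criterion(250, 45, 5))  # 9 events per variable
print(meets_population_criterion(250, 60, 5))  # 12 events per variable
print(meets_population_criterion(80, 60, 5))   # fewer than 100 participants
```

Under this rule, a study with many participants can still fail the criterion if few outcome events occurred relative to the number of candidate predictors examined.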
For assessing the quality of randomized trials, the general approach to grading evidence developed by the U.S. Preventive Services Task Force (Harris, Helfand, Woolf, et al., 2001) was applied.
The quality of randomized, controlled trials will be assessed on the basis of the following criteria:
Initial assembly of comparable groups: adequate randomization, including concealment and whether potential confounders (e.g., other concomitant care) were distributed equally among groups.
Maintenance of comparable groups (includes attrition, crossovers, adherence, contamination).
Important differential loss to followup or overall high loss to followup.
Measurements: equal, reliable, and valid (includes masking of outcome assessment).
Clear definition of interventions.
All important outcomes considered.
Analysis: Adjustment for potential confounders, intention-to-treat analysis.
Definition of ratings based on above criteria:
The rating of intervention studies encompasses the three quality categories described here.
Good: Meets all criteria: Comparable groups are assembled initially and maintained throughout the study (followup at least 80 percent); reliable and valid measurement instruments are used and applied equally to the groups; interventions are spelled out clearly; all important outcomes are considered; and appropriate attention is given to confounders in analysis. In addition, for randomized, controlled trials, intention to treat analysis is used.
Fair: Studies will be graded “fair” if any or all of the following problems occur, without the fatal flaws noted in the “poor” category below: In general, comparable groups are assembled initially but some question remains whether some (although not major) differences occurred with followup; measurement instruments are acceptable (although not the best) and generally applied equally; some but not all important outcomes are considered; and some but not all potential confounders are accounted for. Intention to treat analysis is done for randomized, controlled trials.
Poor: Studies will be graded “poor” if any of the following fatal flaws exists: Groups assembled initially are not close to being comparable or maintained throughout the study; unreliable or invalid measurement instruments are used or not applied at all equally among groups (including not masking outcome assessment); and key confounders are given little or no attention. For randomized, controlled trials, intention to treat analysis is lacking.
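The three rating definitions above can be read as a simple decision rule: any fatal flaw yields "poor", meeting every criterion yields "good", and anything in between yields "fair". The sketch below encodes that reading under simplifying assumptions. The criterion labels and the function name are hypothetical, and actual ratings rest on reviewer judgment rather than a mechanical checklist.

```python
CRITERIA = (
    "comparable groups assembled",
    "comparable groups maintained (followup at least 80 percent)",
    "measurements reliable, valid, and equally applied",
    "interventions clearly defined",
    "all important outcomes considered",
    "confounders addressed in analysis",
    "intention-to-treat analysis",
)


def rate_trial(criteria_met, fatal_flaws):
    """Return 'good', 'fair', or 'poor' for a randomized trial.

    criteria_met: dict mapping each label in CRITERIA to True or False.
    fatal_flaws: list of fatal flaws identified by the reviewers (may be empty).
    """
    if fatal_flaws:
        return "poor"
    if all(criteria_met.get(name, False) for name in CRITERIA):
        return "good"
    return "fair"


# Hypothetical trial meeting every criterion except one, with no fatal flaws
assessment = {name: True for name in CRITERIA}
assessment["all important outcomes considered"] = False
print(rate_trial(assessment, fatal_flaws=[]))
```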
The quality of included prospective single-arm intervention studies and retrospective consecutive series of identically treated patients was assessed based on a set of study characteristics proposed by Carey and Boden (2003), as follows:
Clearly defined question.
Well-described study population.
Well-described intervention.
Use of validated outcome measures.
Appropriate statistical analyses.
Well-described results.
Discussion and conclusion supported by data.
Funding source acknowledged.
What is the evidence on concordance and discrepancy rates for methods (e.g., FISH, IHC, etc.) used to analyze HER2 status in breast tumor tissue?
HER2 assay results are influenced by multiple biologic, technical and performance factors. Since many aspects of HER2 assays have not been standardized until very recently, the effects of these disparate influences could not be isolated. This challenged the validity of using systematic review methods to compare available assay technologies. For that reason, we provide a narrative review of the following factors influencing HER2 test results and their use to classify patients: biologic processes, assay methods, and sources of variability.
Genes such as those in the epidermal growth factor (EGF) receptor family (HER1 through HER4) affect cellular function through the proteins they encode. The HER2 gene is expressed and HER2 protein is found in membranes of all breast and other epithelial cells, and cut-points between “normal” and “overexpressed” levels of HER2 protein are imprecise. Nevertheless, studies have associated increased amounts of HER2 protein in cell membranes with more aggressive behavior of breast and other epithelial cancers, and these increased amounts may predict treatment outcomes (Slamon, Clark, Wong, et al., 1987; Esteva, Pusztai, Symmans, et al., 2000; Rowinsky, 2004; Hynes and Lane, 2005; Ettinger, 2006; Serrano-Olvera, Duenas-Gonzalez, Gallardo-Rincon, et al., 2006).
Expression of HER2 and similar genes is a sequential process that (in a simplified overview) includes the following steps: transcription of DNA to messenger RNA (mRNA); processing mRNA to mature, translatable messages; and translation of mature mRNA to synthesize the protein's amino acid sequence. For many proteins (including HER2), additional steps required to produce functional molecules include: post-translational modification (e.g., glycosylation), three-dimensional folding, assembly of multi-subunit proteins, and movement to the relevant cellular site or organelle (not necessarily in this sequence).
We will discuss each of the following biologic mechanisms that potentially may increase the amount of HER2 protein in cell membranes:
Increased gene copy number (i.e., more than diploid amounts of HER2 DNA in cell nuclei), by:
HER2 gene amplification, or
Chromosome 17 polysomy;
Elevated HER2 protein levels in cells with diploid amounts of HER2 DNA, by
Increased rate of HER2 gene expression; or
Decreased degradation (increased stability) of HER2 mature message and/or protein.
Gene amplification. In most HER2-positive cases, increased levels of HER2 protein in breast cancer cell membranes are attributable to an amplified HER2 gene (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007; Slamon, Clark, Wong, et al., 1987). Gene amplification increases the copy number for a segment from one arm of a chromosome (Albertson, 2006; Myllykangas and Knuutila, 2006); amounts of the central portion (centromere) and the chromosome's other arm remain unaltered. The amplified DNA segment (amplicon) can include one or several genes. It can be organized as extrachromosomal elements, as repeated units at a single locus (which lengthens the affected chromosome arm), or repeats can be spread throughout the genome. Typically, all or most copies of the amplified gene(s) are expressed, and amounts of the excess protein increase nearly exponentially with gene copy number per cell (Szollosi, Balazs, Feurenstein, et al., 1995; Konecny, Pegram, Venkatesan, et al., 2006).
The HER2 gene has been mapped to the long arm of chromosome 17, at position 17q12 (Vanden Bempt, Drijkoningen, and De Wolf-Peeters, 2007; Jarvinen and Liu, 2006; Kauraniemi and Kallioniemi, 2006; Mano, Rosa, De Azambuja, et al., 2007). Amplicon size can vary, with two to ten (or more) other amplified genes mapping to the region from 17q12 to 17q21. Although not relevant to assays used to classify HER2 status of patients with breast cancer, note that the gene coding for the enzyme topoisomerase II-α (TOPIIA, a target of the anthracyclines) also is located in this segment. Co-amplification of these genes may be more relevant to predict outcomes of therapy with an anthracycline regimen than amplification of the HER2 gene alone, since excess TOPIIA activity is a potential mechanism of anthracycline resistance (see “Results and Conclusions, Key Question 3”).
Chromosome 17 polysomy. HER2 gene copy number also may rise if cells have more than two copies of chromosome 17. Obviously, cells that have replicated their DNA but not yet divided have four rather than two copies of each chromosome, thus also of the HER2 gene. But some breast or other cancer cells may have extra copies of one or more whole chromosomes (termed polysomy), and may stably pass this characteristic to daughter cells. Cells with chromosome 17 polysomy have extra copies of the HER2 gene, although the ratio of HER2 copy number to centromere copy number is the same as in diploid cells unless HER2 also is amplified. However, it is uncertain whether chromosome 17 polysomy is associated with overexpression of the HER2 protein (Vanden Bempt, Drijkoningen, and De Wolf-Peeters, 2007; Beser, Tuzlali, Guzey, et al., 2007; Corzo, Bellosillo, Corominas, et al., 2007; Hyun, Lee, Kim, et al., 2008; Torrisi, Rotmensz, Bagnardi, et al., 2007; Downs-Kelly, Yoder, Stoler, et al., 2005; Ma, Lespagnard, Durbecq, et al., 2005).
Elevated HER2 protein in cells with diploid HER2 DNA. Although such cases are uncommon, clinical investigators have reported breast cancers with elevated HER2 protein levels in malignant diploid cells (i.e., cells lacking amplified HER2 genes or polysomy 17; e.g., Mass, Press, Anderson, et al., 2005; Vogel, Cobleigh, Tripathy, et al., 2002; Pauletti, Godolphin, Press, et al., 1996). This probably arises through increased expression of the HER2 gene, although decreased rates of degradation for either the mRNA or protein are at least theoretically possible. Increased expression may involve enhanced rates of transcription, message processing, translation, and/or post-translational modification (selectively for the HER2 gene). Detailed review of mechanisms that may increase rates of these processes is outside this report's scope.
It is uncertain whether tumors with increased membrane HER2 protein but diploid HER2 DNA respond differently to therapies (targeted to the HER2 protein, or to others) than do tumors with amplified HER2 DNA that increases HER2 protein. It is also unknown if the route to excess HER2 protein (i.e., whether from increased mRNA production, protein synthesis, or decreased degradation of either) affects tumor biology and aggressiveness or treatment outcomes. In vitro data suggest that increased membrane HER2 protein affects cell physiology, proliferation, and treatment responses in the same way, regardless of how the excess is produced (Pierce, Arnstein, DiMarco, et al., 1991).
In current clinical practice, assays used to classify breast cancer patients with respect to HER2 status detect either HER2 protein or HER2 DNA (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007). Research laboratories use assays for HER2 mRNA to study molecular mechanisms and biologic regulation. They are technically more difficult than protein and DNA assays, and measure less-stable molecules. Although real-time reverse transcription polymerase chain reaction (RT-PCR) methods recently were adapted to measure HER2 mRNA in fixed, paraffin-embedded tissues and compared with IHC and ISH assays (Capizzi, Gruppioni, Grigioni, et al., 2008), RT-PCR assays for HER2 mRNA are still uncommon in clinical management of patients with breast cancer and thus are not included in this review.
Each method used to determine HER2 status applies results of a quantitative or semiquantitative assay to assign a binary (“yes/no”) classification. Thus, test results with each assay can vary with different scoring systems and thresholds for positivity. As discussed in a following section (“Postanalytic Factors”), scoring and thresholds may depend on choice of reagents to detect, visualize, and quantitate analytes. Scoring systems and thresholds also have changed over time, with standardized approaches recommended quite recently (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007). Data are lacking to determine whether differences in treatment outcome as a function of HER2 status are affected by reclassifying patients with currently recommended scoring systems and thresholds.
Methods to detect/measure amount of HER2 protein. Immunohistochemistry (IHC) is the assay used most widely for classifying HER2 status of breast cancer patients, since it uses techniques and equipment long used by most clinical pathology laboratories for other proteins such as estrogen and progesterone receptors (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007; Laudadio, Quigley, Tubbs, et al., 2007; Ross, Fletcher, Linette, et al., 2003). The assay incubates thin slices of fixed tissue on a microscope slide with an antibody to HER2, washes off unbound antibody, then visualizes bound antibody. Because IHC preserves tissue architecture and cellular structure (morphology), it permits scoring to focus on antibody specifically bound to membranes of invasive breast cancer cells. IHC also permits permanent storage of stained slides if later re-evaluation is needed.
Protein assays on homogenized tissue may use antibody to visualize HER2 after separating proteins in a solid matrix (Western blots), or quantitate HER2 by enzyme-linked immunosorbent assay (ELISA). These assays destroy the analyzed tissue samples. Additionally, tissue extracts may mix proteins from cytosol, membranes, and other organelles; and also from multiple cell types: normal breast, inflammatory cells, in situ tumor, and invasive cancer. HER2 levels of in situ breast tumor cells often are elevated, for uncertain reasons and with inadequately studied clinical implications (Allred, Clark, Tandon, et al., 1992; Hoque, Sneige, Sahin, et al., 2002; Collins and Schnitt, 2005). Guidelines stress avoiding areas of ductal carcinoma in situ when scoring assay results (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007). Nearly all clinical studies on HER2 protein assays to predict treatment outcomes used IHC on tissue slices rather than assays on tissue homogenates, and assigned HER2 status by amount of HER2 protein in membranes of invasive breast cancer cells.
Methods to detect/measure HER2 gene copy number or amount of HER2 DNA. In situ hybridization (ISH) is the most commonly used method to measure HER2 gene copy number in tissue samples from breast cancer patients (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007; Ross, Fletcher, Linette, et al., 2003; Hicks and Kulkarni, 2008). It uses a labeled probe complementary to the DNA sequence of interest (here, a unique segment from the HER2 gene). Double-stranded DNA in cell nuclei of the fixed tissue sample is denatured so the probe can hybridize (bind) to its complementary sequence, then unbound probe is washed away. As with IHC, tissue preparation for ISH preserves tissue and cell morphology, and scoring focuses on invasive breast cancer cells.
In ISH assays, pathologists count fluorescent (FISH) or dark-colored (CISH, SISH) spots visible above the nucleus to measure HER2 gene copy number: two in diploid cells; more in cells with amplified HER2 or polysomy 17. Typically, one determines gene copy number for multiple invasive cancer cells on the slide, and averages results for the tissue sample. In some ISH assays, slides are hybridized simultaneously with two probes that fluoresce in, or show, different colors, to permit copy number measurement for the HER2 gene and chromosome 17 centromere (CEP17). With this approach, HER2 gene status is defined by the ratio of HER2 to CEP17 copy numbers: greater than 2 if amplified, but approximately 1 if unamplified whether chromosome 17 polysomy is absent or present.
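The dual-probe calculation can be sketched as follows. This is a minimal illustration, not a validated scoring procedure: the function name, the per-cell counts, and the choice to divide the two per-sample means (rather than average per-cell ratios) are all assumptions made for the example. It shows that polysomy 17 raises both counts and leaves the ratio unchanged, while amplification raises only the HER2 count.

```python
from statistics import mean


def her2_cep17_ratio(cell_counts):
    """Return (mean HER2 signals, mean CEP17 signals, HER2/CEP17 ratio).

    cell_counts: list of (her2_signals, cep17_signals) pairs, one pair per
    scored invasive cancer cell. Assumption: the ratio is taken as the ratio
    of the two per-sample means, not the mean of per-cell ratios.
    """
    if not cell_counts:
        raise ValueError("no cells were counted")
    mean_her2 = mean(h for h, _ in cell_counts)
    mean_cep17 = mean(c for _, c in cell_counts)
    if mean_cep17 == 0:
        raise ValueError("no CEP17 signals were counted")
    return mean_her2, mean_cep17, mean_her2 / mean_cep17


# Hypothetical samples, 20 cells each (illustrative counts only).
diploid = [(2, 2)] * 20       # two HER2 copies, two centromeres
polysomy_17 = [(4, 4)] * 20   # extra copies of the whole chromosome
amplified = [(12, 2)] * 20    # extra HER2 copies, normal chromosome count

for name, sample in [("diploid", diploid),
                     ("polysomy 17", polysomy_17),
                     ("amplified", amplified)]:
    her2, cep17, ratio = her2_cep17_ratio(sample)
    print(f"{name}: HER2 {her2:.1f}, CEP17 {cep17:.1f}, ratio {ratio:.2f}")
```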
Early research studies extracted DNA from tissue homogenates and measured amounts of the HER2 gene by Southern or slot blots, or by quantitative polymerase chain reaction (PCR) assays. Southern blots first separate DNA molecules by their mobility in a matrix, while slot blots use the mixed extract. Each selectively visualizes the DNA sequence of interest by hybridizing to labeled probes as in ISH. PCR assays amplify (selectively replicate) DNA sequences of interest in vitro, detect them by fluorescent or other probes, and quantify the starting amount using standard curves. As with protein assays on tissue homogenates, these techniques dilute DNA from invasive cancer cells with DNA from surrounding normal tissues and inflammatory cells (Laudadio, Quigley, Tubbs, et al., 2007; Ross, Fletcher, Linette, et al., 2003). They also consume the samples they analyze. Southern and slot blots are less sensitive than PCR and require substantially larger amounts of DNA. Southern blot assays also are labor intensive and less widely available in clinical pathology labs. The remainder of this review focuses on IHC and ISH methods, the only HER2 assays with FDA-approved kits available for clinical use.
Accurately determining HER2 status depends on proper performance of preanalytic, analytic, and postanalytic steps (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hicks and Kulkarni, 2008; Hanna, O'Malley, Barnes, et al., 2007; Laudadio, Quigley, Tubbs, et al., 2007; Ross, Fletcher, Linette, et al., 2003). Preanalytic steps are those involved in obtaining, preserving (fixing), and storing tissue samples prior to staining and analysis. Analytic steps prepare and stain fixed tissue samples with antibody to HER2 for IHC, or prepare and hybridize them to HER2 gene probe for ISH, then visualize tissue-bound antibody or probe. Postanalytic steps score test results, classify patients, and assure test quality, consistency, and reproducibility. Some processes for these steps are the same for IHC or ISH, but many differ.
Preanalytic: tissue processing and storage. HER2 tests can use tissue from core (incisional) biopsy or tumor excised for biopsy, lumpectomy, or mastectomy (Wolff, Hammond, Schwartz, et al., 2007a). Tissue sources can be the primary tumor or a lymph node or distant metastasis (Carlson, Moench, Hammond, et al., 2006). Although uncommon, discordances in HER2 status between primary tumor and metastases have been reported (for references, see Carlson, Moench, Hammond, et al., 2006). Retesting HER2 status if metastases develop after a long disease-free or progression-free interval may be warranted, depending on where and how HER2 status of the primary tumor was determined.
Tissues are prepared and preserved for assays by slicing larger samples, fixing in a denaturing solution, and embedding fixed tissue for long-term storage (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007). Factors that may influence test results include: edge, retraction, or crush artifacts with some core needle biopsies; time from excision to slicing, and to fixation; type of and time in fixative; choice of embedding material; and conditions and duration of storage for fixed and embedded tissues.
| Recommendation area | Recommendation |
|---|---|
| Optimal algorithm for HER2 testing | Positive for HER2 is either IHC HER2 3+ (defined as uniform intense membrane staining of >30% of invasive tumor cells) or FISH amplified (ratio of HER2 to CEP17 of > 2.2 or average HER2 gene copy number > six signals/nucleus for those test systems without an internal control probe) |
| | Equivocal for HER2 is defined as either IHC 2+ or FISH ratio of 1.8–2.2 or average HER2 gene copy number four to six signals/nucleus for test systems without an internal control probe |
| | Negative for HER2 is defined as either IHC 0–1+ or FISH ratio of < 1.8 or average HER2 gene copy number of < four signals/nucleus for test systems without an internal control probe |
| | These definitions depend on laboratory documentation of the following: |
| Optimal FISH testing requirements | Fixation for fewer than 6 hours or longer than 48 hours is not recommended |
| | Test is rejected and repeated if |
| | Interpretation done by counting at least 20 cells; a pathologist must confirm that counting involved invasive tumor |
| | Sample is subjected to increased counting and/or repeated if equivocal; report must include guideline-detailed elements |
| Optimal IHC testing requirements | Fixation for fewer than 6 hours or longer than 48 hours is not recommended |
| | Test is rejected and repeated or tested by FISH if |
| | Interpretation follows guideline recommendation |
| | Sample is subjected to confirmatory FISH testing if equivocal based on initial results |
| | Report must include guideline-detailed elements |
| Optimal tissue handling requirements | Time from tissue acquisition to fixation should be as short as possible; samples for HER2 testing are fixed in neutral buffered formalin for 6–48 hours; samples should be sliced at 5–10 mm intervals after appropriate gross inspection and margins designation and placed in sufficient volume of neutral buffered formalin |
| | Sections should ideally not be used for HER2 testing if cut >6 weeks earlier; this may vary with primary fixation or storage conditions |
| | Time to fixation and duration of fixation if available should be recorded for each sample |
| Optimal internal validation procedure | Validation of test must be done before test is offered |
| | Initial test validation requires 25–100 samples tested by alternative validated method in the same laboratory or by validated method in another laboratory |
| | Proof of initial testing validation in which positive and negative HER2 categories are 95% concordant with alternative validated method or same validated method for HER2 |
| | Ongoing validation should be done biannually |
| Optimal internal QA procedures | Initial test validation |
| | Ongoing quality control and equipment maintenance |
| | Initial and ongoing laboratory personnel training and competency assessment |
| | Use of standardized operating procedures including routine use of control materials |
| | Revalidation of procedure if changed |
| | Ongoing competency assessment and education of pathologists |
| Optimal external proficiency assessment | Participation in external proficiency testing program with at least two testing events (mailings)/year |
| | Satisfactory performance requires at least 90% correct responses on graded challenges for either test |
| Optimal laboratory accreditation | Onsite inspection every other year with annual requirement for self-inspection |
Abbreviations: HER2, human epidermal growth factor receptor 2; IHC, immunohistochemistry; FISH, fluorescent in situ hybridization; QA, quality assurance.
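The three result categories in the table's testing algorithm can be expressed as a short sketch. The function and variable names are hypothetical, and the code covers only the score and threshold definitions quoted in the table; it does not model the fixation, validation, or quality assurance conditions on which those definitions depend.

```python
# Mapping of IHC scores to categories, as quoted in the table above.
IHC_CATEGORY = {"0": "negative", "1+": "negative",
                "2+": "equivocal", "3+": "positive"}


def classify_ihc(score):
    """Classify an IHC score given as '0', '1+', '2+', or '3+'."""
    return IHC_CATEGORY[score]


def classify_fish(ratio=None, mean_copy_number=None):
    """Classify a FISH result.

    Pass `ratio` (HER2/CEP17) for dual-probe assays, or `mean_copy_number`
    (average HER2 signals per nucleus) for assays without an internal
    control probe. Boundary values fall in the equivocal range.
    """
    if ratio is not None:
        value, low, high = ratio, 1.8, 2.2
    elif mean_copy_number is not None:
        value, low, high = mean_copy_number, 4.0, 6.0
    else:
        raise ValueError("provide ratio or mean_copy_number")
    if value > high:
        return "positive"
    if value < low:
        return "negative"
    return "equivocal"


print(classify_ihc("2+"))                    # equivocal
print(classify_fish(ratio=2.5))              # positive
print(classify_fish(mean_copy_number=3.2))   # negative
```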
Notably, the literature review for this report showed that most studies reporting concordance and discordance rates of different IHC and ISH assays used archived samples, fixed and embedded elsewhere than the laboratory performing the HER2 assays. With exceptions, most publications did not report adequately on adherence to guideline or prior (consensus) recommendations for tissue processing.
Analytic: performing HER2 assays. Analytic steps for processing thin sections of fixed and embedded tissue cut onto glass slides differ for IHC and ISH assays (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007). Each begins by deparaffinizing thin tissue sections, but IHC assays use an antigen retrieval step that optimizes antibody binding to HER2 protein while ISH assays first unwind (denature) cells' double-stranded DNA so that the probe can hybridize to its complementary sequence. The temperature and duration of heating used to bake tissue sections on slides, as well as the conditions used for antigen retrieval, can introduce variability in IHC results. Each assay incubates slides with an analytic reagent (antibody for IHC; probe for ISH), removes unbound reagent in one or more washing steps, and incubates with other reactants to visualize bound analytic reagent. Some steps can be automated, which improves consistency and reproducibility if equipment is well-maintained and regularly calibrated. In addition to reagent choice (which antibody, for IHC; which DNA probe, for ISH), varying the conditions (temperatures, durations, etc.), solutions, and reactants used for each step can affect test results, as can poorly maintained or calibrated automated equipment.
While FDA-approved kits include protocols with optimized methods for each analytic step, guideline publications report that approximately half of surveyed laboratories did not adhere completely to protocol methods (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006). The guidelines stress the need to train and periodically assess the skills of staff conducting these assays, and that each run should include standardized positive and negative controls. They also emphasize that each laboratory offering HER2 testing services should validate its test results against a previously validated test, and that laboratories departing from protocol-specified methods with FDA-approved kits, and those using independently developed assays with analyte-specific reagents, should validate test results against established methods and develop their own standard protocols.
As with preanalytic steps, most published studies did not adequately report the information needed to evaluate complete adherence to guideline or prior (consensus) recommendations on all analytic steps. Studies that used FDA-approved kits rarely commented on protocol adherence in the methods sections of their reports, and studies that used independently developed assays rarely described assay validation against approved kits.
Postanalytic factors. IHC scoring systems and positivity thresholds have changed over time, and these changes likely alter the proportion of patients classified as HER2 positive (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hicks and Kulkarni, 2008; Hanna, O'Malley, Barnes, et al., 2007; Laudadio, Quigley, Tubbs, et al., 2007; Ross, Fletcher, Linette, et al., 2003). Some studies on archived tissues classified tumors as HER2 positive if any invasive cells showed strong, complete membrane staining (e.g., Paik, Bryant, Park, et al., 1998; Houston, Plunkett, Barnes, et al., 1999; Paik, Bryant, Tan-Chiu, et al., 2000). Others classified samples as HER2 positive if 1 percent or more of invasive cells were stained (e.g., MacGrogan, Mauriac, Durand, et al., 1996; Elledge, Green, Ciocca, et al., 1998; Di Leo, Larsimont, Gancberg, et al., 2001); yet others, only if 50 percent or more were stained (e.g., Agrup, Stal, Olsen, et al., 2000; Berry, Muss, Thor, et al., 2000; Colozza, Sidoni, Mosconi, et al., 2005). Few studies adopted (or adapted) Allred's system (Harvey, Clark, Osborne, et al., 1999; developed for IHC assays of estrogen receptors), which rates the proportion of stained invasive cells (from 0 to 5) and the intensity of staining (from 0 to 3), then adds the two ratings for a final score between 0 and 8.
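To illustrate how the choice of positivity threshold alone can change a tumor's classification, the sketch below applies the three percent-stained rules described above to one hypothetical tumor, and also computes an Allred-style total from its two sub-scores. The function names, rule labels, and the example value of 20 percent are assumptions for illustration; the mapping from percent stained to the Allred proportion sub-score is not given in this report and is not modeled.

```python
def positive_by_percent_cutoff(percent_stained, rule):
    """Apply one of the study-era positivity rules to a percent-stained value.

    rule: 'any' (any stained invasive cells), '1_percent' (1 percent or
    more), or '50_percent' (50 percent or more). Rule labels are hypothetical.
    """
    rules = {
        "any": lambda p: p > 0,
        "1_percent": lambda p: p >= 1,
        "50_percent": lambda p: p >= 50,
    }
    return rules[rule](percent_stained)


def allred_total(proportion_score, intensity_score):
    """Add the proportion (0 to 5) and intensity (0 to 3) sub-scores."""
    if proportion_score not in range(6):
        raise ValueError("proportion score must be an integer from 0 to 5")
    if intensity_score not in range(4):
        raise ValueError("intensity score must be an integer from 0 to 3")
    return proportion_score + intensity_score


# One hypothetical tumor with 20 percent of invasive cells stained.
for rule in ("any", "1_percent", "50_percent"):
    print(rule, positive_by_percent_cutoff(20, rule))

print(allred_total(proportion_score=4, intensity_score=2))
```

Under these assumptions the same tumor is positive by the first two rules and negative by the third.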
The scale recommended in FDA-approved IHC kits (0 to 3+; developed for HercepTest™ but also used with PATHWAY™) requires membrane staining in 10 percent or more of invasive cells for scores greater than 0. The scale assigns positive scores by staining intensity and totality of membrane staining: 1+ is faint or barely perceptible staining that is incompletely circumferential; 2+ is moderate intensity but complete circumferential staining; and 3+ is strong intensity and complete circumferential staining (www.dakousa.com/prod_downloadpackageinsert.pdf?objectid_105073003). However, some studies that used this scale defined HER2-positive cases as those scored 2+ or 3+, while others classified only those with a score of 3+ as HER2 positive. The ASCO/CAP guideline retains the original definitions for scores of 0 to 2+, but recommends scoring IHC 3+ only if more than 30 percent of invasive breast cancer cells show dark, homogeneous, circumferential membrane staining in a “chicken wire” pattern (Wolff, Hammond, Schwartz, et al., 2007a). Adequate data are lacking to compare accuracy or concordance for this wide variety of scoring systems and thresholds used to classify patients' HER2 status by IHC alone. However, in one recent study (Hameed, Chhieng, and Adams, 2007), three pathologists blinded to FISH results scored IHC-stained slides from 98 breast cancer cases separately using cut-offs of 10 percent, 30 percent, and 50 percent of stained cells to classify samples as HER2+. Specificity of IHC versus FISH was 82 percent, 86 percent, and 87 percent, respectively, for the three increasing cut-offs, while concordance rates of 3+ cases with FISH were 59 percent, 64 percent, and 65 percent.
Scoring and categorizing results of ISH assays also varies (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hicks and Kulkarni, 2008; Hanna, O'Malley, Barnes, et al., 2007; Laudadio, Quigley, Tubbs, et al., 2007; Ross, Fletcher, Linette, et al., 2003). Guidelines stress that precision and accuracy depend on the number of cells counted and averaged, on accurately identifying and only counting invasive cells, and on counting invasive cells from two or more separate areas of each tumor on either the same or sequential slide(s) (Wolff, Hammond, Schwartz, et al., 2007a). With assays estimating gene copy number per cell without normalizing to a CEP17 probe, most published studies using FISH classified tissues averaging more than 4.0 copies per cell as HER2 positive (for references, see Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Laudadio, Quigley, Tubbs, et al., 2007). Most published studies using CISH scored samples HER2 positive if the average gene copy number per cell was greater than 5, although some followed the manufacturer's recommendation and defined low-level amplification as copy numbers between 6 and 10. In contrast to published studies with FISH or CISH, recent guidelines consider average scores greater than 6.0 as FISH positive, scores less than 4.0 as FISH negative, and scores between 4.0 and 6.0 as equivocal (ASCO/CAP) or borderline (NCCN). Most studies that normalized to CEP17 classified HER2 to CEP17 ratios greater than 2.0 as HER2 positive (for references, see Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Laudadio, Quigley, Tubbs, et al., 2007). The guidelines consider a HER2/CEP17 ratio greater than 2.2 as positive, a ratio less than 1.8 as negative, and ratios between 1.8 and 2.2 as equivocal (ASCO/CAP) or borderline (NCCN).
As with IHC scoring and thresholds, data are lacking to evaluate consequences of the newer classification criteria on accuracy or concordance.
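The practical effect of moving from the single cut-offs used in most published FISH studies to the three-category guideline criteria can be sketched as below. The helper names and the example values are hypothetical; the cut-offs are those stated in the preceding paragraph.

```python
def two_way(value, cutoff):
    """Single cut-off used in most published studies: positive above it."""
    return "positive" if value > cutoff else "negative"


def three_way(value, low, high):
    """Guideline-style criteria with an equivocal (borderline) zone."""
    if value > high:
        return "positive"
    if value < low:
        return "negative"
    return "equivocal"


# HER2/CEP17 ratios: studies used > 2.0; guidelines use < 1.8 / 1.8-2.2 / > 2.2.
for ratio in (1.5, 2.1, 2.5):
    print("ratio", ratio, two_way(ratio, 2.0), three_way(ratio, 1.8, 2.2))

# Average copies per cell (no CEP17 probe): FISH studies used > 4.0;
# guidelines use < 4.0 / 4.0-6.0 / > 6.0.
for copies in (3.0, 5.0, 7.0):
    print("copies", copies, two_way(copies, 4.0), three_way(copies, 4.0, 6.0))
```

In this sketch a ratio of 2.1 or an average of 5.0 copies per cell moves from positive under the older single cut-offs to equivocal under the guideline criteria, which is the kind of reclassification for which comparative data are lacking.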
Guidelines and reviews caution that assigning HER2 status is partially subjective and potentially inconsistent because IHC and FISH scoring criteria are variably interpreted and applied by different raters (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hicks and Kulkarni, 2008; Hanna, O'Malley, Barnes, et al., 2007; Laudadio, Quigley, Tubbs, et al., 2007; Ross, Fletcher, Linette, et al., 2003). Expert panels and reviewers emphasize that image analysis methods, using digital microscopy and automated cellular imaging systems (e.g., Bloom and Harrington, 2004; McCabe, Dolled-Filhart, Camp, et al., 2005; Tubbs, Pettay, Swain, et al., 2006; Ciampa, Xu, Ayata, et al., 2006; Tawfik, Kimler, Davis, et al., 2006; Moeder, Giltnane, Harigopal, et al., 2007), can decrease inter-rater variability and thus improve scoring consistency, accuracy, and precision, particularly for IHC assays. However, this requires careful validation and periodic recalibration of automated systems against standardized positive, negative, and equivocal control samples. Nevertheless, a study testing agreement between pathologists reported that use of digital microscopy to score IHC improved concordance with FISH and also decreased inter-rater variability (Bloom and Harrington, 2004).
Postanalytic steps also include reporting elements that should be provided to clinicians ordering HER2 testing, as well as quality assurance procedures (laboratory accreditation and proficiency testing; competency assessment for pathologists). However, these issues are outside the scope of this report. Readers are referred to recommendations in current guidelines (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007).
Although many studies reported concordance and discrepancy rates for collections of breast tumor tissue tested for HER2 status by IHC with different antibodies, or by IHC and ISH assays, or by multiple ISH assays, current evidence does not suggest one HER2 assay is superior to all others (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hicks and Kulkarni, 2008; Laudadio, Quigley, Tubbs, et al., 2007; Ross, Fletcher, Linette, et al., 2003). As described previously, preanalytic, analytic and postanalytic methods varied between studies, and all studies preceded guidelines for standardizing these methods. Additionally, data are lacking to fully evaluate effects of nonadherence with certain guideline recommendations on test results. Thus, it is difficult (perhaps impossible) to isolate effects of individual factors that contribute to discordance. As detailed above, these include differences in:
- Fixing and embedding tissues, preparing and staining them for assays, or scoring and classifying test results;
- Inherent differences in antibody binding, epitope stability, or antigen retrieval when comparing different antibodies used for IHC;
- Different biologic mechanisms that can increase membrane HER2 protein, when comparing IHC assays versus ISH assays; or differences in sensitivity and specificity of diverse DNA probes and visualization techniques when comparing different ISH methods.
Identifying one “best” HER2 test clearly requires better comparative data than are presently available, from studies that standardize key aspects of the preanalytic, analytic, and postanalytic steps in HER2 assay methods.
The lack of a gold standard to determine breast tumors' HER2 status also prevents agreement on one “best” HER2 assay. Furthermore, seeking a single gold standard may be unrealistic, since HER2 status is used in different ways. The optimal assay (or combination of assays) may differ for HER2 as a prognostic marker, as a marker to predict clinical benefit from trastuzumab, or as a marker to predict benefit from a chemotherapy drug class (e.g., an anthracycline or a taxane). For example, HER2 gene amplification may best predict tumor aggressiveness hence prognosis, while membrane density of HER2 protein may best predict trastuzumab binding to tumor cells and thus clinical response. Furthermore, HER2 may only be a surrogate marker for other molecular alterations that more directly impact tumor cell sensitivity to certain chemotherapy drugs (e.g., anthracyclines).
Outcomes of well-designed and adequately powered comparative clinical trials with sufficient followup duration may be a gold standard to evaluate HER2 assays as predictors of treatment benefit. However, even the large randomized, controlled trials on adjuvant trastuzumab (Romond, Perez, Bryant, et al., 2005; Piccart-Gebhart, Procter, Leyland-Jones, et al., 2005; Slamon, Eiermann, Robert, et al., 2005; Joensuu, Kellokumpu-Lehtinen, Bono, et al., 2006) may not have adequately standardized preanalytic steps at local hospitals; did not test all patients with at least two assays; treated few patients with discordant results by different assays conducted in central laboratories; and presently lack sufficient followup to compare outcomes in subgroups of the main treatment arms (see “Results and Conclusions, Key Question 2”).
Current guidelines acknowledge present uncertainty, permit clinicians and laboratories to choose an initial HER2 assay method, and recommend confirming results with an alternative assay when initial tests are equivocal (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007).
Current guidelines recommend very similar algorithms for using well-validated IHC and ISH assays to classify breast cancer patients with respect to HER2 status (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007). The algorithms are shown in Figure 2 and Figure 3.
Interestingly, a recent study reported on 17 patients with breast core biopsy specimens showing invasive carcinoma and equivocal FISH results (HER2/CEP17 ratios between 1.8 and 2.2) (Striebel, Bhargava, Horbinski, et al., 2008). These patients were subsequently re-evaluated by IHC and FISH testing on resection specimens. For 10 of the 17 cases, equivocal results obtained with biopsy specimens were definitively resolved by retesting of resection specimens. Four patients were classified HER2 positive and treated with trastuzumab, while six were classified HER2 negative and managed without trastuzumab.
Concordance and discordance of different assay methods
Discordance between central and local laboratory results
Validation and proficiency testing
Reports on polysomy 17
For purposes of this review, discordant results are operationally defined as unequivocally positive results by one assay method and unequivocally negative results by a different assay method on sections from the same tumor, with both assays conducted using good laboratory practices, as recommended in the ASCO/CAP guideline (Wolff, Hammond, Schwartz, et al., 2007a). Presently, evidence is lacking to estimate discordance rates from studies that followed all ASCO/CAP recommendations on tissue preparation, testing practices, scoring systems, and thresholds to classify HER2 status of breast cancer patients. Therefore, in the following sections, we summarize evidence on discordance rates from studies that were reported after the guideline was published and that used scoring systems and thresholds similar to those originally specified in U.S. Food and Drug Administration (FDA)-approved kits for IHC and ISH assays.
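The operational definition can be written as a small predicate. This is a sketch under the assumption that each assay result has already been reduced to one of three hypothetical labels ("positive", "equivocal", "negative"); it does not check whether good laboratory practices were followed.

```python
UNEQUIVOCAL = {"positive", "negative"}


def is_discordant(result_a, result_b):
    """True only if both results are unequivocal and they disagree.

    An equivocal result by either assay is, by this definition, never
    counted as discordant.
    """
    return (result_a in UNEQUIVOCAL
            and result_b in UNEQUIVOCAL
            and result_a != result_b)


print(is_discordant("positive", "negative"))   # IHC 3+ with FISH negative
print(is_discordant("equivocal", "positive"))  # IHC 2+ with FISH positive
print(is_discordant("negative", "negative"))   # concordant negatives
```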
Investigators from the National Surgical Adjuvant Breast and Bowel Project's (NSABP) central pathology laboratory and colleagues at NSABP-approved reference laboratories conducted IHC (HercepTest™) and FISH (PathVysion®) assays on formalin fixed, paraffin embedded tumor blocks (Paik, Kim, Jeong, et al., 2007; Paik, Kim, and Wolmark, 2008). They reported results with both assays for 1,787 of 2,043 patients enrolled in the NSABP B31 randomized, controlled trial on adjuvant therapy with versus without trastuzumab (Romond, Perez, Bryant, et al., 2005). Of these, they found FISH-negative, IHC 3+ discordant results in 31 cases (1.7 percent). They also reported FISH-positive, IHC 0, 1+, or 2+ results in another 125 cases (7 percent), but did not separately report the proportion of those who tested FISH positive and IHC 0 or 1+.
Central and reference laboratory results with both IHC (HercepTest™) and FISH (PathVysion®) assays also are available (Perez, Romond, Suman, et al., 2007) for 1,779 of the 2,535 patients registered in a similar randomized, controlled trial conducted by the North Central Cancer Treatment Group (NCCTG N9831; Romond, Perez, Bryant, et al., 2005). Investigators reported discordant IHC 3+, FISH-negative results in 53 cases (3 percent), and FISH-positive, IHC 0, 1+, or 2+ results in 218 cases (12.3 percent). Here again, separate results were not reported for the proportion who tested FISH positive and IHC 0 or 1+. Data presently are unavailable on IHC/ISH discordance rates from three other randomized, controlled trials of adjuvant trastuzumab (Piccart-Gebhart, Procter, Leyland-Jones, et al., 2005; Slamon, Eiermann, Robert, et al., 2005; Joensuu, Kellokumpu-Lehtinen, Bono, et al., 2006).
In a retrospective study, a Canadian central reference laboratory used HercepTest™ and three other HER2 antibody IHC assays to retest tumors from patients diagnosed with metastatic breast cancer between 1999 and 2002, and compared the IHC results with central lab FISH using PathVysion® (O'Malley, Thomson, Julian, et al., 2008). Among 505 patients initially classified HER2 positive by IHC in local labs and treated with trastuzumab for metastatic disease, concordance between central IHC and central FISH ranged from 88.9 percent to 90.9 percent, depending on the HER2 antibody used. Concordance between IHC and FISH was highest (92.2 percent) when all four HER2 antibody assays were used to test each sample, and tumors were only classified IHC positive if positive by 2 or more assays. In a sequential sample of 205 invasive breast tumors locally classified IHC negative, from patients diagnosed with metastasis, concordance of central IHC and central FISH ranged from 93.7 percent to 99 percent for individual antibody assays, and was 98.1 percent if tumors were only classified IHC negative if negative by 2 or more assays. However, this study did not report FISH/IHC discordance rates separately by IHC score.
A study from Greece that separately compared IHC results (using HercepTest™ and two other methods) from central and regional laboratories versus central FISH (PathVysion®) reported on 375 breast tumors tested centrally by IHC and FISH (Papadopoulos, Kouvatseas, Skarlos, et al., 2007). FISH-positive, IHC 0/1+ discordances were seen in six cases (1.6 percent; 11.5 percent of 52 IHC 0/1+ cases), while FISH-negative, IHC 3+ discordances were seen in three cases (0.8 percent; 9.4 percent of 32 IHC 3+ cases). Another study from three Greek hospitals compared IHC results (CB11 antibody) with FISH (PathVysion®) for 194 resected breast cancer patients, and also with CISH (SpoT-Light) for 159 of these patients (Kostopoulou, Vageli, Kaisaridou, et al., 2007). This study reported no FISH-positive cases and only one CISH-positive case among 94 IHC 0/1+ patients. Of 30 patients with IHC 3+ results, one (3.3 percent) was FISH negative and CISH negative.
A study from Germany on patients evaluated for inclusion in a trial of trastuzumab for metastatic breast cancer reported central IHC (HercepTest™) and FISH (PathVysion®) results for 289 patients (Hofmann, Stoss, Gaiser, et al., 2008). Investigators reported no FISH-positive cases among 100 patients scored IHC 0/1+, and nine FISH-negative but IHC 3+ cases (8.4 percent of 107 scored IHC positive; 3.1 percent of all patients evaluated).
A small study (n=55) compared two dual-probe (i.e., for HER2 and CEP17) FISH kits (PathVysion® and HER2 FISH pharmDx), a single-probe FISH kit (Inform; HER2 only) and the SpoT-Light CISH kit versus two IHC assays (HercepTest™ and an independently developed test) (Cayre, Mishellany, Lagarde, et al., 2007). Investigators reported results with each assay (and with different positivity thresholds for Inform and SpoT-Light) separately for each sample. Four of 55 (7.3 percent) cases tested IHC 3+ with HercepTest™ and ISH-negative by all assays (other than a threshold of more than four signals for Inform). Three of the four were scored less than 3+ by independently developed IHC. All cases scored FISH positive by two or more kits also were scored IHC 3+ by HercepTest™.
Another small study (n=54) used the HercepTest™ and PathVysion® kits on all samples (Kuo, Wang, Chang, et al., 2007). Three cases (5.6 percent) that tested FISH negative were scored 3+ by IHC. In contrast, no cases that tested FISH positive were scored IHC 0 or 1+.
| IHC score | Median % of patients | 95% credible interval | Expected # per 1,000 screened by IHC | 95% credible interval | % discordant by FISH (a) | 95% credible interval | Expected # of discordances by FISH per 1,000 screened by IHC | 95% credible interval |
|---|---|---|---|---|---|---|---|---|
| 0 | 36.1 | 4.4–64.3 | 362 | 44–642 | 1.6 | 0.9–2.8 | 6 | 1–13 |
| 1+ | 35.5 | 7.4–67.4 | 355 | 74–674 | 4.9 | 2.6–17.9 | 18 | 8–30 |
| 2+ | 12.0 | 3.5–21.4 | 120 | 35–214 | NA (b) | NA (b) | NA (b) | NA (b) |
| 3+ | 16.2 | 10.7–22.9 | 162 | 107–230 | 7.6 | 3.8–12.9 | 12 | 6–21 |

(a) Percentages shown are of the expected # of patients with the IHC score listed in the left column.
(b) NA = not applicable, since IHC 2+ is considered an equivocal result, thus defined as not discordant regardless of subsequent FISH result.
Three small studies (combined N=211) conducted outside North America compared results of different ISH methods. An Australian study on 49 breast cancer samples reported that all 20 cases scored highly positive (greater than 10 signals/cell) by FISH, and seven of 10 cases scored low-positive (5–10 signals/cell) by FISH, also scored positive by CISH (Bilous, Morey, Armes, et al., 2006). Each sample scored IHC 3+ by HercepTest™ also tested CISH positive. A study from Germany reported agreement in 95 of 99 breast tumor samples tested by FISH (PathVysion®) and SISH, an overall concordance of 96 percent (Dietel, Ellis, Hofler, et al., 2007). Finally, a study from Poland compared FISH, CISH, and SISH on 63 breast tumor specimens selected for 2+ or 3+ staining by IHC (Sinczak-Kuta, Tomaszewska, Rudnicka-Sosin, et al., 2007). Investigators reported and interpreted multiple statistical tests (Pearson chi-square tests with p<0.01; gamma correlation coefficients of 0.89 to 0.96; Spearman rank correlation coefficients of 0.70 to 0.79; and Kappa coefficients of 0.38 to 0.58) for separate two-way comparisons of assay results (i.e., CISH versus FISH, FISH versus SISH, and SISH versus CISH) as evidence for good agreement between the methods, but did not report concordance or discordance rates. Larger studies are needed to estimate rates of concordance and discordance between FISH or IHC and newer ISH methods (CISH, SISH) more reliably. Furthermore, FDA-approved kits for CISH or SISH are not yet available.
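Because the Polish study reported kappa coefficients but not concordance rates, it may help to see how the two quantities are derived from the same cross-tabulation. The sketch below computes raw concordance and Cohen's kappa for two assays; the 2x2 counts are entirely hypothetical (chosen to total 63 only for scale) and are not data from any cited study.

```python
def concordance_and_kappa(table):
    """Return (raw concordance, Cohen's kappa) for a square agreement table.

    table[i][j] = number of samples placed in category i by assay A and
    category j by assay B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, given each assay's marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts: rows = assay A (positive, negative),
# columns = assay B (positive, negative).
hypothetical = [[40, 5],
                [5, 13]]

observed, kappa = concordance_and_kappa(hypothetical)
print(f"raw concordance = {observed:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance, so it is lower than raw concordance whenever chance agreement is above zero; reporting both allows readers to compare studies directly.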
To summarize, evidence from seven studies and a meta-analysis reported after the ASCO/CAP guideline (Wolff, Hammond, Schwartz, et al., 2007a) suggests variable but perhaps non-negligible rates for FISH-negative, IHC 3+ discordance (albeit by the older definition of strong, complete membrane staining in greater than 10 percent of invasive cells), ranging from 0.5 percent to 7.3 percent of breast cancer cases. The meta-analysis also estimated that 0.6 percent (95 percent CI: 0.1–1.3 percent) of cases might be scored IHC 0 and FISH positive, while 1.8 percent (95 percent CI: 0.8–3.0 percent) of cases might be scored IHC 1+ and FISH positive. However, data are unavailable to estimate discordance rates for either group using the current ASCO/CAP definition of IHC 3+ (greater than 30 percent of invasive cells stained).
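The range quoted for FISH-negative, IHC 3+ discordance can be re-derived from the case counts given in the study summaries above. The sketch assumes that the denominator for each study is the number of patients or tumors tested by both assays as stated in this section (for Kostopoulou et al., the 194 patients tested by FISH); the dictionary and function names are illustrative.

```python
# (FISH-negative IHC 3+ cases, tumors tested by both assays), as reported above.
reported = {
    "NSABP B31 (Paik et al.)": (31, 1787),
    "NCCTG N9831 (Perez et al.)": (53, 1779),
    "Papadopoulos et al.": (3, 375),
    "Kostopoulou et al.": (1, 194),
    "Hofmann et al.": (9, 289),
    "Cayre et al.": (4, 55),
    "Kuo et al.": (3, 54),
}


def percent(cases, total):
    """Percentage of all tested cases, rounded to one decimal place."""
    return round(100 * cases / total, 1)


rates = {study: percent(*counts) for study, counts in reported.items()}

for study, rate in rates.items():
    print(f"{study}: {rate} percent")

print("range:", min(rates.values()), "to", max(rates.values()), "percent")
```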
Disagreement between central and local laboratory results. Evidence reviewed by the ASCO/CAP expert panel demonstrated disagreement between central and local laboratory HER2 test results in approximately 20 percent of cases (Wolff, Hammond, Schwartz, et al., 2007a). This included data from the first 104 patients registered for NSABP B31, showing disagreement in 18 percent of cases (Paik, Bryant, Tan-Chiu, et al., 2002), which resulted in a protocol amendment limiting HER2 testing to 23 approved laboratories. The evidence also included data from NCCTG N9831 showing agreement in 88.1 percent of 813 cases rated FISH positive, 81.6 percent of 1,063 cases scored IHC 3+ by HercepTest™, and 75.0 percent of 636 cases scored IHC 3+ by non-HercepTest™ assays (Perez, Suman, Davidson, et al., 2006). Finally, it included data from a community-based clinical study on trastuzumab for metastatic breast cancer showing 77 percent agreement on samples scored IHC 3+ by local laboratories, but only 26 percent agreement on samples locally scored IHC 2+ (Reddy, Reimann, Anderson, et al., 2006). Based on the available evidence, the panel recommended specific measures for assay validation, self-assessment, accreditation, and proficiency testing by laboratories conducting HER2 assays. In the following section, we summarize new evidence comparing local versus central laboratory results, published since the ASCO/CAP review. Although published after the ASCO/CAP guideline, these studies preceded the guideline and scored samples as originally recommended by manufacturers and FDA labeling.
Final data from NSABP B31 showed disagreement on HER2 status in 174 of 1,787 cases (9.7 percent) classified HER2 positive by local laboratories but HER2 negative by both FISH (PathVysion®) and IHC assays in central or reference laboratories (Paik, Kim, Jeong, et al., 2007; Paik, Kim, and Wolmark, 2008). Data presently are unavailable on rates of disagreement between local and central laboratories from three other randomized, controlled trials of adjuvant trastuzumab (Piccart-Gebhart, Procter, Leyland-Jones, et al., 2005; Slamon, Eiermann, Robert, et al., 2005; Joensuu, Kellokumpu-Lehtinen, Bono, et al., 2006).
A small study compared central and local laboratory IHC results on breast tumor samples initially scored IHC 2+ locally and found FISH positive after referral for central laboratory confirmation (Barrett, Magee, O'Toole, et al., 2007). Investigators reported that of 153 IHC 2+ cases referred to the central laboratory for FISH confirmation, 29 (19 percent) had amplified HER2 genes. With repeat IHC in 25 of the 29, the central laboratory scored 18 cases (72 percent) as IHC 3+ and agreed with the local laboratory score of IHC 2+ in only 7 cases (28 percent). Since the central laboratory did not repeat IHC testing for the 124 cases with nonamplified HER2 genes by FISH, the overall rate of agreement with local results cannot be determined.
A larger study compared IHC results in local (regional) and central laboratories (Papadopoulos, Kouvatseas, Skarlos, et al., 2007). Of 458 available samples, 369 were tested by IHC both regionally and centrally and scores agreed for 296 (80.2 percent). Disagreement was greatest among samples (n=11) scored IHC 3+ by regional laboratories (63 percent concordance). Concordance was better among those (n=20) scored IHC 0 or 1+ and those scored IHC 2+ (n=338) at regional laboratories (85 percent and 80 percent, respectively).
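The overall concordance quoted above is a simple proportion (296 of 369). As a sketch of how such a rate and its uncertainty might be computed, the code below reproduces the 80.2 percent figure and adds a Wilson 95 percent interval. The interval is an illustrative addition, not a value reported by the study, and the function names are hypothetical.

```python
import math


def concordance(agree, total):
    """Proportion of samples on which two laboratories agreed."""
    return agree / total


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95 percent)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Counts taken from the text: 296 of 369 samples agreed
rate = concordance(296, 369)
low, high = wilson_interval(296, 369)
print(f"concordance = {100 * rate:.1f} percent")
print(f"Wilson interval (not reported in the source) = {100 * low:.1f} to {100 * high:.1f} percent")
```

The same two functions apply to any of the agreement counts in this section, although intervals become wide for small subgroups such as the 11 samples scored IHC 3+ regionally.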
A central reference laboratory analyzed tumor specimens from 315 of 399 (79 percent) patients randomized to capecitabine with or without lapatinib, using both IHC (antibody not reported) and FISH (PathVysion®), to confirm the local laboratory results that had classified these patients as HER2 positive and thus eligible for this randomized, controlled trial (Cameron, Casey, Press, et al., 2008). Central testing found 241 of 315 (77 percent) HER2 positive, including 211 with IHC 3+ results and 30 with IHC 2+, FISH-positive results.
In the Canadian study cited previously, central laboratory testing of breast tumor tissue samples confirmed the IHC-positive status of 79.3 percent to 89.6 percent of 505 cases found IHC positive by local laboratory results (O'Malley, Thomson, Julian, et al., 2008). Among 205 cases found IHC negative by local labs, central IHC testing confirmed local results in 94.8 percent to 100 percent of cases. The concordance rates varied, depending on which of four IHC assays the central laboratory used.
To summarize, data reported after publication of the ASCO/CAP guideline (Wolff, Hammond, Schwartz, et al., 2007a) confirm the estimate of approximately 20 percent disagreement between local (or regional) and central laboratories with respect to HER2 assay results. Data are presently lacking to evaluate the effects of adherence to guideline recommendations for preanalytic, analytic, and postanalytic steps on rates of local/central disagreement.
Validation and proficiency testing. Since these issues are outside the scope of this evidence report, interested readers are referred to current guidelines for specific recommendations on best practices to validate assays and test laboratory proficiency (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007). Evidence reviewed by the expert panel included a summary of results from 2004 and 2005 surveys of laboratories participating in CAP-sponsored interlaboratory comparisons of IHC results, using tissue microarrays as the test material (Fitzgibbons, Murphy, Dorfman, et al., 2006). The key finding was that 97 of 102 laboratories (95 percent) in 2004 and 129 of 141 laboratories (91 percent) in 2005 correctly scored 90 percent or more of the test cases. In the following section, we briefly summarize evidence published after the ASCO/CAP guideline. Again, these studies scored samples as originally recommended by manufacturers and FDA labeling.
An international study compared five pathology reference centers (from Netherlands, Canada, France, Belgium, and Germany) on assay scoring and HER2 status classification for separate samples tested by IHC (n=20) or by FISH (n=20) (Dowsett, Hanna, Kockx, et al., 2007). Agreement was uniform among centers on HER2 status classifications for all 20 IHC test cases, although some scoring differences were noted, and some equivocal cases (i.e., those scored IHC 2+) required FISH confirmation to determine HER2 status. Agreement was uniform among centers on 16 of 20 (80 percent) FISH test cases. Each of the other four cases was scored in the equivocal range (HER2/CEP17 ratio 1.7–2.3).
A similar international study (from Netherlands, Australia, Canada, France, and Germany) compared results from five central laboratories on 211 breast cancer specimens tested by CISH, FISH and IHC (van de Vijver, Bilous, Hanna, et al., 2007). Each central laboratory sent unstained sections from samples they tested to four other (“outside”) central laboratories. Investigators reported uniform agreement by CISH in the “outside” laboratories on 73 of 76 cases (96 percent) scored highly amplified (HER2/CEP17 greater than 4.0) by FISH in the initial laboratory. Similarly, “outside” CISH uniformly agreed with 94 of 100 (94 percent) cases initially scored as not amplified by FISH (HER2/CEP17 less than 2.0). Among 35 cases scored as equivocal by initial FISH testing (HER2/CEP17 2.0–4.0), 20 were scored as CISH positive and 15 were scored as CISH negative. Overall interlaboratory concordance was 95 percent for cases with normal HER2 gene copy number (1–5) and was 92 percent for cases with 6 or more copies of the HER2 gene.
A brief report by investigators from the Italian Network for Quality Assessment of Tumor Biomarkers (INQUAT) and the United Kingdom National External Quality Assessment Service (U.K. NEQAS) highlighted the importance of including both preanalytic and analytic steps in proficiency testing programs (Paradiso, Miller, Marubini, et al., 2007). The U.K. NEQAS program for HER2 testing focuses on preanalytic aspects of the IHC assay, while the INQUAT program focuses on intra- and interlaboratory variability in scoring a set of fixed and stained IHC slides. Twelve Italian laboratories participated in both quality control programs during 2003, and only one achieved high-quality performance in preanalytic processing steps and in intra- and interlaboratory reproducibility. Some laboratories that achieved high-quality performance in preanalytic steps did not score slides reproducibly, or vice versa. Three of the 12 laboratories did not perform adequately on either preanalytic or analytic steps.
A recent study covalently attached fixed and unfixed samples of synthetic HER peptide to glass microscope slides with unstained sections of invasive breast carcinomas (Vani, Sompuram, Fitzgibbons, et al., 2008). The peptide fragments were used as positive analyte controls on slides distributed to 192 laboratories participating in the CAP 2006 HER2-B proficiency testing survey. Stained slides were returned and centrally reviewed (n=109 laboratories), permitting participants to evaluate sources of variability in HER2 staining performance. Investigators reported suboptimal staining in 20 of 109 slides (18.3 percent). Of these, seven cases (35 percent of the 20 failures) were attributable to errors in the antigen retrieval step, four (20 percent) were attributable to problems with the antibody staining protocol, and nine (45 percent) had problems with both.
In summary, two studies published subsequent to the ASCO/CAP review (Wolff, Hammond, Schwartz, et al., 2007a) reported similar results on interlaboratory comparisons. Overall, the available evidence shows 90 percent or greater agreement between high-volume reference laboratories in North America, Europe, and Australia. Scoring differences between laboratories occur most often with cases of low-level amplification or low-level overexpression. Results reported before and after the ASCO/CAP review (and other guidelines) support considering such cases as equivocal results, with confirmatory testing needed to classify HER2 status. Collaborative data from Italy and the United Kingdom suggest that quality control programs must evaluate all steps (preanalytic, analytic, and postanalytic) in HER2 testing. Positive analyte controls confirmed that antigen retrieval and antibody staining are persistent sources of interlaboratory variability in IHC results.
Reports on polysomy 17. The ASCO/CAP expert panel (Wolff, Hammond, Schwartz, et al., 2007a) interpreted evidence from two studies (Downs-Kelly, Yoder, Stoller, et al., 2005; Ma, Lespagnard, Durbecq, et al., 2005) as not supporting an association of polysomy 17 (defined as three or more copies of CEP 17) with HER2 protein or mRNA overexpression. However, one of these (Ma, Lespagnard, Durbecq, et al., 2005) reported increased HER2 protein (IHC 3+) in a subset of patients with polysomy 17 and HER2/CEP 17 ratios less than 2. In the following section, we summarize evidence published subsequent to the ASCO/CAP guideline.
Nine studies have reported data on polysomy 17 and HER2 status of breast cancer patients since the ASCO/CAP review. Of these, seven have been published in full (Dal Lago, Durbecq, Desmedt, et al., 2006; Torrisi, Rotmensz, Bagnardi, et al., 2007; Corzo, Bellosillo, Corominas, et al., 2007; Beser, Tuzlali, Guzey, et al., 2007; Hyun, Lee, Kim, et al., 2008; Kostopoulou, Vageli, Kaisaridou, et al., 2007; Hofmann, Stoss, Gaiser, et al., 2008) and two were reported at meetings with slides or video available on line (Kaufman, Broadwater, Lezon-Geyda, et al., 2007; Reinholz, Jenkins, Hillman, et al., 2007). Three studies reported no association of polysomy 17 with HER2 protein and/or mRNA overexpression (Dal Lago, Durbecq, Desmedt, et al., 2006; Torrisi, Rotmensz, Bagnardi, et al., 2007; Corzo, Bellosillo, Corominas, et al., 2007). In contrast, five other studies reported increased levels of HER2 protein in some cases with polysomy 17 and unamplified HER2 genes (Hyun, Lee, Kim, et al., 2008; Kaufman, Broadwater, Lezon-Geyda, et al., 2007; Reinholz, Jenkins, Hillman, et al., 2007; Kostopoulou, Vageli, Kaisaridou, et al., 2007; Hofmann, Stoss, Gaiser, et al., 2008). The ninth study did not report data on overexpression of HER2 protein or mRNA; this study reported chromosome 17 polysomy in two of 11 patients with HER2 gene amplification and in seven of 39 patients with unamplified HER2 genes (Beser, Tuzlali, Guzey, et al., 2007). In one study (Hofmann, Stoss, Gaiser, et al., 2008), seven of nine discordant IHC 3+/FISH-negative patients had chromosome 17 polysomy, and six of 26 patients with polysomy 17 responded to trastuzumab therapy for metastatic disease. However, all six responders were scored 3+ by IHC.
In contrast to conclusions of the ASCO/CAP review (Wolff, Hammond, Schwartz, et al., 2007a), evidence published subsequently reopens the question of whether chromosome 17 polysomy has implications for classifying patients' HER2 status. Five of eight new studies found polysomy 17 to be associated with protein (and/or mRNA) overexpression in at least some patients with nonamplified HER2 genes, while three of eight found no association.
Discordances between IHC and FISH results might arise in one of three ways. They may be artifacts of one accurate and one inaccurate test. Alternatively, they may reflect a threshold issue, either related to the changes in threshold definitions over time, or an inherent problem of using a continuous measure to classify patients dichotomously. Finally, discordant test results might accurately reflect a small number of different patients with respect to the biologic mechanism that increases membrane levels of the HER2 protein. Present data could not tease apart the many factors reviewed here (preanalytic, analytic and postanalytic) that might have contributed to discordances in HER2 assay results. This clearly affects the interpretation of evidence on key questions that address use of “HER2 status” to predict treatment outcomes, even in nonbreast malignancies (Key Questions 2, 3, and 5). Furthermore, it also affects interpretation of evidence on the added clinical utility of serum measurements for patients with known tissue status, since this presumes accurate classification by tissue assays. Future studies reporting outcomes as a function of HER2 status should report separately on patients with concordant, equivocal, and discordant assay results.
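The threshold issue can be made concrete with a small sketch. It classifies a continuous HER2/CEP17 ratio into three categories using the two equivocal ranges cited earlier in this section (1.7–2.3 in the reference-center comparison and 2.0–4.0 in the CISH/FISH comparison). The example ratios are hypothetical, and treating the range boundaries as inclusive is an assumption of the sketch rather than a rule taken from either study.

```python
def classify_ratio(ratio, lower, upper):
    """Three-way call from a continuous HER2/CEP17 ratio.

    Assumption: values equal to either boundary fall in the equivocal band.
    """
    if ratio < lower:
        return "not amplified"
    if ratio > upper:
        return "amplified"
    return "equivocal"


# Equivocal ranges as quoted in the text
BAND_REFERENCE_CENTERS = (1.7, 2.3)
BAND_CISH_COMPARISON = (2.0, 4.0)

# HYPOTHETICAL ratios, for illustration only
for ratio in (1.8, 2.1, 3.0):
    call_a = classify_ratio(ratio, *BAND_REFERENCE_CENTERS)
    call_b = classify_ratio(ratio, *BAND_CISH_COMPARISON)
    print(f"ratio {ratio}: {call_a} (1.7-2.3 band) vs. {call_b} (2.0-4.0 band)")
```

The same measured ratio can therefore change category when only the cutoffs change, which is one way discordance can arise without either assay being inaccurate.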
For patients who are not unequivocally HER2 positive, what is the evidence on outcomes of treatment targeting the HER2 molecule (trastuzumab, etc.), or on differences in outcomes of uniform chemotherapy or hormonal therapy regimens with versus without additional treatment targeting the HER2 molecule, in:
Breast cancer patients characterized by equivocal or discordant HER2 results from different tissue assay methods performed adequately; and
For those with HER2-negative breast cancer?
| Study | Treatments Compared | Age or Menopause Status | Disease Extent | ER+ | PR+ | n FISH+ IHC- | n FISH- IHC3+ | n FISH- IHC1,2+ | n FISH- IHC- | |
|---|---|---|---|---|---|---|---|---|---|---|
| Adjuvant treatment for resected early breast cancer | ||||||||||
| NSABP B31 Paik et al., 2007; Paik et al., 2008; Romond et al., 2005 | ≥50 years | >2 cm | >3 + nodes | |||||||
| Tx: AC → (P+TRZ) (n=1,019 randomized) | 48.4% | 61.4% | 42.6% | 51.9% | 39.0% | 56 | 10 | 69 | 82 | |
| Cx: AC → P (n=1,024 randomized) | 48.4% | 57.1% | 43.3% | 52.8% | 41.4% | 69 | 21 | 80 | 92 | |
| NCCTG N9831 Perez et al., 2007; Reinholz et al., 2007; Perez et al., 2006; Romond et al., 2005 | ≥50 years | >2 cm | >3 + nodes | |||||||
| Tx: AC → (P+TRZ) (n=884 randomized) | 50.4% | 61.5% | 39.1% | 51.2% | 39.4% | 123 | 23 | 59 | ||
| Cx: AC → P (n=895 randomized) | 48.9% | 58.7% | 39.1% | 52.8% | 41.3% | 95 | 30 | 44 | ||
| First- or second-line treatment for advanced breast cancer | ||||||||||
| CALGB 9840 Seidman et al., 2004, 2008 | Menopausal status | ≥3 metastatic sites | ||||||||
| Tx: P (q wk vs. q3wk)+TRZ (n=115 randomized) | 75% post | 15% | 55% | NR | 113 | |||||
| Cx: P (q wk vs. q3wk) (n=113 randomized) | 84% post | 11% | 49% | NR | 115 | |||||
| CALGB 150002 (from 9840) Kaufman et al., 2007 | Tx: P (q wk vs. q3wk)+TRZ (n=115 randomized) | 75% post | 15% | 55% | NR | central FISH-, polysomy +: 19 | ||||
| central FISH-, polysomy -: 53 | ||||||||||
| Cx: P (q wk vs. q3wk) (n=113 randomized) | 84% post | 11% | 49% | NR | central FISH-, polysomy +: 19 | |||||
| central FISH-, polysomy -: 50 | ||||||||||
| EGF100151 Cameron et al., 2008; Geyer et al., 2006 | Tx: capecitabine (2 g/m2 days 1–14 q 3wk) + lapatinib (1.25 g q day) (n= 198 randomized) | Median 54 yrs; range 26–80 yrs | ≥3 metastatic sites | ER+ &/or PR+ | ||||||
| 49% | 48% | 15 | 1 | 14 | 23 | |||||
| Cx: capecitabine alone (2.5 g/m2 days 1–14 q 3wk) (n=201 randomized) | Median 51 yrs; range 28–83 yrs | 48% | 46% | 7 | 2 | 14 | 21 | |||
Abbreviations: AC: Adriamycin [doxorubicin]/cyclophosphamide; Cx: control; ER+: estrogen-receptor positive; IHC: immunohistochemistry; FISH: fluorescent in situ hybridization; mos: months; PR+: progesterone-receptor positive; P: paclitaxel; q wk: every week; q3wk: every 3 weeks; TRZ: trastuzumab; Tx: treatment; yrs: years.
| Study | Grp | N | CR (%) | PR (%) | OR (CR+PR; with 95% CI) (%) | SD (%) | PD (%) | NE (%) | Test | p | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CALGB 9840 Seidman et al., 2004, 2008 | +TRZ | 112 | | | 38% (29%–48%) | | | | multivariate logistic regression | 0.28 | odds ratio=1.35 (0.78–2.34) |
| | -TRZ | 114 | | | 32% (23%–41%) | | | | | | |
| CALGB 150002 (from 9840) Kaufman et al., 2007 | +TRZ | 19 | | | 63% | | | | ??? | 0.048 | FISH-/polysomy+ |
| | -TRZ | 19 | | | 26% | | | | | | |
| | +TRZ | 53 | | | 36% | | | | ??? | NS | FISH-/polysomy- |
| | -TRZ | 50 | | | 36% | | | | | | |
Abbreviations: CR: complete response; Grp: group; NE: not evaluable; NS: not significant; OR: overall response; PD: progressive disease; PR: partial response; SD: stable disease; TRZ: trastuzumab.
| Study | Design | Therapeutic Setting | n, Enrolled (n per group) | n, Evaluated | n, withdrawn or lost to F/U | Treatment Regimen (Agents) |
|---|---|---|---|---|---|---|
| HER2 Discrepant | ||||||
| Paik et al. 2007; Kim et al, in preparation; Romond et al. 2005 | RCT NSABP-B31 | adjuvant therapy | 2043 (1024, 1019) | 1829 w tumor blocks; 1795 w baseline and F/U data | 248 | AC→ (P ± trastuzumab) |
| Perez et al. 2007; Perez et al. 2006; Romond et al. 2005 | RCT NCCTG N9831 | adjuvant therapy | 1842 | 1779 (895, 884) | 63 | AC→ (P ± trastuzumab) |
| HER2 Negative | ||||||
| Seidman et al. 2004 | RCT CALGB 9840 | inoperable or metastatic disease, stratified by 1st or 2nd line therapy | 735 | 228 (HER2-) (113, 115) | 0 (507 HER2+ or UNK given TRZ) | 4 arm trial: P (weekly vs. q3w) stratified by HER2 status; HER2- randomized to ± TRZ, all HER2+ given TRZ |
| Kaufman et al. 2007 | RCT CALGB 150002 | metastatic, 1st or 2nd line; companion study on CALGB 9840 pts | 585 | 303 (samples available for central testing) | 282 | 4 arm trial: P (weekly vs. q3w) stratified by HER2 status; HER2- randomized to ± TRZ, all HER2+ given TRZ |
| Toxicity Type | Study | Severity or Grade | Results | ||||
|---|---|---|---|---|---|---|---|
| HER2 Discrepant (IHC 2+/FISH+) | |||||||
| Treatment-related mortality | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Nausea | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Vomiting | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Anorexia | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Lethargy | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Neurosensory | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Hearing loss | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Cardiac ischemia | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Diminished LVEF | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Arrhythmias | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Bronchopulmonary | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Dermatologic | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Kidney | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Anemia | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Thrombocytopenia | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Leukopenia or neutropenia | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Infection | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Other | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| HER2 Negative | |||||||
| Treatment-related mortality | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Nausea | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Vomiting | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Anorexia | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Lethargy | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Neurosensory | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Hearing loss | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Cardiac ischemia | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Diminished LVEF | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Arrhythmias | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Bronchopulmonary | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Dermatologic | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Kidney | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Anemia | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Thrombocytopenia | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Leukopenia or neutropenia | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Infection | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
| Other | F/U (mo) | Grp1 n | % | Grp2 n | % | ||
One trial on trastuzumab in adjuvant therapy (NSABP B31) reported data on post-hoc subgroup analyses in a brief published communication (Paik, Kim, and Wolmark, 2008). Another adjuvant trastuzumab trial (NCCTG N9831) compared local, central, and reference laboratory results of HER2 testing in a published article that did not report outcomes (Perez, Suman, Davidson, et al., 2006). Both trials reported subgroup outcomes in meeting abstracts, with slides available online (B31: Paik, Kim, Jeong, et al., 2007; N9831: Perez, Romond, Suman, et al., 2007, and Reinholz, Jenkins, Hillman, et al., 2007). A single, published report provided baseline characteristics and preliminary outcomes data for patients randomized to treatment arms common to B31 and N9831 (Romond, Perez, Bryant, et al., 2005). Data were reported in this publication on each trial separately and both trials combined.
Two trials on patients with advanced or metastatic disease published full reports with subgroup analyses (Seidman, Berry, Cirrincione, et al., 2008; Cameron, Casey, Press, et al., 2008). The EGF100151 trial on chemotherapy with or without lapatinib (Cameron, Casey, Press et al., 2008) also published an earlier report (Geyer, Forster, Lindquist et al., 2006), but without results of repeat HER2 testing by a central or reference laboratory or analyses relevant to Key Question 2. CALGB 9840, the only preplanned analysis relevant to this key question, is on a HER2-negative (i.e., non-overexpressor) subgroup randomized to chemotherapy with or without trastuzumab within a larger trial studying an unrelated question (Seidman, Berry, Cirrincione, et al., 2004, 2008). CALGB 9840 also is the source of all patients in the subgroup analyzed post-hoc in CALGB 150002 (Kaufman, Broadwater, Lezon-Geyda, et al., 2007).
Adjuvant therapy. Two trials (NSABP B31, NCCTG N9831) investigated outcomes of adjuvant doxorubicin plus cyclophosphamide (AC; every three weeks for four cycles), followed by paclitaxel (P; every three weeks for four cycles), with versus without trastuzumab (+/-TRZ; weekly for 12 months, beginning concurrently with paclitaxel) in women with fully resected early breast cancer. Outcomes are as yet unreported for a third arm of N9831, which began trastuzumab therapy after all eight cycles of chemotherapy (AC→P→TRZ). Both B31 and N9831 limited eligibility to HER2-positive patients, defined as FISH-positive/IHC unknown, IHC3+/FISH-unknown, or IHC2+/FISH-positive. Patients were initially evaluated by local laboratory testing, and randomized if classified HER2-positive by these results. They were subsequently re-evaluated by central laboratory testing, but continued with assigned treatments regardless of results. A planned interim analysis at two years' median followup (2.4 years for B31 patients; 1.5 years for N9831 patients) pooled all patients randomized to the treatment arms common to both trials: those assigned to the control arms (n=1,679; AC→P) and those assigned to concurrent trastuzumab (n=1,672; AC→P+TRZ) (Romond, Perez, Bryant, et al., 2005). Trastuzumab significantly improved overall survival (OS) at four years: 91.4 percent versus 86.6 percent; hazard ratio (HR) =0.67; 95 percent CI: 0.48–0.93; p=0.015. The B31 (Paik, Kim, and Wolmark, 2008; Paik, Kim, Jeong, et al., 2007) and N9831 (Perez, Romond, Suman, et al., 2007 and Reinholz, Jenkins, Hillman et al., 2007) results included here were unplanned, post-hoc analyses. They compared outcomes of adjuvant AC→(P+/-TRZ) in subgroups found HER2 discordant or negative by central lab results, using data collected for the pooled analysis of Romond, Perez, Bryant, et al. (2005) without longer followup.
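A hazard ratio, its confidence interval, and its p-value are linked, which allows a rough consistency check on reported figures. The sketch below recovers an approximate standard error and two-sided p-value from the pooled overall survival result quoted above (HR=0.67; 95 percent CI: 0.48–0.93). It assumes a Wald-type interval that is symmetric on the log scale; the published p-value may come from a different test, so only approximate agreement should be expected. The function name is hypothetical.

```python
import math


def p_from_hr_ci(hr, lower, upper, z_crit=1.96):
    """Approximate SE of log(HR), z statistic, and two-sided p-value from a 95 percent CI.

    Assumes the interval was built as exp(log(HR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values quoted in the text for the pooled interim analysis
se, z, p = p_from_hr_ci(0.67, 0.48, 0.93)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, approximate p = {p:.3f}")
```

The approximate p-value falls close to, but not exactly on, the reported p=0.015, as expected for a back-calculation from rounded interval limits.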
Advanced/metastatic disease. A randomized, controlled trial (CALGB 9840) that studied paclitaxel in women receiving first- or second-line therapy for metastatic breast cancer reported outcomes at two meetings (Seidman, Berry, Cirrincione, et al., 2004; Kaufman, Broadwater, Lezon-Geyda, et al., 2007) and in a published article (Seidman, Berry, Cirrincione, et al., 2008). Primary randomization in this trial compared once-weekly to every-third-week paclitaxel dosing regimens. Testing for HER2 status began after enrolling the first 171 patients, and HER2-negative patients (termed “HER2 non-overexpressors” by study authors and defined as IHC 0 or 1+, or IHC 2+/FISH negative, by local laboratory tests) were also randomized to treatment with or without trastuzumab. Seidman, Berry, Cirrincione, et al. (2004, 2008) reported outcomes for this second randomization without separating results by paclitaxel treatment frequency. HER2-positive patients (by local laboratory tests) all received trastuzumab and are excluded from the analysis for Key Question 2.
| Study | Time to Event Outcomes | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HER2 Discordant (all data on adjuvant AC→P +/- TRZ) | ||||||||||||
| FISH+ IHC 0, 1+, or 2+ by central lab: | ||||||||||||
| NSABP B-31a | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) |
| DFS | Tx | 56 | Cox prop | 0.064 | 0.30 (0.08–1.07) | |||||||
| Cx | 69 | hazards | ||||||||||
| NCCTG N9831 | DFS | Tx | 123 | ??? | 0.97 | 0.98 (0.33–2.91) | ||||||
| Cx | 95 | |||||||||||
| FISH- IHC 3+ by central lab: | ||||||||||||
| NSABP B-31a | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) |
| DFS | Tx | 10 | Cox prop | 0.94 | 0.91 (0.08–10) | |||||||
| Cx | 21 | hazards | ||||||||||
| NCCTG N9831 | DFS | Tx | 23 | ??? | 0.57 | 0.61 (0.11–3.29) | ||||||
| Cx | 30 | |||||||||||
| HER2 Negative | ||||||||||||
| adjuvant AC→P +/- TRZ: FISH- IHC 1+, 2+ by central lab: | ||||||||||||
| NSABP B-31a | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) |
| DFS | Tx | 69 | ~98% | ~95% | ~90% | ~90% | ~86% | Cox prop | 0.02 | 0.30 (0.11–0.83) | ||
| Cx | 80 | ~90% | ~79% | ~75% | ~70% | ~62% | hazards | |||||
| adjuvant AC→P +/- TRZ: FISH- IHC 0, 1+, or 2+ | ||||||||||||
| NSABP B-31a | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) |
| DFS | Tx | 82 | ~97% | ~90% | ~87% | ~87% | ~84% | Cox prop | 0.014 | 0.34 (0.14–0.80) | ||
| Cx | 92 | ~92% | ~80% | ~76% | ~72% | ~65% | hazards | |||||
| NCCTG N9831 | DFS | Tx | 59 | 90.2% | 81.2% | ??? | p | HR (95%CI) | ||||
| Cx | 44 | 82.6% | 60.9% | 0.13 | 0.51 (0.21–1.2) | |||||||
| P +/- TRZ as 1st or 2nd line therapy for metastatic disease | ||||||||||||
| CALGB 9840 IHC2+/FISH- or IHC 0, 1+ | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | |
| OS | Tx | 113 | 21.6 | ~75% | ~40% | ~25% | 20% | K-M | 0.65 | |||
| Cx | 115 | 21.6 | ~70% | ~40% | ~25% | 20% | analysis | |||||
| TTP | Tx | 113 | 6.5 | ~30% | ~13% | ~7% | ~5% | K-M | 0.28 | |||
| Cx | 115 | 5.5 | ~25% | ~12% | ~12% | ~4% | analysis | |||||
| CALGB 150002 (from 9840) central FISH-, polysomy 17 | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | |
| OS | Tx | 19 | ~30 | ~90% | ~65% | ~30% | ??? | 0.538 | ||||
| Cx | 19 | ~23 | ~69% | ~48% | ~30% | |||||||
| capecitabine +/- lapatinib for advanced or metastatic disease progressing after an anthracycline, a taxane, and trastuzumab | ||||||||||||
| EGF100151 Cameron et al., 2008 | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95% CI) |
| PFS | Tx | K-M | 0.46 | 0.77 (0.39–1.54) | ||||||||
| Cx | analysis | |||||||||||
| (sample include 74 patients not centrally confirmed to meet protocol HER2 eligibility criteria) | ||||||||||||
Subgroup analyses reported from NSABP B31 adjusted each Cox proportional hazards model used to estimate HR for included patients' ER and nodal status; subgroup analyses from NCCTG N9831 are unadjusted.
Abbreviations: AC: Adriamycin [doxorubicin]/cyclophosphamide; CI: confidence interval; Cx: control; DFS: disease-free survival; HR: hazard ratio; IHC: immunohistochemistry; FISH: fluorescent in situ hybridization; K-M: Kaplan-Meier; Med: median; mos: months; OS: overall survival; P: paclitaxel; prop: proportional; q wk: every week; q3wk: every 3 weeks; TRZ: trastuzumab; TTP: time to progression; Tx: treatment; yr: year(s).
A post-hoc analysis on HER2 non-overexpressors randomized to paclitaxel with versus without trastuzumab in CALGB 9840 compared outcomes for subsets found FISH negative by central laboratory testing who had or did not have chromosome 17 polysomy (CALGB 150002; Kaufman, Broadwater, Lezon-Geyda, et al., 2007). This analysis was not included in the published final report (Seidman, Berry, Cirrincione, et al., 2008). It also did not include patients from CALGB 9342, none of whom were randomized to paclitaxel with or without trastuzumab.
Data are available from B31 for two HER2-discordant groups:
FISH positive/IHC 0, 1+, or 2+: n=56 +TRZ; n=69 -TRZ (data not reported separately for FISH-positive, IHC 0, 1+ subset)
FISH negative/IHC 3+: n=10 +TRZ; n=21 -TRZ;
and for two (partially overlapping) HER2-negative groups:
FISH negative/IHC 1+ or 2+: n=69 +TRZ; n=80 -TRZ
FISH negative/IHC 0, 1+, or 2+: n=82 +TRZ; n=92 -TRZ (13 and 12 patients per arm added to the 69 and 80 in the arms above).
Data are available from N9831 for two HER2-discordant groups:
FISH positive/IHC 0, 1+, or 2+: n=123 +TRZ; n=95 -TRZ (data not reported separately for FISH-positive, IHC 0, 1+ subset)
FISH negative/IHC 3+: n= 23 +TRZ; n=30 -TRZ;
and for one HER2-negative group:
FISH negative/IHC 0, 1+, or 2+: n=59 +TRZ; n=44 -TRZ.
Advanced/metastatic disease. Patients in CALGB 9840 had metastatic disease and were undergoing first- or second-line therapy. All were randomized to weekly or every-third-week paclitaxel, and those who were HER2 negative (IHC 2+/FISH negative or IHC 0 or 1+) by local laboratory results were simultaneously randomized to receive (n=113) or not receive (n=115) trastuzumab. The analysis pooled outcomes in the HER2-negative arms for patients given paclitaxel weekly or every third week. Subsequent analyses (CALGB 150002) compared outcomes separately for subgroups from CALGB 9840 who were FISH negative by central laboratory results and had (+/-TRZ, n=19 each arm) or did not have (+TRZ, n=53; -TRZ, n=50) chromosome 17 polysomy.
Adjuvant AC→(P±TRZ). The only available data are from post-hoc subgroup analyses, without stratification for the subgroups' defining characteristics. Neither the B31 nor the N9831 analyses reported subgroup-specific comparisons of baseline characteristics or prognostic factors by treatment arm. Furthermore, one subgroup mixed results for a discordant subgroup (IHC 0, 1+, FISH positive) with results for initially equivocal but ultimately positive (IHC 2+ but amplified by FISH) patients. Finally, data are presently unavailable from studies that classified patients using assay thresholds consistent with current guidelines (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007; see “Results and Conclusions, Key Question 1, Narrative Review”).
Neither trial reported median followup durations, or showed numbers per arm at risk over time, for the specific subgroups compared. In each subgroup from each treatment arm, failure events (e.g., death or relapse) occurred in less than 25 percent of patients (range: 5–23 percent) at the time of analysis. Therefore, length of followup was inadequate for reliable estimates of median event-free durations for any outcome reported. The interim analyses for all patients randomized in the larger trials that were sources of these subgroups (Romond, Perez, Bryant, et al., 2005) also lacked sufficient followup for reliable estimates of median overall survival or median disease-free survival (DFS).
For HER2-discordant patients who were FISH positive and IHC 0, 1+ or 2+ by central laboratory testing, between-arm differences in outcome were not statistically significant in either trial. In B31 (n=56 +TRZ; n=69 -TRZ), the HR for failure in analysis of DFS was 0.30 (95 percent CI: 0.08–1.07; p=0.064) and the HR for failure in analysis of recurrence-free interval (RFI) was 0.35 (95 percent CI: 0.10–1.28; p=0.11). In N9831 (n=123 +TRZ; n=95 -TRZ), the HR for failure in analysis of DFS was 0.98 (95 percent CI: 0.33–2.91; p=0.97).
Few patients were FISH negative and IHC 3+ by central laboratory results (from B31: n=10 +TRZ; n=21 -TRZ; from N9831: n=23 +TRZ; n=30 -TRZ). In B31, the reported HR for failure was 0.91 for both DFS and RFI (for each outcome, 95 percent CI: 0.08–10.0; p=0.94); in N9831, the reported HR for failure was 0.61 (95 percent CI: 0.11–3.29; p=0.57). None of these between-arm subgroup comparisons was statistically significant.
Only B31 analyzed outcomes of patient subgroups that were HER2 negative by FISH but IHC 1+ or 2+ by central laboratory testing (n=69 +TRZ; n=80 -TRZ). Between-arm differences reported by Paik, Kim, Jeong, et al. (2007) were statistically significant for DFS (HR=0.30; 95 percent CI: 0.11–0.83; p=0.02) and RFI (HR=0.31; 95 percent CI: 0.10–0.95; p=0.041), and favored the subgroup given trastuzumab.
Both trials reported on patients who were FISH negative and IHC 0, 1+ or 2+ by central laboratory testing. In B31, this subgroup added FISH-negative/IHC 0 patients (13 and 12 per arm, respectively) to those in the FISH-negative/IHC 1+ or 2+ arms shown above (combined n=82 +TRZ; combined n=92 -TRZ). Between-arm differences were statistically significant for DFS (7 events, +TRZ, 20 events, -TRZ; HR=0.34; 95 percent CI: 0.14–0.80; p=0.014) and RFI (HR=0.36; 95 percent CI: 0.14–0.92; p=0.034), and again favored the subgroup given trastuzumab. One patient died in the trastuzumab arm, while 10 died in the control arm (HR=0.08; 95 percent CI: 0.01–0.64, p=0.017). In N9831 (n=59 +TRZ, n=44 -TRZ), the between-arm difference in DFS (HR=0.51; 95 percent CI: 0.21–1.2; p=0.13) was not statistically significant.
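The hazard ratios reported above can be cross-checked with a simple rule of thumb: a two-sided comparison is significant at the 0.05 level when the 95 percent CI for the HR excludes 1. The following is a minimal sketch of that check, not part of the original analyses. The values are transcribed from the text above; the labels and the function name `ci_excludes_one` are informal shorthand introduced here for illustration only.

```python
# Hazard ratios (HR) with 95 percent confidence limits, transcribed from the text.
# Labels are informal shorthand for this sketch, not terms used by the trials.
reported = [
    ("B31 FISH+/IHC 0-2+ DFS", 0.30, 0.08, 1.07),
    ("B31 FISH+/IHC 0-2+ RFI", 0.35, 0.10, 1.28),
    ("N9831 FISH+/IHC 0-2+ DFS", 0.98, 0.33, 2.91),
    ("B31 FISH-/IHC 3+ DFS and RFI", 0.91, 0.08, 10.0),
    ("N9831 FISH-/IHC 3+", 0.61, 0.11, 3.29),
    ("B31 FISH-/IHC 1+ or 2+ DFS", 0.30, 0.11, 0.83),
    ("B31 FISH-/IHC 1+ or 2+ RFI", 0.31, 0.10, 0.95),
    ("B31 FISH-/IHC 0-2+ DFS", 0.34, 0.14, 0.80),
    ("B31 FISH-/IHC 0-2+ RFI", 0.36, 0.14, 0.92),
    ("B31 FISH-/IHC 0-2+ death", 0.08, 0.01, 0.64),
    ("N9831 FISH-/IHC 0-2+ DFS", 0.51, 0.21, 1.2),
]


def ci_excludes_one(lower, upper):
    """Return True when the confidence interval lies entirely on one side of HR = 1."""
    return upper < 1.0 or lower > 1.0


# Collect the comparisons whose interval excludes 1 (nominally significant).
significant = [label for label, hr, lower, upper in reported
               if ci_excludes_one(lower, upper)]

for label, hr, lower, upper in reported:
    flag = "excludes 1" if ci_excludes_one(lower, upper) else "includes 1"
    print(f"{label}: HR={hr} (95% CI {lower}-{upper}) {flag}")
```

Under this check, only the B31 FISH-negative subgroups with IHC 0, 1+, or 2+ have intervals that exclude 1, which agrees with the p-values quoted in the text. The check says nothing about the limitations of post-hoc subgroup analyses with few events.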
HER2 gene copy number and magnitude of benefit from trastuzumab. Additional unpublished subset analyses from the B31 trial presented at the June 2007 ASCO annual meeting (Paik, Kim, Jeong, et al., 2007), and similar analyses from the N9831 trial (Reinholz, Jenkins, Hillman, et al., 2007) and the HERA trial (McCaskill-Stevens, Proctor, Goodbrand, et al., 2007) presented at the December, 2007 San Antonio Breast Cancer Symposium, investigated the hypothesis that higher HER2 gene copy numbers, or higher HER2/CEP17 FISH ratios, were associated with a larger magnitude of relative benefit from trastuzumab. Data from the N9831 and HERA trials showed that the hazard ratio for DFS did not grow more favorable to the trastuzumab arm as average FISH ratios increased from 2.0 to 15 or greater (N9831), or from 2 to greater than 8 (HERA). Additionally, investigators found the HR for DFS did not increase as average HER2 gene copy number per cell increased from 4 to greater than 18 (HERA), or from 2 to greater than 10 (B31).
Polysomy 17 and adjuvant trastuzumab. An unpublished post-hoc analysis of data from N9831 presented at the December 2007 San Antonio Breast Cancer Symposium evaluated whether polysomy 17 influenced effects of adjuvant trastuzumab (Reinholz, Jenkins, Hillman, et al., 2007). Investigators reported that among patients with amplified HER2 genes, trastuzumab increased DFS whether or not these patients had polysomy 17. Central lab results identified very few patients without HER2 overexpression by IHC or HER2 gene amplification by FISH, but with polysomy 17. DFS was lower (79 percent versus 83 percent at 3 years; 65 percent versus 75 percent at 5 years) among those given trastuzumab than among those not given trastuzumab, although the sample size was small and few events had occurred in either arm (6 of 24 given trastuzumab, 3 of 13 controls). Investigators also analyzed slightly larger patient subsets without HER2 overexpression by IHC, HER2 gene amplification by FISH, or polysomy 17. DFS was substantially higher (94 percent versus 77 percent at 3 years; 84 percent versus 55 percent at 5 years) among those given than among those not given trastuzumab. As in the subset with polysomy 17, few events had occurred in either arm in the subset without polysomy (4 of 34 given trastuzumab, 13 of 33 controls). Additionally, unpublished data from the NSABP B31 trial showed no impact of polysomy 17 on prognosis or degree of benefit from trastuzumab (Dr. S. Paik; personal communication, May 2008).
HER2-negative patients with metastatic disease given P±TRZ for first- or second-line therapy. Patients found IHC 2+/FISH negative or IHC 0, 1+ by local laboratory results were randomized in CALGB 9840 to have or not have trastuzumab added to paclitaxel (n=113 +TRZ; n=115 -TRZ). Between-arm differences in OS (median: 21.6 versus 19.6 months, p=0.67), time to progression (TTP; median: 12 versus 6 months, p=0.088), and overall response rate (ORR; 35 percent versus 29 percent, p=0.32) were not statistically significant (Seidman, Berry, Cirrincione, et al., 2008).
CALGB 150002 reported that subgroups from CALGB 9840 found FISH negative by central laboratory results, and also found to have chromosome 17 polysomy (n=19 +TRZ; n=19 -TRZ), showed a statistically significant increase in ORR (63 percent versus 26 percent, p=0.048) among those given trastuzumab plus paclitaxel compared with those given paclitaxel alone (Kaufman, Broadwater, Lezon-Geyda, et al., 2007). In contrast, ORR did not differ between treatment arms (36 percent in each) for centrally FISH-negative patients without chromosome 17 polysomy. In the centrally FISH-negative subgroup with polysomy 17 (+/-TRZ; n=19 each), the difference in ORR was not accompanied by statistically significant between-arm differences in either OS (p=0.538) or TTP (p=0.88).
Adjuvant trastuzumab. Currently available evidence is inconclusive on outcomes of trastuzumab added to adjuvant chemotherapy for resected HER2-discordant or HER2-negative patients. Evidence on each subgroup may be used to generate hypotheses, but is too weak to test hypotheses, for the following reasons. All available evidence is from post-hoc analyses on subgroups not directly randomized or stratified by the HER2 subgroups of interest. Furthermore, available reports did not show direct comparisons of baseline characteristics and prognostic factors for the specific subgroups compared. Thus, it is uncertain whether the HER2-discordant or HER2-negative subgroups were balanced by treatment arm (i.e., with or without trastuzumab; although treatment arms appeared well-balanced across all patients randomized). Finally, the data used for the two adjuvant studies are from interim analyses, with inadequate followup to estimate median survival for all patients randomized, and inadequate information on median duration of followup in the specific subgroups compared. Thus, although these were large, well-designed and well-conducted randomized, controlled trials, since the overwhelming majority of patients they randomized were unequivocally HER2-positive, only poor quality evidence is presently available on outcomes of adjuvant trastuzumab in either HER2 discordant or HER2 negative patient subgroups.
Factors influencing discordant results. Discordant results may occur if one assay is correct and the other in error, either due to preanalytic, analytic, or postanalytic factors (see Key Question 1). As with any assay, 100 percent accuracy cannot be expected even from the most careful and proficient laboratories. Proficiency testing and other quality control and quality assurance measures to minimize false-negative and false-positive results are recommended in current practice guidelines (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007). However, concordance of different methods to classify an individual as HER2 positive or negative is at least partly independent from accuracy of performing a specific assay. Even with the most careful and highly accurate laboratory techniques, discordance in classification may occur between a method that detects gene amplification (FISH in these studies, but also true with CISH or SISH) and a method that detects protein overexpression (IHC in these studies, but also true with Western blots).
Adjuvant trastuzumab in HER2-negative patients. Scant but intriguing evidence suggests the hypothesis that some patients currently classified as HER2 negative may benefit from adjuvant trastuzumab. Data reported from B31 showed significantly longer DFS and RFI in FISH-negative IHC ≤2+ patients given trastuzumab than in similar patients managed without trastuzumab, whether the analysis did or did not include those who were IHC 0. However, a similar analysis of data from N9831 did not show significant differences. Since both were interim analyses of trials in which fewer than 25 percent of subjects had reached a failure event, neither provides conclusive evidence as yet, and follow up analyses from these trials will be of great interest. Blinded review of IHC and FISH scoring would also be useful for samples from these trials, and from other adjuvant trastuzumab trials that confirmed eligibility by central lab testing before randomizing each patient. Recent guidelines conclude that present evidence does not demonstrate improved outcomes with use of adjuvant trastuzumab for patients who would be classified HER2 negative by protocols of B31, N9831, and similar studies (Wolff, Hammond, Schwartz, et al., 2007a; Carlson, Moench, Hammond, et al., 2006; Hanna, O'Malley, Barnes, et al., 2007).
Importantly, the B31 and N9831 subgroup analyses combine results for HER2-negative patients many now consider to be different: those with the so-called “triple-negative” subtype (i.e., negative for HER2, estrogen receptor, and progesterone receptor), and the luminal subtypes (luminal A or luminal B) that are negative for HER2 but positive for at least one of the hormone receptors. These subtypes were initially defined in studies using microarrays to subdivide breast cancer patients by gene expression patterns (for reviews, see Peppercorn, Perou, and Carey, 2008; Razzak, Lin, and Winer, 2008; Kang, Martel, and Harris, 2008). There is evidence that the triple negative and luminal subsets differ with respect to prognosis, chemotherapy response, and outcomes (Carey, Dees, Sawyer, et al., 2007; Liedtke, Mazouni, Hess, et al., 2008), and they clearly differ with respect to effects of endocrine therapy. Further complexity comes from reports that there is substantial but incomplete overlap between triple negative patients and those classified in the “basal-like” subset by gene expression arrays (Cheang, Voduc, Bajdik, et al., 2008). Notably, new phase III trials have recently opened (and others are planned) specifically for patients with triple negative or “basal-like” breast cancer (Kilburn, 2008). Results from these studies will likely be more conclusive than analyses that pool all HER2-negative patients to determine outcomes for subsets of HER2-negative breast cancer.
Advanced or metastatic disease. No data were reported on patients with advanced or metastatic disease and discordant results from IHC and ISH HER2 testing. Evidence is available from one trial (CALGB 9840; n=226) that randomized metastatic breast cancer patients who were HER2 negative by local laboratory testing to chemotherapy with or without trastuzumab (Seidman, Berry, Cirrincione, et al., 2008). Additionally, a small subset of advanced and metastatic patients randomized to chemotherapy with or without lapatinib in another trial (EGF100151; n=74) were found by central lab confirmatory testing not to meet protocol criteria for HER2 positivity (Cameron, Casey, Press, et al., 2008). Thus, one source of good quality evidence (CALGB 9840) and one source of moderate quality evidence (EGF100151) suggest that HER2-negative patients with advanced or metastatic disease do not benefit from treatments targeting the HER2 molecule. Additional evidence supporting this conclusion comes from an analysis of data pooled from three pivotal trials of trastuzumab for metastatic breast cancer. The analysis showed that among patients found IHC 2+ by the presently unavailable “clinical trial assay,” benefit from trastuzumab was limited to those subsequently shown to have amplified HER2 genes by FISH (Mass, Press, Anderson et al., 2005).
CALGB 150002 investigators compared outcomes with versus without trastuzumab for subgroups of FISH-negative patients who either had (n=38) or did not have (n=103) polysomy 17 (Kaufman, Broadwater, Lezon-Geyda, et al., 2007). Overall response rate was significantly higher with versus without trastuzumab for those with polysomy 17, but was identical with or without trastuzumab for those without polysomy 17. In contrast, the N9831 study on adjuvant therapy (Reinholz, Jenkins, Hillman, et al., 2007) reported no impact of polysomy 17 on benefit from trastuzumab, and unpublished data from a second study (NSABP B31; Dr. S. Paik, personal communication, May 2008) suggested the same finding. This discrepancy might be due to different definitions of polysomy 17 for CALGB 150002 (average CEP17 copy number per cell greater than 2.2) and N9831 (more than 3 CEP17 signals in more than 30 percent of nuclei). It might also reflect differences between adjuvant therapy and treatment for metastatic disease with respect to polysomy 17 as a predictor of benefit from trastuzumab. Note also that studies reviewed for “Results and Conclusions, Key Question 1” report conflicting data on a possible association of polysomy 17 with overexpression of HER2 protein. Thus, presently available evidence leaves unanswered questions with respect to the utility of polysomy 17 to select patients for HER2-targeted therapy.
For breast cancer patients, what is the evidence on clinical benefits and harms of using HER2 assay results to guide selection of chemotherapy regimen?
The search strategy for studies on HER2 testing in breast cancer yielded 3,218 citations. Initial review of titles and abstracts selected 219 citations potentially relevant to Key Question 3 for retrieval and review as full articles. Of these, 161 were considered potentially relevant to Key Question 3a (HER2 status to guide choice of chemotherapy regimen) while 62 were considered potentially relevant to Key Question 3b (HER2 status to guide choice of hormonal therapy regimen). Four reports were considered for both question 3a and 3b.
| Study/Design | Treatments | Age or Menopause Status | Extent of Disease | (% of pts analyzed by HER2 status) | ||||||
|---|---|---|---|---|---|---|---|---|---|---|
| ER+ | PR+ | HER2+ | HER2- | |||||||
| Adjuvant chemotherapy for resected early breast cancer | ||||||||||
| Yang et al., 2003 series | cyclophosphamide + methotrexate + fluorouracil (CMF; n=94) | ≥50 yr: | ≥3 cm: 67% | NR | NR | IHC only: | 36% | 64% | ||
| 52.1% | N+: 62% | |||||||||
| Gusterson et al., 2003; stratified RCT | perioperative CMF (one cycle) | post: 47% of n=760 of 1275 N- patients randomized | >2 cm: 53%, HER2+ | of HER2+: | IHC only: | 12.8% | 87.2% | |||
| 40%, HER2- | 36% | 24% | ||||||||
| no adjuvant therapy | of HER2-: | IHC only: | 20.8% | 79.2% | ||||||
| 100% N0 | 51% | 38% | ||||||||
| Multiple cycles of CMF | post: 45% of n=746 of 1229 N+ patients randomized | T size, NR; 100% | of HER2+: | IHC only: | 17.3% | 82.7% | ||||
| node+; ≥4 nodes +: | 32% | 22% | ||||||||
| perioperative CMF (one cycle) | 49%, HER2+ | of HER2-: | ||||||||
| 43%, HER2- | 59% | 45% | IHC only: | 21.6% | 78.4% | |||||
| Moliterni et al., 2003; RCT | 8 cycles CMF + 4 cycles doxorubicin (CMF → A; n=248 of 277 randomized) | ≥52 yr: | ~65%, <2.1 cm | only reported for all randomized to each arm | IHC only: | 18.1% | 81.9% | |||
| 67% | 100% N1 | |||||||||
| 12 cycles of CMF alone (n=258 of 275 randomized) | ≥52 yr: | IHC only: | 19.4% | 80.6% | ||||||
| 69% | ||||||||||
| Colozza et al., 2005; RCT | epirubicin(E), weekly for 4 months (n=133 of 166 randomized) | >50 yr: | ≤2 cm: 46% | 55% | 63% | IHC only: | 40.6% | 59.4% | ||
| 51% | 1–3 N+: 52% | |||||||||
| 6 cycles CMF (n=133 of 174 randomized) | >50 yr: | ≤2 cm: 45% | 56% | 63% | IHC only: | 27.8% | 72.2% | |||
| 56% | 1–3 N+: 59% | |||||||||
| Pritchard et al. 2006; RCT | 6 cycles of CEF (n=312 of 351 randomized) | 100% pre | FISH: | pos | neg | 62% | NR | by FISH: | 24.0% | 76.0% |
| 6 cycles of CMF (n=316 of 359 randomized) | T2 | 52% | 49% | |||||||
| 100% pre | 1–3 N+ | 57% | 63% | 56% | NR | by FISH: | 27.8% | 72.2% | ||
| Knoop et al., 2005; RCT | 9 cycles of CEF (n=352 of 480 randomized) | post: 31.5% | T≥2.1 cm: 60.7% | 25% | NR | IHC 3+ or | ||||
| 1–3 N+: 29.5% | FISH+ | 32.5% | 67.5% | |||||||
| 9 cycles of CMF (n=421 of 500 randomized) | post: 30.2% | T≥2.1 cm: 57.6% | 27% | NR | IHC 3+ or | |||||
| 1–3 N+: 33.3% | FISH+ | 32.8% | 67.2% | |||||||
| Dressler et al., 2005, Thor et al., 1998; 3-arm RCT (CALGB 8541) | 4 cycles high-dose CAF (n=179 of 519 randomized)a (A=doxorubicin) | mn, 50.1 yrs | mn T size, 2.91 cm | 68% | 54% | FISH+ | 17.3% | 82.7% | ||
| 42.5% pre | mn # N+, 4.51 | IHC+ | 24.8% | 75.2% | ||||||
| 6 cycles moderate-dose CAF (n=167 of 513 randomized)a | mn, 51.4 yrs | mn T size, 2.88 cm | 71% | 65% | FISH+ | 20.7% | 79.4% | |||
| 38.3% pre | mn # N+, 4.43 | IHC+ | 25.7% | 74.3% | ||||||
| 4 cycles low-dose CAF (n=178 of 518 randomized)a | mn, 50.4 yrs | mn T size, 3.07 cm | 66% | 58% | FISH+ | 18.8% | 81.2% | |||
| 41.1% pre | mn # N+, 4.92 | IHC+ | 22.9% | 77.1% | ||||||
| Del Mastro et al. 2004, 2005; RCT (GONO-MIG-1) | up to 9 cycles FEC14 regimen (q2wk; n=370 of ~607 randomized) | IHC 3+ | ||||||||
| T1: 47% N+: 62% | CB11 | 50 (13.5%) | 320 | |||||||
| median, 54 yrs | T2: 46% N-: 38% | 54% | (86.5%) | |||||||
| 6 cycles FEC21 regimen (q3wk; n=361 of ~607 randomized) | range, 25–70 | T3–4: 5% | 42% | IHC 3+ | ||||||
| T? 1% | CB11 | 53 (14.7%) | 308 | |||||||
| (85.3%) | ||||||||||
| Tanner et al., 2006; control arm from RCT | 9 cycles of FEC (n=180 of 251 randomized; n=211 from HDC/AuSCS arm excluded) | ≥50 yr: | HER2: | pos | neg | only reported pooled data for both study arms | CISH | |||
| 42% of all tested | T:2–5cm | 60% | 52% | only: | 31.1% | 68.9% | ||||
| 5–9 N+ | 41% | 47% | ||||||||
| ≥10 N+ | 59% | 53% | ||||||||
| Hayes et al., 2007; RCT (randomly selected 2 groups of 750 ea) | 4 cycles AC → paclitaxel (n=1,570 randomized) | post: 38% | Grp1 | Grp1 | 57% | not reported | ||||
| Grp2 | ||||||||||
| T>2cm | 66% | 64% | NR | |||||||
| 4 cycles AC → observation (n=1551 randomized) | post: 38% | 1–3 N+ | 48% | 46% | Grp2 | 62% | not reported | |||
| 4–9 N+ | 40% | 43% | NR | |||||||
| Martin et al., 2005b; RCT | 6 cycles DAC (n=630 with known HER2 status of 745 randomized) (D=docetaxel) | median, 49 yrs | T1: 40% 1–3N+: 63% | 155 (24.6%) | 475 | |||||
| range, 26–70 | T2: 52% ≥4N+: 37% | ER+ &/or PR+: 76% | (75.4%) | |||||||
| pre, 56% | T3: 8% | |||||||||
| 6 cycles FAC (n=632 with known HER2 status of 746 randomized) | median, 49 yrs | T1: 43% 1–3N+: 62% | 164 (26.0%) | 468 | ||||||
| range, 23–70 | T2: 51% ≥4N+: 38% | ER+ &/or PR+: 76% | (74.0%) | |||||||
| pre, 55% | T3: 6% | |||||||||
| Neoadjuvant (preoperative) chemotherapy for locally advanced breast cancer | ||||||||||
| Learn et al., 2005c; 3-arm RCT | 4 cycles AC ± D (concurrent or after resection) (n=104 of 144 randomized) | mean, 48 yrs | T ≤ 2 cm: 28% N0:61% | only reported data for n=121 with biopsy specimens | TAB 250 (n=104 classified) | |||||
| median, 47 yrs | T 2–5 cm:47% N1:39% | IHC+ | 41 (39%) | 63 (61%) | ||||||
| range, 27–73 | T >5 cm: 25% N2: 0 | |||||||||
| Arriola et al., 2006; series | 4 cycles of doxorubicin followed by surgery (n=232) | mean, 47 yrs | T3: 70% | 67% | 52% | IHC + FISH | ||||
| N1: 40% | then CISH | 18% | 82% | |||||||
| Park et al., 2003; series | 4 cycles of doxorubicin followed by surgery (n=67) | ≥50 yrs, 18% | 5–10 cm | 91% | ||||||
| >10 cm | 9% | 46% | NR | CISH only: | 46% | 54% | ||||
| N status | NR | |||||||||
| Zhang et al., 2003; series | 3–6 cycles of FAC followed by surgery (n=97) | T2 | 53% | |||||||
| ≥50 yrs, 44% | ≥T3 | 34% | 65% | 56% | IHC 3+ | |||||
| N- | 33% | or FISH+ | 28% | 72% | ||||||
| N+ | 67% | |||||||||
| Tulbah et al., 2002; series | 3–4 cycles of paclitaxel + cisplatin followed by surgery (n=54) | HER2+ HER2- | HER2+ | of HER2+: | ||||||
| ≤50 91% 84% | HER2- | 55% | 50% | IHC 3+ | 41% | 59% | ||||
| pre 91% 78% | ≥T3 | 86% | 78% | |||||||
| N0 | 36% | 28% | of HER2-: | |||||||
| N1 | 55% | 56% | 50% | 34% | ||||||
| N2 | 9% | 16% | ||||||||
| Tinari et al., 2006; series | median 4 (range, 3–6) cycles FEC, q3wk followed by surgery (n=77) | median, 46 yrs | T 2–5 cm: 75% | 62% | IHC 3+ | |||||
| range, 25–74 | T >5 cm: 25% | 45% | or 2+ & FISH+ | 20 (26%) | 57 (74%) | |||||
| First- or second-line chemotherapy for advanced or metastatic breast cancer | ||||||||||
| Harris et al., 2006; RCT | paclitaxel (n=165 of 474 randomized to 3 dose arms, but pooled for HER2 analysis) | median: 54.9 yr | # metastatic sites: | ER+ &/or PR+: | FISH | 26% | 74% | |||
| median, 1 | 58% | CB11 | 20% | 80% | ||||||
| Hercep. 3+ | 21% | 79% | ||||||||
| Di Leo et al., 2004; RCT | doxorubicin (A; n=91 of 165 randomized) | 54 yr | ≥3 sites: | 46% | NR | IHC+ ≥1% & FISH+: | ||||
| visceral: | 79% | 16% | 69% | |||||||
| docetaxel (T; n=85 of 161 randomized) | 51 yr | ≥3 sites: | 51% | NR | IHC+ ≥1% & FISH+: | |||||
| visceral: | 76% | 25% | 59% | |||||||
| Konecny et al., 2004; RCT | epirubicin + cyclophosphamide (EC; n=137 of 254 randomized) | mean: 55 yr | 1–2 sites: | 57% | 52.6% | 48.9% | FISH only | 36% | 64% | |
| (31–74) | ≥3 sites: | 42% | ||||||||
| epirubicin + paclitaxel (ET; n=138 of 262 randomized) | mean: 55 yr | 1–2 sites: | 53% | 60.9% | 49.3% | FISH only | 35% | 65% | ||
| (29–75) | ≥3 sites: | 42% | ||||||||
a Data on eligible patients randomized to each arm are from Budman, Berry, Cirrincione, et al., 1998.
b Except for HER2 status, data shown compare all patients randomized to TAC versus all patients randomized to FAC.
c Except for ER, PR and HER2 status, data shown pool evaluable patients (n=142) randomized to AC, AC+D, or AC→adjuvant D.
Abbreviations: Please refer to the text or list of abbreviations at the end of the report for definition of specific chemotherapy regimens/agents.
Grp: group; IHC: immunohistochemistry; FISH: fluorescent in situ hybridization; mn: mean; q wk: every week; q3wk: every 3 weeks.
| Study | Design | Therapeutic Setting | n, Enrolled (Randomized) | n, Evaluated | n, Withdrawn (Lost to F/U) | Treatment Regimen (Agents) |
|---|---|---|---|---|---|---|
| Adjuvant Chemotherapy | ||||||
| Yang et al. 2003 rec. # 8840 | single arm retrospective series | adjuvant therapy post mastectomy | 94 (identically treated; 13 of 107 in series not given adj. chemo) | 94 (outcomes reported separately) | 0 | cyclophosphamide + methotrexate + fluorouracil (CMF) |
| Gusterson et al. 2003; rec. # 43690 | RCT; separate randomization by nodal status | adjuvant therapy: none versus one cycle peri-op versus prolonged | 1275 node-neg 1229 node-pos | 760 node-neg 746 node-pos | 515 node-neg 483 node-pos (no samples) | node-neg: peri-op CMF versus no adj therapy; node-pos: peri-op versus continuous CMF |
| Moliterni et al. 2003; rec. # 10210 | RCT retrospective analysis by HER2 status | adjuvant therapy post mastectomy or quadrantectomy with axillary dissect. (1–3 nodes+) | 552 | 506 | 46 (HER2 status unknown) | CMF alone (12 cycles) versus CMF for 8 cycles then doxorubicin for 4 cycles (CMF→A) |
| Colozza et al. 2005; rec. # 3820 | RCT retrospective analysis by HER2 status | post-operative adjuvant therapy; node- if ER/PR neg or node+ with ≤9 nodes involved | 348 | 266 | 82 (no tumor samples) | CMF for 6 cycles versus epirubicin weekly for 4 months |
| Pritchard et al. 2006; rec. # 1760 | RCT retrospective analysis by HER2 status | adjuvant therapy post mastectomy or lumpectomy with axillary dissection; all node+ | 710 | 634 (by IHC) 628 (by FISH) | 71 (no tumor samples) 5 (IHC & FISH failed) | CMF (Cx) versus CEF (Tx); each given for 6 cycles; no endocrine therapy after adjuvant chemoTx |
| Knoop et al. 2005; rec. # 3450 | RCT (2 × 2) retrospective analysis by HER2 status | adjuvant therapy post mastectomy or lumpectomy with axillary dissection | 1,195 (980 Danes eligible) | 773 (805 tested for HER2 status) | CMF: 79 of 500 CEF: 128 of 480 | CMF (Cx) versus CEF (Tx); each given for 9 cycles ± pamidronate, daily for 4 years; no adjuvant tamoxifen |
| Dressler et al. 2005, rec. # 4280; Thor et al. 1998, rec. # 40880 CALGB trial 8541 & lab companion study 8869 | 3-arm RCT retrospective analysis by HER2 status | adjuvant therapy post mastectomy or lumpectomy with axillary dissection; all node+ | 1,549 (in CALGB 8541) | 524 (of 993 in CALGB 8869) | 1,025 (556 not in 8869 study + 469 not in Dressler et al.) | 4 cycles high dose CAF (600/60/600 mg/m2) q4wk versus 6 cycles moderate dose CAF (400/40/400 mg/m2) q4wk versus 4 cycles low dose CAF (300/30/300 mg/m2) q4wk; similar proportions in each arm given 5 years of twice daily tamoxifen (41%, 40%, 34%) for ER+, post-menopausal disease |
| Del Mastro et al. 2004, 2005; rec. # 48020 GONO-MIG-1 trial | RCT retrospective analysis by HER2 status | adjuvant therapy for node- high-risk or node+ patients | 1,214 | 731 | 483 (specimens unavailable for HER2 testing) | 6 cycles FEC21 regimen q3wk versus up to 9 cycles FEC14 regimen q2wk (same drug doses in each regimen; ER+ & PR+ patients in each arm received tamoxifen qd for 5 years) |
| Tanner et al. 2006; rec. # 1820 | STD-dose arm of RCT; retrospective analysis by HER2 status | adjuvant therapy post mastectomy or lumpectomy with axillary dissection | 525 (251 to STD-dose arm) | 391 (180 for STD-dose arm) | 274 (71 from STD-dose arm; no samples) | FEC (9 cycles; individualized doses based on hematological toxicity) versus HDC/AuSCS using CTCb after 3–4 cycles of FEC (did not abstract data from HDC/AuSCS arm); loco-regional RTx + 5 years of tamoxifen for all patients |
| Hayes et al. 2007; rec. # 47610 CALGB 9344 | subset from 3 X 2 RCT; retrospective analysis by HER2 status | adjuvant therapy for node+ patients after surgery with negative margins | 1500 (2 groups, 750 each, randomly selected from 3121 in RCT) | 1322 | 178 (no tumor specimens; 1621 RCT patients not analyzed by HER2 status) | 4 cycles of AC (randomized to 1 of 3 doxorubicin doses) followed by 4 cycles of paclitaxel or observation (a second, separately reported randomization; doxorubicin dose did not change outcomes) |
| Martin et al. 2005; rec # 47650 | RCT pre-planned subgroups; 2nd interim analysis of ongoing trial | adjuvant therapy for node+ patients after surgery with negative margins | 1491 | 1262 | 229 (no tumor specimens) | 6 cycles (3 wks each) of docetaxel + doxorubicin + cyclophosphamide (DAC) versus fluorouracil + doxorubicin + cyclophosphamide (FAC); equal proportion (ER or PR)+ patients, each arm took qd tamoxifen for 5 years |
| Neoadjuvant (Pre-operative) Chemotherapy | ||||||
| Learn et al. 2005; rec. # 47640 | 3 arm RCT; retrospective analysis by HER2 status | pre-operative chemotherapy for operable breast cancer (T1–3, N0–1, M0) | 144 | 104 | 40 (no tumor specimen, 23; HER2 status unknown, 17) | 4 cycles AC ± docetaxel (D) q3wk, followed by surgery; 3rd arm given AC + post-surgery D (pooled with AC alone controls for analysis by HER2 status); all patients given 5 yrs of TAM qd |
| Arriola et al. 2006; rec # 950 | prospective single-arm series | primary chemotherapy for T2–3 N0–1 operable breast cancer | 232 | 232 | 0 | doxorubicin (75 mg/m2) 4 cycles, q3wk, then lumpectomy or mastectomy + 3-level axillary dissect. |
| Park et al. 2003; rec # 9960 | retrospective single-arm series | pre-operative chemotherapy for locally-advanced disease | 67 | 67 | 0 | doxorubicin (50 mg/m2) 4 cycles, q3wk, prior to breast conservation or mastectomy |
| Zhang et al. 2003; rec # 9820 | retrospective single-arm series | pre-operative chemotherapy for operable breast cancer | 97 | 97 | 0 | FAC q3wk (6 cycles for 7 patients, 5 cycles for 1, 4 cycles for 81, and 3 cycles for 8) |
| Tulbah et al. 2002; rec # 11560 | retrospective single-arm series | pre-operative chemotherapy for locally-advanced, non-inflammatory breast cancer | 54 | 54 | 0 | paclitaxel + cisplatin, q3wk, for 3 or 4 cycles |
| Tinari et al. 2006; rec # 2300 | retrospective single-arm series | pre-operative chemotherapy for operable breast cancer | 77 (selected; 16 ineligible of 93 consecutive) | 77 | 0 | FEC q3wk (median 4 cycles; range 3–6 cycles) |
| Chemotherapy for Advanced or Metastatic Disease | ||||||
| Harris et al. 2006¹; rec. # 390, no data on no. of sites, 1994-? | RCT/RET; CALGB 9342 | Advanced (Stage IV or inoperable); first or second line Tx. No concurrent hormonal therapy | 474 | 165 (of n=175 w adequate tumor blocks; n=10, all bio-marker tests unsuccessful) | 299 (n=273, no blocks; n=26, blocks inadequate); similar characteristics & outcomes, w/wo blocks, except DFS | Paclitaxel; compared 3 doses (175, 210, or 250 mg/m2 q3wk) given to failure (progression or intolerable toxicity), but data combined for this analysis |
| Di Leo et al. 2004; rec. # 5970; 29 of 41 sites in original trial, 7/94–1/97² | Phase III RCT (not blinded); TAX 303 trial; secondary analysis | Metastatic disease; first or second line therapy; prior CMF required (adj or for mets); prior anthracyclines or taxanes excluded | 326 | 176 | 150 (n=74, Grp1; n=76, Grp 2) | Grp 1: doxorubicin (75 mg/m2) (A; n=91) vs Grp 2: docetaxel (100 mg/m2) (T; n=85) every 3 wks; max 7 cycles absent progression or toxicity. No stat sig differences between populations with versus without specimens for HER2 analysis. |
| Konecny et al. 2004; rec. # 6740; ~71 sites, Germany, 10/96–12/99 | RCT; secondary analysis | Metastatic; no prior chemo for metastatic disease, no metastasis to CNS or to bone only. Stratified by 0 vs 1 prior hormonal Tx for metastatic disease. | 579 enrolled; 516 eligible were randomized & treated | 275 | 241 (n=219, no block; n=17, technically inadequate; n=5, no invasive cancer; no SS diffs between pts w/wo known HER2 status) | Grp 1: epirubicin (60 mg/m2) and cyclophosphamide (600 mg/m2) (EC, n=137); Grp 2: epirubicin (60 mg/m2) and paclitaxel (175 mg/m2) (ET, n=138). Chemo given q3 wks for max of 10 cycles; median=6 cycles. |
¹ Some data from: Winer EP, Berry DA, Woolf S, Duggan D, Kornblith A, Harris LN, Michaelson RA, Kirshner JA, Fleming GF, Perry MC, Graham ML, Sharp SA, Keresztes R, Henderson IC, Hudis C, Muss H, Norton L. Failure of higher-dose paclitaxel to improve outcome in patients with metastatic breast cancer: cancer and leukemia group B trial 9342. J Clin Oncol 2004;22(11):2061–8.
² Some data from: Chan S, Friedrichs K, Noel D, et al. Prospective randomized trial of docetaxel versus doxorubicin in patients with metastatic breast cancer. J Clin Oncol 1999;17(8):2341–54.
Studies on the CMF regimen. The uncontrolled series (Yang, Klos, Zhou, et al., 2003; n=94) and one comparative randomized, controlled trial (Gusterson, Gelber, Goldhirsch, et al., 2003; n=2,504 randomized) studied the cyclophosphamide plus methotrexate plus fluorouracil (CMF) regimen. The Gusterson and co-workers trial separately randomized groups of node-negative and node-positive patients. Tissue blocks for determining HER2 status were unavailable for 515 (40 percent) of 1,275 randomized node-negative patients and for 483 (39 percent) of 1,229 randomized node-positive patients. Node-negative patients were randomized to one perioperative cycle of adjuvant CMF or to observation. Node-positive patients were randomized to multiple cycles of adjuvant CMF or to one perioperative cycle of adjuvant CMF. The relevance of these findings for current practice may be limited as taxane-based regimens have largely replaced CMF when anthracyclines are not used, particularly for hormone-receptor-negative patients.
Studies on anthracycline-based regimens. Four randomized, controlled trials (Moliterni, Menard, Valagussa, et al., 2003; Colozza, Sidoni, Mosconi, et al., 2005; Pritchard, Shepherd, O'Malley, et al., 2006; Knoop, Knudsen, Balslev, et al., 2005) compared CMF versus anthracycline-based regimens, and a fifth randomized, controlled trial compared an anthracycline-based regimen without autologous stem-cell support (AuSCS) versus a higher-dose regimen with AuSCS (Tanner, Isola, Wiklund, et al., 2006). Only the non-AuSCS arm of the Tanner and co-workers study met selection criteria for data abstraction. Moliterni, Menard, Valagussa, et al. (2003) compared CMF followed by doxorubicin (CMF→A) versus CMF alone, and included 92 percent of originally randomized patients. Colozza, Sidoni, Mosconi, et al. (2005) compared epirubicin (E) alone versus CMF, and included 76 percent of originally randomized patients. Pritchard, Shepherd, O'Malley, et al. (2006) and Knoop, Knudsen, Balslev, et al. (2005) compared cyclophosphamide plus epirubicin plus fluorouracil (CEF) versus CMF, although the Pritchard and co-workers study gave 6 cycles while the Knoop and co-workers study gave 9 cycles. Pritchard and co-workers included 89 percent of originally randomized patients while Knoop and co-workers included 79 percent. Tanner, Isola, Wiklund, et al. (2006) also gave 9 cycles of CEF in the non-AuSCS arm of their trial, although the doses administered were higher than those in the Pritchard and Knoop trials. Outcomes by HER2 status for 72 percent of those randomized to the non-AuSCS arm are considered a single-arm study in this review.
Two randomized, controlled trials with two reports each compared different doses (Dressler, Berry, Broadwater, et al., 2005; Thor, Berry, Budman, et al., 1998) or dose intensities and schedules (Del Mastro, Bruzzi, Nicolo, et al., 2005; Del Mastro, Bruzzi, Venturini, et al., 2004) for anthracycline-based regimens. The Dressler and co-workers study investigated interaction of HER2 status with dose in 524 patients from the Cancer and Leukemia Group B (CALGB) trial 8541. This trial randomized 1,549 patients to high-dose (600/60/600 mg/m2 every four weeks for 16 weeks), moderate-dose (400/40/400 mg/m2 every four weeks for 24 weeks) or low-dose (300/30/300 mg/m2 every four weeks for 16 weeks) regimens of cyclophosphamide, doxorubicin and fluorouracil (CAF) (Budman, Berry, Cirrincione, et al., 1998). Although earlier reports (Thor, Berry, Budman, et al., 1998; Muss, Thor, Berry, et al., 1994) included different proportions of randomized patients tested for HER2 status by IHC and/or PCR, Dressler and colleagues compared outcomes separately by assay method (IHC, FISH, or PCR) for HER2 status subgroups from each dose arm (n=524, 33.8 percent of originally randomized patients).
In the GONO-MIG-1 study, Del Mastro and colleagues (2004, 2005) randomized 1,214 patients to either six cycles of CEF every three weeks (FEC21) or up to nine cycles at the same dose (600/60/600 mg/m2) every two weeks (FEC14). The analysis by HER2 status included 731 (60 percent) of originally randomized patients.
Studies on regimens with a taxane. Two randomized, controlled trials investigated effects of HER2 status on outcomes of regimens with versus without a taxane (Hayes, Thor, Dressler, et al., 2007; Martin, Pienkowski, Mackey, et al., 2005). Hayes and colleagues (2007; CALGB trial 9344) randomized 3,121 patients to doxorubicin plus cyclophosphamide (AC) followed by paclitaxel or observation. The trial used a 3 × 2 factorial design to compare three doses of doxorubicin in AC, each followed or not by paclitaxel. Since outcomes were not statistically significantly different across doxorubicin doses, the analysis of outcomes with versus without paclitaxel by HER2 status pooled patients from all three doxorubicin doses. Two groups of 750 patients each were randomly selected for this correlative analysis, but tissue blocks were available and analyzed for only 1,322 (42 percent of those originally randomized).
Martin, Pienkowski, Mackey, et al. (2005) stratified patients (n=1,491) by number of involved axillary lymph nodes and randomized them to six three-week cycles of docetaxel plus doxorubicin plus cyclophosphamide (TAC) or fluorouracil plus doxorubicin plus cyclophosphamide (FAC). The preplanned analysis by HER2 status included 1,262 (85 percent) of originally randomized patients. Patients were not stratified by HER2 status. In the TAC group, 20.8 percent were HER2 positive and 15.4 percent lacked tumor specimens for measuring HER2; in the FAC group, 22 percent were HER2 positive and 15.3 percent lacked tumor specimens. The study does not report the distribution of other prognostic factors by treatment group and HER2 status combined, which would be useful in ensuring balance in this subset of trial patients with known HER2 status.
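The proportions of randomized patients with known HER2 status quoted in the adjuvant-trial paragraphs above can be rechecked from the counts given in the text. The following is a minimal sketch, not part of the original review: the dictionary name `cohorts` and the helper `pct_included` are illustrative, and the analyzed counts for the two Gusterson cohorts are assumed to equal the randomized count minus the number of patients without tissue blocks.

```python
# Counts quoted in the narrative: (patients randomized, patients analyzed by HER2 status).
# For Gusterson, "analyzed" is derived as randomized minus those lacking tissue blocks
# (an assumption made for this illustration).
cohorts = {
    "Gusterson 2003, node-negative": (1275, 1275 - 515),
    "Gusterson 2003, node-positive": (1229, 1229 - 483),
    "Dressler 2005 (CALGB 8541)": (1549, 524),
    "Del Mastro 2004/2005 (GONO-MIG-1)": (1214, 731),
    "Hayes 2007 (CALGB 9344)": (3121, 1322),
    "Martin 2005": (1491, 1262),
}


def pct_included(randomized, analyzed):
    """Percent of randomized patients included in the HER2 analysis."""
    return round(100 * analyzed / randomized, 1)


inclusion = {name: pct_included(r, a) for name, (r, a) in cohorts.items()}

for name, pct in inclusion.items():
    print(f"{name}: {pct}% of randomized patients analyzed by HER2 status")
```

Under these assumptions the computed values round to the percentages stated in the text (for example, 33.8 percent for Dressler and about 85 percent for Martin).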
| Level of Evidence | Study | n | Setting | Treatments | Outcome | Results |
|---|---|---|---|---|---|---|
| Adjuvant chemotherapy for resected early breast cancer | | | | | | |
| HER2 stratified or HER2-guided RCT | | | | | | |
| RCT prespecified MV SGA | Martin 2005 | 1262 | adjuvant | TAC vs. FAC | DFS | Cox regression treatment by FISH HER2 interaction, p=NS; FISH+ TAC > FAC, FISH- TAC > FAC |
| RCT post-hoc MV SGA | Gusterson 2003 | 1506 | adjuvant | LN-: no tx vs. CMF; LN+: periop CMF vs. prolonged CMF | OS | LN- adjusted Cox regression IHC: HER2+, tx < cx p=NS; HER2-, tx ≈ cx p=NS |
| | | | | | OS | LN+ adjusted Cox regression IHC: HER2+, tx < cx p=NS; HER2-, tx > cx p<0.05 |
| | | | | | DFS | LN- adjusted Cox regression IHC: HER2+, tx < cx p=NS; HER2-, tx > cx p=NS |
| | | | | | DFS | LN+ adjusted Cox regression IHC: HER2+, tx > cx p=NS; HER2-, tx > cx p<0.05 |
| | Moliterni 2003 | 506 | adjuvant | CMF→A vs. CMF | OS | Cox regression treatment by IHC HER2 interaction p=0.052; HER2+ tx > cx p=NS, HER2- tx < cx p=NS |
| | | | | | RFS | Cox regression treatment by IHC HER2 interaction p=NS; HER2+ tx > cx p=NS, HER2- tx < cx p=NS |
| | Colozza 2005 | 266 | adjuvant | CMF vs. epirub | OS | Cox regression treatment by IHC HER2 interaction p=NS; cx HER2+ < HER2- p=0.024, tx HER2+ < HER2- p=NS |
| | | | | | RFS | Cox regression treatment by IHC HER2 interaction p=NS; cx HER2+ vs. HER2- p=NS, tx HER2+ < HER2- p=NS |
| | Pritchard 2006 | 628 | adjuvant | CMF vs. CEF | OS | Cox regression treatment by FISH HER2 interaction p=0.02; HER2+ tx > cx p=0.06, HER2- tx ≈ cx p=NS |
| | | | | | RFS | Cox regression treatment by FISH HER2 interaction p=0.02; HER2+ tx > cx p=0.003, HER2- tx ≈ cx p=NS |
| | Knoop 2005 | 805 | adjuvant | CMF vs. CEF | OS | Cox regression HER2+ tx > cx p=0.09, HER2- tx > cx p=0.23 |
| | | | | | RFS | Cox regression HER2+ tx > cx p=0.10, HER2- tx > cx p=0.10 |
| | Dressler 2005 | 521 | adjuvant | CAF: high vs. moderate vs. low dose | DFS | Cox regression FISH HER2 by CAF dose interaction, p=0.033; Cox regression IHC HER2 by CAF dose interaction, p=0.0003; Cox regression PCR HER2 by CAF dose interaction, p=0.043; FISH+/PCR+/IHC+ high > moderate ≈ low dose; FISH-/PCR-/IHC- high ≈ moderate ≈ low dose |
| | Del Mastro 2004 | 731 | adjuvant | FEC q2wk vs. q3wk | DFS | Cox regression IHC HER2 by Tx schedule interaction, p=0.12; FEC q2wk HER2+ ≈ HER2-, FEC q3wk HER2+ < HER2- |
| | | | | | OS | Cox regression IHC HER2 by Tx schedule interaction, p=0.38; FEC q2wk HER2+ ≈ HER2-, FEC q3wk HER2+ < HER2- |
| | Hayes 2007 | 1500 | adjuvant | AC vs. AC→P | OS | Cox regression treatment by FISH HER2 interaction p=0.01; HER2+ tx > cx, HER2- tx ≈ cx |
| | | | | | DFS | Cox regression treatment by FISH HER2 interaction p=0.01; HER2+ tx > cx, HER2- tx ≈ cx |
| RCT treatment by HER2 SGA | | | | | | |
| 1-arm prespecified MV analysis | | | | | | |
| 1-arm post-hoc MV analysis | | | | | | |
| 1-arm UV analysis | Yang 2003 | 94 | adjuvant | CMF | DFS | IHC HER2+ ↓ vs. HER2- p=0.002 |
| | Tanner 2006 | 180 | adjuvant | FEC | OS | CISH HER2+ < HER2- but no statistical tests described |
| | | | | | RFS | CISH HER2+ < HER2- but no statistical tests described |
| Neoadjuvant (preoperative) chemotherapy for locally advanced breast cancer | | | | | | |
| HER2 stratified or HER2-guided RCT | | | | | | |
| RCT prespecified MV SGA | | | | | | |
| RCT post-hoc MV SGA | Learn 2005 | 104 | neoadjuvant | AC vs. AC+D | pCR | IHC HER2+, AC vs. AC+D, p=NS; IHC HER2-, AC vs. AC+D, p=NS |
| | | | | | cORR (CR+PR) | IHC HER2+, AC vs. AC+D, p=NS; IHC HER2-, AC vs. AC+D, p<0.05 |
| RCT treatment by HER2 SGA | | | | | | |
| 1-arm prespecified MV analysis | | | | | | |
| 1-arm post-hoc MV analysis | Park 2003 | 67 | neoadjuvant | doxorub | pResp | CISH HER2+ > HER2- p=0.013 |
| | Zhang 2003 | 97 | neoadjuvant | FAC | DFS | CISH HER2+ ≈ HER2- p=NS |
| | | | | | ORR | IHC HER2+ > HER2- p=NS |
| | | | | | pResp | IHC HER2+ > HER2- p=NS |
| 1-arm UV analysis | Arriola 2006 | 229 | neoadjuvant | doxorub | pResp | CISH HER2+ > HER2- p=0.03 |
| | Tulbah 2002 | 52 | neoadjuvant | paclit+cispl | pResp | IHC HER2+ ≈ HER2- p=NS |
| | | | | | OS | IHC HER2+ (3+) ≈ HER2- p=NS |
| | | | | | OS | IHC HER2+ (2+/3+) ≈ HER2- p=0.051 |
| | | | | | DFS | IHC HER2+ (3+) ≈ HER2- p=NS |
| | | | | | DFS | IHC HER2+ (2+/3+) ≈ HER2- p=0.09 |
| | Tinari 2006 | 77 | neoadjuvant | FEC | pResp (pCR+MRD) | IHC 3+ or IHC2+/FISH+ HER2+ vs. HER2-, p=0.008 |
| First- or second-line chemotherapy for advanced or metastatic breast cancer | | | | | | |
| HER2 stratified or HER2-guided RCT | | | | | | |
| RCT prespecified MV SGA | | | | | | |
| RCT post-hoc MV SGA | Di Leo 2004 | 149 | metastatic | doxorub vs. docetax | OS | Cox regression treatment by IHC HER2 interaction p=.10; IHC/FISH HER2+ tx < cx p=NS, HER2- tx > cx p=.07 |
| | | | | | TTP | Cox regression treatment by IHC HER2 interaction p=NS; IHC/FISH HER2+ tx > cx p=NS, HER2- tx > cx p=NS |
| | | | | | Resp | logistic regression treatment by IHC HER2 interaction p=.01; IHC/FISH HER2+ tx > cx p=.04, HER2- tx > cx p=NS, HER2? tx ≈ cx p=NS |
| | Konecny 2004 | 275 | metastatic | epirub+cyclophosph vs. epirub+paclitaxel | OS | Cox regression treatment by IHC HER2 interaction p=NS; IFISH HER2+ tx > cx p=.059, HER2- tx ≈ cx p=NS |
| | | | | | PFS | Cox regression treatment by IHC HER2 interaction p=.109; IFISH HER2+ tx > cx p=.062, HER2- tx ≈ cx p=NS |
| | | | | | ORR | logistic regression treatment by IHC HER2 interaction p=NS; IFISH HER2+ tx > cx p=.005, HER2- tx > cx p=.046 |
| RCT treatment by HER2 SGA | | | | | | |
| 1-arm prespecified MV analysis | | | | | | |
| 1-arm post-hoc MV analysis | | | | | | |
| 1-arm UV analysis | Harris 2006 | 156 | metastatic | paclitaxel | OS | IHC CB11 HER2+ < HER2- p=NS |
| | | | | | OS | FISH HER2+ < HER2- p=NS |
| | | | | | OS | IHC HercepTest HER2+ ≈ HER2- p=NS |
| | | | | | ORR | IHC CB11 HER2+ ≈ HER2- p=NS |
| | | | | | ORR | FISH HER2+ ≈ HER2- p=NS |
| | | | | | ORR | IHC HercepTest HER2+ > HER2- p=.026 |
Abbreviations: Please refer to the text or list of abbreviations at the end of the report for definition of specific chemotherapy regimens/agents. cx: control; DFS: disease-free survival; HR: hazard ratio; MV: multivariate; ORR: overall response rate; OS: overall survival; q2wk: every 2 weeks; q3wk: every 3 weeks; RCT: randomized, controlled trial; RFS: recurrence-free survival; SGA: subgroup analysis; TTP: time to progression; tx: treatment; UV: univariate analysis.
| Study | Prospective design | Prespecified hypotheses about relation of marker to outcome | Large, well-defined, representative study population | Marker assay methods well-described | Blinded assessment of marker in relation to outcome | Homogeneous treatment(s), either randomized or rule-based selection | Low rate of missing data (≤ 15%) | Sufficiently long follow-up | Well-described, well-conducted multivariate analysis of outcome: 1) clear candidate variable selection, 2) clear, appropriate model-building guidelines, 3) assumptions tested, 4) standard prognostic variables included, 5) continuous variables well handled, 6) validation | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1) | 2) | 3) | 4) | 5) | 6) | |||||||||
| Adjuvant Chemotherapy | ||||||||||||||
| Yang et al., 2003 | N | N | N | Y | ? | Y | Y | ? | NA | |||||
| Gusterson et al., 2003 | Y | N | Y | Y | ? | Y | N | med: 6 yrs | ? | ? | ? | Y | ? | N |
| Moliterni et al., 2003 | Y | N | Y | Y | ? | Y | Y | med: 14.8 yrs | ? | ? | Y | Y | ? | Y |
| Colozza et al., 2005 | Y | N | Y | Y | Y | Y | N | min 8 yrs | ? | N | ? | Y | ? | N |
| Pritchard et al., 2006 | Y | N | Y | Y | ? | Y | Y | med: 10 yrs | ? | ? | ? | Y | ? | N |
| Knoop et al., 2005 | Y | N | Y | Y | ? | Y | N | med: 10 yrs | ? | N | Y | ? | ? | N |
| Dressler et al., 2005; Thor et al., 1998 | Y | Y | Y | Y | Y | Y | N | med: 9 yrs | Y | ? | ? | Y | ? | Y |
| Del Mastro et al., 2004, 2005; | Y | N | Y | Y | Y | Y | N | med: 6.7 yrs | Y | N | ? | N | ? | N |
| Tanner et al., 2006 | Y | N | Y | Y | Y | Y | N | ? | NA | |||||
| Hayes et al., 2007 | Y | N | Y | Y | Y | Y | N | med: ~10 yrs | Y | Y | ? | Y | ? | Y |
| Martin et al., 2005 | Y | Y | Y | N | ? | Y | Y | med: 4.6 yrs | Y | Y | ? | Y | ? | N |
| Neoadjuvant (Preoperative) Chemotherapy | ||||||||||||||
| Learn et al., 2005 | Y | N | N | N | ? | Y | N | pCR at resection | ? | ? | NA | ? | ? | N |
| Arriola et al., 2006 | Y | Y | Y | Y | ? | Y | Y | pCR at resection | ? | N | NA | N | ? | N |
| Park et al., 2003 | N | N | N | Y | ? | Y | Y | pCR at resection | NA | |||||
| Zhang et al., 2003 | N | N | N | N | ? | Y | Y | pCR at resection | NA | |||||
| Tulbah et al., 2002 | N | N | N | Y | Y | Y | Y | pCR at resection | NA | |||||
| Tinari et al., 2006 | N | N | N | Y | Y | Y | Y | pCR at resection | ? | ? | NA | Y | ? | ? |
| Chemotherapy for Advanced or Metastatic Disease | ||||||||||||||
| Harris et al., 2006 | Y | N | Y | Y | Y | Y | N | med: 8.3 yrs | ? | ? | ? | Y | ? | N |
| Di Leo et al., 2004 | Y | N | Y | Y | Y | Y | N | med: 23 months | ? | N | ? | N | ? | N |
| Konecny et al., 2004 | Y | N | Y | Y | ? | Y | N | ? | ? | N | Y | ? | ? | N |
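The "Low rate of missing data (≤ 15%)" criterion in the quality table above can be illustrated with the inclusion percentages reported in the narrative for the adjuvant trials. This is a hedged sketch only: it assumes that the missing-data rate equals 100 minus the reported percentage of randomized patients analyzed by HER2 status, which may not be exactly how the reviewers rated each study. The names `reported_inclusion` and `low_missing` are illustrative.

```python
MISSING_DATA_THRESHOLD = 15.0  # percent; taken from the quality-table column heading

# Percent of randomized patients included in the HER2 analysis, as reported in the text.
reported_inclusion = {
    "Moliterni 2003": 92,
    "Colozza 2005": 76,
    "Pritchard 2006": 89,
    "Knoop 2005": 79,
    "Tanner 2006 (non-AuSCS arm)": 72,
    "Dressler 2005": 33.8,
    "Del Mastro 2004/2005": 60,
    "Hayes 2007": 42,
    "Martin 2005": 85,
}


def low_missing(pct_included, threshold=MISSING_DATA_THRESHOLD):
    """True when the assumed missing-data rate (100 - pct_included) is within the threshold."""
    return (100 - pct_included) <= threshold


ratings = {
    study: ("Y" if low_missing(pct) else "N")
    for study, pct in reported_inclusion.items()
}

for study, rating in ratings.items():
    print(f"{study}: low rate of missing data = {rating}")
```

With the rounded percentages from the text, this simple rule gives Y for Moliterni, Pritchard, and Martin and N for the other studies listed, in line with the entries in that column of the table. Martin sits exactly at the 15 percent boundary when the rounded figure is used.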
Six studies on preoperative neoadjuvant chemotherapy. Six studies, including one randomized, controlled trial and five uncontrolled series, compared outcomes by HER2 status for patients undergoing neoadjuvant (preoperative) chemotherapy. The randomized, controlled trial (Learn, Yeh, McNutt, et al., 2005) randomized patients (n=144) to one of three arms: doxorubicin plus cyclophosphamide (AC), AC plus docetaxel (AC+D), or AC followed by docetaxel after resection (AC→D). Analysis of pathologic outcomes at resection pooled patients from the AC and AC→D arms and compared these versus the AC+D arm. The secondary, unplanned analysis by HER2 status included 104 (72 percent) of originally randomized patients.
Two uncontrolled series, one prospective (n=232, Arriola, Moreno, Varela, et al., 2006) and the other retrospective (n=67, Park, Kim, Lim, et al., 2003), reported on patients given doxorubicin alone. One uncontrolled retrospective series (n=97, Zhang, Yang, Smith, et al., 2003) reported on patients given three to six cycles of fluorouracil plus doxorubicin plus cyclophosphamide (FAC). A similar uncontrolled, retrospective series (n=77; Tinari, Lattanzio, Natoli, et al., 2006) reported on patients given three to six cycles of fluorouracil plus epirubicin plus cyclophosphamide. Finally, one uncontrolled retrospective series (n=54, Tulbah, Ibrahim, Ezzat, et al., 2002) reported on patients given three or four cycles of paclitaxel plus cisplatin. Each series reported outcomes by HER2 status for all patients (n=232 for the Arriola and co-workers series; n<100 for each of the others).
Three studies on chemotherapy for advanced or metastatic breast cancer. Each was a secondary analysis from a randomized, controlled trial designed to compare outcomes of treatment regimens in populations not selected or stratified for HER2 status, and each published earlier reports comparing outcomes by treatment arm for all randomized patients. One randomized, controlled trial (n=474, Harris, Broadwater, Lin, et al., 2006; CALGB 9342) randomized patients with stage IV or inoperable disease undergoing first- or second-line therapy to three different doses of paclitaxel. The analysis of outcomes by HER2 status included 35 percent of originally randomized patients, and pooled data across all three doses. Thus, Harris and co-workers (2006) was considered a single-arm study in this systematic review.
A second randomized, controlled trial (n=326, Di Leo, Chan, Paesmans, et al., 2004) randomized patients to doxorubicin alone (A) or docetaxel alone (T). Eligibility required patients to have metastatic disease and to have failed prior CMF (either as adjuvant therapy or for metastasis), with no prior exposure to either of the randomized drug therapies. The analysis by HER2 status included 54 percent of originally randomized patients. The third randomized, controlled trial (n=516, Konecny, Thomssen, Luck, et al., 2004) randomized patients to first-line therapy for metastatic disease with either epirubicin plus cyclophosphamide (EC) or epirubicin plus paclitaxel (ET). Up to one prior hormonal therapy for metastasis was permitted, with patients stratified by prior hormonal therapy. The analysis by HER2 status included 53 percent of originally randomized patients.
| Study | Age | Race (%) | Disease Stage | Disease Stage | Performance Status | Hormone Receptor Status | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Adjuvant Chemotherapy | |||||||||||||||||||
| Yang et al. 2003 rec. # 8840 (of n=107 tested for expression of various markers) | mn | Grp1 | Grp2 | Scale | not reported | ||||||||||||||
| md | 51.9 yrs | B | I | T <3cm | 31 (33%) | ER+ | not reported | ||||||||||||
| rng | 33–77 yrs | W | IIa | T 3–5cm | 39 (41%) | PR+ | |||||||||||||
| sd | H | IIb | not reported | T3 >5 cm | 24 (26%) | ||||||||||||||
| <50 yrs | 45 (47.9%) | A | 100% | IIIa | N- | 41 (38%) | |||||||||||||
| ≥50 yrs | 49 (52.1%) | O | IIIb | N+ | 66 (62%) | ||||||||||||||
| IV | |||||||||||||||||||
| Gusterson et al. 2003; rec. # 43690 760 node- pts randomized to periop CMF vs no adj. Tx | HER2+ | HER2- | Grp1 | Grp2 | Grp1 | Grp2 | HER2+ | HER2- | Scale | not reported | HER2+ | HER2- | |||||||
| mn | B | I | Mn T | ER+ | 36% | 51% | |||||||||||||
| md | W | not reported | IIa | T size | ER- | 41% | 32% | ||||||||||||
| rng | H | IIb | not reported | ≤2cm | 41% | 57% | unk | 23% | 17% | ||||||||||
| sd | A | IIIa | >2cm | 53% | 40% | PR+ | 24% | 37.5% | |||||||||||
| menopausal status: | O | IIIb | unk | 6% | 3% | PR- | 50% | 37.5% | |||||||||||
| pre- | 52.5% | 53% | IV | N0 | 100% | 100% | unk | 26% | 25% | ||||||||||
| post- | 47.5% | 47% | |||||||||||||||||
| Gusterson et al. 2003; rec. # 43690 746 node+ pts randomized to perioperative vs prolonged CMF | HER2+ | HER2- | Grp1 | Grp2 | Grp1 | Grp2 | HER2+ | HER2- | Scale | not reported | HER2+ | HER2- | |||||||
| mn | B | I | Mn T | ER+ | 56% | 28% | |||||||||||||
| md | W | not reported | IIa | T size | ER- | 32% | 59% | ||||||||||||
| rng | H | IIb | not reported | ≤2cm | not reported | unk | 12% | 13% | |||||||||||
| sd | A | IIIa | >2cm | PR+ | 62% | 36% | |||||||||||||
| menopausal status: | O | IIIb | # positive nodes: | PR- | 22% | 45% | |||||||||||||
| pre- | 50% | 60% | IV | 1–3+ | 51% | 57% | unk | 16% | 19% | ||||||||||
| post- | 50% | 40% | ≥4 | 49% | 43% | ||||||||||||||
| Moliterni et al. 2003; rec. # 10210 RCT; CMF (Grp 1) versus CMF→A (Grp 2) | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Scale | not reported | Grp1 | Grp2 | |||||||
| mn | B | I | Mn T | ER+ | 59% | 52% | |||||||||||||
| md | not reported | W | not reported | IIa | T stage distribution not reported | ER- | 34% | 39% | |||||||||||
| rng | H | IIb | not reported | ~65% <2.1 cm diam. | unk | 7% | 9% | ||||||||||||
| sd | A | IIIa | N0 | PR+ | 53% | 53% | |||||||||||||
| <51 yr | 69% | 67% | O | IIIb | N1 | 100% | 100% | PR- | 38% | 34% | |||||||||
| IV | N2 | unk | 9% | 13% | |||||||||||||||
| Colozza et al. 2005; rec. # 3820 RCT; CMF (Grp 1) vs epirubicin (Grp 2); n=133 each tested for HER2 status | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Scale | not reported | Grp1 | Grp2 | |||||||
| age (years): | B | I | tumor diameter (cm): | ER+ | 56% | 55% | |||||||||||||
| <40 | 12 | 9 | W | not reported | IIa | ≤2 | 45% | 46% | ER- | 41% | 44% | ||||||||
| 40–50 | 32 | 40 | H | IIb | not reported | 2–5 | 50% | 48% | unk | 4% | 2% | ||||||||
| >50 | 56 | 51 | A | IIIa | >5 | 1% | 0% | PR+ | 63% | 63% | |||||||||
| menopausal status: | O | IIIb | unk | 5% | 6% | PR- | 33% | 35% | |||||||||||
| pre | 53 | 53 | IV | N0 | 20% | 23% | unk | 4% | 2% | ||||||||||
| post | 47 | 47 | N1–3 | 59% | 52% | ||||||||||||||
| N4–9 | 21% | 26% | |||||||||||||||||
| Pritchard et al. 2006; rec. # 1760 RCT; CMF versus CEF; n=163 FISH+ (Grp 1); n=465 FISH- (Grp2) | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Scale | not reported | Grp1 | Grp2 | |||||||
| age (%) | B | I | T1 | 35% | 40% | ER+ | 56% | 62% | |||||||||||
| ≤29 yr | 4 | 1 | W | not reported | IIa | T2 | 52% | 49% | ER- | 35% | 27% | ||||||||
| 30–39 | 27 | 22 | H | IIb | not reported | T3 | 5% | 5% | unk | 9% | 12% | ||||||||
| 40–49 | 54 | 60 | A | IIIa | # positive nodes: | PR+ | not reported | ||||||||||||
| ≥50 yr | 15 | 17 | O | IIIb | 0 | 0 | 0 | ||||||||||||
| all pre-menopausal; ineligible if post-menopausal | IV | 1–3 | 57% | 63% | |||||||||||||||
| 4–10 | 36% | 31% | |||||||||||||||||
| >10 | 7% | 7% | |||||||||||||||||
| Knoop et al. 2005; rec. # 3450 RCT (n=773); CMF (Grp 1; n=421) vs CEF (Grp 2; n=352) | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Scale | not reported | Grp1 | Grp2 | |||||||
| age in years (%) | B | I | T size, cm (%) | (%) | (%) | ||||||||||||||
| <40 | 16.4 | W | not reported | IIa | 0–2 | 42.4 | 39.3 | ER+ | 27.1 | 25 | |||||||||
| 40–49 | 47.6 | H | IIb | not reported | 2.1–5 | 49.5 | 52.4 | ER- | 66.7 | 68.2 | |||||||||
| 50–59 | 22.0 | A | IIIa | >5 | 8.1 | 8.3 | PR+ | not reported | |||||||||||
| 60–69 | 14.0 | O | IIIb | # positive nodes (%) | PR- | ||||||||||||||
| menopausal status | IV | 0 | 35.6 | 37.8 | |||||||||||||||
| pre | 69.8 | 68.5 | 1–3 | 33.3 | 29.5 | ||||||||||||||
| post | 30.2 | 31.5 | >3 | 31.3 | 32.7 | ||||||||||||||
| Dressler et al. 2005, rec. # 4280; Thor et al. 1998, rec. # 40880; Grp 1, n=542, Dressler analysis; Grp 2, n=469, rest of CALGB 88693 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Scale | not reported | Grp1 | Grp2 | |||||||
| mn | 50.6 yr | 50.4 yr | B | I | Mn T | 2.96 | 2.86 | (%) | (%) | ||||||||||
| md | W | not reported | IIa | (cm) | ER+ | 68.2 | 64.8 | ||||||||||||
| rng | H | IIb | not reported | Mn # | 4.62 | 4.68 | |||||||||||||
| pre- | 40.7 | 40.3 | A | IIIa | N+ | PR+ | 59.1 | 55.7 | |||||||||||
| O | IIIb | ||||||||||||||||||
| IV | |||||||||||||||||||
| Del Mastro et al. 2004, 2005; rec # 48020 Grp 1: n=731, HER2 known Grp 2: n=483 HER2 unknown | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Scale | not reported | Grp1 | Grp2 | |||||||
| md | 54 | 54 | B | I | T1 | 47.1% | 52.6% | ER+ | 54% | 49% | |||||||||
| rng | 25–70 | 26–70 | W | not reported | IIa | T2 | 46.2% | 42.2% | ER- | 43% | 38% | ||||||||
| <50 | 35.8% | 43.1% | H | IIb | not reported | T3–4 | 5.3% | 4.4% | ER? | 3% | 13% | ||||||||
| 50–59 | 34.7% | 35.6% | A | IIIa | T? | 1.4% | 0.8% | PR+ | 42% | 36% | |||||||||
| >59 | 29.5% | 21.3% | O | IIIb | N+ | 62.3% | 67.7% | PR- | 50% | 44% | |||||||||
| IV | N- | 37.6% | 32.3% | PR? | 8% | 20% | |||||||||||||
| Tanner et al. 2006; rec. # 1820 (n=391 tested for HER2 status; 180 from FEC arm + 211 from CTCb arm) | HER2+ | HER2- | Grp1 | Grp2 | Grp1 | Grp2 | n | HER2+ | HER2- | Scale | not reported | HER2+ | HER2- | ||||||
| <50 years of age: | B | I | tumor size: | ER+ and/or PR+: | |||||||||||||||
| n=227 | 30.8% | 69.2% | W | not reported | IIa | <2 cm | 126 | 27% | 73% | yes (210) | 22% | 78% | |||||||
| ≥50 years of age | H | IIb | not reported | 2–5cm | 213 | 36% | 64% | no (148) | 49% | 51% | |||||||||
| n=164 | 35.4% | 64.6% | A | IIIa | >5 cm | 37 | 30% | 70% | unk (33) | 27% | 73% | ||||||||
| O | IIIb | unk | 15 | 47% | 53% | ||||||||||||||
| IV | # positive nodes: | ||||||||||||||||||
| 5–7 | 68 | 34% | 66% | ||||||||||||||||
| 8–9 | 107 | 27% | 73% | ||||||||||||||||
| ≥10 | 216 | 35% | 65% | ||||||||||||||||
| Hayes et al. 2007; rec. # 47610 Grp1, n=643 Grp2, n=679 each is random mix of patients from 6 RCT arms4 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Scale | not reported | Grp1 | Grp2 | |||||||
| age, years: | B | 8% | 9% | I | tumor size (cm): | ER+ | 57% | 62% | |||||||||||
| <40 | 20% | 20% | W | 84% | 84% | IIa | ≤2 | 33% | 35% | ER- | 43% | 38% | |||||||
| 40–49 | 40% | 38% | H | 5% | 4% | IIb | not reported | >2 | 66% | 64% | PR+ | not reported | |||||||
| 50–59 | 27% | 30% | A | 2% | 2% | IIIa | unk | <1% | <1% | ||||||||||
| ≥60 | 12% | 12% | O | 1% | 1% | IIIb | # positive nodes: | ||||||||||||
| menopausal status: | IV | 1–3 | 48% | 46% | |||||||||||||||
| pre | 61% | 61% | 4–9 | 40% | 43% | ||||||||||||||
| post | 39% | 39% | ≥10 | 12% | 11% | ||||||||||||||
| Martin et al. 2005; rec # 47650 Grp 1: DAC, n=745 Grp 2: FAC, n=746 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Scale | Grp1 | Grp2 | Grp1 | Grp2 | ||||||
| mn | B | I | T1 | 40% | 43% | 100% had Karnofsky score ≥80% | ER+ &/or | ||||||||||||
| md | 49 | 49 | W | not reported | IIa | T2 | 52% | 51% | PR+ | 76% | 76% | ||||||||
| rng | 26–70 | 23–70 | H | IIb | not reported | T3 | 8% | 6% | menopausal status: | ||||||||||
| sd | A | IIIa | N0 | 0 | 0 | pre: | 56% | 55% | |||||||||||
| O | IIIb | 1–3N+ | 63% | 62% | post: | 44% | 45% | ||||||||||||
| IV | ≥4 N+ | 37% | 38% | ||||||||||||||||
| Neoadjuvant (Pre-operative) Chemotherapy | |||||||||||||||||||
| Learn et al. 2005; rec. # 47640 AC, n=50 AC+D, n=47 AC→D, n=47 pooled data on n=142 evaluated for clin. response | mn | 48 yrs | B | I | 23.6% | clinical tumor diameter | Scale | not reported | Of n=121 with biopsy specimens available for IHC: | ||||||||||
| md | 47 yrs | W | IIa | 39.6% | ≤2 cm | 28.2% | ER+ | 60.3% | |||||||||||
| rng | 27–73 yrs | H | 71% | IIb | 30.6% | >2-≤5 cm | 47.2% | ER- | 39.7% | ||||||||||
| sd | A | IIIa | 6.3% | >5 cm | 24.6% | PR+ | 57.9% | ||||||||||||
| O | IIIb | 0 | N0 | 61.3% | PR- | 42.1% | |||||||||||||
| not reported | 29% | IV | 0 | N1 | 38.7% | ||||||||||||||
| N≥2 | 0 | ||||||||||||||||||
| Arriola et al. 2006; rec # 950 prospective single-arm series; n=232 | mn | 47 yrs | B | I | T2 | 30% | Scale | not reported | ER+ | 67% | |||||||||
| md | W | IIa | T3 | 70% | ER- | 29% | |||||||||||||
| rng | H | not reported | IIb | not reported | N0 | 60% | unk | 4% | |||||||||||
| sd | A | IIIa | N1 | 40% | PR+ | 52% | |||||||||||||
| O | IIIb | PR- | 43% | ||||||||||||||||
| IV | unk | 5% | |||||||||||||||||
| Park et al. 2003; rec # 9960 retrospective single-arm series; n=67 | years | B | I | tumor size (cm): | |||||||||||||||
| <50 | 82% | W | IIa | 5–10 | 91% | Scale | not reported | ER+ | 46% | ||||||||||
| ≥50 | 18% | H | not reported | IIb | not reported | >10 | 9% | ER- | 54% | ||||||||||
| O | IIIa | PR status not reported | |||||||||||||||||
| IV | |||||||||||||||||||
| Zhang et al. 2003; rec # 9820 retrospective single-arm series; n=97 | md | 44.5 yr | B | I | T1 | 13% | Scale | not reported | ER+ | 65% | |||||||||
| rng | 25–74 yr | W | not reported | IIa | T2 | 53% | PR+ | 56% | |||||||||||
| sd | H | IIb | not reported | ≥T3 | 34% | ||||||||||||||
| ≥50 | 44% | A | IIIa | N- | 33% | ||||||||||||||
| <50 | 56% | O | IIIb | N+ | 67% | ||||||||||||||
| IV | |||||||||||||||||||
| Tulbah et al. 2002; rec # 11560 | HER2+ | HER2- | B | HER2+ | HER2- | HER2+ | HER2- | Scale | not reported | HER2+ | HER2- | ||||||||
| age (yr) | W | not reported | I | 0 | 0 | T2 | 3 | 7 | ER+ | 12 | 16 | ||||||||
| ≤50 | 20 | 27 | H | IIa | 1 | 3 | T3 | 9 | 16 | ER- | 9 | 11 | |||||||
| >50 | 2 | 5 | A | IIb | 6 | 9 | T4 | 10 | 9 | unk | 1 | 5 | |||||||
| menopausal status: | O | IIIa | 5 | 10 | N0 | 8 | 9 | PR+ | 11 | 11 | |||||||||
| pre | 20 | 25 | IIIb | 10 | 10 | N1 | 12 | 18 | PR- | 10 | 16 | ||||||||
| post | 2 | 7 | IV | 0 | 0 | N2 | 2 | 5 | unk | 1 | 5 | ||||||||
| Tinari et al. 2006; rec # 2300 retrospective single-arm series; n=77 | mn | B | I | tumor size (cm): | ER+ | 62% | |||||||||||||
| md | 46.1 yrs | W | not reported | IIa | 2–5 | 75% | Scale | not reported | ER- | 38% | |||||||||
| rng | 25.5–73.7 yrs | H | IIb | not reported | >5 | 25% | PR+ | 45% | |||||||||||
| sd | A | IIIa | PR- | 55% | |||||||||||||||
| O | IIIb | ||||||||||||||||||
| IV | |||||||||||||||||||
| Chemotherapy for Advanced or Metastatic Disease | |||||||||||||||||||
| Harris et al 2006; rec. # 390 | All | All | All | All | Scale | All | All | ||||||||||||
| mn | B | 20.6† | I | T1 | ECOG performance status of 0,1,2=100 | ER+ and/or PR+=58† | |||||||||||||
| md | 54.9† | W | II | T2 | |||||||||||||||
| rng | H | III | not reported | T3 | |||||||||||||||
| sd | A | IV | T4 | ||||||||||||||||
| pre | O | Unk | N0 | ||||||||||||||||
| post | N1 | ||||||||||||||||||
| N2 | |||||||||||||||||||
| N3 | |||||||||||||||||||
| median # mets: 1† | |||||||||||||||||||
| Di Leo et al 2004; rec. # 5970 Grp 1: A, n=91 Grp 2: T, n=85 patients with tumor blocks tested for HER2 status | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Scale | Grp1 | Grp2 | Grp1 | Grp2 | ||||||
| mn | B | not reported | I | ≥3 sites | 46% | 51% | Karnofsky | ER+ | not reported | ||||||||||
| md | 54 yr | 51yr | W | II | Visceral Involvement | 79% | 76% | all w specimens: | PR+ | not reported | |||||||||
| rng | H | III | 60–70: | 15% | 15% | reported (data not shown) other factors similar in HER2 status subgroups of each arm | |||||||||||||
| sd | A | IV | 100% | 100% | ≥80 | 85% | 85% | ||||||||||||
| pre | O | Unk | HER2+ subgroup: | ||||||||||||||||
| post | 60–70: | 33% | 0 | ||||||||||||||||
| Konecny et al 2004; rec. # 6740 Grp 1: EC, n=137 Grp 2: ET, n=138 data are for subgroups with known HER2 status | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Grp1 | Grp2 | Scale | Grp1 | Grp2 | Grp1 | Grp2 | ||||||
| Mn | 55 | 55 | B | I | Nuclear Grade | Karnofsky | ER+ | 52.6 | 60.9 | ||||||||||
| md | W | Not reported | II | 1 | 2.2 | 2.9 | >60 | 100 | 100 | ER- | 37.2 | 32.6 | |||||||
| rng | 31–74 | 29–75 | H | III | 2 | 41.6 | 37.0 | Prior adj chemo | unk | 10.2 | 6.5 | ||||||||
| sd | A | IV | 100% | 100% | 3 | 38.7 | 46.3 | Yes | 40.2 | 32.6 | PR+ | 48.9 | 49.3 | ||||||
| pre | O | Unk | Unk | 17.5 | 13.8 | No | 59.1 | 65.9 | PR- | 40.1 | 42.7 | ||||||||
| post | # of met sites | unk | 0.7 | 1.5 | unk | 11.0 | 8.0 | ||||||||||||
| 1 | 35.8 | 31.9 | Prior palliative hormone therapy | ||||||||||||||||
| 2 | 21.2 | 21.0 | |||||||||||||||||
| ≥3 | 42.3 | 42.0 | Yes | 14.6 | 13.8 | ||||||||||||||
| Unk | 0.7 | 5.1 | No | 84.7 | 86.2 | ||||||||||||||
| unk | 0.7 | 0 | |||||||||||||||||
Also showed patient populations were similar in 3 CAF dose arms (high, moderate, low); data not abstracted here.
Also reported data comparing groups 1 and 2 with 1799 patients from CALGB 9344 not included in biomarker analysis; data showed similar baseline characteristics and 5-year outcomes.
p<.05;
characteristics of subset with biomarker data, n=165, similar to those of patients w/o biomarker measurements, n=299
| Study | Assays (Name) | Criteria for Positivity | Test Results (%) | Comments | |||||
|---|---|---|---|---|---|---|---|---|---|
| Adjuvant Chemotherapy | |||||||||
| Yang et al. 2003 rec. # 8840 | FISH | not done | FISH | not done | Pos | 36% | HER2+ = IHC3+ by DAKO scoring pre-ASCO/CAP | ||
| IHC | Neomarker antibody | Equiv | 0 | ||||||
| Neg | 64% | ||||||||
| IHC | strong & complete membrane staining in >10% of tumor cells | 3+ | |||||||
| 2+ | |||||||||
| 1+ | |||||||||
| 0 | |||||||||
| Gusterson et al. 2003; rec. # 43690 | FISH | not done | FISH | not done | Pos | ||||
| IHC | ICR12 monoclonal antibody | Equiv | |||||||
| Neg | |||||||||
| IHC | strong & complete membrane staining at dilution shown to give + signal if ≥3 copies of HER2 gene | Pos | 16% of 760 node- pts; 19% of 746 node+ pts | ||||||
| Equiv | none | ||||||||
| Neg | 84% of 760 node- pts; 81% of 746 node+ pts | ||||||||
| Moliterni et al. 2003; rec. # 10210 RCT; CMF (Grp1) vs CMF→A (Grp 2) | FISH | not done | FISH | not done | Pos | ||||
| IHC | CB11 antibody | Equiv | |||||||
| Neg | |||||||||
| IHC | strong membrane staining found equivalent to 3+ by HercepTest | Pos | Grp1: | 18.2%; | Grp 2: | 16.2% | |||
| Neg | 75.6% | 73.3% | |||||||
| ND | 6.2% | 10.5% | |||||||
| Colozza et al. 2005; rec. # 3820 RCT; CMF (Grp 1) vs epirubicin (Grp 2); n=133 each tested for HER2 status | FISH | not done | FISH | not done | |||||
| IHC | CB11 antibody and HercepTest | IHC | ≥50% CB11+ | HER2+ | Grp 1: | 28% | Grp2: | 41% | |
| ≤50% CB11+ | HER2- | 41% | 36% | ||||||
| CB11 negative | HER2- | 31% | 23% | ||||||
| HercepTest using | 3+ | 7% | 9% | ||||||
| DAKO scoring system | 2+ | 7% | 9% | ||||||
| 1+ | 10% | 11% | |||||||
| 0 | 75% | 71% | |||||||
| Pritchard et al. 2006; rec. #1760 RCT; CMF versus CEF; | FISH | PathVysion kit | FISH | HER2/CEP17 ≥2.00 | Pos | 163 (26%) | also reported concordance rates between the different assays used | ||
| IHC | CB11 and TAB 250 antibodies (results reported separately from each antibody assay) | Neg | 465 (74%) | ||||||
| PCR | as described by O'Malley et al. 2001; rec. # 13790 | IHC | complete membrane staining, score ≥5 on Allred semi-quantitative scale | CB11+ | 124 (20%) | ||||
| CB11- | 510 (80%) | PCR+ 195 (31%) | |||||||
| TAB250+ | 116 (18%) | PCR- 429 (69%) | |||||||
| TAB250- | 516 (82%) | ||||||||
| Knoop et al. 2005; rec. # 3450 RCT (n=805 tested) | FISH | pharmDx | FISH | HER2/CEP17 ≥2, as in kit manufacturer's manual; only tested if IHC2+ or 3+ followed instructions in manual for HercepTest kit | Pos | IHC2+: 21.0 | IHC3+: 89.4 | ||
| IHC | HercepTest | Equiv | 6.2 | 8.9 | |||||
| Neg | 72.8 | 1.6 | |||||||
| IHC | 3+ | 30.6 | |||||||
| 2+ | 10.1 | (IHC3+ or FISH+) = HER2+: 32.7% | |||||||
| 1+ | 32.7 | HER2-: 67.3% | |||||||
| 0 | 26.7 | ||||||||
| Dressler et al. 2005, rec. # 4280; Thor et al. 1998, rec. # 40880 | FISH | PathVysion kit | RCT arm: | high-dose | mod-dose | low-dose | total | ||
| IHC | CB11 (n=346) or A0-11-854 (n=177) antibodies | FISH | HER2/CEP17 >2 | Pos | 30 (5.7%) | 31 (5.9%) | 30 (5.7%) | 91 (17%) | |
| Neg | 149 (28.4%) | 136 (26.0%) | 148 (28.2%) | 433 (83%) | |||||
| PCR | differential PCR assay as described in Thor et al. 1998 | IHC | ≥50% of invasive cells stained by antibody | Pos | 44 (8.4%) | 43 (8.2%) | 40 (7.7%) | 127 (24%) | |
| Neg | 134 (25.6%) | 124 (23.7%) | 138 (26.4%) | 396 (76%) | |||||
| PCR | “unequivocal amplification relative to ...normal...and amplified standard controls” | Pos | 30 (6.1%) | 31 (6.3%) | 30 (6.1%) | 91 (18%) | |||
| Neg | 131 (26.7%) | 125 (25.5%) | 144 (29.3%) | 400 (82%) | |||||
| Del Mastro et al. 2004, 2005; rec. # 48020 | FISH | not done | FISH | not done | Pos | ||||
| IHC | CB11 antibody | Neg | |||||||
| FEC14 (n=370) | FEC21 (n=361) | ||||||||
| all slides scored by one pathologist, blinded to treatment arm & outcome | IHC | 3+ score on Dako scale: 103 of 731 (14%) with specimens available for assay | 3+ | 50 (13.5%) | 53 (14.7%) | ||||
| 2+ | 24 (6.5%) | 23 (6.4%) | |||||||
| 1+ | 19 (5.1%) | 20 (5.5%) | |||||||
| 0 | 277 (74.9%) | 265 (73.4%) | |||||||
| Tanner et al. 2006; rec. # 1820 data for n=180 from FEC arm tested for HER2 status | FISH | not done | CISH | ≥6 copies in >30% of invasive carcinoma cells or ratio >2, HER2/CEP17 | Pos | 56 (31%) | |||
| CISH | Zymed probes (digoxigenin-labeled) | Equiv | 0 | ||||||
| IHC | not done | Neg | 124 (69%) | ||||||
| IHC | 3+ | ||||||||
| 2+ | |||||||||
| 1+ | |||||||||
| 0 | |||||||||
| Hayes et al. 2007; rec. # 47610 | FISH | PathVysion kit | FISH | HER2/CEP17 ≥2.00 | Pos | proportions of HER2+ and HER2- patients not reported for any assay method | |||
| IHC | CB11 antibody and HercepTest | Equiv | |||||||
| Neg | |||||||||
| IHC | CB11: HER2+ if ≥50% of breast cancer cells were stained; HercepTest: as in Dako manual | 3+ | |||||||
| 2+ | |||||||||
| 1+ | |||||||||
| 0 | |||||||||
| Martin et al. 2005; rec # 47650 | FISH | not reported | FISH | HER2/CEP17 ≥2.00 | Pos | 319 (21.4%; 20.8%, DAC arm; 22.0% FAC arm) | |||
| IHC | CB11 antibody (only for 12 patients) | Neg | 943 (63.3%; 63.8%, DAC arm; 62.7% FAC arm) | ||||||
| Unknown | 229 (15.4%; 15.4%, DAC arm; 15.3% FAC arm) | ||||||||
| IHC | not reported | ||||||||
| Neoadjuvant (Pre-operative) Chemotherapy | |||||||||
| Learn et al. 2005; rec. # 47640 n=104 classified for HER2 status | FISH | not reported | FISH | not reported | |||||
| IHC | TAB250 antibody (Zymed; South San Francisco, CA) | HER2 Pos | 41 (39.4% of those tested) | ||||||
| FISH performed on all specimens with “borderline” HER2 IHC scores | IHC | not reported | HER2 Neg | 63 (60.6% of those tested) | |||||
| Arriola et al. 2006; rec # 950 n=223 tested by IHC/FISH initial algorithm & by CISH | FISH | Oncor/Ventana Inform kit | CISH | >5 copies or ratio>2 for | >5 copies | ratio>2 | |||
| CISH | Zymed probe and Spot-Light kit | HER2/CEN17 | Pos | 18% | 14% | ||||
| IHC | CB11 antibody and HercepTest | Neg | 82% | 86% | |||||
| algorithm: CB11 first, then HercepTest for negatives only, then FISH for discordant IHC results; positives by initial algorithm tested by CISH | IHC/FISH initial algorithm: CB11+ if complete membrane staining in >10% of cells; HercepTest+ if 2+ or 3+; FISH+ if >4 signals/cell | Pos | 19% | ||||||
| Neg | 81% | ||||||||
| Park et al. 2003; rec # 9960 | FISH | not done | FISH | not done | |||||
| CISH | Zymed SPOT-Light HER2 probe, digoxigenin-labeled | CISH | HER2 gene copy # >4, or large gene copy cluster in >50% of cancer cell nuclei | Pos | 46% | ||||
| Neg | 54% | ||||||||
| IHC | not done | IHC | not done | ||||||
| Zhang et al. 2003; rec # 9820 | FISH | PathVysion kit | FISH | gene copy ratio >2.0, HER2/chromosome 17 centromere | Pos | 13% | overall, 28% (n=28) HER2+ defined as 3+ by IHC or FISH+; 72% (n=69) HER2- | ||
| IHC | AB8 Neomarker antibody | Neg | 36% | ||||||
| untested | 51% | ||||||||
| n=75 analyzed by IHC | IHC | strong membrane staining in ≥10% of tumor cells | 3+ | 23% | |||||
| n=48 analyzed by FISH | 2+ | 9% | |||||||
| n=97 all patients in study | 1+ | 10% | |||||||
| 0 | 35% | ||||||||
| untested | 23% | ||||||||
| Tulbah et al. 2002; rec # 11560 n=54 tested | FISH | not done | FISH | not done | Pos | ||||
| IHC | HercepTest | Equiv | |||||||
| Neg | |||||||||
| IHC | scored 0–3+, as in Dako kit guide; for analysis of response, only 3+ was considered HER2+ | 3+ | 22 (41%) | ||||||
| 2+ | 12 (22%) | ||||||||
| 1+ | 8 (15%) | ||||||||
| 0 | 12 (22%) | ||||||||
| Tinari et al. 2006; rec # 2300 retrospective single-arm series; n=77 | FISH | not reported | FISH | not reported; used only if HercepTest scored 2+ | Pos | 20 (26%) | |||
| IHC | HercepTest (Dako) | Equiv | 0 | ||||||
| Neg | 57 (74%) | ||||||||
| IHC | scored 0–3+, as in Dako kit guide; positive if IHC scored 3+ or if FISH+ and IHC 2+ | 3+ | |||||||
| 2+ | |||||||||
| 1+ | |||||||||
| 0 | |||||||||
| Chemotherapy for Advanced or Metastatic Disease | |||||||||
| Harris et al 2006; rec. # 390 | FISH | Vysis PathVysion kit (Vysis Inc, Downers Grove, IL) | FISH | Ratio of HER2 to CEP17 signal ≥ 2.0. | Pos | 26 | Cohen's kappa = 83.0% (SE 5.3%) for FISH vs CB11; 72.0% (SE 6.2%) for HercepTest (0–1 vs 2–3) vs FISH; 79.2% (SE 6.0%) for HercepTest (0–2 vs 3) vs FISH; 70.0% (SE 6.3%) for HercepTest (0–1 vs 2–3) vs CB11; 84.2% (SE 5.4%) for HercepTest (1–2 vs 3) vs CB11. | ||
| Neg | 74 | ||||||||
| IHC | Monoclonal antibody CB11 (Biogenex, San Ramon, CA); HercepTest (Dako Corp, Carpinteria, CA) | IHC | CB11: moderate to strong intensity staining in ≥10% of invasive carcinoma cells. | ||||||
| Pos | 20 | ||||||||
| Equiv | |||||||||
| Neg | 80 | ||||||||
| HercepTest score of 3+; i.e., complete membrane staining of >10% tumor cells | 0 | 40 | |||||||
| 1 | 28 | By CB11, 9% of African American women are HER2+ vs 20% of Caucasian women (p=0.08). | |||||||
| 2 | 11 | ||||||||
| 3 | 21 | ||||||||
| Di Leo et al 2004; rec. # 5970 | IHC | CB-11 (Novocastra, Newcastle, UK) | Grp1 | Grp2 | |||||
| IHC → | FISH: FISH done if IHC stained membranes in ≥ 1% of invasive cells; HER2+ if signal ratio, HER2/CEP17 ≥2 | Pos | 16% | 25% | |||||
| FISH | Spectrum Orange HER-2/Spectrum Green CEP17 (PathVysion, Vysis, Downers Grove, IL) | unknown | 14% | 16% | |||||
| Neg | 69% | 59% | |||||||
| Konecny et al 2004; rec. # 6740 | FISH | PathVysion HER-2 Neu & CEP17 probes (Vysis, Downers Grove, IL) | Grp1 | Grp2 | |||||
| FISH | ≥2 HER-2/neu genes per Chromosome 17 Centromere | Pos | 35.8 | 34.8 | |||||
| Equiv | |||||||||
| Neg | 64.2 | 65.2 | |||||||
| IHC | not done | ||||||||
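The overall HER2-positive proportion listed for Knoop et al. (2005) in the table above can be reproduced from the assay-level percentages. The following is a minimal sketch, assuming that IHC 3+ tumors were counted as positive outright and that FISH results contributed only for IHC 2+ tumors; the function and variable names are illustrative and do not come from the study.

```python
def combined_her2_positive(ihc3_pct, ihc2_pct, fish_pos_in_ihc2_pct):
    """Percent of all tested patients classified HER2+ (IHC 3+, or IHC 2+ with FISH+)."""
    # IHC 3+ tumors count as positive outright; IHC 2+ tumors count only if FISH+.
    return ihc3_pct + ihc2_pct * fish_pos_in_ihc2_pct / 100.0


# Percentages as listed for Knoop et al. (2005) in the table above.
knoop_her2_pos = combined_her2_positive(
    ihc3_pct=30.6, ihc2_pct=10.1, fish_pos_in_ihc2_pct=21.0
)
print(f"HER2+: {knoop_her2_pos:.1f}%  HER2-: {100 - knoop_her2_pos:.1f}%")
```

Under these assumptions the sketch gives 32.7 percent HER2-positive and 67.3 percent HER2-negative, matching the values listed in the table.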
Studies on CMF. Of the two CMF studies, the retrospective series by Yang, Klos, Zhou, et al. (2003) pooled data for node-negative and node-positive patients, groups that Gusterson, Gelber, Goldhirsch, et al. (2003) randomized separately to different treatment arm pairs. Yang, Klos, Zhou, et al. (2003) only reported baseline characteristics and risk factors for all patients analyzed. Gusterson, Gelber, Goldhirsch, et al. (2003) compared HER2-positive versus HER2-negative patients separately for the node-positive and node-negative groups, but did not compare those with known HER2 status versus those lacking tissue blocks for HER2 assays. In node-negative patients, HER2 positivity was statistically significantly associated with larger tumor size, hormone-receptor negativity, and higher tumor grade. In node-positive patients, HER2 positivity was statistically significantly associated with menopausal status, hormone-receptor negativity, and higher tumor grade.
Studies on regimens with versus without an anthracycline. Three (Colozza, Sidoni, Mosconi, et al., 2005; Pritchard, Shepherd, O'Malley, et al., 2006; Tanner, Isola, Wiklund, et al., 2006) of five studies comparing adjuvant regimens with versus without an anthracycline compared baseline characteristics of HER2-positive and HER2-negative subgroups. Three (Colozza, Sidoni, Mosconi, et al., 2005; Knoop, Knudsen, Balslev, et al., 2005; Tanner, Isola, Wiklund, et al., 2006) explored whether subgroups tested for HER2 status were similar to the total study population or the subgroup not tested. Two trials (Moliterni, Menard, Valagussa, et al., 2003; Pritchard, Shepherd, O'Malley, et al., 2006) determined HER2 status in 92 percent and 89 percent, respectively, of the patients originally randomized and did not report comparisons with all randomized patients or with those omitted. Each trial's full treatment arms were well balanced for baseline characteristics and prognostic factors.
Moliterni, Menard, Valagussa, et al. (2003) did not report data comparing baseline factors by HER2 status. All patients in this trial had one to three positive nodes, and approximately 65 percent had tumors smaller than 2.1 cm in diameter. Colozza, Sidoni, Mosconi, et al. (2005) reported that treatment arms were well balanced, whether comparing all patients randomized or only those tested for HER2 status. However, significantly more patients randomized to epirubicin than to CMF were HER2 positive (41 percent versus 28 percent, p=.03). Progesterone receptor positivity was the only factor statistically significantly associated with HER2 positivity. This trial included node-positive and node-negative patients (four or more positive nodes in fewer than 25 percent); approximately 45 percent had tumors 2 cm or smaller in diameter.
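The imbalance in HER2 positivity between arms reported by Colozza and co-workers (41 percent versus 28 percent, p=.03) can be checked against the subgroup sizes listed for this trial in the time-to-event outcomes table (54 HER2-positive and 79 HER2-negative patients on epirubicin; 37 and 96 on CMF). The sketch below assumes an uncorrected Pearson chi-square test on the 2×2 table; the source does not state which test the investigators used, and the function name is illustrative.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square statistic and p-value (1 df) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: epirubicin, CMF; columns: HER2-positive, HER2-negative
# (counts taken from the time-to-event outcomes table).
chi2, p_value = pearson_chi2_2x2(54, 79, 37, 96)
print(f"epirubicin HER2+: {54 / 133:.0%}, CMF HER2+: {37 / 133:.0%}")
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

Under this assumption the p-value rounds to .03, which is consistent with the reported figure; a continuity-corrected or exact test would give a somewhat larger value.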
Pritchard, Shepherd, O'Malley, et al. (2006) reported that baseline characteristics of patients tested for HER2 status were similar to those of all randomized patients, but did not show data for this comparison. They showed data comparing FISH-positive and FISH-negative subgroups; except for a shift toward younger age in the FISH-positive subgroup, there were no significant differences. Just over half the patients in this trial had T2 or T3 tumors, and all had positive lymph nodes, with four or more positive nodes in 37 percent and 43 percent of the FISH-negative and FISH-positive groups, respectively. Knoop and co-workers (2005) reported that among all patients tested for HER2 status, treatment arms were well balanced for prognostic factors. However, they did not report comparing the HER2-positive versus HER2-negative patients, either by treatment arms or across treatments. Tumors were larger than 2 cm in diameter in approximately 60 percent of patients, and approximately 30 percent had four or more positive nodes. Tanner, Isola, Wiklund, et al. (2006) reported (but did not show data) that baseline characteristics of all patients tested for HER2 status did not differ from those of the entire trial cohort. They showed that baseline characteristics were similar for HER2-tested subgroups from each arm. However, the AuSCS arm was excluded from this review, and data were not reported comparing baseline characteristics of HER2-positive versus HER2-negative patients from the FEC arm.
Studies on dose or dose intensity of anthracycline-based regimens. Studies from randomized, controlled trials that compared dose (Dressler, Berry, Broadwater, et al., 2005) or dose intensity (Del Mastro, Bruzzi, Nicolo, et al., 2005) of anthracycline-based regimens reported that baseline characteristics and prognostic factors of patients with known HER2 status were similar to those of patients omitted from the analyses because their HER2 status was unknown. Dressler and co-workers (2005) did not report data comparing baseline characteristics or prognostic factors of HER2-positive versus HER2-negative patients. Del Mastro and co-workers (2005) found that a greater proportion of HER2-positive than HER2-negative patients lacked expression of both estrogen and progesterone receptors (62 percent versus 32.5 percent). Other baseline characteristics and prognostic factors were similar between subgroups by HER2 status and between treatment arms.
Studies on regimens with versus without a taxane. One of two studies from randomized, controlled trials on regimens with versus without a taxane compared baseline characteristics and prognostic factors of patients with known HER2 status versus those of patients with unknown HER2 status. The trial comparing paclitaxel versus observation after AC (Hayes, Thor, Dressler, et al., 2007) showed similar baseline characteristics, prognostic factors, and overall survival in the two subgroups they randomly selected and tested for HER2 status (n=643 and 679, respectively). These subgroups were also similar to all treated patients (n=3,121), and to all non-tested patients (n=1,799). Tumor diameter was 2 cm or smaller in approximately 35 percent, and approximately 54 percent had four or more positive nodes. The randomized, controlled trial that compared TAC versus FAC (Martin, Pienkowski, Mackey, et al., 2005) only compared patient characteristics and prognostic factors by treatment arm for all patients randomized. Neither study compared HER2-positive versus HER2-negative patients, either pooled across treatments or by treatment arm.
Six studies on preoperative neoadjuvant chemotherapy. The randomized, controlled trial on neoadjuvant therapy (Learn, Yeh, McNutt, et al., 2005) did not compare treatment arms or patient subgroups by HER2 status (neither known versus unknown nor positive versus negative) with respect to baseline characteristics or prognostic factors. This study reported patient and tumor characteristics only for all randomized patients.
Only one (Tulbah, Ibrahim, Ezzat, et al., 2002) of the five included series compared baseline characteristics and prognostic factors for HER2-positive and HER2-negative subgroups. Across all five studies, approximately 55 percent to 65 percent of included patients were positive for estrogen receptors, and 45 percent to 55 percent were positive for progesterone receptors. However, the study samples varied somewhat with respect to tumor size and number of positive nodes. The series reported by Arriola, Moreno, Varela, et al. (2006) included 30 percent T2 and 70 percent T3 tumors, with 60 percent of patients node negative and 40 percent N1. Most patients (91 percent) in the series reported by Park, Kim, Lim, et al. (2003) had tumors between 5 and 10 cm in diameter; nodal status was not reported. Zhang, Yang, Smith, et al. (2003) included a few patients (13 percent) with T1 tumors, and approximately 33 percent node-negative patients. Most patients in the Tulbah, Ibrahim, Ezzat, et al. (2002) series had T3 or larger tumors, and approximately 55 percent had N1 disease. They reported generally well-balanced HER2-positive and HER2-negative subgroups. Finally, 75 percent of patients in the Tinari, Lattanzio, Natoli, et al. (2006) series had tumors with diameters between 2 and 5 cm; number of positive nodes was not reported.
Three studies on chemotherapy for advanced or metastatic breast cancer. Each of the three included randomized, controlled trials reported that baseline characteristics and prognostic factors for the subgroup tested for HER2 status were similar to those of patients not tested. However, none compared HER2-positive versus HER2-negative subgroups, either separately by treatment arm or across arms.
Harris, Broadwater, Lin, et al. (2006) reported that the only statistically significant difference between patients tested for HER2 (and other biomarkers) and those not tested was a shorter disease-free interval among those tested (19 versus 31 months, p=.0003). Investigators attributed this difference to the discarding of tissue blocks after 10 years, which resulted in a shorter interval from diagnosis to metastasis for those with blocks remaining. Hormone-receptor status (positive in 58 percent) and median number of metastatic sites (one) were the only prognostic factors reported among those tested for HER2 status. The analysis by HER2 status pooled patients across three trial arms randomized to different paclitaxel doses.
Di Leo, Chan, Paesmans, et al. (2004) showed the subgroups tested for HER2 status from each treatment arm were similar to each other and to the untested patients. Approximately half the included patients had three or more sites of disease, and more than three fourths had visceral involvement. They did not report hormone receptor status.
Konecny, Thomssen, Luck, et al. (2004) reported no statistically significant differences in baseline characteristics or prognostic factors between groups tested for HER2 and those not tested from each treatment arm compared separately. However, the HER2-positive and HER2-negative groups were not directly compared, either separately by treatment arm or pooled across arms.
| Study | Time to Event Outcomes | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Adjuvant chemotherapy for resected early breast cancer | |||||||||||||
| Yang et al., 2003 CMF; single-arm series | Outcome | Grp | N | Med (mos) | 1 yr | 2.5 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) | Comments p=.002 in stratified log rank that adjusted for nodal status |
| DFS | HER2+ | 34 | 6–7 years | ~60% | 53% | log rank <.01 | |||||||
| HER2- | 60 | not reached | ~90% | 86% | |||||||||
| Gusterson et al., 2003 760 node-neg pts randomized to periop CMF (Tx) or no adj Tx (Cx) | Outcome | Grp | N | Med (mos) | 2 yr | 3 yr | 4 yr | 5 yr | 6 yr | Test | p | HR (95%CI) | Comments unadjusted univariate analyses; adjusted results also NS |
| OS (HER2+) | Tx | 64 | not reached | 76±5 | Cox | NS | 1.15 (0.54–2.46) | ||||||
| Cx | 54 | not reached | 79±6 | prop hazards | |||||||||
| OS (HER2-) | Tx | 436 | not reached | 85±2 | Cox | NS | 1.04 (0.68–1.61) | ||||||
| Cx | 206 | not reached | 87±2 | prop hazards | |||||||||
| DFS(HER2+) | Tx | 64 | not reached | ~84% | ~68% | ~65% | ~62% | 61±6 | Cox | NS | 1.22 (0.66–2.25) | ||
| Cx | 54 | not reached | ~86% | ~75% | ~73% | ~70% | 68±7 | prop hazards | |||||
| DFS (HER2-) | Tx | 436 | not reached | ~90% | ~85% | ~80% | ~77% | 71±2 | Cox | NS | 0.82 (0.61–1.09) | ||
| Cx | 206 | not reached | ~85% | ~77% | ~72% | ~70% | 68±3 | prop hazards | |||||
| Gusterson et al., 2003 746 node-pos pts randomized to prolonged (Tx) or periop (Cx) CMF | Outcome | Grp | N | Med (mos) | 2 yr | 3 yr | 4 yr | 5 yr | 6 yr | Test | p | HR (95%CI) | Comments unadjusted univariate analyses; adjusted analyses gave similar results |
| OS (HER2+) | Tx | 85 | not reported | 46±6 | Cox | NS | 1.15 (0.62–1.54) | ||||||
| Cx | 55 | not reported | 40±7 | prop hazards | |||||||||
| OS (HER2-) | Tx | 406 | not reached | 71±2 | Cox | .01 | 0.69 (0.52–0.92) | ||||||
| Cx | 200 | not reached | 61±4 | prop hazards | |||||||||
| DFS(HER2+) | Tx | 85 | ~36 | ~60% | ~50% | ~43% | ~40% | 38±5 | Cox | NS | 0.77 (0.51–1.16) | ||
| Cx | 55 | ~24 | ~50% | ~42% | ~35% | ~30% | 29±6 | prop hazards | |||||
| DFS (HER2-) | Tx | 406 | >72 | ~80% | ~70% | ~63% | ~57% | 52±3 | Cox | <.0001 | 0.57 (0.46–0.72) | ||
| Cx | 200 | ~40 | ~63% | ~55% | ~45% | ~40% | 36±4 | prop hazards | |||||
| Moliterni et al., 2003 RCT; CMF→A (Tx) vs CMF (Cx) | Outcome | Grp | N | Med (mos) | 2 yr | 4 yr | 6 yr | 8 yr | 10 yr | Test | p | HR (95%CI) | Comments HR=0.48, p=.052 for treatment × HER2 interaction term HR=0.68, p not signif. for treatment × HER2 interaction term |
| OS (HER2+) | Tx | 45 | >192 | ~92% | ~83% | ~73% | ~68% | 64% | Cox | 0.61 (0.32–1.16) | |||
| Cx | 50 | ~170 | ~90% | ~80% | ~63% | ~57% | 54% | model | |||||
| OS (HER2-) | Tx | 203 | >192 | ~97% | ~90% | ~86% | ~83% | 76% | Cox | 1.26 (0.89–1.79) | |||
| Cx | 208 | >192 | ~97% | ~94% | ~90% | ~83% | 77% | model | |||||
| RFS(HER2+) | Tx | 45 | >192 | ~85% | ~75% | ~62% | ~58% | 55% | Cox | 0.83 (0.46–1.49) | |||
| Cx | 50 | ~102 | ~85% | ~65% | ~62% | ~52% | 46% | model | |||||
| RFS (HER2-) | Tx | 203 | ~162 | ~90% | ~80% | ~65% | ~60% | 56% | Cox | 1.22 (0.91–1.64) | |||
| Cx | 208 | >192 | ~90% | ~80% | ~74% | ~65% | 59% | model | |||||
| Colozza et al., 2005 RCT; epirubicin (Tx) vs. CMF (Cx); n=133 each group tested for HER2 status | Outcome | Grp | N | Med (mos) | 4 yr | 6 yr | % at 8 yr±SD | Test | Comments: CMF HER2+ versus CMF HER2-, p=.024; all other comparisons not statistically significant including epirubicin HER2+ versus epirubicin HER2-, p=0.24. Interaction terms by Cox MVA: for OS: HR=1.61, CI: 0.64–4.01, p not signif. for RFS: HR=1.02, CI: 0.40–2.58, p not signif. | ||||
| OS (HER2+) | Tx | 54 | not reached | ~89% | ~80% | 75.8±5.8 | log | ||||||
| Cx | 37 | not reached | ~77% | ~70% | 67.6±7.7 | rank | |||||||
| OS (HER2-) | Tx | 79 | not reached | ~90% | ~87% | 84.5±4.1 | log | ||||||
| Cx | 96 | not reached | ~93% | ~90% | 87.4±3.4 | rank | |||||||
| RFS(HER2+) | Tx | 54 | 60.1±6.9 | log | |||||||||
| Cx | 37 | 68.6±7.2 | rank | ||||||||||
| RFS (HER2-) | Tx | 79 | 65.9±5.4 | log | |||||||||
| Cx | 96 | 70.3±4.7 | rank | ||||||||||
| Pritchard et al., 2006 RCT; CEF (Tx) vs. CMF (Cx); HER2 status by FISH results | Outcome | Grp | N | Med (yrs) | 2 yr | 4 yr | 6 yr | 8 yr | 10 yr | Test | p | HR (95%CI) | Comments HR=2.04, CI: 1.14–3.65, p=.02 for treatment by HER2 interaction in Cox MVA HR=1.96, CI: 1.15–3.65, p=.01 for treatment by HER2 interaction in Cox MVA |
| OS (HER2+) | Tx | 75 | not reached | ~93% | ~70% | ~62% | ~58% | ~57% | log | .06 | 0.65 (0.42–1.02) | ||
| Cx | 88 | ~5.3 | ~92% | ~62% | ~47% | ~46% | ~45% | rank | |||||
| OS (HER2-) | Tx | 237 | not reached | ~93% | ~83% | ~75% | ~67% | ~63% | log | NS | 1.06 (0.83–1.44) | ||
| Cx | 228 | not reached | ~93% | ~80% | ~75% | ~67% | ~62% | rank | |||||
| RFS (HER2+) | Tx | 75 | not reached | ~77% | ~67% | ~58% | ~57% | ~56% | log | .003 | 0.52 (0.34–0.80) | ||
| Cx | 88 | ~2.5 | ~63% | ~43% | ~42% | ~34% | ~31% | rank | |||||
| RFS (HER2-) | Tx | 237 | ~10 | ~81% | ~67% | ~60% | ~54% | ~50% | log | NS | 0.91 (0.71–1.18) | ||
| Cx | 228 | ~10 | ~81% | ~64% | ~58% | ~54% | ~50% | rank | |||||
| Knoop et al., 2005 RCT (n=805); CEF (Tx) vs. CMF (Cx) | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) | Comments HRs and 95% CIs all adjusted by multivariate analysis for T size, nodal & menopausal status; stratified for grade, ER and TOP2A status |
| OS (HER2+) | Tx | 120 | Cox | .09 | 0.73 (0.50–1.05) | ||||||||
| Cx | 143 | proportional hazards | |||||||||||
| OS (HER2-) | Tx | 249 | Cox | .23 | 0.82 (0.59–1.13) | ||||||||
| Cx | 293 | proportional hazards | |||||||||||
| RFS (HER2+) | Tx | 120 | Cox | .10 | 0.75 (0.53–1.06) | ||||||||
| Cx | 143 | proportional hazards | |||||||||||
| RFS (HER2-) | Tx | 249 | Cox | .10 | 0.79 (0.60–1.05) | ||||||||
| Cx | 293 | proportional hazards | |||||||||||
| Dressler et al., 2005; Thor et al., 1998; separate survival curves show similar results for HER2 status by IHC, FISH, and PCR; only abstracted data for HER2 by IHC | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) | Comments: HR and p data are for interaction of CAF dose with HER 2 status in model for DFS |
| OS HER2+ | high | 44 | >108 | ~97% | ~97% | 93% (86–100) | |||||||
| by IHC | mod | 43 | ~87 | ~93% | ~66% | 58% (47–75) | |||||||
| low | 40 | ~96 | ~90% | ~66% | 63% (49–80) | ||||||||
| OS HER2- | high | 134 | ~100 | ~93% | ~80% | 74% (67–81) | |||||||
| by IHC | mod | 124 | >108 | ~96% | ~86% | 78% (80–92) | |||||||
| low | 138 | ~100 | ~93% | ~80% | 74% (67–81) | ||||||||
| DFS HER2+ | high | 44 | >108 | ~97% | ~90% | 87% (74–96) | multivariate | .0003 | 0.42 (0.19–0.93) | HER2 by IHC | |||
| by IHC | mod | 43 | ~36 | ~60% | ~47% | 47% (34–64) | |||||||
| low | 40 | ~66 | ~65% | ~58% | 53% (39–71) | proportional | .033 | 0.92 (0.81–1.04) | HER2 by FISH | ||||
| DFS HER2- | high | 134 | >108 | ~83% | ~70% | 64% (56–73) | |||||||
| by IHC | mod | 124 | >108 | ~83% | ~70% | 65% (57–74) | hazards | .043 | 0.58 (0.25–1.35) | HER2 by PCR | |||
| low | 138 | ~90 | ~78% | ~63% | 59% (51–68) | ||||||||
| Del Mastro et al. 2004, 2005; Tx = FEC14 Cx = FEC21 | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) | Comments for all FEC14, HER2+ vs. HER2-: EFS, HR=1.21 (0.65–2.24) p=.54; OS, HR=1.85 (0.88–3.89), p = .103 for all FEC21, HER2+ vs HER2-: EFS, HR=2.07 (1.27–3.38), p=.003; OS, HR=2.47 (1.34–4.57), p=.004 |
| OS (HER2+) | Tx | 50 | >72 | ~100% | ~100% | ~96% | ~92% | 89.9% | prop hazards | .22 | 0.59 (0.26–1.37) | ||
| Cx | 53 | >72 | ~98% | ~89% | ~85% | ~81% | 75.1% | ||||||
| OS (HER2-) | Tx | 320 | >84 | ~100% | ~99% | ~96% | ~95% | 91.9% | prop hazards | .34 | 0.79 (0.49–1.28) | ||
| Cx | 308 | >84 | ~100% | ~99% | ~96% | ~94% | 90.7% | ||||||
| EFS (HER2+) | Tx | 50 | >72 | ~100% | ~98% | ~85% | 79% | 77.7% | prop hazards | .092 | 0.54 (0.27–1.11) | ||
| Cx | 53 | >72 | ~91% | ~82% | ~68% | ~67% | 62.5% | ||||||
| EFS (HER2-) | Tx | 320 | >84 | ~100% | ~93% | ~90% | ~85% | 81.5% | prop hazards | .57 | 0.91 (0.65–1.27) | ||
| Cx | 308 | >84 | ~98% | ~93% | ~87% | ~83% | 80.9% | ||||||
| Tanner et al., 2006 FEC arm only | Outcome | Grp | N | Med (mos) | 2 yr | 3 yr | 4 yr | 5 yr | 6 yr | Test | p | HR (95%CI) | Comments only reported statistical comparisons of FEC vs. HDC/AuSCS, not HER2+ vs. HER2- in same arm |
| OS | HER2+ | 56 | ~54 | ~79% | ~64% | ~58% | ~46% | ~41% | not reported | ||||
| HER2- | 124 | >84 | ~94% | ~83% | ~74% | ~68% | ~64% | ||||||
| RFS | HER2+ | 56 | ~48 | ~68% | ~62% | ~50% | ~46% | ~46% | not reported | ||||
| HER2- | 124 | >84 | ~84% | ~74% | ~67% | ~66% | ~65% | ||||||
| Hayes et al., 2007 AC→P (Tx) vs. AC alone (Cx) HER2 status based on CB11 IHC test results; | Outcome | Grp | N | Med (mos) | 3 yr | 6 yr | 9 yr | Test | p | HR (95%CI) | Comments: total n=1322; HR & p for interaction of HER2+ status and effect of adding paclitaxel HR & p, as for OS | ||
| OS HER2+ | Tx | not reached | ~87–92% | ~75–78% | ~70–78% | Cox | .01 | 0.57 | |||||
| Cx | ~60–96 | ~70–75% | ~52–62% | ~47–49% | multivariate regression | ||||||||
| OS HER2- | Tx | not reached | ~87–92% | ~76–80% | ~68–70% | ||||||||
| Cx | not reached | ~85–87% | ~74–77% | ~63–66% | |||||||||
| DFS HER2+ | Tx | not reached | ~80–87% | ~69–72% | ~62–67% | Cox | .01 | 0.59 | |||||
| Cx | ~48–60 | ~53–60% | ~45–50% | ~45–48% | multivariate regression | ||||||||
| DFS HER2- | Tx | not reached | ~83–87% | ~70–75% | ~65–69% | ||||||||
| Cx | ~120–132 | ~80–85% | ~65–67% | ~55–60% | |||||||||
| Martin et al., 2005 Tx = DAC Cx = FAC | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) | Comments K-M DFS curves not shown separately by HER2 status; p values not reported |
| DFS HER2+ | Tx | 155 | Cox prop hzrds models | 0.60 (0.41–0.88) | |||||||||
| Cx | 164 | ||||||||||||
| DFS HER2- | Tx | 475 | 0.76 (0.59–1.00) | ||||||||||
| Cx | 468 | ||||||||||||
| DFS HER2 | Tx | 115 | 0.72 (0.45–1.17) | ||||||||||
| Unknown | Cx | 114 | |||||||||||
| Neoadjuvant (preoperative) chemotherapy for locally advanced breast cancer | |||||||||||||
| Learn et al., 2005 | did not report time-to-event outcome | ||||||||||||
| Arriola et al., 2006 | did not report time-to-event outcomes | ||||||||||||
| Park et al., 2003 | did not report time-to-event outcomes | ||||||||||||
| Zhang et al., 2003; FAC, n=97 (n=78 also given post-op chemoTx) | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) | Comments |
| DFS | HER2+ | 28 | 48 (for all patients) | ~90% | ~83% | ~60% | ~45% | not specified | NS | not reported | |||
| HER2- | 69 | ~90% | ~80% | ~70% | ~60% | ||||||||
| Tulbah et al., 2002; | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) | Comments if HER2+ = IHC 2+/3+, OS favored HER2- 90% vs. 79% p=.051 if HER2+ = IHC 2+/3+ DFS still not statistically significant (p=.09) |
| OS | HER2+ | 22 | not reached | ~95% | ~79% | ~66% | ~66% | log | .31 | ||||
| HER2- | 32 | not reached | ~97% | ~97% | ~72% | ~72% | rank | ||||||
| DFS | HER2+ | 21 | 34.5±7.8 | ~88% | ~75% | ~75% | 0 | log | .43 | ||||
| HER2- | 31 | (all 52 pts) | ~92% | ~83% | ~52% | ~52% | rank | ||||||
| Tinari et al., 2006; | did not report time-to-event outcome by HER2 status | ||||||||||||
| First- or second-line chemotherapy for advanced or metastatic breast cancer | |||||||||||||
| Harris et al., 2006 | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | 10yr | Test | p HR (95%CI) | Comments |
| OS | CB11+ | 30 | 11.3 | Log | .14 | ||||||||
| CB11- | 126 | 13.1 | rank | ||||||||||
| FISH+ | 37 | 10.9 | Log | .26 | |||||||||
| FISH- | 109 | 13.1 | rank | ||||||||||
| HercepTest 2+/3+ | 46 | 11.5 | Log | .84 | |||||||||
| HercepTest 0/1+ | 105 | 13.2 | rank | ||||||||||
| Di Leo et al., 2004 Grp 1: A Grp 2: T | Outcome | Grp | N | Med (mos) | 6 mos | 1 yr | 1.5 yr | 2 yr | 2.5 yr | Test | p | HR (95%CI) | Comments: In full TAX 303 trial, no statistically significant differences between Tx arms with respect to OS or TTP |
| | OS HER2+ | 1 | 15 | 10.8 | ~.85 | ~.3 | No line | No line | No line | Cox regression | .33 | 1.47 (0.68–3.15) | |
| | | 2 | 21 | 14.4 | ~.95 | ~.6 | ~.46 | No line | No line | | | | |
| | OS HER2- | 1 | 63 | 16.9 | ~.8 | ~.72 | ~.5 | ~.3 | No line | Cox regression | .07 | 0.64 (0.40–1.03) | |
| | | 2 | 50 | 12.6 | ~.8 | ~.6 | ~.32 | ~.28 | 0 | | | | |
| | Outcome | Grp | N | Med (mos) | 3 mos | 6 mos | 9 mos | 12 mos | 15 mos | 18 mos | Test | p | HR (95%CI) |
| | TTP HER2+ | 1 | 15 | 4.7 | ~.75 | ~.4 | ~.25 | ~.15 | ~.15 | 0 | Cox regression | .73 | 0.88 (0.43–1.82) |
| | | 2 | 21 | 7.0 | ~.75 | ~.6 | ~.15 | ~.1 | ~0 | No line | | | |
| | TTP HER2- | 1 | 63 | 5.9 | ~.74 | ~.5 | ~.35 | ~.25 | ~.15 | <.1 | Cox regression | .22 | 0.77 (0.52–1.16) |
| | | 2 | 50 | 5.0 | ~.74 | ~.45 | ~.2 | ~.1 | <.1 | No line | | | |
| PFS | Tx | ||||||||||||
| Cx | |||||||||||||
| Konecny et al., 2004 Grp 1: EC Grp 2: ET | Outcome | Grp | N | Med (mos) (95%CI) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p HR (95%CI) | Comments | |
| OS | 1 | ||||||||||||
HER2+ | 49 | 16.4(12.1–20.1) | ~.65 | ~.3 | ~.25 | ~.25 | log | .010 | |||||
HER2- | 88 | 33.1(20.9–50.6) | ~.78 | ~.57 | ~.45 | ~.4 | rank | ||||||
| 2 | |||||||||||||
HER2+ | 48 | 21.4(15.3–27.3) | ~.74 | ~.45 | ~.25 | ~.1 | log | .463 | |||||
HER2- | 90 | 27.5(17.1–35.2) | ~.7 | ~.55 | ~.35 | ~.2 | rank | ||||||
HER2+ | 1 | 49 | 16.4(12.1–20.1) | ~.6 | ~.3 | ~.25 | ~.25 | log | .319 | ||||
| 2 | 48 | 21.4(15.3–27.3) | ~.7 | ~.4 | ~.25 | ~.1 | rank | ||||||
HER2- | 1 | 88 | 33.1(20.9–50.6) | ~.78 | ~.58 | ~.43 | ~.4 | log | .292 | ||||
| 2 | 90 | 27.5(17.1–35.2) | ~.7 | ~.55 | ~.35 | ~.15 | rank | ||||||
| PFS | 1 | ||||||||||||
HER2+ | 49 | 7.1(4.1–9.3) | ~.2 | ~.08 | ~.08 | log | .010 | ||||||
HER2- | 88 | 10.4(6.9–14.9) | ~.54 | ~.22 | ~.12 | rank | |||||||
| 2 | |||||||||||||
HER2+ | 48 | 10.5(8.1–11.9) | ~.35 | ~.1 | ~.05 | log | .584 | ||||||
HER2- | 90 | 9.6(7.5–11.3) | ~.35 | ~.15 | ~.08 | rank | |||||||
HER2+ | 1 | 49 | 7.1(4.1–9.3) | ~.2 | ~.08 | ~.08 | log | .116 | |||||
| 2 | 48 | 10.5(8.1–11.9) | ~.35 | ~.1 | ~.05 | rank | |||||||
HER2- | 1 | 88 | 10.4(6.9–14.9) | ~.47 | ~.25 | ~.1 | log | .350 | |||||
| 2 | 90 | 9.6(7.5–11.3) | ~.52 | ~.13 | ~.08 | rank | |||||||
| Study | Tumor Response (%) | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Adjuvant chemotherapy for resected early breast cancer | ||||||||||
| Yang et al., 2003 | not reported | |||||||||
| Gusterson et al., 2003 | not reported | |||||||||
| Moliterni et al., 2003 | not reported | |||||||||
| Colozza et al., 2005 | not reported | |||||||||
| Pritchard et al., 2006 | not reported | |||||||||
| Knoop et al., 2005 | not reported | |||||||||
| Dressler et al., 2005 | not reported | |||||||||
| Del Mastro et al., 2004, 2005 | not reported | |||||||||
| Tanner et al., 2006 | not reported | |||||||||
| Hayes et al., 2007 | not reported | |||||||||
| Martin et al., 2005 | not reported | |||||||||
| Neoadjuvant (preoperative) chemotherapy for locally advanced breast cancer | ||||||||||
| Learn et al., 2005; n=104 classified for HER2 status | Grp | N | pCR | ORR (cCR+cPR) | Test | p | Comments: for ORR data, by multivariate analysis: AC, HER2+ vs. HER2-, p=0.06; AC+D, HER2+ vs. HER2-, p=0.99 | |||
| | HER2+, AC | 32 | 22% | 75% | logistic regression | NS | | | | |
| | HER2+, AC+D | 9 | 22% | 78% | | | | | | |
| | HER2-, AC | 37 | 24% | 51% | logistic regression | <.05 | | | | |
| | HER2-, AC+D | 26 | 24% | 81% | | | | | | |
| Arriola et al., 2006 | Grp | N | pCR | PR | SD | PD | NE | Test | p | Comments |
| | all | 229 | 27 | | | | | Mann-Whitney | .03 | “association of HER2+ with pCR” |
| Park et al., 2003 | Grp | N | pCR | PR | OR (CR+PR) | NR (PD+NE) | Test | p | Comments | |
| | HER2+ | 31 | 5 (16%) | 22 (71%) | 27 (87%) | 4 (13%) | Fisher's exact | .013 | | |
| | HER2- | 36 | 0 | 17 (47%) | 17 (47%) | 19 (53%) | | | | |
| Zhang et al., 2003 (tests: Fisher's exact and asymptotic) | Grp | N | cCR+cPR | cNR | p | RR (95%CI) | pCR+MRD | ERD | p | RR (95%CI) |
| | HER2+ | 28 | 93% | 7% | .014 | 1.2 (1.1–1.4) | 18% | 82% | .53 | 1.4 (0.54–3.67) |
| | HER2- | 69 | 78% | 22% | | | 13% | 87% | | |
| Tulbah et al., 2002 | Grp | N | pCR | PR | SD | PD | NE | Test | p | Comments: also NS if IHC 2+ and 3+ considered HER2+ |
| | HER2+ | 21 | 6 (29%) | | | | | | NS | |
| | HER2- | 31 | 7 (23%) | | | | | | | |
| Tinari et al., 2006 | Grp | N | TR | OR | SD | PD | NE | Test | p | Comments |
| | all | 77 | 23.4% | 72.7% | 3.9% | | | univariate logistic regression | .008 | odds ratio=5.28 (1.57–19.6) |
| | HER2+ | 20 | | | | | | | | |
| | HER2- | 57 | | | | | | | | |
| First- or second-line chemotherapy for advanced or metastatic breast cancer | ||||||||||
| Harris et al., 2006 | N | CR+PR (%) | p | Comments: did not report logistic regression analysis | ||||||
| HER2 by CB11 | 0.96 | |||||||||
Pos | 30 | 23 | ||||||||
Neg | 126 | 24 | ||||||||
| HER2 FISH | 0.70 | |||||||||
Pos | 37 | 22 | ||||||||
Neg | 109 | 25 | ||||||||
| HercepTest | 0.026 | |||||||||
Pos (2–3) | 46 | 35 | ||||||||
Neg (0–1) | 105 | 18 | ||||||||
| HercepTest | 0.98 | |||||||||
Pos (3) | 30 | 23 | ||||||||
Neg (0–2) | 121 | 23 | ||||||||
| Di Leo et al., 2004 | Grp1 | N | %(CR+PR) | A versus T OR (95%CI) | p | Comments: By MV logistic regression, treatment × HER2 status OR=3.64, CI: 1.39–9.54 p=0.01; remains SS after adjusting for visceral & Tx × visceral interaction | ||||
HER2+ | 15 | 27 | HER2+ | 5.50(1.28–23.69) | .04 | |||||
HER2- | 63 | 35 | HER2- | 1.24(0.58–2.68) | .70 | |||||
HER2 unk | 13 | 31 | HER2 unk | 1.25(0.25–6.24) | 1.00 | |||||
All | 91 | 33 | All | 1.72(0.94–3.18) | .09 | |||||
HER2+ | 21 | 67 | (In full TAX 303 trial, response rates were 48% with docetaxel (n=161), 33% with doxorubicin (n=165), p=0.008) | |||||||
HER2- | 50 | 40 | ||||||||
HER2 unk | 14 | 36 | ||||||||
All | 85 | 46 | ||||||||
| Konecny et al., 2004 Grp1: EC Grp 2: ET | Grp1 | N | CR+PR(95%CI) | SD | PD | NE | Test | p | Comments by MV logistic regression, adjusted Tx*HER2 interaction: p=0.256 | |
| HER2+ | 97 | 60(51–70) | chi sq | .004 | ||||||
| HER2- | 178 | 41(34–49) | ||||||||
| Grp1 | ||||||||||
HER2+ | 49 | 45(32–60) | chi sq | .130 | ||||||
HER2- | 88 | 33(22–43) | ||||||||
| Grp2 | ||||||||||
HER2+ | 48 | 76(63–88) | chi sq | .005 | ||||||
HER2- | 90 | 50(39–61) | ||||||||
| HER2+ | ||||||||||
Grp1 | 49 | 45(32–60) | chi sq | .004 | by MV logistic regression, OR=3.64 CI: 1.48–8.92, p=0.005 | |||||
Grp2 | 48 | 76(63–88) | ||||||||
| HER2- | ||||||||||
Grp1 | 88 | 33(22–43) | chi sq | .002 | by MV logistic regression, OR=1.92 CI: 1.01–3.64, p=0.046 | |||||
Grp2 | 90 | 50(39–61) | ||||||||
Abbreviations: ERD: extensive residual disease; MRD: minimal residual disease; NE: not evaluable; NS: not significant; OR: overall response (pCR + minimal residual disease + PR); ORR: overall response rate; PD: progressive disease; RR: relative risk; SD: stable disease; TR: tumor response (pCR + minimal residual disease).
OS, HER2- (n=406 multiple; n=200, one): HR=0.69, 95 percent CI: 0.52–0.92; p=.01
OS, HER2+ (n=85 multiple; n=55, one): HR=1.15, 95 percent CI: 0.62–1.54; p, NS
DFS, HER2- (n=406 multiple; n=200, one): HR=0.57, 95 percent CI: 0.46–0.72; p<.0001
DFS, HER2+ (n= 85 multiple; n=55, one): HR=0.77, 95 percent CI: 0.51–1.16; p, NS
The Yang, Klos, Zhou, et al. (2003) uncontrolled series (n=94) reported that at 5 years, DFS in the HER2-negative subgroup was superior to DFS in the HER2-positive subgroup (n=60, 86 percent versus n=34, 53 percent; log rank p<.1; stratified log rank, p=.002 after adjustment for nodal status).
Studies on regimens with versus without an anthracycline. Only one (Pritchard, Shepherd, O'Malley, et al., 2006) of four included randomized, controlled trials comparing regimens with versus without an anthracycline reported superior outcomes with the anthracycline regimen that reached statistical significance for HER2-positive but not HER2-negative patients. Pritchard, Shepherd, O'Malley, et al. (2006) used multivariate analysis (MVA) to test for an interaction of comparative treatment effect with HER2 status. The study compared CEF versus CMF and reported the following results for OS and relapse-free survival (RFS):
OS, HER2- (n=237, CEF; n=228, CMF): HR=1.06, 95 percent CI: 0.83–1.44; p, NS
OS, HER2+ (n=75, CEF; n=88, CMF): HR=0.65, 95 percent CI: 0.42–1.02; p=.06
OS, treatment by HER2 interaction from MVA: HR=2.04, 95 percent CI: 1.14–3.65, p=.02
RFS, HER2- (n=237, CEF; n=228, CMF): HR=0.91, 95 percent CI: 0.71–1.18; p, NS
RFS, HER2+ (n=75, CEF; n=88, CMF): HR=0.52, 95 percent CI: 0.34–0.80; p=.003
RFS, treatment by HER2 interaction from MVA: HR=1.96, 95 percent CI: 1.15–3.65; p=0.01
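As a rough consistency check on subgroup results like those listed above, a two-sided p value can be approximately recovered from a reported hazard ratio and its 95 percent confidence interval. The sketch below assumes the interval is a Wald interval that is symmetric on the log scale; that assumption, and the helper name `p_from_hr_ci`, are introduced here for illustration and are not necessarily what Pritchard and colleagues used.

```python
import math

def p_from_hr_ci(hr, lower, upper, z_crit=1.96):
    """Approximate two-sided p value from an HR and its 95% CI.

    Assumes a Wald interval symmetric on the log scale (an assumption,
    not something the source report states).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)  # SE of ln(HR)
    z = math.log(hr) / se                                    # Wald statistic
    return math.erfc(abs(z) / math.sqrt(2))                  # two-sided normal p

# Values as reported for the HER2-positive subgroup (CEF versus CMF)
p_os = p_from_hr_ci(0.65, 0.42, 1.02)   # OS
p_rfs = p_from_hr_ci(0.52, 0.34, 0.80)  # RFS
print(f"OS p ~ {p_os:.2f}, RFS p ~ {p_rfs:.3f}")
```

Under these assumptions the recovered values are close to the reported p=.06 for OS and p=.003 for RFS. The OS interval spans 1.00 and the RFS interval does not, which matches the two sides of the .05 threshold.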
The other trials reported no statistically significant differences for any subgroups they compared. Moliterni, Menard, Valagussa, et al. (2003) compared CMF alone versus CMF followed by doxorubicin (CMF→A) in HER2-positive (n=50, CMF; n=45, CMF→A) and HER2-negative (n=208, CMF; n=203, CMF→A) subgroups. Confidence intervals spanned 1.00 and HRs were not statistically significant for either outcome (OS, RFS) in either subgroup. With Cox MVA, treatment by HER2 interaction terms were:
OS: HR=0.48, p=.052
RFS: HR=0.68, p, NS
Colozza, Sidoni, Mosconi, et al. (2005) compared CMF versus epirubicin alone (E), in HER2-positive (n=37, CMF; n=54, E) and HER2-negative (n=96, CMF; n=79, E) subgroups. Log rank analyses of Kaplan-Meier survival curves showed a statistically significant difference in OS at 8 years after CMF favoring HER2-negative over HER2-positive patients: (87.4 +/- 3.4) percent versus (67.6 +/- 7.7) percent, p=.024. All other subgroup comparisons were not statistically significant, and Cox MVA interaction terms for treatment effect by HER2 status also were not statistically significant.
Knoop, Knudsen, Balslev, et al. (2005) compared CMF versus CEF in HER2-positive (n=143, CMF; n=120, CEF) and HER2-negative (n=293, CMF; n=249, CEF) subgroups. For both OS and RFS, hazard ratios from Cox multivariate analyses (stratified by tumor grade, estrogen receptor and TOP2A status; and adjusted for tumor size, nodal and menopausal status) uniformly spanned 1.00 and were not statistically significant for either HER2-positive or HER2-negative subgroups.
The Tanner, Isola, Wiklund, et al. (2006) study showed separate Kaplan-Meier curves for HER2-positive (n=56) and HER2-negative (n=124) subgroups from the tailored FEC arm for both OS and RFS. However, they did not report statistical significance of differences between these HER2 status subgroups (although they reported statistical significance of differences between HER2 status subgroups treated by HDC/AuSCS versus subgroups treated with tailored FEC).
Studies on dose or dose intensity of anthracycline-based regimens. In one of two included studies, multivariate proportional hazards analysis showed statistically significant interaction of anthracycline-based regimen dose or dose-intensity with HER2 status to predict outcome. Dressler, Berry, Broadwater, et al. (2005) compared DFS after high-, moderate-, or low-dose CAF regimens in HER2-positive and HER2-negative subgroups. They reported separate MVAs using FISH, IHC, or PCR to classify patients' HER2 status. Results for DFS at five years comparing high-dose versus low-dose plus moderate-dose CAF subgroups were:
HER2/FISH (n=91, HER2+; n=433, HER2-): HR=0.822 (95 percent CI: 0.553–1.220)
HER2/IHC (n=127, HER2+; n=396, HER2-): HR=0.834 (95 percent CI: 0.590–1.181)
HER2/PCR (n=91, HER2+; n=400, HER2-): HR=0.732 (95 percent CI: 0.507–1.056)
HER2/FISH, interaction CAF dose by HER2: HR=0.919 (95 percent CI: 0.814–1.038); p=.033
HER2/IHC, interaction CAF dose by HER2: HR=0.418 (95 percent CI: 0.188–0.930); p=.0003
HER2/PCR, interaction CAF dose by HER2: HR=0.585 (95 percent CI: 0.253–1.352); p=.043
Investigators stated (but did not report HRs, CIs, or p values) that MVA yielded similar results for statistically significant interaction of CAF dose with HER2 status to predict OS.
Del Mastro, Bruzzi, Nicolo, et al. (2005) compared outcomes after identical doses of FEC administered every 14 days (FEC14) or every 21 days (FEC21). Multivariate proportional hazards analysis showed that interaction terms for HER2 status by randomly assigned treatment (dose intensity or treatment frequency) were not statistically significant for EFS (HR=0.53; p=.12) or OS (HR=0.646; p=.379). HER2 status (HER2-positive, n=103; HER2-negative, n=628) was statistically significant to predict EFS (HR=2.04, p=.005) and OS (HR=2.41, p=.006), while randomly assigned treatment (FEC14, n=370; FEC21, n=361) was not statistically significant to predict either outcome (EFS, HR=0.85, p=.335; OS, HR=0.72, p=.379).
Studies on regimens with versus without a taxane. One of two included studies reported statistically significant interaction of HER2 status with added paclitaxel to predict treatment outcome. Hayes, Thor, Dressler, et al. (2007) compared outcomes with versus without paclitaxel (following AC) in HER2-negative and HER2-positive subgroups, separately for each of two groups they randomly selected for HER2 testing. For each group, OS and DFS for HER2-positive patients given paclitaxel were superior to the same outcomes in HER2-positive patients not given paclitaxel. In contrast, OS and DFS for HER2-negative patients given paclitaxel appeared similar to the same outcomes for HER2-negative patients not given paclitaxel. They used Cox multivariate analyses, separately in each randomly selected group, and in the two groups combined, to test the statistical significance of an interaction term for HER2 positivity and paclitaxel treatment. Results for Group 2 and for Groups 1 and 2 pooled showed a statistically significant interaction favoring paclitaxel treatment in HER2-positive patients:
Group 1, n=643: recurrence, HR=0.63, p=.15; death, HR=0.61, p=.17
Group 2, n=679: recurrence, HR=0.52, p=.03; death, HR=0.52, p=.03
Groups 1+2, n=1,322: recurrence, HR=0.59, p=.01; death, HR=0.57, p=.01
Hayes, Thor, Dressler, et al. (2007) also investigated whether patients' estrogen-receptor status modified the impact of HER2 status on outcomes of paclitaxel. The researchers reported results of an exploratory analysis suggesting that, among HER2-positive patients, paclitaxel improved DFS whether patients were estrogen-receptor negative or positive. However, among HER2-negative patients, paclitaxel apparently improved DFS for ER-negative patients but not for ER-positive patients. HER2-negative, ER-positive patients comprised more than 50 percent of the patients in this study. However, the authors caution that additional prospective studies are needed to validate this finding before clinical practice changes and HER2-negative, ER-positive patients are no longer offered taxanes.
Martin, Pienkowski, Mackey, et al. (2005) compared DFS in patients randomized to AC plus docetaxel (TAC, n=745; HER2 positive, 155; HER2 negative, 475; HER2 unknown, 115) versus AC plus fluorouracil (FAC, n=746; HER2 positive, 164; HER2 negative, 468; HER2 unknown, 114). Subgroup analyses using a Cox proportional hazards model adjusted for age, tumor size and other prognostic factors showed superior outcomes with TAC compared to FAC for all subgroups, including by known HER2 status. A test for interaction of HER2 status with treatment effect, using the ratio of hazard ratios, was not statistically significant (ratio of HRs=0.85; p=.41).
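A test based on the ratio of hazard ratios can be sketched as follows. This is one common formulation, assuming the two HER2 subgroups are independent and the log hazard ratios are approximately normal. The symbols are introduced here for illustration, and the source does not state that Martin and colleagues used exactly this calculation or this orientation of the ratio.

```latex
\[
\mathrm{RHR} = \frac{\mathrm{HR}_{\mathrm{HER2+}}}{\mathrm{HR}_{\mathrm{HER2-}}},
\qquad
\mathrm{SE}\!\left(\ln \mathrm{RHR}\right)
  = \sqrt{\mathrm{SE}\!\left(\ln \mathrm{HR}_{\mathrm{HER2+}}\right)^{2}
        + \mathrm{SE}\!\left(\ln \mathrm{HR}_{\mathrm{HER2-}}\right)^{2}},
\qquad
z = \frac{\ln \mathrm{RHR}}{\mathrm{SE}\!\left(\ln \mathrm{RHR}\right)}
\]
```

A ratio close to 1.00 with a two-sided p value well above .05, as reported here (0.85; p=.41), gives no evidence that the treatment effect differs by HER2 status.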
Six studies on preoperative neoadjuvant chemotherapy. The primary outcome of interest for studies on neoadjuvant (preoperative) therapy is pathologic complete (pCR) and partial (PR) response rates, although clinical responses (cCR, cPR) also are considered. One randomized, controlled trial compared responses after neoadjuvant chemotherapy regimens (AC) with versus without added docetaxel (AC+D) (Learn, Yeh, McNutt, et al., 2005). Rates of pCR were similar with each regimen for HER2-positive (22 percent of each subgroup; AC, n=32; AC+D, n=9) and HER2-negative (24 percent of each subgroup; AC, n=37; AC+D, n=26) patients. Multivariate logistic regression analysis of overall clinical responses (ORR = cCR+cPR) showed a statistically significant increase with added docetaxel in HER2-negative patients (AC, ORR=51 percent; AC+D, ORR=81 percent; p<.05) but not in HER2-positive patients (AC, ORR=75 percent; AC+D, ORR=78 percent; p, NS). However, investigators did not report inclusion of an interaction term in their analysis.
Although two (Zhang, Yang, Smith, et al., 2003; Tulbah, Ibrahim, Ezzat, et al., 2002) of five uncontrolled series did report OS and/or DFS outcomes, these may have been influenced by postsurgical treatments that were not identical for all patients. Three of five series reported statistically significantly higher likelihood of response in the HER2-positive subgroups. Arriola, Moreno, Varela, et al. (2006) evaluated clinical and pathologic responses after preoperative treatment with doxorubicin alone. Although they did not report response rates for the HER2-positive (n=43) and HER2-negative (n=180) subgroups, a Mann-Whitney U test showed p=.03 for association of HER2 positivity with pCR. Park, Kim, Lim, et al. (2003) also investigated preoperative therapy with doxorubicin alone. They reported statistically significantly higher pCR (16 percent versus 0) and PR (71 percent versus 47 percent) in the HER2-positive (n=31) than the HER2-negative (n=36) subgroups, p=.013 by Fisher's exact test.
The study reported by Tinari, Lattanzio, Natoli, et al. (2006) compared marker assay results in paired core biopsy specimens (pre-chemotherapy) and resected tumors (post-chemotherapy), and focused primarily on changes induced by anthracycline-based neoadjuvant chemotherapy in HER2 and topoisomerase IIα (TopIIα) expression. However, they also used multivariate logistic regression analysis to compare pathologic tumor responses (TR, defined as either a pCR or minimal residual disease) in HER2 subgroups by core biopsy assays. Tinari and colleagues (2006) reported that the likelihood of achieving TR was 5.28-fold higher (95 percent CI: 1.57–19.6; p=.008) in HER2-positive than in HER2-negative patients.
Zhang, Yang, Smith, et al. (2003) investigated preoperative FAC in HER2-positive (n=28) and HER2-negative (n=69) patients. While overall clinical response rate was higher for the HER2-positive than the HER2-negative subgroup (CR+PR: 93 percent versus 78 percent), the risk ratio for response was not statistically significant (RR=1.2, 95 percent CI: 1.1–1.4, p=.14, Fisher's exact test). Overall pathologic response rates (pCR plus minimal residual disease, MRD) showed an even smaller difference between HER2-positive and HER2-negative subgroups that also was not statistically significant (18 percent versus 13 percent, RR=1.4, 95 percent CI: 0.54–3.67, p=.53, Fisher's exact test). Tulbah, Ibrahim, Ezzat, et al. (2002) investigated preoperative paclitaxel plus cisplatin in HER2-positive (n=21) and HER2-negative (n=31) subgroups. Pathologic complete response rates did not differ significantly between the groups (29 percent versus 23 percent; p=NS).
Three studies on chemotherapy for advanced or metastatic breast cancer. One of three studies did not compare different regimens and pooled data across arms randomized to different paclitaxel doses (Harris, Broadwater, Lin, et al., 2006); one compared monotherapy with doxorubicin (A) versus monotherapy with docetaxel (T) (Di Leo, Chan, Paesmans, et al., 2004); and one compared epirubicin plus cyclophosphamide (EC) versus epirubicin plus paclitaxel (ET) (Konecny, Thomssen, Luck, et al., 2004).
Harris, Broadwater, Lin, et al. (2006) used log rank analysis to compare Kaplan-Meier curves for OS between HER2-positive and HER2-negative patients, separately for test results by three different HER2 assays: CB11 IHC, the HercepTest™ IHC, and FISH. Differences between the curves were not statistically significant for any comparison. They also compared overall response rates (ORR=CR+PR) for subgroups defined by each HER2 assay. Results were statistically significant (HER2-positive, n=46, ORR=35 percent; HER2-negative, n=105, ORR=18 percent; p=.026) only with the HercepTest™ assay, and only when both 2+ and 3+ scores were considered HER2 positive.
Di Leo, Chan, Paesmans, et al. (2004) compared OS and time to progression (TTP) in patients randomized to A or T in HER2-positive (A, n=15; T, n=21) and HER2-negative (A, n=63; T, n=50) subgroups. There were no statistically significant differences between treatment arms for either outcome in either HER2 status subgroup. In contrast, ORR statistically significantly favored T over A in the HER2-positive subgroup (T, n=21, ORR=67 percent versus A, n=15, ORR=27 percent; OR=5.50, 95 percent CI: 1.28–23.69; p=.04). However, the difference was not statistically significant for the HER2-negative subgroup (T, n=50, ORR=40 percent versus A, n=63, ORR=35 percent; OR=1.24, 95 percent CI: 0.58–2.68; p=.70).
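The odds ratios quoted for Di Leo and colleagues can be illustrated with a short calculation. The sketch below is a minimal example that assumes responder counts back-calculated from the reported percentages and a Woolf (log-scale) confidence interval. Neither the counts nor the interval method is stated in the source, and the function name `odds_ratio_ci` is hypothetical.

```python
import math

def odds_ratio_ci(resp_a, n_a, resp_b, n_b, z_crit=1.96):
    """Odds ratio of response (arm a versus arm b) with a Woolf (log) 95% CI."""
    a, b = resp_a, n_a - resp_a   # responders / non-responders, arm a
    c, d = resp_b, n_b - resp_b   # responders / non-responders, arm b
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(OR)
    lo = math.exp(math.log(or_) - z_crit * se)
    hi = math.exp(math.log(or_) + z_crit * se)
    return or_, lo, hi

# Responder counts are back-calculated from the reported percentages
# (an assumption): 67% of 21 -> 14, 27% of 15 -> 4, 40% of 50 -> 20, 35% of 63 -> 22.
her2_pos = odds_ratio_ci(14, 21, 4, 15)    # T versus A, HER2-positive
her2_neg = odds_ratio_ci(20, 50, 22, 63)   # T versus A, HER2-negative
print("HER2+: OR=%.2f (%.2f-%.2f)" % her2_pos)
print("HER2-: OR=%.2f (%.2f-%.2f)" % her2_neg)
```

With these assumed counts, the output agrees with the reported OR=5.50 (1.28–23.69) and OR=1.24 (0.58–2.68). The interval for the HER2-positive subgroup is wide because it rests on only 36 patients.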
Konecny, Thomssen, Luck, et al. (2004) compared HER2-positive (EC, n=49; ET, n=48) and HER2-negative (EC, n=88; ET, n=90) subgroups randomized to EC or ET for OS and PFS. With the EC regimen, OS (median, 33.1 versus 16.4 months, log rank p=.01) and PFS (median, 10.4 versus 7.1 months, log rank p=.01) were significantly greater among HER2-negative than among HER2-positive patients. In each other comparison (OS or PFS; for the ET regimen by HER2 status, or for EC versus ET separately in subgroups by HER2 status) the difference was not statistically significant. Univariate chi square tests suggested each ORR difference was statistically significant (between all HER2-positive versus all HER2-negative patients, and separately by treatment arm and HER2 status subgroups; excluding those randomized to EC by HER2 subgroups). However, the interaction of treatment effect with HER2 status was not statistically significant (p=.256) by multivariate logistic regression.
Across all three treatment settings (adjuvant, neoadjuvant, advanced/metastatic), currently available evidence comparing chemotherapy outcomes in HER2-positive and HER2-negative patient subgroups may be used to generate hypotheses, but is too weak to test hypotheses. Only one study (on adjuvant therapy; Martin, Pienkowski, Mackey, et al., 2005) is from a randomized, controlled trial that prespecified a multivariate subgroup analysis by HER2 status. Investigators reported the interaction of assigned treatment (TAC versus FAC, that is, with versus without docetaxel) with HER2 status to predict outcome was not statistically significant (ratio of HRs=0.85; p=.41).
All other evidence is from post-hoc analyses on subgroups not directly randomized, selected, or stratified by HER2 status. All other reports from randomized, controlled trials were secondary or correlative analyses on patient subgroups with archived tissue samples available for HER2 testing. Many compared baseline characteristics and prognostic factors of patients with known versus unknown HER2 status, sometimes separately by treatment arm, but more often pooled across treatment arms. However, since few directly compared baseline characteristics and prognostic factors for HER2-positive and HER2-negative subgroups separately from each arm, it is uncertain whether these subgroups were well balanced. A minority of studies reported multivariate analyses that tested the statistical significance of interactions between treatment effects of different regimens and HER2 status.
Evidence on adjuvant CMF chemotherapy. Evidence from two studies (one randomized, controlled trial and one series) suggests HER2-positive patients may derive quantitatively smaller benefit from CMF (smaller improvements in OS and DFS) than experienced by HER2-negative patients. However, such evidence cannot prove that CMF provides no benefit to HER2-positive patients.
Evidence on adjuvant anthracycline therapy. An analysis from one of four randomized, controlled trials reports a statistically significant interaction between use of a regimen that includes an anthracycline and HER2 status as outcome predictors. Data from this study suggest HER2-positive patients (but not HER2-negative patients) experience a statistically significant improvement in outcome from inclusion of an anthracycline in their treatment regimen. Again, this does not prove that HER2-negative patients do not benefit from anthracycline therapy. Given the highly statistically significant result favoring anthracycline therapy for the large population of breast cancer patients included in the Early Breast Cancer Trialists' Collaborative Group (EBCTCG 2005) overview analysis, a more complete test of this hypothesis is needed before one can conclude that omitting anthracyclines from adjuvant chemotherapy regimens does not worsen outcome in HER2-negative patients. The absence of a statistically significant interaction in three other randomized, controlled trials is not informative, given the differences in specific treatment regimens, populations studied, and small numbers in the HER2-positive subgroups.
Two trials compared different doses or dose intensities (frequencies) of anthracycline-based regimens. One (Dressler, Berry, Broadwater, et al., 2005) reported a statistically significant interaction of CAF dose with HER2 status to predict treatment outcome, whether HER2 status was based on FISH, IHC, or PCR assays. Data from this study suggested the highest of three CAF doses (now considered by many oncologists the standard dose for all patients) improved outcomes for HER2-positive patients, but suggested no benefit from the highest dose for HER2-negative patients. In contrast, the interaction of dose intensity (frequency) with HER2 status to predict treatment outcome was not statistically significant in a second randomized, controlled trial (Del Mastro, Bruzzi, Nicolo, et al., 2005). Available data are too weak to conclude that HER2-positive patients clearly experience better outcomes with the higher-dose or dose-intensity anthracycline-based regimens.
Evidence on adding paclitaxel to adjuvant AC chemotherapy. A correlative analysis from one randomized, controlled trial (Hayes, Thor, Dressler, et al., 2007) provides evidence that adding paclitaxel after AC improves OS and DFS for HER2-positive patients, but may not improve these outcomes for HER2-negative patients. Here again, these strongly suggestive data are too weak by themselves to conclude that use of paclitaxel in adjuvant regimens is not beneficial in HER2-negative patients. Additionally, the only trial with a prespecified multivariate subgroup analysis (Martin, Pienkowski, Mackey, et al., 2005) reported that the interaction of concurrently administered docetaxel with HER2 status was not statistically significant.
The potential interaction between HER2 status, estrogen receptor status, and progesterone receptor status as predictors of chemotherapy efficacy is receiving increasing attention. The Hayes, Thor, Dressler, et al. (2007) article is the only included study on chemotherapy for breast cancer that addresses this issue, although the analysis only includes HER2 status and ER status. In an exploratory analysis, the authors found that adding paclitaxel improved survival for all HER2-positive patients and for HER2-negative/ER-negative patients, but not for HER2-negative/ER-positive patients. As discussed in the Conclusions and Discussion for Chapter 2, many researchers are investigating breast cancer subtypes identified by different combinations of ER, PR, and HER2, including the so-called “triple-negative” subtype (i.e., negative for HER2, estrogen receptor, and progesterone receptor), and the luminal subtypes (luminal A or luminal B) that are negative for HER2 but positive for at least one of the hormone receptors. There is evidence that the triple-negative and luminal subsets differ with respect to prognosis, chemotherapy response, and outcomes (Carey, Dees, Sawyer, et al., 2007; Liedtke, Mazouni, Hess, et al., 2008), and they clearly differ with respect to effects of endocrine therapy. New phase III trials for patients with triple-negative or “basal-like” breast cancer (Kilburn, 2008) should provide more insight in the future.
Systematic reviews on adjuvant chemotherapy. Recent systematic reviews and meta-analyses on HER2 status to predict chemotherapy outcomes were reported by Gennari and colleagues (Gennari, Sormani, Pronzato, et al., 2008) and by Pritchard and colleagues (Pritchard, Messersmith, Elavathil, et al., 2008; Dhesy-Thind, Pritchard, Messersmith, et al., 2008). Gennari and co-workers (2008) pooled data from eight randomized trials that compared adjuvant regimens with versus without an anthracycline (four of which did not meet selection criteria for this review). Two (NSABP B11, Paik, Bryant, Park, et al., 1998; NSABP B15, Paik, Bryant, Tan-Chiu, et al., 2000) considered patients HER2-positive if membranes of any tumor cells showed antibody staining by IHC, a threshold for HER2 positivity inconsistent with the ASCO/CAP and NCCN guidelines. Substantial numbers of patients from these early (but otherwise well done) randomized, controlled trials may have been classified as HER2 positive who would now be classified as HER2 negative using the currently recommended thresholds. Thus, pooling data from these analyses with later analyses that used current IHC scoring criteria to classify patients may bias the outcome comparisons. We excluded a third study included by Gennari and colleagues (2008) since it was only published as an abstract, without slides available on the web (De Laurentiis, Caputo, Massarelli, et al., 2001). We excluded a fourth study they included (Di Leo, Gancberg, Larsimont, et al., 2002), since patients were not treated identically within each arm and patients with unknown hormone receptor status were given tamoxifen. We first replicated the Gennari, Sormani, Pronzato, et al. (2008) meta-analysis using the same studies the authors included and obtained the same results. We then repeated the analysis including only the studies that met criteria for the current review, which meant excluding the four studies mentioned above. Removing these studies widened the confidence intervals, but did not alter the overall conclusions.
The systematic reviews and meta-analyses reported by Pritchard and colleagues (Pritchard, Messersmith, Elavathil, et al., 2008; Dhesy-Thind, Pritchard, Messersmith, et al., 2008) also included randomized, controlled trials that did not meet selection criteria for this review. In addition to the four discussed above, we excluded three trials on anthracycline-based regimens that were reported only as meeting abstracts but without slides, audio or video available on the web to provide full access to presented data (Petruzelka, Pribylova, Vedralova, et al., 2000; Vera, Albanell, Lirola, et al., 1999; Arnould, Fargeot, Bonneterre, et al., 2003; Bonneterre, Roche, Kerbrat, et al., 2003). We also excluded one fully published study in which patients were not treated identically within each arm (Di Leo, Larsimont, Gancberg, et al., 2001) and a second fully published study on high-dose chemotherapy with autologous stem-cell transplant that did not report data by HER2 status separately for the conventional-dose arm (Rodenhuis, Bontenbal, van Hoesel, et al., 2006).
The Gennari and co-workers (2008) meta-analysis reports statistically significant improvement in DFS (six trials included) and OS (seven trials included) of HER2-positive patients given an anthracycline compared to the same outcomes for HER2-positive patients not given an anthracycline (HR for relapse=0.71, 95 percent CI: 0.61–0.83; p<.001; HR for death=0.73, 95 percent CI: 0.62–0.85; p<.001). In contrast, including an anthracycline apparently did not statistically significantly improve DFS or OS for patients with HER2-negative disease (HR for relapse=1.00, 95 percent CI: 0.90–1.11; p=.75; HR for death=1.03, 95 percent CI: 0.92–1.16; p=.60). The meta-analysis reported by Pritchard and co-workers (2008) included the same six trials for DFS and the same seven trials for OS, and reported the same pooled results (hazard ratios, confidence intervals) as Gennari and co-workers (2008). These analyses support the need for more definitive tests of the hypothesis that the balance of potential benefit versus harm of anthracyclines in HER2-negative patients may not justify their use. Furthermore, as discussed in Key Question 2 and in this section, future analyses and new studies should probably subdivide the HER2-negative group, and analyze subsets who are triple-negative (or “basal-like”) separately from those who are positive for one or both hormone receptors (luminal A or B).
Pritchard, Messersmith, Elavathil, et al. (2008) also reported a meta-analysis on DFS that included three randomized, controlled trials comparing higher-dose or intensity versus lower-dose or intensity anthracycline regimens: two are included here (Dressler, Berry, Broadwater, et al., 2005; Del Mastro, Bruzzi, Nicolo, et al., 2005), and one we excluded (Di Leo, Larsimont, Gancberg, et al., 2001). They found significant improvement of DFS at higher doses for HER2-positive patients (HR=0.54; 95 percent CI: 0.38–0.79) but not for HER2-negative patients (HR=0.98; 95 percent CI: 0.78–1.22). However, a test for the interaction of anthracycline regimen dose or dose intensity with HER2 status to predict DFS was not statistically significant. Thus, present evidence is too weak to support conclusions about HER2 status as a sole predictor of differences in outcome between higher- and lower-dose anthracycline-based regimens. Longer-term data on potential toxicities (particularly decreased ejection fraction and congestive heart failure) of the higher doses are also needed.
Pritchard, Messersmith, Elavathil, et al. (2008) reported on a final meta-analysis that pooled results on DFS from two randomized, controlled trials on adjuvant therapy (Hayes, Thor, Dressler, et al., 2007; Martin, Pienkowski, Mackey, et al., 2005) and one on neoadjuvant therapy (Learn, Yeh, McNutt, et al., 2005) that compared taxane-containing versus non-taxane-containing regimens. While all three trials were included in this systematic review, the validity of pooling them for meta-analysis seems uncertain. Postsurgical therapy in the Learn, Yeh, McNutt, et al. (2005) trial may have affected DFS and may not have been uniform in all three arms. The meta-analytic results suggest the magnitude of benefit from including a taxane in the regimen may be greater for HER2-positive patients (HR=0.60; 95 percent CI: 0.46–0.78) than for HER2-negative patients (HR=0.83; 95 percent CI: 0.71–0.98). However, these results also show statistically significant evidence of benefit for each group from including a taxane in the regimen. Thus, the evidence is presently too weak to support conclusions on HER2 status as a sole predictor of whether or not any subgroup of breast cancer patients benefits from paclitaxel therapy.
These meta-analyses were thorough and used appropriate methodologies. The difference in the trials included in the meta-analyses versus the current systematic review is due to varying prespecified inclusion and exclusion criteria, which are a matter of opinion. The main concern regarding the meta-analyses is their relevance to current practice. The current ASCO/CAP guidelines recommend a different approach to measuring HER2 status than used in the trials incorporated into the meta-analyses, which is why we chose not to perform a formal meta-analysis. Whether and how the change in measurement of HER2 status alters the results of the trials and meta-analyses is unknown since necessary data are unavailable.
Evidence on neoadjuvant chemotherapy. Available evidence on whether HER2 status affects rates of pathologic complete response (pCR) to neoadjuvant chemotherapy is limited to four uncontrolled series (retrospective analysis in three). Although two of the four reported statistically significantly higher pCR rates in HER2-positive than in HER2-negative patients, these data are too weak to conclude that the regimens tested are of no benefit to HER2-negative patients. Furthermore, data are lacking to directly compare any neoadjuvant regimens. Since a number of trials have already compared different neoadjuvant therapies, correlative studies using archived tissue samples may be useful. However, it is also possible that conclusions on relative benefits of different regimens from studies in the adjuvant setting may generalize to the neoadjuvant setting.
Evidence on chemotherapy for advanced disease. Evidence also is limited on differences by HER2 status for outcomes of chemotherapy for advanced or metastatic disease. Three randomized, controlled trials investigated different treatments: one studied paclitaxel alone (at different doses), one studied an anthracycline alone versus a taxane alone, and one studied an anthracycline plus cyclophosphamide versus an anthracycline plus a taxane. Small patient groups limited statistical power.
In summary, although present evidence is suggestive, it is too weak to determine, in any of the adjuvant, neoadjuvant, or metastatic disease settings, whether a more favorable balance of benefit versus risk from chemotherapy can be achieved by selecting patients for anthracycline- or taxane-based regimens based on HER2 status.
Research needs. Future trials that compare adjuvant chemotherapy regimens with versus without an anthracycline, or with versus without a taxane, could determine HER2 status at the time of diagnosis, and stratify randomization by HER2 assay results. This approach might provide more definitive tests for the hypotheses that neither an anthracycline nor a taxane improves outcomes of HER2-negative patients. Another possibility is for the EBCTCG to collect individual patient data on HER2 status using current scoring thresholds from all trials that compared adjuvant regimens with versus without an anthracycline, or with versus without a taxane. If sufficient tumor samples are available, this might be a more efficient and more definitive approach for testing hypotheses on the interaction of HER2 status with assigned treatment to predict outcome. Future analyses should also obtain more complete information on estrogen and progesterone receptor status of all patients. This would enable investigators to further subdivide the HER2-negative subset, so that triple-negatives (or those with “basal-like” breast cancer if gene array data were obtained) can be analyzed separately from the luminal A and B subtypes.
For breast cancer patients, what is the evidence on clinical benefits and harms of using HER2 assay results to guide selection of hormonal therapy?
Of the 219 articles retrieved for Question 3, 66 were assessed for potential relevance to Question 3b. Only six articles met the selection criteria. The primary reasons for article exclusion are as follows: not reporting outcomes identified in selection criteria; not reporting outcomes by HER2 status; nonidentical treatment of patients; measurement of HER2 status inconsistent with current specialty society recommendations; lack of primary data; or inclusion of only HER2-positive patients, only HER2-negative patients, or fewer than 20 HER2-positive cases.
Two of the studies that did not meet the selection criteria were by Berry, Muss, Thor, et al. (2000) and by Ellis, Coop, Singh, et al. (2001). The first uses data from the CALGB 8541 trial, and data from this trial are included in the previous section on chemotherapy for breast cancer. It is excluded here because, while the chemotherapy regimens were randomized across patients, the use of tamoxifen was not. Rather, tamoxifen was prescribed based on clinician preferences. Its use increased over time after recommendations for its use in ER-positive, postmenopausal women were released during the course of the trial and as the percentage of postmenopausal women recruited also rose. Although the study by Ellis, Coop, Singh, et al. (2001) on the neoadjuvant use of letrozole versus tamoxifen reportedly affected clinical practice, it is excluded from this systematic review for two reasons: it reported on clinical response (breast palpation) rather than the more definitive pathological response, and it used a broader definition of HER2 positivity (IHC scores of 2+ and 3+ were designated as positive, without any further evaluation of IHC 2+ scores using FISH).
| Level of Evidence | Study | n | Setting | Treatments | Outcome | Results |
|---|---|---|---|---|---|---|
| RCT stratified on HER2 status/HER2-guided vs. non-HER2-guided | | | | | | |
| RCT prespecified MV SGA | | | | | | |
| RCT post-hoc MV SGA | Von Minckwitz 2007 | 194 | Neoadjuvant, tumor ≥3 cm, age 18–70 | doxorubicin + docetaxel ± tamoxifen (TAM) | pCR | Univariate: Not reported |
| | | | | | | Log reg: IHC HER2 as predictor of pCR p=0.126; HER2*TAM not reported |
| | Rasmussen 2008 | 3533 | Adjuvant postmen HR+ | tamoxifen vs. letrozole | DFS | Univariate: FISH/IHC HER2+ vs. HER2- p<.0001 |
| | | | | | | Cox (a): FISH/IHC HER2*Tx, p=.60 |
| | | | | | TTR | Cox: FISH/IHC HER2*Tx NS |
| | Dowsett 2008 | 1782 | Adjuvant postmen HR+ | tamoxifen vs. anastrozole (ANA) for 5 yrs | TETR | Univariate: FISH/IHC HER2 - vs. + ANA: p<0.0001, TAM: p=.002 |
| | | | | | | Cox: FISH/IHC HER2 - vs. + ANA: p<0.001, TAM: p=.014 |
| | | | | | | “no indication” of greater differential benefit of ANA vs. TAM, but no statistics provided and only 44 events among HER2+ pts, so CIs wide |
| | Ryden 2005, 2007 | 470 | Adjuvant, Stage II, Premen or <50 years | tamoxifen vs. observation | RFS | Univariate: IHC HER2- Tx vs. Cx p=.07 (ER+) |
| | | | | | | IHC HER2+ Tx vs. Cx p=.2 (ER+) |
| | | | | | | FISH HER2- or HER2+ Tx vs. Cx p=.14 (ER+) |
| | | | | | | Cox regression: IHC HER2*TAM p=.4 (ER+); p=.3 (ER+/PR+) |
| | | | | | | FISH/IHC HER2*TAM p=.95 (unclear if ER+ only) |
| | Knoop 2001 | 1515 | Adjuvant, High risk, Postmen | tamoxifen vs. observation | DFS | Univariate: IHC HER2 lo+ or - Tx vs. Cx p=.0001 (HR+) |
| | | | | | | IHC HER2 hi+ Tx vs. Cx p=.5 (HR+) |
| | | | | | | Cox regression: IHC HER2 and HER2*TAM not significant (HR+) |
| RCT treatment by HER2 SGA | | | | | | |
| 1-arm prespecified MV analysis | | | | | | |
| 1-arm post-hoc MV analysis | Arpino 2004 | 136 | Metastatic, 1st line Tx | tamoxifen | ORR | No stat signif diff FISH HER2 + vs. - in CR+PR+SD |
| | | | | | TTF | Univariate: FISH HER2 - vs. + p=.007 |
| | | | | | | Cox regression: HER2+ as predictor TTF p=.54 |
| | | | | | OS | Univariate: FISH HER2 - vs. + p=.07 |
| | | | | | | Cox regression: HER2+ as predictor OS p=.97 |
| 1-arm UV analysis | | | | | | |
(a) Stratified for randomization group and chemotherapy
Abbreviations: DFS: disease-free survival; HR: hazard ratio; MV: multivariate; ORR: overall response rate; OS: overall survival; pCR: pathologic complete response; RCT: randomized, controlled trial; RFS: recurrence-free survival; SGA: subgroup analysis; TETR: time to early tumor recurrence; TTF: time to treatment failure; TTR: time to tumor recurrence; Tx: treatment; UV: univariate analysis;
| Study | Prospective design | Prespecified hypotheses about relation of marker to outcome | Large, well-defined, representative study population | Marker assay methods well-described | Blinded assessment of marker in relation to outcome | Homogeneous treatment(s), either randomized or rule-based selection | Low rate of missing data (≤15%) | Sufficiently long followup | 1) clear candidate variable selection, 2) clear, appropriate model-building guidelines, 3) assumptions tested, 4) standard prognostic variables included, 5) continuous variables well handled, 6) validation | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | 1) | 2) | 3) | 4) | 5) | 6) |
| von Minckwitz et al., 2007 | ? | N | Y | Y | ? | Y | N | Y | Y | N | ? | ? | ? | N |
| Rasmussen et al., 2008, Mauriac et al., 2007 | N | Y | Y | Y | Not reported | Y | N | 51 mos/24 mos | ? | ? | ? | Y | ? | N |
| Dowsett et al., 2008 | N | N | Y | Y | ? | Y | N | Y | Y | ? | ? | ? | ? | N |
| Ryden et al., 2005, 2007 | Y | ? | ? | N | ? | Y | N | Med=14 yrs if no breast event | N | N | Y | ? | N | ? |
| Knoop et al., 2001 | Y | N | Y | N | ? | Y | Y | ? | N | N | ? | Y | N | ? |
| Arpino et al., 2004 | Y | N | Y | Y | ? | Y | N | ? | Y | ? | ? | N | N | ? |
| Study | Therapeutic Setting | Treatment(s) | Age | Extent of Disease | Performance Status Scale Index Result | Hormone Receptor Status (%) | ||
|---|---|---|---|---|---|---|---|---|
| Neoadjuvant Hormonal Therapy | ||||||||
| von Minckwitz et al., 2007, Germany, multicenter RCT, secondary analysis | Primary breast carcinoma ≥ 3 cm largest diameter; no distant metastases, age 18–70 | Doxorubicin + docetaxel ± tamoxifen (TAM); followed by surgery within 14–28 days | Median=48 Range=27–67 Premen: 51%(TAM-) 57%(TAM+) | Positive nodes: 47% (TAM-) 53% (TAM+) | Karnofsky score ≥70% 100% ≥90% 96.3% | ER+ 59.2 (TAM-) 53.1 (TAM+) PR+ 43.9 (TAM-) 34.7 (TAM+) | ||
| Adjuvant Hormonal Therapy | ||||||||
| Rasmussen et al., 2008, Mauriac et al., 2007, international, multicenter RCT, secondary analysis | Postmenopausal women with HR+, early invasive breast cancer, in monotherapy arms of BIG 1–98 trial | Letrozole (LET) vs. TAM; 44%–54% had mastectomy; 21%–32% had chemotherapy | Median=~60 | Positive lymph nodes: 42–47% | Not reported | Median ER=85–90 Median PR=10–70 | ||
| Dowsett et al., 2008, international, multicenter RCT, secondary analysis | Postmenopausal women with operable, invasive breast cancer HR+, in monotherapy arms of ATAC trial. Most from UK. | Anastrozole (ANA) vs. TAM for 5 years; mastectomy, 41%; chemotherapy, 9%; TAM presurgery, 3% | Median=63 | Positive lymph nodes: 30% | Not reported | PR+, 78% | ||
| Ryden et al., 2005, multicenter RCT, secondary analysis | Stage II, premenopausal/<50 yrs | TAM for 2 yrs vs. control; mastectomy or breast-conserving surgery + radiotherapy; <2% pts received adjuvant chemotherapy | Median=45 Range=2–75 | ~70% are node positive; tumor 25 in TAM group vs. 22 in control (p=0.03) | Not reported | TAM vs. Cx: ER-/PR-, 30 vs. 26; ER-/PR+, 8 vs. 10; ER+/PR-, 4 vs. 5; ER+/PR+, 54 vs. 57; not done, 4 vs. 2 (p=0.6) | ||
| Knoop et al., 2001, Denmark, multicenter RCT, secondary analysis | Postmenopausal, “high risk” | Grp 1: TAM 10 mg 3×/day for 1 year (n=868) + radiotherapy Grp 2: Radiotherapy (n=848) | Median=66, Range=45–88 | High risk=positive axillary lymph nodes, tumor >5 cm, or tumor invaded skin or deep fascia | Not reported | ER+: 66% (11% HER2+); PR+: 43% (7% HER2+) | ||
| Metastatic Hormonal Therapy | ||||||||
| Arpino et al., 2004, multicenter, US? PRO | First line, ER+ | TAM 2×/day, 10 mg (n=56) or 10 mg/m2 (n=149). | HER2+: 66% <65 yo, 16% premen; HER2-: 57% <65 yo, 12% premen | Not reported | Not reported | ER+: 100 (HER2+), 100 (HER2-); PR+: 78 (HER2+), 96 (HER2-) | ||
| Study | Time to Event Outcomes | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Neoadjuvant Hormonal Therapy | ||||||||||||||
| von Minckwitz et al., 2007 | Not reported | |||||||||||||
| Adjuvant Hormonal Therapy | ||||||||||||||
| Rasmussen et al., 2008, Mauriac et al., 2007 | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | Test | p | HR (95%CI) | Comments HER2+ vs. HER2- (any Tx): HR=2.09(1.59–2.76) p<0.0001. | ||
| DFS | HER2+ | LET vs. TAM | ||||||||||||
| All | 239 | ~0.95 | ~0.87 | ~0.82 | ~0.75 | HER2+ | ||||||||
| LET | 134 | ~0.97 | ~0.90 | ~0.86 | ~0.79 | 0.62(0.37–1.03) | ||||||||
| TAM | 105 | ~0.94 | ~0.84 | ~0.75 | ~0.70 | |||||||||
| HER2- | ||||||||||||||
| All | 3,294 | ~0.98 | ~0.96 | ~0.91 | ~0.88 | HER2- | ||||||||
| LET | 1,648 | ~0.98 | ~0.97 | ~0.95 | ~0.90 | 0.72(0.59–0.87) | ||||||||
| TAM | 1,646 | ~0.98 | ~0.95 | ~0.90 | ~0.86 | |||||||||
| Dowsett et al., 2008 | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) | Comments “[N]o indication of a greater differential benefit of anastrozole over tamoxifen in the HER-2-positive patients. However, there were only 44 events in the HER-2-positive group, so the CIs are wide.” | |
| TTR | TAM | 837 | ||||||||||||
| HER2- | ~0.01 | ~0.04 | ~0.06 | ~0.08 | ~0.09 | 0.0018 | 2.25 | |||||||
| HER2+ | ~0.01 | ~0.06 | ~0.10 | ~0.15 | ~0.25 | |||||||||
| ANA | 875 | |||||||||||||
| HER2- | ~0.01 | ~0.02 | ~0.04 | ~0.05 | ~0.06 | <0.0001 | 3.27 | |||||||
| HER2+ | ~0.01 | ~0.08 | ~0.12 | ~0.16 | ~0.20 | |||||||||
| Ryden et al., 2005, 2007 ER+ pts | Outcome | Grp | N | Med (mos) | 5 yr | 10yr | 15 yr | Test | p | HR (95%CI) | Comments No stat diff in RFS between HER2+and HER2- pts (measured by IHC or FISH) among untreated pts. VEGFR2 status was predictive of TAM efficacy. Using the combined HER2 measure, there was a TAM effect in the ER+/HER2- group (n=275; HR=0.64, 95%CI: 0.44–0.93, p=0.02), but not in the ER+/HER2+cohort (n=24; HR=0.71, 95%CI: 0.23–2.20, p=0.6). | |||
| RFS | HER2+ (IHC 3+) | |||||||||||||
| Tx | 8 | ~0.7 | ~0.7 | ~0.7 | LR | 0.2 | 0.38 (0.08–1.79) | |||||||
| Cx | 13 | ~0.4 | ~0.4 | ~0.4 | ||||||||||
| RFS | HER2- (IHC 0–2+) | |||||||||||||
| Tx | 115 | ~0.75 | ~0.7 | ~0.65 | LR | 0.07 | 0.69(0.46–1.03) | |||||||
| Cx | 124 | ~0.7 | ~0.6 | ~0.55 | ||||||||||
| RFS | HER2+ (FISH) | |||||||||||||
| Tx | Data not reported | LR | 0.14 | 0.21 (0.03–1.67) | ||||||||||
| Cx | Data not reported | |||||||||||||
| RFS | HER2- (FISH) | |||||||||||||
| Tx | Data not reported | LR | 0.14 | 0.73(0.47–1.12) | ||||||||||
| Cx | Data not reported | |||||||||||||
| Knoop et al., 2001 ER+ or PR+ pts only | Outcome | Grp | N | Med (mos) | 5 yr | 10 yr | Test | p | HR (95%CI) | Comments | ||||
| DFS: | ||||||||||||||
| HER2 - | TAM | Not reported | 57(2) | 34(2) | LR | .0001 | Bonferroni p=.0006 | |||||||
| & low + (n=1,005) | Cx | 43(2) | 26(2) | |||||||||||
| HER2 hi + (n=52) | TAM | Not reported | 63(11) | 37(12) | LR | .5 | Bonferroni p=.5 | |||||||
| Cx | 41(9) | 35(8) | ||||||||||||
| Cox | HER2+ (n=54): RR TAM vs. Cx=0.89 (95%CI:0.63–1.27) | |||||||||||||
| | HER2- (n=998): RR TAM vs. Cx=0.86 (95%CI:0.78–0.93) | |||||||||||||
| MV Cox | HER2 and HER2*TAM: Not significant (p values not reported) | |||||||||||||
| NOTE: Analysis limited to steroid-receptor positive pts. Standard errors in parentheses. LR=log-rank test of differences in DFS probabilities for pts with the variables in question when treated with TAM or not. | ||||||||||||||
| Metastatic Hormonal Therapy | ||||||||||||||
| Arpino et al., 2004 | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | 10yr | Test | p | HR (95%CI) | Comments HER2+ pts had lower median ER levels, even when all pts ER+ |
| TTF | HER2- | 104 | 7 | ~.35 | ~.18 | ~.08 | ~.08 | 0 | 0 | LR | .007 | |||
| HER2+ | 32 | 5 | ~.20 | ~.03 | No line | No line | No line | |||||||
| OS | HER2- | 104 | 31 | ~.85 | ~.60 | ~.50 | ~.25 | ~.20 | ~.05 | LR | .07 | |||
| HER2+ | 32 | 25 | ~.90 | ~.52 | ~.30 | ~.20 | ~.08 | ~.05 | ||||||
| HER2+ as predictor of TTF | MV Cox | 0.54 | 1.15 | adjusted | ||||||||||
| HER2+ as predictor of survival | MV Cox | 0.97 | 0.99 | adjusted | ||||||||||
Abbreviations: ANA: anastrozole; Cx: control; DFS: disease-free survival; HR: hazard ratio; LET: letrozole; LR: log rank; MV: multivariate; OS: overall survival; RR: relative risk; TAM: tamoxifen; TTF: time to treatment failure; TTR: time to tumor recurrence; Tx: treatment;
| Study | Tumor Response (%) | |||||||
|---|---|---|---|---|---|---|---|---|
| Neoadjuvant Hormonal Therapy | ||||||||
| von Minckwitz et al., 2007 | ALL (pCR) | ER+ (pCR) | ER- (pCR) | |||||
| HER2+ | HER2- | HER2+ | HER2- | HER2+ | HER2- | |||
| TAM+ | 0% | 10.7% | 0% | 0% | 0% | 24.2% | ||
| TAM- | 8.7% | 9.6% | 9.1% | 2.2% | 8.3% | 21.4% | ||
| Adjuvant Hormonal Therapy | ||||||||
| Rasmussen et al., 2008, Mauriac et al., 2007 | Not reported | |||||||
| Dowsett et al., 2008 | Not reported | |||||||
| Ryden et al., 2005 | Not reported | |||||||
| Knoop et al., 2001 | Not reported | |||||||
| Metastatic Hormonal Therapy | ||||||||
| Arpino et al., 2004 | Grp | N | cCR+cPR+cSD | PD | NE | Test | p | Comments |
| HER2- | 104 | 56% | 44% | χ2 | NS | |||
| HER2+ | 32 | 47% | 53% | |||||
Abbreviations: cCR: clinical complete response; cPR: clinical partial response; cSD: clinical stable disease; ER: estrogen receptor; NE: not evaluable; NS: not significant; PD: progressive disease; SD: stable disease;
| Study | Design | Therapeutic Setting | n, Enrolled (Randomized) | n, Evaluated | n, Withdrawn (Lost to F/U) | Treatment Regimen (Agents) |
|---|---|---|---|---|---|---|
| Neoadjuvant Hormonal Therapy | ||||||
| None | ||||||
| Adjuvant Hormonal Therapy | ||||||
| Ryden et al 2005, multicenter, Sweden, 1986-1991 | RCT (secondary analysis) | Stage II invasive cancer, premenopausal or <50 years old. Includes HR+ and HR- | 564 | 428 | 136 (64, no specimens; 72, not assessable by IHC). Another 55 not assessable by FISH. Baseline prognostic factors similar in groups with or without specimens. | TAM for 2 yrs vs no TAM; also mastectomy or breast-conserving surgery + radiotherapy; <2% pts received additional adjuvant chemotherapy (polychemotherapy, n=8; goserelin, n=1). Evenly distributed across arms. |
| Knoop et al. 2001, 27 sites, all in Denmark?, 8/77–11/82 | RCT (secondary analysis of Danish Breast Cancer Cooperative Group's 77c protocol) | Adjuvant, postmenopausal, “high risk” (positive axillary lymph nodes, tumor > 5 cm, skin or deep fascia involvement) (Note: eligibility did not depend on hormone receptor status) | 1716 | 1515 | 201 (167, no specimens; 33, unevaluable; 1, unaccounted for. Baseline prognostic factors & outcomes similar in groups with or without specimens.) | TAM thrice daily for 1 year vs observation. All patients treated with mastectomy, lower ALND, and radiotherapy. 8% of the TAM pts were HER2-positive, vs. 14% of the observation arm (p=0.001) (per email from Dr. Knoop) |
| Metastatic Hormonal Therapy | ||||||
| Arpino et al. 2004, multicenter, US?, 1982-1987 | Prospective uncontrolled, SWOG protocol 8228 and ancillary study 9314 | First-line Tx for metastatic disease, ER+; prior adjuvant TAM or chemo completed > 3 mos before relapse | 349 | 136 | 213 (134, no specimens; 7, inevaluable specimens; 4, lost to F/U, 68, assays unsuccessful) | TAM twice daily until disease progression (failure), 10 mg (n=56) or 10 mg/m2 (n=149) |
| Study | Clearly Defined Question | Well-Described Study Population | Well-Described Intervention | Use of Validated Outcome Measures (Independently Assessed) | Appropriate Statistical Analysis | Well-Described Results | Discussion/Conclusions Supported by Data | Funding/Sponsorship Source Acknowledged |
|---|---|---|---|---|---|---|---|---|
| Hormonal Therapy | ||||||||
| Chemotherapy | ||||||||
Patients in the von Minckwitz, Sinn, Raab, et al. (2007) neoadjuvant trial had unilateral primary breast carcinoma at least 3 cm in largest diameter with no distant metastases or inflammatory disease. They comprised 194 of the 250 patients in the GEPARDO [German Preoperative Adriamycin-Docetaxel] trial. The median age was 48 years and 51 percent (control [Cx] group) to 57 percent (tamoxifen [TAM] group) were premenopausal. Forty-seven percent (Cx) to 53 percent (TAM) had clinically positive lymph nodes, and all had a Karnofsky score of at least 70 percent. For hormone-receptor status, 53 percent (TAM) to 59 percent (Cx) were ER positive, while 35 percent (TAM) to 44 percent (Cx) were PR positive. HER2 status was measured centrally using IHC, and a HercepTest™ score of 3+ was considered positive. About 24 percent of the participants were HER2 positive.
Patients in the Rasmussen, Regan, Lykkesfeldt, et al. (2008) study comprised 3,533 of the 4,922 patients in the monotherapy arms of the BIG 1–98 trial. They were postmenopausal with early stage invasive cancer. The median age was around 60 years, and about 37 percent (HER2-negative patients) to 45 percent (HER2-positive patients) had tumors larger than 2 cm. Fewer than half had positive lymph nodes (42 percent for HER2-negative patients; 47 percent for HER2-positive patients). The median estrogen receptor level was 85 for HER2-positive patients and 90 for HER2-negative patients (p<0.0001), while the median progesterone receptor level was 10 in HER2-positive patients and 70 in HER2-negative patients (p<0.0001). HER2 positivity was defined as amplification by FISH or HercepTest™ 3+ by IHC (in 0.5 percent of patients with no FISH result). Seven percent of the patient population was HER2-positive.
Patients in the Dowsett, Allred, Knox, et al. (2008) study comprised 1,782 of the 5,880 patients in the monotherapy arms of the ATAC trial; most were from the United Kingdom. Sixty-seven percent of the patients had prior radiotherapy; 9 percent, prior chemotherapy; and 3 percent, tamoxifen prior to surgery. The median age was 63 years, and all of the women were postmenopausal. Sixty-seven percent had tumors that were no larger than 2 cm; 66 percent had negative lymph nodes; and all were hormone receptor positive (78 percent were PR+). HER2-positivity was defined by a score of 3+ on IHC or 2+ on IHC plus FISH amplification. Ten percent of the patients in the study were HER2-positive.
Patients in the Knoop, Bentzen, Nielsen, et al. (2001) adjuvant study were postmenopausal with a median age of 66 years. They had a high risk of recurrence, defined as having positive axillary lymph node(s), tumor larger than 5 cm diameter, or skin/deep fascial involvement. Sixty-six percent of the patients were estrogen-receptor (ER) positive, and 43 percent, progesterone-receptor (PR) positive.
In the original randomized, controlled trial, the Danish Breast Cancer Cooperative Group's 77c protocol, patients were randomized to receive tamoxifen three times daily for a year or to observation. All patients were also treated with mastectomy, lower axillary lymph node dissection, and radiotherapy. In the secondary analysis, data on HER2 status were available on a subset (n=1,515, 88 percent) of those in the original trial. Eighteen percent of these patients were HER2 positive by IHC, but approximately 11 percent had IHC results roughly comparable to a 3+ score by HercepTest™*. However, the proportions of HER2-positive patients differed between the arms of the trial: 8 percent of patients in the tamoxifen arm were HER2 positive, while 14 percent of those in the control arm were HER2 positive (p=0.001).
Patients in the Ryden, Jirstrom, Bendahl, et al. (2005) and Ryden, Landberg, Stal, et al. (2007) adjuvant trial had Stage II invasive cancer and included 470 of the 564 patients in the original trial. The median age was 45 years, and all were premenopausal or younger than 50 years old. The median tumor size ranged from 22 in the control group to 25 in the tamoxifen group. Both hormone-receptor-positive and hormone-receptor-negative patients were included. Fifty-four percent of patients in the tamoxifen group and 57 percent of patients in the control group were both ER positive and PR positive; 30 percent and 26 percent, respectively, were both ER negative and PR negative; the remainder were either ER negative/PR positive or ER positive/PR negative. Approximately 70 percent of the patients had positive lymph nodes. Patients were randomized to tamoxifen for two years versus no tamoxifen. Patients also underwent mastectomy or breast-conserving surgery plus radiotherapy. Less than 2 percent of patients, evenly distributed across arms in the original trial, received additional chemotherapy (n=8) or goserelin (n=1).
Data on HER2 status were available on 428 patients, or 76 percent of the original trial participants. The authors reported that baseline prognostic factors were similar in the groups with and without archived pathological specimens available for the secondary analysis. HER2 status was measured by FISH, using a cutoff of six signals/tumor cell (13 percent of patients were HER2 positive) and by IHC using a cutoff of 3+ on the HercepTest™ (15 percent were HER2 positive). The correlation between IHC 3+ and FISH amplification was r=0.82 (p<0.001); κ=0.84.
Patients with metastatic disease in the Arpino, Green, Allred, et al. (2004) single-arm study were drawn from the Southwest Oncology Group's (SWOG) protocol 8228 and ancillary study 9314. Approximately 60 percent of the patients were younger than 65 years old, and approximately 14 percent were premenopausal. All patients were ER positive; 78 percent of the HER2-positive and 96 percent of the HER2-negative patients were PR positive. Patients received tamoxifen twice daily as first-line therapy until disease progression.
Data on HER2 status were available on 136 patients, or about 39 percent of the original study participants. HER2 status was measured by FISH with a cutoff of HER2/CEP17 ratio of 2 or more (24 percent of patients were HER2 positive) and by IHC with a cutoff of complete membrane staining in 10 percent or more of tumor cells (21 percent of patients were HER2 positive), but only the FISH results were used in this analysis.
The outcome for the neoadjuvant study (von Minckwitz, Sinn, Raab, et al., 2007) was pathological complete response, and surgery was performed within 14–28 days after chemotherapy was completed. In the two studies on the BIG 1–98 trial, Mauriac, Keshaviah, Debled, et al. (2007) assessed time to early tumor recurrence (TETR), defined as a recurrence within 2 years, which was also the median followup, while Rasmussen, Regan, Lykkesfeldt, et al. (2008) reported on disease-free survival with a median followup of 51 months. In the comparison of anastrozole versus tamoxifen from the ATAC trial, Dowsett, Allred, Knox, et al. (2008) examined time to recurrence; the duration of followup was unclear, possibly 68 months. The only outcome reported in the Knoop, Bentzen, Nielsen, et al. (2001) adjuvant study was disease-free survival (DFS); the duration of followup was not reported, but the tables included estimates of DFS at 10 years. The Ryden, Jirstrom, Bendahl, et al. (2005) adjuvant trial reported only recurrence-free survival (RFS) and had a median followup of 14 years for patients without a breast cancer event. The Arpino, Green, Allred, et al. (2004) uncontrolled study on metastatic disease reported overall response rates (ORR; sum of complete plus partial responses), time to failure (TTF), and overall survival (OS). “Nearly all” of the tumor blocks were more than 10 years old; some were more than 20 years old.
Randomization stratified on HER2/randomized to whether treatment was guided by HER2. No studies of this type were identified.
Randomized trial, prespecified multivariate subgroup analysis. No studies of this type were identified.
Randomized trial, post-hoc multivariate subgroup analysis. Five of the six studies that met the selection criteria were post-hoc analyses of randomized controlled trials. The only neoadjuvant study compared pathological tumor response in patients receiving doxorubicin and docetaxel with or without tamoxifen (von Minckwitz, Sinn, Raab, et al., 2007). The pCR rate among ER-positive and HER2-positive patients was 0 percent for those receiving tamoxifen versus 9 percent for those not receiving it; among HER2-negative patients the corresponding numbers were 24 percent and 21 percent. The numbers were small, however. There were only 25 ER-positive and HER2-positive patients, with 1 pCR, while there were 61 ER-positive but HER2-negative patients, with 14 pCRs. In a multivariate logistic regression model including menopausal status, tumor size, grade, and nodal status, the odds ratio for HER2 was 3.66 (95 percent CI: 0.69–19.30, p=.126). Analysis of the interaction term between HER2 status and treatment group was not reported. Consequently, the study confirms that the prognosis is poorer in HER2-positive patients, but it does not indicate whether or not tamoxifen is more or less effective in HER2-positive versus HER2-negative patients.
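As a rough consistency check on the logistic regression result above, the p value can be approximately recovered from the reported odds ratio and its 95 percent confidence interval. The sketch below assumes a Wald-type interval that is symmetric on the log scale; the function name `p_from_ratio_ci` is illustrative and is not taken from the study.

```python
import math


def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate two-sided Wald p value for a ratio measure (OR or HR).

    Assumes the published 95% CI was computed as exp(log(estimate) +/- z_crit * SE),
    which is an assumption about how the interval was derived.
    """
    # Standard error of the log ratio, backed out of the CI width on the log scale.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Odds ratio for HER2 as a predictor of pCR, as reported in the text above.
se, z, p = p_from_ratio_ci(3.66, 0.69, 19.30)
print(f"SE(log OR)={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Under these assumptions the recovered p value comes out near 0.13, in line with the reported p=.126; small discrepancies are expected from rounding of the published interval.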
Two studies compared the use of an aromatase inhibitor versus tamoxifen. In secondary analyses of the BIG 1–98 trial, disease-free survival and time to early tumor recurrence were examined. Rasmussen, Regan, Lykkesfeldt, et al. (2008) reported a hazard ratio of letrozole versus tamoxifen among HER2-positive patients of 0.62 (95 percent CI: 0.37–1.03) and among HER2-negative patients of 0.72 (95 percent CI: 0.59–0.87). While the numerical values of the hazard ratios are similar, the result for HER2-negative patients is statistically significant, while that for HER2-positive patients is not. The number of HER2-positive patients is 239, much smaller than the 3,294 HER2-negative patients. Mauriac, Keshaviah, Debled, et al. (2007) report that the time to early tumor recurrence does not appear to be statistically significantly different by treatment group in either HER2-positive or HER2-negative patients, and the HER2 status/treatment group interaction term in a multivariate analysis is not statistically significant. Consequently, this study suggests that letrozole increases disease-free survival among HER2-negative patients relative to tamoxifen, but it does not provide evidence on a greater effect among HER2-positive patients.
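The similarity of the two subgroup hazard ratios can be examined with an approximate test of interaction that compares the log hazard ratios, using standard errors backed out of the reported confidence intervals. This is a sketch only: it assumes the HER2-positive and HER2-negative subgroups are independent and that the intervals are symmetric on the log scale, and it is not the Cox model with an interaction term that the investigators fitted. The function names are hypothetical.

```python
import math


def log_ratio_and_se(hr, lower, upper, z_crit=1.959964):
    """Return log(HR) and its standard error, derived from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return math.log(hr), se


def interaction_test(hr_a, ci_a, hr_b, ci_b):
    """Approximate z test for a difference between two independent subgroup HRs."""
    log_a, se_a = log_ratio_and_se(hr_a, *ci_a)
    log_b, se_b = log_ratio_and_se(hr_b, *ci_b)
    diff = log_a - log_b
    # Variances add because the subgroups are assumed to be independent.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(diff), z, p


# Letrozole vs. tamoxifen, DFS, as reported by Rasmussen and colleagues.
ratio_of_hrs, z, p = interaction_test(0.62, (0.37, 1.03), 0.72, (0.59, 0.87))
print(f"ratio of HRs={ratio_of_hrs:.2f}, z={z:.2f}, p={p:.2f}")
```

With the values reported by Rasmussen and colleagues, this approximation gives a p value of roughly 0.6, which agrees with the absence of a statistically significant interaction between HER2 status and treatment.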
In the secondary analysis of the ATAC trial, Dowsett, Allred, Knox, et al. (2008) compare the effect of anastrozole and tamoxifen by HER2 status. They examine time to treatment recurrence by HER2 status and report hazard ratios of HER2-negative versus HER2-positive patients of 2.25 (p=.0018) for anastrozole and 3.27 (p<.0001) for tamoxifen. These results demonstrate that HER2-positive patients have a poorer prognosis than HER2-negative patients but do not compare the effectiveness of each treatment within each HER2 group. The authors report that there is “no indication of a greater differential of anastrozole over tamoxifen in the HER-2-positive patients. However, there were only 44 events in the HER-2-positive group, so the CIs are wide.” No further details of the analysis are provided. The multivariate analysis does not report an interaction term between HER2 status and treatment group.
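The remark that 44 events produce wide confidence intervals can be illustrated with a common approximation, in which the standard error of a log hazard ratio is about 2 divided by the square root of the total number of events. The sketch below is an illustration under that assumption (1:1 allocation, events split roughly evenly between arms), not a reconstruction of the ATAC analysis; the hazard ratio of 1.0 and the comparison value of 400 events are hypothetical inputs chosen only to show interval width.

```python
from math import exp, sqrt


def approx_hr_ci(hr: float, total_events: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95 percent CI for a hazard ratio from the total number of events.

    Uses SE(log HR) ~ 2 / sqrt(total_events), which assumes 1:1 allocation
    and events split roughly evenly between the two arms.
    """
    se_log_hr = 2 / sqrt(total_events)
    return hr * exp(-z * se_log_hr), hr * exp(z * se_log_hr)


# 44 events is the figure quoted in the text; HR = 1.0 and 400 events are
# hypothetical values used only to show how the interval narrows.
for events in (44, 400):
    low, high = approx_hr_ci(1.0, events)
    print(f"{events} events: HR 1.00, approx. 95% CI {low:.2f} to {high:.2f}")
```

Under these assumptions, 44 events give an interval of roughly 0.55 to 1.81 around a hazard ratio of 1.0, so even a sizeable differential between treatments could go undetected in the HER2-positive group.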
| HER2 Status | Treatment group | 5-year DFS (% ± SE) | 10-year DFS (% ± SE) | Log rank p value | p value with Bonferroni correction |
|---|---|---|---|---|---|
| Negative/low-positive (n=1,005) | Tamoxifen | 57% (±2) | 34% (±2) | .0001 | .0006 |
| | Control | 43% (±2) | 26% (±2) | | |
| High-positive (n=52) | Tamoxifen | 63% (±11) | 37% (±12) | .5 | .5 |
| | Control | 41% (±9) | 35% (±8) | | |
Abbreviations: DFS: disease-free survival; SE: standard error.
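The standard errors in the table explain why the high-positive subgroup shows no significant difference despite a large point difference. The sketch below runs a simple two-sided z-test on the 5-year DFS percentages, treating the two arms as independent estimates with the tabulated standard errors. This is an illustrative fixed-time-point comparison, not the log rank test reported in the table, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def compare_percentages(pct_a: float, se_a: float, pct_b: float, se_b: float) -> tuple[float, float]:
    """Two-sided z-test for the difference of two independent estimates with known SEs."""
    z = (pct_a - pct_b) / sqrt(se_a**2 + se_b**2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# 5-year DFS (percent, SE) for tamoxifen and control, taken from the table.
rows = {
    "Negative/low-positive": ((57, 2), (43, 2)),
    "High-positive": ((63, 11), (41, 9)),
}

for label, ((tam, tam_se), (ctl, ctl_se)) in rows.items():
    z, p = compare_percentages(tam, tam_se, ctl, ctl_se)
    print(f"{label}: difference {tam - ctl} points, z = {z:.2f}, p = {p:.3g}")
```

The absolute difference is larger in the high-positive subgroup (22 points versus 14), yet only the larger subgroup reaches significance, because standard errors of 9 to 11 points dominate the comparison when n=52.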
A multivariate Cox model was constructed that included tumor size, proportion node positive, histologic grade, p53 value, EGFR, HER2, tamoxifen, and interactions between tamoxifen and p53, HER2, and EGFR. The coefficients for HER2 and for the interaction term for HER2 and tamoxifen were not statistically significant (specific p values and coefficients not reported for these variables). Proportion node positive (RR=1.011), grade (RR=1.103), p53 (RR=1.54), and tamoxifen (RR=0.73) were statistically significant at p<.01. In other words, after controlling for other variables, HER2 was not a statistically significant predictor of outcomes of treatment with tamoxifen in this study.
| HER2 Status | IHC: log rank p value | IHC: hazard ratio TAM vs. Cx (95% CI) | FISH: log rank p value | FISH: hazard ratio TAM vs. Cx (95% CI) |
|---|---|---|---|---|
| HER2- (n=239)a | .07 | 0.69 (0.46–1.03) | .14 | 0.73 (0.47–1.12) |
| HER2+ (n=21)b | .2 | 0.38 (0.08–1.79) | .14 | 0.21 (0.03–1.67) |
a: IHC 0–2+ or FISH nonamplified
b: IHC 3+ or FISH amplified
Abbreviations: Cx: control; TAM: tamoxifen
The authors also reported that among untreated patients, the difference in outcome between HER2-positive and HER2-negative patients (measured with either IHC or FISH; in both univariate and multivariate Cox proportional hazard models) was not statistically significant. In contrast, the marker VEGFR2 was a statistically significant predictor of outcome of tamoxifen treatment. In a univariate analysis among ER-positive/PR-positive patients with HER2 status measured using IHC, the duration of RFS was longer among tamoxifen-treated patients than controls in the HER2-negative subgroups (p=.03) but not among HER2-positive (p=.3) patients.
In a multivariate Cox model, the interaction term between treatment (tamoxifen versus control) and HER2 status was not statistically significant when the model was run for ER-positive patients (p=.4) or ER-positive/PR-positive patients (p=.3). The covariates in the model were not clearly listed but probably included age, tumor size, nodal status, Nottingham histologic grade, tamoxifen, and the interaction term.
Randomized trial, treatment by HER2 subgroup analysis. No studies of this type were identified.
Single-arm study, prespecified multivariate analysis. No studies of this type were identified.
Single-arm study, post-hoc multivariate analysis. The prospective but uncontrolled study on use of tamoxifen for metastatic disease by Arpino, Green, Allred, et al. (2004) compared outcomes for HER2-positive versus HER2-negative patients. ORR was 56 percent for HER2-negative patients and 47 percent for HER2-positive patients (χ² test, p=NS). Median TTF was 7 months for HER2-negative patients versus 5 months for HER2-positive patients (log rank p=.007). Finally, median OS was 31 months for HER2-negative patients versus 25 months for HER2-positive patients (log rank p=.07). While all of the patients were ER positive, median ER levels were lower in HER2-positive than in HER2-negative patients.
Multivariate, partially nonparametric Cox models for TTF and OS included menopausal status, disease-free interval, ER and PR levels, HER1 status, and HER2 status. HER2-positive status was not a statistically significant predictor of either TTF or overall survival. HER1 status, premenopausal status, and disease-free interval before recurrence were statistically significant predictors of TTF, while ER and PR levels and disease-free interval prior to recurrence were significant predictors of OS. The hazard ratios for HER2-positive versus HER2-negative subgroups were 1.15 (p=.54) for TTF and 0.99 (p=.97) for OS. Therefore, after controlling for other factors, this study provided no evidence of a difference in outcomes after treatment with tamoxifen between HER2-positive and HER2-negative patients.
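When a report gives a hazard ratio and p value but no confidence interval, an approximate interval can be back-calculated. The sketch below does this for the TTF result quoted above (HR 1.15, p=.54). It assumes the p value came from a Wald-type test on the log hazard ratio and that the rounded p value is accurate enough to use; neither assumption is confirmed by the source, so the output is indicative only. The function name `ci_from_p` is hypothetical.

```python
from math import exp, log
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def ci_from_p(hr: float, p_value: float, level: float = 0.95) -> tuple[float, float]:
    """Approximate CI for a hazard ratio from its point estimate and two-sided p value.

    Assumes the p value came from a Wald-type z-test on log(HR).
    """
    z_observed = _STD_NORMAL.inv_cdf(1 - p_value / 2)
    se_log_hr = abs(log(hr)) / z_observed
    z_critical = _STD_NORMAL.inv_cdf(0.5 + level / 2)
    return exp(log(hr) - z_critical * se_log_hr), exp(log(hr) + z_critical * se_log_hr)


# TTF result quoted in the text: HR 1.15 for HER2-positive vs. HER2-negative, p=.54.
low, high = ci_from_p(1.15, 0.54)
print(f"TTF: HR 1.15, approx. 95% CI {low:.2f} to {high:.2f}")
```

The implied interval of roughly 0.74 to 1.80 is compatible with both a modestly better and a substantially worse outcome in HER2-positive patients. The OS result (HR 0.99, p=.97) is omitted because a p value that close to 1 makes the back-calculation highly sensitive to rounding.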
Single-arm study, univariate analysis. No studies of this type were identified.
The evidence on use of HER2 status to predict outcomes of hormonal therapy is weak and inconclusive. Four studies reviewed here addressed use of tamoxifen in different breast cancer patient populations; two compared tamoxifen with aromatase inhibitors. Evidence is lacking from the most informative types of studies: trials in which randomization is stratified by HER2 status, and trials that randomize patients to therapy that is or is not directed by HER2 results. Less-informative designs were used, including post-hoc multivariate analyses in five randomized trials and one post-hoc multivariate analysis in a single-arm study. In comparing tamoxifen with aromatase inhibitors in a secondary analysis of randomized, controlled trial results, the most persuasive finding would be a significant interaction term between HER2 status and treatment group, after controlling for other important prognostic factors.
In the two comparison studies included, one had a nonsignificant interaction term (suggesting no differential in the impact of the two treatments based on a patient's HER2 status), and the other did not report an interaction term, although its authors included a qualitative statement that there was no evidence that one treatment was more effective than the other in HER2-positive patients. Some results suggest that tamoxifen may be more effective among HER2-negative patients, but any such conclusion is undermined by the paucity of studies and inconsistent findings. Importantly, data demonstrating a difference in magnitude of benefit by HER2 status would not by themselves be sufficient to conclude that there is no benefit in HER2-positive patients who are also positive for hormone receptors. Studying the differential impact of hormonal therapy by HER2 status is hindered by the inverse relationship between HER2 status and hormone receptor status, which leads to relatively small numbers of HR-positive and HER2-positive patients on which to base the results.
What is the evidence that monitoring serum or plasma concentrations of HER2 extracellular domain in patients with HER2-positive breast cancer predicts response to therapy, or detects tumor progression or recurrence, and if so, what is the evidence that decisions based on serum or plasma HER2 assay results improve patient management and outcomes?
Studies were included for Key Question 4 if they were:
randomized trials, prospective single-arm studies, or retrospective series of identically treated patients; that
measure serum or plasma HER2 concentrations in breast cancer patients, either at baseline or at multiple time points; and either:
associate baseline values or changes in HER2 concentration with one or more outcomes of interest (primary or secondary); or
compare outcomes of treatment decisions based on assay results with outcomes of decisions made in absence of assay results.
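The inclusion criteria above combine a design requirement, a measurement requirement, and an either/or analysis requirement. As an informal sketch of that logic, the predicate below encodes the three conditions. The `Study` fields, the design labels, and the function name are hypothetical conveniences for illustration and do not come from the review protocol.

```python
from dataclasses import dataclass

# Design labels mirror the wording of the criteria; the exact strings are an assumption.
ELIGIBLE_DESIGNS = {
    "randomized trial",
    "prospective single-arm study",
    "retrospective series of identically treated patients",
}


@dataclass
class Study:
    design: str
    measures_serum_or_plasma_her2: bool    # at baseline or at multiple time points
    associates_her2_with_outcome: bool     # baseline values or changes vs. outcomes
    compares_assay_guided_decisions: bool  # decisions with vs. without assay results


def meets_kq4_criteria(study: Study) -> bool:
    """Design AND measurement AND (association OR decision comparison)."""
    return (
        study.design in ELIGIBLE_DESIGNS
        and study.measures_serum_or_plasma_her2
        and (study.associates_her2_with_outcome or study.compares_assay_guided_decisions)
    )


# Hypothetical example: a single-arm study relating baseline serum HER2 to response.
example = Study("prospective single-arm study", True, True, False)
print(meets_kq4_criteria(example))
```

Writing the criteria this way makes explicit that a study needs only one of the last two conditions, provided it also meets the design and measurement requirements.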
| Study | Design | Therapeutic Setting | n, Enrolled (Randomized) | n, Evaluated | n, Withdrawn (Lost to F/U) | Treatment Regimen (Agents) |
|---|---|---|---|---|---|---|
| Gasparini et al. 2007, Italy, multicenter; 12/00 – 09/04 | PII RCT | untreated MBC, t-IHC 2+/3+ (1st-line metastatic disease) | 124 enrolled (61 grp 1, 63 grp 2); allocation concealment: A | 123 for efficacy and toxicity, 118 for ORR | 1 for efficacy and toxicity, 6 for ORR | Grp 1: paclitaxel; Grp 2: paclitaxel + trastuzumab |
| Im et al. 2005, Korea, multicenter | PII, single-arm | MBC no previous CHT for metastatic disease (1st-line) | 40 | 39 for toxicity, 38 for response | 1 for toxicity, 2 for response (refused further tx) | Epirubicin + docetaxel |
| Fornier et al. 2005, USA, 1 center | RET analysis of PII | MBC, HER2 overexpressing and non-overexpressing | 55 of 95 in trial who had 1° tumor tested for tHER2 | 55 | | Paclitaxel + trastuzumab |
| Muller et al. 2004, Germany, multicenter | RCT | 1st-line tx for MBC | 103 of 597 in trial | 101 | 2 | Grp1: epirubicin + paclitaxel (ET, n=47, 62% sHER2+); Grp2: epirubicin + cyclophosphamide (EC, n=54, 65% sHER2+) |
| Luftner et al. 2004, Germany, 1 center | PII | stage IV BC, 1 or 2 prev CHT (1 anthracycline-based) | 35 | 35 | | Dose-intensified paclitaxel (1st-line 6%, 2nd-line 60%, 3rd-line 34%) |
| Sandri et al. 2004, Italy, 1 center | Clinical trial | stage IV BC, ≥ 1 prev CHT for met dis (2nd-line+) | 64 | 39 | 25 | Cyclophosphamide + methotrexate |
| Colomer et al. 2004, Spain, 7 centers | PII | progressive advanced BC, no 1° tx for mets (1st-line) | 43 | 43 for toxicity, 42 for efficacy | 1 | Paclitaxel + gemcitabine |
| Burstein et al. 2003, US, 17 centers | PII | stage IV BC, IHC HER2 3+ or FISH+, no prev CHT for met dis (1st-line) | 55 | 54 (43 had sHER2 values at baseline and after 1 tx cycle) | 1 (did not receive protocol-based tx) | Trastuzumab + vinorelbine |
| Lipton et al. 2003, multinational, multicenter | RCT | postmenopausal locally advanced (stage IIIB), loco-regionally recurrent BC, MBC, ER+/PR? and/or PR+/ER? (1st-line) | 562 of 907; allocation concealment: B | 562 | | Grp1: letrozole (n=283); Grp2: tamoxifen (n=279) |
| Esteva et al. 2002, US, 1 center | PRO CS | MBC overexpressing tHER2, w/ or w/o previous tx for met dis, but no prior trastuzumab | 30 | 30 | | Trastuzumab + docetaxel |
| Colomer et al. 2000, Spain, 1 center | PRO CS | MBC, no previous CHT for met dis (1st-line) | 77 | 55 | 3 | Doxorubicin + paclitaxel |
| Colomer et al. 2006, Spain, 6 centers | PII | advanced BC (1st-line) | 52 | 47 | | IV vinorelbine + IV gemcitabine |
| Yamauchi et al. 1997, US, ? centers | RCT | MBC (1st-line) | 94 of 369 | 94 | | 3 doses of droloxifene |
| Study | Clearly Defined Question | Well-Described Study Population | Well-Described Intervention | Use of Validated Outcome Measures (Independently Assessed) | Appropriate Statistical Analysis | Well-Described Results | Discussion/Conclusions Supported by Data | Funding/Sponsorship Source Acknowledged |
|---|---|---|---|---|---|---|---|---|
| Im et al. 2005 | + | + | + | + (NA/-) | + | + | + | + |
| Fornier et al. 2005 | + | + | + | + (-) | + | + | + | + |
| Luftner et al. 2004 | + | + | + | + (-) | + | - (no AEs) | + | + |
| Sandri et al. 2004 | + | + | + | + (-) | + | - (no AEs) | + | + |
| Colomer et al. 2004 | + | + | + | + (-) | + | + | + | - |
| Burstein et al. 2003 | + | + | + | + (-) | + | + | + | + |
| Esteva et al. 2002 | + | + | + | + (-) | + | + | + | + |
| Colomer et al. 2000 | + | - | + | + (+) | + | - (no AEs) | + | + |
| Colomer et al. 2006 | + | + | + | + (-) | + | + | + | + |
| Study | Therapeutic Setting | Treatments Compared | Age Mean, range | Disease Sites: 1 (%) | Disease Sites: 2 (%) | Disease Sites: ≥3 (%) | Performance Status: Scale | Performance Status: Index | Performance Status: Result | E+&P+ | E+/P+ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Cameron et al., 2008, multicenter international | tHER2+ LABC/MBC, 2nd-line | Grp 1: capecitabine (n=201) | 51, 28–83 | 22 | 30 | 48 | ECOG | %0; %1 | 59; 41 | | |
| | | Grp 2: lapatinib + capecitabine (n=198) | 54, 26–80 | 20 | 31 | 49 | ECOG | %0; %1 | 62; 38 | | |
| Gasparini et al., 2007, Italy, multicenter; 12/00 - 09/04 Phase II RCT | First-line, untreated MBC, t-IHC 2+/3+ | Grp 1: paclitaxel (n=61) | 54.27, 30–71 | 33 | 40 | 27 | ECOG | %0, %1–2 | 82, 18 | 37 | 27 |
| | | Grp 2: paclitaxel + trastuzumab (n=63) | 56.02, 32–72 | 40 | 33 | 27 | ECOG | %0, %1–2 | 81, 19 | 37 | 10 |
| Muller et al., 2004, Germany, multicenter RCT | First-line tx for MBC | Grp1: epirubicin + paclitaxel (ET, n=47, 62% sHER2+) | Grp1+Grp2: 48, 31–63 | | | | | | | 61 (E+) | |
| | | Grp2: epirubicin + cyclophosphamide (EC, n=54, 65% sHER2+) | | | | | | | | | |
| Lipton et al., 2003, multinational, multicenter RCT | First-line, postmenopausal locally advanced (stage IIIB), loco-regionally recurrent BC, MBC, ER+/PR? and/or PR+/ER? | Grp1: letrozole (n=283, 31% sHER2+) | 65, 42–94 | 53 | 37 | 10 | KPS | md; rng | 90; 50–100 | 38 | 28 |
| | | Grp2: tamoxifen (n=279, 28% sHER2+) | 63, 31–90 | 55 | 34 | 11 | KPS | md; rng | 90; 50–100 | 40 | 27 |
Abbreviations: BC: breast cancer; E+&P+: estrogen and progesterone receptor positive; E+/P+: estrogen and/or progesterone receptor positive; ECOG: Eastern Cooperative Oncology Group; ER: estrogen receptor; Grp: group; KPS: Karnofsky performance score; LABC: locally advanced breast cancer; MBC: metastatic breast cancer; md: median; PR: progesterone receptor; RCT: randomized, controlled trial; s: serum; rng: range; t: tissue; tx: treatment;
Two of the randomized trials selected patients for being positive on tissue (t) HER2 testing. Gasparini, Gion, Mariani, et al. (2007) selected patients with 2+ or 3+ scores on the IHC HercepTest™. Cameron, Casey, Press, et al. (2008) included patients who were 3+ on IHC or 2+ with a positive FISH result. Muller, Witzel, Luck, et al. (2004) performed tissue testing on only 29 of 103 patients and only nine patients had 3+ results by Dako-style scoring of an IHC assay using the CB11 mAb. No tHER2 results were reported for Lipton, Ali, Leitzel, et al. (2003).
Patient characteristics were reported in various ways. Only age was reported by all four studies. Baseline data in the two treatment groups in the Muller, Witzel, Luck, et al. (2004) trial were combined; median age was 48 years. In the Gasparini, Gion, Mariani, et al. (2007) and Cameron, Casey, Press, et al. (2008) trials, median ages by treatment group were in the low and mid-50s and in the Lipton, Ali, Leitzel, et al. (2003) study median ages were in the mid-60s.
The proportion of patients with three or more disease sites was 27 percent in the Gasparini, Gion, Mariani, et al. (2007) study, 49 percent in the Cameron, Casey, Press, et al. (2008) trial and 10 percent and 11 percent of the two treatment groups studied by Lipton, Ali, Leitzel, et al. (2003).
Gasparini, Gion, Mariani, et al. (2007) used the ECOG performance status scale, finding that 82 percent and 81 percent had the highest level (0). Cameron, Casey, Press, et al. (2008) reported that 62 percent and 59 percent were at ECOG level 0. Median Karnofsky Performance Scale values were 90 in both groups included by Lipton, Ali, Leitzel, et al. (2003).
In the study by Gasparini and co-workers, 37 percent were both estrogen- and progesterone-receptor positive, while the proportions for the two groups from Lipton and co-workers' study were 38 percent and 40 percent, respectively. Muller, Witzel, Luck, et al. (2004) only noted that 61 percent were estrogen-receptor positive. Cameron, Casey, Press, et al. (2008) reported the proportions of patients in the two groups who were positive on one or both receptors: 48 percent and 46 percent.
| Study | Therapeutic Setting | Treatments Compared | Age Mean, range | Number of Disease Sites (%) | Performance Status | ||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | ≥3 | Scale | Index | Result | ER+&PR+ | |||||
| Im et al., 2005, Korea, multicenter | MBC (1st-line) | Epirubicin+paclitaxel (n=40, 14.8% sHER2+) | 49, 35–70 | 44 | 31 | 26 | ECOG | %0 | 21 | ||
| %1 | 54 | ||||||||||
| %2 | 26 | ||||||||||
| Colomer et al., 2000, Spain, 1 center | MBC, no previous CHT for met dz (1st-line) | Doxorubicin+ paclitaxel (n=55, 43.6% sHER2+) | 35% premenopausal | 55 | 45 | 67 (ER+) | |||||
| Fornier et al., 2005, USA, 1 center | MBC, tHER2 +/- | Paclitaxel+trastuzumab (n=55 of 95, 69% sHER2+) | 51, 33–67 | med 2, rng 1–4 | KPS | mn | 90 | ||||
| rng | 70–100 | ||||||||||
| Esteva et al., 2002, US, 1 center | MBC tHER2+, +/- previous tx for met dz | Trastuzumab+docetaxel (n=30, 70% sHER2+) | 45, 33–78 | 16 | 40 | 42 | KPS | %90 | 63 | ||
| %80 | 20 | ||||||||||
| %70 | 16 | ||||||||||
| Colomer et al., 2004, Spain, 7 centers | progressive advanced BC (1st-line) | Paclitaxel+gemcitabine (n=42, 29.3% sHER2+) | 53, 29–72 | med 3, rng 1–6 | 49 (ER+) | ||||||
| Luftner et al., 2004, Germany, 1 center | stage IV BC, 1 or 2 previous CHT | Dose-intensified paclitaxel (n=35; 1st-line 6%, 2nd-line 60%, 3rd-line 34%, 63% sHER2+) | 48, 31–63 | 26 | 31 | 43 | 17 | 34 | |||
| (# involved organs) | |||||||||||
| Burstein et al., 2003, US, 17 centers | stage IV BC, tHER2+ | Trastuzumab+vinorelbine (n=43) | 55, 29–82 | md 3, | rng 1–6 | ECOG | %0 | 70 | 37 | 18 | |
| %1 | 28 | ||||||||||
| %3 | 2 | ||||||||||
| Colomer et al., 2007 | MBC (2nd-line) | Letrozole (n=226, 25% sHER2+) | ~63/64 | 36 | 31 | 33 | ECOG | %0 | 51 | 62 | |
| %1–2 | 49 | ||||||||||
| Yamauchi et al., 1997, US, ? centers | MBC (1st-line) | 3 doses of droloxifene (n=94 of 369, 34% sHER2+) | 47% < 64 | 45 | 32 | 18 | 55 (ER+) | ||||
| 53% ≥ 64 | 34 (PR+) | ||||||||||
| Sandri et al., 2004, Italy, 1 center | stage IV BC, ≥ 1 prev CHT for met dz (2nd-line+) | Cyclophosphamide + methotrexate (n=39) | 56, 36–81 | 26 | 39 | 36 | |||||
| Colomer et al., 2006, Spain, 6 centers | advanced BC (1st-line) | IV vinorelbine+ IV gemcitabine (n=47, 29.8% sHER2+) | 64, 34–81 | med 2, rng 1–4 | ECOG | %0 | 41 | 67 (ER+) | |||
| %1 | 47 | ||||||||||
| %2 | 12 | ||||||||||
Abbreviations: BC: breast cancer; CHT: chemohormonal therapy; dz: disease; ECOG: Eastern Cooperative Oncology Group; ER+: estrogen-receptor positive; IV: intravenous; KPS: Karnofsky Performance Scale; MBC: metastatic breast cancer; med: median; met: metastatic; PR+: progesterone-receptor positive; rng: range.
Two studies selected patients who were tHER2 3+ on IHC or positive on FISH. Five studies included mixed patient populations that were positive and negative on HER2 tissue testing (Colomer, Llombart-Cussac, Lloveras, et al., 2007; Colomer, Montero, Lluch, et al., 2000; Im, Kim, Lee, et al., 2005; Fornier, Seidman, Schwartz, et al., 2005; Sandri, Johansson, Colleoni, et al., 2004). The remaining four studies did not provide data on tissue HER2 testing (Yamauchi, O'Neill, Gelman, et al., 1997; Colomer, Llombart-Cussac, Lluch, et al., 2004; Luftner, Henschke, Flath, et al., 2004; Colomer, Llombart-Cussac, Tusquets, et al., 2006).
Regarding age, one study had a median age of 48 years, another had a median of 49 years. One study had 53 percent at age 64 or older, another had a median age of 64 years and a third had mean ages in sHER2 positive and negative groups of 63 and 64 years. The other 6 studies had median ages in the 50s.
Nine studies gave the distribution of patients by number of disease sites and one study gave the number of involved organs (43 percent had three or more involved organs). In seven studies, the percentage of patients with three or more disease sites ranged from 18 percent to 43 percent; in another study all patients had two or fewer disease sites. Four studies provided average number of disease sites: the medians were two in two studies and three in two studies.
Four studies provided ECOG performance status data: the percentages in categories 0 or 1 (better performance status) were 75, 98, 98, and 88 percent. Two studies used the Karnofsky Performance Scale: in one study the mean value was 90 percent and in the other 83 percent were at 80 percent or 90 percent on the scale.
Seven studies gave baseline information on hormone receptor status, four of which reported the proportion of patients who were estrogen-receptor positive, ranging from 49 percent to 67.3 percent. One study gave the proportion who were progesterone-receptor positive (34 percent). Two studies gave percentages for different combinations of hormone receptor status: the proportions who were both estrogen and progesterone positive were 17 percent and 37 percent; the proportions who were either estrogen or progesterone positive were 34 percent and 18 percent.
| Level of Evidence | Study | n | Setting | Treatments | Outcome | Results |
|---|---|---|---|---|---|---|
| RCT stratified on HER2 status/HER2-guided vs. non-HER2-guided | ||||||
| RCT prespecified MV SGA | Gasparini 2007 | 123 | MBC 1st, t+ | paclit vs. paclit+trastuz | TTP | Cox regression sHER2 by treatment interaction p=.0538 |
| ORR | logistic regression sHER2 by treatment interaction p=.6044 | |||||
| RCT post-hoc MV SGA | ||||||
| RCT treatment by HER2 SGA | Cameron 2008 | 367 | LABC/MBC 2nd, t+ | capecit (Cp)+/- lapatinib (Lp) | PFS | cont sHER2/highest vs. other quartiles: Cp p<.001, Cp+Lp p=.12; Cp vs. Cp+Lp↑: highest quartile sHER2+ p<.001, other quartiles p=.002 |
| Muller 2004 | 101 | MBC 1st, t+/- | epirub+paclit (ET) vs. epirub+cycloph (EC) | OS | ET sHER2+↓ vs. - p=.092, EC sHER2+ vs. - p=NS | |
| PFS | sHER2- EC vs. ET p=NS, sHER2+ EC↓ vs. ET p=.0341 | |||||
| ORR | ET sHER2+ vs. - p=NS, EC sHER2+↓ vs. - p=.059 | |||||
| Lipton 2003 | 562 | locally advanced, recurrent, MBC 1st, t? | letrozole (LET) vs. tamoxifen (TAM) | TTP | sHER2+ LET↑ vs. TAM p=.0596, sHER2- LET↑ vs. TAM p=.0019 | |
| TTF | sHER2+ LET↑ vs. TAM, p=.0418, sHER2- LET↑ vs. TAM p=.0066 | |||||
| ORR | sHER2+ LET vs. TAM, p=.4507, sHER2- LET↑ vs. TAM p=.0078 | |||||
| CB | sHER2+ LET vs. TAM, p=.3057, sHER2- LET↑ vs. TAM p=.0162 | |||||
| 1-arm prespecified MV analysis | Colomer 2007 | 226 | MBC 2nd | letrozole (LET) | ORR | univariate sHER2+ ↓ vs. sHER2- p=.036 |
| TTP | univariate sHER2+ ↓ vs. sHER2- p=.004 Cox regression sHER2+ ↓ vs. sHER2- p<.001 | |||||
| OS | univariate sHER2+ ↓ vs. sHER2- p<.0005 | |||||
| Colomer 2000 | 55 | MBC 1st, t+/- | doxorub+paclit | RD | univariate sHER2+↓ vs. sHER2- p=.035 | |
| RD | Cox regression sHER2+↓ vs. sHER2- p=.04 | |||||
| ORR | univariate sHER2+↓ vs. sHER2- p=.01 | |||||
| ORR | logistic regression sHER2+↓ vs. sHER2- p=.03 | |||||
| 1-arm post-hoc MV analysis | Yamauchi 1997 | 94 | MBC 1st, t? | 3 doses droloxif | TTP | Cox regression sHER2+↓ vs. sHER2- p=.0003 |
| OS | Cox regression sHER2+↓ vs. sHER2- p=.003 | |||||
| ORR | univariate sHER2+↓ vs. sHER2- p=.00001 | |||||
| ORR | logistic regression sHER2+↓ vs. sHER2- p=.0001 | |||||
| 1-arm UV analysis | Im 2005 | 38 | MBC 1st, t+/- | epirub+paclit | RD | sHER2+↓ vs. sHER2- p<0.001 |
| TTP | sHER2+↓ vs. sHER2- p<0.001 | |||||
| OS | sHER2+↓ vs. sHER2- p=0.076 | |||||
| Resp | sHER2+ vs. sHER2- p=0.45 | |||||
| Fornier 2005 | 55 | MBC, t+/- | paclit+trastuz | ORR | sHER2+ vs. sHER2- p=1.0, sHER2 Δ<15 vs. Δ≥15 p=0.005 | |
| ORR | sHER2 ≥15% vs. < 15% p=0.015 | |||||
| Esteva 2002 | 30 | MBC 2nd+, t+ | trastuz+docet | ORR | sHER2+↑ vs. sHER2- p=0.04 | |
| Colomer 2004 | 42 | MBC 1st, t? | paclit+gemcitab | RD | sHER2+↓ vs. sHER2- p=0.04 | |
| Resp | sHER2+↓ vs. sHER2- p=0.02 | |||||
| Luftner 2004 | 35 | MBC 2nd+, t? | dose intense paclit | RD | sHER2+↓ vs. sHER2- p=0.042 | |
| PFS | sHER2+↓ vs. sHER2- p=0.098 | |||||
| ORR | sHER2+ vs. sHER2- p=0.40 | |||||
| Sandri 2004 | 39 | MBC 2nd+, t+/- | cycloph+methotrex | TTP | sHER2+↓ vs. sHER2- p=0.007 | |
| OS | sHER2+↓ vs. sHER2- p<0.001 | |||||
| Burstein 2003 | 43 | MBC, t+ | trastuz+vinorelb | Progr | no ↓ in sHER2 predicted progression; baseline, Δ did not predict | |
| Colomer 2006 | 47 | MBC 1st, t? | IVvinorelb+IVgemcit | ORR | sHER2+ vs. sHER2- p=0.9 | |
Abbreviations: cycloph: cyclophosphamide; DFS: disease-free survival; droloxif: droloxifene; epirub: epirubicin; gemcit: gemcitabine; HR: hazard ratio; MV: multivariate; ORR: overall response rate; OS: overall survival; paclit: paclitaxel; pCR: pathologic complete response; PFS: progression-free survival; RCT: randomized, controlled trial; RD: residual disease; RFS: recurrence-free survival; SGA: subgroup analysis; TETR: time to early tumor recurrence; trastuz: trastuzumab; TTF: time to treatment failure; TTR: time to tumor recurrence; Tx: treatment; UV: univariate analysis; vinorelb: vinorelbine;
| Study | Prospective design | Prespecified hypotheses about relation of marker to outcome | Large, well-defined, representative study population | Marker assay methods well-described | Blinded assessment of marker in relation to outcome | Homogeneous treatment(s), either randomized or rule-based selection | Low rate of missing data (≤ 15%) | Sufficiently long follow-up | Well-described, well-conducted multivariate analysis of outcome: 1) clear candidate variable selection, 2) clear, appropriate model-building guidelines, 3) assumptions tested, 4) standard prognostic variables included, 5) continuous variables well handled, 6) validation | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1) | 2) | 3) | 4) | 5) | 6) | |||||||||
| Cameron et al., 2008 | Y | N | Y | N | ? | Y | Y | ≥ 6 wk | treatment × HER2 SGA | |||||
| Gasparini et al., 2007 | Y | Y | N | Y | ? | Y | Y | med: 16.6 mos | ? | ? | ? | ? | ? | N |
| Muller et al., 2004 | Y | N | N | Y | ? | Y | N | med 8.9 mo (0.5–36) | treatment × HER2 SGA | |||||
| Lipton et al., 2003 | Y | N | N | Y | Y | Y | N | 3 mos | treatment × HER2 SGA | |||||
| Colomer et al., 2000 | Y | Y | N | Y | ? | Y | N | med 23 mos | ? | ? | ? | ? | ? | N |
| Yamauchi et al., 1997 | Y | N | N | Y | ? | Y | N | ? | ? | N | ? | ? | ? | N |
| Colomer et al., 2007 | Y | Y | Y | Y | ? | Y | Y | ≥ 4 wk | ? | ? | ? | ? | ? | N |
| Im et al., 2005 | Y | Y | N | Y | ? | Y | Y | med 22.5 mos | NA | |||||
| Fornier et al., 2005 | Y | N | N | Y | ? | Y | N | ≥ 4 wk | NA | |||||
| Esteva et al., 2002 | Y | Y | N | Y | Y | Y | Y | ≥ 8 wk | NA | |||||
| Colomer et al., 2004 | Y | Y | N | Y | ? | Y | Y | 26 mos | NA | |||||
| Luftner et al., 2004 | Y | Y | N | Y | ? | Y | Y | ≥ 4 wk | NA | |||||
| Burstein et al., 2003 | Y | Y | N | Y | ? | Y | Y | 8 wk | NA | |||||
| Sandri et al., 2004 | Y | N | N | Y | ? | Y | N | 2 mo | NA | |||||
| Colomer et al., 2006 | Y | Y | N | Y | ? | Y | Y | med 79 mo | NA | |||||
Abbreviations: mos: months; NA: not applicable; SGA: subgroup analysis; wks: weeks;
| Study | Time to Event Outcomes | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cameron et al., 2008 capecitabine (Cp) +/- lapatinib (Lp) | Outcome | Grp | N | Med (mos) | 6 mos | 1 yr | Test | p | HR (95%CI) | ||||
| PFS | sHER2+,Cp | Cox | <.001 | sHER2 as continuous variable | |||||||||
| sHER2-,Cp | |||||||||||||
| sHER2+,CpLp | Cox | .12 | |||||||||||
| sHER2-,CpLp | |||||||||||||
| sHER2+,Cp | 2.6 | Cox | <.001 | 2.3 (1.5, 3.6) highest sHER2 quartile vs. other quartiles | |||||||||
| sHER2-,Cp | 4.8 | ||||||||||||
| sHER2+,CpLp | 6.0 | Cox | .12 | 1.5 (0.9, 2.4) | |||||||||
| sHER2-,CpLp | 6.7 | ||||||||||||
| sHER2+,Cp | ~3 | ~17 | Cox | <.001 | 0.320 (0.181, 0.567) highest sHER2 quartile | ||||||||
| sHER2+,CpLp | ~6 | ~50 | |||||||||||
| sHER2-,Cp | ~20 | ~43 | ~15 | Cox | .002 | 0.561 (0.389, 0.81) other quartiles | |||||||
| sHER2-,CpLp | ~30 | ~52 | ~28 | ||||||||||
| Gasparini et al., 2007 paclitaxel vs. paclitaxel + trastuzumab | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) | Comments |
| TTP | Cox regression p value for sHER2 by treatment interaction: .0538 | ||||||||||||
| Muller et al., 2004 epirubicin + paclitaxel vs. epirubicin+ cyclophosphamide (ET vs. EC) | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) | Comments |
| OS | sHER2+/EC | 19 | ~8.4 | ~50 | ~15 | ~0 | LR | .092 | |||||
| sHER2-/EC | 35 | ~22 | ~77 | ~40 | ~15 | ||||||||
| sHER2+/ET | 18 | ~16 | ~60 | ~10 | ~0 | LR | NS | ||||||
| sHER2-/ET | 29 | ~14 | ~65 | ~10 | ~0 | ||||||||
| PFS | sHER2-/EC | 35 | ~7 | ~30 | ~0 | ~0 | LR | NS | |||||
| sHER2-/ET | 29 | ~9 | ~21 | ~0 | ~0 | ||||||||
| sHER2+/EC | 19 | ~12 | ~21 | ~0 | ~0 | LR | .0341 | ||||||
| sHER2+/ET | 18 | ~9 | ~28 | ~0 | ~0 | ||||||||
| Lipton et al., 2003 letrozole vs. tamoxifen | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) | Comments |
| TTP | sHER2+ | 164 | |||||||||||
| letrozole | 87 | 6.1 | ~28 | ~7 | ~6 | ~4 | Cox | .0596 | 0.73 (0.53,1.01) | ||||
| tamoxifen | 77 | 3.3 | ~17 | ~5 | ~3 | ||||||||
| sHER2- | 398 | ||||||||||||
| letrozole | 196 | 12.2 | ~53 | ~29 | ~20 | ~14 | Cox | .0019 | 0.70 (0.56,0.88) | ||||
| tamoxifen | 202 | 8.5 | ~38 | ~20 | ~10 | ~8 | |||||||
| TTF | sHER2+ | 164 | |||||||||||
| letrozole | 87 | 6.0 | Cox | .0418 | |||||||||
| tamoxifen | 77 | 3.2 | |||||||||||
| sHER2- | 398 | ||||||||||||
| letrozole | 196 | 11.6 | Cox | .0066 | |||||||||
| tamoxifen | 202 | 6.2 | |||||||||||
Abbreviations: Cox: Cox proportional hazards; HR: hazard ratio; LR: log rank; med: median; mos: months; TTF: time to treatment failure; TTP: time to progression; yr: years;
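A quick consistency check on tabulated results is to recompute the p value implied by each hazard ratio and its confidence interval. The sketch below does so for the two TTP rows from Lipton et al. (2003), assuming the intervals are 95 percent Wald intervals on the log scale. This is an illustrative check under that assumption, not a reanalysis, and the function name `p_from_ci` is hypothetical.

```python
from math import log
from statistics import NormalDist

_STD_NORMAL = NormalDist()
_Z95 = _STD_NORMAL.inv_cdf(0.975)


def p_from_ci(hr: float, lower: float, upper: float) -> float:
    """Two-sided p value implied by a hazard ratio and its 95 percent CI (Wald approximation)."""
    se_log_hr = (log(upper) - log(lower)) / (2 * _Z95)
    z = log(hr) / se_log_hr
    return 2 * (1 - _STD_NORMAL.cdf(abs(z)))


# TTP rows for Lipton et al., 2003: (HR, lower, upper, reported p).
rows = {
    "sHER2+": (0.73, 0.53, 1.01, 0.0596),
    "sHER2-": (0.70, 0.56, 0.88, 0.0019),
}

for label, (hr, lower, upper, reported) in rows.items():
    implied = p_from_ci(hr, lower, upper)
    print(f"{label}: implied p = {implied:.4f}, reported p = {reported}")
```

Both implied values fall close to the reported ones (about .056 versus .0596, and about .0020 versus .0019), with the small gaps plausibly due to rounding of the published hazard ratios and interval limits.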
| Study | Tumor Response (%) | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Cameron et al., 2008 capecitabine (Cp) +/- lapatinib (Lp) | Not reported | ||||||||
| Gasparini et al., 2007 paclitaxel vs. paclitaxel + trastuzumab | Grp | N | CR | PR | SD | PD | Test | p | Comments |
| WHO criteria logistic regression p value for sHER2 by treatment interaction: 0.6044 | |||||||||
| Muller et al., 2004 epirubicin + paclitaxel vs. epirubicin+cyclophosphamide (ET vs. EC) | Grp | N | CR+PR | SD | PD | Test | p | Comments | |
| sHER2+/ET | 18 | 50.0 | 33.3 | 16.7 | Chi sq | NS | UICC criteria | ||
| sHER2-/ET | 26 | 46.2 | 38.5 | 15.4 | |||||
| sHER2+/EC | 17 | 29.4 | 35.3 | 35.3 | Chi sq | .059 | |||
| sHER2-/EC | 31 | 41.9 | 35.5 | 22.6 | |||||
| Lipton et al., 2003 letrozole vs. tamoxifen | Grp | N | CR+PR | SD+PD | Test | p | Comments | ||
| sHER2+ | 164 | UICC criteria | |||||||
| letrozole | 17 | 83 | log regr | .4507 | |||||
| tamoxifen | 13 | 87 | |||||||
| sHER2- | 398 | ||||||||
| letrozole | 39 | 61 | log regr | .0078 | |||||
| tamoxifen | 26 | 74 | |||||||
| Grp | N | CR+PR +SD | PD | Test | p | Comments | |||
| sHER2+ | 164 | UICC criteria | |||||||
| letrozole | 33 | 67 | log regr | .3057 | |||||
| tamoxifen | 26 | 74 | |||||||
| sHER2- | 398 | ||||||||
| letrozole | 57 | 43 | log regr | .0162 | |||||
| tamoxifen | 45 | 55 | |||||||
Abbreviations: Chi sq: Chi square; CR: complete response; Grp: group; log regr: logistic regression; NS: not significant; PD: progressive disease; PR: partial response; SD: stable disease; UICC: International Union against Cancer; WHO: World Health Organization.
| Study | Time to Event Outcomes | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Im et al., 2005 epirubicin+paclitaxel (n=40) | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) |
| TTP | sHER2+ | 4 | 2.8 | LR | <.001 | |||||||
| sHER2- | 19 | 8.3 | ||||||||||
| RD | sHER2+ | 3 | 1.5 | 0 | LR | <.001 | ||||||
| sHER2- | 13 | 6.7 | ~43 | ~33 | ||||||||
| OS | sHER2+ | 4 | 12.4 | ~50 | ~26 | LR | .076 | |||||
| sHER2- | 23 | not reached | ~72 | ~56 | ||||||||
| Colomer et al., 2000 doxorubicin+ paclitaxel (n=55) | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) |
| Resp Dur | sHER2+ | 15 | 7.5 | ~26 | LR | .035 | ||||||
| sHER2- | 24 | 11 | ~50 | ~35 | MV Cox | .04 | ||||||
| Colomer et al., 2004 paclitaxel+gemcitabine (n=42) | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) |
| Resp Dur | sHER2+ | 5 | 7.9 | ~40 | ~0 | ? | .04 | |||||
| sHER2- | 24 | 14.4 | ~55 | ~37 | ||||||||
| Luftner et al., 2004 dose-intensified paclitaxel (n=35) | Outcome | Grp | N | Mn (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) |
| Resp Dur | sHER2+ | 9 | 6.0 | ~0 | LR | .042 | ||||||
| sHER2- | 5 | 2 | ~60 | |||||||||
| PFS | sHER2+ | 22 | 3 | ~3 | LR | .098 | ||||||
| sHER2- | 13 | 4 | ~10 | |||||||||
| Colomer et al., 2007 Letrozole (n=226) | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) |
| TTP | sHER2+ | 42 | 4 | ~36 | ~12 | ~7 | LR | .004 | ||||
| sHER2- | 184 | 14 | ~57 | ~34 | ~12 | |||||||
| CPH | <.001 | |||||||||||
| OS | sHER2+ | 42 | ~22 | ~82 | 44 | LR | <0.0005 | |||||
| sHER2- | 184 | ~91 | 75 | ~63 | ||||||||
| Yamauchi et al., 1997 3 doses of droloxifene (n=94 of 369) | Outcome | Grp | N | Med (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) |
| TTP | sHER2+ | 32 | ~3 | ~13 | ~13 | MV Cox | 0.0003 | 0.36 (0.21, 0.63) (adjusted) | ||||
| sHER2- | 62 | ~8 | ~43 | ~28 | ||||||||
| OS | sHER2+ | 32 | ~28 | ~74 | ~54 | MV Cox | 0.003 | 0.35 (0.17, 0.70) (adjusted) | ||||
| sHER2- | 62 | ~92 | ~63 | |||||||||
| Sandri et al., 2004 cyclophosphamide+methotrexate (n=39) | Outcome | Grp | N | Mn (mos) | 1 yr | 2 yr | 3 yr | 4 yr | 5 yr | Test | p | HR (95%CI) |
| TTP | sHER2+ | ? | 2 | ~0 | ~0 | ~0 | LR | 0.007 | ||||
| sHER2- | ? | 8 | ~34 | ~12 | ~7 | |||||||
| OS | sHER2+ | ? | 11 | ~47 | ~0 | ~0 | LR | <0.001 | ||||
| sHER2- | ? | 16 | ~84 | ~49 | ~42 | |||||||
| Study | Tumor Response (%) | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Im et al., 2005 epirubicin+paclitaxel (n=40) | Grp | N | CR | PR | SD | PD | Test | p | Comments |
| sHER2+ | 4 | 0 | 75 | 25 | Chi sq | 0.45 | WHO criteria | ||
| sHER2- | 23 | 13.0 | 43.4 | 26.1 | 17.4 | ||||
| Colomer et al., 2000 doxorubicin+paclitaxel (n=55) | Grp | N | CR | PR | No response | Test | p | Comments | |
| sHER2+ | 24 | 0 | 62 | 37 | Chi sq | 0.021 | WHO criteria | ||
| sHER2- | 31 | 26 | 52 | 23 | MV logistic regression for ORR | ||||
| IHC+ | 11 | 9 | 55 | 36 | Chi sq | 0.219 | sHER2 p value: 0.03 | ||
| IHC- | 28 | 18 | 64 | 18 | |||||
| Fornier et al., 2005 paclitaxel+trastuzumab (n=55) | Grp | N | Response | No response | Test | p | Comments | ||
| sHER2+ | 38 | 50 | 50 | FE | 1.0 | Response= CR+PR criteria described | |||
| sHER2- | 17 | 47 | 43 | ||||||
| Δ<15 | 25 | 68 | 32 | FE | 0.005 | ||||
| Δ≥15 | 13 | 15 | 85 | ||||||
| Δ≥55% | 25 | 68 | 32 | FE | 0.015 | OR 4.25, 95% CI: 1.37–13.19 | |||
| Δ<55% | 30 | 33 | 67 | ||||||
| Esteva et al., 2002 trastuzumab+docetaxel (n=30) | Grp | N | CR+PR | SD+PD | Test | p | Comments | ||
| sHER2+ | 21 | 76 | 24 | FE | 0.04 | ECOG criteria | |||
| sHER2- | 9 | 33 | 67 | ||||||
| IHC 3+ | 19 | 63 | 37 | FE | 0.99 | ||||
| IHC 0–2+ | 5 | 60 | 40 | ||||||
| FISH+ | 24 | 67 | 33 | FE | 0.60 | ||||
| FISH- | 4 | 50 | 50 | ||||||
| Colomer et al., 2004 paclitaxel+gemcitabine (n=42) | Grp | N | Response | No response | Test | p | Comments | ||
| sHER2+ | 15 | 42 | 58 | FE | 0.02 | WHO criteria | |||
| sHER2- | 26 | 83 | 17 | ||||||
| Luftner et al., 2004 dose-intensified paclitaxel (n=35) | Grp | N | CR+PR | SD | PD | Test | p | Comments | |
| sHER2+ | 22 | 40.9 | 36.4 | 22.7 | MH | 0.40 | mean duration 25.7 wks | ||
| sHER2- | 13 | 38.5 | 30.8 | 30.8 | mean duration 65.2 wks (p=0.042) | ||||
| internationally accepted criteria (referenced) | |||||||||
| Burstein et al., 2003 trastuzumab+ vinorelbine (n=43) | Grp | N | No progression | Progression | Comments | ||||
| sHER2+ | ? | AU ROC=0.8947, baseline or Δ in sHER2 do not predict response, but no ↓ in sHER2 predicted progression | |||||||
| sHER2- | ? | ||||||||
| RECIST criteria | |||||||||
| Colomer et al., 2007 letrozole (n=226) | Grp | N | CR+PR | No response | Test | p | Comments | ||
| sHER2+ | 42 | 14 | 86 | .036 | |||||
| sHER2- | 184 | 31 | 70 | ||||||
| Yamauchi et al., 1997 3 doses of droloxifene (n=94 of 369) | Grp | N | Response | No response | Test | p | Comments | ||
| sHER2+ | 32 | 9 | 91 | FE | .00001 | criteria? | |||
| sHER2- | 62 | 56 | 44 | MV logistic regression for response sHER2 p value: .0001 | |||||
| Colomer et al., 2006 IV vinorelbine+ IV gemcitabine (n=47) | Grp | N | CR+PR | No response | Test | p | Comments | ||
| sHER2+ | 14 | 50 | 50 | ? | .9 | WHO criteria | |||
| sHER2- | 33 | 48.5 | 51.5 | ||||||
Abbreviations: AU: area under; Chi sq: Chi square; CR: complete response; FE: Fisher's exact test; Grp: group; MV: multivariate; ORR: overall response rate; PD: progressive disease; PR: partial response; RECIST: Response Evaluation Criteria in Solid Tumors; ROC: receiver operating characteristic; s: serum; SD: stable disease; t: tissue; WHO: World Health Organization.
Randomization stratified on HER2/randomized to whether treatment was guided by HER2. No studies of this type were identified.
Randomized trial, prespecified multivariate subgroup analysis. The only trial that performed a prespecified multivariate subgroup analysis was Gasparini, Gion, Mariani, et al. (2007; n=123 patients given first-line treatment with paclitaxel with or without trastuzumab for metastatic breast cancer). One quality concern was uncertainty over whether sHER2 results were scored blindly to outcome. Also, this study addressed 11 predictor variables plus treatment interaction terms in logistic and Cox regression analyses; however, there appeared to be too few response and progression events to support models with so many variables. Thus, the study was not large enough for the type of modeling used. Overall, it is unclear whether the multivariate analysis was well conducted: it is unclear how candidate variables were selected, what model-building strategy was used, whether assumptions were tested, whether the standard metastatic breast cancer prognostic factors were included in final models, and how continuous variables were categorized. In addition, the model did not appear to have been validated.
For time-to-progression, the Cox regression treatment by sHER2 interaction was nearly statistically significant (p=0.0538). Among patients with elevated sHER2 values, results significantly favored paclitaxel plus trastuzumab, while in those with normal sHER2, results nonsignificantly favored paclitaxel alone. Logistic regression analysis of overall response rate showed no significant treatment by sHER2 interaction (p=.6044); in both groups, combination treatment was favored, but not significantly.
Randomized trial, post-hoc multivariate subgroup analysis. No studies of this type were identified.
Randomized trial, treatment by HER2 subgroup analysis. Among three randomized trials that described treatment by sHER2 subgroup analyses, Muller, Witzel, Luck, et al. (2004) reported on a subset of 101 patients with serum available, out of 597 patients (17 percent) randomized to epirubicin plus either paclitaxel (ET) or cyclophosphamide (EC). This study was a retrospective analysis of a previously reported randomized trial. These authors found within the ET group a trend for worse overall survival for sHER2 positive patients (p=.092), but no significant difference between sHER2 groups receiving EC. Regarding progression-free survival, outcomes for the two treatments did not differ among the sHER2 negative, but results were significantly worse for EC among those sHER2 positive. For overall response rate, sHER2 groups did not differ among those receiving ET, but those getting EC had worse results when sHER2 was positive. No test for treatment by sHER2 interaction was reported.
These results should be viewed cautiously because the analyzed subset comprised less than 20 percent of those originally randomized and multivariate analysis was not used to adjust for any imbalances between treatments by sHER2 subgroups. Additionally, it is unclear whether sHER2 results were scored blindly with respect to outcome.
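To make concrete what a treatment by sHER2 interaction test involves, the following is a minimal sketch and not an analysis from the report or from the trial. It assumes responder counts reconstructed from the rounded CR+PR percentages in the tumor response table for Muller et al. (2004), and it uses an unadjusted Wald test on the difference in log odds ratios between sHER2 strata, which corresponds to the interaction term of a saturated logistic model. The function names are hypothetical.

```python
import math

# Responders (CR+PR) and group sizes. The responder counts are an assumption,
# reconstructed from the rounded percentages reported for Muller et al. (2004).
counts = {
    ("sHER2+", "ET"): (9, 18),   # 50.0% of 18
    ("sHER2-", "ET"): (12, 26),  # 46.2% of 26
    ("sHER2+", "EC"): (5, 17),   # 29.4% of 17
    ("sHER2-", "EC"): (13, 31),  # 41.9% of 31
}

def log_odds_ratio(stratum):
    """Log odds ratio of response (ET vs. EC) in one sHER2 stratum, with its variance."""
    a, n_et = counts[(stratum, "ET")]
    c, n_ec = counts[(stratum, "EC")]
    b, d = n_et - a, n_ec - c  # non-responders
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance

def interaction_wald_test():
    """Wald test that the ET-vs-EC log odds ratio differs between sHER2 strata."""
    lor_pos, var_pos = log_odds_ratio("sHER2+")
    lor_neg, var_neg = log_odds_ratio("sHER2-")
    diff = lor_pos - lor_neg
    se = math.sqrt(var_pos + var_neg)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value from the normal distribution
    return diff, se, z, p

diff, se, z, p = interaction_wald_test()
print(f"difference in log OR = {diff:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.2f}")
```

Such a calculation is unadjusted and rests on a small subset of those randomized, so it carries the same limitations described above.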
Lipton, Ali, Leitzel, et al. (2003) addressed 562 postmenopausal women given either letrozole or tamoxifen as first-line therapy for advanced breast cancer. This retrospective analysis included 62 percent of all patients randomized in the trial; however, this is the only randomized trial that used blinded assessment of sHER2 in relation to outcome. Results were better in terms of time-to-progression and time to treatment failure for those receiving letrozole, regardless of sHER2 status. For overall response rate and rate of clinical benefit (overall response plus stable disease), letrozole was significantly better than tamoxifen for sHER2 negative patients, but not for those sHER2 positive. No tests of treatment by sHER2 interaction were reported.
Cameron, Casey, Press, et al. (2008) randomized 399 patients with tissue HER2 positive locally advanced or metastatic breast cancer to receive capecitabine with or without lapatinib. Exploratory analyses of the relation between sHER2 status and progression-free survival were conducted in 92 percent of those randomized. When sHER2 was divided into the highest quartile versus other quartiles, both sHER2 subgroups had significantly better progression-free survival when treated with capecitabine plus lapatinib compared to capecitabine alone. This study did not describe the sHER2 assay methods clearly, did not report that sHER2 was scored blind to outcome, and used an uncommon threshold for sHER2 positivity. No test of treatment by sHER2 status interaction was reported.
Randomized trial results summary. The methodologic quality of these randomized trials is generally poor. Only one randomized trial was conducted with a prespecified plan to assess the relation of sHER2 to outcome. The same trial was the only one that conducted multivariate analyses; however, it appeared to have too few events to support the large number of predictor and interaction terms used, and the modeling techniques were poorly described overall. The other three trials performed retrospective treatment by sHER2 subgroup analyses of 17 percent, 62 percent and 93 percent of patients originally enrolled. Only one study used blinded assessment of sHER2 in relation to outcome.
These four randomized trials each addressed a different comparison of treatments. The only study that tested treatment by sHER2 status interactions found them to be nonsignificant for TTP and ORR in a comparison of paclitaxel with and without trastuzumab. A comparison of epirubicin either with paclitaxel or cyclophosphamide did not consistently find sHER2 to be related to different treatment outcomes (OS, PFS, ORR). A trial comparing letrozole and tamoxifen found sHER2 to be a more consistent predictor of treatment outcome for TTP and TTF, less so for ORR and clinical benefit. A trial of capecitabine with or without lapatinib found better PFS for those receiving combination treatment, both in the highest quartile and in the lower quartiles of sHER2 values. Only the Gasparini, Gion, Mariani, et al. (2007) trial, which analyzed nearly all patients randomized, used multivariate methods, while the other three trials used univariate analyses, two of them on much smaller subsets of those randomized.
Single-arm study, multivariate analysis. Among three single-arm studies that conducted multivariate analysis, Colomer, Llombart-Cussac, Lloveras, et al. (2007) included 226 patients with metastatic breast cancer who received letrozole. The authors prespecified their interest in assessing the relation between sHER2 status and treatment outcomes; however, they provided inadequate detail in describing Cox regression methods such as selection of candidate variables, model-building strategy, testing of assumptions, forcing of standard prognostic variables and handling of continuous variables. It is unclear if sHER2 results were scored blind to outcomes, and validation of the final model was not mentioned. The multivariate analysis found sHER2 and ECOG performance status to be significant independent predictors of time to progression.
Colomer, Montero, Lluch, et al. (2000) included 55 patients with metastatic disease who were receiving first-line doxorubicin and paclitaxel. Of the 77 patients originally enrolled in this Phase II study, 75 percent had evaluable serum samples. The plan to assess the relation between sHER2 and outcome was prespecified in this study; however, the multivariate logistic and Cox regression techniques were poorly described. It is unclear how candidate variables were selected, what model-building strategy was used, whether assumptions were tested, whether final models included all standard prognostic variables and whether continuous variables were well handled. Furthermore, models did not appear to be validated and it is unclear if sHER2 was scored blindly to outcome. In the logistic regression of response, there were only 39 events, but six variables were entered into the multivariate model (more than the recommended one variable per 10 or more events). A similar problem existed for the Cox regression of response duration. These authors found elevated sHER2 to be significantly associated with poorer results on response duration and overall response rate, in both univariate and multivariate analyses.
The study by Yamauchi, O'Neill, Gelman, et al. (1997) was originally a randomized comparison of three doses of droloxifene as first-line hormonal therapy. Of the 369 patients randomized, 94 were included in this retrospective analysis (25 percent). Logistic regression of overall response and Cox regression of time-to-progression and overall survival all used the stepwise model-building strategy, a method with major weaknesses. The description of modeling methods was poor, lacking details on candidate variable selection, whether assumptions were tested, whether final models included standard prognostic variables and whether continuous variables were well handled. The article did not make clear whether sHER2 results were scored blindly to outcome. Dose was entered into the multivariate models but was not retained, suggesting similar results across doses, and dose groups were pooled. After adjustment for other variables, this study found consistently worse results for sHER2 positive patients on time to progression, overall survival and overall response rate.
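The events-per-variable concern raised above can be checked with simple arithmetic. The sketch below assumes the rule of thumb as stated in the text (at least 10 events per variable entered) and uses the figures given for the logistic regression of response (39 events, six variables). The function names are hypothetical.

```python
def events_per_variable(n_events: int, n_variables: int) -> float:
    """Number of outcome events available per variable entered in the model."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_events / n_variables

def max_supported_variables(n_events: int, min_epv: int = 10) -> int:
    """Largest number of variables the rule of thumb would allow for this many events."""
    return n_events // min_epv

# Figures from the text: 39 events, six variables entered.
epv = events_per_variable(39, 6)
allowed = max_supported_variables(39)

print(f"events per variable: {epv:.1f}")          # 6.5
print(f"variables supported by rule: {allowed}")  # 3
```

Under this rule of thumb, 39 events would support about three variables rather than six.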
Single-arm study, univariate analysis. These studies reported on 55 patients or fewer. With the exception of the study by Esteva, Valero, Booser, et al. (2002), positive sHER2 results were associated with worse outcomes. The lack of multivariate analyses in these studies makes these findings of limited use for guiding treatment decisions. These studies could be described as exploratory, hypothesis-generating investigations that might inform future, more sophisticated studies.
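As an illustration of the kind of univariate comparison these studies report, the sketch below computes a two-sided Fisher's exact test for response by sHER2 status in the Esteva et al. (2002) series. It rests on two assumptions: that the "FE" entries in the tumor response table denote Fisher's exact test, and that the responder counts can be reconstructed from the rounded percentages (76 percent of 21 and 33 percent of 9). The function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    def ways(k: int) -> int:
        # Number of ways to place k of the col1 responders in row 1 (exact integers).
        return comb(row1, k) * comb(row2, col1 - k)

    observed = ways(a)
    extreme = sum(ways(k) for k in range(lowest, highest + 1) if ways(k) <= observed)
    return extreme / comb(total, col1)

# Reconstructed counts (an assumption): sHER2+ 16 of 21 responders, sHER2- 3 of 9.
p_value = fisher_exact_two_sided(16, 5, 3, 6)
print(f"two-sided Fisher's exact p = {p_value:.3f}")
```

With these reconstructed counts the p-value comes to roughly 0.04, in line with the value shown for this study in the tumor response table. A test of this kind does not adjust for other prognostic factors, which is the limitation noted above.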
Single-arm study results summary. This body of evidence is quite heterogeneous with respect to treatment regimens, outcomes assessed, and definitions of elevated sHER2. Only three of 11 studies conducted multivariate analyses, and their modeling methods were poorly described. Evidence from single-arm series more often shows that sHER2 status predicts outcomes among treated patients; however, there were several instances in which it was nonpredictive, and one study found better response among those with elevated sHER2, in conflict with all other studies.
The evidence is weak on whether sHER2 predicts outcome after treatment with any regimen in any setting. Evidence primarily focused on first-line or second- and subsequent-line treatment of metastatic disease using a variety of regimens. Furthermore, these studies used different thresholds for a positive sHER2 result and varied on whether patient selection required positive tissue HER2 status. There were only four randomized trials and only one used multivariate analysis, while three single-arm studies performed multivariate analysis. Reporting on multivariate analyses lacked sufficient detail. Univariate analyses provide very limited information, at most suggesting candidate variables for future multivariate analyses. These studies do not support clear conclusions on whether sHER2 predicts disease progression, treatment response, or outcomes of any specific treatment regimen.
In patients with ovarian, lung, prostate, or head and neck cancers, what is the evidence that:
testing tumor tissue for HER2; or
monitoring serum or plasma concentrations of HER2;
either predicts response to therapy, or detects tumor progression or recurrence; and if so, what is the evidence that decisions based on HER2 assay results improve patient management and outcomes?
Studies were included for Key Question 5 if they were:
randomized trials, prospective single-arm studies, or retrospective series of identically treated patients; that
measured HER2 in tumor tissue, serum, or plasma from patients with ovarian, lung, prostate, or head and neck cancers, and either:
associated HER2 status from tissue assays, or baseline values or changes in serum or plasma HER2 concentration, with one or more outcomes of interest (primary or secondary; see above); or
compared outcomes of treatment decisions based on tumor HER2 status, or serum or plasma assay results, with outcomes of decisions made in absence of test results.
| Level of Evidence | Study | n | Setting | Treatments | Outcome | Results | |
|---|---|---|---|---|---|---|---|
| RCT stratified on HER2 status/HER2-guided vs. non-HER2-guided | |||||||
| RCT prespecified MV SGA | |||||||
| RCT post-hoc MV SGA | |||||||
| RCT treatment by HER2 SGA | |||||||
| 1-arm prespecified MV analysis | |||||||
| 1-arm post-hoc MV analysis | Koukourakis 1999 | 189 | NSCLC | surgery | OS | univariate: IHC HER2 not associated with OS | |
| T1–2, N0–1 | Cox regression: IHC HER2 not entered in model | ||||||
| Cappuzzo 2005 | 101 | locally advanced, metastatic | gefitinib | ORR | univariate: IHC HER2+↑ vs. - p=.001 | ||
| ORR | Cox regression IHC HER2+↑ vs. - p=.08 | ||||||
| OS | univariate IHC HER2+↑ vs. - p=.056 (discrepancies) | ||||||
| NSCLC | TTP | univariate IHC HER2+↑ vs. - p=.02 (discrepancies) | |||||
| Hirsch 2005 | 56 | stage IIIB/IV | gefitinib | OS | univariate FISH HER2+ vs. - p=.80 | ||
| BAC. BAC- | OS | Cox regression FISH HER2 not entered in model | |||||
| like AC | ORR | univariate FISH HER2+ vs. - p>.05 | |||||
| Saad 2004 | 100 | stage I | surgery | OS | univariate AC IHC HER2+↓ vs. - p=signif | ||
| AC/BAC | OS | Cox regression AC IHC HER2+↓ vs. - signif independent predictor | |||||
| OS | univariate BAC IHC HER2+↓ vs. - p=signif | ||||||
| OS | Cox regression BAC IHC HER2+↓ vs. - signif independent predictor | ||||||
| 1-arm UV analysis | Cappuzzo 2007 | 42 | stage III/IV | gefitinib | Resp | FISH HER2+↑ vs. - p=.007 | |
| NSCLC | TTP | FISH HER2+ vs. - p=.2 | |||||
| OS | FISH HER2+ vs. - p=.1 | ||||||
| Daniele 2007 | 42 | stage III/IV | gefitinib | Resp | FISH/CISH+↑ vs. - p=.0005 | ||
| NSCLC | |||||||
| Krug 2005 | 65 | stage IIIB/IV | docet/paclit | OS | IHC HER2+ vs. - p=NS | ||
| NSCLC | +trastuz | ||||||
| Pelosi 2005 | 345 | stage I | surgery | OS | FISH HER2+ vs. - p=NS | ||
| NSCLC | DFS | FISH HER2+ vs. - p=NS | |||||
| Langer 2004 | 56 | stage IIIB/IV | trastuz+ | OS | IHC HER2 3+ vs. 2+ vs. 1+ p=.77 | ||
| recurrent | paclit+ | PFS | IHC HER2 3+ vs. 2+ vs. 1+ p=.34 | ||||
| NSCLC | carbopl | ||||||
| Cappuzzo 2003 | 63 | stage IIIB/IV | gefitinib | TTP | IHC HER2+ vs. - p=NS | ||
| NSCLC | OS | IHC HER2+ vs. - p=NS | |||||
| ORR | IHC HER2+ vs. - p=.126 | ||||||
| Koukourakis 2000 | 112 | T1–2, N0–1 | surgery | OS | IHC HER2+ vs. - p=NS | ||
| NSCLC | |||||||
| Graziano 1998 | 66 | stage IIIA | cispl+etop | OS | IHC HER2+ vs. - p=.617 | ||
| NSCLC | (PE), surgery,PE, RT | ORR | IHC HER2+ vs. - p=.999 | ||||
| Pfeiffer 1996 | 186 | stage I-IV | surgery | OS | IHC HER2 none vs. low vs. high p=NS | ||
| NSCLC | |||||||
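Abbreviations: AC: adenocarcinoma; BAC: bronchioloalveolar carcinoma; carbopl: carboplatin; CISH: chromogenic in situ hybridization; cispl: cisplatin; DFS: disease-free survival; docet: docetaxel; etop: etoposide; FISH: fluorescence in situ hybridization; HR: hazard ratio; IHC: immunohistochemistry; MV: multivariate; NSCLC: non-small cell lung cancer; ORR: overall response rate; OS: overall survival; paclit: paclitaxel; pCR: pathologic complete response; PFS: progression-free survival; RCT: randomized, controlled trial; RFS: recurrence-free survival; RT: radiotherapy; SGA: subgroup analysis; trastuz: trastuzumab; TTP: time to progression; Tx: treatment; UV: univariate.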
| Study | Prospective design | Prespecified hypotheses about relation of marker to outcome | Large, well-defined, representative study population | Marker assay methods well-described | Blinded assessment of marker in relation to outcome | Homogeneous treatment(s), either randomized or rule-based selection | Low rate of missing data (≤ 15%) | Sufficiently long follow-up | Well-described, well-conducted multivariate analysis of outcome: 1) clear candidate variable selection, 2) clear, appropriate model-building guidelines, 3) assumptions tested, 4) standard prognostic variables included, 5) continuous variables well handled, 6) validation | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1) | 2) | 3) | 4) | 5) | 6) | |||||||||
| Koukourakis et al., 1999 | N | N | ||||||||||||