J Natl Cancer Inst. 2012 Feb 22;104(4):311-25. doi: 10.1093/jnci/djr545. Epub 2012 Jan 18.

A three-gene model to robustly identify breast cancer molecular subtypes.

Author information

  • Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02115, USA. bhaibeka@jimmy.harvard.edu

Abstract

BACKGROUND:

Single sample predictors (SSPs) and subtype classification models (SCMs) are gene expression-based classifiers used to identify the four primary molecular subtypes of breast cancer (basal-like, HER2-enriched, luminal A, and luminal B). SSPs use hierarchical clustering, followed by nearest centroid classification, based on large sets of tumor-intrinsic genes. SCMs use a mixture of Gaussian distributions based on sets of genes whose expression is specifically correlated with three key breast cancer genes (estrogen receptor [ER], HER2, and aurora kinase A [AURKA]). The aim of this study was to compare the robustness, classification concordance, and prognostic value of these classifiers with those of a simplified three-gene SCM in a large compendium of microarray datasets.
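
As an illustration of the nearest centroid step that SSPs rely on, the following minimal Python sketch averages the expression profiles of each subtype into a centroid and assigns a new tumor to the closest one. It is not the published SSP procedure: the function names `fit_centroids` and `predict_subtype`, the two-gene toy profiles, and the use of Euclidean distance are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def fit_centroids(profiles, labels):
    """Average the expression profiles of each subtype into one centroid."""
    sums, counts = {}, {}
    for profile, label in zip(profiles, labels):
        if label not in sums:
            sums[label] = [0.0] * len(profile)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], profile)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict_subtype(centroids, profile):
    """Return the subtype whose centroid is nearest to the given profile."""
    return min(centroids, key=lambda label: dist(centroids[label], profile))


# Hypothetical toy data: two genes per tumor, values invented for illustration.
train_profiles = [[1.0, 0.0], [1.2, 0.2], [-1.0, 0.5], [-0.8, 0.3]]
train_labels = ["luminal", "luminal", "basal-like", "basal-like"]

centroids = fit_centroids(train_profiles, train_labels)
print(predict_subtype(centroids, [0.9, 0.0]))
print(predict_subtype(centroids, [-1.0, 0.0]))
```

Because the centroids are estimated from the training dataset, a classifier of this kind can assign the same tumor to different subtypes when it is refit on different data, which is the robustness question the study examines.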

METHODS:

Thirty-six publicly available breast cancer datasets (n = 5715) were subjected to molecular subtyping using five published classifiers (three SSPs and two SCMs) and SCMGENE, the new three-gene (ER, HER2, and AURKA) SCM. We used the prediction strength statistic to estimate robustness of the classification models, defined as the capacity of a classifier to assign the same tumors to the same subtypes independently of the dataset used to fit it. We used Cohen κ and Cramer V coefficients to assess concordance between the subtype classifiers and association with clinical variables, respectively. We used Kaplan-Meier survival curves and cross-validated partial likelihood to compare prognostic value of the resulting classifications. All statistical tests were two-sided.
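
The two agreement statistics named above can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions, not the authors' analysis code: the function names `cohen_kappa` and `cramers_v` and the toy label lists are hypothetical. Cohen κ compares observed agreement between two classifiers with the agreement expected by chance, and Cramer V rescales the chi-square statistic of a contingency table to the range 0 to 1.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label assignments."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance from the marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label
    return (observed - expected) / (1.0 - expected)


def cramers_v(x, y):
    """Strength of association between two categorical variables (0 to 1)."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    joint, rows, cols = Counter(zip(x, y)), Counter(x), Counter(y)
    chi2 = 0.0
    for r in rows:
        for c in cols:
            expected = rows[r] * cols[c] / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    if k == 0:
        return 0.0  # association is undefined with a single category
    return sqrt(chi2 / (n * k))


# Hypothetical toy labels, invented for illustration only.
classifier_1 = ["basal", "basal", "lumA", "lumA", "lumB", "lumB"]
classifier_2 = ["basal", "basal", "lumA", "lumB", "lumB", "lumB"]
er_status = ["neg", "neg", "pos", "pos", "pos", "pos"]

print(round(cohen_kappa(classifier_1, classifier_2), 3))
print(round(cramers_v(classifier_1, er_status), 3))
```

In the study's setup, κ would be computed between the subtype calls of two classifiers on the same tumors, and V between one classifier's subtype calls and a clinical variable such as ER status.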

RESULTS:

SCMs were statistically significantly more robust than SSPs, with SCMGENE being the most robust because of its simplicity. SCMGENE was statistically significantly concordant with published SCMs (κ = 0.65-0.70) and SSPs (κ = 0.34-0.59), was statistically significantly associated with ER status (V = 0.64), HER2 status (V = 0.52), and histological grade (V = 0.55), and yielded similarly strong prognostic value.

CONCLUSION:

Our results suggest that adequate classification of the major and clinically relevant molecular subtypes of breast cancer can be robustly achieved with quantitative measurements of three key genes.

Comment in

  • Gene signatures revisited. [J Natl Cancer Inst. 2012]
PMID: 22262870 [PubMed - indexed for MEDLINE]
PMCID: PMC3283537
