Methods. 2014 Jun 1;67(3):304-12. doi: 10.1016/j.ymeth.2014.03.005. Epub 2014 Mar 18.

Breast cancer patient stratification using a molecular regularized consensus clustering method.

Author information

Department of Biomedical Informatics, The Ohio State University, United States; Department of Electrical and Computer Engineering, The Ohio State University, United States.
Department of Computer Science and Engineering, The Ohio State University, United States.
Department of Biomedical Informatics, The Ohio State University, United States.


Breast cancers are highly heterogeneous, with different subtypes that lead to different clinical outcomes, including prognosis, response to treatment, and chances of recurrence and metastasis. An important task in personalized medicine is to determine the subtype of a breast cancer patient in order to provide the most effective treatment. To achieve this goal, integrative genomics approaches have recently been developed that use multiple modalities of large datasets, ranging from genotypes to multiple levels of phenotypes. A major challenge in integrative genomics is how to effectively integrate multiple modalities of data to stratify breast cancer patients. Consensus clustering algorithms have often been adopted for this purpose. However, existing consensus clustering algorithms are not suited to integrating clustering results obtained from a mixture of numerical and categorical data. In this work, we present a mathematical formulation for integrative clustering of multiple-source data, including both numerical and categorical data, to resolve this issue. Specifically, we formulate the problem as a novel consensus clustering method called Molecular Regularized Consensus Patient Stratification (MRCPS), based on an optimization process with regularization. Unlike traditional consensus clustering methods, MRCPS can automatically and spontaneously cluster both numerical and categorical data with any choice of similarity metric. We apply this new method to the TCGA breast cancer datasets and evaluate it using both statistical criteria and clinical relevance in predicting prognosis. The results demonstrate the superiority of this method in terms of effectiveness of aggregation and differentiation of patient outcomes. Our method, while motivated by breast cancer research, is nevertheless generally applicable to integrative genomics studies.
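
As a rough illustration of what aggregating clusterings from mixed data sources can look like, the sketch below uses a generic co-association (evidence accumulation) consensus. It is not the MRCPS optimization described in the abstract, whose formulation is not given here. The function names, the toy patient labels, and the 0.5 threshold are all assumptions made for this example. Each input partition is a list of cluster labels, so one partition may come from clustering numerical data and another may be a categorical attribute used directly.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of partitions in which each pair of patients shares a label."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("every partition must label the same patients")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_labels(matrix, threshold=0.5):
    """Group patients whose co-association exceeds the threshold.

    Uses union-find, so groups are the connected components of the
    thresholded co-association graph (an assumed, simple consensus rule).
    """
    n = len(matrix)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical toy data for six patients (not taken from TCGA).
expression_clusters = [0, 0, 0, 1, 1, 1]                   # from numerical data
methylation_clusters = [0, 0, 1, 1, 1, 1]                  # from numerical data
mutation_status = ["mut", "mut", "mut", "wt", "wt", "wt"]  # categorical

matrix = co_association(
    [expression_clusters, methylation_clusters, mutation_status]
)
consensus = consensus_labels(matrix)
print(consensus)  # [0, 0, 0, 1, 1, 1]
```

In this toy case the third patient is grouped with the first two because two of the three sources agree on that grouping. A regularized formulation such as the one the abstract describes would replace the fixed threshold with an optimization over the consensus assignment.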


Breast cancer prognosis; Breast cancer subtypes; Cancer patient stratification; Consensus clustering; Integrative genomics

[Indexed for MEDLINE]
Free PMC Article
