Bioinformatics. 2019 Apr 8. pii: btz226. doi: 10.1093/bioinformatics/btz226. [Epub ahead of print]

Unsupervised discovery of phenotype specific multi-omics networks.

Author information

Computational Bioscience Program, University of Colorado, Aurora, CO, USA.
Department of Biostatistics and Informatics, University of Colorado, Aurora, CO, USA.
Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA.
Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA.
Center for Genes, Environment & Health, National Jewish Health, Denver, CO, USA.
Department of Pharmaceutical Sciences, University of Colorado, Aurora, CO, USA.



Complex diseases often involve a wide spectrum of phenotypic traits. Better understanding of the biological mechanisms relevant to each trait promotes understanding of the etiology of the disease and the potential for targeted and effective treatment plans. There have been many efforts towards omics data integration and network reconstruction, but limited work has examined the incorporation of relevant (quantitative) phenotypic traits.


We propose a novel technique, sparse multiple canonical correlation network analysis (SmCCNet), for integrating multiple omics data types along with a quantitative phenotype of interest, and for constructing multi-omics networks that are specific to the phenotype. As a case study, we focus on miRNA-mRNA networks. Through simulations, we demonstrate that SmCCNet has better overall prediction performance than popular gene expression network construction and integration approaches under realistic settings. Applying SmCCNet to studies on chronic obstructive pulmonary disease (COPD) and breast cancer, we found enrichment of known relevant pathways (e.g. the Cadherin pathway for COPD and the interferon-gamma signaling pathway for breast cancer) as well as lesser-known omics features that may be important to the diseases. Although those applications focus on miRNA-mRNA co-expression networks, SmCCNet is applicable to a variety of omics and other data types. It can also be easily generalized to incorporate multiple quantitative phenotypes simultaneously. The versatility of SmCCNet suggests great potential of the approach in many areas.
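To make the idea of a phenotype-specific sparse canonical correlation concrete, the following is a minimal Python sketch. It is not the SmCCNet R implementation and does not use its API. It assumes two omics matrices and one quantitative phenotype, weights the three pairwise correlation terms equally, treats within-dataset covariances as identity, and finds sparse canonical weights by alternating soft-thresholded updates. The function names, the relative threshold `frac`, and the simulated data are all hypothetical choices made for illustration; subsampling and module extraction by clustering are omitted.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink each entry of a toward zero by lam (L1 proximal step)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def sparse_unit(a, frac):
    """Soft-threshold at frac * max|a|, then rescale to unit L2 norm."""
    a = soft_threshold(a, frac * np.max(np.abs(a)))
    return a / np.linalg.norm(a)

def standardize(m):
    """Center and scale along axis 0 (columns of a matrix, or a vector)."""
    return (m - m.mean(axis=0)) / m.std(axis=0)

def sparse_cca_with_phenotype(x1, x2, y, frac=0.3, n_iter=50):
    """Approximately maximize u'C12 v + u'c1y + v'c2y with sparse u, v.

    Assumptions of this sketch: equal weights on the three pairwise
    terms, and a relative threshold `frac` in [0, 1) as the penalty.
    """
    x1, x2, y = standardize(x1), standardize(x2), standardize(y)
    n = x1.shape[0]
    c12 = x1.T @ x2 / n   # cross-correlation between the two omics types
    c1y = x1.T @ y / n    # correlation of omics type 1 with the phenotype
    c2y = x2.T @ y / n    # correlation of omics type 2 with the phenotype
    v = sparse_unit(c2y, frac)
    for _ in range(n_iter):
        u = sparse_unit(c12 @ v + c1y, frac)
        v = sparse_unit(c12.T @ u + c2y, frac)
    return u, v

def weight_network(u, v):
    """Adjacency over all features from the outer product of the weights."""
    w = np.concatenate([u, v])
    adj = np.abs(np.outer(w, w))
    np.fill_diagonal(adj, 0.0)
    return adj

def simulate(seed=0, n=200, p1=20, p2=15, k=3):
    """Toy data: the first k features of each matrix track a latent factor."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    x1 = rng.normal(size=(n, p1))
    x2 = rng.normal(size=(n, p2))
    x1[:, :k] = z[:, None] + 0.5 * rng.normal(size=(n, k))
    x2[:, :k] = z[:, None] + 0.5 * rng.normal(size=(n, k))
    y = z + 0.5 * rng.normal(size=n)
    return x1, x2, y

if __name__ == "__main__":
    x1, x2, y = simulate()
    u, v = sparse_cca_with_phenotype(x1, x2, y)
    adj = weight_network(u, v)
    print("selected features in x1:", np.flatnonzero(u))
    print("selected features in x2:", np.flatnonzero(v))
    print("edges in network:", int(np.count_nonzero(adj) // 2))
```

In this toy setting, features with nonzero canonical weights form the nodes of the network, and edge strength is the product of the absolute weights. A fuller treatment would repeat the fit over feature subsamples and average the resulting adjacency matrices before extracting modules.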


The SmCCNet algorithm is written in R, and is freely available on the web at


Supplementary data are available at Bioinformatics online.
