Bioinformatics. 2017 Jun 15;33(12):1892-1894. doi: 10.1093/bioinformatics/btx058.

PatternMarkers & GWCoGAPS for novel data-driven biomarkers via whole transcriptome NMF.

Author information

1. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA.
2. Lieber Institute for Brain Development, Baltimore, MD, USA.
3. Department of Oncology and Division of Biostatistics and Bioinformatics, Johns Hopkins School of Medicine, Baltimore, MD, USA.
4. Vavilov Institute of General Genetics, Moscow, Russia.
5. Research Institute of Genetics and Selection of Industrial Microorganisms, Moscow, Russia.
6. Department of Otolaryngology-Head and Neck Surgery, Johns Hopkins School of Medicine, Baltimore, MD, USA.
7. Department of Mathematics and Statistics, The College of New Jersey, Ewing Township, NJ, USA.
8. Department of Neurology and Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA.
9. Institute for Genome Sciences, University of Maryland School of Medicine.

Abstract

Summary:

Non-negative Matrix Factorization (NMF) algorithms associate gene expression with biological processes (e.g. time-course dynamics or disease subtypes). Compared with univariate associations, the relative weights of NMF solutions can obscure biomarkers. Therefore, we developed a novel patternMarkers statistic to extract genes for biological validation and enhanced visualization of NMF results. Finding novel and unbiased gene markers with patternMarkers requires whole-genome data. Therefore, we also developed Genome-Wide CoGAPS Analysis in Parallel Sets (GWCoGAPS), the first robust whole-genome Bayesian NMF using the sparse MCMC algorithm CoGAPS. Additionally, a manual version of the GWCoGAPS algorithm contains analytic and visualization tools, including patternMatcher, a Shiny web application. The decomposition in the manual pipeline can be replaced with any NMF algorithm, further generalizing the software. Using these tools, we find granular brain-region and cell-type-specific signatures with corresponding biomarkers in GTEx data, illustrating how GWCoGAPS and patternMarkers ascertain data-driven biomarkers from whole-genome data.
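
To make concrete why raw NMF weights can obscure biomarkers, the following is a minimal sketch of one way to pick marker genes from a gene-by-pattern weight matrix. The abstract does not define the patternMarkers statistic, so the scoring rule here (scale each gene by its largest weight, then measure its distance to an idealized "one pattern only" profile) is an assumption made for illustration, as are the function names and the toy weights. It is not the CoGAPS implementation.

```python
import math

def score_genes(amplitudes):
    """Assign each gene to the pattern it most exclusively follows.

    amplitudes: dict mapping gene name -> list of non-negative weights,
    one per pattern (hypothetical stand-in for an NMF weight matrix).
    Returns dict gene -> (pattern index, distance); lower distance means
    the gene is closer to being specific to that single pattern.
    """
    result = {}
    for gene, weights in amplitudes.items():
        peak = max(weights)
        if peak <= 0:
            continue  # gene carries no weight in any pattern
        scaled = [w / peak for w in weights]
        # Euclidean distance from the scaled row to each one-hot profile
        distances = [
            math.sqrt(sum((s - (1.0 if j == k else 0.0)) ** 2
                          for j, s in enumerate(scaled)))
            for k in range(len(scaled))
        ]
        best = min(range(len(distances)), key=distances.__getitem__)
        result[gene] = (best, distances[best])
    return result

def markers_by_pattern(amplitudes):
    """Group genes by assigned pattern, most specific gene first."""
    ranked = {}
    for gene, (pattern, dist) in score_genes(amplitudes).items():
        ranked.setdefault(pattern, []).append((gene, dist))
    for genes in ranked.values():
        genes.sort(key=lambda item: item[1])
    return ranked

# Toy weights for four genes across three patterns (made-up numbers).
toy = {
    "geneA": [9.0, 0.5, 0.2],
    "geneB": [4.0, 3.8, 0.1],
    "geneC": [0.1, 0.2, 2.0],
    "geneD": [0.3, 6.0, 0.4],
}

for pattern, genes in sorted(markers_by_pattern(toy).items()):
    print(pattern, [(g, round(d, 3)) for g, d in genes])
```

In this toy example geneB has a large raw weight in two patterns, so it ranks as a poor marker for either one even though its weights exceed those of geneC, which is specific to a single pattern.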

Availability and Implementation:

PatternMarkers & GWCoGAPS are in the CoGAPS Bioconductor package (3.5) under the GPL license.

Contact:

gsteinobrien@jhmi.edu or ccolantu@jhmi.edu or ejfertig@jhmi.edu.

Supplementary information:

Supplementary data are available at Bioinformatics online.

PMID: 28174896
PMCID: PMC5860188
DOI: 10.1093/bioinformatics/btx058
[Indexed for MEDLINE]
Free PMC Article
