Bioinformatics. 2015 Jan 1;31(1):137-9. doi: 10.1093/bioinformatics/btu607. Epub 2014 Sep 10.

UNDO: a Bioconductor R package for unsupervised deconvolution of mixed gene expressions in tumor samples.

Author information

Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, Department of Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC 20057, Departments of Pathology and Oncology, Johns Hopkins University, Baltimore, MD 21231 and Department of Surgery, Memorial Sloan-Kettering Cancer Center, New York, NY 10021, USA.



We develop a novel unsupervised deconvolution method, within a well-grounded mathematical framework, to dissect mixed gene expressions in heterogeneous tumor samples. We implement an R package, UNsupervised DecOnvolution (UNDO), that can be used to automatically detect cell-specific marker genes (MGs) located on the scatter radii of mixed gene expressions, estimate cellular proportions in each sample and deconvolute mixed expressions into cell-specific expression profiles. We demonstrate the performance of UNDO over a wide range of tumor-stroma mixing proportions, validate UNDO on various biologically mixed benchmark gene expression datasets and further estimate tumor purity in TCGA/CPTAC datasets. The highly accurate deconvolution results obtained suggest not only the existence of cell-specific MGs but also UNDO's ability to detect them blindly and correctly. Although the principal application here involves microarray gene expressions, our methodology can be readily applied to other types of quantitative molecular profiling data.
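The geometric idea in the abstract, that marker genes sit on the outer radii of the scatter plot of mixed expressions, can be illustrated with a toy two-sample, two-source example. The sketch below is not the UNDO package's implementation or API. The function names `estimate_mixing` and `deconvolve` are hypothetical, and the code assumes noise-free, non-negative data, at least one perfect marker gene per source, and sources on a common scale so that the proportions in each sample sum to one.

```python
import math


def estimate_mixing(x1, x2):
    """Estimate a 2x2 mixing matrix from two mixed samples (toy, noise-free).

    x1, x2: per-gene expression values in mixed sample 1 and mixed sample 2.
    Assumption: genes expressed in only one source exist, so they fall on
    the two extreme directions of the (x1, x2) scatter plot.
    """
    # Angle of each gene in the (sample 1, sample 2) scatter plot.
    angles = [math.atan2(b, a) for a, b in zip(x1, x2) if a > 0 or b > 0]
    lo, hi = min(angles), max(angles)  # the two scatter radii

    # Each radius gives one column of the mixing matrix up to a scale c1, c2.
    # Fix the scales by requiring each row (sample) to sum to one.
    det = math.sin(hi - lo)
    if abs(det) < 1e-12:
        raise ValueError("scatter radii coincide; samples are not distinct mixtures")
    c1 = (math.sin(hi) - math.cos(hi)) / det
    c2 = (math.cos(lo) - math.sin(lo)) / det
    return [
        [c1 * math.cos(lo), c2 * math.cos(hi)],
        [c1 * math.sin(lo), c2 * math.sin(hi)],
    ]


def deconvolve(x1, x2, A):
    """Recover per-gene source expressions by inverting the 2x2 matrix A."""
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    s1 = [(A[1][1] * a - A[0][1] * b) / d for a, b in zip(x1, x2)]
    s2 = [(-A[1][0] * a + A[0][0] * b) / d for a, b in zip(x1, x2)]
    return s1, s2


# Synthetic demo: the first gene is a marker of source 1, the second of source 2.
true_A = [[0.7, 0.3], [0.2, 0.8]]
src1 = [10.0, 0.0, 5.0, 3.0, 7.0]
src2 = [0.0, 8.0, 5.0, 6.0, 2.0]
mix1 = [true_A[0][0] * u + true_A[0][1] * v for u, v in zip(src1, src2)]
mix2 = [true_A[1][0] * u + true_A[1][1] * v for u, v in zip(src1, src2)]

A_hat = estimate_mixing(mix1, mix2)
s1_hat, s2_hat = deconvolve(mix1, mix2, A_hat)
```

In this synthetic setting the estimated rows of `A_hat` play the role of the per-sample cellular proportions, and `s1_hat` and `s2_hat` play the role of the cell-specific expression profiles. Real data are noisy, so a practical method would need a more robust way to locate the scatter radii than taking the single minimum and maximum angle.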


UNDO is available at

[Indexed for MEDLINE]
Free PMC Article
