Bioinformatics. 2015 Jan 1;31(1):84-93. doi: 10.1093/bioinformatics/btu603. Epub 2014 Sep 5.

Snowball: resampling combined with distance-based regression to discover transcriptional consequences of a driver mutation.

Author information

1
Department of Biomedical Informatics, Department of Biostatistics and Center for Quantitative Sciences, Vanderbilt University, Nashville, TN 37232, Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH 44106, Department of Psychiatry and Department of Cancer Biology, Vanderbilt University, Nashville, TN 37212, USA.
2
Department of Biomedical Informatics, Department of Biostatistics and Center for Quantitative Sciences, Vanderbilt University, Nashville, TN 37232, Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH 44106, Department of Psychiatry and Department of Cancer Biology, Vanderbilt University, Nashville, TN 37212, USA.

Abstract

MOTIVATION:

Large-scale cancer genomic studies, such as The Cancer Genome Atlas (TCGA), have profiled multidimensional genomic data, including mutation and expression profiles, across a variety of cancer types to uncover the molecular mechanisms of carcinogenesis. More than a hundred driver mutations that confer a growth advantage on cells have been characterized. However, how driver mutations regulate the transcriptome to affect cellular functions remains largely unexplored. Differential analysis of gene expression with respect to a driver mutation in patient samples could provide new insights into how driver mutations dysregulate the tumor genome and could help in developing personalized treatment strategies.

RESULTS:

Here, we introduce the Snowball approach, a highly sensitive statistical analysis method for identifying transcriptional signatures that are affected by a recurrent driver mutation. Snowball combines a resampling-based approach with a distance-based regression framework to assign each gene a robust ranking index based on its aggregated association with the presence of the mutation, and then selects the top significant genes for downstream data analyses or experiments. In our application of the Snowball approach to both synthesized and TCGA data, we demonstrated that it outperforms the standard methods and provides more accurate inferences about the functional effects and transcriptional dysregulation of driver mutations.
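
The general idea of aggregating gene rankings over resampled data can be illustrated with a minimal Python sketch. It is not the DESnowball implementation: the function name `resampled_gene_ranking`, its parameters, and the toy data are hypothetical, and a simple absolute difference in group means stands in for the distance-based regression statistic described above.

```python
import random
from statistics import mean


def resampled_gene_ranking(expr, mutated, n_resamples=200, seed=0):
    """Rank genes by their average rank across bootstrap resamples.

    expr    : dict mapping gene name -> list of expression values (one per sample)
    mutated : list of bools, True where the sample carries the mutation
    Returns gene names ordered from strongest to weakest aggregated association.
    NOTE: the per-resample score is a placeholder (absolute mean difference),
    not the distance-based regression used by Snowball.
    """
    rng = random.Random(seed)
    mut_idx = [i for i, m in enumerate(mutated) if m]
    wt_idx = [i for i, m in enumerate(mutated) if not m]
    genes = list(expr)
    rank_sum = {g: 0 for g in genes}

    for _ in range(n_resamples):
        # Bootstrap within each group so that neither group can become empty.
        mut_s = [rng.choice(mut_idx) for _ in mut_idx]
        wt_s = [rng.choice(wt_idx) for _ in wt_idx]

        # Score every gene on this resample.
        score = {}
        for g in genes:
            vals = expr[g]
            mut_mean = mean([vals[i] for i in mut_s])
            wt_mean = mean([vals[i] for i in wt_s])
            score[g] = abs(mut_mean - wt_mean)

        # Convert scores to ranks (1 = strongest) and accumulate them.
        ordered = sorted(genes, key=lambda g: score[g], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            rank_sum[g] += rank

    # A lower average rank means a more consistently strong association.
    return sorted(genes, key=lambda g: rank_sum[g] / n_resamples)


# Toy data (hypothetical): three mutated samples followed by three wild-type samples.
mutated = [True, True, True, False, False, False]
expr = {
    "A": [9, 10, 11, 0, 1, 0],  # strongly shifted in mutated samples
    "B": [5, 5, 5, 5, 5, 5],    # no shift
    "C": [2, 2, 3, 1, 1, 0],    # modest shift
}
ranking = resampled_gene_ranking(expr, mutated)
print(ranking)
```

Averaging ranks over many resamples, rather than ranking once on the full data, is what makes the resulting index less sensitive to any individual sample; the choice of per-resample statistic is independent of that aggregation step.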

AVAILABILITY AND IMPLEMENTATION:

The R package and source code are available from CRAN at http://cran.r-project.org/web/packages/DESnowball and at http://bioinfo.mc.vanderbilt.edu/DESnowball/.

PMID:
25192743
PMCID:
PMC4271146
DOI:
10.1093/bioinformatics/btu603
[Indexed for MEDLINE]
Free PMC Article
