Format

Send to

Choose Destination
See comment in PubMed Commons below
Bioinformatics. 2011 Feb 15;27(4):509-15. doi: 10.1093/bioinformatics/btq701. Epub 2010 Dec 24.

A computationally efficient modular optimal discovery procedure.

Author information

1
Department of Biostatistics, University of Washington, Seattle, WA 98195, USA.

Abstract

MOTIVATION:

It is well known that patterns of differential gene expression across biological conditions are often shared by many genes, particularly those within functional groups. Taking advantage of these patterns can lead to increased statistical power and biological clarity when testing for differential expression in a microarray experiment. The optimal discovery procedure (ODP), which maximizes the expected number of true positives for each fixed number of expected false positives, is a framework aimed at this goal. Storey et al. introduced an estimator of the ODP for identifying differentially expressed genes. However, their ODP estimator grows quadratically in computational time with respect to the number of genes. Reducing this computational burden is a key step in making the ODP practical for usage in a variety of high-throughput problems.

RESULTS:

Here, we propose a new estimate of the ODP called the modular ODP (mODP). The existing 'full ODP' requires that the likelihood function for each gene be evaluated according to the parameter estimates for all genes. The mODP assigns genes to modules according to a Kullback-Leibler distance, and then evaluates the statistic only at the module-averaged parameter estimates. We show that the mODP is relatively insensitive to the choice of the number of modules, but dramatically reduces the computational complexity from quadratic to linear in the number of genes. We compare the full ODP algorithm and mODP on simulated data and gene expression data from a recent study of Morrocan Amazighs. The mODP and full ODP algorithm perform very similarly across a range of comparisons.

AVAILABILITY:

The mODP methodology has been implemented into EDGE, a comprehensive gene expression analysis software package in R, available at http://genomine.org/edge/.

PMID:
21186247
PMCID:
PMC3105483
DOI:
10.1093/bioinformatics/btq701
[Indexed for MEDLINE]
Free PMC Article
PubMed Commons home

PubMed Commons

0 comments
How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for Silverchair Information Systems Icon for PubMed Central
    Loading ...
    Support Center