Comput Biol Med. 2014 Mar;46:1-10. doi: 10.1016/j.compbiomed.2013.12.002. Epub 2013 Dec 13.

Empirical evaluation of consistency and accuracy of methods to detect differentially expressed genes based on microarray data.

Author information

1
Department of Bioinformatics and Biostatistics, School of Public Health and Information Sciences, University of Louisville, Louisville, KY 40202, United States. Electronic address: d0yang03@louisville.edu.
2
Department of Bioinformatics and Biostatistics, School of Public Health and Information Sciences, University of Louisville, Louisville, KY 40202, United States. Electronic address: rudy.parrish@louisville.edu.
3
Department of Bioinformatics and Biostatistics, School of Public Health and Information Sciences, University of Louisville, Louisville, KY 40202, United States. Electronic address: guy.brock@louisville.edu.

Abstract

BACKGROUND:

In this study, we empirically evaluated the consistency and accuracy of five different methods to detect differentially expressed genes (DEGs) based on microarray data.

METHODS:

Five methods were compared: the t-test, significance analysis of microarrays (SAM), the empirical Bayes t-test (eBayes), t-tests relative to a threshold (TREAT), and assumption adequacy averaging (AAA). The percentage of overlapping genes (POG) and the percentage of overlapping genes related (POGR) scores were used to rank the methods on their ability to maintain a consistent list of DEGs, both within the same data set and across two different data sets concerning the same disease. The power of each method was evaluated using a simulation approach that mimics the multivariate distribution of the original microarray data.
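
As an illustration of the consistency measure, the following is a minimal sketch of a POG-style score between two DEG lists. The abstract does not give the exact formula, so the normalization used here (shared genes divided by the size of the smaller list) is an assumption, and the function name `pog` and the gene symbols in the example are hypothetical. POGR, which also credits related genes, is not sketched because its definition is not stated here.

```python
def pog(list_a, list_b):
    """Percentage of overlapping genes between two DEG lists.

    Assumption: overlap is normalized by the size of the smaller list;
    the paper's exact definition may differ.
    """
    set_a, set_b = set(list_a), set(list_b)
    if not set_a or not set_b:
        # No overlap can be computed if either list is empty.
        return 0.0
    shared = len(set_a & set_b)
    return 100.0 * shared / min(len(set_a), len(set_b))


# Hypothetical DEG lists from two analyses of the same disease.
degs_study_1 = ["TP53", "BRCA1", "MYC", "EGFR"]
degs_study_2 = ["MYC", "EGFR", "KRAS", "PTEN"]
score = pog(degs_study_1, degs_study_2)
print(f"POG = {score:.1f}%")
```

A higher score means the two analyses agree on more of their top genes, which is the sense of "consistency" used to rank the methods.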

RESULTS:

For smaller sample sizes (6 or fewer per group), moderated versions of the t-test (SAM, eBayes, and TREAT) were superior in terms of both power and consistency relative to the t-test and AAA, with TREAT having the highest consistency in each scenario. Differences in consistency were most pronounced for comparisons between two different data sets for the same disease. For larger sample sizes, AAA had the highest power for detecting small effect sizes, while TREAT had the lowest.

DISCUSSION:

For smaller sample sizes, moderated versions of the t-test can generally be recommended, while for larger sample sizes, selecting a method to detect DEGs may involve a compromise between consistency and power.

KEYWORDS:

Consistency of gene lists; Differential gene expression; Empirical study; Microarrays; Moderated t-tests; Simulation study

PMID: 24529200
PMCID: PMC3993975
DOI: 10.1016/j.compbiomed.2013.12.002
[Indexed for MEDLINE]
Free PMC Article
