Proc Natl Acad Sci U S A. 2002 Oct 1;99(20):12975-8. Epub 2002 Sep 16.

Significance and statistical errors in the analysis of DNA microarray data.

Author information

  • Departments of Applied Physics and Biology, California Institute of Technology, Pasadena, CA 91125, USA. jpbrody@uci.edu

Abstract

DNA microarrays are important devices for high-throughput measurements of gene expression, but no rational foundation has been established for understanding the sources of within-chip statistical error. We designed a specialized chip and protocol to investigate the distribution and magnitude of within-chip errors and discovered that, consistent with theoretical expectations, measurement errors follow a Lorentzian-like distribution, which explains the widely observed but previously unexplained poor reproducibility of microarray data. Using this specially designed chip, we examined a data set of repeated measurements to extract estimates of the distribution and magnitude of statistical errors in DNA microarray measurements. Using the common "ratio of medians" method, we find that the measurements follow a Lorentzian-like distribution, which is problematic for subsequent analysis. We show that a method of analysis dubbed "median of ratios" yields a more Gaussian-like distribution of errors. Finally, we show that the bootstrap algorithm can be used to extract the best estimates of the error in the measurement. Quantifying the statistical error in such measurements has important applications for estimating significance levels, clustering algorithms, and process optimization.
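
The two estimators named in the abstract, and a bootstrap error estimate, can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that "ratio of medians" means the median intensity of one channel divided by the median intensity of the other, that "median of ratios" means the median of the pairwise intensity ratios, and that the bootstrap resamples paired measurements with replacement. The function names and the intensity values are hypothetical and are not data from the paper.

```python
import random
import statistics


def ratio_of_medians(red, green):
    """Median of one channel divided by the median of the other."""
    return statistics.median(red) / statistics.median(green)


def median_of_ratios(red, green):
    """Median of the pairwise (red / green) ratios."""
    return statistics.median(r / g for r, g in zip(red, green))


def bootstrap_std_error(red, green, estimator, n_resamples=1000, seed=0):
    """Standard deviation of the estimator over paired bootstrap resamples."""
    rng = random.Random(seed)
    pairs = list(zip(red, green))
    estimates = []
    for _ in range(n_resamples):
        # Draw as many pairs as the original sample, with replacement.
        sample = [rng.choice(pairs) for _ in pairs]
        rs, gs = zip(*sample)
        estimates.append(estimator(rs, gs))
    return statistics.stdev(estimates)


# Hypothetical paired intensities for one spot (illustrative values only).
red = [120.0, 135.0, 128.0, 110.0, 142.0, 131.0, 125.0]
green = [60.0, 66.0, 65.0, 54.0, 70.0, 67.0, 61.0]

rom = ratio_of_medians(red, green)
mor = median_of_ratios(red, green)
err = bootstrap_std_error(red, green, median_of_ratios)

print(f"ratio of medians: {rom:.4f}")
print(f"median of ratios: {mor:.4f}")
print(f"bootstrap standard error of median of ratios: {err:.4f}")
```

Resampling the pairs together, rather than each channel separately, keeps the two intensities of a measurement matched, which is the assumption this sketch makes about how the bootstrap would be applied.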

PMID: 12235357 [PubMed - indexed for MEDLINE]
PMCID: PMC130571 (Free PMC Article)