Copyright © 2002, The National Academy of Sciences

Genetics

Significance and statistical errors in the analysis of DNA microarray data

Departments of *Applied Physics and ‡Biology, California Institute of Technology, Pasadena, CA 91125

†To whom reprint requests should be addressed at: Center for Biomedical Engineering, University of California, REC 204, Code 2715, Irvine, CA 92697-2715. E-mail: jpbrody/at/uci.edu.

Communicated by Robert H. Austin, Princeton University, Princeton, NJ

Received May 21, 2001; Accepted August 5, 2002.

Abstract

DNA microarrays are important devices for high-throughput measurements of gene expression, but no rational foundation has been established for understanding the sources of within-chip statistical error. We designed a specialized chip and protocol to investigate the distribution and magnitude of within-chip errors and discovered that, as predicted by theory, measurement errors follow a Lorentzian-like distribution, which explains the widely observed but unexplained ill-reproducibility in microarray data. Using this specially designed chip, we examined a data set of repeated measurements to extract estimates of the distribution and magnitude of statistical errors in DNA microarray measurements. Using the common “ratio of medians” method, we find that the measurements follow a Lorentzian-like distribution, which is problematic for subsequent analysis. We show that a method of analysis dubbed “median of ratios” yields a more Gaussian-like distribution of errors. Finally, we show that the bootstrap algorithm can be used to extract the best estimates of the error in the measurement. Quantifying the statistical error in such measurements has important applications for estimating significance levels, clustering algorithms, and process optimization.
Any measurement is only an estimate of a physical value, and to be useful it should be accompanied by an estimate of its error. The error in a single measurement can be estimated by examining a histogram of many independently repeated measurements. Typically, such a histogram forms a normal (i.e., Gaussian) distribution whose mean value is taken as the best estimate of the true value. The standard deviation of this distribution is an estimate of the error in a single measurement.

The measurement of ratios poses special statistical problems. The distribution of the ratio x/y of two Gaussian random variables x and y is not necessarily Gaussian. For noisy measurements, where the standard deviation is a significant fraction of the measured value, the distribution of the ratio approaches a Lorentzian or Cauchy distribution (1). For low-noise measurements, where the standard deviation is a small fraction of the mean, the distribution of the ratio is approximately Gaussian.

Loosely speaking, Lorentzian distributions have longer tails than Gaussian distributions. Points sampled from a Lorentzian distribution will therefore include “outliers” more frequently than points sampled from a Gaussian distribution of similar width. The mean, standard deviation, and higher moments of the Lorentzian distribution are undefined. The measurement of ratios can give wide tails and nonsensical error estimates unless the data are handled properly. Thus, one needs statistical tools other than the mean and the standard error of the mean to obtain measurements and error estimates.

To examine the statistical reliability of measurements from DNA microarrays, we examined microarrays with multiply repeated spots and looked at differences in the measured values. We analyzed data from experiments that measure a large number (1,152) of mRNAs four different times on a single slide.
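The contrast between noisy and low-noise ratios described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the means, standard deviations, sample size, and the helper names `ratio_samples` and `tail_fraction` are hypothetical choices made for illustration and are not taken from the experiments.

```python
import random
import statistics

def ratio_samples(mean, sd, n, seed=0):
    """Draw n ratios x/y, where x and y are independent Gaussians
    with the same (hypothetical) mean and standard deviation."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) / rng.gauss(mean, sd) for _ in range(n)]

def tail_fraction(values, k=5.0):
    """Fraction of points lying more than k half-interquartile-ranges
    from the median. Quartiles are used because they remain well
    defined even when the mean and standard deviation are not."""
    med = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    half_width = (q3 - q1) / 2.0
    return sum(abs(v - med) > k * half_width for v in values) / len(values)

# Noisy case: standard deviation much larger than the mean (Lorentzian-like).
noisy_tail = tail_fraction(ratio_samples(mean=0.0, sd=1.0, n=20_000))

# Low-noise case: standard deviation is 2% of the mean (Gaussian-like).
quiet_tail = tail_fraction(ratio_samples(mean=10.0, sd=0.2, n=20_000))

print(f"noisy ratio, fraction in far tails:     {noisy_tail:.4f}")
print(f"low-noise ratio, fraction in far tails: {quiet_tail:.4f}")
```

With these assumed parameters the noisy ratio places a far larger fraction of its points in the distant tails than the low-noise ratio does, which is the "more frequent outliers" behavior described in the text.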
When the ratio measurements are extracted using one common method [the ratio of medians (2)], the distribution of deviations follows a Lorentzian-like distribution rather than a normal (Gaussian) distribution. When we reanalyzed the data by using a modified algorithm (median of ratios), the distribution became more Gaussian-like and we obtained more consistent results.

We describe a method for estimating the error in the measured ratio by using the bootstrap method (3). The bootstrap is an algorithm used to estimate confidence intervals of an arbitrary parameter estimated from a population of measurements. It does this by repeatedly sampling at random from the population and calculating the parameter of interest each time. We evaluated this method of error estimation by comparing the actual differences in multiple measurements of the ratio (the median of the ratios) to the estimated error for a single measurement. There is good agreement between the two, leading us to conclude that the bootstrap can give reliable error estimates.

Methods

A test slide was constructed containing 100 spots representing cDNA cloned from mouse glycerol-3-phosphate dehydrogenase (G3PDH). All of the spots were from a single preparation of cDNA. Arrays were hybridized to mRNA from C2C12 and 10T1/2 cell lines. Results are shown in Fig. 1.
A 4,608-spot DNA microarray representing 1,152 mouse genes, each repeated four times, was constructed. mRNA was extracted from a whole adult mouse liver (Cy5) and a C2C12 mouse myoblast cell line (Cy3) and hybridized to the microarray. The slide was scanned and spots were grouped by the cDNA clone they represent. The commonly used measure of signal is the log2 transform of the ratio of medians. The ratio of medians is defined as “the ratio of the median intensities of each feature for each wavelength, with the median background subtracted.” We found that the median of ratios, defined as “the median of pixel-by-pixel ratios of pixel intensities, with the median background subtracted,” provided a more consistent measurement. A scatter plot is presented in Fig. 2.
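The two quoted definitions can be made concrete with a short sketch. The pixel values, the background levels, and the function names below are hypothetical and chosen only to show that the two statistics can differ on the same spot; skipping pixels with a zero denominator is an assumption of this sketch, not something stated in the text.

```python
import math
import statistics

def ratio_of_medians(cy5, cy3, bg5, bg3):
    """One ratio per spot: the median intensity of each channel,
    with that channel's median background subtracted."""
    return (statistics.median(cy5) - bg5) / (statistics.median(cy3) - bg3)

def median_of_ratios(cy5, cy3, bg5, bg3):
    """One ratio per pixel (background subtracted), then the median
    of those ratios. Pixels whose background-subtracted denominator
    is zero are skipped (an assumption made for this sketch)."""
    ratios = [(a - bg5) / (b - bg3) for a, b in zip(cy5, cy3) if b != bg3]
    return statistics.median(ratios)

# Hypothetical three-pixel spot; pixel i in cy5 pairs with pixel i in cy3.
cy5_pixels = [110, 130, 150]
cy3_pixels = [60, 110, 35]
bg5 = bg3 = 10  # hypothetical median background per channel

rom = ratio_of_medians(cy5_pixels, cy3_pixels, bg5, bg3)  # 120 / 50
mor = median_of_ratios(cy5_pixels, cy3_pixels, bg5, bg3)  # median of 2.0, 1.2, 5.6

print("ratio of medians:", rom, " log2:", math.log2(rom))
print("median of ratios:", mor, " log2:", math.log2(mor))
```

The ratio of medians discards the pairing between the two channels at each pixel, whereas the median of ratios keeps it, which is why the two values need not agree.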
We used a computer algorithm to calculate the bootstrap median and confidence levels in the median. The bootstrap algorithm works as follows. A list of measured ratios, one from each pixel in a spot, was compiled. A new list was created by sampling (with replacement) from this list. The median value of the new list was computed and recorded on a list of medians. This procedure was repeated as many times as there were pixels in the spot. The mean and 90% confidence interval in the mean were computed from the list of medians. In the bootstrap algorithm, these represent the best estimate of the median and the 90% confidence level of the estimate. This is reported in Table 1 and shown graphically in Fig. 3.
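The steps above can be sketched as follows. This is a minimal illustration, not the software used for the analysis: the function name, the simulated pixel ratios, and in particular the use of a percentile interval for the 90% confidence bounds are assumptions, because the text does not say how the interval was derived from the list of medians.

```python
import random
import statistics

def bootstrap_median(ratios, n_resamples=None, level=0.90, seed=0):
    """Bootstrap estimate of the median of per-pixel ratios.

    Returns (mean of bootstrap medians, lower bound, upper bound).
    The bounds are percentiles of the bootstrap medians, which is an
    assumption of this sketch.
    """
    rng = random.Random(seed)
    n = len(ratios)
    if n_resamples is None:
        n_resamples = n  # the text repeats once per pixel in the spot

    medians = sorted(
        # Resample n values with replacement and record the median.
        statistics.median(rng.choices(ratios, k=n))
        for _ in range(n_resamples)
    )

    alpha = (1.0 - level) / 2.0
    lower = medians[int(alpha * (n_resamples - 1))]
    upper = medians[int(round((1.0 - alpha) * (n_resamples - 1)))]
    return statistics.fmean(medians), lower, upper

# Hypothetical spot: 200 pixel ratios scattered around a true ratio of 2.
data_rng = random.Random(1)
pixel_ratios = [2.0 + data_rng.gauss(0.0, 0.3) for _ in range(200)]

estimate, lower, upper = bootstrap_median(pixel_ratios)
print(f"median estimate {estimate:.3f}, 90% interval [{lower:.3f}, {upper:.3f}]")
```

Using more resamples than there are pixels would give smoother interval bounds; the default here simply follows the repetition count given in the text.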
Results

The Efficiency of Hybridization on DNA Spots Varies Over a Wide Range. This has been known since the first paper on spotted DNA microarrays (4, 5); we reproduce it here to show the magnitude of the variation. The wide variation requires the use of an internal control on each DNA spot. The control and sample are labeled with different fluorophores, and the ratio of intensities between the sample and control is reported. The variation is shown in Fig. 1.

Measurements Extracted from Images of DNA Microarrays by Using the Commonly Accepted Methods (Ratio of Medians) Follow a Lorentzian-Like Distribution. Our measurements on 1,152 different genes repeated four times show that the measured values follow a Lorentzian-like distribution. Measurements extracted using the ratio of means algorithm give similar results. This indicates that approximately one in five of the genes that appear to have significant changes in expression level do not; they are statistical outliers that are an artifact of the data analysis method.

Measurements Extracted from Images by Using the Median of Pixel-by-Pixel Ratios Follow a Gaussian-Like Distribution. When a population of pixel-by-pixel ratio measurements is examined at each spot and the median of the population is selected, the distribution of deviations follows a Gaussian distribution with a significantly smaller width (see Fig. 4).
The Error on an Individual Spot Can Be Estimated by Using the Bootstrap Algorithm on the Ratios of Individual Pixels Within a Spot. Confidence levels (90%) in the median for each spot were estimated using the bootstrap algorithm. These errors agreed well with the observed spread in measurements across different spots that contained the same DNA (see Fig. 3).

Discussion

DNA microarray measurements are typically made in two colors (using the fluorophores Cy3 and Cy5), where one color corresponds to a control and the other is the value of interest. For technical reasons (2), the measured value is reported as the ratio of the two channels, usually as the logarithm (base 2, by convention) of the ratio. After taking the logarithm, increases and decreases in concentration by the same factor are represented by values of equal magnitude and opposite sign.

The distribution of the ratio x/y of two correlated normal random variables has been solved (1). It is a function of five parameters: the means μx, μy and standard deviations σx, σy of the numerator and denominator, and the correlation coefficient ρ between the numerator and denominator. In the limit that the standard deviations are much greater than the means, σx ≫ μx and σy ≫ μy, the distribution is exactly equal to a Lorentzian distribution. (For instance, when x and y are normally distributed and μx = 0 and μy = 0, the distribution of x/y is exactly Lorentzian.)

The experimental distributions we examined were found to approximately follow the log-transformed Lorentz distribution, as expected for a ratio of two noisy measurements. The Lorentz distribution can be written in its standard form as

$$f(x) = \frac{1}{\pi}\,\frac{\Gamma/2}{(x - x_0)^2 + (\Gamma/2)^2},$$

where $x_0$ is the center of the distribution and $\Gamma$ is its full width at half maximum.
In DNA microarray experiments, the experimental quantity of interest is the ratio. More accurate measurements can be obtained by making a large number of independent measurements of the ratio and computing the median of the measurements. Because the measurements are drawn from a Lorentzian-like distribution whose mean is undefined, the median is the appropriate measure of the central value. Computing the mean and/or the standard deviation of the population gives meaningless values, because they are dominated by the outliers and are not reproducible.

Independent measurements of the ratio can be made by repeated spotting of the same DNAs, but this takes up valuable area on the chip. If the dominant source of variation in the relative values occurs within a spot (as well as between spots), then a single spot can be subdivided into smaller independent areas (pixels), and the ratio for each of these pixels can be computed (median of ratios). The median and the standard error of the median can be calculated from this population of pixels within a single spot. When we reanalyzed the data by using the “median of ratios” algorithm, we found the data followed the Gaussian distribution, which in its standard form is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\!\left(-\frac{(x - x_0)^2}{2\sigma^2}\right),$$

where $x_0$ is the center of the distribution and $\sigma$ is its standard deviation.
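The claim that the mean of Lorentzian-like data is not reproducible, while the median is, can be checked with a short simulation. This is a sketch under assumptions: the data are idealized ratios of two zero-mean Gaussians, and the sample size, number of repeats, and function names are hypothetical.

```python
import random
import statistics

def lorentzian_like(n, rng):
    """n ratios of two independent zero-mean, unit-variance Gaussians,
    which follow a Lorentzian (Cauchy) distribution centered at 0."""
    return [rng.gauss(0.0, 1.0) / rng.gauss(0.0, 1.0) for _ in range(n)]

def spread_across_repeats(estimator, repeats=200, n=101, seed=0):
    """Repeat the whole experiment many times and report the
    interquartile range of the resulting estimates."""
    rng = random.Random(seed)
    estimates = [estimator(lorentzian_like(n, rng)) for _ in range(repeats)]
    q1, _, q3 = statistics.quantiles(estimates, n=4)
    return q3 - q1

mean_spread = spread_across_repeats(statistics.fmean)
median_spread = spread_across_repeats(statistics.median)

print(f"spread of the mean across repeats:   {mean_spread:.3f}")
print(f"spread of the median across repeats: {median_spread:.3f}")
```

Averaging 101 Lorentzian values is no more reproducible than taking a single value, whereas the median tightens as the number of measurements grows.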
Larger spots give more accurate measurements than smaller spots when using the median of ratios. The standard error in the median is roughly inversely proportional to the square root of the number of independent measurements, as is true for any measurement with a Gaussian distribution. A large spot that has twice the diameter of a small spot will have four times the number of pixels at the same scanner resolution, so the error in the measurement will be about one half as large in the larger spot as in the smaller spot. This result has obvious implications for tradeoffs between measurement accuracy and array density, and should be considered during array and reader design.

Many methods of analyzing large-scale expression patterns rely on quantitative measurements of transcript levels to “cluster” different genes into groups (6, 7). Many clustering algorithms use a maximum likelihood estimator that should be chosen to reflect the statistics of the underlying data. It is crucial to understand the distribution of the measured data when choosing such an estimator, especially if that distribution has long tails. Finally, an error measurement of transcript levels provides a parameter that can be used with clustering algorithms to estimate confidence levels for membership of a transcript in a cluster.

Some methods of analyzing large-scale expression patterns do not rely on measurements of quantitative levels of expression, but rather on whether the transcript is absent/present (8) or whether the expression level of a gene is significantly higher or lower in two different populations of cells. In these cases, there are more sensitive ways to assess the significance of the signal than by measuring the ratios with error bars.
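The spot-size argument above can be written out as a short worked scaling. The symbols $d$ (spot diameter), $N$ (number of pixels in the spot), and $\mathrm{SE}$ (standard error of the median) are introduced here for illustration, and the pixels are assumed to be independent measurements at a fixed scanner resolution:

$$N \propto d^{2}, \qquad \mathrm{SE} \propto \frac{1}{\sqrt{N}} \quad\Longrightarrow\quad \mathrm{SE} \propto \frac{1}{d},$$

$$\frac{\mathrm{SE}_{\text{large}}}{\mathrm{SE}_{\text{small}}} = \sqrt{\frac{N_{\text{small}}}{N_{\text{large}}}} = \sqrt{\frac{1}{4}} = \frac{1}{2} \quad \text{for } d_{\text{large}} = 2\,d_{\text{small}}.$$

If neighboring pixels are correlated, the effective number of independent measurements is smaller than $N$ and the improvement would be correspondingly less.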
One such method is to compute a P value corresponding to the hypothesis that the mean values of the spots represent identical or distinct expression levels (9).

Experimental errors can be classified as two different types: random and systematic. We have examined the random error in a single DNA microarray experiment. The goal here is to quantify the statistical random errors inherent in the experiment and provide a quantitative measure of quality so that experimental systematic errors can be evaluated and optimized.

Conclusion

We have outlined a method of obtaining reliable error estimates for spotted DNA microarray measurements. Ratios accompanied by error estimates will allow more meaningful interpretations of single chip data, better comparisons of data across multiple experiments, and more consistent results from clustering algorithms.

Acknowledgments

We thank Trent Basarsky of Axon Instruments for technical assistance. This work was supported by National Human Genome Research Institute Grant HG00047-01.

References

1. Hinkley D V. Biometrika. 1969;56:635–639.
2. Eisen M B, Brown P O. Methods Enzymol. 1999;303:179–205.
3. Efron B, Tibshirani R. Science. 1991;253:390–395.
4. Schena M, Shalon D, Davis R W, Brown P O. Science. 1995;270:467–470.
5. Lee M-L T, Kuo F C, Whitmore G A, Sklar J. Proc Natl Acad Sci USA. 2000;97:9834–9839.
6. Tavazoie S, Hughes J D, Campbell M J, Cho R J, Church G M. Nat Genet. 1999;22:281–285.
7. Kim S, Dougherty E R, Chen Y, Sivakumar K, Meltzer P, Trent J M, Bittner M. Genomics. 2000;67:201–209.
8. Walker M G, Volkmuth W, Sprinzak E, Hodgsdon D, Klinger T. Genome Res. 1999;9:1198–1203.
9. Tusher V G, Tibshirani R, Chu G. Proc Natl Acad Sci USA. 2001;98:5116–5121.