Receiver operating characteristic analysis: a general tool for DNA array data filtration and performance estimation

Genomics. 2003 Feb;81(2):202-9. doi: 10.1016/s0888-7543(02)00042-3.

Abstract

A critical step in DNA array analysis is data filtration, which can reduce thousands of detected signals to limited sets of genes. Commonly accepted rules for such filtration are still lacking. We present a rational approach based on thresholding of intensities, with cutoff levels estimated by receiver operating characteristic (ROC) analysis. The technique compares test results with known distributions of positive and negative signals. We apply the method to Atlas cDNA arrays, GeneFilters, and Affymetrix GeneChip. ROC analysis demonstrates similarities in the distribution of false and true positive data across these different systems. We illustrate the estimation of an optimal cutoff level for intensity-based filtration, providing the highest ratio of true to false signals. For GeneChip arrays, we derived filtration thresholds consistent with reported data based on replicate hybridizations. Intensity-based filtration optimized with ROC, combined with other types of filtration (for example, based on the significance of differences and/or on ratios), should improve DNA array analysis. ROC methodology is also demonstrated for comparing the performance of different types of arrays, imagers, and analysis software.
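
The cutoff estimation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the intensity values are invented for illustration, the function names roc_points and best_cutoff are hypothetical, and the selection criterion used here is Youden's J (true positive rate minus false positive rate), a common stand-in that may differ from the criterion applied in the paper.

```python
def roc_points(positives, negatives):
    """Return (cutoff, tpr, fpr) for every candidate cutoff.

    A spot is called "present" when its intensity is >= cutoff.
    positives: intensities of spots known to carry true signal.
    negatives: intensities of spots known to be empty (negative controls).
    """
    cutoffs = sorted(set(positives) | set(negatives))
    points = []
    for c in cutoffs:
        tpr = sum(x >= c for x in positives) / len(positives)
        fpr = sum(x >= c for x in negatives) / len(negatives)
        points.append((c, tpr, fpr))
    return points


def best_cutoff(positives, negatives):
    """Pick the cutoff maximizing Youden's J = TPR - FPR (an assumed criterion)."""
    return max(roc_points(positives, negatives), key=lambda p: p[1] - p[2])


# Invented intensities, for illustration only.
positives = [120, 340, 560, 800, 70, 410]
negatives = [30, 45, 60, 110, 20, 75, 50, 40]

cutoff, tpr, fpr = best_cutoff(positives, negatives)
print(f"cutoff={cutoff}, TPR={tpr:.3f}, FPR={fpr:.3f}")
```

In this sketch every observed intensity is tried as a candidate cutoff, which is adequate for small control sets. Signals below the chosen cutoff would then be discarded before any further filtration based on ratios or significance of differences.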

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Data Interpretation, Statistical*
  • Oligonucleotide Array Sequence Analysis / methods*
  • ROC Curve
  • Sensitivity and Specificity