RefSeq refinements of UniGene-based gene matching improve the correlation of expression measurements between two microarray platforms

Appl Bioinformatics. 2006;5(2):89-98. doi: 10.2165/00822942-200605020-00003.

Abstract

Matching genes across microarray platforms is a critical step in meta-analysis. Standard practice uses UniGene to match genes. Numerous studies have found poor correlations between platforms when using UniGene matching. We profiled samples from 33 breast cancer patients on two different microarray platforms (Affymetrix and cDNA) and investigated gene matching. Our results confirmed that UniGene-based matching led to poor correlations of gene expression between platforms. Using RefSeq, a database maintained by the National Center for Biotechnology Information (NCBI), we developed and implemented a new method to refine gene matching. We found that the correlations between gene expression measurements were substantially higher after the RefSeq matching. Our approach differs from previously reported sequence-matching approaches and retains useful expression measurements. It is a sensible approach for matching probes across platforms. We conclude that UniGene alone is insufficient to match genes across platforms. Refined matching based on RefSeq significantly improves the quality of matches.
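
The comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation, since the abstract does not specify the refinement rule. The sketch assumes that each probe is annotated with one UniGene cluster and a set of RefSeq accessions, and that "refined" matching means a probe pair must also share at least one RefSeq accession. All probe IDs, cluster names, accessions, and expression values are hypothetical toy data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def match_probes(affy_annot, cdna_annot, require_refseq=False):
    """Pair probes across platforms that share a UniGene cluster.

    With require_refseq=True, a pair is kept only if the two probes also
    share at least one RefSeq accession (an assumed refinement rule).
    """
    pairs = []
    for affy_id, affy in affy_annot.items():
        for cdna_id, cdna in cdna_annot.items():
            if affy["unigene"] != cdna["unigene"]:
                continue
            if require_refseq and not (affy["refseq"] & cdna["refseq"]):
                continue
            pairs.append((affy_id, cdna_id))
    return pairs


def pair_correlations(pairs, affy_expr, cdna_expr):
    """Correlation across samples for every matched probe pair."""
    return {
        (a, c): pearson(affy_expr[a], cdna_expr[c])
        for a, c in pairs
    }


# Hypothetical annotations: probe ID -> UniGene cluster and RefSeq accessions.
affy_annot = {
    "affy_1": {"unigene": "Hs.toy1", "refseq": {"NM_TOY1"}},
    "affy_2": {"unigene": "Hs.toy2", "refseq": {"NM_TOY2"}},
}
cdna_annot = {
    "cdna_1": {"unigene": "Hs.toy1", "refseq": {"NM_TOY1"}},
    "cdna_2": {"unigene": "Hs.toy2", "refseq": {"NM_TOY9"}},
}

# Hypothetical expression values, one entry per sample (same sample order).
affy_expr = {"affy_1": [1.0, 2.0, 3.0, 4.0], "affy_2": [1.0, 2.0, 3.0, 4.0]}
cdna_expr = {"cdna_1": [2.0, 4.0, 6.0, 8.0], "cdna_2": [4.0, 1.0, 3.0, 2.0]}

unigene_pairs = match_probes(affy_annot, cdna_annot)
refined_pairs = match_probes(affy_annot, cdna_annot, require_refseq=True)

unigene_corr = pair_correlations(unigene_pairs, affy_expr, cdna_expr)
refined_corr = pair_correlations(refined_pairs, affy_expr, cdna_expr)
```

In this toy example the second pair shares a UniGene cluster but no RefSeq accession, so the refined matching drops it. Comparing the distributions of `unigene_corr` and `refined_corr` values mirrors the kind of before-and-after comparison the abstract reports.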

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms / genetics*
  • Computational Biology / methods*
  • DNA, Complementary / metabolism
  • Data Interpretation, Statistical
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Models, Genetic
  • Nucleic Acid Hybridization
  • Oligonucleotide Array Sequence Analysis*

Substances

  • DNA, Complementary