BMC Bioinformatics. 2006 Nov 4;7:487.

Accurate and unambiguous tag-to-gene mapping in serial analysis of gene expression.

Author information

Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile.



In this study, we present a robust and reliable computational method for tag-to-gene assignment in serial analysis of gene expression (SAGE). The method relies on current genome information and annotation, incorporates several new features, and includes key improvements over alternative methods, all of which are important for determining gene expression levels more accurately. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimate of the confidence that each tag will be observed experimentally, which is used to rank tags that have multiple matches in the genome.
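To make the idea of a virtual SAGE tag concrete, the following is a minimal sketch and not the method described in this paper. It assumes the conventional SAGE setup, which the abstract does not state: NlaIII as the anchoring enzyme (recognition site CATG) and a 10-bp tag taken immediately downstream of each site. The function names `virtual_tags` and `tag_to_genes` are hypothetical, and the confidence ranking mentioned above is not reproduced here.

```python
ANCHOR = "CATG"  # assumption: NlaIII anchoring-enzyme site, not stated in the abstract
TAG_LEN = 10     # assumption: classic short-SAGE tag length


def virtual_tags(transcript):
    """Return every candidate tag in a transcript, 3'-most anchor site first."""
    seq = transcript.upper()
    tags = []
    pos = seq.find(ANCHOR)
    while pos != -1:
        start = pos + len(ANCHOR)
        tag = seq[start:start + TAG_LEN]
        if len(tag) == TAG_LEN:  # skip sites too close to the 3' end
            tags.append(tag)
        pos = seq.find(ANCHOR, pos + 1)
    tags.reverse()  # the site nearest the 3' end is the expected tag
    return tags


def tag_to_genes(transcripts):
    """Index the 3'-most tag of each transcript; a tag with >1 gene is ambiguous."""
    index = {}
    for gene, seq in transcripts.items():
        tags = virtual_tags(seq)
        if tags:
            index.setdefault(tags[0], []).append(gene)
    return index


# Toy sequences built for illustration only; they are not yeast genes.
toy = {
    "geneA": "AAA" + "CATG" + "ACGTACGTAC" + "TTT" + "CATG" + "GGGGGGGGGG" + "AAAA",
    "geneB": "TT" + "CATG" + "GGGGGGGGGG" + "CC",
}
index = tag_to_genes(toy)
ambiguous = {tag: genes for tag, genes in index.items() if len(genes) > 1}
```

In this toy input both transcripts share the same 3'-most tag, so the tag alone cannot identify the gene. This is the kind of ambiguity that a confidence-based ranking is meant to resolve.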


We applied this method to the Saccharomyces cerevisiae genome, producing the most thorough and accurate annotation of potential virtual SAGE tags that is available today for this organism. The usefulness of this method is exemplified by the significant reduction of ambiguous cases in existing experimental SAGE data. In addition, we report new insights from the analysis of existing SAGE data. First, we found that experimental SAGE tags mapping onto introns, intron-exon boundaries, and non-coding RNA elements are observed in all available SAGE data. Second, a significant fraction of experimental SAGE tags was found to map onto genomic regions currently annotated as intergenic. Third, a significant number of existing experimental SAGE tags for yeast have been derived from truncated cDNAs, which are synthesized through oligo-d(T) priming to internal poly-(A) regions during reverse transcription.
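The third observation can be illustrated with a short sketch. It is not the authors' procedure. It assumes that an internal run of at least eight A residues can act as an oligo-d(T) priming site (the threshold `MIN_A_RUN` is hypothetical), that the anchoring site is CATG, and that tags are 10 bp long. For each internal A-run, the function `internal_priming_tags` (a hypothetical name) reports the tag a truncated cDNA would yield, taken from the nearest anchor site upstream of the run.

```python
import re

ANCHOR = "CATG"  # assumption: NlaIII anchoring-enzyme site
TAG_LEN = 10     # assumption: classic short-SAGE tag length
MIN_A_RUN = 8    # hypothetical threshold for an internal priming site


def internal_priming_tags(transcript, min_a_run=MIN_A_RUN):
    """Return (a_run_start, tag) pairs expected from internally primed cDNAs."""
    seq = transcript.upper()
    results = []
    for match in re.finditer("A{%d,}" % min_a_run, seq):
        if match.end() == len(seq):
            continue  # a run at the very 3' end is treated as the real tail
        site = seq.rfind(ANCHOR, 0, match.start())
        if site == -1:
            continue  # no anchor site upstream of this A-run
        start = site + len(ANCHOR)
        end = start + TAG_LEN
        if end <= match.end():  # tag must lie within the truncated cDNA
            results.append((match.start(), seq[start:end]))
    return results


# Toy transcript built for illustration only; it is not a yeast sequence.
toy = ("GG" + "CATG" + "TTGACCTGAC" + "CC" + "A" * 10
       + "GGG" + "CATG" + "CCCCCCCCCC" + "TT")
truncated = internal_priming_tags(toy)
```

For the toy transcript, the full-length cDNA would yield the tag downstream of the second CATG, whereas a cDNA primed at the internal A-run would yield the tag downstream of the first one. A mapping procedure that considers only the 3'-most site would therefore leave such tags unassigned.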


We conclude that an accurate and unambiguous tag mapping process is essential to increase the quality and the amount of information that can be extracted from SAGE experiments. This is supported by the results obtained here and also by the large impact that the erroneous interpretation of these data could have on downstream applications.

[Indexed for MEDLINE]