Proc Natl Acad Sci U S A. 2005 Feb 1;102(5):1566-71. Epub 2005 Jan 24.

A computational and experimental approach to validating annotations and gene predictions in the Drosophila melanogaster genome.

Author information

  • Howard Hughes Medical Institute and Department of Molecular and Cell Biology, University of California, Life Sciences Addition, Berkeley, CA 94720-3200, USA. myandell@fruitfly.org

Abstract

Five years after the completion of the sequence of the Drosophila melanogaster genome, the number of protein-coding genes it contains remains a matter of debate; the number of computational gene predictions greatly exceeds the number of validated gene annotations. We have assembled a collection of >10,000 gene predictions that do not overlap existing gene annotations and have developed a process for their validation that allows us to efficiently prioritize and experimentally validate predictions from various sources by sequencing RT-PCR products to confirm gene structures. Our data provide experimental evidence for 122 protein-coding genes. Our analyses suggest that the entire collection of predictions contains only approximately 700 additional protein-coding genes. Although we cannot rule out the discovery of genes with unusual features that make them refractory to existing methods, our results suggest that the D. melanogaster genome contains approximately 14,000 protein-coding genes.

PMID: 15668397
PMCID: PMC545494