Nat Rev Genet. 2003 Sep;4(9):741-9.

Vertebrate gene predictions and the problem of large genes.

Author information

  • Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 101300, China.


To find unknown protein-coding genes, annotation pipelines use a combination of ab initio gene prediction and similarity to experimentally confirmed genes or proteins. Here, we show that although the ab initio predictions have an intrinsically high false-positive rate, they also have a consistently low false-negative rate. The incorporation of similarity information is meant to reduce the false-positive rate, but in doing so it increases the false-negative rate. The crucial variable is gene size (including introns): genes of the most extreme sizes, especially very large genes, are most likely to be incorrectly predicted.
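
The trade-off described in the abstract can be illustrated with a minimal sketch. Everything in it is hypothetical: the gene identifiers, the toy sets, and the `error_rates` helper are invented for illustration and are not taken from the paper. It also assumes particular definitions, since the abstract gives none: the false-positive rate is taken as the fraction of predicted genes that are not real, and the false-negative rate as the fraction of real genes that are missed.

```python
def error_rates(predicted, true_genes):
    """Return (false_positive_rate, false_negative_rate).

    Assumed definitions (not from the paper):
      false-positive rate = wrong predictions / all predictions
      false-negative rate = missed real genes / all real genes
    """
    predicted, true_genes = set(predicted), set(true_genes)
    false_pos = predicted - true_genes   # predicted but not real
    false_neg = true_genes - predicted   # real but not predicted
    fp_rate = len(false_pos) / len(predicted) if predicted else 0.0
    fn_rate = len(false_neg) / len(true_genes) if true_genes else 0.0
    return fp_rate, fn_rate


# Hypothetical toy data: five real genes, eight ab initio predictions.
true_genes = {"g1", "g2", "g3", "g4", "g5"}
ab_initio = {"g1", "g2", "g3", "g4", "g5", "x1", "x2", "x3"}

# Hypothetical set of predictions with similarity to a confirmed gene or protein.
has_similarity = {"g1", "g2", "g3", "x1"}

# Keep only the ab initio predictions that are supported by similarity.
filtered = ab_initio & has_similarity

ab_fp, ab_fn = error_rates(ab_initio, true_genes)
f_fp, f_fn = error_rates(filtered, true_genes)

print(f"ab initio only:  FP rate = {ab_fp:.3f}, FN rate = {ab_fn:.3f}")
print(f"with similarity: FP rate = {f_fp:.3f}, FN rate = {f_fn:.3f}")
```

In this toy case the similarity filter lowers the false-positive rate from 0.375 to 0.25 but raises the false-negative rate from 0 to 0.4, because two real genes without similarity support are discarded along with the spurious predictions.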

[PubMed - indexed for MEDLINE]