Nat Rev Genet. 2003 Sep;4(9):741-9.

Vertebrate gene predictions and the problem of large genes.

Author information

Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 101300, China.


To find unknown protein-coding genes, annotation pipelines use a combination of ab initio gene prediction and similarity to experimentally confirmed genes or proteins. Here, we show that although the ab initio predictions have an intrinsically high false-positive rate, they also have a consistently low false-negative rate. The incorporation of similarity information is meant to reduce the false-positive rate, but in doing so it increases the false-negative rate. The crucial variable is gene size (including introns): genes of the most extreme sizes, especially very large genes, are most likely to be incorrectly predicted.

[Indexed for MEDLINE]
