Display Settings:

Format

Send to:

Choose Destination
See comment in PubMed Commons below
Genome Biol. 2006;7 Suppl 1:S6.1-12. Epub 2006 Aug 7.

Vertebrate gene finding from multiple-species alignments using a two-level strategy.

Author information

  • 1Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. dmc@sanger.ac.uk

Abstract

BACKGROUND:

One way in which the accuracy of gene structure prediction in vertebrate DNA sequences can be improved is by analyzing alignments with multiple related species, since functional regions of genes tend to be more conserved.

RESULTS:

We describe DOGFISH, a vertebrate gene finder consisting of a cleanly separated site classifier and structure predictor. The classifier scores potential splice sites and other features, using sequence alignments between multiple vertebrate species, while the structure predictor hypothesizes coding transcripts by combining these scores using a simple model of gene structure. This also identifies and assigns confidence scores to possible additional exons. Performance is assessed on the ENCODE regions. We predict transcripts and exons across the whole human genome, and identify over 10,000 high confidence new coding exons not in the Ensembl gene set.

CONCLUSION:

We present a practical multiple species gene prediction method. Accuracy improves as additional species, up to at least eight, are introduced. The novel predictions of the whole-genome scan should support efficient experimental verification.

PMID:
16925840
[PubMed - indexed for MEDLINE]
PMCID:
PMC1810555
Free PMC Article

Images from this publication.See all images (3)Free text

Figure 1
Figure 2
Figure 3
PubMed Commons home

PubMed Commons

0 comments
How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for BioMed Central Icon for PubMed Central
    Loading ...
    Write to the Help Desk