Splign Transcript to Genomic Alignment Tool on the Web
One of the most reliable ways to identify genes is to align transcript sequences to a genomic sequence. Local alignment tools such as BLAST can quickly identify exons but do not include the nonaligning intronic segments in the alignment and lack precision at splice junctions. To produce accurate eukaryotic gene models from transcript alignments, a tool is needed that combines local and global alignment algorithms and accurately tracks splice junctions. The new NCBI spliced-alignment tool, Splign, includes these design features and is used to help annotate higher eukaryotic genomes at NCBI. In addition to being available as a standalone tool, Splign is now available through a Web interface. The Web interface to Splign, as well as a link to download the standalone application and help documentation, is available from the Splign Homepage.
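To make the idea of tracking splice junctions concrete, here is a minimal Python sketch. It is not Splign's algorithm; it only assumes that exon coordinates on the genomic sequence are already known and checks whether each implied intron starts with GT and ends with AG, the dinucleotides found at most eukaryotic intron boundaries. The function name `intron_junctions`, the coordinate convention, and the toy sequence are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def intron_junctions(
    genomic: str, exons: List[Tuple[int, int]]
) -> List[Tuple[str, str, bool]]:
    """Return (donor, acceptor, is_canonical) for each intron between exons.

    Assumption: exons are 0-based, half-open (start, end) intervals on the
    forward strand of `genomic`, listed in transcript order.
    """
    results = []
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        # The first two intron bases follow the upstream exon (donor site).
        donor = genomic[prev_end:prev_end + 2].upper()
        # The last two intron bases precede the downstream exon (acceptor site).
        acceptor = genomic[next_start - 2:next_start].upper()
        results.append((donor, acceptor, donor == "GT" and acceptor == "AG"))
    return results


# Toy example: a 6-base exon, a 12-base intron (GT...AG), a 6-base exon.
genomic = "ATGGCC" + "GTAAGTTTTCAG" + "GATTAA"
exons = [(0, 6), (18, 24)]
junctions = intron_junctions(genomic, exons)
print(junctions)
```

A local alignment that places an exon boundary one or two bases off would fail this kind of check, which is one reason a dedicated spliced-alignment step is useful.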
Splign generates transcript (cDNA) to genomic alignments that include detailed information about exon-intron boundaries, splice junctions, potential frameshifts, and other sequence discrepancies. Splign can also produce alternative models when there is more than one possibility. The Web version of Splign provides an interactive graphical view of the alignment or a complete table of results. Figure 1 shows the results of a comparison between a transcript for the fruit fly “pxt” gene, given in GenBank record AF238306, and the sequence of the right arm of fruit fly chromosome 3, given in NCBI RefSeq NT_033777. In the figure, the fifth exon has been selected and the alignment for this segment is displayed. The intron-exon borders for the gene are easily identified both graphically and within the text sequence alignment. Statistics such as the length of each sequence in the alignment, the nucleotide positions of the exons, and the alignment coverage are provided. Sequence mismatches and insertions or deletions are color-coded for easy identification. In this segment, three mismatches and one small deletion in the transcript have been identified.
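The kinds of statistics mentioned above can be illustrated with a short Python sketch. It does not reproduce Splign's output or scoring; it assumes a gapped pairwise alignment is already available as two equal-length strings with `-` marking gaps, and that exon positions on the transcript are given as 0-based, half-open intervals. The function names `segment_stats` and `coverage` and the toy sequences are hypothetical.

```python
from typing import Dict, List, Tuple


def segment_stats(query_aln: str, subject_aln: str) -> Dict[str, float]:
    """Count matches, mismatches, and indels in one aligned exon segment.

    Assumption: `query_aln` is the transcript row and `subject_aln` is the
    genomic row of the same gapped alignment.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned rows must have the same length")
    counts = {"matches": 0, "mismatches": 0, "insertions": 0, "deletions": 0}
    for q, s in zip(query_aln.upper(), subject_aln.upper()):
        if q == "-":
            counts["deletions"] += 1   # base missing from the transcript
        elif s == "-":
            counts["insertions"] += 1  # extra base in the transcript
        elif q == s:
            counts["matches"] += 1
        else:
            counts["mismatches"] += 1
    stats: Dict[str, float] = dict(counts)
    # Identity is computed here over all alignment columns, gaps included.
    stats["identity"] = counts["matches"] / len(query_aln) if query_aln else 0.0
    return stats


def coverage(query_exons: List[Tuple[int, int]], query_length: int) -> float:
    """Fraction of the transcript covered by non-overlapping aligned exons."""
    return sum(end - start for start, end in query_exons) / query_length


stats = segment_stats("ACGTAC-TGA", "ACCTACGTGA")
cov = coverage([(0, 40), (40, 90)], 100)
print(stats, cov)
```

In this toy segment the transcript row has one mismatch and one single-base deletion relative to the genomic row, and the two exons cover 90 of the 100 transcript bases.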

Figure 1. Results of mRNA (cDNA) to genomic alignment created by Splign. A. Graphical view of the alignment between Drosophila melanogaster sequences AF238306 and NT_033777. B. Tabular format for the same alignment. Boundaries of the aligned regions of the query and subject sequences are shown along with the identified base pairs associated with the intron-exon splice junctions.