NCBI News
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health
Spring 2002

Make Your Own Gene Models with Model Maker

The Model Maker is a tool that allows the construction of an mRNA sequence model using putative exons defined by ab initio prediction and by aligning GenBank® transcripts (including ESTs) and NCBI RefSeqs to the NCBI human genome assembly. The ability to generate alternative transcript models using novel combinations of exons not represented in any single mRNA sequence alignment is useful in the exploration of gene model variants.

To generate a Model Maker display centered on a gene of interest, go to the Human Map Viewer page and type the gene symbol or gene name into the query box. (Model Maker links will soon be available directly from LocusLink.) Select the Genes_seq map as the Master Map and click on the “mm” link to the right of the gene name. The Model Maker display for the HOXB1 gene is shown in Figure 1 below. The gene is found on the RefSeq contig NT_010783.9, with the chromosome coordinates shown. You may move upstream or downstream along the contig using the pair of horizontal arrows. If you click on the sequence viewer link “sv” after this repositioning, you will be able to see the upstream or downstream sequences of the contig.

The Model Maker page is divided into several sections. The first section displays “evidence” for exons in the form of sequence records from the databases that have been aligned to human genome assembly contig NT_010783.9 to produce the NCBI gene model for HOXB1. These sequences include, from top to bottom, a GenBank mRNA sequence and the mRNA RefSeq derived from it, each contributing 2 exons; a GenomeScan-generated model, contributing an additional small exon; and a genomic RefSeq derived from the alignment of the mRNA RefSeq with the contig, contributing 2 exons.

The Model Maker gathers the unique putative exons from the various alignments used as evidence, assigns each a consecutive number, and presents them in a “graphic view”. In Figure 1, three unique exons are shown in the “graphic view”; each may be added to or removed from the nascent model by clicking it on or off. Choosing the “hits” hyperlink next to an evidence sequence displays the coordinates of the BLAST® hits for that sequence on the human genomic sequence, from which intron lengths may be deduced. It is also possible to select whole “exon sets” from evidence mRNAs using the “set” link, to add EST alignments to the evidence list, to extend the region shown downstream and upstream, or to switch to the opposite strand.
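The exon-gathering step and the intron-length deduction can be illustrated with a short Python sketch. This is not NCBI's implementation. It assumes each evidence alignment is a list of (start, stop) exon coordinates on the contig, that exons are numbered in order of position, and that identical coordinates mean the same exon. The evidence names and coordinates are invented for the example.

```python
# Hypothetical evidence alignments: name -> list of (start, stop) exon
# coordinates on the contig (1-based, inclusive). All values are invented.
evidence = {
    "mRNA": [(100, 250), (900, 1200)],
    "RefSeq mRNA": [(100, 250), (900, 1200)],
    "GenomeScan": [(100, 250), (500, 560), (900, 1200)],
}


def unique_exons(evidence):
    """Collect the distinct exons across all evidence and number them
    consecutively in order of position along the contig (an assumption)."""
    distinct = sorted({exon for exons in evidence.values() for exon in exons})
    return {number: exon for number, exon in enumerate(distinct, start=1)}


def intron_lengths(exons):
    """Deduce intron lengths for one alignment: the number of bases between
    one exon's stop and the next exon's start."""
    ordered = sorted(exons)
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(ordered, ordered[1:])]


numbered = unique_exons(evidence)
for number, (start, stop) in numbered.items():
    print(number, start, stop)

for name, exons in evidence.items():
    print(name, intron_lengths(exons))
```

In this toy data set the three alignments share two exons and the ab initio model contributes a third, so three unique exons are numbered, which mirrors the situation described for Figure 1.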


Figure 1: Model Maker display for HOXB1, located on contig NT_010783.9. Access the contig record by clicking the accession number hyperlink NT_010783.9. Move upstream or downstream by clicking on the arrows: <<< (upstream) or >>> (downstream). View the opposite strand by clicking on the “change strand” hyperlink. Click “set” to select an evidence record as your model. Click “add ESTs” to add a graphical display of ESTs. To build a custom model, select the exon or EST of interest by clicking on the graphic view segment, or select the exon from the table by using a check box. Monitor the growing model by viewing its translation in three different reading frames. The longest ORF in each frame is denoted by UPPERCASE letters. Save your results by clicking on the “Save” hyperlink.

The information in the “graphic view” is also presented as a “table view” that gives the start and stop positions of each putative exon, along with the first three and last three bases of each exon and the two bases immediately upstream and downstream of the exon. Exons are selected for inclusion in the model using check boxes. Numbered links at the ends of each exon facilitate the selection of adjacent exons implied by existing transcripts. As each exon is selected, the three-frame translations of the model sequence are updated in the ORF Frames boxes. The longest ORF in each translation is shown in UPPERCASE letters. As compatible exons are added to the model, in phase, an ORF in at least one of the three reading frames lengthens. When the stop codon is reached and the model is complete, it may be saved to a local file in FASTA format for use with other programs. —EP, MB
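The table view, the three-frame translation, and the FASTA export can be approximated offline with a short Python sketch. This is an illustration under stated assumptions, not the Model Maker's own code: the toy contig and exon coordinates are invented, the function names are hypothetical, the standard genetic code is used, and an ORF is taken to run from a methionine to the next stop codon or to the end of the translation.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def table_row(contig, start, stop):
    """Table-view fields for one exon (1-based, inclusive coordinates):
    first and last three bases plus the two flanking bases on each side."""
    return {
        "start": start,
        "stop": stop,
        "first3": contig[start - 1:start + 2],
        "last3": contig[stop - 3:stop],
        "upstream2": contig[max(start - 3, 0):start - 1],
        "downstream2": contig[stop:stop + 2],
    }


def build_model(contig, exons):
    """Join the selected exons in order of position along the contig."""
    return "".join(contig[start - 1:stop] for start, stop in sorted(exons))


def translate(seq, frame):
    """Translate seq in reading frame 0, 1, or 2; unknown codons become X."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def highlight_longest_orf(protein):
    """Lowercase the translation, then show its longest ORF in uppercase.
    Assumption: an ORF runs from M to the next stop or to the end."""
    lowered = protein.lower()
    orfs = list(re.finditer(r"M[^*]*", protein))
    if not orfs:
        return lowered
    best = max(orfs, key=lambda m: m.end() - m.start())
    return lowered[:best.start()] + best.group() + lowered[best.end():]


def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


# Invented toy contig: exons in uppercase, flanks and intron in lowercase.
contig = "ggATGGCTgtaagtccagAAATGAtt"
selected = [(3, 8), (19, 24)]

for start, stop in selected:
    print(table_row(contig, start, stop))

model = build_model(contig, selected)
for frame in range(3):
    print(frame + 1, highlight_longest_orf(translate(model, frame)))

print(to_fasta("toy_model", model), end="")
```

In the toy contig the bases flanking the intron are `gt` and `ag`, which is the kind of splice-site check the two-base flanks in the table view make possible. With both exons selected, the first reading frame holds a complete ORF ending in a stop codon, the point at which the article suggests saving the model.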

