The Model Maker is a tool that allows the construction of an mRNA sequence model using putative exons defined by ab initio prediction and by aligning GenBank® transcripts (including ESTs) and NCBI RefSeqs to the NCBI human genome assembly. The ability to generate alternative transcript models using novel combinations of exons not represented in any single mRNA sequence alignment is useful in the exploration of gene model variants.
To generate a Model Maker display centered on a gene of interest, go to the Human Map Viewer page (Model Maker links will soon be available directly from LocusLink) and type the gene symbol or gene name into the query box. Select the Genes_seq map as the Master Map and click on the mm link to the right of the gene name. The Model Maker display for the HOXB1 gene is shown in Figure 1. The gene is found on the RefSeq contig NT_010783.9, at the chromosome coordinates shown. You may move the display upstream or downstream using the pair of horizontal arrows. If you click on the sequence viewer link (sv) after this repositioning, you will be able to see the upstream or downstream sequences of the contig.
The Model Maker page is divided into several sections. The first section displays evidence for exons in the form of sequence records from the databases that have been aligned to the human genome assembly contig NT_010783.9 to produce the NCBI gene model for HOXB1. From top to bottom, these sequences are: a GenBank mRNA sequence and the mRNA RefSeq derived from it, each contributing two exons; a GenomeScan-generated model, contributing an additional small exon; and a genomic RefSeq derived from the alignment of the mRNA RefSeq with the contig, contributing two exons.
The Model Maker gathers the unique putative exons from the various alignments used as evidence, assigns each a consecutive number, and presents them in a graphic view. In Figure 1, the graphic view shows three unique exons, each of which may be added to or removed from the nascent model by clicking it on or off. Choosing the hits hyperlink next to an evidence sequence displays the coordinates of the BLAST® hits for that sequence on the human genomic sequence, from which intron lengths may be deduced. It is also possible to select whole exon sets from evidence mRNAs using the set link, to add EST alignments to the evidence list, to extend the region shown downstream or upstream, and to switch to the opposite strand.
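The exon gathering and intron-length deduction described above can be illustrated with a minimal Python sketch. It is not Model Maker's implementation: the function names, the evidence labels, and all coordinates are made up for illustration, and it assumes that exons are identified by their (start, stop) pair in 1-based inclusive contig coordinates and numbered in order of position.

```python
from typing import Dict, List, Tuple

# An exon as (start, stop) in 1-based, inclusive contig coordinates (assumed convention).
Exon = Tuple[int, int]

# Hypothetical evidence alignments; the coordinates are invented, not taken from HOXB1.
EVIDENCE: Dict[str, List[Exon]] = {
    "mRNA": [(100, 250), (400, 600)],
    "RefSeq_mRNA": [(100, 250), (400, 600)],
    "GenomeScan": [(100, 250), (300, 330), (400, 600)],
}


def unique_exons(evidence: Dict[str, List[Exon]]) -> Dict[int, Exon]:
    """Collect the distinct exon intervals across all evidence and number them 1..n.

    Numbering by ascending coordinate is an assumption of this sketch.
    """
    distinct = sorted({exon for exons in evidence.values() for exon in exons})
    return {number: exon for number, exon in enumerate(distinct, start=1)}


def intron_lengths(exons: List[Exon]) -> List[int]:
    """Return the length of each gap between consecutive exons of one evidence sequence."""
    ordered = sorted(exons)
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    for number, (start, stop) in unique_exons(EVIDENCE).items():
        print(f"exon {number}: {start}..{stop}")
    print("GenomeScan intron lengths:", intron_lengths(EVIDENCE["GenomeScan"]))
```

With this made-up input, the two mRNA-derived records contribute the same two exons, the ab initio model adds a third, and three unique exons are numbered, which mirrors the situation described for the figure.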
The information in the graphic view is also presented as a table view that gives the start and stop positions of each putative exon, along with the first three and last three bases of each exon and the two bases immediately upstream and downstream of it. Exons are selected for inclusion in the model using check boxes. Numbered links at the ends of each exon facilitate the selection of adjacent exons implied by existing transcripts. As each exon is selected, the three-frame translations of the model sequence are updated in the ORF Frames boxes. The longest ORF in each translation is shown in UPPER CASE letters. As compatible exons are added to the model in phase, the ORF in at least one of the three reading frames lengthens. When the stop codon is reached and the model is complete, it may be saved to a local file in FASTA format for use with other programs.
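The three-frame translation with its upper-case ORF highlighting can be sketched in Python as follows. This is an illustration under stated assumptions rather than the tool's own code: the function names are hypothetical, the standard genetic code is used, unknown codons are written as X, and the "longest ORF" is taken to be the longest stretch of residues uninterrupted by a stop codon (the text does not say whether Model Maker also requires an initiating methionine).

```python
from itertools import product
from typing import List

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, frame: int) -> str:
    """Translate one forward reading frame (0, 1 or 2); '*' marks a stop codon."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def highlight_longest_orf(protein: str) -> str:
    """Lower-case the translation, then upper-case its longest stop-free segment.

    Treating the longest stop-free segment as the ORF is an assumption of this sketch.
    """
    segments = protein.split("*")
    longest = max(range(len(segments)), key=lambda i: len(segments[i]))
    return "*".join(
        seg.upper() if i == longest else seg.lower() for i, seg in enumerate(segments)
    )


def orf_frames(exon_seqs: List[str]) -> List[str]:
    """Join the selected exon sequences in order and return the three translations."""
    model = "".join(exon_seqs)
    return [highlight_longest_orf(translate(model, frame)) for frame in range(3)]


if __name__ == "__main__":
    # Two invented exon sequences; selecting both completes a short ORF in frame 1.
    for number, translation in enumerate(orf_frames(["ATGGCC", "AAATAAGG"]), start=1):
        print(f"frame {number}: {translation}")
```

Calling `orf_frames` again each time an exon is added to or removed from the list imitates how the ORF Frames boxes are refreshed: when the added exon is in phase, the upper-case segment in one frame grows until a `*` is reached.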