Figure 1. The new BLAST graphical overview. Thinner lines connect the two matches of exons from the chimpanzee mRNA sequence to the human beta-2-microglobulin (query) sequence.

The new formatter also simplifies the interpretation of hits to large database sequences bearing many annotated features by giving links to the features that lie within or close to the match. These links, given for database sequences longer than 200 kb, highlight associations between regions of alignment and biological features such as genes and repeat regions. The BLAST output in Figure 2 shows the match to the albumin gene in a whole genome shotgun supercontig from the dog genome. Use the link to serum albumin to generate a display of the relevant portion of NW_876257, the two-megabase supercontig.
Figure 2. Alignment from a translating (tblastn) search of the human albumin protein against the dog genome. The new formatter display indicates that this hit lies within the annotation for the albumin gene on the supercontig NW_876257.

Perhaps the biggest improvement provided by the new formatter lies in its handling of masked, low-complexity regions within the query sequence. Low-complexity regions are those with biased amino acid or nucleotide compositions; they are usually masked prior to a search in order to provide more meaningful alignments. In traditional BLAST output, the one-letter codes for masked amino acids and nucleotides are replaced with X's and n's, respectively. The new formatter allows masked residues to be displayed in lower case, which preserves their identities, and in color for better highlighting. Matches within filtered regions are now taken into account when computing the percent identity for an alignment. The lower case option and the mask color selection are available in the ‘Format’ section of the BLAST submission or formatting pages. Figure 3 shows the new masking displays for the default replacement masking and for the lower case masking that retains the original query sequence residues.
Figure 3. BLAST protein alignments containing low complexity sequence. The upper alignment shows the default replacement masking. The lower alignment shows the lower-case masking option that preserves the query sequence in the output.
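The difference between the two masking displays can be illustrated with a minimal Python sketch. This is not BLAST's own code: the function names `mask_sequence` and `percent_identity`, the half-open `(start, end)` region format, and the choice of alignment length as the denominator are all assumptions made for illustration.

```python
def mask_sequence(seq, regions, mode="replace", protein=True):
    """Mask half-open [start, end) regions of a query sequence.

    mode="replace": overwrite masked residues with X (protein) or n (nucleotide).
    mode="lower":   keep masked residues but show them in lower case.
    (Hypothetical helper for illustration only.)
    """
    out = list(seq.upper())
    fill = "X" if protein else "n"
    for start, end in regions:
        for i in range(start, min(end, len(out))):
            out[i] = fill if mode == "replace" else out[i].lower()
    return "".join(out)


def percent_identity(query_aln, subject_aln):
    """Percent of aligned columns with identical residues, ignoring case.

    Assumes two gapped strings of equal length and uses the alignment
    length as the denominator (an assumption of this sketch).
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned strings must have equal length")
    matches = sum(
        1
        for q, s in zip(query_aln, subject_aln)
        if q != "-" and q.upper() == s.upper()
    )
    return 100.0 * matches / len(query_aln)


query = "MKTAAAAAAAALV"      # toy protein with a low-complexity alanine run
low_complexity = [(3, 11)]   # region flagged by a hypothetical filter

replaced = mask_sequence(query, low_complexity, mode="replace")
lowered = mask_sequence(query, low_complexity, mode="lower")
print(replaced)
print(lowered)
```

With lower-case masking the original residues remain available, so a case-insensitive comparison such as `percent_identity(lowered, query)` can count matches inside the filtered region, whereas the X's produced by replacement masking cannot match anything.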