
BLAST Lab

How to Search Custom Databases in Web BLAST Using Entrez Queries


A powerful new feature of the BLAST Web interface is the ability to limit BLAST searches to a subset of any database using a standard Entrez query. The limitation by Entrez query gives users a flexibility in database searching hitherto obtainable only by using a local version of BLAST with a locally constructed database. Skillful use of Entrez queries allows the equivalent of on-the-fly construction of databases of exact composition.



Entrez queries are entered into a box in the BLAST formatting area entitled “Limitation by Entrez Query”. It is helpful to first construct the query from within Entrez and verify that it returns the desired subset of sequences, before attempting to use it with BLAST. A successful Entrez text query may be pasted into the Entrez Limitation box on the BLAST page. Be sure that the database chosen is compatible with the Entrez limitation used. For example, an Entrez query that picks up ESTs will be of no use when searching the nr database since nr contains no EST sequences. Two examples will illustrate the utility of the Entrez Limitation.

In the simplest case, one might desire to search the nr protein database for matches to a particular type of protein from a particular class of organisms. The following Entrez query defines a search of only viral helicase proteins:

viruses[orgn] AND helicase [protein name]

Note that the limitation above retrieves only proteins annotated as helicases; helicases that lack this annotation will be missed. It does, however, ensure that your results contain nothing but viral proteins annotated as helicase or helicase-related proteins.
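The advice to verify a query in Entrez before pasting it into BLAST can also be scripted. The following is a minimal sketch, not part of the BLAST interface itself: it composes the field-tagged query shown above and builds a URL that asks the Entrez E-utilities `esearch` service for the number of matching records. The helper names (`entrez_term`, `entrez_and`, `esearch_count_url`) are hypothetical, and the use of E-utilities with the `protein` database is an assumption; checking the query interactively in Entrez works just as well.

```python
from urllib.parse import urlencode

# Assumption: the NCBI E-utilities esearch endpoint is used for the check.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def entrez_term(value, field):
    """Return a single field-tagged Entrez term, e.g. viruses[orgn]."""
    return f"{value}[{field}]"


def entrez_and(*terms):
    """Join Entrez terms with Boolean AND."""
    return " AND ".join(terms)


def esearch_count_url(query, db="protein"):
    """Build (but do not fetch) an esearch URL that returns only a hit count."""
    params = {"db": db, "term": query, "rettype": "count"}
    return ESEARCH_URL + "?" + urlencode(params)


# The viral helicase limitation from the text.
query = entrez_and(
    entrez_term("viruses", "orgn"),
    entrez_term("helicase", "protein name"),
)
url = esearch_count_url(query)

print(query)
print(url)
```

If the count returned for the URL is zero, or far larger than expected, the query probably needs adjusting before it is used as a BLAST limitation.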

Another interesting use of the Entrez query limitation is to limit a BLAST search to an arbitrary set of sequences defined by a list of accession numbers.


Figure 1: List of EST sequences in UniGene cluster Hs.2, taken from a portion of the UniGene report.


Suppose that you have identified a UniGene cluster of interest and wish to BLAST an mRNA from that cluster against the ESTs from the same cluster in order to see how they map to the mRNA. This can be accomplished without downloading a single sequence by using the human EST database in conjunction with an Entrez limitation. Take UniGene cluster Hs.2 as an example. This cluster includes several mRNA sequences, of which we will use one, D90042, as our query. We wish to map the 18 ESTs belonging to this cluster to the mRNA, so we need to BLAST D90042 against these 18 sequences. The UniGene report for Hs.2 lists the accession numbers involved, as shown in Figure 1. We can simply select and copy this portion of the report into a text editor and then parse the accessions by hand or by using a script. However we perform the parsing, we hope to wind up with an Entrez limitation of the form:

BG569293 OR BG533459 OR....

In this case, the 18 accession numbers are specified explicitly and connected with Boolean OR logic.
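The parsing step can be done with a short script. The following is a minimal sketch under stated assumptions: it treats any token of one letter plus five digits or two letters plus six digits as an accession number, which matches the accessions named in this article but is not a complete description of GenBank accession formats. The function name and the sample text are hypothetical; the sample only mimics text copied from a UniGene report and uses the two accessions quoted above.

```python
import re

# Assumption: accessions look like D90042 (1 letter + 5 digits) or
# BG569293 (2 letters + 6 digits), optionally followed by a version (.1).
ACCESSION_RE = re.compile(r"\b(?:[A-Z]\d{5}|[A-Z]{2}\d{6})(?:\.\d+)?\b")


def accessions_to_entrez_query(report_text, exclude=()):
    """Extract accessions from pasted text and join them with Boolean OR.

    Duplicates are dropped, first-seen order is kept, and any accession
    listed in `exclude` (for example, the mRNA used as the BLAST query)
    is left out of the limitation.
    """
    accessions = []
    for match in ACCESSION_RE.finditer(report_text):
        accession = match.group(0).split(".")[0]  # drop any version suffix
        if accession not in accessions and accession not in exclude:
            accessions.append(accession)
    return " OR ".join(accessions)


# Hypothetical pasted text; the descriptions are placeholders, not real data.
pasted = """
D90042    (placeholder description of the mRNA)
BG569293  (placeholder description)
BG533459  (placeholder description)
BG569293  (placeholder description, repeated line)
"""

limitation = accessions_to_entrez_query(pasted, exclude=("D90042",))
print(limitation)
```

The resulting string can be pasted directly into the Entrez limitation box. Excluding the query mRNA is optional; it simply keeps the limitation restricted to the ESTs.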


Figure 2: BLAST graphical overview for a search with an mRNA from UniGene cluster Hs.2 against 18 ESTs from the same cluster.


The graphical overview for this search, given in Figure 2, shows the alignment of the 18 ESTs in Hs.2 to mRNA D90042 from the same cluster. Such an alignment is a useful way to visualize the distribution of 3' and 5' ESTs within a cluster. The BLAST report itself can be used to distinguish between 5' and 3' ESTs.



The BLAST Lab feature is intended to provide detailed technical information on some of the more specialized uses of the BLAST family of programs. Topics are selected from the range of questions received by the BLAST Help Group.





NCBI News | Fall 2001 NCBI News