

6.14 Pattern Syntax for PHI-BLAST

A PHI-BLAST search requires two inputs: a pattern and a protein sequence containing the pattern. This dual requirement increases search specificity by removing random hits from the results. PHI-BLAST is generally used as the first step of a PSI-BLAST search.

The syntax for pattern specification in PHI-BLAST follows the conventions of PROSITE. A PHI-BLAST search through URLAPI takes only one pattern per search request.

Table 6.14.1 PHI-BLAST Pattern Syntax
Syntax                      Meaning
ABCDEFGHIKLMNPQRSTVWXYZU    Protein alphabet
[]                          Any one of the characters enclosed in the brackets, e.g., [LFYT] means one occurrence of L, F, Y, or T
-                           Nothing; used as a spacer to clearly separate each position
X                           With nothing following, means any one residue
(n)                         The preceding residue is repeated n times
(m,n)                       The preceding residue is repeated between m and n times (m < n)
>                           Allowed only at the end of a pattern and means nothing; may occur before a period
.                           May be used at the end of a pattern; means nothing
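
The syntax rules in Table 6.14.1 map closely onto regular expressions, which gives a quick way to check locally whether a query sequence contains a pattern before submitting it. The following is a minimal sketch, not part of PHI-BLAST or URLAPI: the function name `phi_pattern_to_regex` and the toy sequence are assumptions made for illustration, and only the syntax elements listed in the table are handled.

```python
import re

# One pattern element: a bracketed set or a single letter, optionally
# followed by a repeat count (n) or a repeat range (m,n).
TOKEN = re.compile(r"(\[[A-Z]+\]|[A-Z])(?:\((\d+)(?:,(\d+))?\))?")


def phi_pattern_to_regex(pattern):
    """Translate a PHI-BLAST style pattern into a Python regex string (hypothetical helper)."""
    body = pattern.strip().upper()
    # A trailing period and a trailing '>' mean nothing, so drop them.
    body = body.rstrip(".").rstrip(">")
    # '-' is only a spacer between positions.
    body = body.replace("-", "")

    parts = []
    pos = 0
    while pos < len(body):
        m = TOKEN.match(body, pos)
        if m is None:
            raise ValueError(f"cannot parse pattern at offset {pos}: {body[pos:]!r}")
        residue, low, high = m.groups()
        # X stands for any residue.
        atom = "." if residue == "X" else residue
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"
        parts.append(atom + repeat)
        pos = m.end()
    return "".join(parts)


regex = phi_pattern_to_regex("[LIVMF]-G-E-X-[GAS]-[LIVM]-X(5,11)-R-[STAQ]-A-X-[LIVMA]-X-[STACV].")
print(regex)

# Toy sequence invented for this example; it is not a real protein.
toy_sequence = "MMLGEKGLAAAAARSAKLKSPP"
hit = re.search(regex, toy_sequence)
print(hit.group(0) if hit else "no match")
```

A regular expression only reports whether and where the pattern occurs in a sequence; it does not reproduce the alignment scoring or statistics that PHI-BLAST applies around the pattern match.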

The meaning of the example pattern '[LIVMF]-G-E-X-[GAS]-[LIVM]-X(5,11)-R-[STAQ]-A-X-[LIVMA]-X-[STACV].' is given in the table below for reference. For more information on pattern searches using standalone tools from NCBI and a detailed description of the pattern syntax, see http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/seedtop.html.

Table 6.14.2 PHI-BLAST Pattern Syntax
Pattern Position    Pattern Syntax    Meaning                    Actual Length
1                   [LIVMF]           Any one of LIVMF           1
2                   G                 One G                      1
3                   E                 One E                      1
4                   X                 Any one residue            1
5                   [GAS]             Any one of GAS             1
6                   [LIVM]            Any one of LIVM            1
7                   X(5,11)           5 to 11 of any residue     5 to 11
8                   R                 One R                      1
9                   [STAQ]            Any one of STAQ            1
10                  A                 One A                      1
11                  X                 Any one residue            1
12                  [LIVMA]           Any one of LIVMA           1
13                  X                 Any one residue            1
14                  [STACV]           Any one of STACV           1

Note: A match to this pattern can be between 18 and 24 letters long: the 13 fixed positions contribute one letter each, and X(5,11) contributes 5 to 11. Users can use this pattern with NP_508999.2 in a test PHI-BLAST search.
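
The length range in the note can be checked by summing the minimum and maximum length of each pattern position. The following is a minimal sketch under the syntax rules of Table 6.14.1; the function name `pattern_length_range` is an assumption made for illustration and is not part of any NCBI tool.

```python
import re

# One pattern element: a bracketed set or a single letter, optionally
# followed by a repeat count (n) or a repeat range (m,n).
TOKEN = re.compile(r"(\[[A-Z]+\]|[A-Z])(?:\((\d+)(?:,(\d+))?\))?")


def pattern_length_range(pattern):
    """Return (shortest, longest) match length for a pattern (hypothetical helper)."""
    # Drop the meaningless trailing '.' and '>' and the '-' spacers.
    body = pattern.strip().upper().rstrip(".").rstrip(">").replace("-", "")

    shortest = 0
    longest = 0
    pos = 0
    while pos < len(body):
        m = TOKEN.match(body, pos)
        if m is None:
            raise ValueError(f"cannot parse pattern at offset {pos}: {body[pos:]!r}")
        _, low, high = m.groups()
        # No repeat means exactly one residue; (n) means exactly n residues.
        min_len = int(low) if low is not None else 1
        max_len = int(high) if high is not None else min_len
        shortest += min_len
        longest += max_len
        pos = m.end()
    return shortest, longest


print(pattern_length_range("[LIVMF]-G-E-X-[GAS]-[LIVM]-X(5,11)-R-[STAQ]-A-X-[LIVMA]-X-[STACV]."))
# prints (18, 24)
```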


Tao Tao 2007-08-03