A PHI-BLAST search requires two inputs: a pattern and a protein sequence containing the pattern. This dual requirement increases search specificity by removing random hits from the results. PHI-BLAST is generally used as the first step of a PSI-BLAST search.
The syntax for pattern specification in PHI-BLAST follows the conventions of PROSITE. A PHI-BLAST search through URLAPI takes only one pattern per search request.

Table 6.14.1 PHI-BLAST Pattern Syntax

| Syntax | Meaning |
|---|---|
| ABCDEFGHIKLMNPQRSTVWXYZU | Protein Alphabet |
| [] | means any one of the characters enclosed in the brackets, e.g., [LFYT] means one occurrence of L or F or Y or T |
| - | nothing, used as a spacer to clearly separate each position |
| X | with nothing following means any residue |
| (n) | means the preceding residue is repeated n times |
| (m,n) | means the preceding residue is repeated between m and n times (m < n) |
| > | only at the end of a pattern and means nothing, may occur before a period |
| . | may be used at the end, means nothing |
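
The syntax rules in the table above can be sketched as a small translator from a PHI-BLAST pattern to a Python regular expression, which is handy for checking that a query sequence actually contains the pattern before submitting a search. This is only an illustration: the function name `phi_pattern_to_regex` is hypothetical, matching is done by Python's `re` module rather than by PHI-BLAST itself, and `X` is assumed to stand for any single uppercase letter.

```python
import re

# One pattern element: a bracketed set or a single letter,
# optionally followed by (n) or (m,n).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Z])(?:\((\d+)(?:,(\d+))?\))?")


def phi_pattern_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style PHI-BLAST pattern into a Python regex (sketch)."""
    # A trailing '.' and '>' mean nothing, and '-' is only a spacer.
    core = pattern.strip().rstrip(".").rstrip(">").replace("-", "")
    pieces = []
    pos = 0
    while pos < len(core):
        match = _ELEMENT.match(core, pos)
        if match is None:
            raise ValueError(f"unrecognized pattern syntax at offset {pos}: {core[pos:]!r}")
        residue, low, high = match.groups()
        if residue == "X":
            residue = "[A-Z]"  # assumption: X matches any single residue letter
        if high is not None:
            repeat = "{%s,%s}" % (low, high)
        elif low is not None:
            repeat = "{%s}" % low
        else:
            repeat = ""
        pieces.append(residue + repeat)
        pos = match.end()
    return "".join(pieces)


if __name__ == "__main__":
    regex = phi_pattern_to_regex("[LFYT]-X(2,4)-G.")
    print(regex)                                   # [LFYT][A-Z]{2,4}G
    print(bool(re.search(regex, "MKLAAGT")))       # True
```

Because `-` is treated purely as a spacer, `[LFYT]-X(2,4)-G.` and `[LFYT]X(2,4)G` translate to the same expression.
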
The meaning of the example pattern '[LIVMF]-G-E-X-[GAS]-[LIVM]-X(5,11)-R-[STAQ]-A-X-[LIVMA]-X-[STACV].' is given in the table below for reference. For more information on pattern searches using standalone tools from NCBI and a detailed description of the pattern syntax, see http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/seedtop.html.

Table 6.14.2 PHI-BLAST Pattern Syntax

| Pattern Position | Pattern Syntax | Meaning | Actual Length |
|---|---|---|---|
| 1 | [LIVMF] | Any one of LIVMF | 1 |
| 2 | G | One G | 1 |
| 3 | E | One E | 1 |
| 4 | X | Any, one | 1 |
| 5 | [GAS] | Any one of GAS | 1 |
| 6 | [LIVM] | Any one of LIVM | 1 |
| 7 | X(5,11) | 5 to 11 of any | 5 to 11 |
| 8 | R | One R | 1 |
| 9 | [STAQ] | Any one of STAQ | 1 |
| 10 | A | One A | 1 |
| 11 | X | Any, one | 1 |
| 12 | [LIVMA] | Any one of LIVMA | 1 |
| 13 | X | Any, one | 1 |
| 14 | [STACV] | Any one of STACV | 1 |
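
The "Actual Length" column can be totalled to get the shortest and longest stretch of sequence the example pattern can cover. The following is a minimal sketch of that arithmetic; the helper name `pattern_length_range` is hypothetical and the parsing assumes only the syntax elements listed in Table 6.14.1.

```python
import re

# One pattern element: a bracketed set or a single letter,
# optionally followed by (n) or (m,n). Only the repeat counts are captured.
_ELEMENT = re.compile(r"(?:\[[A-Z]+\]|[A-Z])(?:\((\d+)(?:,(\d+))?\))?")

EXAMPLE = "[LIVMF]-G-E-X-[GAS]-[LIVM]-X(5,11)-R-[STAQ]-A-X-[LIVMA]-X-[STACV]."


def pattern_length_range(pattern: str) -> tuple:
    """Return (shortest, longest) number of residues a pattern can match."""
    # A trailing '.' and '>' mean nothing, and '-' is only a spacer.
    core = pattern.strip().rstrip(".").rstrip(">").replace("-", "")
    shortest = longest = 0
    pos = 0
    while pos < len(core):
        match = _ELEMENT.match(core, pos)
        if match is None:
            raise ValueError(f"unrecognized pattern syntax at offset {pos}: {core[pos:]!r}")
        low, high = match.groups()
        min_len = int(low) if low is not None else 1        # no repeat means one residue
        max_len = int(high) if high is not None else min_len
        shortest += min_len
        longest += max_len
        pos = match.end()
    return shortest, longest


if __name__ == "__main__":
    print(pattern_length_range(EXAMPLE))  # (18, 24)
```

Thirteen of the fourteen positions contribute exactly one residue each, and position 7 contributes 5 to 11, which gives the 18 to 24 range.
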