Significant or Not? – A Reality Check
(How to determine if a match to a sequence region is significant?)

Sample User Question
My nucleotide sequence (Composition 1) seems to have two regions of significant sequence similarity to other sequences. How can I determine if the matches to both of these regions are significant?

Analysis/Comments
The hardest part of interpreting BLAST results is deciding which results are significant and which are not. Is there a cut-off that can always be used? In the past, a researcher who obtained the sequence of a gene was generally working with that sequence alone and probably knew something about the function of the gene product. In these days of genome sequencing projects, this is no longer the case, which makes interpreting BLAST results even more difficult. In this example we will see that there really is no simple cut-off for significant results.
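
One reason a fixed cut-off fails can be seen from the standard Karlin-Altschul relationship between a normalized bit score S' and the expected number of chance hits, E = m * n * 2^(-S'), where m is the query length and n is the database length. The sketch below is only illustrative: it uses raw lengths rather than the effective lengths that BLAST actually applies, and the function name, score, and database sizes are assumptions made up for the example.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value from a bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# The same 40-bit alignment scored against a small and a large database.
# Both database sizes are hypothetical.
small_db = expect_value(40.0, query_len=500, db_len=1_000_000)
large_db = expect_value(40.0, query_len=500, db_len=10_000_000_000)

print(f"small database: E = {small_db:.2g}")
print(f"large database: E = {large_db:.2g}")
```

An alignment that looks convincing against the small database would be expected by chance several times over in the large one, so the score alone does not settle whether a match is significant.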

Step By Step Guide
- Copy and paste the Composition 1 nucleotide sequence into the BLAST nucleotide sequence search box.
- Push the BLAST button – no need to change any parameters for this search.
- Push the Format button – wait for results.
- The results list shows two areas with strong sequence similarity to other sequences in the database. Examine these sequences and alignments.
- Two things about the significant sequence matches on the extreme right of the sequence should make you suspicious. The first is that they are on the extreme right of the sequence. The second is the names of the closest matches in this area.
- Run VecScreen on some of the sequence matches from this area to confirm your suspicions.
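
The two warning signs in the steps above, a match confined to the end of the query and vector-like names among the closest matches, can be expressed as a rough pre-screen. This is a minimal sketch and not a replacement for VecScreen: the `Hit` class, the keyword list, and the 10% edge threshold are all assumptions chosen for illustration, and the hit titles are invented.

```python
from dataclasses import dataclass

# Hypothetical keywords that suggest a cloning-vector match; tune for your data.
VECTOR_KEYWORDS = ("vector", "plasmid", "cloning", "synthetic construct")


@dataclass
class Hit:
    title: str
    query_start: int  # 1-based position on the query
    query_end: int    # inclusive


def is_suspicious(hit: Hit, query_length: int, edge_fraction: float = 0.1) -> bool:
    """Flag a hit that reaches into either end of the query and has a vector-like title."""
    edge = int(query_length * edge_fraction)
    near_left = hit.query_start <= edge
    near_right = hit.query_end > query_length - edge
    vector_like = any(word in hit.title.lower() for word in VECTOR_KEYWORDS)
    return (near_left or near_right) and vector_like


# Invented hits against a 1000-base query.
hits = [
    Hit("Cloning vector pExample, complete sequence", 930, 1000),
    Hit("Homo sapiens example gene mRNA", 120, 610),
]
flags = [is_suspicious(h, query_length=1000) for h in hits]
print(flags)
```

Hits flagged this way are only candidates for contamination; running VecScreen on them remains the way to confirm it.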

Additional Notes
Foreign sequences will always cause problems when interpreting the results of sequence similarity searches, but even with clean, high-quality sequence the significance of some results can be difficult to discern.
Some recent references about sequence similarity and significance of results:
- Rost, B. Enzyme function less conserved than anticipated. J Mol Biol 318(2):595-608, 2002.
- Bilu, Y. and Linial, M. The advantage of functional prediction based on clustering of yeast genes and its correlation with non-sequence based classifications. J Comput Biol 9(2):193-210, 2002.
- Chang, J.T., Raychaudhuri, S. and Altman, R.B. Including biological literature improves homology search. Pac Symp Biocomput 374-83, 2001.