Protein Sci. Nov 2000; 9(11): 2278–2284.
PMCID: PMC2144484

Evaluation of PSI-BLAST alignment accuracy in comparison to structural alignments.


The PSI-BLAST algorithm has been acknowledged as one of the most powerful tools for detecting remote evolutionary relationships by sequence considerations only. This has been demonstrated by its ability to recognize remote structural homologues and by the greatest coverage it enables in annotation of a complete genome. Although recognizing the correct fold of a sequence is of major importance, the accuracy of the alignment is crucial for the success of modeling one sequence by the structure of its remote homologue. Here we assess the accuracy of PSI-BLAST alignments on a stringent database of 123 structurally similar, sequence-dissimilar pairs of proteins, by comparing them to the alignments defined on a structural basis. Each protein sequence is compared to a nonredundant database of the protein sequences by PSI-BLAST. Whenever a pair member detects its pair-mate, the positions that are aligned both in the sequential and structural alignments are determined, and the alignment sensitivity is expressed as the percentage of these positions out of the structural alignment. Fifty-two sequences detected their pair-mates (for 16 pairs the success was bi-directional when either pair member was used as a query). The average percentage of correctly aligned residues per structural alignment was 43.5 ± 2.2%. Other properties of the alignments were also examined, such as the sensitivity vs. specificity and the change in these parameters over consecutive iterations. Notably, there is an improvement in alignment sensitivity over consecutive iterations, reaching an average of 50.9 ± 2.5% within the five iterations tested in the current study.
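The sensitivity measure described above can be illustrated with a minimal Python sketch. It assumes each alignment is available as a collection of (query position, pair-mate position) residue pairs; this representation, the function name `alignment_accuracy`, and the toy data are hypothetical and are not taken from the article. The abstract defines sensitivity only; the specificity computed here (shared positions as a percentage of the sequence alignment) is an assumed, commonly used counterpart and may differ from the definition in the full text.

```python
from typing import Iterable, Tuple

# One aligned position: (residue index in query, residue index in pair-mate).
# This representation is an assumption made for illustration.
Pair = Tuple[int, int]


def alignment_accuracy(
    sequence_pairs: Iterable[Pair],
    structural_pairs: Iterable[Pair],
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) as percentages.

    sensitivity: shared positions as a percentage of the structural alignment
                 (as described in the abstract).
    specificity: shared positions as a percentage of the sequence alignment
                 (assumed definition, not stated in the abstract).
    """
    seq = set(sequence_pairs)
    struct = set(structural_pairs)
    if not seq or not struct:
        raise ValueError("both alignments must contain at least one aligned pair")

    # Positions aligned identically in both alignments.
    shared = len(seq & struct)
    sensitivity = 100.0 * shared / len(struct)
    specificity = 100.0 * shared / len(seq)
    return sensitivity, specificity


# Toy data: 4 structurally aligned pairs, 5 sequence-aligned pairs, 3 shared.
structural = [(1, 1), (2, 2), (3, 3), (4, 5)]
sequence = [(1, 1), (2, 2), (3, 4), (4, 5), (5, 6)]

sens, spec = alignment_accuracy(sequence, structural)
print(f"sensitivity = {sens:.1f}%, specificity = {spec:.1f}%")
```

With this toy data, three of the four structurally aligned positions are reproduced by the sequence alignment, so sensitivity is 75% while specificity is 60%. Averaging the per-pair sensitivity over all detected pair-mates would give a figure comparable in kind to the averages reported above.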



Articles from Protein Science : A Publication of the Protein Society are provided here courtesy of The Protein Society

