PMC Full-Text Search Results

Items: 2

1.
Figure 1. From: Consensus sequences improve PSI-BLAST through mimicking profile–profile alignments.

Sketch of consensus search. First, the PSSM for a query protein sequence is built by an iterative PSI-BLAST search over a large database of protein sequences (such as UniProt). The resulting PSSM is then used to search and align the sequences in a target database of consensus sequences. Finally, the consensus sequence alignments are translated into alignments of the native raw protein sequences.

Dariusz Przybylski, et al. Nucleic Acids Res. 2007 Apr;35(7):2238-2246.
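
The first two steps of the sketch in Figure 1 can be illustrated as a pair of command lines. This is a minimal sketch, not the authors' pipeline: it assumes the NCBI BLAST+ `psiblast` program, and the file names, database names and iteration count are hypothetical placeholders. The caption does not state which tools or parameters were used, and the third step (translating consensus alignments back to native sequences) is not shown.

```python
import shlex


def consensus_search_commands(query_fasta, profile_db, consensus_db,
                              pssm_path="query.pssm", iterations=3):
    """Return argv lists for steps 1 and 2 of the consensus search sketch.

    All paths, database names and the iteration count are hypothetical.
    """
    # Step 1: iterate PSI-BLAST over a large sequence database and save the PSSM.
    build_pssm = [
        "psiblast", "-query", query_fasta, "-db", profile_db,
        "-num_iterations", str(iterations),
        "-out_pssm", pssm_path, "-save_pssm_after_last_round",
    ]
    # Step 2: restart from the saved PSSM against the consensus-sequence database.
    search_consensus = [
        "psiblast", "-in_pssm", pssm_path, "-db", consensus_db,
        "-outfmt", "6",
    ]
    return build_pssm, search_consensus


if __name__ == "__main__":
    for argv in consensus_search_commands("query.fasta", "uniprot", "consensus_db"):
        print(shlex.join(argv))
```

The commands are only built and printed here; running them would require a local BLAST+ installation and formatted databases.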
2.
Figure 2. From: Consensus sequences improve PSI-BLAST through mimicking profile–profile alignments.

Consensus sequences performed better at any error rate. We compared the performance of BLAST and PSI-BLAST with different strategies for consensus add-ons. Profile-consensus marked our standard approach of aligning a PSI-BLAST profile of the query against a database of consensus sequences (blue circles); profile-consensus_top50% aligned query profiles against a database in which only the 50% most informative residues (Methods) were replaced by consensus sequence (black inverted triangles); profile-consensus_low50% aligned query profiles against a database in which only the 50% least informative residues were replaced by consensus sequence (black rectangles); consensus-consensus marked BLAST-based comparisons between consensus sequences on both sides, i.e. for the database and the query (black circles); sequence-consensus marked BLAST-based comparisons between native sequences on the query side and a database of consensus sequences (black diamonds). For reference, results of the original sequence-based PSI-BLAST (green rectangles) and pairwise BLAST (gray triangles) are also shown. True pairs were sequences from the same SCOP superfamily (similar structure), while false pairs belonged to different SCOP folds (different structure) (Methods).
(A) Alignments (2476 sequences, all versus all) were sorted by e-value. True versus false pairs were counted over all matches found below a given e-value threshold. By construction, we excluded all pairs that were trivially related (Methods), which explains why the curves for pairwise BLAST were so low. Profile alignments of global consensus sequences performed best. The transparent gray lines mark the levels of 10, 20 and 30% errors. For instance, at the 10% error (90% accuracy) level, the profile-based search of global consensus sequences revealed over 66% more correct relations than PSI-BLAST (global-consensus-based = 2483 true positives; PSI-BLAST = 1490).
(B) To rule out that the improvements of consensus sequence-based searches (A) originated from only a few families, we counted the cumulative number of correctly classified pairs (structural similarity recognized) among the n best-scoring alignment pairs (rank n) from each query search (e.g. for rank n = 2 we looked at 4952 pairs, i.e. 2 × 2476). The searches of global consensus sequences performed best at all ranks.

Dariusz Przybylski, et al. Nucleic Acids Res. 2007 Apr;35(7):2238-2246.
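
The counting described for panel (A) can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' evaluation code: the error rate is taken to be false / (true + false) among all hits below the e-value threshold, which matches the caption's "10% error (90% accuracy)" wording, and ties in e-value are not grouped. The function names and the hit representation are hypothetical.

```python
def true_positives_at_error(hits, max_error):
    """Largest number of true pairs reachable at or below max_error.

    hits: iterable of (e_value, is_true_pair) tuples (hypothetical format).
    Error rate is assumed to be false / (true + false) below the threshold.
    """
    best = 0
    n_true = n_false = 0
    # Walk the hits from the most to the least significant e-value.
    for _, is_true in sorted(hits, key=lambda hit: hit[0]):
        if is_true:
            n_true += 1
        else:
            n_false += 1
        if n_false / (n_true + n_false) <= max_error:
            best = n_true
    return best


def relative_gain(new, baseline):
    """Fractional improvement of `new` over `baseline`."""
    return (new - baseline) / baseline


if __name__ == "__main__":
    # True-positive counts at the 10% error level, taken from the caption.
    print(f"{relative_gain(2483, 1490):.1%}")
```

With the caption's counts, `relative_gain(2483, 1490)` evaluates to about 0.666, consistent with the stated "over 66%" improvement.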
