Teresa M. Przytycka’s research group
Algorithmic and Graph Theoretical methods in Computational and Systems Biology

Structure Comparison by Projection Methods

Group members:
- Elena Zotenko
- Teresa M. Przytycka
Collaborators:
- Dianne P. O’Leary
- R.I. Dogan
- W.J. Wilbur
References:
- Elena Zotenko, Dianne P. O’Leary and Teresa M. Przytycka. Secondary Structure Spatial Conformation Footprint (SSEF): A Novel Method for Fast Protein Structure Comparison and Classification. BMC Structural Biology 2006, 6:12 (8 June 2006).
- Zotenko E, Dogan RI, Wilbur WJ, O'Leary DP, Przytycka TM. Structural footprinting in protein structure comparison: the impact of structural fragments. BMC Struct Biol. 2007 Aug 9;7:53.

Background: Recently, a new class of methods for fast
protein structure comparison has emerged. We call the methods in this
class projection methods, as
they rely on a mapping of protein structure into a high-dimensional
vector space. Once the mapping is done, the structure comparison is
reduced to distance computation between corresponding vectors. As
structural similarity is approximated by distance between projections,
the success of any projection method depends on how well its mapping
function is able to capture the salient features of protein structure.
There is no agreement on what constitutes a good projection technique
and the three currently known projection methods utilize very different
approaches to the mapping construction, both in terms of what
structural elements are included and how this information is integrated
to produce a vector representation.
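As a minimal sketch of the idea described above, once every structure has been projected into a vector, comparison reduces to a distance computation and database screening reduces to sorting by that distance. The protein names, the three-dimensional vectors, and the choice of Euclidean distance below are illustrative assumptions, not values or choices taken from the papers.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def screen(query, database, top_k=3):
    """Return the names of the top_k database entries closest to the query.

    `database` maps a structure name to its projection (a list of floats).
    """
    ranked = sorted(database.items(), key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:top_k]]

# Hypothetical projections; real projections have many more dimensions.
projections = {
    "protA": [3.0, 0.0, 1.0],
    "protB": [2.0, 0.0, 1.0],
    "protC": [0.0, 4.0, 0.0],
}
query = [3.0, 0.0, 2.0]
nearest = screen(query, projections, top_k=2)
print(nearest)
```

The expensive step (the mapping) is done once per structure, so each query costs only one distance computation per database entry.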
Results: In this paper we propose a novel projection method
that uses secondary structure information to produce the mapping.
First, a diverse set of spatial arrangements of triplets of secondary
structure elements, a set of structural models, is automatically
selected. Then, each protein structure is mapped into a
high-dimensional vector of
"counts", or footprint, where each count corresponds to
the number of times a given structural model is observed in the
structure, weighted by the precision with which the model is reproduced.
We perform the first comprehensive evaluation of our method together
with all other currently known projection methods, which not only
allows us to compare our method to the methods in the same class but
also creates a unique opportunity for establishing a connection between
a projection technique and performance.
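The weighted counting described above can be sketched as follows. This is only an illustration under stated assumptions: each observed triplet of secondary structure elements and each structural model is represented by a hypothetical numeric descriptor, each observation is credited to its nearest model, and the precision weight is a Gaussian function of the descriptor distance. The descriptor, the nearest-model assignment, and the weighting function are assumptions of this sketch and are not the definitions used in the SSEF paper.

```python
import math

def footprint(observations, models, sigma=1.0):
    """Map a structure to a vector of weighted counts, one per model.

    observations: descriptors of the triplets found in one structure
    models:       descriptors of the selected structural models
    sigma:        assumed width of the Gaussian precision weight
    """
    counts = [0.0] * len(models)
    for obs in observations:
        dists = [math.dist(obs, m) for m in models]
        best = min(range(len(models)), key=dists.__getitem__)
        # An exact match contributes 1.0; a poorer match contributes less.
        counts[best] += math.exp(-dists[best] ** 2 / (2 * sigma ** 2))
    return counts

# Hypothetical two-dimensional descriptors for two models.
models = [(0.0, 0.0), (10.0, 0.0)]
observations = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 0.0)]
fp = footprint(observations, models)
print(fp)
```

Here the second model is reproduced exactly twice, while the first model is seen once exactly and once imprecisely, so its count falls between 1 and 2.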
Conclusions: The results of our evaluation suggest that the
type of structural information used by a projection method affects the
ability of the method to detect structural similarity. In particular,
our method, which bases the mapping on the spatial conformations of
triplets of secondary structure elements, outperforms the other methods in
most of the tests.

Comparison of SSEF with other projection methods:

DATA SETS
- SCOP 1.65
- SCOP 1.69
- CATH 2.6

PERFORMANCE IN SCREENING
- SCOP 1.65
  - all levels combined, the figure from the paper (plot)
- SCOP 1.69
  - the fold level (plot)
  - the super-family level (plot)
  - the family level (plot)
- CATH 2.6
  - the topology level (plot)
  - the super-family level (plot)

PERFORMANCE IN CLASSIFICATION

Structural footprinting in protein structure comparison: the impact of structural fragments.
Zotenko E, Dogan RI, Wilbur WJ, O'Leary DP, Przytycka TM.
BMC Struct Biol. 2007 Aug 9;7:53.

BACKGROUND: One approach for speeding up protein structure comparison is the projection
approach, where a protein structure is mapped to a high-dimensional vector and
structural similarity is approximated by distance between the corresponding
vectors. Structural footprinting methods are projection methods that employ
the same general technique to produce the mapping: first select a
representative set of structural fragments as models and then map a protein
structure to a vector in which each dimension corresponds to a particular
model and "counts" the number of times the model appears in the
structure. The main difference between any two structural footprinting
methods is in the set of models they use; in fact a large number of methods
can be generated by varying the type of structural fragments used and the
amount of detail in their representation. How do these choices affect the
ability of the method to detect various types of structural similarity?
RESULTS: To answer this question, we
benchmarked three structural footprinting methods that vary significantly
in their selection of models against the CATH database. In the first set of
experiments we compared the methods' ability to detect structural
similarity characteristic of evolutionarily related structures, i.e.,
structures within the same CATH superfamily. In the second set of
experiments we tested the methods' agreement with the boundaries imposed by
classification groups at the Class, Architecture, and Fold levels of the
CATH hierarchy.
CONCLUSION: In both experiments we found that the method which uses secondary structure
information has the best performance on average, but no single method is
consistently the best across all groups at a given classification level. We
also found that combining the methods' outputs significantly improves
performance. Moreover, our new techniques to measure and visualize the
methods' agreement with the CATH hierarchy, including the thresholded
affinity graph, are useful beyond this work. In particular, they can be
used to expose a similar composition of different classification groups in
terms of structural fragments used by the method and thus provide an
alternative demonstration of the continuous nature of the protein structure
universe.
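The abstract reports that combining the methods' outputs improves performance but does not say how the outputs are combined. One simple possibility, shown here purely as an assumption and not as the authors' procedure, is to convert each method's distances to ranks and average the ranks per candidate. The method names, candidate names, and distance values are hypothetical.

```python
def ranks(distances):
    """Map each candidate to its rank under one method (0 = closest)."""
    ordered = sorted(distances, key=distances.get)
    return {name: r for r, name in enumerate(ordered)}

def combine(per_method_distances):
    """Average each candidate's rank over all methods; lower is better.

    Every method must score the same set of candidates.
    """
    rank_tables = [ranks(d) for d in per_method_distances]
    names = rank_tables[0].keys()
    return {n: sum(t[n] for t in rank_tables) / len(rank_tables) for n in names}

# Hypothetical query-to-candidate distances from three footprinting methods.
method_a = {"s1": 0.2, "s2": 0.5, "s3": 0.9}
method_b = {"s1": 4.0, "s2": 1.0, "s3": 7.0}
method_c = {"s1": 0.3, "s2": 0.1, "s3": 0.8}

combined = combine([method_a, method_b, method_c])
best = min(combined, key=combined.get)
print(best, combined)
```

Working with ranks rather than raw distances avoids having to put distances from different vector spaces on a common scale.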