Computational Biology Branch
Abstract
Knowledge of protein and domain interactions provides crucial insights into
their function within a cell. Several computational methods have been proposed
to detect interactions between proteins and/or their constitutive domains. In
this work, we focus on approaches based on correlated evolution
(co-evolution) of the sequences of interacting proteins. In this type of
approach, often referred to as the mirrortree method, a high correlation
between the evolutionary histories of two proteins is used as an indicator of
protein interaction. Recently, it has been observed that subtracting the
underlying speciation process, that is, separating co-evolution due to common
speciation divergence from that due to the common function of interacting
pairs, greatly improves the predictive power of the mirrortree approach. In
this paper we investigate possible improvements and limitations of this
method. In particular, we demonstrate that the performance of the mirrortree
method can be further improved by restricting the co-evolution analysis to the
relatively conserved regions in the protein domain sequences (disregarding
highly divergent regions). We provide a theoretical validation of our results,
leading to new insights into the interplay between co-evolution and
speciation of interacting proteins.
Method
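The restriction of the co-evolution analysis to relatively conserved regions can be sketched as a per-column entropy filter on a multiple sequence alignment. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the choices of Shannon entropy in bits (log base 2) and of counting gap characters as ordinary symbols are assumptions, since the text does not specify either.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy of one alignment column, in bits.

    Assumption: log base 2, and every character (including gaps)
    counts as a symbol. The source does not state either choice.
    """
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_reduction(msa, cutoff):
    """Keep only the alignment columns whose entropy is below `cutoff`.

    `msa` is a list of equal-length aligned sequences (strings).
    Returns a new list of sequences restricted to the retained columns.
    """
    columns = list(zip(*msa))
    keep = [i for i, col in enumerate(columns) if column_entropy(col) < cutoff]
    return ["".join(seq[i] for i in keep) for seq in msa]


# Toy alignment (made up for illustration): the third column is highly
# variable, the other columns are fully conserved.
toy_msa = ["ACDA", "ACEA", "ACFA", "ACGA"]
filtered = entropy_reduction(toy_msa, cutoff=1.9)
```

Distances between sequences would then be computed from the filtered alignment instead of the full one, so that highly divergent columns no longer contribute to the evolutionary histories being compared.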
Figure 1. Schema of the mirrortree method with ERS (Entropy Reduction Step)
and speciation subtraction steps. From the initial multiple sequence alignment
(MSA) of each family, only those columns with entropy below a certain
threshold are selected.
Results
Figure 2. ROC curves. The original mirrortree approach, using full sequences
and no correction for speciation, is shown as a dashed black line. The
orthogonal and non-orthogonal subtraction corrections for speciation are shown
as dashed blue and red lines, respectively. The subtraction combined with the
ERS, for which only residues with entropy below 1.9 were selected, is shown as
solid blue and red lines for the orthogonal and non-orthogonal approaches,
respectively. At an error rate of 0.04% (which corresponds to 50 negatives),
the original mirrortree approach retrieved 13 true positives (known
interacting pairs), while the orthogonal and non-orthogonal corrections for
speciation retrieved 14 and 18 true positives, respectively. Combined with the
ERS proposed in this paper (entropy cut-off of 1.9), the orthogonal
subtraction retrieved 24 true positives and the non-orthogonal subtraction 21.
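As a rough check on the numbers quoted in the caption above, the short sketch below works through the arithmetic. It assumes that the "error rate" is a false positive rate (false positives divided by all negative pairs), which the caption does not state explicitly. Under that reading, 50 negatives at 0.04% imply a negative set of about 125,000 pairs. The dictionary keys and function names are hypothetical labels. The counts are copied from the caption.

```python
def implied_negative_set_size(false_positives, error_rate_percent):
    """Total negatives implied by a false-positive count and a rate in percent.

    Assumption: error rate = false positives / total negatives.
    """
    return round(false_positives / (error_rate_percent / 100.0))


# True positives retrieved at the 0.04% error rate, as quoted in the caption.
TRUE_POSITIVES = {
    "mirrortree": 13,
    "orthogonal": 14,
    "non-orthogonal": 18,
    "orthogonal + ERS": 24,
    "non-orthogonal + ERS": 21,
}


def gain_over_baseline(method, baseline="mirrortree"):
    """Ratio of true positives retrieved by `method` to those of `baseline`."""
    return TRUE_POSITIVES[method] / TRUE_POSITIVES[baseline]


total_negatives = implied_negative_set_size(50, 0.04)
best_gain = gain_over_baseline("orthogonal + ERS")  # 24 / 13
```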
Figure 4. (a-b) Vector representation of the two methods for subtracting the
speciation signal.
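The two subtraction schemes can be illustrated on plain distance vectors. The sketch below rests on an assumed reading of the terms, which this page does not define: "orthogonal" is taken to mean removing from each protein's distance vector its projection onto the speciation vector, and "non-orthogonal" to mean subtracting the speciation vector directly (with both vectors assumed to be on the same scale already). All function names and the toy numbers are hypothetical.

```python
import math


def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def subtract_orthogonal(v, s):
    """Remove from v its projection onto s; the result is orthogonal to s."""
    scale = dot(v, s) / dot(s, s)
    return [a - scale * b for a, b in zip(v, s)]


def subtract_non_orthogonal(v, s):
    """Subtract s from v directly (assumes v and s share a common scale)."""
    return [a - b for a, b in zip(v, s)]


def corrected_score(v_a, v_b, s, subtract):
    """Correlate two distance vectors after removing the speciation signal."""
    return pearson(subtract(v_a, s), subtract(v_b, s))


# Made-up pairwise-distance vectors for two protein families and for the
# species tree, listed over the same ordered set of species pairs.
family_a = [0.9, 1.4, 2.2, 1.1, 1.8, 2.9]
family_b = [1.0, 1.3, 2.5, 0.9, 2.0, 2.7]
speciation = [0.8, 1.2, 2.0, 1.0, 1.6, 2.4]

raw = pearson(family_a, family_b)
orth = corrected_score(family_a, family_b, speciation, subtract_orthogonal)
non_orth = corrected_score(family_a, family_b, speciation, subtract_non_orthogonal)
```

In the uncorrected mirrortree setting the score would be `raw`. The corrected scores measure only the correlation that remains once the signal shared with the species tree has been taken out.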