Interactome-wide prediction of protein-protein binding sites reveals effects of protein sequence variation in Arabidopsis thaliana

PLoS One. 2012;7(10):e47022. doi: 10.1371/journal.pone.0047022. Epub 2012 Oct 15.

Abstract

The specificity of protein-protein interactions is encoded in those parts of the sequence that compose the binding interface. Therefore, understanding how changes in protein sequence influence interaction specificity, and possibly the phenotype, requires knowing the location of binding sites in those sequences. However, large-scale detection of protein interfaces remains a challenge. Here, we present a sequence- and interactome-based approach to mine interaction motifs from the recently published Arabidopsis thaliana interactome. The resultant proteome-wide predictions are available via www.ab.wur.nl/sliderbio and set the stage for further investigations of protein-protein binding sites. To assess our method, we first show that, by using a priori information calculated from protein sequences, such as evolutionary conservation and residue surface accessibility, we improve the performance of interface prediction compared to using only interactome data. Next, we present evidence for the functional importance of the predicted sites, which are under stronger selective pressure than the rest of the protein sequence. We also observe a tendency for compensatory mutations in the binding sites of interacting proteins. Subsequently, we interrogated the interactome data to formulate testable hypotheses for the molecular mechanisms underlying effects of protein sequence mutations. Examples include proteins relevant for various developmental processes. Finally, we observed, by analysing pairs of paralogs, a correlation between functional divergence and sequence divergence in interaction sites. This analysis suggests that large-scale prediction of binding sites can cast light on evolutionary processes that shape protein-protein interaction networks.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Arabidopsis / chemistry
  • Arabidopsis / genetics*
  • Arabidopsis / metabolism*
  • Arabidopsis Proteins / chemistry
  • Arabidopsis Proteins / genetics*
  • Arabidopsis Proteins / metabolism*
  • Binding Sites
  • Evolution, Molecular
  • Gene Duplication
  • Models, Biological
  • Models, Molecular
  • Molecular Sequence Data
  • Mutagenesis
  • Protein Binding
  • Protein Interaction Domains and Motifs
  • Protein Interaction Mapping / methods*

Substances

  • Arabidopsis Proteins

Grants and funding

This work was supported by a Netherlands Organisation for Scientific Research (NWO) VENI grant (863.08.027) to ADJvD, the SYSFLO Marie Curie Initial Training Network (FLV), and a PhD grant of the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT-Vlaanderen) to PB. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.