A graph-theory algorithm for rapid protein side-chain prediction

Protein Sci. 2003 Sep;12(9):2001-14. doi: 10.1110/ps.03154503.

Abstract

Fast and accurate side-chain conformation prediction is important for homology modeling, ab initio protein structure prediction, and protein design applications. Many methods have been presented, although only a few computer programs are publicly available. The SCWRL program is one such method and is widely used because of its speed, accuracy, and ease of use. A new algorithm for SCWRL is presented that uses results from graph theory to solve the combinatorial problem encountered in the side-chain prediction problem. In this method, side chains are represented as vertices in an undirected graph. Any two residues that have rotamers with nonzero interaction energies are considered to have an edge in the graph. The resulting graph can be partitioned into connected subgraphs with no edges between them. These subgraphs can in turn be broken into biconnected components, which are graphs that cannot be disconnected by removal of a single vertex. The combinatorial problem is reduced to finding the minimum energy of these small biconnected components and combining the results to identify the global minimum energy conformation. This algorithm is able to complete predictions on a set of 180 proteins with 34342 side chains in <7 min of computer time. The total chi(1) and chi(1 + 2) dihedral angle accuracies are 82.6% and 73.7% using a simple energy function based on the backbone-dependent rotamer library and a linear repulsive steric energy. The new algorithm will allow for use of SCWRL in more demanding applications such as sequence design and ab initio structure prediction, as well addition of a more complex energy function and conformational flexibility, leading to increased accuracy.

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Computational Biology
  • Computer Simulation
  • Databases as Topic
  • Disulfides
  • Models, Molecular
  • Models, Statistical
  • Molecular Conformation
  • Protein Conformation
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Proteomics / methods*
  • Software

Substances

  • Disulfides
  • Proteins