How similar must a template protein be for homology modeling by side-chain packing methods?

Pac Symp Biocomput. 1996:126-41.

Abstract

Given the correct backbone coordinates of a globular protein, side-chain packing methods can generally be expected to predict the side-chain coordinates of the buried core residues accurately. In the context of a study modeling a family of bacteriophage DNA-binding proteins, we observed that when the coordinates of the actual backbone are not available, side-chain packing methods still have predictive value when applied to homologous but imperfect backbones. This is the situation in practical homology modeling, where a target protein sequence is modeled from template structures of known protein homologs. To assess the accuracy of such predictions and their dependence on the extent of homology, we have now extended these studies to a well-characterized family of globin structures that spans a much wider range of sequence-structure similarity. The collective results show a clear relationship, independent of protein family, between side-chain prediction accuracy and the level of similarity between the template and target proteins. We judge this similarity in terms of sequence identity and, in cases where the target structure is available, the backbone r.m.s. deviation between the template structure used for modeling and the actual target structure. In summary, as sequence identity drops from 100% to about 50%, or as the backbone r.m.s. deviation between template and target structures increases from 0 Å to about 1 Å, the overall average r.m.s. error for the buried-core residues rises from 1.2 Å to 1.5 Å, while the χ1 prediction accuracy drops from 85% to 70-75% and the χ2 prediction accuracy drops from 80% to 60-65%. When the sequence identity drops below 50% or the backbone r.m.s. deviation rises above 1 Å, all three measures of prediction accuracy decrease rapidly. When the sequence identity approaches the so-called twilight zone of sequence similarity at around 22%, or when the backbone r.m.s. deviation exceeds 2 Å, the prediction accuracy approaches the values expected for random predictions, namely 3.1 Å for the average r.m.s. error and 22% and 29% for the accuracy of χ1 and χ2 prediction, respectively. These observations provide a practical evaluation of side-chain packing methods and are of value to the homology modeler. The extent to which the backbone topology of a protein fold can constrain internal side-chain orientation gives insight into the plasticity of the sequence-structure relationship found in the architecture of proteins.
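The three accuracy measures quoted above (average r.m.s. error, and χ1 and χ2 prediction accuracy) can be illustrated with a minimal Python sketch. The abstract does not state how the measures were computed, so everything here is an assumption for illustration: the function names are hypothetical, the coordinates are assumed to be already superimposed, and the 40° tolerance for counting a dihedral angle as correct is a commonly used convention rather than a value taken from this paper.

```python
import math


def rms_deviation(coords_a, coords_b):
    """R.m.s. distance (in the input units, e.g. angstroms) between two
    equal-length lists of (x, y, z) atom positions.

    Assumption: the two structures have already been superimposed;
    no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def angular_difference(angle_a, angle_b):
    """Smallest absolute difference between two angles in degrees (0 to 180),
    accounting for wrap-around at +/-180."""
    diff = abs(angle_a - angle_b) % 360.0
    return min(diff, 360.0 - diff)


def chi_accuracy(predicted, observed, tolerance=40.0):
    """Percentage of predicted dihedral angles that fall within `tolerance`
    degrees of the observed angle.

    Assumption: the 40-degree default is an illustrative convention,
    not a threshold reported in the abstract.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("angle lists must be non-empty and equal in length")
    hits = sum(
        1
        for p, o in zip(predicted, observed)
        if angular_difference(p, o) <= tolerance
    )
    return 100.0 * hits / len(predicted)
```

For example, under these assumptions `chi_accuracy([60, -170, 180, -60], [65, 175, 60, -70])` counts three of the four angles as correct, since -170° and 175° differ by only 15° once wrap-around is taken into account.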

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Bacteriophages / metabolism
  • Computer Simulation
  • DNA-Binding Proteins / chemistry*
  • DNA-Binding Proteins / metabolism
  • Globins / chemistry*
  • Globins / metabolism
  • Humans
  • Protein Conformation*
  • Proteins / chemistry*
  • Sequence Alignment*
  • Sequence Homology, Amino Acid

Substances

  • DNA-Binding Proteins
  • Proteins
  • Globins