Massive sequence perturbation of a small protein

Proc Natl Acad Sci U S A. 2005 Oct 18;102(42):14988-93. doi: 10.1073/pnas.0500465102. Epub 2005 Oct 7.

Abstract

Most protein topologies occur rarely in nature, which limits our ability to extract sequence information that could be used to predict structure, function, and evolutionary constraints on protein folds. In principle, the sequence diversity explored by a given protein topology could be expanded by introducing sequence perturbations and selecting variant proteins that fold correctly. However, our capacity to explore sequence space is intrinsically limited by the enormous number of sequences that can be generated from the 20 amino acids and by the small number of variants likely to fold. Here we sought to test whether the sequence space of naturally existing proteins can be explored by simple, sequential degeneration of a complete set of short sequence segments of a model protein, without long-range covariation. Using the Raf Ras-binding domain as a model of a small protein capable of autonomous folding, we degenerated 72 of the 76 positions of the primary structure to all 20 amino acids, in segments of four to seven residues defined by secondary structure, and selected the folded species by their interaction with H-Ras in an in vivo survival-selection assay. The methodology presented allowed for rigorous statistical analysis and comparison of sequence diversity. The ensemble of Raf Ras-binding domain sequence variants obtained recaptures the diversity observed for the ubiquitin-roll topology. A signature sequence for this fold and the implications of this strategy for protein design and structure prediction are discussed.
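The scale argument in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it compares the number of sequences for a fully randomized 76-residue protein with the number for a single degenerated segment of four or seven residues, and then computes a per-position Shannon entropy as one common way to quantify sequence diversity. The function names and the toy alignment are hypothetical, and whether Shannon entropy matches the statistics actually used by the authors is an assumption.

```python
import math
from collections import Counter

AMINO_ACIDS = 20          # size of the amino acid alphabet
PROTEIN_LENGTH = 76       # positions in the domain, as stated in the abstract
SEGMENT_LENGTHS = (4, 7)  # shortest and longest segment, as stated in the abstract


def sequence_space(length, alphabet=AMINO_ACIDS):
    """Number of distinct sequences of a given length over the alphabet."""
    return alphabet ** length


def column_entropy(column):
    """Shannon entropy (in bits) of the residues seen at one aligned position."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    full = sequence_space(PROTEIN_LENGTH)
    print(f"full-length space: about 10^{math.log10(full):.1f} sequences")
    for n in SEGMENT_LENGTHS:
        print(f"{n}-residue segment: {sequence_space(n):,} sequences")

    # Hypothetical toy alignment of selected variants for one 4-residue segment.
    variants = ["AKLV", "AKIV", "SKLV", "ARLV"]
    for i, column in enumerate(zip(*variants), start=1):
        print(f"position {i}: {column_entropy(column):.2f} bits")
```

The point of the comparison is that randomizing one short segment at a time keeps each library within a size that selection can plausibly sample, whereas the full-length space cannot be sampled at all. The entropy of a fully degenerate position is bounded by log2(20), roughly 4.32 bits, so lower values at a position indicate stronger constraint in the selected ensemble.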

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence*
  • Databases, Protein
  • Entropy
  • Humans
  • Models, Molecular
  • Molecular Sequence Data
  • Mutagenesis, Site-Directed
  • Peptide Library
  • Protein Conformation*
  • Protein Folding
  • Random Allocation
  • Reproducibility of Results
  • Sequence Alignment
  • raf Kinases / chemistry*
  • raf Kinases / metabolism

Substances

  • Peptide Library
  • raf Kinases