Euclidean sections of protein conformation space and their implications in dimensionality reduction

Proteins. 2014 Oct;82(10):2585-96. doi: 10.1002/prot.24622. Epub 2014 Jun 19.

Abstract

Dimensionality reduction is widely used to search for the intrinsic reaction coordinates of protein conformational changes. We find that dimensionality-reduction methods that use the pairwise root-mean-square deviation (RMSD) as the local distance metric face a challenge, and we use Isomap as an example to illustrate the problem. We believe that dimensionality-reduction approaches that aim to preserve the geometric relations between objects carry an implied assumption: the original space and the reduced space have the same kind of geometry, for example both Euclidean or both spherical. When the protein free energy landscape is mapped onto a 2D plane or 3D space, the reduced space is Euclidean, so the original space should also be Euclidean. For a protein with N atoms, its conformation space is a subset of the 3N-dimensional Euclidean space R^(3N). We formally define the protein conformation space as the quotient space of R^(3N) by the equivalence relation of rigid motions. Whether the quotient space is Euclidean depends on how it is parameterized. When the pairwise RMSD is employed as the local distance metric, the protein conformation space is represented only implicitly, so it has no direct correspondence to a Euclidean set. We demonstrate that an explicit Euclidean-based representation of the protein conformation space, together with the local distance metric associated with it, improves the quality of dimensionality reduction in the tetra-peptide and β-hairpin systems.
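
As an illustration of the pairwise RMSD metric discussed above, the following minimal sketch computes the minimal RMSD between two conformations with the standard Kabsch superposition and checks that it vanishes for two copies of the same conformation related by a rigid motion, while the plain Euclidean distance in R^(3N) does not. The function names (`kabsch_rmsd`, `random_rigid_motion`) and the random test coordinates are assumptions made for this example. It does not reproduce the explicit Euclidean-based representation proposed in the paper, which the abstract does not specify.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """Minimal RMSD between two (N, 3) coordinate arrays over all rigid motions."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove the translational part by centering both conformations.
    P0 = P - P.mean(axis=0)
    Q0 = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the 3x3 covariance matrix.
    U, _, Vt = np.linalg.svd(P0.T @ Q0)
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = P0 @ R.T - Q0
    return np.sqrt((diff ** 2).sum() / len(P))


def random_rigid_motion(X, rng):
    """Apply a random proper rotation and translation to an (N, 3) array."""
    Qm, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Qm) < 0:
        Qm[:, 0] *= -1.0  # turn a reflection into a rotation
    t = 5.0 * rng.normal(size=3)
    return X @ Qm.T + t


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))          # hypothetical 20-atom conformation
Y = random_rigid_motion(X, rng)       # same conformation, moved rigidly
Z = X + 0.3 * rng.normal(size=X.shape)  # a genuinely different conformation

rmsd_same = kabsch_rmsd(X, Y)         # close to zero: same equivalence class
euclid_same = np.linalg.norm(X - Y)   # large: R^(3N) sees two distinct points
rmsd_diff = kabsch_rmsd(X, Z)         # nonzero: different conformations
```

Here `X` and `Y` are distinct points of R^(3N) but the same point of the quotient space, which is why the pairwise RMSD is defined on equivalence classes rather than on explicit coordinates.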

Keywords: dimensionality reduction; free energy landscape; Isomap; principal component analysis; protein conformation space.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bacterial Proteins / chemistry*
  • Energy Transfer
  • Models, Molecular*
  • Molecular Dynamics Simulation
  • Oligopeptides / chemistry*
  • Peptide Fragments / chemistry*
  • Principal Component Analysis
  • Protein Conformation
  • Protein Folding
  • Protein Interaction Domains and Motifs
  • Protein Structure, Secondary
  • Protein Unfolding
  • Statistics as Topic
  • Surface Properties
  • Terminology as Topic

Substances

  • Bacterial Proteins
  • IgG Fc-binding protein, Streptococcus
  • Oligopeptides
  • Peptide Fragments
  • alanyl-alanyl-alanyl-alanine