Building an automated classification of DNA-binding protein domains

Bioinformatics. 2002:18 Suppl 2:S192-201. doi: 10.1093/bioinformatics/18.suppl_2.s192.

Abstract

The rapid growth of 3D structure data on DNA-protein complexes in the Protein Data Bank (PDB) demands new approaches to annotating and characterizing these data, which in turn should lead to a new understanding of the critical biological processes that involve these complexes. These data, together with those from other protein structure classifications, will become increasingly important for modeling complete proteomes. We propose a fully automated classification of DNA-binding protein domains based on existing 3D structures from the PDB. The classification, by domain, relies on the Protein Domain Parser (PDP) and the Combinatorial Extension (CE) algorithm for structural alignment. The approach involves three steps: analysis of 3D interaction patterns in DNA-protein interfaces, assignment of the structural domains that interact with DNA, and clustering of those domains based on structural similarity and DNA-interaction patterns. We validated and improved the approach by comparing it with existing resources that describe structural and functional classifications of DNA-binding proteins. In the course of the study we defined a set of criteria and heuristics that allow us to automatically build a biologically meaningful classification and define classes of functionally related protein domains. Taking the interactions between protein domains and DNA into account considerably improved classification accuracy. Our approach provides a high-throughput, up-to-date annotation of DNA-binding protein families, available at http://spdc.sdsc.edu.
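The three stages named in the abstract (interface analysis, assignment of DNA-interacting domains, clustering) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 4.5 Å contact cutoff, the minimum-contact count, the single-linkage clustering rule, and all function names are assumptions chosen for illustration. Domain boundaries and pairwise similarity scores are taken as given inputs here, standing in for what a domain parser such as PDP and a structural aligner such as CE would supply.

```python
from collections import defaultdict
from math import dist

CONTACT_CUTOFF = 4.5  # assumed distance cutoff in angstroms, not from the paper
MIN_CONTACTS = 2      # assumed minimum number of contacting residues per domain


def dna_contacting_residues(protein_atoms, dna_atoms, cutoff=CONTACT_CUTOFF):
    """Return ids of residues with at least one atom within `cutoff` of any DNA atom.

    protein_atoms: iterable of (residue_id, (x, y, z))
    dna_atoms: list of (x, y, z)
    """
    contacts = set()
    for residue_id, coord in protein_atoms:
        if any(dist(coord, dna_coord) <= cutoff for dna_coord in dna_atoms):
            contacts.add(residue_id)
    return contacts


def dna_binding_domains(domains, contacts, min_contacts=MIN_CONTACTS):
    """Keep domains that have at least `min_contacts` DNA-contacting residues.

    domains: dict mapping domain name -> set of residue ids
    Returns a dict mapping domain name -> its contacting residues.
    """
    binding = {}
    for name, residues in domains.items():
        interface = residues & contacts
        if len(interface) >= min_contacts:
            binding[name] = interface
    return binding


def cluster_domains(names, similarity, threshold):
    """Single-linkage clustering: join domains whose pairwise score >= threshold.

    similarity: dict mapping frozenset({a, b}) -> score (higher = more similar)
    Returns a sorted list of sorted clusters.
    """
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for pair, score in similarity.items():
        if score >= threshold:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for name in names:
        clusters[find(name)].add(name)
    return sorted(sorted(members) for members in clusters.values())
```

In this sketch a similarity score could combine structural similarity with overlap of DNA-contact patterns, which is the combination the abstract reports as improving accuracy; how the two are weighted is left open because the abstract does not specify it.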

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Artificial Intelligence*
  • Binding Sites
  • Computer Simulation
  • DNA / analysis
  • DNA / chemistry*
  • DNA / classification
  • DNA-Binding Proteins / analysis
  • DNA-Binding Proteins / chemistry*
  • DNA-Binding Proteins / classification*
  • Databases, Protein
  • Models, Chemical*
  • Models, Molecular
  • Protein Binding
  • Protein Structure, Tertiary
  • Sequence Analysis / methods*
  • Sequence Analysis, Protein / methods*

Substances

  • DNA-Binding Proteins
  • DNA