Using protein motif combinations to update KEGG pathway maps and orthologue tables

Genome Inform. 2004;15(2):266-75.

Abstract

We have studied the projection of protein family data onto a single translated bacterial genome as a way to visualise relationships between families restricted to bacterial sequences. Any member of any type of family defined in the Pfam database (domains, signatures, etc.) is considered a protein module. Our first goal is to discover rules correlating the occurrence of modules with biochemical properties. To achieve this goal, we have developed a platform that quantifies information found in protein databases and supports analysis of the nature of modules, their positions and their frequencies of occurrence (in isolation or in combination), in association with pathway knowledge found in KEGG. This paper focuses on two pathways, the two-component system and aminophosphonate metabolism, which are partially but not completely documented. Proteins involved in these pathways were listed separately for each organism to analyse module composition, and rules constraining pathway interactions were identified. We show how these results can be used to update KEGG pathways and orthologue tables.
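
As a rough illustration of what counting module occurrences "in isolation or in combination" could look like, the sketch below tallies modules and module pairs over a small set of proteins. It is not the platform described in the abstract: the function name `module_frequencies`, the protein IDs and the toy module assignments are all assumptions made for this example, and a real analysis would read module annotations from Pfam rather than from a hard-coded dictionary.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: protein ID -> ordered list of module names.
# The assignments are illustrative only and are not taken from the paper.
proteins = {
    "protA": ["HisKA", "HATPase_c"],
    "protB": ["Response_reg", "Trans_reg_C"],
    "protC": ["HAMP", "HisKA", "HATPase_c"],
    "protD": ["Response_reg"],
}


def module_frequencies(proteins):
    """Count proteins per module, per isolated module, and per module pair."""
    total = Counter()     # proteins containing the module at least once
    isolated = Counter()  # proteins whose only module is this one
    pairs = Counter()     # proteins containing both modules of a pair
    for modules in proteins.values():
        unique = sorted(set(modules))  # ignore repeats within one protein
        total.update(unique)
        if len(unique) == 1:
            isolated[unique[0]] += 1
        # Pairs are emitted in sorted order, so each pair has one canonical key.
        pairs.update(combinations(unique, 2))
    return total, isolated, pairs


total, isolated, pairs = module_frequencies(proteins)
for pair, count in pairs.most_common():
    print(pair, count)
```

Counting each module once per protein keeps repeated domains from inflating the pair counts; tracking module order or position, which the abstract also mentions, would need the original lists rather than sorted sets.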

MeSH terms

  • Animals
  • Computational Biology
  • Computer Graphics
  • Databases, Genetic*
  • Databases, Protein*
  • Gene Expression Profiling
  • Genome*
  • Humans
  • Information Storage and Retrieval
  • Multigene Family
  • Proteins* / chemistry
  • Proteins* / genetics
  • Proteins* / metabolism
  • Sequence Homology

Substances

  • Proteins