ARC: automated resource classifier for agglomerative functional classification of prokaryotic proteins using annotation texts

J Biosci. 2007 Aug;32(5):937-45. doi: 10.1007/s12038-007-0094-0.

Abstract

Functional classification of proteins is central to comparative genomics. The need for algorithms tuned to enable integrative interpretation of analytical data is felt globally. The availability of a general, automated software tool with built-in flexibility will significantly aid this activity. We have prepared ARC (Automated Resource Classifier), an open-source tool that meets this requirement for flexibility. The default classification scheme, based on keyword matching, is agglomerative and directs each entry into one of 7 basic non-overlapping functional classes: Cell wall, Cell membrane and Transporters (C); Cell division (D); Information (I); Translocation (L); Metabolism (M); Stress (R); and Signal and communication (S); or into one of 2 ancillary classes: Others (O) and Hypothetical (H). The keyword library of ARC was built serially, first by drawing keywords from Bacillus subtilis and Escherichia coli K12. In subsequent steps, this library was further enriched by collecting terms from the archaeal representative Archaeoglobus fulgidus, Gene Ontology, and Gene Symbols. ARC is 94.04% successful on 675,663 annotated proteins from 348 prokaryotes. Three examples are provided to illuminate the current perspectives on mycobacterial physiology and costs of proteins in 333 prokaryotes. ARC is available at http://arc.igib.res.in.
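As an illustration only, the idea of directing annotation texts into classes by keyword match can be sketched as below. This is not ARC's code: the keyword lists, the function name classify, the rule that the class with the most keyword hits wins, and the fallback order (H for texts containing "hypothetical", otherwise O) are all assumptions made for the sketch. Only the nine class codes come from the abstract.

```python
import re

# Hypothetical, deliberately tiny keyword library keyed by the class codes
# named in the abstract. ARC's real library is far larger and is not
# reproduced here.
KEYWORDS = {
    "C": ["cell wall", "membrane", "transporter", "permease"],
    "D": ["cell division", "septum"],
    "I": ["ribosomal", "polymerase", "trna"],
    "L": ["translocase", "secretion"],
    "M": ["dehydrogenase", "synthase", "reductase"],
    "R": ["heat shock", "chaperone"],
    "S": ["sensor", "response regulator"],
}


def classify(annotation: str) -> str:
    """Return a one-letter class code for an annotation text.

    Assumed rules (not taken from the source): the class with the most
    whole-word keyword hits wins, ties go to the class listed first in
    KEYWORDS, unmatched texts containing 'hypothetical' go to H, and
    everything else goes to O.
    """
    text = annotation.lower()
    hits = {}
    for cls, words in KEYWORDS.items():
        count = sum(
            1 for w in words
            if re.search(r"\b" + re.escape(w) + r"\b", text)
        )
        if count:
            hits[cls] = count
    if hits:
        # max() returns the first key with the highest count, so ties
        # resolve in KEYWORDS insertion order.
        return max(hits, key=hits.get)
    if "hypothetical" in text:
        return "H"
    return "O"


if __name__ == "__main__":
    for text in [
        "ABC transporter permease",
        "30S ribosomal protein S1",
        "hypothetical protein",
        "phage tail protein",
    ]:
        print(classify(text), text)
```

Every entry receives exactly one code, which mirrors the non-overlapping classes described above. A real keyword library would need a more careful precedence policy than the simple hit count assumed here.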

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Archaeal Proteins / classification*
  • Archaeal Proteins / physiology*
  • Archaeoglobus fulgidus / chemistry
  • Archaeoglobus fulgidus / physiology
  • Bacillus subtilis / chemistry
  • Bacillus subtilis / physiology
  • Bacterial Proteins / classification*
  • Bacterial Proteins / physiology*
  • Computational Biology
  • Escherichia coli K12 / chemistry
  • Escherichia coli K12 / physiology
  • Escherichia coli Proteins / classification
  • Escherichia coli Proteins / physiology
  • Mycobacterium bovis / chemistry
  • Mycobacterium bovis / physiology
  • Mycobacterium leprae / chemistry
  • Mycobacterium leprae / physiology
  • Mycobacterium tuberculosis / chemistry
  • Mycobacterium tuberculosis / physiology
  • Protein Array Analysis

Substances

  • Archaeal Proteins
  • Bacterial Proteins
  • Escherichia coli Proteins