An extensible automated protein annotation tool: standardizing input and output using validated XML

Bioinformatics. 2006 Feb 1;22(3):291-6. doi: 10.1093/bioinformatics/bti808. Epub 2005 Dec 8.

Abstract

Motivation: There is a frequent need to apply a large range of local or remote prediction and annotation tools to one or more sequences. We have created a tool able to dispatch one or more sequences to assorted services by defining a consistent XML format for data and annotations.

Results: By analyzing annotation tools, we have determined that annotations can be described using one or more of six forms of data: numeric or textual annotation of residues, domains (residue ranges) or whole sequences. With this in mind, XML DTDs have been designed to store the input and output of any server. Plug-in wrappers for a number of services have been written and are called from a master script. The resulting APATML output is then formatted for display as HTML. Alternatively, further tools may be written to perform post-analysis.
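
As an illustration of the six forms of data described above (three scopes, each carrying either a numeric or a textual value), the following minimal Python sketch models one annotation and serializes a set of them to XML. It is not the published APATML DTD: the element and attribute names (annotations, annotation, scope, type, start, end), the Annotation class, the to_xml helper and the sequence identifier "seq1" are all assumptions made up for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Union
import xml.etree.ElementTree as ET


class Scope(Enum):
    """The three things an annotation can be attached to."""
    RESIDUE = "residue"
    DOMAIN = "domain"      # a residue range
    SEQUENCE = "sequence"  # the whole sequence


@dataclass
class Annotation:
    """One annotation: a scope plus a numeric or textual value (hypothetical model)."""
    scope: Scope
    value: Union[float, str]
    start: Optional[int] = None  # 1-based residue position (residue and domain scopes)
    end: Optional[int] = None    # last residue of the range (domain scope only)

    @property
    def kind(self) -> str:
        # Scope x kind gives the six forms: 3 scopes x {numeric, textual}.
        return "numeric" if isinstance(self.value, (int, float)) else "textual"

    def validate(self) -> None:
        """Check that the positions supplied match the scope."""
        if self.scope is Scope.RESIDUE:
            if self.start is None or self.end is not None:
                raise ValueError("residue annotation needs start and no end")
        elif self.scope is Scope.DOMAIN:
            if self.start is None or self.end is None or self.start > self.end:
                raise ValueError("domain annotation needs start <= end")
        elif self.start is not None or self.end is not None:
            raise ValueError("sequence annotation takes no positions")


def to_xml(sequence_id: str, annotations: List[Annotation]) -> str:
    """Serialize annotations to an XML string using assumed element names."""
    root = ET.Element("annotations", {"sequence": sequence_id})
    for ann in annotations:
        ann.validate()
        el = ET.SubElement(
            root, "annotation", {"scope": ann.scope.value, "type": ann.kind}
        )
        if ann.start is not None:
            el.set("start", str(ann.start))
        if ann.end is not None:
            el.set("end", str(ann.end))
        el.text = str(ann.value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    example = [
        Annotation(Scope.RESIDUE, 0.82, start=5),              # numeric, per residue
        Annotation(Scope.DOMAIN, "kinase", start=10, end=50),  # textual, per domain
        Annotation(Scope.SEQUENCE, "membrane protein"),        # textual, whole sequence
    ]
    print(to_xml("seq1", example))
```

In a design of this kind, each plug-in wrapper would only need to translate a service's native output into such records, so that the display and post-analysis steps can work from a single format. Validation against a DTD, which the abstract describes, is not performed by this sketch.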

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Amino Acid Sequence
  • Database Management Systems / standards
  • Databases, Protein
  • Documentation / methods*
  • Information Storage and Retrieval / methods*
  • Information Storage and Retrieval / standards*
  • Molecular Sequence Data
  • Programming Languages
  • Proteins / chemistry*
  • Proteins / classification*
  • Proteins / metabolism
  • Sequence Analysis, Protein / methods
  • Sequence Analysis, Protein / standards*
  • Software*

Substances

  • Proteins