Proteome Analyst: custom predictions with explanations in a web-based tool for high-throughput proteome annotations

Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W365-71. doi: 10.1093/nar/gkh485.

Abstract

Proteome Analyst (PA) (http://www.cs.ualberta.ca/~bioinfo/PA/) is a publicly available, high-throughput, web-based system for predicting various properties of each protein in an entire proteome. Using machine-learned classifiers, PA can predict, for example, the GeneQuiz general function and Gene Ontology (GO) molecular function of a protein. In addition, PA is currently the most accurate and most comprehensive system for predicting subcellular localization, the location within a cell where a protein performs its main function. Two other capabilities of PA are notable. First, PA can create a custom classifier to predict a new property, without requiring any programming, based on labeled training data (i.e. a set of examples, each with the correct classification label) provided by a user. PA has been used to create custom classifiers for potassium-ion channel proteins and other general function ontologies. Second, PA provides a sophisticated explanation feature that shows why one prediction is chosen over another. The PA system produces a Naïve Bayes classifier, which is amenable to a graphical and interactive approach to explanations for its predictions; transparent predictions increase the user's confidence in, and understanding of, PA.
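The two capabilities described above, training a classifier from user-supplied labeled examples and explaining why one prediction is chosen over another, can be illustrated with a small Naïve Bayes sketch. This is not PA's implementation. The abstract does not specify PA's feature representation, so the token features, the labels, the smoothing parameter `alpha`, and the function names `train`, `score`, `predict`, and `explain` are all assumptions made for illustration. The sketch uses a multinomial Naïve Bayes model with Laplace smoothing and reports, for each token, how much it shifts the log-odds toward one label over another.

```python
import math
from collections import Counter, defaultdict

def train(examples, alpha=1.0):
    """Fit a multinomial Naive Bayes model.

    examples: list of (tokens, label) pairs; tokens is a list of strings.
    The token features are hypothetical, not PA's actual features.
    """
    label_counts = Counter(label for _, label in examples)
    token_counts = defaultdict(Counter)  # label -> token -> occurrences
    for tokens, label in examples:
        token_counts[label].update(tokens)
    vocab = {t for counts in token_counts.values() for t in counts}
    totals = {lab: sum(token_counts[lab].values()) for lab in label_counts}
    return {"labels": label_counts, "tokens": token_counts, "totals": totals,
            "vocab": vocab, "alpha": alpha, "n": len(examples)}

def log_prior(model, label):
    return math.log(model["labels"][label] / model["n"])

def log_likelihood(model, token, label):
    # Laplace-smoothed estimate of P(token | label).
    a = model["alpha"]
    count = model["tokens"][label][token]
    return math.log((count + a) / (model["totals"][label] + a * len(model["vocab"])))

def score(model, tokens, label):
    # Log of the unnormalized posterior; tokens unseen in training are ignored.
    known = [t for t in tokens if t in model["vocab"]]
    return log_prior(model, label) + sum(log_likelihood(model, t, label) for t in known)

def predict(model, tokens):
    return max(model["labels"], key=lambda lab: score(model, tokens, lab))

def explain(model, tokens, label_a, label_b):
    """Per-token log-odds contributions favoring label_a over label_b,
    sorted from most supportive of label_a to most supportive of label_b."""
    contrib = defaultdict(float)
    for t in tokens:
        if t in model["vocab"]:
            contrib[t] += log_likelihood(model, t, label_a) - log_likelihood(model, t, label_b)
    return sorted(contrib.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical labeled training data (made up for this sketch).
examples = [
    (["kinase", "atp", "cytoplasm"], "cytoplasm"),
    (["ribosomal", "cytoplasm"], "cytoplasm"),
    (["transmembrane", "channel"], "membrane"),
    (["transmembrane", "receptor", "signal"], "membrane"),
]
model = train(examples)
query = ["transmembrane", "channel", "atp"]
print(predict(model, query))
for token, delta in explain(model, query, "membrane", "cytoplasm"):
    print(f"{token}: {delta:+.3f}")
```

Because the Naïve Bayes score is a sum of independent per-feature terms, the contributions returned by `explain`, together with the difference in log priors, add up exactly to the difference between the two class scores. That additivity is what makes this kind of classifier straightforward to explain feature by feature.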

MeSH terms

  • Internet
  • Proteins / classification
  • Proteins / physiology
  • Proteome / chemistry*
  • Proteomics*
  • Reproducibility of Results
  • Sequence Analysis, Protein
  • Software*

Substances

  • Proteins
  • Proteome