J Bioinform Comput Biol. 2007 Dec;5(6):1215-31.

Retrieving mutation-specific information for human proteins in UniProt/Swiss-Prot Knowledgebase.

Author information

Swiss-Prot Group, Swiss Institute of Bioinformatics, Centre Médical Universitaire, 1, rue Michel-Servet, 1211, Geneva 4, Switzerland.


The UniProt/Swiss-Prot Knowledgebase records about 30,500 variants in 5,664 proteins (Release 52.2). Most of these variants are manually curated single amino acid polymorphisms (SAPs) with references to the literature. To keep the list of published documents related to SAPs up to date, an automatic information retrieval method was developed to recover texts mentioning SAPs. The method uses regular expressions (patterns) and rules to detect and validate mutations. When evaluated on a corpus of 9,820 PubMed references, the precision of the retrieval was 89.5% over all variants. Handling nonstandard mutation nomenclature and correcting sequence positions also proved necessary to retrieve a significant number of relevant articles. The method was then applied to the 5,664 proteins with variants: a PubMed query using gene or protein names and a list of mutation-related keywords first retrieved candidate articles, and the SAP detection procedure then recovered the relevant documents. The method was found to be efficient in retrieving new references on known polymorphisms. New references on known SAPs will be made accessible to the public via the Swiss-Prot variant pages.
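The detection and validation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the patterns, the function names (find_mentions, validate), the offsets parameter, and the toy sequence are all assumptions made for illustration. The sketch matches a few common ways of writing a substitution, then accepts a mention only if the stated wild-type residue is found in the protein sequence at the stated position, optionally after a positional shift (for example, when an article numbers residues from the mature chain).

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
AA3 = "|".join(THREE_TO_ONE)
AA1 = "ACDEFGHIKLMNPQRSTVWY"

# Illustrative patterns only (assumed, not taken from the paper).
PATTERNS = [
    # Three-letter forms: "Arg117His", "Arg-117-His", "Arg117 to His", "Arg117->His"
    re.compile(rf"\b({AA3})-?(\d+)\s*(?:->|-|to)?\s*({AA3})\b"),
    # One-letter form: "R117H"
    re.compile(rf"\b([{AA1}])(\d+)([{AA1}])\b"),
]


def find_mentions(text):
    """Return (wild_type, position, mutant) tuples in one-letter code."""
    found = []
    for pattern in PATTERNS:
        for m in pattern.finditer(text):
            wt = THREE_TO_ONE.get(m.group(1), m.group(1))
            mut = THREE_TO_ONE.get(m.group(3), m.group(3))
            if wt != mut:  # a substitution must change the residue
                found.append((wt, int(m.group(2)), mut))
    return found


def validate(mention, sequence, offsets=(0,)):
    """Return the corrected 1-based position, or None if no offset fits.

    A mention is accepted when the wild-type residue occurs in `sequence`
    at the reported position shifted by one of the candidate `offsets`.
    """
    wt, pos, _ = mention
    for offset in offsets:
        index = pos + offset - 1  # convert 1-based position to 0-based index
        if 0 <= index < len(sequence) and sequence[index] == wt:
            return pos + offset
    return None


if __name__ == "__main__":
    # Toy data, invented for this example.
    sequence = "MSGRAHLK"
    text = ("Arg4His and A5V were reported, with elevated A1C; "
            "Gly1 to Ser in the mature chain.")
    for mention in find_mentions(text):
        # Offset 2 models numbering that skips a hypothetical 2-residue prefix.
        print(mention, validate(mention, sequence, offsets=(0, 2)))
```

The one-letter pattern is deliberately permissive, so it also picks up strings such as "A1C" that are not mutations. The sequence check is what filters these out, which is one reason a validation rule is paired with the patterns.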

[Indexed for MEDLINE]
