National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, US Department of Health and Human Services
Vol 14 No 1 of NCBI News

Open Mass Spectrometry Search Algorithm (OMSSA)

Many years ago, proteins were the biological molecules most often identified, sequenced, and quantified by biologists. The advent of rapid methods for nucleic acid sequencing and measurement then made DNA and RNA the center of attention, until recently. Proteins are back now that newer mass spectrometry technologies allow the identification and analysis of proteins from complex biological samples. The current proteomics boom requires efficient computational methods of protein identification, since an analysis can involve thousands of peptide mass spectra derived from coupled liquid chromatography–mass spectrometry of biological samples.

NCBI’s Open Mass Spectrometry Search Algorithm (OMSSA) [1] is a free search engine for identifying peptides from tandem mass spectrometry (MS/MS) peptide spectra. The OMSSA algorithm scores peptide hits using a probability-based method that compares experimental fragments with those calculated from libraries of known protein sequences. The statistical model used by OMSSA is similar to the one used in the BLAST algorithm: OMSSA uses an expect value (E-value) significance threshold, familiar to users of BLAST, to discriminate true matches from those that may be due to chance. OMSSA is very effective at identifying spectra from standard protein cocktails at high speed. The tool works with data from ion traps employing traditional collision-induced dissociation as well as electron transfer dissociation technologies.
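The idea of comparing experimental fragments with calculated ones, and then asking how likely the match is by chance, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OMSSA's actual implementation: it considers singly charged b and y ions only, it assumes the number of chance matches follows a Poisson distribution, and the function names, the 0.5 Da tolerance, and the `expected_random_matches` parameter are all hypothetical.

```python
import math

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_mzs(peptide):
    """Calculated m/z values of singly charged b and y ions (b ions first)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)          # N-terminal fragment
        y_ions.append(sum(masses[i:]) + WATER + PROTON)  # C-terminal fragment
    return b_ions + y_ions


def count_matches(observed, calculated, tol=0.5):
    """Number of calculated fragments with an observed peak within tol Da."""
    return sum(any(abs(o - c) <= tol for o in observed) for c in calculated)


def poisson_tail(k, mean):
    """P(X >= k) for a Poisson-distributed X with the given mean."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - below)


def e_value(matches, expected_random_matches, n_candidates):
    """Expected number of candidates scoring at least this well by chance."""
    return n_candidates * poisson_tail(matches, expected_random_matches)
```

As with BLAST, a smaller E-value indicates a match that is less likely to be due to chance; a hit is reported when its E-value falls below the chosen threshold.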

OMSSA uses pre-formatted BLAST protein libraries from NCBI as the source of the calculated spectra, including the popular NCBI non-redundant protein (nr) and the NCBI reference protein (RefSeq) sets. Custom libraries of FASTA formatted sequences can also be used in the standalone version after processing with the NCBI BLAST utility, “formatdb”. OMSSA is compatible with common data formats for experimental mass spectra including the “dta”, “pkl”, and “mgf” formats.
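For readers unfamiliar with these spectrum formats, the following minimal Python sketch reads the "mgf" (Mascot generic format) layout, in which each spectrum sits between `BEGIN IONS` and `END IONS` lines, carries `KEY=value` parameters such as `TITLE`, `PEPMASS`, and `CHARGE`, and lists one m/z and intensity pair per line. The function name `parse_mgf` and the returned dictionary layout are assumptions made for illustration; this is not the parser OMSSA itself uses.

```python
def parse_mgf(text):
    """Parse MGF text into a list of {'params': {...}, 'peaks': [(mz, intensity), ...]}."""
    spectra = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Parameter line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                current["params"][key.strip().upper()] = value.strip()
            else:
                # Peak line: m/z followed by an optional intensity
                fields = line.split()
                mz = float(fields[0])
                intensity = float(fields[1]) if len(fields) > 1 else 0.0
                current["peaks"].append((mz, intensity))
    return spectra
```

Lines outside a `BEGIN IONS` / `END IONS` pair, such as file-level parameters, are ignored by this sketch.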

The Web Interface

The Web interface to OMSSA, shown in Figure 1, is linked to the OMSSA homepage at:


Figure 1: The Web interface to the OMSSA search tool. Cleavage and modification conditions, database and species selection, and various other settings, including filtering options and stringency, can be specified.

Experimental spectra can be uploaded and compared to calculated spectra from the nr and RefSeq protein libraries limited to selectable organisms, making identification easier. Common digestion methods and post-translational modifications can be selected as well as stringency thresholds. Currently, the Web version can search up to 2000 mass spectra at a time.
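A data set larger than the 2000-spectrum limit can be divided into smaller batches before upload. The Python sketch below shows one way to do that grouping; the helper name `batches` is hypothetical, and submitting each batch as a separate search is an assumed workflow rather than a documented OMSSA feature.

```python
from itertools import islice


def batches(spectra, size=2000):
    """Yield successive lists of at most `size` spectra from any iterable."""
    iterator = iter(spectra)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

Each yielded list can then be written to its own spectrum file and searched on its own.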

Output can be viewed in the Web browser or saved in one of three formats: “csv”, suitable for import into spreadsheets, or the OMSSA “omx” XML and “oms” ASN.1 formats, which are easily parsed by computer programs. The latter two formats can be viewed locally in the OMSSA browser as described below. A sample set of results is available for spectra from a standard mixture of four proteins. Figure 2 shows the results for the analysis of one spectrum from this sample set, identifying chicken lysozyme C.
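Saved “csv” output can also be filtered with a short script instead of a spreadsheet. The Python sketch below keeps only the hits at or below an E-value cutoff. The column name `E-value` and the 0.01 cutoff are assumptions for illustration; check the header row of an actual output file and pass the real column name through the `evalue_column` parameter.

```python
import csv
import io


def significant_hits(csv_text, threshold=0.01, evalue_column="E-value"):
    """Return the rows of a csv result table whose E-value is <= threshold."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    hits = []
    for row in reader:
        # Normalize header names so stray spaces do not break the lookup.
        row = {key.strip(): value for key, value in row.items() if key is not None}
        if float(row[evalue_column]) <= threshold:
            hits.append(row)
    return hits
```

To process a saved file, read its contents with `open(path, newline="").read()` and pass the resulting string to `significant_hits`.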

The Standalone Interface

Standalone versions of both OMSSA and the OMSSA browser can be downloaded from the NCBI FTP site at:


Archives containing binaries for Linux, Mac OS X, and Windows are available. The standalone version offers options not available with the Web interface, such as the use of custom sequence databases. Output from both the Web and standalone versions can be saved locally and displayed with the OMSSA browser program, which provides a convenient graphical view of the results.

Additional information about OMSSA including help documentation and an extensive FAQ can be found on the OMSSA Homepage at:


Users can also subscribe to the OMSSA email list or browse through archived discussion threads from the list.

[1] Geer LY, Markey SP, Kowalak JA, Wagner L, Xu M, Maynard DM, Yang X, Shi W, Bryant SH. 2004. Open mass spectrometry search algorithm. J. Proteome Res. 3(5):958-64. PMID: 15473683

—EK



NCBI News | Summer 2003 NCBI News