Bioinformatics. 2018 Jul 1;34(13):i304-i312. doi: 10.1093/bioinformatics/bty262.

HFSP: high speed homology-driven function annotation of proteins.

Mahlich Y (1,2,3), Steinegger M (2,4,5), Rost B (2,3,6,7,8), Bromberg Y (1,3,9).

Author information

1. Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, USA.
2. Computational Biology & Bioinformatics - i12 Informatics, Technical University of Munich (TUM), Munich, Germany.
3. Institute for Advanced Study, Technical University of Munich (TUM), Munich, Germany.
4. Quantitative and Computational Biology Group, Max-Planck Institute for Biophysical Chemistry, Göttingen, Germany.
5. Department of Chemistry, Seoul National University, Seoul, Korea.
6. TUM School of Life Sciences Weihenstephan (WZW), Technical University Munich (TUM), Freising, Germany.
7. Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA.
8. New York Consortium on Membrane Protein Structure (NYCOMPS), New York, NY, USA.
9. Department of Genetics, Human Genetics Institute, Rutgers University, Piscataway, NJ, USA.

Abstract

Motivation:

The rapid drop in sequencing costs has produced many more (predicted) protein sequences than can feasibly be functionally annotated with wet-lab experiments. Thus, many computational methods have been developed for this purpose. Most of these methods employ homology-based inference, approximated via sequence alignments, to transfer functional annotations between proteins. The increase in the number of available sequences, however, has drastically increased the search space, thus significantly slowing down alignment methods.

Results:

Here we describe homology-derived functional similarity of proteins (HFSP), a novel computational method that uses the results of a high-speed alignment algorithm, MMseqs2, to infer the functional similarity of proteins from their alignment length and sequence identity. We show that our method is accurate (85% precision) and fast (more than 40-fold faster than the state of the art). HFSP can help correct an error rate of at least 16% in legacy curations, even in a resource of as high quality as Swiss-Prot. These findings suggest HFSP as an ideal resource for large-scale functional annotation efforts.
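
The idea of scoring a protein pair from its alignment length and sequence identity can be illustrated with a minimal sketch. The abstract does not state the HFSP formula, so the threshold curve below (toy_threshold) and its constants are hypothetical placeholders chosen for illustration only, as are the function names and the cutoff of 0. The sketch only shows the general pattern: the identity required to infer shared function is assumed to be higher for short alignments than for long ones, and the score is the distance of the observed identity from that length-dependent threshold.

```python
import math
from typing import Callable


def toy_threshold(aln_len: int) -> float:
    """Hypothetical curve (NOT the published HFSP formula).

    Returns the percent identity assumed to be required at a given
    alignment length: close to 100% for very short alignments,
    flattening toward 20% for long ones.
    """
    return 20.0 + 80.0 * math.exp(-aln_len / 100.0)


def similarity_score(
    identity_pct: float,
    aln_len: int,
    threshold: Callable[[int], float] = toy_threshold,
) -> float:
    """Distance of the observed identity from the length-dependent threshold."""
    return identity_pct - threshold(aln_len)


def same_function(identity_pct: float, aln_len: int, cutoff: float = 0.0) -> bool:
    """Call a pair functionally similar if its score reaches the cutoff
    (the cutoff of 0 is an assumption made for this sketch)."""
    return similarity_score(identity_pct, aln_len) >= cutoff


if __name__ == "__main__":
    # Same identity, different alignment lengths: only the longer
    # alignment clears the toy threshold.
    for length in (30, 300):
        print(length, round(similarity_score(50.0, length), 2), same_function(50.0, length))
```

In practice, the identity and alignment-length values would come from the alignment tool's output (MMseqs2 in the paper), and the curve would be replaced by the one the authors fitted.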

Supplementary information:

Supplementary data are available at Bioinformatics online.
