Metallomics. 2014 Oct;6(10):1913-30. doi: 10.1039/c4mt00156g. Epub 2014 Aug 13.

An integrative computational model for large-scale identification of metalloproteins in microbial genomes: a focus on iron-sulfur cluster proteins.

Author information

Univ. Grenoble Alpes, iRTSV-BGE, F-38000 Grenoble, France. johan.estellon@gmail.com.

Abstract

Metalloproteins represent a ubiquitous group of molecules which are crucial to the survival of all living organisms. While several metal-binding motifs have been defined, it remains challenging to confidently identify metalloproteins from primary protein sequences using computational approaches alone. Here, we describe a comprehensive strategy based on a machine learning approach to design and assess a penalized generalized linear model. We used this strategy to detect members of the iron-sulfur cluster protein family. A new category of descriptors, based on profile hidden Markov models and encoding structural information, was combined with public descriptors into a linear model. The model was trained and tested on distinct datasets composed of well-characterized iron-sulfur protein sequences, and the resulting model provided higher sensitivity than a motif-based approach while maintaining a good level of specificity. Analysis of this linear model allows us to detect and quantify the contribution of each descriptor, providing us with a better understanding of this complex protein family along with valuable indications for further experimental characterization. Two newly identified proteins, YhcC and YdiJ, were functionally validated as genuine iron-sulfur proteins, confirming the prediction. The computational model was then applied to over 550 prokaryotic genomes to screen for iron-sulfur proteomes; the results are publicly available. This study represents a proof-of-concept for the application of a penalized linear model to identify metalloprotein superfamilies on a large scale. The application employed here, screening for iron-sulfur proteomes, provides new candidates for further biochemical and structural analysis as well as new resources for an extensive exploration of iron-sulfuromes in the microbial world.
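
As a rough illustration of how a penalized linear model can quantify the contribution of each descriptor, the following minimal sketch fits a logistic regression with an L1 (lasso) penalty by proximal gradient descent. The abstract does not state which penalty or link function the authors used, so both are assumptions here, as are the descriptor names, the toy data, and every function name and parameter value. It is not the authors' implementation.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_l1_logistic(X, y, lam=0.05, step=0.5, n_iter=2000):
    """Fit logistic regression with an L1 penalty on the weights.

    The intercept b is left unpenalized. lam, step and n_iter are
    illustrative defaults, not values taken from the paper.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Residual between predicted probability and label.
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        # Gradient step on the log-loss, then L1 shrinkage on the weights.
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

def predict_proba(x, w, b):
    # Probability that a sequence with descriptor vector x is a positive.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Hypothetical descriptors; the real descriptor set is defined in the paper.
DESCRIPTORS = ["hmm_profile_score", "cys_motif_hit", "unrelated_feature"]

# Toy training set invented for illustration (1 = iron-sulfur protein).
X = [
    [0.9, 1.0, 0.0], [0.8, 1.0, 1.0], [0.7, 0.0, 0.0], [0.9, 1.0, 1.0],
    [0.1, 0.0, 1.0], [0.2, 0.0, 0.0], [0.3, 1.0, 1.0], [0.1, 0.0, 0.0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_l1_logistic(X, y)
for name, weight in zip(DESCRIPTORS, w):
    print(f"{name}: {weight:.3f}")
```

In a model of this kind, a larger absolute weight indicates a descriptor that contributes more to the prediction, and a sufficiently strong penalty drives the weights of uninformative descriptors to exactly zero, which is what makes the fitted model interpretable.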

PMID: 25117543
DOI: 10.1039/c4mt00156g
[Indexed for MEDLINE]
