Appl Microbiol Biotechnol. 2010 Mar;86(1):285-93. doi: 10.1007/s00253-009-2423-8. Epub 2010 Jan 27.

Distinguishable codon usage and amino acid composition patterns among substrates of leaderless secretory pathways from proteobacteria.

Author information

Institute of Microbiology and Biotechnology, University of Latvia, Kronvalda Boulevard 4, LV-1010 Riga, Latvia.

Abstract

The combined set of codon usage frequencies (61 sense codons) from 111 annotated sequences of leaderless secreted type I, type III, type IV, and type VI proteins from proteobacteria was subjected to forward and backward selection to obtain the most effective combination of predictor variables for classification and prediction. A group of 24 codon frequencies displayed strong discriminatory power, correctly classifying 100% of the originally grouped cases and 97.3 ± 1.6% of the leave-one-out cross-validated (LOOCV) cases, with an acceptable error rate (0.062 ± 0.012) in k-fold (k = 6) cross-validation (KCV). The summary frequencies of synonymous codons for ten amino acids, used as alternative predictor variables, showed comparable discriminatory power (92.8 ± 2.5% for LOOCV), although with somewhat lower prediction accuracy (KCV error rate 0.106 ± 0.015). A number of significant (p < 0.001) differences in indices of codon usage and amino acid composition were found between secretion types. About 60% of the secretion substrates were characterized as apparently originating from horizontal gene transfer events or as putative alien genes, and these were unevenly distributed among the groups. The proposed prediction approaches could be used to identify secretome proteins from genomic sequences and to assess the compatibility between bacterial secretion pathways and secretion substrates.
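The two kinds of predictor variables described in the abstract, and the leave-one-out validation step, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, the abstract does not name the classifier, so a nearest-centroid rule is used here purely as a stand-in, and the variable selection step is omitted. The sketch assumes the standard genetic code and in-frame coding sequences, and it sums synonymous codons for all 20 amino acids rather than the ten used in the study.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
SENSE_CODONS = [c for c, aa in CODON_TABLE.items() if aa != "*"]


def codon_frequencies(cds):
    """Relative frequency of each of the 61 sense codons in an in-frame sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    # Stop codons and codons with ambiguous bases are not counted.
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in SENSE_CODONS}
    return {c: counts[c] / total for c in SENSE_CODONS}


def amino_acid_frequencies(freqs):
    """Sum the frequencies of synonymous codons for each amino acid."""
    summed = {}
    for codon, value in freqs.items():
        aa = CODON_TABLE[codon]
        summed[aa] = summed.get(aa, 0.0) + value
    return summed


def loocv_accuracy(features, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in only)."""
    correct = 0
    for held_out, sample in enumerate(features):
        # Build class centroids from every sample except the held-out one.
        sums, sizes = {}, {}
        for j, (x, y) in enumerate(zip(features, labels)):
            if j == held_out:
                continue
            acc = sums.setdefault(y, [0.0] * len(x))
            for k, v in enumerate(x):
                acc[k] += v
            sizes[y] = sizes.get(y, 0) + 1

        def squared_distance(label):
            return sum((s / sizes[label] - v) ** 2
                       for s, v in zip(sums[label], sample))

        predicted = min(sums, key=squared_distance)
        correct += predicted == labels[held_out]
    return correct / len(features)
```

In use, each annotated sequence would be turned into a feature vector such as `[codon_frequencies(seq)[c] for c in SENSE_CODONS]`, paired with its secretion type as the label, and the resulting lists passed to `loocv_accuracy`.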

PMID: 20107986
DOI: 10.1007/s00253-009-2423-8
[Indexed for MEDLINE]
