Coding potential prediction in Wolbachia using artificial neural networks

In Silico Biol. 2007;7(1):105-13.

Abstract

Ab initio coding potential prediction in a bacterial genome is an important step in determining an organism's transcriptional regulatory function. Although gene structure has been studied extensively in a few species such as Escherichia coli, fewer resources exist for newly sequenced genomes such as Wolbachia. A gene prediction model trained on one species may not reflect the properties of other, distantly related prokaryotic organisms. We encountered these issues while predicting genes in the genome of Wolbachia, an important group of Gram-negative bacteria that form intracellular, inherited infections in many invertebrates. We describe a coding potential predictor based on artificial neural networks and compare its performance across different architectures, learning algorithms and parameters. We rely on a positive dataset constructed from coding sequences and a negative dataset consisting of all intergenic regions not located between the genes of an operon. Both datasets were output as FASTA-formatted files and used for neural network training. The fast, adaptive batch learning algorithm Resilient propagation exhibits the best overall performance on a multi-layer perceptron neural network with 64 input, 10 hidden and 1 output nodes.
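
The 64-10-1 architecture and the Resilient propagation (Rprop) training named in the abstract can be illustrated with a minimal sketch. The abstract does not state what the 64 inputs encode; the code below assumes they are in-frame frequencies of the 64 codons. The tanh hidden units, sigmoid output, cross-entropy loss, the iRprop- variant, and every function name and hyperparameter value are likewise illustrative assumptions, not the authors' implementation.

```python
import itertools
import math
import random

# Assumption: the 64 inputs are in-frame codon frequencies (not stated in the abstract).
CODONS = ["".join(c) for c in itertools.product("ACGT", repeat=3)]
CODON_INDEX = {codon: i for i, codon in enumerate(CODONS)}
N_IN, N_HID = 64, 10
OUT = N_HID * (N_IN + 1)        # offset of the output-node weights in the flat vector
N_WEIGHTS = OUT + N_HID + 1     # 661 weights and biases in total


def codon_frequencies(seq):
    """Return a 64-element vector of in-frame codon frequencies."""
    seq = seq.upper()
    counts = [0.0] * N_IN
    for i in range(0, len(seq) - 2, 3):
        idx = CODON_INDEX.get(seq[i:i + 3])
        if idx is not None:     # skip codons containing ambiguous bases
            counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(w, x):
    """Return (hidden activations, output probability) of the 64-10-1 network."""
    hidden = []
    for j in range(N_HID):
        base = j * (N_IN + 1)
        z = w[base + N_IN] + sum(w[base + i] * x[i] for i in range(N_IN))
        hidden.append(math.tanh(z))
    z = w[OUT + N_HID] + sum(w[OUT + j] * hidden[j] for j in range(N_HID))
    return hidden, sigmoid(z)


def batch_gradient(w, xs, ys):
    """Cross-entropy gradient summed over the whole batch (backpropagation)."""
    grad = [0.0] * N_WEIGHTS
    for x, y in zip(xs, ys):
        hidden, p = forward(w, x)
        d_out = p - y           # dLoss/dz at the sigmoid output node
        for j in range(N_HID):
            grad[OUT + j] += d_out * hidden[j]
            d_hid = d_out * w[OUT + j] * (1.0 - hidden[j] ** 2)
            base = j * (N_IN + 1)
            for i in range(N_IN):
                grad[base + i] += d_hid * x[i]
            grad[base + N_IN] += d_hid
        grad[OUT + N_HID] += d_out
    return grad


def mean_cross_entropy(w, xs, ys):
    """Mean binary cross-entropy, with probabilities clamped away from 0 and 1."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(forward(w, x)[1], 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(xs)


def rprop_step(w, grad, prev_grad, step, eta_plus=1.2, eta_minus=0.5,
               step_min=1e-6, step_max=50.0):
    """One in-place Rprop update (iRprop- variant): only gradient signs are used."""
    for i in range(len(w)):
        sign_product = grad[i] * prev_grad[i]
        if sign_product > 0:    # same direction as last epoch: grow the step
            step[i] = min(step[i] * eta_plus, step_max)
        elif sign_product < 0:  # sign flipped: shrink the step, skip this update
            step[i] = max(step[i] * eta_minus, step_min)
            grad[i] = 0.0
        if grad[i] > 0:
            w[i] -= step[i]
        elif grad[i] < 0:
            w[i] += step[i]
        prev_grad[i] = grad[i]


def init_weights(seed=0):
    """Small random initial weights; the range is an arbitrary choice."""
    rng = random.Random(seed)
    return [rng.uniform(-0.5, 0.5) for _ in range(N_WEIGHTS)]


def train(w, xs, ys, epochs=200):
    """Batch training: one Rprop update per pass over all samples."""
    prev_grad = [0.0] * N_WEIGHTS
    step = [0.1] * N_WEIGHTS
    for _ in range(epochs):
        rprop_step(w, batch_gradient(w, xs, ys), prev_grad, step)
    return w
```

In use, each coding sequence from the positive FASTA file and each intergenic region from the negative file would be passed through `codon_frequencies` and labelled 1.0 or 0.0 respectively before calling `train`. Because Rprop adapts a separate step size per weight from the sign of the batch gradient alone, it needs no global learning rate, which is one plausible reason for the speed the abstract reports.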

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Codon
  • Computational Biology / methods*
  • Gene Expression Profiling / methods*
  • Genome, Bacterial
  • Models, Statistical
  • Models, Theoretical
  • Neural Networks, Computer*
  • Pattern Recognition, Automated
  • Wolbachia / genetics
  • Wolbachia / metabolism*

Substances

  • Codon