Send to

Choose Destination
BMC Syst Biol. 2010 May 3;4:56. doi: 10.1186/1752-0509-4-56.

Identifying essential genes in bacterial metabolic networks with machine learning methods.

Author information

Department of Bioinformatics and Functional Genomics, Institute of Pharmacy and Molecular Biotechnology, Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany.



Identifying essential genes in bacteria supports to identify potential drug targets and an understanding of minimal requirements for a synthetic cell. However, experimentally assaying the essentiality of their coding genes is resource intensive and not feasible for all bacterial organisms, in particular if they are infective.


We developed a machine learning technique to identify essential genes using the experimental data of genome-wide knock-out screens from one bacterial organism to infer essential genes of another related bacterial organism. We used a broad variety of topological features, sequence characteristics and co-expression properties potentially associated with essentiality, such as flux deviations, centrality, codon frequencies of the sequences, co-regulation and phyletic retention. An organism-wise cross-validation on bacterial species yielded reliable results with good accuracies (area under the receiver-operator-curve of 75% - 81%). Finally, it was applied to drug target predictions for Salmonella typhimurium. We compared our predictions to the viability of experimental knock-outs of S. typhimurium and identified 35 enzymes, which are highly relevant to be considered as potential drug targets. Specifically, we detected promising drug targets in the non-mevalonate pathway.


Using elaborated features characterizing network topology, sequence information and microarray data enables to predict essential genes from a bacterial reference organism to a related query organism without any knowledge about the essentiality of genes of the query organism. In general, such a method is beneficial for inferring drug targets when experimental data about genome-wide knockout screens is not available for the investigated organism.

[Indexed for MEDLINE]
Free PMC Article

Supplemental Content

Full text links

Icon for BioMed Central Icon for PubMed Central
Loading ...
Support Center