Bioinformatics. 2018 Nov 8. doi: 10.1093/bioinformatics/bty928. [Epub ahead of print]

BacPaCS - Bacterial Pathogenicity Classification via Sparse-SVM.

Author information

Department of Computer Science, Faculty of Natural Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel.
The Shraga Segal Department of Microbiology, Immunology and Genetics, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel.



Bacterial infections are a major cause of illness worldwide. However, most bacterial strains pose no threat to human health and may even be beneficial. Thus, developing powerful diagnostic bioinformatic tools that differentiate pathogenic from commensal bacteria is critical for effective treatment of bacterial infections.


We propose a machine-learning approach for classifying human-hosted bacteria as pathogenic or non-pathogenic based on their genome-derived proteomes. Our approach is based on sparse Support Vector Machines (SVMs), which autonomously select a small set of genes related to bacterial pathogenicity. We implement our approach as a tool, "Bacterial Pathogenicity Classification via sparse-SVM" (BacPaCS), which is fully automated and handles datasets significantly larger than those previously used. BacPaCS shows high accuracy in distinguishing pathogenic from non-pathogenic bacteria in a clinically relevant dataset comprising only human-hosted bacteria. Among the genes that received the highest positive weights in the resulting classifier, we found genes known to be related to bacterial pathogenicity as well as novel candidates whose involvement in bacterial virulence has not previously been reported.
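To illustrate how a sparse SVM can select features on its own, the following is a minimal sketch of an L1-penalized linear classifier with squared hinge loss, trained by proximal gradient descent. It is not the BacPaCS implementation: the abstract does not specify the loss, solver, or feature encoding, so all of those are assumptions here. The toy dataset, the +1/-1 encoding of protein-family presence, and the names `make_toy_dataset`, `train_sparse_svm`, and `soft_threshold` are hypothetical and exist only for this example.

```python
from itertools import product


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def make_toy_dataset():
    """Hypothetical data: 16 genomes, 4 protein-family features (+1 present, -1 absent).

    Column 0 matches the label exactly (an idealized virulence marker);
    columns 1-3 enumerate every presence/absence pattern within each class,
    so they carry no information about the label.
    """
    X, y = [], []
    for label in (1, -1):  # +1 = pathogenic, -1 = non-pathogenic
        for noise in product((1, -1), repeat=3):
            X.append([label, *noise])
            y.append(label)
    return X, y


def train_sparse_svm(X, y, l1_strength=0.05, step=0.1, epochs=200):
    """Minimize mean squared hinge loss + l1_strength * ||w||_1.

    Each epoch takes one gradient step on the smooth loss, then applies
    soft-thresholding to the weights (the bias is not penalized).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            slack = 1.0 - yi * score
            if slack > 0.0:  # only margin violators contribute to the gradient
                coeff = -2.0 * slack * yi / n
                for j, xj in enumerate(xi):
                    grad_w[j] += coeff * xj
                grad_b += coeff
        w = [soft_threshold(wj - step * gj, step * l1_strength)
             for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


X, y = make_toy_dataset()
w, b = train_sparse_svm(X, y)

# Features with nonzero weight are the ones the classifier "selected".
selected = [j for j, wj in enumerate(w) if wj != 0.0]
predictions = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else -1 for xi in X]
print("selected feature indices:", selected)
print("training accuracy:", sum(p == t for p, t in zip(predictions, y)) / len(y))
```

On this toy data the L1 penalty drives the weights of the three uninformative columns to exactly zero and leaves a positive weight only on column 0, which mirrors the idea of reading candidate pathogenicity genes off the largest positive weights. Real proteome data would have far more features and correlated columns, so a tested library solver would likely be a better choice than this hand-written loop.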


The code and the resulting model are available at:

Supplementary information:

Supplementary files, including an appendix, are provided as part of this submission.
