Mol Biosyst. 2016 Mar;12(3):778-85. doi: 10.1039/c5mb00672d.

Detecting reliable non interacting proteins (NIPs) significantly enhancing the computational prediction of protein-protein interactions using machine learning methods.

Author information

Maria Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, Warsaw, Poland.
Centre of New Technologies, University of Warsaw, Banacha 2c Str., 02-097 Warsaw, Poland; and Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland.
GeneXplain GmbH, Am Exer 10b, D-38302, Wolfenbüttel, Germany.
Centre of New Technologies, University of Warsaw, Banacha 2c Str., 02-097 Warsaw, Poland.


Protein-protein interactions (PPIs) play a vital role in most biological processes, so understanding them can improve our grasp of the mechanisms underlying living systems. However, besides the cost and time involved in detecting experimentally validated PPIs, noise in the data remains an important issue. In the last decade, several in silico PPI prediction methods using both structural and genomic information were developed for this purpose. Here we introduce a unique validation approach aimed at collecting reliable non-interacting proteins (NIPs). The most relevant protein and protein-pair features were then selected. Finally, the prepared dataset was used for PPI classification, leveraging the prediction capabilities of well-established machine learning methods. Our best classification procedure achieved specificity and sensitivity values of 96.33% and 98.02%, respectively, surpassing the prediction capabilities of other methods, including those trained on gold standard datasets. We showed that PPI/NIP predictive performance can be considerably improved by focusing on data preparation.
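
For readers unfamiliar with the two reported metrics, the following minimal sketch shows how sensitivity and specificity are conventionally computed for a binary PPI/NIP classifier. It is not the authors' code: the function name, the label encoding (1 = interacting pair, 0 = non-interacting pair), and the toy labels are assumptions made purely for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (illustrative only): 1 = PPI, 0 = NIP.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # PPIs predicted as PPIs
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # PPIs missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # NIPs predicted as NIPs
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # NIPs called PPIs
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one PPI and one NIP in y_true")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Toy labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Because specificity depends entirely on the negative class, the reliability of the NIP set directly affects how trustworthy that figure is, which is why the choice of negatives matters for evaluation.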

[Indexed for MEDLINE]
