BMC Bioinformatics. 2018 Sep 21;19(1):332. doi: 10.1186/s12859-018-2364-2.

Identifying protein complexes based on node embeddings obtained from protein-protein interaction networks.

Author information

1. College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, Liaoning, People's Republic of China.
2. College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, Liaoning, People's Republic of China. yangzh@dlut.edu.cn.
3. Beijing Institute of Health Administration and Medical Information, Beijing, 100850, People's Republic of China. wangleibihami@gmail.com.
4. Beijing Institute of Health Administration and Medical Information, Beijing, 100850, People's Republic of China.
5. School of Software Technology, Dalian University of Technology, Dalian, 116024, Liaoning, People's Republic of China.

Abstract

BACKGROUND:

Protein complexes are one of the keys to deciphering the behavior of a cell system. During the past decade, most computational approaches used to identify protein complexes have been based on discovering densely connected subgraphs in protein-protein interaction (PPI) networks. However, many true complexes are not dense subgraphs, so these approaches show limited performance when detecting protein complexes from PPI networks.
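To illustrate what "densely connected" means here, the following is a minimal Python sketch of the standard graph density measure, 2|E| / (|V|(|V| - 1)), applied to a candidate set of proteins. The function name, the toy network, and the protein labels are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def subgraph_density(nodes, edges):
    """Density of the subgraph induced by `nodes`: 2|E| / (|V|(|V| - 1))."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Store interactions as unordered pairs so (u, v) and (v, u) match.
    edge_set = {frozenset(e) for e in edges}
    internal = sum(
        1 for u, v in combinations(nodes, 2) if frozenset((u, v)) in edge_set
    )
    return 2.0 * internal / (n * (n - 1))


# Hypothetical toy PPI network (illustrative labels only).
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]

clique_density = subgraph_density(["A", "B", "C"], edges)  # fully connected triple
chain_density = subgraph_density(["C", "D", "E"], edges)   # sparse, chain-like triple
```

A method that keeps only high-density subgraphs would accept the first triple and could miss the second, even if the second were a true complex.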

RESULTS:

To solve these problems, in this paper we propose a supervised learning method based on network node embeddings, which uses the informative properties of known complexes to guide the search for new protein complexes. First, node embeddings are obtained from a human protein interaction network. The protein interactions are then weighted by the similarities between node embeddings. Next, a supervised learning method is used to detect candidate protein complexes. Finally, a random forest model filters the candidate complexes to obtain the final predicted complexes. Experimental results on real human and yeast protein interaction networks show that our method effectively improves the performance of protein complex detection.
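The edge-weighting step described above can be sketched in Python as follows. The abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and the function names, protein labels, and two-dimensional embeddings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # avoid division by zero for an all-zero embedding
    return dot / (norm_a * norm_b)


def weight_interactions(edges, embeddings):
    """Map each interaction (u, v) to the similarity of its endpoint embeddings."""
    return {
        (u, v): cosine_similarity(embeddings[u], embeddings[v]) for u, v in edges
    }


# Hypothetical embeddings; real ones would come from a node embedding model.
embeddings = {"P1": [1.0, 0.0], "P2": [1.0, 0.0], "P3": [0.0, 1.0]}
edges = [("P1", "P2"), ("P1", "P3")]

weights = weight_interactions(edges, embeddings)
```

In this toy example the interaction between P1 and P2 receives a high weight because their embeddings point the same way, while the interaction between P1 and P3 receives a weight of zero.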

CONCLUSIONS:

We present a new method for identifying protein complexes from human and yeast protein interaction networks, which has the potential to benefit the field of protein complex detection.

KEYWORDS:

Node embeddings; Protein complex detection; Random forest; Supervised learning method

PMID: 30241459
PMCID: PMC6150962
DOI: 10.1186/s12859-018-2364-2
[Indexed for MEDLINE]
Free PMC Article
