PLoS One. 2019 Jun 13;14(6):e0218264. doi: 10.1371/journal.pone.0218264. eCollection 2019.

Predicting biomedical relationships using the knowledge and graph embedding cascade model.

Author information

1 School of Information Management, Sun Yat-Sen University, Guangzhou, Guangdong, China.
2 Department of Library and Information Science, Yonsei University, Seoul, Korea.
3 School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, United States of America.
4 School of Information Management, Wuhan University, Wuhan, Hubei, China.

Abstract

Advances in machine learning and deep learning methods, together with the increasing availability of large-scale pharmacological, genomic, and chemical datasets, have created opportunities for identifying potentially useful relationships within biochemical networks. Knowledge embedding models have proved valuable in detecting knowledge-based correlations among entities, but little effort has been made to apply them to networks of biochemical entities, because such networks tend to be unbalanced and sparse and knowledge embedding models do not work well on them. To some extent, however, these shortcomings can be compensated for by using knowledge embedding in combination with graph embedding. In this paper, we combine knowledge embedding and graph embedding to represent biochemical entities and their relations as dense, low-dimensional vectors. We build a cascade learning framework that incorporates semantic features from the knowledge embedding model and graph features from the graph embedding model to score the probability of a link. The proposed method performs noticeably better than the models with which it is compared. It predicted links and entities with an accuracy of 93%, and its average hits@10 score showed an 8.6% absolute improvement over the original knowledge embedding model and a 1.1% to 9.7% absolute improvement over other knowledge and graph embedding algorithms. In addition, we designed a meta-path algorithm to detect path relations in the biomedical network. Case studies further verify the value of the proposed model in finding potential relationships between diseases, drugs, genes, treatments, etc. Among the findings of the proposed model is the suggestion that VDR (vitamin D receptor) may be linked to prostate cancer. This is backed by evidence from medical databases and published research, supporting the suggestion that our proposed model could be of value to biomedical researchers.
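
For readers unfamiliar with the hits@10 metric reported above, the following is a minimal sketch of how hits@k is commonly computed in link prediction. It is not the authors' evaluation code. The function names `rank_of_true` and `hits_at_k`, the candidate scores, and the entity labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def rank_of_true(scores: Dict[str, float], true_entity: str) -> int:
    """Return the 1-based rank of the true entity among all candidates.

    Assumes a higher score means a more plausible link (an assumption;
    distance-based models would sort in the opposite direction).
    """
    true_score = scores[true_entity]
    # Rank = 1 + number of candidates scoring strictly higher.
    return 1 + sum(1 for s in scores.values() if s > true_score)


def hits_at_k(ranks: List[int], k: int = 10) -> float:
    """Fraction of test cases whose true entity is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical candidate scores for a single test triple.
example_scores = {"entity_a": 0.9, "entity_b": 0.5, "entity_c": 0.7}
example_rank = rank_of_true(example_scores, "entity_c")

# Hypothetical ranks of the true entity across four test triples.
example_hits = hits_at_k([1, 3, 12, 50], k=10)
```

Under this definition, an absolute improvement in hits@10 means that a larger fraction of test triples have the correct entity ranked among the ten highest-scoring candidates.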

Conflict of interest statement

The authors have declared that no competing interests exist.
