IEEE/ACM Trans Comput Biol Bioinform. 2013 Jul-Aug;10(4):1045-57. doi: 10.1109/TCBB.2013.111.

Protein function prediction using multilabel ensemble classification.

Author information

1. Southwest University, Beibei and South China University of Technology, Guangzhou.
2. George Mason University, Fairfax.
3. South China University of Technology, Guangzhou.

Abstract

High-throughput experimental techniques produce several kinds of heterogeneous proteomic and genomic data sets. To computationally annotate proteins, it is necessary and promising to integrate these heterogeneous data sources. Some methods transform these data sources into different kernels or feature representations. Next, these kernels are linearly (or nonlinearly) combined into a composite kernel. The composite kernel is utilized to develop a predictive model to infer the function of proteins. A protein can have multiple roles and functions (or labels). Therefore, multilabel learning methods are also adapted for protein function prediction. We develop a transductive multilabel classifier (TMC) to predict multiple functions of proteins using several unlabeled proteins. We also propose a method called transductive multilabel ensemble classifier (TMEC) for integrating the different data sources using an ensemble approach. The TMEC trains a graph-based multilabel classifier on each single data source, and then combines the predictions of the individual classifiers. We use a directed birelational graph to capture the relationships between pairs of proteins, between pairs of functions, and between proteins and functions. We evaluate the effectiveness of the TMC and TMEC to predict the functions of proteins on three benchmarks. We show that our approaches perform better than recently proposed protein function prediction methods on composite and multiple kernels. The code, data sets used in this paper and supplemental material are available at https://sites.google.com/site/guoxian85/tmec.
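
The ensemble idea described in the abstract (one graph-based classifier per data source, with the per-source predictions then combined) can be illustrated with a minimal sketch. This is not the authors' TMC or TMEC: it swaps in standard closed-form label propagation on an undirected similarity graph rather than the directed birelational graph, and it combines sources by plain averaging, which is an assumption. The names `propagate_labels`, `ensemble_predict`, the parameter `alpha`, and the toy matrices are all hypothetical.

```python
import numpy as np


def propagate_labels(W, Y, alpha=0.5):
    """Closed-form label propagation on a single similarity graph.

    W: (n, n) symmetric, nonnegative protein-protein similarity matrix.
    Y: (n, c) label matrix; rows of unlabeled proteins are all zero.
    alpha: weight in [0, 1) given to neighbors versus the initial labels.
    Returns an (n, c) matrix of function scores for every protein.
    """
    W = np.array(W, dtype=float)  # copy so the caller's matrix is untouched
    np.fill_diagonal(W, 0.0)      # ignore self-similarity
    degree = W.sum(axis=1)
    degree[degree == 0] = 1.0     # isolated nodes: avoid division by zero
    inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetrically normalized similarity: D^{-1/2} W D^{-1/2}
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    n = W.shape[0]
    # Solve (I - alpha * S) F = (1 - alpha) * Y for F
    return np.linalg.solve(np.eye(n) - alpha * S, (1.0 - alpha) * Y)


def ensemble_predict(graphs, Y, alpha=0.5):
    """Run one propagation per data source and average the scores.

    Averaging is an assumed combination rule for illustration only.
    """
    per_source = [propagate_labels(W, Y, alpha) for W in graphs]
    return np.mean(per_source, axis=0)


# Toy example: 4 proteins, 2 functions, 2 data sources.
# Protein 0 has function 0, protein 1 has function 1;
# proteins 2 and 3 are unlabeled.
Y = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0],
              [0.0, 0.0]])

W1 = np.array([[0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 1.0],
               [1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0]])

W2 = np.array([[0.0, 0.0, 0.8, 0.0],
               [0.0, 0.0, 0.0, 0.9],
               [0.8, 0.0, 0.0, 0.1],
               [0.0, 0.9, 0.1, 0.0]])

scores = ensemble_predict([W1, W2], Y)
predicted = scores.argmax(axis=1)  # highest-scoring function per protein
```

In this toy setting, each unlabeled protein is most similar to one labeled protein in both sources, so its highest score falls on that neighbor's function. A true multilabel prediction would threshold or rank the scores per protein instead of taking a single `argmax`.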

PMID: 24334396
DOI: 10.1109/TCBB.2013.111
[Indexed for MEDLINE]
