Copyright Lee et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Inferring Pathway Activity toward Precise Disease Classification

1 Department of Bio and Brain Engineering, KAIST, Daejeon, South Korea
2 Bioinformatics Program, University of California San Diego, La Jolla, California, United States of America
3 Department of Bioengineering, University of California San Diego, La Jolla, California, United States of America
4 Department of Laboratory Medicine and Genetics, Sungkyunkwan University, School of Medicine, Samsung Medical Center, Seoul, South Korea

Greg Tucker-Kellogg, Editor
Lilly Singapore Centre for Drug Discovery, Singapore

#Contributed equally.
* E-mail: trey/at/bioeng.ucsd.edu (TI); E-mail: dhlee/at/biosoft.kaist.ac.kr (DL)

Conceived and designed the experiments: EL HYC. Performed the experiments: EL HYC. Analyzed the data: EL HYC. Wrote the paper: EL HYC TGI DL. Helped with biological interpretation of findings: JWK.

Received February 21, 2008; Accepted September 24, 2008.

Abstract

The advent of microarray technology has made it possible to classify disease states based on gene expression profiles of patients. Typically, marker genes are selected by measuring the power of their expression profiles to discriminate among patients of different disease states. However, expression-based classification can be challenging in complex diseases due to factors such as cellular heterogeneity within a tissue sample and genetic heterogeneity across patients.
A promising technique for coping with these challenges is to incorporate pathway information into the disease classification procedure in order to classify disease based on the activity of entire signaling pathways or protein complexes rather than on the expression levels of individual genes or proteins. We propose a new classification method based on pathway activities inferred for each patient. For each pathway, an activity level is summarized from the gene expression levels of its condition-responsive genes (CORGs), defined as the subset of genes in the pathway whose combined expression delivers optimal discriminative power for the disease phenotype. We show that classifiers using pathway activity achieve better performance than classifiers based on individual gene expression, for both simple and complex case-control studies including differentiation of perturbed from non-perturbed cells and subtyping of several different kinds of cancer. Moreover, the new method outperforms several previous approaches that use a static (i.e., non-conditional) definition of pathways. Within a pathway, the identified CORGs may facilitate the development of better diagnostic markers and the discovery of core alterations in human disease.

Author Summary

The advent of microarray technology has drawn immense interest in identifying gene expression levels that can serve as biomarkers for disease. Marker genes are selected by examining each individual gene to see how well its expression level discriminates different disease types. In complex diseases such as cancer, good marker genes can be hard to find due to cellular heterogeneity within the tissue and genetic heterogeneity across patients. A promising technique for addressing these challenges is to incorporate biological pathway information into the marker identification procedure, permitting disease classification based on the activity of entire pathways rather than simply on the expression levels of individual genes.
However, previous pathway-based methods have not significantly outperformed gene-based methods. Here, we propose a new pathway-based classification procedure in which markers are encoded not as individual genes, nor as the set of genes making up a known pathway, but as subsets of “condition-responsive genes (CORGs)” within those pathways. Using expression profiles from seven different microarray studies, we show that the accuracy of this method is significantly better than that of both conventional gene-based and pathway-based diagnostics. Furthermore, the identified CORGs may facilitate the development of effective diagnostic markers and the discovery of molecular mechanisms underlying disease.
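
To make the CORG idea concrete, the following is a minimal sketch of how a pathway activity could be inferred from a subset of pathway genes chosen for discriminative power. The text above does not specify the search strategy, the normalization, or the scoring function, so everything here is an assumption made for illustration: genes are z-transformed across samples, discriminative power is measured with a Welch t-statistic, the subset is grown greedily, and activity is the sum of z-scores divided by the square root of the subset size. The function names and the toy data are hypothetical.

```python
import math
from statistics import mean, stdev

def zscore_rows(expr):
    """Z-transform each gene's expression values across samples."""
    out = {}
    for gene, values in expr.items():
        m, s = mean(values), stdev(values)
        out[gene] = [(v - m) / s for v in values]
    return out

def t_score(values, labels):
    """Welch t-statistic between samples labelled 1 (case) and 0 (control)."""
    a = [v for v, lab in zip(values, labels) if lab == 1]
    b = [v for v, lab in zip(values, labels) if lab == 0]
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / se

def activity(z, genes):
    """Per-sample activity: sum of z-scores over `genes`, scaled by sqrt(k)."""
    k = len(genes)
    n_samples = len(z[genes[0]])
    return [sum(z[g][j] for g in genes) / math.sqrt(k) for j in range(n_samples)]

def find_corgs(expr, pathway, labels):
    """Greedy search (an assumed strategy) for a discriminative gene subset."""
    z = zscore_rows({g: expr[g] for g in pathway})
    scores = {g: t_score(z[g], labels) for g in pathway}
    # Orient the search toward the direction of the strongest single gene.
    sign = 1 if max(scores.values()) >= -min(scores.values()) else -1
    ranked = sorted(pathway, key=lambda g: sign * scores[g], reverse=True)
    corgs = [ranked[0]]
    best = sign * t_score(activity(z, corgs), labels)
    for gene in ranked[1:]:
        candidate = corgs + [gene]
        s = sign * t_score(activity(z, candidate), labels)
        if s <= best:  # stop as soon as adding a gene no longer helps
            break
        corgs, best = candidate, s
    return corgs, activity(z, corgs), best

# Toy data (made up): six samples, the first three are cases.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "geneA": [5.0, 6.0, 5.5, 1.0, 2.0, 1.5],
    "geneB": [6.0, 5.0, 5.6, 2.0, 1.2, 1.5],
    "geneC": [3.0, 1.0, 2.0, 2.5, 1.5, 2.0],
}
pathway = ["geneA", "geneB", "geneC"]

corgs, act, score = find_corgs(expr, pathway, labels)
print("selected genes:", corgs)
print("pathway activity per sample:", [round(a, 2) for a in act])
```

In this toy pathway, the two genes that track the phenotype are combined into one activity value per sample, while the uninformative gene is left out. The resulting activity vector, rather than the individual gene profiles, would then serve as the feature passed to a classifier.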