BMC Genomics. 2014;15 Suppl 12:S10. doi: 10.1186/1471-2164-15-S12-S10. Epub 2014 Dec 19.

Distance-based classifiers as potential diagnostic and prediction tools for human diseases.

Abstract

Typically, gene expression biomarkers are discovered in the course of high-throughput experiments, for example, RNAseq or microarray profiling. Analytic pipelines that extract so-called signatures suffer from the "curse of dimensionality": the number of genes expressed exceeds the number of patients we can enroll in the study and use to train the discriminator algorithm. Hence, problems with the reproducibility of gene signatures are more common than not; when the algorithm is executed using a different training set, the resulting diagnostic signature may turn out to be completely different. In this paper we propose a novel alternative approach that takes into account the quantifiable expression levels of all genes assayed. In our analysis, the cumulative gene expression pattern of an individual patient is represented as a point in the multidimensional space formed by all gene expression profiles assayed in a given system, in which the clusters of "normal samples" and "affected samples" are defined. The degree of separation of a given sample from the space occupied by "normal samples" reflects the drift of the sample away from homeostasis in the course of development of the pathophysiological process that underlies the disease. The outlined approach was validated using the publicly available glioma dataset deposited in Rembrandt and associated with survival data. Additionally, the applicability of the distance analysis to the classification of non-malignant samples was tested using psoriatic lesions and non-lesional matched controls as a model.
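
The distance idea described above can be illustrated with a minimal sketch. The abstract does not state which distance metric or cluster definition the authors used, so the sketch assumes Euclidean distance to the centroid (per-gene mean) of each group; the function names and the toy three-gene expression values are hypothetical and are not taken from the paper.

```python
import math


def centroid(samples):
    """Per-gene mean expression vector of a group of samples.

    `samples` is a list of equal-length lists, one value per gene.
    """
    n = len(samples)
    return [sum(values) / n for values in zip(*samples)]


def distance_to_group(sample, group):
    """Euclidean distance from one sample to the centroid of a group.

    Assumption: Euclidean distance to the centroid stands in for the
    "degree of separation"; the paper's actual metric may differ.
    """
    return math.dist(sample, centroid(group))


def classify(sample, normal, affected):
    """Assign the sample to whichever group centroid is nearer."""
    d_normal = distance_to_group(sample, normal)
    d_affected = distance_to_group(sample, affected)
    return "normal" if d_normal <= d_affected else "affected"


# Hypothetical toy data: three genes, two samples per group.
normal = [[1.0, 1.0, 1.0], [1.0, 3.0, 1.0]]
affected = [[5.0, 6.0, 1.0], [5.0, 4.0, 1.0]]
patient = [4.0, 6.0, 1.0]

drift = distance_to_group(patient, normal)   # distance from the "normal" centroid
label = classify(patient, normal, affected)  # nearest-centroid assignment
```

Because every assayed gene contributes one coordinate, no gene subset has to be selected beforehand, which is the property the abstract contrasts with signature-based pipelines. In practice the expression values would likely need normalization before distances are comparable across genes.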

PMID: 25563076
PMCID: PMC4303935
DOI: 10.1186/1471-2164-15-S12-S10
[Indexed for MEDLINE]
Free PMC Article
