Artif Intell Med. 2002 Nov;26(3):281-304.

Gene expression data analysis of human lymphoma using support vector machines and output coding ensembles.

Author information

Dipartimento di Informatica e Scienze dell'Informazione, Università di Genova, via Dodecaneso 35, 16146 Genova, Italy. valenti@disi.unige.it

Abstract

The large amount of data generated by DNA microarrays was originally analysed using unsupervised methods, such as clustering or self-organizing maps. More recently, supervised methods such as decision trees, dot-product support vector machines (SVM) and multi-layer perceptrons (MLP) have been applied to classify normal and tumoural tissues. We propose methods based on non-linear SVM with polynomial and Gaussian kernels, and on output coding (OC) ensembles of learning machines, to separate normal from malignant tissues, to classify different types of lymphoma, and to analyse the role of sets of coordinately expressed genes in carcinogenic processes of lymphoid tissues. Using gene expression data from the "Lymphochip", a specialised DNA microarray developed at Stanford University School of Medicine, we show that SVM can correctly separate normal from tumoural tissues, and that OC ensembles can be successfully used to classify different types of lymphoma. Moreover, we identify a group of coordinately expressed genes related to the separation of two distinct subgroups within diffuse large B-cell lymphoma (DLBCL), validating Alizadeh's earlier hypothesis that two distinct diseases exist within DLBCL.
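
As a rough illustration of the two ingredients named in the abstract, the following Python sketch shows the standard polynomial and Gaussian kernel functions and a Hamming-distance decoding step for an output coding ensemble. It is a minimal sketch under stated assumptions, not the paper's implementation: the abstract gives no kernel parameters or code matrix, so `degree`, `coef0`, `sigma`, the one-per-class `CODEWORDS` table and the class labels are all hypothetical.

```python
import math


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree; parameter values are assumed."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel exp(-||x - y||^2 / (2 * sigma^2)); sigma is assumed."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# Hypothetical one-per-class code matrix: each class gets a codeword of +1/-1
# entries, one entry per binary learner in the ensemble.
CODEWORDS = {
    "class_a": (1, -1, -1),
    "class_b": (-1, 1, -1),
    "class_c": (-1, -1, 1),
}


def decode(outputs, codewords=CODEWORDS):
    """Pick the class whose codeword is nearest, in Hamming distance,
    to the signs of the binary learners' real-valued outputs."""
    bits = tuple(1 if o >= 0 else -1 for o in outputs)

    def hamming(codeword):
        return sum(b != c for b, c in zip(bits, codeword))

    return min(codewords, key=lambda label: hamming(codewords[label]))
```

In this sketch each binary learner (for instance, an SVM using one of the kernels above) would supply one entry of `outputs`; `decode((0.8, -0.3, -0.9))` then selects the class whose codeword matches the sign pattern `(+1, -1, -1)`.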

PMID: 12446082 [Indexed for MEDLINE]
