Comput Biol Med. 2011 Apr;41(4):228-37. doi: 10.1016/j.compbiomed.2011.02.004. Epub 2011 Mar 3.

A hybrid feature selection method for DNA microarray data.

Author information

Department of Chemical Engineering, I-Shou University, Kaohsiung 80041, Taiwan.


Gene expression profiles, which represent the state of a cell at a molecular level, have great potential as a medical diagnosis tool. In cancer classification, the available training data sets generally contain few samples relative to the number of genes involved. This imbalance, along with other training data limitations, poses a challenge to certain classification methods. Feature (gene) selection can be used to extract the genes that directly influence classification accuracy and to eliminate those that have no influence on it, which significantly improves both computational performance and classification accuracy. In this paper, correlation-based feature selection (CFS) and the Taguchi-genetic algorithm (TGA) were combined into a hybrid feature selection method, and the K-nearest neighbor (KNN) method with leave-one-out cross-validation (LOOCV) served as the classifier used to calculate classification accuracy on eleven gene expression data sets. Experimental results show that the proposed method reduced redundant features effectively and achieved superior classification accuracy. Compared to other classification methods from the literature, the proposed method obtained higher classification accuracy on ten of the eleven gene expression data set test problems.
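
The evaluation step named in the abstract, KNN scored by LOOCV on a selected gene subset, can be illustrated with a minimal sketch. This is not the authors' CFS-TGA implementation: the function name `loocv_knn_accuracy`, the Euclidean distance, the default `k=1`, and the toy data are all assumptions made for illustration. It only shows how a candidate feature subset might be scored.

```python
import math


def loocv_knn_accuracy(samples, labels, feature_idx, k=1):
    """Leave-one-out accuracy of a k-nearest-neighbor classifier
    that only looks at the features listed in feature_idx.
    (Hypothetical helper; Euclidean distance is an assumption.)"""
    n = len(samples)
    correct = 0
    for i in range(n):
        # Hold out sample i and measure its distance to every other sample.
        dists = []
        for j in range(n):
            if j == i:
                continue
            d = math.sqrt(sum((samples[i][f] - samples[j][f]) ** 2
                              for f in feature_idx))
            dists.append((d, labels[j]))
        dists.sort(key=lambda pair: pair[0])
        # Majority vote among the k nearest neighbors.
        votes = {}
        for _, label in dists[:k]:
            votes[label] = votes.get(label, 0) + 1
        predicted = max(votes, key=votes.get)
        if predicted == labels[i]:
            correct += 1
    return correct / n


# Toy data, made up for illustration: 6 samples, 3 "genes".
# Gene 0 separates the classes; genes 1 and 2 are uninformative.
toy_samples = [
    [0.1, 5.0, 1.0],
    [0.2, 1.0, 9.0],
    [0.3, 9.0, 4.0],
    [2.1, 5.2, 1.1],
    [2.2, 1.1, 9.2],
    [2.3, 9.1, 4.1],
]
toy_labels = ["A", "A", "A", "B", "B", "B"]

acc_selected = loocv_knn_accuracy(toy_samples, toy_labels, [0])
acc_all = loocv_knn_accuracy(toy_samples, toy_labels, [0, 1, 2])
print(acc_selected, acc_all)
```

On this toy data the single informative gene classifies every held-out sample correctly, while using all three genes lets the uninformative ones dominate the distance. That is the effect feature selection is meant to counter. In a hybrid scheme like the one described, a score of this kind would plausibly serve as the fitness of a candidate gene subset, though the abstract does not give those details.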

[Indexed for MEDLINE]
