Curr Mol Med. 2019 Nov 18. doi: 10.2174/1566524019666191119105209. [Epub ahead of print]

Gene Selection for the Discrimination of Colorectal Cancer

Wang W(1,2,3), Xie G(3), Ren Z(4), Xie T(3), Li J(3).

Author information

1. Network Information Center, The Sixth Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
2. National Engineering Research Center of Digital Life, Sun Yat-sen University, Guangzhou, China
3. Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
4. College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, China



Colorectal cancer (CRC) is the third most common cancer worldwide. Cancer discrimination is a typical application of microarray-based gene expression analysis. However, microarray data suffer from the curse of dimensionality and typically have an imbalanced class distribution between the majority class (tumor samples) and the minority class (normal samples). Feature gene selection is therefore a necessary step in cancer discrimination.


To select feature genes for the discrimination of CRC.


We improve DEFSw, a feature selection algorithm based on differential evolution, by replacing its common classifier and evaluation measure with the RUSBoost classifier and weighted accuracy, so that feature genes can be selected from imbalanced data. We first extract differentially expressed genes (DEGs) from the TCGA CRC dataset and then select feature genes from the DEGs using the improved DEFSw algorithm. Finally, we validate the selected feature gene sets on independent datasets and retrieve cancer-related information for these genes by text mining through the Coremine Medical online database.
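The motivation for swapping the evaluation measure can be illustrated with a minimal sketch. The abstract does not define its weighted accuracy, so the form below (a weighted mean of sensitivity and specificity with a hypothetical weight `w`) is an assumption, not the authors' formula. The helper `random_undersample` shows only the random undersampling idea that RUSBoost combines with boosting; the boosting step itself is omitted, and all names and the toy labels are illustrative.

```python
import random


def weighted_accuracy(y_true, y_pred, w=0.5):
    """Assumed form: w * sensitivity + (1 - w) * specificity.

    Labels: 1 = tumor (majority class), 0 = normal (minority class).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    return w * tp / pos + (1 - w) * tn / neg


def random_undersample(labels, seed=0):
    """Return sorted sample indices with every class cut down to the
    size of the smallest class (the undersampling idea only)."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    kept = []
    for idx in by_class.values():
        kept.extend(rng.sample(idx, n_min))
    return sorted(kept)


# Toy data: 9 tumor samples, 1 normal sample, and a trivial
# classifier that labels every sample as tumor.
y_true = [1] * 9 + [0]
y_pred = [1] * 10

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
wacc = weighted_accuracy(y_true, y_pred)
balanced_idx = random_undersample(y_true)
```

On this toy data the plain accuracy of the trivial classifier is 0.9, while the assumed weighted accuracy is 0.5 because no normal sample is recognized. This is why a plain-accuracy fitness function can mislead feature selection on imbalanced data.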


We select 16 single-gene feature sets for colorectal cancer discrimination and 19 single-gene feature sets for colon cancer discrimination only.


In summary, we find a series of high-potential candidate biomarkers or signatures that can discriminate colon cancer, rectal cancer, or both with high sensitivity and specificity.


Colorectal Cancer; Discrimination of Cancer; Feature Genes Selection; Imbalanced Data
