Med Oncol. 2013;30(2):584. doi: 10.1007/s12032-013-0584-x. Epub 2013 Apr 19.

Highly accurate two-gene signature for gastric cancer.

Author information

1. Department of Digestive Diseases, Wuhan General Hospital of Guangzhou Command, Wuhan, People's Republic of China.

Abstract

Large amounts of expression data are generated by high-throughput experimental techniques such as microarrays. No single algorithm is widely accepted as a suitable method for mining gene expression data. Therefore, integrating different algorithms and extracting more useful information from expression data are key problems in biomarker identification. Here, we used three machine learning algorithms to select feature genes from gene profiling data of gastric cancer (GC). The genes common to all three selections (the "common divisor") were then extracted as the candidate feature gene set for Tree Building and Tree Pruning analysis with a Decision Tree (DT) algorithm. Real-time quantitative PCR and immunohistochemistry (IHC) staining were used to validate the relative expression levels of the candidate feature genes. Receiver operating characteristic curves were used to analyse the classification sensitivity and specificity of the feature genes. A total of 174, 202 and 149 feature genes were selected by the Class Information Index, Information Gain Index and Relief algorithms, respectively, with a common divisor of 32 genes. Using the DT algorithm to derive classification rule sets, we identified COL2A1 and ATP4B as candidate biomarkers of GC. The expression levels of these two genes were validated by real-time PCR and IHC, with high sensitivity (>90%) and specificity (>90%) in both training and test samples. We introduce, for the first time, an integrated and systematic data-mining model for identifying biomarkers from gene expression data. The two-gene signature obtained with our predictive model could be used to recognize the biological characteristics of GC.
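
The selection-and-intersection workflow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that the "common divisor" means the set intersection of the three gene lists, it shows only an information-gain score (the Class Information Index and Relief algorithms are not reproduced), and all gene identifiers, expression values, thresholds and predictions are made-up toy data.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting samples at `threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part)
                    for part in (left, right) if part)
    return entropy(labels) - remainder


def common_genes(*selections):
    """Genes chosen by every selection method (set intersection)."""
    return set.intersection(*(set(s) for s in selections))


def sensitivity_specificity(predicted, actual, positive="tumor"):
    """Sensitivity and specificity of predictions for the positive class."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    tn = sum(p != positive and a != positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data only: hypothetical labels and expression values for one gene.
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
toy_expression = [1.0, 1.2, 0.9, 3.1, 2.8, 3.3]
gain = information_gain(toy_expression, labels, threshold=2.0)

# Hypothetical gene lists standing in for the three algorithms' selections.
shared = common_genes(["G1", "G2", "G3"],
                      ["G2", "G3", "G4"],
                      ["G3", "G2", "G5"])

# Hypothetical classifier output with one error in each class.
predicted = ["tumor", "tumor", "normal", "normal", "normal", "tumor"]
sens, spec = sensitivity_specificity(predicted, labels)
```

In this toy setting the threshold separates the two classes perfectly, so the split recovers the full entropy of the labels. With real data, a decision tree would search over genes and thresholds in the shared set and then be pruned, and sensitivity and specificity would be reported separately for training and test samples.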

PMID: 23606240
DOI: 10.1007/s12032-013-0584-x
[Indexed for MEDLINE]