Clustering by fast search and merge of local density peaks for gene expression microarray data

Sci Rep. 2017 Apr 19;7:45602. doi: 10.1038/srep45602.

Abstract

Clustering is an unsupervised approach that groups elements by their similarity and is used to find intrinsic patterns in data. It has numerous applications in bioinformatics, pattern recognition, and astronomy. This paper presents a clustering approach based on the idea that a cluster consists of one or more density-connected regions, in which the density maximum of each region represents the center of that region. More precisely, our approach first finds the local density regions and then merges the density-connected regions to form meaningful clusters. As a result, outliers are detected automatically, higher-density regions are determined intuitively and merged to form clusters of arbitrary shape, and clusters are identified regardless of the dimensionality of the space in which they are embedded. Extensive experiments are performed on several complex data sets to analyze and compare our approach with state-of-the-art clustering methods. In addition, we benchmarked the algorithm on gene expression microarray data sets for cancer subtyping, for distinguishing normal tissues from tumors, and for classifying multiple tissue data sets.
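The two-stage idea described in the abstract (find local density regions, then merge density-connected ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `density_peak_merge`, the Gaussian-kernel density estimate, the cutoff `d_c`, and the merge rule controlled by `merge_frac` are all assumptions chosen for illustration, and the outlier detection mentioned in the abstract is omitted.

```python
import numpy as np

def density_peak_merge(X, d_c, merge_frac=0.5):
    """Illustrative sketch: local density regions, then merging.

    d_c and merge_frac are hypothetical parameters, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances between all points.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Gaussian-kernel local density; subtract 1 to drop the self term.
    rho = np.exp(-(D / d_c) ** 2).sum(axis=1) - 1.0

    # Stage 1: visit points from densest to sparsest. A point with no
    # earlier-ranked (denser) point within d_c starts a new region as a
    # local peak; otherwise it joins the region of its nearest denser point.
    order = np.argsort(-rho, kind="stable")
    region = np.full(n, -1)
    peaks = []
    for rank, i in enumerate(order):
        higher = order[:rank]
        if rank == 0 or D[i, higher].min() > d_c:
            region[i] = len(peaks)
            peaks.append(i)
        else:
            j = higher[np.argmin(D[i, higher])]
            region[i] = region[j]

    # Stage 2: merge regions with union-find. Two regions are treated as
    # density-connected (an assumed rule) if they contain a pair of points
    # within d_c whose lower density is at least merge_frac times the
    # lower of the two peak densities.
    parent = list(range(len(peaks)))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    close_i, close_j = np.nonzero(np.triu(D <= d_c, k=1))
    for i, j in zip(close_i, close_j):
        a, b = region[i], region[j]
        if a == b:
            continue
        border = min(rho[i], rho[j])
        if border >= merge_frac * min(rho[peaks[a]], rho[peaks[b]]):
            parent[find(a)] = find(b)

    # Relabel the merged regions as consecutive cluster ids.
    roots = [find(region[i]) for i in range(n)]
    _, labels = np.unique(roots, return_inverse=True)
    return labels


# Usage: two compact 5x5 grids of points placed far apart.
g = np.arange(5) * 0.2
blob = np.array([(x, y) for x in g for y in g])
X = np.vstack([blob, blob + 10.0])
labels = density_peak_merge(X, d_c=0.5)
```

In this sketch the merge stage is what allows elongated or irregular clusters: a long, evenly dense strip may yield several local peaks in the first stage, and these are joined afterward because the density along their shared border is comparable to the peak densities.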

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis*
  • Computational Biology / methods*
  • Gene Expression Profiling / methods*
  • Microarray Analysis / methods*