Semisupervised Clustering by Iterative Partition and Regression with Neuroscience Applications

Comput Intell Neurosci. 2016:2016:4037380. doi: 10.1155/2016/4037380. Epub 2016 Apr 26.

Abstract

Regression clustering is a statistical learning and data mining method that combines unsupervised and supervised learning, and it is used in a wide range of applications including artificial intelligence and neuroscience. It performs unsupervised learning when it clusters the data according to their respective unobserved regression hyperplanes. The method also performs supervised learning when it fits regression hyperplanes to the corresponding data clusters. Applying regression clustering in practice requires a means of determining the underlying number of clusters in the data, finding the cluster label of each data point, and estimating the regression coefficients of the model. In this paper, we review the estimation and selection issues in regression clustering with regard to least squares and robust statistical methods. We also provide a model-selection-based technique to determine the number of regression clusters underlying the data. We further develop a computing procedure for regression clustering estimation and selection. Finally, we present simulation studies to assess the procedure and analyze a real data set on RGB cell marking in neuroscience to illustrate and interpret the method.
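The alternation between partitioning and regression described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes ordinary least squares fits, a known number of clusters k, random initial labels, and reassignment of each point to the hyperplane with the smallest squared residual. The function name regression_clustering, its parameters, and the synthetic data are all hypothetical and serve only as illustration.

```python
import numpy as np

def regression_clustering(X, y, k, n_iter=50, seed=0):
    """Alternate per-cluster least-squares fits with residual-based reassignment.

    Illustrative sketch only; assumes k is given and uses random initial labels.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])      # design matrix with intercept
    labels = rng.integers(0, k, size=n)       # random initial partition
    coefs = np.zeros((k, A.shape[1]))
    for _ in range(n_iter):
        # Regression step: fit one hyperplane per cluster (supervised part).
        for j in range(k):
            mask = labels == j
            if mask.sum() >= A.shape[1]:      # skip clusters too small to fit
                coefs[j], *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        # Partition step: move each point to its best-fitting hyperplane
        # (unsupervised part).
        sq_resid = (y[:, None] - A @ coefs.T) ** 2
        new_labels = sq_resid.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                             # partition is stable
        labels = new_labels
    return labels, coefs

# Made-up synthetic data: two noisy lines with different slopes and intercepts.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=200)
true_cluster = rng.integers(0, 2, size=200)
y = np.where(true_cluster == 0, 2.0 * x + 1.0, -2.0 * x - 1.0)
y = y + 0.05 * rng.normal(size=200)

labels, coefs = regression_clustering(x[:, None], y, k=2)
```

Each step of this scheme cannot increase the total squared residual, so it converges to a local optimum that depends on the initial labels; in practice several random restarts are commonly used. Choosing k and replacing least squares with a robust loss, both of which the abstract mentions, are outside the scope of this sketch.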

Publication types

  • Review

MeSH terms

  • Algorithms
  • Cluster Analysis*
  • Humans
  • Models, Theoretical*
  • Neurosciences*
  • Regression Analysis*