Bioinformatics. 2012 Aug 1;28(15):2052-8. doi: 10.1093/bioinformatics/bts300. Epub 2012 May 17.

flowPeaks: a fast unsupervised clustering for flow cytometry data via K-means and density peak finding.

Author information

Department of Neurology and Center of Translational System Biology, Mount Sinai School of Medicine, New York, NY 10029, USA. yongchao.ge@mssm.edu

Abstract

MOTIVATION:

For flow cytometry data, there are two common approaches to the unsupervised clustering problem: one is based on the finite mixture model and the other on spatial exploration of the histograms. The former is computationally slow and has difficulty identifying clusters of irregular shape. The latter cannot be applied directly to high-dimensional data, because the computational time and memory become unmanageable and the estimated histogram is unreliable. An algorithm free of both problems would be very useful.

RESULTS:

In this article, we combine ideas from the finite mixture model and histogram spatial exploration. The new algorithm, which we call flowPeaks, can be applied directly to high-dimensional data and can identify irregularly shaped clusters. The algorithm first uses the K-means algorithm with a large K to partition the cell population into many small clusters. These partitioned data allow the generation of a smoothed density function using the finite mixture model. All local peaks are found by exhaustively exploring the density function, and each cell is assigned to the cluster of its associated local peak. flowPeaks is automatic, fast, reliable and robust to cluster shape and outliers. It has been applied to flow cytometry data and compared with state-of-the-art algorithms, including Misty Mountain, FLOCK, flowMeans, flowMerge and FLAME.
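The three stages described above (K-means with a large K, a mixture density built from the partition, and grouping by density peak) can be illustrated with a minimal Python sketch. It is not the flowPeaks implementation and makes several assumptions that do not come from the abstract: all function names and parameters (`ridge_frac`, `tol`, `n_steps`) are hypothetical, the covariance smoothing is a simple ridge term, and the exhaustive peak search is replaced by a simpler rule that links each K-means centre to the nearest denser centre when the density along the connecting segment does not dip too far.

```python
import numpy as np

def kmeans(X, K, rng, n_iter=50):
    """Plain Lloyd iterations with a large K; returns compact cluster ids."""
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for k in range(K):
            members = X[assign == k]
            if len(members):  # keep the old centre if a cluster empties
                centers[k] = members.mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Drop empty clusters and renumber the rest as 0..K'-1.
    _, assign = np.unique(d2.argmin(axis=1), return_inverse=True)
    return assign

def fit_mixture(X, assign, ridge_frac=0.1):
    """One Gaussian per K-means cluster; the ridge term is an assumed smoother."""
    n, d = X.shape
    K = assign.max() + 1
    means = np.array([X[assign == k].mean(axis=0) for k in range(K)])
    pooled = ((X - means[assign]) ** 2).sum() / (n * d)  # within-cluster variance
    ridge = ridge_frac * pooled + 1e-12
    weights, precisions, norms = [], [], []
    for k in range(K):
        diff = X[assign == k] - means[k]
        cov = diff.T @ diff / len(diff) + ridge * np.eye(d)
        weights.append(len(diff) / n)
        precisions.append(np.linalg.inv(cov))
        norms.append(1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(cov)))
    return np.array(weights), means, precisions, np.array(norms)

def density(P, weights, means, precisions, norms):
    """Evaluate the mixture density at each row of P."""
    P = np.atleast_2d(P)
    total = np.zeros(len(P))
    for w, mu, prec, norm in zip(weights, means, precisions, norms):
        diff = P - mu
        maha = np.einsum("ij,jk,ik->i", diff, prec, diff)
        total += w * norm * np.exp(-0.5 * maha)
    return total

def find_peaks(means, dens_fn, tol=0.1, n_steps=9):
    """Link each centre to its nearest denser centre if no deep valley separates them."""
    K = len(means)
    f = dens_fn(means)
    parent = np.arange(K)
    ts = np.linspace(0, 1, n_steps + 2)[1:-1, None]  # interior points of a segment
    for i in range(K):
        for j in np.argsort(((means - means[i]) ** 2).sum(axis=1)):
            if f[j] <= f[i]:
                continue
            segment = means[i] + ts * (means[j] - means[i])
            if dens_fn(segment).min() >= tol * f[i]:
                parent[i] = j
                break
    root = parent.copy()
    for _ in range(K):  # follow links until every centre points at its peak
        root = parent[root]
    return root

def flowpeaks_sketch(X, K=20, seed=0):
    rng = np.random.default_rng(seed)
    assign = kmeans(X, K, rng)
    params = fit_mixture(X, assign)
    means = params[1]
    root = find_peaks(means, lambda P: density(P, *params))
    peaks, labels = np.unique(root[assign], return_inverse=True)
    return labels, means[peaks]

# Synthetic stand-in for two well-separated cell populations (not real data).
demo_rng = np.random.default_rng(1)
X = np.vstack([demo_rng.normal(0.0, 1.0, size=(300, 2)),
               demo_rng.normal(50.0, 1.0, size=(300, 2))])
labels, peak_locations = flowpeaks_sketch(X)
```

In this sketch every cell inherits the label of the peak reached from its K-means cluster, so the final number of clusters is found from the density rather than fixed in advance. For real analyses, the R package named under AVAILABILITY is the reference implementation.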

AVAILABILITY:

The R package flowPeaks is available at https://github.com/yongchao/flowPeaks.

CONTACT:

yongchao.ge@mssm.edu

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

PMID: 22595209
PMCID: PMC3400953
DOI: 10.1093/bioinformatics/bts300
[Indexed for MEDLINE]
Free PMC Article
