Nucleic Acids Res. 2018 Apr 6;46(6):e36. doi: 10.1093/nar/gky007.

dropClust: efficient clustering of ultra-large scRNA-seq data.

Author information

1. Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700108, West Bengal, India.
2. Department of Computer Science and Engineering, University of Calcutta, Kolkata 700098, West Bengal, India.
3. Laboratory of Immunology and Infectious Disease Biology, Department of Biological Sciences, Indian Institute of Science Education and Research, Bhopal 462066, Madhya Pradesh, India.
4. Center for Computational Biology and Department of Computer Science and Engineering, Indraprastha Institute of Information Technology, Delhi 110020, India.

Abstract

Droplet-based single-cell transcriptomics has recently enabled parallel screening of tens of thousands of single cells. Clustering methods that scale to such high-dimensional data without compromising accuracy are scarce. We exploit Locality Sensitive Hashing, an approximate nearest neighbour search technique, to develop a de novo clustering algorithm for large-scale single-cell data. On a number of real datasets, dropClust outperformed the existing best-practice methods in terms of execution time, clustering accuracy and detectability of minor cell sub-types.
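
To illustrate the general idea behind Locality Sensitive Hashing mentioned in the abstract, here is a minimal sketch in Python. It is not dropClust's implementation: the abstract does not say which LSH variant is used, so the random-hyperplane scheme below is an assumption chosen for illustration, and the names make_hyperplanes, signature, build_index and candidates, as well as the toy expression vectors, are hypothetical. Vectors that point in similar directions tend to receive the same bit signature and so land in the same bucket, which lets a neighbour search examine one bucket instead of every cell.

```python
import random
from collections import defaultdict


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplane normals in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def build_index(vectors, hyperplanes):
    """Group vector indices into buckets keyed by their signature."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[signature(vec, hyperplanes)].append(idx)
    return buckets


def candidates(query, hyperplanes, buckets):
    """Return indices sharing the query's bucket (approximate neighbours)."""
    return buckets.get(signature(query, hyperplanes), [])


# Toy "expression profiles" (hypothetical values): cells 0 and 1 point in
# the same direction, cell 2 does not.
cells = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [-1.0, 0.5, 0.0]]
planes = make_hyperplanes(n_bits=8, dim=3, seed=1)
index = build_index(cells, planes)
near_cell_0 = candidates(cells[0], planes, index)
```

Because a signature depends only on the sign of each dot product, cells 0 and 1 (one is a positive multiple of the other) always share a bucket. A practical system would typically combine several such hash tables and re-rank the candidates by exact distance.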

PMID: 29361178
PMCID: PMC5888655
DOI: 10.1093/nar/gky007
[Indexed for MEDLINE]
Free PMC Article
