Geometric Sketching Compactly Summarizes the Single-Cell Transcriptomic Landscape

Cell Syst. 2019 Jun 26;8(6):483-493.e7. doi: 10.1016/j.cels.2019.05.003. Epub 2019 Jun 5.

Abstract

Large-scale single-cell RNA sequencing (scRNA-seq) studies that profile hundreds of thousands of cells are becoming increasingly common, overwhelming existing analysis pipelines. Here, we describe how to enhance and accelerate single-cell data analysis by summarizing the transcriptomic heterogeneity within a dataset using a small subset of cells, which we refer to as a geometric sketch. Our sketches provide more comprehensive visualization of transcriptional diversity, capture rare cell types with high sensitivity, and reveal biological cell types via clustering. Our sketch of umbilical cord blood cells uncovers a rare subpopulation of inflammatory macrophages, which we experimentally validated. Sketch construction is extremely fast, which enabled us to accelerate other crucial resource-intensive tasks, such as scRNA-seq data integration, while maintaining accuracy. We anticipate that our algorithm will become an increasingly essential step in sharing and analyzing the rapidly growing volume of scRNA-seq data and will help democratize single-cell omics.
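
To make the idea of a geometric sketch concrete, the following is a minimal illustration and not the authors' implementation. The abstract does not describe the construction, so this example assumes one plausible approach: cover the data with equal-sized axis-aligned boxes, then draw cells evenly across occupied boxes rather than evenly across cells, so that sparsely populated regions (such as rare cell types) are not swamped by dense ones. The function name grid_sketch and the parameters n_bins and seed are hypothetical, and the input is assumed to be a low-dimensional embedding with cells as rows.

```python
import numpy as np


def grid_sketch(X, n_samples, n_bins=10, seed=0):
    """Return sorted row indices of X forming a small, geometry-aware subsample.

    Illustrative assumption: cells are binned into equal-sized axis-aligned
    boxes, and the sample is drawn round-robin across occupied boxes.
    """
    X = np.asarray(X, dtype=float)
    if not 0 < n_samples <= X.shape[0]:
        raise ValueError("n_samples must be between 1 and the number of rows")
    rng = np.random.default_rng(seed)

    # Use one side length for every dimension so that boxes are hypercubes.
    mins = X.min(axis=0)
    span = (X.max(axis=0) - mins).max()
    side = span / n_bins if span > 0 else 1.0
    boxes = np.floor((X - mins) / side).astype(int)

    # Map each distinct box coordinate to an integer id.
    _, box_id = np.unique(boxes, axis=0, return_inverse=True)
    box_id = box_id.ravel()

    # Shuffle the members of each occupied box once.
    members = [
        list(rng.permutation(np.flatnonzero(box_id == b)))
        for b in range(box_id.max() + 1)
    ]

    # Take one cell per non-empty box per pass until the sketch is full.
    chosen = []
    while len(chosen) < n_samples:
        for b in rng.permutation(len(members)):
            if members[b]:
                chosen.append(int(members[b].pop()))
                if len(chosen) == n_samples:
                    break
    return np.array(sorted(chosen))


if __name__ == "__main__":
    # Toy data: one dense population and one rare, distant population.
    dense = np.column_stack([np.linspace(0, 0.1, 2000), np.linspace(0.1, 0, 2000)])
    rare = np.column_stack([np.linspace(5, 5.1, 10), np.linspace(5, 5.1, 10)])
    X = np.vstack([dense, rare])
    idx = grid_sketch(X, n_samples=10)
    print("rare cells in sketch:", int((idx >= 2000).sum()))
```

In this toy setting, a uniform random sample of 10 cells would usually contain no rare cells at all, whereas sampling across boxes guarantees that every occupied region contributes at least one cell whenever the sketch size is at least the number of occupied boxes.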

Keywords: big data; data integration; diversity; geometric sketching; heterogeneity; rare cell-type discovery; sampling; scRNA-seq; single-cell RNA-seq; sketching.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Animals
  • Data Analysis
  • Datasets as Topic
  • Genetic Heterogeneity
  • Humans
  • Macrophages
  • RNA-Seq
  • Single-Cell Analysis / methods*
  • Transcriptome*