Modeling bi-modality improves characterization of cell cycle on gene expression in single cells

PLoS Comput Biol. 2014 Jul 17;10(7):e1003696. doi: 10.1371/journal.pcbi.1003696. eCollection 2014 Jul.

Abstract

Advances in high-throughput, single cell gene expression are allowing interrogation of cell heterogeneity. However, there is concern that the cell cycle phase of a cell might bias characterizations of gene expression at the single-cell level. We assess the effect of cell cycle phase on gene expression in single cells by measuring 333 genes in 930 cells across three phases and three cell lines. We determine each cell's phase non-invasively without chemical arrest and use it as a covariate in tests of differential expression. We observe bi-modal gene expression, a previously-described phenomenon, wherein the expression of otherwise abundant genes is either strongly positive, or undetectable within individual cells. This bi-modality is likely both biologically and technically driven. Irrespective of its source, we show that it should be modeled to draw accurate inferences from single cell expression experiments. To this end, we propose a semi-continuous modeling framework based on the generalized linear model, and use it to characterize genes with consistent cell cycle effects across three cell lines. Our new computational framework improves the detection of previously characterized cell-cycle genes compared to approaches that do not account for the bi-modality of single-cell data. We use our semi-continuous modelling framework to estimate single cell gene co-expression networks. These networks suggest that in addition to having phase-dependent shifts in expression (when averaged over many cells), some, but not all, canonical cell cycle genes tend to be co-expressed in groups in single cells. We estimate the amount of single cell expression variability attributable to the cell cycle. We find that the cell cycle explains only 5%-17% of expression variability, suggesting that the cell cycle will not tend to be a large nuisance factor in analysis of the single cell transcriptome.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Cell Cycle / genetics*
  • Cell Line
  • Computational Biology / methods*
  • Gene Expression / genetics*
  • Gene Expression Profiling
  • Gene Regulatory Networks
  • Genes, cdc
  • Humans
  • Models, Genetic*