Genome Biol. 2016 Apr 27;17:75. doi: 10.1186/s13059-016-0947-7.

Pooling across cells to normalize single-cell RNA sequencing data with many zero counts.

Author information

1. Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, CB2 0RE, Cambridge, UK. aaron.lun@cruk.cam.ac.uk.
2. EMBL European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, Cambridge, UK.
3. Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, CB2 0RE, Cambridge, UK. marioni@ebi.ac.uk.
4. EMBL European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, Cambridge, UK. marioni@ebi.ac.uk.
5. Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, Cambridge, UK. marioni@ebi.ac.uk.

Abstract

Normalization of single-cell RNA sequencing data is necessary to eliminate cell-specific biases prior to downstream analyses. However, this is not straightforward for noisy single-cell data where many counts are zero. We present a novel approach where expression values are summed across pools of cells, and the summed values are used for normalization. Pool-based size factors are then deconvolved to yield cell-based factors. Our deconvolution approach outperforms existing methods for accurate normalization of cell-specific biases in simulated data. Similar behavior is observed in real data, where deconvolution improves the relevance of results of downstream analyses.
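
The pooling and deconvolution steps described in the abstract can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the function name `pooled_size_factors`, the ring-shaped sliding windows, the pool sizes (3, 5, 7), the averaged reference pseudo-cell, and the plain least-squares solve are all choices made for this example only.

```python
import numpy as np


def pooled_size_factors(counts, pool_sizes=(3, 5, 7)):
    """Estimate per-cell size factors by pooling and deconvolution.

    counts: 2-D array, genes in rows and cells in columns.
    pool_sizes: hypothetical window sizes; each must not exceed the
        number of cells.
    Returns one size factor per cell, scaled to have mean 1.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[1]

    # Reference pseudo-cell: average expression of each gene across cells.
    reference = counts.mean(axis=1)

    # Drop genes that are zero everywhere, so the ratios are defined.
    keep = reference > 0
    counts = counts[keep]
    reference = reference[keep]

    rows, pool_factors = [], []
    for size in pool_sizes:
        for start in range(n_cells):
            # Sliding window of `size` cells, wrapping around the ends.
            members = (start + np.arange(size)) % n_cells

            # Summing across the pool reduces the number of zero counts.
            pooled = counts[:, members].sum(axis=1)

            # Pool-based size factor: median ratio to the reference.
            pool_factors.append(np.median(pooled / reference))

            # Each pool contributes one linear equation:
            # sum of member cell factors = pool factor.
            row = np.zeros(n_cells)
            row[members] = 1.0
            rows.append(row)

    # Deconvolve pool-based factors into cell-based factors
    # by solving the stacked linear system in the least-squares sense.
    design = np.vstack(rows)
    target = np.array(pool_factors)
    cell_factors, *_ = np.linalg.lstsq(design, target, rcond=None)

    return cell_factors / cell_factors.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_bias = rng.uniform(0.5, 2.0, size=20)        # simulated cell biases
    gene_means = rng.gamma(2.0, 1.0, size=500)        # simulated gene means
    counts = rng.poisson(np.outer(gene_means, true_bias))
    print(np.round(pooled_size_factors(counts), 2))
```

Mixing pool sizes that share no common factor keeps the linear system in this sketch well determined. Dividing each cell's counts by its estimated factor would then give normalized expression values.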

KEYWORDS:

Differential expression; Normalization; Single-cell RNA-seq

PMID: 27122128
PMCID: PMC4848819
DOI: 10.1186/s13059-016-0947-7
[Indexed for MEDLINE]
Free PMC Article
