Nat Methods. 2019 Jan;16(1):43-49. doi: 10.1038/s41592-018-0254-1. Epub 2018 Dec 20.

A test metric for assessing single-cell RNA-seq batch correction.

Author information

1. Helmholtz Zentrum München-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany.
2. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, UK.
3. Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.
4. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, UK. st9@sanger.ac.uk.
5. Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK. st9@sanger.ac.uk.
6. Department of Physics, Cavendish Laboratory, University of Cambridge, Cambridge, UK. st9@sanger.ac.uk.
7. Helmholtz Zentrum München-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany. fabian.theis@helmholtz-muenchen.de.
8. Department of Mathematics, Technische Universität München, Munich, Germany. fabian.theis@helmholtz-muenchen.de.

Abstract

Single-cell transcriptomics is a versatile tool for exploring heterogeneous cell populations, but as with all genomics experiments, batch effects can hamper data integration and interpretation. The success of batch-effect correction is often evaluated by visual inspection of low-dimensional embeddings, which are inherently imprecise. Here we present a user-friendly, robust and sensitive k-nearest-neighbor batch-effect test (kBET; https://github.com/theislab/kBET) for quantification of batch effects. We used kBET to assess commonly used batch-regression and normalization approaches, and to quantify the extent to which they remove batch effects while preserving biological variability. We also demonstrate the application of kBET to data from peripheral blood mononuclear cells (PBMCs) from healthy donors to distinguish cell-type-specific inter-individual variability from changes in relative proportions of cell populations. This has important implications for future data-integration efforts, central to projects such as the Human Cell Atlas.
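
The abstract names a k-nearest-neighbor batch-effect test but does not describe its mechanics. The following is a minimal sketch of the general idea, not the kBET package itself: for each cell, the batch labels among its k nearest neighbors are compared with the global batch proportions using a Pearson chi-squared test, and the fraction of rejected tests is reported. The function names (`chi2_sf`, `knn_batch_rejection_rate`), the brute-force Euclidean neighbor search, the choice to test every cell, and the default `alpha` are all assumptions made for illustration.

```python
import math
from collections import Counter


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-squared variable with df >= 1 degrees of freedom."""
    if stat <= 0:
        return 1.0
    half = stat / 2.0
    if df % 2 == 0:
        q, a = math.exp(-half), 1.0              # closed form for df = 2
    else:
        q, a = math.erfc(math.sqrt(half)), 0.5   # closed form for df = 1
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, x) = Q(a, x) + x^a * exp(-x) / Gamma(a + 1)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def knn_batch_rejection_rate(cells, batches, k, alpha=0.05):
    """Fraction of cells whose k-nearest-neighbor batch mix differs from the global mix.

    Hypothetical helper: `cells` is a list of coordinate tuples, `batches` a
    list of batch labels of the same length.
    """
    n = len(cells)
    if n != len(batches):
        raise ValueError("cells and batches must have the same length")
    labels = sorted(set(batches))
    if len(labels) < 2:
        raise ValueError("need at least two batches")
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of cells")

    global_counts = Counter(batches)
    # Expected neighbor count per batch if batches were perfectly mixed.
    expected = {b: k * global_counts[b] / n for b in labels}

    rejected = 0
    for i in range(n):
        # Brute-force search: the k nearest other cells by Euclidean distance.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(cells[i], cells[j]),
        )
        local = Counter(batches[j] for j in others[:k])
        stat = sum((local[b] - expected[b]) ** 2 / expected[b] for b in labels)
        if chi2_sf(stat, len(labels) - 1) < alpha:
            rejected += 1
    return rejected / n


# Toy data: two batches interleaved along a line versus two well-separated clumps.
mixed = [(float(i),) for i in range(40)]
mixed_batches = ["A" if i % 2 == 0 else "B" for i in range(40)]
separated = [(float(i),) for i in range(20)] + [(100.0 + i,) for i in range(20)]
separated_batches = ["A"] * 20 + ["B"] * 20

print(knn_batch_rejection_rate(mixed, mixed_batches, k=10))
print(knn_batch_rejection_rate(separated, separated_batches, k=10))
```

Under these assumptions, a rejection rate near 0 suggests that batches are well mixed locally, while a rate near 1 suggests a strong batch effect. The published tool should be consulted for its actual sampling scheme, neighborhood size selection and reported statistics.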

PMID: 30573817
DOI: 10.1038/s41592-018-0254-1
