Benchmarking Computational Doublet-Detection Methods for Single-Cell RNA Sequencing Data

Cell Syst. 2021 Feb 17;12(2):176-194.e6. doi: 10.1016/j.cels.2020.11.008. Epub 2020 Dec 17.

Abstract

In single-cell RNA sequencing (scRNA-seq), doublets form when two cells are encapsulated into one reaction volume. The existence of doublets, which appear to be, but are not, real cells, is a key confounder in scRNA-seq data analysis. Computational methods have been developed to detect doublets in scRNA-seq data; however, the scRNA-seq field lacks a comprehensive benchmarking of these methods, making it difficult for researchers to choose an appropriate method for specific analyses. We conducted a systematic benchmark study of nine cutting-edge computational doublet-detection methods. Our study included 16 real datasets, which contained experimentally annotated doublets, and 112 realistic synthetic datasets. We compared doublet-detection methods regarding detection accuracy under various experimental settings, impacts on downstream analyses, and computational efficiencies. Our results show that existing methods exhibited diverse performance and distinct advantages in different aspects. Overall, the DoubletFinder method has the best detection accuracy, and the cxds method has the highest computational efficiency. A record of this paper's transparent peer review process is included in the Supplemental Information.

Keywords: cell clustering; differential gene expression; doublet detection; parallel computing; reproducibility; scRNA-seq; software implementation; trajectory inference.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Benchmarking
  • Humans
  • RNA-Seq / methods*
  • Single-Cell Analysis / methods*