Bioinformatics. 2020 Mar 24. pii: btaa198. doi: 10.1093/bioinformatics/btaa198. [Epub ahead of print]

Automatic identification of relevant genes from low-dimensional embeddings of single cell RNAseq data.

Author information

1 Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany.
2 TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany.
3 Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany.
4 Institute of Functional Epigenetics, Helmholtz Zentrum München - German Research Center for Environmental Health, München, Germany.

Abstract

Dimensionality reduction is a key step in the analysis of single-cell RNA sequencing data. It produces a low-dimensional embedding that serves both for visualization and as a computational basis for downstream analysis. Non-linear techniques are best suited to handle the intrinsic complexity of large, heterogeneous single-cell data. However, because there is no linear relation between genes and embedding coordinates, there is no direct way to extract the identity of the genes driving any cell's position in the low-dimensional embedding, which makes it more difficult to characterize the underlying biological processes. In this paper, we introduce the concepts of local and global gene relevance to compute an equivalent of principal component analysis loadings for non-linear low-dimensional embeddings. Global gene relevance identifies drivers of the overall embedding, while local gene relevance identifies those of a defined subregion. We apply our method to single-cell RNAseq datasets from different experimental protocols and to different low-dimensional embedding techniques. This demonstrates our method's versatility in identifying key genes for a variety of biological processes. To ensure reproducibility and ease of use, our method is released as part of destiny 3.0, a popular R package for building diffusion maps from single-cell transcriptomic data. It is readily available through Bioconductor.
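The idea of a loading-like quantity for a non-linear embedding can be illustrated with a small sketch. The abstract does not specify how gene relevance is computed, so everything below is an assumption made for illustration: local relevance is approximated by a least-squares linearization of the embedding around each cell (using its nearest neighbors in expression space), and global relevance is taken as the fraction of cells in which a gene ranks among the locally most relevant. The function names `local_gene_relevance` and `global_gene_relevance` and the parameters `k` and `top_n` are hypothetical; this is not the destiny API and not necessarily the authors' estimator.

```python
import numpy as np


def local_gene_relevance(expr, embedding, k=10):
    """Illustrative local relevance (assumed estimator, not the paper's).

    expr:      (n_cells, n_genes) expression matrix
    embedding: (n_cells, n_dims) low-dimensional coordinates
    Returns an (n_cells, n_genes) array of non-negative scores.
    """
    n_cells, n_genes = expr.shape
    # Squared Euclidean distances between cells in expression space.
    sq = np.sum(expr ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * expr @ expr.T
    np.fill_diagonal(d2, np.inf)  # a cell is not its own neighbor
    neighbors = np.argsort(d2, axis=1)[:, :k]

    relevance = np.zeros((n_cells, n_genes))
    for i in range(n_cells):
        d_expr = expr[neighbors[i]] - expr[i]           # (k, n_genes)
        d_emb = embedding[neighbors[i]] - embedding[i]  # (k, n_dims)
        # Local linear map from expression changes to embedding changes;
        # lstsq returns the minimum-norm solution if k < n_genes.
        coef, *_ = np.linalg.lstsq(d_expr, d_emb, rcond=None)
        # Size of each gene's effect across all embedding dimensions.
        relevance[i] = np.linalg.norm(coef, axis=1)
    return relevance


def global_gene_relevance(local, top_n=1):
    """Fraction of cells in which each gene is among the top_n local scores."""
    n_cells, n_genes = local.shape
    top = np.argsort(-local, axis=1)[:, :top_n]
    counts = np.bincount(top.ravel(), minlength=n_genes)
    return counts / n_cells
```

For a purely linear embedding (coordinates equal to expression times a fixed weight matrix) and enough neighbors, the local scores reduce to the row norms of that weight matrix, which is the sense in which such a quantity plays the role of a principal component loading. Restricting the average or ranking to the cells of one subregion would give a region-specific view under the same assumptions.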
