A Systematic Overview of Single-Cell Transcriptomics Databases, Their Use Cases, and Limitations

ArXiv [Preprint]. 2024 Apr 15:arXiv:2404.10545v1.

Abstract

Rapid advances in high-throughput single-cell RNA-seq (scRNA-seq) technologies and experimental protocols have generated vast amounts of genomic data that now populate several online databases and repositories. Here, we systematically examine large-scale scRNA-seq databases and categorize them by scope and purpose: general, tissue-specific, disease-specific, cancer-focused, and cell type-focused databases. We then discuss the technical and methodological challenges of curating large-scale scRNA-seq databases, along with current computational solutions. We argue that understanding scRNA-seq databases, including their limitations and assumptions, is crucial for using these data effectively to make robust discoveries and gain novel biological insights. Furthermore, we propose that user-friendly web-based platforms are needed to bridge the gap between computational and wet lab scientists and to democratize access to single-cell data. Such platforms would facilitate interdisciplinary research by enabling researchers from different fields to collaborate effectively. This review underscores the importance of computational approaches for unraveling the complexities of single-cell data and outlines a promising direction for future research in the field.

Keywords: Cell heterogeneity; Computational methods; Single-cell Atlases; Single-cell Databases; Single-cell RNA-seq; Single-cell data analysis; Single-cell data integration; Web-based platforms.
