PLoS Comput Biol. 2018 May 3;14(5):e1006135. doi: 10.1371/journal.pcbi.1006135. eCollection 2018 May.

beachmat: A Bioconductor C++ API for accessing high-throughput biological data from a variety of R matrix types.

Author information

Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom.
Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America.
European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany.


Biological experiments involving genomics or other high-throughput assays typically yield a data matrix that can be explored and analyzed using the R programming language with packages from the Bioconductor project. Improvements in the throughput of these assays have resulted in an explosion of data even from routine experiments, which poses a challenge to the existing computational infrastructure for statistical data analysis. For example, single-cell RNA sequencing (scRNA-seq) experiments frequently generate large matrices containing expression values for each gene in each cell, requiring sparse or file-backed representations for memory-efficient manipulation in R. These alternative representations are not easily compatible with high-performance C++ code used for computationally intensive tasks in existing R/Bioconductor packages. Here, we describe a C++ interface named beachmat, which enables agnostic data access from various matrix representations. This allows package developers to write efficient C++ code that is interoperable with dense, sparse and file-backed matrices, amongst others. We evaluated the performance of beachmat for accessing data from each matrix representation using both simulated and real scRNA-seq data, and defined a clear memory/speed trade-off to motivate the choice of an appropriate representation. We also demonstrate how beachmat can be incorporated into the code of other packages to drive analyses of a very large scRNA-seq data set.
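The idea of representation-agnostic access can be illustrated with a minimal C++ sketch. This is not the beachmat API: the class names (`AnyNumericMatrix`, `DenseColumnMajor`, `SparseCSC`) and the `get_col` and `column_sums` functions are hypothetical, chosen only to show how a single routine might operate on dense and sparse storage through one shared interface. A file-backed variant would, under the same assumption, implement the same interface by reading each column from disk.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical common interface; beachmat's real classes and methods may differ.
class AnyNumericMatrix {
public:
    virtual ~AnyNumericMatrix() = default;
    virtual std::size_t nrow() const = 0;
    virtual std::size_t ncol() const = 0;
    // Copy column c into out, resizing out to nrow() elements.
    virtual void get_col(std::size_t c, std::vector<double>& out) const = 0;
};

// Dense storage in column-major order, as used by ordinary R matrices.
class DenseColumnMajor : public AnyNumericMatrix {
    std::size_t nr_, nc_;
    std::vector<double> values_;
public:
    DenseColumnMajor(std::size_t nr, std::size_t nc, std::vector<double> values)
        : nr_(nr), nc_(nc), values_(std::move(values)) {
        assert(values_.size() == nr_ * nc_);
    }
    std::size_t nrow() const override { return nr_; }
    std::size_t ncol() const override { return nc_; }
    void get_col(std::size_t c, std::vector<double>& out) const override {
        assert(c < nc_);
        const double* start = values_.data() + c * nr_;
        out.assign(start, start + nr_);
    }
};

// Compressed sparse column storage: only non-zero entries are kept.
class SparseCSC : public AnyNumericMatrix {
    std::size_t nr_, nc_;
    std::vector<std::size_t> col_ptr_;  // length ncol + 1
    std::vector<std::size_t> row_idx_;  // row index of each stored value
    std::vector<double> values_;        // stored non-zero values
public:
    SparseCSC(std::size_t nr, std::size_t nc,
              std::vector<std::size_t> col_ptr,
              std::vector<std::size_t> row_idx,
              std::vector<double> values)
        : nr_(nr), nc_(nc), col_ptr_(std::move(col_ptr)),
          row_idx_(std::move(row_idx)), values_(std::move(values)) {
        assert(col_ptr_.size() == nc_ + 1);
        assert(row_idx_.size() == values_.size());
    }
    std::size_t nrow() const override { return nr_; }
    std::size_t ncol() const override { return nc_; }
    void get_col(std::size_t c, std::vector<double>& out) const override {
        assert(c < nc_);
        out.assign(nr_, 0.0);  // start from zeros, then fill stored entries
        for (std::size_t k = col_ptr_[c]; k < col_ptr_[c + 1]; ++k) {
            out[row_idx_[k]] = values_[k];
        }
    }
};

// Analysis code written once against the interface, unaware of the storage type.
std::vector<double> column_sums(const AnyNumericMatrix& mat) {
    std::vector<double> sums(mat.ncol(), 0.0);
    std::vector<double> buffer;
    for (std::size_t c = 0; c < mat.ncol(); ++c) {
        mat.get_col(c, buffer);
        for (double v : buffer) {
            sums[c] += v;
        }
    }
    return sums;
}
```

In this sketch, `column_sums` returns the same result for a `DenseColumnMajor` and a `SparseCSC` object holding the same values. The sparse object stores only the non-zero entries but has to expand each column on access, which gives a simple picture of the kind of memory/speed trade-off the abstract refers to.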

[Indexed for MEDLINE]
