BMC Genomics. 2017 Jul 19;18(1):545. doi: 10.1186/s12864-017-3938-5.

A permutation-based non-parametric analysis of CRISPR screen data.

Jia G1,2, Wang X3, Xiao G4,5,6.

Author information

1 Department of Statistical Science, Southern Methodist University, Dallas, TX, 75205, USA.
2 Quantitative Biomedical Research Center, Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA.
3 Department of Statistical Science, Southern Methodist University, Dallas, TX, 75205, USA. swang@smu.edu.
4 Quantitative Biomedical Research Center, Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA. guanghua.xiao@utsouthwestern.edu.
5 Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA. guanghua.xiao@utsouthwestern.edu.
6 Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA. guanghua.xiao@utsouthwestern.edu.

Abstract

BACKGROUND:

Clustered regularly interspaced short palindromic repeats (CRISPR) screens are usually implemented in cultured cells to identify genes with critical functions. Although several methods have been developed or adapted to analyze CRISPR screening data, no single algorithm has gained widespread adoption. Thus, rigorous procedures are needed to overcome the shortcomings of existing algorithms.

METHODS:

We developed a Permutation-Based Non-Parametric Analysis (PBNPA) algorithm, which computes p-values at the gene level by permuting sgRNA labels and thus avoids restrictive distributional assumptions. Although PBNPA is designed to analyze CRISPR data, it can also be applied to genetic screens implemented with siRNAs or shRNAs and to drug screens.
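
The idea of obtaining gene-level p-values by permuting sgRNA labels can be illustrated with a minimal Python sketch. This is not the PBNPA implementation and does not use the R package's API. The function name `gene_level_pvalues`, the use of the median sgRNA log fold change as the gene score, the pooling of permuted scores across genes into one null distribution, and the one-sided test for negative selection are all assumptions made for illustration.

```python
import numpy as np


def gene_level_pvalues(log_fc, gene_ids, n_perm=1000, seed=0):
    """Illustrative one-sided (depletion) permutation p-values per gene.

    log_fc   : per-sgRNA log fold changes (treatment vs. control)
    gene_ids : gene label for each sgRNA, same length as log_fc
    Hypothetical helper: the scoring and pooling choices are assumptions.
    """
    rng = np.random.default_rng(seed)
    log_fc = np.asarray(log_fc, dtype=float)
    gene_ids = np.asarray(gene_ids)
    genes, labels = np.unique(gene_ids, return_inverse=True)

    def gene_scores(lab):
        # Gene score = median log fold change of the sgRNAs assigned to it.
        return np.array([np.median(log_fc[lab == g]) for g in range(len(genes))])

    observed = gene_scores(labels)

    # Null distribution: shuffle which sgRNA belongs to which gene,
    # recompute the scores, and pool them over genes and permutations.
    null = np.concatenate(
        [gene_scores(rng.permutation(labels)) for _ in range(n_perm)]
    )
    null.sort()

    # Fraction of null scores at least as low as the observed score,
    # with a +1 correction so that no p-value is exactly zero.
    counts = np.searchsorted(null, observed, side="right")
    pvals = (counts + 1) / (null.size + 1)
    return dict(zip(genes.tolist(), pvals.tolist()))
```

Because the null distribution comes from relabeling the observed sgRNAs, no parametric model of the read counts is required. Pooling across genes is only a reasonable approximation when genes have similar numbers of sgRNAs. A test for positive selection would count null scores at least as high as the observed score instead.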

RESULTS:

We compared the performance of PBNPA with competing methods on simulated data as well as on real data. PBNPA outperformed recent methods designed for CRISPR screen analysis, as well as methods used for analyzing other functional genomics screens, in terms of Receiver Operating Characteristics (ROC) curves and False Discovery Rate (FDR) control for simulated data under various settings. Remarkably, the PBNPA algorithm showed better consistency and FDR control on published real data as well.

CONCLUSIONS:

PBNPA yields more consistent and reliable results than its competitors, especially when the data quality is low. The PBNPA R package is available at https://cran.r-project.org/web/packages/PBNPA/.

KEYWORDS:

False discovery rate; Functional genomics; Negative selection; Next generation sequencing; Positive selection; RNA interference

PMID: 28724352
PMCID: PMC5518132
DOI: 10.1186/s12864-017-3938-5
[Indexed for MEDLINE]
Free PMC Article
