Nucleic Acids Res. 2016 Jun 2;44(10):e98. doi: 10.1093/nar/gkw158. Epub 2016 Mar 14.

CLASS2: accurate and efficient splice variant annotation from RNA-seq reads.

Author information

1. Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.
2. Department of Pediatrics, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA.
3. Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA; Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA. Email: florea@jhu.edu

Abstract

Next-generation sequencing of cellular RNA is making it possible to characterize genes and alternative splicing in unprecedented detail. However, designing bioinformatics tools that accurately capture splicing variation has proven difficult. Current programs either find the major isoforms of a gene but miss lower-abundance variants, or are sensitive but imprecise. CLASS2 is a novel open-source tool for accurate genome-guided transcriptome assembly from RNA-seq reads, based on the splice graph model. An extension of our program CLASS, CLASS2 jointly optimizes read patterns and the number of supporting reads to score and prioritize transcripts, using a novel, scalable and efficient dynamic programming algorithm. When compared against reference programs, CLASS2 had the best overall accuracy and could detect up to twice as many splicing events with precision similar to that of the best reference program. Notably, it was the only tool to produce consistently reliable transcript models for a wide range of applications and sequencing strategies, including ribosomal RNA-depleted samples. Lightweight and multi-threaded, CLASS2 requires <3 GB of RAM and can analyze a 350-million-read set within hours. It can be widely applied to transcriptomics studies ranging from clinical RNA sequencing to alternative splicing analyses and the annotation of new genomes.
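
As a purely illustrative aid, the splice graph idea mentioned in the abstract can be sketched in a few lines of Python. This is not the CLASS2 algorithm, whose scoring is not described here. It is a toy that assumes exons are nodes listed in genomic order, junctions are edges weighted by a hypothetical supporting-read count, and a candidate transcript is scored by its weakest junction. The function name `best_supported_path` and all data values are invented for the example.

```python
import math

def best_supported_path(exons, junctions):
    """Toy dynamic program over a splice graph (not the CLASS2 method).

    exons: exon ids, assumed to be listed in genomic (topological) order.
    junctions: dict mapping (src_exon, dst_exon) -> supporting read count.
    Returns (score, path), where score is the smallest read count on the
    path and the path maximizes that score.
    """
    preds = {e: [] for e in exons}
    has_outgoing = set()
    for (src, dst), reads in junctions.items():
        preds[dst].append((src, reads))
        has_outgoing.add(src)

    best, back = {}, {}
    for e in exons:
        # Exons with no incoming junction start a path with no constraint yet.
        best[e], back[e] = math.inf, None
        if preds[e]:
            # Extend the best path into each predecessor by one junction.
            best[e], back[e] = max(
                (min(best[src], reads), src) for src, reads in preds[e]
            )

    # A transcript ends at an exon with no outgoing junction.
    # Note: an isolated exon would score infinity in this toy.
    sinks = [e for e in exons if e not in has_outgoing]
    end = max(sinks, key=lambda e: best[e])

    score, path = best[end], []
    while end is not None:
        path.append(end)
        end = back[end]
    return score, path[::-1]

# Hypothetical four-exon gene with three alternative routes.
exons = ["e1", "e2", "e3", "e4"]
junctions = {
    ("e1", "e2"): 30, ("e2", "e4"): 25,  # well-supported isoform
    ("e1", "e3"): 5,  ("e3", "e4"): 4,   # low-abundance variant
    ("e1", "e4"): 12,                    # exon-skipping variant
}
score, path = best_supported_path(exons, junctions)
print(score, path)  # 25 ['e1', 'e2', 'e4']
```

A single best path recovers only the dominant isoform. A real assembler must also enumerate and rank lower-abundance alternatives, which is the harder problem the abstract describes.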

PMID: 26975657
PMCID: PMC4889935
DOI: 10.1093/nar/gkw158
[Indexed for MEDLINE]
Free PMC Article
