Genome Biol. 2016 Jan 26;17:12. doi: 10.1186/s13059-015-0862-3.

Isoform prefiltering improves performance of count-based methods for analysis of differential transcript usage.

Author information

1. Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057, Switzerland. charlotte.soneson@uzh.ch.
2. SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057, Switzerland. charlotte.soneson@uzh.ch.
3. Division of Chronic Disease Epidemiology, Epidemiology, Biostatistics and Prevention Institute (EPBI), University of Zurich, Hirschengraben 84, Zurich, 8001, Switzerland. katarinaluise.matthes@usz.ch.
4. Cancer Registry Zurich and Zug, University Hospital Zurich, Vogelsangstrasse 10, Zurich, 8091, Switzerland. katarinaluise.matthes@usz.ch.
5. Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057, Switzerland. gosia.nowicka@uzh.ch.
6. SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057, Switzerland. gosia.nowicka@uzh.ch.
7. Molecular Medicine Division, Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, 3052, Australia. charity.law@uzh.ch.
8. Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057, Switzerland. mark.robinson@imls.uzh.ch.
9. SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057, Switzerland. mark.robinson@imls.uzh.ch.

Abstract

BACKGROUND:

RNA-seq has been a boon to the quantitative analysis of transcriptomes. A notable application is the detection of changes in transcript usage between experimental conditions. For example, discovery of pathological alternative splicing may allow the development of new treatments or better management of patients. From an analysis perspective, there are several ways to approach RNA-seq data to unravel differential transcript usage, such as annotation-based exon-level counting, differential analysis of the percentage spliced in, or quantitative analysis of assembled transcripts. The goal of this research is to compare and contrast current state-of-the-art methods, and to suggest improvements to commonly used work flows.

RESULTS:

We assess the performance of representative work flows using synthetic data and explore the effect of using non-standard counting bin definitions as input to DEXSeq, a state-of-the-art inference engine. Although the canonical counting provided the best results overall, several non-canonical approaches were as good or better in specific aspects and most counting approaches outperformed the evaluated event- and assembly-based methods. We show that an incomplete annotation catalog can have a detrimental effect on the ability to detect differential transcript usage in transcriptomes with few isoforms per gene and that isoform-level prefiltering can considerably improve false discovery rate control.

CONCLUSION:

Count-based methods generally perform well in the detection of differential transcript usage. Controlling the false discovery rate at the imposed threshold is difficult, particularly in complex organisms, but can be improved by prefiltering the annotation catalog.
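
As an illustration of what isoform-level prefiltering of an annotation catalog might look like, the following is a minimal Python sketch, not the authors' implementation. It assumes per-sample isoform abundance estimates (for example TPM) are already available, and it keeps an isoform only if its share of its gene's total abundance reaches a threshold in at least one sample. The function name `prefilter_isoforms`, the input layout, and the 5% default threshold are all assumptions made for this example; the abstract does not specify the filtering criterion.

```python
from collections import defaultdict


def prefilter_isoforms(abundance, gene_of, min_fraction=0.05):
    """Return the set of isoform IDs that pass a relative-abundance filter.

    abundance    : dict mapping isoform ID -> list of per-sample abundances
    gene_of      : dict mapping isoform ID -> gene ID
    min_fraction : hypothetical threshold (assumed here, not from the abstract)

    An isoform is kept if, in at least one sample, it contributes at least
    `min_fraction` of the total abundance of its gene.
    """
    if not abundance:
        return set()

    n_samples = len(next(iter(abundance.values())))

    # Sum isoform abundances per gene, separately for each sample.
    gene_totals = defaultdict(lambda: [0.0] * n_samples)
    for iso, values in abundance.items():
        totals = gene_totals[gene_of[iso]]
        for i, value in enumerate(values):
            totals[i] += value

    # Keep isoforms whose within-gene fraction reaches the threshold
    # in any sample; samples where the gene is unexpressed are skipped.
    keep = set()
    for iso, values in abundance.items():
        totals = gene_totals[gene_of[iso]]
        for value, total in zip(values, totals):
            if total > 0 and value / total >= min_fraction:
                keep.add(iso)
                break
    return keep


# Toy data: gene A has three isoforms, gene B has one.
toy_abundance = {
    "tA1": [90.0, 95.0],
    "tA2": [10.0, 4.0],
    "tA3": [0.0, 1.0],
    "tB1": [5.0, 5.0],
}
toy_gene_of = {"tA1": "A", "tA2": "A", "tA3": "A", "tB1": "B"}
kept = prefilter_isoforms(toy_abundance, toy_gene_of)
```

In this toy example `tA3` never exceeds 1% of gene A's abundance and would be dropped from the catalog before counting bins are defined; the remaining isoforms would then be passed on to the count-based analysis.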

PMID: 26813113
PMCID: PMC4729156
DOI: 10.1186/s13059-015-0862-3
[Indexed for MEDLINE]
Free PMC Article