PeerJ. 2013 Jul 23;1:e113. doi: 10.7717/peerj.113. Print 2013.

Improving transcriptome assembly through error correction of high-throughput sequence reads.

Author information

  • California Institute for Quantitative Biosciences, University of California, Berkeley, CA, USA.

Abstract

The study of functional genomics, particularly in non-model organisms, has been dramatically improved over the last few years by the use of transcriptomes and RNAseq. While these studies are potentially extremely powerful, a computationally intensive procedure, the de novo construction of a reference transcriptome, must be completed as a prerequisite to further analyses. An accurate reference is critically important because all downstream steps, including the estimation of transcript abundance, depend on it. Though a substantial amount of research has been done on assembly, only recently have the pre-assembly procedures been studied in detail. Specifically, several stand-alone error correction modules have been reported and, while they have been shown to be effective in reducing errors at the level of sequencing reads, how error correction impacts assembly accuracy is largely unknown. Here, we show, using a simulated and an empirical dataset, that applying error correction to sequencing reads has significant positive effects on assembly accuracy and should be applied to all datasets. A complete collection of commands that allows for the production of Reptile-corrected reads is available at https://github.com/macmanes/error_correction/tree/master/scripts and as File S1.

KEYWORDS:

De novo assembly; Error correction; Illumina; RNAseq; Trinity

PMID:
23904992
PMCID:
PMC3728768
DOI:
10.7717/peerj.113