Bioinformatics. 2017 Nov 15;33(22):3652-3654. doi: 10.1093/bioinformatics/btx489.

LRCstats, a tool for evaluating long reads correction methods.

Author information

1. Department of Mathematics.
2. School of Computing Science, MADD-Gen program, Simon Fraser University, Burnaby, BC V5A1S6, Canada.

Abstract

Motivation:

Third-generation sequencing (TGS) platforms that generate long reads, such as those from PacBio and Oxford Nanopore Technologies, have had a dramatic impact on genomics research. However, despite recent improvements, TGS reads suffer from high error rates, and the development of read correction methods is an active field of research. This motivates the development of tools that can evaluate the accuracy of correction methods for noisy long reads.

Results:

We introduce LRCstats, a tool that measures the accuracy of long-read correction tools. LRCstats takes advantage of long-read simulators that provide, for each simulated read, an alignment to the reference genome segment it originates from, and so does not rely on mapping corrected reads back onto the reference genome. This allows the accuracy of the correction to be measured consistently with the actual errors introduced by the simulation process that generated the noisy reads. We illustrate the usefulness of LRCstats by analyzing the accuracy of four hybrid correction methods for PacBio long reads over three datasets.
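
The idea of scoring a correction against the known origin of each simulated read can be illustrated with a minimal Python sketch. This is not LRCstats's implementation: it assumes the simulator has already told us the reference segment for a read, and it uses a plain Levenshtein distance as a stand-in error measure. The function names and the toy sequences are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a character of a
                curr[j - 1] + 1,           # insert a character of b
                prev[j - 1] + (ca != cb),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def error_rate(read: str, reference_segment: str) -> float:
    """Edit distance normalised by the reference segment length."""
    return edit_distance(read, reference_segment) / len(reference_segment)


# Hypothetical toy data: the true segment is known from the simulation,
# so no mapping of the corrected read onto the genome is needed.
reference_segment = "ACGTACGTAC"
noisy_read = "ACGTTCGAC"      # one substitution and one deletion
corrected_read = "ACGTACGAC"  # substitution fixed, deletion remains

before = error_rate(noisy_read, reference_segment)
after = error_rate(corrected_read, reference_segment)
print(f"error rate before correction: {before:.2f}")
print(f"error rate after correction:  {after:.2f}")
```

Comparing the two rates for the same read shows whether a correction tool reduced the simulated errors, without any dependence on a read mapper.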

Availability and implementation:

https://github.com/cchauve/lrcstats.

Contact:

laseanl@sfu.ca or cedric.chauve@sfu.ca.

Supplementary information:

Supplementary data are available at Bioinformatics online.

PMID:
29036421
DOI:
10.1093/bioinformatics/btx489
[Indexed for MEDLINE]
