Genomics. 2019 Jan;111(1):43-49. doi: 10.1016/j.ygeno.2017.12.011. Epub 2017 Dec 18.

Efficiency of PacBio long read correction by 2nd generation Illumina sequencing.

Author information

1. Department of RNA Biology, Institute of Bioorganic Chemistry Polish Academy of Sciences, Noskowskiego 12/14, 60-101 Poznan, Poland; Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University Poznan, Umultowska 89, 61-614 Poznan, Poland.
2. Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University Poznan, Umultowska 89, 61-614 Poznan, Poland.
3. Department of RNA Biology, Institute of Bioorganic Chemistry Polish Academy of Sciences, Noskowskiego 12/14, 60-101 Poznan, Poland.
4. Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University Poznan, Umultowska 89, 61-614 Poznan, Poland. Electronic address: wmk@amu.edu.pl.

Abstract

Long sequencing reads offer unprecedented opportunities for the analysis and reconstruction of complex genomic regions. However, the gain in read length often comes at the cost of quality. Several approaches have therefore been proposed recently to enhance the quality of long sequencing reads (e.g. higher sequencing coverage, hybrid assembly, or sequence correction). A simple and cost-effective approach is to use high-quality 2nd generation sequencing data to improve the quality of long reads. We designed a dedicated testing procedure and selected universal long read correction programs whose output sequences can be used in further genomic and transcriptomic studies. Our results show that HALC is the best choice for correcting long PacBio reads when both read size and quality are the main focus of the analysis. However, the tested tools show some unexpected behaviors, including read trimming and fragmentation.

KEYWORDS:

Illumina; Long read sequencing; NGS sequencing; PacBio; Sequence correction

PMID: 29268960
DOI: 10.1016/j.ygeno.2017.12.011
[Indexed for MEDLINE]
