Gigascience. 2017 Aug 1;6(8):1-9. doi: 10.1093/gigascience/gix061.

A recurrence-based approach for validating structural variation using long-read sequencing technology.

Author information

1 Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Ave, Ann Arbor, MI 48109, USA.
2 Department of Human Genetics, University of Michigan, 1241 Catherine St, Ann Arbor, MI 48109, USA.

Abstract

Although numerous algorithms have been developed to identify structural variations (SVs) in genomic sequences, there is a dearth of approaches that can be used to evaluate their results. This is significant because the accurate identification of structural variation remains an outstanding and important problem in genomics. New sequencing technologies that generate longer sequence reads can, in theory, provide direct evidence for all types of SVs regardless of the length of the region they span. However, current efforts to use these data in this manner require large computational resources to assemble the sequences, as well as visual inspection of each region. Here we present VaPoR, a highly efficient algorithm that autonomously validates large SV sets using long-read sequencing data. We assessed the performance of VaPoR on SVs in both simulated and real genomes and report high overall accuracy across different levels of sequencing depth. We show that VaPoR can interrogate a much larger range of SVs while still matching existing methods in terms of false positive validations, and that it provides additional features concerning breakpoint precision and predicted genotype. We further show that VaPoR runs quickly and efficiently without requiring a large processing or assembly pipeline. VaPoR provides a long-read-based validation approach for genomic SVs that requires relatively low read depth and computing resources, and thus will be useful for accurate SV assessment with targeted or low-pass sequencing coverage. The VaPoR software is available at: https://github.com/mills-lab/vapor.

KEYWORDS:

copy number variation; sequence analysis; structural variation

PMID: 28873962
PMCID: PMC5737365
DOI: 10.1093/gigascience/gix061
[Indexed for MEDLINE]
Free PMC Article
