Genome Res. 2018 Nov;28(11):1709-1719. doi: 10.1101/gr.235119.118. Epub 2018 Oct 23.

Targeted genotyping of variable number tandem repeats with adVNTR.

Author information

1. Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California 92093, USA.
2. Department of Medicine, University of California, San Diego, La Jolla, California 92093, USA.
3. Department of Pediatrics, University of California, San Diego, La Jolla, California 92093, USA.

Abstract

Whole-genome sequencing is increasingly used to identify Mendelian variants in clinical pipelines. These pipelines focus on single-nucleotide variants (SNVs) and structural variants while ignoring more complex repeat sequence variants. Here, we consider the problem of genotyping variable number tandem repeats (VNTRs), which are composed of inexact tandem duplications of short (6-100 bp) repeating units. VNTRs span 3% of the human genome, are frequently present in coding regions, and have been implicated in multiple Mendelian disorders. Although existing tools recognize VNTR-carrying sequence, genotyping VNTRs (determining repeat unit count and sequence variation) from whole-genome sequencing reads remains challenging. We describe a method, adVNTR, that uses hidden Markov models to model each VNTR, count repeat units, and detect sequence variation. adVNTR models can be developed for short-read (Illumina) and single-molecule (Pacific Biosciences [PacBio]) whole-genome and whole-exome sequencing, and they show good results on multiple simulated and real data sets.
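To make the repeat-unit counting task concrete, the following is a toy sketch and not adVNTR's method. It replaces the hidden Markov model with a much simpler Hamming-distance match, so it tolerates substitutions but not insertions or deletions. The function names, the `max_mismatches` parameter, and the example sequence are all hypothetical, chosen only for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def count_repeat_units(seq: str, unit: str, max_mismatches: int = 1) -> int:
    """Length of the longest run of back-to-back inexact copies of `unit`.

    Simplifying assumption (NOT the adVNTR model): every copy has exactly
    len(unit) bases and may differ from `unit` by at most `max_mismatches`
    substitutions. Indels are not handled.
    """
    if not unit:
        raise ValueError("unit must be a non-empty string")
    k = len(unit)
    best = 0
    # Try every start offset, then extend the run one unit at a time.
    for start in range(len(seq) - k + 1):
        run = 0
        pos = start
        while pos + k <= len(seq) and hamming(seq[pos:pos + k], unit) <= max_mismatches:
            run += 1
            pos += k
        best = max(best, run)
    return best


# Hypothetical read: flanking bases around three copies of a 6 bp unit,
# the middle copy carrying a single substitution.
read = "TTT" + "ACGTAC" + "ACGTAA" + "ACGTAC" + "GGG"
copies = count_repeat_units(read, "ACGTAC", max_mismatches=1)
```

With one mismatch allowed, the three copies are counted as a single run. With `max_mismatches=0`, the variant middle copy breaks the run. Handling indels and per-position sequence variation is where a probabilistic model such as an HMM becomes useful.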

PMID: 30352806
PMCID: PMC6211647
DOI: 10.1101/gr.235119.118
[Indexed for MEDLINE]
Free PMC Article
