Nucleic Acids Res. 2015 Feb 18;43(3):e19. doi: 10.1093/nar/gku1211. Epub 2014 Nov 26.

BreaKmer: detection of structural variation in targeted massively parallel sequencing data using kmers.

Author information

1. Center for Cancer Genome Discovery and Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02215, USA.
2. Department of Pathology, Brigham and Women's Hospital, Boston, MA 02215, USA.
3. Center for Cancer Genome Discovery and Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02215, USA; Broad Institute of Harvard and MIT, Cambridge, MA 02141, USA.
4. Center for Cancer Genome Discovery and Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02215, USA; Department of Pathology, Brigham and Women's Hospital, Boston, MA 02215, USA; Broad Institute of Harvard and MIT, Cambridge, MA 02141, USA.
5. Center for Cancer Genome Discovery and Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02215, USA; Department of Pathology, Brigham and Women's Hospital, Boston, MA 02215, USA. laura_macconaill@dfci.harvard.edu

Abstract

Genomic structural variation (SV), a common hallmark of cancer, has important predictive and therapeutic implications. However, accurately detecting SV using high-throughput sequencing data remains challenging, especially for 'targeted' resequencing efforts. This challenge is critically important in the clinical setting, where targeted resequencing is frequently applied to assess clinically actionable mutations in tumor biopsies rapidly and cost-effectively. We present BreaKmer, a novel approach that uses a 'kmer' strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SV, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants. Relative to four publicly available algorithms, BreaKmer detected SV with increased sensitivity and made few calls in non-tumor samples, key features for variant analysis of tumor specimens in both the clinical and research settings.
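The abstract does not spell out the details of the 'kmer' strategy, so the following is only a minimal sketch of one way such a strategy could work, not BreaKmer's actual implementation. It assumes that kmers found in abnormally aligned reads but absent from the reference target sequence mark reads spanning a breakpoint, and that those reads are the ones passed on to assembly. The function names (`kmers`, `novel_kmers`, `reads_with_novel_kmers`), the toy sequences and the choice of k = 4 are all hypothetical, and reverse complements are ignored for brevity.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (kmers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_kmers(reads, reference, k):
    """Return kmers present in the reads but absent from the reference."""
    reference_kmers = kmers(reference, k)
    sample_kmers = set()
    for read in reads:
        sample_kmers |= kmers(read, k)
    return sample_kmers - reference_kmers


def reads_with_novel_kmers(reads, reference, k):
    """Keep reads that contain at least one non-reference kmer.

    Under the assumption stated above, these reads are candidates
    for assembly into a consensus sequence spanning a breakpoint.
    """
    novel = novel_kmers(reads, reference, k)
    return [read for read in reads if kmers(read, k) & novel]


# Toy data (hypothetical): the second read skips "CC" relative to
# the reference, mimicking a read that spans a small deletion.
reference = "AAACCCGGGTTT"
reads = ["AACCCGGG", "AAACGGGT"]

candidates = reads_with_novel_kmers(reads, reference, k=4)
print(sorted(novel_kmers(reads, reference, k=4)))
print(candidates)
```

In this toy case only the second read contributes non-reference kmers, so it alone would be retained. A real pipeline would additionally need to handle both strands, sequencing errors and base qualities, and a realignment step for the assembled consensus.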

PMID: 25428359
PMCID: PMC4330340
DOI: 10.1093/nar/gku1211
[Indexed for MEDLINE]
Free PMC Article
