BMC Genomics. 2017 Jan 5;18(1):16. doi: 10.1186/s12864-016-3449-9.

INDELseek: detection of complex insertions and deletions from next-generation sequencing data.

Author information

1. Division of Molecular Pathology, Department of Pathology, Hong Kong Sanatorium & Hospital, Happy Valley, Hong Kong SAR.
2. Department of Medicine, The University of Hong Kong, Pok Fu Lam, Hong Kong SAR.
3. Department of Surgery, The University of Hong Kong, Pok Fu Lam, Hong Kong SAR.
4. Department of Surgery and Cancer Genetics Center, Hong Kong Sanatorium & Hospital, Happy Valley, Hong Kong SAR.
5. Hong Kong Hereditary Breast Cancer Family Registry, Shau Kei Wan, Hong Kong SAR.
6. Division of Molecular Pathology, Department of Pathology, Hong Kong Sanatorium & Hospital, Happy Valley, Hong Kong SAR. eskma@hksh.com.

Abstract

BACKGROUND:

Complex insertions and deletions (indels) in next-generation sequencing (NGS) data are prone to escape detection by currently available variant callers, as shown by large-scale human genomics studies. As a result, somatic and germline complex indels in key disease driver genes could be missed in NGS-based genomics studies.

RESULTS:

INDELseek is an open-source complex indel caller designed for NGS data of random fragments and PCR amplicons. The key differentiating factor of INDELseek is that each NGS read alignment is examined as a whole, instead of as a "pileup" of each reference position across multiple alignments. In benchmarking against the reference material NA12878 genome (n = 160 derived from high-confidence variant calls), GATK, SAMtools and INDELseek showed complex indel detection sensitivities of 0%, 0% and 100%, respectively. INDELseek also detected all known germline (BRCA1 and BRCA2) and somatic (CALR and JAK2) complex indels in human clinical samples (n = 8). Further experiments validated all 10 detected KIT complex indels in a discovery cohort of clinical samples. In silico semi-simulation showed sensitivities of 93.7-96.2% based on 8671 unique complex indels in >5000 genes from dbSNP and COSMIC. We also demonstrated the importance of complex indel detection in accurately annotating BRCA1, BRCA2 and TP53 mutations with gained or rescued protein-truncating effects.
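The idea of examining one read alignment as a whole, rather than one reference position at a time, can be illustrated with a minimal Python sketch. This is not INDELseek's actual code: the function names, the `max_gap` clustering threshold, and the rule that a cluster needs at least two events including one indel are all assumptions made for illustration. The sketch walks the CIGAR string of a single alignment, lists every difference from the reference, and merges nearby differences in that same read into one complex indel.

```python
import re
from typing import List, Optional, Tuple

# (ref_begin, ref_end, alt_bases): half-open, 0-based reference interval.
Event = Tuple[int, int, str]
# (ref_begin, ref_allele, alt_allele) for one merged complex indel.
Call = Tuple[int, str, str]

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def read_events(pos: int, cigar: str, read_seq: str, ref_seq: str) -> List[Event]:
    """List the differences of ONE alignment from the reference.

    Assumes ref_seq starts at reference coordinate 0 (illustrative only).
    """
    events: List[Event] = []
    r, q = pos, 0  # reference coordinate, read offset
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":  # aligned bases: record mismatches
            for i in range(n):
                if read_seq[q + i] != ref_seq[r + i]:
                    events.append((r + i, r + i + 1, read_seq[q + i]))
            r += n
            q += n
        elif op == "I":  # insertion: consumes read only
            events.append((r, r, read_seq[q:q + n]))
            q += n
        elif op == "D":  # deletion: consumes reference only
            events.append((r, r + n, ""))
            r += n
        elif op == "N":  # skipped region: consumes reference only
            r += n
        elif op == "S":  # soft clip: consumes read only
            q += n
        # H and P consume neither read nor reference
    return events


def _merge(cluster: List[Event], ref_seq: str) -> Optional[Call]:
    """Merge a cluster into one call if it has >= 2 events and >= 1 indel."""
    has_indel = any((end - begin) != len(alt) for begin, end, alt in cluster)
    if len(cluster) < 2 or not has_indel:
        return None
    begin, end = cluster[0][0], cluster[-1][1]
    alt_parts, cursor = [], begin
    for ev_begin, ev_end, alt in cluster:
        alt_parts.append(ref_seq[cursor:ev_begin])  # unchanged bases in between
        alt_parts.append(alt)
        cursor = ev_end
    return (begin, ref_seq[begin:end], "".join(alt_parts))


def complex_indels(events: List[Event], ref_seq: str, max_gap: int = 5) -> List[Call]:
    """Group events of one read that lie within max_gap reference bases."""
    calls: List[Call] = []
    cluster: List[Event] = []
    for ev in events:
        if cluster and ev[0] - cluster[-1][1] > max_gap:
            call = _merge(cluster, ref_seq)
            if call:
                calls.append(call)
            cluster = []
        cluster.append(ev)
    if cluster:
        call = _merge(cluster, ref_seq)
        if call:
            calls.append(call)
    return calls


ref = "ACGTACGTACGT"
events = read_events(0, "3M2D2M2I5M", "ACGCGTTTACGT", ref)
print(complex_indels(events, ref))  # [(3, 'TACG', 'CGTT')]
```

In this toy read, a 2-base deletion and a 2-base insertion lie two reference bases apart. A per-position view would report them as two separate simple indels, whereas treating the alignment as a unit yields the single substitution of TACG by CGTT.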

CONCLUSIONS:

INDELseek is an accurate and versatile tool for complex indel detection in NGS data. It complements other variant callers in NGS-based genomics studies targeting a wide spectrum of genetic variations.

KEYWORDS:

Bioinformatics; Complex indel; Next-generation sequencing; Variant calling

PMID: 28056804
PMCID: PMC5217656
DOI: 10.1186/s12864-016-3449-9
[Indexed for MEDLINE]
Free PMC Article
