Detection of Low-Frequency Mutations and Identification of Heat-Induced Artifactual Mutations Using Duplex Sequencing

Int J Mol Sci. 2019 Jan 8;20(1):199. doi: 10.3390/ijms20010199.

Abstract

We present a comprehensive, genome-wide comparison of three sequencing methods (conventional next-generation sequencing (NGS), tag-based single strand sequencing (e.g., SSCS), and Duplex Sequencing) for investigating mitochondrial mutations in human breast epithelial cells. Duplex Sequencing yields both a single strand consensus sequence (SSCS) analysis and a duplex consensus sequence (DCS) analysis. Our study confirms that although high-frequency mutations are detectable by all three sequencing methods with similar accuracy and reproducibility, rare (low-frequency) mutations are not accurately detectable by NGS or SSCS. Even with conservative bioinformatic modifications to overcome the high error rate of NGS, the NGS frequency of rare mutations is 7.0 × 10⁻⁴. The frequency is reduced to 1.3 × 10⁻⁴ with SSCS and is further reduced to 1.0 × 10⁻⁵ using DCS. Rare mutation context spectra obtained from NGS vary significantly across independent experiments, and it is not possible to identify a dominant mutation context. In contrast, rare mutation context spectra are consistently similar across all independent DCS experiments. We systematically identified heat-induced artifactual variants and corrected the artifacts using Duplex Sequencing. Specific sequence contexts were analyzed to examine the effects of neighboring bases on the accumulation of heat-induced artifactual variants. All of these artifacts are stochastically occurring rare mutations. C > A/G > T, a signature of oxidative damage, is the heat-induced artifactual mutation type that increases the most (170-fold). Our results strongly support the claim that Duplex Sequencing accurately detects low-frequency mutations and identifies and corrects artifactual mutations introduced by heating during DNA preparation.
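
The three rare-mutation frequencies reported above can be compared directly. The following is a minimal sketch, not part of the published analysis: it takes only the three values stated in the abstract and derives the fold reduction between methods. The dictionary and function names are hypothetical, and the fold values are simple ratios that the abstract itself does not state.

```python
# Rare-mutation frequencies as reported in the abstract.
rare_mutation_frequency = {
    "NGS": 7.0e-4,   # conventional next-generation sequencing
    "SSCS": 1.3e-4,  # single strand consensus sequence
    "DCS": 1.0e-5,   # duplex consensus sequence
}


def fold_reduction(before: str, after: str) -> float:
    """Return how many times lower the `after` frequency is than `before`.

    Hypothetical helper for illustration; it is just the ratio of the two
    reported frequencies.
    """
    return rare_mutation_frequency[before] / rare_mutation_frequency[after]


if __name__ == "__main__":
    for before, after in [("NGS", "SSCS"), ("SSCS", "DCS"), ("NGS", "DCS")]:
        ratio = fold_reduction(before, after)
        print(f"{before} -> {after}: {ratio:.1f}-fold lower")
```

Under these numbers, the apparent rare-mutation frequency is roughly 5.4-fold lower with SSCS than with NGS, 13-fold lower with DCS than with SSCS, and 70-fold lower with DCS than with NGS.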

Keywords: duplex consensus sequence (DCS); duplex sequencing; heat-induced mutations; human breast cells; mitochondrial DNA; next-generation sequencing (NGS); oxidative DNA damage; rare mutations; sequencing error; single strand consensus sequence (SSCS).

MeSH terms

  • Adult
  • Artifacts
  • Cell Line
  • Consensus Sequence
  • DNA, Mitochondrial / genetics
  • Genome, Mitochondrial
  • High-Throughput Nucleotide Sequencing / methods*
  • Hot Temperature*
  • Humans
  • Point Mutation / genetics*
  • Reproducibility of Results
  • Stochastic Processes
  • Young Adult

Substances

  • DNA, Mitochondrial