Intra-individual polymorphism in chloroplasts from NGS data: where does it come from and how to handle it?

Mol Ecol Resour. 2016 Mar;16(2):434-45. doi: 10.1111/1755-0998.12462. Epub 2015 Sep 20.

Abstract

Next-generation sequencing (NGS) gives access to large quantities of genomic data. In plants, several studies have used whole chloroplast genome sequences to infer phylogeography or phylogeny. Even though the chloroplast is a haploid organelle, NGS plastome data have revealed a non-negligible number of intra-individual polymorphic SNPs. Such observations could have several causes, such as sequencing errors, heteroplasmy, or the transfer of chloroplast sequences into the nuclear and mitochondrial genomes. The occurrence of allelic diversity has important practical consequences for the identification of diversity and the analysis of chloroplast data and, beyond that, raises significant evolutionary questions. In this study, we show that the observed intra-individual polymorphism in chloroplast sequence data is probably the result of plastid DNA transferred into the mitochondrial and/or nuclear genomes. We further assess the error rates of nine bioinformatics pipelines for SNP and genotype calling, using SNPs identified by Sanger sequencing as the reference. Some pipelines handle this issue adequately, optimizing both specificity and sensitivity. Our results will allow proper use of whole-chloroplast NGS sequences and better handling of NGS chloroplast sequence diversity.
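
The abstract does not say how the error rates were computed. As a rough illustration of what comparing a pipeline's SNP calls against Sanger-validated sites can look like, the Python sketch below computes sensitivity and specificity from sets of positions. The function name, the set-based representation, and the example positions are all hypothetical and are not taken from the study.

```python
def sensitivity_specificity(called, truth_variant, truth_invariant):
    """Compare one pipeline's SNP calls with a validated reference set.

    called          -- positions the pipeline reports as SNPs
    truth_variant   -- positions confirmed as SNPs (e.g. by Sanger sequencing)
    truth_invariant -- positions confirmed as non-variant
    All three are sets of integer positions; this layout is an assumption
    made for illustration only.
    """
    called = set(called)
    tp = len(called & truth_variant)    # true SNPs that were called
    fn = len(truth_variant - called)    # true SNPs that were missed
    fp = len(called & truth_invariant)  # invariant sites wrongly called
    tn = len(truth_invariant - called)  # invariant sites correctly left alone

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical example data, not from the paper.
truth_variant = {101, 250, 777, 1200}
truth_invariant = {50, 300, 900, 1500, 2000, 2500}
pipeline_calls = {101, 250, 900, 1200}

sens, spec = sensitivity_specificity(pipeline_calls, truth_variant, truth_invariant)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Under this kind of scheme, a pipeline that wrongly reports transferred plastid copies as chloroplast SNPs would lose specificity, while one that filters too aggressively would lose sensitivity.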

Keywords: Chloroplast; NGS; SNP; gatk; intra-individual polymorphism; samtools.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chloroplasts / genetics*
  • Computational Biology
  • DNA, Chloroplast / chemistry
  • DNA, Chloroplast / genetics*
  • Genome, Chloroplast*
  • Genotype
  • High-Throughput Nucleotide Sequencing
  • Polymorphism, Genetic*
  • Polymorphism, Single Nucleotide

Substances

  • DNA, Chloroplast

Associated data

  • Dryad/10.5061/dryad.31733