Sequencing of high G+C microbial genomes using the ultrafast pyrosequencing technology

J Biotechnol. 2011 Aug 20;155(1):68-77. doi: 10.1016/j.jbiotec.2011.04.010. Epub 2011 Apr 23.

Abstract

Next-generation pyrosequencing of high G+C content genomes still poses problems for automated sequencing and assembly processes, necessitating cost- and time-intensive manual work to finish such genomes completely. The high G+C actinomycete Actinoplanes sp. SE50/110 was sequenced with standard pyrosequencing technology (454 Life Sciences), which revealed a high number of gaps. The reasons for the introduction of gaps were analyzed on a previously known 41 kb DNA reference sequence from Actinoplanes sp. SE50/110 that hosts the acarbose biosynthesis gene cluster. Mapping the sequencing results onto the reference gene cluster sequence revealed a fragmentation into 30 contiguous sequences of different lengths. The gaps between these sequences were characterized by extremely low read coverage, which correlated strongly and negatively with the G+C content of the gap regions. Furthermore, the gap sequences contained strong stem-loop structures that hindered their amplification during the emulsion PCR. Because these sequences are significantly underrepresented or absent in the subsequent sequencing process, they lead to weakly covered or uncovered genomic regions, which force the assembly algorithm to output multiple contiguous sequences instead of one finished genome. However, by applying a different pyrosequencing protocol, it was possible to sequence the complete acarbose biosynthesis gene cluster. The changes to the protocol include a longer read length and the addition of chemicals to the amplification chemistry, which reduce the self-annealing of DNA fragments during the amplification process and enable the complete reconstruction of high G+C content genomes without manual intervention.
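
The relationship between G+C content and read coverage described in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis pipeline: the function names, the non-overlapping window scheme, the window size, and the toy sequence and depth values are all assumptions made up for illustration. The sketch computes the G+C fraction and mean read depth per window of a reference sequence and then the Pearson correlation between the two.

```python
from math import sqrt


def gc_fraction(seq):
    """Return the fraction of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def window_stats(reference, coverage, window):
    """Return (G+C fraction, mean depth) for each non-overlapping window.

    `coverage` holds one read-depth value per reference position.
    A trailing partial window is ignored.
    """
    if len(reference) != len(coverage):
        raise ValueError("reference and coverage must have equal length")
    stats = []
    for start in range(0, len(reference) - window + 1, window):
        chunk = reference[start:start + window]
        depth = coverage[start:start + window]
        stats.append((gc_fraction(chunk), sum(depth) / window))
    return stats


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Synthetic toy data (not from the study): depth drops as G+C rises.
toy_reference = "ATATATAT" + "GCGCATAT" + "GCGCGCGC"
toy_coverage = [30] * 8 + [15] * 8 + [2] * 8

toy_stats = window_stats(toy_reference, toy_coverage, window=8)
gc_values = [gc for gc, _ in toy_stats]
depth_values = [depth for _, depth in toy_stats]
toy_r = pearson(gc_values, depth_values)
```

With real data, the per-position depth would come from a read mapping against the reference, and a strongly negative coefficient would be consistent with the pattern the abstract reports for the gap regions.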

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Composition*
  • Base Sequence
  • Chromosome Mapping / methods
  • DNA, Bacterial / chemistry*
  • DNA, Bacterial / genetics
  • Genes, Bacterial
  • Genome, Bacterial*
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing / methods*
  • Micromonosporaceae / genetics
  • Multigene Family
  • Nucleic Acid Conformation
  • Sequence Analysis, DNA / methods*

Substances

  • DNA, Bacterial