Illumina MiSeq sequencing disfavours a sequence motif in the GFP reporter gene

Sci Rep. 2016 May 19;6:26314. doi: 10.1038/srep26314.

Abstract

Green fluorescent protein (GFP) is one of the most used reporter genes. We have used next-generation sequencing (NGS) to analyse the genetic diversity of a recombinant influenza A virus that expresses GFP and found a remarkable coverage dip in the GFP coding sequence. This coverage dip was present when virus-derived RT-PCR product or the parental plasmid DNA was used as starting material for NGS and regardless of whether Nextera XT transposase or Covaris shearing was used for DNA fragmentation. Therefore, the sequence coverage dip in the GFP coding sequence was not the result of emerging GFP mutant viruses or a bias introduced by Nextera XT fragmentation. Instead, we found that the Illumina MiSeq sequencing method disfavours the 'CCCGCC' motif in the GFP coding sequence.
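To check whether a reporter construct contains the motif named in the abstract, one option is to scan the reference sequence for 'CCCGCC' on both strands. The following is a minimal sketch and not the authors' analysis pipeline: the function names and the toy sequence are hypothetical, and it assumes an uppercase or lowercase DNA string containing only A, C, G and T.

```python
# Illustrative sketch only: the helper names and the toy sequence are
# assumptions, not taken from the paper.
MOTIF = "CCCGCC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif=MOTIF):
    """Return sorted (position, strand) hits for the motif.

    Positions are 0-based offsets on the forward strand. A '-' hit means
    the motif lies on the reverse strand, i.e. its reverse complement
    appears in the forward sequence at that offset.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            # Advance by one base so overlapping matches are also reported.
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    toy_sequence = "ATGCCCGCCAAAGGCGGGTT"  # made-up sequence, not GFP
    for position, strand in find_motif(toy_sequence):
        print(position, strand)
```

Comparing the reported positions with a per-base coverage track for the same reference would show whether dips coincide with motif sites. That comparison is left out here because it depends on the alignment tooling in use.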

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genes, Reporter
  • Genetic Variation
  • Genome, Viral
  • Green Fluorescent Proteins / genetics*
  • High-Throughput Nucleotide Sequencing / methods*
  • Influenza A virus / genetics*
  • Reverse Transcriptase Polymerase Chain Reaction
  • Sequence Analysis, DNA

Substances

  • Green Fluorescent Proteins