SNP discovery by transcriptome pyrosequencing

Methods Mol Biol. 2011:729:225-46. doi: 10.1007/978-1-61779-065-2_15.

Abstract

Single nucleotide polymorphisms (SNPs) are single base differences between haplotypes. SNPs are abundant in many species and valuable as markers for genetic map construction, modern molecular breeding programs, and quantitative genetic studies. SNPs are readily mined from genomic DNA or cDNA sequence obtained from individuals representing two or more distinct genotypes. While automated Sanger sequencing has become less expensive over time, it is still costly to acquire deep Sanger sequence from several genotypes. "Next-generation" DNA sequencing technologies that utilize new chemistries and massively parallel approaches have enabled DNA sequences to be acquired at extremely high depths of coverage, faster and at lower cost than traditional sequencing. One such platform is the Roche/454 Life Sciences GS-FLX Titanium Series, which currently uses pyrosequencing to produce roughly 400-600 million bases of DNA sequence per run (>1 million reads, ~400 bp/read). This chapter discusses the use of high-throughput pyrosequencing for SNP discovery by focusing on 454 sequencing of maize cDNA, the development of a computational pipeline for polymorphism detection, and the subsequent identification of over 7,000 putative SNPs between Mo17 and B73 maize. In addition, the chapter discusses alternative alignment and polymorphism detection strategies that use Illumina short reads, data processing and visualization tools, and reduced representation techniques that limit the sequencing of repetitive DNA and thereby enable efficient analysis of genome sequence.
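
The abstract does not describe how the polymorphism detection pipeline works, so the following Python sketch is only a generic illustration of the idea of calling putative SNPs between two genotypes from reads aligned to a common reference. It is not the chapter's pipeline. The function name `call_putative_snps`, the pileup layout (a dict from `(contig, position)` to a string of observed bases), and the thresholds `min_depth` and `min_fraction` are all assumptions made for this example.

```python
from collections import Counter


def call_putative_snps(pileup_a, pileup_b, min_depth=3, min_fraction=0.9):
    """Report sites where two genotypes carry different, well-supported bases.

    pileup_a, pileup_b: dict mapping (contig, position) to a string of the
    bases observed in reads from each genotype at that reference position.
    The thresholds are illustrative assumptions, not published values.
    """
    snps = []
    # Only sites covered in both genotypes can be compared.
    for site in sorted(pileup_a.keys() & pileup_b.keys()):
        alleles = []
        for bases in (pileup_a[site], pileup_b[site]):
            # Ignore ambiguous calls such as N or gap characters.
            counts = Counter(b for b in bases.upper() if b in "ACGT")
            depth = sum(counts.values())
            if depth == 0 or depth < min_depth:
                break  # too little coverage to trust this genotype's call
            allele, support = counts.most_common(1)[0]
            if support / depth < min_fraction:
                break  # mixed signal: possible sequencing error or paralog
            alleles.append(allele)
        # A putative SNP requires a confident, differing call in each genotype.
        if len(alleles) == 2 and alleles[0] != alleles[1]:
            snps.append((site[0], site[1], alleles[0], alleles[1]))
    return snps
```

Given pileups for B73 and Mo17 reads, the function returns a list of `(contig, position, allele_in_a, allele_in_b)` tuples. Requiring a minimum depth and a dominant allele in each genotype is one simple way to reduce false positives from sequencing errors; a real pipeline would also need to account for homopolymer errors, which are common in pyrosequencing reads, and for paralogous alignments.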

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Base Sequence
  • Chromosome Mapping
  • Computational Biology
  • DNA, Plant / analysis
  • Expressed Sequence Tags
  • Gene Expression Profiling*
  • Genome, Plant
  • Genotype
  • Haplotypes
  • High-Throughput Nucleotide Sequencing / methods*
  • Molecular Sequence Data
  • Polymorphism, Single Nucleotide*
  • Zea mays / genetics*

Substances

  • DNA, Plant