Predicting crossover generation in DNA shuffling

Proc Natl Acad Sci U S A. 2001 Mar 13;98(6):3226-31. doi: 10.1073/pnas.051631498.

Abstract

We introduce a quantitative framework for assessing the generation of crossovers in DNA shuffling experiments. The approach uses free energy calculations and complete sequence information to model the annealing process. Statistics obtained for the annealing events are then combined with a reassembly algorithm to infer crossover allocation in the reassembled sequences. Two quantities are estimated: the fraction of reassembled sequences containing zero, one, two, or more crossovers, and the probability that a given nucleotide position in a reassembled sequence is the site of a crossover event. Comparisons of the predictions against experimental data for five example systems demonstrate good agreement even though no adjustable parameters are used. An in silico case study of a set of 12 subtilases examines the effect of fragmentation length, annealing temperature, sequence identity, and number of shuffled sequences on the number, type, and distribution of crossovers. The results provide computational verification of crossover aggregation in regions of near-perfect sequence identity and of the presence of synergistic reassembly in family DNA shuffling.
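The two reported quantities can be illustrated with a minimal Python sketch. This is not the authors' reassembly algorithm, which the abstract does not specify. It assumes, hypothetically, that each reassembled sequence is already available as a string of parent labels (one label per position) and that competing annealing choices are weighted by a plain Boltzmann factor of their free energies. All function names and the toy inputs are illustrative.

```python
import math
from collections import Counter

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def boltzmann_weights(delta_g, temperature_k):
    """Turn annealing free energies (kcal/mol) into normalized weights.

    Assumption: competing annealing events are weighted by exp(-dG/RT);
    the paper's actual statistical treatment may differ.
    """
    factors = [math.exp(-g / (R_KCAL * temperature_k)) for g in delta_g]
    total = sum(factors)
    return [f / total for f in factors]


def count_crossovers(parent_path):
    """Count parent switches along one reassembled sequence."""
    return sum(1 for a, b in zip(parent_path, parent_path[1:]) if a != b)


def crossover_distribution(paths):
    """Fraction of sequences with 0, 1, 2, ... crossovers."""
    counts = Counter(count_crossovers(p) for p in paths)
    return {k: counts[k] / len(paths) for k in sorted(counts)}


def crossover_site_frequency(paths):
    """Fraction of sequences that switch parent between positions i and i+1."""
    length = len(paths[0])
    return [
        sum(1 for p in paths if p[i] != p[i + 1]) / len(paths)
        for i in range(length - 1)
    ]


# Toy ensemble of four reassembled sequences from parents A and B.
toy_paths = ["AAAA", "AABB", "ABAB", "AABB"]
distribution = crossover_distribution(toy_paths)
site_frequency = crossover_site_frequency(toy_paths)
weights = boltzmann_weights([-10.0, -8.0], 328.15)
```

In this toy ensemble, one sequence has no crossover, two have one, and one has three, and most switches fall at the middle junction. A real calculation would generate the ensemble from the annealing statistics rather than list it by hand.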

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Cephalosporinase / genetics
  • Computer Simulation*
  • Crossing Over, Genetic*
  • DNA*
  • Evolution, Molecular*
  • Humans
  • Interleukin-1 / genetics
  • Mice
  • Models, Genetic*
  • Subtilisins / genetics

Substances

  • Interleukin-1
  • DNA
  • Subtilisins
  • Cephalosporinase