CorGen--measuring and generating long-range correlations for DNA sequence analysis

Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W692-5. doi: 10.1093/nar/gkl234.

Abstract

CorGen is a web server that measures long-range correlations in the base composition of DNA and generates random sequences with the same correlation parameters. Long-range correlations are characterized by a power-law decay of the autocorrelation function of the GC content. Because such correlations are widespread in eukaryotic genomes, accurate null models of eukaryotic DNA in computational biology should incorporate them. For example, the score statistics of sequence alignment and the performance of motif-finding algorithms are significantly affected by genomic long-range correlations. We use an expansion-randomization dynamics to efficiently generate the correlated random sequences. The server is available at http://corgen.molgen.mpg.de.
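To make the measured quantity concrete, the following is a minimal sketch of how a GC-content autocorrelation and its power-law decay could be estimated from a sequence. It is not the CorGen implementation. The function names, the +1/-1 encoding of bases, the notation C(r) ~ A * r^(-alpha), and the log-log least-squares fit are all assumptions made for illustration.

```python
import math

def gc_indicator(seq):
    """Encode a DNA string as +1 for G/C and -1 for A/T.

    Assumption for this sketch: any other symbol (e.g. N) is skipped,
    which shortens the sequence and slightly shifts distances.
    """
    out = []
    for base in seq.upper():
        if base in "GC":
            out.append(1.0)
        elif base in "AT":
            out.append(-1.0)
    return out

def autocorrelation(x, lag):
    """Sample autocovariance of x at a single lag r, using the global mean."""
    n = len(x) - lag
    if n <= 0:
        raise ValueError("lag must be smaller than the sequence length")
    mean = sum(x) / len(x)
    return sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n)) / n

def fit_power_law(lags, values):
    """Fit log C(r) = log A - alpha * log r by ordinary least squares.

    Only strictly positive correlation values are used, because the
    logarithm is undefined otherwise. Returns (alpha, A).
    """
    pts = [(math.log(r), math.log(c)) for r, c in zip(lags, values) if c > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive correlation values")
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    sxx = sum((p[0] - mx) ** 2 for p in pts)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts)
    slope = sxy / sxx
    return -slope, math.exp(my - slope * mx)

if __name__ == "__main__":
    # Hypothetical toy input; real analyses need sequences of many kilobases.
    x = gc_indicator("GGCATTACGCGGATATTTGCGCCGATTAAGC" * 20)
    lags = [1, 2, 4, 8, 16, 32]
    corr = [autocorrelation(x, r) for r in lags]
    for r, c in zip(lags, corr):
        print(r, round(c, 4))
```

On real genomic data the correlation values at large distances are noisy, so a fit like this would normally be restricted to a chosen distance range and averaged over logarithmically spaced bins; the sketch omits both steps.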

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Base Composition
  • Computational Biology / methods
  • Computer Graphics
  • Cytosine / analysis
  • Data Interpretation, Statistical
  • Genomics
  • Guanine / analysis
  • Humans
  • Internet
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*
  • Software*
  • User-Computer Interface

Substances

  • Guanine
  • Cytosine