Genome-wide study of correlations between genomic features and their relationship with the regulation of gene expression

DNA Res. 2015 Feb;22(1):109-19. doi: 10.1093/dnares/dsu044. Epub 2015 Jan 27.

Abstract

A broad class of problems in genetics and epigenetics can be reduced to the study of features distributed over the genome (genome tracks). Rapid and efficient processing of the huge amounts of data stored in genome-scale databases cannot be achieved without software packages based on analytical criteria. However, the strong inhomogeneity of genome tracks hampers the development of relevant statistics. We developed criteria for assessing genome track inhomogeneity and correlations between two genome tracks. We also developed a software package, Genome Track Analyzer, based on this theory. The theory and software were tested on simulated data and applied to the study of correlations between CpG islands and transcription start sites in the Homo sapiens genome, between profiles of protein-binding sites in chromosomes of Drosophila melanogaster, and between DNA double-strand breaks and histone marks in the H. sapiens genome. Significant correlations between transcription start sites on the forward and reverse strands were observed in the genomes of D. melanogaster, Caenorhabditis elegans, Mus musculus, H. sapiens, and Danio rerio. The observed correlations may be related to the regulation of gene expression in eukaryotes. Genome Track Analyzer is freely available at http://ancorr.eimb.ru/.
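
To give a concrete sense of what testing for a correlation between two genome tracks can involve, the following is a minimal illustrative sketch. It is not the statistic developed in the paper and not the implementation in Genome Track Analyzer. It assumes each track is a list of point positions on a single chromosome, uses the mean distance from each point of one track to the nearest point of the other as the test statistic, and builds a null distribution by circularly shifting one track, a generic permutation approach. All function names, parameters, and the toy coordinates are hypothetical.

```python
import bisect
import random


def nearest_distance(pos, sorted_ref):
    """Linear distance from pos to the closest position in sorted_ref (non-empty)."""
    i = bisect.bisect_left(sorted_ref, pos)
    candidates = []
    if i < len(sorted_ref):
        candidates.append(sorted_ref[i] - pos)      # nearest point at or to the right
    if i > 0:
        candidates.append(pos - sorted_ref[i - 1])  # nearest point to the left
    return min(candidates)


def mean_nearest_distance(track_a, track_b):
    """Mean distance from each point of track_a to its nearest point in track_b."""
    ref = sorted(track_b)
    return sum(nearest_distance(p, ref) for p in track_a) / len(track_a)


def circular_shift_test(track_a, track_b, chrom_length, n_shifts=1000, seed=0):
    """Empirical p-value for 'track_a lies closer to track_b than expected'.

    The null distribution comes from shifting track_b by a random offset with
    wrap-around, which preserves the spacing between its points (up to the wrap)
    while breaking its alignment with track_a. Distances themselves are measured
    linearly, without wrap-around; this is a simplification of the sketch.
    """
    rng = random.Random(seed)
    observed = mean_nearest_distance(track_a, track_b)
    hits = 0
    for _ in range(n_shifts):
        offset = rng.randrange(1, chrom_length)
        shifted = [(p + offset) % chrom_length for p in track_b]
        if mean_nearest_distance(track_a, shifted) <= observed:
            hits += 1
    # The +1 terms keep the estimated p-value strictly above zero.
    p_value = (hits + 1) / (n_shifts + 1)
    return observed, p_value


# Toy data (hypothetical coordinates): track_b sits 10 bp downstream of track_a.
track_a = [100, 5000, 9000]
track_b = [110, 5010, 9010]
observed, p_value = circular_shift_test(track_a, track_b, chrom_length=10_000)
print(observed, p_value)
```

A circular shift is used here instead of drawing uniformly random positions because it keeps the clustering within the shifted track, which matters when tracks are strongly inhomogeneous. It remains a rough null model and does not replace the analytical criteria described in the abstract.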

Keywords: bioinformatic tool; epigenetics; gene expression; genome tracks; transcription start sites.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Caenorhabditis elegans
  • CpG Islands / physiology*
  • Databases, Genetic*
  • Drosophila melanogaster
  • Gene Expression Regulation / physiology*
  • Genome-Wide Association Study*
  • Humans
  • Mice
  • Software*
  • Transcription Initiation, Genetic / physiology*
  • Zebrafish