Nucleic Acids Res. 2007;35(20):6798-807. Epub 2007 Oct 10.

CG dinucleotide clustering is a species-specific property of the genome.

Author information

Department of Molecular Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington, KY 40536, USA.

Abstract

Cytosines at cytosine-guanine (CG) dinucleotides are the near-exclusive target of DNA methyltransferases in mammalian genomes. Spontaneous deamination of methylcytosine to thymine makes methylated cytosines unusually susceptible to mutation and consequent depletion. The loci where CG dinucleotides remain relatively enriched, presumably due to their unmethylated status during the germ cell cycle, have been referred to as CpG islands. Currently, CpG islands are solely defined by base compositional criteria, allowing annotation of any sequenced genome. Using a novel bioinformatic approach, we show that CG clusters can be identified as an inherent property of genomic sequence without imposing a base compositional a priori assumption. We also show that the CG clusters co-localize in the human genome with hypomethylated loci and annotated transcription start sites to a greater extent than annotations produced by prior CpG island definitions. Moreover, this new approach allows CG clusters to be identified in a species-specific manner, revealing a degree of orthologous conservation that is not revealed by current base compositional approaches. Finally, our approach is able to identify methylating genomes (such as Takifugu rubripes) that lack CG clustering entirely, in which it is inappropriate to annotate CpG islands or CG clusters.
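For readers unfamiliar with what "base compositional criteria" means in practice, the following is a minimal sketch of a conventional CpG island test applied to a single sequence window. It is not the clustering approach described in the abstract. The function names are hypothetical, and the thresholds (minimum length 200 bp, GC content above 0.5, observed/expected CG ratio above 0.6) are commonly used values assumed here for illustration; they do not come from the abstract.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cg_obs_exp(seq: str) -> float:
    """Observed/expected CG dinucleotide ratio: (#CG * N) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    # "CG" cannot overlap with itself, so str.count gives the true count.
    return seq.count("CG") * len(seq) / (c * g)


def passes_compositional_criteria(seq: str,
                                  min_len: int = 200,    # assumed threshold
                                  min_gc: float = 0.5,   # assumed threshold
                                  min_oe: float = 0.6) -> bool:  # assumed threshold
    """True if the window meets all three illustrative thresholds."""
    return (len(seq) >= min_len
            and gc_content(seq) > min_gc
            and cg_obs_exp(seq) > min_oe)
```

Because such a test depends only on fixed thresholds, it can be applied to any sequenced genome, including one that shows no CG clustering at all, which is the limitation the abstract raises.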

PMID: 17932072
PMCID: PMC2175314
DOI: 10.1093/nar/gkm489
[Indexed for MEDLINE]
Free PMC Article
