Proc Natl Acad Sci U S A. Feb 15, 1992; 89(4): 1358–1362.

Over- and under-representation of short oligonucleotides in DNA sequences.


Strand-symmetric relative abundance functionals for di-, tri-, and tetranucleotides are introduced and applied to sequences encompassing a broad phylogenetic range to discern tendencies and anomalies in the occurrences of these short oligonucleotides within and between genomic sequences. For dinucleotides, TA is almost universally under-represented, with the exception of vertebrate mitochondrial genomes, and CG is strongly under-represented in vertebrates and in mitochondrial genomes. The traditional methylation/deamination/mutation hypothesis for the rarity of CG does not adequately account for the observed deficiencies in certain sequences, notably the mitochondrial genomes, yeast, and Neurospora crassa, which lack the standard CpG methylase. Homodinucleotides (AA.TT, CC.GG) and larger homooligonucleotides are over-represented in many organisms, perhaps due to polymerase slippage events. For trinucleotides, GCA.TGC tends to be under-represented in phage, human viral, and eukaryotic sequences, and CTA.TAG is strongly under-represented in many prokaryotic, eukaryotic, and viral sequences. The CCA.TGG triplet is ubiquitously over-represented in human viral and eukaryotic sequences. Among the tetranucleotides, several four-base-pair palindromes tend to be under-represented in phage sequences, probably as a means of restriction avoidance. The tetranucleotide CTAG is observed to be rare in virtually all bacterial genomes and some phage genomes. Explanations for these over- and under-representations in terms of DNA/RNA structures and regulatory mechanisms are considered.
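The abstract does not spell out the functional itself. As a rough illustration only, the sketch below assumes the commonly used odds-ratio form for dinucleotides, rho*(XY) = f*(XY) / (f*(X) f*(Y)), where the starred frequencies are counted over the sequence together with its inverted complement. The function names, and the restriction to an alphabet of A, C, G, and T, are assumptions made for this example and are not taken from the article.

```python
from collections import Counter

# Assumes the input contains only A, C, G, T (no ambiguity codes).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the inverted complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def symmetric_dinucleotide_abundance(seq):
    """Strand-symmetric dinucleotide odds ratios (illustrative sketch).

    Counts bases and dinucleotides on both strands, then returns
    f*(XY) / (f*(X) * f*(Y)) for every dinucleotide whose two bases
    both occur. Values well below 1 suggest under-representation,
    values well above 1 suggest over-representation.
    """
    seq = seq.upper()
    base_counts = Counter()
    pair_counts = Counter()
    # Count each strand separately so no artificial dinucleotide is
    # created at the junction between the strand and its complement.
    for strand in (seq, reverse_complement(seq)):
        base_counts.update(strand)
        pair_counts.update(strand[i:i + 2] for i in range(len(strand) - 1))

    total_bases = sum(base_counts.values())
    total_pairs = sum(pair_counts.values())
    rho = {}
    for x in "ACGT":
        for y in "ACGT":
            fx = base_counts[x] / total_bases
            fy = base_counts[y] / total_bases
            if fx == 0 or fy == 0:
                continue  # ratio undefined when a base is absent
            fxy = pair_counts[x + y] / total_pairs
            rho[x + y] = fxy / (fx * fy)
    return rho
```

Because both strands are counted, a dinucleotide and its inverted complement always receive the same value, which is why the abstract can report pairs such as AA.TT and CC.GG as single entries. The same idea extends to tri- and tetranucleotides, although the normalization for longer words is more involved and is not attempted here.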



Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences

