J Mol Evol. 1996 Sep;43(3):216-23.

A relationship between GC content and coding-sequence length.

Author information

Departamento de Genética, Instituto de Biotecnología, Facultad de Ciencias, Universidad de Granada, E-18071-Granada, Spain.


Since base composition of translational stop codons (TAG, TAA, and TGA) is biased toward a low G+C content, a differential density for these termination signals is expected in random DNA sequences of different base compositions. The expected length of reading frames (DNA segments of sense codons flanked by in-phase stop codons) in random sequences is thus a function of GC content. The analysis of DNA sequences from several genome databases stratified according to GC content reveals that the longest coding sequences (exons in vertebrates and genes in prokaryotes) are GC-rich, while the shortest ones are GC-poor. Exon lengthening in GC-rich vertebrate regions does not result, however, in longer vertebrate proteins, perhaps because of the lower number of exons in the genes located in these regions. The effects on coding-sequence lengths constitute a new evolutionary meaning for compositional variations in DNA GC content.
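
The dependence of reading-frame length on GC content in random sequences can be illustrated with a small calculation. The sketch below is not taken from the paper. It assumes a simple model in which bases are drawn independently, with A and T equally frequent and G and C equally frequent. The function names `stop_codon_probability` and `expected_sense_codons` are hypothetical, chosen for illustration. Under this model the per-codon stop probability is (1 - gc)^2 (1 + gc) / 8, and the number of sense codons before the first in-phase stop follows a geometric distribution.

```python
def stop_codon_probability(gc):
    """Probability that a random codon is TAA, TAG, or TGA.

    Assumed model (not from the paper): independent bases with
    P(A) = P(T) = (1 - gc) / 2 and P(G) = P(C) = gc / 2.
    """
    if not 0.0 <= gc < 1.0:
        # gc = 1.0 would give a stop probability of zero (no A or T at all).
        raise ValueError("gc must satisfy 0 <= gc < 1")
    p_at = (1.0 - gc) / 2.0   # probability of A (same for T)
    p_gc = gc / 2.0           # probability of G (same for C)
    p_taa = p_at * p_at * p_at
    p_tag = p_at * p_at * p_gc
    p_tga = p_at * p_gc * p_at
    return p_taa + p_tag + p_tga


def expected_sense_codons(gc):
    """Mean number of sense codons before the first in-phase stop codon.

    With stop probability p per codon, the count of sense codons preceding
    the first stop is geometric with mean (1 - p) / p.
    """
    p = stop_codon_probability(gc)
    return (1.0 - p) / p


if __name__ == "__main__":
    # Expected reading-frame length grows as GC content increases.
    for gc in (0.3, 0.4, 0.5, 0.6, 0.7):
        print(f"GC = {gc:.1f}: stop probability = {stop_codon_probability(gc):.4f}, "
              f"expected sense codons = {expected_sense_codons(gc):.1f}")
```

At 50% GC this model gives a stop probability of 3/64 per codon, the familiar value for a uniform random sequence. Real genomes deviate from the independence assumption, so the numbers indicate the direction of the effect only.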

[Indexed for MEDLINE]
