J Mol Biol. 1994 Apr 22;238(1):1-8.

A graphic approach to analyzing codon usage in 1562 Escherichia coli protein coding sequences.

Author information

Department of Physics, Tianjin University, China.


The occurrence frequencies of the four bases (adenine, cytosine, guanine and thymine) at each of the three codon positions have been calculated for 1562 Escherichia coli protein coding sequences. The 1562 x 4 x 3 = 18,744 values thus obtained have been analyzed by a graphic method in which the four base occurrence frequencies at each codon position of each coding sequence are represented by a point in three-dimensional space. Thus the 18,744 values, which would otherwise occupy several printed pages, can be displayed intuitively in a graph. The point distribution pattern for each of the three codon positions has been analyzed. The results indicate that the patterns for the first two codon positions reflect requirements that originate in producing the native folded structures of proteins. We therefore conclude that the distribution patterns for the first two codon positions should be essentially species-independent, as confirmed by studies of a number of other species. The distribution pattern for the third codon position, however, is species-dependent. Based on the point distribution for the third codon position, six collective parameters have been defined to describe the overall features of the pattern. These collective parameters can be used generally to classify species and hence would be a useful tool for taxonomic studies. In addition to E. coli, the collective parameters for a number of other species have been calculated and analyzed.
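The representation described above can be illustrated with a minimal Python sketch. It computes the four base frequencies at each codon position of one coding sequence and maps each set of four frequencies to a point in three-dimensional space. Because the four frequencies sum to 1, three coordinates suffice. The abstract does not state the mapping used by the authors, so the symmetric combination in `to_point` is an assumption chosen for illustration, and the function names `position_frequencies` and `to_point` are hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def position_frequencies(cds):
    """Return a list of three dicts: base frequencies at codon positions 1, 2, 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing incomplete codon
    freqs = []
    for pos in range(3):
        counts = Counter(cds[pos:usable:3])
        total = sum(counts[b] for b in BASES)  # skips ambiguous symbols such as N
        if total == 0:
            raise ValueError("sequence contains no complete codon with A, C, G or T")
        freqs.append({b: counts[b] / total for b in BASES})
    return freqs


def to_point(freq):
    """Map four frequencies (summing to 1) to a 3D point.

    Assumed mapping, not taken from the abstract:
    x = purine vs pyrimidine, y = amino vs keto, z = weak vs strong hydrogen bonding.
    """
    a, c, g, t = (freq[b] for b in BASES)
    return ((a + g) - (c + t), (a + c) - (g + t), (a + t) - (g + c))


if __name__ == "__main__":
    example = "ATGGCCAAA"  # toy sequence, not real data
    for i, freq in enumerate(position_frequencies(example), start=1):
        print(f"codon position {i}: {to_point(freq)}")
```

Applying `to_point` to every coding sequence in a data set gives one cloud of points per codon position, which is the kind of distribution pattern the abstract analyzes. Under this assumed mapping every point lies inside the cube with coordinates between -1 and 1.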

[Indexed for MEDLINE]
