Copyright Moura et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Large Scale Comparative Codon-Pair Context Analysis Unveils General Rules that Fine-Tune Evolution of mRNA Primary Structure

1 Department of Biology, Center for Environmental and Marine Studies, University of Aveiro, Aveiro, Portugal
2 Institute of Electronics and Telematics Engineering, University of Aveiro, Aveiro, Portugal
3 Department of Mathematics, University of Aveiro, Aveiro, Portugal

Alan Christoffels, Academic Editor
Temasek Life Sciences Laboratory, Singapore

* To whom correspondence should be addressed. E-mail: msantos/at/bio.ua.pt

Conceived and designed the experiments: MS GM MP JO. Performed the experiments: GM MP AG LC. Analyzed the data: GM MP JA AF JO. Contributed reagents/materials/analysis tools: GM. Wrote the paper: MS GM. Other: Developed software: JO JA MP.

Received May 25, 2007; Accepted July 31, 2007.

Abstract

Background

Codon usage and codon-pair context are important gene primary structure features that influence mRNA decoding fidelity. In order to identify general rules that shape codon-pair context and minimize mRNA decoding error, we have carried out a large scale comparative codon-pair context analysis of 119 fully sequenced genomes.

Methodologies/Principal Findings

We have developed mathematical and software tools for large scale comparative codon-pair context analysis. These methodologies unveiled general and species specific codon-pair context rules that govern evolution of mRNAs in the 3 domains of life.
We show that evolution of bacterial and archaeal mRNA primary structure is mainly dependent on constraints imposed by the translational machinery, while in eukaryotes DNA methylation and tri-nucleotide repeats impose strong biases on codon-pair context.

Conclusions

The data highlight fundamental differences between prokaryotic and eukaryotic mRNA decoding rules, which are partially independent of codon usage.

Introduction

A myriad of evolutionary forces shape the primary structure of coding components (ORFs) of genomes, herein called ORFeomes. These include genome and gene duplication, chromosome rearrangements, DNA recombination, deletions and insertions, transposition of mobile elements, single nucleotide polymorphisms, nucleotide repeats and biased G+C pressure [1]–[4]. Apart from these DNA replication derived phenomena, others arising from DNA transcription, mRNA stability and translation [5]–[7] are also likely to fine tune ORFeomes' primary structure, but their significance is not yet fully understood. At the mRNA translation level, synonymous codon usage and codon-pair context (representing the pair of codons located in the ribosomal A- and P-sites) are expected to be under selective pressure since they affect mRNA decoding speed and accuracy [8]–[15]. Synonymous codon usage biases are explained mainly by G+C content and only secondarily by constraints imposed by mRNA translation variables [4], namely tRNA abundance, efficiency of tRNA charging, mRNA decoding efficiency (speed plus accuracy), mRNA stability and structure, gene expression, and amino acid composition [7], [13], [16]–[18]. The nucleotides surrounding a codon also influence synonymous codon usage, with the strongest influence arising from the interplay between the last nucleotide of a codon and the first nucleotide of the neighboring codon (N1N2N3 N1N2N3), the so-called N3-N1 context [7], [19], [20].
In contrast to codon usage, the forces that modulate codon-pair context, with the exception of the context of initiation and termination codons [16], [21], are still poorly understood. The few studies carried out to date show, however, that codon-pair context has a direct impact on missense, nonsense and frameshifting errors [15], [22], [23]. In E. coli, missense error in vivo, under standard growth conditions, is on the order of 10^−3 to 10^−4 per codon decoded [24], [25]. Frameshifting and stop codon readthrough errors happen at levels of 3×10^−4 to 10^−5 and of 10^−3 to 10^−6, respectively [26], [27]. Under stress, namely amino acid starvation, these basal error rates increase significantly [16], indicating that decoding error in nature may be significantly higher than in optimal laboratory conditions. Furthermore, 30% of the newly synthesized proteins in HeLa, lymph node, L-Kb and dendritic cells are defective ribosomal products (DRiPs) that arise from missense, frameshifting and ribosome drop off at mRNA pausing sites [28]. Since protein synthesis utilizes 45% of the cell ATP, a 30% DRiP rate represents 11% of wastage of total cellular energy [28]. Whether this is a common trend in all types of cells is unknown; however, peptides resulting from proteasome degradation of DRiPs are a major source of peptides for MHC class I molecules, highlighting an unanticipated role of mistranslation in immune cells [28]. It is not yet clear whether the ribosome drops off randomly or preferentially at specific mRNA drop off hot spots. In other words, it is important to elucidate whether decoding error (10^−4 to 10^−5 on average) is evenly distributed along mRNAs or whether it fluctuates along the mRNA. If it fluctuates, how can decoding error hot spots be identified?
In order to obtain insight into these questions and identify mRNA primary structural features that influence mRNA decoding error, we have developed a software package, together with statistical and graphical tools, to study codon-pairs corresponding to ribosomal A- and P-site codons, using genome wide approaches (ANACONDA version 1.0) [20], [29]. ANACONDA 1.0 already allowed us to demonstrate that codon-pair context is weakly modulated by G+C pressure [20]. In the present study, we have significantly improved ANACONDA (creating version 2.0) and used it to carry out large scale comparative codon-pair context analysis using complete ORFeome sequences of 81 Eubacteria, 18 Archaea and 20 Eukaryota. The data show that i) codon-pair context is species specific, ii) there are general rules governing its evolution in the three domains of life, iii) in eubacteria and archaea codon-pair context is mainly determined by constraints imposed by the translational machinery, while iv) in eukaryotes the emergence of DNA methylation and tri-nucleotide repeats influenced codon-pair context. The data suggest the existence of fundamental differences between prokaryotic and eukaryotic mRNA decoding rules and show that codon-pair context is partially independent of codon usage.

Results

New tools for large scale comparative codon-pair context analysis

The ANACONDA 1.0 algorithm developed previously [20], [29] simulates the ribosome during decoding by reading Open Reading Frame (ORF) sequences, starting at the AUG initiation codon and moving the reading window three nucleotides at a time (Figure 1A
In an attempt to identify putative general rules that govern codon-pair context, we have carried out large scale codon-pair comparisons, using ANACONDA version 2.0. For this, new algorithms and tools were developed to convert the 61x64 codon-pair context colour-coded maps into a single colour-coded column containing 3904 lines, representing all possible combinations of pairs of the 64 codons (Figure 1D
Codon-pair context preferences are species specific

Codon-pair context maps showed remarkable diversity from bacteria to higher eukaryotes (Figure 2 The distribution of residual values over the entire set of ORFeomes showed that the 3 domains of life have significantly different codon-pair preferences (Tables 1, 2). For example, codon-pair contexts with highest and lowest adjusted residual values showed no common codon-pairs in the 3 domains of life, suggesting fundamental differences between eukarya, eubacteria and archaea in codon-pair rules and in the evolutionary forces that shape ORFeomes' primary structure. Interestingly, 9 out of the 10 codon-pair contexts with highest residual values (best codon-pairs) of all eukaryotic ORFeomes were pairs formed by identical codons (codon repeats) (Table 1). The same trend was also detected when the most frequently preferred codon-pair contexts for each domain were compared (Table 2). With this approach, common codon-pair contexts were identified for the 3 domains of life. For example, AAU-CCA and GGC-UGU had positive residuals in Eubacteria and Archaea, ACU-AAG had negative residuals in Eukarya and Archaea, and AGA-AGA had positive residuals in Eubacteria and Eukarya. This suggested that, despite the species specificity of codon-pair context maps, at least some of the evolutionary constraints that shaped codon-pair context are conserved across species in the three domains of life.
Context preferences exist in coding and non-coding sequences

A large-scale codon-pair context comparison was carried out to visualize general context patterns, using clustering tools (Figure 3A In order to evaluate whether those general codon-pair context patterns arose from DNA replication biases, a second large scale comparative map was built using complete genome sequences (coding + non-coding) of the 119 organisms under study. For this, ANACONDA 2.0 scanned full chromosome sequences starting at the first six nucleotides and moved the scanning window three nucleotides at each step. In this way, both coding and non-coding sequences were analyzed and the frequency of all hexanucleotides was computed, irrespective of the DNA strand location or the reading frame of coding sequences, i.e. ORFs were scanned randomly in the frames 0, +1 or +2. This full genome context map (Figure 3B

Codon-pair context is influenced by genome and mRNA translation biases

Since DNA replication biases are partly visible at the dinucleotide level [30]–[32], we have constructed individual codon-pair context maps in which rows and columns were sorted to separate P-site codons ending with a particular nucleotide (N3; rows) and A-site codons starting with a particular nucleotide (N1; columns) (Figure 4A
The only universal rule detected in the large-scale comparison (Figure 3
General codon-pair context rules

In order to highlight the codon-pair context preferences that were exclusive to coding sequences, the original map of ORFeomes (Figure 3A
Discussion

Mistranslation is a poorly understood biological phenomenon which is influenced by various protein synthesis factors and mRNA primary structure features [13], [33], [34]. In order to shed new light on how the latter influence decoding error and extend previous studies carried out mainly on the effect of codon usage on mistranslation [25], [35], we are investigating the effect of codon-pair context on decoding fidelity. Our comparative genomics approaches unveiled the effect of both genome replication and translation specific biases on codon-pair context. The few studies carried out to date on codon-pair context were unable to distinguish those two types of biases [12], [36]–[38]. Our large scale approach confirmed the importance of genomic biases but also unveiled important translational biases that shape codon-pair context and should be primary targets for mistranslation hot spots. Large-scale genomic analysis, such as the one that we have performed, provides a global view of mistranslation that is out of reach for analyses of single ORFeomes. Indeed, comparison of large sets of codon-pair context data unveiled the main codon-pair context patterns that exist in the 3 domains of life. Interestingly, when the most preferred or repressed codon-pair contexts of all organisms were considered (Table 1), but also when common rules were selected (Table 2), there was little or no overlap between the context patterns of the 3 domains of life. This suggests that genome replication and/or mRNA translation in each domain imposes specific constraints on decoding sequences which produce different codon-pair context outcomes.
Also, the phylogeny of individual species appeared to be an important determinant of their codon-pair context behavior (Figure 2

Influence of genome wide biases on codon-pair context

Our observation that ORFeomes and total genomes produce similar patterns of codon-pair context (Figure 3 Genomes are known to have biased dinucleotide frequencies [31], a feature that has frequently been used to produce genomic signatures of phylogenetic and taxonomic relevance [31], [32]. At the ORFeome level this bias influences codon usage [32] but may also interfere with codon context whenever the last nucleotide of one codon is associated with the first nucleotide of the second codon of the pair. Indeed, (N3–N1) contexts explained part of our results (Figure 6 The association bias of two consecutive nucleotides is a characteristic of genomes which results from global selective pressures acting upon DNA at the level of repair and replication mechanisms [32] or ecological constraints that may influence, for instance, the overall G+C content of the genome [42]–[44]. Regulatory activity acting upon the entire genome is another cause of dinucleotide bias. CpG dinucleotides, for example, are signals for DNA methylation, a mechanism commonly used by higher organisms to protect their genome from selfish DNA elements and to regulate gene expression [5], [6]. Our dinucleotide bias analysis for the 119 organisms confirmed a clear rejection of CpG methylation in coding sequences of higher eukaryotes, as would be expected, since methylated DNA becomes unavailable for transcription and hence translation [5]. On the other hand, UpA dinucleotides are highly repressed in DNA sequences of most organisms [7], [31], [45], [46]. Interestingly, UpA dinucleotides are sites for preferential hydrolysis of RNA by macrophage ribonucleases [45], which destabilizes RNA molecules [7], and should hence be avoided [45].
Furthermore, Duan and colleagues [7] proposed that mRNA stability imposes strong selective pressure on synonymous codon usage and it is likely that this is also true for codon-pair context. Our data confirmed that hypothesis since NNU3-A1NN contexts were highly repressed in the 119 different genomes analyzed.

Influence of translational biases on codon-pair context

As already mentioned, the only universal rule that could be detected in the 119 genomes analyzed was rejection of most codon-pair contexts of the type NNU3-A1NN (Figure 3 When a large-scale comparison of codon-pair context excluded global genome biases (Figure 6A As to the other minor rules highlighted on the left side of Figure 6 On the other hand, most of the major genomic constraints that were not present in coding sequences, namely NNU3-A1NN, NYU3-A1RN and N(U/A)2U3-U1(U/A)2N or N(U/A)2A3-A1(U/A)2N rules (Figure 6B

Conclusions

Codon-pair contexts are biased in ORFeomes and such bias is the result of both translation and non-translation driven processes. Indeed, translational constraints, DNA replication/repair and cis-regulatory elements act synergistically on codon-pair context. This myriad of selective pressures makes it difficult to identify codon-context biases associated with mRNA translation only.
Our large scale comparative genome approach indicated that: i) there is a strong influence of non-translational selective pressures upon coding sequences, especially in eukaryotic organisms since these have a higher degree of resemblance between ORFeome and total genome biases; ii) the strongest non-translational selective pressures that could be identified were dinucleotide biases, mainly imposed by regulatory cis-elements linked to DNA methylation or mRNA stability [5], [45], and preference for trinucleotide repeats, usually associated with DNA polymerase slippage during replication [51]; iii) apart from this non-translational noise, DNA coding sequences showed specific features that could be related to mRNA translation, namely repression of usage of premature termination or error-prone contexts associated with weak codon-anticodon interactions. It will now be most interesting to validate these in silico data in vivo, and to identify experimentally the codon-pair contexts that are strongly selected for high mRNA decoding fidelity.

Methods

Primary data sources

Nucleotide sequences of complete genomes and assembled ORFeomes were downloaded from the GenBank or Ensembl Web sites (Genbank: ftp://ftp.ncbi.nih.gov/genomes/; Ensembl: ftp://ftp.ensembl.org/pub/) between December 2005 and January 2006. These included the DNA sequences of 81 eubacterial, 18 archaeal and 20 eukaryotic species. Plasmid sequences were not included in the analysis and all chromosomal sequences from one genome were analyzed together by ANACONDA 2.0. The total set of files downloaded and used in this study is documented as supplementary data (Figure S2).

Statistical analyses

Two-codon context bias was studied in complete ORFeome sequences using the residual analysis tools available in the software package ANACONDA 1.0 (a detailed explanation of this software can be found in [20], [29]; ANACONDA is publicly available at http://bioinformatics.ua.pt/submited-papers).
Briefly, this methodology counts all consecutive pairs of codons and uses statistical analysis for contingency tables where a multinomial distribution is assumed (Figure 1B Since, under independence between two consecutive codons, the adjusted residuals dij have a standardized normal probability distribution [52], we have concluded that P(|dij| ≤ 3) ≈ 0.9973, as the total number of observations is very high. This means that, for a 99.73% confidence level, an adjusted residual was statistically significant if its absolute value was greater than 3 [20]. However, this approach was based on a local analysis for each residual value. Herein, we considered a global analysis for each species and have thus constructed a simultaneous confidence region for all residual values. Since there are K = 61×64 different intervals we have introduced the Bonferroni correction to ensure an overall level of significance of α (usually α = 0.05, 0.01, 0.001). Under the Bonferroni correction, each interval is constructed at a 100×(1−α/K)% level (see, for example, [53]). Therefore, an interval from −a to a, at a confidence level of 100×(1−α/(61×64))%, was constructed for each adjusted residual value dij. Considering again the asymptotic normal distribution of dij [52] we had a≈4.70341 when 1−α = 0.99, a≈5.15350 when 1−α = 0.999, a≈8.16204 when 1−α = 0.01×10^−10. Thus, we assumed that the codon-pair adjusted residuals that fall within the interval −5 to 5 were not statistically significant, for a global confidence level of 99% (colored in black in all maps shown herein). The final output of residual analysis performed by ANACONDA is a codon-pair context map for each ORFeome being studied (Figure 1C Taking advantage of the automated statistical analysis performed by ANACONDA, individual maps for all 119 ORFeomes were built (see Figure S3).
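The counting and residual steps described above can be illustrated with a minimal Python sketch. It is not ANACONDA's code: the function names and toy sequences are hypothetical, and the residual uses the standard adjusted-residual formula for contingency tables from [52], d_ij = (n_ij − e_ij) / sqrt(e_ij (1 − r_i/N)(1 − c_j/N)) with e_ij = r_i c_j / N, which is assumed here to match what the software computes. The Bonferroni cutoff is taken as the two-sided standard normal quantile at level α/K.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def count_codon_pairs(orfs):
    """Count consecutive in-frame codon pairs, reading each ORF 3 nt at a time."""
    pairs = Counter()
    for orf in orfs:
        usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
        codons = [orf[i:i + 3] for i in range(0, usable, 3)]
        pairs.update(zip(codons, codons[1:]))  # (first codon, second codon)
    return pairs


def adjusted_residuals(pairs):
    """Adjusted residual d_ij for each cell of the codon-pair contingency table."""
    total = sum(pairs.values())
    row, col = Counter(), Counter()
    for (first, second), n in pairs.items():
        row[first] += n   # r_i: times the codon occurs as first member of a pair
        col[second] += n  # c_j: times the codon occurs as second member of a pair
    residuals = {}
    for first, r in row.items():
        for second, c in col.items():
            expected = r * c / total  # count expected under independence
            variance = expected * (1 - r / total) * (1 - c / total)
            if variance > 0:
                observed = pairs[(first, second)]  # 0 for unseen pairs
                residuals[(first, second)] = (observed - expected) / sqrt(variance)
    return residuals


def bonferroni_cutoff(alpha, k=61 * 64):
    """Two-sided normal cutoff a for k simultaneous intervals at overall level alpha."""
    return NormalDist().inv_cdf(1 - alpha / (2 * k))


# Hypothetical toy ORFs, used only to exercise the functions.
toy_orfs = ["ATGAAAAAATAA", "ATGAAAGGGTAA"]
toy_pairs = count_codon_pairs(toy_orfs)
toy_residuals = adjusted_residuals(toy_pairs)
print(toy_pairs.most_common(1))
print(round(bonferroni_cutoff(0.01), 3), round(bonferroni_cutoff(0.001), 3))
```

The two printed cutoffs can be compared with the values of a quoted in the text for 1−α = 0.99 and 1−α = 0.999. On real ORFeomes the table would have 61 rows and 64 columns; in this toy input only the codons actually observed appear.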
In order to facilitate large-scale comparison of maps these were converted into single lines and clustered together (Figures 1D,E The above approach was also used to study total genome sequences of the same 119 species in order to differentiate between the effect of translational selection acting upon coding sequences alone and genome mutational biases. With the same purpose, the bias for dinucleotides was studied in total genome sequences, and shown as observed frequencies, colored in green or red whenever the result was 1% above or below the expected value, respectively (Figure 4B

Figure S1 Data normalization. In order to correct the size differences of ORFeomes, particularly between eukaryotes and non-eukaryotes, the adjusted residuals were normalized for 21 million codons, which corresponds approximately to the largest ORFeome analyzed (X. tropicalis). Normalization of codon-pair data for human chromosomes 1, 2, 3, 22 and ORFeome are displayed. The normalization effect is shown by the brightness of the maps, which is variable in non-normalized maps (above) and constant in normalized ones (below). After data normalization the differences between maps could be compared as shown in the DDM (right end of the Figure). (4.32 MB TIF)

Figure S2A List of species used. All species used in the study are listed according to the download order. The database of origin and respective accession numbers are indicated. A - Eukaryotes; B - Archaea and Eubacteria; C - Eubacteria (cont.). (0.54 MB TIF)

Figure S2B (0.54 MB TIF)

Figure S2C (0.54 MB TIF)

Figure S3A Individual codon-pair context maps of the 119 species. The codon-pair context maps built with ANACONDA software for individual ORFeomes are shown as ordered in Suppl. Figure S2.
(4.32 MB TIF)

Figure S3B (4.32 MB TIF)

Figure S3C (4.32 MB TIF)

Figure S3D (4.32 MB TIF)

Figure S3E (4.32 MB TIF)

Figure S3F (4.32 MB TIF)

Figure S3G (2.16 MB TIF)

Figure S3H (2.16 MB TIF)

Figure S3I (2.16 MB TIF)

Figure S3J (2.16 MB TIF)

Figure S4 A and U bases are preferentially arranged in polynucleotide strings. In order to check if the preference detected for AA and UU dinucleotides in total genomes (Figure 4B (0.10 MB TIF)

Figure S5A Codon-pair context patterns that are exclusive of ORFeomes or genomes. The filtering technique that was used to determine the biases of codon-pair contexts in coding and total sequences (Figure 6 = 50. (0.55 MB TIF)

Figure S5B (0.53 MB TIF)

Table S1 Codon-pair distribution similarities between the 3 domains of life. In order to compare the overall distribution of codon-pair contexts among the 119 organisms we have calculated the Spearman's correlation coefficients between all pairs of ORFeomes, producing a triangular colored map. The 119 species were organized by domain of life and sorted alphabetically in each domain.
Pairs of species that were not statistically correlated (for a level of significance of 5%) are colored in grey, while green colored cells indicate pairs of species that were highly correlated (correlation coefficient above 0.80), and blue colored cells correspond to the major values found inside each domain. (0.32 MB XLS)

Footnotes

Competing Interests: The authors have declared that no competing interests exist.

Funding: This study was supported by FEDER/FCT projects POCTI/BME/39030; SAU-MMO/55476; BIA-PRO/55472; BIA-MIC/55466; PTDC/MAT/72974/2006 and Human Frontier Science Programme project RGP45/2005.

References

1. Cliften PF, Fulton RS, Wilson RK, Johnston M. After the duplication: gene loss and adaptation in Saccharomyces genomes. Genetics. 2006;172:863–872.
2. van de Lagemaat LN, Gagnier L, Medstrand P, Mager DL. Genomic deletions and precise removal of transposable elements mediated by short identical DNA segments in primates. Genome Res. 2005;15:1243–1249.
3. Lin YW, Thi DA, Kuo PL, Hsu CC, Huang BD, et al. Polymorphisms associated with the DAZ genes on the human Y chromosome. Genomics. 2005;86:431–438.
4. Chen SL, Lee W, Hottes AK, Shapiro L, McAdams HH. Codon usage between genomes is constrained by genome-wide mutational processes. Proc Natl Acad Sci U S A. 2004;101:3480–3485.
5. Chan SW, Henderson IR, Jacobsen SE. Gardening the genome: DNA methylation in Arabidopsis thaliana. Nat Rev Genet. 2005;6:351–360.
6. Robertson KD. DNA methylation and human disease. Nat Rev Genet. 2005;6:597–610.
7. Duan J, Antezana MA. Mammalian mutation pressure, synonymous codon choice, and mRNA degradation. J Mol Evol. 2003;57:694–701.
8. Berg OG, Silva PJ. Codon bias in Escherichia coli: the influence of codon context on mutation and selection. Nucleic Acids Res. 1997;25:1397–1404.
9. Akashi H. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics. 1994;136:927–935.
10. Curran JF, Yarus M. Rates of aminoacyl-tRNA selection at 29 sense codons in vivo. J Mol Biol. 1989;209:65–77.
11. Percudani R, Ottonello S. Selection at the wobble position of codons read by the same tRNA in Saccharomyces cerevisiae. Mol Biol Evol. 1999;16:1752–1762.
12. Boycheva S, Chkodrov G, Ivanov I. Codon pairs in the genome of Escherichia coli. Bioinformatics. 2003;19:987–998.
13. Ogle JM, Ramakrishnan V. Structural insights into translational fidelity. Annu Rev Biochem. 2005;74:129–177.
14. Irwin B, Heck JD, Hatfield GW. Codon pair utilization biases influence translational elongation step times. J Biol Chem. 1995;270:22801–22806.
15. Shah AA, Giddings MC, Gesteland RF, Atkins JF, Ivanov IP. Computational identification of putative programmed translational frameshift sites. Bioinformatics. 2002;18:1046–1053.
16. Buckingham RH, Grosjean H. The accuracy of mRNA-tRNA recognition. In: Kirkwood TBL, Rosenberger RF, Galas DJ, editors. Accuracy in Molecular Processes: Its Control and Relevance to Living Systems. London: Chapman and Hall; 1986. pp. 83–126.
17. Percudani R, Pavesi A, Ottonello S. Transfer RNA gene redundancy and translational selection in Saccharomyces cerevisiae. J Mol Biol. 1997;268:322–330.
18. Curran JF, Poole ES, Tate WP, Gross BL. Selection of aminoacyl-tRNAs at sense codons: the size of the tRNA variable loop determines whether the immediate 3′ nucleotide to the codon has a context effect. Nucleic Acids Res. 1995;23:4104–4108.
19. Fedorov A, Saxonov S, Gilbert W. Regularities of context-dependent codon bias in eukaryotic genes. Nucleic Acids Res. 2002;30:1192–1197.
20. Moura G, Pinheiro M, Silva R, Miranda I, Afreixo V, et al. Comparative context analysis of codon pairs on an ORFeome scale. Genome Biol. 2005;6:R28.
21. Tate WP, Poole ES, Mannering SA. Hidden infidelities of the translational stop signal. Prog Nucleic Acid Res Mol Biol. 1996;52:293–335.
22. Murgola EJ, Pagel FT, Hijazi KA. Codon context effects in missense suppression. J Mol Biol. 1984;175:19–27.
23. Tork S, Hatin I, Rousset JP, Fabret C. The major 5′ determinant in stop codon read-through involves two adjacent adenines. Nucleic Acids Res. 2004;32:415–421.
24. Rodnina MV, Wintermeyer W. Fidelity of aminoacyl-tRNA selection on the ribosome: kinetic and structural mechanisms. Annu Rev Biochem. 2001;70:415–435.
25. Kramer EB, Farabaugh PJ. The frequency of translational misreading errors in E. coli is largely determined by tRNA competition. RNA. 2007;13:87–96.
26. Atkins JF, Weiss RB, Thompson S, Gesteland RF. Towards a genetic dissection of the basis of triplet decoding, and its natural subversion: programmed reading frame shifts and hops. Annu Rev Genet. 1991;25:201–228.
27. Freistroffer DV, Kwiatkowski M, Buckingham RH, Ehrenberg M. The accuracy of codon recognition by polypeptide release factors. Proc Natl Acad Sci U S A. 2000;97:2046–2051.
28. Princiotta MF, Finzi D, Qian SB, Gibbs J, Schuchmann S, et al. Quantitating protein synthesis, degradation, and endogenous antigen processing. Immunity. 2003;18:343–354.
29. Pinheiro M, Afreixo V, Moura G, Freitas A, Santos MA, et al. Statistical, computational and visualization methodologies to unveil gene primary structure features. Methods Inf Med. 2006;45:163–168.
30. Campbell A, Mrazek J, Karlin S. Genome signature comparisons among prokaryote, plasmid, and mitochondrial DNA. Proc Natl Acad Sci U S A. 1999;96:9184–9189.
31. Nakashima H, Nishikawa K, Ooi T. Differences in dinucleotide frequencies of human, yeast, and Escherichia coli genes. DNA Res. 1997;4:185–192.
32. Hooper SD, Berg OG. Detection of genes with atypical nucleotide sequence in microbial genomes. J Mol Evol. 2002;54:365–375.
33. Hooper SD, Berg OG. Gradients in nucleotide and codon usage along Escherichia coli genes. Nucleic Acids Res. 2000;28:3517–3523.
34. Stahl G, McCarty GP, Farabaugh PJ. Ribosome structure: revisiting the connection between translational accuracy and unconventional decoding. Trends Biochem Sci. 2002;27:178–183.
35. Dix DB, Thompson RC. Codon choice and gene expression: synonymous codons differ in translational accuracy. Proc Natl Acad Sci U S A. 1989;86:6888–6892.
36. Gutman GA, Hatfield GW. Nonrandom utilization of codon pairs in Escherichia coli. Proc Natl Acad Sci U S A. 1989;86:3699–3703.
37. Buchan JR, Aucott LS, Stansfield I. tRNA properties help shape codon pair preferences in open reading frames. Nucleic Acids Res. 2006;34:1015–1027.
38. Rocha EP, Danchin A, Viari A. Universal replication biases in bacteria. Mol Microbiol. 1999;32:11–16.
39. Grantham R, Gautier C, Gouy M, Mercier R, Pave A. Codon catalog usage and the genome hypothesis. Nucleic Acids Res. 1980;8:r49–r62.
40. Buckingham RH. Codon context. Experientia. 1990;46:1126–1133.
41. McVean GAT, Hurst GDD. Evolutionary lability of context-dependent codon bias in bacteria. J Mol Evol. 2000;50:264–275.
42. Lao PJ, Forsdyke DR. Thermophilic bacteria strictly obey Szybalski's transcription direction rule and politely purine-load RNAs with both adenine and guanine. Genome Res. 2000;10:228–236.
43. Kennedy SP, Ng WV, Salzberg SL, Hood L, DasSarma S. Understanding the adaptation of Halobacterium species NRC-1 to its extreme environment through computational analysis of its genome sequence. Genome Res. 2001;11:1641–1650.
44. Tekaia F, Yeramian E, Dujon B. Amino acid composition of genomes, lifestyles of organisms, and evolutionary trends: a global picture with correspondence analysis. Gene. 2002;297:51–60.
45. Beutler E, Gelbart T, Han JH, Koziol JA, Beutler B. Evolution of the genome and the genetic code: selection at the dinucleotide level by methylation and polyribonucleotide cleavage. Proc Natl Acad Sci U S A. 1989;86:192–196.
46. Nakashima H, Ota M, Nishikawa K, Ooi T. Genes from nine genomes are separated into their organisms in the dinucleotide composition space. DNA Res. 1998;5:251–259.
47. Marck C, Grosjean H. tRNomics: analysis of tRNA genes from 50 genomes of Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and domain-specific features. RNA. 2002;8:1189–1232.
48. Crick FH. Codon–anticodon pairing: the wobble hypothesis. J Mol Biol. 1966;19:548–555.
49. Caburet S, Vaiman D, Veitia RA. A genomic basis for the evolution of vertebrate transcription factors containing amino acid runs. Genetics. 2004;167:1813–1820.
50. Borstnik B, Pumpernik D. Tandem repeats in protein coding regions of primate genes. Genome Res. 2002;12:909–915.
51. Rocha EP, Matic I, Taddei F. Over-representation of repeats in stress response genes: a strategy to increase versatility under stressful conditions? Nucleic Acids Res. 2002;30:1886–1894.
52. Haberman S. Analysis of residuals in cross-classified tables. Biometrics. 1973;29:205–220.
53. Simonoff JS. Analyzing categorical data. New York: Springer-Verlag; 2003.