Relative codon adaptation: a generic codon bias index for prediction of gene expression

DNA Res. 2010 Jun;17(3):185-96. doi: 10.1093/dnares/dsq012. Epub 2010 May 7.

Abstract

The development of codon bias indices (CBIs) remains an active field of research due to their myriad applications in computational biology. Recently, the relative codon usage bias (RCBS) was introduced as a novel CBI able to estimate codon bias without using a reference set. The results of this new index when applied to Escherichia coli and Saccharomyces cerevisiae led the authors of the original publications to conclude that natural selection favours higher expression and enhanced codon usage optimization in short genes. Here, we show that this conclusion was flawed and based on the systematic oversight of an intrinsic bias for short sequences in the RCBS index and of biases in the small data sets used for validation in E. coli. Furthermore, we show how the RCBS can be corrected to produce useful results and how its underlying principle, which we here term relative codon adaptation (RCA), can be made into a powerful reference-set-based index that directly takes into account the genomic base composition. Finally, we show that RCA outperforms the codon adaptation index (CAI) as a predictor of gene expression when operating on the CAI reference set and that this improvement is significantly larger when analysing genomes with high mutational bias.
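
The abstract does not give the RCA formula, so the following is only a minimal sketch of the principle it describes: scoring codons against a reference set while accounting for base composition. It assumes that a codon's weight is its observed frequency in the reference set divided by the frequency expected from the per-position base frequencies, and that a gene's score is the geometric mean of its codon weights. The function names (`split_codons`, `rca_weights`, `rca_score`) and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import exp, log


def split_codons(seq):
    """Split a coding sequence into codons, dropping any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def rca_weights(reference_genes):
    """Assumed weight per codon: observed frequency / frequency expected
    from the base composition at each of the three codon positions."""
    codon_counts = Counter()
    position_counts = [Counter(), Counter(), Counter()]
    for gene in reference_genes:
        for codon in split_codons(gene):
            codon_counts[codon] += 1
            for i, base in enumerate(codon):
                position_counts[i][base] += 1

    total = sum(codon_counts.values())
    weights = {}
    for codon, n in codon_counts.items():
        expected = 1.0
        for i, base in enumerate(codon):
            expected *= position_counts[i][base] / total
        weights[codon] = (n / total) / expected
    return weights


def rca_score(gene, weights):
    """Geometric mean of codon weights; codons absent from the reference
    set are skipped (an assumption made to keep the sketch simple)."""
    logs = [log(weights[c]) for c in split_codons(gene) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in gene")
    return exp(sum(logs) / len(logs))


# Toy reference set (hypothetical data, for illustration only).
weights = rca_weights(["AAAAAACCC"])
score = rca_score("AAACCC", weights)
```

Because the gene score is a geometric mean over codons, it does not grow simply with sequence length, and the denominator is what ties each weight to the base composition of the reference set.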

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / genetics*
  • Base Composition
  • Biomarkers / metabolism
  • Codon*
  • Computational Biology
  • Gene Expression Profiling
  • Gene Expression Regulation, Bacterial
  • Gene Expression*
  • Genes, Bacterial*
  • Genome, Bacterial
  • Oligonucleotide Array Sequence Analysis

Substances

  • Biomarkers
  • Codon