a, Length (top; error bars represent s.e.m.), RNA expression level (middle; error bars represent s.e.m.) and proximity to transcription factor binding sites (bottom; error bars represent standard error of the proportion) of ORFs correlate with conservation level. P and tau are Kendall’s correlation statistics. RNA abundance was estimated from RNAseq data (ref. 25) in rich conditions. The positive correlation between proximity to transcription factor binding sites and conservation level is shown for a window of 200 nucleotides and holds for windows of 300, 400 and 500 nucleotides (Kendall’s tau = 0.14, 0.16 and 0.17, respectively; P < 2.2 × 10^−16 in each case). b, Codon bias increases with conservation level. Codon bias was estimated using the codon adaptation index (Supplementary Information). P and tau are Kendall’s correlation statistics. Error bars represent s.e.m. The large s.e.m. observed for ORFs5 may be related to the whole-genome duplication event (Supplementary Fig. 3). c, Relative amino acid abundances shift with increasing conservation level. For each encoded amino acid, the plot shows the ratio between its frequency in ORFs1-4 and its frequency in ORFs5-10 (gray), or between its frequency in ORFs1-4 and its frequency in ORFs0 (black). Enrichment of cysteine in proteins encoded by ORFs1-4 relative to those encoded by ORFs5-10 (P < 1.8 × 10^−150, hypergeometric test) corresponds to 3.6 ± 0.1 residues (mean ± s.e.m.) per translation product. d, Predicted structural features of ORF translation products correlate with conservation level. ORFs0 were not included in these analyses because their short length limits the reliability of structural predictions. Error bars represent s.e.m.
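The summary statistics named in this legend (s.e.m., standard error of a proportion, and Kendall's tau) can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names and toy values are hypothetical, and the tau-b variant, which corrects for ties, is an assumption: the legend does not say which variant was used, although conservation levels take few distinct values and therefore produce many ties.

```python
import math
from statistics import stdev


def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


def se_proportion(successes, n):
    """Standard error of a proportion p = successes / n: sqrt(p * (1 - p) / n)."""
    p = successes / n
    return math.sqrt(p * (1 - p) / n)


def kendall_tau_b(x, y):
    """Kendall's tau-b (assumed variant), computed by direct pair counting.

    Quadratic in the number of observations, so suitable only for
    illustration on small inputs.
    """
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from all counts
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denominator = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denominator


# Hypothetical toy data: conservation level and ORF length in nucleotides.
levels = [0, 0, 1, 1, 2, 2]
lengths = [100, 150, 300, 250, 900, 1200]
tau = kendall_tau_b(levels, lengths)
```

Pairs of ORFs that share a conservation level count as ties in the first variable. They enlarge the tau-b denominator without contributing to the numerator, so tau stays below 1 here even though length never decreases as the level increases.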