(A) The number of genes and eukaryotic complexity are only weakly correlated. For 38 eukaryotic genomes, the figure plots the estimated number of different cell types [28,29] against the predicted total number of genes. The tree indicates, in simplified form, the phylogenetic relationships between the organisms, as taken from the National Center for Biotechnology Information (NCBI) taxonomy server (http://www.ncbi.nlm.nih.gov/Taxonomy). The order of the organisms is the same in all figures and tables; their major groups are plants (green), protozoa (blue), fungi (black), and animals (red and brown). The correlation between the number of different cell types and the number of genes is poor (R² = 0.29, R = 0.54).
Within the plants, we distinguish green algae (Cre, Chlamydomonas reinhardtii) and flowering plants (Osa, O. sativa; Ath, Arabidopsis thaliana). We include eight protozoa (Ddi, Dictyostelium discoideum; Tbr, Trypanosoma brucei; Lma, Leishmania major; Pra, Phytophthora ramorum; Tps, Thalassiosira pseudonana; Ehi, Entamoeba histolytica; Tan, Theileria annulata; Pfa, Plasmodium falciparum) and ten fungi (Ncr, Neurospora crassa; Eni, Emericella nidulans; Spo, Schizosaccharomyces pombe; Sce, S. cerevisiae; Kla, Kluyveromyces lactis; Cal, Candida albicans; Yli, Yarrowia lipolytica; Ecu, Encephalitozoon cuniculi; Pch, Phanerochaete chrysosporium; Uma, Ustilago maydis). Protostomia include two nematodes (Cbr, Caenorhabditis briggsae; Cel, C. elegans) and three insects (Ame, Apis mellifera; Aga, Anopheles gambiae; Dme, D. melanogaster). Deuterostomia include one urochordate (Cin, Ciona intestinalis) and 11 vertebrates: five non-mammalian vertebrates (Dre, Danio rerio; Tni, Tetraodon nigroviridis; Tru, Takifugu rubripes; Xtr, Xenopus tropicalis; Gga, Gallus gallus) and six mammals (Cfa, Canis familiaris; Bta, Bos taurus; Rno, Rattus norvegicus; Mmu, Mus musculus; Ptr, Pan troglodytes; Hsa, H. sapiens).
(B) Outline of our analysis. For each of the 38 genomes (three are shown, symbolised by circles), we collected information on the number of proteins (lines with boxes) that contain domains of particular superfamilies (boxes of a particular colour). The resulting abundance profiles were normalised and compared both to the estimated number of different cell types in each organism and to each other. Analysing the function of particular groups of domain superfamilies indicates how their expansion in some organisms may have supported an increase in organismal complexity.
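The comparison step outlined in (B) can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that normalisation means dividing each superfamily count by the total number of proteins in the genome (the caption does not specify the scheme), and it assumes Pearson's R as the measure of agreement. The names normalise and pearson_r, the superfamily labels, and all numbers are hypothetical and serve only as an illustration.

```python
from math import sqrt


def normalise(counts, total_proteins):
    """Divide per-genome superfamily counts by per-genome protein totals.

    Assumed normalisation scheme; the caption does not say which one was used.
    """
    return [c / t for c, t in zip(counts, total_proteins)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up toy values for three hypothetical genomes (not data from the study).
cell_types = [1, 10, 100]
total_proteins = [5000, 15000, 20000]
superfamily_counts = {
    "sf_A": [50, 300, 800],   # hypothetical superfamily that expands
    "sf_B": [100, 150, 100],  # hypothetical superfamily that does not
}

for name, counts in superfamily_counts.items():
    profile = normalise(counts, total_proteins)
    r = pearson_r(profile, cell_types)
    # R squared is the fraction of variance explained by a linear fit.
    print(f"{name}: R = {r:.2f}, R^2 = {r * r:.2f}")
```

The same pearson_r function could be applied to pairs of normalised profiles to compare superfamilies with each other.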