Copyright © 2006 The Author(s)

‘Genome design’ model and multicellular complexity: golden middle

Institute of Cytology, Russian Academy of Sciences, St Petersburg 194064, Russia

*Tel.: +78 122975310; Fax: +78 122970341; Email: aevin/at/mail.cytspb.rssi.ru

Received August 1, 2006; Revised September 13, 2006; Accepted September 28, 2006.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article has been cited by other articles in PMC.

Abstract

Human tissue-specific genes were reported to be longer than housekeeping genes (both in coding and intronic parts). Competing neutralist and adaptationist models were proposed to explain this observation. Here I show that in the human genome the longest genes are those with an intermediate expression pattern. From the standpoint of information theory, the regulation of such genes should be the most complex. In the genomewide context, they are found here to carry a higher informational load on all available levels: from participation in protein interaction networks, pathways and modules reflected in Gene Ontology categories, through transcription factor regulatory sets and protein functional domains, to amino acid tuples (words) in encoded proteins and nucleotide tuples in introns and promoter regions. Thus, the intermediately expressed genes have a higher functional and regulatory complexity, which is reflected in their greater length (consistent with the ‘genome design’ model). The dichotomy of housekeeping versus tissue-specific entities is more pronounced on the modular level than on the molecular level. There are far fewer intermediate-specific modules (modules overrepresented in the intermediately expressed genes) than housekeeping or tissue-specific modules (normalized to gene number). The dichotomy of housekeeping versus tissue-specific genes and modules in multicellular organisms is probably caused by the burden of regulatory complexity acting on the intermediately expressed genes.

INTRODUCTION

Human tissue-specific genes were reported to be longer than housekeeping genes, both in coding and intronic parts. Competing models were proposed to explain this observation: selection for economy (in housekeeping genes), mutation bias and ‘genome design’ (i.e. functional complexity) (1–11). The first two models assume a neutralist (permissive) interpretation of the accumulation of DNA in eukaryotic genomes. In contrast, the ‘genome design’ model suggests that the length of genomic elements is mostly determined by their functional load. In particular, the greater amount of intra- and intergenic noncoding DNA in which the tissue-specific genes are embedded may be involved in the more complex regulation and chromatin-mediated suppression of these genes, whereas the greater length of coding sequences may be related to more complex protein functional architectures. From the standpoint of information theory, the regulation of intermediately expressed genes should be most complex (Figure 1
MATERIALS AND METHODS

Gene sequences and expression

Human gene sequences were extracted from the RefSeq database (12). The data on gene expression were taken from the latest version of the Gene Expression Atlas standardized with the MAS5 algorithm (13). They present the results of oligonucleotide microarray experiments performed uniformly with 72 normal human tissues. The signals from probes on the chip corresponding to the same gene were averaged; the replicates representing the same tissue were also averaged. As recommended (13), a gene was regarded as expressed if its signal level exceeded the dataset median. The complete sets of structural and expression data were obtained for 15 726 genes. The genes were divided into seven groups (bins) differing in the among-tissues expression breadth, with a roughly equal number of genes in each group. In some analyses of intronic sequences, the gene groups were normalized to a roughly equal number of nucleotide tuples by randomly removing genes from the groups with a relatively greater total intron length.

For analysis of intronic sequence, only the sum of internal introns (those that reside within the coding sequence) was taken for consistency (because the complete mRNAs may not be known for all genes). There were 14 470 intron-containing genes in the dataset. In some analyses, introns were masked for lineage-specific repeats (those inserted after the human–mouse split) or for all known repeats using the standalone RepeatMasker and DateRepeats programs (A.F.A. Smit, R. Hubley and P. Green; http://repeatmasker.org).

Genomic objects

For genes with references to the SwissProt (UniProt) database (13 273 proteins were found), the functional domains in the encoded proteins were estimated using the SwissPfam (for non-overlapping domains) and InterPro (the compilation of all known domain definitions from different databases, with redundancy) databases (14,15).
The sets of genes regulated by different transcription factors were taken from the Molecular Signature Database (MsigDb) (16). The pathway gene sets were compiled using the KEGG (17) and Reactome (18) databases [using Entrez Gene mapping (19)], and HumanCyc (20). In the case of Gene Ontology categories (21), I collected for each category all its subcategories (separately for Biological Processes, Molecular Functions and Cellular Components) using GO graphs, and a gene was regarded as belonging to a given category if it was mapped to any of its subcategories in Entrez Gene. (If only the explicit Entrez Gene mapping of a given gene was used, the picture was similar.) The information on protein interactions was taken from the STRING database (22). All pairwise interactions of a given protein were taken to form the protein interaction set. The gene promoter regions were extracted from the database of experimentally determined exact transcriptional start sites (DBTSS): from 1000 nt upstream to 200 nt downstream of the transcription start site (the standard promoter region length presented in the DBTSS) (23).

The frequencies of amino acid and nucleotide tuples of different sizes were calculated using a sliding frame of a given size (with a 1-letter step) for each gene group. The reduced amino acid alphabets were taken from the work by Li et al. (24). By analogy with the reduced amino acid alphabets, the more evolutionarily stable 2-letter purine/pyrimidine alphabet was used for testing intronic tuples. Thus, using repeats as markers of intronic sequence, it was estimated in regard to repeat ancestor copies (using the RepeatMasker program) that transitions (i.e. mutations from purine to purine or from pyrimidine to pyrimidine) occur roughly twice as frequently as transversions (i.e. mutations from purine to pyrimidine or vice versa).
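The sliding-frame tuple count and the purine/pyrimidine reduction described above can be illustrated with a minimal Python sketch. It is not the code used in this work: the function names, the use of `-` as a placeholder for unmapped letters, and the decision to skip frames that contain such letters are assumptions made only for illustration.

```python
from collections import Counter

# Two-letter purine/pyrimidine alphabet: A, G -> R (purine); C, T -> Y (pyrimidine).
PURINE_PYRIMIDINE = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def reduce_alphabet(seq, mapping):
    """Rewrite a sequence in a reduced alphabet.

    Letters absent from `mapping` (e.g. N or masked positions) become '-'.
    This placeholder is an assumption of the sketch, not part of the source.
    """
    return "".join(mapping.get(ch, "-") for ch in seq.upper())


def count_tuples(seq, k):
    """Count overlapping k-letter tuples with a sliding frame and a 1-letter step."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if "-" not in word:  # assumption: ignore frames touching unmapped letters
            counts[word] += 1
    return counts


if __name__ == "__main__":
    intron = "ACGTAC"
    print(count_tuples(intron, 2))
    print(count_tuples(reduce_alphabet(intron, PURINE_PYRIMIDINE), 2))
```

Tuple counts obtained in this way for each gene group would then be compared with the counts in the total dataset, as described under Estimation of information.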
Estimation of information

The Shannon information (uncertainty) was estimated on the basis of the probability of occurrence of a given object (protein domain, transcription factor, pathway, protein interaction, GO category) in a given gene expression group in regard to the total dataset (i.e. the under- or overrepresentation of a given object in the total dataset), using a gene set corresponding to this object and the hypergeometric probability distribution. In other words, the expected count (number of occurrences) of the genomic object in a given gene expression group was estimated on the basis of the count of this object in the total dataset. Then, the probability of the deviation of the observed count from the expected one was estimated using the hypergeometric test. If an object was overrepresented in a given group, the probability of equal or higher frequency was taken; if underrepresented, the probability of equal or lower frequency. Only those objects were taken that occur more than three times in the total dataset (with higher cutoff values, the picture was similar). This condition gives 1176 InterPro domains, 615 transcription factor sets, 274 pathways, 11 397 protein interaction sets, 1612 GO Biological Processes, 1034 GO Molecular Functions, and 358 GO Cellular Components (with the explicit Entrez Gene mapping, there were 710 Biological Processes, 634 Molecular Functions and 224 Cellular Components).

For amino acid and nucleotide tuples (where the counts were much higher), the probability of occurrence of each tuple in a given gene group (in regard to the total dataset) was estimated using the chi-square distribution (with Yates correction). The information (uncertainty) of each genomic object was calculated using the Shannon formula (−P × log2 P) (25), where the probability value P was taken from either the hypergeometric or the chi-square test (as described above).
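As an illustration of the procedure just described, the following Python sketch computes the one-sided hypergeometric tail probability for one object in one expression group and the corresponding Shannon term −P × log2 P. It uses only the standard library and is a sketch under stated assumptions, not the original implementation: the function names are hypothetical, and treating an observed count exactly equal to the expected count as overrepresentation is an arbitrary choice of this example.

```python
from math import comb, log2


def hypergeom_pmf(k, N, K, n):
    """P(X = k) when n genes are drawn from N genes, K of which carry the object."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def tail_probability(k, N, K, n):
    """One-sided tail probability of the observed count k.

    Overrepresented (k >= expected): probability of an equal or higher count.
    Underrepresented (k < expected): probability of an equal or lower count.
    The tie rule (k == expected counted as overrepresented) is an assumption.
    """
    expected = n * K / N
    lowest, highest = max(0, n - (N - K)), min(K, n)
    if k >= expected:
        ks = range(k, highest + 1)
    else:
        ks = range(lowest, k + 1)
    return min(1.0, sum(hypergeom_pmf(i, N, K, n) for i in ks))


def shannon_term(p):
    """Shannon term -p * log2(p), with the usual convention that it is 0 at p = 0."""
    return 0.0 if p <= 0.0 else -p * log2(p)


if __name__ == "__main__":
    # Toy numbers: 10 genes in total, 4 carry the object, the group has 5 genes,
    # and all 4 carriers fall into the group.
    p = tail_probability(4, N=10, K=4, n=5)
    print(p, shannon_term(p))
```

In the toy case the expected count is 2, so the observed count of 4 is treated as overrepresentation and the tail reduces to the single term 6/252 ≈ 0.024.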
Then the average information was determined for each gene expression group by summing the information across the entire set of objects of a given type (e.g. GO Biological Processes) and dividing it by the number of objects in the set.

To determine the number of over- and underrepresented modules (Tables 1 and 2), I used the hypergeometric probability distribution (as described above), and then estimated the false discovery rate (q-value) to correct for multiple comparisons, using the P-value obtained in the hypergeometric test and the ‘q-value’ program (26). The conventional statistical analyses (ANOVA and Kruskal–Wallis tests, polynomial regression) were done using the Statgraphics Plus (Statistical Graphics Co.) software package. The star plot (Figure 6C) was drawn using the Statistica (StatSoft, Inc.) package.
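The grouping of genes by expression breadth described under ‘Gene sequences and expression’ can be sketched in Python as follows. This is an illustration only, under two assumptions not stated in the text: the ‘dataset median’ is taken over all gene-by-tissue signals pooled together, and genes with tied breadth may be split between neighbouring bins to keep the bins equal in size. All names are hypothetical.

```python
from statistics import median


def expression_breadth(signals_by_gene):
    """Return {gene: number of tissues in which the gene counts as expressed}.

    `signals_by_gene` maps a gene to its list of per-tissue signals.
    A gene is 'expressed' in a tissue if its signal exceeds the median of all
    pooled signals (assumption: one global median, not a per-tissue median).
    """
    pooled = [s for signals in signals_by_gene.values() for s in signals]
    cutoff = median(pooled)
    return {gene: sum(s > cutoff for s in signals)
            for gene, signals in signals_by_gene.items()}


def equal_size_bins(breadth, n_bins=7):
    """Rank genes by breadth and cut the ranking into n_bins roughly equal groups.

    Ties in breadth are broken by gene name, so tied genes can land in
    adjacent bins (an assumption of this sketch).
    """
    ranked = sorted(breadth, key=lambda gene: (breadth[gene], gene))
    bins = [[] for _ in range(n_bins)]
    for rank, gene in enumerate(ranked):
        bins[rank * n_bins // len(ranked)].append(gene)
    return bins


if __name__ == "__main__":
    toy = {"g1": [1, 1, 1], "g2": [9, 9, 9], "g3": [1, 9, 1], "g4": [9, 9, 1]}
    breadth = expression_breadth(toy)
    print(breadth)
    print(equal_size_bins(breadth, n_bins=2))
```

With seven bins, the first bin would hold the most tissue-specific genes and the last bin the most broadly expressed (housekeeping) genes; the intermediate bins correspond to the intermediately expressed genes discussed below.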
RESULTS

General picture

The intermediately expressed human genes are longer in both the coding and intronic parts (Figure 2A and B
The ‘genome design’ model suggests that the length of a gene (including its intronic part) is roughly proportional to its functional load (4,11). Do the intermediately expressed genes indeed have a higher complexity? First of all, the number of encoded protein functional domains is greater in them (Figure 3A
As in the case of coding sequence, the greater length of intronic sequence in the intermediately expressed genes cannot be explained by selection for economy and/or mutation bias. Because of the (above-mentioned) strong correlation between average expression level and among-tissues expression breadth, the economy selection should be more effective in the intermediately expressed genes than in the more narrowly expressed (more tissue-specific) genes. Therefore, in the case of selection for economy, intronic length should decrease monotonically with expression breadth. If the mutation bias were associated with expression level and/or expression breadth [because transcription can increase mutation and recombination rate (28,29)], its effect should also change monotonically with these parameters.

Informational approach

The Shannon information theory defines information as a measure of surprise (uncertainty) of a message estimated using the prior probability of this message (25). This approach allows estimating the information of any within-genome object in the genomewide context, which can be used for calculation of the prior probability (see Materials and Methods). The average information (uncertainty) of protein functional domains is greater in the intermediately expressed genes (Figure 3D

From the standpoint of information theory, regulation of intermediately expressed genes should involve a higher informational load compared with both housekeeping and tissue-specific genes because of a more complex choice of switch-on/off transition (Figure 1
On the protein sequence level, the intermediately expressed genes show the higher average information of amino acid tuples of different sizes (Figure 5A
In introns, there is a similar picture with nucleotide tuples (Figure 6A
However, on the level of sequence tuples (in contrast to explicitly functional objects such as protein functional domains, transcription factor sets, pathways, protein interactions and Gene Ontology categories), there is a problem of discerning information from noise caused by possible redundancy (degeneracy) of the sequence level (especially in the case of intronic sequence). The use of reduced alphabets with more evolutionarily stable letters (i.e. reflecting those sequence properties that are more tightly linked to function) should reduce this noise. Thus, for amino acid tuples, the picture was similar even with a 3-letter alphabet (legend to Figure 5

DISCUSSION

The whole picture can be summarized as follows. It was argued that the number of genes coding for transcription factors grows faster than the total number of genes, which limited the growth of prokaryotic genomes (34). The problem of regulatory complexity turns out to be even more severe for the eukaryotic genomes (35,36). The most complex regulatory problems should appear in the case of intermediately expressed genes (Figure 1

The domain architecture is considered the most important level of protein functional complexity, especially in the eukaryotic genomes (39–41). Proteins encoded by the intermediately expressed genes are shown here to consist of a greater number of various domains and therefore may perform more complex and diverse functions. The higher complexity of the intermediately expressed genes is also reflected in the frequency of amino acid tuples in encoded proteins and nucleotide tuples in introns and promoter regions. A possible functional load of introns is discussed in (11,42–47).
Briefly, introns can harbor a plethora of regulatory elements acting in multiple ways: the interaction with transcription factors (as enhancers and suppressors), the regulation mediated by splicing, and the action of noncoding RNAs located in introns (to say nothing of alternative splicing, which alters the protein structure). It is noteworthy that the first intron, which is more often known to contain regulatory elements (48,49), is longer in the intermediately expressed genes (Supplementary Figure 7). The situation is further complicated by the participation of introns in chromatin organization and the interplay of the latter with transcriptional regulation (e.g. 10,50).

It is interesting that in yeast, introns are longer in the highly expressed genes (51), which contradicts the ‘selection for economy’ model. [The same is probably true for some other unicellular organisms, judging by the correlation between intron length and frequency of optimal codons (51)]. This fact indicates that in introns of unicellular organisms, the amount of activating elements outweighs the amount of suppressing ones. The fraction of non-housekeeping (i.e. generally suppressed) genes is much lower in unicellular organisms; therefore, there should be a lower amount of suppressing elements in their introns. The maximum chromatin condensation is 5-fold lower in yeast than in mammals (52), which suggests that yeast introns should be less loaded with chromatin-condensation function.

It should be noted that the ‘selection for economy’ model comes in two flavors: ‘energy economy’ and ‘time economy’, which were contrasted in the case of human bi-directional genes (6,7). The former was rejected in favor of the latter because antisense genes are both shorter and more narrowly expressed than the corresponding sense genes (6,7). However, the antisense genes can be miniaturized because they should be accommodated within the loci of the sense genes, which is consistent with the ‘genome design’ model (11).
(Also, their shorter length may be adequate for their function.) Moreover, in contrast to the energy economy, time economy is not additive in a piecemeal way [as in ‘beanbag genetics’ (53)]. In other words, the speed of an intracellular event probably cannot be changed without corresponding changes in other parts of the system. (Imagine an electronic circuit where some events are accelerated without adjustment of the others.) Therefore, time economy is closer in sense to ‘genome design’ because in this case genomic structure should be selected as a system [for timing design (54)].

The combinatorial control of gene expression involving cooperation of multiple transcription factors is now an emerging theme (50,55,56). Due to the most complex choice of switch-on/off transition in the case of intermediately expressed genes (according to the information theory), regulation of these genes should be more complex. Therefore, it may involve a greater amount of multiple regulatory factors (and their binding sites). Finally, evolutionary design becomes a recurrent theme in systems biology of gene and protein networks (54,57–61). It may have a counterpart in the blueprint of these networks (genomic structure).

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

Acknowledgments

I thank two anonymous reviewers for helpful comments. This work was supported by the Russian Foundation for Basic Research (RFBR) and by the Programme of the Presidium of the Russian Academy of Sciences ‘Molecular and Cellular Biology’ (MCB RAS). The Open Access publication charges for this article were waived by Oxford University Press.

Conflict of interest statement. None declared.

REFERENCES

1. Castillo-Davis C.I., Mekhedov S.L., Hartl D.L., Koonin E.V., Kondrashov F.A. Selection for short introns in highly expressed genes. Nature Genet. 2002;31:415–418.
2. Eisenberg E., Levanon E.Y. Human housekeeping genes are compact. Trends Genet. 2003;19:362–365.
3. Urrutia A.O., Hurst L.D. The signature of selection mediated by expression on human genes. Genome Res. 2003;13:2260–2264.
4. Vinogradov A.E. Compactness of human housekeeping genes: selection for economy or genomic design? Trends Genet. 2004;20:248–253.
5. Vinogradov A.E. Evolution of genome size: multi-level selection, mutation bias or dynamical chaos? Curr. Opin. Genet. Devel. 2004;14:620–626.
6. Chen J., Sun M., Hurst L.D., Carmichael G.G., Rowley J.D. Human antisense genes have unusually short introns: evidence for selection for rapid transcription. Trends Genet. 2005;21:203–207.
7. Chen J., Sun M., Rowley J.D., Hurst L.D. The small introns of antisense genes are better explained by selection for rapid transcription than by ‘genomic design’. Genetics. 2005;171:2151–2155.
8. Cohen-Gihon I., Lancet D., Yanai I. Modular genes with metazoan-specific domains have increased tissue specificity. Trends Genet. 2005;21:210–213.
9. Sironi M., Menozzi G., Comi G.P., Cagliani R., Bresolin N., Pozzoli U. Analysis of intronic conserved elements indicates that functional complexity might represent a major source of negative selection on non-coding sequences. Hum. Mol. Genet. 2005;14:2533–2546.
10. Vinogradov A.E. Noncoding DNA, isochores and gene expression: nucleosome formation potential. Nucleic Acids Res. 2005;33:559–563.
11. Vinogradov A.E. ‘Genome design’ model: evidence from conserved intronic sequence in human-mouse comparison. Genome Res. 2006;16:347–354.
12. Pruitt K.D., Tatusova T., Maglott D.R. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005;33:D501–D504.
13. Su A.I., Wiltshire T., Batalov S., Lapp H., Ching K.A., Block D., Zhang J., Soden R., Hayakawa M., Kreiman G., et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc. Natl Acad. Sci. USA. 2004;101:6062–6067.
14. Bateman A., Coin L., Durbin R., Finn R.D., Hollich V., Griffiths-Jones S., Khanna A., Marshall M., Moxon S., Sonnhammer E.L., et al. The Pfam protein families database. Nucleic Acids Res. 2004;32:D138–D141.
15. Mulder N.J., Apweiler R., Attwood T.K., Bairoch A., Bateman A., Binns D., Bradley P., Bork P., Bucher P., Cerutti L., et al. InterPro, progress and status in 2005. Nucleic Acids Res. 2005;33:D201–D205.
16. Subramanian A., Tamayo P., Mootha V.K., Mukherjee S., Ebert B.L., Gillette M.A., Paulovich A., Pomeroy S.L., Golub T.R., Lander E.S., et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA. 2005;102:15545–15550.
17. Kanehisa M., Goto S., Hattori M., Aoki-Kinoshita K.F., Itoh M., Kawashima S., Katayama T., Araki M., Hirakawa M. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 2006;34:D354–D357.
18. Joshi-Tope G., Gillespie M., Vastrik I., D'Eustachio P., Schmidt E., de Bono B., Jassal B., Gopinath G.R., Wu G.R., Matthews L., et al. Reactome: a knowledgebase of biological pathways. Nucleic Acids Res. 2005;33:D428–D432.
19. Wheeler D.L., Barrett T., Benson D.A., Bryant S.H., Canese K., Church D.M., DiCuccio M., Edgar R., Federhen S., Helmberg W., et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2005;33:D39–D45.
20. Romero P., Wagg J., Green M.L., Kaiser D., Krummenacker M., Karp P.D. Computational prediction of human metabolic pathways from the complete human genome. Genome Biol. 2005;6:R2.
21. The Gene Ontology Consortium. The Gene Ontology (GO) project in 2006. Nucleic Acids Res. 2006;34:D322–D326.
22. von Mering C., Jensen L.J., Snel B., Hooper S.D., Krupp M., Foglierini M., Jouffre N., Huynen M.A., Bork P. STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res. 2005;33:D433–D437.
23. Suzuki Y., Yamashita R., Sugano S., Nakai K. DBTSS, DataBase of Transcriptional Start Sites: progress report 2004. Nucleic Acids Res. 2004;32:D78–D81.
24. Li T., Fan K., Wang J., Wang W. Reduction of protein sequence complexity by residue grouping. Protein Eng. 2003;16:323–330.
25. Shannon C. A mathematical theory of communication. Bell System Techn. J. 1948;27:379–423.
26. Storey J.D., Tibshirani R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA. 2003;100:9440–9445.
27. Drummond D.A., Bloom J.D., Adami C., Wilke C.O., Arnold F.H. Why highly expressed proteins evolve slowly. Proc. Natl Acad. Sci. USA. 2005;102:14338–14343.
28. Aguilera A. The connection between transcription and genomic instability. EMBO J. 2002;21:195–201.
29. Comeron J.M. Selective and mutational patterns associated with gene expression in humans: influences on synonymous composition and intron presence. Genetics. 2004;167:1293–1304.
30. Fan K., Wang W. What is the minimum number of letters required to fold a protein? J. Mol. Biol. 2003;328:921–926.
31. Zhang L., Kasif S., Cantor C.R., Broude N.E. GC/AT-content spikes as genomic punctuation marks. Proc. Natl Acad. Sci. USA. 2004;101:16855–16860.
32. Vinogradov A.E. Dualism of gene GC content and CpG pattern in regard to expression in the human genome: magnitude versus breadth. Trends Genet. 2005;21:639–643.
33. Kudla G., Lipinski L., Caffin F., Helwak A., Zylicz M. High guanine and cytosine content increases mRNA levels in mammalian cells. PLoS Biol. 2006;4:e180.
34. Mattick J.S., Gagen M.J. Accelerating networks. Science. 2005;307:856–858.
35. Claverie J.M. Gene number. What if there are only 30,000 human genes? Science. 2001;291:1255–1257.
36. Szathmary E., Jordan F., Pal C. Molecular biology and evolution. Can genes explain biological complexity? Science. 2001;292:1315–1316.
37. Vinogradov A.E. Isochores and tissue-specificity. Nucleic Acids Res. 2003;31:5212–5220.
38. Jongeneel C.V., Delorenzi M., Iseli C., Zhou D., Haudenschild C.D., Khrebtukova I., Kuznetsov D., Stevenson B.J., Strausberg R.L., Simpson A.J., Vasicek T.J. An atlas of human gene expression from massively parallel signature sequencing (MPSS). Genome Res. 2005;15:1007–1014.
39. Vogel C., Bashton M., Kerrison N.D., Chothia C., Teichmann S.A. Structure, function and evolution of multidomain proteins. Curr. Opin. Struct. Biol. 2004;14:208–216.
40. Orengo C.A., Thornton J.M. Protein families and their evolution—a structural perspective. Annu. Rev. Biochem. 2005;74:867–900.
41. Lin K., Zhu L., Zhang D.Y. An initial strategy for comparing proteins at the domain architecture level. Bioinformatics. 2006;22:2081–2086.
42. Le Hir H., Nott A., Moore M.J. How introns influence and enhance eukaryotic gene expression. Trends Biochem. Sci. 2003;28:215–220.
43. Nott A., Meislin S.H., Moore M.J. A quantitative analysis of intron effects on mammalian gene expression. RNA. 2003;9:607–617.
44. Pozzoli U., Sironi M. Silencers regulate both constitutive and alternative splicing events in mammals. Cell Mol. Life Sci. 2005;62:1579–1604.
45. Mattick J.S. RNA regulation: a new genetics? Nature Rev. Genet. 2004;5:316–323.
46. Fedorova L., Fedorov A. Puzzles of the human genome: why do we need our introns? Curr. Genomics. 2005;6:589–595.
47. Pang K.C., Frith M.C., Mattick J.S. Rapid evolution of noncoding RNAs: lack of conservation does not mean lack of function. Trends Genet. 2006;22:1–5.
48. Majewski J., Ott J. Distribution and characterization of regulatory elements in the human genome. Genome Res. 2002;12:1827–1836.
49. Keightley P.D., Gaffney D.J. Functional constraints and frequency of deleterious mutations in noncoding DNA of rodents. Proc. Natl Acad. Sci. USA. 2003;100:13402–13406.
50. Barrera L.O., Ren B. The transcriptional regulatory code of eukaryotic cells—insights from genome-wide analysis of chromatin organization and transcription factor binding. Curr. Opin. Cell Biol. 2006;18:291–298.
51. Vinogradov A.E. Intron length and codon usage. J. Mol. Evol. 2001;52:2–5.
52. Russell P., Nurse P. Schizosaccharomyces pombe and Saccharomyces cerevisiae: a look at yeasts divided. Cell. 1986;45:781–782.
53. Crow J.F. The beanbag lives on. Nature. 2001;409:771.
54. Zaslaver A., Mayo A.E., Rosenberg R., Bashkin P., Sberro H., Tsalyuk M., Surette M.G., Alon U. Just-in-time transcription program in metabolic pathways. Nature Genet. 2004;36:486–491.
55. Ogata K., Sato K., Tahirov T.H. Eukaryotic transcriptional regulatory complexes: cooperativity from near and afar. Curr. Opin. Struct. Biol. 2003;13:40–48.
56. Remenyi A., Scholer H.R., Wilmanns M. Combinatorial control of gene expression. Nature Struct. Mol. Biol. 2004;11:812–815.
57. Alon U. Biological networks: the tinkerer as an engineer. Science. 2003;301:1866–1867.
58. Powell K. All systems go. J. Cell Biol. 2004;165:299–303.
59. Kashtan N., Alon U. Spontaneous evolution of modularity and network motifs. Proc. Natl Acad. Sci. USA. 2005;102:13773–13778.
60. Zhang L.V., King O.D., Wong S.L., Goldberg D.S., Tong A.H., Lesage G., Andrews B., Bussey H., Boone C., Roth F.P. Motifs, themes and thematic maps of an integrated Saccharomyces cerevisiae interaction network. J. Biol. 2005;4:6.
61. Yu H., Xia Y., Trifonov V., Gerstein M. Design principles of molecular networks revealed by global comparisons and composite motifs. Genome Biol. 2006;7:R55.