Comp Funct Genomics. 2003 July; 4(4): 432–441. doi: 10.1002/cfg.311. PMCID: PMC2447361
Copyright © 2003 Hindawi Publishing Corporation.

Unravelling the ORFan Puzzle

Naomi Siew 1,2 and Daniel Fischer 2

1 Department of Chemistry, Ben Gurion University, Beer-Sheva, 84105, Israel
2 Bioinformatics Group, Department of Computer Science, Ben Gurion University, Beer-Sheva, 84105, Israel

Received May 7, 2003; Revised June 5, 2003; Accepted June 5, 2003.

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

ORFans are open reading frames (ORFs) with no detectable sequence similarity
to any other sequence in the databases. Each newly sequenced genome contains a
significant number of ORFans. Therefore, ORFans entail interesting evolutionary
puzzles. However, little can be learned about them using bioinformatics tools, and
their study seems to have been underemphasized. Here we present some of the
questions that the existence of so many ORFans has raised and review some of
the studies aimed at understanding ORFans, their functions and their origins. These
works have demonstrated that ORFans are an untapped source of research, requiring
further computational and experimental studies.