Copyright © 2007 by The National Academy of Sciences of the USA

Evolution

Evolutionary dynamics of olfactory receptor genes in Drosophila species

Institute of Molecular Evolutionary Genetics and Department of Biology, Pennsylvania State University, 328 Mueller Laboratory, University Park, PA 16802

*To whom correspondence may be addressed. E-mail: mun12/at/psu.edu or nxm2/at/psu.edu

Contributed by Masatoshi Nei, March 7, 2007.

Author contributions: M. Nozawa and M. Nei designed research; M. Nozawa analyzed data; and M. Nozawa and M. Nei wrote the paper.

Received February 23, 2007.

Abstract

Olfactory receptor (OR) genes are of vital importance for animals to find food, identify mates, and avoid dangers. In mammals, the number of OR genes is large and varies extensively among different orders, whereas, in insects, the extent of interspecific variation appears to be small, although only a few species have been studied. To understand the evolutionary changes of OR genes, we identified all OR genes from 12 Drosophila species, of which the evolutionary time is roughly equivalent to that of eutherian mammals. The results showed that all species examined have similar numbers (≈60) of functional OR genes. Phylogenetic analysis indicated that the ancestral species also had similar numbers of genes, but there were frequent gains and losses of genes that occurred in each evolutionary lineage. It appears that tandem duplication and random inactivation of duplicate genes are the major factors of gene number change. However, chromosomal rearrangements have contributed to the establishment of genome-wide distribution of OR genes. These results suggest that the repertoire of OR genes in Drosophila has been quite stable compared with the mammalian genes.
The difference in evolutionary pattern between Drosophila and mammals can be explained partly by the differences of gene expression mechanisms and partly by the environmental and behavioral differences.

Keywords: birth-and-death evolution, insect evolution, multigene family

Olfactory receptor (OR) genes form one of the largest multigene families in animals, and the number of genes varies extensively among different mammalian orders (≈400–1,200 genes) (1, 2). Insects also have many OR genes, but these genes are only remotely related to vertebrate OR genes, and there is virtually no sequence similarity between them (3). In addition to the extensive sequence divergence, there is a structural difference between insect and vertebrate OR genes. Both belong to the G protein-coupled receptor gene superfamily, but insect OR genes contain introns (4) whereas vertebrate OR genes have no introns in the protein-coding region (5).

OR genes have been studied in a few insect species, and it has been reported that fruit flies (4), mosquitoes (6), and honey bees (7) have ≈60, ≈80, and ≈160 genes, respectively. This suggests that variation in the number of OR genes is smaller in insects than in mammals. However, to understand the evolutionary dynamics of OR genes in insects, we need more information about the gene repertoire of closely related species.

Fortunately, draft genome sequences of 12 Drosophila species have been released from the Assembly/Alignment/Annotation (AAA) database. The 12 species are D. melanogaster, D. simulans, D. sechellia, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, D. persimilis, D. willistoni, D. virilis, D. mojavensis, and D. grimshawi (the genus name will be omitted in the following). Molecular data have suggested that these species evolved and diverged during the last 63 million years (MY) (8), which is somewhat shorter than but similar to the divergence times of eutherian mammals (<100 MY) (9).
This allows us to compare the evolutionary dynamics of OR genes of Drosophila and mammals. We have therefore conducted an evolutionary study of OR genes from these 12 Drosophila species. The results obtained are presented in this article.

Results

Numbers of OR Genes in Drosophila Species. Our homology search (see Materials and Methods) detected 711 functional, 67 nonfunctional, and 34 partial OR genes in the genomes of 12 Drosophila species. All species examined have similar numbers of functional OR genes and much smaller numbers of pseudogenes (Table 1). Although simulans, sechellia, persimilis, and virilis show somewhat smaller numbers of functional genes, they have larger numbers of pseudogenes and partial genes. Here a partial gene refers to a gene with an open reading frame (ORF) truncated at the end of a genomic contig studied. This gene may therefore become a functional gene when the entire genomic sequence is assembled. Some pseudogenes might also become functional genes later because the nonsense or frameshift mutations identified could be caused by sequencing errors. For these reasons, our estimates of functional genes are likely to be minimums except in melanogaster, where the genome sequence is well established. In particular, the Hawaiian fruit fly grimshawi, which has many pseudogenes and partial genes, may turn out to have the largest number of functional genes.

In this connection, it should be mentioned that the gene OR83b (OR49 in our notation) is known to be coexpressed with another OR gene in most olfactory receptor neurons (ORNs) (10, 11) and is highly conserved even among different orders of insects (12). Yet, this gene in simulans was judged as a pseudogene because it contained a stop codon. However, if we consider the functional importance of this gene, the stop codon is likely to have occurred because of sequencing errors. We have therefore decided to regard it as a functional gene in this paper.
Furthermore, the previous study showed that melanogaster has 62 functional OR genes (4), but we regarded one of them (OR85e or OR55) as a pseudogene because of large deletions.
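The three categories used in this section (functional, pseudogene, partial) can be illustrated as a simple decision rule. The following is only a minimal sketch under stated assumptions: the `Candidate` record and its field names are hypothetical, the precedence given to a disrupting mutation over contig truncation is our assumption, and the sketch does not reproduce the manual annotation actually performed in the study.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one annotated OR hit (field names are assumptions)."""
    name: str
    disrupted: bool                # nonsense/frameshift mutation or large deletion
    truncated_at_contig_end: bool  # ORF runs off the end of a genomic contig


def classify(c: Candidate) -> str:
    # Assumption: a disrupting mutation takes precedence over truncation.
    if c.disrupted:
        return "pseudogene"
    # A partial gene has an ORF truncated at the end of a contig and may
    # prove functional once the full genomic sequence is assembled.
    if c.truncated_at_contig_end:
        return "partial"
    return "functional"


def tally(candidates) -> Counter:
    """Count candidates per category, as in a per-species summary table."""
    return Counter(classify(c) for c in candidates)
```

Because partial genes and some pseudogenes may later be reclassified, a tally of this kind gives a lower bound on the number of functional genes, as noted in the text.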
Chromosomal Locations of OR Genes and Their Phylogenetic Relationships. Fig. 1
Comparison of the melanogaster and yakuba genomic sequences shows that there are two lost genes in melanogaster (corresponding to the open triangles in the yakuba sequence) and one lost gene in yakuba. One melanogaster pseudogene OR55 is apparently the ortholog of yakuba gene OR55 and is likely to be one of the two lost genes because its adjacent OR genes showed 1:1 orthologous relationships with the same gene order. In addition, the melanogaster genome contains one set of duplicate genes (solid lines) that apparently occurred after divergence of the two species. Similarly, the yakuba genome has two sets of duplicate genes. In other words, it appears that melanogaster acquired one gene and lost two genes whereas yakuba acquired two genes and lost one gene after their divergence. Comparison of the yakuba and pseudoobscura sequences suggests that yakuba acquired four genes and lost two genes (open rectangles) whereas pseudoobscura acquired seven genes and lost four genes after their divergence (Fig. 1 The genomic maps presented in Fig. 1 We noticed that 7 of the 12 duplication events observed in the three species occurred by tandem duplication (<10 kb apart on the same strand). This can also be seen in Fig. 2
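The tandem-duplication criterion stated above (<10 kb apart on the same strand) can be written as a small predicate. This is a hedged sketch: the `Gene` record is hypothetical, and measuring the distance from the end of the upstream gene to the start of the downstream gene is our assumption, because the text does not specify how the distance was measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical genomic location of one OR gene."""
    name: str
    chrom: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"


def is_tandem(a: Gene, b: Gene, max_gap: int = 10_000) -> bool:
    """True if two genes lie on the same chromosome arm and strand, <10 kb apart."""
    if a.chrom != b.chrom or a.strand != b.strand:
        return False
    first, second = sorted((a, b), key=lambda g: g.start)
    # Assumption: the gap is the intergenic distance between the two genes.
    gap = second.start - first.end
    return gap < max_gap
```

For example, two genes on the same strand separated by 5.5 kb satisfy the criterion, whereas the same pair on opposite strands does not.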
Fig. 4
The numbers of genes for the 15 phylogenetic clades in each of the 12 Drosophila species are presented in Table 2. The number of OR genes varies considerably among these clades, clade L having the largest number of genes. In all species, clade O has only one gene, which is orthologous to OR83b (OR49) in melanogaster. The genes belonging to this clade are distantly related to other OR genes (Fig. 4
Numbers of OR Genes in Ancestral Species. We estimated the numbers of OR genes in ancestral species and gains and losses of genes during Drosophila evolution, using two different methods. One is the modified reconciled-tree (MR) method (17). This method is based on the comparison of a bootstrap condensed gene tree (18) with the species tree, and the numbers are estimated under the parsimony principle. In this study, we used a 50% bootstrap condensed tree of OR genes. The other is the gene order (GO) method developed in this study. In this method, the adjacent genes of each OR gene are considered, and when at least one of the two adjacent genes for a pair of OR genes gives the same best hit gene in the Blast search (19), the OR genes are regarded as orthologs or paralogs. By using this information, the number of gene gains and losses in the evolutionary process can be estimated (see Materials and Methods). For this analysis, we used five representative species including both subgenera, because applying the GO method to a larger number of species was complicated.
Fig. 5
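The adjacency criterion of the GO method described above can be sketched as follows. This is an illustration only, not the authors' implementation: we assume that each OR gene has been summarized by the best-hit identifiers of its two flanking genes (with `None` where a flanking gene or hit is unavailable), and the identifiers used in the usage note are hypothetical. The subsequent counting of gains and losses is not shown.

```python
from typing import Optional, Tuple

# Best-hit IDs of the (upstream, downstream) genes flanking one OR gene.
Flanks = Tuple[Optional[str], Optional[str]]


def share_flanking_hit(a: Flanks, b: Flanks) -> bool:
    """True if at least one flanking gene of each OR gene has the same best hit.

    Under the GO criterion, such a pair of OR genes would be regarded
    as orthologs or paralogs.
    """
    hits_a = {h for h in a if h is not None}
    hits_b = {h for h in b if h is not None}
    return bool(hits_a & hits_b)
```

For instance, `share_flanking_hit(("geneA", "geneB"), ("geneC", "geneA"))` is true because both OR genes have a neighbor whose best hit is the hypothetical `geneA`, regardless of which side the neighbor lies on.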
Comparing the genomic maps of OR genes for melanogaster and yakuba, we previously concluded that melanogaster probably gained one gene and lost two genes after their divergence, whereas yakuba gained two genes and lost one gene. This conclusion is the same as that obtained by the MR method and similar to that obtained by the GO method. However, comparison of the genomic maps of OR genes for yakuba and pseudoobscura gave a much smaller number of gene gains and losses than those obtained by the MR and GO methods. These results suggest that when distantly related species are compared, the MR and GO methods give more accurate estimates than the comparison of genomic sequences.

Discussion

We have seen that the repertoire of OR genes in Drosophila species has been nearly the same for the entire period of Drosophila evolution. Yet, we identified frequent gains and losses of OR genes in the evolutionary process, indicating that the birth-and-death process (20, 21) has operated during the evolution of Drosophila OR genes. Unequal crossing over appears to be the major mechanism for increasing gene number, because closely related genes in the phylogenetic tree were observed in tandem arrays in the genome. Although gene retroposition can increase the number of genes in a genome, this factor appears to have played a negligible role in the present case, because only two of 711 functional OR genes were intronless. Intronless genes could occur by retroposition, but they can also be generated by deletion of introns. Chromosomal rearrangements have contributed to the establishment of genome-wide distribution of OR genes, because the gene order of the orthologous OR genes is substantially different between distantly related species. These aspects of evolutionary changes of OR genes are essentially the same for both Drosophila and mammalian species (22).

However, there are two major differences in the olfactory system between mammals and Drosophila.
One is the number of glomeruli in the olfactory bulb of mammals and in the antennal lobe of Drosophila relative to the number of OR genes. In Drosophila, each ORN expresses only one or two OR genes (in addition to the co-expression of gene OR83b in most neurons), and the olfactory information from ORNs that express the same OR genes is transmitted to a single glomerulus (23). The number of glomeruli is therefore similar to the number of different ORs. For example, melanogaster has ≈60 ORs (4) and ≈50 glomeruli (24), and the honey bee has ≈160 ORs (7) and ≈160 glomeruli (25). In mammals, each ORN also expresses only one OR gene, but the number of glomeruli is considerably greater than the number of ORs. For example, the mouse has ≈1,000 ORs (22) and 1,800 glomeruli (26), and the human has ≈400 ORs (27) and ≈8,000 glomeruli (28). These observations suggest that the olfactory information received by ORs can be transmitted to glomeruli in a more flexible way in mammals than in Drosophila.

The other difference is the expression pattern of OR genes. In Drosophila, a specific OR gene tends to be expressed deterministically in a specific ORN, which produces a precise expression pattern for a given OR gene (29). Therefore, if an OR gene is duplicated or lost from the genome, the gene expression pattern may be disturbed. In mammals, however, one of the clustered OR genes in the genome is stochastically chosen to be expressed in each ORN (30). Therefore, the expression pattern of OR genes appears to be considerably different among different individuals, and consequently the number of OR genes may change relatively easily in the evolutionary process. Actually, at least 26 loci of OR genes are known to be polymorphic between functional and nonfunctional alleles among human individuals (31).

Of course, the most important factor for determining the number of OR genes in mammals would be environmental conditions.
Mammalian species inhabit a wide range of environments, including various temperature zones, diverse habitats (terrestrial, aquatic, and aerial living), etc. It is therefore possible that a varying degree of olfaction required for living in different environments has caused the evolutionary change of OR genes and this change has occurred because of the flexible system of olfaction in mammals. Drosophila species also inhabit highly variable environments (32), but the number of OR genes may not have changed greatly because of the rigid expression pattern and signal pathway mentioned above.

It should be noted that mammalian species generally have a large number of OR pseudogenes. It is also known that removal of a substantial proportion of the olfactory bulb in laboratory rats does not seriously affect their survival (33). Therefore, a considerable portion of the change in the number of OR genes could be more or less neutral.

Materials and Methods

Identification of OR Genes in the Draft Genome Sequences. The genome sequences of melanogaster (release 4.3) and pseudoobscura (release 2.0) were downloaded from FlyBase (34). Other genome sequences were obtained from the AAA database (http://rana.lbl.gov/drosophila/): simulans (dsim_washu_2jun05_mosaic.tar.gz), sechellia (dsec_broad_28oct05.tar.gz), yakuba (dyak_washu_13dec05.tar.gz), erecta (dere_freeze1.tar.gz), ananassae (dana_freeze1.tar.gz), persimilis (dper_broad_28oct05.tar.gz), willistoni (dwil_caf1.tar.gz), virilis (dvir_freeze1.tar.gz), mojavensis (dmoj_freeze1.tar.gz), and grimshawi (dgri_freeze1.tar.gz). To identify the OR genes, we performed a two-round TBlastN (19) search with E-value ≤ 10^−10 against each genome sequence. In the first round, amino acid sequences of 61 OR genes annotated in the melanogaster genome were used as queries.
After collecting hit sequences, we manually annotated each hit sequence, because Drosophila OR genes have introns (4) and it was not easy to determine the ORF by using computer programs. In the second round, the procedures were repeated by using the functional OR genes identified in the first round to find additional OR genes. To extract only unique genes, we compared the nucleotide sequences of all functional OR genes with one another and eliminated any overlapping genes. The flowchart for the detailed procedure is shown in SI Fig. 8. Note that the alternative transcripts from the same locus were considered as different genes. There were two such loci in the melanogaster genome. Coding sequences of functional OR genes and the genomic locations of all OR genes are available from SI Data Set 1 and SI Table 3, respectively.

Estimation of the Numbers of OR Genes in Ancestral Species. The numbers of genes in ancestral species and gains and losses of OR genes in evolution were estimated by the MR (17) and the GO methods. The latter method is expected to give more reliable results when the bootstrap support of gene trees is low, although the adjacent genes may not always be identifiable because of gene rearrangements in the past. The detailed procedure of this method is described in Fig. 6
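The E-value filtering and overlap-removal steps of the homology search described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used in the study: we assume that hits are available in the standard 12-column tab-separated BLAST format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score), and we collapse overlapping hits by coordinates, whereas the study compared nucleotide sequences of manually annotated genes. The function names are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-10  # threshold stated in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Keep hits with E-value <= cutoff from 12-column tabular BLAST output.

    Returns tuples of (scaffold, start, end, strand, query, evalue).
    """
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        sstart, send = int(row[8]), int(row[9])
        evalue = float(row[10])
        if evalue <= cutoff:
            # Reversed subject coordinates indicate a minus-strand hit.
            strand = "+" if sstart <= send else "-"
            kept.append((row[1], min(sstart, send), max(sstart, send),
                         strand, row[0], evalue))
    return kept


def merge_overlapping(hits):
    """Collapse hits that overlap on the same scaffold and strand into loci."""
    loci = []
    ordered = sorted(hits, key=lambda h: (h[0], h[3], h[1]))
    for scaffold, start, end, strand, _query, _evalue in ordered:
        prev = loci[-1] if loci else None
        if prev and prev[0] == scaffold and prev[3] == strand and start <= prev[2]:
            # Extend the current locus to cover the overlapping hit.
            loci[-1] = (scaffold, prev[1], max(prev[2], end), strand)
        else:
            loci.append((scaffold, start, end, strand))
    return loci
```

Each merged locus would then be a candidate region for manual annotation; in a second round, the functional genes identified in the first round would serve as queries and the same two steps would be repeated.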
Supporting Information
Acknowledgments

We thank Saby Das, Zhenguo Lin, Jongmin Nam, Nikos Nikolaidis, Alex Rooney, Shigeru Saito, Claire T. Saito, Shozo Yokoyama, and Jianzhi Zhang for valuable comments on earlier versions of the manuscript. Steve Schaeffer provided us with information on the chromosome assembly of pseudoobscura. Jongmin Nam and Yoshihito Niimura gave us the scripts for the MR method. This work was supported by National Institutes of Health Grant GM020293 (to M. Nei).

Footnotes

The authors declare no conflict of interest.
This article contains supporting information online at www.pnas.org/cgi/content/full/0702133104/DC1.

References

1. Niimura Y, Nei M. J Hum Genet. 2006;51:505–517.
2. Aloni R, Olender T, Lancet D. Genome Biol. 2006;7:R88.
3. Bargmann CI. Nature. 2006;444:295–301.
4. Robertson HM, Warr CG, Carlson JR. Proc Natl Acad Sci USA. 2003;100:14537–14542.
5. Niimura Y, Nei M. Proc Natl Acad Sci USA. 2005;102:6039–6044.
6. Hill CA, Fox AN, Pitts RJ, Kent LB, Tan PL, Chrystal MA, Cravchik A, Collins FH, Robertson HM, Zwiebel LJ. Science. 2002;298:176–178.
7. Robertson HM, Wanner KW. Genome Res. 2006;16:1395–1403.
8. Tamura K, Subramanian S, Kumar S. Mol Biol Evol. 2004;21:36–44.
9. Kumar S, Hedges SB. Nature. 1998;392:917–920.
10. Larsson MC, Domingos AI, Jones WD, Chiappe ME, Amrein H, Vosshall LB. Neuron. 2004;43:703–714.
11. Benton R, Sachse S, Michnick SW, Vosshall LB. PLoS Biol. 2006;4:e20.
12. Krieger J, Klink O, Mohl C, Raming K, Breer H. J Comp Physiol A. 2003;189:519–526.
13. Richards S, Liu Y, Bettencourt BR, Hradecky P, Letovsky S, Nielsen R, Thornton K, Hubisz MJ, Chen R, Meisel RP, et al. Genome Res. 2005;15:1–18.
14. Saitou N, Nei M. Mol Biol Evol. 1987;4:406–425.
15. Kumar S, Tamura K, Nei M. Brief Bioinform. 2004;5:150–163.
16. Thompson JD, Higgins DG, Gibson TJ. Nucleic Acids Res. 1994;22:4673–4680.
17. Nam J, Nei M. Mol Biol Evol. 2005;22:2386–2394.
18. Nei M, Kumar S. Molecular Evolution and Phylogenetics. New York: Oxford Univ Press; 2000. pp. 165–186.
19. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Nucleic Acids Res. 1997;25:3389–3402.
20. Ota T, Nei M. Mol Biol Evol. 1994;11:469–482.
21. Nei M, Rooney AP. Annu Rev Genet. 2005;39:121–152.
22. Niimura Y, Nei M. Gene. 2005;346:13–21.
23. Ache BW, Young JM. Neuron. 2005;48:417–430.
24. Fishilevich E, Vosshall LB. Curr Biol. 2005;15:1548–1553.
25. Galizia CG, Menzel R. J Insect Physiol. 2001;47:115–130.
26. Royet JP, Souchier C, Jourdan F, Ploye H. J Comp Neurol. 1988;270:559–568.
27. Niimura Y, Nei M. Proc Natl Acad Sci USA. 2003;100:12235–12240.
28. Meisami E, Mikhail L, Baim D, Bhatnagar KP. Ann NY Acad Sci. 1998;855:708–715.
29. Ray A, van der Goes van Naters W, Shiraiwa T, Carlson JR. Neuron. 2007;53:353–369.
30. Serizawa S, Miyamichi K, Nakatani H, Suzuki M, Saito M, Yoshihara Y, Sakano H. Science. 2003;302:2088–2094.
31. Menashe I, Man O, Lancet D, Gilad Y. Nat Genet. 2003;34:143–144.
32. Powell JR. Progress and Prospects in Evolutionary Biology. New York: Oxford Univ Press; 1997. pp. 143–184.
33. Bisulco S, Slotnick B. Chem Senses. 2003;28:361–370.
34. Crosby MA, Goodman JL, Strelets VB, Zhang P, Gelbart WM. The FlyBase Consortium. Nucleic Acids Res. 2007;35:D486–D491.