Proc Natl Acad Sci U S A. Apr 24, 2007; 104(17): 7122–7127.
Published online Apr 16, 2007. doi: 10.1073/pnas.0702133104
PMCID: PMC1855360
Evolution

Evolutionary dynamics of olfactory receptor genes in Drosophila species

Abstract

Olfactory receptor (OR) genes are of vital importance for animals to find food, identify mates, and avoid dangers. In mammals, the number of OR genes is large and varies extensively among different orders, whereas, in insects, the extent of interspecific variation appears to be small, although only a few species have been studied. To understand the evolutionary changes of OR genes, we identified all OR genes from 12 Drosophila species, whose evolutionary time span is roughly equivalent to that of eutherian mammals. The results showed that all species examined have similar numbers (≈60) of functional OR genes. Phylogenetic analysis indicated that the ancestral species also had similar numbers of genes, but there were frequent gains and losses of genes in each evolutionary lineage. It appears that tandem duplication and random inactivation of duplicate genes are the major factors in gene number change. However, chromosomal rearrangements have contributed to the establishment of genome-wide distribution of OR genes. These results suggest that the repertoire of OR genes in Drosophila has been quite stable compared with the mammalian genes. The difference in evolutionary pattern between Drosophila and mammals can be explained partly by differences in gene expression mechanisms and partly by environmental and behavioral differences.

Keywords: birth-and-death evolution, insect evolution, multigene family

Olfactory receptor (OR) genes form one of the largest multigene families in animals, and the number of genes varies extensively among different mammalian orders (≈400–1,200 genes) (1, 2). Insects also have many OR genes, but these genes are remotely related to vertebrate OR genes, and there is virtually no sequence similarity between them (3). In addition to the extensive sequence divergence, there is a structural difference between insect and vertebrate OR genes. Both groups of genes belong to the G protein-coupled receptor gene superfamily, but insect OR genes contain introns (4) whereas vertebrate OR genes have no introns in the protein-coding region (5). OR genes have been studied in a few insect species, and it has been reported that fruit flies (4), mosquitoes (6), and honey bees (7) have ≈60, ≈80, and ≈160 genes, respectively. This suggests that variation in the number of OR genes is smaller in insects than in mammals. However, to understand the evolutionary dynamics of OR genes in insects, we need more information about the gene repertoire of closely related species.

Fortunately, draft genome sequences of 12 Drosophila species have been released through the Assembly/Alignment/Annotation (AAA) database. The 12 species are D. melanogaster, D. simulans, D. sechellia, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, D. persimilis, D. willistoni, D. virilis, D. mojavensis, and D. grimshawi (the genus name will be omitted in the following). Molecular data have suggested that these species evolved and diverged during the last 63 million years (MY) (8), which is somewhat shorter than but similar to the divergence times of eutherian mammals (<100 MY) (9). This allows us to compare the evolutionary dynamics of OR genes of Drosophila and mammals. We have therefore conducted an evolutionary study of OR genes from these 12 Drosophila species. The results obtained are presented in this article.

Results

Numbers of OR Genes in Drosophila Species.

Our homology search (see Materials and Methods) detected 711 functional, 67 nonfunctional, and 34 partial OR genes in the genomes of 12 Drosophila species. All species examined have similar numbers of functional OR genes and much smaller numbers of pseudogenes (Table 1). Although simulans, sechellia, persimilis, and virilis show somewhat smaller numbers of functional genes, they have larger numbers of pseudogenes and partial genes. Here a partial gene refers to a gene with an open reading frame (ORF) truncated at the end of a genomic contig studied. This gene may therefore become a functional gene when the entire genomic sequence is assembled. Some pseudogenes might also become functional genes later because the nonsense or frameshift mutations identified could be caused by sequencing errors. For these reasons, our estimates of functional genes are likely to be minimums except in melanogaster, where the genome sequence is well established. In particular, the Hawaiian fruit fly grimshawi, which has many pseudogenes and partial genes, may turn out to have the largest number of functional genes. In this connection, it should be mentioned that the gene OR83b (OR49 in our notation) is known to be coexpressed with another OR gene in most olfactory receptor neurons (ORNs) (10, 11) and is highly conserved even among different orders of insects (12). Yet, this gene in simulans was judged as a pseudogene because it contained a stop codon. However, if we consider the functional importance of this gene, the stop codon is likely to have occurred because of sequencing errors. We have therefore decided to regard it as a functional gene in this paper. Furthermore, the previous study showed that melanogaster has 62 functional OR genes (4), but we regarded one of them (OR85e or OR55) as a pseudogene because of large deletions.

Table 1.
Numbers of OR genes in the 12 Drosophila species

Chromosomal Locations of OR Genes and Their Phylogenetic Relationships.

Fig. 1 shows the chromosomal locations of OR genes in melanogaster, yakuba, and pseudoobscura, whose genome sequences are better assembled than others. OR genes are numbered from the left-hand side of the genome to the right-hand side in each species. In all these species, OR genes are widely distributed in the genome except for the dot chromosome (chromosome 4 in melanogaster and yakuba, and chromosome 5 in pseudoobscura), where no OR gene was observed. Phylogenetic analysis of these OR genes revealed their orthologous and paralogous relationships among the three species (see Fig. 2 for a segment of the phylogenetic tree). The orthologous genes between melanogaster and yakuba are arranged in the genome in essentially the same order. The genomic arrangement of OR genes in pseudoobscura was quite different from that in the other two species apparently because of the gene rearrangements that occurred in the past. However, Fig. 2 indicates that pseudoobscura almost always has a gene orthologous to the OR genes from the other two species.

Fig. 1.
Chromosomal locations of OR genes and their orthologous and paralogous relationships in melanogaster (mel), yakuba (yak), and pseudoobscura (pse). Long rods show functional OR genes, whereas short rods indicate either pseudogenes or partial genes. Rods ...
Fig. 2.
Partial segment of the neighbor-joining (NJ) tree (14) of functional OR genes from melanogaster (m), yakuba (y), and pseudoobscura (p) (236 aa used). In each species, the genes are named as mentioned in Fig. 1. Filled circles indicate that the interior ...

Comparison of the melanogaster and yakuba genomic sequences shows that there are two lost genes in melanogaster (corresponding to the open triangles in the yakuba sequence) and one lost gene in yakuba. One melanogaster pseudogene OR55 is apparently the ortholog of yakuba gene OR55 and is likely to be one of the two lost genes because its adjacent OR genes showed 1:1 orthologous relationships with the same gene order. In addition, the melanogaster genome contains one set of duplicate genes (solid lines) that apparently occurred after divergence of the two species. Similarly, the yakuba genome has two sets of duplicate genes. In other words, it appears that melanogaster acquired one gene and lost two genes whereas yakuba acquired two genes and lost one gene after their divergence. Comparison of the yakuba and pseudoobscura sequences suggests that yakuba acquired four genes and lost two genes (open rectangles) whereas pseudoobscura acquired seven genes and lost four genes after their divergence (Fig. 1). However, these estimates may not be reliable, because pseudoobscura and yakuba (or melanogaster) diverged ≈55 million years ago (MYA) (8). Therefore, we used different methods to estimate the numbers of gene gains and losses as shown in the next section.

The genomic maps presented in Fig. 1 are also useful for identifying chromosomal inversion and translocation events. Genes OR61 and OR62 in melanogaster are inverted relative to their orthologs of yakuba, and they are located on the DNA strand opposite to that of yakuba. In addition, some genes (e.g., OR24 to OR27) moved from one chromosome (2R) arm to another (2L) or vice versa. Multiple events of inversion and translocation apparently occurred between yakuba and pseudoobscura, so that it is difficult to infer every evolutionary event.

We noticed that 7 of the 12 duplication events observed in the three species occurred by tandem duplication (<10 kb apart on the same strand). This can also be seen in Fig. 2, which shows that the genes contiguously located on the chromosomes are phylogenetically very close. Fig. 3 shows the relationships between the genomic distance between two consecutive OR genes and their phylogenetic distance (Poisson-correction distances of amino acid sequences) within chromosomes. It is clear that there is a high correlation between the two quantities, suggesting that duplicate genes eventually spread through the entire chromosome by repeated chromosomal rearrangements.
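The two quantities compared in Fig. 3 can be illustrated with a minimal Python sketch. It assumes the standard Poisson-correction formula d = −ln(1 − p), where p is the proportion of differing amino acid sites, and it applies the tandem-duplication criterion quoted above (<10 kb apart on the same strand). The function names, the gap handling, and the way the intergenic distance is measured are assumptions made for illustration and are not taken from the study.

```python
import math

def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned amino acid sequences.
    Alignment columns with a gap ('-') in either sequence are skipped (an assumption)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared

def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-correction distance d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when all sites differ")
    return -math.log(1.0 - p)

def is_tandem_pair(start_a, end_a, strand_a, start_b, end_b, strand_b, max_gap=10_000):
    """True if two genes on the same chromosome lie on the same strand and the gap
    between them is below max_gap base pairs (gap measured end to start, an assumption)."""
    if strand_a != strand_b:
        return False
    gap = max(start_a, start_b) - min(end_a, end_b)
    return gap < max_gap

# Toy example with made-up sequences and coordinates.
print(round(poisson_distance("ACDE", "ACDF"), 4))
print(is_tandem_pair(100, 200, "+", 5000, 6000, "+"))
```

With real data, one would compute the distance for each pair of consecutive OR genes on a chromosome and plot it against the logarithm of their genomic separation, as in Fig. 3.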

Fig. 3.
Relationships between sequence divergence (Poisson-correction distances) and chromosomal distance of two consecutive functional OR genes for melanogaster (Left), yakuba (Center), and pseudoobscura (Right). Chromosomal distances are shown in logarithm. ...

Fig. 4 shows the NJ tree of functional OR genes from melanogaster, pseudoobscura (subgenus Sophophora), and virilis (subgenus Drosophila). The melanogaster gene OR83b (OR49) and its orthologs from different species were used as outgroups, because the gene is known to have diverged from other OR genes a long time ago (4) and have a function different from other OR genes (10, 11). OR genes of each species do not form a species-specific clade but are scattered throughout the tree. We classified OR genes into 15 phylogenetic clades (A–O), each of which was defined as the largest cluster of similar genes supported by a bootstrap value of ≥80%. These clades remained unchanged even in the tree constructed for all functional OR genes from the 12 Drosophila species [supporting information (SI) Fig. 7]. Because all these clades contained OR genes from the Sophophora and Drosophila species, they must have existed in the most recent common ancestor (MRCA) of these subgenera.

Fig. 4.
NJ tree of functional OR genes from melanogaster (mel), pseudoobscura (pse), and virilis (vir) (264 aa used). Bootstrap values are shown only for the clades, which were supported by ≥80% bootstrap values. Note that most of the interior branches ...

The numbers of genes for the 15 phylogenetic clades in each of the 12 Drosophila species are presented in Table 2. The number of OR genes varies considerably among these clades, clade L having the largest number of genes. In all species, clade O has only one gene, which is orthologous to OR83b (OR49) in melanogaster. The genes belonging to this clade are distantly related to other OR genes (Fig. 4). There are some other clades, which contain only one or two genes, but it is uncertain whether they have any distinct function or not. In this study, OR gene groups are defined by phylogenetic relationships only, and therefore the classification may not be related to gene function. Yet, it is interesting to note that the number of genes in each clade is more or less the same for all species.

Table 2.
Numbers of functional OR genes for 15 clades in each Drosophila species

Numbers of OR Genes in Ancestral Species.

We estimated the numbers of OR genes in ancestral species and gains and losses of genes during Drosophila evolution, using two different methods. One is the modified reconciled-tree (MR) method (17). This method is based on the comparison of a bootstrap condensed gene tree (18) with the species tree, and the numbers are estimated under the parsimony principle. In this study, we used a 50% bootstrap condensed tree of OR genes. The other is the gene order (GO) method developed in this study. In this method, the adjacent genes of each OR gene are considered, and when at least one of the two adjacent genes for a pair of OR genes gives the same best hit gene in the Blast search (19), the OR genes are regarded as orthologs or paralogs. By using this information, the number of gene gains and losses in the evolutionary process can be estimated (see Materials and Methods). For this analysis, we used five representative species including both subgenera, because the application of the GO method to a larger number of species was complicated.
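The adjacency test at the core of the GO method can be sketched roughly as follows. This is only one reading of the description above: two OR genes are treated as related when at least one flanking gene of each gives the same best Blast hit. The function name, the gene identifiers, and the `best_hit` dictionary are hypothetical, and the full procedure (Fig. 6 and SI Fig. 9) involves further steps that are not reproduced here.

```python
def shares_adjacent_best_hit(neighbors_x, neighbors_y, best_hit):
    """Return True if any flanking gene of OR gene X and any flanking gene of
    OR gene Y have the same best-hit gene.

    neighbors_x, neighbors_y: identifiers of the (up to two) adjacent genes.
    best_hit: hypothetical mapping from a flanking gene to its best Blast hit.
    Flanking genes without a recorded best hit are ignored.
    """
    hits_x = {best_hit[g] for g in neighbors_x if g in best_hit}
    hits_y = {best_hit[g] for g in neighbors_y if g in best_hit}
    return bool(hits_x & hits_y)

# Toy data: all identifiers below are made up for illustration.
best_hit = {
    "mel_g1": "refA",
    "mel_g2": "refB",
    "yak_g7": "refB",
    "yak_g8": "refC",
}
print(shares_adjacent_best_hit(("mel_g1", "mel_g2"), ("yak_g7", "yak_g8"), best_hit))
```

Gene pairs that pass this test across species would be candidate orthologs, and pairs within a species candidate paralogs; counting gains and losses then requires mapping these relationships onto the species tree.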

Fig. 5 shows estimates of the numbers of OR genes in ancestral species and gene gains and losses when the five Drosophila species were used. The estimates obtained by the MR and GO methods are generally similar to each other. Both methods showed that substantial numbers of gene gains and losses have occurred for each branch of the tree. Nevertheless, both extant and ancestral species contain similar numbers of OR genes despite these gains and losses. Similar results were obtained when we used different bootstrap threshold values (30–95%) in the MR method. These observations suggest that the OR repertoire of Drosophila species has been quite stable in the evolutionary process.

Fig. 5.
Estimates of the numbers of genes in ancestral species and gains and losses of genes during Drosophila evolution. (a) Estimates by the MR method using a 50% condensed tree. (b) Estimates by the GO method. Numbers in circles indicate the numbers of OR ...

Comparing the genomic maps of OR genes for melanogaster and yakuba, we previously concluded that melanogaster probably gained one gene and lost two genes after their divergence, whereas yakuba gained two genes and lost one gene. This conclusion is the same as that obtained by the MR method and similar to that obtained by the GO method. However, comparison of the genomic maps of OR genes for yakuba and pseudoobscura gave much smaller numbers of gene gains and losses than those obtained by the MR and GO methods. These results suggest that when distantly related species are compared, the MR and GO methods give more accurate estimates than the comparison of genomic sequences.
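The bookkeeping behind these statements is simple: a descendant's gene number equals the ancestral number plus gains minus losses, so gains and losses that nearly cancel leave the repertoire size almost unchanged even when turnover is not zero. The sketch below uses the melanogaster and yakuba gains and losses quoted above; the ancestral count of 60 is a hypothetical placeholder, not an estimate from the study.

```python
def descendant_count(ancestral: int, gains: int, losses: int) -> int:
    """Gene number in a descendant lineage given the ancestral number."""
    return ancestral + gains - losses

# (gains, losses) since the melanogaster-yakuba divergence, from the genomic maps.
events = {"melanogaster": (1, 2), "yakuba": (2, 1)}

ANCESTRAL_COUNT = 60  # hypothetical placeholder for illustration only

for species, (gains, losses) in events.items():
    count = descendant_count(ANCESTRAL_COUNT, gains, losses)
    turnover = gains + losses
    print(f"{species}: net change {gains - losses:+d}, turnover {turnover}, count {count}")
```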

Discussion

We have seen that the repertoire of OR genes in Drosophila species has been nearly the same for the entire period of Drosophila evolution. Yet, we identified frequent gains and losses of OR genes in the evolutionary process, indicating that the birth-and-death process (20, 21) has operated during the evolution of Drosophila OR genes. Unequal crossing over appears to be the major mechanism for increasing gene number, because closely related genes in the phylogenetic tree were observed in tandem arrays in the genome. Although gene retroposition can increase the number of genes in a genome, this factor appears to have played a negligible role in the present case, because only two of 711 functional OR genes were intronless. Intronless genes could occur by retroposition, but they can also be generated by deletion of introns. Chromosomal rearrangements have contributed to the establishment of genome-wide distribution of OR genes, because the gene order of the orthologous OR genes is substantially different between distantly related species. These aspects of evolutionary changes of OR genes are essentially the same for both Drosophila and mammalian species (22).

However, there are two major differences in the olfactory system between mammals and Drosophila. One is the number of glomeruli in the olfactory bulb of mammals and in the antennal lobe of Drosophila relative to the number of OR genes. In Drosophila, each ORN expresses only one or two OR genes (in addition to the co-expression of gene OR83b in most neurons), and the olfactory information from ORNs, which express the same OR genes, is transmitted to a single glomerulus (23). The number of glomeruli is therefore similar to the number of different ORs. For example, melanogaster has ≈60 ORs (4) and ≈50 glomeruli (24), and the honey bee has ≈160 ORs (7) and ≈160 glomeruli (25). In mammals, each ORN also expresses only one OR gene, but the number of glomeruli is considerably greater than the number of ORs. For example, the mouse has ≈1,000 ORs (22) and 1,800 glomeruli (26), and the human has ≈400 ORs (27) and ≈8,000 glomeruli (28). These observations suggest that the olfactory information received by ORs can be transmitted to glomeruli in a more flexible way in mammals than in Drosophila.
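The contrast can be made concrete by computing the number of glomeruli per OR from the approximate figures quoted above. The short sketch below only restates those numbers; the dictionary layout and variable names are illustrative.

```python
# (approximate number of ORs, approximate number of glomeruli), as quoted in the text.
repertoires = {
    "melanogaster": (60, 50),
    "honey bee": (160, 160),
    "mouse": (1000, 1800),
    "human": (400, 8000),
}

ratios = {name: glomeruli / ors for name, (ors, glomeruli) in repertoires.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} glomeruli per OR")
```

The insect ratios are close to 1, whereas the mammalian ratios are well above 1, which is the pattern the paragraph above describes.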

The other difference is the expression pattern of OR genes. In Drosophila, a specific OR gene tends to be expressed deterministically in a specific ORN, which produces a precise expression pattern for a given OR gene (29). Therefore, if an OR gene is duplicated or lost from the genome, the gene expression pattern may be disturbed. In mammals, however, one of the clustered OR genes in the genome is stochastically chosen to be expressed in each ORN (30). Therefore, the expression pattern of OR genes appears to be considerably different among different individuals, and consequently the number of OR genes may change relatively easily in the evolutionary process. Indeed, at least 26 OR gene loci are known to be polymorphic between functional and nonfunctional alleles among human individuals (31).

Of course, the most important factor for determining the number of OR genes in mammals would be environmental conditions. Mammalian species inhabit a wide range of environments, including various temperature zones and diverse habitats (terrestrial, aquatic, and aerial living). It is therefore possible that the varying degree of olfaction required for living in different environments has caused the evolutionary change of OR genes and that this change has occurred because of the flexible system of olfaction in mammals. Drosophila species also inhabit highly variable environments (32), but the number of OR genes may not have changed greatly because of the rigid expression pattern and signal pathway mentioned above.

It should be noted that mammalian species generally have a large number of OR pseudogenes. It is also known that removal of a substantial proportion of the olfactory bulb in laboratory rats does not seriously affect their survival (33). Therefore, a considerable portion of the change in the number of OR genes could be more or less neutral.

Materials and Methods

Identification of OR Genes in the Draft Genome Sequences.

The genome sequences of melanogaster (release 4.3) and pseudoobscura (release 2.0) were downloaded from FlyBase (34). Other genome sequences were obtained from the AAA database (http://rana.lbl.gov/drosophila/): simulans (dsim_washu_2jun05_mosaic.tar.gz), sechellia (dsec_broad_28oct05.tar.gz), yakuba (dyak_washu_13dec05.tar.gz), erecta (dere_freeze1.tar.gz), ananassae (dana_freeze1.tar.gz), persimilis (dper_broad_28oct05.tar.gz), willistoni (dwil_caf1.tar.gz), virilis (dvir_freeze1.tar.gz), mojavensis (dmoj_freeze1.tar.gz), and grimshawi (dgri_freeze1.tar.gz).

To identify the OR genes, we performed a two-round TBlastN (19) search with E-value ≤ 10⁻¹⁰ against each genome sequence. In the first round, amino acid sequences of 61 OR genes annotated in the melanogaster genome were used as queries. After collecting hit sequences, we manually annotated each hit sequence, because Drosophila OR genes have introns (4) and it was not easy to determine the ORF by using computer programs. In the second round, the procedures were repeated by using the functional OR genes identified in the first round to find additional OR genes. To extract only unique genes, we compared the nucleotide sequences of all functional OR genes with one another and eliminated any overlapping genes. The flowchart for the detailed procedure is shown in SI Fig. 8. Note that the alternative transcripts from the same locus were considered as different genes. There were two such loci in the melanogaster genome. Coding sequences of functional OR genes and the genomic locations of all OR genes are available from SI Data Set 1 and SI Table 3, respectively.
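The hit-collection step of one search round might look like the following minimal sketch. It assumes the TBlastN results were saved in the 12-column tabular format produced by BLAST (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score); the study does not state which output format was used. Merging overlapping hits into candidate regions is an assumed convenience before manual annotation, it ignores strand for simplicity, and the function names are hypothetical.

```python
E_VALUE_CUTOFF = 1e-10  # threshold stated in the text

def filter_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Keep hits whose E-value is at or below the cutoff.
    Returns tuples of (subject, start, end, query, evalue) with start <= end."""
    hits = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= cutoff:
            # TBlastN reports reverse-strand hits with start > end; normalize.
            start, end = sorted((int(fields[8]), int(fields[9])))
            hits.append((fields[1], start, end, fields[0], evalue))
    return hits

def merge_overlapping(hits):
    """Merge hits that overlap on the same subject sequence into candidate regions."""
    regions = []
    for subject, start, end, _query, _evalue in sorted(hits):
        if regions and regions[-1][0] == subject and start <= regions[-1][2]:
            regions[-1][2] = max(regions[-1][2], end)
        else:
            regions.append([subject, start, end])
    return [tuple(region) for region in regions]

# Toy input: three made-up hits, one of which fails the E-value cutoff.
example = [
    "Or1\tscaf1\t50.0\t100\t50\t0\t1\t100\t1000\t1300\t1e-20\t90",
    "Or2\tscaf1\t45.0\t100\t55\t0\t1\t100\t1500\t1200\t1e-15\t80",
    "Or3\tscaf2\t30.0\t50\t35\t0\t1\t50\t10\t160\t1e-3\t30",
]
print(merge_overlapping(filter_hits(example)))
```

In the second round, the functional genes annotated from these regions would be used as new queries and the same filtering repeated.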

Estimation of the Numbers of OR Genes in Ancestral Species.

The numbers of genes in ancestral species and gains and losses of OR genes in evolution were estimated by the MR (17) and the GO methods. The latter method is expected to give more reliable results when the bootstrap support of gene trees is low, although the adjacent genes may not always be identifiable because of gene rearrangements in the past. The detailed procedure of this method is described in Fig. 6. The flowchart for the identification of adjacent genes is shown in SI Fig. 9.

Fig. 6.
Example illustrating the GO method for estimating the numbers of genes in ancestral species and gains and losses of genes. (a) Phylogenetic tree for three species α, β, and γ. We assume that α, β, and γ ...

Acknowledgments

We thank Saby Das, Zhenguo Lin, Jongmin Nam, Nikos Nikolaidis, Alex Rooney, Shigeru Saito, Claire T. Saito, Shozo Yokoyama, and Jianzhi Zhang for valuable comments on earlier versions of the manuscript. Steve Schaeffer provided us with information on the chromosome assembly of pseudoobscura. Jongmin Nam and Yoshihito Niimura gave us the scripts for the MR method. This work was supported by National Institutes of Health Grant GM020293 (to M. Nei).

Abbreviations

GO, gene order
MR, modified reconciled-tree
OR, olfactory receptor
ORN, olfactory receptor neuron

Footnotes

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/cgi/content/full/0702133104/DC1.

References

1. Niimura Y, Nei M. J Hum Genet. 2006;51:505–517.
2. Aloni R, Olender T, Lancet D. Genome Biol. 2006;7:R88.
3. Bargmann CI. Nature. 2006;444:295–301.
4. Robertson HM, Warr CG, Carlson JR. Proc Natl Acad Sci USA. 2003;100:14537–14542.
5. Niimura Y, Nei M. Proc Natl Acad Sci USA. 2005;102:6039–6044.
6. Hill CA, Fox AN, Pitts RJ, Kent LB, Tan PL, Chrystal MA, Cravchik A, Collins FH, Robertson HM, Zwiebel LJ. Science. 2002;298:176–178.
7. Robertson HM, Wanner KW. Genome Res. 2006;16:1395–1403.
8. Tamura K, Subramanian S, Kumar S. Mol Biol Evol. 2004;21:36–44.
9. Kumar S, Hedges SB. Nature. 1998;392:917–920.
10. Larsson MC, Domingos AI, Jones WD, Chiappe ME, Amrein H, Vosshall LB. Neuron. 2004;43:703–714.
11. Benton R, Sachse S, Michnick SW, Vosshall LB. PLoS Biol. 2006;4:e20.
12. Krieger J, Klink O, Mohl C, Raming K, Breer H. J Comp Physiol A. 2003;189:519–526.
13. Richards S, Liu Y, Bettencourt BR, Hradecky P, Letovsky S, Nielsen R, Thornton K, Hubisz MJ, Chen R, Meisel RP, et al. Genome Res. 2005;15:1–18.
14. Saitou N, Nei M. Mol Biol Evol. 1987;4:406–425.
15. Kumar S, Tamura K, Nei M. Brief Bioinform. 2004;5:150–163.
16. Thompson JD, Higgins DG, Gibson TJ. Nucleic Acids Res. 1994;22:4673–4680.
17. Nam J, Nei M. Mol Biol Evol. 2005;22:2386–2394.
18. Nei M, Kumar S. Molecular Evolution and Phylogenetics. New York: Oxford Univ Press; 2000. pp. 165–186.
19. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zheng Z, Miller W, Lipman DJ. Nucleic Acids Res. 1997;25:3389–3402.
20. Ota T, Nei M. Mol Biol Evol. 1994;11:469–482.
21. Nei M, Rooney AP. Annu Rev Genet. 2005;39:121–152.
22. Niimura Y, Nei M. Gene. 2005;346:13–21.
23. Ache BW, Young JM. Neuron. 2005;48:417–430.
24. Fishilevich E, Vosshall LB. Curr Biol. 2005;15:1548–1553.
25. Galizia CG, Menzel R. J Insect Physiol. 2001;47:115–130.
26. Royet JP, Souchier C, Jourdan F, Ploye H. J Comp Neurol. 1988;270:559–568.
27. Niimura Y, Nei M. Proc Natl Acad Sci USA. 2003;100:12235–12240.
28. Meisami E, Mikhail L, Baim D, Bhatnagar KP. Ann NY Acad Sci. 1998;855:708–715.
29. Ray A, van der Goes van Naters W, Shiraiwa T, Carlson JR. Neuron. 2007;53:353–369.
30. Serizawa S, Miyamichi K, Nakatani H, Suzuki M, Saito M, Yoshihara Y, Sakano H. Science. 2003;302:2088–2094.
31. Menashe I, Man O, Lancet D, Gilad Y. Nat Genet. 2003;34:143–144.
32. Powell JR. Progress and Prospects in Evolutionary Biology. New York: Oxford Univ Press; 1997. pp. 143–184.
33. Bisulco S, Slotnick B. Chem Senses. 2003;28:361–370.
34. Crosby MA, Goodman JL, Strelets VB, Zhang P, Gelbart WM. The FlyBase Consortium. Nucleic Acids Res. 2007;35:D486–D491.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences