Copyright © 2009 by the Genetics Society of America

Does Gene Translocation Accelerate the Evolution of Laterally Transferred Genes?

*Department of Biology, McMaster University, Hamilton, Ontario L8S 4K1, Canada and †Department of Biology, Indiana University, Bloomington, Indiana 47405

1Corresponding author: Department of Biology, McMaster University, Hamilton, Ontario L8S 4K1, Canada. E-mail: golding/at/mcmaster.ca

Communicating editor: N. S. Wingreen

Received April 20, 2009; Accepted May 18, 2009.

Abstract

Lateral gene transfer (LGT) and gene rearrangement are essential for shaping bacterial genomes during evolution. Separate attention has been focused on understanding the process of lateral gene transfer and the process of gene translocation. However, little is known about how gene translocation affects laterally transferred genes. Here we have examined gene translocations and lateral gene transfers in closely related genome pairs. The results reveal that translocated genes undergo elevated rates of evolution and gene translocation tends to take place preferentially in recently acquired genes. Translocated genes have a high probability to be truncated, suggesting that translocation followed by truncation/deletion might play an important role in the fast turnover of laterally transferred genes. Furthermore, more recently acquired genes have a higher proportion of genes on the leading strand, suggesting a strong strand bias of lateral gene transfer.

Gene insertions and deletions, together with gene translocations play important roles in bacterial genome evolution (Garcia-Vallvé et al. 2000; Ochman and Jones 2000; Tillier and Collins 2000a; Fraser-Liggett 2005). Gene insertions and deletions, as the essential driving forces in influencing gene content (Kunin and Ouzounis 2003), have received a great deal of attention.
Various methods have been employed to study gene insertions and deletions previously; for instance, there are studies of population dynamics (Nielsen and Townsend 2004), such as a birth-and-death model of evolution (Berg and Kurland 2002; Novozhilov et al. 2005), phylogeny-dependent studies including parsimony methods (Daubin et al. 2003a,b; Mirkin et al. 2003; Hao and Golding 2004), and maximum-likelihood methods (Hao and Golding 2006b, 2008b). It has been shown that recently laterally transferred genes have high evolutionary rates and high rates of gene turnover (Daubin et al. 2003b; Hao and Golding 2004, 2006b). Gene rearrangement has also been commonly studied as another important driving force that shapes bacterial genomes (for a review, see Rocha 2004). Gene order changes in genomes are history dependent; for instance, fewer gene rearrangements are expected among more closely related species. Gene order within genomes has therefore been used to reconstruct phylogeny (Sankoff et al. 2000; Tamames 2001; Rogozin et al. 2004; Belda et al. 2005). Previous studies have focused mainly on lateral gene transfer (LGT) and gene rearrangement individually, but little is known about any association between laterally transferred genes and gene rearrangements. The study of gene order of laterally acquired genes might shed some light on the understanding of the LGT process. In this study, we have examined gene translocations and lateral gene transfers in closely related genome pairs. It is shown that the proportion of translocated genes among recently acquired genes is always high, while the proportion of translocated genes is always low in ancient genes, suggesting that gene translocation tends to take place in recently transferred genes. The results also reveal that translocated genes have elevated rates of evolution compared with positionally conserved genes and gene truncation is more prevalent in translocated genes. 
These findings suggest that gene translocation might accelerate the gene turnover of recently transferred genes and/or that genes likely to undergo translocation are those genes more likely to be laterally transferred and dispensable for the genome. Furthermore, the proportion of recently acquired genes is higher on the leading strand, suggesting that laterally transferred genes are biased toward being on the leading strand. After lateral transfer, some genes could be translocated to the lagging strand and some translocated genes are likely to be eliminated during evolution.

METHODS

The Bacillaceae group was chosen in this study due to the abundance of completely sequenced congeneric species. Complete genome sequences (Table 1 and Figure 1
Genes were further categorized into group-specific genes and nonspecific genes. For instance, Bc group-specific (see Figure 1

No large-scale genome rearrangement was observed in the four genome pairs (Figure 2

Regions associated with insertion sequences (ISs) and prophages were identified. ISs were identified by the IScan program (Wagner et al. 2007), using query sequences of 20 reference sequences from Wagner et al. (2007) and 82 additional IS sequences that have been discovered in Bacillus species (names are given in Table S1). The sequences of all 102 ISs were obtained from the ISfinder website (Siguier et al. 2006b). Genes present in the IS regions were deemed to be IS associated. Prophages in each genome were identified by the Prophinder web server (Lima-Mendez et al. 2008). Genes present in the prophage regions were deemed to be prophage associated.

The origins and termini of replication for all genomes were identified by GC skew as done in previous studies (Lobry 1996; Morton and Morton 2007). GC skew was computed from the function (G − C)/(G + C) on 1000-bp windows across each genome. Gene location together with its orientation was used to determine whether the gene is on the leading strand or not. The number of genes on the leading strand was counted (see Table S2). The proportion of genes on the leading strand was further analyzed at different phylogenetic depths in both the Bc group and the Bp group. In the Bc group, group-specific genes in the Ba1 genome were examined and classified according to their depth in the phylogeny.
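The GC skew calculation and strand assignment described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and the rule used to place the origin and terminus (the minimum and maximum of the cumulative skew) is an assumed, commonly used convention that the text does not specify. A gene is treated as being on the leading strand when it is transcribed in the same direction as the replication fork that copies its replichore.

```python
from typing import List, Tuple


def gc_skew_windows(seq: str, window: int = 1000) -> List[float]:
    """Return (G - C)/(G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Windows with no G or C are given a skew of 0 (assumption).
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


def origin_and_terminus(skews: List[float], window: int = 1000) -> Tuple[int, int]:
    """Estimate origin and terminus coordinates from cumulative skew.

    Assumed convention: the cumulative skew reaches its minimum at the
    origin and its maximum at the terminus.
    """
    total = 0.0
    cumulative = []
    for s in skews:
        total += s
        cumulative.append(total)
    ori_idx = min(range(len(cumulative)), key=cumulative.__getitem__)
    ter_idx = max(range(len(cumulative)), key=cumulative.__getitem__)
    # The extreme is reached at the end of the window, hence idx + 1.
    return (ori_idx + 1) * window, (ter_idx + 1) * window


def is_on_leading_strand(gene_pos: int, strand: str, origin: int,
                         terminus: int, genome_len: int) -> bool:
    """Decide leading vs. lagging strand on a circular chromosome.

    strand is "+" or "-". Genes between origin and terminus (moving in
    the direction of increasing coordinates) are leading if on "+";
    genes in the other replichore are leading if on "-".
    """
    d_gene = (gene_pos - origin) % genome_len
    d_ter = (terminus - origin) % genome_len
    in_forward_replichore = d_gene < d_ter
    return (strand == "+") == in_forward_replichore
```

For a real genome, the sequence would be read from the genome file and each annotated gene passed through `is_on_leading_strand` with its coordinate and orientation; counting the `True` results gives the number of genes on the leading strand.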
In brief, genes present in Bc4 were categorized as n0, genes present in Bw but not present in Bc4 were categorized as n1, genes present in Bc3 but not present in Bw or Bc4 were categorized as n2, genes present in either Bc1 or Bc2 but not present in Bc4, Bw, or Bc3 were categorized as n3, genes present in Bt genomes but not present in Bc4, Bw, Bc3, Bc2, or Bc1 were categorized as n4, and genes present only in the Ba strains were categorized as n5.

Alignments of homologous sequences were constructed using the MUSCLE program (Edgar 2004). Three hundred twenty-five nonduplicated genes that are universally present in all Bacillaceae genomes were used for phylogeny reconstruction. A maximum-likelihood tree and a neighbor-joining tree were generated on concatenated sequences of the 325 genes (335,380 characters), using the PHYLIP package (Felsenstein 1989) version 3.67, and the rate variation parameter alpha was estimated using the PUZZLE program (Strimmer and von Haeseler 1996).

The ratio of nonsynonymous changes to synonymous changes (Ka/Ks ratio) was measured by the Yang and Nielsen (2000) method, using yn00 in the PAML package (Yang 2007) based on nucleotide sequence alignments that were created from the corresponding protein alignments. To obtain a more reliable measurement of Ka/Ks, we excluded protein pairs that have protein identity <50%, since in this case synonymous changes might be greatly saturated.

Statistical analyses were conducted using the R package (R Development Core Team 2008).

RESULTS

Molecular evolution of translocated genes: Evolutionary distance of different genes was examined separately in each genome pair (Figure 3, A–D
Translocation in recently acquired genes: The proportion of translocated genes was calculated and is shown in Figure 4
This trend holds true in genes acquired at different evolutionary depths. Group-specific genes were further divided and analyzed in two types (“A” and “B,” Figure 5
Truncation in translocated genes: If gene truncation, as an imperfect form of gene deletion, takes place constantly as does gene deletion, different numbers of truncated genes might reflect different levels of gene deletions (Hao and Golding 2006a). Figure 6
Dynamic strand bias: Among positionally conserved genes, group-specific genes have a lower proportion on the leading strand than nonspecific genes (Figure 7
The proportion of genes on the leading strand was further examined at different phylogenetic depths in the Bc group (Figure 8
DISCUSSION

Robustness: Inferring gene translocation relies heavily on the identification of orthologous pairs. Any single threshold for ortholog identification might be problematic. We therefore made use of a series of cutoff thresholds to detect orthologs. Different threshold values caused some variation of the number of orthologous pairs, such as a decrease in the numbers of orthologous pairs when using restrictive cutoffs and an increase in the numbers of orthologous pairs when using relaxed cutoffs. Importantly, the proportion of translocated genes in recently transferred genes is always higher than that in ancient genes when using different cutoff thresholds. The high frequency of gene translocation in recently acquired genes, therefore, is not likely an artifact of the methodology used in this study.

Gene duplication is very common during genome evolution and substitution rates are often accelerated following gene duplication (Zhang et al. 2003). After gene duplication, duplicates may be retained and undergo neofunctionalization or subfunctionalization (Lynch and Force 2000; Lynch et al. 2001). There is a possibility that some orthologs inferred in this study were involved in differential loss after duplication. It has been shown that differential loss and gene conversion might happen after ancient duplication (Lathe and Bork 2001), and gene duplication followed by differential loss can always be invoked as an alternative to lateral gene transfer and vice versa (Gogarten and Townsend 2005). Differential loss will result in a relatively high level of divergence at the sequence level. In this study, the high proportion of translocated genes in recently acquired genes holds true even when the cutoff threshold for ortholog identification is very restrictive (up to 80% of protein identity). This supports the robustness of the concept that translocation tends to take place in recently acquired genes.
It is possible that some orthologous pairs detected in this study might be due to gene replacement via LGT. First, a distantly related gene copy could be introduced into a different location of the genome (lineage) and the original copy in the genome is deleted during evolution. This is the case of xenologous gene displacement. Second, it is also possible that the distantly related gene copy could be introduced to the same location of a genome and replace the original copy. This is known as gene displacement in situ (Omelchenko et al. 2003). Third, it is possible that a distinct gene is introduced into one lineage and then laterally transferred to another lineage. The first scenario is similar to the case of differential loss after duplication. In Figure 5, B and C

The evolution of translocated genes: Besides the high frequency of gene translocation in recently transferred genes, this study reveals that translocated genes undergo faster rates of evolution compared with positionally conserved genes (Figure 3

Previous studies have suggested that many recently transferred genes tend to be deleted rapidly (Hao and Golding 2004, 2006b). Gene translocations tend to take place in recently transferred genes that tend to be deleted rapidly; as a consequence, gene translocation should be considered as a local phenomenon. Indeed, relatively high rates of gene rearrangements have been found in closely related Salmonella strains (Liu and Sanderson 1998; Liu et al. 2003; Kothapalli et al. 2005), whereas the genome structures between Escherichia coli and Salmonella remain highly similar (Krawiec and Riley 1990; Liu et al. 1993).
Furthermore, most truncated genes were found in translocated genes and the proportion of truncated genes is much higher in translocated genes than in positionally conserved genes (Figure 5

Compared with ancient genes, recently transferred genes were shown to be under relaxed functional constraints and translocated genes might be under more relaxed functional constraints (Ka/Ks ratios, Figure 3

The occurrence of gene translocation seems to be influenced by gene function. The distribution of COG classification was compared between translocated genes and positionally conserved genes (see Figure S4). A significant difference in distribution was observed in BlBp, BamBs, and BwBc4. Gene translocation is generally rare in genes involved in translation, ribosomal structure, and biogenesis (“J” class), while gene translocation is more common in genes involved in carbohydrate transport and metabolism (“G” class) and amino acid transport and metabolism (“E” class) and in genes not included in COG (“−” class in Figure S4). In other words, besides the elevated evolutionary rates, translocated genes have a biased distribution of functional classification. This finding is a snapshot of the evolutionary process with the presence of selection. Gene translocation has deleterious effects on genes, and translocation that occurred in ancient genes or functionally essential genes is likely strongly deleterious, while translocation that has occurred in recently acquired genes is likely less deleterious or might be adaptive. Adaptive translocations are likely to be retained and slightly deleterious translocations could be retained in a population for some period of time, while strongly deleterious translocations should be extremely rare. The fate of many translocated genes in recently acquired genes is to be eliminated during evolution. Therefore, gene translocation serves as a factor that speeds up the turnover of laterally transferred genes.
Genes distributed on the leading strand: Genes on the leading strand were examined but different pictures were obtained at different levels of comparison. A large-scale comparison shows that ancient genes are more likely on the leading strand than group-specific genes (Figure 7

In a fine-scale comparison it is found that more recently acquired genes have an even higher proportion of genes on the leading strand (Figure 8

It has been shown that genes evolve faster after shifting from one replicating strand to the other due to mutational biases (Tillier and Collins 2000b; Rocha and Danchin 2001). We have examined the translocated genes that shifted strand, but no significant difference in DNA distance was found between genes that shifted strand and those that did not shift strand (see Figure S5). The trend, though not significant, that translocated genes that shifted strand evolve faster than those that did not shift strand was observed in BlBp and BamBs. It is possible that the test lacks statistical power due to the small number of translocated genes. Importantly, translocated genes that did not shift strand have shown a significantly larger distance than positionally conserved genes. This suggests that the elevated rate of evolution in translocated genes is not mainly due to mutational bias after shifting strand.

Gene translocation mechanisms: Genome rearrangement can be the result of a number of specific molecular mechanisms (Arber 2003), initiated or aided by prophage, IS elements, and site-specific recombination. Prophages have been well documented to play an important role in large-scale genome rearrangements (Canchaya et al. 2004), and quite often prophages are associated with insertions of a number of novel sequences (Ivanova et al. 2003).
Translocated genes identified in this study tend to be spatially dispersed rather than clustered together (Figure 2

Mobile elements (IS elements) have been known to play an important role in extensive genome rearrangement, such as in Bordetella (Brinig et al. 2006). In this study, the results are robust even after excluding genes associated with ISs and prophages. However, the possibility that IS elements are involved in gene translocation cannot be ruled out since most of the IS elements in genomes are evolutionarily young and under fast rates of turnover (Siguier et al. 2006a; Wagner 2006a,b; Touchon and Rocha 2007). It has been shown that elements involved in gene transfer have undergone a decay process (Sirand-Pugnet et al. 2007). Similarly, it might be possible that IS elements involved in gene translocation in this study have been deleted during evolution. Site-specific recombination has also been reported to be involved in lateral gene transfer and deletion in bacterial genome evolution (Gillings et al. 2005; MacDonald et al. 2006). Furthermore, short palindromic sequences (Lewis et al. 1999; Tobes and Pareja 2006) or short signature sequences (Robins et al. 2005) have been suggested to serve as a source of recombination sites for gene movement. However, detection of recombination sites requires more experimental evidence.

Conclusion: We have uncovered significant associations between gene translocation and lateral gene transfer. Translocated genes have accelerated rates of evolution and gene translocation tends to be observed in recently acquired genes. Many translocated genes undergo gene truncation and will ultimately be deleted from the genome. Furthermore, there is a strong leading strand bias of lateral gene transfer and in the course of evolution the strand bias of the laterally transferred genes will be influenced by gene translocation and many other factors.
In conclusion, gene translocation plays an important role in shaping the evolution of laterally transferred genes.

Acknowledgments

The authors thank the reviewers for many useful suggestions. This work was supported by a Natural Sciences and Engineering Research Council of Canada grant to G.B.G.

Notes

Supporting information is available online at http://www.genetics.org/cgi/content/full/genetics.109.104216/DC1.

References