Copyright © 2000 GenomeBiology.com

Evaluating genome dynamics: the constraints on rearrangements within bacterial genomes

Department of Cell and Molecular Biology, Biomedical Center, Uppsala University, S-751 24 Uppsala, Sweden. E-mail: Diarmaid.Hughes@icm.uu.se

Abstract

Inversions and translocations distinguish the genomes of closely related bacterial species, but most of these rearrangements preserve the relationship between the rearranged fragments and the axis of chromosome replication. Within species, such rearrangements are found less frequently, except in the case of clinical isolates of human pathogens, where rearrangements are very frequent.

The evolution of biological diversity through genetic change raises questions about how much variation one can expect among closely related genomes. Some answers are emerging from the application of two different technologies to comparative genomics of bacteria. One, complete genome sequencing, is providing detailed blueprints of one or a few examples of each genome of interest. The second, physical mapping by pulsed-field gel electrophoresis (PFGE), is providing 'skeleton' views of large numbers of closely related genomes. Together, these technologies are providing insights into the dynamics of genome plasticity that are both detailed and broad.

At this early stage in comparative genomics, the main generalizations emerging about rearrangements in bacterial genome organization are as follows. First, large chromosome inversions and translocations are common, even between closely related species. Second, chromosome inversions are usually symmetric around the axis of DNA replication.
Third, chromosomal rearrangements are less common within species, but a dramatic increase in the frequency of inversions and translocations seems to be associated with the ability of bacteria to infect eukaryotic hosts, possibly reflecting a bacterial response to the challenges posed by the immune system.

The underlying causes of rearrangement

The bacterial RecA protein is required for the repair of damage to chromosomes, in particular chromosome breaks, and it acts by using a duplicate copy of the damaged sequence as a template for repair. The template is normally the homologous sequence on a sister chromosome, but when sequences are present in multiple copies within a genome, RecA can promote recombination between paralogs. Such recombination events can result in rearrangements in the order of genes on the chromosome [1]. Thus, recombination between repeated sequences that are in the same orientation as each other (direct repeats) can result in tandem duplication of the region bounded by the repeat sequences (Figure 1a).
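The duplication outcome described above can be illustrated with a toy model. The following is a minimal sketch, not a model taken from the article: it assumes the chromosome is a simple list of markers, that the hypothetical marker "R" stands for a direct repeat, and that an unequal exchange between the two copies of R on sister chromosomes joins one sister (up to the second copy) to the other sister (from just after the first copy). The function name `tandem_duplication` is invented for illustration.

```python
def tandem_duplication(chrom, repeat):
    """Toy unequal exchange between two direct copies of `repeat`.

    `chrom` is a list of marker names; both sister chromosomes are
    assumed to be identical copies of it (an assumption of this sketch).
    """
    sites = [i for i, marker in enumerate(chrom) if marker == repeat]
    if len(sites) < 2:
        raise ValueError("need at least two direct copies of the repeat")
    first, second = sites[0], sites[1]
    # Follow one sister up to and including the second repeat copy, then
    # continue on the other sister from just after the first repeat copy.
    return chrom[: second + 1] + chrom[first + 1 :]


chrom = ["a", "R", "b", "c", "R", "d"]
duplicated = tandem_duplication(chrom, "R")
print(duplicated)
```

In this sketch the markers between the two repeat copies (b and c) end up present twice in tandem, with a third copy of the repeat at the junction. The model ignores sequence orientation, the reciprocal product of the exchange, and any fitness effect.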
To these phenomena one can add the acquisition of new DNA by horizontal transfer from another genome, which in addition to introducing new genetic information may upset the stability of a genome and trigger other compensating rearrangements.

The sequences involved in RecA-mediated rearrangements are usually long repeats, such as rRNA operons, transposons and IS (insertion sequence) elements. Sequences as short as 10-100 nucleotides can also be substrates for homologous recombination, but this is usually limited to sequences in close proximity to one another [1]. Another outcome of homologous recombination between repeated sequences is gene conversion - homogenization of the sequences within a gene family, to prevent the divergence of repeated sequences - which maintains the sequence similarity required for RecA-mediated recombination over evolutionary time scales [3]. RecA activity is thus a double-edged sword: it is needed to maintain the chromosome integrity required for completing replication, but it also promotes rearrangements within the genome. An interesting exception is that Buchnera lacks a recA gene but compensates by having over 100 copies of its entire genome per cell [4].

Historical background and new techniques

Microbial genome analysis has its origins in the intensive laboratory analysis of just a few bacterial strains, in particular Escherichia coli K-12 and Salmonella typhimurium LT2 (proper name: Salmonella enterica serovar Typhimurium). These species had a common ancestor approximately 140 million years ago [5]. Their genetic maps are almost identical in organization and their major phenotypic differences can be explained by the horizontal acquisition of DNA segments into one or other of the species. With the exception of a large inversion around the terminus of replication, the two genomes seem to be stable in organization and to be diverging by the accumulation of point mutations.
A high frequency of recombination in the terminus region is related to the mechanism of chromosome separation after replication, and different inversions around the terminus are found in other closely related bacteria [1]. The general conclusion drawn from these early comparisons was that bacterial genomes were stable in organization. Within the past few years, however, the application of new technologies to the analysis of the genomes of a wide variety of bacterial species has challenged this view.

The two most important experimental techniques for comparative analysis of genome organization have been whole-genome sequencing and physical mapping of genome organization. Whole-genome sequencing provides complete information on a genome, facilitating many types of analysis including comparative analysis of genome organization. The TIGR Microbial Database [6] currently lists dozens of completed bacterial genome sequences and over one hundred in progress. In most cases a few genomes from each species are being sequenced. The availability of a genome sequence, while not essential, also facilitates the physical analysis of that genome and of related genomes. Physical mapping, most often by PFGE in conjunction with restriction digestion and probing for specific sequences, is used to generate skeleton structures of genome organization and is suited to screening large numbers of strains. The most informative comparisons, in terms of evaluating genome dynamics, are those made between close relatives - either sister species or isolates within a species.

Genome rearrangements at three levels of comparison

Comparisons between genome arrangements in related bacteria have been made at several levels: interspecific, intraspecific (serovars, or immunologically detectable variants, and biovars, or biochemically detectable variants within a single species), and within presumed clonal populations.
The phylogenetic relationships between many of the bacterial species referred to in the following discussion, based on an analysis of their 16S rDNA, are shown in Figure 2.
Interspecific comparisons

Complete genome sequence data for three pairs of related species - Vibrio cholerae - Escherichia coli; Streptococcus pneumoniae - Streptococcus pyogenes; and Mycobacterium tuberculosis - Mycobacterium leprae - have been used to compare the positions of conserved sequences within each genome. The comparisons reveal in each case a distinct X-shaped pattern in scatterplots, suggestive of large chromosomal inversions that reverse the genomic sequence symmetrically around the axis of replication [7]. Similarly, Chlamydia trachomatis and Chlamydia pneumoniae differ by multiple large inversions, apparently oriented around the axis of replication. In addition, the region around the terminus of replication in these two species is subject to a high rate of reorganization [8]. Table 1 summarizes all the interspecific rearrangements referred to in this section.
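Why a symmetric inversion gives an X-shaped scatterplot can be seen in a small simulation. The sketch below is illustrative only and is not the analysis used in [7]: it assumes a circular genome of n equally spaced genes with the origin of replication at index 0, and the function names `invert_around_origin` and `origin_distance` are hypothetical. Gene orientation, which an inversion also flips, is not modeled.

```python
def invert_around_origin(genome, k):
    """Invert the segment reaching k genes to each side of the origin (index 0)."""
    n = len(genome)
    if not 0 <= k < n / 2:
        raise ValueError("k must be non-negative and smaller than half the genome")
    out = list(genome)
    for offset in range(1, k + 1):
        # The gene at +offset from the origin swaps places with the gene at -offset.
        out[offset], out[n - offset] = genome[n - offset], genome[offset]
    return out


def origin_distance(index, n):
    """Shortest distance, in genes, from `index` to the origin on a circle of n genes."""
    return min(index, n - index)


n = 10
ancestor = list(range(n))  # gene label equals its ancestral position
derived = invert_around_origin(ancestor, 2)

# (ancestral position, derived position) pairs, as plotted in a dot plot
points = [(gene, derived.index(gene)) for gene in ancestor]
print(points)
```

Genes outside the inverted segment stay on the diagonal y = x, while genes inside it move to the anti-diagonal y = n - x, and together the two diagonals form the X. Every gene keeps its distance from the origin, which is the property the text describes as preserving the relationship to the axis of replication.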
As discussed earlier, another important type of genome rearrangement is translocation. Comparative genomics using whole-genome sequences shows that the genomes of the close relatives Mycoplasma genitalium and Mycoplasma pneumoniae [9] can be subdivided into six segments, which are ordered differently in the two species. Within each segment the order of genes is conserved, and the increased size of the M. pneumoniae chromosome is due mainly to gene duplications. In both species there is strong uniformity of direction of transcription, and this direction is not changed by the translocations [9].

Other interspecific genome comparisons reveal that inversions, translocations and deletions typically distinguish closely related species. Thus, Neisseria meningitidis and Neisseria gonorrhoeae differ by multiple translocations and/or inversions of blocks of genetic markers within a 500 kilobase region [10]. Mycobacterium tuberculosis differs from the attenuated vaccine strain Mycobacterium bovis BCG Pasteur, which carries two tandem duplications in its chromosome, of 29 kb and 36 kb respectively [11]. In addition, clinical isolates of Mycobacterium tuberculosis differ from each other in having deletions of up to several kilobases, probably linked to homologous recombination between multiple copies of IS6110 in the genome [12]. Finally, the chromosomal locations of 30 putative orthologs between Bacillus subtilis and Bacillus cereus are arranged in an apparently random manner [13], similar to what is seen when comparing the genomes of very distantly related organisms. Also, within B. subtilis several variants created by X-ray mutagenesis have large inversions (1,700-1,900 kb) around the axis of replication, a 100 kb translocation, and smaller duplications and deletions [14]. In conclusion, in every case where interspecific comparisons have been made, clear evidence of large chromosomal rearrangements has been found.
Intraspecific comparisons

Comparisons within species, including comparisons between different serovars and biovars of the same species, reveal that rearrangements are less common within than between species (summarized in Table 2). Using PFGE after restriction digestion targeted to cut the genome within the conserved and repetitive rRNA operons, 'rrn genomic skeletons' were established for isolates of many serovars of Salmonella enterica [15]. The order of fragments, which is ABCDEFG in S. typhimurium and E. coli K-12, is conserved in most Salmonella serovars, most of which are host-generalists. In S. typhi and S. paratyphi C (which have human hosts), however, and in S. pullorum and S. gallinarum (which have fowl hosts), these fragments are rearranged. Thus, of 127 natural isolates of S. typhi examined, 21 different genome orders were found, all postulated to be due to inversions and translocations with end-points in rrn operons [15]. A feature of these rearrangements is that the distance from the origin of chromosome replication is well conserved, as is the direction of transcription relative to the direction of chromosome replication.

A similar PFGE analysis has been made of Shigella, the human pathogenic form of E. coli [16]. This showed that of the four traditional Shigella subgroups (often referred to as species), S. boydii and S. sonnei had chromosomal arrangements identical to E. coli K-12, while S. dysenteriae and S. flexneri had different large rearrangements [17]. Interestingly, the Shiga toxin genes on the S. dysenteriae chromosome are bracketed by IS600 sequences, and increased toxin production is caused by tandem amplification via recombination between the IS600 elements [18].
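Counting distinct genome orders among isolates, as in the rrn skeleton survey described above, needs some rule for deciding when two fragment orders are the same. The following is a minimal sketch under stated assumptions, not the procedure used in [15]: it treats a fragment order as a circular string and regards two orders as identical if they differ only by the starting fragment or by the direction of reading. The function name `canonical_order` and the isolate names and orders are invented for illustration.

```python
def canonical_order(order):
    """Return a canonical form of a circular fragment order.

    The canonical form is the lexicographically smallest string among all
    rotations of `order` and of its reverse (both are assumptions of this
    sketch about what counts as the same genome order).
    """
    candidates = []
    for seq in (order, order[::-1]):
        for i in range(len(seq)):
            candidates.append(seq[i:] + seq[:i])
    return min(candidates)


# Hypothetical isolates and fragment orders, for illustration only
isolates = {
    "isolate_1": "ABCDEFG",
    "isolate_2": "CDEFGAB",  # same circle, different starting fragment
    "isolate_3": "GFEDCBA",  # same circle, read in the other direction
    "isolate_4": "AGCEFDB",  # a rearranged order
}

distinct_orders = {canonical_order(order) for order in isolates.values()}
print(sorted(distinct_orders))
```

Treating the reverse reading as equivalent discards information about fragment orientation relative to the origin, so a real analysis that tracks the direction of replication would need a stricter comparison.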
The genome sequence of the strain Lactococcus lactis IL1403 has been determined [19] and physical genome maps have been created for several Lactococcus lactis strains (subspecies lactis and subspecies cremoris strains) and strains of the related Streptococcus thermophilus [20,21]. Within each group, strains were similar with the exception of the L. lactis subspecies cremoris, where different strains were polymorphic, in part due to an inversion of half the chromosome. This inversion is due to a homologous recombination event between two defective copies of IS905 and does not alter the symmetry of the replication origin and terminus, oriC and terC [22]. Comparison of the physical and genetic maps of strains representing two serovars of Leptospira interrogans suggests that at least two inversions in the large replicon distinguish their genomes [23].

Brucella is a Gram-negative bacterium pathogenic for animals. The genus is divided into six species and numerous biovars. Physical maps of the genomes of reference strains in each species show a high conservation of restriction sites and the presence of two chromosomes. The exception is a large inversion in the small chromosome of B. abortus. But physical mapping of the genomes of the four biovars within one of these species, Brucella suis, reveals differences in both chromosome number and size. These differences can be explained by rearrangements due to homologous recombination between the three rrn loci in the genome [24]. It is proposed that the ancestor of Brucella had a single chromosome and that recombination, probably between rrn genes, led to the creation of two chromosomes. Multiple chromosomes are also found in other bacteria, mostly within the Proteobacteria, including Rhodobacter sphaeroides, Leptospira interrogans, Rhizobium spp., Burkholderia cepacia, Agrobacterium spp., and Ochrobactrum anthropi [24].
There is no evidence that the presence of multiple chromosomes in these genomes is related to a common phylogeny, since it is not always shared by all members of a genus (for example, the Rhodobacter genus) or even by all strains of the same species (for example, Brucella suis).

In Streptomyces, the most common rearrangements found are sequence and length variations in the terminal inverted repeats (TIRs) at the ends of the linear replicons. This variation is due to homologous recombination between repetitive sequences and results in amplifications, deletions, a high frequency of spontaneous mutations, and the transfer of sequences between different chromosome arms [25]. The exchange of telomeric regions has also been described for linear replicons in the unrelated bacterium Borrelia burgdorferi [26]. Instability may not be a particular feature of linearity, however, because when Streptomyces chromosomes are circularized they remain unstable in these regions [27]. The lack of housekeeping genes in a large region at each end of the Streptomyces linear chromosomes probably permits the detection of deletions at a high frequency because they do not affect cell viability under laboratory conditions.

Clinical and clonal populations

The closest relatives that have been subjected to genome organization analysis are presumed clonal derivatives associated with clinical infections (summarized in Table 3). Bordetella pertussis strains from a whooping cough outbreak in Canada were subjected to restriction-enzyme genome mapping. Among 70 isolates, presumed to be descended from the same starting clone, 14 different types were found (distinguished by restriction fragment length polymorphism, RFLP). Representatives of these 14 types were further analyzed and shown to have 11 different genome orders, due in each case to large chromosomal inversions [28]. A similar analysis among different laboratory strains also revealed frequent large inversions [29]. B. pertussis carries about 100 copies of the 1 kilobase insertion sequence IS481, providing many targets for homologous recombination. The positions of the origin and terminus of replication are unknown in Bordetella, but because almost all of the inversions are around the same axis, it seems likely that they are in fact symmetric about the origin-terminus axis, as has been observed in other species.

The genomes of clinical isolates of Pseudomonas aeruginosa from cystic fibrosis patients were analyzed by PFGE, revealing that 50% of them have large chromosomal inversions [30]. Most of these inversions are approximately symmetric about the replication axis. It is not known if these rearrangements confer any advantage on the strains in colonizing the lung habitat.
PFGE analysis of the genomes of 30 Neisseria meningitidis epidemic strains belonging to the ET-5 complex, isolated from various parts of the world over a period of 20 years, revealed 10 different types, including some with genome order rearrangements [31]. A striking feature of N. meningitidis revealed by complete genome sequencing [32,33] is the presence of hundreds of repetitive elements that could contribute to genome rearrangements important for evasion of the host immune system. Finally, within a defined lineage of N. gonorrhoeae strains with pilin variations, an inversion of more than one third of the chromosome was found [34]. The end points of the inversion are within a multicopy gene family involved in pilin production.

Constraints on the frequency of rearrangement

Rearrangement could be constrained by the number and size of repetitive sequences in a genome. Most bacterial genomes, however, contain multiple copies of some highly expressed genes (such as rrn genes) or have multiple copies of insertion sequences. While the relative positions of these sequences could influence which rearrangements occur most frequently, recombination between short repeats and the mobility of IS elements should increase the variety of rearrangements.

A second potential constraint is the rate of recombination, but experimental data from S. typhimurium show that rates of recombination between long repeat sequences are at least as high as nucleotide substitution rates [1]. For example, the rate of inversion between the tuf genes, approximately 10^-8 per cell per generation, is equal to the rate of nucleotide substitution within the same genes [3,35]. If one can generalize from this, then recombination can rearrange genome organization as fast as genomes diverge by nucleotide substitution.

A third possible constraint is that the fitness of the rearranged genomes is in general reduced.
Indeed, inversions that reverse the orientation of sequences on either side of the replication terminus of S. typhimurium and E. coli usually occur very infrequently, or make the bacteria very unfit [1]. Inversions that do not alter the replication axis (that is, inversions that do not change the distance of genes from the origin of replication, or their orientation relative to the direction of replication) may be the least disruptive in terms of fitness, but this has never been rigorously tested. In conclusion, fitness costs may be an important constraint on the fixation of genome rearrangements in bacteria, but there are very few relevant measurements.

Comparing these theoretical constraints on genome rearrangements with the data from natural bacterial isolates, several patterns emerge. One is that inversions and translocations are very common between even closely related species (compare Figure 2).

In conclusion, the main constraint on genome rearrangements on an evolutionary time scale may be bacterial fitness, in particular associated with the global regulation of gene expression patterns and the orderly and efficient replication of the genome. In particular complex environments, however, such as those encountered on invading a eukaryotic host, bacterial fitness may be positively associated, at least on a short time scale, with the generation of genome rearrangements.

Acknowledgements

My research is supported by the Swedish Natural Sciences Research Council. I thank Ivica Tamas for his assistance in building the phylogenetic tree.

References
Nature. 1997 Jun 12; 387(6634):708-13.
J Mol Biol. 1996 Jul 26; 260(4):506-22.
Nature. 2000 Sep 7; 407(6800):81-6.
J Mol Evol. 1987; 26(1-2):74-86.
Genome Biol. 2000; 1(6):RESEARCH0011.
Nucleic Acids Res. 2000 Mar 15; 28(6):1397-406.
Nucleic Acids Res. 1997 Feb 15; 25(4):701-12.
J Bacteriol. 1995 Nov; 177(22):6390-400.
Yeast. 2000 Jun 30; 17(2):111-23.
J Bacteriol. 1999 Feb; 181(3):1014-20.
Microbiology. 1999 Mar; 145(Pt 3):621-31.
Microbiology. 1997 Dec; 143(Pt 12):3723-32.
Electrophoresis. 1998 Apr; 19(4):569-72.
Proc Natl Acad Sci U S A. 2000 Sep 12; 97(19):10567-72.
FEMS Microbiol Lett. 2000 Jan 1; 182(1):93-8.
Mol Microbiol. 1999 Dec; 34(5):1058-69.
Antonie Van Leeuwenhoek. 1999 Jul-Nov; 76(1-4):27-76.
Antonie Van Leeuwenhoek. 1996 Oct; 70(2-4):161-83.
J Bacteriol. 2000 May; 182(9):2481-91.
J Bacteriol. 1998 Sep; 180(18):4834-42.
J Bacteriol. 1993 Sep; 175(17):5445-51.
Mol Microbiol. 1998 Jan; 27(1):99-106.
Proc Natl Acad Sci U S A. 1998 Nov 24; 95(24):14296-301.
Mol Microbiol. 2000 Feb; 35(3):490-516.
J Bacteriol. 1997 Jul; 179(14):4553-8.
J Bacteriol. 1999 Sep; 181(17):5512-5.
J Bacteriol. 1997 Sep; 179(18):5820-6.
J Mol Biol. 1997 Aug 22; 271(3):386-404.
Curr Microbiol. 2000 Jun; 40(6):372-9.
Nature. 2000 Mar 30; 404(6777):502-6.
Science. 2000 Mar 10; 287(5459):1809-15.
FEMS Microbiol Lett. 1996 Dec 1; 145(2):173-9.
J Mol Biol. 2000 Mar 24; 297(2):355-64.