Copyright © 2009 by the Genetics Society of America

Molecular Analysis of a Large Subtelomeric Nucleotide-Binding-Site–Leucine-Rich-Repeat Family in Two Representative Genotypes of the Major Gene Pools of Phaseolus vulgaris

*Institut de Biotechnologie des Plantes, UMR-CNRS 8618, INRA, Université Paris-Sud, 91 405 Orsay, France, †Department of Chromosome Biology, University of Vienna, 1030 Vienna, Austria and ‡Plate-Forme 4-Intégration et Analyse Génomiques, Génopole, Institut Pasteur, 75724 Paris, Cedex 15, France

1Corresponding author: Institut de Biotechnologie des Plantes, Bât. 630 UMR-CNRS 8618, INRA Université Paris-Sud, 91 405 Orsay, France. E-mail: valerie.geffroy/at/u-psud.fr

2Present address: Laboratório de Citogenética Vegetal, Departamento de Botânica–CCB, Universidade Federal de Pernambuco, Recife–PE, 50670-420, Brazil.

Communicating editor: T. Brutnell

Received July 4, 2008; Accepted December 11, 2008.

Abstract

In common bean, the B4 disease resistance (R) gene cluster is a complex cluster localized at the end of linkage group (LG) B4, containing at least three R specificities to the fungus Colletotrichum lindemuthianum. To investigate the evolution of this R cluster since the divergence of Andean and Mesoamerican gene pools, DNA sequences were characterized from two representative genotypes of the two major gene pools of common bean (BAT93: Mesoamerican; JaloEEP558: Andean). Sequences encoding 29 B4-CC nucleotide-binding-site–leucine-rich-repeat (B4-CNL) genes were determined: 12 from JaloEEP558 and 17 from BAT93. Although sequence exchange events were identified, phylogenetic analyses revealed that they were not frequent enough to lead to homogenization of B4-CNL sequences within a haplotype. Genetic mapping based on pulsed-field gel electrophoresis separation confirmed that the B4-CNL family is a large family specific to one end of LG B4 and is present at two distinct blocks separated by 26 cM.
Fluorescent in situ hybridization on meiotic pachytene chromosomes revealed that two B4-CNL blocks are located in the subtelomeric region of the short arm of chromosome 4 on both sides of a heterochromatic block (knob), suggesting that this peculiar genomic environment may favor the proliferation of a large R gene cluster. PLANT disease is one of the major limitations in crop production throughout the world and is responsible for huge economic losses (Madden and Wheelis 2003). Use of resistant genotypes is the most economic and ecologically safe means for controlling plant diseases (Hulbert et al. 2001; Hammond-Kosack and Parker 2003; Michelmore 2003). In the past 15 years, >60 resistance (R) genes following the classic gene-for-gene model (Flor 1955) have been cloned from various plant species (Martin et al. 2003; Meyers et al. 2005). The most prevalent plant R genes encode proteins containing a nucleotide-binding site (NBS) and a C-terminal leucine-rich-repeat (LRR) domain (Martin et al. 2003; Belkhadir et al. 2004; Rairdan and Moffett 2007). R genes belonging to this class have been identified in various plant species, in monocots as well as in dicots, and correspond to R genes effective against all types of pathogens and pests, including fungi, bacteria, viruses, nematodes, oomycetes, and insects (Dangl and Jones 2001; Hammond-Kosack and Parker 2003; Mchale et al. 2006). This NBS–LRR protein class can be divided into two subclasses on the basis of their amino-terminal sequence, corresponding to two ancient lineages (Bai et al. 2002; Meyers et al. 2003; Ameline-Torregrosa et al. 2008). One subclass is composed of toll interleukin 1 receptor (TIR)-NBS-LRR encoding genes characterized by the TIR domain homologous to the Drosophila toll and mammalian interleukin-1 receptor (Hammond-Kosack and Jones 1997). The second subclass is composed of NBS–LRR encoding genes without the TIR motif, which often includes a coiled-coil (CC) domain (Pan et al. 2000). 
Although these two subclasses are present in gymnosperm and dicot genomes, TIR–NBS–LRR are completely absent from monocot genomes (Zhou et al. 2004; Meyers et al. 2005). Annotation of the Arabidopsis thaliana, rice, poplar (Populus trichocarpa), Medicago truncatula, grape, and papaya genomes identified 149, 480, 317, 333, 233, and 55 genes encoding NBS–LRR proteins, respectively (Bai et al. 2002; Meyers et al. 2003; Zhou et al. 2004; Tuskan et al. 2006; Velasco et al. 2007; Ameline-Torregrosa et al. 2008; Kohler et al. 2008; Ming et al. 2008). Furthermore, the genes encoding NBS–LRR proteins have been shown to be the most polymorphic of all Arabidopsis genes (Clark et al. 2007). This richness of NBS–LRR sequences represents an important resource for combating pathogen attack, since the function of these proteins is to serve as surveillance molecules that detect infection by specific pathogens (Chisholm et al. 2006). NBS–LRR sequences are often tightly linked at complex loci (Hulbert et al. 2001; Leister 2004; McDowell and Simon 2006). For example, in the Arabidopsis genome, 73.2% (109 of 149) of the NBS–LRR sequences are located in clusters of genes and 40 sequences are singletons (Meyers et al. 2003). In eukaryotes, genetic systems for generating new functions are often based on complex loci comprising tandemly organized genes of related function (Borst and Greaves 1987). Complex resistance gene clusters identified in plants fit this situation. Indeed, the clustered organization is supposed to favor sequence exchanges, such as unequal crossing over and/or gene conversion events, which can give rise, in some cases, to new (nonparental) R specificities (Sudupak et al. 1993; Richter et al. 1995; Chin et al. 2001). The relative importance of unequal crossing over and gene conversion compared to point mutations in the evolution of R gene clusters has been discussed (Michelmore and Meyers 1998). 
If these two processes were prevalent, homogenization of the sequences within a haplotype would be observed. However, comparisons of intra- and interspecies resistance haplotypes, such as at the rice Pi2/9 cluster (Zhou et al. 2007), revealed that orthologs are often more similar than paralogs, suggesting a low rate of sequence homogenization from unequal crossing over and gene conversion (Michelmore and Meyers 1998). Consequently, evolution of R gene clusters results from a balance between mechanisms that, in addition to creating new genes and reassorting them, can also homogenize them and mechanisms leading to sequence diversification (Meyers et al. 2005). Strong positive selection on specific residues (x) of the LRR domain has been proposed as a mechanism leading to sequence diversification. Indeed, in the LRR domain, the “xxLxLxx” motif is predicted to form a β-strand/β-turn structure in which the x residues are solvent exposed and available to detect potential pathogen ligands (Bent and Mackey 2007). Recently, illegitimate recombination within the LRR domain was also proposed to be a frequent source of variability in the evolution of R gene clusters (Wickler et al. 2007). Although clustering of resistance genes is a widely discussed explanation for the evolution of specific resistance genes (Hulbert et al. 2001), studies on the evolution of R gene clusters are scarce compared with studies on R genes organized as “singletons” (Stahl et al. 1999; Tian et al. 2002; Mauricio et al. 2003). One such example is the analysis of the complex RGC2 cluster of lettuce, where two types of R genes (I and II) were identified. Type I genes present a high frequency of sequence exchange, whereas type II genes show few sequence exchanges and “orthologs” can be identified (Kuang et al. 2004). Common bean, Phaseolus vulgaris, is a plant species for which diversity of wild and cultivated forms has been extremely well documented (Kami et al. 1995; Broughton et al. 2003; Chacon et al.
2005). On the basis of morphological traits, phaseolin electrophoretic types, isozymes, and molecular markers, the P. vulgaris natural populations can be divided into three geographical regions of diversity: the Mesoamerican, the South Andean, and the North Andean centers. Genetic analyses based on neutral markers provide evidence for two independent domestication processes within the South Andean and Mesoamerican centers of diversity, leading to the development of two distinct groups of cultivated beans, called the Andean and Mesoamerican gene pools (Kami et al. 1995; Broughton et al. 2003; Chacon et al. 2005), respectively. These two gene pools are supposed to have diverged from a common ancestor ≤0.6 MYA (Chacon et al. 2007). In P. vulgaris, a complex R gene cluster, referred to as the B4 R gene cluster, is localized at the end of linkage group (LG) B4. This cluster comprises at least three R specificities of either Andean or Mesoamerican origin (Co-9, Co-y, and Co-z) and two R QTL effective against the fungal pathogen Colletotrichum lindemuthianum, the causal agent of anthracnose (Geffroy et al. 1999, 2000). Four expressed CC–NBS–LRR (CNL) encoding genes, mapped at the B4 R gene cluster, were previously characterized from two bean genotypes chosen to represent the Andean (JaloEEP558) and Mesoamerican (BAT93) cultivated gene pools (Ferrier Cana et al. 2003). To further characterize the organization of the B4 R gene cluster and to understand its evolution since the divergence of Andean and Mesoamerican gene pools, selected genomic sequences of the B4 R gene cluster were determined and compared in two common bean genotypes, BAT93 and JaloEEP558. In addition to the four previously sequenced CNL, 25 B4-CNL encoding genes were determined—10 from JaloEEP558 (Andean) and 15 from BAT93 (Mesoamerican). No features associating the R-like genes to either gene pool have been identified. 
A combination of fluorescent in situ hybridization (FISH) and genetic mapping based on pulsed-field gel electrophoresis (PFGE) separation confirmed that the B4-CNL family is a large CNL family specific to one end of LG B4, corresponding to chromosome 4. B4-CNL members were mapped at two distinct blocks separated by 26 cM. FISH on meiotic pachytene chromosomes revealed that the B4 R gene cluster is localized in a peculiar genomic environment, since B4-CNL sequences are located in the subtelomeric region of the short arm of chromosome 4 adjacent to a major heterochromatic block and on both sides of a minor heterochromatic block (knobs).

MATERIALS AND METHODS

Plant material and genetic mapping: Seventy-seven F9 recombinant inbred lines (RILs) derived from a cross between the Mesoamerican BAT93 genotype and the Andean JaloEEP558 genotype were used to map the B4-CNL sequences (phage inserts) on the integrated linkage map of common bean (Freyre et al. 1998). A PCR approach using specific oligonucleotide primers was used to map phage inserts. The list of primers used to map each phage insert is presented in supplemental Table 1. PCR reactions were completed in a volume of 25 μl containing 50 ng of template DNA, 1× PCR reaction buffer, 3 pmol of each primer, 50 μM of each dNTP, and 0.5 units of Red Taq Goldstar polymerase (Eurogentec, Seraing, Belgium). Amplifications were performed in a GeneAmp PCR system 2720 (PerkinElmer, Norwalk, CT). Presence/absence or size polymorphisms were scored on 77 F9 RILs. The MAPMAKER software, version 3.0, was used to map the genomic clones on the integrated linkage map (Lander et al. 1987). Linkage groups were established with a LOD threshold of 3.0 and a recombination fraction of 0.3. Marker order was estimated with a LOD threshold of 2.0, based on multipoint compare, order, and ripple analyses.
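The linkage criteria above (LOD threshold of 3.0, recombination fraction of 0.3) can be illustrated with a minimal two-point sketch. This is not MAPMAKER's implementation. It assumes a simple binomial likelihood for the proportion of recombinant lines and the Haldane–Waddington relation for RILs produced by selfing; the function names and the example count of 10 recombinant lines are hypothetical.

```python
import math


def ril_fraction_to_theta(big_r: float) -> float:
    """Convert the proportion of recombinant RILs (R) into the per-meiosis
    recombination fraction (theta), assuming selfed RILs:
    R = 2*theta / (1 + 2*theta), hence theta = R / (2 * (1 - R))."""
    if not 0.0 <= big_r <= 0.5:
        raise ValueError("R must lie between 0 and 0.5")
    return big_r / (2.0 * (1.0 - big_r))


def two_point_lod(n_lines: int, n_recombinant: int) -> float:
    """LOD score = log10 likelihood at the observed R minus log10 likelihood
    under no linkage (R = 0.5), for a binomial count of recombinant lines."""
    big_r = n_recombinant / n_lines
    if big_r >= 0.5:
        return 0.0  # no evidence for linkage
    log_l_hat = (n_lines - n_recombinant) * math.log10(1.0 - big_r)
    if n_recombinant > 0:
        log_l_hat += n_recombinant * math.log10(big_r)
    log_l_null = n_lines * math.log10(0.5)
    return log_l_hat - log_l_null


def is_linked(n_lines: int, n_recombinant: int,
              min_lod: float = 3.0, max_theta: float = 0.3) -> bool:
    """Apply both thresholds quoted in the text (assumed to act jointly)."""
    big_r = min(n_recombinant / n_lines, 0.5)
    theta = ril_fraction_to_theta(big_r)
    return two_point_lod(n_lines, n_recombinant) >= min_lod and theta <= max_theta


if __name__ == "__main__":
    # 77 RILs as in the study; 10 recombinant lines is a made-up example.
    n, k = 77, 10
    print("theta:", ril_fraction_to_theta(k / n))
    print("LOD:", two_point_lod(n, k))
    print("linked:", is_linked(n, k))
```

With 77 lines, even a marker pair with no recombinants cannot exceed a LOD of 77 × log10(2), which bounds the evidence any single pair can provide in a population of this size.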
Pulsed-field gel electrophoresis and hybridization: Conditions for high-molecular-weight DNA preparation in agarose blocks and restriction digestion with the rare cutting restriction endonuclease SalI have been described in Creusot et al. (1992) except that a different extraction buffer (10 mM Tris, pH 7.5; 15 mM KCl; 15 mM NaCl; 0.15 mM spermine; 0.5 mM spermidine; 2 mM EDTA; 0.5% Triton X-100 and 4.5% glucose) was used in this study for the DNA preparation. PFGE was carried out using a CHEF-DRIII apparatus (Bio-Rad). One-percent PFGE-grade agarose in 0.5× TBE was used as the matrix. Resolution of DNA fragments in the size range 50–600 kb was obtained by electrophoresis for 22 hr at 6 V/cm with a 12.5/40-sec pulse ramp at 10° in 0.5× TBE. The molecular weight standards were λ concatemers designed for use as size markers for PFGE (New England Biolabs). DNA transfer, hybridization conditions, and probe preparation were done as described in Geffroy et al. (1998). PRLJ1 is an NBS probe encompassing the P-loop–hydrophobic domain (HD) region of the NBS–LRR family specific to the B4 R gene cluster (Geffroy et al. 1999).

Library screening and sequence determination: BAT93 and JaloEEP558 genomic λ libraries were screened with the PRLJ1 probe (Geffroy et al. 1999) as described in Ferrier Cana et al. (2003, 2005). Phage DNA was isolated using Nucleobond AX columns (Macherey-Nagel). Restriction analysis, Southern blot hybridization with the PRLJ1 probe, partial sequencing of the phage insert ends, and genetic mapping allowed the identification of contigs of overlapping phages. Three of these phage contigs, referred to as ContB1 (72,929 bp; nine phages), ContJ1 (29,543 bp; six phages), and ContJ2 (33,523 bp; four phages), and one phage, “λB54” (16,690 bp, which for convenience will be referred to as “contig” ContB2 in this article), were completely sequenced, except the 3′-end of ContJ1 that consists of tandem repeats preventing proper sequencing.
For the remaining phages, sequencing was focused on B4-CNL sequences, using a primer walking strategy directly on phage DNA and/or on phage subcloned DNA, when two CNL were present on the same phage insert. Sequencing was performed using an automated ABI PRISM 3100 sequencer and the ABI prism BigDye Terminator Cycle sequencing kit (Applied Biosystems, Roissy, France) with custom primers or standard Sp6, T7, M13 reverse, or M13 (-21) primers. Raw DNA sequence data were visually inspected and assembled using the computer program Sequencher (Gene Codes, Ann Arbor, MI). The genomic sequence was annotated by using the gene prediction program Fgenesh (Softberry website) and was manually edited by a homology search against available databases. The genomic sequence of CNL-BA8 was used as a reference for the annotation of the NBS–LRR encoding genes (Ferrier Cana et al. 2003). The nucleotide sequence instead of the protein sequence was used for comparison because some genes, referred to as “pseudogenes,” do not contain the entire coding sequence. These pseudogenes contain a start methionine but present frameshift(s) or premature stop codon(s) leading to truncated predicted proteins compared to “full-length” encoding NBS–LRR (for details of their annotation, see supplemental Figure 2). The GenBank accession numbers are EU856768–EU856792. Sequence analysis: Multiple sequence alignments were performed using the Clustal X program and edited in GENEDOC (http://www.psc.edu/biomed/genedoc). Nucleotide sequence identities were established using the Gap program of the Genetics Computer Group (Madison, WI). COILs analyses (Lupas et al. 1991) were performed with the Macstripe, version 2.0b1, software, using a window of 14 and the MTK matrix. 
The average of distances between all pairs of CNL (the observed number of mutations with correction for probable invisible substitutions using the transversional model for the NBS domains and the general time reversible model for all other comparisons) was calculated with the PAUP software.

FISH mapping: Phage genomic clones λB35, λB10, λB62, and λB61, mapped, respectively, at genetic position (GP) 1, GP 2, GP 3, and GP 4, were used as probes. Probes were labeled by nick translation (Roche Diagnostics) with Cy3-dUTP (Amersham Pharmacia Biotech) or Spectrum Green-dUTP (Vysis). Mitotic chromosome preparation is described in Pedrosa-Harand et al. (2006). Meiotic chromosomes were prepared from young flower buds fixed in ethanol/acetic acid 3:1 (v/v). Buds were macerated in 0.4% cellulase/0.4% pectolyase/0.4% citohelicase in 0.01 M citric acid–sodium citrate buffer, pH 4.8, for 3 hr at 37°, incubated in 60% acetic acid for 30 min, and squashed after removal of petals and sepals. Slide selection and pretreatment are described in Pedrosa et al. (2001). Chromosome and probe denaturation and post-hybridization washes were performed according to Heslop-Harrison et al. (1991), with modifications described in Pedrosa-Harand et al. (2006), except that meiotic preparations were denatured at 73° for 3 min. Reprobing of slides was performed according to Heslop-Harrison et al. (1992). Photographs were taken on a Zeiss Axioplan (Carl Zeiss) equipped with a mono cool view CCD camera (Photometrics, Tucson, AZ). Images from the camera were combined and pseudocolored using the IPLab spectrum software (IPLab, Fairfax, VA). Distance of FISH signals to the closest telomere and total chromosome length were measured in 10 metaphases using the “analyze–measure length” function of the same software. Digital images were imported into Adobe Photoshop version 8 for final processing.
Positive selection assessment: The program CODEML from the PAML package (Yang 1997) was used to calculate the ω-ratio (the ratio of nonsynonymous to synonymous changes, dN/dS) for each site. B4-CNL members corresponding to pseudogenes were excluded from this analysis. Consequently, this analysis was conducted on the 18 nucleotide sequences corresponding to “full-length” B4-CNL (12 from BAT93 and 6 from JaloEEP558), whose multiple alignment was manually optimized in GENEDOC (http://www.nrbsc.org/gfsc/genedoc/index.html). Evolutionary models M7 and M8 were tested: model M7 assumes that no site is under positive selection (0 ≤ ω ≤ 1), whereas model M8 allows the occurrence of positively selected sites (ω > 1). Both M7 and M8 assume a β-distribution for the ω-value between 0 and 1. Diversifying selection was confirmed using a likelihood-ratio test comparing the likelihoods of models M8 and M7.

Phylogeny: Phylogenetic trees were built for complete CNL sequences and separated CNL domains (A, B, C, D, and E) defined in Figure 3.
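The likelihood-ratio test between M8 and M7 described in the positive selection assessment can be sketched as follows. This is an illustration, not the CODEML procedure: it assumes 2 degrees of freedom (as stated in the Results) and uses the closed-form χ² tail for that case. The function names are hypothetical, and the statistic 2Δ ln L = 938 is the value reported in the Results.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Twice the log-likelihood difference between the alternative model
    (here M8) and the null model (here M7)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_critical_df2(alpha: float) -> float:
    """Critical value of the chi-square distribution with 2 degrees of
    freedom. For df = 2 the survival function is exp(-x / 2), so the
    critical value is -2 * ln(alpha)."""
    return -2.0 * math.log(alpha)


def chi2_pvalue_df2(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-statistic / 2.0)


if __name__ == "__main__":
    statistic = 938.0  # 2 * delta lnL reported in the Results
    critical = chi2_critical_df2(0.01)
    print("critical value at 1%:", round(critical, 2))
    print("reject M7 in favor of M8:", statistic > critical)
    print("p-value:", chi2_pvalue_df2(statistic))
```

The closed form is specific to 2 degrees of freedom; for other values a general χ² routine would be needed.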
To test whether the tree topologies of the domains are congruent, four different methods were employed: (1) the Kishino–Hasegawa (KH) test (Kishino and Hasegawa 1989), (2) the Shimodaira–Hasegawa (SH) test (Shimodaira 2002), (3) the Swofford–Olsen–Waddell–Hillis (SOWH) test (Goldman et al. 2000), and (4) the expected likelihood weight (ELW) test (Strimmer and Rambaut 2002). The KH and SH tests were performed with PAUP. The SOWH and ELW tests were performed using PhyML (Guindon and Gascuel 2003). The best trees agreeing with the null hypotheses and the unconstrained ML tree, inferred in PhyML, employed the general time reversible (GTR) + I + G nucleotide substitution model.

RESULTS

Distribution of PRLJ1-like sequences in the common bean genome and estimation of the size of the B4 R gene cluster through PFGE genetic mapping: The bean NBS–PRLJ1 probe, encompassing the P-loop–HD region of plant R genes, was previously shown to be specific to the B4 R gene cluster, since all the polymorphic bands mapped at three different genetic positions (GP 1, GP 2, GP 3) defining a 2.7-cM interval at the end of linkage group B4 (Geffroy et al. 1999) (Figure 1).
General features of 29 B4-CNL members from two representative genotypes of the two major gene pools of common bean: BAT93 (Mesoamerican) and JaloEEP558 (Andean): To estimate the number and the nucleotide polymorphism of the NBS–LRR homologs present at the B4 R gene cluster, genomic λ-phage libraries of both BAT93 and JaloEEP558 genotypes were screened with the NBS–PRLJ1 probe, shown to be strictly specific to the B4 R gene cluster (this article and Geffroy et al. 1999). Complete or partial sequencing of the positive phage inserts allowed the identification of a total of 29 NBS–LRR homologs, 17 from BAT93 and 12 from JaloEEP558 (Figure 2).
Sequence analysis revealed that the 29 B4-CNL present a very high percentage of nucleotide identity, ranging from 80 to 95% (supplemental Table 2). The 18 full-length B4-CNL potentially encode proteins ranging from 1066 aa (CNL-JA71) to 1186 aa (CNL-J1) (supplemental Figure 2). For these 18 sequences, no introns were identified between the start codon and the terminal stop codon, as predicted by previous results obtained for the B4-CNL encoding genes (Ferrier Cana et al. 2003) and in agreement with what has been previously found for plant CNL-encoding genes in general (Meyers et al. 2003). As shown in Figure 3

Identification of highly conserved intergenic regions: Concerning the completely sequenced phage contigs, annotation of ContB1 (72,919 bp), ContJ1 (29,543 bp), ContB2 (16,690 bp), and ContJ2 (33,523 bp) revealed that all the putative ORFs present strong homology to CNL, but two ORFs, one at each 3′-end of ContB2 and ContJ2, show homology to an A. thaliana P-type transporting ATPase-like protein (At3g27870; E-value = 0; 86% similarity) (Figure 2B). Nucleotide pairwise comparisons revealed that, although the intergenic regions display a degree of structural diversity, blocks of closely related sequences occur within and between the phage contigs. Three CNL belonging to ContB1 and ContJ1 appear to be more closely related than average. These are CNL-B1*, CNL-B6*, and CNL-J1, which share ≥95% nucleotide identity (Figure 2A). The high sequence conservation, including intergenic regions between ContB1 and ContJ1, and the fact that these regions map at the same genetic position (GP 3) strongly suggest that these contigs derive from orthologous regions of BAT93 and JaloEEP558. Similar conclusions can be made for ContB2 and ContJ2.
Evolutionary rates of CNL sequence divergence from BAT93 and JaloEEP558: The rates of substitution within and between genotypes BAT93 and JaloEEP558 were studied to characterize the evolutionary force operating on the CNL gene family (Table 1). When complete CNL sequences from BAT93 or JaloEEP558 are inspected, similar rates of substitution are observed within and between CNL-B and CNL-J sequences. The same conclusion was obtained when the NBS domain or the LRR domain were analyzed separately. The rates of substitution are also nearly the same at a given GP (GP 1, GP 2, GP 3, or GP 4) for complete CNL sequence comparisons within and between genotypes. The fact that similar rates of substitution are observed within and between genotypes whatever the analyzed level (complete CNL, NBS, or LRR domain, four different GP) suggests that the CNL sequences did not homogenize within a haplotype because frequent occurrence of recombination events (unequal crossing over or gene conversion) is expected to give smaller values of substitution rates within each haplotype than between haplotypes. The LRR domain presents a rate of substitution nearly twice the one observed for the NBS domain. Finally, the rate of substitution at GP 1 is nearly half the one observed for the other GP (GP 2, GP 3, GP 4).
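The comparison of substitution rates within and between genotypes can be illustrated with a minimal sketch. It swaps in a plain p-distance (the proportion of differing aligned sites) for the model-corrected distances that the study computed in PAUP, so its numbers are not comparable to Table 1. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations, product


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences,
    ignoring columns where either sequence has a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def mean_within(seqs: list[str]) -> float:
    """Average distance over all pairs inside one group (one haplotype)."""
    dists = [p_distance(a, b) for a, b in combinations(seqs, 2)]
    return sum(dists) / len(dists)


def mean_between(group_1: list[str], group_2: list[str]) -> float:
    """Average distance over all pairs drawn from two different groups."""
    dists = [p_distance(a, b) for a, b in product(group_1, group_2)]
    return sum(dists) / len(dists)


if __name__ == "__main__":
    # Toy aligned fragments standing in for CNL-B and CNL-J sequences.
    cnl_b = ["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC"]
    cnl_j = ["ACGTACGTAA", "ACGTTCGTAC"]
    print("within CNL-B:", mean_within(cnl_b))
    print("within CNL-J:", mean_within(cnl_j))
    print("between CNL-B and CNL-J:", mean_between(cnl_b, cnl_j))
```

Under frequent sequence exchange within a haplotype, the within-group averages would be expected to fall clearly below the between-group average. Similar values, as reported in the text, argue against homogenization.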
CNL sequences from BAT93 (CNL-B) and JaloEEP558 (CNL-J) located at the same genetic position are often more similar than paralogs: Phylogenetic trees were built with the 29 complete nucleotide CNL sequences as well as with separate domains (NBS, LRR) (Figure 4).
CNL sequences from BAT93 and JaloEEP558 do not group into distinct Mesoamerican and Andean clades. On the contrary, the evolutionary history of CNL sequences from BAT93 and JaloEEP558 is interrelated (Figure 4). Likelihood-based tests of topology were performed because inspecting only robust nodes does not prove congruence of the topology. Significant incongruence was detected between the NBS and LRR trees (Figure 4, B and C).

Positively selected residues were identified mainly on “x” residues predicted to be solvent exposed in the LRR domain: A strong preference for nonsynonymous compared to synonymous codon substitutions has been detected in many plant R gene families (Ellis et al. 2000; Jones and Dangl 2006; Bent and Mackey 2007). To examine whether B4-CNL members are subject to diversifying selection, the program PAML was used on the 18 full-length B4-CNL (12 from BAT93 and 6 from JaloEEP558) (Yang 1997). Sites under diversifying selection were investigated using the M8 and M7 codon substitution models. The likelihood-ratio test statistic for comparing M8 with M7 is 2Δ ln L = 938, which is much greater than the χ² critical value (9.21 at the 1% significance level, with d.f. 2). This shows that our data were significantly more likely under the M8 model with positive selection on some codons than under the M7 model, which does not allow the presence of sites with ω > 1. A total of 72 sites under diversifying selection were identified with a posterior probability > 99% (Figure 3).

The B4 R gene cluster is localized in a subtelomeric region of the short arm of chromosome 4 and is adjacent to heterochromatic regions: To localize the B4 R gene cluster on chromosomes of the BAT93 bean genotype, four phages (λB35, λB10, λB62, and λB61 mapped at GP 1, GP 2, GP 3, and GP 4, respectively; Figure 1B)
The phage λB10 labeled multiple chromosome ends, a pattern that indicated the presence of one or more repetitive sequences in its insert. These sequences seem to be particularly abundant around the B4 R gene cluster, where they generated strong hybridization signals apparently colocalizing with λB61 (Figure 5, D and E). To confirm the position of λB35 and λB61 relative to each other and to the heterochromatic block, these two phages were also localized on meiotic pachytene chromosomes. The signals generated from both probes greatly overlapped (Figure 5, G and H).

DISCUSSION

In this report, two genotypes representative of the two common bean gene pools, BAT93 (Mesoamerican) and JaloEEP558 (Andean), were selected to investigate the genetic events involved in the evolution of the B4 R gene cluster since the divergence of these two gene pools. We gained insight into the organization and the evolution of the B4 R gene cluster by using a combination of approaches, including sequence analysis of 29 CNL from both bean genotypes, FISH experiments, and genetic mapping based on PFGE separation. Genetic mapping based on hybridization with NBS–PRLJ1 after PFGE separation confirmed that the B4-CNL members are specific to the end of bean LG B4. They are divided into two distinct blocks: block 1 spans 2.7 cM at the end of LG B4 (corresponding to GP 1, GP 2, GP 3) and is separated by 26.4 cM from the second block (block 2). In agreement with their genetic location on the bean genetic map, FISH experiments showed that the B4-CNL sequences are present only at the end of the short arm of chromosome 4. In the common bean genome, one other CNL cluster, the Co-2 cluster, has been identified at the end of LG B11 (Geffroy et al. 1998; Creusot et al. 1999). Co-2-CNL sequences, presenting ~65% nucleotide identity with PRLJ1, were not picked up under the hybridization conditions used in this article.
FISH mapping of B4-CNL sequences to meiotic pachytene chromosomes provides additional information: the B4-CNL sequences are located only in the subtelomeric region of the short arm of chromosome 4 on both sides of a minor heterochromatic block (knob). Furthermore, a second knob, referred to as “major,” has also been identified in a more distal position. The organization “major knob/B4-CNL sequences/minor knob/B4-CNL sequences” suggests that chromosome inversion might have split an ancestral knob into two knobs. The major knob is unambiguously placed in a distal position, while the minor knob is tentatively placed between GP 3 and GP 4, since these two genetic positions are separated by 26.4 cM and since λB35 (mapped at GP 1) and λB61 (mapped at GP 4) labeled on both sides of the minor knob (Figure 5C).

Our analysis indicates that the B4 R gene cluster is a large cluster in terms of its physical size as well as in terms of the number of CNL that it encodes. Indeed, the physical size of block 1 was estimated to be at least 1000 kb in BAT93 and 700 kb in JaloEEP558. Furthermore, a minimal number of 17 and 12 B4-CNL are present in BAT93 and JaloEEP558, respectively. Thus, the B4 cluster is relatively large compared to other plant R gene clusters. For example, the tomato I2 locus encodes 7 members in 90 kb (Simons et al. 1998), the maize rp3 locus encodes 5 members in 140 kb (Webb et al. 2002), and the lettuce Dm3 locus encodes 32 members within at least 3000 kb (Kuang et al. 2004). Large variation in gene copy number among different genotypes has been observed, such as for the maize Rp1 cluster, in which a range of 1–52 copies has been estimated (Smith et al. 2004). In our analysis, no such dramatic differences in B4-CNL copy number were observed between BAT93 (17 CNL) and JaloEEP558 (12 CNL).
However, hybridization experiments with PRLJ1 suggest a higher level of complexity in BAT93 than in JaloEEP558, since BAT93 exhibits a higher number of strongly hybridizing bands than JaloEEP558 (Figure 1A).

The genomic organization of the B4-CNL sequences into two blocks mirrors what is observed for resistance specificities since two distinct groups of resistance specificities and resistance QTL have been described: one group (containing Co-9, Co-y, Co-z) is located in a distal position (in block 1 in Figure 1B).

One important feature of heterochromatic knobs, described in maize by McClintock et al. (1981), is that they are highly unstable regions. This instability is exemplified by results from A. thaliana where the heterochromatic knob located on chromosome 4 close to the centromere is present in ecotypes Col and Ws but absent in Ler (Fransz et al. 2000). The molecular mechanism underlying this instability is not known. In A. thaliana, suppression of recombination in the region containing the knob is observed in a cross between Col and Ler (Drouaud et al. 2006). This suppression of recombination probably reflects the absence of the knob in Ler rather than an intrinsic feature of heterochromatic blocks. In this study, suppression of recombination is not observed in a cross between BAT93 and JaloEEP558, since genetic distance between GP 1 and GP 4, containing the minor knob, was estimated to be 29.1 cM. On the contrary, increased levels of recombination are suspected. Indeed, it is only recently that GP 1, GP 2, and GP 3 have been linked to one end of LG B4. In a previous version of the bean linkage map, they were considered as a separate small linkage group referred to as LG B14 (Freyre et al. 1998; Geffroy et al. 1999). The difficulty in linking LG B14 to LG B4 could be explained either by an increased level of recombination in the region containing the minor knob or by the existence of a low level of polymorphism between BAT93 and JaloEEP558 in that region.
An alternative hypothesis would be that the absence of the minor knob in JaloEEP558 results in a polymorphism between BAT93 and JaloEEP558 and is responsible for the difficulty in linking LG B14 to LG B4. Similarly, in Lotus japonicus, cytogenetic analyses revealed that an inversion in the region between LG 1A and LG 1B was unexpectedly responsible for the difficulty of linking these two linkage groups because this polymorphism would have led, in principle, to suppression of recombination (Pedrosa et al. 2002). Cytogenetic analyses of pachytene chromosomes from JaloEEP558 are currently underway to study the existence of the minor knob in JaloEEP558. Our results provide two leads to explain the origin of the important proliferation of CNL sequences at the B4-R gene cluster: (i) its proximity to heterochromatin blocks (knobs) and (ii) its subtelomeric localization. Concerning the proximity to heterochromatin, the tomato Mi-1 R gene cluster has also been located near heterochromatin, more precisely in the junction of euchromatin and pericentromeric heterochromatin on chromosome 6 (Zhong et al. 1999; Seah et al. 2007). This region in tomato is highly enriched in R genes effective against diverse pathogens as well as in resistance gene analog sequences with unidentified function (Seah et al. 2007). In our study, the B4-CNL sequences also seem to be located at a junction of euchromatin and heterochromatin. Consequently, it is tempting to speculate that some features of the chromosome structure in these euchromatin/heterochromatin junction regions may provide a favorable environment for R gene proliferation. We hypothesize that it could be due to some form of gene silencing. Indeed, genes inserted in proximity to heterochromatin are often silenced as in the case of position-effect variegation in Drosophila, and chromatin structure is known to play a major role in epigenetic regulation of the genome (Schotta et al. 2003; Lippman and Martienssen 2004). 
The importance of RNA silencing in maintaining low levels of expression at plant disease resistance gene clusters was recently identified by Yi and Richards (2007) for the Arabidopsis RPP5 locus. They further hypothesize that the disruption of RNA silencing by pathogens, as occurs in infections by viruses and Agrobacterium (Dunoyer et al. 2006; Ding and Voinnet 2007), may play an important role in optimizing plant response to pathogen attack by increasing the expression of R genes.

On the other hand, the subtelomeric location of the B4 R gene cluster is another feature that might favor R gene proliferation. Indeed, subtelomeres of such diverse organisms as human, yeast, and trypanosomes are dynamic and variable mosaics of multichromosomal blocks of sequence, resulting from the fact that subtelomeres are hotspots of interchromosomal recombination (Mefford and Trask 2002; Linardopoulou et al. 2005). It has been proposed that the processes acting on subtelomeric regions may have a role in diversifying gene families (Mefford and Trask 2002). Furthermore, in several organisms such as Saccharomyces cerevisiae, genes in close proximity to telomeres undergo transcriptional silencing, a phenomenon known as “telomere position effect” (Mefford and Trask 2002). This putative silencing effect could have a beneficial effect on R gene proliferation in a similar way to what we proposed above for heterochromatic silencing. In the common bean genome, most of the well-characterized large R gene clusters are located at the end of linkage groups, suggesting that the location at the end of a linkage group, and by inference a subtelomeric location, is favorable for R gene proliferation. For example, the Co-x, I, and Co-2 loci have been mapped to the ends of LG B1, LG B2, and LG B11, respectively (Geffroy et al. 1998, 2008; Creusot et al. 1999; Vallejos et al. 2006).
Furthermore, the Co-4 anthracnose resistance locus was also confirmed to be in a subtelomeric position of bean chromosome 8 (Melotto et al. 2004). In other plant species, large R gene clusters, such as the maize Rp1 cluster and barley Mla cluster, are also located at the end of linkage groups (Pryor and Ellis 1993; Wei et al. 1999). However, systematic localization of large R gene clusters in subtelomeric regions is not always observed, for example, in the A. thaliana genome (Meyers et al. 2003). In addition, subtelomeric regions in A. thaliana do not share extensive similarity among most nonhomologous chromosomes, such as seen in yeast and humans (Kuo et al. 2006). This suggests that additional data are needed to understand the behavior of subtelomeric regions in plants (Fan et al. 2008) and that A. thaliana may not be representative of all plant species. In conclusion, in common bean, the localization of the B4 R gene cluster in regions with high genome plasticity (proximity to knobs and/or subtelomeric localization) is likely to increase genetic and epigenetic variations, which may in turn result in the accelerated evolution of R gene specificities.

In plants and animals, several multigene families, such as rRNA genes, are subject to concerted evolution, in which family members share greater sequence identity within a species than between species. This homogenization of sequence within a given species is probably the result of repeated occurrences of unequal crossing over and/or gene conversion (Nei and Rooney 2005). Sequencing of 29 B4-CNL from two bean genotypes representative of the two gene pools of P. vulgaris gave us the opportunity to compare paralogs and orthologs. No sequence homogenization of the B4-CNL within a haplotype was observed. Indeed, similar substitution rates were observed for CNL sequences within and between haplotypes.
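The within-haplotype versus between-haplotype comparison described above can be illustrated with a minimal sketch. It assumes pre-aligned sequences of equal length and uses a simple proportion of differing sites (p-distance) in place of the substitution-rate estimates used in this study. The function names and the toy sequences are hypothetical and are not data from the B4 cluster. Under concerted evolution, the mean within-haplotype distance would be expected to be clearly lower than the mean between-haplotype distance; similar values are consistent with an absence of homogenization.

```python
from itertools import combinations, product


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def within_between(hap_a, hap_b):
    """Return (mean within-haplotype distance, mean between-haplotype distance)."""
    # All pairs of sequences taken from the same haplotype (paralog comparisons).
    within = [
        p_distance(x, y)
        for hap in (hap_a, hap_b)
        for x, y in combinations(hap, 2)
    ]
    # All pairs with one sequence from each haplotype.
    between = [p_distance(x, y) for x, y in product(hap_a, hap_b)]
    return sum(within) / len(within), sum(between) / len(between)


# Hypothetical toy alignments, for illustration only.
hap_a = ["ACGTACGTAC", "ACGTTCGTAC", "TCGAACGGAC"]
hap_b = ["ACGTACGTAA", "TCGAACGGAT", "ACGTTCGAAC"]

within, between = within_between(hap_a, hap_b)
print(f"mean within-haplotype distance:  {within:.3f}")
print(f"mean between-haplotype distance: {between:.3f}")
```

A p-distance ignores multiple substitutions at the same site and does not separate synonymous from nonsynonymous changes, so this sketch only shows the logic of the comparison, not the estimator itself.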
Furthermore, phylogenetic analyses revealed that Andean and Mesoamerican B4-CNL do not form two separate clades, but that both genotypes possess an assortment of different CNL. Indeed, sequences from BAT93 and JaloEEP558 located at the same GP are often, but not always, more similar than paralogs. This pattern of evolution suggests that the B4-CNL multigene family is not subject to concerted evolution but rather follows the “birth-and-death” model of evolution (Nei et al. 1997). This model was first proposed to explain the evolution of MHC genes in mammals (Nei et al. 1997; Nei and Rooney 2005) and has been subsequently adapted for R gene cluster evolution in plants (Michelmore and Meyers 1998). As expected under the birth and death model (Nei and Rooney 2005), the B4 R gene cluster contains a large number of CNL corresponding to pseudogenes (in JaloEEP558, 6 pseudogenes of 12 B4-CNL; in BAT93, 5 pseudogenes of 17 B4-CNL). However, even if recombination events were not frequent enough to lead to sequence homogenization within one haplotype, two types of evidence suggest that recombination occurs occasionally in the B4 R gene cluster. First, significant incongruence was identified between trees built on different domains (A, B, C, D, and E) of the B4-CNL sequences. Second, a duplicated 7-kb region of highly conserved sequences (>95% nucleotide identity), including the noncoding region, was identified in BAT93 (green region on ContB1 in Figure 2A).

In conclusion, the subtelomeric location of the B4-CNL sequences combined with B4-CNL's proximity to knobs might confer to this cluster an unusual potential for adaptation to new strains of an ever-changing array of pathogens, as testified by the numerous CNL, specific R genes, and resistance QTL mapped at the B4 R gene cluster.

Phage sequencing revealed that B4-CNLs are tightly clustered at CNL-rich regions with an average density of one CNL sequence every 4 kb.
PFGE analysis revealed that the B4-CNL sequences are present on at least 1000 kb in BAT93 for block 1. This large physical size prompted us to use a more suitable genomic library to study the B4 R gene cluster. We are currently using a BAC library from genotype BAT93, presenting an average insert size of 125 kb (Kami et al. 2006) to establish the complete sequence of the B4 R gene cluster to understand the factors involved in its large expansion.

Acknowledgments

We thank Peter Moffett and Nicolas Chen for helpful discussions and critical reading of the manuscript. We thank Anne-Valérie Gendrel for stimulating discussions. We also thank the two anonymous reviewers for their valuable advice. We thank Dieter Schweizer (Gregor Mendel Institute of Molecular Plant Biology) and Austrian Science Fund for supporting the participation of A.P.-H. in this project. The research was supported by the Institut National de la Recherche Agronomique, the Centre National de la Recherche Scientifique, and the Ministère de la Recherche.