Copyright © 2005, American Society of Plant Biologists

Evolutionary Expansion, Gene Structure, and Expression of the Rice Wall-Associated Kinase Gene Family

Department of Plant and Microbial Biology, University of California, Berkeley, California 94720 (S.Z., C.C., L.M., J.S., P.G.L.); Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, Connecticut 06520–8104 (L.L., X.-W.D.); Department of Horticulture, Michigan State University, East Lansing, Michigan 48824 (N.J.); and Department of Biology, San Francisco State University, San Francisco, California 94132 (Z.-H.H.)

*Corresponding author; e-mail shibo/at/nature.berkeley.edu; fax 510–642–7356.

Received July 28, 2005; Revised August 22, 2005; Accepted August 24, 2005.

Abstract

The wall-associated kinase (WAK) gene family, one of the receptor-like kinase (RLK) gene families in plants, plays important roles in cell expansion, pathogen resistance, and heavy-metal stress tolerance in Arabidopsis (Arabidopsis thaliana). Through a reiterative database search and manual reannotation, we identified 125 OsWAK gene family members from rice (Oryza sativa) japonica cv Nipponbare; 37 (approximately 30%) OsWAKs were corrected/reannotated from earlier automated annotations. Of the 125 OsWAKs, 67 are receptor-like kinases, 28 receptor-like cytoplasmic kinases, 13 receptor-like proteins, 12 short genes, and five pseudogenes. The two-intron gene structure of the Arabidopsis WAK/WAK-Likes is generally conserved in OsWAKs; however, extra/missed introns were observed in some OsWAKs either in extracellular regions or in protein kinase domains. 
In addition to the 38 OsWAKs with full-length cDNA sequences and the 11 with rice expressed sequence tag sequences, gene expression analyses, using tiling-microarray analysis of the 20 OsWAKs on chromosome 10 and reverse transcription-PCR analysis for five OsWAKs, indicate that the majority of identified OsWAKs are likely expressed in rice. Phylogenetic analyses of OsWAKs, Arabidopsis WAK/WAK-Likes, and barley (Hordeum vulgare) HvWAKs show that the OsWAK gene family expanded in the rice genome due to lineage-specific expansion of the family in monocots. Localized gene duplications appear to be the primary genetic event in OsWAK gene family expansion and the 125 OsWAKs, present on all 12 chromosomes, are mostly clustered. Efficient communication between the plant cell wall and the cytoplasm is important in plant development and in responding to biotic and abiotic stresses (Kohorn, 2000; Brownlee, 2002; Somerville et al., 2004). The wall-associated kinase (WAK) gene family, which belongs to the receptor-like kinase (RLK) superfamily in plants (Shiu and Bleecker, 2001, 2003), plays a critical role in this communication (He et al., 1996; Kohorn, 2001). WAK1, the first member of the WAK gene family identified in Arabidopsis (Arabidopsis thaliana), encodes a protein containing an intracellular Ser-Thr kinase domain and extracellular domains with similarities to vertebrate epidermal growth factor (EGF)-like domains (He et al., 1996). From the Arabidopsis genome (Arabidopsis Genome Initiative, 2000), 26 WAK and WAK-Like (WAKL) genes were identified and most of the 26 members are expressed in Arabidopsis (Verica and He, 2002; Verica et al., 2003). Functional studies of the different WAK members in Arabidopsis demonstrated that they are involved in various functions in plants, including pathogen resistance (He et al., 1998), heavy-metal tolerance (Sivaguru et al., 2003), and plant development (Lally et al., 2001; Wagner and Kohorn, 2001). 
Biochemical studies demonstrated that WAK proteins are covalently bound to pectin in the cell wall (Wagner and Kohorn, 2001). They can also form an approximately 500-kD protein complex via interactions with a Gly-rich extracellular protein, AtGRP-3 (Park et al., 2001), and can interact with a cytoplasmic, type 2C kinase-associated protein phosphatase (Anderson et al., 2001). Based on these attributes, it was suggested that WAKs serve as physical links between the extracellular matrix and the cytoplasm and also as a signaling component between the cell wall and the cytoplasm (He et al., 1996; Kohorn, 2001). Recently, a study of seven (group II) WAKL members in Arabidopsis showed that they have tissue-specific and developmentally regulated expression patterns and are functionally similar to WAKs (Verica et al., 2003). To further understand the functions and evolution of the WAK gene family in plants, we analyzed the WAK gene family in rice (Oryza sativa). Rice is one of the most important cereals and also considered a model for other cereal species, including maize (Zea mays), wheat (Triticum aestivum), barley (Hordeum vulgare), and sorghum (Sorghum bicolor). Rice genomic sequences are now available from both subspecies indica (Yu et al., 2002) and japonica (Feng et al., 2002; Goff et al., 2002; Sasaki et al., 2002; Rice Chromosome 10 Sequencing Consortium, 2003; International Rice Genome Sequencing Project, 2005). Previous analyses of the RLK superfamily from the indica genome sequence indicated that the WAK/WAKL gene family was one of the few RLK subfamilies that expanded in rice compared to Arabidopsis (Shiu et al., 2004). Here we present a detailed analysis of the rice WAK/WAKL (OsWAK) gene family from japonica. First, through public database searches, we retrieved all genes annotated as putative OsWAKs, followed by reiterative database searches from which we obtained additional putative OsWAK gene family members. 
Manual reannotation was performed to correct or reannotate the misannotated putative OsWAK genes, such as split genes, fused genes, short genes, and pseudogenes. We determined expression characteristics for certain OsWAK members and, based on domain composition, we classified OsWAKs into five groups and performed comparative phylogenetic analyses of WAKs in Arabidopsis, rice, and barley to understand the possible mechanisms of gene family expansion.

RESULTS

Identification and Classification of OsWAKs from Genome Sequences of Rice Subsp. japonica cv Nipponbare

Four analytical steps were used to identify and classify OsWAKs from subsp. japonica cv Nipponbare. First, all genes annotated as putative OsWAKs were retrieved from three public databases: (1) The Institute for Genomic Research (TIGR) Rice Genome Annotation Database (Osa1, release 1 and 2; Yuan et al., 2005); (2) Rice Protein Database in GRAMENE (Ware et al., 2002); and (3) GenBank in the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov). The number of putative OsWAKs annotated differed among the databases. Identical sequences from the same or overlapping contigs, but annotated with different putative OsWAK names in the various databases, were identified and removed. The second step, aimed at a more complete search for putative OsWAKs in rice, was performed using BLASTp and tBLASTn (E < 0.1) searches of Osa1 and the japonica cv Nipponbare genome sequences in GenBank. These analyses utilized 10 distinct putative OsWAK protein sequences (2768.t00013, 2541.t00004, 8323.t00010, 4577.t00017, 9637.t02571, 4967.t00013, 3135.t00009, gi|19881772, 9638.t00862, and 6563.t00002), which contained both the extracellular EGF-like domains (EGF-2, PS01186/PROSITE and/or EGF-Ca2+, PS01187/PROSITE) and an intracellular protein kinase domain. 
Most new genes obtained from this reiterative database search were originally annotated either as kinase domain-containing protein genes or receptor-like protein (RLP) genes. In the third step, each putative OsWAK sequence was manually assessed for the EGF-like domains, kinase domain, or its sequence similarity to other putative OsWAKs. In our analyses, in order for a gene to be defined as a putative OsWAK, the gene had to fit into one of the five OsWAK types (see Table I). Based on these five criteria, a total of 129 putative OsWAKs were initially identified from japonica cv Nipponbare.
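The five OsWAK types can be read as a simple decision rule. The sketch below is one possible encoding, assuming the definitions given elsewhere in the text (RLKs have both EGF-like and kinase domains, RLCKs only the kinase domain, RLPs only the EGF-like domains, short genes encode fewer than 300 amino acids with more than 40% identity to a longer OsWAK). The `GeneModel` fields, the `disrupted` flag, and the function name are hypothetical; Table I itself is not reproduced here, and the candidate is assumed to have already passed the sequence-similarity screen.

```python
from dataclasses import dataclass
from typing import Optional

SHORT_GENE_MAX_AA = 300          # "<300 amino acids" in the text
SHORT_GENE_MIN_IDENTITY = 40.0   # ">40% amino acid identity" in the text


@dataclass
class GeneModel:
    """Hypothetical summary of one candidate gene (not the authors' data model)."""
    name: str
    protein_length: int        # predicted protein length, amino acids
    has_egf_like: bool         # EGF-like domain detected
    has_kinase: bool           # protein kinase domain detected
    identity_to_oswak: float   # best % amino acid identity to a longer OsWAK
    disrupted: bool = False    # assumed flag: stop codon, frame shift, or missing 5' exon


def classify_oswak(gene: GeneModel) -> Optional[str]:
    """Return one of the five OsWAK types, or None if the gene fits none."""
    if gene.disrupted:
        return "pseudogene"
    if gene.has_egf_like and gene.has_kinase:
        return "OsWAK-RLK"
    if gene.has_kinase:
        return "OsWAK-RLCK"
    if gene.has_egf_like:
        return "OsWAK-RLP"
    if (gene.protein_length < SHORT_GENE_MAX_AA
            and gene.identity_to_oswak > SHORT_GENE_MIN_IDENTITY):
        return "OsWAK short gene"
    return None
```

A candidate that returns `None` corresponds to what the text calls a non-OsWAK; in the study such cases were resolved by manual inspection rather than by a fixed rule.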
As the last step, manual reannotation was performed to correct or reannotate the misannotated putative OsWAKs, as described below. This analysis included 10 putative split OsWAKs merged into five OsWAK-RLKs, four fused putative OsWAKs split into five different OsWAKs, and six non-OsWAKs, leaving a final total of 125 OsWAKs; the numbers of each type of OsWAK are shown in Table I. For the 125 OsWAKs, the detailed annotations (Supplemental Tables I and II), genomic and predicted coding/full-length (FL)-cDNA sequences (Supplemental Data 1), and predicted protein sequences (Supplemental Data 2) are presented. Correction and Reannotation of Misannotated Putative OsWAKs Automated annotations of rice genome sequences of japonica cv Nipponbare in TIGR and GenBank are useful as initial resources for gene annotation, but individual genes and gene families are generally not verified and corrected. Several types of annotation errors were found in the automated annotation of the Arabidopsis genome, including intron numbers/positions, merged/split genes, and missed short genes; approximately 35% of the initially annotated genes from Arabidopsis were corrected from FL-cDNA sequences (Haas et al., 2003). We therefore performed a similar assessment of the initially identified, 129 putative OsWAKs derived from automated annotation using the rice FL-cDNA sequences in the Knowledge-Based Oryza Molecular Biological Encyclopedia database (Kikuchi et al., 2003; http://cdna01.dna.affrc.go.jp/cDNA). For 31 putative OsWAKs, one corresponding FL-cDNA sequence was obtained; for eight other OsWAKs, more than one corresponding FL-cDNA sequence was obtained (Supplemental Table III). Of the 39 putative OsWAKs that matched with FL-cDNAs, only 20 had predicted coding sequences that perfectly or nearly perfectly matched the corresponding FL-cDNA sequences (identity ≥99.5%). For the other 19, the FL-cDNA sequences were either shorter (seven) or longer (12) than the predicted coding sequences. 
All of the FL-cDNA sequences, except one, have both a 5′ untranslated region (UTR) and a 3′ UTR, indicating they are most likely intact cDNA sequences. Therefore, these 19 putative OsWAKs were corrected using their corresponding FL-cDNA sequences, although it is possible that some or all of the shorter FL-cDNAs could derive from truncated sequences. Reannotation of 14 out of the 19, along with 13 additional putative OsWAKs, is described below.

Split/Fused OsWAK Genes

From the FL-cDNA sequence analysis described above, two putative OsWAK-RLPs, OsWAK30 and OsWAK31, matched to the same FL-cDNA sequence (AK058435). By examining matched regions, the two OsWAK-RLPs were found to match complementally to the FL-cDNA sequence (OsWAK30 on the 5′ side and OsWAK31 on the 3′ side). Also, both putative genes were located within an approximately 12-kb genomic region on chromosome 4 and no extra genes were predicted in the region between the two putative OsWAKs (Fig. 1).
Fourteen other putative, split OsWAKs (seven putative OsWAK-RLPs and seven putative OsWAK-RLCKs) also matched complementally to various OsWAK-RLKs (data not shown). However, for each of these pairs, one to eight different genes interrupt the sequence (Fig. 2).
Three putative OsWAKs (OsWAK5/2768.t00008, OsWAK73/9636.t03851, OsWAK89/9637.t03234) from Osa1 in TIGR were found to be fused genes. Putative OsWAK5 and OsWAK73 are fused with non-OsWAKs, and the third has two fused OsWAKs (Fig. 3).
OsWAK Short Genes Five putative OsWAKs (OsWAK17, OsWAK19, OsWAK54, OsWAK62, OsWAK67) were initially identified as OsWAK short genes. This was due to the fact that their predicted protein sequences were <300 amino acids and did not encode an EGF-like or kinase domain, but they had >40% amino acid identity to the longer OsWAKs. From the FL-cDNA sequence analyses described above, seven more putative OsWAKs (OsWAK18, OsWAK23, OsWAK35, OsWAK37, OsWAK52, OsWAK101, OsWAK127) were reannotated as OsWAK short genes. This was also due to the fact that the deduced protein sequences from their longest open reading frames (ORFs) are <300 amino acids and do not encode an EGF-like or kinase domain. Of the seven OsWAK short genes, four (OsWAK18, OsWAK52, OsWAK101, OsWAK127) had antisense transcripts and the other three (OsWAK23, OsWAK35, OsWAK37) had sense transcripts. In order to further understand how the OsWAK short genes might have arisen in the rice genome, genomic sequences adjacent to the seven OsWAK short genes were analyzed. These analyses found various transposable elements (TEs) were inserted at either the 5′ or 3′ ends of the FL-cDNAs of five of the OsWAK short genes (Table II).
Further analyses indicated that the inserted TEs likely did not interrupt the OsWAKs; however, they either provided novel splicing sites or initiated antisense transcription, which resulted in short ORFs for the OsWAKs. For example, the 3′ end of the OsWAK23 FL-cDNA is located inside a long interspersed nuclear element (LINE) that appears to provide a novel splicing acceptor site. As a result, both the EGF-like and kinase domains were spliced out and thus are not present in the FL-cDNA sequence of OsWAK23 (Fig. 4).

OsWAK Pseudogenes

Twelve putative OsWAKs were initially annotated as pseudogenes in GenBank because of the apparent lack of the 5′ exon, when compared to other OsWAK-RLKs, or the presence of stop codons or frame shifts in their coding regions (Supplemental Table II). However, four (OsWAK21, OsWAK76, OsWAK82, OsWAK84) had the corresponding FL-cDNA sequences available (Fig. 5).
Intron Number and Position of OsWAKs

Arabidopsis RLK-type WAK/WAKLs have a two-intron/three-exon gene structure pattern within their coding regions (Verica and He, 2002). Both introns are located within the extracellular regions; the first intron is between the two EGF-like domains and the second is between an EGF-like domain and the kinase domain. Based on predicted coding sequences or FL-cDNA sequences, intron numbers in the coding regions for each of the 125 OsWAKs were determined (Supplemental Table II). For the 38 OsWAKs having a FL-cDNA sequence, intron numbers and positions, along with domain positions, are shown in Figure 5.

Domain Composition and Organization of OsWAKs

Excluding the five OsWAK pseudogenes and the 38 OsWAKs with FL-cDNAs (Fig. 5)
Expression Analysis of OsWAKs

Thirty-eight OsWAKs have corresponding FL-cDNA sequences (Fig. 5).

In addition, five different OsWAKs were chosen for expression analysis using reverse transcription (RT)-PCR: OsWAK7, OsWAK50, OsWAK125, plus two short genes, OsWAK17 (48% amino acid identity to the C terminus of OsWAK53) and OsWAK62 (92% amino acid identity to the extracellular region of OsWAK63). DNA samples from leaf tissues were used in PCR analyses; three RNA samples from the root tip, root base, and shoot tissues were used in RT-PCR analyses. OsWAK50, which has a FL-cDNA sequence available, had RT-PCR products of the expected size (1,084 bp) from all three tissues; because two introns were present, its genomic PCR product was larger (1,308 bp). This result confirmed that OsWAK50 is an expressed gene and showed there was no DNA contamination in the RNA samples used for RT-PCR reactions. The same DNA and RNA samples were used for the other four OsWAKs. OsWAK7 and OsWAK125, which lack FL-cDNA sequences, also had RT-PCR products from one or more rice tissues; because they had no intron, their RT-PCR products (OsWAK7, 758 bp; OsWAK125, 800 bp) were the same sizes as their PCR products (Fig. 7).
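The size reasoning behind the OsWAK50 control can be written out as a small calculation. The sketch below is illustrative only: the function names are hypothetical, and the only numbers used are the product sizes quoted in the text (1,084 bp RT-PCR and 1,308 bp genomic PCR for OsWAK50; 758 bp for both products of OsWAK7).

```python
def spanned_intron_bp(genomic_pcr_bp: int, rt_pcr_bp: int) -> int:
    """Total intron length between the primers: genomic product minus cDNA product."""
    if genomic_pcr_bp < rt_pcr_bp:
        raise ValueError("genomic product cannot be shorter than the cDNA product")
    return genomic_pcr_bp - rt_pcr_bp


def band_origin(observed_bp: int, genomic_pcr_bp: int, rt_pcr_bp: int) -> str:
    """Interpret a band in an RT-PCR lane by its size alone."""
    if genomic_pcr_bp == rt_pcr_bp:
        # Intronless amplicon: cDNA and genomic DNA give the same band.
        return "ambiguous" if observed_bp == rt_pcr_bp else "unexpected"
    if observed_bp == rt_pcr_bp:
        return "cDNA"
    if observed_bp == genomic_pcr_bp:
        return "genomic DNA contamination"
    return "unexpected"


# OsWAK50: the two introns account for 1,308 - 1,084 bp of the genomic product.
oswak50_introns = spanned_intron_bp(1308, 1084)
# OsWAK7 has no intron, so size alone cannot separate cDNA from genomic DNA.
oswak7_call = band_origin(758, 758, 758)
```

For the intronless genes the call is necessarily "ambiguous", which is why the absence of a 1,308-bp OsWAK50 band in the same RNA samples matters as the contamination control.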
For functional studies of all OsWAKs, further experimental confirmation of gene expression patterns is needed. To perform this type of analysis, rice microarrays containing all 125 OsWAKs would be ideal; however, this type of custom microarray is not yet available. We therefore took advantage of available rice tiling-path microarrays representing the entire rice chromosome 10 that have been successfully used to detect gene transcription activity (Li et al., 2005). Two independent sets of 36-mer probes, with 10-nucleotide intervals, tiled throughout chromosome 10 of japonica and indica, were designed for the microarrays; rice subsp. japonica cv Nipponbare and rice subsp. indica cv 93-11, respectively, were used for the experiments. The mRNA samples used for the microarray hybridization were from four normally growing tissues: 7-d-old seedling shoot and root, panicle (heading and filling stages), and suspension culture cells. Determination of the signal-to-noise ratio and expression level of a given gene model was previously described (Li et al., 2005). Based on the results of that analysis, we examined transcription activity for the 20 OsWAKs on chromosome 10. Fifteen out of the 20 OsWAKs were shown to be expressed in japonica tissues, and 10 in both japonica and indica (Table III). Surprisingly, of the three OsWAKs (OsWAK98, OsWAK101, OsWAK112) already having FL-cDNA sequences available, only OsWAK101 was expressed based on the microarray experiment, suggesting the other two might be expressed under specific stress or developmental conditions.
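As a rough illustration of the probe layout, the sketch below generates probe start coordinates along a region. It assumes that "10-nucleotide intervals" means a 10-nucleotide gap between the end of one 36-mer and the start of the next; if the interval is instead the distance between consecutive probe starts, set `gap` accordingly. Coordinates, function name, and parameters are hypothetical and are not taken from Li et al. (2005).

```python
from typing import List


def probe_starts(region_start: int, region_end: int,
                 probe_len: int = 36, gap: int = 10) -> List[int]:
    """0-based start positions of probes lying fully inside [region_start, region_end).

    Assumption: consecutive probes are separated by `gap` untiled nucleotides,
    so starts advance by probe_len + gap.
    """
    step = probe_len + gap
    last_start = region_end - probe_len  # last position where a full probe still fits
    return list(range(region_start, last_start + 1, step))


# A toy 100-nt region holds two full 36-mers under the gap interpretation.
toy_starts = probe_starts(0, 100)
```

Under this reading a gene model is interrogated by roughly one probe per 46 nucleotides, so even short gene models are covered by several probes.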
Phylogenetic Analyses of Rice OsWAKs, Arabidopsis WAK/WAKLs, and Barley HvWAKs

Intracellular protein kinase domains are typically the conserved regions of RLKs and are used in phylogenetic analyses (Shiu and Bleecker, 2001). Ninety-five OsWAKs (67 OsWAK-RLKs plus 28 OsWAK-RLCKs) containing one or two kinase domains were initially analyzed phylogenetically together with the 21 Arabidopsis WAKs/WAKLs (Fig. 8).
This analysis revealed that most OsWAKs and Arabidopsis WAK/WAKLs cluster in species-specific distinct clades, except for four OsWAKs (OsWAK1, OsWAK2, OsWAK10, OsWAK25) and four WAKL-IV members (WAK14, WAK15, WAK20, WAK21) that cluster in the same clade. This result indicates that most OsWAKs and Arabidopsis WAK/WAKLs expanded in a species-specific manner; only a few members likely originated from the common ancestral genes that existed before divergence of monocots and dicots. To further investigate whether expansion of OsWAKs is rice specific or due to lineage-specific expansion of this family in monocots, we identified 10 barley WAKs (HvWAKs) from the barley EST database (HarvEST; http://harvest.ucr.edu). Because only partial predicted protein sequences (of either extracellular regions or kinase domains) of HvWAKs were obtained (Supplemental Data 3), we used predicted, FL protein sequences of 43 OsWAK-RLKs containing both extracellular regions and kinase domains in the phylogenetic analyses. Ten HvWAKs cluster with individual OsWAKs, rather than diverging in a group unique to barley (Fig. 9).
Localized Duplications Resulted in Expansion of OsWAKs

Examination of individual OsWAKs in different subclades of the rice-specific clades (Fig. 8)
Ratios of Nonsynonymous versus Synonymous Substitution Rates between OsWAK EGF-Ca2+ Domain Regions

The EGF-Ca2+ domains, the typical extracellular domains in Arabidopsis WAK/WAKLs (Verica and He, 2002), are more conserved than the other extracellular regions and were shown to be under purifying selection (Verica et al., 2003). The EGF-Ca2+ domain regions of OsWAKs, which were identified by searching for EGF-Ca2+ domains ([DEQN]-x-[DEQN](2)-C-x(3,14)-C-x(3,7)-C-x-[DN]-x(4)-[FY]-x-C) using the SMART database (http://smart.embl-heidelberg.de), are also more conserved than other extracellular regions (data not shown). To determine whether OsWAK EGF-Ca2+ domains are also under purifying selection, ratios of nonsynonymous versus synonymous substitution rates among EGF-Ca2+ domain regions from 15 randomly selected OsWAKs were analyzed using an improved analytical method (Yang and Nielsen, 2000). The results show that most ratios were significantly less than 1 (P < 0.05; Supplemental Table IV), indicating OsWAK EGF-Ca2+ domain regions are also under purifying selection.

Physical Locations of OsWAKs on Chromosomes

Since all rice clones used for genomic sequencing in the International Rice Genome Sequencing Project can be physically anchored on the 12 rice chromosomes, as are the pseudomolecules assembled in TIGR (http://www.tigr.org/tdb/e2k1/osa1/pseudomolecules/info.shtml), the 125 OsWAKs encoded on individual clones can be correspondingly mapped on japonica chromosomes (Fig. 11).
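The EGF-Ca2+ signature quoted in the substitution-rate section above is written in PROSITE pattern syntax. As an aside, such a pattern can be translated into a regular expression for quick local scans; the sketch below does this for the subset of the syntax that the quoted pattern uses. This is not how the study located the domains (the text reports a SMART search), the helper name is hypothetical, and the test peptide is synthetic rather than an OsWAK sequence.

```python
import re

# Pattern exactly as quoted in the text.
EGF_CA_PATTERN = "[DEQN]-x-[DEQN](2)-C-x(3,14)-C-x(3,7)-C-x-[DN]-x(4)-[FY]-x-C"


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into a Python regular expression.

    Handles 'x' (any residue), [..] (allowed residues), {..} (forbidden
    residues) and (n) or (n,m) repeat counts; anchors are not handled.
    """
    parts = []
    for element in pattern.split("-"):
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        residue, repeat = match.group(1), match.group(2)
        if residue == "x":
            residue = "."
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"
        parts.append(residue + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)


EGF_CA_REGEX = re.compile(prosite_to_regex(EGF_CA_PATTERN))

# Synthetic peptide built to satisfy the pattern, flanked by filler residues.
synthetic = "MK" + "DADDCAAACAAACADAAAAFAC" + "LL"
hit = EGF_CA_REGEX.search(synthetic)
```

A regular-expression hit only shows that the signature residues are present; profile-based tools such as SMART also weigh the surrounding sequence, so the two approaches need not agree on every protein.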
DISCUSSION

Identification of OsWAK Gene Family Members

Through a reiterative database search and manual reannotation, we identified 125 OsWAK gene family members from rice japonica cv Nipponbare. Thirty-seven (approximately 30%) of the identified OsWAKs were corrected/reannotated from their earlier automated annotations, such as five merged genes from 10 split genes, five split genes from four fused genes, seven corrected as short genes, and six reannotated from putative pseudogenes. A few of the reannotations were based on sequence similarities to the other OsWAK family members; a few other reannotations/corrections were found by sequence extension of the previously annotated genes to recover the missing exons from the 5′ or 3′ ends. Most reannotations, however, were based on the corresponding FL-cDNA sequences available in rice (Kikuchi et al., 2003). This also occurred in the analysis of the Arabidopsis genome, where approximately 35% of automated annotations were reannotated using FL-cDNA sequences (Haas et al., 2003). Misannotations from the current automated annotation programs could occur for several reasons, including insufficient numbers of experimentally confirmed genes being available to train the gene-prediction programs, unexpectedly large introns, presence of TE insertions that interrupt normal gene sequences and transcription, and possible sequencing errors (Haas et al., 2003; Wang et al., 2003; Castelli et al., 2004). In addition, a few of the annotation errors in rice were due to incomplete sequencing information; of the 63 BACs that encode the 125 OsWAKs, six BACs were not yet completed at the time the annotation was generated (see Supplemental Table II, column D). None of the five OsWAK pseudogenes came from incomplete BACs; however, the merged OsWAK57/58 was from an incomplete BAC (OJ1480_H01). 
Also, two OsWAKs (OsWAK10, OsWAK50) were truncated because they were located at the end of the BAC sequences; the full length of these two OsWAKs was recovered from their FL-cDNA sequences. Therefore, further experimental validations of the reannotated OsWAKs are warranted. Classification of the OsWAK Gene Family Of the 125 OsWAK gene family members, 67 are OsWAK-RLKs containing both extracellular EGF-like domains and an intracellular kinase domain; 28 are OsWAK-RLCKs containing only the kinase domain; 13 are OsWAK-RLPs with only the extracellular EGF-like domains; 12 are OsWAK short genes; and five are pseudogenes. Functions of these various OsWAKs are yet to be determined. The unique character of OsWAK-RLPs and OsWAK-RLKs is the EGF-like domain at the N-terminal extracellular region. The function of EGF-like domains in OsWAKs and Arabidopsis WAK/WAKLs is yet to be determined. Analysis of other EGF-containing proteins from plants and animals suggests several possible roles for the EGF-like domains in the OsWAK-RLKs and OsWAK-RLPs. In the plant vacuolar sorting protein, BP-80, EGF domains were shown to alter the structural conformation of the ligand-binding domains, thereby increasing their affinity for ligand binding (Cao et al., 2000). EGF-like domains in certain proteins from animals directly participate in protein-protein interactions (Appella et al., 1987; Rebay et al., 1991; Kuroda and Tanizawa, 1999; Stenberg et al., 1999). For example, in the fruit fly (Drosophila melanogaster), the transmembrane protein, Notch, which contains 36 tandem EGF-like repeats, interacts with other EGF-containing transmembrane proteins, including Delta and Serrate. Two of the Notch EGF repeats were shown to be both necessary and sufficient for mediating these interactions (Rebay et al., 1991). 
The formation of both homodimeric and heterodimeric receptor complexes has been proposed for a large number of receptor kinases (Heldin, 1995), including members of the EGF family of receptor kinases in animals (Yarden and Schlessinger, 1987; Sliwkowski et al., 1994) and the CLV1 receptor complex in plants (Clark et al., 1997; Trotochaud et al., 1999). It remains to be seen whether the EGF-like domains in OsWAK-RLKs and OsWAK-RLPs are involved in protein-protein interactions. In plants, several different RLK members were characterized and found to function in a diverse array of signaling processes, including phytohormone responses (Chang et al., 1993 [ETR1]; Li and Chory, 1997 [BRI1]; Clark et al., 1998 [CTR1]), reproduction (Stein et al., 1991 [SRK]; Mu et al., 1994 [PRK1]), developmental regulation (Becraft et al., 1996 [CRINKLY4]; Clark et al., 1997 [CLV1]; Yokoyama et al., 1998 [ERECTA]; Jinn et al., 2000 [HAESA]), and plant disease resistance responses (Song et al., 1995 [Xa21]). These RLKs are divided into different subclasses based on sequence relationships between their predicted extracellular regions (Shiu and Bleecker, 2001). Several RLP members in plants were also shown to have important functions. For example, CLV2 functions in shoot meristem development in Arabidopsis (Kayes and Clark, 1998; Jeong et al., 1999); Fasciated Ear 2, an ortholog of CLV2, in the regulation of ear inflorescence meristem proliferation in maize (Taguchi-Shiobara et al., 2001); Too Many Mouths (TMM) in stomatal patterning in Arabidopsis (Nadeau and Sack, 2002); Cf-9 in disease resistance in tomato (Jones et al., 1994); and Xa21D in pathogen resistance in rice (Wang et al., 1998). Arabidopsis WAKL7 is also an RLP gene that is wound inducible (Verica et al., 2003). 
A few plant RLCKs were found to be functional in disease resistance signaling processes, including Pto in pathogen resistance (Martin et al., 1993) and EDR1 in negative regulation of defense responses (Frye et al., 2001). Especially interesting are recent studies showing that plant RLK, RLP, and RLCK members could function together in the same signaling pathways. For example, in the self-incompatibility signaling pathway in Brassica, the S-receptor kinase (an RLK member), the S-locus glycoprotein (an RLP member resembling the extracellular part of the S-receptor kinase), and the M-locus protein kinase (an RLCK member) function as a single complex in signaling (Cui et al., 2000; Takasaki et al., 2000; Murase et al., 2004). Another example is the complex interaction of TMM (Leu-rich repeat [LRR]-RLP) and three ERECTA family members (LRR-RLKs) in stomatal patterning and differentiation (Shpak et al., 2005); TMM negatively regulates specific ERECTA family members at critical steps in stomatal differentiation. The mechanism for regulation of the ERECTA family members by TMM is still unknown. It was suggested that TMM forms a receptor heterodimer with ERECTA family RLKs, preventing signaling; the lack of a signal transducer domain in TMM supports the idea of this inhibitory function. Alternatively, TMM may have the same ligands as the ERECTA family, hence repressing the ERECTA family signaling pathway. These examples suggest that different OsWAKs could also function by heterodimerization in multimeric assemblies or alone in transduction cascades. Twelve OsWAK family members are short genes with less than 300 amino acids and no known domain. This type of short gene was also identified in the Arabidopsis FL-cDNA sequence analyses (Haas et al., 2003). In some cases, the short genes appear to be functional in Arabidopsis, e.g. the CLE gene family members (Hobe et al., 2003; Sharma et al., 2003). 
A possible clue to the genesis of OsWAK short genes comes from the observation that recognizable TEs are found within or near the 5′ or 3′ end of five of the short genes (OsWAK18, OsWAK23, OsWAK35, OsWAK52, OsWAK101). They either provide a novel splicing site or initiate antisense transcription, which resulted in the short ORFs. Further studies of the function, if any, of the OsWAK short genes, especially of the seven OsWAK short genes with corresponding FL-cDNA sequences, should provide interesting insights into the role of OsWAK short genes in rice.

Structure and Expression of OsWAKs

The two-intron gene structure pattern of Arabidopsis WAK/WAKLs (Verica and He, 2002) was generally conserved in OsWAKs. However, in OsWAKs with an extra/missing EGF-like domain, an extra/missing intron was observed in their extracellular regions. Extra introns were also observed in the kinase domains of a few OsWAK-RLCKs. Intron variation in OsWAKs demonstrates that OsWAK gene structure is not as conserved as it is in the Arabidopsis WAK/WAKLs and likely reflects the expansion of the family in rice. Antisense expression, bidirectional transcription, and alternative splicing were observed in a few of the 38 OsWAKs with FL-cDNA sequences. This type of antisense and bidirectional transcription was also observed from FL-cDNA sequence analysis of many other rice genes (Osato et al., 2003). However, biological implications of the different directions of transcription and the nature of the resulting products remain to be determined. Expression analyses based on the tiling-microarray experiment of the 20 OsWAKs on chromosome 10 and RT-PCR for five additional OsWAKs suggest that the majority of OsWAKs are expressed in rice.

Evolutionary Expansion of OsWAKs

The japonica rice genome has nearly five times as many OsWAKs (125) as Arabidopsis has WAKs/WAKLs (26; Verica and He, 2002). Previous comparative analyses between the two plant species of other gene families, e.g. 
the CONSTANS-like gene family (Griffiths et al., 2003), the Dof family (Lijavetzky et al., 2003), and the LRR extensin family (Baumberger et al., 2003), did not reveal the magnitude of difference in family sizes that is seen with the OsWAKs. Therefore, the evolutionary expansion of the OsWAK gene family does not appear to be due simply to the larger genome size of rice. Phylogenetic analyses of OsWAKs and Arabidopsis WAK/WAKLs show that most OsWAKs and Arabidopsis WAK/WAKLs are clustered in distinct species-specific clades, suggesting species-specific expansion in both plants. Further phylogenetic analyses, comparing OsWAKs with barley HvWAKs, indicate that OsWAK expansion was mainly due to its lineage-specific expansion in monocot species. This type of divergence between monocot (rice) and dicot (Arabidopsis) species was also observed for a large gene family involved in pathogen resistance, the nucleotide-binding site (NBS)-LRR gene family (Bai et al., 2002). In rice, greater than 600 members of the OsNBS-LRR gene family were identified, 3 to 4 times greater than the number in Arabidopsis. Also, most OsNBS-LRRs do not encode Toll and mammalian interleukin-1 receptor domains; however, most Arabidopsis NBS-LRRs do contain an interleukin-1 receptor domain. This lineage-specific divergence of domains and expansion of certain gene families involved in pathogen resistance in both monocot and dicot species likely occurs to enable reaction of plants to pathogens unique to each species. It has been suggested that the expansion of Arabidopsis WAK/WAKLs is due to both local tandem duplications and large-scale genomic duplications (Verica and He, 2002). However, it appears from our analyses in rice that expansion of OsWAKs is probably due in large part to localized gene duplications, since many small OsWAK groups, located in close proximity on the same chromosome, have high sequence similarity. 
Localized duplication was also suggested as the main reason for WAK gene family expansion in indica rice (Shiu et al., 2004). A recent study that looked at the history of rice genome duplications showed that massive individual gene duplications are ongoing in the rice genome, providing a robust source of raw material for the genesis of new genes and gene functions (Yu et al., 2005). The results of our study provide the necessary genomic information for further in-depth study of the functions of OsWAKs and of their evolutionary expansion in the rice genome. MATERIALS AND METHODS Sequence Retrieval and Analysis The DNA and predicted protein sequences, annotated as putative OsWAKs from the genome sequence of rice (Oryza sativa) japonica cv Nipponbare, were retrieved from three public databases: (1) TIGR Rice Genome Annotation Database (Osa1; http://www.tigr.org/tdb/e2k1/osa1; Yuan et al., 2005); (2) Rice Protein Database in Gramene (http://www.gramene.org; Ware et al., 2002); and (3) GenBank in the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov). The same sequences from the same or overlapping contigs, but annotated by different putative OsWAK names in the various databases, were checked using the multiple sequence alignment tool of ClustalW from the European Bioinformatics Institute (http://www.ebi.ac.uk/clustalw/index.html). Genes from rice varieties other than japonica cv Nipponbare were manually removed. BLAST search tools, BLASTp and tBLASTn (Altschul et al., 1997), were used to identify additional OsWAK sequences from Osa1 and GenBank, using 10 putative OsWAK protein sequences as queries. Domain Searches of OsWAK Protein Sequences The SMART database (http://smart.embl-heidelberg.de) was used to search for EGF-like domains, protein kinase domains, and transmembrane domains. BioEdit software (http://www.mbio.ncsu.edu/BioEdit/bioedit.html) was utilized as a tool for diagramming domain positions for OsWAKs. 
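The removal of entries that carry different names in different databases but encode the same sequence can be sketched as a grouping step. The example below uses exact string identity of the protein sequence as a stand-in; the study checked candidates with ClustalW alignments and also considered whether entries came from the same or overlapping contigs, which this sketch ignores. The record layout, function names, and toy entries are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, str]  # (database, annotated name, protein sequence)


def normalize(sequence: str) -> str:
    """Strip whitespace and a trailing stop symbol, and upper-case the sequence."""
    return "".join(sequence.split()).upper().rstrip("*")


def group_identical(records: Iterable[Record]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each distinct protein sequence to the (database, name) pairs that carry it."""
    groups: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for database, name, sequence in records:
        groups[normalize(sequence)].append((database, name))
    return dict(groups)


def redundant_entries(groups: Dict[str, List[Tuple[str, str]]]) -> List[Tuple[str, str]]:
    """All entries beyond the first one seen for each distinct sequence."""
    return [entry for entries in groups.values() for entry in entries[1:]]


# Toy records with made-up names and sequences.
toy_records = [
    ("Osa1", "model_A", "MKLV CDE*"),
    ("GenBank", "model_B", "mklvcde"),
    ("Gramene", "model_C", "MKLVCDF"),
]
toy_groups = group_identical(toy_records)
toy_redundant = redundant_entries(toy_groups)
```

Exact matching will miss entries that differ by a few residues because of differing gene predictions, which is one reason an alignment-based check is the more cautious choice.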
RT-PCR Analysis of OsWAKs

Total DNA and RNA, used in PCR and RT-PCR respectively, were isolated from 2-week-old seedlings of rice japonica cv Nipponbare. DNA extraction was performed on young leaf tissues (Cone, 1989); shoot and root tissues were used for total RNA isolation with TRIzol reagent (Gibco-BRL, Life Technologies), according to the manufacturer's instructions. RNA purification, PCR, and RT-PCR were conducted as described (Meng et al., 2003). PCR or RT-PCR products were purified using QIAquick PCR purification kit (Qiagen) and sequenced (Elim Biopharmaceuticals). Gene-specific RT-PCR primers were designed for each OsWAK as follows: OsWAK7 (F-AGTGTTGCACTAGTCATGCTGCA, R-CTATGGCATGCATATGAAGTCATG); OsWAK17 (F-GTTGATTGGTTCCTCTTGATGCAG, R-GAAGAGTGGAGAGTGGAGGATGA); OsWAK50 (F-CAACTCAAGCTTAACGTCAACTC, R-GAGCTCACTGGTGGTGAATATC); OsWAK62 (F-ACTCATGGACATTATAGGTCATC, R-GAATGTTGCACCATCTCCTCC); and OsWAK125 (F-AACCTCACCTGCAGCAGCAAC, R-TCATGGCATCCACCAGCAACG).

Multiple Sequence Alignment and Phylogenetic Tree Analysis

Predicted protein sequences of OsWAKs were retrieved from public databases as described in “Results.” Arabidopsis (Arabidopsis thaliana) WAK/WAKL protein sequences were obtained from GenBank as described in Verica and He (2002). Barley (Hordeum vulgare) HvWAKs were retrieved from the barley EST database, HarvEST (http://harvest.ucr.edu), and their predicted protein sequences were translated by frame matching to known WAK protein sequences. Multiple sequence alignment analysis was performed using ClustalW, with default parameters set as in the European Bioinformatics Institute (http://www.ebi.ac.uk/clustalw/index.html) and BioEdit. The phylogenetic tree was produced using TreeView software (http://taxonomy.zoology.gla.ac.uk/rod/treeview.html).
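As an aside on the gene-specific RT-PCR primers listed above, the short Python sketch below shows one way a reader might tabulate their basic properties (length, GC content, and the reverse complement of each reverse primer, which is the sequence expected on the sense strand of the target). The primer sequences are copied from the text; the helper names are our own, and nothing here reflects how the primers were actually designed.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_percent(seq):
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# (forward, reverse) primer pairs as given in the text, written 5' to 3'.
PRIMERS = {
    "OsWAK7": ("AGTGTTGCACTAGTCATGCTGCA", "CTATGGCATGCATATGAAGTCATG"),
    "OsWAK17": ("GTTGATTGGTTCCTCTTGATGCAG", "GAAGAGTGGAGAGTGGAGGATGA"),
    "OsWAK50": ("CAACTCAAGCTTAACGTCAACTC", "GAGCTCACTGGTGGTGAATATC"),
    "OsWAK62": ("ACTCATGGACATTATAGGTCATC", "GAATGTTGCACCATCTCCTCC"),
    "OsWAK125": ("AACCTCACCTGCAGCAGCAAC", "TCATGGCATCCACCAGCAACG"),
}

for gene, (fwd, rev) in PRIMERS.items():
    print(
        f"{gene}: F {len(fwd)} nt, GC {gc_percent(fwd):.1f}%; "
        f"R {len(rev)} nt, GC {gc_percent(rev):.1f}%; "
        f"R site on sense strand: {reverse_complement(rev)}"
    )
```

Searching a target cDNA for the forward primer and for the reverse complement of the reverse primer would give the expected amplicon boundaries, assuming perfect matches.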
Physical Mapping of OsWAKs on Rice Chromosomes

All sequenced contigs from japonica cv Nipponbare were physically constructed as pseudomolecules (release 2) in TIGR (http://www.tigr.org/tdb/e2k1/osa1/pseudomolecules/info.shtml), representing the 12 rice chromosomes. OsWAKs were then mapped on individual chromosomes, based on corresponding contigs on rice chromosomes.
Acknowledgments

The authors are grateful to Barbara Alonso for providing expert assistance with the figures, and to the anonymous reviewers for their helpful comments.

Notes

1 This work was supported by the National Science Foundation (NSF) Plant Genome Research Program (grant no. 0110512) and NSF Research Experience for Undergraduate (REU; to P.G.L.), and by the National Institutes of Health (NIH grant no. S06 GM52588 to Z.-H.H.).

The authors responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) are: Peggy G. Lemaux (lemauxpg/at/nature.berkeley.edu) and Shibo Zhang (shibo/at/nature.berkeley.edu).

[w] The online version of this article contains Web-only data.