BMC Evolutionary Biology
BMC Evol Biol. 2005; 5: 72.
Published online Dec 20, 2005. doi: 10.1186/1471-2148-5-72
PMCID: PMC1368998

Genome-wide comparative analysis of the IQD gene families in Arabidopsis thaliana and Oryza sativa

Abstract

Background

Calcium signaling plays a prominent role in plants for coordinating a wide range of developmental processes and responses to environmental cues. Stimulus-specific generation of intracellular calcium transients, decoding of calcium signatures, and transformation of the signal into cellular responses are integral modules of the transduction process. Several hundred proteins with functions in calcium signaling circuits have been identified, and the number of downstream targets of calcium sensors is expected to increase. We previously identified a novel, calmodulin-binding nuclear protein, IQD1, which stimulates glucosinolate accumulation and plant defense in Arabidopsis thaliana. Here, we present a comparative genome-wide analysis of a new class of putative calmodulin target proteins in Arabidopsis and rice.

Results

We identified and analyzed 33 and 29 IQD1-like genes in Arabidopsis thaliana and Oryza sativa, respectively. The encoded IQD proteins contain a plant-specific domain of 67 conserved amino acid residues, referred to as the IQ67 domain, which is characterized by a unique and repetitive arrangement of three different calmodulin recruitment motifs, known as the IQ, 1-5-10, and 1-8-14 motifs. We demonstrated calmodulin binding for IQD20, the smallest IQD protein in Arabidopsis, which consists of a C-terminal IQ67 domain and a short N-terminal extension. A striking feature of IQD proteins is the high isoelectric point (~10.3) and frequency of serine residues (~11%). We compared the Arabidopsis and rice IQD gene families in terms of gene structure, chromosome location, predicted protein properties and motifs, phylogenetic relationships, and evolutionary history. The existence of an IQD-like gene in bryophytes suggests that IQD proteins are an ancient family of calmodulin-binding proteins and arose during the early evolution of land plants.

Conclusion

Comparative phylogenetic analyses indicate that the major IQD gene lineages originated before the monocot-eudicot divergence. The extant IQD loci in Arabidopsis primarily resulted from segmental duplication and reflect preferential retention of paralogous genes, which is characteristic for proteins with regulatory functions. Interaction of IQD1 and IQD20 with calmodulin and the presence of predicted calmodulin binding sites in all IQD family members suggest that IQD proteins are a new class of calmodulin targets. The basic isoelectric point of IQD proteins and their frequently predicted nuclear localization suggest that IQD proteins link calcium signaling pathways to the regulation of gene expression. Our comparative genomics analysis of IQD genes and encoded proteins in two model plant species provides the first step towards the functional dissection of this emerging family of putative calmodulin targets.

Background

The low solubility product constants of calcium phosphate salts provide a chemical rationale for the evolution of Ca2+ as a universal second messenger. The necessity to decrease cytosolic Ca2+ concentrations to submicromolar levels by exporting the cation into extracellular spaces or intracellular compartments that do not generate ATP, such as the endoplasmic reticulum or vacuole, creates a steep concentration gradient that allows for the controlled and gated generation of rapid Ca2+ transients in response to extracellular stimuli. Such intracellular Ca2+ signals are not only characterized by their magnitudes but also by their spatial and temporal resolution. The sum of these parameters is often referred to as the 'Ca2+ signature' of a primary stimulus [1-4]. Numerous environmental cues of biotic and abiotic nature and endogenous physiological and developmental conditions trigger specific Ca2+ signatures [2,5-8]. Stimulus-specific Ca2+ oscillations are generated by voltage- and ligand-gated Ca2+-permeable channels (influx), and by Ca2+-ATPases and antiporters (efflux) to regain resting Ca2+ levels [3,7]. Approximately 80 genes coding for potential Ca2+ channels, pumps and antiporters have been identified in the Arabidopsis genome, suggesting complex generation and regulation of stimulus-specific Ca2+ signatures [8].

Calcium spikes are recognized by several Ca2+-binding proteins and are decoded via Ca2+-dependent conformational changes in these sensor polypeptides and interacting target proteins [6,9-11]. Several classes of Ca2+ sensors have been identified in plants that contain a Ca2+-binding helix-loop-helix fold known as the EF-hand motif. Calmodulin is the archetypal Ca2+ sensor, which is exceptionally conserved in eukaryotes and contains four EF-hand motifs. About 250 EF-hand motif-containing proteins have been identified in Arabidopsis [12], including six typical calmodulins and 50 calmodulin-like proteins that differ significantly in sequence and number of EF-hand motifs [13,14]. Members of a second, plant-specific family of Ca2+ sensors, which usually contain three EF-hand motifs, have similarity to the regulatory B-subunit of calcineurin in animals and are referred to as calcineurin B-like (CBL) proteins [9,15-17]. While calmodulins and CBL sensor proteins have no catalytic activity on their own and therefore are sometimes referred to as 'Ca2+ sensor relays', a third major class of Ca2+ sensors are bifunctional proteins, known as Ca2+-dependent protein kinases (CDPK), which contain a calmodulin-like domain with four EF-hand motifs and a Ca2+-dependent, Ser/Thr protein kinase domain on a single polypeptide chain [18,19]. Because of their dual functions as Ca2+-binding proteins and catalytic effectors the CDPK proteins are considered 'Ca2+ sensor responders'. In Arabidopsis, CDPK and CBL proteins are encoded by multigene families of 34 and 10 members, respectively [16,19]. CDPKs play essential roles in hormone and stress signaling pathways as well as in plant responses to pathogens [20,21].

To transmit the information of the second messenger, Ca2+ sensor relays such as calmodulins and CBL proteins interact with target proteins and regulate their biochemical activities. During the final phase of the transduction process, the target proteins modulate diverse cellular activities to establish the specific response to a given extracellular signal. The CBL sensor proteins interact specifically in a Ca2+-dependent fashion with a single family of SNF1-like Ser/Thr protein kinases, known as CBL-interacting protein kinases or CIPKs, which are encoded by 25 genes in Arabidopsis [16,22-24]. Current data indicate that CBL-CIPK interaction networks provide a signaling module for integrating plant responses to an array of environmental stimuli [17,23,25,26]. In contrast to CBL sensor proteins, which regulate a select set of target protein kinases, calmodulins interact with an astonishingly large number of target proteins. These have been extensively reviewed and include among other functional categories, proteins implicated in generating Ca2+ signatures, enzymes in signaling and metabolic pathways, and transcriptional regulators [6,8,11,27-29]. The calmodulin-interacting domains of target proteins are not necessarily related in structure and exhibit high sequence variability, which may reflect the versatility of the calmodulin sensor relay. Nonetheless, calmodulin-interacting domains usually consist of a short (16–35 residues) basic amphiphilic helix, which is recognized by a flexible hydrophobic pocket that forms upon Ca2+ binding to calmodulin [9,10,30,31]. 
Three calmodulin recruitment motifs are currently known, although not all functionally characterized calmodulin-binding domains contain these specific motifs: the IQ motif (IQxxxRGxxxR; Pfam 00612) is thought to mediate calmodulin retention in a Ca2+-independent manner, whereas Ca2+-dependent interaction can be achieved by two related motifs, termed 1-5-10 and 1-8-14, which are distinguished by their spacing of bulky hydrophobic and basic amino acid residues [31-34]. Various biochemical approaches have identified about 200 calmodulin target proteins in Arabidopsis, a number that is expected to rise [8,11].

In a genetic screen for regulatory factors of the glucosinolate homeostasis in Arabidopsis thaliana [35], we have recently identified a gene coding for a calmodulin-binding protein with similarity to SF16 from sunflower [36]. We termed this protein IQD1 for the presence of a plant-specific domain of 67 conserved amino acids (referred to as IQ67 domain), which is characterized by a unique and repetitive arrangement of IQ, 1-5-10 and 1-8-14 calmodulin recruitment motifs. We demonstrated by biochemical and genetic studies that IQD1 is a nuclear calmodulin-binding protein that stimulates glucosinolate accumulation and plant defense [37]. In this study, we present a comparative genome-wide analysis of the entire IQD gene families in Arabidopsis thaliana (33 loci) and Oryza sativa (29 loci), which are predicted to encode proteins sharing the IQ67 domain. Our genomics analysis provides the framework for future studies to dissect the function of this emerging family of novel calmodulin target proteins.

Results

Identification and structure of IQD genes in Arabidopsis thaliana

In a previous study, we characterized IQD1 as a calcium-dependent calmodulin-binding protein and identified six closely related genes in Arabidopsis [37]. The encoded proteins share a conserved central region of 67 amino acid residues, referred to as the IQ67 domain, which is characterized by the occurrence of multiple calmodulin-binding motifs [32,33] that are arranged in a unique repetitive pattern. The IQ67 domain contains 1–3 copies each of the IQ motif (IQxxxRGxxxR or its more relaxed version, [ILV]QxxxRxxxx[RK]), the 1-5-10 motif ([FILVW]x3[FILV]x4[FILVW]), and the 1-8-14 motif ([FILVW]x6[FAILVW]x5[FILVW]). In addition, several conserved basic and hydrophobic amino acid residues flank these motifs, and the IQ67 domain is predicted to fold into a basic amphiphilic helix ([37]; see Figure 2).
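The motif definitions above translate directly into regular expressions. The following Python sketch is purely illustrative and is not the search procedure used in this study; the names MOTIFS and find_motifs and the toy sequence are assumptions introduced here. It scans a protein sequence for the four patterns and reports overlapping hits with 1-based start positions.

```python
import re

# Patterns transcribed from the motif definitions in the text ("x" = any residue).
MOTIFS = {
    "IQ": r"IQ.{3}RG.{3}R",                     # IQxxxRGxxxR
    "IQ_relaxed": r"[ILV]Q.{3}R.{4}[RK]",       # [ILV]QxxxRxxxx[RK]
    "1-5-10": r"[FILVW].{3}[FILV].{4}[FILVW]",  # [FILVW]x3[FILV]x4[FILVW]
    "1-8-14": r"[FILVW].{6}[FAILVW].{5}[FILVW]",  # [FILVW]x6[FAILVW]x5[FILVW]
}


def find_motifs(seq):
    """Return {motif name: [(1-based start, matched text), ...]} for one sequence."""
    seq = seq.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead makes finditer report overlapping matches as well.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in regex.finditer(seq)]
    return hits


# Toy sequence (hypothetical, not a real IQD protein).
toy_hits = find_motifs("AAIQAAARGAAARAA")
for motif_name, found in toy_hits.items():
    print(motif_name, found)
```

Because the motifs overlap within the IQ67 domain, a plain non-overlapping search would miss hits; the lookahead form is one simple way around that.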

Figure 2
Amino acid sequence conservation of the IQ67 domain. Aligned are sequences of the IQ67 domain of 72 putative IQD proteins from Arabidopsis thaliana (a), Oryza sativa (b), Pinus spp. and Physcomitrella patens (c). Each protein is identified by its gene ...

To uncover the entire family of genes coding for IQD proteins in the Arabidopsis genome, we searched available Arabidopsis databases with multiple BLAST algorithms using full-length IQD1 (454 amino acids) and its IQ67 domain as the query sequences, followed by additional searches with related sequences (see Methods). In addition, we performed a pattern search with the IQ motif and its degenerate versions as the query sequences and inspected each hit for the presence of an IQ67 domain. We subsequently performed pair-wise sequence comparisons to exclude redundant entries from the initial data set, which frequently arise from multiple identification numbers for the same DNA or protein sequence in the databases. A total of 33 non-redundant putative IQD genes were extracted from these sources (Table 1 and Figure 1). Full-length cDNA or EST sequences were available for 26 of those genes, and we attempted to clone cDNA sequences for the remaining seven genes by reverse transcriptase-mediated PCR. We succeeded in generating full-length cDNAs for three additional genes, At1g17480, At1g18840 and At4g23060, but were unable to amplify cDNAs for At1g51960, At2g02790, At3g22190 and At3g49380. To date, no evidence is available supporting the expression of At1g51960 and At3g49380 (Table 1). A comparison of the 29 genomic loci with their corresponding cDNA sequences revealed that most of the predicted gene models are correct, with only three exceptions (At4g10640, At2g26410, At1g01110). The full-length cDNA of At4g10640 encodes a protein that is 16 amino acid residues longer than the protein predicted by the MIPS MATDB annotation. This discrepancy is caused by the erroneous and superfluous annotation of a fifth intron in the last coding exon. For At2g26410, the translational start site and the 5' border of the first intron were misannotated in the MIPS MATDB entry when compared with its full-length cDNA.
The available cDNA for At1g01110, annotated as a full-length cDNA (Arabidopsis TIGR db Annotation Version 5.0), encodes only three exons but is likely truncated at its 5'-end because (i) At1g01110 and At4g00820 are paralogous genes that evolved by a segmental duplication event (see Figure 1a and Figure 5), and (ii) the At4g00820 gene model of five coding exons is supported by a full-length cDNA sequence. We therefore consider the MIPS MATDB annotation of At1g01110 (five coding exons) to be correct. The gene models of At1g51960, At2g02790, At3g22190 and At3g49380 remain to be verified as no full-length cDNA sequences are available. Structural examination of the 33 putative IQD genes revealed the presence of 2–6 translated exons, suggesting that IQD proteins are quite diverse. Almost two-thirds of the gene family (20 members) contain more than four protein-coding exons, and 12 genes encode one or two non-translated exons in their 5'-region (Figure 1b). All introns of most IQD genes are phase-0 introns, separating exactly two triplet codons [38]. The last intron of At4g23060 is in phase-2, which lies between the second and third nucleotide of joining codons, and a phase-1 intron is found in five other IQD genes (Figure 1b). The average size of IQD genes in Arabidopsis is 2.4 kb (Table 3).
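Intron phase follows from the number of coding nucleotides that precede each intron: a remainder of 0 after division by 3 places the intron between two codons (phase 0), a remainder of 1 places it after the first nucleotide of a codon (phase 1), and a remainder of 2 places it between the second and third nucleotide (phase 2). The Python sketch below illustrates this bookkeeping; the function name and the exon lengths are hypothetical and are not taken from any IQD gene model.

```python
def intron_phases(coding_exon_lengths):
    """Return the phase (0, 1 or 2) of each intron between consecutive coding exons.

    coding_exon_lengths: coding nucleotides per exon, in 5' to 3' order,
    counted from the start codon (untranslated exons are not included).
    """
    phases = []
    coding_so_far = 0
    # There is one intron after every exon except the last.
    for length in coding_exon_lengths[:-1]:
        coding_so_far += length
        phases.append(coding_so_far % 3)
    return phases


# Hypothetical gene with four coding exons (lengths in nucleotides).
example_phases = intron_phases([48, 153, 100, 62])
print(example_phases)  # [0, 0, 1]
```

In this made-up example the first two introns fall between codons, whereas the third interrupts a codon after its first nucleotide.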

Table 1
The IQD gene family of Arabidopsis thaliana
Figure 1
Phylogenetic analysis and exon-intron organization of IQD genes in Arabidopsis thaliana and Oryza sativa. Neighbor-joining trees of full-length amino acid sequences encoded by Arabidopsis (a) and rice (c) IQD genes are shown. The gene coding for the protein ...
Figure 5
Chromosomal distribution and segmental duplication events for Arabidopsis IQD genes. The five chromosomes are indicated by Roman numerals and the centromeric regions by ellipses. Deduced chromosomal positions of the IQD genes are marked by horizontal ...
Table 3
Average parameters of IQD genes and proteins from A. thaliana and O. sativa

Predicted primary structure and properties of Arabidopsis IQD proteins

Having identified non-redundant and verified potential IQD protein coding sequences, we developed a set of criteria for the presence of the IQ67 domain in the 33 predicted Arabidopsis proteins. The IQ67 domain is characterized by the precise spacing of three copies of the 11-amino acid IQ motif, which are separated by short sequences of 11 and 15 amino acid residues (Figure 2a). The first IQ motif is best conserved (present in 32 proteins), followed by the second (26 proteins) and third (12 proteins) IQ repeat. Although the third IQ motif shows the highest degree of sequence degeneration, its initial hydrophobic amino acid and following glutamine residue are present in 31 proteins. Each IQ motif is congruent with a 1-5-10 motif of hydrophobic amino acids, which again is least conserved for the last IQ motif. A fourth 1-5-10 motif overlaps the first spacer sequence and second IQ motif. Each IQ motif also partially overlaps with a 1-8-14 motif. Besides these repetitive motifs, the IQ67 domain is characterized by the presence of additional conserved hydrophobic and basic amino acid residues flanking each IQ motif (Figure 2a). A hallmark of IQD genes is the presence of a phase-0 intron at an invariant position within the coding region of the IQ67 domain that separates codons 16 and 17 (equivalent to codons 9 and 10 of the first IQ motif). At5g03960 is the only exception to this rule; it encodes the entire IQ67 domain on its second and central exon (Figure 1b and Figure 3a). Given these criteria, 32 proteins contain at least two or three discernible IQ motifs with the accompanying 1-5-10 and 1-8-14 motifs in their IQ67 domain, and we therefore consider them bona fide IQD proteins. The protein encoded by At5g35670 does not meet these criteria because it only contains the first, albeit truncated IQ motif provided by the N-terminal exon of the IQ67 domain (exon 2 of At5g35670).
The exon coding for the remainder of the IQ67 domain (residues 17–67) is missing and replaced by an unrelated exon in At5g35670 (Figure 2a and Figure 3a). However, the At5g35670 protein shares five common amino acid sequence motifs outside the IQ67 domain with a large set of IQD proteins as detected by comparative MEME (Multiple Expectation Maximization for Motif Elicitation) analysis [39] of the complete amino acid sequences of the 33 Arabidopsis proteins (Figure 3a). As most of these motifs are unique to IQD proteins, we consider At5g35670 a member of the IQD gene family in Arabidopsis. Since amino acids 17–67 of the IQ67 domain are encoded by the second or third exon of IQD genes, the IQ67 domain contributes to the core region of most IQD proteins. An interesting exception is At3g51380, which is the smallest member of the IQD protein family in Arabidopsis and consists of a C-terminal IQ67 domain and a short N-terminal extension of 35 amino acid residues.
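The spacing rules described for the IQ67 domain can be turned into residue coordinates. The Python sketch below is a worked example under two assumptions that are not stated explicitly in the text: the 11-residue spacer is taken to precede the 15-residue spacer, and the first IQ motif is taken to start at domain position 8, which follows from codon 16 of the domain being codon 9 of the first IQ motif (16 - 9 + 1 = 8). The function name and its parameters are hypothetical.

```python
def iq67_layout(first_iq_start=8, motif_len=11, spacers=(11, 15), domain_len=67):
    """Return (label, start, end) tuples in 1-based, inclusive domain coordinates.

    Assumption: spacers are listed in N- to C-terminal order.
    """
    segments = []
    pos = first_iq_start
    n_motifs = len(spacers) + 1
    for i in range(n_motifs):
        segments.append((f"IQ{i + 1}", pos, pos + motif_len - 1))
        pos += motif_len
        if i < len(spacers):
            segments.append((f"spacer{i + 1}", pos, pos + spacers[i] - 1))
            pos += spacers[i]
    if pos - 1 > domain_len:
        raise ValueError("motifs and spacers do not fit into the domain")
    return segments


layout = iq67_layout()
for label, start, end in layout:
    print(f"{label}: residues {start}-{end}")
```

Under these assumptions the three IQ motifs occupy residues 8–18, 30–40 and 56–66, which fits within the 67-residue domain and places the invariant intron after the ninth residue of the first IQ motif.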

Figure 3
Motif patterns in IQD proteins of Arabidopsis thaliana and Oryza sativa. The schematic IQD proteins of Arabidopsis (a) and rice (b) are aligned relative to the IQ67 domain (orange box). Total amino acid sequence length, boundaries of protein-coding exons ...

Since At3g51380 is predicted to encode a 'minimal' IQD protein (IQD20), we tested whether calmodulin interacts with recombinant IQD20. We employed the same co-sedimentation assay that we recently used to demonstrate Ca2+-dependent binding of IQD1 to bovine calmodulin [37]. As shown in Figure 4, an epitope-tagged T7-IQD20 fusion protein preferentially co-sedimented with calmodulin-agarose beads in the presence of Ca2+, whereas noticeably less T7-IQD20 protein was bound to immobilized calmodulin when the incubation mix and wash buffer were supplemented with EGTA. Thus, our data indicate that the smallest member of the IQD protein family in Arabidopsis interacts with calmodulin in a Ca2+-independent manner but suggest that calmodulin binding is possibly stimulated by the presence of Ca2+ ions. We interrogated the web-based Calmodulin Target Database, which computes various structural and biophysical parameters of a given protein sequence to predict calmodulin binding sites [40]. This analysis predicted that IQD20 and all other IQD proteins of Arabidopsis contain, in addition to multiple IQ motifs, strings of high-scoring amino acid residues that indicate the location of putative calmodulin interaction sites (Table 4). The predicted calmodulin binding sites overlap with the IQ67 domain in 23 of the 33 IQD protein sequences (see Figure 3a).

Figure 4
Interaction of Arabidopsis IQD20 and calmodulin in vitro. Calmodulin-agarose beads were incubated in the presence of Ca2+ or absence of Ca2+ (+EGTA) with soluble proteins prepared from induced bacterial cultures expressing a T7-tagged IQD20 protein and ...
Table 4
Predicted calmodulin-binding sites in Arabidopsis and rice IQD proteins

Although the predicted IQD proteins are quite diverse with respect to size (103–794 residues) and computed molecular mass (11.8–86.8 kD), they appear to be remarkably uniform in terms of their relatively high theoretical isoelectric point (10.3 ± 0.6), the only exception being At1g19870 (pI of 5.2), and with respect to the abundance of Ala (8.6% ± 2.2%), Ser (12.2% ± 2.2%), and basic amino acid residues (Arg/Lys, 17.6% ± 2.2%). To uncover the possible subcellular localization of IQD proteins in Arabidopsis, we searched for different signature motifs specific to cellular compartments. Because of their high content of basic residues, and as suggested by PSORT, about half of the IQD protein family (16 members) may be localized in the cell nucleus (Table 1). This conjecture is supported by the presence of several basic clusters in IQD proteins that conform to the SV40-type, MATα2-type, and bipartite type of nuclear localization signals [41], and by the nuclear localization of an IQD1-GFP fusion protein [37]. The remaining IQD proteins are predicted to be localized in the mitochondria (7), chloroplasts (5), or unknown compartments (Table 1).
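Residue frequencies of the kind quoted above are simple to compute from a protein sequence. The Python sketch below shows one way to do so; it is an illustration only, the function name and the toy sequence are hypothetical, and it does not reproduce the tools used to derive the values reported in this study.

```python
from collections import Counter


def residue_percentages(seq):
    """Return the percentage of Ala, Ser and basic (Arg + Lys) residues in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    n = len(seq)
    return {
        "Ala": 100.0 * counts["A"] / n,
        "Ser": 100.0 * counts["S"] / n,
        "Arg/Lys": 100.0 * (counts["R"] + counts["K"]) / n,
    }


# Toy peptide (hypothetical, not an IQD sequence).
toy_composition = residue_percentages("MSSKRAASKR")
print(toy_composition)
```

Applying such a function to every family member and taking the mean and standard deviation would give summary values of the form shown in Table 3.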

Chromosomal distribution and homology of Arabidopsis IQD genes

To infer clustering patterns that reflect IQD protein sequence similarity and evolutionary ancestry, we constructed phylogenetic trees by the neighbor-joining method [42] using IQD full-length sequences and the amino acid sequence of At5g35670 as outgroup. The At5g35670 gene encodes a C-terminally truncated IQ67 domain that lacks amino acid residues 17–67 (Figure 2a). The phylogenetic analysis of the Arabidopsis IQD gene family reveals four well-resolved subfamilies, two of which can be further divided into subgroups supported by the presence and position of introns, the occurrence of common protein motifs outside the IQ67 domain, and bootstrapping values (Figure 1a and 1b; Figure 3a). Large segmental duplications of chromosomal regions during evolution, followed by gene loss, small-scale duplications and local rearrangements, have created the present complexities of the Arabidopsis genome [43-51]. These events have likely shaped the size and structure of the current IQD gene family. We therefore analyzed the evolutionary history of IQD genes, which are relatively evenly distributed among all five Arabidopsis chromosomes (Figure 5 and Table 1). The topology of the phylogenetic tree (Figure 1a) suggests for several IQD genes in all subfamilies a clear paralogous pattern of gene divergence by gene duplication. Using the Arabidopsis Redundancy Viewer (MATDB), the Viewer of Segmental Genome Duplications (TIGR) and the searchable supplementary material provided by Blanc et al. [45] and Simillion et al. [48], we found that 26 of the 33 IQD genes are located in previously identified chromosomal duplications [45,47,48]. Eight pairs of duplicated IQD genes have been retained during evolution, whereas the IQD sister gene has been lost for each of the other 10 duplication events (Figure 5).
All 18 duplications involving IQD genes occurred during the relatively recent genome-wide duplication event 75 ± 22 Myr ago, as estimated by Simillion et al. [48]. In most cases, the paralogous relationships indicated by segmental duplication are supported by the exon-intron organization and the phylogeny of the IQD gene pairs (Figure 1a and 1b). The following pairs of genes are therefore close paralogous IQD genes in Arabidopsis, sharing 50–67% amino acid sequence identity: At1g01110 and At4g00820; At1g14380 and At2g02790; At1g17480 and At1g72670; At1g18840 and At1g74690; At1g51960 and At3g16490; At2g43680 and At3g59690; At3g09710 and At5g03040; At5g07240 and At5g62070. Two orphan genes contained in opposite parts of a duplicated segment pair on chromosome III and IV, At3g22190 and At4g14750, group in different subfamilies of the phylogenetic tree and share substantially lower primary structure identity (20%) as well as less preservation of exon-intron organization (Figure 1a and 1b), suggesting reciprocal IQD sister gene loss after duplication of a chromosomal segment that contained two ancestral IQD genes. The genes At2g33990 and At3g15050 also appear to be closely related paralogs (Figure 1a, 43% identity); however, they are positioned in different previously identified duplication segments, which points to a more complex evolutionary history. As expected, IQD genes of atypical structure (At5g03960, loss of intron in IQ67 coding region) or encoding atypical proteins (At1g19870, acidic pI; At3g51380, C-terminal IQ67 domain; At5g35670, truncated IQ67 domain) are either singleton genes (At5g35670, At3g51380), or orphan genes (At1g19870, At5g03960) whose homologous sister gene has been lost after duplication.
Two pairs of closely positioned singleton genes, one each on chromosome III and IV, and two clustered genes in a duplicated segment on chromosome III (At3g49260, At3g49380), suggest ancient tandem or local duplication events that have already resulted in substantial gene diversification (<30% identity for each gene pair). In summary, large-scale segmental duplication events appear to have predominantly contributed to the current complexity of the IQD gene family.
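The identity values quoted for the gene pairs depend on how percent identity is defined. The text does not state the exact definition used, so the Python sketch below adopts one common convention as an assumption: identical residues divided by the number of alignment columns in which neither sequence has a gap. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: columns with a gap in either sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 4 identical residues in 5 ungapped columns.
toy_identity = percent_identity("MKR-SA", "MKRTSG")
print(toy_identity)  # 80.0
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give lower values for the same alignment, so identity figures are only comparable when computed the same way.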

Identification and predicted properties of the IQD protein complement in Oryza sativa

We next explored the occurrence and size of the IQD gene family in the extensively sequenced genome of rice [52,53]. BLAST searches in several databases of O. sativa ssp. japonica and indica (see Materials and methods) using several Arabidopsis full-length IQD protein sequences as the queries identified 29 different loci that encode non-redundant putative IQD proteins in rice. The general features of rice IQD genes and proteins are summarized in Table 2 and Table 3. Full-length cDNA sequences are available for 16 genes and generally support the respective gene model, with the exception of two loci (Os01m05259, Os03m04309) that are incorrectly annotated (see Table 2). The putative full-length cDNA sequences of two additional genes (Os01m06663, Os06m03925) are likely truncated in their coding region when compared with the conceptual translation products of each corresponding locus. A gene model that covers the open reading frame of a corresponding partial cDNA sequence could not be derived for the Os01m06368 locus in either O. sativa subspecies. To date, independent evidence for gene expression has been obtained for six of the remaining ten IQD family members for which a full-length cDNA is currently not available, suggesting that most IQD genes are functional in rice (Table 2). As for Arabidopsis, rice IQD genes encode 2–6 translated exons; however, less than half of the rice family members (13 genes) contain more than four exons (Figure 1d). Furthermore, all introns in most OsIQD genes are in phase-0; only six genes contain a phase-1 intron in their 3'-region and one gene (Os04m04570) is characterized by the presence of two phase-2 and one phase-1 intron in its 5'-region (Figure 1d). Rice IQD genes are slightly larger than Arabidopsis IQD genes, which is a result of increased intron length (Figure 1b and 1d; Table 3).

Table 2
The IQD gene family of Oryza sativa

Conceptual translation of full-length cDNA or predicted mRNA sequences and computation of theoretical physico-chemical protein parameters reveal that the IQD protein complement in rice is remarkably similar to the IQD protein family in Arabidopsis (Table 2 and Table 3). Comparative MEME analysis of the complete amino acid sequences of the 28 rice IQD proteins identified a similar set of conserved sequence motifs and their distribution along the polypeptide chain as found for members of the Arabidopsis IQD protein family (Figure 3b and Table 5). The IQ67 domain is positioned close to the core region of IQD polypeptides and is characterized by the same hallmarks as described for the Arabidopsis family, including the location and spacing of the three calmodulin-binding motifs (i.e., IQ, 1-5-10, 1-8-14), and the position of an invariant phase-0 intron that separates codons 16 and 17 of the IQ67 domain (Figure 2b and Figure 3b). As predicted by interrogation of the Calmodulin Target Database [40], all rice IQD proteins contain additional putative calmodulin binding sequences that often overlap with the IQ67 domain (Figure 3b and Table 4). It is interesting to note that the rice IQD gene family contains members with similar deviations from consensus properties as observed for the IQD gene family in Arabidopsis. These exceptions include loss of the phase-0 intron between the IQ67 domain-coding exons (Os01m06663, Os08m00125), replacement of the second exon coding for amino acids 17–67 of the IQ67 domain (Os06m03925), C-terminal location of the IQ67 domain (Os03m00334, Os04m04570), and an unusually large and acidic protein (Os04m05532). Since the rice IQD proteins display a similar range of structural and physico-chemical characteristics as the IQD family in Arabidopsis, it is very likely that we have identified most of the IQD family members in rice.
Again, the majority of the family members (16 proteins) may be targeted to the cell nucleus; the remaining IQD proteins are predicted to be localized in the mitochondria (4), chloroplasts (1), or unknown compartments (Table 2).

Table 5
Major motifs in Arabidopsis and rice IQD proteins

Chromosomal distribution of rice IQD genes

Unlike the Arabidopsis IQD genes, which are evenly distributed over all Arabidopsis chromosomes, the rice IQD genes are clearly biased towards three chromosomes. Almost half of the rice IQD gene family members (14 loci) are contained in chromosomes I and V, and five genes are present on chromosome III. Three IQD genes are each found on chromosomes IV and VI, while seven of the twelve rice chromosomes contain either one or no IQD gene locus (Table 2). Such a heterogeneous distribution of IQD genes over the different rice chromosomes is consistent with an ancient aneuploidy event, which has been proposed to have occurred in rice about 70 Myr ago [51], and not with a whole-genome duplication or polyploidization event. Duplicated segments cover substantial regions of chromosome V (16%) and chromosome I (11%), the second and third largest fraction of segmental duplications after chromosome II (22%) [51]. The topology of the phylogenetic tree of OsIQD genes suggests four pairs of paralogous genes that evolved by segmental duplication (55–69% amino acid sequence identity); interestingly, three such pairs include IQD genes located on chromosome I and V (Figure 1c). As for the IQD protein family in Arabidopsis, the phylogenetic analysis of the rice gene family reveals four major subfamilies, one of which can be divided into two subgroups. The two rice proteins containing the IQ67 domain at their C-terminus cluster as a separate subfamily (Figure 1c and 1d, Figure 3b).

Comparative phylogenetic analyses

We further investigated the relationship between the Arabidopsis and rice IQD protein families by generating an alignment of the 61 identified IQD amino acid sequences followed by the generation of a neighbor-joining phylogenetic tree (Figure 6). The combined phylogeny between the Arabidopsis and rice IQD sequences revealed six subfamilies of putative orthologous genes. Within each subfamily, the rice and Arabidopsis genes appear more closely related to each other than to IQD genes of the same species in a different subfamily, suggesting that an ancestral set of IQD genes already existed before the monocot-eudicot divergence. Four subfamilies of likely orthologous genes (I–IV) are composed of nearly identical sets of genes that constitute the respective subfamilies in Arabidopsis and rice (compare Figure 6 with Figure 1a and 1c). The remaining two subfamilies contain the genes encoding atypical IQD proteins in both species: At3g51380, Os03m00334 and Os04m04570 (IQ67 domain on protein C-terminus) are members of subfamily V, whereas At5g35670 and Os06m03925 (truncated IQ67 domain) comprise subfamily VI (Figure 6). The two genes coding for the acidic and unusually large IQD proteins, At1g19870 and Os04m05532 (Table 1 and Table 2), are members of subfamily IV and form a pair of orthologous genes. These subgroups of orthologous genes and other branches within the subfamilies are well-supported, which may be indicative of a relatively early diversification of IQD gene structure and function during plant evolution. The three genes that experienced loss of the conserved intron separating the IQ67 domain-encoding exons, At5g03960, Os01m06663 and Os08m00125, are members of different subfamilies (Figure 6), which suggests that intron loss occurred after the divergence of both evolutionary lineages.
The phylogeny of Arabidopsis and rice IQD genes supports the occurrence of species-specific IQD gene duplications events. For example the two closely related IQD gene pairs in subfamily I (Os05m00863/Os01m00895 and At3g16490/At1g51960) or subfamily IV (Os05m04307/Os01m05025 and At1g18840/At1g74690) result from duplication events that occurred independently in both species.

Figure 6
Phylogenetic relationships of Arabidopsis thaliana and Oryza sativa IQD proteins. The unrooted tree, constructed using ClustalX (1.81), summarizes the evolutionary relationship among the 61 members of both IQD protein families. The neighbor-joining tree ...

To explore the evolutionary history of the IQD gene family in greater detail, we searched publicly available genomic and EST databases for homologous sequences in other plant species. We identified ESTs corresponding to IQD proteins for all angiosperm species represented in the TIGR Plant Gene Indices as well as for the gymnosperm Pinus ssp. (three putative full-length cDNA and six additional EST sequences). As expected, the putative full-length IQD proteins of pine (TIGR Pinus Gene Index entries TC41979, TC52213, and TC52519) are very similar to the Arabidopsis and rice IQD proteins with respect to calculated molecular masses (38.9–56.8 kD), isoelectric points (pI of 10.1–10.3) and frequencies of Ala, Ser, Arg, and Lys residues. A combined phylogenetic analysis of the Arabidopsis, rice and pine full-length IQD protein sequences reveals that the IQD proteins from Pinus cluster with different subfamilies (see Figure 6), suggesting that IQD proteins predated the evolution of vascular plants. We also performed a BLAST search of the moss database (see Materials and methods) and identified one contig EST sequence from Physcomitrella patens that encodes an IQD-like protein (contig5180). Although the deduced amino acid sequence appears to be truncated at the C-terminus (20 amino acid residues downstream of the IQ67 domain), an appreciable similarity with the protein encoded by At1g01110 is evident (33% identity), which includes the presence of MEME motif 3 at its N-terminus (data not shown). Interestingly, alignment of the deduced IQ67 domain of the moss polypeptide reveals a deletion of six residues that correspond to the N-terminus of the second IQ67 domain-encoding exon of most Arabidopsis and rice IQD proteins (Figure 2c). As the IQ67 intron is in phase-0 (see above) and since A. thaliana and O. sativa both express an IQD-like gene in which the second IQ67 domain-encoding exon is replaced by an unrelated exon, it is unlikely that the contig5180 DNA sequence is an artifact; it probably represents either a novel variant of IQD-like genes or an ancestral gene of the IQD genes found in vascular plants.

We finally examined the relationships between the IQ67 domains of the four plant species by constructing a neighbor-joining phylogenetic tree using the PAUP*4.0 program and the amino acid sequence alignment shown in Figure 2. Three major subfamilies of IQ67 domain sequences can be observed, which each contain members of the Arabidopsis, rice and pine IQD families. In addition, two small subfamilies and two single branches originate deeply in the unrooted tree and are only distantly related to the three major subfamilies, which can be further divided into subgroups (Figure 7). Bootstrap analyses indicated that the deep nodes of the tree have low statistical support, which may be attributed to the small size of the IQ67 domain. Low bootstrap support has also been observed for the phylogeny of the similarly sized DNA-binding domains of bHLH [54], Dof [55], or GATA [56] transcription factor families. Nevertheless, the IQ67 tree has better resolution in the outer clades. The short branches at the tips of the tree indicate high sequence conservation and strong evolutionary relationships among subfamily members. Interestingly, although the major subfamilies of IQ67 domain sequences (1–3) and of IQD full-length protein sequences (I–IV) overlap only partially (compare color code in Figure 6 and Figure 7), subgroups of IQ67 domain sequences largely correspond to subgroups of full-length IQD protein sequences as identified in Figure 6, which is suggestive of exon shuffling during the evolution of IQD proteins. We also investigated the effect of different programs and methods on IQ67 domain tree topology. Using ClustalX and the neighbor-joining algorithm or the PAUP*4.0 program and maximum parsimony analysis resulted in a similar tree topology (data not shown), which indicates that the neighbor-joining tree presented in Figure 7 is robust and reflective of likely phylogenetic relationships between IQ67 domains within subfamilies.

Figure 7
Phylogenetic relationships of the IQ67 domains encoded by IQD genes from Arabidopsis thaliana, Oryza sativa, Pinus ssp. and Physcomitrella patens. The unrooted tree was constructed from the alignment shown in Figure 2 using PAUP* 4.0 and the neighbor-joining ...

Discussion

The IQ67 domain – a plant-specific arrangement of putative calmodulin-interacting motifs

In this study we characterized a possibly complete set of IQ67 domain-encoding genes in the current version of the Arabidopsis thaliana and Oryza sativa genomes. The defining features of the IQ67 domain are the invariant arrangement of three IQ motifs [32] separated by 11 and 15 intervening amino acid residues, and the conserved exon-intron organization (Figure 2). A pattern search of the Arabidopsis proteome with the conventional IQ motif (IQxxxRGxxxR) and its more generalized versions ([ILV]QxxxRxxxx[R,K]) as the queries confirmed the set of 33 IQD genes identified by reiterative BLAST searches. As expected from previous reports, our pattern search revealed three additional major families and numerous miscellaneous proteins that contain at least one IQ motif: the CNGC family of cyclic nucleotide gated channels (20 members; [57]), the myosin family (17 members; [58]), and the CAMTA family of calmodulin-binding transcriptional activators (6 members; [59-61]). For each of these families, the spacing of IQ motifs and the exon-intron organization of the respective regions are unique and distinct from the IQD family, which establishes the IQD proteins as a separate class of putative calmodulin targets of unknown biochemical functions (see Figure 8). The IQD proteins possibly constitute the largest class of putative calmodulin targets in plants. The size of the IQD family in Arabidopsis (33 proteins) and rice (29 proteins) clearly exceeds the size of other families of calmodulin-binding proteins [8] and is only comparable with the CIPK family (25–30 proteins) that interact with CBL Ca2+ sensors in Arabidopsis and rice [16]. In addition to the IQ motif, the IQ67 domain contains multiple copies of the 1-5-10 and 1-8-14 motifs, which are related and typified by their spacing of hydrophobic and basic amino acid residues. While the IQ motif is thought to mediate calmodulin retention in a Ca2+-independent manner, the 1-5-10 and 1-8-14 motifs are involved in Ca2+-dependent association of calmodulin with its target [33,34]. However, it should be noted that not all characterized calmodulin-binding domains contain these features [31,32].
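The two IQ motif patterns quoted above translate directly into regular expressions. The following is a minimal sketch of such a pattern scan, not the Patmatch search used in the study; the function name and the toy sequence are hypothetical, and `x` is read as "any residue".

```python
import re

# Conventional IQ motif (IQxxxRGxxxR) and the generalized form
# ([ILV]QxxxRxxxx[R,K]) from the text; 'x' is treated as any residue.
IQ_STRICT = re.compile(r"IQ.{3}RG.{3}R")
# A lookahead wrapper lets overlapping generalized motifs all be reported.
IQ_GENERAL = re.compile(r"(?=([ILV]Q.{3}R.{4}[RK]))")


def find_iq_motifs(seq):
    """Return (0-based start, matched text) for every generalized IQ motif."""
    return [(m.start(), m.group(1)) for m in IQ_GENERAL.finditer(seq.upper())]


# Hypothetical toy sequence for illustration only, not a real IQD protein.
toy = "MGSAAIKIQAAFRGYLARKSS"
hits = find_iq_motifs(toy)
strict_hit = IQ_STRICT.search(toy)
```

Counting hits per protein and recording the spacing between consecutive start positions would be one way to compare motif arrangements between families, although the spacing values reported in the text were derived from the published alignments rather than from a scan like this one.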

Figure 8
Organization of IQ motifs in major families of calmodulin-binding proteins. The scheme depicts the arrangement of the multiple IQ motifs present in proteins of the IQD family (this study; [37]), the CAMTA family of calmodulin-binding transcriptional activators ...

We previously demonstrated that Arabidopsis IQD1 binds to bovine calmodulin in a Ca2+-dependent fashion [37]. In this study, we tested calmodulin binding for IQD20, the smallest member of the Arabidopsis IQD protein family (103 residues), which consists only of the IQ67 domain at its C-terminus and a short N-terminal extension of 35 amino acid residues. Interestingly, we observed interaction of recombinant IQD20 with calmodulin in the absence of Ca2+, which is possibly augmented when the metal ion is present (Figure 4). This observation and the prediction of putative calmodulin binding sites in IQD20 and all IQD proteins in Arabidopsis and rice, using the algorithm provided by the Calmodulin Target Database [40], strongly suggest that all IQD proteins have the potential to interact with calmodulin (Figure 3 and Table 4). Given our results with Arabidopsis IQD1 and IQD20, the prospect arises that different IQD proteins may interact with calmodulin in different modes, which could be Ca2+-independent, Ca2+-dependent, or more complex. The precise mechanism for each IQD protein is likely determined by the number and specific composition of the IQ, 1-5-10 and 1-8-14 motifs in the IQ67 domain, by the predicted calmodulin binding site adjacent to or overlapping with the IQ67 domain, and by the overall tertiary structure of the IQD protein. These structural features differ substantially between IQD1 and IQD20 (Figure 2, Table 1, Table 4), and the differences are likely responsible for the observed differences in calmodulin interaction with respect to Ca2+ dependency. The identification of interacting calmodulin or calmodulin-like proteins [14] and the biochemical characterization of calmodulin binding sites for each IQD protein are important tasks for future research.

It is interesting to note that the Calmodulin Target Database successfully predicts experimentally verified calmodulin-interacting peptides in CNGC [57] and CAMTA [59-61] proteins, which are located at conserved positions adjacent to the IQ motifs (see Figure 8). Although the IQ motif is likely as widely distributed as calmodulin and calmodulin-like proteins, the IQ67-specific arrangement of the three calmodulin retention motifs is confined to plant proteins and not found outside the plant kingdom, suggesting that this calmodulin-interaction module arose early in plant evolution.

Evolution of IQD proteins

The presence of at least one putative IQD-like gene in Physcomitrella patens indicates that the IQD gene family originated during the early evolution of land plants, possibly before the divergence of bryophyte and vascular plant lineages 450–700 Myr ago [62], but not later than the split of gymnosperms and angiosperms about 300 Myr ago [63] as evidenced by EST and full-length cDNA sequences coding for at least nine IQD genes in pine. Molecular and phylogenetic analysis of IQD and IQD-like genes from ferns, bryophytes and green algae will be necessary to resolve the evolutionary origin of the IQD gene family.

To explore how the IQD gene family has evolved since the monocot-eudicot divergence 170–235 Myr ago [64], we performed a genome-wide comparative analysis of the IQD gene complement between Arabidopsis and rice. The phylogenetic trees of the 33 Arabidopsis and 28 rice IQD genes showed relatively long branches and closely clustered nodes, reflecting a high degree of sequence divergence, which is further indicated by the large variation in the number of protein-coding exons (2–6) and computed molecular masses of the predicted IQD proteins (Figure 1 and Tables 1, 2, 3). Based on their phylogenetic relationships, up to six different subfamilies of IQD genes can be defined for both species. This classification is supported by conserved exon-intron organization and protein motif patterns within each subfamily. The combined phylogenetic analysis revealed that members of all six subfamilies are present in the Arabidopsis and rice genome, indicating a relatively early diversification of the IQD gene family before the monocot-eudicot split (Figure 6). In those subfamilies, seven members of both IQD gene families are clearly recognizable as distinct orthologous pairs (e.g. genes coding for atypical IQD proteins), suggesting that the encoded proteins exert similar functions in both species. On the other hand, it is currently impossible to assign potential functions to IQD genes that are the result of recent species-specific duplication events leading to independent functional diversification.

The topology of the phylogenetic trees at the outer branches suggests that gene duplication played a prominent role in the evolution of both gene families, which is supported by the analysis of duplicated segments in the Arabidopsis genome (Figure 5). More than 80% of all genes in the annotated Arabidopsis genome reside in duplicated segments, and systematic analyses indicate that the Arabidopsis genome experienced a large-scale or even complete genome duplication event 30–90 Myr ago, sometime between the Arabidopsis-Gossypium and Arabidopsis-Brassica splits [48,49,51,65,66]. Evidence for older (>100 Myr ago) large-scale duplications exists; however, the frequency and precise timing of polyploidizations remain to be resolved and are a focus of current research [45,47-50,65,66]. The location of IQD genes in the Arabidopsis genome is clearly reflective of the recent large-scale duplication event. The IQD gene family is uniformly distributed among the five chromosomes, and 26 (or 79%) of the 33 IQD loci are found in duplicated segments of the recent age class (Figure 5). It is important to point out that 16 of those 26 genes in duplicated loci correspond to 8 IQD sister gene pairs, which represents an unusually high fraction of paralogous genes (44.5%) that have been retained from the extra gene set since the duplication event. Nonfunctionalization and subsequent gene loss is the most likely fate of a gene duplicate, and less than 27% of the entire paralogous gene set originating from polyploidy has been retained in Arabidopsis [45,48]. Preferential retention of duplicated genes has been observed for gene families in Arabidopsis with functions in signal transduction and transcriptional regulation [44]. Specific examples include the gene families encoding Aux/IAA (71.5% [67]), GATA (39% [56]) and GRAS (40% [68]) transcription factors, or genes coding for 20S proteasome subunits (64% [69]); the given percentages equal fractions of retained gene duplicates that we calculated from published data. Empirical evidence indicates that regulatory processes in metazoa such as signal transduction or gene transcription are dependent on gene dosage and stoichiometric protein-protein interactions [70]. As pointed out by Blanc and Wolfe [44], retention of a near-complete set or subset of duplicated genes coding for regulatory components such as transcription factors, kinases, phosphatases or Ca2+-binding proteins would minimize disturbances in sensitive stoichiometric and concentration-dependent relationships.
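The text does not spell out how the retained fraction was computed. One reading that comes close to the reported 44.5% treats each sister pair and each unpaired gene in a duplicated segment as one pre-duplication locus, then asks what share of those loci still has both copies. The sketch below follows that assumption; the function name is hypothetical.

```python
def retained_fraction(genes_in_duplicated_segments, retained_pairs):
    """Share of pre-duplication loci that kept both copies.

    Assumption (not stated in the text): every retained pair and every
    unpaired gene in a duplicated segment descends from one ancestral locus.
    """
    singletons = genes_in_duplicated_segments - 2 * retained_pairs
    if singletons < 0:
        raise ValueError("more paired genes than genes in duplicated segments")
    ancestral_loci = retained_pairs + singletons
    return retained_pairs / ancestral_loci


# Counts taken from the text: 26 IQD loci in recent duplicated segments,
# 16 of which form 8 sister pairs.
iqd_fraction = retained_fraction(26, 8)
iqd_percent = round(100 * iqd_fraction, 1)
```

Under this reading the result is 8 of 18 loci, or roughly 44%, which is close to, but not identical with, the published figure; the authors may have used a slightly different denominator or rounding.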

The evolutionary history of the rice genome is less understood. The view of an ancient polyploidy event has recently been questioned by evidence suggesting that rice experienced a partial or entire duplication of one chromosome about 70 Myr ago and can thus be considered an ancient aneuploid [43,51,52,71-73]. The observed non-uniform distribution of the 29-member IQD gene family in the rice genome (almost 50% of all IQD loci and three of the four paralogous IQD gene pairs are present on chromosomes I and V; Table 2) is more consistent with an aneuploidy event than with a whole-genome duplication event. If polyploidization had occurred, it would be expected that IQD genes are randomly distributed over the whole rice genome, as observed for the IQD gene family in Arabidopsis. Given the significant differences in genome size and estimated gene count between rice (420 Mb, 57,900 genes [52,53,74]) and Arabidopsis (119 Mb, 27,500 genes [75]), the slightly larger size of the IQD gene family in Arabidopsis (33 members) versus rice (29 genes) is in agreement with a whole-genome duplication event in the evolutionary history of the Arabidopsis genome. A similar difference in membership has been reported for the Arabidopsis and rice gene families encoding Dof and GRAS transcription factors [55,68]. Nonetheless, IQD genes tend to be larger in rice than in Arabidopsis, which is mainly due to an increased intron length (Figure 1 and Table 3). In addition to polyploidization and segmental duplication events, tandem duplication is another important mechanism in the evolution of gene families [76] and plays a significant role in Arabidopsis as 17% of all genes are arranged in tandem arrays [48,77]. However, there is no evidence for tandem proliferation of the IQD gene families in the recent history of Arabidopsis and rice genomes.

Our analysis further suggests that exon shuffling played a major role during the evolution of IQD genes. Exon insertions and duplications, the major mechanisms of exon shuffling, contributed significantly to the complexities of eukaryotic proteomes [38,78,79]. A striking correlation between functional domains in proteins and exons flanked by introns of matching phases, referred to as symmetrical exons, has been observed [38,80]. As stated by the phase-compatibility rules of exon shuffling [81], symmetrical exons and their flanking introns can be deleted, duplicated and inserted into introns of the same phase class without causing frame shifts. Thus, symmetrical exons flanked by introns of a single phase class tend to predominate in genes that largely evolved by exon shuffling and their nonrandom usage may be indicative of gene assembly by exon recruitment [38,78]. An intriguing feature of IQD gene organization in Arabidopsis and rice is the almost exclusive presence of symmetrical exons flanked by phase-0 introns (Figure 1). The strong bias for one intron phase class and the variation in the number of exons (2–6), and consequently size of the encoded proteins, are consistent with exon shuffling during the evolution of IQD genes. Exon shuffling is also suggested by the comparisons of patterns of protein motifs (Figure 3) and by the phylogenetic analysis of IQD full-length proteins and IQ67 domains, which indicate that phylogenetic relationships based on the IQ67 domain do not necessarily recapitulate patterns of protein and gene structure (Figures 5 and 6). Putative exon shuffling events may be recognized in some of the IQD gene structures. For example, At5g35670 and Os06m03925 encode a partial IQ67 domain and may have experienced exon swapping, or At4g10640 may have acquired its penultimate exon when compared with At3g49380 of the same subgroup (Figure 1). Exon shuffling may have played a prominent role in the diversification of IQD genes and their hitherto unknown functions. The above-mentioned gene families of transcription factors [55,56,67] contain introns of mixed phase classes, suggesting that exon shuffling played only a minor role during the evolution of these proteins with relatively defined functions. On the other hand, for example, all introns of genes coding for CIPKs are in phase-0 [16]. The exclusive usage of one phase class may indicate exon shuffling to generate the domain diversity necessary for kinase regulation and the ability to recognize a wide spectrum of protein substrates.
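The phase-compatibility argument can be made concrete with a short calculation. An intron's phase is the number of coding bases upstream of it modulo 3, and an internal exon is symmetrical when the introns on either side share the same phase. The sketch below assumes the input lists only the coding portion of each exon; the function names and the example gene are hypothetical and do not correspond to any IQD locus.

```python
from itertools import accumulate


def intron_phases(cds_exon_lengths):
    """Phase of each intron: coding bases upstream of it, modulo 3."""
    cumulative = list(accumulate(cds_exon_lengths))
    return [n % 3 for n in cumulative[:-1]]  # no intron follows the last exon


def symmetrical_exons(cds_exon_lengths):
    """0-based indices of internal exons flanked by introns of equal phase."""
    phases = intron_phases(cds_exon_lengths)
    return [
        i
        for i in range(1, len(cds_exon_lengths) - 1)
        if phases[i - 1] == phases[i]
    ]


# Hypothetical coding exon lengths (bp) for illustration only.
toy_gene = [150, 201, 90, 301, 122]
toy_phases = intron_phases(toy_gene)
toy_symmetrical = symmetrical_exons(toy_gene)
```

In this toy gene the first three introns are in phase 0, so the second and third exons could be removed or duplicated without shifting the reading frame, whereas the fourth exon sits between a phase-0 and a phase-1 intron and could not.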

Potential roles for IQD proteins

We have recently identified At3g09710 (IQD1) in a screen for Arabidopsis mutants with altered glucosinolate accumulation [37]. Glucosinolates are synthesized mainly by cruciferous species and constitute a class of secondary metabolites with roles in plant defense against pathogens and herbivores [35]. Characterization of gain- and loss-of-function alleles of IQD1 demonstrated that the encoded protein functions as a modulator of glucosinolate pathway-related gene expression. Tissue-specific expression of IQD1 is consistent with glucosinolate accumulation and mainly confined to the vascular tissues. We further demonstrated that an IQD1-GFP fusion protein is targeted to the cell nucleus and that recombinant IQD1 interacts with calmodulin in a Ca2+-dependent fashion [37]. It is therefore intriguing to hypothesize that IQD1 integrates intracellular Ca2+ signals elicited by environmental cues such as herbivorous attack to fine-tune glucosinolate synthesis and accumulation. It should be pointed out that the rice genome does not contain an ortholog of At3g09710 (Figure 6), which is consistent with the absence of the glucosinolate pathway in this species and with functional diversification of the Arabidopsis and rice IQD gene families.

We are left to speculate on the biochemical and cellular functions of IQD proteins. One of the most intriguing features of IQD proteins is their high isoelectric point (~10.3), which has been maintained irrespective of protein size variation and domain composition, except for one family member each in Arabidopsis and rice. This observation suggests that the basic nature of IQD proteins is important for their biochemical functions. Although IQD proteins do not contain currently known DNA- or RNA-binding motifs, the basic isoelectric point and high frequency of serine residues, which are reminiscent of certain splicing factors [82], suggest that IQD proteins may associate with nucleic acids and regulate gene expression at the transcriptional or post-transcriptional level. Interestingly, we have recently observed that Arabidopsis IQD1 binds to nucleic acids (T. Savchenko, B. Zipp and S. Abel, unpublished results). A regulatory role for IQD proteins is also suggested by the relatively high fraction of retained duplicated IQD genes in the Arabidopsis genome. Preferential retention of paralogous gene pairs is thought to counteract disturbances in gene dosage and stoichiometric ratios of regulatory protein complexes after large-scale segmental duplication events and the onset of gene inactivation and loss of gene duplicates [44]. In this context, it is interesting to point out that the multiple Ca2+-dependent and Ca2+-independent calmodulin recruitment motifs of the IQ67 domains are likely involved in specific and cooperative interactions with calmodulins or calmodulin-like proteins. These interactions may dramatically alter the dynamic range of Ca2+-binding kinetics and, in turn, modulate interactions of the oligomeric protein complex with additional target proteins [31,83]. Many, if not most, members of the Arabidopsis and rice IQD protein families are likely to function in the cell nucleus (Tables 1 and 2). There is increasing evidence for the generation of nucleus-specific Ca2+-signatures in plant cells [1,84-86] and for a potential regulatory role of calmodulin and related Ca2+ sensor proteins in nuclear processes such as transcription or gene silencing [9,60,61,87-90].

Conclusion

We have systematically identified and characterized by bioinformatics a novel family of putative calmodulin target proteins in two model plant species, Arabidopsis thaliana and Oryza sativa. Our phylogenetic analyses indicate that the major IQD gene lineages originated before the monocot-eudicot divergence and that the expansion of the IQD gene family in the genomes of Arabidopsis and rice is consistent with a recent polyploidization and aneuploidization event, respectively. The extant IQD loci in Arabidopsis primarily resulted from segmental duplication and reflect preferential retention of paralogous genes, which is characteristic for proteins with regulatory functions. The almost exclusive usage of phase-0 introns and variable number of exons suggests a role for exon shuffling during the diversification of IQD proteins, which is also supported by phylogenetic relationships between the IQ67 domain and full-length IQD proteins. The unusually basic isoelectric point of IQD proteins and their frequently predicted nuclear localization suggest that IQD proteins link calcium signaling pathways to the regulation of gene expression. Our study provides a framework for the functional dissections of this emerging family of putative calmodulin target proteins.

Methods

Identification of IQD genes

To identify members of the Arabidopsis thaliana IQD protein family, multiple database searches were performed using the Basic Local Alignment Search Tool (BLAST [91,92]) algorithms BLASTP and TBLASTN available on the National Center for Biotechnology Information (NCBI) and The Arabidopsis Information Resource (TAIR) databases [93-95]. We used the amino acid sequence of IQD1 and of its IQ67 domain as initial query sequences, followed by the amino acid sequences of other IQD family members. Amino acid sequence pattern searches were performed on the TAIR website using Patmatch. Arabidopsis nucleotide and protein sequences as well as information regarding the gene structure were obtained from the Munich Information Center for Protein Sequences (MIPS) Arabidopsis thaliana Database (MATDB) [96], The Institute for Genomic Research (TIGR) Arabidopsis thaliana Database [74], and the Arabidopsis thaliana Plant Genome Database (AtPGD) [97]. To identify members of the rice (Oryza sativa) IQD protein family (OsIQD), we searched four different databases using the same BLAST algorithms. Sequences for O. sativa ssp. japonica were retrieved from the database at the TIGR Rice Genome Project [74]. Genomic sequences for ssp. japonica and ssp. indica were also obtained from the GenBank database containing the results of the International Rice Genome Sequencing Project and the draft rice genome sequence of the Chinese Academy of Sciences [53,93]. Rice full-length cDNA and EST sequences were searched in the Knowledge-based Oryza Molecular biological Encyclopedia (KOME) at the National Institute of Agrobiological Sciences [98] and in the TIGR Gene Indices [74]. Nucleotide and amino acid sequences as well as gene structure and chromosomal duplications were obtained from the same databases mentioned above. Genomic sequences that appeared to be misannotated by comparison with available cDNA sequences (full-length cDNAs, ESTs) were corrected for subsequent analysis. Sequences encoding putative IQD proteins in Pinus ssp. and Physcomitrella patens were identified by BLAST searches of the TIGR Gene Indices [74] and of the moss database NIBB PHYSCObase [99].

Chromosomal duplication in the Arabidopsis genome

For the detection of large segmental duplications, we used the redundancy viewer at the MATDB [96], the duplicated blocks map provided by TIGR [74], the interactive supplementary material by Simillion et al. [48], and the interactive maps of duplicated blocks in Arabidopsis by Blanc et al. [45].

Computational analysis of IQD proteins

The amino acid sequences of all IQD proteins were analyzed for physico-chemical parameters (ProtParam) and predicted subcellular localization (PSORT, TargetP) on the ExPASy Proteomics Server [100]. MEME (Multiple Expectation Maximization for Motif Elicitation) was used to identify conserved motif structures among IQD protein sequences [39]. Putative calmodulin-binding sites in IQD protein sequences were predicted by the Calmodulin Target Database [40].

Alignment and phylogenetic analysis of IQD sequences

Multiple alignments of amino acid sequences were performed using ClustalW [101] or ClustalX [102] and were manually corrected. For generating the phylogenetic trees of full-length IQD protein sequences reported in Figures 1, 2 and 5, we used ClustalX (1.81) and the neighbor-joining algorithm [42]. Bootstrap analysis with 1,000 replicates was used to evaluate the significance of the nodes. The trees of the Arabidopsis and rice IQD protein families were rooted using each atypical protein containing a truncated IQ67 domain as an outgroup; an unrooted tree is shown for the combined analysis of all Arabidopsis and rice IQD proteins (Figure 6). For the creation of the unrooted phylogenetic tree of IQ67 domain sequences in Figure 7, we used in addition the PAUP*4.0 (b10) program to perform distance and parsimony analyses [103]. The same program was used for subsequent bootstrap analysis with 1,000 replicates to evaluate tree topology.
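For readers unfamiliar with bootstrap analysis, the core step is resampling alignment columns with replacement and recomputing distances (and then trees) from each replicate. The sketch below illustrates only that resampling step together with a simple uncorrected p-distance; it is not the ClustalX or PAUP* implementation, and the function names and toy alignment are hypothetical.

```python
import random


def bootstrap_columns(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement."""
    length = len(alignment[0])
    picks = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


def p_distance(a, b):
    """Fraction of aligned, gap-free positions at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


# Hypothetical toy alignment; real input would be the IQ67 domain alignment.
toy_alignment = ["IQAAFRGY", "IQSAFRGH", "VQAA-RGY"]
rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_columns(toy_alignment, rng) for _ in range(1000)]
observed = p_distance(toy_alignment[0], toy_alignment[1])
```

A full analysis would build a neighbor-joining tree from each replicate's distance matrix and report, for every node of the original tree, the percentage of replicates that recover the same grouping. With a domain of only 67 residues there are few columns to resample, which is one reason deep nodes tend to receive low support.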

cDNA cloning

The identification and cloning of a full-length cDNA for At3g09710 has been described previously [37]. Using similar conditions for reverse transcriptase-mediated PCR, we amplified predicted full-length cDNA sequences for

At1g17480 (forward: 5'-ATGGGTGGGTCAGGAAATTGGATT-3';

reverse: 5'-TTAGCTTCGCTGGCTCTTGG-3'),

At1g18840 (forward: 5'-ATGGGAAAGCCTGCAAGGTG-3';

reverse: 5'-TAACCGTTTCCTTCTCGGGACGA-3'), and

At4g23060 (forward: 5'-ATGGGAAAAGCGTCCCGGTGGTT-3';

reverse: 5'-TCAGTACCTATACCCAATTGGCATCC-3').

The resulting PCR products were subcloned into the vector pGEMT (Promega, Madison, WI) by TA cloning followed by DNA sequencing of the insert with T7 and SP6 primers.
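Primers for amplifying a predicted full-length coding sequence are commonly designed so that the forward primer begins at the start codon and the reverse primer is the reverse complement of the 3' end including the stop codon. The sketch below checks this for the At1g17480 primer pair quoted above; the helper names are hypothetical, and the design rule is an assumption rather than something the text states.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in a primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# At1g17480 primer pair as given in the text.
forward = "ATGGGTGGGTCAGGAAATTGGATT"
reverse = "TTAGCTTCGCTGGCTCTTGG"

sense_3prime = reverse_complement(reverse)     # coding-strand view of the 3' end
starts_with_atg = forward.startswith("ATG")
ends_with_stop = sense_3prime[-3:] in STOP_CODONS
```

Here the forward primer opens with ATG and the coding-strand view of the reverse primer ends in TAA, consistent with a primer pair spanning the whole open reading frame.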

Expression of AtIQD20 and calmodulin binding assay

A full-length cDNA fragment encoding the predicted IQD20 protein of Arabidopsis was generated by RT-PCR using gene-specific primers

At3g51380 (forward: 5'-CGCGGATCCATGGCCAACTCCAAACGTTTG-3') and At3g51380 (reverse: 5'-GAGGAATTCTTAATGAGAGAG-3'). The PCR fragment was subcloned into the BamHI and EcoRI sites of vector pET21a (Novagen, Madison, WI, USA), which provides an N-terminal T7-epitope tag. Expression of recombinant T7-IQD20 and calmodulin-binding assays using calmodulin-agarose beads (phosphodiesterase-3':5'-cyclic nucleotide activator from bovine brain; Sigma-Aldrich, St. Louis, MO, USA) were performed as previously described [37].

Authors' contributions

SA carried out most of the bioinformatics analyses and wrote the entire manuscript. TS demonstrated calmodulin binding of IQD20. TS and ML contributed to data collection and IQD sequence analysis.

Acknowledgements

We thank Carla Ticconi and Raymond Kwong for critical reading of the manuscript. This work was supported by the National Research Initiative of the United States Department of Agriculture Cooperative State Research, Education and Extension Service to S.A. (grant number 2005-02507).

References

  • Rudd JJ, Franklin-Tong VE. Unravelling response-specificity in Ca2+-signaling pathways in plant cells. New Phytol. 2001;151:7–33. doi: 10.1046/j.1469-8137.2001.00173.x. [Cross Ref]
  • Evans NH, McAinsh MR, Hetherington AM. Calcium oscillations in higher plants. Curr Opin Plant Biol. 2001;4:415–420. doi: 10.1016/S1369-5266(00)00194-1. [PubMed] [Cross Ref]
  • Harper JF. Dissecting calcium oscillators in plant cells. Trends Plant Sci. 2001;6:395–397. doi: 10.1016/S1360-1385(01)02023-4. [PubMed] [Cross Ref]
  • Scrase-Field SA, Knight MR. Calcium: just a chemical switch? Curr Opin Plant Biol. 2003;6:500–506. doi: 10.1016/S1369-5266(03)00091-8. [PubMed] [Cross Ref]
  • Knight H, Knight MR. Abiotic stress signalling pathways: specificity and cross-talk. Trends Plant Sci. 2001;6:262–267. doi: 10.1016/S1360-1385(01)01946-X. [PubMed] [Cross Ref]
  • Snedden WA, Fromm H. Calmodulin as a versatile calcium signal transducer in plants. New Phytol. 2001;151:35–66. doi: 10.1046/j.1469-8137.2001.00154.x. [Cross Ref]
  • Sanders D, Pelloux J, Brownlee C, Harper JF. Calcium at the crossroads of signaling. Plant Cell. 2002;14 Suppl:S401–17. [PMC free article] [PubMed]
  • Reddy VS, Reddy AS. Proteomics of calcium-signaling components in plants. Phytochemistry. 2004;65:1745–1776. doi: 10.1016/j.phytochem.2004.04.033. [PubMed] [Cross Ref]
  • Luan S, Kudla J, Rodriguez-Concepcion M, Yalovsky S, Gruissem W. Calmodulins and calcineurin B-like proteins: calcium sensors for specific signal response coupling in plants. Plant Cell. 2002;14 Suppl:S389–400. [PMC free article] [PubMed]
  • Yang T, Poovaiah BW. Calcium/calmodulin-mediated signal network in plants. Trends Plant Sci. 2003;8:505–512. doi: 10.1016/j.tplants.2003.09.004. [PubMed] [Cross Ref]
  • Bouche N, Yellin A, Snedden WA, Fromm H. Plant-Specific Calmodulin-Binding Proteins. Annu Rev Plant Biol. 2005;56:435–466. doi: 10.1146/annurev.arplant.56.032604.144224. [PubMed] [Cross Ref]
  • Day IS, Reddy VS, Shad Ali G, Reddy AS. Analysis of EF-hand-containing proteins in Arabidopsis. Genome Biol. 2002;3:RESEARCH0056. doi: 10.1186/gb-2002-3-10-research0056. [PMC free article] [PubMed] [Cross Ref]
  • McCormack E, Braam J. Calmodulin and related potential calcium sensors of Arabidopsis. New Phytol. 2003;159:585–598. doi: 10.1046/j.1469-8137.2003.00845.x. [Cross Ref]
  • McCormack E, Tsai YC, Braam J. Handling calcium signaling: Arabidopsis CaMs and CMLs. Trends Plant Sci. 2005;10:383–389. doi: 10.1016/j.tplants.2005.07.001. [PubMed] [Cross Ref]
  • Kudla J, Xu Q, Harter K, Gruissem W, Luan S. Genes for calcineurin B-like proteins in Arabidopsis are differentially regulated by stress signals. Proc Natl Acad Sci U S A. 1999;96:4718–4723. doi: 10.1073/pnas.96.8.4718. [PMC free article] [PubMed] [Cross Ref]
  • Kolukisaoglu U, Weinl S, Blazevic D, Batistic O, Kudla J. Calcium sensors and their interacting protein kinases: genomics of the Arabidopsis and rice CBL-CIPK signaling networks. Plant Physiol. 2004;134:43–58. doi: 10.1104/pp.103.033068. [PMC free article] [PubMed] [Cross Ref]
  • Batistic O, Kudla J. Integration and channeling of calcium signaling through the CBL calcium sensor/CIPK protein kinase network. Planta. 2004;219:915–924. doi: 10.1007/s00425-004-1333-3. [PubMed] [Cross Ref]
  • Harmon AC, Gribskov M, Harper JF. CDPKs - a kinase for every Ca2+ signal? Trends Plant Sci. 2000;5:154–159. doi: 10.1016/S1360-1385(00)01577-6. [PubMed] [Cross Ref]
  • Hrabak EM, Chan CW, Gribskov M, Harper JF, Choi JH, Halford N, Kudla J, Luan S, Nimmo HG, Sussman MR, Thomas M, Walker-Simmons K, Zhu JK, Harmon AC. The Arabidopsis CDPK-SnRK superfamily of protein kinases. Plant Physiol. 2003;132:666–680. doi: 10.1104/pp.102.011999. [PMC free article] [PubMed] [Cross Ref]
  • Sheen J. Ca2+-dependent protein kinases and stress signal transduction in plants. Science. 1996;274:1900–1902. doi: 10.1126/science.274.5294.1900. [PubMed] [Cross Ref]
  • Romeis T, Ludwig AA, Martin R, Jones JD. Calcium-dependent protein kinases play an essential role in a plant defence response. EMBO J. 2001;20:5556–5567. doi: 10.1093/emboj/20.20.5556. [PMC free article] [PubMed] [Cross Ref]
  • Shi J, Kim KN, Ritz O, Albrecht V, Gupta R, Harter K, Luan S, Kudla J. Novel protein kinases associated with calcineurin B-like calcium sensors in Arabidopsis. Plant Cell. 1999;11:2393–2405. doi: 10.1105/tpc.11.12.2393. [PMC free article] [PubMed] [Cross Ref]
  • Halfter U, Ishitani M, Zhu JK. The Arabidopsis SOS2 protein kinase physically interacts with and is activated by the calcium-binding protein SOS3. Proc Natl Acad Sci U S A. 2000;97:3735–3740. doi: 10.1073/pnas.040577697. [PMC free article] [PubMed] [Cross Ref]
  • Kim KN, Cheong YH, Gupta R, Luan S. Interaction specificity of Arabidopsis calcineurin B-like calcium sensors and their target kinases. Plant Physiol. 2000;124:1844–1853. doi: 10.1104/pp.124.4.1844. [PMC free article] [PubMed] [Cross Ref]
  • Zhu JK. Regulation of ion homeostasis under salt stress. Curr Opin Plant Biol. 2003;6:441–445. doi: 10.1016/S1369-5266(03)00085-2. [PubMed] [Cross Ref]
  • Pandey GK, Cheong YH, Kim KN, Grant JJ, Li L, Hung W, D'Angelo C, Weinl S, Kudla J, Luan S. The calcium sensor calcineurin B-like 9 modulates abscisic acid sensitivity and biosynthesis in Arabidopsis. Plant Cell. 2004;16:1912–1924. doi: 10.1105/tpc.021311. [PMC free article] [PubMed] [Cross Ref]
  • Zielinski RE. Calmodulin And Calmodulin-Binding Proteins In Plants. Annu Rev Plant Physiol Plant Mol Biol. 1998;49:697–725. doi: 10.1146/annurev.arplant.49.1.697. [PubMed] [Cross Ref]
  • Zhang L, Lu YT. Calmodulin-binding protein kinases in plants. Trends Plant Sci. 2003;8:123–127. doi: 10.1016/S1360-1385(03)00013-X. [PubMed] [Cross Ref]
  • Reddy AS, Day IS, Narasimhulu SB, Safadi F, Reddy VS, Golovkin M, Harnly MJ. Isolation and characterization of a novel calmodulin-binding protein from potato. J Biol Chem. 2002;277:4206–4214. doi: 10.1074/jbc.M104595200. [PubMed] [Cross Ref]
  • Osawa M, Swindells MB, Tanikawa J, Tanaka T, Mase T, Furuya T, Ikura M. Solution structure of calmodulin-W-7 complex: the basis of diversity in molecular recognition. J Mol Biol. 1998;276:165–176. doi: 10.1006/jmbi.1997.1524. [PubMed] [Cross Ref]
  • Hoeflich KP, Ikura M. Calmodulin in action: diversity in target recognition and activation mechanisms. Cell. 2002;108:739–742. doi: 10.1016/S0092-8674(02)00682-7. [PubMed] [Cross Ref]
  • Bahler M, Rhoads A. Calmodulin signaling via the IQ motif. FEBS Lett. 2002;513:107–113. doi: 10.1016/S0014-5793(01)03239-2. [PubMed] [Cross Ref]
  • Choi JY, Lee SH, Park CY, Heo WD, Kim JC, Kim MC, Chung WS, Moon BC, Cheong YH, Kim CY, Yoo JH, Koo JC, Ok HM, Chi SW, Ryu SE, Lee SY, Lim CO, Cho MJ. Identification of calmodulin isoform-specific binding peptides from a phage-displayed random 22-mer peptide library. J Biol Chem. 2002;277:21630–21638. doi: 10.1074/jbc.M110803200. [PubMed] [Cross Ref]
  • Rhoads AR, Friedberg F. Sequence motifs for calmodulin recognition. FASEB J. 1997;11:331–340. [PubMed]
  • Wittstock U, Halkier BA. Glucosinolate research in the Arabidopsis era. Trends Plant Sci. 2002;7:263–270. doi: 10.1016/S1360-1385(02)02273-2. [PubMed] [Cross Ref]
  • Dudareva N, Evrard JL, Pillay DT, Steinmetz A. Nucleotide sequence of a pollen-specific cDNA from Helianthus annuus L. encoding a highly basic protein. Plant Physiol. 1994;106:403–404. doi: 10.1104/pp.106.1.403. [PMC free article] [PubMed] [Cross Ref]
  • Levy M, Wang Q, Kaspi R, Parrella MP, Abel S. Arabidopsis IQD1, a novel calmodulin-binding nuclear protein, stimulates glucosinolate accumulation and plant defense. Plant J. 2005;43:79–96. doi: 10.1111/j.1365-313X.2005.02435.x. [PubMed] [Cross Ref]
  • Liu M, Grigoriev A. Protein domains correlate strongly with exons in multiple eukaryotic genomes--evidence of exon shuffling? Trends Genet. 2004;20:399–403. doi: 10.1016/j.tig.2004.06.013. [PubMed] [Cross Ref]
  • Bailey TL, Elkan C. The value of prior knowledge in discovering motifs with MEME. Proc Int Conf Intell Syst Mol Biol. 1995;3:21–29. [PubMed]
  • Yap KL, Kim J, Truong K, Sherman M, Yuan T, Ikura M. Calmodulin target database. J Struct Funct Genomics. 2000;1:8–14. doi: 10.1023/A:1011320027914. [PubMed] [Cross Ref]
  • Abel S, Theologis A. A polymorphic bipartite motif signals nuclear targeting of early auxin-inducible proteins related to PS-IAA4 from pea (Pisum sativum). Plant J. 1995;8:87–96. doi: 10.1046/j.1365-313X.1995.08010087.x. [PubMed] [Cross Ref]
  • Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4:406–425. [PubMed]
  • Blanc G, Wolfe KH. Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell. 2004;16:1667–1678. doi: 10.1105/tpc.021345. [PMC free article] [PubMed] [Cross Ref]
  • Blanc G, Wolfe KH. Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell. 2004;16:1679–1691. doi: 10.1105/tpc.021410. [PMC free article] [PubMed] [Cross Ref]
  • Blanc G, Hokamp K, Wolfe KH. A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis genome. Genome Res. 2003;13:137–144. doi: 10.1101/gr.751803. [PMC free article] [PubMed] [Cross Ref]
  • Blanc G, Barakat A, Guyot R, Cooke R, Delseny M. Extensive duplication and reshuffling in the Arabidopsis genome. Plant Cell. 2000;12:1093–1101. doi: 10.1105/tpc.12.7.1093. [PMC free article] [PubMed] [Cross Ref]
  • Vision TJ, Brown DG, Tanksley SD. The origins of genomic duplications in Arabidopsis. Science. 2000;290:2114–2117. doi: 10.1126/science.290.5499.2114. [PubMed] [Cross Ref]
  • Simillion C, Vandepoele K, Van Montagu MC, Zabeau M, Van de Peer Y. The hidden duplication past of Arabidopsis thaliana. Proc Natl Acad Sci U S A. 2002;99:13627–13632. doi: 10.1073/pnas.212522399. [PMC free article] [PubMed] [Cross Ref]
  • Bowers JE, Chapman BA, Rong J, Paterson AH. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature. 2003;422:433–438. doi: 10.1038/nature01521. [PubMed] [Cross Ref]
  • Ziolkowski PA, Blanc G, Sadowski J. Structural divergence of chromosomal segments that arose from successive duplication events in the Arabidopsis genome. Nucleic Acids Res. 2003;31:1339–1350. doi: 10.1093/nar/gkg201. [PMC free article] [PubMed] [Cross Ref]
  • Vandepoele K, Simillion C, Van de Peer Y. Evidence that rice and other cereals are ancient aneuploids. Plant Cell. 2003;15:2192–2202. doi: 10.1105/tpc.014019. [PMC free article] [PubMed] [Cross Ref]
  • Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H, Hadley D, Hutchison D, Martin C, Katagiri F, Lange BM, Moughamer T, Xia Y, Budworth P, Zhong J, Miguel T, Paszkowski U, Zhang S, Colbert M, Sun WL, Chen L, Cooper B, Park S, Wood TC, Mao L, Quail P, Wing R, Dean R, Yu Y, Zharkikh A, Shen R, Sahasrabudhe S, Thomas A, Cannings R, Gutin A, Pruss D, Reid J, Tavtigian S, Mitchell J, Eldredge G, Scholl T, Miller RM, Bhatnagar S, Adey N, Rubano T, Tusneem N, Robinson R, Feldhaus J, Macalma T, Oliphant A, Briggs S. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science. 2002;296:92–100. doi: 10.1126/science.1068275. [PubMed] [Cross Ref]
  • Yu J, Hu S, Wang J, Wong GK, Li S, Liu B, Deng Y, Dai L, Zhou Y, Zhang X, Cao M, Liu J, Sun J, Tang J, Chen Y, Huang X, Lin W, Ye C, Tong W, Cong L, Geng J, Han Y, Li L, Li W, Hu G, Huang X, Li W, Li J, Liu Z, Li L, Liu J, Qi Q, Liu J, Li L, Li T, Wang X, Lu H, Wu T, Zhu M, Ni P, Han H, Dong W, Ren X, Feng X, Cui P, Li X, Wang H, Xu X, Zhai W, Xu Z, Zhang J, He S, Zhang J, Xu J, Zhang K, Zheng X, Dong J, Zeng W, Tao L, Ye J, Tan J, Ren X, Chen X, He J, Liu D, Tian W, Tian C, Xia H, Bao Q, Li G, Gao H, Cao T, Wang J, Zhao W, Li P, Chen W, Wang X, Zhang Y, Hu J, Wang J, Liu S, Yang J, Zhang G, Xiong Y, Li Z, Mao L, Zhou C, Zhu Z, Chen R, Hao B, Zheng W, Chen S, Guo W, Li G, Liu S, Tao M, Wang J, Zhu L, Yuan L, Yang H. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science. 2002;296:79–92. doi: 10.1126/science.1068037. [PubMed] [Cross Ref]
  • Toledo-Ortiz G, Huq E, Quail PH. The Arabidopsis basic/helix-loop-helix transcription factor family. Plant Cell. 2003;15:1749–1770. doi: 10.1105/tpc.013839. [PMC free article] [PubMed] [Cross Ref]
  • Lijavetzky D, Carbonero P, Vicente-Carbajosa J. Genome-wide comparative phylogenetic analysis of the rice and Arabidopsis Dof gene families. BMC Evol Biol. 2003;3:17. doi: 10.1186/1471-2148-3-17. [PMC free article] [PubMed] [Cross Ref]
  • Reyes JC, Muro-Pastor MI, Florencio FJ. The GATA family of transcription factors in Arabidopsis and rice. Plant Physiol. 2004;134:1718–1732. doi: 10.1104/pp.103.037788. [PMC free article] [PubMed] [Cross Ref]
  • Kohler C, Merkle T, Neuhaus G. Characterisation of a novel gene family of putative cyclic nucleotide- and calmodulin-regulated ion channels in Arabidopsis thaliana. Plant J. 1999;18:97–104. doi: 10.1046/j.1365-313X.1999.00422.x. [PubMed] [Cross Ref]
  • Reddy AS, Day IS. Analysis of the myosins encoded in the recently completed Arabidopsis thaliana genome sequence. Genome Biol. 2001;2:RESEARCH0024. doi: 10.1186/gb-2001-2-7-research0024. [PMC free article] [PubMed] [Cross Ref]
  • Reddy AS, Reddy VS, Golovkin M. A calmodulin binding protein from Arabidopsis is induced by ethylene and contains a DNA-binding motif. Biochem Biophys Res Commun. 2000;279:762–769. doi: 10.1006/bbrc.2000.4032. [PubMed] [Cross Ref]
  • Yang T, Poovaiah BW. A calmodulin-binding/CGCG box DNA-binding protein family involved in multiple signaling pathways in plants. J Biol Chem. 2002;277:45049–45058. doi: 10.1074/jbc.M207941200. [PubMed] [Cross Ref]
  • Bouche N, Scharlat A, Snedden W, Bouchez D, Fromm H. A novel family of calmodulin-binding transcription activators in multicellular organisms. J Biol Chem. 2002;277:21851–21861. doi: 10.1074/jbc.M200268200. [PubMed] [Cross Ref]
  • Hedges SB. The origin and evolution of model organisms. Nat Rev Genet. 2002;3:838–849. doi: 10.1038/nrg929. [PubMed] [Cross Ref]
  • Bowe LM, Coat G, dePamphilis CW. Phylogeny of seed plants based on all three genomic compartments: extant gymnosperms are monophyletic and Gnetales' closest relatives are conifers. Proc Natl Acad Sci U S A. 2000;97:4092–4097. doi: 10.1073/pnas.97.8.4092. [PMC free article] [PubMed] [Cross Ref]
  • Yang YW, Lai KN, Tai PY, Li WH. Rates of nucleotide substitution in angiosperm mitochondrial DNA sequences and dates of divergence between Brassica and other angiosperm lineages. J Mol Evol. 1999;48:597–604. [PubMed]
  • Ermolaeva MD, Wu M, Eisen JA, Salzberg SL. The age of the Arabidopsis thaliana genome duplication. Plant Mol Biol. 2003;51:859–866. doi: 10.1023/A:1023001130337. [PubMed] [Cross Ref]
  • Raes J, Vandepoele K, Simillion C, Saeys Y, Van de Peer Y. Investigating ancient duplication events in the Arabidopsis genome. J Struct Funct Genomics. 2003;3:117–129. doi: 10.1023/A:1022666020026. [PubMed] [Cross Ref]
  • Remington DL, Vision TJ, Guilfoyle TJ, Reed JW. Contrasting modes of diversification in the Aux/IAA and ARF gene families. Plant Physiol. 2004;135:1738–1752. doi: 10.1104/pp.104.039669. [PMC free article] [PubMed] [Cross Ref]
  • Tian C, Wan P, Sun S, Li J, Chen M. Genome-wide analysis of the GRAS gene family in rice and Arabidopsis. Plant Mol Biol. 2004;54:519–532. doi: 10.1023/B:PLAN.0000038256.89809.57. [PubMed] [Cross Ref]
  • Cannon SB, Young ND. OrthoParaMap: distinguishing orthologs from paralogs by integrating comparative genome data and gene phylogenies. BMC Bioinformatics. 2003;4:35. doi: 10.1186/1471-2105-4-35. [PMC free article] [PubMed] [Cross Ref]
  • Birchler JA, Bhadra U, Bhadra MP, Auger DL. Dosage-dependent gene regulation in multicellular eukaryotes: implications for dosage compensation, aneuploid syndromes, and quantitative traits. Dev Biol. 2001;234:275–288. doi: 10.1006/dbio.2001.0262. [PubMed] [Cross Ref]
  • Bancroft I. Insights into cereal genomes from two draft genome sequences of rice. Genome Biol. 2002;3:REVIEWS1015. doi: 10.1186/gb-2002-3-6-reviews1015. [PMC free article] [PubMed] [Cross Ref]
  • Paterson AH, Bowers JE, Peterson DG, Estill JC, Chapman BA. Structure and evolution of cereal genomes. Curr Opin Genet Dev. 2003;13:644–650. doi: 10.1016/j.gde.2003.10.002. [PubMed] [Cross Ref]
  • Simillion C, Vandepoele K, Saeys Y, Van de Peer Y. Building genomic profiles for uncovering segmental homology in the twilight zone. Genome Res. 2004;14:1095–1106. doi: 10.1101/gr.2179004. [PMC free article] [PubMed] [Cross Ref]
  • The Institute for Genomic Research (TIGR) http://www.tigr.org
  • Wortman JR, Haas BJ, Hannick LI, Smith RK, Jr., Maiti R, Ronning CM, Chan AP, Yu C, Ayele M, Whitelaw CA, White OR, Town CD. Annotation of the Arabidopsis genome. Plant Physiol. 2003;132:461–468. doi: 10.1104/pp.103.022251. [PMC free article] [PubMed] [Cross Ref]
  • Meyers BC, Kozik A, Griego A, Kuang H, Michelmore RW. Genome-wide analysis of NBS-LRR-encoding genes in Arabidopsis. Plant Cell. 2003;15:809–834. doi: 10.1105/tpc.009308. [PMC free article] [PubMed] [Cross Ref]
  • Arabidopsis Genome Initiative (AGI). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408:796–815. doi: 10.1038/35048692. [PubMed] [Cross Ref]
  • Patthy L. Genome evolution and the evolution of exon-shuffling--a review. Gene. 1999;238:103–114. doi: 10.1016/S0378-1119(99)00228-0. [PubMed] [Cross Ref]
  • Long M. Evolution of novel genes. Curr Opin Genet Dev. 2001;11:673–680. doi: 10.1016/S0959-437X(00)00252-5. [PubMed] [Cross Ref]
  • de Souza SJ, Long M, Klein RJ, Roy S, Lin S, Gilbert W. Toward a resolution of the introns early/late debate: only phase zero introns are correlated with the structure of ancient proteins. Proc Natl Acad Sci U S A. 1998;95:5094–5099. doi: 10.1073/pnas.95.9.5094. [PMC free article] [PubMed] [Cross Ref]
  • Patthy L. Intron-dependent evolution: preferred types of exons and introns. FEBS Lett. 1987;214:1–7. doi: 10.1016/0014-5793(87)80002-9. [PubMed] [Cross Ref]
  • Chaudhary N, McMahon C, Blobel G. Primary structure of a human arginine-rich nuclear protein that colocalizes with spliceosome components. Proc Natl Acad Sci U S A. 1991;88:8189–8193. [PMC free article] [PubMed]
  • Putkey JA, Kleerekoper Q, Gaertner TR, Waxham MN. A new role for IQ motif proteins in regulating calmodulin function. J Biol Chem. 2003;278:49667–49670. doi: 10.1074/jbc.C300372200. [PubMed] [Cross Ref]
  • van Der Luit AH, Olivari C, Haley A, Knight MR, Trewavas AJ. Distinct calcium signaling pathways regulate calmodulin gene expression in tobacco. Plant Physiol. 1999;121:705–714. doi: 10.1104/pp.121.3.705. [PMC free article] [PubMed] [Cross Ref]
  • Pauly N, Knight MR, Thuleau P, van der Luit AH, Moreau M, Trewavas AJ, Ranjeva R, Mazars C. Control of free calcium in plant cell nuclei. Nature. 2000;405:754–755. doi: 10.1038/35015671. [PubMed] [Cross Ref]
  • Xiong TC, Jauneau A, Ranjeva R, Mazars C. Isolated plant nuclei as mechanical and thermal sensors involved in calcium signalling. Plant J. 2004;40:12–21. doi: 10.1111/j.1365-313X.2004.02184.x. [PubMed] [Cross Ref]
  • Anandalakshmi R, Marathe R, Ge X, Herr JM, Jr., Mau C, Mallory A, Pruss G, Bowman L, Vance VB. A calmodulin-related protein that suppresses posttranscriptional gene silencing in plants. Science. 2000;290:142–144. doi: 10.1126/science.290.5489.142. [PubMed] [Cross Ref]
  • Du L, Poovaiah BW. A novel family of Ca2+/calmodulin-binding proteins involved in transcriptional regulation: interaction with fsh/Ring3 class transcription activators. Plant Mol Biol. 2004;54:549–569. doi: 10.1023/B:PLAN.0000038269.98972.bb. [PubMed] [Cross Ref]
  • Perruc E, Charpenteau M, Ramirez BC, Jauneau A, Galaud JP, Ranjeva R, Ranty B. A novel calmodulin-binding protein functions as a negative regulator of osmotic stress tolerance in Arabidopsis thaliana seedlings. Plant J. 2004;38:410–420. doi: 10.1111/j.1365-313X.2004.02062.x. [PubMed] [Cross Ref]
  • Yoo JH, Park CY, Kim JC, Heo WD, Cheong MS, Park HC, Kim MC, Moon BC, Choi MS, Kang YH, Lee JH, Kim HS, Lee SM, Yoon HW, Lim CO, Yun DJ, Lee SY, Chung WS, Cho MJ. Direct interaction of a divergent CaM isoform and the transcription factor, MYB2, enhances salt tolerance in Arabidopsis. J Biol Chem. 2005;280:3697–3706. doi: 10.1074/jbc.M408237200. [PubMed] [Cross Ref]
  • Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–410. doi: 10.1006/jmbi.1990.9999. [PubMed] [Cross Ref]
  • Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389. [PMC free article] [PubMed] [Cross Ref]
  • National Center of Biotechnology Information (NCBI) http://www.ncbi.nlm.nih.gov
  • The Arabidopsis Information Resource (TAIR) http://www.arabidopsis.org
  • Rhee SY, Beavis W, Berardini TZ, Chen G, Dixon D, Doyle A, Garcia-Hernandez M, Huala E, Lander G, Montoya M, Miller N, Mueller LA, Mundodi S, Reiser L, Tacklind J, Weems DC, Wu Y, Xu I, Yoo D, Yoon J, Zhang P. The Arabidopsis Information Resource (TAIR): a model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community. Nucleic Acids Res. 2003;31:224–228. doi: 10.1093/nar/gkg076. [PMC free article] [PubMed] [Cross Ref]
  • Munich Information Center for Protein Sequences (MIPS) Arabidopsis thaliana Database (MATDB) http://mips.gsf.de/proj/thal/db/
  • Arabidopsis thaliana Plant Genome Database (AtPGD) http://www.plantgdb.org
  • Knowledge-based Oryza Molecular biological Encyclopedia (KOME) http://cdna01.dna.affrc.go.jp/cDNA/
  • PHYSCObase http://moss.nibb.ac.jp
  • ExPASy Proteomics Server http://us.expasy.org/
  • Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. [PMC free article] [PubMed]
  • Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997;25:4876–4882. doi: 10.1093/nar/25.24.4876. [PMC free article] [PubMed] [Cross Ref]
  • Swofford D. PAUP*: Phylogenetic analysis using parsimony. Sunderland, MA: Sinauer; 2000.
  • Talke IN, Blaudez D, Maathuis FJ, Sanders D. CNGCs: prime targets of plant cyclic nucleotide signalling? Trends Plant Sci. 2003;8:286–293. doi: 10.1016/S1360-1385(03)00099-2. [PubMed] [Cross Ref]
  • Yamada K, Lim J, Dale JM, Chen H, Shinn P, Palm CJ, Southwick AM, Wu HC, Kim C, Nguyen M, Pham P, Cheuk R, Karlin-Newmann G, Liu SX, Lam B, Sakano H, Wu T, Yu G, Miranda M, Quach HL, Tripp M, Chang CH, Lee JM, Toriumi M, Chan MM, Tang CC, Onodera CS, Deng JM, Akiyama K, Ansari Y, Arakawa T, Banh J, Banno F, Bowser L, Brooks S, Carninci P, Chao Q, Choy N, Enju A, Goldsmith AD, Gurjal M, Hansen NF, Hayashizaki Y, Johnson-Hopson C, Hsuan VW, Iida K, Karnes M, Khan S, Koesema E, Ishida J, Jiang PX, Jones T, Kawai J, Kamiya A, Meyers C, Nakajima M, Narusaka M, Seki M, Sakurai T, Satou M, Tamse R, Vaysberg M, Wallender EK, Wong C, Yamamura Y, Yuan S, Shinozaki K, Davis RW, Theologis A, Ecker JR. Empirical analysis of transcriptional activity in the Arabidopsis genome. Science. 2003;302:842–846. doi: 10.1126/science.1088305. [PubMed] [Cross Ref]
  • Meyers BC, Vu TH, Tej SS, Ghazal H, Matvienko M, Agrawal V, Ning J, Haudenschild CD. Analysis of the transcriptional complexity of Arabidopsis thaliana by massively parallel signature sequencing. Nat Biotechnol. 2004;22:1006–1011. doi: 10.1038/nbt992. [PubMed] [Cross Ref]

Articles from BMC Evolutionary Biology are provided here courtesy of BioMed Central