Plant Physiol. Aug 2006; 141(4): 1167–1184.
PMCID: PMC1533929

Genome-Wide Analysis of Basic/Helix-Loop-Helix Transcription Factor Family in Rice and Arabidopsis

Abstract

The basic/helix-loop-helix (bHLH) transcription factors and their homologs form a large family in plant and animal genomes. They are known to play important roles in the specification of tissue types in animals. On the other hand, few plant bHLH proteins have been studied functionally. Recent completion of whole genome sequences of model plants Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) allows genome-wide analysis and comparison of the bHLH family in flowering plants. We have identified 167 bHLH genes in the rice genome, and their phylogenetic analysis indicates that they form well-supported clades, which are defined as subfamilies. In addition, sequence analysis of potential DNA-binding activity, the sequence motifs outside the bHLH domain, and the conservation of intron/exon structural patterns further support the evolutionary relationships among these proteins. The genome distribution of rice bHLH genes strongly supports the hypothesis that genome-wide and tandem duplication contributed to the expansion of the bHLH gene family, consistent with the birth-and-death theory of gene family evolution. Bioinformatics analysis suggests that rice bHLH proteins can potentially participate in a variety of combinatorial interactions, endowing them with the capacity to regulate a multitude of transcriptional programs. In addition, similar expression patterns suggest functional conservation between some rice bHLH genes and their close Arabidopsis homologs.

Since the discovery of the basic/helix-loop-helix (bHLH) motif with DNA-binding and dimerization capabilities (Murre et al., 1989), members of the bHLH protein superfamily have been found to have an ever-increasing number of functions in essential physiological and developmental processes in animals and, to a lesser extent, plants (Quail, 2000; Ledent and Vervoort, 2001; Toledo-Ortiz et al., 2003; Sonnenfeld et al., 2005). The bHLH domain contains approximately 60 amino acids, with two functionally distinct regions, the basic region and the HLH region. The basic region is located at the N terminus of the bHLH domain and functions as a DNA-binding motif. It consists of approximately 15 amino acids, which typically include six basic residues (Atchley et al., 1999). The HLH region contains two amphipathic α helices with a linking loop of variable length; the amphipathic α helices of two bHLH proteins can interact, allowing the formation of homodimers or heterodimers (Murre et al., 1989; Ellenberger et al., 1994; Nesi et al., 2000). Some bHLH proteins have been shown to bind to sequences containing a consensus core element called the E box (5′-CANNTG-3′), with the G box (5′-CACGTG-3′) being the most common form. In addition, the nucleotides flanking the core element may also have a role in binding specificity (Atchley et al., 1999; Martinez-Garcia et al., 2000; Massari and Murre, 2000; Robinson et al., 2000).
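The E-box and G-box core elements described above can be located in a DNA sequence with a simple pattern search. The following is a minimal sketch, not part of the original analysis; the function name find_e_boxes and the example promoter sequence are hypothetical, and only the given strand is scanned.

```python
import re

# E box: 5'-CANNTG-3'; the G box 5'-CACGTG-3' is one specific E box.
# A lookahead is used so that overlapping matches are also reported.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_e_boxes(seq):
    """Return (start, motif, is_g_box) for each E box on the given strand."""
    seq = seq.upper()
    hits = []
    for match in E_BOX.finditer(seq):
        motif = match.group(1)
        hits.append((match.start(), motif, motif == "CACGTG"))
    return hits

# Hypothetical sequence, for illustration only.
promoter = "TTCACGTGAACAGCTGTT"
for start, motif, is_g_box in find_e_boxes(promoter):
    print(start, motif, "G box" if is_g_box else "other E box")
```

Because every E box is palindromic at its outer positions (CA...TG), the reverse complement of an E box is again an E box, so scanning one strand finds the same sites as scanning both.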

According to their phylogenetic relationships, DNA-binding motifs, and functional properties, known bHLH proteins from animals have been divided into six main groups (named as group A to F; Atchley and Fitch, 1997). Group A bHLH proteins include Atonal, D, Delilah, dHand, E12, Hen, Lyl, MyoD, and Twist; they can bind to the E-box sequence CAGCTG. In group B, a number of proteins have diverse functions and bind to the G-box sequence CACGTG; examples of this group include Max, Myc, MITF, SREBP, and USF (Henriksson and Luscher, 1996; Facchini and Penn, 1998; Goding, 2000). Members of Group C contain an additional protein-protein interaction region (the PAS domain) and bind to sequences (NACGTG or NGCGTG) that are unlike the E box. Proteins in group D have the HLH region but lack the basic region; they can form heterodimers with bHLH proteins, thus are functionally related to typical bHLH proteins (Sun et al., 1991). Group E includes E(spl), Gridlock, Hairy, and Hey (Ledent and Vervoort, 2001); these proteins have Pro or Gly residues within the basic region and can bind to the sequence CACGNG preferentially (Fisher and Caudy, 1998; Steidl et al., 2000). Group F consists of COE-bHLH proteins; they have divergent sequences compared with other groups and another domain for dimerization and DNA binding (Crozatier et al., 1996; Fisher and Caudy, 1998; Ledent and Vervoort, 2001).

Compared to animals, only a small number of plant bHLH proteins have been characterized functionally. In Arabidopsis (Arabidopsis thaliana), a model for flowering plants (particularly eudicots), 162 bHLH-encoding genes have been identified from the analysis of genome sequences (Bailey et al., 2003; Heim et al., 2003; Toledo-Ortiz et al., 2003). The members of the Arabidopsis bHLH family have been divided into 21 subfamilies by Toledo-Ortiz et al. (2003). Buck and Atchley (2003) also performed a phylogenetic analysis of the bHLH family in plants. Their study found that a total of 295 bHLH genes, including 118 from Arabidopsis, 131 from rice (Oryza sativa), and 46 from other plants, could be grouped into 15 separate clades. Sequence analysis suggests that most of the plant bHLH proteins belong to Group B (Atchley and Fitch, 1997). In addition, recent reports have demonstrated that some plant bHLH proteins can interact with proteins that lack a bHLH domain. In particular, protein complexes with MYB, bHLH, and WD40 proteins were proposed to regulate guard cell and root hair differentiation (Ramsay and Glover, 2005).

Rice is one of the most important food crops in the world and it has been used as a major model species in plant (especially monocot) functional genomics research because of its relatively small genome size (approximately 390 Mb) and synteny with other cereal genomes (Gale and Devos, 1998). More than 95% of the rice (japonica cultivar Nipponbare) genome has been sequenced by the International Rice Genome Sequencing Project (http://rgp.dna.affrc.go.jp/cgi-bin/statusdb/irgsp-status.cgi; data from April, 2006). The rice bHLH gene family (OsbHLH genes) has not been analyzed in detail, and the phylogenetic relationship with other plant bHLH genes remains poorly understood. In this study, we identified 167 OsbHLH genes from the rice genomic sequence and carried out phylogenetic analyses to understand the relationships among these rice genes. Furthermore, we identified some of the duplication events that likely contributed to the expansion of the bHLH family. Phylogenetic analysis of rice and Arabidopsis bHLH genes allowed the identification of both shared and specific subfamilies and estimated the number of bHLH genes in the most recent common ancestor (MRCA) of rice and Arabidopsis, as well as potential gene birth-and-death events. Moreover, analysis of intron number and location provides evidence for numerous independent intron loss events in the bHLH family. Finally, expression studies indicate that bHLH proteins exhibit a variety of expression patterns, suggesting diverse functions.

RESULTS AND DISCUSSION

Identification of 167 OsbHLH Genes

To obtain sequences of bHLH genes in the rice genome, we used the criteria developed by Atchley et al. (1999) to define a bHLH protein. Briefly, the bHLH motif contains 19 conserved amino acids, five amino acids in the basic region, five in the first helix, one in the loop, and eight in the second helix (Atchley et al., 1999). Using TBLASTN against the rice genome database, we obtained all putative bHLH proteins that had more than 11 conserved amino acids among the 19 residues. In addition, because members of group C and group D do not have the typical basic region (Murre et al., 1994; Swanson et al., 1995; Bailey et al., 2003) and can interact with some bHLH proteins (Sun et al., 1991), we identified putative HLH proteins that had nine or 10 amino acids that matched the 14 amino acids in the HLH region. Another family called TCP family also has a bHLH structure (Kosugi and Ohashi, 1997), but the structure and DNA-binding specificity of the TCP motif are different from those of the bHLH motif. Therefore, the TCP family is not studied in this article.

Initially, we used the bHLH domain (64 amino acids) encoded by a putative rice bHLH gene (GenBank number XM_463907) as a BLAST query to identify a large number of candidate bHLH sequences in The Institute for Genomic Research (TIGR) database, because this sequence fit the bHLH motif best among the known rice bHLH proteins. Because of the sequence variation among known bHLH domains, to detect additional possible bHLH domain sequences we used position-specific iterated BLAST to search the database of TIGR (version 4, 2006). Subsequently, TBLASTN was used to remove redundant sequences of candidate bHLH genes according to their corresponding sequencing bacterial artificial chromosome clone serial numbers and their chromosome locations, resulting in 167 OsbHLH genes (Table I; Supplemental Table I). The number designation of the OsbHLH genes was based on the order of the multiple sequence alignment (Supplemental Fig. 1), and the synonymy between the names of these 167 OsbHLH genes and the 131 rice genes previously reported by Buck and Atchley (2003) is shown in Supplemental Table I. Among the rice bHLH genes reported by Buck and Atchley (2003), 45 genes were from the GenBank database, and the others were predicted rice genes with temporary designations (Yu et al., 2002). Compared with other transcription factor gene families in rice and Arabidopsis, the bHLH gene family is one of the largest, with fewer members than only the MYB family (Xiong et al., 2005).

Table I.
LOC (location) number of OsbHLH members

To verify the reliability of our criteria, we performed simple modular architecture research tool (SMART) analysis of the 167 putative OsbHLH protein sequences and found that 164 proteins had a typical bHLH domain and three, OsbHLH157, OsbHLH160, and OsbHLH161, contained a predicted HLH domain with low confidence values. In addition, the OsbHLH026 protein unexpectedly had two HLH domains predicted by SMART, and their E values were 8.46E-13 and 1.56E-02, respectively. The amino acid sequences of these two bHLH domains were 76% similar, with 14 identical amino acids among the 17 amino acids of the basic region and a predicted binding activity to the G box. To date, five Caenorhabditis elegans proteins have been reported to have two bHLH domains (Ledent et al., 2002), but the degrees of sequence similarity between the two bHLH domains in the same protein are not as high as that in OsbHLH026. The biological functions of this kind of bHLH protein remain to be elucidated.

Multiple Sequence Alignments, Predicted DNA-Binding Ability, and Conserved Residues

To examine sequence features of these rice bHLH domains, we performed multiple sequence alignment of the 167 rice bHLH amino acid sequences (Supplemental Fig. 1). On average, the basic regions (the N-terminal 17 positions; Supplemental Fig. 1) of OsbHLH domains have 5.7 basic residues, even though 26 of these proteins did not have the basic region. Within subsets of OsbHLH domains, there is further conservation of nonbasic residues in the basic region, as well as in the two helices and in a C-terminal region of the second helix (Supplemental Fig. 1). In contrast, the loop was the most divergent region in terms of both length (ranging from 3–18 amino acids) and amino acid composition. From the alignment, we identified 19 residues that are identical in at least 50% of the 167 rice bHLH domains (Supplemental Fig. 1, indicated at the bottom of the alignment). Figure 1 shows the distribution of amino acid residues at the 19 positions of the consensus motif of the bHLH domain, including the results from two previous reports (Atchley et al., 1999; Toledo-Ortiz et al., 2003) and the results of OsbHLHs from this study. Generally, the distributions of conserved amino acids among the bHLH domains of both rice and Arabidopsis were very similar, but quite different from that of the animal bHLHs (Fig. 1), as expected from the evolutionary distances for bHLHs among plants or between plants and animals. Eleven of the 19 conserved residues were included in the consensus motif used for identifying the bHLH family members (Glu-13, Arg-14, Arg-16, Asn-21, Leu-27, Lys-39, Leu-55, Ala-58, Ile-59, Tyr-61, and Leu-65 in our alignment [Atchley et al., 1999; Toledo-Ortiz et al., 2003]), whereas the other eight were not included in the consensus motif (Arg-17, Leu-30, Val-31, Pro-32, Asp-50, Ala-52, Ser-53, and Lys-63 in our alignment). Glu-13 and Arg-17 play very important roles in DNA binding (Atchley and Fitch, 1997). 
Although bHLH proteins have the potential to form homodimers or heterodimers, little is known about the residues important for dimerization (Shirakata et al., 1993), except in the case of the mammalian Max protein, which requires Leu-27 (Figs. 1 and 2) for dimer formation (Brownlie et al., 1997). Leu-27 was conserved in all 167 OsbHLH proteins, suggesting that this residue also is extremely important for dimerization or other functions of OsbHLH proteins.

Figure 1.
Distribution of amino acids in the bHLH consensus motif. In columns labeled a, percentages refer to the 392 bHLH proteins analyzed by Atchley et al. (1999). In columns labeled b, percentages refer to the 147 AtbHLH proteins identified by Toledo-Ortiz ...
Figure 2.
Predicted DNA-binding characteristics of the bHLH domain of OsbHLH and AtbHLH proteins. The asterisk (*) indicates that the data for AtbHLHs were from Toledo-Ortiz et al. (2003), and the figure is modeled after table III in Toledo-Ortiz et al. ...

The basic region of the bHLH domain has the ability to bind to DNA and is critical for function (Massari and Murre, 2000). Using the criteria described by Massari and Murre (2000), the OsbHLH proteins were divided into several categories according to sequence information in the N-terminal region of the bHLH domain (Fig. 2). The distribution of the predicted DNA-binding properties, as described below, across various phylogenetic subfamilies is indicated with shaped markers in the phylogenetic tree (Fig. 3). The OsbHLH proteins were divided into two major groups according to the 17 N-terminal amino acids within the bHLH domain: (1) a large group of 141 bHLH proteins containing five to eight basic residues in the basic region were predicted to bind to DNA, and (2) a smaller group of 26 HLH proteins lacking the basic region were thought to lack DNA-binding ability (Fig. 2), as previously done for Arabidopsis bHLH proteins (Toledo-Ortiz et al., 2003). These HLH proteins might be similar to the animal ID-HLH proteins, functioning as negative regulators of E-box-binding bHLHs through the formation of heterodimers (Fairman et al., 1993). The DNA-binding bHLHs in the first group were further divided into two groups with different predicted target sequences depending on the presence or absence of residues Glu-13 and Arg-16 in the basic region (Figs. 1 and 2). Group (1A) has 114 putative E-box-binding proteins with the conserved Glu-13/Arg-16 residues and Group (1B) has 27 non-E-box-binding proteins lacking these residues (Fig. 2). Within Group (1A), OsbHLH062 is an exception because it does not have Arg-16; nevertheless, we placed OsbHLH062 in this group because animal proteins such as SREBP missing the Arg-16 can also bind to E box (Hua et al., 1993). It is known that the three residues in the basic region of the bHLH domain, His/Lys-9, Glu-13, and Arg-17, constitute the classic G-box-binding region (Massari and Murre, 2000).
Therefore, we can subdivide Group (1A) of 114 predicted E-box-binding bHLH proteins into two subgroups: (1A1) with 95 members predicted to bind G boxes, and (1A2) with 19 members predicted to bind other types of E boxes (non-G-box binders; Fig. 2).
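The decision rules in this classification can be written as a short function. This is a minimal sketch under stated assumptions rather than the procedure actually used: the function name classify_bhlh and its input format are hypothetical, Lys, Arg, and His are assumed to count as basic residues, fewer than five basic residues is taken to mean "lacking the basic region", and the manual exception made for OsbHLH062 is not encoded.

```python
BASIC_RESIDUES = set("KRH")  # assumption: Lys, Arg, and His count as basic

def classify_bhlh(basic_region, residues):
    """Predict the DNA-binding category of one bHLH domain.

    basic_region: the 17 N-terminal residues of the domain as a string.
    residues: dict mapping alignment position to amino acid, e.g. {13: "E"}.
    """
    n_basic = sum(aa in BASIC_RESIDUES for aa in basic_region)
    if n_basic < 5:  # assumed threshold for "lacking the basic region"
        return "non-DNA-binding HLH"
    # Group 1A requires both Glu-13 and Arg-16; otherwise Group 1B.
    if not (residues.get(13) == "E" and residues.get(16) == "R"):
        return "non-E-box binder"
    # G-box binders additionally carry His/Lys-9 and Arg-17.
    if residues.get(9) in ("H", "K") and residues.get(17) == "R":
        return "G-box binder"
    return "non-G-box E-box binder"

# Consistency check of the reported category sizes.
assert 95 + 19 == 114    # G-box binders + other E-box binders
assert 114 + 27 == 141   # E-box binders + non-E-box binders
assert 141 + 26 == 167   # DNA-binding bHLHs + HLH proteins

# Hypothetical basic region, for illustration only.
print(classify_bhlh("AAKRRHAEAERARAAAA", {9: "H", 13: "E", 16: "R", 17: "R"}))
```

The nested checks mirror the hierarchy in the text: each subgroup is defined only within its parent group, so the order of the tests matters.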

Figure 3.
NJ phylogenetic tree of the OsbHLH members. This tree indicates the predicted DNA-binding activities, the intron distribution pattern, and the conservative sequence out of the bHLH domain. The unrooted tree, constructed using MEGA 3.0, summarizes the ...

Phylogenetic Analysis of the OsbHLH Genes

To obtain clues about the evolutionary history of the OsbHLH genes, a neighbor-joining (NJ) phylogenetic tree was generated using the multiple sequence alignments of the OsbHLH protein sequences with bootstrap analysis (1,000 replicates). The position of the bHLH domain and any conserved sequence motifs outside of the bHLH domain are shown in Figure 3. We subdivided the 167 members of the OsbHLH family into 22 subfamilies, designated A to V, according to clades with at least 50% bootstrap support. In addition, we noted that most of the members in the same subfamilies shared one or more motifs outside the bHLH domain, further supporting the subfamily definition. A total of 40 motifs outside of the bHLH domain were discovered (Supplemental Table III). However, most of these motifs have not been characterized except Leu-ZIP shared by the members of subfamily R; Leu-ZIP is known as a motif involved in protein dimerization (Tong et al., 1997; Paris et al., 2003). Similarly, sequence analyses of other transcription factor families such as ERF (Nakano et al., 2006b) and WRKY (Eulgem et al., 2000) also identified motifs outside of their conserved domains. The fact that internal nodes of this tree had low bootstrap support is similar to the phylogenetic analysis of bHLH proteins in other organisms and is likely due to the fact that the bHLH domain is relatively short and members within a subfamily are highly conserved, with relatively few informative character positions. To further test the reliability of the NJ tree, maximum parsimony analysis was also used to generate a phylogenetic tree (Supplemental Fig. 2), and 93% of the OsbHLH proteins were placed into the same subfamilies as those in the NJ tree, indicating that both methods are in very good agreement. As described below, the subfamilies defined on the basis of the phylogenetic analysis are also supported by additional studies.

Intron/Exon Structure within the OsbHLH Domains

The pattern of intron positions can also provide important evidence to support phylogenetic relationships in a gene family. Among 167 rice bHLH genes, the number of introns ranged from zero to four, with 87.4% of these 167 genes having intron(s) in the bHLH domain; these genes can be grouped into 10 patterns of intron presence and positions (Fig. 4A, I–III, V–X, and XII). Among these 10 patterns, the most common ones had one or more introns at three highly conserved positions (indicated by white inverted triangles), accounting for 82.0% of the 167 genes (Fig. 4A, I–III, V and VI). The remaining patterns had introns at varying positions (patterns VII–X, and XII) and were observed in only 5.4% of the 167 genes. Furthermore, we investigated intron phases with respect to codons. An intron was designated as occurring in one of three phases: in phase 1, splicing occurred after the first nucleotide of the codon; in phase 2, splicing occurred after the second nucleotide; and in phase 0, splicing occurred after the third nucleotide of the codon (Sharp, 1981). Figure 4A showed that all of the introns with conserved positions also had identical phases. All of the introns at the three conserved positions (indicated by white inverted triangles) had phase 0. The other introns with less conserved positions (black inverted triangles) were in phase 0, 1, or 2 (Fig. 4A, VII–X). Therefore, the splicing phase was highly conserved during the evolution of bHLH genes and supported the subfamily designation here. Such conserved splicing phase was also observed in the MYB gene families of rice and Arabidopsis (Jiang et al., 2004).
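The phase definition above reduces to a remainder calculation on the number of coding nucleotides that precede the intron. The sketch below illustrates this; the function name intron_phase and the example offsets are hypothetical and are not taken from the OsbHLH data.

```python
def intron_phase(coding_nt_upstream):
    """Return the phase (0, 1, or 2) of an intron.

    coding_nt_upstream: number of coding-sequence nucleotides, counted from
    the first base of the start codon, that lie 5' of the intron.
    """
    if coding_nt_upstream < 0:
        raise ValueError("nucleotide count must be non-negative")
    # Phase 0: the intron falls between two complete codons.
    # Phase 1: it follows the first nucleotide of a codon.
    # Phase 2: it follows the second nucleotide of a codon.
    return coding_nt_upstream % 3

# Hypothetical offsets, for illustration only.
for offset in (66, 67, 68):
    print(offset, "->", "phase", intron_phase(offset))
```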

Figure 4.
Intron distribution within the bHLH domains of the OsbHLH and AtbHLH proteins. A, Scheme of the intron distribution patterns (color coded and designated I–XII) within the bHLH domains of the OsbHLH proteins. The white triangles are used when the ...

Exons with the same splicing phase at both 5′ and 3′ ends are called symmetric exons. According to the intron-early theory (Gilbert, 1987), an excess of phase 0 introns and symmetric exons (with the same phase on both ends) may facilitate exon shuffling by avoiding interruptions of the open reading frame (ORF) and facilitating recombinational fusion and exchange of protein domains (Patthy, 1987). Among the 271 introns analyzed here, 258 had phase 0, including all introns with conserved positions, whereas only four were in phase 1 and nine in phase 2. Among the 125 exons flanked by introns in the OsbHLH domain, 120 exons were symmetric with phase 0 introns, and only five were asymmetric with different splicing phase at 5′ end and 3′ end, respectively. Therefore, the analysis of bHLH genes provides strong support for the intron-early theory.
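The exon symmetry rule and the reported tallies can be checked with a few lines of code. This is an illustrative sketch only; the function names are hypothetical, and the counts are those quoted in the paragraph above.

```python
def exon_is_symmetric(phase_5prime, phase_3prime):
    """An exon is symmetric when both flanking introns share one phase."""
    return phase_5prime == phase_3prime

def downstream_phase(upstream_phase, exon_length):
    """Phase of the intron after an exon, given the phase before it."""
    return (upstream_phase + exon_length) % 3

# Counts reported in the text.
intron_counts = {0: 258, 1: 4, 2: 9}
exon_counts = {"symmetric": 120, "asymmetric": 5}

total_introns = sum(intron_counts.values())
total_exons = sum(exon_counts.values())
print("phase 0 introns:", round(100 * intron_counts[0] / total_introns, 1), "%")
print("symmetric exons:", round(100 * exon_counts["symmetric"] / total_exons, 1), "%")
```

A consequence of the second function is that a symmetric exon always has a length divisible by three, which is why inserting or removing such an exon leaves the downstream reading frame intact.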

Genome Distribution of OsbHLH Genes

To determine the genomic distribution of the OsbHLH genes, the DNA sequence of each OsbHLH gene was used to search the rice genome database using BLASTN. Although each of the 12 rice chromosomes contains some OsbHLH genes, the distribution seems to be uneven (Fig. 5). Relatively high densities of bHLH genes were observed in some chromosomal regions, including the top and bottom of chromosomes 1, 2, and 3, and the bottom of chromosomes 4, 8, and 9. In particular, 17 OsbHLH members are located on the long arm of chromosome 4. In contrast, several large chromosomal regions lacked bHLH genes, such as the top half of chromosomes 4 and 9 and the central sections of chromosomes 7, 8, 11, and 12. Fourteen OsbHLH gene clusters were identified by members with high levels of sequence similarity (Fig. 5); for instance, the entire protein sequences of OsbHLH081 and OsbHLH082 share 75% similarity, and OsbHLH013 and OsbHLH015 are 68% similar (Fig. 5, linked with red line).

Figure 5.
Chromosomal locations, region duplication, and predicted cluster for OsbHLH genes. Chromosomal positions of the OsbHLH genes are indicated by OsbHLH number (assigned in Table I). The scale is in megabases (Mb). The numbers below the name of the chromosome ...

Genome duplication events are thought to have occurred throughout the process of plant evolution (Kent et al., 2003; Cannon et al., 2004; Mehan et al., 2004). To detect a possible relationship between the OsbHLH genes and potential genome duplications, we identified 40 pairs of OsbHLH genes that are close paralogs in the same subfamily (Fig. 5, blue lines; the alignments of these putative duplicated genes are in Supplemental Fig. 4). These genes represent 48% of the OsbHLH family and might have evolved from putative rice genome duplication events, because multiple pairs link each of at least 14 potential chromosomal/segmental duplications (Fig. 5, pairs of bars with numbers 1–14). In contrast, no intrachromosomal duplication event was suggested by the duplication of OsbHLH genes. The putative chromosomal duplications are similar to the predicted segmental duplications of the transcription factor-encoding genes in the rice genome (Guyot and Keller, 2004; Xiong et al., 2005). Interestingly, four pairs of bHLH genes on the long arms of chromosomes 1 and 9 are close paralogs (Fig. 5, linked in green lines) in two regions that had not been proposed to be the result of rice genome duplication. Additionally, 30 OsbHLH genes (18%) might have been involved in local and tandem duplications (Fig. 5, linked with red line). This number is much smaller than the number of genes involved in polyploidy duplication, so polyploidy duplication might have played a key role in the expansion of the bHLH gene family in rice.
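The percentages quoted for the two duplication modes follow directly from the gene counts. The short check below is a sketch that assumes no gene belongs to more than one of the 40 paralog pairs, so that 40 pairs correspond to 80 genes; that assumption is not stated in the text.

```python
TOTAL_GENES = 167
paralog_pairs = 40   # close paralog pairs linked to segmental duplications
tandem_genes = 30    # genes possibly involved in local/tandem duplications

# Assumption: each gene appears in at most one pair.
segmental_genes = 2 * paralog_pairs

def percent_of_family(n_genes, total=TOTAL_GENES):
    """Share of the family, rounded to the nearest whole percent."""
    return round(100 * n_genes / total)

print("segmental:", percent_of_family(segmental_genes), "%")
print("tandem:", percent_of_family(tandem_genes), "%")
```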

Additional evidence for a common origin of closely related bHLHs came from the intron position patterns in the bHLH domain. As shown in Figure 5, the genes related by putative duplications shared conserved intron position pattern. Only a few pairs of the probable duplicated genes, e.g. OsbHLH104 and OsbHLH152, in the same subfamily had different intron distribution patterns, which can be explained by a loss (or gain) of intron following the duplication event.

Comparative Analyses of Rice and Arabidopsis bHLH Genes

Using the alignment of the bHLH domain amino acid sequences of OsbHLHs and AtbHLHs (Supplemental Fig. 3), a phylogenetic tree was constructed (Fig. 6). Because of the large number of taxa and relatively small number of characters, the bootstrap values of internal nodes were low; nevertheless, the outer nodes had more credible bootstrap values, allowing for clustering of the bHLH genes of rice and Arabidopsis into 25 subfamilies (A–Y). In addition, our analysis of the OsbHLH gene family (Fig. 3, A–V) and the AtbHLH result (1–21) of Toledo-Ortiz et al. (2003) were also taken into consideration in the subfamily classification (Fig. 6). For example, subfamily 19 in Arabidopsis (Toledo-Ortiz et al., 2003) was divided into subfamily A and B in this study because of the bootstrap values. In the phylogenetic study by Buck and Atchley (2003) of 131 rice genes, 75 were grouped into 15 subfamilies, with 32 shown in their NJ tree, but the remaining 56 genes were not included in the phylogenetic result because of low statistical support. To compare our results with theirs, we labeled the clades they defined in the tree shown in Figure 6. Most of the large subfamilies, i.e. A to F, M, N, P, R to V, X, and Y, are also supported by the work of Buck and Atchley (2003) and had good bootstrap values (Fig. 6); some of these subfamilies contain new genes that we have included due to new rice genome sequence information. On the other hand, some small subfamilies, i.e. G to L, O, Q, and W did not appear in the NJ tree reported by Buck and Atchley (2003). These new subfamilies include new rice sequences that have been revealed after the completion of the rice genome sequence.

Figure 6.
NJ phylogenetic tree of the AtbHLH and OsbHLH domains and expression patterns for Arabidopsis and rice bHLH genes from RT-PCR, microarray, and EST data. The letter R above the column of expression data refers to root, S refers to stem, L refers to leaf, ...

Moreover, intron position patterns of the OsbHLHs were also consistent with the phylogenetic subfamilies defined in Figure 3. For instance, the members in subfamily A had the same intron distribution pattern, and so did members of the subfamilies G, I, K, L, N, O, P, Q, S, U, and V (Fig. 4B). Members of other subfamilies had the same intron distribution pattern with one or two exceptions. In addition, the intron/exon position pattern shown in Figure 4A agreed with the evolutionary relationship between OsbHLHs and AtbHLHs (see below; Fig. 6). There were 12 different groups of intron position patterns among OsbHLH and AtbHLH domains. Nine of the patterns are shared by the genes from both rice and Arabidopsis, although patterns IV and XI are found only among Arabidopsis genes, and pattern VII was only present in one rice gene. The nonconserved patterns shared by AtbHLH and OsbHLH showed that most of the intron patterns existed in the ancestor of monocots and eudicots. The percentages of each pattern in AtbHLHs and OsbHLHs were quite close; e.g. pattern I was found in 32.3% of OsbHLH members and 28.6% of AtbHLH members. The Arabidopsis bHLH introns also had identical splicing phase to those of subfamily members in OsbHLH domains.

The gene structures in terms of intron position and length are also displayed in Figure 6 to provide further clues about the evolutionary relationships among OsbHLHs and AtbHLHs. Most members in the same subfamilies had similar intron/exon structure. For example, members of subfamily P had only one intron with similar lengths. The fact that they not only had similar coding sequences but also very similar intron/exon structure supports their close evolutionary relationship and membership in the same subfamily. We also examined the sizes of introns and found that most members of the same bHLH subfamily had similar intron patterns, and that in some cases the intron sizes were similar as well; e.g. the members of subfamily K each have a single intron, of 99 bp (OsbHLH149), 123 bp (OsbHLH150), and 130 bp (OsbHLH151), respectively (Fig. 6). Approximately 73% of the introns within the bHLH domains of OsbHLHs were shorter than 300 bp, and the remaining introns in the bHLH domain were longer than 300 bp. Although rice and Arabidopsis bHLH genes share many similarities, there were a few differences in intron patterns between the OsbHLH and AtbHLH domains. The length of one intron within the bHLH domain of two OsbHLHs (OsbHLH064 and OsbHLH076) was more than 5 kb (Fig. 6), but no AtbHLH gene had such a long intron. This might have resulted from insertion(s) into introns in OsbHLH family members after the divergence of monocots and eudicots. A similar explanation would be reasonable for the fact that a protein with two bHLH domains was observed in the rice genome, but not in Arabidopsis.

Overall, combining the bHLH domain intron patterns and the bHLH subfamilies, we can recognize two major categories of intron patterns that correspond to two major groups of subfamilies. The first one includes subfamilies A to K and W, mainly with members that have intron pattern I, which has introns 1, 2, and 3 (Fig. 4B). Other intron patterns for members of these families can be explained by loss of specific introns in different lineages, starting from pattern I. For example, in subfamily B, intron pattern II with introns 1 and 2, III with introns 1 and 3, IV with introns 2 and 3, and V with intron 1 might have arisen by the loss of one or two introns. The other category, including subfamilies L to V, X, and Y, mainly consists of members with intron pattern VI, which has only intron 2 (Fig. 4B). Members of some subfamilies without an intron in the bHLH domain, such as subfamilies A, W, H, I, and N, might have lost all introns. The intron position patterns in the bHLH family strongly support the hypothesis that introns have been lost independently multiple times.

We have also observed that some members of different bHLH subfamilies were located within the same small chromosomal region, whereas some members of the same subfamily were distributed in different chromosomal regions, suggesting that bHLH genes were distributed widely in the genome of the common ancestor of monocots and eudicots. The phylogenetic tree of Arabidopsis and rice bHLH genes (Fig. 6) provides a way to estimate the number of bHLH genes in the MRCA. There were 45 branches with bootstrap values of 50 or greater that included both Arabidopsis and rice bHLH members, 11 branches had only Arabidopsis members, and 10 branches had only rice members. This result suggests that there were at least 66 bHLH genes in the MRCA of monocots and eudicots. Furthermore, the phylogenetic analysis provides evidence for birth-and-death evolution (Nei et al., 1997, 2000) in the flowering plant bHLH gene family. Two other models of gene family evolution have been proposed, the divergent evolution and concerted evolution models (Nei and Rooney, 2005). The divergent evolution model fits the hemoglobin gene family well (Ingram, 1961), but does not fit a gene family like bHLH because sequence similarities among members of the rice bHLH family (Fig. 6) are higher than those between rice and Arabidopsis. Concerted evolution was proposed to explain the evolution of rRNAs because a large number of tandemly repeated genes were found (Brown et al., 1972). However, the bHLH family includes some pseudogenes that have a stop codon in the bHLH domain (excluded from our study; data not shown) or whose expression cannot be detected (Fig. 6), which cannot be explained by concerted evolution. In contrast, the phylogenetic tree of the bHLH family fits well with the model of birth-and-death evolution.
The branches with more than one AtbHLH or OsbHLH gene had likely experienced gene birth due to gene duplication events, for example, OsbHLH116, OsbHLH135, OsbHLH136, and OsbHLH137 should be created by tandem duplication, and OsbHLH035 and OsbHLH036 should be involved in polyploidy duplication, whereas those branches with only Arabidopsis or rice bHLH members probably had gene deaths. Our results indicate that the birth rate seemed greater than the death rate in flowering plant bHLH family. Similarly, other large transcription factor families often experience birth-and-death evolution, such as the MADS-box gene family in both rice and Arabidopsis (Nam et al., 2003, 2004).
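The lower-bound estimate for the MRCA amounts to counting well-supported branches by species composition. The following Python sketch illustrates that bookkeeping under stated assumptions: the clade list is a made-up toy input with placeholder gene names and bootstrap values, and the function names are hypothetical. Only the three reported counts (45, 11, and 10) come from the text.

```python
from collections import Counter


def clade_category(genes):
    """Classify a clade by which species its members come from."""
    has_at = any(g.startswith("AtbHLH") for g in genes)
    has_os = any(g.startswith("OsbHLH") for g in genes)
    if has_at and has_os:
        return "both"
    if has_at:
        return "Arabidopsis_only"
    if has_os:
        return "rice_only"
    raise ValueError("clade contains no AtbHLH or OsbHLH member")


def count_supported_clades(clades, min_bootstrap=50):
    """Count clades with bootstrap support >= min_bootstrap, by category.

    Each supported clade is taken to represent at least one ancestral gene,
    so the total is a lower bound on the ancestral gene number.
    """
    counts = Counter(
        clade_category(genes)
        for support, genes in clades
        if support >= min_bootstrap
    )
    return sum(counts.values()), counts


# Toy input: (bootstrap value, member genes). Names and values are invented.
toy_clades = [
    (92, ["AtbHLH_x1", "OsbHLH_x1", "OsbHLH_x2"]),
    (61, ["AtbHLH_x2", "AtbHLH_x3"]),
    (35, ["OsbHLH_x3", "AtbHLH_x4"]),  # below threshold, ignored
    (77, ["OsbHLH_x4"]),
]
total, by_category = count_supported_clades(toy_clades)
print(total, dict(by_category))

# Counts reported in the text: shared, Arabidopsis-only, and rice-only branches.
reported_minimum = 45 + 11 + 10
print(reported_minimum)
```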

The Expression Pattern of OsbHLH and AtbHLH Genes

Because the expression pattern of a gene is often correlated with its function, we examined the expression information of OsbHLHs and AtbHLHs using reverse transcription (RT)-PCR analysis, microarray experiments, expressed sequence tag (EST) data of the National Center for Biotechnology Information (NCBI), and massively parallel signature sequencing (MPSS) data. We first analyzed the expression of the OsbHLHs using RT-PCR with RNA from rice root, leaf, stem, flower, and seed (Fig. 6). The RT-PCR products of a number of OsbHLHs were confirmed by sequencing, providing a measure of the reliability of the RT-PCR results for OsbHLH expression. In addition, we searched for information on OsbHLHs in the EST data from NCBI and the MPSS expression data. Even though the EST information was incomplete, we found EST data for 61 OsbHLHs (May, 2005). Forty-seven of these OsbHLHs with EST data had positive RT-PCR results (Fig. 6), whereas a few OsbHLHs had EST data but were not detected by RT-PCR (Fig. 6, represented by the boxes with italic lines). Expression information from the MPSS database demonstrated that 93 OsbHLH genes are expressed (Fig. 6), but 38 bHLH genes with positive RT-PCR signals were not detected by MPSS. In total, after integrating these data, 33 OsbHLHs, such as OsbHLH012 and OsbHLH028, were not detectably expressed according to RT-PCR, EST, and MPSS data (Fig. 6, boxes with X). These genes might be pseudogenes, or they might be expressed only at specific developmental stages or under special conditions. Furthermore, we summarized the expression of AtbHLHs from RT-PCR analysis (Heim et al., 2003), microarray data (Zhang et al., 2005), and the NCBI EST database (May, 2005; Fig. 6). Expression of 87 AtbHLHs was detected using RT-PCR, 112 by microarray analysis, and 115 from the MPSS database, and 85 had matching ESTs. Only 14 AtbHLH genes had no expression signal (Fig. 6).
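Integrating the three evidence sources is essentially a set operation: a gene counts as not detectably expressed only if it is absent from every data set. A minimal Python sketch follows, assuming each source has been reduced to a set of gene identifiers. The identifiers and function name below are placeholders, not data from this study.

```python
def undetected_genes(all_genes, *evidence_sets):
    """Return the sorted genes that appear in none of the evidence sets."""
    detected = set().union(*evidence_sets)
    return sorted(set(all_genes) - detected)


# Placeholder identifiers standing in for OsbHLH genes.
all_genes = ["g1", "g2", "g3", "g4", "g5"]
rtpcr = {"g1", "g2"}
est = {"g2", "g3"}
mpss = {"g3"}

print(undetected_genes(all_genes, rtpcr, est, mpss))

# Genes positive by RT-PCR but missed by MPSS, as discussed in the text.
print(sorted(rtpcr - mpss))

# Genes with EST support that also gave a positive RT-PCR result.
print(sorted(est & rtpcr))
```

Keeping each source as a separate set makes the pairwise comparisons reported above (for example, EST-supported genes confirmed by RT-PCR) one-line intersections or differences.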

As shown in Figure 6, 72 bHLH genes were expressed in all four tissues tested, suggesting that many bHLHs play regulatory roles at multiple developmental stages in rice and Arabidopsis. For example, both rice (OsbHLH031, OsbHLH032, and OsbHLH033) and Arabidopsis (AtbHLH046 and AtbHLH102) members of subfamily L are expressed in all four organs. It is possible that some of these members are preferentially expressed in specific tissues or cells within these organs. Some bHLH genes show preferential expression, including 10 rice and Arabidopsis bHLH genes with expression in the root, one in the stem, nine in the leaf, and 30 in the flower and seed. This result indicates that members of this large family might take part in different biological processes in rice. This might be a common characteristic of large transcription factor families, such as the MYB family (Martin and Paz-Ares, 1997). In particular, members of subfamily Y (OsbHLH142, AtbHLH091, OsbHLH141, AtbHLH010, and AtbHLH089) had similar expression patterns in the flower and seed, supporting the hypothesis that these genes function in rice and Arabidopsis reproductive development. Therefore, genes with related sequences also tend to function in similar structures during development.

Eleven rice bHLH genes have been characterized. For example, LAX (OsbHLH122) regulates shoot branching (Komatsu et al., 2003) and Udt1 (OsbHLH164) is critical for tapetum development (Jung et al., 2005). OSB1 (OsbHLH013) and OSB2 (OsbHLH016) are involved in anthocyanin biosynthesis (Sakamoto et al., 2001). Several genes are important for stress responses, including OsbHLH1 (i.e. OsbHLH062 in this study) in cold response (Wang et al., 2003), OsPTF1 (OsbHLH096) in tolerance to phosphate starvation (Yi et al., 2005), and RERJ1 (OsbHLH006) in wound and drought responses (Kiribuchi et al., 2004, 2005). Also, OsBP-5 (OsbHLH102) is involved in transcriptional regulation of the rice Wx gene (Zhu et al., 2003), and Ra (OsbHLH013) and Rb (OsbHLH165; Hu et al., 1996) are homologs of the maize (Zea mays) Lc gene; OsMYC (OsbHLH009; Zhu et al., 2005) is a homolog of AtMYC2.

Although the function of most rice bHLH genes is unknown, the phylogenetic and expression analyses provide a solid foundation for future functional studies in both rice and Arabidopsis. Identification of putative orthologs in different species will benefit the study of gene function; examples include AtMYC2 (Abe et al., 2003) and OsMYC (Zhu et al., 2005), as well as AMS (AtbHLH021; Sorensen et al., 2003) and TAPETUM DEGENERATION RETARDATION (TDR; OsbHLH005; N. Li, D. Zhang, X. Li, H. Liu, C. Yin, Z. Yuan, H. Chu, T. Wen, H. Huang, D. Luo, H. Ma, and D. Zhang, unpublished data). The identity and similarity between the full-length AMS and TDR sequences were about 32% and 42%, respectively, and 70% and 76% between their bHLH domains. AMS is critical for tapetal cell differentiation and likely regulates a postmeiotic transcriptional program supporting microspore development (Sorensen et al., 2003). Similar to ams, loss of function of TDR seems to result in delayed tapetal cell degradation and causes complete male sterility.

CONCLUSION

We have performed extensive analyses of the rice bHLH genes and compared them with Arabidopsis bHLH genes. We found that the rice and Arabidopsis bHLH genes form 25 subfamilies that are supported by phylogeny, additional protein motifs, and intron/exon structures. This phylogenetic analysis is in good agreement with previous results; at the same time, it presents new members in some existing subfamilies and defines new subfamilies by including additional bHLH genes from the completed rice genome sequence. The fact that the majority of subfamilies contain members from both rice and Arabidopsis suggests that the functions of most bHLH genes have possibly been conserved during angiosperm evolution. In addition, we estimate that the MRCA of monocots and eudicots had at least 66 bHLH genes. Phylogenetic analysis also suggests that there have been numerous gene birth events in this gene family during the evolution of flowering plants, in part due to putative genome duplications, compared with relatively few gene death events. The analysis of intron/exon structures revealed that most introns have conserved positions and phases, providing evidence for the intron-early theory, and that multiple independent intron loss events have likely occurred during the evolution of flowering plants. Extensive expression data and available functional data support the hypothesis that bHLH genes in plants perform a variety of functions in different tissues at multiple developmental stages. The summarized expression data for bHLH genes also provide a convenient reference for readers.

In short, our studies indicate that the ancient bHLH gene family has likely expanded considerably during flowering plant evolution to include many relatively young members, allowing both the conservation and divergence of gene function. Our results have established a solid foundation for future studies using molecular genetic, biochemical, physiological, and developmental approaches, which will likely reveal the functional significance of this dynamic and fascinating gene family.

MATERIALS AND METHODS

Database Search for Rice bHLH Genes

To identify EST assemblies as candidate bHLH genes, we ran TBLASTN (Altschul et al., 1997), one of the BLAST programs (Altschul et al., 1990), as provided by NCBI (http://www.ncbi.nlm.nih.gov) and TIGR (http://tigrblast.tigr.org/tgi). We used the default parameters for the TIGR TBLASTN program and word size 2, existence 10, and extension 11 for NCBI to retrieve as many similar sequences as possible. We used the bHLH domain of a rice (Oryza sativa) bHLH gene (GenBank number XM_463907) as the query sequence for TBLASTN. Each retrieved sequence was then used as a query for PSI-BLAST (Altschul et al., 1997) against release 4 of the TIGR rice pseudomolecules (http://www.tigr.org/tdb/e2k1/osa1). Redundant sequences with different identification numbers but the same chromosome locus were removed from our data set. In addition, we obtained from the TIGR database the sequences annotated with Pfam number Pfam00010, which contain the predicted HLH domain.
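The redundancy filter described above (different identifiers, same chromosome locus) can be sketched as follows. This is an illustrative Python example, not the authors' script: the record layout, the identifiers, and the locus strings are all hypothetical, and the choice to keep the first record seen per locus is an assumption.

```python
def remove_redundant_by_locus(hits):
    """Keep the first record seen for each chromosome locus.

    `hits` is an iterable of (identifier, locus) pairs; records that share a
    locus with an earlier record are treated as redundant and dropped.
    """
    seen_loci = set()
    unique = []
    for identifier, locus in hits:
        if locus in seen_loci:
            continue
        seen_loci.add(locus)
        unique.append((identifier, locus))
    return unique


# Hypothetical search hits: two identifiers map to the same locus.
hits = [
    ("hit_1", "chr1:100"),
    ("hit_2", "chr1:100"),
    ("hit_3", "chr4:250"),
]
print(remove_redundant_by_locus(hits))
```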

Based on BLASTN searches of the NCBI rice genome database using the predicted cDNA sequences of rice bHLH genes, we obtained the chromosome locations of these genes. We also obtained the intron distribution patterns and intron/exon boundaries of these bHLH genes from both the BLASTN results and the TIGR annotation database. The data sets retrieved from the TBLASTN search and the annotation database were combined to form our rice data set.

To further confirm the obtained cDNA sequences, the nucleotide sequences were translated into amino acid sequences, which were then examined for the bHLH domain using the hidden Markov model of the SMART tool (http://smart.embl-heidelberg.de/; Schultz et al., 1998; Letunic et al., 2004).
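The translation step can be illustrated with a short, self-contained Python sketch that uses the standard genetic code. It is not the tool used in this study (which is not named), and the function names are introduced here for illustration. The check for an internal stop codon mirrors the kind of evidence that would flag a putative pseudogene.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG order
# (third codon position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna):
    """Translate a cDNA sequence in frame 1; '*' marks a stop codon.

    Codons containing ambiguous bases become 'X'; a trailing partial
    codon is ignored.
    """
    seq = cdna.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


def has_internal_stop(protein):
    """True if a stop codon occurs before the end of the translation."""
    return "*" in protein.rstrip("*")


print(translate("ATGGCCTAA"))
print(has_internal_stop(translate("ATGTAAGCC")))
```

The translated sequence would then be passed to a domain search such as SMART; that step depends on an external service and is not reproduced here.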

Multiple Sequence Alignments

Multiple sequence alignments were performed on the OsbHLH domain sequences and on the amino acids flanking the bHLH domains of the predicted bHLH proteins, using the Multalin tool (http://prodes.toulouse.inra.fr/multalin/; Corpet, 1988) with the default parameters. The alignment was then adjusted manually according to the positions of the corresponding amino acids in the bHLH motif, and similar amino acids were highlighted using GeneDoc (version 2.6.002) software (Pittsburgh Supercomputing Center, http://www.psc.edu/biomed/genedoc/; Nicholas et al., 1997). The amino acid sequences adjacent to the bHLH domain were then added to the aligned sequences, and the sequences were aligned again to obtain an alignment that included the motifs adjacent to the bHLH domain. We used ClustalX (http://www-igbmc.u-strasbg.fr/BioInfo/; Thompson et al., 1997) as a secondary method to align the sequences and to recheck the result. This alignment was also adjusted manually using GeneDoc to align the motifs common to OsbHLH members.

To compare the evolutionary relationship of rice and Arabidopsis (Arabidopsis thaliana) bHLHs, we also ran Multalin on our OsbHLH domains and the 147 AtbHLH domains predicted by Toledo-Ortiz et al. (2003); the combined OsbHLH and AtbHLH phylogenetic tree was obtained after manual adjustment of the alignment. To obtain information on intron/exon structure, a cDNA alignment of the bHLH domain sequences was derived from the amino acid sequence alignment, and the intron distribution patterns and intron splicing phases were derived from the aligned cDNA sequences. In addition, to search for other motifs shared by OsbHLH members, we used the Multiple EM for Motif Elicitation (MEME) tool (version 3.0; http://bioweb.pasteur.fr/seqanal/motif/meme/; Bailey and Elkan, 1994) to find similar sequences among OsbHLH members.
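Intron phase follows directly from where an intron interrupts the coding sequence. The Python sketch below uses the conventional definition (phase 0 lies between two codons, phase 1 after the first nucleotide of a codon, and phase 2 after the second). The exact procedure used in this study is not described in detail, so the function names and the coordinate convention (number of coding nucleotides upstream of the intron) are assumptions.

```python
def intron_phase(coding_nt_before_intron):
    """Return the intron phase (0, 1, or 2).

    `coding_nt_before_intron` is the number of coding nucleotides, counted
    from the first base of the start codon, that precede the intron.
    """
    if coding_nt_before_intron < 0:
        raise ValueError("nucleotide count must be non-negative")
    return coding_nt_before_intron % 3


def interrupted_codon_index(coding_nt_before_intron):
    """Return the 0-based index of the codon that the intron follows or splits.

    For phase 0 this is the codon that starts immediately after the intron.
    """
    return coding_nt_before_intron // 3


# An intron after 6 coding nucleotides falls between codons 2 and 3 (phase 0);
# after 7 nucleotides it splits the third codon after its first base (phase 1).
for n in (6, 7, 8):
    print(n, intron_phase(n), interrupted_codon_index(n))
```

Mapping each intron to a codon index in this way is what allows intron positions to be compared across genes on the amino acid alignment.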

Tree Building

A phylogenetic tree was constructed from the aligned rice bHLH protein sequences using MEGA (version 3.0; http://www.megasoftware.net/index.html; Kumar et al., 2004) with the NJ method and the following parameters: Poisson correction, pairwise deletion, and bootstrap (1,000 replicates; random seed). The amino acid variation rates were also obtained. In addition, the maximum parsimony method of the software PHYLIP (version 3.6; http://evolution.genetics.washington.edu/phylip.html; Felsenstein, 1989) was used with 1,000 bootstrap replicates to create a second phylogenetic tree to validate the results from the NJ method.
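The distance underlying the NJ tree can be illustrated with a small Python sketch. It computes the Poisson-corrected distance d = -ln(1 - p), where p is the proportion of differing amino acid sites, and applies pairwise deletion by skipping any site where either sequence has a gap. This is a stand-alone illustration rather than MEGA's implementation. Treating only "-" as a gap character is an assumption.

```python
import math

GAP = "-"


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected amino acid distance with pairwise deletion of gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Pairwise deletion: keep only sites that are ungapped in both sequences.
    sites = [(a, b) for a, b in zip(seq_a, seq_b) if a != GAP and b != GAP]
    if not sites:
        raise ValueError("no comparable sites after pairwise deletion")
    p = sum(a != b for a, b in sites) / len(sites)
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)


# Toy aligned fragments (not real bHLH sequences).
print(poisson_distance("ACDE", "ACDF"))
print(poisson_distance("AC-E", "ACDE"))
```

A matrix of such pairwise distances is the input that a neighbor-joining implementation would cluster into a tree.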

The phylogenetic tree of the AtbHLH and OsbHLH domains was developed by using PHYLIP, and the resulting clades were assessed by bootstrap of 1,000 replicates. The Dayhoff PAM matrix in the protein distance algorithm and NJ method were employed to construct the unrooted tree. The constructed tree file was visualized by TreeView 1.6.6 (http://taxonomy.zoology.gla.ac.uk/rod/rod.html).
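Bootstrap assessment rests on resampling alignment columns with replacement and rebuilding the tree from each pseudo-alignment. The Python sketch below shows only the resampling step, which is what a bootstrap program does before the distance and tree-building stages. The function names, the fixed seed, and the toy alignment are assumptions for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    `alignment` maps sequence names to aligned strings of equal length.
    The same sampled columns are applied to every sequence.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    columns = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Generate `n_replicates` pseudo-alignments from one seeded generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment (not real bHLH sequences).
toy = {"s1": "ACDE", "s2": "ACDF"}
replicates = bootstrap_replicates(toy, n_replicates=5, seed=1)
print(len(replicates), len(replicates[0]["s1"]))
```

The bootstrap value of a clade is then the percentage of replicate trees in which that clade appears, which is how the threshold of 50 used elsewhere in this article should be read.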

OsbHLH Locations and Segmental Duplication

Following verification of the location of OsbHLH genes, the distribution of OsbHLH family members throughout the rice genome was drawn using the software MapInspect (http://www.dpw.wau.nl/pv/pub/MapComp/). For the detection of large segmental duplications, we consulted the duplicated blocks map provided by Xiong et al. (2005). On this map, each of the bHLH genes was localized on the corresponding chromosome using the coordinates from the genome sequence data (August, 2002 version). The software BioEdit (version 5.0.9; Hall, 1999) was used to analyze the homologs for similarity on the NJ phylogenetic tree of these rice bHLH genes. Using ClustalX, we compared the protein sequences of 40 pairs of OsbHLH genes involved in the potential genome duplication events.

Expression Analysis of AtbHLHs and OsbHLHs

We used RT-PCR to detect the expression patterns of the OsbHLHs. The PCR primers were designed to avoid the conserved region and to amplify products 150 to 400 bp long. Primer sequences are shown in detail in Supplemental Table II. RNA was isolated from roots, leaves, stems, and flowers of the rice japonica cultivar Nipponbare, using plants with 8- to 10-cm inflorescences. The plants were grown in the greenhouse under long-day conditions. Total RNA was isolated as described (Chomczynski and Sacchi, 1987) and treated with DNaseI (Promega). Two micrograms of RNA was used for RT in a 20 μL reaction volume with M-MuLV reverse transcriptase (Fermentas) according to the manufacturer's recommendations with oligodT(18) primer. Thirty-two cycles of PCR amplification were performed. Each PCR pattern was verified in three replicate experiments; for each gene, a no-template reaction served as the negative control and an Actin DNA fragment (551 bp) as the positive control. DNA bands of the expected size were considered positive signals. To confirm this, 20 samples (OsbHLH numbers 001, 005, 006, 009, 032, 043, 056, 065, 073, 084, 090, 091, 092, 095, 104, 113, 138, 150, 152, and 153) were randomly selected for sequencing (by Invitrogen). EST data came from UniGene of NCBI (Wheeler et al., 2003; http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene). We also searched the expression data in the MPSS database (Nakano et al., 2006a; http://mpss.udel.edu). The LOC (location) numbers of the OsbHLHs (shown in Table I) and AtbHLHs (Toledo-Ortiz et al., 2003) were used to query the MPSS database, which contains the signature information of the bHLH genes.

Acknowledgments

We thank Haisheng Liu and Huayong Xu for useful suggestions at the beginning of this work, Mingjiao Chen for supply of the rice material, and Professor Mingsheng Chen for helpful suggestions on phylogenetic analysis.

Notes

1This work was supported by the funds from the National Key Basic Research Developments Program of the Ministry of Science and Technology, People's Republic of China (2001CB109002 and 2005CB120802), National 863 High-Tech Project (2005AA2710330), Shanghai Municipal Committee of Science and Technology (03JC14061), the Program for New Century Excellent Talents in University (NCET–04–0403), the Shuguang Scholarship (04SG15), the Shanghai Institutes of Biological Sciences (Reproductive Development Project), and the U.S. Department of Energy (DE–FG02–02ER15332).

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Dabing Zhang (zhangdb@sjtu.edu.cn).

[W] The online version of this article contains Web-only data.

www.plantphysiol.org/cgi/doi/10.1104/pp.106.080580.

References

  • Abe H, Urao T, Ito T, Seki M, Shinozaki K, Yamaguchi-Shinozaki K (2003) Arabidopsis AtMYC2 (bHLH) and AtMYB2 (MYB) function as transcriptional activators in abscisic acid signaling. Plant Cell 15: 63–78 [PMC free article] [PubMed]
  • Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403–410 [PubMed]
  • Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402 [PMC free article] [PubMed]
  • Atchley WR, Fitch WM (1997) A natural classification of the basic helix-loop-helix class of transcription factors. Proc Natl Acad Sci USA 94: 5172–5176 [PMC free article] [PubMed]
  • Atchley WR, Terhalle W, Dress A (1999) Positional dependence, cliques, and predictive motifs in the bHLH protein domain. J Mol Evol 48: 501–516 [PubMed]
  • Bailey PC, Martin C, Toledo-Ortiz G, Quail PH, Huq E, Heim MA, Jakoby M, Werber M, Weisshaar B (2003) Update on the basic helix-loop-helix transcription factor gene family in Arabidopsis thaliana. Plant Cell 15: 2497–2502 [PMC free article] [PubMed]
  • Bailey TL, Elkan C (1994) Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol 2: 28–36 [PubMed]
  • Brown DD, Wensink PC, Jordan E (1972) A comparison of the ribosomal DNA's of Xenopus laevis and Xenopus mulleri: the evolution of tandem genes. J Mol Biol 63: 57–73 [PubMed]
  • Brownlie P, Ceska T, Lamers M, Romier C, Stier G, Teo H, Suck D (1997) The crystal structure of an intact human Max-DNA complex: new insights into mechanisms of transcriptional control. Structure 5: 509–520 [PubMed]
  • Buck MJ, Atchley WR (2003) Phylogenetic analysis of plant basic helix-loop-helix proteins. J Mol Evol 56: 742–750 [PubMed]
  • Cannon SB, Mitra A, Baumgarten A, Young ND, May G (2004) The roles of segmental and tandem gene duplication in the evolution of large gene families in Arabidopsis thaliana. BMC Plant Biol 4: 10. [PMC free article] [PubMed]
  • Chomczynski P, Sacchi N (1987) Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem 162: 156–159 [PubMed]
  • Corpet F (1988) Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res 16: 10881–10890 [PMC free article] [PubMed]
  • Crozatier M, Valle D, Dubois L, Ibnsouda S, Vincent A (1996) Collier, a novel regulator of Drosophila head development, is expressed in a single mitotic domain. Curr Biol 6: 707–718 [PubMed]
  • Ellenberger T, Fass D, Arnaud M, Harrison SC (1994) Crystal structure of transcription factor E47: E-box recognition by a basic region helix-loop-helix dimer. Genes Dev 8: 970–980 [PubMed]
  • Eulgem T, Rushton PJ, Robatzek S, Somssich IE (2000) The WRKY superfamily of plant transcription factors. Trends Plant Sci 5: 199–206 [PubMed]
  • Facchini LM, Penn LZ (1998) The molecular role of Myc in growth and transformation: recent discoveries lead to new insights. FASEB J 12: 633–651 [PubMed]
  • Fairman R, Beran-Steed RK, Anthony-Cahill SJ, Lear JD, Stafford WF III, DeGrado WF, Benfield PA, Brenner SL (1993) Multiple oligomeric states regulate the DNA binding of helix-loop-helix peptides. Proc Natl Acad Sci USA 90: 10429–10433 [PMC free article] [PubMed]
  • Felsenstein J (1989) PHYLIP: Phylogeny Inference Package. Cladistics 5: 164–166
  • Fisher A, Caudy M (1998) The function of hairy-related bHLH repressor proteins in cell fate decisions. Bioessays 20: 298–306 [PubMed]
  • Gale MD, Devos KM (1998) Comparative genetics in the grasses. Proc Natl Acad Sci USA 95: 1971–1974 [PMC free article] [PubMed]
  • Gilbert W (1987) The exon theory of genes. Cold Spring Harb Symp Quant Biol 52: 901–905 [PubMed]
  • Goding CR (2000) Mitf from neural crest to melanoma: signal transduction and transcription in the melanocyte lineage. Genes Dev 14: 1712–1728 [PubMed]
  • Guyot R, Keller B (2004) Ancestral genome duplication in rice. Genome 47: 610–614 [PubMed]
  • Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41: 95–98
  • Heim MA, Jakoby M, Werber M, Martin C, Weisshaar B, Bailey PC (2003) The basic helix-loop-helix transcription factor family in plants: a genome-wide study of protein structure and functional diversity. Mol Biol Evol 20: 735–747 [PubMed]
  • Henriksson M, Luscher B (1996) Proteins of the Myc network: essential regulators of cell growth and differentiation. Adv Cancer Res 68: 109–182 [PubMed]
  • Hu J, Anderson B, Wessler SR (1996) Isolation and characterization of rice R genes: evidence for distinct evolutionary paths in rice and maize. Genetics 142: 1021–1031 [PMC free article] [PubMed]
  • Hua X, Yokoyama C, Wu J, Briggs MR, Brown MS, Goldstein JL, Wang X (1993) SREBP-2, a second basic-helix-loop-helix-leucine zipper protein that stimulates transcription by binding to a sterol regulatory element. Proc Natl Acad Sci USA 90: 11603–11607 [PMC free article] [PubMed]
  • Ingram VM (1961) Gene evolution and the haemoglobins. Nature 189: 704–708 [PubMed]
  • IRGSP (2005) The map-based sequence of the rice genome. Nature 436: 793–800 [PubMed]
  • Jiang C, Gu X, Peterson T (2004) Identification of conserved gene structures and carboxy-terminal motifs in the Myb gene family of Arabidopsis and Oryza sativa L. ssp. indica. Genome Biol 5: R46. [PMC free article] [PubMed]
  • Jung KH, Han MJ, Lee YS, Kim YW, Hwang I, Kim MJ, Kim YK, Nahm BH, An G (2005) Rice Undeveloped Tapetum1 is a major regulator of early tapetum development. Plant Cell 17: 2705–2722 [PMC free article] [PubMed]
  • Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D (2003) Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci USA 100: 11484–11489 [PMC free article] [PubMed]
  • Kiribuchi K, Jikumaru Y, Kaku H, Minami E, Hasegawa M, Kodama O, Seto H, Okada K, Nojiri H, Yamane H (2005) Involvement of the basic helix-loop-helix transcription factor RERJ1 in wounding and drought stress responses in rice plants. Biosci Biotechnol Biochem 69: 1042–1044 [PubMed]
  • Kiribuchi K, Sugimori M, Takeda M, Otani T, Okada K, Onodera H, Ugaki M, Tanaka Y, Tomiyama-Akimoto C, Yamaguchi T, et al (2004) RERJ1, a jasmonic acid-responsive gene from rice, encodes a basic helix-loop-helix protein. Biochem Biophys Res Commun 325: 857–863 [PubMed]
  • Komatsu K, Maekawa M, Ujiie S, Satake Y, Furutani I, Okamoto H, Shimamoto K, Kyozuka J (2003) LAX and SPA: major regulators of shoot branching in rice. Proc Natl Acad Sci USA 100: 11765–11770 [PMC free article] [PubMed]
  • Kosugi S, Ohashi Y (1997) PCF1 and PCF2 specifically bind to cis elements in the rice proliferating cell nuclear antigen gene. Plant Cell 9: 1607–1619 [PMC free article] [PubMed]
  • Kumar S, Tamura K, Nei M (2004) MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform 5: 150–163 [PubMed]
  • Ledent V, Paquet O, Vervoort M (2002) Phylogenetic analysis of the human basic helix-loop-helix proteins. Genome Biol 3: RESEARCH0030. [PMC free article] [PubMed]
  • Ledent V, Vervoort M (2001) The basic helix-loop-helix protein family: comparative genomics and phylogenetic analysis. Genome Res 11: 754–770 [PMC free article] [PubMed]
  • Letunic I, Copley RR, Schmidt S, Ciccarelli FD, Doerks T, Schultz J, Ponting CP, Bork P (2004) SMART 4.0: towards genomic data integration. Nucleic Acids Res 32: D142–D144 [PMC free article] [PubMed]
  • Martin C, Paz-Ares J (1997) MYB transcription factors in plants. Trends Genet 13: 67–73 [PubMed]
  • Martinez-Garcia JF, Huq E, Quail PH (2000) Direct targeting of light signals to a promoter element-bound transcription factor. Science 288: 859–863 [PubMed]
  • Massari ME, Murre C (2000) Helix-loop-helix proteins: regulators of transcription in eucaryotic organisms. Mol Cell Biol 20: 429–440 [PMC free article] [PubMed]
  • Mehan MR, Freimer NB, Ophoff RA (2004) A genome-wide survey of segmental duplications that mediate common human genetic variation of chromosomal architecture. Hum Genomics 1: 335–344 [PMC free article] [PubMed]
  • Murre C, Bain G, van Dijk MA, Engel I, Furnari BA, Massari ME, Matthews JR, Quong MW, Rivera RR, Stuiver MH (1994) Structure and function of helix-loop-helix proteins. Biochim Biophys Acta 1218: 129–135 [PubMed]
  • Murre C, McCaw PS, Baltimore D (1989) A new DNA binding and dimerization motif in immunoglobulin enhancer binding, daughterless, MyoD, and myc proteins. Cell 56: 777–783 [PubMed]
  • Nakano M, Nobuta K, Vemaraju K, Tej SS, Skogen JW, Meyers BC (2006a) Plant MPSS databases: signature-based transcriptional resources for analyses of mRNA and small RNA. Nucleic Acids Res 34: D731–D735 [PMC free article] [PubMed]
  • Nakano T, Suzuki K, Fujimura T, Shinshi H (2006b) Genome-wide analysis of the ERF gene family in Arabidopsis and rice. Plant Physiol 140: 411–432 [PMC free article] [PubMed]
  • Nam J, dePamphilis CW, Ma H, Nei M (2003) Antiquity and evolution of the MADS-box gene family controlling flower development in plants. Mol Biol Evol 20: 1435–1447 [PubMed]
  • Nam J, Kim J, Lee S, An G, Ma H, Nei M (2004) Type I MADS-box genes have experienced faster birth-and-death evolution than type II MADS-box genes in angiosperms. Proc Natl Acad Sci USA 101: 1910–1915 [PMC free article] [PubMed]
  • Nei M, Gu X, Sitnikova T (1997) Evolution by the birth-and-death process in multigene families of the vertebrate immune system. Proc Natl Acad Sci USA 94: 7799–7806 [PMC free article] [PubMed]
  • Nei M, Rogozin IB, Piontkivska H (2000) Purifying selection and birth-and-death evolution in the ubiquitin gene family. Proc Natl Acad Sci USA 97: 10866–10871 [PMC free article] [PubMed]
  • Nei M, Rooney AP (2005) Concerted and birth-and-death evolution of multigene families. Annu Rev Genet 39: 121–152 [PMC free article] [PubMed]
  • Nesi N, Debeaujon I, Jond C, Pelletier G, Caboche M, Lepiniec L (2000) The TT8 gene encodes a basic helix-loop-helix domain protein required for expression of DFR and BAN genes in Arabidopsis siliques. Plant Cell 12: 1863–1878 [PMC free article] [PubMed]
  • Nicholas KB, Nicholas HBJ, Deerfield DWI (1997) GeneDoc: analysis and visualization of genetic variation. EMBnet News 4: 14
  • Paris S, Longhi R, Santambrogio P, de Curtis I (2003) Leucine-zipper-mediated homo- and hetero-dimerization of GIT family p95-ARF GTPase-activating protein, PIX-, paxillin-interacting proteins 1 and 2. Biochem J 372: 391–398 [PMC free article] [PubMed]
  • Patthy L (1987) Intron-dependent evolution: preferred types of exons and introns. FEBS Lett 214: 1–7 [PubMed]
  • Quail PH (2000) Phytochrome-interacting factors. Semin Cell Dev Biol 11: 457–466 [PubMed]
  • Ramsay NA, Glover BJ (2005) MYB-bHLH-WD40 protein complex and the evolution of cellular diversity. Trends Plant Sci 10: 63–70 [PubMed]
  • Robinson KA, Koepke JI, Kharodawala M, Lopes JM (2000) A network of yeast basic helix-loop-helix interactions. Nucleic Acids Res 28: 4460–4466 [PMC free article] [PubMed]
  • Sakamoto W, Ohmori T, Kageyama K, Miyazaki C, Saito A, Murata M, Noda K, Maekawa M (2001) The Purple leaf (Pl) locus of rice: the Pl(w) allele has a complex organization and includes two genes encoding basic helix-loop-helix proteins involved in anthocyanin biosynthesis. Plant Cell Physiol 42: 982–991 [PubMed]
  • Schultz J, Milpetz F, Bork P, Ponting CP (1998) SMART, a simple modular architecture research tool: identification of signaling domains. Proc Natl Acad Sci USA 95: 5857–5864 [PMC free article] [PubMed]
  • Sharp PA (1981) Speculations on RNA splicing. Cell 23: 643–646 [PubMed]
  • Shirakata M, Friedman FK, Wei Q, Paterson BM (1993) Dimerization specificity of myogenic helix-loop-helix DNA-binding factors directed by nonconserved hydrophilic residues. Genes Dev 7: 2456–2470 [PubMed]
  • Sonnenfeld MJ, Delvecchio C, Sun X (2005) Analysis of the transcriptional activation domain of the Drosophila tango bHLH-PAS transcription factor. Dev Genes Evol 215: 221–229 [PubMed]
  • Sorensen AM, Krober S, Unte US, Huijser P, Dekker K, Saedler H (2003) The Arabidopsis ABORTED MICROSPORES (AMS) gene encodes a MYC class transcription factor. Plant J 33: 413–423 [PubMed]
  • Steidl C, Leimeister C, Klamt B, Maier M, Nanda I, Dixon M, Clarke R, Schmid M, Gessler M (2000) Characterization of the human and mouse HEY1, HEY2, and HEYL genes: cloning, mapping, and mutation screening of a new bHLH gene family. Genomics 66: 195–203 [PubMed]
  • Sun XH, Copeland NG, Jenkins NA, Baltimore D (1991) Id proteins Id1 and Id2 selectively inhibit DNA binding by one class of helix-loop-helix proteins. Mol Cell Biol 11: 5603–5611 [PMC free article] [PubMed]
  • Swanson HI, Chan WK, Bradfield CA (1995) DNA binding specificities and pairing rules of the Ah receptor, ARNT, and SIM proteins. J Biol Chem 270: 26292–26302 [PubMed]
  • Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25: 4876–4882 [PMC free article] [PubMed]
  • Toledo-Ortiz G, Huq E, Quail PH (2003) The Arabidopsis basic/helix-loop-helix transcription factor family. Plant Cell 15: 1749–1770 [PMC free article] [PubMed]
  • Tong Q, Xing S, Jhiang SM (1997) Leucine zipper-mediated dimerization is essential for the PTC1 oncogenic activity. J Biol Chem 272: 9043–9047 [PubMed]
  • Wang YJ, Zhang ZG, He XJ, Zhou HL, Wen YX, Dai JX, Zhang JS, Chen SY (2003) A rice transcription factor OsbHLH1 is involved in cold stress response. Theor Appl Genet 107: 1402–1409 [PubMed]
  • Wheeler DL, Church DM, Federhen S, Lash AE, Madden TL, Pontius JU, Schuler GD, Schriml LM, Sequeira E, Tatusova TA, et al (2003) Database resources of the National Center for Biotechnology. Nucleic Acids Res 31: 28–33 [PMC free article] [PubMed]
  • Xiong Y, Liu T, Tian C, Sun S, Li J, Chen M (2005) Transcription factors in rice: a genome-wide comparative analysis between monocots and eudicots. Plant Mol Biol 59: 191–203 [PubMed]
  • Yi K, Wu Z, Zhou J, Du L, Guo L, Wu Y, Wu P (2005) OsPTF1, a novel transcription factor involved in tolerance to phosphate starvation in rice. Plant Physiol 138: 2087–2096 [PMC free article] [PubMed]
  • Yu J, Hu S, Wang J, Wong GK, Li S, Liu B, Deng Y, Dai L, Zhou Y, Zhang X, et al (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296: 79–92 [PubMed]
  • Zhang X, Feng B, Zhang Q, Zhang D, Altman N, Ma H (2005) Genome-wide expression profiling and identification of gene activities during early flower development in Arabidopsis. Plant Mol Biol 58: 401–419 [PubMed]
  • Zhu Y, Cai XL, Wang ZY, Hong MM (2003) An interaction between a MYC protein and an EREBP protein is involved in transcriptional regulation of the rice Wx gene. J Biol Chem 278: 47803–47811 [PubMed]
  • Zhu ZF, Sun CQ, Fu YC, Qian XY, Yang JS, Wang XK (2005) Isolation and analysis of a novel MYC gene from rice. Yi Chuan Xue Bao 32: 393–398 [PubMed]

Articles from Plant Physiology are provided here courtesy of American Society of Plant Biologists