Genome Res. May 2001; 11(5): 754–770.
PMCID: PMC311049

The Basic Helix-Loop-Helix Protein Family: Comparative Genomics and Phylogenetic Analysis

Abstract

The basic Helix-Loop-Helix (bHLH) proteins are transcription factors that play important roles during the development of various metazoans including fly, nematode, and vertebrates. They are also involved in human diseases, particularly in cancerogenesis. We made an extensive search for bHLH sequences in the completely sequenced genomes of Caenorhabditis elegans and of Drosophila melanogaster. We found 35 and 56 different genes, respectively, which may represent the complete set of bHLH of these organisms. A phylogenetic analysis of these genes, together with a large number (>350) of bHLH from other sources, led us to define 44 orthologous families among which 36 include bHLH from animals only, and two have representatives in both yeasts and animals. In addition, we identified two bHLH motifs present only in yeast, and four that are present only in plants; however, the latter number is certainly an underestimate. Most animal families (35/38) comprise fly, nematode, and vertebrate genes, suggesting that their common ancestor, which lived in pre-Cambrian times (600 million years ago) already owned as many as 35 different bHLH genes.

Transcription factors of the basic Helix-Loop-Helix (bHLH) family play a central role in cell proliferation, determination, and differentiation (Jan and Jan 1993; Weintraub 1993; Hassan and Bellen 2000). The bHLH domain is ~60 amino acids long and comprises a DNA-binding basic region (b) followed by two α-helices separated by a variable loop region (HLH) (Ferré-D'Amaré et al. 1993). The HLH domain promotes dimerization, allowing the formation of homodimeric or heterodimeric complexes between different family members (Murre et al. 1989a; Kadesch 1993). The two basic domains brought together through dimerization bind specific hexanucleotide sequences (Murre et al. 1989a; Van Doren et al. 1991, 1994; Ohsako et al. 1994).

The bHLH motif was first identified in the murine transcription factors E12 and E47 (Murre et al. 1989b). Numerous bHLH proteins have since been identified in animals, plants, and fungi. A phylogenetic analysis based on a sample of 122 bHLH sequences has led to a subdivision into four monophyletic groups of proteins named A, B, C, and D (Atchley and Fitch 1997).

Groups A and B include bHLH proteins that bind hexameric DNA sequences referred to as “E-boxes” (CANNTG): CACCTG or CAGCTG for group A, and CACGTG or CATGTG for group B (Murre et al. 1989a; Van Doren et al. 1991; Dang et al. 1992).
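The site preferences above can be illustrated with a minimal sketch. It assumes that a site is given as a single six-base string, that the E-box consensus is CANNTG with N standing for any nucleotide, and that the group B sites are the hexamers CACGTG and CATGTG (both fit the six-base consensus). The function name `classify_hexamer` and the label strings are hypothetical and serve only as illustration.

```python
import re

# General E-box consensus from the text: CANNTG (N = any nucleotide).
E_BOX = re.compile(r"^CA[ACGT]{2}TG$")

# Hexamers associated with each group (assumed six-base forms).
GROUP_A_SITES = {"CACCTG", "CAGCTG"}
GROUP_B_SITES = {"CACGTG", "CATGTG"}


def classify_hexamer(site: str) -> str:
    """Return a coarse label for a 6-nt site (hypothetical helper)."""
    site = site.upper()
    if not E_BOX.match(site):
        return "not an E-box"
    if site in GROUP_A_SITES:
        return "E-box, group A type"
    if site in GROUP_B_SITES:
        return "E-box, group B type"
    # Matches CANNTG but is not one of the four sites listed above.
    return "E-box, other"
```

Such a lookup only labels the core hexamer; it says nothing about flanking bases or binding affinity, which the classification in the text does not address either.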

Group A includes several tissue-specific bHLH proteins (e.g., MyoD, Twist, Achaete-Scute proteins; for a recent review, see Hassan and Bellen 2000) as well as the ubiquitously distributed E12/Daughterless-type bHLH proteins (Murre et al. 1989b). In many instances, the tissue-specific proteins form inactive homodimers and require the presence of an E12/Daughterless partner to form active heterodimers (Cabrera and Alonso 1991; Lassar et al. 1991; Van Doren et al. 1992). Binding of the heterodimers to an E-box usually leads to transcriptional activation of the target gene (Cabrera and Alonso 1991; Van Doren et al. 1992).

Group B includes a large number of functionally unrelated proteins (e.g., Myc, Max, USF, SREBP, MITF) involved in various developmental and cellular processes (Henriksson and Luscher 1996; Facchini and Penn 1998; Goding 2000). Some group-B proteins contain an additional motif, known as a Leucine Zipper (LZ), which also is involved in protein dimerization. Dang et al. (1992) and Atchley and Fitch (1997) included in the same group B several proteins related to the Drosophila Hairy and Enhancer of split bHLH (HER proteins; Fisher and Caudy 1998). These proteins are characterized by the presence of a proline instead of an arginine at a crucial position in the basic domain. DNA-binding site selection and in vivo studies have shown that these proteins bind preferentially to sequences referred to as “N-boxes” (CACGCG or CACGAG) and have only a low affinity for “E-boxes” (Ohsako et al. 1994; Van Doren et al. 1994). The HER proteins are characterized further by the presence of an additional motif, the 4-amino acid WRPW domain, which allows the interaction with the Groucho repressor protein (Fisher and Caudy 1998). Accordingly, the HER proteins have been shown to act as transcriptional repressors during nervous system development and segmentation (Kageyama and Nakanishi 1997; Fisher and Caudy 1998).

Group C corresponds to the family of bHLH proteins known as bHLH-PAS (Crews 1998). The characteristic feature of bHLH-PAS proteins is the PAS domain, so called for the first three proteins identified with this motif: Drosophila Period (Per), human ARNT, and Drosophila Single-minded (Sim). The PAS domain found in bHLH-PAS proteins is ~260–310 amino acids long and allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins (Crews 1998). bHLH-PAS proteins control a variety of developmental and physiological events including neurogenesis, tracheal and salivary duct formation, toxin metabolism, circadian rhythms, and response to hypoxia (Crews 1998). bHLH-PAS proteins bind to ACGTG or GCGTG core sequences.

Group D corresponds to HLH proteins that lack a basic domain and are hence unable to bind DNA. This group includes the Id and Extramacrochaete (Emc) proteins (Benezra et al. 1990; Ellis et al. 1990; Garrell and Modolell 1990), which act as antagonists of group A bHLH proteins (Van Doren et al. 1991, 1992).

An additional group of putative HLH proteins has been described more recently, the COE family (for Collier/Olf-1/EBF). This group is characterized by the presence of an additional domain involved both in dimerization and in DNA binding, the COE domain (Crozatier et al. 1996). The HLH sequences of this group are highly divergent from the other bHLH motifs, making their phyletic analysis difficult.

Other than this subdivision in a few large groups, however, little is known of the evolution and diversification of the bHLH domain. Yet, given the importance of the bHLH genes in development, it would be desirable to have a more refined classification scheme of the various types of bHLH motifs, as well as a better understanding of their evolutionary relationships both within and between organisms. We have taken advantage of the complete sequencing of the nematode's (Caenorhabditis elegans Sequencing Consortium 1998) and fly's (Adams et al. 2000) genomes to extract a large, and possibly complete, set of bHLH genes from these two organisms. We also have used the large number of bHLH genes that now have been identified in vertebrates, as well as the smaller number available in plants and fungi, to assess the evolutionary relationships within this family.

RESULTS AND DISCUSSION

Derivation of Comprehensive Sets of bHLH Sequences from Existing Databases

The completion of the nematode and fly sequencing projects provided us with an opportunity to screen whole genomes for bHLH coding regions. To collect as many such sequences as possible (hopefully, all of them) we first retrieved a large number of bHLH sequences available from the nonredundant NCBI (http://www.ncbi.nlm.nih.gov) and Sanger protein databases (http://www.sanger.ac.uk), as described in the Methods section. We used the most divergent among these sequences (as determined by preliminary phylogenetic reconstructions) to screen by BLASTP (Altschul et al. 1990) the complete genomic sequences of C. elegans and D. melanogaster. We used the retrieved sequences that were not present in our initial collection to make new BLASTP searches in both genome databases as well as in the nonredundant NCBI protein database. We also used yeast and plant sequences retrieved from our original screen to make BLASTP searches, against the Saccharomyces cerevisiae and the Arabidopsis thaliana genome databases, in order to isolate additional bHLH sequences from these organisms. These various searches generated a set of more than 350 different bHLH sequences. We did not systematically retrieve the large number of bHLH sequences from various mammals (other than mouse) that are available. Thus, it is clear that our set is far from including all the available bHLH sequences. We believe, however, that it provides a very extensive coverage of fly, nematode, and mouse genes, and a fair representation of the plant and fungal types.
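The repeated searches described above amount to an expansion loop: hits not yet in the collection become the queries of the next round, until a round returns nothing new. The following is a minimal sketch of that loop only. The `search` callable is a hypothetical stand-in for a single BLASTP query, the `max_rounds` cap is an assumption, and the manual verification of each hit that the text describes is not modeled.

```python
from typing import Callable, Iterable, Set


def iterative_search(
    seeds: Iterable[str],
    search: Callable[[str], Set[str]],
    max_rounds: int = 10,
) -> Set[str]:
    """Expand a seed set until a search round returns nothing new.

    `search` is a hypothetical stand-in for one database query: it takes
    a sequence identifier and returns identifiers of candidate hits.
    """
    known = set(seeds)
    frontier = set(known)
    for _ in range(max_rounds):
        hits: Set[str] = set()
        for query in frontier:
            hits |= search(query)
        new = hits - known
        if not new:
            break  # the collection has stopped growing
        known |= new
        frontier = new  # only newly found sequences are queried next
    return known
```

With a toy hit table such as `{"a": {"b"}, "b": {"c"}, "c": set()}`, starting from `"a"` the loop collects `a`, `b`, and `c` and then stops because the last round adds nothing.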

We aligned these sequences using the multiple alignment software CLUSTALW (Thompson et al. 1994) and checked each alignment by hand. The verified alignments then were used to construct phylogenetic trees as described in the Methods section. The resultant trees were bootstrapped to provide information about their statistical reliability. We used these trees to define groups of orthologous sequences.

Identification of Orthologous Families

Orthologous genes in two or more organisms are homologs that evolved from the same gene in the last common ancestor (Fitch 1970). Paralogous genes are those that have resulted from within-species duplication (Fitch 1970). Unfortunately, there is no absolute criterion that can be used to decide if two genes are orthologous. The criterion we used to define orthologous families was that the grouping of bHLH sequences from at least two species into one monophyletic family should be supported by different methods of analysis with bootstrap values >50%. A similar criterion has been used in other analyses of protein families (Galliot et al. 1999). This criterion was relaxed for a few families, as will be discussed later (lower bootstrap values; Table 1). The fact that congruence was observed between trees constructed by different methods suggests that our reconstruction of the bHLH phylogeny is essentially correct.
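The working criterion can be written as a simple predicate. This is a sketch under stated assumptions, not the procedure actually used: it presumes that monophyly of the candidate group has already been established from the trees, that "different methods" means at least two, and that each method reports one bootstrap percentage for the group. The function and parameter names are hypothetical.

```python
from typing import Dict, Iterable


def is_orthologous_family(
    member_species: Iterable[str],
    bootstrap_by_method: Dict[str, float],
    min_species: int = 2,
    min_bootstrap: float = 50.0,
) -> bool:
    """Check the stated criterion for an already monophyletic group.

    member_species: species of origin of each sequence in the group.
    bootstrap_by_method: bootstrap support (%) per reconstruction method.
    """
    # Sequences must come from at least two different species.
    if len(set(member_species)) < min_species:
        return False
    # Assumption: "different methods" requires at least two methods.
    if len(bootstrap_by_method) < 2:
        return False
    # Support must exceed the threshold (strictly) in every method.
    return all(value > min_bootstrap for value in bootstrap_by_method.values())
```

The relaxed cases mentioned in the text (families accepted with lower bootstrap values) would correspond to calling the predicate with a smaller `min_bootstrap`.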

Table 1
bHLH Genes Grouped into 44 Phylogenetically-Defined Families

Our analysis led us to define 44 orthologous families (i.e., 44 ancestral types of bHLH domains). Table 1 summarizes the 44 families and some of their properties. We named each family according to its first discovered member or, in a few cases, its best-characterized member. The complete list of all members of every family, together with database accession numbers, can be found as supplementary material at http://www.genome.org. Two types of bHLH motifs presented special problems. First, the HLH of COE family proteins were not easily alignable with other bHLH proteins. Hence, the phylogenetic analysis of this family was mostly done without other types of bHLH sequences and using the well-conserved COE domain in addition to the bHLH. Second, although Hairy/E(spl)-related (HER) proteins appear consistently monophyletic, the resolution within the group was very poor and we were unable to identify orthologous families with any confidence. Because many amino acids flanking the bHLH motif are conserved in this group, we used a larger domain for phyletic comparisons to obtain better (but still low) phylogenetic resolution (Table 1).

Figure 1 shows an alignment of all 44 bHLH types, based on one representative of each family. Thirty-six families comprise only animal members, four families are specific to plants, two are found only in yeasts, and two have both yeast and animal representatives. Thus, the bHLH motif appeared very early in eukaryotic history, but its expansion occurred almost entirely after the divergence between plants, fungi, and animals. The presence of only four plant families in our set is most likely a result of the fact that there were no extensive searches for bHLH genes in plants. As a consequence, most plant sequences come from one species, A. thaliana, for which an extensive genome project is being conducted. Indications that more plant families are to be identified come from preliminary BLASTP searches, which revealed 30 different A. thaliana bHLH sequences, most of which are unrelated to other plant bHLH sequences. We found these sequences to form four additional “families” comprising A. thaliana sequences only. These “families” are not reported in Table 1 because we chose to consider, as significant families, only groups that contain sequences from at least two different species. Hence, most A. thaliana bHLH are considered, in our work, as orphan genes (i.e., sequences that cannot be assigned to any family).

Figure 1
(top) Alignment of the bHLH of the 44 different families listed in Table 1 (abbreviations as in Table 1). One member per family, usually from mouse, has been selected. Designation of basic, Helix1, Loop and Helix2 follows ...

Drosophila Genes

We found 56 bHLH sequences in D. melanogaster. Table 2 lists these sequences, the family to which they belong, their chromosomal localization, their characterization status, and their accession number. A version of this table with links to Flybase (http://flybase.bio.indiana.edu; The Flybase Consortium 1999) is available as supplementary material at http://www.genome.org.

Table 2
The Complete List of bHLH Genes from Drosophila melanogaster

We believe that these sequences represent, if not the full set, at least a large proportion of the bHLH domains present in the fly genome. The repeated BLASTP searches that were used to build our original set of genes were meant to detect even very divergent types of bHLH domains. Furthermore, after we determined the 44 types of bHLH domains, we made new BLASTP screens of the complete sequence of D. melanogaster with one member of each family, without finding any new genes. On the other hand, none of these searches revealed Collier, the fly COE family representative. Therefore it is conceivable that one or more highly divergent HLH families may have escaped our screens.

The BLASTP searches detected additional sequences that we did not use in our analyses, as they did not correspond to complete HLH motifs. Such sequences were identified because they present a marked similarity with a small region of the bHLH domain, 20–30 amino acids long, often including the basic region. In all cases we checked the sequence by hand, and the decision as to whether a sequence did or did not correspond to a bona fide bHLH domain was always clear-cut. We also checked the 61 “HLH DNA-binding domain” and 69 “Myc-type HLH dimerization domain” sequences recently identified in the Drosophila genome (Rubin et al. 2000), and found that only the 56 sequences listed in Table 2 correspond to complete bHLH domains. Our analysis is completely consistent with and extends that of Moore et al. (2000), who analyzed 12 previously uncharacterized bHLH from the Drosophila genome project. We also retrieved these 12 genes in our screen and our family assignment coincides with that of Moore et al. (2000).

C. elegans Genes

We found 35 bHLH sequences in C. elegans. Table 3 shows these sequences; a version of this table with links to Wormbase (http://www.wormbase.org) is available as supplementary material at http://www.genome.org. A previous report (Rubin et al. 2000) mentioned 38 “HLH DNA-binding domain” sequences and 8 “Myc-type HLH dimerization domain” sequences in the C. elegans genome. Prior analysis of the C. elegans genome revealed only 24 bHLH putative proteins (Ruvkun and Hobert 1998). Here again we checked the discrepancies between our results and the previous ones, and found that only the 35 sequences listed in Table 3 correspond to complete bHLH domains. These 35 sequences are likely to represent the full set of C. elegans bHLH.

Table 3
The Complete List of bHLH Genes from Caenorhabditis elegans

In contrast to the sequences from Drosophila, most of which can easily be assigned to one of the 38 animal bHLH families, 17% of the C. elegans sequences (6/35) cannot be confidently assigned to a specific family, and are therefore called “orphan”. Furthermore, several C. elegans bHLH included in families are only loosely linked to the other members (their inclusion is supported by low bootstrap values). Conversely, 40% of the animal families do not contain C. elegans members. These results are consistent with the traditional view of metazoan phylogeny, which held nematodes as very distantly related to both arthropods and vertebrates. Recent molecular phylogenies indicate that, on the contrary, arthropods and nematodes are relatives (i.e., they group into one of the three clades of bilaterians, the ecdysozoa) (Aguinaldo et al. 1997; Adoutte et al. 2000). Many nematodes, including C. elegans, have higher mutation rates than other metazoans, not only in their rRNA genes (Aguinaldo et al. 1997), but also throughout their genome (Mushegian et al. 1998). Therefore, nematode sequences tend to be artifactually displaced to a wrong position because they appear very distant from all others, and to end up at the base of the tree or even associated with the outgroup (because of chance convergence at some nucleotide positions). This phenomenon, known as “long branch attraction” (for a recent review, see Philippe and Laurent 1998), presumably explains why our analysis led to the clustering of several C. elegans sequences at the base of the group A family, or as orphan group B genes (Table 3). Accordingly, we found that the worm bHLH sequences diverge more rapidly than those of fly and mouse (data not shown; a detailed analysis can be found on our Web site, http://www.cnrs-gif.fr/cgm/evodevo/bhlh/index.html).

Interestingly, some nematode sequences have diverged very little from their fly or mouse counterparts. These include the few functionally characterized C. elegans bHLH genes, which show overall functional conservation with their vertebrate and/or fly orthologs; for example, the C. elegans orthologs of twist and myoD are involved in muscle formation (Harfe et al. 1998a,b), and the orthologs of atonal and NeuroD (lin-32 and cnd-1) play a role in nervous system development (Zhao and Emmons 1995; Hallam et al. 2000). The genetic control of developmental processes such as neurogenesis and myogenesis relies on small sets of interacting genes (syntagms; Garcia-Bellido 1981). The function of syntagms crucially relies on specific molecular interactions among their members, hence imposing strong structural constraints on them and preventing structural diversification (for discussion on syntagms and evolution, see Huang 1998). This may explain why such networks are strongly conserved throughout metazoan evolution (Baylies et al. 1998; Arendt and Nübler-Jung 1999) and why nematode genes involved in such networks have been subject to special constraints.

Mouse Genes

We found a total of 90 different bHLH sequences in mouse (and related mammals). This large set of genes is the result of the extensive molecular analyses of processes such as neurogenesis, myogenesis, or oncogenesis in which bHLH are crucially involved. Therefore, it might be that in the absence of systematic bHLH searches or genome sequencing projects, only a small subset of vertebrate bHLH genes have been identified so far. Indeed, our initial searches showed that the same vertebrate bHLH genes may be reported under up to seven different names, suggesting the convergence of many research groups on small numbers of crucial genes.

However, our results show that at least 35 of the 38 vertebrate bHLH types have protostomian (fly and/or worm) orthologs (90%), and reciprocally, that all fly genes have mouse counterparts. Because we believe that our set of fly genes is close to complete, the fact that mouse counterparts have been identified for all fly genes suggests that our sample of bHLH genes in mouse is in fact quite extensive. Needless to say, a definite answer to the question of how complete is our knowledge of vertebrate bHLH genes will have to await the results of the various vertebrate sequencing projects that currently are under way.

Assessing Orthologies

The assessment of orthologies must necessarily be based on phylogenetic reconstructions. Thus, although orthology is a very useful concept, there is no foolproof way of deciding whether two similar sequences are indeed orthologous. We will illustrate this difficult question in the case of two closely related fly genes, Delilah (dei) and CG11450 (Figs. 1 and 3). CG11450 recently has been described, based on overall similarity, as the Drosophila ortholog of the vertebrate NeuroD gene (Hassan and Bellen 2000). We similarly retrieved CG11450 as the closest Drosophila relative of NeuroD when making BLASTP searches. However, the inclusion of both genes in the NeuroD family is not supported by the phylogenetic analyses. While both genes clearly belong to the Atonal superfamily, they cannot be associated unequivocally with any of the NeuroD, Ngn, or Ato families (Figs. 2 and 3). Nevertheless, CG11450 and Dei may represent divergent NeuroD proteins as they show several residues in their bHLH typical of this family (Fig. 1; Hassan and Bellen 2000).

Figure 2
A neighbor-joining (NJ) tree showing the evolutionary relationships of the 44 bHLH families listed in Table 1 as well as the orphan genes delilah (putative D. melanogaster neuroD gene) and F31A3.4 (as a representative of a group of three ...
Figure 3
A rooted neighbor-joining (NJ) tree showing the evolutionary relationships of Atonal superfamily members. We used the closely related paraxis sequence (see Figure 2) as outgroup. The different constituting families are pointed out. Numbers ...

We examined whether what is known of the function of these various genes might help us elucidate the origin of dei and CG11450. The vertebrate representatives of the Ngn, NeuroD and Ato families are mainly involved in the determination and the differentiation of neural cells (Kageyama and Nakanishi 1997; Hassan and Bellen 2000). In Drosophila, the Ato representatives ato, amos and cato are all involved in neural development (Jarman et al. 1993; Goulding et al. 2000a,b; Huang et al. 2000). The function of the neurogenin ortholog target of poxn (tap) is not known, but the gene is exclusively expressed at late stages of neural development (Bush et al. 1996; Gautier et al. 1997; Ledent et al. 1998). In contrast, dei and CG11450 are not involved in neurogenesis. dei is required for the differentiation of specific epidermal cells as muscle attachment sites (Armand et al. 1994). CG11450 is expressed in the embryonic mesoderm in a pattern that overlaps that of twist (Moore et al. 2000). During postembryonic development, CG11450 is involved in wing vein formation (CG11450 corresponds to the net locus; Brentrup et al. 2000). Thus, one plausible interpretation of the data is that dei and CG11450 are bona fide orthologs of the NeuroD genes, and that their phylogenetic relationships have been blurred by a rapid divergence associated with the acquisition of new functions.

Comparison of Fly, Nematode and Vertebrate Families

Most families comprise one protostome (fly and/or nematode) and several (often two) vertebrate genes. The fact that most families contain both fly and vertebrate genes suggests that there was no addition of new bHLH types in the corresponding lineages, and therefore no important diversification of the ancestral repertoire. Among the few families that lack fly genes, most also lack nematode genes. These may represent the emergence of new bHLH types in the vertebrate lineage, or alternatively a loss of ancestral types in both fly and nematode. The analysis of bHLH genes from molluscs or annelids might help settle this question. It is now widely believed that bilateria (triploblastic metazoans) are composed of three main lineages: deuterostomes (which include vertebrates and echinoderms) and protostomes, the latter comprising two large groups, the ecdysozoans (e.g., arthropods and nematodes) and the lophotrochozoans (e.g., annelids, molluscs, flatworms) (e.g., Aguinaldo et al. 1997; de Rosa et al. 1999; Adoutte et al. 2000). Therefore, the finding of orthologous genes in vertebrates and lophotrochozoans but not in fly and nematode would strongly suggest that gene loss(es) has occurred in the ecdysozoan lineage. Similarly, the case of families that contain vertebrate and either worm or fly genes is explained best by gene losses that occurred, inside the ecdysozoan clade, in either lineage after the arthropod/nematode divergence. This occurred in the fly lineage for only one family, MITF, which contains vertebrate and worm but no fly genes (the case of the NeuroD family has been discussed above). The much larger number of families that have vertebrate and fly members but no nematode representative, as well as the large number of nematode genes that cannot be clearly assigned to specific families (orphan genes), is likely a consequence of the high divergence rate reported for nematode genes in general (Aguinaldo et al. 1997; Mushegian et al. 1998) and that we found within our specific data set (data not shown; for details, see our Web site at http://www.cnrs-gif.fr/cgm/evodevo/bhlh/index.html).
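The reasoning in this section maps presence/absence patterns onto evolutionary interpretations, and can be summarized as a small decision function. This is only an illustrative sketch: the function name, the boolean encoding, and the wording of the returned labels are assumptions, and the label returned when lophotrochozoan data show no ortholog is our own cautious reading rather than a conclusion stated in the text.

```python
from typing import Optional


def interpret_family(
    vertebrate: bool,
    fly: bool,
    nematode: bool,
    lophotrochozoan: Optional[bool] = None,
) -> str:
    """Map a presence/absence pattern to the interpretation discussed.

    lophotrochozoan is None when no mollusc/annelid data are available.
    """
    if not vertebrate:
        return "pattern not discussed"
    if fly and nematode:
        return "ancestral type retained"
    if fly or nematode:
        # One ecdysozoan lineage lacks the gene: loss after the
        # arthropod/nematode divergence, or a sequence too divergent
        # to be assigned (the likely case for many nematode genes).
        missing = "nematode" if fly else "fly"
        return f"loss or high divergence in {missing} lineage"
    if lophotrochozoan is None:
        return "ambiguous: vertebrate novelty or ecdysozoan loss"
    if lophotrochozoan:
        return "loss in ecdysozoan lineage"
    # Assumption: absence here does not prove novelty, it only fails
    # to support the loss scenario.
    return "no support for ecdysozoan loss"
```

For the MITF family, described above as having vertebrate and worm but no fly genes, `interpret_family(True, False, True)` returns the fly-lineage label.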

Gene and Genome Duplications

Most bHLH families, like other gene families, comprise more members in vertebrates than in other phyla (Table 1). It has been proposed that this may reflect the occurrence of two rounds of genome duplication during early vertebrate evolution (Sidow 1996; Meyer and Schartl 1999), but this idea, mainly based on mapping of gene clusters, remains controversial (Skrabanek and Wolfe 1998; Hughes 1999; Smith et al. 1999; Martin 2001). Many gene families in vertebrates have fewer than four genes (Skrabanek and Wolfe 1998; Smith et al. 1999). However, this might result from gene loss during or after the rounds of duplication (Meyer and Schartl 1999). Within our set of bHLH genes, the most usual case was two mouse genes per family, but we know this set is likely to be incomplete because the entire genomic sequence of the mouse is not available. Even within this incomplete set, we observe that up to one-fourth of the families comprise four or more members (Table 1). As pointed out by Hughes (1999), the presence of four vertebrate members, by itself, does not support the genome duplication hypothesis. Support may come only from families whose phylogenetic tree shows a topology of the (AB) (CD) form (i.e., two pairs of two closely related paralogs) (Hughes 1999). Hughes (1999) discussed the phylogenies of 13 protein families important in development and found that only one of them shows an (AB) (CD) topology. We constructed individual trees for each bHLH family (available at http://www.cnrs-gif.fr/cgm/evodevo/bhlh/index.html) and often found one or two duplication(s) during vertebrate radiation (e.g., the Achaete-Scute family; Fig. 4). We checked the topology of the trees of families with four or more vertebrate members (nine families, see Table 1) and observed that none of the five families showing a reliable phylogeny has an (AB) (CD) topology (data not shown; see http://www.cnrs-gif.fr/cgm/evodevo/bhlh/index.html). Hence, our data set does not support the hypothesis of two rounds of genome duplication. Figure 4 also shows a feature we observed in several families: the existence of extra closely related genes in the tetraploid Xenopus and in ray-finned fishes such as the zebrafish Brachydanio rerio (actinopterygia). The latter observation is consistent with the hypothesis that the actinopterygian genome underwent a duplication, which took place after the actinopterygian-sarcopterygian lineage divergence (the sarcopterygian lineage includes coelacanths, lungfishes, and all tetrapods) (reviewed in Wittbrodt et al. 1998; Meyer and Schartl 1999).
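The (AB) (CD) test can be made concrete with a short sketch. It assumes that the tree of the four vertebrate paralogs has already been rooted (for instance with an invertebrate ortholog) and the outgroup pruned, and that the rooted tree is encoded as nested two-element tuples with gene names as strings. The encoding and the function name `is_ab_cd_topology` are hypothetical choices made for illustration.

```python
def is_ab_cd_topology(tree) -> bool:
    """True if a rooted four-leaf tree has the form ((A, B), (C, D)).

    Leaves are strings; internal nodes are two-element tuples.
    """

    def is_leaf(node) -> bool:
        return isinstance(node, str)

    def is_cherry(node) -> bool:
        # A cherry is an internal node whose two children are both leaves.
        return (
            isinstance(node, tuple)
            and len(node) == 2
            and all(is_leaf(child) for child in node)
        )

    # The root must split the four genes into two cherries.
    return (
        isinstance(tree, tuple)
        and len(tree) == 2
        and all(is_cherry(child) for child in tree)
    )
```

A symmetric tree such as `(("A", "B"), ("C", "D"))` passes the test, whereas a sequential (ladder-like) tree such as `("A", ("B", ("C", "D")))` does not. Bootstrap support for the two pairs would still have to be checked separately.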

Figure 4
A neighbor-joining (NJ) tree showing the evolutionary relationships among Achaete-Scute family members. Numbers above branches indicate percent support in bootstrap analyses (1000 replicates). This tree is rooted using the single cnidarian (Hv CNASH) ...

Duplications of Ecdysozoan bHLH

A few families contain more than one gene in fly and/or nematode, and in some cases, more genes than in vertebrates: the Achaete-Scute, Atonal, PTF1, Enhancer of split, Hairy, AHR, and TF4 families in Drosophila and AHR, Enhancer of split, and Max families in C. elegans (Table 1). The different protostome members of these families arose by duplications that occurred after the arthropod/nematode split within the ecdysozoan clade: For example, the four Drosophila achaete-scute genes are collectively orthologous to the two vertebrate genes and to the four nematode genes (Fig. 4); the three Drosophila Atonal genes are collectively orthologous to the two vertebrate and the single nematode genes (Fig. 3). We retrieved the chromosomal localizations of these genes from Flybase and Wormbase and observed that, in most cases, the members of a given family have very different localizations, often on different chromosomes (Tables 2 and 3). The three Drosophila Atonal genes, for example, are found on three different chromosome arms (2L, 3L and 3R; see Table 2). These localizations suggest that the duplications that gave rise to the paralogs are rather ancient events. However, in some cases, the duplications might have occurred more recently, as the paralogs are localized close to each other in the genome: this is, for example, the case of the four achaete-scute genes and the seven Enhancer of split genes, which have long been known to form gene complexes in Drosophila. We found one similar case in C. elegans: C17C3.10 and C17C3.8 are adjacent genes and are on the same DNA strand. In addition, two worm members of the Achaete-Scute family are found at a similar chromosomal localization (Table 3), although separated by several unrelated genes. Information about the timing of duplication events may come from evolutionary comparisons with increasingly distantly related species. For example, clear orthologs of three of the four achaete-scute genes have been found in another dipteran, Ceratitis capitata (Wülbeck and Simpson 2000), while a single ortholog to the four achaete-scute genes is found in the buckeye butterfly, Junonia coenia, a lepidopteran, and in the flour beetle, Tribolium castaneum, a coleopteran (Figure 4; Galant et al. 1998). Duplication, in this case, probably has occurred after the divergence of diptera from other insects.

Phylogenetic Relationships of bHLH Families: A Reappraisal of High-Order Groups

Although the bHLH motif has good resolving power to delimit families of proteins and describe their evolutionary relationships at the tips of the clades, the very early evolutionary history of the motif is more problematic (Atchley and Fitch 1997). Deep nodes usually have low statistical support (small bootstrap values). This is mainly a result of the small size of the conserved sequence and the existence of numerous ancient paralogs. Nevertheless, we found recurrent topologies when constructing trees with different sequence sets and different tree reconstruction procedures [maximum parsimony (MP), distance, and maximum likelihood (ML)]. The congruence between trees obtained with different methods and different data sets is usually considered in phylogenetic reconstructions as a good argument in favor of the validity of a given phylogeny (Adoutte et al. 2000); however, it is not a demonstration of its reliability. A representative tree of the different bHLH families is shown in Figure 2. Our results agree largely with those of Atchley and Fitch (1997), who described the four high-order groups (A–D) found in a neighbor-joining (NJ) tree, and with subsequent work of Atchley and collaborators (Atchley et al. 1999; Morgenstern and Atchley 1999). Although the high-order groups were supported only by low bootstrap values, their validity was confirmed by MP analyses of particular sites at different positions in the bHLH (Atchley and Fitch 1997), analyses of bHLH flanking regions (Morgenstern and Atchley 1999) and mathematical modeling (Atchley et al. 1999). The inclusion of the 44 orthologous families in the high-order groups is shown in Table 1.

Our results diverge from the previous analyses in a few points, however. First, we have had to revise the relationship between groups A and D, and to include group D within group A. Second, our analysis suggests that group B is paraphyletic and closest to the ancestral bHLH motif. Third, we have evidence that group C is not monophyletic but includes several independent occurrences of the bHLH-PAS association. Finally, the more extensive data set used in the present study led us to define two additional groups, E and F.

Our phylogenetic analysis (Fig. 2) reveals a large monophyletic group that corresponds to group A as defined by Atchley and Fitch (1997). This group includes the E12/E47 family genes and several other families whose members are able to heterodimerize with the E12/E47 proteins (Cabrera and Alonso 1991; Lassar et al. 1991; Van Doren et al. 1992). The phylogenetic analysis (Fig. 2) clearly shows that the Emc family is deeply embedded within group A. Furthermore, although group D proteins lack the DNA-binding motif, they are able to dimerize with several group A proteins (Benezra et al. 1990; Ellis et al. 1990; Garrell and Modolell 1990; Van Doren et al. 1991, 1992), but not with other types of bHLH motifs. Therefore, our results indicate that the Emc family, previously considered to define group D, should also be considered as belonging to group A.

We believe that group B is paraphyletic rather than monophyletic (Fig. 2). This group is probably closest to the ancestral bHLH type from which the group A, C, D, E, and F bHLH arose. The distribution of these proteins in various groups of organisms strongly supports this suggestion: Group B proteins are found in plants, yeast, and animals, whereas the other groups (A, C, D, E, and F) are found only in animals. Likewise, we did not find the group C of Atchley and Fitch (1997) to form a monophyletic group (Fig. 2). As this group comprises the bHLH-PAS genes, one obvious explanation for its paraphyly is that the association between the bHLH and the PAS domains occurred several times independently, consistent with the hypothesis of a modular evolution of the bHLH proteins by domain shuffling (for discussion, see Morgenstern and Atchley 1999).
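The distinction between a monophyletic and a non-monophyletic group can be stated operationally: a group is monophyletic if the smallest clade containing all of its members contains nothing else. The Python sketch below illustrates this test on a rooted tree written as nested tuples. The tree and the labels (B1, B2, A1, A2) are hypothetical and only mimic the situation described above, in which members of one group branch off successively before a clade formed by another group; they are not the topology of Figure 2.

```python
def leaf_set(tree):
    """Return the set of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaf_set(child) for child in tree))

def smallest_clade(tree, group):
    """Leaf set of the smallest subtree that contains every member of `group`."""
    group = set(group)
    children = () if isinstance(tree, str) else tree
    for child in children:
        if group <= leaf_set(child):
            # The whole group sits inside this child, so descend into it.
            return smallest_clade(child, group)
    return leaf_set(tree)

def is_monophyletic(tree, group):
    """True if the smallest clade containing `group` contains no other leaf."""
    return smallest_clade(tree, group) == set(group)

# Hypothetical rooted tree: B1 and B2 branch off before the (A1, A2) clade.
toy_tree = ("B1", ("B2", ("A1", "A2")))
```

Here `is_monophyletic(toy_tree, {"A1", "A2"})` is true, whereas `is_monophyletic(toy_tree, {"B1", "B2"})` is false, because the smallest clade containing B1 and B2 also contains A1 and A2. The same test applied to the root of a real tree depends on how that tree is rooted, which is an additional assumption.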

We found that all Hairy and Enhancer of split-related proteins form a well-supported monophyletic group that we named group E in accordance with the nomenclature of Atchley and Fitch (Fig. 2). The monophyly of this group is confirmed by the presence of several conserved amino acids flanking the bHLH and the presence of the WRPW peptide (Fisher and Caudy 1998).

Similarly, the HLH domain of the COE proteins appears well conserved among them and highly divergent from those of other bHLH families. Furthermore, all COE proteins contain a highly conserved domain, the COE domain, not found in any other proteins. Taken together, these observations strongly indicate that the COE proteins form a clearly distinct monophyletic group, which we named group F.

Conclusions: An Overview of bHLH Evolution

We have not been able to identify prokaryotic genes that would match our bHLH sequences. Therefore, it seems that the bHLH motif was established early in eukaryote evolution. The bHLH genes of yeast are involved in general transcriptional enhancement and cell cycle control, suggesting that this may have been the original function of the bHLH genes in primitive eukaryotes. An important diversification occurred independently in the animal and plant lineages, as seen by the 36 different families found exclusively in animals and 30 different bHLH genes found in A. thaliana, compared to the five genes found in yeast.

In animals, bHLH genes generally are involved in development and in tissue-specific gene regulation. The 38 families have representatives in the two major subdivisions of the animal kingdom, protostomes and deuterostomes, and must therefore have been represented in their common ancestor prior to the Cambrian radiation, which saw the emergence of all present-day phyla and many extinct ones. Morphologically, these ancestors (also called Urbilateria; De Robertis and Sasai 1996) probably were coelomates with antero-posterior and dorso-ventral polarity, rudimentary appendages, some form of metamerism, a heart, sense organs such as photoreceptors, and a complex nervous system (Knoll and Carroll 1999). Genetically, they possessed numerous Hox genes (at least seven; de Rosa et al. 1999) as well as other homeobox genes, several intercellular signaling pathways (TGF-β, Hedgehog, Notch, EGF), and several Pax genes (Galliot et al. 1999). Our analysis indicates that their genome contained at least 35 different bHLH genes. The functional conservation that often is observed between fly and vertebrate bHLH orthologs indicates that some of the developmental functions associated with present-day bHLH genes already were established in these ancestral organisms, further indicating the genomic and developmental complexity of this ancient ancestor.

METHODS

Protein sequences were obtained mostly by BLASTP search (Altschul et al. 1990) at the National Center for Biotechnology Information (NCBI) and the Sanger Centre, as well as from Swissprot, GenPept, and TrEMBL through SRS (LION Bioscience AG) and Nentrez (NCBI) software. A table containing all sequences and their accession numbers is available on our Web site (http://www.cnrs-gif.fr/cgm/evodevo/bhlh/index.html). Protein alignments were carried out using CLUSTALW (Thompson et al. 1994) with no adjustment of the default parameters, and were subsequently edited and manually improved in Genedoc Multiple Sequence Alignment Editor and Shading Utility, Version 2.6.001 (Nicholas et al. 1997). The evaluation of percentage conservation of residues in multiple sequence alignments was done using the Blosum62 Similarity Scoring Table (Henikoff and Henikoff 1992). Only the bHLH motif (determined as in Ferre-D'Amare et al. 1993), plus a few flanking amino acids, was used in most of our analyses because the remaining parts of proteins from independent clades either are not homologous or have diverged so much that the alignments are meaningless. The facilities of the Belgian EMBnet Node (http://be.embnet.org) were used for all database searches through SRS and sequence analysis using Genedoc software, and for most of the protein alignments using CLUSTALW. Trees were built using unweighted maximum parsimony (MP) and neighbor-joining (NJ) algorithms with the PAUP 4.0 program (Swofford 1993). The MP analysis was performed with the following settings: heuristic search over 100 bootstrap replicates, MAXTREES set to 1000 because of computer limitations, other parameters set to default values. When large numbers of sequences (>150) were handled, as a result of computer limitations, bootstraps were made by “fast” stepwise-additions (1000 replicates) in PAUP 4.0. Extensive computer simulations have shown that such fast algorithms are as efficient as more extensive search algorithms when a large number of sequences is used (Takahashi and Nei 2000). Distance trees were constructed with the NJ algorithm (Saitou and Nei 1987) using PAUP 4.0 based on a Dayhoff PAM 250 distance matrix (Dayhoff et al. 1978). Bootstrap replicates of the NJ trees (1000) also were made with PAUP 4.0, parameters set to default values.
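For readers unfamiliar with the NJ algorithm (Saitou and Nei 1987), the following Python sketch illustrates its core loop on a small distance matrix. It is an illustration only, not the PAUP 4.0 implementation used here: it returns the topology as nested tuples without branch lengths, and the five taxon names and their distances are an arbitrary toy matrix, not PAM 250 distances computed from bHLH sequences. At each step the pair minimizing the criterion Q(i,j) = (n − 2)d(i,j) − Σd(i,k) − Σd(j,k) is joined and replaced by a new node.

```python
from itertools import combinations

def neighbor_joining(names, matrix):
    """Return an unrooted NJ topology as nested tuples (branch lengths omitted).

    `names` are unique labels; `matrix` is a symmetric list of lists of distances.
    """
    nodes = list(names)
    d = {frozenset((a, b)): matrix[i][j]
         for (i, a), (j, b) in combinations(enumerate(names), 2)}

    def dist(a, b):
        return 0.0 if a == b else d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all others.
        total = {a: sum(dist(a, b) for b in nodes) for a in nodes}
        # Join the pair with the smallest Q value (first pair wins ties).
        a, b = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * dist(*p) - total[p[0]] - total[p[1]])
        new = (a, b)
        # Distance from the new internal node to every remaining node.
        for c in nodes:
            if c != a and c != b:
                d[frozenset((new, c))] = 0.5 * (dist(a, c) + dist(b, c) - dist(a, b))
        nodes = [c for c in nodes if c != a and c != b] + [new]
    return tuple(nodes)

# Hypothetical toy distances between five taxa.
toy_names = ["a", "b", "c", "d", "e"]
toy_matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
toy_tree = neighbor_joining(toy_names, toy_matrix)
```

On this toy matrix the sketch groups a with b and d with e, with c attached between the two pairs. Ties in Q are broken by input order here, which is a simplification; production programs may resolve them differently.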

Some alignments also were analyzed by maximum likelihood (ML) using Puzzle 4.0.2 (Strimmer and Von Haeseler 1996). The ML was performed using the quartet puzzling tree search procedure with 10000 puzzling steps, using the Jones-Taylor-Thornton (JTT) model of substitution (Jones et al. 1992), the frequencies of amino acids being estimated from the data set (Strimmer and Von Haeseler 1996).

The trees were displayed with TreeView, Version 1.5 (Page 1996), saved as PICT files, converted into JPEG files using Graphic Converter, and then annotated using Adobe Photoshop.

Acknowledgments

We thank Robert Herzog, Marc Colet, and André Adoutte for support. We are especially grateful to Alain Ghysen for his help in the writing of this article. We thank Daniel Van Belle for comments on protein structure. We also thank André Adoutte, Robert Herzog, Nicolas Lartillot, Michel Milinkovitch, and two anonymous referees for helpful comments on the manuscript. This work has been supported by the Federal Office for Scientific, Technical, and Cultural Affairs (V.L.) and Centre National de la Recherche Scientifique and Université de Paris-Sud (M.V.).

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.

Footnotes

E-MAIL rf.fig-srnc.mgc@troovrev; FAX 33 169 823160.

Article and publication are at www.genome.org/cgi/doi/10.1101/gr.177001.

REFERENCES

  • Adams MD, et al. The genome sequence of Drosophila melanogaster. Science. 2000;287:2185–2195. [PubMed]
  • Adoutte A, Balavoine G, Lartillot N, Lespinet O, Prud'homme B, de Rosa R. The new animal phylogeny: Reliability and implications. Proc Natl Acad Sci. 2000;97:4453–4456. [PMC free article] [PubMed]
  • Aguinaldo AMA, Turbeville JM, Linford LS, Rivera MC, Garey JR, Raff RA, Lake JA. Evidence for a clade of nematodes, arthropods and other moulting animals. Nature. 1997;387:489–493. [PubMed]
  • Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–410. [PubMed]
  • Arendt D, Nübler-Jung K. Comparisons of early nerve cord development in insects and vertebrates. Development. 1999;126:2309–2325. [PubMed]
  • Armand P, Knapp AC, Hirsch AJ, Wieschaus EF, Cole MD. A novel basic helix-loop-helix protein is expressed in muscle attachment sites of the Drosophila epidermis. Mol Cell Biol. 1994;14:4145–4154. [PMC free article] [PubMed]
  • Atchley WR, Fitch WM. A natural classification of the basic helix-loop-helix class of transcription factors. Proc Natl Acad Sci. 1997;94:5172–5176. [PMC free article] [PubMed]
  • Atchley WR, Terhalle W, Dress A. Positional dependence, cliques, and predictive motifs in the bHLH protein domain. J Mol Evol. 1999;48:501–516. [PubMed]
  • Baylies MK, Bate M, Gomez MR. Myogenesis: A view from Drosophila. Cell. 1998;93:921–927. [PubMed]
  • Benezra R, Davis RL, Lockshon D, Turner DL, Weintraub H. The protein Id: A negative regulator of helix-loop-helix DNA binding proteins. Cell. 1990;61:49–59. [PubMed]
  • Brentrup D, Lerch H-P, Jäckle H, Noll M. Regulation of Drosophila wing vein patterning: net encodes a bHLH protein repressing rhomboid and is repressed by Rhomboid-dependent Egfr signalling. Development. 2000;127:4729–4741. [PubMed]
  • Bush A, Hiromi Y, Cole M. biparous: A novel bHLH gene expressed in neuronal and glial precursors in Drosophila. Dev Biol. 1996;180:759–772. [PubMed]
  • Cabrera CV, Alonso MC. Transcriptional activation by heterodimers of the achaete-scute and daughterless gene product of Drosophila. EMBO J. 1991;10:2965–2973. [PMC free article] [PubMed]
  • The C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: A platform for investigating biology. Science. 1998;282:2012–2018. [PubMed]
  • Crews ST. Control of cell lineage-specific development and transcription by bHLH-PAS proteins. Genes & Dev. 1998;12:607–620. [PubMed]
  • Crozatier M, Valle D, Dubois L, Ibnsouda S, Vincent A. collier, a novel regulator of Drosophila head development, is expressed in a single mitotic domain. Curr Biol. 1996;6:707–718. [PubMed]
  • Dang CV, Dolde D, Gillison ML, Kato GJ. Discrimination between related DNA sites by a single amino acid residue of myc-related basic-helix-loop-helix proteins. Proc Natl Acad Sci. 1992;89:599–602. [PMC free article] [PubMed]
  • Dayhoff MO, Schwartz RM, Orcutt BC. A model of evolutionary change in proteins. In: Dayhoff MO, editor. Atlas of Protein Sequence Structure. 5, Suppl. 3. Washington DC.: National Biomedical Research Foundation; 1978. pp. 345–352.
  • De Robertis EM, Sasai Y. A common plan for dorsoventral patterning in Bilateria. Nature. 1996;380:37–40. [PubMed]
  • de Rosa R, Grenier JK, Andreeva T, Cook CE, Adoutte A, Akam M, Carroll SB, Balavoine G. Hox genes in brachiopods and priapulids and protostome evolution. Nature. 1999;399:772–776. [PubMed]
  • Ellis HM, Spann DR, Posakony JW. Extramacrochaetae, a negative regulator of sensory organ development in Drosophila, defines a new class of helix-loop-helix proteins. Cell. 1990;61:27–38. [PubMed]
  • Facchini LM, Penn LZ. The molecular role of Myc in growth and transformation: Recent discoveries lead to new insights. FASEB J. 1998;12:633–651. [PubMed]
  • Ferre-D'Amare AR, Prendergast GC, Ziff EB, Burley SK. Recognition by Max of its cognate DNA through a dimeric b/HLH/Z domain. Nature. 1993;363:38–45. [PubMed]
  • Fisher A, Caudy M. The function of hairy-related bHLH repressor proteins in cell fate decisions. BioEssays. 1998;20:298–306. [PubMed]
  • Fitch WM. Distinguishing homologous from analogous proteins. Syst Zool. 1970;19:99–113. [PubMed]
  • The Flybase Consortium. The FlyBase database of the Drosophila Genome Projects and community literature. Nucleic Acids Res. 1999;27:85–88. [PMC free article] [PubMed]
  • Galant R, Skeath JB, Paddock S, Lewis D, Carroll SB. Expression pattern of a butterfly achaete-scute homolog reveals the homology of butterfly wing scales and insect sensory bristles. Curr Biol. 1998;8:807–813. [PubMed]
  • Galliot B, de Vargas C, Miller D. Evolution of homeobox genes: Q50 Paired-like genes founded the Paired class. Dev Genes Evol. 1999;209:186–197. [PubMed]
  • Garcia-Bellido A. The bithorax syntagma. In: Lakovaara S, editor. Advances in genetics, development, and evolution of Drosophila. VII European Drosophila Research Conference, Plenum Press; 1981. pp. 135–148.
  • Garrell J, Modolell J. The Drosophila extramacrochaete locus, an antagonist of proneural genes that, like these genes, encodes a helix-loop-helix protein. Cell. 1990;61:39–48. [PubMed]
  • Gautier P, Ledent V, Massaer M, Dambly-Chaudière C, Ghysen A. tap, a Drosophila bHLH expressed in chemosensory organs. Gene. 1997;191:15–21. [PubMed]
  • Goding CR. Mitf from neural crest to melanoma: Signal transduction and transcription in the melanocyte lineage. Genes & Dev. 2000;14:1712–1728. [PubMed]
  • Goulding SE, zur Lage P, Jarman AP. amos, a proneural gene for Drosophila olfactory sense organs that is regulated by lozenge. Neuron. 2000a;25:69–78. [PubMed]
  • Goulding SE, White NM, Jarman AP. cato encodes a basic helix-loop-helix transcription factor implicated in the correct differentiation of Drosophila sense organs. Dev Biol. 2000b;221:120–131. [PubMed]
  • Hallam S, Singer E, Waring D, Jin Y. The C. elegans NeuroD homolog cnd-1 functions in multiple aspects of motor neuron fate specification. Development. 2000;127:4239–4252. [PubMed]
  • Harfe BD, Branda CS, Krause M, Stern MJ, Fire A. MyoD and the specification of muscle and non-muscle fates during postembryonic development of the C. elegans mesoderm. Development. 1998a;125:2479–2488. [PubMed]
  • Harfe BD, Vaz Gomes A, Kenyon C, Liu J, Krause M, Fire A. Analysis of a Caenorhabditis elegans Twist homolog identifies conserved and divergent aspects of mesodermal patterning. Genes & Dev. 1998b;12:2623–2635. [PMC free article] [PubMed]
  • Hassan BA, Bellen HJ. Doing the MATH: Is the mouse a good model for fly development? Genes & Dev. 2000;14:1852–1865. [PubMed]
  • Henikoff S, Henikoff JG. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci. 1992;89:10915–10919. [PMC free article] [PubMed]
  • Henriksson M, Luscher B. Proteins of the Myc network: Essential regulators of cell growth and differentiation. Adv Cancer Res. 1996;68:109–182. [PubMed]
  • Holland PWH, Garcia-Fernandez JD. Hox genes and chordate evolution. Dev Biol. 1996;173:382–395. [PubMed]
  • Huang F. Syntagms in development and evolution. Int J Dev Biol. 1998;42:487–494. [PubMed]
  • Huang M-L, Hsu C-H, Chien C-T. The proneural gene amos promotes multiple dendritic neuron formation in the Drosophila peripheral nervous system. Neuron. 2000;25:57–67. [PubMed]
  • Hughes A. Phylogenies of developmentally important proteins do not support the hypothesis of two rounds of genome duplication early in vertebrate history. J Mol Evol. 1999;48:565–576. [PubMed]
  • Jan YN, Jan LY. HLH proteins, fly neurogenesis, and vertebrate myogenesis. Cell. 1993;75:827–830. [PubMed]
  • Jarman AP, Grau Y, Jan LY, Jan YN. atonal is a proneural gene that directs chordotonal development in the Drosophila peripheral nervous system. Cell. 1993;73:1307–1321. [PubMed]
  • Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. CABIOS. 1992;8:275–282. [PubMed]
  • Kadesch T. Consequences of heteromeric interactions among helix-loop-helix proteins. Cell Growth Differ. 1993;4:49–55. [PubMed]
  • Kageyama R, Nakanishi S. Helix-loop-helix factors in growth and differentiation of the vertebrate nervous system. Curr Opin Genet Dev. 1997;7:659–665. [PubMed]
  • Knoll AH, Carroll SB. Early animal evolution: Emerging views from comparative biology and geology. Science. 1999;284:2129–2137. [PubMed]
  • Lassar AB, Davis RL, Wright WE, Kadesch T, Murre C, Voronova A, Baltimore D, Weintraub H. Functional activity of myogenic HLH proteins requires hetero-oligomerization with E12/E47-like proteins in vivo. Cell. 1991;66:305–315. [PubMed]
  • Ledent V, Gaillard F, Gautier P, Ghysen A, Dambly-Chaudière C. Expression and function of tap in the gustatory and olfactory organs of Drosophila. Int J Dev Biol. 1998;42:163–170. [PubMed]
  • Martin A. Is tetralogy true? Lack of support for the “one-to-four rule.” Mol Biol Evol. 2001;18:89–93. [PubMed]
  • Meyer A, Schartl M. Gene and genome duplications in vertebrates: the one-to-four (to-eight in fish) rule and the evolution of novel gene functions. Curr Opin Cell Biol. 1999;11:699–704. [PubMed]
  • Moore AW, Barbel S, Jan LY, Jan YN. A genomewide survey of basic helix-loop-helix factors in Drosophila. Proc Natl Acad Sci. 2000;97:10436–10441. [PMC free article] [PubMed]
  • Morgenstern B, Atchley WR. Evolution of bHLH transcription factors: Modular evolution by domain shuffling. Mol Biol Evol. 1999;16:1654–1663. [PubMed]
  • Murre C, McCaw PS, Vaessin H, Caudy M, Jan LY, Cabrera CV, Buskin JN, Hauschka SD, Lassar AB, Weintraub H, et al. Interactions between heterologous helix-loop-helix proteins generate complexes that bind specifically to a common DNA sequence. Cell. 1989a;58:537–544. [PubMed]
  • Murre C, McCaw PS, Baltimore D. A new DNA binding and dimerizing motif in Immunoglobulin enhancer binding, Daughterless, MyoD, and Myc proteins. Cell. 1989b;56:777–783. [PubMed]
  • Mushegian AR, Garey JR, Martin J, Liu LX. Large-scale taxonomic profiling of eukaryotic model organisms: A comparison of orthologous proteins encoded by the human, fly, nematode, and yeast genomes. Genome Res. 1998;8:590–598. [PubMed]
  • Nicholas KB, Nicholas HB Jr, Deerfield DW II. Genedoc: Analysis and visualization of genetic variation. Embnew. News. 1997;4:14.
  • Ohsako S, Hyer J, Panganiban, Olivier I, Caudy M. Hairy function as a DNA-binding helix-loop-helix repressor of Drosophila sensory organ formation. Genes & Dev. 1994;8:2743–2755. [PubMed]
  • Page RD. TreeView: An application to display phylogenetic trees on personal computers. Comput Appl Biosci. 1996;12:357–358. [PubMed]
  • Philippe H, Laurent J. How good are deep phylogenetic trees? Curr Opin Genet Dev. 1998;8:616–623. [PubMed]
  • Rubin GM, et al. Comparative genomics of the eukaryotes. Science. 2000;287:2204–2215. [PMC free article] [PubMed]
  • Ruvkun G, Hobert O. The taxonomy of developmental control in Caenorhabditis elegans. Science. 1998;282:2033–2041. [PubMed]
  • Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4:406–425. [PubMed]
  • Schreiber-Agus N, Stein D, Chen K, Goltz JS, Stevens L, DePinho RA. Drosophila Myc is oncogenic in mammalian cells and plays a role in the diminutive phenotype. Proc Natl Acad Sci. 1997;94:1235–1240. [PMC free article] [PubMed]
  • Sidow A. Gen(om)e duplications in the evolution of early vertebrates. Curr Opin Genet Dev. 1996;6:715–722. [PubMed]
  • Skrabanek L, Wolfe KH. Eukaryote genome duplication: Where's the evidence? Curr Opin Genet Dev. 1998;8:694–700. [PubMed]
  • Smith NGC, Knight R, Hurst LD. Vertebrate genome evolution: A slow shuffle or a big bang? BioEssays. 1999;21:697–703. [PubMed]
  • Strimmer K, von Haeseler A. Quartet puzzling: A quartet maximum likelihood method for reconstructing tree topologies. Mol Biol Evol. 1996;13:964–969.
  • Swofford DL. PAUP* Phylogenetic Analysis Using Parsimony, Version 4. 1998. (Sinauer, Sunderland, MA).
  • Takahashi K, Nei M. Efficiencies of fast algorithms of phylogenetic inference under the criteria of maximum parsimony, minimum evolution, and maximum likelihood when a large number of sequences are used. Mol Biol Evol. 2000;17:1251–1258. [PubMed]
  • Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. [PMC free article] [PubMed]
  • Van Doren M, Ellis HM, Posakony JW. The Drosophila Extramacrochaetae protein antagonizes sequence-specific DNA binding by Daughterless/Achaete-Scute protein complexes. Development. 1991;113:245–255. [PubMed]
  • Van Doren M, Powell PA, Pasternak D, Singson A, Posakony JW. Spatial regulation of proneural gene activity: Auto- and cross-activation of achaete is antagonized by extramacrochaete. Genes & Dev. 1992;6:2592–2605. [PubMed]
  • Van Doren M, Bayley AM, Esnayra J, Ede K, Posakony JW. Negative regulation of proneural gene activity: Hairy is a direct transcriptional repressor of achaete. Genes & Dev. 1994;8:2729–2742. [PubMed]
  • Weintraub H. The MyoD family and myogenesis: Redundancy, networks, and thresholds. Cell. 1993;75:1241–1244. [PubMed]
  • Wittbrodt J, Meyer A, Schartl M. More genes in fish? BioEssays. 1998;20:511–515.
  • Wülbeck C, Simpson P. Expression of achaete-scute homologues in discrete proneural clusters on the developing notum of the medfly Ceratitis capitata, suggests a common origin for the stereotyped bristle patterns of higher Diptera. Development. 2000;127:1411–1420. [PubMed]
  • Zhao C, Emmons SW. A transcription factor controlling development of peripheral sense organs in C. elegans. Nature. 1995;373:74–78. [PubMed]

Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press