Proc Natl Acad Sci U S A. Sep 22, 2009; 106(38): 16327–16332.
Published online Sep 3, 2009. doi:  10.1073/pnas.0907914106
PMCID: PMC2752591

Elephant shark (Callorhinchus milii) provides insights into the evolution of Hox gene clusters in gnathostomes


We have sequenced and analyzed Hox gene clusters from elephant shark, a holocephalian cartilaginous fish. Elephant shark possesses 4 Hox clusters with 45 Hox genes that include orthologs for a higher number of ancient gnathostome Hox genes than the 4 clusters in tetrapods and the supernumerary clusters in teleost fishes. Phylogenetic analysis of elephant shark Hox genes from 7 paralogous groups that contain all of the 4 members indicated an ((AB)(CD)) topology for the order of Hox cluster duplication, providing support for the 2R hypothesis (i.e., 2 rounds of whole-genome duplication during the early evolution of vertebrates). Comparisons of noncoding sequences of the elephant shark and human Hox clusters have identified a large number of conserved noncoding elements (CNEs), which represent putative cis-regulatory elements that may be involved in the regulation of Hox genes. Interestingly, in fugu more than 50% of these ancient CNEs have diverged beyond recognition in the duplicated (HoxA, HoxB, and HoxD) as well as the singleton (HoxC) Hox clusters. Furthermore, the b-paralogs of the duplicated fugu Hox clusters are virtually devoid of unique ancient CNEs. In contrast to fugu Hox clusters, elephant shark and human Hox clusters have lost fewer ancient CNEs. If these ancient CNEs are indeed enhancers directing tissue-specific expression of Hox genes, divergence of their sequences in vertebrate lineages might have led to altered expression patterns and presumably the functions of their associated Hox genes.

Keywords: conserved noncoding elements, gene loss, genome duplication, teleost fish, fugu

Hox genes encode homeodomain-containing transcription factors that specify the identities of body segments along the anterior–posterior axis of metazoans. Hox genes occur in clusters that were generated by a series of tandem duplications of ancestral gene(s) before the divergence of cnidarians and bilaterians (1). The Hox clusters exhibit a striking phenomenon of spatial collinearity whereby genes in the 3′-end of the cluster are expressed in the anterior part of the embryo, whereas those in the 5′-end are expressed in the posterior part. In vertebrates, Hox cluster genes also exhibit temporal collinearity; that is, genes located in the 3′-end of the cluster are expressed early during development, whereas genes in the 5′-end are expressed later (2). Recent comparative studies of Hox clusters in model genomes have shown that Hox clusters have experienced repeated molecular changes, including fragmentation, cluster duplication, gene loss, coding-sequence divergence, and cis-regulatory element evolution (3–7). Because of their critical role in defining the identities of body segments, Hox genes are believed to have played a key role in driving the morphologic diversification of animals (8–10) and thus are of particular interest in understanding the genetic basis of morphologic diversity of metazoans.

Among chordates, the cephalochordate amphioxus possesses a single Hox cluster (11, 12). In urochordates such as Ciona and Oikopleura, the single cluster is highly fragmented and dispersed in the genome (13, 14). In contrast to these invertebrate chordates, vertebrates contain multiple Hox clusters that are tightly organized. However, the number of Hox clusters and Hox genes varies among vertebrates. Mammals and other tetrapods contain 4 Hox clusters (HoxA, HoxB, HoxC, and HoxD) resulting from 2 rounds of genome duplication early during the evolution of vertebrates. Most teleost fishes contain 7 Hox clusters owing to a fish-specific genome duplication event in the ray-finned fish lineage followed by the loss of 1 duplicate cluster. Whereas a duplicate HoxD cluster is lost in zebrafish, acanthopterygians such as fugu, medaka, and cichlids have lost a duplicate HoxC cluster. In addition, several Hox genes have experienced lineage-specific secondary losses resulting in each of these teleosts possessing a unique set of Hox genes (5, 15). In contrast to these teleosts, Atlantic salmon contains as many as 13 Hox clusters owing to an additional genome duplication in a salmonid ancestor (16). Although lampreys and hagfishes are known to contain multiple Hox clusters, the exact number of Hox clusters in these primitive jawless vertebrates is not known. PCR-based surveys have indicated that lampreys may contain 3 or 4 Hox clusters (17, 18), and hagfishes may contain as many as 7 Hox clusters (19). However, phylogenetic analyses suggest that some of the clusters are the result of lineage-specific cluster duplications (19, 20), and thus the number of Hox clusters in the stem jawless-vertebrate lineage is unclear.

Cartilaginous fishes are the most basal group of living jawed vertebrates (gnathostomes). They comprise 2 groups, the elasmobranchs (sharks, rays, and skates) and holocephalians (chimaera) that diverged ≈374 million years ago (21). By virtue of their phylogenetic position, cartilaginous fishes are a critical outgroup for studying the evolution of Hox gene clusters in gnathostomes. To date, a complete HoxA cluster and a partial HoxD cluster (HoxD5 to HoxD14) have been characterized in an elasmobranch cartilaginous fish, the horn shark (Heterodontus francisci) (22, 23). Comparisons of the single HoxA cluster in horn shark and human with duplicated HoxA clusters in zebrafish have identified many noncoding elements conserved in horn shark and human over ≈450 million years of evolution (24). These elements are likely to be cis-regulatory elements that are under evolutionary constraint.

We recently identified elephant shark (Callorhinchus milii), a holocephalian cartilaginous fish, as a useful model cartilaginous fish genome because of its relatively small genome (910 Mb) and generated 1.4× coverage sequence of its genome (25). Analysis of these sequences, which represent ≈75% of the genome, identified 37 Hox genes belonging to 4 putative Hox clusters (25). However, because of the low coverage of the genome, most Hox genes were in fragments, and the number and organization of Hox clusters could not be confirmed. In the present study we have sequenced BAC clones and determined complete sequences of the Hox gene clusters in elephant shark. We have also analyzed the pattern of evolution of conserved noncoding elements (CNEs) in the Hox clusters of elephant shark, human, and a teleost fish (fugu). Our study confirms that elephant shark has 4 Hox clusters like tetrapods, but the 4 clusters contain more Hox genes than tetrapods. In contrast to elephant shark and human Hox clusters that have lost only a small number of ancient CNEs, more than 50% of ancient CNEs have diverged beyond recognition in the Hox clusters of fugu.

Results and Discussion

Hox Clusters in Elephant Shark.

Using probes for selected Hox genes identified from the 1.4× coverage genome sequence of elephant shark, we isolated BACs and determined sequences for 4 Hox cluster loci in elephant shark (Fig. 1). All of the 37 Hox gene fragments identified in the 1.4× coverage sequence map to these loci, strongly suggesting that elephant shark contains only 4 Hox clusters (HoxA–D). The elephant shark Hox cluster loci display a distinct pattern of repetitive sequences within and outside the Hox gene clusters (Fig. 1); whereas the Hox clusters contain only 3.7% repetitive sequences, their flanking regions contain 26% repetitive sequences. This pattern suggests that repetitive sequences have been selectively excluded from the elephant shark Hox clusters. Interestingly, the density of repetitive sequences in the Hox clusters of elephant shark is similar to that in the single Hox cluster in amphioxus (3.9%), yet the elephant shark Hox clusters are 3 to 4 times smaller than the amphioxus Hox cluster (≈448 kb) (11). The exclusion of repetitive sequences, therefore, does not seem to be responsible for the compact sizes of the elephant shark Hox clusters.

Fig. 1.
Hox cluster loci in elephant shark. Genes are represented as colored boxes, and arrows denote the direction of transcription. Pseudogenes are denoted by the symbol Ψ. The bars below represent repetitive sequences predicted using CENSOR (see Methods ...

The sizes of the elephant shark Hox clusters are comparable to their orthologs in humans except for the HoxC cluster, which is considerably larger (≈172 kb) than its ortholog in humans (≈117 kb) (supporting information Fig. S1). The large size of the elephant shark HoxC cluster is mainly due to the presence of HoxC3 and HoxC1 genes that are absent in mammals and a relatively higher content of repetitive sequences (9.5%). A majority of these repetitive sequences (6.8%) are retrotransposons and are found in the 3′ end of the cluster (Fig. 1). A similar high concentration of retrotransposons (6.7%) has been recently reported in the 4 Hox clusters of the anole lizard, which are considerably larger (≈173–324 kb) than their orthologs in other vertebrates (26). Because transposable elements are a major source of genetic diversity, it has been suggested that the insertion of retrotransposons into the Hox clusters may have been associated with the evolution of morphologic diversity of reptiles (26). It would be interesting to investigate whether the insertion of retrotransposons into the HoxC cluster of elephant shark has modified the regulation and function of its HoxC genes.

The 4 Hox clusters in elephant shark contain 45 Hox genes, which is more than the genes present in the 4 Hox clusters in tetrapods. Furthermore, some of the genes present in the elephant shark Hox clusters have been lost in the supernumerary Hox clusters in teleost fishes. Thus, elephant shark contains more ancient gnathostome Hox genes than any known bony vertebrate. In addition to 45 intact Hox genes, the elephant shark Hox clusters contain remnants of 2 Hox genes, HoxA14 and HoxB14. An inactive HoxA14 gene has been previously identified in the horn shark (23), but this is a unique instance of a HoxB14 pseudogene in vertebrates. The presence of remnants of a HoxA14 gene in elephant shark and horn shark, which separated ≈374 million years ago (21), indicates that this gene was functional for a long time in the 2 lineages and became inactive relatively recently. In addition to the Hox pseudogenes, we also identified an Evx pseudogene closely linked to the HoxB cluster. The HoxA and HoxD clusters of elephant shark and other gnathostomes each contain an intact Evx gene (called Evx1 and Evx2, respectively). However, no Evx gene has been identified in the HoxB clusters of tetrapods, although HoxB clusters of some teleost fishes contain an intact Evx homolog, called Eve (15, 27). The identification of remnants of an Evx in the HoxB cluster of elephant shark suggests that this gene was independently lost in elephant shark and tetrapods. The elephant shark Hox clusters also contain 6 microRNA genes (Fig. 1). One of the microRNAs, mir-10c located between HoxC5 and HoxC4, is lost in mammals (28), whereas mir-196a-1 (between HoxB10 and HoxB9) is lost in medaka (5).

On the basis of the intact and inactivated Hox genes identified in the Hox clusters of elephant shark, we can infer that the last common ancestor of cartilaginous fishes and bony vertebrates contained 4 Hox clusters with at least 47 Hox genes (Fig. 2). Although all except 2 of these genes have been retained in elephant shark, several of these ancient gnathostome Hox genes have experienced differential losses in various bony vertebrate lineages at various stages of evolution. For example, 3 of the ancient gnathostome genes, HoxD2, HoxD5, and HoxD14, were lost in a common ancestor of bony vertebrates (Fig. 2 and Fig. S2), whereas HoxB7 was lost independently in the pufferfish, medaka, and tilapia lineages relatively recently (Fig. S2). Between the tetrapod and teleost lineages, more ancient Hox genes have experienced losses in the teleost lineage (10 genes) than in the tetrapod lineage (6 genes) (Fig. S2). It is not clear whether these ancient genes were redundant genes that were retained for varying periods and then lost, or whether they performed some unique functions that have been perturbed in the lineages that have lost them. There is evidence to suggest that 1 of the ancient Hox genes, HoxD14, which has been lost in bony vertebrates, had a unique expression pattern in cartilaginous fishes; it is expressed in a restricted cell population in the hindgut, but not in the central nervous system, somites, or fin buds/folds that are known to express Hox genes (29). Consequent to its loss in bony vertebrates, the function performed by this gene in cartilaginous fishes must have been perturbed in bony vertebrates. Examination of the expression patterns and functions of other elephant shark Hox genes that have been lost in various bony vertebrate lineages should help in understanding the consequences of their losses.

Fig. 2.
Hox gene clusters in elephant shark, human, and fugu. Loss of ancestral gnathostome Hox genes is indicated. Star denotes whole-genome duplication. HoxA14 is present in coelacanth and is therefore marked as independently lost in human and fugu. HoxB10 ...

Order of Hox Cluster Duplications.

Although it is widely accepted that there were 2 rounds of genome duplication during the early evolution of vertebrates (the 2R hypothesis), there has been conflicting evidence that supports scenarios such as one round of duplication followed by large-scale segmental duplications (30, 31) and only segmental duplications without genome duplication (32). Phylogenetic analyses of Hox genes have been used in the past to resolve this debate. If the 4 Hox clusters are the result of 2R, the phylogeny of the genes from the 4 clusters should show a symmetric topology. However, none of the studies has yielded a symmetric topology with strong statistical support (33, 34). We hypothesized that the Hox genes in elephant shark are good candidates to resolve this debate for 2 reasons: (i) the coding sequences in elephant shark have been found to evolve at a slower rate than in other vertebrates (35), and hence the degree of variation and divergence between Hox paralogs will be lower compared with other vertebrates; and (ii) in elephant shark, 7 of the 14 paralogous Hox groups (Hox1, -3, -4, -5, -9, -10, and -13) are represented by all of the 4 members, the highest in any vertebrate characterized to date. The phylogenetic analysis of these 7 paralogous group Hox genes should generate a robust phylogeny. We concatenated the coding sequence alignments of the 7 paralogous groups and carried out phylogenetic analysis using amphioxus as the outgroup. We used 3 different methods of phylogenetic inference: maximum likelihood (ML), Bayesian (BA), and neighbor-joining (NJ). The first 2 are model-based methods, whereas the last is a distance-based method. Each of these methods has its own strengths and limitations. Interestingly, all 3 methods gave an ((AB)(CD)) topology with moderate to high support values (Fig. 3). This congruence in the inferred topology between the 3 methods provides strong support for the ((AB)(CD)) topology.
We then tested the 15 possible cluster duplication topologies using CONSEL (36) to determine whether the topology inferred by the phylogenetic analysis was significantly better than the other topologies. All of the tests implemented in CONSEL indicated that the ((AB)(CD)) topology was the most likely topology (Table S1). This well-supported symmetric topology provides strong support for the 2R hypothesis.
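The figure of 15 candidate topologies follows from counting the rooted binary trees that can be drawn for 4 labeled clusters: 3 are symmetric, of the form ((XY)(ZW)), and 12 are asymmetric, of the form (((XY)Z)W). The Python sketch below enumerates them for illustration only; it is not part of the CONSEL analysis, and the function names are hypothetical.

```python
from itertools import combinations

CLUSTERS = ("A", "B", "C", "D")

def symmetric_topologies(taxa):
    """Trees of the form ((XY)(ZW)): two pairs joined at the root."""
    first, rest = taxa[0], taxa[1:]
    trees = []
    for partner in rest:
        other = [t for t in rest if t != partner]
        trees.append(f"(({first}{partner})({other[0]}{other[1]}))")
    return trees

def asymmetric_topologies(taxa):
    """Trees of the form (((XY)Z)W): one pair, then two sequential additions."""
    trees = []
    for x, y in combinations(taxa, 2):
        remaining = [t for t in taxa if t not in (x, y)]
        for z in remaining:
            w = next(t for t in remaining if t != z)
            trees.append(f"((({x}{y}){z}){w})")
    return trees

sym = symmetric_topologies(CLUSTERS)
asym = asymmetric_topologies(CLUSTERS)
all_topologies = sym + asym

print(sym)                  # the 3 symmetric candidates, including ((AB)(CD))
print(len(all_topologies))  # 15 candidates in total
```

Only the 3 symmetric trees are compatible with two successive whole-cluster duplications, which is why recovering ((AB)(CD)) rather than an asymmetric tree bears on the 2R hypothesis.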

Fig. 3.
Phylogenetic model for the order of Hox cluster duplication. The model was inferred by phylogenetic analysis of elephant shark (Cm) Hox genes from paralogous groups 1, 3, 4, 5, 9, 10, and 13 that contain all of the 4 members (A, B, C, and D) using amphioxus ...

CNEs in Elephant Shark and Human Hox Loci.

CNEs are useful tools for discovering functional elements in the noncoding regions of human and other vertebrate genomes. Indeed, functional assay of CNEs identified in distantly related vertebrates such as human and teleost fishes has shown that many of them function as transcriptional enhancers directing tissue-specific expression (37, 38). Comparisons of Hox clusters in human and teleost fishes have identified many CNEs within the Hox clusters (24, 27, 39, 40) that were shared by the common ancestors of these bony vertebrates. Because cartilaginous fishes reside at the base of the gnathostome phylogenetic tree, comparisons of cartilaginous fishes and bony vertebrates should help in defining more ancient CNEs that existed in the common ancestor of gnathostomes. Previously many CNEs have been identified in the HoxA cluster of horn shark and human (22). Interestingly, several of these ancient CNEs were found to be absent in the duplicated HoxA clusters of zebrafish (24). To identify ancient CNEs conserved in the Hox cluster loci of elephant shark, human, and teleost fishes, we aligned each elephant shark Hox cluster locus (note that it includes some genes linked to the Hox cluster; Fig. 1) with its orthologous sequence in human and fugu using SLAGAN (41) and predicted CNEs using VISTA (42). The VISTA plots of the 4 Hox clusters are given in Fig. S3, and details of the CNEs are given in Table S2.

The HoxD and HoxA loci of elephant shark and human contain 105 CNEs (total length 18.3 kb) and 96 CNEs (11.4 kb), respectively. HoxB and HoxC loci, on the other hand, contain 48 CNEs (5.0 kb) and 25 CNEs (2.6 kb), respectively (Table S2). The higher number of CNEs in the HoxD locus is largely due to a set of highly conserved CNEs (51 CNEs, 11.6 kb) located 5′ in the locus, upstream of the Hox cluster (Fig. S3). The corresponding regions in HoxA, -B, and -C loci are virtually devoid of CNEs (Fig. S3).

A considerable number of cis-regulatory elements that play a role in the tissue-specific expression of Hox genes have been previously identified in vertebrate Hox clusters and their functions verified in transgenic assays (Table S3). In a number of studies, putative transcription factor binding sites that are involved in the tissue-specific expression have also been identified. To determine whether these functionally verified enhancers overlap elephant shark–human CNEs, we compared the human genomic coordinates of the enhancers with those of the elephant shark–human CNEs. Of the 35 functionally verified enhancers that we extracted from literature, 26 enhancers (74%) show overlap with elephant shark–human CNEs. Furthermore, most of the putative transcription factor binding sites identified in these enhancers are conserved in the elephant shark CNEs (Table S3), indicating that CNEs represent the core regions of enhancers. The overlap of a majority of the functionally verified Hox enhancers with elephant shark–human CNEs suggests that a large number of CNEs identified in the elephant shark and human Hox clusters represent cis-regulatory elements that play a role in the expression of Hox genes.
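The coordinate comparison described above amounts to an interval-overlap test. The Python sketch below illustrates the idea under stated assumptions: the coordinates are hypothetical (not taken from Table S3), the function names are invented for this example, and an overlap of at least 1 bp is assumed because the text does not give the criterion used for enhancers.

```python
def overlap_bp(a_start, a_end, b_start, b_end):
    """Number of bases shared by two half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))

def enhancers_overlapping_cnes(enhancers, cnes):
    """Names of enhancers sharing at least 1 bp with a CNE on the same chromosome."""
    hits = []
    for name, chrom, start, end in enhancers:
        if any(chrom == c_chrom and overlap_bp(start, end, c_start, c_end) > 0
               for c_chrom, c_start, c_end in cnes):
            hits.append(name)
    return hits

# Hypothetical human coordinates, for illustration only.
toy_enhancers = [
    ("enh1", "chr7", 1000, 1400),
    ("enh2", "chr7", 5000, 5200),
    ("enh3", "chr17", 300, 900),
]
toy_cnes = [("chr7", 1350, 1500), ("chr17", 900, 1000)]

hits = enhancers_overlapping_cnes(toy_enhancers, toy_cnes)
print(hits)

# Proportion reported in the text: 26 of 35 functionally verified enhancers.
reported_percent = round(100 * 26 / 35)
print(reported_percent)
```

In the toy data only enh1 is counted: enh3 ends exactly where a CNE begins, so the two share no bases under the half-open convention.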

To assess the pattern of evolution of CNEs in the intergenic regions of Hox genes, we compared the distribution of CNEs within only the Hox clusters (i.e., 5 kb upstream of the 5′-most Hox gene to 5 kb downstream of the 3′-most Hox gene). The elephant shark and human HoxA clusters contain the highest number of CNEs (70 CNEs, total length 8.4 kb), followed by HoxD clusters (46 CNEs, 5.0 kb). HoxB clusters contain 40 CNEs (4.4 kb), whereas HoxC clusters contain only 23 CNEs (2.5 kb) (Table 1). This variation in the CNE content indicates that CNEs in paralogous Hox clusters have been subject to different patterns of evolutionary constraints. The density of CNEs also shows variation across each Hox cluster. The density of CNEs in the 5′ end of HoxA, HoxB, and HoxD clusters is generally low and shows an increasing trend toward the 3′ end of the clusters (Fig. 4). This pattern suggests that CNEs associated with 3′ Hox genes, which express in the anterior regions of the developing embryo, are highly conserved during evolution, and presumably their expression is tightly regulated. Conversely, the expression patterns of 5′ Hox genes, which express in the posterior segments, might be divergent in elephant shark and human. A similar pattern of CNEs across the Hox clusters was previously observed in the HoxA clusters of horn shark and human (40) and among the Hox clusters of teleost fishes (5). In contrast to elephant shark and human HoxA, HoxB, and HoxD clusters, their HoxC clusters contain a low density of CNEs, and the density of CNEs in the HoxC cluster does not show an increase toward the 3′ end (Fig. 4). As noted above, the intergenic regions in the elephant shark HoxC cluster, particularly in the 3′ end of the cluster, are unusually large owing to the accumulation of retrotransposons. The divergence of CNEs in the elephant shark and human HoxC clusters could be a consequence of the insertion of retrotransposons into the elephant shark HoxC cluster.

Table 1.
CNEs in the elephant shark and human Hox clusters, and in the elephant shark and fugu Hox clusters
Fig. 4.
Total length and density of elephant shark–human CNEs in the intergenic regions of elephant shark Hox genes. Names of intergenic regions of Hox genes are given along the x axis. Arrows represent 5-kb flanking sequences included for this analysis. ...

Pattern of CNE Evolution in Hox Clusters of Fugu, Human, and Elephant Shark.

Following the duplication of Hox clusters in the teleost lineage, fugu has lost a complete duplicate HoxC cluster and retained 45 Hox genes in the remaining 7 Hox clusters (Fig. 2). Interestingly, elephant shark and fugu Hox clusters contain fewer CNEs than elephant shark and human Hox clusters, and the elephant shark–fugu CNEs are shorter (average length 90 bp) than elephant shark–human CNEs (average length 113 bp) (Table 1). In fact, more than 50% of CNEs conserved in elephant shark and human Hox clusters are not identifiable in the duplicated Hox clusters (85 of 156 CNEs from HoxA, -B, and -D clusters) as well as in the single HoxC cluster (12 of 23 CNEs) of fugu (Table S2). This indicates that a majority of ancient CNEs that are conserved in elephant shark and human have diverged beyond recognition or have been lost in fugu Hox clusters, irrespective of whether the cluster is retained in duplicate or as a singleton. Interestingly, among the 71 human CNEs that are conserved in the duplicated Hox clusters of fugu, a majority is found in the a-paralogous clusters. Whereas 12 of these CNEs are conserved in both fugu a- and b-paralogous clusters, 57 are conserved only in a-paralogous clusters, and only 2 CNEs are conserved solely in a fugu b-paralogous cluster (HoxBb cluster) (Table S2). This skewed pattern of retention of unique CNEs between fugu a- and b-paralogous clusters indicates that fugu b-paralogous clusters have not only lost a large number of Hox genes (altogether only 12 Hox genes are retained; see Fig. 2) but have also lost a substantial number of ancient CNEs compared with a-paralogous clusters.
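The counts quoted above can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the numbers given in the text; the variable names are illustrative, and treating the 57 and 2 CNEs as exclusive to a- and b-paralogs is an interpretation supported by their sum (12 + 57 + 2 = 71).

```python
# Elephant shark-human CNEs within Hox clusters, as reported in the text.
total_dup, lost_dup = 156, 85   # HoxA, -B and -D clusters (duplicated in fugu)
total_c, lost_c = 23, 12        # HoxC cluster (singleton in fugu)

retained_dup = total_dup - lost_dup   # CNEs still identifiable in fugu
both, a_only, b_only = 12, 57, 2      # retention across fugu a- and b-paralogs

# The three retention classes should account for every retained CNE.
assert both + a_only + b_only == retained_dup

pct_lost_dup = 100 * lost_dup / total_dup
pct_lost_c = 100 * lost_c / total_c
pct_lost_all = 100 * (lost_dup + lost_c) / (total_dup + total_c)

print(f"{pct_lost_dup:.1f} {pct_lost_c:.1f} {pct_lost_all:.1f}")
```

This prints 54.5, 52.2 and 54.2, so the "more than 50%" statement holds for the duplicated clusters, for the singleton HoxC cluster, and for all 4 clusters combined (97 of 179 CNEs).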

In contrast to 97 ancient CNEs (total length 8.7 kb) lost in the fugu Hox clusters, human Hox clusters have lost only 43 ancient CNEs (total length 2.9 kb) that are conserved in elephant shark and fugu Hox clusters (Table S2). To assess the extent of ancient CNEs that have been lost in elephant shark Hox clusters, we generated SLAGAN alignments of orthologous human, fugu, and elephant shark Hox clusters using human as the reference sequence and predicted CNEs conserved in human–fugu and human–elephant shark Hox clusters (Fig. S4). A total of 39 CNEs (total length 2.8 kb) conserved in human and fugu are not identifiable in elephant shark Hox clusters (Table S4). Although this number is comparable to the number of ancient CNEs lost in human Hox clusters, it should be noted that some of these CNEs could have evolved de novo in the stem bony vertebrate lineage. Overall, these comparisons show that fugu Hox clusters have lost far more ancient CNEs than the human and elephant shark Hox clusters.

Conservation of noncoding sequences is a powerful strategy for identifying cis-regulatory elements directing tissue-specific expression of developmental genes. Functional assay of CNEs associated with developmental genes has shown that many of them function as tissue-specific enhancers (37, 38). In a recent study in which enhancers were predicted in mouse embryonic forebrain, midbrain, and limb tissues on the basis of ChIP assay for the enhancer-associated p300 protein, approximately 90% of the p300-binding regions were found to overlap CNEs (43). Thus, CNEs are a useful signature for discovering enhancers associated with developmental genes. A considerable amount of work has been done on the identification and characterization of regulatory regions involved in tissue-specific expression of vertebrate Hox genes, and functions of about 35 enhancers have been verified in transgenic assays (Table S3). We found that a majority of these functionally verified enhancers (74%) overlap CNEs in elephant shark and human Hox clusters. This provides support to the hypothesis that a large number of CNEs identified in the elephant shark and human Hox clusters represent enhancers. If these CNEs indeed represent enhancers, our observation that many ancient CNEs have diverged beyond recognition in fugu and human raises the question of whether the divergent CNEs have resulted in altered expression patterns and functions of their associated Hox genes. However, it should be noted that divergent enhancers need not necessarily lead to altered expression pattern of genes associated with them. For example, if the divergent enhancers contained redundant transcription factor binding sites, the expression pattern of their associated genes could still be maintained by other binding sites.
Another possibility is that the loss of transcription factor binding sites in one location might be compensated by de novo evolution of binding sites in another location, as has been demonstrated for the stripe 2 enhancer of even-skipped gene in Drosophila (44). Notwithstanding these possibilities, several studies have shown that changes in cis-regulatory sequences, ranging from subtle differences to extensive restructuring, can account for the altered expression pattern of Hox genes (6, 7, 45–47). For example, the duplicate fugu HoxA2a and HoxA2b genes show distinct expression patterns in the hindbrain. Detailed functional assay of the regulatory elements of these genes has shown that a subtle drift in the subset of cis-regulatory elements is responsible for their differential expression patterns (7). Another interesting example is the HoxC8 early enhancer, a 200-bp region that plays a role in the early phase of mouse HoxC8 expression in neural tube and mesoderm. This enhancer is found in different vertebrate lineages, including chicken and fishes. Interestingly, whereas the 5′ region of the enhancer is highly conserved in mouse, fugu, and zebrafish, the 3′ region is divergent in the 2 teleosts, and transgenic mouse assays have shown that the mouse, fugu, and zebrafish enhancers confer distinct patterns of reporter gene expression (45). These data indicate that the divergence of short stretches of sequences within a highly conserved enhancer also has the potential to modify the expression pattern of enhancers. Therefore, it is possible that the divergence of ancient Hox CNEs that represent putative cis-regulatory elements might have led to altered expression patterns of their associated Hox genes.
Incidentally, the 3′ region of the mouse HoxC8 early enhancer overlaps an elephant shark–human CNE (HE-HoxC_CNE10) identified in our study, and in contrast to the divergent 3′ regions of fugu and zebrafish, the elephant shark sequence is highly similar to the mouse and human sequences (Fig. S5). Thus, this enhancer seems to be an ancient gnathostome enhancer that has diverged considerably in the teleost lineage. Functional assay of this and other conserved elephant shark CNEs in comparison with their divergent counterparts in bony vertebrates should indicate whether the divergent CNEs have indeed resulted in altered expression patterns of their associated Hox genes.

Methods

Identification of BACs.

The DNA prepared from 92,160 clones in an elephant shark BAC library (IMCB_Eshark BAC library) was pooled in 3 dimensions and used for identifying BAC clones by 3-step PCR screening. The 1.4× coverage sequence of the elephant shark genome contained fragments for 37 Hox genes (25). PCR primers were initially designed for 4 of the fragments (HoxA1, HoxB5, HoxC8, and HoxD5) representing each of the 4 Hox clusters and used to identify representative BAC clones (158C9, 48D4, 161E22, and 46N13, respectively). These BACs were sequenced completely. BACs that overlapped the 5′ end sequence of these BACs were identified by PCR screening and sequenced completely. Altogether, 2 BACs each were sequenced to obtain the sequences of HoxA (55E21 and 158C9) and HoxC (205I1 and 161E22) loci, whereas 3 BACs each were sequenced to determine the HoxB (235L21, 91I24, and 48D4) and HoxD (28O18, 41O12, and 46N13) loci sequences.
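The pooling layout is not described in detail, so the Python sketch below rests on an assumption: that the library is arrayed in 384-well plates (16 rows A–P by 24 columns, giving 92,160 / 384 = 240 plates) and that the 3 dimensions are plate, row, and column pools. The clone names reported here (for example 158C9 or 28O18) are consistent with a plate-row-column naming scheme, but the layout, constants, and function names below are illustrative rather than taken from the study.

```python
import re
import string

ROWS = string.ascii_uppercase[:16]      # assumed 384-well plate rows A..P
N_COLS = 24                             # assumed 384-well plate columns 1..24
N_CLONES = 92_160                       # library size given in the text
N_PLATES = N_CLONES // (len(ROWS) * N_COLS)

def parse_clone(name):
    """Split a clone name such as '158C9' into (plate, row, column)."""
    match = re.fullmatch(r"(\d+)([A-P])(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognised clone name: {name}")
    plate, row, col = int(match.group(1)), match.group(2), int(match.group(3))
    if not (1 <= plate <= N_PLATES and 1 <= col <= N_COLS):
        raise ValueError(f"clone outside assumed library layout: {name}")
    return plate, row, col

def candidate_clones(positive_plates, positive_rows, positive_cols):
    """Clones implied by the PCR-positive pools in each of the 3 dimensions."""
    return [f"{p}{r}{c}"
            for p in positive_plates
            for r in positive_rows
            for c in positive_cols]

bacs = ["158C9", "48D4", "161E22", "46N13", "55E21",
        "205I1", "235L21", "91I24", "28O18", "41O12"]
parsed = [parse_clone(b) for b in bacs]

print(N_PLATES)
print(candidate_clones([158], ["C"], [9]))
```

Under this layout, 240 + 16 + 24 = 280 pools would cover the whole library. A single positive pool in each dimension identifies one clone, whereas several positives in one dimension yield a short list of candidates to confirm individually.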

Sequencing and Assembly of BACs.

BAC clones were sequenced using the standard shotgun sequencing method, and gaps were filled by PCR amplification and primer walking. Sequencing was done using the BigDye Terminator Cycle Sequencing Kit (Applied Biosystems). Sequences were processed and assembled using Phred-Phrap and Consed (http://www.phrap.org/phredphrapconsed.html).

Sequence Annotation.

Repetitive sequences were identified and masked using CENSOR at default settings (http://www.girinst.org/censor/index.php). Protein-coding and microRNA genes were predicted using a combination of ab initio (e.g., FGENESH) and homology-based methods. Sequences for human and fugu genes were extracted from the University of California Santa Cruz Genome Browser (http://genome.ucsc.edu/). Our homology-based searches identified remnants of HoxA14 gene in the HoxA cluster and remnants of an Evx gene and HoxB14 gene in HoxB cluster. We also found fragments of an Lnp gene in HoxC cluster with high similarity to 5 of the 11 exons (exons 3, 6, 8, 9, and 10) of the Lnp gene in HoxD cluster. Because we could not identify any other exons, we annotated the Lnp gene in the HoxC cluster as a pseudogene.

Order of Hox Cluster Duplications.

Predicted amino acid sequences of elephant shark Hox genes from paralogous groups Hox1, -3, -4, -5, -9, -10, and -13 were aligned with their orthologous sequences from amphioxus using CLUSTALW as implemented in BioEdit sequence alignment editor (http://www.mbio.ncsu.edu/BioEdit/BioEdit.html). A codon-based alignment of nucleotide sequences was generated on the basis of the amino acid alignment using PAL2NAL (http://coot.embl.de/pal2nal/). Individual alignments for the paralogous groups were concatenated, and third codon positions were excluded. The best-fit substitution model for this alignment was deduced by ModelGenerator (http://bioinf.may.ie/software/modelgenerator/). ML and BA methods were used for phylogenetic analyses using the best-fit substitution model. PHYML (http://www.atgc-montpellier.fr/phyml/) was used for ML analyses and 100 nonparametric bootstrap replicates were used. For BA analyses, we used MrBayes 3.1.2 (http://mrbayes.csit.fsu.edu/). Two independent runs starting from different random trees were run for 100,000 generations with sampling every 100 generations. A consensus tree was built from all sampled trees excluding the first 250 (burn-in). In addition, a NJ tree was also generated using MEGA 4.0 (http://www.megasoftware.net/mega.html) with 1,000 bootstrap replicates. CONSEL (36) was used to evaluate the 15 possible tree topologies for the Hox cluster duplications using the same alignment.
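The concatenation step with third codon positions excluded can be sketched as follows. This Python example is a minimal illustration with toy sequences; the function names and the dictionary layout (one aligned codon sequence per cluster for each paralogous group) are assumptions and do not reproduce the PAL2NAL or BioEdit workflow.

```python
def drop_third_positions(codon_alignment):
    """Keep codon positions 1 and 2 of an in-frame aligned sequence."""
    if len(codon_alignment) % 3 != 0:
        raise ValueError("codon alignment length must be a multiple of 3")
    return "".join(codon_alignment[i:i + 2]
                   for i in range(0, len(codon_alignment), 3))

def concatenate_groups(groups):
    """Join per-group alignments into one sequence per taxon.

    `groups` is a list of dicts mapping taxon name -> aligned codon sequence;
    every dict must contain the same taxa.
    """
    taxa = list(groups[0])
    for group in groups:
        if set(group) != set(taxa):
            raise ValueError("all paralogous groups must share the same taxa")
    return {taxon: "".join(drop_third_positions(group[taxon]) for group in groups)
            for taxon in taxa}

# Toy alignments for two paralogous groups (hypothetical sequences).
toy_groups = [
    {"CmA": "ATGGCC", "CmB": "ATGGCT", "Amphioxus": "ATG---"},
    {"CmA": "AAATTT", "CmB": "AAGTTC", "Amphioxus": "AAATTT"},
]

supermatrix = concatenate_groups(toy_groups)
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

In the toy data CmA and CmB differ only at third positions, so their concatenated sequences become identical, which shows how the filter removes the fastest-evolving sites. For the Bayesian runs, 100,000 generations sampled every 100 generations give about 1,000 trees per run, so discarding the first 250 corresponds to roughly a quarter of the samples, assuming the burn-in was applied per run.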

Identification and Analysis of CNEs.

Multiple alignments of elephant shark, human, and fugu Hox cluster loci sequences were generated using the “glocal” alignment program SLAGAN (41), with either elephant shark or human as the reference sequence. CNEs were predicted using a cut-off of ≥65% identity across >50 bp windows and visualized using VISTA (42). CNEs that were shorter than 50 bp were excluded from the final set of CNEs. Human or elephant shark CNEs that showed a minimum overlap of 20 bp with fugu CNEs were considered to be conserved in fugu.
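The two thresholds described above can be sketched in Python as follows. This is a simplified illustration, not the SLAGAN/VISTA implementation: the function names are hypothetical, coordinates are alignment columns rather than genomic positions, and a fixed 50-column sliding window is assumed. It marks windows with at least 65% identity, merges them into candidate regions, and tests whether a reference CNE overlaps any fugu CNE by at least 20 bp.

```python
def window_identity(a, b):
    """Fraction of columns in two equal-length aligned strings that match (gaps never match)."""
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def conserved_regions(ref_aln, other_aln, window=50, min_identity=0.65):
    """Merge all sliding windows passing the identity cut-off into half-open (start, end) regions."""
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")
    regions = []
    for start in range(len(ref_aln) - window + 1):
        end = start + window
        if window_identity(ref_aln[start:end], other_aln[start:end]) >= min_identity:
            if regions and start <= regions[-1][1]:
                # Overlapping or adjacent window: extend the previous region.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


def overlap_bp(a, b):
    """Length of the overlap between two half-open intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def conserved_in_fugu(cne, fugu_cnes, min_overlap=20):
    """True if the CNE overlaps any fugu CNE by at least min_overlap bp."""
    return any(overlap_bp(cne, f) >= min_overlap for f in fugu_cnes)


# Hypothetical toy alignment: 60 identical columns followed by 60 mismatches.
ref = "A" * 60 + "C" * 60
other = "A" * 60 + "G" * 60
print(conserved_regions(ref, other))
print(conserved_in_fugu((100, 200), [(180, 260)]))
```

Because each region is built from whole windows, every region returned is at least one window long, which matches the exclusion of CNEs shorter than 50 bp.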


We thank Masaki Miya for help in phylogenetic analysis. This work is supported by the Biomedical Research Council of the A*STAR, Singapore.


The authors declare no conflict of interest.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. FJ824598 to FJ824601).

This article contains supporting information online at www.pnas.org/cgi/content/full/0907914106/DCSupplemental.


1. Ryan JF, et al. Pre-bilaterian origins of the Hox cluster and the Hox code: Evidence from the sea anemone, Nematostella vectensis. PLoS ONE. 2007;2:e153. [PMC free article] [PubMed]
2. Kmita M, Duboule D. Organizing axes in time and space; 25 years of colinear tinkering. Science. 2003;301:331–333. [PubMed]
3. Duboule D. The rise and fall of Hox gene clusters. Development. 2007;134:2549–2560. [PubMed]
4. Garcia-Fernandez J. The genesis and evolution of homeobox gene clusters. Nat Rev Genet. 2005;6:881–892. [PubMed]
5. Hoegg S, Boore JL, Kuehl JV, Meyer A. Comparative phylogenomic analyses of teleost fish Hox gene clusters: Lessons from the cichlid fish Astatotilapia burtoni. BMC Genomics. 2007;8:317. [PMC free article] [PubMed]
6. Ray R, Capecchi M. An examination of the Chiropteran HoxD locus from an evolutionary perspective. Evol Dev. 2008;10:657–670. [PubMed]
7. Tumpel S, Cambronero F, Wiedemann LM, Krumlauf R. Evolution of cis elements in the differential expression of two Hoxa2 coparalogous genes in pufferfish (Takifugu rubripes). Proc Natl Acad Sci USA. 2006;103:5419–5424. [PMC free article] [PubMed]
8. Carroll SB, Weatherbee SD, Langeland JA. Homeotic genes and the regulation and evolution of insect wing number. Nature. 1995;375:58–61. [PubMed]
9. Holland PW, Garcia-Fernandez J. Hox genes and chordate evolution. Dev Biol. 1996;173:382–395. [PubMed]
10. Wagner GP, Amemiya C, Ruddle F. Hox cluster duplications and the opportunity for evolutionary novelties. Proc Natl Acad Sci USA. 2003;100:14603–14606. [PMC free article] [PubMed]
11. Amemiya CT, et al. The amphioxus Hox cluster: Characterization, comparative genomics, and evolution. J Exp Zoolog B Mol Dev Evol. 2008;310:465–477. [PubMed]
12. Garcia-Fernandez J, Holland PW. Archetypal organization of the amphioxus Hox gene cluster. Nature. 1994;370:563–566. [PubMed]
13. Ikuta T, Yoshida N, Satoh N, Saiga H. Ciona intestinalis Hox gene cluster: Its dispersed structure and residual colinear expression in development. Proc Natl Acad Sci USA. 2004;101:15118–15123. [PMC free article] [PubMed]
14. Seo HC, et al. Hox cluster disintegration with persistent anteroposterior order of expression in Oikopleura dioica. Nature. 2004;431:67–71. [PubMed]
15. Amores A, et al. Zebrafish hox clusters and vertebrate genome evolution. Science. 1998;282:1711–1714. [PubMed]
16. Mungpakdee S, et al. Differential evolution of the 13 Atlantic salmon Hox clusters. Mol Biol Evol. 2008;25:1333–1343. [PubMed]
17. Force A, Amores A, Postlethwait JH. Hox cluster organization in the jawless vertebrate Petromyzon marinus. J Exp Zool. 2002;294:30–46. [PubMed]
18. Irvine SQ, et al. Genomic analysis of Hox clusters in the sea lamprey Petromyzon marinus. J Exp Zool. 2002;294:47–62. [PubMed]
19. Stadler PF, et al. Evidence for independent Hox gene duplications in the hagfish lineage: A PCR-based gene inventory of Eptatretus stoutii. Mol Phylogenet Evol. 2004;32:686–694. [PubMed]
20. Fried C, Prohaska SJ, Stadler PF. Independent Hox-cluster duplications in lampreys. J Exp Zoolog B Mol Dev Evol. 2003;299:18–25. [PubMed]
21. Cappetta H, Duffin C, Zidek J. In: The Fossil Record 2. Benton MJ, editor. London: Chapman and Hall; 1993. pp. 593–609.
22. Kim CB, et al. Hox cluster genomics in the horn shark, Heterodontus francisci. Proc Natl Acad Sci USA. 2000;97:1655–1660. [PMC free article] [PubMed]
23. Powers TP, Amemiya CT. Evidence for a Hox14 paralog group in vertebrates. Curr Biol. 2004;14:R183–R184. [PubMed]
24. Chiu CH, et al. Molecular evolution of the HoxA cluster in the three major gnathostome lineages. Proc Natl Acad Sci USA. 2002;99:5492–5497. [PMC free article] [PubMed]
25. Venkatesh B, et al. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome. PLoS Biol. 2007;5:e101. [PMC free article] [PubMed]
26. Di-Poi N, Montoya-Burgos JI, Duboule D. Atypical relaxation of structural constraints in Hox gene clusters of the green anole lizard. Genome Res. 2009;19:602–610. [PMC free article] [PubMed]
27. Lee AP, Koh EG, Tay A, Brenner S, Venkatesh B. Highly conserved syntenic blocks at the vertebrate Hox loci and conserved regulatory elements within and outside Hox gene clusters. Proc Natl Acad Sci USA. 2006;103:6994–6999. [PMC free article] [PubMed]
28. Tanzer A, Amemiya CT, Kim CB, Stadler PF. Evolution of microRNAs located within Hox gene clusters. J Exp Zoolog B Mol Dev Evol. 2005;304:75–85. [PubMed]
29. Kuraku S, et al. Noncanonical role of Hox14 revealed by its expression patterns in lamprey and shark. Proc Natl Acad Sci USA. 2008;105:6679–6683. [PMC free article] [PubMed]
30. Gu X, Wang Y, Gu J. Age distribution of human gene families shows significant roles of both large- and small-scale duplications in vertebrate evolution. Nat Genet. 2002;31:205–209. [PubMed]
31. McLysaght A, Hokamp K, Wolfe KH. Extensive genomic duplication during early chordate evolution. Nat Genet. 2002;31:200–204. [PubMed]
32. Friedman R, Hughes AL. The temporal distribution of gene duplication events in a set of highly conserved human gene families. Mol Biol Evol. 2003;20:154–161. [PubMed]
33. Bailey WJ, Kim J, Wagner GP, Ruddle FH. Phylogenetic reconstruction of vertebrate Hox cluster duplications. Mol Biol Evol. 1997;14:843–853. [PubMed]
34. Lynch VJ, Wagner GP. Multiple chromosomal rearrangements structured the ancestral vertebrate Hox-bearing protochromosomes. PLoS Genet. 2009;5:e1000349. [PMC free article] [PubMed]
35. Wang J, Lee AP, Kodzius R, Brenner S, Venkatesh B. Large number of ultraconserved elements were already present in the jawed vertebrate ancestor. Mol Biol Evol. 2009;26:487–490. [PubMed]
36. Shimodaira H, Hasegawa M. CONSEL: For assessing the confidence of phylogenetic tree selection. Bioinformatics. 2001;17:1246–1247. [PubMed]
37. Pennacchio LA, et al. In vivo enhancer analysis of human conserved non-coding sequences. Nature. 2006;444:499–502. [PubMed]
38. Woolfe A, et al. Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biol. 2005;3:e7. [PMC free article] [PubMed]
39. Aparicio S, et al. Detecting conserved regulatory elements with the model genome of the Japanese puffer fish, Fugu rubripes. Proc Natl Acad Sci USA. 1995;92:1684–1688. [PMC free article] [PubMed]
40. Santini S, Boore JL, Meyer A. Evolutionary conservation of regulatory elements in vertebrate Hox gene clusters. Genome Res. 2003;13:1111–1122. [PMC free article] [PubMed]
41. Brudno M, et al. Glocal alignment: Finding rearrangements during alignment. Bioinformatics. 2003;19(Suppl 1):i54–i62. [PubMed]
42. Frazer KA, Pachter L, Poliakov A, Rubin EM, Dubchak I. VISTA: Computational tools for comparative genomics. Nucleic Acids Res. 2004;32:W273–W279. [PMC free article] [PubMed]
43. Visel A, et al. ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature. 2009;457:854–858. [PMC free article] [PubMed]
44. Ludwig MZ, Bergman C, Patel NH, Kreitman M. Evidence for stabilizing selection in a eukaryotic enhancer element. Nature. 2000;403:564–567. [PubMed]
45. Anand S, et al. Divergence of Hoxc8 early enhancer parallels diverged axial morphologies between mammals and fishes. Proc Natl Acad Sci USA. 2003;100:15666–15669. [PMC free article] [PubMed]
46. Belting HG, Shashikant CS, Ruddle FH. Modification of expression and cis-regulation of Hoxc8 in the evolution of diverged axial morphology. Proc Natl Acad Sci USA. 1998;95:2355–2360. [PMC free article] [PubMed]
47. Shashikant CS, Kim CB, Borbely MA, Wang WC, Ruddle FH. Comparative studies on mammalian Hoxc8 early enhancer sequence reveal a baleen whale-specific deletion of a cis-acting element. Proc Natl Acad Sci USA. 1998;95:15446–15451. [PMC free article] [PubMed]

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences