Am J Hum Genet. Jan 2002; 70(1): 269–278.
Published online Nov 30, 2001. doi:  10.1086/338307
PMCID: PMC419983

A Cascade of Complex Subtelomeric Duplications during the Evolution of the Hominoid and Old World Monkey Genomes


Subtelomeric duplications of an obscure tubulin “genic” segment located near the telomere of human chromosome 4q35 have occurred at different evolutionary time points within the last 25 million years of the catarrhine (i.e., hominoid and Old World monkey) evolution. The analyses of these segments reported here indicate an exceptional level of evolutionary instability. Substantial intra- and interspecific differences in copy number and distribution are observed among cercopithecoid (Old World monkey) and hominoid genomes. Characterization of the hominoid duplicated segments reveals a strong positional bias within pericentromeric and subtelomeric regions of the genome. On the basis of phylogenetic analysis from predicted proteins and comparisons of nucleotide-substitution rates, we present evidence of a conserved β-tubulin gene among the duplications. Remarkably, the evolutionary conservation has occurred in a nonorthologous fashion, such that the functional copy has shifted its positional context between hominoids and cercopithecoids. We propose that, in a chimpanzee-human common ancestor, one of the paralogous copies assumed the original function, whereas the ancestral copy acquired mutations and eventually became silenced. Our analysis emphasizes the dynamic nature of duplication-mediated genome evolution and the delicate balance between gene acquisition and silencing.

Acquiring genetic diversity through gene duplication and divergent mutations is an important mechanism for evolution of the genome of a complex organism. Gene families are abundantly present and, in many cases, are part of a transcriptionally tightly regulated gene cluster at a particular genomic location. Tandem gene duplications and genomewide tetraploidization events are believed to be largely responsible for shaping the present-day homeobox, hemoglobin, and immunoglobulin gene clusters in higher vertebrates (Shen et al. 1981; Ohno 1999; Zerucha and Ekker 2000). Other gene families have members scattered throughout the genome, and these gene duplications are mechanistically distinct from whole-genome or tandem duplication. Genes located in pericentromeric or subtelomeric domains are particularly prone to acquire genetic diversity, since frequent exchange of sequences occurs between these dynamic (non-) homologous chromosome regions (Eichler et al. 1996; Pryde et al. 1997; Eichler 1998). For example, the large olfactory-receptor gene family has members distributed for the most part on human chromosome ends (Rouquier et al. 1998; Trask et al. 1998a, 1998b). Here, we present further insight into gene-duplication events in subtelomeric and pericentromeric regions, by analyses of gene-order conservation, phylogenetics, and fluorescent in situ hybridization (FISH). The data provide striking evidence for dynamic gene acquisition and silencing within evolutionarily duplicated DNA segments.

Analysis of human chromosomal 4q35 segments, localized close to (<300 kb from) the telomere, has indicated a substantial amount of duplication throughout the genome (Grewal et al. 1999; van Geel et al. 1999). A tubulin-like element (TUBB4Q) residing in the region has at least seven homologous members in the genome (van Geel et al. 2000). High allelic sequence variability and lack of expression have suggested that this copy contains a pseudogene derivative (van Geel et al. 2000). The sequence dispersal and evolutionary history of 4qter were further analyzed by focusing in detail on this tubulin segment. The complete sequence of the human TUBB4Q members reported elsewhere (van Geel et al. 2000) was determined, and two additional members were identified, making the total number in humans at least 10. In addition, 12 members could be identified in both chimpanzees and baboons, and 2 members could be identified in squirrel monkeys, suggesting that duplications of this particular segment have occurred extensively in all these species. The structural ortholog of the HSA4q35 TUBB4Q sequence (for nomenclature, see table 1) was identified in baboon (PHA443J9 equals PHA IVq) and chimpanzee (PTR179F17 equals PTR IVq) by means of gene-order conservation with the flanking FRG1 gene (fig. 1). Comparison of baboon BAC-end sequences (RPCI-41-209E1, RPCI-41-269I4, and RPCI-41-443J9) with the human genomic chromosome 4q35 sequence (GenBank accession numbers AF250324, AF146191, U85056, U74497, and U74496) determined the baboon structural ortholog of HSA4q35 (homology >80%), since baboon has three tubulin loci flanked by FRG1 (see BACPAC Resources Home Page). Sequence orthology was also evident between the human HSA1q43-44, the chimpanzee PTR-Iq, and the baboon PHA Iq TUBB4Q members, which are the only members containing a 5′ truncated L1PA2 repeat within intron 3 (table 1). Moreover, this orthology was confirmed by phylogenetic and FISH analysis (see below).
Consensus sequences of each genomic TUBB4Q member were coaligned by use of ClustalX software, version 1.8 (Thompson et al. 1994, 1997) and, in some instances, were manually edited. TUBB4Q comparative sequence alignments of all 36 operational taxonomic units (OTUs; sequences represented at the tips of the topological tree) denote genomic nucleotide-sequence identity of 78%–98% (GenBank accession numbers AF355105 to AF355140).
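The identity range quoted above is a pairwise statistic over aligned sequences. As a minimal sketch of how such a value can be obtained, the following Python function computes percent identity between two rows of an alignment. The gap-handling convention (columns with a gap in either sequence are skipped) and the toy sequences are assumptions for illustration; the source does not state which convention was applied, and the sequences shown are not TUBB4Q data.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither row has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumed convention: ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment rows (not real TUBB4Q sequence).
toy_a = "ATGCGT-ACGTA"
toy_b = "ATGCATTACG-A"
print(percent_identity(toy_a, toy_b))
```

Other conventions, such as counting gapped columns as mismatches, would give lower values for the same alignment, so the convention should be reported alongside any identity figure.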

Figure  1
Organization of the human chromosome 4q35 region. Exons of the FRG1 and TUBB4Q genes are displayed on the forward and reverse strand, respectively. Specific family members of the tubulin located on chromosome 4q35 (TUBB4Q) were isolated by means of a ...
Table 1
Localization of Human and Orthologous Primate TUBB4Q Members[Note]

To resolve the evolutionary relationship of this complex series of duplication events, the phylogeny among the different TUBB4Q members in the catarrhine clade (hominoids and Old World monkeys) was determined with the platyrrhine clade (squirrel monkey) as an outgroup. Although we used the neighbor-joining method for phylogenetic analysis, the tree topology is robust, since the maximum-parsimony and maximum-likelihood methods resulted in similar trees. The topology of the tree indicates three catarrhine branch clusters, that is, clades (monophyletic groups), in which all three species are represented (fig. 2). On the basis of this phylogenetic reconstruction, the human TUBB4Q members on HSA1q42.3, HSA1q43-44, and HSAYq11 have orthologous members in both chimpanzees and baboons (fig. 2; table 1). Thus, these members predate the catarrhine radiation (>25 million years). In addition, the phylogenetic analysis specifies that only some human members (HSA10p15 and HSA12cen-q11) cluster with chimpanzee members, suggesting orthology in chimpanzees but not in baboons (fig. 2; table 1). All other members show only intraspecies clustering. Surprisingly, the TUBB4Q members that are supposed to be orthologous with HSA4q35 by means of gene-order conservation (described above) do not group together in the phylogenetic analysis. In addition, the two human members on HSA1 and the two on HSA12 have a unilateral close relationship, which implies that they arose through separate intrachromosomal evolutionary duplication events. Timing of the duplication events was estimated by sequence-divergence calculations among human-chimpanzee orthologous members. The substitution rate (r) was estimated from ancient TUBB4Q copies that probably were not under selective pressure during their evolutionary existence (pseudogenes, since they lack a functional open reading frame [ORF]).
The divergence of the human and Old World monkey (Cercopithecidae) lineages was set at 25 million years ago (Goodman 1999) to estimate the r value in our tree. From the HSA1q42.3, HSA1q43-44, and HSAYq11 phylogenetic catarrhine tree branches, the HSA1q43-44 catarrhine branch was discarded, since its substitution rate was about twofold slower than that of the other two branch clusters. The average K value of the HSA1q42.3 and HSAYq11 catarrhine clusters was used to calculate r by the neighbor-joining method (PAUP): $r = K/(2T) = 0.156/(2 \times 25 \times 10^{6}) = 3.12 \times 10^{-9}$ bp/site/year. This rate provided a calibrated value for the evolutionary genetic distance between the different sequence taxa. We estimate the hominoid duplication events to have occurred after the divergence of the human and baboon lineages some 17.3, 15.1, and 7.2 million years ago, respectively, for the HSA12cen-q11, HSA12cen-p11, and HSA10p15 clusters.
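The calibration and dating steps described here follow from the relation r = K/(2T) and its inverse, T = K/(2r). The short Python sketch below reproduces the rate from the K and T values given in the text. The per-cluster K values are not given in this section, so the loop back-calculates the distance each reported age would imply; those implied K values are derived for illustration only and are not data from the study. The function names are hypothetical.

```python
def substitution_rate(k: float, t_years: float) -> float:
    """r = K / (2T): K is the pairwise distance, T the time since divergence."""
    return k / (2.0 * t_years)


def divergence_time(k: float, r: float) -> float:
    """T = K / (2r): time since divergence for a pairwise distance K."""
    return k / (2.0 * r)


# Calibration values stated in the text.
K_CALIBRATION = 0.156   # average K of the HSA1q42.3 and HSAYq11 clusters
T_CALIBRATION = 25e6    # human / Old World monkey divergence, in years

r = substitution_rate(K_CALIBRATION, T_CALIBRATION)
print(f"r = {r:.3e} per site per year")

# Reported duplication ages (million years); K is back-calculated, not measured.
reported_ages_myr = {"HSA12cen-q11": 17.3, "HSA12cen-p11": 15.1, "HSA10p15": 7.2}
for cluster, age_myr in reported_ages_myr.items():
    k_implied = 2.0 * r * age_myr * 1e6
    t_check = divergence_time(k_implied, r) / 1e6
    print(f"{cluster}: implied K = {k_implied:.4f}, T = {t_check:.1f} Myr")
```

The factor of 2 reflects that substitutions accumulate along both lineages after the split, so the pairwise distance grows at twice the per-lineage rate.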

Figure  2
Phylogenetic analysis of TUBB4Q genomic sequence members from human (“HSA”; red), chimpanzee (“PTR”; blue), baboon (“PHA”; green), and squirrel monkey (“SSC”; pink), using the distance-matrix ...

Selected human PAC clones, each containing one individual member of the TUBB4Q family, were localized by FISH and PCR-based monochromosomal hybrid mapping (table 1). It is interesting that, although duplicated segments have been described for other genes that are exclusively dispersed over either centromeres or telomeres (Kermouni et al. 1995; Eichler et al. 1997; Trask et al. 1998a), we here provide evidence that duplications have also occurred between pericentromeric and subtelomeric regions of the primate genome.

Orthology of the different catarrhine TUBB4Q members was primarily defined from both the perspective of gene-order conservation and that of phylogenetic analysis (see above). To specifically verify this orthology, positional conservation was tested, by FISH analysis, in specific orthologous members of the chimpanzee and baboon species. In concordance with the human chromosomal positions, specific signals were identified on chimpanzee and baboon syntenic chromosomal locations (fig. 3; table 1). However, the phylogenetically orthologous baboon copy of HSA1q42.3 was located near the centromere of PHA Iq, instead of near a telomeric position. This discrepant location could possibly be ascribed to an intrachromosomal inversion event during the evolution of the baboon or hominoid genome.

Figure  3
FISH analysis of orthologous TUBB4Q members (contained in PACs or BACs) on human (A, D, G, J, L, and N), chimpanzee (B, E, H, K, M, and O) and baboon (C, F, and I) chromosomal metaphase spreads. Primate chromosomes were cohybridized (Wirth et al. 1999 ...

Tentative examination of potential ORFs of all identified tubulin members indicates four human members (HSA4q35, HSA10p15, HSA16q24, and HSA18p11), five members in both chimpanzee (PTR6H21 [PTR Xp], PTR51H24, PTR58N10, PTR91D7, and PTR179F17 [PTR IVq]) and baboon (PHA122M7, PHA140N17, PHA209E1, PHA269I4, and PHA443J9 [PHA IVq]), and one in squirrel monkey (SSCTUB), in which the ORF is not disrupted. The amino acid–sequence alignment (according to ClustalX; alignment available online) and subsequent phylogenetic analysis (fig. 4) reveal a strikingly closer relationship of the baboon PHA IVq TUBB4Q protein to the human HSA10p15 member (97.1% identity) than to its related HSA4q35 orthologous copy (88.5% identity). However, the chimpanzee orthologous members of HSA10p15 (PTR Xp) and HSA4q35 (PTR IVq) do cluster and indicate a monophyletic relationship (fig. 4). The protein sequence of the HSA10p15-PTR Xp clade evolved relatively slowly, compared with the other human and chimpanzee members. In addition, the only ORF-containing baboon member (PHA IVq) with orthologous counterparts in human (HSA4q35) and chimpanzee (PTR IVq) indicates minimal divergence from the HSA10p15 and PTR Xp members. Therefore, the sequences may be under evolutionary selective pressure because they represent genuine genes, even though we lack experimental evidence for transcriptional expression.
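Whether an ORF is "not disrupted" can be checked mechanically once the exons have been joined into a candidate coding sequence. The Python sketch below applies one simple criterion under the standard nuclear genetic code: the sequence starts with ATG, has a length divisible by three, ends in a stop codon, and has no in-frame stop before the end. The source does not describe the exact criteria used, so this is an assumed illustration, and the function name and test sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nuclear genetic code


def has_intact_orf(cds: str) -> bool:
    """True if cds starts with ATG, ends with a stop, and has no premature stop."""
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False  # too short, or a frameshift relative to the start codon
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon before the final one interrupts the reading frame.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


# Hypothetical toy sequences (not TUBB4Q data).
print(has_intact_orf("ATGGCTTGA"))     # intact: ATG GCT TGA
print(has_intact_orf("ATGTAAGCTTGA"))  # premature in-frame stop (TAA)
print(has_intact_orf("ATGGCTGA"))      # length not a multiple of three
```

An intact ORF by this test is necessary but not sufficient for function, which is why the substitution-pattern analysis is also required.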

Figure  4
Phylogenetic analysis of the protein sequences of TUBB4Q members of human (“HSA”; red), chimpanzee (“PTR”; blue), baboon (“PHA”; green), and squirrel monkey (“SSC”; pink) with an ORF, using ...
Figure  6 (Online Only)
Pairwise alignment of 15 putative TUBB4Q protein sequences (452 sites). The member identification names are listed at the left, with potential functional genes underlined. Abbreviations: HSA (human), PTR (chimpanzee), PHA (baboon), and SSC (squirrel monkey). ...

Potential functionality of the TUBB4Q cDNA sequences (theoretically extracted manually from the genomic sequences; alignment available on request) was tested statistically by comparison of the ratios of nonsynonymous (Ka) to synonymous (Ks) base substitution by use of MEGA version 2.0 (Kumar et al. 1994). Pseudogenes are not under functional constraints, and therefore nonsynonymous and synonymous base substitutions are expected to evolve at equal rates, unlike functional genes, in which nonsynonymous base substitutions are restricted (Meireles et al. 1999; Yang et al. 2000). The cDNAs with an intact ORF were jointly analyzed for the nucleotide-substitution pattern (table 2), whereas cDNAs with an interrupted ORF were already considered pseudogenes. The mean of the substitution ratio (Ka:Ks) of all functional genes in mammals is estimated to be ~0.21 (Li 1997). Comparison of PHA IVq with the human HSA4q35 (Ka/Ks=0.199; Ka-Ks=-0.205±0.034; Z-test 6.062 with P<.0001) and chimpanzee PTR IVq (Ka/Ks=0.190; Ka-Ks=-0.183±0.031; Z-test 5.839 with P<.0001) ortholog TUBB4Q members indicates a higher Ka:Ks ratio (mean 0.195) than comparison with the paralogous human HSA10p15 (Ka/Ks=0.065; Ka-Ks=-0.217±0.031; Z-test 6.912 with P<.0001) and chimpanzee PTR Xp (Ka/Ks=0.075; Ka-Ks=-0.210±0.031; Z-test 6.780 with P<.0001) copies (mean 0.070). Similar ratios were calculated by comparison of the same catarrhine members to the squirrel monkey SSCTUB member (table 2), although we lack evidence for an orthologous relationship. The Ka:Ks ratios of HSA10p15 and PTR Xp to PHA IVq or SSCTUB, respectively, clearly indicate significant selective constraint. In comparison, the Ka:Ks values of known functional β-tubulin genes among human and mouse vary between 0.022 (TUBB5) and 0.067 (TUBB2), which suggests higher selective pressure on these structural proteins than average.
The Ka:Ks ratios of HSA4q35 and PTR IVq to the (orthologous) PHA IVq or SSCTUB are similar to the gene average, which may suggest that evolutionary selective pressure was long present but was recently lifted in humans and chimpanzees. This may explain the higher Ka:Ks ratios between HSA4q35 and PTR IVq (0.48; see table 3).
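The reported summary statistics are linked by simple arithmetic, which the following Python sketch makes explicit. Given a Ka:Ks ratio and a Ka-Ks difference, the individual Ka and Ks values can be recovered, and a Z statistic can be approximated as |Ka-Ks| divided by its standard error. Treating the value after "±" as a standard error is an assumption, and because the inputs are rounded as printed, the results only approximate the published Z values (for example, about 6.03 versus the reported 6.062). The function name is hypothetical.

```python
def ka_ks_summary(ka_over_ks: float, ka_minus_ks: float, std_err: float):
    """Recover Ka, Ks, and an approximate Z from rounded summary statistics.

    From Ka = ratio * Ks and Ka - Ks = diff it follows that
    Ks = diff / (ratio - 1).
    """
    ks = ka_minus_ks / (ka_over_ks - 1.0)
    ka = ka_over_ks * ks
    z = abs(ka_minus_ks) / std_err  # assumes the "±" value is a standard error
    return ka, ks, z


# Values as printed in the text for comparisons against PHA IVq.
comparisons = {
    "HSA4q35":  (0.199, -0.205, 0.034),
    "PTR IVq":  (0.190, -0.183, 0.031),
    "HSA10p15": (0.065, -0.217, 0.031),
    "PTR Xp":   (0.075, -0.210, 0.031),
}
MAMMALIAN_MEAN = 0.21  # mean Ka:Ks of functional mammalian genes (Li 1997)

for member, (ratio, diff, se) in comparisons.items():
    ka, ks, z = ka_ks_summary(ratio, diff, se)
    print(f"{member}: Ka = {ka:.3f}, Ks = {ks:.3f}, Z = {z:.2f}, "
          f"ratio / mammalian mean = {ratio / MAMMALIAN_MEAN:.2f}")
```

Under neutrality the ratio is expected to be near 1, so values well below 1 with a large Z are read as evidence of purifying selection.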

Table 2
Pairwise Nonsynonymous (Upper Right Matrix) and Synonymous (Lower Left Matrix) Nucleotide Substitutions among 15 cDNAs from Different TUBB4Q Members Using the Distance Method from Nei-Gojobori (Jukes-Cantor)
Table 3
Nonsynonymous versus Synonymous (Ka:Ks)

Four human members have structural orthologs in baboon and potentially are representative of the ancestral copy. Of these members only the baboon ortholog of HSA4q35 (PHA IVq) is likely a functional gene. The HSA1q42.3, HSA1q43-44, and HSAYq11 members are all considered pseudogenes in all species tested. It is likely that they were derived from their functional counterpart. Therefore, the progenitor of all derivative human TUBB4Q copies was almost certainly HSA4q35.

We hypothesize that the HSA4q35 tubulin was once a functional gene (fig. 5). Because of its proclivity to duplicate to subtelomeric locations, a novel tubulin member was transposed to 10p15, ~7.3 million years ago. We propose that mutations have accumulated on the original copy since this time, while the HSA10p15 copy has been under selection pressure. Interestingly, no mutations have occurred to disrupt the ORF of the HSA4q35 TUBB4Q, making it a possible resource for the development of a future, highly specialized, evolved tubulin gene, expanding the present-day β-tubulin gene family.

Figure  5
Schematic representation of the TUBB4Q duplication events during the catarrhine evolution. Within the catarrhine tree, duplication events are indicated by boxes and vertical circled arrows. The locations where the TUBB4Q members will finally end up in ...

This duplicated segment has been part of the hominoid/cercopithecoid clades since early in their evolution and suggests an exceptional level of evolutionary instability for a genomic segment. Chromosomal location (pericentromeric or telomeric) is undoubtedly an important underlying aspect of genomic instability. Our study emphasizes the dynamic nature of duplication-mediated genome evolution and the delicate balance between gene acquisition and silencing.


This work was supported by the Muscular Dystrophy Association and by the German Research Foundation (DFG; research grant Ha1374/5-2). We would like to thank Barbara Swiatkiewicz and Joe Catanese, for their excellent assistance with the genomic libraries; Linda Haley and Sheila Sait, for the human FISH analyses; and Gert-Jan van Ommen, for critical reading of the manuscript.

Electronic-Database Information

The URL for data in this article is as follows:

BACPAC Resources Home Page, http://www.chori.org/bacpac/ (for BAC-end sequencing method)


Eichler EE (1998) Masquerading repeats: paralogous pitfalls of the human genome. Genome Res 8:758–762
Eichler EE, Budarf ML, Rocchi M, Deaven LL, Doggett NA, Baldini A, Nelson DL, Mohrenweiser HW (1997) Interchromosomal duplications of the adrenoleukodystrophy locus: a phenomenon of pericentromeric plasticity. Hum Mol Genet 6:991–1002
Eichler EE, Lu F, Shen Y, Antonacci R, Jurecic V, Doggett NA, Moyzis RK, Baldini A, Gibbs RA, Nelson DL (1996) Duplication of a gene-rich cluster between 16p11.1 and Xq28: a novel pericentromeric-directed mechanism for paralogous genome evolution. Hum Mol Genet 5:899–912
Ewing B, Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8:186–194
Ewing B, Hillier L, Wendl MC, Green P (1998) Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 8:175–185
Goodman M (1999) The genomic record of humankind's evolutionary roots. Am J Hum Genet 64:31–39
Gordon D, Abajian C, Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8:195–202
Grewal PK, van Geel M, Frants RR, de Jong P, Hewitt JE (1999) Recent amplification of the human FRG1 gene during primate evolution. Gene 227:79–88
Kermouni A, Van Roost E, Arden KC, Vermeesch JR, Weiss S, Godelaine D, Flint J, Lurquin C, Szikora JP, Higgs DR, Marynen P, Renauld JC (1995) The IL-9 receptor gene (IL9R): genomic structure, chromosomal localization in the pseudoautosomal region of the long arm of the sex chromosomes, and identification of IL9R pseudogenes at 9qter, 10pter, 16pter, and 18pter. Genomics 29:371–382
Kumar S, Tamura K, Nei M (1994) MEGA: molecular evolutionary genetics analysis software for microcomputers. Comput Appl Biosci 10:189–191
Li W-H (1997) Molecular evolution. Sinauer Associates, Sunderland, MA, pp 1–487
Meireles CM, Czelusniak J, Goodman M (1999) The Tarsius gamma-globin gene: pseudogene or active gene? Mol Phylogenet Evol 13:434–439
Ohno S (1999) The one-to-four rule and paralogues of sex-determining genes. Cell Mol Life Sci 55:824–830
Pryde FE, Gorham HC, Louis EJ (1997) Chromosome ends: all the same under their caps. Curr Opin Genet Dev 7:822–828
Rouquier S, Taviaux S, Trask BJ, Brand-Arpon V, van den Engh G, Demaille J, Giorgi D (1998) Distribution of olfactory receptor genes in the human genome. Nat Genet 18:243–250 (erratum in Nat Genet 19:102)
Shan Z, Zabel B, Trautmann U, Hillig U, Ottolenghi C, Wan Y, Haaf T (2000) FISH mapping of the sex-reversal region on human chromosome 9p in two XY females and in primates. Eur J Hum Genet 8:167–173
Shen SH, Slightom JL, Smithies O (1981) A history of the human fetal globin gene duplication. Cell 26:191–203
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876–4882
Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680
Trask BJ, Friedman C, Martin-Gallardo A, Rowen L, Akinbami C, Blankenship J, Collins C, Giorgi D, Iadonato S, Johnson F, Kuo WL, Massa H, Morrish T, Naylor S, Nguyen OT, Rouquier S, Smith T, Wong DJ, Youngblom J, van den Engh G (1998a) Members of the olfactory receptor gene family are contained in large blocks of DNA duplicated polymorphically near the ends of human chromosomes. Hum Mol Genet 7:13–26
Trask BJ, Massa H, Brand-Arpon V, Chan K, Friedman C, Nguyen OT, Eichler E, van den Engh G, Rouquier S, Shizuya H, Giorgi D (1998b) Large multi-chromosomal duplications encompass many members of the olfactory receptor gene family in the human genome. Hum Mol Genet 7:2007–2020
van Geel M, Heather LJ, Lyle R, Hewitt JE, Frants RR, de Jong PJ (1999) The FSHD region on human chromosome 4q35 contains potential coding regions among pseudogenes and a high density of repeat elements. Genomics 61:55–65
van Geel M, van Deutekom JC, van Staalduinen A, Lemmers RJ, Dickson MC, Hofker MH, Padberg GW, Hewitt JE, de Jong PJ, Frants RR (2000) Identification of a novel beta-tubulin subfamily with one member (TUBB4Q) located near the telomere of chromosome region 4q35. Cytogenet Cell Genet 88:316–321
Wirth J, Nothwang HG, van der Maarel S, Menzel C, Borck G, Lopez-Pajares I, Brondum-Nielsen K, Tommerup N, Bugge M, Ropers HH, Haaf T (1999) Systematic characterisation of disease associated balanced chromosome rearrangements by FISH: cytogenetically and genetically anchored YACs identify microdeletions and candidate regions for mental retardation genes. J Med Genet 36:271–278
Yang Z, Nielsen R, Goldman N, Pedersen AM (2000) Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449
Zerucha T, Ekker M (2000) Distal-less-related homeobox genes of vertebrates: evolution, function, and regulation. Biochem Cell Biol 78:593–601

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics