NIH Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Nat Genet. Author manuscript; available in PMC Sep 9, 2009.
Published in final edited form as:
Published online May 4, 2008. doi:  10.1038/ng.126
PMCID: PMC2740655

The mouse X chromosome is enriched for multi-copy testis genes exhibiting post-meiotic expression


According to the prevailing view, mammalian X chromosomes are enriched in spermatogenesis genes expressed before meiosis1–3 and deficient in spermatogenesis genes expressed after meiosis2,3. The paucity of post-meiotic genes on the X chromosome has been interpreted as a consequence of Meiotic Sex Chromosome Inactivation (MSCI) – the complete silencing of genes on the XY bivalent at meiotic prophase4,5. Recent studies have concluded that MSCI-initiated silencing persists beyond meiosis6–8 and that most X-genes remain repressed in round spermatids7. We report here that 33 multi-copy gene families, representing ~273 mouse X-linked genes, are expressed in the testis and that this expression is predominantly in post-meiotic cells. RNA FISH and microarray analysis show that the maintenance of X chromosome post-meiotic repression is incomplete. Furthermore, X-linked multi-copy genes exhibit expression levels similar to those of autosomal genes. Thus, not only is the mouse X chromosome enriched for spermatogenesis genes functioning before meiosis, but in addition ~18% of mouse X-linked genes exhibit post-meiotic expression.

The existing model that the X chromosome is deficient in post-meiotic spermatogenesis genes was based primarily on the analysis of single-copy genes2,3. We therefore sought to explore the spermatogenesis expression profiles of X-linked multi-copy genes. We performed a systematic search for ampliconic regions - comprising palindromic or tandem segmental duplications - and their associated genes on the mouse X chromosome. To identify palindromic repeats, we used the Inverted Repeats Finder (IRF) program9, focusing on palindromes with large arms (≥8kb) that exhibit ≥90% nucleotide identity and lie <500kb from each other. Using these criteria, we identified 17 palindromic regions, seven of which contained multiple types of repeat units (Table 1). To detect amplicons consisting only of tandem repeats, we searched for multi-copy gene clusters (Supplementary Table 1). This search revealed five additional ampliconic regions (Amp7, Amp17, Amp19, Amp21, and Amp22; Table 1), and confirmed the existence of all 17 IRF-identified palindromes. In sum, we identified 22 ampliconic regions consisting of 29 distinct repeat units whose sequence complexity totals 1,474kb (Table 1). Together, these ampliconic sequences comprise approximately 19.4Mb, or ~12% of the ~166Mb mouse X chromosome (Table 1).
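The palindrome selection criteria can be illustrated with a short filter. This is a minimal sketch, not the IRF program or the authors' pipeline: the `InvertedRepeat` record, its field names, and the reading of the <500kb criterion as the spacer between the two arms are assumptions made for illustration.

```python
from dataclasses import dataclass

# Thresholds stated in the text.
MIN_ARM_BP = 8_000        # arms of at least 8kb
MIN_IDENTITY = 0.90       # at least 90% nucleotide identity between arms
MAX_SPACER_BP = 500_000   # arms less than 500kb apart (assumed interpretation)


@dataclass(frozen=True)
class InvertedRepeat:
    """Hypothetical record for one inverted repeat (coordinates in bp)."""
    left_start: int
    left_end: int
    right_start: int
    right_end: int
    identity: float  # fraction of identical nucleotides between the arms

    @property
    def arm_length(self) -> int:
        # Use the shorter arm so both arms satisfy the size threshold.
        return min(self.left_end - self.left_start,
                   self.right_end - self.right_start)

    @property
    def spacer(self) -> int:
        return self.right_start - self.left_end


def passes_filters(repeat: InvertedRepeat) -> bool:
    """Return True if the repeat meets all three criteria from the text."""
    return (repeat.arm_length >= MIN_ARM_BP
            and repeat.identity >= MIN_IDENTITY
            and repeat.spacer < MAX_SPACER_BP)


def select_palindromes(repeats):
    """Keep only repeats that pass the size, identity and spacing filters."""
    return [r for r in repeats if passes_filters(r)]
```

Taking the shorter of the two arms is a conservative choice; a different convention (for example, the aligned length) would change which borderline repeats are kept.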

Table 1
Ampliconic regions of the mouse X chromosome

The contiguous assembly of ampliconic regions is known to be difficult10. Of the 24 physical gaps in the mouse X chromosome (NCBI Mus musculus genome Build 37.1), 21 reside within amplicons (Supplementary Table 2). Five regions of the mouse X chromosome have yet to be completely assembled, because they contain huge (>2Mb) and highly complex ampliconic structures (Amp1, Amp4, Amp7, Amp17, and Amp19; Fig. 1a, b; Table 1). These five regions represent the vast majority (an estimated 87% or 16.9Mb, Supplementary Table 2) of the 19.4Mb of ampliconic sequence and most (19 of 24) X chromosome assembly gaps (Supplementary Table 2).

Figure 1
Mouse X chromosome ampliconic regions containing testis-expressed genes. a, Examples of complexity (Amp1, left), massive scale (Amp4, center), and tandem duplications (Amp19, right) in ampliconic regions. Each ampliconic region is compared to itself in ...

We next asked whether mouse X-ampliconic genes exhibit testis-biased expression. We searched X-ampliconic regions for ESTs and novel/predicted genes and identified 26 multi-copy genes located entirely within these ampliconic regions (Supplementary Table 3). RT-PCR revealed that 23 of the 26 X-ampliconic multi-copy genes are expressed predominantly or exclusively in the testis (Fig. 2). The 23 testis-expressed multi-copy genes reside in 20 of the 22 ampliconic regions (Fig. 1b, c). Examination of the genomic structure of these 20 regions (Supplementary Fig. 1) showed that the X-ampliconic gene copy number ranges from two to ~28 (Fig. 1c), with a total of ~232 ampliconic protein-coding gene copies exhibiting testis-biased expression.

Figure 2
Mouse X chromosome ampliconic and non-ampliconic multi-copy genes exhibit testis-biased expression as shown by RT-PCR. We assayed 26 ampliconic and 10 non-ampliconic multi-copy genes across 11 different tissues – Cxx and Obp1 were not detected ...

To determine if testis-biased expression of multi-copy genes is a phenomenon limited to ampliconic regions, we also analyzed non-ampliconic multi-copy genes. We defined all regions of the mouse X chromosome not harboring ampliconic sequences as non-ampliconic sequence (~88% of the mouse X chromosome). We selected genes present in at least two copies, located within 1Mb of each other and with ≥85% amino acid identity (Supplementary Table 1). Ten multi-copy genes were identified (Fig. 1c), ranging in copy number from two to 14 and totaling 41 gene copies. Like the X-ampliconic genes, all ten multi-copy genes exhibited testis-biased expression (Fig. 2). This indicates that the mouse X chromosome has at least 33 multi-copy genes (23 ampliconic and 10 non-ampliconic) with testis-biased gene expression. Given that there are 1,555 X-linked genes (NCBI Build 37.1), ~18% (232 ampliconic + 41 non-ampliconic = 273 gene copies) of X-linked genes are multi-copy genes with testis-biased expression, assuming all copies of each family are expressed. Testis ESTs for multi-copy family members (Supplementary Table 3) and copy-specific RT-PCR assays (Supplementary Fig. 2) both suggest that most, if not all, of the 33 multi-copy genes are expressed from more than a single copy.
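The ~18% estimate follows directly from the counts given above. The snippet below simply reproduces that arithmetic; the variable names are illustrative, and all numbers are taken from the text.

```python
# Counts reported in the text.
ampliconic_copies = 232        # testis-biased ampliconic gene copies
non_ampliconic_copies = 41     # testis-biased non-ampliconic gene copies
x_linked_genes = 1555          # X-linked genes, NCBI Build 37.1

multi_copy_total = ampliconic_copies + non_ampliconic_copies
fraction = multi_copy_total / x_linked_genes

# Prints: 273 gene copies, about 17.6% of X-linked genes
print(f"{multi_copy_total} gene copies, about {fraction:.1%} of X-linked genes")
```

As in the text, this figure assumes that every copy of each family is expressed.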

We then determined whether the 33 multi-copy X-linked genes are expressed in germ cells or somatic testis cells. We performed RT-PCR on XXSry and XXSxrb testes that lack germ cells due to a combination of the absence of spermatogonial proliferation factor Eif2s3y as well as the toxic effects of double X chromosome dosage11,12. For 28 of the 33 multi-copy genes, no expression was detected in either XXSry or XXSxrb testes (Fig. 3a), indicating that these 28 multi-copy genes are germ cell-specific.

Figure 3
Mouse X chromosome ampliconic and non-ampliconic multi-copy genes are expressed predominantly in germ cells during post-meiotic spermatogenesis as shown by RT-PCR. a. All 23 multi-copy genes exhibiting testis-biased expression were assayed in wild type ...

Next, we established at what stage of spermatogenesis these 28 germ cell-specific genes initiate expression. In prepubertal mice, the first wave of spermatogenesis is a synchronized process, with progressively more mature spermatogenic cell types appearing at defined time points after birth. Prepubertal mice thus allow the correlation of expression patterns with the appearance of specific germ cell substages. We found that 20 of the 28 multi-copy X-linked genes initiate expression at or after 18.5dpp (Fig. 3b), when secondary spermatocytes and round spermatids first appear. The other eight X-linked genes initiate expression before meiosis, at 7.5dpp (Fig. 3b), and this early expression would mask any potential post-meiotic reactivation (Fig. 4). Despite this, we find that the majority (20 of 28) of X-linked multi-copy genes are specifically expressed in post-meiotic spermatogenic cells. This finding is unexpected since the X chromosome retains a repressed transcriptional state in post-meiotic cells6–8.

Figure 4
Multi-copy ampliconic genes exhibit higher levels of reactivation in round spermatids than single-copy genes. Four single-copy and three multi-copy X-linked genes were assayed for expression during successive stages of spermatogenesis by RNA/DNA FISH. ...

To address the degree of post-meiotic repression, we examined nascent transcription of individual X-linked genes in round spermatids via RNA FISH (Fig. 4). We selected four single-copy X-linked genes: Birc4, Scml2, Fmr1 and Zfx. In a previous microarray study, each of these genes was found to be subject to MSCI and to remain repressed during spermiogenesis7. To discriminate between X- and Y-bearing round spermatids we included a second RNA FISH probe against the spermatid-expressed Y-linked gene Sly (Supplementary Fig. 3a), which marks all Y-bearing round spermatids (data not shown). To ensure that the efficiency of RNA FISH was equivalent between different spermatogenic cell types, we also performed RNA FISH on mice carrying a ubiquitously expressed autosomal transgene (Supplementary Fig. 3b). All four X-linked genes are expressed in spermatogonia and are then silenced in all late pachytene spermatocytes examined (Fig. 4; see legend for quantitation). This repression is maintained in the majority of X-bearing round spermatids but in contrast to previous reports, three of the four genes also exhibit some degree of reactivation: 0%, 7%, 17% and 18% for Birc4, Scml2, Fmr1 and Zfx, respectively (see legend of Fig. 4 and Supplementary Table 4 for details of quantification).

Do other single-copy X-linked genes behave the same way? We analyzed published microarray data from isolated germ cell populations to examine the post-meiotic expression levels of 278 single-copy genes classified as being expressed specifically in A or B spermatogonia7. We calculated the average expression values for these 278 genes in pachytene spermatocytes and round spermatids and found that round spermatids exhibit a significantly higher average level of expression than pachytene spermatocytes (P < 0.0002, Wilcoxon Ranks Sum, Fig. 5a). This suggests there is a general low level of post-meiotic reactivation for many genes that are also expressed earlier in A or B spermatogonia. We conclude that, in contrast to the complete meiotic silencing of X-linked genes during MSCI, post-meiotic repression of the X chromosome is incomplete.

Figure 5
Microarray analyses of single-copy and multi-copy genes on the X chromosome. a, Mean expression levels (+/− two standard error from mean) of 278 single-copy genes, previously defined as being expressed solely in spermatogonia7, in A and B spermatogonia ...

The reactivation of single-copy X-linked genes in a small percentage of round spermatids provides a potential explanation for the amplification of spermatid-expressed X-linked genes: increasing gene copy number raises the probability that an X-bearing round spermatid will express a given multi-copy gene. To test this, we examined three multi-copy genes by RNA FISH: Ott (~12 copies), Slx like-1 (~14 copies) and Slx (~25 copies). Unlike the single-copy X-linked genes, the majority of X-bearing round spermatids showed RNA-FISH signals for all three multi-copy genes (Fig. 4). Furthermore, the percentage of X-bearing round spermatids that exhibited expression increased with gene copy number: 54% for Ott, 78% for Slx like-1 and 93% for Slx. For comparison, we examined the frequency of round spermatid expression for a single-copy spermatid-specific autosomal gene, Zfp29. We detected Zfp29 RNA FISH signals in 64% of round spermatids, a frequency similar to that of the three X-linked multi-copy genes. The high frequency of X-bearing round spermatids expressing a given multi-copy gene, as compared to single-copy X-linked genes, suggests that gene amplification facilitates higher levels of gene expression in the face of post-meiotic X chromosome repression.
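The intuition that copy number raises the chance of detecting expression can be illustrated with a toy calculation. This is only a sketch under a strong assumption that is not made in the paper: each gene copy is reactivated independently with the same per-copy probability. The function name and the choice of per-copy probabilities (the 7%, 17% and 18% reactivation frequencies observed for single-copy genes) are illustrative, and the model is not fitted to the observed multi-copy frequencies.

```python
def prob_any_copy_expressed(per_copy_prob: float, copies: int) -> float:
    """Probability that at least one of `copies` independent copies is expressed.

    Assumes independence and an identical per-copy probability, which is a
    simplification introduced here for illustration only.
    """
    if not 0.0 <= per_copy_prob <= 1.0:
        raise ValueError("per_copy_prob must be between 0 and 1")
    if copies < 0:
        raise ValueError("copies must be non-negative")
    return 1.0 - (1.0 - per_copy_prob) ** copies


if __name__ == "__main__":
    # Approximate copy numbers from the text: Ott ~12, Slx like-1 ~14, Slx ~25.
    for per_copy_prob in (0.07, 0.17, 0.18):
        for copies in (1, 12, 14, 25):
            p = prob_any_copy_expressed(per_copy_prob, copies)
            print(f"p={per_copy_prob:.2f} copies={copies:>2} -> {p:.2f}")
```

Under this simplification the probability rises monotonically with copy number, which is the qualitative trend reported for Ott, Slx like-1 and Slx; real copies need not be independent or equally active.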

Do multi-copy X-linked genes show higher spermatid expression levels than single-copy X-linked genes at the level of mature RNA? To address this we analyzed published microarray data from isolated cell populations of pachytene spermatocytes and round spermatids7. We identified 24 X-linked multi-copy genes, 552 X-linked single-copy genes and 20,236 autosomal genes. In pachytene spermatocytes, both X-linked multi-copy and single-copy genes exhibit expression levels consistent with the presence of MSCI (Fig. 5b). However, in round spermatids X-linked multi-copy genes show a significantly higher average level of expression than X-linked single-copy genes (P < 10⁻⁵, Wilcoxon Ranks Sum, Fig. 5b; Supplementary Table 5). This high level of post-meiotic X-linked multi-copy gene expression further suggests that X-gene amplification compensates for post-meiotic repression.

Evolutionary models predict the accumulation of male advantage alleles on X chromosomes13. X-linked spermatogenesis genes identified prior to this study are primarily expressed in pre-meiotic cells1,2. Here we describe a collection of 33 X-linked multi-copy genes that are expressed in post-meiotic spermatogenic cells. Some of these gene families may have important roles during sperm maturation, including Mgclh, whose autosomal paralogue Mgcl1 is required for acrosome formation and spermatid chromatin condensation14 and 1700012L04Rik, which encodes a novel spermatid-specific histone variant, H2AL115.

To understand the unusual expression pattern of X-linked multi-copy genes in the face of post-meiotic repression, we have re-examined the extent of X chromosome repression during spermiogenesis. Post-meiotic X-repression is incomplete, with pre-meiotic expressed single-copy X-linked genes exhibiting reactivation at varying levels. This reconciles the apparent conflict between studies documenting the existence of a repressed X chromosome in post-meiotic spermatogenic cells6–8 and those showing reactivation of selected single-copy X-linked genes16. Importantly, we find that X-linked multi-copy genes yield an elevated frequency of expressing X-bearing round spermatids and that the average expression levels of these genes are higher than those of X-linked single-copy genes. It is possible that other compensatory mechanisms, aside from increased copy number, also counteract post-meiotic repression, because rare cases of robust reactivation of single-copy X-linked genes (e.g. Ube1x) have also been uncovered8. Nevertheless, amplification of X-linked genes may have evolved to compensate for the repressive chromatin environment affecting the X chromosome in post-meiotic cells (Fig. 5c). Indeed, the human X chromosome harbors multi-copy genes similar to those described here as well as primate-specific multi-copy genes exhibiting post-meiotic expression17,9,18.


Amplicon identification and multi-copy gene discovery

Individual mouse X chromosome amplicons (palindromic or tandem segmental duplications) were first identified using the Inverted Repeats Finder (IRF) program9. The Mus musculus reference sequence (NCBI Build 36.1) was used to select palindromes with arms ≥8kb, ≥90% nucleotide identity between arms, and arms <500kb from each other. The ≥8kb arm size restriction, used in the Warburton et al. (2004) palindrome analyses, should eliminate all repeats due to recent LINE insertions. Custom tracks can be downloaded from https://tandem.bu.edu to display all mouse X chromosome palindromes on the UCSC Genome Browser. In some cases the IRF program truncated and/or missed palindrome arms. Therefore, to resolve the complete structure of all IRF-identified palindromes, custom Perl code19 was used to characterize their genomic structure via triangular dot-plots (Supplementary Fig. 1).

Since the IRF program identifies only palindromes, we utilized an alternative strategy to detect amplicons in tandem, which could also confirm the IRF-identified amplicons. Ensembl’s Biomart program (http://www.ensembl.org/) was used to obtain all X-linked protein-coding genes with paralogous genes on the X chromosome (Supplementary Table 1). Multi-copy genes less than 1Mb apart, which share ≥80% amino acid identity between any two copies, and which are not found in multiple copies throughout the mouse genome (e.g. retroviral proteins, ribosomal proteins, or olfactory receptors) were considered putative ampliconic regions. Multi-copy gene clusters passing these three criteria (Supplementary Table 1) were subjected to triangular dot-plot analysis19 to determine if the region is ampliconic. Ampliconic regions identified via this method confirmed the IRF-identified amplicons and revealed five new tandemly arrayed amplicons (Table 1). Repeat unit boundaries and percent identities between exclusively tandemly arrayed amplicons were determined via ClustalW20. Multi-copy gene clusters fulfilling the three selection criteria but not present in ampliconic regions were termed non-ampliconic multi-copy genes. To estimate gene copy number, we used both the NCBI Build 37.1 annotation and protein sequence searches via TBLASTN to the mouse X chromosome reference sequence.
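The three selection criteria for multi-copy gene clusters can be expressed as a small predicate. This is a hedged sketch rather than the authors' code: the function name, the input layout, and the interpretation of "any two copies" as "at least one pair of copies" are assumptions, and the dispersed-family flag is taken as a given input rather than computed.

```python
from itertools import combinations

MAX_DISTANCE_BP = 1_000_000   # copies less than 1Mb apart
MIN_AA_IDENTITY = 0.80        # at least 80% amino acid identity


def is_putative_ampliconic_cluster(copy_positions, pairwise_identity,
                                   dispersed_genome_wide):
    """Apply the three criteria from the text to one gene family.

    copy_positions: dict mapping copy name -> X chromosome position (bp).
    pairwise_identity: dict mapping frozenset({name_a, name_b}) -> identity (0-1).
    dispersed_genome_wide: True for families found in multiple copies
        throughout the genome (e.g. ribosomal proteins), which are excluded.
    """
    if dispersed_genome_wide or len(copy_positions) < 2:
        return False
    for (name_a, pos_a), (name_b, pos_b) in combinations(
            sorted(copy_positions.items()), 2):
        close = abs(pos_a - pos_b) < MAX_DISTANCE_BP
        identity = pairwise_identity.get(frozenset((name_a, name_b)), 0.0)
        # Assumed reading: one nearby, sufficiently similar pair is enough.
        if close and identity >= MIN_AA_IDENTITY:
            return True
    return False
```

Families passing this predicate would still need dot-plot analysis, as described above, to decide whether the region is ampliconic or non-ampliconic.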

To identify all protein-coding genes within the ampliconic repeats, each repeat unit was repeat masked (http://www.repeatmasker.org) and subsequently compared to the NCBI mouse EST database using BLAST21. We also searched, via BLAST, for predicted genes without EST support (NCBI mouse ab initio database) to uncover novel transcripts. Only genes falling entirely within a repeat unit were considered candidate ampliconic genes. Supplementary Table 3 provides a list of all novel and known genes identified via BLAST searches.


Total RNA was extracted using Trizol® (Gibco BRL) according to manufacturer’s instructions. For DNAse treatment, 3μg total RNA was incubated with 1.5 U RQ DNAse 1 (Promega) at 37°C for 90 mins. The RNA was precipitated, resuspended and then reverse transcribed using Superscript II reverse transcriptase (200 U, Gibco BRL) for 90 mins at 42°C. PCR was performed using the parameters: 1 cycle 94°C 3 mins, 35 cycles 94°C 30secs, 56°C 30secs, 72°C 30secs, and then 1 cycle 72°C 10mins. Primer sequences (Supplementary Table 3) for all RT-PCR reactions were designed to detect at least two copies of each family member. RNA/DNA FISH was carried out exactly as described in Turner et al. (2006). We used long range PCR products from BACs as probes for Birc4, Fmr1, Scml2, Ott, Slx-like 1, Slx, and Zfx. Primer sequences and BAC identifier names are in Supplementary Table 3.

Microarray and Statistical Analyses

We analyzed published microarray expression data (Gene Expression Omnibus GSE4193)7 from isolated germ cell populations of A spermatogonia, B spermatogonia, pachytene spermatocytes and round spermatids (two replicates for each cell population). Each array was normalized as in Namekawa et al. (2006) to set the trimmed (2% from each side) mean signal intensity to 125. Probes with <100 signal intensity at all timepoints were discarded, because their expression level estimates are unreliable. Probes which mapped to autosomal genes were also removed from analysis. Single-copy X-linked genes, summing to 552, were selected from Ensembl annotations of genes without mouse paralogs. The 25 multi-copy genes’ probe-ids and expression values for each timepoint are listed in Supplementary Table 5. When multiple probes matched a multi-copy gene (see Supplementary Table 5 for genes with multiple probe_ids) we selected a single probe, with the median signal intensity. Wilcoxon Ranks Sum paired tests, using JMP 5.1 software (SAS Institute Inc.), were performed on all comparisons of average expression levels and were used because they diminish the influence of outliers.
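The normalization and probe-filtering steps can be sketched as follows. This is a minimal illustration, not the procedure from Namekawa et al. (2006): multiplicative scaling, the way the 2% trim is rounded, and all function names are assumptions made here.

```python
import statistics


def trimmed_mean(values, trim_fraction=0.02):
    """Mean after dropping `trim_fraction` of values from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # assumed rounding: truncate
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.fmean(kept)


def normalize_array(signals, target=125.0, trim_fraction=0.02):
    """Scale one array so that its trimmed mean equals `target`.

    Multiplicative scaling is an assumption of this sketch.
    """
    scale = target / trimmed_mean(signals, trim_fraction)
    return [s * scale for s in signals]


def keep_probe(signals_across_timepoints, floor=100.0):
    """Discard probes whose signal is below `floor` at all timepoints."""
    return any(s >= floor for s in signals_across_timepoints)
```

For example, `normalize_array` applied to each array separately puts all arrays on a common scale before probes are filtered with `keep_probe` and expression levels are compared between cell populations.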

Supplementary Material


We thank J. Alfoldi, D. Bellott, P. Burgoyne, H. Byers, J. Hughes, J. Lange, L. Reynard, S. Rozen, H. Skaletsky for helpful comments on the manuscript; D. Bellott, M. Gill, Y. Hu, J. Hughes, H. Skaletsky for technical advice; Paul Burgoyne and Mike Mitchell for the Uty transgenic line. This work was supported by National Institutes of Health Fellowship F32HD052379 (to J.L.M.) and the Howard Hughes Medical Institute.


Author contributions

Amplicon identification and multicopy gene discovery were performed by JLM, JMAT, SKM and PEW. RT-PCRs and RNA FISH were performed by JLM, JMAT and SKM. Microarray and statistical analysis were performed by JLM and PJP. The paper was written by JLM, JMAT and DCP.


1. Wang PJ, McCarrey JR, Yang F, Page DC. An abundance of X-linked genes expressed in spermatogonia. Nat Genet. 2001;27:422–6.
2. Khil PP, Smirnova NA, Romanienko PJ, Camerini-Otero RD. The mouse X chromosome is enriched for sex-biased genes not subject to selection by meiotic sex chromosome inactivation. Nat Genet. 2004;36:642–6.
3. Reinke V. Sex and the genome. Nat Genet. 2004;36:548–9.
4. McKee BD, Handel MA. Sex chromosomes, recombination, and chromatin conformation. Chromosoma. 1993;102:71–80.
5. Turner JM. Meiotic sex chromosome inactivation. Development. 2007;134:1823–31.
6. Greaves IK, Rangasamy D, Devoy M, Marshall Graves JA, Tremethick DJ. The X and Y chromosomes assemble into H2A.Z-containing facultative heterochromatin following meiosis. Mol Cell Biol. 2006;26:5394–405.
7. Namekawa SH, et al. Postmeiotic sex chromatin in the male germline of mice. Curr Biol. 2006;16:660–7.
8. Turner JM, Mahadevaiah SK, Ellis PJ, Mitchell MJ, Burgoyne PS. Pachytene asynapsis drives meiotic sex chromosome inactivation and leads to substantial postmeiotic repression in spermatids. Dev Cell. 2006;10:521–9.
9. Warburton PE, Giordano J, Cheung F, Gelfand Y, Benson G. Inverted repeat structure of the human genome: the X-chromosome contains a preponderance of large, highly homologous inverted repeats that contain testes genes. Genome Res. 2004;14:1861–9.
10. Eichler EE, Clark RA, She X. An assessment of the sequence gaps: unfinished business in a finished human genome. Nat Rev Genet. 2004;5:345–54.
11. Mroz K, Carrel L, Hunt PA. Germ cell development in the XXY mouse: evidence that X chromosome reactivation is independent of sexual differentiation. Dev Biol. 1999;207:229–38.
12. Mazeyrat S, et al. A Y-encoded subunit of the translation initiation factor Eif2 is essential for mouse spermatogenesis. Nat Genet. 2001;29:49–53.
13. Rice WR. Sex-Chromosomes and the Evolution of Sexual Dimorphism. Evolution. 1984;38:735–742.
14. Kimura T, et al. Mouse germ cell-less as an essential component for nuclear integrity. Mol Cell Biol. 2003;23:1304–15.
15. Govin J, et al. Pericentric heterochromatin reprogramming by new histone variants during mouse spermiogenesis. J Cell Biol. 2007;176:283–94.
16. Wang PJ, Page DC, McCarrey JR. Differential expression of sex-linked and autosomal germ-cell-specific genes during spermatogenesis in the mouse. Hum Mol Genet. 2005;14:2911–8.
17. Westbrook VA, et al. Spermatid-specific expression of the novel X-linked gene product SPAN-X localized to the nucleus of human spermatozoa. Biol Reprod. 2000;63:469–81.
18. Westbrook VA, et al. Hominoid-specific SPANXA/D genes demonstrate differential expression in individuals and protein localization to a distinct nuclear envelope domain during spermatid morphogenesis. Mol Hum Reprod. 2006;12:703–16.
19. Kuroda-Kawaguchi T, et al. The AZFc region of the Y chromosome features massive palindromes and uniform recurrent deletions in infertile men. Nat Genet. 2001;29:279–86.
20. Waterston RH, et al. Initial sequencing and comparative analysis of the mouse genome. Nature. 2002;420:520–62.
21. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.

