Genetics. 2008 August; 179(4): 2325–2327.
doi: 10.1534/genetics.108.086819.
PMCID: PMC2516102
Two New Y-Linked Genes in Drosophila melanogaster
Maria D. Vibranovski,*1 Leonardo B. Koerich,*1 and A. Bernardo Carvalho*1,2
*Departamento de Genética, Instituto de Biologia, Universidade Federal do Rio de Janeiro, Caixa Postal 68011, CEP 21944-970, Rio de Janeiro, Brazil and Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, 60637-1503
1These authors contributed equally to this work.
2Corresponding author: Departamento de Genética, Instituto de Biologia, Universidade Federal do Rio de Janeiro, Caixa Postal 68011, CEP 21944-970, Rio de Janeiro, Brazil. E-mail: bernardo/at/biologia.ufrj.br
Communicating editor: T. C. Kaufman
Received January 8, 2008; Accepted May 22, 2008.
Abstract
The Y chromosome and other heterochromatic regions present special challenges for genome sequencing and for the annotation of genes. Here we describe two new genes (ARY and WDY) on the Drosophila melanogaster Y, bringing its number of known single-copy genes to 12. WDY may correspond to the fertility factor kl-1.
The Y chromosome of Drosophila melanogaster is not involved in sex determination, but is essential for male fertility (Bridges 1916). Formal genetics identified six “fertility factors” on the Y of D. melanogaster: kl-1, kl-2, kl-3, and kl-5 on the long arm, and ks-1 and ks-2 on the short arm (Kennison 1981; Hazelrigg et al. 1982; Gatti and Pimpinelli 1983). Molecular methods, on the other hand, have so far identified 10 protein-encoding single-copy genes (Gepner and Hays 1993; Carvalho et al. 2000, 2001; Carvalho and Clark 2003). The correspondence between these two sets of genes has been partially elucidated. The fertility factors kl-2, kl-3, and kl-5 encode three different axonemal dynein heavy chains, which are motor proteins responsible for flagellar beating (Goldstein et al. 1982; Gepner and Hays 1993; Carvalho et al. 2000). The genes ORY and CCY map to the same regions as the fertility factors ks-1 and ks-2, respectively, and may correspond to them (Carvalho et al. 2001). Only the kl-1 fertility factor has not been identified (at least tentatively) at the molecular level.
The heterochromatic state of the Drosophila Y creates particular difficulties for its study (Carvalho et al. 2003). Before the genome project of D. melanogaster, only one single-copy gene was known at the molecular level in the Y chromosome (Gepner and Hays 1993), and even after that the identification of additional genes on the Y required the development of appropriate computational methods (Carvalho et al. 2000, 2001, 2003), whereas thousands of new genes were immediately found in the euchromatin using standard methods (Adams et al. 2000). The main difficulty arises as a result of the limitations of whole genome shotgun sequencing in dealing with repetitive sequences: Y chromosome sequences are assembled in small fragments and usually are found among a large set of small unmapped scaffolds (typically, 2000–10,000). Furthermore, the different exons of the same gene frequently are scattered in different scaffolds because intronic repetitive DNA causes assembly failures (Carvalho et al. 2003). These fragments of genes are missed by gene annotation programs, and even when partially annotated their Y linkage cannot be inferred from the assembly itself. The bottom line is that the identification of Y-linked genes in whole genome shotgun projects requires computational methods that identify among the many unmapped scaffolds which ones are more likely to be Y linked. These selected scaffolds can then be experimentally tested for Y linkage, and the structure of the genes confirmed by RT–PCR (Carvalho et al. 2003; Krzywinski et al. 2006). One such computational method utilizes testis expressed sequence tags (ESTs). Briefly, as Y-linked genes usually have testis expression, unmapped scaffolds that are matched by ESTs from testis may be Y linked (Lahn and Page 1997; Carvalho et al. 2001). Previous searches with the first data set of ~3000 testis ESTs (Andrews et al. 2000) yielded one Y-linked gene (CCY) among five candidates (Carvalho et al. 2001). 
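The testis-EST screen described above can be illustrated with a minimal sketch. It assumes that EST-versus-scaffold alignments are already available as rows of (EST, scaffold, percent identity, alignment length) and that the set of unmapped scaffolds is known. The function name, the thresholds, and the identity and length values in the toy data are hypothetical and chosen only for illustration; only the EST/scaffold pairings for ARY and WDY come from the text. This is not the pipeline actually used in the study.

```python
from collections import defaultdict

def candidate_scaffolds(rows, unmapped, min_identity=95.0, min_length=100):
    """Return {scaffold: sorted EST ids} for unmapped scaffolds matched by testis ESTs.

    rows: iterable of (est_id, scaffold_id, percent_identity, alignment_length).
    unmapped: set of scaffold ids with no chromosomal assignment.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    hits = defaultdict(set)
    for est, scaffold, identity, length in rows:
        # Keep only good matches that fall on scaffolds of unknown location.
        if (scaffold in unmapped
                and float(identity) >= min_identity
                and int(length) >= min_length):
            hits[scaffold].add(est)
    return {scaffold: sorted(ests) for scaffold, ests in hits.items()}

# Toy input: identity/length values are invented; "EST_X", "EST_Y" and
# "mapped_2L" are hypothetical names.
rows = [
    ("BE976842", "AE002861", 99.1, 412),
    ("BF489509", "AE003005", 98.7, 350),
    ("BF489509", "AE003380", 99.0, 280),
    ("EST_X", "mapped_2L", 100.0, 500),   # hit on a mapped scaffold: ignored
    ("EST_Y", "AE002861", 80.0, 60),      # weak, short hit: ignored
]
unmapped = {"AE002861", "AE003005", "AE003380"}

candidates = candidate_scaffolds(rows, unmapped)
for scaffold, ests in sorted(candidates.items()):
    print(scaffold, ests)
```

Scaffolds returned by such a filter are only candidates; as noted above, Y linkage still has to be tested experimentally.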
Since then, the number of deposited testis ESTs has risen to ~31,000 (Stapleton et al. 2002). Using this expanded data set, we report here the identification of three Y-linked genes (among five candidates; Table 1). One of them (BF502240/AE003124) turns out to encode the missing N terminus of the previously found Y-linked gene CCY (Carvalho et al. 2001; L. B. Koerich, A. G. Clark and A. B. Carvalho, unpublished results). The remaining two are novel Y-linked genes and are detailed below.
TABLE 1
Candidates for Y-linked genes identified with testis ESTs
ARY (aldehyde reductase Y):
The unmapped scaffold AE002861 was matched by the testis EST BE976842. PCR assays using male and female DNA as templates produced a band with male DNA only. Therefore, AE002861 is part of the Y chromosome. We have employed 5′ RACE and 3′ RACE to obtain the full sequence of the orthologs of ARY in other Drosophila species (L. B. Koerich, A. G. Clark and A. B. Carvalho, unpublished results). Using the D. willistoni gene (accession BK006428) we annotated the D. melanogaster gene, which was fully contained in the improved scaffold CP000212.2 (the EST BE976842 was lacking the C terminus). The full D. melanogaster ARY gene (accession BK006427) encodes a protein of 360 amino acids. It has been partially annotated before as CG40064, but mistakenly mapped to the heterochromatin of chromosome 3L (Hoskins et al. 2007; Table 1). ARY is a potential member of the aldo-keto reductase gene family, which is involved in the conversion of glucose to fructose and in the inactivation of cytotoxic metabolites (Hyndman et al. 2003). Interestingly, at least in mammals, fructose is an energy source for spermatozoa. In addition, the spermatozoa membrane has a higher content of polyunsaturated fatty acids, which through peroxidation generate 4-HNE, a toxic product that is inactivated by aldo-keto reductases (Kobayashi et al. 2002). Another hint about the possible function of ARY is that the aldo-keto reductase AKR1B7 is expressed at very high levels in the male reproductive tract in the mouse (Baumann et al. 2007). Knockout of AKR1B7 seems to cause partial sperm impairment, although the animals have normal fertility (Tables 3 and 4 in Baumann et al. 2007). Hence, it seems that ARY is another case of a male-specific gene recruited by the Drosophila Y. Using the methods described in Carvalho et al. (2000) we found that ARY is located in the kl-2 region. Since this region is known to contain one gene essential for male fertility (the kl-2 dynein heavy chain; Carvalho et al. 2000) and because the available evidence strongly suggests that each region contains only one essential gene (Kennison 1981), ARY probably is not essential for male fertility. We have previously reported on other nonessential genes in the D. melanogaster Y, such as PPrY and PRY (Carvalho et al. 2000, 2001).
WDY (WD40 Y):
WDY was identified by a match between the unmapped scaffolds AE003005 and AE003380, and the testis EST BF489509. After we identified the WDY gene, the EST was fully sequenced (accession BT021451). This sequence still lacks the N terminus; we identified the full WDY coding sequence in D. melanogaster (accession BK006449) by comparison with the D. ananassae ortholog (accession EU362855). The WDY coding sequence spans four scaffolds (AE003109, AE003005, AE003380, and AE002858), and in each of them a separate gene was misannotated (CG40245, CG40583, CG40551, and CG41020; also CG40449). The full protein has 991 amino acids and has clear orthologs in Anopheles, Aedes, and Tribolium (accession nos. XP_563902, XP_001655383, and XP_970543, respectively). None of these orthologs has an assigned function. There are no similar proteins in the D. melanogaster genome, except for the small CG34164 (106 amino acids; 60% identity), and the only hint regarding the function of WDY is the presence of at least three WD40 domains. This domain is found in a number of eukaryotic proteins and is involved in a wide variety of functions, including adaptor/regulatory modules in signal transduction, pre-mRNA processing, and cytoskeleton assembly (Smith et al. 1999; Li and Roberts 2001). Mapping as described in Carvalho et al. (2000) showed that WDY is fully contained in the kl-1 region (both the N and the C terminus), and hence WDY may correspond to this fertility factor. RT–PCR showed that WDY is strongly expressed in testis and very weakly expressed in the soma (head plus thorax; data not shown), whereas microarray experiments showed that it (as well as ARY) is overexpressed in late spermatogenesis (after meiosis; M. D. Vibranovski, H. F. Lopes, T. L. Karr and M. Long, unpublished data). Therefore, it is likely that WDY has a male-related function. It remains to be determined which function this is and whether or not WDY is the kl-1 gene.
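Because a single transcript such as WDY can be split across several scaffolds, one simple way to flag such cases is to group alignments by EST and order the matched scaffolds by their position along the EST. The sketch below assumes alignments given as (EST, scaffold, query start, query end); the function name and all coordinates are invented for illustration, and the approach is a hypothetical simplification, not the procedure used to assemble the WDY coding sequence.

```python
from collections import defaultdict

def scaffold_order(hits):
    """Return {est_id: [scaffolds in order along the EST]} for ESTs matching >1 scaffold.

    hits: iterable of (est_id, scaffold_id, query_start, query_end), where the
    query coordinates refer to positions on the EST sequence.
    """
    by_est = defaultdict(list)
    for est, scaffold, qstart, qend in hits:
        # Use the leftmost query coordinate so reverse-strand hits sort correctly.
        by_est[est].append((min(qstart, qend), scaffold))

    order = {}
    for est, items in by_est.items():
        scaffolds = []
        for _, scaffold in sorted(items):
            if scaffold not in scaffolds:   # keep first occurrence only
                scaffolds.append(scaffold)
        if len(scaffolds) > 1:              # transcript is split across scaffolds
            order[est] = scaffolds
    return order

# Toy input: the coordinates are invented for illustration only.
hits = [
    ("BF489509", "AE003380", 420, 900),
    ("BF489509", "AE003005", 1, 419),
    ("BE976842", "AE002861", 1, 500),
]

split = scaffold_order(hits)
print(split)
```

An EST reported by such a check marks scaffolds that probably carry exons of the same gene, which is the situation in which automated annotation tends to call a separate partial gene on each scaffold.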
The identification of ARY and WDY raises the number of known single-copy protein-encoding genes of the D. melanogaster Y chromosome to 12 (Gepner and Hays 1993; Carvalho et al. 2000, 2001; Carvalho and Clark 2003). This raises the question of how many more genes remain to be found. It is worth pointing out that the total gene number is uncertain even in the euchromatic portion of the Drosophila genome (Hild et al. 2004) and that the Y chromosome has the additional difficulties of being heterochromatic (Hoskins et al. 2007; Smith et al. 2007), as well as having low shotgun coverage (approximately threefold, which causes sequence gaps; Carvalho et al. 2003). Furthermore, we certainly have not exhausted all candidates and methods (e.g., not all genes are represented in EST databases, due to low expression, etc.). Thus, a definitive answer is not possible at this moment. On the other hand, we have been searching for Y-linked genes in D. melanogaster since 2000, and the diminishing returns of our searches employing a variety of methods suggest that there are not many more to be found. The CCY gene was independently identified three times (twice with testis ESTs and once with the “low-shotgun depth” method; Carvalho et al. 2003), which again suggests that we are approaching saturation. Finally, all six fertility factors have now been identified (at least tentatively) at the molecular level. Though it is difficult to translate these arguments into a number, we would be surprised if there are substantially more than 20 protein-encoding genes in the D. melanogaster Y. For the future, comparison with the other sequenced Drosophila species (Drosophila 12 Genomes Consortium 2007; L. B. Koerich, A. G. Clark and A. B. Carvalho, unpublished data), progress being made at the Drosophila Heterochromatin Genome Project (Hoskins et al. 2007; Smith et al. 2007), and perhaps high-throughput methods for the identification of Y-linked scaffolds are likely to yield a more precise answer.
Acknowledgments
We acknowledge Roger Hoskins and J. Roman Arguello for valuable comments and the funding from Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro, Fundação Universitária José Bonifácio-UFRJ, Conselho Nacional de Desenvolvimento Científico e Tecnológico, Coordenação de Aperfeiçoamento do Pessoal de Ensino Superior, Fogarty International Center–National Institutes of Health (NIH) (grants TW005673-01A1 and TW007604-02), and NIH (grant GM64590). M.D.V. was also supported by the Pew Latin America Fellowship Program.
Notes
Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession nos. BK006427, BK006428, BK006449, and EU363855.
References
  • Adams, M. D., S. E. Celniker, R. A. Holt, C. A. Evans, J. D. Gocayne et al., 2000. The genome sequence of Drosophila melanogaster. Science 287 2185–2195. [PubMed]
  • Andrews, J., G. G. Bouffard, C. Cheadle, J. Lü, K. G. Becker et al., 2000. Gene discovery using computational and microarray analysis of transcription in the Drosophila melanogaster testis. Genome Res. 10 2030–2043. [PubMed]
  • Baumann, C., B. Davies, M. Peters, U. Kaufmann-Reiche, M. Lessl et al., 2007. AKR1B7 (mouse vas deferens protein) is dispensable for mouse development and reproductive success. Reproduction 134 97–109. [PubMed]
  • Bridges, C. B., 1916. Non-disjunction as proof of the chromosome theory of heredity. Genetics 1 1–52. [PubMed]
  • Carvalho, A. B., B. A. Dobo, M. D. Vibranovski and A. G. Clark, 2001. Identification of five new genes on the Y chromosome of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 98 13225–13230. [PubMed]
  • Carvalho, A. B., B. P. Lazzaro and A. G. Clark, 2000. Y chromosomal fertility factors kl-2 and kl-3 of Drosophila melanogaster encode dynein heavy chain polypeptides. Proc. Natl. Acad. Sci. USA 97 13239–13244. [PubMed]
  • Carvalho, A. B., M. D. Vibranovski, J. W. Carlson, S. E. Celniker, R. A. Hoskins et al., 2003. Y chromosome and other heterochromatic sequences of the Drosophila melanogaster genome: How far can we go? Genetica 117 227–237. [PubMed]
  • Carvalho, A. B., and A. G. Clark, 2003. Birth of a new gene on the Y chromosome of Drosophila melanogaster. Abstracts of the 44th Annual Drosophila Research Conference, March 2003, Chicago.
  • Drosophila 12 Genomes Consortium, 2007. Evolution of genes and genomes on the Drosophila phylogeny. Nature 450 203–218. [PubMed]
  • Gatti, M., and S. Pimpinelli, 1983. Cytological and genetic-analysis of the Y-chromosome of Drosophila melanogaster I. Organization of the fertility factors. Chromosoma 88 349–373.
  • Gepner, J., and T. S. Hays, 1993. A fertility region on the Y-chromosome of Drosophila melanogaster encodes a dynein microtubule motor. Proc. Natl. Acad. Sci. USA 90 11132–11136. [PubMed]
  • Goldstein, L. B. S., R. W. Hardy and D. L. Lindsley, 1982. Structural genes on the Y chromosome of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 79 7405–7409. [PubMed]
  • Hazelrigg, T., P. Fornili and T. C. Kaufman, 1982. A cytogenetic analysis of X-ray induced male steriles on the Y chromosome of Drosophila melanogaster. Chromosoma 87 535–559.
  • Hild, M., B. Beckmann, S. A. Haas, B. Koch, V. Solovyev et al., 2004. An integrated gene annotation and transcriptional profiling approach towards the full gene content of the Drosophila genome. Genome Biol. 5 R3.1–R3.17.
  • Hoskins, R. A., J. W. Carlson, C. Kennedy, D. Acevedo, M. Evans-Holm et al., 2007. Sequence finishing and mapping of Drosophila melanogaster heterochromatin. Science 316 1625–1628. [PubMed]
  • Hoskins, R. A., C. D. Smith, J. W. Carlson, A. B. Carvalho, A. Halpern et al., 2002. Heterochromatic sequences in a Drosophila whole-genome shotgun assembly. Genome Biol. 3 research0085.0081–0085.0016. [PubMed]
  • Hyndman, D., D. R. Bauman, V. V. Heredia and T. M. Penning, 2003. The aldo-keto reductase superfamily homepage. Chem. Biol. Interact. 143-144 621–631. [PubMed]
  • Kennison, J. A., 1981. The genetic and cytological organization of the Y-chromosome of Drosophila melanogaster. Genetics 98 529–548. [PubMed]
  • Kobayashi, T., T. Kaneko, Y. Iuchi, S. Matsuki, M. Takahashi et al., 2002. Localization and physiological implication of aldose reductase and sorbitol dehydrogenase in reproductive tracts and spermatozoa of male rats. J. Androl. 23 674–683. [PubMed]
  • Krzywinski, J., M. A. Chrystal and N. J. Besansky, 2006. Gene finding on the Y: fruitful strategy in Drosophila does not deliver in Anopheles. Genetica 126 369–375. [PubMed]
  • Lahn, B. T., and D. C. Page, 1997. Functional coherence of the human Y chromosome. Science 278 675–680. [PubMed]
  • Li, D., and R. Roberts, 2001. WD-repeat proteins: structure characteristics, biological function, and their involvement in human diseases. Cell. Mol. Life Sci. 58 2085–2097. [PubMed]
  • Smit, A. F., R. Hubley and P. Green, 2004. RepeatMasker Open 3.0. http://www.repeatmasker.org.
  • Smith, C. D., S. Shu, C. J. Mungall and G. H. Karpen, 2007. The release 5.1 annotation of Drosophila melanogaster heterochromatin. Science 316 1586–1591. [PubMed]
  • Smith, T. F., C. Gaitatzes, K. Saxena and E. J. Neer, 1999. The WD repeat: a common architecture for diverse functions. Trends Biochem. Sci. 24 181–185. [PubMed]
  • Stapleton, M., J. Carlson, P. Brokstein, C. Yu, M. Champe et al., 2002. A Drosophila full-length cDNA resource. Genome Biol. 3 research0080.0081–0080.0088. [PubMed]
  • Vibranovski, M. D., 2002. Identificação de genes do cromossomo Y de Drosophila melanogaster. M.Sc. Thesis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil.
