Copyright © 2006 by The National Academy of Sciences of the USA
Biochemistry
A high-resolution map of transcription in the yeast genome
*Stanford Genome Technology Center and Department of Biochemistry, Stanford University, Palo Alto, CA 94304; ‡European Bioinformatics Institute, European Molecular Biology Laboratory, Cambridge CB10 1SD, United Kingdom; and §European Molecular Biology Laboratory, 69117 Heidelberg, Germany
¶To whom correspondence may be addressed. E-mail: dbowe/at/stanford.edu or larsms/at/embl.de
Contributed by Ronald W. Davis, February 10, 2006.
†L.D. and W.H. contributed equally to this work.
Author contributions: W.H., R.W.D., and L.M.S. designed research; L.D., W.H., M.G., and L.M.S. performed research; C.J.P. and T.J. contributed new reagents/analytic tools; L.D., W.H., M.G., J.T., L.B., and L.M.S. analyzed data; and L.D., W.H., and L.M.S. wrote the paper.
Freely available online through the PNAS open access option.

Abstract
There is abundant transcription from eukaryotic genomes unaccounted for by protein coding genes. A high-resolution genome-wide survey of transcription in a well annotated genome will help relate transcriptional complexity to function. By quantifying RNA expression on both strands of the complete genome of Saccharomyces cerevisiae using a high-density oligonucleotide tiling array, this study identifies the boundary, structure, and level of coding and noncoding transcripts. A total of 85% of the genome is expressed in rich media. Apart from expected transcripts, we found operon-like transcripts, transcripts from neighboring genes not separated by intergenic regions, and genes with complex transcriptional architecture where different parts of the same gene are expressed at different levels. We mapped the positions of 3′ and 5′ UTRs of coding genes and identified hundreds of RNA transcripts distinct from annotated genes.
These nonannotated transcripts, on average, have lower sequence conservation and lower rates of deletion phenotype than protein coding genes. Many other transcripts overlap known genes in antisense orientation, and for these pairs global correlations were discovered: UTR lengths correlated with gene function, localization, and requirements for regulation; antisense transcripts overlapped 3′ UTRs more than 5′ UTRs; UTRs with overlapping antisense tended to be longer; and the presence of antisense associated with gene function. These findings may suggest a regulatory role of antisense transcription in S. cerevisiae. Moreover, the data show that even this well studied genome has transcriptional complexity far beyond current annotation.

Keywords: tiling array, transcriptome survey, gene architecture, segmentation, antisense regulation

Proteins constitute most structural and functional components of cells. The assumption has been that protein-encoding genes are also the main controllers of cellular processes. Recent evidence challenges this assumption, suggesting a widespread involvement of noncoding RNA in regulation, including through the activity of untranslated regions of mRNAs (1), antisense transcripts (2, 3), and isolated noncoding RNAs such as microRNA that control transcript levels or their translation (4). High-resolution transcriptome analysis in higher eukaryotes using tiling arrays has improved ORF annotations and exon-intron predictions and discovered many new transcripts of currently unknown function (5–7). However, these studies have encountered challenges, due to noise, limited resolution, lack of strand-specific signal, and drawbacks in the analysis methods (8). Sequencing of cloned cDNAs has also revealed a high level of transcriptional complexity, including the presence of many new transcripts, alternative promoter usage, splicing, and polyadenylation, as well as the presence of many sense–antisense transcript pairs (3, 9).
However, because of the cost and labor of large-scale sequencing, this approach has been limited. Therefore, there is a need to develop high-throughput, precise, and high-resolution technology to map the full transcriptional activity. Yeast has a simple and relatively small eukaryotic genome that provides opportunities to rapidly characterize novel findings. We developed an oligonucleotide array for Saccharomyces cerevisiae that contains 6.5 million probes and interrogates both strands of the full genomic sequence with 25-mer probes tiled at an average of 8-nucleotide intervals on each strand (17-nucleotide overlap) and a 4-nucleotide offset of the tile between strands. This design enables a 4-nt resolution for hybridization of double-stranded targets and an 8-nt resolution for strand-specific targets. We profiled transcription during exponential growth in rich media, the standard laboratory growth condition, to generate a comprehensive map of transcription.

Results and Discussion
Microarray Experiments and Analysis. We hybridized first-strand cDNA synthesized using random primers from polyadenylated [poly(A)] and total RNA. To calibrate the sequence-specific probe effect (10–12), we background-corrected and adjusted (13) the signal of each probe by sequence-specific parameters, estimated from a calibration set of genomic DNA hybridizations (Fig. 5, which is published as supporting information on the PNAS web site). This method allowed us to quantitatively compare the signal from probe to probe on the array.

The Transcriptome. To address the question of how much of the genome is transcribed, we analyzed the coding regions of 5,654 ORFs that were annotated as verified or uncharacterized genes in the Saccharomyces Genome Database (SGD, www.yeastgenome.org) and represented by unique probes on the array. Significant expression above background was detected for 5,104 ORFs (90%) (Binomial test, false discovery rate = 0.001; Fig.
6, which is published as supporting information on the PNAS web site). As expected, genes that were not detected have functions not required in this condition such as meiosis, sporulation, mating, sugar transport, and vitamin metabolism [hypergeometric test for gene ontology (GO) annotation enrichment, unadjusted P ≤ 3 × 10⁻⁹]. In addition, analyzing 11,412,997 bp of unique genomic sequence, we detected expression above background on either strand for 85%. Comparing this to existing annotation, which covers ≈75% of the genome, shows that 16% of the transcribed base pairs had not been annotated before.

To obtain an unbiased map of the position, abundance, and architecture of transcripts, the hybridization signals were examined along their chromosomal position for each strand (Fig. 1
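The GO enrichment among undetected genes is reported with a hypergeometric test. As a minimal sketch of that kind of test (not the authors' code), the upper-tail probability can be computed with the standard library alone. The population and sample sizes in the usage line come from the text (5,654 ORFs, of which 5,654 − 5,104 = 550 were not detected); the GO-term counts are hypothetical placeholders.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           sample: int, hits: int) -> float:
    """Upper-tail probability P(X >= hits) for a hypergeometric draw.

    population: all genes considered
    annotated:  genes in the population carrying the GO term
    sample:     genes in the subset of interest (e.g. undetected ORFs)
    hits:       genes in the subset that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # math.comb returns 0 when k > n, so impossible terms drop out.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical GO term: 100 annotated genes, 40 of them among the undetected ORFs.
p_value = hypergeom_enrichment_p(population=5654, annotated=100, sample=550, hits=40)
```

The P value is unadjusted, as in the text; a multiple-testing correction across GO terms would be a separate step.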
The automated segmentation algorithm provides an unbiased global analysis, but the data complexity invites additional manual curation. Profiles for all genomic regions are provided in a database that is searchable by gene symbol or chromosomal coordinate (www.ebi.ac.uk/huber-srv/queryGene). We encourage readers to explore the database along with the examples discussed below. Examples from this map of transcription are shown in Fig. 2
UTR Boundaries. To map UTRs, we compared ORF boundaries with segment boundaries. We automatically determined UTR lengths for verified or uncharacterized nuclear-encoded genes whose annotated coding sequence was fully contained within a single segment. A total of 2,223 segments passed a confidence filter that required a sharp decrease in signal on both sides of the segment. UTR coordinates are given in Tables 3 and 4. We proceeded with analysis of the 2,044 poly(A)-determined UTRs because the poly(A) hybridization data were cleaner and yielded most of the UTR determinations (Fig. 7, which is published as supporting information on the PNAS web site). For many remaining genes that did not pass the confidence filter, the UTRs can be mapped by closer inspection. We found that 3′ UTRs were significantly longer than the 5′ UTRs, with a median of 91 vs. 68 nt (Fig. 3
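The UTR mapping rule described here (compare ORF boundaries with the boundaries of the single segment that fully contains the ORF) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the `Interval` type, the function name, and the 1-based inclusive coordinates are hypothetical, and the confidence filter on signal drop is not modeled.

```python
from typing import NamedTuple, Optional, Tuple


class Interval(NamedTuple):
    start: int   # 1-based, inclusive, left-most genomic coordinate
    end: int     # 1-based, inclusive, right-most genomic coordinate
    strand: str  # "+" or "-"


def utr_lengths(orf: Interval, segment: Interval) -> Optional[Tuple[int, int]]:
    """Return (5' UTR length, 3' UTR length) in nucleotides.

    Returns None when the ORF is not fully contained in a segment on the
    same strand, mirroring the containment requirement in the text.
    """
    if orf.strand != segment.strand:
        return None
    if not (segment.start <= orf.start and orf.end <= segment.end):
        return None
    left = orf.start - segment.start   # transcribed bases left of the ORF
    right = segment.end - orf.end      # transcribed bases right of the ORF
    # On the minus strand the 5' end is the right-hand side of the interval.
    return (left, right) if orf.strand == "+" else (right, left)


# Hypothetical coordinates chosen to reproduce the reported median lengths.
example = utr_lengths(Interval(1000, 2000, "+"), Interval(932, 2091, "+"))
```

With these made-up coordinates, `example` is `(68, 91)`: a 68-nt 5′ UTR and a 91-nt 3′ UTR.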
We compared UTR lengths with transcript levels and coding sequence (ORF) lengths. Although transcript level was generally lower for genes with long coding sequences, neither transcript levels nor ORF lengths were significantly associated with UTR lengths. We also compared length distributions of UTRs for different functional and localization categories (GO annotations) and detected significant correlations (Fig. 3

Complex Transcriptional Architectures. Many expressed segments flanked other expressed segments with different signal levels, thus making up complex transcriptional architectures. In many cases, different parts of the same gene are expressed at different levels: 921 ORFs from the poly(A) RNA sample were divided into at least two expressed segments, one covering >50% of the feature and others <50%. Such complex architectures could be due to alternative transcription initiation, termination, or alternative splicing, as has been described in mouse (9) and for several human genes (22). In yeast, it has been suggested that up to 20% of mRNAs have alternative 3′ ends (23). Complex hybridization patterns on the array could also be caused by RNA decay or variation introduced by reverse transcription, because the array captures the sum of cDNA molecules present at the time of hybridization. Explaining our observations by such mechanisms will require a case-by-case analysis. Here, we discuss a few cases. For CBP1 and RNA14, our observed architecture matches previous results describing alternative 3′ ends in response to carbon source regulation (24). For GCN4, lower hybridization signal was observed at the 3′ end (Fig. 2

Neighboring Transcription. Additional unusual architecture was found for adjacent ORFs not separated by an unexpressed region. Such architectures can result from more than one ORF being encoded from a single transcript, like the upstream ORFs in GCN4, or from distinct transcripts not separated by untranscribed intergenic regions.
We found the ORFs of GIM3 and YCK2 within one segment resembling a bicistronic transcript (Fig. 2

Unannotated Transcripts. Many segments with signal above background did not overlap existing annotation. They fall into two classes: nonannotated isolated segments if there was no prior annotation on either strand (Fig. 2
Nonannotated Isolated Transcripts. We verified the array identification of 126 nonannotated isolated transcripts by RT-PCR. All were expressed in both total and poly(A) RNA reverse transcribed by using random and oligo(dT) primers, respectively. For 10 of them, a quantitative real-time PCR analysis showed their levels to be similar to expressed ORFs in both sample types. The 1.7-kb transcript between ORC2 and TRM7 (Fig. 2

We generated knockouts for 47 nonannotated isolated segments and tested for growth defects in rich media conditions (Table 7, which is published as supporting information on the PNAS web site). A growth defect was identified for two knockouts: one on chromosome 6, positions 54813–55221, the other on chromosome 7, positions 622039–622295. On chromosome 6, the deleted segment contained annotated transcription factor binding sites upstream of ACT1, an essential gene, which likely accounts for the observed inviability. On chromosome 7, the deletion does not overlap any annotation, and strains with deletions of the neighboring ORFs (YGR066C, YGR067C) did not have a growth defect. This segment does not appear to be evolutionarily conserved or to contain a long ORF. The proportion of growth defects found within the 47 knockouts is much lower than the ≈40% found for knockouts of protein-coding genes (33).

Nonannotated Antisense Transcription. We identified antisense transcripts opposite to 1,555 genes, of which 402 were in the filtered set from both poly(A) and total RNA samples (Tables 3 and 4). The antisense transcripts are not caused by read-through from ORFs on the opposite strand, but appear as independent transcription units. For example, antisense transcription was found opposite SPO22, a meiosis-specific protein induced early in meiosis (Fig. 2

Many genes with antisense transcripts had products that localize to the cell cortex and cell wall, and that function in the meiotic cell cycle and in transcriptional regulation (Table 1).
Some of these categories included genes not active during growth in rich media, like meiosis. Others included genes that are active during growth in rich media, but which may need posttranscriptional regulation. Further correlations were found between UTRs and their opposite antisense segments: More antisense transcripts overlapped the 3′ UTRs than the 5′ UTRs; also, UTRs that had overlapping antisense transcripts were longer than UTRs that did not (Table 2).
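The comparison of antisense overlap with 5′ versus 3′ UTRs can be illustrated with a small interval calculation. This is a sketch under assumptions, not the analysis behind Table 2: the `Transcript` type, the function names, and the coordinates are hypothetical, and positions are 1-based inclusive genomic coordinates.

```python
from typing import NamedTuple, Tuple


class Transcript(NamedTuple):
    tx_start: int   # left-most transcribed base (segment boundary)
    tx_end: int     # right-most transcribed base
    orf_start: int  # left-most coding base
    orf_end: int    # right-most coding base
    strand: str     # "+" or "-"


def overlap_length(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of shared bases between two inclusive intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def antisense_utr_overlap(gene: Transcript, anti_start: int,
                          anti_end: int) -> Tuple[int, int]:
    """Return (bases of 5' UTR, bases of 3' UTR) covered by an antisense segment."""
    left = (gene.tx_start, gene.orf_start - 1)   # UTR left of the ORF
    right = (gene.orf_end + 1, gene.tx_end)      # UTR right of the ORF
    # For a minus-strand gene the 5' UTR lies to the right of the ORF.
    five, three = (left, right) if gene.strand == "+" else (right, left)
    return (
        overlap_length(five[0], five[1], anti_start, anti_end),
        overlap_length(three[0], three[1], anti_start, anti_end),
    )


# Hypothetical plus-strand gene with an antisense segment covering its 3' end.
gene = Transcript(tx_start=100, tx_end=1000, orf_start=200, orf_end=900, strand="+")
five_prime_bases, three_prime_bases = antisense_utr_overlap(gene, 850, 1100)
```

Tallying, over all sense/antisense pairs, how often each of the two values is nonzero gives the kind of 3′ versus 5′ comparison summarized in the text.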
The generation and significance of the many nonannotated transcripts are unclear. Regulation of gene expression by antisense transcripts was reported in prokaryotes (34) and higher eukaryotes (35). Sense/antisense transcript pairs were suggested to be frequent in mammalian genomes and to provide regulatory function (3). In S. cerevisiae, major components of the RNA interference machinery have not been identified (36); however, in other species, alternative mechanisms for regulation by noncoding RNAs exist (2, 37). In Drosophila, it has been shown that microRNAs predominantly target 3′ UTRs (38) and these UTRs also tend to be longer than UTRs of genes not targeted (39). We observed similar correlations for antisense transcripts in S. cerevisiae, which together with their association with particular functional categories may suggest a possible regulatory role. There are experiments supporting this hypothesis: artificial antisense transcripts in S. cerevisiae had effects on expression of several genes (40–42), and overexpression of random genomic fragments antisense to ORFs has led in several cases to growth inhibition (43). In our data set, naturally occurring antisense transcripts were found for ≈20 of these cases. Most ncRNAs previously reported as novel have since been annotated in SGD, and hence do not overlap with our expressed, nonannotated segments (44, 45). We compared our data to transcriptome surveys, carried out by using serial analysis of gene expression (SAGE) (46) and ESTs (19). Thirteen percent of the nonannotated isolated and 42% of the nonannotated antisense transcripts were represented by SAGE tags. For the EST data, these numbers were 1% and 6%, respectively. Analysis of SAGE tags on microarrays described a number of novel transcripts in a mutant strain defective in the RNA degradation pathway (47); however, the eight primary examples were not found expressed in our study of wild-type yeast.
This study reveals considerable transcriptional activity in yeast that is currently not systematically annotated. Our transcription map will be useful for annotating the genome. Furthermore, the position of transcription initiation and termination sites will help in defining the promoters and transcriptional regulators of genes. Although our results suggest that not many new, long protein-coding regions will be discovered in yeast, the extensive noncoding transcription detected in regions with no prior annotation and antisense to annotated transcripts invites further investigation. Therefore, even for a genome that has been studied intensively since it was sequenced 10 years ago (48), a glimpse into the complexity of its transcriptional architecture makes this genome appear like novel territory.

Materials and Methods
Array Design and Sample Hybridization. The array was designed in collaboration with Affymetrix (Santa Clara, CA) (PN 520055). An S288c background strain S96 (MATa gal2 lys5) was grown in rich yeast-extract/peptone/dextrose media to mid-exponential phase. Total RNA was isolated by hot phenol extraction. Poly(A) RNA was enriched by two rounds of the Oligotex mRNA kit (Qiagen). First-strand cDNA was synthesized by using random primers. Three replicate hybridizations (biological) of poly(A), two of total RNA, and three of genomic DNA were performed.

Probe Annotation. Probe sequences were aligned to the genome sequence of S. cerevisiae strain S288c (SGD of August 7, 2005). Perfect match probes were further analyzed.

Normalization. RNA hybridization intensities were adjusted by
Segmentation. Segments of approximately constant hybridization signal were defined by using a dynamic programming algorithm that, for each chromosome strand separately, minimizes the cost function
sj is the arithmetic mean of the signal values of array j in segment s, S is the number of segments, and t1, …, tS are the segment boundaries (15). For each chromosome, S was chosen such that the average segment length was 1,500 nt. S, the only parameter of the segmentation algorithm, controls the sensitivity–specificity tradeoff and was chosen to yield high sensitivity.
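The cost function itself is not reproduced in the text above. A common choice that is consistent with the stated definitions (per-array segment means, S segments, boundaries t1, …, tS) is the residual sum of squares of a piecewise-constant fit, and the sketch below assumes that form. It is an illustrative dynamic program with O(S·n²) cost evaluations, not the authors' implementation; the function name and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def segment_least_squares(signals: Sequence[Sequence[float]],
                          n_segments: int) -> Tuple[List[int], float]:
    """Split positions 0..n-1 into n_segments contiguous segments.

    signals[j][i] is the signal of array j at probe position i. The cost of
    a segment is the squared deviation from the per-array segment mean,
    summed over arrays (an assumed least-squares form). Returns the segment
    start indices and the minimal total cost.
    """
    n = len(signals[0])
    # Prefix sums allow each segment cost to be evaluated in O(arrays).
    prefix = [[0.0] * (n + 1) for _ in signals]
    prefix_sq = [0.0] * (n + 1)
    for j, row in enumerate(signals):
        for i, x in enumerate(row):
            prefix[j][i + 1] = prefix[j][i] + x
    for i in range(n):
        prefix_sq[i + 1] = prefix_sq[i] + sum(row[i] ** 2 for row in signals)

    def cost(i: int, k: int) -> float:
        """Residual sum of squares for the half-open range [i, k)."""
        sq = prefix_sq[k] - prefix_sq[i]
        return sq - sum((p[k] - p[i]) ** 2 for p in prefix) / (k - i)

    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for s in range(1, n_segments + 1):
        for k in range(s, n + 1):          # first k positions, s segments
            for i in range(s - 1, k):      # last segment is [i, k)
                c = best[s - 1][i] + cost(i, k)
                if c < best[s][k]:
                    best[s][k] = c
                    back[s][k] = i

    # Trace back the start index of each segment, last segment first.
    starts: List[int] = []
    k = n
    for s in range(n_segments, 0, -1):
        k = back[s][k]
        starts.append(k)
    starts.reverse()
    return starts, best[n_segments][n]


# Toy example: two replicate "arrays" with three signal levels.
toy = [[0, 0, 0, 5, 5, 5, 1, 1], [0, 0, 0, 5, 5, 5, 1, 1]]
starts, total_cost = segment_least_squares(toy, n_segments=3)
```

On the toy data the recovered segment starts are `[0, 3, 6]`. Following the text, the number of segments for a real chromosome strand would be set from its length so that the average segment is about 1,500 nt long.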
Acknowledgments
We thank Mike Mittmann for the array design, Iain Russell and Victor Sementchenko for experimental advice, Raquel Kuehn and Michelle Nguyen for technical assistance, Roy Parker, Rafael Irizarry, Elisa Izaurralde, Steve Cohen, and Nick Goldman for helpful comments on the manuscript, and contributors to the Bioconductor (www.bioconductor.org) and R (www.R-project.org) projects for their software. This work was funded by the National Institutes of Health (R.W.D. and L.M.S.) and the Deutsche Forschungsgemeinschaft (L.M.S.).

Footnotes
Conflict of interest statement: No conflicts declared.
Data deposition: The array data have been deposited in ArrayExpress database (accession no. E-TABM-14).

References
1. Wilkie G. S., Dickson K. S., Gray N. K. Trends Biochem. Sci. 2003;28:182–188.
2. Storz G., Altuvia S., Wassarman K. M. Annu. Rev. Biochem. 2005;74:199–217.
3. Katayama S., Tomaru Y., Kasukawa T., Waki K., Nakanishi M., Nakamura M., Nishida H., Yap C. C., Suzuki M., Kawai J., et al. Science. 2005;309:1564–1566.
4. Mattick J. S. Nat. Rev. Genet. 2004;5:316–323.
5. Cheng J., Kapranov P., Drenkow J., Dike S., Brubaker S., Patel S., Long J., Stern D., Tammana H., Helt G., et al. Science. 2005;308:1149–1154.
6. Bertone P., Stolc V., Royce T. E., Rozowsky J. S., Urban A. E., Zhu X., Rinn J. L., Tongprasit W., Samanta M., Weissman S., et al. Science. 2004;306:2242–2246.
7. Yamada K., Lim J., Dale J. M., Chen H., Shinn P., Palm C. J., Southwick A. M., Wu H. C., Kim C., Nguyen M., et al. Science. 2003;302:842–846.
8. Royce T. E., Rozowsky J. S., Bertone P., Samanta M., Stolc V., Weissman S., Snyder M., Gerstein M. Trends Genet. 2005;21:466–475.
9. Carninci P., Kasukawa T., Katayama S., Gough J., Frith M. C., Maeda N., Oyama R., Ravasi T., Lenhard B., Wells C., et al. Science. 2005;309:1559–1563.
10. Hekstra D., Taussig A. R., Magnasco M., Naef F. Nucleic Acids Res. 2003;31:1962–1968.
11. Naef F., Magnasco M. O. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2003;68:011906.
12. Wu Z., Irizarry R. A. J. Comput. Biol. 2005;12:882–893.
13. Huber W., von Heydebreck A., Sultmann H., Poustka A., Vingron M. Bioinformatics. 2002;18(Suppl. 1):S96–S104.
14. Gentleman R., Carey V., Huber W., Irizarry R., Dudoit S. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Heidelberg: Springer; 2005.
15. Picard F., Robin S., Lavielle M., Vaisse C., Daudin J. J. BMC Bioinformatics. 2005;6:27.
16. Kuersten S., Goodwin E. B. Nat. Rev. Genet. 2003;4:626–637.
17. Mignone F., Gissi C., Liuni S., Pesole G. Genome Biol. 2002;3:REVIEWS0004.
18. Hurowitz E. H., Brown P. O. Genome Biol. 2003;5:R2.
19. Graber J. H., Cantor C. R., Mohr S. C., Smith T. F. Nucleic Acids Res. 1999;27:888–894.
20. Marc P., Margeot A., Devaux F., Blugeon C., Corral-Debrinski M., Jacq C. EMBO Rep. 2002;3:159–164.
21. Gerber A. P., Herschlag D., Brown P. O. PLoS Biol. 2004;2:E79.
22. Kapranov P., Drenkow J., Cheng J., Long J., Helt G., Dike S., Gingeras T. R. Genome Res. 2005;15:987–997.
23. Graber J. H., McAllister G. D., Smith T. F. Nucleic Acids Res. 2002;30:1851–1858.
24. Sparks K. A., Dieckmann C. L. Nucleic Acids Res. 1998;26:4676–4687.
25. Hinnebusch A. G. Annu. Rev. Microbiol. 2005;59:407–450.
26. Siepel A., Bejerano G., Pedersen J. S., Hinrichs A. S., Hou M., Rosenbloom K., Clawson H., Spieth J., Hillier L. W., Richards S., et al. Genome Res. 2005;15:1034–1050.
27. Blumenthal T., Gleason K. S. Nat. Rev. Genet. 2003;4:112–120.
28. He F., Li X., Spatrick P., Casillo R., Dong S., Jacobson A. Mol. Cell. 2003;12:1439–1452.
29. Martens J. A., Laprade L., Winston F. Nature. 2004;429:571–574.
30. Kampa D., Cheng J., Kapranov P., Yamanaka M., Brubaker S., Cawley S., Drenkow J., Piccolboni A., Bekiranov S., Helt G., et al. Genome Res. 2004;14:331–342.
31. Johnson J. M., Edwards S., Shoemaker D., Schadt E. E. Trends Genet. 2005;21:93–102.
32. Kellis M., Patterson N., Endrizzi M., Birren B., Lander E. S. Nature. 2003;423:241–254.
33. Winzeler E. A., Shoemaker D. D., Astromoff A., Liang H., Anderson K., Andre B., Bangham R., Benito R., Boeke J. D., Bussey H., et al. Science. 1999;285:901–906.
34. Wagner E. G., Simons R. W. Annu. Rev. Microbiol. 1994;48:713–742.
35. Kumar M., Carmichael G. G. Microbiol. Mol. Biol. Rev. 1998;62:1415–1434.
36. Aravind L., Watanabe H., Lipman D. J., Koonin E. V. Proc. Natl. Acad. Sci. USA. 2000;97:11319–11324.
37. Vanhee-Brossollet C., Vaquero C. Gene. 1998;211:1–9.
38. Lai E. C. Nat. Genet. 2002;30:363–364.
39. Stark A., Brennecke J., Bushati N., Russell R. B., Cohen S. M. Cell. 2005;123:1133–1146.
40. Xiao W., Rank G. H. Curr. Genet. 1988;13:283–289.
41. Peterson J. A., Myers A. M. Nucleic Acids Res. 1993;21:5500–5508.
42. Park H., Shin M., Woo I. J. Biosci. Bioeng. 2001;92:481–484.
43. Boyer J., Badis G., Fairhead C., Talla E., Hantraye F., Fabre E., Fischer G., Hennequin C., Koszul R., Lafontaine I., et al. Genome Biol. 2004;5:R72.
44. Olivas W. M., Muhlrad D., Parker R. Nucleic Acids Res. 1997;25:4619–4625.
45. McCutcheon J. P., Eddy S. R. Nucleic Acids Res. 2003;31:4119–4128.
46. Velculescu V. E., Zhang L., Zhou W., Vogelstein J., Basrai M. A., Bassett D. E., Jr., Hieter P., Vogelstein B., Kinzler K. W. Cell. 1997;88:243–251.
47. Wyers F., Rougemaille M., Badis G., Rousselle J. C., Dufour M. E., Boulay J., Regnault B., Devaux F., Namane A., Seraphin B., et al. Cell. 2005;121:725–737.
48. Goffeau A., Barrell B. G., Bussey H., Davis R. W., Dujon B., Feldmann H., Galibert F., Hoheisel J. D., Jacq C., Johnston M., et al. Science. 1996;274:563–567.
49. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing; 2005.