Diverse RNA-Binding Proteins Interact with Functionally Related Sets of RNAs, Suggesting an Extensive Regulatory System

1 Department of Biochemistry, Stanford University School of Medicine, Stanford, California, United States of America
2 Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California, United States of America
3 Department of Genetics, Stanford University School of Medicine, Stanford, California, United States of America
4 Institute of Pharmaceutical Sciences, Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland

* To whom correspondence should be addressed. E-mail: pbrown/at/pmgm2.stanford.edu (POB); E-mail: herschla/at/stanford.edu (DH); E-mail: andre.gerber/at/pharma.ethz.ch (AG)

Received April 4, 2008; Accepted September 11, 2008.

Copyright: © 2008 Hogan et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

RNA-binding proteins (RBPs) have roles in the regulation of many post-transcriptional steps in gene expression, but relatively few RBPs have been systematically studied. We searched for the RNA targets of 40 proteins in the yeast Saccharomyces cerevisiae: a selective sample of the approximately 600 annotated and predicted RBPs, as well as several proteins not annotated as RBPs. At least 33 of these 40 proteins, including three of the four proteins that were not previously known or predicted to be RBPs, were reproducibly associated with specific sets of a few to several hundred RNAs. Remarkably, many of the RBPs we studied bound mRNAs whose protein products share identifiable functional or cytotopic features. We identified specific sequences or predicted structures significantly enriched in target mRNAs of 16 RBPs.
These potential RNA-recognition elements were diverse in sequence, structure, and location: some were found predominantly in 3′-untranslated regions, others in 5′-untranslated regions, some in coding sequences, and many in two or more of these features. Although this study examined only a small fraction of the universe of yeast RBPs, 70% of the mRNA transcriptome had significant associations with at least one of these RBPs, and on average, each distinct yeast mRNA interacted with three of the RBPs, suggesting the potential for a rich, multidimensional network of regulation. These results strongly suggest that combinatorial binding of RBPs to specific recognition elements in mRNAs is a pervasive mechanism for multidimensional regulation of their post-transcriptional fate.

Author Summary

Regulation of gene transcription has been extensively studied, but much less is known about how the fates of the resulting mRNA transcripts are regulated. We were intrigued by the fact that while most eukaryotic genomes encode hundreds of RNA-binding proteins (RBPs), the targets and regulatory roles of only a small fraction of these proteins have been characterized. In this study, we systematically identified the RNAs associated with a select sample of 40 of the approximately 600 predicted RBPs in the budding yeast, Saccharomyces cerevisiae. We found that most of these RBPs bound specific sets of mRNAs whose protein products share physiological themes or similar locations within the cell. For 16 of the 40 RBPs, we identified sequence motifs significantly enriched in their RNA targets that presumably mediate recognition of the target by the RBP. The intricate, overlapping patterns of mRNAs associated with RBPs suggest an extensive combinatorial system for post-transcriptional regulation, involving dozens or even hundreds of RBPs.
The organization and molecular mechanisms involved in this regulatory system, including how RBP–mRNA interactions are integrated with signal transduction systems and how they affect the fates of their RNA targets, provide abundant opportunities for investigation and discovery.

Introduction

Much of the regulation of eukaryotic gene expression programs is still unaccounted for. Although these programs are subject to regulation at many steps, most investigation has focused on regulation of transcription. There are clues, however, that a significant portion of undiscovered regulation might be post-transcriptional, acting to regulate mRNA processing, localization, translation, and decay [1–5]. For example, systematic phylogenetic comparisons among yeast and mammalian genome sequences have revealed that untranslated regions of many mRNAs are under purifying selection and thus presumably carry information important for fitness [6–8].

Biological regulation can be achieved by controlling any of a large number of steps in the lives of RNA molecules. Alternative splicing of transcripts can enable a single gene to encode numerous protein products, greatly expanding its molecular complexity [9]. Even in organisms with few introns, such as Saccharomyces cerevisiae, splicing is subject to regulation [10,11]. Notable examples of regulated RNA localization include mRNA export from the nucleus to the cytoplasm, partitioning of mRNAs to the rough endoplasmic reticulum (ER) membrane for cotranslational export, and the precise subcellular localization of thousands of specific mRNAs [12]. In a recent survey of mRNA localization in developing Drosophila embryos, more than 70% of the roughly 3,000 mRNAs examined showed distinct patterns of subcellular localization [13].

Widespread regulation of translation rates is evident in several observations.
In yeast, despite extensive regulation of transcription and mRNA decay, only about 70% of the observed variance in protein abundance is accounted for by variation in mRNA abundance [14,15]. When cells are moved from rich media to minimal media, the abundance of hundreds of proteins changes, but changes in mRNA abundance parallel changes in protein abundance for only about half of the cognate proteins [16,17].

The abundance of each RNA is determined jointly by regulated transcription and regulated degradation. Widespread, transcript-specific regulation of mRNA decay is evident from the closely matched decay rates of mRNAs encoding functionally related proteins [18–21], particularly evident in S. cerevisiae in sets of proteins that form stoichiometric complexes [19].

Increasing evidence points to extensive involvement of specific RNA-binding proteins (RBPs) in regulation of these post-transcriptional events [1–5]. Pioneering studies focusing on tens of predominantly nuclear mRNA RBPs (so-called heterogeneous ribonucleoprotein [hnRNP] proteins) revealed that these proteins recognize specific features in mRNAs, bind at overlapping, but distinct, times during RNA processing, and differentially associate with subsets of nascent transcripts [22]. Steps in RNA processing in the nucleus are functionally and physically coupled, providing an opportunity for coordinated control [23]. Investigations of regulation acting on RNA have usually focused on a few model RNAs, leaving unanswered the extent to which mRNAs are coordinately and differentially regulated, and this regulatory landscape is still largely unexplored.

Recent studies have systematically identified the suite of mRNAs associated with some individual RBPs. Several RBPs implicated in RNA processing and nuclear export in S. cerevisiae were found to associate with distinct sets of hundreds of functionally related mRNAs [24,25]. Five members of the Puf family of RBPs in S.
cerevisiae were each found to associate with distinct, overlapping sets of 40–250 mRNAs [26]. The specific sets of mRNAs associated with each Puf protein were significantly enriched for mRNAs encoding functionally and cytotopically related proteins. For instance, most of the approximately 220 mRNAs associated with Puf3 are transcribed from nuclear genes and encode proteins localized to the mitochondrion (p < 10^−100). Puf3, Puf4, and Puf5 each recognize specific sequences in the 3′-untranslated regions (UTRs) of their targets.

These results and others, from studies of a few selected RBPs, may be just a glimpse of a much larger and richer post-transcriptional regulatory network, involving dozens to hundreds of RBPs and a cognate suite of recognition elements in their RNA targets (e.g., [22,24–40]). But does such a multidimensional post-transcriptional regulatory network exist? To test this hypothesis and to extend and deepen our understanding of RBP–RNA interactions, we systematically searched for the RNA targets of a select sample of 40 out of the more than 500 known and predicted RBPs in S. cerevisiae.

Results

Systematic Identification of RNAs Associated with a Select Sample of RNA-Binding Proteins

We first developed a list of candidate RBPs based on annotations in the Saccharomyces Genome Database (SGD) (http://www.yeastgenome.org), the Yeast Protein Database [41], and the Munich Information Center for Protein Sequences database [42] and on literature searches. From the assembled list of 561 genes (Table S1), we chose a set of 36 with diverse RNA-binding domains and diverse functional annotations (Table S2 and Text S1). Because many known RBPs lack recognizable RNA-binding domains, we also included two metabolic enzymes whose homologs in other species are known to associate with RNA, and two proteins that were not, a priori, expected to bind RNA, but which we suspected might have post-transcriptional regulatory functions (Table S2).
To identify RNAs associated with each putative RBP, C-terminal tandem affinity purification (TAP)-tagged proteins, expressed under control of their native promoters, were affinity purified from whole-cell extracts of cultures grown to mid-log phase in rich medium [14,26,43]. Extracts were incubated with immunoglobulin G (IgG) agarose beads, washed, and ribonucleoprotein complexes were eluted by tobacco etch virus (TEV) protease treatment (Text S2). We performed two to four independent isolations with each tagged strain. As controls, we performed 13 immunoaffinity purifications (IPs) of untagged strains to identify and exclude potential false-positive RNA targets. We purified total RNA from the whole-cell extracts and TEV-purified fractions, reverse transcribed with an amino-allyl-dUTP/dNTP mix, coupled the purified cDNA to Cy3 and Cy5 dyes, respectively, mixed the two differentially labeled cDNA pools, and then hybridized them to DNA microarrays (Dataset S1).

We identified RNAs specifically associated with each protein using the significance analysis of microarrays (SAM) algorithm [44]. Although it is not possible to perfectly distinguish targets from nontargets, and the best criterion for distinguishing targets from nontargets is unlikely to be the same for all proteins, for most proteins we chose a 1% false discovery rate (FDR) as a criterion for identifying targets (Datasets S2 and S3). For many RBPs, the number of RNAs called significantly enriched has an inflection point near 1% FDR, suggesting that this threshold is a good balance between sensitivity and specificity, but undoubtedly our identification of specific RBP targets is not comprehensive. For two proteins in the survey (Ssd1 and Khd1), we used a more stringent 1% local FDR criterion [45] (details in Materials and Methods; Datasets S2 and S3).
We also included mRNAs specifically associated with Puf1–5 from our previous work [26] (defined using a 1% local FDR), and previously identified She2 targets [32].

Diverse Binding Specificity among RNA-Binding Proteins

The 40 proteins in the survey (and also Puf1–5 and She2 from our previous work [26,32]) displayed diverse patterns of specificity with regard to the numbers and types of RNA targets and their enrichment profiles (Figures 1
Fourteen of the proteins we surveyed specifically associated with RNAs other than mature mRNAs encoded by nuclear genes (Figure S2). Their specific targets included intron-containing transcripts (Cbc2, Msl5, Npl3, Hrb1, Pab1, and Pub1), H/ACA box small nucleolar RNAs (snoRNAs) (Cbf5, Nrd1, and Pub1), C/D box snoRNAs (Nop56, Sof1, Nab3, Nrd1, Pub1, and Pab1), and mitochondrial mRNAs (Aco1, Tdh3, and Nab2). Several of these proteins have previously been shown to be associated with specific classes of RNA (Cbc2, Msl5, Npl3, Cbf5, Nrd1, Nop56, Sof1, and Nab3), and therefore provide de facto positive controls (Table S2 and Text S4). Aco1, a TCA cycle enzyme [48], which has recently been implicated in maintaining mitochondrial genome integrity [49], selectively binds transcripts encoded by the mitochondrial genome (p < 10^−38). Our results also suggest unexpected associations for several noncoding-RNA–binding proteins and suggest possible regulatory links between mRNA and noncoding RNA (ncRNA) processing (Text S4). However, the remainder of this report will focus mostly on mRNA targets.

Most mRNAs Associate with Multiple RNA-Binding Proteins

To explore the interrelationships among RBPs and their RNA targets, we organized RNAs (Figure 1

Altogether, we identified more than 12,000 mRNA–RBP interactions (at a 1% FDR), an average of at least 2.8 RBPs interacting with each of 4,300 distinct mRNAs; 31 proteins (including Puf1–5 and She2) reproducibly bound at least ten mRNAs (at a 1% FDR). Most mRNAs were bound by multiple RBPs (Figure 1

About 75% (~9,000) of the mRNA–RBP interactions identified in this survey were accounted for by the nine proteins that targeted more than 500 mRNAs each (Figure 1

Many RNA-Binding Proteins Associate with mRNAs Encoding Functionally and Cytotopically Related Proteins

Regulatory proteins, including both transcription factors and RBPs, typically regulate sets of targets that share identifiable functional relationships (e.g., [26–29,32,35,52–60]).
As a first step toward identifying relationships among RNAs bound by specific RBPs, we searched for gene ontology (GO) terms [61] that were significantly enriched among the targets of each RBP. Twenty-five of the RBPs in this survey were consistently associated with at least ten mRNAs; 13 of these sets of RNA targets specific to an RBP were significantly enriched for at least one “cellular component” GO term (Figure 2
Diverse subcellular loci and biological processes were represented among the annotations enriched in the sets of RNA targets of these 15 RBPs (as well as the five Puf proteins and She2), including nearly all major subcellular compartments. Some subcellular sites and biological processes were found as shared attributes of the RNA targets associated with an unexpectedly large fraction of the RBPs in this study, perhaps highlighting processes or systems in which post-transcriptional regulation plays an especially important role. For instance, six RBPs (Pub1, Khd1, Nab6, Ssd1, Ypl184c, and Scp160) were specifically associated with mRNAs encoding cell wall proteins; six (Pub1, Puf1, Puf2, Khd1, Ypl184c, and Scp160) were specifically associated with mRNAs encoding plasma membrane proteins; five (Puf3, Nsr1, Pab1, Npl3, and Nrd1) were significantly associated with mRNAs encoding subunits of the mitochondrial ribosome; and four (Scp160, Bfr1, Puf4, and Gbp2) were specifically associated with mRNAs encoding proteins localized to the nucleolus and involved in RNA processing and ribosome biogenesis.

For many RBPs, several distinct subcellular components or biological processes were overrepresented in the functional annotations of the associated transcripts; these subcellular loci or processes were often functionally linked. For example, RNAs associated with Ssd1 were enriched for transcripts encoding cell wall and bud proteins, whereas Gbp2-associated RNAs were enriched for transcripts encoding nuclear proteins with roles in ribosome biogenesis or chromatin remodeling. In many instances, the functional themes significantly overrepresented among the RNA targets of an RBP are congruent with previously published work on that RBP, such as phenotypes associated with mutation or altered expression (Table S2). A few examples are described in subsequent sections.
Specific Features of Post-Transcriptional Regulation May Be Linked to Broad-Specificity RNA-Binding Proteins

Although some appear to bind to most or all mRNAs (Figure S2 and Text S3), the nine RBPs that bind large (>500) sets of mRNAs display several distinct enrichment profiles (Figure 1

Pab1 provides a simple and useful example of the possible functional significance of the differential enrichment; immunoaffinity enrichment of mRNAs associated with Pab1 was correlated with ribosome occupancy (Pearson correlation r = 0.35). Pab1 is the major poly(A) binding protein in both the nucleus and cytoplasm [64]. In the cytoplasm, Pab1 binds to the poly(A) tails of mRNAs and interacts with eIF4-G to promote translation initiation [65]. Because longer poly(A) tails have been reported to increase translation efficiency [66], a possible interpretation of these results is that the observed enrichment could reflect the number of Pab1 proteins bound per mRNA and thus the length of the poly(A) tail [39]. In contrast, immunoaffinity enrichment with Khd1 was negatively correlated with ribosome occupancy (r = −0.26). Khd1 is implicated in repressing translation of ASH1 mRNA during the transport of the mRNA to the bud tip [67]. The negative correlation with global ribosome occupancy and the large number of mRNAs associated with Khd1 suggest that Khd1 may similarly repress translation initiation of hundreds to thousands of mRNAs, perhaps during their transport to specific cellular loci.

Many RNA-Binding Proteins Appear to Bind Their Targets during Specific Stages in Their Lives

Many RBPs associate with mRNAs at a particular stage in their lives [2].
For the approximately 270 intron-containing genes, the relative enrichment of introns (i.e., unspliced pre-mRNAs and possibly uncleaved excised introns) versus exons (i.e., mature mRNAs and pre-mRNAs) should reveal whether the RBP is bound specifically to intron-containing transcripts, mature mRNAs, or both, and thus indicate when and where the RBP associates with its target RNAs. Linking these data to functional information on the RBP could then provide insights into timing and duration of specific stages in the lives of mRNAs. To test this idea, we compared the enrichment of intron and exon sequences in association with RBPs. For the approximately 120 intron/exon probe pairs for which our data were most consistently reliable, the relative enrichment profiles vary greatly among RBPs (Figure 3
Combinatorial Interactions among RNA-Binding Proteins and mRNAs

The RBPs we analyzed bound overlapping sets of mRNAs, and many individual mRNAs were bound by more than one RBP (Figure 1

To explore the relationships among the groups of RNAs bound by different RBPs, we determined the extent to which the overlaps between targets for each RBP pair differed from what would be expected by chance. The significance values from this analysis were used as a metric of similarity for hierarchical clustering to identify pairs and sets of RBPs with similar patterns of shared targets. The results are presented in Figure 4
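
As an illustration of the pairwise overlap analysis described above, the following minimal sketch scores how unexpected the overlap between two target sets is. The text here does not state which statistical test was used, so the hypergeometric upper tail is an assumption, and the function name and all counts in the usage note are hypothetical.

```python
from math import comb

def overlap_pvalue(n_total, n_a, n_b, n_overlap):
    """Probability of seeing at least n_overlap shared targets by chance.

    Assumption: targets of RBP B are modeled as n_b draws without
    replacement from n_total mRNAs, n_a of which are targets of RBP A
    (hypergeometric upper tail). This is an illustrative choice, not
    necessarily the test used in the study.
    """
    max_overlap = min(n_a, n_b)
    denom = comb(n_total, n_b)
    # Sum the probabilities of every overlap size from the observed one up.
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / denom
```

For example, with hypothetical counts of 6,000 measured mRNAs, 220 targets for one RBP, 250 for another, and 40 shared targets, `overlap_pvalue(6000, 220, 250, 40)` is far below 0.01, because only about nine shared targets would be expected by chance. Values of this kind (for instance, their negative logarithms) could then serve as the similarity metric for hierarchical clustering.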
To further explore the interrelationships among RBPs and their mRNA targets, we used a supervised method to identify smaller subsets of mRNAs that shared interactions with several RBPs. We did this by selecting mRNAs bound by a common set of RBPs whose targets, in turn, were enriched for common GO terms (Figure 2

The group of mRNAs, defined by interactions with at least four of a set of six RBPs (Pub1, Khd1, Nab6, Ssd1, Ypl184c, and Scp160), includes a significant excess of mRNAs encoding proteins localized to the cell wall (Figure 4

How Do the RNA-Binding Proteins Identify Their Targets?

We identified candidates for the sequence elements that mediate regulatory interactions with specific RBPs using two related computational methods: “finding informative regulatory elements” (FIRE), which searches for motifs with informative patterns of enrichment [89], and a newly developed method, “relative filtering by nucleotide enrichment” (REFINE). In brief, REFINE identifies all hexamers that are significantly enriched in putative 5′- and 3′-UTR regions of targets over nontargets, filters out regions of target sequences that are relatively devoid of such hexamers, and then applies the “multiple expectation maximization for motif elicitation” (MEME) motif-finding algorithm [90]. A full description of the REFINE methodology and more detailed analyses of predicted motif sequences will be published separately (D. P. Riordan, D. Herschlag, and P. O. Brown, unpublished data). Herein, we combined the results from these two approaches. Using stringent statistical criteria based on randomized simulations (details in Materials and Methods), we identified a total of 60 candidate RNA regulatory motifs significantly associated with 21 different RBPs; 35 motifs (for 21 RBPs) were predicted by REFINE, and 25 motifs (for 13 RBPs) were predicted by FIRE (Table S4).
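
The first step attributed to REFINE above, finding hexamers enriched in target UTRs relative to nontarget UTRs, can be sketched as follows. This is not the REFINE implementation, which is described as unpublished. The hypergeometric test, the `alpha` cutoff, the lack of multiple-testing correction, and all function names are assumptions made for illustration, and the later filtering and MEME steps are omitted.

```python
from collections import Counter
from math import comb

def hexamers(seq, k=6):
    """Set of distinct k-mers present in one sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def hypergeom_tail(n_total, n_with, n_drawn, n_hits):
    """P(at least n_hits of n_drawn sequences contain the k-mer)."""
    top = min(n_with, n_drawn)
    return sum(
        comb(n_with, k) * comb(n_total - n_with, n_drawn - k)
        for k in range(n_hits, top + 1)
    ) / comb(n_total, n_drawn)

def enriched_hexamers(targets, nontargets, alpha=0.01):
    """Hexamers present in more target sequences than expected by chance.

    Assumption: each sequence is counted once per hexamer, and no
    multiple-testing correction is applied (a real analysis would need one).
    """
    n_targets = len(targets)
    n_total = n_targets + len(nontargets)
    in_targets = Counter(h for s in targets for h in hexamers(s))
    in_all = Counter(in_targets)
    in_all.update(h for s in nontargets for h in hexamers(s))
    results = {}
    for hexamer, hits in in_targets.items():
        p = hypergeom_tail(n_total, in_all[hexamer], n_targets, hits)
        if p < alpha:
            results[hexamer] = p
    return results
```

Counting each sequence once per hexamer keeps a single repeat-rich UTR from dominating the score; whether REFINE makes the same choice is not stated in the text.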
Since the same motifs were often predicted by both programs for the same RBP or for different RBPs with significantly overlapping target sets, we manually grouped motifs with similar consensus sequences and origins into classes (Table S4). We then included only the most significant motif from each class and for each RBP, resulting in a set of 14 nonredundant RNA motifs predicted with high confidence (Figure 5
The motifs we identified for Puf3, Puf4, Puf5, Pub1, Nab2, Nrd1, and Nab3 match previously described binding sites for the corresponding RBPs, validating our approach and suggesting that many of the RBP–RNA interactions we measured are likely to be directly mediated by these elements (Text S7). Interestingly, the inferred recognition element for Nrd1, Nrd1–1 (UUCUUGUW), contains both an exact match to the reported Nrd1 binding site consensus “UCUU” and a partial match to the reported Nab3 recognition site consensus “GUAR” [91,92]. As Nrd1 and Nab3 are known to act as a complex to control transcriptional termination of nonpolyadenylated RNAs [93], and a nearly identical motif was identified in Nab3 targets (Table S4), it is possible that these motifs represent a favored orientation of adjacent Nrd1 and Nab3 RNA elements that facilitates specific binding of the Nrd1–Nab3 complex. The most significant novel motif we identified, Puf2–1 (UAAUAAUUW), is enriched in the 3′-UTRs and coding sequences of Puf2 targets and demonstrates significant conservation and a forward strand bias (Figure 5 A selective sample of 11 mRNAs provides an unfinished, but revealing, picture of the organization of the information that specifies interactions with, and perhaps regulation by, specific RBPs examined in this study (Figure 6
For many RBPs, our computational method did not identify any sequence motifs with statistically significant enrichment, the motifs identified significantly overlapped those associated with other RBP target sets, or the motif did not match previously reported binding preferences (Table S4 and Text S7). The large degree of motif coenrichment observed in our analysis is consistent with combinatorial regulation by a highly interconnected regulatory network and represents an important limitation of computational regulatory element identification. It is likely that some of the RBPs for which we failed to predict sequence motifs recognize RNA structural elements or features primarily present in coding sequences, which are difficult to detect with current methods for RNA motif prediction, because they are not suited to modeling structural features or handling the significant confounding sequence biases in coding sequences.

Vts1 illustrates some of the limitations of current RNA motif prediction methods. Vts1 is known to bind to a structural RNA motif called the Smaug recognition element (SRE), which consists of a short hairpin with the loop consensus sequence CNGGN(0–1) [95]. SRE sites are indeed significantly enriched in the coding sequences of Vts1 targets (65% of targets versus 36% of nontargets, p < 10^−7) in agreement with previous results [96], suggesting that SREs are directly responsible for these interactions in vivo. However, neither REFINE nor FIRE succeeded in identifying the SRE. Instead, both programs identified a motif, Vts1–1 (UKWCGRGGN), which is indeed enriched in the 3′-UTRs of Vts1 targets but is unrelated to the SRE (Table S4). We suspect that the Vts1–1 motif may represent a binding site for an unknown factor that regulates a set of mRNAs that overlaps extensively with the targets of Vts1.
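
The motif consensus sequences quoted in this section (for example UUCUUGUW, UAAUAAUUW, and UKWCGRGGN) use IUPAC degenerate nucleotide codes. The sketch below, with hypothetical helper names and made-up test sequences, shows one way to expand such a consensus into a regular expression and locate matches in an RNA sequence. It handles linear sequence only, so a structural element such as the SRE hairpin is outside its scope, in line with the limitation discussed above.

```python
import re

# Standard IUPAC degenerate codes, written for RNA (U instead of T).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]", "Y": "[CU]", "W": "[AU]", "S": "[CG]",
    "K": "[GU]", "M": "[AC]",
    "B": "[CGU]", "D": "[AGU]", "H": "[ACU]", "V": "[ACG]",
    "N": "[ACGU]",
}

def motif_to_regex(motif):
    """Expand an IUPAC consensus such as 'UAAUAAUUW' into a regex string."""
    return "".join(IUPAC_RNA[base] for base in motif.upper())

def find_sites(motif, seq):
    """0-based start positions of all (possibly overlapping) matches."""
    # The lookahead makes overlapping occurrences count separately.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]
```

With the made-up sequence `GGUAAUAAUUACC`, `find_sites("UAAUAAUUW", ...)` reports a single site at position 2, where the trailing W is satisfied by an A.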
Insights into the Functions of Specific RNA-Binding Proteins

The functional and cytotopic themes represented among the specific targets of each RBP have obvious implications for their possible regulatory roles, which can be integrated with previously reported information to derive further insights and generate new hypotheses, as illustrated here for Ssd1 and Ypl184c (see Text S9 for descriptions of Khd1 and Gbp2).

Ssd1 is a large (140 kDa), ribonuclease-II domain–containing, predominantly cytoplasmic protein [99], genetically implicated in cell-wall biogenesis and function: mutant phenotypes include increased sensitivity to osmotic stress and caffeine, altered composition and structure of the cell wall, defects in germination and sporulation, premature aging, and pathogenicity [73,74,100–103]. Ssd1 physically and genetically interacts with numerous signaling proteins, many of which are genetically implicated in cell-wall function [71,102,104,105]. Ssd1 binds to the C-terminal domain of RNA polymerase II in vitro [106]. Of the 52 annotated mRNAs associated with Ssd1, 16 encode proteins localized to the cell wall (p < 10^−15), and 11 encode proteins localized to the bud (p < 10^−5). The proteins encoded by the Ssd1-associated transcripts have diverse functional and structural roles related to cell-wall biosynthesis or remodeling and its regulation, cell-cycle progression, and protein trafficking. Ssd1 also appears to bind its own transcript (Text S8). For both of the Ssd1 mRNA targets encoded by intron-containing genes (PUF5 and ECM33), the intron-containing primary transcripts are also enriched by Ssd1 IP, suggesting that Ssd1 binds its RNA targets in the nucleus, perhaps while they are being transcribed. A putative RNA-recognition motif is significantly enriched in the 5′-UTRs of Ssd1 targets (Figure 5

Ypl184c is a largely uncharacterized, predominantly cytoplasmic protein that contains three RNA recognition motifs (RRMs).
Of the three proteins that have been found to physically interact with Ypl184c, two are among the other RBPs included in this survey: Pab1 and Nab6 [71]. A disproportionate fraction of the 321 annotated mRNAs we found to associate with Ypl184c encode proteins localized to the cell wall (38, p < 10^−23), ER (50, p < 10^−5), plasma membrane (32, p < 10^−3), or extracellular milieu (8, p < 10^−3). Transcripts encoding components of several protein complexes were associated with Ypl184c, including three of five components of the Cdc28 complex (CLB2, CLN3, and CLN2) for which we obtained high-quality measurements, three of three components of the plasma membrane H+ ATPase (PMP1, PMP2, and PMA1) for which we obtained high-quality measurements, and four of nine components of the oligosaccharyltransferase complex (OST4, SWP1, OST3, and OST5) [107]. Components of these complexes that were not defined as targets of Ypl184c (at a stringent 1% FDR) were nevertheless more likely to be overrepresented in Ypl184c IPs than expected by chance, suggesting that Ypl184c may actually associate with the mRNAs encoding most or all members of these complexes.

Ypl184c associated with many mRNAs that exhibit unusual modes of translation regulation. Ypl184c bound all five of the mRNAs that have experimentally confirmed short upstream open reading frames (uORFs) (GCN4, CPA1, LEU4, SCH9, and SCO1) [108–115] in their 5′-UTRs and for which we obtained high-quality measurements; uORFs have been shown to regulate the translation of the downstream coding sequence and the stability of the mRNA [116]. Ypl184c associated with all five of the S. cerevisiae mRNAs that have been shown to have internal ribosome entry sites (IRESs) (HAP4, YMR181C, GPR1, NCE102, and GIC1) in their 5′-UTRs [117,118] and for which we obtained high-quality measurements; these IRESs enable cap-independent translation, often in response to environmental stresses [119].
Ypl184c also bound the unspliced HAC1 transcript, which associates with the cytosolic side of the ER membrane and is not efficiently translated until it is spliced by IRE1 as part of the unfolded protein response pathway [120,121]. Given Ypl184c's association with Pab1 and its striking association with sets of mRNAs that are known to be subject to extensive translational regulation, we speculate that Ypl184c regulates translation. The sequence motifs that we found to be significantly enriched in the mRNA targets of Ypl184c closely match the ones we found for Pub1 (Table S4). Indeed, the RNA target sets of these two proteins overlap significantly (Figures 1

Discussion

A large body of work has given us a general picture of the relationship between the several hundred transcription factors and thousands of genes in yeast (e.g., [26–29,32,35,52–60]). Among the key features of transcriptional regulation are that: (1) individual transcription factors characteristically regulate sets of genes with related biological roles, (2) transcription factors are recruited to the specific genes they regulate by binding to specific sequences in the vicinity of those genes, and (3) combinatorial regulation of individual genes by two or more distinct transcription factors provides multidimensional control and precision to their regulation. Our systematic identification of RNAs associated with each of 46 proteins in yeast suggests that a system that shares these three key features, likely involving dozens to hundreds of RBPs, may regulate the post-transcriptional fate of most or all RNAs in the yeast cell. This glimpse into the landscape of RNA–protein interactions has provided tantalizing clues to its organization and role.
The mRNA targets of most of the RBPs in the survey encoded sets of proteins that were significantly associated with one or several related subcellular sites or biological processes (Figure 2

The striking tendency of individual RBPs to bind to sets of mRNAs whose protein products are similarly localized in the cell hints at an important role for RBPs in establishing and maintaining spatial organization in the cell, perhaps through facilitating localized protein production and mRNA decay [13,32,122–131]. The cellular structures that were most often overrepresented among the mRNA targets of many RBPs were the cell wall, plasma membrane, and ER. Thus, in addition to the familiar role of the peptide signal sequence in mediating ER-localized translation [12], RBPs may have important roles in RNA partitioning between the cytoplasm and ER, and perhaps in localization to specific sites in the periphery of the cell, such as sites of cell-wall biogenesis, bud development, and endocytosis [32,132–135]. Two of the RBPs whose targets disproportionately encode proteins localized to the cell periphery, She2 and Khd1, have been shown to be involved in trafficking some of their mRNA targets to the bud tip during the G2/M phase of the cell cycle [32,67,136]. The particularly strong overrepresentation of RBPs that associate with mRNAs encoding cell-wall components may reflect the need for extensive multilayered regulation of the location and timing of assembly and remodeling of this dynamic subcellular structure.

Identification of the information that specifies mRNA–RBP interactions is still in its earliest stages. The sequence motifs overrepresented in RBP targets, identified with the recently developed FIRE and novel REFINE methodologies, are diverse in design and location (Figures 5

We identified over 12,000 mRNA–RBP interactions with high confidence. Most mRNAs in the yeast transcriptome associated with at least one of the RBPs in our survey and many associated with multiple RBPs.
Some of the RBPs in the survey appear to interact with most or all mRNAs at some point in their lifecycle (Figure S1 and Text S3). Naively extrapolating from our results to the estimated 600 RBPs in Saccharomyces suggests that each mRNA might interact with a dozen or more different RBPs, on average, during its lifetime. This extrapolation is highly speculative: the sample of RBPs that we investigated is biased towards RBPs that we suspected might have a regulatory function; we do not have a good estimate of the number of regulatory RBPs that bind discrete sets of mRNAs in a manner analogous to specific transcription factors; and, given that three of the four proteins in this survey that were not annotated as RBPs nevertheless gave reproducible interactions with specific sets of mRNAs (Bud27, Aco1, and Tdh3), the number of potential noncanonical, unannotated RBPs with regulatory roles may be large, perhaps even in the hundreds [140–144].

There is no reason to believe the system we have described is peculiar to yeast. Extensive post-transcriptional regulation by combinatorial binding of a large and diverse set of specific RBPs is likely to be a general feature of regulation in eukaryotes. Indeed, several lines of evidence suggest an even greater genomic investment in post-transcriptional regulation in humans (and other metazoans): the number and diversity of RBPs encoded by the human genome seem to far exceed those of yeast [145]; untranslated regions of mRNAs are much longer in humans (~1,300 bases on average) than in yeast (~300 bases on average) and appear to contain much more regulatory information [6,146,147]; and the architecture of animal cells is far more diverse and complex than that of the yeast cell, with a correspondingly greater potential role for specific RNA localization [13,130,148–151].
This work has provided a glimpse of a network of RBP–mRNA interactions that is likely to play an important, but still largely undiscovered, role in biological regulation. The genes and cis-regulatory elements implicated in this process represent a substantial fraction of the genome's investment in regulation, yet the specific details and molecular mechanisms of this network of RBP–mRNA interactions are still largely terra incognita, and fertile ground for further exploration and discovery.

Materials and Methods

RNA immunoaffinity purifications. We carried out immunopurifications of specific proteins, together with the associated RNAs, using specific strains expressing a TAP-tagged derivative of each selected protein (Open Biosystems Cat# YSC1177-OB), essentially as described in Gerber et al. [26]. After growing 1-L cultures to an optical density at 600 nm (OD600) of 0.6–0.9 in YPAD, we harvested cells by centrifugation, chilled the cell pellets on ice, washed them twice with 25 ml of ice-cold buffer A (20 mM Tris–HCl [pH 8.0], 140 mM KCl, 1.8 mM MgCl2, 0.1% Nonidet P-40, 0.02 mg/ml heparin), then froze them in liquid nitrogen and stored them at −80 °C. In a few instances, we proceeded to lyse the pelleted cells immediately without freezing. To lyse the cells, we first thawed the cell suspension at 4 °C, added 5 ml of buffer B (buffer A plus 0.5 mM DTT, 1 mM PMSF, 1 μg/ml leupeptin, 1 μg/ml pepstatin, 20 U/ml DNase I [Stratagene Cat# 600032], 50 U/ml Superasin [Ambion Cat# AM2696], and 0.2 mg/ml heparin), and then mechanically lysed the cells by vortexing in the presence of glass beads. We removed the beads by centrifugation at 1,000g for 5 min, then clarified the extracts by centrifuging them twice at 7,000g for 5 min each.
We adjusted the volume of the extract to 5 ml with buffer B, removed a 100-μl aliquot for reference RNA isolation, and then incubated the remaining 4.9 ml with 400 μl of a 50% (v/v) suspension of IgG-agarose beads (Sigma Cat# A2909) in buffer A with gentle rotation for 2 h. We washed the beads once with 5 ml of buffer B for 15 min, and three times with 12 ml of buffer C (20 mM Tris–HCl [pH 8.0], 140 mM KCl, 1.8 mM MgCl2, 0.5 mM DTT, 0.01% NP-40, 15 U/ml Superasin, 1 μg/ml pepstatin, 1 μg/ml leupeptin, 1 mM PMSF) for 15 min with gentle rotation. We pelleted the beads by centrifugation for 5 min at 60g in a table-top centrifuge. We then transferred the beads to 1.2-ml micro-spin columns (BioRad Cat# 732-6204), centrifuged them briefly to pellet the beads, removed buffer C, and then added 1 volume of buffer C. We cleaved TAP-tagged proteins by incubation with 80 U acTEV protease (Invitrogen Cat# 12575023) or an equivalent amount of purified TEV [152] for 2 h at 15 °C. We collected the eluate by centrifugation into 2-ml tubes. We isolated reference RNA using the RNeasy Mini Kit (Qiagen Cat# 74106), while we isolated RNA from the eluate by extraction with Phenol/Chloroform/Isoamyl Alcohol, 25:24:1 (Invitrogen Cat# 15593031) twice, and chloroform once, followed by ethanol precipitation with 15 μg of Glycoblue (Ambion Cat# AM9515) as carrier.

Oligonucleotide microarray design. Starting with the Operon AROS 1.1 oligo set, which contains long oligonucleotides for almost all annotated S. cerevisiae nuclear and mitochondrial coding sequences, we added 3,072 additional probes designed to detect annotated noncoding RNAs, ribosomal RNA precursors, introns, exon-intron and exon-exon junctions, other sequences predicted to be expressed, additional probes for genes with high cross-hybridization potential, and hundreds of controls for array quality measurements and normalization.
Details of oligonucleotide selection and probe sequences are available from the Operon Web site (https://www.operon.com/; S. cerevisiae YBOX V1.0).

Microarray production and prehybridization processing. Detailed methods for microarray experiments are available at the Brown lab Web site (http://rd.plos.org/pbio.0060255). For oligonucleotide microarrays, we resuspended oligonucleotides in 3× SSC (1× SSC = 150 mM NaCl, 15 mM sodium citrate [pH 7.0]) at a final concentration of 25 μM and printed oligonucleotides on poly-lysine glass (Erie Scientific Cat# C41–5870-M20) (http://rd.plos.org/pbio.0060255a). We printed each oligonucleotide twice per array. For most arrays, the second print was in reverse orientation to the first print, such that oligonucleotide pairs were printed with different pins and thus located in different sectors of the array. Prior to hybridization, the oligonucleotides were crosslinked to the poly-lysine–coated surface with 65 mJ of UV irradiation. Slides were then incubated in a 500-ml solution containing 3× SSC and 0.2% SDS for 5 min at 50 °C. Slides were washed for 2 min in a glass chamber containing 400 ml of water, dunked in a glass chamber containing 400 ml of 95% ethanol for 15 s, and then dried by centrifugation. Free poly-lysine groups were then succinylated by incubation with 5.5 g of succinic anhydride that was dissolved in 350 ml of anhydrous 1-methyl-2-pyrrolidinone (Sigma Cat# 328634) and 15 ml of 1 M sodium borate (pH 8.0) for 20 min [53]. Slides were washed for 2 min in a glass chamber containing 400 ml of room temperature water, dunked in a glass chamber containing 400 ml of 95% ethanol for 15 s, and then dried by centrifugation. cDNA microarrays containing long double-stranded DNA (dsDNA) from PCR reactions were prepared as previously described [53].

Microarray sample preparation, hybridization, and washing.
A total of 3 μg of reference RNA from extract and up to 3 μg (or 50%) of affinity-purified RNA were reverse transcribed with Superscript II (Invitrogen Cat# 18064–014) in the presence of 5-(3-aminoallyl)-dUTP (Ambion Cat# AM8439) and natural dNTPs (GE Healthcare Life Sciences Cat# US77212) with a 1:1 mixture of N9 and dT20V primers (Invitrogen). Subsequently, amino-allyl–containing cDNAs were covalently linked to Cy3 and Cy5 NHS-monoesters (GE Healthcare Life Sciences Cat# RPN5661). Dye-labeled DNA was diluted in a 20–40-μl solution containing 3× SSC, 25 mM Hepes-NaOH (pH 7.0), 20 μg of poly(A) RNA (Sigma Cat# P4303), and 0.3% SDS. The sample was incubated at 95 °C for 2 min, spun at 14,000 rpm for 10 min in a microcentrifuge, and then hybridized at 65 °C for 12–16 h. For most oligonucleotide microarray experiments, we hybridized microarrays inside sealed chambers in a water bath using the M-series lifterslip to contain the probe on the microarray (Erie Scientific Cat# 22x60I-M-5522). For some oligonucleotide microarray experiments, we hybridized microarrays using the MAUI hybridization system (BioMicro), which promotes active mixing during hybridization. We hybridized cDNA microarrays inside sealed chambers in a water bath using a coverslip to contain the probe on the microarray. Following hybridization, microarrays were washed in a series of four solutions containing 400 ml of 2× SSC with 0.05% SDS, 2× SSC, 1× SSC, and 0.2× SSC, respectively. The first wash was performed for 5 min at 65 °C. The subsequent washes were performed at room temperature for 2 min each. Following the last wash, the microarrays were dried by centrifugation in a low-ozone environment (<5 ppb) to prevent destruction of Cy dyes [153,154]. Once dry, the microarrays were kept in a low-ozone environment during storage and scanning (see http://rd.plos.org/pbio.0060255).

Microarray scanning and data processing.
Microarrays were scanned using an AxonScanner 4200, 4000B, or 4000A (Molecular Devices). PMT levels were adjusted to achieve 0.1%–0.5% pixel saturation. Each element was located and analyzed using GenePix Pro 5.0 (Molecular Devices). These data were submitted to the Stanford Microarray Database [155] for further analysis. Data were filtered, as described in Text S10, to remove low-confidence measurements. Oligonucleotide pairs that both passed filtering criteria were averaged, and the data were globally normalized per array such that the mean log2 (Cy5/Cy3 fluorescence) ratio was zero after normalization.

We analyzed a total of 123 IPs by microarray hybridization (Dataset S1). During the course of this work, we continued to improve and optimize our protocols. These changes and the manufacturing differences in reagents (especially in the beads used in the IPs) led to systematic differences in the background distribution of RNAs between corresponding experiments. We minimized systematic differences among sets of experiments by deriving estimates of the background separately for each set of experiments. Each group was normalized by subtracting the median log2 ratio for each molecular feature across the experiments in a group from the log2 ratio of the molecular feature in each experiment. The details of the group normalization are described in Text S10, and the groups are labeled in Table S5.

Microarray analyses. Hierarchical clustering was performed with Cluster 3.0 [156], and the results were visualized as heat maps with Java TreeView 1.0.12 [157]. Clustering of FDR values (Figures 1 […]) […]. For SAM, unpaired two-class t-tests were performed with default settings. FDRs were generated from up to 1,000 permutations of group-normalized data. Details of SAM analysis are described in Text S11.

Enrichment of specific gene lists in RBP target sets.
The p-values of enrichment of specific classes of RNAs and GO terms in target sets were determined using the hypergeometric density distribution function and corrected for multiple hypothesis testing using the Bonferroni method. Enrichment of GO terms was performed with GO::TermFinder [158]. For noncoding RNAs, all RNAs for which we obtained reliable measurements on the microarray were used as background. For GO analysis, only probes that are meant to capture mature mRNAs were included in analyses. For oligonucleotide microarray experiments, this corresponds to probes that match the following regular expression: Y[A-P][RL][0-9]{3}[WC][-ABC]*_ORF (Datasets S1–S3). For cDNA microarray experiments, this corresponds to probes that match the following regular expression: Y[A-P][RL][0-9]{3}[WC][-ABC]* (Datasets S1–S3). mRNAs for which we obtained high-quality measurements were used as background.

Sequences used for motif analysis. Yeast sequence files orf_genomic_1000.fasta and orf_coding.fasta were downloaded from SGD (ftp://ftp.yeastgenome.org). The 200 nucleotides upstream and downstream of coding sequences containing proper start and stop codons were extracted to create 5′-UTR and 3′-UTR databases, and the coding sequences were used for the coding sequence database. All-by-all WU-BLAST [159] (http://blast.wustl.edu/) comparisons were performed for each database against itself to identify highly similar sequences (using options -e 1e-10 -b 5000 -S 1 -F F). WU-BLAST output files were parsed to identify alignments of greater than or equal to 80% identity extending over half the length of the query sequence, and all such sequence pairs were grouped into redundant classes. One sequence from each redundant class was retained to create nonredundant databases for each region.

Motif prediction.
The REFINE procedure was run using hexamers with significant (p < 10⁻³) enrichment in RBP targets, as measured by the hypergeometric distribution (using options –ss –f 3 –g 6 –ct 3 –max 15 –dust). MEME analysis (version 3.5.1) was performed on the REFINE output sequences with options -dna -minw 6 -maxw 15 -text -maxsize 200000 -evt 10 -nmotifs 3. Motif site sequences were extracted from MEME output and used to generate position-specific log-odds scoring matrices based on the observed frequencies and 0.25 pseudocounts per base, and null frequencies based on mononucleotide composition of all sequences in the corresponding (5′-UTR or 3′-UTR) nonredundant database. Cutoff scores for motif classification were chosen to maximize the significance of association of motif sites with RBP target membership as measured by hypergeometric p-values for enrichment. All subsequences with scores above the cutoff threshold were classified as motif sites, and the final significance was measured as the negative log of the p-value of motif enrichment in RBP targets. FIRE analysis was run on the nonredundant 5′- and 3′-UTR databases using binary data indicating RBP target membership with options --exptype=discrete --seqlen_rna=200 --nodups=1 --dodna=0.

Simulations to evaluate significance of predicted motifs. For both REFINE and FIRE, statistical significance of the predicted motifs was assessed by randomly generating target sets of similar size and repeating each procedure 100 times on the simulated target data. We defined a test statistic as the negative log of the p-value for motif enrichment for REFINE; the reported motif z-score was used for FIRE motifs, and we compared the observed values of these test statistics to the distributions generated by the random simulations (Table S4).
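As a rough illustration of the comparison just described, the following Python sketch computes a negative log10 hypergeometric p-value for motif enrichment in a target set and relates it to the same statistic computed on 100 random target sets of equal size. It is not the authors' code and it makes a simplifying assumption: the motif is held fixed and only re-scored on each random set, whereas the procedure described here repeats the entire motif search on each simulated set. The function names, the toy sequence identifiers, and the random seed are all hypothetical.

```python
import math
import random
import statistics


def log10_hypergeom_tail(k, carriers, draws, universe):
    """log10 of P(X >= k) when `draws` sequences are taken without replacement
    from `universe` sequences, `carriers` of which contain the motif.
    Assumes k is an attainable overlap (k <= min(carriers, draws))."""
    upper = min(carriers, draws)
    tail = sum(
        math.comb(carriers, i) * math.comb(universe - carriers, draws - i)
        for i in range(k, upper + 1)
    )
    # Work in log space so very small p-values do not underflow to zero.
    return math.log10(tail) - math.log10(math.comb(universe, draws))


def enrichment_statistic(targets, motif_carriers, universe):
    """Negative log10 hypergeometric p-value for motif enrichment in `targets`.
    `targets` and `motif_carriers` are assumed to be subsets of `universe`."""
    k = len(targets & motif_carriers)
    return -log10_hypergeom_tail(k, len(motif_carriers), len(targets), len(universe))


def simulated_z_score(targets, motif_carriers, universe, n_sim=100, seed=0):
    """Return (observed statistic, z-score against random target sets of equal size)."""
    rng = random.Random(seed)
    pool = sorted(universe)  # rng.sample needs a sequence, not a set
    observed = enrichment_statistic(targets, motif_carriers, universe)
    null = [
        enrichment_statistic(set(rng.sample(pool, len(targets))), motif_carriers, universe)
        for _ in range(n_sim)
    ]
    spread = statistics.stdev(null)
    if spread == 0:
        return observed, float("nan")  # z-score undefined without variation
    return observed, (observed - statistics.mean(null)) / spread


# Toy data (hypothetical): 1,000 sequence IDs, 100 of which carry the motif.
universe = set(range(1000))
motif_carriers = set(range(100))
# 50 targets, 30 of which carry the motif (about 5 would be expected by chance).
targets = set(range(30)) | set(range(500, 520))

observed, z = simulated_z_score(targets, motif_carriers, universe)
print(f"observed statistic = {observed:.1f}, z-score = {z:.1f}")
print("more than three standard deviations above the simulated mean:", z > 3)
```

Exact integer binomial coefficients are used here in place of a statistics library so that the sketch depends only on the Python standard library; for transcriptome-sized universes a library implementation of the hypergeometric survival function would be the more practical choice.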
Motifs were declared as significant if the observed test statistic was greater than three standard deviations above the mean, or if there was significant enrichment (p < 10⁻⁴) of the motif in targets occurring in regions from which that motif was not derived.

Accession Numbers

Our microarray experiment data are publicly available from the Stanford Microarray Database and Gene Expression Omnibus.

Dataset S1: Normalized Data from DNA Microarray Experiments; Values from Both Pregroup Normalization and after Group Normalization Are Included (6.76 MB ZIP)

Dataset S2: Data Matrix Containing False-Discovery Rate Values for Each RNA–RBP Pair (2.5 MB ZIP)

Dataset S3: Significance Analysis of Microarray Results for Each Protein (11.8 MB ZIP)

Figure S1: Immunopurification Enrichment Profiles of Several RNA-Binding Proteins
(A) Distribution of average Cy5/Cy3 fluorescence ratios from five independent microarray hybridizations analyzing Ssd1 targets. The enrichment distribution for mRNAs is shown in black, and the enrichment distribution for other annotated RNAs (i.e., nuclear introns, mitochondrion-encoded mRNAs, mitochondrial introns, snoRNAs, ribosomal RNAs, LSR1, NME1, SCR1, SRG1, and TLC1) is shown in red. The points correspond to an estimated distribution that was created by binning the average fluorescence ratios into 0.1 log2 unit bins from −7 to 7 log2 units. The lines correspond to a smoothed fit of the data [160]. We scaled the smoothed fit of the distribution to the binned data by making the maximum value of the smoothed fit data equal to the value in the bin with the largest number of RNAs.
(B) Same as in (A), except for Scp160. The results are the average of three independent microarray hybridizations.
(C) Same as in (A), except for Pab1. The results are the average of three independent microarray hybridizations.
(D) Same as in (A), except for Pub1. The results are the average of three independent microarray hybridizations.
(374 KB PDF)

Figure S2: Overrepresentation of Specific Classes of RNAs in Association with Specific RNA-Binding Proteins
Enrichment of several classes of RNAs (rows) in target sets (1% FDR) of RBPs (columns). The significance of enrichment of the class of RNAs is represented as a heat map in which the color intensity corresponds to the negative log10 p-value, which was calculated using the hypergeometric density distribution function and corrected for multiple hypothesis testing using the Bonferroni method. RBPs whose targets are significantly enriched (p ≤ 0.05) for a specific class of RNAs are shown.
(219 KB PDF)

Figure S3: Specific Features of Post-Transcriptional Regulation May Be Linked to Broad-Specificity RNA-Binding Proteins
Pearson correlations between IP enrichment with the RBP (columns) and selected characteristics of mRNAs (rows) are represented as a heat map. mRNAs that passed quality filtering for all nine RBPs were included in this analysis.
(231 KB PDF)

Table S1: Annotated and Putative RNA-Binding Proteins in Saccharomyces cerevisiae (160 KB XLS)

Table S2: Summary of RNA-Binding Proteins in the Survey (49 KB XLS)

Table S3: Gene Ontology Terms Enriched in RNA-Binding Protein Target Sets (91 KB XLS)

Table S4: RNA Motifs Identified in RNA-Binding Protein Target Sequences (46 KB XLS)

Table S5: Description of Microarray Experiments and Groups Used for Group Normalization (41 KB XLS)

Text S1: Representation of RNA-Binding Proteins in This Study (24 KB DOC)

Text S2: Comments on the Immunopurification Method (51 KB DOC)

Text S3: Diverse RNA Enrichment Profiles among RNA-Binding Proteins (29 KB DOC)

Text S4: RNA-Binding Proteins That Preferentially Associate with RNAs Other Than Mature mRNAs Encoded by Nuclear Genes (68 KB DOC)

Text S5: Specific Features of Post-Transcriptional Regulation May Be Linked to Broad-Specificity RNA-Binding Proteins (38 KB DOC)

Text S6: Many RNA-Binding Proteins Appear to Bind Their Targets during Specific Stages in Their Lives (57 KB DOC)

Text S8: Many RNA-Binding Proteins Associated with Their Own Transcripts (32 KB DOC)

Text S9: Insights into the Functions of Specific RNA-Binding Proteins (49 KB DOC)

Text S10: Immunopurification Group Normalization (28 KB DOC)

Text S11: Significance Analysis of Microarrays (33 KB DOC)

Acknowledgments

David Waugh kindly provided the TEV protease expression plasmid. We thank Maitreya Dunham, Donna Storton, Joseph DeRisi, and Adam Carroll, and members of the Brown and Herschlag labs, for advice and discussions. We appreciate comments on the manuscript from members of the Gerber lab and from Greg Hogan.
Footnotes

Academic Editor: Sean R. Eddy, Howard Hughes Medical Institute, Janelia Farm, United States of America

Author contributions. DJH, APG, DH, and POB conceived and designed the experiments. DJH and APG performed the experiments. DJH and DPR analyzed the data. DPR, DH, and POB contributed reagents/materials/analysis tools. DJH, DPR, APG, DH, and POB wrote the paper.

Funding. This work was supported by the Howard Hughes Medical Institute and by a grant from the National Cancer Institute to POB (R01 CA77097–08). POB is an investigator for the Howard Hughes Medical Institute. DJH was partially supported by National Science Foundation and American Heart Association predoctoral fellowships. DPR was partially supported by a Stanford Graduate Fellowship and by the Stanford Genome Training Program (Grant Number T32 HG00044 from the National Human Genome Research Institute). APG was supported by a Career Development Award from the Human Frontier Science Program Organization.

Competing interests. The authors have declared that no competing interests exist.