Copyright © The Author 2006. Published by Oxford University Press. All rights reserved

FlyRNAi: the Drosophila RNAi screening center database

1Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
2Howard Hughes Medical Institute, Harvard Medical School, Boston, MA 02115, USA
3Department of Biological Sciences, University of California-San Diego, La Jolla, CA 92093-0346, USA
4German Cancer Research Center, D-69120 Heidelberg, Germany

*To whom correspondence should be addressed. Tel: +1 617 432 0365; Fax: +1 617 432 6238; Email: iflockha/at/genetics.med.harvard.edu

Received August 10, 2005; Revised October 18, 2005; Accepted October 18, 2005.

The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions/at/oxfordjournals.org

Abstract

RNA interference (RNAi) has become a powerful tool for genetic screening in Drosophila. At the Drosophila RNAi Screening Center (DRSC), we are using a library of over 21 000 double-stranded RNAs targeting known and predicted genes in Drosophila. This library is available for the use of visiting scientists wishing to perform full-genome RNAi screens. The data generated from these screens are collected in the DRSC database (http://flyRNAi.org/cgi-bin/RNAi_screens.pl) in a flexible format for the convenience of the scientist and for archiving data.
The long-term goal of this database is to provide annotations for as many of the uncharacterized genes in Drosophila as possible. Data from published screens are available to the public through a highly configurable interface that allows detailed examination of the data and provides access to a number of other databases and bioinformatics tools.

INTRODUCTION

In recent years, we have witnessed the wide application of high-throughput screening (HTS) technologies to approach biological questions. Arguably, the most promising HTS approach for discovering gene function is based on RNA interference (RNAi). RNAi results in the silencing of a gene through the specific degradation of its mRNA, which is triggered by double-stranded RNA (dsRNA) fragments complementary to that transcript. In Drosophila, RNAi can be achieved in cell lines and primary cells simply by adding long dsRNA to the medium (1,2). The long dsRNAs are taken up by the cells and are rapidly processed to 21–24 nt short-interfering RNAs (siRNAs) that guide the specific degradation of target mRNAs. The biochemical steps involved in the processing of dsRNAs and the loading of the siRNAs into the RNA-induced silencing complex (RISC) are carried out by a number of proteins, including members of the Dicer protein family (3). In the last 2 years, this technique has been adapted to HTS, allowing genome-wide screens to be performed efficiently in 384-well assay plates (1). With the support of a grant from NIGMS, we created the Drosophila RNAi Screening Center (DRSC) (http://flyRNAi.org) in May 2003 with the following goals:
SCREENING AT THE DRSC

Visiting scientists typically perform their screens in duplicate, screening against two full-genome sets. Raw data from duplicate genome sets are collected along with phenotype and ‘hit’ information. Data are primarily stored by plate and well, rather than by dsRNA, which allows for comparison between genome sets and facilitates plate-wide analysis. The scientists perform their screens blind, learning the identity of the dsRNA in a particular well only after entering the data from that well into the system. Once the screen is completed, the data are held privately until the results are published or a 2 year period passes after completion of the screen. The scientists have password-protected accounts which give them access to data entry interfaces and direct links to their personal data, both published and unpublished. The scientist's data are broken down by assay. Any changes the scientist may make to the experimental data display (as described below) are saved between sessions. The logged-in user also has access to some tools for viewing data a plate at a time, direct links to the bioinformatic tools (listed below) and functions for directly querying the quality control (QC) information for the source plates.

An important aspect of the database's organization is the distinction among genes, amplicons and dsRNAs. The dsRNA library was designed in collaboration with R. Paro's group and collaborators (ZMBH Heidelberg) (4). The approach taken was to generate gene-specific primers to 21 396 putative open reading frames (ORFs) covering the entire Drosophila genome. Choice of the primers was based on the combined genome annotations available from BDGP/Celera (5) and the Sanger Center. A specific pair of forward and reverse primers was used to amplify a genomic region (henceforth called an amplicon) corresponding to each predicted ORF.
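The plate-and-well storage and the blind data entry described above can be pictured with a small data model. The following is only an illustrative sketch, not the DRSC schema: the class names, field names and the `AMP-0001` style identifiers are hypothetical, and the rule that a well is unblinded as soon as its data have been entered is a simplification of the workflow described in the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical key type: (plate identifier, well position such as "A01").
WellKey = Tuple[str, str]


@dataclass
class WellResult:
    plate: str
    well: str
    replicate_values: Tuple[float, float]  # raw readings from the two genome sets
    is_hit: bool = False
    phenotype: Optional[str] = None


class ScreenData:
    """Results are keyed by plate and well; amplicon identity is a separate map."""

    def __init__(self, layout: Dict[WellKey, str]) -> None:
        self._layout = layout                      # (plate, well) -> amplicon ID
        self._results: Dict[WellKey, WellResult] = {}

    def enter(self, result: WellResult) -> None:
        """Record the data for one well."""
        self._results[(result.plate, result.well)] = result

    def amplicon_for(self, plate: str, well: str) -> str:
        """Reveal the amplicon in a well, but only after its data were entered."""
        key = (plate, well)
        if key not in self._results:
            raise PermissionError("enter data for this well before unblinding")
        return self._layout[key]

    def plate_results(self, plate: str) -> List[WellResult]:
        """Return all entered results for one plate, ordered by well position."""
        on_plate = [r for r in self._results.values() if r.plate == plate]
        return sorted(on_plate, key=lambda r: r.well)
```

Keeping the layout map separate from the results means that a change in the amplicon assigned to a well, or in the gene an amplicon is predicted to target, does not require touching the recorded measurements.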
Each amplicon, flanked by RNA polymerase T7 promoters, was in turn used as a template in an in vitro transcription reaction to generate dsRNA. As new releases of the Drosophila genome result in slightly revised annotations, the predicted gene target of each amplicon may change accordingly. Because of this unavoidable issue, result and ‘hit’ information from screens is focused more on amplicons and the corresponding dsRNA than on genes. However, the key piece of information that remains invariant is the nucleotide sequence of the specific region encompassed by an amplicon and, by extension, of its related dsRNA. For the purpose of data tracking in the database, the term amplicon can refer to two distinct biological entities that are related through their sequences: the DNA fragment amplified with specific primers or its corresponding dsRNA.

WEBSITE OVERVIEW

The main URL for the public database is http://flyRNAi.org/cgi-bin/RNAi_screens.pl. This page has four major parts: a menu bar to the left, a link to the Gene Lookup Page (Search for Genes in Public Screens), a list of public screens below it for which all data are accessible, and a list of ongoing screens for which the data are kept confidential until the time of publication or the 2 year limit after their completion, whichever comes first (Figure 1).
The menu bar provides links to a number of informational resources. The ‘About Us’ link provides general information about the DRSC, such as personnel, location, equipment, funding and DRSC news. The ‘Screening’ header opens up a ‘how-to’ section and summarizes the current protocols in use at the DRSC to conduct RNAi screens in the 384-well plate format (1). The ‘Applications’ link is for scientists interested in submitting an application to come and carry out a screen at the DRSC. The ‘Literature’ and ‘Links’ pages are lists of external resources that the DRSC wishes to highlight for RNAi screeners. The ‘Tools’ section consists of a number of small bioinformatics applications that we have developed to help screeners or interested scientists search for and manage information displayed in the Data section of the database. It also offers links to other databases with the purpose of providing additional information on particular genes or gene function.

The published screens page is organized by screen publication date, with the most recent screens listed first. Each screen entry consists of the title and the authors of the screen. A PDF file of the publication, as well as the supplementary data (when available), is included whenever possible. One or more direct links are provided to access the raw data, with a listing of the dsRNAs (or their corresponding amplicons) found to have an effect in the assay under study.

Immediately below the record of published screens (6–13), which is updated as appropriate, is a list of screens that have just been completed at the DRSC. These screens are yet to be fully analyzed and have not yet reached the stage of publication. For each unpublished screen listed, the title and a brief summary of the screen are given along with details about the scientists involved in the screen (names, academic affiliation, etc.). A contact email address is provided for each screen.
EXPERIMENTAL DATA INTERFACE

Each screen has links to one or more listings of the amplicons for which phenotypic data were entered during that assay. Multiple listings appear in cases when several different assays were combined in the screen, such as when two different cell lines are screened for comparison. Each listing is organized in rows representing DRSC amplicons. The data being displayed are fully configurable by the user (Figure 2).
The user can use search criteria to display a subset of rows, or use check boxes to indicate the rows to display when a redisplay button is pressed. More significantly, the user can sort the data, based on any displayed column of information, by clicking the column header. The set of columns that is displayed is likewise configurable via a selection menu at the bottom of the page, which offers a choice of over 40 columns of information per row (Table 1). When the page displays data to the user's liking, the displayed contents of the page can be saved as a tab-delimited text file, to be imported into the user's spreadsheet program of choice.
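For readers who prefer to work with a saved file rather than the web page, the same filter, sort, column-selection and export steps are easy to reproduce locally. The sketch below is an illustration under stated assumptions: the column names (`amplicon`, `gene`, `score`) and all values are made up for the example and are not the column set listed in Table 1.

```python
import csv
import io


def select_rows(rows, predicate=None, sort_by=None, columns=None, descending=False):
    """Filter, sort and project a list of row dictionaries."""
    selected = [r for r in rows if predicate is None or predicate(r)]
    if sort_by is not None:
        # Sort before projecting so the sort column need not be displayed.
        selected.sort(key=lambda r: r[sort_by], reverse=descending)
    if columns is not None:
        selected = [{c: r[c] for c in columns} for r in selected]
    return selected


def to_tsv(rows):
    """Serialize rows as tab-delimited text with a header line."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up rows, for illustration only.
example_rows = [
    {"amplicon": "AMP-0003", "gene": "geneC", "score": 1.2},
    {"amplicon": "AMP-0001", "gene": "geneA", "score": 3.4},
    {"amplicon": "AMP-0002", "gene": "geneB", "score": 2.8},
]
view = select_rows(
    example_rows,
    predicate=lambda r: r["score"] > 2.0,
    sort_by="score",
    descending=True,
    columns=["amplicon", "score"],
)
tsv_text = to_tsv(view)
```

The resulting `tsv_text` can be written to a file and opened in a spreadsheet program, mirroring the tab-delimited export offered by the interface.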
For further details about any row, the user can click the triangle icon at the start of the row to go to the amplicon detail page for that amplicon.

INFORMATIC TOOLS

This page provides a link to the Gene and Amplicon Lookup page (Figure 3).
The top section of the results page shows whether a particular gene and its related amplicon(s) have been identified in a public screen as ‘hit’(s). If so, the targeting amplicon is listed with the name of the screen, the phenotype and the screener's evaluation of the strength of that hit (see also Table 1). Below that, there is a section for each amplicon in the DRSC library that targets that gene. It is broken into subsections detailing general information about the amplicon (predicted sequence, primer sequences, length, etc.), genomic position, detailed information about the gene, links to other databases and some historical information relating to the creation of the amplicon.

An additional important tool is the off-target sequence search tool. As in the mammalian field, off-target effects caused by siRNAs are emerging as a significant concern in Drosophila (14,15), and potential off-targets associated with dsRNAs for Caenorhabditis elegans have been annotated at RNAiDB (16). An initial review of the data at the DRSC confirms that off-target effects do happen in Drosophila and need to be taken into account when interpreting knock-down data obtained with long dsRNAs (M. Booker, S. Silver, M. Kulkarni, A. Friedman, N. Perrimon and B. Mathey-Prevot, manuscript in preparation). As discussed earlier, long dsRNAs (typically 400 nt) are processed to 21–23 nt siRNAs by the Dicer protein. Dicer does not appear to have a sequence preference for where processing will occur, and as a result we do not know in advance which and how many siRNAs are produced from any particular dsRNA. However, we can check any possible 21mer sequence that is included in a given dsRNA (or within its corresponding amplicon) for a possible match with other mRNAs which are not the intended target. Ideally, only one match corresponding to the targeted mRNA should be found.
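The 21mer check described above can be illustrated with a short sketch. This is not the DRSC algorithm: it is a plain exact-match lookup against a hash index of transcript k-mers, it considers only the strand as given (no reverse complement), and the function names and toy sequences are hypothetical.

```python
from collections import defaultdict


def kmers(sequence, k=21):
    """Yield (position, k-mer) for every window of length k in the sequence."""
    sequence = sequence.upper()
    for i in range(len(sequence) - k + 1):
        yield i, sequence[i:i + k]


def build_index(transcripts, k=21):
    """Map each k-mer to the set of genes whose transcript contains it."""
    index = defaultdict(set)
    for gene, seq in transcripts.items():
        for _, kmer in kmers(seq, k):
            index[kmer].add(gene)
    return index


def off_target_hits(dsrna, intended_gene, index, k=21):
    """Return {gene: [positions in dsrna]} for exact k-mer matches to other genes."""
    hits = defaultdict(list)
    for pos, kmer in kmers(dsrna, k):
        for gene in index.get(kmer, ()):
            if gene != intended_gene:
                hits[gene].append(pos)
    return dict(hits)


# Toy data with a short window (k=5) so the example stays readable.
toy_transcripts = {
    "target": "ACGTACGTTTGACCA",
    "other": "GGGGTTGACCCCC",
    "clean": "AAAAAAAAAA",
}
toy_index = build_index(toy_transcripts, k=5)
toy_hits = off_target_hits("ACGTTTGACC", "target", toy_index, k=5)
# toy_hits == {"other": [4, 5]}: two windows of the dsRNA also occur in "other".
```

An empty result corresponds to the ideal case in the text, where the only match is the intended target.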
To facilitate this search, we have developed a bioinformatics tool based on our own faster algorithm, somewhat similar to that published by Arziman et al. (17) except that it does not have a built-in primer design component. Our off-target search tool allows a user to provide one or more DNA sequences in FASTA format and search those sequences for predicted off-targets among all fly gene transcripts. The user can specify an off-target length (16–50 bp) with a default value of 21 bp. A color-coded map of gene matches for a given sequence is returned to the user. The map shows regions of the submitted sequences that are devoid of predicted matches with genes other than the intended target (no off-target) as well as stretches which do have matches with off-target genes. The intended or primary target is determined based on a match over all or most of the length of the submitted sequence. The tool also reports the number of off-target genes.

The sequence extraction tool allows users to retrieve multiple FASTA-formatted DRSC amplicon sequences. It can also be used to retrieve fly gene sequences.

The FlyBase Identifier retrieval tool allows the user to do a batch query for FlyBase FBgn identifiers by giving a list of fly gene symbols, names, synonyms and CG accessions.

The genetic interactions tool allows the user to construct a graph of genes of interest for their reported genetic interactions based on data stored at FlyBase. The user may query for these by submitting one or more gene symbols, synonyms or FBgn identifiers.

The Screen Analysis Tutorial is a series of web pages that provides screen analysis assistance to DRSC screeners as well as to public users interested in mining the data available in our database. This guide offers the user multiple resources and some graphical approaches to help integrate and explore relational associations between DRSC hits and other gene function, ontology or expression data sources.
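As a rough picture of what the genetic interactions tool described above produces, an interaction graph can be built from a list of gene pairs and then restricted to the genes a user queries. The sketch below makes several assumptions: interactions are treated as undirected pairs, the gene names are placeholders rather than FlyBase records, and the function names are hypothetical.

```python
from collections import defaultdict


def build_interaction_graph(interactions):
    """Build an undirected adjacency map from (gene, gene) pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def edges_for(graph, query_genes):
    """Return the sorted, de-duplicated edges that touch any queried gene."""
    edges = set()
    for gene in query_genes:
        for partner in graph.get(gene, ()):
            # Sort each pair so (a, b) and (b, a) collapse to one edge.
            edges.add(tuple(sorted((gene, partner))))
    return sorted(edges)


# Placeholder pairs, for illustration only.
example_pairs = [("geneA", "geneB"), ("geneB", "geneC"), ("geneD", "geneE")]
example_graph = build_interaction_graph(example_pairs)
example_edges = edges_for(example_graph, ["geneB"])
# example_edges == [("geneA", "geneB"), ("geneB", "geneC")]
```

Resolving symbols and synonyms to FBgn identifiers before building the graph, as the identifier retrieval tool does in batch, would keep the same gene from appearing under several names.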
FUTURE DEVELOPMENTS

The DRSC database/interface is constantly evolving and new experimental data accrue at a pace of ~20–30 genome-wide screens a year. The DRSC library of amplicons (and dsRNAs) is also evolving as new Drosophila genome annotations come online, and new insights about the specificity of our dsRNAs come to our attention (e.g. the need to replace amplicons associated with off-target effects). We are committed to the idea that our database be an important resource for public data mining and will make available all screen data as soon as it is permitted by the general agreement signed between the screeners and the DRSC. We will work to provide additional bioinformatic tools and search capabilities to enhance our current database, and we welcome any suggestion or collaboration to improve the integration of our database with others. Finally, we encourage comments to make our database more useful to scientists and hope that similar databases will be created to collect information from RNAi screens in mammalian cells.

Acknowledgments

The authors of this paper would like to thank Carolyn Shamu and Tim Mitchison of The Institute of Chemistry and Cell Biology (ICCB) at Harvard Medical School and Erik Brauner of the Broad Institute for all their help and guidance in setting up the early phase of the DRSC database. We would also like to thank Sara Cherry, Ramanuj Dasgupta, Kent Nybakken, Adam Friedman, Jennifer Philips and the rest of the Perrimon lab for their helpful suggestions on the web interface and feedback on the features of the database. This work was supported by grant R01 GM067761 from the National Institute of General Medical Sciences. N.P. is a Howard Hughes Medical Institute investigator. Funding to pay the Open Access publication charges for this article was provided by the NIGMS grant listed above.

Conflict of interest statement. None declared.

REFERENCES

1. Armknecht S., Boutros M., Kiger A., Nybakken K., Mathey-Prevot B., Perrimon N. High-throughput RNA interference screens in Drosophila tissue culture cells. Methods Enzymol. 2005;392:55–73.
2. Clemens J.C., Worby C.A., Simonson-Leff N., Muda M., Maehama T., Hemmings B.A., Dixon J.E. Use of double-stranded RNA interference in Drosophila cell lines to dissect signal transduction pathways. Proc. Natl Acad. Sci. USA. 2000;97:6499–6503.
3. Meister G., Tuschl T. Mechanisms of gene silencing by double-stranded RNA. Nature. 2004;431:343–349.
4. Hild M., Beckmann B., Haas S.A., Koch B., Solovyev V., Busold C., Fellenberg K., Boutros M., Vingron M., Sauer F., et al. An integrated gene annotation and transcriptional profiling approach towards the full gene content of the Drosophila genome. Genome Biol. 2003;5:R3.
5. Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., et al. The genome sequence of Drosophila melanogaster. Science. 2000;287:2185–2195.
6. Agaisse H., Burrack L.S., Philips J., Rubin E.J., Perrimon N., Higgins D.E. Genome-wide RNAi screen for host factors required for intracellular bacterial infection. Science. 2005;14:14.
7. Baeg G.H., Zhou R., Perrimon N. Genome-wide RNAi analysis of JAK/STAT signaling components in Drosophila. Genes Dev. 2005;29:29.
8. Boutros M., Kiger A.A., Armknecht S., Kerr K., Hild M., Koch B., Haas S.A., Consortium H.F., Paro R., Perrimon N. Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science. 2004;303:832–835.
9. Cherry S., Doukas T., Armknecht S., Whelan S., Wang H., Sarnow P., Perrimon N. Genome-wide RNAi screen reveals a specific sensitivity of IRES-containing RNA viruses to host translation inhibition. Genes Dev. 2005;19:445–452.
10. DasGupta R., Kaykas A., Moon R.T., Perrimon N. Functional genomic analysis of the Wnt-wingless signaling pathway. Science. 2005;308:826–833.
11. Eggert U.S., Kiger A.A., Richter C., Perlman Z.E., Perrimon N., Mitchison T.J., Field C.M. Parallel chemical genetic and genome-wide RNAi screens identify cytokinesis inhibitors and targets. PLoS Biol. 2004;2:e379.
12. Kiger A., Baum B., Jones S., Jones M., Coulson A., Echeverri C., Perrimon N. A functional genomic analysis of cell morphology using RNA interference. J. Biol. 2003;2:27.
13. Philips J.A., Rubin E.J., Perrimon N. Drosophila RNAi screen reveals CD36 family member required for mycobacterial infection. Science. 2005;14:14.
14. Qiu S., Adema C.M., Lane T. A computational study of off-target effects of RNA interference. Nucleic Acids Res. 2005;33:1834–1847.
15. Naito Y., Yamada T., Matsumiya T., Ui-Tei K., Saigo K., Morishita S. dsCheck: highly sensitive off-target search software for double-stranded RNA-mediated RNA interference. Nucleic Acids Res. 2005;33:W589–W591.
16. Gunsalus K.C., Yueh W.C., MacMenamin P., Piano F. RNAiDB and PhenoBlast: web tools for genome-wide phenotypic mapping projects. Nucleic Acids Res. 2004;32:D406–D410.
17. Arziman Z., Horn T., Boutros M. E-RNAi: a web application to design optimized RNAi constructs. Nucleic Acids Res. 2005;33:W582–W588.