Nucleic Acids Res. 2006 Jan 1; 34(Database issue). PMCID: PMC1347432

LOCATE: a mouse protein subcellular localization database
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions/at/oxfordjournals.org
Abstract
We present here LOCATE, a curated, web-accessible database that houses data describing the membrane organization and subcellular localization of proteins from the FANTOM3 Isoform Protein Sequence set. Membrane organization is predicted by the high-throughput, computational pipeline MemO. The subcellular locations of selected proteins from this set were determined by a high-throughput, immunofluorescence-based assay and by manually reviewing >1700 peer-reviewed publications. LOCATE represents the first effort to catalogue the experimentally verified subcellular location and membrane organization of mammalian proteins using a high-throughput approach and provides localization data for ~40% of the mouse proteome. It is available at http://locate.imb.uq.edu.au.
INTRODUCTION
Determining the membrane organization and the subcellular location of a protein is essential to understanding its biochemical function. A cell is divided into different cellular compartments and each compartment is associated with a different range of biochemical processes; by localizing a protein to a specific compartment, or set of compartments, the cellular role of the protein can be inferred. This information can provide insight into the functions of hypothetical or novel proteins and can provide a more specific organellar context in which to investigate a particular protein. Historically, these data have been difficult to produce on a large scale for higher eukaryotic organisms. However, recent advances in membrane organization prediction methods and high-throughput subcellular localization assays have made it possible to generate these datasets. We used high-throughput methods to predict the membrane organization for the entire mouse proteome and to determine the subcellular localization of a subset of the proteome. We then developed a database, LOCATE, to organize and warehouse these data.
DATABASE CONTENT
Dataset
The mouse proteome dataset we used was the FANTOM3 Isoform Protein Sequence set (IPS7) generated by the RIKEN FANTOM Consortium (1). This dataset comprises protein sequences based on transcript sequences generated from direct sequencing of full-length transcripts. The sequenced transcripts were clustered into transcriptional units (TUs) where a TU is a grouping of transcripts that arise from a single genomic locus and share at least one nucleotide having the same genomic location and orientation. The IPS7 dataset contains 33 451 protein sequences encoded by 19 853 TUs.
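The TU definition can be read as single-linkage grouping of transcripts whose exons overlap on the same chromosome and strand. The following is a minimal sketch under that reading, not the FANTOM3 clustering procedure itself; the record layout, the function name `cluster_into_tus` and the transcript identifiers are all hypothetical.

```python
from collections import defaultdict


def cluster_into_tus(transcripts):
    """Group transcripts that share at least one exonic position on the same strand.

    transcripts: dict mapping a transcript id to (chrom, strand, exons), where
    exons is a list of (start, end) pairs with inclusive coordinates.
    This layout is an assumption made for illustration.
    """
    parent = {tid: tid for tid in transcripts}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Bucket exons by (chromosome, strand) so that orientation is respected.
    exons_by_locus = defaultdict(list)
    for tid, (chrom, strand, exons) in transcripts.items():
        for start, end in exons:
            exons_by_locus[(chrom, strand)].append((start, end, tid))

    # Sweep each bucket in coordinate order and link overlapping exons.
    for exons in exons_by_locus.values():
        exons.sort()
        run_end, run_tid = None, None
        for start, end, tid in exons:
            if run_end is not None and start <= run_end:
                union(run_tid, tid)
                run_end = max(run_end, end)
            else:
                run_end, run_tid = end, tid

    groups = defaultdict(set)
    for tid in transcripts:
        groups[find(tid)].add(tid)
    return sorted(sorted(group) for group in groups.values())


example = {
    "t1": ("chr1", "+", [(100, 200), (300, 400)]),
    "t2": ("chr1", "+", [(350, 500)]),   # overlaps the second exon of t1
    "t3": ("chr1", "-", [(150, 250)]),   # same region, opposite strand
    "t4": ("chr1", "+", [(201, 299)]),   # lies entirely within the intron of t1
}
tus = cluster_into_tus(example)
```

In this toy input only `t1` and `t2` end up in the same TU; `t3` is kept apart by orientation and `t4` because it shares no exonic nucleotide with `t1`.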
Membrane organization
Protein orientation with respect to the membrane was predicted by MemO, a high-throughput, automated pipeline, which combines publicly available feature predictors with empirically determined annotation rules (1,2) (M. J. Davis, F. Clark, J. L. Fink, Z. Yuan, F. Zhang, T. Kasukawa, Y. Hayashizaki, P. Carninci and R. D. Teasdale, manuscript in preparation). The pipeline is described briefly here.
Prediction of signal peptides was performed by a local implementation of SignalP v2.0 (3) and by the Australian National Genomic Information Service (ANGIS, http://biomanager.angis.org.au) version of SPScan. A protein was predicted to contain a signal peptide if the averaged and normalized raw output scores from both methods exceeded a threshold identified to maximize the proportions of true positives and true negatives on a training set.
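The combination rule described above can be sketched as follows. This is an illustration only: the score ranges used for normalization, the function names and the use of sensitivity plus specificity as the quantity to maximize are assumptions, since the text does not give the normalization or the exact objective used by MemO.

```python
def normalize(score, lo, hi):
    """Min-max scale a raw score into [0, 1], clipping values outside [lo, hi]."""
    return min(1.0, max(0.0, (score - lo) / (hi - lo)))


def combined_score(signalp_raw, spscan_raw,
                   signalp_range=(0.0, 1.0), spscan_range=(0.0, 20.0)):
    """Average the two normalized scores. Both ranges are placeholder assumptions."""
    return (normalize(signalp_raw, *signalp_range)
            + normalize(spscan_raw, *spscan_range)) / 2.0


def has_signal_peptide(score, threshold):
    """A protein is called positive when the combined score exceeds the threshold."""
    return score > threshold


def choose_threshold(training_set):
    """Pick the cutoff that maximizes sensitivity + specificity.

    training_set: list of (combined_score, truly_has_signal_peptide) pairs
    containing at least one positive and one negative example.
    """
    positives = sum(1 for _, label in training_set if label)
    negatives = len(training_set) - positives
    best_cutoff, best_value = None, -1.0
    for cutoff in sorted({score for score, _ in training_set}):
        true_pos = sum(1 for s, label in training_set if label and s > cutoff)
        true_neg = sum(1 for s, label in training_set if not label and s <= cutoff)
        value = true_pos / positives + true_neg / negatives
        if value > best_value:
            best_cutoff, best_value = cutoff, value
    return best_cutoff


# Toy training data with hypothetical scores and labels.
training = [(0.9, True), (0.8, True), (0.6, True),
            (0.5, False), (0.4, False), (0.3, False)]
threshold = choose_threshold(training)
```

Candidate cutoffs are taken from the observed scores, which is enough to find an optimum on the training set without scanning a continuous range.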
α-Helical transmembrane domain prediction was performed by a consensus method consisting of five currently available predictors: HMMTOP (4), TMHMM v2.0 (5), SVMTM v3.0 (6), MEMSAT (7) and DAS (8). A protein was said to contain a transmembrane domain if at least 7, but no more than 42, consecutive residues in the protein (ignoring a gap of <4 residues) were predicted to participate in a transmembrane domain by at least three of the five predictors.
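A minimal sketch of this consensus rule is given below. It assumes each predictor's output has been converted to a per-residue string in which `M` marks a predicted membrane residue, and it interprets "ignoring a gap of <4 residues" as bridging runs separated by at most three non-consensus residues, with the bridged residues counted toward the segment length. Both the input encoding and that interpretation are assumptions, not details given in the text.

```python
def consensus_tm_segments(predictions, min_votes=3, min_len=7, max_len=42, max_gap=3):
    """Return consensus transmembrane segments as (start, end) pairs.

    predictions: list of equal-length strings, one per predictor, where 'M'
    marks a residue predicted to lie in a transmembrane domain.
    Indices are 0-based and inclusive.
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same sequence length")

    # A residue is a consensus residue when enough predictors agree on it.
    mask = [sum(p[i] == "M" for p in predictions) >= min_votes for i in range(length)]

    # Collect maximal runs of consensus residues.
    runs, start = [], None
    for i, flag in enumerate(mask + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append([start, i - 1])
            start = None

    # Bridge runs separated by a short gap.
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] - 1 <= max_gap:
            merged[-1][1] = run[1]
        else:
            merged.append(run)

    # Keep only segments whose length falls in the accepted window.
    return [(s, e) for s, e in merged if min_len <= e - s + 1 <= max_len]


tm = "-" * 5 + "M" * 10 + "-" * 2 + "M" * 8 + "-" * 15   # 40 residues
blank = "-" * 40
segments = consensus_tm_segments([tm, tm, tm, blank, blank])
```

Here three of five hypothetical predictors agree, the two-residue gap is bridged, and a single 20-residue segment is reported.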
The prediction of the absence or presence of the signal peptide and transmembrane domain provided a classification into one of five categories of membrane organization:
- soluble intracellular proteins (no transmembrane domains or signal peptide);
- soluble secreted proteins (signal peptide, no transmembrane domains);
- type I membrane proteins (one transmembrane domain, signal peptide) (9);
- type II membrane proteins (one transmembrane domain, no signal peptide) (9);
- multi-pass membrane proteins (multiple transmembrane domains) (9).
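The five categories above follow directly from two predicted features, which can be written as a small decision function. The function name and the category strings are chosen here for illustration and are not MemO identifiers.

```python
def classify_membrane_organization(has_signal_peptide, n_tm_domains):
    """Map predicted features to one of the five membrane organization classes."""
    if n_tm_domains < 0:
        raise ValueError("number of transmembrane domains cannot be negative")
    if n_tm_domains == 0:
        return "soluble secreted" if has_signal_peptide else "soluble intracellular"
    if n_tm_domains == 1:
        return "type I membrane" if has_signal_peptide else "type II membrane"
    # Two or more transmembrane domains, with or without a signal peptide.
    return "multi-pass membrane"
```

For example, `classify_membrane_organization(True, 1)` yields the type I class, while any protein with two or more predicted transmembrane domains is treated as multi-pass regardless of its signal peptide status.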
We applied this pipeline to the 33 451 protein sequences in the IPS7 dataset and identified 5116 (~15%) proteins containing signal peptides, and 8238 (~25%) proteins containing transmembrane domains. These proteins were then allocated to the five membrane organization categories based on combinations of those features. The class breakdown of proteins is shown in Table 1.
Table 1
Subcellular localization
Proteins were selected for experimentation based on clone availability and the extent of previous characterization of their subcellular localization. When selecting multi-pass membrane proteins, only those without a predicted ER signal peptide were chosen. Expression constructs encoding each gene of interest with an N-terminal myc tag were generated using a modified overlapping PCR methodology originally reported by Suzuki et al. (10). The expressed protein was detected in fixed, transfected HeLa cells by indirect immunofluorescence, and representative images were collected and analyzed to determine the protein's subcellular localization. To date, experimental subcellular localization data have been generated for 417 of these selected proteins and localization data based on primary literature review have been gathered for 1752 TUs.
Both the experimental and literature-mined localization data were manually examined and evaluated for sufficient quality prior to addition to the database. When evaluating literature-mined localization data, only papers describing the localization of full-length proteins in individual mammalian cells in which the protein is detected directly were included in our analysis. These peer-reviewed observations were not reinterpreted. However, some observations were excluded when they were not considered to be of sufficient quality.
Because it was not always possible to determine to which protein isoform the literature data referred, we assigned the literature-mined location to all protein isoforms encoded by the corresponding TU. Table 1 summarizes the subcellular localization statistics by membrane organization class.
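The assignment of a literature-mined location to every isoform of a TU can be sketched as a simple propagation step. The data layout, function name and identifiers below are hypothetical and only illustrate the rule stated above.

```python
from collections import defaultdict


def propagate_to_isoforms(tu_to_isoforms, tu_locations):
    """Copy each TU-level literature location onto all isoforms of that TU.

    tu_to_isoforms: dict mapping a TU id to a list of protein isoform ids.
    tu_locations:   dict mapping a TU id to a set of location terms.
    Returns a dict mapping isoform id to the set of inherited locations.
    """
    isoform_locations = defaultdict(set)
    for tu, locations in tu_locations.items():
        for isoform in tu_to_isoforms.get(tu, ()):
            isoform_locations[isoform].update(locations)
    return dict(isoform_locations)


# Hypothetical identifiers for illustration only.
tu_to_isoforms = {"TU1": ["iso1a", "iso1b"], "TU2": ["iso2a"]}
tu_locations = {"TU1": {"Golgi apparatus"}}
annotated = propagate_to_isoforms(tu_to_isoforms, tu_locations)
```

Isoforms of TUs without literature data, such as `iso2a` here, simply receive no annotation from this step.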
To provide as complete a location description as possible for any given protein, we also include localization data mined from other online databases including LIFEdb (11), Mouse Genome Informatics (12), UniProt (13), RefSeq (14) and others. A total of 7410 TUs and 11 353 protein isoforms are annotated with these data. In total, we have localization data for 8017 TUs and 12 598 protein isoforms representing 41 and 37% of the IPS7 set, respectively.
Data presentation
General information
Information in LOCATE is displayed as a web page which describes a particular protein entry in detail. The page is divided into sections which summarize several types of data. The top of the page contains a summary of the MemO classification and the subcellular localization of the protein as well as associated metadata provided by FANTOM3 annotations such as the protein identifier, a functional description, protein name synonyms, the source organism and links to other databases which also contain this protein.
Transmembrane topology and predicted domains
Knowing what functional domains and motifs exist in a protein is extremely useful when attempting to decipher the cellular role of the protein. We have generated predictions of Pfam and SCOP domains for all proteins in the database and have displayed the predicted domains on a graphical protein schematic diagram alongside the membrane organization data (Figure 1). The presence and position of certain domains in relation to predicted transmembrane domains can provide insights into the validity of the functional annotation of the protein (if one exists) as well as the validity or range of the transmembrane domain prediction.

Figure 1
Subcellular location data
If a protein entry has high-throughput subcellular localization data, we display the subcellular location(s) in which that particular protein isoform was observed and a high-resolution fluorescence image which best illustrates the observed localization. Information about the experimental conditions such as the cell type and epitope used in the localization assays is also displayed. If a protein entry has subcellular localization data mined from literature, we display the determined subcellular location(s), the PubMed ID, and a full citation of the data source.
Controlled vocabulary
Consistent naming of subcellular locations is critical to the integrity and extensibility of the LOCATE data. Therefore, we have constructed a controlled vocabulary which describes both experimentally determined and literature-mined subcellular locations. In the case of high-throughput experimental subcellular localization assays, it is not always possible to determine the exact cellular compartment to which the protein is observed to localize. To address this problem, our controlled vocabulary contains a hierarchical set of terms that allows the call to be only as specific as the data allow. This system also reflects the confidence of the localization call; use of a very specific term implies higher confidence. Some proteins have been observed to localize to more than one subcellular compartment; in these cases, we allow the use of multiple terms to describe the observed locations. When mining subcellular localization data from the literature, we use terms that allow for different levels of location resolution and for cellular components that are specific to cells with a lineage or morphology that differs from the model cells used in our experiments. In both vocabularies, we use Gene Ontology (15) terms to describe subcellular locations whenever possible (see the LOCATE website for more details).
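A hierarchical vocabulary of this kind can be modeled as a parent map in which a deeper term is a more specific call. The sketch below uses a handful of Gene Ontology cellular component names arranged in a deliberately simplified tree; it is not the LOCATE vocabulary, and the function names are illustrative.

```python
# Simplified, hypothetical hierarchy: child term -> parent term.
PARENT = {
    "cytoplasm": "cell",
    "nucleus": "cell",
    "endoplasmic reticulum": "cytoplasm",
    "Golgi apparatus": "cytoplasm",
    "nucleolus": "nucleus",
}
ROOT = "cell"


def lineage(term):
    """Return the path from a term up to the root, most specific term first."""
    if term != ROOT and term not in PARENT:
        raise KeyError(f"{term!r} is not in the controlled vocabulary")
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def specificity(term):
    """Depth below the root; a deeper term expresses a more specific call."""
    return len(lineage(term)) - 1


def annotated_with(annotation, query):
    """True if any term in the annotation is the query term or one of its descendants."""
    return any(query in lineage(term) for term in annotation)


# A protein observed in two compartments carries two terms.
observed = {"Golgi apparatus", "nucleus"}
```

With this structure a search for `cytoplasm` also returns proteins annotated with the more specific `Golgi apparatus`, while an ambiguous image can still be recorded at the coarser level.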
Observed spliced isoforms
For each protein in the database, we display a list of all proteins that belong to the same TU to allow comparisons between each of the observed protein isoforms. Specifically, we display the membrane organization and length of each isoform on a splicing graph which illustrates the observed exons and the various alternate splice forms for that particular TU (Figure 2). These graphs enable analysis of the pattern of membrane organization variation within the observed protein isoforms and examination of the possible effects of alternative splicing on membrane organization. The graphs were generated by a customized version of the Splicing Graph Module (16).

Figure 2
Data accessibility
This database does not seek to duplicate information contained in other databases unless it is particularly useful when viewed in juxtaposition with the subcellular localization or membrane organization data. However, we understand the value of convenient data accessibility and provide links to offsite resources such as SymAtlas (17), GenBank (18), RIKEN (1), MGI (19), READ (20), Pfam (21), SCOP (22), UniProt (13), OMIM (23), Entrez Gene (24), BIND (25), the GeneNetwork (26) and the Mouse Retrovirus Tagged Cancer Gene Database (RTCGD) (20) where applicable.
Because the major aim of this database effort is to present protein subcellular location data and the predicted membrane organization of the protein, these two features are the primary search mechanisms; proteins can be retrieved by protein class, subcellular localization or both. Alternatively, individual protein entries can be retrieved by searching the database with a protein ID (RIKEN clone/IPS ID, GenBank accession number, Entrez Gene ID), by protein name, by Pfam or SCOP accession number, or by functional description. BLAST searches against the database, and subsets of the database, are also available. The BLAST results are enhanced to display the membrane organization of the hits. We also offer a number of batch data retrieval options. The proteins in any given search can be retrieved as FASTA-formatted protein or transcript sequences, subcellular localization data, membrane organization data or protein schematics. XML-marked-up documents containing these data can also be obtained.
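The two primary search keys and the FASTA batch export can be illustrated with a small in-memory model. This is a sketch of the idea only: the `Entry` fields, function names, FASTA header layout and example records are assumptions and do not reflect the LOCATE schema or its web interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    protein_id: str
    membrane_class: str
    locations: frozenset
    sequence: str


def search(entries, membrane_class=None, location=None):
    """Filter entries by protein class, subcellular location, or both."""
    return [
        e for e in entries
        if (membrane_class is None or e.membrane_class == membrane_class)
        and (location is None or location in e.locations)
    ]


def to_fasta(entries, width=60):
    """Serialize entries as FASTA text with sequence lines wrapped at `width`."""
    lines = []
    for e in entries:
        lines.append(f">{e.protein_id} class={e.membrane_class}")
        lines.extend(e.sequence[i:i + width] for i in range(0, len(e.sequence), width))
    return "".join(line + "\n" for line in lines)


# Hypothetical records for illustration only.
db = [
    Entry("P1", "type I membrane", frozenset({"Golgi apparatus"}), "MKTLLAV"),
    Entry("P2", "soluble intracellular", frozenset({"nucleus"}), "MSDE"),
    Entry("P3", "type I membrane", frozenset({"plasma membrane"}), "MAAGR"),
]
hits = search(db, membrane_class="type I membrane", location="Golgi apparatus")
fasta = to_fasta(hits, width=5)
```

Leaving either argument as `None` searches on the other key alone, which mirrors the "protein class, subcellular localization or both" options described above.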
CONCLUSIONS
LOCATE represents a significant contribution to the biological research community by organizing and presenting membrane organization and subcellular localization data for the mouse proteome. The LOCATE search interface allows users to retrieve data and sets of data using several different approaches. The interface to individual proteins was designed to maximize ease of interpretation by providing summaries or visualizations that contain the most relevant points of data; links are provided to the raw data or other details that are necessary for a careful evaluation of the experimental results. LOCATE data can be retrieved as individual entries or downloaded as HTML, plain text or XML files.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
Acknowledgments
The authors would like to acknowledge Nicholas Hamilton for implementing DomainDraw, the domain drawing program; Robert Luetterforst for assistance with the literature mining; and Emma Redhead for designing the LOCATE XML schema and XML document generator. The work was supported by funds from the Australian Research Council (ARC) and by the Research Grant for the RIKEN Genome Exploration Research Project from the Ministry of Education, Culture, Sports, Science and Technology of the Japanese Government to Y.H., and the Research Grant for the Genome Network Project from the Ministry of Education, Culture, Sports, Science and Technology of the Japanese Government. R.D.T. is supported by a National Health and Medical Research Council of Australia R. Douglas Wright Career Development Award. R.N.A. is supported by a Postgraduate Research Scholarship from the IMB, University of Queensland. M.J.D. is supported by the National Institute for Diabetes, Digestion and Kidney Disease, National Institutes of Health (DK63400) as part of the Stem Cell Genome Anatomy Project (http://www.scgap.org/). Funding to pay the Open Access publication charges for this article was provided by University of Queensland and Australian Research Council.
Conflict of interest statement. None declared.
REFERENCES
Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
1. Carninci P, Kasukawa T, Katayama S, et al.; FANTOM Consortium; RIKEN Genome Exploration Research Group and Genome Science Group (Genome Network Project Core Group). The transcriptional landscape of the mammalian genome. Science. 2005 Sep 2; 309(5740):1559-63.
2. Kanapin A, Batalov S, Davis MJ, Gough J, Grimmond S, Kawaji H, Magrane M, Matsuda H, Schönbach C, Teasdale RD, Yuan Z; RIKEN GER Group; GSL Members. Mouse proteome analysis. Genome Res. 2003 Jun; 13(6B):1335-44.
4. Tusnády GE, Simon I. The HMMTOP transmembrane topology prediction server. Bioinformatics. 2001 Sep; 17(9):849-50.
5. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol. 2001 Jan 19; 305(3):567-80.
7. Jones DT, Taylor WR, Thornton JM. A model recognition approach to the prediction of all-helical membrane protein structure and topology. Biochemistry. 1994 Mar 15; 33(10):3038-49.
8. Cserzö M, Wallin E, Simon I, von Heijne G, Elofsson A. Prediction of transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment surface method. Protein Eng. 1997 Jun; 10(6):673-6.
9. Goder V, Spiess M. Topogenesis of membrane proteins: determinants and dynamics. FEBS Lett. 2001 Aug 31; 504(3):87-93.
10. Suzuki H, Fukunishi Y, Kagawa I, Saito R, Oda H, Endo T, Kondo S, Bono H, Okazaki Y, Hayashizaki Y. Protein-protein interaction panel using mouse full-length cDNAs. Genome Res. 2001 Oct; 11(10):1758-65.
11. Bannasch D, Mehrle A, Glatting KH, Pepperkok R, Poustka A, Wiemann S. LIFEdb: a database for functional genomics experiments integrating information from external sources, and serving as a sample tracking system. Nucleic Acids Res. 2004 Jan 1; 32(Database issue):D505-8.
12. Eppig JT, Bult CJ, Kadin JA, et al.; Mouse Genome Database Group. The Mouse Genome Database (MGD): from genes to mice--a community resource for mouse biology. Nucleic Acids Res. 2005 Jan 1; 33(Database issue):D471-5.
13. Bairoch A, Apweiler R, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Natale DA, O'Donovan C, Redaschi N, Yeh LS. The Universal Protein Resource (UniProt). Nucleic Acids Res. 2005 Jan 1; 33(Database issue):D154-9.
14. Pruitt KD, Tatusova T, Maglott DR. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005 Jan 1; 33(Database issue):D501-4.
15. Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000 May; 25(1):25-9.
16. Lee BT, Tan TW, Ranganathan S. DEDB: a database of Drosophila melanogaster exons in splicing graph form. BMC Bioinformatics. 2004 Dec 7; 5:189.
17. Su AI, Cooke MP, Ching KA, Hakak Y, Walker JR, Wiltshire T, Orth AP, Vega RG, Sapinoso LM, Moqrich A, Patapoutian A, Hampton GM, Schultz PG, Hogenesch JB. Large-scale analysis of the human and mouse transcriptomes. Proc Natl Acad Sci U S A. 2002 Apr 2; 99(7):4465-70.
18. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank. Nucleic Acids Res. 2005 Jan 1; 33(Database issue):D34-8.
19. Blake JA, Richardson JE, Bult CJ, Kadin JA, Eppig JT; Mouse Genome Database Group. MGD: the Mouse Genome Database. Nucleic Acids Res. 2003 Jan 1; 31(1):193-5.
20. Akagi K, Suzuki T, Stephens RM, Jenkins NA, Copeland NG. RTCGD: retroviral tagged cancer gene database. Nucleic Acids Res. 2004 Jan 1; 32(Database issue):D523-7.
21. Bateman A, Coin L, Durbin R, Finn RD, Hollich V, Griffiths-Jones S, Khanna A, Marshall M, Moxon S, Sonnhammer EL, Studholme DJ, Yeats C, Eddy SR. The Pfam protein families database. Nucleic Acids Res. 2004 Jan 1; 32(Database issue):D138-41.
22. Andreeva A, Howorth D, Brenner SE, Hubbard TJ, Chothia C, Murzin AG. SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res. 2004 Jan 1; 32(Database issue):D226-9.
23. Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 2005 Jan 1; 33(Database issue):D514-7.
24. Maglott D, Ostell J, Pruitt KD, Tatusova T. Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res. 2005 Jan 1; 33(Database issue):D54-8.
25. Alfarano C, Andrade CE, Anthony K, et al. The Biomolecular Interaction Network Database and related tools 2005 update. Nucleic Acids Res. 2005 Jan 1; 33(Database issue):D418-24.
26. Wu CC, Huang HC, Juan HF, Chen ST. GeneNetwork: an interactive tool for reconstruction of genetic networks using microarray data. Bioinformatics. 2004 Dec 12; 20(18):3691-3.