Genes & Expression
Databases
- BioSystems
- Database that groups biomedical literature, small molecules, and sequence data in terms of biological relationships.
- Consensus CDS (CCDS)
- A collaborative effort to identify a core set of human and mouse protein coding regions that are consistently annotated and of high quality.
- GenBank
- The NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI. These three organizations exchange data on a daily basis. GenBank consists of several divisions, most of which can be accessed through the Nucleotide database. The exceptions are the EST and GSS divisions, which are accessed through the Nucleotide EST and Nucleotide GSS databases, respectively.
- Gene
- A searchable database of genes, focusing on genomes that have been completely sequenced and that have an active research community to contribute gene-specific data. Information includes nomenclature, chromosomal localization, gene products and their attributes (e.g., protein interactions), associated markers, phenotypes, interactions, and links to citations, sequences, variation details, maps, expression reports, homologs, protein domain content, and external databases.
- Gene Expression Nervous System Atlas (GENSAT)
- Maps the expression of genes in the central nervous system of the mouse, using both in situ hybridization and transgenic mouse techniques. The GENSAT database contains a series of images related to gene expression experiments.
- Gene Expression Omnibus (GEO) Datasets
- Stores curated gene expression and molecular abundance DataSets assembled from the Gene Expression Omnibus (GEO) repository. DataSet records contain additional resources, including cluster tools and differential expression queries.
- Gene Expression Omnibus (GEO) Profiles
- Stores individual gene expression and molecular abundance Profiles assembled from the Gene Expression Omnibus (GEO) repository. Search for specific profiles of interest based on gene annotation or pre-computed profile characteristics.
- Genes and Disease
- Summary information for more than 80 genetic disorders with discussions of the underlying mutation(s) and clinical features, as well as links to related databases and organizations. The database is accessed through NCBI's Bookshelf.
- HomoloGene
- A gene homology tool that compares nucleotide sequences between pairs of organisms in order to identify putative orthologs. Curated orthologs are incorporated from a variety of sources via the Gene database.
- Online Mendelian Inheritance in Man (OMIM)
- Catalog of human genes and genetic disorders, with links to associated literature references, sequence records, maps, and related databases.
- Probe
- A public registry of nucleic acid reagents designed for use in a wide variety of biomedical research applications, together with information on reagent distributors, probe effectiveness, and computed sequence similarities.
- UniGene
- A database that provides sets of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.
- UniGene Library Browser
- This database contains libraries of Expressed Sequence Tags (ESTs) organized by organism, tissue type and developmental stage.
Tools
- Digital Differential Display (DDD)
- A tool for comparing EST profiles in order to identify genes with significantly different expression levels.
- E-Utilities
- Tools that provide access to data within NCBI's Entrez system outside of the regular web query interface. They provide a method of automating Entrez tasks within software applications. Each utility performs a specialized retrieval task, and can be used simply by writing a specially formatted URL.
- Gene Expression Omnibus (GEO) BLAST
- Tool for aligning a query sequence (nucleotide or protein) to GenBank sequences included on microarray or SAGE platforms in the GEO database.
- Genome Workbench
- An integrated application for viewing and analyzing sequence data. With Genome Workbench, you can view data in publicly available sequence databases at NCBI, and mix these data with your own data.
- Map Viewer
- A software component of the Genome database that provides special browsing capabilities for a subset of organisms. You can view and search an organism's complete genome, display chromosome maps, and zoom into progressively greater levels of detail, down to the sequence data for a region of interest.
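As a rough illustration of the "specially formatted URL" mentioned under E-Utilities, the sketch below builds an ESearch request against the Gene database. It only constructs the URL and sends nothing. The helper name `build_esearch_url` and the example search term are assumptions made for illustration; consult the E-Utilities documentation for the parameters and usage guidelines that apply to a real application.

```python
from urllib.parse import urlencode

# Base address of the E-Utilities services.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL for the given Entrez database and query.

    Hypothetical helper: it only assembles the query string and
    performs no network access.
    """
    params = {"db": db, "term": term, "retmax": retmax}
    # urlencode percent-escapes brackets and turns spaces into '+'.
    return EUTILS_BASE + "esearch.fcgi?" + urlencode(params)


# Illustrative query; the field tags are assumed to suit the Gene database.
url = build_esearch_url("gene", "BRCA1[Gene Name] AND human[Organism]")
print(url)
```

Opening the printed URL in a browser, or fetching it from a script, should return an XML document listing matching record identifiers, which can then be passed to other utilities such as ESummary or EFetch.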
Downloads
- FTP: Gene
- This site contains three directories: DATA, GeneRIF and tools. The DATA directory contains files listing all data linked to GeneIDs along with subdirectories containing ASN.1 data for the Gene records. The GeneRIF (Gene References into Function) directory contains PubMed identifiers for articles describing the function of a single gene or interactions between products of two genes. Sample programs for manipulating gene data are provided in the tools directory. Please see the README file for details.
- FTP: Gene Expression Omnibus (GEO) Profiles and Datasets
- This site contains GEO data in two formats: SOFT (Simple Omnibus Format in Text) and MINiML (MIAME Notation in Markup Language). Summary text files and supplementary data are also available. Please see the README.TXT file for more information.
- FTP: GENSAT
- This site contains GENSAT image data organized by gene and contributing institution.
- FTP: HomoloGene
- This site contains data for each build of HomoloGene, beginning with build 35. Complete data for each build are provided in XML, and a data summary is provided in tab-delimited text format.
- FTP: UniGene
- This site contains individual directories for each organism with data in UniGene. The data for each species include the unique sequence for each UniGene cluster, all sequences in each cluster in FASTA format, and library information for the cluster. See the README file for further details.
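Several of the downloads above are distributed as FASTA-formatted sequence files. The following is a minimal sketch of reading such a file in Python. It assumes plain, uncompressed FASTA text, and the sample records are invented for illustration rather than taken from any NCBI file. The function name `read_fasta` is likewise hypothetical.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may wrap across rows
    if header is not None:
        yield header, "".join(chunks)


# Invented sample data standing in for a downloaded file.
sample = io.StringIO(
    ">seq1 hypothetical record\nACGT\nTTGA\n"
    ">seq2 hypothetical record\nGGCC\n"
)
records = list(read_fasta(sample))
for name, seq in records:
    print(name, len(seq))
```

For a real download, replace the `io.StringIO` sample with a file opened in text mode. Header layouts differ between resources, so check the relevant README before parsing fields out of the header line.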
Submissions
- Gene Expression Omnibus (GEO) Web Deposit
- Submit expression data, such as microarray, SAGE, or mass spectrometry datasets, to the NCBI Gene Expression Omnibus (GEO) database.
- GeneRIF
- GeneRIF provides a simple mechanism to allow scientists to add to the functional annotation of genes in the Gene database.