What is dbEST?

dbEST (Nature Genetics 4:332-3;1993) is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms. A brief account of the history of human ESTs in GenBank is available (Trends Biochem. Sci. 20:295-6;1995). Also, consult the special "Genome Directory" issue of Nature (vol. 377, issue 6547S, 28 September 1995).

About ESTs

Expressed Sequence Tags (ESTs) are short (usually <1000 bp), single-pass sequence reads from mRNA (cDNA). Typically they are produced in large batches. They represent a snapshot of genes expressed in a given tissue and/or at a given developmental stage. They are tags (some coding, others not) of expression for a given cDNA library.

Additional information about ESTs can be found in:
  Boguski MS, Lowe TM, Tolstoshev CM. 1993.  dbEST--database for "expressed sequence tags." Nat Genet 4(4):332-333.

Most EST projects generate large numbers of sequences. These are commonly submitted to GenBank and dbEST as batches of dozens to thousands of entries, with a great deal of redundancy in the citation, submitter and library information. To improve the efficiency of the submission process for this type of data, we have designed a special streamlined submission process and data format.

dbEST also includes sequences that are longer than traditional ESTs, or that are produced as single sequences or in small batches. Among these sequences are products of differential display experiments and RACE experiments. What these sequences share with traditional ESTs, regardless of length, quality, or quantity, is that there is little information available to annotate in the record.

If a sequence is later characterized and annotated with biological features such as a coding region, 5'UTR, or 3'UTR, it should be submitted through the regular GenBank submissions procedure (via BankIt or Sequin), even if part of the sequence is already in dbEST.

dbEST is reserved for single-pass reads. Assembled sequences should not be submitted to dbEST. GenBank will accept assembled EST submissions for the TSA (Transcriptome Shotgun Assembly) division. Please contact gb-admin@ncbi.nlm.nih.gov for more information about submitting EST assemblies. The individual reads that make up the assembly should be submitted to dbEST, the Trace Archive, or the Short Read Archive (SRA) before the assemblies are submitted. For additional information about submitting to Trace or SRA, please see the Trace web site.

NOTE: Beginning in 2009, sequences derived from "next generation" sequencing platforms, including Roche 454, Illumina, Applied Biosystems SOLiD, and Helicos Biosciences HeliScope, should be submitted to the Short Read Archive (SRA). For information, contact sra@ncbi.nlm.nih.gov.

The following should not be included in EST submissions: mitochondrial sequences, rRNA, viral sequences, and vector sequences. Vector and linker regions should be removed from EST sequences before submission.
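The submission guidance above can be summarized as a small decision routine. The following Python sketch is illustrative only and is not an NCBI tool: the `SequenceRecord` fields, the `suggest_destination` function, the lower-case platform and source labels, and the default values (including "sanger") are all assumptions made for this example.

```python
from dataclasses import dataclass

# Source types the text says should not be included in EST submissions.
# The lower-case labels are illustrative, not an official vocabulary.
EXCLUDED_SOURCES = {"mitochondrial", "rrna", "viral", "vector"}

# "Next generation" platforms named in the note above (labels are illustrative).
NEXT_GEN_PLATFORMS = {"roche 454", "illumina", "solid", "heliscope"}


@dataclass
class SequenceRecord:
    """Hypothetical description of a sequence awaiting submission."""
    source: str = "mrna"            # assumed label for ordinary cDNA-derived reads
    platform: str = "sanger"        # assumed default; not taken from the text
    is_assembled: bool = False
    has_annotated_features: bool = False  # e.g. coding region, 5'UTR, 3'UTR


def suggest_destination(rec: SequenceRecord) -> str:
    """Return a suggested destination following the rules described above."""
    if rec.is_assembled:
        # Assembled sequences do not belong in dbEST.
        return "TSA"
    if rec.platform.strip().lower() in NEXT_GEN_PLATFORMS:
        return "SRA"
    if rec.has_annotated_features:
        return "GenBank (BankIt or Sequin)"
    if rec.source.strip().lower() in EXCLUDED_SOURCES:
        return "not suitable for an EST submission"
    return "dbEST"


if __name__ == "__main__":
    print(suggest_destination(SequenceRecord()))
    print(suggest_destination(SequenceRecord(platform="Illumina")))
    print(suggest_destination(SequenceRecord(is_assembled=True)))
```

The order of the checks is a design choice for this sketch: the exclusion list is tested last because the text applies it specifically to EST submissions. Vector and linker trimming is a separate step and is not modeled here.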

Other ways to access dbEST

How to submit data

How to submit data to dbEST

Information on the current release

Number of ESTs - dbEST summary by organism

Last updated: 2014-06-19T13:44:59-04:00