What is GSS?

The GSS division of GenBank is similar to the EST division, except that most of its sequences are genomic in origin rather than cDNA (mRNA). Two classes, exon trapped products and gene trapped products, may be derived via a cDNA intermediate. Take care when analyzing sequences from either of these classes: a splicing event may have occurred, so the sequence in the record may be interrupted when compared to the genomic sequence. The GSS division contains (but is not limited to) the following types of data:

  • random "single pass read" genome survey sequences
  • cosmid/BAC/YAC end sequences
  • exon trapped genomic sequences
  • Alu PCR sequences
  • transposon-tagged sequences

Section 1.3.3 of the GenBank 96.0 release notes provides additional information about the GSS division.

dbGSS sequences are incorporated into the GSS Division of GenBank.

How to submit data

Beginning in 2019, you may submit to dbGSS using tbl2asn. Only input data files 1 and 2 under REQUIRED are necessary to generate a GSS submission.

  1. Template file containing a text ASN.1 Submit-block object (suffix .sbt)
  2. Nucleotide sequence data in FASTA format (suffix .fsa)
    Each sequence in the FASTA file should include [organism=Genus species] [tech=survey] [moltype=genomic DNA]. The SeqID will be used as the clone value. You may use other relevant source modifiers as well, as in [strain=ABC123], [clone-lib=BAC library], etc.
  3. Features are not required, but you may include a feature table with no more than one misc_feature for each sequence, with "similar to ..." in the note. No other features may be applied to GSS.
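
As a rough illustration of the FASTA requirements above, the following Python sketch builds one record with the three required source modifiers and, optionally, one feature table entry. The function names, the 70-column line width, and the example values are assumptions made for this sketch. The feature table layout (a ">Feature SeqID" line followed by tab-delimited location and qualifier lines, with the key spelled misc_feature) reflects the five-column table format as generally documented for tbl2asn; check the tbl2asn page before relying on it.

```python
from typing import Dict, Optional


def gss_defline(seq_id: str, organism: str,
                extra: Optional[Dict[str, str]] = None) -> str:
    """Build a FASTA definition line with the modifiers named in the text.

    seq_id doubles as the clone value, as described above.
    """
    modifiers = {
        "organism": organism,
        "tech": "survey",
        "moltype": "genomic DNA",
    }
    # Optional source modifiers such as strain or clone-lib go after the required ones.
    modifiers.update(extra or {})
    tags = " ".join(f"[{key}={value}]" for key, value in modifiers.items())
    return f">{seq_id} {tags}"


def gss_fasta_record(seq_id: str, sequence: str, organism: str,
                     extra: Optional[Dict[str, str]] = None,
                     width: int = 70) -> str:
    """Return one FASTA record, wrapping the sequence at `width` characters."""
    lines = [gss_defline(seq_id, organism, extra)]
    lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


def gss_feature_entry(seq_id: str, start: int, stop: int, note: str) -> str:
    """Return a feature table entry holding a single misc_feature with a note.

    Layout assumed here: header line, then tab-delimited location and qualifier lines.
    """
    return (
        f">Feature {seq_id}\n"
        f"{start}\t{stop}\tmisc_feature\n"
        f"\t\t\tnote\t{note}\n"
    )


if __name__ == "__main__":
    # Hypothetical clone ID, organism placeholder, and sequence.
    print(gss_fasta_record("CLONE001", "ACGT" * 20, "Genus species",
                           {"strain": "ABC123"}), end="")
    print(gss_feature_entry("CLONE001", 1, 80, "similar to ..."), end="")
```

Concatenating the records for all clones into a single file with the .fsa suffix gives the second input file in the list above.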

Once you have generated your template.sbt and GSS.fsa files, the command line for running tbl2asn will look something like this:

tbl2asn -t template.sbt -i GSS.fsa -a s -V v

The submission file generated in this example would be GSS.sqn
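
If you run tbl2asn from a script, a small wrapper along these lines may help. It is only a sketch: the function names are invented for illustration, and the flag descriptions in the comments are our reading of the tbl2asn help text, so confirm them on the tbl2asn page. The output name is derived on the assumption, consistent with the example above, that tbl2asn replaces the .fsa suffix with .sqn.

```python
import shlex
from pathlib import Path
from typing import List


def tbl2asn_command(template: str, fasta: str) -> List[str]:
    """Return the argument list for the command line shown above."""
    return [
        "tbl2asn",
        "-t", template,  # template file with the Submit-block (.sbt)
        "-i", fasta,     # nucleotide FASTA input (.fsa)
        "-a", "s",       # treat the FASTA file as a set of sequences (assumed meaning)
        "-V", "v",       # run the validator (assumed meaning)
    ]


def expected_sqn(fasta: str) -> Path:
    """Return the submission file name expected for a given FASTA input."""
    return Path(fasta).with_suffix(".sqn")


if __name__ == "__main__":
    command = tbl2asn_command("template.sbt", "GSS.fsa")
    print(shlex.join(command))
    print(expected_sqn("GSS.fsa"))
    # To actually run it (requires tbl2asn on PATH):
    #   import subprocess
    #   subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file names contain spaces.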

A more thorough description of the possible command line arguments is available on the tbl2asn page.

You may also include BioProject or BioSample information, if the GSS sequences are part of a larger, ongoing project.

Please send your GSS submission (.sqn file) to gb-sub@ncbi.nlm.nih.gov.

Any questions may be sent to info@ncbi.nlm.nih.gov or gb-admin@ncbi.nlm.nih.gov.

Other ways to access dbGSS

GSS sequences are included in the GSS division of GenBank, which is available from NCBI through E-Utilities.
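
The sketch below shows one way an E-Utilities search for GSS records might be assembled in Python. It only builds the request URL and makes no network call. The database name (nuccore), the division filter gbdiv_gss[PROP], and the function name are assumptions not stated on this page; verify them against the E-Utilities and Entrez documentation before use.

```python
from urllib.parse import urlencode

# Base URL of the ESearch utility.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gss_esearch_url(organism: str, retmax: int = 20, db: str = "nuccore") -> str:
    """Build an ESearch URL restricted to the GSS division for one organism.

    The db value and the gbdiv_gss[PROP] filter are assumptions (see lead-in).
    """
    term = f"gbdiv_gss[PROP] AND {organism}[ORGN]"
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_ESEARCH}?{query}"


if __name__ == "__main__":
    # Placeholder organism name; substitute a real binomial.
    print(gss_esearch_url("Genus species"))
```

The returned identifiers could then be passed to EFetch to retrieve the records themselves.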

GSS sequences are also available by anonymous FTP in the /repository/dbGSS directory at ftp.ncbi.nlm.nih.gov

Please note that GSS received through the end of 2018 are available by FTP at ftp.ncbi.nlm.nih.gov/repository/dbGSS

GSS received after 2018 will not be included in this format.

Last updated: 2019-06-05T20:59:28Z