TSA Frequently Asked Questions

What type of transcriptome assembly should be submitted to SRA?

Single-step, unannotated assemblies in BAM format should be submitted to SRA as a Transcriptome Project.

Can I submit single-pass reads to TSA?

Single-pass reads may be submitted as part of a TSA project, provided they are in the minority and add information to the transcriptome project.

Can I submit assemblies with internal Ns?

TSA does not accept assemblies in which Ns have been inserted to represent gaps of unknown length. Sequences containing such Ns need to be split into individual assemblies. Internal Ns representing ambiguous bases or gaps of known length can be submitted. If the Ns represent ambiguous bases, they should not make up more than 10% of the sequence length or occur more than 14 in a row. If the Ns represent a gap of known length, an assembly_gap feature must be used.
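The two limits on ambiguous bases can be checked before submission. The following is a minimal Python sketch, not an official NCBI tool: the function name check_ambiguous_ns and the constants are hypothetical, and the code assumes every N in the sequence is an ambiguous base (it cannot tell ambiguous bases apart from gaps annotated with an assembly_gap feature).

```python
import re

# Limits taken from the FAQ answer above; the names are illustrative only.
MAX_N_FRACTION = 0.10  # Ns should not exceed 10% of the sequence length
MAX_N_RUN = 14         # no more than 14 Ns in a row


def check_ambiguous_ns(sequence):
    """Return a list of problems; an empty list means both limits are met.

    Assumption: every N is an ambiguous base, not a gap of known length.
    """
    seq = sequence.upper()
    if not seq:
        return ["empty sequence"]

    problems = []

    # Fraction of the whole sequence made up of Ns.
    n_fraction = seq.count("N") / len(seq)
    if n_fraction > MAX_N_FRACTION:
        problems.append(
            f"Ns make up {n_fraction:.1%} of the sequence (limit {MAX_N_FRACTION:.0%})"
        )

    # Length of the longest uninterrupted run of Ns (0 if there are none).
    longest_run = max((len(m.group()) for m in re.finditer(r"N+", seq)), default=0)
    if longest_run > MAX_N_RUN:
        problems.append(
            f"longest run of Ns is {longest_run} (limit {MAX_N_RUN})"
        )

    return problems
```

Calling check_ambiguous_ns on each sequence of a FASTA file and reporting any non-empty result is one way to find records that need to be trimmed, split, or annotated with an assembly_gap feature before upload.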

Is annotation required?

Annotation is not required unless you are submitting a targeted study. If annotation is included, the product names should follow the International Protein Nomenclature Guidelines.

Can I submit an assembly of EST/SRA/trace archive data generated by another group?

No. All submitted assemblies must be derived from primary data generated by the same group.

Where should clonally derived sequences be submitted?

These sequences should be submitted to GenBank. Only sequences computationally assembled by a program such as CAP3 should be submitted to TSA.

Should TSA submissions be submitted directly to GenBank via email?

All TSA submissions need to be submitted using the TSA Submission Portal. Sequences submitted via email or SequinMacroSend will not be accepted.

Can a moltype other than transcribed RNA be used for TSA submissions?

Yes. If a targeted data set is being submitted where the focus was on isolating a specific other RNA molecule type, that molecule type should be used, for example noncoding RNA.

Can I submit different assemblies as one submission through the Submission Portal?

No. Each submission in the portal should represent a single assembly and should have the following information in common:

  • Assembly data structured comment
  • SRA run accessions (SRRXXXXXY)
  • BioProject accession
  • BioSample accession

Are TSA sequences available by a BLAST search?

A Transcriptome Shotgun Assembly (TSA) BLAST database is now available. The sequences were initially included in nt but have since been moved into a separate database. The TSA database is available from the BLAST home page under Basic BLAST at the nucleotide, tblastn, and tblastx links. These sequences are not available in nt.

Can I run VecScreen before submitting?

The TSA submission portal will automatically run VecScreen on your submission. If you would like to screen your sequences prior to submitting then please review the UniVec instructions.

The following command line should be used:

blastn -task blastn -reward 1 -penalty -5 -gapopen 3 -gapextend 3 -dust yes -soft_masking true \
   -evalue 700 -searchsp 1750000000000 -db UniVec -query sequence.fa -out vs.test.out

Last updated: 2018-06-27T14:00:09Z