
Data submission to NCBI (SRA)

Quick view

Before you begin

Gather information
Why did you perform your analysis?
  • Project title and abstract
  • Aims and objectives
  • Organism(s) sequenced
  • Optional: Funding sources, publications, etc.
What did you sequence?
  • Descriptive sample information
  • Tabular format is ideal
  • Examples: Organism(s), age(s), gender(s), location data, cell line(s), etc.
How did you sequence your samples?
  • Sequencing methods
  • Kits used
  • Instrument model(s)
What is your data file format?
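
As a rough illustration of the tabular sample information suggested above, the following Python sketch writes a small tab-separated table. The column names and values are invented placeholders for illustration; they are not official BioSample attribute names, and the real attributes depend on your sample type.

```python
import csv
import io

# Hypothetical sample descriptions: column names and values are illustrative only.
samples = [
    {"sample_name": "sample_A", "organism": "Homo sapiens", "age": "34", "location": "USA"},
    {"sample_name": "sample_B", "organism": "Homo sapiens", "age": "41", "location": "USA"},
]


def samples_to_tsv(rows):
    """Return the rows as tab-separated text with a single header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(rows[0].keys()),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(samples_to_tsv(samples))
```

Keeping one row per physically unique specimen makes it easier to check that every sample has a distinct set of attributes before registration.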

1. Register metadata

BioProject

  • A description of the research effort
  • "Why" you sequenced your samples

BioSample

  • A description of biologically or physically unique specimens
  • "What" you sequenced

2. Provide technical details
SRA Study
SRA Sample

SRA Experiment

  • A description of a sample-specific sequencing library
  • "How" you performed the sequencing
  • Multiple Experiments can "point" to a single Sample, but a single Experiment cannot point to multiple Samples

3. Transmit data files

SRA Run
  • All files linked to a Run are "merged" into a single dataset
  • Files are converted to SRA format
  • Files are submitted by FTP or Aspera once steps 1 and 2 are complete

Example SRA Submission


  • Grey: Each submission should encompass a single 'project'.
  • Red: Project and sample information are imported from the Submission Portal.
  • Green: Each sample needs 1 or more sequencing libraries (SRA Experiments).
  • Blue: Runs identify individual files (or sets of files, e.g., paired-end data).
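
The relationships in the list above can be sketched as a small Python data model. This is only an illustration of the hierarchy (one project, samples, one or more Experiments per sample, Runs holding files); the class names, field names, and the `samples_without_library` helper are hypothetical and do not correspond to any SRA software or schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Run:
    # One file, or a set of files such as paired-end reads
    files: List[str]


@dataclass
class Experiment:
    library_name: str
    sample_name: str  # each Experiment points to exactly one Sample
    runs: List[Run] = field(default_factory=list)


@dataclass
class Submission:
    project_title: str  # a submission covers a single project
    sample_names: List[str]
    experiments: List[Experiment] = field(default_factory=list)

    def samples_without_library(self) -> List[str]:
        """Return samples that do not yet have at least one Experiment."""
        used = {exp.sample_name for exp in self.experiments}
        return [name for name in self.sample_names if name not in used]


submission = Submission(
    project_title="Example project",
    sample_names=["sample_A", "sample_B"],
    experiments=[
        # Two Experiments may point to the same Sample
        Experiment("lib_1", "sample_A", [Run(["A_R1.fastq", "A_R2.fastq"])]),
        Experiment("lib_2", "sample_A", [Run(["A_rep2.fastq"])]),
    ],
)
print(submission.samples_without_library())
```

Here `sample_B` is reported because no sequencing library has been registered for it yet.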

Detailed description

  1. Begin your submission by registering project and sample information
    • BioProject is an overall description of a single research initiative; a project will typically relate to multiple samples and datasets
    • BioSample is a description of biological source material; each physically unique specimen should be registered as a single BioSample with a unique set of attributes
    • Please see here for examples of what we consider unique samples
  2. Import your BioProject and BioSample accessions into the SRA while providing technical information about sequencing methods
    • Create a new SRA submission here
    • Register at least one unique sequencing library (click "New Experiment") for each BioSample; users with many samples are encouraged to email the SRA for suggestions and support
    • When registering SRA Experiments, you will designate your BioProject and BioSample accessions, and they will be imported into the SRA
  3. The final step of the submission is to send data files.
    • Identify which data file(s) should be linked to a given sequencing library and specify them in one (or more) Runs (click "New Run" after SRA Experiment(s) have been made)
    • Submitters should compute the MD5 sum of each data file beforehand; this is necessary to ensure the integrity of submitted files
    • Once files are designated in an SRA Run, the system will provide the current username and password to allow files to be sent via FTP
    • Users with large data files (> 1 GB) and/or who are located overseas are encouraged to install Aspera Connect and then contact the SRA for instructions on how to send files
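
One way to compute the MD5 sum mentioned in step 3 is with Python's standard `hashlib` module. The sketch below reads the file in chunks so that large sequencing files do not need to fit in memory; the function name `md5sum` and the chunk size are choices made for this example, not part of any SRA tool.

```python
import hashlib


def md5sum(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    # Usage: python md5sum.py file1.fastq.gz file2.fastq.gz
    for file_path in sys.argv[1:]:
        print(md5sum(file_path), file_path)
```

Record the digest for each file before sending it, so it can be supplied when the file is designated in a Run.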

Additional resources

  1. SRA submission quick start guide: A step-by-step walkthrough of the entire submission process, complete with screenshots and "how-to" advice
  2. SRA submission FAQ: A question-answer format page that addresses several common questions about submission to the SRA
  3. SRA file format guide: A detailed description of the formats that are acceptable for submission to the SRA
  4. SRA Aspera guide: An overview of using Aspera Connect and the command-line tool "ascp" to upload to and download from the SRA