BioSample Submission FAQ

What is a BioSample?

A BioSample contains descriptive information about the physical biological specimen from which your experimental data are derived. Typical examples of a BioSample include a cell line, a tissue biopsy or an environmental isolate. The information you supply about the biological materials is critical for providing context to your experimental data.

Under what circumstances is it necessary to submit to the BioSample database?

BioSample submission is required as part of data deposit to several NCBI primary data archives including SRA, TSA and WGS. Typically, BioSample data are submitted first and assigned BioSample accession numbers (SAMNxxxxxxxx). The BioSample accession numbers can subsequently be referenced as appropriate when submitting corresponding experimental data to the archival database. At this time, BioSample submission is not required for GEO or dbGaP; deposit to those databases triggers automatic creation of BioSample records.
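
If you track accessions in your own scripts, a quick format check can catch copy-and-paste errors before you reference them in an archive submission. The sketch below is only an illustration: the pattern (the prefix SAMN followed by exactly eight digits) is an assumption inferred from the SAMNxxxxxxxx placeholder above, and the function name is hypothetical.

```python
import re

# ASSUMPTION: "SAMN" followed by exactly eight digits, inferred from the
# SAMNxxxxxxxx placeholder in the FAQ; the real accession format may differ.
SAMN_PATTERN = re.compile(r"SAMN\d{8}")


def looks_like_samn_accession(text):
    """Return True if the whole string matches the assumed SAMN pattern."""
    return SAMN_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["SAMN01234567", "SAMN123", "SRR01234567"]:
        print(candidate, looks_like_samn_accession(candidate))
```

A check like this only confirms the shape of the string; it does not confirm that the accession exists.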

How do I submit to BioSample?

Two submission routes are supported:

  • BioSample Submission Portal: an online wizard that supports batch submissions using spreadsheet templates or single submissions using web forms. Most submitters should use this method. Preview BioSample packages and attributes, and download submission templates.
  • XML deposit: programmatic deposit in XML format. Suitable only when your data are stored in an in-house database or LIMS from which valid BioSample XML can be generated. Instructions and schemas are available, including listings of recognized packages and attributes.

What information should I provide about my BioSample?

You are asked to indicate what type of sample you are submitting by selecting a relevant package. BioSample packages represent broad categories of sample types and guide you toward providing appropriate attributes to describe your samples. Attributes describe a BioSample using structured attribute name:value pairs, for example, tissue:liver. You can preview BioSample packages and attributes, and download submission templates. In addition to recognized package attributes, you can provide any number of custom attributes to fully describe your samples. Provide comprehensive information that will allow users to fully interpret your study.
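
To make the name:value structure concrete, one sample's attributes can be pictured as a simple mapping. This is a minimal sketch, not an NCBI file format: the function name `format_attributes` and the custom attribute shown are hypothetical, and only the tissue:liver pair comes from the text above.

```python
# Hypothetical in-memory model of one BioSample's attributes.
# This is NOT an NCBI submission format, only an illustration.
sample_attributes = {
    "tissue": "liver",                 # example pair from the FAQ text
    "my_custom_attribute": "batch A",  # hypothetical custom attribute
}


def format_attributes(attributes):
    """Render attributes as 'name:value' strings, sorted by attribute name."""
    return [f"{name}:{value}" for name, value in sorted(attributes.items())]


if __name__ == "__main__":
    for pair in format_attributes(sample_attributes):
        print(pair)
```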

What types of validation must my BioSample pass?

The Submission Portal reports errors for invalid submission attempts. If you have difficulty understanding an error report, please write to biosamplehelp@ncbi.nlm.nih.gov for assistance.
Typical validations include:

  • Content must be supplied for mandatory fields. If information is unavailable for any mandatory field, please enter 'not collected', 'not applicable' or 'missing' as appropriate.
  • Multiple BioSamples cannot have identical attributes. You should have one BioSample for each specimen, and each of your BioSamples must have differentiating information (excluding sample name, title, and description). This check was implemented to encourage submitters to include distinguishing information in their samples. If it is necessary to represent true biological replicates as separate BioSamples, you might add an 'aliquot' or 'replicate' attribute, e.g., 'replicate = biological replicate 1', as appropriate. Note that data from multiple assay types, e.g., RNA-seq and ChIP-seq, may reference the same BioSample if appropriate.
  • The values provided for some attributes are validated; for example, 'collection date' and 'culture collection' values must be provided in a recognized format. Refer to the attribute definition for information about required formats.
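
The three checks above can be approximated locally before you upload a batch. The following is a rough sketch under stated assumptions, not the Submission Portal's actual validator: the function names are hypothetical, samples are modeled as plain dictionaries, and the date formats tried are assumed for illustration (the attribute definition remains the authority on required formats).

```python
from datetime import datetime

# Values the FAQ allows when mandatory information is unavailable.
PLACEHOLDERS = {"not collected", "not applicable", "missing"}
# Fields ignored when checking that samples are distinguishable.
NON_DISTINGUISHING = {"sample name", "title", "description"}


def missing_mandatory(sample, mandatory_fields):
    """Return the mandatory fields that are absent or empty in one sample."""
    return [f for f in mandatory_fields if not str(sample.get(f, "")).strip()]


def duplicate_groups(samples):
    """Group indexes of samples whose attributes are identical once
    sample name, title, and description are ignored."""
    seen = {}
    for index, sample in enumerate(samples):
        key = frozenset(
            (k, v) for k, v in sample.items() if k not in NON_DISTINGUISHING
        )
        seen.setdefault(key, []).append(index)
    return [group for group in seen.values() if len(group) > 1]


def is_valid_date(value, formats=("%Y-%m-%d", "%Y-%m", "%Y")):
    """Check a date string against an ASSUMED list of formats."""
    if value in PLACEHOLDERS:
        return True
    for fmt in formats:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


if __name__ == "__main__":
    batch = [
        {"sample name": "a", "tissue": "liver"},
        {"sample name": "b", "tissue": "liver"},
        {"sample name": "c", "tissue": "liver",
         "replicate": "biological replicate 1"},
    ]
    print(duplicate_groups(batch))
```

In this example the first two samples differ only by sample name, so they are reported as a duplicate group, while the third is distinguished by its 'replicate' attribute.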

When will I receive my BioSample accession number?

If your submission passes validation, you can expect to receive your BioSample accession number(s) by email within a few minutes. One exception is when we do not recognize the name of the organism you supplied; in that case, please be patient while we consult with the NCBI Taxonomy team.

When will my BioSample record be released?

During submission, you are presented with two options for releasing your BioSample to the public. If you select 'Release immediately upon curation', the records will be released within a few hours. If you select 'Release on a specified date', the BioSample will be released on the date you specify or upon the release of any data that reference that BioSample accession, whichever is first. At this time, we do not have a mechanism in place for you to view your records before release.
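
The "whichever is first" rule for 'Release on a specified date' amounts to taking the earliest of the relevant dates. A minimal sketch, assuming you know the release dates of any data that reference the BioSample; the function and parameter names are hypothetical:

```python
from datetime import date


def effective_release_date(specified_date, linked_data_release_dates=()):
    """Return the earlier of the specified date and the release date of
    any data that reference the BioSample."""
    return min([specified_date, *linked_data_release_dates])


if __name__ == "__main__":
    # Specified release is 1 June, but linked data are released 15 March.
    print(effective_release_date(date(2025, 6, 1), [date(2025, 3, 15)]))
```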

Will NCBI apply further curation to my BioSample records?

No. BioSample is a submitter-driven repository. Submitters are responsible for the content and accuracy of their records, and for ensuring that sufficient information has been provided to allow users to fully interpret their study. BioSample submissions must pass basic validation rules and taxonomy review. If the attributes you provided have preferred names, the preferred name will automatically be displayed on your records (see the Attributes page for the list of recognized synonyms). Otherwise, records are generally not subject to further curation.

How do I update my BioSample?

At this time, submitters must write to biosamplehelp@ncbi.nlm.nih.gov to request updates and withdrawals. Please note that when BioSamples are updated, the Submission Overview page in the Submission Portal will not reflect the change. That page is only a record of the initial submission and does not display changes made in the BioSample database.

Should I cite BioSample accession numbers in my manuscript?

Typically, it is appropriate to cite the accession numbers that are assigned to your data submissions, e.g. the GenBank, WGS or SRA accession numbers. If individual BioSamples do need to be referenced, state that "BioSample metadata are available in the NCBI BioSample database (http://www.ncbi.nlm.nih.gov/biosample/) under accession number SAMNxxxxxxxx".

Write to the Help Desk