Search by SRA Cloud-based queries
Overview
SRA has deposited its metadata into BigQuery (Google Cloud Platform, GCP) and Athena (Amazon Web Services, AWS) to give the bioinformatics community programmatic access to this data. You can now search across the entire SRA by sequencing methodology and sample attributes. NCBI is piloting this capability on cloud-based services so that users can benefit from elastic scaling and parallel execution of queries.
The SRA cloud-based resources contain tables for SRA metadata and computed metadata on SRA runs.
Tables
The tables below are currently available for searching all of SRA in the cloud; each has its own, more extensive documentation:
- SRA Metadata Table - Sample and SRA Metadata information organized by SRA run
- Taxonomy Analysis Information Table - Summary information on the taxonomy analysis done per SRA run
- Taxonomy Analysis Table - Results of the taxonomy analysis
- Taxonomy Table - The taxonomy database from NCBI
- Kmer Table - Kmers used in the taxonomy analysis
- Annotated Variations Table - Variation and annotation information for runs containing SARS-CoV-2
Please read about the SRA Taxonomy Analysis Tool to learn how the analysis is carried out.
The Basics of SQL
A basic SQL query has three parts, or clauses:
- SELECT: Identifies which columns from the selected table(s) to show. The * indicates "all columns".
- FROM: Identifies the table(s) to query.
- WHERE: Joins tables using the identical columns in both tables and sets filters on the query.
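The three parts described above can be tried locally before running anything in the cloud. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for BigQuery or Athena. The table names (metadata, tax_analysis), the column names, and the RUN001-style accessions are hypothetical and chosen only for illustration; they are not the actual SRA cloud schema, and SQL dialect details may differ between SQLite, BigQuery, and Athena.

```python
import sqlite3

# In-memory toy database. Table names, column names, and rows are
# hypothetical illustrations, not the real SRA cloud tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE metadata (acc TEXT PRIMARY KEY, organism TEXT, assay_type TEXT);
    CREATE TABLE tax_analysis (acc TEXT, tax_name TEXT, total_count INTEGER);
    INSERT INTO metadata VALUES
        ('RUN001', 'Homo sapiens', 'WGS'),
        ('RUN002', 'Mus musculus', 'RNA-Seq'),
        ('RUN003', 'Homo sapiens', 'RNA-Seq');
    INSERT INTO tax_analysis VALUES
        ('RUN001', 'Homo sapiens', 950),
        ('RUN002', 'Mus musculus', 800),
        ('RUN003', 'Homo sapiens', 700);
""")

# SELECT chooses the columns, FROM names the table, WHERE filters the rows.
filtered = conn.execute(
    "SELECT acc, assay_type FROM metadata "
    "WHERE organism = 'Homo sapiens' ORDER BY acc"
).fetchall()

# WHERE can also join two tables on the column they share (acc)
# while applying a filter in the same clause.
joined = conn.execute(
    """
    SELECT m.acc, t.tax_name, t.total_count
    FROM metadata AS m, tax_analysis AS t
    WHERE m.acc = t.acc
      AND m.assay_type = 'RNA-Seq'
    ORDER BY m.acc
    """
).fetchall()

print(filtered)
print(joined)
conn.close()
```

The first query shows a plain filter; the second shows the join-by-WHERE pattern, where the condition m.acc = t.acc matches rows across the two tables and the second condition narrows the result. Writing SELECT * in place of the column list would return every column.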
Search in BigQuery
Search in Athena
Contact SRA
Contact SRA staff for assistance at sra@ncbi.nlm.nih.gov