Download Sequence and Track Data

Download FASTA and GenBank flat file

You can download sequence and other data from the graphical viewer by accessing the Download menu on the toolbar.

You can download the FASTA-formatted sequence of the visible range, of all markers created on the sequence, or of all selections made on the sequence. To make a highlighted selection, click and drag; to make multiple selections, hold down the Ctrl key on your keyboard while selecting.

You can also download annotation and sequence data from the sequence track in GenBank flat file format.

Download Track Data

The Download Track Data dialog allows you to download portions of track data in tabular formats for further analysis.

Currently, you can download data from gene annotation tracks and from selected feature and SNP annotation tracks. Only tracks that have been added to the graphical view are shown in the download dialog. Some SNP annotation data cannot be downloaded from the graphical viewer, including legacy SNP tracks older than NCBI SNP build 151. Download of remote or user-provided track data, e.g. remote tabix VCF tracks, is also not possible at this time.

The "Download Track Data" dialog in the graphical viewer does not permit downloads containing more than 30 million features or SNPs. Complete NCBI SNP release VCF files can be found on the SNP FTP site.

Please refer to the NCBI genomes FTP site or the NCBI Datasets page to obtain NCBI full genome annotation data.

Please contact us if you would like to download a track or track format that we do not currently offer in this dialog.

Data Formats

GFF3

Please refer to the GFF3 file format description. This format is currently only available for gene annotation tracks.

When the "Include RNA and CDS features" box is checked, RNAs, CDS, exons, and other features (if any) annotated on the gene track will be included in the downloaded file. When this box is unchecked, only the gene feature rows will be included in the file.

If the requested range starts or stops in the middle of a feature, the reported start or stop coordinate will match the requested coordinate. The range will not be extended to encompass the full range of the feature. Rows for truncated features will contain the attribute "partial=true" in column 9.
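As an illustration of the truncation behavior described above, the following minimal Python sketch lists the rows of a downloaded GFF3 file that carry "partial=true" in column 9. The function names and the sample rows are hypothetical and are not taken from an actual download; the sketch assumes only the standard nine-column, tab-delimited GFF3 layout.

```python
from urllib.parse import unquote


def parse_gff3_attributes(column9):
    """Split GFF3 column 9 ("key=value;key=value") into a dict."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values.
            attributes[key.strip()] = unquote(value.strip())
    return attributes


def partial_features(gff3_lines):
    """Yield (seqid, type, start, end) for rows marked partial=true."""
    for line in gff3_lines:
        # Skip blank lines, directives ("##...") and comments ("#...").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed feature row
        attributes = parse_gff3_attributes(fields[8])
        if attributes.get("partial") == "true":
            yield fields[0], fields[2], int(fields[3]), int(fields[4])


# Hypothetical sample rows, for illustration only.
sample = [
    "##gff-version 3\n",
    "NC_000001.11\texample\tgene\t1000\t5000\t.\t+\t.\tID=gene-EX1;Name=EX1;partial=true\n",
    "NC_000001.11\texample\tgene\t6000\t7000\t.\t-\t.\tID=gene-EX2;Name=EX2\n",
]
truncated = list(partial_features(sample))
print(truncated)
```

For a real file, the same function can be called on an open file handle, e.g. `partial_features(open(path))`, where `path` is wherever you saved the download.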

CSV

The CSV format is currently only available for gene annotation tracks. The CSV (comma-separated value) table includes columns reporting the gene feature start and stop coordinates, symbol, strand/orientation, NCBI Gene database ID, and official gene name. The start and stop coordinates in the table correspond to the full range of the gene feature and may extend past the requested range coordinates. Gene names may not be available for some gene features.

BED

The BED file reports six columns (accession, start, stop, gene or feature name, score, strand).
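The six columns listed above can be read with a few lines of Python. The sketch below is illustrative only: the record type, function name, and sample rows are hypothetical, and the length calculation assumes the standard BED convention of a zero-based start and an exclusive stop, which you may want to confirm against your own download.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    accession: str
    start: int
    stop: int
    name: str
    score: str
    strand: str


def parse_bed6(lines):
    """Parse tab-delimited six-column BED rows into BedRecord tuples."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and optional header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # not a six-column row
        accession, start, stop, name, score, strand = fields[:6]
        records.append(
            BedRecord(accession, int(start), int(stop), name, score, strand)
        )
    return records


# Hypothetical sample rows, for illustration only.
sample = [
    "NC_000001.11\t1000\t5000\tEX1\t0\t+\n",
    "NC_000001.11\t6000\t7000\tEX2\t0\t-\n",
]
records = parse_bed6(sample)
# Under the zero-based, half-open assumption, length is stop - start.
lengths = {record.name: record.stop - record.start for record in records}
print(lengths)
```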

Currently, the "Include RNA and CDS features" option is not supported for the CSV and BED file format options. Therefore, these file formats cannot be generated for tracks that only include RNA and CDS features, e.g. CCDS Features tracks.

VCF

NCBI SNP tracks can be downloaded in VCF format. To obtain VCF files of whole genome NCBI SNP annotation, please go to the NCBI SNP FTP site at ftp://ftp.ncbi.nlm.nih.gov/snp/.
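A downloaded VCF file can be inspected without special tooling. The following minimal Python sketch reads the first five fixed VCF columns (CHROM, POS, ID, REF, ALT) and skips header lines; the function names and the sample record, including its rs identifier, are hypothetical and serve only as an illustration.

```python
def parse_vcf_line(line):
    """Return the first five fixed VCF columns of one data line as a dict."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, variant_id, ref, alt = fields[:5]
    return {
        "chrom": chrom,
        "pos": int(pos),        # VCF positions are 1-based
        "id": variant_id,
        "ref": ref,
        "alt": alt.split(","),  # multiple ALT alleles are comma-separated
    }


def read_variants(lines):
    """Parse all data lines, skipping blank lines and '#' header lines."""
    return [
        parse_vcf_line(line)
        for line in lines
        if line.strip() and not line.startswith("#")
    ]


# Hypothetical sample content, for illustration only.
sample = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "NC_000001.11\t10177\trs0000001\tA\tAC,G\t.\t.\t.\n",
]
variants = read_variants(sample)
print(variants)
```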

Please refer to this page for more information on downloading image data as PDF or SVG files.

Last updated: 2020-06-09T18:27:23Z