About the dbSNP Reference SNP (rs) Number

A dbSNP Reference SNP (rs or RefSNP) number is a locus accession that dbSNP assigns to a variant type. The RefSNP catalog is a non-redundant collection of submitted variants that have been clustered, integrated, and annotated. A RefSNP number is a stable accession that stays the same regardless of differences between genomic assemblies. RefSNP numbers facilitate large-scale studies in association genetics, medical genetics, functional genomics and pharmacogenomics, population genetics and evolutionary biology, personal genomics, and precision medicine. They provide a stable variant notation for mutation and polymorphism analysis, annotation, reporting, data mining, and data integration.

Distinguishing RefSNP Features

  • Non-redundancy and globally unique accession series (1)
  • Composed from over 2 billion Submitted SNP (ss) records from thousands of submitters.
  • More than 20 years of tracking history for all assigned, merged, and deleted RefSNP records.
  • Annotated and linked to the latest human assembly and RefSNP nucleotide and protein sequences.
  • Updates to reflect current knowledge of sequence data and biology
  • Data validation.
  • Ongoing curation and annotation by NCBI staff and collaborators.
  • Searchable across variation and genomic databases
  • Supported and reported in open-source and commercial software and tools.
  • Over 400K RefSNP records are in ClinVar.
  • Cited in over 51K publications with biological, functional, disease, and clinical information for variants across the genome (2,3,4)
  • Linked to many NCBI and external resources such as ClinVar, PubMed, PubMed Central, RefSeq, UCSC, EBI, TOPMed, and gnomAD.
  • Supports consistent reporting and non-redundant variation annotations across related sequences including alternate haplotypes, GRC patches, and future graph genomes if the alignment or sequence relationship is known.

(1) Unique across all organisms, but dbSNP currently assigns new RefSNP numbers only for human (see Scope below).

(2) LitVar Text mining

(3) Cited RefSNP

(4) PubMed

RefSNP Assignment

dbSNP maps all Submitted SNP (ss) records to the most recent genome assembly and RefSeq sequences using the new build pipeline, which uses SPDI notation and the Variant Overprecision Correction Algorithm (VOCA). The system provides more precise RefSNP mapping, clustering, and variant normalization (Holmes et al., 2019). Submissions that map to the same position and have the same variant type are clustered together; each cluster is assigned to an existing RefSNP (rs) or, if none exists, receives a new one. All the ss data, including publication, allele, and frequency data, are integrated with the assigned RefSNP for reporting and exchange. New or updated RefSNP records are made publicly available with each dbSNP build release, about once per quarter.
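
The clustering rule described above can be illustrated with a minimal Python sketch. It is not the dbSNP build pipeline: the Submission record, the cluster_submissions function, and all accession values are hypothetical, and the VOCA normalization step is assumed to have already happened.

```python
from collections import defaultdict
from typing import NamedTuple


class Submission(NamedTuple):
    """Hypothetical, already-normalized ss record (illustration only)."""
    ss_id: str         # submitted SNP accession, e.g. "ss1"
    sequence: str      # reference sequence accession
    position: int      # 0-based position on that sequence
    variant_type: str  # e.g. "snv", "del", "ins"


def cluster_submissions(submissions, existing=None, next_rs=1):
    """Group ss records that share sequence, position, and variant type.

    `existing` maps a (sequence, position, variant_type) key to an rs
    number that was assigned earlier. Keys not found there receive a
    new number, counting up from `next_rs`.
    """
    assigned = dict(existing or {})
    clusters = defaultdict(list)
    for sub in submissions:
        key = (sub.sequence, sub.position, sub.variant_type)
        if key not in assigned:
            assigned[key] = next_rs
            next_rs += 1
        clusters[f"rs{assigned[key]}"].append(sub.ss_id)
    return dict(clusters)
```

In this sketch, two SNV submissions at the same position join the same rs cluster, while a deletion at that position is treated as a different variant type and receives its own rs number.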

dbSNP relies on user submissions to generate new RefSNP records. Please submit your novel variants, along with frequency data if available, to promote progress in biomedical research through broad data sharing and to ensure compliance with the Genomic Data Sharing Policy for NIH-funded research (https://osp.od.nih.gov/scientific-sharing/data-repositories-and-trusted-partners/).

Scope

dbSNP now assigns RefSNP numbers only for human variants, as an outcome of the recent collaboration with the EMBL-EBI European Variation Archive (EVA). dbSNP Build 152 (November 2018) contains more than 650 million human RefSNP records, of which over 580 million have population frequency data.

Variation Type

Despite its name, a RefSNP number is assigned to all of the variation types listed below, with precise locations, for both common and rare variations, including mutations. Most are small variations (<= 50 bp).

  • Single nucleotide variation (SNV)
  • Short multi-nucleotide changes (MNV)
  • Small deletions or insertions
  • Small STR repeats
  • Retrotransposable element insertions
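
As a rough illustration of the first three categories, the sketch below labels a variant written in SPDI notation (sequence:position:deletion:insertion, with a 0-based position) by comparing the lengths of the deleted and inserted sequences. This is a simplification and not dbSNP's classification logic: the function names are hypothetical, the example values are made up, repeats and retrotransposable element insertions are not handled, and alleles that carry extra repeat context after normalization would fall into the catch-all "delins" label.

```python
from typing import NamedTuple


class Spdi(NamedTuple):
    sequence: str   # reference sequence accession
    position: int   # 0-based position
    deletion: str   # deleted sequence (may be empty)
    insertion: str  # inserted sequence (may be empty)


def parse_spdi(text):
    """Split 'sequence:position:deletion:insertion' into its four parts."""
    sequence, position, deletion, insertion = text.split(":")
    return Spdi(sequence, int(position), deletion, insertion)


def classify(spdi):
    """Label a variant by the lengths of its deleted and inserted sequences."""
    if spdi.deletion == spdi.insertion:
        return "no change"
    deleted, inserted = len(spdi.deletion), len(spdi.insertion)
    if deleted == 1 and inserted == 1:
        return "SNV"
    if deleted == inserted:
        return "MNV"
    if inserted == 0:
        return "deletion"
    if deleted == 0:
        return "insertion"
    return "delins"  # catch-all for everything this sketch does not model
```

For example, classify(parse_spdi("NC_000001.11:100:A:G")) yields "SNV" for this made-up variant.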

dbSNP Accession Types

Submitted SNP (ss) – submitted variant based on asserted location or flanking sequences.

Reference SNP (rs) – non-redundant set of variations based on clustering ss records of the same variant type and sequence position.

Data Aggregation and Annotations

  • Submitted SNP (ss) information
  • Submitter contact and publications
  • Variation Data – alleles, genotype, and frequency
  • Experimental methods and conditions
  • Genomic positions on different assembly versions
  • RefSNP annotations on the latest available genomic assemblies and RefSeq sequences (mRNA, Protein, and RefSeqGene)
  • ClinVar clinical assertions
  • Allele Frequency
  • Molecular Consequences
  • Linked Resources

External submitters and collaborators

  • ClinVar
  • BioProject
  • BioSample
  • Gene
  • PubMed
  • Genome
  • Nucleotide
  • Protein

Data Access

RefSNP data, including genotype, frequency, and associated metadata, are available without restrictions through the web, FTP, and the API.

Web:
FTP downloads in JSON and VCF formats
API:
Videos:
  • New Variation Services for Normalizing, Remapping, and Annotating Variants (YouTube).
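
For programmatic access, a minimal Python sketch of retrieving one RefSNP record as JSON might look like the following. The base URL is an assumption based on the NCBI Variation Services API and should be checked against the current API documentation; the function names are hypothetical, and no particular JSON field names are assumed.

```python
import json
import re
import urllib.request

# Assumed endpoint of the NCBI Variation Services RefSNP API;
# verify against the current API documentation before relying on it.
BASE_URL = "https://api.ncbi.nlm.nih.gov/variation/v0/refsnp/"


def refsnp_url(accession):
    """Build the request URL for an accession such as 'rs328' or '328'."""
    match = re.fullmatch(r"(?:rs)?(\d+)", accession.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a RefSNP accession: {accession!r}")
    return BASE_URL + match.group(1)


def fetch_refsnp(accession, timeout=30):
    """Download one RefSNP record and return the parsed JSON document."""
    with urllib.request.urlopen(refsnp_url(accession), timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_refsnp("rs328") requires network access and returns the parsed JSON document. Submitted SNP (ss) accessions are rejected by refsnp_url, because this sketch only targets RefSNP records.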

Note: Entrez upcoming eUtils changes (http://bit.ly/2tKKldq).

Additional Information:

Support Center

Last updated: 2019-12-02T15:45:37Z