SNP Population Grows at NCBI
SNPs, or single nucleotide polymorphisms, are
variations in genomic DNA sequences within a population of organisms.
These genetic changes occur at a frequency of over 1 percent in the human
genome, and are important because they are sometimes linked to heritable
phenotypes. Knowledge of SNPs is useful for physical mapping, disease
association, and surveys of population structure. The dbSNP database was
developed at NCBI to facilitate the management of SNP data, integrate
this data with other NCBI resources, and distribute the information to
the scientific community.
Composition of the Database
The data in dbSNP includes SNPs, microsatellite repeats, and small insertion/deletion
polymorphisms. There is no minimum allele frequency or requirement that
a SNP result in a measurable phenotype for submission to dbSNP, and a
large portion of the polymorphisms in the database are neutral polymorphisms.
Currently, dbSNP contains predominantly human data, but variation information
for several other organisms can be found in the database. Release 106
of dbSNP contained 4.5 million SNPs, and the database is growing at a
rate of 90 SNPs per month.
Although dbSNP accepts submissions from any laboratory or individual,
the bulk of the submissions are derived from large-scale contributors
associated with the National Human Genome Research Institute (NHGRI)
grants program that aims to catalog 50,000 SNPs by 2005. SNPs are submitted
to dbSNP using a special procedure that involves registering a submission
handle with the NCBI SNP group, followed by the preparation
of a set of structured submission files. Instructions on how to submit
to dbSNP are located on the dbSNP home page. Each SNP in the database
is given an identifier beginning with "ss", for "submitted SNP". If there
are multiple submissions for the same SNP, then a reference SNP cluster is
created to incorporate information from the multiple submitters. The
reference SNP clusters, given "rs" identifiers, are used in the annotation
of reference genome sequences.
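The two identifier types can be told apart by their prefix. The following is a minimal sketch in Python, assuming an identifier is simply the prefix "ss" or "rs" followed by digits; the function names and the sample numbers are hypothetical and are not part of any NCBI tool.

```python
import re

# Assumed format: "ss" or "rs" followed by digits. The prefixes come from the
# article; the exact pattern is an assumption made for this sketch.
_ID_PATTERN = re.compile(r"^(ss|rs)(\d+)$", re.IGNORECASE)


def classify_snp_id(identifier):
    """Return (kind, number), where kind is "ss" or "rs"; raise ValueError otherwise."""
    match = _ID_PATTERN.match(identifier.strip())
    if match is None:
        raise ValueError(f"not an ss or rs identifier: {identifier!r}")
    return match.group(1).lower(), int(match.group(2))


def split_by_kind(identifiers):
    """Separate a mixed list into submitted-SNP and reference-cluster numbers."""
    groups = {"ss": [], "rs": []}
    for identifier in identifiers:
        kind, number = classify_snp_id(identifier)
        groups[kind].append(number)
    return groups


# Made-up identifiers, used only to show the call.
print(split_by_kind(["ss101", "rs202", "ss303"]))
```

Splitting a mixed list this way could be a first step before preparing separate ss and rs lists for a batch request.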
The SNP Record
A SNP record contains the observed alleles at a particular locus, the
flanking sequence that surrounds the variation, the experimental method
used to assay the variation, including protocols and conditions, and cross-references
to associated GenBank records or UniGene clusters. Other types of data
that can be included are genetic map locations, population-specific frequencies,
individual-specific genotype information, relevant publications that document
the details of the methodologies or populations, known genes in the region,
synonyms for a submitter's SNP ID used in the submission, and validation
information to describe the quality of the frequency data.
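To make the shape of a record concrete, the contents listed above can be sketched as a small Python data class. This is an illustration only: the class name, the field names, and the choice to split the flanking sequence into two strings are assumptions, not the dbSNP submission format or schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SnpRecord:
    """Hypothetical in-memory view of the data types the article lists."""

    # Core content of a record.
    observed_alleles: List[str]
    flanking_5prime: str          # sequence upstream of the variation
    flanking_3prime: str          # sequence downstream of the variation
    assay_method: str             # experimental method used to assay the variation
    cross_references: List[str] = field(default_factory=list)  # GenBank / UniGene

    # Other data types that can be included.
    map_location: Optional[str] = None
    population_frequencies: Dict[str, Dict[str, float]] = field(default_factory=dict)
    publications: List[str] = field(default_factory=list)
    submitter_synonyms: List[str] = field(default_factory=list)
    validated: Optional[bool] = None

    def allele_string(self) -> str:
        """Render the observed alleles in a slash-separated form such as "A/G"."""
        return "/".join(self.observed_alleles)


# Made-up values, used only to show construction.
record = SnpRecord(["A", "G"], "ACGT", "TTGA", "example-method")
print(record.allele_string())
```

Keeping the optional data types as defaulted fields mirrors the distinction the text draws between what every record contains and what can additionally be included.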
Searches of dbSNP may be limited to Entrez fields such as allele variation,
validation status, chromosome on which the SNP is mapped, and many others.
SNP records retrieved in Entrez are displayed in a summary format tailored
to the structure of a SNP record. There are, however, several additional
display formats, such as a graphical summary and a chromosome report.
Entrez SNP results may also be sorted by various fields including organism,
SNP ID, success rate, and heterozygosity.
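A field-limited search of this kind can also be assembled programmatically. The sketch below builds a query term and a request URL for NCBI's E-utilities ESearch service; using E-utilities is an assumption, since the article describes only the Entrez web interface. The field names "Organism" and "Chromosome" are placeholders, so check the actual Entrez SNP field list before relying on them. The code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# E-utilities ESearch endpoint. Using it here is an assumption of this sketch,
# not something the article describes.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(limits):
    """Join (value, field_name) pairs into one Entrez query combined with AND."""
    parts = [f'"{value}"[{field_name}]' for value, field_name in limits]
    return " AND ".join(parts)


def build_search_url(limits, max_results=20):
    """Return an ESearch URL for the SNP database; no request is sent."""
    params = {"db": "snp", "term": build_term(limits), "retmax": max_results}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Placeholder field names; the real Entrez SNP field tags may differ.
limits = [("human", "Organism"), ("7", "Chromosome")]
print(build_term(limits))
print(build_search_url(limits))
```

Each additional limit narrows the result set, since the terms are combined with AND.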
Special SNP query services offer pre-formulated search methods by Submitter,
New Batches, Method type, Population Detail, Publication, Locus Information,
and STS Markers; two Free Form Search services are also offered. In addition,
the dbSNP data can be searched using a special BLAST page. These search
options are linked from the blue sidebar menu of the dbSNP home page.
The integration of SNPs into other resources, such as the Map Viewer,
provides a way to see them in their genomic context. When the SNP
Master Map is viewed in the Map Viewer, a graphical summary showing mapping
information, associated gene features, and marker heterozygosity is provided.
Batch downloads of SNPs can be performed using Batch Entrez or via an
e-mail-mediated query service that allows for the retrieval of a large
number of SNPs by using individual submissions (ss#), submitter IDs, or
dbSNP RefSNP cluster IDs (rs#). The SNP batch query service is accessible
The SNP data may also be downloaded at ftp://ftp.ncbi.nih.gov/snp/.
The SNP home page is found at www.ncbi.nlm.nih.gov/SNP/.
The Entrez SNP page can be reached from the Hotspots
column on the NCBI home page.