Submitting Sequence Polymorphisms to NCBI's dbSNP

Small genetic variations at specific positions in the genome, called single nucleotide polymorphisms or SNPs, are often responsible for phenotypic differences. The identification and analysis of SNPs in human and other complex genomes has become one of the major themes of biomedical research since the completion of the human genome sequence. The NCBI database of Single Nucleotide Polymorphisms (dbSNP) provides a public repository for this rapidly growing set of primary data and now contains over 40 million submitted SNPs from 33 different species. SNPs are submitted using a specialized protocol which involves the generation and transmission of a set of files to the NCBI SNP submission group. A brief outline of the SNP submission protocol and file types needed is presented below.
The SNP submission process is modeled on that of the GenBank bulk divisions: sequence tagged site (STS), genome survey sequence (GSS) and expressed sequence tag (EST). In fact, it is possible to submit polymorphism data simultaneously as an STS and a SNP. In all submission scenarios, the submitter creates a text file made up of a combination of required and optional sections (Contact, Publications, Method, Population Description and Assay, among others), each holding a different type of information. Each section of the file is broken up into a set of fields, and each field is identified by a capitalized tag followed by a colon. The SNP submissions page provides more information, including examples of the submission file format, and shows the possible sections and fields.
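The tagged layout described above can be read with a few lines of code. The following is a minimal sketch, not the dbSNP specification: it assumes that each field looks like `TAG: value`, that a hypothetical `TYPE` tag opens a new section, and that an indented line continues the previous field. The tag names in the sample are illustrative only. The actual section and field names are listed on the SNP submissions page.

```python
import re

# Assumed field syntax: a capitalized tag, a colon, then the value.
FIELD_RE = re.compile(r"^([A-Z][A-Z0-9_]*):\s*(.*)$")


def parse_sections(text, section_tag="TYPE"):
    """Split tagged text into sections, each a dict of tag -> list of values.

    `section_tag` is an assumption of this sketch: the tag whose appearance
    starts a new section.
    """
    sections = []
    current = None
    last_tag = None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line.strip():
            continue  # blank lines carry no data in this sketch
        match = FIELD_RE.match(line)
        if match:
            tag, value = match.group(1), match.group(2)
            if tag == section_tag or current is None:
                current = {}
                sections.append(current)
            current.setdefault(tag, []).append(value)
            last_tag = tag
        elif current is not None and last_tag is not None:
            # Assumed convention: an untagged line continues the previous field.
            joined = current[last_tag][-1] + " " + line.strip()
            current[last_tag][-1] = joined.strip()
    return sections


# Illustrative input only; these tags and values are hypothetical.
sample = """\
TYPE: CONTACT
NAME: Jane Doe
EMAIL: jane@example.org

TYPE: METHOD
ID: METHOD-1
METHOD: Sequencing of PCR products
  from pooled genomic DNA
"""

for section in parse_sections(sample):
    print(section)
```

Storing each tag's values in a list lets a section repeat a field without losing earlier entries, which may or may not be needed for real submissions.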
Figure 1: A section of a spreadsheet for creating a SNP submission showing the Contact, Publications, Method and Population sections.

The completed submission file should be emailed to:

SNP submissions can be made for either published or unpublished data. Each submitted SNP is assigned an identifier of the form ss#, where “#” represents an integer identifier. The ss identifier serves the same purpose as an accession number for a GenBank sequence. NCBI also builds a non-redundant Reference SNP (RefSNP) database. Each RefSNP cluster, which is given an identifier of the form rs#, contains polymorphisms that map to the same position in the genome. RefSNPs are available as part of the Entrez database system and are linked to the primary SNP records as well as sequence, gene, genome, structure and functional information.

An example of a submitted SNP record that includes population and other detailed information for the human gene alcohol dehydrogenase 2 can be seen on the following Web page:

Questions concerning SNP submissions should be directed to:

-MR