J Bioinform Comput Biol. 2003 Jul;1(2):253-65.

Automated identification of single nucleotide polymorphisms from sequencing data.

Author information

Centre National de Génotypage, 2, rue Gaston Crémieux-CP5721, 91057 Evry, France. masazumi@cng.fr

Abstract

A single nucleotide polymorphism (SNP) is a single-base difference in DNA sequence between individuals, and SNPs provide abundant information about genetic variation. Large-scale discovery of high-frequency SNPs is being undertaken using various methods. However, publicly available SNP data sometimes need to be verified. If only a particular gene locus is of interest, locus-specific polymerase chain reaction amplification may be useful. A drawback of this method is that the secondary peak in the sequencing trace has to be measured. We analyzed trace data from conventional sequencing equipment and found an applicable rule for discerning SNPs from noise. The rule is applied to multiply aligned sequences with their traces, and the peak heights of the traces are compared between samples. We have developed software that integrates this function to identify SNPs automatically. The software works accurately on high-quality sequences and can also detect SNPs in low-quality sequences. Further, it can determine allele frequencies, display this information as a bar graph, and assign the corresponding nucleotide combinations. It is also designed so that a user can easily verify and edit sequences on screen. It is very useful for identifying de novo SNPs in a DNA fragment of interest.
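The abstract does not state the rule itself, so the following is only an illustrative sketch of the general idea of comparing trace peak heights across aligned samples. It assumes a simple secondary-to-primary peak ratio test; the function names `call_genotype` and `find_snp`, the `min_ratio` threshold of 0.3, and the input layout (one dictionary of per-base peak heights per sample) are hypothetical and are not taken from the software described here.

```python
from collections import Counter

BASES = "ACGT"

def call_genotype(peaks, min_ratio=0.3):
    """Call a two-allele genotype from peak heights at one aligned position.

    peaks: dict mapping base -> trace peak height for a single sample.
    min_ratio: assumed cutoff; a secondary peak below this fraction of the
    primary peak is treated as noise (hypothetical value, not from the paper).
    """
    ranked = sorted(BASES, key=lambda b: peaks.get(b, 0.0), reverse=True)
    primary, secondary = ranked[0], ranked[1]
    top = peaks.get(primary, 0.0)
    if top <= 0:
        return None  # no usable signal at this position
    if peaks.get(secondary, 0.0) / top >= min_ratio:
        return tuple(sorted((primary, secondary)))  # heterozygous call
    return (primary, primary)  # homozygous call

def find_snp(column, min_ratio=0.3):
    """Return allele frequencies if the samples disagree at this position.

    column: list of per-sample peak dicts for one aligned position.
    Returns None when every callable sample shows the same single allele.
    """
    genotypes = [call_genotype(p, min_ratio) for p in column]
    alleles = Counter(a for g in genotypes if g is not None for a in g)
    if len(alleles) < 2:
        return None
    total = sum(alleles.values())
    return {base: count / total for base, count in alleles.items()}

# Made-up peak heights for four samples at one aligned position.
column = [
    {"A": 1000, "C": 20, "G": 30, "T": 10},  # clean A peak
    {"A": 520, "C": 15, "G": 480, "T": 12},  # two comparable peaks
    {"A": 25, "C": 18, "G": 950, "T": 9},    # clean G peak
    {"A": 900, "C": 40, "G": 60, "T": 20},   # A peak with low-level noise
]
print(find_snp(column))  # {'A': 0.625, 'G': 0.375}
```

A fixed ratio is the simplest possible choice for the sketch; a rule tuned on real traces would likely also account for base-call quality and local signal context.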

PMID: 15290772
DOI: 10.1142/s021972000300006x
[Indexed for MEDLINE]
