BMC Bioinformatics. 2004 Jul 6;5:89.

A double classification tree search algorithm for index SNP selection.

Author information

  • Laboratory of Population Genetics, National Cancer Institute, NIH, Bethesda, MD 20892, USA. zhangpeis@mail.nih.gov

Abstract

BACKGROUND:

In population-based studies, it is generally recognized that single nucleotide polymorphism (SNP) markers are not independent. Rather, they are carried by haplotypes, groups of SNPs that tend to be coinherited. It is thus possible to choose a much smaller number of SNPs to use as indices for identifying haplotypes or haplotype blocks in genetic association studies. We refer to these characteristic SNPs as index SNPs. To reduce cost and labor, the minimum number of index SNPs that can distinguish all SNP and haplotype patterns should be chosen. Unfortunately, this is an NP-complete problem: finding an exact minimum requires brute-force search, which is not feasible for large data sets.
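
To make the selection problem concrete, the following is a minimal sketch of the exhaustive search that the background describes as infeasible at scale. It assumes haplotypes are given as equal-length strings of allele codes (one character per SNP) and only checks that haplotype patterns remain distinct, which is a simplification of the full problem. The function names `distinguishes` and `minimum_index_snps` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def distinguishes(haplotypes, columns):
    """Return True if the chosen SNP columns keep every haplotype distinct."""
    projected = {tuple(h[c] for c in columns) for h in haplotypes}
    return len(projected) == len(set(haplotypes))


def minimum_index_snps(haplotypes):
    """Exhaustive search over SNP subsets, smallest subsets first.

    Assumption: haplotypes are equal-length strings such as "0110".
    The number of subsets grows exponentially with the number of SNPs,
    so this is only usable on very small inputs.
    """
    unique = sorted(set(haplotypes))
    if not unique:
        return ()
    n_snps = len(unique[0])
    for size in range(n_snps + 1):
        for columns in combinations(range(n_snps), size):
            if distinguishes(unique, columns):
                return columns
    return tuple(range(n_snps))


# SNPs 0 and 1 always agree, as do SNPs 2 and 3, so two index SNPs suffice.
example = ["0000", "0011", "1100", "1111"]
print(minimum_index_snps(example))
```

With n SNPs the search may visit up to 2^n subsets, which is why a heuristic is needed for realistic data sets.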

RESULTS:

We have developed a double classification tree search algorithm to generate index SNPs that can distinguish all SNP and haplotype patterns. The algorithm runs rapidly and generates good, though not necessarily minimum, sets of index SNPs; falling short of the true minimum is to be expected of a fast method for an NP-complete problem.
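
The abstract does not spell out the steps of the double classification tree search, so the sketch below is not that algorithm. It is a plain greedy heuristic, shown only to illustrate the trade-off described above: a fast method that returns a good, but not necessarily minimum, set of index SNPs. It assumes haplotypes are equal-length strings of allele codes, and the name `greedy_index_snps` is hypothetical.

```python
def greedy_index_snps(haplotypes):
    """Greedy heuristic (not the paper's algorithm).

    Repeatedly add the SNP column that splits the haplotypes into the
    most distinct groups, until every unique haplotype is separated.
    Assumption: haplotypes are equal-length strings such as "0110".
    """
    unique = sorted(set(haplotypes))
    if not unique:
        return []
    n_snps = len(unique[0])

    def count_groups(columns):
        # Number of distinct patterns seen when only `columns` are typed.
        return len({tuple(h[c] for c in columns) for h in unique})

    chosen = []
    current = count_groups(chosen)
    while current < len(unique):
        candidates = [c for c in range(n_snps) if c not in chosen]
        best = max(candidates, key=lambda c: count_groups(chosen + [c]))
        best_count = count_groups(chosen + [best])
        if best_count == current:
            break  # defensive guard; unreachable when haplotypes are unique
        chosen.append(best)
        current = best_count
    return chosen


example = ["0000", "0011", "1100", "1111"]
print(greedy_index_snps(example))
```

Each round scans every remaining SNP once, so the running time is polynomial in the number of SNPs and haplotypes, at the cost of no guarantee that the result is the smallest possible set.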

CONCLUSIONS:

A new algorithm for index SNP selection has been developed. A webserver for index SNP selection is available at http://cognia.cu-genome.org/cgi-bin/genome/snpIndex.cgi/

PMID: 15238162
PMCID: PMC476734
DOI: 10.1186/1471-2105-5-89
[PubMed - indexed for MEDLINE]