Bioinformatics. 2012 Feb 1;28(3):358-65. doi: 10.1093/bioinformatics/btr673. Epub 2011 Dec 8.

M(3): an improved SNP calling algorithm for Illumina BeadArray data.

Author information

  • Biostatistics Division, Department of Epidemiology and Public Health, Yale University, New Haven, CT 06520, USA.

Abstract

SUMMARY:

Genotype calling from high-throughput platforms such as Illumina and Affymetrix is a critical step in data processing for obtaining accurate information on genetic variants in phenotype-genotype association studies. A number of algorithms have been developed to infer genotypes from data generated through the Illumina BeadStation platform, including GenCall, GenoSNP, Illuminus and CRLMM. Most of these algorithms are built on population-based statistical models that genotype each single nucleotide polymorphism (SNP) in turn, such as GenCall with the GenTrain clustering algorithm, and require a large reference population to perform well. These approaches may not work well for rare variants, where only a small proportion of individuals carry the variant. A fundamentally different approach, implemented in GenoSNP, adopts a SNP-based model to infer the genotypes of all SNPs in one individual, making it an appealing alternative for calling rare variants. However, compared with the population-based strategies, more SNPs called by GenoSNP may fail the Hardy-Weinberg equilibrium test. To take advantage of both strategies, we propose a two-stage SNP calling procedure, named the modified mixture model (M(3)), to improve call accuracy for both common and rare variants. The effectiveness of our approach is demonstrated through applications to genotype calling on a set of HapMap samples used for quality control purposes in a large case-control study of cocaine dependence. Depending on the model, the increase in power with M(3) is greater for rare variants than for common variants.
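
For readers unfamiliar with the Hardy-Weinberg equilibrium test mentioned above, the following is a minimal sketch of the standard one-degree-of-freedom chi-square version, applied to the genotype counts of a single SNP. It is not taken from the M(3) software: the function name `hwe_chi_square` and its arguments are hypothetical, and the abstract does not say which form of the test the authors used.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Hypothetical helper for illustration only; not part of M(3) or GenoSNP.
    Takes the observed counts of the AA, AB and BB genotypes at one SNP and
    returns (chi-square statistic, p-value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotype calls")

    # Estimate the frequency of allele A from the observed genotypes.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)

    # Skip classes with zero expectation (monomorphic SNPs).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    print(hwe_chi_square(25, 50, 25))  # counts in exact HWE proportions
    print(hwe_chi_square(50, 0, 50))   # no heterozygotes at all
```

The chi-square approximation is unreliable when expected counts are small, as they are for rare variants, so an exact test is generally preferred in that setting.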

AVAILABILITY:

M(3) algorithm: http://bioinformatics.med.yale.edu/group.

CONTACT:

name@bio.com; hongyu.zhao@yale.edu

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

PMID: 22155947 [PubMed - indexed for MEDLINE]
PMCID: PMC3268244 (free PMC article)
