Format

Send to

Choose Destination
Mol Genet Metab. 2011 Dec;104(4):457-69. doi: 10.1016/j.ymgme.2011.08.019. Epub 2011 Aug 24.

A novel application of pattern recognition for accurate SNP and indel discovery from high-throughput data: targeted resequencing of the glucocorticoid receptor co-chaperone FKBP5 in a Caucasian population.

Author information

1
Department of Pharmacology, Department of Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA. pelleymounter.linda@mayo.edu

Abstract

The detection of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) with precision from high-throughput data remains a significant bioinformatics challenge. Accurate detection is necessary before next-generation sequencing can routinely be used in the clinic. In research, scientific advances are inhibited by gaps in data, exemplified by the underrepresented discovery of rare variants, variants in non-coding regions and indels. The continued presence of false positives and false negatives prevents full automation and requires additional manual verification steps. Our methodology presents applications of both pattern recognition and sensitivity analysis to eliminate false positives and aid in the detection of SNP/indel loci and genotypes from high-throughput data. We chose FK506-binding protein 51(FKBP5) (6p21.31) for our clinical target because of its role in modulating pharmacological responses to physiological and synthetic glucocorticoids and because of the complexity of the genomic region. We detected genetic variation across a 160 kb region encompassing FKBP5. 613 SNPs and 57 indels, including a 3.3 kb deletion were discovered. We validated our method using three independent data sets and, with Sanger sequencing and Affymetrix and Illumina microarrays, achieved 99% concordance. Furthermore we were able to detect 267 novel rare variants and assess linkage disequilibrium. Our results showed both a sensitivity and specificity of 98%, indicating near perfect classification between true and false variants. The process is scalable and amenable to automation, with the downstream filters taking only 1.5h to analyze 96 individuals simultaneously. We provide examples of how our level of precision uncovered the interactions of multiple loci, their predicted influences on mRNA stability, perturbations of the hsp90 binding site, and individual variation in FKBP5 expression. Finally we show how our discovery of rare variants may change current conceptions of evolution at this locus.

PMID:
21917492
PMCID:
PMC3224211
DOI:
10.1016/j.ymgme.2011.08.019
[Indexed for MEDLINE]
Free PMC Article

Supplemental Content

Full text links

Icon for Elsevier Science Icon for PubMed Central
Loading ...
Support Center