Bioinformatics. 2015 Jun 15;31(12):i240-9. doi: 10.1093/bioinformatics/btv263.

Genome-wide detection of intervals of genetic heterogeneity associated with complex traits.

Author information

1
Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland, The Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan, JST, PRESTO, Japan and Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany.
2
Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland, The Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan, JST, PRESTO, Japan and Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany.

Abstract

MOTIVATION:

Genetic heterogeneity, in which several sequence variants give rise to the same phenotype, is of great interest in the analysis of complex phenotypes. Current approaches for finding genomic regions that exhibit genetic heterogeneity suffer from at least one of two shortcomings: (i) they require an exact interval in the genome to be defined and then tested for genetic heterogeneity, potentially missing intervals of high relevance, or (ii) they face an enormous multiple hypothesis testing problem because of the large number of candidate intervals being tested, which results in either many false positives or a lack of power to detect true intervals.
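To give a sense of scale for shortcoming (ii): n SNPs admit n(n+1)/2 contiguous intervals, so a conventional Bonferroni correction over all of them becomes extremely strict. The Python sketch below only illustrates that arithmetic. The SNP count and the significance level are hypothetical values chosen for illustration, and Bonferroni serves as a familiar baseline here, not as the correction used by the authors.

```python
from typing import Optional


def count_intervals(n_snps: int, max_len: Optional[int] = None) -> int:
    """Count contiguous SNP intervals, optionally capped at max_len SNPs."""
    if max_len is None or max_len >= n_snps:
        # Every (start, end) pair with start <= end is one interval.
        return n_snps * (n_snps + 1) // 2
    # For each length k there are n_snps - k + 1 possible start positions.
    return sum(n_snps - k + 1 for k in range(1, max_len + 1))


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    n_snps = 200_000  # hypothetical genome-wide SNP count
    alpha = 0.05      # hypothetical family-wise error rate target
    n_tests = count_intervals(n_snps)
    print(f"candidate intervals: {n_tests:,}")
    print(f"Bonferroni threshold: {bonferroni_threshold(alpha, n_tests):.3e}")
```

Capping the interval length with `max_len` shrinks the count, but it reintroduces a weaker form of shortcoming (i), since longer intervals are never examined.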

RESULTS:

Here, we present an approach that overcomes both problems: it automatically finds all contiguous sequences of single nucleotide polymorphisms (SNPs) in the genome that are jointly associated with the phenotype. It also solves both the computational efficiency problem and the statistical problem of multiple hypothesis testing, each of which is caused by the huge number of candidate intervals. We demonstrate on Arabidopsis thaliana genome-wide association study data that our approach can discover regions that exhibit genetic heterogeneity and would be missed by single-locus mapping.
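The abstract does not say how an interval is scored, so the following Python sketch is a naive brute-force baseline and not the authors' algorithm. As assumptions made only for illustration, an individual counts as a carrier for an interval if any SNP in it has the minor allele (coded 1), the phenotype is binary, and association is measured with a one-sided Fisher exact test. The function names and the toy data are hypothetical. The toy data are built so that two different SNPs each explain half of the cases, which is the situation a single-locus test handles poorly.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] with fixed margins.

    Rows are carriers / non-carriers, columns are cases / controls.
    """
    carriers, cases, n = a + b, a + c, a + b + c + d
    upper = min(carriers, cases)
    tail = sum(comb(carriers, k) * comb(n - carriers, cases - k)
               for k in range(a, upper + 1))
    return tail / comb(n, cases)


def scan_intervals(genotypes, phenotype, max_len):
    """Test every contiguous interval of at most max_len SNPs.

    genotypes: one list of 0/1 minor-allele indicators per individual.
    phenotype: one 0/1 label per individual (1 = case).
    Returns a list of (start, end, p_value) with inclusive SNP indices.
    """
    n_snps = len(genotypes[0])
    results = []
    for start in range(n_snps):
        carrier = [0] * len(genotypes)
        for end in range(start, min(start + max_len, n_snps)):
            # Extend the interval by one SNP: carrier if any SNP so far is 1.
            carrier = [flag | row[end] for flag, row in zip(carrier, genotypes)]
            a = sum(1 for f, y in zip(carrier, phenotype) if f and y)
            b = sum(1 for f, y in zip(carrier, phenotype) if f and not y)
            c = sum(1 for f, y in zip(carrier, phenotype) if not f and y)
            d = len(phenotype) - a - b - c
            results.append((start, end, fisher_one_sided(a, b, c, d)))
    return results


# Toy data: four cases followed by four controls, three SNPs.
genotypes = [
    [1, 0, 0], [1, 0, 0],  # cases carrying the minor allele at SNP 0
    [0, 1, 0], [0, 1, 0],  # cases carrying the minor allele at SNP 1
    [0, 0, 1], [0, 0, 0], [0, 0, 0], [0, 0, 0],  # controls
]
phenotype = [1, 1, 1, 1, 0, 0, 0, 0]

results = scan_intervals(genotypes, phenotype, max_len=3)
best = min(results, key=lambda r: r[2])
print(best)  # the interval spanning SNPs 0 and 1 scores lowest
```

On this toy input the interval covering SNPs 0 and 1 has a p-value of 1/70, while either SNP tested alone gives 15/70. The scan tests every interval separately, which is exactly the cost and multiple-testing burden that makes a naive enumeration impractical genome-wide.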

CONCLUSIONS:

Our novel approach can contribute to the genome-wide discovery of intervals that are involved in the genetic heterogeneity underlying complex phenotypes.

AVAILABILITY AND IMPLEMENTATION:

The code can be obtained at: http://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/sis.html.

PMID: 26072488
PMCID: PMC4559912
DOI: 10.1093/bioinformatics/btv263
[Indexed for MEDLINE]