J Comput Biol. 2015 Feb;22(2):178-88. doi: 10.1089/cmb.2014.0258. Epub 2015 Jan 22.

A nonhomogeneous hidden Markov model for gene mapping based on next-generation sequencing data.

Author information

1
Interuniversity Institute for Biostatistics and statistical Bioinformatics, Hasselt University, Diepenbeek, Belgium.

Abstract

Mapping quantitative trait loci (QTL) for polygenic traits remains an important challenge. QTL analysis requires two or more strains of an organism that differ substantially in the (poly)genic trait of interest and whose cross yields heterozygous offspring. Offspring showing the trait of interest are selected and subsequently screened for molecular markers such as single-nucleotide polymorphisms (SNPs) using next-generation sequencing. Gene mapping relies on the co-segregation between genes and/or markers: genes and/or markers that are linked to a QTL influencing the trait will segregate more frequently with this locus. For each identified marker, the observed mismatch frequencies between the reads of the offspring and the parental reference strains can be modeled by a multinomial distribution whose probabilities depend on the state of an underlying, unobserved Markov process. The states indicate whether or not the SNP is located in (the vicinity of) a QTL. Consequently, genomic loci associated with the QTL can be discovered by analyzing the hidden states along the genome. This homogeneous hidden Markov model assumes that the identified SNPs are evenly spaced along the chromosome and therefore does not take the distance between neighboring SNPs into account, even though that distance could influence the chance of co-segregation between genes and markers. To address this issue, we propose a nonhomogeneous hidden Markov model with a transition matrix that depends on a set of distance-varying observed covariates. The application of the model is illustrated on data from a study of ethanol tolerance in yeast.
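
To make the idea of a distance-dependent transition matrix concrete, the following is a minimal sketch and not the authors' implementation. It assumes two hidden states (0 = not linked, 1 = in or near a QTL), a logistic link between the log-distance to the previous SNP and the probability of leaving a state, and multinomial emissions over read-count categories. The function names, the `beta` parameterisation, and the use of Viterbi decoding are all illustrative assumptions; the abstract does not specify the link function, the covariates, or the decoding method.

```python
import numpy as np

def transition_matrix(distance_bp, beta):
    """2x2 transition matrix for one pair of neighboring SNPs.

    ASSUMPTION: the probability of leaving state i is logistic in
    log(distance), with beta[i] = (intercept, slope).
    """
    x = np.log(distance_bp)
    leave = 1.0 / (1.0 + np.exp(-(beta[:, 0] + beta[:, 1] * x)))
    return np.array([[1.0 - leave[0], leave[0]],
                     [leave[1], 1.0 - leave[1]]])

def viterbi(counts, positions, emission_probs, beta, initial):
    """Most likely hidden-state path along the chromosome.

    counts:         (n_snps, n_categories) read counts per SNP
    positions:      (n_snps,) increasing SNP coordinates in base pairs
    emission_probs: (n_states, n_categories) multinomial probabilities, all > 0
    """
    # Multinomial log-likelihood up to the multinomial coefficient,
    # which is the same for every state and so does not affect the path.
    log_emit = counts @ np.log(emission_probs).T
    n, k = log_emit.shape
    score = np.log(initial) + log_emit[0]
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        # The transition matrix is rebuilt at every step from the gap
        # to the previous SNP; this is what makes the chain nonhomogeneous.
        gap = positions[t] - positions[t - 1]
        log_trans = np.log(transition_matrix(gap, beta))
        cand = score[:, None] + log_trans  # cand[i, j]: move from i to j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    path = np.empty(n, dtype=int)
    path[-1] = score.argmax()
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

With a positive slope in `beta`, widely separated SNPs are allowed to change state more easily than tightly clustered ones, which is one plausible way to encode the intuition that co-segregation weakens with distance. In practice the parameters would be estimated from the data (for example by an EM-type algorithm) rather than fixed by hand.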

KEYWORDS:

next-generation sequencing; nonhomogeneous hidden Markov model; quantitative trait loci analysis; single-nucleotide polymorphisms

PMID: 25611462
DOI: 10.1089/cmb.2014.0258
[Indexed for MEDLINE]
