Genome Biol Evol. 2009 Jun 22;1:153-64. doi: 10.1093/gbe/evp015.

Efficient sampling of parsimonious inversion histories with application to genome rearrangement in Yersinia.

Author information

  • Bioinformatics group, Alfréd Rényi Institute of Mathematics, Hungarian Academy of Sciences, Budapest, Hungary. miklosi@renyi.hu

Abstract

Inversions are among the most common mutations acting on the order and orientation of genes in a genome, and polynomial-time algorithms exist to obtain a minimal-length series of inversions that transforms one genome arrangement into another. However, the minimum-length series of inversions (the optimal sorting path) is often not unique: many such optimal sorting paths may exist. If we assume that all optimal sorting paths are equally likely, then statistical inference on genome arrangement history must account for all such sorting paths and not just a single estimate. No deterministic polynomial-time algorithm is known for counting the optimal sorting paths or for sampling from the uniform distribution over them. Here, we propose a stochastic method that uniformly samples the set of all optimal sorting paths. Our method uses a novel formulation of parallel Markov chain Monte Carlo (MCMC). In practice, our method can quickly estimate the total number of optimal sorting paths. We introduce a variant of our approach in which short inversions are modeled to be more likely, and we show how the method can be used to estimate the distribution of inversion lengths and breakpoint usage in pathogenic Yersinia pestis. The proposed method has been implemented in a program called "MC4Inversion." We compare MC4Inversion with the sampler implemented in BADGER and with a previously described importance sampling (IS) technique. We find that on high-divergence data sets, MC4Inversion finds more optimal sorting paths per second than BADGER and the IS technique while avoiding the bias inherent in the IS technique.
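To make the notion of multiple optimal sorting paths concrete, the following is a minimal sketch that counts them exhaustively for a very small signed gene order. It is not the MCMC method of the paper and is not taken from MC4Inversion. It is a plain breadth-first search, and the function names `invert` and `count_optimal_sorting_paths` are hypothetical. A genome is assumed to be a tuple of signed integers, with the sorted target `(1, 2, ..., n)`.

```python
from collections import deque


def invert(perm, i, j):
    """Apply an inversion to positions i..j (inclusive):
    reverse the segment and flip the sign of every gene in it."""
    segment = tuple(-g for g in reversed(perm[i:j + 1]))
    return perm[:i] + segment + perm[j + 1:]


def count_optimal_sorting_paths(start):
    """Return (distance, number of shortest inversion paths) from
    `start` to the identity arrangement, by breadth-first search.

    Illustrative only: the state space has 2^n * n! arrangements,
    so this is feasible for a handful of genes at most.
    """
    n = len(start)
    target = tuple(range(1, n + 1))
    dist = {start: 0}   # fewest inversions needed to reach each arrangement
    ways = {start: 1}   # number of shortest paths reaching each arrangement
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return dist[cur], ways[cur]
        for i in range(n):
            for j in range(i, n):
                nxt = invert(cur, i, j)
                if nxt not in dist:
                    # First time seen: record its level and inherit the count.
                    dist[nxt] = dist[cur] + 1
                    ways[nxt] = ways[cur]
                    queue.append(nxt)
                elif dist[nxt] == dist[cur] + 1:
                    # Another shortest route into an already-seen arrangement.
                    ways[nxt] += ways[cur]
    raise ValueError("target not reachable; is `start` a signed permutation?")


if __name__ == "__main__":
    # Two genes in swapped order with unchanged orientation.
    print(count_optimal_sorting_paths((2, 1)))  # (3, 6)
```

Even this two-gene case has six distinct shortest paths of length three. Exhaustive enumeration of this kind becomes infeasible as the number of genes grows, which is the setting in which a sampling approach is of interest.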

KEYWORDS:

MCMC; Yersinia; genome rearrangement; inversion

PMID: 20333186
PMCID: PMC2817410 (free PMC article)
