Display Settings:


Send to:

Choose Destination
See comment in PubMed Commons below
Bioinformatics. 2008 Aug 15;24(16):i153-9. doi: 10.1093/bioinformatics/btn298.

HapCUT: an efficient and accurate algorithm for the haplotype assembly problem.

Author information

  • 1Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0404, USA. vibansal@cs.ucsd.edu



The goal of the haplotype assembly problem is to reconstruct the two haplotypes (chromosomes) for an individual using a mix of sequenced fragments from the two chromosomes. This problem has been shown to be computationally intractable for various optimization criteria. Polynomial time algorithms have been proposed for restricted versions of the problem. In this article, we consider the haplotype assembly problem in the most general setting, i.e. fragments of any length and with an arbitrary number of gaps.


We describe a novel combinatorial approach for the haplotype assembly problem based on computing max-cuts in certain graphs derived from the sequenced fragments. Levy et al. have sequenced the complete genome of a human individual and used a greedy heuristic to assemble the haplotypes for this individual. We have applied our method HapCUTto infer haplotypes from this data and demonstrate that the haplotypes inferred using HapCUT are significantly more accurate (20-25% lower maximum error correction scores for all chromosomes) than the greedy heuristic and a previously published method, Fast Hare. We also describe a maximum likelihood based estimator of the absolute accuracy of the sequence-based haplotypes using population haplotypes from the International HapMap project.


A program implementing HapCUT is available on request.

[PubMed - indexed for MEDLINE]
Free full text
PubMed Commons home

PubMed Commons

How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for HighWire
    Loading ...
    Write to the Help Desk