Friday, May 11, 2007

A Survey of the SNP Haplotype Inference Problem

Yan Zhang

SNP haplotypes have drawn great attention for disease mapping and for tracing population migration events. Because in many situations only genotype data are available, computational methods have been developed to identify the haplotype pair carried by each individual and to estimate haplotype frequencies from genotype data. There are two major approaches to the inference problem: combinatorial methods and statistical methods. Combinatorial methods typically state an explicit objective function and optimize it to obtain a solution. Statistical methods are usually based on an explicit model of haplotype evolution; the inference problem is then cast as a maximum-likelihood or Bayesian inference problem. In this talk, I will introduce the distinct mathematical models and assumptions that underlie these methods. Using two real data sets and a set of simulated data generated under the coalescent model, I will present my experimental evaluation of four representative methods. I conclude, first, that an informative biological prior is the key to this inference problem and, second, that future research should focus on improving the performance of algorithms on data sets with low linkage disequilibrium (LD), genotyping errors, missing alleles, and rare haplotypes.
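
To make the inference problem concrete, the following is a minimal Python sketch, not one of the four methods evaluated in the talk. It first enumerates every haplotype pair compatible with a genotype, which shows why genotype data alone are ambiguous, and then runs a textbook-style expectation-maximization (EM) estimate of haplotype frequencies as a simple instance of the maximum-likelihood formulation. The genotype encoding (0 and 1 for homozygous sites, 2 for heterozygous sites), the Hardy-Weinberg assumption, and the function names `compatible_pairs` and `em_haplotype_frequencies` are all assumptions made for this illustration.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """Return all unordered haplotype pairs that could produce `genotype`.

    Assumed encoding: 0 or 1 marks a homozygous site, 2 a heterozygous site.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 2]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(genotype), list(genotype)
        for site, bit in zip(het_sites, bits):
            h1[site] = bit        # one haplotype carries `bit` ...
            h2[site] = 1 - bit    # ... the other carries the opposite allele
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, iterations=50):
    """Estimate haplotype frequencies by EM, assuming Hardy-Weinberg equilibrium."""
    expansions = [compatible_pairs(g) for g in genotypes]
    haplotypes = sorted({h for pairs in expansions for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    for _ in range(iterations):
        counts = defaultdict(float)
        for pairs in expansions:
            # E-step: posterior weight of each compatible pair given current freq.
            weights = [(1 if a == b else 2) * freq[a] * freq[b] for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        n_chromosomes = 2 * len(genotypes)
        freq = {h: counts[h] / n_chromosomes for h in haplotypes}
    return freq


if __name__ == "__main__":
    # A doubly heterozygous genotype has two possible phasings.
    print(compatible_pairs((2, 2)))
    # Homozygous individuals supply the evidence that resolves the ambiguous one.
    toy = [(0, 0), (0, 0), (1, 1), (2, 2)]
    for hap, f in em_haplotype_frequencies(toy).items():
        print(hap, round(f, 4))
```

A genotype with k heterozygous sites has 2^(k-1) compatible pairs for k >= 1, so brute-force enumeration like this is only practical for short marker windows. The sketch also uses no model of haplotype evolution, which is the kind of prior information the abstract identifies as key.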