From: Przytycka, Teresa (NIH/NLM/NCBI) [E]
Sent: Tuesday, April 07, 2009 12:54 PM
To: NLM/NCBI List ncbi-seminar
Subject: Special CBB seminar, April 29, 11 am Susanta Tewari

Speaker: Susanta Tewari, University of Georgia
Date: April 29, 2009
Time: 11:00
Place: B2 Library

A Probabilistic Model of Meiotic Recombination and an Efficient Recursive Linking Algorithm for Constructing Likelihood Based Genetic Maps for a Large Number of Markers

Neurospora crassa, a model organism, has been used extensively to explore fundamental cellular processes such as recombination, and to construct high-resolution maps, because of its excellent biological characteristics. We model the recombination process of fungal systems via chromatid exchange in meiosis. The model accounts for any type of bivalent configuration in a genetic interval, in any specified order of genetic markers, for both random spore and tetrad data. A probability model framework is first developed for 2 genes and then generalized to an arbitrary number of genes. Maximum likelihood estimators (MLEs) are developed for both random spore and tetrad data. The MLE of recombination for tetrad data is shown to be uniformly more efficient than that from random spore data, usually by a factor of at least 4. The MLE for the generalized probability framework is computed using the Expectation Maximization (EM) algorithm. This MLE can be found either with a straightforward algorithm or with the proposed recursive linking algorithm. The time complexity of the straightforward algorithm is exponential in the number of genetic markers, so implementing the model with it for more than 7 genetic markers is not feasible. This motivates the proposed recursive linking algorithm.

The recursive linking algorithm, a variant of dynamic programming, decomposes the pool of genetic markers into segments and renders the model feasible for hundreds of genetic markers. The recursive algorithm is shown to reduce the order of time complexity from exponential to linear in the number of markers. The improvement in time complexity is shown theoretically by a worst-case analysis of the algorithm and is supported by run-time results using data on linkage group II of the fungal genome Neurospora crassa. The Pearson chi-squared statistic is computed as a measure of goodness of fit using a product multinomial set-up. We implement our model with genetic marker data on the whole genome of Neurospora crassa. Simulated annealing is used to search for the best order of genetic markers for each chromosome, and the goodness-of-fit value is used to assess the model assumptions.

Keywords: Exchange, EM algorithm, Dynamic Programming, Time complexity, MLE, Recombination.

==========================================================
Teresa M. Przytycka, PhD
Investigator
NIH | NLM | NCBI
http://www.ncbi.nlm.nih.gov/CBBresearch/Przytycka/index.html
Phone 301-402-1723 | Fax 301-480-4637 | cell 301-219-0766
==========================================================