From: Park, Yonil (NIH/NLM/NCBI)
Sent: Friday, November 19, 2004 5:25 PM
To: 'ncbi-seminar@ncbi.nlm.nih.gov'
Subject: CBB seminar, Tuesday, November 23, 11 AM

Time: 11 AM, Tuesday, November 23, 2004
Location: NCBI Library, B2 floor, Building 38A
Speaker: Yonil Park
Title: The Estimation of Statistical Parameters for Gapped Local Alignment Score Distributions by Simulation of Gapped Global Alignment

In this talk, we present some non-rigorous numerical studies that use gapped global alignment simulations to estimate the parameters for gapped local alignment statistics. For a given parameter accuracy, simulations of global alignment are more efficient than simulations of local alignment, because they require shorter sequence lengths.

Empirical evidence indicates that the distribution of local alignment scores approximates a Gumbel extreme value distribution if the gap penalties are large enough. In the absence of an analytic formula for the Gumbel parameters, they must be estimated by computer simulation. In practice, estimates of the scale parameter (lambda) must have a relative error of 1% to 4%, and estimates of the location parameter (K) must have a relative error of less than 10%.

At present, BLAST users are limited to particular scoring systems and gap penalties, because the Gumbel parameters must be computed off-line. If simulations were fast enough that lambda and K could be calculated in less than about one second, however, BLAST users could employ arbitrary scoring systems and gap penalties.

First, we give a heuristic model of the global alignment of random sequences based on Markov additive processes. This heuristic suggests a numerical acceleration scheme for simulating the Gumbel scale parameter (lambda). Our numerical study shows that, for the default scoring system in the current protein-protein BLAST, lambda can be computed to 0.5% relative error in 3.7 sec on a PC.
Second, we speculate on a formula for the location parameter (K) and confirm it with numerical results.
Finally, we also present a method of calculating the BLAST finite-size correction from simulations of global alignment.

----------
Yonil Park
NCBI/NLM/NIH
Bldg 38A, Room 6N611N
8600 Rockville Pike
Bethesda, MD 20894
Voice: 301-402-1438
Fax: 301-480-2288