Mode CJ, editor. Applications of Monte Carlo Methods in Biology, Medicine and Other Fields of Science [Internet]. Rijeka (HR): InTech; 2011 Feb 28.
The biochemical models describing complex and dynamic metabolic systems are typically multi-parametric and non-linear, so the identification of their parameters requires non-linear regression analysis of the experimental data. The stochastic nature of the experimental samples makes it necessary to estimate not only the values that fit the model best, but also the distribution of the parameters, and to test statistical hypotheses about the values of these parameters. In such situations the application of analytical models for parameter distributions is inappropriate, because their assumptions do not hold for intrinsically non-linear regressions. That is why Monte Carlo simulations are a powerful tool for modeling biochemical processes. The classification of Monte Carlo approaches is not unified, so here we comply with the interpretation given in (Press et al., 1992), where the general Monte Carlo approach is to construct parallel virtual worlds, in which the experimental estimates play the role of true parameters, provided that the way in which the true parameters generate a sample is known. Bootstrap is a modification of Monte Carlo, which imposes very few premises on the data and does not need to know the mechanism by which the true parameters generate experimental samples. Instead, resampling with replacement from the experimental sample is used to construct synthetic samples.
As far as confidence intervals (CI) are concerned, the literature offers multiple types, but each of them belongs to one of two main groups: root intervals (Politis, 1998) and percentile intervals (Efron & Tibshirani, 1993). The difference in the philosophy of these two CI types is substantial for the biochemical interpretation of results. It is explained by the distinction between classical statistics (where the parameters are fixed unknown quantities) and Bayesian statistics (where the parameters are random variables with unknown distributions), and also by the philosophical differences between objectivity and subjectivity in scientific research. The main conclusion is that root confidence intervals are confidence intervals of the investigated parameters, whereas percentile confidence intervals refer to the estimates of the investigated parameters.
Our first application of Monte Carlo and Bootstrap simulation procedures is a simulation platform for training students in medical biochemistry (Tenekedjiev & Kolev, 2002). In this system, students search for estimates and confidence intervals of parameters of a given biochemical system for different enzyme-substrate pairs. The platform applies Monte Carlo simulation in two stages. Initially, a Monte Carlo procedure is applied to emulate a biochemical experimental measurement setting with given enzyme kinetic reactions as realistically as possible. The system can simulate continuous enzyme assay (used for adjustment of the "experimental" conditions) and end-point enzyme assay "measurements" (suitable for parameter identification). We use an ordinary differential equation (ODE) as the basis for generating pseudo-experimental data. The pseudo-real nature of the generated data is ensured by the random incorporation of three types of errors for each repetition of the experiments. The Briggs-Haldane steady-state model is fitted to the pseudo-measured end-point assay data obtained by the system. The kinetic parameters can be calculated by χ2-minimization. The task is simplified by the existence of a good initial guess from the linearized Lineweaver-Burk model. The two-dimensional root confidence regions of the parameters can be calculated by either Monte Carlo or Bootstrap, following similar procedures. The best point estimate is identified using a trimmed mean over the flipped parameters, taking only the values from the identified root confidence region.
In the majority of biochemical reactions, parameters are unknown within very wide intervals and may differ by orders of magnitude. Finding the root confidence regions (intervals) involves parameter flipping, which often generates results with an incorrect sign. That is why in a second example (Tanka-Salamon et al., 2008) we propose a multiplicative modification for the estimation of root confidence regions and of the best estimate of the parameters, which ensures that all estimates have a physical meaning. The main assumption is that the ratio between the true parameter value and the optimal parameter value derived from the true data sample has the same distribution as the ratio between the optimal parameter value derived from the true data sample and the optimal synthetic parameter value derived from the synthetic data sample. The assumption is equivalent to performing classical Bootstrap over the logarithms of the estimated parameters. This method is applied in a real experimental set-up for the estimation of root confidence regions of kinetic constants and of root best estimates of the amidolytic activity of plasmin under the influence of three fatty acids. By doing so, the inhibitory effect of the three fatty acids can be proven and quantified. The measured data have the form of continuous reaction progress curves with several replicas. The product concentrations are predicted by three different models with increasing complexity. We model the instability of the inhibited enzyme and represent the resulting continuous assay model with concomitant inactivation of the enzyme as a system of two stiff ODEs. From there, we derive the closed form of the progress curve. The four-dimensional root confidence regions are acquired by Monte Carlo simulation in every data point of each progress curve, using an analytical model of the measured standard deviation, similarly to the first example.
Statistical simulation methods are a powerful tool in the analysis of complex systems. The most popular among them is Monte Carlo. The numerical techniques behind this method are based on statistical simulation, i.e. on any method that uses random number sequences to conduct a simulation. The essence of the method is that it provides integral measures of uncertainty of the simulated system based on the known uncertainties of its parts (Hertz & Thomas, 1983). The integral measures are calculated on the basis of a large number of system instances in different pseudo-realities, each defined by a specific set of randomly generated states of its parts. The Monte Carlo approach is successfully employed in finding estimates of parameters that define the behavior of different stochastic systems. The flexibility, the very few premises imposed on the data, and the applicability in hypothesis tests and statistical parameter assessment are what make simulation techniques widely accepted.
Following the interpretation of (Press et al., 1992), one can assume that there is an experiment that intends to assess a certain set of M parameters atrue that define the behavior of a measurable stochastic system. The true values of those parameters are unknown to the observer, but they are statistically realized in a set of real measurements D0 available to the observer (called a learning sample), which incorporates some random error. The experimenter can then assess the parameters of a given model so that the discrepancy between the modeled and the measured data is minimized (e.g. using χ2-minimization or some other method), which yields a set of real parameters a0. Due to the random character of the sample-generating process, repetitions of the experimental measurement would generate many other possible measurement sets – D1, D2, D3, … – with identical structure to D0. Those in turn would generate sets of real parameters a1, a2, a3, …, respectively, that slightly differ from each other. So the estimate a0 is just one instance of an M-dimensional random variable.
The task is to find the distribution of the deviation R = a0 − atrue of the real parameters from the true ones, when just D0 is available. This random variable is called a root (Beran, 1986; DiCiccio & Romano, 1988). As long as atrue is not known, the general approach is to create a fictitious world, where the true parameters are substituted with the real ones a0. The main assumption is that the root in the real world has the same distribution as in the fictitious world.
If we know the process that generates data under a given set of parameters a0, then we can simulate synthetic data sets Ds1, Ds2, …, DsQ (the superscript s denotes "synthetic") with the same structure as the learning sample. The Bootstrap method (Efron & Tibshirani, 1993) applies to cases where the process that generates the data and/or the nature of the measurement error is unknown, and the learning sample D0 is formed out of N independent and identically distributed (iid) measurements. The Bootstrap generates synthetic samples with the same structure as D0 by drawing with replacement from D0. In (Press et al., 1992) this type of Bootstrap is called quick-and-dirty Monte Carlo. There is another Bootstrap version, called exact Bootstrap, which forms all synthetic samples that can be generated by drawing with replacement from D0. However, this method is rather impractical in real problems, so we shall not dwell on it here.
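As an illustration, here is a minimal Python sketch (using numpy; the learning sample values are hypothetical) of how Bootstrap synthetic samples are drawn with replacement from D0:

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_samples(d0, q):
    """Generate Q synthetic samples with the same structure as D0
    by drawing with replacement from D0 (quick-and-dirty Monte Carlo)."""
    d0 = np.asarray(d0)
    return [rng.choice(d0, size=d0.size, replace=True) for _ in range(q)]

d0 = np.array([0.92, 1.07, 1.01, 1.13, 0.88])  # hypothetical iid learning sample
synthetic = bootstrap_samples(d0, q=1000)      # Q = 1000 synthetic samples
```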
The parameters asq can be identified from the synthetic samples Dsq in exactly the same way as a0 was identified from D0, which in turn generates instances Rq = asq − a0 (q=1, 2, …, Q) of the root in the fictitious world. If a sufficiently large number Q of such instances is created, then it is possible to construct the empirical distribution of the root in the fictitious world, which according to the main assumption coincides with the distribution of the root R = a0 − atrue.
The ultimate step in the simulation is to assess confidence intervals or confidence regions of the investigated M-dimensional parameter. There are different methods to assess one-dimensional CIs (e.g. percentile-t, bootstrap-t, bias-correction, simple bootstrap interval, etc.; see (Davison & Hinkley, 1997; MacKinnon & Smith, 1998)), yet for multi-dimensional CIs there are practically only two methods – the root and the percentile methods. The distinction between the two stems from the different approaches to probabilities in general.
If the frequentist definition of probabilities is adopted (Von Mises, 1946), then likelihood inference has to be applied in parameter identification (Berger & Wolpert, 1988). Here the parameters are considered unknown, but deterministic values. In that sense, the estimate a0 is the only random value, so its confidence region can be calculated. Since the distributions of R = a0 − atrue and Rq = asq − a0 coincide, the following is true for the random variable a0 = atrue + R. Then the instances of a0 may be generated by replacing atrue with a0 as the only available estimate: a0 + Rq = asq. The so-called percentile confidence interval (or region) is identified directly from the instances of the synthetic parameters asq, and it is in fact the confidence region of the estimate.
If the subjective definition of probabilities is adopted (Jeffrey, 2004), then Bayesian statistics can be used in the parameter identification (Berry, 1996). Here the parameters are considered to be random variables, whose distribution can be assessed, and a confidence region of atrue may be calculated. Since the distributions of R and Rq coincide, the following is true for the random variable atrue = a0 − R. Then the instances of atrue may be generated by replacing R with Rq as the only available estimate: afq = a0 − Rq = 2a0 − asq (the superscript f denotes "flipped"). The so-called root confidence interval (or region) is identified from the instances of the flipped (around the real parameters) synthetic parameters afq, and it is in fact the confidence region of the true parameter. Furthermore, using this approach it is possible to find a point estimate of atrue that is better than a0. One possibility is to find it as the mean value of the flipped synthetic parameters afq. Since the method is sensitive to errors (Davidson & MacKinnon, 1999), it is better to use a trimmed mean value (Hanke & Reitsch, 1991). Regardless of the type of mean value, the resulting point estimate is unbiased, unlike a0.
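The two interval types can be contrasted in a short sketch, assuming a one-dimensional parameter with synthetic estimates already at hand (Python with numpy/scipy; the names and the trimming fraction are illustrative): the percentile interval is read directly from the synthetic parameters, whereas the root interval and the point estimate are read from the parameters flipped around a0.

```python
import numpy as np
from scipy import stats

def percentile_ci(a_synth, p=0.95):
    """CI of the estimate: quantiles of the synthetic parameters themselves."""
    return tuple(np.quantile(a_synth, [(1 - p) / 2, (1 + p) / 2]))

def root_ci(a_synth, a0, p=0.95):
    """CI of the true parameter: quantiles of the flipped synthetic parameters."""
    flipped = 2 * a0 - np.asarray(a_synth)
    return tuple(np.quantile(flipped, [(1 - p) / 2, (1 + p) / 2]))

def root_point_estimate(a_synth, a0, trim=0.1):
    """Trimmed mean of the flipped parameters, robust to extreme instances."""
    return stats.trim_mean(2 * a0 - np.asarray(a_synth), trim)
```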
Metabolomics (Strohman, 2002) deals with the evaluation of dynamic metabolic networks and links the genetic information to the phenotype through adequate metabolic control analysis. It is a prerequisite for understanding the cellular phenotype and its pathological alterations. For that purpose, one requires not only the technical developments that allow stringent monitoring of the metabolic fluctuations induced in vivo by biological signals and environmental changes, but also powerful mathematical models capable of treating dynamic metabolic systems in their variability. The principles of metabolomics are well known in biochemistry (Newsholme & Leech, 1984; Voet & Voet, 1995), but the training of biomedical and clinical researchers is still insufficient to exploit the opportunities provided by up-to-date computer-intensive statistical procedures applicable in the field of enzyme kinetics. For that purpose, our earlier work (Tenekedjiev & Kolev, 2002) introduces a computer-simulated experimental setting, in which the user (a graduate medical student, who is familiar with the basic ideas in enzyme kinetics and the structure of metabolic pathways) acquires skills in adjusting experimental conditions to conform to model assumptions, in identification of kinetic parameters and determination of their confidence intervals, and in application of these parameters for metabolic predictions in context-dependent in vivo situations. In the proposed system, students search for estimates and confidence intervals of the parameters kp and KM of an enzyme-catalyzed reaction

E + S ⇌ ES → E + P (3.1)
Initially, the system uses Monte Carlo simulations to emulate enzyme kinetic reactions as closely as possible to the real lab setting. The user can perform continuous enzyme assay and end-point assay "measurements".
The continuous enzyme assay simulates an expensive experiment, where the product concentration is measured at equally spaced time intervals (the sampling time), and the mean reaction rate is calculated for different conditions. No repetitions or parameter identification are envisaged here. The user sets up the enzyme-substrate pair, the designed concentrations of the total enzyme Etdes (the free enzyme concentration plus the enzyme-substrate complex concentration) and of the initial substrate S0des, the temperature T, the pH, the overall experimental time tover and the sampling time Δt (see Fig. 3.1). As a result, the user gets the time course of the pseudo-measured product Pmes(i·Δt), for i=1, 2, …, tover/Δt. The time courses of the substrate, the product, the free enzyme and the enzyme-substrate complex are recalculated from Pmes (see Fig. 3.2). These results form a single replica of the process.
Setting up the designed concentrations of the total enzyme Etdes and the initial substrate S0des , the temperature T, the acidity pH, the overall experimental time tover and the sampling time Δt for a given enzyme-substrate pair.
Time course of the substrate, the product, the free enzyme and the enzyme-substrate complex from the setting in Fig. 3.1.
For each replica we adopted the Briggs-Haldane steady-state approach (Segel, 1993) to model the transformation of the substrate into product in the biochemical system (3.1) using the ODE:

dPtrue(t)/dt = kp,app·Ettrue·Strue(t)/(KM,app + Strue(t)), with Strue(t) = S0true − Ptrue(t) (3.2)
In (3.2) the true concentration of the product Ptrue is a function of the time t; Ettrue is the true concentration of the total enzyme; Strue is the true concentration of the substrate; kp,app and KM,app are the apparent constants kp and KM in this replica. The initial condition of (3.2) is that the true concentration of the product is zero at the beginning: Ptrue(0)=0.
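A minimal sketch of integrating (3.2) numerically (Python/scipy; all numerical values are hypothetical placeholders, not the constants of the teaching platform):

```python
import numpy as np
from scipy.integrate import solve_ivp

def briggs_haldane_rhs(t, p, kp_app, km_app, et_true, s0_true):
    """ODE (3.2): dPtrue/dt for the Briggs-Haldane steady-state model."""
    s_true = s0_true - p[0]                    # Strue(t) = S0true - Ptrue(t)
    return [kp_app * et_true * s_true / (km_app + s_true)]

kp_app, km_app = 120.0, 0.8                    # hypothetical apparent constants
et_true, s0_true, t_over, dt = 1e-3, 2.0, 10.0, 0.5
sol = solve_ivp(briggs_haldane_rhs, (0.0, t_over), [0.0],
                args=(kp_app, km_app, et_true, s0_true),
                t_eval=np.arange(dt, t_over + dt / 2, dt), rtol=1e-8)
p_true = sol.y[0]                              # Ptrue(i*Δt), i = 1, 2, ...
```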
To model the biological diversity of the substrate-enzyme pair in the real experiment, kp,app and KM,app are instances of random variables. The apparent constant KM,app also depends on pH, whereas kp,app depends on pH, T, and t (equations (3.3), (3.4) and (3.5)).
In (3.3), (3.4) and (3.5), Rp, Rm and RB are instances of a positive continuous uniformly distributed random variable on the interval [1–Δpar; 1+Δpar], where Δpar <1 is given. The constants Td, A1, A2, A3, g, Ke1, Ke2, Kes1, Kes2, Ka1, Ka2 are known typical constants for each enzyme-substrate pair, with the following meaning: Td – the temperature over which the enzyme degradation begins (in °C); A1 – the logscale factor of the apparent kp constants at t=0 (in 1/min); A2 – the heat acceleration factor of the apparent kp constants at t=0 (in 1/min); A3 – the scale factor of the temporal enzyme degradation constant B; g – the power in the temporal enzyme degradation constant B; Ke1 – the pK value for the first H+ dissociation constant of the free enzyme; Ke2 – the pK value for the second H+ dissociation constant of the free enzyme; Kes1 – the pK value for the first H+ dissociation constant of the enzyme-substrate complex; Kes2 – the pK value for the second H+ dissociation constant of the enzyme-substrate complex; Ka1 – the pK value of the first acidic dissociation group in the diprotic substrate; Ka2 – the pK value of the second acidic dissociation group in the diprotic substrate.
To model the imperfection of the setup in the real experiment, Ettrue and S0true are instances of random variables, which slightly deviate from the designed Etdes and S0des:

Ettrue = RE·Etdes (3.6)

S0true = RS·S0des (3.7)
In (3.6) and (3.7), RE and RS are instances of positive continuous normally distributed random variables with unit mean values and standard deviations c and d, respectively, where c and d are given.
To model the measurement error in the real experiment, each of the pseudo-measured values of product concentration Pmes(i·Δt), for i=1, 2, …, tover/Δt, is an instance of a random variable, which slightly deviates from the true product concentration Ptrue(i·Δt), for i=1, 2, …, tover/Δt, that results from integrating the ODE (3.2) from 0 to tover:

Pmes(i·Δt) = RP,i·Ptrue(i·Δt) (3.8)
In (3.8), RP,i are instances of positive continuous normally distributed random variables with unit mean values and standard deviations b·(Ptrue(i·Δt))^a, where a and b are given.
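The three error types might be emulated as follows (a self-contained sketch with hypothetical constants; the full pH-, T- and t-dependence of kp,app in (3.3)-(3.5) is collapsed here into a single uniform multiplier):

```python
import numpy as np

rng = np.random.default_rng(7)
d_par, c, d = 0.05, 0.02, 0.02                  # hypothetical Δpar, c, d
a, b = 0.6, 0.03                                # hypothetical error-model constants
kp_nominal, et_des, s0_des = 120.0, 1e-3, 2.0   # hypothetical designed values
p_true = np.linspace(0.05, 0.9, 20)             # hypothetical Ptrue(i·Δt) course

# biological diversity: uniform multiplier on [1 - Δpar, 1 + Δpar]
kp_app = rng.uniform(1 - d_par, 1 + d_par) * kp_nominal

# setup imperfection (3.6)-(3.7): normal multipliers with unit mean
et_true = rng.normal(1.0, c) * et_des           # Ettrue = RE · Etdes
s0_true = rng.normal(1.0, d) * s0_des           # S0true = RS · S0des

# measurement error (3.8): Pmes = RP,i · Ptrue, sd(RP,i) = b · Ptrue^a
p_mes = rng.normal(1.0, b * p_true**a) * p_true
```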
After finding Pmes it is possible to find the measured time course of the substrate, of the enzyme-substrate complex, and of the free enzyme:

Smes(i·Δt) = S0true − Pmes(i·Δt), ESmes(i·Δt) = Ettrue·Smes(i·Δt)/(KM,app + Smes(i·Δt)), Emes(i·Δt) = Ettrue − ESmes(i·Δt) (3.9)
The quantities in (3.8) and (3.9) are shown in Fig. 3.2. The measured velocity of the process can be approximated with the formula:

vmes(i·Δt) ≈ [Pmes(i·Δt) − Pmes((i−1)·Δt)]/Δt (3.10)
The purpose of the end-point assay experiment is to teach the user how to manually set up "optimal" experimental conditions. First of all, the steady-state assumptions have to be checked (the concentration of the enzyme-substrate complex should stay approximately the same). For example, Fig. 3.2 shows that this condition is not met under the selected experimental conditions. An experiment that meets the requirements of the steady state is shown in Fig. 3.3. Students can visually check the validity of the empirical "criteria" for the steady-state model (the degraded substrate should be at most 10% of the initial one, and the total enzyme concentration should be far less than the concentration of the initial substrate plus the constant KM,app), which otherwise would have to be proven with complex mathematical procedures (Segel, 1988). The influence of temperature, of acidity, of the substrate and enzyme concentrations, and of the overall time of the experiment can be estimated by constructing graphs of the reaction velocity as a function of the investigated condition. For example, Fig. 3.4 shows the influence of temperature on the product formation. The first graph shows continuous monitoring of the product generation in the course of the enzyme-catalysed reaction at various temperatures. The continuous assay also illustrates the intrinsic errors of the sampling procedure, which impose the necessity for repeated sampling. The second graph shows the end-point enzyme activity assay for various sampling times. The data are cross-sections of the continuous assay for 1- and 10-min incubation. Comparison of the two curves illustrates the apparent nature of the "optimal" temperature, caused by the time dependence of the heat denaturation.
Adjustment of the steady-state experimental conditions.
General conditions for optimal enzyme action.
As a whole, the instrumental model for the continuous-time assay is used as a manual Monte Carlo system to find suitable experimental conditions, appropriate for the end-point assay experiment, where the values of the kinetic constants and their confidence regions are identified. The end-point assay simulates a cheap experiment, where the user sets up T, pH, tover, Etdes, J predetermined initial substrate concentrations S0j (j=1, 2, …, J), and the number K of replicas for each substrate concentration. The product concentration is measured K times, just at time tover, for each S0j.
Each replica is simulated in the same way as in the continuous-time assay, but the learning sample D0 consists only of the product concentrations at time tover: D0 = {Pmes,k(tover, S0j) | j=1, 2, …, J; k=1, 2, …, K}, where Pmes,k(tover, S0j) is the k-th measurement of the product concentration Pmes at substrate concentration S0j. As long as all the designed experimental conditions are identical for k=1, 2, …, K, the K replicas at a given S0j can be referred to as a process. Let P̄mes,j and σmes,j be the mean value and the standard deviation of the end-point product concentration of the j-th process, calculated from the K instances Pmes,k(tover, S0j). A non-linear regression model of the standard deviation of the measured final product concentration is created as a function of the mean value of the measured final product concentration:

σfit(P̄mes,j) = a1·(P̄mes,j)^a2 (3.11)
The constants a1 and a2 are determined with χ2-minimization of:

Σj [σmes,j − a1·(P̄mes,j)^a2]² (3.12)
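Assuming the power-law form of (3.11) reconstructed above, the error model could be fitted as follows (Python/scipy; the process means and standard deviations are hypothetical):

```python
import numpy as np
from scipy.optimize import curve_fit

p_mean = np.array([0.11, 0.24, 0.42, 0.68, 0.95])     # hypothetical P̄mes,j
p_sd = np.array([0.006, 0.010, 0.014, 0.019, 0.024])  # hypothetical σmes,j

def sd_model(p_bar, a1, a2):
    """Regression model (3.11): sd as a power function of the mean."""
    return a1 * p_bar**a2

# least-squares fit of a1, a2, matching the unweighted sum of squares (3.12)
(a1, a2), _ = curve_fit(sd_model, p_mean, p_sd, p0=(0.02, 0.7))
```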
The final product concentrations are predicted by a model different from (3.2), which takes the form of J ODEs that can be solved separately:

dPj,true(t)/dt = kp·Etdes·(S0j − Pj,true(t))/(KM + S0j − Pj,true(t)), for j=1, 2, …, J (3.13)
The initial conditions of (3.13) are Pj,true(0)=0, for j=1, 2, …, J. In (3.13), kp and KM are the kinetic constants. After integrating the ODE (3.13) from time 0 to tover, the value Pj,true(tover) depends on the values of kp and KM. At the same time, P̄mes,j depends only on the sample D0. Then the optimal parameters kp,0 and KM,0 can be found with χ2-minimization of

χ2(kp, KM) = Σj {[P̄mes,j − Pj,true(tover)]/σfit(P̄mes,j)}² (3.14)

(kp,0, KM,0) = argmin χ2(kp, KM) (3.15)
Solving (3.15) is simplified by the presence of a good initial guess from the linearized Lineweaver-Burk model (Lineweaver & Burk, 1934).
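A sketch of this identification step (Python/scipy; the end-point data, Etdes and the error model are hypothetical): the Lineweaver-Burk line fitted to the approximate velocities P̄mes,j/tover supplies the starting point, and a simplex search refines it by minimizing (3.14):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def p_endpoint(kp, km, et, s0, t_over):
    """Integrate (3.13) for one process and return Pj,true(tover)."""
    rhs = lambda t, p: [kp * et * (s0 - p[0]) / (km + s0 - p[0])]
    return solve_ivp(rhs, (0.0, t_over), [0.0], rtol=1e-8).y[0, -1]

def chi2(theta, s0_list, p_mean, p_sd_fit, et, t_over):
    """Discrepancy (3.14) between measured and modeled end-point products."""
    kp, km = theta
    pred = np.array([p_endpoint(kp, km, et, s0, t_over) for s0 in s0_list])
    return np.sum(((p_mean - pred) / p_sd_fit) ** 2)

def lineweaver_burk_guess(s0_list, p_mean, et, t_over):
    """Initial guess: fit 1/v vs 1/S, with v approximated as P(tover)/tover."""
    v = np.asarray(p_mean) / t_over
    slope, intercept = np.polyfit(1.0 / np.asarray(s0_list), 1.0 / v, 1)
    vmax = 1.0 / intercept
    return np.array([vmax / et, slope * vmax])  # kp ≈ Vmax/Et, KM ≈ slope·Vmax

s0_list = np.array([0.2, 0.5, 1.0, 2.0, 4.0])      # hypothetical S0j
p_mean = np.array([0.05, 0.11, 0.18, 0.26, 0.33])  # hypothetical P̄mes,j
p_sd_fit = 0.03 * p_mean**0.7                      # hypothetical model (3.11)
et, t_over = 1e-3, 10.0
res = minimize(chi2, lineweaver_burk_guess(s0_list, p_mean, et, t_over),
               args=(s0_list, p_mean, p_sd_fit, et, t_over),
               method='Nelder-Mead')
kp0, km0 = res.x
```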
The two-dimensional confidence region of kp and KM is again calculated by Monte Carlo or Bootstrap simulation. The synthetic samples in the Monte Carlo simulation are formed so that for the j-th process, the final product concentrations are generated as K instances of a normally distributed random variable with mean value P̄mes,j and standard deviation σfit(P̄mes,j). The synthetic samples in the Bootstrap simulation are formed so that for the j-th process, the final product concentrations are generated by K draws with replacement from the set {Pmes,k(tover, S0j) | k=1, 2, …, K}. Whatever the method for generating the synthetic samples Dsq, the synthetic parameters are identified from each of them by the same χ2-minimization as in (3.15):

(kp,q, KM,q) = argmin χ2(kp, KM | Dsq), for q=1, 2, …, Q (3.16)
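Both generation schemes fit in a few lines (a sketch; p_obs is assumed to hold the K replicas of each of the J processes column-wise):

```python
import numpy as np

rng = np.random.default_rng(3)

def synthetic_monte_carlo(p_mean, p_sd_fit, k):
    """Monte Carlo: K normal draws per process (columns are processes)."""
    return rng.normal(p_mean, p_sd_fit, size=(k, len(p_mean)))

def synthetic_bootstrap(p_obs):
    """Bootstrap: resample each process's K replicas with replacement."""
    k, j = p_obs.shape
    return np.column_stack([rng.choice(p_obs[:, jj], size=k) for jj in range(j)])
```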
In order to find the root confidence interval, the resulting parameters are flipped around the real parameters, in accordance with the discussion in section 2:

kp,qf = 2·kp,0 − kp,q, KM,qf = 2·KM,0 − KM,q, for q=1, 2, …, Q (3.17)
Let χ2q,f be the discrepancy measure (3.14), calculated over the original sample with the flipped synthetic parameters (kp,qf, KM,qf). Let's renumber the acquired discrepancy measures in ascending order, so that χ2(1),f ≤ χ2(2),f ≤ … ≤ χ2(Q),f. Then the first round(Q·p) sorted vectors belong to the two-dimensional (2-D) confidence region with probability p. These vectors are called inside simulated points, and the rest are the outside simulated points. The area of the inside simulated points determines the root confidence region, and its borders have a constant χ2 discrepancy measure. The projections of the confidence region onto the KM and kp axes are the 2-D confidence intervals (Fig. 3.5). The best point estimate of the parameters is calculated as a trimmed mean of the flipped parameters, using just the inside simulated points.
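The selection of the inside simulated points and the trimmed-mean estimate can be sketched as follows (theta_f is assumed to be the Q×2 array of flipped parameters and chi2_f the corresponding discrepancies (3.14) evaluated on the original sample):

```python
import numpy as np
from scipy import stats

def root_region(theta_f, chi2_f, p=0.95, trim=0.1):
    """Keep the round(Q*p) flipped parameter vectors with the smallest
    discrepancy (the inside points); their trimmed mean is the best estimate."""
    order = np.argsort(chi2_f)
    inside = np.asarray(theta_f)[order[:int(round(p * len(chi2_f)))]]
    best = stats.trim_mean(inside, trim, axis=0)
    return inside, best
```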
Confidence area of the model parameters based on a Monte Carlo simulation procedure.
The percentile confidence region can be calculated likewise, but instead of the flipped synthetic parameters, one should use just the synthetic parameters.
The described simulation system has been employed for over 8 years to train medical students at the Department of Medical Biochemistry of Semmelweis University in Budapest (Hungary). After this training, students (and users in general) are able to perceive the imperative requirement for multiple sampling replicas in experimentation, to interpret the stochastic nature of experimentally estimated model parameters, and to gain insight into the application of in vitro determined kinetic parameters to the modeling of in vivo metabolic events.
The dissolution of intravascular thrombi proceeds through the hydrolytic degradation of their fibrin matrix, catalyzed by the serine protease plasmin (Kolev et al., 2005). Arterial thrombi enclose millimolar concentrations of phospholipids (Varadi et al., 2004) and free fatty acids (Rabai et al., 2007). These lipid constituents of thrombi are reported to modulate the fibrinolytic process (Varadi et al., 2004; Rabai et al., 2007; Hazari et al., 1992; Hazari et al., 1994; Huet et al., 2004). The paper (Tanka-Salamon et al., 2008) investigates the effects of the three most abundant fatty acids in the structure of platelet membranes – arachidonic acid, stearic acid and oleic acid, representing 22.0, 19.5 and 18.8%, respectively, of the total fatty acid content of platelet phosphoglycerolipids (Rabai et al., 2007).
Plasmin (Et=20 nM) was incubated with sodium salts of fatty acids for 15 min at 37 °C. Then 180 µl of this mixture was added to 20 µl Spectrozyme-PL (H-D-norleucylhexahydrotyrosyl-lysine-p-nitroanilide, American Diagnostica, Stamford, CT) at 7 different concentrations in the range 0.05–6 mM, yielding final concentrations S0j (j=1, 2, …, 7) in the volume of the reaction mixtures. The light absorbance at 405 nm (A405), which reflects the release of p-nitroaniline, was measured continuously at ti (i=1, 2, …, 60) time points in the course of 10 min at 37 °C; four parallel measurements were done for each S0j. The main problem in the initial data processing is the proper assessment of the baseline absorbance and of the delay time of the measurement, which profoundly affect the absolute values of the p-nitroaniline product. An original algorithm was employed to convert the measured absorbance into product concentration Pmes,k(ti, S0j) (the notation indicates the product concentration at time ti for the k-th replica with S0j), which forms the learning sample D0 = {Pmes,k(ti, S0j) | i=1, …, 60; j=1, …, 7; k=1, …, 4}. As long as all the designed experimental conditions are identical for k=1, 2, 3, 4, the four replicas at a given S0j can be referred to as a process. Let P̄mes(ti, S0j) and σmes(ti, S0j) be the mean value and the standard deviation of the product concentration of the j-th process at time ti, calculated from the four instances Pmes,k(ti, S0j). A non-linear regression model of the standard deviation of the measured product concentration for each process is created as a function of the mean value of the measured product concentration:

σfit,j(P̄mes) = aj·(P̄mes)^bj (4.1)
The constants aj and bj are determined with χ2-minimization of:

Σi [σmes(ti, S0j) − aj·(P̄mes(ti, S0j))^bj]², for j=1, 2, …, 7 (4.2)
The product concentrations are predicted by three different models with increasing complexity. In the simplest case (Model I) the scheme

E + S ⇌ ES → E + P (4.3)

is assumed, where E is plasmin, S is Spectrozyme-PL, P is p-nitroaniline, and k1, k−1 and k2 are the respective reaction rate constants. With the quasi-steady-state assumption the differential rate equation for this scheme is dP/dt = kp·Et·S/(KM + S) with S = S0 − P, which upon integration gives the implicit form of the progress curve:

kp·Et·t = P + KM·ln(S0/(S0 − P)) (4.4)
The time t in (4.4) is a strictly increasing function of P for any combination of KM and kp, and therefore it has an inverse function P(t), which can be numerically estimated for all measured time points ti by a look-up table procedure; the results can be denoted as Pj,true(ti).
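The look-up table inversion for Model I might look as follows (a sketch based on the integrated form (4.4) as reconstructed above; all kinetic values are hypothetical):

```python
import numpy as np

def t_of_p(p, kp, km, et, s0):
    """Integrated progress curve (4.4): time as a function of product."""
    return (p + km * np.log(s0 / (s0 - p))) / (kp * et)

def p_of_t(t_points, kp, km, et, s0, n=20001):
    """Invert the strictly increasing t(P) on a dense product grid."""
    p_grid = np.linspace(0.0, s0 * (1.0 - 1e-9), n)
    return np.interp(t_points, t_of_p(p_grid, kp, km, et, s0), p_grid)

t_i = np.linspace(10.0, 600.0, 60)   # hypothetical measurement times (s)
p_true = p_of_t(t_i, kp=5.0, km=0.2e-3, et=20e-9, s0=1e-3)
```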
Because in the course of certain experiments the reaction rate declined faster than predicted by Model I, the more general scheme

E + S ⇌ ES → EP ⇌ E + P (4.5)

was also tested (Model II), which accounts for the accumulation of the product and of its complex EP with the enzyme. Assuming steady-state for both the ES and EP complexes, the differential rate equation integrates to the implicit progress curve

kp·Et·t = (1 − KM/Ki)·P + KM·(1 + S0/Ki)·ln(S0/(S0 − P)) (4.6)

where Ki characterizes the dissociation of the EP complex.
The time t in (4.6) is a strictly increasing function of P for any combination of KM, kp and Ki, and therefore it has an inverse function P(t), which can be numerically estimated for all measured time points ti by a look-up table procedure; the results can be denoted as Pj,true(ti).
Because in certain cases the product inhibition could not model the progress curve of the reaction satisfactorily, the instability of the enzyme in the assay system was also considered according to the scheme suggested in (Duggleby, 1995; Duggleby, 2001) (Model III):

dP/dt = kp·E·S/[KM·(1 + P/Ki) + S]

dE/dt = −[J2·KM·(1 + P/Ki) + J3·S]·E/[KM·(1 + P/Ki) + S], with S = S0 − P (4.7)

where J2 and J3 are the first-order inactivation rate constants of the free and of the substrate-bound enzyme, respectively.
The initial conditions of (4.7) are Pj(0)=0 and E(0)=Et. In (4.7), kp and KM have the same meaning as in Model II. The integration of the ODE systems (4.7) from time 0 to t60 was done by a quasi-constant step size implementation in terms of backward differences of the Klopfenstein-Shampine family of Numerical Differentiation Formulas of orders 1–5, and the initial steps were determined so that the solution would stay in its domain during the whole integration (Shampine et al., 2003). The values of the product concentrations at time points ti for Model III can be found from the first component of the solution, Pj,true(ti), for i=1, 2, …, 60 and for j=1, 2, …, 7.
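In Python, a close analogue of the NDF-based integrator is solve_ivp with the BDF method; the right-hand side below mirrors the system (4.7) as reconstructed above, and all numerical values are hypothetical:

```python
import numpy as np
from scipy.integrate import solve_ivp

def model3_rhs(t, y, kp, km, ki, j2, j3, s0):
    """Reconstructed Model III: product formation with product inhibition
    and first-order inactivation of free (J2) and bound (J3) enzyme."""
    p, e = y
    s = s0 - p
    denom = km * (1.0 + p / ki) + s
    dp = kp * e * s / denom
    de = -(j2 * km * (1.0 + p / ki) + j3 * s) * e / denom
    return [dp, de]

t_i = np.linspace(10.0, 600.0, 60)   # hypothetical time points (s)
sol = solve_ivp(model3_rhs, (0.0, t_i[-1]), [0.0, 20e-9], method='BDF',
                args=(5.0, 0.2e-3, 1e-3, 1e-13, 1e-3, 1e-3),
                t_eval=t_i, rtol=1e-8, atol=1e-15)
p_true = sol.y[0]                    # first component: the product course
```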
Since Model I and Model II are special cases of Model III for particular values of J2, J3 and Ki, the further discussion refers only to Model III.
As in section 3, Pj,true(ti) depends only on the kinetic parameters KM, kp, Ki, J2 and J3, whereas P̄mes(ti, S0j) depends only on the sample D0. Then the optimal parameters kp,0, KM,0, Ki,0, J2,0 and J3,0 can be found with χ2-minimization of

χ2(kp, KM, Ki, J2, J3) = Σj Σi {[P̄mes(ti, S0j) − Pj,true(ti)]/σfit,j(P̄mes(ti, S0j))}² (4.8)

(kp,0, KM,0, Ki,0, J2,0, J3,0) = argmin χ2(kp, KM, Ki, J2, J3) (4.9)
The minimization (4.9) was performed using the Nelder-Mead simplex direct search method (Lagarias et al., 1998). Since the optimization assigns values of less than 10−12 s−1 to J2, from now on only J3 shall be considered.
The four-dimensional confidence region of kp, KM, Ki, and J3 is calculated by Monte Carlo simulation. The synthetic samples are formed so that for the i-th time point of the j-th process, the product concentrations are generated as four instances of a normally distributed random variable with mean value P̄mes(ti, S0j) and standard deviation σfit,j(P̄mes(ti, S0j)). Then the synthetic parameters are identified from each synthetic sample Dsq by the same χ2-minimization as in (4.9):

(kp,q, KM,q, Ki,q, J3,q) = argmin χ2(kp, KM, Ki, J3 | Dsq), for q=1, 2, …, Q (4.10)
Let's find the root confidence region of the parameters. That requires flipping the parameters as in (3.17). However, the resulting values may have no physical meaning (they could be negative), unlike the case in section 3, because here we operate with real measurements and there is no appropriate initial guess for the optimization. This situation is a rule rather than an exception in biochemical analysis, where parameters are supposed to be strictly positive. Therefore, one possible solution for finding the root confidence region in such cases is to use a modification of the classical Monte Carlo procedure, called multiplicative Monte Carlo. The main assumption here, in accordance with section 1, is that the ratio between the true parameter value and the optimal parameter value derived from the true data sample has the same distribution as the ratio between the optimal parameter value derived from the true data sample and the optimal synthetic parameter value derived from the synthetic data sample. The assumption is equivalent to performing classical Bootstrap over the logarithms of the estimated parameters. Then the flipped parameters are derived as follows:

kp,qf = kp,0²/kp,q, KM,qf = KM,0²/KM,q, Ki,qf = Ki,0²/Ki,q, J3,qf = J3,0²/J3,q (4.11)
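The multiplicative flip and the corresponding trimmed geometric mean are compact in code (a sketch; theta0 is the vector of optimal parameters and theta_q the Q×M array of synthetic parameters, all strictly positive):

```python
import numpy as np
from scipy import stats

def multiplicative_flip(theta0, theta_q):
    """Log-space flip: ln(theta_f) = 2*ln(theta0) - ln(theta_q),
    i.e. theta_f = theta0**2 / theta_q, always strictly positive."""
    return theta0**2 / np.asarray(theta_q)

def trimmed_geometric_mean(theta_f, trim=0.1):
    """Best point estimate: trimmed mean in log space, exponentiated back."""
    return np.exp(stats.trim_mean(np.log(theta_f), trim, axis=0))
```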
Then the root confidence region is derived in the same way as in section 3. Let χ2q,f be the discrepancy measure (4.8), calculated over the original sample with the flipped synthetic parameters. The discrepancy measures are renumbered in ascending order, so that χ2(1),f ≤ χ2(2),f ≤ … ≤ χ2(Q),f. The inside and outside simulated points are identified as in section 3. Again, the area of the inside simulated points determines the root confidence region. The best point estimate of the parameters is calculated as a trimmed geometric mean of the flipped parameters, using just the inside simulated points. Since it is impossible to plot the 4-D confidence regions, they are visualized in pairs (e.g. Fig. 4.1 shows the confidence region of KM and kp, and of Ki and J3). The kinetic parameters of plasmin in the presence of the three fatty acids at different concentrations are given in Table 4.1, together with the 2-D confidence intervals and the best estimates, whereas the multi-dimensional confidence regions are given in Fig. 4.2.
Confidence regions of KM and kp (in the first plot) and of Ki and J3 (in the second plot).
Kinetic parameters of plasmin in the presence of fatty acids. Numerical values of the best estimates (BE) and their 95% confidence intervals (CI).
Kinetic parameters of plasmin in the presence of fatty acids (oleate OA, arachidonate AA, and stearate SA), and their multi-dimensional confidence regions.
The described study implemented a novel numerical procedure based on Monte Carlo to give a quantitative characterization of the modulation of plasmin activity by three fatty acids. All three fatty acids caused a 10–20-fold increase in the Michaelis constant of plasmin. Based on the ratio of the catalytic and Michaelis constants, all three fatty acids acted as inhibitors of plasmin with various degrees of potency: oleate and arachidonate can be defined as mixed-type inhibitors of plasmin, whereas stearate has a rather unusual effect; the increase in the Michaelis constant is coupled to higher values of the catalytic constant. At saturating concentrations of the substrate this effect is seen as an apparent activation of plasmin in the amidolytic assay. Our findings illustrate the general possibility for a modulator to change the kinetic parameters of an enzyme in an independent and even opposing manner, so that the overall catalytic outcome may vary with the concentration of the substrate. In a physiological context the reported results indicate that, acting as mixed-type inhibitors, unsaturated fatty acids stabilize fibrin against plasmin, whereas through its discordant effects on the catalytic and Michaelis constants stearate may destabilize clots in the process of their formation.
The investigated biochemical problem was a suitable setup to demonstrate the multiplicative Monte Carlo, which guarantees a reasonable physical meaning of the resulting parameters. Its advantages over the classical methods in tasks where strictly positive parameter values are required make it a valuable tool in most biological and biochemical parameter identification studies.
The application of Monte Carlo and Bootstrap techniques in a multi-dimensional setup is principally similar to the one-dimensional case. However, instead of CIs, we need to find confidence regions that contain a certain percentage of the entire probability distribution. As demonstrated, only root and percentile confidence regions are of interest in the multi-dimensional case. The choice between them stems from the two different approaches to probabilities, so researchers should first clarify their viewpoint on probabilities in general before choosing the type of confidence region to exploit. Working with root confidence intervals (regions) requires parameter flipping, which in some cases may generate results with an inappropriate sign. This is particularly true for biochemical (biological) processes, where parameters are supposed to be positive in most cases. Therefore we presented the multiplicative Monte Carlo as a procedure that ensures the physical meaning of the parameters.
Since Monte Carlo and Bootstrap are computer-intensive methods, all calculation and visualization procedures discussed and demonstrated here were executed with MATLAB R2009a. Whenever we used root confidence intervals, we also provided a best point estimate, which as a rule is better than the sample estimate. Generally, the mean of the flipped parameters can serve as the best estimate, but having in mind its sensitivity to errors (extreme values, outliers in the sample, etc.), it is recommended to use trimmed means. Regardless of the way this best estimate is calculated though, it will always be unbiased, unlike the sample estimate.
The recent expansion of technological tools in biomedical research poses the requirement for appropriate modeling of the processes under investigation. The described examples from our work underscore the applicability of Monte Carlo simulations in biochemical models of variable complexity, providing a robust tool to estimate the reliability of the estimated parameters.
This work was supported by grants from the Wellcome Trust [083174] and OTKA [K83023].