# A Rate Equation Approach to Elucidate the Kinetics and Robustness of the TGF-*β* Pathway

Henrik Jönsson,^{*} Evangelia Pardali,^{†} Peter ten Dijke,^{†} and Carsten Peterson^{*}

^{*}Computational Biology & Biological Physics, Department of Theoretical Physics, Lund University, Lund, Sweden; and

^{†}Molecular Cell Biology, Leiden University Medical Center, Leiden, The Netherlands

## Abstract

We present a rate equation model for the TGF-*β* pathway in endothelial cells together with novel measurements. This pathway plays a prominent role in inter- and intracellular communication, and its subversion can lead to cancer, fibrosis, vascular disorders, and immune diseases. The model successfully describes the kinetics of experimental data and also correctly predicts the behavior in experiments where the system is perturbed. A method that is novel in this context, simulated tempering, is used to fit the model parameters to the data. It provides an ensemble of high-quality solutions, which are analyzed with clustering methods and display a hierarchical structure highlighting distinct parameter subspaces with biological interpretations. This analysis discriminates between different biological mechanisms for achieving a transient signal from a sustained TGF-*β* input, where one mechanism is to use a negative feedback to turn the signal off. Further analysis in terms of parameter sensitivity reveals that this negative feedback loop in TGF-*β* signaling gives the system global robustness. This sheds light upon the role of the Smad7 protein in this system.

## INTRODUCTION

### General considerations

Mathematical modeling of signal transduction networks using rate equations is increasingly attracting attention as a powerful tool (see, e.g., (1–5)). It is used to simulate the kinetics of large signaling networks, where one cannot rely on biological intuition alone. In such studies, the aim is to identify and shed light on the role of key components and modules. Furthermore, such approaches allow for predicting quantities not yet measured.

Rate equation modeling involves three major steps:

1. Specify the components and their interactions and set up the system of equations.
2. Find values for the kinetic parameters from experimental estimates or by fitting the model to experimental kinetic data.
3. Analyze the behavior of the model for extracted parameter values.

Step 2 often presents the main limitation for a pathway modeling approach. The systems tend to have many parameters where only a few (if any) have values that represent reliable estimates from experiments. Also, the experimental kinetic data is typically not sufficient to constrain the parameter values to a single optimal solution, and multiple parameter sets can explain the available data. We address this problem by consistently looking at ensembles of parameter sets, where these sets subsequently are clustered with unsupervised methods, providing explanatory insights into the data and related biological interpretations.

A novel tool in this context is developed to deal with the optimization of parameters, simulated tempering (ST), which has previously been used to map out thermodynamical properties of protein-folding models (6,7). As with any other Monte Carlo method, ST naturally provides ensembles of solutions rather than single ones, subject to analysis by standard clustering techniques.

In this article, we apply the rate equation methodology to the Transforming Growth Factor *β* (TGF-*β*) pathway in endothelial cells. The members of the TGF-*β* superfamily are responsible for many different biological functions, including proliferation, differentiation, apoptosis, embryonic development, and wound healing. Perturbations in the TGF-*β* pathway have been detected in several human diseases, most notably in many forms of cancer, and in fibrotic diseases of the liver, the kidney, and the lung (8). This pathway is not too large for modeling, since a sufficient number of measurements is available to infer the parameter values. Nor is it small enough for visual inspection or a simple ON/OFF language to suffice for drawing conclusions about its dynamics and function. We compare the models both to existing data (9,10) and to novel measurements first presented here. The experiments consist of kinetic (time-course) measurements after TGF-*β* stimulation under different conditions: untreated cells and three cases in which different components of the pathway have been perturbed. Two of the experiments are used to fit the model parameters and the other two are left as “blind test” experiments. In addition, we predict the response of the system when varying the ligand dosage. Thus, we develop a predictive model that is tested against existing data. Furthermore, we make testable predictions for further experiments. We also identify, among other things, a feedback loop (Smad7) as important for explaining all data sets used and for the stability of the model.

To our knowledge, this is the first time the TGF-*β* pathway including regulatory aspects is approached with dynamical models. Recently, Vilar et al. (5) presented a detailed receptor model for TGF-*β* signaling, and we will discuss how this model relates to our simplified receptor description.

### The TGF-*β* pathway in endothelial cells

The TGF-*β* signaling pathway in endothelial cells (see Fig. 1 for a simplified layout) is triggered by the TGF-*β* protein, which acts as a ligand by binding to and activating a heteromeric complex of type I and type II serine/threonine kinase receptors. The type I receptor acts downstream of the type II receptor, and the signal is propagated inside the cell as the activated receptor complex is internalized and binds to and phosphorylates proteins of the Smad family called receptor-regulated Smads or R-Smads (11–13). The R-Smads include Smad1, Smad2, Smad3, Smad5, and Smad8. The phosphorylated R-Smads can form complexes with Smad4, also referred to as Co-Smad (11,12). These complexes move into the nucleus where they regulate the transcription of target genes. There is also an inhibitory effect generated by the inhibitory Smads (I-Smads), Smad6 and Smad7 (11,12). The I-Smads negatively regulate the TGF-*β* signaling pathway in three ways: by binding to the receptors and competing with R-Smads for receptor interaction, by recruiting ubiquitin ligase to activated receptor complexes and thereby targeting the receptor for proteasomal degradation, or by recruiting phosphatases (PP1-*α*) that inactivate the type I receptor by dephosphorylation (12–14).

Figure 1: The TGF-*β* pathway in endothelial cells. The ligand, TGF-*β*, binds to the receptors ALK1 or ALK5, and induces phosphorylation of Smad1/5 and Smad2, respectively, which in turn form complexes with Smad4. These complexes move into the nucleus ...

In most cell types, TGF-*β* signaling is mediated via the type I receptor activin receptor-like kinase 5 (ALK5). In endothelial cells, it is also mediated by the similar ALK1 kinase. Endothelial cells make up the endothelium, a single layer of flattened cells, which are responsible for the formation of connective tissues such as blood cells, blood vessels, etc. Since neovascularization is a rate-limiting step in cancer progression, research has frequently been focused on endothelial cells (15).

The two receptor proteins ALK1 and ALK5 give rise to two distinct pathways, which in turn induce opposite cellular functions. The TGF-*β*/ALK5 pathway induces the phosphorylation of Smad2 and Smad3 whereas the TGF-*β*/ALK1 pathway is responsible for the phosphorylation of Smad1 and Smad5. Moreover, ALK5 inhibits migration and proliferation while ALK1 stimulates these processes (9).

The phosphorylated R-Smads also display different behaviors in endothelial cells. It has been shown in Valdimarsdottir et al. (10) that the negative regulation of Smad1/5 is dependent on some newly synthesized protein and that Smad7 is induced by TGF-*β*/ALK1 signaling but unaffected by the TGF-*β*/ALK5 signaling. An interpretation of this would be that in endothelial cells, TGF-*β*-induced activated Smad1/5, together with Smad4, activates the production of Smad7. The effect of Smad7 on the two pathways is also different. It has been shown to inactivate the ligand-bound ALK1 receptor. It can target the activated receptor for a ubiquitin-ligase-dependent degradation (14,16). Smad7 can also recruit a phosphatase (PP1-*α*) to the activated ALK1 receptor and thus inhibit further phosphorylation of Smad1/5 (10). It has been shown that only high levels of Smad7 have an inhibitory effect on phosphorylated Smad2 (10). This leads to the conclusion that Smad7 negatively regulates the phosphorylation of both Smad1/5 and Smad2, but the latter interaction is much weaker.

The putative TGF-*β*-induced negative feedback from Smad7 is an interesting aspect of the pathway. What is its purpose? If it is merely to shut off the ALK1 pathway, could this not be controlled by simpler means, such as in the form of creation and degradation? These are two main questions investigated in our computational analysis of the pathway.

## MATERIALS AND METHODS

### Use of experimental data

Relative concentration levels for phosphorylated Smad1 (PSmad1) and phosphorylated Smad2 (PSmad2) are estimated from Western blot analysis. The time-course data sets are from five different experiments after TGF-*β* stimulation, and both novel and existing measurements are used. The data sets consist of:

- Experiment I: A nonperturbed experiment (control), where the cells are only stimulated with TGF-*β*. This new experiment is described below.
- Experiment II: Cells that are treated with the protein synthesis inhibitor cycloheximide, which is modeled by completely blocking all protein production (10).
- Experiment III: Cells that are treated with the proteasome inhibitor MG-132, which is modeled by removing the proteasomal degradation of all proteins (10).
- Experiment IV: Cells that are treated with the phosphatase inhibitor orthovanadate, which is modeled by removing the phosphatases from the model (10).
- Experiment V: An additional nonperturbed experiment, where the dose-response of phosphorylated Smad1 and Smad2 is measured by varying the amount of TGF-*β* (9).

In Experiments I–IV, the concentrations are measured at times 0, 45, 90, 120, 180, and 240 min after TGF-*β* addition. To investigate the early dynamics of the pathway, we have also performed additional measurements of Experiment I at times 0, 5, 15, 30, 45, 60, and 120 min. The dose-responses, in Experiment V, are measured after 45 min only. The doses in this experiment varied from 0 to 5 ng/ml in six steps.

There are many possible ways to fit the model to experimental data, many of which display nonbiological behavior. To reduce the number of possible solutions, we fit to more than one set of experimental data. For detailed studies, we use Experiments I, II, and V for this calibration, whereas the others are used as blind-test experiments. In this way, the predictive power of our approach is tested. We also permute the experiments used for calibration to investigate the effects of such alterations.

### Details of new measurements

#### Kinetics of TGF-*β*3 induced Smad2 phosphorylation versus TGF-*β*3 induced Smad1/5 phosphorylation

Mouse embryonic endothelial cells were stimulated with 1 ng/ml TGF-*β*3 for different lengths of time before lysis, fractionated by 6% SDS-PAGE, and blotted. As a positive control, 293-cell lysate transfected with either Smad2/constitutively active ALK5 (PS2) or Smad1/constitutively active ALK1 (PS1) was used. The filters were incubated with phospho-Smad2 or phospho-Smad1 antibodies; detection was performed by enhanced chemiluminescence.

#### Ligands and cells and Western blot analysis

Recombinant TGF-*β*3 was obtained from K. Iwata (OSI Pharmaceuticals, Melville, NY). All assays were performed with both ligands with essentially the same results. Recombinant BMP6 was a gift from Dr. K. Sampath (Curis, Cambridge, MA). Mouse embryonic endothelial cells were cultured and Western blot analysis was performed as described in Goumans et al. (9) and shown in Fig. 4 *C* below.

### The model

Our aim is to develop a model versatile enough to be able to explain current data for the TGF-*β* pathway in endothelial cells and where the perturbation experiments described above can be naturally implemented. At the same time, each individual reaction step should be described as simply as possible to keep the number of parameters low. To this end, we model the TGF-*β* pathway as described in Table 1 (see also Fig. 1). All reactions are assumed to be reversible and constant production and degradation of all the nonphosphorylated proteins are allowed for.

Table caption: … TGF-*β* pathway model, where *p*_{i} (*i* = 0, 1, …, 32) are the rate constants.

#### Receptor dynamics

We only include the type I receptors that are explicitly activated by TGF-*β*, and do not include receptor internalization and recycling (Fig. 2). This simplistic description of the receptor dynamics can be compared with a recently introduced, rather detailed model for the TGF-*β* receptors, which takes into account phenomena such as receptor recycling and trafficking (5). This detailed model is capable of describing different kinds of receptor responses to extracellular ligand concentrations depending on the situation at hand. We demonstrate that, regardless of the simplifications, our receptor model behaves strikingly similarly to the more complicated model of Vilar et al. (5), at least as long as only one ligand of the TGF-*β* superfamily is present (see Supplementary Material for details). An explanation for this similarity is that, although our simplistic receptor model has far fewer parameters, it does include variants of the parameters pinpointed as the most important ones by Vilar et al. (5), namely those that determine the ratio of degradation of the unbound receptor to that of the activated receptor.

#### Phosphorylation and complex formation

The activated receptors catalyze the phosphorylation of the R-Smads (Smad1, Smad2), which is described by a Michaelis-Menten formalism. PSmad1 and PSmad2 can form complexes with Smad4, and the complex including PSmad1 can move into the nucleus and induce Smad7 production. We assume a constant volume difference between the cytoplasm and nucleus, which can be integrated into model parameters and a nucleus concentration unit, and hence the volumes are not explicitly introduced in the model.

#### Feedback inhibition

As described above, Smad7 has an inhibitory effect on the signal. This is modeled by recruitment of the phosphatase *P*_{A} (*P*_{B}) to the activated ALK1 (ALK5), which leads to an inactivation of the receptor. Since a ubiquitin-ligase-dependent degradation of the activated receptor leads to a similar inactivation behavior, we do not account for this process explicitly in the model.

#### Formalism

The reactions in Table 1 are implemented with standard rate equations using deterministic ordinary differential equations. This assumes that ample numbers of molecules are involved and that the dynamics are not dominated by rare events. These conditions are very likely satisfied in the TGF-*β* case. For all reactions we use mass action or Michaelis-Menten enzyme kinetics. The complete set of equations is given in Table 2. As an example, the equation for Smad1 concentration is given by

which can be deduced from rows *i* and *l* in Table 1 (see Fig. 1). [X] denotes the concentration of molecule *X* and the *p*_{i} values are kinetic parameters. For the production and degradation terms, we have chosen a parameterization in which *r* and *l* correspond to the production rate and the equilibrium level, respectively. These equilibrium levels are also used as initial concentrations in the simulations.
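To make the structure of such a rate equation concrete, the following is a minimal sketch, not the equation from Table 2. It assumes that the production/degradation term has the form r(1 − [X]/l), so that it vanishes at the equilibrium level l, and that dephosphorylation of PSmad1 is first-order mass action. The function names and the arguments `v`, `k_m`, and `k_dephos` are hypothetical and do not correspond to specific *p*_{i}.

```python
def production_degradation(x, r, l):
    """Net production term with production rate r and equilibrium level l.

    Assumed form: constant production r and first-order degradation (r / l) * x,
    so the term is zero when x equals the equilibrium level l.
    """
    return r * (1.0 - x / l)


def michaelis_menten(enzyme, substrate, v, k_m):
    """Michaelis-Menten rate for an enzyme acting on a substrate."""
    return v * enzyme * substrate / (k_m + substrate)


def dsmad1_dt(smad1, psmad1, active_alk1, r, l, v, k_m, k_dephos):
    """Hypothetical right-hand side for the unphosphorylated Smad1 concentration."""
    return (
        production_degradation(smad1, r, l)               # production and degradation
        - michaelis_menten(active_alk1, smad1, v, k_m)    # loss by phosphorylation
        + k_dephos * psmad1                               # gain by dephosphorylation
    )
```

With no active receptor and no PSmad1, the derivative is zero when Smad1 sits at its equilibrium level, which is consistent with using the equilibrium levels as initial concentrations.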

### Computational procedures

We use a general computational procedure that can be divided into calibration and analysis (see Fig. 3). In the calibration part, we extract parameter value sets that describe experimental data well, which results in an ensemble of solutions. The calibration consists of two parts:

- Optimization, where the parameters are adjusted for the model to fit the experimental data.
- Filtering, where good solutions from the optimization procedure are evaluated against other experimental knowledge.


These procedures require multiple simulations of the model, where the result of the numerical integration of the ordinary differential equations (ODEs) is compared with the experimental data. In the analysis part, we investigate the behavior of the solutions following from the calibration step. As a first step, the solutions are grouped by clustering. The resulting subgroups are further evaluated by examining the group-averaged behavior. In a validation step, the solution behavior is compared with blind-test experiments, where the predictive power of the solutions is investigated. Also, we analyze how robust the solutions are with respect to perturbation of the parameters.

#### Solving the system of ordinary differential equations

The efficiency of the differential equation solver is extremely important since this is where most computational time is spent, in particular since the equations are often stiff. We use a procedure that adaptively switches between two methods to minimize the computational load:

1. Fifth-order Runge-Kutta method, where the step size is varied to keep the local truncation error constant, using an embedded fourth-order method to estimate the truncation error; and
2. The Rosenbrock method, which is an implicit method that uses the same kind of step-size control as Method 1, but is more efficient in the regions of parameter space where the ODEs become stiff.

Both methods are described in Press et al. (17) and initial parameter values and other details in this procedure can be found in the Appendix.
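The step-size control idea can be illustrated with a small sketch. It is not the solver used here: for brevity it uses a classical fourth-order Runge-Kutta step and estimates the local error by step doubling rather than by an embedded lower-order formula, and it does not include the switch to an implicit Rosenbrock method for stiff regions. The function names, the tolerance, and the safety factors are illustrative assumptions.

```python
def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def integrate_adaptive(f, t0, y0, t_end, h=0.1, tol=1e-8):
    """Integrate from t0 to t_end with error-based step-size control.

    The local error is estimated by step doubling (one full step versus two
    half steps), a stand-in for an embedded error estimate.
    """
    t, y = t0, list(y0)
    while t < t_end:
        last = h >= t_end - t
        if last:
            h = t_end - t                      # do not step past the end point
        full = rk4_step(f, t, y, h)
        half = rk4_step(f, t + h / 2, rk4_step(f, t, y, h / 2), h / 2)
        err = max(abs(a - b) for a, b in zip(full, half))
        if err <= tol:                         # accept the more accurate result
            t = t_end if last else t + h
            y = half
        # Rescale the step; the exponent 1/5 matches a fourth-order method.
        scale = 0.9 * (tol / err) ** 0.2 if err > 0 else 2.0
        h *= min(2.0, max(0.2, scale))
    return y
```

For example, `integrate_adaptive(lambda t, y: [-y[0]], 0.0, [1.0], 1.0)` integrates simple exponential decay over one time unit.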

#### Calibration

In the optimization procedure, we estimate the parameters of the model by fitting to experimental data. After each solution to the ODEs in the iterative process, the *K* parameters **p** = (*p*_{0},…,*p*_{K−1}) are adjusted such that the model should more accurately describe the experimental data. The latter consist of *N* discrete time points for each experiment. As error measure, the quadratic difference is used,

where *x*_{i}(*t*, **p**) denotes the model points, which are compared with the corresponding experimental points, and the index *i* denotes the different molecules (*M* in total). We use two experiments in the optimization procedure, and the sum of the two *R* values is used as error measure. To find good approximate solutions to global minima for Eq. 2, one can use Monte Carlo methods like simulated annealing (18). Here, we employ a related but more powerful method, simulated tempering (ST) (6,7), where the fictitious temperature is a dynamic variable, and the system is always kept at equilibrium for the different temperatures. Solutions are obtained by “quenching” from the lowest temperature to *T* = 0, which corresponds to a local search. The underlying idea is to scan sizable parts of the solution space at different high temperatures and regularly visit low-temperature solutions. In a sense, this optimization method corresponds to simulated annealing with multiple random starts, and it yields ensembles of solutions rather than single ones. The details of the ST implementation are found in the Appendix.
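The following toy sketch illustrates the ingredients described above: a quadratic error, Metropolis updates of the parameters at a fixed fictitious temperature, updates of the temperature index itself, and a final zero-temperature quench. It is a one-parameter illustration under stated assumptions, not the implementation from the Appendix. The unnormalized sum of squares, the temperature ladder, the per-temperature weights, and all function names are hypothetical choices.

```python
import math
import random


def quadratic_error(model, data):
    """Sum of squared differences between model and experimental points."""
    return sum((m - d) ** 2 for m, d in zip(model, data))


def simulated_tempering(energy, p0, temperatures, weights, n_steps, step=0.5, seed=0):
    """Toy simulated tempering for a one-parameter energy function.

    temperatures: ladder of fictitious temperatures, lowest first.
    weights: one tunable constant per temperature (hypothetical values) that
             balances how often each temperature is visited.
    Returns the parameter values recorded while at the lowest temperature.
    """
    rng = random.Random(seed)
    p, e, k = p0, energy(p0), len(temperatures) - 1    # start at the highest temperature
    visited = []
    for _ in range(n_steps):
        # Metropolis update of the parameter at the current temperature.
        q = p + rng.uniform(-step, step)
        eq = energy(q)
        if eq <= e or rng.random() < math.exp(-(eq - e) / temperatures[k]):
            p, e = q, eq
        # Metropolis update of the temperature index (the dynamic variable).
        j = k + rng.choice((-1, 1))
        if 0 <= j < len(temperatures):
            log_acc = (-e / temperatures[j] + weights[j]) - (-e / temperatures[k] + weights[k])
            if log_acc >= 0 or rng.random() < math.exp(log_acc):
                k = j
        if k == 0:
            visited.append(p)
    return visited


def quench(energy, p, n_steps=200, step=0.05, seed=0):
    """Zero-temperature local search: accept only moves that lower the energy."""
    rng = random.Random(seed)
    e = energy(p)
    for _ in range(n_steps):
        q = p + rng.uniform(-step, step)
        eq = energy(q)
        if eq < e:
            p, e = q, eq
    return p
```

In a calibration setting, `energy` would wrap an ODE integration followed by `quadratic_error`, and each value returned at the lowest temperature would be passed through `quench` to give one member of the solution ensemble.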

To further restrict the behavior of solutions included in the analysis, we select those solutions from the optimization step that correctly describe the dosage Experiment V. We run the model for different dosages of TGF-*β* and calculate a measure similar to *R* (see Appendix for details). Finally, a small subset of these solutions is removed based on an overfitted behavior. (Note that a small number of the solutions, four, display a high-order behavior in the simulations. Although these solutions do get a good *R*-value, the behavior does not fit the experiments well if the concentration levels are assumed to interpolate smoothly between the measurements. These solutions are removed by inspection, but could have been removed by, e.g., using a criterion of not allowing for multiple peaks. If these solutions are included in the analysis, they cluster with the group not using the feedback. The group behavior is not altered significantly, but the sensitivity and the variation in the predictions are slightly increased.)

#### Solution properties

To investigate properties and interior structures of the solution space, we use three different methods:

- Hierarchical clustering.
- K-means clustering.
- Principal component analysis (PCA).

Before this analysis, the data is preprocessed to obtain a distribution for each parameter with mean of 0 and standard deviation of 1 (for details on the implementation, see the Appendix).
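The preprocessing step can be sketched as follows, where each row is one solution (parameter set) and each column is one parameter. This is a minimal illustration; the use of the population standard deviation is an assumption, and the function name is hypothetical.

```python
import math


def standardize_columns(rows):
    """Rescale each parameter (column) to mean 0 and standard deviation 1.

    Assumes every column has nonzero variance; a constant column would
    cause a division by zero.
    """
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(c) / n for c in columns]
    # Population standard deviation (divide by n); dividing by n - 1 is an
    # equally plausible choice.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / n)
           for c, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]
```

Standardizing in this way keeps parameters that span several orders of magnitude from dominating the distance measures used in the clustering.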

#### Robustness

A common method used to analyze the robustness of a system is to use the derivatives of the molecule concentrations, *x*_{i}(*t*, **p**), with respect to the different parameters, **p**, as a direct measurement of the sensitivity of the system (20). We define a sensitivity vector according to

where the derivative is approximated by a simple finite-difference approximation, using 1% parameter variations.
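A finite-difference derivative with a 1% parameter variation can be sketched as below. This shows only the derivative estimate; how the derivatives are normalized and summed into the sensitivity vector of Eq. 3 is not reproduced here. The function name and the `model` callable are hypothetical.

```python
def sensitivity(model, params, index, rel_step=0.01):
    """Forward finite-difference estimate of d(model)/d(params[index]).

    model: callable taking a list of parameters and returning a list of
           outputs (for example, concentrations at the measured time points).
    rel_step: relative parameter variation (0.01 corresponds to 1%).
    Assumes params[index] is nonzero, since the step is relative.
    """
    shifted = list(params)
    delta = rel_step * params[index]
    shifted[index] += delta
    base, moved = model(params), model(shifted)
    return [(m - b) / delta for m, b in zip(moved, base)]
```

In practice `model` would integrate the ODEs for a given parameter set, so each call to `sensitivity` costs one extra integration per parameter.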

## RESULTS

### Calibration

First, we generate an ensemble of solutions from fitting to the control (I) and cycloheximide (II) experiments. Good solutions are selected with the criterion *R* < 0.01, yielding ~200 solutions. As can be seen from Fig. 4, these solutions fit both experiments well. Hence the parameterization form is appropriate and the optimization method efficient. Next, we select those solutions that at the same time successfully describe the saturated behavior at different TGF-*β* dosages (Experiment V). An ensemble of 38 solutions passes this filtering step (see Fig. 5), from which four are removed based on an overfitting criterion (see end of Calibration, above). The remaining 34 solutions are used for further investigations.

### Solution properties

Individual parameter values vary considerably in the calibration solution set, most with ranges of several orders of magnitude. To analyze the homogeneity of the solutions, we cluster the ensemble of parameter sets using different clustering algorithms and distance measures. The result for hierarchical clustering with a Pearson correlation distance measure is shown in Fig. 6 *A*, where two main groups can be identified. K-means clustering with *K* = 2 results in a similar grouping. Fig. 6 *B* shows the K-means result projected onto the two main directions from a principal component analysis. As will be shown in a more detailed analysis, the two groups of solutions define two very distinct biological interpretations of how the PSmad1 signal is made transient in the case of a sustained TGF-*β* input: all the group-2 solutions use the putative Smad7 feedback loop, while the solutions of group 1 do not. This division of the solutions is very robust to a variety of settings in the clustering algorithms. Occasionally, a small set of the solutions emerges as outliers, and two of the solutions also end up in different clusters depending on method (compare Fig. 6, *A* and *B*). Although our analysis does not depend upon the assignment of these two solutions, we choose not to include them in the further analysis. The parameter values for the remaining 32 solutions are provided in Supplementary Material.
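As an illustration of a Pearson correlation distance between two standardized parameter sets, the sketch below uses the common definition 1 − r, where r is the Pearson correlation coefficient. Whether this exact variant was used in the clustering is an assumption, and the function name is hypothetical.

```python
import math


def pearson_distance(a, b):
    """Distance 1 - r, where r is the Pearson correlation of two vectors.

    The result ranges from 0 (perfectly correlated) to 2 (perfectly
    anticorrelated). Assumes neither vector is constant.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)
```

Because this distance depends only on the pattern of relative parameter values, two solutions with proportional parameter profiles are treated as close even if their overall scales differ.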

We also performed clustering on a subset of the solutions that do not correctly describe the dose-response experiment, but still satisfy *R* < 0.01. In this case, we get equivalent results, with two distinct groups with the same difference in biologically interpretable behavior with one group using the Smad7 feedback loop, whereas the other group does not (data not shown).

### Prediction

The solutions that were clustered were chosen to accurately predict the dosage experiment (Fig. 5). To further analyze the predictive power of the two defined ensembles of solutions we have performed two blind-test experiments: Cells treated with the proteasome inhibitor MG-132 (III) and phosphatase inhibitor orthovanadate (IV), respectively. In Fig. 7, the model predictions from group 1 and group 2 are shown and compared with experiments, again for levels of PSmad1 and PSmad2. As can be seen, the PSmad2 levels are not affected significantly in either of the perturbed systems as compared to the control experiment (Fig. 4 *A*). This behavior is accurately predicted by both groups of solutions. In the MG-132 experiment (see Fig. 7 *A*), the PSmad1 signal still appears transient although the peak is broadened in time. Both groups of solutions predict a transient PSmad1 signal very similar to the behavior of the control experiment in this case. This lack of broadening of the peak for all solutions is discussed in more detail below, where we do optimization on control and MG-132.

Figure 7: (*A*) Data and model predictions of PSmad1 and PSmad2 concentration when the cells are treated with the proteasome inhibitor MG-132, which is modeled by removing the proteasomal degradation of all proteins (Experiment III). The solutions used are those ...

It is in the PSmad1 behavior in the orthovanadate experiment (see Fig. 7 *B*) that the predictions from the two groups distinctly differ. In this case, group 1 predicts a transient PSmad1 signal very similar to the behavior in the control experiment, whereas group 2 predicts a more sustained signal in closer agreement with the experimental values. This experiment (and model perturbation) mainly affects the feedback from Smad7 by preventing the phosphatase from inactivating the activated ALK1 receptor, and the behavior of group 1 in this case indicates that these solutions do not use the feedback loop.

It should be noted that these experiments are quite crude and may affect the cells in ways not feasible to include in our model, which is restricted to the molecules directly involved in the TGF-*β* pathway. A much more direct experiment for model prediction would be to perturb a single specific molecule included in the model, e.g., silencing Smad7 by an siRNA knockdown. The predicted PSmad1 and PSmad2 behaviors for the two groups when Smad7 is silenced are shown in Fig. 8. This is particularly interesting since the two solution groups exhibit very different behaviors. Again, the unchanged PSmad1 behavior of group 1 shows that these solutions do not need the Smad7 feedback to achieve a transient signal. The prediction for the feedback model is dependent on the assumption that Smad7 is the I-Smad active in endothelial cells, which is based on experiments. Smad6 could potentially also be active although there is no data for Smad6 behavior in endothelial cells. In other cell types, Smad6 has been shown to be more moderately and transiently induced by TGF-*β* compared to Smad7 (21,22). A fair assumption would be that if Smad6 is induced in endothelial cells its behavior would resemble the Smad7 behavior, which would lead to similar behavior for a model including Smad6 in all previous experiments but not for the Smad7 knockdown experiment. Instead, the effect of Smad7 knockdown would be less pronounced in such a feedback model.

### Robustness

To further illuminate differences between the two groups of solutions, we computed the sensitivity as defined in Eq. 3. In Fig. 9, the sensitivities of the two groups are shown, where the summed derivatives of PSmad1 and PSmad2 with respect to the parameters for Experiments I and II are displayed. It is clear that the group using the Smad7 feedback loop (group 2) is more robust than the other group. A Wilcoxon two-sample test on the measure for the solutions in the two groups gives a *p*-value <10^{−6}. The largest difference is found in the parameters governing the production and degradation of Smad1 and Smad4 (parameters *p*_{2}–*p*_{5}). This indicates that group 1 uses Smad1 and Smad4 production and degradation to achieve the transient PSmad1 signal instead of using the negative feedback of Smad7. It is indeed very interesting that the transient signal can be achieved by a pathway with fewer molecular players, but it appears that the drawback for the cells would be that the levels and production/degradation rates for Smad1 and Smad4 need to be tightly regulated to achieve a robust signal behavior. In contrast to this, the group that uses the Smad7 feedback shows a low sensitivity with respect to Smad4 levels (*p*_{4},*p*_{5}), and more or less no sensitivity at all to Smad1 levels (*p*_{2},*p*_{3}). This latter fact, and the lack of sensitivity toward changes of the Michaelis-Menten constant in the phosphorylation step (*p*_{16}), indicates that the Smad1 levels are saturated. A more detailed look at the parameter values and Smad1 levels reveals that all solutions in group 2 indeed have saturated levels of Smad1 (data not shown), which hence can be regarded as a prediction of the model using Smad7 feedback.

Group 1 is insensitive to perturbations in all parameters directly included in the Smad7 feedback pathway (*p*_{10}–*p*_{12}, *p*_{27}, *p*_{28}, *p*_{31}, *p*_{32}), which agrees with the conclusion that the feedback is not used by these solutions. Group 2, on the other hand, shows some sensitivity in these parameters except for the parameters included in the Smad7 feedback on the activated ALK5 receptor (*p*_{31}, *p*_{32}). Neither of these solution ensembles makes use of a Smad7 feedback for regulating PSmad2 levels, and this part of the network could have been left out of the model, at least for explaining the current experiments (compare to (10)).

The most sensitive parameters in group 2 are *p*_{1}, *p*_{9}, *p*_{15}, *p*_{17}, *p*_{22}, and *p*_{24}, and group 1 is about equally sensitive to these parameters. These parameters govern the initial ALK1 and ALK5 levels (*p*_{1}, *p*_{9}), as well as the rates of phosphorylation and dephosphorylation of Smad1 (*p*_{15}, *p*_{17}) and Smad2 (*p*_{22}, *p*_{24}). The early PSmad1 and PSmad2 kinetics and (at least partly) the entire PSmad signal are also dependent on these parameters. Hence, it is expected that the fitting to our kinetic PSmad1 and PSmad2 data is sensitive to these parameters. A final note is that although the ALK1 and ALK5 levels are important, the production and degradation rates are not (*p*_{0}, *p*_{8}). A more detailed look at the parameter levels shows that these rates are low (data not shown), and it appears that it is the initial values that are important for the model to explain data.

### Permuting the experiments for the calibration

To further analyze the model behavior we also permuted the experiments used for calibration. We used combinations including the control experiment in the calibration part since this is the only experiment where all the parameters are present. Also, here we applied the dose experiment as a filtering step after optimization. The two additional calibration sets used were optimization on control (I) and MG-132 (III), and on control (I) and orthovanadate (IV) experiments. The new parameter sets are presented in a PCA-plot in Fig. 10 together with the previously defined parameter sets.

Figure 10: … (*right*) from those that do not (*left*) (see Supplementary Material, Fig. S4).

When optimization is performed on control and orthovanadate, the extracted solutions behave very similarly to the ones extracted from optimization on control and cycloheximide (see Supplementary Material, Fig. S4). All these solutions use the Smad7 feedback in the process of truncating the PSmad1 signal, which is expected since the optimization includes the orthovanadate experiment, which mainly affects the feedback. Also, the robustness analysis on this new data set shows a pattern very similar to that of the previously defined group 2 (data not shown).

In the case of optimizing against the control and MG-132, the optimization procedure works less efficiently. Among the solutions provided by the algorithm, only very few resulted in *R* < 0.01 and among those, none passed the filtering step against the dose experiment (see Appendix for details). The parameter sets from this case provided in Fig. 10 are solutions with *R* < 0.015 that pass the dose experiment filter. These solutions show an average behavior for the PSmad1 lying in between the experimental curves for control and MG-132, and with very small change in behavior when protein degradation is removed (see Supplementary Material, Fig. S4). None of the parameter sets use the Smad7 feedback, and therefore these provide a poor prediction of the orthovanadate experiment, while the predictive power is small for the cycloheximide experiment since the behavior is very spread out. An interesting note is that this apparent conflict in explaining the MG-132 experiment together with the other experiments can be used to direct improvements of the model. This is illustrated by a slight adjustment of the model perturbation for the MG-132 experiment where a decreased inactivation of the activated receptors is included (simulating reduced ubiquitin-dependent degradation), which leads to an improved behavior (see Supplementary Material, Fig. S5).

## CONCLUSIONS AND OUTLOOK

We have developed a mathematical model for the TGF-*β* pathway in endothelial cells and introduced novel computational procedures for finding and analyzing robust models. This system was chosen given its paramount importance in diseases like cancer and in developmental processes, even though the information about concentrations, reaction rates, and other parameters is scarce. To cope with the latter, we generate an ensemble of solutions rather than a single one when fitting to the data. This also means that we are less sensitive to noise and, as it turned out, we are able to identify different solution categories with associated biological interpretations. We use different kinetic data sets by varying conditions including knockdowns. Some of the data sets already exist and others are newly generated and are presented here for the first time. Having access to kinetic data under different conditions enables us to fit models to a subset of these and use the remaining sets for blind-test evaluations. Our results can be summarized as follows:

- With efficient ODE solvers and a powerful optimization method, simulated tempering (ST), good solutions are found to the calibration sets.
- The calibrated solutions reproduce the blind-test experiments well, including those in which the external dosage is varied.
- The resulting solutions are analyzed with unsupervised clustering methods. Two clusters emerge—one in which the Smad7 feedback loop is employed, and another in which it is not. The group using the Smad7 feedback is better at predicting the blind-test experiments.
- The robustness is investigated with a gradient method. It is found that the solutions corresponding to the cluster using the Smad7 feedback loop are less sensitive to parameter perturbations, indicating that a role for this loop is to provide robustness to the system.
- Permutation of the experiments used for optimization resulted in similar solution sets, but also highlighted the MG-132 experiment as difficult for the model to reconcile with the other experiments. This can be used to direct improvements of the model, as indicated by simulations in which the interpretation of the MG-132 experiment is adjusted.

In our robustness analysis we have investigated how the dynamical levels of different PSmads change for different parameter perturbations. The PSmads represent the signal through the pathway, but perhaps a more biologically relevant measure is the robustness in cell response. Hence, in the future one should augment the PSmad concentration measurements with downstream gene expression data and perform an integrated analysis. In this context, one should also include the effects from cross talk with neighboring pathways that are part of the TGF-*β* family.

Very recently, a detailed model for receptor dynamics was introduced in the context of the TGF-*β* pathway (5). It does not target endothelial cells specifically, but presents a detailed study of receptor dynamics including internalization and a specific inactivation of the ligand-bound receptor complex by degradation. This model is sufficient to explain a transient signal for PSmad2 after sustained TGF-*β* stimulation. To relate this to our more simplistic receptor model, which does not explicitly include receptor recycling, we showed that our receptor model has as versatile an activation pattern when a single ligand is presented to the receptor. In endothelial cells, cyclohexamide treatment extends the PSmad1 signal, while the same treatment in HaCaT cells has been shown to shorten the PSmad2 signal (23). Although the detailed receptor model predicts a shortened activation upon cyclohexamide treatment (see Supplementary Material, Fig. S1), in full agreement with the PSmad2 data, our full pathway model can indeed explain the PSmad1 behavior in cyclohexamide-treated endothelial cells.

From the behavior of our different solution groups, we argue for a model in which TGF-*β*-induced Smad7 feeds back to repress the PSmad1 activation. This is based on indications from several experiments, which are all reproduced by the feedback model. Needless to say, a more distinct test of this model would be a dedicated knockdown experiment for Smad7; such siRNA experiments targeting Smad7 are currently in progress. In this context, the importance of Smad6 in endothelial cells also needs to be investigated.

Our approach is not restricted to systems where all parameter values can be experimentally estimated. Rather, it allows for several solutions to a problem, and can account for similarities in the behavior of highly conserved modules such as the TGF-*β* pathway even though quantitative details differ. In this study we are confined to experimental data that have not been calibrated to units of concentration. This lack of knowledge propagates to our parameters. Also, the measurements are restricted to a few components, and we have therefore chosen a simplistic description of some of the reactions. Hence, we have focused on the biologically relevant behavior of the measured molecules under different conditions and have not attempted to compare parameter values with biologically reasonable ones, which would have depended on further assumptions. Additional experiments that provide quantitative estimates of parameters and concentration levels are important and will constrain the solution space for the models. On the other hand, we demonstrate that the models can pinpoint experiments that will provide maximal information given the current knowledge, and that the combination of experiments and modeling provides an effective methodology for an increased understanding of highly complex biological networks.

## SUPPLEMENTARY MATERIAL

An online supplement to this article can be found by visiting BJ Online at http://www.biophysj.org.

## Acknowledgments

We thank Marie-Jose Goumans and Gudrun Valdimarsdottir for valuable discussions.

This work was in part supported by the Swedish Foundation for Strategic Research through a “Senior Individual Grant” (C.P.), by the Knut and Alice Wallenberg Foundation through Swegene and the Swedish Research Council (H.J.), and by the European Community project “Angiotargeting Integrated Project” No. 504743 and the Dutch Cancer Society with grant No. NKI 2005-3371 (to P.t.D.).

## APPENDIX

#### Experimental data

All the experimental data originate from Western blot analysis, where we measure the average intensity in a square on the inverted blot images and use these intensities as a relative measure of concentration. As it turns out, the size of the square has only a marginal effect on the estimated concentration levels. The concentrations are normalized with the actin level measured in the cell, which is fairly constant throughout the time series. Finally, the concentrations are normalized to give a maximum value of 1 for both PSmad1 and PSmad2.
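The two normalization steps can be sketched as follows. This is only an illustration: the function name `normalize_blot` and the intensity values are hypothetical, and the maximum is assumed to be taken over a single time series, which the description above does not specify.

```python
def normalize_blot(intensities, actin):
    """Divide each band intensity by the actin loading control, then
    rescale so that the largest value in the series equals 1."""
    if len(intensities) != len(actin):
        raise ValueError("series must have equal length")
    ratios = [x / a for x, a in zip(intensities, actin)]
    peak = max(ratios)
    return [r / peak for r in ratios]


# Hypothetical band intensities (arbitrary units) at successive time points.
psmad1_raw = [5.0, 40.0, 80.0, 30.0, 10.0]
actin_raw = [100.0, 100.0, 110.0, 95.0, 100.0]
psmad1 = normalize_blot(psmad1_raw, actin_raw)
print(psmad1)
```

Because only ratios enter, the result is independent of the overall intensity scale of the blot image, which is why the data carry no absolute concentration units.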

#### Solving the systems of ODEs

In Table 2 we show the system of ODEs used in our calculations, in which the following assumptions are made in the calibration process:

- The TGF-*β* level is constant throughout the time series.
- At *t* = 0, we have (A1)

The ODEs are solved using mainly the fifth-order Runge-Kutta method, but in stiff regions of parameter space we switch to the Rosenbrock method using an adaptive procedure. For details on the ODE solvers, see Press et al. (17).
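As an illustration of the kind of integration involved, the sketch below solves a two-variable toy system in which a constant input drives a signal *S* that induces its own inhibitor *I*. It is not the system of Table 2: the equations, the rate constants, and all function names are hypothetical, and a fixed-step classical fourth-order Runge-Kutta scheme is used in place of the adaptive fifth-order and Rosenbrock solvers mentioned above.

```python
def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def make_toy_model(feedback):
    """Toy system (not the model of Table 2): S is a PSmad-like signal,
    I is a Smad7-like inhibitor whose production is induced by S."""
    a, d, c, K = 1.0, 1.0, 0.1, 0.1   # hypothetical rate constants
    b = 1.0 if feedback else 0.0      # strength of inhibitor induction

    def rhs(t, y):
        S, I = y
        dS = a / (1.0 + I / K) - d * S   # constant input, repressed by I
        dI = b * S - c * I               # inhibitor production and decay
        return [dS, dI]
    return rhs


def simulate(feedback, t_end=60.0, h=0.01):
    """Integrate from S = I = 0 and return the trajectory of S."""
    rhs = make_toy_model(feedback)
    y, t, trace = [0.0, 0.0], 0.0, [0.0]
    for _ in range(int(round(t_end / h))):
        y = rk4_step(rhs, t, y, h)
        t += h
        trace.append(y[0])
    return trace


with_fb = simulate(feedback=True)
without_fb = simulate(feedback=False)
print("feedback:    peak %.3f, final %.3f" % (max(with_fb), with_fb[-1]))
print("no feedback: peak %.3f, final %.3f" % (max(without_fb), without_fb[-1]))
```

With the feedback switched on, the toy signal peaks and then settles at a much lower level although the input stays constant; without it, the signal remains at its maximum. A fixed step size is adequate only because this toy system is not stiff.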

#### Parameter estimation

For generating ensembles of solutions we use simulated tempering, where configurations are generated for different fictitious temperatures *T*_{j} and the system is allowed to move between the different *T*_{j}-values. In other words, at a given Monte Carlo step, one updates either the configuration of the system or its temperature. The method amounts to simulating the joint probability distribution

$$P(\mathbf{p}, j) \propto \exp\left(-g_j - \frac{R(\mathbf{p})}{T_j}\right), \qquad j = 1, \ldots, J,$$

where the “energy” *R*(**p**) is the error measure of Eq. 2 evaluated at the system parameters **p**. The algorithm parameters *g*_{j} govern the weights *p*_{j} of the different temperatures, *T*_{j}. The latter are chosen according to

where we used *J* = 20, *T*_{1} = *T*_{min} = 0.0025, and *T*_{20} = *T*_{max} = 0.005. We want to spend roughly the same amount of time at each temperature, and therefore choose the *g*_{j}-values such that the weights *p*_{j} are equal for all *j*. This is done through trial simulations in a two-step process. First, we calculate rough estimates of the average “energy” at each temperature, *R*_{j}, and put *g*_{20} = 0 and *g*_{j−1} = *g*_{j} − *R*_{j}(1/*T*_{j−1} − 1/*T*_{j}). In the next step, we perform longer simulations to obtain good estimates of the weights *p*_{j}; the uniform distribution is then obtained by replacing *g*_{j} with *g*_{j} + ln *p*_{j} (7).
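The two-step tuning of the *g*_{j}-values can be sketched as below. The recursion and the logarithmic correction follow the description above. The geometric spacing of the intermediate temperatures, the function names, and the rough energy estimates `mean_R` are assumptions made purely for illustration. Python indices 0 to *J* − 1 correspond to *j* = 1 to *J*.

```python
import math

J = 20
T_MIN, T_MAX = 0.0025, 0.005   # values quoted in the text


def temperature_ladder(t_min, t_max, n):
    """Geometric spacing between t_min and t_max (an assumption)."""
    ratio = t_max / t_min
    return [t_min * ratio ** (j / (n - 1)) for j in range(n)]


def initial_g(mean_R, temps):
    """Step 1: g_J = 0 and g_{j-1} = g_j - R_j (1/T_{j-1} - 1/T_j)."""
    g = [0.0] * len(temps)
    for j in range(len(temps) - 1, 0, -1):
        g[j - 1] = g[j] - mean_R[j] * (1.0 / temps[j - 1] - 1.0 / temps[j])
    return g


def refine_g(g, visit_fractions):
    """Step 2: replace g_j by g_j + ln p_j, with p_j the measured weights."""
    return [gj + math.log(pj) for gj, pj in zip(g, visit_fractions)]


temps = temperature_ladder(T_MIN, T_MAX, J)
# Hypothetical rough estimates of the mean error at each temperature.
mean_R = [0.004 + 0.5 * T for T in temps]
g = initial_g(mean_R, temps)
# If every temperature is already visited equally often, step 2 only
# shifts all g_j by the same constant and leaves the sampling unchanged.
g_refined = refine_g(g, [1.0 / J] * J)
print(g[0], g[-1])
```

A temperature that is visited too often has a large measured weight, so adding ln *p*_{j} raises its *g*_{j} and lowers its weight in the next run, which pushes the distribution over temperatures toward uniform.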

The parameters are updated one at a time with *p*_{i} → *r* *p*_{i}, where *r* is a multiplicative factor (*r* = 1.1 is used) and in 50% of the cases we set *r* → 1/*r*. At *T* = *T*_{min}, *r* is allowed to vary freely in the range *r* ∈ [1, 2] individually for each parameter, to keep the acceptance ratio above 50%. Updates are accepted according to Eq. A3. For every *K* attempted parameter updates, *K* being the number of parameters, we attempt one update to an adjacent temperature *T*_{j±1} with a probability also governed by Eq. A3.
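A minimal sketch of this update scheme is given below, assuming the standard simulated-tempering form exp(−*g*_{j} − *R*(**p**)/*T*_{j}) for the joint distribution and Metropolis acceptance. The error function `toy_error`, the six-step temperature ladder, and the untuned weights *g*_{j} = 0 are hypothetical stand-ins, and the per-parameter adjustment of *r* at *T*_{min} is omitted.

```python
import math
import random


def toy_error(p):
    """Hypothetical stand-in for R(p): squared log-distance to a target."""
    target = [2.0, 0.5, 3.0]
    return 0.01 * sum(math.log(pi / qi) ** 2 for pi, qi in zip(p, target))


def metropolis_accept(log_ratio, rng):
    """Accept with probability min(1, exp(log_ratio))."""
    return log_ratio >= 0.0 or rng.random() < math.exp(log_ratio)


def simulated_tempering(error, p0, temps, g, sweeps, r=1.1, seed=0):
    rng = random.Random(seed)
    p, j = list(p0), len(temps) - 1          # start at the highest temperature
    R = error(p)
    best_p, best_R = list(p), R
    for _ in range(sweeps):
        # K single-parameter updates: p_i -> r * p_i or p_i / r (50% each).
        for i in range(len(p)):
            factor = r if rng.random() < 0.5 else 1.0 / r
            trial = list(p)
            trial[i] *= factor
            R_trial = error(trial)
            if metropolis_accept(-(R_trial - R) / temps[j], rng):
                p, R = trial, R_trial
                if R < best_R:
                    best_p, best_R = list(p), R
        # One attempted move to an adjacent temperature.
        j_new = j + rng.choice((-1, 1))
        if 0 <= j_new < len(temps):
            log_ratio = (-g[j_new] - R / temps[j_new]) - (-g[j] - R / temps[j])
            if metropolis_accept(log_ratio, rng):
                j = j_new
    return best_p, best_R


temps = [0.0025, 0.003, 0.0035, 0.004, 0.0045, 0.005]  # hypothetical ladder
g = [0.0] * len(temps)                                   # untuned weights
best_p, best_R = simulated_tempering(toy_error, [1.0, 1.0, 1.0], temps, g,
                                     sweeps=5000)
print(best_p, best_R)
```

Multiplying or dividing by the same factor with equal probability makes the proposal symmetric on a logarithmic scale, so the plain Metropolis rule suffices and the parameters stay positive. The sketch returns only the best configuration visited, whereas an ensemble of solutions would be collected from many such minima.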

The performance of the algorithm is displayed in Table 3, which shows the number of simulations it takes on average to find a minimum (*middle column*) and the percentage of these minima having *R* < 0.01 (*right column*) for each of the three sample permutations. These results can be compared with, for example, those in (24), where different optimization algorithms, including simulated annealing, are compared. The poor performance on the control+MG-132 set is discussed in the text and in the Supplementary Material.

### TABLE 3

Sample | No. of simulations | Minima with *R* < 0.01 |
---|---|---|
Control+cyclohexamide | 38,919 ± 5872 | 32.9% |
Control+orthovanadate | 26,783 ± 4707 | 25.6% |
Control+MG-132 | 33,374 ± 3326 | 1.4% |

#### Calibration

In the first step we merely select for solutions **p** satisfying *R*(**p**) < 0.01. In the second step we also require the solutions to display the saturating behavior observed in Experiment V. This is achieved by only considering solutions **p*** satisfying

where denotes the concentration of molecule *i* at time *t* given the parameters **p*** and an initial concentration of TGF-*β* of *C* ng/ml. For the cutoff value *ε* we found *ε* = 0.05 to be appropriate.

#### Implementation

The calibration framework as well as the robustness analysis are implemented in C++. For the two clustering methods, hierarchical clustering and K-means, and for the PCA, we used MATLAB implementations (The MathWorks, Natick, MA) corresponding to the MATLAB functions dendrogram, kmeans, and princomp, respectively.

## Notes

P. Melke, H. Jönsson, and C. Peterson contributed equally to this article.

## References

*κ*B-NF-*κ*B signaling module: temporal control and selective gene activation. Science. 298:1241–1245.

*β* superfamily ligand-receptor network. PLoS Comput. Biol. 2:e3.

*β* type I receptors. EMBO J. 21:1743–1753.

*α* are critical determinants in the duration of TGF-*β*/ALK1 signaling in endothelial cells. BMC Cell Biol. 7:16.

*β*-Smad signalling. Trends Biochem. Sci. 29:265–273.

*β* signaling from cell membrane to the nucleus. Cell. 113:685–700.

*β* receptor signalling and turnover. Nat. Cell Biol. 5:410–421.

*β* type I receptor through Smad7 and induces receptor degradation. J. Biol. Chem. 276:12477–12480.

*β* family members. Biochem. Biophys. Res. Commun. 249:505–511.

*β* receptor activity. Mol. Cell. 10:283–294.

**The Biophysical Society**

A Rate Equation Approach to Elucidate the Kinetics and Robustness of the TGF-β Pathway. Biophysical Journal. Dec 15, 2006; 91(12):4368.
