- We are sorry, but NCBI web applications do not support your browser and may not function properly. More information

# A probabilistic approach to identify putative drug targets in biochemical networks

^{1,}

^{3,}

^{*}Kieran Smallbone,

^{2,}

^{4}Jonathan Swinton,

^{5}Hans V. Westerhoff,

^{4,}

^{6}and Ralf Steuer

^{3,}

^{7,}

^{*}

^{1}Doctoral Training Centre Integrative Systems Biology from Molecules to Life, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK

^{2}Manchester Centre for Integrative Systems Biology, Manchester Interdisciplinary Biocentre, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK

^{3}School of Chemical Engineering and Analytical Science, The University of Manchester, Sackville Street, Manchester M60 1QD, UK

^{4}School of Mathematics, The University of Manchester, Oxford Road, Manchester M13 9PL, UK

^{5}Computational Biology/Advanced Science and Technology Laboratory, AstraZeneca, Alderley Park, Macclesfield SK10 4TF, UK

^{6}Netherlands Institute for Systems Biology, FALW, VU University Amsterdam, De Boelelaan 1085, 1081, HV Amsterdam, The Netherlands

^{7}Institute for Theoretical Biology, Humboldt University of Berlin, Invalidenstrasse 43, 10115 Berlin, Germany

^{*}Authors for correspondence (Email: ku.ca.retsehcnam.dargtsop@otibarum.erotte; Email: ku.ca.retsehcnam@reuets.flar).

## Abstract

Network-based drug design holds great promise in clinical research as a way to overcome the limitations of traditional approaches in the development of drugs with high efficacy and low toxicity. This novel strategy aims to study how a biochemical network as a whole, rather than its individual components, responds to specific perturbations in different physiological conditions. Proteins exerting little control over normal cells and larger control over altered cells may be considered as good candidates for drug targets. The application of network-based drug design would greatly benefit from using an explicit computational model describing the dynamics of the system under investigation. However, creating a fully characterized kinetic model is not an easy task, even for relatively small networks, as it is still significantly hampered by the lack of data about kinetic mechanisms and parameters values. Here, we propose a Monte Carlo approach to identify the differences between flux control profiles of a metabolic network in different physiological states, when information about the kinetics of the system is partially or totally missing. Based on experimentally accessible information on metabolic phenotypes, we develop a novel method to determine probabilistic differences in the flux control coefficients between the two observable phenotypes. Knowledge of how differences in flux control are distributed among the different enzymatic steps is exploited to identify points of fragility in one of the phenotypes. Using a prototypical cancerous phenotype as an example, we demonstrate how our approach can assist researchers in developing compounds with high efficacy and low toxicity.

**Keywords:**metabolic control analysis, network-based drug design, Monte Carlo sampling

## 1. Introduction

The main challenge in drug discovery consists of developing drugs which are both effective and selective, hence non-toxic. In this respect, drug development approaches have often assumed that the enzyme chosen as target plays the role of ‘rate limiting step’ for the biological function of interest [1]. However, single and completely rate-limiting steps barely exist [2] and it is not easy to identify them if they do. Potent inhibitors of essential enzymes often do not show the expected effect at the cellular level [3–5]. The network of biochemical interactions in living cells buffers changes introduced in a single enzymatic step. Likewise, local changes may induce unforeseen adverse side-effects on the whole system [4]. This interconnectedness of biological function usually results in poor predictive power with respect to the requirement of high efficiency and low toxicity for a drug.

With the advent of systems biology, drug discovery has been shifting its focus from a single-molecule to a system-level perspective, and the concept of network-based drug design has been introduced [3–6]. Within this new paradigm, the aim is to study a biochemical network as a whole and identify points of fragility specifically characterizing an altered phenotype (as in a cancer cell) [7,8]. By targeting these points of fragility, a significant response in altered cells is induced. Differential network-based drug design then maximizes the difference in response of target cells versus normal cells [9]. A possible implementation of network-based drug design is based on metabolic control analysis (MCA), where the concept of fragility is expressed in terms of control coefficients [10]. *Differential* MCA, in particular, has been proposed as a tool to understand how the system responds to specific perturbations under different physiological conditions or in different host cells, hence providing a way to assess both the efficiency and the specificity of a compound designed to target specific enzymes [3,9,11].

However, one fundamental requirement for applying MCA is the availability of a fully characterized kinetic model of the system under study. Creating such a model is not an easy task, even for relatively small metabolic networks [12,13]. In most cases, the detailed enzymatic mechanisms governing the dynamics of the different metabolic steps are unknown and precise knowledge of kinetic parameters under the relevant *in vivo* conditions is usually not available. The resulting uncertainty in predicting the dynamic behaviour is significant and increases drastically with the size of the system [12]. Randomized sampling of the parameter space represents a way in which such uncertainty can be quantified and probabilistic insights about the dynamical behaviour of the system can be obtained. Such probabilistic approaches have been recently used in a number of different studies, spanning from applications on MCA [14,15], on metabolic engineering [16], to a description of the dynamics in metabolic networks [17–19]. Here we build upon these ideas and present a Monte Carlo approach to identify differences between the control profiles of a metabolic network in two different settings, one representative of a tumour cell and the other of a corresponding normal cell. Our aim is to show how putative targets for drugs operating at the metabolic level may be identified in a probabilistic manner, when only partial knowledge is available with respect to the kinetic properties of the system. In particular, our analysis is based on experimentally accessible information on metabolic phenotypes, such as observable concentrations and fluxes, rather than on detailed knowledge of kinetic parameters. It is demonstrated that a combination of such phenotypic data, together with heuristic assumptions about the properties of typical enzyme-catalysed reactions, already allows for a fast and efficient way to explore the effectiveness of putative drug targets. Our method makes use of biophysical constraints on the metabolic network, as provided by mass-conservation and thermodynamics, and implements an efficient numerical scheme to allow scanning of a large parameter-space, making it applicable to networks of large size. As a proof-of-concept, we apply our methodology to identify the points of fragility characterizing a paradigmatic cancer metabolic phenotype. We demonstrate that our method allows us to identify those enzymes that exert a high differential control upon a given relevant system property, and thus represent suitable sites to specifically target the cancerous phenotype.

## 2. Methods

### 2.1. Metabolic control analysis

The dynamic behaviour of a metabolic system, consisting of *m* metabolites and *r* reactions, can be described by a set of ordinary differential equations of the form

where **S** is an *m*-dimensional vector denoting the set of metabolite concentrations and **N** is the *m* × *r* stoichiometry matrix. The *r*-dimensional vector **v** = **v**(**S**,**K**) describes the functional form of the reaction rates, which depend on the metabolite concentrations **S** and the set of kinetic parameters **K**. The presence of mass-conservation relationships and conserved chemical moieties in the network results in linear dependencies among the rows of **N**. In this case, one distinguishes between a set of independently variable metabolite concentrations **S**^{ind} and a set of dependent metabolite concentrations **S**^{dep}. To characterize the dynamics of the system it is sufficient to consider the set of independent variables, making use of a reduced stoichiometry matrix **N**′ consisting of linearly independent rows only. The link between **N**′ and **N** is provided by the expression **N** = **L** · **N**′, where **L** denotes the link matrix [20].

To evaluate the system, each reaction has to be assigned a particular rate equation and each kinetic parameter needs to be assigned a specific quantitative value. Given a set of initial conditions, equation (2.1) can then be solved computationally, using standard methods of numerical integration. Usually, the model gives rise to one or more steady states (**v**^{0}, **S**^{0}) that can be compared with experimentally observed values. The response of the system to a perturbation of a kinetic parameter, representing for example the action of a drug, can be quantified in terms of *control coefficients*, as introduced by MCA [2,21–23]. In particular, given an effector *p*_{i} which acts directly on the enzymatic step *i*, the (*scaled*) *concentration control coefficient* is defined as [20,24]

and quantifies the ratio between the relative change of the steady-state concentration of metabolite *S* and the relative change in the catalytic action of enzyme *i* (induced by an infinitesimal change in the parameter *p*_{i}). Analogously, the (*scaled*) *flux control coefficient* is defined as

and quantifies the response of the system in terms of the relative change in a steady-state flux *J*. Reder [20] showed that the matrix **C**^{S} of the concentration control coefficients can be expressed as

where **J′** denotes the Jacobian of the system with respect to the independent variables, while and denote square matrices with elements **S**^{0} and **v**^{0} on the diagonal, respectively, and zero elsewhere.

For the matrix of flux control coefficients **C**^{J} the following expression holds:

Because **J′** can be written in terms of the link matrix and the reduced stoichiometry matrix as

it follows from equations (2.4)–(2.6) that, once the network topology (as represented by **L** and **N′**) and the network ‘phenotype’ (as represented by and ) are known, the only quantities to be evaluated in order to retrieve the control coefficients are the partial derivatives **v**/**S**. These, in turn, depend on the functional form of the rate equations governing the dynamics of each enzymatic step, the kinetic parameters and the metabolic state where the derivatives are evaluated. When full kinetic information is available, the control coefficients can be readily calculated using available software packages, such as COPASI [25], or online simulation tools for biochemical models that reside in repositories, such as JWS Online [26].

### 2.2. Differential metabolic control analysis

To locate suitable drug targets, one aims to identify those enzymes where a perturbation—usually a decrease in enzyme activity induced by an applied inhibitor—elicits a large response in at least one vital aspect of the functioning of the diseased cell, whereas a similar decrease in the activity of the same enzyme is less detrimental for the functioning, or ‘phenotype’, of normal cells. We note that the diseased metabolic phenotype often corresponds to changes in either enzyme concentrations (because of mutations in regulatory elements or signal transduction genes), or availability of external substrates (influx), whereas most Michaelis–Menten constants remain unchanged. Given detailed kinetic models of both phenotypes, the putative action of a drug is applied to both models and the difference in the response is assessed using differential MCA or direct simulations.

The vital aspect of the diseased cell that is targeted may be the production flux of ATP or it could be the growth rate itself. Because it is often a flux, we shall denote it by *J*_{target}. Our aim is then to locate suitable sites for the action of a drug which results in the maximal differential response in this desired flux between diseased phenotype and normal phenotype. Suitable drug targets are chosen according to the following, alternative or simultaneous, criteria:

- —
*Maximal selectivity*. We assume that the effect we aim for is the decrease of*J*_{target}in the diseased cells. In this case, we are interested in enzymes which exert a positive control upon*J*_{target}in the altered phenotype: . Independent of whether the effect induced in normal cells consists of a decrease or increase of*J*_{target}, we require this effect to be lower, in magnitude, than in altered cells. In terms of flux control coefficients of an enzyme*i*, this criterion can be expressed as follows:2.7The higher the value of , the stronger the differential response between normal and altered cells. We note that equation (2.7) does not impose any restriction on the sign nor on the value of , provided that its magnitude is smaller than the magnitude of . Indeed, enzymes with non-negligible negative value of may be also considered suitable targets as long as the difference in magnitude with allows us to reach an acceptable specificity through an appropriate dosage of an applied inhibitor. - —
*Minimal toxicity*. To avoid other system properties (not captured by the flux of interest*J*_{target}) undergoing important changes in normal cells, we require the normal phenotype to be*wholly*robust against perturbation in the drug target activity. In this respect, we define the*toxicity coefficient*, which quantifies how far the normal phenotype deviates from its original metabolic state after an inhibition of enzyme*i*:where the summation is over all the2.8*N*exchange fluxes of the system. This definition of toxicity reflects a black-box perspective, where the behaviour of the system is assessed through its inputs and outputs. However, other definitions of toxicity are possible and are examined later in §4.

### 2.3. A Monte Carlo strategy

A straightforward implementation of the strategy described above is only rarely applicable, as detailed kinetic models, describing the normal as well as the diseased phenotype, are usually not available. To overcome this limitation, we implement a Monte Carlo strategy which allows us to deal with incomplete knowledge of kinetic parameters and explicitly take into account available phenotypic data. In particular, we assume that for both states of the system the measured metabolic phenotype, characterized by a set of concentrations and fluxes, is known. In this respect, our approach is based on the fact that high-throughput metabolomics and fluxomics studies are now standard techniques in the analysis of cellular metabolism [17,27–32]. We proceed according to the following rationales:

- —We require that the map of the metabolic network of interest is known and is the same for both the diseased and the normal phenotype. In addition, we assume that the topology of regulatory interactions is, as far as possible, available.
- —For some of the enzymatic steps, information about the detailed functional form of the rate laws may be available. However, in the absence of such information, we employ heuristic assumptions about generic reaction characteristics to describe the dependencies of flux rates with respect to substrates, products and allosteric effectors.
- —To obtain a probabilistic understanding on how the control is distributed, the kinetic parameters are sampled randomly from pre-assigned intervals. Each sampled set of parameters is made compliant with the given metabolic phenotypes and additional thermodynamic and biophysical constraints, by rescaling the maximal activity of every reaction step.
- —For each sampled set of parameters the differential response in the normal and the disease phenotype is evaluated.

Our workflow is illustrated in figure 1. In the following, each step is described more thoroughly and the details of its implementation are given.

### 2.4. Defining generic reaction characteristics

To evaluate the differential response, each rate equation has to be assigned a specific mathematical form. To describe the kinetics of enzyme catalysed reactions for which the true kinetics is unknown, several possible heuristic approximate rate equations have been proposed in the literature [33–35]. In general, good results are obtained for functional forms that follow generalized Michaelis–Menten equations. Given a reaction converting a set of substrates *A* into a set of products *B*

such rate equation can be written as follows:

where

- —
*f*_{reg}is a dimensionless prefactor describing the interactions of the enzyme with allosteric regulators. Following the definition given in Liebermeister & Klipp [35], we writewhere2.10*P*_{k}and*Q*_{l}denote, respectively, a generic (allosteric) inhibitor and activator, and and denote their corresponding binding constants. This equation assumes that the activators and inhibitors are non-competitive. - —
*V*_{max}is the forward maximal rate of the enzyme and implicitly depends on the enzyme concentration. - — is the concentration of reactant
*A*_{i}divided by its Michaelis constant . - — is the concentration of reactant
*B*_{j}divided by its Michaelis constant . - — is a positive polynomial of the scaled reactant concentrations. Following [35], we choose2.11
- —
*Γ*is the mass–action ratio, defined as - —
*K*_{eq}denotes the equilibrium constant of the reaction.

The functional form described above captures the generic characteristics of enzyme catalysed reactions, such as reversibility, product inhibition, saturation and a reaction direction that only depends upon the mass–action ratio relative to the equilibrium constant. In the case study presented in this paper, we used equations in the form proposed in Liebermeister & Klipp [35] for every reaction.

### 2.5. Defining the kinetic parameters

Once the functional form of each rate equation is specified, the equations have to be populated with their respective kinetic constants. Again, we may assume that a (usually small) number of kinetic parameters is available, either by direct experimentation or from data repositories [36,37]. To account for the remaining unknown or uncertain values, we implement a Monte Carlo approach to systematically evaluate the behaviour of the network in a probabilistic manner. The idea is to sample the parameters so that their values comply with known phenotypic data and satisfy basic biophysical and thermodynamic constraints.

Our starting point consists of a metabolic state as characterized by a flux distribution **v**^{0} and a set of metabolite concentrations **S**^{0}. The flux distribution **v**^{0} satisfies the stationarity constraint d**S**^{0}/d*t* **N** · **v**^{0} = **0**, and the direction of each flux must be consistent with the set of concentrations, relative to its equilibrium constant. Reaction parameters are then assigned or sampled according to the following criteria:

- —
*Equilibrium constants*. The equilibrium constants are physico-chemical quantities which reflect the change in standard Gibbs free energy occurring in a reaction. They do not depend on the specific organism or cell type, but may depend on intracellular parameters, such as temperature. While as yet a large-scale detailed experimental quantification of the change in free energy is not available, a number of algorithms exists that allow for a reasonable computational approximation [38–42]. In the case study presented here, the values assigned to the equilibrium constants are obtained from Holzhütter [43]. - —
*Michaelis–Menten constants*. The Michaelis–Menten constants are genuine enzyme kinetic parameters and are sampled randomly in intervals . Different options are available to specify the interval borders. Here, the interval boundaries are chosen according to the observed metabolite concentrations. We sample the values of the parameters from intervals covering at least two orders of magnitudes around the steady-state concentrations of the corresponding metabolite. In particular, if [*S*]_{1}and [*S*]_{2}denote the concentration of a metabolite*S*in the two metabolic states, the affinity, inhibition and activation constants of any enzyme in respect to*S*are sampled between min{[*S*]_{1},[*S*]_{2}} × 10^{−α}and min{[*S*]_{1},[*S*]_{2}} × 10^{−β}, with*α*and*β*representing adjustable factors. To ensure that the sampled values are evenly spread among the different order of magnitudes around [*S*]_{1}and [*S*]_{2}, the sampling is performed using a logarithmic distribution. In the sampling process no distinction is made between parameters relating to regulatory interaction and substrate and product affinities. The results shown in the next section refer to sampling conditions where*α*=*β*= 1. The robustness of our results was subsequently tested for different choices of*α*and*β*, and different sampling distributions. - —
*Maximum rates*. Given the metabolic state and the parameters defined above, it follows from equation (2.9) that the maximum reaction velocity*V*_{max}is unambiguously determined. In other terms, to make the sampled values of Michaelis–Menten constants compliant with the metabolic state (**v**^{0},**S**^{0}) and the thermodynamics of the system (determined by the equilibrium constants*K*_{eq}), the maximum reaction velocity is computed by reversing equation (2.9) with respect to*V*_{max}.

### 2.6. Evaluating the control coefficients

The sampling is performed iteratively. For each sampled set of parameters, the partial derivatives **v**/**S** are evaluated based on the generalized Michaelis–Menten rate equation described above. Both metabolic states were tested for stability by evaluating their corresponding Jacobian. Parameter sets resulting in Jacobians with positive real part among their spectrum of eigenvalues were discarded, otherwise the control coefficients were evaluated using equations (2.4)–(2.5). We note that the computation only employs basic matrix inversion and multiplication—a procedure which is orders of magnitude faster than numerical integration, making our approach applicable to systems of large size. The process is repeated iteratively until 2.5 × 10^{4} samples are obtained.

### 2.7. Assessing the suitability of drug target candidates

The differences in the control profiles between diseased and normal phenotype are evaluated according to the criteria of maximal selectivity and minimal toxicity specified above. Because of the probabilistic nature of our approach, these criteria have to be reformulated such that they describe the *average*, over all the sampling iterations, of the two quantities defined in equations (2.7)–(2.8). As an additional criterion, we evaluate the reliability of the estimated average selectivity. As a quantitative measure of the quality of our prediction, we define the *reliability coefficient* as the ratio of the average selectivity to the standard deviation of the sampled selectivities:

### 2.8. Defining the diseased phenotype

To show the applicability of our method on a system of reasonable complexity, we consider a case study to identify suitable sites for drug intervention specifically targeting a cancerous phenotype. As to the best of our knowledge no detailed characterization of a cancer metabolic phenotype (in terms of fluxes and metabolite concentrations) exists, we employ a modelling approach to obtain a set of fluxes and concentrations representative of a generic cancer phenotype.

Our starting point consists of a fully defined metabolic map of human erythrocyte metabolism [43], modified for our purpose (figure 2). Two different sets of maximal activities (*V*_{max}) are used to reproduce the flux patterns characteristic of a normal cell and a paradigmatic cancer cell. In particular, the cancerous metabolic phenotype is obtained by increasing the *V*_{max} of those enzymes which are often overexpressed in cancer, namely the glucose transporter [44,45], hexokinase (HXK) [46–49] and phosphofructokinase (PFK) [46,49,50]. The activity of a fourth enzyme, the glutathione oxidase (GSHox), is also increased in order to achieve a higher flux through the pentose phosphate pathway, as observed by Richardson *et al*. [51].

**...**

We emphasize that the model itself is not used in further analysis and its purpose is solely to obtain a kinetically and thermodynamically consistent set of concentrations and fluxes that represents the common features shared among a wide panel of different cancerous cell lines, in particular a higher uptake of glucose and an enhanced production of lactate [52–54]. Details of model construction and analysis are given in the electronic supplementary material.

## 3. Results

### 3.1. Identifying the control profile of the system based on the topology and the metabolic state

The conjecture underlying our probabilistic approach is that some of the control properties of a metabolic system are independent from the precise magnitude of the enzyme parameters. The rationale behind this conjecture is that the signs and magnitudes of the control coefficients are determined to an appreciable extent by the topology of the system as well as the metabolic state under consideration [20,55]. To investigate our conjecture, we distinguish between two extreme scenarios: if the control coefficients are entirely determined by the metabolic map and phenotype, then their value should be independent from the specific choice of the kinetic parameters; on the other hand, if stoichiometry and phenotype had no bearing on control properties at all, then the control coefficients calculated for randomized values of the kinetic parameters should have infinitely wide and flat distributions.

Figure 3 shows the distributions obtained for the control exerted by selected enzymes upon the glucose import flux in the normal phenotype. In some of the cases, the control coefficients are distributed fairly narrowly with respect to the entire probable range of values. For example, the control coefficient of the glucose transporter (GLT, figure 3*a*) is distributed almost entirely between 0.01 and 0.05. This reinforces an earlier conclusion from a precise calculation that this control coefficient is small and that inhibitors of the glucose transporter are unlikely to be toxic to human erythrocytes [56]. More importantly, it confirms the strength of our conjecture for this case; i.e. within the limits of experimental accuracy [2], the value of this flux control coefficient can be estimated on the basis of the metabolic map, the metabolic phenotype and very limited kinetic information. The other panels of figure 3 show that this phenomenon is not unique. It is also valid for the control exerted on this same flux by other enzymes in the system, also for ones with higher flux control. Figure 3 also shows that the accuracy by which metabolic map and metabolic phenotype determine the flux control coefficient is not always the same. For the phosphofructokinase, for example, the estimated flux control coefficients on glucose import exhibit a broad distribution covering almost the entire interval between zero and one (figure 3*b*). The other panels in figure 3 are chosen as representative cases where the distribution of the control coefficients over the uptake of glucose is either mainly confined in the negative semiaxes (as opposed to glucose transporter and phosphofructokinase) or is spread evenly around zero, allowing no best guess on the sign of the actual control coefficient. See also the electronic supplementary material for a magnified depiction of the individual distributions.

*a*) Glucose transporter (GLT), (

*b*) phosphofructokinase (PFK), (

*c*) transketolase 1 (TK1), (

*d*) phosphoglyceratekinase (PGK), (

*e*) lactate

**...**

Figure 4 summarizes the distributions of the estimated control coefficients for the entire network. Each position in the matrix corresponds to the control exerted by an enzyme (columns) upon a flux within the network (rows) and shows the (colour-coded) percentage of calculated control coefficients that is positive. If the box is white then the flux control coefficient is positive independent, to a good extent, of the values of the kinetic constants. For example, the figure shows that the control exerted by the first six enzymes on all the first 13 reactions is positive, as indeed might be expected from the network topology shown in figure 2. Likewise, the enzymes 14–18 (corresponding to G6PDH, 6PGD, GSSG, GSHox and EP) mostly exert a negative control on the flux through reaction 3 (PGI), again consistently with our expectations from the network topology. However, other results are less straightforward to interpret, showing the utility of our calculations for complex reaction network. Overall, figure 4 confirms our conjecture that often at least the sign of the control coefficient is, to a good extend, independent from the precise values of kinetic parameters—enabling us to obtain a best guess for putative drug targets in the face of incomplete information.

### 3.2. Identifying candidate targets for drug intervention

From the perspective of limiting the survival or proliferation of cells that function as parasites in the human body, a possible strategy consists of inhibiting glycolysis, provided the pathway is phenotypically different in the parasitic versus the host cells [3,5,57,58]. From an MCA perspective, this translates into inhibiting the activity of an enzyme which exerts a major control over (for example) the uptake of glucose in the diseased/parasitic phenotype and a minor control over the same flux in the normal phenotype. The rationale of such an approach is to starve the diseased/parasitic cells without significantly affecting the host. From here on we will refer to this specific strategy to present and comment on our results.

Figure 5 shows the distributions of the selectivity coefficients with respect to the uptake of glucose for the same enzymes of figure 3*a*,*b*, in particular, show that the selectivity coefficients of the glucose transporter (GLT) and phosphofruktokinase (PFK) are mainly distributed over negative values, meaning that the control exerted by these enzymes upon the uptake of glucose is higher in the normal phenotype than in the cancer phenotype. Consequently, GLT and PFK cannot be considered good candidate targets (cf. [59]). This result is somewhat expected as the diseased phenotype was taken to be due to overexpression of four enzymes among which GLT and PFK (see §2). In general, however, the overexpression of an enzyme does not necessarily imply a decrease in the control it exerts, as the activity of the enzyme must be considered in the context of the entire network. Other enzymes show the desired selectivity in terms of their control on the target flux between diseased and normal cells. This is the case for phosphoglycerate kinase (PGK, figure 5*d*) and lactate transporter (LCT, figure 5*e*), the selectivity coefficients of which are mainly distributed over positive values, although the magnitude of these values differ substantially between the two enzymes.

Because of the probabilistic nature of our approach, the quantities defining the criteria of maximal selectivity and minimal toxicity, i.e. the selectivity coefficient and the toxicity coefficient *T*_{i}, respectively, are taken in their average values and , where the average is computed over all the sampling iterations. For the statements about enzyme selectivity to be meaningful we also consider the reliability coefficient as defined in equation (2.12). We now have three criteria by which to compare potential molecular drug targets. In any actual situation we will need to look at all three criteria simultaneously. To make the weighing of the three criteria as transparent as possible, we introduce the following scaled quantities:

where *σ*_{i}, 1/*τ*_{i} and *ρ*_{i} are the normalized selectivity, the normalized safety and the normalized reliability, respectively. We note that the normalized safety is defined so that it increases with decreasing toxicity.

By restricting our search for putative targets to only those enzymes with positive average selectivity , i.e. enzymes which tend to produce the wanted inhibiting effect on *J*_{target} in the diseased phenotype, the three normalized criteria are bound between 0 and 1.

Figure 6 depicts the values of the normalized selectivity, reliability and safety for all the enzymes with positive average selectivity . A most interesting result was that one enzyme target was both most selective and most reliable, and was also hardly toxic. This target was phosphoglyceratekinase (PGK). The fact that all three criteria come out with a single enzyme target suggests that that target could indeed be exceptionally valuable. It also suggests that the methodology we have introduced may be useful; a lesser result would have been if the target that was most selective would have been least reliable and most toxic, or if the least toxic target would also have been least selective. However, in general, we expect that no single target is best according to all three criteria introduced above. The more important result therefore is that our methodology leads to a clear separation of the potential targets at least in the dimensions of selectivity and reliability. In this respect the criterion of maximal safety (i.e. minimal toxicity) appeared to be less discriminatory for the enzymes shown.

*σ*

_{i}), safety (1/

*τ*

_{i}) and reliability (

*ρ*

_{i}), defined in equations (3.1)–(3.3), respectively, are plotted versus each other. Enzymes with negative values of (see text) are not represented in the

**...**

We may wish to classify drug targets with a single score, which takes into account all three criteria (selectivity, reliability and safety), perhaps with different weights. To do so, we define the following quantity:

as the score to be assigned to enzyme *i*. Equation (3.4) represents a plane in the three-dimensional space defined by *σ*_{i}, *ρ*_{i} and 1/*τ*_{i}. For a minimally required score *Z**, one may draw the corresponding plane in figure 6*a* and require all enzyme targets one wishes to consider for further development to lie above that plane. In figure 6*a*, we have drawn the plane corresponding to *Z** = 1/3 and weight factors *w*_{σ} = 4, *w*_{τ} = 2 and *w*_{ρ} = 1. This specific choice for the weight factor values was made to prioritize the maximal selectivity over the minimal toxicity, and the latter over the requirement of minimal uncertainty. Table 1 shows the list of enzyme with positive average selectivity, sorted by decreasing value of *Z*.

### 3.3. Robustness of the results

As a probabilistic method, our approach crucially relies on the robustness of the results with respect to different ways of sampling the parameter values. To this end, we repeated the data generation process and the subsequent analysis using different sampling conditions. In particular, we changed the range of values from which the parameters were sampled and the sampling distribution. For each different set of sampling conditions, we ranked the suitability of the different enzyme as drug targets as explained above.

Table 2 summarizes the results obtained for the different sampling conditions, showing PGK as the highest scoring target for all of them. As expected, enzymes with lower scores than PGK did not always keep their position in the ranking list. The lower the score the higher the chance for the enzyme to enter the list in a different position, when the sampling condition was altered. In most of the cases, however, the top six entries of the list remained the same. In particular, PGK always emerged as the best guess as putative target, while enolase (ENO) and phosphoglycerate mutase (PGM) were always found within the top four entries of the list, and lactate dehydrogenase (LDH) was almost always found between the 3rd and the 6th position (see the electronic supplementary material for more details). The use of a linear sampling distribution caused the most different results in respect to the scoring list obtained with a logarithmic distribution, even for the high-scoring enzymes (except for PGK, which was always at the top of the list). This fact can be ascribed to the highly asymmetric sampling of the parameter values in respect to the metabolite concentrations. When using a linear distribution, there is a much higher probability that the sampled value of the Michaelis–Menten constants exceed the concentration of the corresponding metabolites. In our case, where *α* = *β* = 1 (see §2 for their definition), for each reaction step statistically at least 90 per cent of the sampled values were larger than the concentration of their corresponding metabolites. This implies a scenario where the saturation level of all the enzymes in respect to all their reactants and modifiers was almost always smaller that 50 per cent. On the other hand, when using a logarithmic distribution, the value of the Michaelis–Menten constants tended to be evenly sampled around the metabolite concentrations, or, more precisely, among the orders of magnitude spanned by the sampling intervals.

### 3.4. Comparison with the dynamic model

To assess the significance of our result, we compared the insights gained through our statistical approach with the results obtainable directly through a dynamic simulation. The dynamic model used to retrieve the cancer metabolic state (see §2 and electronic supplementary material) was used as workbench for this assessment. The two sets of parameters characterizing the two metabolic states (normal and cancer) were both modified by decreasing the activity (*V*_{max}) of PGK from 12.9 to 6.5 mM h^{−1}, simulating the addition of a non-competitive inhibitor. Figure 7 shows how some of the fluxes of the metabolic system changed in the two phenotypes in response to this perturbation. Table 3 shows the relative changes recorded once the system reached the new steady state after the perturbation.

^{−1}.

## 4. Discussion

Despite a surge in the perspectives of using mechanistic mathematical modelling in clinical development [60–63], parametric uncertainty remains a challenge to mechanistic approaches in medicine [13]. However, contrary to the scarcity of kinetic data, the comprehensive quantification of all concentrations and fluxes within a metabolic system is, at least in principle, experimentally feasible. The question we addressed in this paper is whether the knowledge of a metabolic phenotype only—expressed in terms of fluxes and metabolite concentrations at steady-state—allows for a *probabilistic* understanding of how the control properties of a biochemical system are distributed among the different enzymatic steps and metabolic processes. We developed a Monte Carlo approach which aims to provide researchers with a probabilistic description of how the control properties of a metabolic network differ between two fully characterized metabolic phenotypes, when only minimal knowledge of the system is available with respect to its kinetic parameters. Our goal was to locate points of fragility in a diseased/pathogen phenotype which can be considered for drug interventions with maximal effectiveness and minimal toxicity. In particular, enzymes which exert a major control over a certain property of interest in a diseased/pathogen phenotype, and a minor control over the same property in the normal/host phenotype, represent good putative targets. In our method, the control profiles characterizing the two phenotypes under comparison are determined for sampled values of the unknown affinity, inhibition and activation constants. Our Monte Carlo approach provides us with a statistical understanding of how the control is differentially distributed between the two metabolic states, where the word ‘statistical’ should be interpreted in the sense of uncertainty rather than in the sense of population dispersion.

The system under study was a simplified reconstruction of the central carbon metabolism, with two metabolic states that were representative of a normal and a paradigmatic cancerous phenotype. The results presented in this work refer to a clinical strategy aimed to starve specifically cancer cells, i.e. to locate enzymes which exhibit a high differential control upon the uptake of glucose, denoted as *J*_{target}, between the two phenotypes.

In the development of new drugs, pharmaceutical companies tend to give priority to the maximal effect that a compound can induce in the disease cells. Among the palette of compounds which pass this filter, a second screening is performed to look for compounds which leave the normal phenotype unaltered as much as possible. Our approach is different. To assess the suitability of the enzymes as possible targets for an anti-cancer drug, three different criteria were considered in this paper: maximal selectivity, maximal safety (i.e. minimal toxicity) and maximal reliability. The first criterion was formulated in such a way as to encompass both the requirements of high effectiveness and high selectivity with respect to the specific property one wants to affect in the diseased cells. High values of the selectivity coefficient correspond to a high (positive) control over *J*_{target} in the diseased cell, and a relatively small control over the same flux in the normal cells. In particular, we introduced the selectivity coefficient, as defined in equation (2.7), to quantify the differential response of the system to inhibition of enzyme *i*. Alternative definitions than equation (2.7) may also be considered. For example, Bakker *et al*. [3] defined it as the ratio between the control over *J*_{target} in the parasitic/diseased cell over the same control in the host/normal cell. Such a definition, however, may result in over-ranking enzymes with small control in the diseased phenotype as putative targets. An enzyme A with and , for instance, would be considered preferable to an enzyme B with and , as the ratio would be around three times the ratio . However, using an appropriate dosage of a drug that inhibits enzyme B, one could reduce the effect of the enzyme inhibition in the normal cell to a negligible extent while maintaining in the diseased cell a strong effect, in fact stronger than would be feasible by inhibition of enzyme A.

The criterion of maximal safety (or minimal toxicity) was introduced to assure that the normal phenotype was robust to a perturbation of the drug target. The predicted toxic effect of inhibiting an enzyme *i* was quantified through the toxicity coefficient defined in equation (2.8). We defined this coefficient as the average of the absolute values of the flux control coefficients of enzyme *i* with respect to all the exchange fluxes. Our definition reflects a black-box perspective, where the behaviour of the system is assessed solely through its input and output fluxes. However, this definition of toxicity is far from unique and its suitability may depend on the specific situation. For example, an alternative choice would be to define the toxicity as the largest of all of the target enzyme's flux control coefficients. Also, there may be reactions or pathways for which a change in the flux does not induce any major effect on the vital functions of the cell, while even small alterations in other fluxes might entail significant stress in the normal cellular physiology. In this case, the criterion of minimal toxicity can be augmented by assigning different weights to the respective control coefficients. Another possibility to define the toxicity coefficient would be to take into account the concentration of key metabolites that are known to have toxic effects. In this case, the toxicity is assessed in terms of the concentration control coefficients of the target enzymes with respect to these metabolites.

Finally, the *criterion of maximal reliability* was introduced in order to prefer enzymes whereby the computed average selectivity was affected by the least possible uncertainty.

To evaluate the suitability of an enzyme as a drug target, different weights can be assigned to these criteria, depending on the relevance they have for the researcher. In the present work, we chose to prioritize the maximal selectivity over the minimal toxicity, and the latter over the requirement of minimal uncertainty. By doing so, PGK was identified as the best guess for a suitable drug target. Although in this case PGK was both the most selective and most reliable enzyme target, besides being hardly toxic, we note that in general the outcome of our analysis can sensibly depend on the priority ascribed to the three criteria mentioned above. From a clinical perspective, performing our analysis with different weight factors may then provide a palette of best drug targets referring to different thresholds among the required properties of selectivity, safety and reliability.

The robustness of our results was tested against different choices of the sampling conditions. The fact that PGK always emerged as the best guess as putative target represents an important result, as it suggests that the sampling approach proposed in this paper can provide a relevant insight about the control profile of a metabolic system that is reasonably robust with respect to alterations in the numerical methods. If the top entries of the scoring list were to change completely based on the sampling condition, the method would have proved to be of less utility. We note, however, that in general different choices of the sampling distribution may lead to different enzymes as best putative drug target. The use of different sampling distributions may then be thought and adopted as a way to provide a more restrictive assessment of how the emergence of a specific enzyme as best target is due to topological constraints rather than specific parameter values (or range of values).

The conclusions achieved through our probabilistic approach, leading to the identification of PGK as the best putative target, were compared with the outcome of a mechanistic approach by decreasing the activity (*V*_{max}) of PGK in the dynamic model introduced in §2 and described in the electronic supplementary material. The response of the system to this perturbation showed that the normal phenotype was indeed less sensitive to an inhibition of PGK than the diseased phenotype. This lower sensitivity has been observed not only in respect to the glucose uptake flux, but in the general behaviour of the system. The exchange fluxes and the main metabolic processes were in fact less affected in the normal phenotype, showing good agreement with our statistical result. Regarding specifically the production of lactate, we registered a significant decrease in the flux through LDH in both normal and diseased phenotype (−61% and −82%, respectively). Since this strong decrease occurred on a branch which is virtually unused in normal cells (except for muscle cells under anaerobic conditions and erythrocytes), this result does not invalidate PGK as putative drug target candidate.

In conclusion, the statistical approach proposed in this paper provides us with a useful strategy for assessing how the control profile is differently distributed in two distinct metabolic phenotypes. It also highlights which enzyme can best represent a putative target with respect to requirements such as high effectiveness and low toxicity. The significance of the results obtained through this kind of analysis, however, may be improved at different levels, not necessarily only related to the probabilistic nature of our approach. For example, a higher degree of detail in the representation of the metabolic map would reduce the approximation introduced by considering lumped reactions representing more complex biochemical pathways or processes. In the example provided in this paper, two of such reactions were used to represent the TCA cycle and oxidative phosphorylation. Expanding a lumped reaction into the entire set of enzymatic steps it represents, could lead to new and important insights about how the control properties are distributed in the system. This is mostly the case when metabolic intermediates that are only involved in a lumped pathway have regulatory effects on enzymatic steps outside that pathway. A more detailed description of the lumped pathway would then result not only in a ‘higher resolution’ of how the control properties are locally distributed among the different steps which were lumped together, but also in a different overall control profile. Just as in the case of conventional kinetic modelling, the level of detail in which the metabolic map is represented can determine the level of accuracy of the regulatory map, and consequently have a non-negligible effect on the results.

A related aspect is the choice of rate equations of the enzymatic steps. While generic rate equations are commonly used to capture generic aspects of metabolic networks, the actual, experimentally determined, rate equations may result in slightly modified control properties. In particular, the experimental determination of reaction functions may also identify further unknown regulatory interactions, as well as possible cooperativity between metabolic compounds. Regarding the use of the generic rate equation, we also note that different choices are possible. The specific instance of generalized Michaelis–Menten equation proposed by Liebermeister & Klipp [35], and used in this paper, takes into account generic characteristics of enzyme catalysed reactions—such as reversibility, product inhibition, saturation, reaction direction that only depends upon the mass–action ratio relative to the equilibrium constant. However, alternative choices have been proposed by Rohwer *et al*. [64] that also account for competition between substrates and products and take possible cooperativity into account. We note though, in contrast to explicit kinetic models, we are only interested in the derivatives of the rate equations, such that minor differences in the precise functional form often have no major effect. In most applications, we also expect that at least partial information on some kinetic parameters is available. Such partial information allows us to further constraint the sampling intervals and to obtain results that are specific for the system under study. In this sense, our approach can be straightforwardly incorporated into an iterative scheme that allows us to quantify uncertainty in the control profile, and hence allows us to pinpoint further experiments to increase the specificity and reliability of the results.

## Acknowledgements

We acknowledge the support of the Doctoral Training Center ISBML by EPSRC and BBSRC (EP/F500 009/1), of the Manchester Center of Integrative Systems Biology by BBSRC, EPSRC (BB/C0082191) and additional BBSRC support (BBD0190791), of SysMO (BBF003535281, 35361, 35521) and of the EC Network of Excellence BioSim.

## References

*C. glutamicum*: a comparison of rate laws in combination with various parameter optimization strategies. BMC Syst. Biol. 3, 5. (doi:10.1186/1752-0509-3-5)10.1186/1752-0509-3-5 [PMC free article] [PubMed] [Cross Ref]

*Escherichia coli*metabolism. Biophys. J. 90, 1453–1461 (doi:10.1529/biophysj.105.071720)10.1529/biophysj.105.071720 [PMC free article] [PubMed] [Cross Ref]

*Escherichia coli*K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol. Syst. Biol. 3, 121. (doi:10.1038/msb4100155)10.1038/msb4100155 [PMC free article] [PubMed] [Cross Ref]

*in vivo*and

*in vitro*: transformation- and progression-linked discriminants of malignancy. Cancer Res. 45, 2993–3001 [PubMed]

**The Royal Society**

## Formats:

- Article |
- PubReader |
- ePub (beta) |
- PDF (683K)

- [Development of antituberculous drugs: current status and future prospects].[Kekkaku. 2006]
*Tomioka H, Namba K.**Kekkaku. 2006 Dec; 81(12):753-74.* - Estimating the size of the solution space of metabolic networks.[BMC Bioinformatics. 2008]
*Braunstein A, Mulet R, Pagnani A.**BMC Bioinformatics. 2008 May 19; 9:240. Epub 2008 May 19.* - Detecting drug targets with minimum side effects in metabolic networks.[IET Syst Biol. 2009]
*Li Z, Wang RS, Zhang XS, Chen L.**IET Syst Biol. 2009 Nov; 3(6):523-33.* - Inferring biochemical reaction pathways: the case of the gemcitabine pharmacokinetics.[BMC Syst Biol. 2012]
*Lecca P, Morpurgo D, Fantaccini G, Casagrande A, Priami C.**BMC Syst Biol. 2012 May 28; 6:51. Epub 2012 May 28.* - Exploiting complexity and the robustness of network architecture for drug discovery.[J Pharmacol Exp Ther. 2008]
*Hellerstein MK.**J Pharmacol Exp Ther. 2008 Apr; 325(1):1-9. Epub 2008 Jan 17.*

- Metabolomics and systems pharmacology: why and how to model the human metabolic network for drug discovery[Drug Discovery Today. 2014]
*Kell DB, Goodacre R.**Drug Discovery Today. 2014 Feb; 19(2)171-182* - What Can We Learn from Global Sensitivity Analysis of Biochemical Systems?[PLoS ONE. ]
*Kent E, Neumann S, Kummer U, Mendes P.**PLoS ONE. 8(11)e79244* - A model of yeast glycolysis based on a consistent kinetic characterisation of all its enzymes[Febs Letters. 2013]
*Smallbone K, Messiha HL, Carroll KM, Winder CL, Malys N, Dunn WB, Murabito E, Swainston N, Dada JO, Khan F, Pir P, Simeonidis E, Spasić I, Wishart J, Weichart D, Hayes NW, Jameson D, Broomhead DS, Oliver SG, Gaskell SJ, McCarthy JE, Paton NW, Westerhoff HV, Kell DB, Mendes P.**Febs Letters. 2013 Sep 2; 587(17)2832-2841* - Structure and dynamics of molecular networks: A novel paradigm of drug discovery: A comprehensive review[Pharmacology & therapeutics. 2013]
*Csermely P, Korcsmáros T, Kiss HJ, London G, Nussinov R.**Pharmacology & therapeutics. 2013 Jun; 138(3)333-408* - Yeast-based automated high-throughput screens to identify anti-parasitic lead compounds[Open Biology. 2013]
*Bilsland E, Sparkes A, Williams K, Moss HJ, de Clare M, Pir P, Rowland J, Aubrey W, Pateman R, Young M, Carrington M, King RD, Oliver SG.**Open Biology. 2013 Feb; 3(2)120158*

- A probabilistic approach to identify putative drug targets in biochemical networ...A probabilistic approach to identify putative drug targets in biochemical networksJournal of the Royal Society Interface. Jun 6, 2011; 8(59)880PMC

Your browsing activity is empty.

Activity recording is turned off.

See more...