On correlated reaction sets and coupled reaction sets in metabolic ...

5 downloads 0 Views 362KB Size Report
Mar 4, 2015 - Furthermore, by analyzing genome-scale metabolic networks, we ... There are two main strategies for modeling metabolic network °uxes. The ¯ ...
Journal of Bioinformatics and Computational Biology Vol. 13, No. 4 (2015) 1571003 (16 pages) # .c Imperial College Press DOI: 10.1142/S0219720015710031

On correlated reaction sets and coupled reaction sets in metabolic networks

Sayed-Amir Marashi*,†,‡ and Zhaleh Hosseini* *Department

of Biotechnology, College of Science University of Tehran, Tehran, Iran

† Department of Bioinformatics, School of Computer Science Institute for Research in Fundamental Sciences (IPM), Tehran, Iran ‡[email protected]

Received 13 August 2014 Revised 25 January 2015 Accepted 25 January 2015 Published 4 March 2015 Two reactions are in the same \correlated reaction set" (or \Co-Set") if their °uxes are linearly correlated. On the other hand, two reactions are \coupled" if nonzero °ux through one reaction implies nonzero °ux through the other reaction. Flux correlation analysis has been previously used in the analysis of enzyme dysregulation and enzymopathy, while °ux coupling analysis has been used to predict co-expression of genes and to model network evolution. The goal of this paper is to emphasize, through a few examples, that these two concepts are inherently di®erent. In other words, except for the case of full coupling, which implies perfect correlation between two °uxes (R 2 ¼ 1), there are no constraints on Pearson correlation coe±cients (CC) in case of any other type of (un)coupling relations. In other words, Pearson CC can take any value between 0 and 1 in other cases. Furthermore, by analyzing genome-scale metabolic networks, we con¯rm that there are some examples in real networks of bacteria, yeast and human, which approve that °ux coupling and °ux correlation cannot be used interchangeably. Keywords: Co-sets; °ux coupling analysis (FCA); Monte Carlo sampling; uniform sampling; °ux cone.

1. Introduction There are two main strategies for modeling metabolic network °uxes. The ¯rst strategy is to use kinetic modeling. This strategy requires extensive knowledge on reaction kinetics and parameters. Unfortunately, it is not always possible to use this strategy for modeling metabolism, as complete knowledge about a metabolic system is rarely available. The second strategy is to use constraint-based modeling of metabolic °uxes.1,2 In the latter framework, the networks are supposed to function in steady state and the only constraints used are stoichiometric, thermodynamic ‡ Corresponding

author. 1571003-1

S.-A. Marashi & Z. Hosseini

(i.e. reversibility/irreversibility of reactions) and capacity constraints. In the present paper, we will discuss (and compare) the properties of two constraint-based concepts, namely °ux coupling and °ux correlation. The term \coupled reaction set" was ¯rst used in 2004 to describe a set of metabolic reactions (°uxes) which are coupled to each other based on °ux coupling analysis (FCA).3 If a nonzero °ux through reaction i ðvi 6¼ 0Þ implies a nonzero °ux through reaction j ðvj 6¼ 0Þ, then i is said to be directionally coupled to j (or, i ! j). If i ! j and j ! i, then reactions i and j are partially coupled (or, i $ j). Reactions i and j are called fully coupled (i , j) if they are partially coupled, and additionally, °uxes through reactions i and j are proportional (vi =vj ¼ constant). Two unblocked °uxes are either (fully, partially or directionally) coupled, or uncoupled.3 Coupling and uncoupling relationships have many biological implications. In case of full coupling, two reactions will always operate together with a ¯xed °ux ratio. This may imply that their corresponding genes are located in the same operon and/or they are co-regulated/co-expressed.3,4 In case of directional coupling, the dependency between reactions is unidirectional. It suggests that the independent reaction (i.e. reaction j in an i ! j relation) should be more essential, expressed, regulated and conserved compared to the dependent reaction.5 Horizontal gene transfer has also been shown to be a®ected by constraints applied by directional coupling relations between transferred genes and genes already present in host genome.6 In addition, the set of equivalent knockouts for eliminating the activity of a particular reaction can be identi¯ed by ¯nding the reactions to which the desired reaction is directionally coupled.3 This can be useful in designing genetic engineering experiments. On the other hand, uncoupled reactions have often compensatory roles in the absence of each other. In other words, the activity of only one is su±cient to have the function.7 Therefore, they can be regarded as alternative reactions, and from the essentiality point of view, they are synthetic lethal pairs. For instance, in some diseases like cancer, in which one of the two reactions is inactivated through mutation, the removal of the other reaction may result in cancerous cell death. In contrast, in normal cells (with no mutation), removal of the second reaction will not be harmful.7 Shortly after the introduction of FCA, Price et al. used the term \correlated reaction set" (Co-Set), to describe a set of reactions with correlated °uxes.8 The Pearson correlation coe±cient (CC) between every pair of reactions can be computed by Monte Carlo sampling of the °ux space. For a pair of perfectly correlated °uxes i and j ðjRi;j j ¼ 1Þ, the authors used the term \perfectly correlated" reactions.8 Random sampling and °ux correlation are used in various studies in order to explore di®erent biological properties of metabolic networks. For example, it has been shown that uniformly sampled °ux vectors can give insights into regulatory properties of di®erent reactions in the network.9,10 Additionally, uniform random sampling can be used for comparing in silico °ux distributions with experimental data, when additional constraints is introduced to model, based on experimental results.11,12 Calculation of pairwise CC between pairs of °uxes is a useful method for measuring the dependencies between pairs of °uxes under di®erent conditions.13,14 1571003-2

Co-sets and coupled reaction sets in metabolic networks

Such dependencies can be used to explore the e®ect of enzyme dysregulation and also enzymopathies. Immediately after the introduction of the two concepts of \°ux coupling" and \°ux correlation", Papin and his colleagues used the term Co-Sets to describe reactions with perfectly, partially and directionally related °uxes.15 Moreover, they emphasized that Co-Sets can be computed by pathways analysis, FCA or by analysis of correlated °uxes.15 After this paper, several other studies used similar terminology.4,16–19 Here, we emphasize that using the term \Co-Set" can be confusing, as °ux coupling and °ux correlation are two di®erent concepts. To demonstrate this, we show that uncoupled, directionally coupled and partially coupled °uxes can be perfectly correlated/completely uncorrelated. To be precise, reactions i and j are de¯ned to be in a perfect Co-Set if they have a ¯xed °ux ratio.15 As a result, the de¯nition of perfect Co-Sets is more restrictive than the de¯nition of perfectly correlated reaction sets,8 since jRi;j j ¼ 1 does not necessarily guarantee a ¯xed °ux ratio, e.g. when vi ¼ vj ¼ 0 is not included in the feasible °ux space.19 2. Materials and Methods 2.1. Computing Pearson CC between pairs of °uxes In the present study, for ¯nding the Pearson CC, between a pair of °uxes in each example network, COBRA toolbox20 was used to sample 100,000 uniform random vectors over the °ux space. COBRA toolbox uses parallelized arti¯cial center hit and run algorithm (ACHR) to sample the °ux space.21 Brie°y, an initial point is selected in the feasible space of °ux distributions. Then, the algorithm generates \warm-up" points. These points are stored in a matrix, and a centroid, s, is calculated for it. In the next step, a random vector, xn , is chosen in the matrix and a new point is generated by moving toward the direction of (s  xn ). This point is substituted randomly in the matrix. The procedure of determining the centroid and generating the new point is iterated until the desired number of points is achieved. The CC is assumed to be equal to zero if R 2 < 10 5 . In the analysis of the small example networks (see Figs. 1–3) the lower bound (respectively upper bound) of °uxes is 0 (respectively 2), except those reactions with ¯xed °uxes (shown by asterisk). Note that the results do not depend on the upper bounds of °uxes, in the sense that changing the upper bounds will only change the °ux ranges (on the axes of the plots) or the position of the lines in Figs. 1(d), 2(d) or 3(d) (see Sec. 3). 2.2. Computing maximal information coe±cients between pairs of °uxes In general, it is possible to have two variables which depend on each other, but their \linear" correlation is zero. One possibility to ¯nd the interdependence of such variables is to consider the mutual information of them. Mutual information between two variables A and B, denoted by IðA, BÞ, determines the dependence between two 1571003-3

S.-A. Marashi & Z. Hosseini

variables by measuring the information that they share. Mutual information is de¯ned by the following equation: IðA; BÞ ¼

MA X MB X i¼1

 pðai ; bj Þ log

j¼1

 pðai ; bj Þ ; pðai Þpðbj Þ

where pðai ,bj Þ is the joint probability that A is in the state of ai and B is in the state of bj . Intuitively, IðA; BÞ measures the reduction in uncertainty about one variable after observing the other. For detailed description of mutual information please see Ref. 22. In the present work, we used Minerva package23 in R environment24 to compute maximal information coe±cients (MIC, a means to apply mutual information) to examine reaction pairs which are found to be linearly uncorrelated based on computing Pearson CC. MIC is a metric for measuring the association between two variables which is based on mutual information. It can capture a wide range of associations. Informally speaking, it is supposed that if two variables are related there is a grid that can partition the scatter plot of these two variables, such that the relationship can be captured. Therefore, all possible grids are examined until a maximum for grid resolution is achieved. The largest mutual information value which can be obtained based on these grids is calculated and normalized. This statistic which is referred as MIC is a value between zero and one.25 2.3. Computing °ux coupling relations We calculated the °ux coupling relations by means of CFCA tool.26 This tool is basically an extension of the previously reported tools for computing °ux coupling relations.27,28 The main di®erence is to add the possibility to consider lower and upper bounds for each reaction in the network, while the previous tools assume no constraints on the lower and upper bounds. We used SoPlex solver in the OPTI Toolbox29 (as an \exact solver") to solve the linear programs required in FCA. 2.4. Analysis of genome-scale metabolic networks In the present study, to compare °ux coupling relation and °ux CC between realworld reactions, we studied seven di®erent genome-scale metabolic networks of ¯ve biologically important microorganisms (see Table 1 for a summary of general biological properties of these networks). Pseudomonas aeruginosa and Mycobacterium tuberculosis are major life-threatening pathogens, and therefore, their reconstructed metabolic networks can be used in identi¯cation of novel drug targets.30,31 Escherichia coli and Pseudomonas putida are two of the most important bacteria in industry. In silico modeling of the metabolism of these microorganisms can be employed in metabolic engineering applications for production of desired metabolites and in bioremediation processes.32,33 Saccharomyces cerevisiae, the baker's yeast, is extensively used for production of 1571003-4

Co-sets and coupled reaction sets in metabolic networks

food, pharmaceuticals and biofuels. Di®erent versions of yeast metabolic network have been used in metabolic engineering and in evolutionary studies.34 For each metabolic network we considered ¯ve di®erent conditions: (i) ¯xing the °ux of glucose uptake; (ii) ¯xing the °ux of oxygen uptake; (iii) ¯xing the °uxes of glucose and oxygen uptake; (iv) ¯xing the °ux of fructose uptake; and ¯nally (v) ¯xing the °uxes of fructose and oxygen uptake. It should be noted that the M. tuberculosis model does not include fructose, and therefore, for this network cases (iv) and (v) are not applicable. In each case, we computed the number of instances where: .

Reactions Reactions . Reactions . Reactions . Reactions .

are are are are are

directionally coupled but perfectly correlated, directionally coupled but completely uncorrelated, partially coupled but perfectly correlated, partially coupled but completely uncorrelated, uncoupled but perfectly correlated.

3. Results and Discussion 3.1. Correlation between uncoupled °uxes Figure 1(a) presents an example of uncoupled °uxes, where CC is zero (Fig. 1(b)). It might not be surprising to have uncorrelated uncoupled °uxes. Figure 1(c) is probably more interesting. In this example, °ux through reaction 1 is always equal to 1. As a result, °uxes 2 and 3 are uncoupled, because v2 ¼ 1 (i.e. a nonzero °ux through reaction 2) implies v3 ¼ 0, and additionally, v3 ¼ 1 implies v2 ¼ 0. However, these °uxes are perfectly correlated (Fig. 1(d)). 3.2. Correlation between directionally coupled °uxes In Fig. 2(a), °ux 3 is directionally coupled to °ux 2. To the best of our knowledge, it was previously believed that °ux coupling guarantees °ux correlation (R 2 6¼ 0). Figure 2(b) proves that this is not correct, where 2 and 3 are completely uncorrelated. Another example of directional coupling is shown in Fig. 2(c). Here, v3 ¼ 1, which results in a perfect correlation between °uxes 1 and 2 (Fig. 2(d)). Note that the diamond-like pattern in Fig. 2(b) is obtained because reaction 3 cannot carry any °ux independent of the °ux of reaction 2. If v3 is active in the forward direction, i.e. it converts B to C, then we have these two constraints: v2 þ v3 ¼ v4 ; v2 ¼ v3 þ v5 : Since v4  2, from the ¯rst equation it is inferred that sum of v2 and v3 is equal or less than 2. Hence, the points in Fig. 2(b) should be under the line v2 þ v3 ¼ 2. From the second equation it is inferred that v2 should always be more than v3 . Consequently, sampled points must be located under the line v2 ¼ v3 . Therefore, the points 1571003-5

S.-A. Marashi & Z. Hosseini

(a)

(b)

(c)

(d)

Fig. 1. (a) A small network in which °uxes 3 and 7 are uncoupled. (b) The corresponding scatter plot shows that the °uxes are uncorrelated. (c) A small network in which °uxes 2 and 3 are uncoupled. Here, v1 ¼ 1. (d) The corresponding scatter plot shows that the °uxes are perfectly correlated.

are restricted in a triangle made by these three lines: v3 ¼ 0, v2 þ v3 ¼ 2 and v2 ¼ v3 . When v3 is active in the reverse direction, the same reasoning can explain the existence of the lower part triangle in Fig. 2(b). 3.3. Correlation between partially coupled °uxes We show that even partially coupled °uxes i and j (with vi 6¼ 0 if and only if vj 6¼ 0) do not necessarily belong to a Co-Set. It can be easily shown that °uxes 1 and 4 in Fig. 3(a) are partially coupled. However, the stoichiometric coe±cients of reactions 2 and 3 are chosen such that the CC between °uxes 1 and 4 becomes zero (Fig. 3(b)). Similar to the uncoupling and directional coupling cases, it is possible to ¯nd examples of partially coupled °uxes which are perfectly correlated. In Fig. 3(c), 1571003-6

Co-sets and coupled reaction sets in metabolic networks

(a)

(b)

(c)

(d)

Fig. 2. (a) A small network in which °ux 3 is directionally coupled to °ux 2. (b) The corresponding scatter plot shows that the °uxes are uncorrelated. (c) A small network in which °ux 2 is directionally coupled to °ux 1. Here, v3 ¼ 1. (d) The corresponding scatter plot shows that the °uxes are perfectly correlated.

°uxes 1 and 2 are partially coupled. If v3 ¼ 1, however, a perfect correlation is found between °uxes 1 and 2. 3.4. Flux coupling versus °ux correlation in real biological networks In the previous sections, we observed, through a few examples, that °ux coupling and °ux correlation are inherently di®erent concepts. In other words, except for the case of full coupling, which implies perfect correlation between two °uxes (R 2 ¼ 1), there are no constraints on Pearson CC in case of any other type of coupling/uncoupling relations. In other words, Pearson CC can take any value between 0 and 1 in other cases. Here, we show that such examples (which will be referred to as \exceptional cases") can also be found in genome-scale metabolic networks. 1571003-7

S.-A. Marashi & Z. Hosseini

(a)

(b)

(c)

(d)

Fig. 3. (a) A small network in which °uxes 1 and 4 are partially coupled. (b) From the scatter plot it was shown that the °uxes are uncorrelated. (c) A small network in which °uxes 1 and 2 are partially coupled. Here, v3 ¼ 1. (d) The corresponding scatter plot shows that the °uxes are perfectly correlated.

We calculated the °ux coupling relations by means of CFCA tool.26 For calculation of CC between pairs of reactions we sampled 100,000 points for each model. The results are shown in Fig. 4. For all of the metabolic networks, we found some exceptional cases at least under one of the growth conditions. Some of these cases are rather infrequent, e.g. directionally coupled reactions with perfect correlation, uncoupled reactions with perfect correlation, or partially coupled reactions with zero correlation. However, in all networks, we observed a large number of directionally coupled reaction pairs which are not correlated. Figure 5(a) shows an example of directional coupling and perfect correlation in iND750. Here, D-sorbitol dehydrogenase reaction is directionally coupled to hexokinase reaction. As the °ux through fructose exchange is ¯xed, the °uxes of the two mentioned reactions become perfectly correlated. 1571003-8

Co-sets and coupled reaction sets in metabolic networks P. aeruginosa (iMO1086)

4

3.5 Condition i

3 2.5

Condition ii

2

Condition iii

1.5

Condition iv

1

Log10(frequency+1)

3.5

Log10(frequency+1)

E.coli (iJR904)

4

Condition i

3 2.5

Condition ii

2

Condition iii

1.5

Condition iv

1 Condition v

Condition v

0.5

0.5

0

0

case i

case i case ii case iii case iv case v

Exceptional cases E. coli (iJO1366)

4.5

3.5

3.5 Condition i

3 2.5

Condition ii

2

Condition iii

1.5

Condition iv

1

Condition v

Log10(frequency+1)

Log10(frequency+1)

P. putida (iJN746)

4

4

Condition i

3 2.5

Condition ii

2

Condition iii

1.5

Condition iv

1

Condition v

0.5

0.5

0

0 case i case ii case iii case iv case v

case i case ii case iii case iv case v

Exceptional cases

Exceptional cases M. tuberculosis (iNJ661)

4.5

S. cerevisiae (iND750)

4

4

3.5

3.5 3

Condition i

2.5

Condition ii

2

Condition iii

1.5 1

Log10(frequency+1)

Log10(frequency+1)

case ii case iii case iv case v

Exceptional cases

Condition i

3 2.5

Condition ii

2

Condition iii

1.5

Condition iv

1

Condition v

0.5

0.5 0

0 case i case ii case iii case iv case v

case i case ii case iii case iv case v

Log10(frequency+1)

Exceptional cases

Exceptional cases 5

S. cerevisiae (Yeast 5)

4

Condition i Condition ii

3 Condition iii

2

Condition iv Condition v

1 0

case i

case ii case iii case iv case v

Exceptional cases

Fig. 4. Logarithm (base 10) of number of exceptional cases in seven di®erent genome-scale metabolic networks (Table 1) under di®erent growth conditions. For each metabolic network, we considered ¯ve di®erent conditions: (i) Fixing the °ux of glucose uptake, (ii) ¯xing the °ux of oxygen uptake, (iii) ¯xing the °uxes of glucose and oxygen uptake, (iv) ¯xing the °ux of fructose uptake, and ¯nally (v) ¯xing the °uxes of fructose and oxygen uptake. The ¯ve exceptional cases in each horizontal axis are: (i) Directionally coupled/Perfectly correlated, (ii) Directionally coupled/Uncorrelated, (iii) Partially coupled/ Perfectly correlated, (iv) Partially coupled/Uncorrelated, (v) Uncoupled/Perfectly correlated. It should be noted that the M. tuberculosis model does not include \fructose", and therefore, for this network conditions (iv) and (v) are not applicable. 1571003-9

S.-A. Marashi & Z. Hosseini Table 1. General biological properties of studied metabolic networks, and the proportion of number of exceptional cases (see Fig. 4).

Species name

Number of Proportion Total Number of unblocked of the number Metabolic Number of Number number of unblocked reaction of exceptional model metabolites of genes reactions reactions pairs cases (%)

P. aeruginosa iMO108630 E. coli iJR90435 E. coli iJO136633 P. putida iJN74632 M. tuberculosis iNJ66131 S. cerevisiae iND75036 S. cerevisiae Yeast 537

1021 761 1805 911 826 1061 1655

1086 904 1367 760 661 750 918

1136 1075 2583 1060 1025 1266 2110

685 669 1692 676 740 632 1165

234270 223446 1430586 228150 273430 199396 678030

1.79 1.09 0.90 1.66 2.45 2.10 1.83

We should note that we also observed some cases in which none of the °uxes are ¯xed, but the possible values for one of the °uxes are too small compared to the other °uxes. For instance, in Fig. 5(b) a small subnetwork of iJN746 is shown. Here, due to the stoichiometric constraints imposed by the network, dihydrouracil dehydrogenase reaction can only carry °uxes in range of 0  10 4 , while the range for other two

(a)

(b)

(c)

(d)

Fig. 5. (a) Hexokinase and D-sorbitol dehydrogenase are directionally coupled and fully correlated, if the fructose exchange °ux is ¯xed. (b) Pyrimidine-nucleoside phosphorylase and uracil phosphoribosyl transferase are directionally coupled and fully correlated, if the °ux through dihydrouracil dehydrogenase reaction lies in a tight range (0  10 4 ). (c) Glucose dehydrogenase and hexokinase are uncoupled but perfectly correlated, if glucose exchange reaction has a ¯xed °ux. (d) Diphosphorylase and adenylyl transferase are directionally coupled but completely uncorrelated.

1571003-10

Co-sets and coupled reaction sets in metabolic networks

reactions is 01000. In this case, CC between °ux of pyrimidine-nucleoside phosphorylase and °ux of uracil phosphoribosyl transferase is 1. For the case of uncoupled reactions with perfect correlation, we can mention a case from iJN746 (Fig. 5(c)). Here, °ux through the glucose exchange reaction is ¯xed. In this case, glucose dehydrogenase and hexokinase reactions are uncoupled but perfectly correlated. Fixing a °ux to a particular value may dictate the existence of nonzero °uxes through other reactions. As an example, in iND750, when we ¯x the °ux through oxygen, the glucose exchange °ux becomes nonzero. These two reactions are partially coupled by de¯nition (as both reactions always have a nonzero °ux, while their °ux ratio is not a constant value). However, the Pearson CC between these °uxes is zero. As a di®erent case, we should mention ATP hydrolysis in iMO1086. When oxygen uptake °ux is ¯xed to a nonzero value, H2O exchange °ux and ATP hydrolysis °ux (ATP þ H2 O ! ADP þ Pi þ H þ ) are always nonzero. However, the °ux ratio for these two reactions is not ¯xed, and therefore, H2O exchange and ATP hydrolysis are partially coupled. In this case, the computed Pearson CC between °uxes is almost zero (or more precisely, 2  10 6 ). Finally, we observed some directionally coupled reaction pairs in which none of the °uxes are ¯xed but the CC is almost zero. An example of such cases is indicated in Fig. 5(d). In this ¯gure, Amidase, diphosphorylase and pyrophosphorylase are directionally coupled to adenylyl transferase. Possible °ux ranges for these four reactions are 023:1, 00:0025, 013:2 and 023:1, respectively. However, the Pearson CC between diphosphorylase and adenylyl transferase °uxes is 2:5  10 6 . 3.5. On the meaning of \(un)coupled " and \(un)correlated " reactions In the present paper, we show that CCs equal to zero/one may exist in case of (partial or directional) coupling and uncoupling. This means that only full coupling of °uxes implies perfect correlation, and none of the other coupling relations imply °ux correlation, or vice versa. Therefore, we suggest that the terms \perfect", \partial" and \directional" Co-Sets are misleading and should be avoided. From the biological point of view, directional °ux coupling means that one metabolic enzyme must be expressed if another metabolic enzyme is to be expressed in the cell. Partial and full °ux coupling can be seen as the mutual dependence of such metabolic enzymes. On the other hand, °ux correlation does not necessarily imply any biological constraint on the expression of the metabolic enzymes. From the biological point of view, °ux correlation simply re°ects the existence of some interrelationship between the enzyme activities. Such an interrelationship is presumably due to co-occurrence of these enzymes on some common biochemical pathway. Here, we discuss an example which re°ect the biological importance of the difference between °ux coupling and °ux correlation. In a previous study, sampling of the °ux space was used to examine the properties of the human mitochondrial metabolic network in di®erent conditions.13 One of these conditions was high fat-low 1571003-11

S.-A. Marashi & Z. Hosseini

Fig. 6. An example of di®erence between °ux coupling and °ux correlation in mitochondrial metabolic network. Co-Set 1 and Co-Set 2 are uncoupled both in normal physiological condition and in high fat-low glucose diet. However, these Co-Sets show very di®erent CC between the two conditions, because of the change in the uptake rate of hexadecanoate (and other fatty acids). In this ¯gure, (e), (c) and (m) represent di®erent cellular compartments that are \extracellular", \cytosol" and \mitochondrion", respectively.

glucose diet in which the uptake rate of fatty acids was shown to be increased. It has been shown that the Pearson CC of many pairs of Co-Sets (sets of perfectly correlated °uxes) are highly di®erent between this condition and normal physiological condition. Here, we calculated the °ux coupling relationships between these reaction pairs in both conditions. Figure 6 shows one of these cases. Here, °ux coupling relationships of the reaction pairs were identical between the two conditions. In this ¯gure, Co-Set 1 consists of reactions for import of (R)-3-hydroxybutanoate to the mitochondria and its conversion to acetoacetate. Co-Set 2 consists of reactions for import of hexadecanoate (n-C16:0) to the cell, its conversion to palmitoyl carnitine and entrance of this metabolite to the mitochondria. Based on FCA, in both conditions these sets are found to be uncoupled. This means that expression of the enzymes of these sets are not dependent to each other, i.e. each of which can be expressed in the absence of the other. However, the Pearson CC of these sets are highly di®erent from one condition to the other.13 As it is mentioned, the uptake rate of hexadecanoate and other fatty acids are increased in the high fat-low glucose diet condition. Additionally, the presence of NAD and NADH is essential for the reactions of both sets to carry a nonzero °ux. The increased rate of fatty acids uptake in high fat-low glucose diet, results in di®erences in the available amount of these cofactors between the two sets in each condition. Consequently, since correlation of °uxes depends on the enzyme activities which in turn depends on the available metabolites, the °ux correlation between these sets is di®erent in the two studied conditions. 3.6. Mutual information versus Pearson CC In Figs. 1(b), 2(b) and 3(b), R 2 ¼ 0 is based on the Pearson CC between °uxes. However, the (nonlinear) dependence between °ux values can be easily observed in these examples. For instance, as it is explained in Sec. 3.2, although the reactions in 1571003-12

Co-sets and coupled reaction sets in metabolic networks

Fig. 2(b) are not linearly dependent, the diamond-like pattern shows that they are not completely independent. In fact, this pattern is generated because v2 and v3 are related by some linear constraints. While Pearson correlation is probably the most popular measure of dependence, our examples suggest that other measures of dependence, e.g. mutual information22 are probably more relevant for studying the dependencies between °uxes. To show this, using Minerva package23 in R, we calculated MIC25 (a means to apply mutual information) for the reaction pairs in Figs. 1(b), 2(b) and 3(b). The MIC values are found to be 0.07, 0.13 and 0.24, respectively. Therefore, the observable nonlinear dependence of °uxes can be re°ected if other measures of dependence, like MIC, are used in studying the °ux correlations. We should note that mutual information has been previously used when Pearson CC could not re°ect the dependence between variables (see Ref. 38 as an example). 3.7. Conclusions In the present work, we showed that FCA and °ux correlation analysis represent two inherently di®erent concepts, since the existence of one does not dictate the existence of the other. Even in the simple case of full coupling and perfect correlation, we showed that these two concepts are not equivalent, i.e. perfect correlation does not imply full coupling. Additionally, we suggest avoiding the terms perfect, partial and directional \Co-Set". This is because partial or directional coupling may not necessarily imply correlation, as we showed in our examples. In the present paper, we presented a couple of hypothetical examples, each of which re°ects one of the exceptional cases mentioned in the text. Furthermore, we explored real-world biological networks to ¯nd such examples, in order to stress on the importance of our analysis. We presented explicit examples in real-world genomescale networks which approve that °ux coupling and °ux correlation cannot be used interchangeably. One should note that some parts of our results depend on the ¯xing of °ux values of some reactions. For instance, in case of Fig. 2(c), if v3 is not ¯xed then the perfect correlation between v1 and v2 is not obtained (v1 versus v2 plot will be a triangle-like shape, instead of a straight line). However, in a real biological network, °uxes of many reactions may not be ¯xed. Therefore, in Sec. 3.4, where we analyzed real networks, we only ¯xed the °uxes of one or two exchange reactions in each simulation. We showed that even in these conditions we can ¯nd several cases which indicate the di®erence of °ux coupling and °ux correlation. In addition, we showed that even when none of the °uxes are ¯xed, interesting examples can be still found, e.g. when one of the °uxes shows a very tight range of variability (e.g. between 0 and 0.0001). Finally, we showed that mutual information is a better measure of dependence to be applied in studying °ux dependencies. Pearson CC, which has been widely used in the analysis of metabolic networks, is a proper measure of linear dependence. 1571003-13

S.-A. Marashi & Z. Hosseini

However, for capturing the nonlinear dependence between °uxes, which is observed in many cases such as Figs. 2(b) and 3(b), other measures of dependence may be more relevant. In the present work, we showed that calculating mutual information using MIC can successfully detect the dependence between those °uxes with nonlinear dependence whose Pearson CC is zero. Acknowledgments We are thankful to L. David and A. Bockmayr (Freie Universität Berlin) for their fruitful discussions. The CFCA tool was kindly provided by L. David. We would like to acknowledge the ¯nancial support of University of Tehran for this research under grant number 28791/1/2. References 1. Bordbar A, Monk JM, King ZA, Palsson BO, Constraint-based models predict metabolic and associated cellular functions, Nat Rev Genet 15:107–120, 2014. 2. Klamt S, Hädicke O, von Kamp A, Stoichiometric and constraint-based analysis of biochemical reaction networks, in Benner P, Findeisen R, Flockerzi D, Reichl U, Sundmacher K (eds.), Large-Scale Networks in Engineering and Life Sciences, Springer, pp. 263–316, 2014. 3. Burgard AP, Nikolaev EV, Schilling CH, Maranas CD, Flux coupling analysis of genomescale metabolic network reconstructions, Genome Res 14:301–312, 2004. 4. Notebaart RA, Teusink B, Siezen RJ, Papp B, Co-regulation of metabolic genes is better explained by °ux coupling than by network distance, PLoS Comput Biol 4(1):e26, 2008. 5. Notebaart RA, Kensche PR, Huynen MA, Dutilh BE, Asymmetric relationships between proteins shape genome evolution, Genome Biol 10(2):R19, 2009. 6. Pal C, Papp B, Lercher MJ, Horizontal gene transfer depends on gene content of the host, Bioinformatics 21(suppl 2):ii222–ii223, 2005. 7. Lu X, Kensche PR, Huynen MA, Notebaart RA, Genome evolution predicts genetic interactions in protein complexes and reveals cancer drug targets, Nat Commun 4:2124, 2013. 8. Price ND, Schellenberger J, Palsson BO, Uniform sampling of steady-state °ux spaces: Means to design experiments and to interpret enzymopathies, Biophys J 87(4):2172– 2186, 2004. 9. Barrett CL, Price ND, Palsson BO, Network-level analysis of metabolic regulation in the human red blood cell using random sampling and singular value decomposition, BMC Bioinformatics 7(1):132, 2006. 10. Barrett CL, Herrgard MJ, Palsson B, Decomposing complex reaction networks using random sampling, principal component analysis and basis rotation, BMC Syst Biol 3(1):30, 2009. 11. Mo ML, Palsson BØ, Herrgård MJ, Connecting extracellular metabolomic measurements to intracellular °ux states in yeast, BMC Syst Biol 3(1):37, 2009. 12. Hadi M, Marashi SA, Reconstruction of a generic metabolic network model of cancer cells, Mol Biosyst 10:3014–3021, 2014. 13. Thiele I, Price ND, Vo TD, Palsson BØ, Candidate metabolic network states in human mitochondria impact of diabetes, ischemia, and diet, J Biol Chem 280(12):11683–11695, 2005.

1571003-14

Co-sets and coupled reaction sets in metabolic networks

14. Ghosh A, Zhao H, Price ND, Genome-scale consequences of cofactor balancing in engineered pentose utilization pathways in Saccharomyces cerevisiae, PloS One 6(11): e27316, 2011. 15. Papin JA, Reed JL, Palsson BO, Hierarchical thinking in network biology: The unbiased modularization of biochemical networks, Trends in Biochem Sci 29(12):641–647, 2004. 16. Bundy JG, Papp B, Harmston R, Browne RA, Clayson EM, Burton N et al., Evaluation of predicted network modules in yeast metabolism using NMR-based metabolite pro¯ling, Genome Res 17(4):510–519, 2007. 17. Chavali AK, D'Auria KM, Hewlett EL, Pearson RD, Papin JA, A metabolic network approach for the identi¯cation and prioritization of antimicrobial drug targets, Trends Microbiol 20(3):113–123, 2012. 18. Mo ML, Jamshidi N, Palsson BO, A genome-scale, constraint-based approach to systems biology of human metabolism, Mol Biosyst 3(9):598–603, 2007. 19. Xi Y, Chen YP, Qian C, Wang F, Comparative study of computational methods to detect the correlated reaction sets in biochemical networks, Brief Bioinformatics 12(2):132–150, 2011. 20. Schellenberger J, Que R, Fleming RM, Thiele I, Orth JD, Feist AM et al., Quantitative prediction of cellular metabolism with constraint-based models: The COBRA Toolbox v2.0, Nat Protocols 6(9):1290–1307, 2011. 21. Kaufman DE, Smith RL, Direction choice for accelerated convergence in hit-and-run sampling, Oper Res 46(1):84–95, 1998. 22. Steuer R, Kurths J, Daub CO, Weise J, Selbig J, The mutual information: Detecting and evaluating dependencies between variables, Bioinformatics 18:S231–S240, 2002. 23. Albanese D, Filosi M, Visintainer R, Riccadonna S, Jurman G, Furlanello C, Minerva and minepy: A C engine for the MINE suite and its R, Python and MATLAB wrappers, Bioinformatics 29:407–408, 2013. 24. Ihaka R, Gentleman R, R: A language for data analysis and graphics, J Comput Graph Stat 5:299–314, 1996. 25. Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G, Turnbaugh PJ et al., Detecting novel associations in large data sets, Science 334:1518–1524, 2011. 26. David L, Bockmayr A, Constrained °ux coupling analysis, Proc WCB13 Workshop on Constraint Based Methods for Bioinformatics, pp. 75–83, 2013. 27. David L, Marashi SA, Larhlimi A, Mieth B, Bockmayr A, FFCA: A feasibility-based method for °ux coupling analysis of metabolic networks, BMC Bioinformatics 12:236, 2011. 28. Larhlimi A, David L, Selbig J, Bockmayr A, F2C2: A fast tool for the computation of °ux coupling in genome-scale metabolic networks, BMC Bioinformatics 13:57, 2012. 29. Currie J, Wilson DI, OPTI: Lowering the barrier between open source optimizers and the industrial MATLAB user, Proc FOCAPO2012 (Foundations of Computer-Aided Process Operations), Savannah, Georgia, USA, pp. 8–11, 2012. 30. Oberhardt MA, Puchałka J, Fryer KE, Dos Santos VAM, Papin JA, Genome-scale metabolic network analysis of the opportunistic pathogen Pseudomonas aeruginosa PAO1, J Bacteriol 190(8):2790–2803, 2008. 31. Fang X, Wallqvist A, Reifman J, Development and analysis of an in vivo-compatible metabolic network of Mycobacterium tuberculosis, BMC Syst Biol 4(1):160, 2010. 32. Nogales J, Palsson BØ, Thiele I, A genome-scale metabolic reconstruction of Pseudomonas putida KT2440: iJN746 as a cell factory, BMC Syst Biol 2(1):79, 2008. 33. Orth JD, Conrad TM, Na J, Lerman JA, Nam H, Feist AM et al., A comprehensive genome-scale reconstruction of Escherichia coli metabolism    2011, Mol Syst Biol 7(1):535, 2011.

1571003-15

S.-A. Marashi & Z. Hosseini

34. Österlund T, Nookaew I, Nielsen J, Fifteen years of large scale metabolic modeling of yeast: Developments and impacts, Biotechnol Adv 30(5):979–988, 2012. 35. Reed JL, Vo TD, Schilling CH, Palsson BO, An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR), Genome Biol 4(9):R54, 2003. 36. Duarte NC, Herrgård MJ, Palsson BØ, Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model, Genome Res 14(7):1298–1309, 2004. 37. Heavner BD, Smallbone K, Barker B, Mendes P, Walker LP, Yeast 5    An expanded reconstruction of the Saccharomyces cerevisiae metabolic network, BMC Syst Biol 6(1):55, 2012. 38. Nikiforova VJ, Daub CO, Hesse H, Willmitzer L, Hoefgen R, Integrative gene-metabolite network with implemented causality deciphers informational °uxes of sulphur stress response, J Exp Bot 56:1887–1896, 2005.

Sayed-Amir Marashi is an Assistant Professor at Department of Biotechnology, College of Science, University of Tehran. He ¯nished his doctoral studies at the Freie Universität Berlin in 2011. His research interests include protein-protein interaction network analysis and modeling metabolic networks and pathways.

Zhaleh Hosseini is currently an MSc student in direct PhD program of Biotechnology at University of Tehran, Iran. Her research interests include systems biology, and more speci¯cally, modeling and analysis of metabolic networks.

1571003-16