Genome Informatics 11: 185–195 (2000)


Two-Phase Partition Method for Simulating a Biological System at an Extremely High Speed

Hiroyuki Kurata (1)

Kazunari Taira (1, 2)

(1) Department of Biochemical Engineering and Science, Kyushu Institute of Technology, Kawazu, Iizuka, Fukuoka 820-8502, Japan

(2) National Institute for Advanced Interdisciplinary Research, AIST, MITI, Tsukuba Science City 305-8562, Japan

Abstract

To accelerate the simulation of biological systems, we propose a novel simulation method, the two-phase partition method, which calculates molecular processes at a higher speed than previously proposed methods. This method divides a biological system that can be described by chemical reaction equations into two phases: the binding phase and the reaction phase. We demonstrate the capability of the two-phase partition method to simulate a complex biological system at an extremely high speed and clarify the accuracy of the simulation. The two-phase partition method is very useful for simulating complex interactions among proteins and DNAs.

Keywords: simulation, calculation, biological system, high speed, heat shock response

1 Introduction

Genome sequencing projects and systematic functional analyses of complete gene sets are producing a mass of molecular information for a wide range of model organisms. This may enable computers to analyze whole biological systems at the level of molecular interactions, and thereby to explain the dynamic behavior of living cells: how all the cellular components function together as a living system. Molecular interaction networks such as the cell cycle [18], bacterial chemotaxis [4], the bacteriophage λ gene regulatory circuit [1], and circadian clocks have been elucidated so extensively that mathematical models can simulate their behavior in detail. Such models have been elaborately programmed to fit simulated data to observed data, which requires expertise and training in mathematical techniques. It is not easy for an ordinary experimentalist to work with this kind of hand-programmed modeling. The increasing demand for models of biochemical and physiological processes necessitates a comprehensive software suite that removes the time-consuming manual operations involved in formulating, debugging, and analyzing mathematical models.

Various simulators and software packages, such as GEPASI [11, 12], KINSIM [3, 6], MIST [7], MetaModel [5], SCAMP [15], E-CELL [17], and BEST-KIT [13], have been developed to convert a biological system into a mathematical model automatically, without requiring modeling expertise from the user. These simulators employ ordinary differential equations to simulate molecular processes, but exact simulation often requires a long calculation time because the concentrations of cellular components and the kinetic parameters span a huge range of scales. The number of copies of a protein or small molecule within a cell, which depends on the species, is distributed over a range of about $10^{8}$. In addition, the rate constants of reactions such as association, dissociation, conversion, and degradation can differ by more than $10^{10}$, depending on the kind of reaction. Such systems require a very fine time step, which makes the calculation time too long and restricts the use of ordinary differential equations.


To overcome the problem of calculation time, we propose a novel simulation method, the two-phase partition method, which calculates molecular processes at a higher speed than previously proposed methods. This method divides a biological system that can be described by chemical reaction equations into two phases: the binding phase and the reaction phase. The calculation speed is enhanced mainly by using simultaneous algebraic equations to express molecular binding processes. We demonstrate the capability of the two-phase partition method to calculate a complex biological system at an extremely high speed and clarify the accuracy of the simulation.

2 Mathematical modeling

2.1 General simulation methods

Various formalisms, such as the Michaelis-Menten equation, the power law formalism, and the conventional mass action equations, have been extensively employed for simulating biological systems composed of many chemical reactions such as conversion, synthesis, degradation, transport, and binding. The important point is that all of these reactions can be expressed as combinations of a simple chemical reaction equation:

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} E{:}S \overset{k_2}{\longrightarrow} E + P, \tag{1}$$

where E is the enzyme, S the substrate, P the product, and E:S the complex. The kinetic parameter $k_1$ is the association rate constant, $k_{-1}$ the dissociation rate constant, and $k_2$ the reaction rate constant. Below we characterize the advantages and disadvantages of the above formalisms for simulating a biological system.

2.1.1 The Michaelis-Menten Equation

Generally, Eq. (1) is converted into the Michaelis-Menten equation under the assumptions that the complex concentration [E:S] stays at a steady state and that $[E] \ll [S]$:

$$\frac{d[P]}{dt} = \frac{V_{max}[S]}{K_m + [S]}, \tag{2}$$

$$V_{max} = k_2 [E_{tot}], \tag{3}$$

where the maximum reaction rate $V_{max}$ and the Michaelis constant $K_m$ can be measured experimentally. The problem is that the complex concentration [E:S] is eliminated from the equation. In a biological system that includes protein signal transduction, the chains of interactions among proteins and DNAs are very long because the components interact directly or indirectly through their complexes. Exact simulation should therefore take the complex concentrations into account. The Michaelis-Menten equations are remarkably useful for the study of isolated reaction mechanisms, but they are often highly inappropriate for the study of integrated biochemical systems in vivo because they neglect the complex concentrations. The assumption $[E] \ll [S]$ of the Michaelis-Menten formalism is also violated by enzyme-enzyme interactions, which suggests that this formalism is problematic for characterizing protein signal transduction within integrated biochemical systems.

2.1.2 Power law formalism

To simulate large-scale, complex biochemical interaction networks, the power law formalism has been applied in place of the Michaelis-Menten equations. It can include the effects of all the components within a cell, and it describes the rates of reactions by products of power-law functions. The power law formalism provides the context for assessing the importance of fractal kinetics in the quantitative characterization. This formalism was demonstrated to characterize well the large-scale metabolism of the tricarboxylic acid cycle of Dictyostelium discoideum [16]. Although the power law formalism accurately represents the macroscopic behavior of large numbers of molecules such as metabolites, it poorly represents the behavior of small numbers of molecules such as proteins and DNAs. It therefore does not seem applicable to signal transduction pathways that involve interactions between enzymes and DNAs.

2.1.3 Conventional mass action equation

The chemical reaction equation of Eq. (1) is expanded into ordinary differential equations using the rate constant for binding between enzyme and substrate, $k_1$, the rate constant for dissociation of the enzyme-substrate complex, $k_{-1}$, and the rate constant for forming the product, $k_2$, as follows:

$$\frac{d[E]}{dt} = -k_1[E][S] + k_{-1}[E{:}S] + k_2[E{:}S], \tag{4}$$

$$\frac{d[S]}{dt} = -k_1[E][S] + k_{-1}[E{:}S], \tag{5}$$

$$\frac{d[E{:}S]}{dt} = k_1[E][S] - k_{-1}[E{:}S] - k_2[E{:}S], \tag{6}$$

$$\frac{d[P]}{dt} = k_2[E{:}S]. \tag{7}$$

This ordinary expansion is known as one of the S-system methods. It correctly considers all the molecular interactions and seems to be one of the best and most general ways to describe a complex biological system, but it has a serious weakness. Solving the differential equations takes a long time for a biochemical reaction network in which the values of the biochemical parameters differ hugely. Such a difference greatly decreases the time step required for numerical integration, so the calculation time becomes remarkably long.
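As an illustration of the conventional approach, the following Python sketch integrates the mass action equations for Eq. (1) with a classical fourth-order Runge-Kutta step. It is not the authors' C program; the function names and the parameter values in the usage lines are assumptions chosen only to make the example runnable.

```python
def mass_action_rhs(state, k1, km1, k2):
    """Time derivatives of ([E], [S], [E:S], [P]) for E + S <-> E:S -> E + P."""
    e, s, es, p = state
    bind = k1 * e * s      # association flux
    unbind = km1 * es      # dissociation flux
    react = k2 * es        # product formation flux
    return (-bind + unbind + react,   # free enzyme is released by both exits
            -bind + unbind,
            bind - unbind - react,
            react)


def rk4_step(f, state, dt, *args):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    ka = f(state, *args)
    kb = f(tuple(x + 0.5 * dt * k for x, k in zip(state, ka)), *args)
    kc = f(tuple(x + 0.5 * dt * k for x, k in zip(state, kb)), *args)
    kd = f(tuple(x + dt * k for x, k in zip(state, kc)), *args)
    return tuple(x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for x, a, b, c, d in zip(state, ka, kb, kc, kd))


def simulate_mass_action(state, k1, km1, k2, dt, n_steps):
    """Advance the full mass action model by n_steps steps of size dt."""
    for _ in range(n_steps):
        state = rk4_step(mass_action_rhs, state, dt, k1, km1, k2)
    return state


if __name__ == "__main__":
    # Hypothetical values: [E]=1, [S]=10, no complex or product at t=0.
    print(simulate_mass_action((1.0, 10.0, 0.0, 0.0), 1.0, 1.0, 0.5, 0.01, 1000))
```

Because all four concentrations are integrated explicitly, the step size `dt` has to resolve the fastest of the three rate constants, which is the cost that the next section sets out to avoid.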

2.2 A powerful formalism: Two-phase partition method

To overcome the problems of the Michaelis-Menten equations, the power law formalism, and the conventional mass action equations, we developed the two-phase partition method, which calculates large-scale interactive molecular networks at high speed. The two-phase partition method divides molecular interaction networks into two phases: the binding phase and the reaction phase. The left-hand side of Eq. (1) is assigned to the binding phase and the right-hand side to the reaction phase. The expansion is as follows.

Binding phase:

$$[E{:}S] = K_b [E][S], \tag{8}$$

$$[E_{tot}] = [E] + [E{:}S], \tag{9}$$

$$[S_{tot}] = [S] + [E{:}S], \tag{10}$$

Reaction phase:

$$\frac{d[P]}{dt} = k_2 [E{:}S], \tag{11}$$

where $[E_{tot}]$ and $[S_{tot}]$ are the total concentrations of enzyme and substrate, respectively. In the binding phase, the binding constant $K_b = k_1/k_{-1}$ is employed to express the molecular binding process instead of the association and dissociation rate constants ($k_1$, $k_{-1}$). The binding phase is described by nonlinear algebraic equations that consist of the binding equation, Eq. (8), and the mass balance equations for each component, Eqs. (9) and (10). The reaction phase, Eq. (11), is described by an ordinary differential equation. In the conventional method, a large difference between the values of $k_1$ and $k_{-1}$ often forces the time step to become very fine, which remarkably increases the calculation time. The two-phase partition method removes the parameters $k_1$ and $k_{-1}$ from the differential equations by employing the binding constant $K_b$, and thereby accelerates the calculation.
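For the single reaction of Eq. (1), the binding phase can be solved in closed form, which makes a compact illustration possible. The Python sketch below is an assumption-laden example rather than the authors' implementation: it substitutes the two mass balances into the binding equation, takes the physically meaningful root of the resulting quadratic, and advances the product with a plain explicit Euler step. The update of the substrate pool as `s0 - p` is an added mass balance assumption, since the text only gives the product equation.

```python
import math


def bound_complex(e_tot, s_tot, kb):
    """Solve [E:S] = Kb*[E]*[S] with [E] = e_tot - [E:S] and [S] = s_tot - [E:S].

    The smaller root of the quadratic is returned in a form that avoids
    subtracting two nearly equal numbers.
    """
    b = e_tot + s_tot + 1.0 / kb
    disc = b * b - 4.0 * e_tot * s_tot   # >= (e_tot - s_tot)**2, never negative
    return 2.0 * e_tot * s_tot / (b + math.sqrt(disc))


def simulate_two_phase(e_tot, s0, kb, k2, dt, n_steps):
    """Alternate the binding phase (algebraic) and the reaction phase (Euler)."""
    p = 0.0
    for _ in range(n_steps):
        es = bound_complex(e_tot, s0 - p, kb)   # binding phase
        p += dt * k2 * es                       # reaction phase
    return bound_complex(e_tot, s0 - p, kb), p


if __name__ == "__main__":
    # Hypothetical values; only Kb and k2 appear, never k1 or k-1 separately.
    print(simulate_two_phase(1.0, 10.0, 1.0, 0.5, 0.01, 1000))
```

Only $K_b$ and $k_2$ enter the time stepping, so the step size is limited by $k_2$ alone.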

2.3 Simplified equations for protein synthesis

Protein synthesis involves various components such as RNA polymerase, suppressor and activator proteins, rRNA, mRNA, tRNA, and elongation factors. The synthesis proceeds through very complicated mechanisms that have not yet been completely elucidated. If such complex processes are well elucidated, the two-phase partition method can of course formulate them. However, a detailed description of protein synthesis is not necessary if the simulation aims at elucidating global signal transduction pathways (metabolic cycles, stress responses). In such cases, the chemical reaction equations expressing protein synthesis are simplified as follows [2]:

$$\mathrm{gene}(i) \xrightarrow{\text{transcription}} \mathrm{mRNA}(i) \xrightarrow{\text{degradation}} \varnothing, \tag{12}$$

$$\mathrm{mRNA}(i) \xrightarrow{\text{translation}} \text{protein } P(i) \xrightarrow{\text{degradation}} \varnothing. \tag{13}$$

For transcription, the concentration of mRNA(i) is given by:

$$\frac{d[\mathrm{mRNA}(i)]}{dt} = k_m(i) \cdot \eta(i) \cdot [\mathrm{gene}(i)] - k_{md}(i)[\mathrm{mRNA}(i)] - \sum_j k_x(j)[\mathrm{mRNA}(i){:}C(j)], \tag{14}$$

where $k_m(i)$ and $k_{md}(i)$ are the transcription and degradation rate constants of mRNA(i), respectively. The kinetic constant $k_x(j)$ is the rate constant for the degradation or export/import of mRNA(i) that is caused through the interaction with the component C(j). The transcription efficiency $\eta(i)$ in E. coli can be expressed by:

$$\eta(i) = \frac{[Op(i){:}Ac]}{Op(i)_{tot}} \cdot \left(1 - \frac{[Op(i){:}Sup]}{Op(i)_{tot}}\right) \cdot \frac{[Pro(i){:}\mathrm{RNAP}{:}\sigma]}{Pro(i)_{tot}}, \tag{15}$$

where [Op(i)] and [Pro(i)] are the concentrations of the operator and promoter for gene(i). [Ac] and [Sup] are the concentrations of the transcription regulation factors, activator and suppressor, respectively. [Pro(i):RNAP:σ] is the concentration of the promoter-bound RNA polymerase:σ complex. The suffix "tot" indicates the total concentration of the component. For translation, the concentration of the protein P(i), including modified (phosphorylated, adenylylated, etc.) forms, is written as follows:

$$\frac{d[P(i)]}{dt} = k_p(i) \cdot \varphi(i) \cdot [\mathrm{mRNA}(i)] - k_{dp}(i)[P(i)] - \sum_j k_y(j)[P(i){:}C(j)], \tag{16}$$

where $k_p(i)$ and $k_{dp}(i)$ are the translation and degradation rate constants of protein P(i). The kinetic constant $k_y(j)$ is the rate constant for the degradation or import/export of P(i) that is caused through the interaction with the component C(j). $\varphi(i)$ is the translation initiation rate.
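The transcription and translation equations can be sketched for a single gene as follows. This Python example is illustrative only: it treats the transcription efficiency and the translation initiation rate as constants and omits the interaction terms with the components C(j), which is a simplifying assumption made here. The function names and numerical values are hypothetical.

```python
def transcription_efficiency(op_ac, op_sup, op_tot, pro_bound, pro_tot):
    """Product of activator occupancy, freedom from suppressor, and promoter
    occupancy by the RNA polymerase:sigma complex."""
    return (op_ac / op_tot) * (1.0 - op_sup / op_tot) * (pro_bound / pro_tot)


def expression_steady_state(gene, eta, phi, km, kmd, kp, kdp):
    """Steady state of the simplified model with no C(j) interaction terms."""
    mrna = km * eta * gene / kmd
    protein = kp * phi * mrna / kdp
    return mrna, protein


def simulate_expression(gene, eta, phi, km, kmd, kp, kdp, dt, n_steps):
    """Explicit Euler integration starting from zero mRNA and zero protein."""
    mrna, protein = 0.0, 0.0
    for _ in range(n_steps):
        d_mrna = km * eta * gene - kmd * mrna
        d_protein = kp * phi * mrna - kdp * protein
        mrna += dt * d_mrna
        protein += dt * d_protein
    return mrna, protein


if __name__ == "__main__":
    eta = transcription_efficiency(0.8, 0.5, 1.0, 0.5, 1.0)
    print(eta)
    print(simulate_expression(1.0, eta, 1.0, 2.0, 0.5, 1.0, 0.1, 0.1, 5000))
```

In the full method, the efficiency would be recomputed from the binding phase at every step, because the operator and promoter complexes are themselves outputs of the algebraic equations.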

3 Simulator engine of the two-phase partition method

Figure 1 shows a schematic diagram of how the two-phase partition method expands a set of chemical reaction equations. All the components are distributed among their free and bound forms by nonlinear algebraic equations, and the changes in their concentrations are determined by ordinary differential equations. Chemical reaction equations are input as a text file in Block (1). The equations are divided into the binding phase and the reaction phase in Block (2). The left-hand side of Eq. (1) is assigned to the binding phase, and the right-hand side of Eq. (1) and the protein synthesis equations to the reaction phase. In Block (3), the binding equations and mass balance equations are built for each component. These simultaneous nonlinear algebraic equations are solved by the Newton-Raphson method. In Block (4), the reaction equations coupled with the transcription-translation equations are generated and solved by the Runge-Kutta method. In Block (5), the given chemical reaction equations are calculated by alternately calculating the binding phase and the reaction phase. The result is output in Block (6). C was used as the programming language.

Figure 1: A flow chart of the two-phase partition method converting chemical reaction equations into a mathematical description available as a computer program.
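The alternation of Blocks (3) to (5) can be sketched for the single reaction of Eq. (1). The Python code below is a minimal example under stated assumptions, not a reproduction of the authors' C engine: the Newton-Raphson solver works on the two unknown free concentrations with an analytic Jacobian, the Runge-Kutta scheme is the classical fourth-order one, the binding phase is re-solved at every Runge-Kutta stage, and the substrate pool is reduced by the product formed. All names and values are hypothetical.

```python
def solve_binding(e_tot, s_tot, kb, tol=1e-12, max_iter=50):
    """Newton-Raphson on the mass balances with unknowns (free E, free S):
    e + kb*e*s = e_tot and s + kb*e*s = s_tot."""
    e, s = e_tot, s_tot                      # start from the upper bounds
    for _ in range(max_iter):
        r1 = e + kb * e * s - e_tot
        r2 = s + kb * e * s - s_tot
        a, b = 1.0 + kb * s, kb * e          # Jacobian row 1
        c, d = kb * s, 1.0 + kb * e          # Jacobian row 2
        det = a * d - b * c                  # equals 1 + kb*(e + s)
        de = (d * r1 - b * r2) / det
        ds = (a * r2 - c * r1) / det
        e, s = e - de, s - ds
        if abs(de) + abs(ds) <= tol * (e_tot + s_tot):
            break
    return e, s, kb * e * s


def product_rate(p, e_tot, s0, kb, k2):
    """Reaction phase right-hand side; the binding phase is solved inside."""
    _, _, es = solve_binding(e_tot, s0 - p, kb)
    return k2 * es


def run_engine(e_tot, s0, kb, k2, dt, n_steps):
    """Advance the product concentration with classical RK4."""
    p = 0.0
    for _ in range(n_steps):
        r1 = product_rate(p, e_tot, s0, kb, k2)
        r2 = product_rate(p + 0.5 * dt * r1, e_tot, s0, kb, k2)
        r3 = product_rate(p + 0.5 * dt * r2, e_tot, s0, kb, k2)
        r4 = product_rate(p + dt * r3, e_tot, s0, kb, k2)
        p += dt * (r1 + 2.0 * r2 + 2.0 * r3 + r4) / 6.0
    return p


if __name__ == "__main__":
    print(solve_binding(1.0, 10.0, 1.0))
    print(run_engine(1.0, 10.0, 1.0, 0.5, 0.1, 100))
```

A network with many components would need a general linear solve in place of the hand-written 2 by 2 inverse, but the control flow stays the same.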

4 Characterization of the two-phase partition method

4.1 Accuracy in the two-phase partition method

The mathematical model of the heat shock response has been extensively investigated [8, 9, 19] and is well suited to the two-phase partition method, because interactions among heat shock proteins and DNAs play a major role. Figure 2 shows a schematic diagram of the heat shock response system in E. coli. Such a system is difficult to simulate with the Michaelis-Menten equations or the power law formalism. The two-phase partition method is theoretically correct when $k_2 \ll k_{-1}$ in Eq. (1), but this assumption does not always hold. We therefore characterized the conditions under which the two-phase partition method is practically applicable. To investigate the error of the two-phase partition method, the steady-state molecular concentrations given by the two-phase partition method were compared with those given by the conventional method, assuming that the conventional method provides the correct solution. In the heat shock response, the concentrations of all molecules can reach a steady state. Simultaneous algebraic equations were prepared to provide the steady-state concentrations by setting the time derivatives in all the differential equations to zero.

Figure 2: Schematic diagram of the heat shock response in E. coli. The details were described elsewhere [8, 9]. DnaK: a representative of chaperones; FtsH: a representative of $\sigma^{32}$-degrading protease; Pun: proteins unfolded by heat shock; Pfold: folded proteins; RNAP: RNA polymerase. DnaK-bound $\sigma^{32}$ is sequestered away from binding RNAP or degraded.

The two-phase partition method was applied to the following equation of the heat shock response:

$$\sigma^{32} + \mathrm{DnaK} \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} \sigma^{32}{:}\mathrm{DnaK} \xrightarrow{k_2\ (\text{degradation})} \mathrm{DnaK}. \tag{17}$$

To characterize the error of the two-phase partition method, the value q was defined as $q = k_{-1}/k_2$, which indicates the difference in the order of magnitude of the two parameters. Figure 3 shows the error of the two-phase partition method with respect to the value of q. The error was expressed as the ratio of the steady-state concentrations calculated by the two-phase partition method to those calculated by the conventional method. When $q \gg 1$ ($k_{-1} \gg k_2$), the two-phase partition method is theoretically correct, and it is expected to become incorrect as q decreases. The value of q was varied while the binding constant $k_1/k_{-1}$ was fixed. At q = 0.001, the steady-state concentrations ($\sigma^{32}$, DnaK, FtsH) calculated by the two-phase partition method were 10-fold higher than those calculated by the conventional method, indicating that the two-phase partition method generates a large error. By contrast, the error became small enough to be neglected at q > 2, where the two-phase partition method accurately simulates the molecular process of the heat shock response.
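A short derivation suggests why q controls the error. It considers only the single reaction of Eq. (1), or equivalently Eq. (17), in isolation and compares the complex concentration at equal free concentrations. It is offered as a plausibility argument and does not reproduce the network-level concentration errors plotted in Figure 3. Setting the time derivative of the complex to zero in the conventional mass action model gives

$$k_1[E][S] = (k_{-1} + k_2)[E{:}S] \quad\Longrightarrow\quad [E{:}S]_{\text{conv}} = \frac{k_1}{k_{-1} + k_2}[E][S] = \frac{q}{1+q}\,K_b\,[E][S],$$

whereas the binding phase of the two-phase partition method uses $[E{:}S]_{\text{two-phase}} = K_b[E][S]$. The ratio of the two is

$$\frac{[E{:}S]_{\text{two-phase}}}{[E{:}S]_{\text{conv}}} = 1 + \frac{1}{q},$$

which tends to 1 as $q \to \infty$ and grows without bound as $q \to 0$. The binding phase therefore overestimates the effective binding constant by the factor $1 + 1/q$, in line with the statement that the method is theoretically correct for $k_{-1} \gg k_2$.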

4.2 High speed calculation by the two-phase partition method

In the above section, the error of the two-phase partition method was investigated using steady-state concentrations obtained from the nonlinear algebraic equations. In this section, we demonstrate that the two-phase partition method can calculate a time course of the heat shock response at an extremely high speed by comparing its calculation time with that of the conventional method. The mathematical model was programmed in C and run on a Pentium III 450 MHz. A time course from 0 to 100 min was calculated by the two-phase partition method or by the conventional method, changing the number of time steps (= 100 min / Δt min) for the differential equations from $1 \times 10^{2}$ to $1 \times 10^{9}$ (step sizes from 1 to $1 \times 10^{-7}$ min). Figure 4 shows the calculation time and the steady-state concentration of $\sigma^{32}$ with respect to the number of steps. Heat shock with a small perturbation occurred at 50 min, and we plotted the calculation time and the concentration of $\sigma^{32}$ at 100 min.

Figure 3: Characterization of the condition under which the two-phase partition method accurately simulates the heat shock response. The value q is defined by $q = k_{-1}/k_2$ and is varied as $q = 0.001 \times 2^{x-1}$ (x = 1, 2, ..., 20). The plotted value was normalized by dividing each protein concentration (RNAP, $\sigma^{70}$, $\sigma^{32}$, DnaK, FtsH) calculated by the two-phase partition method by the concentration calculated by the conventional method. Those concentrations were obtained from the nonlinear algebraic equations in which the time derivatives of all the differential equations were set to zero.

In Figure 4a (conventional method, q = 5), the calculation time increased with the number of steps. The steady-state $\sigma^{32}$ concentration was calculated properly only above $5 \times 10^{6}$ steps, indicating that correct simulation requires more than $5 \times 10^{6}$ steps and 150 seconds of calculation time. Below $5 \times 10^{6}$ steps, the calculation generated negative values for the cell components because of the large differences in the values of the biochemical parameters. The steady-state level was identical to that obtained from the nonlinear algebraic equations in which the time derivatives of all the differential equations of the conventional method were set to zero. Although the data are not shown, it is important to note that increasing the value of q increased the calculation time almost proportionally.

In the two-phase partition method (Figure 4b), the calculation time also increased with the number of steps. The concentration of $\sigma^{32}$ already reached the steady-state value at $10^{2}$ steps, showing that $10^{2}$ steps, or a calculation time of 0.03 seconds, is enough for correct calculation. The steady-state level was identical to that obtained from the nonlinear algebraic equations in which the time derivatives of all the differential equations of the two-phase partition method were set to zero. The calculation time was much shorter than that of the conventional method. The two-phase partition method thus simulates the heat shock response at an extremely high speed compared with the conventional method.

Figure 5 shows the time course of the heat shock response calculated by the two-phase partition method, where a strong heat shock occurs at 100 min. The two-phase partition method simulated the main features of the heat shock response well [19], and the calculation time was several seconds. In contrast, the conventional method did not generate a sharp peak like the one shown in Figure 5, owing to the huge differences in the values of the biochemical parameters.
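The link between a large rate constant and the required number of steps can be illustrated on the linear test equation dy/dt = -k y. The Python sketch below uses explicit Euler for simplicity, which is an assumption made for this example. The Runge-Kutta scheme named in the text obeys a bound of the same form with a larger constant (about 2.8 instead of 2 for the classical fourth-order scheme). The function names and rate values are hypothetical.

```python
import math


def euler_decay(k, dt, n_steps, y0=1.0):
    """Explicit Euler on dy/dt = -k*y; each step multiplies y by (1 - k*dt)."""
    y = y0
    for _ in range(n_steps):
        y += dt * (-k * y)
    return y


def min_stable_steps(k_fast, t_end):
    """Smallest step count with k_fast*dt < 2, the explicit Euler stability limit."""
    return math.floor(k_fast * t_end / 2.0) + 1


if __name__ == "__main__":
    # A hypothetical fastest rate constant of 1e3 per minute over 100 minutes.
    print(min_stable_steps(1.0e3, 100.0))
    print(euler_decay(1.0e3, 0.0005, 100))   # k*dt = 0.5: decays
    print(euler_decay(1.0e3, 0.0025, 100))   # k*dt = 2.5: oscillates and grows
```

The minimum step count grows in proportion to the fastest rate constant, which is consistent with the observation that the calculation time of the conventional method increased almost proportionally with q. The two-phase partition method removes $k_1$ and $k_{-1}$ from the differential equations, so they no longer set this limit.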


Figure 4: Characterization of the calculation time and the concentration of $\sigma^{32}$ with respect to the number of steps. Heat shock with a small perturbation occurs at 50 min. The calculation time is defined as the time that it takes for a computer (Pentium III 450 MHz) to execute the time course of 1000 minutes. (a) Conventional method, (b) two-phase partition method.

5 Application to the ammonia assimilation system

As mentioned above, the two-phase partition method simulated the time course of the heat shock response, which is composed of interactions among proteins and DNAs, at an extremely high speed. In this section, we show how the two-phase partition method is applied to a biological system composed not only of interactions among proteins and DNAs but also of small compounds such as metabolites. The best way is to employ the two-phase partition method and the power law formalism in parallel: enzyme-enzyme reactions are described by the two-phase partition method, and enzyme reactions involving metabolites by the power law formalism. To demonstrate this idea, we simulated the ammonia assimilation system in E. coli, which controls the uptake rate of extracellular ammonia (the nitrogen source) to adjust the balance between the amounts of nitrogen and carbon within a cell [14]. A schematic diagram of the ammonia assimilation system is shown in Figure 6 [10]. This system consists of the metabolism of small molecules such as glutamine, α-ketoglutarate, and glutamate, and of signal transduction through the regulatory proteins UT/UR, PI, PII, NRI, and NRII. It is an excellent example for employing the two-phase partition method and the power law formalism in parallel. The reactions of small compounds were described using the power law formalism, and the signal transduction among the regulatory proteins and DNAs by the two-phase partition method.

Figure 5: A time course of the heat shock response. Calculation was carried out using the two-phase partition method. Heat shock occurred at 100 min.

To reproduce the main features of the ammonia assimilation system, a genetic algorithm was used to tune the values of the many biochemical parameters, which were crossed or mutated to obtain a high adaptability. The adaptability was defined as the ratio of glutamine to α-ketoglutarate, because the aim of this process is to maintain the glutamine concentration at a high level by increasing the uptake rate of ammonia from ammonia-depleted medium. The genetic algorithm gave a high adaptability after ten generations, indicating that it is a powerful tool for tuning multiple parameters (data not shown). Figure 7 shows the time course of the ratio of glutamine to α-ketoglutarate with respect to extracellular ammonia, where the parameter values were obtained by fine-tuning with the above genetic algorithm. The extracellular ammonia concentration was decreased 10-fold at 300 min. Despite the decrease in ammonia, the ratio of glutamine to α-ketoglutarate was kept at a high level, indicating that the ammonia assimilation system has the capability to control the balance between glutamine and α-ketoglutarate.
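The text does not specify the details of the genetic algorithm, so the following Python sketch shows only one common variant: truncation selection with elitism, one-point crossover, and Gaussian mutation. The genes are taken to be base-10 logarithms of the parameters. The fitness function is a hypothetical stand-in, whereas in the study the adaptability was the simulated ratio of glutamine to α-ketoglutarate. All names, population sizes, and rates are assumptions.

```python
import random


def toy_fitness(genes):
    """Hypothetical stand-in for the adaptability: best when every gene is 1.0."""
    return -sum((g - 1.0) ** 2 for g in genes)


def evolve(fitness, n_params, pop_size=20, generations=10,
           mutation_rate=0.2, seed=0):
    """Return the best individual and the best fitness of each generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-3.0, 3.0) for _ in range(n_params)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[:pop_size // 2]             # the fitter half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params)      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g + rng.gauss(0.0, 0.3) if rng.random() < mutation_rate else g
                     for g in child]              # Gaussian mutation
            children.append(child)
        pop = parents + children
    pop.sort(key=fitness, reverse=True)
    history.append(fitness(pop[0]))
    return pop[0], history


if __name__ == "__main__":
    best, history = evolve(toy_fitness, n_params=4)
    print(best)
    print(history)
```

Because the fitter half is carried over unchanged, the best fitness never decreases from one generation to the next. To tune a real model, `toy_fitness` would be replaced by a function that runs the simulation with the decoded parameters and returns the adaptability.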

6 Conclusion

The two-phase partition method is very useful for simulating complex interactions among proteins and DNAs. It remarkably enhances the calculation speed, but the accuracy of the simulation has to be checked carefully, because it depends on the value of q. In addition, we demonstrated how the two-phase partition method can be applied to a biological system composed not only of interactions among proteins and DNAs but also of small compounds such as metabolites. The best way would be to employ the two-phase partition method and the power law formalism in parallel.

Figure 6: Schematic diagram of the ammonia assimilation system in E. coli.

Figure 7: A time course of the ratio of glutamine to α-ketoglutarate. The extracellular ammonia concentration was reduced to one tenth at 300 min. Calculation was carried out for 600 min.

References

[1] Arkin, A., Ross, J., and McAdams, H.H., Stochastic kinetic analysis of developmental pathway bifurcation in phage λ-infected Escherichia coli cells, Genetics, 149:1633–1648, 1998.
[2] Bailey, J.E. and Ollis, D.F., Biochemical Engineering Fundamentals, McGraw-Hill Book Co., Singapore, 421–434, 1986.
[3] Barshop, B.A., Wrenn, R.F., and Frieden, C., Analysis of numerical methods for computer simulation of kinetic processes: development of KINSIM-a flexible, portable system, Anal. Biochem., 130:134–145, 1983.
[4] Barkai, N. and Leibler, S., Robustness in simple biochemical networks, Nature (London), 387:913–917, 1997.
[5] Cornish-Bowden, A. and Hofmeyr, J.H., MetaModel: a program for modeling and control analysis of metabolic pathways on the IBM PC and compatibles, Comput. Appl. Biosci., 7:89–93, 1991.
[6] Dang, Q. and Frieden, C., New PC versions of the kinetic-simulation and fitting programs, KINSIM and FITSIM, Trends Biochem. Sci., 22:317, 1997.
[7] Ehlde, M. and Zacchi, G., MIST: a user-friendly metabolic simulator, Comput. Appl. Biosci., 11:201–207, 1995.
[8] Kurata, H., Complexity generates robustness and high performance in E. coli heat shock response, Technical Report of IEICE, AI99-5:33–38, 1999.
[9] Kurata, H. and Taira, K., Complexity in Regulation Generates Robustness in Bacterial Molecular Networks, The First International Conference on Systems Biology (ICSB2000), in press.
[10] Kurata, H. and Taira, K., Evolution of Ammonia Assimilation System in E. coli, Technical Report of IEICE, AI2000-12:17–24, 2000.
[11] Mendes, P., GEPASI: a software package for modeling the dynamics, steady state and control of biochemical and other systems, Comput. Appl. Biosci., 9:563–571, 1993.
[12] Mendes, P., Biochemistry by numbers: simulation of biochemical pathways with Gepasi 3, Trends Biochem. Sci., 22:361–363, 1997.
[13] Okamoto, M., Morita, Y., Tominaga, D., Tanaka, K., Kinoshita, N., Ueno, J.-I., Miura, Y., Maki, Y., and Eguchi, Y., Design of Virtual-Labo-System for Metabolic Engineering: Development of Biochemical Engineering System Analyzing Tool-KIT (BEST-KIT), Computers Chem. Engng., 21:S745–S750, 1997.
[14] Reitzer, L.J., Ammonia assimilation and the biosynthesis of glutamine, glutamate, aspartate, asparagine, L-alanine, and D-alanine, In Escherichia coli and Salmonella, Cellular and Molecular Biology, (ed. F. C. Neidhardt), ASM Press, Washington, 391–407, 1996.
[15] Sauro, H.M., SCAMP: a general-purpose simulator and metabolic control analysis program, Comput. Appl. Biosci., 9:441–450, 1993.
[16] Shiraishi, F. and Savageau, M.A., The tricarboxylic acid cycle in Dictyostelium discoideum, J. Biol. Chem., 267:22912–22918, 1992.
[17] Tomita, M., Hashimoto, K., Takahashi, K., Shimizu, T., Matsuzaki, Y., Miyoshi, F., Saito, K., Tanida, S., Yugi, K., Venter, J.C., and Hutchison, C., Software environment for whole cell simulation, Bioinformatics, 15:72–84, 1999.
[18] Tyson, J.J., Novak, B., Odell, G. M., Chen, K., and Thron, C. D., Chemical kinetic theory: understanding cell-cycle regulation, TIBS, 21:89–95, 1996.
[19] Yura, T., Regulation and conservation of the heat-shock transcription factor $\sigma^{32}$, Genes to Cells, 1:277–284, 1996.
