IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010
879
Context-Sensitive Probabilistic Boolean Networks: Steady-State Properties, Reduction, and Steady-State Approximation Ranadip Pal, Member, IEEE
Abstract—Context-sensitive probabilistic Boolean networks (PBNs) have been recently introduced as a paradigm for modeling genetic regulatory networks and have served as the main model for the application of intervention methods, including optimal control strategies, to favorably effect system dynamics. Since it is believed that the steady-state behavior of a context-sensitive PBN is indicative of the phenotype, it is important to study the alternation in the steady-state probability distribution due to any variations in the formulations of the context-sensitive PBNs. Furthermore, the huge computational complexity of the context-sensitive PBN model necessitates generation of size-reduction techniques and approximate methods for calculation of the steady-state probability distribution of context-sensitive PBNs. The goal of this paper is threefold: i) to study the effects of the various definitions of context-sensitive PBNs on the steady-state probability distributions and the downstream control policy design; ii) to propose a reduction technique that maintains the steady-state probability distribution; and iii) to provide an approximation method for calculating the steady-state probability distribution of a context-sensitive PBN. Index Terms—Context-sensitive PBN, genetic regulatory network model, steady state properties, reduction mapping, steady state distribution approximation.
I. INTRODUCTION
G
IVEN a set of genes, the evolution of their expression levels constitutes a dynamical system over time. A large number of approaches have been proposed to model the behavior of gene regulatory networks, both deterministic and stochastic [1], [2]. Some models use biochemical information to capture fine details involving transcription and translation via differential equations or their discrete approximations [3]–[6]. Others use data to infer the parameters of differential or difference equation models [7]–[11]. Discrete models with synchronous timing have been used since the late 1960s in the form of the Boolean network (BN) model [12]–[14]. In this model, gene expression is quantized to two levels: ON and OFF, denoted by 1 and 0, respectively. The expression
Manuscript received February 05, 2009; accepted July 26, 2009. First published August 21, 2009; current version published January 13, 2010. The associate editor coordinating review of this manuscript and approving it for publication was Dr. Yufei Huang. The author is with Texas Tech University, Electrical and Computer Engineering, Lubbock, TX, 79409 USA (e-mail:
[email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TSP.2009.2030832
level (state) of a gene is functionally related via a logical rule to the expression states of other genes. Specifically, a BN consisting of is composed of a set binary variables, each denoting a gene expression, and a set of regulatory functions, such that for discrete time, . At any time point, the state of the network is given by an , called the gene activity expression vector profile (GAP). The BN model has yielded insights into the overall behavior of large genetic networks [15]–[18]. The dynamic behavior of BNs can be used to model many biologically meaningful phenomena—for instance, cellular state dynamics possessing switch-like behavior, stability, and hysteresis [19]. There has also been a number of attempts to model the cell cycle using Boolean networks [20]–[22]. Besides the conceptual framework afforded by such models, practical uses, such as the identification of suitable drug targets in cancer therapy, result by inferring the structure of the genetic models from experimental data, e.g., from the gene expression profiles [19]. In the Boolean model, the assumption of a single transition rule for each gene can be problematic with respect to inference. The data are typically noisy, the number of samples is small relative to the number of parameters to be estimated, and inputs to the system are typically not controlled. Thus, one may observe various possible transitions from a given state. A salient instance is that unobserved (latent) variables external to the model network result in the network structure taking on various realizations. Owing to these considerations, a stochastic rule-based regulatory model, the probabilistic Boolean network (PBN) [23], [24] has been proposed. A PBN is comof finite-state-space posed of a collection networks, each called a context and each composed of the same set of genes. At each time point, a random decision is made whether to switch (with probability ) to a new context or remain (with probability 1 ) in the current context. If a switch is made, the choice of context is governed by a probability . If , meaning certain switching, then density the PBN is said to be instantaneously random; if , with a small switching probability meant to capture the effect of latent variables, then the PBN is said to be context-sensitive [25], [26]. Limiting the switching between the constituent networks increases the stability of the biological system. In both cases, the probabilistic structure of the PBN can be modeled as a Markov chain. In the instantaneously random case, the states of the Markov chain are the GAPs; in the context-sensitive case, the states of the Markov chain consist of (context, GAP) pairs.
1053-587X/$26.00 © 2010 IEEE Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.
880
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010
If we assume that, at any time point and for any gene, there is a of random perturbation (owing to some positive probability event outside the network model), then the associated Markov chain possesses a steady-state distribution [25]. The rule-based structure of PBNs mirrors the functional transcriptional regulation in cells, and their stochastic structure allows them to reflect uncertainty, both in the data and in model selection. Definitions of the context-sensitive PBNs in the literature [25]–[27] differ with respect to whether at time step , network switching takes place before or after the state transition. In this paper, we will show that this minor variation in formulation of the problem can substantially change the steady-state probabilities of the states. The generated relationships between the steady-state probability distributions of the two formulations will assist in adapting the existing state reduction and intervention algorithms designed using one of the formulations to the other formulation. We will also illustrate the effectiveness of one formulation over another in the calculation of one-step intervention policies. Furthermore, the generated relationships between the steady-state probability distributions of the two formulations will ease understanding of context-reduction mappings discussed later in the paper. Context-reduction mappings refer to the generation of an approximate transition probability matrix, absent of context, for the Markov chain associated with the context-sensitive PBN. For instance, if we have binary genes and contexts, then a context-sensitive PBN denoted by will consist of a 2 2 size transition probability matrix , while the context-re2 size transiduced PBN denoted by will have a 2 . The steady-state probability of a tion probability matrix GAP state in the context-sensitive PBN can be calculated by summing over the contexts. Suppose denotes the 2 length steady-state probability vector of the context-sensitive PBN with state 2 1 corresponding to BN and GAP state . Then the steady-state probability of GAP state in the context-sensitive PBN is equal to . Since the steady-state probabilities are important characteristics of the network, our interest lies in maintaining the steady-state probabilities of the GAP states after the context reduction. For instance, if represents the 2 length steady-state probability vector of the context-reduced PBN , we would like to have . Earlier approaches to context-reduction mappings [26]–[28] did not satisfy this criterion. The second goal of this paper is to provide a context-reduction mapping that maintains the steady-state probability distribution of the GAP states. A set of other approaches on reducing the number of genes in a probabilistic Boolean network has been studied in the literature [29]–[31], but none of them has proved equivalence of the steady-state probabilities of the reduced network and the collapsed steady-state probabilities of the original context-sensitive PBN. The third and final goal of this paper is to propose an approximation algorithm for generation of the steady-state probabilities of a context-sensitive PBN. The huge computational complexity associated with the calculation of the steady-state probability 2 matrix necessitates the design of distribution of a 2 fast approximation methods. Previous approximation methods for calculation of steady-state probabilities of a PBN have con-
sidered generation of the steady-state probabilities of the major attractors of the constituent Boolean networks [25] or removal of Boolean networks with small selection probabilities [32]. In this paper, we will approach the approximation based on the steady-state probability distribution of the constituent Boolean networks. The proposed method provides good approximation for small values of network transition probability and has a time complexity of as compared to a time complexity for calculation of the true steady-state probabiliof ties of the context-sensitive PBN. We will also derive analytical bounds for the errors due to the approximation. This paper is organized as follows. Section II provides the different formulations of the context-sensitive PBNs and the relationships between them. We present the context-reduction mapping in Section III . Section IV contains an algorithm to approximate the steady-state probability distribution of the context-sensitive PBN. Conclusions are presented in Section V. II. FORMULATIONS OF CONTEXT-SENSITIVE PBNS A. Transition Probabilities of Context-Sensitive PBN We consider a context-sensitive PBN consisting of genes and composed of Boolean networks with selection probabilities . Furthermore, let and be the perturbation probability and the network switching probability respectively. Clearly, each Boolean network will have 2 states zero to 2 1 (equivalent binary representations 00 0 to 11 1) and the collection of BNs can be considered to have 2 states. The initial definition of context-sensitive PBNs in [25] allowed four mutually exclusive events to occur at time . • 1: The current network function is applied, the PBN transitions accordingly, and the network function remains the same for the next transition. • 2: The current network function is applied, the PBN transitions accordingly, and a new network function is selected for the next transition. • 3: There is a random perturbation and the network function remains the same for the next transition. • 4: There is a random perturbation and a new network function is selected for the next transition. be the state that is occuLet pied by the PBN at time . consists of the decimal equivalent of the binary context and the GAP information. Let be any two states and define ,1 , , and . and denotes the BNs corresponding to states and , respectively, whereas and signifies the decimal representation of the GAP of states and , respectively. We will assume that if at time , the network is at BN , then at time , we will remain at BN with probability or transition to BN with probability , , and . There is a minor difference in where this assumption [27], [33] and the assumption in [26] and [28], where the probability of remaining at network at time 1 was , i.e., returning to the same network funcassumed to be tion was not allowed for events 2 and 4. We are incorporating this change, as the new definition will ease the understanding of 1
dxe
= min(n 2
jn
x).
Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.
PAL: CONTEXT-SENSITIVE PROBABILISTIC BOOLEAN NETWORKS
881
context switching for the specific case of two contexts. To illustrate that, let us consider an example of a context-sensitive PBN consisting of two contexts and with network selection probabilities and and network transition probability . In the previous formulation, if we are at network at time , the network function changes to at time with probability irrespective of and . With the new assumption, the network can change to with probability or remain at with probability . Then based on the new assumption and the definition in [25], the one-step transition probability from to is given by
, then the 16 16 size transition probability matrix of the is given by context-sensitive PBN (6) Previous works [26], [34]–[36] on intervention in probabilistic Boolean networks had assumed that the status of one of the genes was indicative of the cost of the state and that the control action consisted of flipping one of the other genes. Let us assume those conditions and set up a finite horizon control problem similar to [26] and [34]. Given an initial state (7)
(1) where is the Hamming distance between and number of genes that differ between the two states
, i.e., the
if
transitions to in a single step in network otherwise if otherwise
(2) (3)
and if otherwise.
(4)
To visualize the transition probability matrix better, let us the 2 2 transition probability matrix corredenote by sponding to BN .2 In other words, , where denotes the decimal representation of the GAP state of a BN and , at time and is the Hamming distance between . Then the 2 2 transition probcorresponding to context-sensitive PBN is ability matrix given by
(5) B. Ordering of Network Switching and State Transitions If we revisit the definition of event 2 for a context-sensitive PBN, we notice that the current network function is applied first and then the network switch occurs. Based on this definition, if and the GAP is , then the GAP at time , the network is is , where is dependent on alone and at time not dependent on whether a network switch occurs or not. The statement is evident when we consider the matrix in (5), where each row is dependent on a single . We next show through an example that this definition is not suitable for a one-step control problem where the cost criteria are based on the GAP alone. consisting of two Let us consider a context-sensitive PBN BNs of three genes (eight states). If the 8 8 size transition and probabilities of the BNs with perturbation are given by 2Note that perturbation.
P
here refers to the transition probability matrix of a BN with
subject to the control-dependent one-step transition probability , where we have the following: represents the treatment/intervention window; • • for are functions mapping the state space into the control space, i.e., the controls considered are state feedbacks; for genes and contexts, the 2 1 and for binary controls, state space is 0,1, the control space is 0,1, 2 1; is the one-step cost of applying the control • at state ; is the terminal cost associated with the state . • The dynamic programming solution to (7) is given by [34], [37] (8)
(9) For instance, state 9 refers to binary 1001 with the most significant bit 1 representing the context, the next bit with value 0 representing the state of the first gene, and so on. We observe that if the control is flipping the second gene and the cost is based on the first gene, the control policy decision at time for states 0–7 (belonging to the first context) will not be affected by . To illustrate this, let us consider the minimization equafor tion . The minimization will not be dependent are not dependent on , as the first eight rows of the matrix and control by flipping the second gene switches the tranon sition probabilities from among the first eight rows.3 Furtherdependent on GAP and independent of context, more, for will not be depenthe minimization term at stage dent on . Thus, for a single intervention step, this formulation will not consider the effect of and the transition probabilities of the other networks. 3Control by flipping a gene changes the initial state of the network. For instance, consider the case of eight GAP states (three genes) and two context with MSB denoting the context. If the second gene (third binary digit) in the state 0110 is flipped, the initial state becomes 0100. Consequently, in terms of transition probabilities, the seventh row corresponding to 0110 will be replaced by the fifth row corresponding to 0100.
Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.
882
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010
Fig. 1. Transitions for formulation 8 and 9.
We will henceforth consider another formulation that will incorporate the effect of and the other network transitions in a single step. Furthermore, the new formulation and the associated relationships between the two formulations will increase our overall analytical understanding of the context switching process. The new formulation has been recently used in [27] to compare the intervention performance in case of model reduction. Let us denote the context-sensitive PBN corresponding to the new formulation by . The four mutually exclusive events that can occur at time for are as follows. 1: The current network function is applied, the PBN tran• sitions accordingly, and the network function remains the same for the next transition. 2: A new network function is selected and the PBN tran• sitions according to the new network function. 3: There is a random perturbation and the network func• tion remains the same for the next transition. 4: A new network function is selected and there is a • random perturbation. Similar to , the one-step transition probability from state to for is given by
(10) has been used in place which is same as (1) except that of for the second term. 2 transition probability matrix corresponding The 2 to can be represented as
(11) is dependent on all the Equation (11) shows that each row of constituent BNs and not dependent on a single BN as in the case of . Furthermore, if we consider the minimization term in (9), we will notice that this formulation takes into consideration the effect of at step of the intervention problem.
Fig. 1 provides a graphical representation of the two formulations and . The network and the GAP transitions of the context-sensitive PBN at consecutive time instants are shown in the first two rows, respectively. The transitions will be observed in different ways by the two formulations: will consider the GAP transitions first and then the network transitions, whereas will consider the network transitions first and then the GAP transitions. The observed transitions for the two formulations are shown in the third and fourth rows of the figure. For instance, will consider the network transition first and then the , and thus we will observe the transition GAP transition if formulation is used. Let the 2 length steady-state probability vectors for and be denoted by and , respectively. The relationship between the two steady-state probabilities is given by the following theorem. Theorem II.1: , where and . Proof: Let us consider long runs of the network correand . When there are no netsponding to formulations work transitions, both the formulations will behave exactly the same. When there is a network transition from network to , the part of the steady-state probability corresponding to in formulation will be reflected in the state 2 steady-state probability of for formulation . This is because during a network transition from network to network , the GAP transition occurring will be according to network in foris the probability of transition from network mulation , and to network , thus adding to . In other words, a proportion of GAP profiles from the last networks will turn up as GAP profiles of the new network. in Fig. 1, For instance, during the network transition will be considered in network for formulation the GAP , while it will be considered in network for formulation . Similarly, when a network transition occurs from to another part of the steady-state probability network , the 1 corresponding to state in formulation will not be reflected in the steady-state probability of for formulation . Thus
Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.
(12)
PAL: CONTEXT-SENSITIVE PROBABILISTIC BOOLEAN NETWORKS
883
TABLE I ERROR TABLE
(13)
(14) The summation term in (12) is due to the fact that can be any of the networks except . Corollary II.2: The next theorem states that the sum of the steady-state probabilities of the GAP states for a particular context is equal to the selection probability of the context for both the formulations. Theorem II.3: . Proof: Let us consider the original steady-state probability for the formulation given by (11). equations The corresponding equations for are
(15) where Thus
and
.
(16)
Therefore, . From Corollary II.2, we arrive at . C. Simulations Simulations were conducted to illustrate the difference in the steady-state probabilities of the two formulations and to verify
(14). We generated 100 random context-sensitive PBNs for different combinations of , , , and and noted the differences between the steady-state probability distributions of formulaand . The , , and norms were used tions as the metrics for measuring the difference in the steady-state probability distributions. The range of , , , and was , , , and , which . To produced a set of 1800 different combinations of 1 points were selected generate a set of random s, a set of uniformly randomly between zero and one. The numbers 0 and 1 were appended to the array and the array sorted. The difference of two consecutive entries in the sorted array produced the s. To be concise, we will present the results of only few of the combinations in Table I.4 The fifth and sixth columns contain the average 1-norm and -norm of the difference in the steady-state probability distributions of and . The average is over the 100 random context-sensitive PBNs that were gener. Similarly, the seventh ated for each combination of and eighth columns contain the average 1-norm and -norm of , where denotes the difference between the true and the calculated using (14). We notice that the 1-norm and -norm of are in the range of 10 , which is due to the Matlab numerical approximations and, in all practicality, is equal to zero. are, however, significant, The 1-norm and -norm of especially for larger . Rows 2–5 of Table I show that the difference in the steadystate probabilities of the two formulations increases with an increase in . This trend is also observable for other values of and . The difference in the steady-state probability distributions of the two formulations with respect to and for fixed and is plotted in Fig. 2. We notice that the difference increases with an increase in the value of . We can possibly explain that by rewriting (14) in the following form: . This suggest that the difference increases for large .5 Furthermore, when is high, the frequency of network transitions will increase, and as we have elaborated earlier that the difference in the two formulations exists only during the network transitions, consequently a higher will tend to produce a higher difference 4Detailed simulation results are available at http://cvial.ece.ttu.edu/ ranadippal/cspbn/. 5We also have to consider the variation of the term c (2 (i 1) + a ) (a) with change in q but, assuming that the change in that term is insignificant with changes in q , a direct proportional relationship between the difference of the steady-state probability distributions and q is noticeable.
0
Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.
0
884
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010
Fig. 2. Difference in the steady-state distributions of and q .
9 and 8 for different n
in the steady-state probability distributions. We also observe that the difference in the steady-state probability distributions of the two formulations decreases with increase in . To explain this behavior, we should note that represents perturbation probability; and the higher the value of , the GAP transitions will tend to follow random order as compared to following the transitions of the individual BNs of the context-sensitive PBN. Increase in randomness will make the steady-state probabilities more uniform and consequently decreasing . The variation with respect to and for fixed and is plotted in Fig. 3. III. STEADY-STATE PRESERVING CONTEXT REDUCTION In this section, we will consider the collapsed steady-state probabilities, which refers to steady-state probabilities for the and denote the collapsed steadyGAP states only. Let state probabilities for formulation and , respectively, i.e., and for to 2 . Then
from ((13))
Fig. 3. Difference in the steady-state distributions of and p.
9 and 8 for different n
(17) The above equation shows that the collapsed steady-state probabilities are not changed when we use either of the definitions or . From now onwards, we will denote by , the collapsed steady-state probability distribution. Thus, the relaand provided in (14) can be simplified tionship between as shown next (18) . where The collapsed steady-state probability distribution is calculated from the steady-state probabilities of a network of size 2 . We are interested in generating a 2 2 transi2 tion probability matrix that produces as the steady-state probability distribution vector. The averaging method used in [26] to collapse the transition probability matrix will not produce steady-state probabilities equal to . Let us revisit the problem and strengthen the simplified assumptions used in [26] to arrive at a more specific representation of the collapsed network. Here, collapsing refers to removing the context information. denote the GAP at Let denote the GAP at time and time 1. The context information has been removed, and this can take values from zero to 2 1. Similarly, let ranging from zero to 2 1 denote the context and GAP information at time corresponding to either formulation or . The tranto GAP state sition probability of going from GAP state is given by (19), shown at the bottom of the next page. In [26], was approximated by . The approximation will be very close for high values of , as shown next. From (18), we have (20)
Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.
PAL: CONTEXT-SENSITIVE PROBABILISTIC BOOLEAN NETWORKS
Substituting
885
in (20) results in . For large ,
Theorem III.1: With (22) denoting the transition probabilities of the reduced PBN, the steady-state probability distribution vector of the reduced PBN is given by
for formulation , and thus the . For formulaapproximation is very close for large and tion , we can write (18) in the following manner:
(21)
For
very close to one,
and thus . . From (18), —we have from For lower values of —for instance, Table I the -norm of to be in the range of 3.71 4, which when divided by will be of the range .037. As , the difference in and can be substantial for lower s. A better approximation of the transition probabilities of the (context-reduced network from ) can be reduced network in derived by using (19). The resulting state transition probabilities of the contextreduced network is given by (22), shown at the bottom of the page. Let the steady-state probability distribution of the contextbe denoted by the vector . The next theorem reduced PBN shows that if we use (22) for calculating the transition-probabilities of the context-reduced PBN; then the steady-state probability distribution vector of the context-reduced PBN will be the same as the collapsed steady-state probability distribution vector .
(23) for
. is the steady-state probability distriProof: bution of the reduced PBN if it satisfies the equation for each . is the steady-state probability distriSince 2 PBN , it satisfies bution for the 2 . Thus
(19)
(22)
Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.
886
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010
This shows that satisfies the steady-state probability equadue to the tions of the reduced network and is equal to uniqueness of the steady-state probability distribution for ergodic Markov chains.6 Corollary III.2: If the context-reduced transition probabilities for formulation is given by (24), shown at the bottom of and , the page, for denoted by then the steady-state probability distribution of is given by (25) . for As shown in (23) and (25), the new reduction is capable of matching the steady-state probability distribution of the context-reduced PBN to the collapsed steady-state probability distribution of the original context-sensitive PBN, which was not feasible in the earlier approximation [26]. In the earlier approximation, the context-reduced PBN was independent of and was in the current case. So the current formulaequivalent to tion will better capture the dependence on . IV. APPROXIMATION METHOD FOR CALCULATING THE STEADY-STATE PROBABILITIES OF THE CONTEXT-SENSITIVE PBN To explain the importance of the calculation of the steadystate distribution for both context and gene states, let us consider the dynamical modeling of cells in a tumor being treated by a targeted molecular therapeutic drug. Targeted cancer therapy refers to a new generation of cancer drugs designed to interfere with a specific molecular target that is believed to have a critical role in tumor growth or progression [38], [39]. However, resistance to the drug can evolve due to the activation of alternative pathways [40]. The resistance to a specific targeted drug can disappear with stopping of that drug therapy or use of another targeted drug. Modeling of such a scenario with a context-sensitive probabilistic Boolean network is desirable, as the drug-sensitive and drug-resistant scenarios can be considered as two separate contexts with different network connections in each scenario, and they can switch between themselves with small probabilities. The two scenarios can be further divided into more contexts referring to the different signaling pathway activations for each individual tumor. The GAP states here will consist of the candidate gene or protein expression values important for the tumor. We will be interested in knowing the steady-state probability of each GAP state in the resistant and sensitive contexts, with an immediate application being a high steady-state probability of a GAP state in one of the contexts and relatively lower
steady-state probability in the other context, can signify the suitability of the GAP state as a molecular marker for drug resistance/sensitivity. Previous approximation methods for calculation of steady-state probabilities of a PBN have considered generation of the steady-state probabilities of the major attractors of the constituent Boolean networks [25] or removal of Boolean networks with small selection probabilities [32]. In this paper, we will approach the approximation based on the steady-state probability distribution of the constituent Boolean networks and the transitions between them. Instead of calculating the steady-state probability distribution of the context-sensitive 2 transition probability matrix, PBN directly with a 2 we will calculate the steady-state probability distributions of 2 transition probability the constituent networks with 2 matrices and utilize the frequency of transitions between the constituent networks to arrive at the approximation. The proposed method provides good approximation for small values of network transition probability . can be easily Here we will consider only , knowing that using (18). The transition probabilities for calculated from 2 matrix . The BN with perturbation are given by the 2 individual terms of are given by for , where is the perturbation probability, is the Hamming distance between and , and the functions and are given by (2) and (4), respectively. Let the steady-state probability distribution for network be denoted by a 2 -dimensional vector . To calculate the , we will consider the steady-state probability four mutually exclusive events 1 to 4 described earlier. The probability corresponding to events 1 and 3 can be approxi. The approximation is due to the mated by fact that the probability of being in network is ; the proba; bility of staying in network at the next time step is and the probability of GAP state in network can be approxi. mated by To approximate the probability corresponding to events 2 and 4, let us consider the event that a network change from to has occurred. Then according to formulation , the probability of occurrence of a transition from state 2 to the state 2 where will be given by the following equation:
6The nonzero perturbation probability p ensures the ergodicity of a contextsensitive PBN.
(26)
(24)
Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.
PAL: CONTEXT-SENSITIVE PROBABILISTIC BOOLEAN NETWORKS
887
Here, we have used the approximation that the probability of in the steady state is . state 2 As the network in the above equation can be any network from network 1 to network except network , and the state for each can range from zero to 2 , the steady-state probcan be approximated by the following ability for 2 equation:
(27) The utility of the above approximation lies in the fact that we can compute the steady-state probabilities of a context-sensitive PBN with a reduced computational complexity. The steady2 matrix can be computed state probabilities of a 2 using by a matrix inversion with a complexity of Gaussian elimination.7 In the new approach, we can calculate 2 networks with a the steady-state probabilities for 2 2 . The rest of the summation calculations complexity of 2 . For large , the reduction in comare of lower order plexity from 2 to 2 will be significant. We next show analytically and through simulations that the approximation is quite close to the original solution for smaller values of . A. Bound on the Approximation The steady-state probability distribution lution to the following linear equation:
of
As
for
where . As the multiplication of a state probability vector with a transition probability matrix produces another state probability s are state probability vectors for vector, and . Since , for . Furthermore, as individual entries of and s are positive and 1, . Thus, . Corollary IV.2: . Let us now consider the approximation provided by (27)
is the so-
(28) is the identity matrix of size 2 2 and the where solution vector is equal to . The residual for an approximate of (28) is given by , where the solution norm of provides a measure of closeness of the approximate solution as compared to the true solution [42]. Let us first study the closeness of the approximate solution , where is the 2 dimensional steady-state probability distribution vector for . The following theorem provides a bound on the infinity norm of the residual for (28). for the approximate solution , where is Theorem IV.1: the residual corresponding to the approximate solution for (28). Proof:
Thus the steady-state probability approximation provided by (27) is equal to . Consequently, the residual for the approximation is
(29) 7An improved complexity of and Winograd method [41].
O(N
) can be achieved by the Coppersmith
where
Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.
(30)
888
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010
TABLE II APPROXIMATION ERRORS FOR DIFFERENT SETS OF n; p; q; k , WHERE E
= 0
As individual entries of are positive and the sum of the entries of add up to one, is a valid state probability distribution vector. Since multiplication of a state probability distribution vector with a probability transition matrix results in is a state probaa state probability distribution vector, bility distribution vector. Since the entries of are nonnegative and rows sum to one, is a state probability distribution vector. Thus, and for . Based on these inequalities, we can get a bound on as follows:
Fig. 4. Average kE k for k
= 15, p = 0:05, and different values of n and q .
(31) Similarly
(32) B. Simulation Results In this section, we present the results of simulations conducted to measure the accuracy of the approximation algorithm denote the difference between the true given by (27). Let steady-state probability distribution and the approximate distribution. The error measures used are the , , and norms of E and the and norms of (residual for
the approximate solution). We considered different values of , , , and and constructed 50 context-sensitive PBNs for each set of , , , and values. The average error of the 50 context-sensitive PBNs for specific sets of , , and is shown in Table II. The detailed table containing a large number of variations of , , , and is provided in the companion Web site. The simulations were conducted using Matlab on a Dell Optiplex GX745 with a 2.4 GHz Intel Core2 CPU and 2 GB of RAM. denotes the time in seconds taken to compute In the table, the true steady-state probability distribution of the context-sendenotes the time in seconds taken by the apsitive PBN and proximation algorithm to compute the steady-state probability distribution of the context-sensitive PBN. We observe that the approximation is close to the true steady-state probability dis, the 1-norm of tribution for low values of . Whenever is less than 0.045. The 1-norm of for different values of and and fixed and is plotted in Fig. 4. The figure shows that the 1-norm of the error for the approximation algorithm increase when increases. An analogous relationship
Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.
PAL: CONTEXT-SENSITIVE PROBABILISTIC BOOLEAN NETWORKS
holds for the 2-norm and the -norm. The simulations show is always , as is exthat the 1-norm of the residual pected from (32). When we study the relationship of the error with changes in , the error decreases with increase in the value of . The more random the individual networks become, the approximation algorithm has better performance. For instance, as , , , and , the shown in Table II, for 1-norm of the error is 4.5 10 , which is very low. In terms of time complexity, we observe that for large , the time taken by the approximation algorithm is much less than the time taken to calculate the actual steady-state probability distribution. The , , , and entry corresponding to in Table II shows the advantage of the approximation algorithm in terms of time complexity. The true steady-state probability distribution is computed in 22.33 s, but the approximation can be generated in only 0.678 s. V. CONCLUSION In this paper, we considered three issues related to context-sensitive PBNs. First, we derived the relationship between the steady-state probability distributions of the two context-sensitive formulations with different ordering of the same set of events. This will facilitate better understanding of the context-sensitive probabilistic Boolean network formulation. We next derived a mapping for removal of context information from a context-sensitive PBN that maintains the collapsed steady-state probability distribution. For our mapping, we analytically proved the equality of the steady-state probability distribution of the reduced PBN and the collapsed steady-state probability distribution of the original PBN. As the steady state is a crucial characteristic of the regulatory network that denotes the phenotype, this reduction algorithm will be valuable for preserving the steady-state probability distribution after reduction. Lastly, we proposed an approximation method for the calculation of the steady-state probabilities for a context-sensitive PBN, which is significantly faster than the calculation of the true steady-state probability distribution. Future research issues in this area include combining the context-reduction with the downstream control design with the objective of minimum alteration on the control outcome. REFERENCES [1] H. De Jong, “Modeling and simulation of genetic regulatory systems: A literature review,” J. Comput. Biol., vol. 9, pp. 67–103, 2001. [2] I. Shmulevich and E. R. Dougherty, Genomic Signal Processing. Princeton, NJ: Princeton Univ. Press, 2007. [3] J. M. Bower and H. Bolouri, Computational Modeling of Genetic and Biochemical Networks, 1st ed. Boston, MA: MIT Press, 2001. [4] R. Sharan and T. Ideker, “Modeling cellular machinery through biological network comparison,” Nat. Biotechnol., vol. 24, pp. 427–433, 2006. [5] E. D. Conrad and J. J. Tyson, “Modeling molecular interaction networks with nonlinear ordinary differential equations,” in System Modeling in Cell Biology From Concepts to Nuts and Bolts. Cambridge, MA: MIT Press, 2006, pp. 97–123. [6] D. S. Douglas and S. Jones, Differential Equations and Mathematical Biology, 1st ed. Boca Raton, FL: Chapman & Hall/CRC, 2003. [7] L. Qian, H. Wang, and E. Dougherty, “Inference of noisy nonlinear differential equation models for gene regulatory networks using genetic programming and Kalman filtering,” IEEE Trans. Signal Process., vol. 56, no. 7, pp. 3327–3339, 2008.
889
[8] M. de Hoon, S. Imoto, and S. Miyano, Inferring Gene Regulatory Networks From Time-Ordered Gene Expression Data Using Differential Equations. Berlin, Germany: Springer, 2002. [9] J. Goutsias and S. Kim, “Stochastic transcriptional regulatory systems with time delays: A mean-field approximation,” J. Comput. Biol., vol. 13, pp. 1049–1076, 2006. [10] J. Goutsias and S. Kim, “A nonlinear discrete dynamical model for transcriptional regulation: Construction and properties,” Biophys. J., vol. 86, pp. 1922–1945, 2004. [11] A. Margolin, I. Nemenman, K. Basso, U. Klein, C. Wiggins, G. Stolovitzky, R. D. Favera, and A. Califano, “Aracne: An algorithm for reconstruction of genetic networks in a mammalian cellular context,” BMC Bioinf., vol. 7, 2006. [12] L. Glass and S. A. Kauffman, “The logical analysis of continuous, nonlinear biochemical control networks,” J. Theor. Biol., vol. 39, pp. 103–129, 1973. [13] S. A. Kauffman, “Metabolic stability and epigenesis in randomly constructed genetic nets,” Theor. Biol., vol. 22, pp. 437–467, 1969. [14] S. A. Kauffman, The Origins of Order: Self-Organization and Selection in Evolution. New York: Oxford Univ. Press, 1993. [15] R. Somogyi and C. Sniegoski, “Modeling the complexity of gene networks: Understanding multigenic and pleiotropic regulation,” Complexity, vol. 1, pp. 45–63, 1996. [16] Z. Szallasi and S. Liang, “Modeling the normal and neoplastic cell cycle with realistic boolean genetic networks: Their application for understanding carcinogenesis and assessing therapeutic strategies,” Proc. Pac. Symp. Biocomput., vol. 3, pp. 66–76, 1998. [17] R. Thomas, D. Thieffry, and M. Kaufman, “Dynamical behavior of genetic regulatory networks i. biological role of feedback loops and practical use of the concept of the loop-characteristic state,” Bull. Math. Biol., vol. 57, pp. 247–276, 1995. [18] A. Wuensche, “Genomic regulation modeled as a network with basins of attraction,” Proc. Pac. Symp. Biocomput., vol. 3, pp. 89–102, 1998. [19] S. Huang, “Gene expression profiling, genetic networks, and cellular states: An integrating concept for tumorigenesis and drug discovery,” Molec. Med., vol. 77, pp. 469–480, 1999. [20] A. Faure, A. Naldi, C. Chaouiya, and D. Theiffry, “Dynamical analysis of a generic boolean model for the control of the mammalian cell cycle,” Bioinformatics, vol. 22, pp. 124–131, 2006. [21] J. C. Sible and J. J. Tyson, “Mathematical modeling as a tool for investigating cell cycle control networks,” Methods, vol. 41, pp. 238–247, 2007. [22] J. J. Tyson, A. Csikasz-Nagy, and B. Novak, “The dynamics of cell cycle regulation,” Bioessays, vol. 24, pp. 1095–1109, 2002. [23] I. Shmulevich, E. R. Dougherty, S. Kim, and W. Zhang, “Probabilistic boolean networks: A rule-based uncertainty model for gene regulatory networks,” Bioinformatics, vol. 18, pp. 261–274, 2002. [24] I. Shmulevich, E. R. Dougherty, and W. Zhang, “From boolean to probabilistic boolean networks as models of genetic regulatory networks,” Proc. IEEE, vol. 90, pp. 1778–1792, 2002. [25] M. Brun, E. R. Dougherty, and I. Shmulevich, “Steady-state probabilities for attractors in probabilistic Boolean networks,” Signal Process., vol. 85, pp. 1993–2013, 2005. [26] R. Pal, A. Datta, M. L. Bittner, and E. R. Dougherty, “Intervention in context-sensitive probabilistic boolean networks,” Bioinformatics, vol. 21, pp. 1211–1218, 2005. [27] B. Faryabi, G. Vahedi, J.-F. Chamberland, A. Datta, and E. R. Dougherty, “Intervention in context-sensitive probabilistic boolean networks revisited,” EURASIP J. Bioinf. Syst. Biol., vol. 2009, p. ID360864, 2009. [28] R. Pal, A. Datta, and E. R. Dougherty, “Robust intervention in probabilistic Boolean networks,” IEEE Trans. Signal Process., vol. 56, no. 3, pp. 1280–1294, Mar. 2008. [29] I. Ivanov and E. R. Dougherty, “Reduction mappings between probabilistic boolean networks,” EURASIP J. Appl. Signal Process., vol. 4, pp. 125–131, 2004. [30] I. Ivanov, R. Pal, and E. R. Dougherty, “Dynamics-preserving sizereduction mappings for probabilistic Boolean networks,” IEEE Trans. Signal Process., vol. 55, no. 5, pp. 2310–2322, May 2007. [31] E. R. Dougherty and I. Shmulevich, “Mappings between probabilistic boolean networks,” Signal Process., vol. 83, pp. 799–809, 2003.
Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.
890
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 2, FEBRUARY 2010
[32] W. K. Ching, S. Zhang, M. K. Ng, and T. Akutsu, “An approximation method for solving the steady-state probability distribution of probabilistic boolean networks,” Bioinformatics, vol. 23, pp. 1511–1518, 2007. [33] R. Layek, A. Datta, R. Pal, and E. R. Dougherty, “Adaptive intervention in probabilistic boolean networks,” in Proc. Amer. Contr. Conf., accepted for publication. [34] A. Datta, A. Choudhary, M. L. Bittner, and E. R. Dougherty, “External control in Markovian genetic regulatory networks,” Mach. Learn., vol. 52, pp. 169–191, 2003. [35] A. Datta, A. Choudhary, M. L. Bittner, and E. R. Dougherty, “External control in Markovian genetic regulatory networks: The imperfect information case,” Bioinformatics, vol. 20, pp. 924–930, 2004. [36] R. Pal, A. Datta, and E. R. Dougherty, “Optimal infinite horizon control for probabilistic Boolean networks,” IEEE Trans. Signal Process., vol. 54, no. 6, pp. 2375–2387, 2006. [37] D. P. Bertsekas, Dynamic Programming and Optimal Control, 2nd ed. Nashua, NH: Athena Scientific, 2001. [38] C. Sawyers, “Targeted cancer therapy,” Nature, vol. 432, pp. 294–297, 2004. [39] M. R. Green, “Targeting targeted therapy,” New Eng. J. Med., vol. 350, no. 21, pp. 2191–2193, 2004. [40] N. V. Zhukov and S. A. Tjulandin, “Targeted therapy in the treatment of solid tumors: Practice contradicts theory,” Biochemistry (Mosc)., vol. 73, p. 605-188, 2008.
[41] D. Coppersmith and S. Winograd, “Matrix multiplication via arithmetic progressions,” J. Symbol. Comput., vol. 9, pp. 251–280, 1990. [42] T. Kailath, Linear Systems. Englewood Cliffs, NJ: Prentice-Hall, 1979.
Ranadip Pal (M’07) received the B.Tech. degree in electronics and electrical communication engineering from the Indian Institute of Technology, Kharagpur, in 2002 and the M.S. and Ph.D. degrees in electrical engineering from Texas A&M University, College Station, in 2004 and 2007, respectively. In August 2007, he joined the Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, as an Assistant Professor. His research areas are computational biology, genomic signal processing, and control of genetic regulatory networks. Dr. Pal is the recipient of the 2010 NSF CAREER Award. While with Texas A&M, Dr. Pal received the Ebensberger/Fouraker Fellowship, a Distinguished Graduate Student Masters Research Award, and a National Instruments Fellowship. He was an Indian National Math Olympiad Awardee and was ranked first in the Regional Math Olympiad, West Bengal, India.
Authorized licensed use limited to: Texas Tech University. Downloaded on February 3, 2010 at 10:52 from IEEE Xplore. Restrictions apply.