Jun 28, 2016 - networks, the distribution of transmission and recovery times are far from ... messages in Twitter, or news in social media outlets, follow ...
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
1
Stability of Spreading Processes with General Transmission and Recovery Times
arXiv:1606.08518v1 [cs.SI] 28 Jun 2016
Masaki Ogura, Member, IEEE, Victor M. Preciado, Member, IEEE,
Abstract Although viral spreading processes taking place in networks are commonly analyzed using Markovian models in which both the transmission times and the recovery times follow exponential distributions, empirical studies show that, in most real scenarios, the distribution of these times are far from exponential. To overcome this limitation, we first introduce a generalized spreading model that allows for transmission and recovery times to follow arbitrary distributions within an arbitrary accuracy. In this context, we derive conditions for the generalized spreading process to converge towards the disease-free equilibrium (in other words, to eradicate the viral spread) without relying on mean-field approximations. Based on our results, we illustrate how the particular shape of the transmission/recovery distribution heavily influences the boundary of the stability region of the spread, as well as the decay rate inside this region. Therefore, modeling non-exponential transmission/recovery times observed in realistic spreading processes via exponential distributions can induce significant errors in the stability analysis of the dynamics.
I. I NTRODUCTION Understanding the dynamics of spreading processes in complex networks is a challenging problem with a wide range of practical applications in epidemiology and public health [3], information propagation in social networks [12], or cybersecurity [18]. During the last decade, significant progress has been made towards understanding the relationship between the topology of a network and the dynamics of spreading processes taking place over the network (see [13] for a recent survey). A common approach to investigate this relationship is by modeling spreading processes using networked Markov processes, such as the networked susceptible-infected-susceptible (SIS) model [20], [23]. Based on these Markovian models, it is then possible to find an explicit relationship between epidemic thresholds and network eigenvalues in static topologies [1], [20], [22], [23], as well as in multilayer [9], time-varying [14], and adaptive [15] networks. A consequence of using Markovian models in the analysis of spreading processes is that both transmission times (i.e., the time it takes for an infection to be transmitted from an infected node to one of its neighbors) and recovery times (i.e., the time it takes for an infected node to recover) follow exponential distributions. However, empirical studies show that, in most real networks, the distribution of transmission and recovery times are far from exponential. For example, the transmission time of messages in Twitter, or news in social media outlets, follow (approximately) log-normal distributions [12]. In the context of The authors are with the Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA 19014, USA. Email:
{ogura,preciado}@seas.upenn.edu This work was supported in part by the NSF under grants CNS-1302222 and IIS-1447470.
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
2
human contact networks, the transmission of immunodeficiency viruses [4], or the time is takes to recover from the influenza virus [19] are clearly non-exponential. Since realistic transmission and recovery times usually follow non-exponential distributions, it is of practical importance to understand the role of these distributions on the dynamics of the spread. In this direction, the authors in [5], [21] illustrated via numerical simulations that non-exponential transmission times can have a substantial effect on the speed of spreading. An approximate analysis of spreading processes with general transmission and recovery times using mean-field approximations was proposed in [6]. In [11], an analytically solvable (although rather simplistic) spreading model with non-exponential transmission times was proposed. Using moment-closure techniques, the authors in [17] analyzed spreading processes with non-exponential transmission and recovery times in special network topologies (in particular, triangles and chains of length 2). In this paper, we propose a rigorous but tractable approach to analyze spreading processes over arbitrary networks with general transmission and recovery times. In this direction, we first introduce a generalized networked SIS (GeNeSIS) model which allows for transmission and recovery times following arbitrary phase-type distributions (see, e.g., [2]). Since phase-type distributions form a dense family in the space of positive-valued distributions [8], the GeNeSIS model allows to theoretically analyze arbitrary transmission and recovery times within an arbitrary accuracy [2]. We are particularly interested in finding conditions for the dynamics of the spread to converge towards the disease-free equilibrium exponentially fast; in other words, to eradicate the viral spreading process. In contrast with existing results [6], [17], we provide an exact analysis of the dynamics over arbitrary networks with general infection and recovery distributions (without relying on mean-field approximations). The key tool used in our derivations is a vectorial representation of phase-type distributions, which we use to derive conditions for exponential stability of the disease-free equilibrium in the (exact) stochastic dynamics of the GeNeSIS model. Based on our theoretical results, we show that modeling non-exponential transmission/recovery times using exponential distributions induces significant errors in the dynamics of spreading processes, as previously observed via numerical simulations. This paper is organized as follows. In Section II, we introduce elements of graph theory and stochastic differential equations with Poisson jumps. In Section III, we describe a generalized model of spreading processes over networks with arbitrary transmission and recovery times. In Section IV, we provide a vectorial representation of the GeNeSIS model, which we use in Section V to derive stability conditions of its stochastic dynamics. We validate the effectiveness of our results via numerical simulations in Section VI, where we also illustrate the significant effect of non-exponential transmission/recovery times in the dynamics of the spread. II. M ATHEMATICAL P RELIMINARIES Let R and N denote the set of real numbers and positive integers, respectively. For a positive integer n, define [n] = {1, . . . , n}. For a real function f , let f (t − ) denote the limit of f from the left at time t. We let In and On denote the n × n identity and zero matrices. By 1 p and 0 p , we denote the p-dimensional vectors whose entries are all ones and zeros, respectively. A real matrix A (or a vector as its special case) is said to be nonnegative, denoted by A ≥ 0, if A is nonnegative entry-wise. The notation A ≤ 0 is understood in the obvious manner. For a square matrix A, the maximum real part of its eigenvalues, called the spectral abscissae of A, is denoted by η(A). We say that A is Hurwitz stable if η(A) < 0. We also say that A is Metzler
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
3
if the off-diagonal entries of A are all non-negative. It is easy to see that, if A is Metzler, then eAt ≥ 0 for every t ≥ 0. The Kronecker product [10] of two matrices A and B is denoted by A ⊗ B, and the Kronecker sum of two square matrices A ∈ R p×p and B ∈ Rq×q is defined by A ⊕ B = A ⊗ Iq + I p ⊗ B. Given a collection of n matrices A1 , . . . , An having the same number of columns, the matrix obtained by stacking the matrices in vertical (A1 on top) is denoted by col(A1 , . . . , An ). An undirected graph is a pair G = (V , E ), where V = {1, . . . , n} is the set of nodes, and E ⊂ V × V is the set of edges, consisting of distinct and unordered pairs {i, j} for i, j ∈ V . We say that a node i is a neighbor of j (or that i and j are adjacent) if {i, j} ∈ E . The set of neighbors of node i is denoted by Ni . The adjacency matrix A ∈ Rn×n of a graph G is defined as the {0, 1}-matrix whose (i, j)-th entry is 1 if and only if nodes i and j are adjacent, 0 otherwise. Let (Ω, M, P) be a probability space. The expectation of a random variable is denoted by E[·]. A Poisson counter [7, Chapter 4] of rate λ > 0 is denoted by Nλ . In this paper, we extensively use a specific class of stochastic differential equations with Poisson jumps, described below. For each i ∈ [m], let fi : R × R → R be a continuous function, λi be a positive constant, and κi be a continuous-time real stochastic process defined over the probability space Ω such that {κi (t)}t≥0 are independent and identically distributed. Then, we say that a real and right-continuous function x is a solution of the stochastic differential equation m
dx = ∑ fi (x(t), κi (t)) dNλi ,
(1)
i=1
if x is constant on any interval where none of the counters Nλ1 , . . . , Nλm jumps, and x(t) = x(t − ) + fi (x(t − ), κi (t)) when Nλi jumps at time t. This definition can be naturally extended to the vector case. Below, we present two lemmas for this class of stochastic differential equations. The first lemma states a version of Itˆo’s formula: Lemma II.1. Assume that x is a solution of (1). Let g be a real continuous function. Then, y(t) = g(x(t)) is a solution of the stochastic differential equation m
dy = ∑ [g (x(t) + fi (x(t), κi (t))) − g(x(t))] dNλi , i=1
i.e., y satisfies y(t) = y(t − ) + g x(t − ) + fi (x(t − ), κi (t)) − g(x(t − )) if the Poisson counter Nλi jumps at time t and is constant over any interval in which none of the counters Nλ1 , . . . , Nλm jumps. Proof. Assume that the counter Nλi jumps at time t. Since x is the solution of the stochastic differential equation (1), it follows that y(t) − y(t − ) = g (x(t − ) + fi (x(t − ), κi (t))) − g(x(t − )), as desired. We also state the following lemma concerning the expectation of the solution to the stochastic differential equation (1): Lemma II.2. Assume that x is a solution of (1). If the functions f1 , . . . , fm are affine with respect to the second variable, then m d E[x(t)] = ∑ E fi (x(t), E[κi (t)]) λi . dt i=1
(2)
Proof. By the assumption, for each i ∈ [m] we can take real functions fi,1 and fi,2 such that fi (a, b) = fi,1 (a) + b fi,2 (a) for every a, b ∈ R. Let t ≥ 0 and h > 0 be arbitrary. We have the following three possibilities: (i) no counter jumps on the time
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
4
interval [t,t + h]; (ii) exactly one counter jumps in the interval; or (iii) more than one counter jumps in the interval. The first case happens with probability 1 − (λ1 + · · · + λm )h + o(h). For the second case, for each i ∈ [m], one and the only one counter Nλi jumps on the time interval [t,t +h] with probability λi h+o(h). In this case we have x(t +h) = x(t)+ fi (x(t), κi (τ)) = x(t) + fi,1 (x(t)) + κi (τ) fi,2 (x(t)) for some τ ∈ [t,t + h] and, therefore, E[x(t + h)] = E[x(t)] + E[ fi,1 (x(t))] + E[κi (τ)]E[ fi,2 (x(t))] = E[x(t)] + E fi (x(t), E[κi (t)]) because κi is an independent and identically distributed stochastic process. Finally, the third case occurs with probability o(h). Summarizing, we have shown that E[x(t + h)] − E[x] o(h) m = + ∑ E fi (x(t), E[κi (t)]) λi , h h i=1 which proves (2) in the limit of h → 0. III. SIS M ODEL WITH G ENERAL T RANSMISSION AND R ECOVERY T IMES The aim of this section is to introduce the generalized networked susceptible-infected-susceptible (GeNeSIS) model. Consider a network of n nodes1 connected via an undirected graph G = (V , E ). In this model, the state of a node i ∈ V is described by a {0, 1}-valued continuous-time stochastic process, denoted by zi = {zi (t)}t∈R . We say that node i is susceptible (respectively, infected) at time t if zi (t) = 0 (respectively, zi (t) = 1). We assume that the function zi is continuous from the right for all i ∈ [n]. Under this assumption, we say that node i becomes infected (respectively, becomes susceptible) at time t if zi (t − ) = 0 and zi (t) = 1 (respectively, zi (t − ) = 1 and zi (t) = 0). It is assumed that all nodes are susceptible before time t = 0, i.e., zi (t) = 0 for t < 0. Using this notation, we introduce the GeNeSIS model as follows. Definition III.1. We say that the family z = {zi }i∈[n] of stochastic processes is a generalized networked susceptible-infectedsusceptible model (GeNeSIS model, for short) if there exist a subset Λ ⊂ [n], as well as random variables 0 = τ0ji (t) < τ1ji (t) < . . ., and ρ i (t) > 0 satisfying the following conditions for all i ∈ [n], j ∈ Ni , and t ≥ 0: a) Node i becomes infected at time t = 0 if and only if i ∈ Λ, i.e., Λ is the initially infected subset. b) Assume that node i becomes infected at time t. Then, node i remains infected during the time interval [t,t + ρ i (t)) and becomes susceptible at time t + ρ i (t), i.e., the random variable ρ i (t) is the recovery time of node i; c) During the interval [t,t + ρ i (t)), the infected node i attempts to infect node j ∈ Ni at times {t + τkji (t)}k∈N . If node j is susceptible at time t + τkji (t) for any k ∈ N, then it becomes infected. ji (t)} j∈Ni ,t≥0, k∈N the transmission times of node i, since the difference Remark III.2. We call the random increments {τkji (t)−τk−1 ji τkji (t) − τk−1 (t) represents the time between infection attempts from an infected node i towards a neighboring node j. Note
that, when all the recovery and transmission times follow exponential distributions, the GeNeSIS model recovers the standard networked SIS model studied in the literature [13], [20], [23]. 1 Without
loss of generality, we assume that each node has at least one neighbor.
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
5
Notice that the origin (i.e., zi = 0 for all i ∈ V ) is an absorbing state of the GeNeSIS dynamics. In what follows, we will refer to the origin as the disease-free equilibrium. The aim of this paper is to analyze the dynamics of the generalized SIS model according to the following stability criterion: Definition III.3. For a given λ > 0, we say that the infection-free equilibrium of the GeNeSIS model is exponentially mean stable with decay rate λ if, for every ε > 0, there exists a C > 0 such that E[zi (t)] ≤ Ce−(λ −ε)t for all t ≥ 0 and i ∈ [n]. In this paper, we consider the GeNeSIS model with transmission and recovery times following phase-type distributions [2]. In what follows, we briefly describe this wide class of probability distributions. Consider a time-homogeneous Markov process x in continuous-time with p + 1 (p ∈ N) states (also called phases) such that the states 1, . . . , p are transient and the state p + 1 is absorbing. The infinitesimal generator of the process is then necessarily of the form T b , b = −T 1 p , 0 0
(3)
where T ∈ R p×p is an invertible Metzler matrix with non-positive row-sums. Let col(φ , 0) ∈ R p+1 (φ ∈ R p ) denote the initial distribution of the Markov process x, i.e., P(x(0) = m) = φm for m ∈ [p] and P(x(0) = p + 1) = 0. Then, the time to absorption into the state p + 1 is a random variable following a phase-type distribution, which we denote by the pair (φ , T ). In the rest of the paper, we make the following assumption on the distribution of (random) transmission and recovery times in the GeNeSIS model: Assumption III.4. Transmission and recovery times of all nodes follow phase-type distributions (φ , T ) and (ψ, R), respectively. Remark III.5. It is known that the set of phase-type distributions is dense in the set of positive-valued distributions [8]. Therefore, it is possible to approximate any arbitrary distribution using a phase-type distribution within any given accuracy. Moreover, there are efficient numerical algorithms to find the parameters of this approximating phase-type distribution [2]. IV. V ECTORIAL R EPRESENTATIONS The aim of this section is to introduce a useful vectorial representation of the GeNeSIS model under Assumption III.4. We start our exposition by providing a vectorial representation of an arbitrary phase-type distribution (Subsection IV-A), and then present a vectorial representation of the GeNeSIS model (Subsection IV-B).
A. Vectorial Representation of Phase-Type Distributions In what follows, we use the following notation: For m, m0 ∈ [p], let Emm0 denote the p × p matrix whose entries are all zeros, except for its (m, m0 )-th entry being one. Also, let em denote the m-th vector of the canonical basis in R p (i.e., all the entries of em are zeros, except for the m-th entry being one). Finally, given a probability distribution φ on [p], we say that an R p -valued random variable x follows the distribution φ , denoted by x ∼ φ , if P(x = em ) = φm for every m ∈ [p]. In the following proposition, we prove that a random variable following a phase-type distribution can be represented using the exit time (see [16, p. 117]) of a particular stochastic differential equation:
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
6
Proposition IV.1. Consider an invertible Metzler matrix T ∈ R p×p with non-positive row-sums. Define the vector b = −T 1 p , and the phase-type distribution (φ , T ). Consider the R p -valued stochastic differential equation p
dx =
∑0
p
(Em0 m − Emm )x dNTmm0 −
∑ Emm x dNbm ,
(4)
m=1
m,m =1
with random initial condition x(0) ∼ φ . Then, the random variable ρ = min{t > 0 : x(t) = 0} = min{t > 0 : ∃ m ∈ [p] such that x(t − ) = em ,
(5)
and Nbm jumps at time t} follows (φ , T ). Proof. We first show that the solution x of (4) is a Markov process with state space {e1 , . . . , e p , 0} ⊂ R p and infinitesimal generator given by (3). Let t ≥ 0 and h > 0 be arbitrary. If x(t) = 0, then we can easily see that x(t + h) = 0. Let us consider the case x(t) = em for m ∈ [p]. From the right-hand side of the differential equation (4), we see that the value of x can change on the interval [t,t + h] only when one of the counters NTm1 , . . . , NTmp , or Nbm jumps. The probability that only the counter NTmm0 jumps in this interval is equal to Tmm0 h + o(h), in which case we obtain x(t + h) = em0 . Also, with probability bm h + o(h), only the counter Nbm jumps in this interval, in which case we obtain x(t + h) = 0. On the other hand, with probability 1 − bm h − ∑mp 0 6=m Tmm0 h + o(h), none of the counters jumps in this interval, in which case x(t + h) = x(t) = em . Finally, the probability that more than one counter jumps in this interval is of order o(h). From the above observations, we obtain that the random process x satisfying (4) is a Markov process with infinitesimal generator (3). Also, from this observation, we directly obtain that the second equality in (5) is true. Furthermore, since x(0) follows the probability distribution φ , the time to absorption of the stochastic process x into the absorbing state 0 follows the phase-type distribution (φ , T ), which follows directly from the definition of phase-type distributions. This completes the proof of the proposition. Based on Proposition IV.1, we now provide a vectorial representation of renewal sequences (see, e.g., [7, Chapter 9]) whose inter-renewal times follow a phase-type distribution: Proposition IV.2. Let (φ , T ) be a phase-type distribution. Let eφ = {eφ (t)}t≥0 be a stochastic process such that eφ (t) follows the distribution φ independently for each t ≥ 0. Consider the R p -valued stochastic differential equation p
dx =
p
∑0 (Em0 m − Emm )x dNTmm0 + ∑ (eφ (t)e>m − Emm )x dNbm
m,m =1
(6)
m=1
with a random initial state x(0) following the distribution φ . Define τ0 = 0 and let 0 < τ1 < τ2 < · · · be the (random) times at which x(t − ) = em and the counter Nbm jumps for some m ∈ [p]. Then, the increments {τk − τk−1 }k∈N are independent and identically distributed random variables following the phase-type distribution (φ , T ). Proof. Since the stochastic differential equation (6) is equivalent to (4) on the time interval [0, τ1 ), the random variable τ1 has the
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
7
same probability distribution as the random variable ρ defined in (5) and, therefore, follows the phase-type distribution (φ , T ) by − − Proposition IV.1. Furthermore, by the definition of τ1 , there exists an m ∈ [p] such that x(τ1 ) = (eφ (τ1 )e> m −Emm )x(τ1 )+x(τ1 ) =
eφ (τ1 ), which follows φ . Therefore, by the same argument as above, we see that the random increment τ2 − τ1 also follows the phase-type distribution (φ , T ). An induction completes the proof.
B. Vectorial Representation of the Generalized SIS Model In this subsection, we use Propositions IV.1 and IV.2 to provide a vectorial representation of the GeNeSIS model under Assumption III.4. Let A = [ai j ]i, j be the adjacency matrix of the graph G . Define the vectors b = −T 1 p ∈ R p , d = −R1q ∈ Rq .
(7)
For `, `0 ∈ [q], let F``0 denote the q × q matrix whose entries are all zeros, except for its (`, `0 )-th entry being one. Also, let f` denote the `-th vector in the canonical basis of Rq . Finally, for i, j ∈ [n] and γ > 0, we let Nλi j and Nλi denote independent Poisson counters with rate λ . The next theorem is the first main result of this paper: Theorem IV.3. For each i ∈ [n] and j ∈ Ni , let eφji = {eφji (t)}t≥0 and fψi = { fψi (t)}t≥0 be the stochastic processes such that eφji (t) ∼ φ and fψi (t) ∼ ψ, independently. Let x ji and yi be, respectively, the R p - and Rq -valued stochastic processes satisfying the following stochastic differential equations: p
dx ji =
p
∑ (Em0 m − Emm )x ji dNTji 0 + ∑ (eφji e>m − Emm )x ji dNbjim mm
m,m0 =1
q
m=1 n
i − x ji ∑ yi` dNdi ` + eφji (1 − 1> q y ) ∑ aik `=1
p
∑ xmik dNbikm ,
m=1 k=1 q q dyi = (F`0 ` − F`` )yi dNRi ``0 − yi yi` dNdi ` `=1 `,`0 =1 p n i ik + fψi (1 − 1> aik xm dNbikm , qy) m=1 k=1
∑
(8)
∑
∑
∑
where for an initially infected subset Λ ⊂ [n], the initial conditions satisfy x ji (0) ∼ φ , yi (0) ∼ ψ, if i ∈ Λ,
(9)
(10)
x ji (0) = 0 p , yi (0) = 0q , otherwise. Then, the generalized networked SIS model in Definition III.1 with transmission and recovery times following, respectively, phase-type distributions (φ , T ) and (ψ, R) can be equivalently described as the family of stochastic processes z = {zi }ni=1 , where zi (t) = 1> q yi (t) for all t ≥ 0 and i ∈ [n]. Remark IV.4. As we shall explain below, the variable x ji is related to spread of the infection from an infected node i to a susceptible node j, while yi controls the recovery process of node i. Before we provide a proof of Theorem IV.3, we start from the following result:
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
8
ji > i ji Lemma IV.5. The following statements are true for all i ∈ [n], j ∈ Ni , and t ≥ 0: 1) 1> p x (t) = 1q y (t); 2) x (t) ∈
{0 p , e1 , . . . , e p }; and 3) yi (t) ∈ {0q , f1 , . . . , fq }. ji > i Proof. To prove the first statement, fix i ∈ [n] and j ∈ Ni , and let ε = 1> p x − 1q y . From (8) and (9), we can show that
dε = ε ∑q`=1 yi` dNdi ` . This equation implies that ε is constant over [0, ∞) because ε(0) = 0. Therefore, we have ε(t) = ε(0) = 0 for every t ≥ 0, completing the proof of the first statement. Let us prove the second and third statements. Notice that x ji and yi are piecewise constant since they are the solutions of the stochastic differential equations (8) and (9). Moreover, their values can change only when one of the Poisson counters in the stochastic differential equations jumps. Finally, (10) shows that x ji (0) ∈ U = {0 p , e1 , . . . , e p } and yi (0) ∈ V = {0q , f1 , . . . , fq }. Therefore, it is sufficient to show that the jump of any Poisson counter leaves the sets U and V invariant. Let t > 0 be arbitrary and assume that x ji (t − ) ∈ U and yi (t − ) ∈ V . The stochastic differential equations (8) and (9) have the following five different types of Poisson counters: NTjimm0 , NRi ``0 , Nbjim , Nbikm , and Ndi ` . We now show below that, any of these counters leave the sets U and V invariant: Case 1) When NTjimm0 jumps at time t: We have that x ji (t) = x ji (t − ) + (Em0 m − Emm )x ji (t − ) em0 , if x ji (t − ) = em , = x ji (t − ), otherwise, and hence x ji (t) ∈ U. Also, we trivially have yi (t) = yi (t − ) ∈ V . Case 2) When NRi ``0 jumps at time t: Clearly x ji (t) = x ji (t − ) ∈ V . Also, we have yi (t) ∈ U because yi (t) = yi (t − ) + (F`0 ` − F`` )yi (t − ) e`0 , if yi` (t − ) = 1, = yi (t − ), otherwise. Case 3) When Nbjim jumps at time t: We clearly have yi (t) = yi (t − ) ∈ V . We can also show x ji (t) ∈ U because ji − x ji (t) = x ji (t − ) + (eφji (t)e> m − Emm )x (t ) eφji (t), if xmji (t − ) = 1, = x ji (t − ), otherwise.
Case 4) When Nbikm jumps at time t: We have that i − ik − x ji (t) = x ji (t − ) + eφji (t)(1 − 1> q y (t ))aik xm (t ) i − ik − x ji (t − ), if 1> q y (t ) = 1, aik = 0, or xm (t ) = 0, = eφji (t), otherwise.
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
9
Also, i − ik − yi (t) = yi (t − ) + fψi (t)(1 − 1> q y (t ))aik xm (t ) yi (t − ), if 1> yi (t − ) = 1, a = 0, or xik (t − ) = 0, ik q m = fψi (t), otherwise.
Therefore, this jump leaves U and V invariant. We note that we have used the first claim in this derivation. Case 5) When Ndi ` jumps at time t: We can show that x ji (t) = x ji (t − ) − x ji (t − )yi` (t − ) 0, if yi` (t − ) = 1, = x ji (t − ), otherwise, and hence x ji (t) ∈ U. We also have yi (t) ∈ V because yi (t) = yi (t − ) − yi (t − )yi` (t − ) 0, if yi` (t − ) = 1, = yi (t − ), otherwise. This completes the proof of the lemma. From the proof of Lemma IV.5, the following corollary immediately follows. Corollary IV.6. The following statements are true for every i ∈ [n], j ∈ Ni , and t ≥ 0: 1) node i attempts to infect node j at time t, if and only if, xmji (t − ) = 1 and Nbjim jumps at time t for some m ∈ [p]. 2) Node i becomes susceptible at time t, if and only if, yi` (t − ) = 1 and Ndi ` jumps at time t for some ` ∈ [q]. ji > i Proof. From the proof of Lemma IV.5, the value of zi = 1> p x = 1q y can change from zero to one only in Case 4), given in
the proof. From the equations in Case 4), we see that node i becomes infected at time t ≥ 0, if and only if, zi (t − ) = 0, node i ik (t − ) = 1, and N ik jumps at time t for some m ∈ [p]. Therefore, node k attempts to infect node i at is adjacent to node k, xm bm ik (t − ) = 1 and N ik jumps at time t for some m ∈ [p], as desired. time t if and only if xm bm
Similarly, we see that the value of zi can change from one to zero only in Case 5). The equations in Case 5) imply that node i becomes susceptible at time t if and only if yi` (t − ) = 1 and the counter Ndi ` jumps at time t for some ` ∈ [q], showing the second assertion of the corollary. Now we are ready to prove our first main result, namely, Theorem IV.3. Proof of Theorem IV.3. We prove that the family of stochastic processes z = {zi }ni=1 , defined as the solution of (8) and (9), satisfies items a)-c) in Definition III.1, as well as Assumption III.4. Item a) is true by (10). Let us now prove item b). Assume that node i becomes infected at time t ≥ 0; hence, we have that either t = 0 or t > 0. We first consider the case t = 0. Let
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
10
3
3
3
2
2
2
1
1
1
0
0 0
1
2
3
(a) Log-normal distributions (solid lines) with mean µ and variance µ 2 , and their phase-type approximations (dashed lines).
0 0
1
2
3
(b) Log-normal distributions (solid lines) with mean µ and variance 2µ 2 , and their phase-type approximations (dashed lines).
0
1
2
3
(c) Log-normal distributions (solid lines) with mean µ and variance 4µ 2 , and their phase-type approximations (dashed lines).
Fig. 1. Approximations by phase-type distributions. Solid: Probability density functions of the log-normal distributions. Dashed: Probability density functions of phase-type distributions.
ρ be the earliest time at which yi (ρ) = 0q . Then, on the time interval [0, ρ), the differential equation (9) is equivalent to dyi = ∑q`,`0 =1 (F`0 ` − F`` )yi dNRi ``0 − ∑q`=1 F`` yi dNdi ` , since yi yi` = F`` yi . Moreover, the vector yi (0) follows the distribution ψ by (10). Therefore, by Proposition IV.1, we see that ρ follows the phase-type distribution (ψ, R), proving item b) in Definition III.1 under Assumption III.4 for t = 0. We then consider the case t > 0. From Case 4) in the proof of Lemma IV.5 and Corollary IV.6, we know that yi (t) equals fψi (t) and, therefore, follows the distribution ψ. We can then use the same argument as above to conclude that node i remains infected on [t,t + ρ) and becomes susceptible at time t + ρ, with ρ following the phase-type distribution (ψ, T ), proving item b) in Definition III.1 under Assumption III.4 for t > 0. Let us finally prove that the family of stochastic processes z = {zi }ni=1 satisfies item c) in Definition III.1 under Assumption III.4. Assume that node i becomes infected at time t, and, again, consider the case t = 0 first. Let τ1 < τ2 < . . . be the times at which node i attempts to infect node j. We need to show that the differences τ1 − 0 = τ1 , τ2 − τ1 , . . . are independent and identically distributed random variables following the phase-type distribution (φ , T ). Notice that x ji (t) 6= 0 until node i becomes susceptible, which occurs at the earliest time t > 0 at which yi` (t − ) = 1 and the counter Ndi ` jumps for some ` ∈ [q] (by Corollary IV.6). Therefore, until the recovery of node i, the stochastic differential equation (8) is equivalent to ji ji > ji p p ji ji − ji dx ji = ∑m,m 0 =1 (Em0 m − Emm )x dNT 0 + ∑m=1 (eφ em − Emm )x dNbm . Thus, by Proposition IV.2, the times at which x (t ) = em mm
and the counter Nbjim jumps form a renewal sequence with inter-renewal times following the phase-type distribution (φ , T ). On the other hand, by Corollary IV.6, these times are exactly the times at which node i attempts to infect node j. Therefore, item c) in Definition III.1 under Assumption III.4 holds true. We can prove the case t > 0 in the similar manner and hence omit the proof. V. S TABILITY A NALYSIS In this section, we use the representation of the GeNeSIS as the (vectorial) stochastic differential equations (8) and (9) to analyze its exponential stability under Assumption III.4. In other words, we provide conditions under which an epidemic process with general infection and recovery times taking place in an arbitrary network topology dies out exponentially fast over time. The following theorem is the main result of this section:
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
b) Log-normal recovery times with mean µ and variance µ 2
a) Exponential recovery times with mean µ
1) Exponential transmission times with mean µ
11
c) Log-normal recovery times with mean µ and variance 2µ 2
d) Log-normal recovery times with mean µ and variance 4µ 2
1.5
1.5
1.5
1.5
1
1
1
1
1
00
0.5 0.5
1
1.5
1.5
0.5
0.5
0.5 0.5
1
1.5
1.5
1
0.5 0.5
1
1.5
-1 1
1.5
1
1 -3 3
1
1.5
0.5 0.5
1
1.5
0.5 0.5
1
1.5
0.5 0.5
1.5
1.5
1.5
1
1
1
1
1.5
-4 4 -5 5 -6 6
1
1.5
1.5
4) Log-normal transmission 1 times with mean µ and variance 4µ 2 0.5
1.5
-2 2
1.5
3) Log-normal transmission 1 times with mean µ and variance 2µ 2 0.5
1
1.5
2) Log-normal transmission 1 times with mean µ and variance µ 2 0.5 0.5
0.5 0.5
1
1.5
0.5 0.5
1
1.5
0.5 0.5
1
1.5
0.5 0.5
1
1.5
1.5
1.5
1.5
-7 7
1
1
1
-8 8
0.5 0.5
1
1.5
0.5 0.5
1
1.5
0.5 0.5
-9 1
1.5
Fig. 2. Upper bound |η(A)| on the decay rate of generalized SIS model. Horizontal axes: µ. Vertical axes: µ divided by the spectral radius of the graph.
Theorem V.1. Consider the generalized networked SIS model with transmission and recovery times following, respectively, phase-type distributions (φ , T ) and (ψ, R). Define the (npq) × (npq) matrix A = In ⊗ T > ⊕ R> + (φ b> ) ⊗ Iq + A ⊗ (φ b> ) ⊗ (ψ1> q ), where A is the adjacency matrix of the graph G and the vector b is defined in (7). If A is Hurwitz stable, then the infection-free equilibrium of the generalized networked SIS model is exponentially mean stable with decay rate |η(A)|, where η(A) is the spectral abscissae of A. Proof. Combining equations (8) and (9), we obtain the following R p+q -valued stochastic differential equation ji ji p 0 x (E − E )x mm mm ji d = ∑ dNTmm0 i 0 m,m =1 y 0nq ji > ji p (e e − E )x mm φ m ji +∑ dNbm m=1 0nq ji i q q 0np i −x y` i + ∑ dNR``0 + ∑ dNd` `=1 −yi yi `,`0 =1 (F`0 ` − F`` )yi ` ji > i ik p n eφ (1 − 1q y )xm ik + ∑ ∑ aik dNbm . i ik k=1 m=1 fψi (1 − 1> q y )xm
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
12
Then, we apply Itˆo’s formula in Lemma II.1 using the function ji x g = w ji = x ji ⊗ yi , yi to obtain the following R pq -valued stochastic differential equation (after tedious, but simple, algebraic manipulations) p
dw ji =
∑0
(Em0 m − Emm ) ⊗ Iq w ji dNTji
∑
ji ji (eφji e> m − Emm ) ⊗ Iq w dNbm
mm0
m,m =1 p
+
m=1 q
+
∑
`,`0 =1 p n
q I p ⊗ (F`0 ` − F`` ) w ji dNRi ``0 + ∑ (−yi` w ji ) dNdi ` `=1
∑ aik (eφji ⊗ fψi )(1 − 1>pq w ji )(e>m ⊗ 1>q )wik dNbikm .
+∑
(11)
k=1 m=1
For brevity, we omit the details of this derivation. Define ω ji (t) = E[w ji (t)] for t ≥ 0. Then, using Lemma II.2, from (11) we can derive the R pq -valued differential equation dω ji = dt
p
+
∑0
∑
(E[eφji ]e> m − Emm ) ⊗ Iq bm
m=1 q
+
((Em0 m − Emm ) ⊗ Iq ) Tmm0
m,m =1 p
∑
`,`0 =1 p n
+∑
q I p ⊗ (F`0 ` − F`` ) R``0 ω ji + ∑ E[−yi` w ji ]d` `=1
∑ aik E[eφji ⊗ fψi ]E[(1 − 1>pq w ji )(e>m ⊗ 1>q )wik ]bm
k=1 m=1
h = (T > + B) ⊗ Iq + (φ b> − B) ⊗ Iq i + I p ⊗ (R> + D) ω ji − (I p ⊗ D)ω ji p > + (φ ⊗ ψ)(e> m ⊗ 1q )
∑
m=1
n
bm ∑ (aik ωik ) − ε ji ,
(12)
k=1
where B and D are the n × n diagonal matrices having the diagonals b1 , . . . , bn and d1 , . . . , dn , respectively, and the R pq -valued p ji > > ik function ε ji is defined by ε ji (t) = ∑nk=1 ∑m=1 bm aik (φ ⊗ ψ)E[1> pq w (t)(em ⊗ 1q )w (t)] for every t ≥ 0.
Now, for every i ∈ [n], we take an arbitrary ji ∈ Ni . We then define the function ωi by ωi = ω ji i (i.e., the solution to (12) when j = ji ), as well as the Rnpq -valued function ω = col(ω1 , . . . , ωn ). Notice that, from (11), all the stochastic processes in the set {w ji } j∈Ni follow the same stochastic differential equation and, therefore, present the same probability distribution. This implies that ωi = ω ji for every j ∈ Ni . Therefore, we can show that ∑nk=1 aik ωik = ∑nk=1 aik ωk = [ai1 I pq · · · ain I pq ]ω = (Ai ⊗I pq )ω, where Ai denotes the i-th row of the adjacency matrix A. Then, from (12), it follows that dωi > = T ⊗ Iq + I p ⊗ R> + (φ b> ) ⊗ Iq ωi dt + Ai ⊗ (φ b> ) ⊗ (ψ1> q ) ω − ε ji i .
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
13
Defining ε = col(ε j1 1 , . . . , ε jn n ), we obtain the differential equation dω = Aω − ε, dt
(13)
where A is defined in the statement of the theorem. Since A is Metzler, we have that eAt ≥ 0 for every t ≥ 0. Also, since both x ji (t) and yi (t) are nonnegative for all i ∈ [n], j ∈ Ni , and t ≥ 0, we have that ε(t) ≥ 0 for every t ≥ 0. Therefore, from (13), it follows that ω(t) = eAt ω(0) −
R t A(t−τ) ε(τ) dτ ≤ 0e
eAt ω(0). This inequality implies that ω(t) converges to zero as t → ∞ with a decay rate equal to |η(A)|, since A is assumed to be Hurwitz stable in the statement of the theorem, and ω(t) ≥ 0 for all t ≥ 0. ji > ji > i > i On the other hand, for each i ∈ [n] and j ∈ Ni , we can show that 1> pq w (t) = (1 p x (t))(1q y (t)) = 1q y (t) = zi (t) from
Lemma IV.5. Therefore, E[zi (t)] = 1> pq ω ji (t), which shows the exponentially fast convergence of E[zi (t)] towards zero with a decay rate equal to |η(A)|. This completes the proof of the theorem. VI. N UMERICAL S IMULATIONS In this section, we illustrate the effectiveness of our results with numerical simulations in a small real social network. We specifically show that modeling non-exponential transmission/recovery times using exponential distributions (i.e., using the standard Markovian model in [20]) can induce significant errors in the computation of the decay rate of the epidemic process. In our simulations, we use the following four distributions to model transmission and recovery times in the GeNeSIS model: (i) the exponential distribution with mean µ (and, hence, variance µ 2 ); (ii) the log-normal distribution with mean µ and variance µ 2 ; (iii) the log-normal distribution with mean µ and variance 2µ 2 ; and (iv) the log-normal distribution with mean µ and variance 4µ 2 . In order to analyze the stability of the GeNeSIS model whose transmission and recovery times follow one of these four distributions, we first approximate the three log-normal distributions in (ii)–(iv) using phase-type distributions (as described in Section III). In particular, we use p = 10 phases in our approximation and use the expectation-maximization (EM) algorithm proposed in [2] to find the particular values of T and b in (3). Fig. 1 shows the probability density functions of the (exact) log-normal distributions, as well as the fitted phase-type distributions, when the value of the parameter µ is in the range [0.5 : 0.1 : 1.5]. Using the obtained phase-type distributions, we then apply Theorem V.1 to analyze the stability of the GeNeSIS model when the transmission/recovery times follow one of the four distributions described above. In particular, we compute the exponential decay rate for each one of the possible sixteen combinations of transmission/recovery distributions when µ is in the range [0.5 : 0.05 : 1.5]. In the subfigures of Fig. 2, we include contour plots of the decay rates for these sixteen cases. We see that, even though the four distributions have the same mean, the resulting GeNeSIS models exhibit distinct regions of exponential mean stability as well as different decay rates. In Fig. 2, we observe that for a fixed recovery distribution (i.e., for a fixed column in the table), the stability region (indicated by a black solid line in Fig. 2) shrinks as we increase the variance of the log-normal distribution modeling the transmission times (see rows 2) to 4) in Fig. 2). Since the spreading process ‘dies out’ exponentially fast inside the stability region, we conclude that the ‘heavier’ the tail of the transmission distribution, the more
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
14
‘viral’ the spreading process. Similarly, we analyze the impact of the variance of the recovery distribution in the stability of the spread. In Fig. 2, we observe that for a fixed transmission distribution (i.e., for a fixed row in the table), the stability region remains almost unaltered as we increase the variance of the distribution modeling the recovery time. However, the decay rate |η(A)| increases more abruptly inside the stability region as we decrease this variance. This is indicated by a steeper gradient inside the stability region. In other words, the smaller the variance of the recovery distribution, the faster the spreading process ‘dies out’ in the network. VII. C ONCLUSION In this paper, we have analyzed the dynamics of a generalized model of spreading over arbitrary networks with phase-type transmission and recovery times. Since phase-type distributions form a dense family in the space of positive-valued distributions, our results allow to theoretically analyze arbitrary transmission and recovery times within an arbitrary accuracy. In this context, we have derived conditions for a generalized spreading process to converge towards the origin (i.e., to eradicate the spread). In contrast with existing results, we provide an exact analysis of the stochastic spreading dynamics over arbitrary networks (without relying on mean-field approximations). Our results illustrate that the particular shape of the transmission/recovery distribution heavily influences the boundary of the stability region of the spread, as well as the decay rate inside the region. In particular, we observe that the ‘heavier’ the tail of the transmission distribution, the more ‘viral’ the spreading process. Furthermore, the ‘thiner’ the tail of the recovery distribution, the faster the spreading process ‘dies out’ in the network. R EFERENCES [1] H. J. Ahn and B. Hassibi, “Global dynamics of epidemic spread over complex networks,” in 52nd IEEE Conference on Decision and Control, 2013, pp. 4579–4585. [2] S. Asmussen, O. Nerman, and M. Olsson, “Fitting phase-type distributions via the EM algorithm,” Scandinavian Journal of Statistics, vol. 23, no. 4, pp. 419–441, 1996. [3] N. T. J. Bailey, The Mathematical Theory of Infectious Diseases and its Applications.
Griffin, 1975.
[4] S. P. Blythe and R. M. Anderson, “Variable infectiousness in HFV transmission models,” Mathematical Medicine and Biology, vol. 5, no. 3, pp. 181–200, 1988. [5] M. Bogu˜na´ , L. F. Lafuerza, R. Toral, and M. A. Serrano, “Simulating non-Markovian stochastic processes,” Physical Review E, vol. 90, no. 4, p. 042108, 2014. [6] E. Cator, R. van de Bovenkamp, and P. Van Mieghem, “Susceptible-infected-susceptible epidemics on networks with general infection and cure times,” Physical Review E, vol. 87, no. 6, p. 062816, 2013. [7] E. C ¸ inlar, Introduction to Stochastic Processes.
Prentice-Hall, 1975.
[8] D. R. Cox, “A use of complex probabilities in the theory of stochastic processes,” Mathematical Proceedings of the Cambridge Philosophical Society, vol. 51, no. 02, pp. 313–319, 1955. [9] F. Darabi Sahneh, C. Scoglio, and P. Van Mieghem, “Generalized epidemic mean-field model for spreading processes over multilayer complex networks,” IEEE/ACM Transactions on Networking, vol. 21, no. 5, pp. 1609–1620, 2013. [10] R. Horn and C. Johnson, Matrix Analysis.
Cambridge University Press, 1990.
[11] H.-H. Jo, J. I. Perotti, K. Kaski, and J. Kert´esz, “Analytically solvable model of spreading dynamics with non-Poissonian processes,” Physical Review X, vol. 4, no. 1, p. 011041, 2014. [12] K. Lerman and R. Ghosh, “Information contagion: An empirical study of the spread of news on Digg and Twitter social networks,” in Fourth International AAAI Conference on Weblogs and Social Media, 2010, pp. 90–97.
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
15
[13] C. Nowzari, V. M. Preciado, and G. J. Pappas, “Analysis and control of epidemics: A survey of spreading processes on complex networks,” IEEE Control Systems, vol. 36, no. 1, pp. 26–46, 2016. [14] M. Ogura and V. M. Preciado, “Stability of spreading processes over time-varying large-scale networks,” IEEE Transactions on Network Science and Engineering, vol. 3, no. 1, pp. 44–57, 2016. [15] ——, “Epidemic processes over adaptive state-dependent networks,” Physical Review E, vol. 93, no. 6, p. 062316, 2016. [16] B. Øksendal, Stochastic Differential Equations.
Springer, 2003.
[17] L. Pellis, T. House, and M. J. Keeling, “Exact and approximate moment closures for non-Markovian network epidemics,” Journal of Theoretical Biology, vol. 382, pp. 160–177, 2015. [18] S. Roy, M. Xue, and S. K. Das, “Security and discoverability of spread dynamics in cyber-physical networks,” IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 9, pp. 1694–1707, 2012. [19] T. Suess, U. Buchholz, S. Dupke, R. Grunow, M. An der Heiden, A. Heider, B. Biere, B. Schweiger, W. Haas, and G. Krause, “Shedding and transmission of novel influenza virus A/H1N1 infection in households-Germany, 2009,” American Journal of Epidemiology, vol. 171, no. 11, pp. 1157–1164, 2010. [20] P. Van Mieghem, J. Omic, and R. Kooij, “Virus spread in networks,” IEEE/ACM Transactions on Networking, vol. 17, no. 1, pp. 1–14, 2009. [21] P. Van Mieghem and R. van de Bovenkamp, “Non-Markovian infection spread dramatically alters the susceptible-infected-susceptible epidemic threshold in networks,” Physical Review Letters, vol. 110, no. 10, p. 108701, 2013. [22] Y. Wan, S. Roy, and A. Saberi, “Designing spatially heterogeneous strategies for control of virus spread,” IET Systems Biology, vol. 2, no. 4, pp. 184–201, 2008. [23] Y. Wang, D. Chakrabarti, C. Wang, and C. Faloutsos, “Epidemic spreading in real networks: an eigenvalue viewpoint,” in 22nd International Symposium on Reliable Distributed Systems, 2003, pp. 25–34.