Infinite Block-Structured Transition Matrices and Their Properties

Yiqiang Q. Zhao (1), Wei Li (1,2), W. John Braun (1)

(1) Department of Mathematics and Statistics, University of Winnipeg, Winnipeg, Canada R3B 2E9
(2) Institute of Applied Mathematics, Chinese Academy of Sciences, Beijing, P.R. China

September 2, 2004
Abstract

In this paper, we study Markov chains with infinite state block-structured transition matrices, whose states are partitioned into levels according to the block structure, and various associated measures. Roughly speaking, these measures involve first passage times or expected numbers of visits to certain levels without hitting other levels; they are very important and often play a key role in the study of a Markov chain. Necessary and/or sufficient conditions are obtained for a Markov chain to be positive recurrent, recurrent, or transient in terms of these measures. Results are obtained for general irreducible Markov chains as well as those with transition matrices possessing some block structure. We also discuss the decomposition or the factorization of the characteristic equations of these measures. In the scalar case, we locate the zeros of these characteristic functions and therefore use these zeros to characterize a Markov chain. Examples and various remarks are given to illustrate some of the results.

Keywords: infinite-state Markov chains; block-form matrices; Markov chains with repeating rows; transient, recurrent and positive recurrent states; censored Markov chains; factorization of generating functions
1 Introduction
It is now well known how significant the rate matrix $R$ is in discussing a Markov chain of GI/M/1 type, and how significant the matrix $G$ of the fundamental period is in discussing a Markov chain of M/G/1 type (for example, Neuts (1981, 1989)). In the present paper, we will discuss the matrices $R_{i,j}$ and $G_{i,j}$, which are the counterparts of $R$ and $G$ for general Markov chains. Along the way, we will also study some related measures. The only condition imposed on the Markov chains is irreducibility, though this condition is not essential for all of the results presented in this paper. Results presented here are for discrete time Markov chains. However, corresponding results for continuous time Markov chains can be obtained in parallel. Let $\{Z_t = (X_t, Y_t);\ t = 0, 1, 2, \ldots\}$ be the Markov chain, whose transition matrix
$P$ is expressed in block matrix form:
$$P = \begin{pmatrix} P_{0,0} & P_{0,1} & P_{0,2} & \cdots \\ P_{1,0} & P_{1,1} & P_{1,2} & \cdots \\ P_{2,0} & P_{2,1} & P_{2,2} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}, \tag{1.1}$$
where $P_{i,j}$ is a matrix of size $k_i \times k_j$ with both $k_i < \infty$ and $k_j < \infty$. In general, $P$ is allowed to be substochastic. The state space $S$ is partitioned accordingly into
$$S = \bigcup_{i=0}^{\infty} L_i \tag{1.2}$$
with
$$L_i = \{(i, 1), (i, 2), \ldots, (i, k_i)\}. \tag{1.3}$$
In state $(i, r)$, $i$ is called the level variable and $r$ the stage variable. We also use the notation
$$L_{\le i} = \bigcup_{k=0}^{i} L_k. \tag{1.4}$$
Partitioning the transition matrix $P$ into blocks is done not only because it is convenient for a comparison with results in the literature, but also because it is necessary when the Markov chain exhibits some kind of block structure. For the above Markov chain, we define matrices $R_{i,j}$ for $i < j$ and $G_{i,j}$ for $i > j$ as follows. $R_{i,j}$ is a matrix of size $k_i \times k_j$ whose $(r, s)$th entry is the expected number of visits to state $(j, s)$ before hitting any state in $L_{\le (j-1)}$, given that the process starts in state $(i, r)$. $G_{i,j}$ is a matrix of size $k_i \times k_j$ whose $(r, s)$th entry is the probability of hitting state $(j, s)$ when the process enters $L_{\le (i-1)}$ for the first time, given that the process starts in state $(i, r)$. We call the matrices $R_{i,j}$ and $G_{i,j}$, respectively, the matrices of expected number of visits to higher levels before returning to lower levels and the matrices of the first passage probabilities to lower levels.

The significance of $R_{i,j}$ and $G_{i,j}$ in studying Markov chains, especially ergodic Markov chains, has been pointed out in many research papers, including some which are closely related to the present paper (for example, Grassmann and Heyman (1990), Grassmann and Heyman (1993), and Heyman (1995)). In these papers, under the ergodic condition, stationary distribution vectors, the mean first passage times, and the fundamental matrix are discussed in terms of these matrices. In light of this work, we have continued the discussion of these matrices and have obtained more results about them. We also study some related measures, for example, matrices $A_{i,j}$ and $B_{i,j}$, which are defined below.

For $i \ge 0$ and $j \ge 0$ with $i \ne j$, define $A_{i,j}$ to be a matrix of size $k_i \times k_j$ whose $(r, s)$th entry is the expected number of visits to state $(j, s)$ before hitting any state in level $i$, given that the process starts in state $(i, r)$. For $i \ge 0$ and $j \ge 0$, define $B_{i,j}$ to be a matrix of size $k_i \times k_j$. When $i \ne j$, the $(r, s)$th entry of $B_{i,j}$ is the probability of visiting state $(j, s)$ for the first time before hitting any other state in level $j$, given that the process starts in state $(i, r)$. When $i = j$, the $(r, s)$th entry of $B_{i,j}$ is the probability of returning to level $j$ for the first time by hitting state $(j, s)$, given that the process starts in state $(i, r)$. The measure $A_{i,j}$ is the block form counterpart of the generalized stationary distribution defined in the scalar case, when the Markov chain is recurrent (for example, Karlin and Taylor (1981)). The measure $B_{i,j}$ is the block form counterpart of the scalar case probability of ever hitting state $j$ given that the Markov chain starts in state $i$.

Results obtained in this paper include: necessary and/or sufficient conditions for a Markov chain to be positive recurrent, recurrent, or transient in terms of $R_{i,j}$, $G_{i,j}$ and related measures, and the decomposition or factorization of the characteristic equations of these measures. In the scalar case, we locate the zeros of these characteristic functions and therefore use these zeros to characterize a Markov chain. Examples and various remarks are also provided in the paper. Many aspects of the study of Markov chains can be carried out in terms of these results, including the classification of states of Markov chains, the limiting behavior of Markov chains, and treatment of computational issues which arise for various measures.
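The definitions of $R_{i,j}$ and $G_{i,j}$ given above can be made concrete on a small finite chain. The following Python sketch is only an illustration under stated assumptions: levels are scalar ($k_i = 1$), the 4-state stochastic matrix `P` is hypothetical, and the helper names `R_entry` and `G_entry` are not from the paper. It evaluates both quantities through the fundamental matrix $(I - Q)^{-1}$ of the states lying outside the relevant set of lower levels.

```python
import numpy as np

# Hypothetical 4-state chain; state i is level i (all blocks are 1 x 1).
P = np.array([
    [0.2, 0.5, 0.2, 0.1],
    [0.3, 0.1, 0.4, 0.2],
    [0.1, 0.3, 0.2, 0.4],
    [0.4, 0.1, 0.3, 0.2],
])

def R_entry(P, i, j):
    """Expected visits to level j before hitting levels 0..j-1, from level i < j."""
    up = np.arange(j, P.shape[0])            # states outside L_{<=j-1}
    Q = P[np.ix_(up, up)]                    # transitions that stay outside
    N = np.linalg.inv(np.eye(len(up)) - Q)   # fundamental matrix sum_k Q^k
    return (P[i, up] @ N)[0]                 # level j is the first state in `up`

def G_entry(P, i, j):
    """Probability that the first entry into levels 0..i-1 is at level j < i, from level i."""
    up = np.arange(i, P.shape[0])            # states outside L_{<=i-1}
    low = np.arange(0, i)                    # the lower levels 0..i-1
    Q = P[np.ix_(up, up)]
    N = np.linalg.inv(np.eye(len(up)) - Q)
    return (N @ P[np.ix_(up, low)])[0, j]    # row 0 of `up` is level i

print(R_entry(P, 2, 3))                       # equals 0.4 / (1 - 0.2)
print([G_entry(P, 3, j) for j in range(3)])   # first-passage distribution from level 3
```

For this finite irreducible chain the first-passage probabilities from any level sum to one, which is the finite analogue of the recurrence criterion developed in Section 2.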
The rest of the paper is organized into four sections. In Section 2, we discuss a general Markov chain, whose transition matrix is partitioned into blocks. Section 3 contributes to a study of Markov chains partitioned in block form and with a repeating property. Results, which only apply to Markov chains whose transition matrix entries are scalars, are obtained in Section 4. Examples are provided and conclusions are drawn in the last section of the paper.
2 Markov chains in block form
In this section, we discuss the matrices of expected visits to higher levels before returning to lower levels and the matrices of the first passage probabilities to lower levels for Markov chains $P$ partitioned in block form. We also discuss the measures $A_{i,j}$ and $B_{i,j}$. We show that many basic aspects of Markov chains, for example the classification of states, can be discussed in terms of these measures. These measures are also convenient for computing various probabilities and expected values of interest.

Theorem 2.1 Matrices $A_{0,n}$ and $R_{k,n}$ as defined in Section 1 satisfy
$$A_{0,n} = \begin{cases} R_{0,1}, & \text{if } n = 1, \\ R_{0,n} + \sum_{k=1}^{n-1} A_{0,k} R_{k,n}, & \text{if } n \ge 2. \end{cases} \tag{2.5}$$
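Proof: It is clear from the definitions that $A_{0,1} = R_{0,1}$. When $n \ge 2$,
$$\begin{aligned}
A_{0,n}(r, s) &= \sum_{k=1}^{\infty} P\{Z_k = (n, s),\ Z_l \notin L_0 \text{ for } 1 \le l \le k-1 \mid Z_0 = (0, r)\} \\
&= \sum_{k=1}^{\infty} P\{Z_k = (n, s),\ Z_l \notin L_{\le n-1} \text{ for } 1 \le l \le k-1 \mid Z_0 = (0, r)\} \\
&\quad + \sum_{k=2}^{\infty} \sum_{t=1}^{k-1} P\{Z_k = (n, s),\ Z_{k-1} \notin L_{\le n-1}, \ldots, Z_{t+1} \notin L_{\le n-1},\ Z_t \in L_{\le n-1} \setminus L_0,\ Z_{t-1} \notin L_0, \ldots, Z_1 \notin L_0 \mid Z_0 = (0, r)\} \\
&= R_{0,n}(r, s) + \sum_{t=1}^{\infty} \sum_{k=t+1}^{\infty} \sum_{i=1}^{n-1} \sum_{w=1}^{k_i} P\{Z_t = (i, w),\ Z_{t-1} \notin L_0, \ldots, Z_1 \notin L_0 \mid Z_0 = (0, r)\} \\
&\qquad \times P\{Z_k = (n, s),\ Z_{k-1} \notin L_{\le n-1}, \ldots, Z_{t+1} \notin L_{\le n-1} \mid Z_t = (i, w)\} \\
&= R_{0,n}(r, s) + \sum_{i=1}^{n-1} \sum_{w=1}^{k_i} A_{0,i}(r, w) R_{i,n}(w, s).
\end{aligned}$$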
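The recursion (2.5) can be checked numerically on a finite chain. The sketch below assumes scalar levels and a hypothetical 4-state stochastic matrix `P`; the function name `R` and the vector `A0` are illustrative names, and both quantities are evaluated from their definitions through a fundamental matrix rather than by any method stated in the paper.

```python
import numpy as np

# Hypothetical 4-state chain; state i is level i (all blocks are 1 x 1).
P = np.array([
    [0.2, 0.5, 0.2, 0.1],
    [0.3, 0.1, 0.4, 0.2],
    [0.1, 0.3, 0.2, 0.4],
    [0.4, 0.1, 0.3, 0.2],
])
n_levels = P.shape[0]

def R(i, j):
    """R_{i,j}: expected visits to level j before hitting levels 0..j-1, from i < j."""
    up = np.arange(j, n_levels)
    N = np.linalg.inv(np.eye(len(up)) - P[np.ix_(up, up)])
    return (P[i, up] @ N)[0]

# A0[n-1] = A_{0,n}: expected visits to level n before returning to level 0.
A0 = P[0, 1:] @ np.linalg.inv(np.eye(n_levels - 1) - P[1:, 1:])

def rhs(n):
    """Right-hand side of (2.5) for level n >= 1."""
    return R(0, n) + sum(A0[k - 1] * R(k, n) for k in range(1, n))

for n in range(1, n_levels):
    print(n, A0[n - 1], rhs(n))   # the two columns should agree
```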
For any subset $D$ of the state space $S$, we use $P^D$ to mean the transition probability matrix of the stochastic process formed from the process $Z_t$ by deleting all $Z_t$ not belonging to $D$. This new process is called the censored process and it possesses the Markovian property. See Kemeny et al. (1976) or Freedman (1983) for more details. When $D = L_{\le n}$, $P^D$ is denoted by $P^{(n)}$. Some of the properties of the censored process, which will be used throughout this paper, are summarized by the following Lemma.

Lemma 2.2 Let $P$ be the transition matrix of a Markov chain, which is possibly substochastic, and let $D$ be a subset of the state space.
i) $P$ is irreducible if and only if $P^D$ is irreducible for all $D$.
ii) $P$ is recurrent if and only if $P^D$ is recurrent for all $D$.
iii) $P$ is transient if and only if $P^D$ is transient for all $D$.
iv) If $P$ is irreducible, then $P$ is recurrent if and only if $P^D$ is recurrent for some $D$.
v) If $P$ is irreducible, then $P$ is transient if and only if $P^D$ is transient for some $D$.
vi) For $D_1 \subseteq D_2$, $P^{D_1} = (P^{D_2})^{D_1}$.

Remark 2.3 i) $P^D$ is not necessarily stochastic even though the original $P$ is stochastic. ii) $A_{i,j}$, $B_{i,j}$, $R_{i,j}$ and $G_{i,j}$ are independent of the censoring process. It means that if we let $C$ be any of $A$, $B$, $R$ and $G$ and let $C^D$ be the corresponding measure for the censored process with censoring set $D$, then $C_{i,j} = C^D_{i,j}$ for all $i, j \in D$. This fact was essentially observed by Grassmann and Heyman (1990) for the case when $C = R$ or $G$.
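For a finite chain the censored matrix $P^D$ can be computed in closed form, which gives a quick way to experiment with Lemma 2.2 vi). The sketch below assumes the standard block formula $P^D = T + U(I - Q)^{-1}D'$ for a partition of $P$ into the kept states and the deleted states; the function name `censor` and the 4-state matrix are hypothetical.

```python
import numpy as np

# Hypothetical 4-state stochastic matrix.
P = np.array([
    [0.2, 0.5, 0.2, 0.1],
    [0.3, 0.1, 0.4, 0.2],
    [0.1, 0.3, 0.2, 0.4],
    [0.4, 0.1, 0.3, 0.2],
])

def censor(P, keep):
    """Transition matrix of the chain watched only on the states in `keep`.

    Assumes at least one state is deleted and that I - Q is invertible.
    """
    keep = np.asarray(keep)
    drop = np.setdiff1d(np.arange(P.shape[0]), keep)
    T = P[np.ix_(keep, keep)]      # kept -> kept in one step
    U = P[np.ix_(keep, drop)]      # kept -> deleted
    Dm = P[np.ix_(drop, keep)]     # deleted -> kept
    Q = P[np.ix_(drop, drop)]      # deleted -> deleted
    # Excursions through the deleted states are summed by (I - Q)^{-1}.
    return T + U @ np.linalg.solve(np.eye(len(drop)) - Q, Dm)

direct = censor(P, [0, 1])                      # censor straight to D1 = {0, 1}
nested = censor(censor(P, [0, 1, 2]), [0, 1])   # first D2 = {0, 1, 2}, then D1
print(direct)
print(nested)
```

Censoring in one step and censoring in two nested steps should give the same matrix, and for this finite irreducible stochastic `P` the censored matrix is again stochastic.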
Theorem 2.4 Let $A_{0,0} = P^{(0)}$. Then
$$A_{0,j} = P_{0,j} + \sum_{i=1}^{\infty} A_{0,i} P_{i,j}, \qquad j \ge 0. \tag{2.6}$$
Proof: For $j \ge 1$, let $1 \le r \le k_0$ and $1 \le s \le k_j$. It follows from the definition of $A_{0,j}$ that
$$\begin{aligned}
A_{0,j}(r, s) &= \sum_{n=1}^{\infty} P\{Z_n = (j, s),\ Z_{n-1} \notin L_0, \ldots, Z_1 \notin L_0 \mid Z_0 = (0, r)\} \\
&= P\{Z_1 = (j, s) \mid Z_0 = (0, r)\} + \sum_{n=2}^{\infty} \sum_{i=1}^{\infty} \sum_{w=1}^{k_i} P\{Z_n = (j, s),\ Z_{n-1} = (i, w),\ Z_k \notin L_0,\ 1 \le k \le n-2 \mid Z_0 = (0, r)\} \\
&= P_{0,j}(r, s) + \sum_{i=1}^{\infty} \sum_{w=1}^{k_i} A_{0,i}(r, w) P_{i,j}(w, s).
\end{aligned}$$
For $j = 0$, rewrite $P$ as
$$P = \begin{pmatrix} P_{0,0} & U \\ D & Q \end{pmatrix}.$$
By the properties of the censored Markov chain and the definition of $A_{0,n}$, we have $(A_{0,1}, A_{0,2}, \ldots) = U\hat{Q}$ and $P^{(0)}_{0,0} = P_{0,0} + U\hat{Q}D$, where $\hat{Q} = \sum_{k=0}^{\infty} Q^k$. Therefore, the result is also true when $j = 0$.
Remark 2.5 The matrix $\hat{Q} = \sum_{k=0}^{\infty} Q^k$ in the proof is called the fundamental matrix, which can be defined for any stochastic or substochastic matrix $Q$.

A row vector $x$ is called a left (super) regular measure of $P$, if $x = xP$ ($x \ge xP$). A column vector $x$ is called a right (super) regular measure of $P$, if $x = Px$ ($x \ge Px$).
Theorem 2.6 $P$ is recurrent if and only if $P$ has a nonnegative nonzero left regular vector $\pi = (\pi_0, \pi_1, \pi_2, \ldots)$ given by $\pi_n = \pi_0 A_{0,n}$, where $\pi_0$ satisfies $\pi_0 = \pi_0 P^{(0)}_{0,0}$.

Proof: Suppose that $P$ is recurrent. Then $P^{(0)}_{0,0}$ is stochastic and irreducible, and hence positive recurrent with a positive stationary measure, $\pi_0$. Define $\pi$ by $\pi_n = \pi_0 A_{0,n}$ for $n \ge 1$. By Theorem 2.4, we have (for $n \ge 0$)
$$\pi_n = \pi_0 P_{0,n} + \sum_{k=1}^{\infty} \pi_0 A_{0,k} P_{k,n} = \pi_0 P_{0,n} + \sum_{k=1}^{\infty} \pi_k P_{k,n},$$
which means that $\pi$ is a nonnegative left regular measure of $P$.

If $P$ is transient, then $P^{(0)}_{0,0}$ is substochastic and hence $(I - P^{(0)}_{0,0})^{-1}$ exists, so that the trivial solution is the only solution of $\pi_0 = \pi_0 P^{(0)}_{0,0}$. Therefore, the vector $\pi$ defined by $\pi_n = \pi_0 A_{0,n}$ must be the $0$ vector.
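The construction $\pi_n = \pi_0 A_{0,n}$ in Theorem 2.6 can be tried on a finite chain whose level 0 holds two states. In this sketch the 4-state matrix `P`, the level partition, and all variable names are hypothetical; the closed form used for `pi0` relies only on the fact that a $2 \times 2$ stochastic matrix with off-diagonal entries $a$ and $b$ has invariant vector proportional to $(b, a)$.

```python
import numpy as np

# Hypothetical 4-state stochastic matrix; level 0 = states {0, 1}, higher levels = {2, 3}.
P = np.array([
    [0.2, 0.5, 0.2, 0.1],
    [0.3, 0.1, 0.4, 0.2],
    [0.1, 0.3, 0.2, 0.4],
    [0.4, 0.1, 0.3, 0.2],
])
lvl0, rest = [0, 1], [2, 3]

U = P[np.ix_(lvl0, rest)]                 # level 0 -> higher levels
D = P[np.ix_(rest, lvl0)]                 # higher levels -> level 0
Q = P[np.ix_(rest, rest)]                 # higher levels -> higher levels
N = np.linalg.inv(np.eye(len(rest)) - Q)  # fundamental matrix of Q

A = U @ N                                 # (A_{0,1}, A_{0,2}) side by side
P0 = P[np.ix_(lvl0, lvl0)] + U @ N @ D    # censored matrix on level 0

# Invariant vector of the 2 x 2 stochastic matrix P0 (unnormalized).
pi0 = np.array([P0[1, 0], P0[0, 1]])

pi = np.concatenate([pi0, pi0 @ A])       # pi_n = pi_0 A_{0,n}
pi_normalized = pi / pi.sum()
print(pi_normalized)
```

The vector `pi` should be invariant under `P`; normalizing it gives the stationary distribution of this finite chain.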
Corollary 2.7 $P$ is positive recurrent if and only if i) $\pi_0 = \pi_0 P^{(0)}_{0,0}$ has a nonnegative nonzero solution and ii) $\pi_0 \sum_{n=1}^{\infty} A_{0,n} < \infty$.
Corollary 2.8 If $P$ given in (1.1) is lower-Hessenberg, that is, $P_{i,j} = 0$ if $j > i + 1$, then $P$ is positive recurrent if and only if i) $\pi_0 = \pi_0 P^{(0)}_{0,0}$ has a nonnegative nonzero solution and ii) $\pi_0 \sum_{n=1}^{\infty} R_{n-1,n} < \infty$.
Proof: This follows from Corollary 2.7 and Theorem 2.1.

Remark 2.9 i) When $P$ is ergodic, $\pi_0 (A_{0,0}, A_{0,1}, A_{0,2}, \ldots)$ is the stationary probability vector of $P$, unique up to multiplication by a constant. ii) $A_{i,j}$ was introduced by Karlin and Taylor, page 35 of [7], for the case where $P$ has scalar entries. They used it to study ratio theorems and for the interpretation of generalized stationary probabilities.

Theorem 2.10 Matrices $B_{n,0}$ and $G_{n,i}$ as defined in Section 1 satisfy
$$B_{n,0} = \begin{cases} G_{1,0}, & \text{if } n = 1, \\ G_{n,0} + \sum_{i=1}^{n-1} G_{n,i} B_{i,0}, & \text{if } n \ge 2. \end{cases} \tag{2.7}$$
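Proof: For $n = 1$, the result follows immediately from the definitions. For $n \ge 2$, $1 \le r \le k_n$ and $1 \le s \le k_0$,
$$\begin{aligned}
B_{n,0}(r, s) &= \sum_{k=1}^{\infty} P\{Z_k = (0, s),\ Z_l \notin L_0,\ 1 \le l \le k-1 \mid Z_0 = (n, r)\} \\
&= \sum_{k=1}^{\infty} P\{Z_k = (0, s),\ Z_l \notin L_{\le n-1},\ 1 \le l \le k-1 \mid Z_0 = (n, r)\} \\
&\quad + \sum_{k=2}^{\infty} \sum_{t=1}^{k-1} P\{Z_k = (0, s),\ Z_l \notin L_0 \text{ for } t+1 \le l \le k-1,\ Z_t \in L_{\le n-1} \setminus L_0,\ Z_l \notin L_{\le n-1} \text{ for } 1 \le l \le t-1 \mid Z_0 = (n, r)\} \\
&= G_{n,0}(r, s) + \sum_{t=1}^{\infty} \sum_{k=t+1}^{\infty} \sum_{i=1}^{n-1} \sum_{w=1}^{k_i} P\{Z_t = (i, w),\ Z_l \notin L_{\le n-1} \text{ for } 1 \le l \le t-1 \mid Z_0 = (n, r)\} \\
&\qquad \times P\{Z_k = (0, s),\ Z_l \notin L_0 \text{ for } t+1 \le l \le k-1 \mid Z_t = (i, w)\} \\
&= G_{n,0}(r, s) + \sum_{i=1}^{n-1} \sum_{w=1}^{k_i} G_{n,i}(r, w) B_{i,0}(w, s).
\end{aligned}$$
Thus,
$$B_{n,0} = G_{n,0} + \sum_{i=1}^{n-1} G_{n,i} B_{i,0}.$$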
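Equation (2.7) can also be checked numerically. The sketch below assumes a hypothetical 4-state stochastic matrix with levels $L_0 = \{0, 1\}$, $L_1 = \{2\}$ and $L_2 = \{3\}$, so that $B_{n,0}$ is a row vector of length two; all variable names are illustrative, and the quantities are computed directly from their definitions as first-entry distributions.

```python
import numpy as np

# Hypothetical 4-state stochastic matrix.
# Levels: L0 = {0, 1}, L1 = {2}, L2 = {3}.
P = np.array([
    [0.2, 0.5, 0.2, 0.1],
    [0.3, 0.1, 0.4, 0.2],
    [0.1, 0.3, 0.2, 0.4],
    [0.4, 0.1, 0.3, 0.2],
])

# First-entry distribution into level 0 from states 2 and 3:
# B = (I - Q)^{-1} D with Q the transitions among states {2, 3}.
Q = P[2:, 2:]
B = np.linalg.solve(np.eye(2) - Q, P[2:, :2])
B10, B20 = B[0], B[1]              # B_{1,0} and B_{2,0}

# First-entry distribution into L_{<=1} = {0, 1, 2} from level 2 (state 3):
# the chain may only wait in state 3 before dropping down.
g = P[3, :3] / (1.0 - P[3, 3])
G20, G21 = g[:2], g[2]             # G_{2,0} (1 x 2) and G_{2,1} (1 x 1)

rhs = G20 + G21 * B10              # right-hand side of (2.7) for n = 2
print(B20, rhs)
```

Both printed vectors should coincide, and since this finite chain is irreducible and stochastic, $B_{2,0}$ sums to one.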
Corollary 2.11 $B_{i,0}$ is stochastic for all $i \ge 1$ if and only if $\sum_{k=0}^{i-1} G_{i,k}$ is stochastic for all $i \ge 1$.
Lemma 2.12 Let $P$ be stochastic. If $B_{i,0}$ is stochastic for all $i \ge 1$, then so is $B_{0,0}$.

Proof: Notice that
$$B_{0,0} = P_{0,0} + \sum_{i=1}^{\infty} P_{0,i} B_{i,0}$$
and that $B_{i,0}$ is stochastic for all $i \ge 1$. The result follows.

Theorem 2.13 Let $P$ be stochastic. $P$ is recurrent if and only if $\sum_{k=0}^{i-1} G_{i,k}$ is stochastic for all $i \ge 1$.
Proof: Suppose first that $P$ is recurrent. Let $f_{(i,r),(j,s)}$ be the probability that the process ever makes a transition into state $(j, s)$, given that the process starts in state $(i, r)$. We have $f_{(i,r),(j,s)} = 1$ for all $i \ge 0$, $1 \le r \le k_i$, $j \ge 0$ and $1 \le s \le k_j$. Stochasticity then follows from Corollary 2.11 and
$$f_{(i,r),(0,s)} = B_{i,0}(r, s) + \sum_{w \ne s} B_{i,0}(r, w) f_{(0,w),(0,s)}.$$
Suppose now that $\sum_{k=0}^{i-1} G_{i,k}$ is stochastic for all $i \ge 1$. Then, $B_{i,0}$ is stochastic for all $i \ge 1$ from Corollary 2.11. Partition the transition matrix $P$ according to $L_0$ and $\bar{L}_0$ as in Theorem 2.4 and consider the censored process with censoring set $L_0$, which gives us
$$P^{(0)} = P_{0,0} + U\hat{Q}D = B_{0,0}.$$
By Lemma 2.12, $B_{0,0}$ is stochastic. Now, for a Markov chain with transition matrix $B_{0,0}$, consider the censored process with the censoring set consisting of the single element $(0, 1)$. Since $B_{0,0}$ is a finite stochastic matrix, the transition matrix of the censored process is also stochastic. Hence, the only entry in this matrix is one. Take $E_1 = \{(0, 1)\}$ and $E_2 = L_0$. It follows from vi) of Lemma 2.2 that $P^{E_1} = (P^{E_2})^{E_1}$. Therefore, the only transition probability in the transition matrix of $P^{E_1}$ is equal to $f_{(0,1),(0,1)}$ according to the definition of the censored process, which in turn equals 1. Recurrence thus follows from irreducibility.
Corollary 2.14 Let $P$ be stochastic. $P$ is recurrent if and only if $B_{i,0}$ is stochastic for all $i \ge 1$.

Proof: Use Corollary 2.11 and Theorem 2.13.

Remark 2.15 It follows from the definition of $G_{i,k}$ that for any Markov chain, $\sum_{k=0}^{i-1} G_{i,k}$ is either stochastic or substochastic for any $i$.
For $i \ge 0$ and $j \ge 0$, define $M_{i,j}$ to be a matrix of size $k_i \times k_j$. When $i \ne j$, the $(r, s)$th entry of $M_{i,j}$ is the expected number of transitions needed to enter level $j$ for the first time by hitting state $(j, s)$, given that the process starts in state $(i, r)$. When $i = j$, the $(r, s)$th entry of $M_{i,j}$ is the expected number of transitions needed to return to level $j$ by hitting state $(j, s)$, given that the process starts in state $(i, r)$.

Lemma 2.16 Let $e$ be the column vector of ones. Then $M_{0,0} e = e + \sum_{k=1}^{\infty} A_{0,k} e$.

Proof: Let the $r$th row of $A_{0,n}$ be denoted by $A_{0,n}(r)$. Based on the definition of $M_{0,0}$, we have
$$\begin{aligned}
\sum_{s=1}^{k_0} M_{0,0}(r, s) &= 1 + \sum_{n=1}^{\infty} \sum_{w=1}^{k_n} E[\text{\# of visits to } (n, w) \text{ before hitting any state in level } 0 \mid Z_0 = (0, r)] \\
&= 1 + \sum_{n=1}^{\infty} A_{0,n}(r) e.
\end{aligned}$$
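Lemma 2.16 can be illustrated on a finite chain with a scalar level 0, where $M_{0,0}e$ is the mean return time to state 0. The sketch below uses a hypothetical 4-state stochastic matrix and compares $1 + \sum_n A_{0,n}$ with the reciprocal of the stationary probability of state 0, a standard identity for finite irreducible chains that is used here only as an independent check.

```python
import numpy as np

# Hypothetical 4-state stochastic matrix; state i is level i.
P = np.array([
    [0.2, 0.5, 0.2, 0.1],
    [0.3, 0.1, 0.4, 0.2],
    [0.1, 0.3, 0.2, 0.4],
    [0.4, 0.1, 0.3, 0.2],
])
n = P.shape[0]

# A0[k-1] = A_{0,k}: expected visits to level k before returning to level 0.
A0 = P[0, 1:] @ np.linalg.inv(np.eye(n - 1) - P[1:, 1:])
mean_return = 1.0 + A0.sum()          # M_{0,0} e for a scalar level 0

# Stationary distribution: solve pi (P - I) = 0 together with sum(pi) = 1.
M = np.vstack([P.T - np.eye(n), np.ones(n)])
b = np.append(np.zeros(n), 1.0)
pi = np.linalg.lstsq(M, b, rcond=None)[0]

print(mean_return, 1.0 / pi[0])       # the two values should agree
```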
The proof of the following lemma is obvious, and is therefore omitted.

Lemma 2.17 Let $P$ be stochastic. $P$ is positive recurrent if and only if $M_{0,0} e < \infty$ or $M_{0,0} < \infty$.

Corollary 2.18 Let $P$ be stochastic. Let $A = \sum_{k=1}^{\infty} A_{0,k}$. Then, $P$ is positive recurrent if and only if $A < \infty$.

Proof: This follows from Lemma 2.16 and Lemma 2.17.

Corollary 2.19 Let $P$ be stochastic. $A < \infty$ if and only if i) $\pi_0 = \pi_0 P^{(0)}_{0,0}$ has a nonnegative nonzero solution and ii) $\pi_0 \sum_{n=1}^{\infty} A_{0,n} < \infty$.

Proof: This follows from Corollary 2.7 and Corollary 2.18.
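Theorem 2.20 For any stochastic irreducible Markov chain $P$, let
$$D_i = \sum_{j=i+1}^{\infty} R_{i,j}, \qquad i \ge 0,$$
and define $B_N$, $\overline{D}_N$ and $\underline{D}_N$ by
$$B_N = \max_{0 \le i \le N} D_i, \qquad \overline{D}_N = \sup_{i \ge N} D_i \qquad \text{and} \qquad \underline{D}_N = \inf_{i \ge N} D_i.$$
a) If there exists an $N \ge 0$ such that $B_N < \infty$ and $\lim_{n \to \infty} (\overline{D}_{N+1})^n = 0$, then $P$ is positive recurrent.
b) If $P$ is positive recurrent, then $B_0 < \infty$ and $\lim_{n \to \infty} (\underline{D}_N)^n = 0$ for all $N \ge 1$.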
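Proof: a) According to Theorem 2.1, $A_{0,n} = R_{0,n} + \sum_{k=1}^{n-1} A_{0,k} R_{k,n}$ for $n \ge 1$. For convenience, we agree to write $A_{0,0} = I$. Notice that $A_{0,0}$ was defined as $P^{(0)}_{0,0}$ in Theorem 2.4. The current convention is only used in this proof. For $M > N$,
$$\begin{aligned}
\sum_{n=1}^{M} A_{0,n} &= \sum_{k=0}^{M-1} \sum_{n=k+1}^{M} A_{0,k} R_{k,n} \\
&= \sum_{k=0}^{N} A_{0,k} \left( \sum_{n=k+1}^{M} R_{k,n} \right) + \sum_{k=N+1}^{M-1} A_{0,k} \left( \sum_{n=k+1}^{M} R_{k,n} \right) \\
&\le \left( \sum_{k=0}^{N} A_{0,k} \right) B_N + \left( \sum_{k=1}^{M} A_{0,k} \right) \overline{D}_{N+1}.
\end{aligned}$$
For any $n \ge 1$, using the above inequality repeatedly leads to
$$\begin{aligned}
\sum_{k=1}^{M} A_{0,k} &\le \left( \sum_{k=0}^{N} A_{0,k} \right) B_N + \left( \sum_{k=0}^{N} A_{0,k} \right) B_N \overline{D}_{N+1} + \left( \sum_{k=1}^{M} A_{0,k} \right) \overline{D}_{N+1}^{\,2} \\
&\le \left( \sum_{k=0}^{N} A_{0,k} \right) B_N \left( \sum_{i=0}^{n-1} \overline{D}_{N+1}^{\,i} \right) + \left( \sum_{k=1}^{M} A_{0,k} \right) \overline{D}_{N+1}^{\,n}.
\end{aligned}$$
Therefore,
$$\sum_{k=1}^{M} A_{0,k} \le \left( \sum_{k=0}^{N} A_{0,k} \right) B_N \left( \sum_{i=0}^{\infty} \overline{D}_{N+1}^{\,i} \right)$$
and then
$$\sum_{k=1}^{\infty} A_{0,k} \le \left( \sum_{k=0}^{N} A_{0,k} \right) B_N \left( \sum_{i=0}^{\infty} \overline{D}_{N+1}^{\,i} \right).$$
Since $\overline{D}_{N+1}^{\,i} \to 0$ as $i \to \infty$, $\sum_{i=0}^{\infty} \overline{D}_{N+1}^{\,i}$ converges, and since by assumption $B_N < \infty$ for some $N$, $\sum_{k=1}^{\infty} A_{0,k}$ also converges. Thus, $P$ is positive recurrent by Corollary 2.18.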
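b) If $P$ is positive recurrent, then $B_0 < \infty$ by Theorem 2.1 and Corollary 2.18. For any $M \ge 1$,
$$\begin{aligned}
\sum_{n=M(N+1)}^{\infty} A_{0,n} &\ge \sum_{n=M(N+1)}^{\infty} \sum_{k=M(N+1)-1}^{n-1} A_{0,k} R_{k,n} \\
&= \sum_{k=M(N+1)-1}^{\infty} A_{0,k} \sum_{n=k+1}^{\infty} R_{k,n} \\
&\ge \sum_{k=M(N+1)-1}^{\infty} A_{0,k}\, \underline{D}_{M(N+1)-1} \\
&\ge \sum_{k=MN}^{\infty} A_{0,k}\, \underline{D}_{M(N+1)-1} \cdot \underline{D}_{M(N+1)-2} \cdots \underline{D}_{MN} \\
&\ge \sum_{k=MN}^{\infty} A_{0,k}\, \underline{D}_{MN}^{\,M} \\
&\ge \cdots \ge \left( \sum_{k=M}^{\infty} A_{0,k} \right) \underline{D}_{MN}^{\,M}\, \underline{D}_{M(N-1)}^{\,M} \cdots \underline{D}_{M}^{\,M} \\
&\ge \left( \sum_{k=M}^{\infty} A_{0,k} \right) \underline{D}_{M}^{\,MN}.
\end{aligned}$$
Since $\sum_{k=M}^{\infty} A_{0,k} > 0$ and $\sum_{k=1}^{\infty} A_{0,k} < \infty$ by Corollary 2.18, we have
$$0 = \lim_{N \to \infty} \underline{D}_{M}^{\,MN} = \left( \lim_{N \to \infty} \underline{D}_{M}^{\,N} \right)^{M}.$$
Therefore, $\lim_{N \to \infty} \underline{D}_{M}^{\,N} = 0$ for any $M \ge 1$.

Corollary 2.21 Suppose that $\overline{D}_1 = \underline{D}_1$. Then $P$ is positive recurrent if and only if $B_0 < \infty$ and $\lim_{n \to \infty} (\overline{D}_1)^n = 0$.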
Remark 2.22 i) When rows of $P$ are repeating, the above corollary gives a necessary and sufficient condition for positive recurrence, which will be discussed in the next section. ii) We will provide one example, Example 5.2, to show that the conditions cannot be sharpened any further. iii) When $R_{i,j}$ are scalars, $\lim_{n \to \infty} (\overline{D}_N)^n = 0$ for some $N \ge 0$ is equivalent to $\limsup_i D_i < 1$. However, $\lim_{n \to \infty} (\underline{D}_N)^n = 0$ for all $N \ge 1$ is not equivalent to $\liminf_i D_i < 1$.
3 Markov chains in block form and with repeating rows
In this section, we study a special type of Markov chain in block form, one that has the property of repeating rows (or columns). Markov chains of GI/M/1 type and M/G/1 type are special cases. Results given in the previous section will be sharpened and new results will also be provided. By repeating rows, we mean that the transition probability matrix partitioned as in (1.1) has the following form:
$$P = \begin{pmatrix}
P_{0,0} & P_{0,1} & P_{0,2} & P_{0,3} & \cdots \\
P_{1,0} & A_0 & A_1 & A_2 & \cdots \\
P_{2,0} & A_{-1} & A_0 & A_1 & \cdots \\
P_{3,0} & A_{-2} & A_{-1} & A_0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}, \tag{3.8}$$
where $P_{0,0}$ is a matrix of size $k_0 \times k_0$ and all $A_k$ for $k = 0, \pm 1, \pm 2, \ldots$, are matrices of size $m \times m$ with $m < \infty$. The sizes of other matrices are determined accordingly.

It is easy to see that for any censoring set $L_{\le n}$ with $n \ge 1$, the fundamental matrix of the censored Markov chain remains the same because of the repeating property of the transition matrix. The next corollary follows directly from the above property and the definition of $R_{i,j}$.
Corollary 3.1 For the block-structured Markov chain $P$ with repeating rows as given in (3.8), $R_{i,n}$ and $G_{n,i}$ depend only on the difference $n - i$ for all $1 \le i \le n - 1$.

Remark 3.2 Because of Corollary 3.1, we can write $R_{n-i} = R_{i,n}$ and $G_{n-i} = G_{n,i}$ for $i > 0$. When $i = 0$, $R_{0,n}$ may not be equal to $R_n$ and $G_{n,0}$ may not be equal to $G_n$.

Lemma 3.3 If $\sum_{k=-\infty}^{\infty} A_k$ is stochastic, then $\lim_{n \to \infty} G_{n,0} e = 0$.
Proof: For any $n \ge 1$, partition $P$ according to the censoring set $L_{\le n-1}$ and its complement:
$$P = \begin{pmatrix} T & U \\ D & Q \end{pmatrix}.$$
Denote the $(i, j)$th element of the fundamental matrix $\hat{Q}$ by $q_{i,j}$, $i, j \in \{1, 2, \ldots\}$. Then
$$\lim_{n \to \infty} G_{n,0} e = \lim_{n \to \infty} \sum_{k=0}^{\infty} q_{1,k+1} P_{n+k,0} e = \sum_{k=0}^{\infty} \lim_{n \to \infty} q_{1,k+1} P_{n+k,0} e = 0,$$
since $\lim_{n \to \infty} P_{n+k,0} e = 0$ for all $k \ge 0$. The interchange of the limit and the summation is justified by the dominated convergence theorem. Let $f_{n,k} = q_{1,k+1} P_{n+k,0} e$. Then $f_{n,k} \le q_{1,k+1} P_{1+k,0} e$ for all $n \ge 1$ since $P_{n,0} e$ is decreasing. Also, $\sum_{k=0}^{\infty} q_{1,k+1} P_{1+k,0} e < \infty$. The dominated convergence theorem now applies.

Theorem 3.4 Let $P$ be stochastic. a) Let $R_0 = \sum_{n=1}^{\infty} R_{0,n}$ and $R = \sum_{n=1}^{\infty} R_n$. The Markov chain $P$ in (3.8) is positive recurrent if and only if $R_0 < \infty$ and
$$\lim_{k \to \infty} R^k = 0. \tag{3.9}$$
b) Let $G = \sum_{n=1}^{\infty} G_n$. If $\sum_{k=-\infty}^{\infty} A_k$ is stochastic, then the Markov chain $P$ in (3.8) is recurrent if and only if $G$ is stochastic.

Proof: a) Let $A = \sum_{k=1}^{\infty} A_{0,k}$. From Theorem 2.1, we see that $A = R_0 + AR$. If $P$ is positive recurrent, then $0 < R_0 < \infty$ by Corollary 2.18. By repeatedly using $A = R_0 + AR$, we can write
$$A = R_0 \sum_{k=0}^{n-1} R^k + A R^n \ge R_0 \sum_{k=0}^{n-1} R^k, \qquad \text{for any } n \ge 2.$$
Since $R_0 > 0$, it follows that $\sum_{k=0}^{\infty} R^k < \infty$ and therefore $\lim_{k \to \infty} R^k = 0$. To prove the other half of the conclusion, use $\sum_{k=0}^{\infty} R^k < \infty$ since $\lim_{k \to \infty} R^k = 0$ and use $A = R_0 + AR$ repeatedly to obtain
$$A = R_0 \sum_{k=0}^{\infty} R^k < \infty.$$
The positive recurrent property of $P$ now follows from Corollary 2.18.

b) If $P$ is recurrent, then for all $n \ge 0$, $B_{n,0}$ is stochastic from Corollary 2.14. It follows from Theorem 2.10 that for $n \ge 2$,
$$e = B_{n,0} e = G_{n,0} e + \sum_{k=1}^{n-1} G_{n,k} B_{k,0} e = G_{n,0} e + \sum_{k=1}^{n-1} G_{n,k} e = G_{n,0} e + \sum_{i=1}^{n-1} G_i e.$$
Taking the limit as $n \to \infty$, it follows that $G$ is stochastic. By using the repeating property of the transition probabilities and the assumption that $\sum_{k=-\infty}^{\infty} A_k$ is stochastic, we have
$$P_{n,0} e = (P_{n+1,0} + P_{n+1,1}) e, \qquad \text{for all } n \ge 1. \tag{3.10}$$
For any censoring set $L_{\le n}$ with $n \ge 1$, the fundamental matrix $\hat{Q} = (q_{i,j})_{i,j \in \{1,2,\ldots\}}$ is independent of $n$. Consider two censoring sets $L_{\le n}$ and $L_{\le n-1}$. We have
$$G_{n+1,i} = \sum_{k=1}^{\infty} q_{1,k} P_{n+k,i}, \qquad i = 0, 1 \tag{3.11}$$
and
$$G_{n,0} = \sum_{k=1}^{\infty} q_{1,k} P_{n-1+k,0}. \tag{3.12}$$
Combining (3.11) and (3.12), together with (3.10), leads to
$$G_{n,0} e = (G_{n+1,0} + G_{n+1,1}) e, \qquad \text{for all } n \ge 1. \tag{3.13}$$
Now, for any $n \ge 1$, by (3.13) and $G_{n,k} = G_{n-k}$ for $k = 1, 2, \ldots, n-1$, we have
$$\sum_{k=0}^{n-1} G_{n,k} e = \sum_{k=0}^{n} G_{n+1,k} e = \lim_{N \to \infty} \left( G_{N,0} + \sum_{k=1}^{N-1} G_k \right) e = \lim_{N \to \infty} G_{N,0} e + G e = G e = e.$$
The second last equality follows from Lemma 3.3. Finally, we complete the proof by using Corollary 2.14.
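The condition $\lim_{k \to \infty} R^k = 0$ in Theorem 3.4 a) is a statement about the spectral radius of $R$, not about its row sums. The sketch below uses a hypothetical $2 \times 2$ nonnegative matrix (chosen for illustration only, not an example from the paper) whose first row sums to more than one while its powers still vanish, so that the series $\sum_k R^k$ used in the proof converges to $(I - R)^{-1}$.

```python
import numpy as np

# Hypothetical nonnegative matrix: the first row sums to 1.1, yet both eigenvalues are 0.5.
R = np.array([
    [0.5, 0.6],
    [0.0, 0.5],
])

row_sums = R.sum(axis=1)
rho = max(abs(np.linalg.eigvals(R)))          # spectral radius of R

# Powers of R tend to zero exactly when the spectral radius is below one.
R_200 = np.linalg.matrix_power(R, 200)

# Partial sums of sum_k R^k approach (I - R)^{-1}.
partial = sum(np.linalg.matrix_power(R, k) for k in range(200))
neumann = np.linalg.inv(np.eye(2) - R)

print(row_sums, rho)
print(np.abs(R_200).max())
print(np.abs(partial - neumann).max())
```

In the scalar case the two notions coincide, which is why the condition reduces to $R < 1$ there.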
Remark 3.5 i) We can use a) and b) of Theorem 3.4 to give a necessary and sufficient condition for a null recurrent Markov chain. ii) For the scalar case, the condition $\lim_{k \to \infty} R^k = 0$ in a) of Theorem 3.4 is equivalent to $R < 1$. However, for the block case, this condition is not equivalent to $R$ being substochastic (see Example 5.5).

When the transition matrix $P$ has the repeating property, one may analyze the Markov chain using the generating function technique. In the scalar case (all blocks in $P$ are numbers) it is well known that the factorization of the so-called queueing equation plays a key role. This factorization was obtained in Grassmann (1985) in terms of $R_n$ and $G_n$. In the following, we discuss the factorization for the block-structured Markov chain $P$.

Let $P^{n}_{n-i,n}$ be the $(n-i, n)$th block of the transition matrix of the censored Markov chain with censoring set $L_{\le n}$. Grassmann and Heyman showed that if $P$ is ergodic, then $P^{n}_{n-i,n}$ is independent of $n$ if $n > i \ge 0$. By using the same argument one may see that this claim is also true for a non-ergodic $P$. Let $\Phi_i = P^{n}_{n-i,n}$ for $0 \le i < n$.
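Lemma 3.6 Let $E_0 = I - \Phi_0$ and define $R(z)$, $G(z)$ and $Q(z)$ by
$$R(z) = -I + \sum_{i=1}^{\infty} R_i z^i, \tag{3.14}$$
$$G(z) = -I + \sum_{i=1}^{\infty} G_i z^{-i} \tag{3.15}$$
and
$$Q(z) = -I + \sum_{i=-\infty}^{\infty} A_i z^i. \tag{3.16}$$
Then
$$Q(z) = C_0 - R(z) E_0 G(z), \tag{3.17}$$
where
$$C_0 = (C - I) + (I - R) E_0 (I - G) \tag{3.18}$$
with $C = \sum_{i=-\infty}^{\infty} A_i$, $R = \sum_{i=1}^{\infty} R_i$ and $G = \sum_{i=1}^{\infty} G_i$.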
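Proof: This follows from direct algebraic manipulations. Some steps are shown below.
$$\begin{aligned}
R(z) E_0 G(z) &= E_0 - \sum_{i=1}^{\infty} R_i E_0 z^i - \sum_{i=1}^{\infty} E_0 G_i z^{-i} + \sum_{i=1}^{\infty} \left( \sum_{j=1}^{i} R_j E_0 G_i z^{j-i} + \sum_{j=i+1}^{\infty} R_j E_0 G_i z^{j-i} \right) \\
&= E_0 - \sum_{i=1}^{\infty} R_i E_0 z^i - \sum_{i=1}^{\infty} E_0 G_i z^{-i} + \sum_{k=0}^{\infty} \sum_{j=1}^{\infty} R_j E_0 G_{k+j} z^{-k} + \sum_{k=1}^{\infty} \sum_{i=1}^{\infty} R_{i+k} E_0 G_i z^{k} \\
&= E_0 + \sum_{j=1}^{\infty} R_j E_0 G_j + \sum_{k=1}^{\infty} \left( \sum_{j=1}^{\infty} R_j E_0 G_{k+j} - E_0 G_k \right) z^{-k} + \sum_{k=1}^{\infty} \left( \sum_{j=1}^{\infty} R_{j+k} E_0 G_j - R_k E_0 \right) z^{k} \\
&= E_0 + \sum_{j=1}^{\infty} R_j E_0 G_j + \sum_{k=1}^{\infty} (-A_{-k}) z^{-k} + \sum_{k=1}^{\infty} (-A_k) z^{k} \\
&= C_0 - Q(z).
\end{aligned}$$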
Remark 3.7 i) The last equation can be either proved directly or using the second last equation at $z = 1$. ii) For any irreducible Markov chain, there exist real numbers $\rho_1$ and $\rho_2$ satisfying $\rho_1 \ge 1$ and $0 < \rho_2 \le 1$ such that both $R(z)$ and $G(z)$ are at least defined in $\rho_2 \le |z| \le \rho_1$. iii) In the above proof, we used the following two equations characterizing the relationships between $R_i$ and $A_i$, and $G_i$ and $A_i$. The equations were proved by Grassmann and Heyman (1990) for ergodic Markov chains. In the proof, properties of censored Markov chains are used, which are still valid for non-ergodic Markov chains. Therefore, these equations remain true for non-ergodic Markov chains.
$$R_i (I - \Phi_0) = A_i + \sum_{m=1}^{\infty} R_{i+m} (I - \Phi_0) G_m,$$
$$(I - \Phi_0) G_j = A_{-j} + \sum_{m=1}^{\infty} R_m (I - \Phi_0) G_{j+m}, \qquad i \ge 1,\ j \ge 1.$$
Theorem 3.8 For an irreducible Markov chain with repeating rows as given in (3.8),
a) $Q(z) - Q(1) = R(1) E_0 G(1) - R(z) E_0 G(z)$; and
b) if the Markov chain is recurrent, then
$$Q(z) e = -R(z) E_0 G(z) e. \tag{3.19}$$
Proof: a) This follows directly from Lemma 3.6. b) This follows since the Markov chain is recurrent if and only if $G$ is stochastic and $\sum_{i=-\infty}^{\infty} A_i$ is stochastic.
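The factorization (3.17) can be checked by hand in the simplest scalar case. The sketch below assumes a nearest-neighbour walk with $A_1 = p$, $A_{-1} = q = 1 - p$ and all other $A_k = 0$, and it assumes the standard closed forms $R_1 = \min(1, p/q)$, $G_1 = \min(1, q/p)$ and $\Phi_0 = p\,G_1$ for such a walk. These values and the function name `factorization_sides` are illustrative assumptions, not results quoted from the paper.

```python
import cmath

def factorization_sides(p, z):
    """Return (Q(z), C0 - R(z) E0 G(z)) for a nearest-neighbour walk with up-probability p."""
    q = 1.0 - p
    r = min(1.0, p / q)        # assumed value of R_1 (R_i = 0 for i >= 2)
    g = min(1.0, q / p)        # assumed value of G_1 (G_i = 0 for i >= 2)
    E0 = 1.0 - p * g           # E0 = 1 - Phi_0, with Phi_0 = A_1 * G_1 assumed
    R_z = -1.0 + r * z         # (3.14)
    G_z = -1.0 + g / z         # (3.15)
    Q_z = -1.0 + p * z + q / z # (3.16)
    C = p + q
    C0 = (C - 1.0) + (1.0 - r) * E0 * (1.0 - g)   # (3.18)
    return Q_z, C0 - R_z * E0 * G_z

z = cmath.exp(0.7j)            # an arbitrary point on the unit circle
for p in (0.3, 0.5, 0.8):
    lhs, rhs = factorization_sides(p, z)
    print(p, abs(lhs - rhs))   # differences should be at rounding level
```

Under these assumptions the two relations of Remark 3.7 iii) reduce to $R_1 E_0 = p$ and $E_0 G_1 = q$, which hold in all three regimes $p < q$, $p = q$ and $p > q$.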
4 Markov chains with scalar entries
When all entries, except the boundaries, of the Markov chain defined in (1.1) are scalars, all results given in previous sections are reduced to scalar form. We will not repeat these results. We only give those results which need additional attention or are not valid for block-structured Markov chains. Therefore, in most cases of this section, we will study the Markov chain with the repeating property. The transition probability matrix is assumed to be
$$P = \begin{pmatrix}
p_{0,0} & p_{0,1} & \cdots & p_{0,m-1} & p_{0,m} & p_{0,m+1} & p_{0,m+2} & \cdots \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots & \ddots \\
p_{m-1,0} & p_{m-1,1} & \cdots & p_{m-1,m-1} & p_{m-1,m} & p_{m-1,m+1} & p_{m-1,m+2} & \cdots \\
p_{m,0} & p_{m,1} & \cdots & p_{m,m-1} & \beta_0 & \beta_1 & \beta_2 & \cdots \\
p_{m+1,0} & p_{m+1,1} & \cdots & p_{m+1,m-1} & \beta_{-1} & \beta_0 & \beta_1 & \cdots \\
p_{m+2,0} & p_{m+2,1} & \cdots & p_{m+2,m-1} & \beta_{-2} & \beta_{-1} & \beta_0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}. \tag{4.20}$$

Lemma 4.1 Let
$$f(z) = -1 + \sum_{k=1}^{\infty} c_k z^k \tag{4.21}$$
be a function with $c_k \ge 0$ for all $k \ge 1$. Then, a) all solutions of $f(z) = 0$ lie in the area $|z| > 1$ outside the unit circle if and only if $\sum_{k=1}^{\infty} c_k < 1$. b) All solutions of $f(z) = 0$ lie in the area $|z| \ge 1$ outside or on the unit circle if and only if $\sum_{k=1}^{\infty} c_k \le 1$.
Proof: The proofs of a) and b) are similar, so we only give the details of the proof of a) here. If $\sum_{k=1}^{\infty} c_k < 1$, let $x = \alpha e^{i\theta}$ be the complex expression of an arbitrary solution of $f(z) = 0$. We want to show that $|x| > 1$. Suppose that this were not true. Then
$$1 \le \sum_{k=1}^{\infty} |c_k \alpha^k e^{ik\theta}| = \sum_{k=1}^{\infty} c_k \alpha^k \le \sum_{k=1}^{\infty} c_k < 1,$$
which is a contradiction. Conversely, suppose that all the solutions of $f(z) = 0$ lie outside the unit circle, but $\sum_{k=1}^{\infty} c_k \ge 1$. Then
$$f(1) = -1 + \sum_{k=1}^{\infty} c_k \ge 0.$$
Since all the solutions of $f(z) = 0$ lie outside the unit circle, $f(1) > 0$. Also, we know that $f(0) = -1$. Therefore, there exists a solution inside the unit circle, which is a contradiction.
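Lemma 4.1 a) is easy to probe numerically when only finitely many $c_k$ are nonzero, because $f$ is then a polynomial. The sketch below uses `numpy.roots`; the helper name `zeros_of_f` and the two coefficient lists are hypothetical choices for illustration.

```python
import numpy as np

def zeros_of_f(c):
    """Zeros of f(z) = -1 + sum_{k>=1} c[k-1] z^k for a finite coefficient list c."""
    # numpy.roots expects coefficients from the highest degree down to the constant term.
    coeffs = list(c[::-1]) + [-1.0]
    return np.roots(coeffs)

small = zeros_of_f([0.2, 0.3, 0.4])   # c_1 + c_2 + c_3 = 0.9 < 1
large = zeros_of_f([0.5, 0.7])        # c_1 + c_2 = 1.2 >= 1

print(np.abs(small).min())   # smallest modulus exceeds 1
print(np.abs(large).min())   # a zero appears inside the unit circle
```

When the coefficients sum to less than one every zero has modulus greater than one, while in the second case $f(0) < 0 < f(1)$ forces a real zero in $(0, 1)$.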
For scalar cases, we will often use lower case letters for notation instead of upper case. For example, we will use $r_{i,j}$, $g_{i,j}$, $a_{i,j}$, $b_{i,j}$, etc. for the corresponding measures $R_{i,j}$, $G_{i,j}$, $A_{i,j}$, $B_{i,j}$, etc. Now, let us consider the functions $R(z)$, $G(z)$ and $Q(z)$ defined in Section 3. We use $r_i$ and $g_i$ to denote $R_i$ and $G_i$. Theorem 3.4 and Lemma 4.1 directly lead to the following theorem.

Theorem 4.2 For the Markov chain $P$ in (4.20), a) $P$ is positive recurrent if and only if $\sum_{k=i+1}^{\infty} r_{m,k} < \infty$ for $i = 0, 1, \ldots, m-1$, and all the solutions of $R(z) = 0$ lie outside the unit circle. b) If $\sum_{k=-\infty}^{\infty} \beta_k = 1$, then $P$ is transient if and only if all the solutions of $G(z) = 0$ lie inside the unit circle. c) If $\sum_{k=-\infty}^{\infty} \beta_k = 1$, then $P$ is null recurrent if and only if at least one solution of $R(z) = 0$ and at least one solution of $G(z) = 0$ lie on the unit circle.
Remark 4.3 From the above theorem and the factorization of the queueing equation $Q(z)$ in Theorem 3.8, we can find the relationship between $r_k$ and the roots. For example, let $P$ be given by (4.20) and additionally assume that $p_{i,j} = \beta_{j-i}$, $i = 0, 1, \ldots, m-1$ and $j \ge m$, and $\beta_n = 0$ whenever $n > m$. If $P$ is ergodic, then the stationary probabilities $\pi_k$ for $k \ge m$ can be explicitly expressed as a linear combination of $m$ geometric terms. The inverses $1/\theta_k$ ($k = 1, 2, \ldots, m$) of the $m$ geometric parameters are the zeros of the characteristic function, which lie outside the unit circle. The characteristic function is our $Q(z)$, which can be factorized into
$$Q(z) = -R(z) E_0 G(z), \tag{4.22}$$
where $R(z) = 0$ gives all zeros $1/\theta_k$, $k = 1, 2, \ldots, m$. We then obtain the relationship between $r_k$ and $\theta_k$: $r_k = (-1)^{k-1} \sum_{1 \le i_1}$ ... 1 whenever $(-1 + 5)/2 < \rho < 1$.

Before we make final conclusions, we would like to mention that since we are dealing with infinite matrices, the associativity of matrix multiplication cannot be taken for granted. Many proofs would have been much simpler if we had dealt with finite matrices. For example, let $P$ be a stochastic matrix partitioned according to $\{i\}$ and its complement:
$$P = \begin{pmatrix} p_{i,i} & U \\ D & Q \end{pmatrix}.$$
Since $D = (I - Q)e$, we can write
$$U\hat{Q}D = U\hat{Q}[(I - Q)e].$$
Since associativity does not always hold, one cannot write it as
$$U\hat{Q}D = U[\hat{Q}(I - Q)]e = Ue = 1 - p_{i,i}.$$
Otherwise, $P$ would always be recurrent because the probability of ever returning to $i$ is
$$f_{i,i} = p_{i,i} + U\hat{Q}D = 1.$$

In this paper, we have studied the measures $R_{i,j}$, $G_{i,j}$, and related ones to characterize Markov chains partitioned in block form. Necessary and sufficient conditions for recurrence have been found for a general Markov chain. Necessary and sufficient conditions for positive recurrence have been obtained separately for a general Markov chain, where the conditions cannot be improved any further. When the Markov chain has the repeating property, we improved the necessary and sufficient condition for recurrence, and obtained a necessary and sufficient condition for positive recurrence. We have also discussed the decomposition of the characteristic function $Q(z)$ of $A_i$ for both the block case and scalar case, which can be factorized into the product of the characteristic functions of $R_i$ and $G_i$ if the Markov chain is recurrent. For the scalar case, we have further located the zeros of the characteristic functions of $R_i$ and $G_i$. In terms of these zeros, we can fully describe properties of transience, recurrence and positive recurrence of Markov chains. This work also builds a bridge between the root-finding method and other methods.
Acknowledgement The authors thank the referee for the valuable suggestions and comments and acknowledge that this work was supported by research grants from the Natural Sciences and Engineering Research Council of Canada (NSERC). Dr. Wei Li also acknowledges the financial support of the University of Winnipeg.
References

[1] Asmussen, S. (1987) Applied Probability and Queues, Wiley, Chichester.
[2] Freedman, David (1983) Approximating Countable Markov Chains, 2nd edn, Springer-Verlag, New York.
[3] Gibson, D. and Seneta, E. (1987) Augmented truncations of infinite stochastic matrices. J. Appl. Prob. 24, 600–608.
[4] Grassmann, W.K. and Heyman, D.P. (1990) Equilibrium distribution of block-structured Markov chains with repeating rows. J. Appl. Prob. 27, 557–576.
[5] Grassmann, W.K. and Heyman, D.P. (1993) Computation of steady-state probabilities for infinite-state Markov chains with repeating rows. ORSA J. on Computing 5, 292–303.
[6] Haight, F.A. (1958) Two queues in parallel. Biometrika 45, 401–410.
[7] Karlin, S. and Taylor, H.M. (1981) A Second Course in Stochastic Processes, Academic Press, New York.
[8] Kemeny, J.G., Snell, J.L. and Knapp, A.W. (1976) Denumerable Markov Chains, 2nd edn, Springer-Verlag, New York.
[9] Neuts, M.F. (1981) Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach, The Johns Hopkins University Press, Baltimore.
[10] Neuts, M.F. (1989) Structured Stochastic Matrices of M/G/1 Type and Their Applications, Marcel Dekker Inc., New York.