The Annals of Applied Probability 1997, Vol. 7, No. 3, 772-801

ON THE CONVERGENCE OF MULTITYPE BRANCHING PROCESSES WITH VARYING ENVIRONMENTS

BY OWEN DAFYDD JONES

University of Sheffield

Using the ergodic theory of nonnegative matrices, conditions are obtained for the $L^2$ and almost sure convergence of a supercritical multitype branching process with varying environment, normed by its mean. We also give conditions for the extinction probability of the limit to equal that of the process. The theory developed allows for different types to grow at different rates, and an example of this is given, taken from the construction of a spatially inhomogeneous diffusion on the Sierpinski gasket.

1. Introduction and statement of results. A multitype branching process with varying environment (MTBPVE) generalizes the classical multitype branching or Galton-Watson process. For a finite number $d$ of types, we allow the number of type $j$ offspring of a type $i$ parent at time $n$ to depend on $i$, $j$ and $n$. In what follows, we give second moment conditions under which a MTBPVE normed by its mean, whose mean matrices are weakly ergodic, converges a.s. and in $L^2$ to a nontrivial limit. These conditions generalize those of Harris (1963) for multitype fixed environment processes and those of Fearn (1971) and Jagers (1974) for single-type varying environment processes. Notably, if the mean matrices are well behaved in some sense, then our $L^2$ convergence condition is best possible.

Our results give conditions under which a MTBPVE grows like its mean, which in this case is given by a forward product of nonnegative matrices. Nonnegative matrix products can exhibit more than one rate of growth, in the sense that, as additional factors are added to the product, different elements of the product can grow at different rates. This opens up the possibility of MTBPVE with more than one rate of growth. Indeed, in Section 4 we give an example of a MTBPVE with two distinct growth rates, arising from the construction of a spatially inhomogeneous diffusion on the Sierpinski gasket (a simple fractal). In order to analyze growth rates better, the discussion of ergodic theory given in Section 2 goes beyond that strictly required for our convergence results, in particular looking at strong ergodicity and some

Received July 1995; revised December 1996, March 1997.
1 Research partially supported by U.K. Engineering and Physical Sciences Research Council.
AMS 1991 subject classifications. Primary 60J80; secondary 15A48.
Key words and phrases. Branching process, multitype, varying environment, ergodic matrix products.


related ideas. However, this extra analysis will be needed for the example in Section 4. In addition to results on the convergence of the normed process, we also derive conditions for the extinction probability of the limit to equal that of the process. This result will also be applied in Section 4.

We will adopt the following notation for the remainder of the paper. For a matrix $A \in \mathbb{R}^{d\times d}$, write $A(i,j)$ for its $(i,j)$th element, $A(i,\cdot)$ for the row vector given by its $i$th row and $A(\cdot,j)$ for the column vector given by its $j$th column. Similarly, for a vector $a \in \mathbb{R}^d$, write $a(i)$ for its $i$th component. The vector of 1s will be written $\mathbf{1}$ and the unit vector with a 1 in position $i$ will be written $e_i$. A (nonnegative) matrix is called row/column allowable if each row/column has a nonzero component. A row and column allowable matrix is simply called allowable. Clearly, the product of allowable matrices is also allowable. Write $A \ge 0$ or $a \ge 0$ if every element of $A$ or $a$ is $\ge 0$, and write $A > 0$ or $a > 0$ if every element is greater than 0. Unless stated otherwise, we will assume that all matrices and vectors dealt with are nonnegative. A nonnegative matrix $A \in \mathbb{R}^{d\times d}$ is called primitive if there exists an $n$ such that $A^n > 0$. For such $A$ we write $\mathrm{PF}(A)$ for its (unique, real) largest eigenvalue, that is, its spectral radius, and $\mathrm{LPF}(A)$ and $\mathrm{RPF}(A)$ for the corresponding (unique, strictly positive) left and right eigenvectors respectively, normed to be probability vectors. Here, PF stands for Perron-Frobenius.

Suppose that the offspring distributions of the process are given by a sequence of $\mathbb{Z}_+^{d\times d}$ valued r.v.s $\{X_n\}_{n=0}^{\infty}$. That is, the distribution of the number of type $j$ children born to a single type $i$ parent at time $n$ is the same as that of $X_n(i,j)$. Define $M_n = E X_n$, $V_n[i] = \mathrm{Cov}\, X_n(i,\cdot)$ and $\sigma_n^2(i,j) = \mathrm{Var}\, X_n(i,j) = V_n[i](j,j)$. We will assume that the $\{M_n\}_{n=0}^{\infty}$ are finite in all that follows. Unless otherwise stated, we will also assume that they are allowable.

For fixed $m \ge 0$, let $Z_m = \{Z_{m,n}\}_{n=m}^{\infty}$ be the branching process defined in the usual way [see, e.g., Asmussen and Hering (1983) or Athreya and Ney (1972)], letting $Z_{m,n}(i,j)$ be the number of type $j$ descendants at time $n$ of a single type $i$ parent at time $m$. Note that, as defined, $Z_m$ takes on values in $\mathbb{Z}_+^{d\times d}$, where the rows of $Z_m$ are independent processes. For a sequence of matrices $\{A_n\}_{n=0}^{\infty}$, we will write $A_{m,n}$ for the forward product from $m$ to $n-1$. That is, $A_{m,n} = A_m A_{m+1} \cdots A_{n-1}$. It follows from the branching property of $Z_m$ that for any $m \le n \le p$,
\[
E(Z_{m,p} \mid Z_{m,n}) = Z_{m,n} M_{n,p}.
\]
Our tool for dealing with the matrix product $M_{m,n}$ is the ergodic theory of nonnegative matrices. Ergodic theory for nonnegative matrices can be viewed as a generalization of the Perron-Frobenius theory, as it describes the growth and limiting behavior of matrix products. We say that the matrices $\{M_n\}$ are weakly ergodic if for all $m \ge 0$ the forward product $M_{m,n}$ is strictly positive and of rank 1 in the limit as $n \to \infty$. A precise definition is given in Section 2.
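The forward products and norming constants above are easy to experiment with numerically. The following short Python sketch (the $2\times 2$ mean matrices are made up for illustration and are not taken from the paper) computes $M_{m,n} = M_m M_{m+1}\cdots M_{n-1}$ and the column sums $\mathbf{1}^T M_{m,n}(\cdot,j)$ that will be used below as norming constants.

```python
import numpy as np

def forward_product(A_seq, m, n):
    """A_{m,n} = A_m A_{m+1} ... A_{n-1}; the empty product A_{m,m} is the identity."""
    P = np.eye(A_seq[0].shape[0])
    for k in range(m, n):
        P = P @ A_seq[k]
    return P

# Hypothetical 2-type mean matrices that vary with the time (environment) n.
M_seq = [np.array([[1.0 + 0.5 / (n + 1), 0.3],
                   [0.2,                 1.1]]) for n in range(50)]

M_0_20 = forward_product(M_seq, 0, 20)
print("M_{0,20} =\n", M_0_20)
print("column sums 1^T M_{0,20}(., j) =", M_0_20.sum(axis=0))
```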


A fundamental tool used in the development of ergodic theory for matrix products is Birkhoff's contraction coefficient. For $x, y \in \mathbb{R}^d$, $x, y > 0$, put
\[
\rho(x^T, y^T) = \log \frac{\max_i x(i)/y(i)}{\min_i x(i)/y(i)} = \max_{i,j} \log \frac{x(i)\,y(j)}{x(j)\,y(i)}.
\]

The function $\rho$ is often called a projective distance since $\rho(x^T,y^T)=0$ if and only if $x = \lambda y$ for some $\lambda > 0$ and $\rho(\alpha x^T, \beta y^T) = \rho(x^T, y^T)$ for all scalars $\alpha, \beta > 0$. For a nonnegative column allowable matrix $A \in \mathbb{R}^{d\times d}$, Birkhoff's contraction coefficient is defined as
\[
\tau(A) = \sup_{x,y>0,\; x \ne \lambda y} \frac{\rho(x^T A, y^T A)}{\rho(x^T, y^T)}.
\]

It is easily shown that $0 \le \tau(A) \le 1$ and that for any other nonnegative column allowable matrix $B \in \mathbb{R}^{d\times d}$, $\tau(AB) \le \tau(A)\tau(B)$. If $A$ is allowable then
\[
\tau(A) = \frac{1 - \sqrt{\phi(A)}}{1 + \sqrt{\phi(A)}},
\qquad\text{where}\quad
\phi(A) =
\begin{cases}
\displaystyle \min_{i,j,k,l} \frac{A(i,k)\,A(j,l)}{A(j,k)\,A(i,l)}, & A > 0, \\[1ex]
0, & \text{otherwise}.
\end{cases}
\]
[This result is due originally to Birkhoff (1957), but see also Seneta (1981), Section 3.4.] It follows that for allowable $A$, $\tau(A) < 1$ if and only if $A > 0$ and $\tau(A) = 0$ if and only if $A = wv^T$ for some strictly positive $w, v \in \mathbb{R}^d$. Given this and our definition of weak ergodicity, it should come as no surprise that
\[
\{M_n\}\ \text{are weakly ergodic}
\iff
\forall\, m \ge 0,\ \tau(M_{m,n}) \to 0 \text{ as } n \to \infty.
\]
[See Seneta (1981), Lemma 3.3.] Define diagonal matrices ${}_m R_n = \mathrm{diag}({}_m R_n(1), \ldots, {}_m R_n(d))$ for $0 \le m \le n$ by
\[
{}_m R_n(j) = \mathbf{1}^T M_{m,n}(\cdot, j).
\]
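As a concrete illustration of the formula for $\tau$ in terms of $\phi$, the sketch below (again with invented matrices) evaluates Birkhoff's coefficient along a forward product; by submultiplicativity, $\tau(M_{0,n})$ should decay towards 0, which is the numerical signature of weak ergodicity.

```python
import numpy as np

def phi(A):
    """phi(A) = min_{i,j,k,l} A(i,k)A(j,l) / (A(j,k)A(i,l)) if A > 0, else 0."""
    if np.any(A <= 0):
        return 0.0
    d = A.shape[0]
    return min(A[i, k] * A[j, l] / (A[j, k] * A[i, l])
               for i in range(d) for j in range(d)
               for k in range(d) for l in range(d))

def tau(A):
    """Birkhoff's contraction coefficient for an allowable matrix A."""
    p = phi(A)
    return (1.0 - np.sqrt(p)) / (1.0 + np.sqrt(p))

rng = np.random.default_rng(0)
M_seq = [np.array([[1.2, 0.4], [0.3, 1.0]]) + 0.05 * rng.random((2, 2))
         for _ in range(30)]

P = np.eye(2)
for n, M in enumerate(M_seq, start=1):
    P = P @ M                          # P = M_{0,n}
    if n in (2, 5, 10, 20, 30):
        print(f"tau(M_0,{n}) = {tau(P):.3e}")
```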

The main results of the paper follow.

THEOREM 1 ($L^2$ convergence theorem). If the $\{M_n\}$ are allowable and weakly ergodic with column limit vectors $\{w_m\}$ and if for some $m \ge 0$,
\[
(1)\qquad \sum_{n=m}^{\infty} \sum_{i,j} \frac{{}_m R_n(i)\,\sigma_n^2(i,j)}{{}_m R_{n+1}^2(j)} < \infty,
\]
then there exists a r.v. $L_m \ge 0$ such that $E L_m = w_m$ and
\[
Z_{m,n}\,{}_m R_n^{-1} \xrightarrow{L^2} L_m \mathbf{1}^T \quad\text{as } n \to \infty.
\]

THEOREM 2 (Almost sure convergence theorem). If in addition to the conditions of Theorem 1 we have
\[
(2)\qquad \sum_{n=m}^{\infty} \sum_{i,j} \frac{(n+1-m)\,{}_m R_n(i)\,\sigma_n^2(i,j)}{{}_m R_{n+1}^2(j)} < \infty
\]
and there exists a $C < \infty$ such that for all $n \ge m$,
\[
(3)\qquad \sum_{p=n}^{\infty} \tau(M_{n,p})^2 \le (n+1-m)\,C,
\]
then in addition to $L^2$ convergence we have
\[
Z_{m,n}\,{}_m R_n^{-1} \to L_m \mathbf{1}^T \quad\text{a.s. as } n \to \infty.
\]
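To see Theorems 1 and 2 in action, the following simulation sketch uses a hypothetical Poisson offspring law (so that $\sigma_n^2(i,j) = M_n(i,j)$; the matrices are invented and are not the paper's example) and tracks a single row of $Z_{m,n}\,{}_m R_n^{-1}$. The entries of the normed row should settle down to a common random value unless the line dies out, in which case they are eventually 0.

```python
import numpy as np

rng = np.random.default_rng(1)
d, N = 2, 25

# Hypothetical offspring law: type-j children of a type-i parent at time n
# are Poisson with mean M_n(i, j), so sigma_n^2(i, j) = M_n(i, j).
M_seq = [np.array([[1.3, 0.4], [0.5, 1.2]]) + 0.1 / (n + 1) for n in range(N)]

Z = np.array([1.0, 0.0])   # Z_{0,n}(1, .): one type-1 ancestor at time 0
P = np.eye(d)              # running forward product M_{0,n}
for n, M in enumerate(M_seq):
    # The sum of Z[i] iid Poisson(M[i,j]) variables is Poisson(Z[i] * M[i,j]).
    Z = np.array([float(rng.poisson((M[:, j] * Z).sum())) for j in range(d)])
    P = P @ M
    R = P.sum(axis=0)      # {}_0R_{n+1}(j) = 1^T M_{0,n+1}(., j)
    if (n + 1) % 5 == 0:
        print(f"n = {n + 1:2d}:  normed row Z / R =", Z / R)
```

For this toy example the terms of series (1) decay geometrically (the norming constants grow geometrically while the variances stay bounded), so the hypotheses of both theorems are comfortably satisfied.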

Here condition (2) is a strengthening of the variance condition (1), while condition (3) constrains the speed at which the mean matrix $M_{m,n}$ tends to a rank 1 matrix. The proofs are given in Section 3.

If the $\{M_n\}$ are well behaved, then not only is condition (1) necessary for $L^2$ convergence, but we can also dispense with condition (3) for a.s. convergence. The following corollary details what we mean by "well behaved," and is the form of these results used in Section 4. Its proof can also be found in Section 3.

COROLLARY 3 (Necessary and sufficient variance condition). Suppose we are given rescaling matrices $D_n = \mathrm{diag}(D_n(1), \ldots, D_n(d))$ for all $n \ge 0$, such that $Q_n := D_n M_n D_{n+1}^{-1}$ converges elementwise to a primitive matrix $Q$. Then the condition
\[
(4)\qquad \sum_{n=m}^{\infty} \sum_{i,j} \frac{D_n(i)\,\sigma_n^2(i,j)}{\mathbf{1}^T Q_{m,n}\mathbf{1}\; D_{n+1}^2(j)} < \infty
\]
is necessary and sufficient for the $L^2$ convergence of $Z_{m,n} D_n^{-1}/\mathbf{1}^T Q_{m,n}\mathbf{1}$ as $n \to \infty$. When it exists, this limit is of the form $L_m v^T$, for some r.v. $L_m$ with $E L_m = D_m^{-1}\tilde{w}_m$. Here $v$ is the left Perron-Frobenius eigenvector of $Q$ (normed as a probability vector) and the $\{\tilde{w}_m\}_{m=0}^{\infty}$ are strictly positive probability vectors, which converge as $m \to \infty$ to $w$, the right Perron-Frobenius eigenvector of $Q$ (normed as a probability vector). Also, if
\[
(5)\qquad \sum_{n=m}^{\infty} \sum_{i,j} \frac{(n+1-m)\,D_n(i)\,\sigma_n^2(i,j)}{\mathbf{1}^T Q_{m,n}\mathbf{1}\; D_{n+1}^2(j)} < \infty,
\]
then we get a.s. convergence.

Finally, we have a result that shows that (under certain conditions) $Z_{m,n}(i,\cdot)$ either dies out, or else the number of type $j$ individuals grows like ${}_m R_n(j)$, for all $j$, with probability 1.

PROPOSITION 4 (Extinction probabilities). Suppose that the conditions of Theorem 1 hold, and for all $m \ge 0$ and $1 \le i \le d$, let $q_m(i)$ be the extinction probability of the process $Z_m(i) := \{Z_{m,n}(i,\cdot)\}_{n=m}^{\infty}$. Then, if there exists some constant $K$ and vectors $h_n \in \mathbb{R}_+^d$, $n = 0, 1, \ldots$, such that for all $m \ge 0$,

$1 \le i \le d$ and $x \in \mathbb{R}_+$,
\[
(6)\qquad P\bigl(Z_{m,n}(i,\cdot)\,h_n \le x\bigr) \to q_m(i) \quad\text{as } n \to \infty
\]
and
\[
(7)\qquad \sum_{n=m}^{\infty} \sum_{j,k} \frac{M_{m,n}(i,j)\,\sigma_n^2(j,k)}{w_m^2(i)\;{}_m R_{n+1}^2(k)} \le K/h_m(i) - 1,
\]
then $q_m(i) = P(L_m(i) = 0)$.

Furthermore, if there exist rescaling matrices $D_n = \mathrm{diag}(D_n(1), \ldots, D_n(d))$ for all $n \ge 0$, such that $Q_n := D_n M_n D_{n+1}^{-1}$ converges elementwise to a primitive matrix $Q$, then (7) is equivalent to
\[
(8)\qquad \sum_{n=m}^{\infty} \sum_{j,k} \frac{D_n(j)\,\sigma_n^2(j,k)}{\mathbf{1}^T Q_{m,n}\mathbf{1}\; D_{n+1}^2(k)} \le \bigl(K/h_m(i) - 1\bigr)\,\bigl(D_m^{-1}\tilde{w}_m\bigr)(i),
\]
where the $\tilde{w}_m$ are the same as those of Corollary 3.

In practice, condition (6) can be difficult to check. However, we can give some more practical conditions which imply it. If $P(X_n > 0) = 1$ for all $n$, then $Z$ can never die out. Let $\underline{M}_{m,n}$ denote the minimum family sizes of $Z_{m,n}$. Note that in many situations of interest (such as the example considered in Section 4) $\underline{M}_{m,n}$ can be explicitly determined. If we choose the $\{h_n\}$ so that
\[
(9)\qquad \underline{M}_{0,n}\, h_n \to \infty \quad\text{as } n \to \infty,
\]
then (6) follows. In practice we try and take the $\{h_n\}$ as small as possible, so that condition (7) can also be satisfied.

If the $\{h_n\}$ are constant, then (6) reduces to requiring that the only recurrent state of the branching process is 0. In the fixed environment case, Harris (1963) showed that "nonsingularity" of the offspring distribution is sufficient for the nonzero states to be transient. This result can be partially extended to MTBPVE. Suppose that a r.v. $X$, taking values in $\mathbb{Z}_+^{d\times d}$, describes the offspring distribution of a fixed environment multitype branching process. We say $X$ is singular if
\[
P\bigl(X(i,\cdot)\mathbf{1} = 1\bigr) = 1 \quad\text{for all } i.
\]
Now distinguish two cases.

1. The process $Z$ can never die out. In this case, if we can find a nonsingular $X$ such that $X \le_D X_n$ for all $n \ge 0$, then the only recurrent state of $\{Z_{m,n}(i,\cdot)\}_{n=m}^{\infty}$ is 0, for all $i$.

2. Extinction is possible from any initial state. In this case, if we can find a nonsingular $X$ such that $X \le_D X_n$ for all $n \ge 0$, then the only recurrent state of $\{Z_{m,n}(i,\cdot)\}_{n=m}^{\infty}$ is 0, for all $i$.

An appropriate $X$ can always be found if, for example, $X_n \to_D X$ where $X$ is nonsingular and $M = E X$ is primitive. A proof of these results is given in Jones (1995). Note that the advantage of (9) over case 1 is that if the minimum population size grows to infinity, then we can take $h_n \to 0$, making (7) easier to satisfy. This is precisely the situation encountered in Section 4.
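Proposition 4 describes a dichotomy: a line either becomes extinct or all of its type counts grow at the rate ${}_m R_n$. A rough Monte Carlo sketch of that dichotomy, using the same hypothetical Poisson model as above (so none of the numbers are the paper's), estimates $q_0(1)$ by the fraction of extinct lines and checks that the surviving lines have normed counts bounded away from 0.

```python
import numpy as np

rng = np.random.default_rng(2)
d, N, runs = 2, 30, 400
M_seq = [np.array([[1.3, 0.4], [0.5, 1.2]]) + 0.1 / (n + 1) for n in range(N)]

def run_line(i0):
    """Simulate Z_{0,N}(i0, .) for a single ancestor of type i0 (Poisson offspring)."""
    z = np.zeros(d); z[i0] = 1.0
    for M in M_seq:
        if z.sum() == 0:
            break
        z = np.array([float(rng.poisson((M[:, j] * z).sum())) for j in range(d)])
    return z

P = np.eye(d)
for M in M_seq:
    P = P @ M
R = P.sum(axis=0)                         # {}_0R_N

finals = np.array([run_line(0) for _ in range(runs)])
dead = finals.sum(axis=1) == 0
print("estimated extinction probability q_0(1):", dead.mean())
if (~dead).any():
    print("smallest normed count among survivors:", (finals[~dead] / R).min())
```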


Background. Although the results of Harris (1963) and those of Fearn (1971) and Jagers (1974), which Theorems 1 and 2 generalize, were first proved more than twenty years ago, there has been little interest in MTBPVE until very recently. One reason for this new interest is the potential application of MTBPVE to the study of diffusion on fractals, as is pursued (with some success) in the work of Hattori and Watanabe (1993), Hattori, Hattori and Watanabe (1994) and Hattori (1994). The first of these uses an analytical approach to prove the weak convergence of the normed process to some limit, though it makes a number of rather restrictive conditions on the mean matrices $\{M_n\}$. The report by Hattori (1994) goes somewhat further, giving conditions for the $L^2$ convergence of $Z_{m,n}(i,j)/M_{m,n}(i,j)$ and some results on the continuity of the limit. The conditions given make implicit use of weak ergodicity, though are somewhat more technical than those of Theorem 1. They also explicitly require supercriticality. The method used adapts some of the ideas of Cohn (1989) to the varying environment case.

Cohn himself has taken the results of Cohn (1989) further, in joint work with Jagers (1994) and also with Nerman and Biggins. The work with Jagers claims $L^1$ convergence given weak ergodicity of the mean matrices and (essentially) uniform integrability in $n$ of $\{Z_{m,n}\,{}_m R_n^{-1}\}_{n=m}^{\infty}$. Cohn also gives (without proof) some conditions for the $L^2$ convergence of $Z_{m,n}(i,j)/M_{m,n}(i,j)$ as $n \to \infty$, in a recent research report [Cohn (1993)]. These again assume weak convergence of the mean matrices together with a uniform integrability condition and a variance condition, similar to but not the same as condition (1). Cohn also makes the observation that it is possible to move from an $L^2$ result to an a.s. result using a Borel-Cantelli argument. At the time of writing, the work of Cohn, Nerman and Biggins referred to is still in preparation. It seems this work will treat MTBPVE more generally than the concept of weak ergodicity allows, using instead the concept of space-time harmonic functions to gain the required control over the matrix products $\{M_{m,n}\}_{n=m}^{\infty}$ [see Cohn and Nerman (1990) for a definition of space-time harmonic functions and a detailed analysis of how they relate to weak ergodicity].

2. Ergodicity of nonnegative matrix products. The use of "coefficients of ergodicity" such as $\tau$ in the study of products of nonnegative matrices owes much of its modern development to the work of Hajnal (1976) and Cohen (1979). Their ideas in turn owe a lot to the study of products of positive stochastic matrices and inhomogeneous Markov chains. It is from this connection with Markov chains that we get the term ergodic. Hajnal (1976) suggested the more appropriate term "contractive" as an alternative, but this has yet to be widely adopted. Most of the standard results and ideas we will be using can be found in Seneta (1981), which provides a good summary of the work in the area and has an extensive bibliography. For more recent work in the area the reader is referred to Cohn and Nerman (1990).


In this section, we introduce the standard notions of weak and strong ergodicity and describe how they relate to each other. Although we do not use strong ergodicity explicitly in Theorems 1 and 2, it is of practical use when applying them, as can be seen in Section 4. It is also used in demonstrating that the variance condition (1) is best possible in certain situations: see Corollary 16.

In what follows, we will mean by a sequence of rescaling matrices a sequence of nonnegative diagonal matrices of full rank. For matrices $M \in \mathbb{R}^{d\times d}$ and $D = \mathrm{diag}(D(1), \ldots, D(d))$, premultiplying $M$ by $D$ is equivalent to scaling each row $i$ of $M$ by $D(i)$, while postmultiplying $M$ by $D$ is equivalent to scaling each column $j$ of $M$ by $D(j)$.

Define rescaling matrices ${}_m R_n = \mathrm{diag}({}_m R_n(1), \ldots, {}_m R_n(d))$ by putting ${}_m R_m = I$ and requiring ${}_m R_n M_n\,{}_m R_{n+1}^{-1}$ to be column stochastic for all $n \ge m$. The term ${}_m R_n$ is allowable and thus also invertible, provided $M_{m,n}$ is column allowable. As products of column stochastic matrices are column stochastic, it is clear that, as defined in Section 1,
\[
{}_m R_n(j) = \mathbf{1}^T M_{m,n}(\cdot, j).
\]

DEFINITION 5 (Weak ergodicity). The matrices $\{M_n\}_{n=0}^{\infty}$ are said to be weakly ergodic if there exist strictly positive $w_{m,n}, v_{m,n} \in \mathbb{R}^d$ such that for all $m \ge 0$,
\[
\frac{M_{m,n}(i,j)}{w_{m,n}(i)\,v_{m,n}(j)} \to 1 \quad\text{for all } i \text{ and } j \text{ as } n \to \infty.
\]

For allowable $M_n$ this is equivalent to requiring for all $m$ the existence of an $n$ such that $M_{m,n} > 0$ and a probability vector $w_m > 0$ such that
\[
\frac{M_{m,n}(i,k)}{M_{m,n}(j,k)} \to \frac{w_m(i)}{w_m(j)} \quad\text{for all } i, j \text{ and } k \text{ as } n \to \infty.
\]
[See Hajnal (1976), Theorem 1 or Seneta (1981), Lemma 3.4 and Exercise 3.5.] Since $w_m > 0$, this is equivalent to requiring
\[
\frac{M_{m,n}(i,k)}{\sum_j M_{m,n}(j,k)} \to w_m(i) \quad\text{for all } i \text{ and } k \text{ as } n \to \infty.
\]
We will generally write this in matrix form as $M_{m,n}\,{}_m R_n^{-1} \to w_m \mathbf{1}^T$ as $n \to \infty$.
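Numerically, weak ergodicity is easy to observe in the matrix form just given: the columns of $M_{m,n}\,{}_m R_n^{-1}$ (that is, the columns of the forward product normalised to sum to 1) should all approach the same vector $w_m$. A minimal sketch, with invented positive matrices:

```python
import numpy as np

M_seq = [np.array([[1.2, 0.4], [0.3, 1.0 + 0.2 / (n + 1)]]) for n in range(60)]

P = np.eye(2)
for n, M in enumerate(M_seq, start=1):
    P = P @ M
    normed = P / P.sum(axis=0, keepdims=True)   # M_{0,n} {}_0R_n^{-1}
    if n in (5, 20, 60):
        print(f"n = {n}:\n{normed}")
# In the limit every column agrees; the common column is the vector w_0
# of Definition 5.
```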

So, weak ergodicity requires that as $n \to \infty$, the elements of any one column of the forward product $M_{m,n}$ all grow at the same rate and that within each column, the rows tend to fixed proportions.

The contraction coefficient $\tau$ is used to give more tractable conditions for weak ergodicity. We have already noted in Section 1 that the $\{M_n\}$ are weakly ergodic if and only if $\tau(M_{m,n}) \to 0$ as $n \to \infty$, for all $m$. Using the submultiplicativity of $\tau$, this is often enough to give a practical check for weak ergodicity. The form of $\tau$ for allowable matrices gives the following refinement for allowable $M_n$. The $\{M_n\}$ are weakly ergodic if and only if there exist $n(k) \uparrow \infty$, $n(k) \ne n(k+1)$, such that
\[
\sum_{k=0}^{\infty} \sqrt{\phi\bigl(M_{n(k),n(k+1)}\bigr)} = \infty.
\]

[See Hajnal (1976), Theorem 4 or Seneta (1981), Theorem 3.2.] A sufficient condition for allowable $M_n$ [which also gives geometric decay of $\tau(M_{m,n})$ as $n \to \infty$, thereby satisfying condition (3) of Theorem 2] is that there exist some $n_0 \ge 1$ and $\gamma > 0$ such that for all $n$, $M_{n,n+n_0} > 0$ and
\[
\frac{\min^+_{i,j} M_n(i,j)}{\max_{i,j} M_n(i,j)} \ge \gamma,
\]
where $\min^+$ denotes the minimum over positive elements. [This is the well known Coale-Lopez theorem. See Seneta (1981), Theorem 3.3 for the form given here.] This condition also has consequences for strong ergodicity, as we will see below.

Say column-allowable matrices $\{M_n\}_{n=0}^{\infty}$ have the RGR property (for relative growth rates) if for all $m$, $i$ and $j$ there exists $r_m(i,j) \in [0,\infty]$ such that
\[
\frac{{}_m R_n(i)}{{}_m R_n(j)} \to r_m(i,j) \quad\text{as } n \to \infty.
\]
If the RGR property holds and $r_m(i,j) \in (0,\infty)$ for all $m$, $i$ and $j$, then we say the $\{M_n\}$ have a single growth rate.

DEFINITION 6 (Strong ergodicity). The matrices $\{M_n\}$ are said to be strongly ergodic if for all $m$ there exists a probability vector $v_m$ such that
\[
\frac{M_{m,n}(i,j)}{e_i^T M_{m,n} \mathbf{1}} \to v_m(j) \quad\text{as } n \to \infty, \text{ independently of } i.
\]

It follows immediately that if such $v_m$ exist then they are in fact independent of $m$, that is, $v_m = v$ for all $m$. Weak ergodicity can be thought of as requiring the columns of $M_{m,n}$ to tend to fixed proportions as $n \to \infty$, given by $w_m$. In an analogous manner, strong ergodicity is often thought of as requiring the rows of $M_{m,n}$ to tend to fixed proportions as $n \to \infty$, given by $v$. However, strong ergodicity only tells you about proportions with respect to the largest growth rate of $M_{m,n}$. Smaller growth rates, represented by the zeros in $v$, cannot be compared without extra information, such as that supplied by the RGR property. The next proposition should give a better idea of how the concepts of weak and strong ergodicity are related.

PROPOSITION 7 (Relating weak and strong ergodicity). (i) If the $\{M_n\}$ are row allowable, then strong ergodicity with $v > 0$ implies weak ergodicity.


(ii) If the $\{M_n\}$ are allowable, then weak and strong ergodicity together imply $M_{m,n}/\mathbf{1}^T M_{m,n}\mathbf{1} \to w_m v^T$ as $n \to \infty$ for all $m$.

(iii) For allowable $M_n$, if there exist probability vectors $w_m > 0$ and a probability vector $v$ such that $M_{m,n}/\mathbf{1}^T M_{m,n}\mathbf{1} \to w_m v^T$ as $n \to \infty$ for all $m$, then the $\{M_n\}$ are strongly ergodic.

(iv) If the $\{M_n\}$ are allowable, then strong ergodicity with $v > 0$ implies the RGR property, with $r_m(i,j) = v(i)/v(j)$ (thus giving a single growth rate).

(v) Weak ergodicity and the RGR property imply that for all $m$, $i$ and $j$, $r_m(i,j) = r(i,j)$ is independent of $m$ and they imply strong ergodicity, with $v(j) = 1/\sum_i r(i,j)$.

PROOF. (i) It suffices to put $w_{m,n} = M_{m,n}\mathbf{1}$ and $v_{m,n} = v$ in the first of our definitions of weak ergodicity. The $w_{m,n}$ are strictly positive provided the $\{M_n\}$ are row allowable.

(ii) Since the $\{M_n\}$ are weakly ergodic and allowable, we have that $\phi(M_{m,n}) \to 1$ as $n \to \infty$. Thus
\[
\frac{M_{m,n}(i,j)\cdot \mathbf{1}^T M_{m,n}\mathbf{1}}{e_i^T M_{m,n}\mathbf{1}\cdot \mathbf{1}^T M_{m,n} e_j}
= \frac{M_{m,n}(i,j)\sum_{k,l} M_{m,n}(k,l)}{\sum_{k,l} M_{m,n}(i,k)\,M_{m,n}(l,j)}
= \frac{M_{m,n}(i,j)\sum_{k,l} M_{m,n}(k,l)}{\sum_{k,l} \dfrac{M_{m,n}(i,k)\,M_{m,n}(l,j)}{M_{m,n}(l,k)\,M_{m,n}(i,j)}\, M_{m,n}(l,k)\,M_{m,n}(i,j)}
\to 1 \quad\text{as } n \to \infty.
\]
Thus, dividing top and bottom of the left-hand side by $(\mathbf{1}^T M_{m,n}\mathbf{1})^2$, we have
\[
\frac{M_{m,n}(i,j)/\mathbf{1}^T M_{m,n}\mathbf{1}}{\bigl(e_i^T M_{m,n}\mathbf{1}/\mathbf{1}^T M_{m,n}\mathbf{1}\bigr)\bigl(\mathbf{1}^T M_{m,n} e_j/\mathbf{1}^T M_{m,n}\mathbf{1}\bigr)}
\to 1 \quad\text{as } n \to \infty.
\]
But $e_i^T M_{m,n}\mathbf{1}/\mathbf{1}^T M_{m,n}\mathbf{1} \to w_m(i)$ and $\mathbf{1}^T M_{m,n} e_j/\mathbf{1}^T M_{m,n}\mathbf{1} \to v(j)$ and so $M_{m,n}(i,j)/\mathbf{1}^T M_{m,n}\mathbf{1} \to w_m(i)\,v(j)$ as $n \to \infty$.

Note that we do not in general need full weak ergodicity for this result to hold. All we need is that $M_{m,n} e_j/{}_m R_n(j) \to w_m$ for those $j$ for which $v(j) > 0$. This is why we only get a partial converse to this result (see the next item).

(iii) It suffices to divide top and bottom of $M_{m,n}(i,j)/e_i^T M_{m,n}\mathbf{1}$ by $\mathbf{1}^T M_{m,n}\mathbf{1}$ and send $n \to \infty$. Note that it also follows that $M_{m,n} e_j/{}_m R_n(j) \to w_m$ for all $j$ for which $v(j) > 0$.

(iv) This follows from items (i) and (ii) on dividing top and bottom of ${}_m R_n(i)/{}_m R_n(j)$ by $\mathbf{1}^T M_{m,n}\mathbf{1}$. This argument fails if $v$ has two or more zero elements, $i_0$ and $i_1$ say, as we do not know how quickly ${}_m R_n(i_0)/\mathbf{1}^T M_{m,n}\mathbf{1}$ and ${}_m R_n(i_1)/\mathbf{1}^T M_{m,n}\mathbf{1}$ go to zero. In particular we do not know if one of them tends to zero faster than the other or not.


(v) We observe to begin with that for all $i$,
\[
\frac{M_{m,n}(i,j)}{M_{m,n}(i,k)}
= \frac{M_{m,n}(i,j)/{}_m R_n(j)}{M_{m,n}(i,k)/{}_m R_n(k)}\cdot\frac{{}_m R_n(j)}{{}_m R_n(k)}
\to \frac{w_m(i)}{w_m(i)}\, r_m(j,k) = r_m(j,k) \quad\text{as } n \to \infty.
\]
Thus
\[
r_m(i,j) = \lim_{n\to\infty}\frac{{}_m R_n(i)}{{}_m R_n(j)}
= \lim_{n\to\infty}\frac{\sum_l \bigl(\sum_k M_m(k,l)\bigr)\, M_{m+1,n}(l,i)}{\sum_l \bigl(\sum_k M_m(k,l)\bigr)\, M_{m+1,n}(l,j)}
= r_{m+1}(i,j),
\]
noting that for positive $a_n$, $b_n$, $c_n$ and $d_n$, if $a_n/b_n \to x$ and $c_n/d_n \to x$ then $(a_n + c_n)/(b_n + d_n) \to x$. So $r_m(i,j)$ is independent of $m$.

To show strong ergodicity, consider
\[
\frac{M_{m,n}(i,j)}{\sum_k M_{m,n}(i,k)}
= \frac{M_{m,n}(i,j)/{}_m R_n(j)}{\sum_k \dfrac{M_{m,n}(i,k)}{{}_m R_n(k)}\cdot\dfrac{{}_m R_n(k)}{{}_m R_n(j)}}
\to \frac{1}{\sum_k r(k,j)} \quad\text{as } n \to \infty.
\]
Note that $1/\sum_k r(k,j) \le 1$ since $r(k,k) = 1$, and that $\sum_j \bigl(1/\sum_k r(k,j)\bigr) = 1$. $\square$

The concept of strong ergodicity is a generalization of that used when dealing with (row) stochastic matrices. A sequence $\{A_n\}_{n=0}^{\infty}$ of stochastic matrices is strongly ergodic if for all $m$, $i$ and $j$, $A_{m,n}(i,j) \to v(j)$ as $n \to \infty$ for some probability vector $v$. It is also usual when dealing with stochastic matrices to use stochastic ergodicity rather than weak ergodicity. The $\{A_n\}$ are stochastically ergodic if $A_{m,n}(i,k) - A_{m,n}(j,k) \to 0$ for all $m$, $i$, $j$ and $k$ as $n \to \infty$. Clearly strong ergodicity implies stochastic ergodicity (though it still does not imply weak ergodicity even in this setting). These definitions are generally sufficient in the stochastic setting, as the forward products $\{M_{m,n}\}_{n=m}^{\infty}$ are bounded and the question of growth rates is not particularly important. The extra concept of the RGR property is useful when you are interested in multiple growth rates, in particular, not just the largest growth rate.

A straightforward condition for strong ergodicity with $v > 0$ is the following. For allowable $M_n$, if there exists an $n_0 \ge 1$, a $\delta < 1$ and a probability vector $v > 0$ such that $\tau(M_{n,n+n_0}) \le \delta$ for all $n$ and $x_n \to v$ for some sequence of $M_n$ left eigenvectors $\{x_n\}$, then the $\{M_n\}$ are strongly ergodic with row limit vector $v$. [See Seneta and Sheridan (1981), Theorem 4.2.] This happens, for example, when the $\{M_n\}$ converge elementwise to some primitive


matrix $M$, in which case $v$ is the left Perron-Frobenius eigenvector of $M$. It is possible to say a little more in this case.

LEMMA 8 (Asymptotically primitive mean matrices). If $M_n \to M$ elementwise where $M$ is primitive, then the $\{M_n\}$ are strongly ergodic. Moreover, if we let $\lambda = \mathrm{PF}(M)$ be the spectral radius of $M$, $v = \mathrm{LPF}(M)$ be the left Perron-Frobenius eigenvector of $M$ and $w = \mathrm{RPF}(M)$ be the right Perron-Frobenius eigenvector of $M$ (normed as probability vectors), then $v$ is the row limit vector for the $\{M_n\}$, the column limit vectors $\{w_m\}$ converge elementwise to $w$ as $m \to \infty$ and
\[
\lambda = \lim_{m\to\infty}\lim_{n\to\infty}\frac{\mathbf{1}^T M_{m,n}\mathbf{1}}{\mathbf{1}^T M_{m+1,n}\mathbf{1}}
= \lim_{n\to\infty}\frac{\mathbf{1}^T M_{m,n+1}\mathbf{1}}{\mathbf{1}^T M_{m,n}\mathbf{1}}
\quad\text{for all } m.
\]

PROOF. That the $\{M_n\}$ are strongly ergodic with row limit vector $v$ follows from Theorem 4.2 of Seneta and Sheridan (1981). For the remainder consider the following:
\[
w_m = \lim_{n\to\infty}\frac{M_{m,n}\mathbf{1}}{\mathbf{1}^T M_{m,n}\mathbf{1}}
= M_m \lim_{n\to\infty}\frac{M_{m+1,n}\mathbf{1}}{\mathbf{1}^T M_{m+1,n}\mathbf{1}}\cdot\frac{\mathbf{1}^T M_{m+1,n}\mathbf{1}}{\mathbf{1}^T M_{m,n}\mathbf{1}}
= M_m w_{m+1} \lim_{n\to\infty}\frac{\mathbf{1}^T M_{m+1,n}\mathbf{1}}{\mathbf{1}^T M_{m,n}\mathbf{1}},
\]
which implies the existence of $\alpha_m^{m+1} := \lim_{n\to\infty}\mathbf{1}^T M_{m+1,n}\mathbf{1}/\mathbf{1}^T M_{m,n}\mathbf{1}$ and shows that if $\lim_{m\to\infty} w_m$ exists then it must equal $w$. Moreover, if $\lim_{m\to\infty} w_m$ exists, then so does $\lim_{m\to\infty}\alpha_m^{m+1}$, which must equal $\lambda^{-1}$.

In fact, we can show that the limit of any convergent subsequence of the $\{w_m\}$ must be $w$. Let $\{w_{n(k)}\}_{k=0}^{\infty}$ be a convergent subsequence of the $\{w_n\}$ with limit $x_0$. Such a subsequence always exists, as the space of probability vectors is compact in $\mathbb{R}^d$. We have that $w_{n(k)} = M_{n(k)}\, w_{n(k)+1}\,\alpha_{n(k)}^{n(k)+1}$, whence sending $k \to \infty$, $x_0 = M x_1 \beta_0$, where the existence of $x_1 := \lim_{k\to\infty} w_{n(k)+1}$ and $\beta_0 := \lim_{k\to\infty}\alpha_{n(k)}^{n(k)+1}$ is implied by the existence of $x_0$ and $M$. That each limit exists separately follows from the fact that the $\{w_n\}$ are all probability vectors. We can repeat this procedure with the $\{w_{n(k)+1}\}_{k=0}^{\infty}$ to show that $w_{n(k)+2}$ converges to some limit $x_2$. Repeating this ad infinitum gives us a sequence of probability vectors $x_0, x_1, x_2, \ldots$ and a sequence of scalars $\beta_0, \beta_1, \beta_2, \ldots$ such that $x_k = M x_{k+1}\beta_k$ for all $k$. That is, $x_0 = M^k x_k \prod_{l=0}^{k-1}\beta_l$ for all $k$. Let $\{x_{k(p)}\}_{p=0}^{\infty}$ be a convergent subsequence of the $x_k$ with limit $y$. Then $x_0 = \lim_{p\to\infty} M^{k(p)}\, y \prod_{l=0}^{k(p)-1}\beta_l$. It follows immediately that $x_0 = w$, since the $w$ coefficient of the eigenvalue expansion of $y$ cannot be zero because $y \ge 0$.


Given that all convergent subsequences of the $\{w_m\}$ converge to $w$, it follows immediately that the $\{w_m\}$ themselves must converge to $w$, since $\{w_m\}_{m=0}^{\infty}$ is contained in the compact set of probability vectors.

The final part of the lemma follows directly from the following observation:
\[
\lim_{n\to\infty}\frac{M_{m,n} M_n}{\mathbf{1}^T M_{m,n}\mathbf{1}}
= w_m v^T M
= \lim_{n\to\infty}\frac{\mathbf{1}^T M_{m,n+1}\mathbf{1}}{\mathbf{1}^T M_{m,n}\mathbf{1}}\; w_m v^T. \qquad\square
\]
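Lemma 8 is straightforward to check numerically. In the sketch below (toy matrices only), $M_n \to M$ at rate $n^{-2}$; the row proportions of $M_{0,n}$ approach $\mathrm{LPF}(M)$, the column limit vectors $w_m$ approach $\mathrm{RPF}(M)$ as $m$ grows, and the ratio $\mathbf{1}^T M_{0,n+1}\mathbf{1}/\mathbf{1}^T M_{0,n}\mathbf{1}$ approaches $\mathrm{PF}(M)$.

```python
import numpy as np

M = np.array([[1.0, 0.6], [0.4, 1.2]])
M_seq = [M + np.array([[0.3, -0.1], [0.1, 0.2]]) / (n + 1) ** 2
         for n in range(260)]

def pf_triple(A):
    """Perron-Frobenius eigenvalue of A with left/right eigenvectors, prob. normed."""
    lamR, R = np.linalg.eig(A)
    lamL, L = np.linalg.eig(A.T)
    kR, kL = int(np.argmax(lamR.real)), int(np.argmax(lamL.real))
    r = np.abs(R[:, kR].real); r /= r.sum()
    l = np.abs(L[:, kL].real); l /= l.sum()
    return lamR.real[kR], l, r

lam, v, w = pf_triple(M)          # PF(M), LPF(M), RPF(M)

def product(m, n):
    P = np.eye(2)
    for k in range(m, n):
        P = P @ M_seq[k]
    return P

P0, P0b, P100 = product(0, 250), product(0, 251), product(100, 250)
print("PF(M) =", lam, "  1^T M_{0,251} 1 / 1^T M_{0,250} 1 =", P0b.sum() / P0.sum())
print("row limit  ->", P0.sum(axis=0) / P0.sum(), " (LPF(M) =", v, ")")
print("w_0   =", P0.sum(axis=1) / P0.sum())
print("w_100 =", P100.sum(axis=1) / P100.sum(), " (RPF(M) =", w, ")")
```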

By way of illustrating what is required to get more than one growth rate, we give the following. If the $\{M_n\}$ are strongly ergodic and irreducible, then the existence of some $\gamma > 0$ such that
\[
\frac{\min^+_{i,j} M_n(i,j)}{\max_{i,j} M_n(i,j)} \ge \gamma \quad\text{for all } n
\]
is sufficient to ensure that $v > 0$, that is, that there is a single growth rate. [See Seneta (1981), Theorem 3.4.] It is possible to adapt existing results on strong ergodicity with $v > 0$ to the multiple growth rate case by using rescaling arguments, as we will see in Section 2.1 below.

2.1. Rescaling. We will now take a closer look at rescaling in general and the rescaling matrices $\{{}_m R_n\}_{0 \le m \le n}$ in particular.

LEMMA 9 (Rescaling lemma). Let $D = \mathrm{diag}(D(1), \ldots, D(d))$ be nonnegative and of full rank. Then for any nonnegative column allowable matrix $A \in \mathbb{R}^{d\times d}$,
\[
\tau(AD) = \tau(DA) = \tau(A),
\]
and for allowable $A$,
\[
\phi(AD) = \phi(DA) = \phi(A).
\]

PROOF. The result follows directly from the definitions of $\tau$ and $\phi$. $\square$

Observe that for column-allowable column-stochastic matrices $\{P_n\}_{n=0}^{\infty}$, weak ergodicity is equivalent to strong ergodicity with $v = (1/d)\mathbf{1}$. Now define ${}_m P_n = {}_m R_n M_n\,{}_m R_{n+1}^{-1}$ for all $n \ge m \ge 0$ and suppose that the $\{M_n\}$ are column-allowable and weakly ergodic with column limit vectors $\{w_m\}$. Thus for any $m \le n \le p$, $\tau(M_{n,p}) \to 0$ as $p \to \infty$, so from Lemma 9, $\tau({}_m P_{n,p}) \to 0$ as $p \to \infty$. Thus as the ${}_m P_n$ are column-stochastic, they are strongly ergodic with $v = (1/d)\mathbf{1}$. That is, there exist strictly positive probability vectors $\{{}_m w_n\}_{n=m}^{\infty}$ such that
\[
{}_m P_{n,p} \to {}_m w_n \mathbf{1}^T \quad\text{as } p \to \infty \text{ for all } n \ge m.
\]
It is clear that ${}_m w_m = w_m$. More generally we have
\[
{}_m w_n \mathbf{1}^T = \lim_{p\to\infty} {}_m P_{n,p}
= \lim_{p\to\infty} {}_m R_n\, M_{n,p}\,{}_n R_p^{-1}\,{}_n R_p\,{}_m R_p^{-1}
= {}_m R_n\, w_n \mathbf{1}^T \lim_{p\to\infty} {}_n R_p\,{}_m R_p^{-1}.
\]


It follows that $\lim_{p\to\infty} {}_n R_p\,{}_m R_p^{-1}$ exists and equals $\alpha_m^n I$ for some constant $\alpha_m^n$. Thus for $n \ge m$, ${}_m w_n = \alpha_m^n\,{}_m R_n w_n$.

It is easily checked that this definition of $\alpha_m^n$ is consistent with the definition given in Lemma 8, namely that $\alpha_m^n = \lim_{p\to\infty} \mathbf{1}^T M_{n,p}\mathbf{1}/\mathbf{1}^T M_{m,p}\mathbf{1}$.

We can in fact bound the speed at which ${}_m P_{n,p}$ converges to ${}_m w_n \mathbf{1}^T$ as $p \to \infty$. To do this we make use of a second coefficient of ergodicity (or contraction coefficient), $\kappa$. It is normally used with row-stochastic matrices, but has been adapted here for use with column-stochastic matrices by the simple expedient of transposing everything. For a column-stochastic matrix $P \in \mathbb{R}^{d\times d}$ define
\[
\kappa(P) = \sup_{x \in \mathbb{R}^d,\; x \ne 0,\; \mathbf{1}^T x = 0} \frac{\|Px\|_1}{\|x\|_1}.
\]
It can be shown that $\kappa(P) \le \tau(P)$ [Seneta (1981), Theorem 3.13]. Thus for a column stochastic $P$, if $x \in \mathbb{R}^d$ and $\mathbf{1}^T x = 0$ then $\|Px\|_1 \le \|x\|_1 \tau(P)$. That is, if $\tau(P) < 1$ then $P$ is a contraction mapping on the set $\{x \in \mathbb{R}^d : \mathbf{1}^T x = 0\}$. In particular we have here that $\mathbf{1}^T(e_j - {}_m w_p) = 0$ and so
\[
\bigl\| {}_m P_{n,p}(\cdot,j) - {}_m w_n \bigr\|_1
= \bigl\| {}_m P_{n,p}(e_j - {}_m w_p) \bigr\|_1
\le 2\tau({}_m P_{n,p})
= 2\tau(M_{n,p}).
\]

In practice we may be able to find natural rescaling matrices different from the $\{{}_m R_n\}$. Suppose that $\{D_n = \mathrm{diag}(D_n(1), \ldots, D_n(d))\}_{n=0}^{\infty}$ is a sequence of rescaling matrices and put $Q_n = D_n M_n D_{n+1}^{-1}$ for all $n \ge 0$. It follows from the rescaling lemma that the $\{Q_n\}$ are weakly ergodic if and only if the $\{M_n\}$ are weakly ergodic. Suppose this is the case, and let $\{w_m\}_{m=0}^{\infty}$ and $\{\tilde{w}_m\}_{m=0}^{\infty}$ be the column limit vectors for the $\{M_n\}$ and $\{Q_n\}$, respectively; then, noting that
\[
\tilde{w}_m = \lim_{n\to\infty}\frac{Q_{m,n} e_j}{\mathbf{1}^T Q_{m,n} e_j}
= \lim_{n\to\infty}\frac{D_m M_{m,n} e_j}{\mathbf{1}^T D_m M_{m,n} e_j}
\]
and that
\[
\frac{\mathbf{1}^T M_{m,n} e_j}{\mathbf{1}^T D_m M_{m,n} e_j}
= \frac{\sum_i D_m^{-1}(i)\, Q_{m,n}(i,j)}{\mathbf{1}^T Q_{m,n} e_j}
\to \sum_i D_m^{-1}(i)\,\tilde{w}_m(i)
\quad\text{as } n \to \infty,
\]
it follows that
\[
(10)\qquad w_m = \frac{D_m^{-1}\tilde{w}_m}{\mathbf{1}^T D_m^{-1}\tilde{w}_m}.
\]
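Relation (10) can be checked directly on an example. In the sketch below the mean matrices are constant, $M_n = M$, and the rescaling $D_n = \mathrm{diag}(1, 2 + 1/(n+1))$ is chosen arbitrarily (neither is from the paper); the column limit vectors of $\{M_n\}$ and of $\{Q_n\}$ are approximated by long forward products and compared through (10).

```python
import numpy as np

M = np.array([[1.1, 0.6], [0.4, 0.9]])
N = 60
M_seq = [M] * N
D = lambda n: np.diag([1.0, 2.0 + 1.0 / (n + 1)])
Q_seq = [D(n) @ M_seq[n] @ np.linalg.inv(D(n + 1)) for n in range(N)]

def col_limit(mats, m, n):
    """Approximate column limit vector:  A_{m,n} 1 / 1^T A_{m,n} 1."""
    P = np.eye(2)
    for k in range(m, n):
        P = (P / P.sum()) @ mats[k]     # renormalise to keep the numbers tame
    return P.sum(axis=1) / P.sum()

m = 2
w_m  = col_limit(M_seq, m, N)           # column limit vector of {M_n}
wq_m = col_limit(Q_seq, m, N)           # column limit vector of {Q_n}
rhs = np.linalg.inv(D(m)) @ wq_m
rhs /= rhs.sum()
print("w_m                       =", w_m)
print("D_m^{-1} w~_m, normalised =", rhs)   # relation (10): should agree
```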

For strong ergodicity we have the following.

PROPOSITION 10 (Sufficient condition for strong ergodicity). If the $\{Q_n\}$ are strongly ergodic with row limit vector $v > 0$ and $\lim_{n\to\infty} D_n(i)/D_n(j)$ exists in $[0,\infty]$ for all $i$ and $j$, then the $\{M_n\}$ are strongly ergodic and have the RGR property, with
\[
r(i,j) = \frac{v(i)}{v(j)}\lim_{n\to\infty}\frac{D_n(i)}{D_n(j)}.
\]

PROOF. As $v > 0$, the $\{Q_n\}$ and thus the $\{M_n\}$ are weakly ergodic. Thus from Proposition 7(v), it suffices to establish the RGR property for $m = 0$. Consider
\[
\lim_{n\to\infty}\frac{{}_0 R_n(j)}{{}_0 R_n(k)}
= \lim_{n\to\infty}\frac{\sum_i D_0^{-1}(i)\, Q_{0,n}(i,j)\, D_n(j)}{\sum_i D_0^{-1}(i)\, Q_{0,n}(i,k)\, D_n(k)}
= \frac{v(j)}{v(k)}\lim_{n\to\infty}\frac{D_n(j)}{D_n(k)},
\]
since $\lim_{n\to\infty} Q_{0,n}(i,j)/Q_{0,n}(i,k) = v(j)/v(k)$ independently of $i$. $\square$

This proposition provides a practical way of applying our existing conditions for strong ergodicity with $v > 0$ to situations where we have multiple growth rates. We also have the following (for use in Corollary 16 below).

PROPOSITION 11 (Rescaled limit matrices). If the $\{Q_n\}$ converge elementwise to a primitive matrix $Q$, then for all $m$, ${}_m P_n$ converges elementwise to a primitive matrix $P$ given by

\[
\lambda P(i,j) = v(i)\, Q(i,j)/v(j),
\]
where $\lambda = \mathrm{PF}(Q)$ is the spectral radius of $Q$ and $v = \mathrm{LPF}(Q)$ is the left Perron-Frobenius eigenvector of $Q$ (normed as a probability vector).

PROOF. To begin with, put ${}_m D_n = D_m^{-1} D_n$ and ${}_m Q_n = {}_m D_n M_n\,{}_m D_{n+1}^{-1}$ for all $n \ge m$. Then ${}_m Q_n \to {}_m Q := D_m^{-1} Q D_m$ as $n \to \infty$, where $\mathrm{PF}({}_m Q) = \lambda$ and ${}_m v := \mathrm{LPF}({}_m Q) = D_m v/\sum_i D_m(i)\, v(i)$. Now, define a further set of rescaling matrices $\{{}_m E_n\}_{0\le m\le n}$ by putting ${}_m E_m = I$ and requiring ${}_m E_n\,{}_m Q_n\,{}_m E_{n+1}^{-1}$ to be column stochastic for all $n \ge m$ (so the $\{{}_m E_n\}$ play the same role for the $\{{}_m Q_n\}$ that the $\{{}_m R_n\}$ play for the $\{M_n\}$). It should be clear that ${}_m E_n\,{}_m Q_n\,{}_m E_{n+1}^{-1} = {}_m P_n$, that is, that ${}_m E_n = {}_m R_n\,{}_m D_n^{-1}$. Now, from Lemma 8, we have that
\[
\frac{\mathbf{1}^T\,{}_m E_n}{\mathbf{1}^T\,{}_m Q_{m,n}\mathbf{1}}
= \frac{\mathbf{1}^T\,{}_m Q_{m,n}}{\mathbf{1}^T\,{}_m Q_{m,n}\mathbf{1}}
\to {}_m v^T \quad\text{as } n \to \infty
\]
and that
\[
\frac{\mathbf{1}^T\,{}_m Q_{m,n}\mathbf{1}}{\mathbf{1}^T\,{}_m Q_{m,n+1}\mathbf{1}} \to \frac{1}{\lambda} \quad\text{as } n \to \infty,
\]
whence
\[
\lambda\,{}_m P_n(i,j)
= \lambda\,\frac{{}_m E_n(i)}{\mathbf{1}^T\,{}_m Q_{m,n}\mathbf{1}}\;{}_m Q_n(i,j)\;\frac{\mathbf{1}^T\,{}_m Q_{m,n+1}\mathbf{1}}{{}_m E_{n+1}(j)}\;\frac{\mathbf{1}^T\,{}_m Q_{m,n}\mathbf{1}}{\mathbf{1}^T\,{}_m Q_{m,n+1}\mathbf{1}}
\to \frac{{}_m v(i)\,{}_m Q(i,j)}{{}_m v(j)}
= \frac{v(i)\, Q(i,j)}{v(j)} \quad\text{as } n \to \infty. \qquad\square
\]

I

2.2. Growth rates. In this subsection we compare and bound various growth rates obtained from the matrix product Mm, n as n ª `, with the aim of simplifying the application of Theorems 1 and 2 and Proposition 4. The results obtained will be put to practical use in Section 4. For two sequences  x n 4`ns0 and  yn 4`ns0 , write  x n 4 '  yn 4 if lim n ª` x nryn exists g Ž0, `.. We say two such sequences have the same growth rate. LEMMA 12. Suppose we are given rescaling matrices Dn s diagŽ DnŽ1., y1 4` . . . , DnŽ d .. for n G 0 such that the matrices  Q n [ Dn Mn Dnq1 ns0 are strongly ergodic with row limit vector v ) 0. Then for any m G 0 and 1 F j F d

 m R n Ž j . 4 nsm '  1T Q m , n1 ? Dn Ž j . 4 nsm . `

`

y1 Q m, n e j ? DnŽ j .. From Proposition 7 we PROOF. We have m R nŽ j . s 1T Dm T Ž . know that Q m, n i, j r1 Q m, n 1 converges as n ª ` to wm Ž i . v Ž j ., where the  wm 4 are the column limit vectors for the  Q n 4 . Thus y1 1T Dm Qm , n e j

1T Q m , n 1

s

y1 Ý i Dm Ž i . Qm , n Ž i , j .

1T Q m , n 1

ª

Ý Dmy1 Ž i . wm Ž i . ? v Ž j . i

as n ª `,

whence we get the result. $\square$

Under the conditions of Lemma 8 we can give bounds on the growth of $\mathbf{1}^T M_{m,n}\mathbf{1}$ or (more commonly) $\mathbf{1}^T Q_{m,n}\mathbf{1}$, as $n \to \infty$.

LEMMA 13 (Growth bounds). If $M_n \to M$ elementwise where $M$ is primitive, then for any $\varepsilon > 0$ we can find a constant $c_0 \in (0,\infty)$ such that for all $n \ge m \ge 0$,
\[
c_0^{-1}(\lambda - \varepsilon)^{n-m} \le \mathbf{1}^T M_{m,n}\mathbf{1} \le c_0(\lambda + \varepsilon)^{n-m},
\]
where $\lambda = \mathrm{PF}(M)$ is the spectral radius of $M$.

PROOF. From Lemma 8 we have for any $k \ge 0$ that
\[
\lim_{m\to\infty}\alpha_{m+k}^{m} := \lim_{m\to\infty}\lim_{n\to\infty}\mathbf{1}^T M_{m,n}\mathbf{1}/\mathbf{1}^T M_{m+k,n}\mathbf{1} = \lambda^k.
\]
Thus for any $\varepsilon > 0$ we can find some constant $c_1 \in (0,\infty)$ such that for all $m, k \ge 0$,
\[
(11)\qquad c_1^{-1}(\lambda - \varepsilon)^k \le \alpha_{m+k}^{m} \le c_1(\lambda + \varepsilon)^k.
\]
Now since $M$ is primitive and $M_n \to M$, there exists an $m_0$ and a $c_2 \in (0,\infty)$ such that for all $m \ge m_0$, $c_2^{-1}\mathbf{1}\mathbf{1}^T \le M_{m,m+n_0} \le c_2\mathbf{1}\mathbf{1}^T$, where $n_0$ is such that $M^{n_0} > 0$. Thus for all $n \ge m \ge m_0$ and $p \ge n + n_0$,
\[
c_2^{-1}\mathbf{1}^T M_{m,p}\mathbf{1} \le \mathbf{1}^T M_{m,n}\mathbf{1}\;\mathbf{1}^T M_{n+n_0,p}\mathbf{1} \le c_2\mathbf{1}^T M_{m,p}\mathbf{1}.
\]
Dividing through by $\mathbf{1}^T M_{n+n_0,p}\mathbf{1}$ and sending $p \to \infty$ gives us
\[
c_2^{-1}\alpha_{n+n_0}^{m} \le \mathbf{1}^T M_{m,n}\mathbf{1} \le c_2\alpha_{n+n_0}^{m}.
\]
The result now follows on applying inequalities (11) to these. $\square$

The next result gives a condition for the growth rate of $\mathbf{1}^T M_{m,n}\mathbf{1}$ to equal $\lambda$ exactly.

LEMMA 14 (Asymptotically geometric growth function). If $M_n \to M$ elementwise where $M$ is primitive and $\sum_{n=0}^{\infty}\|M_n - M\|_1 < \infty$, then for any $m \ge 0$,
\[
\{\mathbf{1}^T M_{m,n}\mathbf{1}\}_{n=m}^{\infty} \equiv \{\lambda^{n-m}\}_{n=m}^{\infty} \equiv \Bigl\{\prod_{k=m}^{n-1}\lambda_k\Bigr\}_{n=m}^{\infty},
\]
where $\lambda = \mathrm{PF}(M)$ is the spectral radius of $M$ and $\lambda_k = \mathrm{PF}(M_k)$ is the spectral radius of $M_k$, for all $k \ge m$.

PROOF. Put $\lambda_{m,n} = \prod_{k=m}^{n-1}\lambda_k$. Also, recall that for any matrix $A \in \mathbb{R}^{d\times d}$, the $L^1$ operator norm $\|A\|_1$ is equivalent to the matrix norm $\|A\| := \max_{i,j}|A(i,j)|$. The following result is taken from Markus and Minc (1964), Theorem 3.1.6. For $A, B \in \mathbb{R}^{d\times d}$, $B \ge A \ge 0$, $A$, $B$ primitive with $\mathrm{PF}(A) = a$, $\mathrm{PF}(B) = b$, we have
\[
\frac{m}{\max_{i,j} C(i,j)/\sum_k C(k,j)}
\le b - a \le
\frac{M}{\min_{i,j} C(i,j)/\sum_k C(k,j)},
\]
where $m = \min_{i,j}(B(i,j) - A(i,j))$, $M = \max_{i,j}(B(i,j) - A(i,j))$ and $C \ge 0$ is column allowable and commutes with either $A$ or $B$.

Suppose $M_n \uparrow M$. Let $C = M^{n_0}$, where $n_0$ is such that $M^{n_0} > 0$, and put $c_1 = \min_{i,j} C(i,j)/\sum_k C(k,j)$. Then from the above result, $\lambda - \lambda_n \le c_1^{-1}\|M_n - M\|$. Thus $\sum_n\|M_n - M\| < \infty$ implies $\sum_n(\lambda - \lambda_n) < \infty$, which implies that $\lambda_{m,n}/\lambda^{n-m}$ converges in $(0,\infty)$ as $n \to \infty$, that is, that $\{\lambda_{m,n}\}_{n=m}^{\infty} \equiv \{\lambda^{n-m}\}_{n=m}^{\infty}$. The same holds if $M_n \downarrow M$. If $M_n \to M$ but not monotonically, then consider the following elementwise minima and maxima:
\[
M_n^- = M_n \wedge M \quad\text{and}\quad M_n^+ = M_n \vee M.
\]

ny1

s

y1 y1 ly1 M . lŽ kq1.yn M nyŽ kq1. m , k Mm , k Ž l k Mk y l

Ý

ksm

Ža telescoping sum.. Thus, since 5 Mn 5 1 F l n , Ý`ns0 5 Mn y M 5 1 - ` and  l m, n 4`nsm '  l ny m 4`nsm , myn 5 ly1 M nym 5 1 m , n Mm , n y l

ny1

Ý

F

y1 5 ly1 M 51 k Mk y l

ksm `

y1