MR1169928 (93h:60070) 60G42 (60B10 60G57 62L20)
Kurtz, Thomas G. (1-WI-S)
Averaging for martingale problems and stochastic approximation.
Applied stochastic analysis (New Brunswick, NJ, 1991), 186–209, Lecture Notes in Control and Inform. Sci., 177, Springer, Berlin, 1992.

Let {(X_n, Y_n)} be a sequence of stochastic processes with values in a complete separable metric space, where Y_n is a "time scale" component and X_n has the following property: there exists an operator A such that for any function f(x) belonging to the domain of this operator the process f(X_n(t)) − ∫₀ᵗ Af(X_n(s), Y_n(s)) ds is a martingale. The author gives an approach to study the weak convergence of X_n as n → ∞. This method applies to various stochastic models with averaging, in particular to stochastic approximation algorithms. {For the entire collection see MR1169913 (93a:93100)}

Reviewed by R. Sh. Liptser

© Copyright American Mathematical Society 1993, 2008
AVERAGING FOR MARTINGALE PROBLEMS AND STOCHASTIC APPROXIMATION
Thomas G. Kurtz
Departments of Mathematics and Statistics
University of Wisconsin-Madison
Madison, WI 53706

0. Averaging

Let {(X_n,Y_n)} be a sequence of stochastic processes with values in a product space E_1 × E_2, where E_1 and E_2 are complete separable metric spaces. We are interested in studying the limiting behavior of the sequence under the assumption that the "time scale" of the second component Y_n is much faster than that of the first component X_n, in a sense that will be clear from our assumptions. The behavior of X_n will be related to that of Y_n
through the assumption that there is an operator A: 𝒟(A) ⊂ C(E_1) → C(E_1 × E_2) such that for f ∈ 𝒟(A)

(0.1) f(X_n(t)) − ∫₀ᵗ Af(X_n(s),Y_n(s)) ds

is a martingale, or more generally, that

(0.2) f(X_n(t)) − ∫₀ᵗ Af(X_n(s),Y_n(s)) ds + ε_n^f(t)

is a martingale and ε_n^f ⇒ 0. The fast process Y_n is generally specified in one of two ways. Y_n can simply be defined by setting Y_n(t) = Y(β_n t), where Y is an ergodic, stationary process and β_n → ∞, or we can assume that there is an operator B: 𝒟(B) ⊂ C(E_2) → C(E_1 × E_2) such that for g ∈ 𝒟(B)

(0.3) g(Y_n(t)) − ∫₀ᵗ β_n Bg(X_n(s),Y_n(s)) ds + δ_n^g(t)

is a martingale, β_n → ∞, and β_n^{−1} δ_n^g ⇒ 0. The goal in these problems is to show that X_n ⇒ X, where X can be characterized as a solution of a martingale problem for an operator C obtained by "averaging" A. Problems of this type for deterministic systems (Af(x,y) = b(x,y)·∇f(x) in the present context) have a long history. (See, for example, Lochak and Meunier (1988).) Stochastic models considered by Khas'minskii (1966a,b) fit the above formulation, as do the random evolutions introduced by Griego and Hersh (1969) and studied by a variety of workers.
See Hersh (1974), Pinsky (1974), Papanicolaou (1978), Kushner (1984), and Ethier and Kurtz (1986) for examples and further references.
In many of the examples, the hard work is in showing that the model satisfies the martingale conditions of (0.2). Our goal here, however, is to show how to carry the argument to completion once the model is put into this form, so at this point we only give two simple examples. Khas'minskii (1966a) is concerned with models of the form Ẋ_n(t) = F(X_n(t),Y_n(t)) where Y_n(t) = Y(nt). Setting Af(x,y) = F(x,y)·∇_x f(x), we see that

f(X_n(t)) − ∫₀ᵗ Af(X_n(s),Y_n(s)) ds = f(X_n(0)),

which is trivially a martingale. For a second example (a simple special case of Khas'minskii (1966b)), suppose that Y is an irreducible finite Markov chain with infinitesimal transition matrix Q = ((q(y,y'))) and stationary distribution π, Y_n(t) = Y(nt), Σ_y F(x,y)π(y) = 0, and Ẋ_n(t) = √n F(X_n(t),Y_n(t)). Employing the perturbation method of Kurtz (1973) (see Papanicolaou (1978) and Kushner (1979, 1984) for generalizations), let G satisfy Σ_{y'} q(y,y') G(x,y') = −F(x,y). (The solution will exist since π is the unique left eigenvector for the zero eigenvalue of Q and Σ_y F(x,y)π(y) = 0.) Define a_ij(x,y) = G_i(x,y) F_j(x,y), b_i(x,y) = Σ_j F_j(x,y) ∂_{x_j} G_i(x,y), and

Af(x,y) = Σ_{ij} a_ij(x,y) ∂²f(x)/∂x_i∂x_j + Σ_i b_i(x,y) ∂f(x)/∂x_i.

Then

f(X_n(t)) − ∫₀ᵗ Af(X_n(s),Y_n(s)) ds + (1/√n) G(X_n(t),Y_n(t))·∇f(X_n(t))

is a martingale of the form (0.2). Much of the material below exists at least implicitly in the earlier work, but some of the most useful ideas (e.g., Lemma 1.5) are, as far as we know, stated explicitly for the first time. Section 1 contains basic results on convergence of random measures. (See Kallenberg (1986) for more details.) Section 2 contains the basic limit theorem and a number of examples. We also introduce a notion of an averaged martingale problem
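The averaging effect in the first example can be seen numerically. The following is a minimal simulation sketch (not part of the original text; the generator Q, the field F, and all numerical parameters are illustrative choices): as n grows, X_n should track the solution of the averaged equation ẋ = Σ_y F(x,y)π(y).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-state fast chain with generator Q and stationary law pi.
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])
pi = np.array([2.0 / 3.0, 1.0 / 3.0])       # solves pi @ Q = 0

def F(x, y):
    # Illustrative driving field; the averaged field is Fbar below.
    return -x + (2.0 if y == 0 else -1.0)

def Fbar(x):
    # Averaged field: sum_y F(x, y) pi(y); here Fbar(x) = -x + 1.
    return sum(F(x, y) * pi[y] for y in (0, 1))

def simulate(n, T=4.0, dt=1e-4):
    """Euler scheme for X_n' = F(X_n, Y(nt)); Y(n.) jumps at rate n|q(y,y)|."""
    steps = int(T / dt)
    x, y = 0.0, 0
    path = np.empty(steps)
    for k in range(steps):
        if rng.random() < -n * Q[y, y] * dt:   # jump of the fast chain
            y = 1 - y
        x += F(x, y) * dt
        path[k] = x
    return path

# The averaged ODE x' = Fbar(x) on the same Euler grid.
T, dt = 4.0, 1e-4
xbar = np.empty(int(T / dt))
x = 0.0
for k in range(xbar.size):
    x += Fbar(x) * dt
    xbar[k] = x

for n in (5, 50, 500):
    err = np.max(np.abs(simulate(n, T, dt) - xbar))
    print(f"n = {n:4d}   sup_t |X_n(t) - x(t)| ~ {err:.3f}")
```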
corresponding to the operators A and B in (0.2) and (0.3). Section 3 applies the averaging arguments to stochastic approximation algorithms following results of Kushner and Schwartz (1984) and Dupuis and Kushner (1989). Throughout, C(S) will denote the space of bounded, continuous, real-valued functions on a metric space S, C_E[0,∞) will denote the space of continuous, E-valued functions on [0,∞), and D_E[0,∞) will denote the space of E-valued functions that are right continuous on [0,∞) and have left limits at each t > 0. The space E will always be a complete, separable metric space.

1. Convergence of random measures

Let (S,d) be a complete separable metric space, and let
M(S) be the space of finite measures on S with the weak topology. The Prohorov metric on M(S) is defined by

(1.1) ρ(μ,ν) = inf{ε > 0: μ(B) ≤ ν(B^ε) + ε, ν(B) ≤ μ(B^ε) + ε, B ∈ 𝔅(S)},

where B^ε = {x ∈ S: inf_{y∈B} d(x,y) < ε}.
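For intuition, when S is a small finite set the infimum in (1.1) can be evaluated by brute force. The sketch below (illustrative only; it searches a grid of candidate ε rather than minimizing exactly, and is exponential in |S|) computes an approximate Prohorov distance between two discrete measures.

```python
import itertools
import numpy as np

def prohorov(points, mu, nu, d, eps_grid):
    """Approximate Prohorov distance between finite measures mu, nu on the
    finite set `points`, by brute force over all subsets B and a grid of
    candidate epsilons (a sketch; exponential in |points|)."""
    idx = range(len(points))
    subsets = [set(B) for r in range(len(points) + 1)
               for B in itertools.combinations(idx, r)]

    def mass(m, B):
        return sum(m[i] for i in B)

    def blowup(B, eps):
        # B^eps = {x : inf_{y in B} d(x, y) < eps}
        return {i for i in idx
                if any(d(points[i], points[j]) < eps for j in B)}

    for eps in sorted(eps_grid):
        if all(mass(mu, B) <= mass(nu, blowup(B, eps)) + eps and
               mass(nu, B) <= mass(mu, blowup(B, eps)) + eps
               for B in subsets):
            return eps
    return float("inf")

# Two measures on {0, 1, 2} with d(x, y) = |x - y|; answer is about 0.2.
pts = [0.0, 1.0, 2.0]
mu = [0.5, 0.5, 0.0]
nu = [0.4, 0.4, 0.2]
print(prohorov(pts, mu, nu, lambda x, y: abs(x - y),
               np.linspace(0.01, 2.0, 200)))
```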
Theorem 2.1 Suppose that for each ε > 0 and T > 0, there exists a compact
K ⊂ E_1 such that

(2.1) inf_n P{X_n(t) ∈ K, t ≤ T} ≥ 1 − ε,

and assume that {Y_n(t): t ≥ 0, n = 1,2,...} is relatively compact (as a collection of E_2-valued random variables). Suppose there is an operator A: 𝒟(A) ⊂ C(E_1) → C(E_1 × E_2) such that for f ∈ 𝒟(A) there is a process ε_n^f for which

(2.2) f(X_n(t)) − ∫₀ᵗ Af(X_n(s),Y_n(s)) ds + ε_n^f(t)

is an {ℱ_t^n}-martingale. Let 𝒟(A) be dense in C(E_1) in the topology of uniform convergence on compact sets. Suppose that for each f ∈ 𝒟(A) and each T > 0, there exists p > 1 such that

(2.3)
sup_n E[∫₀ᵀ |Af(X_n(s),Y_n(s))|^p ds] < ∞
and

(2.4) lim_{n→∞} E[sup_{t≤T} |ε_n^f(t)|] = 0.
Let Γ_n be the ℒ_m(E_2)-valued random variable given by

(2.5) Γ_n([0,t] × B) = ∫₀ᵗ I_B(Y_n(s)) ds.
Then {(X_n,Γ_n)} is relatively compact in D_{E_1}[0,∞) × ℒ_m(E_2), and for any limit point (X,Γ) there exists a filtration {𝒢_t} such that

(2.6) f(X(t)) − ∫_{[0,t]×E_2} Af(X(s),y) Γ(ds × dy)
is a {𝒢_t}-martingale for each f ∈ 𝒟(A).

Proof The relative compactness of {X_n} follows from (2.1), (2.3), and (2.4) by Theorems 3.9.1 and 3.9.4 of Ethier and Kurtz (1986). By the relative compactness of {Y_n(t): t ≥ 0, n = 1,2,...}, for each ε > 0 there exists a compact K ⊂ E_2 such that P{Y_n(t) ∈ K} ≥ 1 − ε, and hence E[Γ_n([0,t] × K)] ≥ t(1 − ε). Consequently, the relative compactness of {Γ_n} follows by Lemma 1.3. Let (X,Γ) be a limit point of {(X_n,Γ_n)} and define 𝒢_t = σ{X(s), Γ([0,s] × H): s ≤ t, H ∈ 𝔅(E_2)}.
For f ∈ 𝒟(A),

M_n(t) = f(X_n(t)) − ∫₀ᵗ Af(X_n(s),Y_n(s)) ds + ε_n^f(t) = f(X_n(t)) − ∫_{[0,t]×E_2} Af(X_n(s),y) Γ_n(ds × dy) + ε_n^f(t)

is a martingale, and for each t,

Z_n(t) = f(X_n(t)) − ∫₀ᵗ Af(X_n(s),Y_n(s)) ds = f(X_n(t)) − ∫_{[0,t]×E_2} Af(X_n(s),y) Γ_n(ds × dy)

is uniformly integrable and, applying Lemma 1.5, converges in distribution along an appropriate subsequence to (2.6). Since by (2.4), lim_{n→∞} E[|Z_n(t) − M_n(t)|] = 0, it follows that (2.6) is a {𝒢_t}-martingale. □

Example 2.2 Suppose Y is stationary and ergodic, and Y_n in Theorem 2.1 is given by Y_n(t) = Y(nt). Then Γ = m × π, where m denotes Lebesgue measure and π is the marginal distribution for Y. Consequently, under the other assumptions of Theorem 2.1, X is a solution of the martingale problem for C given by Cf(x) = ∫ Af(x,y) π(dy), f ∈ D_1.
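Example 2.2 can be checked numerically. In the sketch below (not from the original; the transition matrix P and all parameters are illustrative), Y is a stationary two-state chain and the occupation measures Γ_n of (2.5) approach m × π.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stationary ergodic Y: a two-state chain, one step per unit time, with
# transition matrix P and stationary law pi; all values illustrative.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])
pi = np.array([0.75, 0.25])                 # solves pi @ P = pi

def occupation(n, t=1.0, B=(0,), dt=1e-3):
    """Gamma_n([0,t] x B) = int_0^t 1_B(Y(ns)) ds for Y_n(s) = Y(ns)."""
    y = rng.choice(2, p=pi)                 # start stationary
    total, s = 0.0, 0.0
    while s < t:
        # Y is piecewise constant on unit intervals, so Y(ns) changes
        # whenever ns crosses an integer, i.e., every 1/n units of s.
        if y in B:
            total += min(dt, t - s)
        if int(n * (s + dt)) > int(n * s):  # crossed a chain step
            y = rng.choice(2, p=P[y])
        s += dt
    return total

for n in (10, 100, 1000):
    print(f"n = {n:5d}   Gamma_n([0,1] x {{0}}) ~ {occupation(n):.3f}"
          f"   (limit t*pi(0) = {pi[0]:.3f})")
```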
Example 2.3 Suppose that there is an operator B: 𝒟(B) ⊂ C(E_2) → C(E_1 × E_2) such that for g ∈ 𝒟(B)

(2.7) g(Y_n(t)) − ∫₀ᵗ β_n Bg(X_n(s),Y_n(s)) ds + δ_n^g(t)

is an {ℱ_t^n}-martingale, β_n → ∞, and for each T > 0, lim_{n→∞} E[sup_{t≤T} β_n^{−1} |δ_n^g(t)|] = 0. Then under the assumptions of Theorem 2.1, it follows that
(2.8) ∫_{[0,t]×E_2} Bg(X(s),y) Γ(ds × dy)

is a martingale. But (2.8) is continuous and of bounded variation and hence must be constant. Consequently, for each g ∈ 𝒟(B), with probability one,

(2.9) ∫_{[0,t]×E_2} Bg(X(s),y) Γ(ds × dy) = 0
for all t > 0. Let γ be as in Lemma 1.4. Suppose that there exists a countable subset D̂ ⊂ 𝒟(B) such that the closure of {(g,Bg): g ∈ D̂} in C(E_2) × C(E_1 × E_2) is the same as the closure of {(g,Bg): g ∈ 𝒟(B)}. For example, such a subset would exist if E_1 and E_2 were compact (by the separability of C(E_2) × C(E_1 × E_2)), or if 𝒟(B) = C_c^∞ and B is a second-order differential operator with bounded coefficients. By (2.9),

∫₀ᵗ ∫_{E_2} Bg(X(s),y) γ_s(dy) ds = 0

for all t a.s., and hence

(2.10) ∫_{E_2} Bg(X(s),y) γ_s(dy) = 0
a.e. m, a.s. Consequently, with probability one, there exists a single set Q ⊂ [0,∞) with m(Q) = 0 such that (2.10) holds for all g ∈ D̂ and all s ∈ [0,∞) − Q. But the choice of D̂ ensures that (2.10) holds for all g ∈ 𝒟(B) and all s ∈ [0,∞) − Q. Define B_x: 𝒟(B) → C(E_2) by B_x g(y) = Bg(x,y). Suppose that there is a unique measure π_x ∈ 𝒫(E_2) satisfying ∫ B_x g dπ_x = 0 for all g ∈ D_2. (If B_x is the generator for an E_2-valued Markov process, this assumption is essentially the assertion that there is a unique stationary distribution corresponding to B_x.) Then we can take γ_s = π_{X(s)}, and defining C on D_1 by Cf(x) = ∫ Af(x,y) π_x(dy), it follows that X is a solution of the martingale problem for C.
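When E_2 is finite, π_x and the averaged operator C can be computed by elementary linear algebra. The following sketch (illustrative rates and field, and a finite-difference stand-in for f′) carries out Cf(x) = ∫ Af(x,y) π_x(dy).

```python
import numpy as np

# Finite-state sketch of the averaging construction: for each frozen x,
# B_x is the rate matrix of the fast variable, pi_x its stationary law,
# and C f(x) = sum_y A f(x, y) pi_x(y). All choices below are illustrative.

def stationary(Qx):
    """Solve pi Qx = 0, sum pi = 1, via least squares."""
    m = Qx.shape[0]
    M = np.vstack([Qx.T, np.ones(m)])
    v = np.concatenate([np.zeros(m), [1.0]])
    s, *_ = np.linalg.lstsq(M, v, rcond=None)
    return s

def Bx(x):
    # x-dependent rates of the fast chain (illustrative).
    r = 1.0 + x * x
    return np.array([[-r,   r],
                     [1.0, -1.0]])

def Af(x, y, f, h=1e-5):
    # A f(x,y) = F(x,y) f'(x) with an illustrative field F and a
    # central-difference approximation of f'.
    F = -x + (1.0 if y == 0 else -1.0)
    return F * (f(x + h) - f(x - h)) / (2 * h)

def Cf(x, f):
    pix = stationary(Bx(x))
    return sum(Af(x, y, f) * pix[y] for y in (0, 1))

print(Cf(0.5, lambda x: x * x))   # averaged generator applied to f(x) = x^2
```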
Example 2.4 Let σ: ℝ^d × ℝ^d → M^{d×m} and b: ℝ^d × ℝ^d → ℝ^d be continuous, and suppose that σ(x,y) and b(x,y) are periodic with period 1 in the last d coordinates (y_1,...,y_d). Suppose that X_n satisfies the Itô equation

dX_n = σ(X_n, nX_n) dW + b(X_n, nX_n) dt
and that {X_n(0)} is relatively compact. Set a = σσᵀ, and assume that there exists a constant K such that |a(x,y)| ≤ K(1 + |x|²) and x·b(x,y) ≤ K(1 + |x|²) for all x, y. This assumption ensures that {X_n} satisfies the compact containment condition. Define Y_n(t) = nX_n(t) mod 1. Let 𝒟(A) be the linear span of 1 and C_c²(ℝ^d), and for f ∈ 𝒟(A), define

Af(x,y) = (1/2) Σ_{ij} a_ij(x,y) ∂²f(x)/∂x_i∂x_j + Σ_i b_i(x,y) ∂f(x)/∂x_i.

Let 𝒟(B) be the collection of C² functions g on [0,1]^d such that g and its first two derivatives satisfy periodic boundary conditions, that is, the periodic extension of g to all of ℝ^d is C². For g ∈ 𝒟(B) define
Bg(x,y) = (1/2) Σ_{ij} a_ij(x,y) ∂²g(y)/∂y_i∂y_j

and

Hg(x,y) = Σ_i b_i(x,y) ∂g(y)/∂y_i.
It follows from Itô's formula that

f(X_n(t)) − ∫₀ᵗ Af(X_n(s),Y_n(s)) ds

and

g(Y_n(t)) − ∫₀ᵗ n² B_n g(X_n(s),Y_n(s)) ds = g(Y_n(t)) − ∫₀ᵗ n² Bg(X_n(s),Y_n(s)) ds − ∫₀ᵗ n Hg(X_n(s),Y_n(s)) ds

are {ℱ_t^n}-martingales for ℱ_t^n = σ(X_n(s): s ≤ t).
Let E_1 be ℝ^d and E_2 = [0,1]^d. Then the conditions of the previous example are satisfied. If
∫ B_x g dπ_x = 0, g ∈ 𝒟(B), then by Echeverría's theorem (Ethier and Kurtz (1986), Theorem 4.9.17), π_x is a stationary distribution for B_x. If B_x has a unique stationary distribution, for example, if a is positive definite, then any limit point X of {X_n} is a solution of the martingale problem for C given by Cf(x) = ∫ Af(x,y) π_x(dy). For related work and further references on diffusions with rapidly varying periodic coefficients, see Bensoussan, Lions, and Papanicolaou (1978) and Bhattacharya, Gupta, and Walker (1989). □
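A crude Euler-Maruyama sketch of Example 2.4 in d = 1 (not from the original; σ, b, and the step sizes are illustrative, and the time step only partially resolves the fast scale for large n) suggests how the law of X_n(T) stabilizes as n grows.

```python
import numpy as np

rng = np.random.default_rng(2)

# Euler-Maruyama sketch for dX_n = sigma(X_n, n X_n) dW + b(X_n, n X_n) dt
# with period-1 dependence in y; coefficients are illustrative.

def sigma(x, y):
    return 1.0 + 0.5 * np.sin(2 * np.pi * y)

def b(x, y):
    return -x + np.cos(2 * np.pi * y)

def path(n, T=1.0, dt=1e-3, x0=1.0):
    x = x0
    for _ in range(int(T / dt)):
        y = (n * x) % 1.0                   # the fast variable Y_n = n X_n mod 1
        x += b(x, y) * dt + sigma(x, y) * np.sqrt(dt) * rng.standard_normal()
    return x

# As n grows, the law of X_n(T) should stabilize (homogenized limit).
for n in (1, 10, 100):
    samples = [path(n) for _ in range(200)]
    print(f"n = {n:4d}   mean {np.mean(samples):+.3f}   std {np.std(samples):.3f}")
```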
Let A: 𝒟(A) ⊂ C(E_1) → C(E_1 × E_2) and B: 𝒟(B) ⊂ C(E_2) → C(E_1 × E_2). A process (X,Γ) in D_{E_1}[0,∞) × ℒ_m(E_2) will be called a solution of the averaged martingale problem for (A,B) if for each f ∈ 𝒟(A) and g ∈ 𝒟(B),

(2.11) f(X(t)) − ∫_{[0,t]×E_2} Af(X(s),y) Γ(ds × dy)

and

(2.12) ∫_{[0,t]×E_2} Bg(X(s),y) Γ(ds × dy)

are {𝒢_t}-martingales for a filtration {𝒢_t} with respect to which X and Γ are adapted. Of course, as above, (2.12) being a martingale implies that (2.12) is zero. Integrating h(X(s)) against (2.12), we get
(2.13) ∫_{[0,t]×E_2} h(X(s)) Bg(X(s),y) Γ(ds × dy) = 0.
We observe that a solution of the averaged martingale problem can be viewed as a solution of a controlled martingale problem (see Kurtz (1987)) in which E_2 is the control space and Γ is the relaxed control. A solution of the averaged martingale problem is stationary if X is stationary and Γ has stationary increments, that is, Γ([a+t, b+t] × G) is stationary in t for all choices of a < b and G ∈ 𝔅(E_2). If (X,Γ) is stationary, then the measure π ∈ 𝒫(E_1 × E_2) determined by
(2.14) π(G_1 × G_2) = E[∫_{[0,1]×E_2} I_{G_1}(X(s)) I_{G_2}(y) Γ(ds × dy)]

satisfies

(2.15) ∫_{E_1×E_2} Af(x,y) π(dx × dy) = 0,  ∫_{E_1×E_2} h(x) Bg(x,y) π(dx × dy) = 0
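For a finite-state illustration of (2.15), one can build π(dx × dy) = μ(dx) π_x(dy) from x-dependent fast generators and check both equations directly. In the sketch below (all rate matrices are illustrative), Af(x,y) = (Q_A(y) f)(x), Bg(x,y) = (Q_B(x) g)(y), π_x is stationary for Q_B(x), and μ is stationary for the averaged generator C.

```python
import numpy as np

def stationary(Q):
    # Solve pi Q = 0 with sum pi = 1 via least squares.
    m = Q.shape[0]
    M = np.vstack([Q.T, np.ones(m)])
    v = np.concatenate([np.zeros(m), [1.0]])
    s, *_ = np.linalg.lstsq(M, v, rcond=None)
    return s

QA = {0: np.array([[-1.0,  1.0], [2.0, -2.0]]),   # slow rates when y = 0
      1: np.array([[-3.0,  3.0], [1.0, -1.0]])}   # slow rates when y = 1
QB = {0: np.array([[-1.0,  1.0], [1.0, -1.0]]),   # fast rates when x = 0
      1: np.array([[-2.0,  2.0], [1.0, -1.0]])}   # fast rates when x = 1

pix = {x: stationary(QB[x]) for x in (0, 1)}
# Averaged generator: row x of C averages the rows of QA(y) against pi_x.
C = np.array([sum(pix[x][y] * QA[y][x, :] for y in (0, 1)) for x in (0, 1)])
mu = stationary(C)
pi = np.array([[mu[x] * pix[x][y] for y in (0, 1)] for x in (0, 1)])

# First equation of (2.15), for basis functions f.
for f in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    print("int Af dpi =", sum(pi[x, y] * (QA[y] @ f)[x]
                              for x in (0, 1) for y in (0, 1)))
# Second equation, with h an indicator of {x} and basis functions g.
for x in (0, 1):
    for g in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
        print("int_y Bg dpi =", sum(pi[x, y] * (QB[x] @ g)[y] for y in (0, 1)))
```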
for all f ∈ 𝒟(A), g ∈ 𝒟(B), and h ∈ B(E_1). We now address the converse problem: when does a measure satisfying (2.15) correspond to a stationary solution of the averaged martingale problem? In particular, we extend Echeverría's theorem to this setting (see Ethier and Kurtz (1986), Theorem 4.9.17), or, more precisely, we extend the analogue of Echeverría's theorem for controlled martingale problems given by Stockbridge (1990). For f ∈ 𝒟(A), define A_y f(x) = Af(x,y), and for g ∈ 𝒟(B), define B_x g(y) = Bg(x,y).

Theorem 2.5 Let E_1 and E_2 be locally compact and separable, and let E_i^Δ = E_i ∪ {Δ_i} denote their one-point compactifications. We assume the following conditions for A and B:
i. 𝒟(A) is an algebra and is dense in C(E_1).
ii. For each y, A_y satisfies the positive maximum principle.
iii. For each compact K ⊂ E_2, lim_{x→Δ_1} sup_{y∈K} |Af(x,y)| = 0.
iv. There exists ψ ∈ C(E_2), ψ > 0, such that for each f ∈ 𝒟(A) there exist constants a_f and b_f satisfying |Af(x,y)| ≤ a_f + b_f ψ(y).
v. There exists φ ∈ C(E_1), φ > 0, such that for each g ∈ 𝒟(B) there exist constants a_g and b_g satisfying |Bg(x,y)| ≤ a_g + b_g φ(x).
Suppose π ∈ 𝒫(E_1 × E_2) satisfies (2.15) and
(2.16) ∫_{E_1×E_2} (φ(x) + ψ(y)) π(dx × dy) < ∞.
Then there exists a stationary solution of the averaged martingale problem satisfying (2.14).

Remark Note that Conditions iv and v and (2.16) ensure that Af and Bg are integrable with respect to π.
Proof Without loss of generality, we can assume that φ and ψ are strictly positive and that φ ∈ C(E_1) and ψ ∈ C(E_2). The proof of the theorem is much the same as the proofs of Theorems 4.1 and 4.7 of Stockbridge (1990). In particular, by replacing A, B, and π by
(2.17) A_n f(x,y) = (n/(n ∨ ψ(y))) Af(x,y),  B_n g(x,y) = (n²/((n ∨ ψ(y))(n ∨ φ(x)))) Bg(x,y)

and

(2.18) π_n(dx × dy) = c_n (n ∨ ψ(y)) π(dx × dy),
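Note that π_n again satisfies (2.15) for the pair (A_n, B_n): for f ∈ 𝒟(A),

∫_{E_1×E_2} A_n f dπ_n = c_n n ∫_{E_1×E_2} Af dπ = 0,

since the factor n ∨ ψ(y) in π_n cancels the one in A_n f, and for g ∈ 𝒟(B) and h ∈ B(E_1),

∫_{E_1×E_2} h(x) B_n g dπ_n = c_n ∫_{E_1×E_2} h̃(x) Bg dπ = 0, where h̃(x) = n² h(x)/(n ∨ φ(x)) ∈ B(E_1).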
where c_n is a constant that normalizes π_n to be a probability measure, we can first prove the theorem under the assumption that A𝒟(A), B𝒟(B) ⊂ C(E_1 × E_2) and then obtain the general case by taking the limit as n → ∞. (In order to ensure that A_n f and B_n g are in C(E_1 × E_2), one can find ψ* ∈ C(E_2) and φ* ∈ C(E_1) satisfying the integrability condition (2.16), but tending to ∞ faster than the original ψ and φ.) Under this assumption, all functions involved extend continuously to E_1^Δ × E_2^Δ, so we may as well assume E_1 and E_2 are compact.

We need the following variation of the result in Lemma 4.2 of Stockbridge (1990). Assuming now that E_1 and E_2 are compact, let f_1,...,f_m ∈ 𝒟(A) and g_1,...,g_m ∈ 𝒟(B), and let H be a polynomial on ℝ^m that is convex on [−a_1,a_1] × ⋯ × [−a_m,a_m], where a_i ≥ ‖f_i − (1/n)(Af_i + Bg_i)‖ ∨ ‖f_i‖. Then

(2.19) ∫_{E_1×E_2} H(f_1(x) − (1/n)(Af_1(x,y) + Bg_1(x,y)), ..., f_m(x) − (1/n)(Af_m(x,y) + Bg_m(x,y))) π(dx × dy)
  ≥ ∫_{E_1×E_2} (H(f_1(x),...,f_m(x)) − (1/n)∇H(f_1(x),...,f_m(x))·(Af_1(x,y),...,Af_m(x,y)) − (1/n)∇H(f_1(x),...,f_m(x))·(Bg_1(x,y),...,Bg_m(x,y))) π(dx × dy)
  ≥ ∫_{E_1×E_2} H(f_1(x),...,f_m(x)) π(dx × dy),
where the first inequality follows from the convexity of H, and the second follows from the facts that the dot product in the second term of the second integrand is negative by Lemma 3.3 of Stockbridge (1990) and that the third term integrates to zero by (2.15). Note that the inequality between the first and third expressions can be extended to arbitrary convex functions. Let M ⊂ C(E_1 × E_1 × E_2) be the linear subspace of functions of the form
(2.20) F(x_1,x_2,y) = Σ_{i=1}^m h_i(x_1)(f_i(x_2) − (1/n)(Af_i(x_2,y) + Bg_i(x_2,y))) + h_0(x_2,y),
where f_i ∈ 𝒟(A), g_i ∈ 𝒟(B), and h_i ∈ C(E_1) for i = 1,...,m, and h_0 ∈ C(E_1 × E_2). For F ∈ M, define the linear functional Ψ by

(2.21) ΨF = ∫_{E_1×E_2} (Σ_{i=1}^m h_i(x) f_i(x) + h_0(x,y)) π(dx × dy).

Define the convex function H: ℝ^m → ℝ by H(r_1,...,r_m) = sup_{x∈E_1} Σ_{i=1}^m h_i(x) r_i. Then

ΨF ≤ ∫_{E_1×E_2} (H(f_1(x),...,f_m(x)) + h_0(x,y)) π(dx × dy)
  ≤ ∫_{E_1×E_2} (H(f_1(x) − (1/n)(Af_1(x,y) + Bg_1(x,y)), ..., f_m(x) − (1/n)(Af_m(x,y) + Bg_m(x,y))) + h_0(x,y)) π(dx × dy)
  = ∫_{E_1×E_2} sup_{x_1} F(x_1,x_2,y) π(dx_2 × dy) ≤ ‖F‖.
If F ≡ 1, then ΨF = 1, so by the Riesz representation theorem, there exists a measure ν ∈ 𝒫(E_1 × E_1 × E_2) such that

(2.23) ΨF = ∫_{E_1×E_1×E_2} F(x_1,x_2,y) ν(dx_1 × dx_2 × dy).

We can write ν(dx_1 × dx_2 × dy) = π_0(dx_1) η(x_1, dx_2 × dy), where, by taking F(x_1,x_2,y) = h(x_1), we can see that π_0(·) = π(· × E_2). Consider η as a transition function on E_1 × E_2. Then we have
(2.24) ∫_{E_1×E_2} ∫_{E_1×E_2} h_0(x_2,y_2) η(x_1, dx_2 × dy_2) π(dx_1 × dy_1) = ∫_{E_1×E_2} h_0(x_2,y_2) π(dx_2 × dy_2),
and we see that π is a stationary measure for η. Finally, let {(U_k,V_k)} be a Markov chain with transition function η and initial distribution π (hence it is stationary). Then, by the definition of η,

(2.25) E[h(U_k)(f(U_{k+1}) − (1/n)(Af(U_{k+1},V_{k+1}) + Bg(U_{k+1},V_{k+1})))] = E[h(U_k) f(U_k)],
and defining X_n(t) = U_{[nt]} and Y_n(t) = V_{[nt]}, it follows from the Markov property that

(2.26) f(X_n(t)) − ∫₀^{[nt]/n} Af(X_n(s + 1/n), Y_n(s + 1/n)) ds

and

(2.27) ∫₀^{[nt]/n} Bg(X_n(s + 1/n), Y_n(s + 1/n)) ds

are martingales. Defining

(2.28) Γ_n([0,t] × G) = ∫₀^{[nt]/n} I_G(Y_n(s + 1/n)) ds,
(X_n,Γ_n) converges to the desired process as in Theorem 2.1. □
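For a finite state space, the last step of the proof is easy to mimic. The sketch below (the transition matrix η and all parameters are illustrative) runs the stationary chain (U_k, V_k) and forms the time-scaled process X_n(t) = U_{[nt]}.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative transition matrix eta on E1 x E2 = {0,1} x {0,1}.
states = [(x, y) for x in (0, 1) for y in (0, 1)]
eta = np.array([[0.40, 0.30, 0.20, 0.10],
                [0.25, 0.25, 0.25, 0.25],
                [0.10, 0.20, 0.30, 0.40],
                [0.30, 0.20, 0.20, 0.30]])

# Stationary law of eta: left eigenvector for eigenvalue 1.
w, V = np.linalg.eig(eta.T)
pi = np.real(V[:, np.argmax(np.real(w))])
pi /= pi.sum()

def chain(length):
    k = rng.choice(4, p=pi)                 # start stationary
    out = [k]
    for _ in range(length - 1):
        k = rng.choice(4, p=eta[k])
        out.append(k)
    return out

n, T = 100, 2.0
path = chain(int(n * T) + 1)
X = lambda t: states[path[int(n * t)]][0]   # X_n(t) = U_[nt]
print([X(t) for t in np.linspace(0.0, 2.0, 9)])
```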
3. Stochastic approximation

Let S be a complete separable metric space. We consider a discrete-time process {(X_k,Y_k,U_k,a_k)} in ℝ^d × S × ℝ^d × (0,∞) adapted to a filtration {ℱ_k}. For each n and t ≥ 0, let η_n(t) satisfy
(3.1) Σ_{k=n}^{η_n(t)−1} a_k ≤ t < Σ_{k=n}^{η_n(t)} a_k.

→ 0 a.s., and (3.16) and assumption A.7 imply that for every ε > 0 there exists a compact K ⊂ D_{ℝ^d}[0,∞) such that P{X_n ∈ K, n = 1,2,...} ≥ 1 − ε. This conclusion along with assumption A.8 implies that for each ε > 0 there exist compact sets K_1 ⊂ D_{ℝ^d}[0,∞) and K_2 ⊂ ℒ_m(S) such that P{(X_n,Γ_n) ∈ K_1 × K_2, n = 1,2,...} ≥ 1 − ε. By (3.15), with probability one, any limit point (x,μ) of {(X_n,Γ_n)} satisfies

(3.17) ∫_{[0,t]×S} (qh(x(s),y) − h(x(s),y)) μ(ds × dy) = 0
for h satisfying the above assumptions. The collection of h for which (3.17) holds is closed under bounded, pointwise convergence, and hence includes all of B(ℝ^d × S). By the separability of S, there exists a countable subset D ⊂ C(ℝ^d × S) such that the bounded, pointwise closure of D is B(ℝ^d × S). As in Example 2.3, there exists J ⊂ [0,∞) with m(J) = 0 and γ_s ∈ 𝒫(S) such that for s ∈ [0,∞) − J,

(3.18) ∫_S (qh(x(s),y) − h(x(s),y)) γ_s(dy) = 0

for all h ∈ B(ℝ^d × S). A.3 and (3.18) imply that γ_s = π_{x(s)}, and A.1 and Lemma 1.5 then give
(3.19) x(t) = x(0) + ∫₀ᵗ F̄(x(s)) ds,

where F̄(x) = ∫_S F(x,y) π_x(dy).
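The limit (3.19) is the usual ODE-method conclusion, and a scalar sketch is immediate (not from the original; the chain P, the field F, and the step sizes a_k = 1/(k+1) are illustrative): the iterates converge to the root of the averaged field F̄.

```python
import numpy as np

rng = np.random.default_rng(4)

# Sketch of (3.19) for a scalar Robbins-Monro scheme with Markov noise:
# X_{k+1} = X_k + a_k F(X_k, Y_k), a_k = 1/(k+1). All choices illustrative.

P = np.array([[0.8, 0.2],
              [0.4, 0.6]])
pi = np.array([2.0 / 3.0, 1.0 / 3.0])       # solves pi @ P = pi

def F(x, y):
    # Averaged field Fbar(x) = sum_y F(x,y) pi(y) = -x, so the root is 0.
    return (1.0 if y == 0 else -2.0) - x

x, y = 5.0, 0
for k in range(200000):
    x += (1.0 / (k + 1)) * F(x, y)
    y = rng.choice(2, p=P[y])
print("X_k ->", x)                          # should approach the root 0
```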
Remark 3.2 Dupuis and Kushner (1989) obtain a.s. convergence of stochastic approximation algorithms using large deviation estimates. In the present context, results analogous to theirs can be obtained by observing that the limit in (3.15) holds with A.5 and A.6 replaced by the following condition.

A.9 Let {c_k} be a nonincreasing sequence of constants satisfying c_k ≥ sup_{m≥k} a_m, and let {α_n} be a nonnegative sequence satisfying Σ_n e^{−α_n ε} < ∞ for all ε > 0. For each T > 0, the following hold with probability one.
(3.4)′ sup_n Σ_{k=n}^{η_n(T)} c_k² < ∞, for each T > 0.

(3.5)′ lim_{n→∞} Σ_{k=n}^{η_n(T)} E[|a_{k+1} − a_k| | ℱ_k] = 0 a.s.

(3.6)′ lim_{n→∞} sup_{t≤T} |Σ_{k=n}^{η_n(t)} a_k U_{k+1}| = 0 a.s.

(3.7)′ lim_{n→∞} Σ_{k=n}^{η_n(T)} a_k E[|U_{k+1}| | ℱ_k] = 0 a.s.
The only complication in checking that A.9 can be used in place of A.5 and A.6 in the above argument is to verify that for each T > 0,

(3.20) lim_{n→∞} sup_{n ≤ m ≤ η_n(T)} |M_m − M_n| = 0 a.s.

Let ζ_n(R) = min{m: Σ_{k=n}^m c_k² ≥ R}. Then, by (3.4)′, (3.20) will hold if

lim_{n→∞} sup_{n ≤ m ≤ ζ_n(R)} |M_m − M_n| = 0 a.s.

for each R > 0. Observing that |M_{k+1} − M_k| ... for ε > 0 and R > 0,
P{sup_n ...