Convergence of Empirical Processes for Interacting Particle Systems with Applications to Non-linear Filtering 

P. Del Moral, M. Ledoux

Abstract

In this paper, we investigate the convergence of empirical processes for a class of interacting particle numerical schemes arising in biology, genetic algorithms and advanced signal processing. The Glivenko-Cantelli and Donsker theorems presented in this work extend the corresponding statements in the classical theory and apply to a class of genetic type particle numerical schemes of the non-linear filtering equation.

Keywords: Empirical processes, Interacting particle systems, Glivenko-Cantelli and Donsker theorems. A.M.S. codes: 60G35, 92D25

UMR C55830, CNRS, Univ. Paul-Sabatier, 31062 Toulouse. E-mail: [email protected], [email protected]

1 Introduction

1.1 Background and Motivations

Let $E$ be a Polish space endowed with its Borel $\sigma$-field $\mathcal{B}(E)$. We denote by $M_1(E)$ the space of all probability measures on $E$ equipped with the weak topology. We recall that the weak topology is generated by the bounded continuous functions on $E$, and we denote by $C_b(E)$ the space of these functions. Let $\Phi_n : M_1(E) \to M_1(E)$, $n \ge 1$, be a sequence of continuous functions. Starting from this family we can consider an interacting $N$-particle system $\xi_n = (\xi_n^1, \dots, \xi_n^N)$, $n \ge 0$, which is a Markov process with state space $E^N$ and transition probability kernels

$$ P\big(\xi_n \in dx \mid \xi_{n-1} = z\big) = \prod_{p=1}^N \Phi_n\Big(\frac{1}{N} \sum_{q=1}^N \delta_{z^q}\Big)(dx^p) \qquad (1) $$

where $dx \stackrel{\mathrm{def}}{=} dx^1 \times \cdots \times dx^N$ is an infinitesimal neighborhood of the point $x = (x^1, \dots, x^N) \in E^N$, $z = (z^1, \dots, z^N) \in E^N$ and $\delta_a$ stands for the Dirac measure at $a \in E$. Let us introduce the empirical distributions $\eta_n^N$ of the $N$-particle system

$$ \eta_n^N = \frac{1}{N} \sum_{i=1}^N \delta_{\xi_n^i} $$

which is a random measure on $E$. Assume that $\xi_0 = (\xi_0^1, \dots, \xi_0^N)$ is a sequence of $N$ independent variables with common law $\eta_0 \in M_1(E)$. Under rather general assumptions, it can be shown that $\eta_n^N$ converges weakly to a non-random probability distribution $\eta_n \in M_1(E)$ as $N \to \infty$ (see [11, 14]) and that $\{\eta_n; n \ge 0\}$ solves the following non-linear measure valued process equation

$$ \eta_n = \Phi_n(\eta_{n-1}), \qquad n \ge 1. $$

This class of interacting particle schemes has been introduced in [12, 13] to solve numerically the so-called non-linear filtering equation. The study of the convergence for abstract functions $\Phi_n$ was initiated in [11], whereas large deviation principles and the associated fluctuations are presented in [14, 15]. As described in these papers, under certain conditions on the mappings $\{\Phi_n; n \ge 1\}$, versions of the law of large numbers and the central limit theorem in the sense of convergence of finite dimensional distributions are available. Namely,

for any bounded test function $f : E \to \mathbb{R}$,

$$ \eta_n^N(f) \xrightarrow[N \to \infty]{} \eta_n(f) \quad P\text{-a.s.} $$

and

$$ W_n^N(f) = \sqrt{N}\,\big(\eta_n^N(f) - \eta_n(f)\big) \xrightarrow[N \to \infty]{} W_n(f) $$

where $W_n$ is a centered Gaussian field. The aim of this paper is to make these two statements uniform over a class of functions $\mathcal{F}$, following the theory of empirical processes for independent samples (cf. [38] and the references therein). Our results apply to a class of measure valued processes and genetic type interacting particle systems arising in biology and in advanced signal processing. We prove in this setting Glivenko-Cantelli and Donsker theorems under entropy conditions, as well as exponential bounds for Vapnik-Cervonenkis classes of sets or functions. We finally provide some uniform bounds in both space and time for a class of processes without memory.

1.2 Description of the Model and Statement of the Results

1.3 Description of the Model

To describe our model precisely, we first need to introduce some notations. For any Markov transition kernel $K$ and any probability measure $\mu$ on $E$, we denote by $\mu K$ the probability measure defined by, for any $f \in C_b(E)$,

$$ \mu K(f) = \int \mu(dx)\, K(x, dz)\, f(z). $$

Our measure valued dynamical system is then described by the equation

$$ \eta_n = \Phi_n(\eta_{n-1}), \qquad n \ge 1, \qquad (2) $$

where

$$ \Phi_n(\eta) = \Psi_n(\eta) K_n, \qquad \Psi_n(\eta)(f) = \frac{\eta(g_n f)}{\eta(g_n)}, $$

and

- $\{K_n;\ n \ge 1\}$ is a sequence of Markov kernels on $E$,
- $\{g_n;\ n \ge 1\}$ is a collection of bounded positive functions on $E$ such that, for any $n \ge 1$, there exists $a_n \in [1, \infty)$ with

$$ \forall x \in E,\ \forall n \ge 1, \qquad \frac{1}{a_n} \le g_n(x) \le a_n. \qquad (3) $$

Using the fact that

$$ \Psi_n\Big(\frac{1}{N} \sum_{i=1}^N \delta_{x^i}\Big) = \sum_{i=1}^N \frac{g_n(x^i)}{\sum_{j=1}^N g_n(x^j)}\, \delta_{x^i}, $$

we see that the resulting motion of the particles is decomposed into two separate mechanisms

$$ \xi_{n-1} = (\xi_{n-1}^1, \dots, \xi_{n-1}^N) \longrightarrow \widehat{\xi}_{n-1} = (\widehat{\xi}_{n-1}^1, \dots, \widehat{\xi}_{n-1}^N) \longrightarrow \xi_n = (\xi_n^1, \dots, \xi_n^N). $$

These two transitions can be modelled as follows:

Initial particle system:

$$ P_y\big(\xi_0 \in dx\big) = \prod_{p=1}^N \eta_0(dx^p). $$

Selection/Updating:

$$ P_y\big(\widehat{\xi}_{n-1} \in dx \mid \xi_{n-1} = z\big) = \prod_{p=1}^N \sum_{i=1}^N \frac{g_n(y_n - h_n(z^i))}{\sum_{j=1}^N g_n(y_n - h_n(z^j))}\, \delta_{z^i}(dx^p). $$

Mutation/Prediction:

$$ P_y\big(\xi_n \in dz \mid \widehat{\xi}_{n-1} = x\big) = \prod_{p=1}^N K_n(x^p, dz^p). \qquad (4) $$

Thus, we see that the particles move according to the following rules. In the selection transition, one updates the positions in accordance with the fitness functions $\{g_n; n \ge 1\}$ and the current configuration. More precisely, at each time $n \ge 1$, each particle examines the system of particles $\xi_{n-1} = (\xi_{n-1}^1, \dots, \xi_{n-1}^N)$ and chooses randomly a site $\xi_{n-1}^i$, $1 \le i \le N$, with a probability which depends on the entire configuration $\xi_{n-1}$ and is given by

$$ \frac{g_n(\xi_{n-1}^i)}{\sum_{j=1}^N g_n(\xi_{n-1}^j)}. $$

This mechanism is called the Selection/Updating transition as the particles are selected for reproduction, the fittest individuals being more likely to be selected. In other words, this transition allows heavy particles to give birth to new particles at the expense of light particles, which die. The second mechanism is called Mutation/Prediction since at this step each particle evolves randomly according to a given transition probability kernel. The preceding scheme is clearly a system of interacting particles undergoing adaptation in a time non-homogeneous environment represented by the fitness functions $\{g_n; n \ge 1\}$. Roughly speaking, the natural idea is to approximate the two step transitions of the system

$$ \eta_{n-1} \xrightarrow{\ \text{Updating}\ } \widehat{\eta}_{n-1} \stackrel{\mathrm{def}}{=} \Psi_n(\eta_{n-1}) \xrightarrow{\ \text{Prediction}\ } \eta_n = \widehat{\eta}_{n-1} K_n $$

by a two step Markov chain taking values in the set of finitely discrete probability measures with atoms of size some integer multiple of $1/N$. Namely,

$$ \eta_{n-1}^N = \frac{1}{N} \sum_{i=1}^N \delta_{\xi_{n-1}^i} \xrightarrow{\ \text{Selection}\ } \widehat{\eta}_{n-1}^N = \frac{1}{N} \sum_{i=1}^N \delta_{\widehat{\xi}_{n-1}^i} \xrightarrow{\ \text{Mutation}\ } \eta_n^N = \frac{1}{N} \sum_{i=1}^N \delta_{\xi_n^i}. $$
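In computational terms, one step of this Selection/Updating–Mutation/Prediction scheme is a weighted multinomial re-sampling followed by independent moves. The following sketch illustrates this for a hypothetical choice of fitness function and mutation kernel (a Gaussian random walk); neither choice is prescribed by the paper:

```python
import numpy as np

def particle_step(xi, g, mutate, rng):
    """One Selection/Updating + Mutation/Prediction transition of the
    N-particle system, acting on the vector xi of particle positions."""
    N = len(xi)
    weights = g(xi)
    weights = weights / weights.sum()            # g_n(xi^i) / sum_j g_n(xi^j)
    selected = rng.choice(N, size=N, p=weights)  # Selection/Updating
    xi_hat = xi[selected]
    return mutate(xi_hat, rng)                   # Mutation/Prediction

rng = np.random.default_rng(0)
N = 1000
xi = rng.normal(size=N)                          # N i.i.d. samples from eta_0

# Hypothetical bounded positive fitness g_n and Gaussian mutation kernel K_n.
g = lambda x: np.exp(-0.5 * x ** 2) + 0.1
mutate = lambda x, r: x + 0.3 * r.normal(size=len(x))

xi_next = particle_step(xi, g, mutate, rng)
eta_next = lambda f: f(xi_next).mean()           # empirical measure eta_n^N(f)
```

Iterating `particle_step` simulates the Markov chain (1); the resulting empirical measures $\frac{1}{N}\sum_i \delta_{\xi_n^i}$ are the objects whose uniform convergence the paper studies.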

1.4 Statement of the main Results

Given a collection $\mathcal{F}$ of measurable functions $f : E \to \mathbb{R}$, the particle density profiles $\eta_n^N$ at time $n$ induce a map from $\mathcal{F}$ to $\mathbb{R}$ given by

$$ \eta_n^N : f \in \mathcal{F} \longrightarrow \eta_n^N(f) \in \mathbb{R}. $$

The $\mathcal{F}$-indexed collection $\{\eta_n^N(f); f \in \mathcal{F}\}$ is usually called the $\mathcal{F}$-empirical process associated to the empirical random measures $\eta_n^N$. The semi-metric commonly used in such a context is the Zolotarev semi-norm defined by

$$ \|\mu - \nu\|_{\mathcal{F}} = \sup\big\{|\mu(f) - \nu(f)|;\ f \in \mathcal{F}\big\}, \qquad \forall \mu, \nu \in M_1(E) $$

(see for instance [32]). In order to control the behavior of the supremum $\|\eta_n^N - \eta_n\|_{\mathcal{F}}$ as $N \to \infty$, we will impose conditions on the class $\mathcal{F}$ that are classically used in the statistical theory of empirical processes for independent samples. To avoid technical measurability conditions, and in order not to obscure the main ideas, we will always assume the class $\mathcal{F}$ to be countable and uniformly bounded. Our conclusions also hold under appropriate separability assumptions on the empirical process. We do not enter these questions here (see [38]). The Glivenko-Cantelli and Donsker theorems are uniform versions

of the law of large numbers and the central limit theorem for the empirical measures. In the classical theory of independent random variables, these properties are usually shown to hold under entropy conditions on the class $\mathcal{F}$. Namely, to measure the size of a given class $\mathcal{F}$, one considers the covering numbers $N(\varepsilon, \mathcal{F}, L_p(\mu))$ defined as the minimal number of $L_p(\mu)$-balls of radius $\varepsilon > 0$ needed to cover $\mathcal{F}$. With respect to the classical theory, we will need assumptions on these covering numbers uniformly over all probability measures $\mu$. Classically also, this supremum can be taken over all discrete probability measures. Since we are dealing with interacting particle schemes, we however need to strengthen the assumption and take the corresponding supremum over all probability measures. Several examples of classes of functions satisfying the foregoing uniform entropy conditions are discussed in the book [38]. Denote thus by $N(\varepsilon, \mathcal{F})$, $\varepsilon > 0$, and by $I(\mathcal{F})$ the uniform covering numbers and entropy integral given by

$$ N(\varepsilon, \mathcal{F}) = \sup\big\{N(\varepsilon, \mathcal{F}, L_2(\mu));\ \mu \in M_1(E)\big\}, \qquad I(\mathcal{F}) = \int_0^1 \sqrt{\log N(\varepsilon, \mathcal{F})}\, d\varepsilon. $$
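For intuition, the covering number is easy to approximate numerically for simple classes. The sketch below bounds $N(\varepsilon, \mathcal{F}, L_2(\mu))$ by a greedy covering for the class of half-line indicators $\{1_{(-\infty,t]};\ t \in \mathbb{R}\}$ (a standard Vapnik-Cervonenkis class) against an empirical measure $\mu$; the discretization of $t$ and the greedy procedure are illustrative choices, not part of the paper:

```python
import numpy as np

def greedy_covering_number(values, eps):
    """Upper bound on N(eps, F, L2(mu)): values[k] holds the k-th function
    evaluated on a sample from mu, so rows are points of F with empirical
    L2(mu) distances."""
    remaining = list(range(len(values)))
    n_balls = 0
    while remaining:
        center = values[remaining[0]]
        dists = np.sqrt(((values[remaining] - center) ** 2).mean(axis=1))
        remaining = [r for r, d in zip(remaining, dists) if d > eps]
        n_balls += 1
    return n_balls

rng = np.random.default_rng(1)
sample = rng.normal(size=2000)                      # sample from mu
ts = np.linspace(-3.0, 3.0, 200)
indicators = np.array([(sample <= t).astype(float) for t in ts])

n_coarse = greedy_covering_number(indicators, 0.5)  # a few balls suffice
n_fine = greedy_covering_number(indicators, 0.1)    # grows roughly like 1/eps^2
```

Repeating this over many measures $\mu$ would approximate the uniform covering number $N(\varepsilon, \mathcal{F})$ whose entropy integral $I(\mathcal{F})$ drives the results below.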

Our results will then take the following form.

Theorem 1.1 [Glivenko-Cantelli] If $N(\varepsilon, \mathcal{F}) < \infty$ for any $\varepsilon > 0$, then, for any time $n \ge 0$,

$$ \lim_{N \to \infty} \|\eta_n^N - \eta_n\|_{\mathcal{F}} = 0 \quad P\text{-a.s.} $$

If $I(\mathcal{F}) < \infty$, we will prove that the $\mathcal{F}$-indexed process $W_n^N$ is asymptotically tight. Together with the central limit theorem for finite dimensional distributions described in [15], this leads to a uniform version in the Banach space $\ell^\infty(\mathcal{F})$ of all bounded functions on $\mathcal{F}$.

Theorem 1.2 [Donsker] Under some regularity conditions on the transition kernels $K_n$, if $I(\mathcal{F}) < \infty$, then, for any time $n \ge 0$, the empirical process

$$ W_n^N : f \in \mathcal{F} \longrightarrow W_n^N(f) = \sqrt{N}\,\big(\eta_n^N(f) - \eta_n(f)\big) $$

converges in law in $\ell^\infty(\mathcal{F})$ to a centered Gaussian process $\{W_n(f); f \in \mathcal{F}\}$.

The proofs of Theorem 1.1 and Theorem 1.2 are given in Section 2. Our method of proof is essentially based on a precise study of the dynamical structure of the limiting measure valued process (2) and on properties of covering numbers. For reasons which will appear clearly later on, this approach provides a natural and simple way to extend several results of the classical theory

of empirical processes to genetic-type interacting particle systems. For these reasons, we present at the very beginning of Section 2 some key lemmas that will be used in extending these results. Section 3 is concerned with further uniform results. We first establish a precise exponential bound for classes of functions $\mathcal{F}$ with polynomial covering numbers, such as the well-known Vapnik-Cervonenkis classes. More precisely we will prove the following extension of a theorem due to Talagrand [37] in the independent case.

Theorem 1.3 Let $\mathcal{F}$ be a countable class of measurable functions $f : E \to [0,1]$ satisfying

$$ N(\varepsilon, \mathcal{F}) \le \Big(\frac{C}{\varepsilon}\Big)^V \quad \text{for every } 0 < \varepsilon < C $$

for some constants $C$ and $V$. Then, for any $n \ge 0$, $\lambda > 0$ and $N \ge 1$, we have

$$ P\big(\|W_n^N\|_{\mathcal{F}} > \beta_n \lambda\big) \le (n+1) \Big(\frac{D\lambda}{\sqrt{V}}\Big)^V e^{-2\lambda^2} $$

where $D$, resp. $\beta_n$, is a constant that only depends on $C$, resp. on the fitness functions $\{g_n; n \ge 1\}$.

In the second part of Section 3, we present a uniform convergence result with respect to time under some additional conditions on the limiting system (2).

Theorem 1.4 When the class of functions $\mathcal{F}$ is sufficiently regular and under certain conditions on the mappings $\{\Phi_n; n \ge 1\}$, there exists a convergence rate $\alpha > 0$ such that

$$ \sup_{n \ge 0} E\big(\|\eta_n^N - \eta_n\|_{\mathcal{F}}\big) \le \frac{C}{N^{\alpha/2}}\, I(\mathcal{F}) $$

for some constant $C$ that does not depend on the time parameter $n \ge 0$.

Applications of the above results to non-linear filtering problems are presented in the last section.

2 Glivenko-Cantelli and Donsker Theorems

Before turning to the proof of Theorems 1.1 and 1.2, we present three simple lemmas on the dynamical structure of (2) and the covering numbers. These properties will also be useful in the further developments of Section 3.

Let us first introduce some additional notations. Denote by $\{\Phi_{n|p};\ 0 \le p \le n\}$ the composite mappings

$$ \Phi_{n|p} = \Phi_n \circ \Phi_{n-1} \circ \cdots \circ \Phi_{p+1}, \qquad 0 \le p \le n $$

(with the convention $\Phi_{n|n} = \mathrm{Id}$). A clear backward induction on the parameter $p$ shows that the composite mappings $\Phi_{n|p}$ have the same form as the one step mappings $\{\Phi_n; n \ge 1\}$. This is the content of the next lemma.

Lemma 2.1 For any $0 \le p \le n$, we have

$$ \Phi_{n|p}(\eta)(f) = \frac{\eta\big(g_{n|p}\, K_{n|p}(f)\big)}{\eta(g_{n|p})} \qquad (5) $$

for all $f \in C_b(E)$, where

$$ K_{n|p-1}(f) = \frac{K_p\big(g_{n|p}\, K_{n|p}(f)\big)}{K_p(g_{n|p})}, \qquad g_{n|p-1} = g_p\, K_p(g_{n|p}), $$

with the conventions $g_{n|n} = 1$ and $K_{n|n} = \mathrm{Id}$.

Under our assumptions and using the above notations we also notice that

$$ \forall x \in E,\ \forall\, 0 \le p \le n, \qquad \frac{1}{a_{n|p}} \le g_{n|p}(x) \le a_{n|p} \qquad (6) $$

where

$$ a_{n|p} = \prod_{q=p+1}^n a_q, \qquad 0 \le p \le n. $$

Lemma 2.2 For any $0 \le p \le n$, any $\mu, \nu \in M_1(E)$ and any function $f$,

$$ \Phi_{n|p}(\mu)(f) - \Phi_{n|p}(\nu)(f) = \frac{1}{\nu(g_{n|p})} \Big[ \big(\mu(f_{n|p}) - \nu(f_{n|p})\big) + \Phi_{n|p}(\mu)(f)\,\big(\nu(g_{n|p}) - \mu(g_{n|p})\big) \Big] \qquad (7) $$

where

$$ f_{n|p} \stackrel{\mathrm{def}}{=} g_{n|p}\, K_{n|p}(f). $$

If, for any $g : E \to \mathbb{R}$ such that $\|g\| \le 1$ and for any Markov kernel $K$ on $E$, we define

$$ g \cdot K\mathcal{F} = \{g\, K(f);\ f \in \mathcal{F}\}, $$

then we have

Lemma 2.3 For any $p \ge 1$, $\varepsilon > 0$ and $\mu \in M_1(E)$,

$$ N\big(\varepsilon, g \cdot K\mathcal{F}, L_p(\mu)\big) \le N\big(\varepsilon, \mathcal{F}, L_p(\mu K)\big) $$

and therefore

$$ N(\varepsilon, g \cdot K\mathcal{F}) \le N(\varepsilon, \mathcal{F}), \qquad I(g \cdot K\mathcal{F}) \le I(\mathcal{F}). $$

Lemma 2.3 follows from the fact that

$$ N\big(\varepsilon, g \cdot \mathcal{F}, L_p(\mu)\big) \le N\big(\varepsilon, \mathcal{F}, L_p(\mu)\big) \quad \text{and} \quad N\big(\varepsilon, K\mathcal{F}, L_p(\mu)\big) \le N\big(\varepsilon, \mathcal{F}, L_p(\mu K)\big). $$

The first assertion is trivial. To establish the second inequality, simply note that, for every function $f$, $|K(f)|^p \le K(|f|^p)$ and go back to the definition of the covering numbers.

2.1 Glivenko-Cantelli Theorem

To establish a Glivenko-Cantelli property in our setting, we make use of the following basic decomposition: for any $f \in \mathcal{F}$,

$$ \eta_n^N(f) - \eta_n(f) = \sum_{p=0}^n \Big[ \Phi_{n|p}(\eta_p^N)(f) - \Phi_{n|p}\big(\Phi_p(\eta_{p-1}^N)\big)(f) \Big] \qquad (8) $$

with the convention $\Phi_0(\eta_{-1}^N) = \eta_0$. There is no loss of generality in assuming that $1 \in \mathcal{F}$. Then, by Lemma 2.2, we get

$$ \big\|\Phi_{n|p}(\mu) - \Phi_{n|p}(\nu)\big\|_{\mathcal{F}} \le 2\, a_{n|p}^2\, \|\mu - \nu\|_{\mathcal{F}_{n|p}} \qquad (9) $$

where $\mathcal{F}_{n|p} = \bar{g}_{n|p} \cdot K_{n|p}\mathcal{F}$ and $\bar{g}_{n|p} = a_{n|p}^{-1}\, g_{n|p}$ (so that $\|\bar{g}_{n|p}\| \le 1$). It easily follows that, for every $\varepsilon > 0$,

$$ P\big(\|\eta_n^N - \eta_n\|_{\mathcal{F}} > \varepsilon\big) \le (n+1) \sup_{0 \le p \le n} P\Big(\big\|\Phi_{n|p}(\eta_p^N) - \Phi_{n|p}\big(\Phi_p(\eta_{p-1}^N)\big)\big\|_{\mathcal{F}} > \frac{\varepsilon}{n+1}\Big). $$

Using (9), this implies that

$$ P\big(\|\eta_n^N - \eta_n\|_{\mathcal{F}} > \varepsilon\big) \le (n+1) \sup_{0 \le p \le n} P\Big(\big\|\eta_p^N - \Phi_p(\eta_{p-1}^N)\big\|_{\mathcal{F}_{n|p}} > \frac{\varepsilon}{\beta_n}\Big) \qquad (10) $$

where $\beta_n = 2(n+1)\, a_{n|0}^2$. Our main assumption in this section will be

(H1) $\forall \varepsilon > 0, \quad N(\varepsilon, \mathcal{F}) < \infty$.

The Glivenko-Cantelli theorem may then be stated as follows.

Theorem 2.4 Assume that $\mathcal{F}$ is a countable collection of functions $f$ such that $\|f\| \le 1$ and (H1) holds. Then, for any time $n \ge 0$, $\|\eta_n^N - \eta_n\|_{\mathcal{F}}$ converges almost surely to $0$ as $N \to \infty$.

Proof: It is based on the following standard lemma in the theory of empirical processes.

Lemma 2.5 Let $\{X^{i,N};\ 1 \le i \le N\}$ be independent random variables with common law $P^{(N)}$ and let $\mathcal{F}$ be a countable collection of functions $f$ such that $\|f\| \le 1$. Then, for any $\varepsilon > 0$ and $\sqrt{N} \ge 4\,\varepsilon^{-1}$ we have that

$$ P\big(\|m^N(X) - P^{(N)}\|_{\mathcal{F}} > 8\varepsilon\big) \le 8\, N(\varepsilon, \mathcal{F})\, e^{-N\varepsilon^2/2} $$

where $m^N(X) = \frac{1}{N} \sum_{i=1}^N \delta_{X^{i,N}}$.

Before turning to the sketch of the proof of this lemma, let us show how it implies the theorem. $n$ is fixed throughout the argument. Using (10) and Lemma 2.5 one easily gets that, for $\sqrt{N} \ge 4\,\varepsilon_n^{-1}$ where $\varepsilon_n = \varepsilon/(8\beta_n)$,

$$ P\big(\|\eta_n^N - \eta_n\|_{\mathcal{F}} > \varepsilon\big) \le 8(n+1)\, e^{-N\varepsilon_n^2/2} \sup_{0 \le p \le n} N(\varepsilon_n, \mathcal{F}_{n|p}). $$

Furthermore, by definition of the class $\mathcal{F}_{n|p}$ and with the help of Lemma 2.3, we know that, for each $\varepsilon > 0$ and $0 \le p \le n$,

$$ N(\varepsilon, \mathcal{F}_{n|p}) \le N(\varepsilon, \mathcal{F}). $$

Therefore,

$$ P\big(\|\eta_n^N - \eta_n\|_{\mathcal{F}} > \varepsilon\big) \le 8(n+1)\, N(\varepsilon_n, \mathcal{F})\, e^{-N\varepsilon_n^2/2} $$

as soon as $\sqrt{N} \ge 4\,\varepsilon_n^{-1}$. The end of the proof of Theorem 2.4 is an immediate consequence of the Borel-Cantelli lemma.

Proof of Lemma 2.5: Using the classical symmetrization inequalities (see Lemma 2.3.7 in [38] or pp. 14-15 in [31]), for any $\varepsilon > 0$ and $\sqrt{N} \ge 4\,\varepsilon^{-1}$,

$$ P\big(\|m^N(X) - P^{(N)}\|_{\mathcal{F}} > \varepsilon\big) \le 4\, P\Big(\|m_\varepsilon^N(X)\|_{\mathcal{F}} > \frac{\varepsilon}{4}\Big) \qquad (11) $$

where $m_\varepsilon^N(X)$ denotes the signed measure

$$ m_\varepsilon^N(X) = \frac{1}{N} \sum_{i=1}^N \varepsilon^i\, \delta_{X^{i,N}} $$

and $\{\varepsilon^1, \dots, \varepsilon^N\}$ are symmetric Bernoulli random variables, independent of the $X^{i,N}$'s. Conditionally on the $X^{i,N}$'s, and by definition of the covering numbers, we easily get by a standard argument that



P mN" (X ) F >  X

N



(12)

   =2; F ; L1(mN (X )) sup P mN" (X )(f ) > =2 X : f 2F

?

Indeed, let



?



fm ; 1  m  N =2; F ; L1(mN (X )) be a (=2)-coverage of F for the L1(mN (X ))-norm. Then, 















P mN" (X ) F >  X  P sup mN" (X )(fm) > =2 X : m

Therefore,

P



kmN" (X )kF

>

  X

?











 N =2; F ; L1(mN (X )) supP mN" (X )(fm) > =2 X : m

By Hoeffding's inequality (see for instance Lemma 2.2.7 in [38]), for any $f \in \mathcal{F}$ and $\lambda > 0$,

$$ P\Big(|m_\varepsilon^N(X)(f)| > \frac{\lambda}{2} \,\Big|\, X\Big) \le 2\, e^{-N\lambda^2/8}. $$

As a consequence, we see that the conditional probability

$$ P\Big(\|m_\varepsilon^N(X)\|_{\mathcal{F}} > \frac{\varepsilon}{4} \,\Big|\, X\Big) $$

is bounded above by

$$ 2\, N\Big(\frac{\varepsilon}{8}, \mathcal{F}, L_1(m^N(X))\Big)\, e^{-N\varepsilon^2/128} \le 2\, N\Big(\frac{\varepsilon}{8}, \mathcal{F}\Big)\, e^{-N\varepsilon^2/128}. $$

From (11), it follows that

$$ P\big(\|m^N(X) - P^{(N)}\|_{\mathcal{F}} > \varepsilon\big) \le 8\, N\Big(\frac{\varepsilon}{8}, \mathcal{F}\Big)\, e^{-N\varepsilon^2/128} $$

as soon as $\sqrt{N} \ge 4\,\varepsilon^{-1}$, which is the result. The proof of Lemma 2.5 is complete.
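The two probabilistic ingredients of this proof — the Rademacher average $m_\varepsilon^N(X)(f)$ and the conditional Hoeffding bound — can be checked by direct simulation. In the sketch below the sample, the function values and the threshold are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
N, trials, lam = 200, 20000, 0.5

f_vals = rng.uniform(-1.0, 1.0, size=N)  # f(X^{i,N}) for a fixed sample, |f| <= 1
signs = rng.choice([-1.0, 1.0], size=(trials, N))  # symmetric Bernoulli variables

# m_eps^N(X)(f) for each independent draw of the signs
m_eps = (signs * f_vals).mean(axis=1)

empirical_tail = (np.abs(m_eps) > lam / 2).mean()
hoeffding_bound = 2.0 * np.exp(-N * lam ** 2 / 8.0)
```

Here `empirical_tail` estimates $P(|m_\varepsilon^N(X)(f)| > \lambda/2 \mid X)$ and stays below the bound $2e^{-N\lambda^2/8}$, as the conditional Hoeffding inequality requires.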

2.2 Donsker's Theorem

After the Glivenko-Cantelli theorem, Donsker's theorem is the second most important result on the convergence of empirical processes. Theorem 1.2, which we will establish in this section, may be seen as a uniform version of the central limit theorem presented in [15]. More precisely, it was shown there that the marginals of the $\mathcal{F}$-indexed process $\{W_n^N(f); f \in \mathcal{F}\}$ converge weakly to the marginals of an $\mathcal{F}$-indexed centered Gaussian process $\{W_n(f); f \in \mathcal{F}\}$. Now, weak convergence in $\ell^\infty(\mathcal{F})$ can be characterized as the convergence of the marginals together with the asymptotic tightness of the process $\{W_n^N(f); f \in \mathcal{F}\}$. In view of these observations, the proof of Theorem 1.2 is naturally decomposed into two steps. In the first step, we establish the asymptotic tightness using the standard entropy condition

(H2) $I(\mathcal{F}) < \infty$.

We then make use of the results of [15] to identify the limiting Gaussian process. To this task, and as in [15], we need to impose some further conditions on the kernels $K_n$. We will namely assume that

(H3) $K_n$ is Feller for each $n \ge 1$ and such that there exist, for any time $n \ge 1$, a reference probability measure $\lambda_n \in M_1(E)$ and a $\mathcal{B}(E)$-measurable non-negative function $\varphi_n$ such that $K_n(x, \cdot) \sim \lambda_n$ for every $x \in E$ with

$$ \Big| \log \frac{dK_n(x, \cdot)}{d\lambda_n} \Big| \le \varphi_n \quad \text{and} \quad \int e^{r\varphi_n}\, d\lambda_n < \infty \qquad (13) $$

for every $r \ge 1$.

Theorem 2.6 Assume that $\mathcal{F}$ is a countable class of functions such that $\|f\| \le 1$ for any $f \in \mathcal{F}$ and satisfying (H2) and (H3). Then, for any $n \ge 0$, $\{W_n^N(f); f \in \mathcal{F}\}$ converges weakly in $\ell^\infty(\mathcal{F})$ as $N \to \infty$ to a centered Gaussian process $\{W_n(f); f \in \mathcal{F}\}$ with covariance function

$$ E\big(W_n(f) W_n(h)\big) = \sum_{p=0}^n \int \Big(\frac{g_{n|p}}{\eta_p(g_{n|p})}\Big)^2 \big(K_{n|p}(f) - \eta_n(f)\big) \big(K_{n|p}(h) - \eta_n(h)\big)\, d\eta_p. $$

Remark 2.7: If the transition probability kernels $\{K_n; n \ge 1\}$ are trivial, in the sense that, for every $n$,

$$ K_n(x, dz) = \nu_n(dz), \qquad \nu_n \in M_1(E), $$

then one can readily check that

$$ \eta_n = \nu_n \quad \text{and} \quad K_{n|p}(x, dz) = \nu_n(dz), \qquad 0 \le p \le n. $$

In this particular situation $\{W_n(f); f \in L_2(\nu_n)\}$ is the classical $\nu_n$-Brownian bridge. Namely, $W_n$ is the centered Gaussian process with covariance

$$ E\big(W_n(f) W_n(h)\big) = \nu_n\big((f - \nu_n f)(h - \nu_n h)\big). $$

As announced, the proof of Theorem 2.6 will be a consequence of the following two lemmas. Note that the entropy condition $I(\mathcal{F}) < \infty$ is only used in the proof of the asymptotic tightness, while the regularity condition (H3) is needed in identifying the limiting Gaussian process. Again, $n$ is fixed throughout the proof of Theorem 2.6.
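The trivial-kernel case of Remark 2.7 is the one situation where the limiting covariance can be verified by elementary simulation: the particles are then i.i.d. draws from $\nu_n$, and the empirical covariance of $W_n^N(f)$ and $W_n^N(h)$ should approach the Brownian-bridge covariance $\nu_n((f - \nu_n f)(h - \nu_n h))$. A sketch, with $\nu_n = N(0,1)$ and two arbitrary test functions:

```python
import numpy as np

rng = np.random.default_rng(3)
N, trials = 500, 4000
f = np.sin                           # E sin = 0 under N(0,1), by symmetry
h = lambda x: (x > 0).astype(float)  # E h = 1/2 under N(0,1)

x = rng.normal(size=(trials, N))     # `trials` independent N-particle systems
Wf = np.sqrt(N) * (f(x).mean(axis=1) - 0.0)
Wh = np.sqrt(N) * (h(x).mean(axis=1) - 0.5)
emp_cov = (Wf * Wh).mean()

# Brownian-bridge covariance nu((f - nu f)(h - nu h)) by a large Monte Carlo sample
y = rng.normal(size=2_000_000)
bridge_cov = ((f(y) - 0.0) * (h(y) - 0.5)).mean()
```

With non-trivial kernels the particles interact through the selection step and this elementary check no longer applies; that is precisely what the operator $A_n$ below accounts for.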

Lemma 2.8 If $\mathcal{F}$ is a countable collection of functions $f$ such that $\|f\| \le 1$ and (H2) holds, then the $\mathcal{F}$-indexed process $\{W_n^N(f); f \in \mathcal{F}\}$ is asymptotically tight.

Lemma 2.9 Under (H3), the marginals of $\{W_n^N(f); f \in L_2(\eta_n)\}$ converge weakly to the marginals of a centered Gaussian process $\{W_n(f); f \in L_2(\eta_n)\}$ with covariance function

$$ E\big(W_n(f) W_n(h)\big) = \sum_{p=0}^n \int \Big(\frac{g_{n|p}}{\eta_p(g_{n|p})}\Big)^2 \big(K_{n|p}(f) - \eta_n(f)\big) \big(K_{n|p}(h) - \eta_n(h)\big)\, d\eta_p. $$

Since the collection of functions $\{g_{n|p};\ 0 \le p \le n\}$ satisfies (6), using the Cauchy-Schwarz inequality and Lemma 2.1, we have that

$$ E\big(|W_n(f) - W_n(h)|^2\big) \le C_n\, \|f - h\|_{L_2(\eta_n)}^2 $$

for some constant $C_n < \infty$ and all $f, h \in L_2(\eta_n)$. In particular, to prove the asymptotic tightness in Lemma 2.8, it will be enough to establish the asymptotic equicontinuity in probability of $\{W_n^N(f); f \in \mathcal{F}\}$ with respect to the semi-norm on $\mathcal{F}$ given by

$$ f \in \mathcal{F} \longrightarrow \Big( \eta_n\big((f - \eta_n(f))^2\big) \Big)^{1/2} $$

(see for instance Chap. 1.5 and Example 1.5.10, p. 40, in [38]).

Proof of Lemma 2.8: Under (H2), the class $\mathcal{F}$ is totally bounded in $L_2(\eta_n)$ for any $n \ge 0$. According to the preceding comment, it will be enough to show that

$$ \lim_{N \to \infty} E\big(\|W_n^N\|_{\mathcal{F}^{(n)}(\delta_N)}\big) = 0 \qquad (14) $$

for all sequences $\delta_N \downarrow 0$ where, for any $\delta > 0$ (possibly infinite),

$$ \mathcal{F}^{(n)}(\delta) \stackrel{\mathrm{def}}{=} \big\{ f - h;\ f, h \in \mathcal{F} : \|f - h\|_{L_2(\eta_n)} < \delta \big\}. $$

To this task, we use the decompositions (8) and (7) to get that

$$ \|\eta_n^N - \eta_n\|_{\mathcal{F}^{(n)}(\delta)} \le \sum_{p=0}^n a_{n|p}^2 \Big[ \big\|\eta_p^N - \Phi_p(\eta_{p-1}^N)\big\|_{\mathcal{F}_{n|p}(\delta)} + \|\eta_p^N\|_{\mathcal{F}_{n|p}(\delta)}\, \big|\eta_p^N(\bar{g}_{n|p}) - \Phi_p(\eta_{p-1}^N)(\bar{g}_{n|p})\big| \Big] $$

where

$$ \mathcal{F}_{n|p}(\delta) = \big\{ \bar{g}_{n|p}\, K_{n|p}\bar{f};\ \bar{f} \in \mathcal{F}^{(n)}(\delta) \big\}, $$

$\bar{g}_{n|p} = a_{n|p}^{-1}\, g_{n|p}$ and $a_{n|p} = \prod_{q=p+1}^n a_q$. Since for any $0 \le p \le n$,

$$ E\Big( \big|\eta_p^N(\bar{g}_{n|p}) - \Phi_p(\eta_{p-1}^N)(\bar{g}_{n|p})\big|^2 \Big) \le \frac{1}{N}, $$

to prove (14) it suffices to check that, for any $0 \le p \le n$ and $\delta_N \downarrow 0$,

1) $\displaystyle \lim_{N \to \infty} E\Big( \sqrt{N}\, \big\|\eta_p^N - \Phi_p(\eta_{p-1}^N)\big\|_{\mathcal{F}_{n|p}(\delta_N)} \Big) = 0$;

2) $\displaystyle \lim_{N \to \infty} E\Big( \|\eta_p^N\|_{\mathcal{F}_{n|p}^2(\delta_N)} \Big) = 0$;

where

$$ \mathcal{F}_{n|p}^2(\delta) = \big\{ \bar{f}^2;\ \bar{f} \in \mathcal{F}_{n|p}(\delta) \big\}. $$



Let us prove 1). By the symmetrization inequalities, for any N , 





E pN ? p (pN?1 ) Fnjp (N )  2E mN" (p) Fnjp (N ) where, as in Section 2.1,

mN" (p ) =

N 1X N "i pi : i=1

14



Fix p = (p1 ; : : : ; pN ). By Hoe ding's inequality (cf. Lemma 2.2.7 in [38]), the process N X p f ! p1 f (pi ) "i = NmN (p)(f )

N

i=1

is sub-Gaussian with respect to the norm k  kL (pN ) . Namely, for any f; h 2 Fnjp(N ) and > 0, 2

P

p ? N  ?  N m (p)(f ) ? mN (p )(h) > p  2 e



1 2

2 =kf ?hk2L

N 2 (p )

:

Using the maximal inequality for sub-Gaussian processes (cf. [26], [38]), we get the quenched inequality

$$ E\Big( \|m_\varepsilon^N(\xi_p)\|_{\mathcal{F}_{n|p}(\delta_N)} \,\Big|\, \xi_p \Big) \le \frac{C}{\sqrt{N}} \int_0^{\sigma_{n|p}(N)} \sqrt{\log N\big(\varepsilon, \mathcal{F}_{n|p}(\delta_N), L_2(\eta_p^N)\big)}\, d\varepsilon \qquad (15) $$

where

$$ \sigma_{n|p}^2(N) = \|\eta_p^N\|_{\mathcal{F}_{n|p}^2(\delta_N)}. $$

On the other hand we clearly have that, for every $\varepsilon > 0$,

$$ N\big(\varepsilon, \mathcal{F}_{n|p}(\delta), L_2(\eta_p^N)\big) \le N\big(\varepsilon, \mathcal{F}_{n|p}(\infty), L_2(\eta_p^N)\big) \le N^2\big(\varepsilon/2, \mathcal{F}_{n|p}, L_2(\eta_p^N)\big) $$

where we recall that $\mathcal{F}_{n|p} = \bar{g}_{n|p} \cdot K_{n|p}\mathcal{F}$. Under our assumptions, it thus follows from Lemma 2.3 that

$$ N\big(\varepsilon, \mathcal{F}_{n|p}(\delta), L_2(\eta_p^N)\big) \le N^2(\varepsilon/2, \mathcal{F}). $$

Using (15), one concludes that, for every $N \ge 1$,

$$ E\Big( \|m_\varepsilon^N(\xi_p)\|_{\mathcal{F}_{n|p}(\delta_N)} \,\Big|\, \xi_p \Big) \le \frac{\sqrt{2}\, C}{\sqrt{N}} \int_0^{\sigma_{n|p}(N)} \sqrt{\log N(\varepsilon/2, \mathcal{F})}\, d\varepsilon $$

and therefore

$$ E\Big( \sqrt{N}\, \big\|\eta_p^N - \Phi_p(\eta_{p-1}^N)\big\|_{\mathcal{F}_{n|p}(\delta_N)} \Big) \le 2\sqrt{2}\, C\, E\bigg( \int_0^{\sigma_{n|p}(N)} \sqrt{\log N(\varepsilon/2, \mathcal{F})}\, d\varepsilon \bigg). $$

By the dominated convergence theorem, to prove 1) it suffices to check that

$$ \lim_{N \to \infty} \sigma_{n|p}^2(N) = \lim_{N \to \infty} \|\eta_p^N\|_{\mathcal{F}_{n|p}^2(\delta_N)} = 0 \quad P\text{-a.s.} \qquad (16) $$

We establish this property by proving that

a) $\|\eta_p\|_{\mathcal{F}_{n|p}^2(\delta_N)} \le \delta_N^2$,

b) $\lim_{N \to \infty} \|\eta_p^N - \eta_p\|_{\mathcal{F}_{n|p}^2(\delta_N)} = 0 \quad P$-a.s.

Let $f, h \in \mathcal{F}$ be chosen so that $\eta_n((f-h)^2) < \delta^2$ (i.e. $f - h \in \mathcal{F}^{(n)}(\delta)$). Use the Cauchy-Schwarz inequality to see that

$$ \eta_p\Big( \big(\bar{g}_{n|p}\, K_{n|p}(f - h)\big)^2 \Big) \le \eta_p\Big( \bar{g}_{n|p}^2\, K_{n|p}\big((f - h)^2\big) \Big). \qquad (17) $$

Since $0 \le \bar{g}_{n|p} = a_{n|p}^{-1}\, g_{n|p} \le 1$, the right-hand side of (17) is bounded above by

$$ \frac{1}{a_{n|p}}\, \eta_p\Big( g_{n|p}\, K_{n|p}\big((f - h)^2\big) \Big) = \frac{\eta_p(g_{n|p})}{a_{n|p}}\, \eta_n\big((f - h)^2\big), $$

which is less than $\delta^2$. This ends the proof of a). To prove b), first note that

$$ \|\eta_p^N - \eta_p\|_{\mathcal{F}_{n|p}^2(\delta_N)} \le \|\eta_p^N - \eta_p\|_{\mathcal{F}_{n|p}^2(\infty)}. $$

Using Theorem 2.4, to prove b) it certainly suffices to show that

$$ \sup_\mu N\big(\varepsilon, \mathcal{F}_{n|p}^2(\infty), L_2(\mu)\big) < \infty $$

for every $\varepsilon > 0$. Since all functions in $\mathcal{F}$ have norm less than or equal to $1$, for any $\bar{f}, \bar{h}$ in $\mathcal{F}_{n|p}(\infty)$ and any $\mu \in M_1(E)$,

$$ \|\bar{f}^2 - \bar{h}^2\|_{L_2(\mu)} \le 4\, \|\bar{f} - \bar{h}\|_{L_2(\mu)}. $$

It follows that, for every $\varepsilon > 0$,

$$ N\big(\varepsilon, \mathcal{F}_{n|p}^2(\infty), L_2(\mu)\big) \le N\big(\varepsilon/4, \mathcal{F}_{n|p}(\infty), L_2(\mu)\big). $$

Since

$$ N\big(\varepsilon, \mathcal{F}_{n|p}(\infty), L_2(\mu)\big) \le N^2\big(\varepsilon/2, \mathcal{F}_{n|p}, L_2(\mu)\big), $$

one concludes, using Lemma 2.3, that

$$ \sup_\mu N\big(\varepsilon, \mathcal{F}_{n|p}^2(\infty), L_2(\mu)\big) \le \sup_\mu N^2\big(\varepsilon/8, \mathcal{F}, L_2(\mu)\big). $$

This ends the proof of b) and 1). In the same way, by dominated convergence, the proof of 2) is an immediate consequence of (16). This completes the proof

of Lemma 2.8.

We turn to the proof of Lemma 2.9. To this task, we need to recall some results presented in [15]. Under (H3), for any $n \ge 1$, set

$$ \forall (x, z) \in E^2, \qquad k_n(x, z) \stackrel{\mathrm{def}}{=} \frac{dK_n(x, \cdot)}{d\lambda_n}(z). $$

From now on, the time parameter $n \ge 0$ is fixed. For any $x = (x_0, \dots, x_n)$ and $z = (z_0, \dots, z_n) \in E^{n+1}$ set

$$ q_{[0,n]}(x, z) = \sum_{m=1}^n q_m(x, z) $$

with

$$ q_m(x, z) = \frac{g_m(z_{m-1})\, k_m(z_{m-1}, x_m)}{\int_E g_m(u)\, k_m(u, x_m)\, \eta_{m-1}(du)}, $$

$$ a_{[0,n]}(x, z) = q_{[0,n]}(x, z) - \int q_{[0,n]}(x', z)\, \eta_{[0,n]}(dx'), \qquad \eta_{[0,n]} = \eta_0 \otimes \cdots \otimes \eta_n. $$

Observe now that, by the exponential moment condition (13) of (H3), $q_{[0,n]} \in L_2(\eta_{[0,n]} \otimes \eta_{[0,n]})$. It follows that $a_{[0,n]} \in L_2(\eta_{[0,n]} \otimes \eta_{[0,n]})$ and therefore the integral operator $A_n$ given by

$$ A_n\varphi(x) = \int a_{[0,n]}(z, x)\, \varphi(z)\, \eta_{[0,n]}(dz), \qquad \varphi \in L_2(\eta_{[0,n]}), $$

is Hilbert-Schmidt on $L_2(\eta_{[0,n]})$. Under (H3), it is proved in [15] that the integral operator $I - A_n$ is invertible and that the random field

$$ \varphi \in L_2(\eta_{[0,n]}) \longrightarrow W_{[0,n]}^N(\varphi) = \sqrt{N} \bigg( \frac{1}{N} \sum_{i=1}^N \varphi(\xi_0^i, \dots, \xi_n^i) - \eta_{[0,n]}(\varphi) \bigg) $$

converges as $N \to \infty$, in the sense of convergence of finite dimensional distributions, to a centered Gaussian field $\{W_{[0,n]}(\varphi);\ \varphi \in L_2(\eta_{[0,n]})\}$ with covariance

$$ E\big(W_{[0,n]}(\varphi_1) W_{[0,n]}(\varphi_2)\big) = \Big\langle (I - A_n)^{-1}\big(\varphi_1 - \eta_{[0,n]}(\varphi_1)\big),\ (I - A_n)^{-1}\big(\varphi_2 - \eta_{[0,n]}(\varphi_2)\big) \Big\rangle_{L_2(\eta_{[0,n]})} $$

for $\varphi_1, \varphi_2 \in L_2(\eta_{[0,n]})$. From the above observations it follows that the process

$$ f \in L_2(\eta_n) \longrightarrow W_n^N(f) = \sqrt{N}\,\big(\eta_n^N(f) - \eta_n(f)\big) $$

converges, in the sense of convergence of finite dimensional distributions and as $N \to \infty$, to a centered Gaussian field $\{W_n(f);\ f \in L_2(\eta_n)\}$ satisfying

$$ E\big(W_n(f) W_n(h)\big) = \Big\langle (I - A_n)^{-1}\big(\bar{f} - \eta_n(f)\big),\ (I - A_n)^{-1}\big(\bar{h} - \eta_n(h)\big) \Big\rangle_{L_2(\eta_{[0,n]})} $$

for any $f, h \in L_2(\eta_n)$, where, for all $f \in L_2(\eta_n)$, $\bar{f} \in L_2(\eta_{[0,n]})$ denotes the function depending only on the last coordinate, $\bar{f}(z_0, \dots, z_n) = f(z_n)$.

In the next lemma, we identify the preceding covariance function and thus establish in this way Lemma 2.9.

Lemma 2.10 For every $f \in L_2(\eta_n)$, and every $z_0, \dots, z_n \in E$,

$$ (I - A_n)^{-1}\big(\bar{f} - \eta_n(f)\big)(z_0, \dots, z_n) = \sum_{p=0}^n \tilde{f}_{n|p}(z_p) $$

where the functions $\tilde{f}_{n|p}$, $0 \le p \le n$, are given by

$$ \tilde{f}_{n|p} = \frac{g_{n|p}}{\eta_p(g_{n|p})}\,\big(K_{n|p}(f) - \eta_n(f)\big), \qquad 0 \le p \le n, $$

with the conventions $g_{n|n} = 1$ and $K_{n|n} = \mathrm{Id}$.

Proof: Using Lemma 2.1, we first note that

$$ \tilde{f}_{n|p-1} = \frac{g_p\, K_p(g_{n|p})}{\eta_{p-1}(g_{n|p-1})} \bigg( \frac{K_p\big(g_{n|p}\, K_{n|p}(f)\big)}{K_p(g_{n|p})} - \eta_n(f) \bigg) = \frac{g_p}{\eta_{p-1}(g_{n|p-1})} \Big( K_p\big(g_{n|p}\, K_{n|p}(f)\big) - K_p(g_{n|p})\, \eta_n(f) \Big) $$

and therefore

$$ \tilde{f}_{n|p-1} = \frac{g_p\, \eta_p(g_{n|p})}{\eta_{p-1}(g_{n|p-1})}\, K_p\bigg( \frac{g_{n|p}}{\eta_p(g_{n|p})}\,\big(K_{n|p}(f) - \eta_n(f)\big) \bigg) = g_p\, \frac{\eta_p(g_{n|p})}{\eta_{p-1}(g_{n|p-1})}\, K_p(\tilde{f}_{n|p}). $$

Then, using the fact that

$$ \eta_p(g_{n|p}) = \frac{\eta_{p-1}\big(g_p\, K_p(g_{n|p})\big)}{\eta_{p-1}(g_p)} = \frac{\eta_{p-1}(g_{n|p-1})}{\eta_{p-1}(g_p)}, $$

one easily gets the backward recursion equations

$$ \tilde{f}_{n|p-1} = \frac{g_p}{\eta_{p-1}(g_p)}\, K_p(\tilde{f}_{n|p}), \qquad 1 \le p \le n. \qquad (18) $$

By definition of $A_n$, for any $\varphi \in L_2(\eta_{[0,n]})$, we have that

$$ A_n\varphi(z_0, \dots, z_n) = \sum_{m=1}^n \int \varphi(x_0, \dots, x_n)\, a_m(x_m, z_{m-1})\, \eta_{[0,n]}(dx_0, \dots, dx_n) $$

where, for every $1 \le m \le n$,

$$ a_m(x_m, z_{m-1}) = \frac{g_m(z_{m-1})\, k_m(z_{m-1}, x_m)}{\eta_{m-1}\big(g_m\, k_m(\cdot, x_m)\big)} - \frac{g_m(z_{m-1})}{\eta_{m-1}(g_m)}. $$

On the other hand, we observe that since

$$ \eta_m(dx_m) = \frac{\eta_{m-1}\big(g_m\, k_m(\cdot, x_m)\big)}{\eta_{m-1}(g_m)}\, \lambda_m(dx_m) \quad \text{and} \quad K_m(z_{m-1}, dx_m) = k_m(z_{m-1}, x_m)\, \lambda_m(dx_m), $$

we have that

$$ \frac{g_m(z_{m-1})\, k_m(z_{m-1}, x_m)}{\eta_{m-1}\big(g_m\, k_m(\cdot, x_m)\big)}\, \eta_m(dx_m) = \frac{g_m(z_{m-1})}{\eta_{m-1}(g_m)}\, K_m(z_{m-1}, dx_m). $$

Therefore,

$$ (I - A_n)\varphi(z_0, \dots, z_n) = \varphi(z_0, \dots, z_n) + \sum_{m=1}^n \frac{g_m(z_{m-1})}{\eta_{m-1}(g_m)} \int \varphi(x_0, \dots, x_n)\, \tilde{\eta}_{[0,n]}^m(z_{m-1}, dx_0, \dots, dx_n) \qquad (19) $$

where

$$ \tilde{\eta}_{[0,n]}^m(z_{m-1}, dx_0, \dots, dx_n) = \eta_{[0,m-1]}(dx_0, \dots, dx_{m-1})\, \big(\eta_m(dx_m) - K_m(z_{m-1}, dx_m)\big)\, \eta_{[m+1,n]}(dx_{m+1}, \dots, dx_n). $$

Choose now $\varphi$ given by

$$ \varphi(z_0, \dots, z_n) = \sum_{p=0}^n \tilde{f}_{n|p}(z_p), \qquad z_0, \dots, z_n \in E. $$

Then, we get

$$ (I - A_n)\varphi(z_0, \dots, z_n) = \varphi(z_0, \dots, z_n) - \sum_{m=1}^n \frac{g_m(z_{m-1})}{\eta_{m-1}(g_m)}\, K_m(\tilde{f}_{n|m})(z_{m-1}). $$

Finally, using the backward equation (18), one concludes that

$$ (I - A_n)\varphi(z_0, \dots, z_n) = \varphi(z_0, \dots, z_n) - \sum_{m=1}^n \tilde{f}_{n|m-1}(z_{m-1}) = \sum_{m=0}^n \tilde{f}_{n|m}(z_m) - \sum_{m=0}^{n-1} \tilde{f}_{n|m}(z_m), $$

so that the result follows from

$$ (I - A_n)\varphi(z_0, \dots, z_n) = \tilde{f}_{n|n}(z_n) = f(z_n) - \eta_n(f). $$

The nature of the limiting Gaussian field $\{W_n(f); f \in L_2(\eta_n)\}$ is now clearly determined. More precisely, for every $f, h \in L_2(\eta_n)$, we get from the preceding that

$$ E\big(W_n(f) W_n(h)\big) = \sum_{p=0}^n \int \Big(\frac{g_{n|p}}{\eta_p(g_{n|p})}\Big)^2 \big(K_{n|p}(f) - \eta_n(f)\big)\big(K_{n|p}(h) - \eta_n(h)\big)\, d\eta_p. $$

Lemma 2.9 is therefore established. This completes the proof of Donsker's theorem.

3 Further Uniform Results

3.1 Exponential Bounds

In Section 2, we have shown that, under rather general assumptions, the Glivenko-Cantelli theorem holds for the genetic type interacting scheme (1). The proof of Theorem 2.4 also gives an exponential rate of convergence, but this result is only valid for a number of particles larger than some value depending on the time parameter. In this section, we refine this exponential bound in the case of uniformly bounded classes $\mathcal{F}$ with polynomial covering numbers. More precisely we will use the following assumption:

(H4) $\displaystyle N(\varepsilon, \mathcal{F}) \le \Big(\frac{C}{\varepsilon}\Big)^V$ for every $0 < \varepsilon < C$

for some constants $C$ and $V$. Several examples of classes of functions satisfying this condition are discussed in [38]. For instance Vapnik-Cervonenkis classes $\mathcal{F}$ of index $V(\mathcal{F})$ and envelope function $F = 1$ satisfy (H4) with $V = 2(V(\mathcal{F}) - 1)$ and a constant $C$ that only depends on $V$.

Theorem 3.1 Let $\mathcal{F}$ be a countable class of measurable functions $f : E \to [0,1]$ satisfying (H4) for some constants $C$ and $V$. Then, for any $n \ge 0$, $\lambda > 0$ and $N \ge 1$,

$$ P\big(\|W_n^N\|_{\mathcal{F}} > \beta_n \lambda\big) \le (n+1) \Big(\frac{D\lambda}{\sqrt{V}}\Big)^V e^{-2\lambda^2} $$

where $D$ is a constant that only depends on $C$ and $\beta_n = 2(n+1) \prod_{p=1}^n a_p^2$.

Proof: We will use the decomposition (8). Using the same notations as in the beginning of Section 2 and in the proof of Theorem 2.4, we have

$$ P\Big( \big\|\Phi_{n|p}(\eta_p^N) - \Phi_{n|p}\big(\Phi_p(\eta_{p-1}^N)\big)\big\|_{\mathcal{F}} > \frac{\varepsilon}{n+1} \Big) \le P\Big( \big\|\eta_p^N - \Phi_p(\eta_{p-1}^N)\big\|_{\mathcal{F}_{n|p}} > \varepsilon_n \Big) \qquad (20) $$

where $\varepsilon_n = \varepsilon/\beta_n$ and $\beta_n = 2(n+1)\, a_{n|0}^2$. Now, under the polynomial assumption (H4) on the covering numbers, it is convenient to note that the classes $\mathcal{F}_{n|p}$, $0 \le p \le n$, also satisfy the assumptions of the theorem. Indeed, the class $\mathcal{F}_{n|p}$ is again a countable class of functions $f : E \to [0,1]$ and using Lemma 2.3 we also have, for every $\varepsilon > 0$,

$$ N(\varepsilon, \mathcal{F}_{n|p}) \le N(\varepsilon, \mathcal{F}) \le \Big(\frac{C}{\varepsilon}\Big)^V. $$

We are now in position to apply the exponential bounds of [37]. More precisely, by recalling that $\eta_p^N$ is the empirical measure associated to $N$ conditionally independent random variables with common law $\Phi_p(\eta_{p-1}^N)$, and using

the exponential bounds of [37] (see also [38]), we get

$$ P\Big( \big\|\eta_p^N - \Phi_p(\eta_{p-1}^N)\big\|_{\mathcal{F}_{n|p}} > \varepsilon_n \Big) \le \Big(\frac{D\sqrt{N}\,\varepsilon_n}{\sqrt{V}}\Big)^V e^{-2(\sqrt{N}\varepsilon_n)^2} $$

where $D$ is a constant that only depends on $C$. The remainder of the proof is exactly as in the proof of Theorem 2.4. Using (20), one finally gets the exponential bound

$$ P\big(\|\eta_n^N - \eta_n\|_{\mathcal{F}} > \varepsilon\big) \le (n+1) \Big(\frac{D\sqrt{N}\,\varepsilon_n}{\sqrt{V}}\Big)^V e^{-2(\sqrt{N}\varepsilon_n)^2}. $$

Hence, if we set $\lambda = \sqrt{N}\,\varepsilon/\beta_n$, we obtain the desired inequality and the theorem is thus established.

3.2 A Uniform Convergence Result

Here we discuss the long time behavior of the particle scheme (1). In a previous work [16], it was shown that, under some conditions on the mappings $\{\Phi_n; n \ge 1\}$, there exists a convergence rate $\alpha > 0$ such that for any bounded test function $f$

$$ \sup_{n \ge 0} E\big(|\eta_n^N(f) - \eta_n(f)|\big) \le \frac{C}{N^{\alpha/2}}\, \|f\| \qquad (21) $$

for some constant $C$ that does not depend on the time parameter $n \ge 0$. The aim of this section is to turn this result into a statement uniform in $f$ varying in a suitable class of functions $\mathcal{F}$. Before starting this discussion, let us present some comments on the results of [16]. The main idea in this work was to connect the asymptotic stability of the limiting system (2) with the study of the long time behavior of the corresponding particle scheme. One condition under which the system (2) is asymptotically stable is the following:

(H5) There exists a reference probability measure $\lambda \in M_1(E)$ and $\epsilon \in (0,1]$ such that $K_n(x, \cdot) \sim \lambda$ for any $x \in E$ and $n \ge 1$ and

$$ \epsilon \le \frac{dK_n(x, \cdot)}{d\lambda} \le \frac{1}{\epsilon}. $$

More precisely, under (H5) one can show that for any $n \ge 0$

$$ \big\|\Phi_{n|0}(\mu) - \Phi_{n|0}(\nu)\big\|_{\mathrm{TV}} \le (1 - \epsilon^2)^n, \qquad \forall \mu, \nu \in M_1(E), \qquad (22) $$

where $\|\cdot\|_{\mathrm{TV}}$ denotes the total variation norm on $M_1(E)$.

It should be mentioned that (H5) is stronger than the condition (H3) needed in the identification of the limiting Gaussian process in Donsker's theorem. Moreover, under appropriate regularity conditions on the fitness functions $\{g_n; n \ge 1\}$, the uniform estimate (21) holds for some convergence exponent $\alpha > 0$ which depends on the parameter $\epsilon$ in (H5). Several ways to relax (H5) are also presented in [16]. With these preliminaries out of the way, we now state and prove the following theorem.

Theorem 3.2 Let $\mathcal{F}$ be a countable collection of functions $f$ such that $\|f\| \le 1$, and assume that (H5) holds. Assume moreover that the limiting dynamical system (2) is asymptotically stable in the sense that, for some $\gamma > 0$,

$$ \sup_{p \ge 0} \big| \Phi_{p+T|p}(\mu)(f) - \Phi_{p+T|p}(\nu)(f) \big| \le e^{-\gamma T} \qquad (23) $$

for all $\mu, \nu \in M_1(E)$, $T \ge 0$ and $f \in \mathcal{F}$. When the fitness functions $\{g_n; n \ge 1\}$ satisfy (3) with $\sup_{n \ge 1} a_n = a < \infty$, then we have the following uniform estimate with respect to time:

$$ \sup_{n \ge 0} E\big(\|\eta_n^N - \eta_n\|_{\mathcal{F}}\big) \le \frac{C\, e^{\gamma_0}}{N^{\alpha/2}}\, I(\mathcal{F}) \qquad (24) $$

where $C$ is a universal constant and $\alpha$ and $\gamma_0$ are given by

$$ \alpha = \frac{\gamma}{\gamma + \gamma_0} \quad \text{and} \quad \gamma_0 = 1 + 2\log a. $$

Proof: We use again the decomposition (8). By the same line of arguments as the ones given in the proof of Lemma 2.8, and recalling that $\eta_p^N$ is the empirical measure associated to $N$ conditionally independent random variables with common law $\Phi_p(\eta_{p-1}^N)$, one can prove the error bound

$$E\Bigl(\bigl\|\Phi_{n|p}(\eta_p^N)-\Phi_{n|p}\bigl(\Phi_p(\eta_{p-1}^N)\bigr)\bigr\|_{\mathcal F}\,\Big|\,\eta_{p-1}^N\Bigr)\ \le\ \frac{C}{\sqrt N}\,a_{n|p}^2\,I(\mathcal F_{n|p}),$$

where $C>0$ is a universal constant. Since $a_{n|p}\le a_{n|0}\le a^n$ for $0\le p\le n$, and $I(\mathcal F_{n|p})\le I(\mathcal F)$ by Lemma 2.3, one concludes that

$$E\bigl(\bigl\|\Phi_{n|p}(\eta_p^N)-\Phi_{n|p}(\Phi_p(\eta_{p-1}^N))\bigr\|_{\mathcal F}\bigr)\ \le\ \frac{C}{\sqrt N}\,a^{2n}\,I(\mathcal F).$$

Therefore, for any time $T\ge 0$ and any $N\ge 1$,

$$\sup_{0\le n\le T} E\bigl(\|\eta_n^N-\eta_n\|_{\mathcal F}\bigr)\ \le\ \frac{C}{\sqrt N}\,(T+1)\,a^{2T}\,I(\mathcal F). \qquad (25)$$

Similarly, for any $n\ge T$ and any $f\in\mathcal F$, we have

$$\bigl|\eta_n^N(f)-\eta_n(f)\bigr|\ \le\ \sum_{p=n-T+1}^{n}\bigl|\Phi_{n|p}(\eta_p^N)(f)-\Phi_{n|p}(\Phi_p(\eta_{p-1}^N))(f)\bigr| \;+\; \bigl|\Phi_{n|n-T}(\eta_{n-T}^N)(f)-\Phi_{n|n-T}(\eta_{n-T})(f)\bigr|.$$

Under our assumptions, this implies that, for every $n\ge T$,

$$\bigl|\eta_n^N(f)-\eta_n(f)\bigr|\ \le\ e^{-\gamma T} \;+\; \sum_{p=n-T+1}^{n}\bigl|\Phi_{n|p}(\eta_p^N)(f)-\Phi_{n|p}(\Phi_p(\eta_{p-1}^N))(f)\bigr|,$$

and using the same arguments as before one can prove that

$$\sup_{n\ge T} E\bigl(\|\eta_n^N-\eta_n\|_{\mathcal F}\bigr)\ \le\ e^{-\gamma T} + \frac{C}{\sqrt N}\,(T+1)\,a^{2T}\,I(\mathcal F). \qquad (26)$$

Combining the right-hand sides of (25) and (26) leads to a uniform $L^1$-error bound with respect to time in the form of the inequality

$$\sup_{n\ge 0} E\bigl(\|\eta_n^N-\eta_n\|_{\mathcal F}\bigr)\ \le\ e^{-\gamma T} + \frac{C'}{\sqrt N}\,e^{\gamma' T}\,I(\mathcal F),$$

where $\gamma' = 1 + 2\log a$ and $C'>0$ is a universal constant. Obviously, if we choose

$$T = T(N) \stackrel{\mathrm{def}}{=} \Bigl[\frac{1}{2(\gamma+\gamma')}\,\log N\Bigr] + 1,$$

where $[r]$ denotes the integer part of $r\in\mathbb R$, we get that

$$\sup_{n\ge 0} E\bigl(\|\eta_n^N-\eta_n\|_{\mathcal F}\bigr)\ \le\ \frac{1}{N^{\alpha/2}}\,\bigl(1+e^{\gamma'}C'\bigr)\,I(\mathcal F),$$

where $\alpha = \gamma/(\gamma+\gamma')$. This ends the proof of the theorem.
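To make the choice of $T(N)$ concrete, the following sketch evaluates both error terms numerically and checks that each is of order $N^{-\alpha/2}$. The values of $\gamma$ and $\gamma'$ are arbitrary illustrative choices, not taken from the paper.

```python
import math

# Trade-off behind the choice T(N) = [log N / (2(gamma + gamma'))] + 1:
# both the stability term e^{-gamma T} and the sampling term
# e^{gamma' T}/sqrt(N) are of order N^{-alpha/2},
# with alpha = gamma/(gamma + gamma').
gamma, gamma_p = 0.5, 1.2        # illustrative values only
alpha = gamma / (gamma + gamma_p)

ratios = []
for N in (10**3, 10**6, 10**9):
    T = int(math.log(N) / (2.0 * (gamma + gamma_p))) + 1
    target = N ** (-alpha / 2.0)              # N^{-alpha/2}
    stability = math.exp(-gamma * T)          # e^{-gamma T}
    sampling = math.exp(gamma_p * T) / math.sqrt(N)
    ratios.append((stability / target, sampling / target))
```

Since $T(N)$ exceeds $\log N / (2(\gamma+\gamma'))$ by at most one, the stability term falls below $N^{-\alpha/2}$ while the sampling term exceeds it by at most the constant factor $e^{\gamma'}$.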

4 Applications

The interacting particle system models presented above have many applications in biology, genetic algorithms, non-linear filtering and bootstrap. In this short section, we briefly comment on these examples of applications. The reader who wishes to know more about specific applications of these particle algorithms is invited to consult the referenced papers.

Weighted bootstrap methods are based on re-sampling from a given weighted random measure. Roughly speaking, the key idea of bootstrap methods is the following: if the weighted random measure is close to a given non-random distribution, then the empirical measure associated to this re-sampling scheme should imitate the role of the weighted one (the book [1] includes a useful survey on this subject). In our framework, we form at each time $n$ a weighted empirical measure

$$\Psi_n(\eta_n^N) = \sum_{i=1}^{N} \frac{g_n(\xi_n^i)}{\sum_{j=1}^{N} g_n(\xi_n^j)}\,\delta_{\xi_n^i},$$

and the selection mechanism consists of re-sampling $N$ independent particles $\{\hat\xi_n^i\,;\ 1\le i\le N\}$ with this law, so as to obtain a new probability measure whose atoms have sizes that are integer multiples of $1/N$. In contrast to classical bootstrap theory, one tries here to approximate a measure valued dynamical system (2), and the weights are dictated by the structure of that dynamical system.

The basic model for the general non-linear filtering problem consists of a time-inhomogeneous Markov process $\{X_n\,;\ n\ge 0\}$ taking values in a Polish space $(E,\mathcal B(E))$ and a non-linear observation process $\{Y_n\,;\ n\ge 0\}$ taking values in $\mathbb R^d$ for some $d\ge 1$. The classical filtering problem can be summarized as finding the conditional distributions

$$\pi_n(f) \stackrel{\mathrm{def}}{=} E\bigl(f(X_n)\,\big|\,Y_1,\ldots,Y_n\bigr), \qquad f\in C_b(E),\ n\ge 0. \qquad (27)$$

It was proven in a rather general setting by Kunita [25] and Stettner [35] that, given a series of observations $Y=y$, the distributions $\{\pi_n\,;\ n\ge 0\}$ are the solution of a discrete time measure valued dynamical system of the form (2). In this framework, the Markov kernels $\{K_n\,;\ n\ge 1\}$ correspond to the transition probability kernels of the signal process $\{X_n\,;\ n\ge 0\}$, and the fitness functions $\{g_n\,;\ n\ge 1\}$ are related to the observation noise source and also depend on the observation data. In such a context, the resulting interacting particle scheme can be viewed as a stochastic adaptive grid approximation of the non-linear filtering equation (see for instance [11, 12, 13] and the references

therein). Because of its importance in practical situations, we devote the last subsection of this paper to this theme.

In biology, the former model is used to mimic the genetic process of biological organisms and, more generally, natural evolution processes. Most of the terminology we have used is drawn from this framework. For instance, in gene analysis, each population represents a chromosome and each individual particle is called a gene. In this setting the fitness function is usually time-homogeneous and it represents the performance of the set of genes in a chromosome ([22]). These particle algorithms are also used in population analysis to model changes in the structure of populations in time and in space.

The interacting particle system model (1) is not only designed to solve the non-linear filtering equation or to mimic natural evolution. Several practical problems have been solved using this approach, including numerical function optimization, image processing, combinatorial optimization tasks and machine learning. Among the huge literature on genetic algorithms we refer to the papers [3, 18, 19, 20], [28, 33] and [39, 40, 41, 42, 43].
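The selection mechanism shared by these applications, re-sampling $N$ independent sites from the weighted empirical measure, can be sketched as follows. This is an illustrative implementation (multinomial re-sampling via inverse-CDF draws), not the authors' code.

```python
import random

def resample(particles, weights, rng=random):
    """Multinomial re-sampling: draw N i.i.d. sites from the weighted
    empirical measure, so the new measure has atoms whose sizes are
    integer multiples of 1/N."""
    total = sum(weights)
    probs = [w / total for w in weights]
    # cumulative distribution for inverse-CDF sampling
    cum, acc = [], 0.0
    for p in probs:
        acc += p
        cum.append(acc)
    new = []
    for _ in range(len(particles)):
        u = rng.random()
        i = next(k for k, c in enumerate(cum) if u <= c)
        new.append(particles[i])
    return new
```

After this step every retained site carries weight $1/N$, which is exactly the shape of the empirical measures $\eta_n^N$ studied above.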

4.1 The Non-linear Filtering Problem

4.1.1 Introduction

The non-linear filtering problem consists in recursively computing the conditional distributions of a non-linear signal given its noisy observations. This problem has been extensively studied in the literature and, with the notable exception of the linear-Gaussian situation or wider classes of models (Benes filters [2]), optimal filters do not have finitely recursive solutions (Chaleyat-Maurel/Michel [7]). Although Kalman filtering ([23], [27]) is a popular tool in handling estimation problems, its optimality heavily depends on linearity. When used for non-linear filtering (Extended Kalman Filter), its performance relies on and is limited by the linearization performed on the model under investigation. It has recently been emphasized that a more efficient way to solve the filtering problem numerically is to use random particle systems. That particle algorithms are gaining popularity is attested by the list of referenced papers. Instead of hand-crafting algorithms, often on the basis of ad hoc criteria, particle system approaches provide powerful tools for solving a large class of non-linear filtering problems. Several practical problems which have been solved using these methods are given in [5, 6], including Radar/Sonar signal processing and GPS/INS integration. Other comparisons and examples where the

extended Kalman filter fails can be found in [4]. The present paper is concerned with the genetic type interacting particle systems introduced in [12]. Not all the non-linear filtering problems which arise in applications can be represented by (2). Here we assumed that all processes are time continuous and that the observation noise source is independent of the signal process. A new interacting particle scheme for solving continuous time filtering problems in which the noise source is correlated to the signal is proposed in [17]. Several variants of the genetic-type interacting particle scheme studied in this paper have been presented in [8, 9] and [10]. These variants are less "time consuming" but, as a result, the size of the system is not fixed but random. In [16], it is shown that one cannot expect a uniform convergence result with respect to time. Roughly speaking, the size of the system behaves as a martingale and, trivial cases apart, its increasing predictable quadratic variation is usually not uniformly integrable with respect to time. On the other hand, the central limit theorem for such variants with random population size is still an open subject of investigation.

4.1.2 Description of the models

Let $(X,Y)$ be a time-inhomogeneous, discrete time Markov process taking values in a product space $E\times\mathbb R^d$, $d\ge 1$, and defined by the system

$$X = (X_n)_{n\ge 0}, \qquad Y_n = h_n(X_{n-1}) + V_n,\quad n\ge 1, \qquad (28)$$

where $E$ is a Polish space, the $h_n : E\to\mathbb R^d$ are bounded continuous functions and the $V_n$ are independent random variables with continuous and strictly positive density $g_n$ with respect to Lebesgue measure. The signal process $X$ that we consider is assumed to be a time-inhomogeneous, $E$-valued Markov process with Feller transition probability kernels $K_n$, $n\ge 1$, and a given initial probability measure on $E$. We will also assume that the observation noise $V$ and $X$ are independent. The classical filtering problem is concerned with estimating the distribution of $X_n$ conditionally on the observations up to time $n$, namely

$$\pi_n(f) \stackrel{\mathrm{def}}{=} E\bigl(f(X_n)\,\big|\,Y_1,\ldots,Y_n\bigr)$$

for all $f\in C_b(E)$. For a detailed discussion of the filtering problem, the reader is referred to the pioneering paper [36] and to the more rigorous studies [34] and [24]. Recent developments can be found in [29] and [30]. The dynamical structure of the conditional distributions $\{\pi_n\,;\ n\ge 0\}$ is given by the following lemma.

Lemma 4.1 Given a fixed observation record $Y=y$, $\{\pi_n\,;\ n\ge 0\}$ is the solution of the $M_1(E)$-valued dynamical system

$$\pi_n = \Phi_n(y_n,\pi_{n-1}), \qquad n\ge 1, \qquad (29)$$

where $y_n\in\mathbb R^d$ is the current observation and $\Phi_n$ is the continuous function given by

$$\Phi_n(y_n,\eta) = \Psi_n(y_n,\eta)\,K_n,$$

where, for any $f\in C_b(E)$, $\eta\in M_1(E)$ and $y\in\mathbb R^d$,

$$\Psi_n(y,\eta)f = \frac{\int f(x)\,g_n(y-h_n(x))\,\eta(dx)}{\int g_n(y-h_n(z))\,\eta(dz)}.$$
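On a finite state space, one step of the recursion (29) reduces to elementary operations: reweight the current distribution by the observation likelihood (the Bayes update $\Psi_n$), normalize, then apply the transition matrix $K_n$. A minimal sketch with a hypothetical two-state model (all values illustrative):

```python
def filter_step(prior, likelihood, K):
    """One step pi_{n-1} -> pi_n of the recursion on a finite state space:
    Psi_n reweights the prior by the likelihood g_n(y - h_n(x)), then the
    Markov kernel K_n (a row-stochastic matrix) is applied."""
    # selection / Bayes update: Psi_n(y, eta)
    weighted = [p * g for p, g in zip(prior, likelihood)]
    z = sum(weighted)                         # normalizing constant
    updated = [w / z for w in weighted]
    # mutation / prediction: apply the kernel K_n
    n = len(prior)
    return [sum(updated[i] * K[i][j] for i in range(n)) for j in range(n)]

# hypothetical two-state example
pi = filter_step([0.5, 0.5], [0.9, 0.1], [[0.8, 0.2], [0.3, 0.7]])
# pi == [0.75, 0.25]
```

The two factors mirror the decomposition $\Phi_n = \Psi_n K_n$ in the lemma above.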

In view of (1), the transition probability kernels of the genetic scheme associated to (29) are now given by

$$P_y\bigl(\xi_n\in dx \,\big|\, \xi_{n-1}=z\bigr) = \prod_{p=1}^{N}\ \sum_{i=1}^{N} \frac{g_n(y_n-h_n(z^i))}{\sum_{j=1}^{N} g_n(y_n-h_n(z^j))}\; K_n(z^i,dx^p). \qquad (30)$$

Thus, we see that the particles move according to the following rules:

1. Updating: When the observation $Y_n=y_n$ is received, each particle examines the system of particles $\xi_{n-1}=(\xi_{n-1}^1,\ldots,\xi_{n-1}^N)$ and chooses randomly a site $\xi_{n-1}^i$ with probability

$$\frac{g_n(y_n-h_n(\xi_{n-1}^i))}{\sum_{j=1}^{N} g_n(y_n-h_n(\xi_{n-1}^j))}.$$

2. Prediction: After the updating mechanism, each particle evolves according to the transition probability kernel of the signal process.
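The two-step motion above can be sketched for a hypothetical one-dimensional model (linear signal $X_n = 0.9\,X_{n-1} + W_n$, observation $Y_n = X_{n-1} + V_n$ with standard Gaussian $W_n$, $V_n$). The model and all parameter values are illustrative, not taken from the paper.

```python
import math
import random

def gaussian_density(v, sigma=1.0):
    # density g_n of the observation noise, here N(0, sigma^2)
    return math.exp(-0.5 * (v / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def particle_step(particles, y, rng):
    # 1. Updating: select sites with probability proportional to g_n(y - h_n(.))
    weights = [gaussian_density(y - x) for x in particles]
    selected = rng.choices(particles, weights=weights, k=len(particles))
    # 2. Prediction: each particle evolves with the signal kernel K_n
    return [0.9 * x + rng.gauss(0.0, 1.0) for x in selected]

rng = random.Random(1)
particles = [rng.gauss(0.0, 1.0) for _ in range(500)]
for y in (0.5, 0.7, 0.6):        # a short synthetic observation record
    particles = particle_step(particles, y, rng)
```

The empirical measure of `particles` after $n$ steps is the approximation $\eta_n^N$ of the conditional distribution $\pi_n$ studied above.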

4.1.3 Applications

Quenched Results

In order to apply the results of the preceding sections to this setting, rather than (3) we simply use the following assumption:

(F) For any time $n\ge 0$, $h_n$ is bounded and continuous and $g_n$ is continuous with strictly positive values.

We note that if (F) holds then it is easily seen that there exists a family of positive functions $\{a_n\,;\ n\ge 0\}$ such that, for all $(y,x)\in\mathbb R^d\times E$ and $n\ge 0$,

$$a_n(y)^{-1}\ \le\ \frac{g_n(y-h_n(x))}{g_n(y)}\ \le\ a_n(y).$$

In such a framework, the fitness functions depend on the observation record. The assumption (F) is chosen so that the bounds (3) hold for some constants depending on the observation parameters. We also note that we can replace the fitness functions $g_n(y-h_n(\cdot))$ by the "normalized" ones $g_n(y)^{-1}\,g_n(y-h_n(\cdot))$ without altering the dynamical structure of (29) or (30). For any observation record $Y=y$, the results presented in Sections 2 and 3 hold, although the constants $\{a_n\,;\ n\ge 1\}$ and the covariance function in Donsker's theorem will depend here on the observation parameters $\{y_n\,;\ n\ge 1\}$.

Averaged Results

One question that one may naturally ask is whether the averaged versions of Theorem 1.2 and Theorem 1.3 hold. In many practical situations, the functions $a_n : \mathbb R^d\to\mathbb R_+$, $n\ge 1$, have a rather complicated form and it is difficult to obtain an averaged version of Theorem 1.3. Nevertheless, the averaged version of the $L^1$-error bounds presented in Theorem 1.4 does hold for a large class of non-linear sensors. Instead of (F), we will use the following stronger condition:

(F)' For any time $n\ge 1$, there exists a positive function $a_n : \mathbb R^d\to[1,\infty)$ and a non-decreasing function $\delta : \mathbb R\to\mathbb R$ such that, for all $x\in E$ and $y\in\mathbb R^d$,

$$\frac{1}{a_n(y)}\ \le\ \frac{g_n(y-h_n(x))}{g_n(y)}\ \le\ a_n(y) \qquad (31)$$

and

$$|\log a_n(y+u) - \log a_n(y)|\ \le\ \delta(\|u\|).$$

The main simplification due to (F)' is that now

$$a_n^2(Y_n)\ \le\ a_n^2(V_n)\,\exp\bigl(2\delta(\|h_n\|)\bigr).$$

In view of the proof of Theorem 1.4, one can check that (24) remains valid if we replace the condition $\sup_{n\ge 1} a_n < \infty$ by

$$L \stackrel{\mathrm{def}}{=} \sup_{n\ge 1}\,\log E\bigl(a_n^2(V_n)\bigr)^{1/2} < \infty \qquad\text{and}\qquad M \stackrel{\mathrm{def}}{=} \sup_{n\ge 1}\,\|h_n\| < \infty.$$

In this case (24) holds with $\gamma' = 1 + 2(L + \delta(M))$.

4.1.4 Examples

Let us now illustrate examples of non-linear observation noise sources for which (F)' holds.

1. As a typical example of a non-linear filtering problem, assume the functions $h_n : E\to\mathbb R^d$, $n\ge 1$, are bounded and continuous and the densities $g_n$ are given by

$$g_n(v) = \frac{1}{\bigl((2\pi)^d\,|R_n|\bigr)^{1/2}}\,\exp\Bigl(-\frac{1}{2}\,v' R_n^{-1} v\Bigr),$$

where $R_n$ is a $d\times d$ symmetric positive matrix. This corresponds to the situation where the observations are given by

$$Y_n = h_n(X_{n-1}) + V_n, \qquad n\ge 1,$$

where $(V_n)_{n\ge 1}$ is a sequence of $\mathbb R^d$-valued independent random variables with Gaussian densities. After some easy manipulations one concludes that (F)' holds with

$$\log a_n(y) = \frac{1}{2}\,\|R_n^{-1}\|\,\|h_n\|^2 + \|R_n^{-1}\|\,\|h_n\|\,|y|,$$

where $\|R_n^{-1}\|$ is the spectral radius of $R_n^{-1}$. In addition, we have that

$$|\log a_n(y+u) - \log a_n(y)|\ \le\ L_n\,|u| \qquad\text{with}\qquad L_n = \|R_n^{-1}\|\,\|h_n\|.$$
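A quick numerical sanity check of the Gaussian bound in the scalar case $d=1$, $R_n=\sigma^2$: for any $|h|\le h_{\max}$, the ratio $g(y-h)/g(y)$ lies between $a(y)^{-1}$ and $a(y)$ with $\log a(y) = \tfrac12\sigma^{-2}h_{\max}^2 + \sigma^{-2}h_{\max}|y|$. Parameter values are illustrative.

```python
import math
import random

# Scalar (d = 1) verification of the bound a(y)^{-1} <= g(y-h)/g(y) <= a(y)
# for the N(0, sigma^2) density, with |h| <= hmax.
sigma, hmax = 1.3, 0.8          # illustrative values
rng = random.Random(0)

ok = True
for _ in range(1000):
    y = rng.uniform(-10.0, 10.0)
    h = rng.uniform(-hmax, hmax)
    # ratio g(y - h)/g(y); the normalizing constants cancel
    ratio = math.exp((y * y - (y - h) ** 2) / (2.0 * sigma ** 2))
    log_a = 0.5 * (hmax / sigma) ** 2 + (hmax / sigma ** 2) * abs(y)
    a = math.exp(log_a)
    ok = ok and (1.0 / a <= ratio <= a)
```

The check succeeds because $\log(g(y-h)/g(y)) = yh/\sigma^2 - h^2/(2\sigma^2)$, whose absolute value is at most $\log a(y)$.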

2. Our result is not restricted to Gaussian noise sources. For instance, let us assume that $d=1$ and $g_n$ is a bilateral exponential density

$$g_n(v) = \frac{\beta_n}{2}\,\exp\bigl(-\beta_n|v|\bigr), \qquad \beta_n > 0.$$

In this case (F)' holds with

$$\log a_n(y) = \beta_n\,\|h_n\|,$$

which is independent of the observation parameter $y$. Another interesting remark is that in this particular case the exponential bounds of Theorem 1.3 hold.
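The same kind of numerical check for the bilateral exponential case: for $g(v)=(b/2)e^{-b|v|}$ the ratio $g(y-h)/g(y)=e^{b(|y|-|y-h|)}$ lies in $[e^{-b h_{\max}}, e^{b h_{\max}}]$ uniformly in $y$ whenever $|h|\le h_{\max}$, by the triangle inequality. Parameter values are illustrative.

```python
import math
import random

# Verify that the bilateral exponential bound is uniform in y:
# |log(g(y-h)/g(y))| = b * ||y| - |y-h|| <= b * hmax for |h| <= hmax.
b, hmax = 0.7, 1.5              # illustrative values
rng = random.Random(2)
a = math.exp(b * hmax)          # the y-independent bound

ok = True
for _ in range(1000):
    y = rng.uniform(-20.0, 20.0)
    h = rng.uniform(-hmax, hmax)
    ratio = math.exp(b * (abs(y) - abs(y - h)))
    ok = ok and (1.0 / a <= ratio <= a)
```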

References

[1] P. Barbe, P. Bertail. The Weighted Bootstrap. Lecture Notes in Statistics, No 98, Springer-Verlag, 1995.
[2] B.E. Benes. Exact Finite-Dimensional Filters for Certain Diffusions with Nonlinear Drift. Stochastics, Vol. 5, pp. 65-92, 1981.
[3] C.L. Bridges, D.E. Goldberg. An analysis of reproduction and crossover in a binary-coded genetic algorithm, in J.J. Grefenstette (Ed.), Proc. of the 2nd Int. Conf. on Genetic Algorithms, Lawrence Erlbaum Associates, 1987.
[4] R.S. Bucy. Lectures on discrete time filtering, Springer-Verlag, Signal Processing and Digital Filtering, 1994.
[5] H. Carvalho. Filtrage optimal non lineaire du signal GPS NAVSTAR en recalage de centrales de navigation, These de l'Ecole Nationale Superieure de l'Aeronautique et de l'Espace, Toulouse, 1995.
[6] H. Carvalho, P. Del Moral, A. Monin, G. Salut. Optimal non-linear filtering in GPS/INS integration. IEEE Trans. on Aerospace and Electronic Systems, Vol. 33, No 3, pp. 835-850, July 1997.
[7] M. Chaleyat-Maurel, D. Michel. Des resultats de non existence de filtres de dimension finie. C.R. Acad. Sci. Paris, Serie I, t. 296, 1983.
[8] D. Crisan, T.J. Lyons. Nonlinear Filtering and Measure Valued Processes. Prob. Theory and Related Fields, 109, 217-244, 1997.
[9] D. Crisan, J. Gaines, T.J. Lyons. A Particle Approximation of the Solution of the Kushner-Stratonovitch Equation. To appear in SIAM Journal of Applied Math., 1998.
[10] D. Crisan, P. Del Moral, T.J. Lyons. Non Linear Filtering Using Branching and Interacting Particle Systems, Publications du Laboratoire de Statistiques et Probabilites, Universite Paul Sabatier, No 01-98, 1998.
[11] P. Del Moral. Measure Valued Processes and Interacting Particle Systems. Application to Non Linear Filtering Problems, Publications du Laboratoire de Statistiques et Probabilites, Universite Paul Sabatier, No 15-96, 1996. To appear in Annals of Applied Prob., 1998.
[12] P. Del Moral. Non-linear Filtering: Interacting particle resolution, Markov Processes and Related Fields, Vol. 2, 1996.

[13] P. Del Moral. Filtrage non lineaire par systemes de particules en interaction, C.R. Acad. Sci. Paris, t. 325, Serie I, p. 653-658, 1997.
[14] P. Del Moral, A. Guionnet. Large Deviations for Interacting Particle Systems. Applications to Non Linear Filtering Problems. Publications du Laboratoire de Statistiques et Probabilites, Universite Paul Sabatier, No 05-97, 1997. To appear in Stochastic Processes and their Applications, 1998.
[15] P. Del Moral, A. Guionnet. A Central Limit Theorem for Non Linear Filtering using Interacting Particle Systems. Publications du Laboratoire de Statistiques et Probabilites, Universite Paul Sabatier, No 11-97, 1997. To appear in Annals of Applied Prob., 1998.
[16] P. Del Moral, A. Guionnet. On the Stability of Measure Valued Processes. Applications to Non Linear Filtering and Interacting Particle Systems, Publications du Laboratoire de Statistiques et Probabilites, Universite Paul Sabatier, No 03-98, 1998.
[17] P. Del Moral, J. Jacod, Ph. Protter. The Monte-Carlo Method for filtering with discrete time observations. Publications du Laboratoire de Probabilites, Paris VI, No 453, June 1998.
[18] D.E. Goldberg. Genetic Algorithms and Rule Learning in Dynamic Control Systems. Proceedings of the First International Conference on Genetic Algorithms, Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 8-15, 1985.
[19] D.E. Goldberg. Simple genetic algorithms and the minimal deceptive problem, in Lawrence Davis (Ed.), Genetic Algorithms and Simulated Annealing, Pitman, 1987.
[20] D.E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, MA, 1989.
[21] N.J. Gordon, D.J. Salmon, A.F.M. Smith. Novel Approach to nonlinear/non-Gaussian Bayesian state estimation, IEE, 1993.
[22] J.H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, 1975.
[23] A.H. Jazwinski. Stochastic Processes and Filtering Theory. Academic Press, 1970.
[24] G. Kallianpur, C. Striebel. Stochastic differential equations occurring in the estimation of continuous parameter stochastic processes, Tech. Rep. No 103, Department of Statistics, Univ. of Minnesota, September 1967.

[25] H. Kunita. Asymptotic Behavior of Nonlinear Filtering Errors of Markov Processes. Journal of Multivariate Analysis, Vol. 1, No 4, p. 365-393, 1971.
[26] M. Ledoux, M. Talagrand. Probability in Banach Spaces. Springer-Verlag, 1991.
[27] D. Liang, J. McMillan. Development of a marine integrated navigation system. Kalman Filter Integration of Modern Guidance and Navigation Systems, AGARD-LS-166, OTAN, 1989.
[28] A. Nix, M.D. Vose. Modelling genetic algorithms with Markov chains, Annals of Mathematics and Artificial Intelligence, 5, 79-88, 1991.
[29] D.L. Ocone. Topics in Nonlinear Filtering Theory, Ph.D. Thesis, MIT, Cambridge, 1980.
[30] E. Pardoux. Filtrage Non Lineaire et Equations aux Derivees Partielles Stochastiques Associees, Ecole d'ete de Probabilites de Saint-Flour XIX-1989, Lecture Notes in Mathematics, 1464, Springer-Verlag, 1991.
[31] D. Pollard. Convergence of Stochastic Processes. Springer-Verlag, New York, 1984.
[32] S.T. Rachev. Probability Metrics and the Stability of Stochastic Models. Wiley, New York, 1991.
[33] P. Segrest, D.E. Goldberg. Finite Markov chain analysis of genetic algorithms, in J.J. Grefenstette (Ed.), Proc. of the 2nd Int. Conf. on Genetic Algorithms, Lawrence Erlbaum Associates, 1987.
[34] A.N. Shiryaev. On stochastic equations in the theory of conditional Markov processes, Theor. Prob. Appl., 11, 179-184, 1966.
[35] L. Stettner. On Invariant Measures of Filtering Processes. Stochastic Differential Systems, Lecture Notes in Control and Information Sciences, 126, Springer-Verlag, 1989.
[36] R.L. Stratonovich. Conditional Markov processes, Theor. Prob. Appl., 5, 156-178, 1960.
[37] M. Talagrand. Sharper bounds for Gaussian and empirical processes. Annals of Probability, 22, 28-76, 1994.
[38] A.W. van der Vaart, J.A. Wellner. Weak Convergence and Empirical Processes with Applications to Statistics. Springer Series in Statistics, 1996.

[39] M.D. Vose. Modelling simple genetic algorithms, in Foundations of Genetic Algorithms, Morgan Kaufmann, 1993.
[40] M.D. Vose. Modelling simple genetic algorithms, Evolutionary Computation, 3 (4), 453-472, 1995.
[41] M.D. Vose, G.E. Liepins. Punctuated equilibria in genetic search. Complex Systems, 5, 31-44, 1993.
[42] M.D. Vose, A.H. Wright. Simple genetic algorithms with linear fitness, Evolutionary Computation, 2 (4), 347-368, 1995.
[43] L.D. Whitley. An executable model of a simple genetic algorithm, in Foundations of Genetic Algorithms, Morgan Kaufmann, 1993.

