Interacting Particle Systems Approximations of the Kushner-Stratonovitch Equation

D. Crisan*, P. Del Moral†, T.J. Lyons*

*Imperial College, Department of Mathematics, Huxley Building, 180 Queen's Gate, London SW7 2BZ. Work partly supported by CEC Contract No ERB-FMRX-CT96-0075.
†LSP-CNRS UMR C55830, Bat. 1R1, Univ. Paul Sabatier, 31062 Toulouse. Email: [email protected].

Abstract

In this paper we consider the continuous time filtering problem and we estimate the order of convergence of an interacting particle system scheme presented by the authors in previous works. We discuss how the discrete time approximating model of the Kushner-Stratonovitch equation and the genetic type interacting particle system approximation combine. We present quenched error bounds as well as mean error bounds and the resulting order of convergence.

Keywords: Non Linear Filtering, Filter approximation, error bounds, interacting particle systems, genetic algorithms.

A.M.S. Codes: 60G35, 93E11, 62L20.

1 Introduction

1.1 Background and Motivations

The aim of this work is the design of an interacting particle system approach for the numerical solution of the continuous time non linear filtering problem. Recently, intense interest has been aroused in the non linear filtering community concerning the connections between non linear estimation, measure valued processes and branching and interacting particle systems. The evolution of this material may be seen quite directly through the list of referenced papers. Let us briefly survey some different approaches and motivate our work.


In [17, 18], [19] and [23] the authors proposed a preliminary time discretization scheme for the optimal filter evolution. They then solved the discrete time approximating model by using Monte Carlo simulations or spatial quantizations of the signal. In these works special attention was paid to the rate of convergence of the time discretization scheme to the continuous time original model. The Monte Carlo particle approach described therein consists of independent particles weighted by exponentials. Unfortunately this Monte Carlo approximation is not efficient, mainly because the particles are independent of each other and the growth of the exponential weights is difficult to control as time goes on. In [5, 8] a way was proposed to regularize these exponential weights, and a natural ergodic assumption was introduced on the signal semigroup under which the particle approximation converges in law to the optimal filter uniformly with respect to time. All the above Monte Carlo approximations are the crudest of the random particle system approaches. It has recently been emphasized that a more efficient approach is to use interacting and branching particle systems to numerically solve the filtering problem. In [1, 2] a branching particle system approximation for the filtering problem of diffusions is studied. Although these works give some insight into the connections between branching particle systems and non linear filtering, they rely entirely on two assumptions: first, it is assumed that the continuous semigroup of the signal is explicitly known, in the sense that we can exactly simulate random transitions according to this semigroup; further, it is assumed that the stochastic integrals arising in the Girsanov exponentials are readily accessible without further work. It can be argued that in practice the signal semigroup and the above stochastic integrals are not known exactly, and another level of approximation is therefore needed. In [6, 7, 9, 10] and [14] different interacting particle system approximations for the discrete time filtering problem are studied. Here again it is assumed that we can exactly simulate random transitions according to the semigroup of the signal and that the Girsanov exponentials are exactly known. Even if these particle methods provide essential insight into solving quite general discrete time non linear filtering problems, they do not apply directly to the continuous time case. The aim of this work is to extend the genetic type interacting particle system approach introduced in [6, 7, 9] and [14] to handle continuous time filtering problems. A crucial practical advantage of our approximating model is that it does not involve stochastic integrals, and the transition probability kernel which governs our algorithm is chosen so that we can exactly simulate its random transitions. Our construction is also explicit, the recursions have a simple form and they can easily

be simulated on a computer. In contrast to [1, 2] we use a preliminary discrete time and measure valued approximating model for the Kushner-Stratonovitch equation. This time discretization scheme was introduced by Picard in [23]. The study of time discretization approximating models is still an active research area. The most accurate work on this subject seems to be [18]; in that work the authors extend their study to the time discretization problem for non linear filtering of signals driven by not necessarily white noises. As pointed out in [18] and [23], the resulting discrete time and measure valued model also characterizes the evolution in time of the optimal filter for a suitably defined discrete time filtering problem. It follows that the branching and interacting particle system approaches presented in [6, 7, 9] and [3] can be applied to this discrete time approximating model. We study how the time discretization and the particle system approximating models combine and we provide explicit quenched and mean error bounds for the global approximating scheme. The paper has the following structure: in Section 2 we introduce the continuous time non linear filtering model and the discrete time approximating model under study. In Section 3 we introduce the interacting particle system approximating model and we study the order of convergence to the solution of the so-called Kushner-Stratonovitch equation. Firstly we present two different approaches to obtain quenched error bounds. Then we apply these strategies to study mean error bounds and the resulting order of convergence for the Fortet-Mourier distance (see for instance [13]).


1.2 Description of the Model and Statement of some Results

To clarify the presentation, all the processes considered in this work are indexed by the compact interval $[0,1] \subset \mathbb{R}_+$. The basic model for the continuous time non linear filtering problem consists of a time homogeneous Markov process $\{(X_t, Y_t) : t \in [0,1]\}$ taking values in $\mathbb{R}^p \times \mathbb{R}^q$, $p, q \ge 1$. We assume that $Y_0 = 0$ and $X_0$ is a random variable with law $\pi_0$. The classical filtering problem can be summarized as finding the conditional distribution of the signal $X$ at time $t$ given the observations $Y$ up to time $t$, namely
$$\pi_t f = E\big(f(X_t) \,\big|\, \mathcal{Y}_t\big) \qquad \forall f \in C_b(\mathbb{R}^p) \tag{1}$$
where $\mathcal{Y}_t$ is the $\sigma$-algebra generated by the observations $Y$ up to time $t$. Here we will only consider the case where the signal $X$ is a diffusion and the observation $\{Y_t : t \in [0,1]\}$ is the solution of the Ito-type differential equation

$$dY_t = h(X_t)\,dt + dV_t \tag{2}$$

where $h : \mathbb{R}^p \to \mathbb{R}^q$ is a bounded continuous function and $\{V_t : t \in [0,1]\}$ is a standard $q$-dimensional Brownian motion independent of $\{X_t : t \in [0,1]\}$. As mentioned in the introduction, our approximating model is obtained by first introducing a time discretization scheme for the basic model. To this end we introduce a sequence of meshes $\{(t_0, \dots, t_M) : M \ge 1\}$ given by
$$t_n = \frac{n}{M}, \qquad n \in \{0, \dots, M\}.$$

Then, to obtain a computationally feasible solution we will use the following natural assumptions:

(H1) For any $M \ge 1$ there exists a transition probability kernel $P^{(M)}$ such that
$$\sup_{t_n \in [0,1]} E\Big(\big|X_{t_n} - X^{(M)}_{t_n}\big|^2\Big) \;\le\; \frac{K}{M};$$

(H2) a second condition, involving a constant $K'$ and required to hold for every $x \in \mathbb{R}^p$.

From now on we denote by $\{\tilde\pi_{t_n} : n = 0, \dots, M\}$ the solution of the resulting approximating discrete time model
$$\begin{cases} \tilde\pi_{t_n} = \Psi^M_{t_n}\big(Y_{t_n}, \tilde\pi_{t_{n-1}}\big)\, P_{\frac1M}, & n = 1, \dots, M \\ \tilde\pi_0 = \pi_0 & \end{cases} \tag{8}$$

where for any probability measure $\eta$ on $\mathbb{R}^p$ and for any $f \in C_b(\mathbb{R}^p)$
$$\Psi^M_{t_n}\big(Y_{t_n}, \eta\big)f = \frac{\displaystyle\int f(x)\, g^M_{t_n}(Y_{t_n}, x)\, \eta(dx)}{\displaystyle\int g^M_{t_n}(Y_{t_n}, x)\, \eta(dx)}.$$

Elementary manipulations show that the solution of the latter system is given by the formula
$$\tilde\pi_{t_n} f = \frac{\displaystyle\int f(x_n) \prod_{k=1}^{n} g^M_{t_k}(Y_{t_k}, x_{k-1}) \prod_{k=1}^{n} P_{\frac1M}(x_{k-1}, dx_k)\, \pi_0(dx_0)}{\displaystyle\int \prod_{k=1}^{n} g^M_{t_k}(Y_{t_k}, x_{k-1}) \prod_{k=1}^{n} P_{\frac1M}(x_{k-1}, dx_k)\, \pi_0(dx_0)}. \tag{9}$$

Remark 2.2: It appears from the above that an equivalent way to compute recursively in time the multiple integrals arising in (9) consists of solving the discrete time and measure valued dynamical system (8). Let us also remark that each transition $\tilde\pi_{t_{n-1}} \rightsquigarrow \tilde\pi_{t_n}$ is decomposed into two separate mechanisms
$$\tilde\pi_{t_{n-1}} \;\xrightarrow{\;(1)\;}\; \Psi^M_{t_n}\big(Y_{t_n}, \tilde\pi_{t_{n-1}}\big) \;\xrightarrow{\;(2)\;}\; \Psi^M_{t_n}\big(Y_{t_n}, \tilde\pi_{t_{n-1}}\big)\, P_{\frac1M}.$$

The first one is usually called the "updating" transition. The transformation $\Psi^M_{t_n}(Y, \cdot)$ is a Bayes' rule which depends on the current observation data $Y_{t_n}$. The second one is called the "prediction" transition because it does not depend on the observations but on the semigroup of the signal. As we shall see in the forthcoming development, our interacting particle approximating scheme is also decomposed into the same kind of transitions.

A crucial practical advantage of the approximating model (8) is that it does not involve stochastic integrations. We recall that in practical situations the observation data are often sampled with a constant period of a given length $\frac1M$. In our approximating model the exponential weights $\{g^M_{t_n}(Y_{t_n}, \cdot) : n = 0, \dots, M\}$ have an explicit analytic expression, so that they can readily be computed when the observations $\{Y_{t_n} : n = 0, \dots, M\}$ are received. Of course the difficulty in solving (8) lies in computing, recursively in time, a series of integrations over the whole state space $\mathbb{R}^p$. It is therefore tempting to use the interacting particle system approximations introduced in [3] and [7, 9]. Unfortunately we do not know how to simulate random transitions according to the semigroup of the signal, and another kind of approximation is therefore needed.
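To fix ideas, such sampled-observation weights can indeed be evaluated with a few lines of code. The sketch below is only illustrative: it assumes the usual Girsanov-type form $g^M_{t_n}(Y_{t_n}, x) = \exp\big(h(x)\,\Delta Y_{t_n} - \frac{1}{2M}\,h(x)^2\big)$ for a scalar observation, where $\Delta Y_{t_n}$ denotes the sampled increment; the exact expression used in the paper is not restated here, so the formula should be read as an assumption.

```python
import numpy as np

def weights(h, x, dY, M):
    """Exponential weights for one sampled observation increment.

    Assumed (illustrative) form: g(x) = exp( h(x)*dY - h(x)**2 / (2*M) ).
    h  -- observation function (vectorized, scalar-valued)
    x  -- array of state values at which the weight is evaluated
    dY -- sampled observation increment Y_{t_n} - Y_{t_{n-1}}
    M  -- number of mesh points, so the sampling period is 1/M
    """
    hx = h(x)
    return np.exp(hx * dY - hx**2 / (2.0 * M))
```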

2.2.2 Approximation of the signal semigroup

Under the assumptions (H1) and (H2) we now introduce a final discrete time and measure valued approximating model. Firstly it is worth noting that an example of an approximating Markov chain $\{X^{(M)}_{t_n} : n = 0, \dots, M\}$ satisfying these assumptions is given by the classical Euler scheme
$$X^{(M)}_{t_n} = X^{(M)}_{t_{n-1}} + a\big(X^{(M)}_{t_{n-1}}\big)(t_n - t_{n-1}) + c\big(X^{(M)}_{t_{n-1}}\big)\big(W_{t_n} - W_{t_{n-1}}\big), \qquad n = 1, \dots, M \tag{10}$$
with $X^{(M)}_0 = X_0$. This scheme is the crudest of the discretization schemes that can be used in our setting. Other time discretization schemes for diffusive signals are described in full detail in [21] and [25].
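As an illustration, one transition of the Euler chain (10) is straightforward to simulate; the sketch below assumes a one-dimensional signal, with `a` and `c` placeholder names standing for the drift and diffusion coefficients of the signal equation.

```python
import numpy as np

def euler_step(x, a, c, M, rng):
    """One transition of the Euler chain (10) over a time step of length 1/M.

    x    -- array of current states X^{(M)}_{t_{n-1}}
    a, c -- drift and diffusion coefficients of the signal (assumed known)
    rng  -- numpy random generator supplying the Brownian increments
    """
    dt = 1.0 / M
    dW = rng.normal(0.0, np.sqrt(dt), size=np.shape(x))  # Brownian increments
    return x + a(x) * dt + c(x) * dW

# Example with placeholder Ornstein-Uhlenbeck coefficients:
rng = np.random.default_rng(0)
x0 = rng.normal(size=1000)                                # samples of X_0
x1 = euler_step(x0, a=lambda u: -u, c=lambda u: np.ones_like(u), M=50, rng=rng)
```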

Once again we fix the observations, and we denote by $\{\pi^M_{t_n} : n = 0, \dots, M\}$ the solution of the discrete time and measure valued dynamical system
$$\begin{cases} \pi^M_{t_n} = \Phi^M_{t_n}\big(Y_{t_n}, \pi^M_{t_{n-1}}\big), & n = 1, \dots, M \\ \pi^M_0 = \pi_0 & \end{cases} \tag{11}$$

where $\Phi^M_{t_n}(Y_{t_n}, \cdot)$ is defined by the formula
$$\Phi^M_{t_n}\big(Y_{t_n}, \eta\big) \;\stackrel{\mathrm{def}}{=}\; \Psi^M_{t_n}\big(Y_{t_n}, \eta\big)\, P^{(M)}$$
for any probability measure $\eta$ on $\mathbb{R}^p$.

Arguing as before, we see that the solution of the above system is given by
$$\pi^M_{t_n} f = \frac{\displaystyle\int f(x_n) \prod_{k=1}^{n} g^M_{t_k}(Y_{t_k}, x_{k-1}) \prod_{k=1}^{n} P^{(M)}(x_{k-1}, dx_k)\, \pi_0(dx_0)}{\displaystyle\int \prod_{k=1}^{n} g^M_{t_k}(Y_{t_k}, x_{k-1}) \prod_{k=1}^{n} P^{(M)}(x_{k-1}, dx_k)\, \pi_0(dx_0)}. \tag{12}$$

The error bound caused by the discretization of the time interval $[0,1]$ and the approximation of the signal semigroup is well understood (see for instance Proposition 5.2, p. 31 in [17], Theorem 2 in [22], Theorem 4.2 in [18] and also Theorem 4.1 in [4]). More precisely we have the following well known result.

Theorem 2.3 ([18]) Let $f$ be a bounded test function on $\mathbb{R}^p$ satisfying the Lipschitz condition
$$|f(x) - f(z)| \le k(f)\,|x - z|.$$
Then
$$\sup_{t \in [0,1)} E\big(|\pi_t f - \pi^M_t f|\big) \le \frac{C}{\sqrt M}\,\big(\|f\| + k(f)\big) \tag{13}$$
where $C$ is some finite constant.

3 Interacting Particle System Model

3.1 Description of the genetic type particle algorithm

Even if it looks innocent, the dynamical system (11) involves at each step several integrations over the whole state space $\mathbb{R}^p$. Therefore it is necessary to find a new strategy to solve the former integrals recursively in time. We now introduce a particle system approximating model for the numerical solution of (11). The genetic-type algorithm presented in the forthcoming development was introduced in [6, 9] in discrete time settings and further developed in [7, 10, 11].

Namely, the interacting particle system associated with (11) consists of a Markov chain $\{\xi_{t_n} : n = 0, \dots, M\}$ with product state space $(\mathbb{R}^p)^N$, where $N \ge 1$ is the size of the system, defined by
$$P_Y\big(\xi_{t_0} \in dx\big) = \prod_{p=1}^{N} \pi_0(dx^p)$$
$$P_Y\big(\xi_{t_n} \in dx \,\big|\, \xi_{t_{n-1}} = z\big) = \prod_{p=1}^{N} \Phi^M_{t_n}\Big(Y_{t_n}, \frac1N \sum_{i=1}^{N} \delta_{z^i}\Big)(dx^p)$$
where $dx = dx^1 \times \dots \times dx^N$ is an infinitesimal neighborhood of the point $x = (x^1, \dots, x^N) \in (\mathbb{R}^p)^N$. Let us remark that

$$\Phi^M_{t_n}\Big(Y, \frac1N \sum_{i=1}^{N} \delta_{z^i}\Big) = \sum_{i=1}^{N} \frac{g^M_{t_n}(Y, z^i)}{\sum_{j=1}^{N} g^M_{t_n}(Y, z^j)}\; \delta_{z^i}\, P^{(M)}$$
and therefore the transitions of the system may be written
$$P_Y\big(\xi_{t_n} \in dx \,\big|\, \xi_{t_{n-1}} = z\big) = \prod_{p=1}^{N} \sum_{i=1}^{N} \frac{g^M_{t_n}(Y_{t_n}, z^i)}{\sum_{j=1}^{N} g^M_{t_n}(Y_{t_n}, z^j)}\; P^{(M)}(z^i, dx^p).$$
Using the above observations we see that the particles evolve according to two separate mechanisms
$$\xi_{t_{n-1}} \;\xrightarrow{\ \text{Updating}\ }\; \hat\xi_{t_{n-1}} \;\xrightarrow{\ \text{Prediction}\ }\; \xi_{t_n}.$$
More precisely, an equivalent formulation is the following:

• Initial Particle System
$$P_Y\big(\xi_{t_0} \in dx\big) = \prod_{p=1}^{N} \pi_0(dx^p)$$
• Updating
$$P_Y\big(\hat\xi_{t_{n-1}} \in dx \,\big|\, \xi_{t_{n-1}} = z\big) = \prod_{p=1}^{N} \sum_{i=1}^{N} \frac{g^M_{t_n}(Y_{t_n}, z^i)}{\sum_{j=1}^{N} g^M_{t_n}(Y_{t_n}, z^j)}\; \delta_{z^i}(dx^p)$$
• Prediction
$$P_Y\big(\xi_{t_n} \in dx \,\big|\, \hat\xi_{t_{n-1}} = z\big) = \prod_{p=1}^{N} P^{(M)}(z^p, dx^p).$$
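In code, one transition of this particle system combines a selection step driven by the weights with a mutation step driven by the kernel $P^{(M)}$. The sketch below is only a sketch: it reuses the hypothetical `weights` and `euler_step` helpers from the previous sections (so the Euler chain stands in for $P^{(M)}$) and uses multinomial selection, mirroring the updating/prediction decomposition above.

```python
import numpy as np

def particle_transition(x, h, a, c, dY, M, rng):
    """One genetic-type transition of the N-particle system.

    Updating (selection): draw N particles with replacement, with
        probabilities proportional to g^M_{t_n}(Y_{t_n}, x^i).
    Prediction (mutation): move each selected particle with one transition
        of the approximating kernel P^{(M)} (here an Euler step).
    """
    N = x.shape[0]
    g = weights(h, x, dY, M)                  # unnormalized weights (assumed form)
    idx = rng.choice(N, size=N, p=g / g.sum())   # selection / updating
    x_hat = x[idx]
    return euler_step(x_hat, a, c, M, rng)       # prediction / mutation
```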

As pointed out in [6], this particle approximation belongs to the class of algorithms called genetic algorithms, which mimic natural evolution: the exploration/mutation stage is the prediction step and the selection transition corresponds to the updating procedure. These algorithms were introduced by Holland in 1975 [15] to handle global optimization problems on a finite state space. The situation is different when dealing with non linear filtering problems, mainly because there is no temperature parameter which tends to zero and the state space is not finite. A crucial practical advantage of the genetic type approximating model introduced above is that it leads to an empirical measure valued approximating model
$$\pi^{M,N}_{t_{n-1}} \;\stackrel{\mathrm{def}}{=}\; \frac1N \sum_{i=1}^{N} \delta_{\xi^i_{t_{n-1}}} \;\xrightarrow{\ \text{Updating}\ }\; \frac1N \sum_{i=1}^{N} \delta_{\hat\xi^i_{t_{n-1}}} \;\xrightarrow{\ \text{Prediction}\ }\; \pi^{M,N}_{t_n} = \frac1N \sum_{i=1}^{N} \delta_{\xi^i_{t_n}}.$$
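Putting the two mechanisms together, the empirical estimates $\pi^{M,N}_{t_n} f = \frac1N \sum_{i=1}^{N} f(\xi^i_{t_n})$ can be produced by a simple loop. The sketch below again relies on the hypothetical helpers introduced earlier and assumes that the $M$ sampled observation increments have been recorded beforehand.

```python
import numpy as np

def particle_filter(f, h, a, c, x0, dY_increments, M, rng):
    """Genetic-type particle approximation of the dynamical system (11).

    x0            -- N i.i.d. samples from the initial law (initial particle system)
    dY_increments -- the M sampled observation increments
    Returns the estimates pi^{M,N}_{t_n} f for n = 0, ..., M.
    """
    x = np.asarray(x0, dtype=float)
    estimates = [f(x).mean()]                          # n = 0
    for dY in dY_increments:                           # n = 1, ..., M
        x = particle_transition(x, h, a, c, dY, M, rng)
        estimates.append(f(x).mean())                  # pi^{M,N}_{t_n} f
    return estimates
```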

The convergence of the genetic type interacting particle system under study to the solution of the time discretization approximating model (11) has been studied in [6, 7, 9]; quenched large deviations principles are developed in [10] and the associated fluctuations are presented in [11].

3.2 Error bounds and order of convergence

The aim of this section is to study the convergence of the particle density profiles to the solution of the Kushner-Stratonovitch equation as the number of particles $N \to \infty$ and the time discretization parameter $M \to \infty$. A rough structure is as follows. In Section 3.2.1 we present two different ways to obtain quenched error bounds for the convergence of the genetic type approximating model to the solution of the discrete time and measure valued dynamical system (11) as $N$ grows. This section is based in general on previous work of one of the authors [6, 7]; here we present a simplified proof that captures the main ideas. We also emphasize that our strategy can be used to get explicit quenched error bounds for the more general branching and interacting particle systems presented in [3] or [1, 2]. In Section 3.2.2 we study the convergence of the particle approximating model to the solution of the Kushner-Stratonovitch equation and we propose several orders of convergence depending on the choice of the time discretization parameter $M = M(N)$ with respect to the number of particles $N$. The situation is different in continuous time settings and the quenched error bounds presented in Section 3.2.1 cannot be used directly. Nevertheless we will see that the strategy used to obtain these quenched error bounds can be adapted to easily produce the order of convergence.


3.2.1 Quenched error bounds

To motivate our work we begin with a quenched error bound which can easily be derived from the very definition of the interacting particle system approximating model.

Proposition 3.1 For any bounded measurable function $f$ on $\mathbb{R}^p$ such that $\|f\| \le 1$ and for any $M \ge 1$ we have $P$-a.s.
$$\sup_{n=0,\dots,M} E_Y\big(|\pi^M_{t_n} f - \pi^{M,N}_{t_n} f|\big) \le \frac{4M\, C(M,Y)}{\sqrt N} \tag{14}$$
with
$$\log C(M,Y) = \frac{\|h\|^2}{2} + 2\|h\| \sum_{n=1}^{M} |Y_{t_n}|. \tag{15}$$

Proof:

Let us fix in what follows the observation process $Y$. In addition, to clarify the presentation, we simplify the notations by suppressing, when there is no possible confusion, the observation parameter $Y$ and the index parameters $t$ and $M$, so that we simply write $g_n$, $\Phi_n$, $P$, $\pi_n$ and $\pi^N_n$ instead of $g^M_{t_n}(Y_{t_n}, \cdot)$, $\Phi^M_{t_n}(Y_{t_n}, \cdot)$, $P^{(M)}$, $\pi^M_{t_n}$ and $\pi^{M,N}_{t_n}$. To prove the quenched error bound (14), the key idea is to introduce the composite mappings
$$\Phi_{n/p} \stackrel{\mathrm{def}}{=} \Phi_n \circ \Phi_{n-1} \circ \dots \circ \Phi_{p+1}, \qquad \forall\, 0 \le p \le n \le M,$$
with the convention $\Phi_{n/n} = \mathrm{Id}$. A clear backward induction on the parameter $p\,(\le n)$ leads to the formula
$$\Phi_{n/p}(\eta)f = \frac{\eta\big(g_{n/p}\,(P_{n/p}f)\big)}{\eta\big(g_{n/p}\big)} \tag{16}$$
where, for any $f \in C_b(\mathbb{R}^p)$ and $0 \le p \le n \le M$,
$$g_{n/p-1} = g_p\, P\big(g_{n/p}\big), \qquad P_{n/p-1}f = \frac{P\big(g_{n/p}\,(P_{n/p}f)\big)}{P\big(g_{n/p}\big)}, \tag{17}$$
with the conventions $g_{n/n} = 1$ and $P_{n/n} = \mathrm{Id}$. For later use we immediately notice that for any $x \in \mathbb{R}^p$
$$a_n \le g_n(x) \le A_n$$
with
$$a_n \stackrel{\mathrm{def}}{=} \exp\Big(-\|h\|\,|Y_{t_n}| - \frac{\|h\|^2}{2M}\Big) \qquad\text{and}\qquad A_n \stackrel{\mathrm{def}}{=} \exp\big(\|h\|\,|Y_{t_n}|\big).$$

More generally, from the definition of the functions $g_{n/p}$ we note the following inequality for future use:
$$a_{n/p} \le g_{n/p}(x) \le A_{n/p}, \qquad 0 \le p \le n \le M,$$
with
$$a_{n/p} \stackrel{\mathrm{def}}{=} \exp\Big(-\|h\| \sum_{q=p+1}^{n} |Y_{t_q}| - \frac{(n-p)\,\|h\|^2}{2M}\Big), \qquad A_{n/p} \stackrel{\mathrm{def}}{=} \exp\Big(\|h\| \sum_{q=p+1}^{n} |Y_{t_q}|\Big).$$

Using the above notation we have the decomposition
$$\pi^N_n f - \pi_n f = \sum_{p=0}^{n} \Big(\Phi_{n/p}\big(\pi^N_p\big)f - \Phi_{n/p}\big(\Phi_p(\pi^N_{p-1})\big)f\Big)$$
with the convention $\Phi_0(\pi^N_{-1}) = \pi_0$. By using the formula (16) we see that each term $\big|\Phi_{n/p}(\pi^N_p)f - \Phi_{n/p}(\Phi_p(\pi^N_{p-1}))f\big|$ is bounded by
$$\frac{A_{n/p}}{a_{n/p}}\Big(\big|\pi^N_p f_1 - \Phi_p(\pi^N_{p-1})f_1\big| + \big|\pi^N_p f_2 - \Phi_p(\pi^N_{p-1})f_2\big|\Big)$$
where $f_1$ and $f_2$ are some bounded measurable functions on $\mathbb{R}^p$ such that $\|f_1\|, \|f_2\| \le 1$, and with the convention $A_{n/n} = a_{n/n} = 1$. By recalling that $\pi^N_p$ is the empirical measure associated with $N$ conditionally independent random variables with common law $\Phi_p(\pi^N_{p-1})$, we have the quenched error bound
$$E_Y\big(\big|\pi^N_p f - \Phi_p(\pi^N_{p-1})f\big|\big) \le \frac{1}{\sqrt N}$$
for any bounded measurable function $f$ with $\|f\| \le 1$.

Collecting the above inequalities one concludes that for any $n = 0, \dots, M$
$$E_Y\big(|\pi^N_n f - \pi_n f|\big) \le \frac{2}{\sqrt N}\sum_{p=0}^{n} \frac{A_{n/p}}{a_{n/p}} \le \frac{2}{\sqrt N}\Big(1 + n\,\frac{A_{n/0}}{a_{n/0}}\Big).$$
This yields
$$\sup_{n=0,\dots,M} E_Y\big(|\pi^N_n f - \pi_n f|\big) \le \frac{4M}{\sqrt N}\,\frac{A_{M/0}}{a_{M/0}} = \frac{4M}{\sqrt N}\,\exp\Big(\frac{\|h\|^2}{2} + 2\|h\| \sum_{n=1}^{M} |Y_{t_n}|\Big)$$
and the proof of Proposition 3.1 is completed.

The term $\frac{M}{\sqrt N}$ in the right hand side of (14) expresses the fact that our interacting particle algorithm has $M$ steps and at each step the approximating empirical measure consists of $N$ particles. We can obtain a sharper quenched error bound if we use the "martingale approach" presented independently in [6] and [2]. More precisely we have the following result.

Theorem 3.2 For any bounded measurable function $f$ on $\mathbb{R}^p$ such that $\|f\| \le 1$ and for any $M, N \ge 1$ we have $P$-a.s.
$$\sup_{n=0,\dots,M} E_Y\big(|\pi^M_{t_n} f - \pi^{M,N}_{t_n} f|\big) \le 2\sqrt2\,\sqrt{\frac{M}{N}}\; C(M,Y) \tag{18}$$
with
$$\log C(M,Y) = \frac{\|h\|^2}{2} + 2\|h\| \sum_{n=1}^{M} |Y_{t_n}|.$$

Proof:

Once again we fix the observation $Y$ and we use the simplified notations introduced in the proof of Proposition 3.1. Before getting down to serious business it is useful to note that the dynamical system (11), the composite mappings $\Phi_{n/p}$ and the recursive formulas (16) remain the same if we replace all the functions $g_n$ by the "normalized" ones
$$\bar g_{n/n-1} \stackrel{\mathrm{def}}{=} \frac{g_n}{\pi_{n-1}(g_n)};$$
but in this situation the functions $\bar g_{n/p}$ are defined by the recursion
$$\bar g_{n/p-1} = \bar g_{p/p-1}\, P\big(\bar g_{n/p}\big), \qquad p = 1, \dots, n,$$
with the convention $\bar g_{n/n} = 1$. Using the above notation we clearly have
$$\pi_p\big(\bar g_{n/p}\big) = 1 \qquad \forall\, 0 \le p \le n. \tag{19}$$
To see this it suffices to note that $\pi_{n-1}(\bar g_{n/n-1}) = 1$ and, for any $1 \le p \le n$, we have
$$\pi_p\big(\bar g_{n/p}\big) = \frac{\pi_{p-1}\big(\bar g_{p/p-1}\, P(\bar g_{n/p})\big)}{\pi_{p-1}\big(\bar g_{p/p-1}\big)} = \pi_{p-1}\big(\bar g_{n/p-1}\big).$$

On the other hand, when using these "normalized" functions we have the "normalized" bounds
$$B_{n/p}^{-1} \le \bar g_{n/p}(x) \le B_{n/p}, \qquad 0 \le p \le n \le M,$$
with
$$\log B_{n/p} \stackrel{\mathrm{def}}{=} 2\|h\| \sum_{q=p+1}^{n} |Y_{t_q}| + \frac{n-p}{2M}\,\|h\|^2. \tag{20}$$
To clarify the presentation, for any bounded test function $f$ we write $f_{n/p}$ for the functions given by
$$f_{n/p} = \bar g_{n/p}\, P_{n/p}(f), \qquad p = 0, \dots, n.$$
Arguing as before we have
$$\pi_p(f_{n/p}) = \pi_n(f) \qquad \forall\, 0 \le p \le n.$$
To see this it suffices to note that (19) and the recursive formula (16) imply that
$$\pi_n(f) = \frac{\pi_p\big(\bar g_{n/p}\,(P_{n/p}f)\big)}{\pi_p\big(\bar g_{n/p}\big)} = \pi_p\big(\bar g_{n/p}\,(P_{n/p}f)\big) = \pi_p(f_{n/p}).$$

Q with the standard convention ; = 1 and the decomposition n   X UppN (fn=p) ? Up?1pN?1(fn=p?1) + 0N (fn=0): Un nN (f ) = p=1

Using the fact that





(21)

pN?1 gp=p?1 (Pfn=p) N N   = pN?1(gp=p?1) p(pN?1)(fn=p) p?1(fn=p?1) = p?1(gp=p?1) pN?1 gp=p?1 we arrive at

Up?1 pN?1(fn=p?1) = Up p(pN?1)(fn=p);

so that (21) may be rewritten in the more tractable form Un nN (f ) ? nf = Un nN (f ) ? 0(fn=0) =

n X

p=0



Up pN (fn=p) ? p(pN?1)(fn=p) 17



with the convention 0 (?N1 ) =  . Using the fact that Up only depends on the random empirical measures (0N ; : : :; pN?1) up to time (p ? 1) we have that



$$E_Y\Big(\big(U_n\,\pi^N_n(f) - \pi_n f\big)^2\Big) = \sum_{p=0}^{n} E_Y\Big(U_p^2\,\big(\pi^N_p(f_{n/p}) - \Phi_p(\pi^N_{p-1})(f_{n/p})\big)^2\Big).$$
Recalling that
$$E_Y\Big(\big(\pi^N_p(f_{n/p}) - \Phi_p(\pi^N_{p-1})(f_{n/p})\big)^2 \,\Big|\, \pi^N_{p-1}\Big) = \frac1N\, \Phi_p(\pi^N_{p-1})\Big(\big(f_{n/p} - \Phi_p(\pi^N_{p-1})(f_{n/p})\big)^2\Big)$$
and using the bounds (20), one gets
$$E_Y\big(|U_n\,\pi^N_n(f) - \pi_n f|^2\big) \le \frac1N \sum_{p=0}^{n} E_Y\Big(U_p^2\, \Phi_p(\pi^N_{p-1})\big(f_{n/p}^2\big)\Big) \le \frac{\|f\|^2}{N} \sum_{p=0}^{n} E_Y\Big(U_p^2\, \Phi_p(\pi^N_{p-1})\big(\bar g_{n/p}^2\big)\Big) \tag{22}$$
and
$$E_Y\big(|U_n\,\pi^N_n(f) - \pi_n f|^2\big) \le \frac{\|f\|^2}{N}\sum_{p=0}^{n}\Big(\prod_{q=1}^{p} B_{q/q-1}^2\Big)\, B_{n/p}^2 \le \frac{n+1}{N}\, B_{n/0}^2\, \|f\|^2.$$
This implies that
$$E_Y\big(|\pi^N_n(f) - \pi_n f|\big) \le E_Y\big(|U_n\,\pi^N_n(f) - \pi_n f|\big) + \|f\|\, E_Y\big(|U_n - 1|\big) \le 2\,\sqrt{\frac{n+1}{N}}\; B_{n/0}\, \|f\| \le 2\sqrt2\,\sqrt{\frac{M}{N}}\; C(M,Y)\, \|f\|,$$
which completes the proof of the theorem.

3.2.2 Order of convergence

For any $n = 0, \dots, M-1$ and $t \in [t_n, t_{n+1})$ we denote by $\pi^{M,N}_t$ the empirical measure associated with the system $\xi_{t_n}$, namely
$$\pi^{M,N}_t = \frac1N \sum_{i=1}^{N} \delta_{\xi^i_{t_n}}.$$

In view of Theorem 3.2 we may ask whether we can deduce the convergence of $\{\pi^{M,N}_t : t \in [0,1)\}$ to the solution of the Kushner-Stratonovitch equation. This question has of course a positive answer, but the error bounds obtained using these quenched results are not really satisfactory. More precisely, Theorem 3.2 has the following immediate consequence.

Corollary 3.3 For any bounded Lipschitz test function $f$ such that
$$|f(x) - f(z)| \le k(f)\,|x - z|$$
we have that
$$\sup_{t \in [0,1)} E\big(|\pi_t f - \pi^{M,N}_t f|\big) \le \frac{C}{\sqrt M}\,\big(\|f\| + k(f)\big) + 2\sqrt2\, \|f\|\, \sqrt{\frac{M}{N}}\; E\big(C(M,Y)\big)$$
and
$$\log E\big(C(M,Y)\big) \le 5q\,\|h\|^2 + 2q\,\|h\|\,\sqrt{\frac{2M}{\pi}} \tag{23}$$
where the finite constant $C$ is the one given in Theorem 2.3.

Proof:
In the first place we note that
$$E\big(|\pi_t f - \pi^{M,N}_t f|\big) \le E\big(|\pi_t f - \pi^M_t f|\big) + E\big(|\pi^M_t f - \pi^{M,N}_t f|\big).$$
Thus, in view of Theorem 2.3 and Theorem 3.2 it remains to check (23). To clarify the presentation we will only treat the case $p = q = 1$, since the extension to the vector case is straightforward. By definition of the observation process $Y$ we first note that
$$\log C(M,Y) \le \frac{5\|h\|^2}{2} + 2\|h\| \sum_{n=1}^{M} |V_{t_n}|.$$
On the other hand, it is well known that for any Gaussian variable $U$ with $E(U) = 0$ and $E(U^2) = 1$ we have
$$E\big(\exp(\alpha\,|U|)\big) \le 1 + \alpha\,\sqrt{\frac{2}{\pi}}\; \exp\Big(\frac{\alpha^2}{2}\Big) \qquad \forall\, \alpha \ge 0.$$
Applying this inequality and using the fact that
$$\Big(1 + \frac{\alpha}{\sqrt M}\Big)^{M} \le \exp\big(\alpha\,\sqrt M\big) \qquad \forall\, \alpha \ge 0,$$

and after some elementary manipulations one gets
$$\log E\big(C(M,Y)\big) \le 5\|h\|^2 + 2\|h\|\,\sqrt{\frac{2M}{\pi}}.$$
This yields
$$\sup_{t \in [0,1)} E\big(|\pi^M_t f - \pi^{M,N}_t f|\big) \le C_1\,\sqrt{\frac{M}{N}}\; \exp\big(C_2\,\sqrt M\big)$$
with
$$C_1 = 4\,\exp\big(5\|h\|^2\big) \qquad\text{and}\qquad C_2 = 2\|h\|\,\sqrt{\frac{2}{\pi}}.$$
Using Jensen's inequality we also have that
$$\log E\big(C(M,Y)\big) \ge \frac{5\|h\|^2}{2} + 2\|h\|\,\sqrt{\frac{2M}{\pi}}.$$
The extension to the vector case is a clear consequence of the following chain of inequalities. If we write
$$V_{t_n} = \big(V^1_{t_n}, \dots, V^q_{t_n}\big) \in \mathbb{R}^q,$$
then we have
$$|V_{t_n}| = \Big(\sum_{r=1}^{q} |V^r_{t_n}|^2\Big)^{1/2} \le \sum_{r=1}^{q} |V^r_{t_n}|$$
and
$$\exp\big(\alpha\, E(|V_{t_n}|)\big) \le E\big(\exp(\alpha\,|V_{t_n}|)\big) \le E\big(\exp(\alpha\,|V^1_{t_n}|)\big)^q \qquad \forall\, \alpha \ge 0.$$

Remark 3.4: In view of (23) one cannot expect to obtain a better error bound using the above lemma. Roughly speaking, the exponential term in (23) represents the mean growth rate of the increasing sequence of variables $\{C(M,Y) : M \ge 1\}$.

We now present the main result of this section.

Theorem 3.5 For any bounded Lipschitz test function $f$ such that $|f(x) - f(z)| \le k(f)\,|x - z|$ we have that
$$\sup_{t \in [0,1)} E\big(|\pi_t f - \pi^{M,N}_t f|\big) \le \frac{C_1}{\sqrt M}\,\big(\|f\| + k(f)\big) + C_2\,\sqrt{\frac{M}{N}}\,\|f\| \tag{24}$$
where $C_1$ is the finite constant defined in Theorem 2.3 and $C_2 = 2\sqrt2\,\exp(12\|h\|^2)$. In addition, if $p = q = 1$ and $a, b, f, h$ are four times continuously differentiable with bounded derivatives, then we have
$$\sup_{t \in [0,1)} E\big(|\pi_t f - \pi^{M,N}_t f|\big) \le \mathrm{Cte}\,\Big(\frac1M + \sqrt{\frac{M}{N}}\Big). \tag{25}$$

Proof:
In view of (13) it suffices to check that
$$\sup_{t \in [0,1)} E\big(|\pi^M_t f - \pi^{M,N}_t f|\big) \le 2\sqrt2\,\sqrt{\frac{M}{N}}\,\|f\|\,\exp\big(12\|h\|^2\big). \tag{26}$$
The proof of (26) relies strongly on the proof of Theorem 3.2. To clarify the presentation, all notations introduced in the proof of Theorem 3.2 are in force. We will also simplify the notation by suppressing the time parameter $t$ and writing $X_n$, $\pi_n$ and $Y_n$ instead of $X_{t_n}$, $\pi_{t_n}$ and $Y_{t_n}$.

Our immediate goal is to replace the quenched error bounds in (22) by some explicit mean error bounds. If we denote by $g_{n/p}$ the "unnormalized" functions given by
$$g_{n/p} = g_{p+1}\, P\big(g_{n/p+1}\big) \tag{27}$$
with the convention $g_{n/n} = 1$, we have the relation
$$\bar g_{n/p}(x) = g_{n/p}(x) \Big/ \prod_{q=p}^{n-1} \pi_q\big(g_{q+1}\big).$$

Using the fact that $\|P_{n/p} f\| \le \|f\|$ and $g_p \le 1$, one can easily check that
$$U_p^2\; \Phi_p\big(\pi^N_{p-1}\big)\big(f_{n/p}^2\big) \;\le\; U_n^2\, \|f\|^2\; \Phi_p\big(\pi^N_{p-1}\big)\big(g_{n/p}^2\big).$$

Up to now we have considered the observation records as a fixed series of parameters. Our aim is now to produce some mean error bounds. To this end it is necessary to evaluate the mean of the terms on the right hand side of (22). The key idea consists in using, in a first stage, the reference probability measure $P_0$. This approach was presented by two of the authors in [1, 2]. Recalling that under $P_0$ the observation process $Y$ is a standard Brownian motion, the random sequence $\{Y_n : n \ge 0\}$ can be considered as a sequence of i.i.d. $\mathbb{R}^q$-valued Gaussian variables with zero mean and $E\big(Y_1\,(Y_1)^*\big) = M^{-1}\, I_{q \times q}$.

In what follows we denote by $E_0(\cdot)$ the expectation associated with $P_0$. Using the above notation and the Cauchy-Schwarz inequality we arrive at
$$E_0\Big(U_p^2\,\Phi_p\big(\pi^N_{p-1}\big)\big(f_{n/p}^2\big)\Big)^2 \le \|f\|^4\; E_0\big(U_n^4\big)\; E_0\Big(g_{n/p}^4\big(\xi^1_p\big)\Big). \tag{28}$$
To complete the proof of the theorem it remains to estimate the terms
$$E_0\big(U_n^4\big) \qquad\text{and}\qquad E_0\Big(g_{n/p}^4\big(\xi^1_p\big)\Big).$$
In view of (27) we clearly have, for any $x \in \mathbb{R}^p$,
$$E_0\Big(g_{n/p}^4\big(\xi^1_p\big) \,\Big|\, \xi^1_p = x\Big) \le E_0\Big(g_{p+1}^4(X_p) \cdots g_n^4(X_{n-1}) \,\Big|\, X_p = x\Big).$$
On the other hand, for any $p+1 \le q \le n$ we have that
$$E_0\big(g_q^4(X_{q-1}) \,\big|\, X_{q-1}\big) \le E_0\big(\exp\big(4\,h^*(X_{q-1})\,Y_q\big) \,\big|\, X_{q-1}\big) = \exp\Big(\frac{8\,|h(X_{q-1})|^2}{M}\Big) \le \exp\Big(\frac{8\,\|h\|^2}{M}\Big).$$
It follows that
$$E_0\Big(g_{n/p}^4\big(\xi^1_p\big)\Big) \le \exp\Big(\frac{8(n-p)}{M}\,\|h\|^2\Big). \tag{29}$$

Let us now estimate the term $E_0(U_n^4)$. Using Jensen's formula we can check that $U_n^4$ is upper bounded by
$$I_n = \prod_{p=0}^{n-1} \pi^N_p\big(\exp(4\,\tilde l_{p+1})\big)$$
where
$$\tilde l_{p+1}(x) = l_{p+1}(x) - \pi_p(l_{p+1}) \qquad\text{and}\qquad l_{p+1}(x) = \log g_{p+1}(x).$$
Noticing that
$$\tilde l_n(x) = \big(h^*(x) - \pi_{n-1}(h^*)\big)\,Y_{t_n} - \frac{1}{2M}\Big(|h(x)|^2 - \pi_{n-1}\big(|h|^2\big)\Big)$$
and, for any $\alpha \ge 0$ and $x \in \mathbb{R}^p$,
$$\log E_0\Big(e^{\alpha\,(h^*(x) - \pi_{n-1}(h^*))\,Y_{t_n}} \,\Big|\, \pi_{n-1}\Big) = \frac{\alpha^2}{2M}\,\big|h^*(x) - \pi_{n-1}(h^*)\big|^2 \le \frac{2\,\alpha^2\,\|h\|^2}{M},$$
we arrive at
$$\log E_0\Big(\pi^N_{n-1}\big(e^{4\tilde l_n}\big) \,\Big|\, Y_{t_1}, \dots, Y_{t_{n-1}},\, \pi^N_{n-1}\Big) \le \frac{16}{2M}\,4\,\|h\|^2 + \frac{1}{M}\,\|h\|^2 = \frac{33}{M}\,\|h\|^2.$$

Thus, one easily gets the inequality
$$E_0(I_n) \le \exp\Big(\frac{33}{M}\,\|h\|^2\Big)\; E_0(I_{n-1}),$$
so that
$$E_0\big(U_n^4\big) \le \exp\big(33\,\|h\|^2\big) \qquad \forall\, 0 \le n \le M. \tag{30}$$

If we combine (28), (29) and (30) we conclude that
$$E_0\Big(U_p^2\,\Phi_p\big(\pi^N_{p-1}\big)\big(f_{n/p}^2\big)\Big) \le \|f\|^2\, \exp\big(17\,\|h\|^2 + 4\,\|h\|^2\big) = \|f\|^2\, \exp\big(21\,\|h\|^2\big).$$
Hence, in view of (22) we have
$$E_0\Big(\big(U_n\,\pi^N_n(f) - \pi_n f\big)^2\Big) \le 2\,\frac{M}{N}\,\|f\|^2\, \exp\big(21\,\|h\|^2\big).$$
Taking into consideration the inequality
$$E_0\Big(\big(\pi^N_n(f) - \pi_n f\big)^2\Big) = E_0\Big(\big[\big(U_n\,\pi^N_n(f) - \pi_n f\big) + \pi^N_n(f)\,(1 - U_n)\big]^2\Big) \le 2\, E_0\Big(\big(U_n\,\pi^N_n(f) - \pi_n f\big)^2\Big) + 2\,\|f\|^2\, E_0\big((U_n - 1)^2\big),$$
we conclude that
$$E_0\Big(\big|\pi^N_n(f) - \pi_n f\big|^2\Big) \le 8\,\frac{M}{N}\,\|f\|^2\, \exp\big(21\,\|h\|^2\big).$$
Finally, using the inequalities
$$E\big(|\pi^N_n(f) - \pi_n f|\big) \le E_0\Big(\big|\pi^N_n(f) - \pi_n f\big|^2\Big)^{1/2}\; E_0\bigg(\Big(\frac{dP}{dP_0}\Big)^2\bigg)^{1/2}$$
and
$$E_0\bigg(\Big(\frac{dP}{dP_0}\Big)^2\bigg) = E(Z_1) \le \exp\big(\|h\|^2\big),$$
after some elementary manipulations we arrive at the desired mean error bound
$$E\big(|\pi^N_n(f) - \pi_n f|\big) \le 2\sqrt2\,\sqrt{\frac{M}{N}}\,\|f\|\, \exp\big(12\,\|h\|^2\big),$$
from which the end of the proof of (24) is now straightforward.

When the signal $X$ and the observation $Y$ are real processes and the functions

$a, b, f, h$ are four times continuously differentiable with bounded derivatives, a more precise error bound has been obtained by Picard in [22], namely
$$\sup_{t \in [0,1)} E\big(|\pi_t f - \pi^M_t f|\big) \le \frac{\mathrm{Cte}}{M}.$$
Under these assumptions, (25) is a clear consequence of (26).

Remark 3.6: Theorem 1.1 is a clear consequence of Theorem 3.5 and the fact that
$$\sqrt{\frac{M}{N}} \le \frac{1}{\sqrt M} \iff M \le N^{1/2} \qquad\text{and}\qquad \sqrt{\frac{M}{N}} \le \frac{1}{M} \iff M \le N^{1/3}.$$
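Spelling out the resulting rates (a direct arithmetic consequence of the equivalences above and of the bounds (24)-(25), not an additional claim of the paper):
$$M = N^{1/2} \;\Longrightarrow\; \frac{1}{\sqrt M} = \sqrt{\frac{M}{N}} = N^{-1/4}, \qquad\qquad M = N^{1/3} \;\Longrightarrow\; \frac{1}{M} = \sqrt{\frac{M}{N}} = N^{-1/3},$$
so that (24) yields an overall rate of order $N^{-1/4}$ and, under the smoothness assumptions of (25), an overall rate of order $N^{-1/3}$.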

The basic space for the study of the weak convergence of a sequence of random measures is the set of all probability measures on $\mathcal{M}_1(\mathbb{R}^p)$, denoted by $\mathcal{M}_1\big(\mathcal{M}_1(\mathbb{R}^p)\big)$. On this set we can define the Kantorovitch-Rubinstein or Vasershtein metric
$$D(m_1, m_2) = \inf E\big(\|\mu_1 - \mu_2\|_{\mathcal F}\big), \qquad m_1, m_2 \in \mathcal{M}_1\big(\mathcal{M}_1(\mathbb{R}^p)\big),$$
where the infimum is taken over all pairs of random variables $(\mu_1, \mu_2)$ such that $\mu_1$ has distribution $m_1$ and $\mu_2$ has distribution $m_2$. This metric gives to $\mathcal{M}_1\big(\mathcal{M}_1(\mathbb{R}^p)\big)$ the topology of weak convergence. If we denote by $m^{M,N}_t$ the distribution of $\pi^{M,N}_t$ and by $m_t$ the distribution of $\pi_t$, the conclusions of Theorem 3.5 yield
$$M \simeq \sqrt N \;\Longrightarrow\; D\big(m^{M,N}_t, m_t\big) \le \frac{\mathrm{Cte}}{N^{1/4}}.$$

Concluding Remarks

We have proposed an approximating scheme for the numerical solution of the Kushner-Stratonovitch equation. We recall that our approach is based on the use of the classical time discretization scheme as presented in [17, 18, 22] and on the interacting particle scheme presented in [6, 9, 7]. This yields a natural line of proof for the convergence of our particle approximating models. The main contribution here was to connect these two separate approximating schemes and to propose explicit error bounds associated with this combination. The interacting particle system approach studied in this work is one of the crudest particle methods. For the sake of unity, and to highlight issues specific to interacting and branching particle system theory, we mention that the same line of proof can be used to get the convergence of more general particle schemes such as those presented in [1, 2] and [3].

We also feel that our approach is more transparent and that the proof of convergence is simpler than in other studies on interacting particle approximations. We have only treated a very special case of a non linear filtering problem. It is obvious that the situation becomes considerably more involved if one dispenses with the assumption that the observation process is the solution of an Ito-type differential equation of the form (2). To deal with a more general signal/observation pair, an alternative interacting particle system approach which only uses the semigroup of the pair process $\{(X_t, Y_t) : t \in [0,1]\}$ has recently been proposed in [12]. The idea is to use particles which explore the whole state/observation space and to "select" particles when their sampled observation component is close to the current observation data.

References

[1] CRISAN D., LYONS T.J. (1997). Nonlinear Filtering and Measure Valued Processes. Prob. Theory and Related Fields, 109, 217-244.
[2] CRISAN D., GAINES J., LYONS T.J. A Particle Approximation of the Solution of the Kushner-Stratonovitch Equation. SIAM Journal of Applied Math. (to appear).
[3] CRISAN D., DEL MORAL P., LYONS T.J. (1998). Discrete Filtering Using Branching and Interacting Particle Systems. Publications du Laboratoire de Statistique et Probabilités, Université Paul Sabatier, No 01-98.
[4] DI MASI G.B., PRATELLI M., RUNGGALDIER W.G. (1985). An approximation for the nonlinear filtering problem with error bounds. Stochastics, 14, no. 4, 247-271.
[5] DEL MORAL P. (1996). Non linear filtering using random particles. Theor. Prob. Appl., Vol. 40, (4).
[6] DEL MORAL P. (1996). Non-linear Filtering: Interacting particle resolution. Markov Processes and Related Fields, Vol. 2.
[7] DEL MORAL P. (1998). Measure Valued Processes and Interacting Particle Systems. Application to Non Linear Filtering Problems. Ann. Appl. Probab., 8, no. 2, 438-495.
[8] DEL MORAL P. A Uniform Theorem for the Numerical Solving of Non Linear Filtering Problems. Journal of Applied Probability (to appear).
[9] DEL MORAL P. (1997). Filtrage non linéaire par systèmes de particules en interaction. C.R. Acad. Sci. Paris, t. 325, Série I, p. 653-658.
[10] DEL MORAL P., GUIONNET A. Large Deviations for Interacting Particle Systems. Applications to Non Linear Filtering Problems. Stochastic Processes Appl. (to appear).
[11] DEL MORAL P., GUIONNET A. (1997). A Central Limit Theorem for Non Linear Filtering using Interacting Particle Systems. Ann. Appl. Probab. (to appear).
[12] DEL MORAL P., JACOD J., PROTTER Ph. (1998). The Monte Carlo Method for Filtering with Discrete-Time Observations. Prépublication no 453 du Laboratoire de Probabilités de l'Université Paris VI.
[13] FORTET R., MOURIER B. (1953). Convergence de la répartition empirique vers la répartition théorique. Ann. Sci. École Norm. Sup., 70, pp. 267-285.
[14] GORDON N.J., SALMOND D.J., SMITH A.F.M. (1993). Novel Approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings-F, 140, pp. 107-113.
[15] HOLLAND J.H. (1975). Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor.
[16] KOREZLIOGLU H. (1987). Computation of Filters by Sampling and Quantization. Center for Stochastic Processes, University of North Carolina, Technical Report No 208.
[17] KOREZLIOGLU H., MAZIOTTO G. (1983). Modelization and Filtering of Discrete Systems and Discrete Approximation of Continuous Systems. VI Conference INRIA, Modélisation et Optimisation des Systèmes, 19-22/06/84, Nice.
[18] KOREZLIOGLU H., RUNGGALDIER W.J. (1993). Filtering for Non Linear Systems Driven by Nonwhite Noises: An Approximating Scheme. Stochastics Stochastics Rep., 44, no. 1-2, 65-102.
[19] LEGLAND F. (1989). High order time discretization of nonlinear filtering equations. 28th IEEE CDC, Tampa, pp. 2601-2606.
[20] VAN DOOTINGH M., VIEL F., RAKOTOPARA D., GAUTHIER J.P. (1991). Coupling of non-linear control with a stochastic filter for state estimation: Application on a free radical polymerization reactor. I.F.A.C. International Symposium ADCHEM'91, Toulouse, France, October 14-15.
[21] PARDOUX E., TALAY D. (1985). Approximation and simulation of solutions of stochastic differential equations. Acta Applicandae Mathematicae, 3, no. 1, 23-47.
[22] PICARD J. (1984). Approximation of nonlinear filtering problems and order of convergence. Filtering and Control of Random Processes, Lecture Notes in Control and Inform. Sci., 61, Springer.
[23] PICARD J. (1986). Nonlinear filtering of one-dimensional diffusions in the case of a high signal-to-noise ratio. SIAM J. Appl. Math., 46, pp. 1098-1125.
[24] PICARD J. (1986). An estimate of the error in time discretization of nonlinear filtering problems. Proceedings of the 7th MTNS, Theory and Applications of Nonlinear Control Systems, Stockholm 1985, C.I. Byrnes and A. Lindquist Eds., North-Holland, Amsterdam, pp. 401-412.
[25] TALAY D. (1984). Efficient numerical schemes for the approximation of expectations of functionals of the solution of an S.D.E. and applications. Filtering and Control of Random Processes (Paris, 1983), 294-313, Lecture Notes in Control and Inform. Sci., 61, Springer, Berlin-New York.