Convergence of multi-dimensional quantized $ SDE $'s

6 downloads 18 Views 387KB Size Report
Jul 6, 2010 - quantizer solution to that problem as shown in [3, 13] in this infinite dimen- ..... so that. ̂IT (ϕ) = n. ∑ k=1 ξi E(Wsk+1 − Wsk | ̂W). = n. ∑.
arXiv:0801.0726v2 [math.PR] 6 Jul 2010

Convergence of multi-dimensional quantized SDE’s ` Gilles PAGES



Afef SELLAMI



Abstract We quantize a multidimensional SDE (in the Stratonovich sense) by solving the related system of ODE’s in which the d-dimensional Brownian motion has been replaced by the components of functional stationary quantizers. We make a connection with rough path theory to show that the solutions of the quantized solutions of the ODE converge toward the solution of the SDE. On our way to this result we provide convergence rates of optimal quantizations toward the Brownian motion for 1q -H¨ older distance, q > 2, in Lp (P).

Key words: Functional quantization, Stochastic Differential Equations, Stratonovich stochastic integral, stationary quantizers, rough path theory, Itˆo map, H¨older semi-norm, p-variation.

1

Introduction

Quantization is a way to discretize the path space of a random phenomenon: a random vector in finite dimension, a stochastic process in infinite dimension. Optimal Vector Quantization theory (finite-dimensional) random vectors finds its origin in the early 1950’s in order to discretize some emitted signal (see [10]). It was further developed by specialists in Signal Processing and later in Information Theory. The infinite dimensional case started to ∗

LPMA, Universit´e Paris 6, case 188, 4 pl. Jussieu, 75252 Paris Cedex 5, France [email protected] † JP Morgan, London & LPMA, Universit´e Paris 6, case 188, 4 pl. Jussieu, 75252 Paris Cedex 5, France [email protected]

1

be extensively investigated in the early 2000’s by several authors (see e.g. [18], [5], [19], [20], [19], [4], [12], etc). In [20], the functional quantization of a class of Brownian diffusions has been investigated from a constructive point of view. The main feature of this class of diffusions was that the diffusion coefficient was the inverse of the gradient of a diffeomorphism (both coefficients being smooth). This class contains most (non degenerate) scalar diffusions. Starting from a sequence of rate optimal quantizers, some sequences of quantizers of the Brownian diffusion are produced as solutions of (non coupled) ODE’s. This approach relied on the Lamperti transform and was closely related to the Doss-Sussman representation formula of the flow of a diffusion as a functional of the Brownian motion. In many situations these quantizers are rate optimal (or almost rate 1 optimal) i.e. that they quantize the diffusion at the same rate O((log N )− 2 ) as the Brownian motion itself where N denotes the generic size of the quantizer. In a companion paper (see [27]), some cubature formulas based on some of these quantizers were implemented, namely those obtained from some optimal product quantizers based on the Karhunen-Lo`eve expansion of the Brownian motion, to price some Asian options in a Heston stochastic volatility model. Rather unexpectedly in view of the theoretical rate of convergence, the numerical experiments provided quite good numerical results for some “small” sizes of quantizers. Note however that these numerical implementations included some further speeding up procedures combining the stationarity of the quantizers and the Romberg extrapolation leading to 3 a O((log N )− 2 ) rate. Although this result relies on some still pending conjectures about the asymptotics of bilinear functionals of the quantizers, it strongly pleads in favour of the construction of such stationary (rate optimal) quantizers, at least when one has in mind possible numerical applications. Recently a sharp quantization rate (i.e. including an explicit constant) has been established for a class of not too degenerate 1-dimensional Brownian diffusions. However the approach is not constructive (see [4]). On 1 the other hand, the standard rate O((log N )− 2 ) has been extended in [22] to general d-dimensional Itˆo processes, so including d-dimensional Brownian diffusions regardless of their ellipticity properties. This latter approach, based an expansion in the Haar basis, is constructive, but the resulting quantizers are no longer stationary. Our aim in this paper is to extend the constructive natural approach ini2

tiated in [20] to general d-dimensional diffusions in order to produce some rate optimal stationary quantizers of these processes. To this end, we will call upon some seminal results from rough path theory, namely the continuity of the Itˆo map, to replace the “Doss-Sussman setting”. In fact we will show that if one replaces in an SDE (written in the Stratonovich sense) the Brownian motion by some elementary quantizers, the solutions of the resulting ODE’s make up some rough paths which converge (in p-variation and in the H¨older metric) to the solution of the SDE. We use her the rough path theory as a tool and we do not aim at providing new insights on this theory. We can only mention that these rate optimal stationary quantizers can be seen as a new example of rough paths, somewhat less “stuck” to a true path of the underlying process. This work is devoted to Brownian diffusions which is naturally the prominent example in view of applications, but it seems clear that this could be extended to SDE driven e.g. by fractional Brownian motions (however our approach requires to have an explicit form for the Karhunen-Lo`eve basis as far as numerical implementation is concerned). Now let us be more precise. We consider a diffusion process dXt = b(t, Xt ) dt + σ(t, Xt ) ◦ dWt , X0 = x ∈ Rd , t ∈ [0, T ], in the Stratonovich sense where b : [0, T ] × Rd → Rd and σ : [0, T ] × Rd → M(d×d) are continuously differentiable with linear growth (uniformly with respect to t) and W = (Wt )t∈[0,T ] is a d-dimensional Brownian motion defined on a filtered probability space (Ω, A, P). (The fact that the state space and W have the same dimension is in no case a restriction since our result has nothing to do with ellipticity). Such an SDE admits a unique strong solution denoted X x = (Xtx )t∈[0,T ] (the dependency in x will be dropped from now to alleviate notations). The Rd -valued process X is pathwise continuous and supt∈[0,T ] |Xt | ∈ Lr (P), r > 0 (where | . | denotes the canonical Euclidean norm on Rd ). In particular X is bi-measurable and can be seen as an Lr (P)-Radon random variable taking values in the Banach spaces (LpT, Rd , | . |LpT ) where LpT, Rd = LpRd ([0, T ], dt) and R 1 p T denotes the usual Lp -norm when p ∈ [1, ∞). |g|Lp = 0 |g(t)|p dt T

For every integer N ≥ 1, we can investigate for X the level N (Lr (P), LpT )quantization problem for this process X, namely solving the minimization 3

of the Lr (P)-mean LpT,Rd -quantization error o n eN,r (X, Lp ) := min eN,r (α, X, Lp ), α ⊂ LpT,Rd , card α ≤ N

where eN,r (α, X, Lp ) denotes the Lr -mean quantization error namely 1 

r eN,r (α, X, Lp ) := E min |X − a|rp =

min |X − a|Lp a∈α

a∈α

(1.1)

induced by α,



T,Rd

.

Lr (P)

The use of “min” in (1.1) is justified by the existence of an optimal quantizer solution to that problem as shown in [3, 13] in this infinite dimensional setting. The Voronoi diagram associated to a quantizer α is a Borel partition (Ca (α))a∈α such that o n Ca (α) ⊂ x ∈ LpT,Rd | |x − a|Lp ≤ min |x − b|Lp T,Rd

b∈α

T,Rd

and a functional quantization of X by α is defined by the nearest neighbour projection of X onto α related to the Voronoi diagram X b α := X a1{X∈Ca (α)} . a∈α

In finite dimension (when considering Rd -valued random vectors instead of LpT,Rd -valued processes) the answer is provided by the so-called Zador

Theorem which says (see [10]) that if E|X|r+δ < +∞ for some δ > 0 and if g denotes the absolutely continuous part of its distribution then 1

1

N d eN,r (X, Rd ) → Jer,d kgk r d

d+r

as

N →∞

(1.2)

where Jer,d is finite positive real constant obtained as the limit of the nord

malized quantization error when X = U ([0, 1]). This constant is unknown except when d = 1 or d = 2. A non-asymptotic version of Zador’s Theorem can be found e.g. in [22]: for every r, δ > 0 there exists a universal constant Cr,δ > 0 and an integer Nr,δ ≥ such that, for every random vector Ω, A, P) → Rd , ∀ N ≥ Nr,δ ,

1

eN,r (X, Rd ) ≤ Cr,δ kXkr+δ N − d . 4

The asymptotic behaviour of the Ls (P )-quantization error of sequences of Lr -optimal quantizers of a random vector X when s > r has been extensively investigated in [13] and will be one crucial tool to establish our mains results. In infinite dimension, the case of Gaussian processes was the first to have been extensively investigated, first in the purely quadratic case (r = p = 2): sharp rates have been established for a wide family of Gaussian processes including the Brownian motion, the fractional Brownian motions (see [18, 19]). For these two processes sharp rates are also known for p ∈ [1, ∞] and r ∈ (0, ∞) (see [4]). More recently, a connection between mean regularity of t 7→ Xt (from [0, T ] into Lr (P)) and the quantization rate has been established (see [22]): if the above mapping is µ-H¨older for an index µ ∈ (0, 1], then eN,r (X, Lp ) = O((log N )−µ ),

p ∈ (0, r).

Based on this result, some universal quantization rates have been obtained for general L´evy processes with or without Brownian component some of them turning out to be optimal, once compared with the lower bound estimates derived from small deviation theory (see e.g. [11] or [5]). One important feature of interest of the purely quadratic case is that it is possible to construct from the Karhunen-Lo`eve expansion of the process two families of rate optimal (stationary) quantizers, relying on – sequences (α(N,prod) )N ≥1 of optimal product quantizers which are rate optimal i.e. such that eN,r (α(N ) , X, L2 ) = O(eN,2 (X, L2 )) (although not with a sharp optimal rate). – sequences of true optimal quantizers (or at least some good numerical approximations) (α(N,∗) )N ≥1 i.e. such that eN,r (α(N,∗) , X, L2 ) = eN,2 (X, L2 ). We refer to Section 2.1 below for further insight on these objects (both being available on the website www.quantize.math-fi.com). The main objective of this paper is the following: let (αN )N ≥1 denote a sequence of rate optimal stationary (see (2.8) further on) quadratic quantizers of a d′ -dimensional standard Brownian motion W = (W 1 , . . . , W d ). Define the sequence xN = (xN n )n=1,...,N , N ≥ 1, of solutions of the ODE’s Z t Z t N N N σ(xN n = 1, . . . , N. b(xn (s))ds + xn (t) = x + n (s))dαn (s), 0

0

5

Then, the finitely valued-process defined by eN = X

N X

n=1

xN n 1{W∈Cn (α(N) )}

converges toward the diffusion X on [0, T ] (at least in probability) as N → ∞. This convergence will hold with respect to distance introduced in the rough path theory (see [25, 14, 6, 9, 26]) which always implies convergence with respect to the sup norm. The reason is that our result will appear as an application of (variants of the) the celebrated Universal Limit Theorem originally established by T. Lyons in [25]. The distances of interest in rough path theory are related to the 1q -H¨older semi-norm or the q-variation seminorm both when q > 2 defined for every x ∈ C([0, T ], Rd ) by 1

kxkq,Hol = T p

|x(t) − x(s)|

sup 0≤s 0 and Ik = [tk , tk+1 ], k = 0, . . . , 2n − 1.

• mesh of the interval [0, T ], T • ⌊x⌋ denotes the lower integral part of x ∈ R. • Let (an )n≥0 and (bn )n≥0 be two sequences of real numbers: an ∼ bn if an = bn + o(bn ) and an ≍ bn if an = O(bn ) and bn = O(an ).

2 2.1

Background and preliminary results on functional quantization Some background on functional quantization

Functional quantization of stochastic processes can be seen as a discretization of the path-space of a process and the approximation (or coding) of a process by finitely many deterministic functions from its path-space. In a Hilbert space setting this reads as follows. Let (H, h·, ·i) be a separable Hilbert space with norm | · | and let X : (Ω, A, P) → H be a random vector taking its values in H with distribution PX . Assume the integrability condition E |X|2 < +∞.

(2.3)

For N ≥ 1, the L2 -optimal N -quantization problem for X consists in minimizing

 1/2

min |X − a| 2 = E min |X − a|2 a∈α

a∈α

L (P)

over all subsets α ⊂ H with card(α) ≤ N . Such a set α is called N codebook or N -quantizer. The minimal quantization error of X at level N is then defined by n o eN (X, H) := inf (E min |X − a|2 )1/2 : α ⊂ H, card(α) ≤ N . (2.4) a∈α

8

For a given N -quantizer α one defines an associated nearest neighbour projection X πα := a1Ca (α) a∈α

and the induced α-(Voronoi)quantization of X by setting ˆ α := πα (X), X

(2.5)

where {Ca (α) : a ∈ α} is a Voronoi partition induced by α, that is a Borel partition of H satisfying Ca (α) ⊂ {x ∈ H : |x − a| = min |x − b|} b∈α

(2.6) ′

for every a ∈ α. Then one easily checks that, for any random vector X : Ω → α ⊂ H, ′ ˆ α |2 = E min |X − a|2 E |X − X |2 ≥ E |X − X

a∈α

so that finally  

Borel

en (X, H) = inf |X − q(X)| 2 , q : H → H, card(q(H)) ≤ N L (P)  

r.v.

= inf |X − Y | 2 , Y : (Ω, A) → H, card(Y (Ω)) ≤ N .(2.7) L (P)

A typical setting for functional quantization is H = L2T := L2R ([0, 1],dt) p RT (equipped with hf, gi2 := 0 f g(t)dt and |f |L2 := hf, f i2 ). Thus any (biT measurable, real-valued) process X = (Xt )t∈[0,T ] defined on a probability space (Ω, A, P) such that Z

T 0

E(Xt2 )dt < +∞

is a random variable X : (Ω, A, P) → L2T . But this Hilbert setting is not the only possible one for functional quantization (see e.g. [21], [12], [5], etc) since natural Banach spaces like LpR ([0, T ], dt) or C([0, T ], R) are natural path-spaces. In the purely Hilbert setting the existence of (at least) one optimal N quantizer for every integer N ≥ 1 is established so that the infimum in (2.4) 9

holds as a minimum. A typical feature of this quadratic Hilbert framework is the so-called stationarity (or self-consistency) property satisfied by such an optimal N -quantizer α(N,∗) : ˆ α(N,∗) ). ˆ α(N,∗) = E(X | X X

(2.8)

This property, known as stationarity, will be used extensively throughout the paper. This existence property holds true in any reflexive Banach space and L1 path spaces (see [12] for details).

2.2 2.2.1

Constructive aspects of functional quantization of the Brownian motion Karhunen-Lo` eve basis (d = 1)

First we consider a scalar Brownian motion (Wt )t∈[0,T ] on a probability space (Ω, A, P). The two main classes of rate optimal quantizers of the Brownian motion are the product optimal quantizers and the true optimal quantizers. Both are based on the Karhunen-Lo`eve expansion of the Brownian motion given by Xp (2.9) λk ξk eW Wt = k (t) k≥1

where, for every k ≥ 1, λk = and



T π(k − 1/2)

2

and

) (W | eW √ k 2 = ξk = λk

r

2 T

eW k (t) Z

0

T

=

r

2 sin T



t √ λk



p dt Wt sin(t/ λk ) √ . λk

(2.10)

The sequence is an orthonormal basis of L2T . The system (λk , eW k )k≥1 can be characterized as the eigensystem of the symmetric positive trace class RT covariance operator of f 7→ (t 7→ 0 (s ∧ t) f (s)ds) ≡ (t 7→ E(< f | W >2 Wt ). In particular this implies that the Gaussian sequence (ξk )k≥1 is pairwise uncorrelated hence i.i.d., N (0; 1)-distributed. The Karhunen-Lo`eve expansion of W plays the role of P CA of the process: it is the fastest way to exhaust the variance of W among all expansions on an orthonormal basis. (eW k )k≥1

10

The convergence of the series in the right hand side of (2.9) holds in L2T for every ω ∈ Ω and P(dω)-a.s. for every t ∈ [0, T ]. In fact this convergence also holds in L2 (P) and P(dω)-a.s. for the sup norm over [0, T ]. The first convergence follows from Theorem 4.3(a) further on applied with X = W and GN = σ(ξ1 , . . . , ξN ) and the second one follows e.g. from [21] P(dω)-a.s.. In particular the convergence holds in L2 (dP⊗dt) or equivalently in L2L2 (P). T

Note that this basis has already been used in the framework of rough path theory for Gaussian processes, see e.g. [2, 7, 8]. 2.2.2

Optimal product quantization (d ≥ 1)

 The one-dimensional case d = 1. The previous expansion of the Brownian motion suggests to define a product quantization of W at level N by r   L t 2 X p bNk (N1 ,...,NL ) c λk ξk sin √ Wt := (2.11) T λk k=1

N where N1 , . . . , NL are non zero integers satisfying N1 · · · NL ≤ N and ξb1N1 , . . . , ξbL L are optimal quadratic quantizations of ξ1 , . . . , ξL . The resulting (squared) quadratic quantization error reads

c (N1 ,...,NL ) k2 = kW − W 2

X λk . Nk2

(2.12)

k≥1

c N,prod is obtained as a solution to An optimal product N -quantization W the following integral bit allocation optimization problem for the sequence (Nk )k≥1 : n o c (N1 ,...,NL ) k , N1 , . . . , N ≥ 1, N1 · · ·NL ≤ N, L ≥ 1 min kW − W (2.13) 2 L

(see [18] for further details and [27] for the numerical aspects). It is established in [18] (as a special case of a more general result on Gaussian processes) that 1 c N,prod k ≍ (log N )− 12 kW − W (2.14) 2 T Furthermore, the critical dimension L = LW (N ) satisfies LW (N ) ∼ log N . Numerical experiments carried out in [27] show that 1 c N,prod k ≈ c (log N )− 21 kW − W 2 W T 11

with cW ≈ 0.5 (at least up to N ≤ 10000). It is possible to get a closed form for the underlying optimal product quantizers αN . First, note that the normal distribution on the real line being log-concave, there is exactly one stationary quadratic quantizer of full size M for every M ≥ 1 (hence it is the optimal one). So, let N ≥ 1 and let (Nk )k≥1 denote its optimal integral bit allocation for the Brownian motion (N ) W . For every Nk ≥ 1, we denote by β (Nk ) := {βik k , 1 ≤ ik ≤ Nk } the unique optimal quantizer of the normal distribution: thus α(0) = {0} by symmetry of the normal distribution. Then, the optimal quadratic product N -quantizer αN,prod (of “true size” N1 ×· · ·×NLW (N ) ≤ N ) can be described using a multi-indexation as follows: X p αN,prod nk ∈ {1, . . . , Nk }, k ≥ 1. βn(Nk k ) λk eW k (t), (n1 ,...,nk ,...) (t) = k≥1

∞ These sums are in fact all finite so that all the functions αN,prod (i1 ,...,in ,...) are C with finite variation on every interval of R+ . Explicit optimal integral bit allocations as well as optimal quadratic quantizations (quantizers and their weights) of the scalar normal distribution are available on the website [28]. Note for practical applications that this optimal product quantization is based on 1-dimensional quantizations of small size of the scalar normal distribution N (0; 1). This kind of functional quantization has been applied in [27] to price Asian options in a Heston stochastic volatility model.

 The d-dimensional case. Assume now W = (W 1 , . . . , W d ) is a d-dimensional Brownian motion. Its optimal product quantization at level N ≥ 1 will be 1 defined as the optimal product quantization at level ⌊N d ⌋ of each of its d components.  Additional results on optimal vector quantization of the normal distribution on Rd . We will extensively make use of the distortion mismatch result established in [13] that we recall here only in the d-dimensional Gaussian case. Let Z be an N (0; Id ) random vector and let αN be an optimal

12

quadratic quantizer at level N of Z (hence of size N ). Then 1 N (i) ∀ p ∈ (0, 2 + d), ∀ N ≥ 1, kZ − Zbα kp ≤ CZ,p N − d ,

(ii)

(2.15)

∀ p ∈ [2 + d, +∞), ∀ η ∈ (0, d + 2), ∀ N ≥ 1, N kZ − Zbα kp ≤ CZ,p,η N −

2+d−η dp

(2.16)

where CZ,p and CZ,p,η are two positive real constants. 2.2.3

Optimal quantization (d = 1)

It is established in [18] (Theorem 3.2) that the quadratic optimal quantization of the one-dimensional Brownian motion reads c N,opt W t

=

r

  dW (N ) t 2 X p N k b λk (ζdW (N ) ) sin √ T λk

(2.17)

k=1

where, for every integer d ≥ 1, ζd = Proj⊥ Ed(W ) ∼ N (0; Diag(λ1 , . . . , λd ))  √  √  and ζbdN is an optimal with Ed := R-span sin ./ λ1 , . . . , sin ./ λd quadratic quantization of ζd at level (or of size) N . If one considers an optimal quadratic N -quantizer β N = {βnN , n =  1, . . . , N } ⊂ RdW (N ) of the distribution N 0; Diag(λ1 , . . . , λdW (N ) ) (a priori not unique) dW (N )

αN,opt (t) = n

X k=1

(βn(N ) )k

p

λk eW k (t),

n = 1, . . . , N.

Once again this defines a C ∞ function with finite variation on every interval of R+ . A sharp rate has been obtained in [19] for the resulting optimal quantization error 1

c N,optk ∼ T copt (log N )− 2 kW − W 2 W √

as

N →∞

(2.18)

= π2 ≈ 0.4502. where copt W The true value of the critical dimension dW (N ) is unknown. A conjecture supported by numerical evidences is that dW (N ) ∼ log N . Recently a first 13

step to this conjecture has been established in [23] by showing that lim inf N

dW (N ) 1 ≥ . log(N ) 2

Large scale computations of optimal quadratic quantizers of the Brownian motion have been carried out (up to N = 10 000 and d = 10). They are available on the website [28]. In the d-dimensional setting, several definitions of an optimal quantization of the Brownian motion W = (W 1 , . . . , W d ) can be given. For our purpose, it is convenient to adopt the following one:  ⌊N d1 ⌋,opt  N,opt ci c W := W

1≤i≤d

.

Its property of interest is that this definition preserves the componentwise independence as well as a stationarity property (see below) since 1 1    ⌊N d ⌋,opt  ⌊N d ⌋,opt i ci i c N,opt c i = E W |W E W |W =W , i = 1, . . . , d.

2.2.4

Wiener like integral with respect to a stationary functional quantization (d = 1)

Both types of quantizations defined above share an important property of quantizers: stationarity. Definition 2.1. Let α ⊂ L2T , α 6= ∅, be a quantizer. The quantizer α is stationary for the (one-dimensional) Brownian motion W if there is a c := W c α induced by α such that Voronoi quantization W c = E(W | σ(W c )) W

a.s.

(2.19)

where E( . |G) denotes the functional conditional expectation given the σ-field c ) is the σ-field spanned by W c. G on L2L2 (P) (see Appendix) and σ(W T

Note that if α is stationary for one Brownian motion, so it is for any Brownian motion since this stationarity property only depends on the Wiener distribution.

c N,prod , this follows from the staIn the case of product quantization W tionarity property of the optimal quadratic quantization of the marginals ξn 14

c N,opt this (see [18] or [27]). In the case of optimal quadratic quantization W follows from the optimality of the quantization of ζdW (N ) itself.

We will now define a kind of Wiener integral with respect to such a c of a one-dimensional W . So we assume that stationary quantization W d = 1 until the end of this Section.

First, we must have in mind that if W is an (Ft )-Brownian motion where the filtration (Ft )t≥0 satisfies the usual conditions, one can define the Wiener stochastic integral (on [0, T ]) of any process ϕ ∈ L2 ([0, T ] × Ω, B([0, T ]) ⊗ F0 , dt ⊗ dP) with respect to W . The non-trivial case is when FtW 6= Ft , typically when Ft = FTB ∨ FtW , t ∈ [0, T ] where B and W are independent. One can see it as a special case of Itˆo stochastic integral or as an extended Wiener integral: if (ϕ(t, ω))(ω,t)∈Ω×[0,T ] denotes an elementary process of the form ϕ(t, ω) :=

n X k=1

ϕk (ω)1sk 2. Namely this means solving an equation formally reading dxt = f (xt )dyt ,

x0 ∈ R d .

(3.26)

In this equation y does not represent the path (null at 0) yt = Wt (ω), t ∈ [0, T ] itself but an enhanced path embedded in a larger space, also called geometric multiplicative functional lying on y with controlled 1q -H¨older 1 2 1 ) semi-norm, namely a couple y = ((ys,t 0≤s≤t≤T , (ys,t )0≤s≤t≤T ) where ys,t = yt − ys ∈ Rd+1 , 0 ≤ s ≤ t ≤ T , can be identified with the path (yt ) and 2 ) 2 (d+1)2 for every 0 ≤ s ≤ u ≤ t ≤ T and the (ys,t 0≤s≤t≤T satisfies, ys,t ∈ R following tensor multiplicative property 2 2 2 1 1 ys,t = ys,u + yu,t + ys,u ⊗ yu,t .

Different choices for this functional are possible, leading to different solutions to the above Equation (3.26). The choice that makes coincide a.s. the solution of (3.25) and the pathwise solutions of (3.26) is given by Z t  i i j 1 2 (Wu − Ws ) ◦ dWu (ω) ys,t = Wt (ω) − Ws (ω), ys,t := (3.27) s

i,j=0,...,d

so that 1 1,i 1,j 1 ⊗ yu,t = ys,u ys,u ys,u



i,j=0,...,d

.

2 is but the “running” L´ The term ys,t evy areas related to the components of the Brownian mtion W . The enhanced path of W will be denoted W (although we will keep the notation y in some proofs for notational convenience). One defines, for every q ≥ 1, the 1q -H¨older distance by setting

ρq (y − x) = ky1 − x1 kq,Hol + ky2 − x2 kq/2,Hol where 2

kx2 kq/2,Hol := T q

sup 0≤s

2 3

then

P e N , X) −→ ρq (X 0.

e ⌊e ρq (X

Nr ⌋

a.s.

, X) −→ 0.

In view of what precedes this result is, as announced, a straightforward corollary of the continuity of the Itˆo map established Theorem 3.1, once the c N , W) in probability is established for any q ∈ (2, 3). A convergence ρq (W slightly more derailed proof is proposed at the end of Section 5. In fact we will prove a much precise statement concerning the Brownian motion since we will establish for every q > 2 the convergence in every Lp (P), c N , W) with an explicit Lp (P)-rate of convergence in the 0 < p < ∞, of ρq (W scale (log N )−θ , θ ∈ (0, 1). These rates can be transferred to the convergence of the quantized SDE, conditionally to some events on which the Itˆo map is itself Lipschitz continuous for the distances ρq . Several results of local Lipschitz continuity have been established recently, especially in [6], [9], [16], [17], although not completely satisfactory from a practical point of view. So we decided not to reproduce (and take advantage of) them here. The proof is divided into two steps: the convergence for the H¨older seminorm) of the regular path component is established in Section 4 (in which more general processes are considered) and the convergence of approximate L´evy areas in Section 5 (entirely devoted to the Brownian case for the sake of simplicity). Remarks. • There is a small abuse of notation in the above Theorem since e N is not a Voronoi quantizer of X: this quantization of X is defined on X 23

the Voronoi partition (for the L2T,Rd -norm) induced by the quantization of the Brownian motion W . • The same results holds for the Brownian bridge, the Ornstein-Uhlenbeck process and more generally for continuous Gaussian semi-martingales that satisfy the Kolmogorov criterion.

4 4.1

Convergence of the paths of processes in H¨ older semi-norm A general setting including stationary functional quantization

In this section we investigate the connections between the celebrated Kolmogorov criterion and the tightness of some classes of sequences of processes for the topology of 1q -H¨older convergence. In fact this connection is somehow the first step of the rough path theory, but we will look at it in a slightly different way. Whatsoever this naive pathwise convergence is not sufficient to get the continuity of the Itˆo map in a Brownian framework and we will also have to deal for our purpose with the multiplicative functional (see Section 5). But at this stage we aim at showing that when a sequence (Y N )N ≥1 satisfies some “stationarity property” with respect to a process Y , several properties of Y can be transferred to the Y N . Indeed, the same phenomenon will occur for the multiplicative function (see the next section). If Y satisfies the Kolmogorov criterion and (GN )N ≥1 denotes a sequence of sub-σ-fields of A, then a sequence of processes defined by Y N := E(Y | G N ),

N ≥ 1,

where the conditional expectation is considered in the functional sense (see Appendix) is (C-tight and) tight for a whole family of topologies induced by convergence in 1q -H¨older sense. Definition 4.2. Let p ≥ 1, θ > 0. A process Y = (Yt )t∈[0,T ] satisfies the Kolmogorov criterion (Kp,θ ) if there is a real constant CTKol > 0 such that ∀ s, t ∈ [0, T ],

E|Yt − Ys |p ≤ CTKol |t − s|1+θ 24

and

Y0 ∈ Lp (P).

Theorem 4.3. Let Y := (Yt )t∈[0,T ] be a pathwise continuous process defined on (Ω, A, P) satisfying the Kolmogorov criterion (Kp,θ ). Let (GN )N ≥1 be a sequence of sub-σ-fields of A. For every N ≥ 1 set Y N := E(Y | GN ). For every N ≥ 1, Y N has a pathwise continuous version satisfying ∀ t ∈ [0, T ],

YtN = E(Yt | GN )

a.s.

Furthermore, if one of the following conditions is satisfied (a) GN ⊂ GN +1 ,

(b) There exists an everywhere dense subset D ⊂ [0, T ] such that P

YtN −→ Yt .

∀ t ∈ [0, T ], P

(c) |Y N − Y |LrT −→ 0 for some r ≥ 1, then 1 ∀q > , θ

∀ p ∈ [1, qθ),

Lp

kY − Y N ksup + kY − Y N kq,Hol −→ 0.

The proof of the theorem is a variant of the proof of the Kolmogorov criterion for functional tightness of processes. It consists in a string of several lemmas. For the following classical lemma, we refer to [14] (where it is stated and proved for semi-norms in p-variation). Lemma 4.1. Let x, y ∈ C([0, T ], Rd ) and let q ≥ 1. Then

(a) kx − x(0)ksup ≤ kxkq,Hol .

(b) kx + ykq,Hol ≤ kxkp,Hol + kykq,Hol if q ≥ 1,



(c) For every q > q ′ ≥ 1, kxkqq,Hol ≤ (2kxksup )q−q kxkqq′ ,Hol . ′

(d) Claims (a)-(b)-(c) remain true with the p-variation semi-norm Varq,[0,T ] instead of the 1q -H¨ older semi-norm. Lemma 4.2. Let p ∈ [1, ∞). If Y satisfies the Kolmogorov criterion (Kp,θ ) then, for every N ≥ 1, the process Y N defined by YtN = E(Yt | GN ) has a ′ older continuous for every θ ′ ∈ (0, θ) continuous modification which is θp -H¨

25

(i.e. kY N k p′ ,Hol < +∞ a.s.). Furthermore, the sequence (Y N )N ≥1 is Cθ tight and for every θ ′ ∈ (0, θ), there exists a random variable Zθ′ ∈ LpR (P) such that P(dω)-a.s. kY (ω)k p′ ,Hol ≤ Zθ′ (4.29) θ

and ∀ N ≥ 1, kY N (ω)k p′ ,Hol ≤ E(Zθ′ | GN )(ω).

(4.30)

θ

In particular, the sequence of H¨ older semi-norms (kY N k p′ ,Hol )N ≥1 is θ integrable.

Lp -uniformly

Remark. As a by-product of the proof we also get that E(Zθp′ ) ≤ Cp,T,θ,θ′ CTKol where CT,p,θ,θ′ is a finite real constant that only depends upon p, T , θ and θ ′ (and not on Y or the σ-fields GN ). Proof. First it follows form the Kolmogorov criterion that for every N ≥ 1, ′ Y N admits a continuous modification which is θp -H¨older for every θ ′ ∈ (0, θ). Moreover the sequence (Y N )N ≥1 is C-tight since every Y N satisfies the same Kolmogorov criterion (Kp,θ ) and Y0N = E(Y0 |GN ) is tight on R (see [1], [29] p.26). Now, let s, t ∈ [0, T ], let m, n ≥ 1 be two fixed integers. First note that

sup s,t∈[0,T ], t≤s≤t+ 2Tn

|Yt − Ys | ≤ 2

X

m≥0

max

0≤k≤2n+m −1

and max

0≤k≤2n+m −1

p

|Ytn+m − Ytn+m | ≤ k+1

k

2n+m X−1 k=0

|Ytn+m − Ytn+m | k+1

k

|Ytn+m − Ytn+m |p . k+1

k

For every θ ′ ∈ (0, θ), set   ′ X θ 2 2n p |Yt − Ys | . Zθ′ :=  sup T T s,t∈[0,T ], t≤s≤t+ n n≥0

2

26

(4.31)

(4.32)

Taking the Lp -norm in (4.31) yields kZθ′ kp

  θ′ X θ′ 2 p ≤ 2n p k sup |Yt − Ys |kp T s,t∈[0,T ], t≤s≤t+ Tn n≥0

2

  θ′ X ′ X 2 p nθ 2 p k max |Ytn+m − Ytn+m |kp . ≤ 2 k+1 k T 0≤k≤2n+m −1 n≥0

ℓ≥0

On the other hand, owing to the the Kolmogorov criterion (Kp,θ ), E

max

0≤k≤2n+m −1

p

|Ytn+m − Ytn+m | k+1

k

2n+m X−1



k=0

E|Ytn+m − Ytn+m |p k+1

k

≤ 2n+m CTKol 2−(n+m)(1+θ) T −(1+θ) = CTKol T −(1+θ) 2−(n+m)θ . Hence



E Zθp′ ≤ CTKol Cp,T,θ,θ′ 

XX

′ n θ p−θ

2

−m θp

2

n≥0 m≥0

p

 < +∞

where the finite real constant Cp,T,θ,θ′ only depends on p, T , θ and θ ′ . On the other hand, for every δ ∈ [0, T ], there exists a integer nδ ≥ 1 such that 2−(1+nδ ) ≤ δ/T ≤ 2−nδ . Hence, ′



δ−θ sup s,t∈[0,T ], t≤s≤t+δ



|Yt − Ys |p ≤ 2(1+nδ )θ T −θ ×

sup s,t∈[0,T ], t≤s≤t+ 2Tn

|Yt − Ys |p ≤ Zθp′ .

Consequently, for every s, t ∈ [0, T ], and every ω ∈ Ω, θ′

|Yt (ω) − Ys (ω)| ≤ Zθ′ (ω)|t − s| p i.e. kY (ω)k p′ ,Hol ≤ Zθ′ (ω). θ

Finally, it follows from Jensen’s Inequality that for every s, t ∈ Q ∩ [0, T ], P(dω)-a.s.



|YtN (ω) − YsN (ω)| ≤ E(Zθ′ | GN )(ω)|t − s|θ . 27

In particular this means that, for every p ≥ 1 and every θ ′ ∈ (0, θ), kY N (ω)k p′ ,Hol ≤ E(Zθ′ | GN )(ω) < +∞

P(dω)-a.s.

θ

and satisfies the Lp -uniform integrability assumption.



Proof of Theorem 4.3. The sequence (Y N )N ≥1 being C-tight on (C([0, T ], Rd ), k . ksup ), so is the case of the the sequence (Y N , Y )N ≥1 on (C([0, T ], R2d ), k . ksup ) since the product topology coincides with the uniform topology. Let Q = wlimN P(Y N ′ ,Y ) denote a weak functional limiting value of (Y N , Y )N ≥1 . If Ξ = (Ξ1 , Ξ2 ) denotes the canonical process on (C([0, T ], R2d ), k . ksup ), it is clear that QΞ2 = PY .  Convergence of the sup-norm. Assume that (c) holds: the functional y 7→ |y 1 (t) − y 2 (t)|LrT is continuous on (C([0, T ], R2d ), k . ksup ), consequently,

|Ξ1 − Ξ2 |LrT = 0 Q-a.s. i.e. Q = P(Y,Y ) so that (Y N , Y )

L(k . ksup )

P

−→

(Y, Y ) as

N → ∞ which simply means that kY N − Y ksup −→ 0. On the other hand,it follows from Lemma 4.2 that, for every N ≥ 1,  kY N − Y kpsup ≤ Cp,T E(Zθp′ | GN ) + Zθp′ a.s.

(for a given fixed θ ′ ∈ (0, θ)) which implies that (kY N − Y kpsup )N ≥1 is uniformly integrable. Finally, E kY N − Y kpsup −→ 0

as

N → ∞.

Assume that (b) holds: it follows that, for every t1 , . . . , tk ∈ D, one has P

) −→ (Yt1 , . . . , Ytk ), which in turn implies that the convergence (YtN , . . . , YtN 1 k L

, Yt1 , . . . , Ytk ) −→ (Yt1 , . . . , Ytk , Yt1 , . . . , Ytk ). This means that (YtN , . . . , YtN 1 k Q and P(Y,Y ) have the same finite dimensional marginals i.e. Q = P(Y,Y ) . One concludes like in (c). If (a) holds, for every t ∈ [0, T ], YtN → Yt P-a.s., so that (b) is satisfied.  Convergence of the H¨ older semi-norm. Let q ≥ 1. As concerns the 1 convergence of the q -H¨older semi-norm, one proceeds as follows. Let q ′ ∈ ( pθ , q) and set θ ′ := qp′ ∈ (0, θ). It follows from Lemma 4.1(b)-(c) that ′

1− qq

kY − Y N kq,Hol ≤ 2

1− q



kY − Y N ksup q × kY kq′ ,Hol + kY N kq′ ,Hol 28

 qq′

.

Now let Z := Zθ′ be defined by (4.32). Then,  kY kq′ ,Hol + kY N kq′ ,Hol ≤ Z + E(Z | GN ) .

Hence, the sequence (kY kq′ ,Hol + kY N kq′ ,Hol )N ≥1 , is tight since it is Lp Lp

bounded. On the other hand, kY − Y N ksup −→ 0 so that P

kY − Y N kq,Hol −→ 0 as N → ∞. Now let θe = pq ∈ (0, θ). The same argument as above shows that kY − e = Z e is still given by (4.32). As a Y N kq,Hol ≤ Ze + E(Ze | GN ) where Z θ p N consequence, (kY − Y kq,Hol )N ≥1 is uniformly integrable since, for every N ≥ 1, Jensen’s Inequality implies  ep + E(Zep | G ) kY − Y N kpq,Hol ≤ 2p−1 Z N Lp

which finally implies that kY − Y N kq,Hol −→ 0.

4.2



Application to stationary quantizations of Brownian motion: convergence and rates

c N )N ≥1 be a sequence of stationary quadratic funcTheorem 4.4. (a) Let (W tional quantizers of a standard d-dimensional Brownian motion W defined by (2.11) or (2.17) converging to W in a (purely) quadratic sense, namely c N |L2 k2 → 0 as N → ∞. Then, for every q > 2, k |W − W T

Lp

c N kq,Hol −→ 0 as N → ∞. kW − W

∀ p ∈ (0, ∞),

c N is an optimal product quantization (b) Let q > 2. If, for every N ≥ 1, W at level N . Then, for every p ∈ (0, ∞),  



− 23 min 51 (1− 2q ), p1 +α N c , ∀ α > 0. = o (log N ) kW − W k

q,Hol p

The proof of this Theorem is a consequence of the above Theorem 4.3. So we need to get accurate estimates for the increments of the processes c N . This is the aim of the following lemma. W −W

c N , N ≥ 1, denote a sequence of Lemma 4.3. Let p ∈ [2, +∞). Let W optimal product quadratic quantizers. For every ρ ∈ (0, 12 ) and every ε ∈ (0, 3), for every s, t ∈ [0, T ], s ≤ t,

) −( 1 −ρ)∧( 3−ε ctN − W csN ) 2p .

(Wt − Ws ) − (W

≤ Cρ,p,T,d,ε|t − s|ρ (log N ) 2 p

(4.33)

29

In particular, if p ∈ (2, 3), then

N N ρ −( 12 −ρ) c c − W ) − ( W − W ) .

(Wt s t s ≤ Cρ,p,T,d |t − s| (log N )

(4.34)

p

Proof. We may assume without loss of generality that we deal with a 1 one-dimensional Brownian motion W , quantized at level N ′ = ⌊N d ⌋ since everything is done component by component. Set for every k ≥ 1, ξek := ξk − ξbkNk where N1 , . . . , Nk , . . . denotes the optimal bit allocation of an optimal product quadratic quantization at level N ′ . Keep in mind that for every k > LW (N ′ ), Nk = 1 and that of course N1 · · · NLW (N ′ ) ≤ N ′ . The random vectors (ξek )k≥1 are independent and centered. It follows from the K-L expansion of W and its product quantization that X  W ctN ′ − W csN ′ ) = (Wt − Ws ) − (W λk ξek eW k (t) − ek (s) . k≥1

Then, it follows from the B.D.G. Inequality for discrete time martingales that

1

2

X

N′ N′ W W 2 e c c

λk ξk (ek (t) − ek (s))

(Wt − Ws ) − (Wt − Ws ) ≤ Cp,T p

k≥1

p ≤ Cp,T

since, for every k ≥ 1, 2 W (eW k (t) − ek (s)) =

 

X k≥1

1

2

2

e 2 λ1−ρ k kξk kp

|t − s|ρ

t − s t − s 8 8 cos2 √ ≤ |t − s|2ρ λ−ρ sin2 √ k . T T λk λk

The random variables ξbkNk being an optimal quadratic quantization of the one-dimensional normal distribution for every k ∈ {1,. . ., LW (N ′ )}, it follows from (2.16) that, there exists for every ε ∈ (0, 3), a constant κp,ε such that 1 ∀ m ≥ 1, kξek kp = kξ − ξbkNk kp ≤ κp,ε 1∧ 3−ε Nk p

30

where ξbm denotes the (unique) optimal quadratic quantization at level m of a normally distributed scalar random variable ξ. As a consequence, 

X 1−ρ

cN ′ − W c N ′ ) λk

(Wt − Ws ) − (W

≤ Cp,T,ε|t − s|ρ  t s p

k≥1

1

2

1 2(1∧ 3−ε ) p Nk

 .

 Temporarily assume that p ∈ [2, 3). One may choose ε so that 1∧ 3−ε p = 1. ′ ′ ′ −2 Now, keeping in mind that L := LW (N ) ∼ log N and λk ≤ c k for a real constant c > 0, one gets X k

λ1−ρ n



1 Nk2



λρL′

L X X 1−ρ λk + λk 2 Nk ′ k=1

k>L



L X λk (log N ) + (log N )2ρ−1 Nk2 ′ 2ρ

≤ Cρ

k=1

!

.

Now, following e.g. [18], we know that the optimal bit allocation yields ′

L X C λk ≤ (log N ′ )−1 T Nk2 k=1

so that, finally

1

ctN ′ − W csN ′ )

(Wt − Ws ) − (W

≤ Cρ,p,T |t − s|ρ (log N ′ )ρ− 2 . p

 Assume now that p ∈ [3, +∞) and ε ∈ (0, 3). Set p˜ = conjugate exponent. Then, H¨older Inequality implies ′

L X λ1−ρ k

k=1

2 p ˜

Nk



L X λk Nk2



k=1

! 1p˜



L X

1− ρp λk p−3+ε

k=1

p 3−ε

! q1˜

> 1 and q˜ its

.

We inspect now three possibles cases for ρ. • If 0 < ρ < 12 (1 − 3−ε p ), then 1 − which in turn implies that ′

L X λ1−ρ k

k=1

2 p ˜

Nk

ρp p−3+ε

>

1 2

so that

P

− 3−ε  p . ≤ Cρ,p,T log N ′ 31

ρp 1− p−3+ε

k≥1 λk

< +∞,

Furthermore 1 −

ρ 2

>

3−ε p . < 21 ,

3−ε 1 2 (1 − p ) < ρ ρp 1− p−3+ε < +∞ λ k≥1 k

• If P



ρ 2


1d log N (which implies log N ′ > 1 d log(N/2)). ♦ Proof of Theorem 4.4. (a) Owing to the monotonicity of the Lp -norms, it is enough to show that, the announced convergence holds for every q > 2 2q 2p and every p > q−2 or equivalently for every p > 2 and every q > p−2 . This 1 statement follows for the q -H¨older (semi-)norm follows from Theorem 4.3(c). Indeed W satisfies the Kolmogorov Kp,θ with θ = p/2 − 1. On the other hand, it follows from [13] that, for any sequence of (Voronoi) quantizations c N at level N converging in L2 2 (P) toward W , this convergence also holds W L T

in the a.s. sense. So Criterion(c) is fulfilled.

32

c N satisfies Kp,ρp−1 for every ρ ∈ ( 1 , 1 ) (b) Let q > 2. The process W − W p 2 with “Kolmogorov constants” )] −p[( 12 −ρ)∧( 3−ε 2p

Kol CT,p = Cp,T,ρ,d,ε(log N )

, ε ∈ (0, 3).

We wish to apply Lemma 4.2 (and the remark that follows).  Assume 0 < p < p′

5q q−2 .

Then there exists η > 0 such that p < p′ =

5q q−2+η .

Set θ ′ = q . One cheks that p1′ + 1q < 12 so that there exists η ′ > 0 such 3 . that ρ = p1′ + 1q + η ′ < 21 . Elementary computations show that 12 − ρ < 2p 3 1 Let ε ∈ (0, 3) such that 2 − ρ < 2p − ε. Consequently, Lemma 4.2 (and the remark that follows) imply that

1

c N kq,Hol

≤ Cq,η,η′ ,T,ε (log N )−( 2 −ρ)

kW − W p′

and for any small enough α > 0, one my specify η, η ′ and ε so that 21 − ρ = 2 3 ′ 10 (1 − q ) − α. Finally this bounds holds true for p ∈ (0, p ) since the the Lp -norm is non-decreasing.   5q 3 , one checks that 2p ≥ 21 − 1p + 1q . It becomes impossible  Now, if p ≥ q−2 to specify ρ ∈ (0, 21 ) so that θ ′ = pq < θ = ρp − 1 and 1 − ρ > same specifications as above lead to

− 3−ε c N kq,Hol

kW − W

≤ Cq,η,η′ ,ε,T (log N ) 2p

3 2p .

So the

p′

which yields the announced result.

5



Convergence of stationary quantizations of the Brownian motion for the ρq -H¨ older distance.

In view of what will be needed to apply this theorem to the Brownian motion and its functional quantizations, we need to prove a counterpart 2 . However, for the sake of simplicity, by of Lemmas 4.2 and 4.3 for Ws,t contrast with the previous section, we will only deal with the case of the Brownian motion and its stationary quantizations. The main result of this section is the following Theorem. Theorem 5.5. Let q > 2. 33

c N )N ≥1 be a sequence of stationary quadratic functional quantizers (a) Let (W of a standard d-dimensional Brownian motion W defined by (2.11) or (2.17) c N |L2 k2 → 0 converging to W in a (purely) quadratic sense, namely k |W − W T as N → ∞. Then,

c N ) ∀ q > 2, ∀ p > 0,

ρq (W, W

−→ 0 as N → ∞. p

c N is an optimal product (b) Let q > 2. Assume that, for every N ≥ 1, W quantization at level N of W . Then, for every q > 2 and every p > 0,  



− 3 min 27 (1− q2 ), p1 +α c 2,N k q ,Hol , ∀ α > 0,

= o (log N ) 2

kW2 − W 2

p

so that, finally,



− 3 min c N )

= o (log N ) 2

ρq (W, W

1 (1− 2q ), p1 7

p



(c) If r > 32 , then



 , ∀ α > 0.

  r q−2 3 c ⌊eN ⌋ ) = o N −( 2 r−1) 7q +α ∀ α > 0, P-a.s. ρq (W, W

Note that the result of interest for our purpose (convergence on multidimensional stochastic integrals) corresponds to q ∈ (2, 3). The proposition below appears as the counterpart of Lemma 4.2 on the way to the proof. Proposition 5.3. Let p > 2. 2 be defined by (3.27). For every θ ˜′ ∈ (0, p − 1), there exists a (a) Let Ws,t (2)

random variable Zθ˜′ ∈ Lp such that P-a.s.

(2)

θ˜′

2 |Ws,t | ≤ Zθ˜′ |t − s| p .

∀ s, t ∈ [0, T ],

(b) Let c 2,N (ω) W s,t

=

Z

s

t

ci (W u



c i )dW cj W s u



(ω), i,j=0,...,d

s, t ∈ [0, T ], s ≤ t,

c=W c N is a stationary quantization of W (the integration holds in where W the Stieltjes sense). Then, for every p > 2 and every θ˜′ ∈ (0, p − 1), P-a.s.

∀ s, t ∈ [0, T ],

(2)

θ˜′

c 2,N | ≤ E(Z | G )|t − s| p . |W s,t N p,θ˜′ 34

2 −W f 2,N = Ws,t c 2,N where W c=W c N is now an optimal quadratic (c) Let W s,t s,t product quantization of W at level N . Then, if p > ρ1 , for every θ˜′ ∈ (0, p(ρ + 12 ) − 2), for every ε ∈ (0, 3) and every δ > 0, there exists a real constant Cρ,p,T,d,ε,δ > 0 such that

f 2,N | 3−ε

 1 |W s,t

sup

≤ Cρ,p,T,d,ε,δ log N −( 2 −ρ)∧ 2(p+δ) .

θ˜′

s,t∈[0,T ] |t − s| p p

(2)

Proof. (a) The random variable Zθ˜′ of interest is defined by (2)

Zθ˜′ :=

2 X n θ˜p′ 2 sup |Ws,t |. 2 T s≤t≤s+ Tn n≥0

2

Let s, t ∈ [0, T ], s ≤ t ≤ s + 2Tn . We know from the multiplicative tensor property that, for every u ∈ [s, t], 2 2 2 Ws,t = Ws,u + Wu,t + Ws,u ⊗ Wu,t

and that, for every i, j ∈ {0, . . . , d}, 1 j 2 j i 2 i | + |Wu,t | ). |Ws,u ⊗ Wu,t | ≤ (|Ws,u 2 2, |, we may restrict to dyadic numbers owing to To evaluate supt∈[s,s+ Tn ]|Ws,t 2

2 . As a consequence, we have, still following the continuity in (s, t) of Ws,t the classical scheme of Kolmogorov criterion X 2 | ≤ 2 max |Wt2n+m ,tn+m | sup |Ws,t t∈[s,s+ 2Tn ]

m≥0

+

0≤k≤2n+m −1

max

0≤k≤2n+m −1

k

k+1

|Wtn+m ,tn+m |2 . k

k+1

Now E



2m+n X−1

E |Wt2n+m ,tn+m |p

|Wtn+m ,tn+m | ≤

2m+n X−1

E |Wtn+m ,tn+m |p

max |Wt2n+m ,tn+m |p k k+1 0≤k≤2n+m −1

and E

max

0≤k≤2n+m −1

p

k

k+1

35

ℓ=0

ℓ=0





ℓ+1

ℓ+1

where the norms | . | are the canonical Euclidean norms on the spaces M((d+ 1), (d + 1)) and Rd+1 respectively. It is clear that, for every i 6= j, i, j ≥ 1 and every t ≥ s,

Z t

2,ij i i j

kWs,t kp = (Wu − Ws )dWu s

p

s

p

Z t

i i j

≤ (Wu − Ws )dWu

Z t

1

2 i i 2 BDG

≤ Cp

(Wu − Ws ) du s

p 2

≤ Cp |t′ − t|

whereas k |Wt′ − Wt |2 kp = |t′ − t|k |W1 |kp = Cp,d |t′ − t|.

2,ii = 12 (Wti − Wsi )2 Noting that Wt0 = t and, if i = j, 1 ≤ i ≤ d, Ws,t shows that the above upper-bound still holds for i = j and i or j = 0. Consequently, we also have 2,ij kWs,t kp ≤ Cp,d |t′ − t|.

Consequently E max |Wt2n+m ,tn+m |p n+m k k+1 0≤k≤2 −1 so that (2)

kZθ˜′ kp ≤ Cp,d,T

X

θ˜′

2n p

n≥0

2n+m X−1

≤ Cp,d

X

k=0

T 2n+m

1

p

2(n+m)( p −1) = Cp,d,T

m≥0

= Cp,d,T 2(n+m)(1−p)

X

θ˜′

2n( p −1) < +∞

n≥0

since θ˜′ < p − 1. On the other hand, one has obviously 2 | |Ws,t

sup s,t∈[0,T ],s6=t

|t − s|

θ˜′ p

(2) θ

≤ Z ˜′ < +∞

a.s.

Lemma 4.2(a) applied to W (which satisfies (Kp, p −1 )) yields for every θ ′ ∈ (0, p2 − 1) the existence of Z (1) ∈ Lp (P) such that sup s,t∈[0,T ],s≤t

1 | |Ws,t

|t − s|

θ′ p

36

(1)

≤ Zθ ′

2

a.s.

q>

As a consequence, combining these two results shows that, for every 2p p−2 , (1)

(2) θ

ρq (W, 0) < Z = Zθ′ + Z ˜′ ∈ Lp (P) p q

where Z (1) is related to θ ′ = (0, p − 2).

∈ (0, 2p − 1) and Z (2) is related to θ˜′ =

2p q



c 2,ij,N = (b) If i 6= j, 0 ≤ i, j ≤ d, it follows from Proposition 2.2 that W s,t c 2,ij,N | GN ) where GN = σ(W c ) and W c N = (W c i,N )1≤i≤d is an optimal E(W s,t

product quantization at level N (which means that for each component W i , c i,N is an optimal product quantization at level N ′ = ⌊N d1 ⌋). W  c 2,ii,N | ≤ 1 E (W i − W i )2 | GN ) . One derives that When i = j ≥ 1, |W s,t

c 2,ii,N | |W s,t θ˜′

|t − s| p

t

2



≤ E

2,ii |Ws,t |

θ˜′

|t − s| p

s



  ˜′ (2) θp  | GN ≤ E (Z ˜′ ) | GN . θ

c 2,ii,N = W2,ii = 1 (t − s)2 . When i = j = 0, W 2 (2),N

of interest is defined by (c) In this claim, the random variable Zθ˜′ 2 X n θ˜p′ f 2,N | e(2),N sup |W 2 Z = s,t θ′ T T s≤t≤s+ n n≥0

2

Lp (P)

and we aim at showing that it lies in with a control on its Lp -norm 2,N f as a function of N . One first derives for W s,t the straightforward identity when s ≤ u ≤ t N f 2,N = W f 2,N + W f 2,N + W fs,u,t W s,u s,t u,t where

fN cN cN W s,u,t = Ws,u ⊗ Wu,t − Ws,u ⊗ Wu,t

N N N cs,u cs,u cu,t = (Ws,u − W ) ⊗ Wu,t + W ⊗ (Wu,t − W )

with Wr,s := Wr − Ws if r ≥ s, etc. One derives from (5.36) that X f 2,N | ≤ 2 f 2,N |W max |W | s,t tn+m ,tn+m m≥0

+2

+2

X

0≤k≤2n+m −1

k

m,m′ ≥0

max

k

k+1

k

k+1

k′

k′ +1

c 2,N c |Wt2,N n+m n+m − W n+m n+m ||Wtn+m ,tn+m |. (5.39) ,t t ,t

0≤k≤2n+m −1 ′ n+m′

0≤k ≤2

(5.37)

k+1

c 2,N |Wt2,N n+m n+m − W n+m n+m ||Wtn+m ,tn+m |(5.38) ,t t ,t

max

0≤k≤2n+m −1 m,m′ ≥0 ′ 0≤k ′ ≤2n+m −1

X

(5.36)

k

k+1

−1

37

k

k+1

k′

k′ +1

where we used that |u ⊗ v| ≤ |u||v|. We will first deal with deal with the first term in (5.37). We note that X p f 2,N f 2,N E max |W E|W |p . n+m n+m | ≤ t ,t tn+m ,tn+m 0≤k≤2n+m −1

k

k+1

k

0≤k≤2n+m −1

k+1

Let s, t ∈ [0, T ], s ≤ t and i, j ∈ {1, . . . , d}, i 6= j. One checks that the following decomposition holds Z t Z t i j j,N f 2,ij,N = c c j,N d(Wui − W cui,N ) . W d(W − W ) W W + s,u u u s,t u,t |s {z } |s {z } (A)

(B)

Let us focus on (A). First not that, owing to Proposition 2.1 applied with Ft = σ(Wui , u ∈ [0, T ], Wsj , s ≤ t),  u  X Z t i du. ξenj Ws,u cos √ (A) = λn s n≥1

Using that W i and W j are independent, one derives that (A) is the terminal value of a martingale with respect to the filtration σ(ξkj , k ≤ n, Wui , 0 ≤ u ≤ T ), n ≥ 1 so that combining B.D.G. and Minkowski inequalities yields, with the notations of Lemma 4.3, p  Z t  u  2 2 X i du  Ws,u cos √ E(|(A)|p ) ≤ CpBDG E  (ξenj )2 λ n s n≥1 p 

2 2

Z t   X

u i  du ≤ CpBDG  Ws,u cos √ kξenj k2p

λ s

n≥1

n

p

where ξ˜n = ξn − ξbnNn and N1 , . . . , Nn , . . . denote the optimal bit allocation of an optimal quadratic product quantization at level N ′ (keep in mind that Nk = 1, k > LB (N ′ ) and N1 · · · NLB (N ′ ) ≤ N ′ (B scalar Brownian motion). Now an elementary integration by parts yields Z t  t   u   u  p Z t i du = λn sin √ − sin √ dWui Ws,u cos √ λn λn λn s s so that, for every ρ ∈ (0, 21 ), one checks that, owing to the BDG Inequality,

Z t  u  1−ρ

i

≤ CpBDG Cp,ρ λn 2 |t − s| 12 +ρ .

Ws,u √ du cos

λn s p 38

Finally, for every ε ∈ (0, 3),



k(A)kp ≤ Cp,T,ρ,ε 

X

n≥1

1

2 1

˜ 2 λ1−ρ n kξn kp

|t − s| 2 +ρ .

One shows likewise the same inequality for (B) once noted that Z t  Z t  u   u  c i,N i,N i W c √ √ Ws,u cos Ws,u cos du = E du | FT λn λn s s which implies

Z t

Z t  u   u 



i i,N

W c cos √ du du ≤ Ws,u cos √ s,u

.

λ λ n

s

n

s

p

Consequently, for every ε ∈ (0, 3), f 2,ij,N kp kW s,t

≤ Cp,ρ,T

 

X

n≥1

1

2

˜ 2 λ1−ρ n kξn kp

≤ Cp,T,ρ,d,ε log N

If i = j ≥ 1, then

p

−( 1 −ρ)∧ 3−ε 2

1

|t − s| 2 +ρ

2p

1

|t − s| 2 +ρ .

(5.40)

   i − c i,N 2 f 2,ii,N = 1 Ws,t W W ) s,t s,t 2

so that, using again H¨older Inequality,

i i f 2,ii,N kp = 1 kWs,t c i,N k c i,N kW −W s,t s,t p+δ) kWs,t − Ws,t kp(1+ p ) 2 δ

and one gets the same bounds as in the case i 6= j.

If i or j = 0, one gets similar bounds: we leave the details to the reader. Finally, one gets that, for every i, j ∈ {0, . . . , d}, f 2,N kp ≤ Cp,ρ,T,d,ε,δ log N kW s,t

−( 1 −ρ)∧ 2

3−ε 2(p+δ)

1

|t − s| 2 +ρ .

By standard computations similar to those detailed in Lemma 4.2, we get X

k

m≥0

−( 1 −ρ)∧ 3−ε −n( 1 +ρ) 2 2(p+δ) f 2,N max |W 2 2 . n+m n+m |kp ≤ Cp,ρ,T,d,ε,δ log N t ,t

0≤k≤2n+m−1

k

k+1

39

Let us pass now to the two other sums. We will focus on (5.38) since both behave and can be treated similarly. max

p p c 2,N |Wt2,N n+m n+m − W n+m n+m | |Wtn+m ,tn+m | ,t t ,t

0≤k≤2n+m −1 ′ 0≤k ′ ≤2n+m −1

k

k+1

k



X

k′ +1

k′

k+1

p p c 2,N |Wt2,N n+m n+m − W n+m n+m | |Wtn+m ,tn+m | . ,t t ,t k

k+1

0≤k≤2n+m −1 ′ 0≤k ′ ≤2n+m −1

k

k′

k+1

k′ +1

Now for every s, u, t ∈ [0, T ], s ≤ u ≤ t, it follows from H¨older Inequality that c N | |Wu,t |kp k |Ws,u − W s,u

c N k kWu,t k ≤ kWs,u − W s,u p+δ p(1+p/δ)

1 N cs,u ≤ Cp,δ kWs,u − W kp+δ |t − u| 2 .

Using Inequality (4.33) from Lemma 4.3, we get for every p > 2, every ρ ∈ (0, 21 ), every ε ∈ (0, 3), and every s, t ∈ [0, T ], s ≤ t,

i ) −( 1 −ρ)∧( 3−ε i,N c 2p .

Ws,t − Ws,t ≤ Cρ,p,T,d,ε|t − s|ρ (log N ) 2 p

Now,

E max

p p c 2,N |Wt2,N n+m n+m − W n+m n+m | |Wtn+m ,tn+m | ,t t ,t

0≤k≤2n+m −1 ′ n+m′

0≤k ≤2

k

k+1

k

k′

k+1

k′ +1

−1

  3−ε ) p (n+m)(1−ρp) (n+m′ )(1− p ) −( 1 −ρ)∧( 2(p+δ) 2 2 2 ≤ Cρ,p,T,d,δ,ε(log N ) 2

and we use that ρ > p1 and p > 2 to show that X p c 2,N max k|Wt2,N n+m n+m − W n+m n+m | |Wtn+m ,tn+m |kp ,t t ,t m,m′ ≥0

0≤k≤2n+m −1 ′ 0≤k ′ ≤2n+m −1

k

k+1

k

k+1

k′

k′ +1

3−ε −( 12 −ρ)∧ 2(p+δ) n( p2 −( 21 +ρ))

≤ Cρ,p,T,d,δ,ε(log N )

Finally, we get   −p( 1 −ρ)∧ 3−ε (2),N p 2 2(p+δ) ≤ Cρ,p,T,d,ε,δ log N E Ze˜′

2

.

θ

˜ with θ˜ = p(ρ + 1 ) − 2. Now, it follows by standard as soon as θ˜′ ∈ (0, θ) 2 arguments that θ˜′ f 2,N | ≤ Z e(2),N |t − s| p sup |W s,t∈[0,T ]

s,t

θ˜′

40

so that, finally

f 2,N | 3−ε

 1 |W s,t

sup

≤ Cρ,p,T,d,ε,δ log N −( 2 −ρ)∧ 2(p+δ) . ′ ˜

θ

s,t∈[0,T ] |t − s| p



p

Now, we are in position to prove the main result of this section.

Proof of Theorem 5.5. (a) Given Theorem 4.4, this amounts to proving c 2,N k q ,Hol converges to 0 in every Lp (P). This easily follows that kW2 − W 2 from Proposition 5.3(a)-(b). 3 (b) We inspect successively four cases to maximize min(1 − ρ, 2p ) in ρ when it is possible.

7q  q ∈ (2, 4) and p < 2(q−2) + α2 with α > 0 . Let p′ be defined by p1′ = 2(q−2) 7q small enough so that p′ > p and p1′ + 1q < 12 . Then set ρ′ = 2q + p2′ − 21 + α2 (note that ρ′ > p1′ ). One checks that 12 − ρ′ = 1 − 2( p1′ + 1q ) = 73 (1 − 2q ) − α ∈ 1 (0, 2(p3−ε ′ +δ) ∧ 2 ) at least for any small enough α, δ = δ(α, q) > 0 and ε = ′ ε(α, q) > 0. Now, Proposition 5.3(c) applied with θ˜′ = 2pq < p′ (ρ′ + 12 ) − 2



2 2,N c q yields the announced asymptotic rate for kW − W k ,Hol , p < p′ , 2

since Lp (P)-norms are non-decreasing in p.

p

7q  q ∈ (2, 4) and p ≥ 2(q−2) . One sets the same specifications as above for 3 ′ and choose ε = ε(q, α) > 0 and ρ but with p = p. Then 1/2 − ρ > 2p 3−ε 3 + α. δ = δ(q, α) > 0 small enough so that 2(p+δ) ≤ 2p 7q 2q 2(q−2) < q−4 7q 2q [ 2(q−2) , q−4 ) can be

 q ∈ [4, 20/3). Then 7q 2(q−2) )

and one checks that the cases p ∈

2q and p ∈ (2, solved as above. If p ≥ q−4 (hence 3 1 . ≥ 5), no optimization in ρ is possible i.e. any admissible ρ satisfies 2 −ρ > 2p

7q 2q 2q ′  q ≥ 20/3 i.e. 2(q−2) . If p < q−4 , set p′ such that p1′ = q−4 > q−4 2q + α /2, α′ > 0 small enough and ρ′ = 2q + p2′ − 12 + α2 . Doing as above yields 3 ) = 2q + α for an arbitrary small α > 0. Note that this quantity min(1 − ρ, 2p is greater than 37 (1 − 2q ) + α (so in that case our exponent is not optimal). 2q , we proceed to no optimization in ρ. If p ≥ q−4

(c) This is a consequence of Borel-Cantelli’s Lemma by considering p > 7q q−2 . ♦ Now we conclude by proving Theorem 3.2. 41

c N , 0) Proof of Theorem 3.2. First we check using Proposition 5.3 that ρq (W and ρq (W, 0) are a.s. finite since they are integrable. Now we may apply Theorem 3.1 which yields the announced result. ♦ Acknowledgement: We thank A. Lejay for helpful discussions and comments about several versions of this work and F. Delarue and S. Menozzi for initiating our first “meeting” with rough path theory.

References [1] Billingsley P. (1968). Convergence of Probability Measure, Wiley Series in Probability and Mathematical Statistics, Wiley, New York, 253p. [2] Coutin L., Victoir N. (2009). Enhanced Gaussian Processes and Applications, ESAIM Probab. Stat., 13: 247–269. ´n C. (1988). The strong law of large num[3] Cuesta-Albertos J.A., Matra bers for k-means and best possible nets of Banach valued random variables, Probab. Theory Rel. Fields, 78:523–534. [4] Dereich S. (2008). The coding complexity of diffusion processes under Lp [0, 1]-norm distortion, pre-print, Stoch. proc. and their Appl., 118(6):938– 951. [5] Dereich S., Fehringer F., Matoussi A., Scheutzow M. (2003). On the link between small ball probabilities and the quantization problem for Gaussian measures on Banach spaces, J. Theoretical Probab., 16:249–265. [6] Friz P. (2005). Continuity of the Itˆ o-map for H¨ older rough paths with applications to the support theorem in H¨ older norm. Probability and partial differential equations in modern applied mathematics, 117-135, IMA Vol. Math. Appl., 140, Springer, New York. [7] Friz P., Victoir N. (2007). Differential Equations Driven by Gaussian Signals I (2007), available at arXiv:0707.0313. [8] Friz P., Victoir N. (2007). Differential Equations Driven by Gaussian Signals II (2007), available at arXiv:0711.0668. [9] Friz P., Victoir N. (2010). Multidimensional Stochastic Differential Equations as Rough Paths: Theory and Applications, Cambridge Studies in Advanced Mathematics, 670p. [10] Graf S., Luschgy H. (2000). Foundations of Quantization for Probability Distributions. Lect. Notes in Math. 1730, Springer, Berlin.

42

[11] Graf S., Luschgy H., Pag` es G. (2003). Functional quantization and small ball probabilities for Gaussian processes, J. Theoret. Probab, 16(4):1047–1062. [12] Graf S., Luschgy H., Pag` es G. (2007). Optimal quantizers for Radon random vectors in a Banach space, J. of Approximation Theory, 144:27–53. [13] Graf S., Luschgy H., Pag` es G. (2008). Distortion mismatch in the quantization of probability measures, ESAIM P&S, 12, 127-153. [14] Lejay A. (2003). An introduction to rough paths, S´eminaire de Probabilit´es YXXVII, Lecture Notes in Mathematics 1832:1–59. [15] Lejay A. (2009). Yet another introduction to rough paths, S´eminaire de Probabilit´es XLII, Lecture Notes in Mathematics 1979, 1–101. [16] Lejay A. (2009). On rough differential equations, Electronic Journal of Probability, 14(12):341–364. [17] Lejay A. (2010). Global solutions to rough differential equations with unbounded vector fields, pre-pub. Inria 00451193, 2010. [18] Luschgy H., Pag` es G. (2002). Functional quantization of stochastic processes, J. Funct. Anal., 196: 486–531. [19] Luschgy H., Pag` es G. (2004). Sharp asymptotics of the functional quantization problem for Gaussian processes, Ann. Probab. 32:1574–1599. [20] Luschgy H., Pag` es G. (2006). Functional quantization of a class of Brownian diffusions: a constructive approach, Stoch. Proc. and their Appl., 116:310336. [21] Luschgy H., Pag` es G. (2007). High resolution product quantization for Gaussian processes under sup-norm distortion, Bernoulli, 13(3):653–671. [22] Luschgy H., Pag` es G. (2008). Functional quantization and mean pathwise regularity with an application to L´evy processes, Annals of Applied Probability, 18(2):427–469. [23] Luschgy H., Pag` es G., Wilbertz B. (2008). Asymptotically optimal quantization schemes for Gaussian processes, ESAIM P&S, 12:127–153. [24] Lyons T. (1995). Interpretation and Solutions of ODE’s Driven by Rough signals, Proc. Symposia Pure, 1583. [25] Lyons T. (1998). Differential Equations driven by rough signals, Rev. Mat. Iberoamericana, 14(2): 215–310. ´vy T. (2007). Differential equations driven [26] Lyons T., Caruana M.J., Le by rough paths. Lect. Notes in Math. 1908. Notes from T. Lyons’s course at ´ Ecole d’´et´e de Saint-Flour (2004).

43

[27] Pag` es G., Printems J. (2005). Functional quantization for numerics with an application to option pricing, Monte Carlo Methods & Applications, 11(4): 407–446. [28] Pag` es G., Printems J. (2005). Website devoted to vector and functional optimal quantization: www.quantize.maths-fi.com. [29] Revuz D., Yor M. (1999). Continuous martingales and Brownian motion. Third edition. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 293, Springer-Verlag, Berlin, 602 p.

Appendix: Functional conditional expectation Let (Yt )t∈[0,T ] be a bi-measurable process defined on a probability space (Ω, A, P) such that Z T E(Yt2 )dt < +∞. 0

One can consider Y as a random variable Y : (Ω, A, P) → L2T := L2 ([0, T ], dt) and more precisely as an element of the Hilbert space n o L2L2 (Ω, A, P) := Y : (Ω, A, P) → L2T , E |Y |2L2 < +∞ T

where |f |2L2 = T

RT 0

T

f 2 (t)dt. For the sake of simplicity, one denotes kY k2 :=

q E |Y |2L2 . T

If B denotes a sub-σ-field of A (containing all P-negligible sets of A) then L2L2 (Ω, B, P) T

is a closed sub-space of L2L2 (Ω, A, P) and one can define the functional conditional T expectation of Y by E(Y | B) := Proj⊥ L2 (Ω,B,P) (Y ). L2 T

Functional conditional expectation can be extended to bi-measurable processes Y such that kY k1 := E |Y |L1T < +∞ following the approach used for Rd -valued random vectors. Then, E(Y | B) is characterized by: for every B([0, T ]) ⊗ B-bimeasurable process Z = (Zt )t∈[0,T ] , bounded by 1, E

Z

T

Zt Yt dt = E 0

Z

T 0

Zt E(Y | B)t dt.

In particular, owing to the Fubini theorem, this implies that as soon as the process (E(Yt | B))t∈[0,T ] has a B([0, T ])⊗B bi-measurable version, the functional conditional expectation could be defined by setting E(Y | B)t (ω) = E(Yt | B)(ω),

44

(ω, t) ∈ Ω × [0, T ].

Examples: (a) Let B := σ(NA , Bi , i ∈ I) where (Bi )i∈I is a finite measurable partition of Ω such that P(Bi ) > 0, i ∈ I.

(b) Let Y := (Wt )t∈[0,T ] a standard Brownian motion in Rd and let B := σ(Wt1 , . . . , Wtn ) where 0 = t0 < t1 < . . . < tn = T . Then ∀ t ∈ [tk , tk+1 ),

E(W | B)t = Wtk +

45

t − tk (Wtk+1 − Wtk ). tk+1 − tk