Integration by parts for point processes and Monte Carlo estimation

Nicolas PRIVAULT
Département de Mathématiques
Université de La Rochelle
17042 La Rochelle, France

Xiao WEI
School of Mathematics and Statistics
Wuhan University
430072 Hubei, P.R. China

December 22, 2005

Abstract

We develop an integration by parts technique for point processes, with application to the computation of sensitivities via Monte Carlo simulations in stochastic models with jumps. The method is applied to density estimation and to the construction of a modified kernel estimator which is less sensitive to variations of the bandwidth parameter than standard kernel estimators. Simulations are presented for a random functional of a log-normal renewal process in order to compare the performance of our modified estimator to standard kernel estimators.

AMS Classification: 60H07, 65C05, 62G07, 60G55, 60K15.
Keywords: Malliavin calculus, point processes, renewal processes, sensitivity analysis, density estimation, kernel estimators.

1 Introduction

Estimation techniques for the density $\varphi_F$ of a random variable $F$ from a random sample $\{F^{(k)}\}_{k=1,\dots,N}$ of $F$ have been introduced in [16], [13]. In [16], finite difference estimators of the form
\[
\varphi_F(y) \simeq \frac{1}{h}\, E\big[\mathbf{1}_{[-h/2,\,h/2]}(F-y)\big] \simeq \frac{1}{2Nh}\sum_{k=1}^{N} \mathbf{1}_{[-h,h]}\big(F^{(k)}-y\big), \qquad y \in \mathbb{R}_+, \tag{1.1}
\]

have been constructed, and extended in [13] to estimators of the form
\[
\varphi_F(y) \simeq \frac{1}{Nh}\sum_{k=1}^{N} K\Big(\frac{F^{(k)}-y}{h}\Big), \tag{1.2}
\]
where $K : \mathbb{R} \to \mathbb{R}_+$ is a kernel satisfying
\[
\int_{-\infty}^{\infty} K(x)\,dx = 1.
\]

The performance of kernel estimators depends on the choice of the bandwidth parameter $h$, whose optimal value is a function of the sample size $N$: it should decrease as $N$ increases. It has been known since [16] that the optimal rate of decrease in the mean square sense is $N^{-1/4}$ for the finite difference estimator, while in [13] optimal values of $h$ have been obtained for kernel estimators in terms of $N$ and $K$. On the other hand, integration by parts and related Malliavin calculus techniques can be used to represent the density $\varphi_F$ of $F$ as
\[
\varphi_F(y) = \frac{\partial}{\partial y} P(F \leq y) = E[W \mathbf{1}_{\{F \leq y\}}] = -E[W \mathbf{1}_{\{F > y\}}], \tag{1.3}
\]

under certain technical assumptions, cf. e.g. § 2.1 of [12] on the Wiener space, where $W$ is a random variable called a weight. This provides another way to estimate the density of $F$ by Monte Carlo methods: denoting by $\{F^{(k)}\}_{k=1,\dots,N}$ a random sample distributed according to the law of $F$, we have
\[
\varphi_F(y) \simeq \frac{1}{N}\sum_{k=1}^{N} W^{(k)} \mathbf{1}_{\{F^{(k)} \leq y\}}, \tag{1.4}
\]

where $\{W^{(k)}\}_{k=1,\dots,N}$ denotes the corresponding sample of $W$. The interest of (1.4), compared to kernel estimators, is that it does not depend on the value of a bandwidth parameter.
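For concreteness, (1.4) amounts to averaging weighted indicators over the sample; a minimal sketch in Python, where the arrays `F` and `W` are hypothetical placeholders for a sample of $F$ and the corresponding sample of weights:

```python
import numpy as np

def malliavin_density_estimate(F, W, y):
    """Estimate phi_F(y) by the weighted indicator average (1.4):
    phi_F(y) ~ (1/N) sum_k W^(k) 1{F^(k) <= y}.  No bandwidth h appears."""
    F, W = np.asarray(F), np.asarray(W)
    return np.mean(W * (F <= y))
```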

More generally, the Malliavin calculus has been applied to the sensitivity analysis of continuous financial markets, cf. [9], [8], to express derivatives of the form $\frac{\partial}{\partial\zeta} E[f(F_\zeta)]$, where $(F_\zeta)$ is a family of random variables depending on a parameter $\zeta \in \mathbb{R}$, as
\[
\frac{\partial}{\partial\zeta} E[f(F_\zeta)] = E[W_\zeta f(F_\zeta)]. \tag{1.5}
\]
Here, $W_\zeta$ is a weight independent of the function $f$, which need not be differentiable: in particular the density estimation (1.4) corresponds to $f = \mathbf{1}_{(-\infty,0)}$ and $F_y = F - y$,

with $W$ independent of $y$. Integration by parts techniques have also been applied to financial models with Poisson jumps in [5], [6], [10], [1], and to insurance in [15]. The last two of these works rely on a version of the Malliavin calculus with jumps developed in [2], [7], [14]. Note that in mathematical finance, each value of the bandwidth parameter $h$ in the finite difference
\[
\frac{1}{2h}\, E[f(F_{\zeta+h}) - f(F_{\zeta-h})]
\]
yields a different estimate of the corresponding sensitivity (also called a "Greek"), see e.g. [4], p. 40, whereas (1.5) is again independent of a bandwidth parameter.

In Proposition 3.4 below we derive a general integration by parts formula for point processes, extending the results obtained in the Poisson case in [2], [7], [14], [10], [15], with potential applications to sensitivity analysis and density estimation for stochastic models in finance, insurance and engineering. Using this integration by parts formula we obtain an expression of the form (1.3)-(1.4):
\[
\varphi_F(y) = E[W \mathbf{1}_{\{F \leq y\}}] \simeq \frac{1}{N}\sum_{k=1}^{N} W^{(k)} \mathbf{1}_{\{F^{(k)} \leq y\}}, \tag{1.6}
\]

for the density of a random functional $F$ of a point process. It turns out that the performance of the corresponding estimator (1.6) decreases for small values of $y$, for which $W$ has a large variance. This problem is tackled by a localization procedure, mixing (1.6) with a standard kernel estimate:
\begin{align*}
\varphi_F(y) &= -E\Big[W f\Big(\frac{F-y}{h}\Big)\Big] + \frac{1}{h}\, E\Big[K\Big(\frac{F-y}{h}\Big)\Big] \\
&\simeq -\frac{1}{N}\sum_{k=1}^{N} W^{(k)} f\Big(\frac{F^{(k)}-y}{h}\Big) + \frac{1}{Nh}\sum_{k=1}^{N} K\Big(\frac{F^{(k)}-y}{h}\Big), \tag{1.7}
\end{align*}

where $K$ is a kernel supported in $[0,\infty)$ and
\[
f(x) = \mathbf{1}_{[0,\infty)}(x)\Big(1 - \int_0^x K(y)\,dy\Big), \qquad x \in \mathbb{R}.
\]

As shown in Section 5.3, this estimator combines the advantages of Malliavin type estimators (1.6) and kernel estimators (1.2), in that it is only mildly sensitive to the value of the bandwidth parameter $h$, while at the same time it does not present the above mentioned variance problem. In fact, (1.7) recovers, with a simple proof, an analog of Theorem 2.1 proved in [11] on the Wiener space. The optimization results of [11] in terms of the kernel $K$ and bandwidth parameter $h$ also apply here and are used in numerical simulations, cf. Figures 6 and 7.

We proceed as follows. In Section 2 we review some properties of point processes, and in Section 3 we establish the integration by parts formula (Proposition 3.4) which will be our main tool for density estimation. In Section 4 we present an application of the integration by parts formula to the computation of sensitivities. Simulations and comparisons of different methods for density estimation are presented in Section 5. As an example we will consider the functional
\[
F_r = \int_0^T e^{-rt}\, dX_t,
\]

where
\[
X_t = \sum_{i=1}^{N_t} Y_i, \qquad t \in \mathbb{R}_+,
\]
is a jump process with random marks $(Y_k)_{k\geq 1}$ independent of the point process $(N_t)_{t\in\mathbb{R}_+}$. Such functionals can be used to express risk reserve processes for insurance portfolios in which the accumulated amount of claims occurring in the time interval $(0,t]$ is given by $X_t$, cf. e.g. [15].
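As an illustration, one sample of $F_r$ can be simulated by drawing inter-jump times and marks directly. A minimal sketch, assuming the log-normal inter-jump times used later in Section 5; the law of the marks $Y_i$ is not fixed by the text, so the log-normal choice below is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_F_r(r, T=5.0, sigma=0.3):
    """Draw one sample of F_r = int_0^T e^{-rt} dX_t = sum_{T_k <= T} Y_k e^{-r T_k},
    where (N_t) is a renewal process with log-normal inter-jump times
    tau_k = e^{sigma xi_k} (cf. Sections 2 and 5)."""
    t, F = 0.0, 0.0
    while True:
        t += np.exp(sigma * rng.standard_normal())  # next jump time T_k
        if t > T:
            break                                   # no further jumps in [0, T]
        Y = rng.lognormal()                         # illustrative mark distribution
        F += Y * np.exp(-r * t)                     # contribution Y_k e^{-r T_k}
    return F

print(sample_F_r(r=0.2))
```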

2 Point processes

Let
\[
N_t = \sum_{k=1}^{\infty} \mathbf{1}_{[T_k,\infty)}(t), \qquad t \in \mathbb{R}_+, \tag{2.1}
\]

be a point process with increasing sequence of jump times $(T_k)_{k\geq 1}$ on a probability space $(\Omega, \mathcal{F}, P)$. Set $T_0 = 0$ and denote the inter-jump times of $(N_t)_{t\in\mathbb{R}_+}$ by $\tau_k := T_k - T_{k-1}$, $k \geq 1$. Given $T > 0$, consider $F$ in the algebra $\mathcal{S}_T$ of functionals of the form

\[
F = f_0\, \mathbf{1}_{\{N_T = 0\}} + \sum_{n=1}^{m} \mathbf{1}_{\{N_T = n\}}\, f_n(T_1, \dots, T_n), \qquad m \geq 1, \tag{2.2}
\]

where $f_0 \in \mathbb{R}$ and $f_n$ is symmetric in $n$ variables and continuously differentiable on
\[
\Delta_n^T = \{0 \leq t_1 < t_2 < \dots < t_n \leq T\}, \qquad 1 \leq n \leq m, \quad T > 0.
\]
Denoting by $(\mathcal{F}_t)_{t\in\mathbb{R}_+}$ the filtration generated by $(N_t)_{t\in\mathbb{R}_+}$, we have that $\mathcal{S}_T$ is dense in $L^2(\Omega, \mathcal{F}_t)$, $t \in \mathbb{R}_+$. The expectation of $F$ has the form
\[
E[F] = j_{T,0}\, f_0 + \sum_{n=1}^{m} \frac{1}{n!} \int_0^T \cdots \int_0^T f_n(t_1,\dots,t_n)\, j_{T,n}(t_1,\dots,t_n)\, dt_1 \cdots dt_n, \tag{2.3}
\]

where the $j_{T,n} : \mathbb{R}_+^n \to \mathbb{R}_+$, $n \geq 1$, are nonnegative symmetric functions on $[0,T]^n$ called the Janossy densities, and $j_{T,0} \in \mathbb{R}_+$, cf. [17], §5.3 of [3], and references therein. In other terms we have
\[
P(T_1 \in dt_1, \dots, T_n \in dt_n,\, N_T = n) = j_{T,n}(t_1,\dots,t_n)\, dt_1 \cdots dt_n,
\]
$0 \leq t_1 < t_2 < \dots < t_n \leq T$. We turn to some examples of point processes and their Janossy densities.

Poisson processes

In the case of a Poisson process with arbitrary deterministic intensity $\lambda(t)$ we have
\[
j_{T,n}(t_1,\dots,t_n) = \lambda(t_1)\cdots\lambda(t_n)\exp\Big(-\int_0^T \lambda(t)\,dt\Big),
\]
i.e. for the standard Poisson process with intensity $\lambda > 0$ we have
\[
j_{T,n}(t_1,\dots,t_n) = \lambda^n e^{-\lambda T}, \qquad t_1,\dots,t_n \in [0,T].
\]

Renewal processes

A point process $(N_t)_{t\in\mathbb{R}_+}$ as in (2.1) is called a renewal process with inter-occurrence time distribution $Z(x)$ if the random variables $\tau_k = T_k - T_{k-1}$, $k \geq 1$, are independent and identically distributed with $P(\tau_k \leq x) = Z(x)$ and density $z(x)$, $x \in \mathbb{R}_+$, $k \geq 1$. Since the sequence $(\tau_k)_{k\geq 1}$ is i.i.d., for $0 \leq t_1 < t_2 < \dots < t_n \leq T$ we have
\begin{align*}
P(T_1 \in dt_1, \dots, T_n \in dt_n,\, N_T = n)
&= P(\tau_1 \in dt_1,\, t_1 + \tau_2 \in dt_2, \dots, t_{n-1} + \tau_n \in dt_n,\, \tau_{n+1} > T - t_n) \\
&= z(t_1)\, z(t_2 - t_1) \cdots z(t_n - t_{n-1})\,(1 - Z(T - t_n))\, dt_1 \cdots dt_n,
\end{align*}
hence the Janossy densities $j_{T,n}(t_1,\dots,t_n)$ are given by
\[
j_{T,n}(t_1,\dots,t_n) = z(t_1)\, z(t_2 - t_1) \cdots z(t_n - t_{n-1}) \int_{T - t_n}^{\infty} z(s)\,ds, \tag{2.4}
\]

for $(t_1,\dots,t_n) \in \Delta_n^T$. The value of $j_{T,n}(t_1,\dots,t_n)$ on $(t_1,\dots,t_n) \in [0,T]^n$ is obtained by symmetrization:
\[
j_{T,n}(t_1,\dots,t_n) = j_{T,n}(t_{(1)},\dots,t_{(n)}), \qquad t_1,\dots,t_n \in [0,T],
\]
where $(t_{(1)},\dots,t_{(n)})$ denotes the sequence $(t_1,\dots,t_n)$ in ascending order, see §5.3 of [3].
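A direct transcription of (2.4), including the symmetrization step, might look as follows; the log-normal inter-occurrence law anticipates the example of Section 3, and `scipy` is an assumed dependency:

```python
import numpy as np
from scipy.stats import lognorm

def janossy_density(ts, T, z, Z):
    """Janossy density j_{T,n}(t_1,...,t_n) of a renewal process, formula (2.4):
    the ordered jump times contribute z(t_1) z(t_2 - t_1) ... z(t_n - t_{n-1}),
    and the factor 1 - Z(T - t_n) accounts for no further jump before T."""
    ts = np.sort(np.asarray(ts))                     # symmetrization: order the times
    gaps = np.diff(np.concatenate(([0.0], ts)))      # inter-occurrence times
    return np.prod(z(gaps)) * (1.0 - Z(T - ts[-1]))

# Example: log-normal inter-occurrence times with sigma = 0.3 (scipy's lognorm
# with s = sigma, scale = 1 has density e^{-(log x)^2/(2 sigma^2)}/(sigma x sqrt(2 pi))).
sigma = 0.3
z = lambda x: lognorm.pdf(x, s=sigma)
Z = lambda x: lognorm.cdf(x, s=sigma)
print(janossy_density([0.8, 2.1, 3.0], T=5.0, z=z, Z=Z))
```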


3 Integration by parts

Definition 3.1. Given $w \in C^1([0,T])$, let $D_w$ denote the gradient operator defined on $F \in \mathcal{S}_T$ of the form (2.2) by
\[
D_w F = -\sum_{n=1}^{m} \mathbf{1}_{\{N_T = n\}} \sum_{k=1}^{n} w(T_k)\, \frac{\partial f_n}{\partial t_k}(T_1,\dots,T_n).
\]

The next lemma is the core of our integration by parts formula. It extends results of [2], [15] to the setting of point processes. Let $C_0^1([0,T])$ denote the space of functions $w \in C^1([0,T])$ such that $w(0) = w(T) = 0$. In the sequel we assume that $j_{T,n} \in C^1(\Delta_n^T)$, $n \geq 1$.

Lemma 3.2. Let $w \in C_0^1([0,T])$. For $F \in \mathcal{S}_T$ of the form (2.2),
\[
E[D_w F] = E\Big[F\Big(\int_0^T w'(t)\,dN_t - D_w \log j_{T,N_T}(T_1,\dots,T_{N_T})\Big)\Big].
\]

Proof. We have
\begin{align*}
E[D_w F] &= -\sum_{n=1}^{m} \frac{1}{n!} \int_0^T \cdots \int_0^T \sum_{k=1}^{n} w(t_k)\,\frac{\partial f_n}{\partial t_k}(t_1,\dots,t_n)\, j_{T,n}(t_1,\dots,t_n)\, dt_1\cdots dt_n \\
&= \sum_{n=1}^{m} \frac{1}{n!} \int_0^T \cdots \int_0^T f_n(t_1,\dots,t_n) \sum_{k=1}^{n} \frac{\partial}{\partial t_k}\big(w(t_k)\, j_{T,n}(t_1,\dots,t_n)\big)\, dt_1\cdots dt_n \\
&= \sum_{n=1}^{m} \frac{1}{n!} \int_0^T \cdots \int_0^T f_n(t_1,\dots,t_n)\, j_{T,n}(t_1,\dots,t_n) \sum_{k=1}^{n} w'(t_k)\, dt_1\cdots dt_n \\
&\quad + \sum_{n=1}^{m} \frac{1}{n!} \int_0^T \cdots \int_0^T f_n(t_1,\dots,t_n) \sum_{k=1}^{n} w(t_k)\,\frac{\partial j_{T,n}}{\partial t_k}(t_1,\dots,t_n)\, dt_1\cdots dt_n \\
&= E\Big[F\Big(\int_0^T w'(t)\,dN_t - D_w \log j_{T,N_T}(T_1,\dots,T_{N_T})\Big)\Big]. \qquad\square
\end{align*}

Next, we state the definition of the divergence operator.

Definition 3.3. Given $w \in C_0^1([0,T])$ and $F \in \mathrm{Dom}(D_w)$, let
\[
D_w^* F = F \int_0^T w'(t)\,dN_t - F\, D_w \log \big|F\, j_{T,N_T}(T_1,\dots,T_{N_T})\big|. \tag{3.1}
\]
The domain $\mathrm{Dom}(D_w)$, resp. $\mathrm{Dom}(D_w^*)$, of $D_w$, resp. $D_w^*$, is the set of functionals $F \in L^2(\Omega, \mathcal{F}_T)$ for which there exists a sequence $(F_n)_{n\in\mathbb{N}}$ in $\mathcal{S}_T$ converging to $F$ in $L^2(\Omega, \mathcal{F}_T)$ and such that $(D_w F_n)_{n\in\mathbb{N}}$, resp. $(D_w^* F_n)_{n\in\mathbb{N}}$, converges in $L^2(\Omega, \mathcal{F}_T)$.

Proposition 3.4. Let $w \in C_0^1([0,T])$. The operators $D_w$ and $D_w^*$ can be extended to their respective domains $\mathrm{Dom}(D_w)$, $\mathrm{Dom}(D_w^*)$, and they satisfy the duality relation
\[
E[G\, D_w F] = E[F\, D_w^* G], \qquad F, G \in \mathcal{S}_T. \tag{3.2}
\]

Proof. From Lemma 3.2 and Definition 3.3 we have
\begin{align*}
E[G\, D_w F] &= E[D_w(FG) - F\, D_w G] \\
&= E\Big[F\Big(G\int_0^T w'(t)\,dN_t - G\, D_w \log j_{T,N_T}(T_1,\dots,T_{N_T}) - D_w G\Big)\Big] = E[F\, D_w^* G],
\end{align*}
which proves (3.2). Let now $(F_n)_{n\in\mathbb{N}}$, $(G_n)_{n\in\mathbb{N}}$ be two sequences in $\mathcal{S}_T$ converging to the same $F$ in $L^2(\Omega,\mathcal{F}_T)$, and such that $(D_w F_n)_{n\in\mathbb{N}}$ and $(D_w G_n)_{n\in\mathbb{N}}$ have limits denoted by $U$ and $V$ in $L^2(\Omega,\mathcal{F}_T)$. In this case, letting $(H_m)_{m\in\mathbb{N}} \subset \mathcal{S}_T$ be a sequence converging to $U - V$, we have
\begin{align*}
\|U - V\|^2_{L^2(\Omega,\mathcal{F}_T)} &= \lim_{m\to\infty}\lim_{n\to\infty} \langle D_w F_n - D_w G_n,\, H_m \rangle_{L^2(\Omega,\mathcal{F}_T)} \\
&= \lim_{m\to\infty}\lim_{n\to\infty} \langle F_n - G_n,\, D_w^* H_m \rangle_{L^2(\Omega,\mathcal{F}_T)} \\
&\leq \lim_{m\to\infty}\lim_{n\to\infty} \|F_n - G_n\|_{L^2(\Omega,\mathcal{F}_T)}\, \|D_w^* H_m\|_{L^2(\Omega,\mathcal{F}_T)} = 0,
\end{align*}
hence $U = V$, $P$-a.s. This shows that for $F \in \mathrm{Dom}(D_w)$ we may define
\[
D_w F = \lim_{n\to\infty} D_w F_n,
\]
whenever the limit exists in $L^2(\Omega,\mathcal{F}_T)$, for any sequence $(F_n)_{n\in\mathbb{N}}$ converging to $F$ in $L^2(\Omega,\mathcal{F}_T)$. The same argument applies to $D_w^*$, and as a consequence the duality formula (3.2) extends to $F \in \mathrm{Dom}(D_w)$, $G \in \mathrm{Dom}(D_w^*)$. $\square$

We now turn to the calculation of $D_w \log j_{T,N_T}(T_1,\dots,T_{N_T})$ for particular examples of point processes.

Poisson processes

In the case of a Poisson process with arbitrary deterministic intensity $\lambda \in C_b^1(\mathbb{R}_+)$ we have
\[
\log j_{T,N_T}(T_1,\dots,T_{N_T}) = \int_0^T \log \lambda(t)\,dN_t - \int_0^T \lambda(t)\,dt,
\]
and
\[
D_w \log j_{T,N_T}(T_1,\dots,T_{N_T}) = -\int_0^T w(t)\,\frac{\lambda'(t)}{\lambda(t)}\,dN_t.
\]

Renewal processes

In this case, (2.4) yields:

\begin{align*}
& D_w \log j_{T,N_T}(T_1,\dots,T_{N_T}) \\
&= -\sum_{k=1}^{N_T} w(T_k)\,\frac{z'(T_k - T_{k-1})}{z(T_k - T_{k-1})} + \sum_{k=1}^{N_T - 1} w(T_k)\,\frac{z'(T_{k+1} - T_k)}{z(T_{k+1} - T_k)} - \frac{w(T_{N_T})\, z(T - T_{N_T})}{1 - Z(T - T_{N_T})} \\
&= \int_0^T w(t)\Big(\frac{z'(T_{N_t+1} - T_{N_t})}{z(T_{N_t+1} - T_{N_t})} - \frac{z'(T_{N_t} - T_{N_t-1})}{z(T_{N_t} - T_{N_t-1})}\Big)\,dN_t \\
&\quad - w(T_{N_T})\,\frac{z'(T_{N_T+1} - T_{N_T})}{z(T_{N_T+1} - T_{N_T})} - \frac{w(T_{N_T})\, z(T - T_{N_T})}{1 - Z(T - T_{N_T})} \\
&= \int_0^T \big(w(t - \tau_{N_t}) - w(t)\big)\,\frac{z'(T_{N_t} - T_{N_t-1})}{z(T_{N_t} - T_{N_t-1})}\,dN_t - \frac{w(T_{N_T})\, z(T - T_{N_T})}{1 - Z(T - T_{N_T})}.
\end{align*}

Log-normal renewal process

In this example the inter-arrival times are independent and identically distributed according to the log-normal distribution with parameter $\sigma > 0$, i.e.
\[
z(x) = \frac{e^{-(\log x)^2/(2\sigma^2)}}{\sigma x \sqrt{2\pi}}, \qquad x > 0.
\]

In other terms $T_k - T_{k-1} = e^{\sigma \xi_k}$, where $(\xi_k)_{k\geq 1}$ is an i.i.d. sequence of standard Gaussian random variables, and
\begin{align*}
& D_w \log j_{T,N_T}(T_1,\dots,T_{N_T}) \\
&= \sum_{k=1}^{N_T} \frac{w(T_k)}{T_k - T_{k-1}}\Big(1 + \frac{\log(T_k - T_{k-1})}{\sigma^2}\Big) - \sum_{k=1}^{N_T-1} \frac{w(T_k)}{T_{k+1} - T_k}\Big(1 + \frac{\log(T_{k+1} - T_k)}{\sigma^2}\Big) \\
&\quad - \frac{w(T_{N_T})\, e^{-(\log(T - T_{N_T}))^2/(2\sigma^2)}}{\sigma\sqrt{2\pi}\,(T - T_{N_T})(1 - Z(T - T_{N_T}))} \\
&= \sum_{k=1}^{N_T} \frac{w(T_k)}{T_k - T_{k-1}}\,(1 + \sigma^{-1}\xi_k) - \sum_{k=1}^{N_T-1} \frac{w(T_k)}{T_{k+1} - T_k}\,(1 + \sigma^{-1}\xi_{k+1}) - \frac{w(T_{N_T})\, e^{-(\log(T - T_{N_T}))^2/(2\sigma^2)}}{\sigma\sqrt{2\pi}\,(T - T_{N_T})(1 - Z(T - T_{N_T}))} \\
&= -\frac{w(T_{N_T})\, e^{-(\log(T - T_{N_T}))^2/(2\sigma^2)}}{\sigma\sqrt{2\pi}\,(T - T_{N_T})(1 - Z(T - T_{N_T}))} + \sum_{k=1}^{N_T} \big(w(T_k) - w(T_{k-1})\big)\,\frac{1 + \sigma^{-1}\xi_k}{T_k - T_{k-1}} \\
&= -\frac{w(T_{N_T})\, e^{-(\log(T - T_{N_T}))^2/(2\sigma^2)}}{\sigma\sqrt{2\pi}\,(T - T_{N_T})(1 - Z(T - T_{N_T}))} + \int_0^{T_{N_T}} w'(s)\,\frac{1 + \sigma^{-1}\xi_{1+N_s}}{\tau_{1+N_s}}\,ds.
\end{align*}

In the simulations of Section 5 we will simply take $w(t) = t(T - t)$, $t \in [0,T]$. In this case we have
\begin{align*}
& \int_0^T w'(t)\,dN_t - D_w \log j_{T,N_T}(T_1,\dots,T_{N_T}) \\
&= \sum_{k=1}^{N_T} (T - 2T_k) + \frac{T_{N_T}\, e^{-(\log(T - T_{N_T}))^2/(2\sigma^2)}}{(1 - Z(T - T_{N_T}))\,\sigma\sqrt{2\pi}} - \sum_{k=1}^{N_T} (T - T_k - T_{k-1})(1 + \sigma^{-1}\xi_k) \\
&= \Big(\frac{e^{-(\log(T - T_{N_T}))^2/(2\sigma^2)}}{(1 - Z(T - T_{N_T}))\,\sigma\sqrt{2\pi}} - 1\Big) T_{N_T} - \sigma^{-1}\sum_{k=1}^{N_T} (T - T_k - T_{k-1})\,\xi_k.
\end{align*}
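The last expression transcribes directly into code; a sketch assuming at least one jump occurs before $T$ (the function name is hypothetical):

```python
import numpy as np
from scipy.stats import norm

def weight_term(Tks, xis, T, sigma):
    """int_0^T w'(t) dN_t - D_w log j_{T,N_T} for w(t) = t(T - t), in the
    closed form derived above for the log-normal renewal process.
    Tks: jump times 0 < T_1 < ... < T_{N_T} <= T; xis: Gaussians with tau_k = e^{sigma xi_k}."""
    Tks, xis = np.asarray(Tks), np.asarray(xis)
    u = T - Tks[-1]                               # time elapsed since the last jump
    tail = 1.0 - norm.cdf(np.log(u) / sigma)      # 1 - Z(u) for the log-normal law
    boundary = np.exp(-np.log(u) ** 2 / (2 * sigma ** 2)) / (tail * sigma * np.sqrt(2 * np.pi))
    Tprev = np.concatenate(([0.0], Tks[:-1]))     # T_0 = 0, T_1, ..., T_{N_T - 1}
    return (boundary - 1.0) * Tks[-1] - np.sum((T - Tks - Tprev) * xis) / sigma
```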


4 Sensitivity analysis

Let $I = (a,b)$ be an open interval of $\mathbb{R}$ and consider the derivative
\[
\frac{\partial}{\partial\zeta} E[f(F_\zeta)] = E\Big[f'(F_\zeta)\,\frac{\partial F_\zeta}{\partial\zeta}\Big], \qquad \zeta \in (a,b). \tag{4.1}
\]
This expression can be approximated by finite differences as
\[
\frac{1}{2h}\, E[f(F_{\zeta+h}) - f(F_{\zeta-h})], \tag{4.2}
\]

while (4.1) fails when $f$ is not differentiable, e.g. when $f = \mathbf{1}_{[0,\infty)}$. In Proposition 4.1 below we show that
\[
\frac{\partial}{\partial\zeta} E[\mathbf{1}_A f(F_\zeta)] = E[\mathbf{1}_A W_\zeta f(F_\zeta)], \tag{4.3}
\]
provided $F_\zeta$ is sufficiently smooth to be in the domain of $D_w$, with $D_w F_\zeta \neq 0$ a.s. on $A$, and where the random variable $W_\zeta$ is a weight independent of the function $f$. The application of this formula to numerical simulations will be compared to kernel estimates in Section 5.

Proposition 4.1. Given $a, b \in \mathbb{R}$, $a < b$, let $(F_\zeta)_{\zeta\in(a,b)}$ be a family of random functionals, continuously differentiable in $\mathrm{Dom}(D_w)$ in the parameter $\zeta \in (a,b)$ and such that $D_w F_\zeta \in \mathrm{Dom}(D_w)$, $\zeta \in (a,b)$. Let $w \in C_0^1([0,T])$, and let $A \in \mathcal{F}_T$ be such that $\mathbf{1}_A \in \mathrm{Dom}(D_w)$ and $D_w \mathbf{1}_A = 0$ a.s., with
\[
D_w F_\zeta \neq 0 \quad \text{a.s. on } A, \qquad \zeta \in (a,b).
\]
Let $f : \mathbb{R} \to \mathbb{R}$ be such that $f(F_\zeta) \in L^2(\Omega, \mathcal{F}_T)$, $\zeta \in (a,b)$. We have
\[
\frac{\partial}{\partial\zeta} E[f(F_\zeta) \mid A] = E[W_\zeta f(F_\zeta) \mid A], \qquad \zeta \in (a,b), \tag{4.4}
\]

where the weight
\[
W_\zeta = \frac{\partial_\zeta F_\zeta}{D_w F_\zeta}\Big(\int_0^T w'(t)\,dN_t - D_w \log\big|\partial_\zeta F_\zeta\, j_{T,N_T}(T_1,\dots,T_{N_T})\big| + \frac{D_w D_w F_\zeta}{D_w F_\zeta}\Big)
\]
is assumed to belong to $L^1(A)$.

Proof. Assuming that $f \in C_b^\infty(\mathbb{R})$, we have from Proposition 3.4:
\begin{align*}
\frac{\partial}{\partial\zeta} E[\mathbf{1}_A f(F_\zeta)]
&= E\Big[\mathbf{1}_A\, f'(F_\zeta)\,\frac{\partial F_\zeta}{\partial\zeta}\Big] \\
&= E\Big[\mathbf{1}_A\, \frac{\partial_\zeta F_\zeta}{D_w F_\zeta}\, D_w(f(F_\zeta))\Big] \\
&= E\Big[f(F_\zeta)\, D_w^*\Big(\mathbf{1}_A\, \frac{\partial_\zeta F_\zeta}{D_w F_\zeta}\Big)\Big].
\end{align*}
The weight $D_w^*(\mathbf{1}_A\, \partial_\zeta F_\zeta / D_w F_\zeta)$ is computed using stochastic integrals as
\begin{align*}
D_w^*\Big(\mathbf{1}_A\, \frac{\partial_\zeta F_\zeta}{D_w F_\zeta}\Big)
&= \mathbf{1}_A\, \frac{\partial_\zeta F_\zeta}{D_w F_\zeta}\Big(\int_0^T w'(t)\,dN_t - D_w \log j_{T,N_T}(T_1,\dots,T_{N_T})\Big) - D_w\Big(\mathbf{1}_A\, \frac{\partial_\zeta F_\zeta}{D_w F_\zeta}\Big) \\
&= \mathbf{1}_A\, \frac{\partial_\zeta F_\zeta}{D_w F_\zeta}\Big(\int_0^T w'(t)\,dN_t - D_w \log\big|\partial_\zeta F_\zeta\, j_{T,N_T}(T_1,\dots,T_{N_T})\big| + \frac{D_w D_w F_\zeta}{D_w F_\zeta}\Big) \\
&= \mathbf{1}_A\, W_\zeta,
\end{align*}
since $D_w \mathbf{1}_A = 0$. The extension to square-integrable $f$ is obtained as in [9], [10]. $\square$



In applications of the above proposition we will take $A = \{N_T \neq 0\}$. In the case of Poisson processes with arbitrary deterministic intensity $\lambda \in C^1(\mathbb{R}_+)$, the weight $W_\zeta$ is given by
\[
W_\zeta = \frac{\partial_\zeta F_\zeta}{D_w F_\zeta}\Big(\int_0^T w'(t)\,dN_t + \int_0^T w(t)\,\frac{\lambda'(t)}{\lambda(t)}\,dN_t + \frac{D_w D_w F_\zeta}{D_w F_\zeta}\Big) - \frac{D_w \partial_\zeta F_\zeta}{D_w F_\zeta}.
\]

For general point processes and $F_y = F - y$ we have
\[
W = -\frac{1}{D_w F}\Big(\int_0^T w'(t)\,dN_t - D_w \log j_{T,N_T}(T_1,\dots,T_{N_T}) + \frac{D_w D_w F}{D_w F}\Big),
\]
hence
\begin{align*}
& \frac{\partial}{\partial y} E[\mathbf{1}_{\{N_T \neq 0\}}\, f(F - y)] \\
&= -E\Big[\mathbf{1}_{\{N_T \neq 0\}}\, \frac{f(F - y)}{D_w F}\Big(\int_0^T w'(t)\,dN_t - D_w \log j_{T,N_T}(T_1,\dots,T_{N_T}) + \frac{D_w D_w F}{D_w F}\Big)\Big].
\end{align*}
More generally we will be able to consider point processes of the form
\[
X_t = \sum_{i=1}^{N_t} Y_i, \qquad t \in \mathbb{R}_+,
\]

where $(Y_i)_{i\geq 1}$ is a sequence of marks independent of $(N_t)_{t\in\mathbb{R}_+}$. Consider for example the functional
\[
F_r = \int_0^T e^{-rt}\,dX_t.
\]
Since the gradient operator $D_w$ does not act on the $Y_i$, $i \in \mathbb{N}$, these random variables may be considered as constants in the integration by parts formula (3.2). We have
\[
D_w F_r = -r \int_0^T w(t)\, e^{-rt}\,dX_t,
\]

and
\[
D_w D_w F_r = r \int_0^T w(t)\big(w'(t) - r\, w(t)\big)e^{-rt}\,dX_t.
\]
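Since $X_t$ jumps by $Y_k$ at time $T_k$, both stochastic integrals reduce to finite sums over the jump times in $[0,T]$. A sketch of this evaluation, using the choice $w(t) = t(T - t)$ made in Section 3 (the function name is hypothetical):

```python
import numpy as np

def gradients_F_r(Tks, Ys, r, T):
    """Evaluate D_w F_r and D_w D_w F_r for w(t) = t(T - t), writing the
    integrals against dX_t as finite sums over the jump times:
        D_w F_r     = -r sum_k w(T_k) e^{-r T_k} Y_k
        D_w D_w F_r =  r sum_k w(T_k)(w'(T_k) - r w(T_k)) e^{-r T_k} Y_k."""
    Tks, Ys = np.asarray(Tks), np.asarray(Ys)
    w  = Tks * (T - Tks)          # w(T_k)
    wp = T - 2.0 * Tks            # w'(T_k)
    e  = np.exp(-r * Tks)
    DwF   = -r * np.sum(w * e * Ys)
    DwDwF =  r * np.sum(w * (wp - r * w) * e * Ys)
    return DwF, DwDwF
```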

As an example we compute the weight $W_r$ corresponding to the sensitivity
\[
\frac{\partial}{\partial r} E[\mathbf{1}_{\{N_T \neq 0\}}\, f(F_r)] = E[\mathbf{1}_{\{N_T \neq 0\}}\, W_r f(F_r)] \tag{4.5}
\]
with respect to the parameter $r > 0$. We have
\[
\partial_r F_r = -\int_0^T t\, e^{-rt}\,dX_t
\]
and
\[
D_w \partial_r F_r = -\int_0^T w(t)\, e^{-rt}(1 - rt)\,dX_t,
\]
hence
\begin{align*}
W_r &= -\frac{1}{r} + \frac{\int_0^T w(t)\, t\, e^{-rt}\,dX_t}{\int_0^T w(t)\, e^{-rt}\,dX_t} \\
&\quad + \frac{\int_0^T t\, e^{-rt}\,dX_t}{r \int_0^T w(t)\, e^{-rt}\,dX_t}\Big(\frac{\int_0^T w(t)\big(r\, w(t) - w'(t)\big)e^{-rt}\,dX_t}{\int_0^T w(t)\, e^{-rt}\,dX_t} + \int_0^T w'(t)\,dN_t - D_w \log j_{T,N_T}(T_1,\dots,T_{N_T})\Big).
\end{align*}

5 Density estimation

In this section we are especially interested in derivatives with respect to $y$ of expectations of the form $E[f(F - y)]$ with $f = \mathbf{1}_{(0,\infty)}$, which yield the probability density $\varphi_F$ of $F$ as
\[
\varphi_F(y) = -\frac{d}{dy}\, E[\mathbf{1}_{[0,\infty)}(F - y)], \qquad y \in \mathbb{R}.
\]
Our results are illustrated by Monte Carlo density estimations with 10000 samples for the random variable
\[
F_r = \exp\big((1+r)^2 - 1\big) \int_0^T e^{-rt}\,dN_t,
\]
where $(N_t)_{t\in\mathbb{R}_+}$ is a log-normal renewal process and $T = 5$, $\sigma = 0.3$.

5.1 Kernel estimators

The density $\varphi_F$ can be estimated using a kernel $K$, as
\[
\varphi_F(y) \simeq \frac{1}{h}\, E\Big[K\Big(\frac{F-y}{h}\Big)\Big] \simeq \frac{1}{Nh}\sum_{k=1}^{N} K\Big(\frac{F^{(k)}-y}{h}\Big), \tag{5.1}
\]
where $K$ is a continuous positive function such that
\[
\int_{-\infty}^{\infty} K(x)\,dx = 1.
\]
In Figure 1 we compare several kernel estimators, with
\[
K(x) = \frac{\pi}{2}\,\mathbf{1}_{[-1/2,1/2]}(x)\cos(\pi x),
\]
and $h = 1, 0.1, 0.01$.
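As a sketch, (5.1) with this cosine kernel can be evaluated as follows:

```python
import numpy as np

def kernel_density_estimate(samples, y, h):
    """Standard kernel estimator (5.1) with the cosine kernel
    K(x) = (pi/2) 1_{[-1/2,1/2]}(x) cos(pi x)."""
    x = (np.asarray(samples)[:, None] - np.atleast_1d(y)[None, :]) / h
    K = np.where(np.abs(x) <= 0.5, (np.pi / 2) * np.cos(np.pi * x), 0.0)
    return K.mean(axis=0) / h
```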

[Figure 1: Kernel estimation of $\varphi_{F_r}$ with 10000 samples and $r = 0.2$.]

Figure 2 provides a 3-dimensional illustration, in which the convergence of the law of $F_r$ to the discrete distribution of $N_T$ is apparent as $r$ tends to 0.

[Figure 2: Kernel estimation of $\varphi_{F_r}$ with 10000 samples and $h = 0.01$.]


5.2 Malliavin method

As an application of Proposition 4.1 we have
\[
\varphi_F(y) = -\frac{\partial}{\partial y}\, E[\mathbf{1}_{\{N_T \neq 0\}}\,\mathbf{1}_{[0,\infty)}(F-y)] = E[\mathbf{1}_{\{N_T \neq 0\}}\, W\, \mathbf{1}_{[0,\infty)}(F-y)], \tag{5.2}
\]
where the weight $W$, given by
\begin{align*}
W &= -\frac{1}{D_w F}\Big(\int_0^T w'(t)\,dN_t - D_w \log j_{T,N_T}(T_1,\dots,T_{N_T}) + \frac{D_w D_w F}{D_w F}\Big) \\
&= \frac{1}{r \int_0^T w(t)\, e^{-rt}\,dN_t}\Big(\int_0^T w'(t)\,dN_t - D_w \log j_{T,N_T}(T_1,\dots,T_{N_T}) + \frac{\int_0^T w(t)\big(r\, w(t) - w'(t)\big)e^{-rt}\,dN_t}{\int_0^T w(t)\, e^{-rt}\,dN_t}\Big), \tag{5.3}
\end{align*}
is independent of $y$ and of any bandwidth parameter. One can check in Figures 3 and 4 that although the estimator (5.2) yields more precise values than the kernel estimator (5.1) when $y$ is large, it behaves badly for small values of $y$, due to the higher variance of $W\mathbf{1}_{[0,\infty)}(F-y)$ in this situation. This phenomenon is explained and dealt with by a localization method in the next section.
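In simulations, the weight (5.3) is evaluated path by path from the sampled jump times. A sketch for $w(t) = t(T - t)$, omitting the deterministic factor in $F_r$ for simplicity and with hypothetical function names; `dw_log_j` stands for the value of $D_w \log j_{T,N_T}$ supplied by the chosen model (e.g. the log-normal renewal formula of Section 3):

```python
import numpy as np

def malliavin_weight(Tks, r, T, dw_log_j):
    """Weight W of (5.3) for F = int_0^T e^{-rt} dN_t and w(t) = t(T - t)."""
    Tks = np.asarray(Tks)
    w, wp, e = Tks * (T - Tks), T - 2.0 * Tks, np.exp(-r * Tks)
    I = np.sum(w * e)                                # int_0^T w(t) e^{-rt} dN_t
    corr = np.sum(w * (r * w - wp) * e) / I          # D_w D_w F / D_w F
    return (np.sum(wp) - dw_log_j + corr) / (r * I)  # formula (5.3)

def malliavin_density(paths, dw_log_js, r, T, y):
    """Monte Carlo estimate (5.2), averaging W 1_{[0,infty)}(F - y) over sampled
    jump-time paths with N_T >= 1."""
    vals = []
    for Tks, dlj in zip(paths, dw_log_js):
        F = np.sum(np.exp(-r * np.asarray(Tks)))     # F = int_0^T e^{-rt} dN_t
        vals.append(malliavin_weight(Tks, r, T, dlj) * (F >= y))
    return np.mean(vals)
```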

[Figure 3: Probability density of $F_r$ for $r = 0.2$ (Malliavin method with 10000 samples). The graph labeled "exact value" has been obtained via the modified kernel estimator (see (5.5) below in Section 5.3) with $10^7$ samples.]


The next graph is a 3-dimensional version.

[Figure 4: Probability density of $F_r$ as a function of $r$ (Malliavin method with 10000 samples).]

5.3 A modified kernel estimator

When $F_r = \int_0^T e^{-rt}\,dN_t$ is close to 0, $D_w F_r = -r\int_0^T w(t)\,e^{-rt}\,dN_t$ is also likely to be small, and the value of $W$ is then usually large, due to the division by $D_w F$ in (5.3). Hence when $y$ is small the term $W \mathbf{1}_{[y,\infty)}(F_r)$ can be non-zero for small values of $F_r$, and it has a large variance. The consequence of this phenomenon is illustrated by the lack of precision observed in Figures 3 and 4 above for small values of $y$.

A variance reduction technique called localization was introduced in [8] to deal with related problems on the Wiener space. Here we apply a similar procedure to construct a modified kernel estimator using Malliavin weights. We consider a decomposition of the form $\mathbf{1}_{[0,\infty)} = f + g$, where $g$ is a $C^1$ function, for example
\[
g(x) = \frac{1}{2}\,\mathbf{1}_{[0,1)}(x)\Big(1 + \sin\Big(\pi x - \frac{\pi}{2}\Big)\Big) + \mathbf{1}_{[1,\infty)}(x),
\]
and
\[
f(x) = \frac{1}{2}\,\mathbf{1}_{[0,1)}(x)\Big(1 - \sin\Big(\pi x - \frac{\pi}{2}\Big)\Big).
\]

[Figure 5: Decomposition of the Heaviside function.]

In the following proposition we obtain an analog of Theorem 2.1 in [11], via a somewhat simpler argument.

Proposition 5.1. Let $F \in \mathrm{Dom}(D_w^2)$ and let $f$ be a function on $\mathbb{R}$ such that $f(0) = 1$, $f(x) = 0$, $x < 0$, and $\mathbf{1}_{(0,\infty)} f' \in L^2((0,\infty))$. We have for all $h > 0$
\[
\varphi_F(y) = -E\Big[W f\Big(\frac{F-y}{h}\Big)\Big] - \frac{1}{h}\, E\Big[\mathbf{1}_{\{F > y\}}\, f'\Big(\frac{F-y}{h}\Big)\Big], \qquad y \in \mathbb{R}, \tag{5.4}
\]
where $W$ is given by (5.3).

Proof. We have
\begin{align*}
\varphi_F(y) &= -\frac{d}{dy}\, E[\mathbf{1}_{[0,\infty)}(F-y)] \\
&= -\frac{d}{dy}\, E\Big[f\Big(\frac{F-y}{h}\Big)\Big] - \frac{d}{dy}\, E\Big[g\Big(\frac{F-y}{h}\Big)\Big] \\
&= -E\Big[W f\Big(\frac{F-y}{h}\Big)\Big] - \frac{1}{h}\, E\Big[\mathbf{1}_{\{F > y\}}\, f'\Big(\frac{F-y}{h}\Big)\Big], \qquad y \in \mathbb{R},
\end{align*}
where $W$ is given by (5.3). $\square$

Relation (5.4) yields an analog of Theorem 2.1 of [11] for point processes, with a simple proof. The method of [11], page 446, for the determination of an optimal kernel $f$ and bandwidth parameter $h$ by minimization of
\[
E\Big[\Big(\mathbf{1}_{\{F > y\}}\Big(W f\Big(\frac{F-y}{h}\Big) - \frac{1}{h}\, f'\Big(\frac{F-y}{h}\Big)\Big)\Big)^2\Big], \qquad y \in \mathbb{R},
\]
also applies here, and yields $f(x) = \mathbf{1}_{[0,\infty)}(x)\, e^{-\lambda x}$, $x \in \mathbb{R}$, and $h_{\mathrm{opt}} = \|W\|^{-1}_{L^2(A)}$, for any $\lambda > 0$.


Letting $K(x) = -\mathbf{1}_{(0,\infty)}(x)\, f'(x)$, this leads by Monte Carlo approximation to the corrected kernel estimator
\[
\varphi_F(y) \simeq -\frac{1}{N}\sum_{k=1}^{N} W^{(k)} f\Big(\frac{F^{(k)}-y}{h}\Big) + \frac{1}{Nh}\sum_{k=1}^{N} K\Big(\frac{F^{(k)}-y}{h}\Big). \tag{5.5}
\]
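A sketch of the corrected estimator (5.5) with the optimal exponential kernel below, where the arrays `F` and `W` are hypothetical placeholders for the sampled functional and its Malliavin weights from (5.3):

```python
import numpy as np

def modified_kernel_estimate(F, W, y, h):
    """Corrected kernel estimator (5.5) with the exponential kernel:
    f(x) = e^{-x} on [0, infty), K(x) = -f'(x) = e^{-x} on (0, infty)."""
    F, W = np.asarray(F), np.asarray(W)
    x = (F - y) / h
    expx = np.exp(-np.clip(x, 0.0, None))  # e^{-x}; clipped values are masked below
    f = np.where(x >= 0, expx, 0.0)
    K = np.where(x > 0, expx, 0.0)
    return -np.mean(W * f) + np.mean(K) / h
```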

Note that (5.4) is an equality, whereas the standard kernel estimate
\[
\varphi_F(y) \simeq \frac{1}{h}\, E\Big[K\Big(\frac{F-y}{h}\Big)\Big], \qquad y \in \mathbb{R},
\]
is only an approximation. Figure 6 shows the result of this modified kernel estimation for $h = 1, 0.2, 0.01$, for comparison with the standard kernel estimate of Figure 1. The modified kernel estimator does depend on a bandwidth parameter $h$, but it appears more stable and less sensitive to variations of $h$ than standard kernel estimators. For small values of $h$, the behaviour of (5.5) becomes close to that of a standard kernel estimator. In our setting we found $h_{\mathrm{opt}} = 0.1963$ by Monte Carlo simulation, and we use the optimal kernel $K(x) = \mathbf{1}_{(0,\infty)}(x)\, e^{-x}$.

[Figure 6: Modified kernel estimate of $\varphi_{F_r}$ with 10000 samples and $r = 0.2$.]

Finally, Figure 7 presents a 3-dimensional representation of the density using the modified kernel estimator (5.5).


[Figure 7: Modified kernel estimate of $\varphi_{F_r}$ with 10000 samples and $h = 0.2$.]

6 Conclusion

The performance of kernel estimators depends on the choice of a bandwidth parameter $h$. The results of the Malliavin method are independent of $h$, but they may be degraded as the variance of the weight increases. Our modified kernel estimator, constructed by localization of the Malliavin method, appears to perform better than the other estimators considered.

References

[1] M.-P. Bavouzet-Morel and M. Messaoud. Computation of Greeks using Malliavin's calculus in jump type market models. Preprint, 2004.
[2] E. Carlen and E. Pardoux. Differential calculus and integration by parts on Poisson space. In S. Albeverio, Ph. Blanchard, and D. Testard, editors, Stochastics, Algebra and Analysis in Classical and Quantum Dynamics (Marseille, 1988), volume 59 of Math. Appl., pages 63–73. Kluwer Acad. Publ., Dordrecht, 1990.
[3] D.J. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes. Vol. I. Probability and its Applications. Springer-Verlag, New York, 2003.
[4] J.W. Dash. Quantitative Finance and Risk Management. World Scientific Publishing Co. Inc., River Edge, NJ, 2004.


[5] M.H.A. Davis and M.P. Johansson. Malliavin Monte Carlo Greeks for jump diffusions. Preprint, 2004, to appear in Stochastic Processes and their Applications.
[6] V. Debelley and N. Privault. Sensitivity analysis of European options in jump diffusion models via the Malliavin calculus on Wiener space. Preprint, 2004.
[7] R.J. Elliott and A.H. Tsoi. Integration by parts for Poisson processes. J. Multivariate Anal., 44(2):179–190, 1993.
[8] E. Fournié, J.M. Lasry, J. Lebuchoux, and P.L. Lions. Applications of Malliavin calculus to Monte-Carlo methods in finance. II. Finance and Stochastics, 5(2):201–236, 2001.
[9] E. Fournié, J.M. Lasry, J. Lebuchoux, P.L. Lions, and N. Touzi. Applications of Malliavin calculus to Monte Carlo methods in finance. Finance and Stochastics, 3(4):391–412, 1999.
[10] Y. El Khatib and N. Privault. Computations of Greeks in markets with jumps via the Malliavin calculus. Finance and Stochastics, 4(2):161–179, 2004.
[11] A. Kohatsu-Higa and R. Pettersson. Variance reduction methods for simulation of densities on Wiener space. SIAM J. Numer. Anal., 40(2):431–450, 2002.
[12] D. Nualart. The Malliavin Calculus and Related Topics. Probability and its Applications. Springer-Verlag, 1995.
[13] E. Parzen. On estimation of a probability density function and mode. Ann. Math. Statist., 33:1065–1076, 1962.
[14] N. Privault. Chaotic and variational calculus in discrete and continuous time for the Poisson process. Stochastics and Stochastics Reports, 51:83–109, 1994.
[15] N. Privault and X. Wei. A Malliavin calculus approach to sensitivity analysis in insurance. Insurance Math. Econom., 35(3):679–690, 2004.
[16] M. Rosenblatt. Remarks on some nonparametric estimates of a density function. Ann. Math. Statist., 27:832–837, 1956.
[17] S.K. Srinivasan. Stochastic Theory and Cascade Processes. American Elsevier Publishing Co., Inc., New York, 1969.
