arXiv:0711.3957v1 [math.ST] 26 Nov 2007

Bernoulli 13(4), 2007, 933–951 DOI: 10.3150/07-BEJ1040

Moment estimation for ergodic diffusion processes

YURY A. KUTOYANTS¹ and NAKAHIRO YOSHIDA²

¹Laboratoire de Statistique et Processus, Université du Maine, 72017 Le Mans, France. E-mail: [email protected]
²Graduate School of Mathematical Sciences, The University of Tokyo, Komaba 3-8-1, Meguro-ku, Tokyo 153-0041, Japan. E-mail: [email protected]

We investigate moment estimation for an ergodic diffusion process with unknown trend coefficient. We consider nonparametric and parametric estimation. In each case, we present a lower bound for the risk and then construct an asymptotically efficient estimator of the moment-type functional, or of a parameter which is in one-to-one correspondence with such a functional. Next, we clarify a higher-order property of the moment-type estimator by means of the Edgeworth expansion of its distribution function.

Keywords: asymptotic efficiency; asymptotic expansions; diffusion process; moment estimation; nonparametric estimation

1. Introduction

Suppose that a diffusion process {X_t, 0 ≤ t ≤ T} is uniquely defined by the stochastic differential equation

    dX_t = S(X_t) dt + σ(X_t) dW_t,    0 ≤ t ≤ T,    (1)

where {W_t, t ≥ 0} is a standard Wiener process and X_0 is an initial random variable independent of {W_t, t ≥ 0}. We assume that {X_t, 0 ≤ t ≤ T} is ergodic with invariant probability distribution µ_S (depending on S and σ). In our statistical problem, the diffusion coefficient σ is known to the observer, while the trend coefficient S is unknown. Given a function F : R → R, we want to estimate the parameter

    ϑ = ϑ_S = E_S[F(ξ)] = ∫ F(y) µ_S(dy),

where E_S denotes the expectation with respect to µ_S, and ξ denotes a “stationary” random variable with distribution µ_S. If X_t is stationary, we take ξ = X_0; if not, we may enlarge the original probability space to realize ξ.

This is an electronic reprint of the original article published by the ISI/BS in Bernoulli, 2007, Vol. 13, No. 4, 933–951. This reprint differs from the original in pagination and typographic detail.


We are interested in the asymptotically efficient estimation of ϑ based on the observations {X_t, 0 ≤ t ≤ T} as T → ∞. We will consider nonparametric and parametric estimation problems. Section 2 treats a nonparametric case where the function S is completely unknown. As usual in such problems, we derive a minimax bound on the risks of all estimators and then show that the empirical moment estimator

    ϑ*_T = (1/T) ∫_0^T F(X_t) dt

is asymptotically efficient in the sense that ϑ*_T attains this lower bound. In Section 3, we suppose that the function S belongs to a parametric family, that is, S = S(γ, ·), γ ∈ Γ, and ϑ is a function of γ: ϑ = ϑ(γ). It then turns out that the maximum likelihood estimator (MLE) γ̂_T provides (under certain regularity conditions) the asymptotically efficient estimator ϑ̂_T = ϑ(γ̂_T) of ϑ. However, the computation of the MLE in nonlinear models is often not easy. To avoid this drawback, we consider the so-called one-step MLE introduced by Le Cam [6], which allows us to improve the empirical moment estimator to an asymptotically efficient one in this parametric setting. Here the asymptotic efficiency is understood in the sense of Hajek and Le Cam. After the investigation of the first-order asymptotics, a natural direction of study is the second-order approximation of the distribution of the estimator. In Section 4, we derive an asymptotic expansion formula with the help of the local approach of the Malliavin calculus developed recently by Yoshida [12, 13, 14] and Kusuoka and Yoshida [2].
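For illustration, the following minimal numerical sketch simulates the model (1) with the Ornstein–Uhlenbeck choice S(x) = −x, σ ≡ 1 (so the invariant law is N(0, 1/2)) and computes the empirical moment estimator for F(y) = y², whose target value is 1/2. The drift, step size, horizon and random seed are illustrative assumptions and not part of the analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

def euler_maruyama(S, sigma, x0, T, dt):
    """Simulate dX_t = S(X_t) dt + sigma(X_t) dW_t on [0, T] by an Euler-Maruyama scheme."""
    n = int(T / dt)
    x = np.empty(n + 1)
    x[0] = x0
    dW = rng.normal(0.0, np.sqrt(dt), size=n)
    for i in range(n):
        x[i + 1] = x[i] + S(x[i]) * dt + sigma(x[i]) * dW[i]
    return x

S = lambda x: -x           # illustrative ergodic trend (Ornstein-Uhlenbeck)
sigma = lambda x: 1.0      # known diffusion coefficient
F = lambda y: y ** 2       # moment functional; E_S[F(xi)] = 1/2 for this model

T, dt = 1000.0, 0.01
X = euler_maruyama(S, sigma, x0=0.0, T=T, dt=dt)

# empirical moment estimator theta*_T = (1/T) \int_0^T F(X_t) dt (Riemann approximation)
theta_star = float(np.mean(F(X[:-1])))
print("theta*_T =", theta_star, "(target value 0.5)")
```

As T grows, the estimate concentrates around 1/2 at the T^{−1/2} rate studied below.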

2. Nonparametric estimation

We consider the diffusion process {X_t, 0 ≤ t ≤ T} described in Section 1. In this section, S is unknown, while σ is a known continuous positive function. Let us denote by S the class of functions S satisfying

    V(S, x) = ∫_0^x exp( −2 ∫_0^y S(v)/σ(v)^2 dv ) dy → ±∞  as x → ±∞    (2)

and

    G(S) = ∫_{−∞}^{∞} σ(y)^{−2} exp( 2 ∫_0^y S(v)/σ(v)^2 dv ) dy < ∞.    (3)

These conditions guarantee the existence of the invariant probability measure µ_S with the density function

    f_S(x) = G(S)^{−1} σ(x)^{−2} exp( 2 ∫_0^x S(v)/σ(v)^2 dv )    (4)

and the law of large numbers. We suppose that the initial value X_0 = ξ has the probability density function f_S(·); in particular, the process X_t, t ≥ 0, is stationary.
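The density (4) can also be evaluated directly by quadrature for a given pair (S, σ). The sketch below is again an illustration (the truncated grid and the Ornstein–Uhlenbeck choice S(x) = −x, σ ≡ 1 are assumptions): it computes G(S), f_S and ϑ_S = ∫ F(y) f_S(y) dy, recovering the N(0, 1/2) density and the value 1/2 for F(y) = y².

```python
import numpy as np

def trapz(y, x):
    """Trapezoidal rule for samples y on the grid x."""
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

def invariant_density(S, sigma, grid):
    """Evaluate the invariant density f_S of (4) on a grid by quadrature."""
    integrand = 2.0 * S(grid) / sigma(grid) ** 2
    # cumulative trapezoid of the integrand, anchored so the inner integral starts at x = 0
    cum = np.concatenate(([0.0], np.cumsum((integrand[1:] + integrand[:-1]) / 2.0 * np.diff(grid))))
    expo = cum - cum[int(np.argmin(np.abs(grid)))]     # ~ 2 \int_0^x S(v)/sigma(v)^2 dv
    unnorm = np.exp(expo) / sigma(grid) ** 2
    G = trapz(unnorm, grid)                            # normalizing constant G(S) of (3)
    return unnorm / G

grid = np.linspace(-8.0, 8.0, 4001)
S = lambda x: -x                      # illustrative Ornstein-Uhlenbeck trend
sigma = lambda x: np.ones_like(x)
f_S = invariant_density(S, sigma, grid)

F = lambda y: y ** 2
theta_S = trapz(F(grid) * f_S, grid)
print("theta_S =", theta_S, "(exact value 0.5 in this example)")
```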


We would like to estimate the mathematical expectation

    ϑ_S = ∫_{−∞}^{∞} F(y) f_S(y) dy.

To construct a lower minimax bound on the risks of all estimators, we define a nonparametric vicinity of a fixed model as follows. Fix some S_∗(·) ∈ S and δ > 0, and introduce the set

    V_δ = { S(·) : sup_{x∈R} |S(x) − S_∗(x)| < δ }.

We suppose that S_∗(·) is such that the conditions (2) and (3) are fulfilled for all S(·) ∈ V_δ and

    sup_{S(·)∈V_δ} G(S) < ∞,    sup_{S(·)∈V_δ} E_S|F(ξ)| < ∞.    (5)

The role of Fisher information in our problem will be played by the quantity

    I(S) = ( 4 E_S[ ( M_S(ξ) / (σ(ξ) f_S(ξ)) )^2 ] )^{−1},    (6)

where M_S(y) = E_S[(F(ξ) − ϑ_S) χ{ξ < y}] and I_∗ = I(S_∗). Let ℓ(·) be a loss function.

Theorem 1. Let the conditions (5) be fulfilled and let I_∗ > 0. Then

    lim_{δ→0} lim inf_{T→∞} inf_{ϑ̄_T} sup_{S(·)∈V_δ} E_S[ℓ(T^{1/2}(ϑ̄_T − ϑ_S))] ≥ E[ℓ(η I_∗^{−1/2})],    (7)

where L(η) = N(0, 1) and the inf is taken over all possible estimators ϑ̄_T of the unknown parameter.

Proof. We follow the well-known scheme described in Ibragimov and Khasminskii ([1], Chapter 4); see also the proofs of similar results in Kutoyants [4]. Denote ϑ_∗ = ϑ_{S_∗} and let ψ(·) be a continuous function with compact support such that S_h(·) = S_∗(·) + (h − ϑ_∗)ψ(·)σ(·)^2 ∈ V_δ for all h ∈ (ϑ_∗ − γ, ϑ_∗ + γ). Here γ > 0 is a number chosen in such a way that S_h(·) ∈ V_δ. For the process (1) with S(·) = S_h(·) we consider the problem of estimating the parameter h and recall the construction of the Hajek–LeCam minimax bound in this situation. The direct expansion of the function ϑ_h = ϑ_{S_∗+(h−ϑ_∗)ψ(·)σ(·)^2} in powers of h − ϑ_∗ gives the representation

    ϑ_h = ϑ_∗ + 2(h − ϑ_∗) [ E( F(ξ) ∫_0^ξ ψ(v) dv ) − ϑ_∗ E( ∫_0^ξ ψ(v) dv ) ] + o(h − ϑ_∗),


where we write E = E_{S_∗} and E_h = E_{S_h}. Therefore, if we take ψ(·) from the class

    K = { ψ(·) : E[ (F(ξ) − ϑ_∗) ∫_0^ξ ψ(v) dv ] = 2^{−1} },

then ϑ_h = h + o(h − ϑ_∗). The corresponding family of measures {P_h^{(T)}, h ∈ (ϑ_∗ − γ, ϑ_∗ + γ)} is locally asymptotically normal (LAN) at the point h = ϑ_∗; therefore, we can apply the Hajek–LeCam inequality (see Ibragimov and Khasminskii [1], Theorem 2.12.1) and write, for ψ ∈ K,

    lim_{δ→0} lim inf_{T→∞} inf_{ϑ̄_T} sup_{S(·)∈V_δ} E_S[ℓ(T^{1/2}(ϑ̄_T − ϑ_S))]
        ≥ lim_{δ→0} lim inf_{T→∞} inf_{ϑ̄_T} sup_{|h−ϑ_∗|<γ} E_h[ℓ(T^{1/2}(ϑ̄_T − ϑ_h))] ≥ E[ℓ(η I_ψ^{−1/2})],

where I_ψ denotes the Fisher information of the family {S_h(·), h ∈ (ϑ_∗ − γ, ϑ_∗ + γ)} at the point h = ϑ_∗. The infimum of I_ψ over the class K is attained at a function ψ_∗(·) proportional to M_{S_∗}(·)/(σ(·)^2 f_{S_∗}(·)), for which I_{ψ_∗} = I_∗ > 0, but this function does not have compact support and, therefore, it cannot belong to K. As usual in such a situation (see Ibragimov and Khasminskii [1], page 218), we introduce a sequence of smooth functions {ψ_N(·)} with compact supports approximating ψ_∗(·) and such that

    inf_{ψ(·)∈K} I_ψ = lim_{N→∞} I_{ψ_N} = I_∗.


Then

    sup_{ψ(·)∈K} E[ℓ(η I_ψ^{−1/2})] = E[ℓ(η I_∗^{−1/2})]

and we have the desired estimate (7).  □
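The quantity governing the bound, I_∗^{−1} = 4 E_{S_∗}[(M_{S_∗}(ξ)/(σ(ξ) f_{S_∗}(ξ)))²], can be computed numerically for a concrete model. The sketch below is an illustration only (Ornstein–Uhlenbeck model, F(y) = y², truncated grid are assumptions): it evaluates (6) by quadrature, and the result, 1/2 here, agrees with the variance observed when the simulation of the first listing is repeated over many trajectories.

```python
import numpy as np

def trapz(y, x):
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

def cum_trapz(y, x):
    return np.concatenate(([0.0], np.cumsum((y[1:] + y[:-1]) / 2.0 * np.diff(x))))

# Illustrative model: S(x) = -x, sigma = 1, so f_S is the N(0, 1/2) density.
grid = np.linspace(-3.5, 3.5, 20001)       # truncated grid; the neglected tails are negligible here
f_S = np.exp(-grid ** 2) / np.sqrt(np.pi)
sigma = np.ones_like(grid)
F = grid ** 2
theta_S = trapz(F * f_S, grid)             # = 1/2

# M_S(y) = E_S[(F(xi) - theta_S) 1{xi < y}], the cumulative integral of (F - theta_S) f_S
M_S = cum_trapz((F - theta_S) * f_S, grid)

# I(S)^{-1} = 4 E_S[(M_S(xi) / (sigma(xi) f_S(xi)))^2], cf. (6)
var_bound = 4.0 * trapz((M_S / (sigma * f_S)) ** 2 * f_S, grid)
print("I(S)^{-1} =", var_bound, "(exact value 0.5 here)")
```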

Definition 1. We say that the estimator ϑ̄_T is asymptotically efficient for the loss function ℓ(·) if

    lim_{δ→0} lim_{T→∞} sup_{S(·)∈V_δ} E_S ℓ(T^{1/2}(ϑ̄_T − ϑ_S)) = E ℓ(η I_∗^{−1/2})    (8)

for all functions S_∗(·) ∈ S.

Below we will show that the empirical estimator is an asymptotically efficient estimator of ϑ in this sense. The process {X_t}_{t∈R_+} is stationary; therefore the estimator ϑ*_T is unbiased: E_S ϑ*_T = ϑ_S. If the initial value X_0 has another distribution, then the estimator ϑ*_T is no longer unbiased, but nevertheless we have |E_S ϑ*_T − ϑ_S| ≤ C/T (see Kutoyants [3], Lemma 3.4.8), and the properties of the estimator ϑ*_T can be studied without the assumption of stationarity as well.

Let us introduce the function

    H_S(y) = ∫_0^y ( 2 / (σ(x)^2 f_S(x)) ) ∫_{−∞}^x [F(v) − ϑ_S] f_S(v) dv dx

and the conditions that, for some p_∗ > 0,

    sup_{S(·)∈V_δ} E_S|H_S(ξ)|^{p_∗} < ∞,    sup_{S(·)∈V_δ} E_S | M_S(ξ) / (σ(ξ) f_S(ξ)) |^{p_∗} < ∞.    (9)

Let

    Q_S(y) = 2 M_S(y) / (σ(y) f_S(y)).

Theorem 2. Suppose that conditions (5) and (9) for some p_∗ > 2 are fulfilled, that the Fisher information I(S) is continuous at S(·) = S_∗(·) and that the law of large numbers holds uniformly in S(·) ∈ V_δ, that is, for each κ > 0,

    lim_{T→∞} sup_{S(·)∈V_δ} P_S^{(T)} ( | (1/T) ∫_0^T Q_S(X_t)^2 dt − E_S Q_S(ξ)^2 | > κ ) = 0.    (10)

Then the empirical moment ϑ*_T is an asymptotically efficient estimator of the parameter ϑ_S under the loss function ℓ(u) = |u|^p with p < p_∗.


Proof. Using the Itô formula, we rewrite the difference η_T = T^{1/2}(ϑ*_T − ϑ_S) with a stochastic integral as

    η_T = ( H_S(X_T) − H_S(X_0) )/√T − (1/√T) ∫_0^T Q_S(X_t) dW_t    (11)

under P_S. The uniform version of the central limit theorem for stochastic integrals (Kutoyants [3], Theorem 3.3.7) allows us to show the weak convergence

    L_S{ (1/√T) ∫_0^T Q_S(X_t) dW_t } ⇒ N(0, I(S)^{−1})

uniformly in S(·) ∈ V_δ. Therefore, the empirical estimator is uniformly asymptotically normal:

    L_S{ T^{1/2}(ϑ*_T − ϑ_S) } ⇒ N(0, I(S)^{−1}).    (12)

For the polynomial loss functions we have to verify the uniform integrability of the family of random variables {|η_T|^p, T → ∞}, but this follows from the representation (11) and the conditions (9), as was done in Kutoyants [5].  □

Example 1. Let us consider the case where σ(y) ≡ 1 and, for some ϱ > 0, A > 0 and L > 0, define the class of functions

    S(ϱ, A, L) = { S(·) : |S(y) − S(z)| ≤ L|y − z| for |y|, |z| ≤ A;  sgn(y) S(y) ≤ −ϱ for |y| ≥ A }.

Then, for the function F(y) = |y|^k with any k > 0, we can verify all the conditions of Theorem 2 to show that the estimator

    ϑ*_T = (1/T) ∫_0^T |X_t|^k dt

is consistent, uniformly (in S(·) ∈ V_δ with δ < ϱ) asymptotically normal and asymptotically efficient for the polynomial loss functions ℓ(u) = |u|^p with any p > 0. The verification is quite close to the one given in Kutoyants [5].

Remark 1. If we put F(y) = χ{y < x}, then ϑ_S = F_S(x), the value of the invariant distribution function at the point x; the efficiency of the corresponding empirical distribution function estimator is studied in Kutoyants [4].

3. Parametric estimation

Suppose now that the trend coefficient belongs to a parametric family, S(·) = S(γ, ·), γ ∈ Γ, satisfying the usual regularity and identifiability conditions, so that the Fisher information I(γ) > 0 and the derivative of the function ϑ(γ) is separated from zero,

    0 < inf_{γ∈Γ} |ϑ̇(γ)| ≤ sup_{γ∈Γ} |ϑ̇(γ)| < ∞.    (15)


Under this regularity condition, the family of measures {P_γ^{(T)}, γ ∈ Γ} is LAN at any point γ_0 ∈ Γ and we have the Hajek–LeCam lower bound on the risks of all estimators for the loss functions ℓ(u) = |u|^p, p ≥ 1, that is,

    lim_{ν→0} lim inf_{T→∞} inf_{ϑ̄_T} sup_{|γ−γ_0|<ν} E_γ[ℓ(T^{1/2}(ϑ̄_T − ϑ(γ)))] ≥ E[ℓ(η ϑ̇(γ_0) I(γ_0)^{−1/2})].    (16)

The MLE ϑ̂_T = ϑ(γ̂_T) and the one-step MLE discussed in Section 1 attain this bound under the regularity conditions above; a numerical sketch of the one-step construction is given below.
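For the simplest parametric family the one-step construction can be written out explicitly. The sketch below is only an illustration of the mechanics (the Ornstein–Uhlenbeck family S(γ, x) = −γx with σ ≡ 1, the functional F(y) = y² so that ϑ(γ) = 1/(2γ), and all numerical constants are assumptions, not the paper's general construction): the empirical moment estimator gives a preliminary value of γ, and a single score step is applied. In this particular family the empirical estimator is itself asymptotically efficient, so the correction changes the value only slightly.

```python
import numpy as np

rng = np.random.default_rng(1)

# Ornstein-Uhlenbeck family S(gamma, x) = -gamma * x, sigma = 1:
# theta(gamma) = E_gamma[xi^2] = 1/(2*gamma),  I(gamma) = E_gamma[(d/dgamma S)^2] = 1/(2*gamma).
gamma_true, T, dt = 1.0, 1000.0, 0.01
n = int(T / dt)
X = np.empty(n + 1)
X[0] = 0.0
dW = rng.normal(0.0, np.sqrt(dt), size=n)
for i in range(n):
    X[i + 1] = X[i] - gamma_true * X[i] * dt + dW[i]

# empirical moment estimator of theta and the induced preliminary estimator of gamma
theta_star = float(np.mean(X[:-1] ** 2))
gamma_bar = 1.0 / (2.0 * theta_star)

# score of the drift parameter: Delta_T(gamma) = -\int X_t dX_t - gamma \int X_t^2 dt
int_XdX = float(np.sum(X[:-1] * np.diff(X)))
int_X2dt = float(np.sum(X[:-1] ** 2) * dt)
score = -int_XdX - gamma_bar * int_X2dt

# one-step correction gamma_hat = gamma_bar + (T * I(gamma_bar))^{-1} * Delta_T(gamma_bar)
gamma_hat = gamma_bar + score / (T / (2.0 * gamma_bar))
theta_hat = 1.0 / (2.0 * gamma_hat)
print("theta*_T =", theta_star, " one-step estimate =", theta_hat, " true value =", 0.5)
```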

4. Asymptotic expansion

In this section we study a higher-order property of the empirical moment estimator under the following conditions.

Condition A1. There exists a constant a > 0 such that

    ‖ E[h | B^X_{[0,s]}] − E[h] ‖_1 ≤ a^{−1} e^{−a(t−s)} ‖h‖_∞

for any s, t ∈ R_+, s ≤ t, and for any h ∈ B(B^X_{[t,∞)}).

Condition A2. X_0 ∈ ⋂_{p>1} L^p(P).

A sufficient condition for A1 is provided, for example, in Veretennikov [10, 11] and Kusuoka and Yoshida [2]. Indeed, if σ ∈ C_B^∞(R) and if there exists a function ρ ∈ C^∞(R) such that ρ > 0, ∫_R ρ(x) dx = 1 and lim sup_{|x|→∞} ρ(x)^{−1} L^∗ρ(x) < 0, where L^∗ is the formal adjoint operator of the generator L of this diffusion process, then Condition A1 holds true (see Kusuoka and Yoshida [2]). Put

    Z_T = ∫_0^T q(X_t) dt,

where q(x) = F(x) − ϑ. As in Kusuoka and Yoshida [2], we define the rth cumulant function of Z_T/√T by

    χ_{T,r}(u) = (d/dε)^r log E[ exp( iεu · Z_T/√T ) ] |_{ε=0}.

Then define the function

    Ψ̂_{T,k}(u) = exp( (1/2) χ_{T,2}(u) ) + Σ_{r=1}^{k} T^{−r/2} P̃_{T,r}(u),

where the P̃_{T,r}(u) are defined by the formal Taylor expansion

    exp( Σ_{r=2}^{∞} (1/r!) ε^{r−2} χ_{T,r}(u) ) = exp( (1/2) χ_{T,2}(u) ) + Σ_{r=1}^{∞} ε^r T^{−r/2} P̃_{T,r}(u).

Denote by Ψ_{T,k} the signed measure defined as the Fourier inversion of Ψ̂_{T,k}. For a measurable function h : R → R, let

    ω(h, r) = ∫_R sup{ |h(x + y) − h(x)| : |y| ≤ r } φ(x; 0, I_⋆^{−1}) dx,

where φ(x; µ, Σ) denotes the probability density of the normal distribution N (µ, Σ) and I⋆ is an arbitrary positive number smaller than I∗ . The Hermite polynomials are defined


by h_k(z; Σ) = (−1)^k φ(z; 0, Σ)^{−1} ∂_z^k φ(z; 0, Σ) for a positive constant Σ. For a continuous function a : R → R, we define G_a : R → R by

    G_a(x) = ∫_{−∞}^{x} G(S) p(y) ( 2 ∫_{−∞}^{y} a(v) f(v) dv ) dy,

where p(y) = exp( −2 ∫_0^y σ(v)^{−2} S(v) dv ), if the mapping

    y ↦ p(y) ∫_{−∞}^{y} a(v) f(v) dv

is in L^1((−∞, 0], dy). We write ⟨a, f⟩ = ∫_R a(x) f(x) dx and

    [a] = −σ∇G_{a−⟨a,f⟩}.

Define the set of functions

    C = { a ∈ C↑(R) : ∫_{−∞}^{∞} a(x) f(x) dx = 0;  p(·) ∫_{−∞}^{·} a(x) f(x) dx ∈ L^1((−∞, 0]);  [a], G_a ∈ C↑(R) }.
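For a concrete model, G_a and [a] = −σ∇G_{a−⟨a,f⟩} can be computed by quadrature. The sketch below is an illustration only (Ornstein–Uhlenbeck model, F(y) = y², truncated grid are assumptions): it evaluates [q] together with E[[q]²(ξ)] and the coefficient E[[[q]²][q](ξ)] that appears in Theorem 4 below; both values come out close to 1/2 in this example, the former matching the asymptotic variance found in Section 2.

```python
import numpy as np

def trapz(y, x):
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

def cum_trapz(y, x):
    return np.concatenate(([0.0], np.cumsum((y[1:] + y[:-1]) / 2.0 * np.diff(x))))

# Illustrative model: S(x) = -x, sigma = 1 on a truncated grid.
grid = np.linspace(-3.5, 3.5, 20001)
S_vals, sigma_vals = -grid, np.ones_like(grid)

# p(y) = exp(-2 \int_0^y S/sigma^2 dv), the constant G(S) of (3) and the density f of (4)
cum_S = cum_trapz(2.0 * S_vals / sigma_vals ** 2, grid)
cum_S = cum_S - cum_S[int(np.argmin(np.abs(grid)))]
p = np.exp(-cum_S)
f = np.exp(cum_S) / sigma_vals ** 2
G_const = trapz(f, grid)
f = f / G_const

# q = F - theta with F(y) = y^2, so that <q, f> = 0
F_vals = grid ** 2
theta = trapz(F_vals * f, grid)
q = F_vals - theta

# G_q(x) = \int_{-inf}^x G(S) p(y) (2 \int_{-inf}^y q f dv) dy and [q] = -sigma * G_q'
inner = cum_trapz(q * f, grid)
G_q = cum_trapz(G_const * p * 2.0 * inner, grid)             # kept only to show G_q itself
bracket_q = -sigma_vals * G_const * p * 2.0 * inner          # [q](x) = -sigma(x) dG_q/dx

# coefficient E[[[q]^2][q](xi)] entering p*_{T,1} in Theorem 4
a2c = bracket_q ** 2 - trapz(bracket_q ** 2 * f, grid)       # centred [q]^2
bracket_a2 = -sigma_vals * G_const * p * 2.0 * cum_trapz(a2c * f, grid)

print("E[[q]^2(xi)]      =", trapz(bracket_q ** 2 * f, grid))           # ~ 0.5 here
print("E[[[q]^2][q](xi)] =", trapz(bracket_a2 * bracket_q * f, grid))   # ~ 0.5 here
```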

Theorem 4. Let k ∈ N and let M, γ, K > 0. Suppose that F : R → R is not constant. Then:

(1) There exist constants δ > 0 and c > 0 such that, for h ∈ E(M, γ),

    | E[h(√T(ϑ*_T − ϑ))] − Ψ_{T,k}[h] | ≤ c ω(h, T^{−K}) + ε_T^{(k)},

where ε_T^{(k)} = o(T^{−(k+δ)/2}). Here E(M, γ) = {h : R → R measurable, |h(x)| ≤ M(1 + |x|)^γ (x ∈ R)}.

(2) The signed measure dΨ_{T,1} has a density dΨ_{T,1}(z)/dz = p_{T,1}(z) with

    p_{T,1}(z) = φ(z; 0, κ_T^{(2)}) ( 1 + (1/6) κ_T^{(3)} h_3(z; κ_T^{(2)}) ),

where κ_T^{(r)} is the rth cumulant of √T(ϑ*_T − ϑ). Moreover, if q and [q]^2 − ⟨[q]^2, f⟩ belong to C, then

    p_{T,1}(z) = p*_{T,1}(z) + R_T(z),

where

    p*_{T,1}(z) = φ(z; 0, I_∗^{−1}) ( 1 + (1/(2√T)) E[[[q]^2][q](ξ)] h_3(z; I_∗^{−1}) )

and

    lim_{T→∞} √T sup_{z∈R} { |R_T(z)| exp(b z^2) } = 0

for some positive constant b. In particular,

    | E[h(√T(ϑ*_T − ϑ))] − ∫_R h(z) p*_{T,1}(z) dz | ≤ c ω(h, T^{−K}) + ε̃_T

for any h ∈ E(M, γ), with ε̃_T = o(1/√T).
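A quick Monte Carlo check of part (2) is possible. The sketch below is an illustration (Ornstein–Uhlenbeck model, F(y) = y², with arbitrary horizons, step size and replication number): it estimates the second and third cumulants of √T(ϑ*_T − ϑ) and compares them with the limit E[[q]²(ξ)] = 1/2 and the first-order value 3E[[[q]²][q](ξ)]/√T = 1.5/√T obtained for this example in the previous listing.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_theta_star(gamma, T, dt, n_paths):
    """Vectorized Euler-Maruyama for dX = -gamma X dt + dW; returns theta*_T for each path."""
    n = int(T / dt)
    x = rng.normal(0.0, np.sqrt(0.5 / gamma), size=n_paths)   # start in the invariant law
    acc = np.zeros(n_paths)                                    # accumulates \int_0^T X_t^2 dt
    for _ in range(n):
        acc += x ** 2 * dt
        x += -gamma * x * dt + rng.normal(0.0, np.sqrt(dt), size=n_paths)
    return acc / T

theta_true = 0.5                       # E[xi^2] under the invariant law N(0, 1/2)
for T in (50.0, 200.0):
    th = simulate_theta_star(gamma=1.0, T=T, dt=0.01, n_paths=1000)
    z = np.sqrt(T) * (th - theta_true)
    m = z.mean()
    k2 = float(np.mean((z - m) ** 2))  # empirical second cumulant
    k3 = float(np.mean((z - m) ** 3))  # empirical third cumulant
    print(f"T={T:5.0f}:  k2={k2:.3f} (limit 0.5)   k3={k3:.3f} (first order {1.5 / np.sqrt(T):.3f})")
```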

Proof. We consider the stochastic flow X̄(t, x) = (X(t, x), Z(t, x)) that satisfies the stochastic differential equation

    dX̄(t, x) = V̄_0(X̄(t, x)) dt + V̄(X̄(t, x)) ∘ dW_t,    X̄(0, x) = (x, 0),

where V̄_0(x, z) = (S(x) − 2^{−1}σ′(x)σ(x), q(x)) and V̄(x, z) = (σ(x), 0). By assumption, F is not constant; hence, there exists a point x_0 ∈ R such that F′(x) ≠ 0 in a neighborhood U of x_0. Easy calculus shows that Span{V̄(x_0, 0), [V̄, V̄_0](x_0, 0)} = R^2. By the same argument as used to prove Theorem 4 of Kusuoka and Yoshida [2], or as in Example 2 of the same paper, Condition [A3′] there can be verified. Indeed, take the sequence u(j) = j, v(j) = j + 1, j ∈ N, and let ψ_j = ϕ(X_j) for some truncation function ϕ ∈ C^∞(R; [0, 1]) with compact support in U taking the value 1 near x_0. For each j ∈ N, the Malliavin operator L_j is constructed in the usual way so that L_j does not shift the path outside of [j, j + 1]. Then it is known that, for sufficiently small U, {(det σ^w_{X̄(1,x)})^{−1} : x ∈ U} is bounded in L^p(P) for any p > 1. Therefore, Condition [A3′] of Kusuoka and Yoshida [2] can be verified. Thus, the first assertion has been obtained. For the formula of p_{T,k} together with its validity, see Sakamoto and Yoshida [8, 9].

To obtain the last result, some calculations are involved. Given that q ∈ C,

    (1/√T) ∫_0^T q(X_t) dt = ( G_q(X_T) − G_q(X_0) )/√T + (1/√T) ∫_0^T [q](X_t) dW_t    (24)

and E[|G_q(ξ)|^p] < ∞ for any p > 1. Let 0 < β < 1/2. The strong mixing property A1 induces the so-called covariance inequality, which yields

    | E[ G_q(X_T) · (1/√T) ∫_{T^β}^{T−T^β} [q](X_t) dW_t ] |
        ≤ 8 α( B_{[0,T−T^β]}, B_{[T,∞)} )^{1/r} ‖G_q(X_T)‖_q ‖ (1/√T) ∫_{T^β}^{T−T^β} [q](X_t) dW_t ‖_p
        ≤ a^{−1} exp( −a T^β / r ),


where α is the coefficient of the strong mixing, a is some positive constant and 1/p + 1/q + 1/r = 1. A similar estimate holds even if G_q(X_T) above is replaced with G_q(X_0). Accordingly, we obtain

    κ_T^{(2)} = (1/T) ∫_0^T E[[q]^2(X_t)] dt + (2/√T) E[ (G_q(X_T) − G_q(X_0)) · (1/√T) ∫_0^T [q](X_t) dW_t ] + (1/T) E[(G_q(X_T) − G_q(X_0))^2]
             = E[[q]^2(ξ)] + o(1/√T).    (25)

Next, we consider the third cumulant κ_T^{(3)}. Put k = [q] and denote simply k_t = k(X_t). By assumption, k ∈ C↑(R), and so we see by Itô's lemma that

    E[ ( (1/√T) ∫_0^T k_t dW_t )^3 ]
        = 3 T^{−3/2} E[ ∫_0^T ( ∫_0^t k_s dW_s ) k_t^2 dt ]
        = 3 T^{−3/2} E[ ∫_0^T ( ∫_0^t k_s dW_s ) ( k_t^2 − ⟨k^2, f⟩ ) dt ]
        = 3 T^{−1/2} E[ (1/√T) ∫_0^T k_s dW_s · (1/√T) ∫_0^T ( k_t^2 − ⟨k^2, f⟩ ) dt ].

Because k^2 − ⟨k^2, f⟩ ∈ C, the right-hand side equals

    (3/√T) E[ (1/√T) ∫_0^T k_s dW_s · ( ( G_{k^2−⟨k^2,f⟩}(X_T) − G_{k^2−⟨k^2,f⟩}(X_0) )/√T + (1/√T) ∫_0^T [[q]^2](X_t) dW_t ) ]
        = (3/√T) E[[[q]^2][q](ξ)] + o(1/√T).    (26)

In view of (24), and by a similar argument to the one following it, it is possible to see that the cross terms [e.g., those between G_q(X_T) − G_q(X_0) and (T^{−1/2} ∫_0^T [q](X_t) dW_t)^2] have asymptotically no contribution to κ_T^{(3)}. It follows from (25) and (26) that

    sup_{z∈R} e^{bz^2} |p_{T,1}(z) − p*_{T,1}(z)| = o(1/√T)

for some positive constant b. This completes the proof.  □


Acknowledgement

We would like to thank the referees for helpful comments.

References

[1] Ibragimov, I.A. and Khasminskii, R.Z. (1981). Statistical Estimation. New York: Springer-Verlag. MR0620321
[2] Kusuoka, S. and Yoshida, N. (2000). Malliavin calculus, geometric mixing, and expansion of diffusion functionals. Probab. Theory Related Fields 116 457–484. MR1757596
[3] Kutoyants, Yu.A. (1984). Parameter Estimation for Stochastic Processes. Berlin: Heldermann. MR0777685
[4] Kutoyants, Yu.A. (1997). Efficiency of the empirical distribution for ergodic diffusion. Bernoulli 3 445–456. MR1483698
[5] Kutoyants, Yu.A. (2004). Statistical Inference for Ergodic Diffusion Processes. New York: Springer-Verlag. MR2144185
[6] Le Cam, L. (1961). On the asymptotic theory of estimation and testing hypotheses. In Proc. Third Berkeley Symp. Math. Statist. Prob. 1 129–156. Berkeley: Univ. California Press. MR0084918
[7] Liptser, R.S. and Shiryayev, A.N. (2001). Statistics of Random Processes 1, 2nd edn. New York: Springer-Verlag.
[8] Sakamoto, Y. and Yoshida, N. (1997). Third order asymptotic expansion for diffusion process. In Proc. of the Symposium on the Theory and Applications of Statistical Inference, Osaka City University.
[9] Sakamoto, Y. and Yoshida, N. (2004). Asymptotic expansion formulas for functionals of epsilon-Markov processes with a mixing property. Ann. Inst. Statist. Math. 56 545–597. MR2095018
[10] Veretennikov, A.Yu. (1987). Bounds for the mixing rate in the theory of stochastic equations. Theory Probab. Appl. 32 273–281.
[11] Veretennikov, A.Yu. (1997). On polynomial mixing bounds for stochastic differential equations. Stochastic Process. Appl. 70 115–127. MR1472961
[12] Yoshida, N. (1994). Malliavin calculus and asymptotic expansion for martingales. Research Memorandum 504, 517. The Institute of Statistical Mathematics.
[13] Yoshida, N. (1997). Malliavin calculus and asymptotic expansion for martingales. Probab. Theory Related Fields 109 301–342. MR1481124
[14] Yoshida, N. (2004). Partial mixing and conditional Edgeworth expansion. Probab. Theory Related Fields 129 559–624. MR2078982

Received May 2001 and revised January 2007
