VOLATILITY ESTIMATION FOR STOCHASTIC PDES USING HIGH-FREQUENCY OBSERVATIONS B Y M ARKUS B IBINGER† AND M ATHIAS T RABS∗,†
arXiv:1710.03519v1 [math.ST] 10 Oct 2017
Philipps-Universit¨at Marburg and Universit¨at Hamburg Abstract We study the parameter estimation for parabolic, linear, second order, stochastic partial differential equations (SPDEs) observing a mild solution on a discrete grid in time and space. A high-frequency regime is considered where the mesh of the grid in the time variable goes to zero. Focusing on volatility estimation, we provide an explicit and easy to implement method of moments estimator based on the squared increments of the process. The estimator is consistent and admits a central limit theorem. This is established moreover for the estimation of the integrated volatility in a semi-parametric framework. Starting from a representation of the solution as an infinite factor model and exploiting mixing properties of Gaussian time series, the theory considerably differs from the statistics for semi-martingales literature. The performance of the method is illustrated in a simulation study.
1. Introduction. 1.1. Overview. Motivated by random phenomena in natural science as well as by mathematical finance, stochastic partial differential equations (SPDEs) have been intensively studied during the last fifty years with a main focus on theoretical analytic and probabilistic aspects. Thanks to the exploding number of available data and the fast progress in information technology, SPDE models become nowadays increasingly popular for practitioners, for instance, to model neuronal systems or interest rate fluctuations to give only two examples. Consequently, statistical methods are required to calibrate this class of complex models. While in probability theory there are recently enormous efforts to advance research for SPDEs, for instance Hairer (2013) was able to solve the KPZ equation, there is scant groundwork on statistical inference for SPDEs and there remain many open questions. Considering discrete high-frequency data in time, our aim is to extend the well understood statistical theory for semi-martingales, see, for instance, Jacod and Protter (2012), Mykland and Zhang (2009) or Jacod and Rosenbaum (2013), to parabolic SPDEs whose solution can be understood as infinite dimensional stochastic differential equations. Generated by an infinite factor model, highfrequency dynamics of the SPDE model differ from the semi-martingale case in several ways. We show that the SPDE model induces a distinctive, much rougher behavior of the marginal processes over time for a fixed spatial point compared to semi-martingales or diffusion processes. Also, we find non-negligible negative autocovariances of increments such that the semi-martingale theory and martingale central limit theorems are not applicable to the marginal processes. Nevertheless, we show that the fundamental concept of realized volatility as a key statistic can be adapted to the SPDE setting. ∗
Corresponding author. We thank Jan Kallsen and Mathias Vetter for helpful comments. M.T. gratefully acknowledges the financial support by the DFG research fellowship TR 1349/1-1. This work has been started while M.T. was affiliated to the Universit´e ParisDauphine. AMS 2000 subject classifications: Primary 62M10; secondary 60H15 Keywords and phrases: high-frequency data, stochastic partial differential equation, random field, realized volatility, central limit theorem under ρ-mixing †
1
2
BIBINGER AND TRABS
While this work can only be a starting point, we are convinced that more results and ideas from the high-frequency semi-martingale literature can be fruitfully transferred to SPDE models. We consider the following linear parabolic SPDE with one space dimension (1)
∂ 2 X (y) ∂Xt (y) t dXt (y) = ϑ2 + ϑ + ϑ X (y) dt + σ(t) dBt (y), X0 = ξ 1 0 t ∂y 2 ∂y (t, y) ∈ R+ × [ymin , ymax ], Xt (ymin ) = Xt (ymax ) = 0 for t > 0,
where Bt is defined as a cylindrical Brownian motion in the Sobolev space on [ymin , ymax ], the initial random variable ξ is independent from B, and with parameters ϑ0 , ϑ1 ∈ R and ϑ2 > 0 and some volatility function σ which we assume to depend only on time. The simple Dirichlet boundary conditions Xt (ymin ) = Xt (ymax ) = 0 are natural in many applications. We also briefly touch on enhancements to other boundary conditions. A solution X of (1) will be observed on a discrete grid (ti , yj ) ⊆ [0, T ] × [ymin , ymax ] for i = 1, . . . , n and j = 1, . . . , m in a fixed cuboid. More specifically we consider equidistant time points ti = i∆n . While the parameter vector ϑ = (ϑ0 , ϑ1 , ϑ2 )> might be known from the physical foundation of the model, estimating σ 2 quantifies the level of variability or randomness in the system. This constitutes our first target of inference. An extension of our approach for inference on ϑ will also be discussed. 1.2. Literature. The key insight in the pioneering work by H¨ubner et al. (1993) is that for a large class of parabolic differential equations the solution of the SPDE can be written as a Fourier series where each Fourier coefficient follows an ordinary stochastic differential equation of OrnsteinUhlenbeck type. Hence, statistical procedures for these latter processes offer a possible starting point for inference on the SPDE. Most of the available literature on statistics for SPDEs studies a scenario with observations of the Fourier coefficients in the spectral representation of the equation, see for instance H¨ubner and Rozovski˘ı (1995) and Cialenco and Glatt-Holtz (2011) for the maximum likelihood estimation for linear and non-linear SPDEs, respectively. Bishwal (2002) discuss a Bayesian approach and Cialenco et al. (2016) a trajectory fitting procedure. H¨ubner and Lototsky (2000) and Prakasa Rao (2002) have studied nonparametric estimators when the Fourier coefficients are observed continuously or in discrete time, respectively. We refer to the monograph by Bishwal (2008) and the survey by Lototsky (2009) for an overview on the existing theory. The (for many applications) more realistic scenario where the solution of a SPDE is observed only at discrete points in time and space has been considered so far only in very few works. Markussen (2003) has derived asymptotic normality and efficiency for the maximum likelihood estimator in a parametric problem for a parabolic SPDE. Mohapl (1997) has considered maximum likelihood and least squares estimators for discrete observations of an elliptic SPDE where the dependence structure of the observations is however completely different (in fact simpler) from the parabolic case. Our theoretical setup differs from the one in Markussen (2003) in several ways. First, we introduce a time varying volatility σ(t) in the disturbance term. Second, we believe that discrete high-frequency data in time under infill asymptotics, where we have a finite time horizon [0, T ] and in the asymptotic theory the mesh of the grid tends to zero, are most informative for many prospective data applications. Markussen (2003) instead focused on a low-frequency setup where the number of observations in time tends to infinity at a fixed time step. Also his number of observations in space is fixed, but we allow for an increasing number of spatial observations, too. During the final preparations of this manuscript, we noticed a very recent and related preprint by Cialenco and Huang (2017). In this independent project the authors establish results for power
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
3
variations in a similar parametric model when either m = 1 and n → ∞ or n = 1 and m → ∞. They prove central limit theorems for slightly different statistics than we consider from which one is related to one of our results on rescaled realized volatility for m = 1 in the parametric case. Interestingly, their proof is based on very different technical ingredients – instead of mixing theory for Gaussian time series they rather rely on tools from Malliavin calculus. 1.3. Methodology. The literature on statistics for high-frequency observations of semi-martingales suggests to use the sum of squared increments in order to estimate the volatility. While in the present infinite dimensional model this remains generally possible, there are some deep structural differences. We show for any fixed spatial point y ∈ (ymin , ymax ) the stochastic convergence n −y ϑ1 /ϑ2 Z T 1 X P e RVn (y) := √ σ 2 (t)dt for n → ∞, (∆i X)2 (y) → √ n ∆n i=1 ϑ2 π 0 where ∆i X(y) := Xi∆n (y) − X(i−1)∆n (y), ∆n = T /n, and supposing that σ is at least √ 1/2-H¨older regular. For a semi-martingale instead, squared increments are of order ∆n and not ∆n . We also see a dependence of the realized volatility on the spatial position y. Assuming ϑ1 , ϑ2 to be known and exploiting this convergence, the construction of a consistent method of moments estimator for the integrated squared volatility is obvious. However, the proof of a central limit theorem for such an estimator is non-standard due to several difficulties inherent in the model’s geometry. The method of moments can be extended from one to m spatial observations where m is a fixed integer as n → ∞ or in the double asymptotic regime where n → ∞ and m → ∞. It turns out that the realized volatilities (RVn (yj ))16j6m even de-correlate asymptotically when m n. We prove that √ the final estimator attains a parametric rate, i.e. a mn-rate, and satisfies a central limit theorem. Since we have negative auto-correlations between increments (∆i X(yj ))16i6n , a martingale central limit theorem cannot be applied. Instead, we prove that the sequence ((∆i X(y1 ), . . . , ∆i X(ym ))> )16i6n is ρ-mixing and apply a central limit theorem by Utev (1990). Introducing a quarticity estimator, we also provide a feasible version of the limit theorem that allows for the construction of confidence intervals. In view of the complex probabilistic model, our estimator is strikingly simple. Thanks to its explicit form that does not rely on optimization algorithms, the method can be easily implemented and is very fast. An M-estimation approach can be used to construct a joint estimator of the parameters in (1). While the volatility estimator is first constructed in the simplified parametric framework, by a careful decomposition of approximation errors we derive analogous asymptotic results for the semiparametric estimation of the integrated squared volatility under quite general assumptions. 1.4. Applications. Although (1) is a relatively simple SPDE model, it already covers many interesting applications. Let us give some examples: 1. The stochastic heat equation ∂Xt (y) ∂ 2 Xt (y) =α + σ(t) dBt (y) ∂t ∂y 2 with thermal diffusivity α > 0 is the prototypical example for parabolic SPDEs. The stochastic disturbance term σ(t)dBt models a random heat source in the system. 2. Going back to Frankignoul (1979), sea surface temperature anomalies T can be modeled by ∂T = D∆T − hv, ∇T i − λT + σ dBt , ∂t
4
BIBINGER AND TRABS F IGURE 1. A simulated random field solving the SPDE.
Notes: One simulated solution field of the SPDE at times i/n, i = 0, . . . , 1000 at space points j/m, j = 1, . . . , 10. Implementation and parameter details are given in Section 6. One marginal process for j = 1 is highlighted by the black line.
(with two space dimensions) where D is a diffusion coefficient, v is the advection velocity, λ is an atmosphere-ocean feedback parameter and the stochastic term is understood as atmospheric forcing, cf. Piterbarg and Ostrovskii (1997). 3. The cable equation is a basic PDE-model in neurobiology, cf. Tuckwell (2013). Modeling the nerve membrane potential, its stochastic version is given by cm
∂Vt (y) 1 ∂ 2 Vt (y) Vt (y) = − + σ(t) dBt (y), ∂t ri ∂y 2 rm
0 < y < l, t > 0,
where y is the length from the left endpoint of the cable with total length l > 0 and t is time. The parameters are the membrane capacity cm > 0, the resistance of the internal medium ri > 0 and the membrane resistance rm > 0. The noise terms represent the synaptic input. The first mathematical treatment of this equation goes back to Walsh (1986). 4. Our leading example is the term structure model by Cont (2005). It describes a yield curve as a random field, i.e. a continuous function of time t and time to maturity y ∈ [ymin , ymax ]. Differently than classical literature on time-continuous interest rate models in mathematical finance, including the landmark paper by Heath et al. (1992) and ensuing work, which focus on modeling arbitrage-free term structure evolutions under risk-neutral measures, Cont (2005) aims to describe real-world dynamics of yield curves. In particular, the model meets the main stylized facts of interest rate movements as mean reversion, humped term structure of volatility, smoothness in maturity and irregularity of evolutions over time. Thereto, the instantaneous forward rate curve (FRC) rt (y) is decomposed (2) rt (y) = rt (ymin ) + (rt (ymax ) − rt (ymin )) Y (y) + Xt (y) . When the short rate and the spread are factored out, the FRC is determined by a deterministic shape function Y (y) reflecting the average profile and the stochastic deformation process Xt (y)
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
5
modeled by the SPDE (3)
∂X (y) κ ∂ 2 X (y) t t + dt + σ(t, y) dBt (y), ∂y 2 ∂y 2 (t, y) ∈ R+ × [ymin , ymax ], Xt (ymin ) = Xt (ymax ) = 0 for t > 0, dXt (y) =
with a curvature parameter κ > 0. Figure 1 visualizes one generated discretization of Xt (y) with σ ≡ 1/4 and κ = 0.4 from our Monte Carlo simulations. Observations of the FRC (2) can either be reconstructed from observed swap rates or prices of future contracts are used. For instance, Bouchaud et al. (1999) show how to reconstruct discrete recordings of Xt (y) from observed prices of futures. There are more SPDE approaches to interest rates including Musiela (1993), Santa Clara and Sornette (2001), Bagchi and Kumar (2001) and Filipovi´c (2001). 1.5. Outline & Contribution. In Section 2, we present probabilistic preliminaries on the model and our precise theoretical setup and observation model. In Section 3, we start considering a parametric problem with σ 2 in (1) constant. The statistical properties of the discretely observed random field generated by the SPDE are examined and illustrated. We develop the method of moments estimator for the squared volatility and present the central limit theorem. In Section 4 we verify that the same estimator is suitable to estimate the integrated squared volatility in a nonparametric framework with time-dependent volatility. Theorem 4.2 presents the first main result: a central limit theorem for the semi-parametric estimation with parametric rate and a feasible version that facilitates confidence. Section 5 addresses a simultaneous estimation of ϑ and σ 2 based on a least squares approach. The second main result in Theorem 5.1 establishes joint asymptotic normality of this estimator with parametric rate. Section 6 discusses the simulation of the SPDE model and the implementation of the methods and confirms good finite-sample properties in a Monte Carlo study. Proofs are given in Section 7. The appendix reports an explicit bound for the ρ-mixing coefficient for multivariate Gaussian time series which is crucial for our proofs and which might be of independent interest. 2. Probabilistic structure and statistical model. Without loss of generality, we set ymin = 0 and ymax = 1. We apply the semi-group approach by Da Prato and Zabczyk (1992) to analyze the SPDE (1) which is associated to the differential operator Aϑ := ϑ0 + ϑ1
∂ ∂2 + ϑ2 2 ∂y ∂y
such that the SPDE (1) reads as dXt (y) = Aϑ Xt (y) dt + σ(t) dBt (y). The eigenfunctions ek of Aϑ with corresponding eigenvalues −λk are given by (4) (5)
√
ϑ1 2 sin πky exp − y , y ∈ [0, 1], 2ϑ2 ϑ2 λk = −ϑ0 + 1 + π 2 k 2 ϑ2 , k ∈ N. 4ϑ2
ek (y) =
This eigendecomposition takes a key role in our analysis and for the probabilistic properties of the model. Of particular importance is that λk increase proportionally with k 2 and that ek (y) decrease exponentially in y with an additional oscillating factor. Note that (ek )k>1 is an orthonormal basis of
6
BIBINGER AND TRABS
the Hilbert space Hϑ := f : [0, 1] → R : kf kϑ < ∞, f (0) = f (1) = 0 with Z 1 exϑ1 /ϑ2 f (x)g(x) dx and kf k2ϑ := hf, f iϑ , hf, giϑ := 0
which we choose as state space for the solutions of (1). In particular, we suppose that ξ ∈ Hϑ . Aϑ is self-adjoint on Hϑ . The cylindrical Brownian motion (Bt )t>0 in (1) can be defined via X hBt , f iϑ = hf, ek iϑ Wtk , f ∈ Hϑ , t > 0, k>1
for independent real-valued Brownian motions (Wtk )t>0 , k > 1. Xt (y) is called a mild solution of (1) on [0, T ] if it satisfies the integral equation for any t ∈ [0, T ] Z t tAϑ Xt = e ξ + e(t−s)Aϑ σ(t)dBt a.s. 0
Defining the coordinate processes xk (t) := hXt , ek iϑ , t > 0, for any k > 1, the random field Xt (y) can thus be represented as the infinite factor model Z t X −λk t (6) Xt (y) = xk (t)ek (y) with xk (t) = e hξ, ek iϑ + e−λk (t−s) σ(t)dWsk . 0
k>1
In other words, the coordinates xk satisfy the Ornstein-Uhlenbeck dynamics dxk (t) = −λk xk (t)dt + σ(t)dWtk , xk (0) = hξ, ek iϑ . R· There is a modification of the stochastic convolution 0 e(·−s)Aϑ σ(t)dBt which is continuous in time and space and thus we may assume that (t, y) 7→ Xt (y) is continuous, cf. Da Prato and Zabczyk (1992, Thm. 5.22). The neat representation (6) separates dependence on time and space and connects the SPDE model to stochastic processes in Hilbert spaces. This structure has been exploited in several works, especially for financial applications, see, for instance, Bagchi and Kumar (2001) and Schmidt (2006). We consider the following observation scheme.
(7)
A SSUMPTION 2.1 (Observations). grid (ti , yj ) ∈ [0, 1]2 with (8)
ti = i∆n
Suppose we observe a mild solution X of (1) on a discrete
for i = 0, . . . , n
and
δ 6 y1 < y2 < · · · < ym 6 1 − δ
where n, m ∈ N and δ > 0. Let m · maxj=2,...,m |yj − yj−1 | be uniformly bounded in m. We consider an infill asymptotics regime where n∆n = √ 1, ∆n → 0 as n → ∞ and m < ∞ is either fixed or it may diverge m → ∞ such that m(log m) ∆n → 0. While we have fixed the time horizon to T = 1, the subsequent results can be easily extended to any finite T . We write ∆i X(y) = Xi∆n (y) − X(i−1)∆n (y), 1 6 i 6 n, for the increments of the marginal processes in time and analogously for increments of other processes. The condition m = O(n1/2 ) implies that we provide asymptotic results for discretizations which are finer in time than in space. For instance in the case of an application to term structure data, m
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
7
n appears quite natural, since we can expect many observations over time of a moderate number of different maturities. Especially for high-frequency intra-day data, for instance, from futures on government bonds, we typically have at hand several thousands intra-day price recordings per day, but usually at most 10 frequently traded different contract lengths. We impose the following mild regularity conditions on the initial value of the SPDE (1). A SSUMPTION 2.2.
In (1) we assume that
(i) either E[hξ, ek i] = 0 for all k > 1 and supk λk E[hξ, ek i2 ] < ∞ holds true or 1/2 E[kAϑ ξk2ϑ ] < ∞; (ii) (hξ, ek iϑ )k>1 are independent. This assumption is especially satisfied if ξ is distributed according to the stationary distribution 1/2 of (1), where hξ, ek i are independently N (0, σ 2 /λk )-distributed. Note also that E[kAϑ ξk2ϑ ] < ∞ 1/2 implies supk λk E[hξ, ek i2 ] = supk E[hAϑ ξ, ek i2 ] < ∞. Assuming independence of the sequence (hξ, ek iϑ )k>1 is a convenient condition for an analysis of the variance-covariance structure of our estimator, but can be replaced by other moment-type conditions on the coefficients of the initial value also. 3. Estimation of the volatility parameter. In this section, we consider the parametric problem where σ is a constant volatility parameter. Suppose first that we want to estimate σ 2 based on n → ∞ discrete recordings of Xt (y), along only one spatial observation y ∈ (δ, 1 − δ), δ > 0, m = 1. Let us assume first that ϑ is known. We thus know (λk , ek )k>1 explicitly from (5). In many applications, ϑ will be given in the model or reliable estimates could be available from historical data. Nevertheless, we address the parameter estimation of ϑ in Section 5. Volatility estimation relies typically on squared increments of the observed process. We develop a method of moment estimator here utilizing squared increments (∆i X)2 (y), 1 6 i 6 n, also. However, the behavior of the latter is quite different compared to the standard setup of high-frequency observations of semi-martingales. Though we do not observe the factor coordinate processes in (6), understanding the dynamics of their increments is an important ingredient of the analysis of (∆i X)2 (y), 1 6 i 6 n. The increments of the Ornstein-Uhlenbeck processes from (6) can be decomposed as −λk i∆n
∆i xk = hξ, ek iϑ e |
−e {z
=Ai,k
−λk (i−1)∆n
}
Z
(i−1)∆n
+ 0
|
=Bi,k
Z (9)
σe−λk ((i−1)∆n −s) (e−λk ∆n − 1) dWsk {z } i∆n
+ (i−1)∆n
|
σe−λk (i∆n −s) dWsk . {z } =Ci,k
For a fixed finite intensity parameter λk , observing ∆i xk corresponds to the classical semi-martingale framework. In this case, Ai,k , Bi,k are negligible and the exponential term in Ci,k is close to one, such that σ 2 is estimated via the sum of squared increments. Convergence to the quadratic variation when n → ∞ implies consistency of this estimator, well-known as the realized volatility. It is important that our infinite factor model of X includes an infinite sum involving (∆i xk )k>1 with increasing λk ∝ k 2 . In this situation the terms Bi,k are not negligible and induce a quite distinct behavior of the increments. The larger k, the stronger the mean reversion effect and the smaller the
8
BIBINGER AND TRABS F IGURE 2. Realized volatilities and autocorrelations of returns.
−1/2 Pn 2 Notes: In the left plot, the dark points show averages of realized volatilities n−1 ∆n i=1 (∆i X) (yj ) for yj = j/9, j = 1, . . . , 9 based on 1,000 Monte Carlo replications with n = 1, 000 and a cut-off frequency K = 10, 000. The light dotted line depicts the function (πϑ2 )−1/2 exp(−yϑ1 /ϑ2 )σ 2 , σ = 1/4, ϑ0 = 0, ϑ1 = 1, ϑ2 = 1/2. The right plot shows empirical autocorrelations from one of these Monte Carlo iterations for y = 8/9. The theoretical decay of autocorrelations for lags |j − i| = 1, . . . , 15 is added by the dashed line.
variance of the Ornstein-Uhlenbeck coordinate processes, such that higher addends have decreasing influence in (6). In simulations, for instance in Figure 2, Xt (y) is generated summing over 1 6 k 6 K up to some sufficiently large cut-off frequency K. 1/2 An important consequence is that the squared increments (∆i X(y))2 are of order ∆n , while for k k ) induce the order ∆ of second moments. Hence, semi-martingales the terms σ(W(i+1)∆ − Wi∆ n n n the aggregation of the coefficient processes leads to a rougher path t 7→ Xt (y). P ROPOSITION 3.1. (10)
On Assumptions 2.1 and 2.2, we have uniformly in y ∈ [δ, 1 − δ] that σ2 E (∆i X)2 (y) = ∆n1/2 e−y ϑ1 /ϑ2 √ + ri + O ∆n3/2 ϑ2 π
with terms ri that become negligible when summing all squared increments: h E
1 1/2
n∆n
n X
i σ2 + O ∆n . (∆i X)2 (y) = e−y ϑ1 /ϑ2 √ ϑ2 π i=1
At first, the expressions in (10) hinge on the time point i∆n , but this dependence becomes asymptotically negligible at first order when summing all squared increments. Moreover, the first order expectation of squared increments shows no oscillating term in y, only the exponential decay. This crucially simplifies the structure. The dependence on the spatial coordinate y is visualized in Figure 2. In view of the complex model, our asymptotic analysis brings forth a surprisingly simple consistent estimator for the squared volatility (11)
σ ˆy2
√ n π ϑ2 X = √ (∆i X)2 (y) exp(y ϑ1 /ϑ2 ) . n ∆n i=1
In particular, an average of rescaled squared increments (a rescaled realized volatility) depends on y (only) via the multiplication with exp(y ϑ1 /ϑ2 ). Also, the laws of (∆i X)2 (y) depend on i, but we
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
9
derive a consistent estimator (11) as a non-weighted average over all time instants. The negligibility of the terms ri in (10) renders this nice and simple structure. Note that the estimator σ ˆy2 depends on ϑ. The case of unknown ϑ is discussed in Section 5. Assessing the asymptotic distribution of the volatility estimator, however, requires a more careful handling of the higher order terms. Especially, we cannot expect that the increments (∆i X)2 (y), 1 6 i 6 n, are uncorrelated. P ROPOSITION 3.2. On Assumptions 2.1 and 2.2, the covariances of increments (∆i X(y))i=1,...,n satisfy uniformly in y ∈ [δ, 1 − δ] for |j − i| > 1: (12) Cov ∆i X(y), ∆j X(y) −y ϑ1 /ϑ2 p p p 2e √ = −∆1/2 2 |j − i| − |j − i| − 1 − |j − i| + 1 + ri,j + O(∆3/2 n σ n ), 2 ϑ2 π P where the remainder terms ri,j are negligible in the sense that ni,j=1 ri,j = O(1). While standard semi-martingale models lead to discrete increments that are (almost) uncorrelated (martingale increments are uncorrelated), (12) implies negative autocovariances. This fact highlights another crucial difference between our discretized SPDE model and classical theory from the statistics for semi-martingales literature. We witness a strong negative correlation of consecutive returns for |j − i| = 1. Autocorrelations decay proportional to |j − i|−3/2 with increasing lags, see Figure 2. Covariances hinge on y at first order only via the exponential factor exp(−y ϑ1 /ϑ2 ), such that autocorrelations do not depend on √ the spatial coordinate y. For lag |j − i| = 1, the theoretical autocorrelation given in Figure 2 is ( 2 − 2)/2 ≈ −0.292, while for lag 2 and lag 3 we have factors of approximately −0.048 and −0.025. For lag 10 the factor decreases to −0.004. Also the acf from simulations in Figure 2 shows this fast decay of serial correlations. Due to the correlation structure of the increments (∆i X)16i6n , martingale central limit theorems as typically used in the volatility literature do not apply. Instead, we shall exploit a central limit theorem for ρ-mixing triangular arrays by Utev (1990). The ρ-mixing coefficients are suprema of autocorrelations, see (5) on page 3 of Doukhan (1994). Let, for the moment, ξ be distributed according to the stationary distribution, i.e. the independent coefficients satisfy hξ, ek i ∼ N (0, σ 2 /λk ). Extending (W k )k>1 to Brownian motions on the whole real line for each k defined on a suitable extension of the probability space, we then have the representation Z (i−1)∆n ˜i,k , B ˜i,k = (13) ∆i xk = Ci,k + B σ(e−λk ∆n − 1)e−λk ((i−1)∆n −s) dWsk , −∞
P
and the sequence ∆i X(y) = k>1 ∆i xk ek (y) is centered, Gaussian and stationary. In particular, covariance stationarity of a Gaussian sequence also implies stationarity in the strong sense. For such processes, a profound theory on mixing, including sufficient and necessary conditions, is available dating back to Kolmogorov and Rozanov (1961). In particular, the polynomial decay of the covariances as seen in Proposition 3.2 allows for establishing ρ-mixing for the sequence of increments. P ROPOSITION 3.3. Let ξ be distributed according to the stationary distribution, i.e., hξ, ek i ∼ 1/4 N (0, σ 2 /λk ). Then the Gaussian sequence (∆i X(y)/∆n )i>1 is stationary, centered and ρ-mixing (equivalently strong mixing) with mixing coefficients ρ(l) = O(l−ς ) for any ς < 1/2 and for sufficiently large n.
10
BIBINGER AND TRABS
The condition on ξ will not be important, since under Assumption 2.2 we can approximate the sequence (∆i X(y))i>1 by the stationary version up to a negligible remainder. We prove in Proposition 7.3 that Var(ˆ σy2 ) ∝ ∆n , such that the estimator attains a usual parametric convergence rate. With Proposition 3.3, we establish the following central limit theorem for estimator (11). T HEOREM 3.4. On Assumptions 2.1 and 2.2, for any y ∈ [δ, 1 − δ] the estimator (11) obeys as n → ∞ the central limit theorem d (14) n1/2 σ ˆy2 − σ 2 → N 0, πΓσ 4 , with Gaussian limit law where Γ ≈ 0.75 is a numerical constant analytically given in (43). (14) takes a very simple form. In fact, we prove in Lemma 7.2 that two bias terms of order one in the left-hand side cancel each other leading to the simple asymptotic limit distribution. Consider next an estimation of the volatility when we have at hand observations (∆i X)2 (y), 1 6 i 6 n, along several spatial values y1 , . . . , ym . Proposition 7.3 shows that the covariances Cov(ˆ σy21 , σ ˆy22 ) 2 vanish asymptotically for y1 6= y2 as long as m ∆n → 0. Therefore, estimators in different maturities de-correlate asymptotically. Since the realized volatilities hinge on y at first order only via the factor exp(−2y), the first-order variances of the rescaled realized volatilities do not depend on y ∈ (0, 1). We thus define the volatility estimator (15)
2 σ ˆn,m :=
√ m m n 1 X 2 π ϑ2 X X √ σ ˆ yj = (∆i X)2 (yj ) exp(yj ϑ1 /ϑ2 ) . m m n ∆n j=1 i=1 j=1
T HEOREM 3.5. On Assumptions 2.1 and 2.2 with a sequence m = mn satisfying mn 6 nρ for some ρ ∈ (0, 1/2), the estimator (15) obeys as n → ∞ the central limit theorem (16)
d 2 (mn · n)1/2 σ ˆn,m − σ 2 → N 0, πΓσ 4 , n
with Gaussian limit law where Γ ≈ 0.75 is a numerical constant analytically given in (43). While the general proof strategy is the same as for Theorem 3.4, an additional problem arises from the fact that we need an explicit bound for the mixing coefficients of the Gaussian vectors (∆i X(y1 ), . . . , ∆i X(ym ))> ∈ Rm . In particular, we require a sharp estimate with respect to the dimension m which may diverge to infinity in our asymptotics. To this end, we prove a multivariate generalization of Theorem 1 by Ibragimov and Rozanov (1978, Chapter V) in the appendix, cf. Theorem A.1. 2 Since mn is the total number of observations of the random field X on the grid, the estimator σ ˆn,m √ achieves the parametric rate mn as for independent observations if the discretizations are finer in √ time than in space, i.e. m = O( n). The constant πΓ ≈ 2.357 in the asymptotic variance is not too far from 2 which is clearly a lower bound, since 2σ 4 is the Cram´er-Rao lower bound for estimating the variance σ 2 from i.i.d. standard normals. In fact, the term (πΓ − 2)σ 4 is the part of the variance induced by non-negligible covariances of squared increments. In order to use the previous central limit theorem to construct confidence sets or tests for the volatility, let us also provide a normalized version applying a quarticity estimator.
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
P ROPOSITION 3.6.
On Assumptions 2.1 and 2.2 the quarticity estimator 4 σ ˜n,m =
(17)
11
m n ϑ2 π X X (∆i X)4 (yj )e2yj ϑ1 /ϑ2 , 3m j=1 i=1
P
4 satisfies σ ˜n,m → σ 4 . In particular, we have under the assumptions of Theorem 3.5
(18)
d 4 2 (mn · n)1/2 (πΓ˜ σn,m )−1/2 σ ˆn,m − σ 2 → N 0, 1 . n n
4. Semi-parametric estimation of the integrated volatility. Especially in the finance and econometrics literature, heterogeneous volatility models with dynamic time-varying volatility are of particular interest. In fact, different in nature than a maximum likelihood approach, our moment estimators (11) and (15) serve as semi-parametric estimators in the general time-varying framework. In the sequel, we consider (σs )s∈[0,1] a time-varying volatility function. A SSUMPTION 4.1. Assume (σs )s∈[0,1] is a strictly positive function that is α-H¨older regular with H¨older index α ∈ (1/2, 1], that is, σt+s − σt 6 C sα , (19) for all 0 ≤ t < t + s ≤ 1, and some positive constant C. R1 The integrated squared volatility 0 σs2 ds aggregates the overall variability of the model and is one standard measure in econometrics and finance to quantify financial risk. For volatility estimation of a semi-martingale, the properties of the realized volatility RV in a nonparametric setting with time√ varying volatility are quite close to the parametric case. In fact, the central limit theorem n(RV − R1 R1 √ d d σ 2 ) → N (0, 2σ 4 ) extends to n(RV − 0 σs2 ds) → N (0, 2 0 σs4 ds) under very mild (smoothness) conditions on (σs )s∈[0,1] . In our model, however, a simple local constant approximation of (σs )s∈[0,1] is not sufficient due to the non-neglibility of the stochastic integrals over the whole past Bi,k in (9). Nevertheless, under R 1Assumption 4.1, we prove that our estimators consistently estimate the integrated squared volatility 0 σs2 ds and satisfy similar central limit theorems as above. We keep to the notation for the estimators motivated by the parametric model. T HEOREM 4.2. On Assumptions 2.1, 2.2 and 4.1, for any y ∈ [δ, 1 − δ], the estimator (11) obeys as n → ∞ the central limit theorem Z 1 Z 1 d (20) n1/2 σ ˆy2 − σs2 ds → N 0, πΓ σs4 ds . 0
0
nρ
Let m = mn be a sequence satisfying mn 6 for some ρ ∈ (0, 1/2). The estimator (15) obeys the central limit theorem Z 1 Z 1 d 2 (21) (m · n)1/2 σ ˆn,m − σs2 ds → N 0, πΓ σs4 ds . 0
0
R1
P
4 With the quarticity estimator from (17), satisfying σ ˜n,m → 0 σs4 ds, we obtain the normalized version Z 1 d 1/2 4 −1/2 2 (22) (m · n) (πΓ˜ σn,m ) σ ˆn,m − σs2 ds → N 0, 1 , 0
that provides confidence intervals.
12
BIBINGER AND TRABS
Let us mention that, using only observations in a neighborhood of a fixed time t0 , our method of moments can also be used to construct a non-parametric estimator for the function σ at t0 , but this is beyond the scope of the present paper. Extending the theory to semi-martingale stochastic volatility models with leverage would require a substitute for Jacod’s stable central limit theorem, cf. Jacod (1997, Theorem 3–1), the latter being typically exploited in the literature on high-frequency statistics. While building upon our first asymptotic results for the SPDE model, we expect that this extension requires a further deep study, since Jacod (1997, Theorem 3–1) generalizes martingale limit theorems and here we rather need a generalization of the limit theorem under ρ-mixing by Utev (1990). 5. Calibrating more general models. 5.1. Estimation of the parameters in the differential operator. Concerning the estimation of the whole vector ϑ = (ϑ0 , ϑ1 , ϑ2 ), we have to start with the following restricting observation: For any fixed m the process Z := (Xt (y1 ), . . . , Xt (ym ))t∈[0,1] is a multivariate Gaussian process with continuous trajectories (for a modification), cf. Da Prato and Zabczyk (1992, Thm. 5.22). Consequently, its probability distribution is characterized by the mean and the covariance operator. It is then well known from the classical theory for statistics for stochastic processes, that even with continuous observations of the process Z, the underlying drift parameter cannot be consistently estimated within a fixed time horizon. Therefore, ϑ0 is not identifiable in our high-frequency observation model. Any estimator of ϑ should only rely on the covariance operator. Proposition 3.2 reveals that for ∆n → 0 the latter depends on p σ02 := σ 2 / ϑ2 and κ := ϑ1 /ϑ2 . If both σ and ϑ are unknown, the normalized volatility parameter σ02 and the curvature parameter κ thus seem the only identifiable parameters based on high-frequency observations over a fixed time horizon. This allows, for instance, the complete calibration of the parameters in the stochastic heat equation or in the yield curve model (3). In order to estimate κ, we require observations in at least two distinct spatial points y1 and y2 . The behavior of the estimates of the realized volatilities along different y, as seen in (10) and Figure 2, motivate the following curvature estimator Pn Pn 2 2 log i=1 (∆i X) (y1 ) − log i=1 (∆i X) (y2 ) κ ˜= (23) . y2 − y1 √ The previous analysis and the delta method facilitate a central limit theorem with n-convergence rate for this estimator κ. ˜ It is however not clear how to generalize this direct approach to more than two spatial observations. In particular, a linear combination along m → ∞ spatial point does not attain a faster rate of convergence. In the general case we propose a least squares approach. In view of (10) we rewrite n 1 X Zj := √ (∆i X)2 (yj ) = fσ02 ,κ (yj ) + δn,j n ∆n i=1
s for fs,k (y) := √ e−ky , π
where δn,j are random variables satisfying E[δn,j ] = O(∆n ) and
Cov(δn,j , δn,k ) = 1{j=k} ∆n Γσ04 e−2κyj + O ∆n3/2 (δ −1 + |yj − yk |−1 ) ,
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
13
cf. Proposition 7.3. In particular, δn,j and δn,k de-correlate for j 6= k as ∆n → 0. We define (24)
(ˆ σ02 , κ) ˆ := arg min s,k
m X
2 Zj − fs,k (yj ) .
j=1
This M-estimator is a classical least squares estimator for parametric regression with the parameter η = (σ02 , κ) and with non-standard observation errors δn,j . Combining the classical theory on minimum contrast estimators with our analysis of the random variables (δn,j ) yields the following limit theorem. For simplicity we suppose that σ02 and κ belong to a compact parameter set. T HEOREM 5.1. Grant Assumptions 2.1 and 2.2 and let m = mn be a fixed sequence satisfying mn → ∞ and mn 6 nρ for some ρ ∈ (0, 1/2). Let η = (σ02 , κ) ∈ Ξ for some compact subset Ξ ⊆ (0, ∞) × [0, ∞). Then the estimators (ˆ σ02 , κ) ˆ from (24) satisfy the central limit theorem d (25) (mn · n)1/2 (ˆ σ02 , κ) ˆ > − (σ02 , κ)> → N 0, σ04 Γπ V (η)−1 U (η)V (η)−1 with strictly positive definite matrices (26)
U (η) :=
(27)
V (η) :=
! R R 1−δ −4κy 2 1−δ ye−4κy dy e dy −σ 0R δ δR , 1−δ 1−δ −σ02 δ ye−4κy dy σ04 δ y 2 e−4κy dy ! R 1−δ R 1−δ −2κy dy −σ02 δ ye−2κy dy δR e R 1−δ . 1−δ −σ02 δ ye−2κy dy σ04 δ y 2 e−2κy dy
It can be easily deduced m > 2 R 1−δ from the proof that we obtain an analogous result for 1any Pfixed m where the integrals δ h(y)dy, for a generic function h, have to be replaced by m h(y j ). In j=1 particular, the Cauchy-Schwarz inequality shows that the determinant of V (η) is non-zero and thus V is invertible. Above we have focused on a relatively small number of spatial observations compared to the number of time points. Assuming we have a high number of spatial observations, we may use that the eigenfunction from (4) and the considered scalar product h·, ·iϑ depend on ϑ only via κ. Hence, we can introduce the estimated eigenfunctions √ eˆk (y) = 2 sin πky e−ˆκy/2 , y ∈ [0, 1], k > 1, and approximate the corresponding coordinate processes by m
1 X x ˆk (t) := Xt (yj )ˆ ek (yj )eκˆ yj . m j=1
Using x ˆk as noisy observations of the true coefficient processes xk , we then can apply the maximum likelihood approach from H¨ubner et al. (1993) to separately estimate ϑ1 , ϑ2 and σ 2 , but even then ϑ0 is not identifiable which follows from H¨ubner and Rozovski˘ı (1995). Note also that the approximation x ˆk will only be accurate for a very high number of spatial observations, i.e. m n.
14
BIBINGER AND TRABS
5.2. Boundary conditions. Finally, let us briefly discuss the effect of different boundary conditions. In the case of a non-zero Dirichlet boundary condition our SPDE model reads as ( dXt (y) = (Aϑ Xt (y))dt + σ(t)dBt (y) in [0, 1], X0 = ξ, Xt (0) = g(0), Xt (1) = g(1), ¯ t := where we may assume that the function g is twice differentiable with respect to y. Then X (Xt − g) ∈ Hϑ is a mild solution of the zero boundary problem ( ¯ t = (Aϑ X ¯ t )dt − (Aϑ g)dt + σ(t)dBt dX in [0, 1], ¯ ¯ ¯ X0 = ξ − g, Xt (0) = 0, Xt (1) = 0. Hence, ¯ X(t) = etAϑ (ξ − g) +
Z
t
e
(t−s)Aϑ
Z Aϑ g dt +
t
e(t−s)Aϑ σ(t)dBt
a.s.
0
0
In analysis we would have to take into account the additional deterministic and known term R t our (t−s)Aϑ A g dt, which is possible with only minor modifications. e ϑ 0 Especially for the cable equation, Neumann’s boundary condition is also of interest and corresponds ∂ ∂ to sealed ends of the cable. In this case we have ∂y Xt (0) = ∂y Xt (1) = 0 and ϑ1 = 0, such that we only need to replace the eigenfunction (ek ) from (5) by √ e¯k (y) = 2 cos(πky). We expect that our estimator achieves the same results in this case using an approximation based on the equality cos2 (z) = 21 + cos(2z) instead of sin2 (z) = 21 − cos(2z). 6. Simulations. Since our SPDE model leads to Ornstein-Uhlenbeck coordinates with decayrates λk ∝ k 2 , k > 1 going to infinity, an exact simulation of the Ornstein-Uhlenbeck processes turns out to be crucial. In particular, an Euler-Maruyama scheme is not suitable to simulate this model. For Rt some solution process x(t) = x(0)e−λt + σ 0 e−λ(t−s) dWs with fix volatility σ > 0 and decay-rate λ > 0, increments can be written Z t+τ e−λ(t+τ −s) dWs , x(t + τ ) = x(t)e−λτ + σ t
with variances σ 2 (1 − exp(−2λτ )). Let ∆N be a grid mesh with N = ∆−1 N ∈ N. Given x(0), the exact simulation iterates r 1 − exp(−2λ∆N ) x(t + ∆N ) = x(t)e−λ∆N + σ Nt , 2λ at times t = 0, ∆N , . . . , N − 1 with i.i.d. standard normal random variables Nt . We can set N = n to simulate the discretizations of K independent Ornstein-Uhlenbeck processes with decay-rates given by the eigenvalues (5), k = 1, . . . , K. The cut-off frequency K needs to be sufficiently large. In the following, we take K = 10, 000 that turned out to be a suitable choice to allow for very precise results at reasonable computational costs. When K is set too small, we witness an additional negative bias in the estimates due to the deviation of the simulated to the theoretical model. For a spatial grid yj , j = 1, . . . , m, yj ∈ [0, 1], we obtain the simulated observations of the field by Xi/n (yj ) =
K X k=1
xk (i/n) ek (yj ) ,
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
15
F IGURE 3. Monte Carlo MSEs of σ ˆ02 and κ ˆ.
Notes: Monte Carlo empirical mean squared errors of least squares estimates from (24) σ ˆ02 (left) and κ ˆ (right) for different sample sizes n = i · 100, m = i · 10, i = 1, . . . , 10.
using the eigenfunctions from (4). We implement the parametric model (3) with κ = 2/5 and σ = 1/4, or (1) with √ ϑ0 = 0, ϑ1 = 1 and ϑ2 = 1/5, respectively. The curvature parameter is thus κ = 5 and σ02 = 5/16. We generate discrete observations equi-spaced in time and space with yj = j/(m + 1), j = 1, . . . , m. The least squares estimator (24) can be computed using√the RP function nls, see Fox and Weisberg (2000). More precisely, with rv denoting the vector ( ∆n ni=1 (∆i X)2 (yj ))16j6m and y the vector of spatial coordinates, the command ls n, when Assumption 2.1 is apparently violated, the ratios slowly increase with the strongest increase for 2 . The plot indicates heuristically that when m > √n, the correct rate will not be √mn. estimator σ ˆn,m Moreover, it shows how fast the ratio increases with a growing number of spatial observations. The right-hand plot of Figure 4 compares the variances of the two curvature estimators κ ˆ from (24) and κ ˜ from (23). For a small number of spatial observations, κ ˜ outperforms the least squares estimator,
16
BIBINGER AND TRABS F IGURE 4. Scaled variances of estimators and comparison of curvature estimators.
Notes: Ratios of Monte Carlo and theoretical asymptotic variances and covariance of the estimators (15) and (24) for n = 1, 000 and m = i · 10 − 1, i = 1, . . . , 10 (left). The lines σ, σ0 , κ0 and cov refer to the variance of estimator (15) and the variances and covariance of the least squares estimator (24). Ratio of variances of estimator κ ˆ from (24) and κ ˜ from (23) (right).
√ √ even though the convergence rate n is slower than the rate mn of the least squares estimator. This illustrates that estimating the curvature based on the shape of the curve in Figure 2 works particularly well. However, in case of a larger number of spatial coordinates, the improved rate facilitates a more efficient estimation based on the least squares approach. Finally, we investigate the semi-parametric estimation of the integrated squared volatility in the nonparametric model with a time-varying volatility function. We consider σt = 1 − 0.2 sin
(28)
3 4 π t),
t ∈ [0, 1] .
This is a smooth deterministic volatility that mimics the typical intra-day pattern with a strong decrease at the beginning and a moderate increase at the end of the interval. We analyze the finite-sample accuracy of the feasible central limit theorem (22). To this end, we rescale the estimation errors of estimator (15) with the square root of the estimated integrated quarticity, based on estimator (17), times √ √ Γπ. We compare these statistics rescaled with the rate mn from 3,000 Monte Carlo iterations to the standard Gaussian limit distribution in the Q–Q plots in Figure 5 for n = 1, 000 and m = 1 and m = 9, respectively. The plots confirm a high finite-sample accuracy. In particular, the Monte Carlo empirical variances are very close to the ones predicted by the rate and the variance of the limit law as long as m is of moderate size compared to n. 7. Proofs. 7.1. Proof of Proposition 3.1. Before we prove the formula for the expected value of the squared increments, we need some auxiliary lemmas. L EMMA 7.1. that (29)
On Assumption 2.2, there is a sequence (ri )ni=1 satisfying
E[(∆i X)2 (y)] = σ 2
X 1 − e−λk ∆n k>1
λk
1−
Pn
i=1 ri
1/2
= O(∆n ) such
1 − e−λk ∆n −2λk ti−1 2 e ek (y) + ri . 2
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
17
F IGURE 5. Q–Q plots for the feasible central limit theorem (22).
Notes: Monte Carlo rescaled estimation errors of the statistics in the left-hand side of (22) compared to a standard normal distribution, for n = 1, 000 and m = 1 (left) and m = 9 equi-spaced spatial observations (right).
P ROOF. Recall Xt (y) = E[Bi,k Ci,k ] = 0, as well as
P
k>1 xk (t)ek (y)
and (9). It holds that E[Ai,k Bi,k ] = E[Ai,k Ci,k ] =
E[A2i,k ] = E hξ, ek i2ϑ e−2λk (i−1)∆n (e−λk ∆n − 1)2 ,
(30a)
2 E[Bi,k ]
(i−1)∆n
Z =
σ 2 e−2λk ((i−1)∆n −s) (e−λk ∆n − 1)2 ds
0
= σ 2 (e−λk ∆n − 1)2
(30b)
2 ] E[Ci,k
(30c)
Z
i∆n
1 − e−2λk (i−1)∆n , 2λk
σ 2 e−2λk (i∆n −s) ds =
= (i−1)∆n
1 − e−2λk ∆n 2 σ . 2λk
Since (Bi,k + Ci,k )k , k > 1, are independent and centered, we obtain X E[(∆i X)2 (y)] = E[∆i xk ∆i xl ]ek (y)el (y) k,l>1
= σ2
X 1 − e−2λk ∆n k>1
= σ2
2λk
X 1 − e−2λk ∆n k>1
= σ2
X 1 − e−λk ∆n k>1
λk
1 (1 − e−2λk ti−1 ) e2k (y) + ri 2λk 2 2 + 1 − e−λk ∆n e−λk ∆n − 1 −2λk ti−1 2 − e ek (y) + ri 2λk 2λk + 1 − e−λk ∆n
1−
2
1 − e−λk ∆n −2λk ti−1 2 e ek (y) + ri , 2
where the remainder ri hinges on ξ and i: X 1 − e−λk ∆n 1 − e−λl ∆n e−(λk +λl )ti−1 E[hξ, ek iϑ hξ, el iϑ ]ek (y)el (y) . ri = k,l>1
18
BIBINGER AND TRABS
If on Assumption 2.2 E[hξ, ek iϑ ] = 0 and E[hξ, ek i2ϑ ] 6 C/λk with some C < ∞, then we have that X 1 − e−λk ∆n 2 ri 6 C e−2λk ti−1 e2k (y) λk k>1
by independence of the coefficients. For the alternative condition from Assumption 2.2, we use that 1/2 1/2 Aϑ is self-adjoint on Hϑ , such that −λk hξ, ek iϑ = hAϑ ξ, ek iϑ and thus h X 1 − e−λk ∆n 2 i 1/2 −λk ti−1 e ek (y)hAϑ ξ, ek iϑ ri = E . 1/2 λk k>1 The Cauchy-Schwarz inequality and Parseval’s identity yield ri 6
X (1 − e−λk ∆n )2 λk
k>1
hX i 1/2 hAϑ ξ, ek i2ϑ e−2λk ti−1 e2k (y)E k>1
X (1 − 1/2 6 2E kAϑ ξk2ϑ k>1
e−λk ∆n )2 λk
e−2λk (i−1)∆n ,
1/2 where E kAϑ ξk2ϑ is finite by assumption. Applying a Riemann sum approximation yields n X (1 − e−λk ∆n )2 X k>1
X (1 − e−λk ∆n )2 λk λk (1 − e−2λk ∆n ) i=1 k>1 Z ∞ 2 X (1 − e−λk ∆n ) 1 − e−z 1/2 6 ∆n 6 ∆n dz + O(∆1/2 n ), λk ∆n z2 0 e−2λk (i−1)∆n 6
k>1
such that
n X
ri = O(∆n1/2 ) .
i=1
On Assumption 2.2 it holds uniformly for all y ∈ [δ, 1 − δ], δ > 0, that 2 p σ 2 e−y ϑ1 /ϑ2 Z ∞ 1 − e−z 2 1 − e−z −2z 2 (i−1) 2 √ dz + ri + O(δ −2 ∆3/2 E[(∆i X) (y)] = ∆n 1− e n ), z2 2 π ϑ2 0 L EMMA 7.2.
i ∈ {1, . . . , n}, where ri is the sequence from Lemma 7.1. P ROOF. Inserting the eigenfunctions ek (y) from (5) and setting ti−1 = (i − 1)∆n , Lemma 7.1 gives for i ∈ {1, . . . , n} E[(∆i X)2 (y)] = 2σ 2
X 1 − e−λk ∆n k>1
λk
1−
1 − e−λk ∆n −2λk (i−1)∆n 2 e sin (πky)e−y ϑ1 /ϑ2 + ri . 2
Applying the identity sin2 (z) = 1/2(1 − cos(2z)), we decompose the sum into three parts, which will be bounded separately: √ (31) E[(∆i X)2 (y)] = σ 2 ∆n (I1 + I2 (i) + R(i))e−y ϑ1 /ϑ2 + ri ,
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
19
where I1 :=
X 1 − e−λk ∆n p , ∆n λk ∆n k>1
X 1 − e−λk ∆n p I2 (i) := ∆n 2λk ∆n
2
e−2λk (i−1)∆n ,
k>1
X 1 − e−λk ∆n p 1 − e−λk ∆n −2λk (i−1)∆n R(i) := − ∆n 1− e cos(2πky) . ∆n λk 2 k>1
The terms I1 and I2 (i) can be approximated by integrals, determining the size of the expected squared increments. The remainder terms R(i) from approximating sin2 turn out to be negligible. We begin with analyzing I1 . We perform a substitution with z 2 = k 2 ∆n to approximate the sum by an integral, such that ϑ2 1 − ϑ0 . λ k ∆n = z 2 π 2 ϑ2 + ∆ n 4ϑ2 A Taylor expansion of the function (1 − exp(−z 2 ))/z 2 and using that the derivative is in L1 ([0, ∞)) 3/2 shows that the remainder of setting λk ∆n ≈ π 2 z 2 ϑ2 is of order ∆n . A Riemann midpoint approxi1/2 mation yields for the grid ak := ∆n (k + 1/2), k > 0, and the function 2
2
(1 − e−π ϑ2 z ) , f (z) = π 2 ϑ2 z 2
z > 0,
that for intermediate points ξk ∈ [ak−1 , ak ] Z ∞ X Z ak p I1 − √ f (z) dz = f (k ∆n ) − f (z) dz + O(∆n ) ∆n 2
k>1
ak−1
XZ
ak
2 1 p k ∆n − y f 00 (ξk ) dy + O(∆n ) 2 k>1 ak−1 Z ∞ = O ∆n |f 00 (y)| dy + O(∆n ). =−
0
Noting that f 00 ∈ L1 ([0, ∞)), we conclude that √
Z I1 =
∞ √
∆n 2
∞
Z
Z f (z) dz −
f (z) dz + O(∆n ) = 0
∆n 2
f (z) dz + O(∆n ).
0
The subtracted integral over the asymptotically small vicinity close to zero can be approximated by √ f (0) ∆n /2 + O(∆n ) and thus Z (32)
∞
I1 = 0
2
2
(1 − e−π ϑ2 z ) dz − π 2 ϑ2 z 2
√
∆n + O(∆n ). 2
For the second i-dependent term I2 (i) in (31), we proceed analogously with the function g(z; i) =
(1 − e−π
2ϑ
2
)2 e−2π 4π 2 ϑ2 z 2 2z
2ϑ
2z
2 (i−1)
,
z > 0,
20
BIBINGER AND TRABS
for i ∈ N. Since limz→0 g(z; i) = 0, we obtain that Z ∞ (33) I2 (i) = g(z; i) dz + O(∆n ) . 0
A substitution in the integrals√I1 and I2 (i) yields the asserted result provided the remainder term R(i) is negligible up to the term 21 ∆n . To show this, with Re denoting the real part of complex arguments we rewrite p X p R(i) = − Re ∆n hi ∆n k e2πiky k>1
with
2
2
2
2
2
2
1 − e−π ϑ2 z (1 − e−π ϑ2 z )e−2π ϑ2 z (i−1) 1 − , z > 0. π 2 ϑ2 z 2 2 Using (ak )k>0 from above, we derive that √ Z ak √ √ ∆n 2πiak y/√∆n 2πiuy/ ∆n du = − e2πiak−1 y/ ∆n e e 2πiy ak−1 √ ∆n πiy = e − e−πiy e2πiky 2πiy sin(πy) p = ∆n e2πiky . πy hi (z) =
In terms of the Fourier transform F, we thus obtain Z ak πy X √ R(i) = − Re hi (∆n1/2 k) e2πiuy/ ∆n du sin(πy) ak−1 k>1 πy hX i = − Re F hi (∆n1/2 k)1(ak−1 ,ak ] (u) (2πy∆−1/2 ) n sin(πy) k>1
= R1 (i) + R2 (i), where hX i πy F hi (∆n1/2 k)1(ak−1 ,ak ] (u) − hi (u)1(a0 ,∞) (2πy∆−1/2 ) , n sin(πy) k>1 πy R2 (i) := − Re F hi (u)1(a0 ,∞) (2πy∆−1/2 ) . n sin(πy)
R1 (i) := − Re
The first term can be bounded with similar Riemann sum approximation arguments as above: h i πy
X p
|R1 (i)| 6 hi ( ∆n k)1(ak−1 ,ak ] (u) − hi (u)1(a0 ,∞) (u)
F sin(πy) ∞ k>1
X p πy
6 hi ( ∆n k)1(ak−1 ,ak ] (u) − hi (u)1(a0 ,∞) (u) 1 = O(δ −1 ∆n ),
sin(πy) L k>1
where we also have used that y is bounded away from 1. For the second term R2 (i) we will use the decay of the Fourier transformation Fhi . Since h0i , h00i ∈ L1 ([0, ∞)), we have |z 2 Fhi (z)| = O(1) and thus for any y > δ > 0: |Fhi (2πy∆n−1/2 )| = O(δ −2 ∆n ).
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
21
This implies πy F[hi 1(0,a0 ] ](2πy∆n−1/2 ) + O(∆n ) sin(πy) Z √∆n /2 πy = cos(2πyu∆n−1/2 )hi (u)du + O(∆n ) sin(πy) 0 Z 1/2 ∆n πy 1/2 = cos(2πyu)hi (∆n1/2 u)du + O(∆n ) . sin(πy) 0
R2 (i) = Re
Integration by parts and that hi (0+) = 1 finally yield R2 (i) =
∆1/2 n
πy sin(πy) hi (∆n1/2 /2) − ∆1/2 n sin(πy) 2πy
Z 0
1/2
sin(2πyu) 0 1/2 hi (∆n u)du + O(∆n ) 2πy
1 = ∆1/2 hi (∆1/2 n /2) + O(∆n ) 2 n 1 + O(∆n ). = ∆1/2 2 n √ Altogether, we conclude R(i) = 21 ∆n + O(δ −2 ∆n ). 1/2
For later proofs it will be important that we also obtain R(i) = O(δ −1 ∆n ) with only a minor modification for the term R2 (i). With the above lemmas we can deduce Proposition 3.1. P ROOF OF P ROPOSITION 3.1. Computing the integrals in Lemma 7.2 Z ∞ 2 √ 1 − e−z dz = π, z2 0 Z ∞ 2 p √ √ √ (1 − e−z )2 −2τ z 2 e dz = π 2 1 + 2τ − 2τ − 2(1 + τ ) z2 0 for any τ ∈ R, we derive that p E[(∆i X) (y)] = ∆n e−y ϑ1 /ϑ2 σ 2 2
r
p 1 1p 1p 1 − 1 + 2(i − 1) + 2(i − 1) + 2 + 2(i − 1) ϑ2 π 2 2
+ ri + O(∆3/2 n ). Since for the i-dependent term, by a series expansion we have that (34)
n X p i=1
1 + 2(i − 1) −
n X 1p 1p 2(i − 1) − 2 + 2(i − 1) = O i−3/2 , 2 2 i=1
where the latter sum converges, we obtain r n p 1 1p 1p 1 X −y ϑ1 /ϑ2 2 e σ 2(i − 1) + 2 + 2(i − 1) 1 − 1 + 2(i − 1) + n ϑ2 π 2 2 i=1 r 1 2 = e−y ϑ1 /ϑ2 σ + O(∆n ). ϑ2 π Since also the remainders from Lemmas 7.1 and 7.2 are bounded uniformly in i with the claimed rate, we obtain the assertion.
22
BIBINGER AND TRABS
7.2. Proof of Proposition 3.2 and Proposition 3.3. Next, we prove the decay of the covariances as well as the mixing property of the increments. P ROOF OF P ROPOSITION 3.2. Let us first calculate correlations between the terms Bi,k and Ci,k , which are required frequently throughout the following proofs. (35a)
ΣB,k ij
2
= Cov(Bi,k , Bj,k ) = σ (e
−λk ∆n
2
Z
(min(i,j)−1)∆n
− 1)
0
= e−λk |i−j|∆n − e−λk (i+j−2)∆n (35b) (35c)
e−λk ((i+j−2)∆n −2s) ds
1 − e−2λk ∆n , 2λk Z = Cov(Ci,k , Bj,k ) = 1{i1
=
X
Cov(Ai,k + Bi,k + Ci,k , Aj,k + Bj,k + Cj,k ) e2k (y)
k>1
= ri,j +
X
BC,k 2 ΣB,k ek (y) ij + Σij
k>1 BC,k with ΣB,k from (35a) and (35c) and with ij and Σij
ri,j =
X
Var hξ, ek iϑ e−λk (i+j−2)∆n (e−λk ∆n − 1)2 e2k (y)
k>1 1/2
6 2 sup E[hAϑ ξ, ek i2ϑ ] k
X (e−λk ∆n − 1)2 e−λk (i+j−2)∆n . λk k>1
Using analogous steps as in the proof of Proposition 3.1, we deduce that X B,k 2 Cov ∆i X(y), ∆j X(y) = Σij + ΣBC,k ek (y) + ri,j ij =
X k>1
e−λk (j−i)∆n σ 2
k>1 −λ (e k ∆n
− 1)2 + 1 − eλk ∆n − e−2λk ∆n + e−λk ∆n 2 ek (y) 2λk
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
+
X
23
(e−λk ∆n − 1)2 2 ek (y) +ri,j 2λk {z }
e−λk (j+i−2)∆n σ 2
k>1
|
=:si,j
2 − e−λk ∆n − eλk ∆n 2 = e σ ek (y) + si,j + ri,j 2λk k>1 −λk ∆n − e−2λk ∆n − 1 X −λk (j−i−1)∆n 2 2e e2k (y) + si,j + ri,j = e σ 2λk k>1 −λk ∆n − 1 2 X p 1/2 −λk (j−i−1)∆n 2 e = −∆n e σ ∆n e2k (y) + si,j + ri,j . 2λk ∆n X
(36)
−λk (j−i)∆n 2
k>1
Using the Riemann sum approximation (33) the previous line equals Z ∞ 2 2 1 − e−z σ2 2 1/2 − ∆n exp(−y ϑ1 /ϑ2 ) √ e−z (j−i−1) dz + si,j + ri,j + O(∆3/2 n ) z2 2π ϑ2 0 p p p σ2 2 j − i − j − i − 1 − j − i + 1 + si,j + ri,j + O(∆n3/2 ) . = − ∆n1/2 exp(−y ϑ1 /ϑ2 ) √ 2 πϑ2 The claim is implied by the estimate n X
1/2
(si,j + ri,j ) 6 σ 2 + 2 sup E[hAϑ ξ, ek i2ϑ ]
i,j=1
k
i,j=1 k>1 1/2
= σ 2 + 2 sup E[hAϑ ξ, ek i2ϑ ] k
n X X
e−λk (j+i−2)∆n
(e−λk ∆n − 1)2 λk
2 X X (e−λk ∆n − 1)2 n−1 e−λk i∆n λk k>1
i=0
X (e−λk ∆n − 1)2 1/2 6 σ 2 + 2 sup E[hAϑ ξ, ek i2ϑ ] (1 − e−λk ∆n )−2 = O(1). λk k k>1
P ROOF OF P ROPOSITION 3.3. Since α and ρ-mixing rates are equivalent for Gaussian sequences, cf. (Doukhan, 1994, Theorem 1 in Sect. 2.1), we apply Corollary 2 from Section 2.1.1 by Doukhan (1994)1 which shows that it is sufficient to find a uniform lower bound for the spectral density ∞
X 1 −1/2 1 −1/2 iu(j−1) fn (u) = ∆ E[∆1 X(y)∆1 X(y)] + Re ∆n E[∆1 X(y)∆j+1 X(y)]e , 2π n π j=1
where we have identified the original sequence with a Gaussian sequence (∆i+1 X)i∈Z with symmetric correlations on Z. We obtain with similar steps as in the proof of Proposition 3.2 using the notation from (13): ∞ X X 1 ˜1,k )(Cj,k + B ˜j,k ) e2 (y)eiu(j−1) fn (u) = Re ∆−1/2 E (C1,k + B n k π j=2 k>1 1 −1/2 X ˜1,k )2 e2 (y) E (C1,k + B + ∆n k 2π k>1
1
based on Kolmogorov and Rozanov (1961)
24
BIBINGER AND TRABS ∞
X σ 2 −1/2 X 1 − e−2λk ∆n + (1 − e−λk ∆n )2 2 B,k BC,k = ∆ Re (Σ1j + Σ1j )+ ek (y) π n 4λk =
σ2 π
∆−1/2 n
k>1
j=2
X
∞ X
Re
e−λk (j−2)∆n +iu(j−1)
j=2
k>1
(1 − e−λk ∆n )2 1 − e−λk ∆n 2 + ek (y) 2λk 2λk
(1 − e−λk ∆n )2 eiu X 1 − e−λk ∆n 2 = + ∆−1/2 Re ek (y) π n 2λk 2λk (1 − e−λk ∆n +iu ) σ2
k>1
=
σ2 π
∆−1/2 n
X 1 − e−λk ∆n 2λk
k>1
σ 2 e−yϑ1 /ϑ2 √ = π 2 ϑ2
Z 0
∞
Re 2 −
1 − eiu ek (y)2 1 − e−λk ∆n +iu
2
(1 − e−z ) 1 − eiu Re 2 − dz + O(∆n ). 2z 2 1 − e−z 2 +iu
˜ 2 ] 6= ΣB,k = 0 and ΣC,k and in the third equality (35a) and In the second equality we insert E[B 11 11 1,k (35c). We further calculate 2
Re 2 − for the function
g(a, b) := 2 − It holds that
∂ ∂b g(a, b)
2−
2
1 − eiu 1 − (1 + e−z ) cos(u) + e−z 2 = 2 − = g(e−z , cos(u)) 2 +iu 2 2 −z −2z −z 1−e 1+e − 2e cos(u) (1 + a)(1 − b) , 1 − 2ab + a2
a ∈ [0, 1], b ∈ [−1, 1].
> 0 for all a, b and therefore
2 = g(a, −1) 6 g(a, b) 6 g(a, 1) = 2 1+a
for all a ∈ [0, 1], b ∈ [−1, 1].
Hence, we find the uniform lower bound for the spectral density (37)
σ 2 e−yϑ1 /ϑ2 √ fn (u) > π 2 ϑ2
Z
∞
0
2
2
(1 − e−z )e−z dz + O(∆3/2 n )>0 z 2 (1 + e−z 2 )
for sufficiently large n. Finally the decay of the mixing coefficients follows from the decay of the correlations. Since in Proposition 3.2 the dependence in the remainder term on the lag (i − j) is not explicit, we deduce from (36) for j > 2 and ς < 1/2 (noting that r1,j = s1,j = 0 in the stationary case) −λk ∆n − 1 2 X e 1 1/2 −λ (j−2)∆ 2 n k e Cov ∆1 X(y), ∆j X(y) 6 ∆n σ 1/2 λk ∆n ∆n k>1 Z ∞ 2 X e−λk ∆n − 1 2 σ2 C (1 − e−z )2 1/2 6 ∆ 6 dz (j − 2)1+ς n (λk ∆n )2+ς (j − 2)1+ς 0 z 4+2ς k>1
for some constant C > 0 depending only on σ 2 and ϑ. Since the last integral is finite for any ς < 1/2, the decay of the mixing coefficients follows by (Doukhan, 1994, Cor. 2 in Sect. 2.1).
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
25
7.3. Proofs of the central limit theorems. To prepare the central limit theorems, we first compute the asymptotic variances. According to (11), we define the rescaled realized volatility at y ∈ (0, 1) as n
Vn (y) := √
(38)
1 X (∆i X)2 (y)ey ϑ1 /ϑ2 . ∆n n i=1
P ROPOSITION 7.3. On Assumptions 2.1 and 2.2, the covariance of the rescaled realized volatility for two maturities y1 , y2 ∈ [δ, 1 − δ] satisfies 3/2 −1 (39) Cov Vn (y1 ), Vn (y2 ) = 1{y1 =y2 } ∆n Γσ 4 ϑ−1 + δ −1 ) 2 + O ∆n (1{y1 6=y2 } |y1 − y2 | where Γ > 0 is a numerical constant given in (43) with Γ ≈ 0.75. P ROOF. Since (∆i X)2 (y) =
X
∆i xk ek (y)
k>1
2
=
X
∆i xk ∆i xl ek (y)el (y),
k,l
we obtain the following variance-covariance structure of the rescaled realized volatilities (38) in different points y1 , y2 ∈ [δ, 1 − δ]: Cov Vn (y1 ), Vn (y2 ) n eϑ1 /ϑ2 (y1 +y2 ) X X ek (y1 )ek (y2 )el (y1 )el (y2 ) Cov(∆i xk ∆i xl , ∆j xk ∆j xl ) , = ∆n n2 i,j=1
k,l>1
while other covariances vanish by orthogonality. A careful analysis of covariance structures yields that Cov ∆i xk ∆i xl , ∆j xk ∆j xl = Cov (Ai,k + Bi,k + Ci,k )(Ai,l + Bi,l + Ci,l ), (Aj,k + Bj,k + Cj,k )(Aj,l + Bj,l + Cj,l ) BC,l = Cov(Ai,k Ai,l , Aj,k Aj,l ) + E[Ai,k Aj,k ] ΣB,l + ΣBC,l + ΣC,l ij + Σij ji ij BC,k + E[Ai,l Aj,l ] ΣB,k + ΣBC,k + ΣC,k ij + Σij ji ij B,l BC,k + ΣB,k + ΣBC,k + ΣC,k Σij + ΣBC,l + ΣBC,l + ΣC,l . ij + Σij ji ij ij ji ij Denoting Cξ := supk λk E[hξ, ek i2ϑ ], the resulting sum of the first terms is bounded by n 1 X X Cov(Ai,k Ai,l , Aj,k Aj,l ) n2 i,j=1 k,l>1 n X
=
1 n2
6
Cξ2 n2
X
(1 − e−λl ∆n )2 (1 − e−λk ∆n )2 e−(λk +λl )(i+j−2)∆n Var hξ, ek iϑ Var hξ, el iϑ
i,j=1 k,l>1
X (1 − e−λk ∆n )2 (1 − e−λl ∆n )2 λk λl (1 − e−(λk +λl )∆n )2 k,l>1
Cξ2 X (1 − e−λk ∆n )(1 − e−λl ∆n ) ∆n 6 2 6 2 Cξ2 . n λk λl n k,l>1
26
BIBINGER AND TRABS
Repeating several steps from the proofs of Propositions 3.1 and 3.2, we see that also the cross terms are asymptotically small: n 1 X X B,l BC,l BC,l C,l E[A A ] Σ + Σ + Σ + Σ i,k j,k ij ij ji ij n2 i,j=1 k,l>1 n X
=
=
1 n2 1 n2
X
E[Ai,k Aj,k ]
i,j=1 k>1 n X X
Cξ
i,j=1
k>1 2 σ
X
BC,l ΣB,l + ΣBC,l + ΣC,l ij + Σij ji ij
l>1
(1 − e−λk ∆n )2 −λk (i+j−2)∆n e λk
p p p √ × ∆1/2 1i6=j 2 |j − i| − |j − i| − 1 − |j − i| + 1 n 2 πϑ2 X 2(1 − e−λl ∆n )2 1 − e−λl ∆n + + O(∆2n ) + σ 2 1i=j λl λl l>1 ∆ X n =O 2 (i + j)−3/2 |i − j|−3/2 + ∆n5/2 = O(∆5/2 n ). n i6=j
Therefore, Cov Vn (y1 ), Vn (y2 ) n eϑ1 /ϑ2 (y1 +y2 ) X X ek (y1 )ek (y2 )el (y1 )el (y2 ) = ∆n n2 i,j=1 k,l>1
× Cov (Bi,k + Ci,k )(Bi,l + Ci,l ), (Bj,k + Cj,k )(Bj,l + Cj,l ) + O(∆3/2 n ).
(40)
3/2
The sum over all terms with k = l in (40) induces a variance part of order ∆n and is thus asymptotically negligible at first order. For this reason, we focus on twice the sum over all k < l. sin2 (πky)e−y ϑ1 /ϑ2 ≈ Similarly as in Lemma 7.2, one can approximate the terms e2k (y) = 2√ e−y ϑ1 /ϑ2 for the variances with y1 = y2 = y, up to a term of order O(δ −1 ∆n ) in the final Riemann sum. In fact, all variances of the estimates (38) are equal at first order. For the covariances ek (y1 )ek (y2 ) the identity ek (y1 )ek (y2 ) = 2 sin(πky1 ) sin(πky2 )e−(y1 +y2 ) ϑ1 /(2ϑ2 ) = cos(πk(y1 − y2 )) − cos(πk(y1 + y2 )) e−(y1 +y2 ) ϑ1 /(2ϑ2 )
(41)
can be used to prove that covariances vanish asymptotically at first order. In case that y1 6= y2 , the series of covariance terms over k > 1 is thus negligible. This can be seen using an integral approximation analogously √ to the last part of the proof of Lemma 7.2 leading to an additional factor −1 −1 O((δ + |y1 − y2 | ) ∆n ). We are thus left to consider for any y ∈ [δ, 1 − δ] the variance n X 2e2y ϑ1 /ϑ2 X 2 2 Var(Vn (y)) = 2 ek (y)el (y) Cov(∆i xk ∆i xl , ∆j xk ∆j xl ) ∆n n2 k1 −λ∆n
ei,k = Bi,k + σ (e √ B
− 1) λk (i−1)∆n Zk . e 2λk
e B,k and Cov(B ei,k , C ej,k ) = e 2 ] = (2λk )−1 σ 2 (e−λk ∆n − 1)2 =: Σ e 2 ] = E[C 2 ], E[B Since E[C ii i,k i,k i,k ei,k , B ej,k ) = (2λk )−1 σ 2 (e−λk ∆n − 1)2 e−λk |i−j|∆n , it indeed holds that Cov(Bj,k , Ci,k ), Cov(B n X
P
(ζen,i − ζn,i ) → 0 .
i=1
According to (Utev, 1990, Thm. 4.1 with Thm. 5.4), the central limit theorem for the triangular array Zi,n = ζen,i − E[ζen,i ] holds under the following sufficient conditions: ∞ X
(46a)
ρ(2k ) < ∞ ,
k=1
(46b)
lim sup n→∞
(46c)
n X
n X
2 E[Zi,n ] < ∞,
i=1
4 E Zi,n → 0 as n → ∞ ,
i=1
see (4.2) and (4.6) in Utev (1990). By definition of the mixing coefficients, the sequence of squared 2) e increments ((∆i X(y)) i>1 is by Proposition 3.3 also ρ-mixing with the same upper bound for ρ(l), l ∈ N. Hence, (46a) is fulfilled. Condition (46b) is ensured by Proposition 7.3.
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
31
e B,k = E[B e 2 ] from above and It remains to establish the Lyapunov condition. With the notation Σ ii i,k 2 ] from (35b), we obtain by independence of the coefficient processes x , k > 1, for ΣC,k = E[Ci,k k ii some numerical constant C > 0 that X e 8 (y) 6 105 E[(∆i x ek )2 ]E[(∆i x el )2 ]E[(∆i x ep )2 ]E[(∆i x eq )2 ]e2k (y)e2l (y)e2p (y)e2q (y) E (∆i X) k,l,p,q
6C
X
B,p B,l C,p e B,q e e + ΣC,l Σ e B,k + ΣC,k Σ Σii + ΣC,q Σ ii ii + Σii ii ii ii ii
k,l,p,q
4 X B,k C,k e Σ + Σ = C ∆2n ∆−1/2 n ii ii k>1
= C ∆2n ∆−1/2 σ2 n
X 1 − exp(−λk ∆n ) 4 λk
k>1
Z = O ∆2n σ 2
∞
2
z −2 (1 − e−z ) dz
4
= O(∆2n ) .
0
4 4 P Thereby, we derive E ζen,i = O(∆n ) → 0 as = O(∆2n ) uniformly in i and thus ni=1 E ζn,i n → ∞ which implies (46c). P ROOF OF T HEOREM 3.5. Similar to the proof of Theorem 3.4 we will verify the conditions (46a) - (46c) of Utev’s central limit theorem for the redefined triangular array √ m ϑ2 π X e 2 (yj ) exp(yj ϑ1 /ϑ2 ) , ζen,i = √ (47) (∆i X) m j=1
e is defined where m = mn is a fixed sequence satisfying mn = O(nρ ) for some ρ ∈ (0, 1/2) and X as in (45). e 2 (yj ), (∆i X) e 2 (yl )) = O(∆n 1jl + From the proof of Proposition 7.3, we deduce that Cov((∆i X) 3/2 m(log m)∆n ). We thus have for some constant C > 0 n X i=1
n m CXX 2 e e 2 (yj ), (∆i X) e 2 (yl )) E[ζn,i ] 6 Cov((∆i X) m i=1 j,l=1
= O(n∆n + m(log m)n∆n3/2 ) = O(1)
(48)
which implies (46b). By the Cauchy-Schwarz inequality we moreover have 4 π 2 ϑ22 E ζen,i = m2
(49)
m X j1 ,...,j4 =1 m X
e 2 (yj ) · · · (∆i X) e 2 (yj ) e(yj1 +···+yj4 )ϑ1 /ϑ2 E (∆i X) 1 4
6
π 2 ϑ22 m2
6
π 2 ϑ22 m2 e4ϑ1 /ϑ2
e 8 (yj ) 1/4 · · · E (∆i X) e 8 (yj ) 1/4 e(yj1 +···+yj4 )ϑ1 /ϑ2 E (∆i X) 1 4
j1 ,...,j4 =1
max
y∈{y1 ,...,ym }
e 8 (y) . E (∆i X)
e 8 (y) = O(∆2n ), we obtain that Pn E ζe4 = Using from the proof of Theorem 3.4 that E (∆i X) n,i i=1 O(m2 ∆n ) = O(1) such that the Lyapunov condition (46c) is verified.
32
BIBINGER AND TRABS
It remains to verify the mixing condition (46a). To this end we apply Theorem A.1 to the Gaussian triangular array e m ))> ∈ Rm , i = 1, . . . , n. e 1 ), . . . , ∆i X(y ξn,i = ∆−1/4 (∆i X(y n The corresponding spectral density matrix is given by fl,l0 (u) =
=
1 1/2
Re
∞ X
π∆n
e l )∆j X(y e l0 )] eiu(j−1) E[∆1 X(y
j=1
1/2 σ 2 ∆n
π
X 1 − e−λk ∆n k>1
λ k ∆n
g(e−λk ∆n , cos(u))ek (yl )ek (yl0 ),
l, l0 ∈ {1, . . . , m},
with the function g : [0, 1] × [−1, 1] → [0, 2] from the proof of Proposition 3.2. For all diagonal elements we already have proven in (37) that fl,l (u) > C for some uniform constant C > 0. Owing to (41), the off-diagonal elements can be bounded exactly as the term R(i) in Lemma 7.2. We obtain √ 1/2 |fl,l0 (u)| = O(|yl − yl0 |−1 ∆n ) for l 6= l0 . Consequently, m(log m)∆n = O(1) is exactly the condition that we need to show that f (u) is a diagonally dominant matrix. More specifically, the Gershgorin circle theorem yields for the minimal eigenvalue λmin (f (u)) that X λmin (f (u)) > min fll (u) − |fl,l0 (u)| l
l0 6=l
X √∆n p > C − O(m(log m) ∆n ) > C/2 > min fll (u) − max 0 l l |yl − yl | 0 l 6=l
for n sufficiently large. By Theorem A.1 we thus have for any p : C → Cm×m , whose entries are all polynomials of degree less than τ − 1 and satisfy the symmetry pkl = plk for all k, l = 1, . . . , m, that the ρ-mixing coefficients of (ξn , i) are bounded by ρ(τ ) 6
2 max f (u) − p(eiu ) C u
where k · k denotes the spectral norm. The norm coinciding with the largest eigenvalue, the latter can be bounded again with the help of Gershgorin’s circle theorem: X
f (u) − p(eiu ) 6 max fll (u) − pll (eiu ) + max fll0 (u) − pll0 (eiu ) . l
l
l0 6=l
1 P e l )∆|j|+1 X(y e l )] and pll0 (z) = 0 for l 6= l0 , we deduce Choosing pll (z) = 2π z j E[∆1 X(y |j|6τ −1 √ from |fl,l0 (u)| = O(|yl − yl0 |−1 ∆n ) for l 6= l0 that
ρ(τ ) 6
∞ p 2 1 X e l )∆j+1 X(y e l )] + O(m(log m) ∆n ). E[∆1 X(y 1/2 C π∆n j=τ
The expectation in the previous line can be bounded exactly as the covariances in the proof of Proposition 3.2 and thus p ρ(τ ) = O τ −ς + m(log m) ∆n
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
33
for any ς ∈ (0,√ 1/2). Finally the assumption m 6 nρ for some ρ ∈ (0, 1/2) implies with ρ0 ∈ (ρ, 1/2) 0 0 that m(log m) ∆n = n−1/2+ρ 6 τ −1/2+ρ for any τ 6 n. Hence, 0 ρ(τ ) = O τ −(δ∧(1/2−ρ )) for all τ 6 n. Finally, we note that the triangular (ζen,i ) inherits the decay of the mixing coefficients of the vectors (ξn,i ) which implies (46a). P ROOF OF P ROPOSITION 3.6. First, we compute the expectation of the quarticity estimator. We 4 ] = σ 4 + O(1), since for any y ∈ [δ, 1 − δ] have that E[˜ σn,m n X
πϑ2 exp(2y ϑ1 /ϑ2 )E (∆i X)4 (y)
i=1
= πϑ2 exp(2y ϑ1 /ϑ2 )
n X X
3 E[(∆i xk )2 ]E[(∆i xl )2 ] e2k (y)e2l (y) 1 + O(1)
i=1 k,l n X
∆−1/2 n
X
i=1
k
= 3 nπϑ2 ∆n ∆−1/2 σ2 n
X
= 3 πϑ2 ∆n
C,k ΣB,k ii + Σii
2
1 + O(1)
2 λ−1 1 − exp(−λ ∆ ) 1 + O(1) n k k
k 4
= 3 σ 1 + O(1) . We redefine the statistics from (44) for m > 1: √ m ϑ2 π X (50) (∆i X)2 (yj ) exp(yj ϑ1 /ϑ2 ) . ζn,i = √ m j=1
In order to establish consistency of the quarticity estimator, observe that for j > i with generic constant C: π 2 ϑ22 2 2 , ζn,i = Cov ζn,j m2
m X
e(yu1 +···+yu4 )ϑ1 /ϑ2
u1 ,...,u4 =1
× Cov (∆i X)2 (yu1 )(∆i X)2 (yu2 ), (∆j X)2 (yu3 )(∆j X)2 (yu4 ) 6 π 2 ϑ22 e4ϑ1 /ϑ2 m2 Cov (∆i X)2 (yu )(∆i X)2 (yu ), (∆j X)2 (yu )(∆j X)2 (yu ) max 1 2 3 4 u1 ,u2 ,u3 ,u4 ∈{j1 ,...,jm } X 6 Cm2 Cov ∆i xk ∆i xl ∆i xp ∆i xq , ∆j xk ∆j xl ∆j xp ∆j xq
×
k,l,p,q 2
= Cm
X
BC,k BC,l BC,p BC,q ΣB,q ΣB,k ΣB,l ΣB,p ij + Σij ij + Σij ij + Σij ij + Σij
k,l,p,q
× 1 + O(1) 4 X B,k BC,k = Cm2 ∆2n ∆−1/2 Σ + Σ 1 + O(1) n ij ij
= O m2 ∆2n σ 8
Z 0
k ∞
2
z −2 (1 − e−2z )2 e−z
2 (j−i−1)
dz
4
34
BIBINGER AND TRABS
p p p 4 = O m2 ∆2n σ 8 2 j − i − j − i − 1 − j − i + 1 = O m2 ∆2n σ 8 (j − i − 1)−6 . 4 2 , ζ2 Using E ζn,i = O(m2 ∆2n ) and the order of Cov ζn,j n,i for i 6= j, we obtain that Var
n X
2 ζn,i
i=1
n X 4 X 2 2 6 E ζn,i + Cov ζn,j , ζn,i i=1
i6=j
6 O m2 ∆n +
X
X m2 ∆2n |j − i − 1|−6 = O m2 ∆n 1 + ∆n |j − i − 1|−6
i6=j
i6=j
2
= O(m ∆n ) . √ For arbitrary 1 6 m = O( n) we thus have 4 σ ˜n,m → σ4 . P
(51)
Slutsky’s lemma thus implies the normalized central limit theorem (18). 7.4. Proof of Theorem 4.2. Decompose ∆i xk as in (9) with terms Z (i−1)∆n Bi,k = σs e−λk ((i−1)∆n −s) (e−λk ∆n − 1) dWsk , 0 Z i∆n σs e−λk (i∆n −s) dWsk , Ci,k = (i−1)∆n
and Ai,k unchanged. Instead of (30c), we obtain on Assumption 4.1: Z i∆n 1 − e−2λk ∆n 2 2 σs2 e−2λk (i∆n −s) ds = E[Ci,k ] = (52) σ(i−1)∆n + O ∆αn . 2λk (i−1)∆n 2 − σt2 | = |σt+s − σt |(σt+s + σt ), we see that (σt2 )t>0 satisfies the same In particular, writing |σt+s H¨older regularity as (σt )t>0 . Bi,k and its analogue Bi,k hinge on the whole time interval [0, (i − 1)∆n ] and the approximation is less obvious. We deduce that Z (i−1)∆n 2 E[Bi,k ] = σs2 e−2λk ((i−1)∆n −s) (e−λk ∆n − 1)2 ds 0 −2λk (i−1)∆n 2 −λk ∆n 21 − e = σ(i−1)∆ (e − 1) n 2λk Z (i−1)∆n −2λ ((i−1)∆n −s) −λ ∆n 2 + σs2 − σ(i−1)∆ e k (e k − 1)2 ds . n 0
Consider the remainder term and decompose for some J, 1 6 J 6 (i − 1): Z (i−1)∆n −2λ ((i−1)∆n −s) 2 σs2 − σ(i−1)∆ e k ds n 0
Z = 0
J ∆n
2 σs2 −σ(i−1)∆ n
(53) 6 (i − 1 − J)∆n
α
−2λk ((i−1)∆n −s)
e
2λk
−1
+ sup s∈[0,1]
σs2
Z ds + 2λk
(i−1)∆n
−2λ ((i−1)∆n −s) 2 σs2 −σ(i−1)∆ e k ds n
J ∆n −2λk (i−1−J)∆n
−1
e
.
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
35
For each i, k this decomposition of the remainder holds for any J, in particular when setting J = λk J(i, k) = bi − 1 − α2λlog c which minimize (53). We then obtain k ∆n (54)
2 2 (e−λk ∆n − 1)2 E[Bi,k ] = σ(i−1)∆ n
(log λ )α (e−λk ∆n − 1)2 1 − e−2λk (i−1)∆n k +O . 2λk λ1+α k
Reconsidering the analysis of E (∆i X)2 (y) with this decomposition in leading and remainder terms, we derive instead of (10) for any α0 < α ∧ 1 σ2 0 2 1/2 −y ϑ1 /ϑ2 (i−1)∆n √ + ri + O ∆1/2+α , E (∆i X) (y) = ∆n e n ϑ2 π
(55)
with ri exactly the same as in Proposition 3.1 and expectation of estimator (15) that m
2 E[ˆ σn,m ]
Pn
i=1 ri
1/2
= O ∆n
. We obtain by (55) for the
n
p 1 XX 2 0 y ϑ /ϑ yj ϑ1 /ϑ2 = . σ(i−1)∆n ∆n + O ∆1+α e j 1 2 + ∆1/2 n ri πϑ2 e n m j=1 i=1
Since m−1
m X n X
0
eyj ϑ1 /ϑ2 ∆1+α 6 n
j=1 i=1
n X
0 0 eϑ1 /ϑ2 ∆1+α = O ∆αn , n
i=1
the above remainder is negligible whenever α > 1/2. Furthermore, replacing (35b) and (35c), we obtain 1 − e−2λk ∆n , 2λk (e−λk ∆n − 1) 2 eλk ∆n −e−λk ∆n σ(i−1)∆n +O ∆αn , 2λk
2 (56a) Cov(Ci,k , Cj,k ) = 1{i=j} σ(i−1)∆ + O ∆αn n
(56b) Cov(Ci,k , Bj,k ) = 1{j>i} e−λk (j−i)∆n as well as for j > i generalizing (35a): (56c)
−λk ∆n
Cov(Bi,k , Bj,k ) = (e
2
Z
(i−1)∆n
− 1)
σs2 e−λk ((i+j−2)∆n −2s) ds
0
= (e−λk ∆n − 1)2 e−λk (j−i)∆n
Z
(i−1)∆n
σs2 e−2λk ((i−1)∆n −s) ds
0 2 = E[Bi,k ]e−λk (j−i)∆n .
For analyzing the variance-covariance structure of the estimators under time-dependent volatility, we have to reconsider and modify the proof of Proposition 7.3. The analogues of the terms (42f) P are4 zero. Further, (42b) and (42e) directly generalize to the same expressions replacing σ 4 by n−1 ni=1 σ(i−1)∆ n R1 → 0 σs4 ds with a remainder smaller than the terms in (53). We thoroughly address the analogue of P BC,l (42d). The analogue of n−1 ni,j=1 ΣB,k is ij Σij n n e−λl ∆n − 1 2 1X X 2 E[Bi,k ]e−(λk +λl )(j−i)∆n (eλl ∆n − e−λl ∆n ) σ(i−1)∆n + O ∆αn . n 2λl i=1 j=i+1
36
BIBINGER AND TRABS
4 Inserting (54), we obtain σ(i−1)∆ and compute first, analogously to the proof of Proposition 7.3, the n geometric series n X
e−(λk +λl )(j−i)∆n =
j=i+1
e−(λk +λl )∆n − e−(λk +λl )(n−i+1)∆n 1 − e−(λk +λl )∆n
where the resulting i-dependent term becomes again asymptotically negligible. Therefore, in the lead4 ing term the only dependence on i remains in σ(i−1)∆ , while all other terms are exactly the same as n above, and the generalization becomes obvious. The analogue n
i−1
1 XX e−λl ∆n − 1 2 2 σ(j−1)∆n + O ∆αn E[Bj,k ]e−(λk +λl )(i−j)∆n (eλl ∆n − e−λl ∆n ) n 2λl i=2 j=1
B,k BC,l is more involved. We have to show that the remainder of approxi,j=1 Σij Σji P BC,l 4 imating the term with σ(i−1)∆n is negligible. The proof for the analogue of n−1 ni,j=1 ΣB,k ij Σij 4 shows that we can approximate with σ(j−1)∆ and we are left to consider n
of the term n−1
n
Pn
i−1
−(λ +λ )(i−j)∆n 1 XX 4 4 σ(j−1)∆n − σ(i−1)∆ e k l n n i=1 j=1
J n i−1 α X C X X −(λk +λl )(i−j)∆n −(λk +λl )(i−j)∆n (i − j)∆n + 6 e e n i=1
6
C n
n X i=1
j=1
j=J+1 i−1 X
e−(λk +λl )(i−j)∆n (i − 1 − J)∆n
α
+ e−(λk +λl )(i−J)∆n
j=J+1
1 − e−(λk +λl )J∆n , 1 − e−(λk +λl )∆n
with some C < ∞ and any J ∈ {1, . . . , i − 1}. We only require the first-order term of the asymptotic variance and (different as for the bias) not a certain speed of convergence in the remainders. Therefore, the above decomposition with some J = J(i; b) = bi − 1 − ib c ∧ 0 with b ∈ (0, 1) for all k, l is sufficient. In the first addend we then have α (i − 1 − J)∆n = O n(b−1)α rendering this part of the remainder asymptotically negligible. For the second addend we conduct an integral approximation as in Lemma 7.2, leading in analogy with (34) to a remainder of order n−1
n X
i−(3/2)b = O n−((3/2)b∧1) .
i=1
This decomposition is valid for all b ∈ (0, 1) where both terms react to different values of b in a reciprocal way. Setting, for instance, b = 1/3 which balances both terms to be of order n−1/2 in case that α is close to 1/2, we conclude that the overall remainder is asymptotically negligible for the central limit theorem. The generalizations of (42a) and (42c) are analogous, we discuss (42a). Decomposing the double sum
37
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
similar as below (42a), we obtain that n n n i−1 X 1X 4 2X 4 1 X −(λk +λl )|i−j|∆n 4 e σ(i−1)∆n = σ(i−1)∆n + σ(i−1)∆n e−(λk +λl )(i−j)∆n n n n i,j=1
1 = n
i=1 n X
i=1
4 σ(i−1)∆ n
1+2
j=1
e−(λk +λl )∆n
i=1
1 − e−(λk +λl )(i−1)∆n 1 − e−(λk +λl )∆n
and the sum over the i-dependent terms from the geometric series is asymptotically negligible analogously to ri in Lemma 7.1. Thus, all remainders in the variance above are asymptotically negligible and after summing over k, l, where this step is identical to the proof of Proposition 7.3, we derive the asymptotic variance of (21). The above considerations combined with the approximation in the proof of Theorem 3.4 imply that n n X X P e E ζn,i − ζn,i → 0 and thus ζn,i − ζen,i → 0 i=1
i=1
with ζn,i
√ √ m m ϑ2 π X ϑ2 π X 2 e e 2 (yj ) exp(yj ϑ1 /ϑ2 ) , = √ (∆i X) (yj ) exp(yj ϑ1 /ϑ2 ) and ζn,i = √ (∆i X) m m j=1
j=1
P e ek ek (y) and where ∆i X(y) = σ(i−1)∆n k>1 ∆i x Z ∆i x ek =
i∆n
e−λk (i∆n −s) dWsk +
(i−1)∆n
Z
(i−1)∆n
−∞
(e−λk ∆n − 1)e−λk ((i−1)∆n −s) dWsk .
e For m = 1, Proposition 3.3 can be applied to the stationary Gaussian sequence ∆i X(y)/σ (i−1)∆n which satisfies for i 6= j: 1/2 3/2 e e Cov ∆i X(y)/σ , ∆ X(y)/σ j (i−1)∆n (j−1)∆n = O ∆n |i − j|
16i6n
by Proposition 3.2 and is ρ-mixing with ρ(l) = O(l−ς ) for all ς < 1/2 by Proposition 3.3. Since multiplication with the deterministic volatilities (σ(i−1)∆n )16i6n preserves the mixing structure of the sequence and since the Lyapunov condition directly extends as (σt )06t61 is bounded, (Utev, 1990, Thm. 4.1 with Thm. 5.4) applies to ζen,i 16i6n and we conclude (20). In the general case, m > 1, we apply Theorem A.1 as in the proof of Theorem 3.5 to the vector −1 e 1 ), . . . , ∆i X(y e m) . ξn,i = ∆−1/4 σ(i−1)∆ ∆i X(y n n Since the mixing structure is preserved when rescaling with (σ(i−1)∆n )16i6n and the Lyapunov condition in order, Utev’s central limit theorem yields (21). 7.5. Proof of Theorem 5.1. The M-estimator is associated to the contrast function, with η := (σ02 , κ), Z 1−δ 2 0 fη (y) − fη0 (y) dy. K(η, η ) := δ
Step 1: We prove for Kn
(η 0 )
:=
1 m
Pm
j=1 (Zj
− fη0 (yj ))2 that Pη
Kn (η 0 ) − K(η, η 0 ) −→ 0.
,
38
BIBINGER AND TRABS
To verify this stochastic convergence, we decompose (57)
Kn (η 0 ) =
m m m 2 1 X 2 X 1 X 2 fη (yj ) − fη0 (yj ) − δn,j fη (yj ) − fη0 (yj ) + δn,j . m m m j=1
j=1
j=1
2 R 1−δ The first term converges to δ fη (y) − fη0 (y) dy as m → ∞. For the second term we have with g := fη − fη0 , being uniformly bounded on Ξ × [0, 1], due to Proposition 7.3 m m 2 X 4 X Pη δn,j g(yj ) > ε 6ε−2 2 |E[δn,j δn,k ]| · |g(yj )g(yk )| m m j=1
6ε−2
4 m2
j,k=1 m X
1{j=k} ∆n Γσ04 e−2κyj + O(∆n3/2 ) · |g(yj )g(yk )|
j,k=1
=O ∆n /m + ∆n3/2 . 2 ] = ∆ Γσ 4 e−2κyj : For the third term in (57) we obtain with E[δn,j n 0
Pη
m 1 X
m
j=1
2m 2 2 2 δn,j > ε 6 m sup Pη δn,j > ε/2 6 sup E δn,j = O(m∆n ) . ε j j
Step 2: We prove consistency of ηˆ = (ˆ σ02 , κ). ˆ Since K(η, ·) and Kn are continuous, consistency follows from continuous mapping if the stochastic convergence of Kn is also uniform. Now, tightness follows from the equicontinuity: lim lim sup Pη sup |Kn (η) − Kn (η 0 )| > ε δ→0 n→∞
|η 0 −η| 1, c u pkl ∈Pτ −1 where Pτ −1 denotes the space of real polynomials of degree 6 τ − 1 and k · k denotes the spectral norm. P ROOF. For a better readability, we write (ξ(t))t∈Z in the proof. By Theorem 1 from Kolmogorov and Rozanov (1961) the mixing coefficient equals the maximal correlation over the closed linear envelops: ρ(τ ) = sup E[η1 η2 ] : η1 ∈ H(−∞, 0), η2 ∈ H(τ, ∞), E[η12 ] = E[η22 ] = 1 , where we define for any interval I the set H(I) := spanR {ξk (t) : t ∈ I, 1 6 k 6 d} as the linear span of the coefficient processes of (ξ(t))t∈I closed with respect to the scalar product hη1 , η2 i = E[η1 η2 ]. By the spectral representation for multivariate time series, cf. Brockwell and Davis (1991, Chapter Rπ 11.8) we have ξ(t) = −π eitu dZ(u) for a right-continuous orthogonal increment process Z whose distribution is determined by the spectral distribution matrix F . By assumption F can be represented via a spectral density matrix f : [−π, π] → Cd×d . Then for any two elements of spanR {ξk (t) : t ∈ Z, 1 6 k 6 d} K L X X η1 = a> ξ(s ), η = b> 2 k k l ξ(tl ) k=1
PK
l=1
eisk u ,
and a mapping T defined by T (η1 ) = k=1 ak we have X > hη1 , η2 i = E[η1 η2 ] = a> k E[ξ(sk )ξ(tl ) ]bl k,l π
Z =
−π
T (η1 )(u)> f (u)T (η2 )(u)du =: hT (η1 ), T (η1 )if .
VOLATILITY ESTIMATION FOR SPDES USING HIGH-FREQUENCY OBSERVATIONS
41
Defining the real d-dimensional linear span LI,d (f ) := spanRd {eit· : t ∈ I} closed with respect to h·, ·if , T can be extended to a unique isomorphism from H(I) onto LI (f ), cf. Ibragimov and Rozanov (1978, Chapter I.6). We conclude Z π
ρ(τ ) = sup eiτ u ϕ(u)> f (u)ψ(u)du = sup eiτ · ϕ, ψif , ϕ,ψ
−π
ϕ,ψ
where the suprema are taken over ϕ, ψ ∈ L[0,∞),d (f ) satisfying kϕkf = kψkf = 1. In view of Ibragimov and Rozanov (1978, Theorem 1 in Chapter II, see also p.145), we have Z π einu ϕk (u)ψl (u) = 0 for all k, l = 1, . . . , d and n = 1, 2, . . . −π
For any function P : C → Cd×d whose entries are all elements of Pτ −1 we deduce Z π Z π iτ u > iτ u > iu sup e ϕ(u) f (u)ψ(u)du = sup e ϕ(u) f (u) − P (e ) ψ(u)du ϕ,ψ
−π
Z
ϕ,ψ π
6 max kf (u) − P (u)k sup u∈[−π,π]
u∈[−π,π]
kϕk (u)k`2 kψl (u)k`2 du −π
ϕ,ψ
Z 6 max kf (u) − P (u)k
−π
π
−π
1/2 Z kϕk (u)k2`2 du
π
−π
1/2 kψk (u)k2`2 du ,
with the spectral norm k · k. Using the uniform lower bound c on the minimal eigenvalue of f (u), we conclude 1 1/2 1/2 ρ(u) 6 max kf (u) − P (u)k kϕkf kψkf . c u∈[−π,π] Since kϕkf = kψkf = 1, minimizing in P yields the asserted bound. REFERENCES Bagchi, A. and Kumar, K. S. (2001). An infinite factor model for the interest rate derivatives. In Kohlmann, M. and Tang, S., editors, Mathematical Finance: Workshop of the Mathematical Finance Research Project, Konstanz, Germany, October 5–7, 2000, pages 59–68. Birkh¨auser Basel, Basel. Bishwal, J. P. N. (2002). The Bernstein-von Mises theorem and spectral asymptotics of Bayes estimators for parabolic SPDEs. Journal of the Australian Mathematical Society, 72(2):287–298. Bishwal, J. P. N. (2008). Parameter estimation in stochastic differential equations, volume 1923 of Lecture Notes in Mathematics. Springer, Berlin. Bouchaud, J.-P., Sagna, N., Cont, R., El-Karoui, N., and Potters, M. (1999). Phenomenology of the interest rate curve. Applied Mathematical Finance, 6(3):209–232. Bradley, R. C. (2005). Basic properties of strong mixing conditions. A survey and some open questions. Probab. Surv., 2:107–144. Brockwell, P. J. and Davis, R. A. (1991). Time series: theory and methods. 2nd ed. Berlin etc.: Springer-Verlag, 2nd ed. edition. Cialenco, I. and Glatt-Holtz, N. (2011). Parameter estimation for the stochastically perturbed Navier-Stokes equations. Stochastic Processes and their Applications, 121(4):701–724. Cialenco, I., Gong, R., and Huang, Y. (2016). Trajectory fitting estimators for spdes driven by additive noise. Statistical Inference for Stochastic Processes, , forthcoming. Cialenco, I. and Huang, Y. (2017). A note on parameter estimation for discretely sampled spdes. arXiv:1710.01649. Cont, R. (2005). Modeling term structure dynamics: an infinite dimensional approach. International Journal of Theoretical and Applied Finance, 8(3):357–380. Da Prato, G. and Zabczyk, J. (1992). Stochastic equations in infinite dimensions, volume 44 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge. Doukhan, P. (1994). Mixing, volume 85 of Lecture Notes in Statistics. Springer-Verlag, New York. Properties and examples.
42
BIBINGER AND TRABS
Filipovi´c, D. (2001). Consistency problems for Heath-Jarrow-Morton interest rate models, volume 1760 of Lecture Notes in Mathematics. Springer-Verlag, Berlin. Fox, J. and Weisberg, S. (2000). Nonlinear regression and nonlinear least squares in r. An Appendix to An R Companion to Applied Regression , second edition, https://socserv.socsci.mcmaster.ca/jfox/Books/ Companion/appendix/Appendix-Nonlinear-Regression.pdf. Frankignoul, C. (1979). Large scale air-sea interactions and climate predictability. Elsevier Oceanography Series, 25:35–55. Hairer, M. (2013). Solving the KPZ equation. Annals of Mathematics, 178(2):559–664. Heath, D., Jarrow, R., and Morton, A. (1992). Bond pricing and the term structure of interest rates: a new methodology for contingent claims valuation. Econometrica, 60(1):77–105. H¨ubner, M., Khasminskii, R., and Rozovskii, B. L. (1993). Two examples of parameter estimation for stochastic partial differential equations. In Stochastic processes, pages 149–160. Springer, New York. H¨ubner, M. and Lototsky, S. (2000). Asymptotic analysis of a kernel estimator for parabolic SPDE’s with time-dependent coefficients. The Annals of Applied Probability, 10(4):1246–1258. H¨ubner, M. and Rozovski˘ı, B. L. (1995). On asymptotic properties of maximum likelihood estimators for parabolic stochastic PDE’s. Probability Theory and Related Fields, 103(2):143–163. Ibragimov, I. and Rozanov, Y. (1978). Gaussian random processes. Translated by A. B. Aries. Applications of Mathematics. 9. New York - Heidelberg - Berlin: Springer- Verlag. X, 275 p. $ 27.30; DM 49.60 (1978). Jacod, J. (1997). On continuous conditional gaussian martingales and stable convergence in law. S´eminaire de Probabiliti´es, pages 232–246. Jacod, J. and Protter, P. (2012). Discretization of processes. Springer. Jacod, J. and Rosenbaum, M. (2013). Quarticity and other functionals of volatility: efficient estimation. The Annals of Statistics, 41(3):1462–1484. Kolmogorov, A. and Rozanov, Y. (1961). On strong mixing conditions for stationary Gaussian processes. Theory of Probability and its Applications, 5:204–208. Lototsky, S. V. (2009). Statistical inference for stochastic parabolic equations: a spectral approach. Publicacions Matem`atiques, 53(1):3–45. Markussen, B. (2003). Likelihood inference for a discretely observed stochastic partial differential equation. Bernoulli, 9(5):745–762. Mohapl, J. (1997). On estimation in the planar Ornstein-Uhlenbeck process. Communications in Statistics. Stochastic Models, 13(3):435–455. Musiela, M. (1993). Stochastic pdes and term structure models. Journees Internationales de Finance, IGR-AFFI, La Baule. Mykland, P. and Zhang, L. (2009). Inference for continuous semimartingales observed at high frequency. Econometrica, 77(5):1403–1445. Piterbarg, L. I. and Ostrovskii, A. G. (1997). Advection and Diffusion in Random Media: Implications for Sea Surface Temperature Anomalies. Springer Science & Business Media. Prakasa Rao, B. L. S. (2002). Nonparametric inference for a class of stochastic partial differential equations based on discrete observations. Sankhy¯a Ser. A, 64(1):1–15. Pruscha, H. (1989). Angewandte Methoden der mathematischen Statistik. Teubner Skripten zur Mathematischen Stochastik. [Teubner Texts on Mathematical Stochastics]. B. G. Teubner, Stuttgart. Lineare, loglineare, logistische Modelle. [Linear, loglinear, logistic models]. Santa Clara, P. and Sornette, D. (2001). The dynamics of the forward interest rate curve with stoachstic string shocks. Review of Financial Studies, 14(1):149 – 185. Schmidt, T. (2006). An infinite factor model for credit risk. International Journal of Theoretical and Applied Finance, 9(1):43–68. Tuckwell, H. C. (2013). Stochastic partial differential equations in neurobiology: Linear and nonlinear models for spiking neurons. In Stochastic Biomathematical Models, pages 149–173. Springer. Utev, S. A. (1990). Central limit theorem for dependent random variables. In Probability theory and mathematical statistics, Vol. II (Vilnius, 1989), pages 519–528. “Mokslas”, Vilnius. Walsh, J. B. (1986). An introduction to stochastic partial differential equations. Springer. M ARKUS B IBINGER , FACHBEREICH 12 M ATHEMATIK UND I NFORMATIK , ¨ M ARBURG , P HILIPPS -U NIVERSIT AT H ANS -M EERWEIN -S TRASSE 6, 35032 M ARBURG , G ERMANY, E- MAIL :
[email protected]
M ATHIAS T RABS , FACHBEREICH M ATHEMATIK , ¨ H AMBURG , U NIVERSIT AT B UNDESSTRASSE 55, 20146 H AMBURG , G ERMANY, E- MAIL :
[email protected]