TESTS OF CORRELATION AMONG WAVELET-BASED ESTIMATES FOR LONG MEMORY PROCESSES
Livia De Giovanni (a) and Maurizio Naldi (b) ∗

(a) Università LUMSA
Piazza delle Vaschette 101 - 00193 Roma - Italy
[email protected]

(b) Università di Roma Tor Vergata
Dipartimento di Informatica Sistemi Produzione
Via del Politecnico 1 - 00133 Roma - Italy
[email protected]

∗ Acknowledgments: Financial support from the Italian Ministry of University and Scientific Research (MIUR), also in the context of the PRIN 2006 MAINSTREAM Project, is gratefully acknowledged.

Key Words: long memory processes; correlation tests; Hurst parameter estimation; wavelets.

ABSTRACT

Long memory models have received a significant amount of attention in the theoretical literature as they cover a wide range of applications, including economics and telecommunications. In recent years a semiparametric estimator of the long memory parameter of stationary processes with long range dependence based on wavelet decomposition has been proposed and studied by Veitch and Abry (1999) under the idealized assumption of decorrelation among wavelet coefficients. The asymptotic statistical analysis of the wavelet based estimator has been recently complemented taking into account the correlations among wavelet coefficients, at fixed scales as well as among different scales (Bardet et al., 2000). The goal of the present paper is to study the statistical properties of the wavelet based estimator for a finite sample size and the correlation among the wavelet based long memory estimates. The analysis is conducted by simulation, through the use of the circulant matrix method and shows that the correlation among wavelet coefficients has an impact on the moments of the wavelet based
estimator and on the correlation among the wavelet based long memory estimates computed on non overlapping blocks of the original process.

1. INTRODUCTION

Granger (1966) first pointed out that the estimated power spectra of many economic variables, such as industrial production and commodity price indexes, suggested the importance of the low frequency components. The success of the long range dependence (LRD) or long memory concept (associated with heavy low frequency components) in economics may also be attributed to the development of a rationale for its presence in macro-level economic and financial systems based on the aggregation of microunits. This was originally proposed in Robinson (1978) and further developed in Granger (1980) and recently in Davidson and Sibbertsen (2005).

In recent years, a semiparametric estimator of the long memory parameter of stationary processes with long range dependence (LRD processes) based on wavelet decomposition has been proposed by Abry and Veitch (1998), Veitch and Abry (1999), and Veitch et al. (2003). This linear wavelet based estimator (linear wavelet estimator in the following) is based on the statistical properties of the coefficients of a discrete wavelet transform of long range dependent processes. The basic idea is to measure the slope (related to the long memory parameter) and the intercept (related to the variance of the process) of a linear regression on the wavelet scales after performing a discrete wavelet transform. Under the idealized assumption of independence of the wavelet coefficients, the estimator is shown to be unbiased and of minimum (or close to minimum) variance for the long memory parameter as well as asymptotically unbiased and efficient for the intercept (Veitch and Abry, 1999). The asymptotic statistical analysis of the linear wavelet estimator has been recently performed taking into account the correlations among wavelet coefficients, at a fixed scale as well as among different scales.
Consistency and asymptotic normality of the estimator have been obtained under appropriate regularity conditions (Bardet et al., 2000).

Tied to the problem of the estimation of the long memory parameter, there is the problem of its time-varying behaviour. The high variability inherent in long memory processes is very easily confused with non stationarity, and, conversely, variability due to non stationarity may be erroneously
taken as long memory. The relevance of the time varying behaviour of the long memory parameter has also been highlighted in some contributions on long memory stochastic volatility models (Vuorenmaa, 2005; Whitcher and Jensen, 2000). In order to monitor the value of the long memory parameter over time, a basic approach is to split the data sequence into a number of non overlapping adjacent blocks and to estimate the long memory parameter separately over each block. Under the idealized assumption of independence of the wavelet coefficients, the linear wavelet estimates computed in each block can be treated as almost independent. This is the basis of an optimal (uniformly most powerful invariant) test for the time constancy of the long memory parameter (Veitch and Abry, 2001).

The main goal of the present paper is to analyze the statistical properties of the linear wavelet estimator for a finite sample size. Moreover, the correlation among the linear wavelet long memory estimates computed on non overlapping blocks of the original process is also studied for a finite sample size. The analysis is conducted by simulation, through the use of the circulant matrix method, which allows an exact generation of long range dependent processes.

The paper is organized as follows. The estimator is described in Section 2. The characteristics of the estimator are analyzed by simulation in Section 3. The correlation among the long memory parameter estimates on non overlapping blocks is analyzed by simulation in Section 4.

2. WAVELET-BASED ESTIMATOR OF THE LONG MEMORY PARAMETER

An important class of long memory processes is derived from self similar processes. A continuous parameter stochastic process $\{Z(t), t \ge 0\}$ is self similar, with self similarity parameter (Hurst parameter) 0 < H < 1, if for any positive real number c the process $\{c^{-H} Z(ct), t \ge 0\}$ has the same finite-dimensional distributions as the original process $\{Z(t), t \ge 0\}$ (Beran, 1994; Samorodnitsky and Taqqu, 1994).
Self similar processes are invariant in distribution under scaling of time and amplitude. A discrete parameter (second order) stationary process $\{X_j, j \ge 0\}$, with variance $\sigma^2 = E(X_j^2) - E^2(X_j)$, is long range dependent (has long memory) if the correlation coefficients $r(k) = \sigma^{-2}\left[E(X_j X_{j+k}) - E^2(X_j)\right]$ take the form $r(k) \approx c_r k^{-(1-\alpha)}$ for some $0 < \alpha < 1$ as k tends to infinity (Beran, 1994; Samorodnitsky and Taqqu, 1994). The case 1/2 <
H < 1 corresponds to long range dependence, whilst for H = 1/2 the coefficients r(k) are zero for every k ≥ 1. If the original process $\{Z(t), t \ge 0\}$ is self similar with stationary increments and self similarity parameter H, then the correlation coefficients r(k) of the increments process $\{X_j = Z(j) - Z(j-1), j \ge 1\}$ take the form $r(k) \approx c_r k^{-2(1-H)}$ as $k \to \infty$ (Beran, 1994). As a consequence, the increments of a self similar process with (second order) stationary increments form a stationary process with long range dependence, where the Hurst parameter and the long memory parameter are linked by the relationship H = (1 + α)/2. The process of the (stationary) increments of a self similar process with self similarity parameter H exhibits power law divergence at the origin of the spectrum, $f(\nu) \approx c_f \nu^{-\alpha}$ as $\nu \to 0$, where $c_f$ is a positive constant (Beran, 1994; Samorodnitsky and Taqqu, 1994).

The discrete wavelet coefficients of a function x(t) are (Daubechies, 1992):

$$ d_{jk} = \int_{-\infty}^{\infty} x(t)\,\psi_{jk}(t)\,dt, \qquad j, k \in \mathbb{Z} \tag{1} $$
where the $\psi_{jk}(t)$ are called the wavelet functions. The whole set of wavelets can be derived from a single function, named the mother wavelet $\psi_0(t)$, through operations of translation and dilation/contraction: $\psi_{jk}(t) = 2^{-j/2}\,\psi_0(2^{-j}t - k)$, $j \ge 0$, $k = \pm 1, \pm 2, \dots$. The coefficients $d_{jk}$ represent the Discrete Wavelet Transform (DWT) of x(t). Among the several possibilities for the choice of the mother wavelet, the Haar wavelet is often used for its finite support property and the simplicity of the coefficient computation procedure. The Haar mother wavelet corresponds to the choice N = 1 of the Daubechies N mother wavelet, where N is a positive integer, the number of vanishing moments of the wavelet (an integer such that $\int t^k \psi_0(t)\,dt = 0$, $\forall k = 0, 1, \dots, N-1$). The Daubechies mother wavelets are continuous except in the case N = 1 (Haar mother wavelet). The greater the N, the longer the support (Daubechies, 1992). In contrast to the trigonometric functions employed in the Fourier series, which are extremely localized in frequency but completely unlocalized in time, wavelets are well localized both in time and frequency. The wavelet coefficients of a long memory process have recently received considerable attention because of their special properties that can be fruitfully used to estimate the long memory parameter α.
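As an illustration of the coefficient computation for the Haar wavelet, the following minimal sketch applies the usual pyramid recursion to a sequence whose length is a power of two. The function name `haar_dwt` and the choice of treating the observed samples directly as the finest-scale approximation coefficients are assumptions made for this example, not part of the original paper.

```python
import math


def haar_dwt(x):
    """Haar pyramid: return ({j: [d_j1, d_j2, ...]}, final approximation).

    Assumption of this sketch: the samples themselves are used as the
    approximation coefficients at the finest scale.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("the length of x must be a power of two")
    approx = [float(v) for v in x]
    details = {}
    root2 = math.sqrt(2.0)
    j = 0
    while len(approx) > 1:
        j += 1
        starts = range(0, len(approx), 2)
        # Detail coefficients: normalized differences of adjacent pairs.
        details[j] = [(approx[i] - approx[i + 1]) / root2 for i in starts]
        # Approximation coefficients: normalized sums of adjacent pairs.
        approx = [(approx[i] + approx[i + 1]) / root2 for i in starts]
    return details, approx[0]


if __name__ == "__main__":
    d, top = haar_dwt([4, 2, 6, 8])
    for scale in sorted(d):
        print(scale, d[scale])
    print("final approximation:", top)
```

Each scale j holds half as many coefficients as the previous one. The transform is orthonormal, so the sum of the squared coefficients equals the sum of the squared samples.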
The wavelet coefficients of a discrete parameter (second order) stationary long memory process $\{X_j, j \ge 1\}$ satisfy the following two properties (Flandrin, 1992; Masry, 1993; Tewfik and Kim, 1992):

Prop 1. Provided $N \ge (\alpha - 1)/2$, the wavelet coefficients $d_{jk}$ with fixed scale index j form a stationary process satisfying $E(d_{jk}^2) = 2^{j\alpha} c_f C(\alpha, \psi_0)$ as $j \to \infty$ (this relationship actually holds true for j larger than an extreme scale $j_1$). The function $C(\alpha, \psi_0)$ is defined as $\int \nu^{-\alpha} |\hat\psi_0(\nu)|^2 d\nu$, $\hat\psi_0(\nu)$ being the Fourier transform of the mother wavelet $\psi_0$. Moreover, $E(d_{jk}^2)$ represents the variance of the process $\{d_{jk}\}$ at fixed scale index j. The expectation $E(d_{jk})$ is null due to the conditions required for the mother wavelet in the framework of multiresolution analysis (Daubechies, 1992; Mallat, 1989).

Prop 2. Provided $N \ge \alpha/2$, it follows that $E(d_{jk} d_{jk'}) \approx |k - k'|^{\alpha - 1 - 2N} \to 0$ as $N \to \infty$ and $|k - k'| \to \infty$.

In Prop 1 and Prop 2, α is the long memory parameter of the power law spectrum (possibly tied to the Hurst parameter of the self similar originating process by α = 2H − 1). Among the available scales $1, \dots, j_n = \log_2(n)$, where n is the number of available data, $j_1$ is the lowest scale above which the log-scale diagram is approximately linear. It has to be identified from the data. The estimator of α depends on the mother wavelet only through its vanishing moments. The choice of N must be large enough to compensate for the divergence of $\nu^{-\alpha}$ (Prop 1).

From Prop 1, Veitch and Abry (1999) propose to estimate $(\alpha, c_f)$ by a linear regression of $\log_2 E(d_{jk}^2)$ versus $\log_2 2^j = j$. The quantity $\log_2 E(d_{jk}^2)$ is estimated by $\log_2\left(\frac{1}{n_j}\sum_{k=1}^{n_j} d_{jk}^2\right) - g_j$, where $n_j$ is the number of available coefficients at scale j ($n_j = n/2^j$) and the $g_j$ are deterministic quantities that account for the fact that $\log_2 E(\cdot) \ne E[\log_2(\cdot)]$. The term $g_j$ is equal to $\psi(n_j/2)/\ln 2 - \log_2(n_j/2)$, including the Psi function $\psi(z) = \Gamma'(z)/\Gamma(z)$, i.e. the logarithmic derivative of the Gamma function.
(Veitch and Abry, 1999). The linear wavelet estimator is based on the following regression model (Veitch and Abry, 1999):

$$ \log_2\left(\frac{1}{n_j}\sum_{k=1}^{n_j} d_{jk}^2\right) - g_j = j\alpha + \log_2\left[c_f C(\alpha, \psi_0)\right] \tag{2} $$

The graph of $\log_2\left(\frac{1}{n_j}\sum_{k=1}^{n_j} d_{jk}^2\right) - g_j$ against the scale index j is the logscale diagram. Moreover, closed form solutions for the variances $\sigma_j^2$ of $\log_2\left(\frac{1}{n_j}\sum_{k=1}^{n_j} d_{jk}^2\right)$ have been obtained (Veitch and Abry, 1999) and used in the weighted linear regression. The value of $\sigma_j^2$ is $\zeta(2, n_j/2)/\ln^2 2$, where $\zeta(s, q)$ is the generalized Riemann Zeta function (Gradshteyn and Ryzhik, 1980). Its evaluation is reported in the Appendix. The estimator of α is the slope of the weighted linear regression. The estimator of $\log_2\left[c_f C(\alpha, \psi_0)\right]$ is the intercept of the weighted linear regression. From its expression, an estimator of the second parameter of interest $c_f$ is easily computed.

In order to state the statistical properties of such estimators, the following supplementary conditions are introduced in Veitch and Abry (1999).

Cond 1. The process $\{X_j, j \ge 1\}$ is a Gaussian stochastic process. As a consequence, the process $\{d_{jk}\}$ is Gaussian.

Cond 2. The random variables $d_{jk}$ with fixed scale index j are independent and identically distributed (i.i.d.).

Cond 3. The processes $\{d_{jk}\}$ and $\{d_{j'k}\}$ are independent for every $j \ne j'$.

Under the two properties and the three conditions stated above, the estimator of the couple $\{\alpha, \log_2[c_f C(\alpha, \psi_0)]\}$ is unbiased, asymptotically efficient and asymptotically normally distributed (Abry and Veitch, 1998). Confidence intervals have been computed using these arguments (Veitch and Abry, 1999).

In Bardet et al. (2000) the statistical analysis of the linear wavelet estimator has been extended to take into account the correlation between wavelet coefficients at fixed scales as well as between different scales. Consistency and asymptotic normality of the estimator have been obtained under some regularity conditions for the spectral density function, and as long as the scale index $j_1(n) \to \infty$ and $n^{-1} 2^{j_1(n)} \to 0$, where n denotes the sample size
(Bardet et al., 2000). The main result by Bardet et al. (2000) can be formulated as follows:

$$ \sqrt{n_{j_1(n)}}\,\left(\hat H_n - H\right) \xrightarrow{d} N\left(0, \sigma^2(H)\right) \qquad \text{as } n \to \infty \tag{3} $$
where H is derived from α. The asymptotic variance of the linear wavelet estimator when considering the non-negligible correlation among wavelet coefficients depends on H. In Bardet et al. (2000) the observed data come from a continuous time process, while a discrete time process is actually considered in practice. In Veitch et al. (2000) it is proven that from the original discrete time process $\{X_j, j \ge 1\}$ a continuous time process $\{X(t), t \ge 0\}$ can be constructed that exhibits the same wavelet coefficients.

3. STATISTICAL PROPERTIES OF THE WAVELET LONG MEMORY ESTIMATOR

In this section a simulation study is performed in order to evaluate the impact of the correlation among wavelet coefficients on the statistical properties of the linear wavelet estimator for a finite sample size. There exist a number of methods dedicated to the synthesis of stationary increments of a self-similar process with self-similarity parameter H = (1 + α)/2, some of which are based on wavelets. Among these, the so called Choleski method (Hall et al., 1998) is valuable since it is exact, even if computationally heavy. Various approximate synthesis methods with reasonable computational loads are known (Park and Willinger, 2000) that yield tractable practical implementations, but they present the drawback that the errors due to the approximation cannot be controlled. The Choleski algorithm is described below.

1. Generate a sequence $Z_n^0 = \{Z_1^0, \dots, Z_n^0\}$ of n i.i.d. standard normal random variables.

2. If R denotes the $n \times n$ correlation matrix defined by

$$ r_{ij} = \frac{1}{2}\left[(k+1)^{2H} - 2k^{2H} + |k-1|^{2H}\right], $$

with k = |j − i| and 1/2 < H < 1, then decompose R into U'U by Choleski factorization (Trefethen and Bau, 1997), and denote $r_{ij}$ by r(k).

3. Define $Z_n = \{Z_1, \dots, Z_n\} = U' Z_n^0$. Then $Z_n$ follows a multivariate normal distribution with zero mean, variances equal to one and correlation matrix R. Since
$r(k) \approx \frac{1}{2} c_r k^{-(1-\alpha)}$, $Z_n$ can be considered as a sample of size n from a Gaussian, stationary LRD process with self-similarity parameter H.

Since the Choleski method is exact but computationally heavy, the simulation presented here has been conducted by the circulant matrix method (Davies and Harte, 1987), which is based on the Choleski method (and is hence exact), but is computationally more efficient for generating long traces. In order to generate a sequence of length n, the method essentially embeds the correlation matrix R of the original process into a non negative definite matrix R0 of size m ≥ 2(n − 1) that is circulant.

Two values of H have been considered, namely H = 0.6 and H = 0.8 (in addition to the H = 0.5 case, associated to uncorrelated wavelet coefficients, which acts as a reference). For each value of H, 1000 independent traces of stationary increments of a self similar process have been generated by using the circulant matrix embedding method. Traces of length $n = 2048 = 2^{11}$ and $n = 32768 = 2^{15}$ have been considered.

First of all, the coverage probability of the linear wavelet estimator has been computed over the full sample of 2048 and 32768 values for three different choices of the mother wavelet: Haar, Daubechies 2 and Daubechies 4 respectively. A goodness of fit test (Veitch et al., 2003) has been applied in order to select the scales to be used. The results are reported in Tables 1 through 3 (the standard errors are reported within parentheses). In Tables 4 and 5 the sample moments of the estimates of the Hurst parameter are reported. A few comments are in order.

1. The variance of the linear wavelet estimator, as expected from the asymptotic results in Bardet et al. (2000), actually depends on H and is larger than the one computed by Abry and Veitch.

2. The confidence intervals computed under the idealized assumption of negligible correlation among wavelet coefficients present a slight inaccuracy associated to undercoverage.
The reason is that the (simulated) distribution of the estimates of H over the 1000 traces for each value of H exhibits:
Table 1: Coverage probabilities for the linear wavelet estimator (standard errors in parentheses)

Trace length = 2^11
                  H=0.6                             H=0.8
Wavelet type      Cov. prob. 95%   Cov. prob. 99%   Cov. prob. 95%   Cov. prob. 99%
Haar              0.941 (0.0075)   0.982 (0.0042)   0.919 (0.0086)   0.978 (0.0046)
Daubechies 2      0.942 (0.0073)   0.984 (0.0040)   0.923 (0.0084)   0.980 (0.0044)
Daubechies 4      0.942 (0.0074)   0.983 (0.0041)   0.931 (0.0080)   0.973 (0.0051)

Trace length = 2^15
                  H=0.6                             H=0.8
Wavelet type      Cov. prob. 95%   Cov. prob. 99%   Cov. prob. 95%   Cov. prob. 99%
Haar              0.941 (0.0074)   0.981 (0.0043)   0.917 (0.0087)   0.972 (0.0052)
Daubechies 2      0.943 (0.0073)   0.985 (0.0038)   0.921 (0.0085)   0.979 (0.0045)
Daubechies 4      0.941 (0.0075)   0.982 (0.0042)   0.933 (0.0079)   0.976 (0.0048)
Table 2: Coverage probabilities for the linear wavelet estimator (H=0.5, Trace length = 2^11; standard errors in parentheses)

Wavelet type      Cov. prob. 95%   Cov. prob. 99%
Haar              0.946 (0.0071)   0.987 (0.0036)
Daubechies 2      0.955 (0.0065)   0.993 (0.0026)
Daubechies 4      0.953 (0.0067)   0.990 (0.0031)
Table 3: Confidence interval size

                    Trace length
Confidence level    2^11     2^15
95%                 0.049    0.022
99%                 0.065    0.029
Table 4: Sample mean and variance of the Hurst parameter

Trace length = 2^11
                  H=0.5                H=0.6               H=0.8
Wavelet type      Mean     Variance    Mean     Variance   Mean     Variance
Haar              0.496    0.00099     0.581    0.015      0.780    0.016
Daubechies 2      0.499    0.00091     0.625    0.020      0.827    0.021
Daubechies 4      0.499    0.00094     0.610    0.024      0.814    0.027

Trace length = 2^15
                  H=0.6                H=0.8
Wavelet type      Mean     Variance    Mean     Variance
Haar              0.595    0.0057      0.794    0.0062
Daubechies 2      0.602    0.0049      0.806    0.005
Daubechies 4      0.602    0.0054      0.808    0.0055
(a) a variance larger than the one computed by Abry and Veitch,
(b) skewness,
(c) negative kurtosis,
so that the confidence interval computed under the hypothesis of normality is narrower than it should be.

3. The coverage probability decreases when H increases. The reason is that the correlation of the $\{d_{jk}\}$ increases with increasing H (Flandrin, 1992; Masry, 1993; Tewfik and Kim, 1992); as a consequence the correlation among wavelet coefficients, not considered in Abry and Veitch, has a larger impact as H increases.

4. The results for the case H = 0.5 (in which the wavelet coefficients are uncorrelated) are different from the cases H = 0.6 and H = 0.8, showing that the correlation among
Table 5: Sample skewness and kurtosis of the Hurst parameter

Trace length = 2^11
                  H=0.5                  H=0.6                  H=0.8
Wavelet type      Skewness   Kurtosis    Skewness   Kurtosis    Skewness   Kurtosis
Haar              -0.095     -0.040      -0.130     -0.156      -0.130     -0.114
Daubechies 2      -0.090     -0.018       0.120     -0.089      -0.107     -0.050
Daubechies 4      -0.080     -0.014       0.110     -0.130       0.099     -0.174

Trace length = 2^15
                  H=0.6                  H=0.8
Wavelet type      Skewness   Kurtosis    Skewness   Kurtosis
Haar              -0.140     -0.143      -0.130     -0.154
Daubechies 2       0.150     -0.101       0.153     -0.098
Daubechies 4       0.190     -0.113       0.163     -0.164
the wavelet coefficients is a possible explanation of the inaccuracy of the confidence intervals based on the linear wavelet estimator, and of the skewness and kurtosis of the related estimated H's.

4. TEST OF CORRELATION AMONG LONG MEMORY WAVELET ESTIMATES ON NON-OVERLAPPING BLOCKS

In this section a simulation study is performed in order to evaluate the correlation among the long memory linear wavelet estimates computed on non overlapping blocks of the original process. Two values of H have been considered, namely H = 0.6 and H = 0.8. For each value of H, l = 100 and l = 500 independent traces of stationary increments of an H self similar process have been generated by the circulant matrix method. The length of each trace is $n = 32768 = 2^{15}$. The Daubechies 2 mother wavelet has been used. A goodness of fit test (Veitch et al., 2003) has been applied in order to select the scales to be used.

Abry and Veitch propose an optimal (uniformly most powerful) test for the time constancy of the long memory parameter (Veitch and Abry, 2001). If a trace is split into m non overlapping blocks and $\hat\alpha_j$ denotes the linear wavelet estimate of the long memory parameter in block j, according to Section 2 the random variables (r.v.'s) $\hat\alpha_j$ can be considered uncorrelated Gaussian, $\hat\alpha_j \sim N(\alpha_j, \sigma_j^2)$, with known common variance $\sigma_j^2 = \sigma^2$. The variance of the long memory wavelet estimator by Abry and Veitch in fact depends only on the number of scales used (see Appendix). The null hypothesis H0 is that the means $\alpha_j$ are identical (equivalent to identical distribution of the r.v.'s) against the alternative H1 that they differ.
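The setup described above (a Gaussian long memory trace split into m non overlapping blocks) can be sketched as follows. As a loudly stated assumption, the sketch uses the Choleski algorithm of Section 3 in place of the circulant matrix method actually used in the paper, because it fits in a few lines of standard-library Python. All function names are hypothetical, and the cubic cost restricts it to short traces.

```python
import math
import random


def fgn_autocorr(k, H):
    """r(k) = 0.5 * (|k+1|^2H - 2|k|^2H + |k-1|^2H)."""
    e = 2.0 * H
    return 0.5 * (abs(k + 1) ** e - 2.0 * abs(k) ** e + abs(k - 1) ** e)


def cholesky_lower(R):
    """Lower-triangular L with L L^T = R (R symmetric positive definite)."""
    n = len(R)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][j] = math.sqrt(R[i][i] - s)
            else:
                L[i][j] = (R[i][j] - s) / L[j][j]
    return L


def simulate_trace(n, H, rng):
    """Steps 1-3 of the Choleski algorithm; L plays the role of U'."""
    R = [[fgn_autocorr(i - j, H) for j in range(n)] for i in range(n)]
    L = cholesky_lower(R)
    z0 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # i.i.d. N(0, 1)
    return [sum(L[i][m] * z0[m] for m in range(i + 1)) for i in range(n)]


def split_blocks(x, m):
    """Split x into m adjacent non overlapping blocks of equal length."""
    size = len(x) // m
    return [x[b * size:(b + 1) * size] for b in range(m)]


if __name__ == "__main__":
    trace = simulate_trace(64, 0.8, random.Random(0))
    for block in split_blocks(trace, 4):
        print(len(block), sum(block) / len(block))
```

The long memory parameter would then be estimated separately on each block. For traces of length $2^{15}$ the pure-Python factorization above is impractical, which is the motivation given in Section 3 for the circulant matrix method.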
The optimality properties of the test hold under the condition of independence of the linear wavelet estimates in each block, which is motivated by the idealized assumption Cond 2 of decorrelation of the wavelet coefficients (Abry and Veitch present evidence of decorrelation among wavelet estimates based upon Fisher's z test, which considers only two adjacent blocks).

In order to test the correlation among long memory estimates two tests have been considered. In the first one each trace has been split into 2, 4, 8 and 16 non overlapping blocks of length $2^{14}$, $2^{13}$, $2^{12}$ and $2^{11}$, respectively. In each one the long memory linear wavelet
estimator has been computed. A test on the structure of the covariance matrix R of the multivariate normal distribution represented by the $\{\hat\alpha_j\}$, $j = 1, \dots, m$, has been applied (Anderson, 1958). This is a test for the null hypothesis $H_0: R = \delta R_0$ against the alternative hypothesis $H_1: R \ne \delta R_0$, where $R_0$ is a positive definite matrix and δ an unknown positive constant. If $R_0$ is a diagonal matrix, the marginal distributions are uncorrelated and, under the hypothesis of a multivariate normal distribution, independent. The size of the matrix $R_0$ is 2x2, 4x4, 8x8 or 16x16 depending on the number of blocks. Within each matrix $R_0$ the variances on the diagonal of the covariance matrix are the same (because the variance of the long memory wavelet estimator by Abry and Veitch depends only on the number of scales used). The scales used are the same within each block. Anyway, under the idealization of uncorrelated wavelet coefficients, the test may be applied to independent (uncorrelated normally distributed) random variables with unknown means and known, possibly different, variances.

The rejection region of the test is $l\left[k \ln\hat\delta - \ln\det\left(R_0^{-1} S\right)\right] \ge \chi^2_{d,\alpha}$, where k is the number of random variables of the multivariate normal distribution (2, 4, 8, 16), l is the sample size (l = 100, l = 500), $\hat\delta = \frac{1}{k}\mathrm{Tr}\left(R_0^{-1} S\right)$ is the maximum likelihood estimate of δ, S is the estimated covariance matrix, and d is the number of degrees of freedom of the chi-square variable, d = (k − 1)(k + 2)/2. The observed significance levels (p-values) are presented in Table 6. A few comments are in order.

1. The analysis of the p-values shows that the null hypothesis of independence among the long memory estimates $\hat\alpha_j$ in each block is rejected at the significance level 0.05 in one case and at the significance level 0.1 in another case (in italics in the table). In two cases the observed test statistic is very close to the threshold value at the significance level 0.1.
In all the other cases the null hypothesis is not rejected.

2. The value of the estimate of δ is close to 2 when the sequence is split into 2 and 4 blocks, and close to 1 when the sequence is split into 8 and 16 blocks.

3. The value of the estimate of δ is always larger in the case H = 0.8 with respect to the
analogous case H = 0.6. The reason is that the correlation of the $\{d_{jk}\}$, which has an impact on the variance of the linear wavelet estimate, increases with increasing H (Flandrin, 1992; Masry, 1993; Tewfik and Kim, 1992).

A second test is proposed to be applied to finite sample sizes. In order to test the correlation among long memory estimates each trace has been split into 4 and 8 non overlapping blocks of length $2^{13}$ and $2^{12}$, respectively. In each block the long memory linear wavelet estimator has been computed. A test of randomness based on a descriptive measure of dependence between observations, the oscillation index of order h, has been applied. The test statistic is $A = (n-1)^{-1}\sum_{i=1}^{n-1} f(x_{i+1}, x_i)$, $f: \mathbb{R}^2 \to \mathbb{R}^+$ being a continuous function having the following properties:

1. $f(x, y) = 0 \iff x = y$
2. $f(x, y) = f(y, x) \quad \forall (x, y) \in \mathbb{R}^2$
3. $f(x, y) < f(x, z) \quad \forall (x, y, z): x < y < z$
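A minimal sketch of the statistic A follows, taking $f(x, y) = (x - y)^2$, which satisfies the three properties above and is the choice adopted later in this section. It also enumerates the exact permutation distribution of A, which is feasible only for short sequences such as 4 or 8 block estimates. The function names and the two-sided p-value convention (twice the smaller tail) are assumptions of this example; the randomized thresholds used in the paper are not reproduced.

```python
import itertools


def oscillation_index(x, f=lambda a, b: (a - b) ** 2):
    """A = (n - 1)^{-1} * sum_{i=1}^{n-1} f(x_{i+1}, x_i)."""
    n = len(x)
    return sum(f(x[i + 1], x[i]) for i in range(n - 1)) / (n - 1)


def permutation_pvalue(x, tol=1e-12):
    """Two-sided p-value of A under the exact permutation distribution.

    All n! orderings are enumerated, so n should be small (n = 8 gives
    40320 orderings). Convention assumed here: twice the smaller tail.
    """
    observed = oscillation_index(x)
    values = [oscillation_index(p) for p in itertools.permutations(x)]
    total = len(values)
    lower = sum(v <= observed + tol for v in values) / total
    upper = sum(v >= observed - tol for v in values) / total
    return min(1.0, 2.0 * min(lower, upper))


if __name__ == "__main__":
    estimates = [0.21, 0.19, 0.24, 0.18]  # hypothetical block estimates
    print(oscillation_index(estimates))
    print(permutation_pvalue(estimates))
```

With four values there are only 4! = 24 orderings, so the attainable significance levels are coarse.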
Let us consider a sequence $X_1, X_2, \dots, X_n$ of n nondegenerate r.v.'s. The null hypothesis H0 is that they are i.i.d., against the alternative H1 that they are not. It is intuitive to reject the null hypothesis when the observed value of A is too small or too large. The main result by Conti and De Giovanni (1994) is that

$$ \frac{\sqrt{n}\,(A - \eta)}{\sqrt{V + 2C}} \xrightarrow{d} N(0, 1) \qquad \text{as } n \to \infty \tag{4} $$
where $C = \mathrm{Cov}\left[f(x_i, x_j), f(x_i, x_k)\right]$, $\eta = E\left[f(x_i, x_j)\right]$, $V = \mathrm{Var}\left[f(x_i, x_j)\right]$. Since the quantities η, V, and C are unknown, consistent estimators of them are introduced in order to build the rejection region of the test. Consistently with (4), the rejection region of the test has the following form: $|A - \hat\eta| \ge z_{1-\alpha/2}\sqrt{(\hat V + 2\hat C)/n}$.

In this paper the permutation distribution of A is used so that the test can be applied to finite sample sizes. If $X_1, X_2, \dots, X_n$ are i.i.d. r.v.'s, conditionally on the observed values $x_1, x_2, \dots, x_n$ the n! permutations of $x_1, x_2, \dots, x_n$ all have the same probability 1/n!. Hence
the permutation distribution of A can be used to determine the rejection region so that the test has a desired significance level. The rejection region should be placed in the tails of the sample permutation distribution of A. The formal proof that the permutation conditional version of A is asymptotically equivalent to the unconditional one can be found in Hoeffding (1952).

In the following the function f is $f(x_{i+1}, x_i) = (x_{i+1} - x_i)^2$. The r.v.'s considered are the Gaussian r.v.'s $\hat\alpha_j \sim N(\alpha_j, \sigma_j^2)$ with known common variance $\sigma_j^2 = \sigma^2$. Since the value of α is constant in each trace, the test checks the independence of the $\hat\alpha_j$'s. The results are presented in Table 7 (the standard errors are reported within parentheses). For each value of H ∈ {0.6, 0.8} and each number of blocks (4, 8), the estimated probability of rejecting H0 though it is true (type I error, H0 corresponding to the $\hat\alpha_j$'s being independent) with respect to two nominal values, 5% and 1%, has been determined considering the exact permutation distribution of A on each trace. The traces considered are 500. For 4 blocks, the 5% and 1% lower and upper thresholds of the rejection region are randomized because of the low number of resulting permuted sequences (4! = 24), which does not allow the required percentile order to be achieved (1/24 = 0.041667). They are true thresholds at the nominal probability of Type I error being equal to 0.082. A few comments are in order.

1. The estimated probability of the Type I error is larger than the nominal one, especially for the case of four blocks. The reason may be that the random variables (and hence the wavelet coefficients) included in 4 non overlapping blocks are more correlated than the variables included in 8 non overlapping blocks.

2. The estimated probability of Type I error exceeds the nominal one in the case H = 0.8 more than in the analogous case H = 0.6.
The reason may be that the correlation of the $\{d_{jk}\}$ increases with increasing H (Flandrin, 1992; Masry, 1993; Tewfik and Kim, 1992).

5. CONCLUSIONS

A computationally efficient semiparametric estimator of the long memory parameter of stationary processes with long range dependence based on wavelet decomposition (Veitch
and Abry, 1999) has been analyzed to verify its statistical properties for a finite sample size. The asymptotic statistical analysis of the wavelet based estimator has in fact been recently complemented taking into account the correlations among wavelet coefficients, at fixed scales as well as among different scales (Bardet et al., 2000). The analysis has been conducted by simulation, through the use of the circulant matrix method. The analysis shows that the correlation of the wavelet coefficients could explain the inaccuracy of the confidence intervals obtained under the assumption of decorrelation of the wavelet coefficients and the correlation among the linear wavelet estimates.

BIBLIOGRAPHY

Abry, P., and Veitch, D. (1998). Wavelet Analysis of Long Range Dependent Traffic. IEEE Transactions on Information Theory, 44, 2–15.

Anderson, T.W. (1958). An Introduction to Multivariate Statistical Analysis. New York: J. Wiley.

Greenacre, M.J. (1984). Correspondence Analysis. New York: Academic Press.

Bardet, J.M., Lang, G., Moulines, E., and Soulier, P. (2000). Wavelet Estimator of Long-Range Dependent Processes. Statistical Inference for Stochastic Processes, 3, 85–99.

Beran, J. (1994). Statistics for Long-Memory Processes. Chapman and Hall.

Conti, P.L., and De Giovanni, L. (1994). On a procedure to test whether the random variables of a sequence are independent and identically distributed, with applications to telephone and packet-switched networks. Proceedings of the 14th International Teletraffic Congress, Antibes Juan-les-Pins (France), 833–840.

Daubechies, I. (1992). Ten Lectures on Wavelets. SIAM.

Davidson, J., and Sibbertsen, P. (2005). Generating schemes for long memory processes: regimes, aggregation and linearity. Journal of Econometrics, 128, 253–282.

Davies, R.B., and Harte, D.S. (1987). Tests for Hurst effect. Biometrika, 74, 95–101.

Flandrin, P. (1992). Wavelet analysis and synthesis of fractional Brownian motion. IEEE Transactions on Information Theory, 38, 910–917.

Gradshteyn, I.S., and Ryzhik, I.M. (1980). Table of Integrals, Series, and Products. Academic Press.

Granger, C.W.J. (1966). The typical spectral shape of an economic variable. Econometrica, 34, 150–161.

Granger, C.W.J. (1980). Long memory relationships and the aggregation of dynamic models. Journal of Econometrics, 14, 227–238.

Hall, P., Jing, B., and Lahiri, S.N. (1998). On the sampling window method for long range dependent data. Statistica Sinica, 8, 1189–1204.

Hoeffding, W. (1952). The large-sample power of tests based on permutations of observations. Annals of Mathematical Statistics, 23, 169–192.

Mallat, S. (1989). A theory for multiresolution signal decomposition: The wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 674–693.

Masry, E. (1993). The wavelet transform of stochastic processes with stationary increments and its application to fractional Brownian motion. IEEE Transactions on Information Theory, 39, 260–264.

Park, K., and Willinger, W. (2000). Self-similar network traffic and performance evaluation. J. Wiley.

Robinson, P.M. (1978). Statistical inference for a random coefficient autoregressive model. Scandinavian Journal of Statistics, 5, 81–90.

Samorodnitsky, G., and Taqqu, M. (1994). Stable Non-Gaussian Random Processes, Stochastic Models with Infinite Variance. Chapman and Hall.

Tewfik, A.H., and Kim, M. (1992). Correlation structure of the discrete wavelet coefficients of fractional Brownian motion. IEEE Transactions on Information Theory, 38, 904–909.

Trefethen, L.N., and Bau, D. (1997). Numerical Linear Algebra. Philadelphia: SIAM.

Veitch, D., and Abry, P. (1999). A Wavelet Based Joint Estimator of the Parameters of Long Range Dependence Traffic. IEEE Transactions on Information Theory, special issue on "Multiscale Statistical Signal Analysis and its Applications", 45, 878–897.

Veitch, D., and Abry, P. (2001). A statistical test for the time constancy of scaling exponents. IEEE Transactions on Signal Processing, 49, 2325–2334.

Veitch, D., Abry, P., and Taqqu, M. (2000). Meaningful MRA initialisation for discrete time series. Signal Processing, 80, 1971–1983.

Veitch, D., Abry, P., and Taqqu, M. (2003). On the Automatic Selection of the Onset of Scaling. Fractals, 11, 377–390.

Vuorenmaa, T. (2005). A wavelet analysis of scaling laws and Long Memory in Stock Market Volatility. Frontiers in Time Series Analysis, Olbia (Italy).

Whitcher, B., and Jensen, M. (2000). Wavelet estimation of a Local Long memory parameter. Exploration Geophysics, 31, 94–103.
APPENDIX: THE VARIANCE OF THE LOGSPECTRUM ESTIMATOR

In (Veitch and Abry, 1999) the variance of the estimator of the logspectrum is evaluated as

$$ \mathrm{Var}\left\{\log_2\left[\hat S_X(\nu_j)\right]\right\} = \mathrm{Var}\left\{\log_2\left[\frac{1}{n_j}\sum_{k=1}^{n_j} |d_{jk}|^2\right]\right\} = \frac{\zeta(2, n_j/2)}{\ln^2(2)} \tag{5} $$

In the same paper the following approximate expression is given

$$ \mathrm{Var}\left\{\log_2\left[\hat S_X(\nu_j)\right]\right\} \approx \frac{2}{n_j \ln^2(2)} \tag{6} $$

However, the original expression can be evaluated exactly, since

$$ \mathrm{Var}\left\{\log_2\left[\hat S_X(\nu_j)\right]\right\} = \frac{\zeta(2, n_j/2)}{\ln^2(2)} = \frac{\sum_{n=0}^{\infty} \frac{1}{(n_j/2 + n)^2}}{\ln^2(2)} = \frac{\zeta(2) - \sum_{i=1}^{n_j/2 - 1} \frac{1}{i^2}}{\ln^2(2)} \tag{7} $$

where ζ(2) ≈ 1.6449 is the ordinary Riemann Zeta function when its argument is 2. It exhibits a nearly geometric growth (by roughly a factor of 2 at each increment of the scale index), with a maximum value of 1.342 at j = 15.
Table 6: Observed significance levels (p-values) for the test of independence of the long memory estimates on non overlapping blocks

H     α     Traces   Blocks   Test statistics   5% Threshold   P-value   δ̂
0.6   0.2   100      2        4.18              5.99           0.123     2.2477
0.6   0.2   100      4        16.16             16.98          0.049     2.0280
0.6   0.2   100      8        45.04             45.89          0.103     0.9817
0.6   0.2   100      16       140.82            163.11         0.350     1.0236
0.6   0.2   500      2        2.14              5.99           0.341     2.1832
0.6   0.2   500      4        11.43             16.91          0.247     2.0945
0.6   0.2   500      8        29.59             49.80          0.730     1.0206
0.6   0.2   500      16       137.44            163.11         0.425     1.0313
0.8   0.6   100      2        2.26              5.99           0.322     2.4261
0.8   0.6   100      4        14.30             14.72          0.099     2.2322
0.8   0.6   100      8        38.72             49.80          0.305     1.0758
0.8   0.6   100      16       146.12            163.11         0.242     1.1026
0.8   0.6   500      2        0.73              5.99           0.693     2.3198
0.8   0.6   500      4        6.09              16.91          0.730     2.2864
0.8   0.6   500      8        23.64             49.80          0.927     1.1069
0.8   0.6   500      16       128.79            163.11         0.634     1.1144
Table 7: Estimated Type I error probability (standard errors in parentheses)

                        Type I error probability   Type I error probability
H     No. of blocks     Nominal value=5%           Nominal value=1%
0.6   4                 < 14.6% (0.015)            < 14.6% (0.015)
0.8   4                 < 15.4% (0.016)            < 15.4% (0.016)
0.6   8                 6% (0.06)                  2% (0.06)
0.8   8                 6% (0.04)                  1% (0.04)