TESTING THE FIT OF A VECTOR AUTOREGRESSIVE MOVING AVERAGE MODEL

By Efstathios Paparoditis
University of Cyprus

First Version received October 2003

Abstract. A new procedure for testing the fit of a multivariate time series model is proposed. The method evaluates the closeness of the sample spectral density matrix of the observed process to the spectral density matrix of the parametric model postulated under the null, using nonparametric estimation techniques for this purpose. The asymptotic distribution of the test statistic is established and an alternative, bootstrap-based method is developed in order to estimate this distribution more accurately under the null hypothesis. Goodness-of-fit diagnostics useful in understanding the test results and identifying sources of model inadequacy are introduced. The applicability of the testing procedure and its capability to detect lack of fit are demonstrated by means of some real data examples.

Keywords. Bootstrap; diagnostics; goodness-of-fit; kernel estimators; periodogram matrix; spectral density matrix.
1. INTRODUCTION
Journal of Time Series Analysis, Vol. 26, No. 4, pp. 543–568 (0143-9782/05/04). © 2005 Blackwell Publishing Ltd., 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA.

In modelling multiple time series the challenge is to find parsimonious models that satisfactorily capture the complicated dependence structure of the observed series while remaining mathematically tractable. Linear multivariate time series models, like vector autoregressive moving-average (VARMA) models, form an important class of such models with a wide range of applications in disciplines like engineering, business and economics. For the particular class of VARMA models several model specification and estimation procedures have been developed and investigated during the last decades; e.g. Hannan (1970), Hannan and Deistler (1988), Lütkepohl (1991) and Reinsel (1993). One important aspect of the modelling process is the evaluation of the fit of a selected model. Depending on the purpose of the model, some alternatives are available in the literature; e.g. Newbold (1983), Lütkepohl (1991) and Reinsel (1993). When specific alternatives are in mind, tests which focus their power in the direction of these alternatives are most appropriate. For instance, Kohn (1979), Hosking (1981) and Poskitt and Tremayne (1982) consider testing a VARMA model against a higher order VARMA alternative. In applications the most popular approach in assessing the fit of a VARMA model is the multivariate portmanteau-type statistic, which is based on testing whether a finite number of
estimated autocorrelation matrices of model residuals are zero; e.g. Chitturi (1974), Hosking (1980) and Li and McLeod (1981).

In this paper, we are interested in goodness-of-fit tests for VARMA models, i.e., tests which can be applied when no a priori information is available about what departures from the null should be anticipated. The null hypothesis is that the observed multivariate process is a VARMA(P,Q) process with fixed and minimal autoregressive and moving average orders P and Q respectively. Such tests evaluate the overall ability of a VARMA model to provide an adequate parametrization of the second order structure of the observed multivariate process. For this purpose a novel method is proposed, which works by evaluating the closeness of the sample spectral density matrix (periodogram) of the observed process to the spectral density matrix of the fitted model. The idea underlying our procedure is to pre- and postmultiply the sample spectral density matrix by the inverse of the square root of the spectral density matrix of the fitted VARMA model. The test statistic developed is then based on the property that if the VARMA model is appropriate then the sample spectral density matrix transformed in this way will behave like the periodogram matrix of a white noise process. Nonparametric (kernel) smoothers together with an integrated squared deviation measure are then used to evaluate the closeness of this matrix to that of a white noise process. The proposed procedure can also be used to explore the fit of the VARMA model and to understand its strengths and weaknesses in parametrizing the covariance structure of the observed series. This is achieved by certain goodness-of-fit diagnostics, which are based on a frequency domain decomposition of the contribution of each of the components of the multivariate model to the test statistic considered.

The idea to test the goodness-of-fit of a time series model using spectral characteristics is not new.
There has been a large body of literature in this area, although dealing exclusively with univariate time series. For testing a completely specified model, Grenander and Rosenblatt (1952, 1957) consider tests based on the difference between the sample spectral distribution function (integrated periodogram) and that of the process under the null. Bartlett (1954, 1966) uses for the same purpose a standardized sample spectral distribution function. For a discussion of these tests see Priestley (1981). The asymptotic behaviour of Bartlett's test statistic has been further investigated by Dahlhaus (1985). Anderson (1993) discusses asymptotic properties of different test statistics based on the standardized spectral distribution function. Tests based on the spectral distribution for compound hypotheses have been recently considered by Chen and Romano (1997) and Hainz and Dahlhaus (2000), where the limiting distribution of the test statistics is approximated by resampling methods.

Compared to the above approaches, and apart from dealing with multivariate time series, our approach is based on smoothed rather than integrated periodogram statistics. Due to smoothing and the lower convergence rate of the nonparametric estimators involved, the asymptotic distribution of the test statistic has the desirable property of being free of unknown parameters or process characteristics even in the case of testing a compound hypothesis. Smoothed-based tests also have some advantages in terms of power for certain classes of local alternatives. For instance, it has been found that in the closely related i.i.d. case and for the so-called sharp-peak or high frequency alternatives, goodness-of-fit tests based on smoothed density estimators are more powerful than tests based on the cumulative distribution function; e.g. Rosenblatt (1975), Ghosh and Huang (1991), Eubank and LaRiccia (1992) and Eubank et al. (1993). Similar results also exist in the regression setting, for instance, when using $L_2$-type tests based on smoothed nonparametric (kernel) estimators of the regression function; e.g. Härdle and Mammen (1993) and Fan and Li (2000). Finally, in the time series set-up and for sequences of local alternatives for which deviations from the null occur locally in the frequency domain or are due to significant autocorrelation at high lags, goodness-of-fit tests based on smoothed spectral density estimators are more powerful than tests based on the spectral distribution function; see Paparoditis (2000b) for some theoretical and numerical comparisons.

Note that in the univariate time series context, nonparametric smoothing methods together with overall measures of discrepancy between estimated spectral density characteristics of the observed process and those of the parametric model have also been used for testing parametric fits. For instance, Prewitt (1993) proposed a test based on smoothed estimators of the spectral density, while Hong (1996) proposed goodness-of-fit tests based on a comparison of the nonparametrically estimated spectral density of time domain model residuals with that of a white noise process. A different approach to testing the fit of a univariate time series model, which is more akin to the procedure proposed in this paper, has been proposed by Paparoditis (2000a).
The test is based on a smoothed estimator of the periodogram rescaled by the spectral density of the fitted parametric model. A comparison of these different methods to those based on the integrated periodogram or on the (more popular) portmanteau type statistics has been given by Hong (1996) and Paparoditis (2000a, 2000b).

The test proposed in this paper extends these approaches to the multivariate case and in particular to the case of testing the fit of a VARMA model. Since model identification and parametrization is more difficult in the multivariate context, it is very useful to have some powerful procedures which assess the overall fit of a selected model class. Our test achieves this in a natural way by evaluating the ability of the proposed parametric fit to capture the whole second order structure of the observed multivariate series. Furthermore, it leads to the development of goodness-of-fit diagnostics, which are very useful in exploring the fit and in detecting weaknesses and deficiencies of the fitted VARMA model.

The paper is organized as follows. Section 2 introduces the proposed test statistic, discusses its properties and establishes its asymptotic distribution. In Section 3, a bootstrap-based method to estimate more accurately the distribution of the test statistic under the null hypothesis is proposed and its properties are investigated. In Section 4, some useful goodness-of-fit diagnostics supporting the interpretability of the test results are developed while, in Section 5, some practical aspects are discussed and applications to real-life data examples are presented. All technical proofs are deferred to Section 6.
2. TESTING THE FIT OF A VARMA MODEL
2.1. The test statistic

Since we are dealing with second order properties we assume that the real valued $m$-dimensional stochastic process $\{X_t, t \in \mathbb{Z}\}$ is generated by

$$X_t = \sum_{j=-\infty}^{\infty} \Psi_j \varepsilon_{t-j}, \tag{1}$$

where $\{\varepsilon_t\}$ is an i.i.d. sequence with mean zero and $\{\Psi_j, j \in \mathbb{Z}\}$ satisfies certain conditions to be specified later. Let $\mathcal{F}$ be the class of spectral density matrices,

$$\mathcal{F} = \big\{ f : f(\lambda) = (2\pi)^{-1}\, \Psi(\lambda)\, \Sigma\, \Psi'(\lambda),\ f > 0 \big\}, \tag{2}$$

where $\Sigma = \mathrm{Var}(\varepsilon_t)$. Here and in the sequel we use the shorthand $C(\lambda)$ for $C(e^{-i\lambda}) = I_m + \sum_j C_j e^{-i\lambda j}$. The hypothesis of interest is that the process $\{X_t\}$ has a VARMA(P,Q) representation, i.e., that

$$H_0 : f \in \mathcal{F}_\Theta \quad \text{against} \quad H_1 : f \in \mathcal{F} \setminus \mathcal{F}_\Theta,$$

where $\mathcal{F}_\Theta \subset \mathcal{F}$ denotes the set of rational spectral densities corresponding to the VARMA representation

$$X_t = \sum_{j=1}^{P} \Phi_j X_{t-j} - \sum_{j=1}^{Q} B_j \varepsilon_{t-j} + \varepsilon_t$$

of the process and the spectral density matrix $f_\Theta \in \mathcal{F}_\Theta$ is given by

$$f_\Theta(\lambda) = (2\pi)^{-1}\, \Phi^{-1}(\lambda)\, B(\lambda)\, \Sigma\, B'(\lambda)\, \Phi'^{-1}(\lambda).$$

In the above notation $\Phi(z)$ and $B(z)$ are matrix polynomials given by $\Phi(z) = I_m - \sum_{j=1}^{P} \Phi_j z^j$ and $B(z) = I_m - \sum_{j=1}^{Q} B_j z^j$ respectively and P and Q are fixed nonnegative integers. Furthermore, $\Theta = (\Phi, B, \Sigma)$, where $\Phi = (\Phi_1, \Phi_2, \ldots, \Phi_P)$, $B = (B_1, B_2, \ldots, B_Q)$ and $\Sigma > 0$. To ensure invertibility and causality of the VARMA process we assume that $|\Phi(z)|$ and $|B(z)|$ have roots outside the unit circle. Furthermore, identifiability of the rational parametrization $f_\Theta$ requires that the matrix polynomials $\Phi(z)$ and $B(z)$ are left coprime, that is, if for some $m \times m$ matrix polynomials $\tilde{\Phi}(z) = I_m - \sum_{j=1}^{P} \tilde{\Phi}_j z^j$ and $\tilde{B}(z) = I_m - \sum_{j=1}^{Q} \tilde{B}_j z^j$ we have $A(z)[\tilde{\Phi}(z)\ \tilde{B}(z)] = [\Phi(z)\ B(z)]$ and if $A(z)$ is not a constant matrix then $A(z)$ is unimodular with $|A(z)| = 1$ (cf. Hannan, 1969).

Assume that observations $X_1, X_2, \ldots, X_n$ from the process in (1) are available and consider the sample spectral density matrix (periodogram) defined by
$$I_{n,X}(\lambda) = J_{n,X}(\lambda)\, \overline{J}_{n,X}(\lambda),$$

where

$$J_{n,X}(\lambda) = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} X_t e^{-i\lambda t}. \tag{3}$$

Here and in the sequel the overline denotes transposition combined with complex conjugation. Recall that if the stochastic process $\{X_t\}$ is generated by (1) and satisfies assumption (A1) below, then the periodogram matrix $I_{n,X}(\lambda)$ can be expressed as

$$I_{n,X}(\lambda_j) = \Psi(\lambda_j)\, I_{n,\varepsilon}(\lambda_j)\, \Psi'(\lambda_j) + R_n(\lambda_j), \tag{4}$$

where $\lambda_j = 2\pi j/n$, $j = 0, 1, 2, \ldots, N = [n/2]$, are the so-called Fourier frequencies, $\Psi(\lambda) = \sum_{k=-\infty}^{\infty} \Psi_k e^{-i\lambda k}$, $I_{n,\varepsilon}(\lambda) = J_{n,\varepsilon}(\lambda)\, \overline{J}_{n,\varepsilon}(\lambda)$ and

$$J_{n,\varepsilon}(\lambda) = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} \varepsilon_t e^{-i\lambda t}. \tag{5}$$

Note that $I_{n,\varepsilon}(\lambda)$ is the periodogram matrix of the i.i.d. series $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$. Furthermore, under the assumption of finite fourth order moments of $\varepsilon_t$, the $(r,s)$th element, $R_{n,rs}(\lambda_j)$, of the remainder matrix $R_n(\lambda_j)$ satisfies $\max_{\lambda_j \in [0,\pi]} E[|R_{n,rs}(\lambda_j)|^2] = O(n^{-1})$; cf. for instance, Brockwell and Davis (1991, Propn 11.7.4).

To motivate the proposed test statistic, assume for the moment that the parameter matrices are known and given by $\Phi_0$, $B_0$ and $\Sigma_0$. Let $\Theta_0 = (\Phi_0, B_0, \Sigma_0)$,

$$f_{\Theta_0}^{-1/2}(\lambda) = \sqrt{2\pi}\, \Sigma_0^{-1/2}\, B_0^{-1}(\lambda)\, \Phi_0(\lambda) \tag{6}$$

and consider the sequence of random matrices $\{U_n(\lambda_j), j = 0, 1, \ldots, N\}$, where $U_n(\lambda) = (U_{n,rs}(\lambda))_{r,s=1,2,\ldots,m}$ is defined by

$$U_n(\lambda) = f_{\Theta_0}^{-1/2}(\lambda)\, I_{n,X}(\lambda)\, \overline{f}_{\Theta_0}^{-1/2}(\lambda). \tag{7}$$

Since under the null hypothesis $\Psi(\lambda) = \Phi_0^{-1}(\lambda) B_0(\lambda)$ with $\Phi_0^{-1}(z) = (I_m - \sum_{j=1}^{P} \Phi_{0,j} z^j)^{-1}$ and $B_0(z) = I_m - \sum_{j=1}^{Q} B_{0,j} z^j$, we have by (4) that

$$I_{n,X}(\lambda_j) = \Phi_0^{-1}(\lambda_j)\, B_0(\lambda_j)\, I_{n,\varepsilon}(\lambda_j)\, B_0'(\lambda_j)\, \Phi_0'^{-1}(\lambda_j) + R_n(\lambda_j). \tag{8}$$

Substituting in (7) the above expression for the periodogram of a VARMA(P,Q) process we get that if $f = f_{\Theta_0}$ then

$$U_n(\lambda_j) = 2\pi\, \Sigma_0^{-1/2}\, I_{n,\varepsilon}(\lambda_j)\, \Sigma_0^{-1/2} + \tilde{R}_n(\lambda_j), \tag{9}$$

where $\tilde{R}_n(\lambda_j) = f_{\Theta_0}^{-1/2}(\lambda_j)\, R_n(\lambda_j)\, \overline{f}_{\Theta_0}^{-1/2}(\lambda_j)$. Thus if the null hypothesis is true then

$$E[U_n(\lambda_j)] = I_m + O(n^{-1/2}), \tag{10}$$

where $I_m$ denotes the $m \times m$ unit matrix and the $O(n^{-1/2})$ term is uniform in $\lambda_j$. Equation (10) suggests that a way to test the null hypothesis is to test that the expected value of the random matrix function $U_n(\lambda)$ equals the unit matrix.
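The step from (9) to (10) can be made explicit. The following short check is a sketch that uses only the definitions above, the independence of the $\varepsilon_t$ (so that $E[\varepsilon_t \varepsilon_s'] = 0$ for $t \neq s$) and the moment bound on $R_n$ quoted from Brockwell and Davis (1991):

```latex
\begin{align*}
E\big[I_{n,\varepsilon}(\lambda_j)\big]
  &= \frac{1}{2\pi n} \sum_{t=1}^{n}\sum_{s=1}^{n}
     E[\varepsilon_t \varepsilon_s']\, e^{-i\lambda_j (t-s)}
   = \frac{1}{2\pi n}\, n\, \Sigma_0
   = \frac{\Sigma_0}{2\pi}, \\
E\big[2\pi\, \Sigma_0^{-1/2}\, I_{n,\varepsilon}(\lambda_j)\, \Sigma_0^{-1/2}\big]
  &= \Sigma_0^{-1/2}\, \Sigma_0\, \Sigma_0^{-1/2} = I_m, \\
\big|E[R_{n,rs}(\lambda_j)]\big|
  &\le \big(E[|R_{n,rs}(\lambda_j)|^2]\big)^{1/2} = O(n^{-1/2}).
\end{align*}
```

The first two lines give the leading term $I_m$ and the third line, an application of the Cauchy–Schwarz inequality, accounts for the $O(n^{-1/2})$ remainder.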
For this a nonparametric estimator of the matrix mean function $E[U_n(\lambda) - I_m]$ is used and the result so obtained is compared to the null matrix. In particular, consider the $m \times m$ matrix of nonparametric estimators $Q_n(\lambda) = (q_{n,rs}(\lambda))_{r,s=1,2,\ldots,m}$ defined by

$$Q_n(\lambda) = \frac{1}{n} \sum_{j=-N}^{N} K_h(\lambda - \lambda_j)\, \big(U_n(\lambda_j) - I_m\big), \tag{11}$$

where $N = [n/2]$, $K_h(\cdot) = h^{-1} K(\cdot/h)$, $K: [-\pi, \pi] \to [0, \infty)$ is a smoothing kernel and $h$ a smoothing bandwidth. Notice that $Q_n(\lambda)$ is a kernel estimator of $E[U_n(\lambda) - I_m]$. Since under the null hypothesis $E[U_n(\lambda) - I_m] = O(n^{-1/2})$, cf. (10), we get under the assumptions stated in Section 2.2 and by standard arguments that if $h \to 0$ such that $nh \to \infty$ as $n \to \infty$, then $Q_n(\lambda) \to 0_{m \times m}$ a.s., where $0_{m \times m}$ denotes the $m \times m$ zero matrix. Thus if the null hypothesis is true, we expect the matrix $Q_n(\lambda)$ to be close to the zero matrix. Now, to assess the overall closeness of $Q_n(\lambda)$ to this matrix a useful statistic is given by

$$\tilde{T}_n = n h^{1/2} \int_{-\pi}^{\pi} \|Q_n(\lambda)\|^2\, d\lambda, \tag{12}$$

where for a square complex matrix $A$, $\|A\| = \sqrt{\mathrm{tr}(A \overline{A})}$ is its Euclidean norm. As the proof of Theorem 1 shows, the use of the inflation factor $n h^{1/2}$ instead of $nh$ is due to the variance of $\tilde{T}_n$. By the preceding discussion we expect $\tilde{T}_n$ to be small if the null hypothesis is true, i.e., large values of $\tilde{T}_n$ argue against the null hypothesis. Note that $\tilde{T}_n$ can also be written as

$$\tilde{T}_n = n h^{1/2} \sum_{r=1}^{m} \sum_{s=1}^{m} \int_{-\pi}^{\pi} |q_{n,rs}(\lambda)|^2\, d\lambda, \tag{13}$$

which shows that $\tilde{T}_n$ is determined by the $L_2$ distance between the elements of the estimated matrix function $Q_n(\lambda)$ and the zero matrix. Therefore, $\tilde{T}_n$ is an overall measure of the closeness of $f_\Theta(\lambda)$ to $I_{n,X}(\lambda)$ which is affected by the deviation of every element of the sample spectral density matrix $I_{n,X}(\lambda)$ from the corresponding element of the parametric spectral density matrix postulated under the null.

In practice, the true parameter matrices $\Phi_0$, $B_0$ and $\Sigma_0$ of the model under the null are rarely known and have to be estimated using the observed series $X_1, X_2, \ldots, X_n$ at hand. Denote by $\hat{\Phi}$, $\hat{B}$ and $\hat{\Sigma}$ a sequence of $\sqrt{n}$-consistent estimators of these parameter matrices. Motivated by the preceding discussion, the test statistic proposed is then given by

$$T_n = n h^{1/2} \int_{-\pi}^{\pi} \|\hat{Q}_n(\lambda)\|^2\, d\lambda, \tag{14}$$

where $\hat{Q}_n(\lambda) = (\hat{q}_{n,rs}(\lambda))_{r,s=1,2,\ldots,m}$ is the kernel estimator

$$\hat{Q}_n(\lambda) = \frac{1}{n} \sum_{j=-N}^{N} K_h(\lambda - \lambda_j)\, \big(\hat{U}_n(\lambda_j) - I_m\big) \tag{15}$$

and $\hat{U}_n(\lambda) = (\hat{U}_{n,rs}(\lambda))_{r,s=1,2,\ldots,m}$ is defined in exactly the same way as $U_n(\lambda)$ given in (7) but by replacing the unknown matrices $\Phi_0$, $B_0$ and $\Sigma_0$ appearing in $f_{\Theta_0}^{-1/2}$ by their corresponding estimators $\hat{\Phi}$, $\hat{B}$ and $\hat{\Sigma}$. Note that a computationally more attractive version of $T_n$ is given by

$$T_{D,n} = 2\pi h^{1/2} \sum_{j=-N}^{N} \|\hat{Q}_n(\lambda_j)\|^2, \tag{16}$$

which is obtained by approximating the integral in (14) by the corresponding Riemann sum.

The estimated sequence of matrices $\hat{U}_n(\lambda_1), \hat{U}_n(\lambda_2), \ldots, \hat{U}_n(\lambda_N)$ can be interpreted as a series of estimated frequency domain model residuals. The idea behind the goodness-of-fit statistic $T_n$ proposed is, therefore, to test the fit of the model by evaluating the whiteness properties of these frequency domain residuals. Essential for this is the property that if the postulated VARMA(P,Q) model is correct then the random matrix $\hat{U}_n(\lambda_j)$ behaves like the periodogram matrix of an $m$-dimensional white noise sequence with mean zero and covariance matrix given by the unit matrix $I_m$. As we will see in the sequel, an interesting aspect of our approach is that it allows for the development of goodness-of-fit diagnostics which are very useful in understanding the results of the testing procedure and in discovering weaknesses and deficiencies of the fitted VARMA model if the null hypothesis is rejected.
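The construction in (7), (11) and (16) can be sketched numerically. The block below is a minimal illustration and not the author's implementation: it assumes a bivariate VAR(1) null model with known parameters (the values of $\Phi$ and $A$ are those of the simulation design reported in Section 3), uses the Bartlett–Priestley kernel, and relies only on the Python standard library. All function names (`bp_kernel`, `inv_sqrt_2x2`, `residual_matrices`, `t_stat`, `simulate_var1`), the seed and the burn-in length are hypothetical choices made for the example.

```python
import cmath
import math
import random

PI = math.pi

def bp_kernel(x):
    """Bartlett-Priestley kernel on [-pi, pi]; (2*pi)^-1 * integral of K equals 1."""
    return 1.5 * (1.0 - (x / PI) ** 2) if abs(x) <= PI else 0.0

def inv_sqrt_2x2(S):
    """Symmetric inverse square root of a 2x2 positive definite matrix."""
    (a, b), (_, d) = S
    s = math.sqrt(a * d - b * b)              # sqrt(det S)
    t = math.sqrt(a + d + 2.0 * s)            # sqrt(tr S + 2 sqrt(det S))
    r = [[(a + s) / t, b / t], [b / t, (d + s) / t]]   # this is S^{1/2}
    det = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    return [[r[1][1] / det, -r[0][1] / det], [-r[1][0] / det, r[0][0] / det]]

def residual_matrices(X, Phi, Sigma):
    """U_n(lambda_j) of eq. (7) for a VAR(1) null model, j = -N, ..., N."""
    n = len(X)
    N = n // 2
    R = inv_sqrt_2x2(Sigma)
    lams, Us = [], []
    for j in range(-N, N + 1):
        lam = 2.0 * PI * j / n
        z = cmath.exp(-1j * lam)
        # J_{n,X}(lambda) = (2 pi n)^{-1/2} sum_t X_t exp(-i lambda t), eq. (3)
        J = [sum(x[r] * cmath.exp(-1j * lam * (t + 1)) for t, x in enumerate(X))
             / math.sqrt(2.0 * PI * n) for r in range(2)]
        # v = Phi(lambda) J with Phi(lambda) = I - Phi exp(-i lambda)
        v = [J[r] - z * (Phi[r][0] * J[0] + Phi[r][1] * J[1]) for r in range(2)]
        # w = sqrt(2 pi) Sigma^{-1/2} v, so that U_n(lambda) = w w^*
        w = [math.sqrt(2.0 * PI) * (R[r][0] * v[0] + R[r][1] * v[1]) for r in range(2)]
        Us.append([[w[r] * w[s].conjugate() for s in range(2)] for r in range(2)])
        lams.append(lam)
    return lams, Us

def t_stat(lams, Us, n, h):
    """Riemann-sum statistic (16): 2 pi sqrt(h) sum_j ||Q_n(lambda_j)||^2."""
    total = 0.0
    for lam in lams:
        for r in range(2):
            for s in range(2):
                delta = 1.0 if r == s else 0.0
                q = sum(bp_kernel((lam - lj) / h) / h * (U[r][s] - delta)
                        for lj, U in zip(lams, Us)) / n
                total += abs(q) ** 2
    return 2.0 * PI * math.sqrt(h) * total

def simulate_var1(n, Phi, A, rng, burn=200):
    """Gaussian VAR(1) path X_t = Phi X_{t-1} + A u_t after a burn-in period."""
    x, out = [0.0, 0.0], []
    for t in range(n + burn):
        u = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        e = [A[r][0] * u[0] + A[r][1] * u[1] for r in range(2)]
        x = [Phi[r][0] * x[0] + Phi[r][1] * x[1] + e[r] for r in range(2)]
        if t >= burn:
            out.append(x)
    return out

Phi = [[0.8, 0.7], [-0.4, 0.6]]
A = [[2.0, 0.25], [0.25, 2.0]]
Sigma = [[sum(A[r][k] * A[s][k] for k in range(2)) for s in range(2)] for r in range(2)]
X = simulate_var1(64, Phi, A, random.Random(1))
lams, Us = residual_matrices(X, Phi, Sigma)
T = t_stat(lams, Us, n=64, h=0.2)
print(T)
```

Because each $U_n(\lambda_j)$ is formed as an outer product $w w^*$, it is Hermitian by construction, which is the property the norm in (12) relies on. In practice the known parameters would be replaced by estimates, as in (14) and (15).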
2.2. Assumptions and large sample behaviour

In order to apply the test statistic $T_n$ its distribution under the null hypothesis is needed. To derive the asymptotic limit of this distribution the following assumptions are imposed.

(A1) The process $\{X_t\}$ is generated according to (1), where $\{\varepsilon_t\}$ is a sequence of independent, identically distributed random variables with mean zero, nonsingular covariance matrix $\Sigma$ and $\varepsilon_t$ has finite absolute eighth order moments. Furthermore, $\sum_{j=-\infty}^{\infty} \|\Psi_j\|\, |j|^{1/2} < \infty$.

(A2) Matrices $\Phi_0$ and $B_0$ in the interior of the parameter space and $\Sigma_0 > 0$ exist such that the sequence $\{\hat{\Phi}, \hat{B}, \hat{\Sigma}\}_{n=1,2,\ldots}$ of estimators used satisfies $\sqrt{n}(\hat{\Phi} - \Phi_0) = O_P(1)$, $\sqrt{n}(\hat{B} - B_0) = O_P(1)$ and $\sqrt{n}(\hat{\Sigma} - \Sigma_0) = O_P(1)$.

(A3) $K$ is a bounded, symmetric, Lipschitz continuous and non-negative kernel with compact support $[-\pi, \pi]$ satisfying $(2\pi)^{-1} \int_{-\pi}^{\pi} K(u)\, du = 1$.

(A4) The bandwidth $h$ satisfies $h \to 0$ and $nh \to \infty$ as $n \to \infty$.

Note that under $H_0$ the matrices $\Phi_0$, $B_0$ and $\Sigma_0$ are the true parameter matrices of the VARMA representation. If $H_0$ is wrong then $\Phi_0$, $B_0$ and $\Sigma_0$ denote the asymptotic limits of the sequence of estimators $\hat{\Phi}$, $\hat{B}$ and $\hat{\Sigma}$ used. In contrast to the case where the null hypothesis is valid, under the alternative the limits $\Phi_0$, $B_0$ and $\Sigma_0$ will generally depend on the particular estimation method used to obtain $\hat{\Theta}$. For instance, if $H_0$ is the hypothesis that $\{X_t\}$ is a first order autoregressive process and $\{\hat{\Theta} = (\hat{\Phi}, \hat{\Sigma})\}_{n \in \mathbb{N}}$ is the sequence of Yule-Walker estimators, then we have under $H_1$ that $\hat{\Theta} \to \Theta_0 = (\Gamma_1' \Gamma_0^{-1},\ \Gamma_0 - \Gamma_1' \Gamma_0^{-1} \Gamma_1)$ a.s. as $n \to \infty$, where $\Gamma_0 = E(X_t X_t')$ and $\Gamma_1 = E(X_t X_{t+1}')$. See Hannan (1970), Lütkepohl (1991) and Reinsel (1993) for different estimation methods of the VARMA parameters.

Theorem 1 deals with the limiting distribution of $T_n$ under the null hypothesis.

Theorem 1. Assume that (A1)–(A4) are fulfilled. If the null hypothesis is true then

$$T_n - \mu_h(K) \Rightarrow N(0, \sigma^2(K))$$

as $n \to \infty$, where

$$\mu_h(K) = m^2 h^{-1/2} \int_{-\pi}^{\pi} K^2(u)\, du \tag{17}$$

and

$$\sigma^2(K) = \frac{m^2}{\pi} \int_{-2\pi}^{2\pi} \left( \int_{-\pi}^{\pi} K(u) K(u + x)\, du \right)^2 dx. \tag{18}$$
Based on Theorem 1 and for $\alpha \in (0, 1)$ a test of asymptotic level $\alpha$ is obtained by rejecting the null hypothesis if

$$\big(T_n - \mu_h(K)\big)/\sigma(K) \ge z_{1-\alpha},$$

where $\sigma(K) = \sqrt{\sigma^2(K)}$ and $z_{1-\alpha}$ is the $(1-\alpha)$th quantile of the standard normal, i.e., $P(Z > z_{1-\alpha}) = \alpha$ for a standard normal distributed random variable $Z$. Note that the asymptotic limiting distribution of $T_n$ under the null hypothesis does not depend on any unknown parameters or process characteristics and it is not affected by the fact that the unknown parameter matrices $\Theta$ are replaced by a $\sqrt{n}$-consistent estimator $\hat{\Theta}$.

Theorem 2 characterizes the behaviour of $T_n$ if the null hypothesis is wrong.

Theorem 2. Let the assumptions of Theorem 1 be true and assume that $f \in \mathcal{F} \setminus \mathcal{F}_\Theta$. If $n \to \infty$, then

$$n^{-1} h^{-1/2}\, T_n \to d := \int_{-\pi}^{\pi} \|D(\lambda)\|^2\, d\lambda$$

in probability, where

$$D(\lambda) = f_{\Theta_0}^{-1/2}(\lambda)\, f(\lambda)\, \overline{f}_{\Theta_0}^{-1/2}(\lambda) - I_m. \tag{19}$$

Since under the alternative $D(\cdot) \neq 0_{m \times m}$, continuity of $D(\cdot)$ implies that $d > 0$. $T_n$ is therefore an omnibus test, i.e., a test which has power against any alternative $f \in \mathcal{F} \setminus \mathcal{F}_\Theta$.
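The centring and scaling constants in (17) and (18) are easy to evaluate numerically for a given kernel. The following sketch does this for the Bartlett–Priestley kernel $K(x) = 3(1 - (x/\pi)^2)/2$ used in the simulations of Section 3, with plain trapezoidal integration; the function names (`integrate`, `mu_h`, `sigma2`, `critical_value`, `reject`) and the numbers of integration steps are illustrative assumptions, not part of the paper.

```python
import math
from statistics import NormalDist

PI = math.pi

def bp_kernel(x):
    """Bartlett-Priestley kernel on [-pi, pi]; (2*pi)^-1 * integral of K equals 1."""
    return 1.5 * (1.0 - (x / PI) ** 2) if abs(x) <= PI else 0.0

def integrate(f, a, b, steps=2000):
    """Composite trapezoidal rule on [a, b]."""
    dx = (b - a) / steps
    inner = sum(f(a + i * dx) for i in range(1, steps))
    return dx * (0.5 * (f(a) + f(b)) + inner)

def mu_h(m, h, K=bp_kernel):
    """Centring term (17): m^2 h^{-1/2} * integral of K^2 over [-pi, pi]."""
    return m ** 2 * h ** -0.5 * integrate(lambda u: K(u) ** 2, -PI, PI)

def sigma2(m, K=bp_kernel, steps=400):
    """Variance (18): (m^2/pi) * integral over [-2pi, 2pi] of the squared
    kernel autoconvolution."""
    def conv(x):
        return integrate(lambda u: K(u) * K(u + x), -PI, PI, steps)
    return m ** 2 / PI * integrate(lambda x: conv(x) ** 2, -2.0 * PI, 2.0 * PI, steps)

def critical_value(m, h, alpha):
    """Asymptotic level-alpha critical value mu_h(K) + z_{1-alpha} * sigma(K)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return mu_h(m, h) + z * math.sqrt(sigma2(m))

def reject(t_n, m, h, alpha=0.05):
    """Rejection rule based on Theorem 1."""
    return t_n >= critical_value(m, h, alpha)

print(mu_h(2, 0.2), critical_value(2, 0.2, 0.05))
```

For $m = 2$ and $h = 0.2$ the centring term is about 67.44 and the 5% critical value about 94.67, which agrees with the 50% and 95% entries of the Normal Approx. row for $n = 64$, $h = 0.2$ in Table I. Note that $\sigma^2(K)$ does not depend on $h$ or $n$.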
3. BOOTSTRAP APPROXIMATIONS
The test statistic $T_n$ is based on the $L_2$ distance between the nonparametrically estimated matrix norm function $\|\hat{Q}_n(\cdot)\|$ and the zero function. Even in the univariate context and for i.i.d. data, it is known that the convergence of such $L_2$-type statistics to their asymptotic limits is very slow; see for instance Härdle and Mammen (1993). For finite sample sizes, we expect, therefore, that the standard normal distribution will not provide a good approximation to the null distribution of $(T_n - \mu_h(K))/\sigma(K)$. To improve upon the Gaussian approximation we propose an alternative, bootstrap-based method. Since we are in a testing context, a bootstrap procedure will be successful only if it is able to mimic correctly the distribution of $T_n$ under the null even if the null hypothesis is wrong. Furthermore, since a particular parametric structure is assumed under the null hypothesis, this parametric structure will be used to generate the bootstrap pseudo-series.

The bootstrap alternative which we propose in order to approximate the distribution of $T_n$ under the null is to generate bootstrap replicates, say $X_1^*, X_2^*, \ldots, X_n^*$, of the observed series using the fitted parametric model and to re-calculate the test statistic considered using these bootstrap replicates. More specifically the following procedure is applied.

1. Generate random samples $X_{-L+1}^*, X_{-L+2}^*, \ldots, X_0^*, \ldots, X_n^*$ by $X_t^* = \sum_{k=1}^{P} \hat{\Phi}_k X_{t-k}^* - \sum_{k=1}^{Q} \hat{B}_k \varepsilon_{t-k}^* + \varepsilon_t^*$, where $X_{-L+1}^*, X_{-L+2}^*, \ldots, X_{-L+P}^*$ are some starting values and $\varepsilon_t^*$ is an i.i.d. sequence satisfying $\varepsilon_t^* \sim N(0, \hat{\Sigma})$.

2. Based on the pseudo-series $X_1^*, X_2^*, \ldots, X_n^*$, estimate the model parameters by applying the same estimation method as the one used to obtain the sample parameter estimates $\hat{\Phi}$, $\hat{B}$ and $\hat{\Sigma}$. Denote these estimates by $\hat{\Phi}^*$, $\hat{B}^*$ and $\hat{\Sigma}^*$.

3. Calculate the sequence of frequency domain random matrices $\hat{U}_n^*(\lambda_j)$ defined by

$$\hat{U}_n^*(\lambda_j) = 2\pi\, (\hat{\Sigma}^*)^{-1/2}\, (\hat{B}^*(\lambda_j))^{-1}\, \hat{\Phi}^*(\lambda_j)\, I_{n,X^*}(\lambda_j)\, \hat{\Phi}^{*\prime}(\lambda_j)\, (\hat{B}^{*\prime}(\lambda_j))^{-1}\, (\hat{\Sigma}^*)^{-1/2},$$

where $I_{n,X^*}(\lambda)$ is the periodogram matrix of the series $X_1^*, X_2^*, \ldots, X_n^*$. Calculate the statistic $T_n^*$, which is the bootstrap analogue of $T_n$ and given by

$$T_n^* = n h^{1/2} \int_{-\pi}^{\pi} \|\hat{Q}_n^*(\lambda)\|^2\, d\lambda \tag{20}$$

with $\hat{Q}_n^*(\lambda) = n^{-1} \sum_{j=-N}^{N} K_h(\lambda - \lambda_j)\, (\hat{U}_n^*(\lambda_j) - I_m)$.

Theorem 3 justifies the use of $T_n^*$ to approximate the distribution of $T_n$ under the null hypothesis.

Theorem 3. Assume (A2)–(A4). As $n \to \infty$ then

$$T_n^* - \mu_h(K) \Rightarrow N(0, \sigma^2(K))$$

in probability, where $\mu_h(K)$ and $\sigma^2(K)$ are given in (17) and (18) respectively.
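The three bootstrap steps can be sketched in a few lines. To keep the example short, the sketch below takes the scalar case $m = 1$ with an AR(1) null model, so that the frequency domain residuals reduce to $2\pi I_{n,X}(\lambda_j)\,|1 - \hat{\phi} e^{-i\lambda_j}|^2/\hat{\sigma}^2$; it uses the Riemann-sum version of the statistic and least squares estimation. The function names, the burn-in length (playing the role of $L$), the seed, the number of replicates and the reported bootstrap p-value are assumptions made for illustration and are not taken from the paper.

```python
import cmath
import math
import random

PI = math.pi

def bp_kernel(x):
    """Bartlett-Priestley kernel on [-pi, pi]."""
    return 1.5 * (1.0 - (x / PI) ** 2) if abs(x) <= PI else 0.0

def fit_ar1(x):
    """Least squares estimates (phi_hat, sigma2_hat) of a zero-mean AR(1)."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    phi = num / den
    resid = [x[t] - phi * x[t - 1] for t in range(1, len(x))]
    return phi, sum(e * e for e in resid) / len(resid)

def t_stat(x, phi, sigma2, h):
    """Riemann-sum statistic for m = 1 and an AR(1) null model."""
    n = len(x)
    N = n // 2
    lams = [2.0 * PI * j / n for j in range(-N, N + 1)]
    dev = []                                   # U_n(lambda_j) - 1
    for lam in lams:
        J = sum(xt * cmath.exp(-1j * lam * (t + 1)) for t, xt in enumerate(x))
        I = abs(J) ** 2 / (2.0 * PI * n)       # periodogram ordinate
        U = 2.0 * PI * I * abs(1.0 - phi * cmath.exp(-1j * lam)) ** 2 / sigma2
        dev.append(U - 1.0)
    total = 0.0
    for lam in lams:
        q = sum(bp_kernel((lam - lj) / h) / h * d for lj, d in zip(lams, dev)) / n
        total += q * q
    return 2.0 * PI * math.sqrt(h) * total

def simulate_ar1(n, phi, sigma2, rng, burn=100):
    """Step 1: pseudo-series from the fitted model with N(0, sigma2) errors."""
    x, out = 0.0, []
    for t in range(n + burn):
        x = phi * x + rng.gauss(0.0, math.sqrt(sigma2))
        if t >= burn:
            out.append(x)
    return out

def bootstrap_pvalue(x, h, B, rng):
    """Observed statistic and the share of bootstrap statistics at least as large."""
    phi, s2 = fit_ar1(x)
    t_obs = t_stat(x, phi, s2, h)
    t_star = []
    for _ in range(B):
        xs = simulate_ar1(len(x), phi, s2, rng)        # step 1
        phi_s, s2_s = fit_ar1(xs)                      # step 2
        t_star.append(t_stat(xs, phi_s, s2_s, h))      # step 3
    return t_obs, sum(ts >= t_obs for ts in t_star) / B

rng = random.Random(0)
x = simulate_ar1(64, 0.5, 1.0, rng)    # illustrative data generated under the null
t_obs, p = bootstrap_pvalue(x, h=0.2, B=99, rng=rng)
print(t_obs, p)
```

Re-estimating the parameters on every pseudo-series (step 2) is what lets the bootstrap distribution reflect the effect of parameter estimation on $T_n$ in finite samples.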
Note that under the assumptions made, the Gaussian distribution used in the first step of the above bootstrap algorithm in order to generate the pseudo-errors $\varepsilon_t^*$ does not affect the consistency properties of the bootstrap procedure even if the distribution of the true errors $\varepsilon_t$ is not Gaussian. This is so since by Theorem 1 the limiting distribution of $T_n$ does not depend on characteristics of the distribution of the noise process $\{\varepsilon_t\}$. The use of the Gaussian distribution $N(0, \hat{\Sigma})$ is, therefore, rather convenient. We demonstrate this behaviour by means of some simulated examples.

Numerical evidence

To illustrate the finite sample performance of the above bootstrap procedure and to compare it to that of the Gaussian approximation given in Theorem 1, a small simulation experiment was conducted. Realizations of length $n = 64$ and $n = 1024$ of the first-order vector autoregressive (VAR(1)) process $X_t = \Phi X_{t-1} + \varepsilon_t$ were generated, where $\Phi = (\phi_{i,j})_{i,j=1,2}$, $\phi_{11} = 0.8$, $\phi_{12} = 0.7$, $\phi_{21} = -0.4$, $\phi_{22} = 0.6$ (see Reinsel, 1993). Two different distributions have been used for the i.i.d. sequence $\{\varepsilon_t\}$. In both cases we set $\varepsilon_t = A(u_{1,t}, u_{2,t})'$, where $A = (a_{i,j})_{i,j=1,2}$ with $a_{11} = a_{22} = 2$ and $a_{12} = a_{21} = 1/4$. In the first case we set $(u_{1,t}, u_{2,t})' \sim N(0, I_2)$, while in the second case the components of $(u_{1,t}, u_{2,t})'$ are independent and uniformly distributed on a symmetric interval, $U(-c, c)$ (the endpoint $c$ is not legible in the source). The test statistic $T_n$ has been calculated using least squares estimates of the parameters $\Phi$ and $\Sigma$, the Bartlett–Priestley kernel given by $K(x) = 3(1 - (x/\pi)^2)/2$ for $|x| \le \pi$ and two different bandwidths $h$ for each sample size $n$.

Table I presents the results obtained in approximating the distribution of $T_n$. The estimated exact percentage points of the distribution of $T_n$ given in this table have been obtained using 10,000 replications of the model. The bootstrap estimates are mean values over 400 replications, where for each replication 1000 bootstrap samples have been used. Note that according to the proposed bootstrap procedure, the errors $\varepsilon_t^*$ have been generated using the bivariate $N(0, \hat{\Sigma})$ distribution. As Table I shows, the bootstrap provides a very accurate estimation of the distribution of the test statistic $T_n$ under the null even for sample sizes as small as $n = 64$ observations and for both error distributions used. It clearly outperforms the Gaussian approximation, which is worse even in the case of $n = 1024$, indicating the expected slow convergence of the distribution of $T_n$ to its asymptotic limit.
TABLE I
Percentage Points of the Distribution of $T_n$

                        5%        25%       50%       75%       90%       95%       99%
n = 64, h = 0.2
  Exact (Normal)      29.765    42.392    54.258    69.119    85.377    96.964   126.103
  Boot. Approx.       28.598    41.776    53.692    68.618    85.636    97.673   124.844
                      (2.442)   (3.287)   (4.109)   (5.421)   (7.268)   (8.607)  (11.854)
  Exact (Uniform)     29.548    42.613    55.221    70.623    88.228   101.053   130.276
  Boot. Approx.       28.597    41.740    53.612    68.504    87.371    98.662   125.384
                      (2.249)   (2.993)   (3.763)   (4.845)   (6.294)   (7.549)  (10.617)
  Normal Approx.      40.209    56.265    67.438    78.611    88.659    94.667   105.939
n = 64, h = 0.1
  Exact (Normal)      52.363    67.707    80.743    96.971   115.093   127.306   157.649
  Boot. Approx.       52.298    67.540    80.917    97.412   115.772   128.578   156.414
                      (1.535)   (1.854)   (2.438)   (3.550)   (5.365)   (6.576)   (9.393)
  Exact (Uniform)     53.017    68.051    81.826    99.236   118.271   131.460   160.697
  Boot. Approx.       52.300    67.536    80.977    97.426   115.633   129.383   156.468
                      (1.548)   (1.750)   (2.293)   (3.249)   (4.538)   (5.523)   (8.396)
  Normal Approx.      68.143    84.199    95.372   106.545   116.592   122.601   133.873
n = 1024, h = 0.06
  Exact (Normal)      77.322    94.677   108.327   123.342   139.000   148.458   169.674
  Boot. Approx.       77.422    94.746   108.401   123.549   138.563   148.288   168.355
                      (1.145)   (0.916)   (0.920)   (1.137)   (1.481)   (1.950)   (3.452)
  Exact (Uniform)     77.567    94.898   108.396   123.446   138.650   148.405   169.062
  Boot. Approx.       77.463    94.735   108.509   123.758   138.903   148.496   167.788
                      (1.096)   (0.863)   (0.899)   (1.047)   (1.481)   (1.973)   (3.629)
  Normal Approx.      95.896   111.52    123.125   134.298   144.34    150.354   161.626
n = 1024, h = 0.03
  Exact (Normal)     129.238   147.795   162.421   178.240   192.981   203.358   223.028
  Boot. Approx.      129.038   148.169   162.749   178.458   193.593   203.265   222.069
                      (1.185)   (0.917)   (0.968)   (1.116)   (1.459)   (1.825)   (3.328)
  Exact (Uniform)    128.815   148.239   162.706   178.340   193.312   203.833   222.366
  Boot. Approx.      128.963   148.122   162.732   178.544   193.897   203.461   222.239
                      (1.293)   (0.901)   (0.933)   (1.084)   (1.404)   (1.889)   (3.351)
  Normal Approx.     146.896   162.952   174.125   185.298   195.345   201.354   212.626

Exact (Normal) refers to the percentage points for the case of normal distributed errors, while Exact (Uniform) refers to the case of uniform distributed errors. The exact values have been approximated via Monte Carlo simulations. Normal Approx. refers to the estimates based on the large sample Gaussian approximation given in Theorem 1 and Boot. Approx. to the bootstrap estimates. Numbers in parentheses are the estimated standard deviations of the bootstrap estimates.

4. EXPLORING THE FIT OF THE VARMA MODEL

Since modelling multiple time series is a complicated and demanding procedure it is not sufficient to have solely a test statistic which summarizes the overall performance of a postulated VARMA model by means of a single value only, i.e., the value of $T_n$ in our case. It is very helpful to develop some statistics which are useful in understanding the test results and the reasons for the possible rejection of the null hypothesis. The aim of this section is to show how the test statistic $T_n$ proposed in this paper can be used for such a purpose.

Approximating the integral in (14) by the corresponding Riemann sum and using the definition of the particular matrix norm applied, we have

$$T_n \approx 2\pi \sqrt{h} \sum_{r=1}^{m} \sum_{s=1}^{m} \sum_{j=-N}^{N} \left| \frac{1}{n} \sum_{l=-N}^{N} K_h(\lambda_j - \lambda_l)\, \big(\hat{U}_{n,rs}(\lambda_l) - \delta_{rs}\big) \right|^2. \tag{21}$$

Note that the right-hand side above represents a decomposition of the contributions to the test statistic $T_n$ according to the different components $\hat{U}_{n,rs}(\lambda_j)$ of the residual matrix $\hat{U}_n(\lambda_j)$ and to the different Fourier frequencies $\lambda_j$. Recall that under the null hypothesis the random matrix $\hat{U}_n(\lambda_j)$ behaves like the periodogram matrix of an $m$-dimensional white noise process with mean zero and
unit variance, i.e., under the null hypothesis the components of $\hat{U}_n(\lambda_j)$ are asymptotically independent from each other. Recall further that $\hat{Q}_n(\lambda) = n^{-1} \sum_{j=-N}^{N} K_h(\lambda - \lambda_j)(\hat{U}_n(\lambda_j) - I_m)$ and that by Lemma 1, $\sqrt{nh}\, \hat{Q}_n(\lambda) = \sqrt{nh}\, Q_n(\lambda) + o_P(1)$, where the matrix $Q_n(\lambda) = (q_{n,rs})_{r,s=1,\ldots,m}$ is defined in (11). By the asymptotic normality of smoothed kernel estimators of the spectral density, we get for any $\lambda \in (0, \pi)$ and under the assumptions made that under validity of the null hypothesis,

$$Z_{rs}^{\mathrm{Re}}(\lambda) := \frac{\sqrt{(2 - \delta_{rs})\, nh}\ \mathrm{Re}\{q_{n,rs}(\lambda)\}}{\tau(K)} \Rightarrow N(0, 1) \tag{22}$$

and for $r \neq s$

$$Z_{rs}^{\mathrm{Im}}(\lambda) := \frac{\sqrt{2nh}\ \mathrm{Im}\{q_{n,rs}(\lambda)\}}{\tau(K)} \Rightarrow N(0, 1), \tag{23}$$

where $\tau(K) = \sqrt{\tau^2(K)}$, $\tau^2(K) = (2\pi)^{-1} \int K^2(u)\, du$, $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$ (see Hannan, 1970). Define next the random variable

$$X_T^2(\lambda_j) = \sum_{r=1}^{m} Z_{rr}^2(\lambda_j) + \sum_{r=1}^{m-1} \sum_{s=r+1}^{m} \Big[ \big(Z_{rs}^{\mathrm{Re}}(\lambda_j)\big)^2 + \big(Z_{rs}^{\mathrm{Im}}(\lambda_j)\big)^2 \Big], \tag{24}$$

with $Z_{rr} := Z_{rr}^{\mathrm{Re}}$, and verify that (cf. eqn 21),

$$2\pi \sqrt{h} \sum_{r=1}^{m} \sum_{s=1}^{m} \sum_{j=-N}^{N} \left| \frac{1}{n} \sum_{l=-N}^{N} K_h(\lambda_j - \lambda_l)\, \big(\hat{U}_{n,rs}(\lambda_l) - \delta_{rs}\big) \right|^2 = c_n \sum_{j=-N}^{N} X_T^2(\lambda_j), \tag{25}$$

where the constant $c_n$ is given by $c_n = 2\pi\, \tau^2(K)/(n \sqrt{h})$. Now under the assumptions of Theorem 1 and under validity of the null hypothesis, $X_T^2(\lambda_j)$ is a sum of the $m^2$ asymptotically independent terms given by $Z_{rr}^2(\lambda_j)$, $r = 1, 2, \ldots, m$, and $(Z_{rs}^{\mathrm{Re}}(\lambda_j))^2$, $(Z_{rs}^{\mathrm{Im}}(\lambda_j))^2$, $r = 1, 2, \ldots, m$ and $s = r+1, r+2, \ldots, m$. According to (22) and (23), each of these terms converges in distribution to a central $\chi^2$ distribution with one degree of freedom. Thus the following result is true:

Proposition 1. Under the assumptions of Theorem 1 and if the null hypothesis is true, then for every $\lambda \in (0, \pi)$, $X_T^2(\lambda) \Rightarrow \chi^2_{m^2}$ as $n \to \infty$, where $\chi^2_{m^2}$ denotes the central $\chi^2$ distribution with $m^2$ degrees of freedom.

Using (21)–(25) we, therefore, get that the test statistic $T_n$ can be approximately written as
TESTING THE FIT N X
Tn cn
XT2 ðkj Þ:
ð26Þ
j¼N
Expression (26) gives an approximate representation of the test statistic $T_n$, which is very useful for exploratory purposes. In particular, if the overall VARMA fit is satisfactory, we expect $X_T^2(\lambda_j)$ to behave for every $\lambda_j$ as a central $\chi^2_{m^2}$ distributed random variable. Furthermore, by (26) the statistic $T_n$ can be approximately written as a sum of such $\chi^2_{m^2}$ distributed random variables. Thus, an informative graph of the overall fit of the VARMA model can be obtained by plotting the values of the ratio

$$C_T^2(\lambda_j; 1-\alpha) := \frac{X_T^2(\lambda_j)}{\chi^2_{m^2;\,1-\alpha}}$$

against the Fourier frequencies $\lambda_j$, $j = 1, 2, \ldots, N$. Here $\chi^2_{k;\,1-\alpha}$ denotes the $(1-\alpha)$th quantile of the central $\chi^2$ distribution with $k$ degrees of freedom and $\alpha$ is some small value in the interval (0, 1), say $\alpha = 0.10$ or 0.05. Frequency regions where the values of $C_T^2(\lambda_j; 1-\alpha)$ exceed 1 are suspicious and point to frequencies where the parametric fit is not satisfactory. For these regions it may further be of interest to identify the sources of the lack of fit, i.e., to identify the components of the sample spectral density matrix which are not appropriately parametrized by the VARMA model and are, therefore, responsible for the high value of $T_n$ leading to a rejection of the null hypothesis. For this a more detailed analysis is required, in which the different components of the statistic $X_T^2(\lambda_j)$ are considered. In particular, the ratios

$$C^2_{Re,rs}(\lambda_j; 1-\alpha) := \frac{[Z^{Re}_{rs}(\lambda_j)]^2}{\chi^2_{1;\,1-\alpha}} \quad\text{and}\quad C^2_{Im,rs}(\lambda_j; 1-\alpha) := \frac{[Z^{Im}_{rs}(\lambda_j)]^2}{\chi^2_{1;\,1-\alpha}} \ \text{ for } r \neq s,$$

can be investigated. Recall that if the sample spectral density matrix is appropriately parametrized by the VARMA model, then we expect $(Z^{Re}_{rs}(\lambda_j))^2$ and $(Z^{Im}_{rs}(\lambda_j))^2$ to behave like a $\chi^2$ distributed random variable with one degree of freedom. Plots of $C^2_{Re,rs}(\lambda_j; 1-\alpha)$ and $C^2_{Im,rs}(\lambda_j; 1-\alpha)$ against the Fourier frequencies $\lambda_j$, $j = 1, 2, \ldots, N$, are, therefore, very informative in evaluating the contributions of the different components to the total sum of squares $X_T^2(\lambda_j)$. Again, frequency regions where the values of $C^2_{Re,rs}(\lambda_j; 1-\alpha)$ or $C^2_{Im,rs}(\lambda_j; 1-\alpha)$ are larger than 1 correspond to frequencies where the dynamics between the $r$th and $s$th component of the multivariate process are not appropriately captured by the VARMA model.
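As a minimal sketch of how the diagnostics of this section might be computed at a single Fourier frequency, the following Python function takes the smoothed matrix $Q_n(\lambda_j)$ as a nested list of complex numbers and returns $X_T^2(\lambda_j)$ together with the three ratios. The function name, the argument names and the toy input are illustrative assumptions and are not taken from the paper. The $\chi^2$ quantiles are supplied by the caller (9.4877 and 3.8415 are the 0.95 quantiles for 4 and 1 degrees of freedom), and the diagonal term $Z_{rr}$ is taken to be $Z^{Re}_{rr}$.

```python
import math


def fit_diagnostics(q, n, h, s2_K, chi2_m2, chi2_1):
    """Diagnostics built from eqs (22)-(24) at one frequency (illustrative helper).

    q       : m x m nested list of complex numbers, the matrix Q_n(lambda_j)
    n, h    : sample size and bandwidth
    s2_K    : s^2(K) = (2*pi)^{-1} * integral of K^2(u) du
    chi2_m2 : (1 - alpha) quantile of chi-square with m^2 degrees of freedom
    chi2_1  : (1 - alpha) quantile of chi-square with 1 degree of freedom
    """
    m = len(q)
    scale = math.sqrt(n * h) / math.sqrt(s2_K)
    x2_total = 0.0
    c2_re, c2_im = {}, {}
    for r in range(m):
        # Diagonal term: (2 - delta_rr) = 1 and only the real part enters.
        z_rr = scale * q[r][r].real
        x2_total += z_rr ** 2
        c2_re[(r + 1, r + 1)] = z_rr ** 2 / chi2_1
        for s in range(r + 1, m):
            # Off-diagonal terms: (2 - delta_rs) = 2 for real and imaginary parts.
            z_re = math.sqrt(2.0) * scale * q[r][s].real
            z_im = math.sqrt(2.0) * scale * q[r][s].imag
            x2_total += z_re ** 2 + z_im ** 2
            c2_re[(r + 1, s + 1)] = z_re ** 2 / chi2_1
            c2_im[(r + 1, s + 1)] = z_im ** 2 / chi2_1
    return {
        "X2_T": x2_total,
        "C2_T": x2_total / chi2_m2,
        "C2_Re": c2_re,
        "C2_Im": c2_im,
    }


# Toy bivariate input (made-up numbers, for illustration only).
q_toy = [[0.10 + 0.00j, 0.05 + 0.02j],
         [0.05 - 0.02j, -0.10 + 0.00j]]
result = fit_diagnostics(q_toy, n=100, h=0.25, s2_K=1.0,
                         chi2_m2=9.4877, chi2_1=3.8415)
print(result["X2_T"], result["C2_T"])
```

Plotting `C2_T` over all Fourier frequencies, and the entries of `C2_Re` and `C2_Im` for selected pairs $(r, s)$, would give graphs of the kind discussed above; values above 1 flag frequencies that deserve a closer look.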
5. PRACTICAL ASPECTS AND REAL-LIFE DATA EXAMPLES
5.1. Choosing the smoothing parameters

To implement the test statistic $T_n$ and the associated diagnostics in practice, some decisions have to be made about the choice of the smoothing parameters, i.e., the kernel $K$ and the bandwidth $h$. For testing purposes the choice of the kernel $K$ has been investigated in the literature in contexts different to that considered here (cf. Ghosh and Huang, 1991). However, the choice of the smoothing bandwidth $h$ seems to be more important. Since the aim is to maximize the power of the test, one approach is to choose $h$ in a way which leads to a good estimate of the unknown matrix function $D(\cdot)$ given in (19) and which dominates the power of the test. In the context of spectral density estimation the issue of bandwidth selection has been investigated by Beltrão and Bloomfield (1987), Hurvich (1985) and Robinson (1991). In particular, for the multivariate case Robinson (1991) suggested an approach based on the multivariate pseudo log-likelihood criterion

$$L(f, I_{n,X}) = \sum_{j=-N}^{N}\left\{\log\det\{f(\lambda_j)\} + \mathrm{tr}\{I_{n,X}(\lambda_j)f^{-1}(\lambda_j)\}\right\}.$$
The analogous approach in our context would be to consider instead of $L(f, I_{n,X})$ the function $L(\tilde{f}, U_n)$, where $\tilde{f}(\lambda) = f_{\Theta_0}^{-1/2}(\lambda)f(\lambda)f_{\Theta_0}^{-1/2}(\lambda)$ and $U_n(\lambda)$ is defined in (7). However, it is easily seen using properties of the determinant and of the trace that

$$L(\tilde{f}, U_n) = L(f, I_{n,X}) + \sum_{j=-N}^{N}\log\det\left\{f_{\Theta_0}^{-1/2}(\lambda_j)f_{\Theta_0}^{-1/2}(\lambda_j)\right\}.$$
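The identity can be checked in two lines. As a sketch, write $A$ for the factor $f_{\Theta_0}^{-1/2}(\lambda_j)$ (a shorthand introduced here for illustration only) and assume, in analogy with the definition of $U_{0,n}$ in Section 6, that $U_n = A\,I_{n,X}\,A$ and $\tilde{f} = A f A$ at each frequency. Only the multiplicativity of the determinant and the cyclic property of the trace are used.

```latex
\begin{align*}
\log\det\{\tilde{f}\} &= \log\det\{A f A\}
   = \log\det\{f\} + \log\det\{A A\},\\
\operatorname{tr}\{U_n \tilde{f}^{-1}\}
  &= \operatorname{tr}\{A\, I_{n,X}\, A\, A^{-1} f^{-1} A^{-1}\}
   = \operatorname{tr}\{A\, I_{n,X}\, f^{-1} A^{-1}\}
   = \operatorname{tr}\{I_{n,X} f^{-1}\}.
\end{align*}
```

Summing both lines over $j = -N, \ldots, N$ gives the stated relation. Since the additional term involves neither $f$ nor the bandwidth, the two criteria have the same minimiser.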
Therefore, a useful procedure is to choose the smoothing bandwidth $h$ by minimizing the objective function

$$CV(h) = \frac{1}{N}\sum_{j=1}^{N}\left\{\log\det\{\hat{f}_{-j}(\lambda_j)\} + \mathrm{tr}\{\hat{I}_{n,X}(\lambda_j)\hat{f}_{-j}^{-1}(\lambda_j)\}\right\},$$

where $\hat{f}_{-j}(\lambda_j) = n^{-1}\sum_{s \in N_j} K_h(\lambda_j - \lambda_s)\hat{I}_{n,X}(\lambda_s)$ and $N_j = \{s: -N \le s \le N \text{ and } s \neq \pm j\}$. That is, $\hat{f}_{-j}(\lambda_j)$ is the leave-one-out kernel estimator of $f(\lambda_j)$, i.e., the estimator of this quantity obtained after deleting the $j$th point. Notice that using the described approach, the value of $h$ which minimizes $CV(h)$ and which is used to calculate the kernel estimators and the test statistic $T_n$ does not depend on the parametric VARMA model fitted to the series.

5.2. Applications to real-life data

Example 1. Consider the bivariate time series of quarterly, seasonally adjusted U.S. fixed investment and change in business inventories for the time period 1947–1971 given in Lütkepohl (1991). Based on different order selection criteria, Reinsel (1993, p. 102) reports that a mixed VARMA(1,1) model as well as a pure VAR(2) model are most appropriate for this series. Reinsel (1993) also proposes a slight modification of the VAR(2) model allowing for a fourth order moving average term in order to capture some remaining seasonal correlation (cf. Reinsel, 1993, pp. 103–104). Based on the objective function $CV(h)$, a bandwidth of $h = 0.19$ has been used to calculate the smoothed estimates. Figure 1a shows the $C_T^2(\cdot; 0.95)$ statistic for the VARMA(1,1) model, Figure 1c shows the same statistic for the VAR(2) model and Figure 1e for the VARMA(2,4) model. As is clearly seen from these plots, the VARMA(2,4) model fits the data best while the fits of the VARMA(1,1) and the VAR(2) model are not satisfactory. The latter two models have a very comparable overall fit and show the same difficulties in parametrizing appropriately the behavior of the periodogram matrix around the frequency $\lambda = 0.375$. Note that $\lambda = 0.375$ is a harmonic of the seasonal (quarterly) frequency. Thus, although both series considered have been seasonally adjusted, Figure 1a,c show that the lack of fit of the VARMA(1,1) and of the VAR(2) model is due to difficulties of these models in capturing some remaining seasonal correlation in the data. To see this more clearly and to understand where the high values of the $C_T^2$ statistics for these models come from, we present in Figure 1b,d the
$C^2_{Re,22}(\cdot; 0.95)$ statistic for the VARMA(1,1) and for the VAR(2) model respectively. The peak around frequency $\lambda = 0.375$ appearing in Figure 1a,c follows closely the corresponding peak around the same frequency of the $C^2_{Re,22}$ statistic shown in Figure 1b,d. Clearly, there is some seasonal behavior left in the series of business inventories which is not captured by the VARMA(1,1) and the VAR(2) fit. Now, Figure 1e,f presents the same statistics for the VARMA(2,4) model as those presented in Figure 1a,b for the VARMA(1,1) and in Figure 1c,d for the VAR(2) model. As Figure 1e,f show, allowing for a fourth order, seasonal moving average term removes completely the seasonal correlation of the series and leads to a substantial improvement of fit.

Figure 1. Plot of some diagnostic statistics for the different VARMA models fitted to the data of Example 1: (a) Plot of $C_T^2(\lambda_j; 0.95)$ and (b) Plot of $C^2_{Re,22}(\lambda_j; 0.95)$, for the VARMA(1,1) model. (c) Plot of $C_T^2(\lambda_j; 0.95)$ and (d) Plot of $C^2_{Re,22}(\lambda_j; 0.95)$ for the VAR(2) model. (e) Plot of $C_T^2(\lambda_j; 0.95)$ and (f) Plot of $C^2_{Re,22}(\lambda_j; 0.95)$ for the VARMA(2,4).

Example 2. Consider the bivariate data of Makridakis and Wheelwright (1978) of weekly production figures, $X_{1,t}$, and billing figures, $X_{2,t}$. There are $n = 100$ observations available. For this series, Reinsel (1993) proposes a VARMA(2,1) model. Estimating this model by the conditional maximum likelihood method using the NAG library routine G13DCF, we get the estimates

$$\hat{\Phi}_1 = \begin{pmatrix} 0.000 & 0.000 \\ 2.034 & 1.539 \end{pmatrix}, \quad \hat{\Phi}_2 = \begin{pmatrix} 0.000 & 0.000 \\ 1.896 & 0.787 \end{pmatrix}, \quad \hat{B}_1 = \begin{pmatrix} 0.497 & 0.000 \\ 0.000 & 0.743 \end{pmatrix},$$

$\widehat{\mathrm{Var}}(\varepsilon_{1,t}) = 2.417$, $\widehat{\mathrm{Var}}(\varepsilon_{2,t}) = 3.417$ and $\widehat{\mathrm{Cov}}(\varepsilon_{1,t}, \varepsilon_{2,t}) = 0.303$, which are very close to those obtained by Reinsel (1993, p. 153). The objective function $CV(h)$ is minimized for the value $h = 0.17$ and for this value of the smoothing parameter we get the value of the test statistic $T_n = 102.94$. Applying the proposed parametric bootstrap procedure with $B = 1000$ replications, we find that this value of $T_n$ corresponds to a p-value of 0.024. Note that in applying the bootstrap and in order to take fully into account the randomness of $h$, the bandwidth used for every bootstrap pseudo-series has been re-estimated using the function $CV(h)$. The results of our procedure lead to a rejection of the VARMA(2,1) model and this conclusion is also supported by an examination of the overall goodness-of-fit diagnostic plot $C_T^2(\cdot; 0.95)$, which is presented in Figure 2a.
Figure 2. Plot of some diagnostic statistics for the VARMA(2,1) model fitted to the data of Example 2: (a) Plot of $C_T^2(\lambda_j; 0.95)$, (b) plot of $C^2_{Im,12}(\lambda_j; 0.95)$ and (c) plot of $C^2_{Re,22}(\lambda_j; 0.95)$ against the Fourier frequencies.
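The bootstrap p-value reported in Example 2 can be sketched generically as follows. This is a hedged illustration and not the author's implementation: `simulate_series` and `statistic` are hypothetical callables standing in for, respectively, drawing a pseudo-series from the fitted Gaussian VARMA model and recomputing $T_n$ on it (including re-estimation of the parameters and re-selection of $h$ by $CV(h)$). The p-value is taken to be the proportion of bootstrap statistics at least as large as the observed one, which is one common convention and is an assumption here. The toy statistic at the bottom only exercises the function.

```python
import random


def bootstrap_p_value(t_obs, simulate_series, statistic, B=1000, seed=0):
    """Proportion of bootstrap statistics that are >= the observed value.

    simulate_series(rng) -> one pseudo-series generated under the fitted null model
    statistic(series)    -> test statistic recomputed from scratch on that series
                            (in the setting of the paper this would refit the
                            model and re-select the bandwidth for every series)
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(B):
        pseudo = simulate_series(rng)
        if statistic(pseudo) >= t_obs:
            exceed += 1
    return exceed / B


# Toy stand-ins (hypothetical): Gaussian white noise and the statistic n * mean^2.
def toy_simulate(rng, n=50):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def toy_statistic(x):
    mean = sum(x) / len(x)
    return len(x) * mean * mean


p_toy = bootstrap_p_value(3.84, toy_simulate, toy_statistic, B=1000, seed=1)
print(p_toy)
```

Passing the random generator into `simulate_series` keeps the whole computation reproducible for a given seed, which is convenient when a reported p-value has to be regenerated.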
Figure 2a shows that the difficulties of the VARMA(2,1) model are due to its inability to parametrize appropriately the low and high frequency behavior of the sample spectral density matrix. A further examination of the diagnostic statistics introduced in this paper shows that the difficulties of the VARMA(2,1) model are mainly due to the parametrization of the low and high frequency dynamic relations between production and billing figures and the low and high frequency behavior of the series of billing figures. This is clearly seen in Figure 2b,c where plots of the statistics $C^2_{Im,12}(\cdot; 0.95)$ and $C^2_{Re,22}(\cdot; 0.95)$ are shown.

Example 3. Consider the trivariate system of quarterly, seasonally adjusted series of West German fixed investment, disposable income and consumption expenditures for the period from 1960 to 1982, discussed in Lütkepohl (1991). Based on order selection criteria, Lütkepohl (1991) selects for the differenced logarithms of these series a VAR(2) model. To assess the fit of this model we apply the test statistic $T_n$ and get for $h = 0.2$ the value $T_n = 75.48$. Using the bootstrap approximation of the null distribution of $T_n$ and based on 1000 bootstrap replications we get that this value of $T_n$ corresponds to a p-value of 0.345. Although this result does not lead to a rejection of the VAR(2) model at the common levels, an examination of the exploratory statistics associated with $T_n$ raises more doubts about the quality of the VAR(2) fit. In particular, an inspection of the overall $C_T^2(\cdot; 0.95)$ statistic, which is shown in Figure 3a, leads to the conclusion that the VAR(2) model has some difficulties in parametrizing satisfactorily the low frequency behavior of the trivariate system. A more detailed examination of the different components of the matrix $\hat{U}_n$ of the VAR(2) fit shows that the components $\hat{U}_{n,21}(\lambda)$ and $\hat{U}_{n,32}(\lambda)$ are responsible for this relatively high value of the $C_T^2(\cdot; 0.95)$ statistic at the low frequencies. Figure 3b,c shows the corresponding plots of the diagnostic statistics $C^2_{Im,21}(\lambda_j; 0.95)$ and $C^2_{Im,32}(\lambda_j; 0.95)$ respectively. These figures suggest that the low frequency dynamic relations between changes in investment and income as well as income and consumption are not captured satisfactorily by the VAR(2) model.
Figure 3. Plot of some diagnostic statistics for the VAR(2) model fitted to the data of Example 3: (a) Plot of $C_T^2(\lambda_j; 0.95)$, (b) plot of $C^2_{Im,21}(\lambda_j; 0.95)$ and (c) plot of $C^2_{Im,32}(\lambda_j; 0.95)$.
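The bandwidths used in the three examples are minimisers of the criterion $CV(h)$ of Section 5.1. The following sketch evaluates that criterion in the univariate special case $m = 1$, where $\log\det$ and the trace reduce to $\log\hat{f}_{-j}$ and $I_n/\hat{f}_{-j}$. Several ingredients are assumptions made for illustration only: the Epanechnikov-type kernel on $[-1, 1]$, the renormalisation of the kernel weights to sum to one (which sidesteps the kernel's normalising constant), the grid of candidate bandwidths, and the simulated AR(1) series. Frequencies are not wrapped around $\pm\pi$.

```python
import cmath
import math
import random


def periodogram(x):
    """I_n(lambda_s) = |sum_t (x_t - mean) exp(-i lambda_s t)|^2 / (2 pi n), s = -N..N."""
    n = len(x)
    mean = sum(x) / n
    N = (n - 1) // 2
    out = {}
    for s in range(-N, N + 1):
        lam = 2.0 * math.pi * s / n
        dft = sum((xt - mean) * cmath.exp(-1j * lam * t) for t, xt in enumerate(x))
        out[s] = abs(dft) ** 2 / (2.0 * math.pi * n)
    return out


def kernel(u):
    """Assumed kernel with support [-1, 1]; another symmetric kernel could be used."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def cv(h, pgram, n):
    """Univariate leave-one-out criterion CV(h); returns inf if an estimate is undefined."""
    N = max(pgram)
    total = 0.0
    for j in range(1, N + 1):
        lam_j = 2.0 * math.pi * j / n
        num = den = 0.0
        for s, i_s in pgram.items():
            if abs(s) == j:  # leave out the ordinates at +/- lambda_j
                continue
            w = kernel((lam_j - 2.0 * math.pi * s / n) / h)
            num += w * i_s
            den += w
        if den <= 0.0 or num <= 0.0:
            return math.inf
        f_hat = num / den  # weights renormalised to sum to one (assumption)
        total += math.log(f_hat) + pgram[j] / f_hat
    return total / N


# Simulated AR(1) series, used only to exercise the code.
rng = random.Random(0)
x, prev = [], 0.0
for _ in range(128):
    prev = 0.6 * prev + rng.gauss(0.0, 1.0)
    x.append(prev)

pgram = periodogram(x)
grid = [0.10 + 0.05 * k for k in range(13)]
best_h = min(grid, key=lambda h: cv(h, pgram, len(x)))
print(best_h)
```

In the multivariate case the same loop would operate on periodogram matrices, with a determinant and a matrix inverse in place of the scalar operations. As noted in Section 5.1, the criterion uses only the data and not the fitted VARMA model, so the selected bandwidth can be shared by all candidate models for a given series.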
6. PROOFS
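Let $W_n(\lambda_j) = 2\pi\Sigma^{-1/2}I_{n,\varepsilon}(\lambda_j)\Sigma^{-1/2} - I_m$, recall that $R_n(\lambda_j)$ is the remainder matrix in eqn (4) and define $T_{0,n} = nh^{1/2}\int_{-\pi}^{\pi}\|Q_{0,n}(\lambda)\|^2\,d\lambda$, where $Q_{0,n}(\lambda) = n^{-1}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)(U_{0,n}(\lambda_j) - I_m)$, $U_{0,n}(\lambda) = f_{\Theta_0}^{-1/2}(\lambda)I_{n,X}(\lambda)f_{\Theta_0}^{-1/2}(\lambda)$ and $f_{\Theta_0}^{-1/2}(\lambda) = \sqrt{2\pi}\,\Sigma_0^{-1/2}B_0^{-1}(\lambda)\Phi_0(\lambda)$.

Lemma 1. Let assumptions (A1)–(A4) be satisfied and assume that $H_0$ is true. If $n \to \infty$, then

$$T_n - T_{0,n} = o_P(1). \tag{27}$$

Proof. Notice that

$$nh^{1/2}\int_{-\pi}^{\pi}\left(\|\hat{Q}_n(\lambda)\|^2 - \|Q_{0,n}(\lambda)\|^2\right)d\lambda = nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\{\hat{Q}_n(\lambda)(\hat{Q}_n(\lambda) - Q_{0,n}(\lambda))'\}\,d\lambda + nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\{(\hat{Q}_n(\lambda) - Q_{0,n}(\lambda))Q_{0,n}'(\lambda)\}\,d\lambda.$$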
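Since

$$L_n = nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\{(\hat{Q}_n(\lambda) - Q_{0,n}(\lambda))Q_{0,n}'(\lambda)\}\,d\lambda = nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\left\{\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)(\hat{U}_n(\lambda_j) - U_{0,n}(\lambda_j))\right)\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)(U_{0,n}(\lambda_j) - I_m)\right)\right\}d\lambda,$$

and $n^{-1}\sum_j K_h(\lambda-\lambda_j)(U_{0,n}(\lambda_j) - I_m) = n^{-1}\sum_j K_h(\lambda-\lambda_j)(W_n(\lambda_j) + \tilde{R}_n(\lambda_j))$, we get that

$$L_n = 2\pi nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\left\{\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)\left[f_{\hat{\Theta}}^{-1/2}(\lambda_j) - f_{\Theta_0}^{-1/2}(\lambda_j)\right]I_{n,X}(\lambda_j)f_{\Theta_0}^{-1/2}(\lambda_j)\right)\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)W_n(\lambda_j)\right)\right\}d\lambda + O_P(h^{1/2}) = \tilde{L}_n + O_P(h^{1/2})$$

with an obvious notation for $\tilde{L}_n$. Note that the $O_P(h^{1/2})$ term above appears because by assumption (A2), $f_{\hat{\Theta}}^{-1/2}(\lambda_j) - f_{\Theta_0}^{-1/2}(\lambda_j) = O_P(n^{-1/2})$ and $R_n(\lambda_j) = O_P(n^{-1/2})$, both uniformly in $\lambda_j$. Now, using a Taylor series expansion of $f_{\hat{\Theta}}^{-1/2}(\lambda_j)$ around $f_{\Theta_0}^{-1/2}(\lambda_j)$ we get

$$\begin{aligned}
\tilde{L}_n &= nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\left\{\left((\hat{\Theta} - \Theta_0)\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)\left.\frac{\partial f_{\Theta}^{-1/2}(\lambda_j)}{\partial\Theta}\right|_{\Theta=\Theta_0} I_{n,X}(\lambda_j)f_{\Theta_0}^{-1/2}(\lambda_j)\right)\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)W_n(\lambda_j)\right)\right\}d\lambda + o_P(1)\\
&= nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\left\{\left((\hat{\Theta} - \Theta_0)\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)G_0(\lambda_j)I_{n,\varepsilon}(\lambda_j)\Sigma_0^{-1/2}\right)\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)W_n(\lambda_j)\right)\right\}d\lambda\\
&\quad + nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\left\{\left((\hat{\Theta} - \Theta_0)\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)\left.\frac{\partial f_{\Theta}^{-1/2}(\lambda_j)}{\partial\Theta}\right|_{\Theta=\Theta_0} R_n(\lambda_j)f_{\Theta_0}^{-1/2}(\lambda_j)\right)\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)W_n(\lambda_j)\right)\right\}d\lambda + o_P(1),
\end{aligned}$$

where $G_0(\lambda) = \sqrt{2\pi}\,\partial f_{\Theta}^{-1/2}(\lambda)/\partial\Theta|_{\Theta=\Theta_0}\,\Phi_0^{-1}(\lambda)B_0(\lambda)$. Since by straightforward calculations,

$$E\left[nh^{1/2}\int_{-\pi}^{\pi}\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)G_0(\lambda_j)I_{n,\varepsilon}(\lambda_j)\Sigma_0^{-1/2}\right)\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)W_n(\lambda_j)\right)d\lambda\right]^2 = O(1)$$

and $\hat{\Theta} - \Theta_0 = O_P(n^{-1/2})$, we get that the first term on the right-hand side of the last equality above is $O_P(n^{-1/2})$. Furthermore, using $R_n(\lambda_j) = O_P(n^{-1/2})$ the second term is $O_P(h^{1/2})$. $\square$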
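Lemma 2. Let the assumptions of Theorem 1 be satisfied and assume that $f = f_{\Theta_0}$. As $n \to \infty$,

$$T_{0,n} = nh^{1/2}\int_{-\pi}^{\pi}\|\Lambda_n(\lambda)\|^2\,d\lambda + o_P(1), \tag{28}$$

where

$$\Lambda_n(\lambda) = \frac{1}{n}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)W_n(\lambda_j). \tag{29}$$

Proof. Note that

$$T_{0,n} = nh^{1/2}\int_{-\pi}^{\pi}\|\Lambda_n(\lambda)\|^2\,d\lambda + nh^{1/2}\int_{-\pi}^{\pi}\|R_{1,n}(\lambda)\|^2\,d\lambda + nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\{\Lambda_n(\lambda)\bar{R}_{1,n}(\lambda)\}\,d\lambda + nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\{\bar{\Lambda}_n(\lambda)R_{1,n}(\lambda)\}\,d\lambda,$$

where

$$R_{1,n}(\lambda) = \frac{1}{n}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)\left(f_{\Theta_0}^{-1/2}(\lambda_j)R_n(\lambda_j)f_{\Theta_0}^{-1/2}(\lambda_j)\right). \tag{30}$$

Using the fact that $R_n(\lambda) = O_P(n^{-1/2})$ uniformly in $\lambda$, we get that $nh^{1/2}\int\|R_{1,n}(\lambda)\|^2\,d\lambda = O_P(h^{1/2})$. Thus it remains to show that

$$nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\{\Lambda_n(\lambda)\bar{R}_{1,n}(\lambda)\}\,d\lambda = o_P(1). \tag{31}$$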
Let $W_{rs}(\lambda)$ and $R_{1,rs}(\lambda)$ be the $(r,s)$th element of the random matrices $W_n(\lambda)$ and $R_{1,n}(\lambda)$ respectively. We then have

$$E\left[nh^{1/2}\int_{-\pi}^{\pi}\mathrm{tr}\{\Lambda_n(\lambda)\bar{R}_{1,n}(\lambda)\}\,d\lambda\right]^2 = \frac{h}{n^2}\sum_{r_1,r_2,s_1,s_2=1}^{m}\ \sum_{j_1,j_2,l_1,l_2=-N}^{N}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi} K_h(\lambda_1-\lambda_{j_1})K_h(\lambda_1-\lambda_{j_2})K_h(\lambda_2-\lambda_{l_1})K_h(\lambda_2-\lambda_{l_2})\,d\lambda_1\,d\lambda_2\; E\left[W_{r_1s_1}(\lambda_{j_1})R_{1,r_1s_1}(-\lambda_{j_2})W_{r_2s_2}(\lambda_{l_1})R_{1,r_2s_2}(-\lambda_{l_2})\right]. \tag{32}$$

Although evaluation of the above expectation term requires tedious manipulations of formulae, the arguments used to show that the terms obtained vanish asymptotically are very similar. We demonstrate this by means of some particular cases. Consider, for instance, the case where all indices $j_1, j_2, l_1, l_2$ in (32) are different. Under the assumptions made and because for $j_1 \neq j_2$, $E(W^2_{r_1s_1}(\lambda_{j_1})W^2_{r_2s_2}(\lambda_{j_2})) = O(n^{-2})$ and $E(R^2_{1,r_1s_1}(\lambda_j)) = O(n^{-1})$, we get by the Cauchy–Schwarz inequality that $E(W_{r_1s_1}(\lambda_{j_1})R_{1,r_1s_1}(-\lambda_{j_2})W_{r_2s_2}(\lambda_{l_1})R_{1,r_2s_2}(-\lambda_{l_2})) = O(n^{-2})$, which implies that the corresponding term in (32) is $O(h)$ and, therefore, vanishes as $n \to \infty$. To give another example in evaluating (32), consider the case where $j_1 = j_2 \neq l_1 = l_2$. By Theorem 2 in Hannan (1970, p. 248) we have $E(R^4_{1,rs}(\lambda_j)) = O(n^{-2})$. From this and the Cauchy–Schwarz inequality we get

$$E\left(W_{r_1s_1}(\lambda_{j_1})R_{1,r_1s_1}(-\lambda_{j_1})W_{r_2s_2}(\lambda_{l_1})R_{1,r_2s_2}(-\lambda_{l_1})\right) \le \left\{E(W_{r_1s_1}(\lambda_{j_1}))^4\right\}^{1/4}\left\{E(R_{1,r_1s_1}(\lambda_{j_1}))^4\right\}^{1/4}\left\{E(W_{r_2s_2}(\lambda_{l_1}))^4\right\}^{1/4}\left\{E(R_{1,r_2s_2}(\lambda_{l_1}))^4\right\}^{1/4} = O(n^{-1}),$$

which implies that the corresponding term in (32) is $O(n^{-1}h^{-1})$. $\square$
Lemma 3. Let the assumptions of Theorem 1 be satisfied. As $n \to \infty$,

$$nh^{1/2}\int_{-\pi}^{\pi}\|\Lambda_n(\lambda)\|^2\,d\lambda - m^2h^{-1/2}\int_{-\pi}^{\pi}K^2(u)\,du \Rightarrow N(0, \sigma^2(K)). \tag{33}$$
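Proof. From the definition of $\Lambda_n(\lambda)$ we have

$$nh^{1/2}\int_{-\pi}^{\pi}\|\Lambda_n(\lambda)\|^2\,d\lambda = \sum_{r=1}^{m} nh^{1/2}\int_{-\pi}^{\pi}\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)\big(\mathrm{Re}\{V_{n,rr}(\lambda_j)\} - 1\big)\right)^2 d\lambda + \sum_{r=1}^{m-1}\sum_{s=r+1}^{m}\left\{ nh^{1/2}\int_{-\pi}^{\pi}\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)\sqrt{2}\,\mathrm{Re}\{V_{n,rs}(\lambda_j)\}\right)^2 d\lambda + nh^{1/2}\int_{-\pi}^{\pi}\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)\sqrt{2}\,\mathrm{Im}\{V_{n,rs}(\lambda_j)\}\right)^2 d\lambda\right\}, \tag{34}$$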
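where $\mathrm{Re}\{V_{n,rs}(\lambda)\}$ resp. $\mathrm{Im}\{V_{n,rs}(\lambda)\}$ denote the real resp. imaginary part of the $(r,s)$th element of the matrix $2\pi\Sigma^{-1/2}I_{n,\varepsilon}(\lambda_j)\Sigma^{-1/2}$. Let $(v_{1,j}, v_{2,j}, v_{3,j})'$ be the $m^2$-dimensional vector with $v_{1,j} := (\mathrm{Re}\{V_{n,rr}(\lambda_j)\} - 1;\ r = 1, \ldots, m)$, $v_{2,j} := (\sqrt{2}\,\mathrm{Re}\{V_{n,rs}(\lambda_j)\};\ r = 1, 2, \ldots, m-1;\ s = r+1, \ldots, m)$, and $v_{3,j} := (\sqrt{2}\,\mathrm{Im}\{V_{n,rs}(\lambda_j)\};\ r = 1, 2, \ldots, m-1;\ s = r+1, \ldots, m)$. The assertion of the lemma follows then by the Cramér–Wold device and because each one of the $m^2$ terms of $(v_{1,j}, v_{2,j}, v_{3,j})'$ centred by $h^{-1/2}\int_{-\pi}^{\pi}K^2(u)\,du$ converges in distribution to the $N(0, \sigma^2(K)/m^2)$ distribution. We demonstrate the arguments used for the term $\mathrm{Re}\{V_{n,rr}(\lambda_j)\}$ only because the other terms can be handled in exactly the same way. Let $\tilde{\varepsilon}_t = \Sigma^{-1/2}\varepsilon_t$ and $\tilde{\varepsilon}_{r,t}$ be the $r$th element of $\tilde{\varepsilon}_t$. Note that $n^{-1}\sum_{t=1}^{n}\sum_{s=1}^{n}\tilde{\varepsilon}_{r,t}\tilde{\varepsilon}_{r,s}\exp\{i\lambda(t-s)\} - 1 = O_P(n^{-1/2}) + S_n(\lambda) + \bar{S}_n(\lambda)$, where $S_n(\lambda) = n^{-1}\sum_{t=1}^{n-1}\sum_{s=t+1}^{n}\tilde{\varepsilon}_{r,t}\tilde{\varepsilon}_{r,s}\exp\{i\lambda(t-s)\}$. We then have

$$\begin{aligned}
&nh^{1/2}\int_{-\pi}^{\pi}\left\{n^{-1}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)\left(n^{-1}\sum_{t=1}^{n}\sum_{s=1}^{n}\tilde{\varepsilon}_{r,t}\tilde{\varepsilon}_{r,s}\exp\{i\lambda_j(t-s)\} - 1\right)\right\}^2 d\lambda - h^{-1/2}\int_{-\pi}^{\pi}K^2(u)\,du\\
&\quad = nh^{1/2}\int_{-\pi}^{\pi}\left\{n^{-1}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)\big(S_n(\lambda_j) + \bar{S}_n(\lambda_j)\big)\right\}^2 d\lambda - h^{-1/2}\int_{-\pi}^{\pi}K^2(u)\,du + o_P(1)\\
&\quad = 4h^{1/2}n\int_{-\pi}^{\pi}\left\{\sum_{s=1}^{n-1}\left(n^{-1}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)\cos(s\lambda_j)\right)\hat{\gamma}_{\tilde{\varepsilon}_r}(s)\right\}^2 d\lambda - h^{-1/2}\int_{-\pi}^{\pi}K^2(u)\,du + o_P(1),
\end{aligned}$$

where $\hat{\gamma}_{\tilde{\varepsilon}_r}(s) = n^{-1}\sum_{t=1}^{n-s}\tilde{\varepsilon}_{r,t}\tilde{\varepsilon}_{r,t+s}$. Approximating $n^{-1}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)\cos(s\lambda_j)$ by the Riemann integral $(2\pi)^{-1}\int K_h(\lambda-x)\cos(xs)\,dx$ and using the well-known trigonometric identities, the last equation can be written as

$$\begin{aligned}
&4h^{1/2}n\int_{-\pi}^{\pi}\left\{\sum_{s=1}^{n-1}\left(\frac{1}{2\pi}\int K(u)\cos(uhs)\,du\right)\hat{\gamma}_{\tilde{\varepsilon}_r}(s)\cos(s\lambda)\right\}^2 d\lambda - h^{-1/2}\int_{-\pi}^{\pi}K^2(u)\,du + o_P(1)\\
&\quad = 4h^{1/2}n\sum_{s=1}^{n-1}\left(\frac{1}{2\pi}\int K(u)\cos(uhs)\,du\right)^2\hat{\gamma}^2_{\tilde{\varepsilon}_r}(s) - h^{-1/2}\int_{-\pi}^{\pi}K^2(u)\,du + o_P(1),
\end{aligned}$$

where the last equality follows using Parseval's identity. Asymptotic normality of

$$4h^{1/2}n\sum_{s=1}^{n-1}\left(\frac{1}{2\pi}\int K(u)\cos(uhs)\,du\right)^2\hat{\gamma}^2_{\tilde{\varepsilon}_r}(s) - h^{-1/2}\int_{-\pi}^{\pi}K^2(u)\,du$$

follows then using Theorem A.1 of Hong (1996). $\square$

Lemma 4.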
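Let the assumptions of Theorem 2 be satisfied. As $n \to \infty$,

$$n^{-1}h^{-1/2}T_n = \int_{-\pi}^{\pi}\|D_n(\lambda)\|^2\,d\lambda + o_P(1),$$

where

$$D_n(\lambda) = \frac{1}{n}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)\left(f_{\Theta_0}^{-1/2}(\lambda_j)f(\lambda_j)f_{\Theta_0}^{-1/2}(\lambda_j) - I_m\right).$$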
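Proof. Let $\Psi_0^{-1}(\lambda) = B_0^{-1}(\lambda)\Phi_0(\lambda)$, $f^{1/2}(\lambda) = (2\pi)^{-1/2}\Psi(\lambda)\Sigma^{1/2}$ and verify by simple algebra that

$$\frac{1}{n}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)(U_{0,n}(\lambda_j) - I_m) = \Lambda_n(\lambda) + D_n(\lambda) + \sum_{s=1}^{2}L_{s,n}(\lambda) + \sum_{s=1}^{3}R_{s,n}(\lambda),$$

where $\Lambda_n(\lambda)$ and $R_{1,n}(\lambda)$ are given in (29) and (30) respectively. Furthermore,

$$L_{1,n}(\lambda) = \frac{2\pi}{n}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)\,\Sigma^{-1/2}\left[\Psi_0^{-1}(\lambda_j)\Psi(\lambda_j) - I_m\right]\left[I_{n,\varepsilon}(\lambda_j) - \Sigma\right]\Sigma^{-1/2},$$

$$L_{2,n}(\lambda) = \frac{2\pi}{n}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)\,f_{\Theta_0}^{-1/2}(\lambda_j)\Psi(\lambda_j)\left[I_{n,\varepsilon}(\lambda_j) - \Sigma\right]\left[\overline{\Psi_0^{-1}(\lambda_j)\Psi(\lambda_j)}' - I_m\right]\Sigma^{-1/2},$$

$$R_{2,n}(\lambda) = \frac{\sqrt{2\pi}}{n}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)\,f_{\Theta_0}^{-1/2}(\lambda_j)R_n(\lambda_j)\left[\overline{\Psi_0^{-1}(\lambda_j)} - \overline{\Psi^{-1}(\lambda_j)}\right]'\Sigma^{-1/2}$$

and

$$R_{3,n}(\lambda) = \frac{\sqrt{2\pi}}{n}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)\,\Sigma^{-1/2}\left[\Psi_0^{-1}(\lambda_j) - \Psi^{-1}(\lambda_j)\right]R_n(\lambda_j)f^{-1/2}(\lambda_j).$$

Since $\int_{-\pi}^{\pi}\|\Lambda_n(\lambda)\|^2\,d\lambda = o_P(1)$, to establish the desired result we have to show that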
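$$\int_{-\pi}^{\pi}\mathrm{tr}\left\{\Lambda_n(\lambda)\left(\bar{D}_n(\lambda) + \sum_{s=1}^{2}\bar{L}_{s,n}(\lambda) + \sum_{s=1}^{3}\bar{R}_{s,n}(\lambda)\right)\right\}d\lambda = o_P(1), \tag{35}$$

$$\int_{-\pi}^{\pi}\mathrm{tr}\left\{D_n(\lambda)\left(\bar{\Lambda}_n(\lambda) + \sum_{s=1}^{2}\bar{L}_{s,n}(\lambda) + \sum_{s=1}^{3}\bar{R}_{s,n}(\lambda)\right)\right\}d\lambda = o_P(1) \tag{36}$$

and

$$\int_{-\pi}^{\pi}\mathrm{tr}\left\{\left(\sum_{s=1}^{2}L_{s,n}(\lambda) + \sum_{s=1}^{3}R_{s,n}(\lambda)\right)\left(\bar{\Lambda}_n(\lambda) + \bar{D}_n(\lambda) + \sum_{s=1}^{2}\bar{L}_{s,n}(\lambda) + \sum_{s=1}^{3}\bar{R}_{s,n}(\lambda)\right)\right\}d\lambda = o_P(1). \tag{37}$$

(35)–(37) can then be shown using $\Lambda_n(\lambda) = O_P(n^{-1/2}h^{-1/2})$, $D_n(\lambda) = O_P(1)$, $L_{s,n}(\lambda) = O_P(n^{-1/2}h^{-1/2})$ and $R_{s,n}(\lambda) = O_P(n^{-1/2})$ uniformly in $\lambda$. For instance, we get

$$\int_{-\pi}^{\pi}\mathrm{tr}\{\Lambda_n(\lambda)(\bar{D}_n(\lambda) + \bar{L}_{1,n}(\lambda) + \bar{R}_{1,n}(\lambda))\}\,d\lambda = O_P(n^{-1/2}h^{-1/2}),$$

$$\int_{-\pi}^{\pi}\mathrm{tr}\{D_n(\lambda)(\bar{L}_{1,n}(\lambda) + \bar{R}_{1,n}(\lambda))\}\,d\lambda = O_P(n^{-1/2}h^{-1/2})$$

and

$$\int_{-\pi}^{\pi}\mathrm{tr}\{(L_{1,n}(\lambda) + R_{1,n}(\lambda))(\bar{L}_{1,n}(\lambda) + \bar{R}_{1,n}(\lambda))\}\,d\lambda = O_P(n^{-1/2}h^{-1/2}),$$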
while the other terms are handled in the same way. $\square$

Proof of Theorem 1. Theorem 1 is an immediate consequence of Lemmas 1–3. $\square$

Proof of Theorem 2. By Lemmas 1 and 4 we get that under the alternative,

$$n^{-1}h^{-1/2}T_n = \int_{-\pi}^{\pi}\|D_n(\lambda)\|^2\,d\lambda + o_P(1). \tag{38}$$

From this and because $\int_{-\pi}^{\pi}\|D_n(\lambda)\|^2\,d\lambda \to \int_{-\pi}^{\pi}\|D(\lambda)\|^2\,d\lambda$, as $h \to 0$, the desired result follows. $\square$
Proof of Theorem 3. To establish the asymptotic validity of the parametric bootstrap procedure for VARMA models we have basically to imitate the proof of Theorem 1 and to use some basic theory for a parametric bootstrap developed by Kreiss and Franke (1992) and Paparoditis (1996). We, therefore, stress in the following only the essentials. Note that in contrast to Theorem 1, we have here $\varepsilon_t^* \sim N(0, \hat{\Sigma})$. By the $\sqrt{n}$-consistency of the estimator $\hat{\Theta}^*$ we can show along the same lines as in Lemma 1 that $|T_n^* - \tilde{T}_n^*| \to 0$ in probability, where $\tilde{T}_n^* = nh^{1/2}\int_{-\pi}^{\pi}\|Q_n^*(\lambda)\|^2\,d\lambda$, $Q_n^*(\lambda) = n^{-1}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)(U_n^*(\lambda_j) - I_m)$ and $U_n^*(\lambda) = 2\pi\hat{\Sigma}^{-1/2}\hat{B}^{-1}(\lambda)\hat{\Phi}(\lambda)I^*_{n,X}(\lambda)\hat{\Phi}'(\lambda)\hat{B}'(\lambda)^{-1}\hat{\Sigma}^{-1/2}$. Furthermore, we have for the bootstrap periodogram

$$I^*_{n,X}(\lambda_j) = \hat{\Phi}^{-1}(\lambda_j)\hat{B}(\lambda_j)I_{n,\varepsilon^*}(\lambda_j)\hat{B}'(\lambda_j)(\hat{\Phi}'(\lambda_j))^{-1} + R_n^*(\lambda_j),$$

where $I_{n,\varepsilon^*}(\lambda)$ denotes the periodogram matrix of the i.i.d. series $\varepsilon_1^*, \varepsilon_2^*, \ldots, \varepsilon_n^*$ and $E^*(R_n^*(\lambda_j))^2 = O(n^{-1})$ in probability, uniformly in $\lambda_j$. Along the same lines as in the proof of Lemma 2 we get as $n \to \infty$, that $|\tilde{T}_n^* - nh^{1/2}\int_{-\pi}^{\pi}\|\Lambda^*(\lambda)\|^2\,d\lambda| \to 0$, in probability, where

$$\Lambda^*(\lambda) = \frac{1}{n}\sum_{j=-N}^{N} K_h(\lambda-\lambda_j)\left(2\pi\hat{\Sigma}^{-1/2}I_{n,\varepsilon^*}(\lambda_j)\hat{\Sigma}^{-1/2} - I_m\right).$$

Now, let $(v^*_{1,j}, v^*_{2,j}, v^*_{3,j})'$, $j = 1, 2, \ldots, N$ be the bootstrap analogue of $(v_{1,j}, v_{2,j}, v_{3,j})'$, $j = 1, 2, \ldots, N$ defined in the proof of Lemma 3. That is, $v^*_{1,j} := (\mathrm{Re}\{V^*_{n,rr}(\lambda_j)\} - 1;\ r = 1, \ldots, m)$, $v^*_{2,j} := (\sqrt{2}\,\mathrm{Re}\{V^*_{rs}(\lambda_j)\};\ r = 1, \ldots, m$ and $s = r+1, \ldots, m)$ and $v^*_{3,j} := (\sqrt{2}\,\mathrm{Im}\{V^*_{rs}(\lambda_j)\};\ r = 1, \ldots, m$ and $s = r+1, \ldots, m)$, where $\mathrm{Re}\{V^*_{rs}(\lambda_j)\}$ resp. $\mathrm{Im}\{V^*_{rs}(\lambda_j)\}$ denote the real resp. imaginary part of the $(r,s)$th element of the matrix $2\pi\hat{\Sigma}^{-1/2}I_{n,\varepsilon^*}\hat{\Sigma}^{-1/2}$. Gaussianity of the i.i.d. sequence $\{\varepsilon_t^*\}$ implies that $(v^*_{1,j}, v^*_{2,j}, v^*_{3,j})'$, $j = 1, 2, \ldots, N$ forms a sequence of independent random vectors with mean zero and covariance matrix the $m^2 \times m^2$ unit matrix. Furthermore, and because of the same assumption, the components of $(v^*_{1,j}, v^*_{2,j}, v^*_{3,j})'$ are independent from each other (cf. Brillinger, 1981). Asymptotic normality of every one of these terms follows then by Lemma 6.3 of Paparoditis (2000a). For instance, we get applying this lemma that

$$nh^{1/2}\int_{-\pi}^{\pi}\left(\frac{1}{n}\sum_j K_h(\lambda-\lambda_j)\big(\mathrm{Re}\{V^*_{n,rr}(\lambda_j)\} - 1\big)\right)^2 d\lambda - h^{-1/2}\int_{-\pi}^{\pi}K^2(u)\,du \Rightarrow N(0, \sigma^2(K)),$$

in probability, while the same convergence is true for the other terms of $(v^*_{1,j}, v^*_{2,j}, v^*_{3,j})'$. $\square$

ACKNOWLEDGEMENTS
The author is grateful to the referees for their constructive comments.
NOTE
Corresponding author: Efstathios Paparoditis, Department of Mathematics and Statistics, University of Cyprus, P.O. Box 537, CY-1678 Nicosia, Cyprus. Fax: +357-2-339061; E-mail: [email protected]
REFERENCES

Anderson, T. W. A. (1993) Goodness of fit tests for spectral distributions. Annals of Statistics 21, 830–47.
Bartlett, M. S. (1954) Problèmes de l'analyse spectrale des séries temporelles stationnaires. Publication Institute of Statistical University Paris III-3, 119–34.
Bartlett, M. S. (1966) An Introduction to Stochastic Processes with Special Reference to Methods and Applications (2nd edn). Cambridge: Cambridge University Press.
Beltrão, K. I. and Bloomfield, P. (1987) Determining the bandwidth of a kernel spectrum estimate. Journal of Time Series Analysis 8, 21–38.
Brillinger, D. R. (1981) Time Series: Data Analysis and Theory. San Francisco: Holden-Day.
Brockwell, P. J. and Davis, R. A. (1991) Time Series: Theory and Methods. New York: Springer-Verlag.
Chitturi, R. V. (1974) Distribution of residual autocorrelations in multiple autoregressive schemes. Journal of the American Statistical Association 69, 928–34.
Chen, H. and Romano, J. P. (1997) Bootstrap-assisted goodness-of-fit tests in the frequency domain. Journal of Time Series Analysis 20, 619–54.
Dahlhaus, R. (1985) On the asymptotic distribution of Bartlett's Up-statistic. Journal of Time Series Analysis 6, 213–27.
Eubank, R. L. and LaRiccia, V. N. (1992) Asymptotic comparison of Cramér-von Mises and nonparametric function estimation techniques for testing goodness-of-fit. Annals of Statistics 20, 2071–86.
Eubank, R. L., Hart, J. D. and LaRiccia, V. N. (1993) Testing goodness-of-fit via nonparametric function estimation techniques. Communications in Statistics – Theory Methods 22, 3327–54.
Fan, Y. and Li, Q. (2000) Consistent model specification tests. Econometric Theory 16, 1016–41.
Ghosh, B. K. and Huang, W. (1991) The power and optimal kernel of the Bickel-Rosenblatt test for goodness-of-fit. Annals of Statistics 19, 999–1009.
Grenander, U. and Rosenblatt, M. (1952) On spectral analysis of stationary time-series. Proceedings of the National Academy of Sciences U.S.A. 38, 519–21.
Grenander, U. and Rosenblatt, M. (1957) Statistical Analysis of Stationary Time Series. Stockholm: Almqvist and Wiksell.
Hainz, G. and Dahlhaus, R. (2000) Spectral domain bootstrap tests for stationary time series. Preprint.
Härdle, W. and Mammen, E. (1993) Comparing nonparametric versus parametric regression fits. Annals of Statistics 21, 1926–47.
Hannan, E. (1969) The identification of vector mixed autoregressive-moving average systems. Biometrika 56, 223–5.
Hannan, E. (1970) Multiple Time Series. New York: Wiley.
Hannan, E. and Deistler, M. (1988) The Statistical Theory of Linear Systems. New York: Wiley.
Hong, Y. (1996) Consistent testing for serial correlations of unknown form. Econometrica 64, 837–64.
Hosking, J. (1980) The multivariate portmanteau statistics. Journal of the American Statistical Association 75, 602–8.
Hosking, J. (1981) Lagrange multiplier tests of multivariate time series models. Journal of the Royal Statistical Society B 43, 219–30.
Hurvich, C. M. (1985) Data driven choice of a spectrum estimate: extending the applicability of cross-validation methods. Journal of the American Statistical Association 80, 933–40.
Kohn, R. (1979) Asymptotic estimation and hypothesis testing results for vector linear time series models. Econometrica 47, 1005–30.
Kreiss, J. P. and Franke, J. (1992) Bootstrapping stationary autoregressive moving average models. Journal of Time Series Analysis 13, 297–319.
Li, W. K. and McLeod, A. I. (1981) Distribution of residual autocorrelations in multivariate ARMA time series models. Journal of the Royal Statistical Society Series B 43, 231–9.
Lütkepohl, H. (1991) Introduction to Multiple Time Series Analysis. New York: Springer-Verlag.
Makridakis, S. and Wheelwright, S. C. (1978) Forecasting Methods and Applications. Santa Barbara: Wiley.
Newbold, P. (1983) Model checking in time series analysis. In Applied Time Series Analysis of Economic Data (ed. A. Zellner). Washington DC: U.S. Bureau of the Census, pp. 138–43.
Paparoditis, E. (1996) A frequency domain bootstrap-based method for checking the fit of a transfer function model. Journal of the American Statistical Association 91, 1535–50.
Paparoditis, E. (2000a) Spectral density based goodness-of-fit test for time series models. Scandinavian Journal of Statistics 27, 143–76.
Paparoditis, E. (2000b) On some power properties of goodness-of-fit tests for time series models. In Probability and Statistical Models with Applications (eds Ch. A. Charalambides, M. V. Koutras and N. Balakrishnan). London: Chapman & Hall, pp. 333–48.
Prewitt, K. (1993) Goodness-of-fit test in parametric time series models differenced. Journal of Time Series Analysis 19, 549–74.
Priestley, M. B. (1981) Spectral Analysis and Time Series, Vols 1 and 2. New York: Academic Press.
Poskitt, D. S. and Tremayne, A. R. (1982) Diagnostic tests for multiple time series models. Annals of Statistics 10, 114–20.
Reinsel, G. C. (1993) Elements of Multivariate Time Series Analysis. New York: Springer-Verlag.
Robinson, P. (1991) Automatic frequency domain inference on semiparametric and nonparametric models. Econometrica 59, 1329–63.
Rosenblatt, M. (1975) A quadratic measure of deviation of two-dimensional density estimates and a test of independence. Annals of Statistics 3, 1–14.