Econometric Reviews, 28(1–3):246–261, 2009 Copyright © Taylor & Francis Group, LLC ISSN: 0747-4938 print/1532-4168 online DOI: 10.1080/07474930802388066
A ROBUST ENTROPY-BASED TEST OF ASYMMETRY FOR DISCRETE AND CONTINUOUS PROCESSES
Downloaded By: [Canadian Research Knowledge Network] At: 16:37 5 January 2009
Esfandiar Maasoumi¹ and Jeffrey S. Racine²
¹ Department of Economics, Emory University, Atlanta, Georgia, USA and School of Business and Economics, Swansea, UK
² Department of Economics, McMaster University, Hamilton, Ontario, Canada
We consider a metric entropy capable of detecting deviations from symmetry that is suitable for both discrete and continuous processes. A test statistic is constructed from an integrated normed difference between nonparametric estimates of two density functions. The null distribution (symmetry) is obtained by resampling from an artificially lengthened series constructed from a rotation of the original series about its mean (median, mode). Simulations demonstrate that the test has correct size and good power in the direction of interesting alternatives, while applications to updated Nelson and Plosser (1982) data demonstrate its potential power gains relative to existing tests.
Keywords: Entropy; Kernel; Metric; Nonparametric; Symmetry; Time series.
JEL Classification: C1; C12; C14.
Address correspondence to Jeffrey S. Racine, Department of Economics, McMaster University, Hamilton, ON L8S 4M4, Canada; E-mail: [email protected]

1. OVERVIEW

Testing for asymmetric behavior present in a series or in conditional predictions has a rich history dating to the pioneering work of Crum (1923), Mitchell (1927), and Keynes (1936), who examined potential asymmetries present in a number of macroeconomic series. Interest has intensified and continues through the present day, and many macroeconomic series have been analysed and tested for asymmetric behavior in expansions and downturns. The recent works of Timmermann and Perez-Quiros (2001), Bai and Ng (2001), Premaratne and Bera (2005), and Belaire-Franch and Peiro (2003) are but a few such examples. There is also
much recent interest in asymmetric behavior of prices as markets form and evolve toward greater competition. Some researchers subdivide asymmetry into categories such as sharpness, steepness, and deepness (see McQueen and Thorley, 1993). Though such categorizations of asymmetry may be of interest in their own right, our focus will rest upon consistent tests of asymmetries of any sort.

Existing tests are somewhat limited in application as they are designed for continuous processes. However, in applied settings one may encounter discrete processes, particularly in finance where price movements (i.e., differences) are often characterized as i) no change, ii) positive, or iii) negative. In this article our goal is to provide a test that is robust to the underlying datatype, i.e., whether the data are categorical (namely, discrete) or continuous in nature. We adapt the entropy measure in Granger et al. (2004), which is a normalization of the Bhattacharya–Matusita–Hellinger measure of “distance” between distributions, to test the direct and full hypothesis of symmetry. This is in the same spirit as the test proposed by Fan and Gencay (1995), but in contrast with many of the existing tests of symmetry that examine moments and other implications of symmetry, mostly in terms of odd order functions, as in Premaratne and Bera (2005) and Bai and Ng (2005). The latter report on well-known difficulties with estimating higher order moments, such as kurtosis, as well as the greater power of tests which are based simultaneously on several odd order moments, notwithstanding estimation and bias problems with estimating these moments. It seems that methods which are based on distributions, such as ours, take into account all the information that may be forthcoming from many moments without assuming the existence of such moments, and with little if any additional difficulty in estimation in similar settings.
We are also less likely to suffer from inadvertent introduction of special relations between moments that only hold for special distribution types, especially Gaussian and/or linear processes under composite null hypotheses. We implement our test with appropriate resampling techniques and based on recent kernel density estimation adapted for both continuous (see, e.g., Silverman, 1986, Chapter 3) and discrete (see, e.g., Ouyang et al., 2006) processes, respectively. We find that our test is correctly sized and has superb power, generally, and especially compared with many of the existing tests. These results are perhaps surprisingly robust to liberal degrees of asymmetry, kurtosis, and dependence in the processes, as well as for moderate to large sample sizes commonly examined in this area where large samples are needed for estimation of high order moments, assuming they exist. Existence of accessible software for our test puts in question the common invocation of “ease of use” for moment based tests which can mislead very badly in some realistic circumstances for financial and macroeconomic series.
The article is organized as follows. First symmetry is discussed in general, and our test statistic is introduced for both continuous and discrete processes. Finite sample performance of a bootstrap implementation is examined next, again for both simulated continuous and discrete processes. Lastly, we re-examine well known empirical examples based on the Extended Nelson–Plosser US macroeconomic series, and we also consider an application involving discrete stock tick data.
2. UNCONDITIONAL AND CONDITIONAL SYMMETRY

Consider a (strictly) stationary series $\{Y_t\}_{t=1}^{T}$. Let $\mu_y$ denote a measure of central tendency, say $\mu_y = E[Y_t]$, let $f(y)$ denote the density function of the random variable $Y_t$, let $\tilde{Y}_t = -Y_t + 2\mu_y$ denote a rotation of $Y_t$ about its mean, and let $f(\tilde{y})$ denote the density function of the random variable $\tilde{Y}_t$. Note that if $\mu_y = 0$ then $\tilde{Y}_t = -Y_t$, though in general this will not be so. We say a series is symmetric about the mean (median, mode) if $f(y) \equiv f(\tilde{y})$ almost surely. Tests for asymmetry about the mean therefore naturally involve testing the following null:

$$H_0: f(y) = f(\tilde{y}) \quad \text{almost everywhere (a.e.)} \tag{1}$$

against the alternative:

$$H_1: f(y) \neq f(\tilde{y}) \quad \text{on a set with positive measure.} \tag{2}$$
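The rotation used to define the null can be illustrated with a minimal Python sketch. The function name `rotate` and the toy data are illustrative assumptions, not part of the authors' R code. The sketch reflects a series about its mean or median, and shows that reflecting integer-valued data about the median keeps the reflected values on the original points of support while reflecting about the mean generally does not.

```python
import numpy as np

def rotate(y, center="mean"):
    """Reflect a series about a measure of central tendency: y_tilde = 2*c - y."""
    y = np.asarray(y, dtype=float)
    if center == "mean":
        c = y.mean()
    elif center == "median":
        c = np.median(y)
    else:
        raise ValueError("center must be 'mean' or 'median'")
    return 2.0 * c - y

# Hypothetical discrete sample on the support {0, 1, 2} with median 1.
y = np.array([0, 1, 1, 1, 1, 2, 2])
print(rotate(y, "median"))  # values stay on the support {0, 1, 2}
print(rotate(y, "mean"))    # values fall off the integer support
```

Whichever center is chosen, the reflected series has the same center as the original, which is what makes the comparison of $f(y)$ and $f(\tilde{y})$ meaningful.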
Though the mean has received particular attention when testing for deviations from symmetry, one could persuasively argue that deviations about the mode or median might seem a more natural characterization. One could of course rotate a distribution around any of these measures of central tendency, and for what follows one would simply replace the mean with the appropriate statistic. Note that, for discrete processes, we must by necessity reflect about the median or mode to retain the original points of support (this feature would, in general, be lost were one instead to reflect a discrete process about its mean).

Tests for the presence of conditional asymmetry can be based upon standardized residuals obtained from a regression model (see Belaire-Franch and Peiro, 2003). Let

$$Y_t = h(\Omega_t, \beta) + \sigma(\Omega_t, \lambda)\,\varepsilon_t, \tag{3}$$

denote a general location-scale model for this process, where $\Omega_t$ is a conditioning information set, $\sigma(\Omega_t, \lambda)$ the conditional standard deviation of $Y_t$, and $\varepsilon_t$ is a zero mean unit variance error process independent of the elements of $\Omega_t$. If $E[\varepsilon \mid \Omega_t] = 0$ and $e = \varepsilon/\sigma$ is suitably standardized, then tests for conditional asymmetry involve the following null:

$$H_0: f(e) = f(-e) \quad \text{almost everywhere} \tag{4}$$

against the alternative:

$$H_1: f(e) \neq f(-e) \quad \text{on a set with positive measure.} \tag{5}$$
Related work includes Bai and Ng (2001), who construct tests based on the empirical distribution of $e_t$ and that of $-e_t$. Belaire-Franch and Peiro (2003) apply their test and other tests to the Nelson and Plosser (1982) data updated to include 1988. We shall make use of the updated data in Section 5.1 below.

3. AN ENTROPY-BASED TEST OF ASYMMETRY

Granger et al. (2004) considered a normalization of the Bhattacharya–Matusita–Hellinger measure of dependence given by

$$S_\rho = \frac{1}{2}\int_{-\infty}^{\infty}\left(f_1^{1/2} - f_2^{1/2}\right)^2 dy \tag{6}$$

where $f_1 = f(y)$ is the marginal density of the random variable $Y$ and $f_2 = f(\tilde{y})$ that of $\tilde{Y}$, $\tilde{Y}$ being a rotation of $Y$ about its mean.

Note that (6) presumes that the underlying datatype is continuous. However, without loss of generality, for a discrete process $Y$ having support $\mathcal{S} = \{0, 1, 2, \ldots, c-1\}$ for $c = 3, 5, \ldots$, this becomes

$$S_\rho = \frac{1}{2}\sum_{y \in \mathcal{S}}\left(p_1^{1/2} - p_2^{1/2}\right)^2 \tag{7}$$

where $p_1 = p(y)$ is the marginal probability of the random variable $Y$ and $p_2 = p(\tilde{y})$ that of $\tilde{Y}$, $\tilde{Y}$ being a rotation of $Y$ about its median.

We consider a kernel-based implementation of equations (6) and (7), denoted $\hat{S}_\rho$, for the purposes of testing the null of symmetry. When $Y$ is continuous we use standard Parzen kernel estimators, while when $Y$ is discrete we use the estimator of Ouyang et al. (2006). Rather than adopt asymptotic-based testing procedures, we elect to use a bootstrap resampling approach (see Efron, 1982; Hall, 1992 for further details on bootstrap resampling procedures). We do so mainly because critical values obtained from the asymptotic null distribution do not depend on the bandwidth (due in part to the fact that the bandwidth is a quantity which vanishes asymptotically), while the value of the test statistic depends directly on the bandwidth. This is a serious drawback
in practice, since the outcome of such asymptotic-based tests tends to be quite sensitive to the choice of bandwidth. This has been noted by a number of authors including Robinson (1991), who noted that “substantial variability in the [test statistic] across bandwidths was recorded,” which would be most troubling in applied situations due, in part, to numerous competing approaches for data-driven bandwidth choice (see Jones et al., 1996, for an excellent survey article on bandwidth selection for kernel density estimates). Granger et al. (2004) contains further discussion and references on the relative merits of asymptotic inferences based on our statistic, and gives an outline of its consistency and asymptotic distribution which will carry over in the present setting under similar assumptions including, particularly, the stationarity of the underlying processes.

Consider the sample of size $2T$ given by $Z = \{Y_1, \ldots, Y_T, \tilde{Y}_1, \ldots, \tilde{Y}_T\}$. We may construct the empirical distribution of $\hat{S}_\rho$ under the null of symmetry by noting that bootstrap samples drawn from $Z$, which we denote by $Z^*$, will be symmetric almost surely. We recompute $\hat{S}_\rho$ for each of the $B$ resamples drawn from $Z$, which we then order from smallest to largest and denote by $\hat{S}^*_{\rho,1}, \hat{S}^*_{\rho,2}, \ldots, \hat{S}^*_{\rho,B}$, where $B$ is the number of bootstrap replications, say, $B = 399$ (see Davidson and MacKinnon, 2000, for further details of the appropriate number of bootstrap replications). Given the set of $B$ statistics computed under the null, we may then compute percentiles from the $B$ (ordered) bootstrap statistics and use these as the basis for a test of asymmetry. For instance, to conduct the test at the 5% level, we reject $H_0$ if $\hat{S}_\rho > \hat{S}^*_{\rho,380}$, where $\hat{S}^*_{\rho,380}$ is the 95th percentile of the ordered bootstrap statistics that were generated under the null. Alternatively, we can compute an empirical P-value via the proportion of the ordered bootstrap statistics that exceed the actual statistic.
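The resampling procedure just described can be sketched in Python for the discrete case. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses raw relative frequencies in place of the kernel estimator of Ouyang et al. (2006), draws i.i.d. resamples of size $T$ from the pooled sample $Z$, and reflects each resample about the original sample median. The function names `s_rho_discrete` and `symmetry_test` are hypothetical.

```python
import numpy as np

def s_rho_discrete(y, y_tilde, support):
    """Sample analogue of (7) using raw relative frequencies (assumption: no kernel smoothing)."""
    p1 = np.array([np.mean(y == s) for s in support])
    p2 = np.array([np.mean(y_tilde == s) for s in support])
    return 0.5 * np.sum((np.sqrt(p1) - np.sqrt(p2)) ** 2)

def symmetry_test(y, support, B=399, seed=0):
    """Bootstrap test of symmetry about the median for a discrete i.i.d. sample."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    c = np.median(y)                          # center of rotation
    y_tilde = 2.0 * c - y                     # rotated series
    stat = s_rho_discrete(y, y_tilde, support)
    z = np.concatenate([y, y_tilde])          # pooled sample of size 2T, symmetric about c
    boot = np.empty(B)
    for b in range(B):
        zb = rng.choice(z, size=y.size, replace=True)   # assumption: resamples of size T
        boot[b] = s_rho_discrete(zb, 2.0 * c - zb, support)
    boot.sort()
    k = int(round(0.95 * (B + 1)))            # 380 when B = 399
    crit_95 = boot[k - 1]                     # k-th ordered bootstrap statistic
    p_value = np.mean(boot > stat)            # proportion of bootstrap statistics above the actual one
    return stat, crit_95, p_value
```

For dependent data the `rng.choice` step would be replaced by a block or stationary resampling scheme, and for continuous data the frequency estimates would be replaced by kernel density estimates integrated numerically as in (6).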
A few words are in order regarding which of the many existing bootstrap resampling procedures are appropriate for the proposed test. Bootstrap resampling schemes fall into one of three classes, i) i.i.d. resampling (see, e.g., Efron, 1982), ii) i.n.i.d. resampling (see, e.g., Liu, 1988), and iii) stationary resampling (see, e.g., Künsch, 1989; Politis and Romano, 1994). In practice, one needs to implement a resampling scheme that mimics the manner in which the sample in hand was drawn from its respective population. Both ii) and iii) require the user to set additional ‘tuning’ parameters (see, e.g., Künsch, 1989; Politis and White, 2004). One can of course test for symmetry in a variety of settings. For example, if interest lies in the distribution of bids at, say, a sealed auction, then i.i.d. resampling is naturally appropriate. However, for heterogeneous but otherwise independent bids one might use the so-called wild-bootstrap (Liu, 1988), while for sequential (stationary) bids that may be correlated, one might use the so-called stationary (Politis and Romano, 1994) or block (Künsch, 1989) bootstrap procedures. Of course, this places an additional burden on the practitioner, namely, that she must worry about
appropriate values of incidental tuning parameters. Fortunately, a variety of such bootstrap resampling schemes are readily available in a number of popular software packages. Such flexibility allows practitioners to verify whether their results are robust to the underlying resampling scheme or not, and we advocate this approach in practice.
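As an illustration of the third class of schemes, the following Python sketch generates the index sequence for one stationary-bootstrap resample in the spirit of Politis and Romano (1994): block lengths are geometric with a user-chosen mean and blocks wrap around the end of the series. The function name and the `mean_block` argument are illustrative assumptions; this is a sketch, not a reference implementation.

```python
import numpy as np

def stationary_bootstrap_indices(n, mean_block, rng):
    """Indices for one stationary-bootstrap resample of a series of length n."""
    p = 1.0 / mean_block                     # probability of starting a new block
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)                 # random starting position
    for t in range(1, n):
        if rng.random() < p:
            idx[t] = rng.integers(n)         # start a new block at a random position
        else:
            idx[t] = (idx[t - 1] + 1) % n    # continue the current block, wrapping around
    return idx

rng = np.random.default_rng(42)
y = rng.standard_normal(200)                 # hypothetical series
y_star = y[stationary_bootstrap_indices(y.size, mean_block=10.0, rng=rng)]
```

Setting `mean_block=1.0` reduces the scheme to i.i.d. resampling, which gives a simple way to check whether conclusions are sensitive to the resampling scheme.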
4. FINITE-SAMPLE PERFORMANCE

We now consider the finite-sample performance of the kernel-based implementation of the test. For what follows, we conduct 1,000 Monte Carlo draws from each data generating process (DGP), and set the number of bootstrap replications underlying the test to $B = 399$. The bandwidth is selected via likelihood cross-validation (Silverman, 1986, p. 52), which produces density estimators that are “optimal” according to the Kullback–Leibler criterion. Should one wish to use one of the many alternative methods of bandwidth selection (see, e.g., Jones et al., 1996), one may do so at this stage without loss of generality.

4.1. Continuous Data Monte Carlo Simulation

We first consider the finite-sample behavior of the proposed test using the metric (6) for continuous $Y$. We shall consider both i.i.d. data ($Y_i = \varepsilon_i$) and stationary dependent data ($Y_t = 0.5Y_{t-1} + \varepsilon_t$). For i.i.d. data we use a standard bootstrap procedure that resamples with replacement from the empirical distribution of the data, while for the stationary dependent data we consider both the block bootstrap of Künsch (1989) and the stationary bootstrap of Politis and Romano (1994) using the recommended¹ block size equal to $3.15T^{1/3}$. Let the sample size assume values $n = 50, 100, 200, 400$, and consider a range of DGPs: $N(120, 240)$ (symmetric), $t(2)$ (symmetric fat-tailed), $\chi^2(80)$ (asymmetric), $\chi^2(40)$ (asymmetric), $\chi^2(20)$ (asymmetric), $\chi^2(10)$ (asymmetric), $\chi^2(5)$ (asymmetric), and $\chi^2(1)$ (asymmetric). Figure 1 plots each asymmetric DGP to allow the reader to get a sense of the range of asymmetries considered, contrasting the symmetric $N(\mu, \sigma^2)$. Tables 1–3 summarize the finite-sample performance of the proposed test conducted at nominal levels of $\alpha = 0.10, 0.05, 0.01$.

¹ For consistency, the (mean) block length should be proportional to $T^{1/3}$ (see Politis and Romano, 1994; Politis and White, 2004). The constant 3.15 was provided by Politis and Romano (1994), who considered a Gaussian AR(1) process by way of example. The procedure suggested by Politis and White (2004) for selecting the block length is not applicable in this setting as it relies on visual inspection of the correlogram, which is infeasible for each replication in a Monte Carlo setting. Experimentation indicates that a slightly smaller constant of proportionality will produce a test that is correctly sized. Optimal selection of block length deserves further study but clearly lies beyond the scope of the current article.
FIGURE 1 Simulated continuous distributions. The distributions are, from right to left, $N(120, 240)$, $\chi^2(80)$, $\chi^2(40)$, $\chi^2(20)$, $\chi^2(10)$, $\chi^2(5)$, and $\chi^2(1)$.
We observe from Tables 1–3 that the test has correct size, though we note that the test is somewhat conservative (i.e., slightly undersized) for small sample sizes when using the block and stationary bootstrap for the autoregressive DGP. As might be expected, larger memory and persistence in the underlying processes requires larger sample sizes (200+) to reveal consistency of the tests. In general, however, there is considerable power,
TABLE 1 Empirical rejection frequencies at levels $\alpha = 0.10, 0.05, 0.01$. The degree of asymmetry increases as we go from columns 2 and 3 (symmetric) to columns 4–9. Columns 2 and 3 reflect the empirical size of the test, columns 4–9 empirical power. $Y_i = \varepsilon_i$, i.i.d. bootstrap

n      $N(\mu,\sigma^2)$   $t(2)$   $\chi^2(80)$   $\chi^2(40)$   $\chi^2(20)$   $\chi^2(10)$   $\chi^2(5)$   $\chi^2(1)$

$\alpha = 0.10$
50     0.105   0.100   0.157   0.259   0.411   0.598   0.835   0.958
100    0.090   0.098   0.284   0.445   0.700   0.912   0.990   0.995
200    0.098   0.123   0.478   0.741   0.945   0.997   1.000   1.000
400    0.101   0.108   0.726   0.946   0.998   1.000   1.000   1.000

$\alpha = 0.05$
50     0.048   0.041   0.096   0.136   0.270   0.426   0.690   0.886
100    0.055   0.041   0.169   0.316   0.554   0.817   0.967   0.984
200    0.039   0.050   0.337   0.611   0.897   0.988   1.000   1.000
400    0.047   0.046   0.614   0.916   0.993   1.000   1.000   1.000

$\alpha = 0.01$
50     0.008   0.007   0.028   0.029   0.087   0.175   0.381   0.605
100    0.014   0.006   0.053   0.120   0.276   0.542   0.833   0.933
200    0.008   0.007   0.137   0.350   0.696   0.946   0.997   0.999
400    0.012   0.004   0.373   0.739   0.979   1.000   1.000   1.000
TABLE 2 Empirical rejection frequencies at levels $\alpha = 0.10, 0.05, 0.01$. The degree of asymmetry increases as we go from columns 2 and 3 (symmetric) to columns 4–9. Columns 2 and 3 reflect the empirical size of the test, columns 4–9 empirical power. $Y_t = 0.5Y_{t-1} + \varepsilon_t$, block bootstrap

n      $N(\mu,\sigma^2)$   $t(2)$   $\chi^2(80)$   $\chi^2(40)$   $\chi^2(20)$   $\chi^2(10)$   $\chi^2(5)$   $\chi^2(1)$

$\alpha = 0.10$
50     0.074   0.102   0.013   0.028   0.050   0.159   0.337   0.767
100    0.072   0.092   0.011   0.040   0.105   0.377   0.718   0.984
200    0.086   0.110   0.047   0.118   0.362   0.756   0.969   1.000
400    0.106   0.097   0.211   0.350   0.748   0.975   1.000   1.000

$\alpha = 0.05$
50     0.022   0.030   0.002   0.005   0.019   0.060   0.133   0.472
100    0.022   0.037   0.000   0.011   0.042   0.193   0.480   0.937
200    0.033   0.038   0.017   0.051   0.201   0.575   0.911   1.000
400    0.043   0.032   0.096   0.225   0.611   0.951   1.000   1.000

$\alpha = 0.01$
50     0.001   0.002   0.000   0.001   0.004   0.003   0.007   0.054
100    0.002   0.005   0.000   0.001   0.002   0.016   0.088   0.499
200    0.002   0.002   0.001   0.005   0.029   0.172   0.566   0.985
400    0.003   0.003   0.022   0.063   0.275   0.754   0.988   1.000
TABLE 3 Empirical rejection frequencies at levels $\alpha = 0.10, 0.05, 0.01$. The degree of asymmetry increases as we go from columns 2 and 3 (symmetric) to columns 4–9. Columns 2 and 3 reflect the empirical size of the test, columns 4–9 empirical power. $Y_t = 0.5Y_{t-1} + \varepsilon_t$, stationary bootstrap

n      $N(\mu,\sigma^2)$   $t(2)$   $\chi^2(80)$   $\chi^2(40)$   $\chi^2(20)$   $\chi^2(10)$   $\chi^2(5)$   $\chi^2(1)$

$\alpha = 0.10$
50     0.076   0.083   0.027   0.035   0.071   0.114   0.273   0.760
100    0.067   0.068   0.013   0.023   0.091   0.290   0.623   0.975
200    0.074   0.083   0.033   0.070   0.281   0.661   0.959   1.000
400    0.083   0.091   0.161   0.329   0.697   0.975   1.000   1.000

$\alpha = 0.05$
50     0.023   0.024   0.008   0.009   0.025   0.040   0.101   0.419
100    0.014   0.014   0.004   0.004   0.030   0.118   0.313   0.859
200    0.020   0.024   0.009   0.024   0.135   0.432   0.846   1.000
400    0.024   0.035   0.072   0.170   0.495   0.911   1.000   1.000

$\alpha = 0.01$
50     0.000   0.002   0.000   0.000   0.000   0.001   0.003   0.048
100    0.000   0.000   0.001   0.000   0.000   0.010   0.019   0.214
200    0.001   0.001   0.001   0.003   0.014   0.066   0.276   0.886
400    0.001   0.004   0.007   0.028   0.124   0.528   0.921   1.000

especially in cases of appreciable asymmetry. The block bootstrap method does slightly better than the stationary bootstrap in our experiments. Columns 2 and 3 of Tables 1–3 present results for the symmetric $N(\mu, \sigma^2)$ and $t(2)$ distributions for levels $\alpha = 0.10$, $0.05$, and $0.01$. As the degree of asymmetry increases (moving from columns 4 through 9), we observe
that power increases notably, and it generally increases with the sample size. In contrast, state-of-the-art moment-based tests, such as the Bai–Ng test (see Section 4.3 below), do not perform as well. The latter's power, for instance, improves for only some underlying distributions, with the largest sample sizes, and when tests are based on several high order moments simultaneously and/or when the test exploits additional information about the one-sidedness of the asymmetry hypothesis that may be justified in some applications. Since the metric entropy test is rarely bettered in these comparisons, we see little reason to qualify its adoption. Our code is written in R and is freely available.

4.2. Discrete Data Monte Carlo Simulation

We now examine the finite-sample behavior of (7), and consider by way of example two trials from a Bernoulli process, hence $Y \in \{0, 1, 2\}$. When $\Pr(Y = 1) = 0.5$ the process has a symmetric marginal probability function, while when $\Pr(Y = 1) \neq 0.5$ the marginal probability function is asymmetric. Figure 2 plots the probability functions for a range of values for $\Pr(Y = 1)$. For the simulations that follow, we vary $\Pr(Y = 1)$ from 0.35 through 0.65 in increments of 0.01. We draw 1,000 Monte Carlo replications for
FIGURE 2 Simulated discrete distributions. The distributions are, from top left, top right, bottom left, and bottom right, those for two draws from a Bernoulli trial with $\Pr(Y = 1) = 0.50$, $0.45$, $0.40$, and $0.35$, respectively.

FIGURE 3 Power curves for (7) for $n = 100$ when $\Pr(Y = 1) \in [0.35, 0.65]$. The dotted lines are the nominal levels of the test ($\alpha = 0.01, 0.05, 0.10$).
each DGP and apply the proposed test. Empirical rejection frequencies are summarized in the form of power curves which we report in Figure 3. It can be seen from Figure 3 that the test is correctly sized and has power increasing appreciably with the degree of asymmetry. This is to be expected since entropies are functions of many moments, when these moments exist. We would like to remind the reader that the proposed test is quite versatile, and affords practitioners a simple test that can be used regardless of the underlying datatype.

4.3. Comparison with Existing Tests

Bai and Ng (2005) consider three moment-based test statistics for testing symmetry for continuous processes. We briefly compare our proposed test with their tests. As we have already demonstrated that our test is correctly sized, we restrict attention to power comparisons. We consider samples of sizes $n = 100$ and $200$. Briefly, $\hat{\pi}_3^{*}$ and $\hat{\pi}_3^{**}$ are Bai and Ng's (2005) one- and two-sided tests for symmetry (skewness), while $\hat{\mu}_{35}$ is a joint test of the third and fifth central moments. Bai and Ng (2005) consider a range of distributions. We replicate their simulations which correspond to DGPs A1 (lognormal), A2 (exponential), A3 ($\chi^2_2$), and A4 (generalized lambda with $\lambda_1 = 0$, $\lambda_2 = 1$, $\lambda_3 = 1.4$, and $\lambda_4 = 0.25$) in Bai and Ng (2005, Table 1). All tests are conducted at the 5% level. Results are reported in Table 4. Bai and Ng (2005, p. 54) write “the $\hat{\mu}_{35}$ test is to be recommended when symmetry is the main concern.” It can be seen from Table 4 that the metric entropy test is bettered in only one case (by the one-sided $\hat{\pi}_3^{*}$ test for the generalized lambda distribution with $n = 100$, 0.93 versus 0.90), and that it has substantially greater power in a number of situations.
TABLE 4 Comparison with Bai and Ng (2005). Empirical rejection frequencies, $\alpha = 0.05$ (power)

Distribution             $\hat{\pi}_3^{**}$   $\hat{\pi}_3^{*}$   $\hat{\mu}_{35}$   $\hat{S}_\rho$

$n = 100$
lognormal                0.43   0.63   0.62   0.97
$\chi^2_2$               0.74   0.88   0.96   0.99
exponential              0.75   0.89   0.96   1.00
Generalized $\lambda$    0.85   0.93   0.64   0.90

$n = 200$
lognormal                0.52   0.69   0.81   1.00
$\chi^2_2$               0.90   0.96   1.00   1.00
exponential              0.91   0.97   1.00   1.00
Generalized $\lambda$    1.00   1.00   0.99   1.00
5. APPLICATIONS

5.1. Testing for Asymmetry in U.S. Macroeconomic Time Series

In order to examine the behavior of the test on actual time series, we employed the extended Nelson and Plosser (1982) data on a number of US macroeconomic series. First, we applied our tests to the original series.² Tables 5 and 6 present results for the proposed $S_\rho$ test for unconditional and conditional asymmetry for the updated Nelson and Plosser (1982) data. For the unconditional series we simply rotated each series about its mean, while for the conditional series we employed an AR($P$) process with lag order selected via SIC as in Belaire-Franch and Peiro (2003), where $e_t = Y_t - \hat{\gamma}_0 - \sum_{j=1}^{P}\hat{\gamma}_j Y_{t-j}$. Should the lag order $P = 0$, the test statistics and percentiles for conditional symmetry will be equivalent to those for the unconditional series (ignoring bootstrap resampling error).³ We deploy Politis and Romano's (1994) stationary bootstrap procedure to handle potential dependence in the series.

For the original series, we observe from Table 5 that three series (Velocity, Bond Yields, and S&P 500) are deemed asymmetric by $S_\rho$ at the 10% level. Belaire-Franch and Peiro (2003) detect asymmetry for only two series (Employment Rate and Unemployment Rate). There is a vast literature on whether there is a unit root in these series, and also on modelling asymmetric behavior of US GNP and other macroeconomic series, as exemplified by and discussed in Timmermann and Perez-Quiros (2001).

² In the dataset, all series except the interest rate are logged, and we refer to these variables as the “original series” for what follows.
³ SIC optimal lag orders ranged from 0 through 5. Additionally, we systematically investigated lag orders from 1 through 5, and results were qualitatively unchanged. SIC optimal lag orders were 1, 1, 1, 0, 3, 3, 1, 5, 1, 0, 1, 0, 0, and 0, for each series listed in Table 6, respectively.
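The residual construction used for the conditional test can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the AR model is fit by ordinary least squares with an intercept, every candidate lag order is estimated on a common sample, and SIC is computed as $n \log(\mathrm{RSS}/n) + k \log n$. The function name `ar_residuals_sic` and the simulated series are hypothetical, and the exact estimation details in Belaire-Franch and Peiro (2003) may differ.

```python
import numpy as np

def ar_residuals_sic(y, max_lag=5):
    """Fit AR(p), p = 0..max_lag, by OLS; return the SIC-minimizing order and its residuals."""
    y = np.asarray(y, dtype=float)
    target = y[max_lag:]                              # common estimation sample for all p
    n = target.size
    best = None
    for p in range(max_lag + 1):
        lags = [y[max_lag - j: y.size - j] for j in range(1, p + 1)]
        X = np.column_stack([np.ones(n)] + lags)      # intercept plus p lags
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ coef
        sic = n * np.log(resid @ resid / n) + (p + 1) * np.log(n)
        if best is None or sic < best[0]:
            best = (sic, p, resid)
    return best[1], best[2]

# Hypothetical AR(1) series with coefficient 0.5.
rng = np.random.default_rng(1)
eps = rng.standard_normal(500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.5 * y[t - 1] + eps[t]

p_hat, resid = ar_residuals_sic(y)
e = resid / resid.std()                               # standardized residuals for the test
```

The standardized residuals `e` would then be tested for symmetry about zero, with dependence handled by the chosen resampling scheme.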
TABLE 5 Unconditional symmetry tests and percentiles under the null of symmetry

Series                   $\hat{S}_\rho$   p90     p95     p99     P
Real GNP                 0.030   0.429   0.568   0.763   0.886
Nominal GNP              0.108   0.352   0.441   0.601   0.457
Real p/c GNP             0.062   0.370   0.510   0.718   0.762
Industrial production    0.035   0.340   0.430   0.652   0.923
Employment rate          0.024   0.302   0.385   0.532   0.929
Unemployment rate        0.018   0.046   0.056   0.081   0.400
GNP price deflator       0.165   0.262   0.319   0.514   0.327
CPI                      0.234   0.300   0.353   0.474   0.220
Nominal wages            0.107   0.394   0.497   0.705   0.592
Real wages               0.084   0.568   0.725   0.934   0.820
Money stock              0.043   0.335   0.435   0.620   0.873
Velocity                 0.162   0.134   0.172   0.248   0.057
Bond yields              0.359   0.257   0.302   0.387   0.014
S&P 500                  0.280   0.267   0.334   0.539   0.088
TABLE 6 Conditional symmetry tests and percentiles under the null of symmetry

Series                   $\hat{S}_\rho$   p90     p95     p99     P
Real GNP                 0.015   0.023   0.028   0.037   0.261
Nominal GNP              0.043   0.069   0.082   0.107   0.332
Real p/c GNP             0.014   0.021   0.026   0.036   0.242
Industrial production    0.035   0.388   0.468   0.628   0.914
Employment rate          0.018   0.021   0.025   0.037   0.140
Unemployment rate        0.001   0.011   0.014   0.021   0.849
GNP price deflator       0.005   0.022   0.027   0.042   0.638
CPI                      0.014   0.030   0.037   0.052   0.509
Nominal wages            0.002   0.021   0.030   0.046   0.859
Real wages               0.084   0.547   0.697   0.890   0.785
Money stock              0.006   0.019   0.025   0.039   0.496
Velocity                 0.162   0.133   0.166   0.249   0.054
Bond yields              0.359   0.261   0.298   0.408   0.023
S&P 500                  0.280   0.255   0.349   0.500   0.082

Below we report the unconditional tests for both the “original series” and their first differences, for two reasons. First, asymmetry in a few of the original series, the real point of interest in macroeconomics and finance, appears to be present. Secondly, the issue of “nonstationarity” in some of these series is widely regarded as controversial and unsettled. This complicates the choice of prewhitening and filtering of the series, an issue that is most recently discussed by Dagum and Giannerini (2006). Nevertheless, many scholars believe that one should first difference these series (in logs, except where noted above) to make them stationary, as we have assumed them to be in our theoretical assertions. To examine this issue, Tables 7 and 8 present results for the proposed $S_\rho$ test for
TABLE 7 Unconditional symmetry tests and percentiles under the null of symmetry for the differenced series

Series                   $\hat{S}_\rho$   p90     p95     p99     P
Real GNP                 0.016   0.021   0.028   0.039   0.231
Nominal GNP              0.054   0.074   0.086   0.123   0.229
Real p/c GNP             0.014   0.019   0.025   0.036   0.228
Industrial production    0.039   0.046   0.060   0.090   0.158
Employment rate          0.019   0.021   0.028   0.041   0.118
Unemployment rate        0.020   0.024   0.028   0.037   0.146
GNP price deflator       0.003   0.018   0.023   0.036   0.746
CPI                      0.025   0.038   0.047   0.071   0.229
Nominal wages            0.004   0.019   0.027   0.044   0.662
Real wages               0.001   0.008   0.010   0.016   0.841
Money stock              0.007   0.020   0.026   0.046   0.491
Velocity                 0.018   0.020   0.025   0.039   0.119
Bond yields              0.089   0.097   0.114   0.143   0.136
S&P 500                  0.012   0.022   0.026   0.034   0.260
TABLE 8 Conditional symmetry tests and percentiles under the null of symmetry for the differenced series

Series                   $\hat{S}_\rho$   p90     p95     p99     P
Real GNP                 0.005   0.018   0.024   0.034   0.437
Nominal GNP              0.023   0.032   0.036   0.050   0.194
Real p/c GNP             0.004   0.015   0.020   0.030   0.473
Industrial production    0.039   0.050   0.063   0.079   0.181
Employment rate          0.006   0.018   0.023   0.037   0.536
Unemployment rate        0.008   0.013   0.016   0.024   0.298
GNP price deflator       0.009   0.019   0.023   0.032   0.480
CPI                      0.009   0.018   0.020   0.030   0.428
Nominal wages            0.014   0.024   0.029   0.039   0.355
Real wages               0.001   0.007   0.009   0.013   0.849
Money stock              0.008   0.018   0.021   0.029   0.454
Velocity                 0.018   0.020   0.026   0.036   0.125
Bond yields              0.089   0.095   0.110   0.136   0.127
S&P 500                  0.012   0.020   0.025   0.032   0.248

unconditional and conditional asymmetry using the first difference (of the logs) of the updated Nelson and Plosser (1982) data. Again, potential dependence in each transformed series is accommodated by applying the stationary bootstrap of Politis and Romano (1994) when constructing the null distribution of the test. For the “unconditional” differenced series we again simply rotate the series about its mean. Indeed, we now find little evidence of asymmetry in the differenced log of these macroeconomic series. As might be expected from this finding, further differencing by the AR($p$) model of these already first differenced series produces symmetric
residuals. Comparing these results with those of “pseudo differencing” of an AR($p$) process in Table 6, simple differencing is somewhat more effective in symmetrizing. Obviously, a wide range of models other than AR($p$) have been fit to these series over the last two decades. We did not pursue whether (standardized) residuals from such models would be symmetric. This would be a challenging and large project that would reflect more on the adequacy of other model specifications than on the “symmetry” of the macroeconomic series, the main interest of this section. In interpreting and comparing our findings with the existing folklore it is worthwhile to remember that our hypothesis encompasses all the implications of “symmetry” for the whole distribution, and thus our test has the power to reject “symmetry” even if certain implications of symmetry, such as skewness, may not be rejected by other tests.

5.2. Testing for Asymmetry in Daily Stock Movements

Often interest lies in a particular aspect of price movements of financial assets, namely the so-called price ‘tick’. The expression ‘tick’ denotes a change in the price of an asset from one trade to the next. Should the later trade be made at a higher price than the earlier trade, we call this an ‘uptick’ trade since the price went up. If, on the other hand, the later trade is made at a lower price than the previous trade, that trade is known as a ‘downtick’ trade because the price went down. This information is given by $\mathrm{sign}(\Delta p_t)$, where ‘sign’ returns the sign of the price change from time $t-1$ to time $t$, expressed as $\mathrm{sign}(\Delta p_t) = \mathrm{sign}(p_t - p_{t-1})$, which assumes values $1$, $0$, or $-1$ if $\Delta p_t$ is positive, zero, or negative, respectively. Price tick information is used to regulate financial markets. For instance, ‘shorting’ a stock can only be executed on an uptick (i.e., before you begin shorting a stock the last trade must be an uptick).
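The tick construction can be sketched in a few lines of Python. The price values below are hypothetical and serve only to show how $\mathrm{sign}(p_t - p_{t-1})$ and the relative frequencies of the three outcomes might be computed.

```python
import numpy as np

# Hypothetical sequence of trade prices (not actual market data).
prices = np.array([10.00, 10.25, 10.25, 10.00, 10.50, 10.75])

# sign(p_t - p_{t-1}): +1 for an uptick, 0 for no change, -1 for a downtick.
ticks = np.sign(np.diff(prices)).astype(int)

# Relative frequency of each tick value.
values, counts = np.unique(ticks, return_counts=True)
rel_freq = dict(zip(values.tolist(), (counts / ticks.size).tolist()))
print(ticks, rel_freq)
```

The resulting series takes values in $\{-1, 0, 1\}$, so a test of symmetry about the median (here zero) compares the probabilities of upticks and downticks.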
Price ticks also form the basis for widely watched market indicators. For instance, the so-called ‘tick indicator’ is a market indicator that measures how many stocks are moving up or down in price, and this tick indicator is computed based on the last trade in each stock. The Wall Street Journal publishes a daily short tick indicator table. Potential asymmetry in tick behaviour is of direct interest to traders. For what follows we consider one stock, namely, daily returns for General Electric from January 3, 1969 through December 31, 1998. Data was obtained from the Center for Research in Security Prices at the Graduate School of Business in the University of Chicago. Figure 4 plots the histogram of price ticks during this period. For this data, the relative frequencies of −1, 0, and 1 are 0.449, 0.061, and 0.491, respectively, as can be observed from Figure 4. We apply the proposed test for asymmetry to this discrete process using 399 bootstrap
FIGURE 4 Price ticks for General Electric stock, 1969-1-03 to 1998-12-31.
replications, and obtain a P-value of 0.018, indicating that we would reject the null hypothesis of symmetry of ticks for this stock during this period at the 5% level. This simple application highlights the fact that our proposed test is versatile and is equally adept at detecting asymmetry present in both continuous and discrete processes.

6. CONCLUSION

We examined a simple robust entropy-based test for asymmetry along with a resampling method for obtaining its null distribution. The test admits both discrete and continuous processes. Finite-sample performance is examined, and the test is correctly sized, possessing power that increases with both the sample size and the degree of departure from the null. An application to the updated Nelson and Plosser (1982) data indicates power gains relative to tests deployed in Belaire-Franch and Peiro (2003).

ACKNOWLEDGMENTS

The authors would like to thank Cees Diks, Estela Bee Dagum, and anonymous referees for their valuable comments and suggestions. Racine would like to gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada (NSERC: www.nserc.ca), the Social Sciences and Humanities Research Council of Canada (SSHRC: www.sshrc.ca), and the Shared Hierarchical Academic Research Computing Network (SHARCNET: www.sharcnet.ca).
REFERENCES

Bai, J., Ng, S. (2001). A consistent test for conditional symmetry in time series models. Journal of Econometrics 103:225–258.
Bai, J., Ng, S. (2005). Test for skewness, kurtosis and normality for time series data. Journal of Business and Economic Statistics 23:49–60.
Belaire-Franch, J., Peiro, A. (2003). Conditional and unconditional asymmetry in U.S. macroeconomic time series. Studies in Nonlinear Dynamics and Econometrics 7:1–17.
Crum, W. L. (1923). Cycles of rates on commercial paper. Review of Economics and Statistics 5:17–29.
Dagum, E., Giannerini, S. (2006). A critical investigation on detrending procedures for nonlinear processes. Journal of Macroeconomics 28:175–191.
Davidson, R., MacKinnon, J. G. (2000). Bootstrap tests: How many bootstraps? Econometric Reviews 19:55–68.
Efron, B. (1982). The Jackknife, the Bootstrap, and Other Resampling Plans. Philadelphia, Pennsylvania: Society for Industrial and Applied Mathematics.
Fan, Y., Gencay, R. (1995). A consistent nonparametric test of symmetry in linear regression models. Journal of the American Statistical Association 90:551–557.
Granger, C., Maasoumi, E., Racine, J. (2004). A dependence metric for possibly nonlinear time series. Journal of Time Series Analysis 25(5):649–669.
Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer Series in Statistics. New York: Springer-Verlag.
Jones, M. C., Marron, J. S., Sheather, S. J. (1996). A brief survey of bandwidth selection for density estimation. Journal of the American Statistical Association 91:401–407.
Künsch, H. R. (1989). The jackknife and the bootstrap for general stationary observations. The Annals of Statistics 17(3):1217–1241.
Keynes, J. M. (1936). The General Theory of Employment, Interest, and Money. Macmillan Cambridge University Press.
Liu, R. Y. (1988). Bootstrap procedures under some non i.i.d. models. Annals of Statistics 16:1696–1708.
McQueen, G., Thorley, S. (1993). Asymmetric business cycle turning points. Journal of Monetary Economics 31:341–362.
Mitchell, W. C. (1927). Business Cycles: The Problem and Its Setting. National Bureau of Economic Research.
Nelson, C. R., Plosser, C. (1982). Trends and random walks in macro-economic time series: some evidence and implications. Journal of Monetary Economics 10:139–162.
Ouyang, D., Li, Q., Racine, J. (2006). Cross-validation and the estimation of probability distributions with categorical data. Journal of Nonparametric Statistics 18(1):69–100.
Politis, D. N., Romano, J. P. (1994). Limit theorems for weakly dependent Hilbert space valued random variables with applications to the stationary bootstrap. Statistica Sinica 4:461–476.
Politis, D. N., White, H. (2004). Automatic block-length selection for the dependent bootstrap. Econometric Reviews 23:53–70.
Premaratne, G., Bera, A. (2005). A test for symmetry with leptokurtic financial data. Journal of Financial Econometrics 3:169–187.
Robinson, P. M. (1991). Consistent nonparametric entropy-based testing. Review of Economic Studies 58:437–453.
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall.
Timmermann, A., Perez-Quiros, G. (2001). Business cycle asymmetries in stock returns: evidence from higher order moments and conditional densities. Journal of Econometrics 103:259–306.
ARTICLE IN PRESS
Journal of Econometrics 138 (2007) 547–567 www.elsevier.com/locate/jeconom
A versatile and robust metric entropy test of time-reversibility, and other hypotheses

Jeffrey S. Racine a,*, Esfandiar Maasoumi b

a Department of Economics, McMaster University, Hamilton, ON, Canada L8S 4M4
b Department of Economics, Southern Methodist University, Dallas, TX 75275-0496, USA
Available online 19 June 2006
Abstract

We examine the performance of a metric entropy statistic as a robust test for time-reversibility (TR), symmetry, and serial dependence. It also serves as a measure of goodness-of-fit. The statistic provides a consistent and unified basis in model search, and is a powerful diagnostic measure with surprising ability to pinpoint areas of model failure. We provide empirical evidence comparing the performance of the proposed procedure with some of the modern competitors in nonlinear time-series analysis, such as robust implementations of the BDS and characteristic function-based tests of TR, along with correlation-based competitors such as the Ljung–Box Q-statistic. Unlike our procedure, each of its competitors is motivated for a different, specific, context and hypothesis. Our evidence is based on Monte Carlo simulations along with an application to several stock indices for the US equity market.

© 2006 Elsevier B.V. All rights reserved.

JEL classification: C14; C12

Keywords: Nonparametric; Kernel; Density; Information-theoretic; Time series
* Corresponding author. E-mail addresses: [email protected] (J.S. Racine), [email protected] (E. Maasoumi).
0304-4076/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.jeconom.2006.05.009

1. Introduction

A strictly stationary process is time-reversible (TR) if its finite-dimensional distributions are invariant to the reversal of the time indices. Linear processes with non-Gaussian innovations (see Weiss, 1975, Tong, 1990, and Hallin et al., 1988) and nonlinear processes
with regime-switching structures, such as the self-exciting threshold autoregressive (SETAR) processes, the exponential GARCH (EGARCH) processes, and the smooth transition autoregressive (STAR) processes, are examples of generally time-irreversible processes. Note that, since the finite-dimensional distributions of a sequence of independently and identically distributed (iid) random variables are products of marginal distributions, time-reversibility is a necessary condition for an iid sequence.

Since Tong (1990), at least, there has been a greater appreciation that the characteristics of economic and financial time-series typically go far beyond serial correlation and volatility clustering. For instance, the unemployment rate may behave asymmetrically in expansions and recessions, and the volatility of stock returns seems more sensitive to negative news than good news. Business cycle asymmetry and volatility asymmetries influence economic theories and the design of empirical models. As Tong (1990) pointed out, time-irreversibility is a broad concept of asymmetries in time-series.

We propose here a unified approach to testing time-irreversibility to complement conventional independence and other tests. We follow the same principles, concepts and statistics to test several hypotheses, as well as to assess model fit and predictive performance. This provides for intellectual consistency as well as economy.

In motivating a test for serial independence against time-irreversibility, Chen (2003) writes “Testing serial independence against time-irreversibility would be important for motivating and checking these nonlinear models, just like testing serial independence against serial correlation and volatility clustering is important for the ARMA-GARCH models.” Chen (2003) provides a good discussion of competing tests for TR, such as the one proposed by Chen et al. (2000) which is based on the characteristic function.
Like our proposed test in this paper, an advantage of the Chen et al. (2000) test is its robustness to the moment condition failure of heavily tailed data. Chen (2003) uses the expectation of an odd-symmetric function of a random variable and its lag as a time-irreversibility measure. This is different from the basis used by Chen et al. (2000); see Section 2.2 for details, and Section 3.2 for power comparisons. Researchers depend on a battery of specification and diagnostic tests designed to guide the specification process. Most of these statistics trace their origins to linear time-series models, though they are often used in nonlinear settings with varying degrees of success. The best known examples are correlation-based statistics such as the autocorrelation function and the Ljung–Box Q statistic (Ljung and Box, 1978), though of course information criteria such as the Akaike information criterion (Akaike, 1981) have also proved to be very popular. A complementary set of statistics trace their origins to nonlinear time-series models. This class would include, for example, the chaos-based BDS statistic (Brock et al., 1987,1996) and, more recently, the TR tests mentioned in the previous paragraph and in Chen and Kuan (2002) and Chen (2003). But, specification and diagnostic tests are more meaningfully placed in the context of decision making whereby a model’s adequacy is ranked and judged by goodness-of-fit and predictive performance. It is typically the case that principles and innovations that guide the choice of test statistics change from one test to another, and are also unrelated to principles and methods used to assess model choice, fit, and prediction. For instance, one would not think of using, say, a Q statistic to measure goodness-of-fit! As will become clear, an advantage of the entropy methods of this paper, and similar ‘‘distribution-based’’ procedures, is that they avoid confusing and sometimes contradictory sets of principles and motivations.
Information theoretic tests are increasingly found to be superior in a variety of contexts; see Hong and White (2005), Granger et al. (2004), and Skaug and Tjøtheim (1993), among others. In fact, BDS and other correlation integrals too may be viewed as special approximations of certain mutual information measures. Indeed, such relations may be used to obtain alternative nonparametric estimates for entropy measures, as proposed by Diks and Manzan (2002). It appears that how entropy measures are approximated, as well as how the actual statistics are implemented, have far reaching consequences for test properties and performance.

In this paper, we pursue an information theoretic approach to the general problem of testing and model selection. We adopt Granger et al.’s (2004) metric entropy statistic and illustrate how it can serve as a versatile diagnostic tool for guiding model specification in several new directions. In addition to being well-suited to testing for serial dependence as described in Granger et al. (2004) and measuring goodness-of-fit in nonlinear models as described in Maasoumi and Racine (2002), we propose using this statistic to test for time-reversibility. Our particular formulation of the null of time-reversibility necessitates testing for “symmetry.” The versatility of the entropy approach allows us, however, to use it to also test for symmetry in other contexts. We will examine the performance of the entropy-based method relative to a “robust BDS” test, the TR test of Chen and Kuan (2002), and common correlation-based statistics such as the Q statistic. Our approach to testing for time-reversibility appears to have good performance when used to identify suitable models and lags, being correctly sized yet having improved power relative to competing approaches. We also underscore the prescriptive nature of the statistic.
That is, should the statistic indicate model failure, it also indicates a likely culprit for this failure thereby suggesting directions in which an improved model may lie. Barnett et al. (1997) have studied the performance of a range of popular diagnostic statistics, and they outline the generally unsatisfactory performance of several of these approaches to testing for nonlinearity. Not even those tests having their origins in nonlinear settings are immune to performance problems, however. For instance, Cromwell et al. (1994, pp. 32–36) outline the use of the BDS statistic as a diagnostic tool for linear time-series modeling, the approach motivated mainly through expectation of power against linear, nonlinear, and chaotic (deterministic) alternatives. Unfortunately, the usual application of the BDS statistic in most studies is nonrobust due to the presence of unacceptably large size distortions, reflecting a failure of the asymptotic distribution theory in finite-sample settings. Somewhat surprisingly, the use of resampling methods to correct for such size distortions reveals an underlying lack of power relative to alternative statistics; see Belaire-Franch and Contreras (2002) for size and power performance of a permutation-based BDS test, and Chen and Kuan (2002) who consider a bootstrap-based BDS test. Time-reversibility can be shown to be a necessary condition for serially independent processes, thus the TR statistic can be applied as a diagnostic tool. However, there is a breakdown in the asymptotic distribution theory in finite-sample settings similar to that found for the BDS test, while a resampled version of the TR test may lack power in other settings. Our particular choice of the entropy functional in this paper reflects both its many desirable properties and our accumulated positive experience with its use in several contexts. While its ‘‘metricness’’ is a rather rare property, it is not unique, and other metric measures deserve further examination. 
The rest of the paper proceeds as follows. Section 2 presents an overview of the proposed test of symmetry and time-reversibility, along with the comparison tests considered herein. Section 3 outlines Monte Carlo experiments designed to examine finite-sample size and
power of the proposed test of symmetry and time-reversibility, and provides the asymptotic distribution of the statistic. Section 4 presents results extending the application performed by Chen and Kuan (2002) on financial models of six major US stock indices, while Section 5 presents some concluding remarks.

2. Overview of the BDS, TR, and metric entropy test statistics

We briefly describe the tests which are compared in the current paper: the BDS test (Brock et al., 1987, 1996), the TR test of Chen and Kuan (2002), and the entropy-based test $S_\rho$ (Granger et al., 2004; Maasoumi and Racine, 2002). We refer interested readers to the original papers for additional detailed descriptions of size and power performances of the respective tests.

2.1. The BDS test

The BDS test statistic is based on the correlation integral of a time-series $\{Y_t\}_{t=1}^{T}$. The generalized K-order correlation integral is given by

$$C_K(Y;\epsilon) = \left[\int \left(\int I(\|y - y'\| \le \epsilon)\, f_Y(y')\, dy'\right)^{K-1} f_Y(y)\, dy\right]^{1/(K-1)},$$

where $I(\cdot)$ denotes the indicator function, $f_Y(\cdot)$ denotes the marginal density of $Y$, and $\|Y\| = \sup_{i=1,\ldots,\dim Y} |y_i|$, the sup norm. The distance parameter $\epsilon$ is like a bandwidth and behaves accordingly. When the elements of $Y$ are iid, the correlation integral factorizes. The BDS test statistic is based on $C_K$, with $K = 2$. This gives the expected probability of $\epsilon$-neighbourhoods. For small $\epsilon$ and dimensionality parameter $m$, the inner integral (probability) in $C_K(\cdot)$ behaves as $\epsilon^m f_Y(y)$ over the $\epsilon$-neighbourhood. This allows us to see an approximate relationship between the correlation integral and various entropies (see Section 2.3).

The BDS test’s finite-sample distribution has been found to be poorly approximated by its limiting $N(0,1)$ distribution. In particular, the asymptotic-based test has been found to suffer from substantial size distortions, often rejecting the null 100% of the time when the null is in fact true. Recently, tables providing quantiles of the finite-sample distribution have been constructed in certain cases which attempt to correct for finite-sample size distortions arising from the use of the asymptotic distribution (see Kanzler (1999) who assumed true Gaussian error distributions), though the asymptotic version of the test is the one used in virtually all applied settings. A number of authors have noted that its finite-sample distribution is sensitive to the embedding dimension, dimension distance, and sample size, thus tabulated values are not likely to be useful in applied settings.¹ However, a simple permutation-based resampled version of the BDS statistic does yield a correctly sized test (Belaire-Franch and Contreras, 2002; Diks and Manzan, 2002), hence we elect to use this “robust BDS” approach implemented by Chen and Kuan (2002) for what follows.

¹ We note that in applied settings the user is required to set the embedding dimension ($m$) and the size of the dimensional distance ($\epsilon$). One often encounters advice to avoid using the test on samples of size 500 or smaller, while one also encounters advice on setting $\epsilon$ in the range $0.5\sigma_y$ to $2.0\sigma_y$ of a time-series $\{Y_t\}_{t=1}^{T}$ along with advice on setting $m$ in the range 2–8.
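To make the role of the embedding dimension $m$ and the distance $\epsilon$ concrete, the following Python sketch computes a simple sample analogue of the correlation integral for $K = 2$: the fraction of distinct pairs of $m$-histories lying within $\epsilon$ of each other in the sup norm. This is an illustration under stated assumptions rather than the BDS statistic itself (no variance normalization or resampling is performed), and the function names and the toy series are hypothetical.

```python
from itertools import combinations


def embed(series, m):
    """Form the overlapping m-histories (y_t, ..., y_{t+m-1})."""
    return [tuple(series[t:t + m]) for t in range(len(series) - m + 1)]


def sup_distance(u, v):
    """Sup-norm distance between two m-histories."""
    return max(abs(a - b) for a, b in zip(u, v))


def correlation_integral(series, m, eps):
    """Fraction of distinct pairs of m-histories within eps (sample analogue, K = 2)."""
    pairs = list(combinations(embed(series, m), 2))
    close = sum(1 for u, v in pairs if sup_distance(u, v) <= eps)
    return close / len(pairs)


# Hypothetical toy series, chosen so the counts can be checked by hand.
series = [0.0, 1.0, 0.0, 1.0, 0.0]
c1 = correlation_integral(series, m=1, eps=0.5)  # 4 of 10 pairs are close
c2 = correlation_integral(series, m=2, eps=0.5)  # 2 of 6 pairs are close
print(c1, c2)
```

Under independence the correlation integral at dimension $m$ factorizes into the $m$-th power of its one-dimensional counterpart; in this strongly dependent toy series, `c2` is far from `c1 ** 2`.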
2.2. The TR test

Recently, Chen and Kuan (2002) have suggested using a modified version of the TR test of Chen et al. (2000) as a diagnostic test for time-series models. This is a characteristic function-based test, and its authors recommend it in part as it requires no moment conditions hence is of wider applicability than existing time-reversibility tests. A stationary process is said to be ‘TR’ if its distributions are invariant to the reversal of time indices; independent processes are TR. If time-reversibility does not hold (i.e., the series is ‘time-irreversible’), then there is asymmetric dependence among members of the series in the sense that the effect of, say, $Y_s$ on $Y_t$ is different from that of $Y_t$ on $Y_s$; the threshold autoregressive (TAR) model is one example of a time-irreversible series. When a series is TR then the distribution of $Y_t - Y_{t-k}$ is symmetric ($k = 1, 2, \ldots$), while failure of this symmetry condition indicates asymmetric dependence. A distribution is symmetric if and only if the imaginary part of its characteristic function is zero (i.e., $h(\omega) = E[\sin(\omega(Y_t - Y_{t-k}))] = 0 \ \forall \omega \in \mathbb{R}^+$). A necessary condition is

$$E[\psi_g(Y_t - Y_{t-k})] = E\left[\int_{\mathbb{R}^+} \sin(\omega(Y_t - Y_{t-k}))\, g(\omega)\, d\omega\right] = 0, \qquad (1)$$

where $g(\cdot)$ is a weighting function, and Chen et al. (2000) therefore propose a test based on the sample analogue of (1) given by

$$C_{g,k} = \sqrt{T_k}\left(\frac{\bar{\psi}_{g,k}}{\bar{\sigma}_{g,k}}\right),$$

where $T_k = T - k$, $\bar{\psi}_{g,k} = \sum_{t=k+1}^{T} \psi_g(Y_t - Y_{t-k})/T_k$, and $\bar{\sigma}^2_{g,k}$ is a consistent estimator of the asymptotic variance of $\sqrt{T_k}\,\bar{\psi}_{g,k}$. Chen et al. (2000) choose $g(\cdot)$ as the exponential distribution function with parameter $\beta > 0$. In this case

$$\psi_g(Y_t - Y_{t-k}) = \frac{\beta (Y_t - Y_{t-k})}{1 + \beta^2 (Y_t - Y_{t-k})^2},$$

yielding a test that is straightforward to compute and has been shown to have a limiting $N(0,1)$ distribution under $H_0$. However, this version of the test suffers from substantial size distortions. A modified version of this test appropriate for testing asymmetry of residuals arising from a time-series model, which is called the TR test (Chen and Kuan, 2002), is given by

$$\hat{C}_{g,k} = \sqrt{T_k}\left(\frac{\hat{\psi}_{g,k}}{\hat{\nu}_{g,k}}\right),$$

where $\hat{\psi}_{g,k} = \sum_{t=k+1}^{T} \psi_g(\hat{\epsilon}_t - \hat{\epsilon}_{t-k})/T_k$ (for our purposes $\hat{\epsilon}_t$ represents standardized residuals from a time-series model) and where $\hat{\nu}^2_{g,k}$ is a consistent estimator of the asymptotic variance of $\sqrt{T_k}\,\hat{\psi}_{g,k}$ which is obtained by simple bootstrapping (Chen and Kuan, 2002, p. 568).

2.3. Metric entropy tests

Entropy-based measures of divergence have proved to be powerful tools for a variety of tasks. They may form the basis for measures of nonlinear dependence, as in Granger et al.
(2004), but as we shall demonstrate can also be used to assess goodness-of-fit in nonlinear models or form the basis of tests for symmetry or time-reversibility. Granger et al. (2004) consider the case of the nonsymmetric K-class entropy, as in

$$H_K(Y) = \frac{1}{K-1}\left[1 - \int f_Y^{K-1}(y)\, f_Y(y)\, dy\right], \quad K \ne 1 \ \text{(equal to Shannon's entropy for } K = 1\text{)},$$

$$\simeq \frac{1}{K-1}\left\{1 - \left[C_K(Y;\epsilon)/\epsilon^m\right]^{K-1}\right\},$$

or, for Renyi’s entropy,

$$H_K(Y) = -\frac{1}{K-1}\log \int [f_Y(y)]^{K-1} f_Y(y)\, dy, \quad K \ne 1,$$

$$\simeq -\log C_K(Y;\epsilon) + m \log \epsilon.$$

A nonparametric estimate of $C_K(\cdot)$ may then be used to obtain a similar estimate of $H_K(\cdot)$, as cogently argued by Diks and Manzan (2002). But the data do not determine $K$, so estimating correlation integrals does not provide a nonparametric method of determining the “right” order for the entropy functional ($K$). Similarly, the embedding dimension $m$ plays a role in how good a relation may hold between entropy and the corresponding correlation integral, but does not determine $K$. This bears further examination as an empirical matter.

Mutual information measures test the significance of different metrics of divergence between the joint distribution and the product of the marginals, $f_{X,Y}(\cdot)$, $f_X(\cdot)$, and $f_Y(\cdot)$, respectively. For instance

$$I(X, Y) = \int\!\!\int \ln\left[\frac{f_{X,Y}(x, y)}{f_X(x)\, f_Y(y)}\right] f_{X,Y}(x, y)\, dx\, dy$$

is the simple Shannon mutual information. And, generally

$$I_K(X, Y) = H_K(X) + H_K(Y) - H_K(X, Y),$$

for any $K$, relates mutual information to the respective marginal and joint entropies. It is worth reminding the reader that these are nonsymmetric measures. The conditional mutual information, given a third variable $Z$, is

$$I(X, Y \mid Z) = \int\!\!\int\!\!\int \ln\left[\frac{f_{X \mid y,z}(x \mid y, z)}{f_{X \mid z}(x \mid z)}\right] f_{X,Y,Z}(x, y, z)\, dx\, dy\, dz.$$

More generally,

$$I_K(X, Y \mid Z) = \ln C_K(X, Y, Z; \epsilon) - \ln C_K(X, Z; \epsilon) - \ln C_K(Y, Z; \epsilon) + \ln C_K(Z; \epsilon)$$

reveals the relation between conditional mutual information and the correlation integrals. If nonparametric estimates of the $C_K$ functions are plugged in, a corresponding nonparametric estimate of the conditional mutual information is obtained for a given $K$. The unconditional relation is obtained by removing $Z$. Extensive results on the connection
between correlation integrals and information theory are given in Prichard and Theiler (1995).²

² The choice $q = 2$ is by far the most popular in chaos analysis as it allows for efficient estimation algorithms. Note that the conditional mutual information $I_q(X, Y \mid Z)$ is not positive definite for $q \ne 1$. It is thus possible to have variables $X$ and $Y$ which are conditionally dependent given $Z$, but for which $I_2(X, Y \mid Z)$ is zero or negative. Also, as noted by Diks and Manzan (2002), if $I_2(X, Y \mid Z)$ is zero, the test based on it does not have unit power asymptotically against conditional dependence. This situation, while exceptional, is quite undesirable. Since $I_2(X, Y \mid Z)$ is usually either positive or negative, a one-sided test rejecting for $I_2(X, Y \mid Z)$ large is not optimal. Diks and Manzan (2002) argue that, in practice, $I_2$ behaves much like $I_1$ in that we usually observe larger power for one-sided tests (rejecting for large $I_2$) than for two-sided tests. This led them to propose $q = 2$, together with a one-sided implementation of the test.

For any two density functions $f_1$ and $f_2$, the asymmetric (with respect to $f_1$) K-class entropy divergence measure is

$$I_K(f_2, f_1) = \frac{1}{K-1}\left[\int \left(f_2^K / f_1^K\right) dF_1 - 1\right], \quad K \ne 1,$$

such that $\lim_{K \to 1} I_K(\cdot) = I_1(\cdot)$, the Shannon cross entropy (divergence) measure. When one distribution is the joint, and the other is the product of the marginals, this latter measure is called the “mutual information” outlined earlier. Once the divergence in both directions (of $f_1$ and $f_2$) are averaged, a symmetric measure is obtained which, for $K = 1$, is well known as the Kullback–Leibler measure. The symmetric K-class measure at $K = \tfrac{1}{2}$ is as follows:

$$I_{1/2} = \tfrac{1}{2}\left\{I_{1/2}(f_2, f_1) + I_{1/2}(f_1, f_2)\right\} = M(f_1, f_2) = 2B(f_1, f_2),$$

where $M(\cdot) = \int (f_1^{1/2} - f_2^{1/2})^2\, dx$ is the Matusita distance, and $B(\cdot) = 1 - \rho$ is the Bhattacharya distance with $0 \le \rho = \int (f_1 f_2)^{1/2} \le 1$ being a measure of “affinity” between two densities. $B(\cdot)$ and $M(\cdot)$ are rather rare measures of divergence since they satisfy the triangular inequality and are, therefore, proper metric measures of distance.

Following the arguments in Granger et al. (2004) and Maasoumi and Racine (2002) in favour of metric entropies, we choose $K = \tfrac{1}{2}$ in the K-class entropy defined above. This is a normalization of the Bhattacharya–Matusita–Hellinger measure of dependence given by

$$S_\rho = \frac{1}{2}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \left(f_1^{1/2} - f_2^{1/2}\right)^2 dx\, dy = \frac{1}{2} I_{1/2}, \qquad (2)$$

where $f_1 = f(x, y)$ is the joint density and $f_2 = f(x) \cdot f(y)$ is the product of the marginal densities of the random variables $X$ and $Y$. $S_\rho = 0$ if and only if $X$ and $Y$ are independent, and is otherwise positive and less than or equal to one. The subscript $\rho$ invites an allusion to measures of “dependence” and co-dependence (as in correlation), but the above measure is just as commonly a measure of discrimination between models, hypotheses, etc. Further clarification of its relations to other entropies and measures, such as Cressie–Read and Tsallis entropies, are lucidly exposited in Golan (2002) and Golan and Perloff (2002).

A test statistic based on $S_\rho$ has several desirable properties. While each is not unique, this set of properties is unmatched and is worth enumerating:

1. It is well defined for both continuous and discrete variables.
2. It is normalized to zero if $X$ and $Y$ are independent, and lies between 0 and 1.
3. The modulus of the measure is equal to unity if there is a measurable exact (nonlinear) relationship, $Y = g(X)$ say, between the random variables.
4. It is equal to or has a simple relationship with the (linear) correlation coefficient in the case of a bivariate normal distribution.
5. It is a metric, that is, it is a true measure of “distance” and not just of divergence, since it satisfies the triangularity rule.
6. The measure is invariant under continuous and strictly increasing transformations $h(\cdot)$. This is useful since $X$ and $Y$ are independent if and only if $h(X)$ and $h(Y)$ are independent for all $h(\cdot)$. This invariance is also useful in deriving asymptotic distributions and conducting resampling experiments on pivotal statistics.
7. It is symmetric with respect to variables $X$ and $Y$.

See Granger et al. (2004) for discussion and proofs.

Our TR test relies on testing for symmetry. To test symmetry, either in a series (original or residuals), or in their lag differences (as for time-reversibility case), the densities $f_1$ and $f_2$ will be defined accordingly. This further attests to the unification ability and general versatility of the information/entropy measures. Other divergence measures are capable of characterizing desired null hypotheses (such as independence) but may not be appropriate when these distances are compared across models, sample periods, or agents. These comparisons are often implicit in inferences. The $S_\rho$ measure is “robust” since it is capable of producing familiar outcomes when we deal with truly linear/Gaussian processes but is robust to departures from such settings.

For testing the null of serial independence at any lag $K$, we employ kernel estimators of $f(y, y_{-k})$, $f(y)$, and $f(y_{-k})$, $k = 1, 2, \ldots, K$, originally proposed by Parzen (1962), with likelihood cross-validation used for bandwidth selection leading to density estimators that are “optimal” according to the Kullback–Leibler criterion; see Silverman (1986, p. 52) for details. For a test of symmetry, we proceed in a similar manner. The null distribution of the kernel-based implementation of $S_\rho$, denoted by $\hat{S}_\rho$, is obtained via a bootstrap resampling approach identical to that used for the “robust BDS” test described above (see Granger et al. (2004) for details). R code for computing the entropy metric and for computing the “robust BDS” test (Ihaka and Gentleman, 1996)³ is available from the authors upon request.

³ See http://www.r-project.org

In Section 4 we compare the relative performance of the TR, “robust BDS,” and the $\hat{S}_\rho$ entropy-based tests on the basis of their diagnostic ability in an empirical setting. All tests are correctly sized due to their reliance on resampling procedures (see Section 3 for finite-sample performance of the entropy tests), therefore relative performance boils down to a comparison of power. We note BDS is known to be a test of the hypothesis $E[f(X, Y)] - E[f(X)]\, E[f(Y)] = 0$, whereas mutual information tests are concerned with the expected divergence between $f(x, y)$ and $f(x)\, f(y)$ (relative to $f(x, y)$). The latter are one-to-one representations of independence and imply the null of concern to BDS, but not vice versa.

2.3.1. Testing for time-reversibility

There are a number of potential approaches to testing for time-reversibility. For instance, one approach would be to consider the equality of the joint distribution of any
finite set of time-series with the joint distribution of the same set in reverse order. In a nonparametric implementation, however, this can face the curse of dimensionality when the length of the finite set is large. Another approach is to note that reversibility implies that, 8k, f ðY t ; Y tk Þ ¼ f ðY tk ; Y t Þ, and Darolles et al. (2004) consider an elegant statistic based upon kernel estimation of such bivariate distributions at any lag. Yet another approach is possible, however, by recalling that when a series is TR then 8k, f ðY t Y tk Þ is symmetric. That is, one could instead look at the symmetry of pairwise distributions of Y t Y tk , for several values of k. In a nonparametric setting this circumvents the curse of dimensionality which has obvious appeal, and it is this avenue that we choose to pursue here. Looking at the symmetry of pairwise distributions of Y t Y tk for several values of k was also the route taken by Chen et al. (2000), who utilized the characteristic function as the basis for their test. Chen (2003) also proposed a portmanteau version of the latter method based on the sum of a finite number of pairwise differences with good performance. Testing for symmetry of the marginal distribution of kth differences highlights the versatility of the metric entropy by leveraging the symmetry test outlined in the following section, i.e., tests of symmetry and TR are based on the same statistic, the former for levels, the latter for kth differences. Section 3 presents some Monte Carlo evidence on the test’s finite-sample performance for some popular linear and nonlinear time-series processes. In that section we also offer some remarks regarding the asymptotic distribution of the entropy statistics and their poor performance. 2.3.2. Testing for unconditional and conditional symmetry Consider a stationary series fY t gTt¼1 . 
Let $\mu_y = E[Y_t]$, let $f_1(y)$ denote the density function of the random variable $Y_t$, let $\tilde Y_t = -Y_t + 2\mu_y$ denote a rotation of $Y_t$ about its mean, and let $f_2(\tilde y)$ denote the density function of the random variable $\tilde Y_t$. Note that, if $\mu_y = 0$, then $\tilde Y_t = -Y_t$, though in general this will not be so. We say a series is symmetric about the mean (median, mode) if $f_1(y) \equiv f_2(\tilde y)$ almost surely. Testing for asymmetry about the mean therefore naturally involves testing the null

$$H_0: f_1(y) = f_2(\tilde y) \quad \text{for all } y.$$

Note that symmetry about the mode or median may be a more natural characterization. One could of course rotate a distribution around these measures of central tendency, and in what follows one would replace the mean with the appropriate statistic.

Tests for the presence of "conditional asymmetry" can be based upon standardized residuals from a regression model (see Belaire-Franch and Peiro, 2003). Let $Y_t = h(\Omega_t; \beta) + \sigma(\Omega; \lambda)\,\varepsilon_t$ denote a general model for this process, where $\Omega_t$ is a conditioning information set, $\sigma(\Omega; \lambda)$ is the conditional standard deviation of $Y_t$, and $\varepsilon_t$ is a zero mean, unit variance error process independent of the elements of $\Omega_t$. If $\mu_\varepsilon = 0$, then tests for conditional asymmetry involve the following null:

$$H_0: f_1(\varepsilon) = f_2(-\varepsilon) \quad \text{for all } \varepsilon.$$
Bai and Ng (2001) construct tests based on the empirical distribution of $\varepsilon_t$ and that of $-\varepsilon_t$. Belaire-Franch and Peiro (2003) apply this and other tests to the Nelson and Plosser (1982) data updated to include 1988.

We propose testing for symmetry using the metric entropy given in (2) with $f_1$ and $f_2$ as defined above. The unknown densities are estimated using kernel methods (see Granger et al. (2004) for details), while the null distribution is obtained via resampling from the pooled sample $\{Y_t, \tilde Y_t\}$. Note that $H_0$ holds if and only if $S_\rho = 0$. This is the main source of superior power for our entropy tests. Section 3 presents Monte Carlo evidence on the test's finite-sample performance, and reveals that the proposed approach is correctly sized and has power under the alternative.
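The procedure just described can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' R implementation: it assumes the metric in (2) takes the Hellinger-type form $S_\rho = \frac{1}{2}\int (f_1^{1/2} - f_2^{1/2})^2\,dy$, uses a Gaussian kernel with a rule-of-thumb bandwidth in place of likelihood cross-validation, approximates the integral on a grid, and recomputes the statistic on each resample drawn from the pooled sample. The function names `kde`, `s_rho`, and `symmetry_test` are hypothetical.

```python
import math
import random


def kde(sample, h):
    """Gaussian kernel density estimate built from `sample` with bandwidth h."""
    scale = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return lambda x: scale * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)


def s_rho(y, grid=100):
    """0.5 * integral of (sqrt(f1) - sqrt(f2))^2, with f2 the density of the rotated series."""
    n = len(y)
    mu = sum(y) / n
    y_rot = [2.0 * mu - v for v in y]                 # rotation about the mean
    sd = math.sqrt(sum((v - mu) ** 2 for v in y) / (n - 1))
    h = 1.06 * sd * n ** -0.2                         # rule-of-thumb bandwidth (assumption)
    f1, f2 = kde(y, h), kde(y_rot, h)
    lo = min(min(y), min(y_rot)) - 3.0 * h
    hi = max(max(y), max(y_rot)) + 3.0 * h
    step = (hi - lo) / grid
    total = 0.0
    for i in range(grid + 1):
        x = lo + i * step
        weight = 0.5 if i in (0, grid) else 1.0       # trapezoid rule
        total += weight * (math.sqrt(f1(x)) - math.sqrt(f2(x))) ** 2
    return 0.5 * step * total


def symmetry_test(y, n_boot=99, seed=0):
    """Return (statistic, bootstrap P-value) for the null of symmetry about the mean."""
    rng = random.Random(seed)
    mu = sum(y) / len(y)
    pooled = list(y) + [2.0 * mu - v for v in y]      # pooled sample, symmetric by construction
    stat = s_rho(y)
    exceed = sum(
        s_rho([rng.choice(pooled) for _ in y]) >= stat for _ in range(n_boot)
    )
    return stat, (1 + exceed) / (n_boot + 1)
```

Because the pooled sample is symmetric about the sample mean by construction, resampling from it imposes the null; a small P-value then indicates that the observed statistic is large relative to what symmetry would produce.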
3. Finite-sample behaviour

3.1. Testing for symmetry

We consider the finite-sample performance of the kernel-based test for symmetry, the null hypothesis being that the distribution is symmetric. We set the number of bootstrap replications underlying the test to 99, and consider a range of sample sizes. For each DGP, we conduct 1000 Monte Carlo replications. The bandwidth is selected via likelihood cross-validation. We consider a variety of DGPs ranging from symmetric to highly asymmetric. Table 1 summarizes the finite-sample performance of the proposed test in the form of empirical rejection frequencies, i.e., the proportion of rejections out of 1000 Monte Carlo replications. We consider tests with nominal size $\alpha = 0.10$, 0.05, and 0.01.

We observe from Table 1 that the test is correctly sized. Empirical size does not differ from nominal for any of the sample sizes considered. As the degree of asymmetry increases, we observe that power increases, as it also does when the sample size increases.

Table 1
Empirical rejection frequencies at levels α = 0.10, 0.05, 0.01

n      N(μ,σ²)  χ²(120)  χ²(80)  χ²(40)  χ²(20)  χ²(10)  χ²(5)   χ²(1)
α = 0.10
50     0.101    0.155    0.195   0.285   0.414   0.611   0.845   0.957
100    0.116    0.247    0.309   0.446   0.697   0.899   0.991   1.000
200    0.097    0.340    0.461   0.737   0.939   0.997   1.000   1.000
α = 0.05
50     0.062    0.093    0.124   0.166   0.252   0.456   0.690   0.881
100    0.071    0.152    0.185   0.316   0.551   0.806   0.966   0.993
200    0.046    0.230    0.325   0.611   0.885   0.994   1.000   1.000
α = 0.01
50     0.018    0.027    0.038   0.059   0.108   0.229   0.444   0.666
100    0.025    0.058    0.067   0.144   0.323   0.600   0.863   0.956
200    0.011    0.112    0.171   0.398   0.724   0.956   0.999   0.998

The degree of asymmetry increases as we go from left to right. The N(μ,σ²) column corresponds to empirical size, while the remaining columns correspond to empirical power. n denotes the sample size.
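To make the left-to-right ordering of the columns in Table 1 concrete, the short sketch below computes the theoretical skewness of a $\chi^2(\nu)$ variate, $\sqrt{8/\nu}$, for each degrees-of-freedom value in the table and shows how such draws could be generated. It illustrates the DGPs only and is not the authors' simulation code; `chi2_skewness`, `draw_chi2`, and `sample_skewness` are hypothetical helper names.

```python
import math
import random


def chi2_skewness(df):
    """Theoretical skewness of a chi-square variate with df degrees of freedom."""
    return math.sqrt(8.0 / df)


def draw_chi2(df, n, rng):
    """Draw n chi-square(df) variates as Gamma(shape=df/2, scale=2)."""
    return [rng.gammavariate(df / 2.0, 2.0) for _ in range(n)]


def sample_skewness(x):
    """Moment-based sample skewness m3 / m2^(3/2)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    return m3 / m2 ** 1.5


# Skewness grows as the degrees of freedom fall, matching the column order of Table 1.
for df in (120, 80, 40, 20, 10, 5, 1):
    print(df, round(chi2_skewness(df), 3))
```

The normal column has zero skewness, so it measures size, while the $\chi^2$ columns move steadily away from symmetry as $\nu$ decreases.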
Table 2
Empirical rejection frequencies for S_ρ and Chen and Kuan's TR test (CK) for the SETAR model for β = 1, k = 1, 2, 3

k    n      α = 0.10         α = 0.05         α = 0.01
            S_ρ     CK       S_ρ     CK       S_ρ     CK
1    50     0.728   0.127    0.532   0.057    0.217   0.012
1    100    0.926   0.152    0.854   0.085    0.549   0.017
1    200    0.999   0.170    0.997   0.097    0.934   0.039
2    50     0.129   0.098    0.062   0.050    0.007   0.013
2    100    0.154   0.110    0.081   0.054    0.016   0.010
2    200    0.212   0.106    0.130   0.052    0.035   0.009
3    50     0.080   0.103    0.033   0.050    0.007   0.009
3    100    0.073   0.100    0.033   0.046    0.004   0.010
3    200    0.071   0.110    0.031   0.056    0.003   0.011
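The SETAR design underlying Table 2 (Eq. (3) with p = 1 and d = 1) and the lag-k differences $Y_t - Y_{t-k}$ whose symmetry is tested can be generated along the following lines. This is a minimal sketch rather than the authors' code: the default parameter values are copied as they appear in the text, the burn-in length is an arbitrary assumption, and `simulate_setar` and `kth_differences` are hypothetical names.

```python
import random


def simulate_setar(n, a=(1.25, 0.7, 2.0), b=(0.0, 0.3, 1.0), r=0.15, burn=200, seed=0):
    """Simulate a two-regime SETAR(2; 1, 1) series with threshold r and delay d = 1."""
    rng = random.Random(seed)
    a0, a1, s1 = a          # regime used when Y_{t-1} <= r
    b0, b1, s2 = b          # regime used when Y_{t-1} > r
    y_prev = 0.0
    series = []
    for t in range(n + burn):
        eps = rng.gauss(0.0, 1.0)
        if y_prev > r:
            y = b0 + b1 * y_prev + s2 * eps
        else:
            y = a0 + a1 * y_prev + s1 * eps
        if t >= burn:       # discard the burn-in draws (assumption)
            series.append(y)
        y_prev = y
    return series


def kth_differences(y, k):
    """Return Y_t - Y_{t-k}; time-reversibility implies this series is symmetric."""
    return [y[t] - y[t - k] for t in range(k, len(y))]
```

A time-reversibility test at lag k then amounts to applying a symmetry test to `kth_differences(y, k)`.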
3.2. Testing for time-reversibility

Next, we consider two simulation experiments, one designed to examine finite-sample power and the other to examine finite-sample size of the proposed metric entropy test of time-reversibility. In order to examine the test's power, we consider a time-irreversible two-regime self-exciting threshold autoregression model (SETAR) of the form

$$Y_t = \bigl(1 - I(Y_{t-d} > r)\bigr)\bigl(\alpha_0 + \alpha_1 Y_{t-1} + \cdots + \alpha_p Y_{t-p} + \sigma_1 \varepsilon_t\bigr) + I(Y_{t-d} > r)\bigl(\beta_0 + \beta_1 Y_{t-1} + \cdots + \beta_p Y_{t-p} + \sigma_2 \varepsilon_t\bigr), \qquad (3)$$

where $\varepsilon_t \sim N(0,1)$, and where $I(\cdot)$ is the familiar indicator function. This is commonly referred to as a SETAR(2; p, p), and serves as a common illustration of a time-irreversible process. Following Clements et al. (2002, p. 366) we set $p = 1$, $d = 1$, and $r = 0.15$, and let $(\alpha_0, \alpha_1, \sigma_1) = (1.25, 0.7, 2)$ and $(\beta_0, \beta_1, \sigma_2) = (0, 0.3, 1)$. We consider 1000 Monte Carlo replications drawn from this DGP and apply the proposed test for a range of lags for $Y_t - Y_{t-k}$. Results are presented in Table 2. The null is that the series is TR, hence empirical rejection frequencies exceeding 0.10, 0.05, and 0.01, respectively, reflect power at the respective sizes. For comparison purposes, we also compute the TR test of Chen and Kuan (2002); we are indebted to Yi-Ting Chen for providing his Gauss code that implements his test.

In order to examine the test's size, Table 3 presents results for a TR AR(1) model of the form $Y_t = \alpha_0 + \alpha_1 Y_{t-1} + \varepsilon_t$, with $(\alpha_0, \alpha_1) = (1, 0.5)$ and $\varepsilon_t \sim N(0,1)$.

It is evident from Tables 2 and 3 that the proposed $S_\rho$ test is correctly sized (entries in Table 3 do not differ significantly from nominal size) and has power approaching 1 as $n$ increases (see, e.g., the entries in Table 2 for $k = 1$). Our expectation of significant power improvements relative to other tests is borne out by these results. We have considered a range of null and alternative models not reported here for space
Table 3
Empirical rejection frequencies for S_ρ and Chen and Kuan's TR test (CK) for the AR model for β = 1, k = 1, 2, 3

k    n      α = 0.10         α = 0.05         α = 0.01
            S_ρ     CK       S_ρ     CK       S_ρ     CK
1    50     0.081   0.080    0.037   0.028    0.007   0.006
1    100    0.105   0.076    0.053   0.033    0.005   0.004
1    200    0.107   0.069    0.048   0.031    0.007   0.003
2    50     0.100   0.108    0.039   0.051    0.007   0.009
2    100    0.109   0.100    0.053   0.058    0.014   0.014
2    200    0.105   0.123    0.052   0.063    0.010   0.012
3    50     0.092   0.132    0.044   0.072    0.005   0.019
3    100    0.083   0.103    0.043   0.053    0.013   0.012
3    200    0.097   0.117    0.046   0.055    0.006   0.014
considerations, and results appear to be robust across a range of specifications. We are confident that the test will perform as expected in applied settings.

3.3. Asymptotic approximations to the null distribution

Skaug and Tjøstheim (1996) obtained the asymptotic distributions of several similar statistics using the same sample splitting techniques used by Robinson (1991). Considering the moment form of our measure, and replacing the joint CDF with the empirical CDF, we obtain the following approximation for the kth lag:

$$\tilde S_\rho = \frac{1}{T-k}\sum_{t=k+1}^{T}\left[1 - \sqrt{\frac{\hat g(y_t)\,\hat h(y_{t-k})}{\hat f_1(y_t, y_{t-k})}}\,\right]^2 w(y_t, y_{t-k}), \qquad (4)$$

where $\hat f_1(\cdot)$, $\hat h(\cdot)$, and $\hat g(\cdot)$ are the kernel estimates of the corresponding densities. The weight function is $w(x,y) = 1\{(x,y) \in S^2\}$, where $S^2 = S \times S$ with $S = [a,b]$ for $a < b$. It has the effect of trimming out the extreme observations (though we do not actually use it in applications), and avoids the difficult tail areas for logarithmic entropy functions. Skaug and Tjøstheim (1993, 1996) prove the consistency of a portmanteau version of $\tilde S_\rho$ (for a joint test of serial independence at several lags) for a weighted and nonnormalized version of $S_\rho$. Given that we are testing "symmetry" of the first differences, which are iid under the null hypothesis, we are able to adopt the same techniques if we further assume the following:

(A1) The marginal densities are bounded and uniformly continuous on $R^1$.
(A2) The kernel function $K(\cdot)$ is bounded and satisfies $\int K(u)\,du = 0$, $\int u^2 K(u)\,du < \infty$, and has the representation $K(x) = \int \tilde K(\eta)\,e^{i\eta x}\,d\eta$, where $i = \sqrt{-1}$ and $\tilde K$ is an absolutely integrable function of a real variable.
(A3) The bandwidth $h_n = c\,n^{-1/\beta}$ for some $c > 0$ and $4 < \beta < 8$.
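As an illustration of the moment form in Eq. (4) under a bandwidth of the type in (A3), the sketch below evaluates $\tilde S_\rho$ at lag k with Gaussian product kernels. It is a sketch under stated assumptions, not the authors' implementation: the choices $c$ equal to the sample standard deviation and $\beta = 6$ are arbitrary, the densities are evaluated at the sample points without leave-one-out corrections, and the name `s_tilde` and its `trim` argument (the interval $S = [a, b]$) are hypothetical.

```python
import math
import random


def s_tilde(y, k, beta=6.0, trim=None):
    """Moment-form statistic of Eq. (4) for lag k with Gaussian product kernels."""
    pairs = [(y[t], y[t - k]) for t in range(k, len(y))]
    n = len(pairs)                                   # T - k terms in the sum
    xs = [p[0] for p in pairs]
    zs = [p[1] for p in pairs]

    def bandwidth(v):
        m = sum(v) / n
        sd = math.sqrt(sum((u - m) ** 2 for u in v) / (n - 1))
        return sd * n ** (-1.0 / beta)               # h_n = c * n^(-1/beta), c = sd (assumption)

    hx, hz = bandwidth(xs), bandwidth(zs)
    root2pi = math.sqrt(2.0 * math.pi)
    total = 0.0
    for x, z in pairs:
        kx = [math.exp(-0.5 * ((x - u) / hx) ** 2) for u in xs]
        kz = [math.exp(-0.5 * ((z - u) / hz) ** 2) for u in zs]
        g = sum(kx) / (n * hx * root2pi)             # marginal density estimate at y_t
        h = sum(kz) / (n * hz * root2pi)             # marginal density estimate at y_{t-k}
        f = sum(a * b for a, b in zip(kx, kz)) / (n * hx * hz * 2.0 * math.pi)  # joint estimate
        inside = trim is None or all(trim[0] <= v <= trim[1] for v in (x, z))
        if inside:                                   # weight w = 1{(y_t, y_{t-k}) in S x S}
            total += (1.0 - math.sqrt(g * h / f)) ** 2
    return total / n
```

Passing `trim=(a, b)` drops pairs outside $S \times S$, which is how the weight function removes the tail observations; with `trim=None` the weight is identically one, as in the applications.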
It is generally assumed that the processes are strong mixing with an exponentially decaying mixing coefficient. Under these assumptions proof of consistency of classes of
measures, including $\tilde S_\rho$, follows from Skaug and Tjøstheim (1993) and also Robinson (1991). Asymptotic normality of the measure $\tilde S_\rho$ is given by the following theorem, which is adapted from Skaug and Tjøstheim (1996).

Theorem 1. Let $\{Y_t, t \ge 1\}$ be iid random variables. Under assumptions (A1)–(A3) we have

$$n^{1/2}\,\tilde S_\rho \to N\!\left(0, \tfrac{1}{4}\sigma^2\right), \qquad (5)$$

where

$$\sigma^2 = \left[\int f(y)\,w(y)\,dy\right]^2 \left[1 - \int f(y)\,w(y)\,dy\right]^2. \qquad (6)$$

Proof. Slight adaptation of Theorem 1 of Skaug and Tjøstheim (1996). □

As has been noted by Robinson (1991) and others, when $w(\cdot) = 1$ we obtain $\sigma^2 = 0$, a degenerate distribution. Therefore, the weight function is a theoretically important device for obtaining the asymptotic distribution of these types of statistics. Hong and White (2005) provide a bootstrap implementation of an approximation to the Kullback–Leibler measure whose asymptotic distribution is obtained without the sample splitting device of Robinson (1991) and the sensitivity to a nuisance parameter thereof. They conjecture similar results would hold for our measure.

The large-sample Gaussian distribution is known to be a very poor approximation to the finite-sample one. Skaug and Tjøstheim (1993, 1996) have also studied this issue in the context of testing for serial independence. They note that the asymptotic variance is very poorly estimated in the case of the Hellinger and cross entropy measures, which renders asymptotic inferences quite unreliable. These same reasons suggest that bootstrapping "asymptotically pivotal" statistics may not perform well in this context.

A serious problem with the asymptotic approach in a kernel context is that the asymptotic-based null distribution would not depend on the bandwidth, while the value of the test statistic does so directly. This is partly because the bandwidth vanishes asymptotically. This is a serious drawback in practice, since the outcome of such asymptotic-based tests tends to be quite sensitive to the choice of bandwidth. This has been reported by a number of authors including Robinson (1991) who, in a kernel context, noted that "substantial variability in the [test statistic] across bandwidths was recorded," which would be quite disturbing in applied situations.
The asymptotic theory for residual-based tests will require techniques similar to those used by Chen (2003), who exploits the asymptotic properties of the Gaussian quasi-maximum-likelihood estimators (QMLEs) to extend the original-series-based tests as model diagnostic checks for a general model. Since these model diagnostic checks are based on a formal asymptotic method, they can be implemented without bootstrapping. We have not developed a similar theory for our tests based on the standardized residuals in this paper.

We now turn to several applications of a bootstrap implementation of the entropy metric, including as a measure of goodness-of-fit, as a measure of nonlinear "serial" dependence in both the original return series and the residuals of certain models, and as a test of time-reversibility in pairwise comparisons for a small set of k lags. In what follows
we implement $\hat S_\rho$ defined as follows:

$$\hat S_\rho = \frac{1}{2}\int\!\!\int \left(\hat f_1^{1/2} - \hat f_2^{1/2}\right)^2 d\hat F_1, \qquad (7)$$

where "^" denotes kernel estimates. Both $\hat S_\rho$ and $\tilde S_\rho$ are consistent estimators of $S_\rho$, and they differ only in the use of the empirical CDF (in $\tilde S_\rho$) and the joint kernel estimator (in $\hat S_\rho$). The rates of convergence will depend on assumptions regarding the underlying process $Y_t$ (or its first differences), and the operative hypothesis (null or alternatives). Generally $\hat S_\rho$ will converge more slowly. Certain laws of large numbers support the asymptotic equivalence of the normalized versions of these two estimators under the null. But, given that we test for symmetry of the first differenced series, the required restrictions on the underlying process $Y_t$ may be relaxed so as to leave the first differences as stationary. A careful and detailed examination of this issue is beyond the scope of the present study. But these arguments may not provide much practical guidance when first order asymptotic theory is known to be unsuccessful in small samples. Also, our own experience strongly argues against the implementation of entropies as in $\tilde S_\rho$ compared to our kernel implementation $\hat S_\rho$.

4. Application: dynamic model specification for stock returns

Section 3 outlines Monte Carlo experiments which reveal that the proposed tests are correctly sized under the null and have power approaching one under the alternative as the sample size increases. We now turn to an illustrative application, closely following the work of Chen and Kuan (2002) to facilitate comparison. Since all tests considered herein employ resampling methods yielding tests that are correctly sized, this application will serve to underscore power differences, albeit in an applied setting.

4.1. Data sources

We use the data series found in Chen and Kuan (2002), who apply their TR characteristic function-based test to residuals from a variety of models of daily returns of six major US stock indices.
The indices are the Dow Jones Industrial Averages (DJIA), New York Stock Exchange Composite (NYSE), Standard and Poor's 500 (S&P500), National Association of Securities Dealers Automated Quotations Composite (NASDAQ), Russell 2000 (RS2000), and Pacific Exchange Technology (PETECH). Each series contains $T = 2527$ observations running from January 1, 1991 through December 31, 2000, and we let $Y_t = 100 \times (\log P_t - \log P_{t-1})$ denote the daily return of the index $P_t$.

4.2. Assessing dependence in the series

We begin by considering whether or not there exists potential nonlinear dependence in the original series themselves. We therefore compute our metric entropy $\hat S_\rho$ using $k = 1, 2, \ldots, 5$ lags for each original series, and use this as the basis for a test of independence following Granger et al. (2004). Bandwidths were selected via likelihood cross-validation, and the Gaussian kernel was used throughout. We construct P-values for the hypothesis that each series is a serially independent white-noise process and, for comparison, we also compute P-values for the Q test. Results are summarized in Table 4.
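For concreteness, the return transformation and the lagged pairs $(Y_t, Y_{t-k})$ on which a lag-k dependence measure operates can be written as follows. This is a small illustrative sketch with hypothetical function names and made-up prices; it does not use the index data analysed here.

```python
import math


def daily_returns(prices):
    """Y_t = 100 * (log P_t - log P_{t-1}) for a sequence of index levels."""
    return [100.0 * (math.log(p1) - math.log(p0)) for p0, p1 in zip(prices, prices[1:])]


def lagged_pairs(y, k):
    """Pairs (Y_t, Y_{t-k}) used to assess dependence at lag k."""
    return [(y[t], y[t - k]) for t in range(k, len(y))]


# Hypothetical index levels, for illustration only.
levels = [100.0, 101.5, 100.8, 102.3, 102.0]
returns = daily_returns(levels)
pairs_lag1 = lagged_pairs(returns, 1)
```

A series of T prices yields T - 1 returns, and a lag-k comparison uses the T - 1 - k pairs that remain.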
Table 4
P-values for the entropy-based and Q tests for serial independence at various lags on the original data series

           Ŝ_ρ                                 Q
Series     k=1    k=2    k=3    k=4    k=5     k=1    k=2    k=3    k=4    k=5
DJIA       0.00   0.00   0.00   0.00   0.00    0.36   0.13   0.04   0.09   0.14
NASDAQ     0.00   0.00   0.00   0.00   0.00    0.01   0.03   0.07   0.14   0.18
NYSE       0.00   0.00   0.00   0.00   0.00    0.00   0.00   0.00   0.00   0.00
PETECH     0.00   0.00   0.00   0.00   0.00    0.00   0.00   0.00   0.00   0.00
RS2000     0.00   0.00   0.00   0.00   0.00    0.00   0.00   0.00   0.00   0.00
S&P500     0.01   0.00   0.00   0.00   0.00    0.00   0.48   0.02   0.05   0.02
As can be seen from Table 4, there is extremely strong and robust evidence in favour of dependence being present in all of these series based on $\hat S_\rho$. However, the correlation-based Q statistic fails to capture this dependence for the DJIA, NASDAQ, and S&P500 across a range of lags. This suggests, as has been often observed, that correlation-based tests often lack power in nonlinear settings. Next, we follow Chen and Kuan (2002), who assess the suitability of two classes of popular time-series models which have been used to model such processes.

4.3. Assessing goodness-of-fit

As in Chen and Kuan (2002), we consider two models, the GARCH(p, q) (Bollerslev, 1986) and EGARCH(p, q) (Nelson, 1991) specifications for a time-series $Y_t \mid \Psi_{t-1} = \varepsilon_t \sim N(0, h_t)$, which we now briefly outline. The GARCH(p, q) model may be expressed as

$$\text{GARCH}(p,q): \quad h_t = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \gamma_j h_{t-j}, \qquad (8)$$

where $p \ge 0$, $q > 0$, $\omega > 0$, $\alpha_i \ge 0$, and $\gamma_j \ge 0$, while the EGARCH(p, q) model may be written as

$$\text{EGARCH}(p,q): \quad \ln(h_t) = \omega + \sum_{i=1}^{q} \alpha_i\, g(z_{t-i}) + \sum_{j=1}^{p} \gamma_j \ln(h_{t-j}), \qquad (9)$$

where $g(z_t) = \theta z_t + \gamma\,[\,|z_t| - E|z_t|\,]$ and $z_t = \varepsilon_t / \sqrt{h_t}$.
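The recursions in Eqs. (8) and (9) can be made concrete for the (1, 1) case with a short simulation sketch. This is illustrative only and is not the estimation code used for the results below: the function names are hypothetical, the starting values (the unconditional variance for GARCH and the unconditional mean of $\ln h_t$ for EGARCH) are assumptions, `gamma_g` stands for the $\gamma$ inside $g(\cdot)$ to distinguish it from the coefficient $\gamma_1$, and $E|z_t| = \sqrt{2/\pi}$ uses the normality of $z_t$.

```python
import math
import random


def simulate_garch11(n, omega, alpha, gamma, seed=0):
    """Simulate eps_t ~ N(0, h_t) with h_t = omega + alpha * eps_{t-1}^2 + gamma * h_{t-1}."""
    rng = random.Random(seed)
    h = omega / (1.0 - alpha - gamma)        # start at the unconditional variance (assumption)
    eps_path, h_path = [], []
    for _ in range(n):
        eps = math.sqrt(h) * rng.gauss(0.0, 1.0)
        eps_path.append(eps)
        h_path.append(h)
        h = omega + alpha * eps ** 2 + gamma * h
    return eps_path, h_path


def simulate_egarch11(n, omega, alpha, gamma, theta, gamma_g, seed=0):
    """Simulate EGARCH(1,1): ln h_t = omega + alpha * g(z_{t-1}) + gamma * ln h_{t-1}."""
    rng = random.Random(seed)
    mean_abs_z = math.sqrt(2.0 / math.pi)    # E|z| for a standard normal z
    log_h = omega / (1.0 - gamma)            # unconditional mean of ln h_t, since E g(z) = 0
    eps_path, h_path = [], []
    for _ in range(n):
        h = math.exp(log_h)                  # positivity holds by construction
        z = rng.gauss(0.0, 1.0)
        eps_path.append(math.sqrt(h) * z)
        h_path.append(h)
        g = theta * z + gamma_g * (abs(z) - mean_abs_z)
        log_h = omega + alpha * g + gamma * log_h
    return eps_path, h_path
```

The term $\theta z_t$ lets negative and positive shocks move volatility differently, which is why EGARCH can accommodate the asymmetry that the symmetric $\varepsilon_{t-i}^2$ term in GARCH cannot.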
These models will serve multiple illustrative purposes in what follows. In order to facilitate direct comparison with Chen and Kuan (2002), for all GARCH(1, k) and EGARCH(1, k) models listed below, the value of k is chosen to be the maximum k ($k \le 5$) such that their TR statistic is significant at the 5% level.

We first consider the question of whether these nonlinear models differ in terms of their goodness-of-fit for a given series, as these models are often considered equal on the basis of "goodness-of-fit" criteria. However, goodness-of-fit criteria such as $R^2$ are correlation-based. Out of concern that the "equality" of models might be an artefact of using correlation-based measures of fit, we therefore compute the entropy measure with $f = f(\hat Y_t, Y_t)$ as the joint density of the predicted and actual excess returns, and $f_1 = f(\hat Y_t)$ and
Table 5
Ŝ_ρ measures of goodness-of-fit along with their resampled 90% interval estimates

Series                   Ŝ_ρ     [pct5, pct95]
DJIA [EGARCH(1,1)]       0.07    [0.06, 0.15]
DJIA [EGARCH(1,k)]       0.06    [0.05, 0.16]
DJIA [GARCH(1,1)]        0.07    [0.06, 0.13]
DJIA [GARCH(1,k)]        0.07    [0.06, 0.14]
NASDAQ [EGARCH(1,1)]     0.12    [0.09, 0.25]
NASDAQ [EGARCH(1,k)]     0.12    [0.09, 0.29]
NASDAQ [GARCH(1,1)]      0.12    [0.10, 0.23]
NASDAQ [GARCH(1,k)]      0.12    [0.09, 0.21]
NYSE [EGARCH(1,1)]       0.06    [0.05, 0.09]
NYSE [EGARCH(1,k)]       0.05    [0.05, 0.09]
NYSE [GARCH(1,1)]        0.06    [0.05, 0.09]
NYSE [GARCH(1,k)]        0.06    [0.05, 0.10]
PETECH [EGARCH(1,1)]     0.16    [0.14, 0.19]
PETECH [EGARCH(1,k)]     0.16    [0.13, 0.19]
PETECH [GARCH(1,1)]      0.16    [0.14, 0.20]
PETECH [GARCH(1,k)]      0.16    [0.14, 0.21]
RS2000 [EGARCH(1,1)]     0.08    [0.05, 0.14]
RS2000 [EGARCH(1,k)]     0.07    [0.05, 0.14]
RS2000 [GARCH(1,1)]      0.08    [0.06, 0.14]
RS2000 [GARCH(1,k)]      0.08    [0.06, 0.15]
S&P500 [EGARCH(1,1)]     0.08    [0.06, 0.10]
S&P500 [EGARCH(1,k)]     0.07    [0.06, 0.11]
S&P500 [GARCH(1,1)]      0.08    [0.06, 0.10]
S&P500 [GARCH(1,k)]      0.07    [0.06, 0.10]
$f_2 = f(Y_t)$ as the respective marginal densities. If predicted and actual returns are independent, this metric will yield the value zero, and it will increase as the model's predictive ability improves. In order to determine whether or not two models' measures of goodness-of-fit differ statistically, we require the sampling distribution of the goodness-of-fit measure itself. To obtain the percentiles for our goodness-of-fit statistic we employed the stationary bootstrap of Politis and Romano (1994) and 399 bootstrap replications. Results are summarized in Table 5.

Given the results summarized in Table 5, it is evident that there does not appear to be any significant difference between models in terms of their fidelity to the data for a given series. This common finding leads naturally to residual-based specification testing, to which we now proceed.

4.4. Assessing model specification

Next, we focus on using $S_\rho$ as a diagnostic tool. Under the null of correct model specification, the model residuals would be indistinguishable from white noise at all lags. Table 6 reports the associated P-values for the entropy-based test for serial independence at various lags on time-series models' residuals using 99 permutation replications, the Ljung–Box Q test, the modified BDS test, and those for the Chen and Kuan (2002) TR test. It
Table 6
Test results from various tests for serial independence at various lags on time-series models' residuals

            S_ρ entropy test                       Q test    BDS test                        TR test
Series      k=1    k=2    k=3    k=4    k=5        k=10      n=1    n=2    n=3    n=4        k=1    k=2    k=3    k=4    k=5
GARCH(1,1)
DJIA        0.00*  0.00*  0.00*  0.41   0.59       0.17      0.55   0.07   0.21   0.12       3.19*  3.28*  0.99   1.16   1.26
NASDAQ      0.00*  0.00*  0.49   0.82   0.30       0.98      0.68   1.90   1.88   2.05*      5.52*  4.15*  2.11*  0.36   2.02*
NYSE        0.00*  0.00*  0.00*  0.52   0.01*      0.06      1.29   0.49   0.69   0.42       2.61*  4.65*  0.72   1.77   1.81
PETECH      0.00*  0.00*  0.05*  0.88   0.46       0.44      0.62   0.19   0.25   0.42       4.45*  3.59*  2.40*  0.76   1.25
RS2000      0.00*  0.01*  0.13   0.51   0.58       0.66      2.55*  3.68*  3.69*  3.89*      3.23*  3.26*  2.56*  0.32   0.83
S&P500      0.00*  0.00*  0.07   0.34   0.01*      0.28      1.69   0.81   0.79   0.30       3.38*  4.65*  0.89   1.91   2.59*
GARCH(1,k)
DJIA        0.00*  0.00*  0.00*  0.23   0.47       0.14      1.20   1.49   1.61   1.32       2.93*  2.80*  0.72   1.03   1.29
NASDAQ      0.00*  0.00*  0.19   0.72   0.54       0.96      0.09   0.25   0.20   0.18       5.38*  3.81*  2.23*  0.45   1.86*
NYSE        0.00*  0.00*  0.01*  0.31   0.03*      0.05*     1.41   1.55   1.72   1.32       2.75*  4.43*  0.46   1.38   1.65
PETECH      0.00*  0.00*  0.05*  0.94   0.31       0.41      0.85   0.65   0.45   0.04       4.41*  3.19*  2.22*  0.73   1.30
RS2000      0.01*  0.01*  0.34   0.10   0.20       0.51      1.77   2.03*  2.09*  2.01*      3.27*  3.04*  2.62*  0.13   0.37
S&P500      0.00*  0.00*  0.20   0.33   0.00*      0.25      1.67   1.75   1.77   1.14       3.63*  4.06*  0.72   1.74   2.47*
EGARCH(1,1)
DJIA        0.00*  0.25   0.60   0.43   0.85       0.18      1.04   0.74   0.87   0.79       1.96*  1.98*  0.14   0.41   0.54
NASDAQ      0.00*  0.00*  0.65   0.60   0.53       0.99      0.42   1.56   1.55   1.73       3.70*  2.72*  1.03   1.29   1.27
NYSE        0.00*  0.06   0.12   0.75   0.08       0.12      1.83   1.21   1.41   1.08       1.11   3.27*  0.31   0.77   0.81
PETECH      0.00*  0.00*  0.33   0.32   0.68       0.59      1.33   0.66   0.50   0.31       3.09*  2.06*  1.28   0.21   0.26
RS2000      0.01*  0.02*  0.48   0.26   0.84       0.69      2.79*  3.88*  3.89*  4.26*      1.56   1.89   1.96*  1.10   0.14
S&P500      0.00*  0.00*  0.67   0.61   0.06       0.21      2.57*  1.83   1.90   1.39       1.98*  3.01*  0.27   0.92   1.65
EGARCH(1,k)
DJIA        0.00*  0.29   0.10   0.22   0.73       0.13      0.83   0.66   0.72   0.56       0.10   0.40   0.52   0.92   0.83
NASDAQ      0.16   0.06   0.44   0.79   0.58       0.94      1.00   0.50   0.67   0.57       0.69   0.06   1.11   0.29   1.07
NYSE        0.01*  0.10   0.04*  0.42   0.05*      0.07      0.95   0.74   0.96   0.61       0.75   0.13   0.22   1.31   1.02
PETECH      0.13   0.00*  0.34   0.48   0.19       0.59      1.18   0.65   0.64   0.48       1.11   0.42   0.63   0.62   1.09
RS2000      0.02*  0.11   0.57   0.03*  0.66       0.48      1.48   1.47   1.32   1.73       0.79   0.13   1.90   0.95   1.31
S&P500      0.00*  0.10   0.50   0.57   0.13       0.36      1.17   0.96   1.14   0.84       0.51   0.25   0.65   2.01*  0.98

For the Ŝ_ρ and Q tests we present P-values for the null of correct specification, while for the BDS and TR tests we present the actual statistics and flag those values which are significant with an asterisk. Therefore, for comparison purposes, all entries which are significant at the α = 0.05 level are marked with an asterisk (i.e., P-values and actual statistics). Throughout we use k to denote lag, and n to denote embedding dimension.
should be noted that time-reversibility is only necessary for iid-ness. Due to consistency for the original series, our test will have power against both symmetric and asymmetric alternatives, as well as such time-irreversible processes as STAR, SETAR, EGARCH and many regime switching models.

Table 6 reveals that the correlation-based Q statistic almost uniformly fails to have power in the direction of misspecification for both the GARCH and EGARCH models. The "robust BDS" test performs better, though relative to the TR test, the BDS also fails to have power in a number of instances. The competing TR test performs quite well, though we note that it too lacks power for several cases of EGARCH(1, k). In particular, Chen and Kuan (2002), on the basis of their reversibility tests, conclude that expanded EGARCH specifications are correct, further noting that "the [proposed] test detects volatility asymmetry that cannot be detected by the BDS test [...] providing more information regarding how a model should be refined" (Chen and Kuan, 2002, p. 577). Table 6 reveals that the correlation-based Q test, the chaos-based BDS test, and the characteristic function-based TR test fail to reject the EGARCH(1, k) specification across series. In contrast, the entropy-based TR test detects misspecification across EGARCH(1, k) models for every series at lags 1 and/or 2 at the 5% level except the NASDAQ (though the appropriateness of this model is rejected at lag 2 at the 10% level). As demonstrated in our Monte Carlo evidence in Section 3, this is not due to any size distortion.
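The resampled intervals reported in Table 5 rely on the stationary bootstrap of Politis and Romano (1994), which resamples blocks of geometrically distributed length so that the pseudo-series remains stationary. A minimal sketch of that resampling scheme follows; the restart probability `p` (mean block length 1/p), the percentile convention, and the function names are illustrative assumptions rather than the settings used for the reported results.

```python
import random


def stationary_bootstrap(x, p, rng):
    """One stationary-bootstrap resample of x; block lengths are geometric with mean 1/p."""
    n = len(x)
    idx = rng.randrange(n)
    out = []
    for _ in range(n):
        out.append(x[idx])
        # With probability p start a new block at a random position,
        # otherwise continue the current block, wrapping around the end.
        idx = rng.randrange(n) if rng.random() < p else (idx + 1) % n
    return out


def percentile_interval(x, statistic, p=0.1, n_boot=399, seed=0):
    """5th and 95th percentiles of `statistic` over stationary-bootstrap resamples."""
    rng = random.Random(seed)
    draws = sorted(statistic(stationary_bootstrap(x, p, rng)) for _ in range(n_boot))
    lo = draws[max(0, round(0.05 * (n_boot + 1)) - 1)]
    hi = draws[min(n_boot - 1, round(0.95 * (n_boot + 1)) - 1)]
    return lo, hi
```

Here `statistic` would be any function of a series, for example a goodness-of-fit measure computed from predicted and actual returns; smaller values of `p` preserve more of the serial dependence within each block.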
4.5. Assessing time-reversibility

As has been noted in the literature, time-irreversibility is the norm rather than the exception for nonlinear (financial) time-series (see Tong, 1990). The time-reversibility hypothesis for these same original return series is rejected by Chen and Kuan (2002). Table 7 contains our findings for the time-reversibility properties of the original series as well as their standardized residuals.

For the original series (first panel) we find similar results by looking at $k = 1, 2, \ldots, 5$: every return series except the NASDAQ and RS2000 fails the TR test at some lag. Interestingly, S&P500 fails at small lags and at lag 5, whereas DJIA fails marginally only at the first lag.

Several panels focus on the standardized residuals of the models described above. We see that the GARCH models perform badly since they appear to induce irreversibility in all the residual series, and at many more lags! A portmanteau test will not be necessary in these cases. EGARCH(1, 1) does considerably better, but still fails for the NYSE and PETECH residuals. EGARCH(1, k) does much better, only coming close to the borders of rejection at some lags for NASDAQ ($k = 3$) and RS2000 ($k = 1$). The P-values indicate that the residuals of the EGARCH(1, k) generally enjoy the time-reversibility property with greater confidence than the original series.

The performance of the entropy tests points to good power, as indicated earlier and as shown in Skaug and Tjøstheim (1993) for the original series. (The asymptotic distribution of the proposed entropy tests for the standardized residuals is yet to be developed. Experience has shown that such asymptotic approximations are generally poor and that the bootstrap and other approximations to the permutation tests remain the procedure of choice. See, for example, Skaug and Tjøstheim (1996) on this issue, and Section 3.) We conclude that the relative failure to detect this misspecification by competing tests would merely reflect their lack of
Table 7
P-values for the entropy-TR tests

Series      k=1    k=2    k=3    k=4    k=5
Series
DJIA        0.08   0.17   0.82   0.68   0.24
NASDAQ      0.41   0.32   0.52   0.79   0.72
NYSE        0.17   0.06   0.74   0.76   0.38
PETECH      0.02   0.05   0.12   0.52   0.58
RS2000      0.34   0.12   0.11   0.35   0.49
S&P500      0.04   0.03   0.52   0.44   0.08
GARCH(1,1) Residuals
DJIA        0.15   0.07   0.76   0.59   0.95
NASDAQ      0.12   0.00   0.66   0.48   0.52
NYSE        0.00   0.00   0.35   0.67   0.46
PETECH      0.00   0.00   0.09   0.82   0.43
RS2000      0.04   0.04   0.40   0.45   0.55
S&P500      0.13   0.00   0.76   0.42   0.44
GARCH(1,k) Residuals
DJIA        0.08   0.05   0.80   0.73   0.76
NASDAQ      0.09   0.02   0.52   0.41   0.53
NYSE        0.00   0.01   0.10   0.79   0.48
PETECH      0.00   0.00   0.09   0.55   0.45
RS2000      0.04   0.11   0.34   0.45   0.65
S&P500      0.03   0.01   0.87   0.45   0.59
EGARCH(1,1) Residuals
DJIA        0.30   0.39   0.83   0.69   0.94
NASDAQ      0.37   0.24   0.68   0.70   0.98
NYSE        0.06   0.11   0.90   0.77   0.46
PETECH      0.09   0.01   0.49   0.85   0.99
RS2000      0.26   0.23   0.66   0.58   0.94
S&P500      0.34   0.30   0.99   0.67   0.81
EGARCH(1,k) Residuals
DJIA        0.77   0.79   0.43   0.51   0.79
NASDAQ      0.49   0.35   0.12   0.45   0.65
NYSE        0.54   0.90   0.70   0.79   0.56
PETECH      0.69   0.41   0.77   0.63   0.80
RS2000      0.10   0.80   0.43   0.39   0.55
S&P500      0.55   0.82   0.49   0.45   0.97
power in some directions. Relative to its peers, the entropy-based test has two features to recommend its use as a diagnostic tool:

1. It has generally higher power than competing correlation-based and even characteristic-function-based tests.
2. It provides surprisingly strong indications of where models fail (e.g., at lags 1 and 2), and hence provides prescriptive advice for model refinement in that limited sense.

One constructive way to proceed would be to use our statistic as a goodness-of-fit measure, as was done above, to choose between competing models that accommodate time-irreversibility, such as the STAR, SETAR, EGARCH and switching regression models. This may be an informed selection process where rejections at specific lags, as above,
guide the choice of competing models. Additionally, we may employ our entropy statistic as a measure of out of sample predictive performance, as was done in Maasoumi and Racine (2002) in a similar context. These substantive specification searches are beyond the scope of the current paper. 5. Conclusion We consider the application of a versatile metric entropy for detecting departures from serial independence, and time-reversibility of series and residuals of some popular models to aid in the construction of parametric time-series models. Applications indicate this ‘‘diagnostic’’ approach may offer unusually constructive prescriptions for model specification, both when applied to the original series and the residuals of models, and in conjunction with its use as a measure of fit and predictive performance. Acknowledgements The authors would like to thank Dee Dechert, Thomas Cover, Yi-Ting Chen, Adrian Pagan, the co-editors of this volume, referees, and participants at the IEE Conference in Honor of Arnold Zellner (September 2003, American University) and the Canadian Econometric Study group (September 2004) for useful comments and suggestions. We are indebted to Cong Li for his research assistance. The usual caveat applies. References Akaike, H., 1981. Likelihood of a model and information criteria. Journal of Econometrics 16, 3–14. Bai, J., Ng, S., 2001. A consistent test for conditional symmetry. Journal of Econometrics 103, 225–258. Barnett, B., Gallant, A.R., Hinich, M.J., Jungeilges, J.A., Kaplan, D.T., Jensen, M.J., 1997. A single-blind controlled competition among tests for nonlinearity and chaos. Journal of Econometrics 82, 157–192. Belaire-Franch, J., Contreras, D., 2002. How to compute the BDS test: a software comparison. Journal of Applied Econometrics 17, 691–699. Belaire-Franch, J., Peiro, A., 2003. Conditional and unconditional asymmetry in U.S. macroeconomic time series. Studies in Nonlinear Dynamics and Econometrics 7 Article 4. Bollerslev, T., 1986. 
Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 307–327. Brock, W.A., Dechert, W.D., Scheinkman, J.A., LeBaron, B., 1996. A test for independence based on the correlation dimension. Econometric Reviews 15 (3), 197–235. Chen, Y., 2003. Testing serial independence against time irreversibility. Studies in Nonlinear Dynamics & Econometrics 7, 1–28. Chen, Y., Kuan, C., 2002. Time irreversibility and EGARCH effects in US stock index returns. Journal of Applied Econometrics 17, 565–578. Chen, Y., Chou, R., Kuan, C., 2000. Testing time reversibility without moment restrictions. Journal of Econometrics 95, 199–218. Clements, M.P., Franses, P.H., Smith, J., Van Dijk, D., 2002. On SETAR non-linearity and forecasting. Journal of Forecasting 22, 359–375. Cromwell, J.B., Labys, W.C., Terraza, M., 1994. Univariate Tests for Time Series Models. Sage, Beverly Hills, CA. Darolles, S.J., Florens, J.-P, Gourie´roux, C., 2004. Kernel-based nonlinear canonical analysis and time reversibility. Journal of Econometrics 119(2), 323–353. Diks, C., Manzan, S., 2002. Tests for serial independence and linearity based on correlation integrals. Studies in Nonlinear Dynamics and Econometrics 6. Golan, A., 2002. Information and entropy econometrics: Editor’s view. Journal of Econometrics 107, 1–15.
ARTICLE IN PRESS J.S. Racine, E. Maasoumi / Journal of Econometrics 138 (2007) 547–567
Golan, A., Perloff, J., 2002. Comparison of maximum entropy and higher-order entropy estimators. Journal of Econometrics 107, 195–211. Granger, C., Maasoumi, E., Racine, J.S., 2004. A dependence metric for possibly nonlinear time series. Journal of Time Series Analysis 25 (5), 649–669. Hallin, M., Lefevre, C., Puri, M., 1988. On time-reversibility and the uniqueness of moving average representations for non-gaussian stationary time series. Biometrika 75, 170–171. Hong, Y.M., White, H., 2005. Asymptotic distribution theory for nonparametric entropy measures of serial dependence. Econometrica 73 (3), 837–902. Ihaka, R., Gentleman, R., 1996. R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics 5 (3), 299–314. Kanzler, L., 1999. Very fast and correctly sized estimation of the BDS statistic. Christ Church and Department of Economics, University of Oxford. Ljung, G., Box, G., 1978. On a measure of lack of fit in time series models. Biometrika 65, 297–303. Maasoumi, E., Racine, J.S., 2002. Entropy and predictability of stock market returns. Journal of Econometrics 107 (2), 291–312. Nelson, D.B., 1991. Conditional heteroskedasticity in asset returns: a new approach. Econometrica 59 (2), 347–370. Nelson, R., Plosser, C., 1982. Trends and random walks in macro-economic time series some evidence and implications. Journal of Monetary Economics 10, 139–162. Parzen, E., 1962. On estimation of a probability density function and mode. The Annals of Mathematical Statistics 33, 1065–1076. Politis, D.N., Romano, J.P., 1994. The stationary bootstrap. Journal of the American Statistical Association 89, 1303–1313. Robinson, P.M., 1991. Consistent nonparametric entropy-based testing. Review of Economic Studies 58, 437–453. Skaug, H., Tjøtheim, D., 1993. Nonparametric tests of serial independence. In: Rao, S. (Ed.), Developments in Time Series Analysis. Chapman & Hall, London, pp. 207–229. Skaug, H., Tjøstheim, D., 1996. 
Testing for serial independence using measures of distance between densities. In Robinson, P, Rosenblatt, M. (Eds.), Athens Conference on Applied Probability and Time Series. Springer Lecture Notes in Statistics. Springer, Berlin. Tong, H., 1990. Nonlinear Time Series—A Dynamic System Approach. Oxford University Press, Oxford. Weiss, G., 1975. Time-reversibility of linear stochastic processes. Journal of Applied Probability 12, 831–836.
Further Reading Brock, W.A., Dechert, W.D., Scheinkman, J.A., 1987. A test for independence based on the correlation dimension. University of Wisconsin-Madison Social Systems Research Institute Working Paper 8702, University of Wisconsin-Madison Social Systems Research Institute. Prichard, D., Theiler, J., 1995. Generalized redundancies for time series analysis. Physica D 84, 476–493. Silverman, B.W., 1986. Density Estimation for Statistics and Data Analysis. Chapman & Hall, London.
Journal of Econometrics 136 (2007) 483–508 www.elsevier.com/locate/jeconom
Growth and convergence: A profile of distribution dynamics and mobility
Esfandiar Maasoumi (a), Jeff Racine (b), Thanasis Stengos (c)
(a) Department of Economics, Southern Methodist University, Dallas, TX 75275-0496, USA
(b) Department of Economics, McMaster University, Hamilton, Ont, Canada L8S 4M4
(c) Department of Economics, University of Guelph, Guelph, Ont, Canada N1G 2W1
Available online 14 March 2006
Abstract

In this paper we focus primarily on the dynamic evolution of the world distribution of growth rates in per capita GDP. We propose new concepts and measures of ‘‘convergence’’ or ‘‘divergence’’ that are based on entropy distances and dominance relations between groups of countries over time. We update the sample period to include the most recent decade of data available, and we offer traditional parametric and new nonparametric estimates of the most widely used growth regressions for two important subgroups of countries, OECD and non-OECD. Traditional parametric models are rejected by the data; however, using robust nonparametric methods we find strong evidence in favor of ‘‘polarization’’ and ‘‘within group’’ mobility. © 2006 Elsevier B.V. All rights reserved.

JEL classification: C13; C21; C22; C23; C33; D30; E13; F43; Q30; Q41
Keywords: Growth; Convergence; Distribution dynamics; Entropy; Stochastic dominance; Nonparametric; International cross-section
Corresponding author. Tel.: +1 214 768 4298. E-mail address: [email protected] (E. Maasoumi). 0304-4076/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.jeconom.2005.11.012

1. Introduction

Recent research on growth and convergence has provided a fertile interface between economic theorists, empirical economists and, increasingly, modern econometricians. It is now more widely accepted that the research effort in this area should be directed less toward questions of whether realizations from, or moments of, the distribution of growth
rates converge, and more to questions concerning the ‘‘laws’’ that generate the distribution of growth rates, or incomes, and their evolution over time. This focus on whole distributions would hide less of the pertinent facts, and is more conducive to learning the nature and degree of what appears to be an ‘‘unconditional’’ divergence in growth rates and incomes. There is a well established tradition for our approach in the ‘‘income distribution’’ literature where ranking of distributions by, for example, Lorenz and Stochastic Dominance criteria, and the study of mobility, are well developed. Quah’s work is rightly associated with the introduction of the distribution approach in the ‘‘growth convergence’’ literature; see Quah (1993, 1997).

In this paper we focus on significant features of the probability laws that generate growth rates that go beyond both the ‘‘β-convergence’’ and ‘‘σ-convergence.’’ It is perhaps necessary to emphasize how narrow these two concepts are. The former concept refers to the possible equality of a single coefficient of a variable in the conditional mean of a distribution of growth rates! The latter, while being derivative of a commonplace notion of ‘‘goodness of fit,’’ also is in reference to the mere fit of a conditional mean regression, and is additionally rather defunct when facing nonlinear, nongaussian, or multimodal distributions commonly observed for growth and income distributions. We will examine the entire distribution of growth rates, as well as the distributions of parametrically and nonparametrically fitted and residual growth rates relative to a space of popular conditioning variables in this literature. New concepts of convergence and ‘‘conditional convergence’’ emerge as we introduce new entropy measures of distance between distributions to statistically examine a deeper question of convergence or divergence.
Some of our findings may be viewed as alternative quantifications and characterizations of the distributional dynamics discussed in Quah (1993, 1997). Quah focuses on the distribution of per capita incomes (and relative incomes) for the same panel of countries in the world. He examines diffusion processes for the probability law generating these incomes, and a measure of ‘‘transition probabilities,’’ the stochastic kernel, to examine the evolution of the relative per capita incomes. On the other hand, we examine the distribution of the growth rates themselves, and use entropy distance metrics that reveal divergences, reflect the nature of divergences, and are closely related to welfare-theoretic notions of income mobility embodied in the inequality reducing measures of Shorrocks–Maasoumi–Zandvakili; see Maasoumi (1998). Our findings are largely based on distributional dynamics and conform more closely with the theoretical models which take cross-country interactions into account (such as in Lucas, 1993; Quah, 1997) or which allow for elements of multiple regimes and certain types of nonconvexities (as in Durlauf and Johnson, 1995). Employing recent techniques for handling mixed discrete/continuous variables, we also present new nonparametric estimates of the growth rate distributions (see Li and Racine, 2003, 2004; Racine and Li, 2004; Hall et al., 2004).

While we strongly agree with Quah on the limitations of the traditional panel regression (conditional mean) analysis in this area, we do connect to, and accommodate, the current literature by applying our nonparametric techniques to the estimation of the most widely analyzed extended form of the original Solow–Swan regression model (as in Mankiw et al., 1992). But here too we offer a different (entropy) measure of ‘‘fit’’ for these regressions which may be viewed as an enhancement of the concept of σ-convergence since it involves many more moments than just the variance.
Making summary statements with conditional means (averages) is not without value, but
our modest message is that one can make better statements and one must caution that some distributions are poorly summarized by their means and/or variances. The availability of data on a number of important dimensions that describe domestic economic activity in a given country and the collection of these individual country data into an international data source, such as in Summers and Heston (1988) and King and Levine (1993), has allowed a systematic examination of cross-country growth regressions. Focusing on the conditional means, the vast majority of the contributions to the empirics of economic growth have assumed that the main attributes that characterize growth such as physical and human capital exert the same effect on economic growth both across countries (intratemporally) and across time (intertemporally) and have assumed a (log) linear relationship (see Barro, 1991; Barro and Sala-i-Martin, 1995). There have been some recent studies that question the assumption of linearity and propose nonlinear alternatives that allow for multiple regimes of growth patterns among different countries. These models are consistent with the presence of multiple steady-state equilibria that classify countries into different groups with different convergence characteristics (see Quah, 1996 for a discussion of the evidence against the convergence hypothesis that underlies the standard approach). In this context, Bernard and Durlauf (1996) offer an explanation for the apparent strong evidence in favor of the convergence hypothesis (see Mankiw et al., 1992). They argue that the convergence properties for all countries in the misspecified linear model are inherited from the convergence of a group of countries associated with a common steady state in the correctly specified multiple regime growth model. 
Motivated by recent theories emphasizing threshold externalities (Azariadis and Drazen, 1990), Durlauf and Johnson (1995) postulate that countries obey different laws of motion to the steady-state. They employ regression tree methodology and divide countries into four subgroups according to their initial level of per capita income and literacy rate. They infer distinct linear laws of motion for the four subgroups. Thus, their work rejects the presumption on which the majority of the cross-country empirical growth literature is based. In particular, they find substantial differences in their estimate of the coefficient for the secondary enrollment ratio: it is insignificant for two of the subsamples and is positive for the other two (it is a third larger in magnitude for the middle income economies as compared to the high income ones). Hansen (2000) uses a threshold regression framework to test for sample splitting between different groups of countries and he finds evidence of such groupings. In a related study using some of the same methods as ours, Liu and Stengos (1999) allow for two nonlinear components, one for the initial level of GDP and the other for the secondary enrollment rate. They find that the presence of nonlinearities were mainly due to groupings of countries according to their level of initial income, whereas the effect of human capital (as measured by the secondary enrollment rate) was in essence linear. As has been pointed out by Durlauf and Quah (1999), the dominant focus in these studies is on certain aspects of estimated conditional means, such as the sign or significance of the coefficient of initial incomes, how it might change if other conditioning variables are included, or with other functional forms for the production function or regressions. 
Many of these empirical models, including panel data regressions, fail to serve as vehicles to identify and distinguish underlying economic theories with sometimes radically different implications and predictions. Many also run counter to observed income distributional dynamics, or are unable to explain them. In addition, all of the above studies rely on ‘‘correlation’’ criteria to assess goodness of fit and to evaluate ‘‘convergence.’’ Our first
step is to rectify this shortcoming, especially when considering nonlinear and/or nonparametric regressions. This we achieve with two entropy measures of fit. The resulting analysis produces ‘‘fitted values’’ of growth rates, as well as ‘‘residual growth rates’’ which will be used for fresh looks at the question of ‘‘conditional’’ convergence. Our nonparametric kernel estimates of conditional growth are free from some of the functional form misspecifications that have been pointed out by various authors in this area. We shed some light on potential nonlinearities in growth relations.

Turning to the main objective of this paper, we examine the relation between growth rate distributions for different country groups, as well as the evolution of the generating law over time, both within and between country groups. The nonparametric density method of Hall et al. (2004) is utilized to analyze these questions. We quantify these distribution distances and movements by entropy measures, and use the latter to examine convergence (conditional and unconditional) as a new statistical hypothesis. Our data are extended beyond previous studies and span the last 35 years of available data.

The plan of the paper is as follows. In Section 2 we present the elements of the traditional ‘‘work horse’’ model of this literature. In Section 3 we propose to fit parametric and nonparametric regression models on the data panel for two different groups of countries, the OECD and the ‘‘rest of the world’’ consisting of the lesser developed countries. We also offer a conditional moment test of the traditional parametric specification. In Section 4 we present the unconditional distribution of the growth rates, and the distribution of their fitted values. Next, we obtain k-class entropies of each distribution, especially for two values of k, the Shannon entropy, and for $k = 1/2$ (see Granger et al. (2004)).
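To make the $k = 1/2$ case concrete, the following is a minimal sketch of a Hellinger-type metric entropy, $S_\rho = \frac{1}{2}\int (f_1^{1/2} - f_2^{1/2})^2\,dx$, which is the normalized form discussed in Granger et al. (2004). Everything specific here is an assumption made for illustration: the simulated growth rates, the Gaussian kernel, the normal-reference bandwidth (the paper relies on data-driven bandwidths) and the trapezoid-rule integration. The function names are hypothetical.

```python
import math
import random


def normal_reference_bandwidth(sample):
    """Rule-of-thumb bandwidth 1.06 * sd * n^(-1/5); a stand-in for data-driven selection."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def kde(sample, h):
    """Return a Gaussian-kernel density estimate as a function of x."""
    scale = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return lambda x: scale * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)


def metric_entropy(sample_a, sample_b, grid_points=401):
    """Trapezoid-rule approximation of 0.5 * integral (sqrt(f_a) - sqrt(f_b))^2 dx."""
    h_a = normal_reference_bandwidth(sample_a)
    h_b = normal_reference_bandwidth(sample_b)
    f_a, f_b = kde(sample_a, h_a), kde(sample_b, h_b)
    pad = 4.0 * max(h_a, h_b)                      # extend the grid past the data
    lo = min(min(sample_a), min(sample_b)) - pad
    hi = max(max(sample_a), max(sample_b)) + pad
    step = (hi - lo) / (grid_points - 1)
    total = 0.0
    for j in range(grid_points):
        x = lo + j * step
        value = (math.sqrt(f_a(x)) - math.sqrt(f_b(x))) ** 2
        weight = 0.5 if j in (0, grid_points - 1) else 1.0
        total += weight * value
    return 0.5 * step * total


rng = random.Random(42)
group_1 = [rng.gauss(0.02, 0.03) for _ in range(150)]   # hypothetical growth rates
group_2 = [rng.gauss(0.02, 0.03) for _ in range(150)]   # same law as group_1
group_3 = [rng.gauss(0.08, 0.03) for _ in range(150)]   # shifted law
d_identical = metric_entropy(group_1, group_1)
d_same_law = metric_entropy(group_1, group_2)
d_shifted = metric_entropy(group_1, group_3)
print(d_identical, d_same_law, d_shifted)
```

The measure is zero when the two estimated densities coincide and grows toward one as they separate, which is what makes it usable as a distance between the growth distributions of two country groups.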
Our approach is appealing because the distribution of growth rates across countries and time cannot be successfully summarized by their variances alone (unless they are normal). Additionally, the fit of these models is assessed by a metric entropy measure of distance between the actual and fitted distributions for each country group. We report the entropy distance between the two groups of countries (both for fitted and actual growth rates). The distance based on ‘‘raw’’ growth rates is a new measure of unconditional convergence. The one based on the fitted values is a new measure of ‘‘conditional convergence.’’ These entropies and entropy distances reveal how far apart (dispersed) the economies are within each group and between the two groups. If indeed there is statistically significant convergence to a common steady-state then one expects that these distance measures ‘‘shrink’’ in size as one moves from the 1960s to the 1970s through the 1990s. We find that the empirical evidence is compatible with bipolar development and ‘‘clubs.’’ Contrary to commonly assumed models, the evolution of these distances or laws may not be ‘‘linear.’’ For example, it may be that the distance first decreases and then increases.

Within each group, even if one finds β-convergence (the coefficient of initial income may be negative, signifying that a country with a lower GDP will have higher growth rates thereby catching up with the rest of the countries in the same group), entropy within each group will reveal any unequal pattern of growth rates (conditional and unconditional). If the growth rates are roughly equal, entropy will take its maximum value (log N, in the case of Shannon’s, where N is the number of countries in the group). Thus we are able to reveal more of the growth mobility dynamics even within groups. This offers an examination of mobility dynamics which tells us how distributions change and by how much, in the sense of Shorrocks–Maasoumi.
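The log N bound mentioned above can be checked with a short sketch. It assumes, purely for illustration, that entropy is computed over shares formed from strictly positive quantities (here hypothetical gross growth factors, since raw growth rates can be negative and cannot be turned into shares directly). The paper's own entropies are computed from estimated densities, so this is only a discrete analogue.

```python
import math


def shannon_entropy(values):
    """Shannon entropy -sum(p * log p) of positive values normalised to shares."""
    if any(v <= 0 for v in values):
        raise ValueError("values must be strictly positive to form shares")
    total = sum(values)
    shares = [v / total for v in values]
    return -sum(p * math.log(p) for p in shares)


equal_growth = [1.03] * 5                         # hypothetical: identical growth factors
unequal_growth = [1.40, 1.01, 1.00, 0.90, 1.05]   # hypothetical: dispersed growth factors
h_equal = shannon_entropy(equal_growth)
h_unequal = shannon_entropy(unequal_growth)
print(h_equal, math.log(5), h_unequal)
```

With five equal entries the entropy equals log 5, and any dispersion among the entries lowers it.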
In other words we are able to capture nonlinearities in the growth dynamics of different income classes (heterogeneity in the growth paths).
Quah (1997), looking at the per capita incomes, examines the probabilities of related transitions. This approach captures the cross-sectional heterogeneity and the tendency towards polarization of the cross-country distribution.1 The two approaches are clearly interconnected and complementary but different. Maasoumi (1998) sheds light on the relation between these two notions of mobility. Our reported entropies in the distributions of growth rates and model residuals for all countries and both groups reveal why it has been false to assert convergence, in any sense, without grouping of countries. What the proposed approach does that has not been done before is to define, measure, and test for convergence in the probability laws that generate cross-country growth rates, explicitly allow for heterogeneity between different country groups, and base inferences on more robust nonparametric estimators.2
1 Fiaschi and Lavezzi (forthcoming) have tried to combine the two approaches in a Markov transition matrix framework. However, their approach suffers from the complexity of the state space in terms of both income levels and growth rates, since there is no natural way to obtain its partition ex-ante.
2 Quah (1996, 1997) looks at the distributions of per capita incomes and its various transformations, and their evolution into a bipolar set. Quah’s work is similar in spirit to ours but does not offer measures of ‘‘distance’’ between distributions, as we do.

2. The traditional parametric setting

It is helpful to first present the mechanics of the traditional regression models of the conditional mean of the distribution which will be the primary focus of our work. This regression has been the main focus in the literature. Our recollection in this section helps to identify some popular conditioning variables. But we also offer some advances in the analysis of this conditional mean which would be helpful when one wishes to make statements that are useful ‘‘on average’’ for sufficiently homogeneous country groups.

Mankiw et al. (1992) assume a production function of the form $Y_t = K_t^{\alpha} H_t^{\beta} (A_t L_t)^{1-\alpha-\beta}$, where $Y$, $K$, $H$, and $L$ represent total output, physical capital stock, human capital stock and labor, respectively, and $A$ is a technological parameter. Technology is assumed to grow exponentially at the rate $\phi$, or $A_t = A_0 e^{\phi t}$. By linearizing the transition path around the steady-state, they derive the path of output per effective worker $\tilde{y}$ ($\tilde{y} = Y/AL$) between time period $T$ and $T + r$ as follows:

$$\ln \tilde{y}_{T+r} = \theta \ln \tilde{y}^{*} + (1 - \theta) \ln \tilde{y}_{T}, \tag{1}$$

where $\theta = (1 - e^{-\lambda r})$, $\lambda$ is the rate of convergence and $\tilde{y}^{*}$ is the steady-state level of output per effective worker. In order to derive the growth of output per worker $(Y/L)$, they substitute for the steady-state level of output per worker ($\ln \tilde{y}^{*} = \alpha \ln k^{*} + \beta \ln h^{*}$), noting that the steady-state levels of capital per effective worker ($k^{*}$) and human capital per effective worker ($h^{*}$) depend on the share of output devoted to physical capital accumulation ($s^{k}$), the share of output devoted to human capital accumulation ($s^{h}$), the growth of the labor force ($n$), and the depreciation rate for (human and physical) capital ($\delta$). Finally, the growth of output per worker between period $T$ and $T + r$ of country $i$ is obtained by noting that $\ln \tilde{y}_{T} = \ln (Y/L)_{T} - \ln A_0 - \phi T$ and subtracting initial income from both sides of (1) to arrive at

$$\ln\left(\frac{Y}{L}\right)_{i,T+r} - \ln\left(\frac{Y}{L}\right)_{i,T} = \phi r + \theta(\ln A_0 + \phi T) + \theta\frac{\alpha}{1-\alpha-\beta}\ln s^{k}_{i} - \theta\frac{\alpha+\beta}{1-\alpha-\beta}\ln(n_i + \phi + \delta) + \theta\frac{\beta}{1-\alpha-\beta}\ln s^{h}_{i} - \theta\ln\left(\frac{Y}{L}\right)_{i,T}. \tag{2a}$$

Mankiw et al. (1992, p. 418) point out that the steady-state level of output per worker can also be expressed in terms of the (steady-state) level of human capital ($h^{*}$), rather than $s^{h}$. In this case, the growth of output per worker becomes

$$\ln\left(\frac{Y}{L}\right)_{i,T+r} - \ln\left(\frac{Y}{L}\right)_{i,T} = \phi r + \theta(\ln A_0 + \phi T)\frac{1-\alpha-\beta}{1-\alpha} + \theta\frac{\alpha}{1-\alpha}\ln s^{k}_{i} - \theta\frac{\alpha}{1-\alpha}\ln(n_i + \phi + \delta) + \theta\frac{\beta}{1-\alpha}\ln h^{*}_{i} - \theta\ln\left(\frac{Y}{L}\right)_{i,T}. \tag{2b}$$

As they point out, testing depends on ‘‘... whether the available data on human capital correspond more closely to the rate of accumulation ($s^{h}$) or the level of human capital ($h$).’’ The early literature used data on rates of enrollment corresponding to the model in (2a). More recent contributions have used estimates of the number of years of schooling of the working age population corresponding more closely to the formulation in (2b). Mankiw et al. (1992) estimated the model in (2a) with cross-section data and used the ratio of investment to GDP to measure $s^{k}$ and the secondary enrollment rate (adjusted for the proportion of the population that is of secondary school age) to measure human capital ($s^{h}$). Others have used primary as well as secondary enrollment rates to measure human capital (see Barro and Sala-i-Martin, 1995). As is common with most recent contributions, we employ panel data over seven 5-year periods: 1960–1964, 1965–1969, 1970–1974, 1975–1979, 1980–1984, 1985–1989 and 1990–1994. We estimate the unrestricted versions of the models in (2b) as follows:

$$y_{it} = a_0 + a_1 D_t + a_2 D_j + a_3 \ln s^{k}_{it} + a_4 \ln(n_{it} + \phi + \delta) + a_5 \ln x_{it} + a_6 \ln h_{it} + \varepsilon_{it}, \tag{3}$$
where $y_{it}$ refers to the growth rate of income per capita during each period, $x_{it}$ is per capita income at the beginning of each period, and $h_{it}$ is human capital measured either as a stock or as a flow. $D_t$ and $D_j$ are dummy variables for each period and for certain regions such as Latin America or Sub-Saharan Africa, respectively. The need for dummies to identify the time period over which the model is estimated is clear from Eq. (2b). Regional dummies have been included by many previous researchers to account for idiosyncratic economic conditions in these two regions. Initial income estimates are from the Summers–Heston data base, as are the estimates of the average investment/GDP ratio for each 5-year period. The average growth rate of the per capita GDP and the average annual population growth for
each period are from the World Bank. Finally, the average years of schooling in the population above 15 years of age are obtained from Barro and Lee (2000). Durlauf and Quah (1999) have provided an insightful summary of the empirical results from these regressions, their extensions, and their ability or inability to address the validity and predictions of both the exogenous and endogenous growth theories, with different treatments of human capital and technical assumptions. It is clear that negativity or significance of the impact of initial income in these regressions is insufficient evidence to distinguish between the underlying models/theories. It is the distributional dynamics, or ‘‘mobility’’ characteristics of these economies that are more interesting, less fragile as evidence, and more relevant especially in explaining within group interactions of economies that are either geographically close, or within trade groups, or similar in stage of social and economic development. Nevertheless, we include in the next section our more robust findings regarding the above regression models.

3. Growth regressions and their fit

3.1. Parametric results

We first consider a linear parametric model which has been used to model this relationship. Note that this model is linear and additive in nature, while there is no interaction between the categorical variables (year, OECD status) and the continuous variables. Table 1 summarizes the estimated model (R code and data to reproduce these results are available upon request from the authors).

Table 1
Parametric model summary

               Estimate   Std. error   t-value   Pr(>|t|)
(Intercept)      6.5101       3.8180      1.71     0.0887
OECD             0.0043       0.0044      0.97     0.3311
d1970            0.0001       0.0039      0.02     0.9816
d1975            0.0028       0.0040      0.70     0.4853
d1980            0.0073       0.0040      1.81     0.0712
d1985            0.0238       0.0041      5.78     0.0000
d1990            0.0136       0.0042      3.25     0.0012
d1995            0.0187       0.0043      4.35     0.0000
initgdp          3.3940       2.0025      1.69     0.0906
initgdp2         0.6572       0.3908      1.68     0.0931
initgdp3         0.0558       0.0336      1.66     0.0975
initgdp4         0.0018       0.0011      1.63     0.1043
popgro           0.0172       0.0105      1.63     0.1035
inv              0.0185       0.0023      7.93     0.0000
humancap         0.0007       0.0032      0.21     0.8366
humancap2        0.0011       0.0021      0.51     0.6084
humancap3        0.0005       0.0011      0.45     0.6512

Residual standard error: 0.026 on 599 degrees of freedom, multiple R-squared: 0.2856, adjusted R-squared: 0.2665, F-statistic: 14.97 on 16 and 599 degrees of freedom, p-value: $< 2.2 \times 10^{-16}$.

Before proceeding further, we offer a new test for correct parametric specification of the model summarized in Table 1 since inference based upon incorrectly specified parametric
models will be unreliable. We proceed with a robust consistent nonparametric test developed in Hsiao et al. (forthcoming) which we now briefly outline. If we denote the parametric model by $m(x_i, \gamma)$ and the true but unknown conditional mean by $E(y_i \mid x_i)$, then a test for correct parametric specification is a test of the hypothesis $H_0 : E(y_i \mid x_i) = m(x_i, \gamma)$ almost everywhere versus the alternative $H_1 : E(y_i \mid x_i) \neq m(x_i, \gamma)$ on a set having positive measure. Equivalently, letting $u_i = y_i - m(x_i, \gamma)$, correct specification requires that $E(u_i \mid x_i) = 0$ almost everywhere, with a consequence of incorrect functional specification being that $E(u_i \mid x_i) \neq 0$ on a set with positive measure, or, equivalently, that $[E(u_i \mid x_i)]^2 = 0$ almost everywhere under correct specification and $[E(u_i \mid x_i)]^2 \geq 0$, with strict inequality on a set with positive measure, otherwise. To avoid problems arising from the presence of a random denominator in the nonparametric estimator of $E(u_i \mid x_i)$, a density weighted version given by $[E(u_i \mid x_i)]^2 f(x_i)$ is deployed. To test whether $E(u_i \mid x_i) = 0$ holds over the entire support of the regression function, the statistic $I \stackrel{\mathrm{def}}{=} E\{[E(u_i \mid x_i)]^2 f(x_i)\} \geq 0$ is chosen. Note that $I = 0$ if and only if $H_0$ is true, and $I$ therefore serves as a valid candidate for testing $H_0$. The sample analogue of $I$ is obtained by replacing $u_i$ with the residuals obtained from the parametric null model, $\hat{u}_i = y_i - m(x_i, \hat{\gamma})$, and by replacing $E(u_i \mid x_i)$ and $f(x_i)$ with their consistent kernel estimators, while the null distribution of the statistic is obtained via resampling methods (‘wild-bootstrapping’). This test is directly applicable to the problem at hand involving a mix of discrete and continuous data. The test has been shown to have correct size and, being consistent, to possess good power properties against a wide class of alternative models (see Hsiao et al., forthcoming for further details). Applying this test to the parametric model summarized in Table 1 yields a p-value of $4.07087 \times 10^{-6}$.
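The logic of the test can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of Hsiao et al.: it uses a single continuous regressor, a fixed Gaussian-kernel bandwidth, a straight-line null model, the unstandardized statistic $\frac{1}{n(n-1)}\sum_i\sum_{j\neq i}\hat{u}_i\hat{u}_j K_h(x_i - x_j)$, and a two-point wild bootstrap with mean zero and unit variance. The data are simulated and all function names are hypothetical.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - b * mean_x, b


def kernel_matrix(x, h):
    """Gaussian kernel weights K_h(x_i - x_j) with a zero diagonal (leave-one-out)."""
    c = 1.0 / (h * math.sqrt(2.0 * math.pi))
    n = len(x)
    return [[0.0 if i == j else c * math.exp(-0.5 * ((x[i] - x[j]) / h) ** 2)
             for j in range(n)] for i in range(n)]


def statistic(resid, k):
    """Sample analogue of I: average of u_i * u_j * K_h(x_i - x_j) over pairs i != j."""
    n = len(resid)
    total = sum(resid[i] * sum(k[i][j] * resid[j] for j in range(n)) for i in range(n))
    return total / (n * (n - 1))


def two_point_draw(rng):
    """Two-point wild-bootstrap weight with mean 0 and variance 1."""
    root5 = math.sqrt(5.0)
    if rng.random() < (1.0 + root5) / (2.0 * root5):
        return (1.0 - root5) / 2.0
    return (1.0 + root5) / 2.0


def specification_test(x, y, h, n_boot=99, seed=0):
    """Return (statistic, wild-bootstrap p-value) for the null of a linear conditional mean."""
    rng = random.Random(seed)
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    k = kernel_matrix(x, h)
    observed = statistic(resid, k)
    exceed = 0
    for _ in range(n_boot):
        # Resample under the null: fitted values plus sign/scale-perturbed residuals
        y_star = [fi + ui * two_point_draw(rng) for fi, ui in zip(fitted, resid)]
        a_s, b_s = fit_line(x, y_star)
        resid_star = [yi - (a_s + b_s * xi) for xi, yi in zip(x, y_star)]
        if statistic(resid_star, k) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_boot + 1)


rng = random.Random(1)
x = [rng.uniform(-2.0, 2.0) for _ in range(100)]
y_quadratic = [xi ** 2 + rng.gauss(0.0, 0.5) for xi in x]   # the linear null is false here
stat, p_value = specification_test(x, y_quadratic, h=0.5)
print(stat, p_value)
```

Because the bootstrap samples are generated from the fitted null model, the resampled statistics approximate the null distribution even when the data themselves violate the null, which is why a strongly nonlinear truth produces a small p-value.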
Unsurprisingly, this is extremely strong evidence against the null and indicates parametric misspecification; see Durlauf and Johnson (1995) for similar findings based on other methods. Given that we reject the null of (this) parametric specification, and given the presence of both discrete and continuous data, we choose to proceed with a rather new nonparametric approach.

3.2. Nonparametric results

For what follows, we consider a fully nonparametric local linear specification using the estimator of Li and Racine (2004) that permits us to model the mix of discrete and continuous data types found in the present context. We summarize the nonparametric results using partial regression plots. These plots simply present the estimated multivariate regression function via a series of bivariate plots in which the regressors not appearing on the horizontal axis of a given plot have been held constant at their respective medians. That is, if we wish to present the nonparametric regression of $y$ on $x_1$, $x_2$, and $x_3$, we plot $y$ versus $E(x_1, \bar{x}_2, \bar{x}_3)$, $y$ versus $E(\bar{x}_1, x_2, \bar{x}_3)$, and $y$ versus $E(\bar{x}_1, \bar{x}_2, x_3)$, where the bar denotes a median, which allows one to visualize the multivariate regression surface via a series of two-dimensional plots. One of the appealing features of this approach is that it permits direct comparison of the parametric and nonparametric results.

The profiles presented in Figs. 1 and 2 are constructed using our panel of 616 observations in the following manner. First, least-squares cross-validation is used to obtain the appropriate bandwidths for the discrete and continuous regressors (see Li and Racine, 2004 for details). Next we generate and plot the partial regression relationships between GDP growth (Y) and each continuous explanatory variable holding the remaining continuous variables constant at their respective medians (year = 1980, initial GDP = 7.8,
population growth = −2.6, investment = −1.8, human capital = 1.4, respectively). We also plot the partial parametric regression surfaces, and we consider separate plots for OECD versus non-OECD members.

[Figure 1: four panels, GDP Growth Versus Initial GDP, GDP Growth Versus Population Growth, GDP Growth Versus Investment, and GDP Growth Versus Human Capital. Vertical axis: GDP growth rate (−0.08 to 0.1). Horizontal axes: initial GDP (6 to 9.5), population growth (−3 to −2.4), investment (−3.5 to −1), human capital (−1 to 2). Each panel shows the Kernel and Linear fits.]
Fig. 1. Nonparametric partial regression plots for non-OECD countries.

[Figure 2: same four panels, axes and legend as Figure 1.]
Fig. 2. Nonparametric partial regression plots for OECD countries.
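The construction of a partial regression profile can be illustrated with a small sketch. It assumes two continuous regressors, a product Gaussian kernel and fixed bandwidths, whereas the estimator used in the paper also handles discrete regressors and uses cross-validated bandwidths. The data are hypothetical and noiseless, chosen so that the result is easy to verify.

```python
import math


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        sol[r] = (a[r][n] - sum(a[r][c] * sol[c] for c in range(r + 1, n))) / a[r][r]
    return sol


def local_linear(point, xs, ys, bandwidths):
    """Local linear estimate of E(y | x = point) with a product Gaussian kernel."""
    p = len(point)
    xtwx = [[0.0] * (p + 1) for _ in range(p + 1)]
    xtwy = [0.0] * (p + 1)
    for xi, yi in zip(xs, ys):
        dev = [1.0] + [xi[d] - point[d] for d in range(p)]
        w = math.exp(-0.5 * sum(((xi[d] - point[d]) / bandwidths[d]) ** 2 for d in range(p)))
        for r in range(p + 1):
            xtwy[r] += w * dev[r] * yi
            for c in range(p + 1):
                xtwx[r][c] += w * dev[r] * dev[c]
    return solve(xtwx, xtwy)[0]   # the intercept is the fitted value at `point`


def median(values):
    """Sample median."""
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else 0.5 * (s[mid - 1] + s[mid])


def partial_profile(var, grid, xs, ys, bandwidths):
    """Fitted values along `grid` for regressor `var`, other regressors held at their medians."""
    base = [median([xi[d] for xi in xs]) for d in range(len(xs[0]))]
    profile = []
    for g in grid:
        point = base[:]
        point[var] = g
        profile.append(local_linear(point, xs, ys, bandwidths))
    return profile


# Hypothetical noiseless data: y = 2 + 3*x1 - x2 on a 6 x 6 design
xs = [(i / 5.0, j / 5.0) for i in range(6) for j in range(6)]
ys = [2.0 + 3.0 * x1 - x2 for x1, x2 in xs]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
profile = partial_profile(0, grid, xs, ys, bandwidths=(0.3, 0.3))
print(profile)
```

Plotting `profile` against `grid` gives one panel of a partial regression plot; repeating the call with `var=1` gives the panel for the second regressor.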
Our nonparametric approach allows for interactions among all variables and also allows for nonlinearities in and among all variables. Furthermore, the method has two defining features: (i) if the underlying relationship is linear in one or more variables, then the cross-validated smoothing parameter is capable of automatically detecting this; (ii) the method has better finite-sample properties than the popular local constant kernel estimator; in particular, it is minimax efficient and is known to possess one of the best boundary correction methods available. A summary of the particulars of the nonparametric method for this panel (bandwidths and so forth) is available upon request from the authors.

In the literature on growth convergence a great deal of attention has been paid to the relationship between GDP growth and initial GDP. This relationship is given in the first plot in Fig. 1. It is clear that as initial GDP rises, ceteris paribus, GDP growth falls. This would seem to offer evidence in favor of ‘‘β-convergence’’. As Durlauf and Quah (1999) point out, however, this is not necessarily evidence in favor of the traditional exogenous technical change, Solow–Swan model and its extended forms. A negative ‘‘coefficient’’ of initial GDP is not empirically incompatible with sometimes radically different theories.

An interesting feature arises when considering the conditional relationship between GDP growth and population growth for OECD versus non-OECD countries. Note that for OECD countries, population growth ‘‘hurts’’ GDP growth. However, for non-OECD countries, low levels of population growth are beneficial while only high levels hurt growth. This is a reflection of an apparent threshold level for population size which may support economic advancement. Many smaller and economically less developed countries consider their population size to be a handicap in supporting major industrial developments and investment.
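Feature (i) rests on the fact that a local linear fit with a very large bandwidth collapses to a global least-squares line, so cross-validation can select a large bandwidth when the relationship is linear and a small one when it is not. The sketch below illustrates bandwidth selection by leave-one-out least-squares cross-validation for a single continuous regressor. The simulated data, the Gaussian kernel and the coarse grid of candidate bandwidths are assumptions made for illustration; the paper cross-validates over mixed discrete and continuous regressors.

```python
import math
import random


def local_linear_fit(x0, xs, ys, h, skip=None):
    """Local linear estimate at x0 with a Gaussian kernel; `skip` omits one observation."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        d = xi - x0
        w = math.exp(-0.5 * (d / h) ** 2)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    det = s0 * s2 - s1 * s1
    if det <= 1e-12 * s0 * s2:
        return None               # too few effective neighbours at this bandwidth
    return (s2 * t0 - s1 * t1) / det


def cv_score(xs, ys, h):
    """Leave-one-out least-squares cross-validation criterion for bandwidth h."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        fit = local_linear_fit(xi, xs, ys, h, skip=i)
        if fit is None:
            return float("inf")
        total += (yi - fit) ** 2
    return total / len(xs)


def select_bandwidth(xs, ys, candidates):
    """Return the candidate bandwidth with the smallest cross-validation score."""
    return min(candidates, key=lambda h: cv_score(xs, ys, h))


rng = random.Random(7)
xs = [i * 2.0 * math.pi / 79 for i in range(80)]          # evenly spaced design
noise = [rng.gauss(0.0, 0.1) for _ in xs]
y_linear = [1.0 + 0.5 * x + e for x, e in zip(xs, noise)]  # linear truth
y_curved = [math.sin(x) + e for x, e in zip(xs, noise)]    # nonlinear truth
candidates = [0.1, 0.2, 0.5, 1.0, 5.0, 50.0]
h_linear = select_bandwidth(xs, y_linear, candidates)
h_curved = select_bandwidth(xs, y_curved, candidates)
print(h_linear, h_curved)
```

For the curved series, oversmoothing is heavily penalized, so the largest candidate is not selected. For the linear series, one would typically expect a bandwidth toward the upper end of the grid, although the exact choice depends on the noise draw.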
These graphs make clear the importance of decomposition by country groups. Aggregating these countries hides the very different impact that each group has experienced from investment, population growth, and especially ‘‘human capital’’ upon its growth rates. While human capital has an increasing and positive relation with growth for OECDs, it has a tenuous impact for non-OECDs. But a general association of low human capital and low growth rates is common to both groups. Given these observations, it is rather interesting that, for the parametric regression on all of the countries, the OECD status (dummy) variable is insignificant. This underscores the dangers inherent in the unquestioning use of linear models!
4. Evolution of cross-section distributions
In view of the evident limitations of conditional means (or even variances) as vehicles for analyzing diversity (convergence!) within distributions, certainly of incomes, we now turn to the central analysis of this paper based on the whole distribution of growth rates. The stylized facts concerning the cross-section distributions of growth rates and their evolution are well laid out in Durlauf and Quah (1999). The most important of these are a ‘‘polarization’’ effect being largely an evolution into a ‘‘bipolar’’ world, and ‘‘churning’’ or what we prefer to call ‘‘within group mobility’’ which, when examined in greater detail, points to possible ‘‘multimodality’’ and ‘‘clubs.’’3
3 Classifications of countries by proximity or trade are given in Quah (1997) and others. We believe this question deserves greater attention and is perhaps best left to studies that consider multidimensional clustering which combine two different techniques. The multidimensionality aspects may be addressed in the manner of Maasoumi and Jeong (1985) who considered composite measures of well being for the world, including per capita incomes. The clustering techniques of Hirschberg et al. (2001) may then be applied to these multidimensional indices.
As noted earlier, several authors, including Bianchi (1997), Jones (1997), and Quah (1993, 1997), have examined the more interesting aspects of the dynamics in the distribution of growth rates in light of the predictions of various growth models. This section’s analysis, and our main interest, is in the same spirit. In particular, Bianchi too obtains (different) nonparametric density estimates for growth rate distributions at each point in time, whereas Quah (1997) examined the (relative) per capita income distributions and their ‘‘transition’’ laws by analyzing transition probabilities and their continuous counterpart, stochastic kernels.
Examination of mobility (in any attribute) has traditionally been conducted in two ways: through transition matrices (kernels) and indices defined over them, or through inequality reducing measures based on ‘‘distances’’ between distributions and how they evolve toward the ‘‘equal’’ distribution over time. Ideal indices of mobility based on the latter approach are connected to those in the former, but a full understanding of the relations is not yet at hand. See Maasoumi (1998) for an extended discussion. In addition, per capita income and growth rates of incomes are at least statistically distinct (but surely related) variables. In comparing our results with the complementary findings of Quah (1997), these distinctions must be borne in mind.
Our findings reinforce the notion of divergence and polarization in both incomes and their growth rates. We also find that some groupings of countries identify somewhat more uniform sets, but neither identifies the causes of divergence in incomes or growth rates. Perhaps there is substance in the view that ‘‘conditional convergence’’ is a rather vacuous concept. Of course there are causes for the observed divergence.
4.1. Distribution dynamics: actual growth rates
For what follows, we focus attention on the probability density function (PDF) and cumulative distribution function (CDF) of growth rates, in particular on how the distribution of growth rates evolves over time and behaves with respect to OECD status. Rather than presume that growth rates are generated from a known parametric family of distributions, we use robust nonparametric methods capable of providing consistent estimates of the unknown PDF and CDF. We elect to use kernel methods, and we estimate Rosenblatt–Parzen type density estimates. Data-driven methods of bandwidth selection are employed, and bandwidths are selected via likelihood cross-validation, which results in estimates that are close to the true density in terms of the Kullback–Leibler information distance $\int f(y|x)\log\{f(y|x)/\hat{f}(y|x)\}\,dy$, where $f(y|x)$ represents the conditional density function (see Silverman, 1986, p. 53; Hall, 1987).
We begin by modeling the PDF and CDF of the actual growth rates conditional on OECD status (0/1) and year (1965, 1970, …). Note that, by modeling the joint distribution of growth rates, year, and OECD status and then conditioning on OECD status and year, we obtain a kernel density estimate having improved finite-sample properties relative to the traditional univariate kernel density estimate for growth rates for a particular year and OECD status (the latter using only a subset of the data used to construct the former). The conditional density estimator found in Hall et al. (2004) is used due to the mix of continuous and discrete data present. Fig. 3 presents plots of the conditional PDF and CDF for all combinations of OECD status and year, while Fig. 4 presents a plot of all OECD distributions for all years and all non-OECD ones again for all years.
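To make the idea of likelihood cross-validation concrete, the following is a minimal sketch for the simplest case: a univariate Gaussian-kernel density estimate whose bandwidth is chosen by maximizing the leave-one-out log-likelihood over a candidate grid. It is not the conditional, mixed-data estimator of Hall et al. (2004) used in the paper; the function names, the candidate grid, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def loo_log_likelihood(x, h):
    """Leave-one-out log-likelihood of a Gaussian kernel density estimate."""
    n = x.size
    u = (x[:, None] - x[None, :]) / h            # pairwise scaled differences
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    np.fill_diagonal(k, 0.0)                     # leave each point out of its own estimate
    f_loo = k.sum(axis=1) / ((n - 1) * h)
    return float(np.sum(np.log(np.maximum(f_loo, 1e-300))))  # guard against log(0)

def select_bandwidth(x, candidates):
    """Return the candidate bandwidth with the highest leave-one-out likelihood."""
    scores = [loo_log_likelihood(x, h) for h in candidates]
    return float(candidates[int(np.argmax(scores))])

# Synthetic stand-in for growth rates (hypothetical, not the paper's data).
rng = np.random.default_rng(0)
growth = rng.normal(0.02, 0.03, size=200)
candidates = np.linspace(0.002, 0.05, 49)        # assumed search grid
h_star = select_bandwidth(growth, candidates)
print(h_star)
```

A grid search is used here only for transparency; a numerical optimizer over the bandwidth would serve the same purpose.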
[Figure 3: paired density (PDF) and distribution (CDF) panels of the growth rate for OECD and non-OECD countries, one pair for each of 1965, 1970, 1975, 1980, 1985, 1990, and 1995; horizontal axis: Growth Rate.]
Fig. 3. Growth rate distributions by year and OECD status.
[Figure 4: density and distribution panels of the growth rate, one pair for OECD and one pair for non-OECD countries, each overlaying the years 1965, 1970, 1975, 1980, 1985, 1990, and 1995; horizontal axis: Growth Rate.]
Fig. 4. Growth rate distributions by OECD status for all years.
Several important features of these results may be noted: (1) The growth distributions for the OECD and non-OECD countries are very different, and have remained very different from 1965 to 1995. (2) The distribution for OECDs is less dispersed and is symmetrical, becoming more so over time. (3) The distribution for non-OECDs is less symmetrical, and not converging to any particular form, and becoming less concentrated. It appears to be forming a bimodality of its own suggesting multimodality that, while not incompatible with parametric/traditional regression models, may be difficult for ‘‘regression techniques’’ to identify and examine. Within group mobility in the non-OECDs is made evident by these graphs. It is possible to derive ‘‘mobility profiles’’ in the manner of Maasoumi and Zandvakili (1990), but we leave this to future work. (4) When combined, the previous two observations agree and further explain the often observed and expanding multimodality in the world distribution of growth rates; for example see Durlauf and Johnson (1995) who arrive at compatible inferences based on multiple regressions and regression trees. (5) Linton et al. (2002) consider welfare-theoretic bases for assessing the relations between distributions. They propose subsampling based tests of first, second (and higher) order stochastic dominance, FSD and SSD, respectively. Although one of us has applied these tests to some of the cases in this paper, we partially agree with Quah (1997) who suggests one has a census of all countries in the population here, not a sample. Given this point of view, the following observations may be viewed as free from
sampling variation:
(a) In 1965–1970 the OECD distribution first order stochastically dominates (FSD) that of the non-OECDs since its CDF lies everywhere to the right of the latter. From 1975 there is no FSD ranking between these two groups, but there is second order SD (SSD) of OECDs over non-OECDs through 1990 (with the possible exception of 1980). The order rankings are inconclusive and almost identical for the 1980 and 1995 pair! It should be noted that FSD is a very strong rank order and implies SSD. SSD obtains on the basis of welfare functions that are increasing and concave (equality preferring). Thus, one might conclude that the evolution of the non-OECD distribution has been positive, and it is a higher degree of ‘‘convergence’’ of growth rates amongst the OECDs that contributes to its SSD over the non-OECDs in later years. Some of these observations are explained by the movement of China from a large population, low growth economy, to a large population, high growth economy status. There is much ‘‘churning,’’ or ‘‘exchange mobility,’’ and no ‘‘convergence’’ within the non-OECDs, and a tangible convergence and ‘‘growth mobility’’ within the OECDs. There are clearly a minimum of ‘‘two clubs’’ on the basis of growth rates alone. Similarly, Quah (1997) finds credible evidence of relative per capita income ‘‘clubs’’ on the basis of geographical proximity, as well as trading practices.
(b) Regarding the evolution of each group over time, again we find ‘‘a tale of two cities.’’ OECDs have clearly ‘‘deteriorated’’ over time since 1965, whereas the situation for non-OECDs is far less clearcut. The OECD growth distribution in 1965 first order dominates all other years. There is a clear break in the 1980s, resulting in a gradual strengthening of this rank order as they evolve toward 1995. Note that this period contained two recessions in the 1980s and early 1990s. It would be interesting to re-examine this hierarchy when more recent data become available. It is interesting to note that, since FSD implies SSD, whatever small convergence there has been in OECD growth rates, if any, is not enough to topple the SSD ranking (greater ‘‘equality’’) of earlier years over the later years.
(c) Regarding the non-OECDs, the only clearcut ranking is that 1985 is first order dominated by every other year except 1995. Clearly this was not a ‘‘good’’ year for growth globally. But the differential development within this group is well reflected by a lack of FSD amongst other years. It is possible that 1965 weakly second order dominates 1995, yet another reflection of a lack of ‘‘convergence’’ in these distributions.
We are in fact able to quantify the magnitudes of these movements between entire distributions! Thus we will report entropy distances and related tests in a subsequent section which shed light on the ‘‘magnitude’’ of these distances and focus on convergence.
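The dominance comparisons described above can be illustrated with a short sketch. It assumes empirical CDFs evaluated on a common grid (the paper works with kernel CDF estimates instead) and checks the two conditions directly: FSD of A over B when A's CDF never lies above B's, and SSD when the same holds for the running integrals of the CDFs. The function names and the synthetic samples are hypothetical.

```python
import numpy as np

def ecdf_on_grid(sample, grid):
    """Empirical CDF of `sample` evaluated at each grid point."""
    s = np.sort(sample)
    return np.searchsorted(s, grid, side="right") / s.size

def dominance(a, b, n_grid=501):
    """Return (fsd, ssd): whether sample a dominates sample b at first / second order."""
    grid = np.linspace(min(a.min(), b.min()), max(a.max(), b.max()), n_grid)
    F_a, F_b = ecdf_on_grid(a, grid), ecdf_on_grid(b, grid)
    step = grid[1] - grid[0]
    # Running integrals of the CDFs (simple Riemann sums on the common grid).
    I_a, I_b = np.cumsum(F_a) * step, np.cumsum(F_b) * step
    fsd = bool(np.all(F_a <= F_b))   # CDF of a lies everywhere to the right of b's
    ssd = bool(np.all(I_a <= I_b))   # implied by fsd, but can hold without it
    return fsd, ssd

# Hypothetical example: group_a is group_b shifted to the right by 2 percentage points.
rng = np.random.default_rng(0)
group_b = rng.normal(0.01, 0.03, size=200)
group_a = group_b + 0.02
print(dominance(group_a, group_b))   # a dominates b at both orders
print(dominance(group_b, group_a))   # the reverse ranking fails
```

Because the SSD check is a cumulative sum of the quantities compared in the FSD check, the sketch reproduces the property noted in the text that FSD implies SSD.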
4.2. Distribution dynamics: nonparametric fitted growth rates
Our nonparametric regressions have produced what might be considered robust fitted values of the growth rates in the plane of the most popular conditioning variables (see Figs. 5 and 6).
[Figure 5: paired density and distribution panels of the fitted growth rate for OECD and non-OECD countries, one pair for each of 1965, 1970, 1975, 1980, 1985, 1990, and 1995; horizontal axis: Fitted Growth Rate.]
Fig. 5. Nonparametric fitted growth rates.
[Figure 6: density and distribution panels of the fitted growth rate, one pair for OECD and one pair for non-OECD countries, each overlaying the years 1965, 1970, 1975, 1980, 1985, 1990, and 1995; horizontal axis: Fitted Growth Rate.]
Fig. 6. Predicted growth rates by OECD status for all years.
We present the PDF and CDF of these ‘‘fitted’’ or estimated growth rates. Plots of these conditional PDFs and CDFs for all combinations of OECD status and year are followed, finally, by plots of all OECD distributions for all years as well as all non-OECD countries for all years. The FSD and SSD rankings are similar to the ‘‘raw’’ growth distributions. The evolution of growth rates, as predicted by popular explanatory variables and free of ‘‘residual sources,’’ tends to conform to the ‘‘unconditional’’ evolution analysis provided in the previous section. Several caveats are in order, however: (1) The FSD rankings between the OECD and others are even stronger than for raw growth rates, becoming less strong toward 1980, whereby it is only a SSD ranking with decreasing strength toward 1995 where there may be at most a third order SD ranking between them. ‘‘Bipolarity’’ is surely not questioned. (2) All of our previous statements regarding the ‘‘time path’’ of these distributions for each group are intact, but somewhat stronger rankings are possible for the non-OECD distributions over time (compare this with generally consistent results of Quah (1997) for per capita incomes, and Durlauf and Quah (1999)).
4.3. Residual growth rates by OECD status for all years
Appendix A presents results based on the distribution of our nonparametric regression ‘‘residuals,’’ organized in the same manner as the last two sections. These residuals may be regarded as ‘‘conditional’’ growth rates in the usual meaning of conditioning in econometrics. The residuals are growth rates after controlling for the influence of conditioning variables. Of course this control is only achieved on the mean of the growth rates, and the variables may continue to impact other distributional characteristics. This residual analysis is valuable since our residuals are robust to functional forms and any evidence of ‘‘convergence’’ of their distributions may be interpreted as evidence of ‘‘conditional convergence.’’4 We summarize as follows:
(1) There is no FSD between the OECD and non-OECD groups. There is generally no SSD either, with the possibility of weak SSD or higher orders for some later years. Once the mean differences due to conditional variables are removed, uniform ranking of these groups by dominance criteria vanishes. Interestingly, even the dispersion aspects of these two distributions are generally not sufficiently different to produce higher order (SSD) rankings. This is evidence in favor of ‘‘conditional convergence’’ in the sense developed in this paper.5
(2) The ‘‘fit’’ is generally good for the regressions, but less good for non-OECD data because of their heterogeneity.
(3) There is not much to separate these distributions over the successive 5-year intervals. The fit is equally good (bad) for each cross-section. Also, note that the residuals are effectively ‘smoothed’ over time so that differences in the residual series are negligible for different time periods.
5. Entropy measures of distributional distance
In this section we provide a formal quantification of the distributional distances and evolutions observed in the last section. This is done by using a metric entropy measure suggested in Granger et al. (2004).6 Any entropy measure is useful as an indicator of divergence from the uniform distribution, and is thus a measure of ‘‘equality,’’ or concentration in the corresponding distribution. The characterization of a density afforded by entropies is only a little short of that provided by characteristic functions. Thus entropies are generally superior to other moment-based criteria. Unfortunately, Shannon’s popular entropy is not a metric and thus fails to be useful for multiple comparisons, exemplified by our application here where several years and/or groups of distributions are being compared. Granger et al. (2004) developed a normalized entropy measure of ‘‘dependence’’ that has several desirable properties as well as being a proper distance metric. Some of these properties are briefly enumerated here for convenience.
4 We are sympathetic, however, to the view that considers ‘‘conditional convergence’’ as rather lacking in meaning or consequence, especially relative to substantive theories and hypotheses which initially motivated this area of research.
5 It is worth noting that ‘‘strong’’ nonuniform rankings are not ruled out. There do exist cardinal (welfare) criteria according to which these distributions may be ranked. Variance is one such criterion, however unlikely in this situation.
6 For comparison purposes, we also computed the Kullback–Leibler divergence measure. These were removed to save space but give consistent results. The KL measure is the most popular index of divergence between distributions, but it is not a metric and is unsuited for precisely the type of comparisons of ‘‘distances’’ we need to make in this application.
A measure of similarity/distance/dependence for a pair of random variables X and Y may be required to satisfy the following six ‘‘ideal’’ properties:
(i) It is well defined for both continuous and discrete variables.
(ii) It is normalized to zero if X and Y are identical, and is conveniently normalized to lie between 0 and +1.
(iii) The modulus of the measure is equal to unity if there is a measurable exact (nonlinear) relationship, $Y = g(X)$ say, between the random variables. This is useful in our use of this measure for assessing the fit of regressions.
(iv) It is equal to or has a simple relationship with the (linear) correlation coefficient in the case of a bivariate normal distribution. Again, this is useful in our use of this measure for assessing the fit of regressions.
(v) It is metric, that is, it is a true measure of ‘‘distance’’ and not just of divergence.
(vi) The measure is invariant under continuous and strictly increasing transformations $h(\cdot)$. This is useful since X and Y are independent if and only if $h(X)$ and $h(Y)$ are independent. Invariance is important since otherwise clever or inadvertent transformations would produce different levels of dependence.
This leads to a normalization of the Bhattacharya–Matusita–Hellinger measure of dependence/distance given by
$$S_\rho = \frac{1}{2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left(f_1^{1/2} - f_2^{1/2}\right)^2 dx\,dy, \qquad (4)$$
where $f_1 = f(x)$ and $f_2 = f(y)$ are the marginal densities of the random variables X and Y. If $f_1$ and $f_2$ are equal this metric will yield the value zero, and is otherwise positive and less than one. Granger et al. (2004) demonstrate the relation of this normalized measure to k-class entropy divergence measures, as well as copulae. We use it as our primary means of assessing the distances between distributions. Testing for convergence is based on the null hypothesis that $S_\rho = 0$.
Below, two types of use are made of these entropy measures that reflect their universal role as both measures of ‘‘divergence’’ and measures of ‘‘fit’’ or ‘‘dependence.’’ Tables that report entropies for the fit of the growth regressions allow an assessment of the ‘‘goodness of fit’’ of these models, and represent new results in their own right. Since our regressions are not linear, the traditional measures of correlation and linear dependence, such as $R^2$, are clearly inadequate. Thus in these tables we offer the first robust dependence results for the fit of the traditional growth regression variables.7
In terms of Shannon’s entropy (reported in Table 2), the actual growth rate distributions for OECD were becoming somewhat more concentrated until 1985, and thereafter increased in dispersion toward the levels of 1965. For non-OECDs the increase in dispersion/inequality of growth rates is a steady pattern. Neither of these changes is ‘‘large’’ in absolute value (but see below for statistical evaluation). In Table 2 we note that the distances $S_\rho$ (also KL divergences not reported here) between OECD and others are significant at the 95% level for every date except 1970. Over time, we see that these distances declined in the 1960s, thereafter growing steadily until 1990, but seem to have declined in 1990–1995. Table 3 reports the ‘‘goodness of fit’’ values for the parametric as well as our own nonparametric models. For OECDs, the parametric fit is much better than the
7 We compute all entropy measures in the following manner:
(i) Compute the conditional Rosenblatt–Parzen density estimates with covariates OECD status and year via cross-validation.
(ii) Generate a grid in $[-0.25, 0.25]$ having grain 0.001 (there are 501 points on this grid).
(iii) Evaluate the Rosenblatt–Parzen kernel estimator on the grid of 501 points. Note that at the edges of the grid $\hat{f}(x|\mathrm{OECD}, \mathrm{year}) = 0.0$.
(iv) Evaluate each respective entropy via numerical quadrature.
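The grid-and-quadrature procedure in footnote 7 can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' code: it uses an unconditional Gaussian kernel estimate with a fixed, hand-picked bandwidth in place of the cross-validated conditional estimator, a trapezoidal rule for the quadrature, the single-integral form of $S_\rho$ that appears in the table captions, and synthetic samples. All function names are hypothetical.

```python
import numpy as np

def kde_on_grid(sample, grid, h):
    """Gaussian-kernel Rosenblatt-Parzen density estimate evaluated on a grid."""
    u = (grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (sample.size * h * np.sqrt(2.0 * np.pi))

def trapezoid(y, x):
    """Trapezoidal quadrature of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def shannon(f, grid):
    """Integral of f ln f, with the sign convention of the Table 2 caption."""
    safe = np.where(f > 0, f, 1.0)               # f ln f -> 0 where f = 0
    return trapezoid(f * np.log(safe), grid)

def kl(f, g, grid):
    """Integral of f ln(f/g); grid points where either density is zero are skipped (a simplification)."""
    mask = (f > 0) & (g > 0)
    integrand = np.zeros_like(f)
    integrand[mask] = f[mask] * np.log(f[mask] / g[mask])
    return trapezoid(integrand, grid)

def s_rho(f, g, grid):
    """Half the integral of (sqrt(f) - sqrt(g))^2."""
    return 0.5 * trapezoid((np.sqrt(f) - np.sqrt(g)) ** 2, grid)

grid = np.linspace(-0.25, 0.25, 501)             # grain 0.001, 501 points
rng = np.random.default_rng(0)
oecd = rng.normal(0.03, 0.02, size=25)           # synthetic stand-ins, not the paper's data
non_oecd = rng.normal(0.015, 0.04, size=80)
h = 0.01                                         # assumed bandwidth, not cross-validated
f_non, g_oecd = kde_on_grid(non_oecd, grid, h), kde_on_grid(oecd, grid, h)
print(shannon(g_oecd, grid), kl(f_non, g_oecd, grid), s_rho(f_non, g_oecd, grid))
```

Unlike the KL measure, `s_rho` returns the same value when its two density arguments are swapped, which is the symmetry that makes it usable for the multiple comparisons discussed in the text.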
ARTICLE IN PRESS E. Maasoumi et al. / Journal of Econometrics 136 (2007) 483–508
501
Table 2. Shannon’s entropy ($\int_{-\infty}^{\infty} f(x)\ln(f(x))\,dx$)

                          1965    1970    1975    1980    1985    1990    1995
OECD
  Actual growth rate      2.432   2.324   2.479   2.584   2.774   2.686   2.532
  Parametric fit          3.533   3.561   3.582   3.766   3.733   3.784   3.768
  Nonparametric fit       2.692   2.689   2.745   3.027   3.113   3.092   3.091
  Parametric residual     2.668   2.607   2.683   2.706   2.806   2.725   2.669
  Nonparametric residual  3.012   2.956   3.031   3.012   3.039   3.054   2.948
Non-OECD
  Actual growth rate      2.077   2.158   2.029   1.966   1.942   1.970   1.800
  Parametric fit          2.839   2.855   2.804   2.881   2.867   2.868   2.814
  Nonparametric fit       2.871   2.738   2.525   2.812   2.729   2.641   2.226
  Parametric residual     2.104   2.195   2.147   2.054   2.055   2.087   2.048
  Nonparametric residual  2.262   2.314   2.281   2.213   2.229   2.262   2.267
Table 3. KL entropy ($\int_{-\infty}^{\infty} f(x)\ln(f(x)/g(x))\,dx$) ($f(x)$ = non-OECD, $g(x)$ = OECD)

                     1965            1970            1975            1980            1985            1990            1995
Actual growth rate   0.803           0.160           0.383           0.630           1.378           1.237           0.580
                     [0.182, 0.212]  [0.157, 0.187]  [0.184, 0.211]  [0.190, 0.228]  [0.194, 0.235]  [0.174, 0.216]  [0.323, 0.352]
Parametric fit       1.623           1.628           1.064           1.098           1.270           1.669           1.572
                     [0.294, 0.340]  [0.255, 0.304]  [0.261, 0.304]  [0.270, 0.339]  [0.284, 0.360]  [0.290, 0.346]  [0.301, 0.349]
Nonparametric fit    1.154           0.504           0.476           0.085           0.512           0.950           1.047
                     [0.240, 0.291]  [0.147, 0.182]  [0.226, 0.284]  [0.100, 0.128]  [0.128, 0.154]  [0.224, 0.272]  [0.349, 0.424]

The values in brackets are the 90th and 95th percentiles obtained under the null of no difference between OECD and non-OECD countries. Kernel evaluation of KL entropy: OECD versus non-OECD.
Table 4. $S_\rho$ entropy $\frac{1}{2}\int_{-\infty}^{\infty}\left[\sqrt{f(x)} - \sqrt{g(x)}\right]^2 dx$ ($f(x)$ = non-OECD, $g(x)$ = OECD)

                     1965            1970            1975            1980            1985            1990            1995
Actual growth rate   0.127           0.035           0.069           0.089           0.182           0.180           0.112
                     [0.040, 0.048]  [0.032, 0.040]  [0.043, 0.049]  [0.039, 0.044]  [0.040, 0.047]  [0.038, 0.045]  [0.054, 0.065]
Parametric fit       0.259           0.232           0.156           0.147           0.174           0.175           0.198
                     [0.061, 0.070]  [0.053, 0.063]  [0.056, 0.068]  [0.059, 0.069]  [0.057, 0.067]  [0.059, 0.069]  [0.060, 0.069]
Nonparametric fit    0.252           0.111           0.067           0.015           0.077           0.141           0.143
                     [0.043, 0.051]  [0.031, 0.038]  [0.042, 0.051]  [0.022, 0.029]  [0.027, 0.032]  [0.038, 0.044]  [0.069, 0.077]

The values in brackets are the 90th and 95th percentiles obtained under the null of no difference between OECD and non-OECD countries.
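The bracketed values in Tables 3 and 4 are upper percentiles of the statistic under the null of no difference between groups. This section does not spell out the resampling scheme used to obtain them, so the sketch below should be read as one plausible illustration only: it assumes a simple permutation of group labels over the pooled sample, a fixed-bandwidth Gaussian kernel estimate, and synthetic data. The function names, the number of permutations, and the bandwidth are all assumptions.

```python
import numpy as np

def kde(sample, grid, h):
    """Gaussian-kernel density estimate evaluated on a grid."""
    u = (grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (sample.size * h * np.sqrt(2.0 * np.pi))

def s_rho(f, g, grid):
    """Half the integral of (sqrt(f) - sqrt(g))^2 by the trapezoidal rule."""
    d = (np.sqrt(f) - np.sqrt(g)) ** 2
    return 0.5 * float(np.sum(0.5 * (d[1:] + d[:-1]) * np.diff(grid)))

def null_percentiles(x, y, grid, h, n_perm=199, seed=0):
    """90th and 95th percentiles of S_rho when group labels are permuted (assumed scheme)."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    stats = np.empty(n_perm)
    for b in range(n_perm):
        perm = rng.permutation(pooled)           # reassign observations to groups at random
        xs, ys = perm[: x.size], perm[x.size:]
        stats[b] = s_rho(kde(xs, grid, h), kde(ys, grid, h), grid)
    return np.percentile(stats, [90, 95])

grid = np.linspace(-0.25, 0.25, 501)
rng = np.random.default_rng(1)
x = rng.normal(0.03, 0.01, size=40)              # hypothetical group 1
y = rng.normal(-0.03, 0.01, size=40)             # hypothetical group 2, clearly different
h = 0.01                                         # assumed bandwidth
observed = s_rho(kde(x, grid, h), kde(y, grid, h), grid)
p90, p95 = null_percentiles(x, y, grid, h)
print(observed, p90, p95)
```

Read in the same way as the tables, an observed distance above the second bracketed value would be called significant at the 95% level.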
nonparametric one. This is predictable from the relative homogeneity in this group. The nonparametric fit is much better for the non-OECDs, but deteriorates in the later parts of the sample.
Table 4 corresponds to our earlier graphical analysis of evolution through time. For the pooled sample, only the distances between 1980 and 1985 are significant at the 95% level. These distances first increase to 1985, but thereafter become small again. By these indices, one would infer ‘‘convergence’’ except in 1980–1985, demonstrating the difficulty of analyzing distributional dynamics by strong/complete but nonuniform criteria. In the absence of uniform SD rankings, there will exist some ‘‘criterion function’’ which may reverse the conclusion of ‘‘convergence’’ reached by another criterion. This may explain some of the quandary in the current literature with different conclusions being reached by different authors on the question of convergence.
By either measure of divergence, the OECD countries moved forward by small amounts in the late 1960s and early 1970s, but changed significantly in later periods (except for 1980–1985). For the non-OECD growth distributions, on the other hand, the two measures suggest that their distributions have been changing slowly, indeed only significantly so in 1980–1985. The magnitudes of changes over time are generally much larger for OECDs than others (‘the rich get richer and the poor get poorer’). These observations add credence to those in Durlauf and Quah (1999) and elsewhere, that the most interesting aspects of the growth phenomenon appear to be in different distributional dynamics and mobility profiles of different country groups, rather than in the growth regressions (see Tables 5 and 6).
Appendix C reports similar analysis for ‘‘conditional growth rates,’’ i.e., the residuals of both the parametric and nonparametric growth regressions. Our earlier observations are confirmed by these entropy tests.
(1) The ‘‘fit’’ is generally good for the regressions, but less good for non-OECD data because of their less homogeneous membership.
(2) There is not much to separate these distributions over the successive 5-year intervals. The fit is equally good (bad) for each cross-section. Also, note that the residuals are effectively ‘smoothed’ over time so that differences in the residual series are negligible for different time periods.
(3) There is no significant change in these residual growth distributions at the 95% level, and almost always, even at the 90% level (the exception is, again, 1980–1985 for OECDs, which is significant at the 90% level).

Table 5. $S_\rho$ entropy $\frac{1}{2}\int_{-\infty}^{\infty}\left[\sqrt{f(x)} - \sqrt{g(x)}\right]^2 dx$ ($f(x)$ = actual, $g(x)$ = predicted)

                   1965    1970    1975    1980    1985    1990    1995
OECD
  Parametric       0.196   0.232   0.228   0.240   0.173   0.215   0.243
  Nonparametric    0.020   0.041   0.019   0.047   0.037   0.038   0.069
Non-OECD
  Parametric       0.140   0.117   0.143   0.188   0.183   0.174   0.209
  Nonparametric    0.125   0.080   0.062   0.155   0.154   0.103   0.043
Table 6. $S_\rho$ entropy $\frac{1}{2}\int_{-\infty}^{\infty}\left[\sqrt{f(x)} - \sqrt{g(x)}\right]^2 dx$ ($f(x)$ = year$_t$, $g(x)$ = year$_{t+5}$)

           1965–1970       1970–1975       1975–1980       1980–1985       1985–1990       1990–1995
Pooled     0.003           0.006           0.014           0.032           0.008           0.008
           [0.013, 0.016]  [0.014, 0.017]  [0.015, 0.017]  [0.014, 0.017]  [0.015, 0.017]  [0.015, 0.016]
OECD       0.017           0.009           0.071           0.037           0.074           0.093
           [0.037, 0.045]  [0.038, 0.045]  [0.038, 0.048]  [0.036, 0.044]  [0.039, 0.045]  [0.036, 0.045]
Non-OECD   0.007           0.011           0.022           0.047           0.010           0.023
           [0.026, 0.030]  [0.025, 0.029]  [0.027, 0.030]  [0.026, 0.029]  [0.026, 0.030]  [0.026, 0.030]

The values in brackets are the 90th and 95th percentiles obtained under the null of no difference over time.
(4) There is a further interpretation of these entropy measures of dynamic residual movements. Following Granger et al. (2004), the entropies in this context may be regarded as robust measures of possibly nonlinear serial dependence. Accordingly, our results indicate that there is no evidence of significant serial dependence of residuals between these 5-year periods. A summary of the conditional and unconditional (‘‘actual’’) growth rates and their distributional characteristics is given in Tables 11 and 12 of Appendix C.
6. Conclusions
Employing nonparametric kernel density and regression techniques, we have examined the otherwise traditional growth relationship and given new entropy measures of fit, as well as residual correlation for them. We have identified distinct effects of the major conditioning variables on the growth rates of different groups of countries. This leaves very little doubt that separate models are required to examine different groups of countries.
We have further examined the dynamics of cross-section distributions of actual growth rates, as well as ‘‘conditional’’ and ‘‘fitted’’ growth rates. Our study of these dynamics was based on stochastic dominance rankings, as well as tests based on entropy distances which shed further light on the mobility between and within groups of countries. Our robust findings tend to confirm the hypotheses of ‘‘convergence clubs’’ and polarization. We agree with the conclusions of Durlauf and Quah (1999) that future work needs to address more successfully the need for modeling of cross-country interactions and remain consistent with the rich distributional dynamics observed here, and studied in the mobility literature as, for example, in Maasoumi and Zandvakili (1990). There is also a need to extend the scope of this field by considering other attributes of well-being than per capita incomes, and connecting to the literature which deals with its related issues; see, for example, Hirschberg et al. (2001), and Maasoumi and Jeong (1985).
ARTICLE IN PRESS E. Maasoumi et al. / Journal of Econometrics 136 (2007) 483–508
Appendix A. Residuals by year and OECD status
[Figure: density and distribution plots of the residuals, OECD versus Non-OECD, with one pair of panels for each of the years 1965, 1970, 1975, 1980, 1985, 1990, and 1995. Horizontal axes: Residuals; vertical axes: Density and Distribution.]
Appendix B. Residuals by OECD status for all years
[Figure: density and distribution plots of the residuals, one pair of panels for OECD and one for Non-OECD countries, each overlaying the years 1965, 1970, 1975, 1980, 1985, 1990, and 1995. Horizontal axes: Residuals; vertical axes: Density and Distribution.]
Appendix C. Growth rate dynamics See Tables 7–12
Table 7. KL entropy for parametric residuals, $\int_{-\infty}^{\infty} f(x)\ln(f(x)/g(x))\,dx$ ($f(x) = \mathrm{year}_t$, $g(x) = \mathrm{year}_{t+5}$)

| | 1965–1970 | 1970–1975 | 1975–1980 | 1980–1985 | 1985–1990 | 1990–1995 |
|---|---|---|---|---|---|---|
| OECD | 0.020 [0.032, 0.039] | 0.019 [0.032, 0.040] | 0.012 [0.033, 0.039] | 0.053 [0.032, 0.038] | 0.010 [0.032, 0.039] | 0.037 [0.033, 0.040] |
| Non-OECD | 0.014 [0.025, 0.030] | 0.006 [0.025, 0.029] | 0.030 [0.025, 0.029] | 0.012 [0.026, 0.030] | 0.005 [0.026, 0.030] | 0.015 [0.026, 0.030] |

The values in brackets are the 90th and 95th percentiles obtained under the null of no difference over time.
Table 8. $S_\rho$ entropy dynamic for parametric residuals, $\frac{1}{2}\int_{-\infty}^{\infty} \left[\sqrt{f(x)} - \sqrt{g(x)}\right]^2 dx$ ($f(x) = \mathrm{year}_t$, $g(x) = \mathrm{year}_{t+5}$)

| | 1965–1970 | 1970–1975 | 1975–1980 | 1980–1985 | 1985–1990 | 1990–1995 |
|---|---|---|---|---|---|---|
| OECD | 0.006 [0.008, 0.009] | 0.004 [0.008, 0.010] | 0.003 [0.008, 0.010] | 0.013 [0.008, 0.009] | 0.003 [0.008, 0.010] | 0.009 [0.008, 0.010] |
| Non-OECD | 0.003 [0.006, 0.007] | 0.002 [0.006, 0.007] | 0.007 [0.006, 0.007] | 0.003 [0.006, 0.007] | 0.001 [0.006, 0.007] | 0.004 [0.006, 0.007] |

The values in brackets are the 90th and 95th percentiles obtained under the null of no difference over time.
Table 9. KL entropy for kernel residuals, $\int_{-\infty}^{\infty} f(x)\ln(f(x)/g(x))\,dx$ ($f(x) = \mathrm{year}_t$, $g(x) = \mathrm{year}_{t+5}$)

| | 1965–1970 | 1970–1975 | 1975–1980 | 1980–1985 | 1985–1990 | 1990–1995 |
|---|---|---|---|---|---|---|
| OECD | 0.008 [0.013, 0.016] | 0.009 [0.014, 0.017] | 0.004 [0.013, 0.017] | 0.005 [0.013, 0.016] | 0.014 [0.013, 0.016] | 0.013 [0.014, 0.016] |
| Non-OECD | 0.005 [0.012, 0.013] | 0.004 [0.011, 0.013] | 0.013 [0.011, 0.012] | 0.012 [0.011, 0.013] | 0.010 [0.012, 0.013] | 0.004 [0.011, 0.013] |

The values in brackets are the 90th and 95th percentiles obtained under the null of no difference over time.
Table 10. $S_\rho$ entropy for kernel residuals, $\frac{1}{2}\int_{-\infty}^{\infty} \left[\sqrt{f(x)} - \sqrt{g(x)}\right]^2 dx$ ($f(x) = \mathrm{year}_t$, $g(x) = \mathrm{year}_{t+5}$)

| | 1965–1970 | 1970–1975 | 1975–1980 | 1980–1985 | 1985–1990 | 1990–1995 |
|---|---|---|---|---|---|---|
| OECD | 0.002 [0.003, 0.004] | 0.002 [0.003, 0.004] | 0.001 [0.003, 0.004] | 0.001 [0.003, 0.004] | 0.003 [0.003, 0.004] | 0.003 [0.003, 0.004] |
| Non-OECD | 0.001 [0.003, 0.003] | 0.001 [0.003, 0.003] | 0.003 [0.003, 0.003] | 0.003 [0.003, 0.003] | 0.003 [0.003, 0.003] | 0.001 [0.003, 0.003] |

The values in brackets are the 90th and 95th percentiles obtained under the null of no difference over time.
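The two distances reported in Tables 7–10 can be illustrated with a short sketch. It assumes Gaussian kernel density estimates with a rule-of-thumb bandwidth and a simple Riemann sum on a grid; the function names `kde` and `entropy_distances` are hypothetical, and the bandwidth and integration choices are illustrative assumptions rather than the procedure used to produce the tables.

```python
import numpy as np


def kde(sample, grid, h):
    """Gaussian kernel density estimate of `sample` evaluated on `grid`."""
    u = (grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (sample.size * h * np.sqrt(2.0 * np.pi))


def entropy_distances(x0, x1, grid_size=400):
    """Return (KL, S_rho) between the estimated densities of x0 (f) and x1 (g).

    Assumption: rule-of-thumb bandwidths 1.06 * sd * n^(-1/5), not cross-validation.
    """
    x0 = np.asarray(x0, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    h0 = 1.06 * x0.std(ddof=1) * x0.size ** (-0.2)
    h1 = 1.06 * x1.std(ddof=1) * x1.size ** (-0.2)
    pooled = np.concatenate([x0, x1])
    pad = 4.0 * max(h0, h1)
    grid = np.linspace(pooled.min() - pad, pooled.max() + pad, grid_size)
    dx = grid[1] - grid[0]
    f = kde(x0, grid, h0)
    g = kde(x1, grid, h1)
    tiny = 1e-300  # guards the logarithm where a density underflows to zero
    kl = np.sum(f * np.log((f + tiny) / (g + tiny))) * dx
    s_rho = 0.5 * np.sum((np.sqrt(f) - np.sqrt(g)) ** 2) * dx
    return kl, s_rho
```

Unlike the KL measure, the second quantity is symmetric in its two arguments, so swapping the two samples leaves it unchanged.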
Table 11. Actual growth rates summary

| Year | Mean, OECD | Mean, Non-OECD | Median, OECD | Median, Non-OECD | σ, OECD | σ, Non-OECD | IQR, OECD | IQR, Non-OECD |
|---|---|---|---|---|---|---|---|---|
| 1965 | 0.044 | 0.022 | 0.039 | 0.020 | 0.018 | 0.028 | 0.015 | 0.037 |
| 1970 | 0.037 | 0.025 | 0.035 | 0.025 | 0.021 | 0.025 | 0.017 | 0.033 |
| 1975 | 0.037 | 0.031 | 0.033 | 0.025 | 0.016 | 0.031 | 0.022 | 0.042 |
| 1980 | 0.022 | 0.023 | 0.022 | 0.027 | 0.013 | 0.032 | 0.012 | 0.042 |
| 1985 | 0.014 | 0.003 | 0.014 | 0.002 | 0.007 | 0.033 | 0.008 | 0.041 |
| 1990 | 0.027 | 0.011 | 0.025 | 0.009 | 0.010 | 0.032 | 0.012 | 0.034 |
| 1995 | 0.011 | 0.010 | 0.011 | 0.014 | 0.014 | 0.041 | 0.012 | 0.052 |
Table 12. Kernel predicted growth rates summary

| Year | Mean, OECD | Mean, Non-OECD | Median, OECD | Median, Non-OECD | σ, OECD | σ, Non-OECD | IQR, OECD | IQR, Non-OECD |
|---|---|---|---|---|---|---|---|---|
| 1965 | 0.042 | 0.020 | 0.041 | 0.020 | 0.015 | 0.011 | 0.017 | 0.013 |
| 1970 | 0.037 | 0.022 | 0.038 | 0.021 | 0.015 | 0.013 | 0.018 | 0.022 |
| 1975 | 0.035 | 0.026 | 0.032 | 0.028 | 0.014 | 0.018 | 0.018 | 0.022 |
| 1980 | 0.023 | 0.022 | 0.022 | 0.023 | 0.008 | 0.012 | 0.011 | 0.013 |
| 1985 | 0.018 | 0.010 | 0.018 | 0.012 | 0.007 | 0.013 | 0.008 | 0.016 |
| 1990 | 0.025 | 0.014 | 0.021 | 0.014 | 0.007 | 0.016 | 0.009 | 0.017 |
| 1995 | 0.013 | 0.012 | 0.011 | 0.015 | 0.007 | 0.028 | 0.009 | 0.031 |
References
Azariadis, C., Drazen, A., 1990. Threshold externalities in economic development. Quarterly Journal of Economics 501–526.
Barro, R., 1991. Economic growth in cross section of countries. Quarterly Journal of Economics CVI, 407–444.
Barro, R., Lee, J.-W., 2000. International data on educational attainment: updates and implications. Working Paper No. 42, Center for International Development, Harvard University.
Barro, R., Sala-i-Martin, X., 1995. Economic Growth. McGraw-Hill, New York.
Bernard, A.B., Durlauf, S., 1996. Interpreting tests of the convergence hypothesis. Journal of Econometrics 71, 161–174.
Bianchi, M., 1997. Testing for convergence: evidence from nonparametric multimodality tests. Journal of Applied Econometrics 12 (4), 393–409.
Durlauf, S.N., Johnson, P.A., 1995. Multiple regimes and cross-country growth behavior. Journal of Applied Econometrics 10, 365–384.
Durlauf, S.N., Quah, D.T., 1999. The new empirics of economic growth. In: Taylor, J.B., Woodford, M. (Eds.), Handbook of Macroeconomics I. Elsevier, Amsterdam, pp. 235–308 (Chapter 4).
Fiaschi, D., Lavezzi, A. Distribution dynamics and nonlinear growth. Journal of Economic Growth, forthcoming.
Granger, C., Maasoumi, E., Racine, J.S., 2004. A dependence metric for possibly nonlinear time series. Journal of Time Series Analysis 25, 649–669.
Hall, P., 1987. On Kullback–Leibler loss and density estimation. Annals of Statistics 12, 1491–1519.
Hall, P., Racine, J.S., Li, Q., 2004. Cross-validation and the estimation of conditional probability densities. Journal of the American Statistical Association 99, 1015–1026.
Hansen, B., 2000. Sample splitting and threshold estimation. Econometrica 68, 575–603.
Hirschberg, J., Maasoumi, E., Slottje, D.J., 2001. Clusters of quality of life attributes in the United States. Journal of Applied Econometrics 16, 445–460.
Hsiao, C., Li, Q., Racine, J.S. A consistent model specification test with mixed categorical and continuous data. Journal of Econometrics, forthcoming.
Jones, C.I., 1997. On the evolution of the world income distribution. Journal of Economic Perspectives 11 (3), 19–36.
King, R.G., Levine, R., 1993. Finance and growth: Schumpeter might be right. Quarterly Journal of Economics 108, 717–737.
Li, Q., Racine, J.S., 2003. Nonparametric estimation of distributions with categorical and continuous data. Journal of Multivariate Analysis 86, 266–292.
Li, Q., Racine, J.S., 2004. Cross-validated nonparametric local linear regression. Statistica Sinica 14, 485–512.
Linton, O., Maasoumi, E., Whang, Y.-J., 2002. Consistent testing for stochastic dominance: a subsampling approach. Cowles Foundation Monograph, Yale University, and Mimeo, Department of Economics, SMU.
Liu, Z., Stengos, T., 1999. Non-linearities in cross country growth regressions: a semiparametric approach. Journal of Applied Econometrics 14, 527–538.
Lucas, R., 1993. Making a miracle. Econometrica 61 (2), 251–271.
Maasoumi, E., 1998. On mobility. In: Giles, D., Ullah, A. (Eds.), Handbook of Applied Economic Statistics. Marcel Dekker, New York (Chapter 5).
Maasoumi, E., Jeong, J.-H., 1985. The trend and the measurement of world inequality over extended periods of accounting. Economics Letters 19, 295–301.
Maasoumi, E., Zandvakili, S., 1990. Generalized entropy measures of mobility for different sexes and income levels. Journal of Econometrics 121–133.
Mankiw, N., Romer, D., Weil, D., 1992. A contribution to the empirics of economic growth. Quarterly Journal of Economics 108, 407–437.
Quah, D.T., 1993. Empirical cross-section dynamics in economic growth. European Economic Review 37 (2/3), 426–434.
Quah, D.T., 1996. Empirics for economic growth and convergence. European Economic Review 40, 1353–1375.
Quah, D.T., 1997. Empirics for growth and distribution: stratification, polarization and convergence clubs. Journal of Economic Growth 2, 27–59.
Racine, J.S., Li, Q., 2004. Nonparametric estimation of regression functions with both categorical and continuous data. Journal of Econometrics 119, 99–130.
Silverman, B., 1986. Density Estimation for Statistics and Data Analysis. Chapman & Hall, London.
Summers, R., Heston, A., 1988. A new set of international comparisons of real product and prices: estimates for 130 countries. Review of Income and Wealth 34, 1–26.
Further reading Maasoumi, E., Linton, O., 2002. A welfare basis for the examination of growth empirics and distributional dynamics. Mimeograph, LSE and SMU. Maasoumi, E., 1993. A compendium to information theory in economics and econometrics. Econometric Reviews 12 (2), 137–182. Romer, P.M., 1990. Endogenous technological change. Journal of Political Economy 98 (part 2), S71–S102.
A DEPENDENCE METRIC FOR POSSIBLY NONLINEAR PROCESSES
By C. W. Granger, E. Maasoumi and J. Racine
University of California, San Diego, Southern Methodist University, Syracuse University
First Version received October 2001
Abstract. A transformed metric entropy measure of dependence is studied which satisfies many desirable properties, including being a proper measure of distance. It is capable of good performance in identifying dependence even in possibly nonlinear time series, and is applicable for both continuous and discrete variables. A nonparametric kernel density implementation is considered here for many stylized models including linear and nonlinear MA, AR, GARCH, integrated series and chaotic dynamics. A related permutation test of independence is proposed and compared with several alternatives.
Keywords. Entropy; information theory; nonlinear models; serial dependence; nonparametric; goodness of fit; bootstrap.
1. INTRODUCTION
The need to detect and properly measure association and dependence is an essential task in economic model building and forecasting. Numerous diagnostics, such as the Durbin Watson, Lagrange multiplier, and runs tests, are used to examine model residuals for departure from ‘independence’, i.i.d., reversibility, martingale difference and other properties. The most commonly used measures of dependence and test statistics are convenient functions of ‘correlation’ which is motivated by linear relations involving continuous variables and/or Gaussian processes. These measures tend to fail when variables are discrete, or in detection, as when they face nonlinear, or non-Gaussian processes. The currently dominant measures tend to be functions of only one or two moments of the underlying processes. While this has the advantage of simplicity, it can mislead when distinctions between the tail areas and higher order moments are germane. Examples in both economic and finance processes abound (see Hsieh, 1989, on foreign exchange rates; Pagan and Schwert, 1990; Hong and White, 2000; Chen and Kuan, 2002; Maasoumi and Racine, 2002, and references therein). In macroeconomics, the regime change models of Hamilton (1993), and threshold models of the US macroseries in, for example, Perron (1989), successfully compete with linear (possibly integrated) time-series hypotheses. Of course, nonlinearity in the cost and production functions is widely acknowledged.
0143-9782/04/05 649–669 JOURNAL OF TIME SERIES ANALYSIS Vol. 25, No. 5 2004 Blackwell Publishing Ltd., 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA.
Furthermore, the residuals of empirical economic models may have nonlinearities, heterogeneity, and serial dependence for a multitude of reasons, including unknown forms of misspecification. Recent examples of unexplained nonlinear dependence in the residuals of (GARCH and other) models for exchange rates and S&P returns are, Hsieh (1989), Hong and White (2000), Qi (1999) and references therein, and Maasoumi and Racine (2002). It is clearly desirable for measures of association and dependence to be robust towards possible (but unknown) nonlinearities and non-Gaussian processes. The concept of statistical independence is well defined in terms of the joint distribution of variables. But for reasons of simplicity and convenience, the traditional criteria tend to measure different implications of independence. Examples of criteria that at least incorporate the divergence of the joint distributions from the product of their marginals include Chan and Tran (1992), who use an L1 norm, Ahmad and Li (1997) among others, who use the von-Mises (L2) norm, and Skaug and Tjøstheim (1996) who use the Hellinger and several other measures1. Examples of other ‘well-informed’ measures include the moment generating and characteristic functions, as well as many entropy functionals developed in information theory. Entropies are defined over the space of distributions which form the bases of independence/dependence concepts in both continuous and discrete cases. Entropy is also ‘dimensionless’ as it applies seamlessly to univariate and multivariate contexts. For these reasons, Shannon’s mutual information function has been increasingly utilized in the literature (see Joe, 1989; Robinson, 1991; Granger and Lin, 1994; Skaug and Tjøstheim, 1996). Shannon’s relative entropy and almost all other entropies fail to be ‘metric’, as they violate either symmetry, or the triangularity rule, or both. This means that they are measures of divergence, not distance. 
A metric measure would have the additional advantage of allowing multiple comparisons of departures/distances. It would be helpful to motivate and support any proposed measure by examining whether it satisfies certain useful and clearly stated properties such as those discussed in Granger and Lin (1994), in the excellent treatment by Skaug and Tjøstheim (1996), and in this paper. It is also desirable to provide the framework for assessing statistical significance of any proposed measure. The robustness of nonparametric implementation of entropy indices is one of the main reasons for the recent surge in their popularity. The interested reader is directed to Tjøstheim’s (1996) survey on the subject. Also Delgado (1996), Chan and Tran (1992), Granger and Lin (1994), Hsieh (1989), Skaug and Tjøstheim (1993), Aparicio and Escribano (1998), and Hong (1998), all report superior performance for nonparametric entropy measures of dependence over the traditional measures. Motivated by these arguments, we consider a robust nonparametric implementation of a metric entropy measure of association and dependence that satisfies several desirable properties. These properties embody the advantages of the proposed method and motivate its use in preference to existing indices. We also provide a framework for assessing the statistical significance of deviations from independence. We will focus on a popular application, measuring serial dependence
in a time series. Of course, one could apply the metric in many settings in which one wishes to measure fit and dependence, e.g. detecting nonlinear dependence between a variable being predicted and the predictions obtained from various models, thereby serving as a general measure of (nonlinear) goodness-of-fit. Much like the BDS and other tests in this arena, we do not test for nonlinearity per se, rather, we aim to measure and test for the degree of departure from independence. Those interested in applications of entropy to model adequacy tests and related tests for nonlinearity are referred to Hong and White (2000) and the references therein. Our method complements the approach prevalent in empirical econometrics and finance of parametric model search based on tests of specification for the conditional means and/or variances. Hong (1999) and Hong and Lee (2003) follow a similar philosophy in developing comparable tests for serial dependence based on generalized spectral distributions. A robust nonparametric approach as proposed by us provides a sound preliminary basis for an otherwise a priori exclusion/inclusion of model classes and processes. It may also help to reduce ‘data mining’ consequences of numerous parametric tests, as well as the dangers of tests of hypotheses within misspecified models. Additionally, diagnostic testing may be performed using our technique to examine the residuals of fitted models for ‘generic’ serial dependence, or for association with possibly excluded factors. In Section 2 we outline the desirable properties of any measure of dependence and briefly review the metric entropy measure ‘Sq’ which is a normalized form of the Bhattacharya–Matusita–Hellinger measure, and in Section 3 we outline a kernel-based implementation of Sq which we denote ‘S^q ’. Section 4 considers using the metric as the basis for a permutation test for serial independence. 
Section 5 considers simulations designed to gauge the finite-sample performance of the estimator for a number of popular dependent processes, and presents an application to chaotic series. We offer power comparisons with other tests which complements the earlier work of Granger and Lin (1994) and Skaug and Tjøstheim (1993, 1996). Section 6 concludes.
2. AN IDEAL MEASURE OF DEPENDENCE: $S_\rho$
Several axioms may embody what we consider as desirable in any measure/index and motivate its use. A measure of functional dependence for a pair of random variables X and Y may be required to satisfy the following six ‘ideal’ properties.
1. It is well defined for both continuous and discrete variables.
2. It is normalized to zero if X and Y are independent, and lies between 0 and +1.
3. The modulus of the measure is equal to unity (or a maximum) if there is a measurable exact (nonlinear) relationship, $Y = m(X)$ say, between the random variables.
4. It is equal to or has a simple relationship with the (linear) correlation coefficient in the case of a bivariate normal distribution.
5. It is metric, i.e., it is a true measure of ‘distance’ and not just of divergence.
6. The measure is invariant under continuous and strictly increasing transformations $\psi(\cdot)$. This is useful since X and Y are independent if and only if $\psi(X)$ and $\psi(Y)$ are independent. Invariance is important since otherwise clever or inadvertent transformations would produce different levels of dependence.
We consider a normalization of the Bhattacharya–Matusita–Hellinger measure of dependence given by
$$S_\rho = \frac{1}{2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left(f_1^{1/2} - f_2^{1/2}\right)^2 dx\,dy = \frac{1}{2}\int\!\!\int \left[1 - \frac{f_2^{1/2}}{f_1^{1/2}}\right]^2 dF_1(x, y),$$
where $f_1 = f(x, y)$ is the joint density and $f_2 = g(x)\cdot h(y)$ is the product of the marginal densities of the random variables X and Y. The second expression is in a moment form which is often replaced with a sample average, especially for theoretical developments; see the section below on relation to copulas. Importantly, under $H_0$: independence ($f_1 = f_2$), $S_\rho = 0$; otherwise (under $H_1$) $S_\rho > 0$. Power and consistency of the tests based on consistent estimates of $S_\rho$ arise from this last property. An impressive body of literature demonstrates the desirable axiomatic properties of entropy measures which is instructive in anticipating the properties of related indices. This literature would also make it clear that choices of indices are not as arbitrary as may seem. To see the relation of our normalized measure to entropy divergence measures, consider the k-class entropy family of Havrda and Charvat (1967):
$$H_k(f) = \begin{cases} (k-1)^{-1}\left(1 - E f^{k-1}\right), & k \neq 1, \\ -E\log f \quad \text{(Shannon's entropy)}, & k = 1, \end{cases}$$
where E denotes expectation with respect to a distribution f. For any two density functions $f_1$ and $f_2$, the asymmetric (with respect to $f_2$) k-class entropy divergence measure is:
$$I_k(f_2, f_1) = \frac{1}{k-1}\left[\int \left(f_1^k / f_2^k\right) dF_2 - 1\right], \quad k \neq 1,$$
such that $\lim_{k \to 1} I_k(\cdot) = I_1(\cdot)$, the Shannon relative entropy (divergence) measure. Once the divergence in both directions of $f_1$ and $f_2$ are averaged, a symmetric measure is obtained which, for $k = 1$, is well known as the Kullback–Leibler measure. Consider the symmetric k-class measure at $k = \frac{1}{2}$ as follows:
$$I_{1/2} = I_{1/2}(f_2, f_1) + I_{1/2}(f_1, f_2) = 2M(f_1, f_2) = 4B(f_1, f_2)$$
where
$$M(\cdot) = \int \left(f_1^{1/2} - f_2^{1/2}\right)^2 dx$$
is known as the Matusita or Hellinger distance, and $B(\cdot) = 1 - \rho^*$ is known as the Bhattacharya distance with
$$0 \le \rho^* = \int (f_1 f_2)^{1/2} \le 1$$
being a measure of ‘affinity’ between the two densities. Note that $S_\rho = B(\cdot) = \frac{1}{2}M(\cdot) = \frac{1}{4}I_{1/2}$. $B(\cdot)$ and $M(\cdot)$ are rather unique among measures of divergence since they satisfy the triangular inequality and are, therefore, proper measures of distance. Other divergence measures are capable of characterizing desired null hypotheses (such as independence) but may not be appropriate when these distances are compared across models, sample periods, or agents. Such comparisons are often made, albeit implicitly, in routine inferences. For an example, see Hirschberg et al. (2001) in cluster analysis where ‘distances’ are meaningful. The measure $S_\rho$ satisfies properties 1–3. Property 5 is not difficult to establish, and property 6 was established by Skaug and Tjøstheim (1996) for the Hellinger measure. As for property 4, we note that when $f_1(x, y) = N(0, 0, 1, 1, \rho)$ and $g(x) = N(0, 1) = h(y)$,
$$S_\rho = 1 - \rho^* = 1 - \frac{(1 - \rho^2)^{5/4}}{\left(1 - \rho^2/2\right)^{3/2}}, \qquad S_\rho = 0 \ \text{if } \rho = 0, \qquad S_\rho = 1 \ \text{if } \rho = 1.$$
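The chain $S_\rho = B(\cdot) = \frac{1}{2}M(\cdot) = \frac{1}{4}I_{1/2}$ can be checked directly. The short derivation below is a sketch that takes the k-class divergence to be $I_k(f_2, f_1) = (k-1)^{-1}\left[\int (f_1/f_2)^k\,dF_2 - 1\right]$ and writes $\rho^* = \int (f_1 f_2)^{1/2}$ for the affinity, with both densities integrating to one.

$$M(f_1, f_2) = \int f_1\,dx + \int f_2\,dx - 2\int (f_1 f_2)^{1/2}\,dx = 2(1 - \rho^*) = 2B(f_1, f_2),$$

$$I_{1/2}(f_2, f_1) = \frac{1}{-1/2}\left[\int \left(\frac{f_1}{f_2}\right)^{1/2} f_2\,dx - 1\right] = 2(1 - \rho^*) = I_{1/2}(f_1, f_2),$$

$$I_{1/2} = I_{1/2}(f_2, f_1) + I_{1/2}(f_1, f_2) = 4(1 - \rho^*) = 4B(f_1, f_2) = 2M(f_1, f_2).$$

Hence $\frac{1}{2}M = \frac{1}{4}I_{1/2} = 1 - \rho^* = B$, which is the normalization adopted for $S_\rho$.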
2.1. Relation to copula
Consider the monotonic ‘probability integral transformations’ $U = G(X)$ and $V = H(Y)$ which are standard uniformly distributed variables, and with X and Y as in our moment form definition of $S_\rho$ above. The copula, $C(u, v) = F^*(x, y)$, is a joint distribution function, serves as a measure of dependence, and is unique for continuous variables. If it is twice differentiable (and the marginals are once differentiable), we have:
$$f(x, y) = \frac{\partial^2 C(u, v)}{\partial u\,\partial v}\, g(x)h(y). \tag{1}$$
Since our measure is invariant under such monotonic transformations as $G(\cdot)$ and $H(\cdot)$, and since $\partial^2 C(u, v)/\partial u\,\partial v = c(u, v)$ is a density with uniform marginals, we may verify the following relations:
$$S_\rho = \frac{1}{2}\int\!\!\int \left[1 - \frac{g^{1/2}(x)\,h^{1/2}(y)}{f^{1/2}(x, y)}\right]^2 dF(x, y) = \int\!\!\int \left[1 - c^{1/2}(u, v)\right] du\,dv.$$
The study of copulas for convenient characterizations of dependence has received increasing attention in finance and other fields (see Nelsen, 1999 or Genest and MacKay, 1986).
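The copula form of the measure follows from a change of variables. As a sketch, write the joint density as $f(x, y) = c(u, v)\,g(x)\,h(y)$ with $u = G(x)$ and $v = H(y)$, so that $du = g(x)\,dx$ and $dv = h(y)\,dy$:

$$S_\rho = \frac{1}{2}\int\!\!\int \left(f^{1/2} - g^{1/2}h^{1/2}\right)^2 dx\,dy = \frac{1}{2}\int\!\!\int \left(c^{1/2}(u, v) - 1\right)^2 g(x)h(y)\,dx\,dy = \frac{1}{2}\int_0^1\!\!\int_0^1 \left(c^{1/2}(u, v) - 1\right)^2 du\,dv.$$

Expanding the square and using $\int_0^1\!\int_0^1 c(u, v)\,du\,dv = 1$ gives

$$S_\rho = \frac{1}{2}\left(1 + 1 - 2\int_0^1\!\!\int_0^1 c^{1/2}(u, v)\,du\,dv\right) = \int_0^1\!\!\int_0^1 \left[1 - c^{1/2}(u, v)\right] du\,dv.$$

Under independence $c(u, v) = 1$ on the unit square, so the integrand vanishes and $S_\rho = 0$.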
3. A KERNEL DENSITY IMPLEMENTATION
We consider using $S_\rho$ to measure the degree of dependence present in time-series data. We employ kernel estimators of the densities involved originally proposed by Parzen (1962). A second-order Gaussian kernel is used throughout, and bandwidths are obtained via likelihood cross-validation (Silverman, 1986, p. 52) which produces ‘optimal’ density estimators according to the Kullback–Leibler criterion. Replacing the unknown densities in $S_\rho$ with kernel estimators yields
$$\hat S_\rho = \frac{1}{2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left(\sqrt{\hat f_1(a, b)} - \sqrt{\hat g(a)}\sqrt{\hat h(b)}\right)^2 da\,db,$$
and multivariate numerical quadrature was used for computing the double integral (Lau, 1995, p. 303). For discrete processes, one simply replaces the relevant integrals in the equation above with the summation operator and replaces $\hat f(\cdot)$, $\hat g(\cdot)$, and $\hat h(\cdot)$ with the kernel probability estimators in Li and Racine (2003). Proofs of the asymptotic properties of this implementation are available from the authors upon request. A few words regarding the application of $S_\rho$ to potentially nonstationary processes are in order. Application of this method to potentially nonstationary processes raises the question of interpretation of nonparametric density estimators. Phillips and Park (forthcoming) have examined the properties of kernel estimators for nonstationary density estimation demonstrating that the kernel estimator remains meaningful as a type of density estimate even in the nonstationary case. The estimate tells us how dense the process is about a particular (spatial) point and in this sense can be interpreted as a form of ‘density’ estimator² (see also Karlsen and Tjøstheim, 2001). In nonstationary settings, our measure is therefore to be interpreted as assessing the affinity of the processes in terms of their ‘denseness’ rather than their densities.
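A minimal sketch of the estimator for continuous data is given below. It makes two simplifying assumptions that differ from the text: bandwidths come from a rule of thumb rather than likelihood cross-validation, and the double integral is approximated by a Riemann sum on a regular grid rather than by the cited quadrature routine. The names `s_rho_hat` and `s_rho_lag` are hypothetical.

```python
import numpy as np


def _gauss(u):
    """Second-order Gaussian kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def s_rho_hat(x, y, grid_size=60, pad=3.0):
    """Kernel estimate of S_rho between paired samples x and y.

    Assumptions: rule-of-thumb bandwidths (not likelihood cross-validation)
    and a Riemann sum on a grid_size x grid_size grid.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    hx = 1.06 * x.std(ddof=1) * n ** (-0.2)
    hy = 1.06 * y.std(ddof=1) * n ** (-0.2)
    a = np.linspace(x.min() - pad * hx, x.max() + pad * hx, grid_size)
    b = np.linspace(y.min() - pad * hy, y.max() + pad * hy, grid_size)
    kx = _gauss((a[:, None] - x[None, :]) / hx) / hx  # shape (grid, n)
    ky = _gauss((b[:, None] - y[None, :]) / hy) / hy  # shape (grid, n)
    g_hat = kx.mean(axis=1)                           # marginal density of x
    h_hat = ky.mean(axis=1)                           # marginal density of y
    f_hat = kx @ ky.T / n                             # product-kernel joint density
    integrand = (np.sqrt(f_hat) - np.sqrt(np.outer(g_hat, h_hat))) ** 2
    return 0.5 * integrand.sum() * (a[1] - a[0]) * (b[1] - b[0])


def s_rho_lag(series, k):
    """S_rho estimate between y_t and y_{t-k} for a lag k >= 1."""
    series = np.asarray(series, dtype=float)
    return s_rho_hat(series[k:], series[:-k])
```

Calling `s_rho_lag(series, k)` for k = 1, 2, ... produces a profile of dependence across lags. The same bandwidths are reused for the joint and marginal estimates here purely for brevity.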
4. A PERMUTATION TEST FOR SERIAL INDEPENDENCE
In this section we outline the use of $S_\rho$ for testing serial independence against alternatives of dependence which can be of a general and nonlinear nature. A limiting normal distribution for the statistic $\hat S_\rho$ was established in Granger et al. (2002) who also examined its asymptotic power properties. But there are good reasons to expect that this approximation would be poor, while in addition the outcomes of asymptotic-based kernel tests tend to be quite sensitive to the choice of bandwidth. Skaug and Tjøstheim (1993, 1996) have also studied this issue in the context of testing for serial independence. They note that the asymptotic variance is very poorly estimated in the case of the Hellinger and mutual information measures, which, combined with various competing methods of bandwidth selection, renders asymptotic inferences quite unreliable. These same reasons suggest that bootstrapping ‘asymptotically pivotal’ statistics may not perform well in this context. A permutation test approach can be used to test the simple null of serial independence and would be expected to be robust to the underlying distribution (see Efron and Tibshirani, 1993). By applying a random shuffle (permutation) to the data set at hand one can generate replications which are serially independent having marginal distributions identical to the original data. Randomly reordering the data leaves the marginal distributions intact while generating an independent bivariate distribution. This reshuffle can be used to recompute the statistic using data generated under the null, and this can be repeated a large number of times to generate the empirical null distribution of the statistic which can be used to compute finite-sample critical values. Proofs of consistency of this test are available from the authors upon request.
Extensive simulations, also available from the authors upon request, reveal that the permutation-based test has valid size and possesses power in the direction of numerous interesting alternative directions. A comparison of permutation tests and bootstrap tests in this setting can be found in Skaug and Tjøstheim (1996).
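The reshuffling scheme described above can be sketched as follows. The routine is generic in the statistic: any function of the series and the lag can be supplied. To keep the example short, the default statistic is the absolute lag-k sample correlation, which is only a stand-in for $\hat S_\rho$ and would miss purely nonlinear dependence. The function names and the choice of 199 permutations are illustrative assumptions.

```python
import numpy as np


def abs_lag_correlation(series, k):
    """Stand-in statistic: |sample correlation| between y_t and y_{t-k}."""
    return abs(np.corrcoef(series[k:], series[:-k])[0, 1])


def permutation_test(series, k, statistic=abs_lag_correlation, n_perm=199, rng=None):
    """Permutation test of serial independence at lag k.

    Returns the observed statistic, the permutation (null) statistics,
    and the p-value (1 + #{null >= observed}) / (n_perm + 1).
    """
    rng = np.random.default_rng() if rng is None else rng
    series = np.asarray(series, dtype=float)
    observed = statistic(series, k)
    # Each shuffle keeps the marginal distribution but destroys serial dependence.
    null_stats = np.array(
        [statistic(rng.permutation(series), k) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(null_stats >= observed)) / (n_perm + 1)
    return observed, null_stats, p_value
```

Finite-sample critical values at the usual levels can then be read off the null draws, for example with `np.percentile(null_stats, [90, 95, 99])`.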
5. APPLICATIONS
5.1. Finite-sample behaviour
We present some evidence on the finite-sample performance of this kernel-based implementation, and use many nonlinear models and simulations including those of Granger and Lin (1994) as our benchmark. The traditional indices are known to fail in certain instances, sometimes very badly indeed, whereas the new measure is seen to be successful in detecting dependence, and often, revealing the correct dynamic structure. Granger and Lin (1994) considered the following data-generating processes (DGPs) with $\varepsilon_t$ i.i.d. $N(0, 1)$.
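Model 1: $y_t = \varepsilon_t + 0.8\varepsilon_{t-1}^2$
Model 2: $y_t = \varepsilon_t + 0.8\varepsilon_{t-2}^2$
Model 3: $y_t = \varepsilon_t + 0.8\varepsilon_{t-3}^2$
Model 4: $y_t = \varepsilon_t + 0.8\varepsilon_{t-1}^2 + 0.8\varepsilon_{t-2}^2 + 0.8\varepsilon_{t-3}^2$
Model 5: $y_t = |y_{t-1}|^{0.8} + \varepsilon_t$
Model 6: $y_t = \mathrm{sign}(y_{t-1}) + \varepsilon_t$
Model 7: $y_t = 0.8y_{t-1} + \varepsilon_t$
Model 8: $y_t = y_{t-1} + \varepsilon_t$
Model 9: $y_t = 0.6\varepsilon_{t-1}y_{t-2} + \varepsilon_t$
Model 10: $y_t = 4y_{t-1}(1 - y_{t-1})$ for $t > 1$, $0 < y_1 < 1$.
For the simulations that follow, we augment Granger and Lin’s (1994) DGPs with the following.
Model 0: $y_t = \varepsilon_t$
Model 11: $y_t = \sqrt{h_t}\,\varepsilon_t$, $h_t = 0.01 + 0.94h_{t-1} + 0.05y_{t-1}^2$.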
Models 1–4 are nonlinear MA processes of order 1, 2, 3, and 3 respectively. We expect a good measure to exhibit the theoretical properties of these MA processes which require zero ‘dependence’ at lags beyond their nominal orders. Models 5–7 are AR(1) autoregressions with various decaying memory properties. Model 8 is a simple I(1) process with persistent memory, and model 9 is bilinear with white noise characteristics. Model 10 is the logistic function generating chaotic dynamics. Granger and Lin (1994) found the usual correlation function measures to be inadequate in recognizing nonlinear relationships. They found that the relative entropy did very well, and Kendall’s $\tau$ did well for MA processes but not for the AR models or models 9–10. A directly relevant measure, a portmanteau version of the Hellinger index over a number of lags, was shown by Skaug and Tjøstheim (1996) to do very well indeed for ARCH(1), GARCH(1), nonlinear MA, an extended nonlinear MA, and threshold autoregressive of order 1 (TAR(1)). They showed that the correlation function measures, such as Ljung and Box (1978), can fail to detect the dependence and/or the order of the lags. This failure has been observed by others in a variety of settings. The AR(1) model 6 has been further studied in Granger and Terasvirta (1999). The process is Markovian and stationary. Its theoretical autocorrelations should decline exponentially, as would also be expected by a linear stationary AR(1) process. Granger and Terasvirta (1999) observed that the usual autocorrelation measures can point to a fractionally integrated process, indicating long memory where a short-memory process is appropriate. The important nonlinear/switching regime behaviour of this process is lost to correlation measures.
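For readers who wish to reproduce the flavour of these experiments, the twelve DGPs can be simulated along the following lines. This is an illustrative sketch only: the function name `simulate`, the burn-in length, the start-up values, and the use of NumPy's random generator are all assumptions and do not describe the authors' own code.

```python
import numpy as np


def simulate(model, n, rng, burn=200):
    """Draw n observations from model 0-11, discarding `burn` start-up values."""
    m = n + burn
    if model == 10:                        # logistic map, no noise term
        y = np.empty(m)
        y[0] = rng.uniform(0.05, 0.95)     # any start with 0 < y_1 < 1
        for t in range(1, m):
            y[t] = 4.0 * y[t - 1] * (1.0 - y[t - 1])
        return y[burn:]
    e = rng.standard_normal(m)             # eps_t ~ i.i.d. N(0, 1)
    y = e.copy()                           # start-up values for t < 3
    h = 1.0                                # GARCH start: unconditional variance
    for t in range(3, m):
        if model == 0:
            y[t] = e[t]
        elif model in (1, 2, 3):
            y[t] = e[t] + 0.8 * e[t - model] ** 2
        elif model == 4:
            y[t] = e[t] + 0.8 * (e[t - 1] ** 2 + e[t - 2] ** 2 + e[t - 3] ** 2)
        elif model == 5:
            y[t] = abs(y[t - 1]) ** 0.8 + e[t]
        elif model == 6:
            y[t] = np.sign(y[t - 1]) + e[t]
        elif model == 7:
            y[t] = 0.8 * y[t - 1] + e[t]
        elif model == 8:
            y[t] = y[t - 1] + e[t]
        elif model == 9:
            y[t] = 0.6 * e[t - 1] * y[t - 2] + e[t]
        elif model == 11:
            h = 0.01 + 0.94 * h + 0.05 * y[t - 1] ** 2
            y[t] = np.sqrt(h) * e[t]
        else:
            raise ValueError("model must be an integer from 0 to 11")
    return y[burn:]
```

A call such as `simulate(1, 100, np.random.default_rng(42))` returns one replication of model 1 with n = 100. The burn-in matters for the autoregressive and GARCH models and is harmless for the others.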
Model 11 is a GARCH (1,1) process commonly employed to represent the errors in financial applications. Its coefficients were taken from an empirical model of S&P weekly returns obtained by Hong and White (2000), in conjunction with an AR(3) model for the mean. We add model 0, $y_t = \varepsilon_t$, a simple i.i.d. white noise Gaussian process, as a benchmark. In addition, the sampling distribution of model 0 provides one method of obtaining critical values for the proposed test of independence outlined in Section 4. We shall use these models to evaluate the performance of our dependence metric in finite samples. A minimum of 1000 Monte Carlo replications from each model are computed, and K = 10 lags are considered. Code was written in the C programming language. Random number generation employed the portable random number routines ran1 and gasdev which use three linear congruential generators and the Box–Muller method found in Press et al. (1990, p. 210, 216). Likelihood cross-validation was used for each replication for selection of the bandwidths. We have undertaken an extensive range of experiments for models 0 to 11, and have considered sample sizes of n = 50, 75, 100, 150, 200, 300, 400, and 500. We present results of the average value of the $\hat S_\rho$ statistic for each lag for a given model, with the average computed over the total number of Monte Carlo replications. This is therefore analogous to a sample autocorrelation function for linear time-series models. Following Granger and Lin (1994) we tabulate the mean and standard deviation for each lag and model. In addition, the distribution of the statistic is skewed and bounded below by zero, therefore the median and interquartile ranges are also tabulated. We also consider the empirical distribution of the statistic for the i.i.d. white noise process which will be useful for determining significant deviations of $S_\rho$ from zero, the theoretical value of $S_\rho$ for an i.i.d. white noise process.
For the i.i.d. white noise process we tabulate the 90th, 95th, and 99th percentiles of the empirical distribution of $S_\rho$ based upon the Monte Carlo replications. By way of example, consider the simulation results for $n = 100$ for models 1–3, the nonlinear MA processes of order 1, 2, and 3, presented in the following three figures. Each figure plots the average value of $\hat{S}_\rho$ over 1000 Monte Carlo replications for each DGP for lags 1 to 10, along with the average value of $\hat{S}_\rho$ for a white noise process.

[Figures: average $\hat{S}_\rho$ (vertical axis) against lag $K = 1, \ldots, 10$ (horizontal axis), each shown with the white noise benchmark. Panels: Model 1, $y_t = \epsilon_t + 0.8\epsilon_{t-1}^2$; Model 2, $y_t = \epsilon_t + 0.8\epsilon_{t-2}^2$; Model 3, $y_t = \epsilon_t + 0.8\epsilon_{t-3}^2$.]
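Models 1–3 are useful test cases because, assuming Gaussian shocks, their linear autocorrelations vanish at every lag: $\mathrm{cov}(y_t, y_{t-k}) = 0.8\,E[\epsilon^3] = 0$, even though $y_t$ depends on $\epsilon_{t-k}^2$. The sketch below simulates the three processes and contrasts the lag-$k$ correlation of $y_t$ with $y_{t-k}$ against that with $y_{t-k}^2$. The standard normal shocks and the function names are assumptions made for illustration only.

```python
import math
import random


def nonlinear_ma(n, lag, theta=0.8, seed=0):
    """Simulate y_t = e_t + theta * e_{t-lag}^2 with e_t i.i.d. N(0, 1) (assumed)."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, 1.0) for _ in range(n + lag)]  # first `lag` draws are pre-sample shocks
    return [e[t] + theta * e[t - lag] ** 2 for t in range(lag, n + lag)]


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlations(y, lag):
    """Return corr(y_t, y_{t-lag}) and corr(y_t, y_{t-lag}^2)."""
    now, past = y[lag:], y[:-lag]
    return correlation(now, past), correlation(now, [v * v for v in past])


if __name__ == "__main__":
    for k in (1, 2, 3):  # models 1, 2 and 3
        linear, squared = lagged_correlations(nonlinear_ma(20000, k, seed=k), k)
        print(f"model {k}: linear = {linear:+.3f}, with squared lag = {squared:+.3f}")
```

A correlation-based diagnostic sees only the first of the two numbers, which is why a density-based measure is needed to pick up the dependence at lag $k$.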
Blackwell Publishing Ltd 2004
C. W. GRANGER, E. MAASOUMI AND J. RACINE
The statistic clearly detects dependence in each nonlinear process, and it often does so at the correct lag. In addition, the statistic does not behave differently from that for an i.i.d. process for the remaining lags. In order to examine the behaviour of $\hat{S}_\rho$ as $n$ increases, consider the results for model 8 for $n = 100$, 300, and 500 (the leftmost figure is for $n = 100$, the middle for $n = 300$, and the rightmost for $n = 500$).

[Figures: average $\hat{S}_\rho$ (vertical axis) against lag $K = 1, \ldots, 10$ (horizontal axis) for Model 8, $y_t = y_{t-1} + \epsilon_t$, each shown with the white noise benchmark. Panels: $n = 100$, $n = 300$, $n = 500$.]
We also summarize the central tendency and dispersion of $\hat{S}_\rho$ for model 8 for $n = 100$, 300, and 500 in the following tables. The columns represent the lag, mean, median, standard deviation, and interquartile range, in that order.

MODEL 8: $y_t = y_{t-1} + \epsilon_t$

n = 100
Lag   Mean    Median  Std. dev.  IQR
1     0.264   0.261   0.098      0.147
2     0.226   0.219   0.099      0.150
3     0.201   0.192   0.100      0.154
4     0.183   0.171   0.100      0.153
5     0.170   0.155   0.100      0.155
6     0.159   0.142   0.100      0.154
7     0.150   0.131   0.100      0.153
8     0.143   0.121   0.099      0.153
9     0.138   0.115   0.098      0.152
10    0.133   0.109   0.098      0.147

n = 300
Lag   Mean    Median  Std. dev.  IQR
1     0.388   0.393   0.091      0.121
2     0.352   0.354   0.094      0.129
3     0.326   0.326   0.098      0.138
4     0.305   0.303   0.100      0.144
5     0.288   0.287   0.103      0.149
6     0.275   0.274   0.104      0.153
7     0.263   0.262   0.106      0.152
8     0.253   0.251   0.107      0.154
9     0.245   0.241   0.108      0.158
10    0.237   0.231   0.109      0.158

n = 500
Lag   Mean    Median  Std. dev.  IQR
1     0.443   0.443   0.084      0.122
2     0.409   0.406   0.088      0.131
3     0.384   0.379   0.093      0.144
4     0.363   0.358   0.096      0.148
5     0.346   0.340   0.099      0.150
6     0.331   0.325   0.100      0.152
7     0.320   0.312   0.102      0.155
8     0.309   0.301   0.104      0.160
9     0.300   0.293   0.106      0.163
10    0.292   0.282   0.107      0.168
Some important improvements occur as $n$ increases. First, as $n$ increases the values of $\hat{S}_\rho$ for the i.i.d. series decrease and approach zero, as expected. Second, for processes where there is indeed dependence at the $K$th lag, $\hat{S}_\rho$ increases as $n$ increases. These two features are particularly useful for detecting exactly which lags are significant as $n$ grows large. Because these results are representative, and for brevity, we present the remaining simulation results (for $n = 100$ only) in Appendix A. Other results are available from the authors upon request.

A larger set of tests, including the portmanteau version of the $\hat{S}_\rho$ test (the sum of the computed statistic at a finite number of lags), was studied by Skaug and Tjøstheim (1996). These included the portmanteau versions of the Shannon relative entropy, a test by the authors (absolute difference of the two densities), the Autocorrelation Function (ACF) test, the BDS, and the van der Waerden test. Their power results at the 95% level, and for $n = 250$, may be summarized as follows. The $\hat{S}_\rho$ test and the BDS test were often comparable, except that results for BDS at lags higher than 3 were not made available because of computational issues. BDS was well beaten by the $\hat{S}_\rho$ test for the TAR model. Indeed, the $\hat{S}_\rho$ test and the relative entropy (the best here) beat the other tests rather handily for this model, especially at higher lags. The van der Waerden test was often marginally better for the ARCH, GARCH, nonlinear MA and extended nonlinear MA models considered by the authors. The ACF and, surprisingly, Spearman's rank test did quite badly in almost all of the nonlinear cases. The power of all tests declines with the lag order, and is generally best at the 'correct lag'.

The picture that emerges is that the $\hat{S}_\rho$ metric, like the relative entropy, is always close to the best tests (when it is not the best), and maintains very high power for processes that can be missed by tests that are powerful in other situations.
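For reference, the two quantities compared above can be written compactly. The display below is a sketch that assumes the Hellinger-type form of the metric defined earlier in the paper, with $f$ the joint density of $(y_t, y_{t-k})$ and $g$, $h$ its marginals; the argument $k$ on $S_\rho$ is added here only to make the lag explicit.

```latex
% Assumed notation: f = joint density of (y_t, y_{t-k}); g, h = marginal densities.
S_\rho(k) = \frac{1}{2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}
  \left( f^{1/2}(y_t, y_{t-k}) - \left[ g(y_t)\, h(y_{t-k}) \right]^{1/2} \right)^2
  \, dy_t \, dy_{t-k},
\qquad
\text{portmanteau over } k \text{ lags:} \quad \sum_{i=1}^{k} \hat{S}_{\rho, i}.
```

Under independence $f = g\,h$, so the integrand vanishes and $S_\rho(k) = 0$, which is why the white noise values of $\hat{S}_\rho$ shrink toward zero as $n$ grows.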
The power property of the $\hat{S}_\rho$ test thus appears quite robust across nonlinear model classes, an important characteristic in a world of specification uncertainty. These results also confirm serious failings of the traditional measures. The good power of the $\hat{S}_\rho$ measure in detecting memory structure and lags is also notable, and suggests that it may form a suitable basis for constructive specification searches.

5.2. Power comparisons of the permutation test with existing tests (BDS and Ljung–Box)

We now compare the finite-sample power of our proposed test with two widely used tests for serial independence, the BDS and Ljung–Box tests. The Ljung–Box
test is correlation-based, while the BDS test has its origins in the deterministic chaos literature (see Brock et al., 1996). We shall consider testing $H_0$: $x_t$ is serially independent, $t = 1, 2, \ldots, T$, and by way of example we consider the logistic map (model 10), a deterministic chaotic process. Given the origins of the BDS test, it seems fitting to proceed with this model.

A few words on the size of the BDS test are in order. The BDS statistic has a limiting $N(0, 1)$ null distribution. However, we too have observed that, even in large samples, the asymptotic critical values provide a test that has invalid size (see Maasoumi and Racine, 2002, for examples). It has been suggested that one not use the asymptotic-based test for samples smaller than $n = 500$. We find the preferred approach to be a permutation-based version of the test, which indeed has correct size. We therefore use the more favourable (to BDS) permutation-based version of the test (see Note 3). In addition to the test's size issues, the BDS requires the user to set the embedding dimension ($m$) and the dimension distance ($\epsilon$), the choice of which can rather dramatically affect the test's power.

First, we consider one draw from a logistic map for a sample of size $n = 500$. We construct the Ljung–Box, BDS permutation, and $S_\rho$ permutation tests for nominal size $\alpha = 0.05$. We graph the ACF and $S_\rho$ for lags $k = 1, 2, \ldots, 10$ along with their pointwise critical values (dotted line) under $H_0$ in Figure 1. Table I presents the BDS test statistic for recommended ranges of $m$ and $\epsilon$, highlighting values which are significant at the $\alpha = 0.05$ level. Based upon Figure 1 we observe that, for this sample of $n = 500$ drawn from the logistic map, the ACF fails to detect this (nonlinear) alternative and would lead one to conclude falsely that this series is serially independent (see Note 4). Examining Table I, we see that the BDS test gives conflicting results depending on one's choice of $m$ and $\epsilon$.
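The role of $m$ and $\epsilon$ can be seen in the correlation integral on which the BDS statistic is built. The sketch below computes $C_m(\epsilon)$, the share of pairs of $m$-histories lying within $\epsilon$ of each other, and the raw gap $C_m(\epsilon) - C_1(\epsilon)^m$, which is close to zero for an i.i.d. series. It is only the unstandardised ingredient of the test, not the BDS statistic itself (no variance estimate or critical values are computed), and the function names are hypothetical.

```python
import math
import random


def logistic_map(n, seed=0, burn_in=100):
    """Model 10: y_t = 4 y_{t-1} (1 - y_{t-1}), started from a random point."""
    rng = random.Random(seed)
    y = rng.uniform(0.1, 0.9)
    path = []
    for t in range(n + burn_in):
        y = 4.0 * y * (1.0 - y)
        if t >= burn_in:
            path.append(y)
    return path


def std_dev(x):
    """Sample standard deviation."""
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / (len(x) - 1))


def correlation_integral(x, m, eps):
    """Share of pairs of m-histories within eps of each other (sup norm)."""
    count = len(x) - m + 1
    close = 0
    for s in range(count - 1):
        for t in range(s + 1, count):
            if all(abs(x[s + j] - x[t + j]) < eps for j in range(m)):
                close += 1
    return 2.0 * close / (count * (count - 1))


def bds_gap(x, m, eps):
    """C_m(eps) - C_1(eps)^m: near zero under independence."""
    return correlation_integral(x, m, eps) - correlation_integral(x, 1, eps) ** m


if __name__ == "__main__":
    series = logistic_map(200, seed=3)
    sd = std_dev(series)
    for multiple in (0.5, 1.0, 1.5, 2.0):  # eps as a multiple of the standard deviation
        gaps = ", ".join(f"m={m}: {bds_gap(series, m, multiple * sd):+.3f}" for m in (2, 4))
        print(f"eps = {multiple:.1f} sd -> {gaps}")
```

Printing the gap over a grid of $\epsilon$ multiples and embedding dimensions makes the dependence of the raw ingredient on these user-chosen settings directly visible.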
One often encounters advice on setting $\epsilon$ equal to 1 or 1.5 times the series' standard error, and setting $m$ in the range 2,…,8, but we can observe that the outcome of the test hinges crucially on this choice: one can fail to reject or
Figure 1. The ACF and $S_\rho$ test statistics for one draw from the logistic map ($n = 500$). [Left panel: ACF ($r$) and 95% critical values against lag $K = 1, \ldots, 10$. Right panel: dependence metric ($S_\rho$) and 95% critical values against lag $K = 1, \ldots, 10$.]
reject for a range of $m$ depending on one's choice of $\epsilon$. The $S_\rho$ test, by contrast, shows no such weakness.

A more serious comparison of power can only be obtained via Monte Carlo simulation. We therefore consider a modest simulation, again using this DGP, in order to compare the power of the three tests more convincingly. We draw 1000 replications from model 10, construct each test as described above, and tabulate empirical rejection frequencies over the 1000 Monte Carlo replications. As we implement the permutation-based version of the BDS test, it is valid to apply the test in small samples; we therefore conduct a power comparison for a sample size of $n = 50$. Results are summarized in Tables II and III, and as the Ljung–Box test has power that does not differ significantly from
TABLE I
The BDS Test Statistic for One Draw from the Logistic Map ($n = 500$) for a Recommended Range of Values of $m$ and $\epsilon$. Entries Marked with an Asterisk are Significant at the $\alpha = 0.05$ Level using Permutation-based (and Asymptotic-based) Critical Values

m    $\epsilon = 0.36$ ($\hat{\sigma}_x$)    $\epsilon = 0.53$ ($1.5\hat{\sigma}_x$)
2    175.03*                                 8.23*
3    149.44*                                 2.17
4    143.36*                                 1.98
5    134.22*                                 0.89
6    129.01*                                 0.01
7    132.96*                                 −0.48
8    138.91*                                 −0.45
TABLE II
Empirical Rejection Frequency for the BDS Test, $\alpha = 0.05$, $n = 50$

m    $0.5\sigma_x$   $1.0\sigma_x$   $1.5\sigma_x$   $2.0\sigma_x$
2    0.798           0.540           0.222           0.746
3    0.766           0.464           0.170           0.642
4    0.750           0.340           0.182           0.552
5    0.776           0.296           0.172           0.498
6    0.776           0.284           0.154           0.470
7    0.788           0.222           0.088           0.378
8    0.750           0.198           0.056           0.330
TABLE III
Empirical Rejection Frequency for the $S_\rho$ Test, $\alpha = 0.05$, $n = 50$

Metric                                 K = 1   K = 2   K = 3   K = 4   K = 5
$\hat{S}_\rho$                         0.920   0.790   0.444   0.204   0.086
$\sum_{i=1}^{k} \hat{S}_{\rho,i}$      0.920   0.936   0.876   0.722   0.476
size, we omit these results for space considerations. We also include the permutation-based portmanteau version of the $\hat{S}_\rho$ test mentioned earlier, denoted $\sum_{i=1}^{k} \hat{S}_{\rho,i}$. We observe that, for samples that are extremely small by kernel estimator standards, our proposed test has high power, rejecting the null of serial independence at lag $k = 1$ with power close to unity. The BDS test, however, has power ranging from 0.058 for $m = 8$ and $\epsilon = 1.5\hat{\sigma}_x$ to 0.798 for $m = 2$ and $\epsilon = 0.5\hat{\sigma}_x$.
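The one-draw comparison in this section can be imitated with a short sketch that applies the Ljung–Box test and a permutation test to a single draw from the logistic map. This is an illustration under stated assumptions rather than the authors' implementation: the metric is taken to be the Hellinger-type form $\frac{1}{2}\int\int (f^{1/2} - (gh)^{1/2})^2$, the bandwidth is a rule-of-thumb choice rather than likelihood cross-validation, the integral is a coarse Riemann sum on a grid, and all function names are hypothetical.

```python
import math
import random

CHI2_95_DF10 = 18.307  # 95th percentile of a chi-square with 10 degrees of freedom


def logistic_map(n, seed=0, burn_in=100):
    """Model 10: y_t = 4 y_{t-1} (1 - y_{t-1}), started from a random point."""
    rng = random.Random(seed)
    y = rng.uniform(0.1, 0.9)
    path = []
    for t in range(n + burn_in):
        y = 4.0 * y * (1.0 - y)
        if t >= burn_in:
            path.append(y)
    return path


def ljung_box(y, max_lag=10):
    """Ljung-Box Q = n (n + 2) * sum_k r_k^2 / (n - k) over lags 1..max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


def s_rho(y, lag, grid_size=20):
    """Grid approximation to 0.5 * integral of (sqrt(joint) - sqrt(product))^2."""
    now, past = y[lag:], y[:-lag]
    n = len(now)
    mean = sum(y) / len(y)
    sd = math.sqrt(sum((v - mean) ** 2 for v in y) / (len(y) - 1))
    h = 1.06 * sd * n ** (-1.0 / 6.0)  # rule-of-thumb bandwidth (assumption, not cross-validated)
    lo, hi = min(y) - 3.0 * h, max(y) + 3.0 * h
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    norm = 1.0 / (h * math.sqrt(2.0 * math.pi))

    def weights(data):
        # Gaussian kernel weight of every observation at every grid point.
        return [[norm * math.exp(-0.5 * ((g - v) / h) ** 2) for v in data] for g in grid]

    w_now, w_past = weights(now), weights(past)
    f_now = [sum(row) / n for row in w_now]
    f_past = [sum(row) / n for row in w_past]
    total = 0.0
    for i in range(grid_size):
        for j in range(grid_size):
            joint = sum(a * b for a, b in zip(w_now[i], w_past[j])) / n
            total += (math.sqrt(joint) - math.sqrt(f_now[i] * f_past[j])) ** 2
    return 0.5 * total * step * step


def permutation_pvalue(y, lag, n_perm=99, seed=0):
    """Shuffle the series to mimic independence and compare with the observed value."""
    rng = random.Random(seed)
    observed = s_rho(y, lag)
    shuffled = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if s_rho(shuffled, lag) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    draw = logistic_map(100, seed=1)
    q = ljung_box(draw)
    stat, p = permutation_pvalue(draw, lag=1, n_perm=49)
    print(f"Ljung-Box Q(10) = {q:.2f} (5% critical value {CHI2_95_DF10})")
    print(f"S_rho at lag 1 = {stat:.3f}, permutation p-value = {p:.3f}")
```

Shuffling destroys any serial structure while leaving the marginal distribution untouched, so the shuffled statistics approximate the null distribution without relying on asymptotic critical values.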
6. CONCLUSIONS
We believe that the proposed metric $S_\rho$ shows strong promise as a general statistic for detecting generic association and serial dependence in a time series. A new white-noise test is proposed, and applications to nonlinear time series demonstrate the value added by the proposed approach relative to traditional measures. Further work on the constructive specification search applications of this measure is in progress, as is work on its utility in more general testing for causality and exogeneity.
APPENDIX A
$\hat{S}_\rho$ VERSUS $K$: $n = 100$

[Figures: average $\hat{S}_\rho$ (vertical axis) against lag $K = 1, \ldots, 10$ (horizontal axis), each shown with the white noise benchmark. Panels:
Model 4: $y_t = \epsilon_t + 0.8(\epsilon_{t-1}^2 + \epsilon_{t-2}^2 + \epsilon_{t-3}^2)$
Model 5: $y_t = |y_{t-1}|^{0.8} + \epsilon_t$
Model 6: $y_t = \sin(y_{t-1}) + \epsilon_t$
Model 7: $y_t = 0.8 y_{t-1} + \epsilon_t$
Model 9: $y_t = 0.6 \epsilon_{t-1} y_{t-2} + \epsilon_t$
Model 10: $y_t = 4 y_{t-1}(1 - y_{t-1})$
Model 11: $y_t = h_t^{1/2} U_t$, $h_t = 0.01 + 0.94 h_{t-1} + 0.05 U_{t-1}^2$]
APPENDIX B
MEAN, MEDIAN, STANDARD DEVIATION, AND INTERQUARTILE RANGE OF $\hat{S}_\rho$: $n = 100$

MODEL 0: $y_t = \epsilon_t$
Lag         1      2      3      4      5      6      7      8      9      10
Mean        0.010  0.010  0.010  0.010  0.010  0.010  0.011  0.011  0.011  0.011
Median      0.007  0.007  0.007  0.007  0.007  0.007  0.007  0.007  0.007  0.007
Std. dev.   0.014  0.014  0.013  0.013  0.014  0.014  0.014  0.014  0.014  0.014
IQR         0.005  0.005  0.005  0.005  0.005  0.005  0.005  0.005  0.005  0.005

MODEL 1: $y_t = \epsilon_t + 0.8\epsilon_{t-1}^2$
Lag         1      2      3      4      5      6      7      8      9      10
Mean        0.021  0.010  0.010  0.010  0.010  0.010  0.011  0.011  0.011  0.011
Median      0.017  0.008  0.008  0.008  0.008  0.008  0.008  0.008  0.008  0.008
Std. dev.   0.016  0.011  0.011  0.011  0.012  0.011  0.012  0.012  0.012  0.012
IQR         0.012  0.005  0.006  0.005  0.005  0.006  0.006  0.006  0.006  0.006

MODEL 2: $y_t = \epsilon_t + 0.8\epsilon_{t-2}^2$
Lag         1      2      3      4      5      6      7      8      9      10
Mean        0.010  0.021  0.010  0.010  0.010  0.010  0.010  0.010  0.010  0.011
Median      0.008  0.016  0.008  0.008  0.008  0.008  0.008  0.008  0.008  0.008
Std. dev.   0.010  0.014  0.009  0.009  0.010  0.010  0.009  0.009  0.010  0.010
IQR         0.005  0.012  0.005  0.006  0.006  0.005  0.006  0.005  0.006  0.006
MODEL 3: $y_t = \epsilon_t + 0.8\epsilon_{t-3}^2$
Lag         1      2      3      4      5      6      7      8      9      10
Mean        0.011  0.010  0.021  0.011  0.011  0.011  0.011  0.011  0.011  0.011
Median      0.008  0.008  0.017  0.008  0.008  0.008  0.008  0.008  0.008  0.008
Std. dev.   0.011  0.011  0.016  0.011  0.011  0.011  0.011  0.011  0.011  0.011
IQR         0.005  0.005  0.012  0.006  0.006  0.006  0.005  0.006  0.006  0.006

MODEL 4: $y_t = \epsilon_t + 0.8(\epsilon_{t-1}^2 + \epsilon_{t-2}^2 + \epsilon_{t-3}^2)$
Lag         1      2      3      4      5      6      7      8      9      10
Mean        0.048  0.024  0.018  0.016  0.016  0.016  0.017  0.017  0.017  0.017
Median      0.043  0.020  0.013  0.012  0.013  0.013  0.013  0.013  0.013  0.013
Std. dev.   0.024  0.017  0.015  0.015  0.014  0.015  0.015  0.015  0.015  0.015
IQR         0.024  0.014  0.011  0.010  0.010  0.010  0.010  0.010  0.010  0.010

MODEL 5: $y_t = |y_{t-1}|^{0.8} + \epsilon_t$
Lag         1      2      3      4      5      6      7      8      9      10
Mean        0.042  0.020  0.014  0.013  0.013  0.012  0.013  0.013  0.013  0.013
Median      0.037  0.016  0.010  0.009  0.008  0.008  0.008  0.008  0.008  0.008
Std. dev.   0.024  0.017  0.015  0.015  0.015  0.015  0.015  0.015  0.015  0.015
IQR         0.023  0.014  0.009  0.007  0.007  0.007  0.007  0.007  0.007  0.007
MODEL 6: $y_t = \mathrm{sign}(y_{t-1}) + \epsilon_t$
Lag         1      2      3      4      5      6      7      8      9      10
Mean        0.057  0.029  0.019  0.016  0.014  0.014  0.014  0.014  0.014  0.014
Median      0.051  0.024  0.015  0.011  0.010  0.009  0.009  0.009  0.009  0.009
Std. dev.   0.028  0.021  0.018  0.016  0.016  0.016  0.016  0.016  0.016  0.017
IQR         0.028  0.019  0.013  0.010  0.009  0.008  0.008  0.008  0.008  0.008

MODEL 7: $y_t = 0.8 y_{t-1} + \epsilon_t$
Lag         1      2      3      4      5      6      7      8      9      10
Mean        0.087  0.052  0.036  0.028  0.023  0.021  0.020  0.019  0.019  0.019
Median      0.080  0.046  0.030  0.022  0.017  0.015  0.014  0.014  0.014  0.014
Std. dev.   0.037  0.029  0.025  0.022  0.020  0.019  0.019  0.018  0.018  0.018
IQR         0.044  0.034  0.027  0.022  0.018  0.016  0.014  0.013  0.013  0.013

MODEL 9: $y_t = 0.6 \epsilon_{t-1} y_{t-2} + \epsilon_t$
Lag         1      2      3      4      5      6      7      8      9      10
Mean        0.012  0.014  0.010  0.010  0.010  0.010  0.010  0.010  0.010  0.010
Median      0.009  0.010  0.007  0.007  0.007  0.007  0.007  0.007  0.007  0.007
Std. dev.   0.012  0.012  0.010  0.010  0.010  0.010  0.010  0.010  0.010  0.010
IQR         0.007  0.008  0.005  0.005  0.005  0.005  0.005  0.005  0.005  0.005
MODEL 10: $y_t = 4 y_{t-1}(1 - y_{t-1})$
Lag         1      2      3      4      5      6      7      8      9      10
Mean        0.472  0.340  0.234  0.176  0.160  0.160  0.161  0.161  0.162  0.163
Median      0.469  0.333  0.227  0.164  0.148  0.149  0.150  0.151  0.152  0.152
Std. dev.   0.066  0.117  0.087  0.089  0.076  0.072  0.071  0.071  0.072  0.076
IQR         0.084  0.113  0.121  0.118  0.099  0.093  0.094  0.094  0.096  0.095

MODEL 11: $y_t = h_t^{1/2} \epsilon_t$
Lag         1      2      3      4      5      6      7      8      9      10
Mean        0.019  0.019  0.019  0.020  0.019  0.019  0.020  0.020  0.020  0.020
Median      0.012  0.012  0.011  0.012  0.012  0.012  0.012  0.012  0.013  0.012
Std. dev.   0.020  0.022  0.021  0.023  0.020  0.021  0.022  0.022  0.022  0.022
IQR         0.017  0.017  0.016  0.017  0.015  0.018  0.017  0.017  0.017  0.017
ACKNOWLEDGEMENTS
We would like to thank, without implicating, several referees, the Editor, Timo Terasvirta and seminar participants in the UK, Canada, Australia, Sweden, and the US. Racine thanks the Center for Policy Research for its generous support.
NOTES
1. The BDS test may also be interpreted as an unusual measure of divergence between the moments of the joint and the marginal distributions (see Brock, Dechert, Scheinkman and LeBaron, 1996).
2. Strictly speaking, $\hat{f}(x)$ estimates the local time spent in the immediate vicinity of a point that is determined by the value of $x$.
3. This will also permit direct power comparisons between our proposed permutation-based test and the size-adjusted (i.e. permutation-based) BDS test.
4. This has been noted by Granger and Lin (1994, p. 379).

Corresponding author: E. Maasoumi, Department of Economics, Southern Methodist University, Dallas, TX 75275-0496, USA. Tel.: (214) 768-4298; E-mail: [email protected]
REFERENCES
Ahmad, I. and Li, Q. (1997) Testing independence by nonparametric kernel method. Statistics and Probability Letters 34, 201–10.
Aparicio, F. and Escribano, A. (1998) Information-theoretic analysis of serial dependence and cointegration. Studies in Nonlinear Dynamics and Econometrics 3, 119–40.
Brock, W., Dechert, W., Scheinkman, J. and LeBaron, B. (1996) A test for independence based on the correlation dimension. Econometric Reviews 15, 197–236.
Chan, H. and Tran, T. L. (1992) Nonparametric tests for serial dependence. Journal of Time Series Analysis 13, 169–75.
Chen, Y. and Kuan, C. (2002) Time irreversibility and EGARCH effects in US stock index returns. Journal of Applied Econometrics 17, 565–78.
Delgado, M. (1996) Testing serial independence using the sample distribution function. The Journal of Time Series Analysis 17, 271–85.
Efron, B. and Tibshirani, R. (1993) An Introduction to the Bootstrap. New York, London: Chapman and Hall.
Genest, C. and MacKay, R. J. (1986) The joy of copulas: bivariate distribution with uniform marginals. The American Statistician 40, 280–3.
Granger, C. and Lin, J. L. (1994) Using the mutual information coefficient to identify lags in nonlinear models. The Journal of Time Series Analysis 15, 371–84.
Granger, C. W. and Terasvirta, T. (1999) A simple nonlinear time series model with misleading linear properties. Economics Letters 62, 161–5.
Granger, C. W., Maasoumi, E. and Racine, J. (2002) A dependence metric for possibly nonlinear processes. Working paper, Department of Economics, SMU.
Hamilton, J. D. (1993) Estimation, inference, and forecasting of time series subject to changes in regime. In Handbook of Statistics Vol. 11 (eds G. Maddala, C. Rao and H. Vinod). North Holland, Amsterdam.
Havrda, J. and Charvat, F. (1967) Quantification method of classification processes: concept of structural a-entropy. Kybernetika Cislo I. Rocnik 3, 30–4.
Hirschberg, D., Maasoumi, E. and Slottje, D. (2001) Clusters of attributes and well-being in the US. Journal of Applied Econometrics 61, 445–60.
Hong, Y. (1998) Testing for pairwise serial independence via the empirical distribution function. Journal of the Royal Statistical Society Series B (Statistical Methodology) 60, 429–53.
Hong, Y. (1999) Testing serial independence via the empirical characteristic function. Technical Report, Department of Economics, Cornell University, Ithaca, NY, USA.
Hong, Y. and Lee, T. W. (2003) Diagnostic checking for the adequacy of nonlinear time series models. Econometric Theory 19, 1065–121.
Hong, Y. and White, H. (2000) Asymptotic distribution theory for nonparametric entropy measures of serial dependence. Mimeo, Department of Economics, Cornell University, and UCSD.
Hsieh, D. (1989) Testing for nonlinear dependence in foreign exchange rates: 1974–1983. Journal of Business 62, 339–68.
Joe, H. (1989) Relative entropy measures of multivariate dependence. Journal of the American Statistical Association 84, 157–64.
Karlsen, H. A. and Tjøstheim, D. (2001) Nonparametric estimation in null recurrent time series. The Annals of Statistics 29, 372–416.
Lau, H. T. (1995) A Numerical Library in C for Scientists and Engineers. Tokyo: CRC Press.
Li, Q. and Racine, J. (2003) Nonparametric estimation of distributions with categorical and continuous data. Journal of Multivariate Analysis 86, 266–92.
Ljung, G. and Box, G. (1978) On a measure of lack of fit in time series models. Biometrika 65, 297–303.
Maasoumi, E. and Racine, J. (2002) Entropy and predictability of stock market returns. Journal of Econometrics 107, 291–312.
Nelsen, R. (1999) An Introduction to Copulas. Springer-Verlag, Berlin.
Pagan, A. and Schwert, G. (1990) Alternative models for conditional stock volatility. Journal of Econometrics 45, 267–90.
Parzen, E. (1962) On estimation of a probability density function and mode. The Annals of Mathematical Statistics 33, 1065–76.
Perron, P. (1989) The great crash the oil price shock and the unit root hypothesis. Econometrica 57, 1361–401.
Phillips, P. C. B. and Park, J. Y. (1998) Nonstationary density estimation and kernel autoregression. Cowles paper CFDP# 1181.
Press, W. H., Flannery, B. P., Teukolsky, S. A. and Vetterling, W. T. (1990) Numerical Recipes in C. New York: Cambridge University Press.
Qi, M. (1999) Nonlinear predictability of stock returns using financial and economic variables. Journal of Business and Economic Statistics 17, 419–29.
Robinson, P. M. (1991) Consistent nonparametric entropy-based testing. Review of Economic Studies 58, 437–53.
Silverman, B. W. (1986) Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.
Skaug, H. and Tjøstheim, D. (1993) Nonparametric tests of serial independence. In Developments in Time Series Analysis (ed. S. Rao). London, Chapman and Hall, pp. 207–29.
Skaug, H. and Tjøstheim, D. (1996) Testing for serial independence using measures of distance between densities. In Athens Conference on Applied Probability and Time Series (eds P. Robinson and M. Rosenblatt). Springer Lecture Notes in Statistics, Springer.
Tjøstheim, D. (1996) Measures and tests of independence: a survey. Statistics 28, 249–84.