Testing for a unit root in a stationary ESTAR process

Forthcoming in Econometric Reviews

Rehim Kılıç∗

Abstract

This paper develops a statistic for testing the null of a linear unit root process against the alternative of a stationary exponential smooth transition autoregressive model. The asymptotic distribution of the test is shown to be nonstandard but nuisance parameter-free and hence critical values are obtained by simulations. Simulations show that the proposed statistic has considerable power under various data generating scenarios. Applications to real exchange rates also illustrate the ability of our test to reject the null of a unit root when some of the alternative tests do not.

Key Words: ESTAR model, unit root, nonlinearity.
JEL Classification: C12, C22, F41.



∗ School of Economics, Georgia Institute of Technology, 221 Bobby Dodd Way, Atlanta, GA 30332-0615, e-mail [email protected].


1 Introduction

There is a growing literature that aims to discriminate non-stationarity from nonlinearity. Although the standard unit root tests (e.g. the Augmented Dickey-Fuller (ADF) test due to Dickey and Fuller 1979 and Said and Dickey 1984, and the test of Phillips and Perron 1988) should be consistent against “stationary” nonlinear alternatives, their power turns out to be quite low. Simulation studies in Balke and Fomby (1997), Pippenger and Goering (1993) and Taylor (2001) have shown that the power of the conventional unit root tests can be dramatically low against nonlinear alternatives. This lack of power has motivated the development of new testing approaches that consider the nonlinear processes explicitly. Among others, Enders and Granger (1998), González and Gonzalo (1998), Berben and van Dijk (1999), Caner and Hansen (2001), Seo (2003), Bec et al. (2004), Kapetanios and Shin (2003, 2006), de Jong et al. (2007) and Bec et al. (2008) suggested tests in the context of Threshold Autoregressive (TAR) models. On the other hand, Kapetanios et al. (2003), Park and Shintani (2005), Rothe and Sibbertsen (2006) and Kruse (2008) proposed tests that have a specific exponential smooth transition autoregressive (ESTAR) model with lagged level as the transition variable under the alternative. The current paper is part of this recent literature that aims to discriminate a linear unit root process from a “stationary” nonlinear ESTAR alternative.

The main contributions of this paper are twofold. First, we provide a new statistic for unit root testing in the context of an ESTAR process for which conditions for strict stationarity and ergodicity are known under some assumptions. In the ESTAR model we consider, the transition parameter is unidentified under the null of a unit root, and the test statistic is based on a one-dimensional grid search over this unidentified parameter space.
We suggest using as the test statistic the smallest negative t-value over a fixed parameter space that is normalized by the sample standard deviation of the transition variable. Since the transition variable is stationary and the parameter space is well defined asymptotically, the limit distribution of our test does not depend on the space over which the t-statistic is optimized. The asymptotic distribution of the test statistic is derived under both the null and the alternative hypotheses. It is shown that the asymptotic distribution of the test statistic under the null is nonstandard but pivotal. Therefore, the critical values can be computed by simulations. Moreover, we show that under the alternative the test statistic, normalized by $T^{-1/2}$, converges to a negative constant, which implies the consistency of the test against the specific ESTAR process we consider.

Second, we investigate the finite sample performance of the recently proposed unit root tests of Kapetanios et al. (2003), Park and Shintani (2005) and Bec et al. (2008), together with our test (denoted by $t_N$, inf-t, $W_B^{Sup}$, and $t_{ESTAR}$ respectively), in the context of two ESTAR specifications and a TAR specification by simulations.¹ Simulation results show that our proposed test statistic outperforms the others when the alternative is an ESTAR model with lagged difference as the transition variable, irrespective of how persistent the process is. The power gain from our test is often quite substantial relative to all the alternative tests. Results also show that the suggested procedure works relatively well when the alternative is an ESTAR model with lagged level as the transition variable (i.e. under the alternative of $t_N$ and inf-t) or the TAR model (i.e. under the alternative of $W_B^{Sup}$). Consistent with the findings in Kapetanios et al. (2003), under the alternative of the $t_N$ and inf-t tests, the $t_N$ test performs better than the alternatives, especially when the parameter that characterizes the degree of smoothness of the transition function in the ESTAR model is small. Moreover, the inf-t test tends to perform better than the alternatives as the speed of nonlinear adjustment increases in the ESTAR model. When the TAR is the alternative model, $t_{ESTAR}$, $t_N$ and inf-t perform equally well with few exceptions. As expected, the test of Bec et al. (2008) performs better than the alternatives under this alternative.

¹ We have also compared the size and power of the conventional ADF statistic as well as the test suggested by Rothe and Sibbertsen (2006). Simulations show that the $t_{ESTAR}$ test outperforms these tests in finite samples. The test of Rothe and Sibbertsen (2006) is very similar to the $t_N$ test; the two differ essentially only in the way they control for serial correlation. These results can be obtained upon request.

Applications to monthly and quarterly real exchange rates reveal considerable evidence against the unit root null from all the tests. However, the strongest evidence is obtained from the $t_{ESTAR}$ statistic. Estimation and diagnostic results reported in the paper show important nonlinear dynamics that can be modeled adequately by the ESTAR model presumed under the alternative of the $t_{ESTAR}$ test. Simulations and applications reported in this paper, as well as applications by Paya and Peel (2006) to the real Dollar-Sterling exchange rate between 1871 and 1994 and by Christopoulos and León-Ledesma (2008) in the context of current account dynamics, provide considerable evidence on the usefulness of the $t_{ESTAR}$ statistic in distinguishing a linear unit root from a stationary ESTAR process.

The rest of the paper is organized as follows. Section 2 presents the stationary ESTAR process and discusses motivating examples, and Section 3 develops the proposed test statistic, derives its asymptotic distribution and provides asymptotic critical values. Section 4 presents the results of simulation experiments. Section 5 contains the empirical application and the last section concludes the paper. The mathematical proofs are presented in the Appendix.

2 Motivation and the ESTAR Model

The ESTAR model was introduced by Haggan and Ozaki (1981) and reassessed by Granger and Teräsvirta (1993) and Teräsvirta (1994). A survey of recent developments in ESTAR modelling is given by van Dijk et al. (2002). In this paper we consider the following representation of the ESTAR model,

$$\Delta y_t = \phi y_{t-1} F(\gamma, z_t) + u_t, \qquad t = 1, \ldots, T, \tag{1}$$

where $u_t$ is a stationary process, $F(\gamma, z_t) = 1 - \exp(-\gamma z_t^2)$, $\phi$ and $\gamma$ are unknown parameters, and $z_t = \Delta y_{t-d}$ for $d \in \{1, \ldots, d_{max}\}$ is the transition variable. The transition parameter $\gamma$ determines the speed of transition between two extreme regimes, with low values of $\gamma$ implying slower transition. For $\gamma = 0$, Equation (1) becomes a linear AR(1) model with a unit root,

$$y_t = y_{t-1} + u_t. \tag{2}$$

The outer regime is obtained as $\gamma \to \infty$ and corresponds to the AR(1) model

$$y_t = (1 + \phi) y_{t-1} + u_t. \tag{3}$$
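To make the two regimes concrete, the following is a minimal simulation sketch of the model in (1) with $z_t = \Delta y_{t-d}$. The function name `simulate_estar`, the Gaussian errors, the zero initial values and the burn-in length are illustrative assumptions of this sketch and are not part of the model specification.

```python
import numpy as np

def simulate_estar(T, phi, gamma, d=1, burn=200, seed=0):
    """Simulate Delta y_t = phi * y_{t-1} * F(gamma, z_t) + u_t, z_t = Delta y_{t-d}.

    Assumptions of this sketch: i.i.d. N(0, 1) errors, y_0 = 0, and a burn-in
    period that is discarded before returning the last T observations.
    """
    rng = np.random.default_rng(seed)
    n = T + burn
    u = rng.standard_normal(n)
    y = np.zeros(n)
    dy = np.zeros(n)
    for t in range(1, n):
        z = dy[t - d] if t - d >= 0 else 0.0     # transition variable
        F = 1.0 - np.exp(-gamma * z**2)          # exponential transition function
        dy[t] = phi * y[t - 1] * F + u[t]
        y[t] = y[t - 1] + dy[t]
    return y[burn:]

# gamma = 0 switches the transition function off and gives the random walk in (2);
# a positive gamma with phi < 0 pulls the series back after large past changes.
y_unit_root = simulate_estar(200, phi=-0.5, gamma=0.0)
y_estar = simulate_estar(200, phi=-0.5, gamma=0.25)
```

With `gamma=0.0` the recursion collapses to a cumulative sum of the errors, which is a quick way to check the code against Equation (2).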

Following Kapetanios et al. (2003) and Park and Shintani (2005), we consider the model with serial correlation under the assumption that serially correlated errors enter in a linear way. Under this assumption, we re-specify the model in (1) as follows:

$$\Delta y_t = \sum_{i=1}^{p} \delta_i \Delta y_{t-i} + \phi y_{t-1}\left(1 - \exp(-\gamma z_t^2)\right) + u_t, \tag{4}$$

where $\delta(L) = 1 - \sum_{j=1}^{p} \delta_j L^j$ is assumed to have all roots outside the unit circle. Conditions for geometric ergodicity and mixing conditions for the process in Eqn. (4) are discussed in Doukhan (1994). Doukhan (1994, Theorem 7, p. 102) shows that the process generated by (1) and (4) with $z_t = \Delta y_{t-d}$ is strictly stationary and strong mixing under Assumption A below, provided that $|\phi + 1| < 1$ in (1) and that the roots of the polynomial $1 - (1 + \phi + \delta_1)L - \sum_{j=2}^{p}(\delta_j - \delta_{j-1})L^j + \delta_p L^{p+1}$ are outside the unit circle in (4). More formally, observe that the model in Eqn. (4) can be written as $y_t = f(y_{t-1}, y_{t-2}, \ldots, y_{t-p-1}) + u_t$, where $f(\cdot) = \phi y_{t-1} F(\gamma, z_t) + \sum_{i=1}^{p+1} \beta_i y_{t-i}$, $\beta_1 = 1 + \delta_1$, $\beta_i = \delta_i - \delta_{i-1}$ for $i = 2, 3, \ldots, p$, $\beta_{p+1} = -\delta_p$, and $u_t = y_t - f(y_{t-1}, \ldots, y_{t-p-1})$. From the definition of $u_t$ it is readily seen that $Z_t = [y_t \cdots y_{t-p-1} \; u_t]'$ is a Markov chain on $\mathbb{R}^{p+2} \times \mathbb{R}$. Given that Assumption A holds and the transition variable $z_t$ is stationary, the Markov chain $\{Z_t\}$ is irreducible by Lemma 2 of Doukhan (1994, p. 101). Moreover, given that the transition function is bounded and satisfies the Lipschitz condition (i.e. $|F(\gamma_1, z_t) - F(\gamma_2, z_t)| \le \kappa |\gamma_1 - \gamma_2|$ for some $0 < \kappa < 1$ and all $\gamma_1, \gamma_2 \in \mathbb{R}_+$), the time series process $\{y_t\}$ is geometrically ergodic (and hence strictly stationary and strong mixing) under the parameter restrictions discussed above.²

² As can be seen from the discussion above and the results in Doukhan (1994), if the transition variable is not stationary then the process may stay in one of the extreme regimes indefinitely with positive probability, causing nonstationary transition probabilities in the Markov chain representation of the underlying time series. Therefore, we use the delayed difference of $y_t$ as the transition variable. We recognize that this might not be the most relevant choice in some applications, as it might be desirable to allow the level of $y_t$ to characterize nonlinear dynamics. However, our use of the difference of $y_t$ allows us to test the null of a unit root against a stationary alternative, which is not the case for the majority of the tests available in the literature, where one has to assume stationarity under the alternative.

Under the above conditions, the models in (1) and (4) indicate that whenever the $d$-period lagged growth rate or change in the dependent variable is very small, or whenever the speed of adjustment approaches zero, $y_t$ behaves approximately like a unit root process. On the other hand, when the lagged change in the dependent variable is large in either the positive or the negative direction, the process becomes increasingly mean reverting, as $\phi$ must be less than zero. However, as Seo (2003), Park and Shintani (2005) and de Jong et al. (2007) point out, there appear to be no general results that establish weak dependence or stationarity properties of the model if the errors $u_t$ are not i.i.d., for example when they form a martingale difference sequence with conditional heteroscedasticity as in Park and Shintani (2005). The issue appears to be that results such as those of Doukhan rely on Markov chain theory that has no immediate equivalent if the errors are not i.i.d.

ESTAR models have been used to study the dynamics of real exchange rates (see Michael et al. 1997, Taylor et al. 2001, Taylor 2001, and Kilian and Taylor 2003 among others). The use of the lagged level of real exchange rates in these models is largely consistent with the idea that lagged deviations of real exchange rates from a “constant” equilibrium characterize the nonlinear dynamics. The applied literature on real exchange rates is motivated by the theoretical models in Dumas (1992) and Sercu et al. (1995), where sizeable deviations from a constant long run level of home and foreign prices make the marginal benefit of trade exceed the marginal cost. This entices arbitrage and increases trade activity, and eventually the process tends to revert back to this constant equilibrium. ESTAR models are used to approximate such nonlinear dynamics in purchasing power parity (PPP) deviations and real exchange rates. In these models the presence of a constant long run PPP level and of constant proportional transportation costs implies that deviations from a constant long run equilibrium characterize the nonlinear adjustment. However, as argued by Kilian and Taylor (2003), it is plausible to imagine that the long run equilibrium itself is not constant, due for example to possible shifts in transportation or transaction costs and relative prices over long time periods. Moreover, the presence of transaction costs alone could not account for many of the observed very large movements in exchange rates, either in terms of day-to-day volatility or in terms of periods of substantial and persistent overvaluation or undervaluation of real exchange rates, which are central to the PPP puzzle as described by Rogoff (1996). Since we utilize the lagged difference as the transition variable, the nonlinear dynamics in the ESTAR model we consider indicate, on the other hand, that lagged appreciations or depreciations in real exchange rates characterize the nonlinearity.
For large enough past appreciations and depreciations, real exchange rates may adjust towards an attractor (which is not necessarily constant, as assumed in the applications in the literature so far) that is also consistent with the notion of a long run equilibrium. In other words, it is quite plausible to think that not only the deviation from a constant long run PPP level, but also movements in the relative growth rates of home and foreign prices, induce changes in trade activity and cause the nonlinear dynamics that can be characterized by the ESTAR model we consider.

The ESTAR model under the alternative of the $t_{ESTAR}$ test may also be consistent with the idea that relatively large appreciations and depreciations in nominal and real exchange rates (which may cause overvaluation or undervaluation of a currency) may induce policy makers to intervene in the exchange rate markets, either directly or indirectly, with the objective of moving real exchange rates in the direction of an attractor that is consistent with fundamentals. For example, the long appreciation of the US Dollar during the 1980s and the subsequent coordinated intervention by major countries (including the US, Germany and Japan) in exchange rate markets during the second half of the 1980s caused US Dollar real exchange rates to depreciate after a long overvaluation (see Beine 2003, Dominguez 2003, Baillie et al. 2000 among others). It is also plausible to imagine situations in which traders and foreign exchange investors behave differently during large appreciations and depreciations, causing real exchange rates or deviations of nominal rates from fundamentals to behave differently. More specifically, traders and investors in the currency markets may take different positions depending upon whether exchange rates have been depreciating or appreciating for an extended time period. For instance, traders with net long positions (i.e. net exporters with foreign currency receivables) may be inclined to hedge against domestic currency appreciations yet remain unhedged against domestic currency depreciations. Alternatively, traders with net short positions (i.e. net importers with foreign currency payables) may be inclined to hedge against domestic currency depreciations yet remain unhedged against domestic currency appreciations. This type of behavior may generate not only nonlinear dynamics in exchange rates but also asymmetric behavior in the adjustment process.

Other examples in which past changes of the dependent variable characterize nonlinear dynamics of the ESTAR form can be found in the papers by Christopoulos and León-Ledesma (2008), Paya and Peel (2006) and Gonzalo and Pitarakis (2006).

3 The $t_{ESTAR}$ Statistic

Based on the ESTAR specification discussed in the previous section, we present the proposed test statistic and derive its asymptotic distribution. In the following, without loss of generality, we set the delay parameter $d$ to 1.³ If $\gamma$ were known, we could obtain an estimate $\hat\phi$ of $\phi$ by a regression of $\Delta y_t$ on $y_{t-1}(1 - \exp(-\gamma z_t^2))$. A testing procedure for $H_0: \phi = 0$ against $H_1: \phi < 0$ could be based on $\hat t_{\phi=0}(\gamma)$. However, since $\gamma$ is unknown, the problem arises that under $H_0: \phi = 0$, $\gamma$ is unidentified (see for example Davies 1987). To overcome this problem we suggest using the lowest possible t-value over a fixed parameter space of $\gamma$ values that is normalized by the sample standard deviation of the transition variable $z_t$. In other words, we suggest using

$$t_{ESTAR} = \inf_{\gamma \in \Gamma_T} \hat t_{\phi=0}(\gamma) = \inf_{\gamma \in \Gamma_T} \frac{\hat\phi(\gamma)}{\widehat{se}(\hat\phi(\gamma))}, \tag{5}$$

where $\Gamma_T = [\underline{\gamma}_T, \overline{\gamma}_T] = \left[\frac{1}{100 s_{zT}}, \frac{100}{s_{zT}}\right] \subset \mathbb{R}$ and $s_{zT}$ is the sample standard deviation of $z_t$. The scale parameter $\gamma$ in the ESTAR model is searched over the fixed interval normalized by the sample standard deviation of the transition variable $z_t$ (see, e.g. van Dijk et al., 2002). Since $z_t$ is a stationary process, this choice yields well-defined limiting lower and upper bounds for the parameter space and hence a well-defined limit parameter space. In other words, $\Gamma_T \xrightarrow{p} \Gamma = [\underline{\gamma}, \overline{\gamma}] = \left[\frac{1}{100 \sigma_z}, \frac{100}{\sigma_z}\right]$ as $T \to \infty$, and hence $\Gamma$ is a compact subset of $\mathbb{R}_+$.⁴

³ In the Appendix, we show that setting $d = 1$ does not make any difference in terms of the asymptotic theory we develop. However, in finite and especially in small samples, the test may provide different values for different $d$. One can either compute the test for a sequence of $d$ values and report the results for each $d$ or, alternatively, select the test value based on the smallest sum of squared residuals criterion as in, for example, Caner and Hansen (2001). Since the parameter space for $d$ is discrete, this should not affect the asymptotic distribution of the test.

⁴ In addition to the procedure followed in this paper, an alternative approach involves: (i) testing the null $H_0: \gamma = 0$ against $H_1: \gamma > 0$ based on the t-statistic $\hat t_{\gamma=0}(\phi)$ and (ii) testing $H_0: \phi\gamma = 0$ against $H_1: \phi\gamma < 0$. The formulation of the null as in (ii) clearly shows why $\phi$ or $\gamma$ are unidentified under the null and why Kapetanios et al. (2003) develop tests that avoid the dependence on the unidentified parameters. We base our testing procedure on $\hat t_{\phi=0}(\gamma)$ for two reasons: (1) the procedure is very similar to the well-known Dickey-Fuller approach and (2) for a given $\gamma$ the test is based on the t-statistic from a regression of $\Delta y_t$ on $y_{t-1} F(\gamma, z_t)$, and hence implementation of the test only requires least squares. Tests that utilize these alternatives may prove useful but are beyond the scope of the current paper. See also the discussion in Kapetanios et al. (2006).

Notice that this choice yields a compact subspace of the nuisance parameter, which is required to prove the tightness of the finite dimensional distribution of the suggested test, as the nuisance parameter space is infinite. As discussed in Hansen (1996), the validity of the choice of the space over which the test statistic is optimized in applications depends on whether the chosen subset is sufficiently dense in $\Gamma$. As argued by Hansen, a general solution to this problem may not be easy. Our choice is motivated by the common practice in smooth transition models, where the scale parameter is usually searched over some fixed interval normalized by the sample standard deviation of the data (see, e.g. van Dijk et al., 2002). Moreover, since it is not easy to quantify “how large is $\gamma$”, it is desirable to make the space searched a function of the scale of the transition variable, as is also done in Park and Shintani (2005). Since $\hat t_{\phi=0}(\gamma)$ becomes closer to the Dickey-Fuller (DF) test as $\gamma \to \infty$, making the interval too wide (i.e. when $\overline{\gamma}_T \to \overline{\gamma} = \infty$) will result in optimization over $\hat t_{\phi=0}(\gamma)$ values that will all be close to the DF test. In implementing the test, we suggest using a grid over $\left[\frac{1}{100 s_{zT}}, \frac{100}{s_{zT}}\right]$ with the grid size given by $1/(100 s_{zT})$. We assume the following on $u_t$:

Assumption A
1. $u_t$ is a sequence of i.i.d. random variables with a continuous density and with $E|u_t|^{\kappa} < \infty$, where $\kappa \ge 4$.
2. $E[u_t^2 \mid \mathcal{F}_{t-1}] = \sigma^2$ for all $t$, where $\mathcal{F}_{t-1}$ is the information set at date $t-1$.

Assumption A rules out the presence of heteroscedasticity in the conditional second moments.


Although this might be somewhat restrictive for some economic and financial data, it ensures that the nonlinear ESTAR process is “strictly stationary” under the alternative. We should emphasize that little is known about the conditions for asymptotic stationarity of the ESTAR and other transitional models under more general error assumptions, and hence one needs to rely on the assumption of stationarity of the alternative model without proof, as for example in Park and Shintani (2005) and de Jong et al. (2007). The ESTAR process under the alternative of our test is “strictly stationary” and ergodic under the conditions discussed above. This feature is important in applications, as testing for nonlinearity, estimation and subsequent diagnostic tools in ESTAR models rely on the “stationarity” of the models (see also Kılıç 2004 and Sandberg 2008).

Our testing procedure is in the spirit of the Dickey-Fuller test and easy to implement. Although we first impose the strict assumption of i.i.d. errors, we extend our results to cases where the data generating process may have serial correlation, in a fashion very similar to the Augmented Dickey-Fuller test. An alternative approach is to impose relatively less stringent assumptions on the error process that allow for heteroscedasticity, as in Park and Shintani (2005). This however comes at some cost, as the asymptotic distribution of the test then depends on the long run variance of the error process. In order to avoid this asymptotic dependence, one either has to impose very specific forms on the relationship between the long run variance and the unidentified parameter(s) of the model under the null, as in Park and Shintani (2005) (which may be difficult to verify in practice), or, as in Seo (2003), one needs to appeal to bootstrap methods to obtain critical values.⁵

⁵ There may be some difficulties involved in applying bootstrap methods, as there are unidentified parameters under the null. Dufour (2006) provides an approach to compute p-values by bootstrap and Monte Carlo simulations when part of the parameter space is unidentified under the null. It is not clear, however, whether the approach suggested by Dufour (2006) can easily be extended to the current setup, as the nuisance parameter space is not compact. Therefore we leave this possible extension to future research.

Our main analytical result is the following.

Theorem 1

1. Given the model in (1), under the null hypothesis $H_0: \phi = 0$ and Assumption A,

$$t_{ESTAR} = \inf_{\gamma \in \Gamma_T} \hat t_{\phi=0}(\gamma) \xrightarrow{d} \inf_{\gamma \in \Gamma} \frac{\int_0^1 W(r)\, dV(r, \gamma)}{\left(\int_0^1 W(r)^2\, dr\right)^{1/2}}, \tag{6}$$

where $(W(r), V(r, \gamma))$ is a bivariate Brownian motion process for each given $\gamma$ with covariance matrix⁶

$$\Sigma(\gamma) = \begin{pmatrix} 1 & \sigma_{12}(\gamma) \\ \sigma_{12}(\gamma) & 1 \end{pmatrix} \tag{7}$$

and

$$\sigma_{12}(\gamma) = \frac{E\left(1 - \exp(-\gamma u_s^2)\right)}{\left[E\left(1 - \exp(-\gamma u_s^2)\right)^2\right]^{1/2}}. \tag{8}$$

2. In addition to Assumption A, assuming that $\delta(L)$ has all roots outside the unit circle, the asymptotic distribution of the $t_{ESTAR}$ statistic from (4) is given by Equation (6) above.

3. Under the alternative hypothesis $\phi < 0$, $T^{-1/2} t_{ESTAR} \xrightarrow{p} c$ for some constant $c < 0$.

Proof: See Appendix.
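To get a feel for the correlation in (8), the sketch below evaluates it under the additional assumption that $u_s$ is standard normal, which Theorem 1 does not require. In that case $E\exp(-\gamma u_s^2) = (1 + 2\gamma)^{-1/2}$, so (8) has a closed form, and a Monte Carlo average provides a cross-check. The function names are hypothetical.

```python
import numpy as np

def sigma12_gaussian(gamma):
    """Closed form of equation (8), assuming u ~ N(0, 1) (assumption of this sketch)."""
    m1 = 1.0 - (1.0 + 2.0 * gamma) ** -0.5                      # E[1 - exp(-gamma u^2)]
    m2 = (1.0 - 2.0 * (1.0 + 2.0 * gamma) ** -0.5
          + (1.0 + 4.0 * gamma) ** -0.5)                        # E[(1 - exp(-gamma u^2))^2]
    return m1 / np.sqrt(m2)

def sigma12_monte_carlo(gamma, n=200_000, seed=0):
    """Simulation estimate of equation (8) from n standard normal draws."""
    rng = np.random.default_rng(seed)
    f = 1.0 - np.exp(-gamma * rng.standard_normal(n) ** 2)
    return f.mean() / np.sqrt((f ** 2).mean())

for g in (0.01, 0.5, 5.0, 500.0):
    print(g, sigma12_gaussian(g), sigma12_monte_carlo(g))
```

Under this Gaussian assumption the closed form tends to $1/\sqrt{3}$ as $\gamma \to 0$ and to one as $\gamma \to \infty$, so $V(r, \gamma)$ and $W(r)$ are positively but not perfectly correlated over the whole of $\Gamma$.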

This theorem shows that the asymptotic distribution of the test statistic is nonstandard but nuisance parameter free, and depends only on the limit parameter space $\Gamma$, which in turn is a fixed interval scaled by the standard deviation of the transition variable $z_t$. Note that as $\gamma \to \infty$, $\sigma_{12}(\gamma) \to 1$ and $V(r, \gamma) \to W(r)$, and the asymptotic distribution of $t_{ESTAR}$ becomes the asymptotic distribution of the DF unit root test statistic. This is because as $\gamma$ gets larger, the exponential transition function approaches one and the models in (1) and (4) reduce to the usual DF and ADF regressions respectively. Therefore, for large values of $\gamma$, we can expect the $t_{ESTAR}$ statistic to behave similarly to the ADF test.⁷

In cases where the process has a nonzero mean and/or a linear deterministic trend, following Kapetanios et al. (2003), we suggest using demeaned and/or demeaned and detrended data. For example, in the case where the data has a nonzero mean, i.e. $x_t = \mu + y_t$, we use the demeaned data $y_t = x_t - \bar{x}$, where $\bar{x}$ is the sample mean. It can be shown that the asymptotic distribution of the $t_{ESTAR}$ statistic is the same as in (6), except that $W(r)$ is replaced by the demeaned standard Brownian motion $\tilde{W}(r)$ defined on $r \in [0, 1]$. In a similar way, in the case where the process has a nonzero mean and a linear trend, i.e. $x_t = \mu + \beta t + y_t$, we can use the demeaned and de-trended data $y_t = x_t - \hat\mu - \hat\beta t$, where $\hat\mu$ and $\hat\beta$ are the OLS estimators of $\mu$ and $\beta$. In this case it can be shown that the asymptotic distribution of the $t_{ESTAR}$ statistic will be the same as in (6), except that $W(r)$ is replaced by the demeaned and de-trended standard Brownian motion $\hat{W}(r)$ defined on $r \in [0, 1]$.⁸ One important issue we need to address here is that the way we model the intercept and trend implies a particular way in which the constant term and/or trend enter the model under the alternative, as in Kapetanios et al. (2003). Therefore, similar to the $t_N$ statistic, the $t_{ESTAR}$ test may have a power problem in small samples. Asymptotic critical values of the $t_{ESTAR}$ statistic for the cases discussed, denoted by Case 1, Case 2, and Case 3 respectively, are tabulated via simulations with $T = 2000$ and $n = 100{,}000$ replications and are presented in Table 1.

⁶ Strictly speaking, a bivariate Brownian motion depends upon only one index, namely time. We should emphasize that for a given $\gamma$ value, $(W(r), V(r, \gamma))$ is a bivariate Brownian motion, and hence the notation is used to emphasize the dependence on a given $\gamma$.

⁷ Following van Dijk et al. (2002) and Kapetanios et al. (2003), we suggest that standard model selection criteria or significance testing procedures be used in selecting $p$.

⁸ Proofs for these conjectures follow the same lines as the proof of Theorem 1 given in the Appendix, with the standard Brownian motions replaced by the demeaned and the demeaned and de-trended standard Brownian motions. See also Kapetanios et al. (2003) on this.
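The grid search that defines the statistic in (5) can be sketched as follows for the simplest setting: no lagged differences ($p = 0$) and no deterministic terms. The function name `t_estar`, the degrees-of-freedom correction in the residual variance and the vectorized layout are choices of this sketch rather than part of the procedure described above. For a series with a nonzero mean, the demeaned data would be passed in, as suggested for Case 2.

```python
import numpy as np

def t_estar(y, d=1, n_grid=10_000):
    """Smallest t-ratio on phi over a gamma grid in the regression
    Delta y_t = phi * y_{t-1} * (1 - exp(-gamma * z_t**2)) + u_t, z_t = Delta y_{t-d}.

    Sketch only: no lag augmentation and no intercept or trend.
    Returns the statistic and the gamma at which it is attained.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    dep = dy[d:]            # Delta y_t
    z = dy[:-d]             # z_t = Delta y_{t-d}
    ylag = y[d:-1]          # y_{t-1}
    s_z = z.std(ddof=1)     # sample standard deviation of the transition variable
    # Grid on [1/(100 s_z), 100/s_z]; with n_grid = 10_000 the step is 1/(100 s_z).
    grid = np.linspace(1.0 / (100.0 * s_z), 100.0 / s_z, n_grid)
    X = ylag * (1.0 - np.exp(-np.outer(grid, z**2)))   # one regressor row per gamma
    sxx = (X**2).sum(axis=1)
    phi_hat = (X @ dep) / sxx                          # OLS slope for each gamma
    resid = dep - phi_hat[:, None] * X
    s2 = (resid**2).sum(axis=1) / (dep.size - 1)       # residual variance
    t_vals = phi_hat / np.sqrt(s2 / sxx)
    k = int(np.argmin(t_vals))
    return t_vals[k], grid[k]

# Usage on demeaned data x - mean(x), corresponding to Case 2:
rng = np.random.default_rng(0)
x = 0.1 + np.cumsum(rng.standard_normal(200))
stat, gamma_at_min = t_estar(x - x.mean())
```

The returned statistic would then be compared with the simulated critical value for the relevant case in Table 1; handling serial correlation as in (4) would require adding the lagged differences as further regressors, which this sketch omits.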


4 Small Sample Properties of Alternative Tests

In this section we study the size and power performance of $t_{ESTAR}$ and compare it with the $t_N$, inf-t and $W_B^{Sup}$ tests under different DGPs.⁹ We consider samples of sizes $T = 100$ and $200$.

4.1 Size Comparisons

To investigate the size properties of the alternative tests, time series data are generated under the null model with possibly serially correlated errors by

$$y_t = y_{t-1} + u_t, \qquad u_t = \rho u_{t-1} + \varepsilon_t,$$

where $\varepsilon_t$ is drawn from the standard normal distribution and $\rho \in \{-0.9, -0.5, 0, 0.5, 0.9\}$. In Case 2 we add a constant term $\mu = 0.1$, and in Case 3 we also include a time trend with slope equal to $1/T$. The order $p$ of the autoregression is assumed to be known and is set to 1 whenever $\rho \neq 0$.¹⁰ Table 2 reports the actual rejection frequencies of the $t_{ESTAR}$ test and compares them with those of the inf-t, $t_N$, and $W_B^{Sup}$ tests. Size results reported in Table 2 are based on 10,000 replications for the nominal 5% significance level. Looking at the columns corresponding to Case 1, all of the tests have sizes that are close to the nominal size of 5% in the majority of cases. The columns corresponding to Case 2 reveal that the inf-t test has some size distortions in the direction of under-rejection of the unit root null. On the other hand, the $t_{ESTAR}$ test tends to slightly over-reject the unit root null in some cases. The results show that the $W_B^{Sup}$ test over-rejects the null of a unit root in Cases 1 and 2 more often than the other tests. Inspection of the results in Table 2 shows that, consistent with the findings of Kapetanios et al. (2003), the $t_N$ statistic has sizes closer to the nominal size of 5%. However, the test tends to over-reject the null, especially for Case 3 with a large value of $\rho$. Overall, the $t_{ESTAR}$ test has plausible size properties compared to the alternative tests considered.

⁹ Since the simulations in Bec et al. (2008) show that the sup Wald test with the bounded adaptive threshold set outperforms the test with the unbounded set, we compute the size and power of the $W_B^{Sup}$ statistic. We also computed the size and power of the standard ADF statistic and the test proposed by Rothe and Sibbertsen (2006), which is essentially the $t_N$ statistic with a Phillips-Perron type correction for serial correlation. To conserve space we do not report results for these tests. Results for the test of Rothe and Sibbertsen (2006) are very similar to those for the $t_N$ test. These results can be obtained upon request.

¹⁰ Results for $\rho < 0$ are not reported to conserve space. As $\rho$ becomes more negative, typically all tests tend to over-reject the null hypothesis. Full results can be obtained upon request. For the sake of comparison, we compute sizes for the inf-t and $W_B^{Sup}$ tests under Cases 1 and 2 only, as Case 3 is not considered in Park and Shintani (2005) and Bec et al. (2008). Indeed, Bec et al. (2008) provide critical values for $W_B^{Sup}$ with mean parameters only. Therefore, size results for $W_B^{Sup}$ use the same critical values for Cases 1 and 2 in Table 2, and hence caution should be exercised in interpreting the results for Case 1 in Table 2.
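The null data generating process used for the size experiments can be sketched as below. How the constant and trend enter in Cases 2 and 3 is an assumption of this sketch: we add them to the level of the series, in line with the representations $x_t = \mu + y_t$ and $x_t = \mu + \beta t + y_t$ used in Section 3. The function name, the zero initial error and the burn-in length are likewise illustrative.

```python
import numpy as np

def simulate_null(T, rho, case=1, mu=0.1, burn=100, seed=0):
    """Unit root series y_t = y_{t-1} + u_t with AR(1) errors u_t = rho * u_{t-1} + eps_t.

    case=1: no deterministic terms; case=2: constant mu added to the level;
    case=3: constant mu plus a trend with slope 1/T added to the level
    (the placement of the deterministic terms is an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(T + burn)
    u = np.zeros(T + burn)
    for t in range(1, T + burn):
        u[t] = rho * u[t - 1] + eps[t]       # serially correlated errors
    y = np.cumsum(u[burn:])                  # accumulate errors after the burn-in
    if case == 2:
        return mu + y
    if case == 3:
        return mu + np.arange(1, T + 1) / T + y
    return y

# One draw per case for T = 100 and rho = 0.5
samples = {c: simulate_null(100, 0.5, case=c, seed=1) for c in (1, 2, 3)}
```

An empirical size would then be the share of replications, over many seeds, in which a given statistic falls below its 5% critical value.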

4.2 Power Comparisons

To assess the power performance of the alternative tests, we generate data under the alternative model

$$\Delta y_t = \alpha + \beta t + \phi y_{t-1} F(\gamma, z_t) + \rho \Delta y_{t-1} + u_t, \tag{9}$$

where $u_t$ is drawn from the standard normal distribution with mean zero and variance 1. We consider $\alpha \in \{0, 0.1\}$ and $\beta \in \{0, 1/T\}$, but to save space we report and discuss only the results for Case 2, that is $\alpha = 0.1$ and $\beta = 0$, in the following. We fix $\rho = 0$ and vary the parameters $\gamma$ and $\phi$. We conduct three sets of Monte Carlo experiments to study the power of the tests. In the first set, we generate data from (9) with $z_t = \Delta y_{t-1}$ (i.e. under the alternative of the $t_{ESTAR}$ statistic), and in the second set the data are again generated from (9), but with $z_t = y_{t-1}$ (i.e. under the alternative of the $t_N$ and inf-t tests).
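The first two power experiments differ only in the transition variable, which the following sketch makes explicit. It assumes that $\phi$ multiplies $y_{t-1} F(\gamma, z_t)$ in (9), as in the model in (1), that the recursion starts from $y_0 = 0$ without a burn-in, and it uses hypothetical names (`simulate_eq9`, `transition`).

```python
import numpy as np

def simulate_eq9(T, phi, gamma, alpha=0.1, beta=0.0, rho=0.0,
                 transition="diff", seed=0):
    """Simulate Delta y_t = alpha + beta*t + phi*y_{t-1}*F(gamma, z_t) + rho*Delta y_{t-1} + u_t.

    transition="diff"  uses z_t = Delta y_{t-1} (first experiment);
    transition="level" uses z_t = y_{t-1}       (second experiment).
    Assumptions of this sketch: N(0, 1) errors, y_0 = 0, no burn-in.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T)
    y = np.zeros(T)
    dy = np.zeros(T)
    for t in range(1, T):
        z = dy[t - 1] if transition == "diff" else y[t - 1]
        F = 1.0 - np.exp(-gamma * z**2)
        dy[t] = alpha + beta * t + phi * y[t - 1] * F + rho * dy[t - 1] + u[t]
        y[t] = y[t - 1] + dy[t]
    return y

# Case 2 design (alpha = 0.1, beta = 0, rho = 0) with (gamma, phi) = (0.05, -0.5)
y_diff = simulate_eq9(200, phi=-0.5, gamma=0.05, transition="diff")
y_level = simulate_eq9(200, phi=-0.5, gamma=0.05, transition="level")
```

Power for a given test would be estimated as the rejection rate of that test across many such draws, using size-adjusted critical values.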


In the last experiment we generate data under the alternative of WBSup ,    µ1 + φ1 yt−1 if yt−1 ≤ −λ    ∆yt = a∆yt−1 + ut + φ2 yt−1 if |yt−1 | < λ      −µ1 + φ1 yt−1 if yt−1 ≥ λ

(10)

with µ1 = 1.3 × |φ1| × λ and φ2 = 0. The bounded threshold set $\Lambda_T^B$ is selected by using Equation 2.9 of Bec et al. (2008) with the length parameter set to δ = 6 and with |DFT| replaced by max(1, |DFT|), as suggested by Bec et al. (2008, p. 105). Following Bec et al. (2008), we consider λ = 10, a ∈ {0, 0.3} and φ1 ∈ {−0.1, −0.3} in generating the data. Given some variation in the size of the tests, the power of the tests is based on size-adjusted critical values, where the critical values for each sample size are computed under the null of a unit root by using 1,000 simulations. Power measures for each test are based on the rejection rates in 5,000 simulations.

Panels A and B of Table 3 report the simulation results for the ESTAR model with zt = ∆yt−1 and zt = yt−1 respectively, with the parameter values γ ∈ {0.01, 0.05, 0.10, 0.25} and φ ∈ {−0.1, −0.25, −0.50, −0.75}, while panel C displays the results when the alternative is the TAR specification given in equation (10). In selecting the values for γ and φ, we follow Kapetanios et al. (2003) and focus mostly on small values of γ and small negative values of φ.11

Careful inspection of panels A and B of Table 3 reveals that the power of each test depends crucially on the values of γ and φ as well as on the sample size. Power measures of all tests increase as γ increases and as φ decreases for any given sample size. When γ and |φ| are sufficiently high, the power of the tests approaches 100%. This is intuitive, as larger negative values of φ indicate that, for given values of γ and σ², the process becomes less persistent. Similarly, for given values of φ and σ², as γ gets larger, E(exp(−γ zt²)) decreases and the series becomes less persistent (on average, more realizations of the process occur in the neighborhood of the outer regime). Hence, relatively small values of γ and small negative values of φ indicate that the process is more persistent. For any given combination of γ and φ, the power of each test increases as the sample size increases. When γ is very small, say γ = 0.01 or γ = 0.05 (i.e. the generated data are in the neighborhood of the unit root null), the power of the tests is relatively small, especially for small sample sizes.

Results in Panel A of Table 3 reveal that, for any given combination of (γ, φ), the tESTAR test has more power than any of the other tests. In most cases reported in the table, the power of our test is at least twice as high as the power of the other tests. For example, for T = 200, the power of the tESTAR test is 51.0% with (γ, φ) = (0.05, −0.5), while it is only 18.5% for the tN test, which has the highest power among the remaining three tests. The results also indicate that the tESTAR test performs better than all other tests both when the process is in the neighborhood of the null hypothesis of a unit root (i.e. when γ is low and/or when φ is small in absolute value) and when the process is far from that neighborhood. Since many economic and financial time series seem to be highly persistent, this finding may be useful.

Results in Panel B indicate that the power of all tests increases with the sample size and with the size of γ. Results also indicate that the performance of all tests improves as φ becomes smaller. Power measures for all tests approach 100%, especially for values of γ exceeding 0.10 and values of φ below −0.50.

11 For space considerations we have not reported the simulation results for γ ∈ {0.5, 0.75, 1.00, 5.00}. Although the main pattern in the power comparisons stays the same, as γ gets larger the differences across tests become smaller. These results can be obtained upon request.
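The size-adjustment procedure described above (empirical critical values from simulations under the unit-root null, then rejection rates under the alternative) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code behind Table 3: the statistic `df_tstat` is a plain Dickey-Fuller t-ratio used as a stand-in for the tests compared in the paper, `simulate_ar1` is a hypothetical stationary alternative, and the random-walk null has no deterministic terms.

```python
import numpy as np

def df_tstat(y):
    """Stand-in statistic: t-ratio of phi in dy_t = phi * y_{t-1} + e_t."""
    dy, ylag = np.diff(y), y[:-1]
    sxx = ylag @ ylag
    phi = (ylag @ dy) / sxx
    resid = dy - phi * ylag
    s2 = (resid @ resid) / (len(dy) - 1)
    return phi / np.sqrt(s2 / sxx)

def simulate_ar1(T, rng, rho=0.9):
    """Hypothetical stationary alternative, used only to exercise the routine."""
    u = rng.standard_normal(T)
    y = np.empty(T)
    y[0] = u[0]
    for t in range(1, T):
        y[t] = rho * y[t - 1] + u[t]
    return y

def size_adjusted_power(stat_fn, simulate_alt, T, n_null=1000, n_alt=5000,
                        level=0.05, seed=0):
    """Return (empirical critical value, rejection rate under the alternative)."""
    rng = np.random.default_rng(seed)
    # Step 1: distribution of the statistic under the unit-root null (random walk).
    null_stats = [stat_fn(np.cumsum(rng.standard_normal(T))) for _ in range(n_null)]
    crit = np.quantile(null_stats, level)          # left-tailed test
    # Step 2: rejection frequency under the alternative, using that critical value.
    alt_stats = np.array([stat_fn(simulate_alt(T, rng)) for _ in range(n_alt)])
    return crit, float(np.mean(alt_stats < crit))
```

Because the critical value is re-estimated for each sample size, tests with different finite-sample size distortions are compared at the same effective level.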
Careful inspection of the table reveals that, overall, the inf-t test performs better than the other tests for relatively larger values of γ and smaller values of φ. On the other hand, for very small values of γ and |φ|, the tN test outperforms the alternatives. This is consistent with the simulation results reported in Kapetanios et al. (2003). This finding seems intuitive in the sense that, as γ approaches zero with very small values of |φ|, the auxiliary model based on the Taylor series expansion of the nonlinearity approximates the nonlinear dynamics more accurately and hence the power of the test improves. For small values of γ and large values of φ, results are somewhat mixed. Overall, our test seems to have comparable power for large γ and small φ even when the transition variable is the lagged level of the dependent variable. However, our test tends to perform worse than the tN and inf-t tests for moderate sizes of γ and φ. Note that the WBSup test performs relatively worse than the others. This is presumably because the test is specifically designed for TAR models, so a loss in power may be expected for ESTAR-type alternatives.

The reported results in panel C confirm the findings in Bec et al. (2008) in that the WBSup test outperforms the alternatives. Results also show that our test performs slightly worse than the other tests. However, the relative performance of the tESTAR, tN and inf-t tests is worth noting even for a TAR alternative.

Overall, the simulations show that each test performs better than the alternatives when the true DGP is given by that test's alternative model. The tESTAR test performs worse than the tN and inf-t tests but better than the WBSup test when the transition variable is the lagged level of the dependent variable in the ESTAR model. Interestingly enough, our test performs as well as the tN and inf-t tests when TAR is the alternative model.12

12 An interesting observation that comes out of the results in panels A and B is that all tests achieve higher power in panel B than in panel A, suggesting that the ESTAR process might have different persistence characteristics depending on whether the transition variable is the lagged level or the lagged change of the dependent variable. This may be worth further investigation but is beyond the scope of the current paper.
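For readers who wish to reproduce the third experiment, the TAR data-generating process in equation (10) can be sketched as below. The recursion follows equation (10) with µ1 = 1.3 × |φ1| × λ; the zero starting value, the burn-in length `burn` and the function name `simulate_band_tar` are assumptions introduced here for illustration.

```python
import numpy as np

def simulate_band_tar(T, phi1=-0.1, a=0.0, lam=10.0, phi2=0.0, burn=100, seed=0):
    """Simulate y_t from the three-regime TAR process of equation (10)."""
    rng = np.random.default_rng(seed)
    mu1 = 1.3 * abs(phi1) * lam            # intercept in the outer regimes
    n = T + burn
    u = rng.standard_normal(n)
    y = np.zeros(n)                        # assumption: y_0 = 0
    dy_prev = 0.0                          # assumption: initial change is zero
    for t in range(1, n):
        ylag = y[t - 1]
        if ylag <= -lam:                   # lower outer regime
            regime = mu1 + phi1 * ylag
        elif ylag >= lam:                  # upper outer regime
            regime = -mu1 + phi1 * ylag
        else:                              # inner band, unit root when phi2 = 0
            regime = phi2 * ylag
        dy = a * dy_prev + u[t] + regime
        y[t] = ylag + dy
        dy_prev = dy
    return y[burn:]                        # drop the burn-in observations
```

With φ2 = 0 the process behaves like a random walk inside the band [−λ, λ] and is pulled back only once it leaves it, which is why tests built on a smooth ESTAR alternative may lose power against it.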


5

Nonlinearity in Real Exchange Rates

In the light of the discussion in Section 2 above, in this section we apply our suggested test statistic as well as the tN, inf-t and WBSup tests to several monthly and quarterly real exchange rates and estimate the ESTAR model under the alternative of our proposed test statistic. Throughout this section, the lag length p is selected by using the modified AIC and by examining the partial autocorrelation functions under the null hypothesis. The delay d is not identified under the null; therefore, in computing the WBSup, tN, inf-t and tESTAR tests, we follow Caner and Hansen (2001) and select the d that minimizes the residual sum of squares of the corresponding regressions.13 Moreover, since plots of the data do not suggest a deterministic trend, a constant term is included in implementing the tests.14

We utilize two different exchange rate data sets. Both the monthly and the quarterly series used in this study are derived from the International Monetary Fund's International Financial Statistics (IFS) database. The first data set consists of quarterly CPI-defined real exchange rates between the US Dollar and the currencies of several OECD countries. The quarterly data cover the period 1973-1998 for the Euro-zone currencies (Belgium (BF), France (FF), Germany (GM), Italy (IL), Netherlands (DG) and Spain (SP)) and 1973-2006 for the non-Euro-zone currencies (Australia (AD), Canada (CD), Denmark (DK), Japan (JY), Norway (NK), Switzerland (SF) and United Kingdom (UKP)). The monthly data set covers the same periods for a subset of the above Euro-zone and non-Euro-zone countries. We present unit root test and estimation results in Tables 4 and 5 for the quarterly and monthly series respectively.

13 Results for all d ∈ {1, 2, ..., dmax}, where dmax = 4 and dmax = 12 for the quarterly and monthly series respectively, can be obtained upon request.

14 Results from the ADF and PP tests and the test of Rothe and Sibbertsen (2006) can be obtained upon request. We have also applied our testing procedure and estimated ESTAR models for monthly and quarterly US inflation rates, the monthly US unemployment rate and two series on the term structure of US interest rates. To preserve space, we do not report these results, which can be obtained upon request.
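The delay selection rule described above (choose the d that minimizes the residual sum of squares) can be sketched as follows. This is only an illustration under assumptions: the regression contains a constant and the single ESTAR regressor with zt = ∆yt−d, no augmentation lags are included, γ is searched jointly over an equally spaced grid on [1/(100 s_z), 100/s_z], and all delays are compared on a common sample. The name `select_delay` is hypothetical.

```python
import numpy as np

def select_delay(y, dmax, n_grid=50):
    """Return (rss, d, gamma) minimizing the RSS over d = 1..dmax and a gamma grid."""
    dy = np.diff(y)                        # dy[i] = y[i+1] - y[i]
    n = len(dy)
    target = dy[dmax:]                     # common sample for every candidate d
    ylag = y[dmax:-1]                      # lagged level matching each target
    best = (np.inf, None, None)
    for d in range(1, dmax + 1):
        z = dy[dmax - d:n - d]             # transition variable: change lagged d periods
        s_z = z.std(ddof=1)
        for g in np.linspace(1.0 / (100.0 * s_z), 100.0 / s_z, n_grid):
            x = ylag * (1.0 - np.exp(-g * z**2))
            X = np.column_stack([np.ones_like(x), x])
            beta, *_ = np.linalg.lstsq(X, target, rcond=None)
            resid = target - X @ beta
            rss = float(resid @ resid)
            if rss < best[0]:
                best = (rss, d, float(g))
    return best
```

Using the same estimation sample for every d keeps the residual sums of squares comparable; otherwise larger delays would mechanically use fewer observations.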


As can be observed from columns two through five of each table, the WBSup statistic rejects a unit root in ten out of twelve quarterly series and in five out of nine monthly series. Similarly, the tN test rejects the unit root null in six monthly series and in ten quarterly exchange rates. On the other hand, the number of rejections of the unit root is two in the monthly rates and seven in the quarterly rates when we use the inf-t statistic. Note that the tESTAR test provides the strongest evidence against the null of a unit root in both monthly and quarterly real exchange rates, as it rejects the unit root in all quarterly series and in all but one of the monthly series. Overall, findings from the alternative tests provide considerable evidence against the null of a unit root in real exchange rates over our sample period. The most striking evidence against the null of a unit root is obtained for the quarterly data, from all tests.

The remaining columns of Tables 4 and 5 display a summary of the estimation results under the alternative of the tESTAR test, together with some diagnostic statistics. Estimated transition parameters, normalized by the standard error of the transition variable in each case, are significantly different from zero at conventional significance levels. Estimates of φ are consistent with the condition(s) for strict stationarity discussed in previous sections for all series except the monthly Canadian Dollar series.15 It should be noted that none of the unit root tests provides any evidence of stationarity for the monthly Canadian Dollar series. In this sense the reported estimation results are consistent with the results from the unit root tests reported in Table 5. The diagnostic tests reported indicate the adequacy of the ESTAR models in the majority of series. Tests for remaining nonlinearity indicate the presence of significant logistic STAR-type nonlinearity in the residuals of the estimated ESTAR model for the quarterly Canadian Dollar series, and some marginal evidence in the cases of the quarterly Australian Dollar and the monthly Canadian Dollar, Italian Lira and Japanese Yen series. Reported p-values for ARCH effects show no evidence of heteroscedasticity in the residuals of the models estimated for the quarterly series, and evidence of it only in the monthly Canadian Dollar and Italian Lira series. Overall, the results support the estimated alternative models. Although not displayed for space considerations, plots of the transition functions have the usual U-shape expected under the ESTAR model, suggesting that real exchange rates visit both extreme regimes during the sample period.

Investigation of the results in Tables 4 and 5 provides interesting insights into the nonlinearity of real exchange rates. Results show a considerable amount of variation across currencies in the speed of transition between the extreme regimes of the ESTAR model. Careful inspection of the estimation results reveals that currencies that are part of the Euro zone or closely related to the Euro tend to have somewhat similar speeds of adjustment. For example, estimated values for the Belgian Franc, Dutch Guilder, German Mark and to some extent the Danish Krone are in the range 2.46-6.45 for the quarterly series. Similar observations hold for the monthly series, in that estimated slope parameters are in the range 2.19 to 5.89 for the Belgian Franc, Dutch Guilder, German Mark, Swiss Franc and French Franc. Indeed, the estimated transition parameters for the UK Pound are not very far from these values (5.04 and 7.64 for monthly and quarterly data respectively). The Italian Lira and the Spanish Peseta are two exceptions. On the other hand, estimated values for the Australian Dollar, Canadian Dollar and Japanese Yen indicate much faster adjustment dynamics than for other currencies. Reported results also show that, with the exception of the Canadian Dollar, Italian Lira and Swiss Franc, estimated transition parameters are quite close between monthly and quarterly data.

15 However, one needs to be cautious in relying on the results from these tests, as under H0 : γ = 0, φ is unidentified and under H0 : φ = 0, γ is unidentified. Moreover, under either of these null hypotheses, the ESTAR model reduces to a linear AR model with a unit root. Therefore the critical values may not be correct. Ideally one would compute marginal significance levels through simulation or bootstrapping, which is beyond the scope of this paper.
We also observe considerable differences in the estimates of φ between monthly and quarterly real exchange rates, suggesting different persistence dynamics in monthly and quarterly series, which may be worth further investigation.

The estimated standardized transition parameter γ and the φ values indicate a considerable amount of variation across currencies in the speed of transition between the extreme regimes of the ESTAR model, as well as differences in persistence. Careful inspection of the reported results indicates that, with some exceptions, adjustment and persistence dynamics are considerably similar for Euro-area currencies. These observations point to differing nonlinear and persistence dynamics across currencies, in that the transition and persistence dynamics seem more homogeneous for Euro-zone currencies. They are consistent with the findings of Gadea et al. (2004), who make similar observations within a linear panel data framework. Further work is needed to uncover important differences, and the reasons for these differences, in the adjustment and persistence of real exchange rates across currencies.
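To make the role of the transition parameter concrete, the exponential transition function F(z; γ) = 1 − exp(−γ z²) can be evaluated directly. The sketch below is illustrative only: it treats z as measured in standard-deviation units and uses 2.46 and 6.45, the endpoints of the quarterly range quoted above, purely as example values of γ. The function name `estar_transition` is an assumption.

```python
import numpy as np

def estar_transition(z, gamma):
    """Exponential transition function F(z; gamma) = 1 - exp(-gamma * z^2)."""
    z = np.asarray(z, dtype=float)
    return 1.0 - np.exp(-gamma * z**2)

# Evaluate on a symmetric grid of the (standardized) transition variable.
z_grid = np.linspace(-3.0, 3.0, 61)
slow = estar_transition(z_grid, 2.46)   # smaller gamma: slower move to the outer regime
fast = estar_transition(z_grid, 6.45)   # larger gamma: faster move to the outer regime
```

The function equals zero at z = 0, is symmetric in z and rises towards one as |z| grows, which gives the U-shape mentioned in the text; a larger γ makes the U narrower.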

6

Conclusion

This paper proposed a new unit root test that aims to discriminate a unit root process from a stationary ESTAR process in which the transition variable is the lagged change of the dependent variable. The test is based on a one-sided t-statistic that is optimized over the space of the unidentified transition parameter γ. The asymptotic distribution of the test does not depend on any nuisance parameters that are not identified under the null, nor on any long-run parameters or on the space over which the test is optimized. Therefore, asymptotic critical values can be obtained by simulation. Simulations revealed that our testing approach has comparable size and quite good power relative to the alternatives, including the recent tests suggested in the context of ESTAR and TAR models. We have applied the alternative tests to monthly and quarterly real exchange rates over the floating period. The applications in this study and in the papers by Paya and Peel (2006) and Christopolous and León (2008), as well as the simulations, illustrate the usefulness of the procedure suggested in this paper.

A possible extension of the work here is to consider other transition functions, such as the logistic function, which allows asymmetric adjustment and might be more useful in analyzing certain economic time series, for example output growth or unemployment rates. Another potential extension is to consider STAR models in which the augmentation terms also follow a nonlinear process. A further possible extension is to incorporate the adaptive approach of Bec et al. (2008) into ESTAR and other STAR models in which not only is the transition variable unidentified but non-zero threshold parameters are also allowed for. An alternative approach, which may have some advantages in terms of the specification of the unidentified parameter, involves testing the null of γ = 0 and optimizing over the parameter space for φ in the context of the ESTAR model discussed in this paper. Another extension would allow for conditionally heteroscedastic errors in the ESTAR model. All of these important issues are left for future research.


Appendix: Proof of Theorem 1

The proof section is organized as follows. We first prove part one of the main result and proceed with the second and third parts. In what follows, $\xrightarrow{p}$ and $\xrightarrow{d}$ denote convergence in probability and convergence in distribution respectively.

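Proof of Theorem 1 part 1

We will first show pointwise convergence in distribution of the test statistic. Then, to complete the proof, we show stochastic equicontinuity with respect to $\gamma \in \Gamma = [\underline{\gamma}, \overline{\gamma}]$. Note that since $z_t$ is a stationary process, $\Gamma_T = \left[\frac{1}{100\, s_{z,T}}, \frac{100}{s_{z,T}}\right]$ has a well-defined limit as $T \to \infty$, with the limit given by $\Gamma = [\underline{\gamma}, \overline{\gamma}] = \left[\frac{1}{100\,\sigma_z}, \frac{100}{\sigma_z}\right] \subseteq \mathbb{R}_+$. Without any loss of generality we assume that we optimize over this compact parameter space $\Gamma = \{\gamma : \underline{\gamma} \le \gamma \le \overline{\gamma}\}$. To show the pointwise convergence in distribution, first note that under the null of a unit root, $\phi = 0$, the t-test for a given value of $\gamma$ is

$$
t_\phi(\gamma) = \frac{T^{-1}\sum_{t=1}^{T} y_{t-1} u_t \left[1 - \exp(-\gamma z_t^2)\right]}
{\left\{\hat{\sigma}^2(\gamma)\, T^{-2}\sum_{t=1}^{T} y_{t-1}^2 \left[1 - \exp(-\gamma z_t^2)\right]^2\right\}^{1/2}},
$$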

where $\hat{\sigma}^2(\gamma)$ is the least squares estimate of $\sigma^2$ from the regression of $\Delta y_t$ on $y_{t-1}(1 - \exp(-\gamma z_t^2))$ and $z_t = \Delta y_{t-d}$. Note that under the null hypothesis $z_t = u_{t-d}$, with mean zero and variance $\sigma_z^2 = \sigma^2 = E u_t^2$. Without loss of generality, we set $d = 1$. To see why, first observe that

$$
\sup_{d \neq s} \left| s_{\Delta y_{t-d},T} - s_{\Delta y_{t-s},T} \right| \xrightarrow{p} 0
$$

as $T \to \infty$ for $d \neq s$, and hence it follows that

$$
\sup_{d \neq s} \left| \Gamma_{\Delta y_{t-d},T} - \Gamma_{\Delta y_{t-s},T} \right| \xrightarrow{p} 0
$$

as $T \to \infty$. It should also be noted that

$$
\sup_{d \neq s} \left| \frac{1}{T} \sum \Delta y_t\, y_{t-1} \left[F(\gamma, \Delta y_{t-d}) - F(\gamma, \Delta y_{t-s})\right] \right| \xrightarrow{p} 0
$$

as $T \to \infty$. This latter result is due to the boundedness of $F(\cdot)$ and the Cauchy-Schwarz inequality, i.e.

$$
\left| \frac{1}{T} \sum \Delta y_t\, y_{t-1} \left[F(\gamma, \Delta y_{t-d}) - F(\gamma, \Delta y_{t-s})\right] \right|
\le C \left( \frac{1}{T} \sum (\Delta y_t\, y_{t-1}) \right)^{1/2}
\left( \frac{1}{T} \sum \left(F(\gamma, \Delta y_{t-d}) - F(\gamma, \Delta y_{t-s})\right) \right)^{1/2}
$$

for a constant $C > 0$. Because $F(\cdot)$ is bounded between 0 and 1 and the transition variable $z_t$ is stationary, the term $\frac{1}{T}\sum \left[F(\gamma, \Delta y_{t-d}) - F(\gamma, \Delta y_{t-s})\right]$ is bounded with positive probability (with a stationary transition variable, $F(\cdot)$ cannot stay in one of the extreme regimes indefinitely with positive probability, and hence the sum of the differences in the above expression will be bounded), and as $T \to \infty$, $\left(\frac{1}{T}\sum \left(F(\gamma, \Delta y_{t-d}) - F(\gamma, \Delta y_{t-s})\right)\right)^{1/2}$ converges to zero a.s.16 Note also that, for the first expression under the null, $\left(\frac{1}{T}\sum (u_t y_{t-1})\right)^{1/2} \xrightarrow{d} \left[\frac{1}{2}\sigma\{W(1)^2 - 1\}\right]^{1/2}$ (see Hamilton 1994 and Theorem 30.13 of Davidson 1994). This establishes the claimed result. A similar argument leads to

$$
\sup_{d \neq s} \left| \sum \Delta y_{t-1}^2 \left( F(\gamma, \Delta y_{t-d})^2 - F(\gamma, \Delta y_{t-s})^2 \right) \right| \xrightarrow{p} 0
$$

as $T \to \infty$; therefore it should follow that

$$
\sup_{d \neq s} \left| \inf_{\gamma \in \Gamma_T} t(d, \gamma) - \inf_{\gamma \in \Gamma_T} t(s, \gamma) \right| \xrightarrow{p} 0
$$

as $T \to \infty$ for each $\gamma \in \Gamma$ and $d \neq s$.

16 Alternatively, observe that $\frac{1}{T}\sum \left[F(\gamma, \Delta y_{t-d}) - F(\gamma, \Delta y_{t-s})\right] = \frac{1}{T}\sum \exp(-\gamma u_{t-s}) - \frac{1}{T}\sum \exp(-\gamma u_{t-d}) \xrightarrow{p} E\left[\exp(-\gamma u_{t-s}) - \exp(-\gamma u_{t-d})\right] = 0$ as $T \to \infty$ under the conditions stated in Assumption A and by the boundedness of $F(\cdot)$. Note that the last equality follows from the fact that $\exp(\gamma, u)$ is a monotone transformation of the stationary random variable $u$, and hence the expectation exists for each given $\gamma$ and the difference in expected values vanishes.
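The statistic analyzed above can be computed with a one-dimensional grid search. The sketch below is a minimal illustration rather than the implementation used in the paper: it regresses ∆yt on the single regressor yt−1(1 − exp(−γ zt²)) with zt = ∆yt−d, omits deterministic and augmentation terms, and uses an equally spaced grid on [1/(100 s_z), 100/s_z]. The function names `t_phi` and `inf_t` and the grid size are assumptions.

```python
import numpy as np

def t_phi(y, gamma, d=1):
    """t-ratio of phi in dy_t = phi * y_{t-1} * (1 - exp(-gamma * z_t^2)) + u_t."""
    dy = np.diff(y)
    target = dy[d:]                        # Delta y_t
    ylag = y[d:-1]                         # y_{t-1}
    z = dy[:-d]                            # z_t = Delta y_{t-d}
    x = ylag * (1.0 - np.exp(-gamma * z**2))
    sxx = x @ x
    phi = (x @ target) / sxx
    resid = target - phi * x
    sigma2 = (resid @ resid) / len(target) # least squares variance estimate
    return phi / np.sqrt(sigma2 / sxx)

def inf_t(y, d=1, n_grid=100):
    """Return (smallest t-ratio over the gamma grid, minimizing gamma)."""
    s_z = np.diff(y)[:-d].std(ddof=1)      # sample s.d. of the transition variable
    grid = np.linspace(1.0 / (100.0 * s_z), 100.0 / s_z, n_grid)
    stats = np.array([t_phi(y, g, d) for g in grid])
    k = int(stats.argmin())
    return float(stats[k]), float(grid[k])
```

Since the alternative of stationarity corresponds to φ < 0, the test is left-tailed, and taking the infimum over γ gives the most negative t-ratio on the grid.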


We will first consider the numerator of the t-statistic. Define for $0 \le r \le 1$,

$$
W_T(r) = T^{-1/2} \sum_{s=1}^{[rT]} u_s / \sigma,
$$

$$
V_T(r, \gamma) = T^{-1/2} \sum_{s=1}^{[rT]} u_s \left[1 - \exp(-\gamma u_{s-1}^2)\right] / \lambda,
$$

where $[rT]$ is the largest integer less than or equal to $rT$, $\sigma^2 = E u_t^2$, and

$$
\lambda^2 = \lim_{T \to \infty} E\left( T^{-1/2} \sum_{s=1}^{[rT]} u_s \left[1 - \exp(-\gamma u_{s-1}^2)\right] \right)^2
= \sigma^2 E\left[1 - \exp(-\gamma u_{s-1}^2)\right]^2 .
$$

Now by the functional central limit theorem, for any given $\gamma$, $(W_T(r), V_T(r, \gamma)) \Rightarrow (W(r), V(r, \gamma))$ for a bivariate Brownian motion process $(W(\cdot), V(\cdot, \cdot))$ as in the theorem, and by convergence to stochastic integrals (see for instance Davidson (1994, Theorem 30.13)), we have under the conditions of the theorem17

$$
T^{-1} \sum_{t=1}^{T} y_{t-1} u_t \left[1 - \exp(-\gamma u_{t-1}^2)\right]
\xrightarrow{d} \lambda \sigma \int_0^1 W(r)\, dV(r, \gamma) + \Lambda,
$$

where

$$
\Lambda = \lim_{T \to \infty} T^{-1} \sum_{t=1}^{T} E\, y_{t-1} u_t \left[1 - \exp(-\gamma u_{t-1}^2)\right] = 0 .
$$
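As a numerical check on the scaling constant λ² defined above, the sketch below compares a closed form with a Monte Carlo average. It rests on two assumptions that go beyond the theorem: the errors are i.i.d. Gaussian, as in the simulation design, so that E exp(−c u²) = (1 + 2cσ²)^(−1/2), and E[1 − exp(−γu²)]² is read as the expectation of the squared bracket. The function names are hypothetical.

```python
import numpy as np

def lambda2_closed_form(gamma, sigma2=1.0):
    """lambda^2 = sigma^2 * E[(1 - exp(-gamma u^2))^2] for u ~ N(0, sigma^2)."""
    m1 = (1.0 + 2.0 * gamma * sigma2) ** -0.5   # E exp(-gamma u^2)
    m2 = (1.0 + 4.0 * gamma * sigma2) ** -0.5   # E exp(-2 gamma u^2)
    return sigma2 * (1.0 - 2.0 * m1 + m2)

def lambda2_monte_carlo(gamma, sigma2=1.0, n=400_000, seed=0):
    """Sample second moment of u_s * (1 - exp(-gamma u_{s-1}^2))."""
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, np.sqrt(sigma2), n + 1)
    v = u[1:] * (1.0 - np.exp(-gamma * u[:-1] ** 2))
    return float(np.mean(v ** 2))
```

Under these assumptions λ² is increasing in γ and bounded above by σ², which reflects the transition weight approaching one as γ grows.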

17 Note that for a given $\gamma$, the bivariate Brownian motion process $(W(r), V(r, \gamma))$ has the covariance matrix $\Sigma(r)$ with diagonal terms equal to one. The off-diagonal terms (i.e. the covariance between $W(r)$ and $V(r, \gamma)$) are given by

$$
\sigma_{12}(\gamma) = \lim_{T \to \infty} T^{-1} \frac{1}{\sigma \lambda}\, E\left[ \sum_{s=1}^{[rT]} u_s \sum_{s=1}^{[rT]} u_s \left[1 - \exp(-\gamma u_{s-1}^2)\right] \right],
$$

which after some algebra becomes

$$
\sigma_{12}(\gamma) = \lim_{T \to \infty} T^{-1} \frac{1}{\sigma^2 E\left[1 - \exp(-\gamma u_{s-1}^2)\right]^2}\, E\left[ \sum_{s=1}^{[rT]} \left[1 - \exp(-\gamma u_{s-1}^2)\right] \right].
$$

Hence we obtain the expression given in Equation 8 in the text.


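To derive the limit of the denominator, first note that it is straightforward to show that18

$$
\sup_{\gamma \in \Gamma} \left| \hat{\sigma}^2(\gamma) - \sigma^2 \right| \xrightarrow{p} 0 .
$$

Also, note that

$$
T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2
= T^{-2} \sum_{t=1}^{T} y_{t-1}^2\, E\left[1 - \exp(-\gamma u_{t-1}^2)\right]^2
+ T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \left( \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 - E\left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 \right).
$$

Now note that $y_{t-1}^2 = y_{t-2}^2 + u_{t-1}^2 + 2 y_{t-2} u_{t-1}$, and therefore

$$
\begin{aligned}
T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \left( \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 - E\left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 \right)
&= T^{-2} \sum_{t=1}^{T} y_{t-2}^2 \left( \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 - E\left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 \right) \\
&\quad + T^{-2} \sum_{t=1}^{T} u_{t-1}^2 \left( \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 - E\left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 \right) \\
&\quad + T^{-2} \sum_{t=1}^{T} 2 y_{t-2} u_{t-1} \left( \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 - E\left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 \right).
\end{aligned}
$$

By taking the expectation of its absolute value, it can be seen that the second term is $O_p(T^{-1})$. By convergence to stochastic integrals (Davidson 1994, Theorem 30.13), it also follows that the third term is $O_p(T^{-1})$. Since the first term consists of martingale difference summands,

$$
\begin{aligned}
E\left( T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \left( \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 - E\left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 \right) \right)^2
&= T^{-4} \sum_{t=1}^{T} \left( \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 - E\left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 \right)^2 E y_{t-1}^4 \\
&\le 4 T^{-4} \sum_{t=1}^{T} E y_{t-1}^4 = O(T^{-1})
\end{aligned}
$$

under the assumptions of the theorem. The pointwise distribution result now follows by noting that by the FCLT (see for example Davidson (1994, Theorem 29.11)),

$$
T^{-2} \sum_{t=1}^{T} y_{t-1}^2\, E\left[1 - \exp(-\gamma u_{t-1}^2)\right]^2
\xrightarrow{d} E\left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 \sigma^2 \int_0^1 W(r)^2\, dr .
$$

18 To see this, first note that under the null we have

$$
\hat{\sigma}^2(\gamma) = \frac{1}{T} \sum u_t^2 - \frac{1}{T}\, \frac{\left[ \frac{1}{T} \sum y_{t-1} u_t \left[1 - \exp(-\gamma u_{t-1}^2)\right] \right]^2}{\frac{1}{T^2} \sum y_{t-1}^2 \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2} .
$$

As shown above, the term $\frac{1}{T} \sum y_{t-1} u_t \left[1 - \exp(-\gamma u_{t-1}^2)\right]$ converges to $\lambda \sigma \int_0^1 W(r)\, dV(r, \gamma) + \Lambda$, while the expression in the denominator, $\frac{1}{T^2} \sum y_{t-1}^2 \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2$, converges to $E\left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 \sigma^2 \int_0^1 W(r)^2\, dr$. Hence the second expression is $\frac{1}{T}$ times a ratio of functionals, which becomes $o_p(1)$ as $T \to \infty$. This in turn implies that $\hat{\sigma}^2(\gamma) = \frac{1}{T} \sum u_t^2 + o_p(1)$, showing that $\hat{\sigma}^2(\gamma) \xrightarrow{p} \sigma^2$ holds uniformly in the compact parameter space $\Gamma$.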
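The decomposition of the denominator can be illustrated numerically. The sketch below simulates a driftless random walk with standard normal errors (an assumption consistent with the simulation design, not a requirement of the theorem) and splits T⁻² Σ y²t−1 [1 − exp(−γu²t−1)]² into the leading term, in which the squared bracket is replaced by its expectation, and the remainder. The closed-form expectation used for `ew` holds only under Gaussianity, and the function name is hypothetical.

```python
import numpy as np

def denominator_split(T, gamma=1.0, seed=0):
    """Return (full sum, leading term, remainder) for one simulated random walk."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T)             # u[k] plays the role of u_{t-1}
    ylag = np.cumsum(u)                    # y_{t-1} = y_{t-2} + u_{t-1}
    w = (1.0 - np.exp(-gamma * u**2)) ** 2 # squared transition weight
    # E[w] for u ~ N(0, 1), read as the expectation of the squared bracket.
    ew = 1.0 - 2.0 * (1.0 + 2.0 * gamma) ** -0.5 + (1.0 + 4.0 * gamma) ** -0.5
    full = float(np.sum(ylag**2 * w)) / T**2
    main = ew * float(np.sum(ylag**2)) / T**2
    return full, main, full - main
```

For a sample of a few thousand observations the remainder is typically a small fraction of the leading term, in line with the argument that only the first term matters in the limit.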

To show stochastic equicontinuity, note that because

$$
\sup_{\gamma \in \Gamma} \left| \hat{\sigma}^2(\gamma) - \sigma^2 \right| \xrightarrow{p} 0
$$

and because

$$
T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2
\ge T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \left[1 - \exp(-\underline{\gamma} u_{t-1}^2)\right]^2
$$

and because the last term converges in distribution to an almost surely positive limit, it suffices to prove the stochastic equicontinuity of

$$
D_T(\gamma) = T^{-1} \sum_{t=1}^{T} y_{t-1} u_t \left[1 - \exp(-\gamma u_{t-1}^2)\right]
$$

and of

$$
T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \left[1 - \exp(-\gamma u_{t-1}^2)\right]^2 .
$$

Stochastic equicontinuity of the last term follows because, by a Taylor series expansion of order 1, for some mean value $\tilde{\gamma}$ on the line between $\gamma$ and $\gamma'$,

sup

sup

|T

−2

γ∈Γ γ 0 :|γ−γ 0 |
