Original Article

Deterministic or Stochastic Trend: Decision on the Basis of the Augmented Dickey-Fuller Test

Tetiana Stadnytska
Department of Psychology, University of Heidelberg, Germany

Abstract. Time series with deterministic and stochastic trends possess different memory characteristics and exhibit dissimilar long-range development. Trending series are nonstationary and must be transformed to be stabilized. The choice of the correct transformation depends on the pattern of nonstationarity in the data. Inappropriate transformations have consequences for subsequent analyses and should be avoided. The objectives of this article are (1) to introduce unit root testing procedures, (2) to evaluate strategies for distinguishing between stochastic and deterministic alternatives by means of Monte Carlo experiments, and (3) to demonstrate their implementation on empirical examples using SAS for Windows.

Keywords: unit root tests, ADF test, time series, trend stationary, difference stationary, nonstationarity, integrated, Monte Carlo experiments

Nonstationary time series with a changing mean or variance are common in psychology (Fortes, Ninot, & Delignières, 2004; Glass, Willson, & Gottman, 1975; Gottman, 1981; McCleary & Hay, 1980; Ninot, Fortes, & Delignières, 2005; Velicer & Colby, 1997; Velicer & Fava, 2003; Warner, 1998). Most psychological time series are unstable due to a deterministic trend or a stochastic drift. Nonstationary processes with deterministic and stochastic trend components possess different memory characteristics and exhibit dissimilar long-range development. In the social and behavioral sciences, the goal of time-series analysis is usually to determine the nature of the process that describes an observed behavior, to measure the effects of an intervention, as in an interrupted time-series experiment, or to forecast future values of the series under consideration. In the latter two cases, stationarity of the series under study is required, which implies some form of trend removal for nonstationary data. The choice of an appropriate detrending procedure depends on the cause of nonstationarity. Misspecifying the trend characteristics of the data can result in biased tests and false predictions (Ashley & Verbrugge, 2004, 2006; Diebold & Kilian, 2000; Diebold & Senhadji, 1996; Psaradakis & Sola, 2003). If two or more time series are nonstationary due to a stochastic drift, their long-run equilibrium can be modeled by means of cointegration techniques; clarifying the cause of nonstationarity in the data therefore represents the first step in cointegration modeling (Stroe-Kunold & Werner, 2007, 2008). This paper introduces unit root testing strategies that allow one to distinguish between stochastic and deterministic alternatives, evaluates them by means of Monte Carlo experiments, and demonstrates their implementation on empirical examples.

Different Types of Nonstationarity

This part of the article introduces the concept of stationarity, discusses the difference between stochastic and deterministic trends, and presents unit root testing as a method for distinguishing between stationary and nonstationary series.

Stationarity

A process is said to be stationary if its mean, variance, and covariance do not change over time. If that is not the case, we deal with a nonstationary process. If the goal of the analysis is to measure the effects of an intervention, as in an interrupted time-series experiment, or to forecast future values of the series, stationarity (i.e., stability) of the data under consideration is required. Many psychological time series have a time-varying mean, a time-varying variance, or both. For further analysis, such time series must be transformed to make them stationary. The transformation method depends on the cause of nonstationarity. The consequences of a false treatment can be rather serious; unfortunately, this is not emphasized in the time-series textbooks used among psychologists. Some descriptions even suggest that two popular methods for stabilizing nonstationary series, differencing and ordinary least squares regression, are interchangeable and that the choice of transformation technique is simply a matter of the researcher's preference (see, e.g., Warner, 1998, p. 39). One of the objectives of this paper is to emphasize the importance of proper stationarity transformations for empirical time-series research.
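In formal terms, the stability meant here is weak (covariance) stationarity: the first two moments of the process are constant and the autocovariances depend only on the lag k, not on the time point t,

E(Yt) = μ, Var(Yt) = σ², Cov(Yt, Yt+k) = γk for all t and every lag k.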


[Figure 1. Nonstationary processes and their autocorrelation and partial autocorrelation functions (ACF and PACF): (a) pure random walk, Yt = Yt−1 + ut; (b) random walk with drift, Yt = 2.0 + Yt−1 + ut; (c) deterministic time trend, Yt = 0.1t + 0.2Yt−1 + at with at = 0.5at−1 + ut; in each case ut ~ IID(0, σ²).]

Nonstationary Processes

Figure 1 shows the three most common nonstationary processes and their autocorrelation and partial autocorrelation functions (ACF and PACF, respectively). The process

Yt = Yt−1 + ut with ut ~ IIDN(0, σ²) (1)

is called a pure random walk. The mean of this process is equal to its initial value, but its variance (tσ²) increases indefinitely over time. A pure random walk can also be represented as the sum of random shocks

Yt = Σ ut. (2)

As a result, the impact of a particular shock does not dissipate, and the random walk remembers the shock forever. That is why a random walk is said to have an infinite memory. If a constant term is present in the equation

Yt = α + Yt−1 + ut, (3)

then Yt is called a random walk with drift, where α is known as the drift parameter. Depending on whether α is negative or positive, Yt exhibits a negative or positive stochastic trend. For a random walk with drift, the mean as well as the variance increase over time. Random walk processes are nonstationary, but their first differences

ΔYt = Yt − Yt−1 (4)

are stationary. Hence, both types of random walks are called difference stationary (DS) processes. Random walk models are also known in the time-series literature as unit root processes. A situation of nonstationarity is called the unit root problem if, in the first-order autoregressive model

Yt = ρYt−1 + ut, (5)

the parameter ρ equals 1. The name unit root is due to the fact that ρ = 1. (Under ρ = 1 the model can also be written as (1 − L)Yt = ut; the term unit root refers to the root of the polynomial in the lag operator.¹) The random walk is a specific case of a more general class of stochastic models known as integrated processes. An integrated process of first order is represented by the equation

Yt = α + Yt−1 + at, (6)

where the random part at can be generated by any stationary autoregressive moving-average, ARMA(p, q), process.

¹ Lag operator L: LYt = Yt−1, L²Yt = Yt−2, and so on. If (1 − L) = 0, we obtain L = 1, hence the name unit root.
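The infinite memory of a random walk follows directly from the cumulative-sum representation in Equation 2: starting from an initial value Y0,

Yt = Y0 + u1 + u2 + … + ut, so that Var(Yt) = tσ².

Every past shock ui thus enters Yt with full weight at all later time points, and the variance grows without bound as t increases.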


Therefore, random walk processes are integrated of order 1, denoted as I(1) or (0, 1, 0) in terms of the Box-Jenkins ARIMA methodology. In general, if a time series has to be differenced d times to make it stationary, that series is called integrated of order d. The development of the third process in Figure 1, represented by the equation

Yt = 0.1t + 0.2Yt−1 + at with at = 0.5at−1 + ut, (7)

is determined by a deterministic time trend. This process has a stable variance and a changing mean. In the simpler case

Yt = β1 + β2t + ut, (8)

the mean is β1 + β2t. If we subtract this mean from Yt, the resulting series will be stationary (this procedure is known as polynomial detrending). That is why processes of this type are called trend stationary (TS). In contrast to integrated series, processes with a deterministic trend do not exhibit an infinite memory: the deviations from the trend line do not contribute to the long-run development of the series. In the case of a stochastic trend, however, the random component at affects the long-run course of the series. Therefore, the proper transformation of nonstationary data crucially depends on the data generating process (DGP). If an empirical time series is a realization of a random walk process, the solution is to take first differences of the series. If a series is stationary around a trend line, the correct way to transform it is to regress it on time; the residuals from this regression will then be stationary. As Figure 1 shows, realizations of DS and TS processes can appear very similar, and it is especially difficult to decide whether the trend in a time series is stochastic or deterministic (Cases b and c in Figure 1). As a result, inappropriate transformations are not unusual in practice. Different studies have shown that the consequences of misspecifying the trend characteristics of the data can be rather serious.
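Both transformations are easy to carry out in SAS. The following sketch is purely illustrative (the data set and variable names sim, rw, and ts, the seed, and the parameter values are arbitrary choices): it simulates a random walk with drift (Equation 3) and a linear-trend process (Equation 8), then stabilizes the first by differencing and the second by regressing it on time and keeping the residuals.

data sim;                          /* illustrative DS and TS series              */
   rw = 0;
   do t = 1 to 100;
      u  = rannor(12345);          /* white noise, arbitrary seed                */
      rw = 0.2 + rw + u;           /* DS series: Y(t) = a + Y(t-1) + u(t)        */
      ts = 1 + 0.1*t + u;          /* TS series: Y(t) = b1 + b2*t + u(t)         */
      output;
   end;
   keep t rw ts;
run;

data sim;                          /* differencing stabilizes the DS series      */
   set sim;
   rw_dif = dif(rw);               /* first difference rw(t) - rw(t-1)           */
run;

proc reg data=sim;                 /* polynomial detrending of the TS series     */
   model ts = t;
   output out=detrended r=ts_res;  /* residuals form the detrended, stationary series */
run;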

Consequences of Inappropriate Transformations

Chan, Hayya, and Ord (1977) analyzed the effects of incorrect transformations on the autocorrelation and power spectral density functions. These authors showed that the residuals from a linear regression of a random walk series on time are not stationary and tend to exhibit cycles of increasing length and amplitude around the fitted trend line as the sample size gets larger. That is why erroneous detrending of DS series is also called underdifferencing. The residuals of inappropriately differenced TS series follow a noninvertible moving-average process; this is known as overdifferencing. There has been some debate in the literature on whether overdifferencing is a less serious error than underdifferencing (see Maddala & Kim, 1998, for an overview). Nelson and Kang (1981, 1984) detected artificial periodicity in inappropriately detrended time series and presented several ways in which investigators could obtain misleading results using underdifferenced series. Diebold and Senhadji (1996) showed that applying DS and TS models to the same time series could result in very different predictions. Schenk-Hoppé (2001) and Psaradakis and Sola (2003) demonstrated dramatic consequences of inappropriate detrending within the scope of business cycle research. Dagum and Giannerini (2006) investigated the effect of erroneous transformations on tests detecting nonlinearity and a unit root; the authors concluded that either wrong differencing or wrong detrending leads to biased results. Ashley and Verbrugge (2004, 2006) studied the effects of false transformations in the context of ordinary parameter inference in simple linear models. Underdifferencing yielded seriously oversized tests, while overdifferencing produced severely distorted estimated impulse response functions; the distortions became increasingly severe as the sample size grew.

To summarize, the memory characteristics and long-range development of a time series crucially depend on whether its trend component is deterministic or stochastic. A deterministic trend is completely predictable; a stochastic trend is not. Series with a stochastic trend have to be differenced to make them stationary. For series with a deterministic trend, polynomial detrending is the correct transformation to achieve stationarity. Inappropriate transformations have consequences for subsequent analyses and should be avoided. It is impossible to distinguish reliably between the stochastic and deterministic alternatives visually or by analyzing the ACF and PACF of the series under study.

Unit Root Testing

Testing for unit roots is common practice in econometrics. To most psychologists this methodology might be unfamiliar; therefore, this part of the article briefly presents the basic concepts of the unit root approach and demonstrates the test procedure on simulated data.

Augmented Dickey-Fuller Test

There are numerous unit root tests. One of the most popular among them is the augmented Dickey-Fuller (ADF) test. (See Gujarati, 2003, for an introduction; see Maddala & Kim, 1998, for an overview; consult Hamilton, 1994, at the advanced level.) In the simplest case of an uncorrelated error term, the test begins by estimating the equation

Yt = ρYt−1 + ut. (9)

For convenience, this equation is written in the form

ΔYt = (ρ − 1)Yt−1 + ut = δYt−1 + ut with ΔYt = Yt − Yt−1. (10)

The null hypothesis is δ = 0, which means ρ = 1, that is, there is a unit root and the time series under consideration is integrated. Dickey and Fuller (1981) have shown that under the null hypothesis the estimated t value of δ follows the τ statistic, and they computed the critical τ values on the basis of Monte Carlo simulations. The authors also introduced a competing F test with the usual F computation but special critical values. Elder and Kennedy (2001a) showed, however, that in testing for stationarity the t test is preferable. The actual procedure of implementing the ADF test involves several decisions. To allow for various possibilities of nonstationarity, the ADF test regression is estimated in three different forms (a SAS sketch of these regressions follows the list):

(I) Yt is a random walk: ΔYt = δYt−1 + ut;
(II) Yt is a random walk with drift: ΔYt = α + δYt−1 + ut;
(III) Yt is a random walk with drift around a deterministic trend: ΔYt = α + βt + δYt−1 + ut.
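For orientation, the three regressions can be written out directly as SAS code. The following sketch is hypothetical (the data set series and the derived variables mirror the example code given later in this article); note that the printed t statistic of the lagged level must be compared with the Dickey-Fuller τ critical values, not with ordinary t tables.

data series_adf;                 /* lagged level and first difference            */
   set series;
   y_dif = dif(y);
   ly    = lag(y);
run;

proc reg data=series_adf;
   model y_dif = ly / noint;     /* form I:   no constant, no trend              */
   model y_dif = ly;             /* form II:  constant (drift)                   */
   model y_dif = t ly;           /* form III: constant and deterministic trend   */
run;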

In each case, the null hypothesis is δ = 0. The alternative hypothesis is that δ < 0 (one-sided test); that is, the time series is stationary. If the null hypothesis is rejected, this means in the first case that Yt is a stationary series with a zero mean, in the second case that Yt is stationary with a nonzero mean, and in the third case that Yt is stationary around a deterministic trend. The critical τ values are different for each of the three preceding specifications of the ADF test. If the error term in the model is autocorrelated, the ADF test is conducted by "augmenting" the preceding three equations with lagged values of the dependent variable ΔYt:

ΔYt = α + βt + δYt−1 + c1ΔYt−1 + … + cpΔYt−p + ut. (11)

The number of lagged differenced terms (p) is determined empirically using information criteria such as Akaike's information criterion (AIC) and the Bayesian information criterion (BIC), the idea being to include enough terms so that the error part at is serially uncorrelated. In general, a small p is adequate for autoregressive errors or ARMA processes with small moving-average components. For error terms with moving-average coefficients near ±1, a large p is necessary (for further details see Lopez, 1997; Ng & Perron, 1995, 2001; Stock, 1994). Like the majority of unit root procedures, the ADF test suffers from size distortion through misspecification of H0 or an inappropriate selection of lagged terms (the true significance level exceeds the nominal level). Another drawback is its low power in smaller samples and in cases of ρ near 1.

Example

The following SAS code simulates a series Y with T = 100 from yt = yt−1 + at, where at = ut − 0.6ut−1 and ut ~ IIDN(0, 1).

data series;
   y=0; y1=0; a=0; u=0; a1=0; teta=0.6;
   keep t y;
   do t=-50 to 100;
      u=rannor(59837);
      y=y1+a;
      a=u-teta*a1;
      a1=u;
      y1=y;
      if t gt 0 then output;
   end;
run;

When testing for unit roots, it is crucial to specify the null and alternative hypotheses appropriately. For example, if the data are not growing, the hypotheses should reflect this. Therefore, the first step of the analysis is to examine whether the observed series exhibits an increasing or decreasing trend. Figure 2a shows that there is no apparent positive or negative trend in the generated series. The slow decay of the ACF suggests that the process may be nonstationary, so we have to decide between H0: "yt is I(1) without drift" and H1: "yt is I(0) with nonzero mean". Hence the test regression is ΔYt = α + δYt−1 + at.

[Figure 2. Generated yt = yt−1 + ut − 0.6ut−1 series and its ACF and PACF: (a) nontransformed and (b) differenced (d = 1).]

An important practical issue for the correct implementation of the ADF test is the specification of the order of serial correlation (p) in the error term at. Ng and Perron (1995) suggested the following lag length selection procedure. First, set an upper bound pmax for p. Schwert (1989) proposed to use

pmax = INT[12 (T/100)^(1/4)]. (12)

For series with lengths up to 100 observations, pmax = 8 is usually sufficient. Next, estimate the ADF test regression with p = pmax. If the absolute value of the t statistic for testing the significance of the last lagged difference is greater than 1.6, set p = pmax and perform the unit root test. Otherwise, reduce the lag length by one and repeat the process.
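As a quick illustration of Equation 12, Schwert's bound can be computed in a short DATA step (a sketch; the sample sizes are arbitrary examples). At T = 100 the rule gives pmax = 12; the analysis below works with the smaller value pmax = 8, which the text above notes is usually sufficient for series of this length.

data _null_;                           /* Schwert's rule of thumb, Equation 12 */
   do T = 50, 100, 200;
      pmax = int(12 * (T/100)**0.25);
      put T= pmax=;                    /* prints the bound to the log          */
   end;
run;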

The following SAS statements test the regression

ΔYt = α + δYt−1 + c1ΔYt−1 + c2ΔYt−2 + … + c8ΔYt−8 + ut. (13)

data series;
   set series;
   y_dif=dif(y);
   ly=lag(y);
   dy1=lag(y_dif); dy2=lag2(y_dif); dy3=lag3(y_dif); dy4=lag4(y_dif);
   dy5=lag5(y_dif); dy6=lag6(y_dif); dy7=lag7(y_dif); dy8=lag8(y_dif);
run;
proc reg data=series outest=est;
   model y_dif=ly dy1 dy2 dy3 dy4 dy5 dy6 dy7 dy8 / aic bic;
run;
proc print data=est; run;

Table 1 shows that the absolute value of the t statistic for p = 8 is < 1.6, so the procedure is repeated for 7, 6, and 5 lags. Following the Ng-Perron backward selection algorithm and minimizing the AIC and the BIC, p = 6 lags were chosen. The following SAS code conducts the ADF test with p = 6.

[Table 1. Summary statistics for the Ng-Perron backward selection procedure for the simulated yt = yt−1 + ut − 0.6ut−1 series (last lags 8, 7, 6, 5).]

Table 2. SAS output for the ADF unit root test for the simulated yt = yt−1 + ut − 0.6ut−1 series

Type           Lag    τ        Pr < τ
Single mean    6      −1.89    .3368

proc arima data=series;
   identify var=y stationarity=(adf=(6));
run;

Table 2 presents the relevant part of the SAS output. With p = 6, the ADF test statistic (τ) is −1.89 and the p value is .3368. Hence we cannot reject the unit root null hypothesis. The results of the ADF test therefore correctly indicate the unit root model without drift for the simulated series.

Deterministic Versus Stochastic Trend

[…] 5% for lags 1–5 in the moving-average case (0, 1). The obtained results are in accordance with the evaluated testing strategy, since for DS series we expect adherence to the nominal level of significance and no noticeable discrepancies in the sizes of the two subtests. Table 4 presents the percentages of significant ADF tests for models with a deterministic trend. In this case, the DGP does not contain a unit root, and the rejection of H0 therefore represents a correct test decision. It can be seen from the left-hand section of Table 4 that omitting a time trend term from the estimating regression leads to a complete lack of test power; in other words, there is no possibility of rejecting the null hypothesis of a unit root. Including the deterministic parameter, on the other hand, ensures quite good results. This enormous discrepancy in the powers of the two subtests serves as distinct evidence in favor of the suggested strategy. The right-hand section of Table 4 additionally shows that the quality of test decisions is better for low ρ values than for high ones; as expected, the worst results are obtained for ρ near 1 (recall that ρ = 1 implies a unit root). For all TS models, the power of the ADF test decreases with the number of lagged terms used. The simulation results demonstrated that for growing time series the test should begin with the third, most general hypothesis from the ADF family (a unit root with drift and a time trend) and then continue with the more restricted case of a unit root with drift. Rejection of the null hypothesis in the first test can be treated as strong evidence of a deterministic trend.


Table 3. Percentage of significant decisions of the ADF test at the nominal 5% level of significance for DS series with DGP: Yt = α + Yt−1 + at; T = 100; 1,000 replications

              Model II (constant): H0: δ = 0 (ρ = 1)    Model III (constant and trend): H0: δ = 0 (ρ = 1)
at \ Lag      1     2     3     4     5     6           1      2      3     4     5     6

α = .2
(0, 0)        2.6   2.4   2.0   2.3   3.0   2.7         4.5    5.7    5.0   4.6   4.5   5.3
(1, 0)        4.2   4.0   4.2   4.7   5.1   4.7         5.5    5.6    5.5   4.5   5.1   5.5
(0, 1)        0.2   0.8   0.7   0.6   0.8   0.9         29.0   12.0   7.8   6.2   6.1   4.7

α = .5
(0, 0)        0.6   0.2   0.4   0.6   0.8   0.8         4.5    5.7    5.0   4.6   4.5   5.3
(1, 0)        1.9   1.6   1.7   2.1   2.5   2.3         5.5    5.6    5.5   4.5   5.1   5.5
(0, 1)        0     0.4   0.5   0.3   0.5   0.2         29.0   12.0   7.8   6.2   6.1   4.7

Note. The (p, q) rows denote the ARMA(p, q) structure of the error term at; the lag columns give the number of lagged difference terms in the test regression.

Table 4. Percentage of significant decisions of the ADF test at the nominal 5% level of significance for TS series with DGP: Yt = 0.1t + ρYt−1 + at; T = 100; 1,000 replications

              Model II (constant): H0: δ = 0 (ρ = 1)    Model III (constant and trend): H0: δ = 0 (ρ = 1)
at \ Lag      1      2     3     4     5     6          1      2      3      4      5      6

ρ = 0.0
(0, 0)        0      0     0     0     0     0          100    100    99.1   96.4   86.4   69.0
(1, 0)        0      0     0     0     0     0          99.9   97.2   89.3   75.7   62.6   47.3
(0, 1)        0      0     0     0     0     0          99.9   99.0   98.0   89.9   79.1   62.7

ρ = 0.2
(0, 0)        0      0     0     0     0     0          100    100    98.4   92.1   80.4   63.7
(1, 0)        0      0     0     0     0     0          99.5   94.5   84.6   69.6   57.4   43.4
(0, 1)        0.2    0     0     0     0     0          99.9   98.2   96.6   84.4   72.8   56.9

ρ = 0.5
(0, 0)        0      0     0     0     0     0          99.9   97.2   89.3   75.7   62.6   47.3
(1, 0)        1.9    0.2   0.3   0.1   0     0          92.0   78.9   68.2   52.1   43.6   32.6
(0, 1)        2.8    0     0     0     0     0          99.7   87.1   86.4   65.9   56.8   44.8

ρ = 0.8
(0, 0)        0      0     0     0     0     0          56.8   41.8   35.6   29.8   24.0   18.0
(1, 0)        11.5   7.9   6.0   4.9   4.5   3.5        38.6   30.7   27.2   21.4   18.6   15.7
(0, 1)        11.9   2.2   2.8   1.9   2.7   1.3        76.2   28.0   39.5   24.8   25.8   20.9

Note. The (p, q) rows denote the ARMA(p, q) structure of the error term at; the lag columns give the number of lagged difference terms in the test regression.

If the null hypothesis is not rejected in either test, the growth in the observed series is probably due to a stochastic trend. Furthermore, the findings underlined the importance of the proper choice of lag length in unit root testing. The presented analysis was limited to rather simple models; using more complex error structures or DGPs with nonlinear trends could lead to divergent results (see Kim et al., 2004).
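To make the simulation design concrete, the following sketch generates a single replication of the trend-stationary DGP underlying Table 4. It is illustrative only: the seed is arbitrary, ρ is set to 0.5, and the AR(1) error from Equation 7 is used, whereas the actual experiments used 1,000 replications and several error structures.

data ts_sim;                       /* one replication of the TS DGP of Table 4   */
   rho = 0.5;                      /* autoregressive parameter of the DGP        */
   y = 0; a = 0;                   /* initial values                             */
   do t = 1 to 100;
      u = rannor(20100);           /* arbitrary seed                             */
      a = 0.5*a + u;               /* AR(1) error, as in Equation 7              */
      y = 0.1*t + rho*y + a;       /* Y(t) = 0.1t + rho*Y(t-1) + a(t)            */
      output;
   end;
   keep t y;
run;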

Empirical Demonstration

To illustrate the ADF test procedure for trend cases, raw time-series data on traffic fatalities for New York State from January 1951 to April 1960 were employed. The data are freely available in the textbook of Glass et al. (1975). Figure 3 shows the original series, its first difference, and the residuals from a linear regression on time. Employing the Box-Jenkins procedure, Glass and colleagues fitted a (0, 1, 1) model to the series. (For an elaborated strategy of model selection combining different techniques, see Stadnytska, Braun, and Werner, 2008a, 2008b.) According to Glass et al. (1975), the positive trend in the series is therefore due to a stochastic drift. For a growing series, the hypotheses to be tested in the ADF procedure are H0: yt is I(1) with drift, and H1: yt is I(0) with a deterministic time trend.


[Figure 3. New York State traffic fatalities data from Glass et al. (1975) and its ACF and PACF: (a) nontransformed, (b) differenced, and (c) detrended.]

Thus the test regression is

ΔYt = α + βt + δYt−1 + c1ΔYt−1 + … + cpΔYt−p + ut,

which captures the deterministic trend under the alternative. The number of lags in the test regression can be determined using the Ng-Perron backward selection method. The following SAS code fits the test regression with p = 6 to the data and computes the AIC and BIC.

data newyork;
   set newyork;
   y_dif=dif(y);
   ly=lag(y);
   dy1=lag(y_dif); dy2=lag2(y_dif); dy3=lag3(y_dif);
   dy4=lag4(y_dif); dy5=lag5(y_dif); dy6=lag6(y_dif);
run;
proc reg data=newyork outest=est;
   model y_dif=t ly dy1 dy2 dy3 dy4 dy5 dy6 / aic bic;
run;
proc print data=est; run;


Table 5. Summary statistics for the Ng-Perron backward selection procedure for the New York traffic fatalities series

Last lag    |t|     AIC        BIC        RMSE
6           1.52    −111.00    −107.10    0.526
5           0.53    −112.78    −109.30    0.527
4           1.2     −116.68    −113.58    0.522
3           0.51    −119.11    −116.31    0.522
2           1.39    −123.02    −120.49    0.517
1           0.9     −124.49    −122.15    0.519

Table 5 summarizes the results of the Ng-Perron algorithm for lags 6–1. They indicate that there is no need for lagged differences in the test regression: the absolute value of the t statistic is < 1.6 for all lags tested, and the model with p = 1 has the smallest AIC and BIC. (Note that it does not matter that the AIC and BIC values are all negative in this example; the values are simply compared with each other, and we pick the model with the smallest actual, not absolute, value of AIC or BIC.) The following SAS statements perform the ADF test for lags 0–2.

proc arima data=newyork;
   identify var=y stationarity=(adf=2);
run;


Table 6. SAS output for the ADF unit root test for the New York traffic fatalities series

Type           Lags    τ         Pr < τ
Single mean    0       −6.07     < .0001
               1       −3.39     .0137
               2       −3.04     .0346
Trend          0       −10.16    < .0001
               1       −6.32     < .0001
               2       −6.09     < .0001

Table 6 shows the relevant parts of the SAS output. For lags 0–2, the results strongly indicate rejection of the null hypothesis of a unit root with drift; for instance, with p = 1 the ADF test statistic is −10.16 (p value < .0001). In contrast to Glass et al. (1975), the ADF test therefore indicates that the series under study is TS. This simple example illustrates the impact of model selection on the theoretical interpretation of the phenomena under study. The I(1) model identified by Glass and colleagues suggests that the growth in the series is due to an accumulation of random shocks: according to this model, an accidental increase in traffic fatalities in the past will persistently affect the future level of the series (since integrated processes remember shocks forever). In contrast, the TS model identified by means of the ADF test strategy does not assume an influence of random shocks on the long-run development of the series and explains the observed growth as a function of the deterministic time trend (i.e., more cars, more accidents, and more fatalities). Furthermore, a comparison of forecasts for the last observation (t = 100) in the data demonstrated the superiority of the TS model over the I(1) alternative. The forecast Δy100 = 0.87 (0.56) from the TS model

ΔYt = 2.18 + 0.018t − 1.04Yt−1 + ut (14)

lay close to the observed value Δy100 = 0.94, whereas the predicted value Δy100 = 0.23 (0.52) from the (0, 1, 1) model

ΔYt = 0.017 + ut − 0.98ut−1 (15)

deviated strongly from the true one.
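For completeness, the competing (0, 1, 1) forecast can be reproduced along the following lines. This is a sketch rather than the original analysis code: the data set name newyork is taken from the examples above, while the estimation method and the output data set name are assumptions.

proc arima data=newyork;
   identify var=y(1);               /* work with the first difference, cf. model (15) */
   estimate q=1 method=ml;          /* MA(1) with a constant term                     */
   forecast back=1 lead=1 out=fc_011;  /* hold out the last observation and forecast it */
run;

The one-step forecast from the TS model (14) can be obtained analogously from the fitted trend regression.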

Summary and Conclusions

Many psychological time series are nonstationary. Increasing or decreasing behavior of an observed series can be due to a deterministic or a stochastic trend. A deterministic trend is completely predictable; a stochastic trend is not. Therefore, DS and TS models of the same time series imply different memory properties of the process under investigation. They also require different trend removal procedures and may result in divergent predictions. Hence the decision about which model to use is tremendously important for applied researchers. Furthermore, the described unit root testing strategies play an important role within the multivariate cointegration framework. The ADF test allows the researcher to determine the cause of nonstationarity in the data. Testing for a unit root always implies a testing strategy, not the mere calculation of a single test statistic. The first step in the strategy is to plot the data against time and to rule out unreasonable hypotheses on the basis of theoretical considerations (see Elder & Kennedy, 2001b). For growing time series, the test regression should include a constant and a time trend (Hypothesis III from the ADF family). In series with autocorrelated errors, the number of lagged difference terms (p) used to approximate the ARMA structure of the errors exercises a strong influence on test decisions. If p is too small, the remaining serial correlation in the errors will bias the test; including too many lagged terms reduces the power of the test. Employing the lag length selection procedure suggested by Ng and Perron (1995) results in a stable size of the ADF test with minimal power loss.

References

Ashley, R., & Verbrugge, R. J. (2004). To difference or not to difference: A Monte Carlo investigation of inference in vector autoregressive models. VPI Economic Department Working Paper. Retrieved from http://ashleymac.econ.vt.edu/working_papers/9915.pdf
Ashley, R., & Verbrugge, R. J. (2006). Comments on "A critical investigation on detrending procedures for non-linear processes". Journal of Macroeconomics, 28, 192–194.
Ayat, L., & Burridge, P. (2000). Unit root tests in the presence of uncertainty about the non-stochastic trend. Journal of Econometrics, 95, 71–96.
Box, G. E. P., & Pierce, D. A. (1970). Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. Journal of the American Statistical Association, 65, 1509–1526.
Chan, K. H., Hayya, J. C., & Ord, J. K. (1977). A note on trend removal methods: The case of polynomial regression versus variate differencing. Econometrica, 45, 737–744.
Dagum, E. B., & Giannerini, S. (2006). A critical investigation on detrending procedures for non-linear processes. Journal of Macroeconomics, 28, 175–191.
DeJong, D. N., Nankervis, J. C., Savin, N. E., & Whiteman, C. H. (1992). The power problems of unit root tests in time series with autoregressive errors. Journal of Econometrics, 53, 323–343.
Dickey, D. A. (1984). Power of unit root tests. Proceedings of the Business and Economic Statistics Section of the ASA, 489–493.
Dickey, D. A., & Fuller, W. A. (1981). Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica, 49, 1057–1072.
Diebold, F. X., & Kilian, L. (2000). Unit root tests are useful for selecting forecasting models. Journal of Business and Economic Statistics, 18, 265–273.
Diebold, F. X., & Senhadji, A. S. (1996). The uncertain unit root in real GNP: Comment. American Economic Review, 86, 1291–1298.
Elder, J., & Kennedy, P. E. (2001a). F versus t test for unit roots. Economics Bulletin, 3, 1–6.
Elder, J., & Kennedy, P. E. (2001b). Testing for unit roots: What should students be taught? Journal of Economic Education, 32, 137–146.
Fortes, M., Ninot, G., & Delignières, D. (2004). The dynamics of self-esteem and physical self: Between preservation and adaptation. Quality and Quantity, 38, 735–751.
Glass, G. V., Willson, V. L., & Gottman, J. M. (1975). Design and analysis of time-series experiments. Boulder, CO: Colorado Associated University Press.
Gottman, J. M. (1981). Time-series analysis. New York: Cambridge University Press.
Granger, C. W. J., & Newbold, P. (1986). Forecasting economic time series. San Diego, CA: Academic Press.
Gujarati, D. (2003). Basic econometrics (4th ed.). New York: McGraw-Hill.
Hamilton, J. D. (1994). Time series analysis. Princeton, NJ: Princeton University Press.
Holden, D., & Perman, R. (1994). Unit roots and cointegration for the economist. In B. B. Rao (Ed.), Cointegration for the applied economist (pp. 47–112). New York: St. Martin's.
Kim, T., Leybourne, S. J., & Newbold, P. (2004). Behaviour of Dickey-Fuller unit root tests under trend misspecification. Journal of Time Series Analysis, 25, 755–764.
Ljung, G. M., & Box, G. E. P. (1978). On a measure of lack of fit in time series models. Biometrika, 65, 297–303.
Lopez, J. H. (1997). The power of the ADF test. Economics Letters, 57, 5–10.
Maddala, G. S., & Kim, I. (1998). Unit roots, cointegration and structural change. Cambridge: Cambridge University Press.
McCleary, R., & Hay, R. A., Jr. (1980). Applied time series analysis for the social sciences. Beverly Hills, CA: Sage.
Nelson, C. R., & Kang, H. (1981). Spurious periodicity in inappropriately detrended time series. Econometrica, 49, 741–751.
Nelson, C. R., & Kang, H. (1984). Pitfalls in the use of time as an explanatory variable in regression. Journal of Business and Economic Statistics, 2, 73–82.
Ng, S., & Perron, P. (1995). Unit root tests in ARMA models with data dependent methods for the selection of the truncation lag. Journal of the American Statistical Association, 90, 268–281.
Ng, S., & Perron, P. (2001). Lag length selection and the construction of the unit root test with good size and power. Econometrica, 69, 1519–1554.
Ninot, G., Fortes, M., & Delignières, D. (2005). The dynamics of self-esteem in adults over a six-month period: An exploratory study. The Journal of Psychology, 139, 315–330.
Perron, P. (1988). Trends and random walks in macroeconomic time series. Journal of Economic Dynamics and Control, 12, 297–332.
Psaradakis, Z., & Sola, M. (2003). On detrending and cyclical asymmetry. Journal of Applied Econometrics, 18, 271–289.
Schenk-Hoppé, K. (2001). Economic growth and business cycles: A critical comment on detrending time series. Studies in Nonlinear Dynamics and Econometrics, 5, 75–86.
Schwert, G. W. (1989). Tests for unit roots: A Monte Carlo investigation. Journal of Business and Economic Statistics, 7, 147–160.
Stadnytska, T., Braun, S., & Werner, J. (2008a). Model identification of integrated ARMA processes. Multivariate Behavioral Research, 43, 1–28.
Stadnytska, T., Braun, S., & Werner, J. (2008b). Comparison of automated procedures for ARMA model identification. Behavior Research Methods, 40, 250–262.
Stock, J. H. (1994). Unit roots, structural breaks, and trends. In R. Engle & D. McFadden (Eds.), Handbook of econometrics (Vol. IV, pp. 2740–2843). Amsterdam: Elsevier.
Stroe-Kunold, E., & Werner, J. (2007). Sind psychologische Prozesse kointegriert? Standortbestimmung und Perspektiven der Kointegrationsmethodologie in der psychologischen Forschung [Are psychological processes cointegrated? Present role and future perspectives of cointegration methodology in psychological research]. Psychologische Rundschau, 58(4), 225–237.
Stroe-Kunold, E., & Werner, J. (2008). Modeling human dynamics by means of cointegration methodology. Methodology, 4(3), 113–131.
Velicer, W. F., & Colby, S. M. (1997). Time series analysis for prevention and treatment research. In K. J. Bryant, M. Windle, & S. G. West (Eds.), The science of prevention: Methodological advances from alcohol and substance abuse research (pp. 211–249). Washington, DC: American Psychological Association.
Velicer, W. F., & Fava, J. L. (2003). Time series analysis. In J. Schinka & W. F. Velicer (Eds.), Research methods in psychology (pp. 581–606). New York: Wiley.
Warner, R. M. (1998). Spectral analysis of time-series data. New York: Guilford.
West, K. D. (1987). A note on the power of least squares tests for a unit root. Economics Letters, 24, 249–252.

Tetiana Stadnytska Department of Psychology University of Heidelberg Hauptstrasse 47-51 69117 Heidelberg Germany Tel. +49 6221 547345 E-mail [email protected]
