Symmetric Test for Second Differencing in Univariate Time Series
Author(s): D. L. Sen and D. A. Dickey
Source: Journal of Business & Economic Statistics, Vol. 5, No. 4 (Oct., 1987), pp. 463-473
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/1391998


© 1987 American Statistical Association


D. L. Sen
IBM Corporation, Raleigh, NC 27609

D. A. Dickey
Department of Statistics, North Carolina State University, Raleigh, NC 27695

A test for the null hypothesis that a time series has characteristic equations with two unit roots is presented. The test, based on a standard regression computation, is shown to have good power properties when compared to previously existing tests.

KEY WORDS: Nonstationary; Unit roots.

1. INTRODUCTION

Hasza and Fuller (1979) developed a test for the null hypothesis that a univariate time series has two characteristic roots equal to 1. The test is based on a statistic readily obtained by standard regression software. It is a statistic that, in a standard regression, is distributed as Snedecor's F. For the double-unit-root time series case, the distribution was unknown until Hasza (1977) tabulated it. In this article, we show how to compute a symmetric version of Hasza's F statistic by a simple modification of the original data. Tables of critical values for the symmetric test statistic are given. These differ, even in the limit, from those given by Hasza. This symmetrized version of the test is shown to have more power than the ordinary Hasza F. Two examples are given.

2. THE MODEL

We consider the general model

$$Y_t - \mu = \Phi_1(Y_{t-1} - \mu) + \Phi_2(Y_{t-2} - \mu) + \cdots + \Phi_{p+2}(Y_{t-p-2} - \mu) + e_t, \qquad (2.1)$$

with $e_t \sim N(0, \sigma^2)$ independently. A convenient reparameterization is

$$\nabla^2(Y_t - \mu) = \alpha_1(Y_{t-1} - \mu) + \alpha_2\nabla(Y_{t-1} - \mu) + \sum_{j=1}^{p} \beta_j\nabla^2(Y_{t-j} - \mu) + e_t. \qquad (2.2)$$

Here $\nabla$ denotes the differencing operator $\nabla Y_t = Y_t - Y_{t-1}$. Since we want to allow the possibility of characteristic roots equal to 1, we will assume the initial conditions $Y_0 = Y_{-1} = \cdots = Y_{-p-1} = \mu$. This is a standard assumption in dealing with nonstationary time series and arises because such time series cannot be expanded as linear functions of $e_t$'s stretching back into the infinite past as can be done with stationary series. For our purposes, we will define the model to be stationary or nonstationary according to the location of the roots of the characteristic equation that, for model (2.1), is

$$M^{p+2} - \Phi_1 M^{p+1} - \Phi_2 M^{p} - \cdots - \Phi_{p+2} = 0. \qquad (2.3)$$

If all roots of (2.3) are less than 1 in magnitude, then (2.1) is stationary. If d of the roots are equal to 1 with all of the rest less than 1 in magnitude, the series exhibits the type of homogeneous nonstationarity described by Box and Jenkins (1976). This type of nonstationarity can be removed by taking the dth difference of the original series. If any of the roots are larger than 1 in magnitude, the series is said to have explosive nonstationarity. In this article, we will be concerned with testing the null hypothesis d = 2. Notice that if d = 2, the reparameterized model (2.2) will have $\alpha_1 = \alpha_2 = 0$ [see (5.1) and the material that follows], implying that the model can be expressed solely in second differences with the stationarity or nonstationarity in second differences being determined by the new characteristic equation

$$M^{p} - \beta_1 M^{p-1} - \beta_2 M^{p-2} - \cdots - \beta_p = 0. \qquad (2.4)$$

Throughout this article, we assume that the roots of (2.4) are all inside the unit circle.

3. PREVIOUS CONTRIBUTIONS

Several authors have contributed in recent years to testing the null hypothesis that d = 1, where d is the number of unit roots in the characteristic equation. Fuller (1976, sec. 8.2) discussed tests for d = 1 and gave the tables of percentiles for the tests. Dickey and Fuller (1979) gave details of the percentile tabulation and limit theory. An alternative method for the computation of these percentiles was given by Evans and Savin (1981a). Another approach to the distributional problem was given by Ahtola (1983). Articles illustrating these tests and containing references to previous work are those by Dickey, Bell, and Miller (1986), Nelson and Plosser (1982), and Evans and Savin (1981b). Fountis and Dickey (1986) extended these tests to vector autoregressions and gave the transformation for vector processes that is analogous to differencing in univariate series. Dickey, Hasza, and Fuller (1984) gave test statistics for seasonal unit roots and gave symmetrized versions of these. It is this method of computing symmetric statistics that we will mimic here.

Hasza (1977) and Hasza and Fuller (1979) discussed the distribution of the statistic that would be computed by a standard regression package to test the null hypothesis that d = 2; that is, $H_0$: $\alpha_1 = \alpha_2 = 0$ in model (2.2). Although this is computed by the formula for a regression F statistic, its distribution is not that of Snedecor's F, and percentiles computed by Hasza must be used to perform the test. This is the test we propose to modify.

Pantula (1986) and Dickey and Pantula (1987) suggested sequential methods of testing for an arbitrary number of unit roots [roots M = 1 in (2.3)]. Pantula (1986) computed distribution percentiles for testing up to d = 5 unit roots. Note that Hasza's procedure and our modification of it do not proceed sequentially but instead involve a single test. Chan and Wei (1986) presented a theoretical representation, in terms of functionals of a Wiener process, for distributions of estimators in autoregressions with an arbitrary number of roots around the unit circle. Percentiles for these distributions are not available at present except for the specific cases mentioned previously.

4. MOTIVATION FOR SYMMETRIC STATISTICS

The motivation for symmetric statistics arises from an interesting property of series that satisfy the standard definition of weak stationarity as given, for example, by Fuller (1976, p. 4). The interesting property is that if the difference equation defining the series is

$$Y_t - \mu = \Phi_1(Y_{t-1} - \mu) + \Phi_2(Y_{t-2} - \mu) + \cdots + \Phi_p(Y_{t-p} - \mu) + e_t, \qquad (4.1)$$

where $e_t$ is white noise (an uncorrelated mean 0 variance $\sigma^2$ sequence), then $Y_t - \mu - \Phi_1(Y_{t+1} - \mu) - \Phi_2(Y_{t+2} - \mu) - \cdots - \Phi_p(Y_{t+p} - \mu)$ is also a white-noise sequence with variance $\sigma^2$; that is,

$$Y_t - \mu = \Phi_1(Y_{t+1} - \mu) + \Phi_2(Y_{t+2} - \mu) + \cdots + \Phi_p(Y_{t+p} - \mu) + v_t, \qquad (4.2)$$

with $v_t$ having mean 0 and variance $\sigma^2$. Fuller (1976, chap. 2) gave a full discussion of this property. Now if we were dealing with such a weakly stationary time series, Equations (4.1) and (4.2) suggest a regression as in Table 1. Usually $\mu$ is unknown. It can be replaced, for example, by the sample mean. Now if we perform a regression in this symmetric fashion and use it to test for unit roots, we hope to gain some power because of improved estimation under the alternative that the series is stationary. It is possible that such a test would have some power against explosive alternatives, but our motive is to increase power against stationary alternatives that we believe to be the most common situation in terms of practical interest. We will need to tabulate the distribution of the test statistic under the hypothesis that d = 2, because, as we will see, it differs both from Snedecor's F and from Hasza's F.

Table 1. Regression Tableau for Symmetric Estimator

$$\begin{array}{c|cccc}
\text{Dependent variable} & \text{Independent variables} & & & \\ \hline
Y_{p+1} - \mu & Y_{p} - \mu & Y_{p-1} - \mu & \cdots & Y_{1} - \mu \\
Y_{p+2} - \mu & Y_{p+1} - \mu & Y_{p} - \mu & \cdots & Y_{2} - \mu \\
\vdots & \vdots & \vdots & & \vdots \\
Y_{n} - \mu & Y_{n-1} - \mu & Y_{n-2} - \mu & \cdots & Y_{n-p} - \mu \\
Y_{n-p} - \mu & Y_{n-p+1} - \mu & Y_{n-p+2} - \mu & \cdots & Y_{n} - \mu \\
\vdots & \vdots & \vdots & & \vdots \\
Y_{1} - \mu & Y_{2} - \mu & Y_{3} - \mu & \cdots & Y_{p+1} - \mu
\end{array}$$

5. DEFINITION OF SYMMETRIC STATISTIC $F_M(2)$

We have referred to the F-type statistic of Hasza's work as Hasza's F. It is computed using the standard difference in regression sum of squares between model (2.2) with all parameters estimated and (2.2) with the $\alpha$'s set to 0 by deleting the corresponding regression columns. We thus are motivated to produce a regression tableau, Table 2, expressed in terms of a model parameterized as in (2.2). Notice the effect of differencing on $\mu$. From (2.2), the coefficients on the independent variables are $\alpha_1, \alpha_2, \beta_1, \beta_2, \ldots, \beta_p$. These are related to the coefficients $\Phi_j$ in model (2.1) by

$$\beta_k = \sum_{j=k+2}^{p+2} (j - k - 1)\Phi_j,$$

$$\alpha_1 = \sum_{j=1}^{p+2} \Phi_j - 1,$$

and

$$\alpha_2 = -1 + \sum_{j=2}^{p+2} (-j + 1)\Phi_j.$$

We can equivalently define $\beta_k$ as $\beta_p = \Phi_{p+2}$ and $\beta_k - \beta_{k+1} = \sum_{j=k+2}^{p+2} \Phi_j$ for $k < p$, which enables us to extend the sequence through $\beta_0 = \beta_1 + \sum_{j=2}^{p+2} \Phi_j = -(\alpha_2 + 1)$. If the characteristic equation (2.3) has a unit root, then

$$1 - \Phi_1 - \Phi_2 - \cdots - \Phi_{p+2} = 0 \iff \alpha_1 = 0, \qquad (5.1)$$

and (2.3) can be written as

$$(M - 1)[M^{p+1} + (1 - \Phi_1)M^{p} + (1 - \Phi_1 - \Phi_2)M^{p-1} + \cdots + (1 - \Phi_1 - \Phi_2 - \cdots - \Phi_{p+1})] = 0.$$

Using (5.1) and the extended $\beta_j$'s, we factor the characteristic equation as

$$(M - 1)[M^{p+1} + (\beta_0 - \beta_1)M^{p} + (\beta_1 - \beta_2)M^{p-1} + \cdots + \beta_p] = 0.$$
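The relations between the coefficients of (2.1) and those of (2.2) can be checked numerically. The following is a minimal Python sketch and is not part of the original article; NumPy is assumed, and the function names `reparameterize` and `ar_polynomial` are hypothetical. It computes $(\alpha_1, \alpha_2, \beta)$ from the $\Phi_j$ using the sums given in this section, then rebuilds the lag polynomial implied by (2.2) and compares it with the one implied by (2.1).

```python
import numpy as np

def reparameterize(phi):
    """Map Phi_1..Phi_{p+2} of model (2.1) to (alpha1, alpha2, beta) of model (2.2)."""
    phi = np.asarray(phi, dtype=float)
    p = len(phi) - 2
    j = np.arange(1, len(phi) + 1)               # j = 1, ..., p+2
    alpha1 = phi.sum() - 1.0
    alpha2 = -1.0 - np.sum((j - 1) * phi)
    # beta_k = sum_{j=k+2}^{p+2} (j - k - 1) * Phi_j, for k = 1, ..., p
    beta = np.array([np.sum((j[k + 1:] - k - 1) * phi[k + 1:])
                     for k in range(1, p + 1)])
    return alpha1, alpha2, beta

def ar_polynomial(alpha1, alpha2, beta):
    """Coefficients, in ascending powers of the lag operator B, of
    (1 - B)^2 (1 - sum_j beta_j B^j) - alpha1 B - alpha2 B (1 - B)."""
    beta_poly = np.concatenate(([1.0], -np.asarray(beta, dtype=float)))
    poly = np.convolve([1.0, -2.0, 1.0], beta_poly)
    poly[1] -= alpha1 + alpha2                   # -alpha1*B - alpha2*B
    poly[2] += alpha2                            # +alpha2*B^2
    return poly

phi = [0.5, 0.2, -0.1, 0.05]                     # arbitrary illustration with p = 2
a1, a2, b = reparameterize(phi)
rebuilt = ar_polynomial(a1, a2, b)
target = np.concatenate(([1.0], -np.array(phi))) # 1 - Phi_1 B - ... - Phi_{p+2} B^{p+2}
print(np.allclose(rebuilt, target))
```

For a series with two unit roots and p = 0, such as `phi = [2.0, -1.0]`, the same function returns $\alpha_1 = \alpha_2 = 0$, in line with (5.1) and the factorization above.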

Now the factor in brackets is the characteristic equation of the first differenced series, and it has a unit root iff $1 + \beta_0 = 0$, that is, iff $\alpha_2 = 0$. Thus (2.3) has d = 2 unit roots iff $\alpha_1 = \alpha_2 = 0$. Define the symmetric statistic $F_M(2)$ as the ratio of the mean square for $\alpha_1 = \alpha_2 = 0$ divided by the error mean square in the regression in Table 2. To compute Hasza's F test, referred to as $\Phi_2(2)$ by Hasza and Fuller (1979, p. 1110), use only the top half of the table (above ***) with $\mu = 0$ and an additional column of 1s.

Table 2. Regression Tableau for Symmetric Estimator

Dependent variable | Independent variables
Top half, rows t = p + 3, ..., n:
  $\nabla^2 Y_t$ | $Y_{t-1} - \mu$, $\nabla Y_{t-1}$, $\nabla^2 Y_{t-1}$, ..., $\nabla^2 Y_{t-p}$
***
Bottom half, the corresponding rows with the series taken in reverse time order:
  $\nabla^2 Y_t$ | $Y_{t-1} - \mu$, $-\nabla Y_t$, $\nabla^2 Y_{t+1}$, ..., $\nabla^2 Y_{t+p}$

Referring to the work of Hasza and Fuller (1979), all of the sums of squares and cross products for the regression in the upper half of Table 2 have been studied. In particular, normalization factors (powers of n) and representations of the limit distributions as quadratic forms have been obtained by them. It is a rather trivial extension to describe the symmetric regression in terms of these same quadratic forms. Because of the negative signs in column 3 of the table, some simplifications occur as compared to Hasza's F. A full development was given by Sen (1986), and we present here only the salient features of the development.

Let X be the matrix consisting of the p + 2 columns of the "independent variable" of Table 2. Define D to be the (p + 2)-dimensional diagonal matrix with diagonal elements $n^2, n, n^{1/2}, n^{1/2}, \ldots, n^{1/2}$. Then, under the hypothesis that $\alpha_1 = \alpha_2 = 0$,

$$D^{-1}(X'X)D^{-1} - B = O_p(n^{-1/2}),$$

where B is a block-diagonal matrix. We now evaluate the elements of B. The upper left 2 x 2 submatrix of $D^{-1}(X'X)D^{-1}$ is

$$\begin{pmatrix}
2n^{-4}\sum_{t=p+3}^{n}(Y_{t-1} - \mu)^2 & n^{-3}\sum_{t=p+2}^{n-1}(Y_t - \mu)\nabla Y_t - n^{-3}\sum_{t=p+3}^{n}(Y_{t-1} - \mu)\nabla Y_t \\
n^{-3}\sum_{t=p+2}^{n-1}(Y_t - \mu)\nabla Y_t - n^{-3}\sum_{t=p+3}^{n}(Y_{t-1} - \mu)\nabla Y_t & n^{-2}\left([\nabla Y_{p+2}]^2 + 2\sum_{t=p+3}^{n-1}[\nabla Y_t]^2 + [\nabla Y_n]^2\right)
\end{pmatrix} \qquad (5.2)$$

and using Dickey and Fuller's (1979) results, $\sum[\nabla Y_t]^2 = O_p(n^2)$, so the off-diagonal elements converge in probability to 0. This is the first of two simplifications arising from symmetrizing the regression. The diagonal elements converge to twice the quadratic-form limits for Hasza's regression (the one using the top half of Table 2). The limits are random variables, not constants, and their limit representations as quadratic forms were given by Hasza and Fuller (1979). The methodology of Chan and Wei (1986) could also be used to express these as functionals of a Wiener process, but we choose the quadratic forms approach to use available software to get the percentiles of the limit distributions. Either way, we have defined the upper left 2 x 2 submatrix of B as a diagonal matrix with random diagonal elements.

The lower right p x p submatrix of B is twice the variance-covariance matrix, $\Gamma$, of p consecutive observations from an autoregression with characteristic Equation (2.4) and is thus a matrix of constants. Since B is block diagonal, the joint limit distribution can be obtained in pieces. The block diagonality follows directly from the work of Hasza and Fuller (1979) or from Chan and Wei (1986). Specifically, the joint-limit distribution of the estimates of $\alpha_1$ and $\alpha_2$ does not depend on the number of nonzero $\beta$'s, and the joint-limit distribution of the estimates of the $\beta$'s is the same as we would get if we assumed that the $\alpha$'s were 0. This implies that the $\beta$ estimators have the usual multivariate normal distribution. Specifically, letting b denote the vector of estimates, we have

$$n^{1/2}(b - \beta) \xrightarrow{L} N(0, \Gamma^{-1}\sigma^2).$$

Notice that the lower right corner of B is $2\Gamma$, so regression estimates of standard errors for the $\beta$'s will be too small by the factor square root of 2.

To obtain the joint-limit distribution of the $\alpha$'s, we set p = 0 (without loss of generality). It follows that under our null hypothesis, the column of dependent variables in Table 2 has elements composed of the white-noise error sequence from the hypothesized order 2 autoregression with two unit roots, namely, $e_3, e_4, e_5, \ldots, e_n, e_n, e_{n-1}, e_{n-2}, \ldots, e_3$. Under the null hypothesis, then, we are interested in (for p = 0)

$$\begin{pmatrix}
2n^{-2}\sum_{t=3}^{n}(Y_{t-1} - \mu)e_t \\
n^{-1}\sum_{t=3}^{n}[\nabla Y_{t-1}e_t - \nabla Y_t e_t]
\end{pmatrix}
=
\begin{pmatrix}
2n^{-2}\sum_{t=3}^{n}(Y_{t-1} - \mu)e_t \\
n^{-1}\sum_{t=3}^{n}(-1)(e_t)^2
\end{pmatrix}. \qquad (5.3)$$
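The regression in Table 2 can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' program: NumPy is assumed, the function names (`forward_rows`, `symmetric_tableau`, `symmetric_F`) are hypothetical, the bottom half of the tableau is assumed to be the top-half construction applied to the series in reverse time order (which produces the negative signs in the $\nabla Y$ column), $\mu$ is replaced by the sample mean, and the error degrees of freedom are taken as the number of rows minus p + 2, which is an assumption.

```python
import numpy as np

def forward_rows(y, mu, p):
    """Rows [dependent, Y_{t-1}-mu, first diff at t-1, second diffs at t-1..t-p]."""
    d1 = np.diff(y)        # d1[i] = y[i+1] - y[i]
    d2 = np.diff(y, 2)     # d2[i] = y[i+2] - 2*y[i+1] + y[i]
    rows = []
    for i in range(p + 2, len(y)):               # 0-based position of the dependent value
        lags = [d2[i - 2 - j] for j in range(1, p + 1)]
        rows.append([d2[i - 2], y[i - 1] - mu, d1[i - 2]] + lags)
    return np.array(rows)

def symmetric_tableau(y, p=0):
    """Stack the forward rows and the same rows built from the reversed series."""
    y = np.asarray(y, dtype=float)
    mu = y.mean()                                # sample mean in place of mu
    return np.vstack([forward_rows(y, mu, p), forward_rows(y[::-1], mu, p)])

def _rss(dep, X):
    """Residual sum of squares of a no-intercept least squares fit."""
    if X.shape[1] == 0:
        return float(dep @ dep)
    coef = np.linalg.lstsq(X, dep, rcond=None)[0]
    resid = dep - X @ coef
    return float(resid @ resid)

def symmetric_F(y, p=0):
    """2 df regression 'F statistic' for alpha1 = alpha2 = 0 in the symmetric tableau."""
    tab = symmetric_tableau(y, p)
    dep, X = tab[:, 0], tab[:, 1:]
    rss_full = _rss(dep, X)
    rss_null = _rss(dep, X[:, 2:])               # drop the two alpha columns
    df_error = len(dep) - X.shape[1]             # assumed degrees of freedom
    return ((rss_null - rss_full) / 2.0) / (rss_full / df_error)

rng = np.random.default_rng(0)
y = np.cumsum(np.cumsum(rng.standard_normal(100)))   # series with two unit roots
print(symmetric_F(y, p=0))
```

The value returned is a regression formula only; it is to be compared with the percentiles tabulated for $F_M(2)$, not with Snedecor's F.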

Here again if we use the sample mean to estimate $\mu$, we can use Hasza and Fuller's (1979) results to characterize the limit random variables. The second simplification caused by symmetry is the convergence of the second element of (5.3) to a constant, $-\sigma^2$. The first element is 2 times a quadratic form studied by Hasza and Fuller (1979). They defined all of their limit distributions in terms of a vector W, and we will be able to do the same. Their $W = (W_1, W_2, \ldots, W_6)$ is defined by letting $V_i$ be an independent N(0, 1) sequence, $\gamma_i = 2[(2i - 1)\pi]^{-1}(-1)^{i+1}$, then

$$W_1 = 2^{1/2}\sum_{i=1}^{\infty}\gamma_i V_i, \qquad W_2 = 2^{1/2}\sum_{i=1}^{\infty}\gamma_i^2 V_i,$$

$$W_3 = 2^{1/2}\sum_{i=1}^{\infty}(\gamma_i^2 - \gamma_i^3)V_i, \qquad W_4 = 2^{1/2}\sum_{i=1}^{\infty}(.5\gamma_i^2 - \gamma_i^3 + \gamma_i^4)V_i,$$

$$W_5 = \sum_{i=1}^{\infty}\gamma_i^2 V_i^2, \qquad W_6 = \sum_{i=1}^{\infty}\gamma_i^4 V_i^2.$$

We assume a variance 1 for $V_i$. Hasza and Fuller (1979) used $\sigma^2$, but all of the statistics that we and they studied are functions of the W's, which are invariant under changes in $\sigma^2$. Nothing is lost by taking $\sigma^2 = 1$.

Using straightforward algebra and following the procedures of Hasza and Fuller (1979), Sen (1986, theorem 3.2) developed a representation for the limit distribution of the estimators of $\alpha_1$ and $\alpha_2$. Letting $D = W_2 - W_3 = 2^{1/2}\sum\gamma_i^3 V_i$, he showed that (5.2) converges in law to the diagonal matrix diag$\{2(W_6 - D^2), 2W_5\}$, that (5.3) converges in law to $(2(W_1 D - W_5), -1)'$, and thus that

$$n^2\hat{\alpha}_1 \xrightarrow{L} \frac{W_1 D - W_5}{W_6 - D^2}, \qquad n\hat{\alpha}_2 \xrightarrow{L} \frac{-1}{2W_5}. \qquad (5.4)$$

In addition, Sen (1986) showed that the error mean square from the regression in Table 2 converges in probability to $\sigma^2$ and, because the upper left 2 x 2 submatrix of B is diagonal, that the "t statistics," $\tau_1$ and $\tau_2$, for the $\alpha$'s converge to limits given by

$$\tau_1 \xrightarrow{L} \frac{2^{1/2}(W_1 D - W_5)}{(W_6 - D^2)^{1/2}}, \qquad \tau_2 \xrightarrow{L} \frac{-1}{(2W_5)^{1/2}}. \qquad (5.5)$$

Finally, the 2 df "F statistic," $F_M(2)$, for testing $H_0$: $\alpha_1 = \alpha_2 = 0$ converges to the same limit as $(\tau_1^2 + \tau_2^2)/2$: namely,

$$F_M(2) \xrightarrow{L} \frac{(W_1 D - W_5)^2}{W_6 - D^2} + \frac{1}{4W_5}. \qquad (5.6)$$
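The limit in (5.6) can be simulated by truncating the infinite sums that define the W's. The sketch below is illustrative only and is not the program used by the authors: NumPy is assumed, the function name `limit_FM2_draws` and its defaults (number of draws, truncation point, seed) are hypothetical choices, and the expressions for $W_1$, $W_5$, $W_6$, and D are the ones stated in this section.

```python
import numpy as np

def limit_FM2_draws(n_draws=4000, n_terms=200, seed=0):
    """Draws from the limit of F_M(2) in (5.6), truncating each sum at n_terms."""
    rng = np.random.default_rng(seed)
    i = np.arange(1, n_terms + 1)
    gamma = 2.0 * (-1.0) ** (i + 1) / ((2 * i - 1) * np.pi)
    V = rng.standard_normal((n_draws, n_terms))   # independent N(0, 1) variates
    W1 = np.sqrt(2.0) * (V @ gamma)
    D = np.sqrt(2.0) * (V @ gamma ** 3)           # D = W2 - W3
    W5 = (V ** 2) @ gamma ** 2
    W6 = (V ** 2) @ gamma ** 4
    return (W1 * D - W5) ** 2 / (W6 - D ** 2) + 1.0 / (4.0 * W5)

draws = limit_FM2_draws()
# Sample percentiles, to be compared with the limit row of the F_M(2) table
print(np.percentile(draws, [10, 50, 90, 95]))
```

Because the $\gamma_i$ decay like 1/i, a few hundred terms already capture nearly all of each sum; the truncation point is a tuning choice, not something specified in the article.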

We have placed quotation marks around the terms "F statistic" and "t statistic" because they refer to the regression formulas used to compute the statistics. They do not imply anything about distributional properties. The second statistic in (5.5) has the same distribution as one would get in a symmetric regression for a single-unit-root process with mean 0 ($\mu = 0$ and not estimated). It is the distribution labeled d = 1 in Dickey et al. (1984, table 8, p. 362) (henceforth referred to as DHF). This equivalence is motivated by the arguments following (5.1) that present $\alpha_1$ as providing a test for one unit root and $\alpha_2$ for a second given that there is at least one. If we decided that $\alpha_1 = 0$, in a sequential procedure, we would then difference the data (thus eliminating $\mu$), run a regression with no intercept, and test for another unit root.

Sen (1986) showed that the same limit distribution for $F_M(2)$ is obtained if, instead of initially subtracting the series mean from each data point, we simply run the regression in Table 2 with $\mu$ set to 0 and a column of 1s (intercept column) inserted as an independent variable. We then compute $F_M(2)$ as the standard 2 df "F test" for $H_0$: $\alpha_1 = \alpha_2 = 0$. In this setup, the intercept column has coefficient $\alpha_0 = -\alpha_1\mu$. We could compute an "F statistic," $F_M(3)$, for testing $H_0$: $\alpha_0 = \alpha_1 = \alpha_2 = 0$, but that wastes a degree of freedom by including both $\alpha_1\mu = 0$ and $\alpha_1 = 0$ in the null hypothesis. Thus it is not surprising that Sen (1986) found less power for $F_M(3)$ than for $F_M(2)$. We recommend using $F_M(2)$ for testing the two-unit-root null hypothesis. This statistic can be computed from either the corrected regression (mean of Ys initially subtracted) or from a regression with an intercept.

6. PERCENTILES OF THE DISTRIBUTIONS

There are many ways to express the limit distributions of our statistics. We choose to use a representation in quadratic forms in order to follow the method of Dickey and Fuller (1979) for computing the percentiles of the distributions of interest. The finite sample-size percentiles are computed simply by computing regressions on n simulated data points as laid out in Table 2 with the sample mean replacing $\mu$. For each n, 20,000 independent regressions are computed, yielding 20,000 values for each test statistic at each n. The limit percentiles are obtained by simulating the W vector and then computing the functions (5.3)-(5.5). Again, 20,000 replications are used. The programs described by Dickey (1981) are used to tabulate the sample percentiles for the 20,000 values of each estimator for each n. The random-number generator is the one given by Knuth (1982). The estimated percentiles are smoothed as a function of n by computing regressions $p_{in} = a_i + b_i/n$, where $p_{in}$ is the ith percentile of a given statistic for sample size n. Since the limit distributions of these statistics do not depend on the $\beta$'s in model (2.2), we use only a lag 2 model ($\beta$'s set to 0) in the simulation.

Tables 3-7 present the smoothed percentiles for each of the five symmetric regression statistics discussed in Section 5. The last rows of Tables 4 and 6 have been replaced by entries from table 8 of DHF. Since we are estimating an extra parameter here, the finite sample-size percentiles will differ slightly from theirs. The last row entries, P, of Table 6 are related to the last row entries, X, of Table 4 by $P = -(-X)^{1/2}$.

Table 3. Percentiles for $n^2\hat{\alpha}_1$

                      Probability of a smaller value
   n      .01    .025     .05     .10    .50    .90    .95   .975    .99
  25   -103.0   -75.3  -54.64  -36.27  -7.22   1.06   4.04   7.18  11.85
  50   -110.9   -78.8  -56.54  -37.85  -7.44    .62   3.29   6.25  10.70
 100   -114.8   -80.5  -57.49  -38.63  -7.54    .40   2.92   5.78  10.12
 250   -117.2   -81.6  -58.06  -39.11  -7.61    .26   2.70   5.50   9.77
   ∞   -118.7   -82.3  -58.44  -39.42  -7.65    .18   2.55   5.32   9.54

Table 4. Percentiles for $n\hat{\alpha}_2$

                      Probability of a smaller value
   n      .01    .025     .05     .10    .50    .90    .95   .975    .99
  25   -11.01   -8.80   -7.06   -5.20  -1.38   -.32   -.22   -.16   -.12
  50   -12.81  -10.01   -7.94   -5.84  -1.56   -.37   -.26   -.20   -.15
 100   -13.71  -10.62   -8.38   -6.16  -1.65   -.40   -.29   -.22   -.17
 250   -14.25  -10.98   -8.65   -6.35  -1.70   -.42   -.30   -.23   -.18
   ∞   -14.51  -11.26   -8.86   -6.53  -1.72   -.42   -.30   -.23   -.18

Table 5. Percentiles for $\tau_1$

                      Probability of a smaller value
   n      .01    .025     .05     .10    .50    .90    .95   .975    .99
  25    -4.95   -4.26   -3.80   -3.26  -1.50    .22    .73   1.20   1.79
  50    -4.70   -4.14   -3.72   -3.24  -1.57    .13    .62   1.06   1.60
 100    -4.58   -4.09   -3.69   -3.23  -1.61    .09    .57    .99   1.50
 250    -4.51   -4.05   -3.67   -3.22  -1.63    .06    .54    .95   1.44
   ∞    -4.46   -4.03   -3.65   -3.22  -1.64    .04    .51    .92   1.40

Table 6. Percentiles for $\tau_2$

                      Probability of a smaller value
   n      .01    .025     .05     .10    .50    .90    .95   .975    .99
  25    -3.51   -3.05   -2.67   -2.24  -1.11   -.52   -.44   -.38   -.32
  50    -3.67   -3.20   -2.82   -2.39  -1.21   -.59   -.50   -.43   -.38
 100    -3.76   -3.28   -2.90   -2.47  -1.26   -.62   -.53   -.46   -.40
 250    -3.81   -3.33   -2.95   -2.52  -1.30   -.64   -.54   -.48   -.42
   ∞    -3.81   -3.35   -2.98   -2.55  -1.31   -.65   -.55   -.48   -.42

Table 7. Percentiles for $F_M(2)$

                      Probability of a smaller value
   n      .01    .025     .05     .10    .50    .90    .95   .975    .99
  25      .14     .21     .31     .49   2.36   7.68   9.87  12.13  15.29
  50      .18     .26     .37     .56   2.54   7.53   9.44  11.37  13.95
 100      .20     .28     .40     .59   2.63   7.46   9.22  10.99  13.28
 250      .21     .30     .41     .61   2.69   7.42   9.09  10.76  12.88
   ∞      .22     .31     .42     .62   2.72   7.39   9.01  10.61  12.61

Table 8. Percentiles for $\tau_{1\tau}$

                      Probability of a smaller value
   n      .01    .025     .05     .10    .50    .90    .95   .975    .99
  25    -6.08   -5.46   -5.00   -4.45  -2.70  -1.14   -.73   -.38   -.02
  50    -5.67   -5.14   -4.71   -4.21  -2.53   -.90   -.45   -.07    .36
 100    -5.46   -4.98   -4.56   -4.10  -2.45   -.78   -.32    .08    .55
 250    -5.34   -4.88   -4.48   -4.02  -2.40   -.71   -.23    .17    .66
   ∞    -5.25   -4.28   -4.42   -3.98  -2.37   -.66   -.18    .24    .74

Table 9. Percentiles for $\tau_{2\tau}$

                      Probability of a smaller value
   n      .01    .025     .05     .10    .50    .90    .95   .975    .99
  25    -4.27   -3.81   -3.42   -3.02  -1.86  -1.10   -.96   -.85   -.76
  50    -4.36   -3.92   -3.55   -3.14  -1.94  -1.14  -1.00   -.89   -.79
 100    -4.41   -3.98   -3.61   -3.20  -1.98  -1.17  -1.02   -.91   -.81
 250    -4.44   -4.02   -3.65   -3.24  -2.00  -1.18  -1.03   -.92   -.82
   ∞    -4.46   -4.04   -3.67   -3.26  -2.02  -1.19  -1.04   -.93   -.83
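The smoothing regression $p_{in} = a_i + b_i/n$ and the relation $P = -(-X)^{1/2}$ can be illustrated with a few tabulated values. The following Python sketch is illustrative only: NumPy is assumed, the function names `smooth_percentiles` and `tau2_from_nalpha2` are hypothetical, and the inputs are the .95 entries for $F_M(2)$ at n = 25, 50, 100, 250 from Table 7. Because those entries are already smoothed, the fit is close to exact, and the fitted intercept plays the role of the large-n value.

```python
import numpy as np

def smooth_percentiles(n_values, percentiles):
    """Least squares fit of p_n = a + b / n; returns (a, b)."""
    x = 1.0 / np.asarray(n_values, dtype=float)
    b, a = np.polyfit(x, np.asarray(percentiles, dtype=float), 1)
    return a, b

def tau2_from_nalpha2(x):
    """Limit-row relation between Table 4 (X) and Table 6 (P): P = -(-X)^(1/2)."""
    return -np.sqrt(-x)

n_values = [25, 50, 100, 250]
p95 = [9.87, 9.44, 9.22, 9.09]        # .95 column of Table 7, finite n
a, b = smooth_percentiles(n_values, p95)
print(a, b)                           # a is the extrapolated value as n grows

print(tau2_from_nalpha2(-14.51))      # compare with the .01 limit entry of Table 6
```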

In an ordinary regression, repeating each row of data essentially doubles the value of any F statistic. To compensate for the fact that there are 2n rows in Table 2 instead of n, we will divide the percentiles in Table 7 by 2 and then compare them with Hasza's (1977) percentiles. So normalized, the percentiles of Table 7 are less than those of Hasza's F ($\Phi_2(2)$, Hasza and Fuller 1979, p. 1116). Since both the ordinary regression and the symmetric regression are governed by limit normal theory in the stationarity region, the distribution of the properly normalized test statistics will be approximately the same in this region, showing that, for large n, the test with the smaller percentiles (Sen's symmetric F) will have larger power.

Suppose we unwittingly test for a single unit root using DHF percentiles when, in fact, the series has two unit roots. The percentiles, X, in table 9 of DHF can be converted to percentiles, P, of the corresponding "t statistics" by computing $P = [(2n - 1)X/n]^{1/2}[2 + X/n]^{-1/2}$. Values of P computed from the left half of table 9 of DHF are all slightly less than the corresponding entries in Table 5, so a left-tailed test that mistakenly uses table 9 of DHF instead of Table 5 when there are actually two unit roots present must be conservative. Sen (1986), using Monte Carlo simulation, showed that if we use nominal significance level .05, for example, and compare $\tau_1$ to single-unit-root critical values P, the true significance level is between .042 and .0475 for all sample sizes n in the preceding tables. Thus the symmetric single-unit-root tests have the nice property that they are more likely to conclude that there is a unit root when there are really two than when there is only one. We interpret this as a robustness property of the symmetric single-unit-root test to the presence of additional unit roots. The ordinary unit-root tests at the nominal 5% level also have actual levels near (but sometimes slightly exceeding) 5% when there are two unit roots present. For example, Sen (1986) reported actual levels from .044 to .051 in the model with an intercept. He used similar methodology to compute tables of percentiles for data that have been detrended by a linear regression on time and then analyzed by symmetric regression as in Table 2. Percentiles for the "t statistic" $\tau_{1\tau}$ for $\alpha_1$, the "t statistic" $\tau_{2\tau}$ for $\alpha_2$, and the 2 df "F statistic" $F_\tau(2)$ are reported in Tables 8-10.

Table 10. Percentiles for $F_\tau(2)$

                      Probability of a smaller value
   n      .01    .025     .05     .10    .50    .90    .95   .975    .99
  25      .90    1.21    1.59    2.21   6.69  15.62  19.12  22.58  27.72
  50      .79    1.06    1.41    1.96   6.01  13.69  16.42  19.10  22.79
 100      .73     .99    1.32    1.84   5.68  12.73  15.08  17.36  20.33
 250      .70     .95    1.27    1.77   5.47  12.15  14.27  16.31  18.86
   ∞      .67     .92    1.23    1.72   5.34  11.76  13.73  15.62  17.87

7. POWER

In the first phase of our power study, we computed by Monte Carlo simulation the empirical power of Sen's symmetric $F_M(2)$ statistic and that of Hasza's F for order 2 autoregressions with characteristic roots $m_1 = 1$ and $0 \le m_2 \le 1$. Table 11 gives the proportion of rejections out of 20,000 trials for each $m_2$ using series length n = 100 and the 5% critical point from the null distribution of each statistic. Sen's symmetric test is more powerful than Hasza's test in this region with the difference in power fairly large at some $m_2$ values.

Table 11. Empirical Power Versus $m_2$ ($m_1$ = 1, n = 100; 20,000 samples)

  $m_2$     Sen    Hasza
  1.00    .0497   .0516
   .99    .0661   .0642
   .98    .0843   .0735
   .95    .1697   .1227
   .90    .4205   .2710
   .85    .7177   .5009
   .80    .9146   .7436
   .70    .9984   .9770
   .60   1.00     .9996
   .50   1.00    1.00
   .30   1.00    1.00
   .00   1.00    1.00

We argued in Section 6 that Sen's symmetric statistic $F_M(2)$ should have greater power than Hasza's F when we are in the stationarity region and n is large. Here we investigate power for a reasonably small sample size, n = 50. We also investigate regions other than the stationarity region. Denoting the roots of the characteristic equation by $m_1$ and $m_2$, it is easy to see that $\alpha_1 = -(m_1 - 1)(m_2 - 1)$ and $\alpha_2 = m_1 m_2 - 1$. We compute the power by Monte Carlo methods at a grid of points in the $\alpha_1$, $\alpha_2$ plane, taking as a model (2.2) with p = 0. In this plane, the stationarity region is the region ($\alpha_1 < 0$, $\alpha_2 < 0$). We choose the grid of points to cover the stationarity region well. For each point ($\alpha_1$, $\alpha_2$), we generate 1,000 series of length n and record the proportion of rejections of the two-unit-root hypothesis using Sen's symmetric F (top entry) and Hasza's F (bottom entry). Starting values of $\mu = 0$ are used in Table 12 to correspond to the theory developed.

Table 12 shows that for n = 50 Sen's symmetric F test has better power in the stationarity region, including the boundary of this region. In the region with at least one $\alpha_j > 0$, the power of the symmetric F is still competitive with that of Hasza's F. Both tests seem to display a slight bias when $\alpha_1$ is slightly larger than 0 and $\alpha_2$ is near 0.

At the suggestion of a referee, we investigated the power of the test when the assumption of fixed initial values is violated. To accomplish this, we ran the same computer program but generated 130 values for each realization of the time series, throwing away the first 80 before beginning the regression computations. Consider the ray $\alpha_1 = 0$, $\alpha_2 \le 0$ in Table 12. Notice that one root, say $m_1$, is 1 and the other is $m_2 = \alpha_2 + 1$.
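A small-scale version of this power calculation can be sketched as follows. This is an assumption-laden illustration, not the authors' program: NumPy is assumed; all function names are hypothetical; the bottom half of Table 2 is taken to be the top-half rows built from the time-reversed series; the error degrees of freedom are taken as rows minus 2; the critical value 9.44 is the .95 entry for n = 50 in Table 7; and only 200 replications are used instead of 1,000 to keep the run short.

```python
import numpy as np

def alphas_from_roots(m1, m2):
    """(alpha1, alpha2) of model (2.2) with p = 0 for characteristic roots m1, m2."""
    return -(m1 - 1.0) * (m2 - 1.0), m1 * m2 - 1.0

def simulate_ar2(alpha1, alpha2, n, rng):
    """Generate Y_1..Y_n from (2.2) with p = 0, mu = 0, and zero starting values."""
    phi1, phi2 = 2.0 + alpha1 + alpha2, -(1.0 + alpha2)
    y = np.zeros(n + 2)                       # y[0], y[1] are the fixed starting values
    e = rng.standard_normal(n)
    for t in range(2, n + 2):
        y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + e[t - 2]
    return y[2:]

def symmetric_F_p0(y):
    """Symmetric 2 df 'F statistic' for alpha1 = alpha2 = 0 when p = 0."""
    mu = y.mean()
    blocks = []
    for z in (y, y[::-1]):                    # forward rows, then reversed-series rows
        d1, d2 = np.diff(z), np.diff(z, 2)
        blocks.append(np.column_stack([d2, z[1:-1] - mu, d1[:-1]]))
    tab = np.vstack(blocks)
    dep, X = tab[:, 0], tab[:, 1:]
    coef = np.linalg.lstsq(X, dep, rcond=None)[0]
    rss_full = np.sum((dep - X @ coef) ** 2)
    rss_null = np.sum(dep ** 2)               # no regressors remain under the null
    return ((rss_null - rss_full) / 2.0) / (rss_full / (len(dep) - 2))

def empirical_power(m1, m2, n=50, reps=200, crit=9.44, seed=0):
    """Proportion of simulated series for which the statistic exceeds crit."""
    rng = np.random.default_rng(seed)
    a1, a2 = alphas_from_roots(m1, m2)
    stats = [symmetric_F_p0(simulate_ar2(a1, a2, n, rng)) for _ in range(reps)]
    return float(np.mean(np.array(stats) > crit))

print(empirical_power(1.0, 1.0))   # null hypothesis: two unit roots
print(empirical_power(1.0, 0.9))   # on the ray alpha1 = 0, alpha2 = -.1
```

With so few replications the proportions are noisy; they indicate the direction of the comparison rather than reproduce the tabulated entries.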

In Table 13, we generated 130 observations for each realization, deleting the first 80. On the preceding ray, then, $\nabla Y_t = m_2\nabla Y_{t-1} + e_t$, with $Y_0 = \nabla Y_0 = 0$. This means that $Y_{80+t} = Y_{80} + \nabla Y_{81} + \cdots + \nabla Y_{80+t}$ and, since $\bar{Y} = (Y_{81} + Y_{82} + \cdots + Y_{130})/50$, we have $Y_{80+t} - \bar{Y} = (\nabla Y_{81} + \cdots + \nabla Y_{80+t} - \overline{\nabla Y})$ [where $\overline{\nabla Y} = (\nabla Y_{81} + \cdots + \nabla Y_{130})/50 = (Y_{130} - Y_{80})/50$]. Thus there is no startup effect caused by the level reached by the series at time 80 ($Y_{80}$ is subtracted out of each observation). There is, however, an effect on the variance of the initial observation. Note that $\nabla Y_1 = Y_1 \sim N(0, \sigma^2)$, but $\nabla Y_{81} \sim N(0, v)$, where

$$v = (1 - m_2^{162})\sigma^2/(1 - m_2^2) \quad \text{for } |m_2| < 1,$$
$$v = 81\sigma^2 \quad \text{for } m_2 = 1.$$

Thus for $m_2$ very near 1, the first-differenced series will often have an initial observation far from $\mu$, and it will take several observations for the series to move back near $\mu$. The result of this is that such a first-differenced series will appear too often to have a unit root, so all along the ray we are describing the effect will be to depress the power function. In fact, even the null-hypothesis significance level becomes significantly lower than it was with fixed starting values. Thus the symmetric test is even more conservative than the nominal 5% level suggests when the 0 initial condition is removed. On the other hand, when we move further into the stationarity region, the symmetric estimator picks up power as we would expect, since it is designed to estimate well under stationarity. Thus we expect the power to be lowered on and near the ray ($\alpha_1 = 0$, $\alpha_2 \le 0$) and higher in the interior of the stationarity region (because now the series is "truly" stationary and does not contain startup problems).

Table 13 shows the described behavior for the symmetric test. The ordinary test does not suffer from low

Table 12. EmpiricalPower Versusa, and a2 (n = 50; 1,000 samples) Alpha2 Alpha 1

-.500

-.300

-.100

-.050

-.020

-.010

-.005

.000

.010

.050

-.050 -.050

1.00 .98

1.00 .81

.98 .67

.98 .74

.98 .82

.98 .85

.99 .85

.98 .87

.99 .88

.99 .91

-.030 -.030

1.00 .95

.97 .63

.86 .35

.88 .45

.88 .54

.89 .58

.90 .61

.91 .61

.91 .65

.94 .81

-.010 -.010

1.00 .91

.87 .51

.41 .12

.40 .12

.43 .14

.42 .18

.43 .19

.45 .19

.44 .21

.45 .40

-.002 -.002

.99 .92

.79 .54

.21 .12

.13 .09

.13 .06

.13 .07

.15 .07

.14 .07

.13 .08

.11 .07

.000 .000

.99 .92

.72 .53

.17 .11

.11 .07

.08 .07

.07 .05

.07 .06

.05 .04

.05 .05

.11 .07

.002 .002

.99 .92

.68 .51

.15 .12

.07 .06

.04 .05

.04 .04

.03 .04

.02 .03

.03 .03

.72 .70

.010 .010

.92 .89

.44 .45

.09 .12

.59 .60

.85 .83

.88 .87

.89 .88

.91 .90

.92 .91

.98 .97

470

Journalof Business & EconomicStatistics,October1987 Table 13. Power of Test WhenInitialValueIs Random Alpha2 Alpha 1

-.500

-.300

-.100

-.050

-.020

-.010

-.005

-.00

-.050 - .050

1.00 .98

1.00 .80

.97 .76

.98 .86

.99 .94

1.00 .97

1.00 .97

1.00 .98

-.030 -.030

1.00 .95

.96 .65

.81 .44

.85 .61

.93 .81

.95 .84

.96 .88

.96 .89

-.010 - .010

.99 .92

.83 .53

.32 .17

.40 .23

.57 .40

.68 .53

.71 .59

.77 .66

-.002 -.002

.99 .92

.76 .51

.15 .12

.10 .08

.11 .09

.11 .09

.11 .08

.13 .09

.99 .91

.71 .54

.16 .13

.07 .08

.04 .07

.02 .06

.02 .05

.02 .06

.000 .000

NOTE: Foreach group,Sen's symmetrictest is the top entryand Hasza'sis the bottomentry.

Figure 1. U.S. Total Exports in Millions of 1954 Dollars: (a) With Trend Line; (b) solid line, Differences; dashed line, Detrended. Axes: (a) TOTAL versus YEAR; (b) DEL versus YEAR.


power along our boundary ray but still does not develop as much power as the symmetric test does in the interior of the stationarity region.

8. EXAMPLES

8.1 Example 1: Exports

Spencer (1969) presented several reasonably long series. We look at total exports; measurements are in millions of dollars (constant 1954 dollars), and the series mean is 4,057. First, we regress the second-differenced series, $\nabla^2 Y_t$, on 1, $Y_{t-1}$, $\nabla Y_{t-1}$, $\nabla^2 Y_{t-1}$, $\nabla^2 Y_{t-2}$, $\nabla^2 Y_{t-3}$. The F statistic for the last two terms is .16, which is insignificant compared with a standard F with 2 and 56 df. Refitting with only significant terms, we get

$$\nabla^2 Y_t = \underset{(263.87)}{-141.66} + \underset{(.0656)}{.0647}\, Y_{t-1} - \underset{(.2003)}{1.5208}\, \nabla Y_{t-1} + \underset{(.1290)}{.4835}\, \nabla^2 Y_{t-1}, \qquad \mathrm{MSE} = 319{,}055,$$

where the numbers in parentheses are the standard errors from the computer output (MSE denotes mean squared error). The Hasza F test is 38.26, which exceeds every listed percentile of $\Phi_2(2)$ in Hasza and Fuller (1979, table 4.1).

To compute the symmetric regression, we used the SAS computer package to create a data set with variables, as in Table 2. In this package, a copy of the data set can be sorted in reverse order and concatenated to the original data set. This, with a few other simple data manipulations, produces the data set. Regressing the dependent variable on the independent ones gives

$$\nabla^2 Y_t = \underset{(.0415)}{-.0849}\,(Y_{t-1} - 4{,}057) - \underset{(.1197)}{1.2805}\, \nabla Y_{t-1} + \underset{(.0834)}{.3617}\, \nabla^2 Y_{t-1}, \qquad \mathrm{MSE} = 315{,}113.$$

Notice that in this symmetric regression we need to multiply the standard error on $\nabla^2 Y_{t-1}$ by the square root of 2, because the matrix of regressors has roughly 2n rows. That is, to test the significance of the coefficient on the lagged second difference, divide .3617 by (.0834)(1.41421) = .1179, getting t = 3.07. Comparing this result to a standard normal table (this is not a unit-root test), we see that the lagged second difference is needed. Next, we regress $\nabla^2 Y_t$ on $\nabla^2 Y_{t-1}$ to get the reduced-model sum of squares and thus the Sen symmetric F statistic = (full-model regression sum of squares - reduced-model regression sum of squares)/(2 × 315,113) = 76.756. This is about twice Hasza's test statistic and is compared with values from Table 7. This more powerful test also rejects the two-unit-root hypothesis.

Adding $\nabla Y_{t-1}$ to both sides of the preceding equation produces the model for a single-unit-root test. Since the "t statistic" on $Y_{t-1}$ is unaltered by this change, $-.0849/.0415 = -2.05$ can be compared with the symmetric percentiles computed from table 9, page 363, of Dickey et al. (1984). We cannot reject the single-unit-root null hypothesis. Algebraic manipulation of the coefficients in either estimated model shows that the first difference, $\nabla Y_t$, satisfies, roughly, $\nabla Y_t = \phi \nabla Y_{t-2}$, with $\phi$ about $-.5$. This negative correlation at lag 2 seems a bit unusual, but using SAS PROC ARIMA and maximum likelihood estimation we verify that this model fits the data very well with ...

...ing of the data produces about the same results; that is, a linear detrending will not produce a stationary series. We need to take differences. A plot of the data in original, differenced, and detrended forms is given in Figure 1.

8.2 Example 2: U.S. Population

In Figure 2 we graph the total U.S. population from 1929 to 1982. The series is very smooth, as is characteristic of series with more than one unit root. The first and second differences are shown in Figure 3. Since we feel that population growth is likely to be a multiplicative process, we model the data on the logarithmic scale. On that scale, the mean is 12.0244 and analysis similar to that in Example 1 gives no evidence that lags of second differences are needed on the right side of our model. The ordinary regression approach gives

Figure 2. U.S. Population: (a) Logarithm of Population per 1,000, Original Data With Trend Line; (b) solid line, Differences; dashed line, Detrended. Axes: (a) LOGARITHM versus YEAR; (b) DIFFERENCE versus YEAR.
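The symmetric-regression steps of Example 1 (reverse the series, concatenate it to the original, regress, then rescale the standard error by the square root of 2) can be illustrated with a minimal sketch. It assumes Python with NumPy in place of the SAS steps, and it uses a simulated series because the export data are not reproduced here. The helper names `one_direction` and `symmetric_fit` are hypothetical, the variable layout is an assumption about what Table 2 contains, and the error degrees of freedom (stacked rows minus three parameters) are assumed to match ordinary regression output.

```python
import numpy as np


def one_direction(y, center):
    """Build response and regressors for one pass over the series.

    Response:   second difference at time t
    Regressors: Y[t-1] - center, first difference at t-1,
                second difference at t-1
    """
    d1 = np.diff(y)        # d1[i] is the first difference at time i + 1
    d2 = np.diff(y, n=2)   # d2[i] is the second difference at time i + 2
    response = d2[1:]
    X = np.column_stack([y[2:-1] - center, d1[1:-1], d2[:-1]])
    return response, X


def symmetric_fit(y):
    """Stack forward and reversed rows and fit by least squares (no intercept)."""
    y = np.asarray(y, dtype=float)
    center = y.mean()
    r_fwd, X_fwd = one_direction(y, center)
    r_bwd, X_bwd = one_direction(y[::-1], center)  # reversed copy of the data
    resp = np.concatenate([r_fwd, r_bwd])
    X = np.vstack([X_fwd, X_bwd])                  # roughly 2n rows

    coef, *_ = np.linalg.lstsq(X, resp, rcond=None)
    resid = resp - X @ coef
    sse_full = resid @ resid
    mse = sse_full / (len(resp) - X.shape[1])      # assumed error df
    se = np.sqrt(mse * np.diag(np.linalg.inv(X.T @ X)))

    # Reduced model: lagged second difference only.
    x_red = X[:, [2]]
    c_red, *_ = np.linalg.lstsq(x_red, resp, rcond=None)
    sse_red = np.sum((resp - x_red @ c_red) ** 2)

    # With no intercept, the difference in regression sums of squares
    # equals the difference in error sums of squares.
    f_stat = (sse_red - sse_full) / (2.0 * mse)

    # Rescale the printed standard error by sqrt(2) before forming t.
    t_lag = coef[2] / (se[2] * np.sqrt(2.0))
    return coef, se, mse, f_stat, t_lag


# Simulated random walk (an assumption; not the export series).
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=60))
coef, se, mse, f_stat, t_lag = symmetric_fit(y)
print("coefficients:", coef)
print("symmetric F statistic:", f_stat)

# The arithmetic quoted in Example 1: .3617 / (.0834 * sqrt(2)).
t_paper = 0.3617 / (0.0834 * np.sqrt(2.0))
print(round(t_paper, 2))  # 3.07
```

The resulting F statistic would still have to be compared with the paper's own percentiles (Table 7), which this sketch does not reproduce.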
