On Least-Squares Bias in the AR(p) Models: Bias Correction Using the Bootstrap Methods∗

Hisashi Tanizaki
Faculty of Economics, Kobe University, Nadaku, Kobe 657-8501, Japan ([email protected])

Key Words: AR(p) Model, OLSE, Unbiased Estimator, Lagged Dependent Variable, Exogenous Variables, Nonnormal Error, Bootstrap Method.

Abstract

In the case where the lagged dependent variables are included in the regression model, it is known that the ordinary least squares estimates (OLSE) are biased in small samples and that the bias increases as the number of irrelevant variables increases. In this paper, based on the bootstrap methods, an attempt is made to obtain unbiased estimates in autoregressive and non-Gaussian cases. We propose a residual-based bootstrap method in this paper. Some simulation studies are performed to examine whether the proposed estimation procedure works well or not. We obtain the results that it is possible to recover the true parameter values and that the proposed procedure gives us less biased estimators than OLSE.

1 Introduction

In the case where the lagged dependent variables are included in the regression model, it is known that the OLSE's of autoregressive (AR) models are biased in small samples.

∗ This paper is a substantial revision of Tanizaki (2000): the normality assumption is taken in Tanizaki (2000) but it is not required in this paper. This research was partially supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Encouragement of Young Scientists (# 12730022).


Hurwicz (1950), Marriott and Pope (1954), Kendall (1954) and White (1961) discussed the mean-bias of the OLSE. Quenouille (1956) introduced the jackknife estimator of the AR parameter, which is median-unbiased to order 1/T as T goes to infinity, where the trend term is not taken into account. Orcutt and Winokur (1969) constructed approximately mean-unbiased estimates of the AR parameter in stationary models. Sawa (1978), Tanaka (1983) and Tsui and Ali (1994) also examined the AR(1) models, where the exact moments of OLSE are discussed. Shaman and Stine (1988) established the mean-bias of the OLSE to order 1/T in stationary AR(p) models (also see Maekawa (1987) for the AR(p) models). Grubb and Symons (1987) gave an expression to order 1/T for the bias of the estimated coefficient on a lagged dependent variable when all other regressors are exogenous (also see Tse (1982) and Maekawa (1983) for the AR models including exogenous variables). Peters (1989) studied the finite sample sensitivity of the OLSE of the AR(1) term with nonnormal errors. In Abadir (1993), an analytical formula was derived to approximate the finite sample bias of the OLSE of the AR(1) term when the underlying process has a unit root. Moreover, in the case where the true model is the first-order AR model, Andrews (1993) examined the cases where the estimated models are the AR(1), the AR(1) with a constant term and the AR(1) with constant and trend terms, where the exact median-unbiased estimator of the first-order autoregressive model is derived by utilizing the Imhof (1961) algorithm. Andrews and Chen (1994) obtained the approximately median-unbiased estimator of autoregressive models, where Andrews (1993) is applied by transforming the AR(p) model into AR(1) and taking an iterative procedure. Thus, the AR models have been studied with respect to various aspects, i.e., (i) a stationary model or a unit root model, (ii) the first-order autoregressive model or higher-order autoregressive models, (iii) an autoregressive model with or without exogenous variables, and (iv) a normal error or a nonnormal error. Tanizaki (2000) proposed the median- and mean-unbiased estimators using simulation techniques, where the underlying assumption is that the error term is normal. In this paper, in a more general formulation which can be applied to all the cases (i) – (iv), we derive unbiased estimates of the regression coefficients in small samples using the bootstrap methods.

2 Bias Correction Method

We take the autoregressive model which may include the exogenous variables, say $x_t$. That is, consider the following regression model:

$$y_t = x_t \beta + \sum_{j=1}^{p} \alpha_j y_{t-j} + u_t = z_t \theta + u_t, \qquad (1)$$

for $t = p+1, p+2, \cdots, T$, where $x_t$ and $\beta$ are a $1 \times k$ vector and a $k \times 1$ vector, respectively. $\theta$ and $z_t$ are given by $\theta = (\beta', \alpha')'$ and $z_t = (x_t, y_{t-1}, y_{t-2}, \cdots, y_{t-p})$, where $\alpha = (\alpha_1, \alpha_2, \cdots, \alpha_p)'$. $u_t$ is assumed to be distributed with mean zero and variance $\sigma^2$; the distribution function of the error term $u_t$ is discussed later. In this paper, the initial values $y_p, y_{p-1}, \cdots, y_1$ are assumed to be constant for simplicity. Since it is well known that the OLSE of the autoregressive coefficient vector in the AR(p) model is biased in small samples (see, for example, Andrews (1993), Andrews and Chen (1994), Diebold and Rudebusch (1991), Hurwicz (1950), Kendall (1954), Marriott and Pope (1954), Quenouille (1956) and so on), the OLSE of $\theta$, $\hat\theta = (\hat\beta{}', \hat\alpha{}')'$, is clearly biased.

To obtain the unbiased estimator of $\theta$, the underlying idea in this paper is described as follows. Let $\theta$ be an unknown parameter and $\hat\theta$ be the biased estimate of $\theta$. Suppose that the distribution function of $\hat\theta$ is given by $f_{\hat\theta}(\cdot)$, which cannot be obtained analytically in the case where the lagged dependent variables are included in the explanatory variables. Since $\hat\theta$ is biased, we have $\theta \neq E(\hat\theta)$, where the expectation $E(\hat\theta)$ is defined as follows:

$$E(\hat\theta) \equiv \int_{-\infty}^{+\infty} x f_{\hat\theta}(x) \, dx. \qquad (2)$$

To obtain the relationship between $\hat\theta$ and $\theta$, let $\{\hat\theta_1^*, \hat\theta_2^*, \cdots, \hat\theta_n^*\}$ be a sequence of the biased estimates of $\theta$, which are taken as the random draws generated from $f_{\hat\theta}(\cdot)$. Note that $\hat\theta$ denotes the OLSE obtained from the actual data while $\hat\theta_i^*$ denotes the $i$-th OLSE based on the simulated data given the true parameter value $\theta$. Therefore, $\hat\theta_i^*$ depends on $\theta$, i.e., $\hat\theta_i^* = \hat\theta_i^*(\theta)$ for all $i = 1, 2, \cdots, n$. Suppose that given $\theta$ we can generate the $n$ random draws $\hat\theta_1^*, \hat\theta_2^*, \cdots, \hat\theta_n^*$ from $f_{\hat\theta}(\cdot)$. Using the $n$ random draws, the integration in equation (2) can be represented as follows:

$$E(\hat\theta) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \hat\theta_i^*(\theta). \qquad (3)$$

Let us define the unbiased estimator of $\theta$ as $\bar\theta$. Equation (3) implies that $\bar\theta$ is given by the $\theta$ which satisfies the following equation:

$$\hat\theta = \frac{1}{n} \sum_{i=1}^{n} \hat\theta_i^*(\theta) \equiv g(\theta), \qquad (4)$$

where $\hat\theta$ in the left-hand side represents the OLSE of $\theta$ based on the original data $y_t$ and $x_t$. $g(\cdot)$ denotes a $(k+p) \times 1$ vector function, which is defined as in (4). The solution of $\theta$ obtained from equation (4) is denoted by $\bar\theta$, which corresponds to the unbiased estimator of $\theta$. Conventionally, it is impossible to obtain an explicit functional form of $g(\cdot)$ in equation (4). Therefore, equation (4) is practically solved by an iterative procedure or a simple grid search.

In this paper, the computational implementation is given by the following iterative procedure (a sketch in code is given after this list).

1. Given the actual time series data (i.e., $x_t$ and $y_t$), estimate $\theta$ and $\sigma^2$ in (1) by OLS; the estimates are denoted by $\hat\theta$ and $\hat\sigma^2$.

2. Let $u_t^*$ be a random draw with mean zero and variance $\hat\sigma^2$. Suppose for now that the random draws $u_{p+1}^*, u_{p+2}^*, \cdots, u_T^*$ are available. The random number generation methods for $u_t^*$ are discussed below.

3. Let $\theta^{(j)}$ be the $j$-th iteration of $\theta$ and $y_t^*$ be the random draw of $y_t$. Given the initial values $\{y_p^*, y_{p-1}^*, \cdots, y_1^*\}$, the exogenous variables $x_t$ for $t = p+1, p+2, \cdots, T$ and $\theta^{(j)}$, we obtain $y_t^*$ by substituting $u_t^*$ for $u_t$ in equation (1), where the initial values $\{y_p^*, y_{p-1}^*, \cdots, y_1^*\}$ may be taken as the actual data $\{y_p, y_{p-1}, \cdots, y_1\}$. That is, $y_t^*$ is generated from $z_t^* \theta^{(j)} + u_t^*$ given $\theta^{(j)}$ and the random draw $u_t^*$, where $z_t^* = (x_t, y_{t-1}^*, y_{t-2}^*, \cdots, y_{t-p}^*)$. For $j = 1$ we may set $\theta^{(1)} = \hat\theta$.

4. Given $y_t^*$ and $x_t$ for $t = 1, 2, \cdots, T$, compute the OLSE of $\theta$, which is denoted by $\hat\theta^*$.

5. Repeat Steps 2 – 4 $n$ times, where $n = 10{,}000$ is taken in this paper. Then $n$ OLSE's are obtained, which correspond to $\hat\theta_i^*(\theta^{(j)})$, $i = 1, 2, \cdots, n$, in equation (4). Based on the $n$ OLSE's, compute the arithmetic mean for each element of $\theta$, i.e., the function $g(\theta^{(j)})$.

6. As in equation (4), $\hat\theta$ should be equal to the arithmetic average $g(\theta^{(j)}) \equiv (1/n) \sum_{i=1}^{n} \hat\theta_i^*(\theta^{(j)})$. For each element of $\theta$, therefore, $\theta^{(j+1)}$ should be smaller than $\theta^{(j)}$ if $\hat\theta$ is less than the arithmetic average, and larger than $\theta^{(j)}$ otherwise. Here, we consider that each element of $g(\theta)$ is a monotone increasing function of the corresponding element of $\theta$. Thus, $\theta^{(j)}$ is updated to $\theta^{(j+1)}$. An example of the optimization procedure is described in Appendix 2.

7. Repeat Steps 2 – 6 until $\theta^{(j+1)}$ is stable, where the limit of $\theta^{(j+1)}$ is taken as $\bar\theta$. Note that the random draws of $u_t$ generated in Step 2 should be the same for all $j$, i.e., $n \times (T - p)$ random draws are required.
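To make the procedure concrete, the following is a minimal sketch in code for an AR(1) model with a constant term, using the normal bootstrap error (N) discussed below. The damped update with $\gamma^{(j)} = c^{j-1}$ anticipates equation (5) in Appendix 2. All function and variable names are illustrative, and n is set smaller than the paper's 10,000 for speed; this is a sketch of the idea, not the paper's own implementation.

```python
# Sketch of the bias-correction iteration (Steps 1-7), AR(1) + constant,
# with normal bootstrap errors (N). Names are illustrative.
import numpy as np

def olse(y, p, const=True):
    """OLSE of theta in y_t = x_t*beta + sum_j alpha_j y_{t-j} + u_t."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    lags = np.column_stack([y[p - j:T - j] for j in range(1, p + 1)])
    Z = np.column_stack([np.ones(T - p), lags]) if const else lags
    theta = np.linalg.lstsq(Z, y[p:], rcond=None)[0]
    resid = y[p:] - Z @ theta
    return theta, resid @ resid / (T - p)   # crude variance estimate

def simulate(theta, u_star, y_init):
    """Generate y* recursively from theta and the bootstrap errors u*."""
    y = list(y_init)
    for u in u_star:
        const, alphas = theta[0], theta[1:]
        y.append(const + sum(a * y[-j - 1] for j, a in enumerate(alphas)) + u)
    return np.array(y)

def bias_corrected(y, p=1, n=1000, c=0.9, tol=0.001, max_iter=50, seed=0):
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    theta_hat, sigma2 = olse(y, p)                       # Step 1
    T = len(y)
    # Step 2: one fixed set of n x (T - p) draws, reused for every j
    u_star = np.sqrt(sigma2) * rng.standard_normal((n, T - p))
    theta_j = theta_hat.copy()                           # theta^(1) = theta_hat
    for j in range(max_iter):
        draws = np.empty((n, len(theta_hat)))
        for i in range(n):                               # Steps 3-5
            draws[i] = olse(simulate(theta_j, u_star[i], y[:p]), p)[0]
        g = draws.mean(axis=0)                           # g(theta^(j))
        step = c ** j * (theta_hat - g)                  # Step 6; gamma = c**j
        theta_j = theta_j + step
        if np.max(np.abs(step)) < tol:                   # Step 7: convergence
            break
    return theta_j
```

Given an observed series y, a call such as `bias_corrected(y)` then returns the estimate playing the role of $\bar\theta$.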

Now, in Step 2 we need to consider generating the random draw of $u_t$, i.e., $u_t^*$, which is assumed to be distributed with mean zero and variance $\hat\sigma^2$. In the regression model (1), the underlying distribution of $u_t$ is conventionally unknown. To examine whether the suggested procedure is robust or not, using the bootstrap methods we take the following four types of random draws for $u_t$, i.e., the normal error (N), the chi-squared error (X), the uniform error (U) and the residual-based error (R); a sketch of the four generators in code closes this section.

(N) $u_t^* = \hat\sigma \epsilon_t$, where $\epsilon_t \sim N(0, 1)$ and $\hat\sigma$ denotes the standard error of regression by OLS.

(X) $u_t^* = \hat\sigma \epsilon_t$, where $\epsilon_t = (v_t - 1)/\sqrt{2}$ and $v_t \sim \chi^2(1)$.

(U) $u_t^* = \hat\sigma \epsilon_t$, where $\epsilon_t = 2\sqrt{3}\,(v_t - 0.5)$ and $v_t \sim U(0, 1)$.

(R) $u_t^*$ is resampled from $\{c\hat u_1, c\hat u_2, \cdots, c\hat u_T\}$ with equal probability $1/T$, where $\hat u_t$ denotes the OLS residual at time $t$, i.e., $\hat u_t = y_t - z_t \hat\theta$, and $c = \sqrt{T/(T - k)}$ is taken (see Wu (1986) for $c$).

Thus, for $t = p+1, p+2, \cdots, T$, it is necessary to generate the random draws of $u_t$ in Step 2. In practice we often face the case where the underlying distribution of the true data series is different from that of the simulated one, because the distribution of $u_t$ is not known. (R) does not assume any distribution for $u_t$. Therefore, it might be expected that (R) is more robust than (N), (X) and (U). In the next section, for the true distribution of $u_t$ we consider three types of error terms, i.e., the standard normal error, the chi-squared error and the uniform error. (N), (X), (U) and (R) are examined for each of them.
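The following is a sketch of the four generators under the normalizations above (each $\epsilon_t$ has mean zero and variance one). The function name and signature are illustrative:

```python
# Sketch of the bootstrap error generators (N), (X), (U), (R). For (R),
# u_hat are the OLS residuals and c = sqrt(T / (T - k)) follows Wu (1986).
import numpy as np

def draw_errors(kind, size, sigma_hat, rng, u_hat=None, k=0):
    if kind == "N":
        eps = rng.standard_normal(size)
    elif kind == "X":   # chi-squared(1), centered and scaled to variance 1
        eps = (rng.chisquare(df=1, size=size) - 1.0) / np.sqrt(2.0)
    elif kind == "U":   # uniform(0,1), centered and scaled to variance 1
        eps = 2.0 * np.sqrt(3.0) * (rng.uniform(size=size) - 0.5)
    elif kind == "R":   # resample rescaled residuals with probability 1/T
        T = len(u_hat)
        c = np.sqrt(T / (T - k))
        return rng.choice(c * np.asarray(u_hat), size=size, replace=True)
    return sigma_hat * eps
```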

3 Monte Carlo Experiments

3.1 AR(1) Models

Let Model A be the case of $k = 0$, Model B be the case of $k = 1$ and $x_t = 1$, and Model C be the case of $k = 2$ and $x_t = (1, x_{1t})$, i.e.,

$$\text{Model A:} \quad y_t = \sum_{j=1}^{p} \alpha_j y_{t-j} + u_t,$$

$$\text{Model B:} \quad y_t = \beta_1 + \sum_{j=1}^{p} \alpha_j y_{t-j} + u_t,$$

$$\text{Model C:} \quad y_t = \beta_1 + \beta_2 x_{1t} + \sum_{j=1}^{p} \alpha_j y_{t-j} + u_t,$$

for $t = p+1, p+2, \ldots, T$, given the initial condition $y_1 = y_2 = \cdots = y_p = 0$. In Model C, we take $x_{1t}$ as the trend term, i.e., $x_{1t} = t$. The true distribution of the error term $u_t$ is assumed to be normal in Table 1, chi-squared in Table 2 and uniform in Table 3. For all the tables, the mean and variance of the error are normalized to zero and one. Since the true distribution of the error term is not known in practice, we examine (N), (X), (U) and (R) for all the estimated models. The case of $p = 1$ is examined here, although the suggested procedure can be applied to any $p$. The sample size is $T = 20, 40, 60$. For the parameter values, $\alpha_1 = 0.6, 0.9, 1.0$, $\beta_1 = 0.0, 1.0, 2.0$ and $\beta_2 = 0.0$ are taken. We perform 1000 simulations. The arithmetic averages from the 1000 estimates of $\alpha_1$ are shown in Tables 1 – 3; the values in parentheses are the root mean square errors from the 1000 estimates. In Tables 1 – 3, (O) represents OLS, while (N), (X), (U) and (R) are discussed in Section 2. (1) – (5) in each table denote the following cases (the list appears after the tables):

Table 1: Estimates of α1 (Case: β2 = 0) — N(0, 1) Error

| T | α1 | Est. | (1) A, β1 = 0.0 | (2) B, β1 = 0.0 | (3) B, β1 = 1.0 | (4) B, β1 = 2.0 | (5) C, β1 = 0.0 |
|---|----|------|-----------------|-----------------|-----------------|-----------------|-----------------|
| 20 | 0.6 | (O) | .552 (.205) | .446 (.272) | .482 (.226) | .532 (.160) | .339 (.346) |
| 20 | 0.6 | (N) | .607 (.219) | .603 (.273) | .601 (.224) | .601 (.158) | .593 (.319) |
| 20 | 0.6 | (X) | .580 (.223) | .588 (.268) | .587 (.225) | .590 (.157) | .570 (.310) |
| 20 | 0.6 | (U) | .608 (.220) | .605 (.275) | .601 (.224) | .600 (.158) | .592 (.320) |
| 20 | 0.6 | (R) | .552 (.229) | .605 (.274) | .601 (.224) | .600 (.158) | .593 (.319) |
| 20 | 0.9 | (O) | .828 (.175) | .676 (.301) | .837 (.120) | .882 (.051) | .527 (.436) |
| 20 | 0.9 | (N) | .902 (.168) | .849 (.230) | .902 (.108) | .900 (.049) | .805 (.304) |
| 20 | 0.9 | (X) | .887 (.175) | .834 (.235) | .893 (.108) | .897 (.048) | .779 (.311) |
| 20 | 0.9 | (U) | .903 (.167) | .852 (.229) | .900 (.108) | .899 (.049) | .806 (.305) |
| 20 | 0.9 | (R) | .856 (.183) | .850 (.230) | .901 (.108) | .900 (.049) | .805 (.304) |
| 20 | 1.0 | (O) | .927 (.160) | .761 (.307) | .981 (.050) | .995 (.023) | .560 (.496) |
| 20 | 1.0 | (N) | .998 (.143) | .905 (.220) | 1.006 (.053) | .999 (.022) | .834 (.327) |
| 20 | 1.0 | (X) | .987 (.150) | .893 (.228) | .999 (.050) | .999 (.022) | .810 (.339) |
| 20 | 1.0 | (U) | .998 (.142) | .907 (.219) | 1.005 (.053) | .999 (.022) | .835 (.327) |
| 20 | 1.0 | (R) | .959 (.159) | .906 (.219) | 1.005 (.053) | .999 (.022) | .834 (.327) |
| 40 | 0.6 | (O) | .572 (.139) | .526 (.162) | .535 (.148) | .553 (.118) | .477 (.196) |
| 40 | 0.6 | (N) | .600 (.143) | .602 (.159) | .599 (.144) | .599 (.115) | .601 (.179) |
| 40 | 0.6 | (X) | .588 (.143) | .595 (.157) | .594 (.144) | .596 (.115) | .595 (.177) |
| 40 | 0.6 | (U) | .600 (.143) | .602 (.160) | .599 (.145) | .598 (.116) | .602 (.180) |
| 40 | 0.6 | (R) | .571 (.149) | .602 (.159) | .599 (.145) | .599 (.115) | .601 (.180) |
| 40 | 0.9 | (O) | .858 (.104) | .789 (.161) | .861 (.073) | .888 (.034) | .716 (.227) |
| 40 | 0.9 | (N) | .899 (.099) | .892 (.129) | .900 (.065) | .900 (.032) | .881 (.165) |
| 40 | 0.9 | (X) | .891 (.102) | .884 (.130) | .898 (.065) | .899 (.032) | .872 (.165) |
| 40 | 0.9 | (U) | .899 (.099) | .892 (.129) | .899 (.065) | .899 (.032) | .882 (.164) |
| 40 | 0.9 | (R) | .874 (.106) | .892 (.129) | .899 (.065) | .899 (.032) | .881 (.165) |
| 40 | 1.0 | (O) | .956 (.088) | .872 (.162) | .996 (.016) | .999 (.007) | .762 (.271) |
| 40 | 1.0 | (N) | .994 (.074) | .953 (.108) | 1.001 (.016) | 1.000 (.007) | .915 (.169) |
| 40 | 1.0 | (X) | .991 (.078) | .948 (.112) | 1.000 (.015) | 1.000 (.007) | .908 (.174) |
| 40 | 1.0 | (U) | .994 (.074) | .953 (.107) | 1.001 (.016) | 1.000 (.007) | .917 (.168) |
| 40 | 1.0 | (R) | .975 (.084) | .953 (.108) | 1.001 (.016) | 1.000 (.007) | .915 (.169) |
| 60 | 0.6 | (O) | .582 (.110) | .552 (.124) | .557 (.116) | .567 (.099) | .522 (.143) |
| 60 | 0.6 | (N) | .601 (.112) | .601 (.122) | .601 (.113) | .601 (.097) | .602 (.132) |
| 60 | 0.6 | (X) | .595 (.113) | .599 (.121) | .599 (.113) | .600 (.097) | .598 (.131) |
| 60 | 0.6 | (U) | .601 (.113) | .601 (.122) | .601 (.113) | .601 (.097) | .602 (.132) |
| 60 | 0.6 | (R) | .580 (.117) | .602 (.122) | .602 (.113) | .602 (.097) | .602 (.132) |
| 60 | 0.9 | (O) | .873 (.075) | .828 (.114) | .872 (.056) | .891 (.028) | .782 (.154) |
| 60 | 0.9 | (N) | .901 (.072) | .899 (.096) | .901 (.050) | .901 (.026) | .897 (.119) |
| 60 | 0.9 | (X) | .898 (.073) | .897 (.096) | .901 (.050) | .901 (.026) | .893 (.120) |
| 60 | 0.9 | (U) | .902 (.072) | .899 (.096) | .902 (.050) | .901 (.026) | .897 (.119) |
| 60 | 0.9 | (R) | .883 (.079) | .900 (.096) | .902 (.050) | .901 (.026) | .897 (.119) |
| 60 | 1.0 | (O) | .973 (.056) | .917 (.107) | .998 (.008) | 1.000 (.004) | .838 (.186) |
| 60 | 1.0 | (N) | .999 (.047) | .972 (.069) | 1.000 (.008) | 1.000 (.004) | .945 (.114) |
| 60 | 1.0 | (X) | .997 (.048) | .970 (.070) | 1.000 (.008) | 1.000 (.004) | .942 (.117) |
| 60 | 1.0 | (U) | .999 (.047) | .972 (.069) | 1.000 (.008) | 1.000 (.004) | .945 (.114) |
| 60 | 1.0 | (R) | .987 (.052) | .972 (.069) | 1.000 (.008) | 1.000 (.004) | .946 (.114) |

Table 2: Estimates of α1 (Case: β2 = 0) — (χ²(1) − 1)/√2 Error

| T | α1 | Est. | (1) A, β1 = 0.0 | (2) B, β1 = 0.0 | (3) B, β1 = 1.0 | (4) B, β1 = 2.0 | (5) C, β1 = 0.0 |
|---|----|------|-----------------|-----------------|-----------------|-----------------|-----------------|
| 20 | 0.6 | (O) | .576 (.198) | .457 (.242) | .491 (.213) | .536 (.162) | .351 (.319) |
| 20 | 0.6 | (N) | .633 (.216) | .612 (.234) | .604 (.210) | .598 (.162) | .600 (.271) |
| 20 | 0.6 | (X) | .606 (.219) | .596 (.230) | .590 (.210) | .589 (.161) | .576 (.265) |
| 20 | 0.6 | (U) | .634 (.217) | .614 (.236) | .604 (.210) | .597 (.162) | .599 (.272) |
| 20 | 0.6 | (R) | .560 (.221) | .601 (.230) | .595 (.209) | .593 (.161) | .587 (.266) |
| 20 | 0.9 | (O) | .838 (.171) | .689 (.284) | .850 (.123) | .885 (.054) | .545 (.414) |
| 20 | 0.9 | (N) | .911 (.166) | .856 (.213) | .903 (.121) | .901 (.053) | .825 (.273) |
| 20 | 0.9 | (X) | .896 (.173) | .842 (.218) | .895 (.119) | .898 (.052) | .797 (.281) |
| 20 | 0.9 | (U) | .912 (.166) | .859 (.212) | .902 (.121) | .900 (.053) | .826 (.274) |
| 20 | 0.9 | (R) | .860 (.180) | .849 (.213) | .899 (.119) | .899 (.052) | .815 (.274) |
| 20 | 1.0 | (O) | .930 (.163) | .777 (.290) | .988 (.047) | .997 (.022) | .569 (.485) |
| 20 | 1.0 | (N) | 1.000 (.146) | .915 (.199) | 1.008 (.053) | 1.001 (.022) | .846 (.309) |
| 20 | 1.0 | (X) | .990 (.154) | .905 (.207) | 1.002 (.050) | 1.000 (.022) | .819 (.323) |
| 20 | 1.0 | (U) | 1.000 (.145) | .918 (.198) | 1.007 (.053) | 1.000 (.022) | .847 (.310) |
| 20 | 1.0 | (R) | .957 (.164) | .912 (.201) | 1.003 (.051) | 1.000 (.022) | .839 (.312) |
| 40 | 0.6 | (O) | .585 (.129) | .529 (.145) | .542 (.131) | .560 (.113) | .483 (.175) |
| 40 | 0.6 | (N) | .614 (.135) | .604 (.140) | .606 (.128) | .604 (.112) | .607 (.156) |
| 40 | 0.6 | (X) | .602 (.135) | .598 (.138) | .600 (.127) | .601 (.111) | .601 (.154) |
| 40 | 0.6 | (U) | .613 (.135) | .605 (.140) | .605 (.128) | .603 (.112) | .609 (.157) |
| 40 | 0.6 | (R) | .572 (.137) | .599 (.139) | .602 (.127) | .602 (.111) | .602 (.154) |
| 40 | 0.9 | (O) | .868 (.095) | .795 (.149) | .866 (.075) | .889 (.036) | .726 (.211) |
| 40 | 0.9 | (N) | .909 (.091) | .898 (.116) | .901 (.070) | .901 (.034) | .893 (.146) |
| 40 | 0.9 | (X) | .901 (.094) | .890 (.118) | .899 (.070) | .900 (.034) | .884 (.147) |
| 40 | 0.9 | (U) | .908 (.091) | .898 (.116) | .901 (.070) | .900 (.034) | .894 (.146) |
| 40 | 0.9 | (R) | .879 (.101) | .893 (.117) | .900 (.070) | .900 (.034) | .887 (.146) |
| 40 | 1.0 | (O) | .963 (.079) | .881 (.155) | .997 (.015) | .999 (.007) | .767 (.263) |
| 40 | 1.0 | (N) | 1.001 (.066) | .960 (.102) | 1.001 (.015) | 1.000 (.007) | .926 (.157) |
| 40 | 1.0 | (X) | .997 (.069) | .956 (.106) | 1.000 (.015) | 1.000 (.007) | .918 (.162) |
| 40 | 1.0 | (U) | 1.000 (.066) | .960 (.102) | 1.001 (.015) | 1.000 (.007) | .927 (.156) |
| 40 | 1.0 | (R) | .980 (.076) | .958 (.104) | 1.000 (.015) | 1.000 (.007) | .922 (.159) |
| 60 | 0.6 | (O) | .587 (.105) | .552 (.114) | .556 (.109) | .566 (.100) | .520 (.133) |
| 60 | 0.6 | (N) | .606 (.108) | .600 (.110) | .599 (.106) | .598 (.098) | .599 (.118) |
| 60 | 0.6 | (X) | .600 (.108) | .598 (.110) | .598 (.106) | .597 (.098) | .596 (.117) |
| 60 | 0.6 | (U) | .606 (.108) | .600 (.110) | .600 (.106) | .599 (.098) | .599 (.118) |
| 60 | 0.6 | (R) | .578 (.108) | .598 (.109) | .598 (.106) | .598 (.098) | .597 (.117) |
| 60 | 0.9 | (O) | .876 (.075) | .831 (.106) | .871 (.063) | .890 (.031) | .784 (.146) |
| 60 | 0.9 | (N) | .904 (.073) | .902 (.088) | .899 (.058) | .900 (.030) | .899 (.108) |
| 60 | 0.9 | (X) | .901 (.074) | .899 (.088) | .899 (.058) | .900 (.030) | .895 (.109) |
| 60 | 0.9 | (U) | .904 (.073) | .902 (.088) | .899 (.058) | .900 (.030) | .899 (.108) |
| 60 | 0.9 | (R) | .883 (.078) | .899 (.088) | .899 (.058) | .900 (.030) | .896 (.108) |
| 60 | 1.0 | (O) | .974 (.055) | .917 (.109) | .999 (.008) | 1.000 (.004) | .840 (.182) |
| 60 | 1.0 | (N) | .999 (.046) | .971 (.073) | 1.000 (.008) | 1.000 (.004) | .948 (.108) |
| 60 | 1.0 | (X) | .998 (.047) | .969 (.074) | 1.000 (.008) | 1.000 (.004) | .945 (.111) |
| 60 | 1.0 | (U) | 1.000 (.046) | .971 (.073) | 1.000 (.008) | 1.000 (.004) | .948 (.108) |
| 60 | 1.0 | (R) | .985 (.053) | .970 (.074) | 1.000 (.008) | 1.000 (.004) | .946 (.110) |

Table 3: Estimates of α1 (Case: β2 = 0) — U(−√3, √3) Error

| T | α1 | Est. | (1) A, β1 = 0.0 | (2) B, β1 = 0.0 | (3) B, β1 = 1.0 | (4) B, β1 = 2.0 | (5) C, β1 = 0.0 |
|---|----|------|-----------------|-----------------|-----------------|-----------------|-----------------|
| 20 | 0.6 | (O) | .534 (.216) | .430 (.275) | .468 (.235) | .524 (.164) | .330 (.355) |
| 20 | 0.6 | (N) | .587 (.226) | .583 (.265) | .586 (.227) | .593 (.159) | .582 (.322) |
| 20 | 0.6 | (X) | .560 (.232) | .569 (.261) | .572 (.229) | .582 (.158) | .559 (.314) |
| 20 | 0.6 | (U) | .588 (.227) | .585 (.266) | .586 (.228) | .592 (.159) | .581 (.323) |
| 20 | 0.6 | (R) | .534 (.236) | .587 (.266) | .588 (.228) | .594 (.159) | .584 (.323) |
| 20 | 0.9 | (O) | .811 (.190) | .660 (.312) | .838 (.120) | .884 (.050) | .511 (.449) |
| 20 | 0.9 | (N) | .884 (.178) | .831 (.231) | .903 (.108) | .902 (.048) | .789 (.309) |
| 20 | 0.9 | (X) | .868 (.188) | .816 (.237) | .895 (.108) | .899 (.047) | .763 (.317) |
| 20 | 0.9 | (U) | .885 (.177) | .834 (.230) | .902 (.108) | .901 (.047) | .790 (.310) |
| 20 | 0.9 | (R) | .840 (.193) | .834 (.230) | .903 (.109) | .901 (.047) | .791 (.309) |
| 20 | 1.0 | (O) | .912 (.180) | .749 (.318) | .984 (.049) | .997 (.022) | .539 (.513) |
| 20 | 1.0 | (N) | .982 (.159) | .894 (.223) | 1.009 (.053) | 1.001 (.021) | .812 (.336) |
| 20 | 1.0 | (X) | .971 (.168) | .882 (.232) | 1.002 (.050) | 1.000 (.021) | .786 (.350) |
| 20 | 1.0 | (U) | .983 (.158) | .896 (.222) | 1.008 (.053) | 1.000 (.021) | .814 (.336) |
| 20 | 1.0 | (R) | .945 (.174) | .896 (.222) | 1.009 (.053) | 1.000 (.021) | .814 (.335) |
| 40 | 0.6 | (O) | .567 (.139) | .520 (.167) | .528 (.156) | .547 (.127) | .473 (.198) |
| 40 | 0.6 | (N) | .596 (.141) | .595 (.162) | .592 (.150) | .593 (.122) | .595 (.178) |
| 40 | 0.6 | (X) | .583 (.142) | .589 (.160) | .586 (.150) | .589 (.122) | .589 (.176) |
| 40 | 0.6 | (U) | .595 (.142) | .596 (.162) | .592 (.151) | .592 (.122) | .597 (.179) |
| 40 | 0.6 | (R) | .566 (.150) | .596 (.162) | .593 (.151) | .593 (.122) | .596 (.178) |
| 40 | 0.9 | (O) | .855 (.107) | .786 (.165) | .863 (.072) | .889 (.032) | .712 (.233) |
| 40 | 0.9 | (N) | .896 (.100) | .889 (.131) | .901 (.063) | .901 (.030) | .875 (.169) |
| 40 | 0.9 | (X) | .888 (.103) | .881 (.133) | .899 (.063) | .900 (.030) | .867 (.170) |
| 40 | 0.9 | (U) | .896 (.100) | .889 (.131) | .901 (.064) | .900 (.030) | .877 (.168) |
| 40 | 0.9 | (R) | .871 (.110) | .889 (.131) | .901 (.063) | .900 (.030) | .876 (.168) |
| 40 | 1.0 | (O) | .955 (.088) | .872 (.164) | .997 (.015) | .999 (.007) | .759 (.277) |
| 40 | 1.0 | (N) | .993 (.074) | .954 (.109) | 1.002 (.016) | 1.000 (.007) | .914 (.177) |
| 40 | 1.0 | (X) | .989 (.077) | .949 (.113) | 1.000 (.015) | 1.000 (.007) | .906 (.181) |
| 40 | 1.0 | (U) | .993 (.074) | .954 (.109) | 1.002 (.016) | 1.000 (.007) | .915 (.175) |
| 40 | 1.0 | (R) | .974 (.084) | .954 (.109) | 1.001 (.016) | 1.000 (.007) | .915 (.176) |
| 60 | 0.6 | (O) | .578 (.113) | .548 (.127) | .552 (.120) | .562 (.103) | .517 (.146) |
| 60 | 0.6 | (N) | .597 (.114) | .597 (.123) | .595 (.117) | .596 (.100) | .596 (.132) |
| 60 | 0.6 | (X) | .590 (.115) | .594 (.123) | .594 (.117) | .595 (.100) | .592 (.132) |
| 60 | 0.6 | (U) | .597 (.114) | .596 (.123) | .596 (.117) | .596 (.100) | .596 (.132) |
| 60 | 0.6 | (R) | .576 (.119) | .598 (.123) | .597 (.117) | .597 (.100) | .596 (.132) |
| 60 | 0.9 | (O) | .869 (.080) | .823 (.119) | .870 (.058) | .890 (.028) | .777 (.157) |
| 60 | 0.9 | (N) | .897 (.076) | .894 (.099) | .899 (.052) | .900 (.027) | .891 (.118) |
| 60 | 0.9 | (X) | .894 (.078) | .891 (.100) | .899 (.052) | .900 (.027) | .888 (.119) |
| 60 | 0.9 | (U) | .898 (.076) | .894 (.100) | .899 (.052) | .900 (.027) | .891 (.118) |
| 60 | 0.9 | (R) | .879 (.083) | .895 (.099) | .900 (.052) | .900 (.027) | .892 (.118) |
| 60 | 1.0 | (O) | .971 (.060) | .912 (.114) | .998 (.008) | 1.000 (.004) | .833 (.192) |
| 60 | 1.0 | (N) | .996 (.051) | .967 (.076) | 1.000 (.008) | 1.000 (.004) | .941 (.119) |
| 60 | 1.0 | (X) | .994 (.052) | .965 (.077) | 1.000 (.008) | 1.000 (.004) | .937 (.122) |
| 60 | 1.0 | (U) | .996 (.051) | .967 (.076) | 1.000 (.008) | 1.000 (.004) | .941 (.119) |
| 60 | 1.0 | (R) | .984 (.057) | .968 (.076) | 1.000 (.008) | 1.000 (.004) | .941 (.119) |

(1) The true model is α1 = 0.6, 0.9, 1.0 and (β1, β2) = (0, 0), i.e., Model A, while the estimated model is Model A.
(2) The true model is α1 = 0.6, 0.9, 1.0 and (β1, β2) = (0, 0), i.e., Model A, while the estimated model is Model B.
(3) The true model is α1 = 0.6, 0.9, 1.0 and (β1, β2) = (1, 0), i.e., Model B, while the estimated model is Model B.
(4) The true model is α1 = 0.6, 0.9, 1.0 and (β1, β2) = (2, 0), i.e., Model B, while the estimated model is Model B.
(5) The true model is α1 = 0.6, 0.9, 1.0 and (β1, β2) = (0, 0), i.e., Model A, while the estimated model is Model C.
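To illustrate how a single table entry is produced, the following sketch reproduces the design of the (O) cell of case (2) with T = 20 and α1 = 0.6: data are generated from Model A and estimated by Model B, and the mean and RMSE of the 1000 OLSE's are reported. All names are illustrative; exact values will differ slightly across random seeds.

```python
# Sketch of one Monte Carlo cell: true Model A (alpha1 = 0.6), estimated
# Model B (constant + AR(1)), T = 20, N(0,1) errors, 1000 replications.
import numpy as np

rng = np.random.default_rng(0)
T, alpha1, n_sim = 20, 0.6, 1000
estimates = np.empty(n_sim)
for s in range(n_sim):
    y = np.zeros(T)
    for t in range(1, T):                            # Model A DGP, y_1 = 0
        y[t] = alpha1 * y[t - 1] + rng.standard_normal()
    Z = np.column_stack([np.ones(T - 1), y[:-1]])    # Model B regressors
    beta1_hat, a1_hat = np.linalg.lstsq(Z, y[1:], rcond=None)[0]
    estimates[s] = a1_hat
mean = estimates.mean()
rmse = np.sqrt(np.mean((estimates - alpha1) ** 2))
print(f"{mean:.3f} ({rmse:.3f})")   # should be near .446 (.272) of Table 1
```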

Suppose that the true model is represented by Model A with p = 1. When $x_{1t} = t$ is taken in Model C, it is known that the OLSE of α1 from Model C gives us the largest bias and the OLSE of α1 from Model A yields the smallest one (see, for example, Andrews (1993)). That is, the OLSE bias of the AR(1) coefficient increases as the number of exogenous variables increases. In order to check this fact, first we compare (1), (2) and (5) with respect to (O). Note that (O) represents the arithmetic average and the root mean square error from the 1000 OLSE's. For (O) in Table 1, in the case of T = 20 and α1 = 0.6, (1) is 0.552, (2) is 0.446 and (5) is 0.339. For all the cases of T = 20, 40, 60 and α1 = 0.6, 0.9, 1.0 in Tables 1 – 3, (O) becomes more biased as the number of exogenous variables increases.

We compare (2) – (4), taking the case T = 20 in Table 1, where the true model is Model A or B while the estimated model is Model B. We examine whether the intercept influences the precision of the OLSE. The results are as follows. When the intercept increases, the OLSE approaches the true parameter value, i.e., 0.446 for (O) in (2), 0.482 for (O) in (3) and 0.532 for (O) in (4); in addition, the root mean square error of the OLSE becomes smaller, i.e., 0.272 for (O) in (2), 0.226 for (O) in (3) and 0.160 for (O) in (4). Thus, as the intercept increases, a better OLSE is obtained. The same results are obtained for (O) in both Tables 2 and 3.

The error term is assumed to be normal in Table 1, chi-squared in Table 2 and uniform in Table 3. OLSE is distribution-free, but it is observed from Tables 1 – 3 that the bias of OLSE depends on the underlying distribution of the error. That is, the OLSE with the uniform error gives us the largest bias and RMSE of the three, while the OLSE with the chi-squared error yields the smallest bias and RMSE; see (O) in each table.

In Table 1, under the condition that the true distribution of the error term $u_t$ is normal, we compute the unbiased estimate of the AR(1) coefficient assuming that the error term follows the normal distribution (N), the chi-squared distribution (X), the uniform distribution (U) and the residual-based distribution (R). Accordingly, it might be expected that (N) shows the best performance, because the estimated model is consistent with the underlying true one. Similarly, (X) in Table 2 and (U) in Table 3 should be better than any other procedures. That is, the best estimator should be (N) in Table 1, (X) in Table 2 and (U) in Table 3. In Table 1, as expected, (N) is the best estimator, being close to the true parameter values in almost all the cases. In Table 3, as expected, (U) is the best estimator, being close to the true parameter values in almost all the cases. However, in Table 2, (U) is the best estimator, against expectation; remember that (X) should be the best because the underlying assumption of the error is chi-squared. In any case, we can see from the tables that (O) is biased while (N), (X), (U) and (R) are bias-corrected. (U) is the best estimator out of (N), (X), (U) and (R) throughout Tables 1 – 3, although it might be expected that (R) is robust in all the cases.

3.2 AR(p) Models

Next, we consider the AR(p) models, where p = 2, 3 is taken. Assume that the true model is represented by Model A, i.e.,

$$y_t = \alpha_1 y_{t-1} + \alpha_2 y_{t-2} + \cdots + \alpha_p y_{t-p} + u_t,$$

for $t = p+1, p+2, \cdots, T$, where $u_t$ is assumed to be distributed as a standard normal random variable and the initial values are given by $y_1 = y_2 = \cdots = y_p = 0$. In Section 3.1 we examined four kinds of distributions for $u_t$, but in this section we consider only the normal error. The above AR(p) model is rewritten as:

$$(1 - \lambda_1 L)(1 - \lambda_2 L) \cdots (1 - \lambda_p L) y_t = u_t,$$

where $L$ denotes the lag operator. $\lambda_1, \lambda_2, \cdots, \lambda_p$ are assumed to be real numbers. Taking the cases of p = 2, 3, we estimate the true model by Model A. However, the true model is not necessarily equivalent to the estimated model. The case of $\lambda_1 \neq 0$ and $\lambda_2 = \cdots = \lambda_p = 0$ implies that the true model is AR(1) but the estimated one is AR(p). The results are in Table 4 for AR(2) and Table 5 for AR(3), where the arithmetic averages and the root mean square errors from the 1000 coefficient estimates of α1 – α3 are shown. In both Tables 4 and 5, only the case of T = 20 is shown to save space. As shown in Tables 1 – 3, the cases of T = 40, 60 are similar to those of T = 20 except that the former are less biased than the latter. We examine the cases where $\lambda_i$ takes 0.0, 0.5 or 1.0 for i = 1, 2, 3 and $\lambda_1 \geq \lambda_2 \geq \lambda_3$ holds. Under the assumption that the error term $u_t$ is normally distributed in the true model, we obtain the unbiased estimates using (N), (X), (U) and (R) shown in Section 2. In Table 4, the RMSE's of (N), (X), (U) and (R) are smaller than those of OLSE in the case of α1 = 2 and α2 = −1, i.e., $\lambda_1 = \lambda_2 = 1$. For all the estimates of α1 and α2, the arithmetic averages of (N), (X), (U) and (R) are closer to the true parameter values than those of OLSE, but (X) is slightly more biased compared with (N), (U) and (R).
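The coefficient vectors in the row labels of Tables 4 and 5 follow from expanding the lag polynomial; for instance, $\lambda_1 = \lambda_2 = 1$ gives (α1, α2) = (2, −1). A small sketch of this mapping (names illustrative):

```python
# Expand (1 - lam_1 L)(1 - lam_2 L)...(1 - lam_p L) to recover the AR
# coefficients alpha_j in y_t = alpha_1 y_{t-1} + ... + alpha_p y_{t-p} + u_t.
import numpy as np

def ar_coefficients(lams):
    poly = np.array([1.0])
    for lam in lams:
        poly = np.convolve(poly, [1.0, -lam])   # multiply by (1 - lam*L)
    return -poly[1:]                            # alpha_j = -coefficient of L^j

print(ar_coefficients([1.0, 1.0]))        # [ 2. -1.]          -> Table 4
print(ar_coefficients([0.5, 0.5, 0.5]))   # [ 1.5 -0.75 0.125] -> Table 5
```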

Table 4: AR(2) Model: N(0, 1) Error and T = 20

| (α1, α2) | Est. | Est. of α1 | Est. of α2 |
|----------|------|------------|------------|
| (0.0, 0.00) | (O) | .003 (.249) | −.053 (.247) |
| (0.0, 0.00) | (N) | .002 (.263) | .000 (.281) |
| (0.0, 0.00) | (X) | −.019 (.264) | −.027 (.278) |
| (0.0, 0.00) | (U) | .001 (.263) | .008 (.284) |
| (0.0, 0.00) | (R) | .002 (.263) | .002 (.281) |
| (0.5, 0.00) | (O) | .478 (.251) | −.052 (.245) |
| (0.5, 0.00) | (N) | .503 (.264) | −.001 (.278) |
| (0.5, 0.00) | (X) | .484 (.264) | −.017 (.276) |
| (0.5, 0.00) | (U) | .503 (.264) | .007 (.280) |
| (0.5, 0.00) | (R) | .503 (.264) | .001 (.278) |
| (1.0, 0.00) | (O) | .944 (.257) | −.032 (.254) |
| (1.0, 0.00) | (N) | 1.002 (.272) | −.013 (.286) |
| (1.0, 0.00) | (X) | .982 (.268) | −.001 (.281) |
| (1.0, 0.00) | (U) | 1.002 (.273) | −.009 (.289) |
| (1.0, 0.00) | (R) | 1.003 (.272) | −.012 (.287) |
| (1.0, −0.25) | (O) | .951 (.247) | −.265 (.235) |
| (1.0, −0.25) | (N) | 1.004 (.256) | −.254 (.267) |
| (1.0, −0.25) | (X) | .983 (.254) | −.252 (.264) |
| (1.0, −0.25) | (U) | 1.005 (.257) | −.250 (.270) |
| (1.0, −0.25) | (R) | 1.004 (.256) | −.252 (.268) |
| (1.5, −0.50) | (O) | 1.408 (.249) | −.454 (.241) |
| (1.5, −0.50) | (N) | 1.502 (.248) | −.509 (.264) |
| (1.5, −0.50) | (X) | 1.475 (.247) | −.484 (.262) |
| (1.5, −0.50) | (U) | 1.506 (.250) | −.512 (.267) |
| (1.5, −0.50) | (R) | 1.504 (.249) | −.510 (.265) |
| (2.0, −1.00) | (O) | 1.871 (.240) | −.873 (.256) |
| (2.0, −1.00) | (N) | 1.991 (.203) | −.993 (.228) |
| (2.0, −1.00) | (X) | 1.970 (.210) | −.969 (.237) |
| (2.0, −1.00) | (U) | 1.997 (.203) | −1.000 (.228) |
| (2.0, −1.00) | (R) | 1.994 (.203) | −.996 (.228) |

Table 5: AR(3) Model: N(0, 1) Error and T = 20

| (α1, α2, α3) | Est. | Est. of α1 | Est. of α2 | Est. of α3 |
|--------------|------|------------|------------|------------|
| (0.0, 0.00, 0.000) | (O) | .003 (.262) | −.055 (.262) | .007 (.260) |
| (0.0, 0.00, 0.000) | (N) | .006 (.278) | −.002 (.298) | .008 (.315) |
| (0.0, 0.00, 0.000) | (X) | −.016 (.277) | −.021 (.295) | −.005 (.310) |
| (0.0, 0.00, 0.000) | (U) | .001 (.279) | .005 (.300) | .010 (.317) |
| (0.0, 0.00, 0.000) | (R) | .003 (.278) | .002 (.299) | .009 (.315) |
| (0.5, 0.00, 0.000) | (O) | .476 (.264) | −.055 (.288) | .006 (.263) |
| (0.5, 0.00, 0.000) | (N) | .507 (.280) | −.005 (.329) | .003 (.317) |
| (0.5, 0.00, 0.000) | (X) | .482 (.278) | −.011 (.322) | .005 (.314) |
| (0.5, 0.00, 0.000) | (U) | .502 (.280) | .006 (.331) | .001 (.319) |
| (0.5, 0.00, 0.000) | (R) | .504 (.280) | .001 (.329) | .003 (.318) |
| (1.0, 0.00, 0.000) | (O) | .940 (.275) | −.053 (.351) | .026 (.278) |
| (1.0, 0.00, 0.000) | (N) | 1.003 (.286) | −.010 (.402) | −.010 (.328) |
| (1.0, 0.00, 0.000) | (X) | .977 (.286) | −.012 (.394) | .018 (.323) |
| (1.0, 0.00, 0.000) | (U) | .998 (.286) | .005 (.402) | −.018 (.331) |
| (1.0, 0.00, 0.000) | (R) | 1.000 (.286) | −.003 (.401) | −.012 (.329) |
| (1.0, −0.25, 0.000) | (O) | .948 (.267) | −.267 (.338) | .002 (.268) |
| (1.0, −0.25, 0.000) | (N) | 1.006 (.281) | −.255 (.395) | −.002 (.319) |
| (1.0, −0.25, 0.000) | (X) | .979 (.280) | −.249 (.385) | .006 (.317) |
| (1.0, −0.25, 0.000) | (U) | 1.002 (.281) | −.243 (.397) | −.009 (.321) |
| (1.0, −0.25, 0.000) | (R) | 1.003 (.281) | −.248 (.395) | −.005 (.320) |
| (1.5, −0.50, 0.000) | (O) | 1.410 (.282) | −.480 (.418) | .023 (.285) |
| (1.5, −0.50, 0.000) | (N) | 1.502 (.287) | −.508 (.484) | −.005 (.331) |
| (1.5, −0.50, 0.000) | (X) | 1.476 (.288) | −.499 (.474) | .016 (.328) |
| (1.5, −0.50, 0.000) | (U) | 1.498 (.288) | −.493 (.487) | −.015 (.335) |
| (1.5, −0.50, 0.000) | (R) | 1.500 (.288) | −.501 (.485) | −.008 (.333) |
| (2.0, −1.00, 0.000) | (O) | 1.870 (.300) | −.880 (.513) | .007 (.314) |
| (2.0, −1.00, 0.000) | (N) | 1.997 (.298) | −1.004 (.586) | .005 (.363) |
| (2.0, −1.00, 0.000) | (X) | 1.974 (.297) | −.984 (.572) | .012 (.357) |
| (2.0, −1.00, 0.000) | (U) | 1.993 (.300) | −.990 (.591) | −.007 (.367) |
| (2.0, −1.00, 0.000) | (R) | 1.995 (.299) | −.999 (.588) | .001 (.365) |
| (1.5, −0.75, 0.125) | (O) | 1.413 (.273) | −.689 (.401) | .100 (.273) |
| (1.5, −0.75, 0.125) | (N) | 1.504 (.281) | −.754 (.462) | .120 (.320) |
| (1.5, −0.75, 0.125) | (X) | 1.475 (.281) | −.733 (.453) | .124 (.317) |
| (1.5, −0.75, 0.125) | (U) | 1.502 (.282) | −.743 (.467) | .114 (.323) |
| (1.5, −0.75, 0.125) | (R) | 1.502 (.281) | −.748 (.463) | .117 (.321) |
| (2.0, −1.25, 0.250) | (O) | 1.873 (.293) | −1.112 (.492) | .213 (.296) |
| (2.0, −1.25, 0.250) | (N) | 2.000 (.287) | −1.255 (.543) | .249 (.337) |
| (2.0, −1.25, 0.250) | (X) | 1.973 (.287) | −1.227 (.532) | .252 (.332) |
| (2.0, −1.25, 0.250) | (U) | 1.998 (.289) | −1.246 (.549) | .242 (.342) |
| (2.0, −1.25, 0.250) | (R) | 1.999 (.288) | −1.251 (.545) | .246 (.339) |
| (2.5, −2.00, 0.500) | (O) | 2.319 (.318) | −1.702 (.606) | .377 (.342) |
| (2.5, −2.00, 0.500) | (N) | 2.490 (.290) | −1.986 (.611) | .494 (.366) |
| (2.5, −2.00, 0.500) | (X) | 2.464 (.290) | −1.942 (.602) | .479 (.360) |
| (2.5, −2.00, 0.500) | (U) | 2.490 (.292) | −1.981 (.617) | .489 (.370) |
| (2.5, −2.00, 0.500) | (R) | 2.490 (.291) | −1.984 (.614) | .493 (.368) |
| (3.0, −3.00, 1.000) | (O) | 2.763 (.354) | −2.501 (.762) | .729 (.441) |
| (3.0, −3.00, 1.000) | (N) | 2.975 (.285) | −2.946 (.647) | .969 (.393) |
| (3.0, −3.00, 1.000) | (X) | 2.949 (.289) | −2.889 (.651) | .938 (.394) |
| (3.0, −3.00, 1.000) | (U) | 2.978 (.287) | −2.950 (.653) | .970 (.397) |
| (3.0, −3.00, 1.000) | (R) | 2.976 (.286) | −2.948 (.650) | .970 (.394) |

Therefore, it might be concluded that the OLSE bias is corrected by the suggested estimators. Thus, in the case of the AR(2) models, we obtain the same results as in the case of the AR(1) models.

Next, we examine the AR(3) models; the results are in Table 5. For estimation of the zero coefficients, all the estimators are close to the true parameter value. However, for estimation of the non-zero coefficients, the suggested estimators are superior to OLSE, which implies that (N), (X), (U) and (R) are less biased than (O). Thus, for all the cases of AR(p) for p = 1, 2, 3, it is shown from Tables 1 – 5 that the OLSE bias is corrected using the proposed estimators even if the data generating process is not known.

Finally, note the following. In Table 4, the case of $\alpha_1 \neq 0$ and $\alpha_2 = 0$ (i.e., $\lambda_1 \neq 0$ and $\lambda_2 = 0$) implies that the data generating process is AR(1). In Table 5, the case of $\alpha_1 \neq 0$ and $\alpha_2 = \alpha_3 = 0$ (i.e., $\lambda_1 \neq 0$ and $\lambda_2 = \lambda_3 = 0$) implies that the data generating process is AR(1), and the case of $\alpha_1 \neq 0$, $\alpha_2 \neq 0$ and $\alpha_3 = 0$ (i.e., $\lambda_1 \neq 0$, $\lambda_2 \neq 0$ and $\lambda_3 = 0$) implies that the data generating process is AR(2). Thus, in any case, even if the true model is different from the estimated model, we can obtain the bias-corrected coefficient estimates based on the suggested estimators.

4 Summary

It is well known that the OLS estimates are biased when the autoregressive terms are included in the explanatory variables. In this paper, we have proposed a bias correction method using simulation techniques, where the bootstrap methods are applied. We obtain the unbiased estimate of $\theta$, i.e., $\bar\theta$, which is the $\theta$ such that the OLSE computed from the original data equals the arithmetic average of the OLSE's obtained from the simulated data given $\theta$. When we simulate the data series, we need to assume a distribution for the error term. Since the underlying true distribution of the error term is not known, four types of random draws are examined for the error term, i.e., the normal error (N), the chi-squared error (X), the uniform error (U) and the residual-based error (R). Because the residual-based approach is distribution-free, it is naturally expected that (R) shows a good performance in all the simulation studies where the distribution of the error term is misspecified. However, from the simulation studies shown in Tables 1 – 3, (U) indicates the best estimator irrespective of the true distribution of the error term.


Appendices

Appendix 1: OLSE Bias

In this appendix, we examine by Monte Carlo simulations how large the OLSE bias is. We focus only on the case of p = 1, i.e., the AR(1) model. Suppose that the true model is represented by Model A with p = 1. When $x_{1t} = t$ (time trend) is taken in Model C, it is known that the OLSE of α1 from Model C gives us the largest bias and the OLSE of α1 from Model A yields the smallest one (see, for example, Andrews (1993)). Figure 1 shows the relationship between the true autoregressive coefficient (i.e., α1) and the arithmetic mean of the OLSE's from 10,000 simulation runs. In order to draw the figure, we take the following simulation procedure (a sketch in code is given below).

(i) Generate $y_2, y_3, \cdots, y_T$ by Model A given α1, $u_t \sim N(0, 1)$ and $y_1 = 0$, where T = 20.

(ii) Compute the OLSE of α1 by estimating Model A, that of (β1, α1) by Model B, and that of (β1, β2, α1) by Model C. Note in Model C that $x_{1t} = t$ (time trend) is taken in this simulation study.

(iii) Repeat (i) and (ii) 10,000 times.

(iv) Obtain the arithmetic mean from the 10,000 OLSE's of α1.

(v) Repeat (i) – (iv) given exactly the same random draws for $u_t$ (i.e., $10{,}000 \times (T - p)$ random draws for T = 20 and p = 1) and a different parameter value for α1 (i.e., α1 = −1.20, −1.19, −1.18, ···, 1.20).

Thus, we have the arithmetic mean from the 10,000 OLSE's corresponding to the true parameter value for each model. In Figure 1, the true model is given by Model A and it is estimated by Models A – C. The horizontal line implies the true parameter value of the AR(1) coefficient and the vertical line indicates the OLSE corresponding to the true parameter value. Unless the OLSE is biased, the 45° line represents the relationship between the true parameter value and the OLSE. Each line in Figure 1 indicates the arithmetic mean of the 10,000 OLSE's. The bias is largest around α1 = 1 for all the Models A – C. From Figure 1, the bias increases drastically as the number of exogenous variables increases. That is, in the case where α1 is positive, the OLSE of Model C has the largest downward bias and the OLSE of Model A has the smallest downward bias, which implies that inclusion of more extra variables results in larger bias of OLSE. Thus, from Figure 1 we can see how large the OLSE bias is. That is, the discrepancy between the 45° line and the other lines increases as the number of extra variables increases.

Now we consider correcting the OLSE bias. In Figure 1, we see the arithmetic mean from the 10,000 OLSE's given each true coefficient. It is also possible to read the figure reversely.
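A minimal sketch of steps (i) – (v), assuming Models A – C as defined in Section 3.1; n_sim is set to 1000 rather than 10,000 for speed, and all names are illustrative:

```python
# Sketch of the Appendix 1 experiment: mean OLSE of alpha_1 under Models
# A, B, C when the true DGP is Model A, with a common set of error draws.
import numpy as np

T, n_sim = 20, 1000                          # the paper uses 10,000 runs
rng = np.random.default_rng(0)
u = rng.standard_normal((n_sim, T - 1))      # reused for every alpha_1

def mean_olse(alpha1):
    means = np.zeros(3)                      # Models A, B, C
    for s in range(n_sim):
        y = np.zeros(T)
        for t in range(1, T):                # Model A DGP, y_1 = 0
            y[t] = alpha1 * y[t - 1] + u[s, t - 1]
        lag, dep = y[:-1], y[1:]
        trend = np.arange(2, T + 1, dtype=float)     # x_1t = t (Model C)
        designs = [lag[:, None],                             # Model A
                   np.column_stack([np.ones(T - 1), lag]),   # Model B
                   np.column_stack([np.ones(T - 1), trend, lag])]  # Model C
        for m, Z in enumerate(designs):
            means[m] += np.linalg.lstsq(Z, dep, rcond=None)[0][-1]
    return means / n_sim

for a in (0.6, 0.9, 1.0):                    # a few grid points of Figure 1
    print(a, mean_olse(a))
```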

Figure 1: The Arithmetic Mean from the 10,000 OLSE's of the AR(1) Coefficient — N(0, 1) Error and T = 20

[Figure not reproduced in this version. The horizontal axis is the true coefficient α1 (over −1.2 to 1.2) and the vertical axis is the mean OLSE $\hat\alpha_1$; the plotted curves are the 45° line and the arithmetic means of the 10,000 OLSE's under Models A, B and C, with the Model C curve lying farthest below the 45° line for positive α1.]

For example, in Figure 1, when the OLSE is obtained as $\hat\alpha_1 = 0.5$ from actually observed data, the true parameter value α1 can be estimated as 0.526 for Model A, 0.642 for Model B and 0.806 for Model C. For the proposed estimator, it is possible to consider shifting the distribution of OLSE toward the distribution around the true value in the sense of the mean. In practice, no one knows the true model; what we can do is to estimate the model assumed by a researcher. Figure 1 indicates that inclusion of more extra variables possibly yields seriously biased OLSE, and furthermore that the true parameter values can be recovered from the estimated model even if we do not know the true model. In Section 2, based on this idea, we obtain the mean-unbiased estimator, which can be applied to any case of higher-order autoregressive models, nonnormal error terms and inclusion of exogenous variables other than the constant and trend terms. Here, we take the constant term and the time trend as $x_t$, although any exogenous variables can be included in the model.
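In code, reading the figure reversely amounts to inverting the simulated mean function by interpolation, assuming the mean curve is monotone increasing over the grid. A sketch reusing mean_olse from the previous sketch (illustrative, not the paper's implementation; coarser grids are cheaper):

```python
# Invert the simulated bias curve: given an observed OLSE, recover the
# true alpha_1 by interpolating the (alpha_1, mean OLSE) grid.
import numpy as np

alpha_grid = np.arange(-1.20, 1.21, 0.01)
# mean_olse(a)[1] is the Model B column of the previous sketch
mean_curve = np.array([mean_olse(a)[1] for a in alpha_grid])

olse_observed = 0.5
# np.interp uses mean_curve as x and alpha_grid as y, i.e., the inverse map;
# this requires mean_curve to be increasing over the grid.
alpha_unbiased = np.interp(olse_observed, mean_curve, alpha_grid)
print(alpha_unbiased)   # roughly 0.64 for Model B (cf. 0.642 in the text)
```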

Appendix 2: Optimization Procedure

As for the solution of equation (4), in this appendix we take an example of an iterative procedure, where a numerical optimization procedure is utilized. Since $\hat\theta_i^*$ is computed from the artificially simulated data, $\bar\theta$ is derived numerically by the simulation technique (see, for example, Tanizaki (1995) for the numerical optimization procedure). Using equation (4), we update the parameter $\theta$ as follows:

$$\theta^{(j+1)} = \theta^{(j)} + \gamma^{(j)} \left( \hat\theta - g(\theta^{(j)}) \right), \qquad (5)$$

where $j$ denotes the $j$-th iteration. In the first iteration, the OLSE of $\theta$ is taken for $\theta^{(1)}$, i.e., $\theta^{(1)} = \hat\theta$. $\gamma^{(j)}$ is a scalar, which may depend on the iteration number $j$. Thus, using equation (5), the unbiased estimator can be obtained: when $\theta^{(j+1)}$ is stable, we take it as the unbiased estimate of $\theta$, i.e., $\bar\theta$. As for the convergence criterion, in this paper, when each element of $\theta^{(j+1)} - \theta^{(j)}$ is less than 0.001 in absolute value, we consider that $\theta^{(j+1)}$ is stable.

For an interpretation of $\gamma^{(j)}$, it might be appropriate to consider that the Newton-Raphson optimization procedure is taken, which is described as follows. Approximating $\hat\theta - g(\theta)$ around $\theta = \theta^*$, we have:

$$0 = \hat\theta - g(\theta) \approx \hat\theta - g(\theta^*) - \frac{\partial g(\theta^*)}{\partial \theta'} \, (\theta - \theta^*).$$

Then, we can rewrite this as:

$$\theta = \theta^* + \left( \frac{\partial g(\theta^*)}{\partial \theta'} \right)^{-1} \left( \hat\theta - g(\theta^*) \right).$$

Regarding $\theta$ as $\theta^{(j+1)}$ and $\theta^*$ as $\theta^{(j)}$, the following equation is derived:

$$\theta^{(j+1)} = \theta^{(j)} + \left( \frac{\partial g(\theta^{(j)})}{\partial \theta'} \right)^{-1} \left( \hat\theta - g(\theta^{(j)}) \right),$$

which is equivalent to equation (5) under the following condition:

$$\left( \frac{\partial g(\theta^{(j)})}{\partial \theta'} \right)^{-1} = \gamma^{(j)} I_{k+p},$$

where $I_{k+p}$ denotes the $(k+p) \times (k+p)$ identity matrix. Since $g(\theta)$ cannot be explicitly specified, we take the first derivative of $g(\theta)$ to be a diagonal matrix. Moreover, taking into account the convergence speed, $\gamma^{(j)} = c^{j-1}$ is used in this paper, where $c = 0.9$.
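As a toy illustration of update (5) with the damped step $\gamma^{(j)} = c^{j-1}$ and $c = 0.9$, the following sketch solves $\hat\theta = g(\theta)$ for a known monotone $g$ standing in for the simulated mean function of Section 2. The choice of $g$ and all names are purely illustrative:

```python
# Toy demonstration of theta^(j+1) = theta^(j) + gamma^(j)*(theta_hat -
# g(theta^(j))) with gamma^(j) = 0.9 ** (j - 1), for a known monotone g.
def g(theta):
    return 0.9 * theta - 0.05         # illustrative downward-biased mean

theta_hat = 0.5                        # "observed OLSE"
theta = theta_hat                      # theta^(1) = theta_hat
for j in range(1, 100):
    step = 0.9 ** (j - 1) * (theta_hat - g(theta))
    theta += step
    if abs(step) < 0.001:              # convergence criterion of Appendix 2
        break
print(theta)                           # fixed point g(theta) = 0.5 -> ~0.611
```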

References

Abadir, K.M. (1993) "OLS Bias in a Nonstationary Autoregression," Econometric Theory, Vol.9, No.1, pp.81 – 93.
Andrews, D.W.K. (1993) "Exactly Median-Unbiased Estimation of First Order Autoregressive / Unit Root Models," Econometrica, Vol.61, No.1, pp.139 – 165.
Andrews, D.W.K. and H.Y. Chen (1994) "Approximately Median-Unbiased Estimation of Autoregressive Models," Journal of Business and Economic Statistics, Vol.12, No.2, pp.187 – 204.
Diebold, F.X. and G.D. Rudebusch (1991) "Unbiased Estimation of Autoregressive Models," unpublished manuscript, University of Pennsylvania.
Enders, W. and B. Falk (1998) "Threshold-Autoregressive, Median-Unbiased, and Cointegration Tests of Purchasing Power Parity," International Journal of Forecasting, Vol.14, No.2, pp.171 – 186.
Greene, W.H. (1993) Econometric Analysis (Second Edition), Prentice Hall.
Grubb, D. and J. Symons (1987) "Bias in Regressions with a Lagged Dependent Variable," Econometric Theory, Vol.3, pp.371 – 386.
Hurwicz, L. (1950) "Least-Squares Bias in Time Series," in Statistical Inference in Dynamic Economic Models, ed. T.C. Koopmans, New York: John Wiley, pp.365 – 383.
Imhof, J.P. (1961) "Computing the Distribution of Quadratic Forms in Normal Variates," Biometrika, Vol.48, pp.419 – 426.
Kendall, M.G. (1954) "Note on Bias in the Estimation of Autocorrelations," Biometrika, Vol.41, pp.403 – 404.
MacKinnon, J.G. and A.A. Smith, Jr. (1998) "Approximate Bias Correction in Econometrics," Journal of Econometrics, Vol.85, No.2, pp.205 – 230.
Maekawa, K. (1983) "An Approximation to the Distribution of the Least Squares Estimator in an Autoregressive Model with Exogenous Variables," Econometrica, Vol.51, No.1, pp.229 – 238.
Maekawa, K. (1987) "Finite Sample Properties of Several Predictors from an Autoregressive Model," Econometric Theory, Vol.3, pp.359 – 370.
Marriott, F.H.C. and J.A. Pope (1954) "Bias in the Estimation of Autocorrelations," Biometrika, Vol.41, pp.390 – 402.
Orcutt, G.H. and H.S. Winokur (1969) "First Order Autoregression: Inference, Estimation, and Prediction," Econometrica, Vol.37, No.1, pp.1 – 14.
Peters, T.A. (1989) "The Exact Moments of OLS in Dynamic Regression Models with Non-Normal Errors," Journal of Econometrics, Vol.40, pp.279 – 305.
Quenouille, M.H. (1956) "Notes on Bias in Estimation," Biometrika, Vol.43, pp.353 – 360.
Sawa, T. (1978) "The Exact Moments of the Least Squares Estimator for the Autoregressive Model," Journal of Econometrics, Vol.8, pp.159 – 172.
Shaman, P. and R.A. Stine (1988) "The Bias of Autoregressive Coefficient Estimators," Journal of the American Statistical Association, Vol.83, No.403, pp.842 – 848.
Tanaka, K. (1983) "Asymptotic Expansions Associated with the AR(1) Model with Unknown Mean," Econometrica, Vol.51, No.4, pp.1221 – 1231.
Tanizaki, H. (1995) "Asymptotically Exact Confidence Intervals of CUSUM and CUSUMSQ Tests: A Numerical Derivation Using Simulation Technique," Communications in Statistics, Simulation and Computation, Vol.24, No.4, pp.1019 – 1036.
Tanizaki, H. (2000) "Bias Correction of OLSE in the Regression Model with Lagged Dependent Variables," Computational Statistics and Data Analysis, forthcoming.
Tse, Y.K. (1982) "Edgeworth Approximations in First-Order Stochastic Difference Equations with Exogenous Variables," Journal of Econometrics, Vol.20, pp.175 – 195.
Tsui, A.K. and M.M. Ali (1994) "Exact Distributions, Density Functions and Moments of the Least Squares Estimator in a First-Order Autoregressive Model," Computational Statistics & Data Analysis, Vol.17, No.4, pp.433 – 454.
White, J.S. (1961) "Asymptotic Expansions for the Mean and Variance of the Serial Correlation Coefficient," Biometrika, Vol.48, pp.85 – 94.
Wu, C.F.J. (1986) "Jackknife, Bootstrap and Other Resampling Methods in Regression Analysis," Annals of Statistics, Vol.14, pp.1261 – 1350 (with discussion).