The Robustness of Estimators for Dynamic Panel Data Models to Misspecification

Mark N. Harris¹, Weiping Kostenko¹, László Mátyás²,³ and Isfaaq Timol¹
¹ Department of Econometrics and Business Statistics, Monash University, Australia
² Central European University, Hungary
³ Erudite, Université Paris XII
October 29, 2008

JEL Classification: C13, C15 and C23

We would like to thank Jan Kiviet, Patrick Sevestre and Badi Baltagi for helpful comments and suggestions, and Attila Pataki for computing assistance on earlier versions of this paper. The usual caveats apply.


Abstract

The transition from economic theory to a testable form of model invariably involves the use of certain “simplifying assumptions”. If, however, these are not valid, misspecified models result. This paper considers estimation of the dynamic panel data model, which often forms the basis of testable economic hypotheses. The available estimators are themselves frequently based on certain assumptions which often appear to be untenable in practice. In this paper the performance of these estimators is analysed in scenarios where the theoretically required conditions are not met. Specifically, we consider three such instances: correlated idiosyncratic disturbance terms; correlation between the idiosyncratic disturbance terms and explanatory variables; and, finally, cross-sectional dependence. The major findings are that the limited tests readily available tend to have poor power properties and that estimators' performance varies greatly across scenarios. A robust estimator across all experiments and parameter settings was the WB2 (?, ?) estimator.

1 The order of the authors is alphabetical.


1. Introduction

Applied economists invariably take an economic question of interest, transform it into a testable hypothesis and econometric model, and subsequently refute or accept it based on available data. Unfortunately this transition from theory to estimated equation is often not straightforward, such that many “simplifying assumptions” have to be made. The use of panel data in applied work has the advantage that one can easily condition on unobserved individual heterogeneity, which is often a major stumbling block for the applied economist. Even here though, use is still made of such simplifying assumptions. The problem, however, is that if these assumptions are not valid, potentially misspecified models and seriously erroneous inference may result.

In this paper, focus is on the estimation of dynamic panel data models and the consequences of working with a misspecified model. There has been a significant amount of work analysing the small sample performance of the surfeit of estimators available to the applied researcher when these assumptions are indeed valid (see, for example, ?, ?, ?, ?, ?, ?). In these studies analysis is based on Monte Carlo experiments in which the data were indeed generated according to the assumed data generating process; a significant array of biases is typically found. These differences can mostly be attributed to the small sample performance of the estimators, as differing (consistent) estimators have divergent rates of asymptotic convergence. Empirically, however, a paradox frequently arises: even in large samples (where all consistent estimators should have similarly small biases) different estimators will invariably yield quite substantially different parameter estimates (?). This suggests that it is not only the small sample performance of the estimators that is causing these divergences, but often that the estimated models are, in some manner, misspecified.
It is the purpose of this paper to investigate the performance of popular estimators of dynamic linear panel data models given that some of the usual assumptions underlying them are violated. The sources of misspecification considered are those thought most likely to occur in practice in economic modelling. Specifically, we consider three such instances: correlated idiosyncratic disturbance terms; correlation between these idiosyncratic disturbance terms and explanatory variables; and cross-sectional dependence.2

Naturally, it is always open to the applied researcher to test for the presence of such misspecification. However, given the multitude of available estimators, conflicting evidence across estimators is a near certainty. Moreover, as there are many possible candidates for misspecification, it is likely that such tests will be undertaken within a biased framework. A more attractive alternative would appear to be to select an estimator that is empirically robust to the predominant forms of misspecification. We do, however, also consider the performance of certain specification tests in aiding the applied researcher in her choice of estimator. In general, the performance of these tests in correctly detecting the forms of misspecification considered was poor.

Overall, the results suggest that under the scenario of correlated errors, the AR (?) estimator is an appropriate choice and the WB (?, ?) estimator is preferred when there is cross-sectional dependence. Finally, when the disturbance terms and the explanatory variables are correlated, the simple OLS estimator appears to offer a robust choice.

2 An earlier version of this paper (?) considered static versus dynamic models, and correlations between the observed and unobserved heterogeneity components.


2. The Basic Dynamic Panel Data Model

The focus here is on the family of data generating processes implied by the sample regression function

$$y_{it} = \delta_i y_{i,t-1} + x_{it}'\beta + u_{it}; \quad i = 1, \ldots, N; \; t = 1, \ldots, T; \tag{2.1}$$

$$\phantom{y_{it}} = \tilde{x}_{it}'\gamma + u_{it} \tag{2.2}$$

with

$$u_{it} = \mu_i + \nu_{it} \tag{2.3}$$

$$\nu_{it} = \rho\,\nu_{i,t-1} + \varepsilon_{it}. \tag{2.4}$$

The $\mu_i$ are the unobserved and individual-specific effects and $\nu_{it}$ and $\varepsilon_{it}$ are idiosyncratic disturbance terms. The vector of observed characteristics $x_{it}$ can contain both time-variant and invariant regressors. The usual methods of estimating equation (2.1) when there is no lagged dependent variable (most commonly any of ordinary least squares, OLS; Within; or random effects generalised least squares) are biased and inconsistent when $\delta \neq 0$ (see ?, ?). Consistent estimation of such a model, however, is possible. Typically, such estimators implicitly, or explicitly, make certain assumptions regarding equations (2.1) to (2.4):

1. $\delta_i = \delta, \; \forall i$;
2. $\rho = 0$, such that $\nu_{it} = \varepsilon_{it}$;
3. $E(x_{is}'\nu_{it}) = 0, \; \forall i, t, s$;
4. $E(x_{it}'\mu_i) = 0, \; \forall i, t$; and
5. $E(u_i u_i') = \sigma_\nu^2 I_T + \sigma_\mu^2 J_T$, where $J_T$ is a matrix of ones of size $T$.
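As a small numerical illustration of assumption 5, the error components covariance matrix can be assembled directly. This is only a sketch: the function name `error_components_cov` and the variance values used are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def error_components_cov(T, sigma2_nu, sigma2_mu):
    """Return sigma2_nu * I_T + sigma2_mu * J_T, the T x T covariance of u_i."""
    I_T = np.eye(T)          # idiosyncratic part: uncorrelated over time
    J_T = np.ones((T, T))    # matrix of ones: the individual effect is shared by all periods
    return sigma2_nu * I_T + sigma2_mu * J_T

# Hypothetical values, chosen only for illustration
V = error_components_cov(T=4, sigma2_nu=1.0, sigma2_mu=0.25)
```

Every diagonal element equals the sum of the two variances, while every off-diagonal element equals the variance of the individual effect, which is what induces the within-individual correlation of the composite disturbance.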

That is, focus here is on the relatively simple case of cross-sectional independence and homoskedasticity; the $x$'s are independent of both $u_{it}$ and $\mu_i$; and the composite disturbance term has the usual error components form (?). What we wish to ascertain in this paper is how certain estimators of the linear dynamic panel data model fare when some of the less tenable assumptions above do not, in fact, hold.

2.1. Estimators of the Linear Dynamic Panel Data Model

Under the usual set-up of the linear dynamic panel data model (that is, assumptions 1. to 5. are maintained), many consistent estimators have been proposed (?). Invariably, these are IV (or GMM) type estimators, relying on a set of valid instruments and/or other moment conditions.3 Most of the recent developments have been based on a search for an increase in the instrument set, $z_{it}$, based on further lags of $y_{i,t-1}$ and/or transformations of this. Indeed, depending on the assumptions one is willing to make regarding the relationship between $x_{it}$ and $\nu_{it}$, the list of potential candidates for $z_{it}$ can be huge (?).

The estimators we consider in this paper, which are consistent under assumptions 1. to 5. above, are, following the literature, based upon IV and GMM estimation. Of course, with such a large list of potential candidates, one has to be somewhat selective in the estimators so chosen. Here we focus on those that are: the simplest; and/or the most common; and/or supported by previous evidence of good performance.

A commonly used approach to the estimation of dynamic panel data models where possibly some (or all) elements of $x_{it}$ are correlated with the individual effects is to write the model in first differences so that the individual effects, being time invariant, are discarded, such that

$$\Delta y_{it} = \delta\,\Delta y_{i,t-1} + \Delta x_{it}'\beta + \Delta u_{it} = \delta\,\Delta y_{i,t-1} + \Delta x_{it}'\beta + \Delta\nu_{it}. \tag{2.5}$$

3 Maximum likelihood has also been suggested, but is infrequently used in practice.

Here the lagged endogenous variable (and possibly some other regressors) is correlated with the disturbances, both in finite samples ($E(\Delta\nu_{it} \mid \Delta\tilde{x}_{it}) \neq 0$) and asymptotically ($\operatorname{plim}(\Delta\tilde{x}_{it}'\Delta\nu_{it}/NT) \neq 0$). Due to the assumed error components structure of the disturbances, here we have

$$V(\Delta u_i \mid \Delta\tilde{x}_{it}) = V(\Delta\nu_i \mid \Delta\tilde{x}_{it}) = \sigma_\nu^2\,\Sigma \tag{2.6}$$

with

$$\Sigma = \begin{pmatrix} 2 & -1 & 0 & \cdots & 0 \\ -1 & 2 & -1 & \ddots & \vdots \\ 0 & \ddots & \ddots & \ddots & 0 \\ \vdots & \ddots & -1 & 2 & -1 \\ 0 & \cdots & 0 & -1 & 2 \end{pmatrix}. \tag{2.7}$$
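The tridiagonal structure in (2.7) can be checked numerically. The sketch below builds a first-difference operator and applies it to an error components covariance matrix; the number of periods and the variance values are hypothetical, chosen only for illustration.

```python
import numpy as np

T = 5                                   # number of periods (illustrative)
D = np.diff(np.eye(T), axis=0)          # (T-1) x T operator: each row forms nu_t - nu_{t-1}
Sigma = D @ D.T                         # covariance of differenced iid errors, up to sigma2_nu

sigma2_nu, sigma2_mu = 1.0, 0.25        # hypothetical variances
V_u = sigma2_nu * np.eye(T) + sigma2_mu * np.ones((T, T))   # error components form
V_du = D @ V_u @ D.T                    # covariance of the first-differenced composite error
```

Because each row of the difference operator sums to zero, the individual effect drops out and the covariance of the differenced composite error depends only on the idiosyncratic variance, with 2 on the main diagonal and -1 on the adjacent diagonals.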

We consider three such estimators based on the first-differenced model. The AH estimator (?) has $\Delta y_{i,t-2}$ as an instrument: $\Delta y_{i,t-2}$ $(= y_{i,t-2} - y_{i,t-3})$ is clearly correlated with $\Delta y_{i,t-1}$ $(= y_{i,t-1} - y_{i,t-2})$, but not with $\Delta u_{it}$ $(= \Delta\nu_{it} = \nu_{it} - \nu_{i,t-1})$, the disturbance term of the transformed model, so long as the original disturbances were iid. However, in some instances, use of the instrument $\Delta y_{i,t-2}$ yields inefficient estimators according to ?, who suggests that $y_{i,t-2}$ is preferable (AR). For the same reasons, and under the same conditions, this is once more a valid instrument.

Implementation of both of these estimators is very simple: having $\operatorname{rank}(\Delta\tilde{X}) = \operatorname{rank}(Z)$, where $\Delta\tilde{X}$ and $Z$ represent the matrix stacked versions of $\Delta\tilde{x}_{it}$ and $z_{it}$, the IV estimator is simply given by

$$\hat{\gamma}_{AH/AR} = (Z'\Delta\tilde{X})^{-1} Z'\Delta Y \tag{2.8}$$

where again $\Delta Y$ is the matrix stacked version of $\Delta y_{it}$. However, having $\operatorname{rank}(\Delta\tilde{X}) = \operatorname{rank}(Z)$, these estimators will typically not have finite moments (?).

? proposed an estimator aimed at tackling the drawbacks of both the AH and AR estimators: the small number of instruments (orthogonality conditions) and the serial correlation in the disturbances of the first-differenced model. With regard to the former, ? show that there are many more potential orthogonality conditions implied by the standard linear dynamic panel data model. Consider a panel with five periods of observations, $t = 0, 1, \ldots, 4$. The model can be written as

for $t = 2$: $\; y_{i2} - y_{i1} = \delta(y_{i1} - y_{i0}) + (x_{i2} - x_{i1})'\beta + \nu_{i2} - \nu_{i1}$

for $t = 3$: $\; y_{i3} - y_{i2} = \delta(y_{i2} - y_{i1}) + (x_{i3} - x_{i2})'\beta + \nu_{i3} - \nu_{i2}$

and for $t = 4$: $\; y_{i4} - y_{i3} = \delta(y_{i3} - y_{i2}) + (x_{i4} - x_{i3})'\beta + \nu_{i4} - \nu_{i3}$.

In period $t = 2$ the variable $y_{i0}$ is a valid instrument: it is obviously correlated with $y_{i1} - y_{i0}$ but not with $\nu_{i2} - \nu_{i1}$ (as long as the $\nu_{it}$'s are serially uncorrelated).

Indeed, when $t = 2$, the use of $y_{i0}$ simply gives the AH estimator. When $t = 3$, the instrument proposed by ? is $y_{i1}$. However, given the autoregressive nature of the model, $y_{i0}$ is also a valid instrument here: it is correlated with $y_{i2} - y_{i1}$ while, on the assumption of no serial correlation of the $\nu$'s, it is not correlated with $\nu_{i3} - \nu_{i2}$. This provides two instruments for estimating the model at time $t = 3$. This expansion continues as $t$ grows, yielding the full instrument set as $z_i = (z_i^{(y)}, \Delta x_i)$, where

$$z_i^{(y)} = \begin{pmatrix} y_{i0} & 0 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & y_{i0} & y_{i1} & \cdots & 0 & \cdots & 0 \\ \vdots & & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & y_{i0} & \cdots & y_{i,T-2} \end{pmatrix}, \tag{2.9}$$

and where we simply treat $\Delta x_i$ as its own instrument.4 The associated orthogonality conditions can be written as

$$E(y_{i,t-\tau}\,\Delta\nu_{it}) = 0; \quad t = 2, \ldots, T; \; \tau \geq 2. \tag{2.10}$$

4 Depending on the assumptions one is willing to make regarding $x$ (and elements of it) and its relationship with $\nu$, one may easily redefine the valid instrument set along similar lines as with $y_{it}$ (?).

To further increase efficiency, ? suggest the following linear GMM estimator which takes into account the non-scalar covariance matrix of the disturbance terms resulting from the first-differencing process. The estimator has the form

$$\hat{\gamma}_{AB1} = \left(\Delta\tilde{X}' Z_{AB} (Z_{AB}'\Omega Z_{AB})^{-1} Z_{AB}'\Delta\tilde{X}\right)^{-1} \Delta\tilde{X}' Z_{AB} (Z_{AB}'\Omega Z_{AB})^{-1} Z_{AB}'\Delta Y, \tag{2.11}$$

where $\Omega = I_N \otimes \Sigma$.
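The construction of the lagged-level instrument block in (2.9) and the one-step linear GMM form in (2.11) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `ab_instruments` and `one_step_gmm` are hypothetical, the panel is assumed balanced with observations for $t = 0, \ldots, T$, and the differenced regressors are used as their own instruments.

```python
import numpy as np

def ab_instruments(y_i):
    """Lagged-level instrument block for one individual, in the spirit of (2.9).

    y_i holds y_{i0}, ..., y_{iT}; the row for period t (t = 2, ..., T)
    contains y_{i0}, ..., y_{i,t-2} and zeros elsewhere.
    """
    T = len(y_i) - 1
    Z = np.zeros((T - 1, T * (T - 1) // 2))
    col = 0
    for t in range(2, T + 1):
        Z[t - 2, col:col + t - 1] = y_i[:t - 1]
        col += t - 1
    return Z

def one_step_gmm(y, x):
    """One-step linear GMM on the first-differenced model, as in (2.11).

    y: (N, T+1) array of levels; x: (N, T+1, K) array of regressors.
    Returns the estimates ordered as (delta, beta_1, ..., beta_K).
    """
    N, T1 = y.shape
    T, K = T1 - 1, x.shape[2]
    D = np.diff(np.eye(T), axis=0)
    Sigma = D @ D.T                          # tridiagonal matrix of (2.7)
    dy = y[:, 2:] - y[:, 1:-1]               # differenced y for t = 2, ..., T
    dy_lag = y[:, 1:-1] - y[:, :-2]          # its first lag
    dx = x[:, 2:, :] - x[:, 1:-1, :]         # differenced regressors
    L = T * (T - 1) // 2 + K                 # number of instruments
    ZX, Zy, ZSZ = np.zeros((L, K + 1)), np.zeros(L), np.zeros((L, L))
    for i in range(N):
        Z_i = np.hstack([ab_instruments(y[i]), dx[i]])
        X_i = np.column_stack([dy_lag[i], dx[i]])
        ZX += Z_i.T @ X_i
        Zy += Z_i.T @ dy[i]
        ZSZ += Z_i.T @ Sigma @ Z_i           # sums to Z'(I_N kron Sigma)Z
    W = np.linalg.inv(ZSZ)
    return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)
```

Accumulating the cross-products individual by individual avoids forming the full block-diagonal weighting matrix explicitly, which is a design choice made here for brevity rather than something prescribed by the text.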

? also suggest a variant robust to heteroskedasticity, where the matrix $(Z_{AB}'\Omega Z_{AB})$, with $\Omega$ now being unspecified, is consistently estimated as

$$\widehat{Z_{AB}'\Omega Z_{AB}} = \frac{1}{N}\sum_{i=1}^{N} z_i'\,\Delta\hat{u}_i\,\Delta\hat{u}_i'\,z_i, \tag{2.12}$$

where $z_i$ corresponds to the AB instrument set and $\Delta\hat{u}_i$ are the residuals estimated from $\hat{\gamma}_{AB1}$; $\hat{\gamma}_{AB2}$ is then given by

$$\hat{\gamma}_{AB2} = \left(\Delta\tilde{X}' Z_{AB} \left(\widehat{Z_{AB}'\Omega Z_{AB}}\right)^{-1} Z_{AB}'\Delta\tilde{X}\right)^{-1} \Delta\tilde{X}' Z_{AB} \left(\widehat{Z_{AB}'\Omega Z_{AB}}\right)^{-1} Z_{AB}'\Delta Y. \tag{2.13}$$

Note that such estimators using $y_{i,t-l}$, with $l \geq 1$, as an instrument (and transformations of these) specifically require that the errors in first differences should be MA($q$) with $q < l$.

Most recent (within the last 20 years or so) empirical applications have been dominated by the ? estimator, as the authors have long made available Gauss code to estimate such models, and moreover it is also available in commercial software packages such as Limdep and Stata.

? show that estimation can be based on orthogonality conditions implied by linear transformations of $y_{it}$ defined by the matrix $A_i$, provided $A_i$ conforms to particular restrictions; in particular consistency requires

$$E(z_i'\nu_i) = E(y_+' A_i \nu_i) = \operatorname{tr} A_i E(\nu_i y_+') = 0 \tag{2.14}$$

with $y_+ = (y_{i0}, \ldots, y_{iT})$. ? suggest “operationalising” the estimator from the simple autoregressive model to one with $x$. As the usual $Z'$ pre-multiplication of the model, typically employed for instrumental variable estimation, here removes the unobserved effects, $x_{it}$ can now be considered exogenous with respect to $\mu_i$ even if not prior to such pre-multiplication. This defines $Z_{WB}$ as $z_i = (A_i y_+, x_i)$ and estimation can again be based on conditions of the form (2.14). $A_i$ is unspecified apart from the restrictions of (2.14);5 the variance-covariance matrix of the resulting estimator is a function of $A_i$; the suggested optimal choice of $A_i$ is that which (numerically) minimises the trace of this variance-covariance matrix whilst ensuring that the appropriate restrictions hold. With $A_i$ in hand, the WB1 estimator has the usual (linear) GMM form (as per equation (2.11)). The additional assumption that $x_{it}$ is exogenous with respect to $\nu_{it}$ yields an “expanded” estimator (WB2) which has $z_i = (A_i y_+, A_i x_+, x_i)$ with $x_+$ similarly defined (?).

The problem of weak instruments has been noted by ? with regard to the estimation of the model in first differences using lagged levels of the endogenous explanatory variable as instruments. Indeed, ? show that in certain conditions, as either $\delta \to 1$ or the ratio $\sigma_\mu^2/\sigma_\nu^2 \to \infty$, the correlation between the instrument set and the lagged endogenous regressor tends to 0. This may explain the poor performance of the ? estimator(s) in previous studies. Thus the ? approach consists of adding supplementary orthogonality conditions based on the additional assumption of “quasi-stationarity” of $y_{it}$. This assumption amounts to considering that the initial observations $y_{i0}$ are generated according to

$$y_{i0} = x_{i0}'\beta_0 + \frac{\mu_i}{1-\delta} + \nu_{i0}. \tag{2.15}$$

We now have that

$$E(\Delta y_{i,t-1}\,u_{it}) = 0; \quad t = 2, 3, \ldots, T; \tag{2.16}$$

meaning that lagged first differences of $y_{it}$ are valid instruments in the levels equation. The ? “system GMM” estimator (BB) stacks the model in first differences with that in levels and estimates this system using linear GMM with ? instruments for the first differenced partition of the equation, and those implied by equation (2.16) for the levels partition.

5 See also ? for further restrictions on the structure of $A_i$.

As shown above, many orthogonality conditions are implied by the standard assumptions; however, estimators are (generally) only based upon subsets of these (see, inter alia, ?, ?, ?, ?, ?, ?). The essence of nonlinear GMM estimation involves explicit exploitation of theoretical moment conditions which, for estimation purposes, are replaced by their sample counterparts (?, ?). Here we use the orthogonality conditions reported in ? (assumptions 1. to 5. above) and also consider two nonlinear GMM estimators: GMMW (which uses the asymptotically efficient weighting matrix of ?) and GMMI (which uses the identity matrix).

In summary, we do not consider all such estimators of the linear dynamic panel data model. Indeed, depending on what assumptions one is willing to make regarding the exogeneity of $x$ and subsets of it, how initial conditions are generated and whether to also consider maximum likelihood and bias adjusted estimation (?), there is a plethora of potential candidates (?). The estimators we do consider are, first, some simple ones that are inconsistent in the usual dynamic panel data setting even where there is no misspecification: ordinary least squares (OLS); the fixed effects (Within) estimator; random effects GLS (FGLS); and ordinary least squares on the first differenced model (DOLS). We then consider several IV estimators based on the first-differenced model: ?, AH; ?, AR; and ?, AB1 and AB2. Finally, we consider estimators based on a generic model transformation (?, ?), WB1 and WB2; the ? “system GMM” estimator (BB); and nonlinear GMM, GMMI and GMMW.
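To make the contrast between the simple estimators and an IV estimator of the form (2.8) concrete, the following sketch applies four of them to a simulated purely autoregressive panel. Everything here is an illustrative assumption rather than the paper's design: the function names, the absence of $x$, the stationary start-up of $y_{i0}$, and the use of the level $y_{i,t-2}$ as the single instrument.

```python
import numpy as np

def simulate_ar1_panel(N, T, delta, sigma_mu, rng):
    """y_it = delta * y_{i,t-1} + mu_i + nu_it, t = 1..T, stationary start (assumed)."""
    mu = rng.normal(scale=sigma_mu, size=N)
    y = np.empty((N, T + 1))
    y[:, 0] = mu / (1 - delta) + rng.normal(size=N) / np.sqrt(1 - delta**2)
    for t in range(1, T + 1):
        y[:, t] = delta * y[:, t - 1] + mu + rng.normal(size=N)
    return y

def ols(y):
    """Pooled OLS of y_it on y_{i,t-1} (no intercept; the simulated data have mean zero)."""
    y0, y1 = y[:, :-1].ravel(), y[:, 1:].ravel()
    return (y0 @ y1) / (y0 @ y0)

def within(y):
    """Fixed effects (Within) estimator: OLS after removing individual means."""
    y0 = y[:, :-1] - y[:, :-1].mean(axis=1, keepdims=True)
    y1 = y[:, 1:] - y[:, 1:].mean(axis=1, keepdims=True)
    return (y0 * y1).sum() / (y0 ** 2).sum()

def dols(y):
    """OLS on the first-differenced model."""
    dy = np.diff(y, axis=1)
    return (dy[:, :-1] * dy[:, 1:]).sum() / (dy[:, :-1] ** 2).sum()

def iv_level(y):
    """Just-identified IV on first differences, instrumenting with y_{i,t-2}."""
    dy = np.diff(y, axis=1)          # column j holds y_{j+1} - y_j
    z = y[:, :-2]                    # y_{i,t-2} for t = 2, ..., T
    return (z * dy[:, 1:]).sum() / (z * dy[:, :-1]).sum()
```

Called on a panel from `simulate_ar1_panel`, the first three functions should display the well-known inconsistency of OLS, Within and DOLS in a dynamic model, while the IV estimate should be close to the true coefficient when $N$ is large.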

3. The Experiments

It is likely that some of the assumptions made in the previous section will not hold in practice. We therefore consider three violations of these, namely: instigating autocorrelation into $\nu_{it}$; allowing $x$ to be endogenous with respect to $\nu_{it}$; and finally, allowing for cross-sectional dependence.

To analyse the small sample performance of these various estimators under these scenarios of “misspecification”, a set of Monte Carlo experiments is conducted. Following ? the basic assumed data generating process (under the null hypothesis) is of the form

$$y_{it} = \delta y_{i,t-1} + x_{it}^{(1)}\beta_1 + x_{it}^{(2)}\beta_2 + \mu_i + \nu_{it}, \tag{3.1}$$

where $\mu_i \sim \text{i.i.d. } N(0, \sigma_\mu^2)$ and $\nu_{it} \sim \text{i.i.d. } N(0, \sigma_\nu^2)$; $\sigma_\nu^2 = 1$ and $\sigma_\mu^2$ is determined by $(1-\delta)$. For all the experiments, three different values of $\delta$ are considered ($\delta = 0.1$, $0.5$ and $0.9$) and $\beta_1 = \beta_2 = 0.5$. The variables in $x_{it}^{(1)}$ and $x_{it}^{(2)}$ are generated as

$$x_{it}^{(k)} = \rho_x^{(k)} x_{i,t-1}^{(k)} + N(0, 1) \tag{3.2}$$

where $\rho_x^{(1)} = 0.8$ and $\rho_x^{(2)} = 0$. The initial conditions are set as

$$y_{i0} = \frac{\mu_i}{1-\delta} + \frac{x_{i0}^{(1)}\beta_1}{1-\delta} + \frac{x_{i0}^{(2)}\beta_2}{1-\delta} + \frac{\nu_{i0}}{(1-\delta^2)^{1/2}} \tag{3.3}$$

where the coefficients of the exogenous variables equal their long-run values. As in ?, ? and ?, we set $N = 100$, $T = 6$, and the number of Monte Carlo repetitions $M = 1000$.

It is extremely important in any Monte Carlo analysis to pay close attention to the signal-to-noise ratio (see, among others, ?, ?). Under the null hypothesis the signal-to-noise ratio is given by

$$\frac{\delta^2\sigma^2_{y_{i,t-1}} + \beta_1^2\sigma^2_{x_{it}^{(1)}} + \beta_2^2\sigma^2_{x_{it}^{(2)}}}{\sigma_\mu^2 + \sigma_\nu^2}. \tag{3.4}$$
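The null design (3.1) to (3.3) and a sample analogue of the ratio (3.4) can be sketched as below. Two ingredients are assumptions made only for this illustration, since the text does not pin them down: the standard deviation of the individual effect is set to $1-\delta$, and the regressors are started from their stationary distribution. The function names are hypothetical.

```python
import numpy as np

def simulate_null_dgp(N=100, T=6, delta=0.5, beta=(0.5, 0.5),
                      rho_x=(0.8, 0.0), rng=None):
    """Simulate (3.1)-(3.3) with sigma_nu = 1; returns y (N, T+1), x (N, T+1, 2), mu (N,)."""
    rng = np.random.default_rng() if rng is None else rng
    b, r = np.asarray(beta, float), np.asarray(rho_x, float)
    sigma_mu = 1.0 - delta                                  # ASSUMPTION, see lead-in
    mu = rng.normal(scale=sigma_mu, size=N)
    x = np.empty((N, T + 1, 2))
    x[:, 0, :] = rng.normal(size=(N, 2)) / np.sqrt(1 - r**2)   # ASSUMPTION: stationary start
    for t in range(1, T + 1):
        x[:, t, :] = r * x[:, t - 1, :] + rng.normal(size=(N, 2))       # (3.2)
    y = np.empty((N, T + 1))
    y[:, 0] = ((mu + x[:, 0, :] @ b) / (1 - delta)
               + rng.normal(size=N) / np.sqrt(1 - delta**2))            # (3.3)
    for t in range(1, T + 1):
        y[:, t] = delta * y[:, t - 1] + x[:, t, :] @ b + mu + rng.normal(size=N)  # (3.1)
    return y, x, mu

def signal_to_noise(y, x, delta, beta, sigma2_mu, sigma2_nu=1.0):
    """Sample analogue of (3.4), using pooled sample variances."""
    b = np.asarray(beta, float)
    signal = delta**2 * y[:, :-1].var() + (b**2 * x[:, 1:, :].var(axis=(0, 1))).sum()
    return signal / (sigma2_mu + sigma2_nu)

y, x, mu = simulate_null_dgp(rng=np.random.default_rng(0))
ratio = signal_to_noise(y, x, delta=0.5, beta=(0.5, 0.5), sigma2_mu=0.25)
```

Holding such a ratio fixed across experiments then amounts to re-solving for the disturbance variances whenever a misspecification parameter changes.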

To ensure the results from the Monte Carlo simulations are comparable across all experiments, the signal-to-noise ratios for all the following experiments are kept the same as under the null by adjusting the variance of the disturbance terms.

3.1. Misspecification: The Scenarios

In general, estimators relying on differing assumptions are likely to be differently affected by the various misspecification scenarios. Firstly consider assumptions 2 and 5, which are both violated if the idiosyncratic disturbances are autocorrelated. Here we let the disturbances follow an AR(1) process such that

$$\nu_{it} = \rho\,\nu_{i,t-1} + \varepsilon_{it} \tag{3.5}$$

where $\varepsilon_{it} \sim \text{iid } N(0, \sigma_0^2)$, and $\rho = 0.1$, $0.5$ and $0.9$. Such a situation could well arise if, for example, the dynamics of the model have been misspecified or if there is an erroneously omitted explanatory variable. As all of the estimators are based on these two key assumptions, they are all likely to be affected. Those based on conditions such as 3 and 4 will still be consistent (but inefficient). On the other hand, if assumption 2, in particular, is violated, estimators based on it will be inconsistent (AR, AH, AB and WB).

The next scenario considered is to allow for correlation between the explanatory variables and the idiosyncratic disturbance terms. Such a situation is likely to occur frequently in practice, for example, as a result of a simultaneous system of equations.6 Such a correlation is instigated by replacing equation (3.2) with

$$x_{it}^{(k)} = \rho_x^{(k)} x_{i,t-1}^{(k)} + \theta\,\nu_{it} + N(0, \sigma_k^2), \quad k = 1, 2. \tag{3.6}$$
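The two scenarios in (3.5) and (3.6) can be generated along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the AR(1) disturbance is started from its stationary distribution, and the start-up value of the regressor is a simplification chosen here, none of which is specified in the text.

```python
import numpy as np

def ar1_disturbances(N, T, rho, sigma2_0, rng):
    """nu_it = rho * nu_{i,t-1} + eps_it with eps_it ~ N(0, sigma2_0), as in (3.5)."""
    eps = rng.normal(scale=np.sqrt(sigma2_0), size=(N, T + 1))
    nu = np.empty((N, T + 1))
    nu[:, 0] = eps[:, 0] / np.sqrt(1 - rho**2)      # ASSUMPTION: stationary start
    for t in range(1, T + 1):
        nu[:, t] = rho * nu[:, t - 1] + eps[:, t]
    return nu

def endogenous_x(nu, rho_x, theta, sigma2_k, rng):
    """x_it = rho_x * x_{i,t-1} + theta * nu_it + N(0, sigma2_k), as in (3.6)."""
    N, T1 = nu.shape
    sd = np.sqrt(sigma2_k)
    x = np.empty((N, T1))
    x[:, 0] = theta * nu[:, 0] + rng.normal(scale=sd, size=N)   # ASSUMPTION: simple start-up
    for t in range(1, T1):
        x[:, t] = rho_x * x[:, t - 1] + theta * nu[:, t] + rng.normal(scale=sd, size=N)
    return x
```

In the first function the autoregressive parameter controls the degree of serial correlation; in the second, the loading on the disturbance controls how strongly the regressor is correlated with it.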

6 In a previous version of this paper, correlations between $x$ and $\mu_i$ were considered. However, as pointed out by an anonymous referee, this is less interesting than the current case as estimators operating in first differences effectively eliminate this correlation by eliminating the $\mu_i$.

Again the strength of these “misspecification correlations” was varied from “weak” ($\theta = 0.1$), to “medium” ($\theta = 0.5$), to “strong” ($\theta = 0.9$). As previously stated, these sources of misspecification are chosen as they are deemed to be the most likely to occur in applied work and should have differing effects across estimators. Note that the nonlinear GMM estimators should be adversely affected by all of these sources of misspecification, as they explicitly rely on all of the assumptions in the pursuit of asymptotic efficiency.

Finally, as there has been significant recent interest in cross-sectional dependence in panel data (see, for example, ?), we also investigate the scenario of cross-sectional dependence. Indeed, we follow the ? approach by letting

$$y_{it} = \delta y_{i,t-1} + x_{it}^{(1)}\beta_1 + x_{it}^{(2)}\beta_2 + u_{it} \tag{3.7}$$

with

$$u_{it} = \lambda\,\phi_t + N(0, \sigma_u^2) \tag{3.8}$$

where $\phi_t \sim \text{iid } N(0, 1)$ over $t$. The variance of $\phi_t$ is set to 1 for identification purposes. In this specification, $\phi_t$ is the source of the cross-sectional dependence and the strength of the dependence is measured by $\lambda$: the null hypothesis has $\lambda = 0$. Following ?, this paper considers two levels of cross-sectional dependence: “low”, where $\lambda \sim U[0, 0.2]$; and “high”, where $\lambda \sim U[1, 4]$.7
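A common-factor disturbance of the kind in (3.8) can be simulated as follows. The sketch assumes that one loading is drawn per individual from the stated uniform range, which is an interpretation rather than something the text states explicitly; the function names are hypothetical.

```python
import numpy as np

def factor_errors(N, T, low, high, sigma2_u, rng):
    """Disturbances with a single common factor: loading * factor_t + noise."""
    loading = rng.uniform(low, high, size=N)       # ASSUMPTION: one loading per individual
    factor = rng.normal(size=T)                    # common shock with unit variance
    noise = rng.normal(scale=np.sqrt(sigma2_u), size=(N, T))
    return loading[:, None] * factor[None, :] + noise

def mean_pairwise_corr(u):
    """Average correlation between the disturbance series of distinct individuals."""
    C = np.corrcoef(u)                             # N x N, rows treated as variables
    N = C.shape[0]
    return (C.sum() - N) / (N * (N - 1))
```

With loadings drawn from the “low” range the average pairwise correlation across individuals should be close to zero, whereas the “high” range should produce strongly correlated disturbances.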

As noted above, it is extremely important when undertaking Monte Carlo analyses to maintain a constant signal-to-noise ratio, as otherwise it is not possible to ascertain whether changes in estimator performance are a result of changing the parameters of the experiment, or of the signal-to-noise ratio. So here this entails varying the variance parameters $\sigma_u^2$ (variance of the general error component in the presence of cross-sectional dependence); $\sigma_0^2$ (variance of the error term in the scenario of AR(1) errors); $\sigma_1^2$ (variance of $x_1$) and $\sigma_2^2$ (variance of $x_2$). Average values and standard deviations across the respective experiments for $\sigma_u^2$, $\sigma_0^2$, $\sigma_1^2$ and $\sigma_2^2$ are reported in Table 3.1.8

7 Under the null hypothesis of no cross-sectional dependence, this model does not reduce to a standard random effects model. However, when we include an individual effect in the model and do the Monte Carlo simulations, the results are essentially unchanged.

8 Numerical methods based on a constrained optimization algorithm were employed to achieve this.

Table 3.1: The Setting of the Variances for Maintaining Signal-to-Noise Ratio

                                               δ = 0.1            δ = 0.5            δ = 0.9
Scenario               Variance  Setting    Average   s.d.     Average   s.d.     Average   s.d.
AR(1) disturbances     σ²_0      ρ = 0.1     1.00    0.004      1.01    0.005      1.00    0.005
                                 ρ = 0.5     0.87    0.023      0.96    0.029      0.91    0.030
                                 ρ = 0.9     0.73    0.026      0.64    0.061      0.54    0.062
Cross-sectional        σ²_u      Low         1.341   0.049      1.044   0.033      0.991   0.017
dependence                       High        0.523   0.384      0.339   0.283      0.578   0.426
x and ν correlated     σ²_1      θ = 0.1     0.993   0.020      0.949   0.015      0.934   0.035
                                 θ = 0.5     0.867   0.054      0.078   0.175      0.379   0.244
                                 θ = 0.9     0.391   0.130      0.032   0.050      0.140   0.188
                       σ²_2      θ = 0.1     0.994   0.025      0.977   0.021      0.892   0.0933
                                 θ = 0.5     0.815   0.181      0.800   0.117      0.422   0.170
                                 θ = 0.9     0.366   0.184      0.072   0.097      0.366   0.200

4. Monte Carlo Results

Due to the large number of results, they are presented in graphical form in Figures 4.1 to 4.9 and focus is on the estimation of $\delta$ (however, the results for $\beta_1$ and $\beta_2$ are also reported).9 As simulation results can be affected by the non-existence of moments (?, ?), the results are reported in the form of median parameter estimates over the $M$ repetitions. Also reported are the inter-quartile range and absolute range (the latter is sometimes truncated depending on its value(s) and consequential distortion to the graphic). The graphics also contain the estimators' performance under the null hypothesis, that is, when there is no misspecification. For each estimator, the results are presented in the order of no misspecification through to “high” levels of such.10 For ease of comparison across all graphics, the scales of these are held constant across all experiments.

9 For the applied economist the quantity $\hat{\beta}_k/(1-\hat{\delta})$ may also be of interest. For space reasons these are not reported; however, they can easily be deduced from $\hat{\beta}_k$ and $\hat{\delta}$. The full set of results is available from the authors on request.

10 To obtain the GMM estimators under high levels of misspecification, some of the GMM conditions used in the Monte Carlo simulations had to be changed for GAUSS to run.
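The summary measures used for reporting (median, inter-quartile range and absolute range over the repetitions) can be computed with the standard library alone. The function name `summarise` and the example values are hypothetical; the quartiles use the "inclusive" interpolation rule, which is one of several reasonable conventions and not necessarily the one used by the authors.

```python
import statistics

def summarise(estimates):
    """Median, inter-quartile range and absolute range of Monte Carlo estimates."""
    data = sorted(float(e) for e in estimates)
    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"median": med, "iqr": q3 - q1, "range": data[-1] - data[0]}

# Hypothetical estimates: one extreme replication barely moves the median,
# although it dominates the absolute range.
summary = summarise([0.4, 0.5, 0.6, 0.5, 50.0])
```

This is the practical reason for preferring the median when an estimator may lack finite moments: occasional extreme replications affect the range, and would affect a mean, far more than they affect the median or the inter-quartile range.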


4.1. Autocorrelated Disturbances

The results for AR(1) disturbances are presented in Figures 4.1 to 4.3. Note that here (and elsewhere) under the null hypothesis (the first data-point for each estimator), the known biases of the traditional estimators are evident, as is the consistent behaviour of the IV/GMM ones. In line with previous evidence these latter estimators are all essentially unbiased, although their performance varies somewhat with regard to variability: in particular, AH has a very variable performance (see, for example, ?). As these null hypothesis results are the same across all scenarios, they are not discussed again.

Autoregressive residuals rule out the use of past values of $y$ as a valid instrument. However, as $\rho$ tends to unity, $\Delta u_{it}$ converges to a white noise series, such that the performance of the first difference estimators is likely to be “n shaped” in $\rho$. That is, as $\rho \to 1$, lagged values of $y$ and $\Delta y$ are once more valid instruments in the first differenced model. Indeed, with $\delta = 0.1$ (Figure 4.1) this does appear to be the case, especially so (with respect to $\delta$) for the AH estimator, although this estimator exhibits a rather erratic performance. Of the remaining estimators (AR, AB, BB, WB and GMM), performance is relatively even across-the-board: in particular they all exhibit a sharp decrease in performance from mild to medium levels of autocorrelation. Overall, given lower levels of bias and variation, the BB estimator would appear to be the preferred IV/GMM estimator here. However, as $\rho \to 1$ DOLS is an obvious, and simple, alternative.

With regard to the estimation of $\beta_1$ (Figure 4.1), most estimators exhibit low bias and volatility, even at high levels of autocorrelation. The OLS, WB1 and GMM estimators have the worst performance in terms of bias and volatility while the DOLS, AH, AR and AB estimators are essentially unbiased for all levels of autocorrelation, and with levels of variability that decline with $\rho$.

Finally, the results for $\beta_2$ (Figure 4.1) show that most estimators have no bias at all, even at high levels of autocorrelation. Apart from the AH estimator, which shows higher variation as $\rho \to 1$, the levels of variability of the remaining estimators decline with $\rho$. Overall, the simple OLS estimator performs very well.

Increasing $\delta$ to 0.5 has a drastic effect on most of the estimators' performance (Figure 4.2). In particular, with regard to estimation of $\delta$, all estimators become much more biased and volatile, especially as $\rho$ increases. That is, overall, biases tend to increase across-the-board. The performance of most of the estimators again tends to be n shaped in $\rho$. All of the AH, AR and WB estimators can be disregarded here, primarily due to their excessive volatility, especially at stronger levels of autocorrelation. Of the remaining estimators (AB, BB and GMM) performance is relatively similar and stable, with only the BB estimator's performance worsening in the strength of autocorrelation in the residuals. However, being based solely on the first differenced model, unlike its remaining “competitors”, the AB estimator benefits from high values of $\rho$. The Within and DOLS estimators also perform quite well at high levels of autocorrelation.

The results for $\beta_1$ are very different. Here the WB estimators exhibit high volatility while the GMM estimators show more bias as $\rho \to 1$. The variability of most of the estimators declines as the level of autocorrelation increases. Overall, the Within, DOLS and AB estimators perform the best.

As far as the estimation of $\beta_2$ is concerned, the performance of the AH estimator is quite erratic and the AR and WB estimators perform poorly as well. Overall, the simple OLS, Within and FGLS estimators display the least bias and volatility.

When $\delta$ is increased to 0.9 (Figure 4.3), the bias and volatility of most of the estimators are lower than when $\delta = 0.5$. The performance of the AH, AB, WB1 and WB2 estimators again tends to be n shaped in $\rho$. The AH, AR and WB1 estimators display very high levels of volatility while the GMM estimators perform quite well over the range of $\rho$. Overall, given lower levels of bias and variation, the GMM and AB estimators would appear to be the preferred estimators here. However, as $\rho \to 1$, DOLS or OLS is once again a simpler alternative due to its improved performance. This is not surprising, as these estimators are known to be consistent here as $\rho \to 1$.

Estimation of $\beta_1$ and $\beta_2$ closely mirror each other across the different estimators. In both cases, the AB estimator performs very well, especially at higher levels of autocorrelation. The simple estimators (OLS, Within, FGLS) also fare well, even when there is high autocorrelation. With regard to the estimation of $\beta_1$, the WB1 estimator has the most bias and volatility, whilst the GMM estimators appear quite volatile with respect to the estimation of $\beta_2$.

In summary, for all ranges of autocorrelation in the residuals and all values of $\delta$ considered, with regard to $\hat{\delta}$, the AB estimator appears an appropriate choice. However, across all scenarios as $\rho \to 1$ DOLS has a very stable and essentially unbiased performance; as $\rho \to 1$ the correlation between $\Delta y_{i,t-1}$ and the transformed disturbance term is reduced. With regard to the estimation of $\beta_1$ and $\beta_2$, either the AB or one of the simple OLS estimators seems to be an appropriate choice. However, across all scenarios and all coefficients, a robust choice is the WB2 estimator.

4.2. Testing Procedures

The purpose of this paper is to offer some advice to applied researchers as to which estimator is a robust choice against a range of misspecified models. However, to aid this choice we can also consider some existing tests that are readily available in the literature.11 With regard to testing for autocorrelated disturbances, we can use the $m_2$ test statistic as suggested by ?, for the first-differenced estimators. Table 4.1 contains rejection probabilities of no (second order) serial correlation. The test was conducted as a two-sided variant, at nominal size 5% (as were all subsequent tests). For both the AH and AR estimators, this test has low power, reaching a peak of only 37.2% (for the AR estimator, with $\delta = 0.5$ and $\rho = 0.5$). It performs marginally better for the AB estimator, but seems best at picking up medium levels of autocorrelation. The test has extremely good performance for the BB estimator, especially for $\delta = 0.1$ and $\delta = 0.9$.

11 In order to aid the applied researcher, we only consider tests currently available and implemented in standard software packages. Note also, as before, if different variants of an estimator have very similar performance, only the results of one are reported.

Table 4.1: AR(2) Test Rejection Probabilities for the First Differenced Estimators

             = 0.1                  = 0.5                  = 0.9
        0.1    0.5    0.9     0.1    0.5    0.9     0.1    0.5    0.9
AH    0.046  0.049  0.037   0.013  0.033  0.034   0.054  0.066  0.052
AR    0.059  0.153  0.174   0.083  0.372  0.167   0.097  0.292  0.11
AB    0.065  0.177  0.189   0.099  0.422  0.230   0.109  0.393  0.077
BB    0.691  0.598  0.992   0.099  0.123  0.769   0.069  0.931  1.000

Table 4.2: Sargan-Hansen Rejection Probabilities: Autocorrelated Disturbances

               = 0.1                  = 0.5                  = 0.9
          0.1    0.5    0.9     0.1    0.5    0.9     0.1    0.5    0.9
AB1     0.060  0.067  0.072   0.089  0.434  0.323   0.103  0.685  0.198
AB2     0.047  0.057  0.055   0.068  0.391  0.231   0.074  0.587  0.111
BB      0.047  0.240  0.722   0.066  0.829  1.000   0.084  0.888  1.000
WB2     0.081  0.782  0.730   0.421  0.692  0.711   0.071  0.395  0.305
GMMI    0.000  0.000  0.002   0.000  0.000  0.004   0.000  0.000  0.000
GMMW    0.000  0.000  0.000   0.001  0.000  0.000   0.005  0.000  0.000

We can also consider the Sargan (Hansen) test (?, ?, ?) in terms of instrument validity. Being defined only for over-identifying moment conditions, this test is not defined for those IV estimators where the total number of instruments equals the total number of (presumed) endogenous variables (thus it is not defined for any of AH, AR and WB1). From Table 4.2, it can be seen that this test is only useful in determining misspecification in the following cases: = 0.5 and = 0.9 with medium (0.5) and high (0.9) autocorrelation for estimators AB1, AB2, BB and WB2; and BB and WB2 with = 0.1 and medium and strong values of autocorrelation. In summary of the autocorrelation results, it appears that the AB estimator has the best across-the-board performance. However, reasonable performance is also afforded by the DOLS estimator as → 1.
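The point that the Sargan (Hansen) test exists only for over-identified estimators can be illustrated by counting over-identifying restrictions. The sketch below assumes the usual degrees-of-freedom rule, instruments minus estimated parameters; the function name and the counts in the example are hypothetical and are not taken from the paper's experiments.

```python
def overidentifying_restrictions(n_instruments, n_parameters):
    """Degrees of freedom of a Sargan/Hansen-type test.

    Returns None when the model is exactly identified, in which case
    the test statistic is not defined.
    """
    if n_instruments < n_parameters:
        raise ValueError("model is under-identified")
    degrees_of_freedom = n_instruments - n_parameters
    return degrees_of_freedom if degrees_of_freedom > 0 else None


# Hypothetical counts: a just-identified IV estimator and an over-identified GMM one.
just_identified = overidentifying_restrictions(3, 3)    # no test available
over_identified = overidentifying_restrictions(10, 3)   # 7 restrictions to test
```

An estimator that uses exactly as many instruments as parameters falls in the first case, which is why no Sargan rejection probabilities are reported for AH, AR and WB1.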

4.3. Explanatory Variables and Disturbance Terms Correlated

The results for when $x_{it}$ is exogenous with respect to the individual effects but correlated with the idiosyncratic disturbance terms are reported in Figures 4.4 to 4.6. When = 0.1 (Figure 4.4), the AH, WB and GMM estimators are essentially unbiased and also seem to be unaffected by the extent of the correlation. However, the variance of the AH estimator again makes it less attractive, whilst that of the WB and GMM ones decreases with the extent of correlation. The AH, AR, AB and BB estimators exhibit an n-shaped behaviour in the extent of the correlation whilst that of the OLS one worsens. Overall, however, here one would probably favour the less biased but more volatile WB estimators over the marginally more biased and less volatile AR, AB and BB estimators. Both ^ 1 and ^ 2 are adversely affected by increased correlation between x and the disturbance term, more so for the latter. 2 is poorly estimated by all estimators when the correlation is high. Moreover, across all scenarios here, there is really little to choose amongst all of the estimators. However, bias and variability are significantly lower for all of the WB and GMM variants with regard to ^ 1. Apart from OLS, all estimators fare very well at small values of this correlation.

Increasing to 0.5 appears to worsen estimators' performance across-the-board (Figure 4.5), especially with regard to ^ 1 and ^ 2. For estimation of , the rise from = 0.1 to 0.5 appears to increase volatility across-the-board whilst also worsening bias properties. Indeed, most of them exhibit an "n-shaped" performance in the extent of the correlation and most estimators are unbiased when = 0.9.

Once more, the AH estimator has a very high degree of variability while both the WB and GMM estimators perform quite well with regard to both bias and volatility. Both the Within and DOLS estimators perform quite well at high levels of correlation. However, taking into account both bias and volatility, the WB2 estimator seems to perform the best overall. When estimating 1, all estimators suffer a significant decline in performance when is increased from 0.1 to 0.5, with the OLS, WB and GMM estimators showing relatively less bias than the rest. However, all IV/GMM estimators are essentially unbiased when = 0.9. With regard to ^ 2, the performance of all the estimators is severely affected when increases from 0.5 to 0.9, with the OLS estimator being most biased and the AH estimator being most volatile overall. At low levels of correlation, most of the estimators are essentially unbiased.

Finally, with = 0.9 (Figure 4.6) both the AH and WB1 estimators display excessive variability while the OLS and BB estimators have nearly no volatility at all. The OLS and BB estimators are also unbiased at all levels of correlation. The performance of all the first difference estimators is n-shaped in , with the AB estimator displaying the best performance out of the three. However, the complete absence of bias and volatility of the BB estimator makes it the most attractive when = 0.9 (or, indeed, simple OLS). Estimation of 1 is adversely affected at medium and high levels of correlation, with most estimators having significant bias and volatility. The WB1 estimator has the most erratic performance whereas the WB2 estimator performs very well for all values of . Indeed, the OLS and BB estimators also perform quite well when the level of correlation is at its highest. Increasing to 0.9 has also worsened estimators' performance across-the-board with regard to ^ 2. Once again, the WB1 estimator displays excessive volatility but is surprisingly unbiased when = 0.9. In most cases, volatility seems to be at its highest when = 0.1 and decreases gradually as the level of correlation increases. Finally, at medium and high levels of correlation, the DOLS estimator has the lowest bias.

In summary, the robustness of estimators is once more heavily dependent on . In the estimation of , at low and medium values of the WB estimators perform very well while a high value sees the BB and simple OLS estimators having the best performance with no bias and no volatility. With regard to ^ 1 and ^ 2, the appropriate choice is not clear-cut since the performance of most estimators varies a lot across and . Once more, however, with generally smaller levels of both bias and volatility across all parameter values and all parameters, WB2 has a very robust performance.
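The comparisons above rest on two summary measures per estimator: bias and volatility across Monte Carlo replications. A minimal sketch of how such measures might be computed is given here, assuming that volatility is measured by the sample standard deviation of the estimates (the text does not state the exact measure) and using invented estimates; the root mean squared error is added only as a convenient way of combining the two.

```python
from math import sqrt
from statistics import mean, stdev


def bias_and_volatility(estimates, true_value):
    """Summarise replicated estimates of a single parameter.

    Assumption: volatility is the sample standard deviation of the estimates.
    """
    bias = mean(estimates) - true_value
    volatility = stdev(estimates)
    rmse = sqrt(mean([(e - true_value) ** 2 for e in estimates]))
    return bias, volatility, rmse


# Hypothetical estimates of a parameter whose true value is 0.5.
example_bias, example_volatility, example_rmse = bias_and_volatility(
    [0.4, 0.5, 0.6], true_value=0.5
)
```

An estimator can be essentially unbiased and still unattractive if its volatility is large, which is the trade-off described for the AH estimator.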

Table 4.3: Sargan Tests: Observed and Unobserved Heterogeneity Correlated

               = 0.1                  = 0.5                  = 0.9
          0.1    0.5    0.9     0.1    0.5    0.9     0.1    0.5    0.9
AB1     0.054  0.081  0.208   0.067  0.360  0.187   0.069  0.784  0.865
AB2     0.040  0.066  0.158   0.049  0.298  0.154   0.055  0.746  0.852
BB      0.048  0.026  0.528   0.049  0.928  0.619   0.052  0.894  0.892
WB2     0.061  0.247  0.324   0.053  0.489  0.65    0.058  0.171  0.408
GMMI    0.000  0.000  0.427   0.000  0.012  0.983   0.000  0.039  0.167
GMMW    0.000  0.000  0.001   0.001  0.000  0.006   0.004  0.008  0.003

4.3.1. Testing Procedures

Table 4.3 contains the empirical rejection probabilities of the Sargan/Hansen statistic. Significant levels of power are found for the AB, BB and WB2 estimators. However, this statistic does little to aid the applied researcher when = 0.1 (for all values of ) and when GMM estimators are involved. Reassuringly, overall, the power of the test improves at high levels of correlation between x and the disturbance term (high ).

4.4. Cross-Sectional Dependence

The results for the cross-sectional dependence scenario are presented in Figures 4.7 to 4.9. Focussing on ^, when = 0.1 (Figure 4.7), the simple estimators show high levels of bias while the remaining estimators (AH, AR, AB, BB, WB and GMM) are all unbiased when there is a low degree of cross-sectional dependence (U[0, 0.2]). Raising the extent of cross-sectional dependence (U[1, 4]) severely affects the estimators' performance either in terms of bias (most notably the GMM ones, but also AB and BB), or volatility (the remaining estimators plus both AB and BB again). Once more, this volatility appears to be the most significant for the AH (and also the FGLS) estimator(s). The GMM estimators are severely biased at these high levels of cross-sectional dependence, but very stable. Of the "differenced" estimators, the AB and BB estimators are the most severely biased. In this situation, the researcher would probably choose an estimator which is less biased but more volatile. In this respect, the AR and WB estimators perform quite well since they are all unbiased and only display excessive variability when is high. With regard to the estimation of both ^ 1 and ^ 2, all the estimators exhibit very similar behaviour across these two parameters. All appear to be essentially unbiased at all levels of dependence (with the exception of the OLS and GMM ones). However, rising levels of dependence appear to adversely affect the volatility of all estimators and, indeed, for the GMM ones, bias also. Overall, the AB estimator performs the best. Over all parameters and dependence levels, once more the WB2 estimator is essentially unbiased and only suffers from excessive volatility at high levels of dependence.

Increasing to 0.5 (Figure 4.8) does not appear to strongly affect estimator performance in estimation of : at low levels of dependence, most estimators are unbiased; but for high levels variability dramatically increases (AH, AR, BB and WB) and bias becomes evident (GMM). Interestingly, the performance of the simple FGLS and DOLS estimators slightly improves. Once again, the FGLS and AH estimators are the most variable and the GMM estimators have the highest level of bias. Overall, the simple OLS estimator appears a suitable choice in terms of both bias and variability, if there is any extent of cross-sectional dependence. Estimation of 1 and 2 does not appear to be strongly affected by raising from 0.1 to 0.5, and the previous discussion applies once more. Across all parameters, once more the WB2 estimator appears to provide a robust choice.

Finally, with = 0.9 (Figure 4.9), most of the estimators are essentially unbiased for , for both low and high levels of misspecification (the only significantly biased estimators are the Within, FGLS and DOLS estimators). While the volatility of these estimators increases greatly for the high cross-sectional dependence scenario, there is a slight improvement with regard to their bias. Both the AH and AR estimators display relatively high variability at all levels of misspecification. The BB and GMM estimators perform very well, but as with most previous experiments, the simple OLS estimator has good performance whenever = 0.9. As was the case in previous scenarios, the behaviour of ^ 1 and ^ 2 is quite different from the behaviour of ^. Once again, the GMM estimators are biased as the level of misspecification increases (and moreover, appear quite volatile) while the rest of the estimators are essentially unbiased. With regard to volatility, the AH, AR, WB and GMM estimators are relatively more volatile while the simple (OLS, Within, FGLS and DOLS) estimators seem to fare very well in this respect (as well as with respect to bias properties). Overall with = 0.9, the OLS estimators can be considered an obvious choice; of the IV/GMM estimators, the BB, AB or WB2 estimators (in that order) would probably be chosen.

For overall performance in the estimation of for both medium and high values of and , OLS performs very well and does not seem adversely affected by the level of cross-sectional dependence. When is low, the AR and WB estimators perform well in estimating while the AB and OLS estimators are preferred for the estimation of . Once more, however, a robust choice across scenarios and parameter values here is the WB2 estimator, whose main failing is rising volatility with increasing levels of dependence.
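To see why the two ranges U[0, 0.2] and U[1, 4] correspond to low and high cross-sectional dependence, one common way of generating such dependence is a single common factor with unit-specific loadings. The sketch below is an assumption about the mechanism, not a description of the paper's data generating process: the one-factor form, the function names, the number of units and the (deliberately long) time dimension are all illustrative choices made so that the sample correlations are stable.

```python
import random
from statistics import mean


def simulate_disturbances(n_units, n_periods, loading_range, seed=0):
    """Disturbances with one common factor: u_it = loading_i * f_t + e_it.

    Assumption: loadings are uniform on loading_range; f_t and e_it are N(0, 1).
    """
    rng = random.Random(seed)
    low, high = loading_range
    loadings = [rng.uniform(low, high) for _ in range(n_units)]
    factors = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]
    return [[lam * f + rng.gauss(0.0, 1.0) for f in factors] for lam in loadings]


def correlation(a, b):
    """Sample correlation between two equally long series."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def average_pairwise_correlation(panel):
    """Mean correlation over all distinct pairs of cross-sectional units."""
    n = len(panel)
    return mean(
        correlation(panel[i], panel[j]) for i in range(n) for j in range(i + 1, n)
    )


weak = average_pairwise_correlation(simulate_disturbances(20, 200, (0.0, 0.2)))
strong = average_pairwise_correlation(simulate_disturbances(20, 200, (1.0, 4.0)))
```

With loadings drawn from U[0, 0.2] the units are almost uncorrelated, whereas loadings from U[1, 4] make the common factor dominate each unit's disturbance, so the average pairwise correlation is large.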

5. Conclusions and Caveats

Numerous consistent estimators have been proposed in the literature in the context of a dynamic panel data model. Much work has been undertaken on the theoretical and empirical properties of these estimators. However, traditionally all such research has focussed on the estimators' properties under the condition that the assumed data generating process is true. Obviously, this is unlikely to always be the case in practice. This paper was concerned with the performance of (a relatively small subset of the total number of) estimators of the dynamic linear panel data model (chosen on the basis of popularity and/or previous experience of good behaviour with respect to bias and volatility) when these assumptions do not hold. In doing so, invaluable advice is given to the applied researcher as to which estimator is likely to be more applicable to his or her problem at hand. The key findings of the experiments are summarised below.

As they are based on differing assumptions, it is not surprising that most estimators have differing performance across the differing scenarios of misspecification. However, somewhat worryingly, the readily available testing procedures tend to have poor performance in detecting the various misspecification forms considered. This reinforces the need for an estimator that is robust to most of the more common forms of misspecification. Then again, the estimators' performance varies greatly across scenarios. While the AR estimator performs very well when the disturbance terms are serially correlated, the WB estimator seems to be the most appropriate choice when the explanatory variables and the error term are correlated. Finally, the OLS estimator is the preferred option under the scenario of cross-sectional dependence, and especially for high levels of . However, a standout estimator across all parameter settings and scenarios was the WB2 estimator, which appears to be a good choice for the applied economist, although to date there are relatively few instances of its use.

As with any Monte Carlo study, the results are clearly dependent on the parameters of the experiments. Although care was taken to ensure constant signal-to-noise ratios, the results could well be affected by the parameters so chosen. For example, we only considered one relatively small value of N (= 100), and one value of T (= 6). We leave it as an avenue of future research to see if these results hold for different experimental settings.


Figure 4.1: AR(1) Disturbances; = 0.1

Figure 4.2: AR(1) Disturbances; = 0.5

Figure 4.3: AR(1) Disturbances; = 0.9

Figure 4.4: x and correlated; = 0.1

Figure 4.5: x and correlated; = 0.5

Figure 4.6: x and correlated; = 0.9

Figure 4.7: cross-sectional dependence; @ = 0.1

Figure 4.8: cross-sectional dependence; @ = 0.5

Figure 4.9: cross-sectional dependence; @ = 0.9