Panel Data: Combined Time Series and Cross Section Data

Panel data combine a time series dimension with a cross section dimension, in such a way that there are data on N individuals (or firms, countries, ...) followed over T time periods. Not all data sets that combine a time series dimension with a cross section dimension are panel data sets, however. It is important to distinguish panel data from repeated cross-sections.

A Panel Data-Set

Panel data contain information on the same cross section units (e.g. individuals, countries or firms) over time. The structure of a panel data set is as follows:

Country     year   X1   X2   y2005   y2006   y2007   imp     exp
Argentina   2005   1    0    1       0       0       0.167   0.13
Argentina   2006   1    0    0       1       0       0.175   0.15
Argentina   2007   1    0    0       0       1       0.164   0.15
Argentina   2008   1    0    0       0       0       0.179   0.16
Argentina   2009   1    0    0       0       0       0.172   0.14
Brazil      2005   0    1    1       0       0       0.239   0.22
Brazil      2006   0    1    0       1       0       0.250   0.24
Brazil      2007   0    1    0       0       1       0.247   0.25
Brazil      2008   0    1    0       0       0       0.248   0.26
Brazil      2009   0    1    0       0       0       0.218   0.23
Canada      2005   0    0    1       0       0       0.024   0.03
Canada      2006   0    0    0       1       0       0.048   0.06
Canada      2007   0    0    0       0       1       0.057   0.06
Canada      2008   0    0    0       0       0       0.071   0.07
Canada      2009   0    0    0       0       0       0.047   0.08
Chile       2005   0    0    1       0       0       0.040   0.00
Chile       2006   0    0    0       1       0       0.067   0.04
Chile       2007   0    0    0       0       1       0.042   0.06
Chile       2008   0    0    0       0       0       0.018   0.03
Chile       2009   0    0    0       0       0       0.036   0.01

where Country is the variable identifying the individual country that we follow over time; y2005 to y2007 are time dummies constructed from the year variable; imp and exp are examples of time-varying variables; and X1 and X2 are examples of time-invariant variables.
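The construction of the time dummies and the distinction between time-varying and time-invariant variables can be sketched in a few lines of Python. This is only an illustration: it uses six rows copied from the table above, and the helper names time_dummies and is_time_invariant are hypothetical, not part of any library.

```python
# Six rows from the example panel: (Country, year, X1, imp).
rows = [
    ("Argentina", 2005, 1, 0.167),
    ("Argentina", 2006, 1, 0.175),
    ("Argentina", 2007, 1, 0.164),
    ("Brazil", 2005, 0, 0.239),
    ("Brazil", 2006, 0, 0.250),
    ("Brazil", 2007, 0, 0.247),
]


def time_dummies(year, dummy_years=(2005, 2006, 2007)):
    """Return one 0/1 indicator per year in dummy_years (hypothetical helper)."""
    return {f"y{y}": int(year == y) for y in dummy_years}


def is_time_invariant(rows, col):
    """True if column `col` takes a single value within every country."""
    values_by_country = {}
    for row in rows:
        values_by_country.setdefault(row[0], set()).add(row[col])
    return all(len(values) == 1 for values in values_by_country.values())


print(time_dummies(2006))          # indicators for y2005, y2006, y2007
print(is_time_invariant(rows, 2))  # X1: constant within each country
print(is_time_invariant(rows, 3))  # imp: changes from year to year
```

Checking time invariance matters later on, because a variable that never changes within a unit cannot be separated from that unit's own intercept.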

If the time periods for which we have data are the same for all N individuals, e.g. t = 1,2,…,T, then we have a balanced panel. In practice, it is common for the length of the time series and/or the time periods covered to differ across individuals. In such a case the panel is unbalanced. Analyzing unbalanced panel data typically raises few additional issues compared with the analysis of balanced data. However, if the panel is unbalanced for reasons that are not entirely random, then we may need to take this into account when estimating the model.

Repeated cross sections are not the same as panel data. Repeated cross sections are obtained by sampling from the same population at different points in time. The identity of the individuals (or firms, households, etc.) is not recorded, and there is no attempt to follow individuals over time. This is the key reason why pooled cross sections are different from panel data. If the country variable in the example above had not been available, we would have referred to this as a pooled repeated cross-section data set.

Some well-known panel data sets

1. The Panel Study of Income Dynamics (PSID) is a longitudinal panel survey of American families, conducted by the Survey Research Centre at the University of Michigan. It started in 1968.
2. The Survey of Income and Program Participation (SIPP) collects the source and amount of income, labor force information, program participation and eligibility data, and general demographic characteristics. Its purposes are to measure the effectiveness of existing federal, state, and local programs; to estimate future costs and coverage for government programs, such as food stamps; and to provide improved statistics on the distribution of income and measures of economic well-being in the country. The U.S. Census Bureau sponsors the survey under the authority of Title 13, United States Code, Section 182. It is a continuing survey with monthly interviewing.
3. German Socio-Economic Panel (GSOEP)
4. Household, Income and Labour Dynamics in Australia Survey (HILDA)
5. British Household Panel Survey (BHPS)
6. Survey of Family Income and Employment (SoFIE)
7. Lifelong Labour Market Database (LLMDB)
8. Korean Labor and Income Panel Study (KLIPS)
9. Chinese Family Panel Studies (CFPS)
10. German Family Panel (pairfam)
11. National Longitudinal Surveys (NLSY)

Balanced panel data: every cross section unit is observed for the same number of time periods.
Unbalanced panel data: the number of time periods observed differs across cross section units.
Short panel: the number of cross section units is greater than the number of time periods (N > T).
Long panel: the number of cross section units is less than the number of time periods (N < T).
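These four definitions can be checked mechanically from the list of (unit, period) pairs in a data set. The following is a minimal sketch, assuming each row of the data contributes one pair; the function name describe_panel is hypothetical. For an unbalanced panel it compares N with the longest observed series, which is one possible convention among several.

```python
from collections import defaultdict


def describe_panel(observations):
    """Classify a panel from an iterable of (unit, period) pairs."""
    periods_by_unit = defaultdict(set)
    for unit, period in observations:
        periods_by_unit[unit].add(period)

    period_sets = list(periods_by_unit.values())
    n_units = len(period_sets)
    # Balanced: every unit is observed in exactly the same periods.
    balanced = all(p == period_sets[0] for p in period_sets)
    # Assumption: for unbalanced panels, T is taken as the longest series.
    max_t = max(len(p) for p in period_sets)

    if n_units > max_t:
        shape = "short"
    elif n_units < max_t:
        shape = "long"
    else:
        shape = "N = T"
    return {"N": n_units, "T": max_t, "balanced": balanced, "shape": shape}


countries = ["Argentina", "Brazil", "Canada", "Chile"]
full = [(c, y) for c in countries for y in range(2005, 2010)]
print(describe_panel(full))

# Dropping one country-year makes the panel unbalanced.
missing_one = [obs for obs in full if obs != ("Chile", 2009)]
print(describe_panel(missing_one))
```

With four countries and five years, the country example from earlier in these notes counts as a balanced long panel under the definitions above.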

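Estimating Techniques

1. Pooled OLS

$$Y_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + \beta_4 X_{4it} + u_{it}, \qquad u_{it} \sim N(0, \sigma_u^2)$$

$$i = 1, 2, \dots, 6; \qquad t = 1, 2, \dots, 15$$

2. Fixed effect least square dummy (LSDV)

$$Y_{it} = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \alpha_4 D_{4i} + \alpha_5 D_{5i} + \alpha_6 D_{6i} + \beta_2 X_{2it} + \beta_3 X_{3it} + \beta_4 X_{4it} + u_{it}, \qquad u_{it} \sim N(0, \sigma_u^2)$$

$$i = 1, 2, \dots, 6; \qquad t = 1, 2, \dots, 15$$

3. Fixed Effect within-group (FEM)

$$Y_{it} = \beta_{1i} + \beta_2 X_{2it} + \beta_3 X_{3it} + \beta_4 X_{4it} + u_{it}, \qquad u_{it} \sim N(0, \sigma_u^2)$$

$$i = 1, 2, \dots, 6; \qquad t = 1, 2, \dots, 15$$

4. Random effect model (REM)

$$Y_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + \beta_4 X_{4it} + \varepsilon_i + u_{it}, \qquad \varepsilon_i \sim N(0, \sigma_\varepsilon^2), \quad u_{it} \sim N(0, \sigma_u^2)$$

$$i = 1, 2, \dots, 6; \qquad t = 1, 2, \dots, 15$$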
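The difference between pooled OLS and the within-group fixed effects estimator can be illustrated with a single regressor. The sketch below uses a small synthetic, noise-free data set (an assumption made purely for illustration) in which Y = intercept_i + 2 X and the unit intercepts are correlated with X. The function names ols_slope and within_slope are hypothetical.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit with a common intercept (pooled OLS)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def within_slope(units, x, y):
    """Within-group estimator: demean x and y by unit, then regress."""
    groups = defaultdict(list)
    for unit, a, b in zip(units, x, y):
        groups[unit].append((a, b))

    x_dev, y_dev = [], []
    for obs in groups.values():
        mean_x = sum(a for a, _ in obs) / len(obs)
        mean_y = sum(b for _, b in obs) / len(obs)
        x_dev.extend(a - mean_x for a, _ in obs)
        y_dev.extend(b - mean_y for _, b in obs)
    return sum(a * b for a, b in zip(x_dev, y_dev)) / sum(a * a for a in x_dev)


# Synthetic data: unit A has intercept 10, unit B has intercept 0, true slope 2.
units = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
intercept = {"A": 10.0, "B": 0.0}
y = [intercept[u] + 2.0 * a for u, a in zip(units, x)]

print(ols_slope(x, y))            # pooled slope, distorted by the intercepts
print(within_slope(units, x, y))  # within slope, recovers the true value
```

Demeaning by unit removes the unit-specific intercept, which is why the within slope is not contaminated by differences in level between units. In this constructed example the pooled slope even has the wrong sign.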

Example

Fixed-effects (within) regression               Number of obs      =       278
Group variable: country                         Number of groups   =        20

R-sq:  within  = 0.1005                         Obs per group: min =         8
       between = 0.0363                                        avg =      13.9
       overall = 0.0507                                        max =        15

                                                F(9,249)           =      3.09
corr(u_i, Xb)  = -0.4860                        Prob > F           =    0.0015

------------------------------------------------------------------------------
      model2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .0001804   .0006316     0.29   0.775    -.0010636    .0014245
          x2 |   .0005982   .0006654     0.90   0.369    -.0007122    .0019087
          m1 |  -.0006663    .000525    -1.27   0.206    -.0017003    .0003676
          m2 |  -.0001445   .0007588    -0.19   0.849    -.0016389    .0013499
      inrest |  -.1603729   .0476446    -3.37   0.001    -.2542108    -.066535
         gdp |  -.1092638   .1073249    -1.02   0.310    -.3206441    .1021166
   inflation |   .0222039   .0308187     0.72   0.472    -.0384946    .0829025
        lnmv |   1.512391   .6385276     2.37   0.019     .2547869    2.769994
       open1 |   5.493402   3.608919     1.52   0.129    -1.614498     12.6013
       _cons |  -29.28743   16.75483    -1.75   0.082    -62.28668    3.711828
-------------+----------------------------------------------------------------
     sigma_u |  5.4862157
     sigma_e |  5.2511268
         rho |  .52188402   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(19, 249) =     7.60             Prob > F = 0.0000

Random-effects GLS regression                   Number of obs      =       278
Group variable: country                         Number of groups   =        20

R-sq:  within  = 0.0873                         Obs per group: min =         8
       between = 0.2002                                        avg =      13.9
       overall = 0.1321                                        max =        15

Random effects u_i ~ Gaussian                   Wald chi2(10)      =     25.43
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0046

------------------------------------------------------------------------------
      model2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |  -.0000935   .0004984    -0.19   0.851    -.0010704    .0008834
          x2 |   .0001885   .0004335     0.43   0.664    -.0006612    .0010382
          m1 |  -.0003946   .0003457    -1.14   0.254    -.0010721    .0002829
          m2 |   .0002874   .0005606     0.51   0.608    -.0008113    .0013861
      inrest |  -.0693571   .0420565    -1.65   0.099    -.1517862    .0130721
         gdp |  -.0569042    .108621    -0.52   0.600    -.2697974     .155989
   inflation |   .0022885   .0305071     0.08   0.940    -.0575044    .0620814
        lnmv |   .9054565   .5527453     1.64   0.101    -.1779043    1.988817
       open1 |    3.60605   1.561736     2.31   0.021      .545104    6.666996
       _cons |  -10.31873   14.53934    -0.71   0.478    -38.81531    18.17786
-------------+----------------------------------------------------------------
     sigma_u |  2.8794529
     sigma_e |  5.2511268
         rho |  .23117565   (fraction of variance due to u_i)
------------------------------------------------------------------------------
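Two of the reported quantities can be reproduced by hand from the output. The sketch below assumes the usual definitions, namely rho = sigma_u^2 / (sigma_u^2 + sigma_e^2) and test statistic = coefficient / standard error, and plugs in the figures printed in the fixed effects and random effects tables. The function name rho is chosen here for illustration.

```python
def rho(sigma_u, sigma_e):
    """Fraction of the total error variance that is due to the unit effect u_i."""
    return sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)


# sigma_u and sigma_e as printed in the two regression tables.
fe_rho = rho(5.4862157, 5.2511268)  # fixed effects output
re_rho = rho(2.8794529, 5.2511268)  # random effects output

# t statistic for inrest in the fixed effects table: Coef. / Std. Err.
t_inrest = -0.1603729 / 0.0476446

print(round(fe_rho, 4))
print(round(re_rho, 4))
print(round(t_inrest, 2))
```

The computed values agree with the printed rho of .52188402 and .23117565 and with the t statistic of -3.37 for inrest, up to rounding of the displayed inputs.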

Hausman Test

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference          S.E.
-------------+----------------------------------------------------------------
          x1 |    .0001804    -.0000935        .0002739         .000388
          x2 |    .0005982     .0001885        .0004097        .0005047
          m1 |   -.0006663    -.0003946       -.0002717        .0003951
          m2 |   -.0001445     .0002874       -.0004319        .0005113
      inrest |   -.1603729    -.0693571       -.0910158         .022389
         gdp |   -.1092638    -.0569042       -.0523596               .
   inflation |    .0222039     .0022885        .0199155        .0043711
        lnmv |    1.512391     .9054565        .6069341         .319672
       open1 |    5.493402      3.60605        1.887352        3.253503
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic/RE model is suitable

                  chi2(6) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       27.59
                Prob>chi2 =      0.0001
                (V_b-V_B is not positive definite)

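Comment: The Hausman test rejects Ho (chi2(6) = 27.59, Prob>chi2 = 0.0001), so the fixed effects model is preferred to the random effects model.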
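The p-value behind this conclusion can be checked without statistical software. For an even number of degrees of freedom, the upper tail of the chi-squared distribution has a closed form: P(X > x) = exp(-x/2) * sum over k = 0, ..., df/2 - 1 of (x/2)^k / k!. The sketch below applies it to the reported statistic; the function name chi2_sf_even_df is hypothetical, and the formula covers even df only.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Hausman statistic and degrees of freedom as reported: chi2(6) = 27.59.
p_value = chi2_sf_even_df(27.59, 6)
print(round(p_value, 4))
```

The result rounds to the reported Prob>chi2 of 0.0001, well below the conventional 0.05 threshold, which is why Ho is rejected.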
