The Reparameterization of Linear Models Subject to Exact Linear Restrictions(1)

by

Joseph G. Hirschberg, Department of Economics, University of Melbourne, Parkville, Victoria, Australia

and

Daniel J. Slottje, Department of Economics, Southern Methodist University, Dallas, Texas 75275-0496

July 1999

Key words: Systems of demand equations, Polynomial lags, Spline lags
Classification: C51, D12, C63

Abstract

The estimation of regression models subject to exact linear restrictions is a widely applied technique; however, aside from simple examples, the reparameterization method is rarely employed except in the case of polynomial lags. We believe this is due to the lack of a general transformation method for changing from the definition of restrictions in terms of the unrestricted parameters to the equivalent reparameterized model. In many cases the reparameterization method is computationally more efficient, especially when estimation involves an iterative method. The general relationship that converts between the two forms of the restricted model is derived. Examples involving systems of demand equations, polynomial lagged equations, and splines are given in which the transformation from one form to the other is demonstrated. In addition, we demonstrate how a Wald test of the restrictions can be constructed using an augmented version of the reparameterized model. A computer program example is presented to demonstrate the equivalence.

(1) We wish to acknowledge our colleagues Thomas B. Fomby and Jenny Lye for their helpful suggestions and comments on this paper; the usual caveat holds.

1. Introduction

A discussion of the estimation of a regression model subject to exact linear restrictions can be found in most econometrics textbooks (such as Griffiths et al. 1993, Greene 1997, and Johnston and DiNardo 1997). The basic exposition of how one solves a set of "normal equations" with the restrictions added has not changed much since Tintner's (1952) text. However, as Mantell (1973) has shown, it is possible to reconfigure the restrictions to form the reparameterization. More recently, Davidson and MacKinnon (1993, section 1.3) repeated this earlier contribution to the econometrics literature. These authors all note that the alternative reparameterized model exists; however, they assume that a suitable rearrangement of the restrictions can be found, without offering any general method for performing the needed reparameterization. One might suspect that, since the techniques of constrained optimization are so familiar to economists from their study of microeconomic theory, they rightly perceive the minimization of the squared error subject to a linear restriction as a useful pedagogical analogy to the similar problem in economic theory. This similarity may aid in the interpretation of the resulting Lagrange multipliers in the testing of a set of linear restrictions. The reparameterization of the original regression equation is often referred to informally or only for a specific example; this approach is usually dropped in the discussion of the general case.

In this paper we demonstrate the general relationship between the widely used restricted least squares estimator generally presented in textbooks and an equivalent estimator found by reparameterization. In many cases the reparameterized model proves to be more insightful and more efficient. It can also be more accurately computed in many instances.

A linear regression model subject to a set of linear restrictions is written in the form

(1)

Y = Xβ + ε,  s.t.  Rβ = r,



where Y(T x 1) is the vector of observations on the dependent variable, X(T x k) is the matrix of explanatory variables, β(k x 1) is the set of restricted parameters (note that in this exposition the unrestricted parameters will be denoted differently), R(m x k) is the matrix of m linear combinations of the restricted parameter set, r(m x 1) is the vector of constants to which we equate the linear combinations, m < k, and the usual assumptions concerning ε hold, namely E(ε) = 0 and E(εε′) = σ²Σ, which also allows us to specify a particular distribution for ε. This form of the problem will be referred to as the "linear function of unrestricted parameters" form of the restrictions, or the LFUP form.

The alternative method for defining linear restrictions is the case when β can be written as a linear function of a set of j (j = k − m) parameters (γ) and a k x 1 vector of constants (d) as

(2)

β = Aγ + d

We can show that A is the matrix of the j eigenvectors corresponding to the zero-valued eigenvalues of R′R. This form of linear restrictions will be referred to as the ROP form, for "reparameterized in other parameters", and the regression can be written as

(3)

Y = Xβ + ε , s. t. β = Aγ + d ,

which, by substitution, is equivalent to the reparameterized regression written as

(4)

G = Zγ + ε

where G = (Y − Xd), d = R+r, and Z = XA. When the rank of R is equal to m, the appropriate Moore-Penrose g-inverse (R+) is given by

(5)

R+ = R′(RR′)⁻¹.
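As a small illustration of (2) and (5) (a hypothetical example, not one from the applications below), take k = 2, m = 1, R = (1  1) and r = 1, so the single restriction is β1 + β2 = 1. Then RR′ = 2, R+ = (.5, .5)′, and d = R+r = (.5, .5)′; the matrix R′R, with rows (1, 1) and (1, 1), has one zero-valued eigenvalue with eigenvector A = (.71, −.71)′, so every β satisfying the restriction can be written as β = Aγ + d = (.5 + .71γ, .5 − .71γ)′ for a single scalar parameter γ.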

The direct computational relationship between the LFUP and the ROP forms is the key to the wide application of this method, and the widespread availability of computer programs to estimate eigenvalues and eigenvectors of symmetric matrices facilitates its application. Simple examples can be found in which the equivalent A matrix can be determined from R without the need to compute eigenvalues and eigenvectors; however, without the general relationship, these examples are no more than special cases that must be redefined for every application.

To estimate β, (2) can be used with the regression coefficient for γ as estimated from (4). Alternatively, this solution can be obtained by solving the generalized least squares problem in which β* − d = Aγ is solved for γ, where β* is the OLS unrestricted estimate of β and the weighting matrix is (X′Σ⁻¹X)⁻¹.

The covariance of the estimate of β from the reparameterization is given as

(6)    cov(β̃) = σ̃² A (Z′Σ̃⁻¹Z)⁻¹ A′.

Note that the reparameterized regression (4) is written with the same error ε as the original problem. Consequently, if the covariance matrix of the error is estimated such that E(εε′) = σ²Σ̃, the equivalent solution to the GLS problem is

(7)    γ̃ = (Z′Σ̃⁻¹Z)⁻¹ Z′Σ̃⁻¹ G.

In this case the estimates of β can then be defined via (2) with the GLS estimate of γ, and the covariance of β can be defined in a form equivalent to (6) using the GLS estimated covariance for γ. The proof of this relationship is given in Appendix A and is a modification of the proof given in Lawson and Hanson (1974). Of pedagogical interest, we note that the proof provides an application of the singular value decomposition of a non-square matrix and an exposition of the properties of generalized inverses.

One advantage of the reparameterization method is the ability to derive a linear equation such as (2) that relates the k by 1 vector β to the transformed set of j parameters γ. The explicit definition of β as a function of γ serves to maintain the separation of the parameters subject to restrictions from those that are "free", or more precisely those that are not subjected to restrictions.
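The conversion from the LFUP form to the ROP form can be written in a few lines of matrix code. The following PROC IML fragment is a minimal sketch along the lines of the full program in Appendix C; the example R and r, and the tolerance used to declare an eigenvalue "zero", are our own choices.

proc iml;
r  = {0 0 1 0 -1 0};              /* a single symmetry restriction */
lr = {0};                         /* the restriction constant */
d  = r` * inv(r * r`) * lr;       /* d = R+r, as in (5) */
call eigen(evalue, evec, r` * r); /* eigen system of R'R */
f  = loc(evalue < 1e-8);          /* indices of the zero eigenvalues */
a  = evec[, f];                   /* A spans the null space of R */
print d a;
quit;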



Most textbook descriptions of the reparameterization, or "substitution", method (e.g. Griffiths et al. 1993 and Davidson and MacKinnon 1993) do not explicitly define (2). Instead they refer only to the estimates of β, without defining a set of equivalent unrestricted parameters. An exception to this omission is in the discussion of polynomial distributed lags by some authors. For example, Fomby, Hill and Johnson (1984, p. 84) (FHJ) present a version of (2), but without the additive value d, as would be the case when every element of r is zero. In this case the least squares estimate is set up as in (2). As FHJ show, the equivalent R can be found as

R* = (I − A(A′A)⁻¹A′).

However, they also point out that this expression is not of full row rank, and they do not provide a general method for the reduction of this matrix. In Appendix B we provide a proof that an equivalent m by k version of R can be defined as the transpose of the matrix of eigenvectors corresponding to the zero-valued eigenvalues of AA′ (or H2 as defined in Appendix B).

The ability to convert the β = Aγ + d form to the Rβ = r form is particularly useful when different restrictions of both forms are specified for the same regression. For example, we may have an equation specified as

Y = Xβ + ε,  s.t.  R1β = r1  and  β = Aγ + d.

This can then be rewritten as

Y = Xβ + ε,  s.t.  [ R1 ] β = [ r1 ]
                   [ R2 ]     [ r2 ],

where (from Appendix B) R2 = H2′ and r2 = H2′d. Thus the R matrix can be augmented by adding new rows that correspond to the additional restrictions; the advantage of the linear function form is that additional restrictions are easily accommodated by adding rows to the matrix R.
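The reverse conversion, from β = Aγ + d to an equivalent Rβ = r, follows the same pattern using AA′ (see Appendix B). A minimal PROC IML sketch, with a hypothetical one-parameter A and d:

proc iml;
a = {1, 1};                        /* beta = a*gamma + d */
d = {0, 1};
call eigen(evalue, evec, a * a`);  /* eigen system of AA' */
f   = loc(evalue < 1e-8);          /* zero-valued eigenvalues */
nr  = evec[, f]`;                  /* R = H2' (Appendix B) */
nlr = nr * d;                      /* r = H2'd */
print nr nlr;                      /* e.g. .71*b1 - .71*b2 = -.71, up to sign */
quit;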



This problem could then be converted back to a reparameterized model in which the two sets of restrictions are included in the new parameter definitions.

Another advantage of the reparameterization is in the mechanics of estimation. The traditional textbook solution to the restricted least squares problem specified in (1) is

β̂r = (X′X)⁻¹X′Y + (X′X)⁻¹R′(R(X′X)⁻¹R′)⁻¹(r − R(X′X)⁻¹X′Y),

which implies that (X′X) is nonsingular. Unfortunately, there are a number of examples where this condition is not satisfied, most notably the case where one includes the full set of dummy variables in an equation and constrains their sum to be zero, as proposed by Suits (1984). Using the reparameterization solution one can easily solve this case, although this need not be a problem for the restricted least squares solution either if it is specified correctly: Greene and Seaks (1991) note that the formula above is actually a single specific solution to the restricted least squares problem, and computer code for the estimation of restricted least squares is often written to solve the general case. (The PROC REG routine in SAS is a good example of this.)

Furthermore, the restricted linear equation as given in (1) results in a constrained optimization that is well defined and for which a large number of computer programs have been written. However, if one is applying a more general method for the estimation of the parameters, such as maximum likelihood, it is often not possible to add the linear restrictions to the problem being solved; for example, most logistic regression packages do not provide a method for imposing linear restrictions on the parameters. In these cases it is necessary to estimate the reparameterized version of the linear equation and then recover the values of those parameters that have not been directly estimated. A simple way to avoid this problem is to allow the user to specify the linear restriction form of the problem and have the computer program translate the problem into a reparameterized problem that can



then be solved. Once the solution is obtained, the reparameterization can be reversed to reveal the restricted estimates in terms of the original parameters. The estimated parameter variance-covariance matrix can then be found using the estimated variance-covariance matrix of the reparameterized problem. In addition, one can form a Wald test of the restrictions using the augmented version of the reparameterized equation defined in Section 4 below.

The balance of this paper presents examples of the use of these alternative formulations. First, we present an example of the use of the reparameterization method in the estimation of a system of demand equations subject to the exact linear restrictions defined by symmetry and homogeneity of degree zero in the prices. Second, we examine the case where we may want to use restrictions specified in the linear function form as well as ones using the reparameterization method, with an example based on polynomial lag functions. Then we show that the reparameterized model may itself be subject to linear restrictions, and we show how these can be combined as well, with an example using a bilinear spline as the lag function. Lastly, we demonstrate that the reparameterized model may also be used to test the hypothesis that the restrictions hold, by estimating a special version of the unrestricted model in which a subset of the parameters can be used to test the linear restrictions.

2. The case of restricted demand equations

Phlips (1983) pointed out that,

. . . . applied consumption analysis appears then as the art of constructing and effectively utilizing interesting theoretical restrictions in the (econometric) estimation of demand equations. That this art is worth practicing and developing is indicated by the observation that stronger (more particular that is) restrictions produce more precise estimations and forecasts. [Phlips, 1983, p. 27]

This philosophy has been the basis for a vast number of empirical studies of consumer behavior. The same holds true for the rich literature on applied production analysis. As Theil (1980) 

noted, the applied analysis of both consumer behavior and production theory has gradually moved from the general (single equation) to the specific. As better micro data have become available, demand systems and cost and production functions can now be analyzed at increasingly disaggregated levels. This change fostered the system-wide approach that most applied economists rely on today. The ability to examine systems has produced increasingly sophisticated flexible functional forms to model consumer and producer behavior. Deaton and Muellbauer (1980) and Blundell (1988) survey many of the functional forms that have been utilized for estimating demand systems, and Chambers (1988) presents a similar summary of the various functional forms used on the production side. Unfortunately, as the number of commodities or outputs the applied economist can analyze has increased, so too has the scope of the problem faced when attempting to estimate the parameters in a system with a large number of equations.

An important tool for the researcher in the estimation of large systems of equations is the set of linear restrictions imposed on the model. In this example, we show that by imposing linear restrictions, the size of the least squares problem to be solved can be significantly reduced. Such reductions in dimensionality are very useful when dealing with the large number of parameters needed to estimate modern demand systems that may contain dozens of commodities, or cost functions that may contain large numbers of outputs as well. Byron (1982) and Hirschberg (1992) both demonstrate that, by exploiting the symmetry of the parameters, the price parameters for certain demand system specifications can be estimated much more efficiently using a non-traditional technique. However, these methods suffer from the drawback that they do not provide an estimate of the asymptotic covariance matrix, and they can only be applied to a special class of demand systems. Although modern computers can estimate very large systems, there are cases when the unrestricted model cannot be solved and the software package available to the researcher casts every restricted least squares problem as a LFUP problem. The method

employed for estimation could be a maximum likelihood procedure, such as a Tobit model when many of the responses are given as zero. In these cases, reductions in the number of parameters are very important in making large problems tractable.

To estimate the parameters of a set of demand equations it is traditional to use a SUR model. The system of demand equations (or cost share equations) can be written as one large equation when we stack the individual equations in the form

(8)    [ y1 ]   [ x1  0   ...  0  ] [ β1 ]   [ ε1 ]
       [ y2 ] = [ 0   x2  ...  0  ] [ β2 ] + [ ε2 ]
       [ .. ]   [ ..  ..  ...  .. ] [ .. ]   [ .. ]
       [ yk ]   [ 0   0   ...  xk ] [ βk ]   [ εk ]

where yi is the vector of amounts of commodity i (or the cost shares) (T x 1), xi is the matrix of regressors (usually functions of the prices) for equation i (T x (k+1)), 0 is a (T x (k+1)) matrix of zeros, βi is the vector of coefficients for equation i ((k+1) x 1), and εi is the vector of errors for equation i (T x 1). We can rewrite equation (8) in the compact form

(9)    Y = XB + E,

where Y is (Tk x 1), X is (Tk x (k+1)k), B is ((k+1)k x 1), E is (Tk x 1), and we have specified E[EE′] = Ω, which is (Tk x Tk).
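When every equation shares the same regressors, as in the translog example below, the stacked design matrix in (8) and (9) can be built with a Kronecker product. A minimal PROC IML sketch (the sample size T = 5 is an arbitrary choice for the illustration):

proc iml;
T  = 5;
p1 = ranuni(j(T, 1, 0));           /* log price 1 */
p2 = ranuni(j(T, 1, 0));           /* log price 2 */
x1 = j(T, 1, 1) || p1 || p2;       /* regressors of one share equation */
x  = i(2) @ x1;                    /* block-diagonal X of (8), 2T x 6 */
print (nrow(x)) (ncol(x));
quit;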

For example, consider a system of two share equations derived from a translog cost function (Christensen and Greene 1976), which are subject to restrictions that ensure cross-price coefficient symmetry and a cost function that is homogeneous of degree one in prices:

(10)    y1 = α1 + β11 P1 + β12 P2 + ε1
        y2 = α2 + β21 P1 + β22 P2 + ε2

Here, the y’s are the cost shares and the P’s are logs of the prices. The matrix form for the unrestricted model is



(11)    [ y1 ] = [ L  P1  P2  0   0   0  ] ( α1  β11  β12  α2  β21  β22 )′ + [ ε1 ]
        [ y2 ]   [ 0   0   0  L  P1  P2 ]                                    [ ε2 ]

where L is a column vector of length T with all elements equal to 1 and the Pi are the column vectors of length T of the log prices.

A set of restricted parameters can always be obtained by rewriting the regression equation with the restrictions implicitly included in the definition of the regressors. The resulting coefficients of these new regressors can then be used to define the restricted coefficient estimates. The cross-price restriction implies that β12 = β21; the homogeneity restriction implies that β11 + β12 = 0 and β21 + β22 = 0; and, because the shares sum to one, α1 + α2 = 1. If we incorporate these restrictions into our specification, we can respecify (10) as

(12)

[ y1 ] = [  L ] α1 + [  (P1 − P2) ] β11 + [ 0 ] + [ ε1 ]
[ y2 ]   [ −L ]      [ −(P1 − P2) ]        [ L ]   [ ε2 ]

Estimates for the other parameters can then be obtained from the restrictions: α2 = 1 − α1, β12 = β21 = −β11, and β22 = β11. The associated variance-covariance terms follow as well.

In this example, the reparameterization of the original problem was done solely for this specific case. However, we can obtain an equivalent reparameterization from the generalization discussed in Section 1. First we define the restriction relationship Rβ = r, in this case

(13)

[ 0  1  1  0   0  0 ]                                  [ 0 ]
[ 0  0  0  0   1  1 ] ( α1  β11  β12  α2  β21  β22 )′ = [ 0 ]
[ 0  0  1  0  −1  0 ]                                  [ 0 ]
[ 1  0  0  1   0  0 ]                                  [ 1 ]


At this point, we note how other authors have proposed to derive the reparameterized version of these estimates. Mantell (1973) and Davidson and MacKinnon (1993) claim that the columns of the matrix R can always be reordered in such a way that R can be decomposed into two matrices, R = [R1 : R2], where R1 is nonsingular. Even in this very simple case, finding such a decomposition requires a trial-and-error approach, and it is not clear how one would write a computer program to achieve it without trying many possible rearrangements. In addition, once the rearrangement of the regressors is determined, we need a bit of bookkeeping to ensure that we can reconstruct the original model coefficients. Furthermore, since the linear restrictions can be transformed, it is possible to find an equivalent linear restriction. As shown in Appendix C, an equivalent matrix to the R matrix and the vector r defined above could be written as

  

[  0.5018359   0.4178266   0.5580871   0.5018359  −0.085042    0.0552182 ] [ α1  ]   [  0.5018359 ]
[  0.0271988   0.6050141  −0.374574    0.0271988   0.5684613  −0.411126  ] [ β11 ]   [  0.0271988 ]
[  0.4965375  −0.456904   −0.537925    0.4965375   0.0929258   0.0119048 ] [ β12 ] = [  0.4965375 ]
[ −0.029521   −0.024858    0.0941827  −0.029521    0.6410806   0.7601213 ] [ α2  ]   [ −0.029521  ]
                                                                           [ β21 ]
                                                                           [ β22 ]

The determination of the nonsingular sub-matrix of this version of R is not so obvious, yet it is equivalent to the matrix defined above in that it imposes the same restrictions on the estimated parameters. However, as we show below, the use of a readily available routine for the computation of eigenvalues and eigenvectors of a symmetric matrix will solve this problem for any matrix R. The eigenvectors of R′R (the matrix U defined in (A2)) are given as



     [   0    .71    0     0    .71    0  ]
     [  .27    0    .5    .65    0    .5  ]
 U = [  .65    0    .5   −.27    0   −.5  ]
     [   0    .71    0     0   −.71    0  ]
     [ −.65    0    .5    .27    0   −.5  ]
     [ −.27    0    .5   −.65    0    .5  ]

The corresponding eigenvalues of R′R are given as (3.41, 2, 2, .59, 0, 0).

Thus, the last two columns of U make up A, a matrix composed of the eigenvectors that correspond to the zero-valued eigenvalues of R′R, and each column can be multiplied by any scalar. Thus, we can rescale this matrix to the form

     [  1   0 ]
     [  0   1 ]
 A = [  0  −1 ]
     [ −1   0 ]
     [  0  −1 ]
     [  0   1 ]

For d = R+r we get

     [   0     0     0    .5 ] [ 0 ]   [ .5 ]
     [  .75  −.25  −.5    0  ] [ 0 ]   [  0 ]
 d = [  .25   .25   .5    0  ] [ 0 ] = [  0 ]
     [   0     0     0    .5 ] [ 1 ]   [ .5 ]
     [  .25   .25  −.5    0  ]         [  0 ]
     [ −.25   .75   .5    0  ]         [  0 ]

Z = XA results in

 Z = [  L    (P1 − P2) ]
     [ −L   −(P1 − P2) ]

and for G = Y − Xd,

 G = [ Y1 − .5 L ]
     [ Y2 − .5 L ]

It can be shown that the resulting estimates of β will be the same from either reparameterization. The only difference is in the definition of the γ's, which are set up differently in the second case but result in the same estimates of the restricted values of β.

In the special case of linear restrictions that only assure symmetry in the matrix of price coefficients, the transformation matrix (A) can be defined as

A*(k x j) = [ |R′|(k x m)  :  S(k x (k−2m)) ],

where |R′| denotes the element-wise absolute value of R′ and

S = ( s1, s2, ..., s(k−2m) ).

In addition, each si is a vector with a 1 in the row corresponding to a coefficient that is not restricted. For example, if the first coefficient is not a symmetric element,

 s1 = ( 1, 0, ..., 0 )′   (k x 1).

Thus, each column of S corresponds to a coefficient that has an all-zero column in R. We can prove that the columns of A* are eigenvectors of R′R that correspond to the zero-valued eigenvalues of R′R. This is done by showing that for the column vectors of A*, denoted a*i, we have

(R′R) a*i = 0   and   a*i′ a*j = 0 for all i, j where i ≠ j.

It can easily be shown that the columns of both S and |R′| satisfy these criteria and that they are eigenvectors of R′R corresponding to the zero-valued eigenvalues. A general procedure for obtaining A from any R is a straightforward application of a matrix language such as SAS PROC IML or GAUSS. The computational-efficiency case for



using this method for obtaining restricted generalized linear model estimates can be made forcefully by comparing the sizes of the matrices to be inverted. Using the LFUP estimation approach, the matrix Q = (X′Σ̃⁻¹X)⁻¹ is needed along with (RQR′)⁻¹. However, with the reparameterized regressors only one inversion, (A′X′Σ̃⁻¹XA)⁻¹, is needed; and the number of individual computations required to invert a matrix increases with the cube of its dimension. Appendix C provides a listing of a computer program written in the SAS PROC IML language.

An example of the gains from reparameterization can be found in Huang and Haidacher's (1983) paper, in which they estimate a 13-equation demand system with 195 coefficients by applying a reparameterization of 92 restrictions defined on the unrestricted parameters. Unfortunately, due to their use of the same parameter definitions for the restricted and the unrestricted parameters, it is difficult to follow the equivalent reparameterization of the restrictions in their model. However, employing the reparameterization technique given here, the equivalent reparameterized model is easily derived. In this case the computation of the Q matrix would involve the inversion of a 195 by 195 matrix, which requires on the order of 195³ = 7,414,875 operations, and it would be necessary to compute (RQR′)⁻¹, a 92 by 92 matrix, at a cost of about 92³ = 778,688 operations; these two inversions would together require roughly 8,193,563 operations. However, (A′X′Σ̃⁻¹XA)⁻¹ is a j x j, or in this case a (103 x 103), matrix, and the number of operations needed for its inversion is of the order of 103³ = 1,092,727. Thus, the modified-regressor technique requires approximately 13% of the computations that the traditional constrained optimization formula would require, which also reduces the potential for round-off error. In an experiment examining how much faster this technique is for various sizes of demand systems, the gain appears to level off at about 12% of the operations that the traditional technique needs (this occurs past 30 commodities, where the number of



coefficients is 900). Note that when non-linear or maximum likelihood methods are employed for estimation, the solution may have to be found by iteration, and the estimation of the reparameterized model will be far more efficient. In addition, due to the approximate multicollinearity in the data, especially in the case of demand systems defined in terms of functions of prices, it may not be possible to compute the unrestricted model at all. Even without any direct multicollinearity, the smaller the matrix to invert, the smaller the potential for round-off error in the inversion routine.

At this point one might observe that the computation time of the eigenvector routine may offset the gains from the reduction in dimensionality; however, for any definition of R the A matrix can be stored. Lawson and Hanson (1974) propose a number of more computationally efficient methods based on direct elimination algorithms, but these methods require specially written computer code. In the demand case discussed above, once the A matrix is derived for a particular functional form, repeated derivation is unnecessary; it can be stored for future use, as in the case of iterated solutions or of the same model being applied to different data series.

3. Examples of the combination of linear restrictions and reparameterization

In this section we show that the two forms of restrictions may be imposed simultaneously, since in many cases parts of the model are best expressed in the reparameterized form and others as linear restrictions on the parameters. First we show the case in which the parameters of a linear model are assumed to be functions of another linear function, which leads to a well-defined reparameterized model, and we then impose an additional linear restriction on the original model parameters. Second, we show the case where the reparameterized model of the original parameters is itself subject to a set of linear restrictions, and we show how these may be estimated.



3.1 The case of reparameterization and linear restrictions on the original model

A frequently used reparameterization found in time series analysis is the polynomial lag model proposed by Almon (1965). In this case, a series of lagged values of a variable is assumed to have parameters that are related to each other via a polynomial in the lags. A typical model of this type would be

(14)    yt = β0 + β1 Xt + Σ(s=0..m) β(2+s) Z(t−s) + εt,

where the lag parameters β(2+s) are subject to a reparameterization in terms of another set of parameters; if we assume a quadratic function,

β(2+s) = φ0 + φ1 s + φ2 s².

Thus we can reparameterize the model. If we assume a lag length of m = 4 (so that there are five lagged values of Z), we can write the reparameterization equation (β = Aγ + d, here with d = 0) of this case as

[ β0 ]   [ 1  0  0  0  0  ]
[ β1 ]   [ 0  1  0  0  0  ] [ γ1 ]
[ β2 ]   [ 0  0  1  0  0  ] [ γ2 ]
[ β3 ] = [ 0  0  1  1  1  ] [ γ3 ]
[ β4 ]   [ 0  0  1  2  4  ] [ γ4 ]
[ β5 ]   [ 0  0  1  3  9  ] [ γ5 ]
[ β6 ]   [ 0  0  1  4  16 ]

where the new parameters (γ) can be shown to be functions of the original parameters of the polynomial and of the model:

( γ1  γ2  γ3  γ4  γ5 )′ = ( β0  β1  φ0  φ1  φ2 )′.

Thus we can form the equivalent linear restriction, as shown in Appendix B, via the eigenvectors that correspond to the zero-valued eigenvalues of AA′; a version of this is given by

 R = [ 0  0  0.2174568  −0.660484    0.6767127  −0.241799   0.0081141 ],   r = [ 0 ]
     [ 0  0  0.2588402  −0.438556   −0.237372    0.7550527  −0.337964 ]        [ 0 ]
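The two rows of R shown above can be reproduced (up to the sign and ordering chosen by the eigen routine) by constructing the polynomial-lag A matrix and extracting the eigenvectors of AA′ with zero eigenvalues; a minimal sketch:

proc iml;
a = {1 0 0 0 0,
     0 1 0 0 0,
     0 0 1 0 0,
     0 0 1 1 1,
     0 0 1 2 4,
     0 0 1 3 9,
     0 0 1 4 16};                 /* quadratic lag, s = 0,...,4 */
call eigen(evalue, evec, a * a`);
nr = evec[, loc(evalue < 1e-8)]`; /* the equivalent R (2 x 7) */
print nr;
quit;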



This result shows that we are imposing two restrictions on the model and that all the parameters on the lagged values are subject to restriction. In addition to this constraint on the parameters to be estimated, we may also want to ensure that the parameters of the lagged values sum to zero. This may be the case when using Suits' (1984) proposal that dummy variables sum to zero, where the variable Z is a dummy variable. In this case, we would also want the restriction

 [ 0  0  1  1  1  1  1 ] ( β0  β1  β2  β3  β4  β5  β6 )′ = 0.

Thus, we can add these constraints together to form a new R matrix by stacking the two sets of restrictions,

 R = [ 0  0  0.2174568  −0.660484    0.6767127  −0.241799   0.0081141 ],   r = [ 0 ]
     [ 0  0  0.2588402  −0.438556   −0.237372    0.7550527  −0.337964 ]        [ 0 ]
     [ 0  0  1          1            1           1           1        ]        [ 0 ]

and thus obtain parameters that are restricted by both sets of constraints. Other typical restrictions on the parameters ensure that the polynomial begins and ends at zero; such restrictions, for example β2 = 0 and β6 = 0, can be applied to this form of the model as additional rows of the augmented R matrix.

3.2 The case when the reparameterizing equation is subject to linear restrictions

Spline functions are another area in which the reparameterized version of the model is the most convenient form to use. However, spline functions imply that there are linear restrictions on the reparameterizing equation. An example of this is the case where we define a spline lag function that is subject to constraints.



For example, if we assume a bilinear spline for the lag coefficients in (14), we would define the reparameterizing function as

β(2+s) = π10 D1 + π11 s D1 + π20 D2 + π21 (s − k) D2,

where D1 = 1 when s < k and D1 = 0 otherwise, and D2 = 1 when s ≥ k and D2 = 0 otherwise.

This implies that the lag function consists of two linear functions, with the changeover from one to the other occurring at the knot defined by k: one starts at lag 0 and ends at lag k − 1, and the other starts at lag k and ends at lag m. When the knot is placed at k = 3, the relationship β = Aγ is defined as

[ β0 ]   [ 1  0  0  0  0  0 ] [ γ1 ]
[ β1 ]   [ 0  1  0  0  0  0 ] [ γ2 ]
[ β2 ]   [ 0  0  1  0  0  0 ] [ γ3 ]
[ β3 ] = [ 0  0  1  1  0  0 ] [ γ4 ]
[ β4 ]   [ 0  0  1  2  0  0 ] [ γ5 ]
[ β5 ]   [ 0  0  0  0  1  0 ] [ γ6 ]
[ β6 ]   [ 0  0  0  0  1  1 ]

This implies that the new parameter vector consists of the following original parameters:

( γ1  γ2  γ3  γ4  γ5  γ6 )′ = ( β0  β1  π10  π11  π20  π21 )′.

In spline functions one usually interrelates the separate segments by constraining the estimated functions to share common values at the knots and, in the case of higher-order polynomial functions, by requiring that the first and second derivatives be equal at the knots as well. In this case we ensure that the two linear functions meet at the knot by imposing the relationship π10 + π11 k = π20. This can be written as a linear restriction on the reparameterized model as



[ 0  0  1  3  −1  0 ] ( γ1  γ2  γ3  γ4  γ5  γ6 )′ = 0.

To accommodate this restriction on the reparameterized coefficients, we apply the transformation of the linear restriction to these parameters and derive a new reparameterization, γ = A1θ, where A1 is the matrix of eigenvectors corresponding to the zero-valued eigenvalues of the cross-product of the restriction vector defined above. Thus the relationship between the original model parameters and these new parameters is β = A A1 θ. Note that the dimension of θ is 5 by 1. A value for this new transformation matrix in this case is given below.

in this case is given below.

AA1

0  1  0 = 0  0  0 0

1 0

0

0 0

0

0 0 .9533 0 0 .6619 0 0 .0790 0 1 .0790 0 2 .0790

 0   .0164 .3277  .9502 .9502  .9502 0
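The matrix AA1 can be generated (again up to the sign and column ordering chosen by the eigen routine) by composing the spline A with a null-space basis of the knot restriction; a minimal sketch:

proc iml;
a = {1 0 0 0 0 0,  0 1 0 0 0 0,  0 0 1 0 0 0,
     0 0 1 1 0 0,  0 0 1 2 0 0,  0 0 0 0 1 0,
     0 0 0 0 1 1};                    /* bilinear spline, knot at k = 3 */
c = {0 0 1 3 -1 0};                   /* restriction pi10 + 3*pi11 - pi20 = 0 */
call eigen(evalue, evec, c` * c);
a1  = evec[, loc(evalue < 1e-8)];     /* gamma = A1*theta */
aa1 = a * a1;                         /* beta = A*A1*theta */
print aa1;
quit;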

As we have shown here, the reparameterization method provides a means of dealing with a number of cases in which various combinations of restrictions are imposed simultaneously. The ability to move from one representation to the other allows complex restrictions to be programmed in a manner that minimizes the potential for programming errors.



4. A test of linear restrictions using the reparameterized model

Another feature of the reparameterization is the ability to test the hypothesis implied by the linear restrictions. Using equation (A6) we have that the linear restriction Rβ = r is equivalent to the solution β = R+r + Aγ, where R+ = R′(RR′)⁻¹, γ is the vector of new parameters, and A is the matrix of the eigenvectors corresponding to the zero-valued eigenvalues of R′R. Thus we can make the substitution for β in the regression equation Y = Xβ + ε to obtain

Y = X (R+r + Aγ) + ε.

When we assume the restrictions hold, we transform the dependent variable and run a regression of the form

G = XAγ + ε,   where G = Y − XR+r.

However, if we want to test the hypothesis that Rβ = r, we can estimate ρ by using the following "augmented" reparameterized estimating equation, which is equivalent to the unrestricted equation,

(15)    Y = XR+ρ + XAγ + ε,

and then form the Wald test for the hypothesis that ρ = r. This is equivalent to testing Rβ̂ = r, where β̂ is the unrestricted estimate of β. As we show in Appendix A, when this hypothesis is true the reparameterized model is equivalent to the imposition of the linear restrictions; thus the assumption that ρ = r. Consequently, we can test the restrictions directly by estimating ρ̂; and if the restrictions are given in the reparameterized form, we can construct the equivalent linear restrictions and reformulate the model to test the restrictions, and to test each restriction separately when m > 1. In addition, the estimates of γ from (15) provide the unrestricted values of the new parameters.
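A sketch of the augmented regression (15) and the resulting Wald statistic, on a small simulated problem (the design, sample size, error scale, and the single restriction are all arbitrary choices made for the illustration):

proc iml;
r  = {1 1 0};  lr = {1};  m = nrow(r);     /* one restriction: b1 + b2 = 1 */
ginvr = r` * inv(r * r`);
call eigen(evalue, evec, r` * r);
a = evec[, loc(evalue < 1e-8)];
n = 50;
x = j(n, 1, 1) || ranuni(j(n, 2, 0));
b = {0.6, 0.4, 2};                         /* true betas satisfy H0 */
y = x*b + 0.1 * rannor(j(n, 1, 0));
w = (x*ginvr) || (x*a);                    /* augmented regressors of (15) */
est  = inv(w` * w) * w` * y;               /* (rho, gamma) estimates */
rho  = est[1:m];
s2   = ssq(y - w*est) / (n - ncol(w));
v    = s2 * inv(w` * w);
wald = (rho - lr)` * inv(v[1:m, 1:m]) * (rho - lr);  /* chi-square(m) under H0 */
print rho wald;
quit;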

5. Conclusions

This paper has shown how one can find a reparameterized model that is equivalent to any linear model subject to a set of linear restrictions, and how one can find a linear restriction that is equivalent to any linear reparameterization of a linear model. Although this method has been known for some time, previous authors in econometrics have not explicitly demonstrated a method for going between the linear restricted model and the reparameterized model; previously it had been suggested that the researcher derive this equivalence on a case-by-case basis. The method shown here allows the researcher to convert from one form of the model to the other in an automatic fashion.

The linear restricted model is the usual representation of the restricted linear model found in econometrics. However, reparameterization often helps in simplifying the estimation procedure, which may be especially important when iterative methods are used for estimation, as in the case of maximum likelihood. It would appear that textbook authors have tended to handle the reparameterization of the restricted least squares problem in a somewhat hazy fashion because they lacked a general method for the transformation of one form to the other; we hope that this is no longer the case. As demonstrated in Appendix C, modern computer routines make the transformation between these two equivalent restricted regression models possible with widely available programs.

Although we have concentrated on linear equations and linear restrictions, a common method for imposing or testing non-linear restrictions is based on the linearization of the restrictions via a first-order Taylor series approximation. Thus these forms of restrictions can be incorporated in the models presented here as well.



Appendix A. The Derivation of the ROP from the LFUP

This section provides the proof of the relationship between the linear function of unrestricted parameters (Rβ = r) and an equivalent reparameterization in other parameters (β = Aγ + d). We can solve the LFUP restriction equation, Rβ = r, using the general solution of a set of linear equations

(A1)    β = R+r + (Ik − R+R) φ.

It can be shown that if a solution exists, this equation provides a solution for any value of φ (see, for example, Graybill 1983), where R+ is a generalized inverse of R. If, as in the present case, R is of rank equal to its number of rows (m), then we can define R+ = R′(RR′)⁻¹. However, this solution does not furnish a reduction in dimensionality: (Ik − R+R) is of dimension k x k. For example, suppose we use just the symmetry restriction from the example given in Section 2, so that

R = ( 0  0  1  0  −1  0 );

in this case we get

              [ 1  0   0  0   0  0 ]
              [ 0  1   0  0   0  0 ]
 (Ik − R+R) = [ 0  0  .5  0  .5  0 ]
              [ 0  0   0  1   0  0 ]
              [ 0  0  .5  0  .5  0 ]
              [ 0  0   0  0   0  1 ]

It is obvious from this example that no unique solution will be obtained for φ, and we have not succeeded in reducing the dimension of the regression problem (φ is of dimension k x 1). To make this reduction we use the singular value decomposition (see, for example, chapter 4 of Lawson and Hanson 1974) of R, whereby we can find a triplet of matrices of the form

(A2)

R = H C U′



where H is an orthogonal matrix with columns consisting of the eigenvectors of RR′; C is an (m x k) matrix carrying the square roots of the eigenvalues of RR′ on its leading m x m diagonal, listed in decreasing order; and U is an orthogonal k x k matrix with columns consisting of the eigenvectors of R′R. Thus, based on the information that the rank of R is m, we have the following partition of C and U:

(A3)    R = H(m x m) [ C1(m x m) : 0(m x j) ] [ U1′(m x k) ]
                                              [ U2′(j x k) ]

where j = k − m; U1′ is made up of the m rows of U′ associated with the nonzero eigenvalues, which are on the diagonal of the m x m matrix C1; and U2′ contains the j rows associated with the zero-valued eigenvalues. Based on the properties of the orthogonal matrices H and U, we can use (A2) to rewrite our value for R+ as

R+ = U C′ H′ (H C U′ U C′ H′)⁻¹
   = U C′ H′ H (C C′)⁻¹ H′
   = U C′ (C C′)⁻¹ H′
   = U C* H′,

where C* is defined as

(A4)    C* = [ C1⁻¹ ]
             [  0   ]



Then by substitution we find

 R+R = U C* H′ H C U′ = U C* C U′
     = [ U1 : U2 ] [ I(m x m)  0(m x j) ] [ U1′ ]
                   [ 0(j x m)  0(j x j) ] [ U2′ ]
     = U1 U1′.

Thus we can use I = UU′ and get

(A5)    (I − R+R) = U U′ − U1 U1′ = U1 U1′ + U2 U2′ − U1 U1′ = U2 U2′.

By substitution of (A5) into (A1), a value for β is given as

(A6)    β = R+r + U2 U2′ φ,

where U2′φ is of dimension j x 1. Thus, the equivalent new parameter set is

(A7)    γ = U2′ φ,

and we define the transformation matrix as

(A8)    A = U2.

From (A7) we can now write the value for β as

(A9)    β = Aγ + d.

Thus, our least squares estimate of β (incorporating the restriction) is

(A10)    β̂ = A (A′X′XA)⁻¹ A′X′ (Y − Xd) + d.

We can confirm that this estimate is equivalent to the estimate obtained using the LFUP form by substituting (A9) into Rβ = r to get

(A11)    R (Aγ + d) = r;

using the singular value decomposition of R in (A2), and substituting for d and A, this can be written as

(A12)    H C U′U2 γ + R R+ r = r.

Because C U′U2 = 0, (A12) reduces to r = r. Thus, the ROP form of the restrictions is consistent with the LFUP form. This demonstrates that A can be any of a wide class of matrices which, when premultiplied by R, yield the zero matrix.

because CU 82 = 0 we have that (A12) reduces to r = r. Thus, the ROP form of the restrictions is consistent with the LFUP form. This demonstrates that the value of A can be a wide class of matrices which when postmultiplied by R is equal to the zero matrix. Appendix B The derivation of the LFUP form from the ROP form. This appendix provides a proof that for any linear equation that relates the vector of OHQJWKNRIWKHSDUDPHWHUV WRDYHFWRURIOHQJWKMVD\ VXFKWKDW  A d, can be shown to be equivalent to a set of m (m=k-j) linear restriction in the parameters in the form R  r, where R is a m by k matrix equal to the transpose of the eigenvectors of AA that correspond to the zero valued eigenvalues of AA and the r = Rd. Following the derivation given in Fomby Hill and Johnson(1984, pages 85 and 393) we can rewrite the ROP form of the restriction equation as; (B1)

β = Aγ + d

and premultiply both sides of the equation by the Moore-Penrose generalized inverse of A; (B2)

A+ β = γ + A+ d

where, when A is of full column rank (rank of A equal to j), A+ can be defined by

(B3)

A+ = (A′A)⁻¹ A′.

Thus we can solve for γ as a function of β:

(B4)

γ = A+ (β - d)

By substitution of (B4) into (B1) we get a linear equation in β only:

(B5)

( I - AA+ ) β = ( I - AA+ ) d



However, as FHJ point out in an exercise (17.3), the matrix (I − AA+) is of dimension k by k; it is not of full row rank and is unsatisfactory for use in the traditional restricted least squares solution, so it is necessary to reduce the dimensionality. To make this reduction we use the singular value decomposition of A, whereby we can find a triplet of matrices of the form (as in (A2))

(B6)    A = H C U′;

in this case H is a k x k orthogonal matrix with columns consisting of the eigenvectors of AA′, C is a (k x j) matrix with the square roots of the eigenvalues of A′A on the top j x j diagonal, and U is an orthogonal j x j matrix with columns consisting of the eigenvectors of A′A. Thus, based on the information that the rank of A is j, we have the following partition of C and U:

(B7)    A = [ H1(k x j) : H2(k x m) ] [ C1(j x j) ] U′(j x j),
                                      [ 0(m x j)  ]

where H1 comprises the j columns of H associated with the nonzero eigenvalues, which are on the diagonal of C1, and H2 comprises the k − j columns associated with the zero-valued eigenvalues. Based on the properties of the orthogonal matrices H and U, we can use (B6) to rewrite our value for A+ as

A+ = (U C′ H′ H C U′)⁻¹ U C′ H′
   = U (C′C)⁻¹ U′ U C′ H′
   = U (C′C)⁻¹ C′ H′
   = U C* H′,

where C* is defined as

(B8)    C* = [ C1⁻¹ : 0 ],

and then by substitution we find

 A A+ = H C U′ U C* H′ = H C C* H′
      = [ H1 : H2 ] [ I(j x j)  0(j x m) ] [ H1′ ]
                    [ 0(m x j)  0(m x m) ] [ H2′ ]
      = H1 H1′.

Thus we can use I = HH′ and get

(B9)    (I − AA+) = H H′ − H1 H1′ = H1 H1′ + H2 H2′ − H1 H1′ = H2 H2′.

By substitution of this result into (B5), the restriction in terms of β is given as

(B10)    H2 H2′ β = H2 H2′ d,

and by premultiplying both sides by H2′ we get

(B11)    H2′ β = H2′ d.

Thus R = H2′ and r = H2′d in the equation given in Rβ = r form. We can show this is equivalent by substituting (B1) into the LFUP form as in (A11),

(B12)    R (Aγ + d) = r;

by replacing A with its singular value decomposition (B6) and substituting the values of R and r we obtain

(B13)    H2′ H C U′ γ + H2′ d = H2′ d;

since H2′ H C = 0, this reduces to

(B14)    H2′ d = H2′ d.

Thus, in analogy to Appendix A, we find that the R matrix can be any matrix which, when postmultiplied by A, is equal to zero.



Appendix C. An example computer program

The following computer program is written in the PROC IML programming language (SAS Institute 1989); however, it could be written in other similar languages. It simulates data for a regression in the same form as the OLS estimation of the translog demand equation case discussed above. Three different methods are used for computing the restricted estimates. The first uses the traditional restricted least squares estimator

(C1)    β̃1 = (X′X)⁻¹X′Y + (X′X)⁻¹R′(R(X′X)⁻¹R′)⁻¹(r − R(X′X)⁻¹X′Y).

The second set of estimates is based on the reparameterization method, with the parameter estimates formed using

(C2)    β̃2 = A (A′X′XA)⁻¹ A′X′ (Y − Xd) + d.

The third estimate uses the values of A and d to form an equivalent R and r (which are not equal to the R and r used in (C1)) and recomputes the restricted estimates with these equivalent, but not equal, values. The program creates the A matrix and then the equivalent R and r matrices.



Computer program written in the SAS PROC IML language:

proc iml;
/* Define the restriction matrix R (called r here) and the vector of
   restriction constants r (called lr) */
r  = {0 1 1 0  0 0,
      0 0 0 0  1 1,
      0 0 1 0 -1 0,
      1 0 0 1  0 0};
lr = {0, 0, 0, 1};
/* Compute the generalized inverse of R as ginvr */
rrp   = r * r`;
ginvr = r` * inv(rrp);
/* Compute the value of d */
d = ginvr * lr;
/* Compute the eigenvalues and eigenvectors of R'R */
rpr = r` * r;
call eigen(evalue, evec, rpr);
/* Find the index of those eigenvalues that are approximately zero */
f = loc(evalue < 1e-8);
/* Define the matrix A as the matrix with the columns indexed by f */
a = evec[, f];
print a d;

/* In this section we transform the A matrix and d vector to an
   equivalent R and r type restriction. First compute AA' and find
   the eigenvalue/eigenvector sets */
aap = a * a`;
call eigen(evalue, evec, aap);
/* Locate the approximately zero values of the eigenvalues */
f = loc(evalue < 1e-8);
/* Define the new value of R as nr and the new value of r as nlr */
nr  = evec[, f]`;
nlr = nr * d;
print nr nlr;

/* Simulate a set of data equivalent to the two-equation translog case
   used in the example by generating uniformly distributed y's and x's
   (the sample size and seeds of the original listing are not
   recoverable; T = 100 and seed 0 are used here) */
y1 = ranuni(j(100, 1, 0));
y2 = 1 - y1;
y  = y1 // y2;
x1 = j(100, 1, 1) || ranuni(j(100, 2, 0));
x  = i(2) @ x1;
/* Compute the OLS estimates of the unrestricted betas, bhat */
xpx  = x` * x;
xpy  = x` * y;
ixpx = inv(xpx);
bhat = ixpx * xpy;
/* Compute the three versions of the restricted betas:
   br1 = the traditional approach using R and r,
   br2 = the reparameterized problem using A and d,
   br3 = the traditional approach using the R and r derived from A and d */
br1 = bhat + ixpx * r`  * inv(r  * ixpx * r` ) * (lr  - r  * bhat);
br2 = a * inv(a` * x` * x * a) * a` * x` * (y - x * d) + d;
br3 = bhat + ixpx * nr` * inv(nr * ixpx * nr`) * (nlr - nr * bhat);
print bhat br1 br2 br3;
quit;

The resulting output from the program prints A (which corresponds, up to column scaling, to the A derived in Section 2), D (matching d = (.5, 0, 0, .5, 0, 0)′), NR and NLR (matching the equivalent R and r shown in Section 2), and BHAT, BR1, BR2, BR3, with the three restricted estimates agreeing with one another; the particular values of the estimates depend on the simulated data.



References

Almon, S., (1965), "The Distributed Lag Between Capital Appropriations and Expenditures", Econometrica, 33, 178-196.

Blundell, R., (1988), "Consumer Behaviour: Theory and Empirical Evidence - A Survey", Economic Journal, 98, 16-65.

Byron, R. P., (1982), "A note on the estimation of symmetric systems", Econometrica, 50, 1573-1575.

Chambers, R., (1988), Applied Production Analysis, Cambridge University Press, Cambridge, UK.

Christensen, L. R. and W. H. Greene, (1976), "Economies of Scale in U.S. Electric Power Generation", Journal of Political Economy, 84, 655-676.

Davidson, R. and J. G. MacKinnon, (1993), Estimation and Inference in Econometrics, Oxford University Press, New York, NY.

Deaton, A. and J. Muellbauer, (1980), Economics and Consumer Behavior, Cambridge University Press, Cambridge, UK.

Fomby, T. B., R. C. Hill and S. R. Johnson, (1984), Advanced Econometric Methods, Springer-Verlag, New York, NY.

Graybill, F. A., (1983), Matrices with Applications in Statistics, 2nd Edition, Wadsworth, Belmont, CA.

Greene, W. H., (1997), Econometric Analysis, 3rd Edition, Macmillan, New York, NY.

Greene, W. H. and T. G. Seaks, (1991), "The Restricted Least Squares Estimator: A Pedagogical Note", Review of Economics and Statistics, 73, 563-567.

Griffiths, W. E., R. C. Hill and G. G. Judge, (1993), Learning and Practicing Econometrics, John Wiley & Sons, New York, NY.

Hirschberg, J. G., (1992), "A computationally efficient method for bootstrapping systems of demand equations: a comparison to traditional techniques", Statistics and Computing, 2, 19-24.

Huang, K. S. and R. C. Haidacher, (1983), "Estimation of a Composite Food Demand System for the United States", Journal of Business & Economic Statistics, 1, 285-291.

Johnston, J. and J. DiNardo, (1997), Econometric Methods, 4th Edition, McGraw-Hill, New York, NY.

Judge, G. G., W. E. Griffiths, R. C. Hill, H. Lütkepohl and T.-C. Lee, (1985), The Theory and Practice of Econometrics, 2nd Edition, John Wiley and Sons, New York, NY.

Lawson, C. L. and R. J. Hanson, (1974), Solving Least Squares Problems, Prentice-Hall, Englewood Cliffs, NJ.

Mantell, E. H., (1973), "Exact Linear Restrictions on Parameters in the Classical Linear Regression Model", The American Statistician, 27, 86-87.

Phlips, L., (1983), Applied Consumption Analysis, North-Holland, Amsterdam, The Netherlands.

SAS Institute Inc., (1989), SAS/IML Software: Usage and Reference, Version 6, First Edition, SAS Institute Inc., Cary, NC.

Suits, D. B., (1984), "Dummy Variables: Mechanics V. Interpretation", The Review of Economics and Statistics, 66, 177-180.

Theil, H., (1971), Principles of Econometrics, John Wiley and Sons, New York, NY.

Theil, H., (1980), The System-wide Approach to Microeconomics, University of Chicago Press, Chicago, IL.

Tintner, G., (1952), Econometrics, John Wiley and Sons, New York, NY.


