Measurement Errors in the Estimation of Home Value Philip K. Robins; Richard W. West Journal of the American Statistical Association, Vol. 72, No. 358. (Jun., 1977), pp. 290-294. Stable URL: http://links.jstor.org/sici?sici=0162-1459%28197706%2972%3A358%3C290%3AMEITEO%3E2.0.CO%3B2-K Journal of the American Statistical Association is currently published by American Statistical Association.
Measurement Errors in the Estimation of Home Value
PHILIP K. ROBINS and RICHARD W. WEST*
Home value plays an important role in a variety of fields of research. In this paper, three commonly used measures of home value are evaluated and compared using a framework developed by Joreskog and Goldberger (1975) for estimation of causal models containing unobserved variables. The paper extends ideas presented by Kain and Quigley (1972) and Kish and Lansing (1954). KEY WORDS: Home value; Measurement errors; Unobserved variables; Measurement errors in home value.
1. INTRODUCTION
Home equity represents a major component of the total net worth position of many households in the United States. The imputed income associated with such an asset can be quite sizable when compared to other income received, particularly nonwage income. One particular measure of home value, the owner-estimate, is used to analyze a wide variety of economic problems. For example, in empirical studies of housing demand, owner-estimated home value is used to estimate such measures as income and price elasticities. For a discussion of some of these studies, see de Leeuw (1971). In a study of labor supply, Hall (1973) uses imputed income from owner-estimated home value (among other things) to calculate income and substitution effects.
Despite the importance of home value in analyzing various economic problems, little is known about the accuracy of owner-estimates as a measure of market value. One suspects that owner-estimates contain significant measurement error since most families do not keep abreast of conditions in real estate markets.
In two previous papers by Kain and Quigley (1972) and Kish and Lansing (1954), the subject of measurement errors in the estimation of home value is addressed. In each of these studies an attempt is made to determine the accuracy of owner-estimated home values by comparing them to professionally appraised valuations of the same properties. Implicit in both analyses is the assumption that the appraised values are free from measurement error. Since the appraisers did not have access to the interiors of the properties, this assumption is quite tenuous. Moreover, appraisal practices vary considerably across firms, and there is no reason to assume that any particular appraiser would be providing perfectly accurate measures of true market value.¹
In this paper we present an alternative method of determining the accuracy of owner-estimated home value, based on the framework developed by Joreskog and Goldberger (1975) for estimating causal models containing unobserved variables. The model consists of a system of structural equations in which there are several measures of the unobserved true market value of a property, which itself is determined by a set of causal variables. Such a specification has been termed the MIMIC model (Multiple Indicators and Multiple Causes) by Joreskog and Goldberger and represents a special case of the covariance structure model analyzed by Joreskog (1970, 1973). Versions of the MIMIC model have appeared in Zellner (1970), Hauser and Goldberger (1971), and Goldberger (1971, 1972a, 1972b).
In the next section of this paper we describe the statistical model. Estimates of the model using a particular set of data are presented in Section 3. The final section discusses the implications of the analysis.
* Philip K. Robins and Richard W. West are Economists with Stanford Research Institute, Menlo Park, CA 94025. The research reported in this paper was performed under contracts with the states of Washington and Colorado, prime contractors for the Department of Health, Education, and Welfare, Washington, D.C., under contract numbers SRS-70-53 and SRS-71-18, respectively. The opinions expressed in the paper are those of the authors and should not be construed as representing the opinions or policies of the states of Washington or Colorado or any agency of the United States Government. The authors wish to acknowledge the computational assistance of Steven Spickard and helpful comments by Arthur S. Goldberger and Robert G. Spiegelman.
2. THE MODEL
2.1 Specification
The model consists of two portions. The first portion specifies that each of $M$ measures of home value is a linear function of the true (unobserved) market value of the home plus an error term,
$$y = \gamma_0 + \gamma y^* + u, \tag{2.1}$$
where $y$ is an $M \times 1$ column vector of home value measures, $\gamma_0$ and $\gamma$ are $M \times 1$ column vectors of parameters, $y^*$ is the scalar unobserved true market value, and $u$ is an $M \times 1$ column vector of error terms. The stochastic assumptions of this portion of the model are that $E(u) = 0$, $E(uy^*) = 0$, and $E(uu') = \Theta^2$, a diagonal matrix. Thus, it is assumed that the errors in each of the $M$ equations determining the measures of home value are uncorrelated. The assumption of uncorrelated error terms in these measurement equations can be tested in the context of the entire model. The results of such a test are presented in Section 3.3.
¹ This is perhaps too harsh a criticism of their studies. For example, Kish and Lansing (1954, p. 532) recognize that the appraised values contain measurement error but are unable to effectively deal with the problem.
© Journal of the American Statistical Association, June 1977, Volume 72, Number 358, Applications Section
The second portion of the model represents true market value as a linear function of a set of $K$ causal property characteristic variables plus an error term,
$$y^* = \alpha_0 + \alpha' x + \epsilon, \tag{2.2}$$
where $\alpha_0$ is a scalar constant, $\alpha'$ is a $1 \times K$ row vector of parameters, $x$ is a $K \times 1$ column vector of property characteristics, and $\epsilon$ is a scalar error term. It is assumed that $E(\epsilon) = 0$, $E(\epsilon^2) = \sigma^2$, $E(\epsilon x') = 0$, $E(\epsilon u') = 0$, and $E(xu') = 0$. Thus the stochastic assumptions of the second portion of the model are that the error term in the causal equation is uncorrelated with the error terms in the measurement equations and with the independent variables in the causal equation, and that the error terms in the measurement equations are uncorrelated with the independent variables in the causal equation.²
The reduced form of the model is given by
$$y = \Pi_0 + \Pi x + v, \tag{2.3}$$
where $\Pi_0 = \gamma_0 + \gamma\alpha_0$, $\Pi = \gamma\alpha'$, and $v = \gamma\epsilon + u$. The variance-covariance matrix of $v$ is given by
$$\Omega = \sigma^2 \gamma\gamma' + \Theta^2. \tag{2.4}$$
Thus the reduced form of the model contains restrictions on both the coefficients of the exogenous variables ($\Pi$) and on the variance-covariance matrix of the reduced form disturbances ($\Omega$).
Since the empirical implications of the model are invariant with respect to multiplication of $\gamma$ by a constant and division of $\alpha_0$, $\alpha$, and $\sigma$ by that same constant, or with respect to the addition of a constant to $\alpha_0$ and subtraction of $\gamma$ times that constant from $\gamma_0$, the model must be normalized to identify the parameters. The effect of the normalization is to uniquely scale all the parameters.³
In this study we consider three measures of home value ($M = 3$). These include (i) an appraised value by a private firm, (ii) an estimate of the value by the owner, and (iii) the value used by the county assessment officials for property tax purposes. The normalization we use is to assume that the appraised value is an unbiased estimator of true market value. Thus the first elements of $\gamma_0$ and $\gamma$ are assumed equal to zero and one, respectively. This assumption contrasts with the one made by Kish and Lansing (1954) and Kain and Quigley (1972) where, for each observation, the appraised value is assumed equal to the true market value.
² While the x variables are referred to as being "causal," for our purposes it is not necessary that (2.2) be a completely specified model of the determination of home value. All that is required is that $x$ be uncorrelated with $u$ and correlated with $y^*$.
³ After normalization, the number of overidentifying restrictions in the model is given by $q = [2K(M - 1) + M(M - 3)]/2$. If $M \geq 2$ and $K + M \geq 3$, the model is identified. These conditions are sufficient for identification if $K$ is interpreted as the number of causal variables having nonzero parameters, and $M$ is interpreted as the number of indicators with nonzero coefficients on $y^*$.
2.2 Estimation
Assuming that the error terms are distributed normally and that the $x$ vectors are fixed, the log of the likelihood function, apart from an irrelevant constant term (see Joreskog and Goldberger (1975, p. 632)) is
$$L = -\frac{N}{2}\left[\log|\Omega| + \mathrm{tr}(\Omega^{-1}W)\right], \tag{2.5}$$
where $N$ is the sample size and $W$ is the sample variance-covariance matrix of the reduced form residuals. Maximum likelihood estimates of the model are obtained using the iterative maximization procedure described in Joreskog, Gruvaeus, and van Thillo (1970).⁴
⁴ This procedure actually maximizes a different function, but Joreskog and Goldberger (1975, p. 633) prove that it has the same global maximum as (2.5).
3. EMPIRICAL ESTIMATES OF THE MODEL
3.1 Data
The data used in this study consist of information collected on a random sample of single family dwelling units owned by families in the Seattle Income Maintenance Experiment. The families all have low or moderate incomes.⁵ While use of a low and moderate income sample may preclude extrapolation of the results to broader populations, it nevertheless is an interesting group to analyze because errors in owner-estimated home value are likely to be most serious for such families.
For each of the properties in our sample, we have collected the three measures of home value described earlier plus a large number of property characteristics. The property characteristics were obtained from records filed at the county assessor's office and include such things as age, lot size, number of rooms, etc. Twelve of these characteristics were selected for use in this study ($K = 12$).⁶
Means and standard deviations of the three measures of home value and the twelve property characteristics are presented in Table 1. It is interesting to note that, among the measures of home value, the owner-estimate has the highest mean while the assessed value has the lowest mean. Using a standard one-tailed t test for the difference between means, the null hypotheses of no differences between the mean of the owner-estimate and the means of either of the other two measures are rejected at the five percent level of significance.⁷ The null hypothesis of no difference between the means of the appraised and assessed values, however, cannot be rejected at the same level of significance.
⁵ The Seattle Experiment is comprised of approximately 2,100 families residing in the Seattle Standard Metropolitan Statistical Area. The families were selected for the experiment on the basis of ethnicity, income, and number of family heads. For a description of the experiment, see Kurz and Spiegelman (1971). Of the 2,100 families, approximately two-fifths are homeowners. Initially, 150 of these families were selected on a random basis for inclusion in this study. However, because of coding errors and missing information, a complete set of data was obtained for only 138 families. All the data pertain to 1971.
⁶ These twelve characteristics are among the most important determinants of home value and were selected for that reason. See Kain and Quigley (1970) and Richardson, Vipond, and Furbey (1974).
Table 1. Means and Standard Deviations of Variables (N = 138)
Item                                                              Mean    Standard deviation
Measures of home value (thousands of dollars)
  Appraised value
  Owner-estimate
  Assessed value
Property characteristics
  Construction gradeᵃ
  1 if attached garage
  1 if detached garage
  1 if basement garage
  Finished area (hundreds of sq. ft.)
  1 if substandard storage space
  1 if quality of home is below general neighborhood standards
  Number of stories
  Number of built-insᵇ
  Effective age (years)ᶜ
  Number of rooms
  Lot size (hundreds of sq. ft.)
ᵃ A continuous index of the construction quality of improvements based on an integer grade and percentage variations above and below this grade. The grades are in compliance with the State of Washington standards.
ᵇ Includes barbecues, dishwashers, ovens, etc.
ᶜ Actual age of the property adjusted downward for above average maintenance or remodeling.
⁷ Kain and Quigley (1972, p. 804) find no significant difference in the means of the owner-estimate and appraised value.
Table 2. Maximum Likelihood Estimates (N = 138)
a. Home value measurement equationsᵃ
Source              $\gamma_0$    $\gamma$         $\theta$         $R_I^2$
Appraised value     0             1.000 (-)        2.415 (.158)     .553
Owner-estimate
Assessed value      -5.909        1.339 (.120)     1.612 (.169)     .832
b. Causal equationᵇ
Variable                                             $\alpha$    Estimated asymptotic standard error of $\alpha$
Construction grade                                   1.077       .201
1 if attached garage                                 .134        .555
1 if detached garage                                 .693        .248
1 if basement garage                                 1.537       .433
Finished area (hundreds of sq. ft.)                  .223        .045
1 if substandard storage space                       -1.391      .679
1 if quality is below neighborhood standards         -1.144      .646
Number of stories                                    .493        .366
Number of built-ins                                  .461        .172
Effective age                                        -.073       .010
Number of rooms                                      .287        .129
Lot size (hundreds of sq. ft.)                       .014        .006
Constant term ($\alpha_0$)                           5.704       (-)
c. Derived statistics
Statistic                        Estimate
$R_C^2$                          .907
$R_{CI}^2$                       .982
$\alpha'\Phi\alpha$              6.530
$\gamma'\Omega^{-1}\gamma$       .544
$V(y^*)$                         7.201
ᵃ Estimated asymptotic standard errors in parentheses.
ᵇ $\sigma$ = .819, and standard error of $\sigma$ = .163.
3.2 Estimation Results
Table 2 presents the maximum likelihood estimates of the structural model. In addition to the parameter estimates and estimates of their asymptotic standard errors, a series of goodness-of-fit measures are also reported. The goodness-of-fit measures are defined as
$$R_C^2 = V[E(y^* \mid x)]/V(y^*) = \alpha'\Phi\alpha/(\alpha'\Phi\alpha + \sigma^2); \tag{3.1}$$
$$R_{Ii}^2 = V[E(y_i \mid y^*)]/V(y_i) = 1 - \theta_i^2/V(y_i), \quad i = 1, 2, 3; \tag{3.2}$$
$$R_{CI}^2 = V[E(y^* \mid x, y)]/V(y^*) = (\alpha'\Phi\alpha + \gamma'\Omega^{-1}\gamma)/(\alpha'\Phi\alpha + \sigma^2); \tag{3.3}$$
where
$\Phi$ = the sample variance-covariance matrix of $x$,
$R_C^2$ = the proportion of the variance of $y^*$ accounted for by $x$,
$R_{Ii}^2$ = the proportion of the variance of $y_i$ accounted for by $y^*$, and
$R_{CI}^2$ = a measure of the overall explanatory power of the observables; it may be viewed as the $R^2$ from a theoretical regression of $y^*$ on $y$ and $x$.
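As a rough numerical check, the definitions of $R_C^2$ and $R_{Ii}^2$ given above can be evaluated directly from the point estimates in Table 2. The sketch below is illustrative and not part of the original analysis. The function name `goodness_of_fit` is hypothetical, and the code assumes $V(y_i) = \gamma_i^2 V(y^*) + \theta_i^2$, which follows from the measurement equations of Section 2.1 and the assumption that $u$ is uncorrelated with $y^*$. Only the appraised and assessed value rows are used.

```python
def goodness_of_fit(alpha_phi_alpha, sigma, gammas, thetas):
    """Return V(y*), R2_C, and a list of R2_I values (one per measure).

    alpha_phi_alpha : variance of y* explained by x (alpha' Phi alpha)
    sigma           : standard deviation of the causal-equation error
    gammas, thetas  : slope and error standard deviation of each measure
    """
    v_ystar = alpha_phi_alpha + sigma ** 2      # V(y*) = alpha'Phi alpha + sigma^2
    r2_c = alpha_phi_alpha / v_ystar            # share of V(y*) explained by x
    r2_i = [
        # Assumed: V(y_i) = gamma_i^2 V(y*) + theta_i^2
        1.0 - theta ** 2 / (gamma ** 2 * v_ystar + theta ** 2)
        for gamma, theta in zip(gammas, thetas)
    ]
    return v_ystar, r2_c, r2_i


# Point estimates as reported in Table 2 (appraised value, assessed value).
v_ystar, r2_c, r2_i = goodness_of_fit(
    alpha_phi_alpha=6.530,
    sigma=0.819,
    gammas=[1.000, 1.339],
    thetas=[2.415, 1.612],
)
print(f"V(y*) = {v_ystar:.3f}, R2_C = {r2_c:.3f}")
print("R2_I (appraised, assessed) =", [round(r, 3) for r in r2_i])
```

Because the inputs are rounded, the results should only be expected to agree with the .907, .553, and .832 reported in Table 2 to about three decimal places.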
In the causal equation the signs of the estimated coefficients of each of the twelve property characteristic variables conform with what one would expect a priori. Using a normal test, nine of the coefficients are significant at the five percent level. The $R_C^2$ of .91 suggests that most of the variation in true market value is explained by the property characteristics. Among the measurement equations, the one for assessed value has the smallest estimated error variance while the one for the owner-estimate has the largest.
3.3 Tests of Parameter Restrictions
Before deriving implications about the relative merits of the three housing value measures, it will be useful to determine the validity of the model's structure by testing the restrictions on the coefficients of the reduced form ($\Pi$) and on the variance-covariance matrix of the reduced form disturbances ($\Omega$).
Since there are only three measures of home value, restrictions on the reduced form disturbances exist only when the coefficient restrictions exist. Thus to test the null hypothesis that the coefficient and disturbance restrictions in the fully restricted model are valid, we estimate two additional models: the unrestricted reduced form model and a model with only coefficient restrictions. Comparing the unrestricted reduced form model to the model with only coefficient restrictions provides a test of the coefficient restrictions. Comparing the fully restricted model to the model with only coefficient restrictions provides a test of the disturbance restrictions. Finally, comparing the fully restricted model to the unrestricted reduced form model provides a joint test of the coefficient and disturbance restrictions.
The test statistic used is minus twice the log of the likelihood ratio, which is asymptotically distributed as chi-square with degrees of freedom equal to the difference in the number of parameters in the two models being compared. The results of the tests are presented in Table 3.
Table 3. Tests of Coefficient and Disturbance Restrictions
Test                                                                  $\chi^2$    d.f.    Critical value (5 percent level)
Coefficient restrictions                                              18.90       22      33.90
Disturbance restrictions (conditional on coefficient restrictions)    2.98        2       5.99
Coefficient and disturbance restrictions jointly                      21.88       24      36.40
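The degrees of freedom in Table 3 can be reproduced by counting free parameters, and the test statistics can be converted to tail probabilities. The sketch below is illustrative only. The parameter counts are an assumption based on the normalization described in Section 2.1 (intercepts are left out because they are common to all three models), and `free_parameters` and `chi2_sf_even` are hypothetical helper names. The tail probability uses the closed form that holds for an even number of degrees of freedom, which happens to cover all three tests here.

```python
import math


def free_parameters(M, K):
    """Slope and covariance parameters in the three nested models (assumed counts)."""
    cov_unrestricted = M * (M + 1) // 2         # free elements of a symmetric M x M matrix
    coef_restricted = K + (M - 1)               # alpha plus gamma with one element fixed at 1
    unrestricted = M * K + cov_unrestricted     # unrestricted reduced form
    coef_only = coef_restricted + cov_unrestricted
    full = coef_restricted + (1 + M)            # sigma^2 plus M error variances
    return unrestricted, coef_only, full


def chi2_sf_even(x, df):
    """Upper-tail chi-square probability, valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):                 # sum of (x/2)^j / j! for j < df/2
        term *= half / j
        total += term
    return math.exp(-half) * total


unrestricted, coef_only, full = free_parameters(M=3, K=12)
tests = {
    "coefficient": (18.90, unrestricted - coef_only),
    "disturbance": (2.98, coef_only - full),
    "joint": (21.88, unrestricted - full),
}
p_values = {name: chi2_sf_even(stat, df) for name, (stat, df) in tests.items()}
for name, (stat, df) in tests.items():
    print(f"{name:12s} chi2 = {stat:5.2f}  d.f. = {df:2d}  p = {p_values[name]:.3f}")
```

Under these counts the degrees of freedom come out as 22, 2, and 24, and each tail probability is well above .05, which is in line with the conclusion that none of the restrictions is rejected.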
The test results indicate that the restricted model yields a fully adequate description of the relationships between the property characteristics and the measures of home value. The hypothesis that both the disturbance restrictions and the coefficient restrictions are valid is accepted at the five percent level of significance. This is also true for separate tests of the coefficient restrictions and of the disturbance restrictions (conditional on the coefficient restrictions). Overall, the restrictions of the model are consistent with the valuating behavior of the appraiser, homeowner, and assessor.
3.4 Interpretation of Results
Given the validity of the model, some positive statements can be made about the relative accuracy of the home value estimates made by the appraiser, homeowner, and assessor. First, relative to the appraiser, the homeowner's estimates of market value do not suffer from any multiplicative bias ($\gamma_2 = .973$ is not significantly different from 1). They do, however, contain an additive bias relative to those of the appraiser. Homeowners tend to overvalue their homes by about \$1,260. Second, the standard deviation of the error term in the measurement equations is about the same for the appraiser and the homeowner (2.42 vs. 2.77). The chi-square statistic for testing the joint hypothesis that $\gamma_2 = \gamma_1 = 1$ and $\theta_2 = \theta_1$ is 2.30 which, with two degrees of freedom, is not significant at the five percent level. Thus except for the additive bias, the results indicate that the appraiser and the homeowner determine their estimates of housing value with about the same degree of accuracy.
The assessor's estimate has somewhat different characteristics. Relative to the appraised value, there appears to be a smaller error variance, a large positive multiplicative bias, and a large negative additive bias. The total bias at the mean of $y^*$, however, is small ($E(y_3 - y^*) = -.337$).
Using the parameter estimates it is possible to calculate the mean square error ($\mathrm{MSE}_i$) of each measure of home value,
$$\mathrm{MSE}_i = [E(y_i - y^*)]^2 + (1 - \gamma_i)^2 V(y^*) + \theta_i^2. \tag{3.4}$$
The mean square error is equal to the square of the bias at the mean of $y^*$ plus the square of the multiplicative bias times the variance of $y^*$ plus the error variance. Table 4 presents the components of the mean square error for each measure of home value. The root mean square error ($\mathrm{RMSE}_i$) is also presented. The RMSE of the owner-estimate is only slightly larger than that of the appraised value. The difference is almost entirely attributable to the higher error variance of the owner-estimate. However, the RMSE of the assessed value is smaller because of its small error variance and small bias at the mean of $y^*$. Thus even though the appraised value is assumed to be an unbiased estimate of home value, the results indicate that the assessed value is a better measure by the mean square error criterion. The RMSE of 2.89 for the owner-estimate compares with a value of 3.10 obtained by Kish and Lansing (1954).⁸
Table 4. Components of the Mean Square Error of the Three Measures of Home Value (Thousands of Dollars)
Statistic                      Appraised value    Owner-estimate    Assessed value
$[E(y_i - y^*)]^2$             0                  .661              .114
$(1 - \gamma_i)^2 V(y^*)$      0                  .001              .828
$\theta_i^2$                   5.832              7.678             2.599
$\mathrm{MSE}_i$               5.832              8.345             3.585
$\mathrm{RMSE}_i$              2.415              2.889             1.893
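The decomposition described above can be sketched numerically. The example below is illustrative and not part of the original analysis. The function name `mse_components` is hypothetical, the inputs are the rounded figures quoted in the text and tables, and the owner-estimate bias is an assumption backed out of the squared bias shown in Table 4. Agreement with the printed table is therefore only approximate.

```python
import math

V_YSTAR = 7.201  # V(y*) as reported among the derived statistics in Table 2


def mse_components(bias_at_mean, gamma, theta, v_ystar=V_YSTAR):
    """Split the mean square error of one home value measure into three parts."""
    squared_bias = bias_at_mean ** 2               # [E(y_i - y*)]^2
    slope_part = (1.0 - gamma) ** 2 * v_ystar      # (1 - gamma_i)^2 V(y*)
    error_variance = theta ** 2                    # theta_i^2
    return squared_bias, slope_part, error_variance


# (bias at the mean of y*, gamma_i, theta_i), in thousands of dollars.
measures = {
    "appraised": (0.0, 1.000, 2.415),
    "owner": (math.sqrt(0.661), 0.973, 2.77),      # bias backed out of Table 4 (assumption)
    "assessed": (-0.337, 1.339, 1.612),
}

rmse = {}
for name, (bias, gamma, theta) in measures.items():
    parts = mse_components(bias, gamma, theta)
    rmse[name] = math.sqrt(sum(parts))
    print(f"{name:10s} MSE = {sum(parts):.3f}  RMSE = {rmse[name]:.3f}")
```

With these inputs the ranking matches the discussion above: the assessed value has the smallest root mean square error and the owner-estimate the largest, with the appraised value in between.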
⁸ Their estimated RMSE is calculated as the standard deviation of the difference between the owner-estimate and the appraised value, which is 3.44 in our sample.
4. CONCLUSIONS
This paper has described how to specify a stochastic model which assumes measurement errors in the estimation of home value. Such a specification enables one to draw empirical implications about the relationships among several measures of home value and also about the extent of the errors inherent in those measures.
The model is estimated using data on a sample of properties owned by low and moderate income families in Seattle, Washington. Because of the special nature of the sample, the results are not applicable to properties owned by other groups or to assessors in other areas. With this qualification in mind, the empirical results suggest two things. First, it appears that all three measures of home value contain significant measurement error. The root mean square errors of the measures range from \$1,900 for the assessed value to \$2,900 for the owner-estimate. Measurement errors of this magnitude can lead to significant biases if, for example, home value is used as an independent variable in a multiple regression model. Second, the results indicate that the owner-estimate compares favorably to the appraised value in terms of precision. Thus the results do not provide support for the notion that, relative to professional organizations which specialize in valuating property, homeowners are substantially less precise when assigning values to their home.
[Received January 1976. Revised November 1976.]
REFERENCES
de Leeuw, Frank (1971), "The Demand for Housing: A Review of Cross Section Evidence," Review of Economics and Statistics, 52, 1-10.
Goldberger, Arthur S. (1971), "Econometrics and Psychometrics: A Survey of Communalities," Psychometrika, 36, 83-107.
Goldberger, Arthur S. (1972a), "Maximum-Likelihood Estimation of Regressions Containing Unobservable Independent Variables," International Economic Review, 13, 1-15.
Goldberger, Arthur S. (1972b), "Structural Equation Methods in the Social Sciences," Econometrica, 40, 979-1001.
Hall, Robert E. (1973), "Wages, Income and Hours of Work in the U.S. Labor Force," in Income Maintenance and Labor Supply: Econometric Studies, ed. H. Watts and G. Cain, Institute for Research on Poverty Monograph Series, Madison: University of Wisconsin Press.
Hauser, Robert M., and Goldberger, Arthur S. (1971), "The Treatment of Unobservable Variables in Path Analysis," in Sociological Methodology, ed. H.L. Costner, San Francisco: Jossey-Bass.
Joreskog, Karl G. (1970), "A General Method for Analysis of Covariance Structures," Biometrika, 57, 239-51.
Joreskog, Karl G. (1973), "Analysis of Covariance Structures," in Multivariate Analysis III, ed. P.R. Krishnaiah, New York: Academic Press.
Joreskog, Karl G., and Goldberger, Arthur S. (1975), "Estimation of a Model with Multiple Indicators and Multiple Causes of a Single Latent Variable," Journal of the American Statistical Association, 70, 631-39.
Joreskog, Karl G., Gruvaeus, G.T., and van Thillo, M. (1970), "A General Computer Program for Analysis of Covariance Structures," Research Bulletin, 70-15, Educational Testing Service, Princeton, New Jersey.
Kain, John F., and Quigley, John M. (1970), "Measuring the Value of Housing Quality," Journal of the American Statistical Association, 65, 532-48.
Kain, John F., and Quigley, John M. (1972), "Note on Owner's Estimate of Housing Value," Journal of the American Statistical Association, 67, 803-6.
Kish, Leslie, and Lansing, John B. (1954), "Response Errors in Estimating the Value of Homes," Journal of the American Statistical Association, 49, 520-38.
Kurz, Mordecai, and Spiegelman, Robert G. (1971), "The Seattle Experiment: The Combined Effect of Income Maintenance and Manpower Investment," American Economic Review, LXI, 22-9.
Richardson, Harry W., Vipond, Joan, and Furbey, Robert A. (1974), "Determinants of Urban House Prices," Urban Studies, 11, 189-99.
Zellner, Arnold (1970), "Estimation of Regression Relationships Containing Unobservable Independent Variables," International Economic Review, 11, 441-54.