Measurement of Simulation Variance in Parameter Estimation

by Daniel McFadden and Kenneth Train

November 15, 1995

I. Introduction

Estimation by simulation involves drawing random terms for each observation and calculating an element (such as the probability or score) that enters an objective function (e.g. the simulated log-likelihood function) and/or a first-order condition (e.g., the sum of simulated scores or simulated moment conditions). The resulting parameter estimates depend on the particular values of the draws for each observation and necessarily vary over draws. We investigate the variance in parameter estimates that is induced by the drawing of random terms and propose a practical method for measuring this simulation variance and correcting the standard errors of the estimates to account for it.

Our analysis is conducted in the context of a random-parameters logit model estimated on data regarding households’ choices among classes of new cars to buy. We find that the simulation variance in the estimated parameters is fairly large relative to the estimates themselves. In particular, the simulation standard deviation ranges from 3% to 41% of the estimated parameters when using 20 repetitions and, even with 1000 repetitions, ranges from 0.4% to 8% of the parameters. These large standard deviations in parameter estimates occur even though the simulation standard deviations in the average probability and the log-likelihood function are very small -- about one-twentieth of one percent of their values with as few as 20 repetitions. Apparently, small simulation variance in the probabilities does not guarantee small simulation variance in the estimated parameters. Even though the simulation variance in parameter estimates is sizeable relative to the parameters, it is smaller than the sampling variance in our application. With 20 repetitions, the simulation standard deviation ranges from 6% to 70% of the sampling standard deviation, and the range drops to 2%-12% with 1000 repetitions. We calculate standard errors that incorporate both simulation and sampling variance. Since the simulation variance is generally small relative to the sampling variance in our application, the standard errors that incorporate simulation variance are not much larger than the unadjusted standard errors.

Only two previous studies have examined the simulation variance in estimated parameters, both using Monte Carlo experiments. Our results are consistent with these and serve to clarify their implications. McFadden and Ruud (1994) obtain a simulation standard deviation that is 25% of the sampling standard deviation using the simulated EM algorithm with five repetitions on a trivariate probit model, and 74%-106% of the sampling standard deviation using maximum simulated likelihood and method of simulated scores with ten repetitions in a binary probit situation. Borsch-Supan and Hajivassiliou (1993) do not report the sampling standard deviation in their experiment; however, McFadden and Ruud constructed their trivariate probit to be very similar to that of Borsch-Supan and Hajivassiliou. Using the sampling standard deviation that McFadden and Ruud report, we infer that Borsch-Supan and Hajivassiliou obtain a simulation standard deviation that is 56% of the sampling standard deviation using the GHK simulator with 20 repetitions. These magnitudes of simulation variance relative to sampling variance are similar to our results with 20 repetitions, and higher than our results with 1000 repetitions.

In the trivariate probit experiments, the simulation standard deviation is less than 2% of the parameter in Borsch-Supan and Hajivassiliou’s study and less than 1% of the parameter in McFadden and Ruud’s study. Our analysis found much higher simulation variance relative to the parameters. However, the Monte Carlo experiments were constructed to have extremely small sampling variance. The t-statistic on the parameter in these studies is 30.2, which represents far less sampling variance than is encountered in typical applied work and, in particular, less variance than in our application. (Our t-statistics hover around two for the important parameters.) With so little sampling variance relative to the parameters, the simulation variance is necessarily small relative to the parameters even when it is a large proportion relative to the sampling variance. The binary probit experiment of McFadden and Ruud is more representative of applied situations: the t-statistic on the parameter is 3.7, indicating a moderate degree of sampling variance. In this experiment, the simulation standard deviation is 20-30% of the parameter, depending on the simulation method. This magnitude of simulation variance is similar to that obtained in our analysis with real-world data.1

II. Random-Parameter Logit

Random-parameter, or error-components, logit models have taken different forms in different applications; their commonality arises in the integration of the logit formula over unobserved random factors. The early applications (Boyd and Mellman, 1980, and Cardell and Dunbar, 1980) were restricted to situations in which the explanatory variables do not vary over decisionmakers, such that simulation is required for only one "decisionmaker" using aggregate share data rather than for each decisionmaker in a sample. Advances in computer speed have allowed estimation of models with explanatory variables varying over decisionmakers. Bolduc, Fortin, and Fournier (1993) describe the location choice of physicians; Revelt and Train (1995) estimate a model of appliance choice with repeated choices over time; and Erdem (1995) examines households’ choices among brands of food in repeated purchases. The form of the random-parameters logit that we utilize in our investigation is described as follows.

1 Keane (1994) and Geweke, Keane, and Runkle (1994) investigate several simulators for time-series probits. They provide standard deviations of estimated parameters where the standard deviations incorporate simulation and sampling variance combined; they also report asymptotic standard errors, which reflect sampling variance only. In principle the simulation standard deviation can be inferred from these figures. However, in most cases, the asymptotic standard error exceeds the standard deviation, due perhaps to simulation error in the calculation of these statistics or an upward bias in the standard errors. In those cases where the calculation is possible, the inferred simulation standard deviation is a relatively large share of the sampling standard deviation. For example, for the parameter representing correlation over time, estimated by simulated maximum likelihood with the GHK simulator (in Keane’s Table 1), the standard deviation due to sampling and simulation combined is 0.02826 and the mean asymptotic standard error is 0.02518, which imply a simulation standard deviation of 0.01283 -- 51% of the sampling standard deviation. The simulation standard deviation in these studies is small compared to the parameters, but this is because the experiments were constructed with very small sampling variance. For example, the t-statistic on the parameter discussed above is 23.0.

The utility that person n obtains from alternative i∈I is Uin = βnxin + εin, where vector xin is observed, parameter vector βn is unobserved for each n and varies in the population as specified below, and εin is an unobserved random term that is distributed iid extreme value, independent of βn and xin. Conditional on β, the choice probability is standard logit:

Lin(β) = exp(βxin) / Σj exp(βxjn).

The unconditional choice probability is therefore the integral of the conditional choice probability over all possible values of β: Pin = ∫Lin(β) f(β) dβ where f(.) is the density of β. Since β enters the utility of each alternative, the variance in β induces correlations between utilities for different alternatives. As a result, the choice probabilities do not exhibit independence from irrelevant alternatives, and a wide variety of substitution patterns can be induced by appropriate choice of variables and distribution for β.

The choice probability is approximated through simulation; more specifically, the integration in Pin is approximated by a summation over randomly chosen values of β. A value of β is drawn from its distribution, and Lin(β) -- the standard logit formula -- is calculated for this value of β. This process is repeated for many draws and the average of the resulting Lin(β)’s is taken as the approximate choice probability:

SPin = (1/R) Σr=1,...,R Lin(βr)

where R is the number of repetitions, βr is the r-th draw from f(β), and SPin is the simulated probability. The simulated log-likelihood function is constructed as SLL = Σn ln(SPi*n), where i*n denotes the chosen alternative for person n, and the parameter estimates are those that maximize SLL. Note that, even though the simulated probability is an unbiased estimate of the true probability, the log of the simulated probability with a finite number of repetitions R is not an unbiased estimate of the log of the true probability. The bias in SLL decreases as the number of repetitions increases.

In our application, we treat the elements of β as being independently normally distributed with means b and standard deviations w -- though of course other distributions are possible. Utility therefore becomes Uin = bxin + (w*µ)xin + εin, where µ is a vector of standard normal deviates and w*µ denotes the element-by-element product of w and µ. The unobserved portion of utility is (w*µ)xin + εin, which is correlated over alternatives insofar as xin is correlated over alternatives. The simulated probability is obtained by drawing standard normals, calculating the logit formula using bxin + (w*µ)xin as the argument for each alternative, and averaging the result over many repetitions of draws. The parameters b and w are estimated through maximum simulated likelihood.
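The simulated probability and log-likelihood above can be coded directly. The sketch below is illustrative, not the authors' code; the function names and array shapes are our own assumptions.

```python
import numpy as np

def simulated_probs(b, w, x, R, rng):
    """Simulated mixed-logit probabilities SPin = (1/R) sum_r Lin(beta_r).

    b, w : (K,) means and standard deviations of the normal coefficients
    x    : (N, J, K) attributes of J alternatives for N decisionmakers
    R    : number of repetitions
    """
    N, J, K = x.shape
    sp = np.zeros((N, J))
    for _ in range(R):
        mu = rng.standard_normal((N, K))          # draws for each observation
        beta = b + w * mu                         # beta_r = b + w*mu, element by element
        v = np.einsum('njk,nk->nj', x, beta)      # systematic utility of each alternative
        v -= v.max(axis=1, keepdims=True)         # stabilize exp()
        ev = np.exp(v)
        sp += ev / ev.sum(axis=1, keepdims=True)  # conditional logit Lin(beta_r)
    return sp / R                                 # average over the R draws

def simulated_loglik(b, w, x, chosen, R, rng):
    """SLL = sum_n ln(SP of the chosen alternative i*n)."""
    sp = simulated_probs(b, w, x, R, rng)
    return float(np.log(sp[np.arange(len(chosen)), chosen]).sum())
```

Each repetition's probabilities sum to one over alternatives, so the simulated probabilities do as well, for any R.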

III. Estimation Results Using Different Seeds and Different Numbers of Repetitions

The data consist of 1105 households who purchased a new car in 1992. The alternatives are defined to be six classes of cars, based largely on EPA ratings2. Explanatory variables are the average price, size (as measured in volume), and horsepower of cars in each class. Exploratory analysis with a standard logit model revealed that these are the most important variables in customers’ choice of car class (other variables affect make and model choice within a class; see McCarthy and Tay, 1989; Mannering and Winston, 1985, 1991). Household characteristics enter the model through interactions with car-class attributes. We chose a specification with few explanatory variables so as to focus on the issue of simulation variance in the estimation of these parameters. Three variables enter: price divided by income, volume divided by household size, and horsepower for households with no children. The first two of these were allowed to have varying coefficients, and the coefficient of the horsepower variable was assumed not to vary.

2 The data, as well as the classification scheme, were developed by Bart Davis of the Lawrence Berkeley Laboratory. We are grateful to him for sharing his data with us.

(Preliminary estimation obtained a very low t-statistic for the standard deviation of the horsepower coefficient.) Alternative-specific constants are included, with constant coefficients.

The model parameters were estimated separately using 20, 100, and 1000 repetitions. The starting values in each run were set at the standard logit estimates, with the starting value for the standard deviation of each varying parameter set to 0.1. For each number of repetitions, the model parameters were estimated 20 times, using 20 different seeds for the random number generator. The standard deviation in the estimates over the 20 runs provides an indication of the simulation standard deviation.
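The design of this experiment can be sketched in a few lines. Since re-running the full model is impractical here, the toy estimator below -- a one-parameter simulated-moments estimate with a closed form given the draws -- merely stands in for a maximum simulated likelihood run; everything about it is our own illustrative assumption.

```python
import numpy as np

def estimate(seed, R, m=2.0):
    """Stand-in for one full estimation run with a given seed.

    Toy simulated-moments estimator: choose theta so that the simulated
    mean of exp(theta + Z), Z ~ N(0,1), equals the target moment m.
    Given draws z_1..z_R, theta = ln(m) - ln(mean(exp(z_r))), so the
    estimate depends on the particular draws, as in the paper.
    """
    z = np.random.default_rng(seed).standard_normal(R)
    return np.log(m) - np.log(np.exp(z).mean())

def simulation_spread(R, n_seeds=20):
    """Re-estimate with n_seeds different seeds; the standard deviation
    over the runs indicates the simulation standard deviation."""
    est = np.array([estimate(seed, R) for seed in range(n_seeds)])
    return est.mean(), est.std(ddof=1)

# The spread over seeds shrinks as the number of repetitions R grows:
mean20, sd20 = simulation_spread(R=20)
mean1000, sd1000 = simulation_spread(R=1000)
```

In this toy, the standard deviation over seeds falls roughly with the square root of R, mirroring the drop from 20 to 1000 repetitions reported in Table 1.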

Results are given in Table 1. The average probability and log-likelihood at the starting values exhibit very little simulation variance. With as few as twenty repetitions, the standard deviation in the average probability is less than one-twentieth of one percent of the mean. With 1000 repetitions, this drops to less than one-hundredth of one percent.

The mean log-likelihood increases with the number of repetitions, as expected given the downward bias in the simulated log-likelihood. However, the bias seems to be very small: the difference between the mean log-likelihood with 20 repetitions and 1000 is only 0.00026, or 0.019%, and the difference between 100 and 1000 repetitions is 0.007%. The simulation variance in the log-likelihood is also small. With as few as 20 repetitions, the standard deviation is about one-twentieth of one percent of the mean, and drops to less than one-hundredth of one percent with 1000 repetitions.

The small simulation variance in the average probability and log-likelihood do not translate, however, into small simulation variance in the estimated parameters. With 20 repetitions, the standard deviations in the estimated parameters range from 3% to 41% of the mean estimate, depending on the parameter. The simulation variance decreases of course with more repetitions. With 1000 repetitions, the standard deviations range from only 0.4% to 8% of the mean estimates.


Even though the simulation variance is fairly large relative to the parameters, it is consistently smaller than the sampling variance. As shown in Table 1, with 20 repetitions, the standard deviation in the parameter estimates over draws ranges from 6% to 70% of the mean standard errors. With 1000 repetitions, the standard deviations are only 2% to 12% of the mean standard errors.

A simulation standard deviation of 41%, or even 8%, of an estimate can be a concern to a researcher in applied work. It implies that, just by changing the seed, the point estimates, on which forecasts and policy analysis are often based, change considerably. However, the fact that the simulation standard deviations are smaller than the sampling standard deviations implies that the researcher should already have been concerned that the point estimates would change, by even more, with different samples. Changing the seed is easier than collecting data on a new sample, and so the researcher is more directly confronted with the changes in point estimates with different seeds than with the larger changes that would come from different samples. From this perspective, the changes in point estimates over different seeds can be seen as a motivation for renewed emphasis on standard errors in forecasting and policy analysis.

IV. A Practical Procedure for Estimating Standard Errors that Incorporate Simulation Variance

The simulation standard deviation can be estimated directly by estimating the model parameters numerous times with different seeds and calculating the standard deviation of the results. This is the procedure that was applied for Table 1. However, this procedure is highly computer intensive. A more practical procedure for estimating the simulation variance relies on the result that one BHHH step away from a consistent parameter estimate is asymptotically equivalent to maximum likelihood (Berndt, et al., 1974; Newey and McFadden, 1994). The procedure is:

(1) Estimate the model parameters by maximum simulated likelihood using a particular seed for the random number generator. Provided the number of repetitions is sufficiently large that the bias is negligible, this can be considered a consistent estimate.

(2) At these estimates, calculate one BHHH step using each of several different seeds.

(3) Calculate the standard deviation of these steps; this is an estimate of the standard deviation due to simulation.

(4) Calculate "simulation adjusted" standard errors for the estimated parameters by combining the standard errors obtained in (1) with the standard deviations in (3). Since the simulation error is independent of the sampling error, the simulation adjusted standard error for any parameter is simply the square root of the sum of the square of its unadjusted standard error and the square of the standard deviation of the BHHH step.

To calculate a BHHH step, only the gradient of the simulated log-likelihood is required: a BHHH step is (G'G)^(-1)g, where G is the NxK matrix for N observations and K parameters whose n,k-th element is the derivative of the simulated log-likelihood of observation n with respect to parameter k, evaluated at the parameter estimates obtained in (1); and g is the Kx1 vector of column sums of G. In our application, model estimation generally required about 40 iterations. Calculation of one BHHH step takes somewhat less time than an iteration (since, in an iteration, the simulated log-likelihood function is evaluated, sometimes repeatedly, to determine the stepsize, which is not required for a BHHH step with unit step-size). Therefore, calculating 20 BHHH steps increases the computer time, in our application, by somewhat less than 50% -- a considerable improvement over the 20-fold increase that would be required with separate runs for each set of draws.
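The four-step procedure can be sketched as follows. The score matrices here are synthetic stand-ins for the simulated gradients one would actually compute at the maximum simulated likelihood estimates under each seed; the function names and numbers are our own.

```python
import numpy as np

def bhhh_step(G):
    """One BHHH step (G'G)^(-1) g, where G is the N x K matrix of
    per-observation derivatives of the simulated log-likelihood and
    g is the K-vector of its column sums."""
    g = G.sum(axis=0)
    return np.linalg.solve(G.T @ G, g)

def simulation_adjusted_se(se, steps):
    """Step (4): since simulation error is independent of sampling error,
    the adjusted standard error is the square root of the sum of the
    squared unadjusted standard error and the squared standard deviation
    of the BHHH steps over seeds."""
    sd_step = np.std(steps, axis=0, ddof=1)
    return np.sqrt(se**2 + sd_step**2)

# Steps (2)-(3), with synthetic score matrices for 20 different seeds:
rng = np.random.default_rng(0)
N, K = 500, 3
steps = np.stack([bhhh_step(rng.standard_normal((N, K))) for _ in range(20)])
se = np.array([0.05, 0.10, 0.02])   # hypothetical unadjusted standard errors
adjusted = simulation_adjusted_se(se, steps)
```

Because the simulation variance enters additively under the square root, the adjusted standard errors can never be smaller than the unadjusted ones, and they exceed them only slightly when the simulation standard deviation is small relative to the sampling standard deviation.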

Table 2 gives the results of applying this procedure. The first part of the table gives the parameter estimates obtained with simulated maximum likelihood on a particular set of draws. The second part of the table gives the standard deviation of one BHHH step from the estimates in part one, using twenty different sets of draws. The remaining parts of the table give standard errors for the estimated parameters: not adjusted for simulation variance, adjusted for simulation variance using the BHHH step approach (i.e., using the standard deviations in the second part of the table), and adjusted for simulation variance using the fully converged estimates (i.e., using the standard deviations in Table 1). Two results are evident. First, the standard errors using the BHHH step approach are similar to those using the fully converged estimates, which provides support for the BHHH step approach. Second, the simulation-adjusted standard errors are not much larger than the unadjusted standard errors. This result reflects the fact that, even though the simulation standard deviation may be large compared to the model parameter, it is small relative to the sampling standard deviation and, consequently, does not add much to the total standard error of the estimate.

In summary, the change in point estimates that the researcher observes when estimating the model with different draws might seem large to a researcher who is accustomed to relying on point estimates; however, it is actually small compared to the changes that the researcher would observe if he/she took different samples. The simulation variance is a reminder to the researcher of the importance of considering the standard errors of estimates and not relying too heavily on point estimates. Standard errors that reflect simulation and sampling are practical to calculate.


REFERENCES

Berndt, E., B. Hall, R. Hall, and J. Hausman, 1974, "Estimation and Inference in Non-Linear Structural Models," Annals of Economic and Social Measurement, Vol. 3, pp. 653-665.

Bolduc, D., B. Fortin, and M.-A. Fournier, 1993, "The Impact of Incentive Policies on the Practice Location of Doctors: A Multinomial Probit Analysis," Cahier de recherche numero 9305, Groupe de Recherche en Politique Economique, Departement d'economique, Universite Laval, Quebec, Canada.

Borsch-Supan, A., and V. Hajivassiliou, 1993, "Smooth Unbiased Multivariate Probability Simulators for Maximum Likelihood Estimation of Limited Dependent Variable Models," Journal of Econometrics, Vol. 58, pp. 347-368.

Boyd, J. and R. Mellman, 1980, "The Effect of Fuel Economy Standards on the U.S. Automotive Market: An Hedonic Demand Analysis," Transportation Research, Vol. 14A, No. 5-6, pp. 367-378.

Cardell, N. and F. Dunbar, 1980, "Measuring the Societal Impacts of Automobile Downsizing," Transportation Research, Vol. 14A, No. 5-6, pp. 423-434.

Erdem, T., 1995, "A Dynamic Analysis of Market Structure Based on Panel Data," working paper, Haas School of Business, University of California, Berkeley.

Geweke, J., M. Keane, and D. Runkle, 1994, "Alternative Computational Approaches to Inference in the Multinomial Probit Model," Review of Economics and Statistics, Vol. LXXVI, No. 4, pp. 609-632.

Keane, M., 1994, "A Computationally Practical Simulation Estimator for Panel Data," Econometrica, Vol. 62, No. 1, pp. 95-116.

Mannering, F., and C. Winston, 1985, "A Dynamic Empirical Analysis of Household Vehicle Ownership and Utilization," RAND Journal of Economics, Vol. 16, pp. 215-36.

Mannering, F., and C. Winston, 1991, "Brand Loyalty and the Decline of American Automobile Firms," Brookings Papers on Economic Activity: Microeconomics, pp. 67-114.

McCarthy, P. and R. Tay, 1989, "Consumer Valuation of New Car Attributes: An Econometric Analysis of the Demand for Domestic and Japanese/Western European Imports," Transportation Research, Vol 23A, No. 5, pp. 367-376.

McFadden, D. and P. Ruud, 1994, "Estimation by Simulation," Review of Economics and Statistics, Vol. LXXVI, No. 4, pp. 591-608.

Newey, W., and D. McFadden, 1994, "Large Sample Estimation and Hypothesis Testing," in R. Engle and D. McFadden (eds.), Handbook of Econometrics, Vol. IV, North-Holland: The Netherlands.

Revelt, D. and K. Train, 1995, "Incentives for Appliance Efficiency in a Competitive Energy Industry," working paper, Department of Economics, University of California, Berkeley.
