Control-Variate Estimation Using Estimated Control Means

Bruce W. Schmeiser
School of Industrial Engineering, Purdue University, West Lafayette, IN 47907-1287

Michael R. Taaffe
Department of Operations and Management Science, Graduate Faculty of Industrial Engineering, Graduate Faculty of Scientific Computation, University of Minnesota, Minneapolis, MN 55455

Jin Wang
Department of Mathematics and Computer Science, Valdosta State University, Valdosta, GA 31698

To Appear: IIE Transactions, Fall 2000


Abstract

We study control-variate estimation where the control mean itself is estimated. Control-variate estimation in simulation experiments can significantly increase sampling efficiency, but it has traditionally been restricted to cases where the control has a known mean. In a previous paper (Schmeiser, Taaffe, and Wang, 2000) we generalized control-variate estimation to the case where the control mean is only approximated; the result is a biased, but possibly useful, estimator called a biased control variate (BCV). For that case we provided a mean-square-error-optimal estimator and discussed its properties. In this paper we generalize classical control-variate estimation in a second direction, to control variates using estimated control means (CVEMs). CVEMs replace the control mean with an estimate obtained from a prior simulation experiment. Although the resulting control-variate estimator is unbiased, it introduces additional sampling error, and so its properties are not the same as those of the standard control-variate estimator. We develop a CVEM estimator that minimizes the overall estimator variance. Both BCVs and CVEMs can be used to improve the efficiency of stochastic simulation experiments. Their main appeal is that the restriction of having to know (deterministically) the exact value of the control mean is eliminated; thus the space of possible controls is greatly increased.

KEY WORDS: simulation, Monte Carlo, variance reduction.

1  Introduction

In an earlier paper we introduced the idea of generalizing classical control-variate estimation by eliminating the requirement that the control mean be known deterministically (Schmeiser, Taaffe, and Wang 2000). In that paper we generalized control variates to the case where the control mean can be approximated; we called the resulting estimators biased control variates (BCVs). In this paper we again generalize classical control variates, now to the case where the control mean is estimated.

To be consistent with the notation of the BCV paper, we paraphrase the following two paragraphs from that paper. In a stochastic simulation experiment we compute a point estimator θ̂ of a performance measure θ for a model of interest. Following the notation of Schmeiser, Taaffe, and Wang (2000), we refer to the model of interest as the principal model. When the principal model can be simplified to obtain an approximation model with performance measure θ_a, and when θ_a is known, we can often greatly increase the efficiency of the simulation experiment by constructing a control-variate estimator. The point estimator θ̂ and the approximation measure θ_a can be combined via the classical control-variate estimator

    θ̂_O^c(β) ≡ θ̂ − β(θ̂_a − θ_a),    (1)

where θ̂ is the direct-simulation estimator with unknown mean θ, θ̂_a is the control-simulation estimator with known mean θ_a, and β is the rate of change in θ̂_O^c(β) per unit change in θ̂_a. Hammersley and Handscomb (1964), Law and Kelton (1991), Lavenberg and Welch (1981), Nelson (1989, 1990), Nelson and Schmeiser (1986), Swain and Schmeiser (1989), and Wilson (1984), for example, discuss classical control variates, including estimation of the minimal-variance control weight β* and extensions to higher dimensions.

The two key requirements for choosing the approximation model are that the mean θ_a is known and that θ̂ and θ̂_a are highly correlated. We relax the former to improve the latter. Consider the case where θ_a is unknown. Perhaps we can closely approximate θ_a analytically or numerically, or perhaps we have an estimate of θ_a from a prior simulation experiment. Classical control-variate estimation cannot make use of such approximate or estimated values of θ_a, and thus cannot deliver the efficiencies of control-variate estimation. In Schmeiser et al. (2000) we generalized the ideas of control variates to the case where the control mean θ_a can be approximated; we called that generalization biased control variates (BCVs) and explored the statistical properties of such estimators.
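To make the classical estimator (1) concrete, here is a minimal Monte Carlo sketch. The principal/approximation pair (θ = E[exp(U)] with control mean θ_a = E[U] = 1/2, U uniform) is our own toy example, not one from the paper.

```python
import math
import random

# Classical control-variate sketch (illustrative toy models, not the paper's).
# Principal model: theta = E[exp(U)] = e - 1, U ~ Uniform(0, 1).
# Approximation model (control): theta_a = E[U] = 1/2, known exactly.
random.seed(12345)
k = 10_000
theta_i = []    # replications from the principal model
theta_ia = []   # matching replications of the control
for _ in range(k):
    u = random.random()          # a common random number drives both models
    theta_i.append(math.exp(u))
    theta_ia.append(u)

m_y = sum(theta_i) / k           # direct-simulation estimator theta-hat
m_x = sum(theta_ia) / k          # control-simulation estimator theta-hat_a
cov = sum((y - m_y) * (x - m_x) for y, x in zip(theta_i, theta_ia)) / (k - 1)
var_x = sum((x - m_x) ** 2 for x in theta_ia) / (k - 1)

beta = cov / var_x               # estimated minimal-variance weight beta*
theta_cv = m_y - beta * (m_x - 0.5)   # classical control-variate estimator (1)
```

Because corr(exp(U), U) is close to one, `theta_cv` has a much smaller standard error than the direct estimator `m_y`.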

In a similar spirit, we now generalize classical control variates to the case where we have available an estimate, θ̂_a′, of the control mean θ_a; the estimate may be available from a prior simulation experiment. We call this generalization of classical control variates control variates using estimated means (CVEMs). Unlike BCVs, CVEMs are unbiased; however, CVEMs do introduce an additional source of sampling variability. Both BCVs and CVEMs eliminate the restriction of deterministically knowing the control mean (as is the case with classical control variates), and both can greatly increase the efficiency of simulation experiments. In Section 2 we investigate statistical properties of CVEMs. Section 3 contains implementation considerations and conclusions.

2  Control Variates using Estimated Control Means θ̂_a

Our purpose is to estimate the performance measure θ of the principal model. Our approach, as in Schmeiser et al. (2000), is to simulate the principal model and an approximation model to obtain a correlated pair of point estimators (θ̂, θ̂_a). But in the present context we consider a new source for θ_a. Consider an independent auxiliary simulation experiment that provides θ̂_a′, an estimate of the approximation-model mean θ_a. Rather than being a deterministic approximation as in Schmeiser et al. (2000), θ̂_a′ is now a random variable with mean θ_a and variance var[θ_i^a]/k′, where k′ is the number of replications in the auxiliary experiment. The analysis error ε = θ̂_a′ − θ_a now has mean zero and variance var[θ_i^a]/k′, whereas in Schmeiser et al. (2000) it had a nonzero mean and zero variance. Because the control-variate experiment will use a particular realization of θ̂_a′, and therefore a particular ε, the results in this paper are in terms of expected performance over instances from the earlier paper (Schmeiser et al. (2000)).

Unlike the previous paper, however, the practitioner now has, in the sample standard error, an objective measure of the quality of θ̂_a′. With a control-variate structure that allows an estimated approximation-model mean, any model can be used as the approximation model, not just those whose means are known or can be well approximated. The approximation model can then be chosen based solely on its potential correlation with the principal model and on computational costs.

We consider two computation costs, both of which are for generating θ_i^a, the point estimator from one replication of the approximation model, relative to the computing cost of the direct simulation. As in Schmeiser et al. (2000), we denote the relative cost during the control simulation by c. The relative cost during the prior auxiliary experiment we denote by c′. Computing costs do not refer to computing times. If they did, then c and c′ would be equal, because both relate to simulating the approximation model. Rather, the computing costs are meant to reflect urgency, financial cost, or some other utility measure. For example, perhaps the auxiliary experiment can be run before the principal model is available, possibly overnight or in the background. Or perhaps a single approximation model can serve as a control for several principal models. Despite the difficulty of interpretation, we assume that reasonable values for c and c′ can be specified and proceed with an analysis. The results based on these costs should be viewed as rough guidelines, in the spirit of inventory models that contain a parameter for customer good will.

An auxiliary experiment to estimate the approximation-model mean is reasonable to consider when c′ is small compared to c. Inexpensive computation spent on the auxiliary experiment provides information about the principal model, replacing expensive computation on the principal model. For small values of c′ (as would be appropriate if the auxiliary experiment were run on a personal computer over the weekend), the auxiliary sample size k′ should be large, resulting in a small standard error for θ̂_a′ and the CVEM estimator of this paper behaving like the classical (unbiased) control-variate estimator. But in practice the auxiliary run is finite, and we wish to investigate the relationships

among run lengths, costs, and statistical properties.

The control-variate estimator is structurally unchanged:

    θ̂^c(β) = θ̂ − β(θ̂_a − θ̂_a′).

As in the deterministic-approximation case (BCVs), the moments are easily derived. With respect to ε, the expected bias is E[βε] = 0 and the expected squared bias is E[(βε)²] = β² var[θ_i^a]/k′. Because its variance is not a function of ε, its expected variance is simply its variance,

    var[θ̂^c(β)] = var[θ_i]/k + β² var[θ_i^a]/k + β² var[θ_i^a]/k′ − 2β cov[θ_i, θ_i^a]/k,

assuming that the auxiliary and control-variate experiments are independent. The mse of the combined (auxiliary and control-variate) experiment is the average of the mse's conditioned on ε:

    mse[θ̂^c(β), θ] = 2β² var[θ_i^a]/k′ + ( var[θ_i] + β² var[θ_i^a] − 2β cov[θ_i, θ_i^a] )/k,
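As an illustrative check, again with our own toy principal/approximation pair rather than anything from the paper, the following sketch confirms that the CVEM estimator is unbiased for any fixed β, even though the estimated control mean θ̂_a′ is noisy.

```python
import math
import random

# Unbiasedness check for the CVEM estimator (toy models of our own devising).
# Principal: theta = E[exp(U)] = e - 1; approximation: theta_a = E[U] = 1/2.
random.seed(2000)
R, k, kprime, beta = 4000, 40, 20, 1.0   # R macro-replications of the whole experiment
estimates = []
for _ in range(R):
    # auxiliary experiment: k' independent replications estimate theta_a
    aux = [random.random() for _ in range(kprime)]
    theta_a_est = sum(aux) / kprime
    # control-variate experiment: k correlated pairs (theta_i, theta_i^a)
    us = [random.random() for _ in range(k)]
    theta_hat = sum(math.exp(u) for u in us) / k
    theta_hat_a = sum(us) / k
    # CVEM estimator with the estimated control mean in place of theta_a
    estimates.append(theta_hat - beta * (theta_hat_a - theta_a_est))

grand_mean = sum(estimates) / R   # should be near theta = e - 1
```

The macro-replication average hovers around θ = e − 1 regardless of the chosen β, illustrating E[βε] = 0.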

the expected squared bias plus the expected variance. Setting the first derivative of mse[θ̂^c(β), θ] with respect to β to zero yields the mse-optimal control-variate weight

    β*(k′) = ( cov[θ_i, θ_i^a] / var[θ_i^a] ) · ( 1 / (1 + 2k/k′) ),

whose corresponding optimal mse is

    mse[θ̂^c(β*(k′)), θ] = mse[θ̂, θ] · ( 1 − ρ² / (1 + 2k/k′) ),

where ρ is the correlation between θ_i and θ_i^a.

If k′ = 0, then β* = 0 and the direct simulation experiment is optimal. The mse-optimal auxiliary sample size is k′ = ∞, in which case β* and mse[θ̂^c(β*), θ] correspond to the classical values. To evaluate efficiency, we use generalized mse (gmse), the product of mse and cost. The cost of the combined experiment is c′k′ + (1 + c)k, which is not a function of β, so β*(k′) also produces the optimal gmse. For notational simplicity, define the experimental constant
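The optimal weight and the resulting mse factor are easy to transcribe. In the sketch below, `cov_ia` and `var_ia` denote cov[θ_i, θ_i^a] and var[θ_i^a], and `rho2` denotes ρ²; the function names are ours.

```python
def beta_star(cov_ia, var_ia, k, kprime):
    # mse-optimal control weight beta*(k'): the classical ratio cov/var
    # shrunk by the factor 1/(1 + 2k/k'); requires kprime > 0
    return (cov_ia / var_ia) / (1.0 + 2.0 * k / kprime)

def mse_factor(rho2, k, kprime):
    # mse[theta^c(beta*(k')), theta] / mse[theta-hat, theta]
    return 1.0 - rho2 / (1.0 + 2.0 * k / kprime)
```

As k′ → ∞ the factor tends to the classical 1 − ρ², while small k′ pushes it toward 1, i.e., little improvement over direct simulation. (At k′ = 0 the shrinkage is undefined in the code; the text's convention there is β* = 0.)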

    α = [ ( 2ρ² / (1 − ρ²) ) · ( (1 + c − 2c′) / c′ ) ]^{1/2}.

If β*(k′) is used, the gmse-optimal auxiliary sample size is

    k′* = k · max{0, α − 2}.

Not surprisingly, if ρ² is close to zero, or if c′ is not much less than 1 + c, then the optimal auxiliary sample size k′* is zero and the direct experiment is optimal. More specifically, the CVEM using the optimal parameters β*(k′*) and k′* has a smaller gmse than the direct experiment if and only if

    ρ² > c′ / ( (1 + c)/2 ).
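A sketch of the gmse-optimal design quantities (function and argument names are ours):

```python
from math import sqrt

def alpha(rho2, c, cprime):
    # experimental constant alpha; assumes 0 < rho2 < 1 and 1 + c - 2c' > 0
    return sqrt((2.0 * rho2 / (1.0 - rho2)) * ((1.0 + c - 2.0 * cprime) / cprime))

def kprime_star(k, rho2, c, cprime):
    # gmse-optimal auxiliary sample size k'* = k * max{0, alpha - 2}
    return k * max(0.0, alpha(rho2, c, cprime) - 2.0)
```

For example, ρ² = 0.5, c = 0, c′ = 0.1 gives α = 4, so k′* = 2k; at the threshold ρ² = c′/((1 + c)/2) = 0.2, α = 2 and k′* = 0, so the direct experiment is preferred.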

The right-hand term is the ratio of the cost to obtain one auxiliary observation to the average cost to obtain the real-time observations (θ_i, θ_i^a). The mse resulting from using the gmse-optimal experiment parameters β*(k′*) and k′* is then

    mse[θ̂^c(β*(k′*)), θ] = mse[θ̂, θ] · ( 1 − ρ² (1 − 2/α) ).

In practice the mse obtained will be greater, because the optimal experimental parameters will not be known.

Unlike the case of a deterministic θ_a′, a practitioner now knows the experiment's expected location in the graphs of Figures 6.1 and 6.2. In particular, with b² = ε²/var[θ̂_a] denoting the squared-bias-to-variance ratio of the earlier paper,

    E[b²] = var[ε] / var[θ̂_a] = ( var[θ_i^a]/k′ ) / ( var[θ_i^a]/k ) = k/k′.

Further, if ε is normally distributed (which should usually be a good assumption, since ε is a sample mean of k′ observations), then E[ε⁴] = 3 var²[ε], so

    E[b⁴] = E[ε⁴] / var²[θ̂_a] = 3 ( var[θ_i^a]/k′ )² / ( var[θ_i^a]/k )² = 3 (k/k′)²,

and therefore var[b²] = E[b⁴] − (E[b²])² = 2(k/k′)², so the coefficient of variation of b² is always √2. More specifically, normality of ε implies that the distribution of b² is a scaled chi-squared with one degree of freedom. Therefore, values of b² close to zero are quite likely, but the long tail will occasionally yield a control-variate experiment with a much larger-than-average bias.

Occasionally a practitioner might wish to use a possibly non-optimal value of β; for example, β = 1 is sometimes used because it is simple or because it is intuitive (e.g., when θ̂ and θ̂_a have common units). The gmse-optimal auxiliary sample size for an arbitrary value of β is

    k′* = k [ ( c′/(1 + c) ) ( var[θ̂]/(2β² var[θ̂_a]) − (ρ/β)(var[θ̂]/var[θ̂_a])^{1/2} + 1/2 ) ]^{−1/2},

which is obtained by setting the first derivative of gmse with respect to k′ equal to zero.
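A direct transcription of the fixed-β sample-size formula, where `var_hat` and `var_hat_a` denote var[θ̂] and var[θ̂_a] (the function and argument names are ours):

```python
from math import sqrt

def kprime_for_fixed_beta(k, beta, var_hat, var_hat_a, rho, c, cprime):
    # gmse-optimal auxiliary sample size k'* when a possibly non-optimal,
    # fixed beta is used in the CVEM estimator
    inner = (cprime / (1.0 + c)) * (
        var_hat / (2.0 * beta ** 2 * var_hat_a)
        - (rho / beta) * sqrt(var_hat / var_hat_a)
        + 0.5
    )
    return k * inner ** (-0.5)
```

As a consistency check, plugging in the jointly optimal weight reproduces the earlier result: with ρ² = 0.5, c = 0, c′ = 0.1, and var[θ̂] = var[θ̂_a], we have α = 4, k′* = 2k, and β*(k′*) = ρ/2, and the function indeed returns 2k.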

3  Implementation Considerations and Conclusions

For implementation, a practitioner needs to specify values for c, c′, ρ, and k. If the resulting k′* is positive, the auxiliary experiment is run with the approximation model to produce a sample mean θ̂_a′. If convenient, the k control-variate replications are also simulated to obtain {θ_i^a; i = 1, ..., k} and the random-number seeds saved. (In this case, the values of c and c′ would be equal and much less than one.) Then, when the principal model becomes available, the same seeds are used to obtain {θ_i; i = 1, ..., k}. Unless the practitioner has substantial confidence in the specified value of ρ, β* should be estimated using the sample covariance and variance from the k pairs of observations {(θ_i, θ_i^a); i = 1, ..., k}. (The value of k should be chosen large enough to provide a good estimate, as discussed in Nelson (1989).)

Although values of c, c′, ρ, and k are needed to design the experiment, only k is controlled by the practitioner. Reasonable values of c and c′ are obvious from comparing the complexity of the approximation and principal models and considering when and how the approximation runs (auxiliary and control) are to be made. The optimal value of k′ increases with the value of ρ, which is seldom known in advance. The increase is dramatic as ρ² approaches one, so the practitioner might want to investigate the sensitivity by trying more than one value of ρ while determining the auxiliary sample size k′. Having determined k′, the practitioner might well choose another value, because computation cost is often not linear in the number of replications. For example, if the auxiliary run is made overnight, then c′ may be essentially zero until the practitioner returns to work in the morning. Therefore, the computed value of k′ is typically just an order-of-magnitude guideline.
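The implementation steps above can be sketched end to end; the toy models, seed handling, and sample sizes below are our own illustrative choices, not the paper's.

```python
import math
import random

# Step 1: auxiliary experiment (run early and cheaply) estimates theta_a.
random.seed(7)                     # auxiliary stream
kprime = 5_000
theta_a_est = sum(random.random() for _ in range(kprime)) / kprime  # ~ E[U] = 0.5

# Step 2: control run with a saved seed; store the theta_i^a replications.
k = 2_000
control_seed = 42
rng = random.Random(control_seed)
theta_ia = [rng.random() for _ in range(k)]

# Step 3: when the principal model is available, reuse the same seed so the
# pairs (theta_i, theta_i^a) are correlated (here theta_i = exp of the same U).
rng = random.Random(control_seed)
theta_i = [math.exp(rng.random()) for _ in range(k)]

# Step 4: estimate beta*(k') from the k pairs: sample covariance over variance,
# shrunk by 1/(1 + 2k/k') to account for the estimated control mean.
m_y = sum(theta_i) / k
m_x = sum(theta_ia) / k
cov = sum((y - m_y) * (x - m_x) for y, x in zip(theta_i, theta_ia)) / (k - 1)
var_x = sum((x - m_x) ** 2 for x in theta_ia) / (k - 1)
beta = (cov / var_x) / (1.0 + 2.0 * k / kprime)

# Step 5: CVEM point estimate of theta = E[exp(U)] = e - 1.
theta_cvem = m_y - beta * (m_x - theta_a_est)
```

Reusing the control-run seeds for the principal run is what induces the correlation that makes the control effective.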

References

[1] P. W. Glynn and D. L. Iglehart. The optimal linear combination of control variates in the presence of asymptotically negligible bias. Naval Research Logistics, 36, 683–692, 1989.
[2] J. M. Hammersley and D. C. Handscomb. Monte Carlo Methods. Methuen and Co. Ltd., London, 1964.
[3] E. L. Lehmann. Point Estimation. John Wiley and Sons, New York, 1983.
[4] S. S. Lavenberg and P. D. Welch. A perspective on the use of control variables to increase the efficiency of Monte Carlo simulations. Management Science, 27, 322–335, 1981.
[5] A. M. Law and W. D. Kelton. Simulation Modeling and Analysis, Second Edition. McGraw-Hill, New York, 1991.
[6] B. L. Nelson. On control variate estimators. Computers and Operations Research, 14, 219–225, 1987.
[7] B. L. Nelson. Batch size effects on the efficiency of control variates in simulation. European Journal of Operational Research, 43, 184–196, 1989.
[8] B. L. Nelson. Control variate remedies. Operations Research, 38, 974–992, 1990.
[9] B. L. Nelson and B. W. Schmeiser. Decomposition of some well-known variance reduction techniques. Journal of Statistical Computation and Simulation, 4, 27–46, 1986.
[10] B. L. Nelson, B. W. Schmeiser, M. R. Taaffe, and J. Wang. Approximation-assisted point estimation. Operations Research Letters, 20, 109–118, 1997.
[11] B. W. Schmeiser. Simulation experiments. Chapter 7 in Handbooks in Operations Research and Management Science, Volume 2: Stochastic Models, D. P. Heyman and M. J. Sobel (eds.), North-Holland, Amsterdam, 295–330, 1990.
[12] B. W. Schmeiser and M. R. Taaffe. Time-dependent queueing network approximations as simulation external control variates. Operations Research Letters, 16, 1–9, 1994.
[13] B. W. Schmeiser, M. R. Taaffe, and J. Wang. Biased control-variate estimation. IIE Transactions, to appear in the special issue honoring A. Alan B. Pritsker, 2000.
[14] A. P. Sharon and B. L. Nelson. Analytic and external control variates for queueing network simulation. Journal of the Operational Research Society, 39, 595–602, 1988.
[15] J. J. Swain and B. W. Schmeiser. Control variates for Monte Carlo analysis of nonlinear statistical models, I: Overview. Communications in Statistics B: Simulation and Computation, 18(3), 1011–1036, 1989.
[16] M. R. Taaffe. Approximating Nonstationary Queueing Models. Ph.D. Dissertation, The Ohio State University, Department of Industrial and Systems Engineering, 1982.
[17] M. R. Taaffe and S. A. Horn. External control variance reduction for nonstationary simulation. Proceedings of the Winter Simulation Conference, 341–343, 1983.
[18] J. Wang. Contributions to Monte Carlo Analysis: Variance Reduction, Random Search, and Bayesian Analysis. Ph.D. Dissertation, Purdue University, School of Industrial Engineering, 1994.
[19] J. Wilson. Variance reduction techniques for digital simulation. American Journal of Mathematical and Management Sciences, 4, 277–312, 1984.
