Response Surface Methodology with Stochastic Constraints for Expensive Simulation

Ebru Angün^a,∗ , Gül Gürkan^b , Dick den Hertog^b , Jack Kleijnen^c

^a Galatasaray University, Dept. of Industrial Engineering, Ortaköy 34357 İstanbul, Turkey ([email protected])

^b Tilburg University, Dept. of Econometrics and Operations Research, P.O. Box 90153, 5000 LE Tilburg, Netherlands ([email protected], [email protected])

^c Tilburg University, Dept. of Information Systems and Management, P.O. Box 90153, 5000 LE Tilburg, Netherlands ([email protected])

July 31, 2006

Abstract

This paper investigates simulation-based optimization problems with a stochastic objective function, stochastic output constraints, and deterministic input constraints. More specifically, it generalizes classic Response Surface Methodology (RSM) to account for these stochastic and deterministic constraints. This extension is obtained by generalizing the estimated steepest descent direction of classic RSM, using ideas from interior point methods, especially affine scaling. The new search direction is scale independent, which is important for practitioners since it avoids the numerical complications commonly encountered with scale-dependent directions. Furthermore, this paper derives a heuristic that uses this search direction iteratively. The heuristic is intended for problems in which each simulation run is expensive and the computer budget is limited, so that the search needs to reach a neighborhood of the true optimum quickly. Moreover, although the heuristic is meant for stochastic problems, it can be easily simplified for deterministic ones. Numerical illustrations give encouraging results.

(Simulation, interior point methods, stochastic optimization, bootstrap)

∗ Corresponding author


1 Introduction

Optimization in simulation has been attempted by many methods; see Fu (2002), and Tekin and Sabuncuoglu (2004). These methods can be classified as either ‘white box’ or ‘black box’ methods. White box examples are perturbation analysis methods (Ho and Cao, 1991, and Glasserman, 1991) and score function methods (Rubinstein and Shapiro, 1993), which use some mathematical functions inside the simulation model to estimate gradients. Metaheuristics (such as simulated annealing, genetic algorithms, tabu search, and scatter search) can also be used to optimize a simulated system, treating the simulation model as a black box; i.e., these methods observe only the inputs and outputs of the simulation model. Two other black box methods are the simultaneous perturbation stochastic approximation of Spall (2003), and Fu and Hill (1997), and the cross-entropy method of Rubinstein and Kroese (2004).

In this paper, we focus on the black box method known as Response Surface Methodology (RSM). From the practitioners’ point of view, RSM is broadly applicable, since it can be easily integrated with both stochastic and deterministic simulation models. Recent case studies of RSM optimization of stochastic systems are presented by Yang and Tseng (2002), and Irizarry, Wilson, and Trevino (2001); a case study of RSM for deterministic systems is presented by Ben-Gal and Bukchin (2002). Case studies of RSM applied to real, non-simulated systems can be found in the standard RSM textbooks by Myers and Montgomery (2002), and Khuri and Cornell (1996).

Originally, RSM was derived for problems with a single stochastic objective function. Myers and Montgomery (2002) gave the following general description of the first stage of this classic RSM. An experimental design is used to fit locally a first-order polynomial to the observed values of the stochastic objective through the least squares method. Then, a steepest descent search direction is estimated from the resulting metamodel (response surface), and a number of steps are taken along this estimated steepest descent direction, until no additional decrease in the objective is evident. This procedure is repeated until a first-order polynomial becomes an inadequate model, which is indicated by a gradient not significantly different from zero. Then, in the second stage of classic RSM, a second-order polynomial is fitted, and canonical analysis finds an optimum, a saddle point, or a ridge.

In practice, however, optimization problems may have stochastic constraints. For example, an inventory simulation may minimize the total holding and ordering cost under a service level constraint. In RSM, there have been several approaches to solve constrained optimization problems. Khuri (1996) surveyed most of these approaches, including the desirability function (Harrington, 1965, and Derringer and Suich, 1980), the generalized distance (Khuri and Conlon, 1981), and the dual response (Myers and Carter, 1973, Vining and Myers, 1990, Del Castillo and Montgomery, 1993, and Fan and Del Castillo, 1999). Further, Wei, Olson, and White (1990) suggested another approach, called prediction-interval constrained goal programming, which was not mentioned in Khuri (1996). In all these approaches, the constrained optimization problem is reformulated by combining the constraints and the original objective function into a new, single objective function, using appropriate transformations. The type of transformation differs with the particular method. Next, the resulting unconstrained optimization problem is solved through an ordinary nonlinear programming algorithm. These approaches suffer from a major drawback: the transformations require arbitrary choices. To overcome this drawback, we propose an alternative approach. Rather than transforming the problem into an unconstrained optimization problem, we focus on the original (possibly stochastically) constrained optimization problem.

We now summarize our approach for extending classic RSM such that it can handle stochastic constraints. This is achieved through the generalization of the estimated steepest descent search direction using ideas from interior point methods, more specifically the affine scaling algorithm; see Barnes (1986). We illustrate the superiority of our search direction through a deterministically constrained problem, as follows. In Figure 1, which is inspired by Ye (1989, p. 51), the current input vector is $R$ and the optimal point is $P$. Suppose that, although there are constraints, we would use classic RSM’s estimated steepest descent direction. Then this search direction would soon hit the boundary at the input vector $C$. We would then have slow convergence, creeping along the boundary. We propose a search direction that generates an input vector that avoids hitting the boundary. We accomplish this by introducing an ellipsoid constraint centered at the current iterate with a shape determined by the current slack values. By minimizing the linear objective function over this ellipsoid, we obtain the search direction $RP'$. Suppose that the next iterate along $RP'$ is $R'$. In the next iteration a new ellipsoid, centered at $R'$, will be introduced; since one slack value is small at $R'$, the corresponding radius of the ellipsoid will also be small. This avoids creeping along the boundary; the next search direction will be $R'P''$. An explicit formula for the search direction obtained by this optimization is well known in the interior point literature; see Barnes (1986) and also Appendix A.

Insert Figure 1

This paper has two components, namely a novel search direction and a heuristic incorporating that search direction. The former component, which enables us to deal with stochastically constrained outputs and deterministically constrained inputs, is a generalized estimated steepest descent search direction. To achieve this generalization, we use standard tools from interior point methods and nonlinear programming, such as scaling and projection. This scaling is a standard feature of the affine scaling algorithm. Projection of the search direction is also a standard approach when there are linear constraints; a related family of algorithms, called null-space methods, is discussed in Gill, Murray, and Wright (1981). We prove that the proposed search direction has two important features: (i) it is indeed a descent direction, and (ii) it is scale independent. It is well known that the estimated steepest descent direction, used in classic RSM, is scale dependent. In general, practitioners need to deal with variables of widely varying orders of magnitude; scale independence therefore enables them to avoid numerical complications.

The latter component, namely the heuristic that uses the proposed search direction iteratively, is for the first stage of RSM. This heuristic is primarily meant for “expensive” simulation-based optimization problems; that is, for problems in which each simulation run takes (say) several hours or days, and the total computer budget is tight. Such problems arise in practice in various fields; see, for example, Simpson et al. (2004), Helton et al. (1997), Den Hertog and Stehouwer (2002), and Vonk Noordegraaf, Nielen, and Kleijnen (2003). Our heuristic reaches the desired neighborhood of the true optimum in a relatively small number of simulation runs; once it reaches this neighborhood, it stops at a “feasible” point. Moreover, our heuristic can be easily simplified for deterministic problems that are solved iteratively.

Using interior point methods has two advantages. First, interior point methods are known to perform well within the interior of the feasible region; therefore, we expect that our heuristic will also avoid creeping along the boundary. Second, in practice some simulation programs crash or become invalid outside the feasible region. Such problems were reported in Bettonvil and Kleijnen (1996), Booker et al. (1999), and Den Hertog and Stehouwer (2002).

The remainder of this paper is organized as follows. In § 2, we formalize our problem, including statistical methods. In § 3, we give the proposed search direction and outline its properties. In § 4, we describe our heuristic for the first stage of RSM. In § 5, we evaluate our heuristic through a bootstrap example and a classic inventory problem. In § 6, we give conclusions. There are two appendices with technical details and proofs.

2 Problem formulation

We consider the following problem:

$$
\begin{aligned}
\min \quad & E[F_0(x, \omega)] \\
\text{s.t.} \quad & E[F_j(x, \omega)] \ge a_j \quad \text{for } j = 1, \dots, r-1, \\
& l \le x \le u,
\end{aligned}
\tag{1}
$$

where $x \in \mathbb{R}^k$ is the vector of $k$ input variables, $\omega$ is the simulation’s seed vector, $l$ is the lower bound on $x$, $u$ is the upper bound on $x$, $a_j$ is the $j$th component of the deterministic right-hand-side vector, and the $F_i$ ($i = 0, \dots, r-1$) are $r$ random simulation responses (outputs). We do not know these $F_i$ explicitly; therefore we estimate their means through simulation. However, we assume that the $F_i$ are continuous and continuously differentiable on the “feasible” set defined by the inequalities in (1).

Without loss of generality, we assume that there are no deterministic linear constraints in (1). If we had such constraints, they would be treated in the same way as the stochastic constraints, after the stochastic ones are approximated by first-order polynomials. Notice that for deterministic problems, $\omega$ and hence the expectation operator $E$ simply vanish in (1). The formulation in (1) also covers variances and coefficients of variation. Note that probabilities can be formulated as expected values of indicator functions, but then our assumption that the $F_i$ are continuous and continuously differentiable does not hold. In practice, performance measures may also be quantiles; there exist formulations more complicated than (1) that enable us to consider quantiles as responses, but for simplicity we use the formulation in (1).

The first stage of RSM requires local approximations of the $E(F_i)$ through first-order polynomials augmented with noise:

$$
G_i(x, \omega) = \tilde{x}^T \beta_i + \epsilon_i(x, \omega) \quad \text{for } i = 0, \dots, r-1
\tag{2}
$$

where $\tilde{x} = (1, x^T)^T$, $\beta_i \in \mathbb{R}^{k+1}$ denotes the vector of $k+1$ polynomial coefficients (including the intercept), and the $\epsilon_i$ are additive noises with mean vector $\mu_{\epsilon_i}$ and covariance matrix $\Sigma_{\epsilon_i}$. The $\epsilon_i$ in (2) account for both the lack of fit of the approximations (so $\mu_{\epsilon_i} \ne 0$) and the inherent randomness in stochastic simulation.

In classic RSM with its single response ($r = 1$), the usual assumptions are that locally the mean vector $\mu_{\epsilon_i}$ is a zero-vector and the covariance matrix $\Sigma_{\epsilon_i}$ is a scaled identity matrix with the common residual variance on the diagonal; in the rest of this paper, these assumptions on $\mu_{\epsilon_i}$ and $\Sigma_{\epsilon_i}$ are called the ordinary least squares (OLS) assumptions. When the OLS assumptions are satisfied, the Gauss-Markov theorem guarantees the OLS estimator of $\beta_i$ to be the best linear unbiased estimator (BLUE), where “best” means minimum variance. In our setting, the $r$ unknown expected responses in (1) are estimated through the same simulation run. Hence, the estimators of the $r$ expected responses may be correlated; besides, the variances may differ per response. In such cases, generalized least squares (GLS) gives the BLUE of $\beta_i$. However, under some conditions GLS reduces to OLS, as we now explain. Let $N$ denote the total number of local runs. We use an independent seed vector $\omega_m$ for run $m$. Furthermore, we assume that each response satisfies the OLS assumptions locally (i.e., within the local area where we have $N$ simulation runs). Then, Rao (1967) and Ruud (2000, p. 703) prove that if the same design is used for all $r$ responses, the GLS estimator reduces to the OLS estimator:

$$
\hat{\beta}_i(x, \omega) = \left( X^T X \right)^{-1} X^T \hat{F}_i(x, \omega) \quad \text{for } i = 0, \dots, r-1
\tag{3}
$$

where $X \in \mathbb{R}^{N \times (k+1)}$ denotes the design matrix augmented by a column of ones, and $\hat{F}_i = (\hat{F}_{i,1}, \dots, \hat{F}_{i,m}, \dots, \hat{F}_{i,N})^T \in \mathbb{R}^N$ denotes the estimator of the expected response $i$, with the independent components $\hat{F}_{i,m}$ ($m = 1, \dots, N$) obtained through the $m$th simulation run.

Let $\Sigma$ be the $r \times r$ covariance matrix with the variance $\sigma_{i,i}$ of response $i$ on the diagonal, and the covariance $\sigma_{i,h}$ between the responses $i$ and $h$ ($h = 0, \dots, r-1$) on the off-diagonal. In general, we do not know this $\Sigma$. Often, simulation analysts estimate $\Sigma$ through replications. However, when simulation is expensive, we prefer an alternative estimator, namely the mean squared residual (MSR), which does not require expensive replications. If $\hat{G}_i \in \mathbb{R}^N$ denotes the estimator for the expectation of the linear approximation in (2), with $\beta_i$ estimated by the OLS estimator $\hat{\beta}_i$ defined in (3), then

$$
\hat{\sigma}_{i,h}(x, \omega) = \frac{ \left( \hat{F}_i(x, \omega) - \hat{G}_i(x, \omega) \right)^T \left( \hat{F}_h(x, \omega) - \hat{G}_h(x, \omega) \right) }{ N - (k+1) }
\tag{4}
$$

estimates the (co)variances between responses $i$ and $h$ from the residuals (Khuri 1996, p. 385). We assume constant covariances within the local area (where we have $N$ simulation runs); i.e., $\sigma_{i,h}(x) = \sigma_{i,h}$. Since we assume that each response $i$ satisfies the OLS assumptions locally (i.e., $\mu_{\epsilon_i} = 0$ and $\Sigma_{\epsilon_i} = \sigma_{i,i} I_N$, where $I_N$ denotes the $N \times N$ identity matrix), the MSR estimators in (4) are unbiased for $\sigma_{i,h}$. As we move to a different local area, the (co)variances may change; therefore, we re-estimate them through (4). Furthermore, since in the rest of the paper we only need point estimates $\hat{\sigma}_{i,h}$, we do not need to assume multivariate normality of the responses; that is, the $\hat{\sigma}_{i,h}$ in (4) need not be chi-square distributed.

After the $\beta_i$ are estimated, our problem in (1) is locally approximated by

$$
\begin{aligned}
\min \quad & b_0^T x \\
\text{s.t.} \quad & b_j^T x \ge c_j \quad \text{for } j = 1, \dots, r-1, \\
& l \le x \le u,
\end{aligned}
\tag{5}
$$

where $b_i = (b_{i,1}, \dots, b_{i,k})^T$ denotes a realization of the OLS estimator $\hat{\beta}_i$ excluding the intercept $\hat{\beta}_{i,0}$, and $c_j = a_j - b_{j,0}$. In (5), we leave the constant term $b_{0,0}$ out of the objective function, since we will use only the gradient of the objective in the following section.

An important characteristic of simulation is the experimenter’s control over the seed $\omega$ that drives the simulation model. The use of common seeds as a variance reduction technique is standard in simulation; see Law and Kelton (2000). In a classic RSM context, Donohue, Houck, and Myers (1993, 1995), and Hussey, Myers, and Houck (1987) considered the following three seed assignment strategies as part of their experimental designs: (i) the assignment of an independent seed to each input vector, (ii) the assignment of a common seed to input vectors, and (iii) Schruben and Margolin (1978)’s “assignment rule”, which simultaneously uses common and antithetic seeds in an orthogonally blocked experimental design. Notice that for a fixed seed $\omega$, the unknown responses $F_i$ are deterministic functions of $x$. In the heuristic in § 4, we will use the common seed approach on our path towards the optimum, to reduce the random variation in the estimators of the expected responses. Finally, we emphasize that the use of common seeds usually causes synchronization problems for discrete-event dynamic systems simulations (see Law and Kelton (2000)), and that the use of common seeds is not essential for our heuristic.
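To make (3) and (4) concrete, here is a minimal Python sketch (ours; the paper’s experiments were coded in MATLAB, and all names and the fictitious outputs below are our assumptions) that fits the local first-order polynomials by OLS for all responses at once and estimates the (co)variances from the residuals:

```python
import numpy as np

def fit_local_metamodels(X_design, F):
    """OLS estimates, equation (3), and MSR (co)variance estimates, equation (4).

    X_design : (N, k) matrix of input combinations (e.g., a resolution-3 design).
    F        : (N, r) simulated outputs, one column per response F_i.
    Returns the (k+1, r) coefficient matrix (intercepts in row 0) and the
    (r, r) matrix of estimated (co)variances.
    """
    N, k = X_design.shape
    X = np.column_stack([np.ones(N), X_design])        # augment with a column of ones
    beta_hat, *_ = np.linalg.lstsq(X, F, rcond=None)   # (X^T X)^-1 X^T F, eq. (3)
    residuals = F - X @ beta_hat                       # F_i - G_i for every response
    sigma_hat = residuals.T @ residuals / (N - (k + 1))  # MSR, eq. (4)
    return beta_hat, sigma_hat

# Example: a 2^2 full factorial (a resolution-3 design for k = 2) in coded units,
# with fictitious outputs for r = 2 responses.
X_design = np.array([[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]])
F = np.array([[24.1, 3.9], [25.0, 4.2], [23.2, 3.6], [24.4, 4.0]])
beta_hat, sigma_hat = fit_local_metamodels(X_design, F)
```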

3 The new search direction and its properties

In this section we propose an estimated search direction for the problem in (1), and derive that it is a descent direction and scale independent. We simplify (5) by introducing $B = (b_1, \dots, b_{r-1})^T$, and we add slacks $s$, $r$, and $v$:

$$
\begin{aligned}
\min \quad & b_0^T x \\
\text{s.t.} \quad & Bx - s = c, \\
& x + r = u, \\
& x - v = l, \\
& s \in \mathbb{R}_+^{r-1}, \quad r, v \in \mathbb{R}_+^k.
\end{aligned}
\tag{6}
$$

Our purpose is not to solve the linear programming problem in (6); we use the local approximation in (6) only to derive a search direction. Our novel search direction $d$ is derived in Appendix A using standard tools from interior point methods:

$$
d = - \left( B^T \bar{S}^{-2} B + \bar{R}^{-2} + \bar{V}^{-2} \right)^{-1} b_0
\tag{7}
$$

where $\bar{S}$, $\bar{R}$, and $\bar{V}$ are diagonal matrices with the components of the current estimated slack vectors $\bar{s}, \bar{r}, \bar{v} > 0$ on the diagonals. Notice that in (7), the inverse of the matrix within the parentheses scales and projects the estimated steepest descent $-b_0$. Furthermore, the inverse in (7) exists since $B^T \bar{S}^{-2} B$ is positive semi-definite, and $\bar{R}^{-2}$ and $\bar{V}^{-2}$ are positive definite, which results from the fact that each iterate is strictly feasible. First, we present some intuitive ideas about the derivation of (7), and then we consider some special cases of (6), which affect (7).

To give some geometrical insight into the derivation of (7) in Appendix A, we start with classic RSM (with a single response) and its estimated steepest descent. Finding the normalized estimated steepest descent $d$ can then be formulated as

$$
\min \; b_0^T d \quad \text{s.t.} \quad \| d \|_2 \le 1.
\tag{8}
$$

The problem in (8) corresponds to the minimization of a linear function over a ball constraint. However, the estimated steepest descent may be very inefficient in a constrained problem, as we illustrated through Figure 1. Hence, we replace the nonnegativity constraints on the estimated slack vectors by an ellipsoid constraint that lies in the interior of the positive orthant, and that is centered at the current estimated slacks with a shape determined by the current values of the estimated slacks; see Barnes (1986) and also Appendix A. Then, (7) is obtained through the minimization of a linear function over this ellipsoid constraint. In Angün (2004), we consider an alternative derivation of (7), which explicitly applies scaling and projection operators to the estimated steepest descent.

We now discuss two special cases of (6).

Case 1: No box constraints on the input variables, i.e., no constraints $x + r = u$ and $x - v = l$ in (6). Then (7) reduces to

$$
d = - \left( B^T \bar{S}^{-2} B \right)^{-1} b_0.
\tag{9}
$$

This can be seen as follows. No box constraints on $x$ implies $-\infty < x < +\infty$. Then, using $1/\infty = 0$ reduces $\bar{R}^{-2}$ and $\bar{V}^{-2}$ in (7) to zero-matrices. (9), however, may not be very useful: even if $B$ is a square matrix (i.e., the number of constraints equals the number of input variables), $B^T \bar{S}^{-2} B$ can be ill-conditioned, so that inverting this matrix results in an erroneous solution; if $B$ is not a square matrix (for example, the number of constraints is less than the number of input variables), $B^T \bar{S}^{-2} B$ is singular, hence not invertible.

Case 2: No output constraints (a single output to be minimized), i.e., no $Bx - s = c$ in (6). Then (7) simplifies to

$$
d = - \left( \bar{R}^{-2} + \bar{V}^{-2} \right)^{-1} b_0.
\tag{10}
$$

Note that the inverse in (10) exists since $\bar{R}^{-2}$ and $\bar{V}^{-2}$ are positive definite. The direction defined by (10) may be useful since, although it is not explicitly stated, classic RSM assumes box constraints on the decision variables. The scaling matrix $(\bar{R}^{-2} + \bar{V}^{-2})^{-1}$ in (10) is a diagonal matrix with expressions of the estimated current slacks of the box constraints on the diagonal. Hence, the scaling matrix in (10) adapts the estimated steepest descent ($-b_0$) to the shape of the feasible region through the estimated slack values. We expect that for problems with only box constraints, (10) performs better than the estimated steepest descent ($-b_0$) whenever the individual diagonal entries in $(\bar{R}^{-2} + \bar{V}^{-2})^{-1}$ differ; we leave this issue for future research.

We now consider two properties of the search direction proposed in (7). First, since $B^T \bar{S}^{-2} B$ is positive semi-definite, and $\bar{R}^{-2}$ and $\bar{V}^{-2}$ are positive definite, the estimated search direction in (7) is indeed a descent direction; that is,

$$
-b_0^T \left( B^T \bar{S}^{-2} B + \bar{R}^{-2} + \bar{V}^{-2} \right)^{-1} b_0 < 0.
$$

Second, to avoid numerical complications, it is common practice to make variables have similar magnitudes. This can be done by a general linear transformation

$$
Dz + e = x
\tag{11}
$$

where $D \in \mathbb{R}^{k \times k}$ is non-singular and $e \in \mathbb{R}^k$. This transformation is one-to-one; i.e., $z = D^{-1}(x - e)$. Such a transformation is standard in the classic RSM literature: transform the original variables into coded ones such that all variables are between $-1$ and $+1$. In Appendix B, we prove that our search direction (7) is invariant under the transformation (11). The estimated steepest descent in classic RSM, by contrast, is scale dependent; see Myers and Montgomery (2002, p. 218-220).
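To illustrate (7) concretely, here is a minimal Python sketch (our own, with assumed variable names; the paper’s implementation was in MATLAB) that computes the search direction from the estimated gradients and the current slacks, followed by a numerical check of the scale-independence property of Appendix B for a diagonal scaling matrix $D$:

```python
import numpy as np

def search_direction(b0, B, s_bar, r_bar, v_bar):
    """Search direction d of equation (7).

    b0    : (k,)    estimated gradient of the objective (OLS slopes).
    B     : (r-1,k) estimated gradients of the output constraints.
    s_bar : (r-1,)  current slacks of the output constraints, all > 0.
    r_bar : (k,)    current slacks of the upper bounds u - x, all > 0.
    v_bar : (k,)    current slacks of the lower bounds x - l, all > 0.
    """
    # B^T S^-2 B is positive semi-definite; the diagonal terms R^-2 + V^-2
    # are positive definite, so the sum M is invertible (see Section 3).
    M = (B.T @ np.diag(s_bar**-2.0) @ B
         + np.diag(r_bar**-2.0) + np.diag(v_bar**-2.0))
    return np.linalg.solve(M, -b0)   # solve M d = -b0 rather than inverting M

# Numerical check of scale independence (Appendix B) for a diagonal D:
# in z-space (x = D z + e) the direction equals D^-1 times the x-space one.
rng = np.random.default_rng(0)
b0, B = rng.normal(size=2), rng.normal(size=(2, 2))
s, r, v = (rng.uniform(0.5, 2.0, 2) for _ in range(3))
D = np.diag([10.0, 0.1])
d_x = search_direction(b0, B, s, r, v)
d_z = search_direction(D.T @ b0, B @ D,
                       s, np.linalg.solve(D, r), np.linalg.solve(D, v))
assert np.allclose(d_z, np.linalg.solve(D, d_x))
```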

4 An iterative heuristic for the first stage of RSM

Before presenting the details of our heuristic, we briefly explain the general procedure. The expected responses in (1) are locally approximated by first-order polynomials. Then, the search direction (7) and a maximum step size are computed from the resulting model. In the initialization, the heuristic fixes a set of independent seed vectors to be used when estimating function values locally. This same set is used when estimating function values in different local areas. The heuristic changes this fixed set of independent seed vectors only once, namely when no progress is observed in a newly estimated search direction. In this way, the heuristic tries to avoid premature stopping when stopping is caused by the inherent randomness of the problem.

At each iteration, there are two statistical tests, one for “feasibility” and one for “improvement” in objective, to determine whether the candidate iterate will become the next iterate. These tests consider relative improvements (rather than absolute improvements) of the candidate iterate with respect to the current iterate. This approach is in line with interior point methods, where at each iteration one takes a step such that the candidate iterate’s slacks equal some percentage of the current slacks. Moreover, the use of ratios instead of absolute differences avoids scale dependency. Furthermore, our heuristic does not assume multivariate normality for the $F_i$. Hence, we do not use classical statistical tests. Instead, we use parametric bootstrapping, which applies bootstrap sampling from a given “input” distribution in order to estimate the “output” distribution of a (complicated) statistic (e.g., a ratio); see Efron and Tibshirani (1993). In our case, we assume a Gaussian input distribution, because this assumption is classic in RSM. However, the bootstrap may use any input distribution; that is, if practitioners have reason to believe that a particular distribution suits their problem better, then sampling should be done from that distribution.

We now describe more specifically why and how we resort to parametric bootstrapping as part of the heuristic. Suppose our current iterate is $x_c$, and it is a “feasible” point. First, we check whether the candidate point $x_c + \lambda d$ is “feasible” through a statistical test for “feasibility”. If $x_c + \lambda d$ is “feasible”, then we check whether there is a significant decrease in the objective at $x_c + \lambda d$ compared with the objective at $x_c$. This comparison is done through a statistical test for “improvement” in objective. For “feasibility”, we consider the ratios $S_j(x_c + \lambda d, \omega)/S_j(x_c, \omega)$ of the slacks; this is in line with classical interior point methods. For symmetry and ease of explanation, we also look at the ratios of the objectives, that is, relative improvements in the objectives formulated as $(F_0(x_c, \omega) - F_0(x_c + \lambda d, \omega)) / |F_0(x_c, \omega)|$, instead of absolute improvements. Since we assume that simulation is very time-consuming (so we cannot make a large number of simulation runs), we consider $\hat{f}_0(x_c + \lambda d)$ and $\hat{f}_0(x_c)$, the simulation observations obtained for $F_0(x_c + \lambda d, \omega)$ and $F_0(x_c, \omega)$, as the point estimates for the means of $F_0(x_c + \lambda d, \omega)$ and $F_0(x_c, \omega)$, respectively. Notice that we cannot use any of the classical statistical tests, since we do not assume multivariate normality for the $F_i$. We proceed as follows: we sample a large number, say $K = 1000$, of the random objectives $\hat{F}_0(x_c + \lambda d, \omega)$ and $\hat{F}_0(x_c, \omega)$ from their distributions, say normal, with means $\hat{f}_0(x_c + \lambda d)$ and $\hat{f}_0(x_c)$, respectively, and the variance $\hat{\sigma}_{0,0}$, which was estimated when the search direction was last revised. A similar procedure is used for generating a bootstrap sample of the slacks. Further details of the bootstrap tests for “feasibility” and “improvement” in objective are explained at the end of this section.

In Figures 2 and 3, we summarize the outer and the inner loops of the heuristic, where the latter loop is used to determine an approximate line minimum in an estimated search direction within a prespecified number of simulation runs. Our heuristic neglects the statistical dependencies among the $F_i$. We now detail each step of the heuristic.

Insert Figures 2 and 3

Step 0, Initialize: Input a user-specified fixed number of simulation runs per inner loop, and a maximum total number of simulation runs for the outer loop. This maximum is determined by the time and budget constraints of the expensive simulation study. For the outer loop, initialize the number of simulation runs already executed to zero. Input a fixed, user-specified size of the local experimental area, where the responses $F_i$ are locally approximated by first-order polynomials. This local experimental area should lie within the global area determined by the box constraints in (1). The size of this local experimental area is clearly scale dependent, and there are no general guidelines to determine an appropriate size that would work in all applications; see the standard RSM textbooks by Myers and Montgomery (2002), and Khuri and Cornell (1996). Therefore, to determine an appropriate size, the users need insight into their application. Notice that the arbitrary choice of the initial size of the region of interest is also a general issue in nonlinear programming: in trust-region methods, the initial size of the local experimental area (called the trust region) is also user-specified; see Conn, Gould, and Toint (2000).

Furthermore, we assume that the users can determine at least one design point inside the “feasible” region. The remaining initial design points may be “infeasible”, but the simulation program should not crash at those points. For an existing system, a “feasible” point is provided by the current design point at which the system is operating. In general, however, there are again no guidelines that ensure that a selected design point is inside the “feasible” region prior to simulating at that point; therefore, the users are again asked to use their prior knowledge. As a further remark, at least one of the “feasible” points has to be in the interior of the “feasible” region, far from the boundary, since (7) will creep along the boundary or fail when this point is close to the boundary or on the boundary, respectively.

For the first stage of RSM, we use a resolution-3 design, since such a design gives unbiased estimators of the $\beta_i$ in (2) with a small number of simulation runs, provided that first-order polynomials are adequate approximations; see Kleijnen (2005). Simulate at the design points $x_c$ to estimate their objectives $\hat{f}_0(x_c)$ and their slacks $\hat{s}_j(x_c) = a_j - \hat{f}_j(x_c)$, where the $a_j$ are the right-hand-side values in (1). Notice that when we use parametric bootstrapping, these $\hat{f}_0(x_c)$ and $\hat{s}_j(x_c)$ are used as the point estimates for the means of the random objective $F_0(x_c, \omega)$ and the random slacks $S_j(x_c, \omega)$. For later use, fix the set $\{\omega_1, \dots, \omega_c, \dots, \omega_N\}$ of independent seed vectors for each of the $N$ design points. The initial iterate, say $x_c$, is the “feasible” design point (among the $N$ design points) that has the minimum objective $\hat{f}_0(x_c)$ estimated through simulation; i.e., $x_c$ is not found by minimizing the local linear model in (6). Fix the seed vector $\omega_c$ corresponding to $x_c$ as the common seed. Increase the number of simulation runs already executed for the outer loop by $N$, the number of runs used for initialization. As we mentioned in § 2 when introducing (1), the inherent randomness $\omega$ vanishes for deterministic problems. Hence, the generation of independent seeds and the use of common seeds are to be skipped when the heuristic is applied to such problems.

Step 1, Fit first-order polynomials, estimate variances, and perform bootstrap sampling: Approximate (1) by local first-order polynomials within the local experimental area using (3), and obtain point estimates $\hat{\sigma}_{i,i}$ of the variances through (4). As mentioned in § 2 below (4), the MSR estimators in (4) are unbiased for $\sigma_{i,h}$. Because the locally constant variance assumption may not hold globally, we use only the most recent estimates $\hat{\sigma}_{i,i}$ in the heuristic. After estimating the variances, we perform parametric bootstrapping: sample a large number $K$ of observations on $F_0(x_c, \omega)$ from the normal distribution with mean $\hat{f}_0(x_c)$ and variance $\hat{\sigma}_{0,0}$, and on $S_j(x_c, \omega)$ from the normal distributions with means $\hat{s}_j(x_c)$ and variances $\hat{\sigma}_{j,j}$. (Our normality choice is only for clarity in describing the heuristic, and in line with the classic RSM literature.) Later, these samples from $F_0(x_c, \omega)$ and $S_j(x_c, \omega)$ are used in the statistical tests for “feasibility” and “improvement” in objective. For deterministic problems, variance estimation and bootstrap sampling are not needed, since the exact values of $F_0(x_c + \lambda d)$, $F_0(x_c)$, $S_j(x_c + \lambda d)$, and $S_j(x_c)$ are obtained through simulation.

Step 2, Estimate a search direction and a maximum step size:

Determine the search direction (7), where the diagonal entries of $\bar{S}^{-2}$ are estimated through a single simulation run at $x_c$; i.e., $\hat{s}_j(x_c)^{-2} = \left( a_j - \hat{f}_j(x_c) \right)^{-2}$. To determine a maximum step size in this direction, we initially assume that the approximation in (5) holds globally. Then we find a maximum step size $\lambda_{\max}$ as a solution of the following problem:

$$
\begin{aligned}
\max \quad & \lambda \\
\text{s.t.} \quad & b_j^T [x_c + \lambda d] \ge c_j \quad \text{for } j = 1, \dots, r-1, \\
& l \le x_c + \lambda d \le u, \\
& \lambda \ge 0.
\end{aligned}
\tag{12}
$$

It is not necessary to solve (12) as a linear program, since the maximum step size $\lambda_{\max}$ can be obtained explicitly through $\lambda_{\max} = \max\{0, \min\{\lambda_1, \lambda_2, \lambda_3\}\}$, where

$$
\begin{aligned}
\lambda_1 &= \min \left\{ (c_h - b_h^T x_c)/b_h^T d : h \in \{1, \dots, r-1\},\; b_h^T d < 0 \right\}, \\
\lambda_2 &= \min \left\{ (u_n - x_n)/d_n : n \in \{1, \dots, k\},\; d_n > 0 \right\}, \\
\lambda_3 &= \min \left\{ (l_n - x_n)/d_n : n \in \{1, \dots, k\},\; d_n < 0 \right\}.
\end{aligned}
$$

To increase the probability of staying within the interior of the “feasible” region, we take only 80% of $\lambda_{\max}$ as our maximum step size $\lambda$. Note that this step size is smaller than what is often used in the interior point literature for deterministic problems (e.g., $0.95\lambda_{\max}$), since we want to reduce the risk of our next iterate being infeasible.
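The explicit step-size computation above is straightforward to implement; the following minimal Python sketch is our illustration (function and variable names are ours, not from the paper):

```python
import numpy as np

def max_step(x_c, d, B, c, l, u, damping=0.8):
    """Explicit solution of (12): the largest feasible step along d,
    damped by 80% to stay in the interior (Step 2 of the heuristic).
    Returns inf * damping if no constraint limits the step."""
    lam = [np.inf]
    Bd, Bx = B @ d, B @ x_c
    # lambda_1: output constraints b_h^T x >= c_h whose slack shrinks along d.
    lam += [(c[h] - Bx[h]) / Bd[h] for h in range(len(c)) if Bd[h] < 0]
    # lambda_2 and lambda_3: upper and lower bounds on the inputs.
    lam += [(u[n] - x_c[n]) / d[n] for n in range(len(d)) if d[n] > 0]
    lam += [(l[n] - x_c[n]) / d[n] for n in range(len(d)) if d[n] < 0]
    return damping * max(0.0, min(lam))
```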

Step 3, Estimate an approximate line minimum: Initialize the number of simulation runs already executed per inner loop to zero. Simulate at $x_c + \lambda d$ to estimate $\hat{f}_0(x_c + \lambda d)$ and $\hat{s}_j(x_c + \lambda d)$, using the common seed vector $\omega_c$. Sample $K$ observations on $F_0(x_c + \lambda d, \omega)$ from the normal distribution with mean $\hat{f}_0(x_c + \lambda d)$ and variance $\hat{\sigma}_{0,0}$, and on $S_j(x_c + \lambda d, \omega)$ from the normal distributions with means $\hat{s}_j(x_c + \lambda d)$ and variances $\hat{\sigma}_{j,j}$.

At this point, the heuristic compares the current iterate $x_c$ with the candidate iterate $x_c + \lambda d$ to determine the “better” point, where “better” means “feasible” and with a lower objective value. (The tests are described in detail at the end of this section.) As mentioned at the beginning of this section, the heuristic tests the ratios of the random slacks, which is in the spirit of interior point methods. The heuristic also tests the ratios of the objectives to determine whether an improvement in the objective value has been made. Notice that classical statistical tests (for example, a paired-t test for the objectives) are not used, since the heuristic does not assume (multivariate) normality. Besides, normality implies Cauchy-type distributions for ratios, and these distributions have no finite means. Parametric bootstrapping estimates the medians (not means) and tests whether the lower bounds on these medians are significant (i.e., exceed prespecified values). The details are explained at the end of this section.

Determine the “better” of $x_c + \lambda d$ and $x_c$, and denote it by $x_c$. Set the objective to $\hat{f}_0(x_c)$. Increase the number of already executed simulation runs for both the outer and the inner loops by one. Now, analogous to binary search, the interval $[x_c, x_c + \lambda d]$ is systematically halved, for a prespecified number of simulation runs in the same estimated search direction, to estimate an approximate line minimum; see Figure 3. Repeat the procedure each time with a new interval $[x_c, x_c + \lambda d]$, until the fixed number of simulation runs per inner loop is reached. Then, set the slacks of the current estimated approximate line minimum $x_c$ to $\hat{s}_j(x_c)$. At the end of this step, it is possible that the heuristic fails to find a point better than the old $x_c$; in that case, the heuristic does not leave that point.

Step 4, Select a resolution-3 design and simulate at the design points: The current design point $x_c$ and the other design points are the $N$ vertices of a $k$-dimensional hypercube, with the side length determined by the fixed size of the local experimental area. Using the fixed set $\{\omega_1, \dots, \omega_c, \dots, \omega_N\}$ of independent seeds (except $\omega_c$, which was used at $x_c$), simulate at the new design points other than $x_c$ (since we already simulated at $x_c$). If, in the line search in Step 3, the heuristic failed to find a better point than the old $x_c$ (so we have already used $x_c$ in a previous resolution-3 design), then generate a new set $\{\omega_1, \dots, \omega_c, \dots, \omega_N\}$ of independent seeds, and select the seed $\omega_c$ of $x_c$ as the new common seed. Increase the number of already executed simulation runs for the outer loop by the number of new runs.

Step 5, Stopping criterion: The heuristic stops when either the number of executed simulation runs for the outer loop exceeds the maximum, or the heuristic uses the same design point more than twice in a resolution-3 design. The latter happens in the following situation. Suppose that our current iterate is $x_c$ and we selected a resolution-3 design for which $x_c$ was one of the vertices. If the heuristic could not find a better point than $x_c$ in the line search (in Step 3), it did not leave the current iterate $x_c$. Next, the heuristic generates a new set of independent seed vectors $\{\omega_1, \dots, \omega_N\}$ (in Step 4 or in Step 0), uses the previous resolution-3 design to obtain new local linear approximations in (2), and derives a new search direction (7). If the heuristic again fails to find a better point than $x_c$ in the line search, then it stops, since $x_c$ has already been used twice in a resolution-3 design. Note that when the current estimated iterate is far from a neighborhood of the minimizer, premature stopping may be caused by the inherent randomness of the problem; therefore, we estimate a new search direction by generating a new set of independent seeds in Step 4. If, however, the current estimated iterate is in a neighborhood of the minimizer, the fixed size of the local experimental area has simply become too big to estimate a good search direction. If the stopping criterion is not satisfied yet, then return to Step 1. For deterministic problems, the only stopping rule is the maximum number of simulation runs for the outer loop.

This finishes our discussion of Figures 2 and 3. Clearly, the stopping rules used in the heuristic are rather arbitrary but also simple to use; developing a more formal stopping rule is a future research topic. Moreover, this heuristic is designed for a small number of simulation runs; in such a case, we cannot claim theoretical convergence to the true optimum solution.

Tests for “feasibility” and “improvement” in objective: The deterministic counterpart of testing for “feasibility” and “improvement” in objective is given in the last paragraph of this section; for deterministic problems, the text up to that paragraph can be skipped.

As mentioned before, to decide whether a point has a “better” estimated objective, we use the random ratio $(F_0(x_c, \omega) - F_0(x_c + \lambda d, \omega)) / |F_0(x_c, \omega)|$, where $F_0(x_c, \omega)$ stands for the minimum objective so far and $F_0(x_c + \lambda d, \omega)$ for the candidate’s objective. To simplify the notation, we do not show the dependence on $\omega$. We use the symbol $Q_l = (F_{0,l}(x_c) - F_{0,l}(x_c + \lambda d)) / |F_{0,l}(x_c)|$, where $l \in \{1, \dots, K\}$ indexes the $l$th bootstrap value. Let the corresponding order statistics be $Q_{(1)} < \dots < Q_{(K)}$. Then, a point estimator for the median $Q^{0.5}$ (see Law and Kelton (2000, p. 517)) is

$$
\hat{Q}^{0.5} =
\begin{cases}
Q_{(0.5K)} & \text{if } 0.5K \text{ is an integer}, \\
Q_{(\lfloor 0.5K + 1 \rfloor)} & \text{otherwise}.
\end{cases}
\tag{13}
$$

Our null hypothesis is $H_0^{(1)}: Q^{0.5} \le \delta$, where $\delta$ is a user-specified positive constant indicating the smallest desired improvement in the estimated objective. In other words, $H_0^{(1)}$ is pessimistic: we reject $H_0^{(1)}$ only if the estimated median $\hat{q}^{0.5}$ is significantly greater than $\delta$. This implies that a point is considered significantly improved only if its estimated objective is significantly smaller than the minimum objective so far. Then, the index $y$ of the lower confidence limit $Q_{(y)}$ is given by

$$
y = \left\lceil Kp - z_{\alpha} \sqrt{Kp(1-p)} \right\rceil
\tag{14}
$$

where $\alpha$ is the significance level, $p = 0.5$, and $z_{\alpha}$ is the $1 - \alpha$ quantile of the standard normal distribution; see Kleijnen (1987, p. 23-25).

We also use the random ratios $S_j(x_c + \lambda d)/S_j(x_c)$ to check “feasibility”. Ideally, we would take a step such that the candidate’s slacks would be $S_j(x_c + \lambda d) = \gamma S_j(x_c)$, where $\gamma$ is a user-specified constant with $\gamma < 1$. Then, the heuristic considers those points $x_c + \lambda d$ with slacks $S_j(x_c + \lambda d) < \gamma S_j(x_c)$ as “infeasible”, although these points may indeed be “feasible” for (1). In our experiments, we use $\gamma = 0.2$, which means that our maximal step size is 80% of the step to the boundary. This is in line with our choice for determining a maximum step size in Step 2. In this way, the heuristic avoids prematurely approaching the boundary.

This is important since, if $x_c$ is on the boundary or close to the boundary at an early iteration, the search direction (7) will fail or creep along the boundary. Of course, as the heuristic moves towards the optimum, the current iterate $x_c$ will eventually be close to the boundary, and so will be the candidate.

We follow a procedure analogous to the one described for the objective, as follows. For constraint $j$, we denote the random ratios by $M_{j,l} = S_{j,l}(x_c + \lambda d)/S_{j,l}(x_c)$ with $l = 1, \dots, K$. Now, our null hypothesis becomes $H_0^{(2)}: \min\{M_1^{0.5}, \dots, M_{r-1}^{0.5}\} \le \gamma$, where $\gamma$ is the same user-specified constant as in the previous paragraph and $M_j^{0.5}$ is the median for the $j$th constraint. Hence, we again have a pessimistic $H_0^{(2)}$, which means that the “feasible” region considered by the heuristic is tighter than the actual “feasible” region. Define $M^{0.5} = \min\{M_1^{0.5}, \dots, M_{r-1}^{0.5}\}$. We then find a point estimator $\hat{M}^{0.5}$ and an index $y$ for the lower confidence limit through (13) and (14).

Notice that $\alpha$ in (14) for $H_0^{(2)}$ is not necessarily the same as $\alpha$ for $H_0^{(1)}$. Moreover, because of Bonferroni’s inequality, we choose the significance level $\alpha_j$ for each constraint $j$ such that $\alpha_1 + \dots + \alpha_{r-1} = \alpha$. Furthermore, the choice of $\alpha$ in $H_0^{(2)}$ is not completely arbitrary: when $\alpha$ is small, rejection of $H_0^{(2)}$ is less likely. In this way, we move towards the optimum through the interior of the “feasible” region, which we empirically found to improve the performance of the heuristic.

In deterministic problems, to decide on the point with the lower objective, we can check whether $(\hat{f}_0(x_c) - \hat{f}_0(x_c + \lambda d)) / (|\hat{f}_0(x_c)| + 1) > \delta$, where $\delta$ stands for the user-specified constant indicating the smallest desired improvement in the objective, and $\hat{f}_0(x_c)$ and $\hat{f}_0(x_c + \lambda d)$ stand for the exact values of $F_0(x_c)$ and $F_0(x_c + \lambda d)$ obtained through simulation. Notice that the ratio $(\hat{f}_0(x_c) - \hat{f}_0(x_c + \lambda d)) / (|\hat{f}_0(x_c)| + 1)$ is defined even when $\hat{f}_0(x_c) = 0$. For feasibility, we can check whether

$$
\min_{j = 1, \dots, r-1} \left\{ \frac{\hat{s}_j(x_c + \lambda d)}{\hat{s}_j(x_c)} \right\} > \gamma
$$

where $\gamma$ is the same user-specified constant as in the stochastic case, and $\hat{s}_j(x_c)$ and $\hat{s}_j(x_c + \lambda d)$ are the exact values of $S_j(x_c)$ and $S_j(x_c + \lambda d)$ obtained through simulation. Analogous to the stochastic case, the feasible point with the lower objective becomes the current iterate $x_c$.
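As a concrete illustration of the tests based on (13) and (14), the following minimal Python sketch (ours; the names and the numerical draws are assumptions, with $\delta$ and $\alpha^{(1)}$ set to the values used in § 5) computes the lower confidence limit for the median of a sample of bootstrap ratios and compares it with a threshold:

```python
import math
from statistics import NormalDist
import numpy as np

def median_test(ratios, threshold, alpha):
    """Bootstrap test of H0: median(ratio) <= threshold, following (13)-(14).

    Rejects H0 (significant improvement, or feasibility) when the lower
    confidence limit Q_(y) exceeds the threshold (delta for the objective
    test, gamma for the slack-ratio test)."""
    q = np.sort(np.asarray(ratios))         # order statistics Q_(1) <= ... <= Q_(K)
    K, p = len(q), 0.5
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)                    # 1 - alpha quantile
    y = math.ceil(K * p - z_alpha * math.sqrt(K * p * (1.0 - p)))  # equation (14)
    y = min(max(y, 1), K)                   # guard against out-of-range indices
    return q[y - 1] > threshold

# Objective test with the values of Section 5: delta = 2.5%, alpha(1) = 20%.
rng = np.random.default_rng(1)
f_current = rng.normal(24.0, 1.0, 1000)    # bootstrap draws at the current iterate
f_candidate = rng.normal(23.0, 1.0, 1000)  # bootstrap draws at the candidate
Q = (f_current - f_candidate) / np.abs(f_current)
improved = median_test(Q, threshold=0.025, alpha=0.20)
```

For the feasibility test, the same function would be applied to the ratio $M^{0.5}$ (the minimum over the constraints), with threshold $\gamma$ and Bonferroni-split significance levels, as described above.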

5 Numerical examples

In the following two subsections, we investigated the performance of the heuristic and the proposed search direction through bootstrap experiments and an (s, S) inventory problem with a fill rate constraint. We emphasize that bootstrap experiments enable us to control the intrinsic noise of the simulation; moreover, they are computationally very efficient compared with expensive simulation studies that may take weeks. Furthermore, bootstrap experiments are also effective in the sense that we (the evaluators, not the heuristic) know the true optimum, the binding constraints, etc.; therefore, they provide controlled laboratory settings. The inventory problem lays the groundwork for estimating the performance of the heuristic and the search direction in more realistic simulation studies. We ran the experiments on a PC with Windows XP 2002, an Intel Celeron CPU of 2.40 GHz, and 512 MB of RAM. The computer program was coded in MATLAB 6.5.

5.1 Bootstrap example

We considered the following example:

$$
\begin{aligned}
\min \quad & E[5(x_1 - 1)^2 + (x_2 - 5)^2 + 4 x_1 x_2 + \epsilon_0] \\
\text{s.t.} \quad & E[(x_1 - 3)^2 + x_2^2 + x_1 x_2 + \epsilon_1] \le 4, \\
& E[x_1^2 + 3 (x_2 + 1.061)^2 + \epsilon_2] \le 9, \\
& 0 \le x_1 \le 3, \quad -2 \le x_2 \le 1,
\end{aligned}
\tag{15}
$$

where $(\epsilon_0, \epsilon_1, \epsilon_2)$ is assumed to be multivariate normally distributed with mean vector $\mathbf{0}$, variances $\sigma_{0,0} = 1$ ($\sigma_0 = 1$), $\sigma_{1,1} = 0.0225$ ($\sigma_1 = 0.15$), $\sigma_{2,2} = 0.16$ ($\sigma_2 = 0.4$), and correlations $\rho_{0,1} = 0.6$, $\rho_{0,2} = 0.3$, $\rho_{1,2} = -0.1$. Notice that our procedure does not assume Gaussian responses; we used the normal distribution only to illustrate our procedure. It is easy to see that the unconstrained minimum occurs at $(-5, 15)$, whereas the constrained minimum occurs at approximately $(x_1^*, x_2^*) = (1.24, 0.52)$ with a mean objective value of approximately 22.96.

In our bootstrap experiment, we arbitrarily chose the following local experimental area for a $2^2$ design, expressed in the original variables: $2.4 \le x_1 \le 2.7$ and $-1.1 \le x_2 \le -0.8$. The increments $\Delta x_1$ and $\Delta x_2$ were thus arbitrarily chosen to be 10% of the whole ranges of $x_1$ ($0 \le x_1 \le 3$, $\Delta x_1 = 0.3$) and $x_2$ ($-2 \le x_2 \le 1$, $\Delta x_2 = 0.3$). Notice that we did not shrink the experimental area while proceeding towards the optimum (although shrinking may improve the performance of the heuristic).
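For concreteness, the following minimal Python sketch (ours, not part of the original experiments, which were coded in MATLAB) shows how one noisy observation of the three responses in (15) can be generated with the stated variances and correlations:

```python
import numpy as np

# Standard deviations and correlations of (eps_0, eps_1, eps_2) in (15).
sd = np.array([1.0, 0.15, 0.4])
corr = np.array([[1.0, 0.6, 0.3],
                 [0.6, 1.0, -0.1],
                 [0.3, -0.1, 1.0]])
cov = np.outer(sd, sd) * corr            # covariance matrix of the noise

def simulate(x, rng):
    """One noisy observation (F0, F1, F2) of problem (15) at x = (x1, x2)."""
    x1, x2 = x
    eps = rng.multivariate_normal(np.zeros(3), cov)
    f0 = 5.0 * (x1 - 1.0) ** 2 + (x2 - 5.0) ** 2 + 4.0 * x1 * x2 + eps[0]
    f1 = (x1 - 3.0) ** 2 + x2 ** 2 + x1 * x2 + eps[1]        # constraint: <= 4
    f2 = x1 ** 2 + 3.0 * (x2 + 1.061) ** 2 + eps[2]          # constraint: <= 9
    return f0, f1, f2

rng = np.random.default_rng(seed=42)     # a fixed seed plays the role of omega
print(simulate((2.55, -0.95), rng))      # center of the initial local area
```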

We selected as the user-supplied maximum number of simulation runs for the outer and the inner loops 20 and 3, respectively. For the bootstrap sampling, we selected the sample size $K = 1000$. The significance level for $H_0^{(1)}$ was selected as $\alpha^{(1)} = 20\%$. As mentioned in § 4, relatively small values of $\alpha^{(2)}$ increase the probability of staying inside the “feasible” region, which enhances the performance of our heuristic; hence, we chose $\alpha^{(2)} = 1\%$. Moreover, we chose $\delta = 2.5\%$ (the smallest desired improvement in the estimated objective).

To test the performance of the heuristic, we made 100 macro-replicates: estimated medians and various quantiles were then found to stabilize. Because of the simplicity of our problem in (15), 100 macro-replicates took less than five minutes. Notice that at the end of each macro-replicate, we had an estimate $(\hat{x}_1^*, \hat{x}_2^*)$ of $(x_1^*, x_2^*)$. We then sorted the macro-replicates with respect to the deviations of the corresponding objectives at $(\hat{x}_1^*, \hat{x}_2^*)$ from the true minimum 22.96. Hence, the estimated solutions $(\hat{x}_1^*, \hat{x}_2^*)$ in Figures 4 through 6 correspond to the 10th, 50th, and 90th “best” (0.10, 0.50, and 0.90 quantiles) of the estimated solutions, respectively. Furthermore, in these figures the two level curves of the objective function correspond to the true optimum value (22.96) plus three and six times the standard deviation $\sigma_0 = 1$, namely $E[F_0(x, \omega)] = 25.96$ and $E[F_0(x, \omega)] = 28.96$. Detailed numerical results for Figure 5 (the macro-replicate giving the median result) are presented in Table 1.

Insert Figures 4 through 6, and Table 1

In Figures 4 and 5, the heuristic indeed reached the desired neighborhood of the true optimum, as measured by the relative gap $(\hat{f}_0(\hat{x}_1^*, \hat{x}_2^*) - 22.96)/22.96$: $(\hat{x}_1^*, \hat{x}_2^*) = (1.16, 0.22)$ with estimated cost 23.99 and relative gap 0.05, and $(\hat{x}_1^*, \hat{x}_2^*) = (1.46, 0.19)$ with estimated cost 25.30 and relative gap 0.10, respectively. In Figure 6, the initial set of independent seeds did not enable us to estimate a good search direction; therefore, most simulation runs were wasted until the 8th iterate was reached. The heuristic stopped at $(\hat{x}_1^*, \hat{x}_2^*) = (1.52, -0.18)$ with the corresponding estimated objective 27.09 and the corresponding relative gap 0.18, since the number of simulation runs exceeded 20 (the maximum for the outer loop). If the budget is flexible, this problem can be overcome by simply increasing the maximum for the outer loop. (For example, in Figure 6, increasing the maximum from 20 to 25 made the estimated solution, the corresponding objective, and the corresponding relative gap become $(\hat{x}_1^*, \hat{x}_2^*) = (1.26, 0.49)$, 23.12, and 0.01, respectively, which is an improvement of 15% in the objective reached after 20 runs.)

5.2 (s, S) inventory with a fill rate constraint

We further illustrated our heuristic (including the search direction) by applying it to the optimization of the (s, S) inventory system investigated by Bashyam and Fu (1998). Unlike most authors on inventory models, Bashyam and Fu assumed a service level constraint instead of a penalty cost for backorders, which is also the approach taken in practice. Hence, Bashyam and Fu’s inventory problem can be formulated as (1). Our goal is to find an estimate $(\hat{s}^*, \hat{S}^*)$ of the optimal reorder and order-up-to levels $s^*$ and $S^*$; that is, we have two input variables $x = (s, S)^T$. The objective function $E[F_0(x, \omega)]$ in (1) is the steady-state expected total cost, namely the sum of order setup, ordering, and holding costs. There is a single stochastic constraint $E[F_1(x, \omega)]$, which is the expected steady-state “fill rate”, i.e., the fraction of demand directly met from stock on hand. Different from the formulation in (1), there is also a deterministic constraint on the inputs, namely $S \ge s$. We already remarked in § 2 that our heuristic can handle such constraints.

Like Bashyam and Fu (1998), we assumed an infinite horizon, periodic review inventory system with continuous-valued and independent, identically distributed (i.i.d.) demands and full backlogging of orders. The basic sequence of events in each period is as follows: orders are received at the beginning of the period, the demand for the period is subtracted, and order review is done at the end of the period. An order is placed when the inventory position (stock on hand plus outstanding suppliers’ orders minus customers’ backorders) falls below the reorder level $s$; the order amount is the difference between the order-up-to level $S$ and the current inventory position. Suppliers’ orders can cross in time (which has made an analytical solution impossible, so far).

Bashyam and Fu (1998) considered various distribution types for customers’ demands during a period and for order lead times. In particular, Bashyam and Fu calibrated their algorithm using exponentially distributed demands with mean 100 and Poisson distributed order lead times with mean 6. Furthermore, they reported that the results of the calibration problem were quite representative of the large number of test cases that they considered. Therefore, in our simulation experiments, we also assumed that customers’ demands have an exponential distribution with mean 100 and order lead times have a Poisson distribution with mean 6. Moreover, they set the setup cost per order to $Z = 36$, the order cost per unit to $u = 2$, and the holding cost per period per unit to $h = 1$; we used the same cost values in our simulation experiments. Like Bashyam and Fu (1998), each run was simulated for 20,000 periods, since they found that this was sufficient to reach steady-state conditions. Finally, whereas Bashyam and Fu considered various target fill rates, including 0.90, in our simulation experiments we focused only on a target fill rate of 0.90. Our simulation of the inventory system started with the inventory position and the inventory level (stock on hand) at $S$, without any outstanding suppliers’ orders.

In the original inventory problem of Bashyam and Fu (1998), there were no box constraints on $s$ and $S$. Therefore, the search direction to be used iteratively by our heuristic is given by (9). For this inventory problem, the matrix in the parentheses in (9) is a square matrix, since there are two constraints and two input variables. During the application of our heuristic, this matrix can become ill-conditioned. To prevent this situation, we added tentative upper and lower bounds on $s$ and $S$, as follows. The lower and upper bounds on $s$ are given by $s_{\min} = E(D)E(L)$ and $s_{\max} = E(D)E(L) + 3\sigma_D E(L)$, respectively, where $E(D)$, $E(L)$, and $\sigma_D$ are the expected demand, the expected lead time, and the standard deviation of demand, respectively. This gave $600 \le s \le 2400$. To obtain lower and upper bounds on $S$, we used the economic order quantity (EOQ) formula in Bashyam and Fu (1998): $EOQ = \sqrt{2 Z E(D)/h}$. We chose the lower and upper bounds on $S$ as $s_{\min} + 0.5\,EOQ \le S \le s_{\max} + 2\,EOQ$, which approximately equals $643 \le S \le 2570$. After adding these lower and upper bounds, the search direction of the heuristic is given by (7).

Bashyam and Fu (1998) did not report the optimal $(s^*, S^*)$ for the test case with Poisson distributed lead times with mean 6, exponentially distributed demands with mean 100, and a 0.90 target fill rate. Therefore, we tried to find the “optimal” $(s^*, S^*)$ through brute-force simulation over a grid. We simulated for 30,000 periods and averaged the costs and the fill rates over 10 macro-replicates over the $(s, S)$ plane, starting with $600 \le s \le 2400$ and $643 \le S \le 2570$ and a coarse grid. By refining the grid successively down to a $1 \times 1$ resolution, we concluded from these brute-force simulation experiments that $s = 1160$ and $S = 1212$ (average cost 647.15 with a standard error of 8.55; average fill rate 0.8948 with a standard error of 0.01) is the best estimate of $(s^*, S^*)$.

In this application of our heuristic, we used the same values as in § 5.1; i.e., the maximum number of simulation runs for the outer loop is 20, the number of simulation runs per inner loop is 3, $K = 1000$, $\alpha^{(1)} = 20\%$, $\alpha^{(2)} = 1\%$, and $\delta = 2.5\%$. In Figures 7 through 9, our heuristic started from the following arbitrarily chosen local experimental area for a $2^2$ design, expressed in the original variables: $2100 \le s \le 2280$ and $2300 \le S \le 2493$. The increments $\Delta s$ and $\Delta S$ were thus arbitrarily chosen to be 10% of the whole ranges of $s$ ($600 \le s \le 2400$, $\Delta s = 180$) and $S$ ($643 \le S \le 2570$, $\Delta S = 193$). In Figures 7 through 9, the lower and upper bounds on $s$ and $S$ coincide with the axes. Furthermore, in these figures the three piecewise linear curves correspond to the regions where $650 \le E[F_0(s, S, \omega)] < 700$, $700 \le E[F_0(s, S, \omega)] < 750$, and $750 \le E[F_0(s, S, \omega)] < 800$, respectively, where $E[F_0(x, \omega)]$ was estimated through brute-force simulation.

Insert Figures 7 through 9

To test the performance of our heuristic, we again made 100 macro-replicates. In contrast with the simple Monte Carlo example in § 5.1, where 100 macro-replicates took less than 5 minutes, the 100 macro-replicates now took about 450 minutes. As in § 5.1, Figures 7 through 9 correspond to the 10th, 50th, and 90th “best” solutions $(\hat{s}^*, \hat{S}^*)$, where we determined “best” with respect to the deviations of the corresponding objectives at $(\hat{s}^*, \hat{S}^*)$ from the true minimum 647.15. In Figures 7 through 9, the heuristic indeed reached the desired neighborhood of the true optimum: $(\hat{s}^*, \hat{S}^*) = (1137.7, 1274.8)$ with estimated cost 675.2 and relative gap 0.04, $(\hat{s}^*, \hat{S}^*) = (1136.7, 1302.4)$ with estimated cost 684.4 and relative gap 0.06, and $(\hat{s}^*, \hat{S}^*) = (1147.9, 1338.4)$ with estimated cost 711.5 and relative gap 0.10, respectively. Furthermore, in Figures 7 through 9, the cost estimates at the starting point (0) are 1650.2, 1647.1, and 1647.7. Hence, the percentage decreases in the objective estimates are ((1650.2 − 675.2)/1650.2) × 100 = 59.08%, 58.45%, and 56.82% in at most 20 simulation runs. In Figure 9, as a result of increasing the maximum number of runs for the outer loop from 20 to 25, the new estimated solution becomes $(\hat{s}^*, \hat{S}^*) = (1182.7, 1220)$ with estimated cost 658.2 and relative gap 0.02, which is an improvement of 7.49% in the cost estimated after 20 runs. Table 2 gives detailed numerical results for Figure 8.

Insert Table 2
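The following Python sketch is our own reconstruction of one replication of the inventory simulation from the description above; it is not Bashyam and Fu’s code or the MATLAB program used in the experiments, and details such as the exact timing of order arrivals relative to the review epoch are simplifying assumptions:

```python
import numpy as np

def simulate_sS(s, S, periods=20_000, seed=0,
                mean_demand=100.0, mean_lead=6, Z=36.0, u=2.0, h=1.0):
    """One replication of the periodic-review (s, S) system: exponential
    demand, Poisson lead times (orders may cross), full backlogging.
    Returns (average cost per period, fill rate)."""
    rng = np.random.default_rng(seed)
    on_hand = S          # net stock: negative values represent backorders
    outstanding = []     # list of (arrival_period, amount) supplier orders
    total_cost = 0.0
    met = demanded = 0.0
    for t in range(periods):
        # Orders due at the beginning of this period arrive.
        on_hand += sum(q for (due, q) in outstanding if due <= t)
        outstanding = [(due, q) for (due, q) in outstanding if due > t]
        # Demand for the period is subtracted; unmet demand is backlogged.
        d = rng.exponential(mean_demand)
        met += min(max(on_hand, 0.0), d)   # demand met directly from stock
        demanded += d
        on_hand -= d
        # Order review at the end of the period.
        position = on_hand + sum(q for (_, q) in outstanding)
        if position < s:
            q = S - position
            total_cost += Z + u * q
            # Arrival assumed at the start of period t + 1 + lead time.
            outstanding.append((t + 1 + rng.poisson(mean_lead), q))
        total_cost += h * max(on_hand, 0.0)   # holding cost on positive stock
    return total_cost / periods, met / demanded

cost, fill = simulate_sS(1160, 1212, seed=1)   # near the brute-force optimum
```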

5.3 Conclusions from numerical examples

We summarized the results of the Monte Carlo example and the inventory problem obtained from all 100 macro-replicates in Tables 3 and 4, respectively. The heuristic tends to end at a “feasible” point, since both tables have only positive quantiles. This is clearly due to our conservative (small) significance level $\alpha^{(2)} = 1\%$ combined with the pessimistic null hypothesis $H_0^{(2)}$. We experimented with different starting points and obtained very similar results; therefore, we do not present those details.

Insert Tables 3 and 4

Our conclusion is that the heuristic reaches the desired neighborhood of the true optimum in a relatively small number of simulation runs. Once the heuristic reaches this neighborhood, it stops at a “feasible” point.

6 Conclusions and future research

In this paper we focused on RSM for problems with a stochastic objective function and stochastic output constraints, besides deterministic input constraints. In the first stage of this novel RSM, the unknown functions are approximated locally by first-order polynomials, as in classic RSM. Then, a novel search direction is estimated, and a number of steps are taken along this path. The estimated search direction generalizes the steepest descent of classic RSM (with a single stochastic objective function). To achieve this generalization, we used ideas from nonlinear programming, such as affine scaling and projection. We proved two properties of our search direction: it is indeed a descent direction, and it is invariant with respect to general linear transformations. The classic steepest descent, however, is scale dependent.

Next, we provided a heuristic procedure for quickly reaching the desired neighborhood of the optimum of expensive simulation-based optimization problems. The heuristic moves towards this neighborhood through the interior of the feasible region. In this way, it avoids creeping along the boundary and ensures that simulation programs do not crash or become invalid. Though the heuristic focuses on stochastic optimization problems with stochastic constraints, it can also be applied to deterministic problems after a few simplifications. The empirical results of our numerical examples are encouraging; that is, in general the heuristic reaches a point that is “sufficiently” close to the optimum in a few simulation runs.

The main contribution of this paper is the generalization of classic RSM to problems with stochastic constraints. The numerical examples are only meant to illustrate the applicability of our heuristic, based on a novel, scale independent search direction. More experiments and examples may bring more insight into, and a better understanding of, the capabilities and limitations of our approach. Furthermore, we might use specific enhancements, such as better line search methods to determine the step size, or dynamically adapting the experimental area as we move towards the solution point, using ideas from trust-region methods.

In addition, it should be clear that this heuristic covers only the first stage of RSM; that is, it reaches a neighborhood of the true optimum in a few simulation runs. Afterwards, the second stage should be carried out, which gets even closer to the true optimum with a well-defined stopping rule. Analogous to unconstrained deterministic optimization, where a vanishing gradient is one of the necessary conditions for optimality, in classic RSM the first stage ends when the gradient of the approximate objective is not significantly different from zero. It is possible to develop a formal stopping rule based on ideas from constrained deterministic optimization. However, this goes beyond the aim and scope of this paper; see Angün and Kleijnen (2005).

Appendix A: Derivation of the search direction by introducing the ellipsoid constraint

We considered the approximation in (6), and replaced the nonnegativity constraints on the slack vectors (i.e., $s \in \mathbb{R}_+^{r-1}$, $r, v \in \mathbb{R}_+^{k}$) by an ellipsoid constraint; see Barnes (1986):

$$\begin{aligned}
\text{minimize} \quad & b_0^T x \\
\text{subject to} \quad & Bx - s = c, \quad x + r = u, \quad x - v = l, \quad \|q - \bar{q}\|_{\bar{Q}^{-1}} \le \rho,
\end{aligned}$$

where $0 < \rho < 1$, $q = (s^T, r^T, v^T)^T$, $\bar{q} = (\bar{s}^T, \bar{r}^T, \bar{v}^T)^T$, and $\bar{Q} = \mathrm{diag}(\bar{s}, \bar{r}, \bar{v})$, with $q, \bar{q} \in \mathbb{R}^{r-1+2k}$, $x \in \mathbb{R}^k$, and $\bar{Q} \in \mathbb{R}^{(r-1+2k) \times (r-1+2k)}$. Note that the entries of $\bar{Q}$ are the components of the current slack vectors $\bar{s}, \bar{r}, \bar{v} > 0$. The ellipsoid constraint in the formulation above can be rewritten as

$$(s - \bar{s})^T \bar{S}^{-2} (s - \bar{s}) + (r - \bar{r})^T \bar{R}^{-2} (r - \bar{r}) + (v - \bar{v})^T \bar{V}^{-2} (v - \bar{v}) \le \rho^2.$$

By substituting the slack vectors in terms of $x$ (i.e., $s = Bx - c$, $r = u - x$, $v = x - l$) into the ellipsoid constraint, we obtained

$$(x - \bar{x})^T B^T \bar{S}^{-2} B (x - \bar{x}) + (x - \bar{x})^T \bar{R}^{-2} (x - \bar{x}) + (x - \bar{x})^T \bar{V}^{-2} (x - \bar{x}) \le \rho^2.$$

Now the approximation can be rewritten as

$$\begin{aligned}
\text{minimize} \quad & b_0^T x \\
\text{subject to} \quad & (x - \bar{x})^T B^T \bar{S}^{-2} B (x - \bar{x}) + (x - \bar{x})^T \bar{R}^{-2} (x - \bar{x}) + (x - \bar{x})^T \bar{V}^{-2} (x - \bar{x}) \le \rho^2.
\end{aligned}$$

Since the problem is to minimize a linear function over an ellipsoid, the optimum will occur at the boundary. So we can replace the $\le$-constraint by an $=$-constraint and form the Lagrange function $L(x, \mu)$, where $\mu$ stands for the Lagrange multiplier. Hence, from the first-order necessary conditions, we derived

$$b_0 - 2\mu \left[ B^T \bar{S}^{-2} B (x - \bar{x}) + \bar{R}^{-2} (x - \bar{x}) + \bar{V}^{-2} (x - \bar{x}) \right] = 0,$$

$$(x - \bar{x})^T B^T \bar{S}^{-2} B (x - \bar{x}) + (x - \bar{x})^T \bar{R}^{-2} (x - \bar{x}) + (x - \bar{x})^T \bar{V}^{-2} (x - \bar{x}) = \rho^2.$$

From the first equality above, we derived

$$x = \bar{x} - \frac{1}{2\mu} \left[ -\left( B^T \bar{S}^{-2} B + \bar{R}^{-2} + \bar{V}^{-2} \right)^{-1} b_0 \right].$$

In this expression, the term within the brackets gives the proposed search direction, denoted as $d$ in (7) in § 3.
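For readers who prefer code to algebra, the following sketch (our own, in Python with NumPy; the function name and argument names are assumptions, not the paper's notation) computes $d = -\left( B^T \bar{S}^{-2} B + \bar{R}^{-2} + \bar{V}^{-2} \right)^{-1} b_0$ at a current interior point:

```python
import numpy as np

def search_direction(B, b0, s_bar, r_bar, v_bar):
    """Compute d = -(B' Sbar^-2 B + Rbar^-2 + Vbar^-2)^-1 b0.

    B     : (r-1, k) matrix of estimated constraint gradients
    b0    : (k,)   estimated gradient of the objective
    s_bar : (r-1,) current slacks of the stochastic constraints (> 0)
    r_bar : (k,)   current slacks of the upper bounds u - x (> 0)
    v_bar : (k,)   current slacks of the lower bounds x - l (> 0)
    """
    # Sbar, Rbar, Vbar are diagonal, so we scale rows instead of forming them.
    M = B.T @ (B / s_bar[:, None] ** 2)   # B' Sbar^-2 B
    M += np.diag(1.0 / r_bar ** 2)        # + Rbar^-2
    M += np.diag(1.0 / v_bar ** 2)        # + Vbar^-2
    return -np.linalg.solve(M, b0)        # descent direction d
```

Because the matrix $M$ is positive definite whenever the slacks are strictly positive, solving the linear system is numerically preferable to forming the explicit inverse.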

Appendix B: Scale independence of the search direction

We considered a general linear transformation of the variables: $Dz + e = x$, where $D \in \mathbb{R}^{k \times k}$ is non-singular and $e \in \mathbb{R}^k$. This transformation is one-to-one, and $z = D^{-1}(x - e)$. The final result proven in this appendix is a simple relation between the search directions $d_x$ and $d_z$ in the $x$ and $z$ spaces, which implies scale independence: $d_z = D^{-1} d_x$.

Defining $\tilde{z} = (1, z^T)^T$ and $\tilde{x} = (1, x^T)^T$, $Dz + e = x$ can be written as $A\tilde{z} = \tilde{x}$, where

$$A = \begin{pmatrix} 1 & \mathbf{0}^T \\ e & D \end{pmatrix}$$

and $A$ is non-singular. Given that $Z = XA^{-T}$, the least squares estimator $\hat{\beta}_j^{\tilde{z}}$ in the $z$ space is given by

$$\hat{\beta}_j^{\tilde{z}} = \left( Z^T Z \right)^{-1} Z^T \hat{F}_j = \left( A^{-1} X^T X A^{-T} \right)^{-1} A^{-1} X^T \hat{F}_j = A^T \hat{\beta}_j^{\tilde{x}},$$

where $\hat{\beta}_j^{\tilde{x}}$ is given in (3). Now we can write (5) in terms of $z$, as follows:

$$\begin{aligned}
\text{minimize} \quad & b_0^T D z \\
\text{subject to} \quad & b_j^T D z \ge c_j - b_j^T e \quad \text{for } j = 1, \ldots, r-1, \\
& D^{-1}(l - e) \le z \le D^{-1}(u - e).
\end{aligned}$$

Hence, from the formulation above, we have $B_z = B_x D$. By further simplifying the formulation and adding the slacks $s_z$, $r_z$, and $v_z$, we obtained

$$\begin{aligned}
\text{minimize} \quad & b_0^T D z \\
\text{subject to} \quad & B_x D z - s_z = c - B_x e, \quad z + r_z = D^{-1}(u - e), \quad z - v_z = D^{-1}(l - e), \\
& s_z, r_z, v_z \ge 0.
\end{aligned}$$

Hence

$$\begin{aligned}
s_z &= B_x D z - c + B_x e = B_x D \left[ D^{-1}(x - e) \right] - c + B_x e = B_x x - c = s_x, \\
r_z &= D^{-1}(u - e) - z = D^{-1}(u - e) - D^{-1}(x - e) = D^{-1}(u - x) = D^{-1} r_x, \\
v_z &= z - D^{-1}(l - e) = D^{-1}(x - e) - D^{-1}(l - e) = D^{-1}(x - l) = D^{-1} v_x.
\end{aligned}$$

Thus $B_z = B_x D$, $S_z = S_x$, $R_z = R_x D^{-T}$, and $V_z = V_x D^{-T}$. Remember that the search direction in $x$ was given by $d_x = -\left( B_x^T S_x^{-2} B_x + R_x^{-2} + V_x^{-2} \right)^{-1} b_0$. Therefore the search direction in the $z$ space is given by

$$\begin{aligned}
d_z &= -\left( D^T B_x^T S_x^{-2} B_x D + D^T R_x^{-2} D + D^T V_x^{-2} D \right)^{-1} D^T b_0 \\
&= -D^{-1} \left( B_x^T S_x^{-2} B_x + R_x^{-2} + V_x^{-2} \right)^{-1} D^{-T} D^T b_0 = D^{-1} d_x.
\end{aligned}$$
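The last chain of equalities is easy to check numerically. The sketch below (our own illustration, using hypothetical random data) verifies the algebraic identity $d_z = D^{-1} d_x$ for a randomly drawn non-singular $D$:

```python
import numpy as np

rng = np.random.default_rng(0)
k, m = 2, 3                                   # m = r - 1 stochastic constraints
B = rng.standard_normal((m, k))               # B_x
b0 = rng.standard_normal(k)                   # estimated objective gradient
s = rng.uniform(0.5, 2.0, m)                  # current slacks s_x > 0
r = rng.uniform(0.5, 2.0, k)                  # slacks of the upper bounds
v = rng.uniform(0.5, 2.0, k)                  # slacks of the lower bounds
D = rng.standard_normal((k, k)) + 3.0 * np.eye(k)   # non-singular D

M = B.T @ (B / s[:, None] ** 2) + np.diag(1 / r**2) + np.diag(1 / v**2)
dx = -np.linalg.solve(M, b0)                  # direction in the x space
dz = -np.linalg.solve(D.T @ M @ D, D.T @ b0)  # direction in the z space
assert np.allclose(dz, np.linalg.solve(D, dx))   # dz = D^{-1} dx
```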

REFERENCES

Angün, E. 2004. Black box simulation optimization: Generalized response surface methodology. Ph.D. Thesis, Tilburg University, Tilburg, The Netherlands.

Angün, E., and J. P. C. Kleijnen. 2005. An asymptotic test of optimality conditions in multiresponse simulation-based optimization. Working paper (download from http://center.kub.nl/staff/kleijnen/papers.html).

Barnes, E. R. 1986. A variation on Karmarkar's algorithm for solving linear programming problems. Mathematical Programming 36: 174-182.

Bashyam, S., and M. C. Fu. 1998. Optimization of (s, S) inventory systems with random lead times and a service level constraint. Management Science 44: S243-S256.

Ben-Gal, I., and J. Bukchin. 2002. The ergonomic design of working environment via rapid prototyping tools and design of experiments. IIE Transactions 34 (4): 375-391.

Bettonvil, B., and J. P. C. Kleijnen. 1996. Searching for important factors in simulation models with many factors: Sequential bifurcation. European Journal of Operational Research 96: 180-194.

Booker, A. J., J. E. Dennis, Jr., P. D. Frank, D. B. Serafini, V. Torczon, and M. W. Trosset. 1999. A rigorous framework for optimization of expensive functions by surrogates. Structural Optimization 17 (1): 1-13.

Conn, A. R., N. Gould, and Ph. L. Toint. 2000. Trust-Region Methods. SIAM, Philadelphia.

Del Castillo, E., and D. C. Montgomery. 1993. Nonlinear programming solution to the dual response problem. Journal of Quality Technology 25 (3): 199-204.

Den Hertog, D., and H. P. Stehouwer. 2002. Optimizing color picture tubes by high-cost nonlinear programming. European Journal of Operational Research 140 (2): 197-211.

Derringer, G., and R. Suich. 1980. Simultaneous optimization of several response variables. Journal of Quality Technology 12: 214-219.

Donohue, J. M., E. C. Houck, and R. H. Myers. 1993. Simulation design and correlation induction for reducing second-order bias in first-order response surfaces. Operations Research 41 (5): 880-902.

———. 1995. Simulation designs for the estimation of quadratic response surface gradients in the presence of model misspecification. Management Science 41 (2): 244-262.

Efron, B., and R. J. Tibshirani. 1993. An Introduction to the Bootstrap, Monographs on Statistics and Applied Probability 57. Chapman & Hall, New York.

Fan, S. K., and E. Del Castillo. 1999. Calculation of an optimal region of operation for dual response systems fitted from experimental data. Journal of the Operational Research Society 50 (8): 826-836.

Fu, M. C. 2002. Optimization for simulation: Theory vs. practice. INFORMS Journal on Computing 14 (3): 192-215.

Fu, M. C., and S. D. Hill. 1997. Optimization of discrete event systems via simultaneous perturbation stochastic approximation. IIE Transactions 29: 233-243.

Gill, P. E., W. Murray, and M. H. Wright. 1981. Practical Optimization. Academic Press, London.

Glasserman, P. 1991. Gradient Estimation via Perturbation Analysis. Kluwer Academic, Dordrecht.

Harrington, E. C. 1965. The desirability function. Industrial Quality Control 21: 494-498.

Helton, J. C., D. R. Anderson, M. G. Marietta, and R. P. Rechard. 1997. Performance assessment for the waste isolation pilot plant: from regulation to calculation for 40 CFR 191.13. Operations Research 45 (2): 157-177.

Ho, Y. C., and X. R. Cao. 1991. Perturbation Analysis of Discrete Event Dynamic Systems. Kluwer Academic, Dordrecht.

Hussey, J. R., R. H. Myers, and E. C. Houck. 1987. Correlated simulation experiments in first-order response surface design. Operations Research 35 (5): 744-758.

Irizarry, M., J. R. Wilson, and J. Trevino. 2001. A flexible simulation tool for manufacturing-cell design, II: Response surface analysis and case study. IIE Transactions 33: 837-846.

Khuri, A. I. 1996. Multiresponse surface methodology. In Handbook of Statistics 13, S. Ghosh and C. R. Rao, eds. Elsevier, Amsterdam.

Khuri, A. I., and M. Conlon. 1981. Simultaneous optimization of multiple responses represented by polynomial regression functions. Technometrics 23 (4): 363-375.

Khuri, A. I., and J. A. Cornell. 1996. Response Surfaces: Design and Analysis, 2nd ed. Marcel Dekker, New York.

Kleijnen, J. P. C. 1987. Statistical Tools for Simulation Practitioners. Marcel Dekker, New York.

———. 2005. Invited review: An overview of the design and analysis of simulation experiments for sensitivity analysis. European Journal of Operational Research 164 (2): 287-300.

Law, A. M., and W. D. Kelton. 2000. Simulation Modeling and Analysis, 3rd ed. McGraw-Hill, Boston.

Myers, R. H., and W. H. Carter. 1973. Response surface techniques for dual response systems. Technometrics 15: 301-317.

Myers, R. H., and D. C. Montgomery. 2002. Response Surface Methodology: Process and Product Optimization Using Designed Experiments, 2nd ed. Wiley, New York.

Rao, C. R. 1967. Least squares theory using an estimated dispersion matrix and its application to measurement of signals. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. I: 355-372.

Rubinstein, R. Y., and D. P. Kroese. 2004. The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning. Springer, New York.

Rubinstein, R. Y., and A. Shapiro. 1993. Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization by the Score Function Method. Wiley, Chichester.

Ruud, P. A. 2000. An Introduction to Classical Econometric Theory. Oxford University Press, New York.

Schruben, L. W., and B. H. Margolin. 1978. Pseudo-random number assignment in statistically designed simulation and distribution sampling experiments. Journal of the American Statistical Association 73: 504-520.

Simpson, T. W., A. J. Booker, D. Ghosh, A. A. Giunta, P. N. Koch, and R.-J. Yang. 2004. Approximation methods in multidisciplinary analysis and optimization: A panel discussion. Structural and Multidisciplinary Optimization 27 (5): 302-313.

Spall, J. C. 2003. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control. Wiley, Hoboken, NJ.

Tekin, E., and I. Sabuncuoglu. 2004. Simulation optimization: A comprehensive review on theory and applications. IIE Transactions 36: 1067-1081.

Vining, G. G., and R. H. Myers. 1990. Combining Taguchi and response surface philosophies: A dual response approach. Journal of Quality Technology 22 (1): 38-45.

Vonk Noordegraaf, A., M. Nielen, and J. P. C. Kleijnen. 2003. Sensitivity analysis by experimental design and metamodelling: Case study on simulation in national animal disease control. European Journal of Operational Research 146 (3): 433-443.

Wei, C. J., D. L. Olson, and E. M. White. 1990. Simultaneous optimization in process quality control via prediction-interval constrained programming. Journal of the Operational Research Society 41 (12): 1161-1167.

Yang, T., and L. Tseng. 2002. Solving a multi-objective simulation model using a hybrid response surface method and lexicographical goal programming approach: A case study on integrated circuit ink-marking machines. Journal of the Operational Research Society 53 (2): 211-221.

Ye, Y. 1989. An extension of Karmarkar's algorithm and the trust region method for quadratic programming. In Progress in Mathematical Programming: Interior-Point and Related Methods. Wiley, Chichester.

iteration | iterate $(\hat{x}_1, \hat{x}_2)$ | search direction $d$ | step size $\lambda$ | $H_0^{(1)}$ | $H_0^{(2)}$
0  | (2.40, -1.10) | (-0.98, 0.18)  | 0.72 | reject         | reject
1  | (1.69, -0.97) |                |      | fail to reject | reject
2  | (2.05, -1.03) |                |      | fail to reject | reject
3  | (1.87, -1.00) | (-0.19, 0.98)  | 1.18 | reject         | reject
4  | (1.46, 0.19)  |                |      | fail to reject | reject
5  | (1.58, -0.39) |                |      | fail to reject | reject
6  | (1.52, -0.09) | (-0.91, 0.41)  | 0.16 | reject         | fail to reject
7  | (1.32, 0.26)  |                |      | fail to reject | reject
8  | (1.39, 0.23)  |                |      | fail to reject | reject
9  | (1.43, 0.21)  | (-0.96, -0.28) | 0.40 | fail to reject | reject
10 | (1.08, 0.08)  |                |      | fail to reject | reject
11 | (1.27, 0.14)  |                |      | fail to reject | reject
12 | (1.37, 0.16)  |                |      |                |

Table 1: Numerical results for the estimated mean (estimated by the 0.50 quantile) of 100 estimated solutions for (15)

iteration | iterate $(\hat{s}, \hat{S})$ | search direction $d$ | step size $\lambda$ | $H_0^{(1)}$ | $H_0^{(2)}$
0 | (2100, 2300)     | (-0.7045, -0.7097) | 1703.4 | reject | fail to reject
1 | (900, 1091)      |                    |        | reject | reject
2 | (1500, 1695.5)   |                    |        | reject | fail to reject
3 | (1200, 1393.3)   | (-0.5973, -0.8020) | 171.3  | reject | fail to reject
4 | (1097.6, 1255.9) |                    |        | reject | fail to reject
5 | (1148.8, 1324.6) |                    |        | reject | fail to reject
6 | (1123.2, 1290.2) | (-0.4816, -0.8764) | 100.9  | reject | reject
7 | (1100.2, 1236.1) |                    |        | reject | reject
8 | (1124.5, 1280.3) |                    |        | reject | reject
9 | (1136.7, 1302.4) |                    |        |        |

Table 2: Numerical results for the estimated mean (estimated by the 0.50 quantile) of 100 estimated solutions for the inventory problem

 | 0.10 quantile | 0.25 quantile | 0.50 quantile | 0.75 quantile | 0.90 quantile
$(E[F_0(\hat{x}_1^*, \hat{x}_2^*)] - 22.96)/22.96$ | 0.0448 | 0.0555 | 0.1019 | 0.1798 | 0.1858
$(4 - E[F_1(\hat{x}_1^*, \hat{x}_2^*)])/4$         | 0.0264 | 0.1185 | 0.2529 | 0.4348 | 0.6147
$(9 - E[F_2(\hat{x}_1^*, \hat{x}_2^*)])/9$         | 0.0315 | 0.1453 | 0.2970 | 0.4921 | 0.5009

Table 3: Variability of the estimated objectives and slacks over 100 macro-replicates for the problem (15)

 | 0.10 quantile | 0.25 quantile | 0.50 quantile | 0.75 quantile | 0.90 quantile
$(E[F_0(\hat{s}^*, \hat{S}^*)] - 647.15)/647.15$ | 0.0233 | 0.0430 | 0.0575 | 0.0763 | 0.0893
$(E[F_1(\hat{s}^*, \hat{S}^*)] - 0.9)/0.9$       | 0.0050 | 0.0068 | 0.0091 | 0.0133 | 0.0166

Table 4: Variability of the estimated objectives and slacks over 100 macro-replicates for the inventory problem

Figure 1: Proposed search direction $RP'$ (or $R'P''$) versus steepest descent $RC$. [Sketch of the feasible area with level curves of the objective function; the points $R$, $R'$, $P$, $P'$, $P''$, and $C$ mark the directions compared.]

Figure 2: Overview of the iterative heuristic

0. Initialize.
1. Fit first-order polynomials, estimate variances, and perform bootstrap sampling.
2. Estimate a search direction and a maximum step size.
3. Estimate an approximate line minimum.
4. Select a resolution-3 design and simulate at the design points to estimate outputs.
5. Stopping criterion satisfied? If yes, end; otherwise, return to Step 1.
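To fix ideas, here is a minimal, self-contained sketch of this loop on a toy problem (our own illustration in Python: the quadratic objective, the single linear constraint, the box bounds, and all function names are assumptions, and the bootstrap tests and line search of Steps 1 and 3 are replaced by crude stand-ins):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-ins for the expensive simulation (our own illustration, not one of
# the paper's examples): a noisy objective F0, one stochastic constraint
# F1(x) >= c, and deterministic box constraints l <= x <= u.
def simulate(x):
    f0 = (x[0] - 1.0) ** 2 + (x[1] - 1.0) ** 2 + 0.01 * rng.standard_normal()
    f1 = x[0] + x[1] + 0.01 * rng.standard_normal()
    return f0, f1

c, l, u = 1.0, np.zeros(2), 5.0 * np.ones(2)

def local_design(center, h=0.1):
    # Step 4: a resolution-3 design for k = 2 (here, a 2^2 factorial).
    signs = np.array([[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]])
    return center + h * signs

def fit_gradient(X, y):
    # Step 1: least squares fit of y ~ beta_0 + b'x; return the slope b.
    Xa = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(Xa, y, rcond=None)[0][1:]

x = np.array([4.0, 4.0])
for outer in range(15):                     # Step 5: a fixed simulation budget
    X = local_design(x)
    F = np.array([simulate(p) for p in X])
    b0 = fit_gradient(X, F[:, 0])           # estimated objective gradient
    B = fit_gradient(X, F[:, 1])[None, :]   # estimated constraint gradient
    s = np.array([F[:, 1].mean() - c])      # estimated slack of F1 >= c
    r, v = u - x, x - l                     # slacks of the box constraints
    # Step 2: the search direction derived in Appendix A.
    M = B.T @ (B / s[:, None] ** 2) + np.diag(1 / r**2) + np.diag(1 / v**2)
    d = -np.linalg.solve(M, b0)
    # Step 3, simplified: a damped fixed step instead of the bisection search.
    x = x + 0.5 * d
print(x)   # drifts from (4, 4) towards the toy optimum near (1, 1)
```

Despite these simplifications, the sketch shows the essential mechanics: each outer iteration spends a handful of simulation runs on a local design, re-estimates the gradients, and steps through the interior of the feasible region along the new direction.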

Figure 3: Estimating an approximate line minimum (see Figure 2, Step 3)

3.0. Initializations: set the number of already executed simulation runs for the inner loop to 0; simulate at $x_c + \lambda d$; perform bootstrap sampling for the objective $F_0(x_c + \lambda d)$ and the slacks $S_j(x_c + \lambda d)$, $j = 1, \ldots, r-1$; determine the "better" of $x_c$ and $x_c + \lambda d$; increase the number of simulation runs by one for the inner and outer loops.
3.1. If the number of simulation runs for the inner loop exceeds the specified number, end; otherwise continue with Step 3.2.
3.2. Simulate at the candidate $x_h = 0.5\left[ x_c + (x_c + \lambda d) \right]$.
3.3. Bookkeeping: increase the number of simulation runs by one for the inner and outer loops.
3.4. Perform bootstrap sampling for the objective $F_0(x_h)$ and the slacks $S_j(x_h)$, $j = 1, \ldots, r-1$.
3.5. If $x_h$ is "feasible", go to Step 3.7; otherwise go to Step 3.6.
3.6. Determine the new end point of the line: $x_c + \lambda d \leftarrow x_h$; return to Step 3.1.
3.7. If $x_h$ has a "better" objective than the "best" so far, go to Step 3.10; otherwise go to Step 3.8.
3.8. If $x_c$ has a "better" objective than $x_c + \lambda d$, go to Step 3.9; otherwise go to Step 3.6.
3.9. Determine the new end point of the line: $x_c + \lambda d \leftarrow x_c$; return to Step 3.1.
3.10. Determine the new "best" so far, and change the objective and slacks to those of the "best" so far: $x_c \leftarrow x_h$, $\hat{f}_0(x_c) \leftarrow \hat{f}_0(x_h)$, $\hat{s}_j(x_c) \leftarrow \hat{s}_j(x_h)$, $j = 1, \ldots, r-1$; return to Step 3.1.
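A compact sketch of this inner loop follows (our own simplification in Python: the bootstrap "feasible" and "better" tests of Steps 3.4 through 3.8 are replaced by plain comparisons of noisy point estimates, and f and slacks stand for hypothetical estimators supplied by the caller):

```python
import numpy as np

def line_minimum(x_c, d, lam, f, slacks, inner_runs=5):
    """Bisection search along x_c + lam * d (Figure 3, simplified).

    f(x) returns a (noisy) objective estimate; slacks(x) returns estimated
    slacks of the stochastic constraints (a point counts as 'feasible' when
    all slacks are positive).
    """
    x_best, f_best = x_c, f(x_c)                     # Step 3.0
    x_end = x_c + lam * d
    if all(s > 0 for s in slacks(x_end)) and f(x_end) < f_best:
        x_best, f_best = x_end, f(x_end)
    for _ in range(inner_runs):                      # Step 3.1: run budget
        x_h = 0.5 * (x_best + x_end)                 # Step 3.2: halve interval
        if not all(s > 0 for s in slacks(x_h)):      # Step 3.5: infeasible?
            x_end = x_h                              # Step 3.6: shrink interval
        elif f(x_h) < f_best:                        # Step 3.7: new best?
            x_best, f_best = x_h, f(x_h)             # Step 3.10
        else:
            x_end = x_h                              # Steps 3.8-3.9, simplified
    return x_best

# Example: minimize ||x - (1,1)||^2 along the segment, subject to x1 + x2 >= 1.
f = lambda x: float((x - 1.0) @ (x - 1.0))
slacks = lambda x: [x.sum() - 1.0]
print(line_minimum(np.array([4.0, 4.0]), np.array([-1.0, -1.0]), 4.0, f, slacks))
```

Each inner run simulates one new candidate and shrinks the interval towards the current best point, so a larger inner-loop budget can only refine the returned point.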

Figure 4: The estimated best (estimated by the 0.10 quantile) of 100 estimated solutions for (15); (7*) is the estimated solution. [Plot in the $(x_1, x_2)$ plane showing the true optimum, the design points (including twice-used design points), the iterates (0) through (7*) with infeasible iterates marked, the constraints $(x_1 - 3)^2 + x_2^2 + x_1 x_2 \le 4$ and $x_1^2 + 3(x_2 + 1.061)^2 \le 9$, and the level curves $E[F_0(x_1, x_2, \omega)] = 25.96$ and $E[F_0(x_1, x_2, \omega)] = 28.96$.]

Figure 5: The estimated mean (estimated by the 0.50 quantile) of 100 estimated solutions for (15); (10*) is the estimated solution. [Same layout as Figure 4, with iterates (0) through (10*).]

Figure 6: The estimated worst (estimated by the 0.90 quantile) of 100 estimated solutions for (15); (8*) is the estimated solution. [Same layout as Figure 4, with iterates (0) through (8*) and, in addition, the solution after 25 runs.]

Figure 7: The estimated best (estimated by the 0.10 quantile) of 100 estimated solutions for the inventory problem; (8*) is the estimated solution. [Plot in the $(s, S)$ plane showing the true optimum, the design points, the iterates (0) through (9) with infeasible iterates marked, the constraints $E[F_1(s, S, \omega)] \ge 0.9$ and $-s + S \ge 0$, and the cost bands $650 \le E[F_0(s, S, \omega)] < 700$, $700 \le E[F_0(s, S, \omega)] < 750$, and $750 \le E[F_0(s, S, \omega)] < 800$.]

Figure 8: The estimated mean (estimated by the 0.50 quantile) of 100 estimated solutions for the inventory problem; (9*) is the estimated solution. [Same layout as Figure 7.]