Struct Multidisc Optim 22, 125–138 Springer-Verlag 2001
Simulation based optimization of stochastic systems with integer design variables by sequential multipoint linear approximation

S.J. Abspoel, L.F.P. Etman, J. Vervoort, R.A. van Rooij, A.J.G. Schoofs and J.E. Rooda
Abstract  Optimization problems are considered for which objective function and constraints are defined as expected values of stochastic functions that can only be evaluated at integer design variable levels via a computationally expensive computer simulation. Design sensitivities are assumed not to be available. An optimization approach is proposed based on a sequence of linear approximate optimization subproblems. Within each search subregion a linear approximate optimization subproblem is built using response surface model building. To this end, N simulation experiments are carried out in the search subregion according to a D-optimal experimental design. The linear approximate optimization problem is solved by integer linear programming using corrected constraint bounds to account for any uncertainty due to the stochasticity. Each approximate optimum is evaluated on the basis of M simulation replications with respect to objective function change and feasibility of the design. The performance of the optimization approach and the influence of parameters N and M are illustrated via two analytical test problems. A third example shows the application to a production flow line simulation model.

Key words  approximation concepts, integer design variables, stochastic systems, simulation optimization, nongradient based optimization
Received April 28, 2000

S.J. Abspoel, L.F.P. Etman, J. Vervoort, R.A. van Rooij, A.J.G. Schoofs and J.E. Rooda
Systems, Dynamics, and Control Engineering, Department of Mechanical Engineering, Eindhoven University of Technology, Wh. 4.105, P.O. Box 513, NL-5600 MB Eindhoven, The Netherlands
e-mail: [email protected]

1 Introduction

Approximation concepts have been successfully applied to a large range of optimum design problems where a computationally expensive computer simulation is involved. A review of approximation concepts for optimum structural design is given by Barthelemy and Haftka (1993). Approximation concepts are a means to obtain explicit optimization subproblems which are cheap to evaluate. They have been mainly developed and applied for deterministic (structural) systems with continuous design variables. Discrete design variables pose additional difficulties. But also in the discrete case, approximation methods are effective if function evaluations are quite expensive. Thanedar and Vanderplaats (1995) and Arora et al. (1994) reviewed methods for optimization of deterministic nonlinear problems with discrete design variables. Thanedar and Vanderplaats (1995) found approximation methods useful in discovering practical solutions, notwithstanding that they do not guarantee a discrete optimum.

Stochastic systems are less frequently considered in design optimization studies. Basically two different problems can be distinguished: either the actual values of the design variables are uncertain, or the outcome of a computer simulation is uncertain, resulting in inaccurate or stochastic responses. The first problem, uncertainties with respect to the input side of the numerical analysis, is considered in reliability based design and optimization. Introductions can be found, for example, in the work of Melchers (1987) and Gasser and Schuëller (1998). For the second problem, the design variables are assumed to be deterministic, but uncertainties in the simulation outcome are due to inaccuracies or stochastic distributions included in the simulation model itself. In the operations research community the optimization of such a stochastic system is referred to as simulation optimization. Fu (1994) and Carson and Maria (1997) give an extensive overview of methods and techniques developed in this research field.
Response surface approximation techniques have been used extensively both in reliability based design and in simulation optimization, mainly considering continuous design variables. In several optimum design problems both stochastic responses and discrete design variables occur. Two different types of application areas are mentioned here as examples. In crashworthiness design optimization,
large-scale crash simulation models may give rise to a band of noise on the objective function and constraints (Etman et al. 1996). Typical discrete design variables in crashworthiness design are a limited number of possible belt stiffnesses, airbag inflator diameters, or thicknesses of metal sheets. A second application area is the design of manufacturing systems. Here (discrete-event) simulation software is widely used to estimate the behaviour and performance of manufacturing system designs. Stochastic distributions in the model are usually due to variability in process times (Hopp and Spearman 1996). The discrete design variables in these problems are often integer variables such as the number of machines in workstations, buffer sizes, and batch sizes.

The optimization problem considered in this paper combines the following characteristic properties:
1. a computationally expensive simulation is needed to evaluate objective function and constraints,
2. the design variables are integer discrete,
3. objective function and constraints can only be evaluated at the integer level, and
4. objective function and constraints are expected values of stochastic functions.

In the literature usually two of these properties are considered together. For example, Loh and Papalambros (1991a) described a sequential linearization approach for mixed-discrete optimization problems that are deterministic and that can be evaluated at any (not necessarily discrete) level. Our optimization approach also uses as basis a sequence of linear approximate subproblems. It uses elements of response surface methodology, integer linear programming, and procedures from statistics. The approach has been implemented in Matlab (MathWorks 1999). A nonconvex analytical test problem, a stepped cantilever beam, and a four-station production line are used to illustrate the approach.
For the last test problem, the developed optimization tool has been coupled with the specification and simulation software χ (Rooda 2000).
2 Optimization problem formulation

The optimization problem considered here is mathematically formulated as follows:

Problem P

minimize:    E[F(x, ω)] ,

subject to:  E[G_j(x, ω)] ≤ c_j ,   j = 1, …, m ,

             x_i^l ≤ x_i ≤ x_i^u ,   i = 1, …, n ,

             x_i ∈ ℤ ,   i = 1, …, n .   (1)
Problem P states that the expected value of function F should be minimized subject to m constraints on the expected values of G_j. Herein, x denotes a vector containing the n design variables. Each of these variables x_i can take integer values only. Parameter ω represents the stochastic effects. Therefore, the objective function F(x, ω) and the constraints G_j(x, ω) are random variables. This means that for a certain design x, F and G_j can take several values according to some distribution. In the context of this paper, these values are supposed to follow from a (computationally expensive) computer simulation. Several replications of the simulation experiment for the same design variable settings are needed to actually obtain an estimate of E[F(x, ω)] and E[G_j(x, ω)]. It is assumed that there are – unknown – underlying continuous functional relations defining the expected values of F and G_j. However, these functional relationships are not explicitly known, and can only be evaluated with a certain probability distribution at a discrete design point. We assume that design points other than integer ones cannot be evaluated, and that design sensitivities with respect to the design variables are not available.
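As a minimal sketch of this replication step, expectations are estimated by averaging M simulation runs; the quadratic response and the 5% noise level below are illustrative assumptions, not the paper's simulation model:

```python
import random
from statistics import mean

def simulate(x):
    """Hypothetical stand-in for one run of the expensive stochastic
    simulation: a noisy observation of (F, G) at integer design point x."""
    f = sum(xi**2 for xi in x)          # assumed deterministic trend
    g = sum(x)
    noise = lambda v: v * (1.0 + random.gauss(0.0, 0.05))  # 5% relative noise
    return noise(f), noise(g)

def estimate_expectations(x, M):
    """Estimate E[F(x, w)] and E[G(x, w)] from M simulation replications."""
    samples = [simulate(x) for _ in range(M)]
    return mean(s[0] for s in samples), mean(s[1] for s in samples)
```

Increasing M reduces the standard error of these estimates, at the cost of M expensive simulation runs per design point.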
3 Linear approximate optimization approach

The approach basically comprises a series of approximate optimization cycles. Since optimization problem P may behave nonlinearly as a function of the design variables, a limited part (search subregion) of the design space is considered. Within this search subregion a design of experiments is defined. According to this design of experiments, simulation experiments are carried out to build closed-form linear response surface approximations of objective function and constraints using regression techniques. Within the search subregion the integer but deterministic approximate optimization problem is solved using a suitable mathematical programming algorithm. Finally, it is evaluated whether the calculated optimum design has improved compared to the optimum of the previous cycle. If the design has improved, it is used as the starting point of a new cycle; otherwise the cycle is repeated (iteration) with a repositioned or reduced search subregion.

Design point x^(p,0) denotes the starting point of the p-th cycle around which the search subregion is defined. The approximate optimum solution of the p-th cycle for the q-th iteration is denoted as x^(p,q). So, if two iterations are needed to obtain an improved design during cycle p, then x^(p,2) is the accepted optimum design that will be used as starting design for the next cycle: x^(p+1,0). Using this notation, the linear approximate subproblem of cycle p and iteration q around cycle start design x^(p,0) is formulated as follows:
Problem P̃^(p,q)

minimize:    f̃^(p,q) = a_0^(p,q) + Σ_{i=1}^n a_i^(p,q) · x_i ,

subject to:  g̃_j^(p,q) = b_{0j}^(p,q) + Σ_{i=1}^n b_{ij}^(p,q) · x_i ≤ c̃_j^(p,q) ,   j = 1, …, m ,

             x_i^{l(p,q)} ≤ x_i ≤ x_i^{u(p,q)} ,   i = 1, …, n .   (2)

The move limits x_i^{l(p,q)} and x_i^{u(p,q)} define the position and size of the search subregion. The actual move limit values are determined by the move limit strategy described in Sect. 4. The constraint bound c̃_j equals c_j corrected by some extra margin to account for the stochasticity. In Sect. 5 it is described how this correction factor is computed. Within the search subregion N experiments are planned according to a D-optimal experimental design (Myers and Montgomery 1995). The approximate subproblem is built from the outcome of these experiments using linear regression. The D-optimal design is created for a linear model without interaction using a row-exchange algorithm from the MathWorks statistics toolbox (MathWorks 1999). Contrary to factorial and fractional factorial designs, D-optimal experimental designs can be generated for any number of (simulation) experiments, provided the number of experiments is larger than or equal to the number of parameters in the model. This allows a linear relation between the number of experiments and the number of design variables in the optimization problem. This is rather crucial to keep the number of simulation experiments at a manageable level for increasing numbers of design variables. The linear approximate subproblem is solved using an integer linear programming algorithm. An external solver called lp_solve (see Berkelaar 1997) is used. It is based on a branch-and-bound algorithm to solve mixed-integer programming problems (see Nemhauser and Wolsey 1988). Slack design variables are used to relax approximate subproblems that do not have a feasible solution (Haftka and Gürdal 1992).

4 Move limit strategy

The optimization approach uses linear approximations of objective function and constraints. The search subregion is the region for which these linear approximations are used. The position of the search subregion and the values of the move limits are set by the move limit strategy. The move limit strategy employs two directional methods of placing the search subregion.

Directional method 1 is shown in Fig. 1 and uses the current cycle start design x^(p,0) as a corner of the search subregion. The opposing corner is placed in the direction of the previous search direction. The upper and lower bounds of the search subregion for this method can be calculated using

x_i^{l(p,q)} = x_i^(p,0) + (1/2) m_i^(p,q) · sign(s_i^(p−1)) − (1/2) m_i^(p,q) ,

x_i^{u(p,q)} = x_i^(p,0) + (1/2) m_i^(p,q) · sign(s_i^(p−1)) + (1/2) m_i^(p,q) ,

s_i^(p−1) = x_i^(p,0) − x_i^(p−1,0) ,   i = 1, …, n .   (3)

Parameter m_i^(p,q) defines the size of the search subregion in scale-units in direction i. If search direction s_i^(p−1) equals zero, sign(s_i^(p−1)) gives zero as well, which affects the position of the search subregion.

Fig. 1 Directional method 1: the search subregion is placed in the direction of the previous search direction s^(p−1) with the current start design x^(p,0) as a corner-point

Directional method 2 is shown in Fig. 2 and makes the cycle start design x^(p,0) the centre-point of the search subregion. The lower and upper bound of the search subregion for this method can be calculated using

x_i^{l(p,q)} = x_i^(p,0) − (1/2) m_i^(p,q) ,

x_i^{u(p,q)} = x_i^(p,0) + (1/2) m_i^(p,q) ,   i = 1, …, n .   (4)
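The move-limit bounds of eqs. (3) and (4) translate directly into code; this is a sketch with our own function names, not the authors' Matlab implementation:

```python
def sign(v):
    # sign(0) = 0, as required for zero search directions
    return (v > 0) - (v < 0)

def method1_bounds(x0, x_prev, m):
    """Directional method 1, eq. (3): x^(p,0) is a corner of the subregion,
    which extends along the previous search direction s = x^(p,0) - x^(p-1,0)."""
    lower, upper = [], []
    for x0i, xpi, mi in zip(x0, x_prev, m):
        s = sign(x0i - xpi)
        lower.append(x0i + 0.5 * mi * s - 0.5 * mi)
        upper.append(x0i + 0.5 * mi * s + 0.5 * mi)
    return lower, upper

def method2_bounds(x0, m):
    """Directional method 2, eq. (4): x^(p,0) is the centre of the subregion."""
    lower = [x0i - 0.5 * mi for x0i, mi in zip(x0, m)]
    upper = [x0i + 0.5 * mi for x0i, mi in zip(x0, m)]
    return lower, upper
```

For example, with x^(p,0) = (5, 3), x^(p−1,0) = (3, 3) and m = (4, 4), method 1 yields the subregion [5, 9] × [1, 5]: it extends forward in the first direction and is centred in the second, where the previous search direction is zero.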
For both directional methods some of the calculated search subregion bounds x_i^{l(p,q)} or x_i^{u(p,q)} may violate the bounds of the original design space x_i^l and x_i^u defined in Problem P. In that case the subregion is shifted such that it lies completely within the design space. Furthermore, if the calculated x_i^{l(p,q)} or x_i^{u(p,q)} values are not integer, x_i^{l(p,q)} is decreased and x_i^{u(p,q)} is increased towards the first integer value.

Fig. 2 Directional method 2: the search subregion is placed using the current start design x^(p,0) as its centre point. In the picture additionally a move limit reduction is visualized

If a solution is not accepted as the start design for the next cycle, the move limit strategy either changes the method with which the search subregion is positioned or reduces the move limits. This is done in the following manner.
1. Start with initial search subregion size m^(0,0) and position the search subregion using directional method 2.
2. After the first cycle, start using directional method 1 to position the search subregion.
3. If a design is not accepted and directional method 1 is used, then start positioning the search subregion with directional method 2.
4. If a design is not accepted and directional method 2 is used, then the current search subregion size m^(p,q) is reduced.

The plan points of the D-optimal experimental design are positioned at the corner points of the search subregion. With stochastic functions this positioning is best for the quality of the linear approximations. However, given the discrete grid and the stochastic behaviour, such a multipoint approximation will not have a local region of high accuracy, contrary to single-point approximations based on function value and gradient information. This local inaccuracy of the multipoint approximations influences the convergence behaviour of the sequential approximate optimization approach. Some approximation error will always be present, also in the final stages of the optimization. Therefore, one cannot guarantee convergence to the true global or local optimum solution; a neighbour may also be found.

5 Accepting designs

The calculated optimum design x^(p,q) is accepted as the start design for the next cycle if the following three conditions hold:
– x^(p,q) is feasible,
– x^(p,q) was not previously found,
– the objective function value F(x^(p,q)) has not increased compared with F(x^(p,0)).
In the deterministic case the feasibility of a design x^(p,q) is determined by comparison of the deterministic constraint values g^(p,q) with the corresponding bounds c. In the presence of stochastic constraints this approach is no longer applicable. A similar rationale holds for the decrease or increase of the objective function value.

Consider a stochastic constraint function G_j. If we carry out M replications of the simulation experiment for the same design variable setting x^(p,q), then the calculated constraint values g_{jk}^(p,q) (k = 1, …, M) vary around some mean value ḡ_j^(p,q). This mean is an estimate of the true expected value. The feasibility of x^(p,q) with respect to constraint E[G_j] ≤ c_j depends on the difference between the mean ḡ_j^(p,q) and the constraint bound c_j. The following safety index β_{g_j}^(p,q) of constraint G_j in cycle p and iteration q is defined:

β_{g_j}^(p,q) = (ḡ_j^(p,q) − c_j) / (s_{g_j}^(p,q) / √M) ,   j = 1, …, m ,   (5)

with the mean ḡ_j^(p,q) and standard deviation s_{g_j}^(p,q) given by

ḡ_j^(p,q) = Σ_{k=1}^M g_{jk}^(p,q) / M ,

(s_{g_j}^(p,q))^2 = Σ_{k=1}^M (g_{jk}^(p,q) − ḡ_j^(p,q))^2 / (M − 1) .   (6)

The numerator in (5) represents the difference between the mean and the constraint bound. The denominator expresses the error associated with the mean ḡ_j^(p,q) as estimator of the true expected value. This error depends on the standard deviation of the constraint values and on the number of replications M. For larger M, the estimate ḡ_j^(p,q) becomes more accurate.

For each solution x^(p,q) of approximate optimization problem P̃^(p,q), M experiments are carried out to compute the constraint values. The constraint E[G_j] ≤ c_j is considered inactive if the safety index value is smaller than some specified margin: β_{g_j}^(p,q) < −β_g^spec. Similarly, the constraint is violated if β_{g_j}^(p,q) > β_g^spec. In all other cases, −β_g^spec ≤ β_{g_j}^(p,q) ≤ β_g^spec, the constraint is called active. This is visualized in Fig. 3. Here it is assumed that the distribution of ḡ_j^(p,q) is more or less symmetrical; if this is not the case, the margins on the left and right hand side should be taken unequal. The design x^(p,q) is called feasible if β_{g_j}^(p,q) ≤ −β_g^spec for all j ∈ {1, …, m}. Otherwise, x^(p,q) is infeasible, meaning that one or more of the constraints may be active or violated.

The value of β_g^spec should be chosen in accordance with the distribution of the constraint function G_j and the allowed probability of infeasibility. For a normal distribution, the safety index β_{g_j}^(p,q) in (5) follows a t-distribution (see any textbook on statistics). For M sufficiently large, β_g^spec = 2 yields a probability of 2.5% or less that E[G_j] ≤ c_j is not satisfied. In simulation experiments the actual distribution may, however, be non-normal. The true distribution may even be unknown. In that case we should be careful with the selection of the β_g^spec value and the statistical interpretation of the safety index with respect to the confidence intervals.

Fig. 3 Interpretation of the safety index for constraints: inactive for β_{g_j} < −β_g^spec, active in between, violated for β_{g_j} > β_g^spec
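In code, the safety index of eqs. (5) and (6) and the three-way classification of Fig. 3 can be sketched as follows (function names are ours, not the paper's):

```python
from math import sqrt

def safety_index(g_samples, c):
    """Eq. (5): (mean - c) divided by the standard error of the mean."""
    M = len(g_samples)
    g_bar = sum(g_samples) / M
    var = sum((g - g_bar) ** 2 for g in g_samples) / (M - 1)   # eq. (6)
    return (g_bar - c) / (sqrt(var) / sqrt(M))

def classify_constraint(beta, beta_spec=2.0):
    """Fig. 3: inactive / active / violated."""
    if beta < -beta_spec:
        return "inactive"
    if beta > beta_spec:
        return "violated"
    return "active"
```

A strongly negative index means the mean lies well below the bound relative to its estimation error, so the constraint is safely inactive.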
Solving the approximate subproblem will yield a solution as close to the constraint boundaries as possible. The safety index, however, indicates that a certain distance to the constraint boundaries is desired for the solutions found. This can be achieved by tightening the constraint boundaries c in subproblem P̃^(p,q) with a correction factor. The corrected constraint bound in the approximate optimization problem P̃^(p,q) is calculated as follows:

c̃_j^(p,q) = c_j − β_g^spec · s_{g_j}(x^(p,q−1)) / √M ,   j = 1, …, m .   (7)

The correction of the constraint boundaries equals the specified margin on the safety index multiplied by the standard deviation of the mean. It is assumed that this standard deviation may depend on the design variable values (for example, the standard deviation may increase with increasing design variable value). The standard deviation to use for the correction factor should be the standard deviation of the constraint value in the solution of the approximate subproblem x^(p,q). Since this solution is not yet available, the standard deviation of the previous solution x^(p,q−1) is used as an estimate.

A similar approach is used for the objective function F. Due to the stochasticity the change in objective function value for a new design can no longer be determined using a single evaluation. Instead, the two sample means f̄_1^(p,q) of the previous point and f̄_2^(p,q) of the current point have to be compared. Therefore, the following index is defined:

β_f^(p,q) = (f̄_2^(p,q) − f̄_1^(p,q)) / (s_f^(p,q) √(2/M)) ,   (8)

with

f̄_{1,2}^(p,q) = Σ_{k=1}^M f_{1,2k}^(p,q) / M ,

(s_{f_{1,2}}^(p,q))^2 = Σ_{k=1}^M (f_{1,2k}^(p,q) − f̄_{1,2}^(p,q))^2 / (M − 1) ,

s_f^(p,q) = √((s_{f_1}^2 + s_{f_2}^2) / 2) .   (9)

The mean objective function values f̄_1 and f̄_2 and the standard deviations s_{f_1} and s_{f_2} are again based on the M experiments in each of the two design points. Three situations can be identified (see Fig. 4). Comparing the current point with the previous point, the objective function can be decreased (β_f < −β_f^spec), equal (−β_f^spec ≤ β_f ≤ β_f^spec), or increased (β_f > β_f^spec). If the objective function values f_1 and f_2 are normally distributed with unknown variances, then the index β_f follows a t-distribution. Similar to the selection of β_g^spec, we should be careful with β_f^spec as well, since the precondition of normality may not hold.

Fig. 4 Interpretation of the safety index for the objective function: decrease for β_f < −β_f^spec, equal in between, increase for β_f > β_f^spec
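The objective-function comparison of eqs. (8) and (9) is a pooled two-sample test; a sketch (names are ours):

```python
from math import sqrt

def objective_index(f_prev, f_curr):
    """Eq. (8): compare mean objective values of the previous and the
    current design point, each based on M replications (eq. (9))."""
    M = len(f_prev)                      # equal replication counts assumed
    mean = lambda v: sum(v) / len(v)
    var = lambda v, m: sum((x - m) ** 2 for x in v) / (len(v) - 1)
    f1, f2 = mean(f_prev), mean(f_curr)
    s_f = sqrt((var(f_prev, f1) + var(f_curr, f2)) / 2.0)  # pooled std dev
    return (f2 - f1) / (s_f * sqrt(2.0 / M))

def classify_objective(beta_f, beta_spec=2.0):
    """Fig. 4: decrease / equal / increase."""
    if beta_f < -beta_spec:
        return "decrease"
    if beta_f > beta_spec:
        return "increase"
    return "equal"
```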
Using the safety indices for both constraints and objective function, a design is accepted if the following three conditions hold:
– β_f^(p,q) < β_f^spec ,
– max_{j=1,…,m} β_{g_j}^(p,q) < −β_g^spec ,
– the design was not previously found.

If the start design is infeasible, the approximate optimization problem may not have a feasible solution in the search subregion. The slack variables relax the constraint bounds until the design closest to the feasible domain is found. This approximate optimum design, which is still infeasible, will be accepted if:
– max_{j=1,…,m} β_{g_j}^(p,q) < max_{j=1,…,m} β_{g_j}^(p,0) , and
– the design was not previously found.

Figure 5 summarizes the acceptance criteria starting from either a feasible or an infeasible design.

Fig. 5 Flow chart for accepting designs
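Our reading of these acceptance rules can be collected into a single predicate; this is an interpretation of Fig. 5, and the names and defaults below are assumptions:

```python
def accept_design(beta_f, betas_g, previously_found,
                  betas_g_start=None, beta_f_spec=2.0, beta_g_spec=2.0):
    """Acceptance test in the spirit of Fig. 5. betas_g_start holds the
    constraint safety indices of the cycle start design; if that start
    design is infeasible, the relaxed criterion applies."""
    if previously_found:
        return False
    start_infeasible = (betas_g_start is not None and
                        max(betas_g_start) >= -beta_g_spec)
    if start_infeasible:
        # accept any design that moved closer to the feasible domain
        return max(betas_g) < max(betas_g_start)
    # feasible start: objective must not have increased significantly,
    # and all constraints must be safely satisfied
    return beta_f < beta_f_spec and max(betas_g) < -beta_g_spec
```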
6 Convergence

The optimization is stopped when it is unlikely that further improvement of the current design will be found. This occurs if x^(p,q) has not been accepted and:
– max_i m_i^(p,q) < 2 and directional method = 2,
or if x^(p,q) has been accepted and:
– −β_f^spec < β_f^(p,q) < β_f^spec and x^(p,q) is feasible.

−β_g^spec, then the constraint boundaries are corrected with the new estimate of s_{g_j} and the subproblem P̃^(p,q) is resolved. Again M experiments are conducted to calculate β_f and β_{g_j} for this new design using (8) and (5).
9. Check acceptance criteria for x^(p,q) according to Fig. 5.
10. Check convergence criteria for x^(p,q) according to Fig. 6. If no convergence has occurred and x^(p,q) is accepted, then it becomes the start design for the next cycle x^(p+1,0). Let p = p + 1 and q = 1, and go to step 3. If no convergence has occurred and the solution is not accepted, then the position or size of the search subregion is modified by the move limit strategy and a new iteration is started: q = q + 1, go to step 3. If convergence has occurred, the optimization run is terminated.
8 Test problems

8.1 Test problem 1. Nonconvex constraints

The following stochastic discrete optimization problem has a nonconvex feasible region and is derived from a deterministic discrete optimization problem presented by Loh and Papalambros (1991b) by adding relative error terms. The error terms are normally distributed with zero mean and a standard deviation equal to 5% of the corresponding constraint or objective function value:

minimize:    E[F] ,

subject to:  E[G_1] ≤ 3.718 ,

             E[G_2] ≤ 15.854 ,

             x_1, x_2 ∈ ℤ ,

with

F = f + ε_f ,   G_j = g_j + ε_{g_j} ,   j = 1, 2 ,

f = −9 x_1^2 + 10 x_1 x_2 − 50 x_1 + 8 x_2 + 460 ,

g_1 = x_1 − (0.2768 x_2^2 − 0.235 x_2) ,

g_2 = x_1 − (−0.019 x_2^3 + 0.446 x_2^2 − 3.98 x_2) ,

ε_f ∈ N(0, |0.05 f|) ,   ε_{g_j} ∈ N(0, |0.05 g_j|) ,   j = 1, 2 .
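The deterministic parts of this test problem transcribe directly into code; the noise model adds the 5% relative error (helper names are ours):

```python
import random

def f_det(x1, x2):
    return -9 * x1**2 + 10 * x1 * x2 - 50 * x1 + 8 * x2 + 460

def g1_det(x1, x2):
    return x1 - (0.2768 * x2**2 - 0.235 * x2)

def g2_det(x1, x2):
    return x1 - (-0.019 * x2**3 + 0.446 * x2**2 - 3.98 * x2)

def noisy(value):
    """Zero-mean normal error with standard deviation 5% of the value."""
    return value + random.gauss(0.0, abs(0.05 * value))

def evaluate(x1, x2):
    """One 'simulation' replication: stochastic (F, G1, G2) at (x1, x2)."""
    return noisy(f_det(x1, x2)), noisy(g1_det(x1, x2)), noisy(g2_det(x1, x2))
```

At the discrete global optimum (5, 3) the deterministic constraint values are g1 ≈ 3.21 ≤ 3.718 and g2 ≈ 13.44 ≤ 15.854, i.e. the point is feasible in expectation.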
The optimization problem is visualized in Fig. 8. The solid lines represent the contour lines E[G_1] = 3.718 and E[G_2] = 15.854. Two discrete optima can be identified. The discrete global optimum is found at (5,3). One other local discrete optimum is found at (3,0). The influence of the stochastic contribution to the constraint values is illustrated by the dash-dotted lines, which show how the contour lines change if the constraint bounds are tightened with two times the standard deviation. The amount of tightening that will actually be applied follows from (7) and depends on the number of replications M.

Fig. 8 The constraints and objective function for the nonconvex problem
Fig. 7 Flow chart for the optimization approach
First, the deterministic problem is solved for both the original and tightened constraint bounds, starting from each discrete design point in the design space 0 ≤ x_1, x_2 ≤ 10, with initial size of the search subregion m_{1,2}^(0,0) = 4. We take N = 4, which gives a D-optimal design with one point in each of the four corners of the search subregion. For the original constraint bounds, 52.1% of the optimization runs converged to the global optimum (5,3); 37.2% of the runs ended in either (6,4) or (5,4). The local optimum (3,0) or one of its neighbours (3,1) and (4,1) is found for 3.7% and 7.4% of the runs, respectively. A single optimization run starting from (2,8) which converges to (5,4) is visualized by Etman et al. (1996).

In the deterministic case, convergence to a nonoptimal point, other than (5,3) or (3,0), is due to the linearization of the constraints. This linearization is based on function values in the corner points of the search subregion. Since both constraint functions g_1 and g_2 are concave, the linear constraint approximations may include part of the infeasible domain. This may cause a discrete point that is feasible in the approximate problem to be actually infeasible. Candidate points where this is likely to happen lie in the infeasible domain close to the constraint boundary, e.g. (4,1) and (6,4). If such a point is visited during the optimization process, it will not be accepted, which may result in premature convergence after (repeated) reduction of the search subregion. This observation is confirmed by the deterministic problem with tightened constraint bounds, where the full 100% of the runs converged towards (5,3). Figure 8 shows how the tightened constraints are more advantageously positioned with respect to the discrete grid.

For the stochastic problem, twenty optimization runs are started from each discrete design point in the design space 0 ≤ x_1, x_2 ≤ 10. This experiment is carried out for
several values of N and M. Parameters β_f^spec and β_g^spec are set to 2; the initial size of the search subregion is again 4 in both design variable directions. A range of optimum solutions is found. Each run consists on average of four or five approximate optimization cycles. To analyze the solutions, the following two definitions of Loh and Papalambros (1991a) for a deterministic discrete nonlinear programming problem (DDNLP) are helpful. For the stochastic case the expressions 'feasible' and 'smaller than or equal to' should be interpreted as explained in Sect. 5.

Definition 1. The discrete neighbourhood of a point x is defined as the set of all points y whose discrete components differ +1, 0, or −1 discrete units from the corresponding components of x, x itself being excluded from its own discrete neighbourhood. Formally, this set is defined as DN(x) = {y : |y_i − x_i| = 1 or 0 discrete units, i = 1, …, n; y ≠ x}.

Definition 2. The point x is said to be a local optimum for problem DDNLP if x is feasible for problem DDNLP and f(x) ≤ f(y) for all feasible y contained in DN(x).
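For integer points, Definition 1 can be enumerated directly (a small sketch):

```python
from itertools import product

def discrete_neighbourhood(x):
    """DN(x) of Definition 1: all points whose components differ by
    -1, 0, or +1 discrete units from x, excluding x itself."""
    return [tuple(xi + d for xi, d in zip(x, delta))
            for delta in product((-1, 0, 1), repeat=len(x))
            if any(d != 0 for d in delta)]
```

A point in n dimensions has 3^n − 1 discrete neighbours; in particular, DN(5,3) and DN(3,0) are disjoint, which is what makes the solution categories of Table 1 unique.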
Table 1 Optimization results for the nonconvex test problem. DN(5,3) denotes the discrete neighbourhood of (5,3)

No. points   Design     No. replications M
N                       5 [%]   10 [%]   25 [%]   2000 [%]
4            (5,3)      85.6    85.5     82.9     70.0
             DN(5,3)     8.6     9.2     10.2     21.9
             (3,0)       3.0     3.3      4.3      4.3
             DN(3,0)     0.4     0.5      1.2      1.9
             other       2.4     0.9      1.3      1.9

8            (5,3)      89.9    89.7     86.4     70.0
             DN(5,3)     6.5     6.4      8.8     22.0
             (3,0)       1.9     2.3      3.1      5.4
             DN(3,0)     0.2     0.7      0.6      1.9
             other       1.5     0.9      1.0      0.7

32           (5,3)      93.6    94.9     91.9     68.8
             DN(5,3)     4.8     3.8      5.2     23.0
             (3,0)       0.7     1.0      2.3      5.7
             DN(3,0)     0.1     0.1      0.4      2.4
             other       0.9     0.2      0.2      0.2

1024         (5,3)      94.4    97.0     99.1     57.4
             DN(5,3)     4.5     2.9      0.9     31.9
             (3,0)       0.1     0.0      0.0      7.4
             DN(3,0)     0.0     0.0      0.0      3.3
             other       1.0     0.1      0.0      0.0
The calculated solutions are categorized into five categories. A solution is either equal to a discrete optimum (local or global), part of its discrete neighbourhood, or another design. The discrete neighbourhoods of the discrete optima (5,3) and (3,0) do not overlap. The five categories are therefore unique and contain all solutions found. The frequencies of occurrence of the solutions in each of these five categories are presented in Table 1.

Increasing N improves the linear approximation. This can be observed for N = 1024 and M = 2000, where the distribution of calculated optima approaches the distribution of optima of the deterministic problem. Increasing M increases the accuracy of the estimates of mean and standard deviation. Increasing M therefore decreases the safety correction in (7) and enlarges the feasible region towards the deterministic constraint facets. Keeping N = 1024, the frequency with which (5,3) is found increases from 94.4% for M = 5 to 99.1% for M = 25. Solving the corresponding deterministic problems with tightened constraint bounds c_j − β_g^spec σ_g / √M gives (5,3) in 100% of the runs for M = 5, 10, 25. For M = 2000 the results of the uncorrected deterministic problem are obtained. Somewhere in between M = 25 and M = 2000 the shift of the constraint facets causes a turning point in the set of solutions found, as explained above.

8.2 Test problem 2. A stepped cantilever beam

An analytic cantilever beam problem is presented by Svanberg (1987). A stepped cantilever beam is built from 5 beam elements with quadratic cross-sections, as shown in Fig. 9. The beam is rigidly supported at node 1, and an external vertical force is acting at node 6.
Fig. 9 Cantilever beam (test problem 2)
The heights of the different beam elements are taken as design variables x_i. The thicknesses are held fixed. The objective is to minimize the weight of the beam. There is only one behaviour constraint, namely a given limit on the vertical displacement of node 6 (where the given load is acting). For our purposes, the constraint is made stochastic by adding a relative error term following a zero-mean normal distribution with a standard deviation equal to 5% of the current constraint value. The design variables are positive integers. This problem can be stated analytically as follows:
minimize:    f = C_1 (x_1 + x_2 + x_3 + x_4 + x_5) ,

subject to:  E[G] ≤ C_2 ,

             x_i ∈ ℤ^+ ,   i = 1, …, 5 ,

with

G = g + ε_g ,

g = 61/x_1^3 + 37/x_2^3 + 19/x_3^3 + 7/x_4^3 + 1/x_5^3 ,

ε_g ∈ N(0, |0.05 g|) .

Parameters C_1 and C_2 are constants whose values depend on material properties, the magnitude of the given load, etc. They are taken equal to (similar to Svanberg's case): C_1 = 0.0624 and C_2 = 1.0.

To have a reference, first the deterministic case is considered. All points in the design space 1 ≤ x_i ≤ 10, i = 1, …, 5, have been evaluated and the condition for local optimality as given by Definition 2 checked. Nine different (local) optima are found, presented in Table 2. They are subdivided into four groups: the optima within a group share the same objective function value. Group I contains the three different global optima. Groups II, III, and IV present local optima with a higher objective value. This underlines the statement of Loh and Papalambros (1991a) that mixed-integer problems may have many local optima, and that proliferation of local optima occurs even when the underlying continuous problems (obtained by relaxing the discreteness requirement) are convex. The cantilever beam is an example of such a problem.
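The cantilever responses transcribe as follows (a sketch; names are ours):

```python
import random

C1, C2 = 0.0624, 1.0

def weight(x):
    """Objective f = C1 * (x1 + ... + x5)."""
    return C1 * sum(x)

def g_det(x):
    """Deterministic displacement constraint value."""
    coeffs = (61, 37, 19, 7, 1)
    return sum(c / xi**3 for c, xi in zip(coeffs, x))

def g_stochastic(x):
    """One replication of G = g + eps_g with eps_g ~ N(0, |0.05 g|)."""
    g = g_det(x)
    return g + random.gauss(0.0, abs(0.05 * g))
```

For the group I optimum (6, 5, 5, 4, 2) of Table 2 this reproduces f = 1.3728 and g ≈ 0.9648 ≤ C2.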
Table 2 Local optima of the deterministic cantilever beam with corresponding objective function values (f) and constraint values (g)

Group   Local optimum    f        g
I       6 5 5 4 2        1.3728   0.9648
        6 6 4 4 2        1.3728   0.9850
        6 6 5 3 2        1.3728   0.9900
II      8 5 4 4 2        1.4352   0.9464
        8 5 5 3 2        1.4352   0.9514
        8 6 4 3 2        1.4352   0.9716
III     6 5 4 4 5        1.4976   0.9927
        6 5 5 3 5        1.4976   0.9977
IV      6 5 8 3 2        1.4976   0.9998

Starting from the initial design (10, 10, 10, 10, 10) with initial search subregion size set to 4, the optimization converges to (5, 6, 6, 4, 2) for the deterministic problem with N = 32 (i.e. each corner point of the search subregion is included in the D-optimal experimental design). This is a feasible neighbour of the group I optimal designs in Table 2 with a higher objective function value f = 1.4352. Constraint g is convex, and therefore the multipoint linear constraint approximation will cut off some part of the feasible domain. This means that in the linear approximate problem some discrete points will be considered as infeasible, while they are actually feasible. Depending on the positioning of the successive search subregions this may cause the optimization to miss the true optimum design and end up in a neighbouring point.

Next the stochastic problem is considered. Two hundred optimization runs are carried out starting from (10, 10, 10, 10, 10) for several combinations of M and N. A run consists on average of four to six cycles, using m_i^(0,0) = 4 and β_g^spec = 2. Table 3 shows how the calculated solutions compare with the group I deterministic optima of Table 2. The majority of runs yielded a group I optimum or a neighbour. The frequency with which other points are found decreases with increasing N, but is hardly affected by the number of replications M. Increasing N improves the quality of the approximations, which is also confirmed by Table 4, where the (absolute) number of different solutions found is tabulated. This number of solutions decreases for increasing N. Only for M = 5 in Table 3 does there seem to be hardly any effect. This is due to the large safety margin for small M: the tightening of the constraint bound in the approximate problem then causes a group I optimum to be found less frequently. Increasing M increases the frequency

Table 3 Optimum solutions of the stochastic cantilever beam starting 200 optimization runs from (10, 10, 10, 10, 10). A solution is categorized as a group I point of Table 2, a discrete neighbour (DN) of group I, or an other point
No. points N   Design        M = 5 [%]   M = 10 [%]   M = 25 [%]   M = 2000 [%]
12             group I        9.0         6.5         12.0         22.0
               DN(group I)   83.0        88.0         81.5         70.5
               other          8.0         5.5          6.5          7.5
24             group I        7.0        11.5         16.0         23.5
               DN(group I)   87.0        81.5         77.0         69.5
               other          6.0         7.0          7.0          7.0
48             group I        8.0        11.5         22.5         24.5
               DN(group I)   89.0        84.0         71.5         69.5
               other          3.0         4.5          6.0          6.0
1024           group I        6.0        25.5         61.0          0.5
               DN(group I)   94.0        74.5         39.0         99.5
               other          0.0         0.0          0.0          0.0
Table 4 Number of different optimum solutions found for the 200 runs starting from (10, 10, 10, 10, 10)

No. points N   M = 5   M = 10   M = 25   M = 2000
12             45      40       43       38
24             35      37       37       38
48             27      26       27       34
1024           12       9        7       10
Fig. 10 Four-station production flow line
The only exception is the case with N = 1024 and M = 2000, where almost all runs yielded a neighbour point of group I. This corresponds with the deterministic optimization case, where a neighbouring point was also found as the optimum solution.
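As a check on Table 2, the deterministic responses can be evaluated in closed form. The sketch below assumes the beam takes the form of Svanberg's (1987) cantilever problem, f = C1 Σ xi and g = C2 (61/x1³ + 37/x2³ + 19/x3³ + 7/x4³ + 1/x5³), which is consistent with the constants given above; the function names are ours:

```python
# Assumed closed-form responses (Svanberg's 1987 cantilever form):
#   f(x) = C1 * sum(x_i)            (objective, e.g. weight)
#   g(x) = C2 * sum(c_i / x_i^3)    (displacement constraint, g <= 1)
C1, C2 = 0.0624, 1.0
COEFF = [61.0, 37.0, 19.0, 7.0, 1.0]

def f(x):
    return C1 * sum(x)

def g(x):
    return C2 * sum(c / xi**3 for c, xi in zip(COEFF, x))

# First group I optimum of Table 2
x = (6, 5, 5, 4, 2)
print(round(f(x), 4), round(g(x), 4))  # 1.3728 0.9648
```

Evaluating the other tabulated designs in the same way reproduces the remaining f and g values of Table 2 to the printed precision.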
8.3 Test problem 3. A four-station production line Consider a four-station production flow line with a throughput target of 2.5 jobs per hour (Hopp and Spearman 1996); see Fig. 10. With this mean throughput, jobs arrive at the first workstation with negative exponentially distributed interarrival times. Each workstation consists of xi identical machines with a single infinite input buffer for temporary storage. The objective is to determine the number of machines per workstation that yields the minimum cost solution realizing the target throughput with a maximum mean cycle time of 6 hours; the cycle time refers to the time a job travels from the start to the end of the line. For each workstation, fixed costs FCi and unit costs UCi can be identified. The variable costs depend on the number of installed machines. Fixed and unit costs are summarized in Table 5, together with the mean process times MPTi and the squared coefficients of variation SCVi; the coefficient of variation is defined as the quotient of the standard deviation and the mean of the process time.
Table 5 Data of the line design problem

Station   Fixed cost [$1000]   Unit cost [$1000]   MPT [hrs]   SCV [-]
1         225                  100                 1.50        1.00
2         150                  155                 0.78        1.00
3         200                   90                 1.10        3.14
4         250                  130                 1.60        0.10
To begin with, the minimum required number of machines per workstation is determined: for each workstation the installed machine capacity xi/MPTi should be larger than the average arrival rate ra = 2.5 jobs/hour. Thus, the utilization of each workstation should be smaller than one:

ui = ra MPTi / xi < 1 ,   for all i ∈ {1, 2, 3, 4} .
This yields the minimum cost capacity feasible configuration as given in Table 6. However, this design may not satisfy the cycle time constraint. Due to the variable process times of the machines, queueing in the buffers occurs, which contributes to the cycle times of the jobs.
Table 6 Minimum cost capacity feasible configuration

Station   Machines   Utilization   Cost [$1000]
1         4          0.94          625
2         2          0.98          460
3         3          0.92          470
4         5          0.80          900
Total                              2455
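The minimum cost capacity feasible configuration follows directly from the data in Table 5; a minimal sketch (variable names are ours):

```python
import math

# Data from Table 5 (costs in $1000, process times in hours)
fixed_cost = [225, 150, 200, 250]
unit_cost  = [100, 155,  90, 130]
mpt        = [1.50, 0.78, 1.10, 1.60]
ra = 2.5  # mean arrival rate [jobs/hour]

# Smallest integer x_i with utilization u_i = ra*MPT_i/x_i strictly below 1
x_min = [math.floor(ra * m) + 1 for m in mpt]
util  = [ra * m / x for m, x in zip(mpt, x_min)]
cost  = [fc + uc * x for fc, uc, x in zip(fixed_cost, unit_cost, x_min)]

print(x_min)      # [4, 2, 3, 5]
print(util)       # utilizations, cf. Table 6
print(sum(cost))  # 2455
```

Note the strict inequality: station 4 has ra·MPT4 = 4.0 exactly, so five machines (not four) are required to keep its utilization below one.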
The optimization problem becomes

minimize:    F = Σ(i=1..4) FCi + Σ(i=1..4) UCi xi ,   xi ∈ Z+ ,

subject to:  E(CT) ≤ 6.0 ,
             x1 ≥ 4 ,  x2 ≥ 2 ,  x3 ≥ 3 ,  x4 ≥ 5 .
The expected value of the cycle time E(CT) is estimated by computer simulation of the flow line, using gamma distributed process times with mean and standard deviation in accordance with Table 5. The outcome of a simulation run of the minimum cost capacity feasible configuration x = (4, 2, 3, 5) is visualized in Fig. 11. The computed average cycle time is plotted as a function of the simulation run length. The plot shows that the run length should be sufficiently long to obtain a reasonably accurate estimate of the mean cycle time. However, each simulation yields a slightly different cycle time estimate. Figure 12 shows the distribution of calculated mean cycle times for one hundred simulation runs. The distribution has a mean of 33.1 hours and a standard deviation of 5.7 hours, and is almost but not entirely normally distributed. The effect of this uncertain mean cycle time value on the optimization problem is visualized in Fig. 13. The contour lines of the mean cycle time are plotted for several simulation run lengths, taking x̃1 = x1 = x2 and x̃2 = x3 = x4. For increasing run length the contour lines of the mean cycle time become smoother.
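The cycle time estimation described above can be sketched with a simple event-based simulation. The following is our own minimal illustration, not the authors' implementation: the function name and seed are assumptions, and jobs are served in line order at each station rather than under a strict per-station FIFO discipline:

```python
import heapq
import random

def simulate_line(x, mpt, scv, ra=2.5, n_jobs=50000, seed=1):
    """Estimate the mean cycle time of a serial flow line with x[i]
    parallel machines per station, gamma distributed process times,
    and exponential interarrival times (rate ra jobs/hour)."""
    rng = random.Random(seed)
    # Gamma shape/scale chosen so the mean is MPT_i and the squared
    # coefficient of variation is scv[i]
    shape = [1.0 / c for c in scv]
    scale = [m * c for m, c in zip(mpt, scv)]
    # Per station: heap of times at which each machine becomes free
    free = [[0.0] * xi for xi in x]
    t = 0.0
    total_ct = 0.0
    for _ in range(n_jobs):
        t += rng.expovariate(ra)       # arrival of the next job
        avail = t                      # time job is ready for station i
        for i in range(len(x)):
            machine = heapq.heappop(free[i])  # earliest free machine
            start = max(avail, machine)
            avail = start + rng.gammavariate(shape[i], scale[i])
            heapq.heappush(free[i], avail)
        total_ct += avail - t          # this job's cycle time
    return total_ct / n_jobs

mpt = [1.50, 0.78, 1.10, 1.60]
scv = [1.00, 1.00, 3.14, 0.10]
print(simulate_line([4, 2, 3, 5], mpt, scv))  # heavily loaded: tens of hours
print(simulate_line([6, 3, 6, 6], mpt, scv))  # near the 6 hour target
```

Because the estimate varies from replication to replication, repeated calls with different seeds give a spread comparable in spirit to the distribution of Fig. 12.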
Fifty optimization runs have been carried out starting from the minimum cost capacity feasible design x0 = (4, 2, 3, 5), with initial search subregion size m_i^(0,0) = 4. Both the number of experiments N and the number of replications M are taken as 15. The simulation run length is 50 000 jobs. The margin on the safety index β_g^spec is set to 2. The minimum cost capacity feasible design is infeasible with respect to the cycle time constraint. The solutions of the 50 optimization runs can be found in Table 7. Generally the optimization converges within four or five cycles. Nine different solutions have been found. In Table 8 the frequency of design variable values has been tabulated. The optimum design suggested by this table is x = (6, 3, 6, 6), which is also the most frequently found solution in Table 7. Almost all other solutions are neighbours of this design.

Fig. 11 A simulation run of the minimum cost capacity feasible configuration

Fig. 12 Calculated mean cycle times of one hundred simulation runs with a run length of 50 000 jobs for the minimum cost capacity feasible configuration of the four-station production flow line

Table 7 Calculated optimum designs for the four-station flow line

x1   x2   x3   x4   Frequency
6    3    6    6    18
6    4    5    5    10
7    3    5    6     6
6    4    5    7     4
7    5    6    5     4
7    3    6    5     3
5    3    6    6     2
5    4    5    6     2
6    4    6    5     1
Table 8 Design variable values corresponding to the calculated optima

Value   x1   x2   x3   x4
3        0   29    0    0
4        0   17    0    0
5        4    4   22   18
6       33    0   28   28
7       13    0    0    4
Fig. 13 Contour lines of the calculated mean cycle time of the four-station production flow line with x̃1 = x1 = x2 and x̃2 = x3 = x4, for several simulation run lengths: (a) run length = 1000 jobs, (b) run length = 10 000 jobs, and (c) run length = 50 000 jobs
9 Conclusions
The proposed sequential approximate optimization approach shows promising results for solving simulation based optimization problems that combine integer design variables and stochastic responses in a practically valuable way. The approach generates a sequence of approximate integer linear programming problems. Optimal design of experiments is used to plan the simulation experiments at discrete points of the search subregion. The stochastic behaviour is accounted for by the introduction of safety indices for objective function and constraints. A move limit strategy is included to redefine the size and position of the search subregion during the optimization process.

The number of experiments N planned in the search subregion controls the accuracy of the approximations. However, the multipoint linear approximations are unable to follow any (highly) nonlinear behaviour of objective function and constraints. The reason is that the size of the search subregion (i.e. the position of the plan points) is restricted to the integer levels of the design variables. This limited accuracy of the approximations can cause the sequential linearization to end in a nonoptimal point in the deterministic case. Stochasticity also causes multiple solutions to be found.

Optimization results of two analytical test problems have been presented. For both test problems the majority of the designs is found at or in the neighbourhood of the deterministic optimum solutions. The actual frequencies with which these points are found depend on the amount of stochasticity present in the optimization problem, the number of experiments N used to build the approximations, and the number of replications M used to compute the indices βf and βgj. The examples show that increasing the number of experiments N improves the quality of the approximations and decreases the variety of different solutions found. Increasing the number of replications M mainly increases the frequency with which the true optimum is found.

Finally, the four-station production line illustrates a successful application to a simple (stochastic) simulation model with four design variables. For this problem, in about thirty-six percent of the runs the same final optimum solution is found, while for the other runs almost always a neighbour of this point is found. For this problem it can be shown that the standard deviation of the cycle time indeed changes as a function of the position in the design space. This underlines the importance of recalculating the safety indices for each approximate optimum solution. Furthermore, the convex behaviour of the cycle time constraint of the production line problem resembles that of the displacement constraint of the cantilever beam problem. This suggests that concepts developed for structural optimization, such as reciprocal intermediate design variables, may also be valuable for the simulation optimization of production flow lines.
Acknowledgements The authors wish to thank Leen Stougie for his comments on a draft version of this paper.
References

Arora, J.S.; Huang, M.W.; Hsieh, C.C. 1994: Methods for optimization of nonlinear problems with discrete variables: a review. Struct. Optim. 8, 69–85
Barthelemy, J.-F.M.; Haftka, R.T. 1993: Approximation concepts for optimum structural design – a review. Struct. Optim. 5, 129–144
Berkelaar, M.R.C.M. 1997: lp_solve, version 2.2, ftp.es.ele.tue.nl/pub/lp_solve, Eindhoven University of Technology
Carson, Y.; Maria, A. 1997: Simulation optimization: methods and applications. Proc. 1997 Winter Simulation Conf., pp. 118–126. Atlanta: ACM
Etman, L.F.P.; Adriaens, J.M.T.A.; van Slagmaat, M.T.P.; Schoofs, A.J.G. 1996: Crashworthiness design optimization using multipoint sequential linear programming. Struct. Optim. 12, 222–228
Etman, L.F.P.; Abspoel, S.J.; Schoofs, A.J.G.; Rooda, J.E. 1999: An approach to simulation optimization of industrial systems with discrete design variables and stochastic behaviour. Proc. WCSMO-3, Third World Congress of Structural and Multidisciplinary Optimization. CD-ROM
Fu, M.C. 1994: Optimization via simulation: a review. Annals of Operations Research 53, 199–247
Gasser, M.; Schuëller, G.I. 1998: Some basic principles of reliability-based optimization (RBO) of structures and mechanical components. Proc. 3rd GAMM/IFIP-Workshop on Stochastic Optimization: Numerical Methods and Technical Applications, pp. 80–103. Berlin, Heidelberg, New York: Springer
Haftka, R.T.; Gürdal, Z. 1992: Elements of structural optimization. Dordrecht: Kluwer
Hopp, W.J.; Spearman, M.L. 1996: Factory physics: foundations of manufacturing management. London: Irwin
Loh, H.T.; Papalambros, P.Y. 1991a: A sequential linearization approach for solving mixed discrete nonlinear design optimization problems. J. Mech. Des. 113, 325–334
Loh, H.T.; Papalambros, P.Y. 1991b: Computational implementation and tests of a sequential linearization algorithm for mixed-discrete nonlinear design optimization. J. Mech. Des. 113, 335–345
MathWorks 1999: MATLAB reference guide, version 5. The MathWorks, Inc., Natick
Melchers, R.E. 1987: Structural reliability – analysis and prediction. New York: John Wiley & Sons
Myers, R.H.; Montgomery, D.C. 1995: Response surface methodology – process and product optimization using designed experiments. New York: John Wiley & Sons
Nemhauser, G.L.; Wolsey, L.A. 1988: Integer and combinatorial optimization. New York: John Wiley & Sons
Rooda, J.E. 2000: Modelling of Industrial Systems, http://se.wtb.tue.nl/, Eindhoven University of Technology
Svanberg, K. 1987: The method of moving asymptotes – a new method for structural optimization. Int. J. Numer. Meth. Engrg. 24, 359–373
Thanedar, P.B.; Vanderplaats, G.N. 1995: Survey of discrete variable optimization for structural design. J. Struct. Eng. 121, 301–306