User Guide for Chained Use of Optimization Techniques to Gain Insights

Lingjia Tang, Paul F. Reynolds, Jr.

ABSTRACT
Coercion combines optimization and expert-guided manual code modification to adapt simulations to meet new requirements. Making the right strategic decisions during this semi-automated process is essential yet difficult. This paper demonstrates the importance of gaining insight for agile strategic decisions during the optimization process, and proposes chaining various optimization techniques to gain and exploit that insight. We investigate and summarize important characteristics of several common optimization techniques and the insight each can provide. The result is an optimization technique table that can serve as a user guide for gaining insight and making corresponding strategic decisions during the optimization process. To demonstrate the utility of the table, we also present several approaches to gaining insight and possible scenarios in which users make suitable strategic decisions based on that insight.
1. Introduction

The coercion process is a semi-automated process combining optimization and code modification to adapt a simulation to meet new requirements [6]. In coercion, the objective function is defined as the distance between the simulation's current behavior and the required new behavior. The decision variables for optimization are flexible points identified and annotated by subject matter experts [1]. Optimization is used to automatically search for bindings of the flexible points that minimize the objective function, and thus minimize the distance between the simulation's current behavior and the required behavior.

Optimization over simulation inputs ("simulation optimization") to meet output objectives has been employed for years. The traditional approach to simulation optimization problems is ad hoc: users select an optimization technique arbitrarily, and if it terminates with unsatisfactory solutions, they select another technique, continuing until they find one that produces acceptable solutions. There is little guidance on how to select a suitable optimization technique, beyond a basic taxonomy of optimization problems that does not help with techniques applicable to general problems.

Optimization in the coercion process differs from general simulation optimization. Because coercion is an iterative process interweaving automated optimization and manual code modification, it tends to be more complicated, and the traditional monolithic application of a single optimization technique is of limited use. Furthermore, agile steering by users during the process is essential for coercion: users need to be able to preempt and pursue an alternative course of action at any step. In addition, defining appropriate objective functions to represent new requirements is sometimes difficult and may require incremental refinement by users.
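The coercion objective described above can be sketched in a few lines. This is a minimal illustration, not part of the coercion framework: `toy_simulation` is a made-up stand-in for a real simulation, the flexible-point bindings are just a pair of numbers, and Euclidean distance is one of many possible behavior metrics.

```python
import math

def behavior_distance(observed, required):
    # Euclidean distance between two behavior vectors.
    return math.sqrt(sum((o - r) ** 2 for o, r in zip(observed, required)))

def objective(bindings, run_simulation, required_behavior):
    # Run the simulation under the candidate flexible-point bindings and
    # measure how far the resulting behavior is from the required behavior.
    # Smaller is better; zero means the new requirement is met exactly.
    return behavior_distance(run_simulation(bindings), required_behavior)

# Hypothetical stand-in for a simulation: two outputs driven by two
# flexible-point bindings.
def toy_simulation(bindings):
    a, b = bindings
    return (2.0 * a, a + b)

required = (4.0, 5.0)
print(objective((2.0, 3.0), toy_simulation, required))  # 0.0: requirement met
```

An optimizer then searches over `bindings` to drive this distance toward zero.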
Therefore, we propose gaining insight during optimization and exploiting it to guide the process. We also propose chaining different optimization techniques to help acquire and exploit insight. The idea is based on two key observations:

1) Users can gain valuable insight during optimization, and that insight can be exploited to guide the remainder of the optimization process.

2) Various optimization techniques are suitable in different situations and can provide different insights from different perspectives.

To give coercion users a guide on how to gain insights during optimization and how to exploit them, this report presents an optimization table that summarizes and compares several common optimization techniques. The table focuses on the different insights various techniques can provide and the different problems they suit. The paper also discusses several approaches to gaining insights and scenarios for exploiting them.

The report is organized as follows: Section 2 discusses our motivations in more detail. Section 3 covers related work. Section 4 introduces the optimization table and explains how to interpret and use it. Section 5 discusses how to use insight to guide the coercion process by analyzing possible scenarios. Section 6 presents the summary and future work.

2. Chaining Optimization Techniques to Gain Insights

As mentioned above, we propose gaining insights and chaining optimization techniques in the coercion process based on two observations.

Observation 1. Users gain valuable insight while an optimization technique is running; that insight can be exploited to guide the rest of the optimization process.

A user can gain useful insights during optimization that in turn reveal promising directions and effective strategies for the remaining process. Possible insights include knowledge about the simulation behavior, the selection of design variables and the interactions between them, constraints, search space boundaries, promising regions, characteristics of the problem, the landscape of the objective function, how well the current technique is progressing, and how to tune the technique. These insights can indicate promising directions and reveal constraints and interactions that bound the search space. They can also indicate a need for a strategy change and suggest appropriate changes.
For example, they can indicate a need to preempt the current technique and switch to a more suitable one. Insights are also essential for selecting techniques: it has been shown that users cannot justify a preference for any optimization technique without insight into the problem's characteristics or structure [7]. In coercion, because flexible points carry semantics representing abstraction alternatives in the simulation [14], they create additional opportunities to gain valuable insights about the flexible points themselves and their effects on simulation behavior. Moreover, newly gained insights about flexible points and simulation behavior can be annotated and reused in future coercions [14]. However, these useful insights are generally not available when a user begins an optimization process, and currently there is no guide on how to gain insights during optimization or how to exploit them.

Observation 2. Various optimization techniques are suitable in different situations and can provide different insights from different perspectives.

Optimization techniques differ in setup cost, computation time, and the quality of the solutions they produce. Some robust techniques take a long time to set up and run but generally produce high-quality solutions or even guarantee the global optimum. Others have little setup cost and take less computation time but are likely to find only low-quality local optima. Techniques also differ in exploration power and in their capability to exploit insight. For example, evolutionary search approaches are often capable of exploring a larger area of the search space with fewer objective function evaluations than techniques based on sampling the neighborhood of a single solution [12]. On the other hand, local search can intensify the search effort in small areas that users specify. As shown in Figure 1, random code generation has the greatest exploration power, code modification purely exploits user insight, and most optimization methods lie somewhere between the two.
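The exploration contrast above can be made concrete with two hypothetical sampling helpers. The bounds, step size, and sample counts below are arbitrary; the point is only that population-style sampling covers the space while neighborhood sampling stays near one solution.

```python
import random
import statistics

def population_samples(bounds, n, rng):
    # Evolutionary-style exploration: candidates spread across the whole space.
    return [rng.uniform(*bounds) for _ in range(n)]

def neighborhood_samples(center, step, n, rng):
    # Single-solution methods: candidates drawn only near the current point.
    return [center + rng.uniform(-step, step) for _ in range(n)]

rng = random.Random(1)
wide = population_samples((-10.0, 10.0), 50, rng)
narrow = neighborhood_samples(0.0, 0.5, 50, rng)

# The spread of the evaluated points illustrates the difference in
# exploration power between the two styles.
print(statistics.pstdev(wide) > statistics.pstdev(narrow))  # True
```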
Figure 1. A spectrum from exploring the unknown (random code generation) to exploiting insight (code modification).

In addition, different techniques produce different search trajectories and different intermediate results, and thus yield different insights [4]. For example, response surface methodology constructs a meta-model of the problem, which provides insight into the objective function landscape. Clustering methods apply a clustering technique to sample solutions to identify groups that (hopefully) represent neighborhoods of local minima; the higher-ranking groups can provide insight into promising sub-regions.

Chaining techniques can exploit the strengths of different techniques, solve larger classes of problems, leverage insight, and help users gain different insights. As users' insights accumulate, they may want to preempt the current technique and switch to a more suitable one. Meanwhile, the fact that coercion is an iterative, semi-automated process gives users opportunities to select and chain various techniques. These two observations motivate our work of studying and comparing different optimization techniques and compiling a user guide for chaining them to gain insights, providing guidance on questions such as: what situations is each technique suitable for, what insights does each technique need for setup, what insights can a technique provide, and how can those insights be used?
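The chaining idea can be sketched as a two-stage pipeline in which the first technique's best solution (an insight about a promising region) becomes the second technique's start point. Both techniques here are deliberately simple stand-ins; a real chain would use the techniques in the table.

```python
import random

def random_search(objective, bounds, budget, rng):
    # Exploration stage: coarse random sampling over the whole space.
    best = rng.uniform(*bounds)
    for _ in range(budget):
        x = rng.uniform(*bounds)
        if objective(x) < objective(best):
            best = x
    return best

def hill_climb(objective, start, step, budget, rng):
    # Exploitation stage: refine within the neighborhood of the start point.
    best = start
    for _ in range(budget):
        x = best + rng.uniform(-step, step)
        if objective(x) < objective(best):
            best = x
    return best

def chain(objective, bounds, rng):
    # The switch point: the exploratory technique's result is handed over
    # as the start solution for the local technique.
    coarse = random_search(objective, bounds, 100, rng)
    return hill_climb(objective, coarse, 0.1, 200, rng)

f = lambda x: (x - 2.0) ** 2
result = chain(f, (-10.0, 10.0), random.Random(7))
```

Richer hand-overs (promising sub-regions, meta-models, known poor local optima) follow the same pattern: the next technique consumes artifacts the previous one produced.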
3. Related Work

The importance of human insight in the optimization process has been addressed in the literature, and techniques have been proposed to help users gain insight during optimization. [5] proposes a generic visual diagnostic tool for tabu search; with its help, an algorithm designer can examine search trajectories on the fly, gain insight, and steer the search by applying remedial strategies (such as "tuning parameters, adjusting configurations, or deriving better adaptive rules" [5]). In the simulation optimization community, [18] advocates exposing the internals of the optimization process to help users gain insight from optimization trajectories and tune the technique, and proposes an online instrumentation approach to achieve the exposure.

Hybridization of optimization techniques has also been proposed to take advantage of the strengths of different techniques, and it takes several forms. One form integrates components from different techniques; among the many integration schemes proposed, a popular one is to embed local search into the framework of population-based techniques, including evolutionary algorithms, ant-colony methods, etc. [19]. "The ability of evolutionary methods to capture a global view of the search space, when combined carefully with the fast convergence of local search techniques can often produce an algorithm that outperforms either one alone" [20], [21]. Another popular form of hybridization is cooperative search: "Typically, cooperative search algorithms consist of the parallel execution of search algorithms with a varying level of communication" [26]. [22] proposes a cooperative parallel search paradigm for graph coloring problems, in which two different search algorithms run in parallel, exchanging and reusing partial solutions generated by each other. Similar work includes [23], [24], [25].
[26] provides a rather comprehensive survey of the above two forms of hybridization of metaheuristic optimization techniques. A third form of hybridization connects and interweaves different techniques, which communicate with each other when switching; early research has been conducted in this area. Interesting work includes human-guided tabu search [27] and the HuGS platform [28]. For some combinatorial problems, HuGS provides several optimization techniques (tabu search, exhaustive search, and greedy search) together with solution visualizations; users can preempt a current optimization, modify a solution, constrain a search, or switch to another technique. Experiments with HuGS showed that human-guided optimization outperformed an equivalent amount of unguided optimization. [29] presents the adaptive optimization engine (AOE), which solves nonlinear optimization and constraint satisfaction problems by interleaving thirty different algorithms ranging from heuristic search to genetic algorithms to linear programming. In the engine, optimization problems are presented as black-box models. A technique is given a view of the problem, a start point, and the number of points it is allowed to evaluate before it must pause and allow other techniques time to run on the problem. A solution queue is kept so that "the technique gets information learned from the previous techniques, like the global best point found". Finally, Simulink from The MathWorks provides three optimization methods in its Response Optimization toolkit: gradient descent, pattern search, and simplex search. Users can connect these methods by using the previous results as start points for the next method [30].
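The interleaving style described for the AOE (round-robin evaluation budgets with a shared incumbent solution) can be loosely sketched as follows. This is our own illustration of the general idea, not the AOE implementation; the two probe techniques and all parameters are made up.

```python
import random

def random_probe(objective, start, budget, bounds, rng):
    # A "global view" technique: probes anywhere in the space.
    best = start
    for _ in range(budget):
        x = rng.uniform(*bounds)
        if objective(x) < objective(best):
            best = x
    return best

def local_probe(objective, start, budget, bounds, rng):
    # A "local refinement" technique: probes only near the incumbent.
    best = start
    for _ in range(budget):
        x = min(max(best + rng.uniform(-0.2, 0.2), bounds[0]), bounds[1])
        if objective(x) < objective(best):
            best = x
    return best

def interleave(objective, techniques, bounds, rounds, budget, rng):
    # Round-robin: each technique gets a fixed evaluation budget per turn
    # and resumes from the best point any technique has found so far,
    # mimicking the shared solution queue described above.
    incumbent = rng.uniform(*bounds)
    for _ in range(rounds):
        for technique in techniques:
            candidate = technique(objective, incumbent, budget, bounds, rng)
            if objective(candidate) < objective(incumbent):
                incumbent = candidate
    return incumbent

f = lambda x: (x - 1.0) ** 2
best = interleave(f, [random_probe, local_probe], (-5.0, 5.0), 5, 20,
                  random.Random(2))
```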
4. Optimization Table

We present an optimization table (in the appendix) that comprehensively summarizes and compares characteristics of several common optimization techniques, including metaheuristic methods and traditional systematic methods. Characteristics of interest include basic categorizations of optimization problems (local/global, constrained/unconstrained, discrete/continuous, linear/nonlinear, etc., in Tables 1 and 2), as well as setup cost, computation cost, insights helpful for setting up a technique, insights a technique can provide, and the situations each technique is suitable for. The summary and comparisons help users select techniques and decide when to switch. The table can serve as a guide for users of our chained-optimization approach for coercion; it can also help users of general optimization and simulation optimization problems. Section 4.1 discusses the important characteristics in more detail; Section 4.2 presents available approaches to gaining insights.

4.1 Characteristics of Optimization Techniques

We selected the characteristics in the table based on our interest in the insights each technique may need for setup, the insights each technique may provide, and the characteristics to consider when selecting techniques. We highlight some important characteristics below.

Insights useful for setting up the technique - Setup matters for two reasons: first, whether a technique is set up appropriately can greatly affect its performance; second, the time spent on setup is ultimately part of the total time spent solving an optimization problem. Hence, when selecting techniques, users need to consider how available insights can help the setup of different techniques.
(See Appendix, Table 3.)

Intermediate results - This entry focuses on the intermediate results each technique produces during the process. Intermediate results contain information that is often overlooked but valuable for making "rational decisions about computation, such as the selection of a new strategy or the decision to cease computation" [4]. These intermediate results can either be used directly by the next technique or be analyzed to extract valuable insights. (See Appendix, Table 4.)

Possible insights users can gain from the technique - Users can gain insights from intermediate results and from observing and analyzing the trajectories of the optimization process. Different techniques provide different insights from different perspectives; this entry discusses the insights a user can expect to gain when using each technique. (See Appendix, Table 4.)

Computation time - Computation time is a major factor when selecting optimization techniques. Theoretical analysis indicates that "for any (optimization) algorithm, any elevated performance over one class of (optimization) problems is exactly paid for in performance over another class" [7]. Still, there are general rules; for example, we can expect local optimizers to take much less time than global optimizers. In practice, much of the literature reports comparison experiments among techniques, especially on specific classes of problems, and we summarize some of this empirical knowledge in the table. When reading optimizer comparisons, users should pay special attention to the quality of the solutions each technique produces, whether the time for setting up each technique is included in the comparison, and the context and problem class in which the evaluation was conducted. (See Appendix, Table 5.)

Suitable situations for the technique - We discuss suitable situations for each technique to help users select one when needed. It is worth noting that a technique's neighborhood construction to a large extent indicates what problems and landscapes the technique suits. The neighborhood is the next set of candidate solutions, calculated from the technique's current solution. From a geometric perspective, looking for a suitable technique for a problem means looking for a technique whose search trajectory fits the landscape of the given objective function, and the neighborhood construction in part decides that trajectory. As an example of how neighborhood construction interacts with landscapes: simulated annealing is suitable for finding a global optimum hidden among many poor local minima, because the randomness in its neighborhood construction helps the technique avoid local minima. However, this construction also slows the technique's convergence. Therefore, when there are few local minima, simulated annealing might not be cost-effective.
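The simulated-annealing behavior just described, accepting worse neighbors with a temperature-dependent probability, can be sketched as below. This is a textbook-style sketch with an arbitrary cooling schedule and step size, and the multimodal test function is invented for illustration.

```python
import math
import random

def simulated_annealing(objective, start, bounds, rng,
                        t0=1.0, cooling=0.95, steps=500):
    # Randomized neighborhood: a uniform perturbation of the current point.
    # A worse neighbor is accepted with probability exp(-delta / t), which
    # lets the walk climb out of poor local minima early on; as t cools,
    # the search settles into ordinary descent.
    x = start
    best = start
    t = t0
    for _ in range(steps):
        cand = min(max(x + rng.uniform(-0.5, 0.5), bounds[0]), bounds[1])
        delta = objective(cand) - objective(x)
        if delta < 0 or rng.random() < math.exp(-delta / t):
            x = cand
        if objective(x) < objective(best):
            best = x
        t *= cooling
    return best

# A landscape with many poor local minima around a global optimum at x = 0.
f = lambda x: x * x + 3.0 * (1.0 - math.cos(5.0 * x))
found = simulated_annealing(f, 4.0, (-5.0, 5.0), random.Random(11))
```

On a nearly unimodal landscape, the accepted uphill moves buy nothing and only slow convergence, which is the cost-effectiveness caveat noted above.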
In addition to the relation between neighborhood construction and landscapes, other insights about the suitability of each technique are also discussed in the table. (See Appendix, Table 6.)

4.2 Approaches to Gaining Insights

Users can gain insights by observing and analyzing a technique's intermediate results or its running process. We propose several approaches.

4.2.1 Visualization

Many visualization techniques can help users gain insight into the simulation and the new requirements, the landscape of the objective function, interactions between design variables, the importance of variables, how well the optimization technique is working, and so on. Visualization of intermediate results can give users a picture of the objective function's landscape, and can reveal which variables are important and how they may interact. It can also present intermediate solutions so that users can select solutions or sub-regions as start solutions or promising regions in which to continue the search. Visualizations of the running process can show how an optimizer's sample solutions are distributed over the search space, how the search effort concentrates on certain sub-regions, and how fast the solutions' evaluations are converging. These help users see how the technique is progressing and what strategic changes might be appropriate. For example, [5] proposes a visualization technique that helps users see whether the optimizer is searching poor regions, cycling among solutions, or conducting a passive search, and use those insights to tune tabu search's parameters, configurations, and adaptive tabu rules on the fly. In the coercion process, visualizing the simulation's current behavior against the user's new requirements can help the user see how to modify the problem setup [2].

4.2.2 Statistical Analysis

Statistical analysis can help users gain insight into interactions between flexible points and the effects, importance, and sensitivity of flexible points. Dimensionality reduction techniques such as principal component analysis can be helpful. A dynamic sensitivity analysis of the current optimization technique's effectiveness could identify good points at which to preempt it. Statistical analysis, together with visualization of a technique's specific characteristics (e.g., population diversity in a genetic algorithm, the system temperature in simulated annealing), can provide insight into how to tune the optimization. Still, some insights are difficult to formalize and extract, and technique-specific insight about how to configure one particular technique cannot be directly exploited in another; this remains an interesting topic for future research.

5. Scenarios of Using Insights to Guide Optimization

Insights gained during optimization may indicate a need for:
-- changing the setup of the optimization problem or the coercion problem
-- tuning the current technique
-- switching to another optimization technique
-- going back to manual code modification

In this section, we present and discuss some scenarios of how insights can be used to guide the optimization process. We do not aim to list all possible scenarios exhaustively.
Instead, we hope that by presenting representative scenarios we can inspire readers to exploit insights during optimization to make strategic decisions.

5.1 Scenarios in which insights indicate changing the setup of the optimization or coercion problem

Scenario 1. The user gains insight into the coercion requirements. It is sometimes difficult for a user to define the objective function as a simplified representation of the new requirements. By observing the optimization process and comparing the optimization-generated simulation behavior with the target behavior, a user may see how to refine or redefine the objective function to represent the intended requirements more accurately.

Scenario 2. The user gains insight into the selection of design variables or the boundary of the search space. A user may discover during optimization that the design variables were improperly selected, and can reduce the dimensionality of the search space or expand the search space accordingly.

5.2 Scenarios in which insights indicate tuning or refining the optimization technique

Scenario 1. The user gains insight into interactions between design variables, or between variables and the objective function. Insights about interactions can indicate both promising and poor directions. For example, incompatible, redundant, and canceling relations between flexible points can reduce the search space, while a complementary relation may suggest concentrating the search on the influential interacting flexible points. Based on such insights, a user can adjust the search regions or the technique's neighborhood construction.

Scenario 2. The user believes the current technique is stuck in a local optimum. The options then include restarting the technique from a different start solution, tuning the technique, or switching to another technique; each lets the user explore regions the current technique has not yet reached.

Scenario 3. The user identifies promising sub-regions of the search space. The user can preempt the current technique, constrain it to these sub-regions, and restart. She can also switch to a different (possibly finer-grain) technique for the sub-regions, and if she has gained insight into the landscape from previous techniques, she can choose different techniques for different sub-regions. She can also iteratively restart the technique in smaller and smaller sub-regions.

5.3 Scenarios in which insights indicate switching to a new optimization technique

Scenario. The current technique is not working well.
Scenario. Insights indicate that another technique is better suited for the problem.

Scenario. The user has modified the setup of the optimization or coercion problem.

When the current technique is not working well, for example, its solutions are improving slowly, it is stuck in a local optimum, or it is repeatedly searching poor solutions, a user can consider switching to another technique. When a user gains insight into the problem's characteristics and decides that another technique is better suited, or wants to explore other sub-regions or concentrate on particular ones, she can also consider switching. Finally, after a user modifies the setup of the problem, some techniques may become more suitable for the modified problem. To select the new technique, users can consult the suitable-situation discussion in the optimization table.

Switching techniques raises an interesting question: how does the next technique use insights from the previous one? The next technique may use the previous technique's solutions as start solutions or as promising solution sets. It can use a meta-model generated by the previous technique to prune low-quality solutions. It can also be told that some solutions are probably poor local optima, based on insights from previous techniques, and thus avoid them and explore other solutions. Interesting future research can be done on this topic.

5.4 Scenarios in which insights indicate going back to manual modification

When a user learns that the current optimization is not moving in a desirable direction, that the simulation does not behave correctly, or that the optimization fails to provide satisfactory solutions after a long run, she may need to consider going back to manual modification.

6. Summary and Future Work

This paper demonstrates the importance of user insight in the optimization process and proposes chaining optimization techniques for the coercion process. The main contribution of our work is the table in the appendix comparing various optimization techniques, which can serve as a user guide for gaining and using insight to guide optimization during coercion. Interesting work remains: we have discussed gaining insight through visualization and statistical analysis, and developing useful visualization and analysis techniques would greatly help optimization users. In addition, a framework for chaining various optimization techniques to exploit user insight efficiently during the coercion process needs to be designed and developed.
References

[1] J. C. Carnahan, Paul F. Reynolds, Jr., and David C. Brogan. "Language Support for Identifying Flexible Points in Coercible Simulations". In Proceedings of the 2004 Fall Simulation Interoperability Workshop, September 2004.
[2] J. C. Carnahan, P. F. Reynolds, D. C. Brogan. "Visualizing Coercible Simulations". In Proceedings of the 2004 Winter Simulation Conference.
[3] J. C. Carnahan, Paul F. Reynolds, Jr., and David C. Brogan. "Simulation-Specific Characteristics and Software Reuse". In Proceedings of the 2005 Winter Simulation Conference, pp. 2492-2499, December 2005.
[4] E. J. Horvitz. "Reasoning under varying and uncertain resource constraints". In Proceedings of the National Conference on Artificial Intelligence (AAAI-88), pages 111-116, 1988.
[5] H. C. Lau, W. C. Wan, S. Halim. "Tuning Tabu Search Strategies via Visual Diagnosis". MIC 2005, the 6th Metaheuristics International Conference.
[6] S. Waziruddin, D. C. Brogan, P. F. Reynolds. “Coercion through Optimization: A Classification of Optimization Techniques”. In Proceedings of the 2004 Fall Simulation Interoperability Workshop, Orlando, FL, September 2004. [7] D. H. Wolpert and W. G. MacReady. “No free lunch theorems for optimization”. IEEE Transactions on Evolutionary Computation, April 1996.
[8] W. H. Press, B. P. Flannery, S. A. Teukolsky, W. T. Vetterling. "Numerical Recipes". Cambridge University Press, 1992. http://www.nr.com/
[9] P. Gray, W. Hart, L. Painton, C. Phillips, M. Trahan, J. Wagner. "A Survey of Global Optimization Methods". Sandia National Laboratories, 1997. http://www.cs.sandia.gov/opt/survey
[10] S. S. Skiena. "The Algorithm Design Manual". Springer, 1st edition, July 1998. http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK/BOOK.HTM
[11] J. Haddock, J. Mittenthal. "Simulation optimization using simulated annealing". Computers and Industrial Engineering, Volume 22, Issue 4 (October 1992), pages 387-395.
[12] J. April, F. Glover, J. P. Kelly, M. Laguna. 2003. "Practical introduction to simulation optimization". In Proceedings of the 2003 Winter Simulation Conference.
[13] A. Törn. "Global Optimization (Lecture Notes in Computer Science)". Springer-Verlag, Berlin and Heidelberg, 1989.
[14] J. C. Carnahan. 2006. "Language Support for the Coercible Software Domain". Dissertation Proposal, University of Virginia, School of Engineering and Applied Science, Charlottesville, VA, 2006.
[15] L. Gerencser. "Optimization Over Discrete Sets via SPSA". In Proceedings of the IEEE Conference on Decision and Control, pp. 1791-1795.
[16] F. Azadivar and J. J. Talavage. "Optimization of Stochastic Simulation Models". Mathematics and Computers in Simulation, 22:231-241.
[17] F. Azadivar. "Simulation Optimization Methodologies". In Proceedings of the 1999 Winter Simulation Conference, 1999.
[18] A. Persson, H. Grimm, A. Ng. 2006. "On-line instrumentation for simulation-based optimization". In Proceedings of the 2006 Winter Simulation Conference, December 2006.
[19] W. E. Hart. 1994. "Adaptive Global Optimization with Local Search". Doctoral Dissertation, University of California, San Diego, CA.
[20] P. Turney. 1996. "Myths and Legends of the Baldwin Effect". In Proc. of the 13th International Conference on Machine Learning (ICML), 135-142.
[21] F. Bobo and E. Goldberg. 1997. "Decision Making in a Hybrid Genetic Algorithm". In T. Back (Ed.), Proc. of the 1997 IEEE International Conference on Evolutionary Computation, pp. 121-125. IEEE Press.
[22] T. Hogg and C. P. Williams. 1993. "Solving The Really Hard Problems With Cooperative Search". In Proceedings of AAAI-93, AAAI Press, Menlo Park, CA, 1993.
[23] V. Bachelet and E. Talbi. 2000. "Cosearch: A co-evolutionary metaheuristic". In Proceedings of the Congress on Evolutionary Computation (CEC 2000), 1550-1557.
[24] J. Denzinger and T. Offerman. 1999. "On cooperation between evolutionary algorithms and other search paradigms". In Proceedings of the Congress on Evolutionary Computation (CEC 1999), 2317-2324.
[25] M. Toulouse, K. Thulasiraman and F. Glover. 1999. "Multi-level cooperative search: A new paradigm for combinatorial optimization and application to graph partitioning". In Proceedings of the 5th International Euro-Par Conference on Parallel Processing, Lecture Notes in Computer Science, Springer-Verlag, New York, 533-542.
[26] C. Blum and A. Roli. 2003. "Metaheuristics in Combinatorial Optimization: Overview and Conceptual Comparison". ACM Computing Surveys, Vol. 35, No. 3, September 2003, pp. 268-308.
[27] Klau, G. W., N. Lesh, J. Marks, and M. Mitzenmacher. 2002. "Human-Guided Tabu Search". In Proceedings of the 18th National Conference on Artificial Intelligence (AAAI), pp. 41-47, 2002.
[28] Klau, G. W., N. Lesh, J. Marks, and M. Mitzenmacher. 2002. "The HuGS Platform: A Toolkit for Interactive Optimization". In Advanced Visual Interfaces, May 2002, Trento, Italy.
[29] Faulkner, E. and J. Cowart. 2006.
"The Adaptive Optimization Engine". Presented at the INFORMS Annual Meeting, 2006.
[30] Simulink Response Optimization Documentation. http://www.mathworks.com/access/helpdesk/help/toolbox/sloptim/
[31] J. C. Spall, S. D. Hill, and D. R. Stark. 2006. "Theoretical Framework for Comparing Several Stochastic Optimization Approaches". In Probabilistic and Randomized Methods for Design under Uncertainty (G. Calafiore and F. Dabbene, eds.), Springer, Berlin, Chapter 3.
[32] J.C Spall, “SPSA- a method for system optimization”, http://www.jhuapl.edu/SPSA/
[33] J.C. Spall. 2003. “Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control”, Wiley, Hoboken, NJ. [34] J.C. Spall, 1998. "An Overview of the Simultaneous Perturbation Method for Efficient Optimization," Johns Hopkins APL Technical Digest, vol. 19, pp. 482-492 (survey paper on SPSA).
APPENDIX: Optimization Technique Table

We include six tables in this appendix. Each table summarizes and compares the techniques based on specific characteristics. Tables 1 and 2 give a basic taxonomy of optimization problems. Tables 3, 4 and 5 discuss the set-up and running properties of each technique, as well as the insights each technique needs and can possibly provide. Table 6 discusses the situations each technique is suitable for.

Columns of Table 1: a local optimizer or a global optimizer? | suitable for constrained or unconstrained problems? | suitable for continuous or discrete problems? | suitable for linear or nonlinear problems? | suitable for high-dimension problems or low dimension?

Simulated Annealing
- Local or global: Global.
- Constraints: Originally for unconstrained problems, but a penalty function or sophisticated neighborhood construction can be used to solve constrained problems.
- Continuous or discrete: Both. Continuous: neighborhood construction uses gradient-based methods. Discrete: neighborhood construction needs a combinatorial encoding.
- Linear or nonlinear: Nonlinear.

Genetic Algorithm
- Local or global: Global.
- Constraints: Originally designed for unconstrained problems; methods to handle constrained problems have been proposed, which increase computational complexity.
- Continuous or discrete: Mainly discrete, occasionally used for continuous.
- Linear or nonlinear: Highly nonlinear.
- Dimension: Suitable for high dimension.

Tabu Search
- Local or global: Global.
- Constraints: Both.
- Continuous or discrete: Discrete (esp. combinatorial); can be applied to continuous problems by choosing a discrete encoding of the problem.
- Linear or nonlinear: Nonlinear.

Response Surface Methodology
- Local or global: Local, but can be combined with techniques to escape local minima.
- Constraints: Both.
- Continuous or discrete: Continuous.
- Linear or nonlinear: Linear or low-order nonlinear.

Stochastic Approximation
- Local or global: Local.
- Continuous or discrete: Originally designed for continuous input, but can be modified to work for discrete input [15].
- Linear or nonlinear: Nonlinear.
- Dimension: Some algorithms need only two simulations per gradient estimate, independent of the dimension, which makes them promising for high-dimensional problems.

Random Search
- Local or global: Global.
- Constraints: Both.
- Continuous or discrete: Primarily discrete optimization, but can be applied in continuous or hybrid cases.
- Linear or nonlinear: Nonlinear.
- Dimension: Suitable for high dimension.

Clustering Method
- Local or global: Global.
- Constraints: Both.
- Continuous or discrete: Continuous.
- Linear or nonlinear: Nonlinear.
- Dimension: Most effective for low-dimensional problems; less effective for problems of more than a few hundred variables.

Table 1. Basic Taxonomy of Optimizers (1).
Columns of Table 2: suitable for stochastic simulations? | is it guaranteed to converge? | does it have randomness components? | suitable for non-differentiable simulations? | number of function evaluations it requires.

Simulated Annealing
- Stochastic simulations: Yes.
- Convergence: Yes, but only under unrealistic settings.
- Randomness: Yes.
- Non-differentiable simulations: Yes.
- Function evaluations: Large, at least for the continuous case.

Genetic Algorithm
- Stochastic simulations: The common belief is that it is suitable for stochastic problems, though no theoretical support exists.
- Convergence: A canonical GA without elitism (one that does not hold onto the best solution at each generation) is provably non-convergent (Rudolph, 1994). Conditions for the formal convergence of GAs to an optimum are presented in the literature, such as [33].
- Randomness: Yes.
- Non-differentiable simulations: Yes.
- Function evaluations: Large.

Tabu Search
- Stochastic simulations: Yes, but it is difficult to determine the minimum number of simulation replications that will reduce the effect of the stochastic process on the search method.
- Convergence: Yes, but only under unrealistic settings (e.g. an infinite tabu list).
- Randomness: No.
- Non-differentiable simulations: Yes.

Response Surface Methodology
- Stochastic simulations: Yes.
- Randomness: No.
- Non-differentiable simulations: Yes.
- Function evaluations: Low (suitable for problems where function evaluation is very costly).

Stochastic Approximation
- Stochastic simulations: Yes.
- Convergence: Yes, but the convergence requirement is impractical and purely theoretical.
- Randomness: Yes.
- Non-differentiable simulations: Yes.
- Function evaluations: Low.

Random Search
- Stochastic simulations: Yes.
- Convergence: Theoretical convergence proofs exist; it converges under very general conditions, but the convergence rate is slow. There is no guarantee in the stochastic setting.
- Randomness: Yes.
- Non-differentiable simulations: Yes.
- Function evaluations: Large (thus recommended only when function evaluation is not costly).

Clustering Method
- Stochastic simulations: Yes.
- Non-differentiable simulations: Yes; gradient-based local search can be helpful.
- Function evaluations: Large.

Table 2. Basic Taxonomy of Optimizers (2).
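The "two simulations per gradient estimate, independent of the dimension" property noted for stochastic approximation is the defining feature of SPSA [34]. The sketch below illustrates it on a toy quadratic; the test function and the gain values a and c are our own illustrative choices, not taken from the cited papers, which give practical guidelines for selecting the gains.

```python
import numpy as np

def spsa_step(f, theta, a, c, rng):
    """One SPSA iteration: estimates the gradient from only two
    evaluations of f, regardless of the dimension of theta."""
    # Rademacher perturbation vector (+1/-1 entries)
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    # Two simulations per gradient estimate
    y_plus = f(theta + c * delta)
    y_minus = f(theta - c * delta)
    ghat = (y_plus - y_minus) / (2.0 * c) * (1.0 / delta)
    return theta - a * ghat

rng = np.random.default_rng(0)
theta = np.ones(10)              # 10-dimensional, still 2 evaluations/step
f = lambda x: float(np.sum(x ** 2))
for k in range(200):
    theta = spsa_step(f, theta, a=0.05, c=0.1, rng=rng)
```

Constant gains are used here for brevity; in practice a and c are decayed over iterations as discussed in [34].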
Columns of Table 3: set-up (design decisions users need to make when setting up the technique) | insights useful for setting up/tuning the technique.

Simulated Annealing
Set-up decisions: 1) Construction of the neighborhood and selection of the candidate set. 2) For constrained problems, a strategy to handle constraints (i.e. a penalty function or sophisticated construction of the neighborhood structure). 3) The cooling schedule: (a) initial temperature; (b) temperature decrement function; (c) number of iterations between temperature changes; (d) acceptance criterion. 4) The start solution.
Useful insights: 1) Generally, not many insights are needed for set-up. The most important decision is the cooling schedule. Some adaptive algorithms are available, but users sometimes need a trial-and-error phase to decide the parameters. When users have insight about the landscape of the objective function, which can help them decide how much randomness is appropriate, tuning is more efficient. 2) For combinatorial or constrained problems, efficient computation needs deeper thinking and insight about neighborhood construction and constraint representation. 3) Insight about good start solutions is useful. 4) How fast the technique converges at each temperature determines the appropriate iteration length. Relevant insight is helpful, although generally difficult to gain; one option is to run the technique for a period of time and gain insight from analysis of the running profile.

Genetic Algorithm
Set-up decisions: 1) Encoding. 2) Initial population. 3) Population size. 4) Crossover rules. 5) Mutation rules. 6) The fitness function (derived from the objective function). 7) How to handle constraints.
Useful insights: 1) User insight about how to design an encoding scheme and crossover rules that combine good features from parents to produce interesting individuals is essential for GA. This insight could be provided by domain experts who understand the problem structure well. Useful insights also include how design variables interact and influence each other, which users can gain from experience with the previous techniques. 2) Insight about the initial population is helpful. 3) Crossover and mutation rates need tuning. 4) Insight about how to select the next generation from the children produced, keeping good solutions while maintaining diversity; users may be able to gain it from visualization of the GA running profile.

Tabu Search
Set-up decisions: 1) Neighborhood construction. TS integrates AI into the search procedure, and thus explicitly needs expert insight in the search. 2) Tabu rules: the tabu object; the tabu list length; the tabu length; the aspiration criteria; the frequency rules. 3) The start solution.
Useful insights: 1) Neighborhood construction and candidate selection need user insight (as in designing other techniques), especially for combinatorial problems. 2) Deciding appropriate parameters requires insight about the problem structure. For example, the tabu_list_length parameter decides how much diversification the procedure has; insight about the problem structure can be helpful for tuning it. 3) Set-up goes beyond parameter tuning; it is strategy design and selection: what objects should be avoided? How should short-term and long-term memory be used? What information should be recorded in the history? When should entries in the list be "untabued" (aspiration criteria)? All these design decisions benefit from user insight. 4) Target analysis has been proposed to adjust strategies: users adjust strategies and parameters on a small-scale version of the problem to gain insight about the problem's characteristics, then use that insight at the larger scale. A trial-and-error procedure can be used too. 5) Insight about how the current set-up is working and where it fails can be gained at runtime, and helps reconfigure or redesign the technique's strategies [5]. 6) Insight about a good start solution.

Response Surface Methodology
Set-up decisions: 1) How to define/divide sub-regions. 2) Size of sub-regions. 3) Number of sample evaluations in a sub-region. 4) The initial sub-region.
Useful insights: RSM is straightforward to use. The important parameters a user needs to tune are how large a sub-region is and how many samples are needed in each sub-region. This tuning needs insight about how rugged the problem landscape is and how linear or nonlinear the problem is.

Stochastic Approximation
Set-up decisions: 1) Initial position. 2) Step size. 3) SPSA requires a user-generated perturbation vector.
Useful insights: Published guidelines provide insight into how to pick the coefficients in practical applications [34].

Random Search
Set-up decisions: 1) How to sample. 2) How many points to sample.
Useful insights: Though many parameters need to be set, set-up is easy for simple naive random search; even non-experts can set it up quickly. Defining an appropriate neighborhood is the key.

Clustering Method
Set-up decisions: 1) Selection of the sampling method. 2) Selection of the clustering method. 3) The distance function. 4) The threshold distance.
Useful insights: Set-up does not take long, although fine tuning requires insights. Insight about the importance of each design variable helps in setting up the distance function; insight about how flat or hilly sub-regions are helps in setting some thresholds appropriately.

Table 3. Comparison of techniques based on how to set up each technique, and the insight each technique needs and can benefit from during set-up and tuning.
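To make the simulated-annealing design decisions listed in Table 3 concrete, the following minimal sketch wires them together: a start solution, a neighborhood function, and a geometric cooling schedule with its initial temperature, decrement function, iterations per temperature, and Metropolis acceptance criterion. The toy objective, neighborhood, and parameter values are our own illustrative choices.

```python
import math, random

def simulated_annealing(objective, start, neighbor,
                        t0=1.0, alpha=0.95, iters_per_temp=50,
                        t_min=1e-3, seed=0):
    """Minimal SA loop assembling the Table 3 design decisions."""
    rng = random.Random(seed)
    x, fx = start, objective(start)
    best, fbest = x, fx
    t = t0
    while t > t_min:                      # stop criterion: temperature threshold
        for _ in range(iters_per_temp):   # iterations between temperature changes
            y = neighbor(x, rng)
            fy = objective(y)
            # Metropolis acceptance: always accept improvements,
            # accept uphill moves with probability exp(-delta/t)
            if fy <= fx or rng.random() < math.exp((fx - fy) / t):
                x, fx = y, fy
                if fx < fbest:
                    best, fbest = x, fx
        t *= alpha                        # geometric temperature decrement
    return best, fbest

# Toy 1-D objective with several local minima
obj = lambda x: math.sin(3 * x) + 0.1 * x * x
nbr = lambda x, rng: x + rng.uniform(-0.5, 0.5)
best, fbest = simulated_annealing(obj, start=4.0, neighbor=nbr)
```

Each of the parameters t0, alpha, iters_per_temp and t_min corresponds to one cooling-schedule decision from the table, which is where the tuning insight discussed above is applied.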
Columns of Table 4: intermediate results (when preempted) | possible insight the intermediate results may provide.

Simulated Annealing
Intermediate results: 1) All solutions the technique has evaluated. 2) The sub-regions and the different exploration powers used to explore them (sub-regions are explored under different temperatures).
Possible insight: 1) SAN examines sub-regions with different levels of randomness, which can provide insight into the structure of multiple sub-regions and into which sub-regions are promising. 2) For combinatorial problems, SAN has the special property that "configuration decisions tend to proceed in a logical order" [1]: a decision variable with greater influence on the output tends to be decided at an earlier stage of the procedure. This may help provide insight about which decision variables have more influence on the output and which have less. 3) SAN can escape from some local optima, so it can provide insight about whether an optimum is local.

Genetic Algorithm
Intermediate results: 1) The current population (which can be seen as a set of non-connected promising solutions). 2) Previous generations.
Possible insight: The special features of GA are that it is population based and that it has a special neighborhood construction scheme. 1) The population-based feature provides a large sample set, and helps users gain insight about interactions between design variables. 2) Insight about interactions between design variables can also come from observing crossover operations. 3) Good exploring power gives users a much broader picture of the landscape (and not necessarily a continuous one). 4) GA's neighborhood is not restricted to a geometric neighborhood around a single solution, so it can explore non-connected regions.

Tabu Search
Intermediate results: 1) All solutions the technique has evaluated. 2) Frequency of occurrence of solutions. 3) The current tabu list, or previous lists.
Possible insight: 1) The tabu list gives the technique more exploring power and a certain ability to escape from local optima. This can provide insight about whether an optimum is local and gives users a bigger picture of the landscape. 2) The frequency information can provide useful insight: solutions with high frequency might be buried in a valley and should be added to the tabu list; if they already have been, a bigger jump is needed to escape. 3) If the technique does not work well, that may indicate users should try stochastic methods, because tabu search depends largely on encoding search patterns the users have identified.

Response Surface Methodology
Intermediate results: 1) Linear regression models for several sub-regions. 2) At the last step of the algorithm, a final high-order model for a sub-region.
Possible insight: RSM using a downhill scheme cannot escape from local optima, but the method can be very helpful for providing insight. 1) It can be used to construct a meta-model that provides insight about the whole landscape; the meta-model can serve as a screening phase for other optimization techniques. 2) Insight about different sub-regions. 3) Insight about the effects of the decision variables on the output (a possible linear relationship), and about which decision variables are more influential in some local regions.

Stochastic Approximation
Intermediate results: A path of solutions.
Possible insight: 1) The search procedure shows a "guessed" optimal path in a sub-region, so users get to see a very limited, partial picture of the search space. 2) Some insight about how certain decision variables may influence the objective function, and/or how decision variables interact with each other to generate possible results.

Random Search
Intermediate results: Objective function values at random sampling points or regions (depending on how the neighborhood is defined in different algorithms).
Possible insight: 1) Users can identify promising regions. 2) Top-ranking points can be used as start points for subsequent algorithms.

Clustering Method
Intermediate results: The intermediate clusters.
Possible insight: 1) Some of the highest-ranking clusters can be treated as promising sub-regions. 2) The cluster centers and cluster radii can provide insight about sub-regions. 3) Clustering provides an opportunity to examine the design variables' influence on the output and the variables' interactions. 4) "The output from the process showing the evolution of clusters contains a lot of information hard to formalize but of great importance for practitioners solving a real world problem" [13].

Table 4. Comparison of techniques based on the intermediate results each technique can produce when preempted, and the insights these results can help users gain.
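The random-search entry in Table 4 notes that top-ranking sampled points can be handed to a follow-on technique as start solutions, which is the simplest form of the chaining this paper advocates. The sketch below illustrates that screening use; the toy objective, sampler, and parameter values are our own illustrative choices.

```python
import random

def random_search_screen(objective, sample, n_samples=500, top_k=5, seed=0):
    """Random-search screening: evaluate the objective at random points,
    then return the top-ranking points as intermediate results, e.g. as
    start solutions for a follow-on optimization technique."""
    rng = random.Random(seed)
    points = [sample(rng) for _ in range(n_samples)]
    ranked = sorted(points, key=objective)   # best (lowest) first
    return ranked[:top_k]                    # promising start points

# Toy 1-D objective with minimum at x = 2; uniform sampling over [-10, 10]
obj = lambda x: (x - 2.0) ** 2
starts = random_search_screen(obj, sample=lambda rng: rng.uniform(-10, 10))
```

The returned points cluster around the promising region near x = 2 and could seed, for example, a local technique such as stochastic approximation.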
Columns of Table 5: computation time | stop criteria.

Simulated Annealing
Computation time: The "main disadvantage is the computational time required to find solutions of a reasonably high quality" [12]. "A procedure based on exploring would be effective if the starting point is a solution that is 'close' to high quality solutions and if these solutions can be reached by the move mechanism that defines the neighborhood" [12].
Stop criteria: 1) Stop when the temperature drops below some pre-specified threshold. 2) Users can specify how many temperature iterations to run and how many iterations per temperature step. 3) Stop if the current optimum does not improve after a pre-specified number of iterations. 4) Statistically estimate the stopping temperature.

Genetic Algorithm
Computation time: "The main advantage of evolutionary approaches over those based on sampling the neighborhood of a single solution (e.g. simulated annealing) is that they are capable of exploring a larger area of the solution space with a smaller number of objective function evaluations" [12]. Efficiency depends highly on how much insight users exploit in the algorithm set-up.
Stop criteria: 1) The computation budget is used up. 2) Terminate when the convergence rate drops below a threshold.

Tabu Search
Computation time: Performance depends largely on the strategies the method uses. It is efficient for combinatorial problems, and comparable to or surpassing GA and SAN in many situations.
Stop criteria: 1) The computation budget is used up. 2) The current optimum has not been updated in some pre-set number of iterations, or the current optimum is good enough for the users.

Response Surface Methodology
Computation time: If applied blindly, its computational requirements are its shortcoming. But "it's relatively efficient compared to many gradient-based methods for simulation optimization when considering how many simulation experiments are needed" [17].
Stop criteria: 1) The meta-model is used mainly for screening. 2) The sequential model performs linear regression iteratively until it finds that a linear model is no longer sufficient; it then builds a second- or higher-order model in the last sub-region to estimate the optimum.

Stochastic Approximation
Computation time: Efficient even for high-dimension search. "Formal theoretical and numerical algorithmic comparisons of SPSA with other state-of-the-art optimization methods (simulated annealing, evolutionary computation, etc.) have generally shown SPSA to be competitive (and possibly more efficient) in terms of the overall cost of the optimization process [31]. This is especially the case when only noisy values of the objective function are available." [32]
Stop criteria: Commonly used criteria are to stop when it converges or when the computation budget is used up.

Random Search
Computation time: Normally not efficient.
Stop criteria: 1) The user-specified computation budget has been used up. 2) Good enough results have been found.

Clustering Method
Computation time: Clustering methods work better than multi-start local search because they are designed to avoid a main problem of multi-start local search: several different start points may end up in the same local minimum.
Stop criteria: 1) Only a single cluster is left. 2) No new minima are detected. 3) The regions corresponding to improving function values have negligible measure. 4) A rule based on the distribution of the number of expected minima. 5) The number of clusters stabilizes.

Table 5. Comparison of techniques based on the computation time of each technique and the common stop criteria each technique uses.
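As an illustration of one Table 5 stop criterion in use, the sketch below implements a minimal tabu search that halts when the incumbent best solution has not improved for a pre-set number of iterations. The short-term tabu list of recently visited solutions is the essential TS ingredient from Tables 3 and 4; the toy integer objective and neighborhood are our own illustrative choices.

```python
import random

def tabu_search(objective, start, neighbors, tabu_len=10, patience=50, seed=0):
    """Minimal tabu search: move to the best non-tabu neighbor (even if
    worse), keep a short-term memory of visited solutions, and stop when
    the best solution has not improved for `patience` iterations."""
    rng = random.Random(seed)
    x = start
    best, fbest = x, objective(x)
    tabu = [x]
    stale = 0
    while stale < patience:               # stop criterion from Table 5
        cands = [y for y in neighbors(x, rng) if y not in tabu]
        if not cands:
            break
        x = min(cands, key=objective)     # best admissible move, even uphill
        tabu.append(x)
        if len(tabu) > tabu_len:
            tabu.pop(0)                   # short-term memory only
        if objective(x) < fbest:
            best, fbest = objective(x) and (x, objective(x)) or (x, objective(x))
            best, fbest = x, objective(x)
            stale = 0
        else:
            stale += 1
    return best, fbest

# Toy problem over the integers: minimum at x = 12, neighbors are x +/- 1.
obj = lambda x: (x - 12) ** 2
nbrs = lambda x, rng: [x - 1, x + 1]
best, fbest = tabu_search(obj, start=0, neighbors=nbrs)
```

The tabu list prevents the search from immediately cycling back to the solution it just left, which is what lets it walk past the incumbent and probe the surrounding region before the patience criterion fires.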
Column of Table 6: suitable for.

Simulated Annealing
"Simulated annealing is a fairly robust and simple approach to constrained optimization, particularly when we are optimizing over combinatorial structures (permutations, graphs, subsets) instead of continuous functions." [10] "Combinatorial problems (circuit design, TSP)." [9] SAN is suitable for deterministic optimization problems; modified SAN algorithms for discrete-event simulation optimization problems have been proposed [11], and perturbation analysis techniques have also been introduced into SAN methods for stochastic problems. SAN is especially suitable for problems whose global optimum is needed but hidden among many poorer local minima; it is less likely to get stuck in a local optimum than local search. It is suitable when users look for high-quality solutions and have few constraints on time.

Genetic Algorithm
a. Real/discrete/complex input. b. High-dimensional problems. c. Search spaces containing multiple disconnected regions. d. Multi-objective problems. e. Highly nonlinear problems. f. GA is a general black-box optimizer; it can be used when users have no deep understanding of the problem. g. When users have insight into crossover rules that make children inherit good properties from parents. h. GA is suitable for chained use with other techniques.
Shortcomings: 1) Some problems are difficult to encode in a GA representation, and some constraints can be difficult to encode into the representation. 2) It takes a long time to run.

Tabu Search
"The Tabu Search has traditionally been used on combinatorial optimization problems. The technique is straightforwardly applied to continuous functions by choosing a discrete encoding of the problem. Many of the applications in the literature involve integer programming problems, scheduling, routing, traveling salesman and related problems." [9] 1) When users want to avoid local minima. 2) When the search process exhibits repetitive patterns on the problem (the same exploration points or areas are visited repeatedly), switching to TS can be an alternative. 3) When users can identify certain search patterns to avoid. 4) Combinatorial problems, where it is sometimes easier for humans to extract patterns and design strategies to be exploited in the technique. 5) Problems with complicated constraints.

Response Surface Methodology
RSM is a fairly general method. Sequential RSM using regression is one of the most established forms of simulation optimization found in the research literature. It is suitable for building meta-models that let users learn the landscape and discover promising regions; the meta-models can be used as a screening phase for other optimization techniques. It does not provide good answers for complex functions with sharp ridges and flat valleys [16].

Stochastic Approximation
1) Stochastic optimization. 2) When the gradient cannot be evaluated directly or efficiently, especially for high-dimensional problems.
Disadvantages: 1) It may get stuck in a local optimum, though there are ways to extend the technique to a global optimizer. 2) Like steepest descent, it can be inefficient for functions with a "long narrow valley" structure.

Random Search
1) Simple, general and robust (not sensitive to local irregularities of the objective function). 2) The search guarantees convergence with probability 1. 3) It is not efficient and thus rarely used alone, but it is suitable for combination with other methods.

Clustering Method
Efficiently combines global search and local search. Not very sensitive to the number of existing local minima. "Clustering methods have been developed for optimizing unconstrained functions over reals. These methods assume that the objective function is relatively inexpensive since many points are randomly sampled to identify the clusters." [9]

Table 6. The different situations each technique is suitable for.
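The sequential RSM procedure referred to above centers on one repeated move: sample the objective within a sub-region, fit a first-order (linear) response surface by least squares, and step along the fitted steepest-descent direction. The sketch below illustrates that move; the test function, sub-region size, sample count, and step length are our own illustrative choices.

```python
import numpy as np

def rsm_step(objective, center, radius, n_samples=20, step=0.5, rng=None):
    """One sequential-RSM move: sample the objective in a local sub-region,
    fit a first-order response surface by least squares, then step along
    the fitted steepest-descent direction."""
    rng = rng or np.random.default_rng(0)
    X = center + rng.uniform(-radius, radius, size=(n_samples, center.size))
    y = np.array([objective(x) for x in X])
    # design matrix with intercept: y ~ b0 + b . x
    A = np.hstack([np.ones((n_samples, 1)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    grad = coef[1:]                 # fitted slope = local gradient estimate
    return center - step * grad / (np.linalg.norm(grad) + 1e-12)

# Toy 2-D quadratic with minimum at (1, -2); start from (4, 3).
obj = lambda x: float((x[0] - 1) ** 2 + (x[1] + 2) ** 2)
c = np.array([4.0, 3.0])
for _ in range(15):
    c = rsm_step(obj, c, radius=0.5)
```

In full sequential RSM the procedure would also shrink the sub-region and, once the linear fit is no longer adequate, fit a second-order model to estimate the optimum, as described in Table 5.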