DIRECT SEARCH FOR OPTIMAL PARAMETERS WITHIN SIMULATION MODELS

Hans-Paul Schwefel
Nuclear Research Centre Juelich, F.R.G. (KFA)
Programme Group of Systems Analysis and Technological Development (STE)
Abstract. The tool of systems simulation can be improved by superposing optimization techniques onto the computer model of the object or system investigated. Direct search methods for parameter optimization are applicable in a much wider field than any other technique. Outstanding among such partly heuristic methods is one that is based on principles of organic evolution. The results of a comprehensive test program in which all important algorithms have been compared reveal the superiority of this evolution strategy. Reference is made to examples of actual application.
INTRODUCTION

A model of a system or an object is devised where it is desirable to know its performance or behaviour in future or under certain conditions and where experimenting on the actual object is precluded for cost or other reasons. This is the case e.g. in biology, medical science, and socioeconomy. Here the problems always involve highly complex relationships; hence, they usually cannot be formulated in terms of direct cause-and-effect relations requiring attendance to a multitude of details. Contrary to physical or engineering models, behavioural and statistical relations between system quantities have to be taken into account. This explains why any model is necessarily confined to certain aspects of the overall system and greatly depends on the question that has to be answered.

Computerized models have preference over imaginary models. They call for a consistent image of reality on any level of aggregation and for an even quantitative disclosure of all assumptions. The results are always reproducible. When finally all important relations between system quantities have been formulated, a model is available for equivalent experimenting. Simple questions such as "what happens if ..." can be answered in the format "if ..., then ...". System quantities appearing at the "if" end are the independent variables $x = \{x_1, x_2, \ldots, x_n\} = \{x_i;\ i = 1, 2, \ldots, n\}$ of the system that are sometimes variable within limits only.
The other end contains the dependent variable(s) $F(x)$ or $\{F_1(x), F_2(x), \ldots\}$. The number $n$ of the parameters $x_i$ determines the multiplicity of possible variations that rapidly becomes unmanageable with increasing $n$. In such situations, no progress can be achieved unless the original question is reversed into "what should be done to achieve a desirable result?".
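To give a sense of how quickly the number of possible variations grows, the following minimal sketch counts the "what happens if" runs needed to try every combination when each of the n parameters is tested at a fixed number of levels. The choice of 10 levels per parameter and the name `count_variations` are illustrative assumptions, not taken from the text.

```python
def count_variations(levels, n):
    """Number of model runs needed to try every combination of
    `levels` trial values for each of `n` independent parameters."""
    return levels ** n


if __name__ == "__main__":
    # Assumed for illustration: 10 trial values per parameter.
    for n in (1, 2, 5, 10, 20):
        print(n, count_variations(10, n))
```

Even at this coarse resolution the count grows exponentially with n, which is why exhaustive forward experimentation soon becomes impractical.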
THE SEARCH FOR DESIGNATED STATES OF THE SIMULATED SYSTEM

This reversal of the direction of information nearly always leads to arithmetic problems, especially where several parameter values are searched for. In the simplest case, a quantitatively known result is desirable. In mathematical terms, the specification is

$$ F(x) \overset{!}{=} F_{\mathrm{set}} $$
Now the problem assumes the form of an equation or a set of (simultaneous) equations; however, exact approaches to solution, e.g. the Gaussian elimination, exist only in the case of purely linear or first-degree relations between the parameters $x$ and $F$. In other cases, an iterative or step-by-step search with variation of $x$ is required in which the difference $\| F(x) - F_{\mathrm{set}} \|$ is gradually reduced. Depending on the norm $\| \cdot \|$ for the weighting of the deviation, standard methods are available for the approximative solution of the problem. After a finite number of arithmetic operations, the solution is obtained. This solution is not in all cases exact, however, but rather a sufficiently accurate approximation as can be achieved with the computer employed and its limited computing accuracy.

The question for the best of all possible solutions without quantitative specification of the latter is also permissible, but only where it is possible to indicate an objective function $F(x)$ that is to assume an extreme (minimum or maximum) value. This specification for an undetermined value is sometimes expressed as

$$ F(x) \to \mathrm{extr.} \quad (\min.\ \text{or}\ \max.) $$
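For the target-value specification, the step-by-step search can be sketched in the simplest one-parameter case by bisection, which narrows a bracket around the solution until the deviation from the set value is small enough. The model `simulate`, the function name `search_for_target`, the bracket and the tolerance are hypothetical choices for illustration; the sketch further assumes that the model is continuous and increasing on the bracket.

```python
def simulate(x):
    """Hypothetical stand-in for a one-parameter simulation model."""
    return x ** 3 + x


def search_for_target(model, target, lo, hi, tol=1e-9, max_iter=200):
    """Find x in [lo, hi] with |model(x) - target| <= tol by bisection.

    Assumes model is continuous and increasing on [lo, hi] and that
    model(lo) <= target <= model(hi).
    """
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        deviation = model(mid) - target
        if abs(deviation) <= tol:
            return mid
        if deviation < 0:
            lo = mid   # result too small: solution lies to the right
        else:
            hi = mid   # result too large: solution lies to the left
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    x = search_for_target(simulate, target=10.0, lo=0.0, hi=5.0)
    print(x, simulate(x))
```

As the text notes, the result is an approximation whose accuracy is limited by the tolerance and by floating-point arithmetic, not an exact solution.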
This is now a genuine optimization problem where only an iterative approach to the solution can be applied unless the relation $F(x)$ is so simple that the necessary optimality conditions

$$ F_x(x) = \left\{ \frac{\partial F(x)}{\partial x_i};\ i = 1, 2, \ldots, n \right\} = 0 $$

(i.e. all first-order partial derivatives of the objective function should become zero) can be handled with one of the standard methods for the solution of sets of equations and certain sufficient conditions are still met.
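As a worked illustration of the necessary conditions, consider a hypothetical quadratic objective in two parameters (the function is chosen here for illustration only). Setting both first-order partial derivatives to zero yields a linear set of equations that can be solved exactly, e.g. by Gaussian elimination; the positive definite matrix of second derivatives serves as the sufficient condition for a minimum.

```latex
\begin{align*}
F(x) &= x_1^2 + x_1 x_2 + x_2^2 - 3 x_1 - 3 x_2 \\
\frac{\partial F}{\partial x_1} &= 2 x_1 + x_2 - 3 = 0 \\
\frac{\partial F}{\partial x_2} &= x_1 + 2 x_2 - 3 = 0 \\
\Rightarrow\quad x_1 &= x_2 = 1, \qquad F(1, 1) = -3 \\
\nabla^2 F &= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix},
\quad \text{eigenvalues } 1 \text{ and } 3 > 0 \ \Rightarrow\ \text{minimum}
\end{align*}
```

For objectives that are not of such a simple form, the conditions become non-linear equations and an iterative approach is again needed.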
THE PREREQUISITES OF OPTIMIZING

Before the various procedures to find an optimum are discussed, a reference seems appropriate to the case where the objective cannot be represented by only one dependent variable $F(x)$, but is described by several partial objectives $F_j(x)$. In this case, either the partial objectives have to be interrelated, e.g.

$$ F(x) = F_1(x) / F_2(x) $$

or they have to be weighted by factors $w_j$, e.g.

$$ F(x) = \sum_j w_j F_j(x), \quad w_j \geq 0 $$

or one of them has to be selected while the other objectives have to be reformulated into constraints as, for instance,

$$ F(x) = F_1(x); \quad G_j = F_{j+1} - c_j \gtrless 0 $$
with $c_j$ as the upper or lower bounds. In doing so, compromises are frequently necessary and if this procedure fails, no optimization is possible. The identification of the independent variables (parameters) and the selection of an unambiguous objective function are the indispensable prerequisites of any optimization.

This may be illustrated by an example from energy economy: energy is to be provided for a region at lowest cost, but at the same time with the least possible emissions of pollutants. Considering such controversial partial objectives, it might be advisable to first determine the pure cost minimum, then the environment-loading minimum, and finally look for intermediate solutions, minimizing the cost with different constraints to the emissions. In some cases, the objective can be varied to arrive at a family of solutions on which a decision is taken. More details of multiple criteria decision making may be found e.g. in Keeney and Raiffa, 1976, ref. 1.

CHOOSING AN APPROPRIATE OPTIMIZATION ALGORITHM

Apart from methods particularly suitable to find minima and maxima for time-dependent variables or those that may assume only discrete values (these problems will not be discussed here), an almost unsurveyable multitude of methods are already available for the continuous parameter optimization problem. Most widely used is probably the linear optimization or Linear Programming that makes use of the at least piece-wise linearity of the relations $F(x)$ and $G_j(x)$. It guarantees to find the optimum in a finite number of steps and can be employed at reasonable cost even for large problems. However, the computed best solution may be quite unusable in reality, e.g. where the relation $F(x)$ was originally non-linear and was impermissibly simplified in the model. While the optimum of a linear program is always located in a corner of the polyhedron made up of the restrictions, the optimum of a non-linear problem may well be found in the interior of the feasible area.
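The contrast between a corner optimum and an interior optimum can be made concrete with a small sketch. The unit square as feasible area, the two objective functions and all names below are assumptions chosen for illustration, and the plain grid scan merely stands in for a proper solver (it is not the simplex method of Linear Programming).

```python
def grid_minimum(f, steps=100):
    """Scan the unit square 0 <= x1, x2 <= 1 on a regular grid and
    return (point, value) of the smallest objective value found."""
    best_point, best_value = None, None
    for i in range(steps + 1):
        for j in range(steps + 1):
            point = (i / steps, j / steps)
            value = f(point)
            if best_value is None or value < best_value:
                best_point, best_value = point, value
    return best_point, best_value


def linear_objective(x):
    # Hypothetical linear objective: its minimum lies in a corner.
    return -x[0] - 2.0 * x[1]


def nonlinear_objective(x):
    # Hypothetical quadratic objective: its minimum lies in the interior.
    return (x[0] - 0.3) ** 2 + (x[1] - 0.6) ** 2


if __name__ == "__main__":
    print(grid_minimum(linear_objective))
    print(grid_minimum(nonlinear_objective))
```

If the quadratic relation were replaced by a linear approximation, the computed optimum would be pushed to a corner of the square, which illustrates how an impermissible simplification can make the result unusable.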
The extension of linear programming to convex or concave, usually purely quadratic problems yields only gradual improvements of the area of application. It presupposes unimodality, that is, the existence of several local optima has to be excluded; moreover, such an extension calls for the specification of the partial derivatives $F_x(x)$, mostly in analytical form, and their continuity.

Frequently the simulation model can be specified in algorithmic form only; derivatives are not available, nothing can be stated about the topology, or discontinuities are known to exist. Then only direct search strategies (hill-climbing methods) can be applied. These are partly of heuristic nature and there is no theoretically founded guarantee for the convergence to the (absolute) optimum. But they have proven to yield practical solutions even when other methods fail. They include quite a number of different strategy concepts. A potential user would therefore profit from their numerical comparison with respect to reliability and cost, based on many test models. Such comparison has been made (Schwefel, 1977, ref. 2) and has comprised the strategies listed in Table 1.
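To illustrate what a direct search looks like when only function values are available, here is a minimal sketch of a coordinate-wise search with step halving. It is a generic illustration and not an implementation of any specific strategy compared in the paper; the test model `jumpy_model` (with a kink in each variable and a jump at x1 = 0), the function name `coordinate_search` and all default values are assumptions.

```python
def jumpy_model(x):
    """Hypothetical objective given only as an algorithm: it has kinks
    (absolute values) and a jump discontinuity at x[0] = 0."""
    penalty = 1.0 if x[0] < 0.0 else 0.0
    return abs(x[0] - 1.0) + abs(x[1] + 2.0) + penalty


def coordinate_search(f, x0, step=1.0, min_step=1e-6, max_evals=10000):
    """Derivative-free search: try +/- step along each coordinate,
    keep any improvement, and halve the step when nothing improves."""
    x = list(x0)
    fx = f(x)
    evals = 1
    while step > min_step and evals < max_evals:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                f_trial = f(trial)
                evals += 1
                if f_trial < fx:
                    x, fx = trial, f_trial
                    improved = True
                    break
        if not improved:
            step *= 0.5
    return x, fx, evals


if __name__ == "__main__":
    print(coordinate_search(jumpy_model, [-3.0, 3.0]))
```

No derivative is ever requested, so the kinks and the jump do not prevent the search from proceeding; as stated in the text, however, such a scheme carries no general guarantee of reaching the absolute optimum.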
Table 1: List of Strategies Compared

Code  Description of strategy or variant
FIBO  Univariate strategy with Fibonacci search
GOLD  Univariate strategy with golden-section search
LAGR  Univariate strategy with Lagrange interpolation
HOJE  Strategy of Hooke and Jeeves (pattern search)
DSCG  Strategy of Davies, Swann, and Campey with Gram-Schmidt orthonormalization
DSCP  Strategy of Davies, Swann, and Campey with Palmer orthonormalization
POWE  Strategy of Powell (conjugate directions)
DFPS  Strategy of Davidon, Fletcher, and Powell (variable metric) modified by Stewart
SIMP  Simplex strategy of Nelder and Mead
ROSE  Strategy of Rosenbrock (rotating coordinates)
COMP  Complex strategy of M.J. Box
EVOL  (1+1) evolution strategy
GRUP  (10,100) evolution strategy without recombination
REKO  (10,100) evolution strategy with recombination
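For readers unfamiliar with the evolution strategies in the table, the following is a minimal sketch of a (1+1) scheme: one parent produces one mutated descendant per step, and the better of the two survives. The text does not describe the step-size control, so the per-trial variant of the 1/5 success rule used here, the factors 1.5 and 1.5^(-1/4), the sphere test function and all names are assumptions for illustration and need not match the EVOL code tested in the comparison.

```python
import random


def sphere(x):
    """Simple quadratic test function (assumed here for illustration)."""
    return sum(xi * xi for xi in x)


def one_plus_one_es(f, x0, sigma=1.0, iterations=3000, seed=1):
    """(1+1) evolution strategy sketch with an assumed 1/5 success rule.

    The parent is mutated with normally distributed steps of size sigma;
    the descendant replaces the parent if it is not worse. sigma grows
    after a success and shrinks after a failure so that roughly one in
    five mutations succeeds.
    """
    rng = random.Random(seed)
    parent = list(x0)
    f_parent = f(parent)
    for _ in range(iterations):
        child = [xi + sigma * rng.gauss(0.0, 1.0) for xi in parent]
        f_child = f(child)
        if f_child <= f_parent:
            parent, f_parent = child, f_child
            sigma *= 1.5             # success: try larger steps
        else:
            sigma *= 1.5 ** -0.25    # failure: try smaller steps
    return parent, f_parent


if __name__ == "__main__":
    best, f_best = one_plus_one_es(sphere, [5.0] * 5)
    print(f_best)
```

The (10,100) variants in the table differ in that a population of parents produces many descendants per generation, which is not shown in this sketch.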
COMPARISON AMONG DERIVATIVE-FREE OPTIMIZATION PROCEDURES

Fig. 1 shows the computing time for a quadratic test problem vs. number of parameters. As long as the number of variables is small, the results are highly non-uniform. This explains the controversial evidence of many early investigations. The trends become very clear, however, where the number of parameters is large enough; these trends are indicated by the extended lines plotted.
[Fig. 1: Computing time for a quadratic test problem vs. number of parameters. Recoverable curve labels: COMP, SIMP, DSCG.]