School of Computer Science. Andrea Soltoggio. Function optimisation .... the best algorithm that would work best on average on a large number of test functions.
Function optimisation - I
A minimisation problem is described as a pair $(S, f)$, where $S \subseteq \mathbb{R}^n$ is a bounded set and $f : S \to \mathbb{R}$. The problem is to find a point $x_{min} = (x_1, x_2, \ldots, x_n)_{min} \in S$ such that $f(x_{min})$ is a global minimum:
$\forall x \in S : f(x_{min}) \le f(x)$
• Each problem may have a different dimension ($n = 1 \ldots j$)
• The physics of the problem may introduce constraints, for example $x_1 \le x_2$ or $(x_1 + x_2 + x_3) \le k$
Function optimisation - II
Example: $f(x) = 2x^2 + x - 3$ with $S = [-4, 4]$
Function optimisation - III
Example: $f(x, y) = x^2 + y^2/2 + 10 \sin(x - y)$ with $S = [-4, 4] \times [-4, 4]$
[Figure: plot of f with the global minimum marked]
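When no closed-form solution is at hand, one crude way to locate the global minimum of the two-dimensional example is to evaluate it on a regular grid over S. The sketch below is only an illustration: the function name `grid_search` and the grid resolution are assumptions, not part of the slides, and the result is accurate only up to the grid spacing.

```python
import math

def f(x, y):
    """Example objective from the slide: x^2 + y^2/2 + 10 sin(x - y)."""
    return x**2 + y**2 / 2 + 10 * math.sin(x - y)

def grid_search(lo=-4.0, hi=4.0, steps=81):
    """Evaluate f on a steps x steps grid over [lo, hi]^2 and return (value, x, y) of the best point."""
    best = None
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        for j in range(steps):
            y = lo + (hi - lo) * j / (steps - 1)
            value = f(x, y)
            if best is None or value < best[0]:
                best = (value, x, y)
    return best

best_value, best_x, best_y = grid_search()
print(best_value, best_x, best_y)
```

The cost grows as steps^n with the dimension n, which is one reason exhaustive search is not an option for larger problems.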
Finding the minimum
$f(x) = x^2 + 2x - 3$, $S = [-4, 4]$
Setting $f'(x) = 2x + 2 = 0$ gives the minimum: $x = -1$, $f(x) = -4$
When the function has a simple analytical expression, there are mathematical procedures and tools to obtain the global minimum very efficiently.
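For a convex quadratic, the analytical procedure reduces to finding the stationary point and clamping it to the interval. A minimal sketch, assuming a > 0 (the helper name `quadratic_minimum` is hypothetical):

```python
def quadratic_minimum(a, b, c, lo, hi):
    """Minimum of a*x^2 + b*x + c on [lo, hi], assuming a > 0 (convex parabola)."""
    if a <= 0:
        raise ValueError("this sketch only handles a > 0")
    x_star = -b / (2 * a)             # stationary point from f'(x) = 2ax + b = 0
    x_min = min(max(x_star, lo), hi)  # clamp to the interval; valid because f is convex
    return x_min, a * x_min**2 + b * x_min + c

x_min, f_min = quadratic_minimum(1, 2, -3, -4, 4)
print(x_min, f_min)  # -1.0 -4.0
```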
Mutation
Offspring is created by adding a mutation vector to the parent.
With bit encoding (the addition is bitwise, modulo 2):
Parent’s genotype:   1 0 0 1 1 1 0 1 0
Mutation vector:     0 0 1 0 0 0 0 1 0
Offspring genotype:  1 0 1 1 1 1 0 0 0
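With bit encoding, adding the mutation vector amounts to a bitwise XOR: a 1 in the mutation vector flips the corresponding bit of the parent. The following sketch reproduces the example above; the function name `mutate_bits` is an assumption made for illustration.

```python
def mutate_bits(parent, mutation):
    """Return the offspring obtained by XOR-ing each parent bit with the mutation bit."""
    if len(parent) != len(mutation):
        raise ValueError("parent and mutation vector must have the same length")
    return [p ^ m for p, m in zip(parent, mutation)]

parent = [1, 0, 0, 1, 1, 1, 0, 1, 0]
mutation = [0, 0, 1, 0, 0, 0, 0, 1, 0]
offspring = mutate_bits(parent, mutation)
print(offspring)  # [1, 0, 1, 1, 1, 1, 0, 0, 0]
```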
Mutation - II
Real-valued encoding
Parent’s genotype:   1.231  6.789  3.914
Mutation vector:     0.032  0.455  -0.362
Offspring genotype:  1.263  7.244  3.552
If mutation is the only operator, its purpose is to differentiate the offspring from the parent without destroying or excessively altering the parent’s characteristics. Therefore, small alterations should be more likely than large ones.
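The real-valued case can be sketched as element-wise addition. The first helper reproduces the numbers above; the second draws a mutation vector from a zero-mean Gaussian, one common way to make small alterations more likely than large ones. Both function names and the step size `sigma` are assumptions for illustration.

```python
import random

def mutate_real(parent, mutation):
    """Offspring = parent + mutation vector, element by element."""
    return [p + m for p, m in zip(parent, mutation)]

def gaussian_mutation_vector(n, sigma, rng):
    """Draw n independent zero-mean Gaussian steps with standard deviation sigma."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]

parent = [1.231, 6.789, 3.914]
offspring = mutate_real(parent, [0.032, 0.455, -0.362])
print([round(v, 3) for v in offspring])  # [1.263, 7.244, 3.552]

# A random offspring; sigma = 0.1 is an arbitrary illustrative step size.
rng = random.Random(0)
print(mutate_real(parent, gaussian_mutation_vector(len(parent), 0.1, rng)))
```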
Mutation - distributions
Examples of probability distributions:
• Uniform
• Gaussian
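To see how the two distributions differ as mutation operators, the sketch below samples steps from each with the same standard deviation and compares how often small steps occur and how large the biggest step is. The helper names, the sample size, and the 0.5 cut-off are illustrative assumptions.

```python
import math
import random

def sample_uniform(rng, sigma):
    """Uniform step on [-a, a], with a chosen so that the standard deviation equals sigma."""
    a = sigma * math.sqrt(3.0)  # Var(U[-a, a]) = a^2 / 3
    return rng.uniform(-a, a)

def sample_gaussian(rng, sigma):
    """Zero-mean Gaussian step with standard deviation sigma."""
    return rng.gauss(0.0, sigma)

def fraction_small(samples, limit):
    """Fraction of samples whose magnitude is below limit."""
    return sum(1 for s in samples if abs(s) < limit) / len(samples)

rng = random.Random(0)
sigma = 1.0
uniform_steps = [sample_uniform(rng, sigma) for _ in range(20_000)]
gaussian_steps = [sample_gaussian(rng, sigma) for _ in range(20_000)]

print("small uniform steps: ", fraction_small(uniform_steps, 0.5))
print("small gaussian steps:", fraction_small(gaussian_steps, 0.5))
print("largest uniform step: ", max(abs(s) for s in uniform_steps))
print("largest gaussian step:", max(abs(s) for s in gaussian_steps))
```

At equal variance, the Gaussian concentrates more mass near zero yet still allows occasional large jumps, whereas the uniform distribution has a hard bound on the step size.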
Benchmark functions for EAs
Test functions for EAs should have the following characteristics:
- The global minimum should be known
- The function should be easily computed
- The function is recognised to have certain characteristics
A common classification of benchmark functions is:
- Unimodal (Sphere function)
- Discontinuous (for instance Step function)
- Multimodal with few local minima
- Multimodal with an exponential number of local minima
Test functions: Sphere
Example of unimodal function: $f(x, y) = x^2 + y^2$
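The sphere function generalises directly to n dimensions as the sum of squared coordinates. A minimal sketch, with the n-dimensional form being the usual generalisation rather than something stated on the slide:

```python
def sphere(x):
    """Sphere function: sum of squares, with global minimum 0 at the origin."""
    return sum(xi * xi for xi in x)

print(sphere([0.0, 0.0]))  # 0.0
print(sphere([1.0, 2.0]))  # 5.0
```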
Test functions: Schwefel
Example of unimodal function: $f(x) = \left(\max_i |x_i|\right)^2$
Test functions: Schwefel problem
Multimodal: f(x) =
Test functions: Rastrigin’s function
Multimodal: f(x) =
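The slide's own formula is not preserved in this text. A commonly used definition of Rastrigin's function is $f(x) = 10n + \sum_{i=1}^{n} \left(x_i^2 - 10 \cos(2\pi x_i)\right)$, usually evaluated on $[-5.12, 5.12]^n$; the sketch below assumes that form, which may differ in detail from the one shown in the lecture.

```python
import math

def rastrigin(x):
    """Commonly used Rastrigin form (assumed): 10n + sum(x_i^2 - 10 cos(2 pi x_i))."""
    n = len(x)
    return 10 * n + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

print(rastrigin([0.0, 0.0]))  # 0.0, the global minimum
print(rastrigin([1.0, 1.0]))  # close to 2, a nearby local minimum region
```

The cosine term superimposes a regular pattern of local minima on a sphere-like bowl, so the number of local minima grows exponentially with the dimension.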
Test functions: Ackley’s function
Multimodal: f(x) =
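The slide's own formula is not preserved in this text. A commonly used definition of Ackley's function is $f(x) = -20 \exp\left(-0.2 \sqrt{\tfrac{1}{n} \sum_i x_i^2}\right) - \exp\left(\tfrac{1}{n} \sum_i \cos(2\pi x_i)\right) + 20 + e$; the sketch below assumes that form and those constants, which may differ from the lecture's version.

```python
import math

def ackley(x):
    """Commonly used Ackley form (assumed constants 20, 0.2 and 2*pi)."""
    n = len(x)
    mean_square = sum(xi * xi for xi in x) / n
    mean_cos = sum(math.cos(2 * math.pi * xi) for xi in x) / n
    return -20 * math.exp(-0.2 * math.sqrt(mean_square)) - math.exp(mean_cos) + 20 + math.e

print(ackley([0.0, 0.0]))  # approximately 0, the global minimum
print(ackley([1.0, 1.0]))
```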
Test functions: Butterfly function
Multimodal: $f(x, y) = \dfrac{(x^2 - y^2) \sin(x + y)}{x^2 + y^2}$
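The butterfly function is undefined at the origin, where both numerator and denominator vanish. The sketch below returns 0 there, which is the limit of the expression as (x, y) approaches the origin; treating the origin this way is an assumption made for the example, not something stated on the slide.

```python
import math

def butterfly(x, y):
    """(x^2 - y^2) sin(x + y) / (x^2 + y^2), with the value at the origin set to its limit, 0."""
    denominator = x * x + y * y
    if denominator == 0.0:
        return 0.0  # assumed convention: use the limit value at (0, 0)
    return (x * x - y * y) * math.sin(x + y) / denominator

print(butterfly(0.0, 0.0))  # 0.0
print(butterfly(1.0, 0.0))  # equals sin(1)
print(butterfly(1.0, 1.0))  # 0.0, because x^2 - y^2 = 0 on the diagonal
```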
Benchmark functions - Ridges
f(x,y) =
R. Salomon (1995) points out a drastic drop in EA performance when rotated ridges are present.
Benchmark functions - Ridges - II
f(x,y) =
Benchmark functions - Weierstrass
f(x) =
Benchmark functions - Weierstrass
Testing the algorithms
Performance can be evaluated as
1. Number of evaluations to reach a given threshold
2. Optimised value after a given number of evaluations
Other important aspects
1. Consistency of the results over many experiments
2. Robustness over a set of problems
3. Dynamics of optimisation during the generations
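The first performance measure, and the consistency check over repeated experiments, can be sketched with a very simple mutation-only search on the sphere function. Everything below is an illustrative assumption: the (1+1)-style acceptance rule, the fixed Gaussian step size, the starting point, the threshold, and the evaluation budget.

```python
import random

def sphere(x):
    """Sphere test function: sum of squares."""
    return sum(xi * xi for xi in x)

def evaluations_to_threshold(f, start, sigma, threshold, budget, rng):
    """Count evaluations until f <= threshold; return None if the budget runs out first."""
    parent = list(start)
    parent_fitness = f(parent)
    evaluations = 1
    while parent_fitness > threshold and evaluations < budget:
        child = [xi + rng.gauss(0.0, sigma) for xi in parent]  # Gaussian mutation
        child_fitness = f(child)
        evaluations += 1
        if child_fitness <= parent_fitness:  # keep the child only if it is no worse
            parent, parent_fitness = child, child_fitness
    return evaluations if parent_fitness <= threshold else None

# Repeat with different seeds to look at the consistency of the result.
counts = [
    evaluations_to_threshold(sphere, [3.0, 3.0], 0.1, 1e-2, 10_000, random.Random(seed))
    for seed in range(10)
]
print(counts)
```

The spread of the values in `counts` gives a first impression of consistency; swapping `sphere` for other test functions would probe robustness over a set of problems.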
No free lunch
During the eighties, researchers in EC were trying to find the algorithm that would work best on average on a large number of test functions. In 1995, Wolpert and Macready from the Santa Fe Institute published a technical report titled “No free lunch theorems for search”:
“…for any algorithm, any elevated performance over one class of problems is exactly paid for in performance over another class”
Main messages
• There are different kinds of evolutionary algorithms for parameter optimisation, implementing minor or major differences
• The mutation operator can adopt different probability distributions (e.g. Gaussian, Cauchy...)
• Fitness landscapes can look very different, demanding the application of different mutation strengths and distributions
• An algorithm particularly tuned to solve a problem might perform less well on a different fitness landscape (no free lunch)
Suggested reading
T. Bäck and H.-P. Schwefel, “An overview of evolutionary algorithms for parameter optimization”, Evolutionary Computation, vol. 1, no. 1, pp. 1-23 (1993)