Function optimisation

School of Computer Science
Andrea Soltoggio
Function optimisation - I
A minimisation problem is described as a pair (S, f), where S ⊆ Rⁿ is a bounded subset of Rⁿ and f : S → R. The problem is to find a point x_min = (x₁, x₂, …, xₙ) ∈ S such that f(x_min) is a global minimum:

∀x ∈ S : f(x_min) ≤ f(x)
• Each problem may have a different dimension (n = 1…j)
• The physics of the problem may introduce constraints: for example x₁ ≤ x₂ or (x₁ + x₂ + x₃) ≤ k


Function optimisation - II
Example: f(x) = 2x² + x - 3 with S : [-4,4]


Function optimisation - III
Example: f(x,y) = x² + y²/2 + 10 sin(x-y) with S : [-4,4] × [-4,4]
(figure: surface plot with the global minimum marked)


Finding the minimum
f(x) = x² + 2x - 3, S : [-4,4]
Minimum: x = -1, f(x) = -4

When the function has a simple analytical expression, there are mathematical procedures and tools to obtain the global minimum very efficiently.
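The analytical route can be sketched in a few lines: setting f'(x) = 2x + 2 = 0 gives x = -1, and a coarse grid search over S confirms it (a minimal Python sketch; the function and interval are the slide's example):

```python
def f(x):
    return x**2 + 2*x - 3

# Analytical minimum: f'(x) = 2x + 2 = 0  =>  x = -1
print(f(-1.0))  # -4.0

# Numerical confirmation: coarse grid search over S = [-4, 4]
grid = [-4 + i * 0.001 for i in range(8001)]
x_best = min(grid, key=f)
print(round(x_best, 3), round(f(x_best), 3))  # -1.0 -4.0
```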


Mutation
Offspring is created by adding a mutation vector to the parent.

With bit encoding:
Parent's genotype:  1 0 0 1 1 1 0 1 0
Mutation vector:    0 0 1 0 0 0 0 1 0
Offspring genotype: 1 0 1 1 1 1 0 0 0
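For bit strings, "adding" the mutation vector amounts to an XOR: a 1 in the mutation vector flips the corresponding parent bit. A minimal sketch reproducing the slide's example (the function name is illustrative):

```python
def mutate_bits(parent, mutation):
    """Flip each parent bit wherever the mutation vector holds a 1 (XOR)."""
    return [p ^ m for p, m in zip(parent, mutation)]

parent   = [1, 0, 0, 1, 1, 1, 0, 1, 0]
mutation = [0, 0, 1, 0, 0, 0, 0, 1, 0]
print(mutate_bits(parent, mutation))  # [1, 0, 1, 1, 1, 1, 0, 0, 0]
```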


Mutation - II
Real-valued encoding:
Parent's genotype:  1.231  6.789  3.914
Mutation vector:    0.032  0.455 -0.362
Offspring genotype: 1.263  7.244  3.552

If mutation is the only operator, its purpose is to differentiate the offspring without destroying or excessively altering the parent's characteristics. Therefore, small alterations should be more likely than large ones.
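In real-valued encoding the mutation vector is simply added component-wise; drawing it from a zero-mean Gaussian makes small alterations more likely than large ones, as required. A minimal sketch (the sigma value is an illustrative choice), checked against the slide's numbers:

```python
import random

def mutate_real(parent, sigma=0.1):
    """Offspring = parent + zero-mean Gaussian noise on every gene."""
    return [x + random.gauss(0.0, sigma) for x in parent]

# The slide's example, with the mutation vector fixed:
parent   = [1.231, 6.789, 3.914]
mutation = [0.032, 0.455, -0.362]
offspring = [round(p + m, 3) for p, m in zip(parent, mutation)]
print(offspring)  # [1.263, 7.244, 3.552]
```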


Mutation - distributions
Examples of probability distributions: uniform and Gaussian
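The practical difference between the two distributions can be seen by sampling: with uniform mutation every step size in the range is equally likely, while a Gaussian concentrates probability mass on small steps (the range, sigma, and sample size below are illustrative choices):

```python
import random

random.seed(0)
N = 10_000

uniform_steps = [random.uniform(-1.0, 1.0) for _ in range(N)]
gaussian_steps = [random.gauss(0.0, 0.5) for _ in range(N)]

def frac_small(steps, limit=0.25):
    """Fraction of mutation steps smaller than `limit` in magnitude."""
    return sum(abs(s) < limit for s in steps) / len(steps)

print(frac_small(uniform_steps))   # ≈ 0.25
print(frac_small(gaussian_steps))  # ≈ 0.38: small steps are favoured
```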

Benchmark functions for EAs
Test functions for EAs should have the following characteristics:
- The global minimum should be known
- The function should be easily computed
- The function is recognised to have certain characteristics
A common classification of benchmark functions is:
- Unimodal (e.g. the Sphere function)
- Discontinuous (e.g. the Step function)
- Multimodal with few local minima
- Multimodal with an exponential number of local minima


Test functions: Sphere
Example of a unimodal function: f(x,y) = x² + y²


Test functions: Schwefel
Example of a unimodal function: f(x) = (maxᵢ |xᵢ|)²


Test functions: Schwefel problem
Multimodal (standard form): f(x) = 418.9829·n - Σᵢ xᵢ sin(√|xᵢ|)


Test functions: Rastrigin's function
Multimodal (standard form): f(x) = 10n + Σᵢ [xᵢ² - 10 cos(2πxᵢ)]


Test functions: Ackley's function
Multimodal (standard form): f(x) = -20 exp(-0.2 √((1/n) Σᵢ xᵢ²)) - exp((1/n) Σᵢ cos(2πxᵢ)) + 20 + e


Test functions: Butterfly function
Multimodal: f(x,y) = (x² - y²) sin(x+y) / (x² + y²)
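The common test functions above can be sketched in a few lines. The Rastrigin and Ackley definitions below use the standard textbook forms (an assumption, since the slides omit the exact constants); all three have global minimum 0 at the origin:

```python
import math

def sphere(x):
    """Unimodal: sum of squares."""
    return sum(xi ** 2 for xi in x)

def rastrigin(x):
    """Multimodal with many regularly spaced local minima (standard form)."""
    return 10 * len(x) + sum(xi ** 2 - 10 * math.cos(2 * math.pi * xi) for xi in x)

def ackley(x):
    """Multimodal with a nearly flat outer region and a central funnel (standard form)."""
    n = len(x)
    a = -20 * math.exp(-0.2 * math.sqrt(sum(xi ** 2 for xi in x) / n))
    b = -math.exp(sum(math.cos(2 * math.pi * xi) for xi in x) / n)
    return a + b + 20 + math.e

origin = [0.0, 0.0]
print(sphere(origin), rastrigin(origin), round(ackley(origin), 12))
```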


Benchmark functions - Ridges f(x,y) =

R. Salomon (1995) points out a drastic drop in EA performance when rotated ridges are present


Benchmark functions - Ridges - II f(x,y) =


Benchmark functions - Weierstrass f(x) =




Testing the algorithms
Performance can be evaluated as:
1. Number of evaluations to reach a given threshold
2. Optimised value after a given number of evaluations
Other important aspects:
1. Consistency of the results over many experiments
2. Robustness over a set of problems
3. Dynamics of optimisation during the generations
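Criterion 1 can be measured directly by counting calls to the objective. Below is a minimal (1+1) hill-climber with Gaussian mutation used purely as a measurement harness (the algorithm, step size, and seed are illustrative choices, not from the slides):

```python
import random

def sphere(x):
    return sum(xi ** 2 for xi in x)

def evals_to_threshold(f, x0, sigma, threshold, max_evals=100_000):
    """Run a (1+1) hill-climber with Gaussian mutation and return
    (evaluations used, best value found) once f drops below threshold."""
    random.seed(1)  # fixed seed so repeated runs are comparable
    x, fx, evals = list(x0), f(x0), 1
    while fx > threshold and evals < max_evals:
        y = [xi + random.gauss(0.0, sigma) for xi in x]
        fy = f(y)
        evals += 1
        if fy <= fx:  # keep the offspring only if it is no worse
            x, fx = y, fy
    return evals, fx

evals, best = evals_to_threshold(sphere, [3.0, 3.0], 0.3, 1e-2)
print(evals, best)
```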


No free lunch
During the eighties, researchers in EC were trying to find the algorithm that would work best on average on a large number of test functions. In 1995, Wolpert and Macready from the Santa Fe Institute published a technical report titled "No Free Lunch Theorems for Search":
"...for any algorithm, any elevated performance over one class of problems is exactly paid for in performance over another class"


Main messages
• There are different kinds of evolutionary algorithms for parameter optimisation, implementing minor or major differences
• The mutation operator can adopt different probability distributions (e.g. Gaussian, Cauchy...)
• Fitness landscapes can look very different, demanding different mutation strengths and distributions
• An algorithm particularly tuned to solve one problem might perform less well on a different fitness landscape (no free lunch)


Suggested reading
T. Bäck and H.-P. Schwefel, "An overview of evolutionary algorithms for parameter optimization", Evolutionary Computation, vol. 1, no. 1, pp. 1-23 (1993)
