The aims of this tutorial

7 downloads 0 Views 1MB Size Report
Simple answers like “Crossover rate should be 0.8” ... predetermined time-varying schedule p = p(t) ..... Number of their own parameters – overhead costs.
A.E. Eiben

Principled approaches to tuning EA parameters

A.E. Eiben Free University Amsterdam  Free University Amsterdam (google: eiben)

The aims of this tutorial

y Creating awareness of the tuning issue  y Providing guidelines for tuning EAs y Presenting a vision for future development 

CEC 2009 tutorial

1

A.E. Eiben

Principled approaches to tuning EA parameters

By the end of this talk You will certainly NOT have: y Simple answers like “Crossover rate should be 0.8”  y Clear recipes like “Method X can tune your EA perfectly” You will hopefully have:  y A new look on EA parameter calibration y Motivation to adjust your practice  j y p y Inspiration to do research on EA parameter calibration

The basic EC metaphor EVOLUTION

PROBLEM SOLVING

Environment

Problem

Individual Fitness

Candidate Solution Quality

Fitness → chances for survival and reproduction Quality → chance for seeding new solutions

CEC 2009 tutorial

2

A.E. Eiben

Principled approaches to tuning EA parameters

The two main pillars of EAs There are two competing forces active y Increasing population  diversity by 

y Decreasing population diversity by 

genetic operators

selection

y mutation

y of parents

y recombination

y of survivors

Push towards novelty 

Push towards quality us to a ds qua ty

Proper balance of these forces is essential This is regulated through the parameters

Algorithm design and parameters y Calibration of EAs y Design of EAs y Configuration of EAs y Parameter optimization of EAs y…

Given: Required:

CEC 2009 tutorial

an algorithmic framework instantiation to a specific algorithm with good quality

3

A.E. Eiben

Principled approaches to tuning EA parameters

Problems y How to find good parameter values ?

Many parameters, unknown effects, unknown, non‐ linear interactions

y How to vary parameter values?

EA is a dynamic, staged, process Æoptimal parameter  values may vary during a run

Brief historical account y 1970/80ies   “GA is a robust method”

adapt mutation stepsize σ y 1970ies +      ESs self 1970ies + ESs self‐adapt mutation stepsize σ y 1986              meta‐GA for optimizing GA parameters y 1990ies         EP adopts self‐adaptation of σ  as ‘standard’  y 1990ies         some papers on changing parameters on‐the‐fly  y 1999              Eiben‐Michalewicz‐Hinterding paper proposes 

clear taxonomy & terminology clear taxonomy & terminology 

CEC 2009 tutorial

4

A.E. Eiben

Principled approaches to tuning EA parameters

Taxonomy

PARAMETER CALIBRATION PARAMETER TUNING

PARAMETER CONTROL

(before the run)

(during the run)

DETERMINISTIC

ADAPTIVE

SELF-ADAPTIVE

((time dependent) p )

((feedback from search))

((coded in chromosomes))

Google Scholar index > 550 (>700 with the conference version)

Parameter tuning Parameter tuning: testing and comparing different values before the “real” real run Problems: y users mistakes in settings can be sources of errors or

sub-optimal performance y costs much time y parameters interact: exhaustive search is not practicable y good values may become bad during the run

CEC 2009 tutorial

5

A.E. Eiben

Principled approaches to tuning EA parameters

Parameter control Parameter control: setting values on-line, during the actual run, e.g. y predetermined time-varying schedule p = p(t) y using (heuristic) feedback from the search process y encoding parameters in chromosomes and rely on natural

selection

Problems: y finding optimal p is hard, finding optimal p(t) is harder y still user-defined feedback mechanism, how to “optimize”? y when would natural selection work for algorithm parameters?

Example Task to solve: ( 1,,…,x , n) y min f(x y Li ≤ xi ≤ Ui y gi (x) ≤ 0 y hi (x) = 0

for i = 1,…,n for i = 1,…,q for i = q+1,…,m

bounds inequality constraints equality constraints

Algorithm: y EA with ith real-valued l l d representation t ti ((x1,…,xn) y arithmetic averaging crossover y Gaussian mutation: x’ i = xi + N(0, σ)

standard deviation σ is called mutation step size

CEC 2009 tutorial

6

A.E. Eiben

Principled approaches to tuning EA parameters

Varying mutation step size: option 1 Replace the constant σ by a function σ(t)

σ (t ) = 1 - 0.9 × Tt 0 ≤ t ≤ T is the current generation number Features: z z z z

changes in σ are independent from the search progress strong user control of σ by the above formula σ is fully predictable a given σ acts on all individuals of the population

Varying mutation step size: option 2 Replace the constant σ by a function σ(t). The value is updated after every n steps by Rechenberg’s 1/5 success rule

>1/5 1/5 ⎧σ ((tt − n ) / c if ps > ⎪ σ (t ) = ⎨σ (t − n ) ⋅ c if ps < 1/5 ⎪σ (t − n ) otherwise ⎩ where ps is the % of successful mutations, c is a parameter (0.8 < c < 1)

Features: z z z z

CEC 2009 tutorial

changes in σ are based on feedback from the search progress some user control of σ by the above formula σ is not predictable a given σ acts on all individuals of the population

7

A.E. Eiben

Principled approaches to tuning EA parameters

Varying mutation step size: option 3 Assign a personal σ to each individual Incorporate this σ into the chromosome: (x1, …, xn, σ) Apply variation operators to xi‘s and σ

σ ′ = σ × e N ( 0,τ ) x′i = xi + N (0, σ ′) Features: z z z z

changes in σ are results of natural selection (almost) no user control of σ σ is not predictable a given σ acts on one individual

Varying mutation step size: option 4 Assign a personal σ to each variable in each individual Incorporate σ’s into the chromosomes: (x1, …, xn, σ1, …, σ n) Apply variation operators to xi‘s s and σi‘s s

σ ′i = σi × e N ( 0,τ ) x′i = xi + N (0, σ ′i ) Features: z z z z

CEC 2009 tutorial

changes in σi are results of natural selection (almost) no user control of σi σi is not predictable a given σi acts on one variable of one individual

8

A.E. Eiben

Principled approaches to tuning EA parameters

Control WHAT? Practically any EA component can be parameterized and thus controlled on-the-fly: y representation y evaluation function y variation operators y selection operator (parent or mating selection) y replacement operator (survival or environmental selection) y population (overlap, size, topology)

Control HOW? Three major types of parameter control: y deterministic: some rule modifies strategy parameter

without feedback from the search (based on some counter, typically time or no of search steps)

y adaptive: feedback rule, i.e., heuristic, based on some

measure monitoring search progress

y self-adaptative: parameter values evolve along with

solutions; encoded onto chromosomes they undergo variation and selection

CEC 2009 tutorial

9

A.E. Eiben

Principled approaches to tuning EA parameters

Notes on parameter control y Parameter control offers the possibility to use appropriate values in various 

stages of the search y Adaptive and self‐adaptive control can  Adaptive and self adaptive control can “liberate” liberate  users from tuning users from tuning Æ

reduces need for EA expertise for a new application y Assumption: control heuristic is less parameter‐sensitive than the EA

BUT y State‐of‐the‐art is a mess: literature is a potpourri, no generic knowledge, 

no principled approaches to developing control heuristics (deterministic or  adaptive), no solid testing methodology

WHAT ABOUT TUNING?

Historical account (cont’d) Last decade:  y More & more work on parameter control  y Traditional parameters: mutation and xover  y Non‐traditional parameters: selection and population size y All parameters Î “parameterless”  EAs (name!?)

y Hardly any work on parameter tuning, i.e., y Nobody reports on tuning efforts behind their EA published y Just a handful papers on tuning methods / algorithms 

CEC 2009 tutorial

10

A.E. Eiben

Principled approaches to tuning EA parameters

Control flow of EA calibration / design User

Design layer

Meta-GA

Algorithm layer

GA

optimizes

GP

optimizes Symbolic regression

Application layer

One-max

Information flow of EA calibration / design Design layer

Algorithm quality

Algorithm layer

Solution quality

Application layer

CEC 2009 tutorial

11

A.E. Eiben

Principled approaches to tuning EA parameters

Lower level of EA calibration / design EA

Searches

Decision variables Problem parameters Candidate solutions Evaluates Application

Space of solution vectors

Lower level of EA calibration / design The whole field of EC is about this

EA

Searches

Decision variables Problem parameters Candidate solutions Evaluates Application

Space of solution vectors

CEC 2009 tutorial

12

A.E. Eiben

Principled approaches to tuning EA parameters

Upper level of EA calibration / design Design method

Searches

Design variables, Algorithm parameters, Strategy parameters Evaluates EA

Space of parameter vectors

Parameter – performance landscape y All parameters together span a (search) space y One point – one EA instance  y Height of point = performance of EA instance 

on a given problem y Parameter‐performance landscape or utility landscape for 

each { EA + problem instance + performance measure }

y This landscape is unlikely to be trivial, e.g., unimodal, 

separable

y If there is some structure in the utility landscape, then we 

can do better than random or exhaustive search

CEC 2009 tutorial

13

A.E. Eiben

Principled approaches to tuning EA parameters

Ontology ‐ Terminology LOWER PART

UPPER PART UPPER PART

EA

Tuner

Solution vectors

Parameter vectors

Fitness

Utility

Evaluation

Test 

METHOD SEARCH SPACE QUALITY ASSESSMENT

y Fitness ≈ objective function value y Utility = ?  y MBF‐utility, AES‐utility, SR‐utility, combined utility, 

robustness utility, …  

Off‐line vs. on‐line calibration / design Design / calibration method y Off‐line Æ parameter tuning y On‐line Æ parameter control y Why focus on tuning (first)? y Easier y Most immediate need of users y Control strategies have parameters too Æ need tuning themselves y Knowledge about tuning (utility landscapes) can help the design of  Knowledge about tuning (utility landscapes) can help the design of good control strategies y There are indications that good tuning works better than control High impact R&D programme: principled approaches to EA tuning

CEC 2009 tutorial

14

A.E. Eiben

Principled approaches to tuning EA parameters

Tuning by generate‐and‐test y EA tuning is a search problem itself y Straightforward approach: generate‐and‐test hf d h d Generate parameter vectors

Test parameter vectors

Terminate

Testing parameter vectors y Run EA with these parameters on the given problem or problems y Record EA performance in that run e.g., by  y Solution quality = best fitness at termination  y Speed = time used to find required solution quality  y EAs are stochastic Æ repetitions are needed for reliable 

evaluation Æ we get statistics, e.g., y Average performance by solution quality, speed (MBF, AES, AEB) y Robustness R b = variance in those averages i i h y Success rate = % runs ending with success

y Big issue: how many repetitions of the test

CEC 2009 tutorial

15

A.E. Eiben

Principled approaches to tuning EA parameters

Numeric parameters y E.g., population size, xover rate, tournament size, … y Domain D i is i subset b t off R, R Z, Z N (finite (fi it or infinite) i fi it )

EA performance

EA performance

y Sensible distance metric Æ searchable

Parameter value

Parameter value

Relevant parameter

Irrelevant parameter

Symbolic parameters y E.g., xover_operator, elitism, selection_method y Finite domain, e.g., {1 point, uniform, averaging}, {Y, N} Finite domain e g {1‐point uniform averaging} {Y N}

A B C D E F G H Parameter value Non-searchable ordering

CEC 2009 tutorial

EA peerformance

EA peerformance

y No sensible distance metric Æ non‐searchable in general

B D C A H F G E Parameter value Searchable ordering

16

A.E. Eiben

Principled approaches to tuning EA parameters

Notes on parameters y A value of a symbolic parameter can introduce a numeric 

parameter, e.g.,  y Selection = tournament Æ tournament size y Populations_type = overlapping Æ generation gap y Parameters can have a hierarchical, nested structure y Number of EA parameters is not defined in general y Cannot simply denote the design space / tuning search space  by  S = Q1 x … Qm x R1 x … x Rn with Qi  / Rj  as domains of the symbolic/numeric parameters 

What is an EA? ALG‐1

ALG‐2

ALG‐3

ALG‐4

SYMBOLIC PARAMETERS Bit‐string

Bit‐string

Real‐valued

Real‐valued

Overlapping pops

Representation

N

Y

Y

Y

Survivor selection

̶

Tournament 

Replace worst

Replace worst

Roulette wheel

Uniform determ

Tournament 

Tournament 

Bit‐flip

Bit‐flip

N(0,σ)

N(0,σ)

Recombination

Uniform xover

Uniform xover

Discrete recomb

Discrete recomb

G Generation ti gap

̶

05 0.5

09 0.9

09 0.9

Population size

100

500

100

300

Parent selection Mutation

NUMERIC PARAMETERS

Tournament size Mutation rate Mutation stepsize Crossover rate

CEC 2009 tutorial

̶

2

3

30

0.01

0.1

̶

̶

̶

̶

0.01

0.05

0.8

0.7

1

0.8

17

A.E. Eiben

Principled approaches to tuning EA parameters

What is an EA? (cont’d) Make a principal distinction between EAs and EA instances and place the  border between them by: y Option 1 O ti 1 y There is only one EA, the generic EA scheme y Previous table contains 1 EA and 4 EA‐instances

y Option 2 y An EA = particular configuration of the symbolic parameters y Previous table contains 3 EAs, with 2 instances for one of them

y Option 3 y An EA = particular configuration of parameters y Notions of EA and EA‐instance coincide y Previous table contains 4 EAs / 4 EA‐instances

Generate‐and‐test under the hood Generate initial parameter vectors

Test p.v.’s

Terminate

Select p.v.’s

→ Fixed-set search → Smart S t fixed-set fi d t search h → Population-based search Generate p.v.’s

CEC 2009 tutorial

18

A.E. Eiben

Principled approaches to tuning EA parameters

Tuning effort y Total amount of computational work is determined by  y A  A = number of vectors tested number of vectors tested y B = number of tests per vector y C = number of fitness evaluations per test  y Tuning methods can be positioned by their rationale:  y To optimize A  (population‐based search) y To optimize B  (smart fixed‐set search) p ( ) y To optimize A and B  (combination) y To optimize C  (non‐existent) y…

Optimize A = optimally use A Applicable only to numeric parameters Number of tested vectors not fixed, A is the maximum (stop cond.) Initialization with N