Black Box Search By Unbiased Variation
Per Kristian Lehre and Carsten Witt
CERCIA, University of Birmingham, UK / DTU Informatics, Copenhagen, Denmark
ThRaSH - March 24th 2010
State of the Art in Runtime Analysis of RSHs

Problem                         Algorithm     Runtime                                          Reference
OneMax                          (1+1) EA      O(n log n)                                       [Mühlenbein, 1992]
                                (1+λ) EA      O(λn + n log n)                                  [Jansen et al., 2005]
                                (µ+1) EA      O(µn + n log n)                                  [Witt, 2006]
                                1-ANT         O(n^2) w.h.p.                                    [Neumann and Witt, 2006]
                                (µ+1) IA      O(µn + n log n)                                  [Zarges, 2009]
Linear Functions                (1+1) EA      Θ(n log n)                                       [Droste et al., 2002] and [He and Yao, 2003]
                                cGA           Θ(n^{2+ε}), ε > 0 const.                         [Droste, 2006]
Max. Matching                   (1+1) EA      e^{Ω(n)}, PRAS                                   [Giel and Wegener, 2003]
Sorting                         (1+1) EA      Θ(n^2 log n)                                     [Scharnow et al., 2002]
SS Shortest Path                (1+1) EA      O(n^3 log(n w_max))                              [Baswana et al., 2009]
                                MO (1+1) EA   O(n^3)                                           [Scharnow et al., 2002]
MST                             (1+1) EA      Θ(m^2 log(n w_max))                              [Neumann and Wegener, 2007]
                                (1+λ) EA      O(nλ log(n w_max)), λ = ⌈m/n⌉                    [Neumann and Wegener, 2007]
                                1-ANT         O(mn log(n w_max))                               [Neumann and Witt, 2008]
Max. Clique (rand. planar)      (1+1) EA      Θ(n^5)                                           [Storch, 2006]
                                (16n+1) RLS   Θ(n^{5/3})                                       [Storch, 2006]
Eulerian Cycle                  (1+1) EA      Θ(m^2 log m)                                     [Doerr et al., 2007]
Partition                       (1+1) EA      PRAS, avg.                                       [Witt, 2005]
Vertex Cover                    (1+1) EA      e^{Ω(n)}, arb. bad approx.                       [Friedrich et al., 2007] and [Oliveto et al., 2007a]
Set Cover                       (1+1) EA      e^{Ω(n)}, arb. bad approx.                       [Friedrich et al., 2007]
                                SEMO          Pol. O(log n)-approx.                            [Friedrich et al., 2007]
Intersection of p ≥ 3 matroids  (1+1) EA      1/p-approximation in O(|E|^{p+2} log(|E| w_max)) [Reichel and Skutella, 2008]
UIO/FSM conf.                   (1+1) EA      e^{Ω(n)}                                         [Lehre and Yao, 2007]

See survey [Oliveto et al., 2007b].
Motivation - A Theory of Randomised Search Heuristics

Computational Complexity
- Classification of problems according to inherent difficulty.
- Common limits on the efficiency of all algorithms.
- Assuming a particular model of computation.

Computational Complexity of Search Problems
- Polynomial-time Local Search [Johnson et al., 1988].
- Black-Box Complexity [Droste et al., 2006].
Black Box Complexity

[Figure: a black-box algorithm A submits queries x_1, x_2, x_3, ..., x_t to an unknown function f drawn from a function class F, and receives only the values f(x_1), f(x_2), f(x_3), ..., f(x_t) in return. Photo: E. Gerhard (1846).]

- Black box complexity of a function class F:

    T_F := min_A max_{f ∈ F} T_{A,f}

[Droste et al., 2006]
Results with the Old Model

- Very general model with few restrictions on resources.
- Example: Needle has BB complexity (2^n + 1)/2.
- Some NP-hard problems have polynomial BB complexity.
- Artificially low BB complexity on example functions, e.g.
  - n/log(2n + 1) − 1 on OneMax
  - n/2 − o(n) on LeadingOnes
Refined Black Box Model

[Figure: as in the previous model, the algorithm A queries an unknown f ∈ F, but now A only observes the sequence of fitness values f(x_0), f(x_1), f(x_2), ... (e.g. 0, 0, 2, 3, 0, 2), and each new search point x_t is produced by a variation operator applied to previously queried points.]

- Unbiased black box complexity of a function class F:

    T_F := min_A max_{f ∈ F} T_{A,f}

  where the minimum is now taken over unbiased black-box algorithms A.
Unbiased Variation Operators

Encoding of a solution by a bitstring x = x1 x2 x3 x4 x5.¹

[Figure: a graph example in which each bit of x controls whether one element is part of the solution, e.g. x2 = 1 ⇒ blue in, x4 = 1 ⇒ orange in; flipping x4 takes orange out again.]

¹ Figure by Dake, available under a Creative Commons Attribution-Share Alike 2.5 Generic license.
Unbiased Variation Operators

A unary variation operator is a conditional distribution p(y | x). For any bitstrings x, y, z and any permutation σ, we require

  1) p(y | x) = p(y ⊕ z | x ⊕ z)
  2) p(y | x) = p(y_σ(1) y_σ(2) · · · y_σ(n) | x_σ(1) x_σ(2) · · · x_σ(n))

→ We consider unary operators, but higher arities are possible.

[Droste and Wiesmann, 2000, Rowe et al., 2007]
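Conditions 1) and 2) can be checked directly for a concrete operator. A minimal sketch (assuming standard bit mutation with rate 1/n as the operator, a textbook choice not spelled out on this slide) enumerates all bitstrings of a small length and verifies both invariances:

```python
from itertools import product

N = 4          # small bitstring length so exhaustive enumeration is feasible
RATE = 1 / N   # standard bit-mutation rate (an assumption; any fixed rate works)

def p(y, x):
    """Probability that standard bit mutation turns x into y:
    each bit flips independently with probability RATE."""
    d = sum(a != b for a, b in zip(x, y))      # Hamming distance
    return RATE ** d * (1 - RATE) ** (N - d)

strings = list(product((0, 1), repeat=N))

def xor(a, b):
    return tuple(u ^ v for u, v in zip(a, b))

# Condition 1: p(y | x) = p(y XOR z | x XOR z) for all x, y, z.
ok1 = all(abs(p(y, x) - p(xor(y, z), xor(x, z))) < 1e-12
          for x in strings for y in strings for z in strings)

# Condition 2: invariance under permutations of the bit positions
# (checked here for one representative permutation, a cyclic shift).
sigma = lambda s: s[1:] + s[:1]
ok2 = all(abs(p(y, x) - p(sigma(y), sigma(x))) < 1e-12
          for x in strings for y in strings)

print(ok1, ok2)   # both invariances hold for standard bit mutation
```

Both checks succeed because p(y | x) depends only on the Hamming distance between x and y, which neither XOR-ing with a fixed z nor permuting positions can change.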
Unbiased Variation Operators

[Figure: all points y at Hamming distance r from the current point x form a sphere around x; the optimum x* singles out none of them.]

Conditions 1) and 2) imply Hamming-invariance: the probability of producing y from x depends only on the Hamming distance between x and y.
Unbiased Black-Box Algorithm Scheme

 1: t ← 0.
 2: Choose x(t) uniformly at random from {0, 1}^n.
 3: repeat
 4:   t ← t + 1.
 5:   Compute f(x(t − 1)).
 6:   I(t) ← (f(x(0)), ..., f(x(t − 1))).
 7:   Depending on I(t), choose a prob. distr. p_s on {0, ..., t − 1}.
 8:   Randomly choose an index j according to p_s.
 9:   Depending on I(t), choose an unbiased variation op. p_v(· | x(j)).
10:   Randomly choose a bitstring x(t) according to p_v.
11: until termination condition met.

→ Covers (µ +, λ) EA, simulated annealing, Metropolis, RLS, any population size, any selection mechanism, steady-state EAs, cellular EAs, rank-based mutation, ...
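The scheme can be sketched in Python. The choices of p_s and p_v below are assumptions for illustration (the scheme leaves them open): select the best point seen so far, then apply standard bit mutation, which yields a (1+1) EA-like instance:

```python
import random

def onemax(x):
    return sum(x)   # example fitness function

def unbiased_black_box(f, n, budget=10_000):
    """One instance of the unbiased black-box scheme:
    p_s = 'pick the best previous point', p_v = standard bit mutation."""
    history = [[random.randint(0, 1) for _ in range(n)]]   # x(0) uniform at random
    values = [f(history[0])]
    for t in range(1, budget):
        j = max(range(t), key=lambda i: values[i])         # index chosen via p_s
        parent = history[j]
        # standard bit mutation: flip each bit independently with prob. 1/n;
        # this operator satisfies unbiasedness conditions 1) and 2)
        child = [b ^ (random.random() < 1 / n) for b in parent]
        history.append(child)
        values.append(f(child))
        if values[-1] == n:    # stopping rule specific to OneMax's optimum
            return t
    return budget

random.seed(0)
print(unbiased_black_box(onemax, 20))   # number of queries until the optimum
```

Note that the scheme stores the entire query history, so any selection mechanism over past points (populations, steady-state schemes, etc.) can be expressed through p_s.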
Simple Unimodal Functions

Algorithm     LeadingOnes
(1+1) EA      Θ(n^2)
(1+λ) EA      Θ(n^2 + λn)
(µ+1) EA      Θ(n^2 + µn log n)
BB            Ω(n)

Theorem
The expected runtime of any black box algorithm with unary, unbiased variation on LeadingOnes is Ω(n^2).

Proof idea
- Potential between n/2 and 3n/4.
- # 0-bits flipped hypergeometrically distributed.
- Lower bound by polynomial drift.
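The hypergeometric step in the proof idea can be illustrated numerically. This is a sketch under the assumption that the operator flips a uniformly random set of r positions (one canonical Hamming-invariant operator): among r flipped positions in a string with k 0-bits, the number of flipped 0-bits is hypergeometrically distributed.

```python
import random
from math import comb

random.seed(1)
n, k, r = 40, 10, 5          # string length, # of 0-bits, # of flipped bits
TRIALS = 200_000

# Empirical: flip a uniformly random r-subset of positions; by Hamming
# invariance we may assume w.l.o.g. that the 0-bits sit at positions 0..k-1.
counts = [0] * (r + 1)
for _ in range(TRIALS):
    flipped = random.sample(range(n), r)
    counts[sum(pos < k for pos in flipped)] += 1

# Exact hypergeometric pmf: P(exactly i of the r flips hit a 0-bit)
pmf = [comb(k, i) * comb(n - k, r - i) / comb(n, r) for i in range(r + 1)]

for i in range(r + 1):
    print(i, round(counts[i] / TRIALS, 3), round(pmf[i], 3))
```

The empirical frequencies match the exact pmf to within sampling error, which is the distributional fact the drift argument builds on.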
Escaping from Local Optima

[Figure: the Jump_m function plotted against the number of 1-bits |x|, with a gap of width m before the optimum.]

Theorem
For any m ≤ n(1 − ε)/2 with 0 < ε < 1, the expected runtime of any black box algorithm with unary, unbiased variation on Jump_m is at least
- 2^{cm} with probability 1 − 2^{−Ω(m)},
- (n/(rm))^{cm} with probability 1 − 2^{−Ω(m ln(n/(rm)))}.

→ These bounds are lower than the Θ(n^m) bound for the (1+1) EA!

Proof idea
- Simplified drift in gaps.
  1. Expectation of the hypergeometric distribution.
  2. Chvátal's bound.
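For reference, Jump_m is usually defined as in [Droste et al., 2002]; the slide only shows the plot, so the concrete formula below is an assumption taken from that standard source:

```python
def jump(x, m):
    """Jump_m: behaves like OneMax shifted by m, except on the 'gap' of
    strings with more than n - m (but fewer than n) 1-bits, where the
    fitness decreases and pushes hillclimbers away from the optimum."""
    n = len(x)
    ones = sum(x)
    if ones <= n - m or ones == n:
        return m + ones          # outside the gap; maximised at all-ones
    return n - ones              # inside the gap of width m - 1

# the all-ones string is the unique optimum
print(jump([1] * 8, 3))          # -> 11
print(jump([1] * 7 + [0], 3))    # inside the gap for n=8, m=3 -> 1
```

Crossing the gap requires flipping the remaining m 0-bits simultaneously, which is where the Θ(n^m) runtime of the (1+1) EA comes from.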
General Pseudo-boolean Functions

Algorithm     OneMax
(1+1) EA      Θ(n log n)
(1+λ) EA      O(λn + n log n)
(µ+1) EA      O(µn + n log n)
BB            Ω(n/ log n)

Theorem
The expected runtime of any black box search algorithm with unbiased, unary variation on any pseudo-boolean function with a single global optimum is Ω(n log n).

Proof idea
- Expected multiplicative weight decrease.
- Chvátal's bound.
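The Ω(n log n) bound is matched by the (1+1) EA row of the table, and this tightness is easy to see empirically. A numerical sketch (assuming the standard (1+1) EA with bit-mutation rate 1/n on OneMax): the average runtime divided by n ln n settles near a constant as n grows.

```python
import random
from math import log

def one_plus_one_ea(n, seed):
    """(1+1) EA with standard bit mutation on OneMax; returns the number
    of iterations until the all-ones optimum is found."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    fx, t = sum(x), 0
    while fx < n:
        t += 1
        y = [b ^ (rng.random() < 1 / n) for b in x]
        fy = sum(y)
        if fy >= fx:             # elitist selection: keep the better point
            x, fx = y, fy
    return t

# Normalised average runtime stays roughly constant over n,
# consistent with the Theta(n log n) entry for the (1+1) EA.
for n in (50, 100, 200):
    avg = sum(one_plus_one_ea(n, s) for s in range(30)) / 30
    print(n, round(avg / (n * log(n)), 2))
```

The printed ratios hover around a small constant for all three values of n, illustrating that no unary unbiased algorithm can beat the (1+1) EA on OneMax by more than a constant factor.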
Summary and Conclusion

- Refined black box model.
- Proofs are (relatively) easy!
- Comprises EAs never previously analysed.
- Ω(n log n) on general functions.
- Some bounds coincide with the runtime of the (1+1) EA.
- Future work: k-ary variation operators for k > 1.
References I

Baswana, S., Biswas, S., Doerr, B., Friedrich, T., Kurur, P. P., and Neumann, F. (2009). Computing single source shortest paths using single-objective fitness. In FOGA '09: Proceedings of the tenth ACM SIGEVO workshop on Foundations of genetic algorithms, pages 59–66, New York, NY, USA. ACM.

Doerr, B., Klein, C., and Storch, T. (2007). Faster evolutionary algorithms by superior graph representation. In Proceedings of the 1st IEEE Symposium on Foundations of Computational Intelligence (FOCI'2007), pages 245–250.

Droste, S. (2006). A rigorous analysis of the compact genetic algorithm for linear functions. Natural Computing, 5(3):257–283.

Droste, S., Jansen, T., and Wegener, I. (2002). On the analysis of the (1+1) Evolutionary Algorithm. Theoretical Computer Science, 276:51–81.

Droste, S., Jansen, T., and Wegener, I. (2006). Upper and lower bounds for randomized search heuristics in black-box optimization. Theory of Computing Systems, 39(4):525–544.
References II

Droste, S. and Wiesmann, D. (2000). Metric based evolutionary algorithms. In Proceedings of Genetic Programming, European Conference, Edinburgh, Scotland, UK, April 15-16, 2000, volume 1802 of Lecture Notes in Computer Science, pages 29–43. Springer.

Friedrich, T., Hebbinghaus, N., Neumann, F., He, J., and Witt, C. (2007). Approximating covering problems by randomized search heuristics using multi-objective models. In Proceedings of the 9th annual conference on Genetic and evolutionary computation (GECCO'2007), pages 797–804, New York, NY, USA. ACM Press.

Giel, O. and Wegener, I. (2003). Evolutionary algorithms and the maximum matching problem. In Proceedings of the 20th Annual Symposium on Theoretical Aspects of Computer Science (STACS 2003), pages 415–426.

He, J. and Yao, X. (2003). Towards an analytic framework for analysing the computation time of evolutionary algorithms. Artificial Intelligence, 145(1-2):59–97.

Jansen, T., De Jong, K. A., and Wegener, I. (2005). On the choice of the offspring population size in evolutionary algorithms. Evolutionary Computation, 13(4):413–440.
References III

Johnson, D. S., Papadimitriou, C. H., and Yannakakis, M. (1988). How easy is local search? Journal of Computer and System Sciences, 37(1):79–100.

Lehre, P. K. and Yao, X. (2007). Runtime analysis of (1+1) EA on computing unique input output sequences. In Proceedings of 2007 IEEE Congress on Evolutionary Computation (CEC'2007), pages 1882–1889. IEEE Press.

Mühlenbein, H. (1992). How genetic algorithms really work I: Mutation and Hillclimbing. In Proceedings of Parallel Problem Solving from Nature 2 (PPSN-II), pages 15–26. Elsevier.

Neumann, F. and Wegener, I. (2007). Randomized local search, evolutionary algorithms, and the minimum spanning tree problem. Theoretical Computer Science, 378(1):32–40.

Neumann, F. and Witt, C. (2006). Runtime analysis of a simple ant colony optimization algorithm. In Proceedings of the 17th International Symposium on Algorithms and Computation (ISAAC'2006), number 4288 in LNCS, pages 618–627.
References IV

Neumann, F. and Witt, C. (2008). Ant colony optimization and the minimum spanning tree problem. In Proceedings of Learning and Intelligent Optimization (LION'2008), pages 153–166.

Oliveto, P. S., He, J., and Yao, X. (2007a). Evolutionary algorithms and the vertex cover problem. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC'2007).

Oliveto, P. S., He, J., and Yao, X. (2007b). Time complexity of evolutionary algorithms for combinatorial optimization: A decade of results. International Journal of Automation and Computing, 4(1):100–106.

Reichel, J. and Skutella, M. (2008). Evolutionary algorithms and matroid optimization problems. Algorithmica.

Rowe, J. E., Vose, M. D., and Wright, A. H. (2007). Neighborhood graphs and symmetric genetic operators. In FOGA, pages 110–122.
References V

Scharnow, J., Tinnefeld, K., and Wegener, I. (2002). Fitness landscapes based on sorting and shortest paths problems. In Proceedings of the 7th Conf. on Parallel Problem Solving from Nature (PPSN-VII), number 2439 in LNCS, pages 54–63.

Storch, T. (2006). How randomized search heuristics find maximum cliques in planar graphs. In Proceedings of the 8th annual conference on Genetic and evolutionary computation (GECCO'2006), pages 567–574, New York, NY, USA. ACM Press.

Witt, C. (2005). Worst-case and average-case approximations by simple randomized search heuristics. In Proceedings of the 22nd Annual Symposium on Theoretical Aspects of Computer Science (STACS'05), number 3404 in LNCS, pages 44–56.

Witt, C. (2006). Runtime Analysis of the (µ + 1) EA on Simple Pseudo-Boolean Functions. Evolutionary Computation, 14(1):65–86.

Zarges, C. (2009). On the utility of the population size for inversely fitness proportional mutation rates. In FOGA '09: Proceedings of the tenth ACM SIGEVO workshop on Foundations of genetic algorithms, pages 39–46, New York, NY, USA. ACM.