Iterated Local Search with Guided Mutation
Qingfu Zhang and Jianyong Sun
Abstract— Guided mutation uses ideas from estimation of distribution algorithms to improve conventional mutation operators. It combines global statistical information with the location information of good individual solutions to generate new trial solutions. This paper suggests using guided mutation as the perturbation operator in iterated local search. An experimental comparison between a conventional iterated local search (CILS) and an iterated local search with guided mutation has been conducted on four classes of test instances of the quadratic assignment problem. The results show that guided mutation improves the solution quality of iterated local search.
I. INTRODUCTION

With information (often called memory) extracted from the previous search and with problem-specific knowledge, how should a new trial solution (or a set of new trial solutions) be generated at each iteration? This is a fundamental issue in the design of an iterative heuristic search algorithm for hard optimization problems. Estimation of distribution algorithms (EDAs) [1] extract global statistical information from the previous search and build a probability model of the distribution of the best solutions visited in the search space. New trial solutions are sampled from the model thus built. However, the locations of the best individual solutions found so far are not directly used for guiding the further search. In contrast, traditional genetic algorithms (GAs) [2] employ crossover and mutation as their main operators for generating new trial solutions. A crossover operator applies to a pair of parent solutions and swaps parts of these two solutions to produce new trial solutions. A mutation operator randomly alters part of a parent solution to produce a new solution. These parent solutions are often among the best solutions visited in the previous search. No global statistical information is extracted or used for producing new solutions in GAs. Recently, some effort has been made to combine EDAs and GAs [3]. We have proposed a new operator, called guided mutation, for generating new solutions [4][5]. Guided mutation can be regarded as a combination of conventional mutation and the EDA offspring-generating scheme. Guided by a probability model, guided mutation alters a parent solution to produce a new solution. The new solution can hopefully fall in, or close to, a promising area characterized by the probability model. Meanwhile, it directly copies a user-specified percentage of elements from its parent, which is often one of the best solutions found so far. In this way, the similarity between a new solution and its parent can be controlled to some extent.

Q. Zhang is with the Department of Computer Science, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, UK (email: [email protected]). J. Sun is with the School of Computer Science, University of Birmingham, Birmingham, B15 2TT, UK (email: [email protected]).
We have successfully applied guided mutation in hybrid evolutionary algorithms for the maximum clique problem [4].

Iterated local search (ILS) [6] is a general metaheuristic. It has two basic operators for generating new solutions: a local search and a perturbation operator. When its local search is trapped at a locally optimal solution, the perturbation operator is applied to the local optimum to generate a new starting point for the local search. It is desirable that the generated starting point lies in a promising area of the search space. A commonly used perturbation operator is a conventional mutation, which produces a starting point in a neighboring area of the local optimum.

In this paper, we use the guided mutation operator as the perturbation operator in iterated local search for the quadratic assignment problem (QAP). The guided mutation operator generates a new starting point for the further local search which lies in a promising area characterized by a probability model and is not far away from the best solution found so far. In [7] and [5], we used guided mutation operators in evolutionary algorithms with guided local search and 2-opt local search for the QAP. The algorithms in [7] and [5] are population-based methods, while the algorithm in this paper is a single-point iteration method. One of the major contributions of this paper is the introduction of guided mutation operators into iterated local search. We show that a guided mutation operator can improve the performance of ILS.

The rest of the paper is organized as follows. Section 2 introduces the generic framework of iterated local search. Section 3 presents guided mutation for permutation vectors. Section 4 proposes an iterated local search with guided mutation (ILS/GM) for the permutation search space. Section 5 presents the 2-opt local search for the QAP and discusses the distribution of the locally optimal solutions of the QAP. Section 6 compares ILS/GM with a conventional ILS on a set of QAP test instances. Finally, Section 7 provides some conclusions.

II. ITERATED LOCAL SEARCH (ILS)

Iterated local search works as follows [6]:

0. Generate an initial solution π_0 and set π* = LocalSearch(π_0).
1. π′ = Perturbation(π*, history).
2. π″ = LocalSearch(π′).
3. π* = AcceptanceCriterion(π*, π″, history).
4. If the stopping condition is not met, go to 1.
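A compact Python rendering of this loop could look as follows; here local_search, perturbation and accept are placeholders standing in for the three components explained next (they are not part of the original pseudocode), and a simple iteration budget plays the role of the stopping condition.

```python
def iterated_local_search(initial_solution, local_search, perturbation, accept,
                          cost, max_iterations=1000):
    """Generic ILS loop: local search, perturb, local search, accept, repeat."""
    current = local_search(initial_solution)      # pi* = LocalSearch(pi_0)
    best = current
    for _ in range(max_iterations):               # stopping condition
        candidate = perturbation(current)         # pi'  = Perturbation(pi*)
        candidate = local_search(candidate)       # pi'' = LocalSearch(pi')
        current = accept(current, candidate)      # pi*  = AcceptanceCriterion(pi*, pi'')
        if cost(current) < cost(best):            # keep track of the best solution seen
            best = current
    return best
```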
LocalSearch is an algorithm which improves a solution in the search space. In most implementations of ILS, LocalSearch can be chosen as an iterated descent local search. Perturbation mutates the current locally optimal solution π* and generates an intermediate solution π′. Then LocalSearch is applied to π′ and produces a new locally optimal solution π″. If π″ wins against π* under AcceptanceCriterion, then π″ replaces π* and becomes the current locally optimal solution; otherwise, the search remains at the previous locally optimal solution π*.

Let the search space be S and the set of all locally optimal solutions (i.e., all possible outputs of LocalSearch) be S*. LocalSearch in ILS moves each solution in S to its corresponding locally optimal solution. Therefore, in a sense, ILS performs a biased sampling of S*. If π′ is close to π* after Perturbation, it is very likely that the new locally optimal solution generated by LocalSearch will not be far away from the current one. As a result, ILS performs a walk in S* from one locally optimal solution to a nearby one. In implementations of ILS, Perturbation can be a conventional mutation operator with an appropriate perturbation strength. For some problems, the perturbation strength needs to be high. The search history has also been exploited in some perturbation schemes. For example, Battiti and Protasi [8] proposed an ILS-like algorithm for MAX-SAT in which a tabu memory of the search history is used in the perturbation stage.

The rationale behind ILS is supported by the proximate optimality principle [9]. This principle assumes that good solutions are similar. This assumption is reasonable for most real-world problems. For example, in the traveling salesman problem, the percentage of common edges in any two locally optimal solutions obtained by the Lin-Kernighan method is about 85% on average. Based on this principle, the search should take place in S* around the best locally optimal solutions found so far.

III. GUIDED MUTATION

In conventional mutation, a new solution is close to its parent, but it may be far away from the other best solutions found so far, since the mutation only utilizes the location information of the parent solution. EDAs extract global statistical information from the previous search and use it to guide the further search; however, the location information of the best solutions found so far is not directly used in EDAs. Guided mutation [4] combines global statistical information and the location information of the best solutions found so far to overcome these shortcomings of conventional mutation and of EDAs. In the following, we introduce a guided mutation operator on permutation vectors.

Let the search space be Π, the set of all possible permutations of I = {1, 2, . . . , n}. Suppose that the distribution of promising solutions can be modeled by an n × n probability matrix P = (p_ij), where p_ij is the probability that π_i = j in a promising solution π = (π_1, . . . , π_n) ∈ Π.
Guided by the probability matrix P, guided mutation mutates an existing solution π ∈ Π to generate a new solution σ ∈ Π. The operator needs a control parameter α with 0 < α < 1. It works as follows:

σ = GuidedMutation(π, P, α)
Input: a permutation π = (π_1, . . . , π_n), a probability matrix P = (p_ij)_{n×n} and a parameter α with 0 < α < 1.
Output: a permutation σ = (σ_1, . . . , σ_n).
Step 1: Randomly pick [αn] integers uniformly from I = {1, 2, . . . , n} and let these integers constitute a set K ⊂ I. Set V = I \ K and U = I \ {π_l | l ∈ K}.
Step 2: For each i ∈ K, set σ_i = π_i.
Step 3: While U ≠ ∅: uniformly randomly select an i from V, then randomly draw a k from U with probability p_ik / Σ_{j∈U} p_ij. Set σ_i = k, U = U \ {k} and V = V \ {i}.
Step 4: Return σ.

In the above guided mutation operator, σ_i is directly copied from the parent π if i ∈ K. Otherwise, under the constraint that σ is a permutation, σ_i is randomly generated based on the probability matrix P. The larger α is, the more elements of σ are directly copied from its parent π; in other words, α controls the similarity between the offspring and the parent. In conventional mutation for permutation vectors, several (often two) elements are randomly selected and then swapped, so the probability that a permutation σ is generated from the parent π is entirely determined by the number of positions at which σ and π differ. In contrast, guided mutation mutates π based on the probability matrix P, which characterizes the distribution of promising solutions. It can therefore be expected that the offspring σ falls in, or close to, a promising area of the search space. Meanwhile, the randomness in Step 3 also provides diversity for the search. A Python sketch of this operator is given below.
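The following is one possible Python realization of GuidedMutation, intended purely as an illustration of the steps above; it assumes permutations are represented as 0-indexed lists and the probability matrix as a nested list, and it is not the authors' reference implementation.

```python
import random

def guided_mutation(parent, P, alpha):
    """Mutate permutation `parent` guided by the probability matrix P (n x n).

    Roughly alpha*n positions are copied directly from the parent (Steps 1-2);
    the remaining positions are filled by sampling from P, restricted to the
    values that keep the result a valid permutation (Step 3).
    """
    n = len(parent)
    K = set(random.sample(range(n), int(alpha * n)))    # [alpha*n] positions, rounded down here
    V = [i for i in range(n) if i not in K]             # positions still to be filled
    U = set(range(n)) - {parent[i] for i in K}          # values not yet used
    child = [None] * n
    for i in K:                                         # Step 2: copy parent entries
        child[i] = parent[i]
    while U:                                            # Step 3: fill the remaining positions
        i = random.choice(V)                            # uniformly pick an unfilled position
        values = list(U)
        weights = [P[i][j] for j in values]             # p_ik restricted to the unused values
        total = sum(weights)
        if total > 0:
            k = random.choices(values, weights=weights)[0]
        else:                                           # degenerate row: fall back to a uniform choice
            k = random.choice(values)
        child[i] = k
        U.remove(k)
        V.remove(i)
    return child                                        # Step 4
```

In this sketch positions and values are 0-indexed, whereas the paper uses 1-indexed permutations; the logic is otherwise the same.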
IV. ILS WITH GUIDED MUTATION (ILS/GM)

During its search, ILS visits a number of locally optimal solutions. We propose that statistics of these locally optimal solutions be extracted to build a probability model; a guided mutation using this model can then perturb the current locally optimal solution to generate a new starting solution for LocalSearch. As in Section III, we assume that the search space is Π. In the following, we present the main components of ILS/GM.

A. Initialization and Update of Probability Matrix

We initialize and update the probability matrix P in ILS/GM as follows.

1) Initialization: At the beginning of the search, we set every entry of P to 1/n, i.e., p_ij = 1/n for all 1 ≤ i, j ≤ n.
If there is prior knowledge of the distribution of promising solutions in the search space, P can be initialized with this knowledge to bias the search towards promising areas.

2) Update of Probability Matrix: Assume that the current locally optimal solution is π* = (π*_1, . . . , π*_n) and the current probability matrix is P = (p_ij)_{n×n}. Then at each iteration P is updated as follows:

p_ij = (1 − β) p_ij + β I_ij(π*),  1 ≤ i, j ≤ n,

where I_ij(π*) = 1 if π*_i = j and I_ij(π*) = 0 otherwise, and 0 ≤ β ≤ 1 is the learning rate. The bigger β is, the greater the contribution of π* to the new probability matrix.

B. Perturbation

To perturb the current locally optimal solution π*, ILS/GM applies the guided mutation operator described in Section III to π*, guided by the current probability matrix P. A sketch of the probability-matrix bookkeeping is given below.
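As an illustration, the following Python sketch shows one way the probability matrix could be initialized and updated with learning rate beta; the function names are chosen here for exposition and are not taken from the paper.

```python
def init_probability_matrix(n):
    """Uniform initial model: every entry of P is 1/n."""
    return [[1.0 / n] * n for _ in range(n)]

def update_probability_matrix(P, local_optimum, beta):
    """Move P towards the indicator matrix of the current local optimum pi*.

    p_ij <- (1 - beta) * p_ij + beta * I_ij(pi*), where I_ij(pi*) = 1 iff pi*_i = j.
    """
    n = len(P)
    for i in range(n):
        for j in range(n):
            indicator = 1.0 if local_optimum[i] == j else 0.0
            P[i][j] = (1.0 - beta) * P[i][j] + beta * indicator
    return P
```

With these two helpers and the guided_mutation sketch from Section III, the perturbation step of ILS/GM amounts to calling guided_mutation(local_optimum, P, alpha) after each model update.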
V. THE QUADRATIC ASSIGNMENT PROBLEM (QAP)

The quadratic assignment problem (QAP) [10] can model a variety of applications in scheduling, manufacturing and data analysis. Given I = {1, 2, . . . , n} and two n × n matrices A = (a_ij) and B = (b_kl), the QAP can be stated as follows:

min_{π∈Π} c(π) = Σ_{i=1}^{n} Σ_{j=1}^{n} a_{π_i π_j} b_{ij},

where π is a permutation of I = {1, 2, . . . , n} and Π is the set of all permutations of I, as defined in Section III. In the facility location context, A is the distance matrix, so that a_ij represents the distance between locations i and j; B is the flow matrix, so that b_kl represents the flow between facilities k and l; and π represents an assignment of n facilities to n locations. More specifically, π_i = k means that facility i is assigned to location k. The QAP is one of the most difficult NP-hard combinatorial problems. Solving QAP instances with n > 30 to optimality is computationally impractical for exact algorithms such as branch-and-bound. Therefore, a variety of heuristic algorithms for dealing with large QAP instances have been developed over the past decade, such as tabu search [11], guided local search [12], evolution strategies [13], genetic algorithms [14], ant colony optimization [15], hybrid evolutionary algorithms [16], scatter search [17], and so on.
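For concreteness, a direct (O(n^2)) Python evaluation of this objective might look as follows; the 0-indexed representation and the function name are ours.

```python
def qap_cost(perm, A, B):
    """c(pi) = sum_i sum_j a[perm[i]][perm[j]] * b[i][j] (0-indexed)."""
    n = len(perm)
    return sum(A[perm[i]][perm[j]] * B[i][j] for i in range(n) for j in range(n))
```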
A. Local Search for the QAP

The local search used in this paper is the 2-opt local search [18]. Let π be a solution of the QAP. Its 2-opt neighborhood N(π) is defined as the set of all solutions obtained from π by swapping two of its distinct elements. The 2-opt local search algorithm searches the neighborhood of its current solution for a better solution. If such a solution is found, it replaces the current solution and the search continues; otherwise, a local optimum has been reached. In our experiments, the first better solution found is accepted and used to replace the current solution; in other words, we use the first-improvement principle.

B. The Distribution of the Locally Optimal Solutions of the QAP

The proximate optimality principle (POP) is a basic assumption behind almost all metaheuristics. To verify this principle on QAP instances, we conducted the following experiment in [7]: 500 different locally optimal solutions π^1, . . . , π^500 are generated by applying the 2-opt local search to randomly generated solutions in Π, and the 500 locally optimal solutions are then sorted with respect to their costs in ascending order. For each locally optimal solution π^k, we generate 1000 distinct locally optimal solutions σ_k^1, . . . , σ_k^1000 by applying the 2-opt local search to randomly generated solutions in a neighbourhood of π^k (in our experiments, the set of all solutions differing from π^k in at most 0.1n items). We compute the average cost of σ_k^1, . . . , σ_k^1000 and their average Hamming distance to π^k. Figure 1 plots these average costs and average distances for the QAP test instances nug30 and bur26b. We can make the following observations from Figure 1:

• The average cost of the locally optimal solutions around a better locally optimal solution π^k is lower.
• The better π^k is, the shorter the average distance from σ_k^1, . . . , σ_k^1000 to π^k.

These observations verify the proximate optimality principle on these instances; therefore, it is appropriate to use ILS for dealing with these QAP instances. More experimental results on the distribution of locally optimal solutions can be found in [19].

VI. EXPERIMENTAL STUDIES

We compared ILS/GM with a conventional ILS (CILS) proposed in [6] on four classes of QAP instances from QAPLIB. All experiments were performed on an Athlon machine (1.6 GHz, 1 GB memory, Linux). The algorithms stop after a prefixed CPU time has been reached.

A. Experimental Setting

In both ILS/GM and CILS:
• LocalSearch: the 2-opt local search described in Section 5.1 is used (a sketch is given below).
• AcceptanceCriterion: we use the following criterion: AcceptanceCriterion(π*, π″) = π* if c(π*) < c(π″), and π″ otherwise.
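The following Python sketch illustrates a first-improvement 2-opt (pairwise-swap) local search of the kind described in Section 5.1. For clarity it recomputes the full cost after each tentative swap, whereas an efficient implementation would use incremental (delta) cost evaluation; the cost argument can be, for instance, lambda p: qap_cost(p, A, B) from the earlier sketch.

```python
def two_opt_local_search(perm, cost):
    """First-improvement local search over the pairwise-swap neighborhood."""
    current = list(perm)
    current_cost = cost(current)
    n = len(current)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 1, n):
                current[i], current[j] = current[j], current[i]   # tentatively swap positions i and j
                new_cost = cost(current)
                if new_cost < current_cost:                       # first improvement: keep the swap
                    current_cost = new_cost
                    improved = True
                    break
                current[i], current[j] = current[j], current[i]   # undo the swap
            if improved:
                break
    return current, current_cost
```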
Fig. 1. POP verification for the two QAP instances. The two top plots are for tai80b and the two bottom plots for bur26a. The x-axis gives the order of π^1, . . . , π^500 with respect to their fitness values. The left plots show the average costs of σ_k^1, . . . , σ_k^1000, while the right plots show their average distances to π^k. The continuous lines are interpolation curves.
Both ILS/GM and CILS use a restart strategy: if no better solution has been found in 200 consecutive iterations, we restart the search from a randomly generated solution.

In ILS/GM:
• Perturbation: guided mutation is employed, with α = 0.9, 0.8, 0.7 and 0.65 for Classes 1, 2, 3 and 4, respectively.
• Update of Probability Matrix: β = 0.005, 0.005, 0.01 and 0.01 for Classes 1, 2, 3 and 4, respectively.

The details of the parameter setting in the perturbation operator of CILS can be found in [6]. CILS randomly exchanges k elements of π* to generate π′, where kmin ≤ k ≤ kmax with kmin = 3 and kmax = 0.9n; the value of k is updated as in variable neighborhood search (VNS). A sketch of such a perturbation is given below. CILS uses the same local search, acceptance criterion and restart criterion as ILS/GM.
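For completeness, a random k-exchange perturbation of the kind used in CILS might be sketched as follows; the VNS-style schedule for k between kmin and kmax is omitted, and the function name is ours rather than from [6].

```python
import random

def random_k_exchange(perm, k):
    """Perturb a permutation by randomly re-shuffling the values at k positions."""
    child = list(perm)
    positions = random.sample(range(len(child)), k)   # k distinct positions
    values = [child[i] for i in positions]
    random.shuffle(values)                            # randomly reassign the chosen values
    for pos, val in zip(positions, values):
        child[pos] = val
    return child
```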
B. Experimental Results

ILS/GM and CILS were run independently ten times on each QAP test instance. We recorded avg, the average percentage excess over the best-known solution obtained by each algorithm on each test instance. Table I lists avg and the CPU time used in seconds (indicated by t) for both CILS and ILS/GM on each test instance; it also gives, for each class, the average of the avg values of each algorithm (the avg rows in Table I).

Figure 2 plots the evolution of the average cost of the best solutions versus the number of calls of the local search procedure in both algorithms for the test instance tai40a. It is clear from Table I and Figure 2 that ILS/GM outperforms CILS in solution quality. Noting that ILS/GM differs from CILS only in the perturbation operator, we can claim that guided mutation does improve the performance of ILS. It should also be pointed out that ILS/GM uses different settings of α and β for different classes.

VII. CONCLUSIONS

Guided by a probability model which characterizes the distribution of promising solutions in the search space, guided mutation alters a parent solution to generate a new solution. Guided mutation operators provide a mechanism for combining global statistical information about the search space with the position information of good solutions found during the previous search when generating new trial solutions in heuristics.

Perturbation is one of the major operators in ILS, and conventional mutation is used as the perturbation operator in most implementations of ILS. This paper advocated using guided mutation in ILS. An ILS with guided mutation for the QAP was proposed and compared with a conventional ILS on a set of QAP test instances. The experimental results indicated that the guided mutation operator improves the performance of ILS. We also experimentally studied the distribution of the locally optimal solutions of the QAP and showed that the proximate optimality principle holds for some QAP test instances, which explains to some extent why ILS and guided mutation are suitable for these QAP instances.
TABLE I
THE EXPERIMENTAL RESULTS OF CILS AND ILS/GM ON FOUR CLASSES OF QAP TEST INSTANCES.
(avg: average percentage excess over the best-known solution; t: CPU time in seconds.)

instance    CILS (avg)   ILS/GM (avg)   t
Class 1
tai20a      0.248        0.293          5
tai25a      0.785        0.478          15
tai30a      0.958        0.670          20
tai35a      1.046        0.648          60
tai40a      1.015        0.869          60
tai50a      1.577        1.333          90
tai60a      1.619        1.514          90
tai80a      1.501        1.424          180
tai100a     1.396        1.250          300
tai256c     0.298        0.204          1200
avg         1.127        0.863          -
Class 2
tai20b      0.000        0.000          5
tai25b      0.000        0.000          15
tai30b      0.000        0.000          20
tai35b      0.018        0.000          60
tai40b      0.000        0.000          60
tai50b      0.233        0.017          90
tai60b      0.483        0.042          90
tai80b      0.367        0.060          180
tai100b     0.193        0.127          300
tai150b     0.616        0.433          600
avg         0.182        0.137          -
Class 3
chr25a      2.718        1.916          15
bur26a      0.000        0.000          15
kra30a      0.101        0.022          20
kra30b      0.040        0.028          20
ste36a      0.283        0.243          30
ste36b      0.000        0.000          30
avg         0.524        0.360          -
Class 4
nug30       0.042        0.019          20
sko42       0.074        0.000          60
sko49       0.137        0.176          60
sko56       0.247        0.194          90
sko64       0.174        0.132          90
sko72       0.227        0.218          120
sko81       0.326        0.203          120
sko90       0.270        0.229          180
sko100a     0.284        0.213          300
avg         0.198        0.173          -
One of the major shortcomings of guided mutation is that it has two control parameters to tune. In the future, we will study how to systematically adjust these parameters based on statistical information collected from the search. Another interesting topic is to study the effect of guided mutation in other metaheuristics.

REFERENCES

[1] P. Larrañaga and J. A. Lozano. Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Kluwer Academic Publishers, 2002.
[2] D. E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, Reading, MA, 1989.
[3] J. M. Peña, V. Robles, P. Larrañaga, V. Herves, F. Rosales, and M. S. Pérez. GA-EDA: Hybrid evolutionary algorithm using genetic and estimation of distribution algorithms. In Proceedings of the 17th International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, pages 361–371, Ottawa, Canada, 2004.
Fig. 2. The evolution of the average cost of the best solutions versus the number of calls of the local search procedure in both algorithms for the test instance tai40a.
[4] Q. Zhang, J. Sun, and E. P. K. Tsang. Evolutionary algorithm with the guided mutation for the maximum clique problem. IEEE Transactions on Evolutionary Computation, 9(2):192–200, 2005.
[5] Q. Zhang, J. Sun, E. P. K. Tsang, and J. A. Ford. Combination of guided local search and estimation of distribution algorithm for solving quadratic assignment problem. In Proc. of the Bird of a Feather Workshops, Genetic and Evolutionary Computation Conference, pages 42–48, 2004.
[6] T. Stützle. Iterated local search for the quadratic assignment problem. Technical Report AIDA-99-03, Darmstadt University of Technology, Computer Science Department, Intellectics Group, 1999.
[7] Q. Zhang, J. Sun, E. Tsang, and J. Ford. Estimation of distribution algorithm with 2-opt local search for the quadratic assignment problem. 2006.
[8] R. Battiti and M. Protasi. Approximate algorithms and heuristics for MAX-SAT. In Handbook of Combinatorial Optimization, volume 1, pages 77–148. Kluwer Academic Publishers, 1998.
[9] F. Glover and M. Laguna. Tabu Search. Kluwer Academic Publishers, 1998.
[10] E. Çela. The Quadratic Assignment Problem: Theory and Algorithms. Kluwer Academic Publishers, 1998.
[11] É. D. Taillard. Robust taboo search for the quadratic assignment problem. Parallel Computing, 17:443–455, 1991.
[12] P. Mills and E. P. K. Tsang. Guided local search for solving SAT and weighted MAX-SAT problems. Journal of Automated Reasoning, Special Issue on Satisfiability Problems, 24:205–223, 2000.
[13] V. Nissen. Solving the quadratic assignment problem with clues from nature. IEEE Transactions on Neural Networks, 5(1):66–72, 1994.
[14] D. M. Tate and A. E. Smith. A genetic approach to the quadratic assignment problem. Computers and Operations Research, 22(1):73–83, 1995.
[15] L. Gambardella, É. Taillard, and M. Dorigo. Ant colonies for the QAP. Journal of the Operational Research Society, 50:167–176, 1999.
[16] P. Merz and B. Freisleben. Fitness landscape analysis and memetic algorithms for the quadratic assignment problem. IEEE Transactions on Evolutionary Computation, 4(4):337–352, 2000.
[17] V.-D. Cung, T. Mautor, P. Michelon, and A. Tavares. A scatter search based approach for the quadratic assignment problem. In T. Bäck, Z. Michalewicz, and X. Yao, editors, Proc. of the 1997 IEEE International Conference on Evolutionary Computation (ICEC), pages 165–170. IEEE Press, 1997.
[18] E. S. Buffa, G. C. Armour, and T. E. Vollmann. Allocating facilities with CRAFT. Harvard Business Review, pages 136–158, March 1964.
[19] J. Sun. Hybrid Estimation of Distribution Algorithms for Optimization Problems. PhD thesis, Department of Computer Science, University of Essex, 2005.