Instance generator for the quadratic assignment problem with additively decomposable cost function

Madalina M. Drugan
Department of Information and Computing Sciences, Utrecht University, PO Box 80.089, 3508 TB, Utrecht, The Netherlands
Artificial Intelligence Lab, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussels, Belgium
Email: [email protected]

Abstract—Quadratic assignment problems (QAPs) are NP-hard combinatorial optimization problems often used to compare the performance of meta-heuristics. In this paper, we propose a QAP problem instance generator whose instances can be used as a benchmark for meta-heuristics. Our generator aggregates various QAP instances into a larger QAP instance in order to obtain a large-size, additively decomposable QAP. We assume that a QAP instance which is difficult for local search has many local optima from which local search needs to escape. We call the resulting QAPs composite QAP instances (cQAPs). We use numerical and empirical techniques for the landscape analysis of the generated composite QAPs. The comparison of our QAP instances with other QAPs from the literature classifies cQAPs as difficult. We show that heuristic algorithms that exploit the particularities of the cQAP search space, like iterated local search, can outperform other heuristics that do not consider the structure of this search space, like multi-restart local search.

I. INTRODUCTION

The Quadratic assignment problem (QAP) models many real-world problems like computer aided design in the electronics industry, scheduling, vehicle routing, etc. Recent extensive reviews on QAP instances are given in [1], [6]. Intuitively, QAPs can be described as the (optimal) assignment of a number of facilities to a number of locations. Let us consider N facilities, a set Π(N ) of all permutations of {1, . . . , N } and the N × N distance matrix A = (aij ), where aij is the distance between location i and location j. We assume a flow matrix B = (bij ) where bij represents the flow from facility i to facility j. The goal is to minimize the cost function c(π) =

Σ_{i=1}^{N} Σ_{j=1}^{N} a_ij · b_{π(i)π(j)}    (1)

where π(·) is a permutation. It takes quadratic time to evaluate this function. We consider a QAP as a tuple (A, B, s) where s, if known, is the optimal solution. If the optimal solution is not known, we denote the QAP simply with (A, B). Here, we consider QAP instances that were initially used in [13]. We pose three requirements on the flow and distance matrices. The distance and the flow between the same facility are 0: ∀i, a_ii = 0 and b_ii = 0. The distance and flow matrices are symmetrical: ∀i, j, a_ij = a_ji and b_ij = b_ji. All the elements of the distance and flow matrices are non-negative natural numbers within a range: ∀i, j, 0 ≤ a_ij ∈ IN and 0 ≤ b_ij ∈ IN. Related work. Drezner et al. [2] propose instances of QAPs that are difficult to solve with some heuristics, but

easy to solve with exact methods. The QAP instances proposed in [2] are simplified versions of instances from [10], where the underlying grid is very sparse and the distribution generating the positions of facilities on this grid is random. The goal is to generate difficult QAPs with sparse flow matrices for heuristics. The QAP instance generator from [2] splits the facilities into regions, or clusters. Because of their sparseness, these QAP instances are easy to solve with some exact methods [8], even with about 90 facilities, and difficult for the heuristics that use the popular two-facilities exchange operator [9]. Taillard [13] analysed the performance of some heuristics on several QAPs from the literature. He points out that, even though generating uniformly randomly distributed values for the distance and flow matrices is a popular choice, the resulting QAPs are not interesting for heuristics: their exact optimum is difficult to find, but the difference between some easy-to-find local optima and the global optimum value is small. Moreover, the difference in cost function between the best and the worst cost values approaches 0 as the number of facilities approaches ∞. This behaviour is explained by the lack of correlation between local optima. Taillard created a larger variation between the worst and the best local optimum using non-uniform random elements for the distance and flow matrices. He concluded that large-size QAPs with interesting structure to exploit by heuristic search algorithms are difficult to generate by a problem generator. Local search algorithms. Intuitively, local search [12] starts from an initial solution and iteratively generates new solutions using a neighbourhood strategy. At each step, a solution that improves over the existing best-so-far solution is chosen. The algorithm stops when no improvement is possible.
Best improvement explores all the individuals in the neighbourhood of a solution and selects the best solution that improves over the initial solution and all the other visited solutions. Because LS can get stuck in local optima, the multi-restart local search algorithm restarts LS multiple times, each time from a uniformly randomly chosen initial solution. QAPs are permutation minimization problems. A suitable neighbourhood operator for QAPs is the 2-exchange swapping operator, which swaps the positions of two different facilities. This operator is attractive because the change in the cost function can be computed in linear time, on the condition that the matrices A and B are symmetrical. The size of the neighbourhood increases quadratically with the number of facilities.
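To make the 2-exchange neighbourhood concrete, the following is a minimal Python sketch, not the paper's code: the cost function of Eq. (1), a linear-time move evaluation valid for symmetric matrices with zero diagonals (the standard delta formula from the QAP local-search literature), and a best-improvement local search. The function names are our own.

```python
def qap_cost(A, B, perm):
    """Full cost c(pi) = sum_ij a_ij * b_{pi(i)pi(j)}; O(N^2) to evaluate."""
    n = len(perm)
    return sum(A[i][j] * B[perm[i]][perm[j]]
               for i in range(n) for j in range(n))

def swap_delta(A, B, perm, r, s):
    """Cost change of swapping the facilities at positions r and s, in O(N).
    Valid only when A and B are symmetric with zero diagonals."""
    pr, ps = perm[r], perm[s]
    return 2 * sum((A[k][r] - A[k][s]) * (B[perm[k]][ps] - B[perm[k]][pr])
                   for k in range(len(perm)) if k != r and k != s)

def best_improvement_ls(A, B, perm):
    """2-exchange best-improvement local search; returns a local optimum."""
    perm = list(perm)
    n = len(perm)
    improved = True
    while improved:
        improved = False
        best_delta, best_move = 0, None
        # scan the whole quadratic-size neighbourhood
        for r in range(n):
            for s in range(r + 1, n):
                d = swap_delta(A, B, perm, r, s)
                if d < best_delta:
                    best_delta, best_move = d, (r, s)
        if best_move:
            r, s = best_move
            perm[r], perm[s] = perm[s], perm[r]
            improved = True
    return perm
```

The O(N) delta is what makes the 2-exchange operator attractive: one full neighbourhood scan costs O(N^3) instead of O(N^4) with naive re-evaluation.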

There are certain limitations in the design of multi-restart LS: because it is basically random sampling in the space of local optima, it does not scale up for a large number of local optima. Iterated local search algorithms restart the search from solutions obtained by mutating local optimal solutions, in order to restart LS from good regions of the search space. In permutation problems like QAPs, the mutation operator interchanges facilities between different positions. When LS uses the 2-exchange operator to generate a neighbourhood, the mutation operator should exchange at least 3 facilities to escape from the region of attraction of a local optimum. The m-exchange mutation uniformly randomly selects m distinct pairs of positions, which are sequentially exchanged. The stopping criterion for the iterated LS is chosen to fairly compare its performance to the multi-restart LS: the search in the iterated LS is halted when it reaches the same number of swaps as the multi-restart LS. The distance between two solutions is defined as the minimum number of exchanges necessary to obtain one solution from the other. The distance between a solution and its m-exchange solution is m − 1. Counting the number of swaps is equivalent to counting the number of function evaluations. The main contributions of this paper. Our goal is to design a QAP instance generator that creates useful benchmark instances for heuristics like iterated LS, which exploit the particularities of the search space. Our solution is to aggregate QAPs with computable optimal solutions into a larger QAP and, afterwards, to fill the rest of the positions in the composite QAP with specific values. Note that there are two optimization problems corresponding to the two regions: i) the region of the component QAPs, or the composite region, and ii) the region outside these component QAPs, or the outside region. This class of QAP instances is denoted as composite QAPs.
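The iterated LS scheme described above, with the m-exchange mutation and a swap-count budget, can be sketched as follows. This is our own simplification: the cost function and the local search are injected as callables, and the budget bookkeeping (one LS sweep counted as one neighbourhood of N(N−1)/2 swaps) is a crude stand-in for the paper's exact accounting.

```python
def m_exchange(perm, m, rng):
    """m-exchange mutation: sequentially swap m uniformly chosen pairs of
    positions (simplification: the pairs are drawn independently)."""
    perm = list(perm)
    for _ in range(m):
        r, s = rng.sample(range(len(perm)), 2)
        perm[r], perm[s] = perm[s], perm[r]
    return perm

def iterated_ls(cost, local_search, n, swap_budget, m_range, rng):
    """Iterated LS: restart LS from mutated local optima until the swap
    budget (the fair-comparison stopping criterion) is exhausted."""
    current = local_search(list(rng.sample(range(n), n)))
    best, swaps = current, 0
    while swaps < swap_budget:
        m = rng.choice(m_range)                    # exchange rate for this restart
        candidate = local_search(m_exchange(current, m, rng))
        swaps += n * (n - 1) // 2                  # crude: one neighbourhood per LS call
        if cost(candidate) <= cost(current):
            current = candidate                    # accept equal-or-better local optima
        if cost(current) < cost(best):
            best = current
    return best
```

Multi-restart LS is recovered by replacing the mutated restart point with a fresh uniformly random permutation.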
We show that a composite QAP instance has an additively decomposable cost function that can be expressed as the sum of the component QAPs' costs plus two extra terms corresponding to: i) the overlapping parts of the component QAPs, and ii) the region outside the component QAPs. Problems with additively decomposable cost functions are considered a useful test benchmark for meta-heuristic algorithms [11]. Approximating the number and the size of basins of attraction is a good method for landscape analysis [5]. The larger the number and the size of local optima, the more probable it becomes that local search gets stuck before finding the global optimum. We show that even small-size (composite) QAPs have a lot of local optima and that their number increases with the number of facilities. Due to the possibly huge number of local optima, we propose an alternative definition and an empirical approximation of the distribution of basins of attraction. We use iterated LS, which restarts local search from correlated solutions, as opposed to multi-restart LS, which restarts local search from random solutions, to connect the dynamical exploitation (the number of improvements) and exploration (the probability of escaping local optima) of the landscape with the search efficiency of iterated LS [4]. We show that iterated LS performs better on cQAPs than multi-restart local search because it exploits the structure of the search space. Outline. Section II introduces an algorithm that generates

Algorithm 1: generate cQAP
Require: d QAP matrices (A^k, B^k, I), 1 ≤ k ≤ d
  Calculate N
  Initialize A and B with 0s everywhere
  (A^C, B^C) ← assemble({(A^1, B^1, I), . . . , (A^d, B^d, I)}, L)
  (A^O, B^O) ← fillingUp(D_A^H, D_A^L, D_B^H, D_B^L, h)
  A ← A^C + A^O; B ← B^C + B^O
  return (A, B)

composite QAP instances. Section III describes landscape analysis on several QAP instances. Section IV shows that iterated local search can outperform multi-restart local search on cQAPs. Section V concludes this paper.

II. A COMPOSITE QAP INSTANCE GENERATOR

In this section, we design an algorithm that generates QAPs with desired properties: i) large size, ii) many local optima, and iii) a structure of the search space that can be exploited with genetic-like operators. In Algorithm 1, the pseudo-code for generating composite QAPs, there are two function calls. The input is a set of d small-size QAP instances that have the identity permutation I as optimal solution, where ∀i ∈ {1, . . . , N}, I(i) = i; these are aggregated into a larger-size QAP. In general, the exact solution for a QAP is different from the identity permutation. A straightforward method to transform a component QAP with an optimal solution s into a QAP with the identity permutation as optimal solution is to rename the facilities. The function assemble aggregates the input component QAPs. In order to keep track of the assigned positions in the matrices of the composite QAP, a mask C, called the composite mask, is considered, where c_ij = 1 if a_ij and b_ij were assigned by the assemble algorithm, and c_ij = 0 otherwise. The elements of A assigned in assemble are denoted with a masked distance matrix A^C, where a^C_ij = a_ij if c_ij = 1, and a^C_ij is otherwise undefined. Similarly, the elements of B assigned by assemble are denoted with a masked flow matrix B^C. The corresponding pseudo-code is given in Algorithm 2 and explained in Section II-A. The function fillingUp assigns the positions in the cQAP's matrices that were not assigned by assemble. Thus, a position (i, j) in A and B is assigned by fillingUp if and only if c_ij = 0. We also consider an outside-region mask matrix O = ¬C. The elements of A assigned in fillingUp are denoted with a masked matrix A^O, and the elements of B assigned in fillingUp are denoted with B^O. fillingUp ensures that this outside region has the identity permutation as its optimal solution. The corresponding pseudo-code is given in Algorithm 3 and described in Section II-B. Section II-C discusses some aspects of the generated QAP instances.
Note that the resulting cQAP does not, in general, have a known optimal solution; deriving conditions under which it does is outside the scope of this paper.

A. Aggregate QAPs

Let L be an allocation strategy, or function, that assigns facilities from each component QAP (A^k, B^k, I) to the facilities

Algorithm 2: assemble({(A^1, B^1, I), . . . , (A^d, B^d, I)}, L)
for k = 1 to d do
  for i = 1 to n_k do
    t ← L_k(i)
    for j = i + 1 to n_k do
      p ← L_k(j)
      a^C_tp ← a^C_tp + a^k_ij; a^C_pt ← a^C_tp
      b^C_tp ← b^C_tp + b^k_ij; b^C_pt ← b^C_tp
      c_tp ← 1; c_pt ← 1
    end for
  end for
end for
return (A^C, B^C)
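Algorithm 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the components are passed without their optimal solutions, the allocation strategy L is given as a list of index lists (L[k][i] is the cQAP facility for facility i of component k), and the function also returns the composite mask C.

```python
def assemble(components, L):
    """Aggregate d component QAPs into the composite matrices A^C, B^C.

    components: list of (Ak, Bk), symmetric nk x nk matrices, each assumed
    to have the identity permutation as optimal solution.
    L: allocation strategy mapping component facilities to cQAP facilities.
    """
    N = 1 + max(max(row) for row in L)        # number of cQAP facilities
    AC = [[0] * N for _ in range(N)]
    BC = [[0] * N for _ in range(N)]
    C = [[0] * N for _ in range(N)]           # composite mask
    for k, (Ak, Bk) in enumerate(components):
        nk = len(Ak)
        for i in range(nk):
            t = L[k][i]
            for j in range(i + 1, nk):
                p = L[k][j]
                # accumulate (overlaps between components are allowed)
                AC[t][p] += Ak[i][j]; AC[p][t] = AC[t][p]
                BC[t][p] += Bk[i][j]; BC[p][t] = BC[t][p]
                C[t][p] = C[p][t] = 1
    return AC, BC, C
```

With adjacent, non-overlapping allocations, A^C and B^C become block-diagonal copies of the component matrices.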

of the composite QAP. We denote with j the facility from the cQAP that corresponds to the facility i from the component k, j ← L_k(i). To each facility of the cQAP corresponds at least one facility of a component QAP. To each facility in a component QAP is assigned a unique facility in the cQAP. The union of all assignment functions is ∪_{k=1}^d L_k(·) = {1, . . . , N}. Note that overlaps between facilities of different component QAPs are allowed. For simplicity, we consider that the facilities from component QAPs are adjacent in the composite QAP. In Algorithm 2, to each pair of facilities (i, j) from the k-th component QAP there is assigned a pair of facilities (t, p) in the resulting A^C. We update the values a^C_tp ∈ A^C and b^C_tp ∈ B^C with the corresponding values a^k_ij ∈ A^k and b^k_ij ∈ B^k: a^C_tp ← a^C_tp + a^k_ij and b^C_tp ← b^C_tp + b^k_ij, respectively. Since A^C and B^C should be symmetrical, we also set a^C_pt ← a^C_tp and b^C_pt ← b^C_tp. The mask C is updated to 1 for the pair of indices t and p: c_tp ← 1 and c_pt ← 1. The returned, yet incomplete, cQAP is added to the output of Algorithm 3.

B. Filling up the cQAP

This algorithm guarantees that the elements in A^O and B^O obey the rearrangement inequality [14]^1. Informally, the values in A^O and B^O are either larger or smaller than the values in A^C and B^C. Thus, the largest values in B^O correspond to the lowest values in A^O, and the lowest values in B^O correspond to the largest values in A^O. Consider that the distribution D_A^L samples the lowest values of A^O and the distribution D_B^L samples the lowest values of B^O, while the distribution D_A^H samples the highest values of A^O and the distribution D_B^H samples the highest values of B^O. We generate h values in A^O from D_A^H and 1 − h from D_A^L. Because of the rearrangement inequality, h values in B^O are generated from D_B^L and 1 − h are generated from D_B^H.

Algorithm 3: fillingUp(D_A^H, D_A^L, D_B^H, D_B^L, h)

S_A ← ∅ and S_B ← ∅, two ordered sets
t ← 0
for i, j = 1 to N, and c_ij = 0 do
  if U(0, 1) < h then
    Generate a^O_ij ∝ D_A^H and b′_t ∝ D_B^L
  else
    Generate a^O_ij ∝ D_A^L and b′_t ∝ D_B^H
  end if
  S_A ← S_A ∪ a^O_ij and S_B ← S_B ∪ b′_t
  t ← t + 1
end for
for each a^O_ij in S_A do
  r ← rank of a^O_ij in S_A
  b^O_ij ← the b′ with rank |S_A| − r in S_B
end for
return (A^O, B^O)

Algorithm 3 presents the pseudo-code for generating the elements in A^O and B^O. fillingUp has as input parameters the set of distributions for the outside region, D_A^H, D_A^L, D_B^H, D_B^L, and the percentage h of values in A^O generated from D_A^H.
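A minimal Python sketch of the filling-up step, under stated assumptions: the four distributions are passed as zero-argument samplers, the mask C marks already-assigned positions, and the function name filling_up and its signature are our own. The rank matching pairs the r-th smallest A-value with the r-th largest B-value, as the rearrangement inequality requires.

```python
def filling_up(N, C, DA_H, DA_L, DB_H, DB_L, h, rng):
    """Fill the outside region (positions i < j with c_ij = 0) of a cQAP."""
    AO = [[0] * N for _ in range(N)]
    BO = [[0] * N for _ in range(N)]
    positions = [(i, j) for i in range(N)
                 for j in range(i + 1, N) if not C[i][j]]
    a_vals, b_vals = [], []
    for _ in positions:
        if rng.random() < h:                  # with probability h: high A, low B
            a_vals.append(DA_H()); b_vals.append(DB_L())
        else:                                 # otherwise: low A, high B
            a_vals.append(DA_L()); b_vals.append(DB_H())
    order = sorted(range(len(a_vals)), key=lambda t: a_vals[t])
    b_sorted = sorted(b_vals, reverse=True)
    for rank, t in enumerate(order):
        i, j = positions[t]
        AO[i][j] = AO[j][i] = a_vals[t]
        BO[i][j] = BO[j][i] = b_sorted[rank]  # largest a pairs with smallest b
    return AO, BO
```

The injected samplers make it easy to reproduce the paper's bounds, e.g. uniform integers on [m^L, M^L] and [m^H, M^H].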

Let S_A and S_B be ordered sets containing all the values from A^O and B^O, respectively. Let r be the rank of a^O_ij in S_A, where c_ij = 0. If a^O_ij is generated from D_A^H, then b^O_ij is generated from D_B^L such that the rank of b^O_ij in S_B is |S_A| − r. Similarly, if a^O_ij is generated from D_A^L, then b^O_ij is generated from D_B^H such that the rank of b^O_ij in S_B is |S_A| − r.

C. Discussion

Additively decomposable cost function for cQAPs. Consider the set of all permutations of the N facilities in the flow matrix. In permutation group theory, permutations are often written in cyclic form. If π is a permutation of facilities, we can write it as π = (π^1, . . . , π^d), where π^k is a cycle containing a set of facilities that can be swapped with each other. These cycles are disjoint subsets. We consider d cycles, each cycle containing the facilities of exactly one component QAP. If there are n_k facilities in the k-th component QAP, the corresponding cycle is an n_k-cycle. The cycle of a component QAP k has the cost function

c^k(π^k) = Σ_{i=1}^{n_k} Σ_{j=1}^{n_k} a^k_ij · b^k_{π^k_i π^k_j}

where k ∈ {1, . . . , d}, d is the number of component QAPs, π^k is a permutation of the facilities of the k-th component QAP, and π^k_i is the i-th value of the permutation π^k. By design, the optimum cost for each cycle is c^k(I) = min_{π^k} c^k(π^k). The cost function of π on the cQAP is

c(π) = Σ_{i=1}^N Σ_{j=1}^N a_ij · b_{π_i π_j} = Σ_{k=1}^d c^k(π^k) + Residue_C(π) + Residue_O(π)    (2)

^1 Let n variables be generated with any two distributions, {x_1, . . . , x_n} and {y_1, . . . , y_n}, for which x_1 ≤ . . . ≤ x_n and y_1 ≥ . . . ≥ y_n. The rearrangement inequality states that Σ_{i=1}^n x_i · y_i ≤ Σ_{i=1}^n x_i · y_{π(i)}, for all permutations π.

where Residue_O(π) = Σ_{c_ij = 0} a_ij · b_{π_i π_j} is the sum of costs over the elements in the outside region, and Residue_C(π) is the sum over all overlapping positions of the overlapping component

QAPs and depends on the overlapping pattern. Thus, we consider that the proposed QAPs have additively decomposable cost functions with two residual terms representing the costs of the overlapping parts and of the outside region, respectively. Easy vs hard cases of QAP instances. Cela [1] showed that QAP instances where all the elements obey the rearrangement inequality are easy. When the component QAPs are degenerate, n_1 = . . . = n_d = 1, the cQAP becomes the "easy" QAP of Cela [1]. Let's generate the elements in the component QAPs using uniform random distributions. According to Taillard [13], QAP instances generated with uniform random distributions are difficult because their global optimum is difficult to find. However, there are small differences in cost values between the global optimum and a local optimum. Furthermore, as the number of facilities goes to ∞, this difference in cost function goes to 0. We consider the composite region defined by C to be the difficult region of the generated cQAP. The outside region defined by O is the easy region of the cQAP. By design, both the outside region and all the component QAPs are optimized by the identity permutation, but the composite QAP's optimal solution is not, in general, the identity permutation. Parameter setting for cQAP instances. Component QAPs are independently generated with a uniform random distribution D, the same for all the components. Let D^L and D^H be independent uniform distributions, and let the values of D^L and D^H be uniformly randomly located in A^O and B^O. Note that even though the component QAPs and the elements in the outside region are generated with uniform random distributions, the cQAP is not generated with a uniform random distribution. To compare cQAP instances with the uniformly randomly generated QAPs of Taillard [13], let the bounds of the uniform distributions that generate the flow and the distance matrices be the same.
Consider the following setting: i) m = 20 and M = 30, ii) m^H = 90 and M^H = 99, and iii) m^L = 1 and M^L = 10. Let n = 8 be the number of facilities in the component QAPs, where d ≥ 2. Let's consider the percentage of high values in the outside region of the distance matrix, h. We consider h = 0.5 to increase the variance in the outside region. This variance is larger than when there are only large or small elements in it, h = 0 or h = 1.0. To compare our generator with the generator of Drezner et al. [2], we also consider a certain amount of 0's in D^L, which we denote with z. We consider three values for z, z ∈ {0, 0.5, 1.0}. z = 0 means that there are no 0's in the outside region, whereas z = 1.0 means that there are only 0's in D^L. For the same h, the variance of an outside region with lots of 0s is smaller than the variance of an outside region with no 0s.
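The zero-inflated low distribution D^L with parameter z can be sketched as a small sampler factory; the helper name make_DL and its parameter names are our own.

```python
import random

def make_DL(m_low, M_low, z, rng):
    """Sampler for D^L: with probability z return 0, otherwise a uniform
    integer in [m_low, M_low] (the bounds m^L, M^L of the text)."""
    def sample():
        return 0 if rng.random() < z else rng.randint(m_low, M_low)
    return sample
```

For z = 0 the outside region contains no zeros; for z = 1.0, D^L produces only zeros, mimicking the sparseness of Drezner-style instances.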

III. LANDSCAPE ANALYSIS WITH MULTI-RESTART LOCAL SEARCH

The basin of attraction [5] of a local optimum X is the set of independent, uniformly distributed (iid) restarting points from which local search converges to X. As performance

indicator, we count the number of times the optimal solution is found. Section III-A shows that this definition of the basin of attraction is not informative even for small-size QAPs because of the large number of basins of attraction in their landscape. In Section III-B, we propose an alternative definition of the basin of attraction that reduces the number of basins of attraction by counting as one the local optima with the same value. In Section III-C, we use empirical measures to approximate the size of basins of attraction. We compare the basins of attraction of three types of QAPs: i) Taillard [13], ii) Drezner [2], and iii) our cQAPs with the parameters recommended in the previous section. All three methods generate QAPs of various sizes using uniformly randomly generated numbers distributed with some, simple or rather elaborate, strategy in the flow and distance matrices. Taillard's QAPs have matrices with uniformly randomly distributed elements. Drezner's QAPs represent grids with uniformly randomly generated points and a rather small number of connections in the grid. The component QAPs and the elements in the outside region of cQAPs are also uniformly randomly generated. Therefore, for all these QAPs, we assume a random configuration of basins of attraction.

A. Basins of attraction

To approximate the number of basins of attraction, B, we count the frequency of their occurrence: β_j is the number of local optima that are reached exactly j times. Like in [5], we assume that the distribution of β_j follows a family of parametrized distributions Law_γ. The expected values β_{j,γ} = IE_γ[β_j] are computed using

β_{j,γ} = L · [Γ(j + γ) / (j! · Γ(γ))] · [Γ(L · γ) / Γ((L − 1) · γ)] · [M! / (M − j)!] · [Γ((L − 1) · γ + M − j) / Γ(L · γ + M)]

where L is the number of basins of attraction visited so far, and M is the number of uniformly randomly sampled solutions used to restart LS. To simplify this equation, we assume that γ is a positive integer, so that Γ(γ) = (γ − 1)!. After re-grouping the terms of the above equation, we have

β_{j,γ} = L · C(M, j) / C(L + M, j + 1)

where C(M, j) = M! / (j! · (M − j)!). The β_{j,γ} are used to calculate B, where B is the solution of the equation

Σ_{j=1}^∞ β_{j,γ} = B · [1 − (1 + M / (γ · B))^{−γ}]

Here, γ ∈ {1, 2, 3} is chosen to minimize the difference between β_j and β_{j,γ} in the χ² test, Σ_j (β_j − β_{j,γ})² / β_{j,γ}. The probability that every basin of attraction contains at least one sampled point approaches p_{B,M} = exp(−a^{−1}) for L → ∞, where a is the solution of M = ⌊L² · a⌋. Some experimental results. For this experiment, we restart the local search M = 10^5 times. In Figure 1 on the top,

Fig. 1. On the top: The estimated number of basins of attraction B versus the visited basins of attraction L for M ∈ (0, 10^5] on the three types of QAPs: a) composite QAP, cqapN with N = {12, . . . , 56}, b) Taillard [13], taiN a with N = {12, . . . , 50}, and c) Drezner [2], dreN with N = {15, . . . , 56}. d) The probabilities that all local optima are visited, p_{B,M}. On the bottom: The estimated, B_I, vs the visited, L_I, number of indecisive basins of attraction for M = 10^5 on the three types of QAPs: e) cqapN, f) taiN a and g) dreN. h) The probabilities that all the indecisive basins of attraction are visited, p_{B_I,M}.


we compare the estimated number of basins of attraction, B, versus the number of already visited basins of attraction, L, for the three QAP instances. We set M = 10^5 and γ ∈ {1, 2, 3}. For larger N > 20, for all the QAP instances, the values of the estimated, B, and visited, L, basins of attraction are about the same, 10^5. The covering probability p_{B,M} for the three QAPs is basically 0. Thus, even for a small number of facilities, QAP instances are difficult, with lots of local optima, where the restarting points are almost always located in another basin of attraction. We conclude that this definition of the basin of attraction is not very informative for the prediction of the size and the number of local optima of QAPs.

B. Fitness indecisive basins of attraction

The number of basins of attraction is reduced using the observation that LS cannot discriminate between solutions that have equal fitness values but different representations. These solutions, with the same value, may or may not belong to the same plateau and, in general, they are difficult to differentiate and to store. A fitness indecisive set of basins of attraction, in short an indecisive basin of attraction, is the set of points for which the local optimal solutions obtained by restarting LS from these points have the same cost (or fitness value). The number and the size of indecisive basins of attraction are computed with the same algorithm as the regular basins of attraction, with the logical difference that now the equally valued local optima are counted as one indecisive basin of attraction. We denote with B_I the number of estimated indecisive basins of attraction, with L_I the number of visited indecisive basins of attraction, and with p_{B_I,M} the coverage probability of B_I.
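Counting the β_j frequencies and the indecisive basins from the local optima returned by M restarts is straightforward bookkeeping; a small helper, with names of our own choosing, could look like this:

```python
from collections import Counter

def basin_frequencies(local_optima):
    """beta_j: the number of distinct local optima reached exactly j times.
    local_optima holds one (hashable) solution per LS restart."""
    hits = Counter(local_optima)      # how often each optimum was reached
    beta = Counter(hits.values())     # beta[j] = number of optima seen j times
    return dict(beta)

def indecisive_basins(local_optima, cost):
    """Count indecisive basins: local optima with equal cost are one basin."""
    return len({cost(x) for x in local_optima})
```

The number of visited basins L is sum(beta.values()), while L_I is given by indecisive_basins; comparing the two shows how many basins collapse onto the same fitness value.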

15 20 25 30 35 40 45 50 55 # facilities, N

(g)

15 20 25 30 35 40 45 50 55 # facilities, N

(h)

Experimental results. On the bottom of Figure 1, the number of approximated indecisive basins of attraction, B_I, versus the number of already visited indecisive basins of attraction, L_I, are compared for the three QAPs. Note that even for the small-size QAPs cqap16 and tai15a there are lots of indecisive basins of attraction, L_I ≈ 10^4, which is about M/10; the coverage probability is again about 0. We deduce that M = 10^5 LS restarts are not enough for landscape analysis of composite and Taillard's QAP instances. Note that for the small-size Drezner QAP, dre15, the coverage probability p_{B_I,M} = 0.6 is large and B_I is close to L_I, meaning that M = 10^5 is a reasonable number of restarts. Furthermore, dreN has non-zero coverage probabilities p_{B_I,M} > 0.2 even for medium-size QAPs, N ≤ 56, and the lowest B_I. Comparing these results with the results for the standard basins of attraction, i.e. B, L and p_{B,M}, we deduce that Drezner's QAPs have lots of basins of attraction with the same value of the local optimum, which can be grouped into fewer indecisive basins of attraction. In conclusion, for the same number of facilities N, cqapN and taiN a are more difficult than dreN.

C. Empirical analysis of basins of attraction

The number of neighbourhood function calls in a LS, #N, is an empirical approximation of the size of the basins of attraction. The number of swaps generated in a LS run is the number of neighbourhood calls multiplied by the number of solutions in a neighbourhood, #N · C(N, 2). This measure gives an approximation of the time necessary for a LS to converge to a local optimum. Figure 2 b) presents the average number of neighbourhood calls for the three types of QAPs. Note that all QAP instances with the same number of facilities have about the same number


Fig. 3. The dynamical properties of 36 iterated LSs on the test QAP cqap56. Each iterated LS generates restarting solutions with one of the exchange rates m ∈ {3, . . . , 39} for the mutation operator: a) the escape probability, b) the success probability, c) the improvement probability. d) The average number of times LS is restarted for each of the 36 iterated LSs.


Fig. 2. a) The number of calls of the neighbourhood function per local search run. b) The number of solutions visited during a local search run. c) The number of times a solution with the best value in all runs is found. Each of the three tests is run on the three types of QAPs: cqapN, dreN, and taiN.


of neighbourhood calls, and the number of visited solutions during a LS run, Figure 2 a), is also about the same. In Figure 2 b), if N increases, then the number of times the neighbourhood function is called also increases. This is reflected also in Figure 2 a), where the number of visited solutions during a LS run increases when N increases. The percentage of times LS's local optimum is equal to the best value found in all runs, Figure 2 c), is a good indicator of the performance of the multi-restart LS algorithm. The identity permutation is a solution for the dreN type of QAPs, whereas for Taillard's QAPs, taiN a (or simply taiN), the (near-)optimal solutions are known from QAPLIB's home-page. From M = 10^5 LS runs, for N > 12, the multi-restart LS almost never finds the solutions with the best value. Unlike for cqapN and taiN a, for all the instances dreN, where N ≤ 56, the multi-restart LS finds the optimal solution within M = 10^5 LS runs. As expected, the larger the number of facilities in dreN, the lower the number of times multi-restart LS finds the best-so-far solution. We conclude that the tested cQAPs are more difficult than Drezner's QAP instances, which have lots of local optima with the same value.

IV. LANDSCAPE ANALYSIS WITH ITERATED LOCAL SEARCH

In this section, we show that the cQAP instances can be well explored with iterated LS [7], [9] because of the structure in the search space generated by the component QAPs. In Section IV-A, we investigate how to design an iterated LS algorithm that has a good balance between exploration and exploitation, using the method proposed in [3], [4], adapted for single-objective search spaces. Section IV-B shows why iterated LS outperforms multi-restart LS on the cQAP instances.


A. Exploration/exploitation properties of iterated LS

The biggest advantage of multi-restart LS over iterated LS is that there are no parameters to set. For iterated LS, a mutation rate, or a set of mutation rates, needs to be set, and the robustness of iterated LS's behaviour depends on the used (combinations of) mutations. In our analysis, we measure the exploitation and exploration characteristics of iterated LS, that is, the success, improvement and escape probabilities, like in [4]. The escape probability is the probability that the solution after mutation does not belong to the basin of attraction of its parent,

p_escape = #escapes / #restarts

where #restarts is the number of times LS is restarted in a run. The success probability is the probability that the new starting solution escapes from the basin of attraction and the local optimum generated by LS performs at least as well as the current solution,

p_success = #success / #restarts

The improvement probability is the probability that, if the restart escapes from the basin of attraction of its parent, the local optimum output by the restarted LS is at least as good as the current solution,

p_improv = #success / #escapes

Note that p_improv = p_success / p_escape. All iterated LSs are run for an equal number of swaps, 5412690, which is equivalent to restarting LS 100 times from uniformly randomly generated solutions on the same problem, cqap56. The results, that is, the escape, success and improvement probabilities, are averaged over 1000 independent runs of iterated LS for each m ∈ {3, . . . , 39}.
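The three probabilities above can be tallied from simple per-restart bookkeeping; a sketch of our own, where each restart is recorded as a pair of booleans (escaped, improved):

```python
def ils_probabilities(events):
    """Compute (p_escape, p_success, p_improv) from per-restart records.

    events: one (escaped, improved) pair per LS restart; 'improved' means
    the new local optimum is at least as good as the current solution."""
    restarts = len(events)
    escapes = sum(1 for escaped, _ in events if escaped)
    successes = sum(1 for escaped, improved in events if escaped and improved)
    p_escape = escapes / restarts
    p_success = successes / restarts
    p_improv = successes / escapes if escapes else 0.0
    return p_escape, p_success, p_improv
```

By construction, p_improv = p_success / p_escape whenever at least one restart escapes.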

In Figure 3 we present the escape, success and improvement probabilities for cqap56 for iterated LSs using one of the exchange rates m ∈ {3, . . . , 39} to generate restarting solutions. Figure 3 a) shows that p_escape increases when m increases. When m > 15, p_escape ≈ 0.8 with very large standard deviations. The success probability is rather small for all exchange rates; an increase in the exchange rate results in a slight increase in the success probability. The maximum success probability, ≈ 0.01, is at m = 20. In Figure 3 d), iterated LS with m = 20 is restarted about 1100 times, meaning that there are about 1100 · 0.01 = 11 improvements of the current (local) optimum in an iterated LS run. In Figure 3 c), p_improv is largest for the small exchange rate m = 3, where the escape rate is very small and the number of restarted LSs in Figure 3 d) is very large. The improvement probability is smallest for m > 20, where the escape rate is very large and the number of restarted LSs is close to 100. Thus, for small m, iterated LS improves for a while and then quickly gets stuck in a local optimum, whereas for large m > 20, iterated LS explores, but also improves a lot. Figure 3 d) shows that the larger the exchange rate, the closer the number of restarted LSs is to 100; this means LS is restarted from solutions that are effectively uniform randomly generated in the search space. The larger m is, the more time, i.e. the more swaps, the restarted LS needs to reach its local optimum.

Note that the behaviour of iterated LS greatly depends on the exchange rate of the mutation operator: for very large m the restarted solutions are basically uniform randomly generated, whereas for small m the solutions might not even escape the basin of attraction. In the following, we show that using a set of exchange rates for the mutation operator rather than a single exchange rate is beneficial for the exploitation/exploration properties of iterated LS. We therefore equip iterated LS with a mutation operator that uniformly at random selects an exchange rate m ∈ {3, . . . , ⌊N/3⌋}, where N is the number of facilities of the cQAP as before.

Fig. 4. The dynamical properties of iterated LS for the 12 cqapN instances, where N ∈ {12, . . . , 56}. Each iterated LS generates restarting solutions using mutation with one of the exchange rates m ∈ {3, . . . , ⌊N/3⌋}, chosen uniformly at random from this set: a) the escape, b) the success, c) the improvement probabilities.

Figure 4 a) shows that the escape probability of iterated LS increases with the number of facilities. Note that the performance of iterated LS is inversely proportional to the escape probability, and very large exchange rates, e.g. larger than ⌊N/3⌋, decrease the performance of the algorithm. In Figure 4 b), the success probabilities lie within a small range, between 0.02 and 0.04, and have large variances. We consider that iterated LS has a very different behaviour in different parts of the landscape, so a larger number of LS restarts is required before this behaviour stabilizes. In Figure 4 c), the improvement probabilities vary within a larger range than the success probabilities. The largest improvement is ≈ 0.12 for cqap20 and the smallest is ≈ 0.06 for cqap56. The number of times LS is restarted is about 200 for all iterated LSs, meaning that, on average, there are at most 24 and at least 12 improvements per iterated LS run. The variance of the improvement probabilities is large; our explanation is the small number of times iterated LS is restarted in a large search space.
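The variable-rate mutation analysed above can be sketched as follows. Reading "exchange rate m" as a cyclic exchange of m randomly chosen positions is an assumption on our part, not necessarily the authors' exact operator; the rate range {3, . . . , ⌊N/3⌋} is taken from the text.

```python
import random

def exchange_mutation(perm, m, rng):
    # Apply an m-exchange: pick m distinct positions and cyclically
    # shift their contents by one (an assumed reading of "exchange
    # rate m"; every chosen position ends up with a different value).
    q = list(perm)
    idx = rng.sample(range(len(q)), m)
    vals = [q[i] for i in idx]
    for i, v in zip(idx, vals[-1:] + vals[:-1]):
        q[i] = v
    return q

def variable_rate_mutation(perm, rng):
    # Uniformly choose an exchange rate m from {3, ..., floor(N/3)},
    # as in the iterated LS analysed in Figure 4.
    n = len(perm)
    m = rng.randint(3, max(3, n // 3))
    return exchange_mutation(perm, m, rng)
```

A fixed-m iterated LS, as in Figure 3, would call `exchange_mutation` with a constant m instead.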

Let us now compare the results in Figures 3 and 4. Figure 4 shows that the iterated LS with m ∈ {3, . . . , ⌊N/3⌋} has a better balance between exploitation and exploration than all the iterated LSs that use a fixed m. In Figure 4 a), p_escape = 0.8 for cqap56 is approximately equal to the escape probability of the iterated LS with a large exchange rate m > 20 from Figure 3 a). In Figure 4 b), p_success = 0.04 for cqap56 is larger than the largest success probability of the iterated LSs from Figure 3 b). The corresponding p_improv = 0.06 in Figure 4 c) is larger than the improvement probability of an iterated LS with good exploration properties, m > 5, in Figure 3 c). We conclude that randomly alternating exchange rates is beneficial for the success and improvement probabilities of iterated LS, without decreasing the escape probability.

B. Exploring basins of attraction with iterated LS

The empirical analysis of the basins of attraction shows differences between multi-restart LS and iterated LS runs on cQAPs. In Figure 5 a), the mean number of solutions visited during an LS run is considerably larger for multi-restart LS than for iterated LS. Consequently, the number of neighbourhood calls per LS in Figure 5 b) is considerably smaller for iterated LS than for multi-restart LS. Because multi-restart and iterated LS run the same number of swaps, and the number of visited solutions per LS is smaller for iterated LS than for multi-restart LS on the same cqapN instance, the number of times multi-restart LS is restarted in Figure 5 c) is much smaller than the number of times iterated LS is restarted. That means that, on average, iterated LS spends less time in a local search run than multi-restart LS does and, in turn, restarts LS more often. Figure 5 c) shows that iterated LS for cqap56 restarts LS approximately 200 times.
The number of successes for iterated LS is, on average, 12, which is considerably more than the 1.8 of the iterated LS with m = 20. Figure 5 d) shows that

the iterated LS finds the best local optimum with probability 0.1 over 100 runs, that is, 10 times, unlike multi-restart LS, which cannot find this optimum at all. We conclude that iterated LS performs better than multi-restart LS because it exploits the structure of the search space. This difference in performance is especially large for medium size cQAPs. For the efficient exploration of cQAPs with a large number of facilities, iterated LS should be enhanced with other techniques, like populations and recombination operators, but such a study is beyond the scope of this paper.

Fig. 5. a) The number of solutions visited during a local search run, and b) the number of calls of the neighbourhood function per local search run on cqapN, where N ∈ {12, . . . , 56}. c) The number of times the local search is restarted in order to run iterated LS for as many swaps as multi-restart LS restarted 100 times. d) The number of times a solution with the best value is found in an iterated LS and a multi-restart LS run.

V. CONCLUSION

We propose an instance generator for composite QAPs (cQAPs) which combines a number of small size QAPs, whose exact solutions can be easily computed, in order to create large QAP instances with many local optima and, at the same time, structure in the search space. The remaining elements of the cQAP, which were not assigned to elements of the component QAPs, are generated using the rearrangement inequality. Note that both the composite and the outside regions have the identity permutation as their optimal solution, but the composite QAP, in general, has another optimal solution. We show that the composite QAP instances have an additively decomposable cost function. To measure how difficult it is for local search-based heuristics, like multi-restart local search, to optimize composite QAP instances, we approximate the size and the number of basins of attraction with numerical and empirical methods. We show that the conventional definition of basins of attraction is too strict for practical use with medium size QAP instances. We propose the fitness indecisive basins of attraction, assuming that there are many local optima with the same value.

The landscape analysis is limited by the heuristics used for optimization. A significant improvement in performance over multi-restart LS was obtained with iterated LS. The dynamical behaviour of iterated LS, i.e. the escape, success and improvement probabilities, is considered an alternative empirical method for landscape analysis of the large size cQAPs. We conclude that cQAPs are difficult QAP instances that are especially useful for testing heuristics which exploit the structure of the search space.

REFERENCES

[1] E. Çela. The Quadratic Assignment Problem: Theory and Algorithms. Kluwer Academic Publishers, Dordrecht, 1998.
[2] Z. Drezner, P.M. Hahn, and É.D. Taillard. Recent advances for the quadratic assignment problem with special emphasis on instances that are difficult for meta-heuristic methods. Annals of Operations Research, 139(1):65–94, 2005.
[3] M.M. Drugan and D. Thierens. Stochastic Pareto local search: Pareto neighborhood exploration and perturbation strategies. Journal of Heuristics, 18(5):727–766, 2012.
[4] M.M. Drugan and D. Thierens. Path-guided mutation for stochastic Pareto local search algorithms. In Parallel Problem Solving from Nature (PPSN), LNCS, pages 485–495. Springer, 2010.
[5] J. Garnier and L. Kallel. Efficiency of local search with multiple local optima. SIAM J. Discrete Math., 15(1):122–141, 2001.
[6] E.M. Loiola, N.M.M. de Abreu, P.O. Boaventura-Netto, P. Hahn, and T. Querido. A survey for the quadratic assignment problem. European J. of Operational Research, 176(2):657–690, 2007.
[7] H.R. Lourenço, O.C. Martin, and T. Stützle. Iterated local search. In Handbook of Metaheuristics, pages 320–353. Springer-Verlag, 2003.
[8] A. Marzetta and A. Brüngger. A dynamic-programming bound for the quadratic assignment problem. In COCOON'99, LNCS 1627, pages 339–348. Springer-Verlag, Berlin, 1999.
[9] P. Merz and B. Freisleben. Fitness landscape analysis and memetic algorithms for the quadratic assignment problem. IEEE Transactions on Evolutionary Computation, 4(4):337–352, 2000.
[10] G. Palubeckis. An algorithm for construction of test cases for the quadratic assignment problem. Informatica, Lith. Acad. Sci., 11(3):281–296, 2000.
[11] P. Siarry and Z. Michalewicz, editors. Advances in Metaheuristics for Hard Optimization. Springer, 2008.
[12] T. Stützle. Iterated local search for the quadratic assignment problem. European Journal of Operational Research, 174(3):1519–1539, 2006.
[13] É.D. Taillard. Comparison of iterative searches for the quadratic assignment problem. Location Science, 3:87–105, 1995.
[14] A. Wayne. Inequalities and inversions of order. Scripta Mathematica, 12(2):164–169, 1946.
