A combination of clustering algorithms with Ant Colony Optimization for large clustered Euclidean Travelling Salesman Problem

TRUNG HOANG DINH, ABDULLAH AL MAMUN
Department of Electrical and Computer Engineering, National University of Singapore (NUS), 10 Kent Ridge Crescent, Singapore 117584

Abstract: The Ant Colony System (ACS) algorithm has been found attractive for solving combinatorial optimization problems such as the Travelling Salesman Problem (TSP). The run-time of this algorithm increases with the number of nodes. In this paper, we propose an efficient method that reduces the run-time for very large-scale Euclidean TSP instances and yet preserves the quality of the solution for certain clustered cases. Moreover, the proposed method has a simple parallel implementation. It shows excellent performance in both run-time and solution quality, especially on large clustered instances. The effectiveness of the proposed method is underscored by applying it to two different kinds of benchmark problems.

Key-Words: travelling salesman problem, ant colony optimization, clustering algorithms, combinatorial optimization.

1 Introduction

The classic, well-known NP-hard Travelling Salesman Problem (TSP) has been used as a rich testing ground for many important algorithmic ideas during the past few decades. Interested readers may refer to Lawler et al. [8] for a fascinating history. In Euclidean TSP, nodes lie in the plane and edge costs are the Euclidean distances between them; this special case remains NP-hard [7, 9]. Many metaheuristics have been applied to the TSP, e.g. the genetic algorithm (GA), simulated annealing (SA), and ant algorithms. Most ant algorithms, which have been successfully applied to many combinatorial optimization problems [3], follow a general scheme called Ant Colony Optimization (ACO) (see [2]). Ant System (AS) [5], AS's variants (e.g. Ant-Q [6], rank-based AS [1]), and ACS [4] are some examples of ant algorithms applied to the TSP. We chose ACS because it is one of the most successful ant algorithms applied to the TSP, it has outperformed GA and SA, and it meets the conditions of the theorem about reducing run-time which will be discussed later.

Our method for solving very large Euclidean TSP instances is divided into three separate stages (in some settings the first and the third stage can be combined into a single stage). In the first stage, the original instance is partitioned into a few clusters of smaller size. Each cluster is considered as a sub-TSP, and in the second stage, ACS is applied to find the optimal solution of each sub-TSP. We also design an algorithm to combine the sub-tours found in the second stage in order to get a final solution of the original TSP instance.

After the original TSP is separated into several smaller clusters in stage 1, ACS is applied to find the optimal tour for each part. It is clear that as the number of cities and/or the number of iterations increases, this second stage becomes the most time-consuming of the three stages. Let $A$ be the total run-time required in the second stage to find the optimal solutions for all clusters using an algorithm $\Upsilon$, and let $B$ be defined similarly to $A$ but for the original instance (without clustering), assuming the same algorithm (ACS) with the same settings is used in both cases. We can find an upper bound and a strict lower bound of $\frac{A}{B}$.

Theorem 1 Let $\Upsilon$ be an algorithm whose run-time on an instance of size $x$ is a polynomial $F(x) = \sum_{i=2}^{m} \alpha_i x^i$ with $\alpha_i > 0$ and $m \ge 2$ (i.e. $F \in \Omega$, defined below), and let $k$ be the number of parts resulting from the clustering stage. Then $1 > \frac{A}{B} > \frac{1}{k^{m-1}}$.

The proof of this theorem is given in appendix A.

Remark: Assuming that the run-times of the first and third stages are insignificant compared to that of the second stage, this theorem suggests that the proposed 3-stage method is faster than plain ACS, but cannot be more than $k^{m-1}$ times faster. To approach the limit $k^{m-1}$, the original instance should be clustered into parts of approximately equal size. Using the same symbols as in theorem 1, corollary 2 below presents a result for the case when the size of every part is bounded.
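The three stages described above can be sketched end to end. The following is a toy illustration with our own function names: a coarse spatial grid stands in for the paper's clustering algorithm, and a nearest-neighbour heuristic stands in for ACS; only the overall pipeline structure follows the paper.

```python
import math
import random

def euclid(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def cluster_by_grid(cities, cells=2):
    """Stage 1: partition cities into clusters (here: a coarse spatial grid)."""
    side = max(max(x, y) for x, y in cities) + 1e-9
    clusters = {}
    for c in cities:
        key = (int(c[0] / side * cells), int(c[1] / side * cells))
        clusters.setdefault(key, []).append(c)
    return list(clusters.values())

def solve_subtsp(cities):
    """Stage 2 stand-in: nearest-neighbour tour instead of ACS."""
    tour, rest = [cities[0]], cities[1:]
    while rest:
        nxt = min(rest, key=lambda c: euclid(tour[-1], c))
        rest.remove(nxt)
        tour.append(nxt)
    return tour

def combine(subtours):
    """Stage 3 stand-in: concatenate sub-tours (the paper uses linking edges)."""
    final = []
    for t in subtours:
        final.extend(t)
    return final

def three_stage(cities):
    return combine([solve_subtsp(cl) for cl in cluster_by_grid(cities)])

random.seed(0)
cities = [(random.random() * 100, random.random() * 100) for _ in range(40)]
tour = three_stage(cities)
print(len(tour))  # every city appears exactly once in the final tour
```

The point of the sketch is the division of labour: stage 2 dominates the run-time, so any improvement there (such as the bound of theorem 1) dominates the total.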

Let $\Omega = \{F : \mathbb{R}^+ \to \mathbb{R}^+,\ F(x) = \sum_{i=2}^{m} \alpha_i x^i,\ m \in \mathbb{N},\ m \ge 2,\ \alpha_i > 0\ \forall i = 2..m\}$.

Proposition 1 Given $c_i > 0$, $i = 1..k$, such that $\sum_{i=1}^{k} c_i = 1$, then $1 > \sum_{i=1}^{k} c_i^m \ge \frac{1}{k^{m-1}}$ holds $\forall m \ge 2$. The equality happens when $c_i = \frac{1}{k}$, $\forall i$.

Proof 1 It is clear that $0 < c_i^m < c_i$, $\forall i = 1..k$, hence $\sum_{i=1}^{k} c_i^m < \sum_{i=1}^{k} c_i = 1$ and the left-hand inequality holds. Let $f(x) = x^m$, $m \ge 2$; the second derivative $f''(x) = m(m-1)x^{m-2} > 0$ $\forall x > 0$, thus $f$ is convex on $(0, \infty)$. According to Jensen's inequality 3,
$$ f\Big(\frac{\sum_{i=1}^{k} c_i}{k}\Big) \le \frac{\sum_{i=1}^{k} f(c_i)}{k} \;\Rightarrow\; \frac{1}{k^m}\Big(\sum_{i=1}^{k} c_i\Big)^m \le \frac{1}{k}\sum_{i=1}^{k} c_i^m \;\Rightarrow\; \sum_{i=1}^{k} c_i^m \ge \frac{1}{k^{m-1}}, $$
with equality when $c_i = \frac{1}{k}$, $\forall i$.

Proposition 2 For any numbers $a, b, c, d > 0$, $\max(\frac{a}{b}, \frac{c}{d}) \ge \frac{a+c}{b+d} \ge \min(\frac{a}{b}, \frac{c}{d})$. The equality takes place if and only if $\frac{a}{b} = \frac{c}{d}$. This proposition leads easily to the next corollary:

Corollary 1 For any $a_i, b_i > 0$, $\forall i = 1..n$,
$$ \max\Big[\Big\{\frac{a_i}{b_i}\Big\}_{i=1}^{n}\Big] \ge \frac{\sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} b_i} \ge \min\Big[\Big\{\frac{a_i}{b_i}\Big\}_{i=1}^{n}\Big]. $$
These equalities happen simultaneously when $\frac{a_1}{b_1} = \frac{a_2}{b_2} = \ldots = \frac{a_n}{b_n}$.

Corollary 2 If the size of every part is bounded in the interval $[1, K_{max}]$, then $\varepsilon \ge \frac{A}{B}$, where $0 < \varepsilon = \frac{K_{max}}{n} \le 1$ and $n$ is the size of the original instance.

Proof 2 Let $\beta_j = \frac{c_j}{n}$, so $1 = \sum_{j=1}^{k} \beta_j$ and $0 < \beta_j \le \varepsilon = \frac{K_{max}}{n} \le 1$. With $a_i, b_i$ as in the proof of theorem 1 (appendix A), we have
$$ \frac{a_i}{b_i} = \sum_{j=1}^{k} \beta_j^i \le \varepsilon^{i-1} \sum_{j=1}^{k} \beta_j = \varepsilon^{i-1}, \quad i \ge 2. \qquad (1) $$
The equality takes place if $\beta_j = \varepsilon$, $\forall j$, hence $1 = k\varepsilon$, i.e. $k \cdot K_{max} = n$. Replacing (1) into the left-hand side of inequality (2) in appendix A, we obtain
$$ \varepsilon = \max\big[\{\varepsilon^{i-1}\}_{i=2}^{m}\big] \ge \max\Big[\Big\{\frac{a_i}{b_i}\Big\}_{i=2}^{m}\Big] \ge \frac{A}{B}. $$

Summary: Most practical implementations of $\Upsilon$ may not have a run-time function $F$ of the exact form described in $\Omega$. But if the input size is large enough and $F$ is a polynomial, then the sum of the components of degree lower than the degree of $F$ is insignificant compared to the value of the highest-degree component of $F$; in such cases the above results are still reasonable. In addition, if $F$ is not a polynomial but of another form, e.g. a logarithmic function, we can approximate $F$ by a polynomial $G$ whose highest coefficient is positive; this case then reduces to the former one.

3 http://mathworld.wolfram.com/JensensInequality.html
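Proposition 1's bounds are easy to check numerically. A small sketch (variable names are our own): random positive weights summing to 1 must give a power sum strictly between $\frac{1}{k^{m-1}}$ and 1, and equal weights must attain the lower bound exactly.

```python
import random

# Numerical check of Proposition 1: for c_i > 0 summing to 1 and m >= 2,
# 1 > sum(c_i^m) >= 1/k^(m-1), with equality only when all c_i = 1/k.
random.seed(1)
k, m = 5, 3
raw = [random.random() + 0.01 for _ in range(k)]
c = [r / sum(raw) for r in raw]          # random positive weights summing to 1

s = sum(ci ** m for ci in c)
lower = 1 / k ** (m - 1)
print(lower, s)                          # lower bound <= s < 1

equal = [1 / k] * k                      # equality case: equal-sized parts
print(sum(ci ** m for ci in equal))      # attains exactly 1/k^(m-1)
```

This is the same mechanism behind the remark on theorem 1: equal-sized clusters are what push $\frac{A}{B}$ toward its lower bound.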

3.3 The Third Stage - Method of Combining Individual Solutions

After the solutions of the sub-TSPs are obtained from the second stage, we combine them to get a feasible solution, i.e. a closed tour of the original instance. The most general approach is to choose some edges in each sub-tour to create bridges linking the cluster to other clusters, such that when the chosen edges are removed the bridging edges form a closed tour. From now on, we call the edges that bridge between clusters the linking edges. If we search for a final tour by increasing the number of chosen edges of each sub-tour up to its total number of edges, the search becomes exhaustive and is practically impossible, as the Euclidean TSP is NP-hard. We therefore design a greedy combining method that restricts the number of chosen edges to one per sub-tour, and point out the conditions under which this combining method is good enough.

A sufficient condition for the search space to include the optimal solution in the case of two clusters: For the case of $k = 2$ clusters, we have the following result:

Theorem 2 Let $A$ and $B$ be the two clusters, $d_A$ and $d_B$ be the diameters of $A$ and $B$ respectively, and $d_{AB}$ be the distance between $A$ and $B$. If there exists $n \in \mathbb{N}$, $2 \le n \le \min\{|A|, |B|\}$, such that $d_{AB} \ge d_n = \frac{\min\{d_A, d_B\}}{2(n-1)} + d_A + d_B$ (**), then the search space of the proposed 3-stage method contains the optimal final tour, and that space is formed by fewer than $2n$ linking edges.

The proof of this theorem is given in appendix B.

Remark: A clear corollary of this theorem is that if $n = 2$ and (**) holds, then the optimal final tour must belong to the search space formed by using only two linking edges, each from a sub-tour. This means the 3-stage method searches for the optimal tour in a reduced space, which helps reduce its run-time as well.

An efficient combining algorithm for clustered instances: If the number of clusters $k > 2$, and the distances among clusters are large enough compared with the diameter of any cluster, we have the following theorems, under the assumption that the distances between any three clusters satisfy the triangle inequality:

Theorem 3 In the optimal final tour, no two clusters exist that are joined by at least two linking edges.

Theorem 4 In the optimal tour, there exists no cluster that has at least four linking edges (linking it to four other distinct clusters). In other words, to build the optimal tour from sub-tours, each cluster has exactly two linking edges, which link it to two different clusters.

Remark: Theorems 3 and 4 are useful, especially for clustered instances. Based on these results, for clustered instances, an implementation of 3-stage-ACS is proposed as follows:
Step 1: Partition the original instance into smaller parts.
Step 2: Find $k$ linking edges for the $k$ clusters according to theorems 3 and 4. This step can itself be viewed roughly as a TSP with $k$ nodes.
Step 3: Apply ACS to each cluster, but force each sub-tour to contain a fixed edge whose two endpoints link the cluster to two other distinct clusters.

4 Experiment

4.1 Large ETSP instances for testing

We test the effectiveness of the proposed 3-stage method by applying the algorithm to two different benchmark problems. The first is TSPLIB 4. We consider some large Euclidean instances with between 650 and 3795 cities. The second benchmark is a group of instances whose cities are randomly clustered on the square $[0, 10^6]$. These instances, with between 1000 and 5000 cities, were generated by the Instance Generator Code of the 8th DIMACS Implementation Challenge 5.

4.2 Comparison between 3-stage-ACS and other algorithms

We compare the proposed 3-stage-ACS with ACS. Because run-time is also used to compare their computational efficiency, both algorithms were run on the same computer (a Dell PC with a Pentium IV 2.4GHz processor and 256MB of RAM), with the same code implementing ACS (except for the code parts particularly designed for 3-stage-ACS, including the clustering and combining parts), and with the same parameter settings.

4 http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95
5 http://www.research.att.com/~dsj/chtsp/download.html
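For $k = 2$, the restricted combining search that theorem 2 justifies (delete one edge from each sub-tour and reconnect the two open paths with two linking edges, keeping the cheapest result) can be sketched as follows. This is an illustrative implementation with our own names, not the paper's code.

```python
import math

def d(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def tour_len(t):
    return sum(d(t[i], t[(i + 1) % len(t)]) for i in range(len(t)))

def merge_two_tours(ta, tb):
    """Try every (edge of ta, edge of tb) pair to delete; reconnect the two
    resulting open paths with two linking edges; return the cheapest tour."""
    best, best_tour = float("inf"), None
    for i in range(len(ta)):
        for j in range(len(tb)):
            a_path = ta[i + 1:] + ta[:i + 1]      # ta without edge (ta[i], ta[i+1])
            b_path = tb[j + 1:] + tb[:j + 1]      # tb without edge (tb[j], tb[j+1])
            for b_dir in (b_path, b_path[::-1]):  # try both path orientations
                cand = a_path + b_dir             # two new linking edges appear
                c = tour_len(cand)                # at the two join points
                if c < best:
                    best, best_tour = c, cand
    return best_tour, best

# Two small clustered sub-tours, far apart (theorem 2's regime):
ta = [(0, 0), (1, 0), (1, 1), (0, 1)]
tb = [(10, 0), (11, 0), (11, 1), (10, 1)]
merged, cost = merge_two_tours(ta, tb)
print(len(merged), round(cost, 2))
```

The search is only $O(|A| \cdot |B|)$ candidate pairs rather than exhaustive over all edge subsets, which is exactly the reduction theorem 2 licenses when (**) holds with $n = 2$.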

Table 1: A comparison of 3-stage-ACS and ACS based on randomly generated clustered instances of 1000-5000 cities. Each trial was stopped after 5000 iterations. Averages are over 15 trials. Results in bold are the best in the table. (*) is the ratio of the run-time of ACS to that of 3-stage-ACS; k is the number of clusters.

No. of           3-stage-ACS                              ACS                      (*)ACS/   [(2)-(1)]
cities   average      std dev    best(1)    k    average      std dev    best(2)   3-stage   /(1)
1000     11870553.80  74110.93   11747792    6   12302227.67  88129.51   12168397  3.68      3.58%
1500     13840435.67  70049.18   13735671    6   14143592.20  104250.71  13971107  5.63      1.71%
2000     16435565.40  82162.62   16307804    8   17124049.13  187148.32  16908891  5.08      3.69%
2500     17841082.93  68272.57   17740618   13   18592455.20  185965.00  18324584  6.35      3.29%
5000     25300826.40  82162.62   25147562   26   26515075.20  239561.92  26202524  11.82     4.20%
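The last column of Table 1, [(2)-(1)]/(1), is the relative-performance factor: the difference between the best cost found by ACS and the best cost found by 3-stage-ACS, divided by the latter. It can be reproduced directly from the best(1) and best(2) columns:

```python
# best(1): 3-stage-ACS best costs; best(2): ACS best costs, from Table 1.
best_3stage = [11747792, 13735671, 16307804, 17740618, 25147562]
best_acs = [12168397, 13971107, 16908891, 18324584, 26202524]

for b1, b2 in zip(best_3stage, best_acs):
    print(f"{100 * (b2 - b1) / b1:.2f}%")   # matches the table's last column
```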

Table 2: A comparison of 3-stage-ACS and ACS based on large benchmark instances. Averages are over 15 trials. Each trial was stopped after 5000 iterations. Results in bold are the best in the table. (*) is the ratio of the run-time of ACS to that of 3-stage-ACS; k is the number of clusters.

prob.          3-stage-ACS                        ACS                    Optimum  (*)ACS/   [(2)-(1)]
name     average   std dev  best(1)  k    average   std dev  best(2)    known    3-stage   /(1)
p654     35554.67  297.48   35113    3    35860.67  438.98   35120      34643    1.96      0.02%
fl3795   30804.33  120.27   30689    4    30881.33  64.97    30842      28772    2.44      0.50%
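The ACS runs in both tables use the standard parameters from [4], notably $q_0 = 0.98$ and $\beta = 2$. The core of ACS is its pseudo-random-proportional transition rule: with probability $q_0$ the ant greedily takes the edge with the best pheromone-times-visibility score, otherwise it samples proportionally to that score. A minimal sketch, with our own function and variable names:

```python
import random

def next_city(current, unvisited, tau, dist, q0=0.98, beta=2.0):
    """Pseudo-random-proportional rule: exploit with prob. q0, else sample."""
    scores = {j: tau[(current, j)] * (1.0 / dist[(current, j)]) ** beta
              for j in unvisited}
    if random.random() < q0:                  # exploitation: best edge
        return max(scores, key=scores.get)
    total = sum(scores.values())              # biased exploration
    r, acc = random.uniform(0, total), 0.0
    for j, s in scores.items():
        acc += s
        if acc >= r:
            return j
    return j

random.seed(2)
cities = [0, 1, 2, 3]
dist = {(i, j): abs(i - j) + 1 for i in cities for j in cities if i != j}
tau = {e: 0.1 for e in dist}                  # uniform initial pheromone
choice = next_city(0, {1, 2, 3}, tau, dist)
print(choice)
```

With a high $q_0$ such as 0.98, the rule is almost always greedy with respect to the current pheromone trails, which is what makes ACS converge quickly on each (small) cluster in the second stage.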

The parameter settings were the same as those discussed in [4]: $m = 10$, $\beta = 2$, $q_0 = 0.98$, $\alpha = \rho = 0.1$, and $\tau_0 = (n \cdot L_{nn})^{-1}$. In addition, all testing instances are very large, so in both cases a candidate list of length $cl = 20$ is used.

Memory storage requirement: For large-scale TSP instances, the memory required to store the pheromone and cost matrices contributes most of the memory requirement of the algorithm. With an input of $N$ cities, the amount of memory needed to store these two matrices is $CN(N-1)$ bytes, where $C$ is a system-dependent number of bytes needed to store a real-valued number and only the upper triangles of the two matrices are stored. The proposed 3-stage method, however, takes approximately $Cn(n-1)$ bytes during the execution of ACS, where $n$ is the size of the largest cluster. From the point of view of memory requirements, therefore, 3-stage-ACS is more efficient than ACS.

Experimental results: To compare the run-time of 3-stage-ACS with that of ACS, we use the ratio of the total run-time of ACS to that of 3-stage-ACS; the larger this factor, the faster 3-stage-ACS is. Because the optimal solutions of the generated clustered Euclidean TSP instances are unknown, we use a relative-performance factor to compare the solution quality of ACS with that of 3-stage-ACS. This factor is computed as the difference between the smallest cost found by ACS and that found by 3-stage-ACS, divided by the latter. Each instance was run for 15 trials, each of 5000 iterations, and for clusters of fewer than 150 cities we set the parameter $cl = 0$, because such a cluster may not be considered a large instance.

For clustered Euclidean TSPs: As shown in Table 1, 3-stage-ACS always produces better solutions than the standard ACS, in much shorter time. For example, solving a 5000-city instance took 16354.6 seconds with ACS but only 1383.31 seconds with the proposed 3-stage-ACS. This shows that 3-stage-ACS outperforms ACS on clustered Euclidean TSP.

For benchmark Euclidean TSPs: Because most large benchmark instances do not satisfy the sufficient conditions in theorems 3 and 4, the combining algorithm used for such instances includes a small but important change that helps improve the quality of the final solution. After proceeding as for the previous type of instance, the "representative tour" (found at step 2 in the remark above) is replaced by a closed tour which is like a TSP tour, starting from a city, visiting all other cities, and returning to the starting city, except that a city may be visited more than once, as long as the total length is less than that of the replaced tour. As shown in Table 2, 3-stage-ACS outperformed ACS on both benchmark instances p654 and fl3795, in both average and best solution, but for fl3795 ACS seems more stable than 3-stage-ACS, as can be seen from the standard deviations.
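The memory argument can be made concrete. In the sketch below, $C = 8$ (an 8-byte real number) and the cluster size $n$ are illustrative assumptions, not values taken from the paper:

```python
# Storing the upper triangles of the pheromone and cost matrices takes
# C*N*(N-1) bytes for ACS on the whole instance, versus roughly C*n*(n-1)
# for 3-stage-ACS, where n is the size of the largest cluster.
C = 8                       # bytes per real number (assumption)
N = 5000                    # size of the whole instance
n = 400                     # size of the largest cluster (hypothetical)

acs_bytes = C * N * (N - 1)
three_stage_bytes = C * n * (n - 1)
print(acs_bytes // 10**6, "MB vs", three_stage_bytes // 10**6, "MB")
```

Because the requirement is quadratic in the instance size, even a modest clustering factor shrinks the working set by orders of magnitude.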

5 Conclusion and future work

The 3-stage-ACS proposed in this article shows promising results in increasing efficiency, both in run-time and in quality of solution, for large clustered Euclidean TSP. The proposed method runs faster than the conventional ACS that does not use clustering. Moreover, the proposed algorithm can be converted to a parallel version with few changes to its serial version. The results presented in this article underscore the idea that both decreased run-time and guaranteed solution quality are achieved when the proposed method is applied to problems whose solutions can be partitioned into sub-solutions, and vice versa.

References

[1] B. Bullnheimer, R.F. Hartl, and Ch. Strauss. A new rank based version of the ant system: a computational study. Central European Journal of Operations Research, 7(1):25-38, 1999.
[2] M. Dorigo and G. Di Caro. The ant colony optimization metaheuristic. In D. Corne, M. Dorigo, and F. Glover, editors, New Ideas in Optimization. McGraw-Hill, 1999.
[3] M. Dorigo, G. Di Caro, and L.M. Gambardella. Ant algorithms for discrete optimization. Artificial Life, 5:137-172, 1999.
[4] M. Dorigo and L.M. Gambardella. Ant colony system: A cooperative learning approach to the travelling salesman problem. IEEE Transactions on Evolutionary Computation, 1:53-66, 1997.
[5] M. Dorigo, V. Maniezzo, and A. Colorni. Ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, 26(1):28-41, 1996.
[6] L.M. Gambardella and M. Dorigo. Ant-Q: A reinforcement learning approach to the traveling salesman problem. In International Conference on Machine Learning, pages 252-260, 1995.
[7] M.R. Garey, R.L. Graham, and D.S. Johnson. Some NP-complete geometric problems. In Proc. ACM Symposium on Theory of Computing, pages 10-22, 1976.
[8] E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan, and D.B. Shmoys. The Traveling Salesman Problem. John Wiley, 1985.
[9] C.H. Papadimitriou. Euclidean TSP is NP-complete. Theoretical Computer Science, 4:237-244, 1977.

Appendix A

Lemma 1 If $n > 0$, $\alpha_i > 0$ for $i = 2..m$, and $k > 1$, then
$$ 1 > \frac{\sum_{i=2}^{m} \alpha_i n^i \frac{1}{k^{i-1}}}{\sum_{i=2}^{m} \alpha_i n^i} > \frac{1}{k^{m-1}}. $$

Proof 3 (of lemma 1) Since $k > 1$ and $i \ge 2$, each factor $\frac{1}{k^{i-1}} < 1$, hence the numerator is strictly smaller than the denominator and the left-hand inequality holds. From corollary 1,
$$ \frac{\sum_{i=2}^{m} \alpha_i n^i \frac{1}{k^{i-1}}}{\sum_{i=2}^{m} \alpha_i n^i} \ge \min\Big[\Big\{\frac{1}{k^{i-1}}\Big\}_{i=2}^{m}\Big] = \frac{1}{k^{m-1}}. $$
Since the sequence $\{\frac{1}{k^i}\}_{i=1}^{\infty}$ is strictly monotonic, the ratios $\frac{1}{k^{i-1}}$ are not all equal, so the equality does not take place, and the lemma is proved.

Proof 4 (of theorem 1) From the assumption, $B = F(n) = \sum_{i=2}^{m} \alpha_i n^i = \sum_{i=2}^{m} b_i$, where $b_i = \alpha_i n^i$, $i = 2..m$. Similarly,
$$ A = \sum_{j=1}^{k} F(c_j) = \sum_{j=1}^{k} \Big(\sum_{i=2}^{m} \alpha_i c_j^i\Big) = \sum_{i=2}^{m} \Big(\sum_{j=1}^{k} \alpha_i c_j^i\Big) = \sum_{i=2}^{m} a_i, $$
where $a_i = \sum_{j=1}^{k} \alpha_i c_j^i$, $i = 2..m$, and $c_j$ is the size of part $j$ (cluster $j$). According to proposition 1, noting that $n = \sum_{j=1}^{k} c_j$, we obtain
$$ 1 > \frac{a_i}{b_i} = \sum_{j=1}^{k} \Big(\frac{c_j}{n}\Big)^i \ge \frac{1}{k^{i-1}}, \quad \forall i \ge 2. $$
Combining with corollary 1, we have
$$ 1 > \max\Big[\Big\{\frac{a_i}{b_i}\Big\}_{i=2}^{m}\Big] \ge \frac{\sum_{i=2}^{m} a_i}{\sum_{i=2}^{m} b_i} = \frac{A}{B} \ge \frac{\sum_{i=2}^{m} \alpha_i n^i \frac{1}{k^{i-1}}}{\sum_{i=2}^{m} \alpha_i n^i}. \qquad (2) $$
The right-hand equality happens when $\frac{a_i}{b_i} = \frac{1}{k^{i-1}}$, i.e. $c_1 = c_2 = \ldots = c_k = \frac{n}{k}$. Combining inequality (2) with lemma 1, we get $1 > \frac{A}{B} > \frac{1}{k^{m-1}}$, and theorem 1 is proved.

Appendix B

Proof 5 (of theorem 2) Let $T_n$ be the set of all final tours obtained by combining pairs of $n$-edge tuples, one tuple from each sub-tour, $1 \le n \le \frac{\min\{|A|, |B|\}}{2}$. Set $S_n = \min\{\text{cost of tour } t = |t| : t \in T_n\}$. We will prove that if (**) holds then
$$ S_n > S_1. \qquad (3) $$
Choose any $n$ edges in each sub-tour, with lengths $a_i, b_i$, $i = 1..n$, $a_i \in A$, $b_i \in B$. Let $x_i$, $i = 1..2n$, be the length of the $i$-th linking edge, and $\xi$ be the total length of the remaining edges of the two sub-tours which are not chosen to be deleted. We need to consider two cases:

Case 1: There exists at least one pair of edges such that the two linking edges link only the four vertices of these two edges, as shown in Fig. 1. The length of the final tour is
$$ |t_2| = \xi + \sum_{i=1}^{2n-2} x_i - \sum_{i=1}^{n-1} (a_i + b_i) + (e + f - a - c). $$
From the assumption, we have
$$ \sum_{i=1}^{2n-2} x_i \ge (2n-2)\, d_{AB} > 2(n-1)(d_A + d_B) \ge 2 \sum_{i=1}^{n-1} (a_i + b_i) $$
$$ \Rightarrow\; |t_2| > \xi + \sum_{i=1}^{n-1} (a_i + b_i) + (e + f - a - c) \ge S_1. $$

Figure 1: A pair of edges such that the two linking edges link only the four vertices of these two edges. (figure omitted)

Case 2: If case 1 does not happen, we can assume without loss of generality that $d_A \ge d_B$ and that there are two edges of the sub-tour of cluster $A$ and two of cluster $B$ linked as shown in Fig. 2. The length of the final tour, in this case, is
$$ |t_2| = \xi + \sum_{i=1}^{2n-4} x_i - \sum_{i=1}^{n-2} (a_i + b_i) + (e + g + h + i - a - b - c - d). $$
Due to $h + x > f$ and
$$ \sum_{i=1}^{2n-4} x_i + (g + i) \ge d_B + 2(n-1)(d_A + d_B) \ge 2\Big\{\sum_{i=1}^{n-2} (a_i + b_i) + (b + d)\Big\} + x, $$
we obtain
$$ |t_2| > \xi + \sum_{i=1}^{n-2} (a_i + b_i) + (b + d) + (e + f - a - c) \ge S_1. $$

Figure 2: There is no pair of edges such that the two linking edges link only the four vertices of these two edges. (figure omitted)

Hence the two cases shown above prove inequality (3). Since $d_n = \frac{\min\{d_A, d_B\}}{2(n-1)} + d_A + d_B$ is a monotonically decreasing sequence, if (**) takes place at $n = n_0$ then $d_{AB} \ge d_{n_0} > d_n$ for all $n > n_0$, hence $S_n > S_1$ for all $n > n_0$; in other words, inequality (3) is true for all $n \ge n_0$, and the optimal final tour is in the restricted search space formed by choosing fewer than $n_0$ edges in each sub-tour. Theorem 2 is completely proved.
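Theorem 1's bounds can be illustrated numerically with a concrete polynomial from $\Omega$. The coefficients and sizes below are chosen arbitrarily for illustration; equal-sized parts make the ratio $\frac{A}{B}$ approach (but, per the strict inequality, never reach) $\frac{1}{k^{m-1}}$:

```python
# Run-time polynomial F(x) = a2*x^2 + a3*x^3, so m = 3 and F is in Omega.
def F(x, a2=1.0, a3=0.5):
    return a2 * x**2 + a3 * x**3

n, k, m = 1200, 4, 3
parts = [n // k] * k                 # equal-sized parts: the best case
A = sum(F(c) for c in parts)         # clustered run-time (second stage)
B = F(n)                             # run-time on the whole instance
print(A / B, 1 / k ** (m - 1))       # A/B sits just above 1/k^(m-1)
```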
