Parallel Simulated Annealing for the Delivery Problem

Zbigniew J. Czech
Institute of Computer Science, Silesia University of Technology
Gliwice, Poland
e-mail:
[email protected]
Abstract

A delivery problem which reduces to an NP-complete set-partitioning problem is considered. Two parallel simulated annealing algorithms, the simultaneous independent searches and the simultaneous periodically interacting searches, are investigated. The objective is to improve the accuracy of solutions to the problem by applying parallelism. The accuracy of a solution is understood as its proximity to the optimum solution. The empirical evidence, supported by statistical analysis, indicates that the interaction of processes in parallel simulated annealing can yield more accurate solutions to the delivery problem than running the processes independently.

Key words. Delivery problem, set-partitioning problem, parallel simulated annealing algorithms, message passing model of parallel computation
1 Introduction

Two parallel simulated annealing algorithms, the simultaneous independent searches and the simultaneous periodically interacting searches, are investigated. The algorithms are applied to a delivery problem which consists in finding a set of routes of the smallest total length for a fleet of vehicles so as to satisfy the cargo delivery requirements of customers. Practical applications of the delivery problem include deliveries of goods to department stores, picking up students by school buses, newspaper, laundry and mail distribution, maintenance inspection tours, etc. If a constraint is imposed on the size of routes, then the delivery problem reduces to an NP-complete set-partitioning problem. The objective of this work is to improve the accuracy of solutions to the problem by applying parallelism. The accuracy of a solution is understood as its proximity to the optimum solution.

(This research was supported in part by the State Committee for Scientific Research grant BK-210-RAu2-2000.)

Several methods of parallelization of simulated annealing have been proposed in the literature. In this regard we refer to Aarts and Korst [1], Greening [10], Azencott [3], Boissin and Lutton [4], and Verhoeven and Aarts [17]. The delivery problem is discussed by Altinkemer and Gavish [2], Lenstra and Rinnooy Kan [13], Christofides, Mingozzi and Toth [6], Clarke and Wright [7], Fisher and Jaikumar [9], and Haimovich and Rinnooy Kan [11]. Preliminary results of the comparison of the simultaneous independent searches and the simultaneous periodically interacting searches applied to the delivery problem are presented in [8]. Here we extend those results by selecting a suitable value for the temperature reduction parameter of annealing, increasing the number of experimental executions of the searches, and running the searches in a parallel environment using the message passing interface (MPI) library.

In section 2 the problem under investigation is formulated. Section 3 describes the sequential annealing algorithm. In sections 4 and 5 the simultaneous independent searches and the simultaneous periodically interacting searches, respectively, are presented. Section 6 describes the selection of a suitable value of the temperature reduction parameter of annealing. In section 7 the comparison of the searches is presented. Section 8 concludes the work.
2 Problem formulation

The delivery problem (DP), which arose in a transportation company, is considered. It can be formulated as follows. There is a central depot of cargo and n customers (nodes) located at specified distances from the depot. The cargo has to be delivered to (or picked up from) each customer according to the cargo delivery requirements by a fleet of vehicles. We assume that the number of vehicles in the fleet is unlimited, and that the capacity of any vehicle is large enough to fulfill the deliveries. In each tour, carried out during an eight-hour day, a vehicle crew can visit at most k customers, where k is a small constant. Let k = 3. Then on a single tour the crew starts from the depot, visits one, two or three customers and returns to the depot. A set of tours which guarantees the delivery of cargo to all customers is sought. Furthermore, the cost, defined as the total length of the tours in the set, should be minimized.

If a strong constraint is imposed on the magnitude of k (e.g. k = 3), then the DP reduces to the set-partitioning problem (SPP), which is NP-complete. Let $N = \{1, 2, \ldots, n\}$ be the set of customers, and let $S = \{S_1, S_2, \ldots, S_q\}$, $q = \binom{n}{1} + \binom{n}{2} + \binom{n}{3}$, be the set of all subsets of $N$ of size at most 3, i.e. $S_i \subseteq N$ and $|S_i| \le 3$, $i \in M$, where $M = \{1, 2, \ldots, q\}$. Every $S_i$ represents a possible tour of a solution to the DP. Let $c_i$ be the minimum cost (length) of the tour $S_i$. To obtain a solution to the DP we need to solve the SPP, which consists in finding the collection $\{S_l\}$, $l \in M$, of minimum total cost such that every customer $j$, $j \in N$, is covered by the subsets in the collection exactly once. In other words, the intersection of any pair of subsets in $\{S_l\}$ is empty. The delivery problem is considered by Altinkemer and Gavish in [2]. They assume that k can be arbitrary and take into account the limited capacity of vehicles. Lenstra and Rinnooy Kan [13] proved that under these assumptions the delivery problem is NP-hard.
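For reference, the SPP introduced above can be stated as a 0-1 integer program (a standard formulation; the binary variables $x_i$ are our notation, not the paper's):

$$\min \sum_{i \in M} c_i x_i \quad \text{subject to} \quad \sum_{i\colon j \in S_i} x_i = 1 \ \text{for all } j \in N, \qquad x_i \in \{0, 1\} \ \text{for all } i \in M,$$

where $x_i = 1$ exactly when the tour $S_i$ is included in the collection, so that each customer is covered by exactly one chosen tour and the total tour length is minimized.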
3 Sequential annealing
The simulated annealing algorithm, which can be regarded as a variant of local search, was first introduced by Metropolis et al. [14], and then applied to optimization problems by Kirkpatrick et al. [12] and Černý [5]. A comprehensive introduction to the subject can be found in [15]. The application of simulated annealing to the DP is as follows. Initially, a solution to the problem is a set of tours of size 3 (the last tour may have fewer than 3 customers); the customers are grouped into the tours randomly. In every step a neighbor solution is determined either by moving a randomly chosen customer from one tour to another (perhaps empty) tour, or by exchanging the places of randomly chosen customers between their tours. A neighbor solution of lower cost obtained in this way is always accepted, whereas solutions of higher cost are accepted with the probability
$$Pr = T_i/(T_i + \Delta) \eqno(1)$$

where $\Delta$ denotes the increase in solution cost, and $T_i$, $i = 0, 1, \ldots$, is a parameter called the temperature of annealing, which drops from the value $T_0 = cost(\mathit{initial\ solution})/1000$ according to the formula $T_{i+1} = \alpha T_i$, where $\alpha < 1$. Eq. (1) implies that large increases in solution cost, so-called uphill moves, are more likely to be accepted when $T_i$ is high. As $T_i$ approaches zero, most uphill moves are rejected. The annealing algorithm stops when equilibrium is encountered. We define that equilibrium is reached if 20 consecutive stages of temperature reduction fail to improve the best solution. Contrary to the classical approach, in which the solution to the problem is taken as the last solution obtained in the annealing process, we memorize the best solution found during the whole annealing process (lines 26-27).

Summing up, the annealing algorithm performs a local search by sampling the neighborhood randomly. It attempts to avoid becoming prematurely trapped in a local optimum by sometimes accepting an inferior solution. The level of this acceptance depends on the magnitude of the increase in solution cost and on the search time to date. The cost of a single iteration of the repeat statement (lines 4-12) is proportional to $n^2$, as the steps in line 5 and lines 9-11 are executed in constant time, and there are $n^2$ repetitions of the for loop (lines 6-8). It is difficult to establish analytically the number of stages, $a$, in the cooling schedule, i.e. the number of iterations of the repeat statement. In our experiments we found that for the number of customers $n = 40 \ldots 100$ and $\alpha = 0.92$ (the selection of $\alpha$ is discussed in section 6), the number of stages did not exceed 200. Therefore the worst case time complexity of our sequential annealing algorithm is $T(n) \le 200n^2 = O(n^2)$.

Sequential annealing algorithm
 1  Create initial old_solution;
 2  best_solution := old_solution; eq_counter := 0; {equilibrium counter}
 3  T := cost(best_solution)/1000;
 4  repeat
 5    c := cost(best_solution); {memorize the cost}
 6    for iteration_counter := 1 to n^2 do
 7      annealing_step(old_solution, best_solution);
 8    end for;
 9    if cost(best_solution) < c then eq_counter := 0;
10    else eq_counter := eq_counter + 1; end if;
11    T := alpha * T; {temperature reduction}
12  until eq_counter > 20;

13  procedure annealing_step(old_solution, best_solution);
14    Select randomly a customer;
15    Select randomly a tour (distinct from the customer's tour selected in the previous line);
16    if the tour size is less than 3 then
17      Create the new_solution; {move the customer to the selected tour}
18    else
19      Select randomly a customer in the tour;
20      Create the new_solution; {exchange the two customers between their tours}
21    end if;
22    Delta := cost(new_solution) - cost(old_solution);
23    Generate random x uniformly in the range (0, 1);
24    if (Delta < 0) or (x < T/(T + Delta)) then
25      old_solution := new_solution;
26      if cost(new_solution) < cost(best_solution) then
27        best_solution := new_solution;
28      end if;
29    end if;
    end annealing_step;
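The acceptance decision of lines 22-24 is straightforward to express in C, the language of the implementations discussed in section 7. The sketch below is illustrative only; the function name and the use of rand() are our assumptions, not the paper's:

    #include <stdlib.h>

    /* One acceptance decision of the annealing step (lines 22-24),
       following Eq. (1): a downhill move (delta < 0) is always accepted;
       an uphill move is accepted with probability T/(T + delta). */
    int accept_move(double delta, double T)
    {
        if (delta < 0.0)
            return 1;                               /* improvement: always accept */
        double x = (double)rand() / ((double)RAND_MAX + 1.0);   /* x in [0, 1) */
        return x < T / (T + delta);                 /* Pr = T/(T + delta) */
    }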
4 Simultaneous independent searches

Let us assume that p processors are available and each of them is capable of running its own annealing process. The processors can be used either to speed up the sequential annealing algorithm or to achieve a higher accuracy of solutions to a problem. In this work we consider the latter goal. The accuracy of a solution is understood as its proximity to the optimum solution. The algorithm of simultaneous independent searches (IS) consists in executing p independent annealing processes and taking as the final result the best among the solutions found by the processes. More formally, let r be the total available time (expressed in annealing steps), and let $E_p = \{V_r^{(1)}, V_r^{(2)}, \ldots, V_r^{(p)}\}$ be the set of p best solutions found by the processes after having performed r sequential annealing steps. The final solution to the problem is selected as the best solution in the set $E_p$, which we denote
$$W_r = \min\{V_r^{(1)}, V_r^{(2)}, \ldots, V_r^{(p)}\}, \eqno(2)$$

where the minimum is taken with respect to solution cost.
The IS algorithm written for the message passing model of computation is given below. The processes $P_j$, $j = 1, 2, \ldots, p$, carry out independent annealing searches using the same initial solution and cooling schedule as in the sequential algorithm (see section 3). At each temperature, process $P_j$ executes $n^2$ annealing steps (lines 19-21) and then sends its best_local_solution_j to process $P_0$ (line 22). Process $P_0$ chooses the best global solution among the local solutions, i.e. in the set of best_local_solution_j, $j = 1, 2, \ldots, p$ (line 6). Then it tests whether to update the final solution with the best global solution (lines 7-9). If such an update is made, then eq_counter is set to 0. The searches stop when equilibrium is reached, i.e. when the variable go sent to the processes $P_j$ takes on the value false (line 10).
Simultaneous independent searches (IS)

PROCESS P0:
 1  Create initial_solution;
 2  Send initial_solution to processes Pj, j = 1, 2, . . . , p;
 3  final_solution := initial_solution; eq_counter := 0;
 4  while eq_counter <= 20 do
 5    Receive best_local_solution_j from processes Pj;
 6    Choose best_global_solution;
 7    if cost(best_global_solution) < cost(final_solution) then
 8      final_solution := best_global_solution; eq_counter := 0;
 9    else eq_counter := eq_counter + 1; end if;
10    go := (eq_counter <= 20);
11    Send go to processes Pj, j = 1, 2, . . . , p;
12  end while;
13  Produce final_solution as the result of the IS;

PROCESSES Pj, j = 1, 2, . . . , p:
14  Receive initial_solution from P0;
15  old_solution_j := initial_solution;
16  best_local_solution_j := initial_solution;
17  T := cost(initial_solution)/1000;
18  loop
19    for iteration_counter_j := 1 to n^2 do
20      annealing_step(old_solution_j, best_local_solution_j);
21    end for;
22    Send best_local_solution_j to P0;
23    Receive go from P0;
24    if go = false then stop; end if;
25    T := alpha * T; {temperature reduction}
26  end loop;
Equation (2) indicates that in order to compute $W_r$ it is enough to execute p annealing processes independently and then take the minimum-cost solution in the set $E_p$. However, the implementation of processes $P_0, P_1, \ldots, P_p$ given above is slightly more complicated. Namely, processes $P_1, P_2, \ldots, P_p$ communicate with $P_0$ after every $n^2$ annealing steps. The reason for this communication, which of course slows down the searches, is that process $P_0$ has to find out when to finish the searches. If $20n^2p$ annealing steps do not change the final solution, process $P_0$ assumes that equilibrium is reached and stops the computation.

The worst case time complexity of the IS algorithm is $T_p(n) \le pn + a(n^2 + pn + n + p)$. The execution cost of lines 1-3 does not exceed $pn$, as there are p messages of size n which are sent to processes $P_1, P_2, \ldots, P_p$. The cooling schedule gives $a$ iterations of the while statement (lines 4-12), and the cost of every iteration is at most $n^2 + pn + n + p$. This cost includes the waiting time of $n^2$ units until the processes $P_j$, $j = 1, 2, \ldots, p$, compute their best_local_solution_j's in lines 19-21, the cost $pn$ of receiving those solutions (line 5), and the cost $n + p$ of updating the final solution (lines 7-9) and of notifying the processes whether to continue the computation (lines 10-11).
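The message exchange of the IS maps directly onto MPI point-to-point calls. The fragment below is a minimal C sketch of one cooling stage of a worker process $P_j$ (lines 19-25 of the listing), assuming the solution is encoded as an array of n ints and that rank 0 plays the role of $P_0$; the encoding, the tags and the simplified annealing_step prototype are our assumptions, not the paper's:

    #include <mpi.h>

    #define TAG_SOLUTION 1
    #define TAG_GO       2

    /* Simplified prototype of the procedure from section 3. */
    void annealing_step(int *solution, int n, double T);

    /* One cooling stage of a worker process Pj: execute n*n annealing
       steps, send the best local solution to P0 (rank 0), then receive
       the go/stop flag and reduce the temperature. */
    void is_cooling_stage(int *best_local, int n, double *T, double alpha,
                          int *go)
    {
        for (int step = 0; step < n * n; step++)
            annealing_step(best_local, n, *T);

        MPI_Send(best_local, n, MPI_INT, 0, TAG_SOLUTION, MPI_COMM_WORLD);
        MPI_Recv(go, 1, MPI_INT, 0, TAG_GO, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        *T *= alpha;   /* temperature reduction */
    }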
5 Simultaneous periodically interacting searches
As before, we assume that p identical processors are available and execute their own sequential annealing processes. In the simultaneous periodically interacting searches (or co-operating searches, CS) the processes $P_1, P_2, \ldots, P_p$ interact with each other every w steps, passing their best solutions found so far. Suppose for a moment that the temperature of annealing, T, is fixed. Let $V_r^{(j)}(T)$, $j = 1, 2, \ldots, p$, $r = 1, 2, \ldots$, be the Markov chains for each of the processes, let $P_T(V)$ be the realization of one step of the chain at temperature T and with starting point V, and let $V_r^{(j)}$ be the best solution found by process j so far, i.e. between steps 1 and r. We assume the following scheme of interaction:

$$V_{r+1}^{(1)} = P_T(V_r^{(1)}), \eqno(3)$$
$$V_{r+1}^{(j)} = P_T(V_r^{(j)}) \quad \text{for } j \ne 1 \text{ and if } r + 1 \ne uw, \eqno(4)$$
$$V_{uw}^{(j)} = P_T(V_{uw-1}^{(j)}) \quad \text{if } cost(P_T(V_{uw-1}^{(j)})) \le cost(V_{uw}^{(j-1)}), \eqno(5)$$
$$V_{uw}^{(j)} = V_{uw}^{(j-1)} \quad \text{otherwise.} \eqno(6)$$
In this scheme the processes interact at steps uw, $u = 1, 2, \ldots$, where each step consists of a single realization of the Markov chain, i.e. of an annealing step. The chain for the first process (j = 1) is completely independent. The chain for the second process is updated at steps uw to the better solution between the best solution found by the first process so far, $V_{uw}^{(1)}$, and the realization of the last step of the second process, $P_T(V_{uw-1}^{(2)})$. Similarly, the third process chooses as the next point in its chain the better solution between $V_{uw}^{(2)}$ and $P_T(V_{uw-1}^{(3)})$. Clearly, the best solution found by the l-th process is propagated for further exploration to the processes m, m > l.

The above scheme of interaction is a modification of the scheme given by Aarts and Laarhoven [1] and Graffigne [3]. Their scheme uses in Eqs. (5) and (6) the value $P_T(V_{uw-1}^{(j-1)})$ instead of $V_{uw}^{(j-1)}$. That is, process j updates its chain to the better solution between the one obtained by its left neighbor in step uw - 1, $P_T(V_{uw-1}^{(j-1)})$, and its own realization of this step, $P_T(V_{uw-1}^{(j)})$. Graffigne [3] formulates the conjecture that if w is large enough to allow a total connection of the Markov chain (i.e. it is possible to transform any problem solution into any other solution in fewer than w annealing steps), then there is no need to add interaction between processes as defined by Eqs. (5) and (6); it is sufficient to run the processes independently and take the best solution in their set of final solutions. Graffigne admits that it is difficult to prove this conjecture, but she validates it on several examples.

Now note that the temperature of annealing decreases according to the formula $T_{i+1} = \alpha T_i$ for $i = 0, 1, 2, \ldots$. There are two possibilities in establishing the points at which the temperature drops and the processes interact. Namely, we may assume that the processes interact frequently during each temperature plateau, or that the temperature drops several times before an interaction takes place. In this paper the former approach is adopted.

The CS algorithm written for the message passing model of parallel computation is shown below. The computations are similar to those in the IS algorithm, but now the processes interact every w = n steps in order to pass their best solutions (lines 8-17). The scheme of communication of the processes $P_1, P_2, \ldots, P_p$ with process $P_0$ is as before.
Simultaneous periodically interacting searches (CS)

PROCESS P0: The same as for the IS.

PROCESSES Pj, j = 1, 2, . . . , p:
 1  Receive initial_solution from P0;
 2  old_solution_j := initial_solution;
 3  best_local_solution_j := initial_solution;
 4  T := cost(initial_solution)/1000;
 5  loop
 6    for iteration_counter_j := 1 to n^2 do
 7      annealing_step(old_solution_j, best_local_solution_j);
 8      if the number of steps is a multiple of n then {interact}
 9        if j = 1 then
10          Send best_local_solution_1 to process P2;
11        else {j > 1} Receive best_local_solution_{j-1} from process P_{j-1};
12          if cost(best_local_solution_{j-1}) < cost(best_local_solution_j) then
13            best_local_solution_j := best_local_solution_{j-1};
14          end if;
15          if j < p then Send best_local_solution_j to process P_{j+1}; end if;
16        end if;
17      end if;
18    end for;
19    Send best_local_solution_j to P0;
20    Receive go from P0;
21    if go = false then stop; end if;
22    T := alpha * T; {temperature reduction}
23  end loop;
The worst case time complexity of the CS algorithm is $T_p(n) \le pn + a(n^2 + (p-1)n^2 + pn + n + p)$. As compared to the previous algorithm, the complexity is higher by the cost of communication among the processes $P_1, P_2, \ldots, P_p$. Consider process $P_p$. There are n steps of communication (lines 8-17) during the execution of the for loop. The cost of a single communication step is proportional to $(p-1)n$, as $p-1$ messages of size n are sent and received. Thus the cost of execution of the for loop in process $P_p$ is now $n^2 + (p-1)n^2$.
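The interaction step of lines 8-17 is a pipeline pass along the line of processes. A minimal C/MPI sketch of one such interaction point follows; as before, the solution encoding, the cost() helper and the message tag are our assumptions:

    #include <mpi.h>
    #include <string.h>

    #define TAG_PIPE 3

    double cost(const int *solution, int n);   /* tour-length evaluation */

    /* One interaction point of the CS (lines 8-17): ranks 1..p form a
       line; each rank j > 1 receives its left neighbor's best solution
       and keeps the cheaper of the two, and every rank j < p forwards
       its current best to the right neighbor. */
    void cs_interact(int rank, int p, int *best_local, int *buffer, int n)
    {
        if (rank > 1) {
            MPI_Recv(buffer, n, MPI_INT, rank - 1, TAG_PIPE,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            if (cost(buffer, n) < cost(best_local, n))
                memcpy(best_local, buffer, n * sizeof(int));  /* adopt neighbor's solution */
        }
        if (rank < p)
            MPI_Send(best_local, n, MPI_INT, rank + 1, TAG_PIPE, MPI_COMM_WORLD);
    }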
6 Selection of alpha

One of the aims of our investigation is to select a suitable value for the parameter $\alpha$, $\alpha < 1$, which strongly influences the accuracy of solutions found by the annealing process. Recall that $\alpha$ controls the speed of reduction of the temperature of annealing. This temperature winds down from an initial value $T_0$ according to the formula $T_{i+1} := \alpha T_i$. In order to choose the best value for $\alpha$ we use the results of execution of the IS for p = 5 processes. As already mentioned, simulated annealing is probabilistic in nature, so the costs of solutions generated by the simultaneous independent searches can be regarded as a population with some distribution. Our goal is to select, for a fixed n, the value $\alpha = \alpha_0$ for which the population mean, $\mu$, is minimum, i.e. closest to the optimum. As a large random sample from the population we take the solution costs obtained from 500 executions of the IS. In Table 1 the mean values and standard deviations of the samples for the pairs $(\alpha, n)$, $\alpha \in \{0.895, 0.900, \ldots, 0.930\}$, $n \in \{40, 50, \ldots, 100\}$, are shown.

Let $\alpha_0$ denote the value of $\alpha$ for which, for a fixed n, the minimum mean value of solution costs is observed. Let $\bar{x}_{\alpha_0}$ denote this minimum value, and let $\alpha_k$, $\alpha_k \ne \alpha_0$, denote all other values from the set $\{0.895, 0.900, \ldots, 0.930\}$. We test the null hypothesis, $H_0\colon \mu_{\alpha_0} = \mu_{\alpha_k}$, versus the alternative hypothesis, $H_a\colon \mu_{\alpha_0} < \mu_{\alpha_k}$, for all $\alpha_k$, where $\mu_{\alpha_0}$ and $\mu_{\alpha_k}$ are the population means [16]. In other words, we want to verify whether a given value of the temperature reduction parameter, $\alpha_0$, gives the minimum mean value of population solution costs as compared to the other values $\alpha_k$. Using the test statistic

$$Z = \frac{\bar{X}_{\alpha_0} - \bar{X}_{\alpha_k}}{\sqrt{\sigma^2_{\alpha_0}/n_1 + \sigma^2_{\alpha_k}/n_2}}$$

we reject $H_0$ in favor of $H_a$ at the 0.01 significance level if $z < -z_{0.01} = -2.33$ (a one-sided test). The values of the population variances, $\sigma^2_{\alpha_0}$ and $\sigma^2_{\alpha_k}$, are replaced by their estimators, $s^2_{\alpha_0}$ and $s^2_{\alpha_k}$, respectively, i.e. by the variances of the sample sets. The calculated values of the test statistic for the data from Table 1 are shown in Table 2; the cases for which $H_0$ is rejected in favor of $H_a$ are those with $z < -2.33$. It can be seen that only in a few cases can we claim that $\alpha_0$ gives a significantly smaller population mean than the other $\alpha_k$'s. For example, if n = 100 we may assert that for $\alpha_0 = 0.925$ the population mean is smaller than for $\alpha_k = 0.895$. If n = 80, though, we may not claim that $\alpha_0 = 0.925$ gives a smaller mean than any other $\alpha_k$. Thus in some sense the result of testing our hypothesis is inconclusive. However, if we sum up the test statistic values for every $\alpha$ (cf. the Total column of Table 2), we observe a distinct maximum for $\alpha = 0.92$. Since a lower value of Total for a given $\alpha_k$ means that we are nearer the rejection regions for n = 40, 50, . . . , 100 (note that in a rejection region $\alpha_k$ is decidedly worse than $\alpha_0$), we conclude that a suitable value for the temperature reduction parameter is $\alpha = 0.92$.

a) Mean values of solution costs, x̄. The minimum value, $\bar{x}_{\alpha_0}$, for each n is marked with an asterisk.

    alpha \ n      40      50      60      70      80      90     100
      0.895     819.7   984.4  1177.1  1374.3  1510.5  1700.4  1846.3
      0.900    *819.7   984.3  1176.7  1374.2  1510.9  1700.1  1845.6
      0.905     819.7   984.1  1177.1  1373.8  1510.4  1700.0  1846.0
      0.910     819.8   984.0  1176.9  1373.6  1510.9  1699.9  1845.6
      0.915     819.8  *983.8  1177.1  1373.5  1510.4  1700.0  1845.9
      0.920     819.8   984.1 *1176.6  1373.4  1510.4 *1699.7  1845.8
      0.925     819.9   984.3  1176.8 *1373.3 *1510.2  1700.2 *1845.5
      0.930     820.2   984.8  1177.1  1373.4  1510.8  1700.0  1845.8

b) Standard deviations, s.

    alpha \ n      40      50      60      70      80      90     100
      0.895      1.99    4.69    3.82    6.06    5.99    4.61    5.45
      0.900      1.84    4.47    3.57    5.90    6.13    4.28    5.32
      0.905      1.90    4.15    3.82    6.03    5.84    4.52    5.36
      0.910      2.65    3.82    3.78    6.13    5.86    4.66    5.45
      0.915      2.86    3.35    3.65    6.43    5.59    4.20    5.27
      0.920      2.46    4.30    3.44    6.00    5.99    4.38    5.34
      0.925      3.00    5.17    4.01    6.35    5.87    4.68    5.24
      0.930      4.40    5.96    4.14    6.67    6.33    5.61    5.87

Table 1. The mean values and standard deviations for 500 executions of the simultaneous independent searches for p = 5 processes

    alpha \ n      40      50      60      70      80      90     100     Total
      0.895    -0.582  -2.309  -1.861  -2.592  -0.851  -2.305  -2.468   -12.968
      0.900         0  -1.848  -0.258  -2.421  -1.919  -1.487  -0.286    -8.219
      0.905    -0.198  -1.342  -2.103  -1.186  -0.709  -0.915  -1.643    -8.096
      0.910    -1.038  -0.786  -1.231  -0.907  -1.899  -0.786  -0.408    -7.055
      0.915    -0.572       0  -1.924  -0.561  -0.697  -0.954  -1.390    -6.098
      0.920    -0.860  -1.339       0  -0.336  -0.711       0  -1.004    -4.280
      0.925    -1.630  -2.011  -0.810       0       0  -1.746       0    -6.197
      0.930    -2.613  -3.136  -2.001  -0.385  -1.584  -1.108  -1.059   -11.886

Table 2. The test statistic values (n1 = 500)
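As an illustration of the computations behind Table 2 (our worked example, using the rounded values printed in Table 1), take n = 50, where $\alpha_0 = 0.915$, and compare it with $\alpha_k = 0.895$:

$$Z = \frac{983.8 - 984.4}{\sqrt{3.35^2/500 + 4.69^2/500}} = \frac{-0.6}{0.258} \approx -2.33,$$

which agrees with the tabulated value $-2.309$ up to the rounding of the printed means and deviations; this entry lies at the very edge of the rejection region $z < -2.33$.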
7 Comparison of the searches

The parallel IS and CS algorithms presented in sections 4 and 5 were implemented in the C language using the message passing interface (MPI) library. The implementations were run on the IBM RS/6000 SP, SGI Origin 2000 and Sun Enterprise 6500 multiprocessor computers. The test results of 500 executions of the algorithms are shown in Tables 3 and 4. The results were obtained for subsets of a test set consisting of n = 100 uniformly distributed customers in a 60 x 60 square with the depot in the center.
The columns of the tables contain the cardinality of a subset (n), the suspected optimum solution value¹ to the DP (Opt), the mean value of solution costs over 500 executions (x̄), the standard deviation of the results (s), and the number of times the optimum solution was hit in 500 executions (H). As already mentioned, Graffigne [3] investigated a slightly different scheme of the CS. She conjectured that under some assumption² there is no need to add interaction between processes before the last interaction, in which the final result is computed using formula (2). Our empirical results show that this conjecture does not hold for the modified scheme of interaction defined by Eqs. (3)-(6). We can compare the total numbers of hits into optima in both searches (see the Total values in Tables 3 and 4). Clearly, for the numbers of processes p = 4 and 5 these values are significantly larger for the CS than for the IS. This means that if we employ enough processes, then the co-operating searches give a higher probability of finding an optimum as compared to the independent searches.

¹ As mentioned in section 2, the delivery problem is NP-hard, so it cannot be solved to optimality within a reasonable computing time. The suspected optimum solution value is the best value found in all our experiments.
² The assumption regards the connectedness of the Markov chain. Given w = n and our definition of the neighborhood (see section 3), it is clear that this assumption holds.

a) p = 2

      n      Opt       x̄      s      H
     40    819.0   819.2   1.02    440
     50    982.2   982.9   2.30    347
     60   1173.4  1174.8   1.82     35
     70   1364.7  1370.2   5.08     43
     80   1502.9  1507.5   3.90     55
     90   1691.1  1697.3   3.58     37
    100   1835.4  1842.6   3.99      7
                          Total    957

b) p = 3

      n      Opt       x̄      s      H
     40    819.0   819.1   1.09    470
     50    982.2   982.5   1.88    403
     60   1173.4  1174.4   1.50     51
     70   1364.7  1368.3   4.16     98
     80   1502.9  1505.9   2.79     96
     90   1691.1  1696.2   3.07     39
    100   1835.4  1841.5   3.03      7
                          Total   1022

c) p = 4

      n      Opt       x̄      s      H
     40    819.0   819.8   2.74    343
     50    982.2   984.1   4.19    232
     60   1173.4  1176.9   3.81      9
     70   1364.7  1373.8   6.43     37
     80   1502.9  1510.3   6.22     33
     90   1691.1  1700.1   4.83     14
    100   1835.4  1846.1   5.48      1
                          Total    621

d) p = 5

      n      Opt       x̄      s      H
     40    819.0   819.8   2.46    333
     50    982.2   984.1   4.30    220
     60   1173.4  1176.6   3.44     15
     70   1364.7  1373.4   6.00     37
     80   1502.9  1510.4   5.99     31
     90   1691.1  1699.7   4.38     19
    100   1835.4  1845.8   5.34      5
                          Total    655

Table 3. Performance of the independent searches (IS) for p = 2..5 processes ($\alpha = 0.92$).
a) p = 2

      n      Opt       x̄      s      H
     40    819.0   819.3   1.50    425
     50    982.2   984.2   4.30    231
     60   1173.4  1177.0   4.01     14
     70   1364.7  1373.6   6.15     32
     80   1502.9  1510.5   6.11     39
     90   1691.1  1699.6   4.75     15
    100   1835.4  1846.1   5.40      4
                          Total    760

b) p = 3

      n      Opt       x̄      s      H
     40    819.0   820.0   2.79    320
     50    982.2   984.2   4.68    239
     60   1173.4  1176.9   4.05     19
     70   1364.7  1373.6   6.27     32
     80   1502.9  1510.4   5.78     24
     90   1691.1  1700.1   4.53     14
    100   1835.4  1845.9   5.32      1
                          Total    649

c) p = 4

      n      Opt       x̄      s      H
     40    819.0   819.2   1.42    480
     50    982.2   982.4   0.98    442
     60   1173.4  1174.1   2.35     78
     70   1364.7  1367.2   3.85    160
     80   1502.9  1504.8   2.27    174
     90   1691.1  1695.0   3.00     85
    100   1835.4  1840.0   3.28     23
                          Total   1442

d) p = 5

      n      Opt       x̄      s      H
     40    819.0   819.2   1.28    477
     50    982.2   982.5   2.58    457
     60   1173.4  1173.9   1.18     86
     70   1364.7  1366.5   3.51    213
     80   1502.9  1504.5   1.91    202
     90   1691.1  1694.3   2.76    116
    100   1835.4  1839.4   2.95     29
                          Total   1580

Table 4. Performance of the co-operating searches (CS) for p = 2..5 processes ($\alpha = 0.92$).

Furthermore, based on the mean values, x̄, and the standard deviations, s, from Tables 3 and 4, we test the hypothesis $H_0\colon \mu_{IS} = \mu_{CS}$ versus the alternative hypothesis $H_a\colon \mu_{IS} > \mu_{CS}$. In all cases for which $H_0$ is rejected we can claim that the IS gives inferior solution costs in comparison to the CS. Using the test statistic

$$Z = \frac{\bar{X}_{IS} - \bar{X}_{CS}}{\sqrt{s^2_{IS}/n_1 + s^2_{CS}/n_2}}$$

we reject $H_0$ at the 0.01 significance level if $z > z_{0.01} = 2.33$. The calculated values of the test statistic for the data from Tables 3 and 4 are shown in Table 5; the cases for which $H_0$ is rejected are those with $z > 2.33$. It can be seen that the co-operating searches compare favorably to the independent searches for all n and numbers of processes p > 3.

             Number of processes, p
      n        2        3        4        5
     40    -1.23    -6.72     4.34     4.84
     50    -5.96    -7.54     8.83     7.13
     60   -11.17   -12.94    13.99    16.60
     70    -9.53   -15.75    19.69    22.20
     80    -9.25   -15.68    18.57    20.98
     90    -8.65   -15.94    20.06    23.32
    100   -11.66   -16.07    21.36    23.46

Table 5. The test statistic values for the comparison of the independent (IS) and co-operating (CS) searches (n1 = n2 = 500)
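To illustrate, the Table 5 entry for n = 40 and p = 5 follows directly from the d) panels of Tables 3 and 4 (our worked check):

$$Z = \frac{819.8 - 819.2}{\sqrt{2.46^2/500 + 1.28^2/500}} = \frac{0.6}{0.124} \approx 4.84,$$

so $H_0$ is rejected, and the CS mean is significantly smaller than the IS mean in this case.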
8 Conclusions
We investigated the performance of two parallel simulated annealing algorithms applied to the delivery problem. Assuming the message passing model of parallel computation, the simultaneous independent searches and the simultaneous periodically interacting (or co-operating) searches were implemented. First, a suitable value for the temperature reduction parameter of annealing, $\alpha$, was selected, and then the parallel implementations of both searches were run using that value ($\alpha = 0.92$). While looking for the value of $\alpha$ and comparing the performance of the searches we used the method of statistical hypothesis testing. As the main result of our work we consider the empirical evidence, supported by the statistical analysis (see section 7), indicating that the interaction of processes in parallel simulated annealing can yield more accurate solutions to the delivery problem (in terms of their proximity to the optimum solutions) as compared to the case when the processes run independently. This is contrary to the conjecture posed by Graffigne [3].

It is well known that in order to achieve good problem solutions, the serial simulated annealing algorithm has to be carefully tuned by selecting suitable values of its parameters. These parameters include: (i) the initial solution: in solving the DP the initial solution is built as a set of tours of size 3, where the customers are grouped into the tours randomly; (ii) the initial temperature of annealing: we assume $T_0 = cost(\mathit{initial\ solution})/1000$; (iii) the cooling schedule: we adopt the scheme $T_{i+1} := \alpha T_i$ with the value $\alpha = 0.92$ chosen empirically (see section 6); (iv) the number of annealing steps executed at each temperature: taken as $n^2$; (v) the termination condition: the DP algorithm stops if 20 consecutive stages of temperature reduction fail to improve the best solution found so far, i.e. if equilibrium is reached. In addition to these parameters of the serial algorithm, the following issues have to be addressed in a parallel implementation: (vi) the scheme of interaction (co-operation) of processes: we assume that in a line of processes, $P_1, P_2, \ldots, P_p$, each process at the points of interaction takes for further exploration the better solution between its own current solution and the one obtained from its left neighbor; (vii) the frequency of interaction: it is assumed that during the $n^2$ annealing steps of a cooling stage the processes interact every n steps, i.e. there are n points of interaction at each cooling stage. Surely, the last two issues add a new dimension of difficulty in tuning the simulated annealing algorithm. Further research may address the frequency of interaction (some of the possibilities here are mentioned in section 5) and other schemes of co-operation of processes [10, 3, 4, 17].
Acknowledgements

We would like to thank the Wrocław Centre of Networking and Supercomputing for the computing grant No. 04/97, and the Computer Center of the Silesia University of Technology for a similar computing grant, which enabled us to obtain the empirical results described in this work.
References

[1] Aarts, E.H.L., and Korst, J.H.M., Simulated Annealing and Boltzmann Machines, Wiley, Chichester, 1989.
[2] Altinkemer, K., and Gavish, B., Parallel savings based heuristics for the delivery problem, Operations Research 39, 3 (May-June 1991), 456-469.
[3] Azencott, R., Parallel simulated annealing: An overview of basic techniques, in Azencott, R. (Ed.), Simulated Annealing. Parallelization Techniques, J. Wiley, NY, (1992), 37-46.
[4] Boissin, N., and Lutton, J.-L., A parallel simulated annealing algorithm, Parallel Computing 19, (1993), 859-872.
[5] Černý, V., A thermodynamical approach to the travelling salesman problem: an efficient simulation algorithm, J. of Optimization Theory and Applic. 45, (1985), 41-55.
[6] Christofides, N., Mingozzi, A., and Toth, P., Exact algorithms for the vehicle routing problem, based on spanning tree and shortest path relaxations, Math. Prog. 20, (1981), 255-282.
[7] Clarke, G., and Wright, J., Scheduling of vehicles from a central depot to a number of delivery points, Opns. Res. 12, (1964), 568-581.
[8] Czech, Z.J., Parallel simulated annealing for the set-partitioning problem, Proc. of the 8th Euromicro Workshop on Parallel and Distributed Processing, Rhodes, Greece, (January 19-21, 2000), 343-350.
[9] Fisher, M.L., and Jaikumar, R., A generalized assignment heuristic for vehicle routing, Networks 11, (1981), 109-124.
[10] Greening, D.R., Parallel simulated annealing techniques, Physica D 42, (1990), 293-306.
[11] Haimovich, M., and Rinnooy Kan, A., Bounds and heuristics for capacitated routing problems, Math. Opns. Res. 10, (1985), 527-542.
[12] Kirkpatrick, S., Gelatt, C.D., and Vecchi, M.P., Optimization by simulated annealing, Science 220, (1983), 671-680.
[13] Lenstra, J.K., and Rinnooy Kan, A.H.G., Complexity of vehicle routing and scheduling problems, Networks 11, (1981), 221-227.
[14] Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., and Teller, E., Equation of state calculation by fast computing machines, Journ. of Chem. Phys. 21, (1953), 1087-1091.
[15] Reeves, C.R. (Ed.), Modern Heuristic Techniques for Combinatorial Problems, McGraw-Hill, London, 1995.
[16] Scheaffer, R.L., and McClave, J.T., Probability and Statistics for Engineers, Duxbury Press, Calif., 1995.
[17] Verhoeven, M.G.A., and Aarts, E.H.L., Parallel local search techniques, Journal of Heuristics 1, (1996), 43-65.