
MIC2003: The Fifth Metaheuristics International Conference


A new Ant Colony System updating strategy for Vehicle Routing Problem with Time windows

Issmail Ellabib

Otman A. Basir

Paul Calamai

Department of Systems Design Engineering, University of Waterloo, Waterloo, Canada, N2L 3G1
{iellabib, obasir}@engmail.uwaterloo.ca [email protected]

1 Introduction

Vehicle fleet planning accounts for an important fraction of the economic, social, and environmental costs of most transportation systems. Its applications encompass diverse activities such as retail distribution, school bus routing, mail and newspaper delivery, municipal waste collection, fuel and oil delivery, and e-commerce. In general, these activities raise two prevailing planning issues: the routing of capacitated vehicles to visit a set of customers, and the scheduling of vehicles to meet timing or precedence restrictions imposed on the vehicle routes. We view these issues as a combined vehicle routing and scheduling problem, commonly known as the Vehicle Routing Problem with Time Windows (VRPTW). VRPTW concerns the efficient use of a fleet of capacitated vehicles that must make a number of stops to serve a set of customers. The task is to specify which customers should be served by each vehicle, and in what order, so as to minimize cost subject to vehicle capacity and to the service time restrictions imposed at the depot and at the customer locations. A hierarchical objective function can be used in which minimizing the number of vehicles is the primary criterion and minimizing the total length of their tours is the secondary criterion.

VRPTW has received a lot of attention, mainly because of the complexity of the problem and the wide applicability of time window constraints in the real world. Various optimization methods have been applied to solve the problem; the reader may consult the excellent survey on VRPTW in [2].

A number of algorithms inspired by the foraging behavior of ant colonies have recently been applied successfully to many hard combinatorial optimization problems. These algorithms are called Ant Colony Optimization (ACO) algorithms. Their basic idea is that a large number of simple artificial ants are able to build good solutions to hard combinatorial optimization problems via low-level communication. Different representations of pheromone information, obtained through different updating strategies, have been introduced to improve the performance of ACO algorithms. For more details on these algorithms see Bonabeau et al. [1].

In this paper, we introduce a new updating strategy for the global updating rule of one of the most recent ACO algorithms, the Ant Colony System (ACS) algorithm, for solving VRPTW. We recommend the use of an adaptive weighting scheme that weights the contribution of ants based on their quality and the state of convergence. The paper is organized as follows. First, the ACS algorithm and the new updating strategy are discussed in Section 2. Then, the methodology for solving VRPTW based on a simple Ant Colony System model is described in Section 3. Computational results of the improved ACS are presented in Section 4 along with a comparative performance analysis involving other metaheuristic approaches. Finally, some concluding remarks are provided in Section 5.

2 Ant Colony System (ACS) algorithm

The basic idea of an ACS algorithm is that a large number of simple artificial ants are able to generate good solutions to the Traveling Salesman Problem (TSP) via low-level communication. An ACS model simulates the foraging behavior of real ants in the sense that artificial ants cooperate to construct good solutions by using a common memory that emulates the pheromone deposited by real ants. The artificial pheromone is accumulated during the construction phase through a learning mechanism implied in the pheromone updating rules. An artificial ant is an agent that moves from city to city on a TSP graph according to a transition rule and updates the pheromone trail of its tour. Each ant $k$ starts from a randomly chosen city and, from its current city $i$, selects an unvisited city $j$ from the list $J_i^k$ according to the transition rule until it has visited all cities. The transition rule is given by:

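$$
j = \begin{cases}
\arg\max_{u \in J_i^k} \left\{ [\tau_{iu}(t)] \cdot [\eta_{iu}(t)]^{\beta} \right\} & \text{if } q \le q_0, \\
J & \text{if } q > q_0,
\end{cases}
\qquad (1)
$$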

where $q$ is a random variable uniformly distributed over $[0,1]$ and $q_0$ is a tunable parameter ($0 \le q_0 \le 1$). $\tau_{ij}$ is the amount of pheromone on the edge that connects city $i$ and city $j$, and $\eta_{ij}$ is the reciprocal of the travel cost between city $i$ and city $j$ (called visibility). $\beta$ is a control parameter that represents the relative importance of the pheromone trail versus the visibility value. $J \in J_i^k$ is a city that is selected randomly according to the following probability:

$$
P_{iJ}^{k}(t) = \frac{[\tau_{iJ}(t)] \cdot [\eta_{iJ}(t)]^{\beta}}{\sum_{u \in J_i^k} [\tau_{iu}(t)] \cdot [\eta_{iu}(t)]^{\beta}}
\qquad (2)
$$
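The transition rule of Eqn.1 and Eqn.2 can be illustrated with a minimal Python sketch. It is not the authors' implementation (which was written in C++): the function name `choose_next_city`, the nested-list layout of `tau` and `eta`, and the `rng` argument are assumptions made for illustration only.

```python
import random

def choose_next_city(i, unvisited, tau, eta, beta=1.0, q0=0.93, rng=random):
    """Pick the next city from `unvisited` following Eqn.1 and Eqn.2.

    tau[i][u] and eta[i][u] are assumed to hold the pheromone and the
    visibility of edge (i, u); this layout is an illustrative choice.
    """
    # Attractiveness of each candidate edge: pheromone times visibility^beta.
    scores = {u: tau[i][u] * eta[i][u] ** beta for u in unvisited}
    if rng.random() <= q0:
        # First case of Eqn.1: take the most attractive edge.
        return max(scores, key=scores.get)
    # Second case of Eqn.1: draw J with the probabilities of Eqn.2.
    cities = list(scores)
    return rng.choices(cities, weights=[scores[u] for u in cities], k=1)[0]
```

With a value of q0 close to 1, almost every step takes the most attractive edge, and the random draw of Eqn.2 is used only occasionally.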

An edge with a high transition probability has a high chance of being selected. The transition rule is based on the amount of pheromone on the edge and on the visibility, which represents the heuristic quality of that edge. The pheromone trail is updated in two different ways.

Local updating rule: As an ant moves between cities, it updates the amount of pheromone on the traversed edge using the following formula:

$$
\tau_{ij}(t) = (1 - \rho) \cdot \tau_{ij}(t-1) + \rho \cdot \tau_0
\qquad (3)
$$

The value $\tau_0$ is the initial value of pheromone on all edges and can be calculated as $\tau_0 = (n \cdot C_h)^{-1}$, where $n$ is the number of cities and $C_h$ is the travel cost of a tour produced by one of the construction heuristics. The persistence factor $\rho$ is a given parameter governing pheromone decay; it takes values in the range $[0,1]$.

Global updating rule: When all ants have completed their tours, the ant that found the least-cost tour (of cost $C_b$) updates the edges of its tour using the following formula:

$$
\tau_{ij}(t) = (1 - \rho) \cdot \tau_{ij}(t-1) + \rho \cdot (1/C_b)
\qquad (4)
$$
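As a rough illustration of Eqn.3 and Eqn.4, the following Python sketch applies both rules to a small pheromone matrix. The function names, the directed nested-list matrix `tau`, and the representation of a tour as a closed list of city indices are assumptions made for this example, not details taken from the paper.

```python
def local_update(tau, i, j, rho, tau0):
    """Eqn.3: applied to edge (i, j) each time an ant traverses it."""
    tau[i][j] = (1.0 - rho) * tau[i][j] + rho * tau0

def global_update(tau, best_tour, best_cost, rho):
    """Eqn.4: reinforce only the edges of the least-cost tour.

    Assumption: the tour is closed, so the last city links back to the first.
    """
    for i, j in zip(best_tour, best_tour[1:] + best_tour[:1]):
        tau[i][j] = (1.0 - rho) * tau[i][j] + rho * (1.0 / best_cost)

# Hypothetical 4-city instance: C_h = 20 from a construction heuristic.
n, c_h, rho = 4, 20.0, 0.1
tau0 = 1.0 / (n * c_h)                    # tau_0 = (n * C_h)^-1 = 0.0125
tau = [[tau0] * n for _ in range(n)]
local_update(tau, 0, 1, rho, tau0)        # pulls edge (0, 1) toward tau_0
global_update(tau, [0, 1, 2, 3], 16.0, rho)
```

In this example the edges of the best tour end at 0.9 × 0.0125 + 0.1 × (1/16) = 0.0175, while all other edges stay at the initial value 0.0125.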



The local updating rule makes the edge pheromone level diminish, indirectly favoring the exploration of unvisited edges. Consequently, ants tend not to converge to a common tour [1]. In the global updating rule, only the ant that generated the best tour is allowed to update the edges of its tour. This rule encourages the other ants to search for tours in the vicinity of the best tour found so far. For more details on ACS see Dorigo and Gambardella [3]. In this paper, the underlying assumption is that further high quality solutions may lie in the neighborhood of high quality solutions. Thus, we propose to modify the global updating rule so that ants are encouraged to search for tours in the vicinity of the high quality tours generated. The following subsection discusses how the proposed updating strategy modifies the global updating rule.

2.1 Weight updating strategy

In this strategy, the ants are weighted based on the quality of their solutions as well as on the state of convergence of the population. The pheromone trails of their tours are then updated adaptively in response to their weights. The extra amount of pheromone is decreased when the ant population approaches a local optimum and is increased when the ant population is scattered in the solution space. It is therefore essential to identify whether the ant population is converging to a local optimum. One possible way of detecting convergence is to observe the average solution cost of the ant population in relation to its minimum cost: the difference between the average tour cost $C_{\text{avg}}$ and the minimum tour cost $C_{\min}$ of a population is likely to be smaller for a population that has converged to a local optimum than for a population scattered in the search space. We have observed this property in our experiments with the ACS algorithm. Therefore, we use the difference between the average and minimum values ($C_{\text{avg}} - C_{\min}$) as a yardstick for detecting the convergence of ACS. A weighting factor for adding an extra amount of pheromone can then be determined by combining this measure with the tour cost $C_k$ of ant $k$. The expression takes the form:

$$
w_k = \begin{cases}
\dfrac{C_{\text{avg}} - C_k}{C_{\text{avg}} - C_{\min}} & \text{if } C_k < C_{\text{avg}}, \\
0 & \text{otherwise}
\end{cases}
\qquad (5)
$$
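A small worked example may help to read Eqn.5. The Python sketch below computes the weights for one iteration; the function name `ant_weights` and the four tour costs are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def ant_weights(costs):
    """Weight w_k of Eqn.5 for every ant, given the tour costs C_k of one iteration."""
    c_avg = sum(costs) / len(costs)
    c_min = min(costs)
    # Only ants better than average get a positive weight; the best ant gets 1.
    return [(c_avg - c) / (c_avg - c_min) if c < c_avg else 0.0 for c in costs]

# Hypothetical costs: C_avg = 120 and C_min = 100, so C_avg - C_min = 20.
print(ant_weights([100.0, 110.0, 120.0, 150.0]))  # [1.0, 0.5, 0.0, 0.0]
```

The ant with cost 110 lies halfway between the minimum and the average and therefore receives a weight of 0.5, while ants at or above the average receive no extra pheromone.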

Under this scheme, the solution of each ant has a weighting value $w_k$ in the range $[0,1]$. The weighting value depends not only on the measure of convergence but also on the difference between $C_k$ and $C_{\min}$: the closer $C_k$ is to $C_{\min}$, the larger $w_k$ is. A reinforcement amount of pheromone can therefore be incorporated in the global updating rule, with the extra amount of pheromone proportional to the solution quality and the current state of convergence. The expression for the amount of pheromone deposited on the edges of the best tour (Eqn.4) is modified so as to add an extra amount of pheromone to the edges belonging to high quality tours. Thus, the global updating rule of Eqn.4 is modified as follows:

$$
\tau_{ij}(t) = \begin{cases}
(1 - \rho) \cdot \tau_{ij}(t-1) + \rho \cdot (w_k / C_{\min}) & \text{if edge}(i,j) \in T^k, \\
(1 - \rho) \cdot \tau_{ij}(t-1) + \rho \cdot (1/C_b) & \text{if edge}(i,j) \in T^b
\end{cases}
\qquad (6)
$$

where $T^k$ and $T^b$ are the tours generated by ant $k$ and by the best ant $b$, respectively.
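One possible reading of Eqn.6 is sketched below in Python. Several details are not specified in the text and are assumptions of this sketch: ants with $w_k = 0$ leave the pheromone unchanged, an edge that belongs to the best tour receives only the second case, and an edge shared by several weighted tours is updated once per tour. The function names and the closed-tour list representation are likewise illustrative.

```python
def tour_edges(tour):
    """Directed edges of a closed tour given as a list of node indices."""
    return list(zip(tour, tour[1:] + tour[:1]))

def weighted_global_update(tau, tours, weights, c_min, best_tour, best_cost, rho):
    """Sketch of Eqn.6 under the assumptions stated in the lead-in."""
    best = set(tour_edges(best_tour))
    for tour, w in zip(tours, weights):
        if w <= 0.0:
            continue  # assumption: tours with zero weight are not updated
        for i, j in tour_edges(tour):
            if (i, j) not in best:
                # First case of Eqn.6: extra pheromone scaled by w_k / C_min.
                tau[i][j] = (1.0 - rho) * tau[i][j] + rho * (w / c_min)
    for i, j in best:
        # Second case of Eqn.6: the usual best-tour reinforcement of Eqn.4.
        tau[i][j] = (1.0 - rho) * tau[i][j] + rho * (1.0 / best_cost)
```

Compared with Eqn.4, the only change is the first loop, which lets every better-than-average tour deposit pheromone in proportion to its weight.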



3 ACS model for VRPTW

A simple model of a single colony system is applied in this work to solve VRPTW with a hierarchical objective function. The ACS model is applied to VRPTW by transforming the problem into one that closely resembles the TSP graph while including the problem constraints. The approach starts by applying the Nearest Neighbor (NN) heuristic of Solomon [5] to create an initial feasible solution, and then lets an ACS constructive procedure, similar to the procedure designed for the TSP in [4], improve this initial feasible solution. In this procedure, each ant starts from the depot and moves to a feasible unvisited customer according to the transition rule of Eqn.1 and Eqn.2 until all remaining unvisited customers have been visited. At each ant step, the amount of pheromone on the selected edge is updated locally using Eqn.3. The modified global updating rule introduced in Eqn.6 is applied once the ants have visited all customers. The visibility value ($\eta_{ij} = 1/C_{ij}$) of each edge is determined using the cost function Type3 introduced in [4], which is given as:

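$$
C_{ij} = w_1 \cdot (T_{ij} \cdot V_{ij}) + w_2 \cdot P_{ij}
\qquad (7)
$$

$$
T_{ij} = b_j - (b_i + s_i)
\qquad (8)
$$

$$
V_{ij} = l_j - (b_i + s_i)
\qquad (9)
$$

$$
P_{ij} = \theta_j - \theta_i
\qquad (10)
$$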

$w_1$ and $w_2$ are the weights of the travel and delay time product and of the coordinate angle difference, respectively. The first term of Eqn.7 combines the travel time $T_{ij}$ and the delay time $V_{ij}$ at node $j$ (customer or depot) when the vehicle comes from node $i$; they are calculated according to Eqn.8 and Eqn.9, respectively. $b_i$ and $s_i$ are the beginning of service time and the service time at the current node $i$. $b_j$ and $l_j$ are the beginning of service time and the latest start of service time at the next node $j$. The second term of Eqn.7 represents the absolute difference in position angle between the current node $i$ and the next node $j$, calculated according to Eqn.10, where $\theta_i$ and $\theta_j$ are the polar coordinate angles of nodes $i$ and $j$. More details on different heuristic functions are given in [4].
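The cost function of Eqn.7 to Eqn.10 can be sketched in Python as follows. The function names, the default weights, and the small `eps` guard against a zero or negative cost are assumptions introduced for this example. The absolute value follows the textual description of the angle term rather than the literal form of Eqn.10.

```python
def edge_cost(b_i, s_i, b_j, l_j, theta_i, theta_j, w1=1.0, w2=1.0):
    """Cost C_ij of moving from node i to node j (Eqn.7)."""
    t_ij = b_j - (b_i + s_i)        # Eqn.8: time from end of service at i to start at j
    v_ij = l_j - (b_i + s_i)        # Eqn.9: time left until the latest start at j
    p_ij = abs(theta_j - theta_i)   # Eqn.10, taken as an absolute difference
    return w1 * (t_ij * v_ij) + w2 * p_ij

def visibility(cost, eps=1e-9):
    """eta_ij = 1 / C_ij; eps is an assumed guard, not part of the paper."""
    return 1.0 / max(cost, eps)

# Hypothetical numbers: service at i starts at 10 and lasts 5; service at j
# starts at 25 and may start no later than 40; angles are in radians.
c = edge_cost(b_i=10.0, s_i=5.0, b_j=25.0, l_j=40.0, theta_i=0.2, theta_j=0.5)
eta = visibility(c)
```

Here $T_{ij} = 10$ and $V_{ij} = 25$, so the first term contributes 250 and the angle term adds 0.3. A customer that can be reached sooner, or whose time window closes sooner, gets a smaller cost and hence a larger visibility.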

4 Experimental results

In this section, we present computational results obtained by the ACS algorithm using the proposed weight updating strategy and compare them with results obtained by other metaheuristic approaches on the well-known VRPTW data sets introduced by Solomon [5]. The data sets contain 56 instances classified into six classes (C1, C2, R1, R2, RC1, RC2), each with different properties. The experiments were performed on a 700 MHz Pentium III with 128 MB of RAM running Red Hat Linux. The code was written in C++. The experiments used the following parameter settings: $\beta = 1$, $\rho = 0.1$, $q_0 = 0.93$, $m = 10$ ants, and a maximum of 300 iterations. Performance is measured in terms of the average best solution found for each data set.

Table 1 reports the average of the best solutions obtained in our experiments using the modified ACS (ACSw). It also lists the results reported for other metaheuristic approaches: Tabu Search (TS) of Tan et al. [6] on a Pentium II machine (266 MHz, MMX, 32 MB RAM), Simulated Annealing (SA) and a hybrid of Tabu Search and Simulated Annealing (TSSA) of Thangiah et al. [8] on a NeXT 68040 machine (25 MHz), a genetic algorithm (GIDEON) of Thangiah [7] on a SOLBOURNE 5/802 machine, and Ant Colony System with the insertion heuristic (ACS+I1) of Ellabib et al. [4].

Approach  Measure  R1         R2         C1         C2         RC1        RC2
TS        NV       13.83      3.82       10.00      3.25       13.63      4.25
TS        TD       1266.24    1080.23    870.93     634.82     1458.18    1293.38
TS        CPU      [1076]     [3323]     [557]      [1885]     [1016]     [3217]
SA        NV       13.70      3.20       10.00      3.00       13.40      3.80
SA        TD       1252.00    1169.00    883.00     687.00     1454.00    1249.00
SA        CPU      [96]       [1254]     [24]       [714]      [108]      [510]
TSSA      NV       13.33      3.20       10.00      3.00       13.00      3.90
TSSA      TD       1242.42    1113.00    831.00     663.00     1413.00    1257.00
TSSA      CPU      [607.00]   [648.09]   [298.89]   [288.38]   [520.00]   [1070.25]
GIDEON    NV       12.75      3.18       10.00      3.00       12.5       3.38
GIDEON    TD       1300.25    1124.28    892.11     749.13     1474.13    1411.13
GIDEON    CPU      [97.37]    [183.24]   [89.66]    [142.16]   [111.83]   [159.44]
ACS+I1    NV       13.08      3.18       10.44      3.25       12.88      3.38
ACS+I1    TD       1441.08    1294.09    1118.67    698.88     1566.88    1521.25
ACS+I1    CPU      [N/A]      [N/A]      [N/A]      [N/A]      [N/A]      [N/A]
ACSw      NV       12.67      3.00       10.00      3.00       12.38      3.38
ACSw      TD       1383.12    1309.19    914.71     657.14     1555.27    1518.58
ACSw      CPU      [13.45]    [11.00]    [9.68]     [15.94]    [15.52]    [17.21]

Table 1: Average number of vehicles (NV), total distance traveled (TD), and computation time (CPU) in seconds obtained by the proposed approach ACSw and five other approaches.

In particular, ACS with the modified updating rule (ACSw) significantly improves on the ACS algorithm introduced in [4] in terms of both solution quality (NV and TD) and speed of convergence (number of generated ant tours). For each instance, the results of ACSw were obtained after the algorithm had generated 3,000 ant tours (10 ants over 300 iterations), whereas the results of ACS+I1 were obtained after the algorithm had generated 30,000 ant tours.
The results also show that ACSw is very competitive with the other metaheuristic approaches on most data sets: it is clearly the best approach on four data sets (R1, R2, C2, and RC1) and is among the best approaches on the other data sets. Computation times are very difficult to compare across approaches because of the different implementations and hardware used, but ACSw was able to produce relatively high quality solutions consistently and in a short amount of time.



5 Conclusion and further work

In this paper, a new updating strategy is introduced into the global updating rule of the standard ACS algorithm, and a simple ACS model based on this strategy is applied to solve VRPTW. The strategy weights all ants based on the quality of their tours and the state of convergence. The pheromone trails on the edges of the ant tours are then updated according to these weights. Computational experiments on the well-known data sets of Solomon [5] indicate that the modified updating rule significantly improves the performance of ACS. Moreover, the results of the improved algorithm compete very well against other metaheuristic approaches found in the literature. The best solutions obtained by the improved ACS do not yet match the best solutions published so far for some other metaheuristic approaches. This is to be expected because those approaches embed more specific local search procedures in their iterations. One advantage of the improved ACS algorithm over the standard ACS algorithm is that the high quality solutions found in each iteration are evaluated in terms of both their quality and the state of convergence, which directs the search systematically toward promising regions. The results are encouraging, and future work should be directed at developing other such adaptive strategies for updating the pheromone. It would also be interesting to see the effect of such strategies on different problem types.

6 References

[1] E. Bonabeau, M. Dorigo, and G. Theraulaz. Swarm Intelligence: From Natural to Artificial Systems. New York: Oxford University Press, 1999.

[2] J. Cordeau, G. Desrosiers, M. Solomon, and F. Soumis. The VRP with Time Windows, in The Vehicle Routing Problem, Chapter 7, SIAM Monographs on Discrete Mathematics and Applications, 157-193, 2002.

[3] M. Dorigo and L. Gambardella. Ant Colony System: A Cooperative Learning Approach to the Traveling Salesman Problem, IEEE Trans. on Evol. Comp. Vol.1, 53-66, 1997.

[4] I. Ellabib, O. Basir, and P. Calamai. An Experimental Study of a Simple Ant Colony System for the Vehicle Routing Problem with Time Windows, Ant Algorithms, Springer LNCS 2463, Berlin/Heidelberg, 53-64, 2002.

[5] M. Solomon. Algorithms for the Vehicle Routing and Scheduling Problems with Time Window Constraints, Operations Research 35: 254-265, 1987.

[6] K. Tan, Q. Lee, and K. Zhu. Heuristic methods for vehicle routing problem with time windows, Artificial Intelligence in Engineering 15: 281-295, 2001.

[7] S. Thangiah. Vehicle Routing with Time Windows using Genetic Algorithms, Application Handbook of Genetic Algorithms: New Frontiers, Volume II. Lance Chambers (Ed.), CRC Press, 253-277, 1995.

[8] S. Thangiah, I. Osman, and T. Sun. Hybrid Genetic Algorithm, Simulated Annealing, and Tabu Search Methods for Vehicle Routing Problems with Time Windows. Technical Report 27, Computer Science Department, Slippery Rock University, 1994.

Kyoto, Japan, August 25–28, 2003