Evolutionary Algorithms Find Routes in Public Transport Network with Optimal Time of Realization Anna Piwonska, Jolanta Koszelew Technical University of Bialystok Computer Science Faculty Wiejska 45A, 15-351 Bialystok, Poland {a.piwonska, j.koszelew}@pb.edu.pl
Abstract. This paper presents a new evolutionary algorithm, called Routes Generation Evolutionary Algorithm with Knowledge (RGEAwK), for determining routes with optimal travel time in a graph which models a public transport network. The method was implemented and tested on a real transport network. The effectiveness of the method was compared with that of the Routes Generation Matrix Algorithm (RGMA) [8]. The comparison is based on experiments performed on realistic data. The paper ends with some remarks about future work on improving RGEAwK.
Keywords: public transport network, time-dependent graph, optimal routes, evolutionary algorithm.
1 Introduction
There are many commercial computer systems which solve optimization problems in transport networks, such as BUSSMAN, HOT or HASTUS [1], [2]. However, it is very difficult to find optimal routes in public transport systems in a way that takes users’ preferences into consideration. Effective algorithms which generate optimal routes in a public transport network are the heart of journey planners [4], [3]. A user of such a system specifies the source and destination points of the travel, the start time and their preferences, and the system returns information about optimal routes. In practice, the preferences of public transport users vary, but the most important ones are a minimal travel time and a minimal number of transfers. Because the graph which models a public transport network has time-dependent dynamic weights, the problem of finding routes with a minimal travel time is NP-complete [5], [6].
The authors present a new evolutionary algorithm, called RGEAwK, for determining routes with optimal travel time. The method was implemented and tested on a real public transport network. The effectiveness of the method was compared with that of the RGMA algorithm [7], [8]. RGMA realizes the label-setting strategy and uses two special matrices, called the transfer matrix and the minimal distance matrix. It exploits the following assumption: routes with a minimal (or close to minimal) number of transfers, or with minimal length (number of bus stops on the route), probably have an optimal time of realization.
The previous version of RGEAwK was the RGEA algorithm [9]. The main problem with RGEA was insufficient performance for routes covering very long distances. This motivated the authors to attempt to improve RGEA. The main improvement was incorporating knowledge about the transport network into the algorithm. The improved version, RGEAwK, is described in this paper.
The next section includes the definition of the optimal routes generation problem and a description of the public transport network model. RGEAwK is described in Section 3. In Section 4 the authors compare the effectiveness of the two methods, RGMA and RGEAwK, in two aspects: time of realization of routes (taking into account the number of transfers in routes) and time complexity. This comparison is based on experiments performed on realistic data for the city of Bialystok. The paper ends with some remarks about future work on improving RGEAwK.
2 Network Model and Problem Definition
A transportation network in our model is represented as a bimodal weighted graph G = (V, E, t) [10], where V is a set of nodes, E is a set of edges and t is a weight function. Each node in G corresponds to a certain transport station. We assume, for simplification, that there is only one kind of public transportation, the bus, so each node corresponds to a bus stop. This assumption does not limit the applications of the presented methods. We also assume that bus stops are numbered from 1 to n. A directed edge (i, j) belongs to E if some bus line l connects stop i, as a source point, with stop j, as a destination. Each such directed edge, called a bus link, corresponds to one possible connection between two stops. Each edge has a weight t_ij equal to the travel time (in minutes) between nodes i and j, which can be determined on the basis of timetables (Tab. 1).
The set of edges is bimodal because it includes, besides directed bus links, undirected walk links. An undirected edge {i, j} belongs to E if the walk time in minutes between stops i and j is not greater than the limit_w parameter. The value of limit_w has a big influence on the number of network links (the density of the graph). The t_ij value for an undirected edge {i, j}, called a walk link, is equal to the walk time in minutes between stops i and j. We assume, for simplification, that the walk time is determined from the Euclidean distance between bus stops. It is very important to take walk links into consideration, because the public transportation network is very sparse in small cities and in peripheral districts of big cities, and walk links increase the chance of finding at least one route between bus stops located in such regions.
A graph representation of a public transportation network is shown in Fig. 1. It is a very simple example of a network which includes only nine bus stops. In the real world, the number of nodes is 1500 for a city with 250 thousand inhabitants.
Fig. 1. Representation of a simple transportation network (different styles of lines mark different bus links; solid lines mark walk links)
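The walk-link rule of the model can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the stop coordinates, the helper name build_walk_links and the walking speed WALK_SPEED_M_PER_MIN are hypothetical and do not come from the paper, which only states that walk time is derived from the Euclidean distance and bounded by limit_w.

```python
import math
from collections import defaultdict

# Assumption: a constant walking speed; the paper does not specify one.
WALK_SPEED_M_PER_MIN = 80.0


def build_walk_links(coords, limit_w):
    """Return {stop: {neighbour: walk_minutes}} for stop pairs within limit_w.

    coords maps a stop number to hypothetical (x, y) coordinates in metres.
    """
    walk = defaultdict(dict)
    stops = sorted(coords)
    for idx, i in enumerate(stops):
        for j in stops[idx + 1:]:
            minutes = math.dist(coords[i], coords[j]) / WALK_SPEED_M_PER_MIN
            if minutes <= limit_w:
                # Walk links are undirected, so both directions are stored.
                walk[i][j] = minutes
                walk[j][i] = minutes
    return dict(walk)


coords = {1: (0.0, 0.0), 2: (400.0, 0.0), 3: (0.0, 2400.0)}
walk_links = build_walk_links(coords, limit_w=15)
print(walk_links)  # {1: {2: 5.0}, 2: {1: 5.0}}
```

Raising limit_w in this toy example (for instance to 31 minutes) also connects stop 3, which mirrors the remark that limit_w controls the density of the graph.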
The formal definition of our problem is the following. At the input we have: the graph of the transportation network, timetables (times of departure for each stop and each line), the source point of the travel (s), the destination point of the travel (d), the starting time of the travel (time_s), the number of resulting paths (k), the maximum number of transfers (max_t) and the limit for walk links (limit_w). The max_t parameter is required only for the RGMA method. At the output we want to have the set of resulting routes, containing at most k quasi-optimal paths with minimal time of realization (in minutes) and with at most max_t transfers (an assumption important only for RGMA).
The t_ij values are marked in Fig. 1 only for walk links, because only this kind of link is time independent. Weights for bus links depend strongly on the starting time parameter and can be computed only while the algorithm is running. We can determine t_ij values for bus links using timetables. We assume, for simplification, that each bus line has the same timetable on each weekday.
Table 1. Timetables for each bus stop presented in Fig. 1
nr 1 2 3 4
Line Dash Dash-dot Dash Dot Solid Dash Dash-dot Dash Dash-dot
5:55 6:01 5:58 6:05 10 min 6:03 6:09 6:05 6:07
Times of departure 6:10 6:13 6:25 6:13 6:20 8:30 6:18 6:21 6:29 6:19
6:33 6:31
5 6 7 8 9
Dash Dot Dash-dot Dot Dash-dot Solid Solid Dash-dot Dot Solid Dash Dot
6:08 6:11 6:03 6:15 6:04 10 min 10 min 6:05 6:08 10 min 6:10 6:13
6:23 6:26 6:15 6:30 6:16
6:36 6:27 6:40 6:27
6:17 6:23
6:29 6:33
6:25 6:28
6:38
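To make the time dependence of bus-link weights concrete, the following Python sketch computes the weight of one bus link from a timetable. It rests on assumptions that the paper does not spell out: the weight is taken as waiting time plus ride time, the ride time ride_minutes is a hypothetical input, and the departure list is illustrative rather than a transcription of Table 1.

```python
from bisect import bisect_left


def to_minutes(hhmm):
    """Convert 'h:mm' to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def bus_link_weight(arrival, departures, ride_minutes):
    """Time-dependent weight of a bus link: waiting time plus ride time.

    arrival      -- minute of day at which the traveller reaches stop i
    departures   -- sorted departure minutes of one line from stop i
    ride_minutes -- hypothetical ride time from i to j on that line
    Returns None when no departure is left in the timetable.
    """
    pos = bisect_left(departures, arrival)  # first departure >= arrival
    if pos == len(departures):
        return None
    wait = departures[pos] - arrival
    return wait + ride_minutes


# Illustrative departures of one line from one stop.
departures = [to_minutes(t) for t in ("5:55", "6:10", "6:25")]
print(bus_link_weight(to_minutes("6:00"), departures, ride_minutes=3))  # 13
print(bus_link_weight(to_minutes("6:10"), departures, ride_minutes=3))  # 3
```

The same link therefore costs 13 minutes for a traveller arriving at 6:00 but only 3 minutes for one arriving at 6:10, which is why these weights cannot be fixed before the starting time is known.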
3 RGEAwK for Discovering Routes
The pseudocode of RGEAwK is presented below.

procedure RGEAwK(m,s,d,max_t,time_s,limit_w,P,ng,pr,k)
begin
  generate m routes from s to d with minimal number of transfers
    (not greater than max_t) starting at time time_s,
    with maximal walk links equal to limit_w;
  include m routes to an initial population of RGEAwK;
  generate P-m routes from s to d in a random way;
  compute fitness function F for each individual;
  for i:=1 to ng do
  begin
    choose with probability pr a genetic operator: crossover or mutation;
    if operator=crossover then
    begin
      choose two individuals according to the F value;
      cross parent individuals, if possible;
      compute fitness function F for new individuals;
      add offspring individuals to the population;
    end
    else
    begin
      randomly choose one individual;
      mutate the individual;
      compute fitness function F for mutated individual;
      add mutated individual to the population;
    end;
  end;
  choose k best routes from the final population;
end;

RGEAwK starts with a population of P solutions of the given problem. The initial population is generated in a special way: m individuals are computed as routes from bus stop s to bus stop d, starting at time time_s, with a minimal number of transfers not greater than max_t and with maximal walk links equal to limit_w [10]. Determining these routes is not a difficult task, since they are computed on a graph without time-dependent links (a network without timetables, represented by a graph without weights). The rest of the population is generated in a random way. The next step is to evaluate the individuals in the initial population by means of the fitness function F. The fitness function should estimate the quality of an individual according to the time of realization of the route.
After fitness evaluation, RGEAwK starts to improve the initial population through ng applications of crossover and mutation. In every generation we first choose with probability pr between crossover and mutation. In the case of crossover, we first select two parent individuals according to their fitness values: the better an individual is, the bigger the chance it has of being chosen. Since chromosome lengths differ, we use the heuristic crossover operator introduced in [9], adjusted to our problem. In the first step we test whether crossover can take place. If the two parents do not have at least one common bus stop, crossover cannot be done and the parents remain unchanged. Crossover is implemented in the following way. First we choose one common bus stop, which becomes the crossing point. Then we exchange the fragments of the routes from the crossing point to the end bus stop between the two parent individuals. After crossover, we must correct the offspring individuals in two ways. First we eliminate so-called "bus stop loops", then we eliminate so-called "line loops" [9]. The next step is to compute the fitness function for these new individuals.
Finally, the offspring individuals are added to the population; they do not replace their parents.
In the case of mutation, we first randomly choose one chromosome. The next step is to randomly select two bus stops from the route, denoted s1 and d1 (s1, d1 ≠ s, d). Then we generate m routes from s1 to d1 with a minimal number of transfers not greater than max_t and with maximal walk links equal to limit_w [10]. From these m routes we select the route with minimal time of realization. This best route replaces the fragment from s1 to d1 in the chromosome being mutated. Then we compute the fitness function for this individual and add it to the population.
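The two genetic operators described above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: a route is stored only as a list of stop numbers, so "line loops" are not modelled; the crossing point is restricted to inner stops (an assumption, since every route shares s and d); and best_subroute is a hypothetical callback standing in for the step that generates m routes between s1 and d1 and returns the fastest one.

```python
import random


def remove_stop_loops(route):
    """Cut out every fragment that returns to an already visited stop."""
    cleaned, position = [], {}
    for stop in route:
        if stop in position:
            keep = position[stop] + 1
            for dropped in cleaned[keep:]:
                del position[dropped]
            del cleaned[keep:]
        else:
            position[stop] = len(cleaned)
            cleaned.append(stop)
    return cleaned


def crossover(parent_a, parent_b, rng=random):
    """Swap route tails at a shared inner stop; return None if there is none.

    Both parents are assumed to be loop-free routes from s to d.
    """
    common = sorted(set(parent_a[1:-1]) & set(parent_b[1:-1]))
    if not common:
        return None  # crossover impossible, parents remain unchanged
    cut = rng.choice(common)
    ia, ib = parent_a.index(cut), parent_b.index(cut)
    child_a = parent_a[:ia] + parent_b[ib:]
    child_b = parent_b[:ib] + parent_a[ia:]
    return remove_stop_loops(child_a), remove_stop_loops(child_b)


def mutate(route, best_subroute, rng=random):
    """Replace the fragment between two random inner stops s1 and d1.

    best_subroute(s1, d1) is a hypothetical callback that returns the
    replacement fragment as a list starting at s1 and ending at d1.
    """
    inner = range(1, len(route) - 1)
    if len(inner) < 2:
        return list(route)  # fewer than two inner stops, nothing to mutate
    i, j = sorted(rng.sample(inner, 2))
    return route[:i] + best_subroute(route[i], route[j]) + route[j + 1:]


children = crossover([1, 2, 5, 6, 9], [1, 4, 5, 8, 9])
print(children)  # ([1, 2, 5, 8, 9], [1, 4, 5, 6, 9])
print(mutate([1, 2, 3, 9], lambda s1, d1: [s1, 7, d1]))  # [1, 2, 7, 3, 9]
```

Both operators return new lists and leave their arguments untouched, which matches the rule that offspring and mutated individuals are added to the population rather than replacing existing individuals.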
4 Experimental Results
A number of computer tests were conducted on real data of the transportation network of the city of Bialystok. This network consists of about 700 bus stops, connected by about 30 bus lines. The length of walk links was limited to limit_w = 15 minutes. The value of limit_w is very important because it influences the density of the network: the bigger the value of limit_w, the more walk links are possible in the network. The density of the network is of key importance for the time complexity of the algorithms.
The parameters of RGEAwK were: P = 200, m = 2, max_t = 5, pr = 0.5, k = 5 and ng = 500. We performed three kinds of tests. We examined routes from the centre of the city to the periphery (set C - P), routes from the periphery to the centre (set P - C) and routes from the periphery to the periphery (set P - P). Each of these sets includes routes which are difficult cases for both algorithms. The first difficulty is a long distance from s to d; the second is a sparse network in the neighbourhood of s or d. A sparse network can be a reason for a high approximation error for routes. Examples of routes from each set, generated by RGEAwK, are presented in Tab. 2, 3 and 4. The last four columns in each table give the time (-t) and the number of transfers (-tr) of the best route generated by RGMA and RGEAwK, respectively.
For routes of C - P type, RGEAwK has a lower time of realization than RGMA in 70% of the examined routes. The average difference is 12 minutes. Additionally, in 53% of the cases where RGEAwK was not worse than RGMA, the number of transfers in RGEAwK was at most one more than in RGMA. RGEAwK was never worse than RGMA with respect to time of realization.
For routes of P - C type, the tested routes show that RGEAwK has a lower time of realization than RGMA in 40% of the examined routes. The average difference is 21 minutes. Additionally, in 40% of the cases where RGEAwK was not worse than RGMA, the number of transfers in RGEAwK was at most two more than in RGMA. RGEAwK was never worse than RGMA with respect to time of realization.
For routes of P - P type, RGEAwK has a lower time of realization than RGMA in 40% of the examined routes. The average difference is 18 minutes.
Additionally, in 30% of the cases where RGEAwK was not worse than RGMA, the number of transfers in RGEAwK was at most three more than in RGMA (and three more in only one case). RGEAwK was never worse than RGMA with respect to time of realization.

Table 2. Examples of routes from set C - P; time_s = 15:30

nr  s                 d                 RGMA-t  RGEAwK-t  RGMA-tr  RGEAwK-tr
 1  Malmeda           Dojlidy G.        00:28   00:28     1        1
 2  Akademia Med.     Nowodworce        01:42   01:28     1        3
 3  Bazantarnia       Wiadukt           01:10   00:46     1        2
 4  Berlinga          Dojlidy G.        00:58   00:58     1        1
 5  Botaniczna        Chelmons.         00:45   00:45     1        1
 6  Branickiego       Dojlidy G.        01:25   00:28     0        1
 7  Hetmanska         Dojlidy G.        01:25   00:58     1        2
 8  Hotel Gromada     Mysliwska         01:24   00:48     0        1
 9  K.E.N./Kollataja  Nowodworce        01:42   01:42     2        2
10  Kino Pokój        Silikaty          00:31   00:31     1        1
11  Klub Rozrywki     Klepacze          00:53   00:53     1        1
12  Koscielna         Silikaty          00:31   00:31     1        1
13  Kosciol s. Woj.   Makro             00:49   00:49     1        1
14  Monte Cassino     Kleosin           00:54   00:54     1        1
15  Wierzbowa         Silikaty          01:46   01:16     1        2
Nr of winners                           9       15        9        15

Table 3. Examples of routes from set P - C; time_s = 7:30

nr  s                 d                 RGMA-t  RGEAwK-t  RGMA-tr  RGEAwK-tr
 1  Nowodworce        Malmeda           03:08   01:35     2        3
 2  Zagorki           Wierzbowa         02:17   01:42     2        3
 3  Silikaty          Akademia M.       01:31   00:51     1        2
 4  Ksiezyno          Czestochow.       01:16   00:52     1        2
 5  Wiadukt           Fabryczna         00:51   00:38     1        2
 6  Ksiezyno          Gajowa            01:07   01:07     1        1
 7  Baranowicka       Grottgera         00:35   00:35     2        2
 8  Ksiezyno          Hala Jagiell.     00:49   00:49     1        1
 9  Nowodworce        Hetmanska         01:37   01:32     2        1
10  Dojlidy G.        K.E.N./Kollataja  01:17   00:55     1        2
11  Produkcyjna       Kalinowskiego     00:26   00:26     1        1
12  Niemenska         Kino Pokoj        01:28   01:28     2        2
13  Dojlidy G.        Klub Rozrywki     01:44   01:03     1        2
14  Zagorki           Koleowa PKP       01:21   00:39     1        2
15  Dojlidy G.        Kopernika         00:36   00:36     1        0
Nr of winners                           5       15        5        15

Table 4. Examples of routes from set P - P; time_s = 15:30

nr  s                 d                 RGMA-t  RGEAwK-t  RGMA-tr  RGEAwK-tr
 1  Wiadukt           Silikaty          00:49   00:49     1        1
 2  Klepacze          Dziesieciny       00:49   00:49     1        1
 3  Nowodworce        Zielone Wzgorza   01:50   01:50     1        2
 4  Klepacze          Ksiezyno          02:00   00:58     1        4
 5  Gajowa            Chelmonsk.        00:57   00:57     1        1
 6  Chelmonsk.        Bialostocka       01:34   01:24     1        2
 7  Gajowa            Mysliwska         00:55   00:55     1        1
 8  Polmos            Wiadukt           00:55   00:46     1        2
 9  Gajowa            Nowodworce        01:42   01:42     1        1
10  Silikaty          Dubois            01:36   01:36     1        1
11  Wiadukt           Fabryka Dyw.      04:28   01:38     1        2
12  Nowodworce        Kleosin           01:17   01:17     2        2
13  Gajowa            Polmos            01:02   01:02     1        1
14  Silikaty          Produkcyjna       01:04   00:56     1        0
15  Polmos            Swobodna          01:09   00:51     1        2
Nr of winners                           9       15        9        15
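
The summary figures for set P - P can be recomputed from the time columns of Table 4 with a few lines of Python. The sketch assumes one particular reading of the "Nr of winners" row, namely that a route counts as a win for an algorithm whenever its time is not worse than the other algorithm's time, so ties count for both; this reading is an assumption, although it reproduces the row.

```python
def to_minutes(hhmm):
    """Convert 'hh:mm' to a number of minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


# Time columns of Table 4 (set P - P), routes 1..15.
RGMA_T = ("00:49 00:49 01:50 02:00 00:57 01:34 00:55 00:55 "
          "01:42 01:36 04:28 01:17 01:02 01:04 01:09").split()
RGEAWK_T = ("00:49 00:49 01:50 00:58 00:57 01:24 00:55 00:46 "
            "01:42 01:36 01:38 01:17 01:02 00:56 00:51").split()

# Positive gain: RGEAwK found a faster route than RGMA.
gains = [to_minutes(a) - to_minutes(b) for a, b in zip(RGMA_T, RGEAWK_T)]

rgma_winners = sum(g <= 0 for g in gains)    # RGMA not worse (ties included)
rgeawk_winners = sum(g >= 0 for g in gains)  # RGEAwK not worse (ties included)
improved_share = sum(g > 0 for g in gains) / len(gains)
mean_gain = sum(gains) / len(gains)

print(rgma_winners, rgeawk_winners)                  # 9 15
print(f"{improved_share:.0%}, {mean_gain:.1f} min")  # 40%, 18.5 min
```

Under this reading the 15 example routes give a strict improvement in 40% of cases and a mean gain of about 18 minutes over all routes, which agrees with the values quoted for the P - P set.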
The last experiment focused on comparing the time complexity of the algorithms. The results are presented in Fig. 2. In this experiment we tested examples of routes with a minimal number of bus stops between 5 and 30. The set of tested routes was generated by the BFS method on the basis of the structure of the network (the graph without weights) [7]. The minimal number of bus stops is marked on the x axis. The y axis shows the running time of the algorithm in milliseconds (Pentium 3.0 GHz processor). Precise analysis showed that RGEAwK is on average 3143 ms faster than RGMA.
Fig. 2. The comparison of time complexity of RGEAwK and RGMA
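The BFS step used to pick test routes by their minimal number of bus stops can be sketched as below. The adjacency dictionary and the function name min_stops_route are hypothetical; the sketch only shows a standard breadth-first search on the unweighted graph, not the exact procedure of [7].

```python
from collections import deque


def min_stops_route(adjacency, s, d):
    """Breadth-first search for a route from s to d with the fewest stops.

    adjacency maps a stop to the stops reachable over one link of the
    unweighted network graph. Returns None if d is unreachable from s.
    """
    parent = {s: None}
    queue = deque([s])
    while queue:
        stop = queue.popleft()
        if stop == d:
            route = []
            while stop is not None:  # walk back to s through the parents
                route.append(stop)
                stop = parent[stop]
            return route[::-1]
        for nxt in adjacency.get(stop, ()):
            if nxt not in parent:
                parent[nxt] = stop
                queue.append(nxt)
    return None


adjacency = {1: [2, 3], 2: [4], 3: [4], 4: [5]}
print(min_stops_route(adjacency, 1, 5))  # [1, 2, 4, 5]
```

The length of the returned list is the minimal number of bus stops on a route between the two points, which is the quantity plotted on the x axis of Fig. 2.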
5 Conclusions
The reason for developing a new evolutionary algorithm was the excessive time complexity and the insufficient quality of the routes generated by RGMA and RGEA [7], [9]. The authors’ motivation was to try to improve RGEA. The main improvement was incorporating knowledge about the transport network into the algorithm. Computer experiments have shown that the improved algorithm, RGEAwK, produces much better routes than RGMA and is significantly faster.
Future work will concentrate on testing RGEAwK on the transport networks of big metropolises such as Warsaw or Gornoslaski Okrag Przemyslowy (Silesian Industrial Region). If the tests show poor performance of RGEAwK, new heuristics must be added to the algorithm. One improvement which can be considered is to include in the algorithm information about the geographic location of the start and destination bus stops and about the topology of the network.
References
1. Rousseau, J.M.: Scheduling Regional Transportation with HASTUS. In: Proceedings of CASPT 2000, Berlin, electronically published (2000)
2. Serna Delgado, C.R., Bonrostro Pacheco, J.: MINMAX Vehicle Routing Problems: Application to School Transport in the Province of Burgos (Spain). In: Proceedings of CASPT 2000, Berlin, electronically published (2000)
3. Wu, Q., Hartley, J.K.: Accommodating User Preferences in the Optimization of Public Transport Travel. International Journal of Simulation Systems, Science and Technology: Applied Modeling and Simulation, Vol. 5, No. 3–4, pp. 12–25 (2004)
4. Tulp, T.: CVI: Builder of Dutch Public Transportation System, unpublished (1993)
5. Hansen, P.: Bicriterion path problems. In: Fandel, G., Gal, T. (eds.) Multiple criteria decision making: theory and applications. Lecture Notes in Economics and Mathematical Systems 177, Springer-Verlag, Heidelberg, pp. 236–245 (1980)
6. Safer, H.M., Orlin, J.B.: Fast approximation schemes for multicriteria combinatorial optimization. Technical Report No. 3756-95, Massachusetts Institute of Technology, Sloan School of Management, Cambridge (1995)
7. Koszelew, J.: Two methods of quasi-optimal routes generation in public transportation network. In: Proceedings of 7th International Conference on Computer Information Systems and Industrial Management Applications: CISIM 2008, IEEE Computer Society, pp. 231–236 (2008)
8. Koszelew, J.: Approximation method to route generation in public transportation network. Polish Journal of Environmental Studies, Vol. 17, No. 4C, pp. 418–422 (2008)
9. Koszelew, J., Piwonska, A.: A new genetic algorithm for optimal routes generation in public transport network. In: Proceedings of 13th International Conference on System Modelling Control: SMC2009, Lodz University of Technology (2009)
10. Koszelew, J.: The Theoretical Framework of Optimization of Public Transport Travel. In: Proceedings of 6th International Conference on Computer Information Systems and Industrial Management Applications: CISIM 2007, IEEE Computer Society, pp. 65–70 (2007)
Acknowledgments. This research was supported by S/WI/2/2008 and S/WI/3/2008.