A HYBRID OF TABU SEARCH AND INTEGER PROGRAMMING FOR SUBWAY CREW PAIRING OPTIMIZATION

Junha Hwang (1), Chang Sung Kang (1), Kwang Ryel Ryu (1), Yongho Han (2), Hyung Rim Choi (3)

(1) Department of Computer Engineering, Pusan National University, Pusan, Korea
(2) Department of Information Systems, Pusan University of Foreign Studies, Pusan, Korea
(3) Department of Management Information Systems, Dong-A University, Pusan, Korea
Abstract

Methods based on integer programming have been shown to be very effective in solving various crew pairing optimization problems. However, their applicability is limited to problems with linear constraints and objective functions. Moreover, these methods often require an unacceptable amount of time and/or memory for larger problems. Heuristic methods such as tabu search, on the other hand, can handle large-scale problems without much difficulty and can be applied to problems with any form of objective function and constraints. However, tabu search often gets stuck at local optima when faced with complex search spaces. This paper presents a hybrid algorithm of tabu search and integer programming that combines the advantages of both methods. The hybrid algorithm has been successfully tested on a large-scale crew pairing optimization problem for a real subway line.

Key Words: Planning and Scheduling, Crew Pairing Optimization, Tabu Search, Integer Programming
1. Introduction

The plan for urban subway operation is usually defined by a daily timetable specifying the trips to be serviced by a fleet of trains for a day. Each trip is divided into legs, which are defined as the minimum trip segments to be serviced by the same crew without rest. The type of a leg is determined by its beginning and ending stations. A pairing is a combination of legs which can be assigned to a crew as a daily duty. A feasible pairing should start and end at the same crew base (a crew base is also a station and there can be more than one crew base in a subway line) and must satisfy a number of constraints including company regulations and labor agreements. There can be a very large number of feasible pairings with different characteristics. The pairings having the same starting leg, for example, can have different succeeding legs by having different number and/or length of rest periods, thus resulting in different time span of duty. The characteristic of a pairing determines its cost.

The general approach to the crew scheduling problem is decomposed into two phases: pairing generation and pairing optimization [1]. In the pairing generation phase, a lot of feasible pairings are generated typically by using enumerative techniques. Then, in the pairing optimization phase, a selection is made of the best possible subset of all the generated pairings in such a way that all the legs are covered at minimum total cost. When there are many legs of various types, however, a huge number of pairings can be generated and thus the complexity of pairing optimization problem becomes very high.

The pairing optimization problem can be formulated as the following set covering problem in which all the m rows (legs) should be covered by a minimum cost subset of columns (pairings) selected from the generated n pairings.

\[
\begin{aligned}
\text{Minimize} \quad & \sum_{j=1}^{n} c_j x_j \\
\text{subject to} \quad & \sum_{j=1}^{n} a_{ij} x_j \ge 1, \quad i = 1, \dots, m \\
& \sum_{j=1}^{n} b_{ij} x_j \le v_i, \quad i = 1, \dots, p \\
& x_j \in \{0,1\}, \quad j = 1, \dots, n
\end{aligned}
\]

where $x_j = 1$ if pairing j is selected, and 0 otherwise. $c_j > 0$ is the associated cost of pairing j. $a_{ij} = 1$ if pairing j covers the ith leg (i.e. the ith leg is contained in pairing j), and 0 otherwise. The first inequality says that each leg must be covered by at least one pairing. The values of $b_{ij}$ and $v_i$ in the second inequality determine p linear constraints that must be satisfied by the selected pairings.

Previous works on pairing optimization problem employed solution techniques such as Lagrangian relaxation
[2, 3], genetic algorithms [4], and column generation [5, 6]. The first two methods are applicable when the number of pairings that can be generated is not too large. The column generation method is applicable to large-scale problems because it generates pairings dynamically as needed while the set covering problem is being solved. However, since column generation is based on integer programming [7], the computation often takes too long, and its applicability is limited to problems whose objective function and constraints are all linear.
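The set covering formulation given earlier in this section can be made concrete with a toy instance. The following is a minimal sketch only, not the method of any of the works cited above: it enumerates all subsets exhaustively, which is feasible only for a handful of pairings, and it omits the side constraints defined by b_ij and v_i. The leg sets, the costs, and the function name `solve_scp` are hypothetical.

```python
from itertools import combinations


def solve_scp(legs, pairings, costs):
    """Exhaustively find a minimum-cost set of pairings that covers every leg.

    pairings[j] is the set of legs covered by pairing j (a_ij = 1 iff i in pairings[j]).
    Returns (best_cost, tuple_of_selected_pairing_indices).
    """
    best_cost, best_subset = None, None
    n = len(pairings)
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            covered = set().union(*(pairings[j] for j in subset))
            if not legs <= covered:
                continue  # some leg is left uncovered: infeasible
            cost = sum(costs[j] for j in subset)
            if best_cost is None or cost < best_cost:
                best_cost, best_subset = cost, subset
    return best_cost, best_subset


# Hypothetical toy data: five legs, five candidate pairings.
legs = {1, 2, 3, 4, 5}
pairings = [{1, 2, 3}, {2, 4}, {3, 4, 5}, {1, 5}, {4, 5}]
costs = [3, 2, 3, 2, 2]

print(solve_scp(legs, pairings, costs))  # (5, (0, 4))
```

In this instance pairings 0 and 4 together cover all five legs at total cost 5, whereas pairings 0 and 2 also cover every leg but cost 6. The number of subsets doubles with each added pairing, which is why exhaustive enumeration is not an option at realistic scale.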
In this paper, we show that a hybrid of tabu search [8] and integer programming (IP) can be a powerful solution method for complex crew pairing optimization problems. One of the advantages of using tabu search is that for large-scale problems it is usually much faster than IP because the latter is basically an exhaustive search method. Besides, tabu search can be applied without any difficulty to problems having nonlinear objective functions and constraints. However, it is advantageous if we can exploit the global perspective of IP when the problem at hand is linear and its size is moderate. Our hybrid algorithm combines the advantages of both methods.

Our algorithm is based on a maximal covering problem formulation to be described in Section 3. It starts with a subset of a predetermined number of pairings selected from a big pairing pool by using a simple greedy method. Then, it follows the basic framework of tabu search to find a better pairing subset that covers more legs. Whenever the pairing subset cannot be improved for a certain number of iterations, IP is invoked to make a more diversified search. When the current subset is handed over to IP, some of the bad-looking pairings are removed from it. The task of IP is to refill the subset by finding in the pool the best possible replacements for the removed pairings. Note that the problem to be solved by IP is much smaller than the original maximal covering problem. Any nonlinear objective function components or constraints, if they exist, are dropped whenever IP is invoked. All the nonlinear aspects of the problem are taken care of only during the tabu search phase. Experiments show that the hybrid algorithm significantly outperforms either single method in both solution time and quality.

In Section 2, an example crew scheduling problem used for our empirical study is described. Then, an overview of our approach and a detailed explanation of our hybrid algorithm are given in Sections 3 and 4, respectively. Experimental results comparing the hybrid algorithm with others are shown in Section 5, and finally some conclusions are drawn in Section 6.

2. Example Crew Scheduling Problem

The problem we have experimented with in this research is the crew scheduling problem of a subway line in real operation. As shown in the timetable of train trips of Figure 1 (a), the trains move back and forth between the two stations A and B where the two crew bases are located. Although there are many stations between A and B in the original timetable, only C is shown in the figure because it is the only station (except A and B) where crews can take a break and switch to a different train. There are four types of legs in this problem, i.e., A → C, C → B, B → C, and C → A, and the total number of legs is 640.
[Figure 1. A fragment of timetable showing daily train trips (a), and an example pairing (b). Panel (a) plots trains 1 to 4 moving among stations A, C, and B against the hours 07 to 14.]

Figure 1 (b) shows an example pairing which consists of a sequence of legs taken from the timetable of train trips. There are a number of rules that should be followed by each pairing as a daily duty. A pairing must start and finish at the same crew base (A or B) and should cover ten legs, which makes the train switch at station C inevitable. A crew cannot operate the vehicle for longer than three
hours without a break for safety reasons, but the number of breaks (rest periods) allowed for a pairing is bounded by four because the crews prefer not to make too many train switches. We must find a set of pairings each of which satisfies these and many other constraints not mentioned here, so that all the legs in the timetable are covered. There are also constraints and objectives that must be followed by the pairings collectively. The number of pairings is fixed at 65, of which 38 belong to crew base A and the rest to crew base B. Since there are 640 legs and each pairing covers 10 legs, the 65 pairings contain 650 leg slots in total, ten more than the number of legs, so we allow some of the legs to be covered by more than one pairing. When multiple pairings cover the same leg, all of those pairings except a designated one are said to contain a deadhead or deadhead leg, for which the crew member does not work but is just transported as a passenger. One of the important objectives of this pairing optimization problem is to minimize the average work hour, which is defined as the time interval from the start of the first leg (start time) to the finish of the last leg (finish time) of a pairing. Another objective is to minimize the number of pairings having four breaks. Yet another important objective is related to the so-called FIFO (first-in-first-out) property. A pair of pairings i and j satisfies the FIFO property if the implication s_i < s_j → f_i < f_j holds, where s_i and f_i denote the start time and finish time of pairing i, respectively. In our problem, minimizing the violation of the FIFO property is considered one of the most important objectives. Note also that more FIFO violations lead to a larger variance of work hours.
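The two collective objectives defined above, average work hour and FIFO violation, are straightforward to evaluate for a given set of pairings. The following is a minimal sketch under stated assumptions: each pairing is reduced to a (start, finish) pair in minutes, the four duty times are invented for illustration, and the helper names `to_minutes`, `fifo_violations`, and `average_work_minutes` are hypothetical. The check applies the implication s_i < s_j → f_i < f_j literally, so equal finish times count as a violation.

```python
from itertools import combinations


def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def fifo_violations(pairings):
    """Count unordered pairs of pairings that break s_i < s_j -> f_i < f_j."""
    count = 0
    for (s1, f1), (s2, f2) in combinations(pairings, 2):
        # Check the implication in both directions for the unordered pair.
        if (s1 < s2 and not f1 < f2) or (s2 < s1 and not f2 < f1):
            count += 1
    return count


def average_work_minutes(pairings):
    """Average interval from the start of the first leg to the finish of the last."""
    return sum(f - s for s, f in pairings) / len(pairings)


# Hypothetical duties (start, finish); not taken from the real timetable.
duties = [
    (to_minutes("07:00"), to_minutes("17:30")),
    (to_minutes("07:20"), to_minutes("17:10")),  # starts later, finishes earlier
    (to_minutes("08:00"), to_minutes("18:00")),
    (to_minutes("08:30"), to_minutes("17:50")),  # starts later, finishes earlier
]

print(fifo_violations(duties))       # 2
print(average_work_minutes(duties))  # 595.0
```

The pairwise check is quadratic in the number of selected pairings, which is harmless for a solution of 65 pairings. Evaluating it on a candidate solution in this way is what a heuristic search needs, as opposed to encoding every potentially violating pair in advance.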
3. Overview of the Pairing Optimization Algorithm

As mentioned earlier, all the possible pairings must be generated before pairing optimization. For the problem described in the previous section, the pairings satisfying all the constraints can be generated relatively easily by an enumerative method. The problem is that there are simply too many feasible 10-leg pairings that can be extracted from the 640-leg timetable. Column generation might seem a viable way around this problem: it generates pairings dynamically as needed during the optimization and thus has the potential to search through all the possible pairings, even though the computation time required could be unacceptably long. Unfortunately, column generation is not an option for our problem because the FIFO property, which is an important factor of the objective function, cannot easily be represented in a linear form.

To derive an IP formulation for minimization of FIFO violation, it is convenient to consider the set of all violation pairs V = {(i, j) | s_i < s_j and f_j < f_i, 1 ≤ i, j ≤ n}, where n is the number of all possible pairings and s_i and f_i denote
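the start time and finish time of pairing i, respectively. Then, the formulation can be given as follows.

\[
\begin{aligned}
\text{Minimize} \quad & \sum_{(i,j) \in V} z_{ij} \\
\text{subject to} \quad & x_i + x_j - 1 \le z_{ij}, \quad (i,j) \in V \\
& x_i \in \{0,1\}, \quad i = 1, \dots, n \\
& z_{ij} \in \{0,1\}, \quad (i,j) \in V
\end{aligned}
\]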
where x_i = 1 if pairing i is selected, and 0 otherwise. The inequality forces z_ij to take the value 1 when the violating pairings i and j are both selected. In all other cases, z_ij takes the value 0 so that the summation is minimized. This formulation, however, cannot be implemented in practice because the number of variables, or the cardinality of the set V, is huge. One might suspect that minimization of FIFO violation can be achieved indirectly by constraining the work hours of pairings to lie within a narrower range at the pairing generation phase. In reality, however, more pairings are needed to cover all the legs as we work with a more constrained set of pairings.

The decision to adopt a heuristic search, or more specifically a hybrid of tabu search and IP, as our main engine to solve the pairing optimization problem is based on the fact that tabu search allows virtually any form of constraints or objective functions. Unlike the column generation technique, however, tabu search cannot investigate all the possible pairings of a large-scale problem. What we can do is provide a pool of selected pairings whose size is not too big for the tabu search to handle. An inevitable consequence of using a limited pool of pairings is a sacrifice of solution quality; the result of pairing optimization is quite likely to contain more pairings than allowed. Therefore, some of the pairings should be removed from the resultant solution to keep the number of pairings at the required number, which leaves some of the legs uncovered by any pairing. In our algorithm the uncovered legs, or what we call the blanks, are taken care of by a heuristic repair method to be introduced shortly.
Instead of adopting the set covering problem (SCP) model for pairing optimization and then removing some of the pairings to meet the requirement on the number of pairings, we adopt the following maximal covering problem (MCP) model to find a subset of pairings of the required number that can cover as many legs as possible. Let m, n, and d be the number of legs, the number of all the possible pairings, and the required number of pairings to be selected, respectively. Let the variable x_j and the constants a_ij, b_ij, and v_i be defined in the same way as in the SCP formulation given in Section 1. The MCP formulation can be given as follows.

\[
\begin{aligned}
\text{Minimize} \quad & \sum_{i=1}^{m} y_i \\
\text{subject to} \quad & \sum_{j=1}^{n} x_j = d \\
& \sum_{j=1}^{n} a_{ij} x_j + y_i \ge 1, \quad i = 1, \dots, m \\
& \sum_{j=1}^{n} b_{ij} x_j \le v_i, \quad i = 1, \dots, p \\
& x_j \in \{0,1\}, \quad j = 1, \dots, n \\
& y_i \in \{0,1\}, \quad i = 1, \dots, m
\end{aligned}
\]

The first constraint imposes the required number d of pairings to be selected. The second constraint forces the variable y_i to take the value 1 when the ith leg is a blank. When the ith leg is not a blank, y_i takes the value 0 so that the summation is minimized. Our experience reveals that application of the MCP model leaves far fewer blanks than the SCP model followed by pairing removal.

To take care of the blanks remaining after a subset of pairings of the given number has been chosen by applying the MCP model, some of the pairings in the subset are either modified or replaced by a heuristic repair algorithm. Figure 2 illustrates the basic idea of a repair operation by modification. In the figure, a blank leg L2 is repaired by having pairing P give up the leg L1 and cover L2 instead to become pairing P′. Note that in pairing P′ the new leg L2 does not interfere with any of the preceding or following legs. If L1 happens to be a deadhead leg, repair of L2 is complete. Otherwise, the same process continues by modifying a series of pairings until a deadhead leg is encountered. Given a blank leg such as L2, there can be more than one pairing that can be modified for blank repair. A breadth-first search or an iterative deepening search can be used to find a shortest path from the blank to a deadhead. A few variations of this operation and other heuristics including pairing replacement are used in the heuristic repair algorithm. For further details of the algorithm please refer to [9].

Heuristic repair has the positive effect of introducing into the pairing subset new pairings which may not be included in the pairing pool generated initially. However, heuristic repair also deteriorates the overall quality of the pairings. In Figure 2, for example, the pairing P′ is worse (more costly) than P because the number of breaks increased from two to three. Therefore, it is necessary that the pairing subset before heuristic repair, although incomplete, should contain pairings of good enough quality.

4. Hybridization of Tabu Search and Integer Programming

After a pool of selected pairings is generated, we apply a hybrid algorithm of tabu search and IP to find a subset of pairings that maximally covers the legs. Figure 3 shows the algorithm in the form of pseudo-code. An initial solution, namely a subset of a predetermined number of pairings, is generated by selecting the pairings one by one from the pool by using a simple greedy heuristic [10]. The greedy heuristic selects the pairing that covers the largest number of still-uncovered legs. Then for tabu search, neighborhood solutions are generated by deleting a certain number of pairings from the current solution and then adding back the same number of new pairings selected from the pool. The deletion of a pairing is made probabilistically by measuring how critical the pairing is in the current solution. The criticality of a pairing is measured by counting the number of legs which are solely covered by the pairing. The less critical a pairing is, the more likely it is to be deleted. Addition of pairings is done by the same greedy heuristic used for generating the initial solution.
[Figure 2. Repair of a blank leg by modification of a pairing: pairing P, the modified pairing P′, and the legs L1 and L2 on the timetable.]

The neighborhood set Sk denotes a set of neighborhood solutions generated by replacing k pairings. In our implementation the neighborhood is defined as S1 ∪ S2 ∪ ··· ∪ Sk, where the best k has been empirically found to be 5. The appropriate size of each of the neighborhood sets has also been determined empirically. The best candidate solution among the neighbors is the one resulting in the least number of blanks. If there occurs a tie in the number of blanks, the one with the best objective function value is selected. Note that the objective function checks, among others, how well the FIFO property is satisfied by a candidate solution.
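The neighborhood generation and candidate selection described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the deletion weight 1 / (1 + criticality) is an assumed choice (the text only says that less critical pairings are more likely to be deleted), the tabu list is omitted, the `objective` argument stands in for the FIFO-aware objective function, and the toy pool and all function names are hypothetical.

```python
import random


def blanks(solution, pool, legs):
    """Number of legs left uncovered by the selected pairings."""
    covered = set()
    for j in solution:
        covered |= pool[j]
    return len(legs - covered)


def criticality(j, solution, pool):
    """Number of legs covered by pairing j and by no other selected pairing."""
    others = set()
    for k in solution:
        if k != j:
            others |= pool[k]
    return len(pool[j] - others)


def greedy_add(solution, pool, legs, count):
    """Add `count` pairings, each time the one covering most still-uncovered legs."""
    solution = set(solution)
    for _ in range(count):
        covered = set()
        for j in solution:
            covered |= pool[j]
        candidates = [j for j in range(len(pool)) if j not in solution]
        best = max(candidates, key=lambda j: len((pool[j] & legs) - covered))
        solution.add(best)
    return solution


def neighbor(solution, pool, legs, k, rng):
    """Delete k pairings (less critical ones more likely), then refill greedily."""
    remaining = sorted(solution)
    for _ in range(k):
        # Assumed weighting: probability falls as criticality rises.
        weights = [1.0 / (1 + criticality(j, remaining, pool)) for j in remaining]
        victim = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(victim)
    return greedy_add(remaining, pool, legs, k)


def best_candidate(candidates, pool, legs, objective):
    """Fewest blanks first; ties are broken by the objective value (lower is better)."""
    return min(candidates, key=lambda s: (blanks(s, pool, legs), objective(s)))


# Hypothetical toy pool: six legs, eight pairings.
legs = {1, 2, 3, 4, 5, 6}
pool = [{1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 6}, {1, 6}, {1, 2, 3}, {4, 5, 6}]

initial = greedy_add(set(), pool, legs, 2)
print(initial, blanks(initial, pool, legs))
print(neighbor(initial, pool, legs, 1, random.Random(0)))
```

Because nothing here forbids re-adding the pairing that was just deleted, a neighbor may coincide with the current solution; in a full tabu search the tabu structure is what rules such moves out.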
Find an initial solution x
Define tabu structure
count = 0
repeat until stopping condition is met
    Generate neighborhood sets of x: S1, S2, ..., Sk
    Select the best non-tabu solution x′ from S1 ∪ S2 ∪ ··· ∪ Sk
    x ← x′
    if x is better than the current best-so-far solution then
        count = 0
    else
        count = count + 1
    if count > threshold then
        Update x by calling Integer Programming
        count = 0

Figure 3. Pseudo-code for the hybrid algorithm.

When a new best solution cannot be found for longer than a predetermined number of iterations (i.e., count > threshold), the search switches to the IP phase. Normal tabu search in this situation usually calls for intensification or diversification strategies to get out of a local optimum. Typical intensification and diversification strategies keep memory structures that store a rather long history of recent search activities and use them to guide future search directions. In our algorithm, IP is employed as a diversification strategy. To make a somewhat radical change to the current solution, a large number of pairings are deleted from it probabilistically based on the criticality measure as before. Then, IP is invoked to find the best possible new pairings from the pool in such a way that the updated solution after replacement maximally covers the legs. The number of pairings to be replaced increases every time IP is invoked because otherwise the solution quality hardly improves. In our implementation 20 replacements are made initially and then the number is increased by 3 at every invocation thereafter.

Experiments show that IP performs much better than other memory-based diversification strategies for our pairing optimization problem. It is very effective in reducing the blanks. Moreover, the solution updated by IP provides a new context in which even further improvement can be made by the subsequent tabu search phase. Although IP works on reduced problems by limiting the number of replacements to around 20, it could still take an unacceptably long time because it eventually exhausts the search space to find a true optimal solution. Therefore, we impose a time bound on IP and take whatever solution has been found by that time. IP is a very powerful search method that works only on problems with linear constraints and objective functions. The FIFO property requirement thus cannot be enforced during the IP phase. However, this is not a problem because it is taken care of by the subsequent tabu search phase.

5. Experimental Results
The hybrid algorithm of the previous section has been implemented and tested on the problem described in Section 2. All the experiments were done on a Pentium-III 800 machine. The goal of the problem is to find a subset of pairings of size 65 to cover 640 legs. The number of all possible pairings is roughly estimated to be more than three billion, of which about 180 thousand have two breaks, about 50 million have three breaks, and the rest have four. Obviously, we cannot work with a pool of all these pairings. It has been found advantageous to work with the pool of two-break pairings only. When the solution obtained by the hybrid algorithm using the MCP model contains only two-break pairings, heuristic repair operations can be applied easily with maximum flexibility. Recall that heuristic repair typically increases the number of breaks of the pairings, which is not allowed to exceed four. If many pairings in a solution already have three or four breaks, it becomes hard to apply repair operations to such a solution.

In our first experiment, we wanted to see the solution quality that can be obtained by IP alone. We also compared the results of an SCP model (IP/SCP) with those of an MCP model (IP/MCP). Given a CPU time bound of two hours, the best result, with 24 blanks, was achieved by IP/MCP with a pool of 20,000 pairings (a subset of the 180,000 two-break pairings). The best result by IP/SCP (more accurately, IP/SCP followed by some pairing removal) was obtained when the pool size was 15,000, and the number of blanks was 47. IP had difficulty in finding a good solution when given a pool bigger than 20,000 pairings, perhaps because of the limited time and space resources.

Table 1 summarizes the results of ten runs of the pure tabu search (TS) and the hybrid algorithm of tabu search and IP (HYBRID). Both algorithms implemented the MCP model with the number of pairings in the covering set fixed at 65. Each run was given two hours of CPU time and the pool of 180,000 two-break pairings was used. We can see a significant reduction in the number of blanks by the hybrid algorithm. Note that the best result by IP/MCP had 24 blanks, which is not surprising because the pool size used was too small. In the experiments with the hybrid algorithm, the average number of IP invocations was around four and each invocation was given a CPU time bound of 20 minutes.
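As a rough illustration of the replacement schedule described in Section 4, assume that a run makes exactly four IP invocations and that each one uses its full time bound (the reported figure is only an average of about four). Writing r_t for the number of pairings replaced at the t-th invocation, a symbol introduced here for convenience, the schedule and the IP share of the time budget would be:

```latex
\[
r_t = 20 + 3\,(t - 1), \qquad (r_1, r_2, r_3, r_4) = (20, 23, 26, 29),
\]
\[
4 \times 20\ \text{min} = 80\ \text{min} \;\le\; 120\ \text{min}.
\]
```

Under these assumptions IP could account for up to two thirds of the two-hour budget, with tabu search using the remainder.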
Table 1. Performance comparison of the pure tabu search and the hybrid algorithm.

Number of blanks    TS      HYBRID
Maximum             16      12
Average             14.8    11.2
Minimum             14      10
Table 2 compares the results of applying heuristic repair to the maximal covering solutions obtained by IP, the pure tabu search (TS), and the hybrid algorithm (HYBRID). As seen from the bottom row of the table, it took only 38 minutes to repair the 10 blanks left by HYBRID, while it took 1 hour and 24 minutes and 16 hours and 42 minutes to repair the 14 and 24 blanks left by TS and IP, respectively. That is roughly 3.8, 6, and 42 minutes per blank, so the repair cost grows much faster than the number of blanks. We can see how important it is to remove as many blanks as possible at the maximal covering phase, and the hybrid algorithm is a definite winner. The quality of the final solution produced by the hybrid algorithm followed by heuristic repair is almost the same as or slightly better than that by a human expert. It took about three months for the expert to come up with a satisfactory schedule. For reference, the average work hour of the expert's schedule was 10 hours and 34 minutes with three FIFO violations. The numbers of pairings with 2, 3, and 4 breaks were 33, 26, and 6, respectively.
Table 2. Results of heuristic repair after the maximal covering phase by IP, the pure tabu search, and the hybrid algorithm.

                                       IP         TS         HYBRID
Initial blanks                         24         14         10
Avg. work hour                         10h 33m    10h 38m    10h 34m
Pairings with 2, 3, and 4 breaks       33:24:8    37:21:7    34:27:4
FIFO violations                        5          4          4
CPU time                               16h 42m    1h 24m     38m
6. Conclusions

In this paper we presented a hybrid algorithm of tabu search and IP that can solve complex maximal covering problems efficiently. The algorithm has been implemented and successfully tested on a crew pairing optimization problem for a real subway line. The tabu search component of our hybrid allows a much bigger pool of pairings to be used than pure IP does, thus significantly increasing the chance of finding quality solutions. An additional advantage of using tabu search is that it can easily handle nonlinear constraints or objective functions. IP is employed within the tabu search framework as a powerful diversification strategy that comes into play when a new best solution cannot be found for many iterations. IP within the hybrid does not face any difficulty in handling a big pool of pairings because it only solves much reduced subproblems of the original pairing optimization problem. Experimental results confirmed that the hybrid algorithm, by combining the advantages of tabu search and IP, performs much better than either single method.
References

[1] L. Bodin, B. Golden, A. Assad, and M. Ball, Routing and scheduling of vehicles and crews: the state of the art, Computers and Operations Research, 10, 1983, 63-211.
[2] A. Caprara, M. Fischetti, P.L. Guida, P. Toth, and D. Vigo, Solution of large-scale railway crew planning problems: The Italian experience, Technical Report OR-97-9, DEIS, University of Bologna, 1997.
[3] S. Ceria, P. Nobili, and A. Sassano, A Lagrangian-based heuristic for large-scale set covering problems, Mathematical Programming, 81, 1998, 215-228.
[4] J.E. Beasley and P.C. Chu, A genetic algorithm for the set covering problem, European Journal of Operational Research, 94, 1996, 392-404.
[5] S. Lavoie, M. Minoux, and E. Odier, A new approach for crew pairing problems by column generation with an application to air transportation, European Journal of Operational Research, 35, 1988, 45-58.
[6] C. Barnhart, E.L. Johnson, G.L. Nemhauser, M.W.P. Savelsbergh, and P.H. Vance, Branch and price: Column generation for huge integer programs, Operations Research, 46, 1998, 316-329.
[7] L.A. Wolsey, Integer programming (Wiley, 1998).
[8] F. Glover and M. Laguna, Tabu search (Kluwer Academic Publishers, 1997).
[9] C.H. Park, Heuristic repair methods for improving subway crew schedules, Master's Thesis, Department of Computer Engineering, Pusan National University, 2000.
[10] R. Church and C. ReVelle, The maximal covering location problem, Papers of the Regional Science Association, 32, 1974, 101-118.