PRE-PROCESSING METHOD WITH SURROGATE CONSTRAINT ALGORITHM FOR THE SET COVERING PROBLEM

Yaquan Xu, Virginia State University, Petersburg, VA 23806
[email protected] 804-524-5346
Gary Kochenberger, University of Colorado, Denver, CO 80202
[email protected] 303-556-5864
Haibo Wang, Texas A&M International University, Laredo, TX 78041
[email protected] 956-326-2503

ABSTRACT
In this article, we present a pre-processing method with a surrogate constraint algorithm for non-unicost set covering problems. Computational results, based on problems involving up to 400 rows and 4000 columns, indicate that the enhanced algorithm produces better-quality results than other heuristic algorithms.

Keywords: Pre-processing, Set-covering, Surrogate Constraint
INTRODUCTION

The set covering problem (SCP) is a fundamental combinatorial problem in Operations Research. It is usually described as the problem of covering the rows of an m-row, n-column, zero-one matrix $(a_{ij})$ by a subset of the columns at minimum cost, and can be written as a binary integer program as follows:

Minimize
$$\sum_{j=1}^{n} c_j x_j \qquad (1)$$
Subject to
$$\sum_{j=1}^{n} a_{ij} x_j \ge 1, \quad i = 1, 2, \ldots, m, \qquad (2)$$
$$x_j \in \{0, 1\}, \quad j = 1, 2, \ldots, n. \qquad (3)$$
Constraint (2) states that each row is covered by at least one column. The variable $x_j$ equals 1 if column j is in the solution, and 0 otherwise. If the costs $c_j$ are equal for all $j \in J = \{1, 2, \ldots, n\}$, the problem is referred to as the unicost SCP; otherwise, it is called the weighted or non-unicost SCP.
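To make the formulation concrete, the following minimal Python sketch evaluates the objective (1) and checks the covering constraints (2) for a given 0-1 vector. The function names `is_cover` and `cover_cost` and the small 3-row, 4-column instance are hypothetical illustrations and do not come from the test problems used in this paper.

```python
def is_cover(a, x):
    """Constraint (2): every row i needs sum_j a[i][j] * x[j] >= 1."""
    return all(sum(a_ij * x_j for a_ij, x_j in zip(row, x)) >= 1 for row in a)


def cover_cost(c, x):
    """Objective (1): total cost of the selected columns."""
    return sum(c_j * x_j for c_j, x_j in zip(c, x))


# Hypothetical instance (assumption, for illustration only).
a = [
    [1, 0, 1, 0],  # row 1 is covered by columns 1 and 3
    [0, 1, 1, 0],  # row 2 is covered by columns 2 and 3
    [0, 0, 0, 1],  # row 3 is covered by column 4 only
]
c = [2, 3, 4, 1]   # non-unicost: the column costs differ
x = [0, 0, 1, 1]   # select columns 3 and 4

print(is_cover(a, x), cover_cost(c, x))  # prints: True 5
```

Selecting columns 1 and 2 instead would leave row 3 uncovered, so that vector violates constraint (2) even though its cost is the same.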
The SCP is important in practice. It has applications in airline crew scheduling [1], [2], [3], [4], design of switching circuits [5], assembly line balancing [6], truck deliveries [7], market basket analysis [8], and information retrieval [9]. Both the unicost and non-unicost SCP have been proven to be NP-complete (see [10]) and are therefore considered difficult to solve. A number of exact and heuristic algorithms have been developed for solving the unicost and non-unicost SCP. In this paper, we present a surrogate constraint algorithm to solve the non-unicost SCP. The main difference between our algorithm and previous heuristic approaches is the use of pre-processing for problem reduction. The pre-processing method allows a better exploration of the data space and provides a relatively small data input for the heuristic search.

After reviewing the available literature on SCP algorithms, we describe the pre-processing method and discuss the surrogate constraint algorithm in Section 3. The implemented algorithm is presented along with computational experience on OR-Library instances [11] in Section 4. A brief summary and conclusion are given in Section 5.

LITERATURE REVIEW
The methods for solving the SCP can be divided into two main categories: exact and heuristic approaches. Exact solutions can be obtained by, e.g., branch-and-bound and branch-and-cut approaches for modestly sized problems. Harche and Thompson developed an exact algorithm, called column subtraction, which is capable of solving large sparse instances of set covering problems. The algorithm is based on tree-search procedures [12]. For larger problems, various heuristic methods have been used because they can find a good or near-optimal solution in a reasonable time. Beasley [13] combined a Lagrangian heuristic, feasible exclusion constraints, Gomory f-cuts and an improved branching strategy to enhance his previous algorithm [14] and solved problems with up to 400 rows and 4000 columns. Balas and Carrera proposed an algorithm based on Lagrangian relaxation and subgradient optimization [15]. To improve solution quality, modern heuristics, such as genetic algorithms and neural networks, have also been used for solving the SCP. These heuristics are often classified as meta-heuristics.
SURROGATE CONSTRAINT FOR THE SCP

A surrogate constraint is an inequality implied by the constraints of an integer program. The idea is to use an appropriate linear combination of the original constraints to create a single surrogate constraint. The purpose of creating a surrogate constraint is to capture useful information that cannot be extracted from the parent constraints individually and to obtain bounding information [16], [17]. For the SCP, because of the special structure of the formulation, we can create a so-called simple-sum surrogate constraint, which results from summing the original inequalities without modification: the right-hand side of the surrogate constraint equals the number of original inequality constraints, and each coefficient equals the sum of the unit coefficients of that variable across the original constraints [18]. The surrogate relaxation of the SCP is given by [17, 18]

Minimize
$$cx + \lambda (e - Ax) \qquad (4)$$
Subject to
$$w(Ax) \ge we, \qquad (5)$$
$$x \in \{0, 1\}^n, \qquad (6)$$

where e is the vector of ones, $\lambda$ is a vector of multipliers on the covering constraints, and w is the vector of weights associated with the original constraints, all equal to 1. When the normalized surrogate constraint is defined, a solution can be obtained from the greedy binary knapsack heuristic algorithm as follows:
Step 0: Initialization. Start with the all-free initial partial solution $x^{(0)} = (\#, \ldots, \#)$ and set the solution index $t \leftarrow 0$.
Step 1: Stopping. If all components of the current solution $x^{(t)}$ are fixed, stop and output $\hat{x} \leftarrow x^{(t)}$ as an approximate optimum.
Step 2: Step. Choose a free component $x_r$ of the partial solution $x^{(t)}$ and a value for it (here $x_r = 1$) that leads to a good feasible objective value. Then advance to the partial solution $x^{(t+1)}$, identical to $x^{(t)}$ except that $x_r$ is fixed at 1.
Step 3: Increment. Set $t \leftarrow t + 1$ and return to Step 1.

PREPROCESSING
Preprocessing is a popular method for speeding up an algorithm. The idea is to reduce the size of an SCP instance by removing redundant columns and rows. Garfinkel [4] proposed four reduction rules as follows:

Reduction 1: If row $r_i$ is a null vector for some i, there is no feasible solution, since the ith constraint cannot be satisfied.
Reduction 2: If $r_i = e_k$ (the kth unit vector) for some i, k, then $x_k = 1$ in every feasible solution, and column $a_k$ may be deleted together with the rows it covers.
Reduction 3: If $r_t \ge r_p$ for some t and p, then row $r_t$ may be deleted.
Reduction 4: If, for some set of columns S and some column k, $\sum_{j \in S} a_j = a_k$ and $\sum_{j \in S} c_j \le c_k$, column k may be deleted.
Reduction 4a: If, for some set of columns S and some column k, $\sum_{j \in S} a_j > a_k$ and $\sum_{j \in S} c_j \le c_k$, column k may be deleted.
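The rules above can be sketched in Python as follows. This is an illustrative simplification, not the FORTRAN implementation used for the experiments: Reductions 4 and 4a are restricted to the case where S is a single column (or empty), because enumerating all column sets S is exponential. The function name `reduce_scp` and the five-row instance are hypothetical.

```python
def reduce_scp(rows, cost):
    """Apply Reductions 1-3 and single-column Reduction 4/4a until stable.

    rows: dict row id -> set of columns covering that row
    cost: dict column id -> cost
    Returns (remaining rows, remaining costs, columns fixed at 1).
    """
    rows = {i: set(r) for i, r in rows.items()}
    cost = dict(cost)
    fixed = set()
    changed = True
    while changed:
        changed = False
        # Reduction 1: a null row can never be covered.
        if any(not r for r in rows.values()):
            raise ValueError("infeasible: some row has no covering column")
        # Reduction 2: a single-column row forces that column to 1;
        # the column and every row it covers leave the problem.
        for i, r in list(rows.items()):
            if i in rows and len(r) == 1:
                (k,) = r
                fixed.add(k)
                rows = {p: q for p, q in rows.items() if k not in q}
                del cost[k]
                changed = True
        # Reduction 3: row t is redundant if it contains another row p.
        for t in sorted(rows):
            if any(p != t and rows[p] <= rows[t] for p in rows):
                del rows[t]
                changed = True
        # Reduction 4/4a with |S| <= 1 (assumption): drop column k if it covers
        # no remaining row, or if one column j covers all of k's rows at a
        # cost no greater than that of k.
        cover = {j: {i for i, r in rows.items() if j in r} for j in cost}
        for k in sorted(cover):
            if not cover[k] or any(
                j != k and cover[k] <= cover[j] and cost[j] <= cost[k]
                for j in cover
            ):
                del cover[k], cost[k]
                for r in rows.values():
                    r.discard(k)
                changed = True
    return rows, cost, fixed


# Hypothetical instance (assumption, for illustration only).
demo_rows = {1: {1, 2}, 2: {2, 3}, 3: {4}, 4: {1, 2, 4}, 5: {3, 5}}
demo_cost = {1: 3, 2: 2, 3: 4, 4: 1, 5: 6}

left_rows, left_cost, fixed = reduce_scp(demo_rows, demo_cost)
print(left_rows, left_cost, sorted(fixed))  # prints: {} {} [2, 3, 4]
```

On this small instance the reductions alone fix columns 2, 3 and 4 and leave no rows, so no search is needed; on larger instances a reduced problem normally remains and is passed to the heuristic.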
In this research, all four reduction rules are used. To illustrate the pre-processing reduction method, we present a small example as follows:

Min 10x1 + 5x2 + 8x3 + 6x4 + 9x5 + 13x6 + 11x7 + 4x8 + 6x9
s.t.
x1 + x2 + x3 + x5 + x7 + x8 ≥ 1
x2 + x3 + x8 ≥ 1
x2 + x5 + x6 + x8 + x9 ≥ 1
x4 ≥ 1
x1 + x3 + x5 + x6 + x9 ≥ 1
x2 + x3 + x7 + x9 ≥ 1
x1 + x4 + x5 + x8 + x9 ≥ 1
xj ∈ {0,1}, j ∈ {1,2,3,4,5,6,7,8,9}
Applying the above four reduction rules repeatedly until no more rows or columns can be removed, the original problem is converted to the following:

Min 8x3 + 6 + 4x8 + 6x9
s.t.
x3 + x8 ≥ 1
x8 + x9 ≥ 1
x3 + x9 ≥ 1
xj ∈ {0,1}, j ∈ {3,8,9}
The example shows that preprocessing can substantially reduce the effort required to solve set covering problems, and that repeated use of this procedure can substantially reduce the time required to solve the integer problem. Table 1 lists the original size of each test problem, together with the number of rows and columns remaining after applying the pre-processing method.

Table 1: Problem sizes after pre-processing

             Original Problem Size                After Pre-Processing
Data Set     # of Constraints   # of Variables    # of Constraints   # of Variables
Name
SCP44        200                1000              188                197
SCP46        200                1000              197                227
SCP48        200                1000              188                217
SCP52        200                2000              194                262
SCP54        200                2000              189                221
SCP59        200                2000              194                234
SCP62        200                1000              200                276
SCP64        200                1000              200                224
SCP65        200                1000              199                278
SCPa2        300                3000              300                411
SCPa4        300                3000              300                394
SCPa5        300                3000              300                406
SCPb2        300                3000              300                572
SCPb4        300                3000              300                576
SCPb5        300                3000              300                571
SCPc2        400                4000              400                578
SCPc4        400                4000              400                579
SCPc5        400                4000              400                572
SCPd2        400                4000              400                914
SCPd4        400                4000              400                859
SCPd5        400                4000              400                853
SCPe2        50                 500               50                 500
SCPe4        50                 500               50                 500
SCPe5        50                 500               50                 500
COMPUTATIONAL RESULTS
The pre-processing method and surrogate constraint algorithm presented in this paper were programmed in FORTRAN, compiled using Visual FORTRAN, and run on a 3.20 GHz Pentium 4 PC equipped with 3 GB of RAM running Microsoft Windows XP Professional. We provide some preliminary computational experience along with comparisons with an exact method, the CPLEX branch-and-cut method. The solutions from both methods are listed in Table 2.

Table 2: Results for problems

             CPLEX                       Surrogate Constraint with Pre-processing
Data Set     Time (sec.)   Solution      Time (sec.)   Solution
Name
SCP44        2.574         557           2.184         509
SCP46        2.590         621           2.278         577
SCP48        2.480         520           2.200         509
SCP52        4.852         329           4.477         322
SCP54        5.008         255           4.508         248
SCP59        4.618         301           4.228         290
SCP62        1.482         157           1.451         156
SCP64        1.529         137           1.342         133
SCP65        1.529         183           1.404         186
SCPa2        12.137        276           11.045        266
SCPa4        12.308        255           11.060        244
SCPa5        12.168        250           11.794        240
SCPb2        6.490         80            6.505         76
SCPb4        6.911         85            6.396         83
SCPb5        6.287         73            6.209         75
SCPc2        24.459        237           22.433        225
SCPc4        24.196        237           22.511        237
SCPc5        23.291        228           22.043        219
SCPd2        12.262        72            11.731        68
SCPd4        12.714        66            12.636        63
SCPd5        12.667        66            12.168        62
SCPe2        0.031         5             0.031         5
SCPe4        0.031         5             0.031         5
SCPe5        0.016         5             0.016         5
Comparing the test results in Table 2, note that our surrogate constraint algorithm with the pre-processing method has equal or superior performance to the exact method on the majority of these test problems. It is noteworthy that the time required by our algorithm is less than the time required by the traditional exact method on most instances. This shows that the pre-processing method combined with the surrogate constraint algorithm is competitive with the exact algorithms for the SCP presented in the literature, and that their performance can be appreciably improved by an external preprocessing procedure.

REFERENCES
[1] Marsten, R.E., Muller, M.R., and Killion, C.L., Crew Planning at Flying Tiger: A Successful Application of Integer Programming. Management Science, 1979. 25: p. 1175-1183.
[2] Hoffman, K.L., Padberg, M., Solving Airline Crew Scheduling Problems by Branch-and-Cut. Management Science, 1993. 39(6): p. 657-682.
[3] Rubin, J., A Technique for the Solution of Massive Set Covering Problems with Application to Airline Crew Scheduling. Transportation Science, 1973. 7: p. 34-48.
[4] Garfinkel, R., Integer Programming. 1972: Wiley Interscience.
[5] Breuer, M.A., Simplification of the Covering Problem with Application to Boolean Expressions. Journal for the Association of Computing Machinery, 1970. 17: p. 166-181.
[6] Freeman, D.R., Jucker, J.V., An Integer Programming Approach to the Vehicle Scheduling Problem. Operational Research Quarterly, 1967. 27: p. 367-384.
[7] Forster, B.A., Ryan, D.M., An Integer Programming Approach to the Vehicle Scheduling Problem. Operational Research Quarterly, 1976. 27: p. 367-384.
[8] Garfinkel, R.G., R., Tripahi, A., and Yin, F., Design of a shopbot and recommender system for bundle purchases. Decision Support Systems, 2006. 42: p. 1974-1986.
[9] Day, R.H., On Optimal Extracting from a Multiple File Data Storage System: An Application of Integer Programming. Operations Research, 1965. 13: p. 482-494.
[10] Garey, M.R., Johnson, D.S., Computers and Intractability: A Guide to the Theory of NP-Completeness. 1979, San Francisco: W.H. Freeman.
[11] Beasley, J.E., OR-Library: Distributing Test Problems by Electronic Mail. Journal of Operational Research Society, 1990. 41(11): p. 1069-1072.
[12] Harche, F., Thompson, G.L., The Column Subtraction Algorithm: An Exact Method for Solving Weighted Set Covering, Packing and Partitioning Problems. Computers & Operations Research, 1994. 21: p. 689-705.
[13] Beasley, J.E., Jornsten, K., Enhancing an Algorithm for Set Covering Problem. European Journal of Operational Research, 1992. 58: p. 293-300.
[14] Beasley, J.E., An Algorithm for Set Covering Problems. European Journal of Operational Research, 1987. 31: p. 85-93.
[15] Balas, E., Carrera, M.C., A Dynamic Subgradient-Based Branch-and-Bound Procedure for Set Covering. Operational Research, 1996. 44(6): p. 875-890.
[16] Glover, F., Surrogate Constraints. Operations Research, 1968. 16(4): p. 741-749.
[17] John, G.C., Kochenberger, G.A., Using Surrogate Constraints in Lagrangian Relaxation Approach to Set Covering Problems. Operational Research Society, 1988. 39(7): p. 681-685.
[18] Glover, F., Tutorial on Surrogate Constraint Approaches for Optimization in Graphs. Journal of Heuristics, 2003. 9: p. 175-227.