Penalty Cuts for GUB-Constrained Mixed Integer Programs

Fred Glover
Leeds School of Business
University of Colorado
Boulder, CO 80309-0419
[email protected]

June 22, 2002

Abstract

Penalty cuts provide a new class of cutting planes for GUB-constrained (and ordinary) mixed integer programs, which are easy to generate by exploiting standard penalty calculations employed in branch-and-bound. The Penalty cuts are created by reference to a selected GUB set and a foundation hyperplane that is typically dual feasible relative to a current linear programming basis. As a special case, the GUB restrictions translate into related disjunctions that provide cutting planes for ordinary MIP problems. At the simplest level these yield the classical Gomory mixed-integer cuts, and at higher levels yield deeper cuts. In general, the strength of the cuts can be varied according to the tradeoffs between the strengths of alternative penalty calculations and the effort required to apply them, and according to interactions between the foundation hyperplanes and the branching disjunctions that underlie the penalties. By this means, Penalty cuts are especially convenient to use in branch-and-cut procedures, where penalty calculations are employed as a matter of course, and afford new strategies for generating cutting planes in this setting.
1. Notation and Problem Formulation

We denote the mixed integer programming (MIP) problem relative to a vector of variables x = (x1 x2 … xn), an m×n matrix A, an n-component vector c and an m-component vector b, as (I)

    Minimize    xo = cx
    subject to  Ax = b
                x ≥ 0
                x ε X
where the condition x ε X requires some of the variables to receive integer values. In particular, we give special consideration to the case where x ε X compels a subset of the integer variables to be zero-one, as expressed by

    xj ε {0,1},  j ε I

and where these zero-one variables are further required to satisfy generalized upper bound (GUB) conditions

    ∑ (xj: j ε Ih) = 1,  Ih ⊂ I, h ε H

contained within the equation Ax = b. Previous work that yields special cutting planes for such problems can be found, for example, in Glover (1973,1975), Hammer, Johnson and Peled (1975), Balas (1979), Sherali, Lee and Adams (1995), Sherali and Lee (1995), Glover, Sherali and Lee (1997) and Gu, Nemhauser and Savelsbergh (1999).

By convention, we will say that a variable xj belongs to a GUB set Ih (or that the GUB set contains xj) if j ε Ih. We focus specifically on the generation of special cutting planes to exploit these GUB constraints, and stipulate that the union of the GUB sets Ih over H is the set I. (The condition x ε X may compel additional variables to be zero-one, but our primary concern is with variables that belong to the GUB sets.) However, our results also provide cutting planes for ordinary mixed integer programs, without reference to zero-one variables or GUB constraints.

Assume A has full row rank, and hence the problem may be represented relative to a current linear programming (LP) basis by (II)

    Minimize    xo = cNxN + co
    subject to  xB + ANxN = b
                xB, xN ≥ 0
                x ε X

where xB and xN respectively denote the basic and non-basic sub-vectors of x. The scalar co, the vectors b and cN, and the matrix AN, are determined in the customary way by matrix row operations performed on (I), hence the b vector in the two systems will generally differ. Representation (II) provides the focus of subsequent discussion. We treat N and B as index sets (as well as subscripts denoting non-basic and basic vectors), and for convenience write xN = (xj : j ε N), xB = (xi : i ε B) and b = (bi : i ε B).
Thus, denoting the ith row of AN by Ai•, i ε B, the equation xB + ANxN = b may also be written as the collection of equations xi + Ai•xN = bi i ε B. Throughout most of the following development, we also assume that (II) is an LP-optimal representation, and hence satisfies the dual feasibility condition cN ≥ 0 and the primal feasibility condition b ≥ 0, which by the GUB constraints implies 1 ≥ bi ≥ 0, i ε I. Thus, in particular, the basic solution given by xN = 0 and (hence) xB = b, is feasible and optimal for the linear program that excludes the condition x ε X.
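The passage from (I) to (II) by matrix row operations can be illustrated with a minimal sketch. The function name `canonical_form` and the small data set are hypothetical and are not taken from the paper; exact rational arithmetic is used only to keep the numbers readable.

```python
from fractions import Fraction


def canonical_form(A, b, c, basis):
    """Rewrite Ax = b, xo = cx relative to the basic columns listed in `basis`.

    Returns (T, b_bar, c_bar, c0): row r of T reads
    x_basis[r] + (non-basic terms) = b_bar[r], and the objective becomes
    xo = c_bar . x + c0 with c_bar equal to zero on the basic columns.
    """
    m, n = len(A), len(A[0])
    # Augmented tableau [A | b] in exact rational arithmetic.
    T = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for r, col in enumerate(basis):
        # Bring a row with a non-zero entry in this column into position r.
        p = next(i for i in range(r, m) if T[i][col] != 0)
        T[r], T[p] = T[p], T[r]
        pivot = T[r][col]
        T[r] = [v / pivot for v in T[r]]
        # Eliminate the column from every other row.
        for i in range(m):
            if i != r and T[i][col] != 0:
                f = T[i][col]
                T[i] = [vi - f * vr for vi, vr in zip(T[i], T[r])]
    # Price out the basic variables from the objective row.
    c_bar, c0 = [Fraction(v) for v in c], Fraction(0)
    for r, col in enumerate(basis):
        f = c_bar[col]
        c_bar = [cj - f * t for cj, t in zip(c_bar, T[r][:n])]
        c0 += f * T[r][n]
    return [row[:n] for row in T], [row[n] for row in T], c_bar, c0


# Hypothetical data: x1 + x2 + x3 = 1 is a GUB row, x4 is a slack variable.
A = [[1, 1, 1, 0], [2, -1, 0, 1]]
b = [1, 1]
c = [1, 2, 3, 0]
T, b_bar, c_bar, c0 = canonical_form(A, b, c, basis=[0, 1])
print(b_bar, c_bar, c0)
```

In this illustration the basis (x1, x2) gives b = (2/3, 1/3) and cN = (4/3, 1/3), so the representation is both primal and dual feasible, and both basic GUB variables are fractional.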
2. Penalty Cutting Planes.

2.0 Representation and Conventions.

A Penalty cutting plane will be represented using the notation

    uo + AoxN = bo

where Ao is a row vector, bo is a scalar, and uo is a non-negative variable (the “cut slack”) which takes the role of an additional basic variable for the system (II). The cutting plane is said to be separating if it renders the current LP extreme point (basic solution) infeasible, which is equivalent to stipulating bo < 0.

The situation of interest is where the current basic solution does not satisfy the zero-one restriction for the GUB-constrained variables, i.e., there exists some k ε B∩I such that xk = bk violates the requirement xk ε {0,1}, and hence 1 > bk > 0. The Penalty cut has two sources: a selected GUB set Iq that contains a branching variable xk of the form specified, and a selected foundation hyperplane.

The foundation hyperplane: Create a hyperplane yo = dNxN by selecting dN to be any dual feasible (non-negative) vector in which dj > 0, for a given j ε N, if there is some branching variable xk in Iq such that Akj < 0 (where Akj is the jth coefficient of the row Ak•).

The form prescribed for the foundation hyperplane is to enable the use of standard penalty calculations that (implicitly or explicitly) take advantage of the dual simplex method. However, as will be seen, the assumption dN ≥ 0 is not necessary except in a special case. Associated with the foundation hyperplane, we create an MIP problem that replaces the objective function of (II) by that of minimizing yo, to yield (III)

    Minimize    yo = dNxN
    subject to  xB + ANxN = b
                xB, xN ≥ 0
                x ε X

Let yok[1] and yok[0] denote the optimum values of yo in (III) subject to the additional restrictions xk ≥ 1 and xk ≤ 0, respectively. (For the zero-one case, which is our initial concern, these restrictions are equivalent to xk = 1 and xk = 0.) We suppose that penalty calculation procedures are applied to the problem (III) to identify penalties Pk[1] and Pk[0] which provide lower bounds for yok[1] and yok[0]. (We recall that in customary penalty calculations, some of the constraining equations of (III) are typically disregarded. A simple example is a first-order calculation based on a single dual pivot, which implicitly considers only the single constraint induced by a branching equation or inequality.)
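A first-order calculation of this kind can be sketched as follows. The sketch assumes dN ≥ 0 and uses only the row of the branching variable: the branch xk ≥ 1 requires Σ(–Akj xj) ≥ 1 – bk, and the branch xk ≤ 0 requires Σ(Akj xj) ≥ bk, so each bound comes from a single ratio test. The function name and the data are hypothetical, and this is one common form of the calculation rather than a procedure prescribed by the paper.

```python
import math
from fractions import Fraction as F


def first_order_penalties(d, a_k, b_k):
    """Lower bounds (P_k[1], P_k[0]) on yo = d . xN from the single row
    x_k + a_k . xN = b_k, assuming d >= 0 and 0 < b_k < 1."""
    # Branch x_k >= 1: only columns with a negative coefficient can raise x_k.
    up = [dj / -akj for dj, akj in zip(d, a_k) if akj < 0]
    # Branch x_k <= 0: only columns with a positive coefficient can lower x_k.
    down = [dj / akj for dj, akj in zip(d, a_k) if akj > 0]
    # An empty ratio list means the branch is infeasible for this one-row
    # relaxation, which the text records by convention as an infinite penalty.
    p1 = (1 - b_k) * min(up) if up else math.inf
    p0 = b_k * min(down) if down else math.inf
    return p1, p0


# Hypothetical row with three non-basic variables.
d = [F(1), F(2), F(1)]
a_k = [F(-1, 2), F(1, 4), F(-1)]
b_k = F(2, 5)
p1, p0 = first_order_penalties(d, a_k, b_k)
print(p1, p0)
```

For these numbers the sketch gives Pk[1] = 3/5 and Pk[0] = 16/5.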
2.1 Characteristics of the Penalties.

In general, we allow the penalties Pk[1] and Pk[0] to be based on analyses that go beyond simply solving a linear programming relaxation of (III), as for example by solving integer-constrained knapsack problems using surrogate constraint strategies. However, we will focus more specifically on LP relaxations of (III) in the setting of ordinary MIP problems where GUB constraints may not be present.

If the penalty calculations disclose that (III) has no feasible solution for one of the branches xk ≥ 1 or xk ≤ 0 (in which case by convention Pk[1] = ∞ or Pk[0] = ∞), then we simply enforce the opposite branch. Hence for the following we assume that infeasibility is not encountered, i.e., Pk[1] and Pk[0] are both finite.

To begin, we are only interested in determining the penalties Pk[1], k ε Iq (for a selected q ε H) and disregard the penalties Pk[0]. The assumption dj > 0 if Akj < 0 for some branching variable xk in Iq assures that standard penalty calculations will yield Pk[1] positive, which in turn will assure that the Penalty cut is separating. The penalties Pk[1] are to be generated not only for branching variables xk, but for all variables in the GUB set Iq. (For example, if xk is non-basic, then in the lowest order penalty calculation, the penalty Pk[1] can be given simply by dk, the component of dN corresponding to xk.)

Remark 1. In determining the penalty Pk[1] for a basic variable xk (or for a non-basic variable xk, when a higher order penalty calculation is used), the calculation can be strengthened by enforcing xj = 0 for all j such that j and k belong to a common GUB set Ih.

The foregoing observation is frequently overlooked in the literature on MIP penalty calculations, yet can have a considerable impact on the penalties generated, particularly in problems where a given xk belongs to numerous GUB sets. The effect of compelling the indicated variables to equal 0 will automatically be achieved if a penalty calculation is based on performing a sufficient number of dual pivots, but possibly at the expense of undue computational effort. Setting xj = 0 can of course be conveniently handled for any non-basic xj simply by disregarding the associated component dj and column A•j (of dN and AN) in performing the penalty calculation.
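The effect of Remark 1 on a first-order penalty can be illustrated by skipping the columns of non-basic variables that share a GUB set with xk. This is a minimal sketch under the same single-row assumption as a first-order calculation; the function name, the index labels and the data are hypothetical.

```python
import math
from fractions import Fraction as F


def up_penalty(d, a_k, b_k, nonbasic, fixed_to_zero=()):
    """First-order bound for the branch x_k >= 1, disregarding the columns
    of non-basic variables that the branch forces to zero."""
    ratios = [dj / -akj
              for j, dj, akj in zip(nonbasic, d, a_k)
              if akj < 0 and j not in fixed_to_zero]
    return (1 - b_k) * min(ratios) if ratios else math.inf


# Hypothetical row: non-basic variables x4, x5, x6, where x6 is assumed
# to share a GUB set with the branching variable x_k.
nonbasic = [4, 5, 6]
d = [F(1), F(2), F(1)]
a_k = [F(-1, 2), F(1, 4), F(-1)]
b_k = F(2, 5)

plain = up_penalty(d, a_k, b_k, nonbasic)
strengthened = up_penalty(d, a_k, b_k, nonbasic, fixed_to_zero={6})
print(plain, strengthened)
```

Here the column of x6 supplies the minimum ratio, so removing it doubles the penalty from 3/5 to 6/5 at no extra pivoting cost.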
2.2 Main Result.

As a foundation for our main result, we reiterate that the generation of a Penalty cut begins by selecting a given q ε H, where Iq contains at least one branching variable xk, and then selecting the non-negative vector dN to produce the foundation hyperplane yo = dNxN. By convention, if xk is non-basic, we specify Ak• to be the negative unit vector with a -1 in the position for xk and 0’s elsewhere, and specify bk to be 0. (This yields the appropriate equation xk + Ak•xN = bk for the non-basic variable, which is simply the identity xk – xk = 0.)

Theorem. The penalties Pj[1] for j ε Iq, generated relative to (III), give a valid cutting plane uo + AoxN = bo where

    Ao = - (dN + Σ(Pj[1]Aj•: j ε Iq))
    bo = - Σ(Pj[1]bj: j ε Iq).

Moreover, the cutting plane is separating if Pk[1] is positive for at least one k ε Iq such that xk qualifies as a branching variable.

Proof: Let yo, x be any feasible solution to (III). Hence, there exists some k ε Iq such that xk = 1, and for this k, yo ≥ yok[1]. Moreover for the indicated feasible solution yo, x, it follows that yok[1] = ∑(yoj[1]xj: j ε Iq), since xk = 1 and xj = 0 for all other j ε Iq, and hence yo ≥ ∑(yoj[1]xj: j ε Iq) ≥ ∑(Pj[1]xj: j ε Iq). Let uo = yo – ∑(Pj[1]xj: j ε Iq). Then uo ≥ 0, and replacing yo by dNxN and xj by bj – Aj•xN for j ε Iq, and re-arranging terms, yields the form of the cutting plane specified in the theorem. The condition bo < 0 is clearly assured whenever Pk[1] > 0 for some branching variable xk, k ε Iq.

The validity of the Penalty cut is also implied by results of Glover (1973,1975). The present development is more direct, however, and provides a new context and construction for generating the cutting planes.

As previously remarked, the assumption dN ≥ 0 is not necessary for the preceding result. However, in the case where dN can have negative components, possibly some of the penalties Pj[1] can be negative, and then the separation condition requires the explicit stipulation Σ(Pj[1]bj: j ε Iq) > 0 to assure bo < 0.

The Theorem has an interesting interpretation relative to the branching conditions xk ≥ 1 and xk ≤ 0. Specifically, these conditions respectively correspond to augmenting (III) by the inequalities Ak•xN ≤ bk – 1 and – Ak•xN ≤ – bk, or more specifically, upon introducing slack variables, by equations s1k + Ak•xN = bk – 1 and sok – Ak•xN = – bk. (The inequalities in fact hold as equations for xk bounded by 0 and 1, and the slack variables in this case are constrained to 0.) The Penalty cut identified in the Theorem results by weighting each inequality – Ak•xN ≤ – bk (which applies to the xk ≤ 0 branch) by the corresponding penalty Pk[1] (which applies to the xk ≥ 1 branch). The quantity – dNxN is then added to the left hand side of the result and the variable uo serves as a slack variable to create the cutting plane equation.

In retrospect, the effect of weighting inequalities for “0 branches” by penalties for “1 branches” has an intuitive plausibility to it. By this means, the larger the penalty Pk[1] for the xk ≥ 1 branch, the greater is the weight for the opposite branch in the representation of the cut.
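The construction in the Theorem can be sketched numerically. The instance below is hypothetical: it assumes a GUB set {x1, x2, x3} with x1 + x2 + x3 = 1, a second constraint x1 – x2 + x4 – x5 = 0 with continuous x4, x5 ≥ 0, the basis (x1, x2), and first-order penalties computed for an assumed foundation vector dN = (1, 1, 3) over xN = (x3, x4, x5).

```python
from fractions import Fraction as F


def penalty_cut(d, rows, rhs, penalties):
    """Return (Ao, bo) of the cut uo + Ao . xN = bo, where rows[i], rhs[i]
    and penalties[i] hold A_j., b_j and P_j[1] for each j in the GUB set."""
    n = len(d)
    Ao = [-(d[t] + sum(P * row[t] for P, row in zip(penalties, rows)))
          for t in range(n)]
    bo = -sum(P * bj for P, bj in zip(penalties, rhs))
    return Ao, bo


half = F(1, 2)
rows = [[half, half, -half],   # x1 + A1. xN = 1/2
        [half, -half, half],   # x2 + A2. xN = 1/2
        [F(-1), F(0), F(0)]]   # non-basic x3: negative unit row, b3 = 0
rhs = [half, half, F(0)]
d = [F(1), F(1), F(3)]
penalties = [F(3), F(1), F(1)]  # first-order P_j[1] for this choice of d

Ao, bo = penalty_cut(d, rows, rhs, penalties)


def slack(xN):
    """Value of the cut slack uo at a given non-basic point xN."""
    return bo - sum(a * x for a, x in zip(Ao, xN))


print(Ao, bo, slack([0, 0, 0]))
```

The resulting cut is uo – 2x3 – 2x4 – 2x5 = –2, that is, x3 + x4 + x5 ≥ 1. The current basic solution xN = 0 gives uo = –2 and is cut off, while the integer-feasible points of the assumed instance, such as xN = (0, 0, 1), (0, 1, 0) and (1, 0, 0), keep uo ≥ 0.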
2.3 Special Consequences.

By relying on the assumption dN ≥ 0 we can obtain a simple variation of the preceding result that applies to problems that contain restrictions resembling GUB constraints, but that do not impose a zero-one restriction. Specifically, we make reference to problems having disjunctive constraints of the form

    (D) xj ≥ 1 for exactly one j ε Ih and xj ≤ 0 for all other j ε Ih.

The condition (D) is quite general, since it can handle disjunctive relations such as xj ≥ uj and xj ≤ vj simply by scaling and translating xj (which shifts the value bj for a basic variable to lie between 0 and 1 in a feasible solution). As before, we define yok[1] as the optimum value of yo subject to xk ≥ 1 (and dual feasibility again implies this value is non-negative).

Remark 2. Under the twin assumption that penalties are calculated relative to a linear programming relaxation of (III) and that dN ≥ 0, the Theorem remains valid relative to the disjunctive condition (D). Specifically, an LP relaxation implies Pk[1] is the same whether calculated for xk = 1 or xk ≥ 1, and in addition implies yok[1] ≥ Pk[1]xk, which assures yok[1] ≥ ∑(yoj[1]xj: j ε Iq). The proof of the theorem therefore carries through as before.

Remark 2 is relevant to the situation where xk is any integer-constrained variable, without being required to be zero-one or to belong to a GUB set. For example, xk can be obtained by the typical device (Gomory, 1960a, 1960b) of taking an original integer-constrained variable xp and reducing the coefficients of non-basic variables to their “fractional parts,” corresponding to forming integer combinations of xk with the non-basic integer variables. Then, shifting bp by an integer amount so that the resulting bk lies between 0 and 1, the Theorem for this special case can be expressed as follows, making use of the penalty Pk[0].

Corollary. (Penalty cut relative to a single branching variable xk.) By the conventions of Remark 2, the penalties Pk[1] and Pk[0] generated relative to (III) give a valid cutting plane uo + AoxN = bo by defining

    Ao = - (dN + (Pk[1] – Pk[0])Ak•)
    bo = - (Pk[0] + (Pk[1] – Pk[0])bk).

Moreover, the cutting plane is separating if either Pk[1] or Pk[0] is positive.

The Theorem and its Corollary have additional implications for generating useful cutting planes. Evidently, different scalings of dN yield corresponding scalings of Pk[1] (and of Pk[0]), which translate into different scalings for the resulting cut. Consequently, it is possible to normalize the vector dN without changing the space of feasible solutions admitted by the cut. More importantly, stronger forms of the Penalty cut result by keeping the coefficients of dN as small as possible to yield a given set of penalties Pk[1], k ε Iq. Such a determination of dN can be made in retrospect for penalties implicitly or explicitly calculated from post-optimization with the dual simplex method.
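The Corollary's formulas for Ao and bo translate directly into a few lines. The sketch below uses a hypothetical row and hypothetical penalty values (they are the first-order bounds for the assumed dN, but any valid lower bounds could be supplied); the function name is likewise an assumption.

```python
from fractions import Fraction as F


def single_variable_cut(d, a_k, b_k, p1, p0):
    """Return (Ao, bo) of the Corollary's cut uo + Ao . xN = bo for a single
    branching variable with row a_k, right-hand side b_k, and penalties
    p1 = P_k[1], p0 = P_k[0]."""
    diff = p1 - p0
    Ao = [-(dj + diff * akj) for dj, akj in zip(d, a_k)]
    bo = -(p0 + diff * b_k)
    return Ao, bo


# Hypothetical data for one fractional basic variable x_k.
d = [F(1), F(2), F(1)]
a_k = [F(-1, 2), F(1, 4), F(-1)]
b_k = F(2, 5)
Ao, bo = single_variable_cut(d, a_k, b_k, p1=F(3, 5), p0=F(16, 5))
print(Ao, bo)
```

Since bo = –54/25 is negative, the cut removes the current basic solution xN = 0, in agreement with the separation statement of the Corollary.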
Remark 3. Let dj(k) be the reduced cost for xj obtained from an LP relaxation of (III) subject to branching on xk (where imposing xk ≥ 1 gives the same outcome as imposing xk = 1). Then the cut of the Theorem can be strengthened by replacing dj with dj* = dj – Min(dj(k): k ε Iq), j ε N.

The validity of Remark 3 follows from the fact that the construction of dN* assures the vector will be non-negative and yield the same penalties as the vector dN. Recourse to a simple first-order penalty calculation permits dN to be determined initially so that it does not need to be replaced by a vector dN*, as observed next.

Remark 4. When penalties are implicitly generated by a single dual pivot on the constraint induced by a branching equation or inequality, the effect of Remark 3 can be established in advance by selecting dj = Max(|Akj|: Akj < 0, k ε Iq). (In the case of the Corollary, where there is only one branching variable xk to consider, this consists simply of selecting dj = |Akj|.)

Subject to these observations, different choices of dN can be made to alter the slope of the cut, thus making it possible to select Penalty cuts that extend more deeply along selected dimensions. The relevance of creating designs for producing deep cuts relative to particular dimensions is addressed in Glover (1970) and Sherali, Lee and Kim (2001). The latitude for such choice is limited in the case of first-order penalty calculations, however, as Remark 4 suggests. In fact, for the restricted setting of the Corollary, where a single branching variable is treated without reference to a GUB set, and in the special case where a first-order penalty is used, it is easy to show that choosing dj = |Akj| as in Remark 4 gives the same cutting plane as the mixed integer cutting plane of Gomory (1960b). Thus, the Corollary is not of great interest when only first-order penalties are applied.
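The agreement with the Gomory mixed integer cut can be checked numerically on a small hypothetical row. The sketch assumes the non-basic variables are continuous (or that their coefficients have already been reduced as described for the Corollary), uses the single-row ratio test as the first-order penalty, and compares the Corollary's cut, rescaled to right-hand side 1, with the familiar form Σ(Akj/bk: Akj > 0)xj + Σ(|Akj|/(1 – bk): Akj < 0)xj ≥ 1.

```python
from fractions import Fraction as F

# Hypothetical row x_k + a_k . xN = b_k with 0 < b_k < 1.
a_k = [F(-1, 2), F(1, 4), F(-1)]
b_k = F(2, 5)

# Remark 4 choice for a single branching variable: d_j = |A_kj|.
d = [abs(a) for a in a_k]

# First-order penalties from the single-row ratio tests.
p1 = (1 - b_k) * min(dj / -a for dj, a in zip(d, a_k) if a < 0)
p0 = b_k * min(dj / a for dj, a in zip(d, a_k) if a > 0)

# Corollary cut written as sum(coef_j * x_j) >= rhs (that is, uo >= 0).
coef = [dj + (p1 - p0) * a for dj, a in zip(d, a_k)]
rhs = p0 + (p1 - p0) * b_k
scaled = [cj / rhs for cj in coef]

# Gomory mixed integer cut for continuous non-basic variables, right-hand side 1.
gomory = [a / b_k if a > 0 else -a / (1 - b_k) for a in a_k]

print(scaled, gomory)
```

With dj = |Akj| every ratio equals 1, so Pk[1] = 1 – bk and Pk[0] = bk, and dividing the Corollary's cut by 2bk(1 – bk) reproduces the Gomory coefficients (5/6, 5/8, 5/3) for this row.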
(However, Marchand and Wolsey (2001) demonstrate that a variety of other inequalities, including residual capacity inequalities, mixed cover inequalities and weight inequalities, are instances of such first-order cutting planes that correspond to the Gomory mixed-integer cuts.) More advanced penalty calculations, on the other hand, afford an opportunity to obtain stronger cuts. Supplementary strengthening possibilities in the zero-one and GUB-constrained cases are also provided in Glover and Sommer (1976), Glover, Sherali and Lee (1997) and Gu, Nemhauser and Savelsbergh (1999). In general, apart from using higher-order penalty calculations, in the case of GUB-constrained applications the potential to obtain cutting planes of greater power comes from exploiting the Theorem by including all members of a GUB set in the determination of the Penalty cut.

Finally, we note that the observation of Remark 4 can be re-expressed by explicitly referring to the row vector w of non-negative dual variables that generates a dual feasible solution, and hence generates both the penalties and reduced costs. In particular, let w[k] denote the vector w for the system that includes the inequality Ak•xN ≤ bk – 1 (or corresponding equation s1k + Ak•xN = bk – 1) associated with the branch xk ≥ 1. Then a lower bound Pk[1] for yok[1] is given by identifying a non-negative w[k] (with a positive weight for the equation containing s1k) to yield

    dN + w[k]AN ≥ 0 (for dual feasibility),

hence

    dj ≥ - w[k]A•j for all j ε N.

The tightest dN vector, which gives the strongest cutting plane, is therefore given by

    dj = Max(- w[k]A•j: k ε Iq), j ε N

and the penalty associated with w[k] is Pk[1] = - w[k]b[k], where b[k] is the b vector augmented to include the equation containing s1k. These relationships can be used directly in strategies for generating Penalty cuts, with and without the use of linear programming.
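These relationships can be sketched for given multiplier vectors. The example assumes a hypothetical GUB set {x1, x2, x3} with non-basic variables (x3, x4, x5), and takes each w[k] to place unit weight on the branching row only, which corresponds to a first-order calculation; the function name and data layout are assumptions made for illustration.

```python
from fractions import Fraction as F


def tightest_foundation(systems):
    """systems holds one triple (w, A_aug, b_aug) per k in the GUB set, where
    A_aug and b_aug are the rows and right-hand sides of the system augmented
    by the branch row A_k. xN <= b_k - 1, and w is the multiplier vector w[k].
    Returns d_j = Max(-w[k]A.j over k) and the penalties P_k[1] = -w[k]b[k]."""
    n = len(systems[0][1][0])
    neg_wA = [[-sum(wi * row[j] for wi, row in zip(w, A)) for j in range(n)]
              for w, A, _ in systems]
    d = [max(col) for col in zip(*neg_wA)]
    P = [-sum(wi * bi for wi, bi in zip(w, b)) for w, _, b in systems]
    return d, P


half = F(1, 2)
A1 = [half, half, -half]      # row of basic x1, b1 = 1/2
A2 = [half, -half, half]      # row of basic x2, b2 = 1/2
A3 = [F(-1), F(0), F(0)]      # negative unit row for non-basic x3, b3 = 0
base_A, base_b = [A1, A2], [half, half]
w = [F(0), F(0), F(1)]        # assumed multipliers: weight on the branch row only

systems = [(w, base_A + [A1], base_b + [half - 1]),
           (w, base_A + [A2], base_b + [half - 1]),
           (w, base_A + [A3], base_b + [F(0) - 1])]
d, P = tightest_foundation(systems)
print(d, P)
```

For this data the sketch returns dN = (1, 1/2, 1/2) with penalties (1/2, 1/2, 1). The formula is applied literally, so a component of dN could come out negative for other multipliers; clamping at zero would be a separate design choice when dN ≥ 0 is required.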
3. Concluding Remarks

The Penalty cuts offer a previously unavailable opportunity to exploit penalty calculations of the type customarily used in branch-and-bound, thereby yielding a new function for these calculations that supplements their role in fathoming nodes and in selecting branches of the branch-and-bound tree. Consequently, the Penalty cuts are particularly convenient for use in branch-and-cut methods.

The new cutting planes also open up several areas of research. The latitude to select the foundation hyperplane yo = dNxN in order to bias the cut to extend more deeply in particular dimensions invites an investigation of alternative strategies for generating the dN vector. Similarly, the trade-offs involved in employing more advanced penalty calculations as a source of the cutting planes warrant investigation. In particular, higher-order penalties may provide a different degree of advantage for Penalty cuts than for branch-and-bound fathoming operations, since the latter are only relevant in the case where it is possible to determine infeasibility or to establish that the objective function exceeds an admissible bound.

The determination of which GUB sets from a given collection provide the best source for Penalty cuts also invites investigation. We anticipate that MIP problems in which GUB constraints are numerous and include a large portion of the integer variables are likely to provide the most useful applications for these cutting planes. The fact that the Penalty cuts in such settings are based on selecting GUB sets rather than individual variables as a foundation for creating the cutting plane structure gives them a novel property whose consequences likewise deserve study.
Acknowledgement: This research was partially supported by the Office of Naval Research contract N00014-01-1-0917 in connection with the Hearin Center of Enterprise Science at the University of Mississippi.
References

Balas, E. (1979) “Disjunctive Programming,” Annals of Discrete Mathematics, 5, pp. 351.

Glover, F. (1970) “Faces of the Gomory Polyhedron,” Integer and Nonlinear Programming II, Jean Abadie, editor, North-Holland Publishing Company.

Glover, F. (1973) “Convexity Cuts for Multiple Choice Problems,” Discrete Mathematics, Vol. 3, No. 1, pp. 86-100.

Glover, F. (1975) “Polyhedral Annexation in Mixed Integer and Combinatorial Programming,” Mathematical Programming, Vol. 8, pp. 161-188.

Glover, F., H. Sherali and Y. Lee (1997) “Generating Cuts from Surrogate Constraint Analysis for Zero-One and Multiple Choice Programming,” Computational Optimization and Applications, Volume *, Number 2, pp. 151-172.

Glover, F. and D. Sommer (1976) “Inequalities for Mixed Integer Programs with Structure,” Naval Research Logistics Quarterly, Vol. 23, No. 4, pp. 603-609.

Gomory, R. (1960a) “Solving Linear Programming Problems in Integers,” R.E. Bellman and M. Hall, Jr., eds. Combinatorial Analysis, American Mathematical Society, pp. 211-216.

Gomory, R. (1960b) “An Algorithm for the Mixed Integer Problem,” Research Memorandum RM-2597, Rand Corporation, Santa Monica.

Gu, Z., G.L. Nemhauser, and M.W.P. Savelsbergh (1999) “Lifted Flow Covers for Mixed 0-1 Integer Programs,” Mathematical Programming 85, pp. 439-468.

Hammer, P.L., E.L. Johnson, and U.N. Peled (1975) “Facets of Regular 0-1 Polytopes,” Mathematical Programming, Vol. 8, pp. 179-206.

Marchand, H. and L.A. Wolsey (2001) “Aggregation and Mixed Integer Rounding to Solve MIPS,” Operations Research, Vol. 49, No. 3, pp. 363-371.

Sherali, H. D. and Y. Lee (1995) “Sequential and Simultaneous Liftings of Minimal Cover Inequalities for Generalized Upper Bound Constrained Knapsack Polytopes,” SIAM Journal on Discrete Mathematics, Vol. 8, No. 1, pp. 133-153.

Sherali, H. D., Y. Lee and W.P. Adams (1995) “A Simultaneous Lifting Strategy for Identifying New Classes of Facets for the Boolean Quadric Polytope,” Operations Research Letters, Vol. 17, No. 1, pp. 19-26.

Sherali, H.D., Y. Lee and Y. Kim (2001) “Partial Convexification Cuts for 0-1 Mixed-Integer Programs,” Working paper, Virginia Polytechnic and State University, Blacksburg, VA.