SIAM J. Disc. Math.
Vol. 3, No. 3, pp. 411-430, August 1990

A HIERARCHY OF RELAXATIONS BETWEEN THE CONTINUOUS AND CONVEX HULL REPRESENTATIONS FOR ZERO-ONE PROGRAMMING PROBLEMS*

HANIF D. SHERALI† AND WARREN P. ADAMS‡

Abstract. In this paper a reformulation technique is presented that takes a given linear zero-one programming problem, converts it into a zero-one polynomial programming problem, and then relinearizes it into an extended linear program. It is shown that the strength of the resulting reformulation depends on the degree of the terms used to produce the polynomial program at the intermediate step of this method. In fact, as this degree varies from one up to the number of variables in the problem, a hierarchy of sharper representations is obtained with the final relaxation representing the convex hull of feasible solutions. The reformulation technique readily extends to produce a similar hierarchy of linear relaxations for zero-one polynomial programming problems. A characterization of the convex hull in the original variable space is also available through a projection process. The structure of this convex hull characterization (or its other relaxations) can be exploited to generate strong or facetial valid inequalities through appropriate surrogates in a computational framework. The surrogation process can also be used to study various classes of facets for different combinatorial optimization problems. Some examples are given to illustrate this point.

Key words. convex hull, zero-one programs, tight relaxations, facets

AMS(MOS) subject classifications. 90C10, 90C27

1. Introduction. This paper describes a technique for generating a hierarchy of polyhedral representations for linear and polynomial zero-one programming problems. In the linear case, the technique first multiplies the constraints in the problem, including the zero-one interval bounds on the variables, by a select set of $d$-degree polynomial terms or factors formed using the $n$ zero-one variables, where $1 \le d \le n$. The resulting system is then linearized by defining new nonnegative variables for each existing cross-product term. It is shown that for each $d = 1, \ldots, n$, by enforcing the binary restrictions on the original $x$-variables, we obtain an equivalent reformulation of the problem which has at least as tight a linear programming relaxation as that obtained by using factors of degree $d - 1$. (Here, the 0-degree factor is taken as unity.) Moreover, when $d = n$, we show that the resulting linear system represents a polytope whose extreme points are precisely the zero-one solutions feasible to the original problem, and hence characterizes the convex hull of feasible solutions. A projection of this convex hull characterization onto the space of the original variables with an identification of its facets is also evident. A similar approach produces a hierarchy of polyhedral representations extending up to the convex hull of feasible solutions for zero-one polynomial programming problems.

The foregoing approach is a generalization (in the pure zero-one case) of the linearization strategy suggested by Adams and Sherali (1987) for mixed-integer zero-one programming problems using $d = 1$. This technique was theoretically shown to generate tighter linear programming relaxations than alternative methods such as those due to Petersen (1971), Glover and Woolsey (1974), and Glover (1975). Computational strategies for exploiting the structure of this reformulation were presented for zero-one quadratic programming problems in Adams and Sherali (1986).
The analysis in the present paper demonstrates that while maintaining the basic structure of the reformulation, we can strengthen it as desired to the point that we obtain the convex hull of feasible solutions to the original zero-one problem.

* Received by the editors December 12, 1988; accepted for publication (in revised form) September 20, 1989. This material is based upon work supported by National Science Foundation grant ECS-8807090. The United States Government has certain rights in this material.
† Department of Industrial Engineering and Operations Research, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061-0118.
‡ Department of Math Sciences, Clemson University, Clemson, South Carolina 29634-1907.

Balas (1983) has proposed an alternative technique for obtaining a hierarchy of relaxations for mixed-integer zero-one programs via disjunctive programming methods. Here, the basic construct used from disjunctive programming is a description of the closure convex hull of a union of polyhedral sets in terms of a certain higher dimensional polyhedron. This construct is used as follows. Given a representation of the feasible region $F$ as $F = \bigcap_{j \in T} S_j$, where $S_j$ is the union of certain polyhedra for each $j$ in an index set $T$, a relaxation of the closure convex hull of $F$ is obtained by taking the intersection of the closure convex hull of each $S_j$, $j \in T$. This process is said to determine a hull-relaxation of $F$. If $F$ is written as the intersection of its continuous relaxation constraints along with the disjunctions that $x_j = 0$ or $x_j = 1$ for each binary variable $x_j$, this process yields the usual linear programming relaxation. On the other end of the spectrum, if $F$ is represented as a union of polyhedra ($|T| = 1$) by essentially enumerating all possible feasible binary combinations, we obtain the closure convex hull of $F$ via this process. The in-between relaxations are obtained by converting in a stage-wise fashion the intersections of each of various pairs of sets $S_j$ to an equivalent union of polyhedra before taking the hull-relaxation. Evidently, there is a considerable degree of flexibility in designing these in-between relaxations. While our technique coincides with Balas' method at the two ends of this spectrum of relaxations, it differs in the sequence of intermediate relaxations.
Observe also that our technique is in the spirit of Chvátal's (1973) result which shows that any given facetial inequality for a linear, bounded, pure integer programming problem can be derived through a sequence of constraint surrogations followed by a rounding up of coefficients, where the new constraints can themselves be used in the surrogates. Our technique can similarly be viewed as an appropriate sequential process of constraint multiplication by single degree factors followed by variable redefinitions. However, unlike Chvátal, we provide an explicit characterization of the convex hull. Its structure not only enables us to verify the validity of facetial inequalities through a surrogation process, but can also be used constructively to generate facets for different classes of combinatorial zero-one problems, such as for the zero-one knapsack problem (Balas (1975), (1976), Balas and Zemel (1977), Hammer, Johnson, and Peled (1975), and Wolsey (1975)), the travelling salesman problem (Crowder and Padberg (1980), Grötschel and Padberg (1979), and Lawler et al. (1985)), the multiple choice problem (Johnson (1981)), set covering and set packing problems (Padberg (1973), (1979), (1980)), positive zero-one programs (Balas and Zemel (1984)), general zero-one problems (Crowder, Johnson, and Padberg (1983), Wolsey (1976), and Zemel (1978)), and certain special problems as in Balas (1985), among several others. Furthermore, our characterization can be used computationally to provide sharper or tighter polyhedral representations for linear and polynomial zero-one programming problems (including mixed-integer cases), beyond that described in Adams and Sherali (1987), and to generate strong valid inequalities for such problems.

From an algorithmic viewpoint, the importance of having tight linear programming relaxations which provide at least a partial convex hull representation has long been recognized.
This has led to some crucial and critical research on model construction and problem formulation as in Jeroslow (1980), (1983), (1984a), (1984b), (1985), Jeroslow and Lowe (1984), (1985), Meyer (1975), (1976), (1981), Meyer, Thakkar, and Hallman (1980), and Williams (1974), (1985). Cutting plane generation schemes striving to provide tighter relaxations have also been proposed by Martin and Schrage (1982), (1983), Padberg (1973), (1979), (1980), Van Roy and Wolsey (1983) and Wolsey (1975), (1976). Automatic reformulation procedures utilizing such constraint generation schemes within a branch and bound framework are presented in Crowder, Johnson, and Padberg (1983), Hoffman and Padberg (1985), Johnson and Suhl (1980), Johnson, Kostreva, and Suhl (1985), Oley and Sjouquist (1982), Spielberg and Suhl (1980), and Van Roy and Wolsey (1984a), (1984b). For some specially structured problems, a variable redefinition technique has also been proposed by Martin (1984), and Eppen and Martin (1985), where a linear transformation on variables is used to yield an equivalent formulation that tightens the continuous relaxation by constructing a partial convex hull of a subset of constraints. Our work is intended to lead to research in the same spirit as the foregoing papers.

Our presentation is organized as follows. Section 2 presents the hierarchy of tighter relaxations for linear zero-one problems. Section 3 establishes the convex hull property of the final relaxation, providing a characterization of the facets of the convex hull of $X$. This is then generalized to zero-one polynomial programming problems in §4. Some illustrative examples are provided in §5, and §6 provides some concluding remarks pertaining to our ongoing research.

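2. A hierarchy of relaxations. Consider the feasible region $X$ of a linear zero-one programming problem in $n$ variables as defined below, where $e$ is a vector of ones in $R^n$:

$$X = \Big\{ x \in R^n : \sum_{j=1}^{n} \alpha_{rj} x_j \ge \beta_r,\ r = 1, \ldots, R_1;\ \ \sum_{j=1}^{n} \alpha_{rj} x_j = \beta_r,\ r = R_1 + 1, \ldots, R_1 + R_2;\ \ 0 \le x \le e,\ x \text{ integer} \Big\}. \tag{2.1}$$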

Observe that in (2.3) if we put $d = 0$ and define $(J_1, J_2) = (\emptyset, \emptyset)$ to be the only pair of order zero, and let $f_0(\emptyset, \emptyset) \equiv 1$, we obtain precisely the linear programming relaxation $X_0$. Further observe that for any $d = 1, \ldots, n$, the operations conducted on the set $X$ in Steps 1 and 2 to obtain the set $X_d$ do not eliminate any binary $x$ solutions feasible to $X$. Also, note that at Step 1 above, if we had symmetrically multiplied each of the bounding constraints $0 \le x_j \le 1$, $j = 1, \ldots, n$, by each of the factors in (2.2), we would have obtained relationships enforcing nonnegativities on all factors of degree $d$ and all factors of degree $D = \min\{d + 1, n\}$. However, when $d < n$, Lemma 1 below shows that it is sufficient to include only the nonnegativities on the factors of degree $(d + 1)$ as in (2.3c) above.

Finally, note that we could have more generally included the inequalities $\big(\sum_{j=1}^{n} \alpha_{rj} x_j - \beta_r\big) \ge 0$, $r = 1, \ldots, R_1$, along with the inequalities $x_j \ge 0$ and $(1 - x_j) \ge 0$, $j = 1, \ldots, n$, in various combinations to define the factors of the type (2.2) to be used in the above reformulation scheme. Although the new constraints thus generated might help tighten the representations obtained for $d < n$, and so may be algorithmically advantageous, it follows from our development in the sequel that these constraints are all implied by the constraints of $X_n$. (We remark, however, that for any $d \le n$, the constraints generated via the products of such new factors with the equality constraints in $X$ are already implied by the constraints present in $X_d$ above.)

LEMMA 1. For any $1 \le d < n$, the constraints $f_d(J_1, J_2) \ge 0$ for all $(J_1, J_2)$ of order $d$ are implied by the constraints $f_{d+1}(J_1, J_2) \ge 0$ for all $(J_1, J_2)$ of order $(d + 1)$.

Proof. Consider any $(J_1, J_2)$ of order $d$, $1 \le d < n$. Let $k \in N - (J_1 \cup J_2)$. Then we have

$$F_{d+1}(J_1 + k, J_2) + F_{d+1}(J_1, J_2 + k) = [x_k + (1 - x_k)]\, F_d(J_1, J_2) = F_d(J_1, J_2). \tag{2.4a}$$

Hence, we have,

$$f_d(J_1, J_2) = f_{d+1}(J_1 + k, J_2) + f_{d+1}(J_1, J_2 + k) \ge 0 \tag{2.4b}$$

and the proof therefore follows.
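The identity (2.4b) can be checked mechanically. The sketch below is not part of the paper; it assumes, consistently with (2.4a) and with the definition of $p(J_1, J_2)$ in §4, that $F_d(J_1, J_2) = \prod_{j \in J_1} x_j \prod_{j \in J_2} (1 - x_j)$ and that $f_d$ is obtained by replacing each monomial $\prod_{j \in J} x_j$ by $w_J$. The helper names `subsets`, `linearized_factor`, and `add` are hypothetical.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def linearized_factor(J1, J2):
    """Return f(J1, J2) as a dict {index set J: coefficient of w_J}.

    Assumed conventions: the empty set carries the constant term
    and a singleton {j} stands for the original variable x_j.
    """
    J1, J2 = frozenset(J1), frozenset(J2)
    if J1 & J2:
        raise ValueError("J1 and J2 must be disjoint")
    # Expanding prod_{j in J2} (1 - x_j) gives one signed monomial per subset L of J2.
    return {J1 | L: (-1) ** len(L) for L in subsets(J2)}


def add(a, b):
    """Add two linear expressions, dropping zero coefficients."""
    total = dict(a)
    for key, coeff in b.items():
        total[key] = total.get(key, 0) + coeff
    return {key: coeff for key, coeff in total.items() if coeff != 0}


# Check (2.4b) for J1 = {1}, J2 = {2}, k = 3 in a problem with n >= 3.
lhs = linearized_factor({1}, {2})
rhs = add(linearized_factor({1, 3}, {2}), linearized_factor({1}, {2, 3}))
print(lhs == rhs)
```

Here `lhs` encodes $x_1 - w_{12}$, and the two degree-3 factors on the right cancel in their $w_{13}$ and $w_{123}$ terms, which is exactly the surrogation used in the proof of Lemma 1.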


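Next, in order to establish the equivalence of $X_d$ to $X$ under binary restrictions on $x$, and to establish the necessary machinery for our convex hull characterization, consider the following results.

LEMMA 2. For any $d \in \{1, \ldots, n\}$, consider the constraint set

$$Z_d = \{(x, w) : f_d(J_1, J_2) \ge 0 \text{ for all } (J_1, J_2) \text{ of order } d\}. \tag{2.5}$$

Let $\bar{x}$ be any binary vector. Then $(\bar{x}, \bar{w}) \in Z_d$ if and only if $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all sets $J$ such that $|J| = 1, \ldots, d$, where $\bar{w}_J \equiv \bar{x}_j$ when $|J| = 1$.

Proof. If $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all sets $J$ such that $|J| = 1, \ldots, d$, we have that the value at $(\bar{x}, \bar{w})$ of $f_d(J_1, J_2)$ equals that of $F_d(J_1, J_2)$ for all $(J_1, J_2)$, and so $(\bar{x}, \bar{w}) \in Z_d$. Hence, let us now prove the converse by induction. Note that it is obviously true for $d = 1$, and so assume that it is true for $Z_1, \ldots, Z_{d-1}$ and consider $Z_d$, $d \ge 2$. Consider any $J$ with $|J| = d \ge 2$, and let $\{k, t\} \subseteq J$. Since $F_d(J - k, k) = (1 - x_k)\big(\prod_{j \in J - k} x_j\big)$, the constraint $f_d(J - k, k) \ge 0$ is

$$w_{J-k} - w_J \ge 0 \quad \text{for any } k \in J. \tag{2.6}$$

Similarly, since $F_d(J - k - t, \{k, t\}) = \big(\prod_{j \in J - k - t} x_j\big)(1 - x_k - x_t + x_k x_t)$, the constraint $f_d(J - k - t, \{k, t\}) \ge 0$ is

$$w_J \ge w_{J-t} + w_{J-k} - w_{J-k-t} \quad \text{for any } \{k, t\} \subseteq J, \tag{2.7}$$

where $w_{J-k-t} \equiv 1$ if $J \equiv \{k, t\}$ itself. Now, if $\prod_{j \in J} \bar{x}_j = 0$, then $\bar{w}_{J-k} = \prod_{j \in J-k} \bar{x}_j = 0$ for some $k \in J$, and since $\bar{w}_J \ge 0$ from the constraint $f_d(J, \emptyset) \ge 0$, we have from (2.6) that $\bar{w}_J = 0$. On the other hand, if $\prod_{j \in J} \bar{x}_j = 1$, then by the induction hypothesis, we have $\bar{w}_{J-t} = \bar{w}_{J-k} = \bar{w}_{J-k-t} = 1$, and so from (2.6) and (2.7) we obtain $\bar{w}_J = 1$. Therefore $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all $J$ with $|J| = d$ as well, and the proof is complete.

We are now ready to establish the hierarchy in the relaxations $X_d$ of $X$, for $d = 1, \ldots, n$. This is addressed in the following result.

THEOREM 1. Let $X_0$ be the linear programming relaxation of $X$, and let conv $(X)$ denote the convex hull of $X$. Define

$$X_{Pd} = \{x : (x, w) \in X_d\} \tag{2.8}$$

to be the projection of $X_d$ onto the $x$-space for $d = 1, \ldots, n$. Then we have,

$$X_{P0} \supseteq X_{P1} \supseteq X_{P2} \supseteq \cdots \supseteq X_{Pn} \supseteq \text{conv}\,(X), \tag{2.9}$$

where $X_{P0} \equiv X_0$.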

Proof. Consider any $d \in \{1, \ldots, n\}$, and let $(\bar{x}, \bar{w}) \in X_d$, so that $\bar{x} \in X_{Pd}$. We will first show that the same values for $\bar{x}$ and the components of $\bar{w}$ defining $X_{d-1}$ satisfy the constraints in $X_{d-1}$, hence implying that $\bar{x} \in X_{P(d-1)}$, and consequently that $X_{Pd} \subseteq X_{P(d-1)}$. Toward this end, consider any inequality $\sum_{j=1}^{n} \alpha_{rj} x_j \ge \beta_r$, $r \in \{1, \ldots, R_1\}$. Let $(J_1, J_2)$ be any pair of order $(d - 1)$, and denote $J_3 = N - (J_1 \cup J_2)$. ($J_1$ and $J_2$ are both empty if $d = 1$.) Then by (2.3a) of $X_d$, we have for any $k \in J_3$ that

$$\Big(\sum_{j \in J_1} \alpha_{rj} - \beta_r\Big) f_d(J_1, J_2 + k) + \sum_{j \in J_3 - k} \alpha_{rj}\, f_{d+1}(J_1 + j, J_2 + k) \ge 0$$

and

$$\Big(\sum_{j \in J_1} \alpha_{rj} - \beta_r\Big) f_d(J_1 + k, J_2) + \alpha_{rk}\, f_d(J_1 + k, J_2) + \sum_{j \in J_3 - k} \alpha_{rj}\, f_{d+1}(J_1 + k + j, J_2) \ge 0.$$

Summing these last two inequalities and using (2.4b), we get

$$\Big(\sum_{j \in J_1} \alpha_{rj} - \beta_r\Big) f_{d-1}(J_1, J_2) + \sum_{j \in J_3} \alpha_{rj}\, f_d(J_1 + j, J_2) \ge 0 \tag{2.10}$$

where $f_0(\emptyset, \emptyset) \equiv 1$. However, note that (2.10) is precisely the constraint $\sum_{j=1}^{n} \alpha_{rj} x_j \ge \beta_r$ if $d = 1$ (whence $J_1 = J_2 = \emptyset$), and is the constraint (2.3a) of the set $X_{d-1}$ if $d \ge 2$. Hence, $(\bar{x}, \bar{w})$ satisfies the constraints (2.3a) in $X_{d-1}$. In an identical manner, we get that $(\bar{x}, \bar{w})$ satisfies the constraints (2.3b) in $X_{d-1}$. Furthermore, if $d \in \{1, \ldots, n - 1\}$, then by Lemma 1, $(\bar{x}, \bar{w})$ satisfies the constraints (2.3c) of $X_{d-1}$, and if $d = n$, then (2.3c) for $X_d$ and $X_{d-1}$ are identical. Hence, we have that $(\bar{x}, \bar{w})$ satisfies all the constraints in $X_{d-1}$, and so $\bar{x} \in X_{P(d-1)}$. This establishes that $X_{Pd} \subseteq X_{P(d-1)}$ for $d = 1, \ldots, n$.

To complete the proof, we need to show that conv $(X) \subseteq X_{Pn}$. If conv $(X) = \emptyset$ the result follows trivially. Otherwise, let $\bar{x} \in X$, and define $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all the $2^n - (n + 1)$ sets $J$ such that $|J| = 2, \ldots, n$. Then by construction, $(\bar{x}, \bar{w}) \in X_n$, and so $\bar{x} \in X_{Pn}$. Hence $X \subseteq X_{Pn}$, and so conv $(X) \subseteq X_{Pn}$ and the proof is complete.

COROLLARY 1. Let $d \in \{1, \ldots, n\}$. Then $(\bar{x}, \bar{w})$ with $\bar{x}$ binary is a feasible solution to $X_d$ if and only if $\bar{x} \in X$ and $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all sets $J$ such that $|J| = 2, \ldots, D$, where $D = \min\{d + 1, n\}$.

Proof. If $(\bar{x}, \bar{w})$ with $\bar{x}$ binary is a feasible solution to $X_d$, then since $X \equiv \{x \in X_0 : x \text{ binary}\}$ and since $\bar{x} \in X_0$ from (2.9) of Theorem 1, we have that $\bar{x} \in X$. Moreover, since $(\bar{x}, \bar{w}) \in Z_D$ by (2.3c), we have from Lemma 2 that $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all sets $J$ with $|J| = 2, \ldots, D$. The converse of the corollary follows directly from the construction of $X_d$, and the proof is complete.


3. Convex hull characterizations. We will now show that in the hierarchy described in Theorem 1, the final relaxation $X_{Pn}$ is indeed the convex hull of $X$. We begin by first demonstrating that the system $Z_n$ defined in (2.5) describes a polytope with all binary valued vertices. As we will see later, this forms a fundamental system which characterizes the convex hull of feasible solutions for any linear or polynomial zero-one programming problem. Toward this end, consider the following result which describes a useful linear transformation.

LEMMA 3. Consider the linear transformation

$$y_{J_1} = f_n(J_1, \bar{J}_1) \quad \text{for all } J_1 \subseteq N \equiv \{1, \ldots, n\},\ J_1 \ne \emptyset,\ \bar{J}_1 = N - J_1. \tag{3.1a}$$

Then this describes a nonsingular linear transformation with an inverse

$$w_{J_1} = \sum_{J \subseteq \bar{J}_1} y_{J_1 \cup J} \quad \text{for all } J_1 \subseteq N,\ J_1 \ne \emptyset, \tag{3.1b}$$

where $w_{J_1} \equiv x_j$ when $J_1 = \{j\}$, that is, when $|J_1| = 1$.

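Proof. Note that writing (3.1a) as $y = Bw$, (3.1b) asserts that $Ty = TBw = w$, thus implying that the square matrices $T$ and $B$ are nonsingular. Hence, to prove the result, let us show that (3.1b) holds under (3.1a).

Consider any $J_1 \subseteq N$, $J_1 \ne \emptyset$, and let $y_J$ for all $J \subseteq N$ be as defined in (3.1a). Observing that for each $J \subseteq N$, $f_n(J, \bar{J}) = \sum_{L \subseteq \bar{J}} (-1)^{|L|} w_{J \cup L}$, we get by direct algebraic manipulation that

$$\sum_{J \subseteq \bar{J}_1} y_{J_1 \cup J} = \sum_{J \subseteq \bar{J}_1} f_n\big(J_1 \cup J, \overline{J_1 \cup J}\big) = \sum_{J \subseteq \bar{J}_1}\ \sum_{L \subseteq \bar{J}_1 - J} (-1)^{|L|}\, w_{J_1 \cup J \cup L} = \sum_{M \subseteq \bar{J}_1} w_{J_1 \cup M} \sum_{L \subseteq M} (-1)^{|L|} = w_{J_1} + \sum_{\substack{M \subseteq \bar{J}_1 \\ M \ne \emptyset}} w_{J_1 \cup M}\,(1 - 1)^{|M|} = w_{J_1}.$$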

This completes the proof.

Example 1. To illustrate, note that for $n = 3$, putting

$$y_{123} = w_{123},\quad y_{12} = (w_{12} - w_{123}),\quad y_{13} = (w_{13} - w_{123}),\quad y_{23} = (w_{23} - w_{123}),$$
$$y_3 = (x_3 - w_{13} - w_{23} + w_{123}),\quad y_2 = (x_2 - w_{12} - w_{23} + w_{123}),\quad \text{and}\quad y_1 = (x_1 - w_{12} - w_{13} + w_{123}), \tag{3.3}$$

where the expressions for $f_n(J, \bar{J})$ in (3.3) may be readily verified, we get

$$w_{123} = y_{123},\quad w_{23} = (y_{23} + y_{123}),\quad w_{13} = (y_{13} + y_{123}),\quad w_{12} = (y_{12} + y_{123}),$$
$$x_3 = (y_3 + y_{13} + y_{23} + y_{123}),\quad x_2 = (y_2 + y_{12} + y_{23} + y_{123}),\quad \text{and}\quad x_1 = (y_1 + y_{12} + y_{13} + y_{123}). \tag{3.4}$$

Next, consider the following (well known) result.

LEMMA 4. Consider the set $S_x = \{x : Gx \ge g\}$. Under a nonsingular linear transformation $y = H^{-1}x$, i.e., $x = Hy$, let the set $S_x$ be transformed into the set $S_y = \{y : GHy \ge g\}$. Then $S_x$ is bounded if and only if $S_y$ is bounded, $\bar{x}$ is a vertex of $S_x$ if and only if $H^{-1}\bar{x}$ is a vertex of $S_y$, and the row $G_r x \ge g_r$ defines a facet of $S_x$ if and only if the corresponding row $G_r H y \ge g_r$ defines a facet of $S_y$.

Proof. The proof follows directly from the definitions of extreme points, extreme directions, and facets.

THEOREM 2. Consider the set

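$$Z_n = \{(x, w) : f_n(J, \bar{J}) \ge 0 \text{ for all } J \subseteq N,\ \bar{J} = N - J\}.$$

Then $Z_n$ is bounded and $(\bar{x}, \bar{w})$ is an extreme point of $Z_n$ if and only if $(\bar{x}, \bar{w})$ is binary with $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all sets $J$ such that $|J| = 2, \ldots, n$. Moreover, every defining inequality of $Z_n$ is a facetial inequality.

Proof. Note that the sum of the expressions for all the factors $F_n(J, \bar{J})$ over $J \subseteq N$ is unity, and so,

$$F_n(\emptyset, N) = 1 - \sum_{\substack{J \subseteq N \\ J \ne \emptyset}} F_n(J, \bar{J}).$$

Consequently, we may write $Z_n$ as

$$Z_n = \Big\{(x, w) : f_n(J, \bar{J}) \ge 0 \text{ for all } J \subseteq N,\ J \ne \emptyset, \text{ and } \sum_{\substack{J \subseteq N \\ J \ne \emptyset}} f_n(J, \bar{J}) \le 1\Big\}. \tag{3.5a}$$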


Now, using the nonsingular linear transformation (3.1), the set $Z_n$ gets transformed into the $y$-space as

$$S_n = \Big\{y : \sum_{\substack{J \subseteq N \\ J \ne \emptyset}} y_J \le 1, \text{ and } y_J \ge 0 \text{ for all } J \subseteq N,\ J \ne \emptyset\Big\}. \tag{3.5b}$$

By Lemma 4, since $S_n$ is bounded, so is $Z_n$, and since all the inequalities of $S_n$ are facetial, so are the inequalities defining $Z_n$. Furthermore, the vertices of $S_n$ are either $y = 0$ or $y$ equal to a unit vector (one component unity and the others zero). Hence by Lemma 4 and (3.1b), the extreme points of $Z_n$ are binary. Therefore, if $(\bar{x}, \bar{w})$ is an extreme point of $Z_n$, it is binary and by Lemma 2, $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all sets $J$ such that $|J| = 2, \ldots, n$. Conversely, if $(\bar{x}, \bar{w})$ is binary with $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all $J$ with $|J| = 2, \ldots, n$ then $(\bar{x}, \bar{w}) \in Z_n$ by construction, and moreover, from (3.1a), the corresponding $\bar{y}_J = f_n(J, \bar{J}) = F_n(J, \bar{J})$ for all $J \subseteq N$, $J \ne \emptyset$. Hence, $\bar{y}$ is either the zero vector or a unit vector. Consequently, $\bar{y}$ is an extreme point of $S_n$, which implies that $(\bar{x}, \bar{w})$ is an extreme point of $Z_n$ by Lemma 4. This completes the proof.

COROLLARY 2. Let $(x, w) \in Z_n$. Then for any nonempty sets $J_1$, $J'$ such that $J_1 \subseteq J' \subseteq N$, we have $w_{J_1} \ge w_{J'}$, where $w_J \equiv x_j$ when $|J| = 1$.

Proof. Using the transformation (3.1) as in (3.5), we have from the nonnegativity of $y$, and from (3.1b), since $\bar{J}_1 = (J' - J_1) \cup \bar{J}'$, where $(J' - J_1)$ is well defined, that

$$w_{J_1} = \sum_{J \subseteq \bar{J}_1} y_{J_1 \cup J} \ \ge \sum_{J \subseteq \bar{J}'} y_{J_1 \cup (J' - J_1) \cup J} = \sum_{J \subseteq \bar{J}'} y_{J' \cup J} = w_{J'}.$$
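The transformation of Lemma 3 and the monotonicity in Corollary 2 can be traced numerically for $n = 3$. The sketch below is illustrative only: it reads (3.1b) as "$w_{J_1}$ is the sum of $y_K$ over all $K \supseteq J_1$" and (3.1a) as the inclusion-exclusion expansion of $f_n(J_1, \bar{J}_1)$ used in the proof of Lemma 3. The function names and the sample values of $y$ are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def nonempty_subsets(n):
    """All nonempty subsets of N = {1, ..., n}, as frozensets."""
    items = range(1, n + 1)
    return [frozenset(c) for size in range(1, n + 1) for c in combinations(items, size)]


def w_from_y(y, n):
    """(3.1b): w_{J1} is the sum of y_K over all K that contain J1."""
    sets = nonempty_subsets(n)
    return {J1: sum(y[K] for K in sets if J1 <= K) for J1 in sets}


def y_from_w(w, n):
    """(3.1a): y_{J1} = f_n(J1, N - J1), expanded by inclusion-exclusion."""
    sets = nonempty_subsets(n)
    return {
        J1: sum((-1) ** (len(K) - len(J1)) * w[K] for K in sets if J1 <= K)
        for J1 in sets
    }


n = 3
# A sample point of S_n in (3.5b): nonnegative entries summing to 28/32 <= 1.
y = {J: Fraction(i + 1, 32) for i, J in enumerate(nonempty_subsets(n))}
w = w_from_y(y, n)

round_trip_ok = y_from_w(w, n) == y
monotone_ok = all(w[A] >= w[B] for A in w for B in w if A <= B)
print(round_trip_ok, monotone_ok)
```

Exact rational arithmetic is used so that the round trip through (3.1b) and (3.1a) can be compared with `==` rather than with a floating-point tolerance.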

The above corollary establishes a relationship between variable values which can be useful in deriving implied inequalities. Observe that it is a generalization of inequalities (2.6) in that by restricting $J'$ and $J_1$ to satisfy $|J'| = |J_1| + 1$, (2.6) results. Let us now establish the main result of this section.

THEOREM 3. Let $X_n$ and $X_{Pn}$ be as defined by (2.3) and (2.8), respectively. Then $(\bar{x}, \bar{w})$ is an extreme point of $X_n$ if and only if $(\bar{x}, \bar{w})$ is a binary feasible solution to $X_n$ and, moreover, any such solution satisfies $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all sets $J$ such that $|J| = 2, \ldots, n$. Furthermore, $X_{Pn} = \text{conv}\,(X)$.

Proof. Note that from (2.3), we may write $X_n$ as

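$$X_n = \Big\{(x, w) : \text{for each } J \subseteq N,\ \bar{J} = N - J, \text{ we have, } \Big(\sum_{j \in J} \alpha_{rj} - \beta_r\Big) f_n(J, \bar{J}) \ge 0 \text{ for } r = 1, \ldots, R_1,$$
$$\Big(\sum_{j \in J} \alpha_{rj} - \beta_r\Big) f_n(J, \bar{J}) = 0 \text{ for } r = R_1 + 1, \ldots, R_1 + R_2, \text{ and } f_n(J, \bar{J}) \ge 0\Big\}. \tag{3.6}$$

Now, define

$$E = \Big\{J \subseteq N : \sum_{j \in J} \alpha_{rj} - \beta_r < 0 \text{ for some } r \in \{1, \ldots, R_1\} \text{ or } \sum_{j \in J} \alpha_{rj} - \beta_r \ne 0 \text{ for some } r \in \{R_1 + 1, \ldots, R_1 + R_2\}\Big\} \tag{3.7}$$

and let $\bar{E}$ denote the complement of $E$ with respect to the power set of $N$. Observe that for any $J \subseteq N$, if $J \in E$ then the corresponding $(R_1 + R_2 + 1)$ constraints in (3.6) are equivalent to the constraint $f_n(J, \bar{J}) = 0$, while if $J \in \bar{E}$, these are equivalent to the constraint $f_n(J, \bar{J}) \ge 0$. Hence, we may rewrite $X_n$ as

$$X_n = \{(x, w) : f_n(J, \bar{J}) = 0 \text{ for } J \in E,\ f_n(J, \bar{J}) \ge 0 \text{ for } J \in \bar{E}\}. \tag{3.8}$$

But this means that $X_n$ is a subface of $Z_n$, and so the extreme points of $X_n$ are precisely those extreme points of $Z_n$ which are feasible to $X_n$. Hence, if $(\bar{x}, \bar{w})$ is an extreme point of $X_n$, then from Theorem 2, $(\bar{x}, \bar{w})$ is binary with $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all sets $J$ with $|J| = 2, \ldots, n$. Conversely, if $(\bar{x}, \bar{w})$ is a binary feasible solution to $X_n$, then $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all sets $J$ with $|J| = 2, \ldots, n$ by Corollary 1, and moreover from Theorem 2, $(\bar{x}, \bar{w})$ is an extreme point of $Z_n$, and hence of $X_n$ as well.

Finally, note that if $\bar{x}$ is an extreme point of $X_{Pn}$, then since $X_{Pn}$ is a projection of $X_n$, there exists a $\bar{w}$ such that $(\bar{x}, \bar{w})$ is an extreme point of $X_n$. From above, $(\bar{x}, \bar{w})$ is therefore binary with $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all sets $J$ with $|J| = 2, \ldots, n$, and so by Corollary 1, $\bar{x} \in X$. Hence, the extreme points of $X_{Pn}$ belong to $X$, and since $X_n$ is bounded by Theorem 2, we get $X_{Pn} \subseteq \text{conv}\,(X)$. From (2.9) of Theorem 1, this means that $X_{Pn} = \text{conv}\,(X)$ and the proof is complete.

Example 2. Consider the set $X = \{x = (x_1, x_2) : 2x_1 + 2x_2 \ge 1,\ x \text{ binary}\}$, and note that conv $(X) = \{x : x_1 + x_2 \ge 1,\ x \le e\}$. The set $X_0 = \{x : 2x_1 + 2x_2 \ge 1,\ 0 \le x \le e\}$. To construct $X_1$, we multiply the constraint $2x_1 + 2x_2 \ge 1$ by the factors $x_1$, $x_2$, $(1 - x_1)$ and $(1 - x_2)$ of degree 1, and include the nonnegativities on the factors $x_1 x_2$, $x_1(1 - x_2)$, $(1 - x_1)x_2$ and $(1 - x_1)(1 - x_2)$ of degree 2. Linearizing these sets of constraints as in Step 2 of the reformulation procedure gives constraint sets (2.3a) and (2.3c), respectively, as below.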

$$X_1 = \{(x, w) : x_1 + 2w_{12} \ge 0,\ \ x_2 + 2w_{12} \ge 0,\ \ x_1 + 2x_2 - 2w_{12} \ge 1,\ \ 2x_1 + x_2 - 2w_{12} \ge 1,$$
$$w_{12} \ge 0,\ \ x_1 - w_{12} \ge 0,\ \ x_2 - w_{12} \ge 0,\ \ -x_1 - x_2 + w_{12} \ge -1\}.$$

Note that the constraints of $X_0$ are obtainable from $X_1$ by surrogating appropriate constraints. For example, summing constraints 1 and 3 gives $2x_1 + 2x_2 \ge 1$, summing constraints 5 and 6 gives $x_1 \ge 0$, and summing constraints 7 and 8 gives $x_1 \le 1$. Moreover, note that the point $x = (1/2, 0) \in X_0$, but putting $x_2 = 0$ in $X_1$ implies that $w_{12} = 0$ from constraints 5 and 7, and so $x_1 = 1$ from constraints 3 and 8. Hence, $(1/2, 0)$ is infeasible to $X_{P1}$, and so, $X_0$ strictly contains $X_{P1}$. However, the solution $\{x = (1/3, 1/3),\ w_{12} = 0\} \in X_1$ while $x$ is infeasible to conv $(X)$. Therefore, $X_{P1}$ strictly contains conv $(X)$.

Finally, consider $X_2$. Using the factors $x_1 x_2$, $x_1(1 - x_2)$, $(1 - x_1)x_2$ and $(1 - x_1)(1 - x_2)$ of degree 2 on the constraint $2x_1 + 2x_2 \ge 1$ gives $3x_1 x_2 \ge 0$, $x_1(1 - x_2) \ge 0$, $x_2(1 - x_1) \ge 0$, and $(-1)(1 - x_1)(1 - x_2) \ge 0$, respectively. Hence, $E = \{\emptyset\}$ in (3.7), and so from (3.8), we get $X_2 = \{(x, w) : w_{12} \ge 0,\ x_1 - w_{12} \ge 0,\ x_2 - w_{12} \ge 0,\ w_{12} = x_1 + x_2 - 1\}$. Substituting the fourth constraint for $w_{12}$ into the first three constraints, we obtain $X_{P2} = \text{conv}\,(X)$ as in Theorem 3. Hence for this example, $X_0 \supset X_{P1} \supset X_{P2} = \text{conv}\,(X)$.

We have established, via Theorems 1 and 3, that $X_0 \supseteq X_{P1} \supseteq \cdots \supseteq X_{Pn} = \text{conv}\,(X)$. Observe that one can view the process of constructing this hierarchy of tighter relaxations to be in the spirit of Chvátal (1973) as a sequential process in which the factors $x_j$ and $(1 - x_j)$ for $j = 1, \ldots, n$ are used to multiply the constraints of $X_{d-1}$, which are then linearized using $w_J \cdot x_k = w_{J \cup k}$ for all $J \subseteq N$, in order to produce $X_d$ for $d = 1, \ldots, n$. (This sequential technique produces certain additional implied constraints in $X_d$, which can be avoided by prohibiting the use of the factor $x_j$ or $(1 - x_j)$ on a constraint which has previously been treated by either of these factors.)

Another useful fact to note at this point is an alternative representation of $Z_n$ and $X_n$, given by (3.5a) and (3.8), respectively, in the $(x, y)$-variable space under the transformation (3.1). Note that under (3.1) $Z_n$ of (3.5a) gets transformed into $S_n$ given by (3.5b) in the $y$-space which can be written equivalently in the $(x, y)$ space using (3.1b) as

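$$Z_n(x, y) = \Big\{(x, y) : x_j = \sum_{J \subseteq N - j} y_{J \cup j} \text{ for } j = 1, \ldots, n,\ \ \sum_{\substack{J \subseteq N \\ J \ne \emptyset}} y_J + y_0 = 1,\ \ y_0 \ge 0 \text{ and } y_J \ge 0 \text{ for all } J \subseteq N,\ J \ne \emptyset\Big\} \tag{3.9}$$

where we have introduced a slack variable $y_0$ in the final constraint of $Z_n$. Accordingly, $X_n$ of (3.8) can be written as

$$X_n(x, y) = \Big\{(x, y) : x_j = \sum_{J \subseteq N - j} y_{J \cup j} \text{ for } j = 1, \ldots, n,\ \ \sum_{\substack{J \subseteq N \\ J \ne \emptyset}} y_J + y_0 = 1,$$
$$y_0 \ge 0 \text{ with } y_0 = 0 \text{ if } \emptyset \in E,\ \ y_J \ge 0 \text{ with } y_J = 0 \text{ if } J \in E \text{ for all } J \subseteq N,\ J \ne \emptyset\Big\}. \tag{3.10a}$$

This means that the projected set $X_{Pn}$ can equivalently be written as

$$X_{Pn} = \{x : (x, y) \in X_n(x, y)\}. \tag{3.10b}$$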
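For a small linear instance, the set $E$ and hence the $y$-variables that survive in (3.10a) can be listed by enumeration. The following sketch is a reading aid under stated assumptions, not the authors' code: it takes $E$ to be the collection of sets $J$ whose 0-1 incidence vector violates some constraint of (2.1), which is how (3.7) is used in Example 2, and it represents each constraint as a hypothetical pair `(alpha, beta)`.

```python
from itertools import combinations


def all_subsets(n):
    """All subsets of N = {1, ..., n}, smallest first, as frozensets."""
    items = range(1, n + 1)
    return [frozenset(c) for size in range(n + 1) for c in combinations(items, size)]


def row_sum(alpha, J):
    """Left-hand side sum_j alpha_j x_j at the incidence vector of J."""
    return sum(alpha.get(j, 0) for j in J)


def excluded_sets(n, inequalities, equalities=()):
    """Sets J placed in E: the incidence vector of J violates some constraint.

    Each constraint is (alpha, beta) with alpha a dict {j: coefficient};
    inequalities are read as '>= beta' and equalities as '== beta'.
    """
    E = []
    for J in all_subsets(n):
        bad_ineq = any(row_sum(alpha, J) < beta for alpha, beta in inequalities)
        bad_eq = any(row_sum(alpha, J) != beta for alpha, beta in equalities)
        if bad_ineq or bad_eq:
            E.append(J)
    return E


def surviving_points(n, E):
    """Incidence vectors of the sets outside E, i.e. the y-variables kept in (3.10a)."""
    excluded = set(E)
    return [tuple(1 if j in J else 0 for j in range(1, n + 1))
            for J in all_subsets(n) if J not in excluded]


# Example 2: X = {x binary : 2 x_1 + 2 x_2 >= 1}.
E_example2 = excluded_sets(2, inequalities=[({1: 2, 2: 2}, 1)])
print(E_example2, surviving_points(2, E_example2))
```

For Example 2 only the empty set is excluded, so $y_0 = 0$ and $x$ is a convex combination of $(1, 0)$, $(0, 1)$ and $(1, 1)$. The enumeration is exponential in $n$ and is meant only to make the roles of $E$ and $\bar{E}$ concrete.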

Observe that the representation of $X_{Pn} = \text{conv}\,(X)$ as given by (3.10) has $(n + 1)$ constraints in (at most) $n + 2^n$ nonzero variables, and is essentially in the spirit of writing $x$ as a convex combination of all feasible binary solutions in $X$, where $y_0$ represents $x_j = 0$ for all $j = 1, \ldots, n$, and $y_J$ for a set $J \subseteq N$, $J \ne \emptyset$ represents $x_j = 1$ for all $j \in J$ and $x_j = 0$ for all $j \in \bar{J}$.

A third point to note is that the set $X_{Pn} \equiv \text{conv}\,(X)$ can be characterized in the $x$-space itself via its projection from the $(x, w)$ or the $(x, y)$ space. For example, consider its representation in the $(x, y)$ space as given by (3.10) and, for the sake of convenience, let us write it in the following form with obvious notation:

$$X_{Pn} = \{x : \text{there exists a } y \ge 0 \text{ such that } Ay = x \text{ and } e \cdot y = 1\} \tag{3.11}$$

where the vector $y$ is comprised of components corresponding to the set $\bar{E}$ (i.e., the zero components have been eliminated), and where $e$ is an appropriate vector of ones. We assume here that $\bar{E} \ne \emptyset$ so that $X_{Pn} \ne \emptyset$. Now, $x \in X_{Pn}$ if and only if $0 = \min\{0 \cdot y : Ay = x,\ e \cdot y = 1,\ y \ge 0\}$, i.e., $0 = \max\{\pi x + \pi_0 : \pi A + e\pi_0 \le 0\}$, where $\pi = (\pi_1, \ldots, \pi_n)$ and $\pi_0$ are Lagrange multipliers associated with the constraints $Ay = x$ and $e \cdot y = 1$, respectively. Define the set

$$P = \{(\pi, \pi_0) : \pi A + e\pi_0 \le 0\}, \tag{3.12}$$

and note that $P$ is clearly unbounded. Now, consider the following result.

THEOREM 4. Let $X_{Pn}$ and $P$ be defined by (3.11) and (3.12), respectively. Then $X_{Pn}$ is full dimensional in $R^n$ if and only if the origin is a vertex of $P$. In this case, $P$ has extreme directions $(\pi^k, \pi_0^k)$, $k = 1, \ldots, K$, where $\pi_0^k = 0$, $+1$, or $-1$, and

$$X_{Pn} = \{x : \pi^k x + \pi_0^k \le 0,\ k = 1, \ldots, K\}. \tag{3.13}$$

[...]

COROLLARY 3. The projection of the set $X_{Pn} \ne \emptyset$ of (3.11) onto the $x$-space is given in general by (3.13), where $(\pi^{k1}, \pi^{k2}, \pi_0^{k1}, \pi_0^{k2})$, $k = 1, \ldots, K$ are the extreme directions of the set $\{(\pi^1, \pi^2, \pi_0^1, \pi_0^2) \ge 0 : (\pi^1 - \pi^2)A + e(\pi_0^1 - \pi_0^2) \le 0\}$, and where $\pi^k \equiv (\pi^{k1} - \pi^{k2})$, $\pi_0^k \equiv \pi_0^{k1} - \pi_0^{k2}$ for each $k = 1, \ldots, K$.

COROLLARY 4. The inequality $\pi x + \pi_0 \le 0$ [...]

[...] $0 < \alpha_2 < 1$, $0 < \alpha_3 < 1$, and $\alpha_2 + \alpha_3 \ge 1$. Hence, from (3.7), the set $E = \{\emptyset, \{2\}, \text{ and } \{3\}\}$, and so from (3.10), noting accordingly that $y_0 = y_2 = y_3 = 0$, we have $X_{Pn} = \{x : y_1 + y_{12} + y_{13} + y_{123} = x_1,\ y_{12} + y_{23} + y_{123} = x_2,\ y_{13} + y_{23} + y_{123} = x_3,\ y_1 + y_{12} + y_{13} + y_{23} + y_{123} = 1, \text{ and } y \ge 0\}$. Denoting $\pi_1$, $\pi_2$, $\pi_3$ and $\pi_0$ as the Lagrange multipliers

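with respect to the four equality constraints in $X_{Pn}$ we obtain the set $P$ in (3.12) as

$$P = \{(\pi, \pi_0) : \pi_1 + \pi_0 \le 0,\ \ \pi_1 + \pi_2 + \pi_0 \le 0,\ \ \pi_1 + \pi_3 + \pi_0 \le 0,\ \ \pi_2 + \pi_3 + \pi_0 \le 0,\ \text{ and }\ \pi_1 + \pi_2 + \pi_3 + \pi_0 \le 0\}.$$

[...]

$$\sum_{t \in T_r} \alpha_{rt}\, p(J_{rt}, \bar{J}_{rt}) \ge \beta_r \ \text{ for } r = 1, \ldots, R_1, \qquad \sum_{t \in T_r} \alpha_{rt}\, p(J_{rt}, \bar{J}_{rt}) = \beta_r \ \text{ for } r = R_1 + 1, \ldots, R_1 + R_2, \qquad 0 \le x \le e,\ x \text{ integer} \tag{4.1}$$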

where $p(J_1, J_2) \equiv \big(\prod_{j \in J_1} x_j\big)\big(\prod_{j \in J_2} (1 - x_j)\big)$ are polynomial terms for $(J_1, J_2) \equiv (J_{rt}, \bar{J}_{rt})$, $t \in T_r$, $r = 0, \ldots, R_1 + R_2$. Note that this is a generalization of the linear zero-one program for which we have $T_r = \{1, \ldots, n\}$, $J_{rt} = \{t\}$ and $\bar{J}_{rt} = \emptyset$ for all $t \in T_r$, $r = 0, \ldots, R_1 + R_2$. Now, let $\delta$ denote the maximum degree over all polynomial terms in PP (including those in the objective function). Observe that $\delta = 1$ for the linear case. The hierarchy of relaxations $X_d$, $d = 1, \ldots, n$ for problem PP is obtained by constructing $X_d$ as follows for any $d \in \{1, \ldots, n\}$.

Step 1. Multiply each of the $R_1 + R_2$ constraints in (4.1) by each factor $F_d(J_1, J_2)$ of degree $d$, using the relationships $x_j^2 = x_j$ and $x_j(1 - x_j) = 0$ for $j = 1, \ldots, n$. Include the constraints $F_D(J_1, J_2) \ge 0$ for all $(J_1, J_2)$ of order $D = \min\{d + \delta, n\}$.

Step 2. Linearize the resulting polynomial constraint expressions by substituting $w_J$ for $\prod_{j \in J} x_j$ for all sets $J$ such that $|J| \ge 2$. Let this resulting polyhedron in the $(x, w)$ space be denoted by $X_d$ as before. (For uniformity, let $X_0$ be the set of $x$-variables such that $(x, w)$ is feasible to the constraints obtained via the above linearization process applied using $d = 0$. Note that for the linear case, this is the usual linear programming relaxation as before.)
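For the linear special case, the multiplication in Step 1 and the substitution in Step 2 can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation: it assumes $F_d(J_1, J_2) = \prod_{j \in J_1} x_j \prod_{j \in J_2}(1 - x_j)$, handles a single constraint $\sum_j \alpha_j x_j \ge \beta$, and uses the hypothetical names `linearized_factor` and `multiply_and_linearize`. The result is a dictionary of coefficients on $w_J$ (the empty set holds the constant, $\{j\}$ stands for $x_j$) to be read as "$\ge 0$".

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def linearized_factor(J1, J2):
    """f(J1, J2) as {index set J: coefficient of w_J}."""
    J1, J2 = frozenset(J1), frozenset(J2)
    return {J1 | L: (-1) ** len(L) for L in subsets(J2)}


def multiply_and_linearize(alpha, beta, J1, J2, n):
    """Linearize (sum_j alpha_j x_j - beta) * F(J1, J2); the result is '>= 0'."""
    J1, J2 = frozenset(J1), frozenset(J2)
    total = {}

    def accumulate(expr, scale):
        for key, coeff in expr.items():
            total[key] = total.get(key, 0) + scale * coeff

    # x_j * F = F for j in J1 (x_j^2 = x_j); x_j * F = 0 for j in J2 (x_j (1 - x_j) = 0).
    accumulate(linearized_factor(J1, J2), sum(alpha.get(j, 0) for j in J1) - beta)
    # For every other j, x_j * F(J1, J2) is the factor F(J1 + j, J2) of one degree higher.
    for j in range(1, n + 1):
        if j not in J1 and j not in J2:
            accumulate(linearized_factor(J1 | {j}, J2), alpha.get(j, 0))
    return {key: coeff for key, coeff in total.items() if coeff != 0}


# Example 2: the constraint 2 x_1 + 2 x_2 >= 1 times the factors x_1 and (1 - x_1).
alpha, beta = {1: 2, 2: 2}, 1
by_x1 = multiply_and_linearize(alpha, beta, {1}, set(), n=2)
by_one_minus_x1 = multiply_and_linearize(alpha, beta, set(), {1}, n=2)
print(by_x1)
print(by_one_minus_x1)
```

The two results encode $x_1 + 2w_{12} \ge 0$ and $-1 + x_1 + 2x_2 - 2w_{12} \ge 0$, matching the first and third constraints of $X_1$ in Example 2. Extending the sketch to polynomial constraints would require multiplying $F_d$ by general terms $p(J_{rt}, \bar{J}_{rt})$, which is not attempted here.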


Now, let us generalize the results of §2 and §3. (Throughout, $w_J \equiv x_j$ whenever $|J| = 1$.) First of all, Lemmas 1 and 2 remain unaffected. Next, consider Theorem 1, and define $X_{Pd}$ as in (2.8) for $d = 1, \ldots, n$. Then as in (2.9), we again obtain the hierarchy of relaxations $X_0 \supseteq X_{P1} \supseteq X_{P2} \supseteq \cdots \supseteq X_{Pn} \supseteq \text{conv}\,(X)$, with the proof of Theorem 1 repeating the same arguments based on the following observation. Consider any $d \in \{1, \ldots, n\}$, let $(J_1, J_2)$ be of order $(d - 1)$, denote $J_3 = N - (J_1 \cup J_2)$, and let $k \in J_3$. Note that by virtue of (2.4), if $p(J_{rt}, \bar{J}_{rt})$ is any polynomial term, then

$$p(J_{rt}, \bar{J}_{rt})\,[F_d(J_1, J_2 + k) + F_d(J_1 + k, J_2)] = p(J_{rt}, \bar{J}_{rt})\,[F_{d-1}(J_1, J_2)].$$

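Hence, any constraint in $X_{d-1}$ constructed by multiplying some constraint in PP by $F_{d-1}(J_1, J_2)$ can be obtained as before by summing the corresponding constraints in $X_d$ constructed by multiplying this same constraint by $F_d(J_1, J_2 + k)$ and by $F_d(J_1 + k, J_2)$ for any $k \in J_3$. Therefore the proof of Theorem 1 applies here as well. Corollary 1 also follows since if $(\bar{x}, \bar{w})$ is any binary solution feasible to $X_d$ for $d \in \{1, \ldots, n\}$, then by Theorem 1, $(\bar{x}, \bar{w})$ satisfies the constraints in $X_0$, and since $\bar{w}_J = \prod_{j \in J} \bar{x}_j$ for all sets $J$ such that $|J| = 1, \ldots, D \equiv \min\{d + \delta, n\}$ by Lemma 2, this means that $\bar{x} \in X$. The converse statement of Corollary 1 follows by construction.

Lemmas 3 and 4, Theorem 2, and Corollary 2 remain unaffected and hold as before. Next, to establish Theorem 3, consider any constraint $\sum_{t \in T_r} \alpha_{rt}\, p(J_{rt}, \bar{J}_{rt}) \ge \beta_r$ in problem PP for some $r \in \{1, \ldots, R_1\}$, and suppose that this constraint is multiplied by a factor $F_n(J, \bar{J})$ in constructing $X_n$, where $J \subseteq N$ and $\bar{J} = N - J$. Then, for any polynomial term $p(J_{rt}, \bar{J}_{rt})$, the term $p(J_{rt}, \bar{J}_{rt}) \cdot F_n(J, \bar{J}) \equiv F_n(J, \bar{J})$ provided $J_{rt} \subseteq J$ and $\bar{J}_{rt} \subseteq \bar{J}$, and is zero otherwise. Therefore, denoting $T_r(J) = \{t \in T_r : J_{rt} \subseteq J \text{ and } \bar{J}_{rt} \subseteq \bar{J}\}$ for all $J \subseteq N$ and $r = 1, \ldots, R_1 + R_2$, we get

$$X_n = \Big\{(x, w) : \text{for each } J \subseteq N,\ \bar{J} = N - J, \text{ we have, } \Big(\sum_{t \in T_r(J)} \alpha_{rt} - \beta_r\Big) f_n(J, \bar{J}) \ge 0 \text{ for } r = 1, \ldots, R_1,$$
$$\Big(\sum_{t \in T_r(J)} \alpha_{rt} - \beta_r\Big) f_n(J, \bar{J}) = 0 \text{ for } r = R_1 + 1, \ldots, R_1 + R_2, \text{ and } f_n(J, \bar{J}) \ge 0\Big\}. \tag{4.2}$$

Now, let us define

$$E = \Big\{J \subseteq N : \sum_{t \in T_r(J)} \alpha_{rt} - \beta_r < 0 \text{ for some } r \in \{1, \ldots, R_1\} \text{ or } \sum_{t \in T_r(J)} \alpha_{rt} - \beta_r \ne 0 \text{ for some } r \in \{R_1 + 1, \ldots, R_1 + R_2\}\Big\} \tag{4.3}$$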

and let $\bar E$ be the complement of $E$ with respect to the power set of $N$. Then $X_n$ given by (4.2) is precisely of the form in (3.8). Therefore, the proof of Theorem 3 now follows identically, and so does the ensuing discussion in the foregoing section, including Theorem 4 and Corollaries 3 and 4. Hence, concerning the solution of the zero-one polynomial programming problem PP, we have by Lemma 2 and Corollary 1 that for any $d \in \{0, 1, \ldots, n\}$, Problem PP is equivalent to the linear mixed-integer zero-one programming problem over the feasible region $X_d \cap \{x : x \text{ binary}\}$, where the objective function is linearized by substituting $w_J = \prod_{j \in J} x_j$ for all sets $J$ such that $|J| \ge 2$. Moreover, by Theorems 1 and 3, the corresponding $X_d$ provides a hierarchy of tighter linear programming relaxations for increasing values of $d = 0, 1, \ldots, n$, with the final relaxation representing the convex hull of $X$.
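To make the construction behind (4.2) concrete, the following sketch enumerates the index sets $J \subseteq N$ at which some constraint of a small polynomial program is violated, that is, the sets for which (4.2) forces $f_n(J, \bar J) = 0$. The data layout (a term as a triple of coefficient, index set of $x_j$ factors, index set of $(1 - x_j)$ factors), the function names, and the 0-based indexing are assumptions made for illustration only.

```python
from itertools import combinations


def subsets(n):
    """Yield every subset of {0, ..., n-1} as a frozenset."""
    for size in range(n + 1):
        for combo in combinations(range(n), size):
            yield frozenset(combo)


def violating_sets(n, constraints):
    """Return the sets J at which some constraint fails.

    Each constraint is (terms, sense, beta), where sense is ">=" or "==" and
    each term is (alpha, plus, minus), standing for
    alpha * prod_{j in plus} x_j * prod_{j in minus} (1 - x_j).
    A term survives multiplication by the degree-n factor indexed by J
    exactly when plus is inside J and minus is disjoint from J.
    """
    violated = set()
    for J in subsets(n):
        for terms, sense, beta in constraints:
            total = sum(alpha for alpha, plus, minus in terms
                        if plus <= J and not (minus & J))
            slack = total - beta
            if (sense == ">=" and slack < 0) or (sense == "==" and slack != 0):
                violated.add(J)
                break
    return violated


# Hypothetical instance: x0*x1 + x2 >= 1 on three binary variables.
example = [([(1, frozenset({0, 1}), frozenset()),
             (1, frozenset({2}), frozenset())], ">=", 1)]
print(sorted(sorted(J) for J in violating_sets(3, example)))
```

Each subset $J$ corresponds to the binary point with $x_j = 1$ exactly for $j \in J$, so the sets not returned are the feasible zero-one solutions whose convex hull the level-$n$ relaxation describes.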

5. Illustrations for deriving implied inequalities. In §3, we characterized the convex hull of feasible solutions $X_{Pn}$ to $X$ via $X_n$ given by (3.8) in the $(x, w)$ space, or via $X_n(x, y)$ given by (3.10) in the $(x, y)$ space, where $E$ is defined in (3.7). We also saw using the representation (3.10), for example, that valid inequalities can be derived by surrogating the constraints of $X_n(x, y)$ using dual feasible solutions to this problem defined by the set $P$ in (3.12) (see Corollary 4). We demonstrate in this section that although $X_n$ has a combinatorial structure, this structure is special enough to permit us to use it implicitly in deriving valid inequalities. This is particularly true if the original combinatorial optimization problem possesses some special identifiable structure.

As a simple example, consider the problem: maximize $\{x_1 : 2x_1 + 2x_2 + \cdots + 2x_n = n,\ x \text{ binary}\}$, where $n \ge 3$ and odd, which was shown by Jeroslow (1974) to require an exponential effort via branch and bound. This problem is trivially infeasible, and so the convex hull of feasible solutions is empty. We readily recognize this fact since $E$ in (3.7) contains all the subsets of $N$ and so $y_\emptyset = y_J = 0$ for all $J \subseteq N$, contradicting that the $y$-variables must sum to unity in (3.10a). We note that the set $X_1 \ne \emptyset$ (since $x_i = 1/2$ for all $i$, and $w_{ij} = (n-2)/[4(n-1)]$ for all $(i, j)$, $i < j$, is feasible to $X_1$), but that $X_d = \emptyset$ for $d \ge (n+1)/2$. This follows since for any $d$ such that $(n+1)/2 \le d \le n$, we obtain the same contradiction as above that $w_J = 0$ for all sets $J$ such that $|J| \ge 2$, implying that $x_j = 0$ for all $j = 1, \ldots, n$. To illustrate for $n = 3$, the problem over $X_1$ is given below, where we have used the abbreviated form of $X_1$ as explained at the end of §3.

$$
\begin{aligned}
\text{Maximize } \{ x_1 :\ & 2x_1 + 2x_2 + 2x_3 = 3, \quad x_1 = 2w_{12} + 2w_{13}, \quad x_2 = 2w_{12} + 2w_{23}, \quad x_3 = 2w_{23} + 2w_{13}, \\
& \text{and (2.3c) with } D = 2, \text{ i.e., } [\, w_{ij} \le x_i,\ w_{ij} \le x_j,\ w_{ij} \ge x_i + x_j - 1, \text{ and } w_{ij} \ge 0 \,] \\
& \text{for } i, j = 1, 2, 3,\ i < j \,\}.
\end{aligned} \tag{5.1}
$$

[...] $\ge 0$, this gives $w_{12} = w_{13} = w_{23} = 0$, which from the second, third, and fourth constraints in (5.1) implies that $x_1 = x_2 = x_3 = 0$, an inconsistency with respect to the first constraint in (5.1). Hence, $X_{P2} = X_{P3} = \operatorname{conv}(X) = \emptyset$.
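The fractional point quoted in this example can be verified with exact arithmetic. The sketch below assumes the first-level constraints are obtained by multiplying $2x_1 + \cdots + 2x_n = n$ by each $x_i$ and linearizing $x_i x_j$ as $w_{ij}$ and $x_i^2$ as $x_i$; the function names are hypothetical, and only the symmetric point $x_i = 1/2$, $w_{ij} = (n-2)/[4(n-1)]$ is tested.

```python
from fractions import Fraction
from itertools import product


def first_level_point_is_feasible(n):
    """Check the symmetric point x_i = 1/2, w_ij = (n-2)/(4(n-1)) for given n."""
    x = Fraction(1, 2)
    w = Fraction(n - 2, 4 * (n - 1))
    # Original equality: 2 * (x_1 + ... + x_n) = n.
    original = 2 * n * x == n
    # Equality multiplied by x_i and linearized:
    # 2 x_i + 2 * sum_{j != i} w_ij = n x_i.
    # (The product with (1 - x_i) is the original minus this one, so it is implied.)
    times_xi = 2 * x + 2 * (n - 1) * w == n * x
    # Bounds on w_ij: max(0, x_i + x_j - 1) <= w_ij <= min(x_i, x_j).
    bounds = max(0, 2 * x - 1) <= w <= x
    return original and times_xi and bounds


def has_binary_solution(n):
    """Brute-force search for a 0-1 vector with 2 * sum(x) = n."""
    return any(2 * sum(x) == n for x in product((0, 1), repeat=n))


for n in (3, 5, 7):
    print(n, first_level_point_is_feasible(n), has_binary_solution(n))
```

For odd $n$ the first check succeeds while the brute-force search finds no binary solution, which is the gap between the first-level relaxation and the (empty) convex hull that the example is meant to expose.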

As a somewhat more involved example, consider the edge covering problem defined on an undirected, connected graph $G(V, A)$, which seeks a minimum weighted set of edges such that for each vertex $v \in V$, at least one incident edge is chosen. Hence for this problem, defining $x_i = 1$ if edge $i \in A$ is chosen, and zero otherwise, we have

$$X = \Big\{ x : \sum_{i \in I(v)} x_i \ge 1 \text{ for each } v \in V,\ [\ldots] \Big\}. \tag{5.2}$$

[...] Now, for any $v \in V$, the constraint $\sum_{i \in I(v)} x_i \ge 1$ in (5.2) asserts that

$$y_J = 0 \quad \text{for all } J \subseteq \bar I(v), \tag{5.4}$$

since any such $J$ belongs to $E$, where $\bar I(v) = N - I(v)$ is the complement of $I(v)$ with respect to the index set $N$ for the arcs in the graph. In particular, $\emptyset \in E$ and so $y_\emptyset = 0$. Next, consider any $J \subseteq N$ such that [...]

$$[\ldots]\ w_{12} \ge 0,\ w_{13} \ge 0,\ w_{23} \ge 0,\ x_1 + x_2 + x_3 - w_{12} - w_{13} - w_{23} \le 1,\ w_{13} \le x_1,\ w_{23} \le x_2,\ w_{13} + w_{23} \le x_3 + w_{12},\ [\ldots] \tag{5.12}$$

[...] $y \ge 0$ such that $Ay = (x, w_{II})^t$, and $e \cdot y = 1 \}$,

which is of the form (3.1). Then defining $P$ as in (3.12), we could have obtained a characterization of $X_3(x, w_{II})$ in terms of its facets as in (3.13) of Theorem 4. Note that in this case, the set $P$ of (3.12) is given by

$$P = \{ (\pi, \pi_0) : [\ldots] \}. \tag{5.13}$$

Employing the nonsingular linear transformation [...], which has as its inverse [...], transforms (5.13) into the set $\{ \pi' : [\ldots] \}$, which is of the form (5.11). Using Lemma 4, the extreme directions enumerated for (5.11) may be transformed as above to determine the extreme directions of $P$ in (5.13), and hence characterize $X_3(x, w_{II})$ in terms of its facets as in (3.13). This leads to the same set (5.12) obtained above. Hence, this illustrates how the desired projection can be conducted using the original inequalities of $X_n$, or under the transformation of Lemma 3, as one finds convenient.

6. Concluding discussions. In this paper we have presented a constructive technique for deriving tight relaxations for zero-one linear (or polynomial) programming problems, which yield a hierarchy ranging from the usual linear programming relaxation to the convex hull of feasible solutions. From a computational viewpoint, even the first-order relaxation $X_1$ can provide an algorithmic advantage beyond the continuous relaxation $X_0$, as is evident in the results of Adams and Sherali (1986) on zero-one quadratic programming problems. By including partial sets of constraints from tighter relaxations, or by developing methods for generating desirable valid inequalities based on appropriate surrogates of the tighter relaxations, we can enhance the computational performance still further. Note that, as shown in Adams and Sherali (1987) using first-degree factors, this strategy for obtaining tighter relaxations is also applicable to linear mixed-integer zero-one programming problems, or to mixed-integer polynomial programming problems which are linear in the continuous variables for fixed values of the binary variables. The construction of a similar hierarchy for these problems as in the present paper, along with a characterization of facets, will be presented in a subsequent paper (see Sherali and Adams (1989)).


From a theoretical viewpoint, as illustrated in the foregoing section, we have the opportunity to exploit the structure of the constructive representation of the convex hull of feasible solutions given by (3.10) to derive valid inequalities. Although establishing that these are members of the set defining (3.13) is a different task altogether, any facetial properties of classes of inequalities generated in this fashion may be established by demonstrating the existence of n affinely independent feasible solutions lying on the defining hyperplane, as usual. The principal use of the characterization (3.10) is then to provide a unifying framework for studying known classes of facets for various combinatorial optimization problems, and to provide a mechanism for possibly discovering new classes of facets. This will be explored in future research.

REFERENCES

W. P. ADAMS AND H. D. SHERALI (1986), A tight linearization and an algorithm for zero-one quadratic programming problems, Management Sci., 32, pp. 1274-1290.
W. P. ADAMS AND H. D. SHERALI (1987), Linearization strategies for a class of zero-one mixed integer programming problems, Oper. Res., to appear.
E. BALAS (1975), Facets of the knapsack polytope, Math. Programming, 8, pp. 146-164.
E. BALAS (1976), Facets of one-dimensional and multi-dimensional knapsack polytopes, Instituto Nazionale di Alta Matematica, Symposia Mathematica, XIX, pp. 11-34.
E. BALAS (1983), Disjunctive programming and a hierarchy of relaxations for discrete optimization problems, MSRS-492, GSIA, Carnegie Mellon University, Pittsburgh, PA 15213. (Also see SIAM J. Algebraic Discrete Methods, 6 (1985), pp. 466-486.)
E. BALAS (1985), On the facial structure of scheduling polyhedra, MSRR-496(R), GSIA, Carnegie Mellon University, Pittsburgh, PA.
E. BALAS AND E. ZEMEL (1977), Facets of the knapsack polytope from minimal covers, SIAM J. Appl. Math., 34, pp. 119-148.
E. BALAS AND E. ZEMEL (1984), Lifting and complementing yields all the facets of positive zero-one programming polytopes, in Mathematical Programming, R. W. Cottle, M. L. Kelmanson, and B. Korte, eds., North Holland, Amsterdam, pp. 13-24.
V. CHVATAL (1973), Edmonds' polytopes and a hierarchy of combinatorial problems, Discrete Math., 4, pp. 305-337.
H. CROWDER, E. L. JOHNSON, AND M. W. PADBERG (1983), Solving large-scale zero-one linear programming problems, Oper. Res., 31, pp. 803-834.
H. CROWDER AND M. W. PADBERG (1980), Solving large-scale symmetric travelling salesman problems to optimality, Management Sci., 26, pp. 495-509.
J. EDMONDS (1965), Maximum matching and a polyhedron with 0-1 vertices, Journal Res. Nat. Bur. Standards (B), 69, pp. 125-130.
G. D. EPPEN AND R. K. MARTIN (1985), Solving multi-item capacitated lot sizing problems using variable redefinition, Graduate School of Business, University of Chicago.
F. GLOVER (1975), Improved linear integer programming formulations of nonlinear integer problems, Management Sci., 22, pp. 455-460.
F. GLOVER AND E. WOOLSEY (1973), Further reduction of zero-one polynomial programming problems to zero-one linear programming problems, Oper. Res., 21, pp. 156-161.
F. GLOVER AND E. WOOLSEY (1974), Converting the 0-1 polynomial programming problem to a 0-1 linear program, Oper. Res., 22, pp. 180-182.
M. GROTSCHEL AND M. W. PADBERG (1979), On the symmetric travelling salesman problem II: Lifting theorems and facets, Math. Programming, 16, pp. 281-302.
P. L. HAMMER, E. L. JOHNSON, AND U. N. PELED (1975), Facets of regular 0-1 polytopes, Math. Programming, 8, pp. 179-206.
K. HOFFMAN AND M. PADBERG (1985/6), LP-based combinatorial problem solving, Annals of OR, 4, pp. 145-194.
R. G. JEROSLOW (1974), Trivial integer programs unsolvable by branch-and-bound, Math. Programming, 6, pp. 105-109.
R. G. JEROSLOW (1980), Representation of unbounded optimizations as integer programs, J. Optim. Theory Appl., 30, pp. 339-351.
R. G. JEROSLOW (1983), Alternative formulations of mixed-integer programs, College of Industrial Management, Georgia Institute of Technology, Atlanta, GA.
R. G. JEROSLOW (1984a), Representability of functions, College of Industrial Management, Georgia Institute of Technology, Atlanta, GA.
R. G. JEROSLOW (1984b), Representability in mixed integer programming, I: Characterization results, College of Industrial Management, Georgia Institute of Technology, Atlanta, GA.
R. G. JEROSLOW (1985), Representability in mixed integer programming, II: A lattice of relaxations, College of Industrial Management, Georgia Institute of Technology, Atlanta, GA.
R. G. JEROSLOW AND J. K. LOWE (1984), Modelling with integer variables, Math. Programming Stud., 22, pp. 167-184.
R. G. JEROSLOW AND J. K. LOWE (1985), Experimental results on the new techniques for integer programming formulations, Journal of the Operational Research Society, 36, pp. 393-403.
E. L. JOHNSON, M. M. KOSTREVA, AND U. H. SUHL (1985), Solving 0-1 integer programming problems arising from large scale planning models, Oper. Res., 33, pp. 803-819.
E. L. JOHNSON AND U. H. SUHL (1980), Experiments in integer programming, Discrete Appl. Math., 2, pp. 39-55.
E. L. LAWLER, J. K. LENSTRA, A. H. G. RINNOOY KAN, AND D. SHMOYS (1985), The Travelling Salesman Problem: A Guided Tour of Combinatorial Optimization, John Wiley, Chichester.
K. R. MARTIN (1984), Generating alternative mixed-integer linear programming models using variable redefinition, Graduate School of Business, University of Chicago.
K. MARTIN AND L. SCHRAGE (1982), Subset coefficient reduction cuts for 0-1 mixed-integer programming, Graduate School of Business, University of Chicago.
K. MARTIN AND L. SCHRAGE (1983), Constraint aggregation and coefficient reduction cuts for mixed-0/1 linear programming, Graduate School of Business, University of Chicago.
R. R. MEYER (1975), Integer and mixed-integer programming models: general properties, J. Optim. Theory Appl., 16, pp. 191-206.
R. R. MEYER (1976), Mixed-integer minimization models for piecewise-linear functions of a single variable, Discrete Math., 16, pp. 163-171.
R. R. MEYER (1981), A theoretical and computational comparison of 'equivalent' mixed-integer formulations, Naval Res. Logist. Quart., 28, pp. 115-131.
R. R. MEYER, M. V. THAKKAR, AND W. P. HALLMAN (1980), Rational mixed-integer and polyhedral union minimization models, Math. Oper. Res., 5, pp. 135-146.
L. A. OLEY AND R. J. SJOUQUIST (1982), Automatic reformulation of mixed and pure integer models to reduce solution time in Apex IV, presented at the ORSA/TIMS Fall Meeting, San Diego, CA.
M. W. PADBERG (1973), On the facial structure of set packing polyhedra, Math. Programming, 5, pp. 199-215.
M. W. PADBERG (1979), Covering, packing and knapsack problems, Ann. Discrete Math., 4, pp. 265-287.
M. W. PADBERG (1980), (1, k)-configurations and facets for packing problems, Math. Programming, 18, pp. 94-99.
M. W. PADBERG (1988), The Boolean quadric polytope: Some characteristics, facets and relatives, Working Paper, New York University.
C. PETERSEN (1971), A note on transforming the product of variables to linear form in linear programs, Working Paper, Purdue University, Hammond, IN.
H. D. SHERALI AND W. P. ADAMS (1989), A hierarchy of relaxations and convex hull characterizations for mixed-integer zero-one programming problems, Working Paper, Virginia Polytechnic Institute and State University, Blacksburg, Virginia.
K. SPIELBERG AND U. SUHL (1980), An experimental software system for large scale 0-1 problems with efficient data structures and access to MPSX/370, IBM Research Report RC 8219, White Plains, NY.
T. J. VAN ROY AND L. A. WOLSEY (1984a), Solving mixed integer programs by automatic reformulation, CORE Discussion Paper No. 8432, Center for Operations Research and Econometrics, Universite Catholique de Louvain, Belgium.
T. J. VAN ROY AND L. A. WOLSEY (1983), Valid inequalities for mixed 0-1 programs, CORE Discussion Paper No. 8316, Center for Operations Research and Econometrics, Universite Catholique de Louvain, Belgium.
T. J. VAN ROY AND L. A. WOLSEY (1984b), MPSARX, a mathematical programming system with an automatic reformulation executor, CORE Computing Report 84-B-01, Center for Operations Research and Econometrics, Universite Catholique de Louvain, Belgium.
H. P. WILLIAMS (1974), Experiments in the formulation of integer programming problems, in Math. Programming Study 2, M. L. Balinski, ed., North Holland, Amsterdam, pp. 180-197.
H. P. WILLIAMS (1985), Model Building in Mathematical Programming, second ed., Wiley Interscience, New York.
L. A. WOLSEY (1975), Faces for a linear inequality in 0-1 variables, Math. Programming, 8, pp. 165-178.
L. A. WOLSEY (1976), Facets and strong valid inequalities for integer programs, Oper. Res., 24, pp. 367-373.
E. ZEMEL (1978), Lifting the facets of 0-1 polytopes, Math. Programming, 15, pp. 268-277.