Intersection Cuts for Nonlinear Integer Programming: Convexification Techniques for Structured Sets

arXiv:1302.2556v2 [math.OC] 12 Mar 2013

Sina Modaresi∗

Mustafa R. Kılınç†

Juan Pablo Vielma‡

March 14, 2013

Abstract. We study the generalization of split and intersection cuts from Mixed Integer Linear Programming to the realm of Mixed Integer Nonlinear Programming. Constructing such cuts requires calculating the convex hull of the difference of two convex sets with specific geometric structures. We introduce two techniques to give precise characterizations of such convex hulls and use them to construct split and intersection cuts for several classes of sets. In particular, we give simple formulas for split cuts for essentially all convex sets described by a single quadratic inequality and for more general intersection cuts for a wide variety of convex quadratic sets.

∗ Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, PA 15261, [email protected]

† Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, PA 15261, [email protected]

‡ Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139, [email protected]

1 Introduction.

An important area of Mixed Integer Linear Programming (MILP) is the characterization of the convex hull of specially structured non-convex polyhedral sets to develop strong valid inequalities or cutting planes such as split and intersection cuts [22, 23, 25, 31]. This approach has led to highly effective branch-and-cut algorithms [1, 15, 16, 45, 50], so there has recently been significant interest in extending the associated theoretical and computational results to the realm of Mixed Integer Nonlinear Programming (MINLP) [6, 7, 10, 17, 20, 26, 27, 28, 33, 46, 67]. Unfortunately, this extension requires the study of the convex hull of a non-convex and non-polyhedral set, which has proven significantly harder than the polyhedral case. Most of the known results in this area are limited to very specific sets [44, 66, 68] or to approximations of semi-algebraic sets through Semidefinite Programming (SDP) [35, 49, 56, 57, 58, 60, 61]. While some precise SDP representations of the convex hulls of semi-algebraic sets exist [40, 42, 43, 65], these require the use of auxiliary variables. Such higher dimensional, extended, or lifted representations are extremely powerful. However, there are theoretical and computational reasons to want representations in the original space and/or in the same class as the original set (e.g. representations that do not jump from quadratic basic semi-algebraic to SDP). We refer to characterizations that satisfy both these requirements as projected and class preserving. These two requirements are in general incompatible (e.g. the convex hull of the basic semi-algebraic set $\{x \in \mathbb{R}^2 : (x_1^2 - x_2)\,x_1 \geq 0,\ x_2 \geq 0\}$ has no projected basic semi-algebraic representation, but has a lifted basic semi-algebraic representation [59]).
Furthermore, even giving an algebraic characterization of the boundary of the convex hull of a variety [62, 63] or giving a projected SDP representation of the convex hull of certain varieties and quadratic semi-algebraic sets [64, 71, 72] requires very complex techniques from algebraic geometry. In this paper we show that simple class preserving projected characterizations can be given for a wide range of specially structured sets. More specifically, we show how two very simple techniques can be used to construct projected class preserving characterizations of the convex hull of certain non-polyhedral non-convex sets that mimic the special structures studied in MILP. These special structures correspond to the difference between two convex sets with simple geometry. The techniques we consider are tailored to this special structure and do not require any additional algebraic properties (e.g. being quadratic, basic semi-algebraic, etc.). Thanks to this, the resulting characterizations are quite general, but give simple closed form expressions. While the structures we study are somewhat specific, we use them to extend split cuts to essentially all convex sets described by a single quadratic inequality, and to extend general intersection cuts to a wide variety of quadratic sets of interest to trust region and lattice problems. The rest of this paper is organized as follows. We begin with Section 2 where we introduce some notation and review some known results. Section 3 then introduces an interpolation technique that can be used to construct split cuts for many classes of sets including convex quadratic sets. Finally, Section 4 introduces an aggregation technique that can be used to construct a wide array of general intersection cuts.

2 Notation, Known Results and Other Preliminaries.

We use the following notation. Let $e^i \in \mathbb{R}^n$ be the $i$-th unit vector and $I \in \mathbb{R}^{n \times n}$ be the identity matrix, where $n$ is an appropriate dimension that we omit if evident from the context. We also let $\|x\|_2 := \sqrt{\sum_{i=1}^n x_i^2}$ denote the Euclidean norm of a given vector $x \in \mathbb{R}^n$, and for a vector $v \in \mathbb{R}^n$, we let the projection onto its span be $P_v := \frac{v v^T}{\|v\|_2^2}$ and onto its orthogonal complement be $P_v^\perp := I - \frac{v v^T}{\|v\|_2^2}$. For a set $S \subseteq \mathbb{R}^n$, we let $\operatorname{int}(S)$ be its interior, $\operatorname{conv}(S)$ be its convex hull, and $\overline{\operatorname{conv}}(S)$ be the closure of its convex hull. For a function $F : \mathbb{R}^n \to \mathbb{R}$, we let $\operatorname{epi}(F) := \{(x,t) \in \mathbb{R}^n \times \mathbb{R} : F(x) \leq t\}$ be its epigraph, $\operatorname{gr}(F) := \{(x,t) \in \mathbb{R}^n \times \mathbb{R} : F(x) = t\}$ be its graph, and $\operatorname{hyp}(F) := \{(x,t) \in \mathbb{R}^n \times \mathbb{R} : F(x) \geq t\}$ be its hypograph. In addition, we let $[n] := \{1, \ldots, n\}$.

Definition 2.1 (Intersection and Split Cuts). Let $C, S \subseteq \mathbb{R}^n$ be two closed convex sets and $g : \mathbb{R}^n \to \mathbb{R}$ be an arbitrary function. We say inequality $g(x) \leq 0$ is:

• valid for $\operatorname{conv}(C \setminus \operatorname{int}(S))$ if $\operatorname{conv}(C \setminus \operatorname{int}(S)) \subseteq \{x \in \mathbb{R}^n : g(x) \leq 0\}$,

• an intersection cut for $C$ and $S$ if it is valid for $\operatorname{conv}(C \setminus \operatorname{int}(S))$ and $g$ is convex, and

• a non-trivial intersection cut for $C$ and $S$ if additionally $\{x \in C : g(x) > 0\} \neq \emptyset$.

We let a split be a set of the form $\{x \in \mathbb{R}^n : \pi^T x \in [\pi_0, \pi_1]\}$ for some $\pi \in \mathbb{R}^n \setminus \{0\}$ and $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < \pi_1$. If $S$ is a split, we say that the associated intersection cut is a split cut. If $S$ is additionally a split with $\pi = e^i$ for some $i \in [n]$, we say the associated split cut is an elementary split cut.

We note that the term intersection cut was introduced by Balas [8] for the case in which $C$ is a translated simplicial cone and its unique vertex is in $\operatorname{int}(S)$. In this setting, we have that $\operatorname{conv}(C \setminus \operatorname{int}(S))$ is closed and can be described by adding a single linear inequality to $C$. Furthermore, this single linear inequality has a simple formula dependent on the intersections of the extreme rays of $C$ with $S$. While we do not always have such intersection formulas for other classes of sets, we continue to use the term in the more general setting and avoid any additional qualifiers for simplicity. In particular, we do not use the term generalized intersection cut as it has already been used for the case of polyhedral $C$ and $S$ and in conjunction with an improved cut generation procedure for MILP [9].
The term split cut was introduced by Cook, Kannan, and Schrijver [24], and their original definition directly generalizes to non-polyhedral sets as in Definition 2.1.

The interest of intersection cuts for MILP and MINLP arises from the fact that if $\operatorname{int}(S) \cap (\mathbb{Z}^p \times \mathbb{R}^q) = \emptyset$, an intersection cut for $C$ and $S$ is valid for $\operatorname{conv}(C \cap (\mathbb{Z}^p \times \mathbb{R}^q))$. Hence, intersection cuts can be used to strengthen the continuous relaxation of MILP and MINLP problems. Intersection cuts are particularly attractive in the MILP setting, since they can be quite strong and can be easily constructed. They were extensively studied when they were first proposed in the 1970s [8, 38, 39] and have recently received renewed interest [23, 31]. Part of the relative simplicity and effectiveness of intersection cuts for MILP stems from two basic facts. The first one is that in the MILP setting, $C$ is a polyhedron (i.e. the continuous relaxation of a MILP is an LP). The second one is the fact that every convex set $S$ such that $\operatorname{int}(S) \cap \mathbb{Z}^n = \emptyset$ (usually denoted a lattice free convex set) and that is maximal with respect to inclusion for this property is also a polyhedron [51]. Restricting both $C$ and $S$ to be polyhedra gives intersection cuts for MILP several useful properties. For instance, if $C$ and $S$ are polyhedra, then $\overline{\operatorname{conv}}(C \setminus \operatorname{int}(S))$ is a polyhedron [31]. Hence, in the MILP setting, we can restrict our attention to linear intersection cuts. Furthermore, if $C$ is a translated simplicial cone and its unique vertex is in $\operatorname{int}(S)$, then $\operatorname{conv}(C \setminus \operatorname{int}(S))$ is closed, can be described by adding a single linear inequality to $C$, and this linear inequality has a relatively simple formula [8, 38, 39]. In particular, if $S$ is a split and $C$ is a polyhedron, then all linear intersection cuts for $C$ and $S$ can be constructed from simplicial relaxations of $C$ and hence have simple formulas [2, 29, 69]. Gomory Mixed Integer (GMI) cuts [38, 39] and Mixed Integer Rounding (MIR) cuts [52, 54, 55, 70] are two versions of these formulas that have made split cuts one of the most effective cuts for MILP [16]. For more information on the ongoing efforts to duplicate this effectiveness for other lattice free polyhedra, we refer the reader to [23, 31]. In this context, we note that $\operatorname{conv}(C \setminus \operatorname{int}(S))$ can fail to be closed even if $C$ and $S$ are polyhedra and $S$ is not a split (e.g. consider $C = \{x \in \mathbb{R}^2 : x_2 \geq 0\}$ and $S = \{x \in \mathbb{R}^2 : x_2 \leq 1,\ x_1 + x_2 \leq 1\}$). However, $\operatorname{conv}(C \setminus \operatorname{int}(S))$ is closed in the polyhedral case if $S$ is full-dimensional and the recession cone of $S$ is a linear subspace [4].

In the MINLP setting, there has been significant work on the computational use of linear split cuts [17, 20, 33, 46, 67]. From the theoretical side, we know that if $S$ is a split, then $\operatorname{conv}(C \setminus \operatorname{int}(S))$ is closed even if $C$ is not polyhedral [28]. With respect to formulas for intersection cuts, there has been some progress in the description of split cuts for quadratic sets in [6, 7, 10, 28]. Dadush et al. [28] show that, if $C$ is an ellipsoid and $S$ is a split, then $\operatorname{conv}(C \setminus \operatorname{int}(S))$ can be described by intersecting $C$ with either a linear half space, a linear transformation of the second-order cone (a.k.a. Lorentz cone), or an ellipsoidal cylinder. In addition, they give simple closed form expressions for all these linear and nonlinear split cuts. Independently, [10] studies split cuts for more general quadratic sets, but only for splits in which $\{x \in C : \pi^T x = \pi_0\}$ and $\{x \in C : \pi^T x = \pi_1\}$ are bounded. They give a procedure to find the associated split cuts, but do not give closed form expressions for them.
Finally, [6, 7] give a simple formula for an elementary split cut for the standard three dimensional second-order cone. While [10] develops a procedure to construct split cuts through a detailed algebraic analysis of quadratic constraints developed in [11], [6, 7, 28] give formulas for split cuts through simple geometric arguments. As we have recently shown at the MIP 2012 Workshop, these geometric techniques can be extended to additional quadratic and basic semi-algebraic sets [47]. In this paper we show that the principles behind these geometric arguments can be abstracted from the semi-algebraic setting to develop simple split cut formulas for a wider class of specially structured convex sets. This abstraction greatly simplifies the proofs and can be used to construct split cuts for essentially all convex sets described by a single quadratic inequality through simple linear algebra arguments. In addition to studying split cuts, we show how a commonly used aggregation technique can be used to develop formulas for general nonlinear intersection cuts for the case in which C and S are both non-polyhedral, but share a common structure. While a non-polyhedral S is not necessary in the MINLP settings (it still should be sufficient to consider maximal lattice free convex sets, which are polyhedral), they could still provide an advantage and are important in other settings such as trust region problems [12, 60] and lattice problems [18, 19, 53]. We finally note that similar results for the quadratic case have recently been independently developed in [3]. To describe our approach, we use the following additional definitions. Because we restrict to the cases in which conv (C \ int (S)) is closed, we drop the closure from the definitions. Definition 2.2. Let C, S ⊆ Rn be two closed convex sets and g : Rn → R be an arbitrary function. 
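We say inequality $g(x) \leq 0$ is a:

• binding valid cut for $\operatorname{conv}(C \setminus \operatorname{int}(S))$ if it is valid and $\{x \in C \setminus \operatorname{int}(S) : g(x) = 0\} \neq \emptyset$, and

• sufficient cut if $\{x \in C : g(x) \leq 0\} \subseteq \operatorname{conv}(C \setminus \operatorname{int}(S))$.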

In this setting, we refer to C as the base set and to S as the forbidden set. Binding valid cuts correspond to valid cuts that cannot be improved by translations, and sufficient cuts are those that are violated outside conv (C \ int(S)). We can show that a convex cut that is sufficient and valid is enough to describe conv (C \ int(S)) together with the original constraints defining C. Our approach to generating such cuts will be to construct cuts that are binding and valid by design, and that have simple structures from which sufficiency can easily be proved.

3 Nonlinear Split Cuts Through Interpolation.

In this section we consider the case in which the base set is either the epigraph or lower level set of a convex function and the forbidden set corresponds to a split set. Our cut construction approach is based on a simple interpolation technique that can be more naturally explained for epigraphs of specially structured functions. For this reason, we begin with such a case and then consider special cases of non-epigraphical sets and discuss the limits of the interpolation technique. While the structures for which the technique yields simple formulas are quite specific, we can consider broader classes by considering linear transformations. We illustrate the power of this approach by showing how the interpolation technique yields formulas for split cuts for convex quadratic sets. For convenience, we use the following notation for the sets associated to specific classes of splits.

Definition 3.1. Let $\pi \in \mathbb{R}^n \setminus \{0\}$ and $\pi_0, \pi_1, \hat{\pi} \in \mathbb{R}$ be such that $\pi_0 < \pi_1$ and $\hat{\pi} \neq 0$.

• General Sets: For a closed convex base set $C \subseteq \mathbb{R}^n$ and a forbidden split set $S = \{x \in \mathbb{R}^n : \pi^T x \in [\pi_0, \pi_1]\}$, we define
\[ C^{\pi,\pi_0,\pi_1} := \operatorname{conv}(C \setminus \operatorname{int}(S)) = \operatorname{conv}\left(\{x \in C : \pi^T x \leq \pi_0\} \cup \{x \in C : \pi^T x \geq \pi_1\}\right). \]

• Epigraphical Sets: For a closed convex base set $C \subseteq \mathbb{R}^n \times \mathbb{R}$ and a forbidden split set $S_x = \{(x,t) \in \mathbb{R}^n \times \mathbb{R} : \pi^T x \in [\pi_0, \pi_1]\}$ that does not affect $t$, we define
\[ C^{\pi,\pi_0,\pi_1} := \operatorname{conv}(C \setminus \operatorname{int}(S_x)) = \operatorname{conv}\left(\{(x,t) \in C : \pi^T x \leq \pi_0\} \cup \{(x,t) \in C : \pi^T x \geq \pi_1\}\right), \]
and for a forbidden split set $S_{x,t} = \{(x,t) \in \mathbb{R}^n \times \mathbb{R} : \pi^T x + \hat{\pi} t \in [\pi_0, \pi_1]\}$ that does affect $t$, we define
\[ C^{\pi,\hat{\pi},\pi_0,\pi_1} := \operatorname{conv}(C \setminus \operatorname{int}(S_{x,t})) = \operatorname{conv}\left(\{(x,t) \in C : \pi^T x + \hat{\pi} t \leq \pi_0\} \cup \{(x,t) \in C : \pi^T x + \hat{\pi} t \geq \pi_1\}\right). \]
Note that in this latter case, we also allow disjunctions of the form $\hat{\pi} t \leq \pi_0$ and $\hat{\pi} t \geq \pi_1$ for which $\pi = 0$.

• Epigraphical Sets with Elementary Disjunctions: For a closed convex base set $C \subseteq \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}$ and a forbidden elementary split set $S = \{(z,y,t) \in \mathbb{R} \times \mathbb{R}^n \times \mathbb{R} : z \in [\pi_0, \pi_1]\}$, we define
\[ C^{\pi_0,\pi_1} := \operatorname{conv}(C \setminus \operatorname{int}(S)) = \operatorname{conv}\left(\{(z,y,t) \in C : z \leq \pi_0\} \cup \{(z,y,t) \in C : z \geq \pi_1\}\right). \]

3.1 Epigraphical Sets with Simple Split Disjunctions.

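Let $F : \mathbb{R} \to \mathbb{R}$ be a closed convex function,
\[ \operatorname{epi}(F) := \{(z,t) \in \mathbb{R} \times \mathbb{R} : F(z) \leq t\} \tag{1} \]
be its epigraph, and consider an elementary split disjunction on $z$. As illustrated in Figure 1(a), $\operatorname{epi}(F)^{\pi_0,\pi_1} = \operatorname{epi}(F) \cap \operatorname{epi}(G)$ for
\[ G(z) = \frac{F(\pi_1) - F(\pi_0)}{\pi_1 - \pi_0}\, z + \frac{\pi_1 F(\pi_0) - \pi_0 F(\pi_1)}{\pi_1 - \pi_0}. \tag{2} \]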

[Figure 1 panels: (a) Graph of F in black and graph of G in blue. (b) Naive friends construction. (c) Friends by following the slope.]

Figure 1: Interpolation technique for univariate functions.

Indeed, since $G$ is a linear (affine) function and hence $\operatorname{epi}(F) \cap \operatorname{epi}(G)$ is convex, it suffices to show that $G(z) \leq t$ is a valid and sufficient cut. We can check that $G(z) \leq t$ is a binding valid cut by design. Indeed, $G$ is the (affine) linear interpolation of $F$ through $z = \pi_0$ and $z = \pi_1$. Convexity of $F$ then implies that this interpolation is below $F$ outside $z \in (\pi_0, \pi_1)$. To show that the cut is sufficient, we need to show that any point $(\bar{z}, \bar{t}) \in \operatorname{epi}(F)$ that satisfies the cut is in $\operatorname{epi}(F)^{\pi_0,\pi_1}$. To achieve this, we can find two points $(z^0, t^0)$ and $(z^1, t^1)$ in $\operatorname{epi}(F)$ such that $z^0 \leq \pi_0$, $z^1 \geq \pi_1$, and $(\bar{z}, \bar{t}) \in \operatorname{conv}(\{(z^0, t^0), (z^1, t^1)\})$. Following [30], we will denote these points the friends of $(\bar{z}, \bar{t})$. One naive way to construct the friends is to wiggle $(\bar{z}, \bar{t})$ by decreasing and increasing $z$ until it reaches $\pi_0$ and $\pi_1$, respectively. However, as illustrated in Figure 1(b), this can result in one of the friends falling outside $\operatorname{epi}(F)$. Fortunately, as illustrated in Figure 1(c), we can always wiggle by following the slope of cut $G$ to assure that the friends are at least in $\operatorname{epi}(G)$. Correctness then follows by noting that $G(z) = F(z)$ at $z = \pi_0$ and $z = \pi_1$, since $G(z) \leq t$ is a binding valid cut. The previous reasoning can be formalized for multivariate functions and general split disjunctions, as stated in Proposition 3.1.
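The univariate argument can be checked numerically. The following is a minimal sketch, assuming the illustrative choice $F(z) = z^2$ with the split $0 \leq z \leq 1$ (neither is taken from the text); the helper names `interpolation_cut` and `friends` are hypothetical. It builds $G$ as in (2) and moves a point satisfying the cut to $z = \pi_0$ and $z = \pi_1$ along the slope of $G$.

```python
def interpolation_cut(F, pi0, pi1):
    """Affine interpolation G of F through z = pi0 and z = pi1, as in (2)."""
    slope = (F(pi1) - F(pi0)) / (pi1 - pi0)
    intercept = (pi1 * F(pi0) - pi0 * F(pi1)) / (pi1 - pi0)
    return (lambda z: slope * z + intercept), slope


def friends(z_bar, t_bar, slope, pi0, pi1):
    """Move (z_bar, t_bar) to z = pi0 and z = pi1 along the slope of G."""
    left = (pi0, t_bar + slope * (pi0 - z_bar))
    right = (pi1, t_bar + slope * (pi1 - z_bar))
    return left, right


# Illustrative data (an assumption, not from the text): F(z) = z^2, split 0 <= z <= 1.
F = lambda z: z * z
pi0, pi1 = 0.0, 1.0
G, slope = interpolation_cut(F, pi0, pi1)  # for this data G(z) = z

# A point of epi(F) strictly inside the split that satisfies the cut G(z) <= t.
z_bar, t_bar = 0.25, 0.5
(z0, t0), (z1, t1) = friends(z_bar, t_bar, slope, pi0, pi1)

# Weight on the left friend so that z_bar = alpha * z0 + (1 - alpha) * z1.
alpha = (pi1 - z_bar) / (pi1 - pi0)
in_epi_F = (F(z0) <= t0) and (F(z1) <= t1)
print(in_epi_F, alpha * t0 + (1 - alpha) * t1)
```

Both friends land in $\operatorname{epi}(F)$ here because they stay in $\operatorname{epi}(G)$ and $G$ agrees with $F$ at the two ends of the split.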

Proposition 3.1. Let $F : \mathbb{R}^n \to \mathbb{R}$ be a closed convex function, $\pi \in \mathbb{R}^n \setminus \{0\}$, and $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < \pi_1$. If $G : \mathbb{R}^n \to \mathbb{R}$ is a closed convex function such that
\[ G(x) = F(x) \quad \forall x \text{ s.t. } \pi^T x \in \{\pi_0, \pi_1\} \tag{3a} \]
\[ G(x) \leq F(x) \quad \forall x \text{ s.t. } \pi^T x \notin (\pi_0, \pi_1), \tag{3b} \]
and if every $(\bar{x}, \bar{t}) \in \operatorname{epi}(G)$ such that $\pi^T \bar{x} \in (\pi_0, \pi_1)$ has friends in $\{(x,t) \in \operatorname{epi}(F) : \pi^T x \notin (\pi_0, \pi_1)\}$, then
\[ \operatorname{epi}(F)^{\pi,\pi_0,\pi_1} = \operatorname{epi}(F) \cap \operatorname{epi}(G). \tag{4} \]

Proof. Let $Q = \{(x,t) \in \operatorname{epi}(F) : \pi^T x \notin (\pi_0, \pi_1)\}$. We have that
\[ Q \subseteq \operatorname{epi}(F) \cap \operatorname{epi}(G) \subseteq \operatorname{conv}(Q) = \operatorname{epi}(F)^{\pi,\pi_0,\pi_1}, \tag{5} \]
where the first containment comes from (3b) and the last from the friends condition. The result follows by taking convex hull in (5) and noting that $\operatorname{epi}(F) \cap \operatorname{epi}(G)$ is convex because both $F$ and $G$ are convex.

Our general approach to use Proposition 3.1 is to construct a convex function that yields binding valid cuts (i.e. satisfies (3)) and to follow its slope to construct friends for sufficiency. We now consider two structures in which the appropriate interpolation can easily be constructed once we identify its general structure.

3.1.1 Separable Functions.

If $F$ is a separable function of the form $F(z,y) = f(z) + g(y)$ with $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R}^d \to \mathbb{R}$ closed convex functions, we can simply interpolate $F$ parametrically on $y$ to obtain
\[ G(z,y) = \frac{F(\pi_1, y) - F(\pi_0, y)}{\pi_1 - \pi_0}\, z + \frac{\pi_1 F(\pi_0, y) - \pi_0 F(\pi_1, y)}{\pi_1 - \pi_0}. \tag{6} \]
In this case, the interpolation simplifies to
\[ G(z,y) = \frac{f(\pi_1) - f(\pi_0)}{\pi_1 - \pi_0}\, z + \frac{\pi_1 f(\pi_0) - \pi_0 f(\pi_1)}{\pi_1 - \pi_0} + g(y), \tag{7} \]
which is convex on $(z,y)$ and linear on $z$. Our original univariate argument follows through directly. We can then use the general version of this parametric interpolation to show the following result.

Proposition 3.2. Let $\pi \in \mathbb{R}^n \setminus \{0\}$, $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < \pi_1$, $g : \mathbb{R}^n \to \mathbb{R}$ and $f : \mathbb{R} \to \mathbb{R}$ be closed convex functions,
\[ S_{g,f} := \left\{(x,t) \in \mathbb{R}^n \times \mathbb{R} : g\left(P_\pi^\perp x\right) + f\left(\pi^T x\right) \leq t\right\}, \]
$a = \frac{f(\pi_1) - f(\pi_0)}{\pi_1 - \pi_0}$, and $b = \frac{\pi_1 f(\pi_0) - \pi_0 f(\pi_1)}{\pi_1 - \pi_0}$. Then we have that
\[ S_{g,f}^{\pi,\pi_0,\pi_1} = \left\{(x,t) \in S_{g,f} : g\left(P_\pi^\perp x\right) + a \pi^T x + b \leq t\right\}. \]
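The cut of Proposition 3.2, together with the friends that shift a point along $\pi$ and correct $t$ by the slope $a$ (equations (8) and (9)), can be evaluated directly. The sketch below is only an illustration under assumed data: $f(u) = u^2$, $g$ the Euclidean norm, $\pi = (1,2)$ and the split $0 \leq \pi^T x \leq 1$ are our choices, and the helper names are hypothetical.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def proj_perp(pi, x):
    """P_pi^perp x = x - (pi^T x / ||pi||_2^2) pi."""
    s = dot(pi, x) / dot(pi, pi)
    return [xi - s * p for xi, p in zip(x, pi)]


def separable_split_cut(f, g, pi, pi0, pi1):
    """Return F, G and the slope a from Proposition 3.2."""
    a = (f(pi1) - f(pi0)) / (pi1 - pi0)
    b = (pi1 * f(pi0) - pi0 * f(pi1)) / (pi1 - pi0)
    F = lambda x: g(proj_perp(pi, x)) + f(dot(pi, x))
    G = lambda x: g(proj_perp(pi, x)) + a * dot(pi, x) + b
    return F, G, a


def friends(x_bar, t_bar, pi, pi0, pi1, a):
    """Friends (8)-(9): shift x_bar along pi, correct t with the slope a."""
    base, n2, u = proj_perp(pi, x_bar), dot(pi, pi), dot(pi, x_bar)
    x0 = [bi + pi0 / n2 * p for bi, p in zip(base, pi)]
    x1 = [bi + pi1 / n2 * p for bi, p in zip(base, pi)]
    return (x0, t_bar + a * (pi0 - u)), (x1, t_bar + a * (pi1 - u))


# Illustrative data (assumed): f(u) = u^2, g = Euclidean norm, split 0 <= pi^T x <= 1.
f = lambda u: u * u
g = lambda y: math.sqrt(dot(y, y))
pi, pi0, pi1 = [1.0, 2.0], 0.0, 1.0
F, G, a = separable_split_cut(f, g, pi, pi0, pi1)

x_bar = [0.5, 0.0]   # pi^T x_bar = 0.5 lies strictly inside the split
t_bar = G(x_bar)     # a point on the boundary of the cut
(x0, t0), (x1, t1) = friends(x_bar, t_bar, pi, pi0, pi1, a)
```

For this data $F(\bar{x}) < G(\bar{x})$, so the cut removes every $(\bar{x}, t)$ with $F(\bar{x}) \leq t < G(\bar{x})$, while the two friends sit on the boundary of the split.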

Proof. Let $F(x) = g\left(P_\pi^\perp x\right) + f\left(\pi^T x\right)$ and $G(x) = g\left(P_\pi^\perp x\right) + a \pi^T x + b$. Interpolation condition (3) holds by the definition of $a$ and $b$ and convexity of $f$. Now let $(\bar{x}, \bar{t}) \in \operatorname{epi}(G)$ be such that $\pi^T \bar{x} \in (\pi_0, \pi_1)$. To construct the friends of $(\bar{x}, \bar{t})$, we consider two cases.

Case 1. If $f(\pi_0) = f(\pi_1)$, then $(\bar{x}, \bar{t})$ can be written as a convex combination of $(x^0, \bar{t})$ and $(x^1, \bar{t})$, where
\[ x^0 = P_\pi^\perp \bar{x} + \frac{\pi_0}{\|\pi\|_2^2}\, \pi \quad \text{and} \quad x^1 = P_\pi^\perp \bar{x} + \frac{\pi_1}{\|\pi\|_2^2}\, \pi. \]
We have that $(x^0, \bar{t}), (x^1, \bar{t}) \in \operatorname{epi}(F) \cap \{(x,t) : \pi^T x \notin (\pi_0, \pi_1)\}$, since $\pi^T x^0 = \pi_0$ and $\pi^T x^1 = \pi_1$, and because interpolation condition (3a) holds.

Case 2. If $f(\pi_0) \neq f(\pi_1)$, $(\bar{x}, \bar{t})$ can be written as a convex combination of $(x^0, t^0)$ and $(x^1, t^1)$ given by
\[ x^0 = P_\pi^\perp \bar{x} + \frac{\pi_0}{\|\pi\|_2^2}\, \pi, \qquad t^0 = \bar{t} + a\left(\pi_0 - \pi^T \bar{x}\right), \tag{8} \]
and
\[ x^1 = P_\pi^\perp \bar{x} + \frac{\pi_1}{\|\pi\|_2^2}\, \pi, \qquad t^1 = \bar{t} + a\left(\pi_1 - \pi^T \bar{x}\right). \tag{9} \]
Indeed, since $\pi^T \bar{x} \in (\pi_0, \pi_1)$, there exists $\alpha \in (0,1)$ such that $\pi^T \bar{x} = \alpha \pi_0 + (1 - \alpha) \pi_1$. One can check that $(\bar{x}, \bar{t}) = \alpha (x^0, t^0) + (1 - \alpha)(x^1, t^1)$. We again have that $(x^0, t^0), (x^1, t^1) \in \{(x,t) \in \operatorname{epi}(F) : \pi^T x \notin (\pi_0, \pi_1)\}$, since $\pi^T x^0 = \pi_0$ and $\pi^T x^1 = \pi_1$, and because interpolation condition (3a) holds. The result then follows from Proposition 3.1.

3.1.2 Non-Separable Positive Homogeneous Convex Functions.

Proposition 3.1 can also be used for some non-separable functions, but as illustrated in the following example, we need slightly more complicated interpolations. Consider $F : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ given by $F(z,y) = \sqrt{z^2 + y^2}$ and let $\pi_0 = -10$ and $\pi_1 = 1$. Constructing a parametric linear interpolation as in (6) yields
\[ G_L(z,y) = \frac{10\sqrt{1 + y^2} + \sqrt{100 + y^2} + z\left(\sqrt{1 + y^2} - \sqrt{100 + y^2}\right)}{11}. \tag{10} \]
The associated cut is certainly valid, binding and sufficient (we can always find friends by wiggling $z$ toward $\pi_0$ and $\pi_1$, and using $t$ to correct by following the slope of $G_L$ for fixed $y$). However, while $G_L$ is linear with respect to $z$, it is not convex with respect to $y$. We hence cannot use Proposition 3.1 for this interpolation. Fortunately, we can construct an alternative interpolation given by
\[ G_C(z,y) = \sqrt{\left(\frac{20 - 9z}{11}\right)^2 + y^2} \tag{11} \]
that is convex on $(z,y)$. This function is not linear on $z$ for fixed $y$, but we can still show it satisfies the interpolation condition (3) by noting that $\left(\frac{20 - 9z}{11}\right)^2 \leq z^2$ for any $z \notin (\pi_0, \pi_1)$ and that equality holds for $z \in \{\pi_0, \pi_1\}$. This is illustrated in Figure 2(a), which shows that for fixed $y = 4$, $G_C \leq t$ is a nonlinear binding valid cut, but is strictly weaker than $G_L \leq t$. While $G_C$ yields a weaker cut than $G_L$, $G_C$ is in fact the strongest convex function that satisfies the interpolation condition (3) and we can show that $\operatorname{epi}(F)^{\pi_0,\pi_1} = \operatorname{epi}(F) \cap \operatorname{epi}(G_C)$. However, as illustrated in Figure 2(b), constructing friends for $(\bar{z}, \bar{y}, \bar{t})$ can no longer be done by leaving $y$ fixed and following the slope of $G_C$ or $G_L$ to wiggle $z$ towards $\pi_0$ and $\pi_1$.
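The claims about the two interpolations in this example can be probed numerically. The sketch below evaluates (10) and (11) for $\pi_0 = -10$ and $\pi_1 = 1$; the sample values of $z$ and $y$ are our own choices for illustration, and a midpoint test at $z = -20$ is used as one concrete witness that $G_L$ is not convex in $y$.

```python
import math

PI0, PI1 = -10.0, 1.0


def F(z, y):
    return math.hypot(z, y)


def G_L(z, y):
    """Parametric linear interpolation (10)."""
    s1, s100 = math.sqrt(1 + y * y), math.sqrt(100 + y * y)
    return (10 * s1 + s100 + z * (s1 - s100)) / 11


def G_C(z, y):
    """Conic interpolation (11)."""
    return math.hypot((20 - 9 * z) / 11, y)


# Both interpolations agree with F on the boundary of the split.
boundary_gap = max(abs(G(z, y) - F(z, y))
                   for G in (G_L, G_C) for z in (PI0, PI1) for y in (0.0, 4.0, -7.5))

# Midpoint test in y at the sample value z = -20 (chosen for illustration):
# a convex function would satisfy G_L(z, 0) <= (G_L(z, 1) + G_L(z, -1)) / 2.
midpoint_violation = G_L(-20.0, 0.0) - 0.5 * (G_L(-20.0, 1.0) + G_L(-20.0, -1.0))

# For fixed y = 4, compare the two cuts strictly inside the split.
inside = [PI0 + k * (PI1 - PI0) / 10 for k in range(1, 10)]
gc_below_gl = all(G_C(z, 4.0) <= G_L(z, 4.0) for z in inside)
gc_above_f = all(G_C(z, 4.0) > F(z, 4.0) for z in inside)
```

A positive `midpoint_violation` exhibits the failure of convexity of $G_L$, while `gc_below_gl` reflects that $G_C \leq t$ is the weaker of the two cuts inside the split for $y = 4$.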

[Figure 2 panels: (a) Graphs of F, G_L and G_C in black, green and blue, respectively. (b) Following slopes fails in friends construction.]

Figure 2: Nonlinear interpolation and friends.

The issue with the friends construction stems from the fact that for certain fixed $y$, $G_C$ is not linear on $z$. However, for any $(\bar{z}, \bar{y}, \bar{t})$ such that $\bar{z} \in (\pi_0, \pi_1)$, we can always find a direction in which $G_C$ is linear in a neighborhood of $(\bar{z}, \bar{y}, \bar{t})$. Discovering one such direction is straightforward once we note that the epigraph of $G_C$ is a convex cone. Let $(\hat{z}, \hat{y})$ be the projection of the apex of this cone onto $(z,y)$ and let $r = (r_z, r_y)$ be a vector that starts at this point and goes through $(\bar{z}, \bar{y})$ (see Figure 3(a)). As illustrated in Figure 3(b), we can then check that $G_C$ is linear on $\{(\hat{z}, \hat{y}) + r\alpha : \alpha \geq 0\}$ (Figure 3(b) shows $F((\hat{z}, \hat{y}) + r(1 - \alpha))$ and $G_C((\hat{z}, \hat{y}) + r(1 - \alpha))$ with $\alpha_i$ such that $\hat{z} + r_z(1 - \alpha_i) = \pi_i$ for $i \in \{0, 1\}$). The friends construction can now be done by wiggling $z$ towards $\pi_0$ and $\pi_1$ while modifying $y$ to remain in the affine subspace in which $G_C$ is locally linear (i.e. $\{(\hat{z}, \hat{y}) + r\alpha : \alpha \in \mathbb{R}\}$), and modifying $t$ to follow the slope of $G_C$ as before. This is illustrated in Figure 3(c), which also shows that we can always wiggle all the way to $\pi_0$ and $\pi_1$ as $G_C$ fails to be linear in the subspace only outside $z \in [\pi_0, \pi_1]$. Alternatively, we can wiggle towards $\pi_0$ and $\pi_1$ by following the ray that starts in the apex of the epigraph of $G_C$ and goes through $(\bar{z}, \bar{y}, \bar{t})$, as illustrated in Figure 3(d).

The key in this procedure was guessing that the appropriate interpolation had the form $\sqrt{(az + b)^2 + y^2}$. After that, we could easily find the appropriate coefficients $a$ and $b$, and construct a direction to find friends. This can be easily generalized to p-order norms by using the following simple lemma

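whose proof is included in the appendix.

Lemma 3.1. Let $p \in \mathbb{N}$, $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < 0 < \pi_1$, $a = \frac{\pi_0 + \pi_1}{\pi_1 - \pi_0}$, and $b = -\frac{2 \pi_1 \pi_0}{\pi_1 - \pi_0}$.

• If $s \notin [\pi_0, \pi_1]$, then $|as + b|^p < |s|^p$,

• if $s \in \{\pi_0, \pi_1\}$, then $|as + b|^p = |s|^p$, and

• if $s \in (\pi_0, \pi_1)$, then $|as + b|^p > |s|^p$.

Using a similar interpolation form and Lemma 3.1, we can also calculate general split cuts for a wide range of positive homogeneous convex functions as follows.

Proposition 3.3. Let $\pi \in \mathbb{R}^n \setminus \{0\}$, $\pi_0, \pi_1, \beta \in \mathbb{R}$ such that $\pi_0 < \pi_1$, $p \in \mathbb{N}$, $g : \mathbb{R}^n \to \mathbb{R}$ be a positive homogeneous closed convex function, $a = \frac{\pi_1 + \pi_0}{\pi_1 - \pi_0}$, $b = -\frac{2 \pi_1 \pi_0}{\pi_1 - \pi_0}$, and
\[ C_{p,g} := \left\{(x,t) \in \mathbb{R}^n \times \mathbb{R} : \left(g\left(P_\pi^\perp x\right)^p + \left|\beta \pi^T x\right|^p\right)^{1/p} \leq t\right\}. \]
If $0 \notin (\pi_0, \pi_1)$, then $C_{p,g}^{\pi,\pi_0,\pi_1} = C_{p,g}$. Otherwise, we have that
\[ C_{p,g}^{\pi,\pi_0,\pi_1} = \left\{(x,t) \in C_{p,g} : \left(g\left(P_\pi^\perp x\right)^p + \left|a \beta \pi^T x + \beta b\right|^p\right)^{1/p} \leq t\right\}. \]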

Proof. We first show the case $0 \in (\pi_0, \pi_1)$. Let $F(x) = \left(g\left(P_\pi^\perp x\right)^p + \left|\beta \pi^T x\right|^p\right)^{1/p}$ and $G(x) = \left(g\left(P_\pi^\perp x\right)^p + \left|a \beta \pi^T x + \beta b\right|^p\right)^{1/p}$. Interpolation condition (3) holds by the definition of $a$ and $b$ and Lemma 3.1. Now let $(\bar{x}, \bar{t}) \in \operatorname{epi}(G)$ be such that $\pi^T \bar{x} \in (\pi_0, \pi_1)$. To construct the friends of $(\bar{x}, \bar{t})$, we consider two cases.

Case 1. If $|\pi_0| = |\pi_1|$, then the proof is analogous to case $f(\pi_0) = f(\pi_1)$ in the proof of Proposition 3.2.

Case 2. If $|\pi_0| \neq |\pi_1|$, one can check that for
\[ x^* = \left(\frac{-b}{a \|\pi\|_2^2}\right) \pi, \]
all points on the ray $R := \{(x^*, 0) + \alpha(\bar{x} - x^*, \bar{t}) : \alpha \in \mathbb{R}_+\}$ belong to $\operatorname{epi}(G)$. Let the intersections of $R$ with the hyperplanes $\pi^T x = \pi_0$ and $\pi^T x = \pi_1$ be $(x^0, t^0)$ and $(x^1, t^1)$, respectively. Such points are obtained from $R$ by setting
\[ \alpha_0 = \frac{\pi_0 + \frac{b}{a}}{\pi^T \bar{x} + \frac{b}{a}} \quad \text{and} \quad \alpha_1 = \frac{\pi_1 + \frac{b}{a}}{\pi^T \bar{x} + \frac{b}{a}}, \]
respectively. We have that $(x^0, t^0), (x^1, t^1) \in \{(x,t) \in \operatorname{epi}(F) : \pi^T x \notin (\pi_0, \pi_1)\}$, since $\pi^T x^0 = \pi_0$ and $\pi^T x^1 = \pi_1$, and because interpolation condition (3a) holds. Now note that $(\bar{x}, \bar{t})$ is obtained from $R$ by setting $\alpha = 1$. If $\alpha_0 < 1 < \alpha_1$ or $\alpha_1 < 1 < \alpha_0$, then there exists $\beta \in (0,1)$ such that $(\bar{x}, \bar{t}) = \beta (x^0, t^0) + (1 - \beta)(x^1, t^1)$. Seeing that $\pi^T \bar{x} \in (\pi_0, \pi_1)$, $|\pi_0| \neq |\pi_1|$, and $0 \in (\pi_0, \pi_1)$, one can check that $\alpha_0 < 1 < \alpha_1$ or $\alpha_1 < 1 < \alpha_0$. The result then follows from Proposition 3.1.

Finally, consider the case $0 \notin (\pi_0, \pi_1)$. We need to show that any point $(\bar{x}, \bar{t}) \in \{(x,t) \in \operatorname{epi}(F) : \pi^T x \in (\pi_0, \pi_1)\}$ has friends in $\{(x,t) \in \operatorname{epi}(F) : \pi^T x \notin (\pi_0, \pi_1)\}$. We can construct the friends in a similar way as before by noting that the ray $R := \{\alpha(\bar{x}, \bar{t}) : \alpha \in \mathbb{R}_+\}$ is contained in $\operatorname{epi}(F)$ and intersects both $\pi^T x = \pi_0$ and $\pi^T x = \pi_1$.

In particular, if $g$ is a p-norm, the following direct corollary characterizes elementary split cuts for p-order cones.

Corollary 3.1. Let $e^k$ be the $k$-th unit vector, $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < \pi_1$, $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$,
\[ C_p := \{(x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \|x\|_p \leq t\}, \]
$a = \frac{\pi_1 + \pi_0}{\pi_1 - \pi_0}$, $b = -\frac{2 \pi_1 \pi_0}{\pi_1 - \pi_0}$, and $\hat{B} := I + (a - 1) e^k {e^k}^T$. If $0 \notin (\pi_0, \pi_1)$, then $C_p^{e^k,\pi_0,\pi_1} = C_p$. Otherwise, we have that
\[ C_p^{e^k,\pi_0,\pi_1} = \left\{(x,t) \in C_p : \left\|\hat{B} x + b e^k\right\|_p \leq t\right\}. \]
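As a numerical illustration of Corollary 3.1 and of the three cases in Lemma 3.1, the following sketch evaluates the left-hand side $\|\hat{B} x + b e^k\|_p$ of the cut on random points. The data ($p = 3$, $n = 2$, split $-1 \leq x_k \leq 2$) and the helper names are assumptions made for the example, and `k` is a 0-based index in the code.

```python
import random


def p_norm(x, p):
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


def cone_cut_lhs(x, k, pi0, pi1, p):
    """||B_hat x + b e^k||_p from Corollary 3.1, with k a 0-based index."""
    a = (pi1 + pi0) / (pi1 - pi0)
    b = -2.0 * pi1 * pi0 / (pi1 - pi0)
    y = list(x)
    y[k] = a * x[k] + b  # B_hat rescales coordinate k, then b e^k shifts it
    return p_norm(y, p)


# Illustrative data (assumed): a 3-order cone in R^2 x R and the split -1 <= x_0 <= 2.
p, k, pi0, pi1 = 3, 0, -1.0, 2.0

random.seed(0)
samples = [[random.uniform(-5, 5), random.uniform(-5, 5)] for _ in range(1000)]
outside = [x for x in samples if not (pi0 < x[k] < pi1)]
inside = [x for x in samples if pi0 < x[k] < pi1]

# Outside the split the cut is implied by ||x||_p <= t; inside it is strictly stronger.
valid = all(cone_cut_lhs(x, k, pi0, pi1, p) <= p_norm(x, p) + 1e-12 for x in outside)
stronger = all(cone_cut_lhs(x, k, pi0, pi1, p) > p_norm(x, p) for x in inside)
```

The two flags correspond to the first and third bullets of Lemma 3.1 applied to coordinate $k$; equality on the boundary of the split is the second bullet.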

3.2 Level Sets and Disjunctions Considering t.

We have concentrated on epigraphical sets, since having $t$ not be affected by the disjunction simplifies the friends constructions. Fortunately, the effect of $t$ can be replicated by a positive homogeneous function of variables not affected by the disjunction. With this modification, we can show the following proposition for non-epigraphical sets.

Proposition 3.4. Let $\pi \in \mathbb{R}^n \setminus \{0\}$, $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < \pi_1$, $g : \mathbb{R}^n \to \mathbb{R}$ be a positive homogeneous convex function, $f : \mathbb{R} \to \mathbb{R} \cup \{+\infty\}$ be a closed convex function such that $f(\pi_0), f(\pi_1) \leq 0$,
\[ D_{f,g} := \left\{x \in \mathbb{R}^n : g\left(P_\pi^\perp x\right) + f\left(\pi^T x\right) \leq 0\right\}, \]
$a = \frac{f(\pi_1) - f(\pi_0)}{\pi_1 - \pi_0}$, and $b = \frac{\pi_1 f(\pi_0) - \pi_0 f(\pi_1)}{\pi_1 - \pi_0}$. Then we have that
\[ D_{f,g}^{\pi,\pi_0,\pi_1} = \left\{x \in D_{f,g} : g\left(P_\pi^\perp x\right) + a \pi^T x + b \leq 0\right\}. \tag{12} \]

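Proof. The left to right containment in (12) holds by convexity of $f$ and the definition of $a$ and $b$. For the right to left containment, let $\bar{x} \in \mathbb{R}^n$ be such that
\[ g\left(P_\pi^\perp \bar{x}\right) + a \pi^T \bar{x} + b \leq 0 \tag{13} \]
and $\pi^T \bar{x} \in (\pi_0, \pi_1)$. To construct friends of $\bar{x}$, we consider two cases.

Case 1. If $f(\pi_0) = f(\pi_1)$, one can check that
\[ x^0 = P_\pi^\perp \bar{x} + \frac{\pi_0}{\|\pi\|_2^2}\, \pi \quad \text{and} \quad x^1 = P_\pi^\perp \bar{x} + \frac{\pi_1}{\|\pi\|_2^2}\, \pi \]
are friends of $\bar{x}$.

Case 2. If $f(\pi_0) \neq f(\pi_1)$, let $\alpha \in (0,1)$ be such that $\pi^T \bar{x} = \alpha \pi_0 + (1 - \alpha) \pi_1$. Then $\bar{x}$ can be written as a convex combination of $x^0$ and $x^1$, where
\[ x^0 = \beta_0 P_\pi^\perp \bar{x} + \frac{\pi_0}{\|\pi\|_2^2}\, \pi \quad \text{and} \quad x^1 = \beta_1 P_\pi^\perp \bar{x} + \frac{\pi_1}{\|\pi\|_2^2}\, \pi, \]
for $\beta_i := f(\pi_i) / (\alpha f(\pi_0) + (1 - \alpha) f(\pi_1)) \in [0, \infty)$ for $i \in \{0, 1\}$ by the assumptions on $f$ and $f(\pi_0) \neq f(\pi_1)$. By construction, $\pi^T x^0 = \pi_0$ and $\pi^T x^1 = \pi_1$. To check that $x^i \in D_{f,g}$, first note that by positive homogeneity of $g$ we have
\[ g\left(P_\pi^\perp x^i\right) + f\left(\pi^T x^i\right) = \frac{f(\pi_i)}{\alpha f(\pi_0) + (1 - \alpha) f(\pi_1)}\, g\left(P_\pi^\perp \bar{x}\right) + f(\pi_i). \tag{14} \]
If $f(\pi_i) = 0$, then we directly obtain $g\left(P_\pi^\perp x^i\right) + f\left(\pi^T x^i\right) \leq 0$. Otherwise, if $f(\pi_i) < 0$, (14) implies that $g\left(P_\pi^\perp x^i\right) + f\left(\pi^T x^i\right) \leq 0$ is equivalent to $g\left(P_\pi^\perp \bar{x}\right) + \alpha f(\pi_0) + (1 - \alpha) f(\pi_1) \leq 0$, which holds because of (13) and $\alpha f(\pi_0) + (1 - \alpha) f(\pi_1) = a \pi^T \bar{x} + b$. The result then follows from convexity of $g\left(P_\pi^\perp x\right) + a \pi^T x + b$, similar to the proof of Proposition 3.1.

As a direct corollary of Proposition 3.4, we obtain formulas for elementary split cuts for balls of p-norms.

Corollary 3.2. Let $e^k$ be the $k$-th unit vector, $\pi_0, \pi_1, r \in \mathbb{R}$ such that $\pi_0 < \pi_1$, $r > 0$, and $|\pi_0|, |\pi_1| \leq r$, $B_p := \{x \in \mathbb{R}^n : \|x\|_p \leq r\}$, $f(u) := -\left(r^p - |u|^p\right)^{1/p}$, $a = \frac{f(\pi_1) - f(\pi_0)}{\pi_1 - \pi_0}$, $b = \frac{\pi_1 f(\pi_0) - \pi_0 f(\pi_1)}{\pi_1 - \pi_0}$, and $\hat{B} := I - e^k {e^k}^T$. Then
\[ B_p^{e^k,\pi_0,\pi_1} = \left\{x \in B_p : \left\|\hat{B} x\right\|_p + a x_k + b \leq 0\right\}. \]

Proof. Direct from Proposition 3.4 by noting that
\[ B_p = \left\{x \in \mathbb{R}^n : \left\|\hat{B} x\right\|_p + f(x_k) \leq 0\right\}. \]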
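The formula of Corollary 3.2 is easy to evaluate. As a rough sketch under assumed data (the Euclidean ball of radius 2 in $\mathbb{R}^2$ and the elementary split $0 \leq x_k \leq 1$, with `k` 0-based in the code and hypothetical helper names), the code below checks that the cut holds on sampled points of the ball outside the split and that it removes a point of the ball inside the split.

```python
import random


def p_norm(x, p):
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


def ball_cut_lhs(x, k, pi0, pi1, r, p):
    """||B_hat x||_p + a x_k + b from Corollary 3.2, with k a 0-based index."""
    f = lambda u: -max(r ** p - abs(u) ** p, 0.0) ** (1.0 / p)  # max() guards rounding
    a = (f(pi1) - f(pi0)) / (pi1 - pi0)
    b = (pi1 * f(pi0) - pi0 * f(pi1)) / (pi1 - pi0)
    rest = [xi for i, xi in enumerate(x) if i != k]  # B_hat zeroes coordinate k
    return p_norm(rest, p) + a * x[k] + b


# Illustrative data (assumed): Euclidean ball of radius 2 in R^2, split 0 <= x_0 <= 1.
r, p, k, pi0, pi1 = 2.0, 2, 0, 0.0, 1.0

random.seed(1)
pts = [[random.uniform(-r, r), random.uniform(-r, r)] for _ in range(2000)]
ball = [x for x in pts if p_norm(x, p) <= r]
kept = [x for x in ball if not (pi0 < x[k] < pi1)]

# Points of the ball outside the split satisfy the cut (up to rounding).
valid = all(ball_cut_lhs(x, k, pi0, pi1, r, p) <= 1e-9 for x in kept)

# A point of the ball inside the split, close to its boundary, that the cut removes.
probe = [0.5, 1.93]
cut_off = ball_cut_lhs(probe, k, pi0, pi1, r, p) > 0
```

Sampling only gives evidence on a finite set of points; the corollary itself is what guarantees validity and sufficiency.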

From the proof of Proposition 3.4, we can glimpse a natural extension of Proposition 3.4 to general non-epigraphical sets. We do not explore such an extension further, since we have not yet encountered non-epigraphical structures beyond that of Proposition 3.4 which allow for simple split cut characterizations. However, we consider the following direct extension of Proposition 3.1 as we need it to give a complete characterization of split cuts for convex quadratic sets. We will see in Section 3.3.2 that the resulting formulas are significantly more complicated than those obtained through Proposition 3.1.

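Proposition 3.5. Let $F : \mathbb{R}^n \to \mathbb{R}$ be a closed convex function, $\pi \in \mathbb{R}^n$, $\pi_0, \pi_1, \hat{\pi} \in \mathbb{R}$ such that $\pi_0 < \pi_1$ and $\hat{\pi} \neq 0$. If $G : \mathbb{R}^n \to \mathbb{R}$ is a closed convex function such that
\[ \{(x,t) \in \operatorname{gr}(F) : \pi^T x + \hat{\pi} t = \pi_0\} = \{(x,t) \in \operatorname{gr}(G) : \pi^T x + \hat{\pi} t = \pi_0\} \tag{15a} \]
\[ \{(x,t) \in \operatorname{gr}(F) : \pi^T x + \hat{\pi} t = \pi_1\} = \{(x,t) \in \operatorname{gr}(G) : \pi^T x + \hat{\pi} t = \pi_1\} \tag{15b} \]
\[ \{(x,t) \in \operatorname{epi}(F) : \pi^T x + \hat{\pi} t \notin (\pi_0, \pi_1)\} \subseteq \{(x,t) \in \operatorname{epi}(G) : \pi^T x + \hat{\pi} t \notin (\pi_0, \pi_1)\}, \tag{15c} \]
and if every $(\bar{x}, \bar{t}) \in \operatorname{epi}(G)$ such that $\pi^T \bar{x} + \hat{\pi} \bar{t} \in (\pi_0, \pi_1)$ has friends in $\{(x,t) \in \operatorname{epi}(F) : \pi^T x + \hat{\pi} t \notin (\pi_0, \pi_1)\}$, then
\[ \operatorname{epi}(F)^{\pi,\hat{\pi},\pi_0,\pi_1} = \operatorname{epi}(F) \cap \operatorname{epi}(G). \tag{16} \]

3.3 Split Cuts for Quadratic Sets.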

In this section we consider split cuts for convex sets described by a single quadratic inequality. Since we also want to include the second-order cone, we also consider quadratic sets that are the union of two convex sets and can be separated by a single linear inequality. It is well known (e.g. see [11] Section 2.1) that all such convex quadratic sets correspond to the following list:

1. a full dimensional paraboloid,
2. a full dimensional ellipsoid (or a single point),
3. a full dimensional second-order cone,
4. one side of a full dimensional hyperboloid of two sheets,
5. a cylinder generated by a lower-dimensional version of one of the previous sets, or
6. an invertible affine transformation of one of the previous sets.

To give formulas for split cuts for all the above sets, it suffices to give formulas for the cases 1–4. With these, we can construct formulas for cylinders by using the following straightforward lemma.

Lemma 3.2. Let $C \subseteq \mathbb{R}^n$ and let $L \subseteq \mathbb{R}^n$ be a linear subspace. Then $\operatorname{conv}(C + L) = \operatorname{conv}(C) + L$.

Finally, split cuts for affine transformations can be obtained through the following simple lemma that we prove in the appendix. We include an epigraphical version of this lemma, since it will simplify the formulas in some cases.

Lemma 3.3. Let $c \in \mathbb{R}^n$, $\pi \in \mathbb{R}^n \setminus \{0\}$, $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < \pi_1$, $B \in \mathbb{R}^{n\times n}$ be an invertible matrix, $\tilde\pi = B^{-T}\pi$, $\tilde\pi_0 = \pi_0 - \pi^T c$, $\tilde\pi_1 = \pi_1 - \pi^T c$, and $f, g, \tilde f, \tilde g : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ be proper closed convex functions such that $f(x) = \tilde f(B(x-c))$ and $g(x) = \tilde g(B(x-c))$. If

\[ \operatorname{epi}(\tilde f)^{\tilde\pi,\tilde\pi_0,\tilde\pi_1} = \operatorname{epi}(\tilde f) \cap \operatorname{epi}(\tilde g), \]

then $\operatorname{epi}(f)^{\pi,\pi_0,\pi_1} = \operatorname{epi}(f) \cap \operatorname{epi}(g)$. Similarly, if

\[ \{x \in \mathbb{R}^n : \tilde f(x) \le 0\}^{\tilde\pi,\tilde\pi_0,\tilde\pi_1} = \{x \in \mathbb{R}^n : \tilde f(x) \le 0,\ \tilde g(x) \le 0\}, \]

then $\{x \in \mathbb{R}^n : f(x) \le 0\}^{\pi,\pi_0,\pi_1} = \{x \in \mathbb{R}^n : f(x) \le 0,\ g(x) \le 0\}$.
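The change of variables behind Lemma 3.3 can be checked numerically. The following is a minimal sketch, not part of the original development: it assumes a $2 \times 2$ instance with arbitrarily chosen data, and the helper names (`transformed_split`, `inverse_transpose_2x2`) are hypothetical. It verifies that, with $y = B(x-c)$, the quantity $\pi^T x - \pi_0$ coincides with $\tilde\pi^T y - \tilde\pi_0$, so a point lies on a given side of the original split exactly when its image lies on the same side of the transformed split.

```python
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def inverse_transpose_2x2(M):
    """Return (M^{-1})^T for an invertible 2x2 matrix M."""
    (p, q), (r, s) = M
    det = p * s - q * r
    # M^{-1} = [[s, -q], [-r, p]] / det, so its transpose is [[s, -r], [-q, p]] / det
    return [[s / det, -r / det], [-q / det, p / det]]

def transformed_split(B, c, pi, pi0, pi1):
    """Split data (pi_tilde, pi0_tilde, pi1_tilde) as defined in Lemma 3.3."""
    pi_tilde = matvec(inverse_transpose_2x2(B), pi)   # B^{-T} pi
    shift = dot(pi, c)                                # pi^T c
    return pi_tilde, pi0 - shift, pi1 - shift

# Illustrative data (assumed, not taken from the paper)
B = [[2.0, 1.0], [0.0, 3.0]]
c = [1.0, -2.0]
pi, pi0, pi1 = [1.0, 2.0], 0.0, 1.0
pi_t, pi0_t, pi1_t = transformed_split(B, c, pi, pi0, pi1)

random.seed(0)
max_gap = 0.0
for _ in range(1000):
    x = [random.uniform(-5, 5), random.uniform(-5, 5)]
    y = matvec(B, [x[0] - c[0], x[1] - c[1]])         # y = B (x - c)
    gap = abs((dot(pi, x) - pi0) - (dot(pi_t, y) - pi0_t))
    max_gap = max(max_gap, gap)
```

Since both offsets are shifted by the same amount $\pi^T c$, the width $\pi_1 - \pi_0$ of the split is unchanged by the transformation.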

We first consider split cuts for quadratic sets with simple structures that can be obtained as direct corollaries of Propositions 3.2, 3.3 and 3.4. We refer to these as simple split cuts. We then consider split cuts for sets with more complicated structures that require ad-hoc proofs based on Propositions 3.1 or 3.5. As expected, we will see that formulas for the first case are significantly simpler than those for the second case. However, in either case, it is crucial to exploit the symmetry of the Euclidean norm through the following well known lemma.

Lemma 3.4. For $v \in \mathbb{R}^n$, $\|x\|_2^2 = \|P_v x\|_2^2 + \|P_v^{\perp} x\|_2^2$.

3.3.1

Simple Split Cuts.

Simple split cuts can be obtained for general ellipsoids and for paraboloids and cones that, when interpreted as epigraphs of quadratic or conic functions (i.e. based on the Euclidean norm), are such that $t$ is unaffected by the split disjunctions. We note that the ellipsoid case has already been proven in [10, 28], and that the conic case generalizes Proposition 2 in [7], which considers elementary disjunctions for the standard three dimensional second-order cone.

Corollary 3.3 (Simple split cuts for paraboloids). Let $B \in \mathbb{R}^{n\times n}$ be an invertible matrix, $c \in \mathbb{R}^n$, $\pi \in \mathbb{R}^n \setminus \{0\}$, $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < \pi_1$, $LC^2(B,c) := \{(x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \|B(x-c)\|_2^2 \le t\}$,

\[ a = \frac{\pi_0 + \pi_1 - 2\pi^T c}{\|B^{-T}\pi\|_2^2}, \qquad b = -\frac{(\pi_1 - \pi^T c)(\pi_0 - \pi^T c)}{\|B^{-T}\pi\|_2^2}, \qquad \text{and} \qquad \hat B = P^{\perp}_{B^{-T}\pi} B. \]

Then

\[ LC^2(B,c)^{\pi,\pi_0,\pi_1} = \left\{ (x,t) \in LC^2(B,c) : \|\hat B(x-c)\|_2^2 + a\pi^T(x-c) + b \le t \right\}. \]

Proof. Using Lemma 3.3, we prove the corollary by finding a closed form expression for $LC^2(I,0)^{\tilde\pi,\tilde\pi_0,\tilde\pi_1}$, where $\tilde\pi = B^{-T}\pi$, $\tilde\pi_0 = \pi_0 - \pi^T c$, and $\tilde\pi_1 = \pi_1 - \pi^T c$. By Lemma 3.4, we have $LC^2(I,0) = \left\{(x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \|P^{\perp}_{\tilde\pi} x\|_2^2 + \frac{(\tilde\pi^T x)^2}{\|\tilde\pi\|_2^2} \le t \right\}$. The result then follows from Proposition 3.2. □
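To illustrate Corollary 3.3, the sketch below evaluates the cut in the standard case $B = I$, $c = 0$, where $a = (\pi_0+\pi_1)/\|\pi\|_2^2$ and $b = -\pi_0\pi_1/\|\pi\|_2^2$. The function name `paraboloid_cut_lhs` and the sample data are assumptions made for this example only. The sketch checks on random points that the left-hand side of the cut never exceeds $\|x\|_2^2$ when $\pi^T x \notin (\pi_0,\pi_1)$, so the cut is valid on both sides of the disjunction, while it strictly exceeds $\|x\|_2^2$ when $\pi^T x$ lies strictly between $\pi_0$ and $\pi_1$, so the boundary point $(x, \|x\|_2^2)$ is cut off.

```python
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def paraboloid_cut_lhs(x, pi, pi0, pi1):
    """||P_pi^perp x||^2 + a*pi^T x + b for the standard paraboloid (B = I, c = 0)."""
    norm_pi_sq = dot(pi, pi)
    s = dot(pi, x)
    a = (pi0 + pi1) / norm_pi_sq
    b = -pi0 * pi1 / norm_pi_sq
    perp_sq = dot(x, x) - s * s / norm_pi_sq    # Lemma 3.4
    return perp_sq + a * s + b

# Illustrative split (assumed data)
pi, pi0, pi1 = [1.0, 2.0, -1.0], 0.0, 1.0

random.seed(1)
outside_ok = True       # cut is implied by ||x||^2 <= t outside the split
inside_cut_off = True   # cut removes (x, ||x||^2) strictly inside the split
for _ in range(2000):
    x = [random.uniform(-3, 3) for _ in pi]
    s = dot(pi, x)
    gap = paraboloid_cut_lhs(x, pi, pi0, pi1) - dot(x, x)
    if s <= pi0 or s >= pi1:
        outside_ok = outside_ok and gap <= 1e-9
    elif pi0 + 1e-6 < s < pi1 - 1e-6:
        inside_cut_off = inside_cut_off and gap > 0.0
```

The check works because the difference between the two sides equals $-(\pi^T x - \pi_0)(\pi^T x - \pi_1)/\|\pi\|_2^2$, which is the gap between $s^2$ and its chord over $[\pi_0,\pi_1]$.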

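Corollary 3.4 (Simple split cuts for cones). Let $B \in \mathbb{R}^{n\times n}$ be an invertible matrix, $c \in \mathbb{R}^n$, $\pi \in \mathbb{R}^n \setminus \{0\}$, $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < \pi_1$, $LC(B,c) := \{(x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \|B(x-c)\|_2 \le t\}$,

\[ a = \frac{\pi_1 + \pi_0 - 2\pi^T c}{\pi_1 - \pi_0}, \qquad b = \frac{-2(\pi_1 - \pi^T c)(\pi_0 - \pi^T c)}{\pi_1 - \pi_0}, \qquad \hat B = P^{\perp}_{B^{-T}\pi} B + a P_{B^{-T}\pi} B, \qquad \text{and} \qquad \hat c = \frac{b}{\|B^{-T}\pi\|_2^2} B^{-T}\pi. \]

If $\pi^T c \notin (\pi_0,\pi_1)$, then $LC(B,c)^{\pi,\pi_0,\pi_1} = LC(B,c)$. Otherwise, we have that

\[ LC(B,c)^{\pi,\pi_0,\pi_1} = \left\{ (x,t) \in LC(B,c) : \|\hat B(x-c) + \hat c\|_2 \le t \right\}. \]

Proof. Using Lemma 3.3, we prove the corollary by finding a closed form expression for $LC(I,0)^{\tilde\pi,\tilde\pi_0,\tilde\pi_1}$, where $\tilde\pi = B^{-T}\pi$, $\tilde\pi_0 = \pi_0 - \pi^T c$, and $\tilde\pi_1 = \pi_1 - \pi^T c$. By Lemma 3.4, we have $LC(I,0) = \left\{(x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \left(\|P^{\perp}_{\tilde\pi} x\|_2^2 + \frac{(\tilde\pi^T x)^2}{\|\tilde\pi\|_2^2}\right)^{1/2} \le t \right\}$. The result then follows using Proposition 3.3. □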

A particularly interesting application of Corollaries 3.3 and 3.4 is the Closest Vector Problem [53], which can be alternatively written as $\min\{\|B(x-c)\|_2^2 : x \in \mathbb{Z}^n\}$ or $\min\{\|B(x-c)\|_2 : x \in \mathbb{Z}^n\}$. In turn, these problems can be reformulated as $\min\{t : (x,t) \in LC^2(B,c),\ x \in \mathbb{Z}^n\}$ and $\min\{t : (x,t) \in LC(B,c),\ x \in \mathbb{Z}^n\}$, respectively. We can then use Corollaries 3.3 and 3.4 with lattice free splits to construct cuts that could improve the solution speed of these problems. We are currently studying the effectiveness of such cuts.

We can also obtain as a corollary the following result from [10, 28].

Corollary 3.5 (General split cuts for ellipsoids). Let $B \in \mathbb{R}^{n\times n}$ be an invertible matrix, $c \in \mathbb{R}^n$, $\pi \in \mathbb{R}^n \setminus \{0\}$, $r \in \mathbb{R}_+$, $E(B,c,r) := \{x \in \mathbb{R}^n : \|B(x-c)\|_2 \le r\}$, $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < \pi_1$, $f(u) := -\sqrt{r^2 - \frac{u^2}{\|B^{-T}\pi\|_2^2}}$,

\[ b = \frac{(\pi_1 - \pi^T c)\, f(\pi_0 - \pi^T c) - (\pi_0 - \pi^T c)\, f(\pi_1 - \pi^T c)}{\pi_1 - \pi_0}, \qquad \text{and} \qquad a = \frac{f(\pi_0 - \pi^T c) - f(\pi_1 - \pi^T c)}{\pi_1 - \pi_0}. \]

If $\pi^T c - r\|B^{-T}\pi\|_2 \le \pi_0 < \pi_1 \le \pi^T c + r\|B^{-T}\pi\|_2$, then

\[ E(B,c,r)^{\pi,\pi_0,\pi_1} = \left\{ x \in E(B,c,r) : \|P^{\perp}_{B^{-T}\pi} B(x-c)\|_2 \le a\pi^T(x-c) - b \right\}, \tag{17} \]

if $\pi_0 < \pi^T c - r\|B^{-T}\pi\|_2 < \pi_1 \le \pi^T c + r\|B^{-T}\pi\|_2$, then

\[ E(B,c,r)^{\pi,\pi_0,\pi_1} = \left\{ x \in E(B,c,r) : \pi^T x \ge \pi_1 \right\}, \tag{18} \]

if $\pi^T c - r\|B^{-T}\pi\|_2 \le \pi_0 < \pi^T c + r\|B^{-T}\pi\|_2 < \pi_1$, then

\[ E(B,c,r)^{\pi,\pi_0,\pi_1} = \left\{ x \in E(B,c,r) : \pi^T x \le \pi_0 \right\}, \tag{19} \]

and if $\pi^T c - r\|B^{-T}\pi\|_2 \ge \pi_1$ or $\pi_0 \ge \pi^T c + r\|B^{-T}\pi\|_2$, then $E(B,c,r)^{\pi,\pi_0,\pi_1} = E(B,c,r)$, otherwise, $E(B,c,r)^{\pi,\pi_0,\pi_1} = \emptyset$.

Proof. Using Lemma 3.3, we prove (17) by finding a closed form expression for $E(I,0,r)^{\tilde\pi,\tilde\pi_0,\tilde\pi_1}$, where $\tilde\pi = B^{-T}\pi$, $\tilde\pi_0 = \pi_0 - \pi^T c$, and $\tilde\pi_1 = \pi_1 - \pi^T c$. By Lemma 3.4, we have

\[ E(I,0,r) = \left\{ x \in \mathbb{R}^n : \|P^{\perp}_{\tilde\pi} x\|_2 - \sqrt{r^2 - \frac{(\tilde\pi^T x)^2}{\|\tilde\pi\|_2^2}} \le 0 \right\}. \]

The result then follows from Proposition 3.4.

The other cases can be shown by studying when the ellipsoid is partially or completely contained in one side of the disjunction, or when it is completely contained strictly between the disjunctions.
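As a small worked instance of (17), the sketch below computes the coefficients $a$ and $b$ of Corollary 3.5 for the disk of radius 2 centered at the origin ($B = I$, $c = 0$, $r = 2$) and the split $z \le 0 \vee z \ge 1$ (so $\pi = (1,0)$, $\pi_0 = 0$, $\pi_1 = 1$). The function name `ellipsoid_cut_coefficients` and its argument names are hypothetical. The resulting right-hand side $a z - b$ is the chord of the half-width $\sqrt{4 - z^2}$ of the disk between $z = 0$ and $z = 1$.

```python
import math

def ellipsoid_cut_coefficients(r, norm_pi_tilde, pi0_t, pi1_t):
    """Coefficients a, b of (17).

    norm_pi_tilde stands for ||B^{-T} pi||_2, and pi0_t, pi1_t for
    pi0 - pi^T c and pi1 - pi^T c (names chosen for this sketch).
    """
    def f(u):
        return -math.sqrt(r * r - (u / norm_pi_tilde) ** 2)
    a = (f(pi0_t) - f(pi1_t)) / (pi1_t - pi0_t)
    b = (pi1_t * f(pi0_t) - pi0_t * f(pi1_t)) / (pi1_t - pi0_t)
    return a, b

a, b = ellipsoid_cut_coefficients(r=2.0, norm_pi_tilde=1.0, pi0_t=0.0, pi1_t=1.0)

def cut_rhs(z):
    """Right-hand side a*pi^T(x - c) - b of (17) as a function of z = pi^T x."""
    return a * z - b

def half_width(z):
    """Half-width of the disk z^2 + y^2 <= 4 at abscissa z."""
    return math.sqrt(4.0 - z * z)
```

Here $a = \sqrt{3} - 2$ and $b = -2$, so (17) reads $|y| \le (\sqrt{3} - 2)z + 2$, and the right-hand side agrees with the half-width of the disk at $z = 0$ and $z = 1$.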

We note that Corollary 3.5 shows there are two types of split cuts for $E(B,c,r)$. In (17), we obtain a nonlinear split cut that we would expect from Proposition 3.4, while in (18)–(19) we obtain simple linear split cuts. These linear inequalities are actually Chvátal-Gomory (CG) cuts for $E(B,c,r)$ [21, 26, 27, 32, 37], but they are still sufficient to describe $E(B,c,r)^{\pi,\pi_0,\pi_1}$ together with the original constraint. We hence follow the same MILP convention used in [28] and still consider them split cuts. Finally, we note that we can also consider “CG split cuts” in Proposition 3.4 if we include additional structure on the functions such as $g$ being non-negative. Similarly, we can also do the case analysis for CG cuts in Corollary 3.2.

3.3.2

Other Split Cuts.

The split cut formulas in this section are significantly more complicated. For this reason, we only present them for standard sets (i.e. with $B = I$ and $c = 0$). Formulas for the general case may be obtained by combining the formulas for the standard case with Lemma 3.3.

Proposition 3.6 (General split cuts for paraboloids). Let $\pi \in \mathbb{R}^n$, $\pi_0, \pi_1, \hat\pi \in \mathbb{R}$ such that $\pi_0 < \pi_1$ and $\hat\pi \neq 0$, $LC^2(I,0) := \{(x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \|x\|_2^2 \le t\}$. If $\hat\pi > 0$ and $\pi_0 < \pi_1 \le \frac{-\|\pi\|_2^2}{4\hat\pi}$, or if $\hat\pi < 0$ and $\frac{-\|\pi\|_2^2}{4\hat\pi} \le \pi_0 < \pi_1$, then

$LC^2(I,0)^{\pi,\hat\pi,\pi_0,\pi_1} = LC^2(I,0)$, if $\hat\pi > 0$ and $\pi_0$
0 and 4ˆπ 2 ≤ π0 < π1 , or π ˆ < 0 and π0 < π1 ≤

−kπk22 4ˆ π .

We prove this case using Proposition 3.5.

Let $F(x) = \|x\|_2^2$ and $G(x) = \left( \left\| P^{\perp}_{\pi} x + \frac{\pi^T x + b}{\|\pi\|_2^2}\,\pi \right\|_2 - c\pi^T x - e \right) / d$. One can check that $f, d > 0$.

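Now we consider two cases.

Case 1. Assume that $\|\pi\|_2 \neq 0$. First, we prove (15a). Let $S_F := \{(x,t) \in \operatorname{gr}(F) : \pi^T x + \hat\pi t = \pi_0\}$ and $S_G := \{(x,t) \in \operatorname{gr}(G) : \pi^T x + \hat\pi t = \pi_0\}$. To prove $S_G \subseteq S_F$, let $(\bar x, \bar t) \in S_G$. We need to show that $\|\bar x\|_2^2 = \bar t$. By squaring the split cut equality and using Lemma 3.4, we can equivalently show that

\[ \left(c\pi^T \bar x + d\bar t + e\right)^2 - \frac{(\pi^T \bar x + b)^2}{\|\pi\|_2^2} = \bar t - \frac{(\pi^T \bar x)^2}{\|\pi\|_2^2}. \tag{21} \]

Replacing $\bar t$ with $(\pi_0 - \pi^T \bar x)/\hat\pi$, one can check that (21) follows from the definition of $b$, $c$, $d$, and $e$. To prove $S_F \subseteq S_G$, let $(\bar x, \bar t) \in S_F$. We only need to show that $c\pi^T \bar x + d\bar t + e \ge 0$. Since $d = c\hat\pi$, we need to show that $c\left(\pi^T \bar x + \hat\pi \bar t\right) \ge -e$, which after a few simplifications, can be written as

\[ \hat\pi\left(\pi^T \bar x + \hat\pi \bar t\right) \ge -\left( \|\pi\|_2^2 + \sqrt{\|\pi\|_2^2 + 4\pi_0\hat\pi}\,\sqrt{\|\pi\|_2^2 + 4\pi_1\hat\pi} \right) / 4. \tag{22} \]

(22) follows from noting that $\min\left\{ \hat\pi\left(\pi^T x + \hat\pi t\right) : (x,t) \in LC^2(I,0) \right\} = -\frac{\|\pi\|_2^2}{4}$. Proving (15b) is analogous.

Now we prove (15c). Let $\tilde S_F := \{(x,t) \in \operatorname{epi}(F) : \pi^T x + \hat\pi t \notin (\pi_0,\pi_1)\}$ and $\tilde S_G := \{(x,t) \in \operatorname{epi}(G) : \pi^T x + \hat\pi t \notin (\pi_0,\pi_1)\}$. To prove $\tilde S_F \subseteq \tilde S_G$, let $(\bar x, \bar t) \in \tilde S_F$. To show that $(\bar x, \bar t)$ satisfies the split cut inequality (20), we can show that

\[ \left( \left(c\pi^T \bar x + d\bar t + e\right)^2 - \frac{(\pi^T \bar x + b)^2}{\|\pi\|_2^2} \right) - \left( \bar t - \frac{(\pi^T \bar x)^2}{\|\pi\|_2^2} \right) \ge 0. \tag{23} \]

One can check that proving (23) is equivalent to showing that

\[ \frac{f^2 \left(\pi^T \bar x + \hat\pi \bar t - \pi_0\right)\left(\pi^T \bar x + \hat\pi \bar t - \pi_1\right)}{2(\pi_1 - \pi_0)^2 \hat\pi^2} \ge 0, \]

which follows from $\pi^T \bar x + \hat\pi \bar t \notin (\pi_0,\pi_1)$. Proving $c\pi^T \bar x + d\bar t + e \ge 0$ is similar as before.

Now let $(\bar x, \bar t) \in \operatorname{epi}(G)$ such that $\pi^T \bar x + \hat\pi \bar t \in (\pi_0,\pi_1)$. To construct the friends of $(\bar x, \bar t)$, one can check that for

\[ x^* = \frac{-b}{\|\pi\|_2^2}\,\pi, \qquad t^* = \frac{bc - e}{d}, \]

all points on the ray $R := \{(x^*,t^*) + \alpha(\bar x - x^*, \bar t - t^*) : \alpha \in \mathbb{R}_+\}$ belong to $\operatorname{epi}(G)$. Let the intersections of $R$ with the hyperplanes $\pi^T x + \hat\pi t = \pi_0$ and $\pi^T x + \hat\pi t = \pi_1$ be $(x^0,t^0)$ and $(x^1,t^1)$, respectively. Such points are obtained from $R$ by setting

\[ \alpha_0 = \frac{\pi_0 + h}{\pi^T \bar x + \hat\pi \bar t + h} \qquad \text{and} \qquad \alpha_1 = \frac{\pi_1 + h}{\pi^T \bar x + \hat\pi \bar t + h}, \]

respectively, where $h = \frac{\|\pi\|_2^2 + \sqrt{\|\pi\|_2^2 + 4\pi_0\hat\pi}\,\sqrt{\|\pi\|_2^2 + 4\pi_1\hat\pi}}{4\hat\pi}$. We have that $(x^0,t^0), (x^1,t^1) \in \operatorname{epi}(F) \cap \{(x,t) : \pi^T x + \hat\pi t \notin (\pi_0,\pi_1)\}$, since $\pi^T x^0 + \hat\pi t^0 = \pi_0$ and $\pi^T x^1 + \hat\pi t^1 = \pi_1$, and because interpolation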

conditions (15a) and (15b) hold. Now note that $(\bar x, \bar t)$ is obtained from $R$ by setting $\alpha = 1$. If $\alpha_0 < 1 < \alpha_1$ or $\alpha_1 < 1 < \alpha_0$, then there exists $\beta \in (0,1)$ such that $(\bar x, \bar t) = \beta(x^0,t^0) + (1-\beta)(x^1,t^1)$. Seeing that $\pi^T \bar x + \hat\pi \bar t \in (\pi_0,\pi_1)$, one can check $\alpha_0 < 1 < \alpha_1$ or $\alpha_1 < 1 < \alpha_0$.

Case 2. If $\|\pi\|_2 = 0$, the split cut (20) is simplified to $\|x\|_2 \le dt + e$. Let $G(x) = (\|x\|_2 - e)/d$. One can prove (15a) and (15b) by showing that for $(\bar x, \bar t) \in \operatorname{epi}(F)$ such that $\hat\pi \bar t \in \{\pi_0,\pi_1\}$, we have $(d\bar t + e)^2 = \bar t$. The latter follows from the definition of the interpolation coefficients. Non-negativity of $d$, $e$, and $t$ also imply $d\bar t + e \ge 0$. Proving (15c) is also equivalent to showing that

\[ \frac{f^2 \left(\hat\pi \bar t - \pi_0\right)\left(\hat\pi \bar t - \pi_1\right)}{2(\pi_1 - \pi_0)^2 \hat\pi^2} \ge 0, \]

which follows from $\hat\pi \bar t \notin (\pi_0,\pi_1)$. We can construct the friends in a similar way as in Case 1 by noting that the ray $R := \{(0,t^*) + \alpha(\bar x, \bar t - t^*) : \alpha \in \mathbb{R}_+\}$, where $t^* = \frac{-e}{d}$, is contained in $\operatorname{epi}(G)$ and intersects both $\hat\pi t = \pi_0$ and $\hat\pi t = \pi_1$. The result then follows from Proposition 3.5.

Proposition 3.7 (General split cuts for cones). Let $\pi \in \mathbb{R}^n$, $\pi_0, \pi_1, \hat\pi \in \mathbb{R}$ such that $\pi_0 < \pi_1$ and $\hat\pi \neq 0$, $LC(I,0) := \{(x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \|x\|_2 \le t\}$. If $0 \notin (\pi_0,\pi_1)$, then $LC(I,0)^{\pi,\hat\pi,\pi_0,\pi_1} = LC(I,0)$. Otherwise, if $0 \in (\pi_0,\pi_1)$ and $\hat\pi \le -\|\pi\|_2$, then

\[ LC(I,0)^{\pi,\hat\pi,\pi_0,\pi_1} = \left\{ (x,t) \in LC(I,0) : \pi^T x + \hat\pi t \le \pi_0 \right\}, \]

if $0 \in (\pi_0,\pi_1)$ and $\hat\pi \ge \|\pi\|_2$, then

\[ LC(I,0)^{\pi,\hat\pi,\pi_0,\pi_1} = \left\{ (x,t) \in LC(I,0) : \pi^T x + \hat\pi t \ge \pi_1 \right\}, \]

and if $0 \in (\pi_0,\pi_1)$ and $\hat\pi \in (-\|\pi\|_2, \|\pi\|_2)$, then

\[ LC(I,0)^{\pi,\hat\pi,\pi_0,\pi_1} = \left\{ (x,t) \in LC(I,0) : \left\| P^{\perp}_{\pi} x + \frac{a\pi^T x + b}{\|\pi\|_2^2}\,\pi \right\|_2 \le c\pi^T x + dt + e \right\}, \]

where

\begin{align*}
a &= \frac{(\pi_0 + \pi_1)\left(\|\pi\|_2^2 - \hat\pi^2\right)}{f}, &
b &= -\frac{2\pi_0\pi_1 \|\pi\|_2^2}{f}, \\
c &= -\frac{4\pi_0\pi_1\hat\pi}{(\pi_1 - \pi_0) f}, &
d &= \frac{f}{(\pi_1 - \pi_0)\left(\|\pi\|_2^2 - \hat\pi^2\right)}, \\
e &= \frac{2\pi_0\pi_1(\pi_0 + \pi_1)\hat\pi}{(\pi_1 - \pi_0) f}, &
f &= \sqrt{\left(\|\pi\|_2^2 - \hat\pi^2\right)\left(\|\pi\|_2^2 (\pi_0 - \pi_1)^2 - (\pi_0 + \pi_1)^2 \hat\pi^2\right)}.
\end{align*}

Proof. We include the case analysis in Lemma 5.2 in the appendix. The second and third cases where $0 \in (\pi_0,\pi_1)$ and $\hat\pi \le -\|\pi\|_2$ or $\hat\pi \ge \|\pi\|_2$ follow from Lemma 5.2.

Now we show the case $0 \in (\pi_0,\pi_1)$ and $\hat\pi \in (-\|\pi\|_2, \|\pi\|_2)$. Note that $\hat\pi \neq 0$ and $\hat\pi \in (-\|\pi\|_2, \|\pi\|_2)$ imply $\|\pi\|_2 \neq 0$. Let $F(x) = \|x\|_2$ and $G(x) = \left( \left\| P^{\perp}_{\pi} x + \frac{a\pi^T x + b}{\|\pi\|_2^2}\,\pi \right\|_2 - c\pi^T x - e \right) / d$. One can check that $d > 0$. Similarly to the proof of Proposition 3.6, we can show that interpolation condition (15) holds by the definition of $a$, $b$, $c$, $d$, and $e$. Now let $(\bar x, \bar t) \in \operatorname{epi}(G)$ such that $\pi^T \bar x + \hat\pi \bar t \in (\pi_0,\pi_1)$. To construct the friends of $(\bar x, \bar t)$, we consider two cases.

Case 1. If $|\pi_0| = |\pi_1|$, $(\bar x, \bar t)$ can be written as a convex combination of $(x^0,t^0)$ and $(x^1,t^1)$ given by

\[ x^0 = P^{\perp}_{\pi} \bar x + \frac{\pi_0 - \hat\pi \bar t - \frac{c}{d}\hat\pi \pi^T \bar x}{\left(1 - \frac{c}{d}\hat\pi\right)\|\pi\|_2^2}\,\pi, \qquad t^0 = \bar t + \frac{c}{d}\left(\pi^T \bar x - \pi^T x^0\right), \tag{24} \]

and

\[ x^1 = P^{\perp}_{\pi} \bar x + \frac{\pi_1 - \hat\pi \bar t - \frac{c}{d}\hat\pi \pi^T \bar x}{\left(1 - \frac{c}{d}\hat\pi\right)\|\pi\|_2^2}\,\pi, \qquad t^1 = \bar t + \frac{c}{d}\left(\pi^T \bar x - \pi^T x^1\right). \tag{25} \]

Indeed, since $\pi^T \bar x + \hat\pi \bar t \in (\pi_0,\pi_1)$, there exists $\alpha \in (0,1)$ such that $\pi^T \bar x + \hat\pi \bar t = \alpha\pi_0 + (1-\alpha)\pi_1$. One can check that $(\bar x, \bar t) = \alpha(x^0,t^0) + (1-\alpha)(x^1,t^1)$. We also have that $(x^0,t^0), (x^1,t^1) \in \operatorname{epi}(F) \cap \{(x,t) : \pi^T x + \hat\pi t \notin (\pi_0,\pi_1)\}$, since $\pi^T x^0 + \hat\pi t^0 = \pi_0$ and $\pi^T x^1 + \hat\pi t^1 = \pi_1$, and because interpolation conditions (15a) and (15b) hold.

Case 2. If $|\pi_0| \neq |\pi_1|$, one can check that for

\[ x^* = \frac{-b}{a\|\pi\|_2^2}\,\pi, \qquad t^* = \frac{bc - ae}{ad}, \]

all points on the ray $R := \{(x^*,t^*) + \alpha(\bar x - x^*, \bar t - t^*) : \alpha \in \mathbb{R}_+\}$ belong to $\operatorname{epi}(G)$. Let the intersections of $R$ with the hyperplanes $\pi^T x + \hat\pi t = \pi_0$ and $\pi^T x + \hat\pi t = \pi_1$ be $(x^0,t^0)$ and $(x^1,t^1)$, respectively. Such points are obtained from $R$ by setting

\[ \alpha_0 = \frac{\pi_0 + h}{\pi^T \bar x + \hat\pi \bar t + h} \qquad \text{and} \qquad \alpha_1 = \frac{\pi_1 + h}{\pi^T \bar x + \hat\pi \bar t + h}, \]

respectively, where $h = -\frac{2\pi_0\pi_1}{\pi_0 + \pi_1}$. We again have that $(x^0,t^0), (x^1,t^1) \in \operatorname{epi}(F) \cap \{(x,t) : \pi^T x + \hat\pi t \notin (\pi_0,\pi_1)\}$, since $\pi^T x^0 + \hat\pi t^0 = \pi_0$ and $\pi^T x^1 + \hat\pi t^1 = \pi_1$, and because interpolation conditions (15a) and (15b) hold. Now note that $(\bar x, \bar t)$ is obtained from $R$ by setting $\alpha = 1$. If $\alpha_0 < 1 < \alpha_1$ or $\alpha_1 < 1 < \alpha_0$, then there exists $\beta \in (0,1)$ such that $(\bar x, \bar t) = \beta(x^0,t^0) + (1-\beta)(x^1,t^1)$. Seeing that $\pi^T \bar x + \hat\pi \bar t \in (\pi_0,\pi_1)$, $|\pi_0| \neq |\pi_1|$, and $0 \in (\pi_0,\pi_1)$, one can check $\alpha_0 < 1 < \alpha_1$ or $\alpha_1 < 1 < \alpha_0$. The result then follows from Proposition 3.5.

Finally, consider the case $0 \notin (\pi_0,\pi_1)$. We need to show that any point $(\bar x, \bar t) \in \operatorname{epi}(F) \cap \{(x,t) : \pi^T x + \hat\pi t \in (\pi_0,\pi_1)\}$ has friends in $\operatorname{epi}(F) \cap \{(x,t) : \pi^T x + \hat\pi t \notin (\pi_0,\pi_1)\}$. We can construct the friends in a similar way as before by noting that the ray $R := \{\alpha(\bar x, \bar t) : \alpha \in \mathbb{R}_+\}$ is contained in $\operatorname{epi}(F)$ and intersects both $\pi^T x + \hat\pi t = \pi_0$ and $\pi^T x + \hat\pi t = \pi_1$.
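The interpolation conditions (15a) and (15b) for the coefficients of Proposition 3.7 can be probed numerically. The sketch below is only an illustration under assumed data ($\pi = (1,2)$, $\hat\pi = 1$, $\pi_0 = -1$, $\pi_1 = 2$); the function names `cone_split_coefficients` and `cut_residual` are hypothetical. It samples points on the boundary of $LC(I,0)$ that lie on one of the two hyperplanes $\pi^T x + \hat\pi t = \pi_i$ and checks that the squared form of the nonlinear split cut holds there with equality and that its right-hand side $c\pi^T x + dt + e$ is positive.

```python
import math
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cone_split_coefficients(norm_pi_sq, pi_hat, pi0, pi1):
    """Coefficients a, b, c, d, e of Proposition 3.7 (requires pi0 < 0 < pi1, |pi_hat| < ||pi||)."""
    n = norm_pi_sq - pi_hat ** 2
    f = math.sqrt(n * (norm_pi_sq * (pi0 - pi1) ** 2 - (pi0 + pi1) ** 2 * pi_hat ** 2))
    a = (pi0 + pi1) * n / f
    b = -2.0 * pi0 * pi1 * norm_pi_sq / f
    c = -4.0 * pi0 * pi1 * pi_hat / ((pi1 - pi0) * f)
    d = f / ((pi1 - pi0) * n)
    e = 2.0 * pi0 * pi1 * (pi0 + pi1) * pi_hat / ((pi1 - pi0) * f)
    return a, b, c, d, e

def cut_residual(x, t, pi, pi_hat, pi0, pi1):
    """Return (rhs^2 - ||P_pi^perp x + ((a pi^T x + b)/||pi||^2) pi||^2, rhs)."""
    norm_pi_sq = dot(pi, pi)
    a, b, c, d, e = cone_split_coefficients(norm_pi_sq, pi_hat, pi0, pi1)
    s = dot(pi, x)
    # P_pi^perp x = x - (s/||pi||^2) pi, so the vector in the norm is x + (((a-1) s + b)/||pi||^2) pi
    shift = ((a - 1.0) * s + b) / norm_pi_sq
    v = [xi + shift * p for xi, p in zip(x, pi)]
    rhs = c * s + d * t + e
    return rhs * rhs - dot(v, v), rhs

pi, pi_hat, pi0, pi1 = [1.0, 2.0], 1.0, -1.0, 2.0   # assumed sample data

random.seed(2)
worst_residual = 0.0
min_rhs = float("inf")
count = 0
while count < 200:
    theta = random.uniform(0.0, 2.0 * math.pi)
    direction = [math.cos(theta), math.sin(theta)]
    level = random.choice([pi0, pi1])
    denom = dot(pi, direction) + pi_hat
    if abs(denom) < 1e-3 or level / denom <= 0.0:
        continue
    r = level / denom   # (r*direction, r) is on the cone boundary and on pi^T x + pi_hat t = level
    x = [r * direction[0], r * direction[1]]
    residual, rhs = cut_residual(x, r, pi, pi_hat, pi0, pi1)
    worst_residual = max(worst_residual, abs(residual) / (1.0 + r * r))
    min_rhs = min(min_rhs, rhs)
    count += 1
```

A vanishing residual at these points is what one expects if the cut surface meets the cone exactly along its intersections with the two hyperplanes; the sketch does not test the friends construction used in the proof.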

Proposition 3.8 (Simple split cuts for hyperboloids). Let $\pi \in \mathbb{R}^n \setminus \{0\}$, $\pi_0, \pi_1, l \in \mathbb{R}$ such that $\pi_0 < \pi_1$ and $l \neq 0$,

\[ H := \left\{ (x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \sqrt{\|x\|_2^2 + l^2} \le t \right\}. \]

If $|\pi_0| = |\pi_1|$, then

\[ H^{\pi,\pi_0,\pi_1} = \left\{ (x,t) \in H : \left\| P^{\perp}_{\pi} x + \frac{\hat b}{\|\pi\|_2^2}\,\pi \right\|_2 \le t \right\}, \]

where $\hat b = \sqrt{l^2\|\pi\|_2^2 + \pi_1^2}$, and if $|\pi_0| \neq |\pi_1|$, then

\[ H^{\pi,\pi_0,\pi_1} = \left\{ (x,t) \in H : \left\| P^{\perp}_{\pi} x + \frac{a\pi^T x + b}{\|\pi\|_2^2}\,\pi \right\|_2 \le t \right\}, \]

where

\[ a = \frac{\sqrt{2l^2\|\pi\|_2^2 + \pi_0^2 + \pi_1^2 - 2\sqrt{l^2\|\pi\|_2^2 + \pi_0^2}\,\sqrt{l^2\|\pi\|_2^2 + \pi_1^2}}}{\pi_1 - \pi_0}, \qquad b = \frac{l^2\|\pi\|_2^2 - \pi_0\pi_1 + \sqrt{l^2\|\pi\|_2^2 + \pi_0^2}\,\sqrt{l^2\|\pi\|_2^2 + \pi_1^2}}{\pi_0 + \pi_1}\, a. \]

Proof. We first show the case $|\pi_0| = |\pi_1|$. Let $F(x) = \sqrt{\|x\|_2^2 + l^2}$ and $G(x) = \left\| P^{\perp}_{\pi} x + \frac{\hat b}{\|\pi\|_2^2}\,\pi \right\|_2$. Interpolation condition (3) holds by the definition of $\hat b$. Now let $(\bar x, \bar t) \in \operatorname{epi}(G)$ such that $\pi^T \bar x \in (\pi_0,\pi_1)$. We can construct the friends of $(\bar x, \bar t)$ in a similar way as in Case 1 in the proof of Proposition 3.3. The result then follows from Proposition 3.1.

Now consider the case $|\pi_0| \neq |\pi_1|$. Let $G(x) = \left\| P^{\perp}_{\pi} x + \frac{a\pi^T x + b}{\|\pi\|_2^2}\,\pi \right\|_2$. Interpolation condition (3) holds by the definition of $a$ and $b$. Now let $(\bar x, \bar t) \in \operatorname{epi}(G)$ such that $\pi^T \bar x \in (\pi_0,\pi_1)$. We can construct the friends of $(\bar x, \bar t)$ in a similar way as in Case 2 in the proof of Proposition 3.3. The result then follows from Proposition 3.1.

We have partially generalized Proposition 3.8 to general split cuts for hyperboloids. However, in most cases, the resulting formulas are unmanageably complicated, so we did not pursue this further.
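The role of $a$ and $b$ in Proposition 3.8 can be seen in one dimension: writing $s = \pi^T x$, Lemma 3.4 gives $\|x\|_2^2 + l^2 = \|P^{\perp}_{\pi} x\|_2^2 + (s^2 + l^2\|\pi\|_2^2)/\|\pi\|_2^2$, and the cut replaces $s^2 + l^2\|\pi\|_2^2$ by $(as+b)^2$. The sketch below, with a hypothetical function name and assumed sample data, checks that the two expressions agree at $s \in \{\pi_0, \pi_1\}$ and that $(as+b)^2$ is the smaller one for sampled $s \notin (\pi_0,\pi_1)$.

```python
import math

def hyperboloid_cut_coefficients(norm_pi_sq, l, pi0, pi1):
    """Coefficients (a, b) of Proposition 3.8; returns (0, b_hat) when |pi0| = |pi1|."""
    q0 = math.sqrt(l * l * norm_pi_sq + pi0 * pi0)
    q1 = math.sqrt(l * l * norm_pi_sq + pi1 * pi1)
    if abs(pi0) == abs(pi1):
        return 0.0, q1
    radicand = 2.0 * l * l * norm_pi_sq + pi0 ** 2 + pi1 ** 2 - 2.0 * q0 * q1
    a = math.sqrt(max(0.0, radicand)) / (pi1 - pi0)   # guard against tiny negative round-off
    b = (l * l * norm_pi_sq - pi0 * pi1 + q0 * q1) / (pi0 + pi1) * a
    return a, b

def interpolation_gap(s, a, b, norm_pi_sq, l):
    """(a s + b)^2 - (s^2 + l^2 ||pi||^2): zero at the split boundaries, non-positive outside."""
    return (a * s + b) ** 2 - (s * s + l * l * norm_pi_sq)

# Assumed sample data: ||pi||_2^2 = 2, l = 1, and two splits with |pi0| != |pi1|
cases = [(-1.0, 3.0), (-3.0, 1.0)]
boundary_gaps = []
outside_ok = True
for pi0, pi1 in cases:
    a, b = hyperboloid_cut_coefficients(2.0, 1.0, pi0, pi1)
    boundary_gaps += [interpolation_gap(pi0, a, b, 2.0, 1.0),
                      interpolation_gap(pi1, a, b, 2.0, 1.0)]
    for k in range(1, 200):
        for s in (pi0 - 0.1 * k, pi1 + 0.1 * k):
            outside_ok = outside_ok and interpolation_gap(s, a, b, 2.0, 1.0) <= 1e-9

# Symmetric case |pi0| = |pi1|: the cut uses the constant b_hat
a_sym, b_hat = hyperboloid_cut_coefficients(2.0, 1.0, -2.0, 2.0)
```

In the symmetric case the function returns $a = 0$ and $\hat b = \sqrt{l^2\|\pi\|_2^2 + \pi_1^2}$, which matches the first formula of the proposition.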

4

General Intersection Cuts Through Aggregation.

In this section we consider the case in which the base sets are either epigraphs or lower level sets of convex functions and the forbidden sets are hypographs or upper level sets of concave functions. Our cut construction approach in this case is based on a simple aggregation technique, which again can be more naturally explained for epigraphs of specially structured functions. Following the structure of Section 3, we also begin by studying the epigraphical sets and then consider the case of non-epigraphical sets. We also end the section by illustrating the power and limitations of the approach by considering intersection cuts for quadratic constraints.

4.1

Intersection Cuts for Epigraphs.

Let $F, G : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be a convex and a concave function given by $F(z,y) = z^2 + 2y^2$ and $G(z,y) = -(z-1)^2 + 1 - y^2$, and let $\operatorname{epi}(F) = \{(z,y,t) : F(z,y) \le t\}$ and $\operatorname{hyp}(G) = \{(z,y,t) : t \le G(z,y)\}$ be the epigraph of $F$ and the hypograph of $G$, respectively. For $\lambda \in [0,1]$, let $H_\lambda(z,y) = (1-\lambda)F + \lambda G$. As illustrated in Figure 4(a), for any $\lambda \in [0,1]$, we have that $H_\lambda(z,y) \le t$ is a binding valid cut for $\operatorname{epi}(F) \setminus \operatorname{int}(\operatorname{hyp}(G))$. However, depending on the choice of $\lambda$, the inequality could be non-convex, or it could be convex but not sufficient. It is clear from Figure 4(a) that in this case, the correct choice of $\lambda$ is $1/2 = \arg\max\{\lambda \in [0,1] : H_\lambda \text{ is convex}\}$, which yields the strongest convex cut from this class. Furthermore, as illustrated in Figure 4(b), we have that for any $(z,y,t) \in \operatorname{epi}(H_{1/2}) \cap \operatorname{int}(\operatorname{hyp}(G))$, we can find friends in $\operatorname{epi}(F) \setminus \operatorname{int}(\operatorname{hyp}(G))$ by following the slope of $H_{1/2}$ similar to what we did for split cuts of separable functions. We can then show that $\operatorname{conv}(\operatorname{epi}(F) \setminus \operatorname{int}(\operatorname{hyp}(G))) = \operatorname{epi}(F) \cap \operatorname{epi}(H_{1/2})$.
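The choice $\lambda = 1/2$ in this example can be reproduced with exact arithmetic. Since $F$ and $G$ are quadratics without a $zy$ term, the Hessian of $H_\lambda$ is constant and diagonal, so convexity reduces to two sign checks. The sketch below (function names are hypothetical; the grid of candidate values of $\lambda$ is an assumption of the example) finds the largest $\lambda$ on a grid for which $H_\lambda$ is convex and evaluates $H_{1/2}$, which simplifies to $z + y^2/2$.

```python
from fractions import Fraction

def hessian_diagonal(lam):
    """Diagonal of the constant Hessian of H_lam = (1 - lam) F + lam G, in the order (z, y)."""
    hess_F = (2, 4)      # F(z, y) = z^2 + 2 y^2
    hess_G = (-2, -2)    # G(z, y) = -(z - 1)^2 + 1 - y^2
    return tuple((1 - lam) * hf + lam * hg for hf, hg in zip(hess_F, hess_G))

def is_convex(lam):
    return all(entry >= 0 for entry in hessian_diagonal(lam))

# Largest lambda on a grid of step 1/100 for which H_lambda is convex
grid = [Fraction(k, 100) for k in range(101)]
largest_convex = max(lam for lam in grid if is_convex(lam))

def H_half(z, y):
    """H_{1/2}(z, y) = (F(z, y) + G(z, y)) / 2."""
    F = z ** 2 + 2 * y ** 2
    G = -(z - 1) ** 2 + 1 - y ** 2
    return Fraction(1, 2) * (F + G)
```

The $z$-entry of the Hessian is $2 - 4\lambda$ and the $y$-entry is $4 - 6\lambda$, so the binding condition is $\lambda \le 1/2$, in agreement with the text.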

A similar construction can also be obtained if we instead study $\operatorname{conv}(\{(z,y,t) \in \operatorname{epi}(F) : G(z,y) \le 0\})$. $H_\lambda$ and the convexity requirement on it are the basis of many techniques such as Lagrangian/SDP relaxations of quadratic programming problems [35, 57, 60, 61], the QCR method for integer quadratic programming [13, 14], and an algorithm for constructing projected SDP representations of the convex hull of quadratic constraints introduced in [71]. It is hence not surprising that the approach works in the quadratic case. However, as shown in [71], even in the quadratic case the approach can fail to yield convex constraints or closed form expressions. Furthermore, for general functions, $H_\lambda$ can easily be non-convex for every $\lambda$. Fortunately, as the following proposition shows, the approach can yield closed form expressions for general intersection cuts for problems with special structures.

Proposition 4.1. Let $g_i : \mathbb{R} \to \mathbb{R}$ be convex functions for each $i \in [n]$, $m, l \in \mathbb{R}^n$, $r, q \in \mathbb{R}$, and $\gamma \in \mathbb{R}_+$. Furthermore, let $\{a_i\}_{i=1}^n \subseteq \mathbb{R}^n$ be such that $a_n \neq 0$ and $a_i \perp a_j$ for every $i \neq j$, and $\{\alpha_i\}_{i=1}^n \subseteq \mathbb{R}_+$ be such that $0 \neq \alpha_n \ge \alpha_i$ for all $i$. Let

\[ F(x) = \sum_{i=1}^n g_i\left(a_i^T x\right) + m^T x + r, \qquad G(x) = -\sum_{i=1}^n \alpha_i g_i\left(a_i^T x\right) - l^T x - q, \]

$C := \operatorname{epi}(F)$, and $S := \{(x,t) \in \mathbb{R}^n \times \mathbb{R} : \gamma t \le G(x)\}$. If $(1 + \gamma/\alpha_n) > 0$ and

\[ \lim_{|s| \to \infty} \left[ -\alpha_n g_n\left(s\, a_n^T a_n\right) - s\left( l^T a_n + \gamma\, \frac{(m - l/\alpha_n)^T a_n}{1 + \gamma/\alpha_n} \right) \right] = -\infty, \tag{26} \]

then

\[ \operatorname{conv}(C \setminus \operatorname{int}(S)) = \operatorname{conv}(\{(x,t) \in \operatorname{epi}(F) : G(x) \le \gamma t\}) = \operatorname{epi}(F) \cap \operatorname{epi}(H), \tag{27} \]

where

\[ H(x) := \frac{F(x) + (1/\alpha_n) G(x)}{1 + \gamma/\alpha_n} = \frac{\sum_{i=1}^{n-1} (1 - \alpha_i/\alpha_n)\, g_i\left(a_i^T x\right) + (m - l/\alpha_n)^T x + (r - q/\alpha_n)}{1 + \gamma/\alpha_n}. \tag{28} \]

Proof. The first equality in (27) is direct. For the second equality, we proceed as follows. $H$ is a non-negative linear combination of $F$ and $G$ that is also a convex function, from which it is easy to see that the left to right containment holds. To show the right to left containment, let $(\bar x, \bar t) \in \operatorname{epi}(F) \cap \operatorname{epi}(H)$ be such that $G(\bar x) > \gamma \bar t$. Let $k = \frac{(m - l/\alpha_n)^T a_n}{1 + \gamma/\alpha_n}$. Because of (26), there exist $s_1 > 0$ and $s_2 < 0$, for which $(x^i, t^i) = (\bar x + s_i a_n, \bar t + s_i k)$ for $i = 1, 2$ are such that $G(x^i) = \gamma t^i$. Furthermore, by design $(x^i, t^i) \in \operatorname{epi}(H)$ for $i = 1, 2$, which implies $F(x^i) + G(x^i)/\alpha_n \le (1 + \gamma/\alpha_n)\, t^i$ and hence $F(x^i) \le t^i$. The result then follows by noting that $(\bar x, \bar t) \in \operatorname{conv}(\{(x^1,t^1), (x^2,t^2)\})$.
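The friends construction in this proof can be traced on the example of Section 4.1. The sketch below rests on a reading of that example in the notation of Proposition 4.1 that is an assumption of this illustration: $a_n = (1,0)$ in the coordinates $(z,y)$, $\alpha_n = 1$, $\gamma = 1$, $m = 0$ and $l = (-2, 0)$, which gives $k = 1$ and $H = (F+G)/2$. Starting from a point of $\operatorname{epi}(F) \cap \operatorname{epi}(H)$ with $G > t$, it moves along the direction $(a_n, k)$ in both senses until $G = t$ and returns the two end points. The function name `friends` is hypothetical.

```python
import math

def F(z, y):
    return z * z + 2.0 * y * y

def G(z, y):
    return -(z - 1.0) ** 2 + 1.0 - y * y

def H(z, y):
    return 0.5 * (F(z, y) + G(z, y))   # (28) with alpha_n = 1 and gamma = 1

def friends(z, y, t):
    """Points (z + s, y, t + s) with G(z + s, y) = t + s, for the two roots s."""
    # G(z + s, y) - (t + s) = -s^2 + (1 - 2 z) s + (G(z, y) - t)
    lin = 1.0 - 2.0 * z
    const = G(z, y) - t                 # positive for a point in the interior of hyp(G)
    root = math.sqrt(lin * lin + 4.0 * const)
    s_pos, s_neg = (lin + root) / 2.0, (lin - root) / 2.0
    return [(z + s, y, t + s) for s in (s_pos, s_neg)], (s_pos, s_neg)

# A point of epi(F) and epi(H) that lies in the interior of hyp(G)
z0, y0, t0 = 0.5, 0.0, 0.6
points, (s_pos, s_neg) = friends(z0, y0, t0)

# Weight that recovers (z0, y0, t0) as a convex combination of the two friends
beta = -s_neg / (s_pos - s_neg)
recombined = tuple(beta * p + (1.0 - beta) * q for p, q in zip(points[0], points[1]))
```

Because the constant term $G(z,y) - t$ is positive, the quadratic in $s$ has one positive and one negative root, which is the role played by condition (26) in the general statement.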

4.2

Intersection Cuts for Level Sets.

In contrast to the split cut case, we cannot use positive homogeneous functions to play the role of $t$ in non-epigraphical sets. Nevertheless, we can extend the aggregation approach to certain non-epigraphical sets through the following proposition whose proof is a direct analog to that of Proposition 4.1.

Proposition 4.2. Let $g_i : \mathbb{R} \to \mathbb{R}$ be convex functions for each $i \in [n]$, $m \in \mathbb{R}^n$, $r, q \in \mathbb{R}$. Furthermore, let $\{a_i\}_{i=1}^n \subseteq \mathbb{R}^n$ be such that $a_n \neq 0$ and $a_i \perp a_j$ for every $i \neq j$, and $\{\alpha_i\}_{i=1}^n \subseteq \mathbb{R}_+$ be such that $0 \neq \alpha_n \ge \alpha_i$ for all $i$. Let

\[ F(x) = \sum_{i=1}^n g_i\left(a_i^T x\right) + m^T x + r, \qquad G(x) = -\sum_{i=1}^n \alpha_i g_i\left(a_i^T x\right) - \alpha_n m^T x - q, \]

$C := \{x \in \mathbb{R}^n : F(x) \le 0\}$, and $S := \{x \in \mathbb{R}^n : G(x) \ge 0\}$. If

\[ \lim_{|s| \to \infty} \left[ -\alpha_n g_n\left(s\, a_n^T a_n\right) - s\,\alpha_n m^T a_n \right] = -\infty, \tag{29} \]

then

\[ \operatorname{conv}(C \setminus \operatorname{int}(S)) = \operatorname{conv}\left(\left\{ x \in \mathbb{R}^n : F(x) \le 0,\ G(x) \le 0 \right\}\right) = \left\{ x \in \mathbb{R}^n : F(x) \le 0,\ H(x) \le 0 \right\}, \tag{30} \]

where

\[ H(x) := F(x) + (1/\alpha_n) G(x) = \sum_{i=1}^{n-1} (1 - \alpha_i/\alpha_n)\, g_i\left(a_i^T x\right) + (r - q/\alpha_n). \tag{31} \]

The special structure in both of these propositions is extremely simple, but thanks to the symmetry of quadratic constraints, they can be used to get formulas for several quadratic intersection cuts.

4.3

Intersection Cuts for Quadratic Sets.

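Corollary 4.1. Let $B \in \mathbb{R}^{n\times n}$ be an invertible matrix, $A \in \mathbb{R}^{n\times n}$, $c, d \in \mathbb{R}^n$, $q \in \mathbb{R}$, $\gamma \in \mathbb{R}_+$,

\[ LC^2(B,c) := \left\{ (x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \|B(x-c)\|_2^2 \le t \right\}, \]

and

\[ S := \left\{ (x,t) \in \mathbb{R}^n \times \mathbb{R} : \gamma t + q \le -\|A(x-d)\|_2^2 \right\}. \]

Then

\[ \operatorname{conv}\left(LC^2(B,c) \setminus \operatorname{int}(S)\right) = \left\{ (x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \|B(x-c)\|_2^2 \le t,\ \ x^T E x + a^T x + f \le (\alpha_n + \gamma)\, t \right\}, \tag{32} \]

for

\begin{align*}
E &= B^T H B, \qquad a = -2B^T e - 2B^T H B c, \qquad f = c^T B^T H B c + 2\left(B^T e\right)^T c - w - q, \\
H &= \sum_{i=1}^{n-1} (\alpha_n - \alpha_i)\, v_i v_i^T, \qquad e = \sum_{i=1}^{n} \alpha_i \left(v_i^T B(c-d)\right) v_i, \qquad w = \sum_{i=1}^{n} \alpha_i \left(v_i^T B(c-d)\right)^2,
\end{align*}

where $(v_i)_{i=1}^n \subseteq \mathbb{R}^n$ and $(\alpha_i)_{i=1}^n \subseteq \mathbb{R}$ correspond to an eigenvalue decomposition of $B^{-T} A^T A B^{-1}$ so that

\[ B^{-T} A^T A B^{-1} = \sum_{i=1}^n \alpha_i v_i v_i^T, \]

$\|v_i\|_2 = 1$ for all $i$, $v_i^T v_j = 0$ for all $i \neq j$, and $\alpha_n \ge \alpha_i$ for all $i$.

Proof. Let $y = B(x-c)$ and $T := LC^2(B,c) \setminus \operatorname{int}(S)$. Using orthonormality of the vectors $v_i$, $T$ can be written on the $y$ variables as

\[ T = \left\{ (y,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \sum_{i=1}^n \left(v_i^T y\right)^2 \le t,\ \ -\sum_{i=1}^n \alpha_i \left(v_i^T y\right)^2 - 2e^T y - w - q \le \gamma t \right\}. \]

The result then follows by using Proposition 4.1.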
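The coefficients of Corollary 4.1 are easy to evaluate when the eigenvectors are the unit vectors. The sketch below treats only that special case, under assumptions made for the illustration: $B = I$, $c = 0$ and $A^T A = \operatorname{diag}(\alpha)$ with the entries of $\alpha$ sorted increasingly, so that $E = H$ is diagonal. The function names and the sample data are hypothetical. It checks on random points that the quadratic inequality in (32) is valid for $LC^2(I,0) \setminus \operatorname{int}(S)$; it does not check that the inequality describes the convex hull.

```python
import random

def corollary_41_cut(alpha, d, q, gamma):
    """Coefficients of the cut in (32) for B = I, c = 0, A^T A = diag(alpha), alpha increasing.

    Returns (E_diag, a, f, scale) for the inequality
    sum_i E_diag[i] x_i^2 + a^T x + f <= scale * t.
    """
    alpha_n = alpha[-1]
    delta = [-di for di in d]                              # B (c - d) = -d
    e = [al * de for al, de in zip(alpha, delta)]
    w = sum(al * de * de for al, de in zip(alpha, delta))
    E_diag = [alpha_n - al for al in alpha]                # H is diagonal and E = H
    a = [-2.0 * ei for ei in e]                            # the term -2 B^T H B c vanishes for c = 0
    f = -w - q
    return E_diag, a, f, alpha_n + gamma

def cut_value(x, t, coeffs):
    """Left-hand side minus right-hand side of the cut; non-positive when the cut holds."""
    E_diag, a, f, scale = coeffs
    quad = sum(Ei * xi * xi for Ei, xi in zip(E_diag, x))
    lin = sum(ai * xi for ai, xi in zip(a, x))
    return quad + lin + f - scale * t

# Assumed sample data
alpha, d, q, gamma = [0.5, 2.0], [0.3, -0.4], -1.5, 0.5
coeffs = corollary_41_cut(alpha, d, q, gamma)

random.seed(3)
checked = 0
valid = True
for _ in range(20000):
    x = [random.uniform(-2, 2), random.uniform(-2, 2)]
    t = random.uniform(0, 8)
    in_base = sum(xi * xi for xi in x) <= t
    ellipsoid_term = sum(al * (xi - di) ** 2 for al, xi, di in zip(alpha, x, d))
    outside_int_S = gamma * t + q >= -ellipsoid_term
    if in_base and outside_int_S:
        checked += 1
        valid = valid and cut_value(x, t, coeffs) <= 1e-9

# Unit ball centered at the origin with gamma = 0: alpha = (1, 1), d = 0, q = -1
unit_ball_cut = corollary_41_cut([1.0, 1.0], [0.0, 0.0], -1.0, 0.0)
```

For the unit ball the quadratic and linear parts vanish and the cut reduces to $1 \le t$.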

An interesting case of Corollary 4.1 arises when $\gamma = 0$. In this case, the base set $C$ corresponds to a paraboloid and the forbidden set $S$ corresponds to an ellipsoidal cylinder. In such a case, the minimization of $t$ over $(x,t) \in C \setminus \operatorname{int}(S)$ is equivalent to the minimization of a convex quadratic function outside an ellipsoid, which corresponds to the simplest indefinite version of the well known trust region problem. While this is a non-convex optimization problem, it can be solved in polynomial time through Lagrangian/SDP approaches [60]. It is known that optimal dual multipliers of an SDP relaxation of a non-convex quadratic programming problem such as the trust region problem can be used to construct a finite convex quadratic optimization problem with the same optimal value as the original non-convex problem (e.g. [36]). Furthermore, the complete feasible region induced by an SDP relaxation on the original variable space (in this case $(x,t)$) can be characterized by an infinite number of convex quadratic constraints [48]. This characterization has recently been simplified for the feasible region of the trust region problem in [12]. This work gives a semi-infinite characterization of $T$ for $\gamma = 0$ composed of the convex quadratic constraint $\|B(x-c)\|_2^2 \le t$ plus an infinite number of linear inequalities that can be separated in polynomial time. Corollary 4.1 shows that these linear inequalities can be subsumed by a single convex quadratic constraint, which gives another explanation for their polynomial time separability. We note that the techniques in [12] are also adapted to other non-convex optimization problems (both quadratic and non-quadratic). Hence, combining Corollary 4.1 with these techniques could yield valid convex quadratic inequalities for more general non-convex problems.

Another interesting application of Corollary 4.1 for the case $\gamma = 0$ is the Shortest Vector Problem (SVP) [53] of the form $\min\left\{\|Bx\|_2^2 : x \in \mathbb{Z}^n \setminus \{0\}\right\}$. Similar to the Closest Vector Problems (CVP) studied in Section 3.3.1, we can transform this problem to $\min_{(x,t) \in Q \cap (\mathbb{Z}^n \times \mathbb{R})} t$ for

\[ Q = \left\{ (x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \|Bx\|_2^2 \le t,\ x \neq 0 \right\}, \]

so that we can strengthen the problem by generating valid inequalities for $Q$. Unfortunately, as the following simple lemma shows, traditional split cuts will not add any strength.

Lemma 4.1. Let $Q_0 = Q \cup \{(0,0)\}$. For any $B \in \mathbb{R}^{n\times n}$,

\[ t^* = \min\left\{ t : (x,t) \in \bigcap_{(\pi,\pi_0) \in \mathbb{Z}^n \times \mathbb{Z}} Q_0^{\pi,\pi_0,\pi_0+1} \right\} = 0. \]

Proof. Note that for all integer splits $(\pi,\pi_0) \in \mathbb{Z}^n \times \mathbb{Z}$, $(\bar x, \bar t) = (0,0)$ belongs to one side of the disjunctions. Thus, we have $t^* \le 0$ and the result follows from non-negativity of the norm.

However, we can easily construct near lattice free ellipsoids centered at $0$ that do not contain any point from $\mathbb{Z}^n \setminus \{0\}$ in their interior, and use them to get some bound improvement. For instance, in the trivial case of $B = I$, Corollary 4.1 applied to the single near lattice free ellipsoid given by the unit ball $\{x \in \mathbb{R}^n : \|x\|_2 \le 1\}$ yields a cut that provides the optimal value $t^* = 1$. Similar ellipsoids could be used to generate strong convex quadratic valid inequalities for non-trivial cases to significantly speed up the solution of SVP problems. Studying the effectiveness of these cuts is left for future research.

We end this section with a brief discussion about the strength and possible extensions of the aggregation technique. For this, we begin by presenting the following corollary of Proposition 4.2 whose proof is analogous to that of Corollary 4.1.

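Corollary 4.2. Let $B \in \mathbb{R}^{n\times n}$ be an invertible matrix, $A \in \mathbb{R}^{n\times n}$, $c \in \mathbb{R}^n$, $r_1, r_2 \in \mathbb{R}_+$,

\[ E^2(B,c,r_1) := \left\{ x \in \mathbb{R}^n : \|B(x-c)\|_2^2 \le r_1 \right\}, \]

and

\[ S := \left\{ x \in \mathbb{R}^n : \|A(x-c)\|_2^2 \le r_2 \right\}. \]

Then there exist a positive semi-definite matrix $E \in \mathbb{R}^{n\times n}$, $a \in \mathbb{R}^n$, and $f \in \mathbb{R}$ such that

\[ \operatorname{conv}\left(E^2(B,c,r_1) \setminus \operatorname{int}(S)\right) = \left\{ x \in \mathbb{R}^n : \|B(x-c)\|_2^2 \le r_1,\ \ x^T E x + a^T x + f \le 0 \right\}. \tag{33} \]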

Corollary 4.2 shows how to construct the convex hull of the set obtained by removing an ellipsoid or an ellipsoidal cylinder from an ellipsoid. However, this construction only works if the ellipsoids have a common center $c$. The following example shows how the construction can fail for non-common centers. In addition, the example shows that the aggregation technique does not subsume the interpolation technique and sheds some light into the relationship between Corollaries 4.1 and 4.2 and SDP relaxations for quadratic programming.

Example 4.1. Consider the set $C = \{(z,y) : z^2 + y^2 \le 4\}$ and the split disjunction $z \le 0 \vee z \ge 1$. From Corollary 3.5, we have that

\begin{align*}
C^{0,1} &:= \operatorname{conv}\left(\{(z,y) \in C : z \le 0\} \cup \{(z,y) \in C : z \ge 1\}\right) \\
&= \left\{ (z,y) : z^2 + y^2 \le 4,\ |y| \le \left(\sqrt{3} - 2\right) z + 2 \right\}.
\end{align*}

Now let $F(z,y) = z^2 + y^2 - 4$ and $G(z,y) = -(z - 1/2)^2 + 1/4$. Since split disjunction $z \le 0 \vee z \ge 1$ is equivalent to $G(z,y) \le 0$, we have

\[ C^{0,1} = \operatorname{conv}\left(\left\{ (z,y) \in \mathbb{R}^2 : F(z,y) \le 0,\ G(z,y) \le 0 \right\}\right). \tag{34} \]

Now consider $H_\lambda = (1-\lambda)F + \lambda G$. One can check that the split cut $|y| \le \left(\sqrt{3} - 2\right) z + 2$ obtained through Corollary 3.5 can be equivalently written as

\begin{align}
y^2 - \left(\left(\sqrt{3} - 2\right) z + 2\right)^2 &\le 0, \tag{35a}\\
\left(\sqrt{3} - 2\right) z + 2 &\ge 0. \tag{35b}
\end{align}

In turn, (35a) is equivalent to $H_{\lambda^*} \le 0$ for $\lambda^* = \frac{4}{33}\left(6 - \sqrt{3}\right)$ because $H_{\lambda^*} / \left(\frac{1}{33}\left(9 + 4\sqrt{3}\right)\right) = y^2 - \left(\left(\sqrt{3} - 2\right) z + 2\right)^2$. By noting that (35b) holds for $C$, we conclude that

\[ C^{0,1} = \left\{ (z,y) \in \mathbb{R}^2 : z^2 + y^2 \le 4,\ H_{\lambda^*}(z,y) \le 0 \right\}. \tag{36} \]

Unfortunately, $H_{\lambda^*}$ is not a convex function, so it does not fit in the aggregation framework described in this section. In particular, $H_{\lambda^*}$ is an indefinite quadratic function so it cannot be obtained from an SDP relaxation of $\left\{(z,y) \in \mathbb{R}^2 : F(z,y) \le 0,\ G(z,y) \le 0\right\}$. Indeed, we can show that the SDP relaxation of $\left\{(z,y) \in \mathbb{R}^2 : F(z,y) \le 0,\ G(z,y) \le 0\right\}$ strictly contains $C^{0,1}$. Finally, while we can obtain $H_{\lambda^*}$ through a procedure described in [71], this procedure requires the execution of a numerical algorithm and does not give closed form expressions such as those provided by Corollary 3.5.
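The value of $\lambda^*$ in Example 4.1 and the indefiniteness of $H_{\lambda^*}$ can be confirmed numerically. The sketch below (all names are chosen for this illustration) compares $H_{\lambda^*}$ with the scaled squared split cut on a grid of points and reads off the coefficients of $z^2$ and $y^2$ in $H_{\lambda^*}$, which are $1 - 2\lambda^*$ and $1 - \lambda^*$.

```python
import math

SQRT3 = math.sqrt(3.0)
LAMBDA_STAR = 4.0 * (6.0 - SQRT3) / 33.0
SCALE = (9.0 + 4.0 * SQRT3) / 33.0

def F(z, y):
    return z * z + y * y - 4.0

def G(z, y):
    return -(z - 0.5) ** 2 + 0.25

def H(z, y, lam=LAMBDA_STAR):
    return (1.0 - lam) * F(z, y) + lam * G(z, y)

def squared_split_cut(z, y):
    """Left-hand side of (35a)."""
    return y * y - ((SQRT3 - 2.0) * z + 2.0) ** 2

# Largest deviation between H_{lambda*} and SCALE * (35a) on a grid over [-3, 3]^2
max_err = 0.0
for i in range(-30, 31):
    for j in range(-30, 31):
        z, y = 0.1 * i, 0.1 * j
        max_err = max(max_err, abs(H(z, y) - SCALE * squared_split_cut(z, y)))

# Coefficients of z^2 and y^2 in H_{lambda*}
z2_coeff = 1.0 - 2.0 * LAMBDA_STAR
y2_coeff = 1.0 - LAMBDA_STAR
```

A negative $z^2$ coefficient together with a positive $y^2$ coefficient is what makes $H_{\lambda^*}$ an indefinite quadratic, as stated in the example.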

5

APPENDIX: Omitted proof and auxiliary lemmas.

Lemma 3.1. Let $p \in \mathbb{N}$, $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < 0 < \pi_1$, $a = \frac{\pi_0 + \pi_1}{\pi_1 - \pi_0}$, and $b = -\frac{2\pi_1\pi_0}{\pi_1 - \pi_0}$.

• If $s \notin [\pi_0,\pi_1]$, then $|as + b|^p < |s|^p$,
• if $s \in \{\pi_0,\pi_1\}$, then $|as + b|^p = |s|^p$, and
• if $s \in (\pi_0,\pi_1)$, then $|as + b|^p > |s|^p$.

Proof. We prove the lemma for $p = 2$ only. The same result follows for any $p \in \mathbb{N}$ by taking the appropriate power of both sides of the inequalities. Let $p = 2$. Consider the quadratic equation $(as + b)^2 - s^2 = 0$. One can see that $s = \pi_0$ and $s = \pi_1$ solve the equation above. With some rearrangements, this quadratic equation can be written as $(a^2 - 1)s^2 + 2abs + b^2 = 0$, where $a^2 - 1 = \frac{4\pi_0\pi_1}{(\pi_1 - \pi_0)^2} < 0$. The result follows from noting that $(a^2 - 1)s^2 + 2abs + b^2 < 0$ for $s \notin [\pi_0,\pi_1]$, and $(a^2 - 1)s^2 + 2abs + b^2 > 0$ for $s \in (\pi_0,\pi_1)$.

Lemma 3.3. Let $c \in \mathbb{R}^n$, $\pi \in \mathbb{R}^n \setminus \{0\}$, $\pi_0, \pi_1 \in \mathbb{R}$ such that $\pi_0 < \pi_1$, $B \in \mathbb{R}^{n\times n}$ be an invertible matrix, $\tilde\pi = B^{-T}\pi$, $\tilde\pi_0 = \pi_0 - \pi^T c$, $\tilde\pi_1 = \pi_1 - \pi^T c$, and $f, g, \tilde f, \tilde g : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ be proper closed convex functions such that $f(x) = \tilde f(B(x-c))$ and $g(x) = \tilde g(B(x-c))$. If

\[ \operatorname{epi}(\tilde f)^{\tilde\pi,\tilde\pi_0,\tilde\pi_1} = \operatorname{epi}(\tilde f) \cap \operatorname{epi}(\tilde g), \]

then $\operatorname{epi}(f)^{\pi,\pi_0,\pi_1} = \operatorname{epi}(f) \cap \operatorname{epi}(g)$. Similarly, if

\[ \{x \in \mathbb{R}^n : \tilde f(x) \le 0\}^{\tilde\pi,\tilde\pi_0,\tilde\pi_1} = \{x \in \mathbb{R}^n : \tilde f(x) \le 0,\ \tilde g(x) \le 0\}, \]

then $\{x \in \mathbb{R}^n : f(x) \le 0\}^{\pi,\pi_0,\pi_1} = \{x \in \mathbb{R}^n : f(x) \le 0,\ g(x) \le 0\}$.

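Proof. We prove the first case only. The second case is analogous. Let

\[ S_0 := \left\{ (x,t) \in \operatorname{epi}(f) : \pi^T x \le \pi_0 \right\}, \qquad S_1 := \left\{ (x,t) \in \operatorname{epi}(f) : \pi^T x \ge \pi_1 \right\}, \]

and

\[ \tilde S_0 := \left\{ (x,t) \in \operatorname{epi}(\tilde f) : \tilde\pi^T x \le \tilde\pi_0 \right\}, \qquad \tilde S_1 := \left\{ (x,t) \in \operatorname{epi}(\tilde f) : \tilde\pi^T x \ge \tilde\pi_1 \right\}. \]

By definition, $\operatorname{epi}(f)^{\pi,\pi_0,\pi_1} = \operatorname{conv}(S_0 \cup S_1)$. To show that $\operatorname{conv}(S_0 \cup S_1) \subseteq \operatorname{epi}(f) \cap \operatorname{epi}(g)$, take $(\bar x, \bar t) \in \operatorname{conv}(S_0 \cup S_1)$. There exist $(x^0,t^0) \in S_0$ and $(x^1,t^1) \in S_1$ such that $(\bar x, \bar t) = \alpha(x^0,t^0) + (1-\alpha)(x^1,t^1)$ for some $\alpha \in [0,1]$. Then $\left(B(x^i - c), t^i\right) \in \tilde S_i$ for $i \in \{0,1\}$ and hence

\[ \alpha\left(B(x^0 - c), t^0\right) + (1-\alpha)\left(B(x^1 - c), t^1\right) = \left(B(\bar x - c), \bar t\right) \in \operatorname{epi}(\tilde f)^{\tilde\pi,\tilde\pi_0,\tilde\pi_1}. \]

Then by assumption, $\left(B(\bar x - c), \bar t\right) \in \operatorname{epi}(\tilde f) \cap \operatorname{epi}(\tilde g)$ and the result follows from the definition of $f$ and $g$.

For the reverse inclusion, take $(\bar x, \bar t) \in \operatorname{epi}(f) \cap \operatorname{epi}(g)$. Then $\left(B(\bar x - c), \bar t\right) \in \operatorname{epi}(\tilde f) \cap \operatorname{epi}(\tilde g)$ and by assumption, there exist $(x^0,t^0) \in \tilde S_0$ and $(x^1,t^1) \in \tilde S_1$ such that $\left(B(\bar x - c), \bar t\right) = \alpha(x^0,t^0) + (1-\alpha)(x^1,t^1)$ for some $\alpha \in [0,1]$. Thus, $\bar x = B^{-1}\left(\alpha x^0 + (1-\alpha)x^1\right) + c$ and $\alpha t^0 + (1-\alpha)t^1 = \bar t$. The result then follows by noting that $\left(B^{-1}x^0 + c, t^0\right) \in S_0$ and $\left(B^{-1}x^1 + c, t^1\right) \in S_1$.

Lemma 5.1. Let $\pi \in \mathbb{R}^n$, $\pi_0, \pi_1, \hat\pi \in \mathbb{R}$ such that $\pi_0 < \pi_1$ and $\hat\pi \neq 0$. Also define

\[ LC^2(I,0) := \left\{ (x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : \|x\|_2^2 \le t \right\}, \]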

S0 := {(x, t) ∈ LC 2 (I, 0) : π T x + π ˆ t ≤ π0 }, and S1 := {(x, t) ∈ LC 2 (I, 0) : π T x + π ˆ t ≥ π1 }. • If π ˆ > 0 and π0 < π1 ≤ • if π ˆ > 0 and π0 < • if π ˆ > 0 and

−kπk22 4ˆ π

−kπk22 4ˆ π

• if π ˆ < 0 and

−kπk22 4ˆ π

then S0 = ∅ and S1 = LC 2 (I, 0),

< π1 , then S0 = ∅, S1 ( LC 2 (I, 0), and S1 6= ∅,

≤ π0 < π1 , then S0 , S1 ( LC 2 (I, 0) and S0 , S1 6= ∅, −kπk22 4ˆ π ,

• if π ˆ < 0 and π0 < π1 ≤ • if π ˆ < 0 and π0
0 and π0 < π1 ≤ 4ˆπ 2 . If kπk2 = 0, the result follows from non-negativity of t. Now 2 assume that kπk2 6= 0. Note that if S0 6= ∅, one can find (¯ x, t¯0 ) ∈ S0 such that π T x ¯ / kπk22 ≤ 2 π0 − π T x ¯ /ˆ π . Therefore, we prove S0 = ∅ by showing that π T x / kπk22 > π0 − π T x /ˆ π . This follows from noting that for y ∈ R, the quadratic equation LC 2 (I, 0),

πT x

y2 kπk22

=

π0 −y π ˆ

does not have any solution.

To prove S1 = we show that +π ˆ t ≥ π1 is a valid inequality for LC 2 (I, 0). This π1 −y y2 comes from the fact that the quadratic equation kπk has at most a single solution and as 2 = π ˆ 2 2 a result, we have π1 − π T x /ˆ π ≤ π T x / kπk22 ≤ t. −kπk22 4ˆ π

< π1 . Proving S0 = ∅ is analogous kπk22 to the previous case. We have S1 ( since (¯ x, t¯0 ) = −π , ∈ LC 2 (I, 0), but 2ˆ π 4ˆ π2 n o 2 π1 −π T x ¯ n ¯ ¯ (¯ x, t0 ) ∈ / S1 . To prove S1 6= ∅, one can check that for any x ¯ ∈ R and t0 = Max k¯ xk2 , πˆ , ¯ (¯ x, t0 ) ∈ S1 . Now consider the second case that π ˆ > 0 and π0 < LC 2 (I, 0),

−kπk2

Finally, consider the third case that π ˆ > 0 and 4ˆπ 2 ≤ π0 < π1 . To prove S0 , S1 ( LC 2 (I, 0), kπk2 one can see that (¯ x, t¯0 ) = −π , π0 +π1 + 22 ∈ LC 2 (I, 0), but (¯ x, t¯0 ) ∈ / S0 ∪ S1 . Proving S1 = 2ˆ π

2ˆ π

2ˆ π

∅ is analogous to the previous case. Now we prove S0 6= ∅. If kπk 2 = 0, onecan note that yˆ π0 −ˆ y (¯ x, t¯0 ) = (0, 0) ∈ S0 . If kπk2 6= 0, one can check that (¯ x, t¯0 ) = kπk ∈ S0 , where 2 π, π ˆ 2 √ 4 2 2 −kπk2 − kπk2 +4kπk2 π0 π ˆ y2 π0 −y yˆ = is a solution to the quadratic equation kπk 2 = 2ˆ π π ˆ . 2


Lemma 5.2. Let π ∈ Rn , π0 , π1 , π ˆ ∈ R such that π0 < 0 < π1 and π ˆ 6= 0. Also define LC (I, 0) := {(x, t) ∈ Rn × R+ : kxk2 ≤ t} , S0 := {(x, t) ∈ LC (I, 0) : π T x + π ˆ t ≤ π0 }, and S1 := {(x, t) ∈ LC (I, 0) : π T x + π ˆ t ≥ π1 }. • If π ˆ ≥ kπk2 , then S0 = ∅, S1 ( LC (I, 0), and S1 6= ∅, • if π ˆ ≤ − kπk2 , then S1 = ∅, S0 ( LC (I, 0), and S0 6= ∅, • if π ˆ ∈ (− kπk2 , kπk2 ), then S0 , S1 ( LC (I, 0) and S0 , S1 6= ∅. Proof. First, we prove the case that π ˆ ≥ kπk2 . If kπk2 = 0, the result follows from non-negativity of t. Now assume that kπk2 6= 0. Note that if S0 6= ∅, one can find (¯ x, t¯0 ) ∈ S0 such that 2 2 2 2 2 T T ¯ /ˆ π . Therefore, we prove S0 = ∅ by showing that π T x / kπk22 > π x ¯ / kπk2 ≤ π0 − π x 2 2 π0 − π T x /ˆ π . Note that non-negativity of t and π ˆ , together with π T x + π ˆ t ≤ π0 imply π T x ≤ π0 < 0. One can see that π T x < π0 − π T x < −π T x, where the first inequality follows from π T x ≤ π0 and −π T x > 0, and the second inequality comes from the fact that π0 < 0. Thus, 2 2 1 1 π T x > π0 − π T x and the result follows by noting that kπk . We have S1 ( LC (I, 0), 2 ≥ π ˆ2 2 since (¯ x, t¯0 ) = (0, 0) ∈n LC (I, 0), but x, t¯0 ) ∈ / S1 . To prove S1 6= ∅, one can check that for any o (¯ T π1 −π x ¯ n ¯ ¯ x ¯ ∈ R and t0 = Max k¯ xk2 , , (¯ x, t0 ) ∈ S1 . π ˆ

Now consider the second case that $\hat{\pi} \le -\|\pi\|_2$. Again for $\|\pi\|_2 = 0$, the result follows from non-negativity of $t$. Now assume that $\|\pi\|_2 \neq 0$. Note that if $S_1 \neq \emptyset$, one can find $(\bar{x}, \bar{t}_0) \in S_1$ such that $(\pi^T \bar{x})^2 / \|\pi\|_2^2 \le (\pi_1 - \pi^T \bar{x})^2 / \hat{\pi}^2$. Therefore, we prove $S_1 = \emptyset$ by showing that $(\pi^T x)^2 / \|\pi\|_2^2 > (\pi_1 - \pi^T x)^2 / \hat{\pi}^2$. Note that non-negativity of $t$, $\hat{\pi} < 0$, and $\pi^T x + \hat{\pi} t \ge \pi_1$ imply $\pi^T x \ge \pi_1 > 0$. One can see that $-\pi^T x < \pi_1 - \pi^T x < \pi^T x$, where the first inequality comes from the fact that $\pi_1 > 0$, and the second inequality follows from $\pi_1 \le \pi^T x$ and $-\pi^T x < 0$. Thus, $(\pi^T x)^2 > (\pi_1 - \pi^T x)^2$ and the result follows by noting that $\frac{1}{\|\pi\|_2^2} \ge \frac{1}{\hat{\pi}^2}$. We have $S_0 \subsetneq LC(I, 0)$, since $(\bar{x}, \bar{t}_0) = (0, 0) \in LC(I, 0)$, but $(\bar{x}, \bar{t}_0) \notin S_0$. To prove $S_0 \neq \emptyset$, one can check that for any $\bar{x} \in \mathbb{R}^n$ and $\bar{t}_0 = \max\left\{\|\bar{x}\|_2, \frac{\pi_0 - \pi^T \bar{x}}{\hat{\pi}}\right\}$, $(\bar{x}, \bar{t}_0) \in S_0$.

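Finally, consider the last case that $\hat{\pi} \in (-\|\pi\|_2, \|\pi\|_2)$. Note that $\hat{\pi} \neq 0$ and $\hat{\pi} \in (-\|\pi\|_2, \|\pi\|_2)$ imply $\|\pi\|_2 \neq 0$. We have $S_0, S_1 \subsetneq LC(I, 0)$, since $(\bar{x}, \bar{t}_0) = (0, 0) \in LC(I, 0)$, but $(\bar{x}, \bar{t}_0) \notin S_0 \cup S_1$. To prove $S_0, S_1 \neq \emptyset$, one can check that for

$$x^0 = \frac{\pi_0}{\|\pi\|_2 (\|\pi\|_2 - \hat{\pi})}\,\pi, \qquad t^0 = -\frac{\pi_0}{\|\pi\|_2 - \hat{\pi}},$$

and

$$x^1 = \frac{\pi_1}{\|\pi\|_2 (\|\pi\|_2 + \hat{\pi})}\,\pi, \qquad t^1 = \frac{\pi_1}{\|\pi\|_2 + \hat{\pi}},$$

we have $(x^0, t^0) \in S_0$ and $(x^1, t^1) \in S_1$.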
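The two witness points of the last case can be checked directly. The following Python sketch evaluates the closed-form points and confirms that each lies in the cone and meets its linear inequality with equality. The function names and the numerical data are illustrative assumptions, not taken from the paper.

```python
import math

def lorentz_witnesses(pi, pi_hat, pi0, pi1):
    """Closed-form points for the case pi_hat in (-||pi||_2, ||pi||_2), pi0 < 0 < pi1."""
    norm = math.sqrt(sum(p * p for p in pi))
    if not (pi0 < 0 < pi1 and -norm < pi_hat < norm):
        raise ValueError("data does not satisfy the assumptions of the last case")
    t0 = -pi0 / (norm - pi_hat)
    x0 = [pi0 * p / (norm * (norm - pi_hat)) for p in pi]
    t1 = pi1 / (norm + pi_hat)
    x1 = [pi1 * p / (norm * (norm + pi_hat)) for p in pi]
    return (x0, t0), (x1, t1)

def in_cone(x, t, tol=1e-9):
    """Membership in {(x, t) : ||x||_2 <= t} up to a small tolerance."""
    return math.sqrt(sum(v * v for v in x)) <= t + tol

def linear_form(pi, pi_hat, x, t):
    """Value of pi^T x + pi_hat * t."""
    return sum(p * v for p, v in zip(pi, x)) + pi_hat * t

# Illustrative data: ||pi||_2 = 5 and pi_hat = 1 lies strictly inside (-5, 5).
pi, pi_hat, pi0, pi1 = [3.0, 4.0], 1.0, -2.0, 3.0
(x0, t0), (x1, t1) = lorentz_witnesses(pi, pi_hat, pi0, pi1)
v0 = linear_form(pi, pi_hat, x0, t0)
v1 = linear_form(pi, pi_hat, x1, t1)
print(in_cone(x0, t0), v0, in_cone(x1, t1), v1)
```

Both points sit on the boundary of the cone, since $\|x^0\|_2 = t^0$ and $\|x^1\|_2 = t^1$, and the linear form equals $\pi_0$ and $\pi_1$ respectively.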

References

[1] T. Achterberg, SCIP: solving constraint integer programs, Mathematical Programming Computation 1 (2009), 1–41.
[2] K. Andersen, G. Cornuéjols, and Y. Li, Split closure and intersection cuts, Mathematical Programming 102 (2005), 457–493.
[3] K. Andersen and A. N. Jensen, Intersection cuts for mixed integer conic quadratic sets, 16th international IPCO Conference, Valparaiso (M. Goemans and J. Correa, eds.), Lecture Notes in Computer Science, Springer, 2013, pp. 37–48.
[4] K. Andersen, Q. Louveaux, and R. Weismantel, An analysis of mixed integer linear sets based on lattice point free convex sets, Mathematics of Operations Research 35 (2010), 233–256.
[5] M. F. Anjos and J. B. Lasserre (eds.), Handbook on semidefinite, conic and polynomial optimization, International Series in Operations Research & Management Science, vol. 166, Springer, 2012.
[6] A. Atamtürk and V. Narayanan, Cuts for conic mixed-integer programming, IPCO (M. Fischetti and D. P. Williamson, eds.), LNCS, vol. 4513, Springer, 2007, pp. 16–29.
[7] A. Atamtürk and V. Narayanan, Conic mixed-integer rounding cuts, Mathematical Programming 122 (2010), 1–20.

[8] E. Balas, Intersection cuts - a new type of cutting planes for integer programming, Operations Research 19 (1971), 19–39.
[9] E. Balas and F. Margot, Generalized intersection cuts and a new cut generating paradigm, Mathematical Programming 137 (2013), 19–35.
[10] P. Belotti, J. C. Góez, I. Pólik, T. K. Ralphs, and T. Terlaky, A conic representation of the convex hull of disjunctive sets and conic cuts for integer second order cone optimization, Optimization Online (2012).
[11] P. Belotti, J. C. Góez, I. Pólik, T. K. Ralphs, and T. Terlaky, On families of quadratic surfaces having fixed intersections with two hyperplanes, Optimization Online (2012).
[12] D. Bienstock and A. Michalka, Strong formulations for convex functions over nonconvex sets, Optimization Online (2011).
[13] A. Billionnet, S. Elloumi, and A. Lambert, Extending the QCR method to general mixed-integer programs, Mathematical Programming 131 (2012), 381–401.
[14] A. Billionnet, S. Elloumi, and M.C. Plateau, Improving the performance of standard solvers for quadratic 0-1 programs by a tight convex reformulation: The QCR method, Discrete Applied Mathematics 157 (2009), 1185–1197.
[15] R. Bixby and E. Rothberg, Progress in computational mixed integer programming - a look back from the other side of the tipping point, Annals of Operations Research 149 (2007), 37–41.
[16] R.E. Bixby, M. Fenelon, Z. Gu, E. Rothberg, and R. Wunderling, Mixed-integer programming: a progress report, The sharpest cut: the impact of Manfred Padberg and his work, SIAM, Philadelphia, PA, 2004, pp. 309–326.
[17] P. Bonami, Lift-and-project cuts for mixed integer convex programs, in Günlük and Woeginger [41], pp. 52–64.

[18] C. Buchheim, A. Caprara, and A. Lodi, An effective branch-and-bound algorithm for convex quadratic integer programming, in Eisenbrand and Shepherd [34], pp. 285–298.
[19] C. Buchheim, A. Caprara, and A. Lodi, An effective branch-and-bound algorithm for convex quadratic integer programming, Mathematical Programming 135 (2012), 369–395.
[20] M. T. Çezik and G. Iyengar, Cuts for mixed 0-1 conic programming, Mathematical Programming 104 (2005), 179–202.
[21] V. Chvátal, Edmonds polytopes and a hierarchy of combinatorial problems, Discrete Mathematics 4 (1973), 305–337.
[22] M. Conforti, G. Cornuéjols, and G. Zambelli, Polyhedral approaches to mixed integer linear programming, 50 Years of Integer Programming 1958-2008 (2010), 343–385.
[23] M. Conforti, G. Cornuéjols, and G. Zambelli, Corner polyhedron and intersection cuts, Surveys in Operations Research and Management Science 16 (2011), 105–120.
[24] W. J. Cook, R. Kannan, and A. Schrijver, Chvátal closures for mixed integer programming problems, Mathematical Programming 47 (1990), 155–174.
[25] G. Cornuéjols, Valid inequalities for mixed integer linear programs, Mathematical Programming 112 (2008), 3–44.
[26] D. Dadush, S. S. Dey, and J. P. Vielma, The Chvátal-Gomory closure of a strictly convex body, Mathematics of Operations Research 36 (2011), 227–239.
[27] D. Dadush, S. S. Dey, and J. P. Vielma, On the Chvátal-Gomory closure of a compact convex set, in Günlük and Woeginger [41], pp. 130–142.
[28] D. Dadush, S. S. Dey, and J. P. Vielma, The split closure of a strictly convex body, Operations Research Letters 39 (2011), 121–126.
[29] S. Dash, O. Günlük, and C. Raack, A note on the MIR closure and basic relaxations of polyhedra, Operations Research Letters 39 (2011), 198–199.
[30] S. Dash, O. Günlük, and J. P. Vielma, Computational experiments with cross and crooked cross cuts, Optimization Online (2011).
[31] A. Del Pia and R. Weismantel, Relaxations of mixed integer sets from lattice-free polyhedra, 4OR: A Quarterly Journal of Operations Research 10 (2012), 1–24.
[32] S. S. Dey and J. P. Vielma, The Chvátal-Gomory closure of an ellipsoid is a polyhedron, in Eisenbrand and Shepherd [34], pp. 327–340.
[33] S. Drewes, Mixed integer second order cone programming, Ph.D. thesis, Technische Universität Darmstadt, 2009.
[34] F. Eisenbrand and F. B. Shepherd (eds.), Proceedings of the 14th IPCO Conference, Lausanne, Switzerland, 2010, LNCS, vol. 6080, Springer, 2010.

[35] T. Fujie and M. Kojima, Semidefinite programming relaxation for nonconvex quadratic programs, Journal of Global Optimization 10 (1997), 367–380.
[36] M. Giandomenico, A. N. Letchford, F. Rossi, and S. Smriglio, A new approach to the stable set problem based on ellipsoids, in Günlük and Woeginger [41], pp. 223–234.
[37] R. E. Gomory, Outline of an algorithm for integer solutions to linear programs, Bulletin of the American Mathematical Society 64 (1958), 275–278.
[38] R. E. Gomory, Some polyhedra related to combinatorial problems, Linear Algebra and its Applications 2 (1969), 451–558.
[39] R. E. Gomory and E. L. Johnson, Some continuous functions related to corner polyhedra, Mathematical Programming 3 (1972), 23–85.
[40] J. Gouveia and R. Thomas, Convex hulls of algebraic sets, in Anjos and Lasserre [5], pp. 113–138.
[41] O. Günlük and G. J. Woeginger (eds.), Proceedings of the 15th IPCO Conference, New York, NY, 2011, LNCS, vol. 6655, Springer, 2011.
[42] J. W. Helton and J. Nie, Semidefinite representation of convex sets and convex hulls, in Anjos and Lasserre [5], pp. 77–112.
[43] D. Henrion, Semidefinite representation of convex hulls of rational varieties, Acta Applicandae Mathematicae 115 (2011), 319–327.
[44] R. Horst and H. Tuy, Global optimization: Deterministic approaches, Springer, 2003.
[45] E. L. Johnson, G. L. Nemhauser, and M. W. P. Savelsbergh, Progress in linear programming-based algorithms for integer programming: An exposition, INFORMS Journal on Computing 12 (2000), 2–23.
[46] M. R. Kılınç, J. Linderoth, and J. Luedtke, Effective separation of disjunctive cuts for convex mixed integer nonlinear programs, Optimization Online (2010).
[47] M. R. Kılınç, S. Modaresi, and J. P. Vielma, Split cuts for conic programming, 9th Mixed Integer Programming Workshop (MIP 2012), July 16–19, 2012, Davis, CA, Poster, 2012.
[48] M. Kojima and L. Tunçel, Cones of matrices and successive convex relaxations of nonconvex sets, SIAM Journal on Optimization 10 (2000), 750–778.
[49] J.B. Lasserre, Global optimization with polynomials and the problem of moments, SIAM Journal on Optimization 11 (2001), 796–817.
[50] A. Lodi, Mixed integer programming computation, Springer-Verlag, New York, 2010, pp. 619–645.
[51] L. Lovász, Geometry of numbers and integer programming, Mathematical Programming: Recent Developments and Applications (M. Iri and K. Tanabe, eds.), Kluwer, 1989, pp. 177–210.

[52] H. Marchand and L.A. Wolsey, Aggregation and Mixed Integer Rounding to solve MIPs, Operations Research 49 (2001), 363–371.
[53] D. Micciancio and S. Goldwasser, Complexity of lattice problems: a cryptographic perspective, The Kluwer International Series in Engineering and Computer Science, vol. 671, Kluwer, 2002.
[54] G. L. Nemhauser and L. A. Wolsey, Integer and combinatorial optimization, Wiley, 1988.
[55] G. L. Nemhauser and L. A. Wolsey, A recursive procedure to generate all cuts for 0-1 mixed integer programs, Mathematical Programming 46 (1990), 379–390.
[56] Y. Nesterov, H. Wolkowicz, and Y. Ye, Nonconvex Quadratic Optimization, Handbook of Semidefinite Programming (R. Saigal, L. Vandenberghe, and H. Wolkowicz, eds.), Kluwer Academic Publishers, 2000, pp. 361–420.
[57] C.L.F. Oustry, SDP relaxations in combinatorial optimization from a Lagrangian viewpoint, Advances in Convex Analysis and Global Optimization: Honoring the Memory of C. Caratheodory (1873-1950) 54 (2001), 119–134.
[58] P. A. Parrilo, Semidefinite programming relaxations for semialgebraic problems, Mathematical Programming 96 (2003), no. 2, 293–320.
[59] P. A. Parrilo, 6.972 Algebraic Techniques and Semidefinite Optimization, Massachusetts Institute of Technology: MIT OpenCourseWare, http://ocw.mit.edu (Accessed 07 Feb, 2013). License: Creative Commons BY-NC-SA, Spring 2006.
[60] I. Pólik and T. Terlaky, A survey of the S-lemma, SIAM Review 49 (2007), 371–418.
[61] S. Poljak, F. Rendl, and H. Wolkowicz, A recipe for semidefinite relaxation for (0, 1)-quadratic programming, Journal of Global Optimization 7 (1995), 51–73.
[62] K. Ranestad and B. Sturmfels, The convex hull of a variety, Notions of Positivity and the Geometry of Polynomials (2011), 331–344.
[63] K. Ranestad and B. Sturmfels, On the convex hull of a space curve, Advances in Geometry 12 (2012), 157–178.
[64] R. Sanyal, F. Sottile, and B. Sturmfels, Orbitopes, Mathematika 57 (2011), 275–314.
[65] C. Scheiderer, Convex hulls of curves of genus one, Advances in Mathematics 228 (2011), 2606–2622.
[66] H.D. Sherali and W.P. Adams, A reformulation-linearization technique for solving discrete and continuous nonconvex problems, vol. 31, Springer, 1998.
[67] R. A. Stubbs and S. Mehrotra, A branch-and-cut method for 0-1 mixed convex programming, Mathematical Programming 86 (1999), 515–532.
[68] M. Tawarmalani and N.V. Sahinidis, Convexification and global optimization in continuous and mixed-integer nonlinear programming: theory, algorithms, software, and applications, vol. 65, Springer, 2002.

[69] J. P. Vielma, A constructive characterization of the split closure of a mixed integer linear program, Operations Research Letters 35 (2007), 29–35.
[70] L. A. Wolsey, Integer programming, Wiley, 1998.
[71] U. Yıldıran, Convex hull of two quadratic constraints is an LMI set, IMA Journal of Mathematical Control and Information 26 (2009), 417–450.
[72] U. Yıldıran and I. E. Kose, LMI representations of the convex hulls of quadratic basic semialgebraic sets, Journal of Convex Analysis 17 (2010), 535–551.

Figure 3: Friends construction for positive homogeneous non-separable functions. (a) Level sets of GC in blue and ray r in purple. (b) Graphs of F and GC in black and blue plotted in the direction in which GC is locally linear. (c) Friends by following the slope of GC. (d) Friends by using the fact that epi(GC) is a cone. [Figure omitted; panels (c) and (d) label the points (z, y, t), (z^0, y^0, t^0) and (z^1, y^1, t^1).]

Figure 4: Cuts from aggregation. (a) F in black, G in blue and valid aggregation cuts $H_\lambda$ for $\lambda \in \{1/4, 1/2, 3/4\}$ in red, green and brown. (b) Friends construction by following slope of $H_{1/2}$. [Figure omitted; both panels have axes z and t, and panel (b) labels the points (z, y, t), (z^1, y, t^1) and (z^2, y, t^2).]
