Positive polynomials on unbounded equality ... - Optimization Online

Positive polynomials on unbounded equality-constrained domains

Javier Peña∗1, Juan C. Vera†2, and Luis F. Zuluaga‡3

1 Tepper School of Business, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA, USA 15213
2 Tilburg School of Economics and Management, Tilburg University, Tilburg, The Netherlands
3 Faculty of Business Administration, University of New Brunswick, Fredericton, NB, Canada E3B 5A3

April 14, 2011

Abstract

Certificates of non-negativity are fundamental tools in optimization. A “certificate” is generally understood as an expression that makes the non-negativity of the function in question evident. Some classical certificates of non-negativity are Farkas’ Lemma and the S-lemma. The lift-and-project procedure can be seen as a certificate of non-negativity for affine functions over the union of two polyhedra. These certificates of non-negativity underlie powerful algorithmic techniques for various types of optimization problems. Recently, more elaborate sum-of-squares certificates of non-negativity for higher degree polynomials have been used to obtain powerful numerical techniques for solving polynomial optimization problems, particularly for mixed integer programs and non-convex binary programs. We present a new certificate of non-negativity for polynomials over the intersection of a closed set $S$ and the zero set of a given polynomial $h(x)$. The certificate is written in terms of the set of non-negative polynomials over $S$ and the ideal generated by $h(x)$. Our certificate of non-negativity yields a copositive programming reformulation for a very general class of polynomial optimization problems. This copositive programming formulation generalizes Burer’s copositive formulation for binary programming and offers an avenue for the development of new algorithms to solve polynomial optimization problems. In particular, the copositive formulation could be used to obtain new semidefinite programming relaxations for binary programs, as our approach is different from and complementary to the conventional approaches that obtain semidefinite programming relaxations via matrix relaxations.

1 Introduction

Certificates of non-negativity are fundamental tools in optimization. A “certificate” is generally understood as an expression that makes the non-negativity of the function in question evident. Some classical certificates of non-negativity are Farkas’ Lemma and the S-lemma.

∗ [email protected] (Corresponding author)   † [email protected]   ‡ [email protected]


The former is a certificate of non-negativity for affine functions over a given polyhedral domain; the latter is a certificate of non-negativity for quadratic functions over the sublevel set of a given quadratic function. The Balas-Ceria-Cornuéjols lift-and-project procedure [1] can be seen as a certificate of non-negativity for affine functions over the union of two polyhedra. More elaborate certificates of non-negativity for higher degree polynomials over a basic semi-algebraic set include the classical Pólya’s Theorem [8], and the more modern Schmüdgen’s Theorem [15] and Putinar’s Theorem [13]. The more recent work of Nie et al. [11], Demmel et al. [6], and Marshall [10] provides additional certificates of non-negativity via gradient and KKT ideals. These certificates of non-negativity underlie powerful algorithmic techniques for various types of optimization problems, particularly for mixed integer programs and non-convex binary programs.

We present a new certificate of non-negativity for polynomials over the intersection of a closed domain $S$ and the zero set of a given polynomial $h(x)$. It is evident that if $p(x)$ is non-negative on the domain $S$, then $p(x) + h(x)q(x)$ is non-negative on the domain $S \cap h^{-1}(0)$ for any polynomial $q(x)$. We show that under suitable conditions on $h(x)$ and $S$, the converse of this statement holds as well, thereby establishing a certificate of non-negativity for polynomials on $S \cap h^{-1}(0)$ in terms of non-negative polynomials on $S$.

We note that for the case when $S$ is not compact, previous results in [6, 10, 11] give certificates of non-negativity for a polynomial $p(x)$ in terms of the gradient of $p(x)$ or the KKT ideal involving $p(x)$ and the polynomials defining $S$. By contrast, our certificate of non-negativity is written purely in terms of the set of non-negative polynomials over $S$ and the ideal generated by $h(x)$.
This property of our certificate of non-negativity yields some interesting consequences. In particular, it leads to a canonical convexification procedure for polynomial optimization problems. Our convexification procedure yields an equivalent formulation of polynomial optimization problems as linear conic programs over the dual of the cone of copositive forms. This formulation is inspired by Burer’s dual copositive formulation of binary quadratic programming problems [3, 4]. Indeed, the latter can be recovered as a special case of our convexification procedure (see Section 5). These copositive programming formulations offer an avenue for the development of new algorithms to solve polynomial optimization problems. In particular, the copositive formulation could be used to obtain new semidefinite programming relaxations for binary programs, as our approach is different and complementary to the conventional approaches to obtain semidefinite programming relaxations via matrix relaxations [16, 9]. The suitable conditions that ensure the validity of our certificate of non-negativity are related to the behavior of the “zeros at infinity” of the polynomials h(x) on the set S. The formalization of this condition is stated in terms of the horizon cone of the set S and the homogeneous component of the polynomial h(x). Loosely speaking, the conditions presented in [3, 4, 2] for the convexification of binary quadratic programming problems are special cases of the more general zeros at infinity condition presented here. The main parts of the paper are organized as follows. Section 2 motivates and formally states our main result; namely, the certificate of non-negativity presented in Theorem 3. Section 3 provides insight about the zeros at infinity condition in Theorem 3. 
Section 4 describes a canonical convexification procedure for equality-constrained polynomial optimization problems over the non-negative orthant, and its natural extension to inequality-constrained polynomial optimization problems. The procedure is a generic reformulation of these classes of problems as a linear conic program over the cone of completely positive forms on $\mathbb{R}^n_+$. Section 5 gives further applications of our main theorem. Section 6 presents the technical proofs of the main theorems in the paper.


2 A new certificate of non-negativity

To motivate our certificate of non-negativity in Theorem 3 below, we begin by recalling a key result in the Balas-Ceria-Cornuéjols lift-and-project procedure [1] and cast it as a certificate of non-negativity. Specifically, assume $\{x \in \mathbb{R}^n : Ax \le b\} \subseteq \{x \in \mathbb{R}^n : 0 \le x_j \le 1\}$ for some fixed $j \in \{1, \ldots, n\}$. In [1, Theorem 2.10] Balas, Ceria, and Cornuéjols show the following characterization of the valid inequalities for $\{x \in \mathbb{R}^n : Ax \le b,\ x_j \in \{0, 1\}\}$:

\[
\alpha^T x + \beta \ge 0 \ \text{ for all } x \in \{x \in \mathbb{R}^n : Ax \le b,\ x_j \in \{0, 1\}\}
\iff
\exists\, u, v \ge 0 \text{ and } u_0, v_0 \in \mathbb{R} \text{ such that }
\begin{aligned}
\alpha &= A^T u + e_j u_0, \\
\alpha &= A^T v + e_j v_0, \\
\beta &= -b^T u + u_0, \\
\beta &= -b^T v.
\end{aligned}
\tag{1}
\]

Notice that (1) is a certificate of non-negativity for linear polynomials over the set $\{x \in \mathbb{R}^n : Ax \le b,\ x_j \in \{0, 1\}\}$. This can be seen more clearly by introducing some notation. For a given positive integer $d$, let $\mathbb{R}_d[x] = \mathbb{R}_d[x_1, \ldots, x_n]$ denote the set of real polynomials of degree at most $d$ in $n$ variables. For a given $S \subseteq \mathbb{R}^n$, let $P_d(S) \subseteq \mathbb{R}_d[x]$ be the set of polynomials of degree at most $d$ that are non-negative on $S$, that is,
\[
P_d(S) := \{p \in \mathbb{R}_d[x] : p(x) \ge 0 \ \ \forall x \in S\}.
\]
Letting $S = \{x : Ax \le b\}$, $h(x) = x_j(1 - x_j)$, and using Farkas’ Lemma to characterize $P_1(S)$, it is not difficult to show that (1) can be rewritten as:
\[
P_1(S \cap h^{-1}(0)) = \bigl(x_j P_1(S) + (1 - x_j) P_1(S) + h(x)\mathbb{R}\bigr) \cap \mathbb{R}_1[x].
\tag{2}
\]
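As a small illustration of how the right-hand side of (2) produces an affine valid inequality, the following sketch works through one hypothetical instance in exact arithmetic. The set $S = \{x \in \mathbb{R}^2 : 0 \le x_1 \le 1,\ 0 \le x_2 \le 1,\ 2x_1 + 2x_2 \le 3\}$, the index $j = 1$, and the multipliers `q1`, `q2` are assumptions chosen for this example and do not come from [1]; the polynomial helpers are ad hoc.

```python
from fractions import Fraction as F
from itertools import product

# A polynomial in (x1, x2) is stored as {(e1, e2): coefficient}.
def add(p, q):
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    out = {}
    for (a, ca), (b, cb) in product(p.items(), q.items()):
        mono = (a[0] + b[0], a[1] + b[1])
        out[mono] = out.get(mono, 0) + ca * cb
    return {m: c for m, c in out.items() if c != 0}

def scale(c, p):
    return {m: c * v for m, v in p.items() if c * v != 0}

def evaluate(p, x):
    return sum(c * x[0] ** m[0] * x[1] ** m[1] for m, c in p.items())

ONE = {(0, 0): F(1)}
X1 = {(1, 0): F(1)}
X2 = {(0, 1): F(1)}
one_minus_x1 = add(ONE, scale(F(-1), X1))

h = mul(X1, one_minus_x1)                      # h(x) = x1 (1 - x1)
# Two affine functions that are non-negative on the hypothetical S:
q1 = {(0, 0): F(3, 2), (1, 0): F(-1), (0, 1): F(-1)}   # (3 - 2 x1 - 2 x2) / 2
q2 = add(ONE, scale(F(-1), X2))                        # 1 - x2

# Element of x1 * P_1(S) + (1 - x1) * P_1(S) + h * R, with multiplier -1 on h.
rhs = add(add(mul(X1, q1), mul(one_minus_x1, q2)), scale(F(-1), h))

# The quadratic terms cancel, leaving the affine function 1 - x1/2 - x2.
p = {(0, 0): F(1), (1, 0): F(-1, 2), (0, 1): F(-1)}

# p is non-negative on the points of S with x1 in {0, 1} (checked on a grid) ...
binary_points = [(F(x1), F(k, 10)) for x1 in (0, 1) for k in range(11)
                 if 2 * x1 + 2 * F(k, 10) <= 3]
min_on_binary = min(evaluate(p, x) for x in binary_points)

# ... but negative at the fractional point (1/2, 1) of S, so it acts as a cut.
value_at_fractional = evaluate(p, (F(1, 2), F(1)))
```

In this instance the certificate has degree two before cancellation, yet the resulting inequality is affine, which is the bounded-degree feature of (2).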

In (2), the non-negativity is now evident. The condition $S \subseteq \{x \in \mathbb{R}^n : 0 \le x_j \le 1\}$ implies that any element of $x_j P_1(S)$ or $(1 - x_j) P_1(S)$ is non-negative on $S \cap h^{-1}(0)$, while any element of $h(x)\mathbb{R}$ vanishes on $S \cap h^{-1}(0)$.

For higher degree polynomials there are more elaborate certificates of non-negativity such as the classical Pólya’s Theorem [8], and the more modern Schmüdgen’s Theorem [15] and Putinar’s Theorem [13]. These theorems give certificates of non-negativity for polynomials over a given basic semi-algebraic set $\{x : g_i(x) \ge 0,\ i = 1, \ldots, m\}$ in terms of the preordering or the quadratic module generated by $g_1, \ldots, g_m$, respectively. Compared to these more elaborate certificates, the certificate of non-negativity (2) has a noteworthy characteristic: like Farkas’ Lemma and the S-lemma, it is constructed using polynomials of bounded degree (i.e., the polynomials on the right-hand side of (2) have bounded degree). This is possible because (2) uses polynomials that are non-negative on the “simple” set $S$ to certify the non-negativity of polynomials on the more “complex” set $S \cap h^{-1}(0)$.

As mentioned earlier, equation (2) is the key result behind the lift-and-project procedure in [1]. A sequential application of (2) yields a convexification procedure for mixed-integer linear programs. Moreover, an algorithmic implementation of the lift-and-project procedure constituted the foundation for the development of the general purpose Branch-Bound-and-Cut algorithms that are so successfully used today to solve general mixed-integer programming problems (a fascinating account of these developments can be found in [5]).

Naturally, the question arises of whether results similar to (2) can be found for more general sets, and whether such results could be used algorithmically to improve solution methods for more general problems. The first question was answered positively in [12] for the case when the set $S$ is compact. Specifically, the following result readily follows from Theorem 1 and Corollary 2 in [12].


Theorem 1. Assume $S \subseteq \mathbb{R}^n$ is compact and $h \in P_d(S)$. Then
\[
P_d(S \cap h^{-1}(0)) = \operatorname{closure}\bigl(P_d(S) + h(x)\mathbb{R}_{d-\deg(h)}[x]\bigr).
\tag{3}
\]

Preliminary results on the algorithmic use of Theorem 1 to improve solution methods for non-convex quadratic binary programs were recently presented in [7]. Our main contribution is to show that under a suitable additional condition, Theorem 1 also holds in the more general case when the set $S$ is unbounded. To get an idea of what this condition might be, notice that one inclusion in (3) holds for any $S$; namely
\[
P_d(S \cap h^{-1}(0)) \supseteq \operatorname{closure}\bigl(P_d(S) + h(x)\mathbb{R}_{d-\deg(h)}[x]\bigr).
\]
On the other hand, the other inclusion in Theorem 1 may fail when $S$ is unbounded. To see this, consider the following example, as well as the generic counterexample in Section 7.

Example 2. Let $d = 2$, $S = \mathbb{R}^2_+$, and $h(x) = x_1 x_2 + 1$. Because $S \cap h^{-1}(0) = \emptyset$, we have $P_2(S \cap h^{-1}(0)) = \mathbb{R}_2[x]$. On the other hand, for any $t \in \mathbb{R}_+$ consider the point $x_t = (t, -1/t)$, and notice that $h(x_t) = 0$, and $\lim_{t \to \infty} x_t = (\infty, 0) \in S$ (loosely speaking). So although $h(x)$ does not have a zero in $S$, it has a “zero at infinity” in $S$. This zero at infinity “limits” the polynomials in $P_d(S) + h(x)\mathbb{R}_{d-\deg(h)}[x]$, since for any $p \in P_d(S) + h(x)\mathbb{R}_{d-\deg(h)}[x]$, $\lim_{t \to \infty} p(x_t) \ge 0$. Thus, for example, $-x_1^2 \in P_2(S \cap h^{-1}(0))$, but $-x_1^2 \notin P_d(S) + h(x)\mathbb{R}_{d-\deg(h)}[x]$ as $\lim_{t \to \infty} -((x_t)_1)^2 = -\infty$.

So in order for Theorem 1 to hold when $S$ is unbounded, a condition is needed on the zeros at infinity of $h(x)$ in $S$. This condition is formally stated below.

Given a polynomial $h \in \mathbb{R}[x]$ let $\tilde{h}(x)$ denote the homogeneous component of $h$ of highest total degree. In other words, $\tilde{h}(x)$ is obtained by dropping from $h$ the terms whose total degree is less than $\deg(h)$. Recall that given $S \subseteq \mathbb{R}^n$ the horizon cone $S^\infty$ is defined as (see, e.g., [14]):
\[
S^\infty := \{y \in \mathbb{R}^n : \text{there exist } x^k \in S,\ \lambda^k \in \mathbb{R}_+,\ k = 1, 2, \ldots \text{ such that } \lambda^k \downarrow 0 \text{ and } \lambda^k x^k \to y\}.
\]
We are now ready to state our main result, namely a certificate of non-negativity for elements of $P_d(S \cap h^{-1}(0))$ in terms of $P_d(S)$ for the general case when $S$ might be unbounded.

Theorem 3. Assume $K \subseteq \mathbb{R}^n$ is a closed convex pointed cone. Let $S \subseteq K$ be a closed set and $h \in P_d(S)$ be such that
\[
(S \cap h^{-1}(0))^\infty = S^\infty \cap \tilde{h}^{-1}(0).
\tag{4}
\]
Then
\[
P_d(S \cap h^{-1}(0)) = \operatorname{closure}\bigl(P_d(S) + h(x)\mathbb{R}_{d-\deg(h)}[x]\bigr).
\tag{5}
\]
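To make the role of condition (4) in Example 2 concrete, the sketch below computes the leading form of $h(x) = x_1 x_2 + 1$ and checks that the direction $y = (1, 0)$, which lies in $S^\infty = \mathbb{R}^2_+$, is a zero of that leading form even though $h \ge 1$ on $S$. It then shows, for a few constants $c$, that $-x_1^2 - c\,h(x)$ is negative somewhere on $S$. Note that this uses points $(t, 0) \in S$ along the ray through $y$ rather than the curve $x_t$ from the example, and it only addresses the set $P_2(S) + h(x)\mathbb{R}_0[x]$, not its closure. The dictionary representation of polynomials and all helper names are assumptions made for this illustration.

```python
def leading_form(h):
    """Homogeneous component of highest total degree of h.

    h is a dict {exponent tuple: coefficient}.
    """
    deg = max(sum(m) for m in h)
    return {m: c for m, c in h.items() if sum(m) == deg}

def evaluate(p, x):
    total = 0
    for mono, c in p.items():
        term = c
        for xi, e in zip(x, mono):
            term *= xi ** e
        total += term
    return total

# Example 2: h(x) = x1*x2 + 1 on S = R^2_+, with d = 2.
h = {(1, 1): 1, (0, 0): 1}
h_tilde = leading_form(h)                      # x1*x2

# h has no zero on S: on a grid of S its minimum is 1, attained on the axes.
grid = [(a / 4, b / 4) for a in range(41) for b in range(41)]
min_h_on_grid = min(evaluate(h, x) for x in grid)

# The direction y = (1, 0) is in S^infinity and is a zero of the leading form.
y = (1, 0)
zero_at_infinity = evaluate(h_tilde, y) == 0

# If -x1^2 = r + c*h with a constant c (deg(h) = d = 2), then
# r = -x1^2 - c*h, and r(t, 0) = -t^2 - c is negative for large t.
def r(c, x):
    return -x[0] ** 2 - c * evaluate(h, x)

# For each c, the first integer t >= 1 with r(c, (t, 0)) < 0.
witnesses = {c: next(t for t in range(1, 1000) if r(c, (t, 0)) < 0)
             for c in (-100, -1, 0, 1, 100)}
```

Here $S \cap h^{-1}(0)$ is empty while $S^\infty \cap \tilde{h}^{-1}(0)$ contains the non-zero direction $(1, 0)$, so the two sides of (4) differ for this $h$ and $S$.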

Proof. See Section 6.

As shown in Section 6.2, the hypotheses of Theorem 3 allow us to use previous results from [12] in its proof. The special case $K = \mathbb{R}^n_+$ leads to some interesting consequences that we discuss in Section 4. Condition (4) concerns the behavior of “zeros at infinity” of the polynomial $h(x)$ on the set $S$. We show in Section 7 that the statement of Theorem 3 generically fails when this assumption is violated. The condition $h \in P_d(S)$ can be replaced by $\deg(h) \le d/2$, as shown in Corollary 4. Section 3 further elaborates on these conditions.

Corollary 4. The statement of Theorem 3 holds if the hypothesis $h \in P_d(S)$ is changed to $\deg(h) \le d/2$.


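Proof. Let $h_1(x) = h(x)^2 \in P_d(S)$. We have
\[
\begin{aligned}
(S \cap h_1^{-1}(0))^\infty &= (S \cap h^{-1}(0))^\infty \\
&= S^\infty \cap \tilde{h}^{-1}(0) && \text{by assumption} \\
&= S^\infty \cap \tilde{h}_1^{-1}(0) && \text{as } \tilde{h}_1 = \tilde{h}^2.
\end{aligned}
\]
Applying Theorem 3, we obtain
\[
\begin{aligned}
P_d(S \cap h^{-1}(0)) &= P_d(S \cap h_1^{-1}(0)) \\
&= \operatorname{closure}\bigl(P_d(S) + h_1(x)\mathbb{R}_{d-\deg(h_1)}[x]\bigr) \\
&\subseteq \operatorname{closure}\bigl(P_d(S) + h(x)\mathbb{R}_{d-\deg(h)}[x]\bigr) \\
&\subseteq P_d(S \cap h^{-1}(0)).
\end{aligned}
\]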
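The step $\tilde{h}_1 = \tilde{h}^2$ used in this proof can be checked mechanically on a small instance. The sketch below does so for a hypothetical polynomial $h(x) = x_1 x_2 - x_1 + 2$ (chosen for illustration; it changes sign on $\mathbb{R}^2_+$, so it is its square, not $h$ itself, that is non-negative there). Polynomials are stored as dictionaries from exponent tuples to coefficients, and the helper names are assumptions.

```python
from itertools import product

def leading_form(h):
    """Homogeneous component of highest total degree of h."""
    deg = max(sum(m) for m in h)
    return {m: c for m, c in h.items() if sum(m) == deg}

def mul(p, q):
    """Product of two polynomials given as {exponent tuple: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(p.items(), q.items()):
        mono = tuple(i + j for i, j in zip(a, b))
        out[mono] = out.get(mono, 0) + ca * cb
    return {m: c for m, c in out.items() if c != 0}

def evaluate(p, x):
    total = 0
    for mono, c in p.items():
        term = c
        for xi, e in zip(x, mono):
            term *= xi ** e
        total += term
    return total

# Hypothetical h with deg(h) = 2, so deg(h) <= d/2 holds for d = 4.
h = {(1, 1): 1, (1, 0): -1, (0, 0): 2}
h1 = mul(h, h)                                 # h1 = h^2

# Leading form of the square versus square of the leading form.
lead_of_square = leading_form(h1)
square_of_lead = mul(leading_form(h), leading_form(h))

# h is negative at the point (3, 0) of R^2_+, while h1 = h^2 is not.
h_value = evaluate(h, (3, 0))
h1_value = evaluate(h1, (3, 0))
```

Since $h$ and $h^2$ have the same zero set and, by the identity above, their leading forms have the same zero set, condition (4) for $h$ transfers to $h_1$ unchanged.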

Remark 5. By repeatedly applying Theorem 3 and Corollary 4, we obtain the following more general version of (5). Assume $K \subseteq \mathbb{R}^n$ is a closed convex pointed cone. Let $S \subseteq K$ be a closed set. Let $h_i \in \mathbb{R}_d[x]$ be given and put $S_i = \{x \in S : h_j(x) = 0,\ j < i\}$. Assume that for each $i = 1, \ldots, m$

(i) $h_i \in P_d(S_i)$ or $\deg(h_i) \le d/2$.
(ii) $(S_i \cap h_i^{-1}(0))^\infty = S_i^\infty \cap \tilde{h}_i^{-1}(0)$.

Then
\[
P_d\Bigl(S \cap \bigcap_{i=1}^m h_i^{-1}(0)\Bigr) = \operatorname{closure}\Bigl(P_d(S) + \sum_{i=1}^m h_i(x)\mathbb{R}_{d-\deg(h_i)}[x]\Bigr).
\]

3 About the “zeros at infinity” condition

We next present some basic results that shed light on condition (4) in Theorem 3. These results will also allow us to illustrate some particular applications of Theorem 3 in Section 5. The following proposition shows that in order to check whether condition (4) holds, it is only necessary to check one inclusion.

Proposition 6. For any $S \subseteq \mathbb{R}^n$ and $h \in \mathbb{R}[x]$ we have $(S \cap h^{-1}(0))^\infty \subseteq S^\infty \cap \tilde{h}^{-1}(0)$.

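Proof. Let $d = \deg(h)$, and assume $y \in (S \cap h^{-1}(0))^\infty$. Then there are sequences $x^k \in S$, $\lambda^k \in \mathbb{R}_+$, $k = 1, 2, \ldots$ such that $h(x^k) = 0$, $\lambda^k \downarrow 0$ and $\lambda^k x^k \to y$. Thus, in particular, $y \in S^\infty$. On the other hand, for $\ell < d$ let $f_\ell(x)$ be the homogeneous component of $h(x)$ of degree $\ell$, and let $h_d = \tilde{h}$. We have that
\[
\tilde{h}(y) = h_d(y) = \lim_{k \to \infty} (\lambda^k)^d h_d(x^k) = \lim_{k \to \infty} (\lambda^k)^d \Bigl(h(x^k) - \sum_{\ell < d} f_\ell(x^k)\Bigr) = -\lim_{k \to \infty} \sum_{\ell < d} (\lambda^k)^{d-\ell} f_\ell(\lambda^k x^k) = 0.
\]

[...]

$\lambda_j > 0$, $j = 1, \ldots, K$ such that
\[
Y = \sum_{j=1}^K \lambda_j x^j, \qquad \sum_{j=1}^K \lambda_j = 1.
\]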

Let $v^*$ be the optimal value of (6), which is the same as the optimal value of (8) by part (a) of Theorem 10. Then it follows that
\[
v^* = \langle C_d(q), Y \rangle = \sum_{j=1}^K \lambda_j q(x^j) \ge \sum_{j=1}^K \lambda_j v^* = v^*.
\]

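Since each $\lambda_j > 0$, we must necessarily have $q(x^j) = v^*$ for each $j = 1, \ldots, K$. Therefore $x^j$, $j = 1, \ldots, K$ are optimal solutions to (6) and part (b) of Theorem 10 follows.

Next we provide a proof of (34). Notice that for $U \subseteq \mathbb{R}^n$,
\[
\begin{aligned}
C_d(P_d(U)) &= \{C_d(p) : p(u) \ge 0 \text{ for all } u \in U\} \\
&= \{C_d(p) : \langle C_d(p), M_d(u) \rangle \ge 0 \text{ for all } u \in U\} \\
&= \{C_d(p) : \langle C_d(p), Y \rangle \ge 0 \text{ for all } Y \in M_d(U)\} \\
&= M_d(U)^*,
\end{aligned}
\]
and consequently $C_d(P_d(U))^* = M_d(U)^{**} = \operatorname{coco} M_d(U)$. Applying this to $U = S \cap h^{-1}(0)$ and using Theorem 3, we get
\[
\begin{aligned}
\operatorname{coco} M_d(S \cap h^{-1}(0)) &= C_d(P_d(S \cap h^{-1}(0)))^* \\
&= C_d(\operatorname{closure}(P_d(S) + h(x)\mathbb{R}))^* && \text{(by Theorem 3)} \\
&= \operatorname{closure}(C_d(P_d(S) + h(x)\mathbb{R}))^* \\
&= C_d(P_d(S) + h(x)\mathbb{R})^* \\
&= (C_d(P_d(S)) + C_d(h(x)\mathbb{R}))^* \\
&= \operatorname{coco}(M_d(S)) \cap C_d(h(x)\mathbb{R})^*.
\end{aligned}
\]
To finish, observe that
\[
\begin{aligned}
C_d(h(x)\mathbb{R})^* &= \{Y \in \mathbb{R}^{N(n,d)} : \langle C_d(ch), Y \rangle \ge 0 \text{ for all } c \in \mathbb{R}\} \\
&= \{Y \in \mathbb{R}^{N(n,d)} : \langle C_d(ch), Y \rangle = 0 \text{ for all } c \in \mathbb{R}\} \\
&= \{Y \in \mathbb{R}^{N(n,d)} : \langle C_d(h), Y \rangle = 0\}.
\end{aligned}
\]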
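The duality argument above rests on the pairing $\langle C_d(p), M_d(u) \rangle = p(u)$. The definitions of $C_d$ and $M_d$ are not part of this excerpt, so the following is only a sketch under explicit assumptions: $C_d(p)$ is taken to be the coefficient vector of $p$ in the monomial basis, $M_d(u)$ the vector of all monomials of degree at most $d$ evaluated at $u$, and $N(n,d) = \binom{n+d}{d}$ their common length. The paper's actual definitions may differ, for example by a scaling of the basis. The polynomial `q`, the points, and the weights are hypothetical.

```python
from fractions import Fraction as F
from itertools import product
from math import comb

def monomials(n, d):
    """Exponent tuples of total degree <= d in n variables, in a fixed order."""
    return [e for e in product(range(d + 1), repeat=n) if sum(e) <= d]

def M(u, d):
    """Assumed moment vector: each monomial of degree <= d evaluated at u."""
    out = []
    for e in monomials(len(u), d):
        val = 1
        for ui, ei in zip(u, e):
            val *= ui ** ei
        out.append(val)
    return out

def C(p, n, d):
    """Assumed coefficient vector of p, given as {exponent tuple: coefficient}."""
    return [p.get(e, 0) for e in monomials(n, d)]

def inner(a, b):
    return sum(x * y for x, y in zip(a, b))

n, d = 2, 2
q = {(2, 0): 1, (1, 1): -3, (0, 1): 2, (0, 0): 5}   # x1^2 - 3 x1 x2 + 2 x2 + 5

def q_eval(u):
    return u[0] ** 2 - 3 * u[0] * u[1] + 2 * u[1] + 5

# A convex combination Y of moment vectors, as in the proof of Theorem 10(b).
points = [(1, 2), (0, 3), (2, 2)]
weights = [F(1, 2), F(1, 4), F(1, 4)]
columns = zip(*(M(u, d) for u in points))
Y = [sum(w * m for w, m in zip(weights, col)) for col in columns]

pairing_values = [inner(C(q, n, d), M(u, d)) for u in points]
pairing_with_Y = inner(C(q, n, d), Y)
```

Under these assumptions the pairing is linear in $Y$, which is what turns $\langle C_d(q), Y \rangle$ into the weighted sum $\sum_j \lambda_j q(x^j)$ used in the proof of part (b) of Theorem 10.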

7 A generic counterexample

We next show that the statement of Theorem 3 indeed generically fails if condition (4) is violated. For simplicity assume $K = \mathbb{R}^n_+$, and let $a = e$, that is, the vector of all ones. Assume condition (4) in Theorem 3 does not hold. Then by Lemma 15(ii) there exists $t \in \Delta_n$ such that $(0, t) \in \bar{S} \cap \bar{h}^{-1}(0) \setminus \overline{S \cap h^{-1}(0)}$. Since $\overline{S \cap h^{-1}(0)}$ is a closed subset of $\Delta_{n+1}$ there exists $\epsilon > 0$ such that
\[
y \in \Delta_{n+1},\ y \in \overline{S \cap h^{-1}(0)} \ \Rightarrow\ \|y - (0, t)\| > \epsilon.
\]
Take
\[
p(x) := (1 + e^T x)^{d-2} \Bigl(1 + \sum_{i=1}^n \bigl(x_i - t_i(1 + e^T x)\bigr)^2 - \epsilon^2 (1 + e^T x)^2\Bigr).
\]
We claim that $p(x) \in P_d(S \cap h^{-1}(0))$ but $p(x) \notin \operatorname{closure}(P_d(S) + h(x)\mathbb{R}_{d-\deg(h)}[x])$. To show that, we compactify: First, note that if $y \in \overline{S \cap h^{-1}(0)} \subseteq \Delta_{n+1}$ we have $\bar{p}(y) = \|y - (0, t)\|^2 - \epsilon^2 > 0$. Thus by Lemma 14(i), $p(x) \in P_d(S \cap h^{-1}(0))$. On the other hand, $\bar{p}(0, t) = -\epsilon^2$. Hence by continuity there exists $\delta > 0$ such that
\[
\|q - p\| < \delta \ \Rightarrow\ \bar{q}(0, t) < -\epsilon^2/2.
\tag{35}
\]
Now to show $p(x) \notin \operatorname{closure}(P_d(S) + h(x)\mathbb{R}_{d-\deg(h)}[x])$ we proceed by contradiction. Assume there exist $r(x) \in P_d(S)$ and $s(x) \in \mathbb{R}_{d-\deg(h)}[x]$ such that $\|r + hs - p\| < \delta$. From (35) we get
\[
\bar{r}(0, t) + \bar{h}(0, t)\bar{s}(0, t) < -\epsilon^2/2.
\tag{36}
\]
But this is a contradiction because $\bar{h}(0, t) = 0$, and $\bar{r}(0, t) \ge 0$ since $(0, t) \in \bar{S}$ and $\bar{r} \in P_d(\bar{S})$.
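The relation between $p$ and $\bar{p}$ used above can be checked numerically. The compactification itself (Lemmas 14 and 15) is not part of this excerpt, so the sketch below assumes that a point $x \in \mathbb{R}^n_+$ is mapped to $y = (1, x)/(1 + e^T x) \in \Delta_{n+1}$ and that $p(x) = (1 + e^T x)^d\, \bar{p}(y)$; both are assumptions made for illustration, as are the particular values of $t$, $\epsilon$, and $d$.

```python
from fractions import Fraction as F

def p(x, t, eps, d):
    """The counterexample polynomial, written out in the original variables."""
    s = 1 + sum(x)                                   # 1 + e^T x
    bracket = 1 + sum((xi - ti * s) ** 2 for xi, ti in zip(x, t)) - eps ** 2 * s ** 2
    return s ** (d - 2) * bracket

def p_bar(y, t, eps):
    """Assumed compactified form: ||y - (0, t)||^2 - eps^2."""
    target = (F(0),) + tuple(t)
    return sum((yi - zi) ** 2 for yi, zi in zip(y, target)) - eps ** 2

def compactify(x):
    """Assumed map from R^n_+ into the simplex: x -> (1, x) / (1 + e^T x)."""
    s = 1 + sum(x)
    return (1 / s,) + tuple(xi / s for xi in x)

# Hypothetical data: t in the simplex of R^2, eps = 1/10, d = 4.
t = (F(1, 3), F(2, 3))
eps = F(1, 10)
d = 4

samples = [(F(0), F(0)), (F(1), F(2)), (F(5), F(1, 2))]
identity_holds = all(
    p(x, t, eps, d) == (1 + sum(x)) ** d * p_bar(compactify(x), t, eps)
    for x in samples
)

# Every compactified point has coordinates summing to one.
on_simplex = all(sum(compactify(x)) == 1 for x in samples)

# At the point at infinity (0, t) the compactified polynomial equals -eps^2.
value_at_infinity = p_bar((F(0),) + t, t, eps)
```

Under these assumptions, the sign of $p$ on $\mathbb{R}^n_+$ is the sign of the squared distance from $y$ to $(0, t)$ minus $\epsilon^2$, which is how the construction separates $(0, t)$ from the compactified feasible set.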

References

[1] E. Balas, S. Ceria, and G. Cornuéjols. A lift-and-project cutting plane algorithm for mixed 0–1 programs. Math. Program., 58:295–324, 1993.
[2] I. Bomze and F. Jarre. A note on Burer’s copositivity representation of mixed binary QPs. Optimization Letters, 4(3):465–472, 2010.
[3] S. Burer. On the copositive representation of binary and continuous nonconvex quadratic programs. Math. Program., 120(2):479–495, 2009.
[4] S. Burer. Copositive programming. In J. B. Lasserre and M. Anjos, editors, Handbook of Semidefinite, Cone and Polynomial Optimization: Theory, Algorithms, Software and Applications. To appear.
[5] G. Cornuéjols. Revival of the Gomory cuts in the 1990s. Annals of Operations Research, 149:63–66, 2007.
[6] J. Demmel, J. Nie, and V. Powers. Representations of positive polynomials on noncompact semi-algebraic sets via KKT ideals. J. Pure Appl. Algebra, 209(1):189–200, 2007.
[7] B. Ghaddar, M. Anjos, and J. Vera. An iterative scheme for valid inequality generation in binary quadratic programming. In 15th Conference on Integer Programming and Combinatorial Optimization (IPCO XV), 2011.
[8] G. Hardy, J. Littlewood, and G. Pólya. Inequalities. Cambridge University Press, New York, second edition, 1988.
[9] L. Lovász and A. Schrijver. Cones of matrices and set-functions and 0-1 optimization. SIAM J. on Optim., 1(2):166–190, 1991.
[10] M. Marshall. Representations of non-negative polynomials, degree bounds, and applications to optimization. Can. J. Math., 61(1):205–221, 2009.
[11] J. Nie, J. Demmel, and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Math. Program., 106(3, Ser. A):587–606, 2006.
[12] J. Peña, J. Vera, and L. Zuluaga. Exploiting equalities in polynomial programming. Operations Research Letters, 36:223–228, 2008.
[13] M. Putinar. Positive polynomials on compact sets. Indiana Univ. Math. J., 42:969–984, 1993.
[14] T. Rockafellar and R. Wets. Variational Analysis. Springer-Verlag, Berlin, 1998.
[15] K. Schmüdgen. The K-moment problem for compact semi-algebraic sets. Math. Ann., 289:203–206, 1991.
[16] N. Shor. Class of global minimum bounds of polynomial functions. Cybernetics, 23:731–734, 1987.
