On Cones of Nonnegative Quartic Forms - Enet

7 downloads 3680 Views 493KB Size Report
Sep 2, 2011 - Email: [email protected]. (On leave from Department of Systems ..... from quadratic PSD, while quartic SOS and SOQ forms are in the form of ...
On Cones of Nonnegative Quartic Forms Bo JIANG



Zhening LI



Shuzhong ZHANG



September 2, 2011

Abstract Historically, much of the theory and practice in nonlinear optimization has revolved around the quadratic models. Though quadratic functions are nonlinear polynomials, they are well structured and easy to deal with. Limitations of the quadratics, however, become increasingly binding as higher degree nonlinearity is imperative in modern applications of optimization. In the recent years, one observes a surge of research activities in polynomial optimization, and modeling with quartic or higher order polynomial functions has been more commonly accepted. On the theoretical side, there are also major recent progresses on polynomial functions and optimization. For instance, Ahmadi et al. [2] proved that checking the convexity of a quartic polynomial function is strongly NP-hard in general, which settles a long-standing open question. In this paper we proceed to studying six fundamentally important convex cones of quartic functions in the space of symmetric quartic tensors, including the cone of nonnegative quartic polynomials, the sum of squared polynomials, the convex quartic polynomials, and the sum of fourth powered polynomials. It turns out that these convex cones coagulate into a chain in decreasing order. The complexity status of these cones is sorted out as well. Finally, potential applications of the new results to solve highly nonlinear and/or combinatorial optimization problems are discussed.

Keywords: cone of polynomial functions; nonnegative quartic forms; sum of squares; sosconvexity. Mathematics Subject Classification 2010: 15A69, 08A40, 90C25, 49N15. ∗

Industrial and Systems Engineering Program, University of Minnesota, Minneapolis, MN 55455. Email: [email protected] † Department of Mathematics, Shanghai University, Shanghai, 200444, China. Email: [email protected] ‡ Industrial and Systems Engineering Program, University of Minnesota, Minneapolis, MN 55455. Email: [email protected]. (On leave from Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong. Email: [email protected])

1

1

Introduction

Checking the convexity of a quadratic function boils down to testing the positive semidefiniteness of its Hessian matrix in the domain. Since the Hessian matrix is constant, the test can be done easily. A natural question thus arises: Given a fourth degree polynomial function in n variables, can one still easily tell if the function is convex or not? This simple-looking question was first put forward by Shor [28] in 1992, which turned out later to be a very challenging question to answer. For almost two decades, the question remained open. Only until recently Ahmadi et al. [2] proved that checking the convexity of a general quartic polynomial function is actually strongly NP-hard. Their result not only settled this particular open problem, but also helped to highlight a crucial difference between quartic and quadratic polynomials, which makes the study of quartic polynomials all the more compelling and interesting. On the practical side, quartic polynomial optimization has a wide spectrum of applications, including sensor network localization [7], MIMO radar waveform optimization [12], portfolio management with high moments information [20], quantum entanglement problem [13], among many others. This has stimulated a burst of recent research activities with regard to quartic polynomial optimization. Due to the NP-hardness of quartic polynomial optimization models (see e.g. [25, 15, 23]), there is a considerable amount of recent research work devoted to the approximation algorithms for solving various quartic polynomial optimization models. Luo and Zhang [25] proposed an approximation algorithm for optimization of a quartic polynomial with quadratic constraints. Ling et al. [24] considered a special quartic optimization model, which is to maximize a bi-quadratic function over two spheres. He et al. [15, 16] extended the study to arbitrary degree polynomials. So [35] improved some of the approximation bounds presented in [15]. For a comprehensive survey on the topic, one may refer to the recent Ph.D. thesis of Li [23]. Another well studied approach to cope with general polynomial optimization problems is the so-called SOS method proposed by Lasserre [21] and Parillo [29]. Theoretically, it can solve any general polynomial optimization model to any given accuracy through resorting to a sequence of Semidefinite Programs (SDP). A main drawback of this approach is that the size of those (SDP) problems may grow intolerably large quickly. Interested readers may find more information in a recent survey [22] and the references therein. There is an intrinsic connection between optimizing a polynomial function and the description of all the polynomial functions that are nonnegative over a given domain. For the case of quadratic polynomials, this connection was explored by Sturm and Zhang in [36], and later for the bi-quadratic case in Luo et al. [26]. Such investigations can be traced back to the 19th century when the relationship between nonnegative polynomial functions and the sum of squares (SOS) of polynomials was 2

explicitly studied. One concrete question of interest was: Given a multivariate polynomial function that takes only nonnegative values over the real numbers, can it be represented as a sum of squares of polynomial functions? Hilbert [19] in 1888 showed that the only three classes of polynomial functions where this is generically true can be explicitly identified: (1) univariate polynomials; (2) multivariate quadratic polynomials; (3) bivariate quartic polynomials. Since polynomial functions with a fixed degree form a vector space, and the nonnegative polynomials and the SOS polynomials form two convex cones respectively within that vector space, the afore-mentioned results can be understood as a specification of three particular cases where these two convex cones coincide, while in general of course the cone of nonnegative polynomials is larger. There are certainly other interesting convex cones in the same vector space. For instance, the convex polynomial functions form yet another convex cone in that vector space. Helton and Nie [18] introduced the notion of sos-convex polynomials, to indicate the polynomials whose Hessian matrix can be decomposed as a sum of squares of polynomial matrices. All these classes of convex cones are important in their own rights. They are also important for the sake of optimization of polynomial functions. There are some substantial recent progresses along this direction. As we mentioned earlier, e.g., the question of Shor [28] regarding the complexity of deciding the convexity of a quartic polynomial was nicely settled by Ahmadi et al. [2]. It is also natural to inquire if the Hessian matrix of a convex polynomial is sos-convex. Ahmadi and Parrilo [3] gave an example to show that this is not the case in general. Blekherman proved that a convex polynomial is not necessary a sum of squares [6] if the degree of the polynomial is larger than two. However, Blekherman’s proof is not constructive, and it remains an open problem to construct a concrete example of convex polynomial which is not a sum of squares. Reznick [34] studied the sum of m-th powers of real linear forms. In view of the cones formed by the polynomial functions (e.g. the cones of nonnegative polynomials, the convex polynomials, the SOS polynomials and the sos-convex polynomials), it is natural to inquire about their relational structures. We shall set out to explore this in the paper. In a way there is a ‘phase transition’ in terms of complexity when the scope of polynomials goes beyond quadratics. Compared to the quadratic case (cf. Sturm and Zhang [36]), the structure of the quartic forms is far from being clear. We believe that the class of quartic polynomial functions (or the class of quartic forms) is an appropriate subject of study on its own right, beyond quadratic functions (or matrices). There are at least three immediate reasons to elaborate on the quartic polynomials, rather than polynomial functions of other degrees. First of all, nonnegativity is naturally associated with even degree polynomials, and the quartic polynomial is next to quadratic polynomials in that hierarchy. Second, quartic polynomials represent a landscape after the ‘phase transition’ takes place. However, dealing with quartic polynomials is still manageable, as far as notations are concerned. Finally, from an application point of view, quartic polynomial optimization is by far the most relevant polynomial optimization model beyond quadratic polynomials. The afore-mentioned examples such as kurtosis risks in portfolio management ([20]), the bi-quadratic optimization models ([24]), and the nonlinear

3

least square formulation of sensor network localization ([7]) are all such examples. In this paper we aim to present the relational structure of several important convex cones in the vector space of quartic forms. We shall also motivate the study by some examples from applications.

2 2.1

Preparations Notations

Throughout this paper, we use the lower-case letters to denote vectors (e.g., x ∈ Rn ), the capital 2 letters to denote matrices (e.g., A ∈ Rn ), and the capital calligraphy letters to denote fourth order 4 tensors (e.g., F ∈ Rn ), with subscriptions of indices being their entries (e.g., x1 , Aij , Fijk` ∈ R). The boldface capital letters are reserved for sets in the Euclidean space, e.g., various sets of quatric 4 forms to be introduced later, as well as Rn , the space of n-dimensional fourth order tensors. A generic quartic form is a fourth degree homogeneous polynomial function in n variables, or specifically the function X f (x) = Gijk` xi xj xk x` , (1) 1≤i≤j≤k≤`≤n

where x = (x1 , · · · , xn )T ∈ Rn . Closely related to a quartic form is a fourth order super-symmetric 4 tensor F ∈ Rn . A tensor is said to be super-symmetric if its entries are invariant under all permutations of its indices. The set of n-dimensional super-symmetric fourth order tensors is 4 denoted by Sn . In fact, super-symmetric tensors are bijectively related to forms. In particular, 4 restricting to fourth order tensors, for a given super-symmetric tensor F ∈ Sn , the quartic form in (1) can be uniquely determined by the following operation: X

f (x) = F(x, x, x, x) :=

Fijk` xi xj xk x` ,

(2)

1≤i,j,k,`≤n

where x ∈ Rn , Fijk` = Gijk` /|Π(ijk`)| and Π(ijk`) is the set of all distinctive permutations of the indices {i, j, k, `}, and vice versa. (This is the same as the one-to-one correspondence between a symmetric matrix and a quadratic form.) In the remainder of the paper, we shall more frequently 4 use a super-symmetric tensor F ∈ Sn to indicate a quartic form f (x) or F(x, x, x, x), i.e., the notion of “super-symmetric fourth order tensor” and “quartic form” are used interchangeably in this paper. 4

2

Given a quartic form F ∈ Sn and a matrix X ∈ Rn , we may also define the following operation (in the same spirit as (2)): X F(X, X) := Fijk` Xij Xk` . 1≤i,j,k,`≤n

4

4

We call a fourth order tensor G ∈ Rn partial-symmetric, if Gijk` = Gjik` = Gij`k = Gk`ij

∀ 1 ≤ i, j, k, ` ≤ n.

Essentially this means that the tensor form is symmetric for the first and the last two indices respectively, and is also symmetric by swapping the first two and the last two indices. The set of → − 4 → − 4 4 4 4 all partial-symmetric fourth order tensors in Rn is denoted by S n . Obviously Sn ( S n ( Rn if n ≥ 2. 4

4

4

For any fourth order tensor G ∈ Rn , we introduce a symmetrization mapping sym : Rn 7→ Sn , which is F = sym G with Fijk` =

1 |Π(ijk`)|

X



∀ 1 ≤ i, j, k, ` ≤ n.

π∈Π(ijk`)

The symbol ‘⊗’ represents the outer product of vectors or matrices. If F = x⊗x⊗x⊗x for some x ∈ 2 Rn , then Fijk` = xi xj xk x` ; and if G = X ⊗ X for some X ∈ Rn , then Gijk` = Xij Xk` . The symbol ‘•’ denotes the operation of inner product. As a result, we have F(x, x, x, x) = F • (x ⊗ x ⊗ x ⊗ x).

2.2

Introducing the Quartic Forms

In this subsection we shall formally introduce the definitions of the quartic forms (or equivalently, super-symmetric fourth order tensors) discussed in this paper. Let us start with the well known notion of positive semidefinite (PSD) and the sum of squares (SOS) of polynomials. 4

Definition 2.1 A quartic form F ∈ Sn is called quartic PSD if F(x, x, x, x) ≥ 0 4

∀ x ∈ Rn .

(3)

4

The set of all quartic PSD forms in Sn is denoted by Sn+ . 4

If a quartic form F ∈ Sn can be written as a sum of squares of polynomial functions, then these polynomials must be quadratic forms, i.e., F(x, x, x, x) =

m X

T

i

x Ax

2

= (x ⊗ x ⊗ x ⊗ x) •

m X

Ai ⊗ Ai ,

i=1

i=1

 − n4 P 2 i i ∈ → where Ai ∈ Sn , the set of symmetric matrices. However, m S is only partiali=1 A ⊗ A symmetric, and may not be exactly F, which must be super-symmetric. To place it in the family 4 Sn , a symmetrization operation is required. Since x ⊗ x ⊗ x ⊗ x is super-symmetric, we still have  Pm Pm i i i i (x ⊗ x ⊗ x ⊗ x) • sym i=1 A ⊗ A = (x ⊗ x ⊗ x ⊗ x) • i=1 A ⊗ A . 5

4

Definition 2.2 A quartic form F ∈ Sn is called quartic SOS if F(x, x, x, x) is a sum of squares 2 of quadratic forms, i.e., there exist m symmetric matrices A1 , · · · , Am ∈ Sn such that ! m m X X  F = sym Ai ⊗ Ai = sym Ai ⊗ Ai . i=1

i=1

4

The set of quartic SOS forms in Sn is denoted by Σ2n,4 . As all quartic SOS forms constitute a convex cone, we have n o 2 Σ2n,4 = sym cone A ⊗ A | A ∈ Sn .  Pm i i it maybe a challenge to write it explicitly as a sum Usually, for a given F = sym i=1 A ⊗ A of squares, although the construction can in principle be done in polynomial-time by semidefinte programming (SDP), which however is costly. In this sense, having a quartic SOS tensor in supersymmetric form may not always be beneficial, since the super-symmetry can destroy the SOS structure. Since F(X, X) is a quadratic form, the usual sense of nonnegativity carries over. Formally we introduce this notion below. 4

Definition 2.3 A quartic form F ∈ Sn is called quartic matrix PSD if F(X, X) ≥ 0

2

∀ X ∈ Rn . 2

4

2

The set of quartic matrix PSD forms in Sn is denoted by Sn+ ×n . Related to the sum of squares for quartic forms, we now introduce the notion to the sum of quartics 4 (SOQ): If a quartic form F ∈ Sn is SOQ, then there are m vectors a1 , · · · , am ∈ Rn such that F(x, x, x, x) =

m X

 T i 4

x a

= (x ⊗ x ⊗ x ⊗ x) •

i=1

m X

ai ⊗ ai ⊗ ai ⊗ ai .

i=1 4

Definition 2.4 A quartic form F ∈ Sn is called quartic SOQ if F(x, x, x, x) is a sum of fourth powers of linear forms, i.e., there exist m vectors a1 , · · · , am ∈ Rn such that F=

m X

ai ⊗ ai ⊗ ai ⊗ ai .

i=1 4

The set of quartic SOQ forms in Sn is denoted by Σ4n,4 .

6

As all quartic SOQ forms also constitute a convex cone, we denote Σ4n,4 = cone {a ⊗ a ⊗ a ⊗ a | a ∈ Rn } ⊆ Σ2n,4 . In the case of quadratic functions, it is well known that for a given homogeneous form (i.e., a 2 symmetric matrix, for that matter) A ∈ Sn the following two statements are equivalent: • A is positive semidefinite (PSD): A(x, x) := xT Ax ≥ 0 for all x ∈ Rn . P Pm i T i 2 i • A is a sum of squares (SOS): A(x, x) = m i=1 (x a ) (or equivalently A = i=1 a ⊗ a ) for some a1 , · · · , am ∈ Rn . It is therefore clear that the four types of quartic forms defined above are actually different extensions of the nonnegativity. In particular, quartic PSD and quartic matrix PSD forms are extended from quadratic PSD, while quartic SOS and SOQ forms are in the form of summation of nonnegative polynomials, and are extended from quadratic SOS. We will show later that there is an interesting hierarchical relationship: 2

2

4

n ×n Σ4n,4 ( S+ ( Σ2n,4 ( Sn+ .

(4)

Recently, a class of polynomials termed the sos-convex polynomials (cf. Helton and Nie [18]) has been brought to attention, which is defined as follows (see [4] for three other equivalent definitions of the sos-convexity): A multivariate polynomial function f (x) is sos-convex if its Hessian matrix H(x) can be factorized as H(x) = (M (x))T M (x) with a polynomial matrix M (x). The reader is referred to [3] for applications of the sos-convex polynomials. In this paper, we shall 4 focus on Sn and investigate sos-convex quartic forms with the hierarchy (4). For a quartic form 4 F ∈ Sn , it is straightforward to compute its Hessian matrix H(x) = 12F(x, x, ·, ·), i.e., (H(x))ij = 12F(x, x, ei , ej ) ∀ 1 ≤ i, j ≤ n, where ei ∈ Rn is the vector whose i-th entry is 1 and other entries are zeros. Therefore H(x) is a quadratic matrix of x. If H(x) can be decomposed as H(x) = (M (x))T M (x) with M (x) being a polynomial matrix, then M (x) must be linear with respect to x. 4

Definition 2.5 A quartic form F ∈ Sn is called quartic sos-convex, if there exists a linear matrix M (x) of x, such that its Hessian matrix 12F(x, x, ·, ·) = (M (x))T M (x). 4

The set of quartic sos-convex forms in Sn is denoted by Σ2∇2 . n,4

7

Helton and Nie [18] proved that a nonnegative polynomial is sos-convex, then it must be SOS. In particular, if the polynomial is a quartic form, by denoting the i-th row of the linear matrix 2 M (x) to be xT Ai for i = 1, . . . , m and some matrices A1 , · · · , Am ∈ Rn , then (M (x))T M (x) = Pm i T T i i=1 (A ) xx A . Therefore m

1 1 X T i 2 F(x, x, x, x) = x F(x, x, ·, ·)x = xT (M (x))T M (x)x = x A x ∈ Σ2n,4 . 12 12 T

i=1

In addition, the Hessian matrix for a quartic sos-convex form is obviously positive semidefinite for any x ∈ Rn . Hence sos-convexity implies convexity. Combining these two facts, we conclude that a quartic sos-convex form is both SOS and convex, which motivates us to study the last quartic forms in this paper. 4

Definition 2.6 A quartic form F ∈ Sn is called convex and SOS, if it is both quartic SOS and T 4 4 convex. The set of quartic convex and SOS forms in Sn is denoted by Σ2n,4 Sncvx . 4

4

Here Sncvx is denoted to be the set of all convex quartic forms in Sn .

2.3

Main Results and the Organization of the Paper

All the sets of the quartic forms defined in Section 2.2 are clearly convex cones. The remainder of 2 2 4 this paper is organized as follows. In Section 3, we start by studying the cones: Sn+ , Σ2n,4 , Sn+ ×n , and Σ4n,4 . First of all, we show that they are all closed, and that they can be presented in different formulations. As an example, the cone of quartic SOQ forms is n o 2 Σ4n,4 = cone {a ⊗ a ⊗ a ⊗ a | a ∈ Rn } = sym cone A ⊗ A | A ∈ Sn+ , rank(A) = 1 , which can also be written as n o 2 sym cone A ⊗ A | A ∈ Sn+ , meaning that the rank-one constraint can be removed without affecting the cone itself. Moreover, 4 among these four cones there are two primal-dual pairs: Sn+ is dual to Σ4n,4 , and Σ2n,4 is dual to 2 2 2 2 4 Sn+ ×n . The hierarchical relationship Σ4n,4 ( Sn+ ×n ( Σ2n,4 ( Sn+ will also be shown in the same section. T 4 In Section 4, we further study two more cones: Σ2∇2 and Σ2n,4 Sncvx . Interestingly, these two new n,4

cones can be nicely placed in the hierarchical scheme (4):  \ 4  2 2 4 Σ4n,4 ( Sn+ ×n ( Σ2∇2 ⊆ Σ2n,4 Sncvx ( Σ2n,4 ( Sn+ .

(5)

n,4

The complexity status of all these cones are discussed in Section 5, and in particular we show that  T n4  2 2 Σ∇2 ( Σn,4 Scvx unless P = N P , completing the picture presented in (5), on the premise n,4

8

that P 6= N P . The low dimensional cases of these cones are also discussed in Section 5. Specially, for the case n = 2, all the six cones reduce to only two distinctive ones, and for the case n = 3, they reduce to exactly three distinctive cones. Finally, in Section 6 we discuss applications of quartic conic programming, including bi-quadratic assignment problems and eigenvalues of supersymmetric tensors.

3

Quartic PSD Forms, Quartic SOS Forms, and the Dual Cones 2

2

Let us now consider the first four cones of quartic forms introduced in Section 2.2: Σ4n,4 , Sn+ ×n , 4 Σ2n,4 , and Sn+ .

3.1

Closedness 2

2

4

Proposition 3.1 Σ4n,4 , Sn+ ×n , Σ2n,4 , and Sn+ are all closed convex cones. 4

2

2

While Sn+ and Sn+ ×n are evidently closed, by a similar n argument as2 in o [36] it is also easy to see n 2 is closed. It only remains that the cone of quartic SOS forms Σn,4 := sym cone A ⊗ A | A ∈ S 4 to show that the cone of quartic SOQ forms Σn,4 is closed. In fact, we have a slightly stronger result below, which ensures the closedness of Σ4n,4 : Lemma 3.2 If D ⊆ Rn , then cl cone {a ⊗ a ⊗ a ⊗ a | a ∈ D} = cone {a ⊗ a ⊗ a ⊗ a | a ∈ cl D} . Proof. Suppose that F ∈ cl cone {a ⊗ a ⊗ a ⊗ a | a ∈ D}, then there is a sequence of quartic forms F k ∈ cone {a ⊗ a ⊗ a ⊗ a | a ∈ D} (k = 1, 2, . . .), such that F = limk→∞ F k . Since the dimension  4 of Sn is N := n+3 , it follows from Carath´eodory’s theorem that for any given F k , there exists 4 an n × (N + 1) matrix Z k , such that k

F =

N +1 X

z k (i) ⊗ z k (i) ⊗ z k (i) ⊗ z k (i),

i=1

where z k (i) is the i-th column vector of Z k , and is a positive multiple of a vector in D. Now define P k , then tr F k = nj=1 Fjjjj N +1 X n X k 4 (Zji ) = tr F k → tr F. i=1 j=1

9

P +1 ∗ Thus, the sequence {Z k } is bounded, and have a cluster point Z ∗ , satisfying F = N i=1 z (i) ⊗ ∗ ∗ ∗ ∗ z (i) ⊗ z (i) ⊗ z (i). Note that each column of Z is also a positive multiple of a vector in cl D, it follows that F ∈ cone {a ⊗ a ⊗ a ⊗ a | a ∈ cl D}. The converse containing relationship is trivial.  The cone of quartic SOQ forms is closely related to the fourth moment of a multi-dimensional random variable. Given an n-dimensional random variable ξ = (ξ1 , · · · , ξn )T on the support set D ⊆ Rn with density function p, its fourth moment is a super-symmetric fourth order tensor 4 M ∈ Sn , whose (i, j, k, `)-th entry is Z xi xj xk x` p(x)dx. Mijk` = E [ξi ξj ξk ξ` ] = D

Suppose the fourth moment of ξ is finite, then by the closedness of Σ4n,4 , we have Z M = E [ξ ⊗ ξ ⊗ ξ ⊗ ξ] =

(x ⊗ x ⊗ x ⊗ x) p(x)dx ∈ cone {a ⊗ a ⊗ a ⊗ a | a ∈ Rn } = Σ4n,4 .

D

Conversely, for any M ∈ Σ4n,4 , it is easy to verify that there exists an n-dimensional random variable whose fourth moment is just M. Thus, the set of all the finite fourth moments of n-dimensional random variables is exactly Σ4n,4 , similar to the fact that all possible covariance matrices form the cone of positive semidefinite matrices.

3.2

Alternative Representations

In this subsection we present some alternative forms of the same cones that we have discussed. Some of these alternative representations are more convenient to use in various applications. Theorem 3.3 For the quartic polynomials cones introduced, we have the following equivalent representations: 1. For the cone of quartic SOS forms n o n o → − 4 2 2 Σ2n,4 := sym cone A ⊗ A | A ∈ Sn = sym F ∈ S n | F(X, X) ≥ 0, ∀ X ∈ Sn ; 2. For the cone of quartic matrix PSD forms n o 2 2 4 2 Sn+ ×n := F ∈ Sn | F(X, X) ≥ 0, ∀ X ∈ Rn o n 4 2 = F ∈ Sn | F(X, X) ≥ 0, ∀ X ∈ Sn n o \ 4 2 = Sn cone A ⊗ A | A ∈ Sn ;

10

3. For the cone of quartic SOQ forms Σ4n,4 := cone {a ⊗ a ⊗ a ⊗ a | a ∈ Rn } = sym cone

n o 2 A ⊗ A | A ∈ Sn+ .

→ − 4 4 Recall that S n is the set of partial-symmetric fourth order tensors in Rn , defined in Section 2.1. The remaining of this subsection is devoted to the proof of Theorem 3.3. 2

2

n ×n Let us first study the equivalent representations for Σ2n,4 and S+ . To verify a quartic matrix PSD form, we should check the operations of quartic forms on matrices. In fact, the quartic matrix → − 4 PSD forms can be extended to the space of partial-symmetric tensors S n . It is not hard to verify → − 4 that for any F ∈ S n , it holds that 2

F(X, Y ) = F(X T , Y ) = F(X, Y T ) = F(Y, X) ∀ X, Y ∈ Rn ,

(6)

which implies that F(X, Y ) is invariant under the transpose operation as well as the operation to swap the X and Y matrices. Indeed, it is easy to see that the partial-symmetry of F is a necessary and sufficient condition for (6) to hold. We have the following property for quartic matrix PSD → − 4 2 2 forms in S n , similar to that for Sn+ ×n in Theorem 3.3. Lemma 3.4 For partial-symmetric four order tensors, it holds that n o → − 4 → − n2 ×n2 2 := F ∈ S n | F(X, X) ≥ 0, ∀ X ∈ Rn S+ n o → − 4 2 = F ∈ S n | F(X, X) ≥ 0, ∀ X ∈ Sn n o 2 = cone A ⊗ A | A ∈ Sn .

(7) (8)

2

Proof. Observe that for any skew-symmetric Y ∈ Rn , i.e., Y T = −Y , we have F(X, Y ) = −F(X, −Y ) = −F(X, Y T ) = −F(X, Y )

2

∀ X ∈ Rn ,

which implies that F(X, Y ) = 0. As any square matrix can be written as the sum of a symmetric 2 matrix and a skew-symmetric matrix, say for Z ∈ Rn , by letting X = (Z + Z T )/2 which is symmetric, and Y = (Z − Z T )/2 which is skew-symmetric, we have Z = X + Y . Therefore, F(Z, Z) = F(X + Y, X + Y ) = F(X, X) + 2F(X, Y ) + F(Y, Y ) = F(X, X). 2

2

This implies the equivalence between F(X, X) ≥ 0, ∀ X ∈ Rn and F(X, X) ≥ 0, ∀ X ∈ Sn , which proves (7). n o → − 4 2 2 To prove (8), first note that cone A ⊗ A | A ∈ Sn ⊆ {F ∈ S n | F(X, X) ≥ 0, ∀ X ∈ Rn }. → − 4 2 Conversely, given any G ∈ S n with G(X, X) ≥ 0, ∀ X ∈ Rn , we may rewrite G as an n2 × n2 symmetric matrix MG . Therefore (vec (X))T MG vec (X) = G(X, X) ≥ 0 11

2

∀ X ∈ Rn ,

which implies that MG is positive semidefinite. Let MG = i i i i z i = z11 , · · · , z1n , · · · , zn1 , · · · , znn

Pm

i=1 z

T

i (z i )T ,

where

∀ 1 ≤ i ≤ m.

P Pm Pm i i i 2 i 2 Note that for any 1 ≤ k, ` ≤ n, Gk``k = m i=1 zk` z`k , Gk`k` = i=1 (zk` ) and G`k`k = i=1 (z`k ) , as well as Gk``k = Gk`k` = G`k`k by partial-symmetry of G. We have m m m m X X X X i i 2 i 2 i 2 i i z`k = Gk`k` + G`k`k − 2Gk``k = 0, (zk` − z`k ) = (zk` ) + (z`k ) −2 zk` i=1

i=1

i=1

i=1

i = z i for any 1 ≤ k, ` ≤ n. Therefore, we may construct a symmetric matrix which implies that zk` `k P 2 i i Z i ∈ Sn , such that vec (Z i ) = z i for all 1 ≤ i ≤ m. We have G = m i=1 Z ⊗ Z , and so (8) is proven. 

The first part of Theorem 3.3 follows from (8) by applying the symmetrization operation on both 4 sides, while the second part of Theorem 3.3 follows from (7) and (8) by restricting to Sn . Let us now turn to proving the last part of Theorem 3.3, which is an alternative representation of the quartic SOQ forms. Obviously we need only to show that n o 2 sym cone A ⊗ A | A ∈ Sn+ ⊆ cone {a ⊗ a ⊗ a ⊗ a | a ∈ Rn } . Since there is a one-to-one mapping from quartic forms to fourth order super-symmetric tensors, it P 2 T i 4 suffices to show that for any A ∈ Sn+ , the function (xT Ax)2 can be written as a form of m i=1 (x a ) for some a1 , · · · , am ∈ Rn . Note that the so-called Hilbert’s identity (see e.g., Barvinok [5]) asserts the following: For any fixed positive integers d and n, there always exist m real vectors a1 , · · · , am ∈ Rn P T i 2d such that (xT x)d = m i=1 (x a ) . In fact, when d = 2, He et al. [17] proposed a polynomial-time algorithm to find the afore-mentioned representations where the number m is bounded by a polynomial of n, although in the original 2 version of Hilbert’s identity m is exponential in n. Since we have A ∈ Sn+ , replacing x by A1/2 y in P T 1/2 ai )4 . The desired decomposition Hilbert’s identity when d = 2, one gets (y T Ay)2 = m i=1 (y A follows, and this proves the last part of Theorem 3.3.

3.3

Duality

In this subsection, we shall discuss the duality relationships among these four cones of quartic forms. 4 Note that Sn is the ground vector space within which the duality is defined, unless otherwise specified. 12

Theorem 3.5 The cone of quartic PSD forms and the cone of quartic SOQ forms are primal-dual ∗ 4 ∗ 4 pair, i.e., Σ4n,4 = Sn+ and Sn+ = Σ4n,4 . The cone of quartic SOS forms and the cone of  2 2 ∗ ∗ 2 2 n ×n . quartic matrix PSD forms are primal-dual pair, i.e., Sn+ ×n = Σ2n,4 and Σ2n,4 = S+ 4

Remark that the primal-dual relationship between Σ4n,4 and Sn+ was also studied in [34]. Let us 4 indeed start by discussing the primal-dual pair Σ4n,4 and Sn+ . In Proposition 1 of [36], Sturm and 2 Zhang proved that for the quadratic forms, {A ∈ Sn | xT Ax ≥ 0, ∀ x ∈ D} and cone {aaT | a ∈ D} are a primal-dual pair for any closed D ⊆ Rn . We observe that a similar structure holds for the quartic forms as well. The first part of Theorem 3.5 then follows from next lemma. 4

4

Lemma 3.6 If D ⊆ Rn is closed, then Sn+ (D) := {F ∈ Sn | F(x, x, x, x) ≥ 0, ∀ x ∈ D} and cone {a ⊗ a ⊗ a ⊗ a | a ∈ D} are a primal-dual pair, i.e., 4

Sn+ (D) = (cone {a ⊗ a ⊗ a ⊗ a | a ∈ D})∗ and 

(9)

∗ 4 Sn+ (D) = cone {a ⊗ a ⊗ a ⊗ a | a ∈ D}.

Proof. Since cone {a ⊗ a ⊗ a ⊗ a | a ∈ D} is closed by Lemma 3.2, we only need to show (9). In fact, 4 if F ∈ Sn+ (D), then F • (a ⊗ a ⊗ a ⊗ a) = F(a, a, a, a) ≥ 0 for all a ∈ D. Thus F • G ≥ 0 for all G ∈ cone {a⊗a⊗a⊗a | a ∈ D}, which implies that F ∈ (cone {a ⊗ a ⊗ a ⊗ a | a ∈ D})∗ . Conversely, if F ∈ (cone {a ⊗ a ⊗ a ⊗ a | a ∈ D})∗ , then F • G ≥ 0 for all G ∈ cone {a ⊗ a ⊗ a ⊗ a | a ∈ D}. In particular, by letting G = x ⊗ x ⊗ x ⊗ x, we have F(x, x, x, x) = F • (x ⊗ x ⊗ x ⊗ x) ≥ 0 for all 4 x ∈ D, which implies that F ∈ Sn+ (D).  2

2

Let us turn to the primal-dual pair of Sn+ ×n and Σ2n,4 . For technical reasons, we shall momentarily → − 4 4 lift the ground space from Sn to the space of partial-symmetric tensors S n . This enlarges all − → the dual objects. To distinguish these two dual objects, let us use the notation ‘K ∗ ’ to indicate → − 4 → − 4 4 the dual of convex cone K ∈ Sn ⊆ S n generated in the space S n , while ‘K∗ ’ is the dual of K 4 generated in the space Sn . → − 2 2 Lemma 3.7 For partial-symmetric tensors, the cone S n+ ×n is self-dual with respect to the space − → → − n4 → − 2 2 → − 2 2 ∗ S , i.e., S n+ ×n = S n+ ×n . → − 4 Proof. According to Proposition 1 of [36] and the partial-symmetry of S n , we have 

→ n o− n o ∗ → − 4 2 n2 cone A ⊗ A | A ∈ S = F ∈ S n | F(X, X) ≥ 0, ∀ X ∈ Sn .

13

By Lemma 3.4, we have o n o → − n2 ×n2 n → − 4 2 2 S+ = F ∈ S n | F(X, X) ≥ 0, ∀ X ∈ Sn = cone A ⊗ A | A ∈ Sn . → − 2 2 → − 4 Thus S n+ ×n is self-dual with respect to the space S n .



Notice that by definition and Lemma 3.7, we have − → n o → → − n2 ×n2 − n2 ×n2  ∗ 2 n2 , Σn,4 = sym cone A ⊗ A | A ∈ S = sym S + = sym S + and by the alternative representation in Theorem 3.3 we have n o \ \→ − n2 ×n2 2 2 4 2 4 Sn+ ×n = Sn cone A ⊗ A | A ∈ Sn = Sn S+ . 2

2

Therefore the duality between Sn+ ×n and Σ2n,4 follows immediately from the following lemma. − → → − 4 → − n4 Lemma 3.8 If K ⊆ S n is a closed convex cone and K ∗ is its dual with respect to the space  S ,  T n4 T n4 ∗ − → 4 ∗ n then K S and sym K are a primal-dual pair with respect to the space S , i.e., K S =   ∗ T − → − → 4 sym K ∗ and K Sn = sym K ∗ . − → − → → − 4 4 4 Proof. For any G ∈ sym K ∗ ⊆ Sn , there is a G 0 ∈ K ∗ ⊆ S n , such that G = sym G 0 ∈ Sn . We T 4 4 0 0 0 + Gikj` + Gi`jk ). Thus for any F ∈ K Sn ⊆ Sn , it follows that then have Gijk` = 13 (Gijk`

F •G =

X

0 0 0 ) + Gi`jk + Gikj` Fijk` (Gijk`

3

1≤i,j,k,`≤n

=

X

0 0 0 + Fi`jk Gi`jk + Fikj` Gikj` Fijk` Gijk`

3

1≤i,j,k,`≤n 0

= F • G ≥ 0.  4 ∗

 T 4 ∗ T − → Therefore G ∈ K Sn , implying that sym K ∗ ⊆ K Sn .   − → ∗ − → − → → − 4 4 Moreover, if F ∈ sym K ∗ ⊆ Sn , then for any G 0 ∈ K ∗ ⊆ S n , we have G = sym G 0 ∈ sym K ∗ , →    − − − → ∗ → ∗ and G 0 • F = G • F ≥ 0. Therefore F ∈ K ∗ = cl K = K, which implies that sym K ∗ ⊆  T 4 K Sn . Finally, the duality relationship holds by the bipolar theorem and the closedness of these cones.  

3.4

The Hierarchical Structure

The last part of this section is to present a hierarchy among these four cones of quartic forms. The main result is summarized in the theorem below. 14

Theorem 3.9 If n ≥ 4, then 2

2

4

n ×n Σ4n,4 ( S+ ( Σ2n,4 ( Sn+ .

For the low dimension cases (n ≤ 3), we shall present it in Section 5.2. Evidently a quartic 4 SOS form is quartic PSD, implying Σ2n,4 ⊆ Sn+ . By invoking the duality operation and using 2 2 Theorem 3.5 we have Σ4n,4 ⊆ Sn+ ×n , while by the alternative representation in Theorem 3.3 n o 2 2 4 T 2 we have Sn+ ×n = Sn cone A ⊗ A | A ∈ Sn , and by the very definition we have Σ2n,4 = n o 2 n2 ×n2 sym cone A ⊗ A | A ∈ Sn . Therefore S+ ⊆ Σ2n,4 . Finally, the strict containing relationship is a result of the following examples. 4

Example 3.1 (Quartic forms in Sn+ \Σ2n,4 when n = 4). Let g1 (x) = x1 2 (x1 −x4 )2 +x2 2 (x2 −x4 )2 + x3 2 (x3 −x4 )2 +2x1 x2 x3 (x1 +x2 +x3 −2x4 ) and g2 (x) = x1 2 x2 2 +x2 2 x3 2 +x3 2 x1 2 +x4 4 −4x1 x2 x3 x4 , 4 then both g1 (x) and g2 (x) are in S4+ \ Σ24,4 . We refer the interested readers to [33] for more information on these two cones. 2

2

4

Example 3.2 (A quartic form in Sn+ ×n \ Σ4n,4 when n = 4). Construct F ∈ S4 , whose only nonzero entries (taking into account the super-symmetry) are F1122 = 4, F1133 = 4, F2233 = 4, F1144 = 9, F2244 = 9, F3344 = 9, F1234 = 6, F1111 = 29, F2222 = 29, F3333 = 29, and F4444 = P 3 + 25 . One may verify straightforwardly that F can be decomposed as 7i=1 Ai ⊗ Ai , with A1 =  √7        7 0 0 0 0 2 0 0 0 0 2 0 0 0 0 3 √          0  2 0 0 0   0 0 0 3   0 0 2 0  7 0 0  4 2 3         √ , A =   0 , A =  2 0 0 0 , A =  0 2 0 0 , 0 7 0     0 0 0 3      5 0 0 0 √7 0 0 3 0 0 3 0 0 3 0 0 0       −2 0 0 0 3 0 0 0 3 0 0 0            0 3 0 0  0 −2 0 0  0 3 0 0  5 6 7     A =  , A =  0 0 3 0 , and A =  0 0 −2 0 . According to  0 0 3 0      0 0 0 1 0 0 0 1 0 0 0 1 2

2

Theorem 3.3, we have F ∈ S4+ ×4 . Recall g2 (x) in Example 3.1, which is quartic PSD. Denote G 4 to be the super-symmetric tensor associated with g2 (x), thus G ∈ S4+ . One computes that G • F = 4 + 4 + 4 + 3 + 25 7 − 24 < 0. By the duality result as stipulated in Theorem 3.5, we conclude that 4 F∈ / Σ4,4 . 2

2

Example 3.3 (A quartic form in Σ2n,4 \ Sn+ ×n when n = 3). Let g3 (x) = 2x1 4 + 2x2 4 + 21 x3 4 + 6x1 2 x3 2 + 6x2 2 x3 2 + 6x1 2 x2 2 , which is obviously quartic SOS. Now recycle the notation and denote G ∈ Σ23,4 to be the super-symmetric tensor associated with g3 (x), and we have G1111 = 2, G2222 = 2,

15

2

G3333 = 12 , G1122 = 1, G1133 = 1, and G2233 = 1. If we let X = Diag (1, 1, −4) ∈ S3 , then  T   1 1 2 1 1      =  1   1 2 1   1  = −2, −4 −4 1 1 21 

G(X, X) =

X

Gijk` Xij Xk` =

1≤i,j,k,`≤3 2

X

Giikk Xii Xkk

1≤i,k≤3 2

implying that G ∈ / S3+ ×3 .

4

Cones Related to Convex Quartic Forms

In this section we shall study the cone of quartic sos-convex forms Σ2∇2 , and the cone of quartic n,4 T 4 forms which are both SOS and convex Σ2n,4 Sncvx . The aim is to incorporate these two new cones into the hierarchical structure as depicted in Theorem 3.9. Theorem 4.1 If n ≥ 4, then  \ 4  2 2 4 Σ4n,4 ( Sn+ ×n ( Σ2∇2 ⊆ Σ2n,4 Sncvx ( Σ2n,4 ( Sn+ . n,4

As mentioned in Section 2.2, an sos-convex homogeneous quartic polynomial function is both SOS T n4  2 2 and convex (see also [3]), which implies that Σ∇2 ⊆ Σn,4 Scvx . Moreover, the following n,4 example shows that a quartic SOS form is not necessarily convex. 4

2

Example 4.1 (A quartic form in Σ2n,4 \ Sncvx when n = 2). Let g4 (x) = (xT Ax)2 with A ∈ Sn , " # −3 0 2 T T and its Hessian matrix is ∇ g4 (x) = 8Axx A + 4x AxA. In particular, by letting A = 0 1 ! " # " # 0 0 0 −12 0 and x = , we have ∇2 f (x) = +  0, implying that g4 (x) is not convex. 1 0 8 0 4 The above example suggests that 2



Σ2n,4

T

4

Sncvx



( Σ2n,4 when n ≥ 2. Next we shall prove the

2

assertion that Sn+ ×n ( Σ2∇2 when n ≥ 3. To this end, let us first quote a result on the sos-convex n,4

functions due to Ahmadi and Parrilo [3]: If f (x) is a polynomial with its Hessian matrix being ∇2 f (x), then f (x) is sos-convex if and only if y T ∇2 f (x)y is a sum of squares in (x, y). For a quartic form F(x, x, x, x), its Hessian matrix is 12F(x, x, ·, ·). Therefore, F is quartic sos2 2 convex if and only if F(x, x, y, y) is is a sum of squares in (x, y). Now if F ∈ Sn+ ×n , then by 16

2

Theorem 3.3 we may find matrices A1 , · · · , Am ∈ Sn such that F = F(x, x, y, y) = F(x, y, x, y) =

m X

X

Pm

t=1 A

t

⊗ At . We have

xi yj xk y` Atij Atk`

t=1 1≤i,j,k,`≤n

=

m X





 X

X

xi yj Atij   xk y` Atk`  1≤i,j≤n 1≤k,`≤n

 t=1

=

m X

x T At y

2

,

t=1 2

2

n ×n which is a sum of squares in (x, y), hence sos-convex. This proves S+ ⊆ Σ2∇2 , and the example n,4 below rules out the equality when n ≥ 3.

Example 4.2 (A quartic form in Σ2∇2

n,4

2

2

n ×n \ S+ when n = 3). Recall g3 (x) = 2x1 4 + 2x2 4 +

1 4 2 2 2 2 2 2 2 x3 + 6x1 x3 + 6x2 x3 + 6x1 x2

in Example 3.3, which is shown not to be quartic matrix PSD. Moreover, it is straightforward to compute that   T    T    T   x1 x1 0 0 x3 x3 x2 2 0 0            ∇2 g3 (x) = 24  x2   x2  + 12  x3   x3  + 12  0   0  + 12  0 x1 2 0   0, x3 x3 x2 x2 x1 x1 0 0 0 2 2 

which implies that g3 (x) is quartic sos-convex. T 4 A natural question regarding to the hierarchical structure in Theorem 4.1 is whether Σ2n,4 Sncvx = Σ2∇2 or not. In fact, the relationship between convex, sos-convex, and SOS is a highly interesting n,4

subject which attracted many speculations in the literature. Ahmadi and Parrilo [3] gave an explicit example with a three dimensional homogeneous form of degree eight, which they showed to be both convex and SOS not sos-convex. However, for quartic polynomials (degree four),  but T  4 2 n 2 such an explicit instant in Σn,4 Scvx \ Σ∇2 is not in sight. Notwithstanding the difficulty in n,4 constructing an explicit quartic example, on the premise that P 6= N P in Section 5 we will show   T 4 that Σ2n,4 Sncvx 6= Σ2∇2 . With that piece of information, the chain of containing relations n,4 manifested in Theorem 4.1 is complete, under the assumption that P 6= N P . However, the following open question remains:  T 4  Question 4.1 Find an explicit instant in Σ2n,4 Sncvx \ Σ2∇2 : a quartic form that is both SOS n,4 and convex, but not sos-convex. The two newly introduced cones in this section are related to the convexity properties. Some more words on convex quartic forms are in order here. As mentioned in Section 2.2, for a quartic form 4 F ∈ Sncvx , its Hessian matrix is 12F(x, x, ·, ·). Therefore, F is convex if and only if F(x, x, ·, ·)  0 for all x ∈ Rn , which is equivalent to F(x, x, y, y) ≥ 0 for all x, y ∈ Rn . In fact, it is also equivalent 17

2

to F(X, Y ) ≥ 0 for all X, Y ∈ Sn+ . To see why, we first decompose the positive semidefinite matrices P P X and Y , and let X = ni=1 xi (xi )T and Y = nj=1 y j (y j )T (see e.g., Sturm and Zhang [36]). Then   n n X X F(X, Y ) = F  xi (xi )T , y j (y j )T  i=1

=

X

j=1 i

i T

F x (x ) , y j (y j )T



1≤i,j≤n

=

X

 F xi , xi , y j , y j ≥ 0,

1≤i,j≤n

if F(x, x, y, y) ≥ 0 for all x, y ∈ Rn . Note that the converse is trivial, as it reduces to let X and Y be rank-one positive semidefinite matrices. Thus we have the following equivalence for the quartic convex forms. 4

Proposition 4.2 For a given quartic form F ∈ Sn , the following statements are equivalent: • F(x, x, x, x) is convex; • F(x, x, ·, ·) is positive semidefinite for all x ∈ Rn ; • F(x, x, y, y) ≥ 0 for all x, y ∈ Rn ; 2

• F(X, Y ) ≥ 0 for all X, Y ∈ Sn+ . For the relationship between the cone of quartic convex forms and the cone of quartic SOS forms, 4 Example 4.1 has ruled out the possibility that Σ2n,4 ⊆ Sncvx , while Blekherman [6] proved that 4 Σ2n,4 is not contained in Sncvx . Therefore these two cones are indeed distinctive. According to Blekherman [6], the cone of quartic convex forms is actually much bigger than the cone of quartic SOS forms when n is sufficiently large. However, at this point we are not aware of any explicit 4 instant belonging to Sncvx \ Σ2n,4 . In fact, according to a recent working paper of Ahmadi et al. [1], this kind of instants exist only when n ≥ 4. Anyway, the following challenge remains: 4

Question 4.2 Find an explicit instant in Sncvx \ Σ2n,4 : a quartic convex form that is not quartic SOS.

5

Complexity and Low Dimensions of the Quartic Cones

In this section, we shall study the computational complexity issues for the membership queries regarding these cones of quartic forms. Unlike their quadratic counterparts where the positive 18

semidefniteness can be checked in polynomial-time, the case for the quartic cones are substantially subtler. We also study the low dimension cases of these cones, as a complement to the result of hirarchary relationship in Theorem 4.1.

5.1

Complexity

Let us start with easy cases. It is well known that deciding whether a general polynomial function is SOS can be done in polynomial-time, by resorting to checking the feasibility of an SDP. Therefore, the membership query for Σ2n,4 can be done in polynomial-time. By the duality relationship claimed 2 2 in Theorem 3.5, the membership query for Sn+ ×n can also be done in polynomial-time. In fact, 4 for any quartic form F ∈ Sn , we may rewrite F as an n2 × n2 matrix, to be denoted by MF , and n2 ×n2 then Theorem 3.3 assures that F ∈ S+ if and only if MF is positive semedefinite, which can be checked in polynomial-time. Moreover, as discussed in Section 4, a quartic form F is sos-convex if  and only if y T ∇2 F(x, x, x, x) y = 12F(x, x, y, y) is a sum of squares in (x, y), which can also be checked in polynomial-time. Therefore, the membership checking problem for Σ2∇2 can be done n,4 in polynomial-time as well. Summarizing, we have: 2

2

Proposition 5.1 Whether a quartic form belongs to Σ2n,4 , Sn+ ×n , or Σ2∇2 , can be verified in n,4 polynomial-time. Unfortunately, the membership checking problems for all the other cones that we have discussed so far are difficult. To see why, let us introduce a famous cone of quadratic functions: the co-positive cone n o 2 C := A ∈ Sn | xT Ax ≥ 0, ∀ x ∈ Rn+ , whose membership query is known to be co-NP-complete. The dual of the co-positive cone is the cone of completely positive matrices, defined as  C∗ := cone xxT | x ∈ Rn+ . Recently, Dickinson and Gijben [14] gave a formal proof for the NP-completeness of the membership problem for C∗ . 4

Proposition 5.2 It is co-NP-complete to check if a quartic form belongs to Sn+ (the cone of quartic PSD forms). Proof. We shall reduce the problem to checking the membership of the co-positive cone C. In 2 4 particular, given any matrix A ∈ Sn , construct an F ∈ Sn , whose only nonzero entries are ( Aik i 6= k, 3 (10) Fiikk = Fikik = Fikki = Fkiik = Fkiki = Fkkii = Aik i = k. 19

For any x ∈ Rn , F(x, x, x, x) =

X

2

2

(Fiikk + Fikik + Fikki + Fkiik + Fkiki + Fkkii ) xi xk +

=

Fiiii xi 4

i=1

1≤i