An unconstrained minimization method for solving low rank SDP relaxations of the max cut problem

L. Grippo, L. Palagi, V. Piccialli

Università degli Studi di Roma "La Sapienza", Dipartimento di Informatica e Sistemistica "A. Ruberti", Via Ariosto 25 - 00185 Roma, Italy. e-mail (Grippo): [email protected]; e-mail (Palagi): [email protected]; e-mail (Piccialli): [email protected]

Abstract. In this paper we consider low-rank semidefinite programming (LRSDP) relaxations of the max cut problem. Using the Gramian representation of a positive semidefinite matrix, the LRSDP problem is transformed into the nonconvex nonlinear programming problem of minimizing a quadratic function with quadratic equality constraints. First, we establish some new relationships between the two formulations and we give necessary and sufficient conditions of global optimality. Then we propose a continuously differentiable exact merit function that exploits the special structure of the constraints and we use this function to define an efficient and globally convergent algorithm for the solution of the LRSDP problem. Finally, we test our code on an extended set of instances of the max cut problem and we report comparisons with other existing codes.

Keywords: semidefinite programming - low-rank factorization - max cut problem - nonlinear programming - exact penalty functions

1 Introduction

This paper concerns the solution of large scale Semidefinite Programming (SDP) problems arising as relaxations of the max cut problem in a graph. Given a simple undirected graph G = (V, E) weighted on the edges, the max cut problem consists in finding a partition of the vertices such that the sum of the weights on the edges between the two parts of the partition is maximum. The max cut problem is a well-known NP-hard problem, and good bounds can be obtained by using convex SDP relaxations [17]. The simplest SDP relaxation of the max cut problem is of the form:

    min   trace(QX)
    s.t.  diag(X) = e,
          X ⪰ 0,  X ∈ S^n,                                    (1)
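As a quick numerical illustration of problem (1): for any cut vector x ∈ {−1, 1}^n, the rank-one matrix X = xx^T is feasible (it has unit diagonal and is positive semidefinite), so the SDP optimal value bounds the best cut value from below. A minimal sketch with random data Q (the data here is purely illustrative):

```python
import numpy as np

# For any cut vector x in {-1, 1}^n, X = x x^T is feasible for the SDP (1):
# diag(X) = e and X is positive semidefinite.  Q is random illustrative data.
rng = np.random.default_rng(0)
n = 6
Q = rng.standard_normal((n, n))
Q = (Q + Q.T) / 2                      # symmetric data matrix

x = rng.choice([-1.0, 1.0], size=n)    # an arbitrary cut vector
X = np.outer(x, x)                     # rank-one feasible point of (1)

assert np.allclose(np.diag(X), np.ones(n))          # diag(X) = e
assert np.all(np.linalg.eigvalsh(X) >= -1e-10)      # X positive semidefinite
print(np.trace(Q @ X))                  # objective trace(QX) = x^T Q x
```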

where the data matrix Q is an n × n real symmetric matrix, trace(QX) denotes the trace inner product of matrices, diag(X) is the vector of dimension n containing all the diagonal elements of X, and the n × n matrix variable X is required to be symmetric and positive semidefinite, as indicated by the notation X ⪰ 0, X ∈ S^n. Several algorithms have been proposed in the literature for solving SDP problems, many of them belonging to the interior point class (see for example the survey [27] and references therein). As an alternative to interior point methods, a recent trend has been the development of algorithms based on nonlinear programming reformulations of the SDP problem. The first idea goes back to Homer and Peinado [22], who use the change of variables X_ij = v_i^T v_j / (‖v_i‖‖v_j‖) for the elements of X, to transform problem (1) into an unconstrained optimization problem in the new variables v_i ∈ R^n for i = 1, . . . , n. In particular, they define a parallel computational scheme in order to cope with the large dimensionality of the new problem. Burer and Monteiro in [6] propose a variant of Homer and Peinado's approach where they use the change of variables X = LL^T, where L is a lower triangular matrix. More recently, Burer and Monteiro in [7, 8] recast a general linear SDP problem as a low rank semidefinite programming problem (LRSDP) by applying the change of variables X = VV^T, where V is an n × r, r < n, rectangular matrix. The value of r is chosen by exploiting the result proved in Barvinok [1] and Pataki [31], which states that there exists an optimal solution of a linearly constrained SDP problem with rank r satisfying r(r + 1)/2 ≤ m, where m is the number of linear constraints. For the solution of the LRSDP problem, Burer and Monteiro propose an augmented Lagrangian method, which requires the solution of a sequence
of unconstrained problems for different values of a penalty parameter and of the Lagrange multiplier estimates. In this paper we focus on the max cut SDP relaxation (1), and on the corresponding LRSDP problem. In fact, we also consider the reduced problem where we replace the variable X with a rectangular matrix V of dimension n × r. First, we study optimality conditions and we establish necessary and sufficient conditions, expressed in terms of the Lagrange multipliers, for guaranteeing that a stationary point of the Lagrangian function of the LRSDP problem yields a global minimizer that solves the original SDP problem. In particular, we show that known sufficient optimality conditions [7] can be proven to be also necessary. Then we define a new unconstrained differentiable exact merit function for the computation of stationary points of the LRSDP problem. The augmented Lagrangian approach introduced in [7], although quite effective in practice, has two intrinsic drawbacks: a sequence of unconstrained minimizations needs to be performed, and some a posteriori assumptions on the behavior of the sequence are needed in order to prove global convergence. The exact penalty method defined in this paper overcomes both these drawbacks. Indeed, we need only a single unconstrained minimization of the merit function for a fixed, sufficiently small value of the penalty parameter, and we can prove the global convergence of the algorithm without imposing any assumption on the behavior of the generated sequence. Therefore, we feel that, at least in the particular case of the problem arising from the relaxation of max cut, our approach fills the gap left by the seminal work of Burer and Monteiro [7, 8]. The paper is organized as follows. In Section 2, we report some known results on the max cut problem and we state the nonlinear relaxation that will be addressed in the paper. In particular, we use a Kronecker product notation to reformulate the problem in a standard NLP form.
In Section 3, we review the main optimality conditions for the nonlinear programming problem and we state necessary and sufficient conditions for global optimality. In Section 4 we recast the original equality constrained problem as an unconstrained one, using a penalty function approach that exploits the special structure of the problem. In Section 5 we define a globally convergent algorithm for the computation of stationary points of the LRSDP problem, and finally in Section 6 we report extensive numerical results on standard instances of the max cut problem.
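To make the low-rank change of variables concrete, the sketch below picks the smallest r with r(r + 1)/2 ≥ m (here m = n diagonal constraints, following the Barvinok–Pataki result cited above) and evaluates the objective and constraint violation of the resulting nonlinear program. The function names are illustrative only, not the authors' code:

```python
import math
import numpy as np

def pataki_rank(m: int) -> int:
    """Smallest r with r(r+1)/2 >= m (Barvinok-Pataki bound)."""
    return math.ceil((math.sqrt(8 * m + 1) - 1) / 2)

def lrsdp_objective(V: np.ndarray, Q: np.ndarray) -> float:
    """q(V) = trace(Q V V^T), the objective after the substitution X = V V^T."""
    return float(np.trace(Q @ V @ V.T))

def constraint_violation(V: np.ndarray) -> float:
    """max_i |h_i(V) - 1| with h_i(V) = ||v_i||^2, v_i the rows of V."""
    return float(np.max(np.abs(np.sum(V * V, axis=1) - 1.0)))

n = 10
r = pataki_rank(n)                      # m = n diagonal constraints for max cut
rng = np.random.default_rng(1)
V = rng.standard_normal((n, r))
V /= np.linalg.norm(V, axis=1, keepdims=True)   # project rows onto unit sphere
print(r, constraint_violation(V))       # feasible V: violation ~ 0
```

For n = 10 this gives r = 4, since 4·5/2 = 10; the matrix variable shrinks from n(n+1)/2 free entries to nr.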

1.1 Notation and terminology

We denote by IR^n the space of real n-dimensional column vectors and by IR^{n×m} the space of real n × m matrices. By S^n we indicate the space of real n × n symmetric matrices. Given two n × n square matrices Q and A, we define the usual trace inner product by letting

    trace(QA) = Σ_{i=1}^n Σ_{j=1}^n q_ij a_ij,
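In code, this inner product is just an elementwise multiply-and-sum. A quick numerical sanity check with random symmetric matrices (illustration only; for symmetric A the sum Σ q_ij a_ij coincides with trace(QA)):

```python
import numpy as np

# trace(QA) equals the elementwise sum of Q * A when A is symmetric.
rng = np.random.default_rng(2)
Q = rng.standard_normal((4, 4)); Q = (Q + Q.T) / 2
A = rng.standard_normal((4, 4)); A = (A + A.T) / 2

assert np.isclose(np.trace(Q @ A), np.sum(Q * A))
# Frobenius norm induced by this inner product: ||A||^2 = trace(A^T A)
assert np.isclose(np.linalg.norm(A) ** 2, np.trace(A.T @ A))
```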

and we indicate by ‖A‖ the induced Frobenius norm: ‖A‖² = trace(A^T A). If v ∈ IR^n, ‖v‖ is intended as the Euclidean norm of v. Given a square matrix A ∈ IR^{n×n}, we denote by diag(A) the vector of dimension n containing all the diagonal elements of A. Given a vector a ∈ IR^n, we denote by Diag(a) the diagonal square matrix of dimension n with the elements of a on the diagonal. Moreover, we indicate by e_i the vector of zero elements except for the i-th, which equals one, by e the vector of all ones, and we set E_ii = Diag(e_i), I_p = Diag(e) with e ∈ IR^p. In the paper, we make use of the Kronecker product ⊗ (see, for instance, [24]). We recall that, given two matrices A (m × n) and B (p × q), the Kronecker product A ⊗ B is the mp × nq matrix given by

    A ⊗ B = [ a_11 B   a_12 B   ...   a_1n B
                ...      ...    ...     ...
              a_m1 B   a_m2 B   ...   a_mn B ].

The basic properties of the Kronecker product are the following identities:

    A ⊗ B ⊗ C = (A ⊗ B) ⊗ C = A ⊗ (B ⊗ C),
    (A + B) ⊗ (C + D) = A ⊗ C + A ⊗ D + B ⊗ C + B ⊗ D,
    (A ⊗ B)(C ⊗ D) = (AC ⊗ BD),

where we assume that all the matrix operations appearing in each identity can be performed. Note that here and in the sequel, in order to simplify notation, we indicate by (AC ⊗ BD) the Kronecker product ((AC) ⊗ (BD)). The transpose of a Kronecker product is (A ⊗ B)^T = A^T ⊗ B^T. Given a matrix A ∈ S^n, with spectrum σ(A) = {λ_1, ..., λ_n}, and a matrix
B ∈ S^m, with spectrum σ(B) = {μ_1, ..., μ_m}, the spectrum of A ⊗ B is given by

    σ(A ⊗ B) = {λ_i μ_j : i = 1, ..., n; j = 1, ..., m}.

Furthermore, in the Frobenius norm, we have that ‖A ⊗ B‖ = ‖A‖‖B‖.
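These identities are easy to verify numerically. The sketch below checks the mixed-product property, the transpose rule, and the spectrum and norm facts above with small random matrices (the shapes are arbitrary, chosen so all products are defined):

```python
import numpy as np

rng = np.random.default_rng(3)
A, B = rng.standard_normal((2, 3)), rng.standard_normal((4, 5))
C, D = rng.standard_normal((3, 2)), rng.standard_normal((5, 4))

# mixed-product property: (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD)
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
# transpose rule: (A ⊗ B)^T = A^T ⊗ B^T
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))

# spectrum and Frobenius norm for symmetric S, T
S = rng.standard_normal((3, 3)); S = (S + S.T) / 2
T = rng.standard_normal((2, 2)); T = (T + T.T) / 2
eigs = np.sort(np.linalg.eigvalsh(np.kron(S, T)))
prods = np.sort([l * m for l in np.linalg.eigvalsh(S)
                       for m in np.linalg.eigvalsh(T)])
assert np.allclose(eigs, prods)                     # σ(S ⊗ T) = {λ_i μ_j}
assert np.isclose(np.linalg.norm(np.kron(S, T)),
                  np.linalg.norm(S) * np.linalg.norm(T))
```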

2 SDP formulation and relaxations of max cut

Let G = (V, E) be a weighted undirected graph, with n = |V| nodes and weights w_ij for (i, j) ∈ E. Let A ∈ S^n be the weighted adjacency matrix

    a_ij = w_ij  if (i, j) ∈ E,    a_ij = 0  otherwise.

The max cut problem consists in finding a partition of the set of nodes V of the weighted undirected graph G so as to maximize the sum of the weights on the edges that have one end in each side of the partition. Let the vector x ∈ {−1, 1}^n represent any cut in the graph, i.e. the sets {i ∈ {1, ..., n} : x_i = +1} and {i ∈ {1, ..., n} : x_i = −1} constitute a partition of the set of nodes. Then the weight of the cut induced by the partition is given by

    (1/2) Σ_{i<j} a_ij (1 − x_i x_j).

Now we show that if ‖v_i‖² > β > 1 for an index i ∈ {1, ..., n}, then P(v) > P(v_0). Let t ≡ max_{1≤h≤n} ‖v_h‖². First we note that ‖v_i‖² > β > 1 implies that

    t = max_{1≤h≤n} ‖v_h‖² > β > 1,

and

    ‖v‖² ≤ n max_{1≤h≤n} ‖v_h‖² = nt.


Let us consider again the inequality (44); hence we can write

    P(v) ≥ −2Cnt − Cn²t² + (1/ε)(t − 1)²
         = t² ( −2Cn/t − Cn² + (1/ε)(1 − 1/t)² )
         ≥ t² ( −Cn(2/β + n) + (1/ε)(1 − 1/β)² ).            (50)

Furthermore, by (49) we can write P(v_0) ≤ Cn ≤ Cnβ² < Cnt². Thus, for ε sufficiently small, the right-hand side of (50) is at least Cnt², but then we have by (49) that P(v) > P(v_0).

Proof of Proposition 14. By the expression (36) of ∇P(v) we can write

    v^T(E_kk ⊗ I_r)∇P(v) = v^T(E_kk ⊗ I_r)∇L(v, λ(v))
        + Σ_{i=1}^n v^T(E_kk ⊗ I_r) ( ∇λ_i(v) + (2/ε)∇h_i(v) ) (h_i(v) − 1).

From (29) and (iv) of Proposition 10, using the properties of Kronecker products, recalling that h_k(v) = v^T(E_kk ⊗ I_r)v and observing that
E_kk E_ii = 0_{n×n} for all i ≠ k, and E_kk E_ii = E_kk for i = k, we get

    v^T(E_kk ⊗ I_r)∇P(v)
      = 2λ_k(v)(h_k(v) − 1)
        − Σ_{i=1}^n (h_i(v) − 1) v^T [ (E_kk E_ii Q + E_kk Q E_ii) ⊗ I_r ] v
        + (4/ε) Σ_{i=1}^n v^T(E_kk E_ii ⊗ I_r)v (h_i(v) − 1)
      = 2λ_k(v)(h_k(v) − 1) − (h_k(v) − 1) v^T(E_kk Q ⊗ I_r) v
        + (4/ε) h_k(v)(h_k(v) − 1)
        − Σ_{i=1}^n (h_i(v) − 1) v^T(E_kk Q E_ii ⊗ I_r) v
      = ( 3λ_k(v) + (4/ε)h_k(v) ) (h_k(v) − 1)
        − Σ_{i=1}^n (h_i(v) − 1) v^T(E_kk Q E_ii ⊗ I_r) v.

Since ∇P (¯ v ) = 0, from the last equality we get for all k = 1, . . . n · ¸ X n 4 (hk (¯ v ) − 1) 3λk (¯ v ) + hk (¯ v) − (hi (¯ v ) − 1)¯ v T [Ekk QEii ⊗ Ir ] v¯ = 0. ² i=1 (51) Now, we observe that v T [Ekk QEii ⊗ Ir ] v = qki vkT vi so that, setting bkk

=

4 3λk (¯ v ) + hk (¯ v ) − qkk hk (¯ v ), ²

bik

=

bki = −qki v¯kT v¯i

zi

=

hi (¯ v ) − 1,

i 6= k,

formula (51) becomes bkk zk −

X

bki zi = 0

k = 1, . . . , n.

i6=k

Letting B = (bij )i,j=1,...,n ∈ S n , the equations above can be rewritten as the homogeneous system Bz = 0.
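The step from strict diagonal dominance (established next) to z = 0 rests on the Levy–Desplanques theorem: a strictly diagonally dominant matrix is nonsingular, so Bz = 0 forces z = 0. A small numerical illustration with an arbitrary dominant matrix, not the particular B of the proof:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
B = rng.standard_normal((n, n))
B = (B + B.T) / 2
# enforce strict dominance: |b_kk| > sum_{i != k} |b_ki|
np.fill_diagonal(B, np.sum(np.abs(B), axis=1) + 1.0)

off_diag = np.sum(np.abs(B), axis=1) - np.abs(np.diag(B))
assert np.all(np.abs(np.diag(B)) > off_diag)    # strictly dominant
assert abs(np.linalg.det(B)) > 1e-8             # hence nonsingular
z = np.linalg.solve(B, np.zeros(n))
assert np.allclose(z, 0.0)                      # unique solution of Bz = 0
```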


Now we show that the choice of ε implies that the matrix B is strictly diagonally dominant, that is, |b_kk| > Σ_{i≠k} |b_ki| for all k, namely that

    | 3λ_k(v̄) + (4/ε)h_k(v̄) − q_kk h_k(v̄) | > Σ_{i≠k} | q_ki v̄_k^T v̄_i |    for k = 1, ..., n.

The above inequality is implied by

    (4/ε)h_k(v̄) − 3|λ_k(v̄)| − Σ_{i=1}^n |q_ki v̄_k^T v̄_i| > 0    for k = 1, ..., n.       (52)

Since

    |λ_k(v̄)| = | v̄^T(E_kk Q ⊗ I_r)v̄ | ≤ ‖(E_kk ⊗ I_r)v̄‖ ‖(Q ⊗ I_r)v̄‖ ≤ √r ‖Q‖ ‖v̄_k‖ ‖v̄‖

and

    Σ_{i=1}^n |q_ki| ‖v̄_k‖ ‖v̄_i‖ = ‖v̄_k‖ Σ_{i=1}^n |q_ki| ‖v̄_i‖ ≤ n ‖v̄_k‖ max_{i=1,...,n} |q_ki| ‖v̄_i‖,

condition (52) is implied, in turn, by

    (4/ε)h_k(v̄) − 3√r ‖Q‖ ‖v̄_k‖ ‖v̄‖ − n ‖v̄_k‖ max_{i=1,...,n} |q_ki| ‖v̄_i‖ > 0.         (53)

By Proposition 13, we know that for all α ∈ (0, 1) we can choose ε ≤ ε̂ such that α ≤ h_i(v̄) = ‖v̄_i‖² ≤ β for all v ∈ L_0. Thus condition (53) is implied by

    (4/ε)α − 3√(rβ) ‖Q‖ ‖v̄‖ − nβ max_{i=1,...,n} |q_ki| > 0    for k = 1, ..., n.

As

    ‖v‖ = ( Σ_{i=1}^n ‖v_i‖² )^{1/2} ≤ ( Σ_{i=1}^n β )^{1/2} = √(nβ),

we can finally write

    (4/ε)α − 3√(nr) β ‖Q‖ − nβ max_{i=1,...,n} |q_ki| ≥ (4/ε)α − 3nβ‖Q‖ − nβ‖Q‖ > 0    for k = 1, ..., n.

Hence, for ε satisfying (40), the above inequality is satisfied, so that B is strictly diagonally dominant, and this implies that the unique solution of
the system Bz = 0 is z = 0, namely h_i(v̄) = 1, which in turn implies that v̄ is feasible for Problem (NLP_r). Hence, from ∇P(v̄) = 0, recalling (ii) of Proposition 11, we get ∇_v L(v̄, λ(v̄)) = 0, which gives the first order optimality conditions, and P(v̄) = q(v̄).
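The feasibility conclusion h_i(v̄) = 1 says that each row v̄_i of V̄ lies on the unit sphere, so X = V̄ V̄^T satisfies the constraints of problem (1). A quick check with a row-normalized random V (illustration only):

```python
import numpy as np

rng = np.random.default_rng(5)
n, r = 8, 3
V = rng.standard_normal((n, r))
V /= np.linalg.norm(V, axis=1, keepdims=True)   # enforce h_i(V) = ||v_i||^2 = 1

X = V @ V.T
assert np.allclose(np.diag(X), 1.0)                 # diag(X) = e
assert np.all(np.linalg.eigvalsh(X) >= -1e-10)      # X psd, rank <= r
```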

Appendix B

In this section we report the tables containing the characteristics of the test problems used.

graph      |V|    |E|
mcp100     100    269
mcp124-1   124    149
mcp124-2   124    318
mcp124-3   124    620
mcp124-4   124    1271
mcp250-1   250    331
mcp250-2   250    612
mcp250-3   250    1283
mcp250-4   250    2421
mcp500-1   500    625
mcp500-2   500    1223
mcp500-3   500    2355
mcp500-4   500    5120
maxG11     800    1600
maxG32     2000   4000
maxG51     1000   5909
maxG55     5000   12498
maxG60     7000   17148

Table B1: The SDPLIB max cut problems

graph           |V|    |E|
torusg3-8       512    1536
torusg3-15      3375   10125
toruspm3-8-50   512    1536
toruspm3-15-50  3375   10125

Table B2: The torus max cut problems

graph  |V|     |E|
G01    800     19176
G14    800     4694
G22    2000    19990
G35    2000    11778
G36    2000    11766
G43    1000    9990
G48    3000    6000
G52    1000    5916
G57    5000    10000
G58    5000    29570
G63    7000    41459
G64    7000    41459
G65    8000    16000
G66    9000    18000
G67    10000   20000
G70    10000   9999
G72    10000   20000
G77    14000   28000
G81    20000   40000

Table B3: The Gset max cut problems

References

[1] A. Barvinok. Problems of distance geometry and convex properties of quadratic maps. Discrete & Computational Geometry, 13:189–202 (1995).
[2] S. J. Benson, Y. Ye, and X. Zhang. Solving large-scale sparse semidefinite programs for combinatorial optimization. SIAM Journal on Optimization, 10(2):443–461 (2000).
[3] A. Ben-Tal and M. Teboulle. Hidden convexity in some nonconvex quadratically constrained quadratic programming. Mathematical Programming, 72:51–63 (1996).
[4] D. P. Bertsekas. Nonlinear Programming. Athena Scientific (1999).


[5] S. Burer, R. D. C. Monteiro, and Y. Zhang. Rank-two relaxation heuristics for max cut and other binary quadratic programs. SIAM Journal on Optimization, 12(2):503–521 (2002).
[6] S. Burer and R. D. C. Monteiro. A projected gradient algorithm for solving the maxcut SDP relaxation. Optimization Methods and Software, 15:175–200 (2001).
[7] S. Burer and R. D. C. Monteiro. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, Ser. B, 95:329–357 (2003).
[8] S. Burer and R. D. C. Monteiro. Local minima and convergence in low-rank semidefinite programming. Mathematical Programming, Ser. A, 103:427–444 (2005).
[9] C. Delorme and S. Poljak. Laplacian eigenvalues and the maximum cut problem. Mathematical Programming, 62(3):557–574 (1993).
[10] G. Di Pillo and L. Grippo. Exact penalty functions in constrained optimization problems. SIAM Journal on Control and Optimization, 27(6):1333–1360 (1989).
[11] E. D. Dolan and J. J. Moré. Benchmarking optimization software with performance profiles. Mathematical Programming, Ser. A, 91:201–213 (2002).
[12] R. Fletcher. A new approach to variable metric algorithms. Computer Journal, 13:317–322 (1970).
[13] K. Fujisawa, M. Fukuda, M. Kojima, and K. Nakata. Numerical evaluation of SDPA (Semidefinite Programming Algorithm). In High Performance Optimization, H. Frenk, K. Roos, T. Terlaky, and S. Zhang (eds.), Kluwer Academic Press, pp. 267–301 (1999).
[14] K. Fujisawa, M. Kojima, and K. Nakata. Exploiting sparsity in primal-dual interior-point methods for semidefinite programming. Mathematical Programming, Ser. B, 79:235–253 (1997).
[15] K. Fujisawa, M. Kojima, K. Nakata, and M. Yamashita. SDPA (SemiDefinite Programming Algorithm) user's manual — version 6.2.0. Research Report B-308, Dept. of Mathematical and Computing Sciences, Tokyo Institute of Technology, December 1995, revised September 2004.
[16] M. X. Goemans. Semidefinite programming in combinatorial optimization. Mathematical Programming, 79:143–161 (1997).


[17] M. X. Goemans and D. P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM, 42(6):1115–1145 (1995).
[18] L. Grippo and M. Sciandrone. Nonmonotone globalization techniques for the Barzilai–Borwein gradient method. Computational Optimization and Applications, 23:143–169 (2002).
[19] M. Grötschel, L. Lovász, and A. Schrijver. The ellipsoid method and its consequences in combinatorial optimization. Combinatorica, 1:169–197 (1981).
[20] C. Helmberg and F. Rendl. A spectral bundle method for semidefinite programming. SIAM Journal on Optimization, 3:673–696 (1999).
[21] M. Hestenes. Multiplier and gradient methods. Journal of Optimization Theory and Applications, 4:303–320 (1969).
[22] S. Homer and M. Peinado. Design and performance of parallel and distributed approximation algorithms for maxcut. Journal of Parallel and Distributed Computing, 46:48–61 (1997).
[23] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge (1985).
[24] R. A. Horn and C. R. Johnson. Topics in Matrix Analysis. Cambridge University Press, New York (1986).
[25] M. Kojima, S. Shindoh, and S. Hara. Interior-point methods for the monotone semidefinite linear complementarity problem in symmetric matrices. SIAM Journal on Optimization, 7:86–125 (1997).
[26] M. Laurent, S. Poljak, and F. Rendl. Connections between semidefinite relaxations of the maxcut and stable set problems. Mathematical Programming, 77:225–246 (1997).
[27] R. D. C. Monteiro. First- and second-order methods for semidefinite programming. Mathematical Programming, 97:209–244 (2003).
[28] J. Moré. Generalization of the trust region problem. Optimization Methods and Software, 2:189–209 (1993).
[29] Yu. Nesterov and A. Nemirovskii. Self-concordant functions and polynomial time methods in convex programming. Central Economical and Mathematical Institute, U.S.S.R. Academy of Science, Moscow (1990).


[30] Yu. Nesterov and A. Nemirovskii. Interior-Point Polynomial Algorithms in Convex Programming. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA (1994).
[31] G. Pataki. On the rank of extreme matrices in semidefinite programs and the multiplicity of optimal eigenvalues. Mathematics of Operations Research, 23:339–358 (1998).
[32] S. Poljak, F. Rendl, and H. Wolkowicz. A recipe for semidefinite relaxation for 0–1 quadratic programming. Journal of Global Optimization, 7:51–73 (1995).
[33] S. Poljak and Z. Tuza. Maximum cuts and largest bipartite subgraphs. In Combinatorial Optimization, W. Cook, L. Lovász, and P. Seymour (eds.), DIMACS Series in Discrete Mathematics and Theoretical Computer Science, AMS (1995).
[34] M. J. D. Powell. A method for nonlinear constraints in minimization problems. In Optimization, R. Fletcher (ed.), pp. 283–298, Academic Press, New York (1969).
[35] R. Stern and H. Wolkowicz. Indefinite trust region subproblems and nonsymmetric eigenvalue perturbations. SIAM Journal on Optimization, 5(2):286–313 (1995).
[36] M. Yamashita, K. Fujisawa, and M. Kojima. Implementation and evaluation of SDPA 6.0 (SemiDefinite Programming Algorithm 6.0). Optimization Methods and Software, 18:491–505 (2003).
