Speed reduced gradient algorithm for nonlinear programming with applications

Abdelkrim El Mouatasim
Ibn Zohr University, Faculty of Polydisciplinary Ouarzazate (FPO), B.P. 284, Ouarzazate 45800, Morocco

ISCSA'17: Errachidia, 26-28 October 2017
Plan of Presentation

1. Introduction
2. Reduced Gradient Method
   Reduced problem
3. Algorithm of reduced gradient
   Algorithm of RG
   Speed reduced gradient algorithm
4. Convergence analysis
5. Numerical experiments
   Test Problems
   Applications
Introduction
In this work, we add a Nesterov step to the reduced gradient algorithm for minimizing a convex differentiable function subject to linear equality constraints and nonnegativity bounds on the variables. At each iteration, we compute a search direction from the reduced gradient and perform the line search by a bisection algorithm or the Armijo rule. Under suitable assumptions, the convergence rate of the speed reduced gradient (SRG) algorithm is shown to be significantly better, both theoretically and practically. The SRG algorithm is implemented in Octave/Matlab and compared with the Frank-Wolfe algorithm on several problems; the numerical results show the efficiency of our approach. We also give applications to ODEs, optimal control, image and video co-localization, and machine learning.
Reduced Gradient Method: Reduced problem
From now on, we consider a nonlinear programming problem with linear equality constraints of the form

    minimize   f(x)
    subject to Ax = b,        (1)
               0 ≤ x,

where f : R^n → R is a convex, twice continuously differentiable function, A is an m × n matrix with m ≤ n, and b is a vector in R^m.
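As a running illustration, the following Octave/Matlab sketch sets up a small instance of problem (1); the quadratic objective and the data Q, c, A, b, x0 are hypothetical choices for illustration only, not from the paper.

    % Small hypothetical instance of problem (1): f a convex quadratic.
    n = 4; m = 2;
    Q = diag([2 1 3 1]);           % positive definite, so f is convex
    c = [-1; 0; -2; 1];
    f  = @(x) 0.5*x'*Q*x + c'*x;   % objective f(x)
    gf = @(x) Q*x + c;             % gradient of f
    A = [1 1 1 0;                  % m x n equality constraints, m <= n
         0 1 0 1];
    b = [1; 1];
    x0 = [0.5; 0.5; 0; 0.5];       % feasible: A*x0 = b and x0 >= 0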
The reduced gradient method begins with a basis B and a feasible solution x^k = (x_B^k, x_N^k) such that x_B^k > 0. The solution x^k is not necessarily a basic solution, i.e. x_N does not have to be identically zero. Such a solution can be obtained, e.g., by the usual first-phase procedure of linear optimization.

Using the basis B, the constraint B x_B + N x_N = b gives x_B = B^{-1} b − B^{-1} N x_N, hence the basic variables x_B can be eliminated from problem (1):

    minimize   F(x_N)
    subject to 0 ≤ x_N,        (2)

where F(x_N) = f(B^{-1} b − B^{-1} N x_N, x_N). Using the notation ∇f(x)^T = [∇_B f(x)^T, ∇_N f(x)^T], the gradient of F, the so-called reduced gradient, can be expressed as

    r = ∇F(x)^T = −∇_B f(x)^T B^{-1} N + ∇_N f(x)^T.
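A minimal sketch of this computation on the hypothetical instance above; the basic/nonbasic index split IB, IN is an assumed choice, and r is formed as r^T = ∇_N f(x) − ∇_B f(x)^T B^{-1} N.

    % Reduced gradient r at x0 for a basis choice IB (hypothetical split).
    IB = [1 2]; IN = [3 4];        % basic / nonbasic index sets
    B = A(:, IB); N = A(:, IN);    % B must be nonsingular
    g = gf(x0);                    % full gradient at x0
    r = g(IN) - N' * (B' \ g(IB)); % column vector form of the reduced gradient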
Algorithm of reduced gradient: Algorithm of RG
We are now ready to present the details of our RG algorithm.

Step 0 (Initialization). Choose a feasible point x^0 ∈ R^n and index sets I_B^0, I_N^0 such that B^0 is nonsingular. Set the iteration counter k = 0.

Step 1 (Independent variables choice). If k ≠ 0, choose the sets I_B^k, I_N^k.

Step 2 (Search direction computation). Let

    (d_N)_j = −r_j   if r_j < 0 or (x_N)_j > 0,
    (d_N)_j = 0      otherwise.

If d_N is zero, stop; the current point is a solution. Otherwise, find d_B = −B^{-1} N d_N.
Step 3 (Optimal line search computation: bisection algorithm). Find η_max and η_k achieving, respectively,

    η_max = min { x_j^k / (−d_j^k) : d_j^k < 0, 1 ≤ j ≤ n },
    η_k = argmin { f(x^k + η d^k) : 0 ≤ η ≤ η_max }.        (3)

Step 4 (Next point computation). Put x^{k+1} = x^k + η_k d^k.

Step 5. If η_k < η_max, return to Step 2. Otherwise, declare the vanishing variable in the dependent set independent and declare a strictly positive variable in the independent set dependent. Update B and N.

Step 6 (Basic variables choice). Choose I_B^{k+1}, I_N^{k+1}. Let k = k + 1 and go to Step 1.
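A sketch of Steps 2-5 on the instance above, continuing from the previous snippet. A simple Armijo backtracking stands in here for the bisection line search of Step 3; this is an illustrative approximation, not the paper's exact procedure.

    % Step 2: direction from the reduced gradient.
    xN = x0(IN); dN = zeros(size(xN));
    move = (r < 0) | (xN > 0);       % components allowed to move
    dN(move) = -r(move);
    if all(dN == 0)
      disp('current point is a solution');   % stopping test of Step 2
    else
      dB = -(B \ (N*dN));
      d = zeros(n,1); d(IB) = dB; d(IN) = dN;  % full direction, A*d = 0
      % Step 3: ratio test for eta_max, then Armijo backtracking on f.
      neg = d < 0;
      etamax = min([Inf; x0(neg) ./ (-d(neg))]);
      eta = min(1, etamax);
      while f(x0 + eta*d) > f(x0) + 1e-4*eta*(gf(x0)'*d)
        eta = eta/2;                 % backtrack until sufficient decrease
      end
      x1 = x0 + eta*d;               % Step 4: next point
    end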
Algorithm of reduced gradient: Speed reduced gradient algorithm
Reduced gradient and Nesterov step

1. Select a point y_0 ∈ C. Put k = 0, b_0 = 1, x_{−1} = y_0.
2. k-th iteration:
   a) Compute the reduced gradient direction d_k.
   b) Put

       x_k = y_k + η_k d_k,
       b_{k+1} = 0.5 (1 + sqrt(4 b_k^2 + 1)),        (4)
       y_{k+1} = x_k + ((b_k − 1)/b_{k+1}) (x_k − x_{k−1}),

   see for instance [8].

The recalculation of the point y_k in (4) is done using a "ravine" step, and η_k is the optimal step from (3).
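A sketch of the SRG loop built around recursion (4); rg_direction and line_search are hypothetical helpers implementing Steps 2 and 3 of the RG algorithm above, and no safeguard is shown for the extrapolated point y leaving the feasible set.

    % Nesterov ("ravine") acceleration of the reduced gradient iteration.
    maxit = 100;
    y = x0; xprev = x0; bk = 1;      % y_0 = x_{-1} = x0, b_0 = 1
    for k = 0:maxit-1
      d = rg_direction(y);           % reduced-gradient direction at y
      eta = line_search(y, d);       % optimal step from (3)
      x = y + eta*d;
      bnext = 0.5*(1 + sqrt(4*bk^2 + 1));
      y = x + ((bk - 1)/bnext)*(x - xprev);  % ravine step of (4)
      xprev = x; bk = bnext;
    end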
Convergence analysis
The reduced gradient method stops when the direction vector with respect to the nonbasic variables satisfies d_N^k = 0. The justification of this stopping criterion is presented in [1].

Theorem. If the sequence {x_k}_{k≥0} is constructed by the SRG algorithm, then there exists a constant C such that, for all k ≥ 0,

    f(x_k) − f(x^*) ≤ C / (k + 2)^2.        (5)
Numerical experiments
The code for the proposed SRG algorithm, as well as for RG and FW, is written in the Octave/Matlab programming language. We test the SRG method and compare it with the RG and FW algorithms. The optimal line search for RG, FW, and SRG is found using the bisection algorithm with tolerance 10^{-4}. We stop the iteration when the KKT condition ‖d_N‖ ≤ 10^{-5} is satisfied or the maximal number of iterations is reached. The algorithms are run on an HP workstation with an Intel(R) Celeron(R) M processor at 1.30 GHz and 224 MB RAM. The row "cpu" gives the mean CPU time in seconds for one run.
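A sketch of the bisection line search used for (3), assuming a finite etamax and the tolerance 10^{-4} mentioned above; it bisects on the sign of the directional derivative φ'(η) = ∇f(x + η d)^T d.

    % Bisection on dphi(eta) = gf(x0 + eta*d)'*d over [0, etamax].
    dphi = @(eta) gf(x0 + eta*d)' * d;
    lo = 0; hi = etamax;
    if dphi(hi) <= 0
      eta = hi;                      % f still decreasing at etamax
    else
      while hi - lo > 1e-4
        mid = 0.5*(lo + hi);
        if dphi(mid) < 0, lo = mid; else, hi = mid; end
      end
      eta = 0.5*(lo + hi);
    end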
Numerical experiments: Test Problems
These algorithms have been tested on problems from Hock and Schittkowski (HS#) [3], Floudas (F#) [2], and Kiwiel (K6) [6] in which linear constraints are present. We test the performance of the FW, RG, and SRG methods on the following test problems with given initial feasible points x^0. The results are listed in Table 1, where n stands for the dimension of the tested problem and nc for the number of constraints. We report the following results: the CPU time, the optimal value f^*, and the number of iterations Iter.
    Problem   n   nc |  FW: CPU      f*    Iter |  RG: CPU      f*    Iter | SRG: CPU      f*    Iter
    HS44      4    6 |     0.16     -15       4 |     0.06     -15       5 |     0.05     -15       5
    HS48      5    2 |     1.25  6.2e-9      49 |     0.09  1.8e-9      47 |     0.06  1.9e-9      41
    HS49      5    2 |     4.22  1.9e-5     219 |     0.48  1.9e-5     198 |     0.10  7.9e-6      32
    HS50      5    3 |     0.11  5.6303      10 |     0.02  2.9e-4      10 |     0.02  2.9e-4      10
    HS51      5    3 |     0.19  8.6e-8      14 |     0.03  1.6e-8      11 |     0.03  1.6e-8      11
    HS53      5    8 |     0.08   4.093       6 |     0.05   4.093       6 |     0.02   4.093       6
    HS55      6    8 |     0.05   6.333       3 |     0.02   6.333       2 |     0.02   6.333       2
    HS76      4    3 |     0.94   -4.67      50 |     0.06   -4.68       9 |     0.02   -4.68       9
    HS86      5   10 |     4.47  -32.32     250 |     0.03  -32.35       8 |     0.02  -32.35       8
    HS110    10   20 |     0.04  -45.78       2 |     0.04  -45.78       3 |     0.03  -45.78       3
    HS118    15   47 |     0.63  664.82       8 |     0.81  664.82      39 |     0.40  664.82      19
    HS119    16    8 |    41.50  244.91    2000 |     2.50  244.89     398 |     0.41  244.89     221
    F7       20   10 |     2.31   -5099     100 |     0.44   -7252      96 |     0.39   -7252      96
    F8       24   10 |     3.20   18423     150 |     0.84   32585     150 |     0.43   15990     137
    K6       30   60 |     5.20 2.4e-13     100 |     6.42    6.64     100 |     3.17 2.4e-14      45

Table 1: Comparing results of test problems between the FW, RG, and SRG algorithms.
From Table 1 we can see that our SRG algorithm finds the solutions within a small number of iterations and little CPU time, and the computational results illustrate that SRG performs well on these problems. Compared with the numerical results of the Frank-Wolfe and reduced gradient algorithms, Table 1 shows that the number of iterations is larger for some problems; this is not very important, since the computational complexity of a single iteration and the CPU time are the main concerns of this work for applications.
Numerical experiments: Applications
App1: Ordinary differential equation (ODE)

Given a positive integer k, the problem is defined as follows:

    minimize   (1/2) Σ_{i=1}^{k−2} (x_{k+i+1} − x_{k+i})^2
    subject to x_{k+i} − x_{i+1} + x_i = 0,   i = 1, ..., k−1,
               α_i ≤ x_i ≤ α_{i+1},   i = 1, ..., k,
               0.4 (α_{i+2} − α_i) ≤ x_{k+i} ≤ 0.6 (α_{i+2} − α_i),   i = 1, ..., k−1,

where the constants α_i are defined by α_i = 1.0 + (1.01)^{i−1}. These problems arise in the optimal placement of nodes in a scheme for solving ordinary differential equations with given boundary values [7]. We solve this problem for k = 500.
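A sketch of how App1 can be assembled as data for problem (1), assuming x(1:k) are the nodes and x(k+1:2k−1) the gaps; converting the interval bounds on x_i to nonnegativity constraints (by shifting or slack variables) is omitted here.

    % Assemble the equality constraints and objective of App1 for k = 500.
    k = 500; n = 2*k - 1;
    alpha = 1.0 + 1.01.^(0:k);               % alpha_i = 1.0 + 1.01^(i-1)
    Aeq = zeros(k-1, n); beq = zeros(k-1, 1);
    for i = 1:k-1
      Aeq(i, [i, i+1, k+i]) = [1, -1, 1];    % x_{k+i} - x_{i+1} + x_i = 0
    end
    fobj = @(x) 0.5 * sum(diff(x(k+1:end)).^2);  % objective of App1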
App2: Optimal Control

Consider the optimal control of a servomotor [1]:

    minimize   J(x_0, u) = ∫_0^∞ [x^T(t) Q x(t) + r u^2(t)] dt
    subject to dx/dt = [0 1; 0 −1] x + [0; 1] u,        (6)

where Q = [1 0; 0 0] and r > 0.
The state and control variables are parametrized by the proposed approximation method:

    minimize   (h/2) (Σ_{i=1}^m y_i^2 + r Σ_{i=1}^m u_i^2)
    subject to y_1 = y_0 + (h/2) z_1,
               y_i = y_{i−1} + (h/2)(z_i + z_{i−1}),   i = 2, ..., m,        (7)
               z_1 = z_0 + (h/2)(−z_1 + u_1),
               z_i = z_{i−1} + (h/2)(−z_i + u_i − z_{i−1} + u_{i−1}),   i = 2, ..., m,

where m = 250 and x = (y, z); the time step is h = 2. We choose r = 0.5, u^0 = 1000·ones(1, m), and x^0 = [3.25; 4.95].
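A sketch evaluating the discretized objective of (7) for a given control u by forward recursion; the implicit z_i terms are obtained by rearranging the trapezoidal updates, with the data (m, h, r, initial state) taken from the text.

    % Simulate the trapezoidal recursions of (7) and evaluate the cost.
    m = 250; h = 2; r = 0.5;
    y0 = 3.25; z0 = 4.95;            % initial state x0 = [3.25; 4.95]
    u = 1000*ones(1, m);             % initial control iterate u0
    y = zeros(1, m); z = zeros(1, m);
    z(1) = (z0 + (h/2)*u(1)) / (1 + h/2);   % from z1 = z0 + (h/2)(-z1 + u1)
    y(1) = y0 + (h/2)*z(1);
    for i = 2:m
      z(i) = (z(i-1) + (h/2)*(u(i) + u(i-1) - z(i-1))) / (1 + h/2);
      y(i) = y(i-1) + (h/2)*(z(i) + z(i-1));
    end
    J = (h/2) * (sum(y.^2) + r*sum(u.^2));  % objective value of (7)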
App3: Image and video co-localization

This application comes from image and video co-localization. The approach of [4] is formulated as a quadratic program (QP) over a flow polytope, the convex hull of paths in a network. In this application, the linear minimization oracle is equivalent to finding a shortest path in the network, which can be done easily by dynamic programming [5]. For the comparison with the Frank-Wolfe algorithm, we reuse the code provided by [4] and their included aeroplane dataset, resulting in a QP over 660 variables.
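For contrast with our method, a sketch of one Frank-Wolfe step over such a polytope; shortest_path_lmo is a hypothetical oracle returning the vertex (the indicator vector of a shortest path) minimizing the linear function g'*s, as described in [4, 5].

    % One Frank-Wolfe iteration over the flow polytope at iterate x, step k.
    g = gf(x);                        % gradient of the QP objective at x
    s = shortest_path_lmo(g);         % LMO: vertex minimizing g'*s
    gamma = 2/(k + 2);                % standard FW step size
    x = x + gamma*(s - x);            % move toward the vertex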
App4: Support vector machines (SVM)

The Lagrange dual of the standard convex optimization setup for structural SVMs [5] has n variables. Writing α_i for the dual variable associated with training example i, the dual problem is given by

    minimize   (λ/2) ‖Aα‖^2 − b^T α
    subject to Σ_i α_i = 1,        (8)

where the matrix A ∈