Speed reduced gradient algorithm for nonlinear programming with applications

Abdelkrim El Mouatasim
Ibn Zohr University, Faculty of Polydisciplinary Ouarzazate (FPO), B.P. 284, Ouarzazate 45800, Morocco

ISCSA'17: Errachidia, 26-28 October 2017
Plan of Presentation

1. Introduction
2. Reduced Gradient Method
   Reduced problem
3. Algorithm of reduced gradient
   Algorithm of RG
   Speed reduced gradient algorithm
4. Convergence analysis
5. Numerical experiments
   Test Problems
   Applications
Introduction
In this work, we add a Nesterov step to the reduced gradient algorithm for minimizing a convex differentiable function subject to linear equality constraints and nonnegativity bounds on the variables. At each iteration, we compute a search direction from the reduced gradient and perform the line search by a bisection algorithm or the Armijo rule. Under suitable assumptions, the convergence rate of the speed reduced gradient (SRG) algorithm is shown to be significantly better, both theoretically and practically. The SRG algorithm is implemented in Octave/Matlab and compared with the Frank-Wolfe algorithm on several problems; the numerical results show the efficiency of our approach. We also give applications to ODEs, optimal control, image and video co-localization, and machine learning.
Reduced Gradient Method: Reduced problem
From now on, we consider a nonlinear programming problem with linear equality constraints of the form

    minimize   f(x)
    subject to Ax = b,        (1)
               0 ≤ x,

where f : R^n → R is a convex, twice continuously differentiable function, A is an m × n matrix with m ≤ n, and b is a vector in R^m.
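As a running illustration, the following Octave/Matlab sketch sets up a small instance of problem (1); the quadratic objective and the data Q, c, A, b, x0 are hypothetical choices for illustration only, not from the paper.

    % Small hypothetical instance of problem (1): f a convex quadratic.
    n = 4; m = 2;
    Q = diag([2 1 3 1]);           % positive definite, so f is convex
    c = [-1; 0; -2; 1];
    f  = @(x) 0.5*x'*Q*x + c'*x;   % objective f(x)
    gf = @(x) Q*x + c;             % gradient of f
    A = [1 1 1 0;                  % m x n equality constraints, m <= n
         0 1 0 1];
    b = [1; 1];
    x0 = [0.5; 0.5; 0; 0.5];       % feasible: A*x0 = b and x0 >= 0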
The reduced gradient method begins with a basis B and a feasible solution x^k = (x_B^k, x_N^k) such that x_B^k > 0. The solution x^k is not necessarily a basic solution, i.e. x_N does not have to be identically zero. Such a solution can be obtained, e.g., by the usual first-phase procedure of linear optimization.

Using the basis B, the constraint B x_B + N x_N = b gives x_B = B^{-1} b − B^{-1} N x_N, hence the basic variables x_B can be eliminated from problem (1):

    minimize   F(x_N)
    subject to 0 ≤ x_N,        (2)

where F(x_N) = f(B^{-1} b − B^{-1} N x_N, x_N). Using the notation ∇f(x)^T = [∇_B f(x)^T, ∇_N f(x)^T], the gradient of F, the so-called reduced gradient, can be expressed as

    r = ∇F(x)^T = −∇_B f(x)^T B^{-1} N + ∇_N f(x)^T.
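A minimal sketch of this computation on the hypothetical instance above; the basic/nonbasic index split IB, IN is an assumed choice, and r is formed as r^T = ∇_N f(x) − ∇_B f(x)^T B^{-1} N.

    % Reduced gradient r at x0 for a basis choice IB (hypothetical split).
    IB = [1 2]; IN = [3 4];        % basic / nonbasic index sets
    B = A(:, IB); N = A(:, IN);    % B must be nonsingular
    g = gf(x0);                    % full gradient at x0
    r = g(IN) - N' * (B' \ g(IB)); % column vector form of the reduced gradient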
Algorithm of reduced gradient: Algorithm of RG
We are now ready to present the details of our RG algorithm.

Step 0 (Initialization). Choose a feasible point x^0 ∈ R^n and index sets I_B^0, I_N^0 such that B^0 is nonsingular. Set the iteration counter k = 0.

Step 1 (Independent variables choice). If k ≠ 0, choose the sets I_B^k, I_N^k.

Step 2 (Search direction computation). Let

    (d_N)_j = −r_j   if r_j < 0 or (x_N)_j > 0,
    (d_N)_j = 0      otherwise.

If d_N is zero, stop; the current point is a solution. Otherwise, find d_B = −B^{-1} N d_N.
Step 3 (Optimal line search computation: bisection algorithm). Find η_max and η_k achieving, respectively,

    η_max = min { x_j^k / (−d_j^k) : d_j^k < 0, 1 ≤ j ≤ n },
    η_k = argmin { f(x^k + η d^k) : 0 ≤ η ≤ η_max }.        (3)

Step 4 (Next point computation). Put x^{k+1} = x^k + η_k d^k.

Step 5. If η_k < η_max, return to Step 2. Otherwise, declare the vanishing variable in the dependent set independent and declare a strictly positive variable in the independent set dependent. Update B and N.

Step 6 (Basic variables choice). Choose I_B^{k+1}, I_N^{k+1}. Let k = k + 1 and go to Step 1.
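A sketch of Steps 2-5 on the instance above, continuing from the previous snippet. A simple Armijo backtracking stands in here for the bisection line search of Step 3; this is an illustrative approximation, not the paper's exact procedure.

    % Step 2: direction from the reduced gradient.
    xN = x0(IN); dN = zeros(size(xN));
    move = (r < 0) | (xN > 0);       % components allowed to move
    dN(move) = -r(move);
    if all(dN == 0)
      disp('current point is a solution');   % stopping test of Step 2
    else
      dB = -(B \ (N*dN));
      d = zeros(n,1); d(IB) = dB; d(IN) = dN;  % full direction, A*d = 0
      % Step 3: ratio test for eta_max, then Armijo backtracking on f.
      neg = d < 0;
      etamax = min([Inf; x0(neg) ./ (-d(neg))]);
      eta = min(1, etamax);
      while f(x0 + eta*d) > f(x0) + 1e-4*eta*(gf(x0)'*d)
        eta = eta/2;                 % backtrack until sufficient decrease
      end
      x1 = x0 + eta*d;               % Step 4: next point
    end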
Algorithm of reduced gradient: Speed reduced gradient algorithm
Reduced gradient and Nesterov step

1. Select a point y_0 ∈ C. Put k = 0, b_0 = 1, x_{−1} = y_0.
2. k-th iteration:
   a) Compute the reduced gradient direction d_k.
   b) Put

       x_k = y_k + η_k d_k,
       b_{k+1} = 0.5 (1 + sqrt(4 b_k^2 + 1)),        (4)
       y_{k+1} = x_k + ((b_k − 1)/b_{k+1}) (x_k − x_{k−1}),

   see for instance [8].

The recalculation of the point y_k in (4) is done using a "ravine" step, and η_k is the optimal step from (3).
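A sketch of the SRG loop built around recursion (4); rg_direction and line_search are hypothetical helpers implementing Steps 2 and 3 of the RG algorithm above, and no safeguard is shown for the extrapolated point y leaving the feasible set.

    % Nesterov ("ravine") acceleration of the reduced gradient iteration.
    maxit = 100;
    y = x0; xprev = x0; bk = 1;      % y_0 = x_{-1} = x0, b_0 = 1
    for k = 0:maxit-1
      d = rg_direction(y);           % reduced-gradient direction at y
      eta = line_search(y, d);       % optimal step from (3)
      x = y + eta*d;
      bnext = 0.5*(1 + sqrt(4*bk^2 + 1));
      y = x + ((bk - 1)/bnext)*(x - xprev);  % ravine step of (4)
      xprev = x; bk = bnext;
    end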
Convergence analysis
The reduced gradient method stops when the direction vector with respect to the nonbasic variables satisfies d_N^k = 0. The justification of this stopping criterion is presented in [1].

Theorem. If the sequence {x_k}_{k≥0} is constructed by the SRG algorithm, then there exists a constant C such that, for all k ≥ 0,

    f(x_k) − f(x^*) ≤ C / (k + 2)^2.        (5)
Numerical experiments
The code for the proposed SRG algorithm, as well as for RG and FW, is written in the Octave/Matlab programming language. We test the SRG method and compare it with the RG and FW algorithms. The optimal line search for RG, FW, and SRG is found using the bisection algorithm with tolerance 10^{-4}. We stop the iteration when the KKT condition ‖d_N‖ ≤ 10^{-5} is satisfied or the maximal number of iterations is reached. The algorithms are run on an HP workstation with an Intel(R) Celeron(R) M processor at 1.30 GHz and 224 MB RAM. The row "cpu" gives the mean CPU time in seconds for one run.
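A sketch of the bisection line search used for (3), assuming a finite etamax and the tolerance 10^{-4} mentioned above; it bisects on the sign of the directional derivative φ'(η) = ∇f(x + η d)^T d.

    % Bisection on dphi(eta) = gf(x0 + eta*d)'*d over [0, etamax].
    dphi = @(eta) gf(x0 + eta*d)' * d;
    lo = 0; hi = etamax;
    if dphi(hi) <= 0
      eta = hi;                      % f still decreasing at etamax
    else
      while hi - lo > 1e-4
        mid = 0.5*(lo + hi);
        if dphi(mid) < 0, lo = mid; else, hi = mid; end
      end
      eta = 0.5*(lo + hi);
    end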
Numerical experiments: Test Problems
These algorithms have been tested on problems from Hock and Schittkowski (HS#) [3], Floudas (F#) [2], and Kiwiel (K6) [6] in which linear constraints are present. We test the performance of the FW, RG, and SRG methods on the following test problems with given initial feasible points x^0. The results are listed in Table 1, where n stands for the dimension of the tested problem and nc for the number of constraints. We report the following results: the CPU time, the optimal value f^*, and the number of iterations Iter.
    Problem   n   nc |  FW: CPU      f*    Iter |  RG: CPU      f*    Iter | SRG: CPU      f*    Iter
    HS44      4    6 |     0.16     -15       4 |     0.06     -15       5 |     0.05     -15       5
    HS48      5    2 |     1.25  6.2e-9      49 |     0.09  1.8e-9      47 |     0.06  1.9e-9      41
    HS49      5    2 |     4.22  1.9e-5     219 |     0.48  1.9e-5     198 |     0.10  7.9e-6      32
    HS50      5    3 |     0.11  5.6303      10 |     0.02  2.9e-4      10 |     0.02  2.9e-4      10
    HS51      5    3 |     0.19  8.6e-8      14 |     0.03  1.6e-8      11 |     0.03  1.6e-8      11
    HS53      5    8 |     0.08   4.093       6 |     0.05   4.093       6 |     0.02   4.093       6
    HS55      6    8 |     0.05   6.333       3 |     0.02   6.333       2 |     0.02   6.333       2
    HS76      4    3 |     0.94   -4.67      50 |     0.06   -4.68       9 |     0.02   -4.68       9
    HS86      5   10 |     4.47  -32.32     250 |     0.03  -32.35       8 |     0.02  -32.35       8
    HS110    10   20 |     0.04  -45.78       2 |     0.04  -45.78       3 |     0.03  -45.78       3
    HS118    15   47 |     0.63  664.82       8 |     0.81  664.82      39 |     0.40  664.82      19
    HS119    16    8 |    41.50  244.91    2000 |     2.50  244.89     398 |     0.41  244.89     221
    F7       20   10 |     2.31   -5099     100 |     0.44   -7252      96 |     0.39   -7252      96
    F8       24   10 |     3.20   18423     150 |     0.84   32585     150 |     0.43   15990     137
    K6       30   60 |     5.20 2.4e-13     100 |     6.42    6.64     100 |     3.17 2.4e-14      45

Table 1: Comparing results of test problems between the FW, RG, and SRG algorithms.
From Table 1 we can see that our SRG algorithm finds the solutions within a small number of iterations and little CPU time, and the computational results illustrate that SRG performs well on these problems. Compared with the numerical results of the Frank-Wolfe and reduced gradient algorithms, Table 1 shows that the number of iterations is larger for some problems; this is not very important, since the computational complexity of a single iteration and the CPU time are the main concerns of this work for applications.
Numerical experiments: Applications
App1: Ordinary differential equation (ODE)

Given a positive integer k, the problem is defined as follows:

    minimize   (1/2) Σ_{i=1}^{k−2} (x_{k+i+1} − x_{k+i})^2
    subject to x_{k+i} − x_{i+1} + x_i = 0,   i = 1, ..., k−1,
               α_i ≤ x_i ≤ α_{i+1},   i = 1, ..., k,
               0.4 (α_{i+2} − α_i) ≤ x_{k+i} ≤ 0.6 (α_{i+2} − α_i),   i = 1, ..., k−1,

where the constants α_i are defined by α_i = 1.0 + (1.01)^{i−1}. These problems arise in the optimal placement of nodes in a scheme for solving ordinary differential equations with given boundary values [7]. We solve this problem for k = 500.
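A sketch of how App1 can be assembled as data for problem (1), assuming x(1:k) are the nodes and x(k+1:2k−1) the gaps; converting the interval bounds on x_i to nonnegativity constraints (by shifting or slack variables) is omitted here.

    % Assemble the equality constraints and objective of App1 for k = 500.
    k = 500; n = 2*k - 1;
    alpha = 1.0 + 1.01.^(0:k);               % alpha_i = 1.0 + 1.01^(i-1)
    Aeq = zeros(k-1, n); beq = zeros(k-1, 1);
    for i = 1:k-1
      Aeq(i, [i, i+1, k+i]) = [1, -1, 1];    % x_{k+i} - x_{i+1} + x_i = 0
    end
    fobj = @(x) 0.5 * sum(diff(x(k+1:end)).^2);  % objective of App1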
App2: Optimal Control

Consider the optimal control of a servomotor [1]:

    minimize   J(x_0, u) = ∫_0^∞ [x^T(t) Q x(t) + r u^2(t)] dt
    subject to dx/dt = [0 1; 0 −1] x + [0; 1] u,        (6)

where Q = [1 0; 0 0] and r > 0.
The state and control variables are parametrized by the proposed approximation method:

    minimize   (h/2) (Σ_{i=1}^m y_i^2 + r Σ_{i=1}^m u_i^2)
    subject to y_1 = y_0 + (h/2) z_1,
               y_i = y_{i−1} + (h/2)(z_i + z_{i−1}),   i = 2, ..., m,        (7)
               z_1 = z_0 + (h/2)(−z_1 + u_1),
               z_i = z_{i−1} + (h/2)(−z_i + u_i − z_{i−1} + u_{i−1}),   i = 2, ..., m,

where m = 250 and x = (y, z); the time step is h = 2. We choose r = 0.5, u^0 = 1000·ones(1, m), and x^0 = [3.25; 4.95].
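A sketch evaluating the discretized objective of (7) for a given control u by forward recursion; the implicit z_i terms are obtained by rearranging the trapezoidal updates, with the data (m, h, r, initial state) taken from the text.

    % Simulate the trapezoidal recursions of (7) and evaluate the cost.
    m = 250; h = 2; r = 0.5;
    y0 = 3.25; z0 = 4.95;            % initial state x0 = [3.25; 4.95]
    u = 1000*ones(1, m);             % initial control iterate u0
    y = zeros(1, m); z = zeros(1, m);
    z(1) = (z0 + (h/2)*u(1)) / (1 + h/2);   % from z1 = z0 + (h/2)(-z1 + u1)
    y(1) = y0 + (h/2)*z(1);
    for i = 2:m
      z(i) = (z(i-1) + (h/2)*(u(i) + u(i-1) - z(i-1))) / (1 + h/2);
      y(i) = y(i-1) + (h/2)*(z(i) + z(i-1));
    end
    J = (h/2) * (sum(y.^2) + r*sum(u.^2));  % objective value of (7)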
App3: Image and video co-localization

This application comes from image and video co-localization. The approach of [4] is formulated as a quadratic program (QP) over a flow polytope, the convex hull of paths in a network. In this application, the linear minimization oracle is equivalent to finding a shortest path in the network, which can be done easily by dynamic programming [5]. For the comparison with the Frank-Wolfe algorithm, we reuse the code provided by [4] and their included aeroplane dataset, resulting in a QP over 660 variables.
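For contrast with our method, a sketch of one Frank-Wolfe step over such a polytope; shortest_path_lmo is a hypothetical oracle returning the vertex (the indicator vector of a shortest path) minimizing the linear function g'*s, as described in [4, 5].

    % One Frank-Wolfe iteration over the flow polytope at iterate x, step k.
    g = gf(x);                        % gradient of the QP objective at x
    s = shortest_path_lmo(g);         % LMO: vertex minimizing g'*s
    gamma = 2/(k + 2);                % standard FW step size
    x = x + gamma*(s - x);            % move toward the vertex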
App4: Support vector machines (SVM)

The Lagrange dual of the standard convex optimization setup for structural SVMs [5] has n variables. Writing α_i for the dual variable associated with training example i, the dual problem is given by

    minimize   (λ/2) ‖Aα‖^2 − b^T α
    subject to Σ_i α_i = 1,        (8)

where the matrix A ∈