ON SEVERAL ALGORITHMS FOR VARIATIONAL

0 downloads 0 Views 3MB Size Report
Feb 5, 2015 - Gostaria de agradecer em primeiro lugar a meu amigo e orientador Prof. ... Millán Ordaz, mamá gracias por tanta garra, amor y dedicación,.
ON SEVERAL ALGORITHMS FOR VARIATIONAL INEQUALITY AND INCLUSION PROBLEMS

Doctoral Thesis by Reinier D´ıaz Mill´ an Supervised by Prof. Dr. Jos´ e Yunier Bello Cruz

Funded by CAPES

´ tica e Estat´ıstica IME - Instituto de Matema ´s Universidade Federal de Goia ˆ nia, Goia ´ s, Brazil Goia February 2015

´n Reinier D´ıaz Milla

ON SEVERAL ALGORITHMS FOR VARIATIONAL INEQUALITY AND INCLUSION PROBLEMS

Tese apresentada ao Programa de P´os-Gradua¸ca˜o do Instituto de Matem´atica e Estat´ıstica da Universidade Federal de Goi´as, como requisito parcial para obten¸c˜ao do t´ıtulo de Doutor em Matem´atica. ´ Area de concentra¸c˜ ao: Otimiza¸ca˜o Orientador: Prof.Dr. Jos´e Yunier Bello Cruz

Goiˆania 2015

iii

Ficha catalográfica elaborada automaticamente com os dados fornecidos pelo(a) autor(a), sob orientação do Sibi/UFG.

Díaz Millán, Reinier On several algorithms for variational inequality and inclusion problems [manuscrito] / Reinier Díaz Millán. - 2015. 6, 89 f.: il.

Orientador: Prof. Dr. José Yunier Bello Cruz. Tese (Doutorado) - Universidade Federal de Goiás, Instituto de Matemática e Estatística (IME) , Programa de Pós-Graduação em Matemática, Goiânia, 2015. Bibliografia. Inclui símbolos, gráfico, algoritmos. 1. Extragradient´s method. 2. Forward-Backward method. 3. Linesearch. 4. Maximal monotone operators. 5. Projection methods. I. Bello Cruz, José Yunier , orient. II. Título.

v

Dedicado a: Mis padres, Miriam y Fernando Mis hermanos, Henry y Yanexy Mi esposa Mariana e Hija Lia.

vi

Agradecimentos Gostaria de agradecer em primeiro lugar a meu amigo e orientador Prof. Dr. Jos´e Yunier Bello Cruz, sem a persistˆencia, constˆacia e ajuda dele n˜ao teria sido poss´ıvel a realiza¸ca˜o desta tese. Estendo meus agradecimentos `a CAPES pela grande ajuda oferecida com a bolsa de estudos de doutorado, a` Universidade Federal de Goi´as (UFG), pela aceita¸ca˜o no seu programa de doutorado como aluno regular e os conhecimentos adquiridos. A todos os professores da UFG pelo apoio e dedica¸c˜ao nesses quatro anos, em especial aos que integram o grupo de Otimiza¸ca˜o, com ˆenfase aos profesores (em ordem alfab´etica) Dr. Glaydston de Carvalho Bento, Dr. Jefferson D. G. de Melo, Dr. Leandro da Fonseca Prudente, Dr. Luis Roman Lucambio P´erez, Dr. Ole Peter Smith e Dr. Orizon Pereira Ferreira. Agrade¸co tambem a minha amada esposa Mariana Neves Ferreira quem deu-me todo apoio, compreens˜ao e amor que precisei para fazer esta tese. A minha preciosa filha Lia, que tanta alegria e for¸ca me proporciona a cada dia. Incalculable gratitud a mi familia, que siempre han luchado para que yo llegue a donde estoy. A mis viejitos, Miriam Mill´an Ordaz, mam´a gracias por tanta garra, amor y dedicaci´ on, Fernando D´ıaz Monte de Oca, pap´a sin tus consejos y sabidur´ıa no huviese sido posible nada. A mis hermanos que suplieron mi falta en Cuba, Yanexy D´ıaz Mill´an y Henry D´ıaz Mill´an. Gracias desde bien adentro. I would like to thanks to Professor Heinz H. Bauschke from University of British Columbia and like to express great gratitude to Professor Hung M. Phan from University of Massachusetts Lowell, by their important contributions in the developed of of this thesis. Agrade¸co a todos os funcion´arios da UFG, em particular aos servidores da secretaria de P´os-gradua¸ca˜o pelo apoio log´ıstico ao largo destes quatro anos.

vii

Abstract In this thesis we present various algorithms to solve the Variational Inequality and Inclusion Problems. For the variational inequality problem we propose, in Chapter 2, a generalization of the classical extragradient algorithm by utilizing non-null normal vectors of the feasible set. In particular, two conceptual algorithms are proposed and each of them has three different projection variants which are related to modified extragradient algorithms. Two different linesearches, one on the boundary of the feasible set and the other one along the feasible direction, are proposed. Each conceptual algorithm has a different linesearch strategy and three special projection steps, generating sequences with different and interesting features. Convergence analysis of both conceptual algorithms are established, assuming existence of solutions, continuity and a weaker condition than pseudomonotonicity on the operator. In Chapter 4 we introduce a direct splitting method for solving the variational inequality problem for the sum of two maximal monotone operators in Hilbert space. In Chapter 5, for the same problem, a relaxed-projection splitting algorithm in Hilbert spaces for the sum of m nonsmooth maximal monotone operators is proposed, where the feasible set of the variational inequality problem is defined by a nonlinear and nonsmooth continuous convex function inequality. In this case, the orthogonal projections onto the feasible set are replaced by projections onto separating hyperplanes. Furthermore, each iteration of the proposed method consists of simple subgradient-like steps, which does not demand the solution of a nontrivial subproblem, using only individual operators, which explores the structure of the problem. For the Inclusion Problem, in Chapter 3, we propose variants of forward-backward splitting method for finding a zero of the sum of two operators, which is a modification of the classical forward-backward method proposed by Tseng. The conceptual algorithm proposed here improves Tseng’s method in many instances. Our approach contains firstly an explicit Armijo-type line search in the spirit of the extragradient-like methods for variational inequalities. During the iterative process, the line search performs only one calculation of the forward-backward operator in each tentative for finding the step size. This achieves a considerable computational saving when the forward-backward operator is computationally expensive. The second part of the scheme consists of special projection steps bringing several variants. Keywords: Extragradient’s method, Forward-Backward method, Linesearch, Maximal monotone operators, Point-to-set operator, Projection method, Quasi-Fej´er and Fej´er convergence, Relaxed method, Splitting Method, Variational inequality problem, Inclusion problem, Weak convergence. viii

Resumo Nesta tese apresentamos v´arios algoritmos para resolver os problemas de Desigualdade Variacional e Inclus˜ao. Para o problema de desigualdade variacional propomos, no Cap´ıtulo 2 uma generaliza¸ca˜o do algoritmo cl´assico extragradiente, utilizando vetores normais n˜ao nulos do conjunto vi´avel. Em particular, dois algoritmos conceituais s˜ao propostos e cada um deles contˆem trˆes variantes diferentes de proje¸ca˜o que est˜ao relacionadas com algoritmos extragradientes modificados. Duas buscas diferentes s˜ao propostas, uma sobre a borda do conjunto vi´avel e a outra ao longo das dire¸co˜es vi´aveis. Cada algoritmo conceitual tem uma estrat´egia diferente de busca e trˆes formas de proje¸ca˜o especiais, gerando trˆes sequˆencias com diferente ´ feito a an´alise da convergˆencia de ambos os algoritmos cone interessantes propriedades. E ceituais, pressupondo a existˆencia de solu¸c˜oes, continuidade do operador e uma condi¸ca˜o mais fraca do que pseudomonotonia. No Cap´ıtulo 4, n´os introduzimos um algoritmo direto de divis˜ao para o problema variacional em espa¸cos de Hilbert. J´a no Cap´ıtulo 5, propomos um algoritmo de proje¸c˜ao relaxada em Espa¸cos de Hilbert para a soma de m operadores mon´otonos maximais ponto-conjunto, onde o conjunto vi´avel do problema de desigualdade variacional ´e dado por uma fun¸ca˜o n˜ao suave e convexa. Neste caso, as proje¸c˜oes ortogonais ao conjunto vi´avel s˜ao substitu´ıdas por proje¸co˜es em hiperplanos que separam a solu¸ca˜o da itera¸ca˜o atual. Cada itera¸c˜ao do m´etodo proposto consiste em proje¸co˜es simples de tipo subgradientes, que n˜ao exige a solu¸ca˜o de subproblemas n˜ao triviais, utilizando apenas os operadores individuais, explorando assim a estrutura do problema. Para o problema de Inclus˜ao, propomos variantes do m´etodo de divis˜ao de forward-backward para achar um zero da soma de dois operadores, a qual ´e a modifica¸ca˜o cl´assica do forwardbackward proposta por Tseng. Um algoritmo conceitual ´e proposto para melhorar o apresentado por Tseng em alguns pontos. Nossa abordagem cont´em, primeramente, uma busca linear tipo Armijo expl´ıcita no esp´ırito dos m´etodos tipo extragradientes para desigualdades variacionais. Durante o processo iterativo, a busca linear realiza apenas um c´alculo do operador forward-backward em cada tentativa de achar o tamanho do passo. Isto proporciona uma consider´avel vantagem computacional pois o operador forward-backward ´e computacionalmente caro. A segunda parte do esquema consiste em diferentes tipos de proje¸co˜es, gerando sequˆencias com caracter´ısticas diferentes. Palavras-chave : Busca Linear, Convergˆencia fraca, Convergˆencia Fej´er e Quase-Fej´er, M´etodo de proje¸ca˜o, M´etodo Extragradiente, M´etodo Forward-Backward, M´etodo Relaxado, M´etodo de separa¸ca˜o, Operador mon´otono maximal, Operador ponto-conjunto, Problema de desigualdade variacional, Problema de inclus˜ao. ix

Basic notation and terminology H: a nontrivial real Hilbert space, SI : the solution set of the Inclusion Problem, SV I : the solution set of the Variational Inequality Problem, h·, ·i: the inner product of the space, k·k: the induced norm, Rn : the n-dimensional Euclidean space, R+ : the set of nonnegative real numbers, R++ : the set of positive real numbers, N: the set of natural numbers, dom(g): the domain of a function g, dom(T ): the domain of an operator T , ∂g: the subdifferential set of a convex function g, NC (x): the normal cone of a set C at a point x, δC : the indicator function of a set C, 2H : the power set of a set H, B(x, δ): the open ball with radius δ centered at x, B[x, δ]: the closed ball with radius δ centered at x, int(C): the interior of the set C, RλT : the resolvent operator of T ,

1

Contents 1 Introduction

4

1.1

Variational inequality problem . . . . . . . . . . . . . . . . . . . . . . . . . .

4

1.2

Inclusion problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6

1.3

Mathematical tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

7

1.3.1

Introduction to functional analysis . . . . . . . . . . . . . . . . . . .

8

1.3.2

Introduction to convex analysis . . . . . . . . . . . . . . . . . . . . .

9

1.3.3

Introduction to monotone operator theory . . . . . . . . . . . . . . .

11

2 Conditional extragradient algorithms for variational inequalities 2.1

16

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

16

2.1.1

Extragradient algorithm . . . . . . . . . . . . . . . . . . . . . . . . .

17

2.1.2

Proposed schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

19

2.2

Motivation: An example . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

19

2.3

The extragradient algorithm with normal vectors . . . . . . . . . . . . . . .

22

2.4

Two linesearches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

26

2.5

Conceptual algorithm with Linesearch B . . . . . . . . . . . . . . . . . . . .

28

2.5.1

Convergence analysis of Variant B.1 . . . . . . . . . . . . . . . . . . .

30

2.5.2

Convergence analysis of Variant B.2 . . . . . . . . . . . . . . . . . . .

33

2.5.3

Convergence analysis of Variant B.3 . . . . . . . . . . . . . . . . . . .

35

Conceptual algorithm with Linesearch F . . . . . . . . . . . . . . . . . . . .

38

2.6.1

Convergence analysis of Variant F.1 . . . . . . . . . . . . . . . . . . .

40

2.6.2

Convergence analysis of Variant F.2 . . . . . . . . . . . . . . . . . . .

43

2.6.3

Convergence analysis of Variant F.3 . . . . . . . . . . . . . . . . . . .

45

2.6

2

2.7

Final remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

48

3 A variant of forward-backward splitting method for the sum of two monotone operators 50 3.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

50

3.2

The Conceptual Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . .

51

3.2.1

The Linesearch D . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

51

3.2.2

The Conceptual Algorithm FB

. . . . . . . . . . . . . . . . . . . . .

52

Convergence Analysis of Algorithm FB . . . . . . . . . . . . . . . . . . . . .

53

3.3.1

Convergence analysis of Variant FB.1 . . . . . . . . . . . . . . . . . .

54

3.3.2

Convergence analysis of Variant FB.2 . . . . . . . . . . . . . . . . . .

56

3.3.3

Convergence analysis of Variant FB.3 . . . . . . . . . . . . . . . . . .

58

Final remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

61

3.3

3.4

4 A direct splitting method for nonsmooth variational inequalities

63

4.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

63

4.2

A splitting direct method . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

63

4.2.1

Convergence analysis of Algorithm D . . . . . . . . . . . . . . . . . .

64

Final remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

68

4.3

5 A relaxed-projection splitting algorithm for variational inequalities in Hilbert spaces 69 5.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

69

5.2

A relaxed-projection splitting method . . . . . . . . . . . . . . . . . . . . . .

70

5.2.1

Linesearch to relax the projection process . . . . . . . . . . . . . . .

70

5.2.2

The Algorithm R . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

71

5.2.3

Convergence analysis of Algorithm R . . . . . . . . . . . . . . . . . .

73

Final remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

78

5.3

3

Chapter 1 Introduction The idea is clear and old: divide et impera (divide and conquer). When we face difficult and structure problems splitting is one of the most important and systematic techniques for the development of efficient algorithms for solving them. Difficult models in environment science, meteorology, optimization, differential equations and variational inequalities have given rise to various operator splitting techniques. The principal idea has always been to simplify: to improve efficiency and computational work by solving simpler subproblems. The Variational Inequality Problem and the Inclusion Problem play an important role in the fields of optimization and differential equation both are strong tools for solving minimizations problems and differential equations. These problems has been a classical subject in economics, operations research and mathematical physics, particularly in the calculus of variations associated with the minimization of infinite-dimensional functionals (see, e.g., [55] and the references therein). They are closely related to many problems of nonlinear analysis, such as optimization, complementarity, equilibrium problems and finding fixed points (see, e.g., [42, 55, 82]). This thesis is focused on this aforementioned two problems, proposing new algorithms for solving them when the data of the problems has a particular structure. We next present several comments on both problems.

1.1

Variational inequality problem

The studies of variational inequality problem began in the mid-1960s, introduced by Hartman and Stampacchia. The theory has developed into a very fruitful discipline in the field of mathematical programming and partial differential equations with applications in mechanics. These developments include the theory of engineering, economics and mathematical science. Recently, the variational inequality problem has been the focus of many studies, 4

increasing the interest in this problem (see,e.g., [17,32,44,49,58,80,81]). In the present work, we propose algorithms for solving this problem with different kinds of operators. Now, we recall the formulation of the variational inequality problem. Let H be Hilbert space and T be a point-to-set operator, e.g., T : dom(T ) ⊂ H → 2H , and C be a nonempty subset of H, the variational inequality problem consist in: Find x∗ ∈ C such that ∃ u∗ ∈ T (x∗ ), with hu∗ , x − x∗ i ≥ 0

∀ x ∈ C.

(1.1)

The solution set of problem (1.1) will be denoted by SV I . Many methods have been proposed for solving problem (1.1), with T point-to-point (see e.g., [53, 61, 80, 81]), and for T pointto-set (see e.g., [8, 49, 52, 57]). An excellent survey of methods for variational inequality problems can be found in [41]. In this thesis, the operator T is considered, in Chapter 2, as a single value operator, in Chapter 4 as a sum of two set-value operators, and in Chapter 5 as a sum of a finite number of set-value operators, e.g., T = T1 + T2 + · · · + Tm ; we are interested in methods that explore the structure of T , e.g., in Chapter 2 we see the variational inequality thought the inclusion problem, considering the normal cone, and taking non-null elements of it. This kind of methods are called splitting, since each iteration involves only the individual operators, but not the sum. Many splitting algorithms have been proposed in order to solve this kind of problems (see, e.g., [4, 33, 38, 65, 69, 71, 83, 85]) and the references therein). However, in all of them, the resolvent operator (defined forward) of any individual operator, has to be evaluated in each iteration. It is important to mention, that this proximal-like iteration (the evaluation of the resolvent operator) is, in general, a nontrivial problem, which demands hard work from a computational point of view. Our algorithm tries to avoid this task, replacing implicit proximal-like iterations by explicit subgradient-like steps. This represents a significant advantage in both implementation and theoretical senses. Another weakness which appears in most of methods proposed in the literature for solving problem (1.1), is the necessity to compute the exact projection onto the feasible set. This limits the applicability of the methods, especially when such a projection is computationally difficult to evaluate. It is well known that only in a few specific instances the projection onto a convex set has an explicit formula. When the feasible set of problem (1.1) is a general closed convex set, C, we have to solve a nontrivial quadratical problem, in order to compute the projection onto C. This difficulty appears when the feasible set of problem (1.1) is expressed as the solution set of another problem, as in Chapter 5. Frequently, in this kind of problems, it is very hard to find the projection onto the feasible set or even find a feasible point. One option for avoiding this difficulty consists in replacing, at each iteration, the projection onto C by the projection onto separating halfspaces containing the given set C. The last option was introduced by Fukushima in [44] for point-to-point and strongly monotone variational inequality problem. Other schemes have been proposed, in order to improve 5

the convergence results, without performing projections onto the feasible set: for point-to-set and paramonotone operators (see e.g., [14,16,31]); for point-to-point and monotone operators (see e.g., [17, 32]).

1.2

Inclusion problem

Taking, the set C as a whole space, in the variational inequality problem, e.g., C = H, the problem (1.1) reduces to finding a zero of operator T . Finding a zero of a function is an old and very important problem, and many mathematical and physical problems reduce to this problem. The inclusion problem is a generalization of this classical problem, when the function is a point-to-set operator. Given the Hilbert space H and the operator T : dom(T ) ⊆ H → 2H , the inclusion problem consists of: Find x∗ ∈ H such that 0 ∈ T (x∗ ).

(1.2)

The solution set will be denoted by SI := {x ∈ H : 0 ∈ T (x)}. This problem has recently received a lot attention, due to the fact that many nonlinear problems, arising within applied areas, are mathematically modeled as nonlinear operator equations and/or inclusions, which are decomposed as the sum of two operators. Here we focus our attention in the so-called splitting methods, when the operator T = A + B, where A : dom(A) ⊆ H → H is a pointto-point and B : dom(B) ⊆ H → 2H is a point-to-set operator. Frequently the splitting iteration involves only the individual operators, A or B, but not the sum, A + B at each step (see, e.g., [37, 39, 83]). A classical splitting method for solving problem (1.2) is the so-called forward-backward splitting method proposed by Lions and Mercier in [65] and Passty in [71]. Assuming that dom(B) ⊆ dom(A), this scheme is given as follows: xk+1 = (Id +βk B)−1 (Id −βk A)(xk ),

(1.3)

where βk > 0 for all k ∈ N. Mercier in [66] and Gabay in [45] showed that if A−1 is strongly monotone with modulus δ > 0, then the sequence (xk )k∈N generated by (1.3), converges weakly to a solution when βk ≤ β < 1/2δ for all k ∈ N. Moreover, when A is strongly monotone, then (xk )k∈N converge strongly to the unique solution (see [66]). An important and promising modification of Scheme (1.3) was presented by Tseng in [83]. It consists of: J(xk , βk ) :=(Id +βk B)−1 (Id −βk A)(xk )    k+1 k k k x =PX J(x , βk ) − βk A(J(x , βk )) − A(x ) ,

(1.4) (1.5)

where X is a nonempty, closed and convex set, belonging to dom(A). Note that there exist various choices for the set X. If dom(B) is closed, then the result of Minty in [68], implies 6

that dom(B) is convex, hence we may choose X = dom(B) (see [83]). The stepsize βk is chosen to be the largest β ∈ {σ, σθ, σθ2 , . . .}, satisfying:



 β A J(xk , β) − A(xk ) ≤ δ J(xk , β) − xk , (1.6) with θ, δ ∈ (0, 1) and σ > 0. The convergence of (1.4)-(1.6) was established assuming maximal monotonicity of A and B, as well as either Lipschitz continuity of A over X∪dom(B) or continuity of A on dom(B) ⊃ X. It is important to say that, in the above scheme, in order to compute βk satisfying (1.6), the forward-backward operator (1.4) must be calculated, in each tentative to choose of the step length. From a computational point of view, this represents a considerable drawback. In order to overcome this limitation, in Chapter 3, a conceptual algorithm has been proposed, containing three variants. Our scheme contains two parts. The first part consists of finding a separating halfspace, containing the solution set of the problem. This procedure employs a new Armijo-type search which performs only one calculation of the forward-backward operator, differently from Tseng’s algorithm.

1.3

Mathematical tools

In this section we introduce some mathematical tools, including theorems, propositions, facts and definitions used in the reminder part of this thesis, mostly from Functional Analysis, Convex Analysis and Monotone Operator Theory. We begin with some definitions and results needed for the convergence analysis of the proposed methods. Throughout this thesis, we write p := q to indicate that p is defined to be equal to q. The closed (open) ball centered at x ∈ H with radius ρ > 0 will be defined by B[x, ρ] := {y ∈ H : ky − xk ≤ ρ}(B(x, ρ) := {y ∈ H : ky − xk < ρ}). The domain of any function f : H → R, is defined as dom(f ) := {x ∈ H : f (x) < +∞} and we say f is proper if dom(f ) 6= ∅. Finally, let T : H → 2H be an operator. Then, the domain and the graph of T are defined as dom(T ) := {x ∈ H : T (x) 6= ∅} and Gph(T ) := {(x, u) ∈ H × H : u ∈ T (x)}. The indicator function of set C is defined as:

δC (x) :=

  0

if x ∈ C

 ∞ otherwise. The normal cone of the set C is NC := ∂δC , where the subdifferential of f at x is defined as the set ∂f (x) := {v ∈ H : f (y) ≥ f (x) + hv, y − xi, ∀y ∈ H} . 7

We define the resolvent operator of T as RλT := (Id +λT )−1 .

1.3.1

Introduction to functional analysis

Now some well-known results of functional analysis follow, starting with basic definitions and consequences. Definition 1.3.1 The subset C ⊂ H, is said to be bounded if and only if, there exists M ∈ R such that kxk ≤ M , for all x ∈ C. Definition 1.3.2 A sequence (xk )k∈N ⊂ H is said to be: (i) Strongly convergent to x ∈ H if and only if limk→∞ kxk − xk = 0. (ii) Weakly convergent to x ∈ H if and only if limk→∞ hxk − x, yi = 0, for all y ∈ H. Lemma 1.3.3 [28, Theorem 3.16]. In Hilbert spaces, every bounded sequence has a weakly convergent subsequence. We next deal with the so-called quasi-Fej´er and Fej´er convergence and its properties. Definition 1.3.4 Let S be a nonempty, convex and closed subset of H. A sequence (xk )k∈N is said to be quasi-Fej´er convergent to S, if and only if, for all x ∈ S, there exist k0 ≥ 0 and a summable sequence (k )k∈N ⊂ R+ , such that kxk+1 − xk2 ≤ kxk − xk2 + k , for all k ≥ k0 . If k = 0 for all k ∈ N, the sequence is said to be Fej´er convergent. This definition originates in [40] and has been further elaborated in [54]. Useful results on quasi-Fej´er sequences are the following. Note that Fej´er convergence implies quasi-Fej´er convergence. Proposition 1.3.5 [3, Proposition 1]. If (xk )k∈N is quasi-Fej´er convergent to S ⊂ H, then: (i) The sequence (xk )k∈N is bounded. (ii) For each x ∈ S, (kxk − xk)k∈N converges. (iii) If all weak accumulation point of (xk )k∈N belong to S, then the sequence (xk )k∈N is weakly convergent to S. If finite-dimensional Hilbert spaces, it suffices that one accumulation point of (xk )k∈N belongs to S. 8

Lemma 1.3.6 [14, Lemma 2]. If (xk )k∈N is quasi-Fej´er convergent S, then (PS (xk ))k∈N is strongly convergent. A know result on sequence averages is the following. Proposition 1.3.7 [14, Proposition 3 ]. Let (pk )k∈N ⊂ H be a sequence strongly convergent to p˜. Take nonnegative real numbers ζk,j (k ≥ 0, 0 ≤ j ≤ k) such that limk→∞ ζk,j = 0 for P all j and kj=0 ζk,j = 1 for all k ∈ N. Define k

x :=

k X

ζk,j pj .

j=0

Then, (xk )k∈N also converges strongly to p˜.

1.3.2

Introduction to convex analysis

Definition 1.3.8 The set C ⊂ H is say to be convex if and only if, for each x, y ∈ C tx + (1 − t)y belongs to C for all t ∈ [0, 1]. This says that all points on a segment connecting two points in the set are in the set. For C a nonempty, convex and closed subset of H, we define the orthogonal projection of x onto C by PC (x), as the unique point in C such that kPC (x) − yk ≤ kx − yk for all y ∈ C. We state well-known facts on orthogonal projections. Proposition 1.3.9 Let C be any nonempty, closed and convex set in H, and PC the orthogonal projection onto C. For all x, y ∈ H and all z ∈ C the following hold:  (i) kPC (x) − PC (y)k2 ≤ kx − yk2 − k(PC (x) − x) − PC (y) − y k2 . (ii) hx − PC (x), z − PC (x)i ≤ 0. (iii) Let x ∈ C, y ∈ H and z = PC (y), then hx − y, x − zi ≥ kx − zk2 . Proof. (i) and (ii): See Lemma 1.1 and 1.2 in [84]. (iii): Using (ii), we have hx − y, x − zi = hx − z, x − zi + hx − z, z − yi ≥ kx − zk2 .



The next property will be useful and important thought the thesis. Proposition 1.3.10 [11, Proposition 28.19]. Let f : H → H a convex function and given y, z, w ∈ H and v ∈ ∂f (z). Define Wz,w := {x ∈ H : hx − z, w − zi ≤ 0} and Cz := {x ∈ 9

H : f (z) + hv, x − zi ≤ 0}. Then, PCz ∩Wz,w (w) = w + max{0, λ1 }v + λ2 (w − z), where λ1 , λ2 are solution of the linear system: λ1 kvk2 + λ2 hv, w − zi = −hv, w − zi − f (z) λ1 hv, w − zi + λ2 kw − zk2 = −kw − zk2 and

  f (z) + hv, y − zi PCz (y) = y − max 0, v. kvk2

The next lemma provides a computable upper bound for the distance from a point to the feasible set C. Lemma 1.3.11 [14, Lemma 4]. Let c : H → R be a convex function and C := {x ∈ H : c(x) ≤ 0}. Assume that there exists w ∈ C such that c(w) < 0. Then, for all y ∈ H such that c(y) > 0, we have dist(y, C) ≤

ky − wk c(y) . c(y) − c(w)

The next two lemmas were presented in [22, Lemma 2.9 and 2.10]. We provide the proof of them only for the integrity of this thesis. Lemma 1.3.12 Let H ⊆ H be a closed halfspace and C ⊆ H such that H ∩ C 6= ∅. Then, for every x ∈ C, we have PH∩C (x) = PH∩C (PH (x)). Proof. If x ∈ H, then x = PH∩C (x) = PH∩C (PH (x)). Suppose that x ∈ / H. Fix any y ∈ C ∩ H. Since x ∈ C but x ∈ / H, there exists γ ∈ [0, 1), such that x˜ = γx + (1 − γ)y ∈ C ∩ bd H, where bd H is the hyperplane boundary of H. So, (˜ x − PH (x))⊥(x − PH (x)) and (PH∩C (x) − PH (x))⊥(x − PH (x)). Then k˜ x − xk2 = k˜ x − PH (x)k2 + kx − PH (x)k2 ,

(1.7)

kPH∩C (x) − xk2 = kPH∩C (x) − PH (x)k2 + kx − PH (x)k2 ,

(1.8)

and respectively. Using (1.7) and (1.8), we get ky − PH (x)k2 ≥ k˜ x − xk2 = k˜ x − PH (x)k2 + kPH (x) − xk2 ≥ k˜ x − PH (x)k2 . = k˜ x − xk2 − kx − PH (x)k2 ≥ kPH∩C (x) − xk2 − kx − PH (x)k2 = kPH∩C (x) − PH (x)k2 . Thus, ky − PH (x)k ≥ kPH∩C (x) − PH (x)k for all y ∈ C ∩ H and as consequence, PH∩C (x) = PC∩H (PH (x)).  10

Lemma 1.3.13 Let S be a nonempty, closed and convex set. Let x0 , x ∈ H. Assume that x0 ∈ / S and that S ⊆ W (x) := {y ∈ H : hy − x, x0 − xi ≤ 0}. Then, x ∈ B[ 12 (x0 + x), 12 ρ], where x = PS (x0 ) and ρ = dist(x0 , S). Proof. Since S is convex and closed, x = PS (x0 ) and ρ = dist(x0 , S) are well-defined. S ⊆ W (x) implies that x = PS (x0 ) ∈ W (x). Define v := 12 (x0 +x) and r := x0 −v = 21 (x0 −x), then x − v = −r and krk = 21 kx0 − xk = 12 ρ. It follows that

0 ≥ hx − x, x0 − xi = x − v + v − x, x0 − v + v − x = h−r + (v − x), r + (v − x)i = kv − xk2 − krk2 . The result follows.

1.3.3



Introduction to monotone operator theory

In the following we state some useful definitions and results on monotone operators. Definition 1.3.14 Let H be a Hilbert space, the operator T : dom(T ) ⊆ H → 2H is said to be: (i) Strongly monotone if and on if there exists c > 0 such that for all (x, u), (y, v) ∈ Gph(T ), we have hu − v, x − yi ≥ ckx − yk2 . (ii) Monotone if and only if for all (x, u), (y, v) ∈ Gph(T ), we have hu − v, x − yi ≥ 0, and it is Maximal monotone if T has no proper monotone extension in the graph inclusion sense. (iii) Pseudomonotone if and only if for all (x, u), (y, v) ∈ Gph(T ), we have hv, x − yi ≥ 0 implies hu, x − yi ≥ 0. Is easy to see that (i) ⇒ (ii) ⇒ (iii), but the reverse is not true in general (see, e.g., [13]). Lemma 1.3.15 Let T : dom(T ) ⊆ H → 2H be a maximal monotone operator. Then, (i) Gph(T ) is closed. (ii) T is bounded on bounded subsets of the interior of its domain. Proof. (i): See Proposition 4.2.1(ii) in [30]. (ii): See Lemma 5(iii) in [16].



We continue with the definition and some results on normal cones. 11

Definition 1.3.16 Let C be a subset of H and let u ∈ C. A vector u ∈ H is called a normal to C at x if for all y ∈ C, hu, y − xi ≤ 0. The collection of all such normal u is called the normal cone of C at x and is denoted by NC (x). If x ∈ / C, we define NC (x) = ∅. Note that in some special cases, the normal cone can be computed explicitly as showed below. Example 1.3.17 [63, page 386]. If C is a polyhedral set, i.e., C = {x ∈ Rn : hai , xi ≤ bi , i = 1, . . . , m}, then NC (x) = {d ∈ Rn : d =

m X

λi ai , ; λi (hai , xi − bi ) = 0, λi ≥ 0, i = 1, . . . , m}

i=1

for x ∈ C, and NC (x) = ∅ for x ∈ / C. Example 1.3.18 [25, Example 2.62]. Let C be a closed convex cone in Rn . Define C := {d ∈ Rn : hd, xi ≤ 0, ∀x ∈ C}. Then, NC (x) = C ∩ {x}⊥ = {d ∈ C : hd, xi = 0}, if x ∈ C and ∅ if x ∈ / C. Example 1.3.19 [75, Theorem 6.14]. Let C = {x ∈ X : F (x) ∈ D} for closed sets X ⊂ Rm and D ∈ Rm and continuously differentiable mapping F : Rn → Rm such that F (x) = (f1 (x), f2 (x), . . . , fm (x)). Assume that x ∈ C satisfies the constraint qualification that:     the unique vector y = (y1 , . . . , yn ) ∈ ND (F (x)) for which m X   − yi ∇fi (x) ∈ NX (x) is y = (0, . . . , 0).  i=1

Then, NC (x) =

( m X

) yi ∇fi (x) + z : y ∈ ND (F (x)), z ∈ NX (x) .

i=1 n

The normal cone can be seen as an operator, i.e., NC : C ⊂ Rn → 2R : x 7→ NC (x). Indeed, recall that the indicator function of C is defined by δC (y) := 0, if y ∈ C and +∞, otherwise, ¯ is defined and the classical convex subdifferential operator for a proper function f : Rn → R n by ∂f : Rn → 2R : x 7→ ∂f (x) := {u ∈ Rn : f (y) ≥ f (x) + hu, y − xi, ∀ y ∈ Rn }. Then, it is well-known that the normal cone operator can be expressed as NC = ∂δC . ¯ be a proper Example 1.3.20 [73, Theorem 23.7] [25, Proposition 2.61]. Let f : Rn → R and convex function. Consider C = {x ∈ Rn : f (x) ≤ 0}. Suppose that there exists x such 12

that f (x) < 0 (Slater condition). Then,      cl(cone(∂f (x))), NC (x) =

0,     ∅,

if f (x) = 0; if f (x) < 0; if f (x) > 0.

Fact 1.3.21 [30, Proposition 4.2.1]. The normal cone operator for C, NC , is maximal monotone operator and its graph, Gph(NC ), is closed, i.e., for all sequences (xk , uk )k∈N ⊂ Gph(NC ) that converges to (x, u), we have (x, u) ∈ Gph(NC ). Next we present some facts on normal cone, Fact 1.3.22 [10, Proposition 2.3]. Let C be a nonempty, convex and closed set in H, and PC the orthogonal projection onto C. Then PC = (Id +NC )−1 . Corollary 1.3.23 [22, Corollary 2.8]. For all x, p ∈ H and α > 0, we have x − PC (x − αp) ∈ p + NC (PC (x − αp)). α Fact 1.3.24 [41, Proposition 1.5.8]. Let SV I be the solution of problem (1.1). Then the following statements are equivalent: (i) x ∈ SV I . (ii) −T (x) ∈ NC (x). (iii) There exists β > 0 such that x = PC (x − βT (x)). Proposition 1.3.25 [22, Proposition 2.14]. Given T : dom(T ) ⊆ H → H and α > 0. If x = PC (x−α(T (x)+αu)), with u ∈ NC (x), then x ∈ SV I , or equivalently, x = PC (x−βT (x)) for all β > 0. Remark 1.3.26 It is quite easy to see that the reverse of Proposition 1.3.25 is not true in general. Given the operator T and a nonempty, closed and convex set C, we defined the set S0 as follows: S0 := {x ∈ C : hT (y), y − xi ≥ 0, ∀y ∈ C}. (1.9) This set is closely related to the solution set of the variational inequality problem, SV I . Now we present some results that show this relation and will be useful through this thesis. 13

Lemma 1.3.27 If T : dom(T ) ⊆ Rn → Rn is continuous, then S0 ⊆ SV I . Proof. Assume that x ∈ {x ∈ C : hT (y), y − xi ≥ 0 ∀ y ∈ C}. Take y(α) = (1 − α)x + αy, y ∈ C with α ∈ (0, 1). Since y(α) ∈ C and hence 0 ≤ hT (y(α)), y(α) − xi = hT ((1 − α)x + αy), (1 − α)x + αy − xi = αhT ((1 − α)x + αy), y − xi. Dividing by α > 0, we get 0 ≤ hT ((1 − α)x + αy), y − xi, and taking limits when α goes to 0, we obtain from the continuity of T that hT (x), y − xi ≥ 0, for all y ∈ C, i.e., x ∈ SV I .   Lemma 1.3.28 For any (z, v) ∈ Gph(NC ) define H(z, v) := y ∈ Rn : hT (z) + v, y − zi ≤ 0 . Then, S0 ⊆ H(z, v). Proof. Take x∗ ∈ S0 , then hT (z), x∗ − zi ≤ 0 for all z ∈ C. Since (z, v) ∈ Gph(NC ), we have hv, x∗ − zi ≤ 0. Summing up these inequalities, we get hT (z) + v, x∗ − zi ≤ 0. Then, x∗ ∈ H(z, v).  Lemma 1.3.29 If T : dom(T ) ⊆ Rn → Rn is continuous, then S0 is a closed and convex. Proof. It follows directly from Definition (1.9).



Lemma 1.3.30 [18, Lemma 2.4]. Let T : H → 2H be a maximal monotone operator and C a closed and convex set. Then if SV I is nonempty, it is also closed and convex. Lemma 1.3.31 [76, Lemma 3]. If T : H → 2H is maximal monotone and let SV I be the solution set of the problem (1.1), then SV I = {x ∈ C : hv, y − xi ≥ 0 , ∀ y ∈ C, ∀ v ∈ T (y)}. Proposition 1.3.32 [67, Theorem 4]. Given β > 0 and T : dom(T ) ⊆ H → 2H a pointto-set and maximal monotone operator. Then the operator (Id +β T )−1 : H → dom(T ) is single valued and Lipschitz continuous with constant 1 (nonexpansive). Proposition 1.3.33 [37, Proposition 3.13]. Given β > 0 and A : dom(A) ⊆ H → H and B : dom(B) ⊆ H → 2H two maximal monotone operators. Then, x = (Id +βB)−1 (Id −βA)(x), if and only if 0 ∈ (A + B)(x). Theorem 1.3.34 [26, Theorem 9]. Let A : dom(A) ⊂ H → 2H and B : dom(B) ⊂ H → 2H two maximal monotone operators. Suppose also that either (i) the set int(dom(A)) ∩ int(dom(B)) is nonempty; or 14

(ii) dom(A) ∩ int(dom(B)) 6= ∅ while dom(A) ∩ dom(B) is closed and convex. Then A + B is maximal monotone.

15

Chapter 2 Conditional extragradient algorithms for variational inequalities 2.1

Introduction

This chapter is based on reference [22]. We present a conditional extragradient algorithms for solving the problem (1.1) with ∅ 6= C ⊂ H convex and closed. We recall the variational inequality problem with a single value operator T : find x∗ ∈ C such that hT (x∗ ), x − x∗ i ≥ 0

∀ x ∈ C.

(2.1)

Through this chapter we consider that the Hilbert space H is finite dimensional, i.e., H = Rn . It is well-known that (2.1) is closely related with the so-called dual formulation problem of the variational inequalities, written as find x∗ ∈ C such that hT (x), x − x∗ i ≥ 0,

∀ x ∈ C.

(2.2)

The solution set of problem (2.2) is the set S0 defined in (1.9). Throughout this chapter, our standing assumptions are the following: (A1) T is continuous on C. (A2) Problem (2.1) has at-least one solution and all solutions of (2.1) solve the dual problem (2.2). By Lemma 1.3.27, assumption (A1) implies S0 ⊆ SV I . So, the existence of solutions of (2.2) implies that of (2.1). However, the reverse assertion needs generalized monotonicity assumptions. For example, if T is pseudomonotone then SV I ⊆ S0 (see [58, Lemma 1]). 16

With this results, we note that (A2) is strictly weaker than pseudomonotonicity of T (see Example 1.1.3 of [57] and Example 2.2.1 below). Moreover, the assumptions SV I 6= ∅ and the continuity of T are natural and classical in the literature for most of methods for solving (2.1). Assumption (A2) has been used in various algorithms for solving problem (2.1) (see, e.g., [58, 59]).

2.1.1

Extragradient algorithm

In this chapter we focus our attention on projection-type algorithms for solving problem (2.1). Excellent surveys on projection algorithms for solving variational inequality problems can be found in [41, 47, 57]. One of the most studied projection algorithms is the so-called Extragradient algorithm, which first appeared in [61]. The projection methods for solving problem (2.1) necessarily have to perform two projection steps onto the feasible set because the natural extension of the projected gradient method (one projection and T = ∇f ) fails in general for monotone operators (see, e.g., [16]). Thus, an extra projection step is required in order to establish the convergence of the projection methods. Next we describe a general version of the extragradient algorithm together with important strategies for computing the stepsizes (see, e.g., [41, 57]). Algorithm E (Extragradient Algorithm) Given αk , βk , γk . Step 0 (Initialization): Take x0 ∈ C. Step 1 (Iterative Step): Compute z k = xk − βk T (xk ),

(2.3a)

y k = αk PC (z k ) + (1 − αk )xk ,  and xk+1 = PC xk − γk T (y k ) .

(2.3b) (2.3c)

Step 2: (Stop Test): If xk+1 = xk then stop. Otherwise set k ← k + 1 and go to Step 1. We now describe some possible strategies to choose positive stepsizes αk , βk and γk in (2.3b), (2.3a) and (2.3c), respectively. (a) Constant stepsizes: βk = γk where 0 < βˇ ≤ βk ≤ βˆ < +∞ and αk = 1, ∀k ∈ N. (b) Armijo-type linesearch on the boundary of the feasible set: Set σ > 0, and δ ∈ (0, 1). For each k, take αk = 1 and βk = σ2−j(k) where    δ  k k,j k k,j 2  j(k) := min j ∈ N : kT (x ) − T (PC (z ))k ≤ kx − PC (z )k , σ2−j (2.4)   and z k,j = xk − σ2−j T (xk ). 17

In this approach, we take γk = βk , y k = PC (xk − βk T (xk )) and γk =

hT (y k ), xk − y k i , kT (y k )k2

∀k ∈ N.

(c) Armijo-type linesearch along the feasible direction: Set δ ∈ (0, 1), z k := xk − βk T (xk ) ˇ β] ˆ such that 0 < βˇ ≤ βˆ < +∞, and αk = 2−`(k) where with, (βk )k∈N ⊂ [β,    δ  k k 2 k,` k k  `(k) := min ` ∈ N : hT (z ), x − PC (z )i ≥ kx − PC (z )k , βk (2.5)   and z k,` = 2−` P (z k ) + (1 − 2−` )xk . C

Then, define y k = αk PC (z k ) + (1 − αk )xk and γk =

hT (y k ), xk − y k i , kT (y k )k2

∀k ∈ N.

Below several comments follow, explaining the differences between these strategies. It has been proved in [61] that the extragradient algorithm with Strategy (a) is globally convergent if T is monotone and Lipschitz continuous on C. The main difficulty of this strategy is the necessity of choosing βk in (2.3a) satisfying 0 < βk ≤ β < 1/L where L is the Lipschitz constant of T , since the stepsizes have be taken sufficiently small to ensures the convergence, and also L may not be know beforehand. Strategy (b) was first studied in [56] under monotonicity and Lipschitz continuity of T . The Lipschitz continuity assumption was removed later in [51]. Note that this strategy requires computing the projection onto C inside the inner loop of the Armijo-type linesearch (2.4). So, the need to compute possible many projections for each iteration k makes Strategy (b) inefficient when an explicit formula for PC is not available. Strategy (c) was presented in [53]. This strategy guarantees convergence by assuming only the monotonicity of T and the existence of solutions of (2.1), without assuming Lipschitz continuity on T . This approach demands only one projection for each outer step k. In Strategies (b) and (c), T is evaluated at least twice and the projection is computed at least twice per iteration. The resulting algorithm is applicable to the whole class of monotone variational inequalities. It has the advantage of not requiring exogenous parameters. Furthermore, Strategies (b) and (c) occasionally allow long steplength because they both exploit much the information available at each iteration. Extragradient-type algorithms is currently a subject of intense research (see, e.g., [6, 16, 18, 19, 32, 80, 81]). A special modification on Strategy (c) was presented in [59] where the monotonicity was replaced by (A2). The main difference is that it performs (  `(k) := min ` ∈ N : hT (z k,` ), xk − PC (z k )i ≥ δhT (xk ), xk − PC (z k )i , (2.6) and z k,` = 2−` PC (z k ) + (1 − 2−` )xk , 18

instead of (2.5).

2.1.2

Proposed schemes

The main part of this chapter contains two conceptual algorithms, each of them with three variants. Convergence analysis of both conceptual algorithms is established assuming weaker assumptions than previous extragradient algorithms [20, 63]. The approach presented here is closely related to the extragradient algorithm in the above subsection and is based on combining, modifying and generalizing of several ideas contained in various classical extragradient variants. Our scheme was inspired by the conditional subgradient method proposed in [63], and it uses a similar idea of Algorithm E over Strategies (a), (b) and (c). The example presented in Section 2.2 motivates our scheme and shows that the variants proposed here may perform better than previous classical variants. Basically, our two conceptual algorithms contain two parts: The first has two different linesearches: one on the boundary of the feasible set and the other along the feasible direction. These linesearches allow us to find a suitable halfspace separating the current iteration and the solution set. The second has three projection steps allowing several variants with different and interesting features on the generated sequence. In this setting, some of the proposed variants on the conceptual algorithms are related to the algorithms presented in [18, 53, 80]. An essential characteristic of the conceptual algorithms is the convergence under very mild assumptions, like continuity of the operator T (see (A1)), existence of solutions of (2.1), and assuming that all such solutions also solve the dual variational inequality (2.2) (see (A2)). We would like to emphasize that this concept is less restrictive than pseudomonotonicity of T and plays a central role in the convergence analysis of our algorithms.

2.2

Motivation: An example

In this section, we present an elementary instance of problem (2.1), in which normal vectors to the feasible set are beneficial.   Example 2.2.1 Let B := (b1 , b2 ) ∈ R2 recall that the rotation with angle θ ∈ − π2 , π2 around B is given by 2

2

Rθ,B : R → R : x 7→



cos θ

sin θ

− sin θ

cos θ

 (x − B) + B.

We consider problem (2.1) in R2 with the operator T := R− π ,B − Id where B := ( 12 , 1), and 2

19

the feasible set is given as  C := (x1 , x2 ) ∈ R2 : x21 + x22 ≤ 1, x1 ≤ 0, x2 ≥ 0 . Note that operator T is Lipschitz continuous with constant L = 2, but not monotone. Now we prove that operator T satisfies (A2), i.e., S0 = S∗ . Now let us split our analysis into two distinct parts. Part 1: (The primal problem has a unique solution). For x := (x1 , x2 ) ∈ R2 , consider the operator       −1 −1 3/2 0 −1 (x − B) + B − x =  x +  . (2.7) T (x) :=  1 0 1 −1 1/2 We will show that the primal variational inequality problem (1.1), has a unique solution. Indeed, notice that the solution (if exists); cannot lie in the interior of C (because T (x) 6= 0 for all x ∈ C); and also cannot lie on the two segment {0} × [0, 1] and [−1, 0] × {0} (by direct computations). Thus, the solution must lie on the arc Γ := {(x1 , x2 ) ∈ R2 | x21 + x22 = 1, x1 ≤ 0, x2 ≥ 0}. Using polar coordinates, set x = (cos t, sin t) ∈ Γ, t ∈ (π/2, π). Then,   3 − cos t − sin t + 2  T (x) =  . cos t − sin t + 12 Since x∗ ∈ SV I , the vectors x∗ and T (x∗ ) must be parallel. Hence, cos t∗ − sin t∗ + 12 − cos t∗ − sin t∗ + 23 = cos t∗ sin t∗ 2 2 3 − sin t∗ cos t∗ − sin t∗ + 2 sin t∗ = cos t∗ − cos t∗ sin t∗ + 12 cos t∗ 3 2 √3 10

sin t∗ − 21 cos t∗ = 1

sin t∗ −

√1 10

cos t∗ =   sin t∗ − arcsin( √110 ) =

√2 10 √2 . 10

Since t ∈ (π/2, π) for all x ∈ C, we have t∗ = π − arcsin( √210 ) + arcsin( √110 ) ≈ 2.7786. Then, the unique solution is x∗ = (cos t∗ , sin t∗ ) ≈ (−0.935, 0.355). Part 2: (The primal solution is also a solution of the dual problem). Now, we will show that x∗ is a solution to the dual problem and as consequence of the continuity of T and Lemma 1.3.27 the result follows. If x∗ ∈ S0 , hT (y), y − x∗ i ≥ 0 for all y ∈ C. 20

First, notice that kx∗ k = 1 and        −1 −1 −0.935 3/2  2.08  T (x∗ ) ≈  + ≈ ≈ −2.22 x∗ . 1 −1 0.355 1/2 −0.79 So, we can write T (x∗ ) = γ(−x∗ ) where 2 < γ ≈ 2.22.

(2.9)

On the other hand, from (2.7), we can check that hT (y) − T (x∗ ), y − x∗ i = −ky − x∗ k2 ,

∀y ∈ R2 .

(This is why T is never monotone!). It follows that hT (y), y − x∗ i = hT (x∗ ), y − x∗ i − ky − x∗ k2 . Thus, it suffices to prove hT (x∗ ), y − x∗ i ≥ ky − x∗ k2 Take y ∈ C, so kyk ≤ 1. we define z =

for all y ∈ C.

(2.10)

x∗ + y . Then, 2

hz, z − x∗ i = 21 hy + x∗ , z − x∗ i = 41 hy + x∗ , y − x∗ i = 41 (kyk2 − kx∗ k2 ) ≤ 0, implying that hz − x∗ , z − x∗ i = hz, z − x∗ i + h−x∗ , z − x∗ i ≤ h−x∗ , z − x∗ i. Combining the last inequality with the definition of z, we get 0 ≤ky − x∗ k2 = 4kz − x∗ k2 = 4hz − x∗ , z − x∗ i ≤ 4h−x∗ , z − x∗ i =2h−x∗ , y − x∗ i < γh−x∗ , y − x∗ i = hγ(−x∗ ), y − x∗ i = hT (x∗ ), y − x∗ i, where we use (2.9) in the last inequality. This proves (2.10) and thus complete the proof. As consequence T satisfies (A2) and the unique solution of the problem is x∗ ≈ (−0.935, 0.355). Now we present some numerical results for the presented algorithm. We set a starting point (x0 , y 0 ) = (0, 1) and a scalar β = 0.25. Now, we observe the performance of the following two algorithms: (a1) Algorithm E with Strategy (a) (constant stepsizes, i.e., ∀k ∈ N, γk = βk = β and αk = 1), which generates the following sequences: ( z k = PC (y k − βT (y k )), ∀k ∈ N : y k+1 = PC (y k − βT (z k )). 21

(a2) A modified Algorithm E with Strategy (a) involving unit normal vectors in both projection steps, which generates the following sequences as follows: ( ∀k ∈ N :

z k = PC (xk − β(T (xk ) + uk )),

where uk ∈ NC (xk ), kuk k = 1,

xk+1 = PC (xk − β(T (z k ) + v k )),

where v k ∈ NC (z k ), kv k k = 1.

Figure 2.1: Advantage by using non-null normal vectors.

The first five iterations of each algorithm are showed in Figure 2.1: Recall that (y k )k∈N (red) and (xk )k∈N (blue) are the sequences generated by the algorithms presented above in (a1) and (a2), respectively. The comparison suggests that normal vectors of the feasible set can help the extragradient algorithm to improve considerably the convergence speed.

2.3

The extragradient algorithm with normal vectors

Inspired by the previous section, we investigate the extragradient algorithm with constant stepsizes involving normal vectors of the feasible set. In this section we assume that T is Lipschitz with constant L and (A2) holds. The algorithm proposed in this section is related with Algorithm E under Strategy (a) (constant stepsizes). It is defined as: 22

ˇ β] ˆ such Algorithm CE (Conditional Extragradient Algorithm) Take (βk )k∈N ⊂ [β, that 0 < βˇ ≤ βˆ < 1/(L + 1) and δ ∈ (0, 1). Step 1 (Initialization): Take x0 ∈ C and set k ← 0. Step 1 (Stop Test 1): If xk = PC (xk − βk T (xk )), i.e., xk ∈ SV I , then stop. Otherwise: Step 2 (First Projection): Take uk ∈ NC (xk ) such that kuk k ≤ δkxk − PC (xk − βk (T (xk ) + uk ))k.

(2.11)

z k = PC (xk − βk (T (xk ) + uk )).

(2.12)

Step 3 (Second Projection): Take v k ∈ NC (z k ) such that kv k − uk k ≤ kxk − z k k.

(2.13)

xk+1 = PC (xk − βk (T (z k ) + v k )).

(2.14)

Step 4(Stop Test 2): If xk+1 = xk then stop. Otherwise set k ← k + 1 and go to Step 1.

Proposition 2.3.1 The following hold: Algorithm CE is well-defined.

Proof. It is sufficient to prove that if Step 1 is not satisfied, i.e., kxk − PC (xk − βk T (xk ))k > 0.

(2.15)

Then, Step 2 and Step 3 are attainable. Step 2 is attainable: Suppose that (2.11) does not hold for every αuk ∈ NC (xk ) with α > 0, i.e., kαuk k > δkxk − PC (xk − βk (T (xk ) + αuk ))k ≥ 0. Taking limit when α goes to 0, we get kxk − PC (xk − βk T (xk ))k = 0, which contradicts (2.15). Step 3 is attainable: Suppose that (2.13) does not hold for every αv k ∈ NC (z k ) with α > 0, i.e., kαv k − uk k > kxk − z k k, where z k = PC (xk − βk (T (xk ) + uk )) as (2.12) and uk ∈ NC (xk ) satisfying (2.11). Letting α goes to 0 and using (2.11), we get kxk −z k k ≤ kuk k ≤ δkxk −z k k. So, xk = z k . Then, Proposition 1.3.25 implies a contradiction to (2.15). 

Lemma 2.3.2 Suppose that T is Lipschitz continuous with constant L. Let x∗ ∈ SV I . Then, for every k ∈ N, it holds that kxk+1 − x∗ k2 ≤ kxk − x∗ k2 − (1 − βk2 (L + 1)2 )kz k − xk k2 . 23

Proof. Define wk = xk − βk (T (z k ) + v k ) with v k ∈ NC (z k ) as Step 3. Then, using (2.14) and applying Proposition 1.3.9(i), with x = wk and y = x∗ , we get kxk+1 − x∗ k2 ≤ kwk − x∗ k2 − kwk − PC (wk )k2 ≤ kxk − x∗ − βk (T (z k ) + v k )k2 − kxk − xk+1 − βk (T (z k ) + v k )k2 = kxk − x∗ k2 − kxk − xk+1 k2 + 2βk hT (z k ) + v k , x∗ − xk+1 i.

(2.16)

Since v k ∈ NC (z k ) and (A2), we have hT (z k ) + v k , x∗ − xk+1 i =hT (z k ) + v k , z k − xk+1 i + hT (z k ) + v k , x∗ − z k i ≤hT (z k ) + v k , z k − xk+1 i + hT (z k ), x∗ − z k i ≤hT (z k ) + v k , z k − xk+1 i. Substituting the above inequality in (2.16), we get kxk+1 − x∗ k2 ≤kxk − x∗ k2 − kxk − xk+1 k2 − 2βk hT (z k ) + v k , xk+1 − z k i =kxk − x∗ k2 − kxk − z k k2 − kz k − xk+1 k2 +2hxk − βk (T (z k ) + v k ) − z k , xk+1 − z k i.

(2.17)

Define xk = xk − βk (T (xk ) + uk ) with uk ∈ NC (xk ) as Step 2 and recall that z k = PC (¯ xk ) and xk+1 = PC (wk ) = PC (xk − βk (T (z k ) + v k )), we have 2hxk − βk (T (z k ) + v k ) − z k , xk+1 − z k i = 2hwk − PC (xk ), PC (wk ) − PC (xk )i = 2hxk − PC (xk ), PC (wk ) − PC (xk )i + 2hwk − xk , PC (wk ) − PC (xk )i ≤ 2hwk − xk , PC (wk ) − PC (xk )i = 2hwk − xk , xk+1 − z k i = 2βk h(T (xk ) + uk ) − (T (z k ) + v k ), xk+1 − z k i  ≤ 2βk kT (z k ) − T (xk )k + kv k − uk k kxk+1 − z k k ≤ 2βk (L + 1)kz k − xk kkxk+1 − z k k ≤ βk2 (L + 1)2 kz k − xk k2 + kxk+1 − z k k2 ,

(2.18)

using Proposition 1.3.9(ii), with x = xk −βk (T (xk )+uk ) and z = xk+1 , in the first inequality, the Cauchy-Schwarz inequality in the second one and the Lipschitz continuity of T and (2.13) in the third one. Therefore, (2.18) together with (2.17) proves the lemma.  Proposition 2.3.3 If Algorithm CE stops then xk ∈ SV I . Proof. If the algorithm stops in stop at Step 1 then Proposition 1.3.25 garantes the optimality of xk . If xk = PC (xk − βk (T (z k ) + v k )) then xk ∈ SV I by Lemma 2.3.2 and Proposition 1.3.25.  In the remainder of this section, we investigate the case that Algorithm CE does not stop and generates an infinite sequence (xk )k∈N . 24

Corollary 2.3.4 The sequence (xk )k∈N is Fej´er convergent to SV I and limk→∞ kz k −xk k = 0. Proof. It follows directly from Lemma 2.3.2 and the fact βk ≤ βˆ < 1/(L + 1) for all k ∈ N that. kxk+1 − x∗ k2 ≤ kxk − x∗ k2 − (1 − βˆ2 L2 )kz k − xk k2 ≤ kxk − x∗ k2 . So, (xk )k∈N is Fej´er convergent to SV I and Proposition 1.3.5(ii) together with the above inequality imply limk→∞ kz k − xk k = 0.  Proposition 2.3.5 The sequence (xk )k∈N converges to a point in SV I . Proof. The sequence (xk )k∈N is bounded by Lemma 2.3.2 and Proposition 1.3.5(i). Let x˜ be an accumulation point of a subsequence (xik )k∈N . By Corollary 2.3.4, x˜ is also an accumulation point of (z ik )k∈N . Without loss of generality, we suppose that the corresponding parameters (βik )k∈N and (uik )k∈N converge to β˜ and u˜, respectively. Since z k = PC (xk − βk (T (xk ) + uk )), ˜ (˜ taking the limit along the subsequence (ik )k∈N , we obtain x˜ = PC (˜ x − β(T x) + u˜)). Thus, Fact 1.3.21 and Proposition 1.3.25 imply x˜ ∈ SV I .  Next, we examine the performance of Algorithm CE for the variational inequality in Example 2.2.1 with and without normal vectors.

Figure 2.2: Conditional extragradient method with and without normal vectors. 25

Figure 2.2 shows the first five elements of sequences (y k )k∈N (generated without normal vectors) and (xk )k∈N (generated with non-null normal vectors). In practice, using normal vectors with large magnitude can potentially produce significant difference.

2.4

Two linesearches

In this section we present two linesearches, which will be used in our conceptual algorithms in the next two sections. These linesearches are related to Strategies (b) and (c) in Algorithm E. The main difference here is that the new ones utilize normal vectors to the feasible sets. We begin introducing a linesearch on the boundary of the feasible set, which is closely related with the linesearch in Strategy (b) of the extragradient algorithm (see Algorithm E). Indeed, if we set the vectors u ∈ NC (x) and vα ∈ NC (zα ) (α ∈ {σ, σθ, σθ2 , . . .}) as the null vector in Linesearch B below, then it becomes Strategy (b) presented in (2.4).

Linesearch B (Linesearch on the Boundary) Input: (x, u, σ, δ, M ). Where x ∈ C, u ∈ NC (x), σ > 0, δ ∈ (0, 1), and M > 0. Set α = σ and θ ∈ (0, 1) and choose u ∈ NC (x). Denote zα = PC (x − α(T (x) + αu)) and choose vα ∈ NC (zα ) with kvα k ≤ M . While αkT (zα ) − T (x) + αvα − αuk > δkzα − xk do α ← θα and choose any vα ∈ NC (zα ) with kvα k ≤ M . End While Output: (α, zα , vα ). We now show that Linesearch B is well definedassuming only (A1), i.e., continuity of T is sufficient to prove the well-definition of Linesearch B. Lemma 2.4.1 If x ∈ C and x ∈ / SV I , then Linesearch B stops after finitely many steps. Proof. Suppose on the contrary that Linesearch B does not stop for all α ∈ P := {σ, σθ, σθ2 , . . .} and the chosen vectors vα ∈ NC (zα ),

kvα k ≤ M,

(2.19a)

zα = PC (x − α(T (x) + αu)).

(2.19b)

αkT (zα ) − T (x) + αvα − αuk > δkzα − xk.

(2.20)

We have

26

On the one hand, dividing both sides of (2.20) by α > 0 and letting α goes to 0, we obtain by the boundedness of (vα )α∈P , presented in (2.19a), and the continuity of T that 0 = lim inf kT (zα ) − T (x) + αvα − αuk ≥ lim inf α→0

α→0

kx − zα k ≥ 0. α

Thus, by (2.19b), lim inf α→0

kx − PC (x − α(T (x) + αu))k = 0. α

(2.21)

On the other hand, Corollary 1.3.23 implies x − PC (x − α(T (x) + αu)) ∈ T (x) + αu + NC (PC (x − α(T (x) + αu))). α From (2.21), the continuity of the projection and the closedness of Gph(NC ) imply 0 ∈ T (x) + NC (x), which is a contradiction since x ∈ / SV I .  As mentioned before, the disadvantage of Linesearch B is the necessity to compute the projection onto the feasible set inside the inner loop to find the stepsize α. To overcome this, we present a linesearch along the feasible direction below, which is closely related to Strategy (c) of Algorithm E. Indeed, if we set the vectors u ∈ NC (x) and vα ∈ NC (zα ) (α ∈ {1, θ, θ2 , . . . , }) as the null vector in Linesearch F below, we obtain Strategy (c) presented in (2.6). Furthermore, if we only choose u = 0, then the projection step is done outside the procedure to find the stepsize α.

Linesearch F (Linesearch along the Feasible direction) Input: (x, u, β, δ, M ). Where x ∈ C, u ∈ NC (x), β > 0, δ ∈ (0, 1), and M > 0. Set α ← 1 and θ ∈ (0, 1). Define zα = PC (x − β(T (x) + αu)) and choose u ∈ NC (x), v1 ∈ NC (z1 ) with kv1 k ≤ M .  While hT αzα + (1 − α)x + vα , x − zα i < δhT (x) + αu, x − zα i do α ← θα and choose any vα ∈ NC (αzα + (1 − α)x) with kvα k ≤ M . End While Output: (α, zα , vα ). In the following we prove that Linesearch F is well definedassuming only (A1), i.e., continuity of T is sufficient to prove the Linesearch F is well defined. Lemma 2.4.2 If x ∈ C and x ∈ / SV I , then Linesearch F stops after finitely many steps. 27

Proof. Suppose on the contrary that Linesearch F does not stop. Then, for all α ∈ Q := {1, θ, θ2 , . . .}, the chosen vectors satisfy

    vα ∈ NC (αzα + (1 − α)x),    kvα k ≤ M,              (2.22a)
    zα = PC (x − β(T (x) + αu)),                          (2.22b)

and

    hT (αzα + (1 − α)x) + vα , x − zα i < δhT (x) + αu, x − zα i.     (2.23)

By (2.22a) the family (vα )α∈Q is bounded; thus, without loss of generality, we can assume that it converges to some v0 ∈ NC (x) (by Fact 1.3.21). The continuity of the projection operator and (2.22b) imply that (zα )α∈Q converges to z0 = PC (x − βT (x)) as α goes to 0. Taking the limit in (2.23), when α goes to 0, we get hT (x) + v0 , x − z0 i ≤ δhT (x), x − z0 i. Noticing that v0 ∈ NC (x), we have

    0 ≥ (1 − δ)hT (x), x − z0 i + hv0 , x − z0 i ≥ (1 − δ)hT (x), x − z0 i ≥ ((1 − δ)/β) kx − z0 k2 .

Then, it follows that x = z0 = PC (x − βT (x)), i.e., x ∈ SV I , a contradiction.
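To make the procedure concrete, the following Python sketch implements Linesearch F for a finite-dimensional problem. It is an illustration only, under stated assumptions: the callables T, proj_C and normal_vector (an oracle returning some element of NC at a point with norm at most M) are hypothetical user-supplied routines and are not part of the thesis; any choice satisfying the requirements above is admissible.

```python
import numpy as np

def linesearch_F(x, u, beta, delta, M, theta, T, proj_C, normal_vector, max_iter=100):
    # Backtrack on alpha until the Armijo-type acceptance test of Linesearch F holds.
    # T, proj_C and normal_vector(point, M) are hypothetical user-supplied oracles.
    alpha = 1.0
    Tx = T(x)
    for _ in range(max_iter):
        # z_alpha depends on alpha only through alpha*u; with u = 0 this projection
        # can be hoisted outside the loop, recovering Strategy (c).
        z = proj_C(x - beta * (Tx + alpha * u))
        p = alpha * z + (1.0 - alpha) * x            # point where T and N_C are evaluated
        v = normal_vector(p, M)                      # some v in N_C(p) with ||v|| <= M
        if np.dot(T(p) + v, x - z) >= delta * np.dot(Tx + alpha * u, x - z):
            return alpha, z, v                       # acceptance test satisfied
        alpha *= theta
    raise RuntimeError("linesearch did not stop; is x already a solution?")
```

By Lemma 2.4.2, the loop terminates after finitely many trials whenever the input point is not already a solution, so the max_iter guard above is only a numerical safeguard.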

2.5 Conceptual algorithm with Linesearch B

In this section, we study Conceptual Algorithm B, in which Linesearch B is used to obtain the stepsizes. From now on, we assume that (A1) and (A2) hold. The next conceptual algorithm is related to Algorithm E under Strategy (b) when non-null normal vectors are used in steps (2.3a)-(2.3c).


Conceptual Algorithm B
Given σ > 0, δ ∈ (0, 1), and M > 0.
Step 0 (Initialization): Take x0 ∈ C and set k ← 0.
Step 1 (Stop Test 1): If xk = PC (xk − T (xk )), i.e., xk ∈ SV I , then stop. Otherwise,
Step 2 (Linesearch B): Take uk ∈ NC (xk ) with kuk k ≤ M and set (αk , z k , v k ) := Linesearch B(xk , uk , σ, δ, M ), i.e., (αk , z k , v k ) satisfy

    v k ∈ NC (z k ) with kv k k ≤ M ;   αk ≤ σ;
    z k = PC (xk − αk (T (xk ) + αk uk ));
    αk kT (z k ) − T (xk ) + αk (v k − uk )k ≤ δkz k − xk k.          (2.24)

Step 3 (Projection Step): Set

    v k := αk v k                                                     (2.25a)

and

    xk+1 := FB (xk ).                                                 (2.25b)

Step 4 (Stop Test 2): If xk+1 = xk , then stop. Otherwise, set k ← k + 1 and go to Step 1.

We consider three variants of this algorithm. Their main difference lies in the computation (2.25b):

    FB.1 (xk ) = PC (PH(zk ,vk ) (xk ));                 (Variant B.1)   (2.26)
    FB.2 (xk ) = PC∩H(zk ,vk ) (xk );                    (Variant B.2)   (2.27)
    FB.3 (xk ) = PC∩H(zk ,vk )∩W (xk ) (x0 ),            (Variant B.3)   (2.28)

where

    H(z, v) := {y ∈ Rn : hT (z) + v, y − zi ≤ 0}                      (2.29a)

and

    W (x) := {y ∈ Rn : hy − x, x0 − xi ≤ 0}.                          (2.29b)

These halfspaces have been widely used in the literature, e.g., [15, 20, 78] and the references therein. Our goal is to analyze the convergence of these variants. First, we show that the conceptual algorithm is well-defined.

Proposition 2.5.1 Assume that (2.25b) is well defined whenever xk is available. Then, Conceptual Algorithm B is also well-defined.

Proof. If the test in Step 1 is not satisfied, then Step 2 is guaranteed by Lemma 2.4.1. Thus, the entire algorithm is well-defined.
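Since H(z, v) is a halfspace, the projection onto it has a closed form, which is all Variant B.1 needs in addition to PC. The sketch below is a minimal illustration of this step; T and proj_C are hypothetical oracles as before, and the code is not taken from the thesis.

```python
import numpy as np

def project_halfspace_H(x, z, g):
    # Projection of x onto H(z, v) = {y : <g, y - z> <= 0}, where g = T(z) + v.
    viol = np.dot(g, x - z)
    if viol <= 0:              # x already belongs to H(z, v)
        return x
    return x - (viol / np.dot(g, g)) * g

def variant_B1_step(x, z, v, T, proj_C):
    # One projection step of Variant B.1: x^{k+1} = P_C(P_{H(z^k, v^k)}(x^k)).
    g = T(z) + v               # v is the (rescaled) normal vector from (2.25a)
    return proj_C(project_halfspace_H(x, z, g))
```

Variants B.2 and B.3 replace the composition of two projections by a single projection onto an intersection; this generally requires solving a small quadratic program, as illustrated later for Variant B.3.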

Next, we present a useful proposition for establishing that the projection step (2.25b) under each variant is well defined.

Proposition 2.5.2 xk ∈ SV I if and only if xk ∈ H(z k , v k ), where z k and v k are given by (2.24) and (2.25a), respectively.

Proof. Suppose that xk ∈/ SV I . Define u¯k = αk uk ∈ NC (xk ) and wk = xk − αk (T (xk ) + u¯k ). Then,

    αk hT (z k ) + v k , xk − z k i = αk hT (z k ) − T (xk ) + v k − u¯k , xk − z k i + αk hT (xk ) + u¯k , xk − z k i
        = αk hT (z k ) − T (xk ) + v k − u¯k , xk − z k i + hxk − wk , xk − z k i
        ≥ −αk kT (z k ) − T (xk ) + v k − u¯k k · kxk − z k k + kxk − z k k2
        ≥ −δkxk − z k k2 + kxk − z k k2 = (1 − δ)kxk − z k k2 > 0,          (2.30)

where we have used Proposition 1.3.9(iii) in the first inequality and Linesearch B in the second one. Thus, xk ∈/ H(z k , v k ). Conversely, if xk ∈ SV I , then xk ∈ H(z k , v k ) by Lemma 1.3.28.

Now, we note a useful algebraic property of the sequence generated by Conceptual Algorithm B, which is a direct consequence of Linesearch B. Let (xk )k∈N , (z k )k∈N and (αk )k∈N be sequences generated by Conceptual Algorithm B. Using (2.30), we get, for all k ∈ N,

    hT (z k ) + v k , xk − z k i ≥ ((1 − δ)/αk ) kxk − z k k2 .             (2.31)

2.5.1 Convergence analysis of Variant B.1

In this section, all results are for Variant B.1, which is summarized below.

Variant B.1
    xk+1 = FB.1 (xk ) = PC (PH(zk ,vk ) (xk )) = PC ( xk − (hT (z k ) + v k , xk − z k i / kT (z k ) + v k k2 ) (T (z k ) + v k ) ),
where
    H(z k , v k ) = {y ∈ Rn : hT (z k ) + v k , y − z k i ≤ 0},
with z k and v k given by (2.24) and (2.25a), respectively.

Proposition 2.5.3 xk+1 = xk if and only if xk ∈ SV I , in which case Variant B.1 stops.

Proof. If xk+1 = PC (PH(zk ,vk ) (xk )) = xk , then Proposition 1.3.9(ii) implies

    hPH(zk ,vk ) (xk ) − xk , z − xk i = hPH(zk ,vk ) (xk ) − xk+1 , z − xk+1 i ≤ 0,      (2.32)

for all z ∈ C. Using again Proposition 1.3.9(ii),

    hPH(zk ,vk ) (xk ) − xk , PH(zk ,vk ) (xk ) − zi ≤ 0,      (2.33)

for all z ∈ H(z k , v k ). Note that C ∩ H(z k , v k ) 6= ∅, because z k belongs to it. So, for any z ∈ C ∩ H(z k , v k ), adding up (2.32) and (2.33) yields kxk − PH(zk ,vk ) (xk )k2 = 0. Hence, xk = PH(zk ,vk ) (xk ), i.e., xk ∈ H(z k , v k ). Thus, xk ∈ SV I by Proposition 2.5.2. Conversely, if xk ∈ SV I , Proposition 2.5.2 implies xk ∈ H(z k , v k ) and, together with (2.26), we get xk = xk+1 .

As a consequence of Proposition 2.5.3, we can assume that Variant B.1 does not stop. Note that, by Lemma 1.3.28, H(z k , v k ) is nonempty for all k ∈ N. So, the projection step (2.26) is well-defined. Thus, Variant B.1 generates an infinite sequence (xk )k∈N such that xk ∈/ SV I for all k ∈ N.

Proposition 2.5.4 The following hold:
(i) The sequence (xk )k∈N is Fejér convergent to SV I .
(ii) The sequence (xk )k∈N is bounded.
(iii) limk→∞ hT (z k ) + v k , xk − z k i = 0.

Proof. (i): Take x∗ ∈ SV I . Note that, by definition, (z k , v k ) ∈ Gph(NC ). Using (2.26), Proposition 1.3.9(i) and Lemma 1.3.28, we have

    kxk+1 − x∗ k2 = kPC (PH(zk ,vk ) (xk )) − PC (PH(zk ,vk ) (x∗ ))k2 ≤ kPH(zk ,vk ) (xk ) − PH(zk ,vk ) (x∗ )k2
                  ≤ kxk − x∗ k2 − kPH(zk ,vk ) (xk ) − xk k2 .          (2.34)

So, kxk+1 − x∗ k ≤ kxk − x∗ k.
(ii): Follows from (i) and Proposition 1.3.5(i).
(iii): Take x∗ ∈ SV I . Since

    PH(zk ,vk ) (xk ) = xk − (hT (z k ) + v k , xk − z k i / kT (z k ) + v k k2 ) (T (z k ) + v k ),

combining it with (2.34) yields

    kxk+1 − x∗ k2 ≤ kxk − x∗ k2 − k(hT (z k ) + v k , xk − z k i / kT (z k ) + v k k2 ) (T (z k ) + v k )k2
                  = kxk − x∗ k2 − (hT (z k ) + v k , xk − z k i)2 / kT (z k ) + v k k2 .

It follows from the last inequality that

    (hT (z k ) + v k , xk − z k i)2 / kT (z k ) + v k k2 ≤ kxk − x∗ k2 − kxk+1 − x∗ k2 .     (2.35)

Since T and the projection are continuous and (xk )k∈N is bounded, (z k )k∈N is bounded. The boundedness of (kT (z k ) + v k k)k∈N then follows from (2.24). Using Proposition 1.3.5(ii), the right-hand side of (2.35) goes to 0 as k goes to ∞. Then, the result follows.

Next, we establish our main convergence result on Variant B.1.

Theorem 2.5.5 The sequence (xk )k∈N converges to a point in SV I .

Proof. We claim that there exists an accumulation point of (xk )k∈N belonging to SV I . The existence of accumulation points of (xk )k∈N follows from Proposition 2.5.4(ii). Let (xik )k∈N be a convergent subsequence of (xk )k∈N such that (uik )k∈N , (v ik )k∈N and (αik )k∈N also converge, and set limk→∞ xik = x̃, limk→∞ uik = ũ, limk→∞ v ik = ṽ and limk→∞ αik = α̃. Using Proposition 2.5.4(iii), (2.31), and taking the limit along the subsequence (ik )k∈N , we have 0 = limk→∞ hT (z ik ) + v ik , xik − z ik i ≥ ((1 − δ)/α̃) limk→∞ kz ik − xik k2 ≥ 0. Then,

    limk→∞ kxik − z ik k = 0.        (2.36)

Now we have two cases:

Case 1: limk→∞ αik = α̃ > 0. We have from (2.24), the continuity of the projection, and (2.36) that x̃ = limk→∞ xik = limk→∞ z ik = PC (x̃ − α̃(T (x̃) + α̃ũ)). Then, as a consequence of Proposition 1.3.25, x̃ ∈ SV I .

Case 2: limk→∞ αik = α̃ = 0. Define α̃k := αk /θ. Hence,

    limk→∞ α̃ik = limk→∞ αik /θ = 0.        (2.37)

Since α̃k does not satisfy the Armijo-type condition in Linesearch B, we have

    kT (z̃ k ) − T (xk ) + α̃k ṽ k − α̃k uk k > δkz̃ k − xk k / α̃k ,        (2.38)

where

    ṽ k ∈ NC (z̃ k ) and z̃ k = PC (xk − α̃k (T (xk ) + α̃k uk )).           (2.39)

The left-hand side of (2.38) goes to 0 along the subsequence (ik )k∈N by the continuity of T and PC . So,

    lim infk→∞ kz̃ k − xk k / α̃k = 0.        (2.40)

By Corollary 1.3.23, with x = xk , α = α̃k and p = T (xk ) + α̃k uk , we have

    (xk − z̃ k )/α̃k ∈ T (xk ) + α̃k uk + NC (z̃ k ).

Taking the limits along the subsequence (ik )k∈N and using (2.37), (2.39), (2.40), the continuity of T and the closedness of Gph(NC ), we get 0 ∈ T (x̃) + NC (x̃); thus, x̃ ∈ SV I .

Figure 2.3: Variant B.1 with and without normal vectors. Figure 2.3 above examines the performance of Variant B.1 for the variational inequality in Example 2.2.1 with and without normal vectors. It shows the first five elements of sequences (y k )k∈N (generated without normal vectors) and (xk )k∈N (generated with non-null normal vectors).

2.5.2 Convergence analysis of Variant B.2

In this section, all results are for Variant B.2, which is summarized below.

Variant B.2
    xk+1 = FB.2 (xk ) = PC∩H(zk ,vk ) (xk ),
where
    H(z k , v k ) = {y ∈ Rn : hT (z k ) + v k , y − z k i ≤ 0},
and z k and v k are given by (2.24) and (2.25a), respectively.

Proposition 2.5.6 xk+1 = xk if and only if xk ∈ SV I , in which case Variant B.2 stops.

Proof. If xk+1 = xk , then xk ∈ C ∩ H(z k , v k ). Thus, xk ∈ SV I by Proposition 2.5.2. Conversely, if xk ∈ SV I , then Proposition 2.5.2 implies xk ∈ H(z k , v k ), and the result follows from (2.27).

We study the case in which Variant B.2 does not stop; thus, it generates an infinite sequence (xk )k∈N .

Proposition 2.5.7 The sequence (xk )k∈N is Fejér convergent to SV I . Moreover, it is bounded and limk→∞ kxk+1 − xk k = 0.

Proof. Take x∗ ∈ SV I . By Lemma 1.3.28, x∗ ∈ H(z k , v k ) for all k ∈ N, and x∗ also belongs to C, implying that the projection step (2.27) is well-defined. Then, using Proposition 1.3.9(i) for the two points xk , x∗ and the set C ∩ H(z k , v k ), we have

    kxk+1 − x∗ k2 ≤ kxk − x∗ k2 − kxk+1 − xk k2 .        (2.41)

So, (xk )k∈N is Fejér convergent to SV I . Hence, by Proposition 1.3.5(i), (xk )k∈N is bounded. Taking the limit in (2.41) and using Proposition 1.3.5(ii), the result follows.

The next proposition shows a relation between the projection steps in Variant B.1 and Variant B.2. This fact has a geometric interpretation: since the projection in Variant B.2 is performed onto a smaller set, it can improve the convergence of Variant B.1.

Proposition 2.5.8 Let (xk )k∈N be the sequence generated by Variant B.2. Then:
(i) xk+1 = PC∩H(zk ,vk ) (PH(zk ,vk ) (xk )).
(ii) limk→∞ hT (z k ) + v k , xk − z k i = 0.

Proof. (i): Since xk ∈ C but xk ∈/ H(z k , v k ) and C ∩ H(z k , v k ) 6= ∅, the result follows from Lemma 1.3.12.
(ii): Take x∗ ∈ SV I . Since xk+1 = PC∩H(zk ,vk ) (xk ) and projections onto convex sets are firmly nonexpansive (see Proposition 1.3.9(i)), we have

    kxk+1 − x∗ k2 ≤ kxk − x∗ k2 − kxk+1 − xk k2 ≤ kxk − x∗ k2 − kPH(zk ,vk ) (xk ) − xk k2 .

The remainder of the proof is analogous to Proposition 2.5.4(iii).


Finally we present the convergence result for Variant B.2. Proposition 2.5.9 The sequence (xk )k∈N converges to a point in SV I . Proof. Similar to the proof of Theorem 2.5.5.


Next, we examine the performance of Variant B.2 for the variational inequality in Example 2.2.1 with and without normal vectors.

Figure 2.4: Variant B.2 with and without normal vectors. Figure 2.4 shows the first five elements of sequences (y k )k∈N (generated without normal vectors) and (xk )k∈N (generated with non-null normal vectors).

2.5.3 Convergence analysis of Variant B.3

In this section, all results are for Variant B.3, which is summarized below.

Variant B.3
    xk+1 = FB.3 (xk ) = PC∩H(zk ,vk )∩W (xk ) (x0 ),
where
    W (xk ) = {y ∈ Rn : hy − xk , x0 − xk i ≤ 0},
    H(z k , v k ) = {y ∈ Rn : hT (z k ) + v k , y − z k i ≤ 0},
and z k and v k are defined by (2.24) and (2.25a), respectively.

Proposition 2.5.10 If xk+1 = xk , then xk ∈ SV I and Variant B.3 stops.

Proof. We have xk+1 = PC∩H(zk ,vk )∩W (xk ) (x0 ) = xk . So, xk ∈ C ∩ H(z k , v k ) ∩ W (xk ) ⊆ H(z k , v k ). Finally, xk ∈ SV I by Proposition 2.5.2.

We now consider the case in which Variant B.3 does not stop. Observe that W (xk ) and H(z k , v k ) are closed halfspaces for each k. Therefore, C ∩ H(z k , v k ) ∩ W (xk ) is a closed convex set.

So, if the set C ∩ H(z k , v k ) ∩ W (xk ) is nonempty, then the next iterate, xk+1 , is well-defined. The following lemma guarantees the non-emptiness of this set. Lemma 2.5.11 For all k ∈ N, we have SV I ⊆ C ∩ H(z k , v k ) ∩ W (xk ). Proof. We proceed by induction. By definition, SV I 6= ∅ and SV I ⊆ C. By Lemma 1.3.28, SV I ⊆ H(z k , v k ), for all k ∈ N. For k = 0, as W (x0 ) = Rn , SV I ⊆ H(z 0 , v 0 ) ∩ W (x0 ). Assume that SV I ⊆ H(z k , v k ) ∩ W (xk ). Then, xk+1 = PC∩H(zk ,vk )∩W (xk ) (x0 ) is well-defined. By Proposition 1.3.9(ii), we obtain hx∗ − xk+1 , x0 − xk+1 i ≤ 0, for all x∗ ∈ SV I . This implies x∗ ∈ W (xk+1 ). Hence, SV I ⊆ H(z k+1 , v k+1 ) ∩ W (xk+1 ). Then, the statement follows by induction.  As in the previous cases, we establish the converse statement of Proposition 2.5.10, which is a direct consequence of Lemma 2.5.11. Corollary 2.5.12 If xk ∈ SV I , then xk+1 = xk , and Variant B.3 stops. Corollary 2.5.13 Variant B.3 is well-defined. Proof. Lemma 2.5.11 shows that C ∩ H(z k , v k ) ∩ W (xk ) is nonempty for all k ∈ N. So, the projection step (2.28) is well-defined. Thus, Variant B.3 is well defined by Proposition 2.5.1.

1 2

 (x0 + x), 12 ρ ∩

Proof. SV I ⊆ H(z k , v k ) ∩ W (xk ) follows from Lemma 2.5.11. Using Lemma 1.3.13, with   S = SV I and x = xk , we have xk ∈ B 12 (x0 + x), 21 ρ , for all k ∈ N. Finally, notice that (xk )k∈N ⊂ C.  Now, we focus on the properties of the accumulation points. Proposition 2.5.15 All accumulation points of (xk )k∈N belong to SV I . Proof. Notice that W (xk ) is a halfspace with normal x0 −xk , we have xk = PW (xk ) (x0 ). Moreover, xk+1 ∈ W (xk ). Thus, by the firm non-expansiveness of PW (xk ) (see Proposition 1.3.9(i), we have kxk+1 − xk k2 ≤ kxk+1 − x0 k2 − kxk − x0 k2 . So, the sequence (kxk − x0 k)k∈N is monotone and nondecreasing. In addition, (kxk − x0 k)k∈N is bounded by Lemma 2.5.14. Thus, (xk )k∈N converges. Hence, lim kxk+1 − xk k = 0. (2.42) k→∞

36

Since xk+1 ∈ H(z k , v k ), we get hT (z k ) + v k , xk+1 − z k i ≤ 0, where v k and z k are defined by (2.25a) and (2.24), respectively. Combining the above inequality with (2.31), we get



1−δ k k 2 0 ≥ hT (z k )+v k , xk+1 −xk i+ T (z k )+v k , xk −z k ≥ −kT (z k )+v k k·kxk+1 −xk k+ kx −z k . αk

After some simple algebra and using (2.24),

kxk − z k k2 ≤

σ kT (z k ) + v k k · kxk+1 − xk k. 1−δ

(2.43)

Choosing a subsequence (ik )k∈N such that, the subsequences (αik )k∈N , (xik )k∈N and (v ik )k∈N converge to α ˜ , x˜ and v˜, respectively. This is possible by the boundedness of (v k )k∈N and (xk )k∈N . Taking the limits in (2.43) and using (2.42), we obtain limk→∞ kxik − z ik k2 = 0. and as consequence x˜ = limk→∞ z ik . Now we consider two cases: Case 1: limk→∞ αik = α ˜ > 0. By (2.24) and the continuity of the projection, x˜ =  ik limk→∞ z = PC x˜ − α ˜ (T (˜ x) + α ˜ u˜) and hence by Proposition 1.3.25, x˜ ∈ SV I . Case 2: limk→∞ αik = α ˜ = 0. Then, limk→∞ to the proof of Theorem 2.5.5.

αi k θ

= 0. The rest part of this case is analogous 

Finally, we are ready to prove the convergence of the sequence (xk )k∈N generated by Variant B.3, to the solution closest to x0 .

Theorem 2.5.16 Define x = PSV I (x0 ). Then, (xk )k∈N converges to x.

Proof. First, note that x is well-defined by Lemma 1.3.29. It follows from Lemma 2.5.14 that (xk )k∈N ⊂ B((x0 + x)/2, ρ/2) ∩ C, so it is bounded. Let (xik )k∈N be a convergent subsequence of (xk )k∈N , and let x̂ be its limit. Thus, x̂ ∈ B((x0 + x)/2, ρ/2) ∩ C. Furthermore, by Proposition 2.5.15, x̂ ∈ SV I . So, x̂ ∈ SV I ∩ B((x0 + x)/2, ρ/2) = {x}, implying x̂ = x, i.e., x is the unique limit point of (xk )k∈N . Hence, (xk )k∈N converges to x ∈ SV I .

Figure 2.5: Variant B.3 with and without normal vectors.

Figure 2.5 above shows the performance of the first five elements of the sequence generated by Variant B.3 in Example 2.2.1 with and without normal vectors. Recall that (y k )k∈N is the sequence generated without normal vectors and (xk )k∈N is the one generated with non-null normal vectors.
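Unlike Variants B.1 and B.2, the projection in Variant B.3 is onto the intersection of C with two halfspaces, which in general must be computed by solving a small quadratic program. The sketch below is one possible numerical realization using the cvxpy modeling package; the choice of C as a Euclidean ball and the oracle names are assumptions made purely for illustration, not part of the thesis.

```python
import numpy as np
import cvxpy as cp

def variant_B3_step(x0, xk, zk, gk, radius=1.0):
    # Sketch of x^{k+1} = P_{C ∩ H(z^k,v^k) ∩ W(x^k)}(x^0), with C a ball of given radius.
    # gk = T(z^k) + v^k is the normal vector defining the halfspace H(z^k, v^k).
    y = cp.Variable(x0.size)
    constraints = [
        cp.norm(y, 2) <= radius,        # y in C (illustrative choice of C)
        gk @ (y - zk) <= 0,             # y in H(z^k, v^k)
        (x0 - xk) @ (y - xk) <= 0,      # y in W(x^k)
    ]
    cp.Problem(cp.Minimize(cp.sum_squares(y - x0)), constraints).solve()
    return y.value
```

By Lemma 2.5.11 the feasible set of this program contains SV I and is therefore nonempty, so the solver always returns a point; the anchor x0 is the fixed initial iterate, in contrast to Variants B.1 and B.2, which project the current iterate xk.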

2.6 Conceptual algorithm with Linesearch F

We continue by presenting our second conceptual algorithm for solving (2.1) using Linesearch F.


Conceptual Algorithm F Given (βk )k∈N ⊂ [β̌, β̂], 0 < β̌ ≤ β̂ < +∞, δ ∈ (0, 1), and M > 0. Step 0 (Initialization): Take x0 ∈ C and set k ← 0. Step 1 (Stop Test 1): If xk = PC (xk − T (xk )), then stop. Otherwise, Step 2 (Linesearch F): Take uk ∈ NC (xk ) with kuk k ≤ M and set (αk , z k , v k ) = Linesearch F(xk , uk , βk , δ, M ),

(2.44)

 k k k v k k ≤ M ; αk ≤ 1;   v¯ ∈ NC (αk z + (1 − αk )x ) with k¯ z k = PC (xk − βk (T (xk ) + αk uk ));   hT (αk z k + (1 − αk )xk ) + v¯k , xk − z k i ≥ δhT (xk ) + αk uk , xk − z k i.

(2.45)

i.e., (αk , z k , v¯k ) satisfy

Step 3 (Projection Step): Set xk := αk z k + (1 − αk )xk ,

(2.46a)

and xk+1 := FF (xk ).

(2.46b)

Step 4(Stop Test 2): If xk+1 = xk , then stop. Otherwise, set k ← k + 1 and go to Step 1. Also, we consider three different projection steps (called Variants F.1, F.2 and F.3) which are analogous to Conceptual Algorithm B. These variants of Conceptual Algorithm F have their main difference in the projection step given in (2.46b).  (Variant F.1) (2.47) FF.1 (xk ) =PC PH(xk ,vk ) (xk ) ; FF.2 (xk ) =PC∩H(xk ,vk ) (xk ); k

0

FF.3 (x ) =PC∩H(xk ,vk )∩W (xk ) (x ),

(Variant F.2)

(2.48)

(Variant F.3)

(2.49)

where H(x, v) and W (x) are defined by (2.29). Now, we analyze some general properties of Conceptual Algorithm F. Proposition 2.6.1 Assume that (2.46b) is well defined whenever xk is available. Then, Conceptual Algorithm F is well-defined. Proof. If the test in Step 1 is not satisfied, then Step 2 is guaranteed by Lemma 2.4.2. Thus, the entire algorithm is well-defined.  Proposition 2.6.2 xk ∈ H(x̄k , v k ), where x̄k = αk z k + (1 − αk )xk and v k are as in (2.46a) and (2.44) respectively, if and only if xk ∈ SV I .

Proof. Suppose that xk ∈ H(x̄k , v k ), that is, hT (x̄k ) + v k , xk − x̄k i ≤ 0. Using the definition of αk in Algorithm F, (2.45) and (2.46a), we have

    0 ≥ hT (x̄k ) + v k , xk − x̄k i = αk hT (x̄k ) + v k , xk − z k i ≥ αk δhT (xk ) + αk uk , xk − z k i
      ≥ (αk δ/βk ) kxk − z k k2 ≥ (αk δ/β̂) kxk − z k k2 ,           (2.50)

implying that xk = z k . Then, from (2.45) and Proposition 1.3.25, xk ∈ SV I . Conversely, if xk ∈ SV I , then xk ∈ H(x̄k , v k ) by Lemma 1.3.28.

Next, we note a useful algebraic property of the sequence generated by Conceptual Algorithm F, which is a direct consequence of the linesearch in Algorithm F. Let (xk )k∈N and (αk )k∈N be sequences generated by Conceptual Algorithm F. Using (2.50), we get, for all k ∈ N,

    hT (x̄k ) + v k , xk − x̄k i ≥ (αk δ/β̂) kxk − z k k2 .          (2.51)

2.6.1 Convergence analysis of Variant F.1

In this subsection, all results are for Variant F.1, which is summarized below.

Variant F.1
    xk+1 = FF.1 (xk ) = PC (PH(x̄k ,vk ) (xk )) = PC ( xk − (hT (x̄k ) + v k , xk − x̄k i / kT (x̄k ) + v k k2 ) (T (x̄k ) + v k ) ),
where
    H(x̄k , v k ) = {y ∈ Rn : hT (x̄k ) + v k , y − x̄k i ≤ 0},
with x̄k and v k given by (2.46a) and (2.44), respectively.

Proposition 2.6.3 xk+1 = xk if and only if xk ∈ SV I , in which case Variant F.1 stops.

Proof. If xk+1 = PC (PH(x̄k ,vk ) (xk )) = xk , then Proposition 1.3.9(ii) implies

    hPH(x̄k ,vk ) (xk ) − xk , z − xk i ≤ 0,

(2.52)

for all z ∈ C. Again, using Proposition 1.3.9(ii), hPH(xk ,vk ) (xk ) − xk , PH(xk ,vk ) (xk ) − zi ≤ 0,

(2.53)

for all z ∈ H(x̄k , v k ). Note that C ∩ H(x̄k , v k ) 6= ∅. So, for any z ∈ C ∩ H(x̄k , v k ), adding up (2.52) and (2.53) yields kxk − PH(x̄k ,vk ) (xk )k2 = 0. Hence, xk = PH(x̄k ,vk ) (xk ), i.e., xk ∈ H(x̄k , v k ). Finally, we have xk ∈ SV I by Proposition 2.6.2. Conversely, if xk ∈ SV I , Proposition 2.6.2 implies xk ∈ H(x̄k , v k ) and, together with (2.47), we get xk = xk+1 .

From now on, we assume that Variant F.1 does not stop. Note that, by Lemma 1.3.28, H(x̄k , v k ) is nonempty for all k ∈ N. Then, the projection step (2.47) is well-defined. Thus, Variant F.1 generates an infinite sequence (xk )k∈N such that xk ∈/ SV I for all k ∈ N.

Proposition 2.6.4 The following hold:
(i) The sequence (xk )k∈N is Fejér convergent to SV I .
(ii) The sequence (xk )k∈N is bounded.
(iii) limk→∞ hT (x̄k ) + v k , xk − x̄k i = 0.

Proof. (i): Take x∗ ∈ SV I . Note that, by definition, (x̄k , v k ) ∈ Gph(NC ). Using (2.47), Proposition 1.3.9(i) and Lemma 1.3.28, we have

    kxk+1 − x∗ k2 = kPC (PH(x̄k ,vk ) (xk )) − PC (PH(x̄k ,vk ) (x∗ ))k2 ≤ kPH(x̄k ,vk ) (xk ) − PH(x̄k ,vk ) (x∗ )k2
                  ≤ kxk − x∗ k2 − kPH(x̄k ,vk ) (xk ) − xk k2 .          (2.54)

So, kxk+1 − x∗ k ≤ kxk − x∗ k.
(ii): Follows immediately from the previous item and Proposition 1.3.5(i).
(iii): Take x∗ ∈ SV I . Using (2.46a) and

    PH(x̄k ,vk ) (xk ) = xk − (hT (x̄k ) + v k , xk − x̄k i / kT (x̄k ) + v k k2 ) (T (x̄k ) + v k ),

and combining with (2.54), yields

    kxk+1 − x∗ k2 ≤ kxk − x∗ k2 − k(hT (x̄k ) + v k , xk − x̄k i / kT (x̄k ) + v k k2 ) (T (x̄k ) + v k )k2
                  = kxk − x∗ k2 − (hT (x̄k ) + v k , xk − x̄k i)2 / kT (x̄k ) + v k k2 .

It follows that

    (hT (x̄k ) + v k , xk − x̄k i)2 / kT (x̄k ) + v k k2 ≤ kxk − x∗ k2 − kxk+1 − x∗ k2 .

Since T is continuous and (xk )k∈N , (z k )k∈N and (x̄k )k∈N are bounded, the sequence (kT (x̄k ) + v k k)k∈N is bounded. Using Proposition 1.3.5(ii), the right-hand side of the above inequality goes to 0 as k goes to ∞, and the desired result follows.

Next, we establish our main convergence result on Variant F.1.

Theorem 2.6.5 The sequence (xk )k∈N converges to a point in SV I .

Proof. We claim that there exists an accumulation point of (xk )k∈N belonging to SV I . The existence of accumulation points follows from Proposition 2.6.4(ii). Let (xik )k∈N be a convergent subsequence of (xk )k∈N such that (x̄ik )k∈N , (v ik )k∈N , (uik )k∈N , (αik )k∈N and (βik )k∈N also converge, and set limk→∞ xik = x̃, limk→∞ uik = ũ, limk→∞ αik = α̃ and limk→∞ βik = β̃. Using Proposition 2.6.4(iii) and passing to the limit in (2.51) over the subsequence (ik )k∈N , we have 0 = limk→∞ hT (x̄ik ) + v ik , xik − x̄ik i ≥ limk→∞ (αik δ/β̂) kxik − z ik k2 ≥ 0. Therefore,

    limk→∞ αik kxik − z ik k = 0.        (2.55)

Now we consider two cases.

Case 1: limk→∞ αik = α̃ > 0. In view of (2.55), limk→∞ kxik − z ik k = 0. The continuity of T and of the projection then implies x̃ = limk→∞ xik = limk→∞ z ik = PC (x̃ − β̃(T (x̃) + α̃ũ)), and Proposition 1.3.25 implies that x̃ ∈ SV I .

Case 2: limk→∞ αik = α̃ = 0. Define α̃k := αk /θ. Then,

    limk→∞ α̃ik = 0.        (2.56)

Define ỹ k := α̃k z̃ k + (1 − α̃k )xk , where z̃ k = PC (xk − βk (T (xk ) + α̃k uk )), as in (2.45). Hence,

    limk→∞ ỹ ik = x̃.        (2.57)

From the definition of αk in Algorithm F, ỹ k does not satisfy the acceptance condition, i.e.,

    hT (ỹ k ) + ṽ k , xk − z̃ k i < δhT (xk ) + α̃k uk , xk − z̃ k i,        (2.58)

for ṽ k ∈ NC (ỹ k ) and all k. Taking a subsequence (ik )k∈N and relabeling if necessary, we may assume that (ṽ ik )k∈N converges to some ṽ, which belongs to NC (x̃) by Fact 1.3.21. Using (2.45) and (2.56), limk→∞ z̃ ik = z̃ = PC (x̃ − β̃T (x̃)). By passing to the limit in (2.58) over the subsequence (ik )k∈N and using (2.57), we get hT (x̃) + ṽ, x̃ − z̃i ≤ δhT (x̃), x̃ − z̃i. Then,

    0 ≥ (1 − δ)hT (x̃), x̃ − z̃i + hṽ, x̃ − z̃i ≥ (1 − δ)hT (x̃), x̃ − z̃i
      = ((1 − δ)/β̃) hx̃ − (x̃ − β̃T (x̃)), x̃ − z̃i ≥ ((1 − δ)/β̃) kx̃ − z̃k2 ≥ ((1 − δ)/β̂) kx̃ − z̃k2 ≥ 0.

This means x̃ = z̃, which implies x̃ ∈ SV I .

Figure 2.6: Variant F.1 with and without normal vectors. Figure 2.6 examines the performance of Variant F.1 for the variational inequality in Example 2.2.1 with and without normal vectors. Figure 2.6 shows the first five elements of sequences (y k )k∈N (generated without normal vectors) and (xk )k∈N (generated with non-null normal vectors).
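The only structural difference between Variant F.1 and Variant B.1 is where the separating halfspace is anchored: in Variant F.1 it is anchored at the convex combination x̄k = αk z k + (1 − αk )xk produced by Linesearch F. The following short sketch, using the same hypothetical oracles T and proj_C as before, makes this explicit; it is illustrative only.

```python
import numpy as np

def variant_F1_step(x, z, alpha, v, T, proj_C):
    # One projection step of Variant F.1: x^{k+1} = P_C(P_{H(x_bar, v)}(x)),
    # where the halfspace is anchored at x_bar = alpha*z + (1 - alpha)*x.
    x_bar = alpha * z + (1.0 - alpha) * x
    g = T(x_bar) + v                               # v lies in N_C(x_bar), from Linesearch F
    viol = np.dot(g, x - x_bar)
    half = x if viol <= 0 else x - (viol / np.dot(g, g)) * g   # P_{H(x_bar, v)}(x)
    return proj_C(half)
```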

2.6.2 Convergence analysis of Variant F.2

In this section, all results are for Variant F.2, which is summarized below.

Variant F.2
    xk+1 = FF.2 (xk ) = PC∩H(x̄k ,vk ) (xk ),
where
    H(x̄k , v k ) = {y ∈ Rn : hT (x̄k ) + v k , y − x̄k i ≤ 0},
with x̄k and v k given by (2.46a) and (2.44), respectively.

Proposition 2.6.6 xk+1 = xk if and only if xk ∈ SV I , in which case Variant F.2 stops.

Proof. If xk+1 = PC∩H(x̄k ,vk ) (xk ) = xk , then xk ∈ C ∩ H(x̄k , v k ). Hence, xk ∈ SV I by Proposition 2.6.2. Conversely, if xk ∈ SV I , Proposition 2.6.2 implies xk ∈ H(x̄k , v k ) and, together with (2.48), we get xk = xk+1 .

From now on, we assume that Variant F.2 does not stop.

Proposition 2.6.7 The sequence (xk )k∈N is Fejér convergent to SV I . Moreover, it is bounded and limk→∞ kxk+1 − xk k = 0.

Proof. Take x∗ ∈ SV I . By Lemma 1.3.28, x∗ ∈ H(xk , v k ), for all k ∈ N and also x∗ belongs to C, so, the projection step (2.48) is well-defined. Then, using Proposition 1.3.9(i) for the projection operator PH(xk ,vk ) , we obtain kxk+1 − x∗ k2 ≤ kxk − x∗ k2 − kxk+1 − xk k2 .

(2.59)

The above inequality implies that (xk )k∈N is Fejér convergent to SV I . Hence, by Proposition 1.3.5(i)&(ii), (xk )k∈N is bounded and thus (kxk − x∗ k)k∈N is a convergent sequence. By passing to the limit in (2.59) and using Proposition 1.3.5(ii), we get limk→∞ kxk+1 − xk k = 0.

The next proposition shows a relation between the projection steps in Variant F.1 and Variant F.2. This fact has a geometric interpretation: since the projection in Variant F.2 is performed onto a smaller set, it may improve the convergence behaviour of the sequence generated by Variant F.1.

Proposition 2.6.8 Let (xk )k∈N be the sequence generated by Variant F.2. Then,
(i) xk+1 = PC∩H(x̄k ,vk ) (PH(x̄k ,vk ) (xk )).
(ii) limk→∞ hT (x̄k ) + v k , xk − x̄k i = 0.

Proof. (i): Since xk ∈ C but xk ∈/ H(x̄k , v k ) and C ∩ H(x̄k , v k ) 6= ∅, the result follows from Lemma 1.3.12.
(ii): Take x∗ ∈ SV I . Since xk+1 = PC∩H(x̄k ,vk ) (xk ) and projections onto convex sets are firmly nonexpansive (see Proposition 1.3.9(i)), we have

    kxk+1 − x∗ k2 ≤ kxk − x∗ k2 − kxk+1 − xk k2 ≤ kxk − x∗ k2 − kPH(x̄k ,vk ) (xk ) − xk k2 .

The rest of the proof is analogous to Proposition 2.6.4(iii).



Proposition 2.6.9 The sequence (xk )k∈N converges to a point in SV I . Proof. Similar to the proof of Theorem 2.6.5.



Next, we examine the performance of Variant F.2 for the variational inequality in Example 2.2.1 with and without normal vectors. Figure 2.7 shows the first five elements of the sequences (y k )k∈N (generated without normal vectors) and (xk )k∈N (generated with non-null normal vectors).

Figure 2.7: Variant F.2 with and without normal vectors.

2.6.3 Convergence analysis of Variant F.3

In this section, all results are for Variant F.3, which is summarized below.

Variant F.3
    xk+1 = FF.3 (xk ) = PC∩H(x̄k ,vk )∩W (xk ) (x0 ),
where
    W (xk ) = {y ∈ Rn : hy − xk , x0 − xk i ≤ 0},
    H(x̄k , v k ) = {y ∈ Rn : hT (x̄k ) + v k , y − x̄k i ≤ 0},
with x̄k and v k given by (2.46a) and (2.44), respectively.

Proposition 2.6.10 If xk+1 = xk , then xk ∈ SV I and Variant F.3 stops.

Proof. We have xk+1 = PC∩H(x̄k ,vk )∩W (xk ) (x0 ) = xk . So, xk ∈ C ∩ H(x̄k , v k ) ∩ W (xk ) ⊆ H(x̄k , v k ). Thus, xk ∈ SV I by Proposition 2.6.2.

From now on, we assume that Variant F.3 does not stop. Observe that, by virtue of their definitions, W (xk ) and H(x̄k , v k ) are closed halfspaces for each k. Therefore, C ∩ H(x̄k , v k ) ∩ W (xk ) is a closed convex set. So, if C ∩ H(x̄k , v k ) ∩ W (xk ) is nonempty, then the next iterate, xk+1 , is well-defined. The following lemma guarantees this fact; its proof is very similar to that of Lemma 2.5.11.

Lemma 2.6.11 For all k ∈ N, we have SV I ⊂ C ∩ H(xk , v k ) ∩ W (xk ). Proof. We proceed by induction. By definition, SV I 6= ∅ and SV I ⊆ C. By Lemma 1.3.28, SV I ⊆ H(xk , v k ), for all k ∈ N. For k = 0, as W (x0 ) = Rn , SV I ⊆ H(x0 , v 0 ) ∩ W (x0 ). Assume that SV I ⊆ H(x` , v ` ) ∩ W (x` ), for ` ≤ k. Henceforth, xk+1 = PC∩H(xk ,vk )∩W (xk ) (x0 ) is well-defined. Then, by Proposition 1.3.9(ii), we have hx∗ − xk+1 , x0 − xk+1 i ≤ 0, for all x∗ ∈ SV I . This implies x∗ ∈ W (xk+1 ), and hence, SV I ⊆ H(xk+1 , v k+1 ) ∩ W (xk+1 ). Then, the result follows by induction.  The above lemma shows that the set C ∩H(xk , v k )∩W (xk ) is nonempty and as consequence the projection step, given in (2.49), is well-defined. Before proving the convergence of the sequence, we study its boundedness. The next lemma shows that the sequence remains in a ball determined by the initial point. As previous variants we establish the converse statement of Proposition 2.6.10, which is a direct consequence of Lemma 2.6.11. Corollary 2.6.12 If xk ∈ SV I , then xk+1 = xk , and Variant F.3 stops. Lemma 2.6.13 The sequence (xk )k∈N is bounded. Furthermore, (xk )k∈N ⊂ B C, where x = PSV I (x0 ) and ρ = dist(x0 , SV I ).

1 2

 (x0 + x), 12 ρ ∩

Proof. It follows from Lemma 2.6.11 that SV I ⊆ H(xk , v k ) ∩ W (xk ), for all k ∈ N. The proof now follows by repeating the proof of Lemma 2.5.14.  Finally we prove the convergence of the sequence generated by Variant F.3 to the solution closest to x0 . Theorem 2.6.14 Define x = PSV I (x0 ). Then, (xk )k∈N converges to x. Proof. First we prove the optimality of the all accumulation points of (xk )k∈N . Notice that W (xk ) is a halfspace with normal x0 −xk , we have xk = PW (xk ) (x0 ). Moreover, xk+1 ∈ W (xk ). So, by the firm nonexpansiveness of PW (xk ) , we have 0 ≤ kxk+1 − xk k2 ≤ kxk+1 − x0 k2 − kxk − x0 k2 , which implies that the sequence (kxk − x0 k)k∈N is monotone and nondecreasing. From Lemma 2.6.13, we have that (kxk − x0 k)k∈N is bounded, thus, convergent. So, lim kxk+1 − xk k = 0.

(2.60)

hT (xk ) + v k , xk+1 − xk i ≤ 0,

(2.61)

k→∞

Since xk+1 ∈ H(xk , v k ), we get

with v k and xk given by (2.44) and (2.46a). Substituting (2.46a) into (2.61), we have hT (xk )+

v k , xk+1 − xk i + αk T (xk ) + v k , xk − z k ≤ 0. Combining the above inequality with (2.45), 46

we get

hT (xk ) + v k , xk+1 − xk i + αk δhT (xk ) + αk uk , xk − z k i ≤ 0.

(2.62)

Combining the above inequality with (2.45) and using Proposition 1.3.9(iii), we can check that βk hT (xk ) + αk uk , xk − z k i ≥ kxk − z k k2 . Thus, after use the last inequality and the Cauchy-Schwartz inequality in (2.62), we get

αk δkxk − z k k2 ≤ kT (xk ) + v k k · kxk+1 − xk k. βk

(2.63)

Choosing a subsequence (ik )k∈N such that, the subsequences (αik )k∈N , (uik )k∈N , (βik )k∈N , ˜ x˜ and v˜ respectively (this is possible by the ˜ , u˜, β, (xik )k∈N and (v ik )k∈N converge to α boundedness of all of these sequences) and taking limit in (2.63) along the subsequence (ik )k∈N , we get, from (2.60),

lim αik kxik − z ik k2 = 0.

k→∞

(2.64)

Now we consider two cases, Case 1: limk→∞ αik = α ˜ > 0. By (2.64), limk→∞ kxik − z ik k2 = 0. By continuity of the  ˜ (˜ projection, we have x˜ = PC x˜ − β(T x) + α ˜ u˜) . So, x˜ ∈ SV I by Proposition 1.3.25. Case 2: limk→∞ αik = 0. Then, limk→∞ 2.5.5.

αi k θ

= 0. The rest is similar to the proof of Theorem

Thus, all accumulation points of (xk )k∈N are in SV I . The rest of the proof follows as in Theorem 2.5.16.  Next, we examine the performance of Variant F.3 for the variational inequality in Example 2.2.1 with and without normal vectors. Figure 2.8 shows the first five elements of the sequences (y k )k∈N (generated without normal vectors) and (xk )k∈N (generated with non-null normal vectors).

Figure 2.8: Variant F.3 with and without normal vectors.

2.7 Final remarks

In this chapter we have proposed two conceptual conditional extragradient algorithms generalizing classical extragradient algorithms for solving constrained variational inequality problems. The main idea comes from the (sub)gradient algorithms, where non-null normal vectors to the feasible set improve the convergence by avoiding zigzagging. The scheme proposed here contains two main parts: (1) Two different linesearches are analyzed. The linesearches allow us to find suitable halfspaces containing the solution set of the problem using non-null normal vectors of the feasible set. It is well known in the literature that such procedures are very effective in the absence of Lipschitz continuity, and they use more information available at each iteration, allowing long steplengths. (2) Several projection steps are performed, which yield different and interesting features extending several known projection algorithms. Furthermore, the convergence analysis of both conceptual algorithms was established assuming only existence of solutions, continuity and a condition weaker than pseudomonotonicity on the operator, with examples showing that non-null vectors in the normal cone can achieve better performance.

We hope that this study will serve as a basis for future research on other, more efficient variants, as well as on sophisticated linesearches permitting an optimal choice of the vectors in the normal cone of the feasible set. Several of the ideas of this chapter merit further investigation, some of which will be presented in future work. In particular, we discuss in a separate paper variants of the projection algorithms proposed in [21] for solving nonsmooth variational inequalities. The difficulties of extending the present results to point-to-set operators are non-trivial; the main obstacle lies in the impossibility of using linesearches or separating techniques, which were used extensively in this chapter. To our knowledge, variants of the linesearches for variational inequalities require smoothness of T ; even for nonlinear convex optimization problems (T = ∂f ), a linesearch is not possible, because negative subgradients are not always descent directions. Actually, few explicit methods have been proposed in the literature for solving nonsmooth monotone variational inequality problems; examples of such methods appear in [31, 49]. Future work will address further investigation of the modified forward-backward splitting iteration for inclusion problems [19, 20, 83], exploiting the additive structure of the main operator and adding dynamic choices of the stepsizes with conditional and deflected techniques [23, 35, 63]. Our future studies will also focus on building more realistic examples, in higher dimensions and possibly nonlinear, which will be important to corroborate the superiority of choosing nonzero normal vectors.


Chapter 3
A variant of forward-backward splitting method for the sum of two monotone operators

3.1 Introduction

The results presented in this chapter are based on reference [20]. Here, we propose variants of the forward-backward splitting method for finding a zero of the sum of two operators, when the Hilbert space H has finite dimension, i.e., H = Rn . A classical modification of the forward-backward method was proposed by Tseng; it is known to converge when the forward and the backward operators are monotone and the forward operator is Lipschitz continuous, or continuous on the whole space. The conceptual algorithm proposed here improves Tseng's method in some instances. The first and main part of our approach contains an explicit linesearch in the spirit of the extragradient-like methods for variational inequalities. At each trial of the stepsize, the search performs only one evaluation of the forward-backward operator, which achieves considerable computational savings when this operator is expensive. The second part of the scheme consists of three special projection steps. The convergence analysis of the proposed scheme assumes only monotonicity of both operators, without requiring Lipschitz continuity of the forward operator. The proposed scheme is also of interest for solving the well-studied problem of minimizing the sum of two convex functions, i.e., min f1 (x) + f2 (x) s.t. x ∈ Rn , or equivalently, finding x ∈ Rn such that 0 ∈ A(x) + B(x) where A = ∇f1 and B = ∂f2 . This problem has a wide range of important applications in engineering and in inverse problems, and many methods have been proposed to solve it (see, e.g., [34] and the references therein).
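For this minimization instance, the forward-backward operator J(x, β) = (I + βB)−1 (I − βA)(x) takes a familiar explicit form: the forward step is a gradient step on f1 and the resolvent of B = ∂(λk·k1 ) is the soft-thresholding map. The sketch below illustrates only this operator with a plain fixed-stepsize iteration for contrast; the data (Q, c, λ) are invented for illustration and the chapter's actual algorithm replaces the fixed stepsize by the linesearch and projection steps presented in the following sections.

```python
import numpy as np

def soft_threshold(x, t):
    # Resolvent (proximal map) of B = ∂(lam*||.||_1) with parameter t = beta*lam.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def forward_backward_J(x, beta, Q, c, lam):
    # J(x, beta) = (I + beta*B)^{-1}(I - beta*A)(x) for A = grad f1, f1(x) = 0.5 x'Qx + c'x.
    grad = Q @ x + c                              # forward (explicit) step with A = ∇f1
    return soft_threshold(x - beta * grad, beta * lam)

# Illustrative data (assumed, not from the thesis): a small strongly convex quadratic.
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 5))
Q, c, lam = M.T @ M + np.eye(5), rng.standard_normal(5), 0.1
x = np.zeros(5)
for _ in range(200):                              # plain fixed-stepsize forward-backward, for contrast
    x = forward_backward_J(x, beta=1.0 / np.linalg.norm(Q, 2), Q=Q, c=c, lam=lam)
```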

3.2 The Conceptual Algorithm

Let A : dom(A) ⊆ Rn → Rn and B : dom(B) ⊆ Rn → 2Rn be two maximal monotone operators, with A point-to-point and B point-to-set. Assume that dom(B) ⊆ dom(A). Choose any nonempty, closed and convex set X ⊆ dom(B) satisfying X ∩ SI 6= ∅. Thus, from now on, the solution set SI is nonempty. We also assume that the operator B satisfies the following: for each bounded subset V of dom(B), there exists R > 0 such that B(x) ∩ B[0, R] 6= ∅ for all x ∈ V . We emphasize that this assumption holds trivially if dom(B) = Rn , or V ⊂ int(dom(B)), or B is the normal cone to any subset of dom(B) (see Lemma 1.3.15(ii)).

3.2.1 The Linesearch D

In this section we present a linesearch which will be used in the conceptual algorithm; it is related to Strategy (c) in Algorithm E. The idea here is to use both operators A and B in the linesearch. If B is the normal cone to C and uα is taken as zero, then we obtain Strategy (c).

Linesearch D (Linesearch along the feasible direction with norm)
Input: (x, β, δ, θ, R), where x ∈ X, β > 0, δ, θ ∈ (0, 1) and R > 0.
Set α ← 1. Define

    J(x, β) := (I + βB)−1 (I − βA)(x)                                        (3.1)

and choose u1 ∈ B(J(x, β)) with ku1 k ≤ R.
While hA(αJ(x, β) + (1 − α)x) + uα , x − J(x, β)i < (δ/β) kx − J(x, β)k2 do
    α ← θα and choose any uα ∈ B(αJ(x, β) + (1 − α)x) with kuα k ≤ R.
End While
Output: (α, x̄, ū), with x̄ := αJ(x, β) + (1 − α)x and ū := uα .

In the following we prove that Linesearch D is well defined.

Lemma 3.2.1 If x ∈ X and x ∈/ SI , then Linesearch D stops after finitely many steps.

Proof. Suppose on the contrary that Linesearch D does not stop. Then, for all α ∈ Q := {1, θ, θ2 , . . .}, the chosen vectors uα ∈ B(αJ(x, β) + (1 − α)x) ∩ B[0, R] satisfy

    hA(αJ(x, β) + (1 − α)x) + uα , x − J(x, β)i < (δ/β) kx − J(x, β)k2 .     (3.2)

Since the sequence (uα )α∈Q is bounded, without loss of generality, we can assume that it converges to some u ∈ B(x), by maximality of B. Taking limits in (3.2),

    hβA(x) + βu, x − J(x, β)i ≤ δkx − J(x, β)k2 .                            (3.3)

It follows from (3.1) that βA(x) = x − J(x, β) − βv, for some v ∈ B(J(x, β)). Now, this equality together with (3.3) and the monotonicity of B lead to

    kx − J(x, β)k2 ≤ hx − J(x, β) − βv + βu, x − J(x, β)i ≤ δkx − J(x, β)k2 .

So, (1 − δ)kx − J(x, β)k2 ≤ 0, that is, x = J(x, β), and hence x ∈ SI by Proposition 1.3.33, a contradiction. Thus, Linesearch D is well-defined.

3.2.2 The Conceptual Algorithm FB

Let (βk )k∈N be a sequence such that (βk )k∈N ⊂ [β̌, β̂] with 0 < β̌ ≤ β̂ < ∞, and let θ, δ ∈ (0, 1). The algorithm is defined as follows:

Conceptual Algorithm FB
Step 0 (Initialization): Take x0 ∈ X.
Step 1 (Iterative Step 1): Given xk and βk , compute J(xk , βk ) := (I + βk B)−1 (I − βk A)(xk ).
Step 2 (Stopping Test 1): If xk = J(xk , βk ), then stop.
Step 3 (Linesearch D):

    (αk , x̄k , ūk ) = Linesearch D(xk , βk , δ, θ, R).              (3.4)

Step 4 (Iterative Step 2):

    xk+1 := FFB (xk ).                                              (3.5)

Step 5 (Stopping Test 2): If xk+1 = xk , then stop. Otherwise, set k ← k + 1 and go to Step 1.

Now we consider three variants of this conceptual algorithm. The difference between them is given by the definition of the procedure FFB in (3.5):

    FFB.1 (xk ) = PX (PH(x̄k ,ūk ) (xk ));                  (Variant FB.1)   (3.6)
    FFB.2 (xk ) = PX∩H(x̄k ,ūk ) (xk );                     (Variant FB.2)   (3.7)
    FFB.3 (xk ) = PX∩H(x̄k ,ūk )∩W (xk ) (x0 ),             (Variant FB.3)   (3.8)

where

    H(x, u) := {y ∈ Rn : hA(x) + u, y − xi ≤ 0}            (3.9)

and

    W (x) := {y ∈ Rn : hy − x, x0 − xi ≤ 0}.               (3.10)

These halfspaces have been used in several works; see [15, 78].
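The following Python sketch assembles one pass of Conceptual Algorithm FB with Variant FB.1 for the particular case B = NC , in which the resolvent reduces to J(x, β) = PC (x − βA(x)). It is a sketch under stated assumptions: A, proj_C and normal_vector are hypothetical oracles, X is taken equal to C for simplicity, and nothing here should be read as the thesis' own implementation.

```python
import numpy as np

def fb1_iteration(x, beta, delta, theta, R, A, proj_C, normal_vector, tol=1e-10):
    # One pass of Conceptual Algorithm FB with Variant FB.1, for B = N_C (so J = P_C(I - beta*A)).
    J = proj_C(x - beta * A(x))
    if np.linalg.norm(x - J) <= tol:            # Stopping Test 1: x already solves the problem
        return x
    # Linesearch D: backtrack on alpha, evaluating A at the convex combination.
    alpha = 1.0
    while True:
        x_bar = alpha * J + (1.0 - alpha) * x
        u = normal_vector(x_bar, R)             # some u in B(x_bar) with ||u|| <= R
        if np.dot(A(x_bar) + u, x - J) >= (delta / beta) * np.dot(x - J, x - J):
            break
        alpha *= theta
    # Projection step of Variant FB.1: x^{k+1} = P_X(P_{H(x_bar, u)}(x)), here with X = C.
    g = A(x_bar) + u
    viol = np.dot(g, x - x_bar)
    half = x if viol <= 0 else x - (viol / np.dot(g, g)) * g
    return proj_C(half)
```

Variants FB.2 and FB.3 differ only in the last two lines, where the halfspace projection is replaced by a projection onto an intersection, as in the Variant B.3 sketch of Chapter 2.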

3.3 Convergence Analysis of Algorithm FB

In this section we analyze the convergence of the variants presented in the previous section. First, we present some general properties as well as prove that the conceptual algorithm is well defined. Lemma 3.3.1 For all (x, u) ∈ Gph(B), SI ⊆ H(x, u). Proof. Take x∗ ∈ SI . Using the definition of the solution, there exists v∗ ∈ B(x∗ ), such that 0 = A(x∗ ) + v∗ . By the monotonicity of A + B, we have hA(x) + u − (A(x∗ ) + v∗ ), x − x∗ i ≥ 0, for all (x, u) ∈ Gph(B). Hence, hA(x) + u, x∗ − xi ≤ 0 and by (3.9), x∗ ∈ H(x, u).  From now on, (xk )k∈N is the sequence generated by the conceptual algorithm. Proposition 3.3.2 The Conceptual Algorithm FB is well-defined. Proof. Follow from Lemma 3.2.1 and Proposition 1.3.33.



Proposition 3.3.3 xk ∈ H(¯ xk , u¯k ) for x¯k and u¯k as in (3.4), if and only if, xk ∈ SI . Proof. Since xk ∈ H(¯ xk , u¯k ), hA(¯ xk ) + u¯k , xk − x¯k i ≤ 0. Using the Armijo-type linesearch, given in Linesearch D and (3.4), we obtain αk 0 ≥ hA(¯ xk ) + u¯k , xk − x¯k i = αk hA(¯ xk ) + u¯k , xk − J(xk , βk )i ≥ δkxk − J(xk , βk )k2 ≥ 0, βk which implies that xk = J(xk , βk ). So, by Proposition 1.3.33, xk ∈ SI . Conversely, if xk ∈ SI using Lemma 3.3.1, xk ∈ H(¯ xk , u¯k ).  Finally we state a useful algebraic property on the sequence generated by the conceptual algorithm, which is a direct consequence of the inner loop and (3.4). Corollary 3.3.4 Let (xk )k∈N , (βk )k∈N and (αk )k∈N be sequences generated by the conceptual algorithm, with δ and βˆ as in the conceptual algorithm. Then, αk δ k hA(¯ xk ) + u¯k , xk − x¯k i ≥ kx − J(xk , βk )k2 ≥ 0, (3.11) ˆ β for all k ∈ N. 53

3.3.1

Convergence analysis of Variant FB.1

In this subsection all results refers to k

k

k

x )+¯ u ,x −¯ x Variant FB.1 xk+1 = FFB.1 (xk ) = PX (PH(¯xk ,¯uk ) (xk )) = PX xk − hA(¯ kA(¯ xk )+¯ uk k2  u¯k where  H(xk , uk ) = y ∈ Rn : hT (xk ) + uk , y − xk i ≤ 0 ,

ki

A(¯ xk ) +

with xk and uk are respectively given by (3.4). Proposition 3.3.5 If Variant FB.1 stops, then xk ∈ SI .  Proof. If Stop Criteria 2 is satisfied, xk+1 = PX PH(¯xk ,¯uk ) (xk ) = xk . Using Proposition 1.3.9(ii), we have hPH(¯xk ,¯uk ) (xk ) − xk , z − xk i ≤ 0, (3.12) for all z ∈ X. Now using Proposition 1.3.9(ii), hPH(¯xk ,¯uk ) (xk ) − xk , PH(¯xk ,¯uk ) (xk ) − zi ≤ 0,

(3.13)

for all z ∈ H(¯ xk , u¯k ). Since X ∩ H(¯ xk , u¯k ) 6= ∅ summing (3.12) and (3.13), with z ∈ X ∩ H(¯ xk , u¯k ), we get kxk − PH(¯xk ,¯uk ) (xk )k2 = 0. Hence, xk = PH(¯xk ,¯uk ) (xk ), implying that xk ∈ H(¯ xk , u¯k ) and by Proposition 3.3.3, xk ∈ SI .  From now on assume that Variant FB.1 does not stop. Note that by Lemma 3.3.1 H(¯ xk , u¯k ) is nonempty for all k ∈ N. Then the projection step (3.6) is well defined, i.e., if Variant FB.1 does not stop, it generates an infinite sequence (xk )k∈N . Proposition 3.3.6

(i) The sequence (xk )k∈N is Fej´er convergent to SI ∩ X.

(ii) The sequence (xk )k∈N is bounded. (iii) limk→∞ hA(¯ xk ) + u¯k , xk − x¯k i = 0. Proof. (i): Take x∗ ∈ SI ∩ X. Note that, by definition (¯ xk , u¯k ) ∈ Gph(B). Using (3.6), Proposition 1.3.9(i) and Lemma 3.3.1, we have kxk+1 − x∗ k2 =kPX (PH(¯xk ,¯uk ) (xk )) − PX (PH(¯xk ,¯uk ) (x∗ ))k2 ≤ kPH(¯xk ,¯uk ) (xk ) − PH(¯xk ,¯uk ) (x∗ )k2 ≤kxk − x∗ k2 − kPH(¯xk ,¯uk ) (xk ) − xk k2 . So, kxk+1 − x∗ k ≤ kxk − x∗ k.

54

(3.14)

(ii): Follows immediately from the previous item. (iii):Take x∗ ∈ SI ∩ X. Using (3.4) and

 A(¯ xk ) + u¯k , xk − x¯k k k , A(¯ x ) + u ¯ PH(¯xk ,¯uk ) (x ) = x − kA(¯ xk ) + u¯k k2 k

k

(3.15)

combining with (3.14), yields

2



k k k k  A(¯ x ) + u ¯ , x − x ¯

k k k kxk+1 − x∗ k2 ≤kxk − x∗ k2 − xk − A(¯ x ) + u ¯ − x

kA(¯ xk ) + u¯k k2 =kxk − x∗ k2 −

(hA(¯ xk ) + u¯k , xk − x¯k i)2 . kA(¯ xk ) + u¯k k2

Reordering the above inequality, we get (hA(¯ xk ) + u¯k , xk − x¯k i)2 ≤ kxk − x∗ k2 − kxk+1 − x∗ k2 . kA(¯ xk ) + u¯k k2

(3.16)

By Proposition 1.3.32 and the continuity of A, we have that J is continuous. Since (xk )k∈N and (βk )k∈N are bounded then (J(xk , βk ))k∈N and (¯ xk )k∈N are bounded, implying the bound edness of kA(¯ xk ) + u¯k k k∈N . Using Proposition 1.3.5(ii), the right side of (3.16) goes to 0, when k goes to ∞, establishing the result.  Next we establish our main convergence result on Variant FB.1. Theorem 3.3.7 The sequence (xk )k∈N converges to some element belonging to SI ∩ X. Proof. We claim that there exists an accumulation point of (xk )k∈N belonging to SI . The existence of accumulation points follows from Proposition 3.3.6(ii). Let (xik )k∈N be a convergent subsequence of (xk )k∈N such that, (¯ xik )k∈N , (¯ uik )k∈N , (αik )k∈N are convergent, and set limk→∞ xik = x˜. Using Proposition 3.3.6(iii) and taking limits in (3.11) over the subsequence (ik )k∈N , we have 0 = lim hA(¯ xik ) + u¯ik , xik − x¯ik i ≥ lim k→∞

k→∞

α ik δ ik kx − J(xik , βik )k2 ≥ 0. ˆ β

(3.17)

Therefore,limk→∞ αik kxik − J(xik , βik )k = 0. Now consider the two possible cases. Case (a): First, assume that limk→∞ αik 6= 0, i.e., αik ≥ α ¯ for all k ∈ N and some α ¯ > 0. In view of (3.17), lim kxik − J(xik , βik )k = 0. (3.18) k→∞

55

Taking a subsequence, if necessary, we may assume that limk→∞ βik = β˜ such that β˜ ≥ βˇ > 0 and since J is continuous, by the continuity of A and (I + βk B)−1 and by Proposition 1.3.32, (3.18) becomes ˜ x˜ = J(˜ x, β), which implies that x˜ ∈ SI . Establishing the claim. Case (b): On the other hand, if limk→∞ αik = 0. We have that, for θ ∈ (0, 1) defined in the conceptual algorithm, αi lim k = 0. k→∞ θ Define  α ik α ik  i k ik ik y := J(x , βik ) + 1 − x . θ θ Then, (3.19) lim y ik = x˜. k→∞

ik

Using the definition of αk , y does not satisfy Linesearch D implying D E δ ik ik k A(y ik ) + uij(i , x − J(x , β ) < kxik − J(xik , βik )k2 , i k )−1 k βik

(3.20)

k for uij(i ∈ B(y ik ) and all k. k )−1 Refining the subsequence (ik )k∈N , if necessary, we may assume that (βik )k∈N converges to k some β˜ such that β˜ ≥ βˇ > 0 and (uij(i )k∈N converges to u˜. By the maximality of B, k )−1 ˜ Using u˜ belongs to B(˜ x). Using the continuity of J, (J(xik , βik ))k∈N converges to J(˜ x, β). (3.19) and taking limit in (3.20) over the subsequence (ik )k∈N , we have D E ˜ ≤ δ k˜ ˜ 2. x − J(˜ x, β)k (3.21) A(˜ x) + u˜, x˜ − J(˜ x, β) β˜

˜ − β˜ ˜v + β˜u˜, x˜ − Using (3.1) and multiplying by β˜ on both sides of (3.21), we get h˜ x − J(˜ x, β) ˜ ≤ δk˜ ˜ 2 , where v˜ ∈ B(J(˜ ˜ J(˜ x, β)i x − J(˜ x, β)k x, β)). Applying the monotonicity of B, we 2 2 ˜ ˜ ˜ ≤ 0. Thus, x˜ = J(˜ ˜ obtain k˜ x − J(˜ x, β)k ≤ δk˜ x − J(˜ x, β)k , implying that k˜ x − J(˜ x, β)k x, β) and hence, x˜ ∈ SI . 

3.3.2

Convergence analysis of Variant FB.2

In this subsection all results refers to Variant FB.2 xk+1 = FFB.2 (xk ) = PX∩H(¯xk ,¯uk ) (xk ) where  H(xk , uk ) = y ∈ Rn : hT (xk ) + uk , y − xk i ≤ 0 , with xk and uk are respectively given by (3.4). 56

Proposition 3.3.8 If Variant FB.2 stops, then xk ∈ SI . Proof. If xk+1 = PX∩H(¯xk ,¯uk ) (xk ) = xk then xk ∈ X ∩ H(¯ xk , u¯k ) and by Proposition 3.3.3, xk ∈ SI ∩ X.  From now on assume that Variant FB.2 does not stop. Proposition 3.3.9 The sequence (xk )k∈N is F´ejer convergent to SI ∩ X. Moreover, it is bounded and limk→∞ kxk+1 − xk k = 0. Proof. Take x∗ ∈ SI ∩ X. By Lemma 3.3.1, x∗ ∈ H(¯ xk , u¯k ) ∩ X, for all k ∈ N. Then using Proposition 1.3.9(ii) and (3.7) kxk+1 − x∗ k2 − kxk − x∗ k2 + kxk+1 − xk k2 = 2hx∗ − xk+1 , xk − xk+1 i ≤ 0, we obtain kxk+1 − x∗ k2 ≤ kxk − x∗ k2 − kxk+1 − xk k2 .

(3.22)

The above inequality implies that (xk )k∈N is F´ejer convergent to SI ∩X. Hence by Proposition 1.3.5(i) and (ii), (xk )k∈N is bounded and thus {kxk − x∗ k} is a convergent sequence. Taking limits in (3.22), we get limk→∞ kxk+1 − xk k = 0.  The next proposition shows a relation between the projection steps in Variant FB.1 and Variant FB.2. This fact has a geometry interpretation, since the projection of Variant FB.2 is done over a smaller set, possibly improving the convergence of Variant FB.1. Note that this can be reduce the number of iterations, avoiding possible zigzagging of Variant FB.1. Proposition 3.3.10 Let (xk )k∈N the sequence generated by Variant FB.2. Then, (i) xk+1 = PX∩H(¯xk ,¯uk ) (PH(¯xk ,¯uk ) (xk )). (ii) limk→∞ hA(¯ xk ) + u¯k , xk − x¯k i = 0. Proof. (i): Fix any y ∈ X ∩ H(¯ xk , u¯k ). Since xk ∈ X but xk ∈ / H(¯ xk , u¯k ) by Proposition 3.3.3, there exists γ ∈ [0, 1], such that x˜ = γxk + (1 − γ)y ∈ X ∩ ∂H(¯ xk , u¯k ), where  ∂H(¯ xk , u¯k ) := x ∈ Rn : hA(¯ xk ) + u¯k , x − x¯k i = 0 . Hence, ky − PH(¯xk ,¯uk ) (xk )k2 ≥(1 − γ)2 ky − PH(¯xk ,¯uk ) (xk )k2 =k˜ x − γxk − (1 − γ)PH(¯xk ,¯uk ) (xk )k2 =k˜ x − PH(¯xk ,¯uk ) (xk )k2 + γ 2 kxk − PH(¯xk ,¯uk ) (xk )k2 − 2γh˜ x − PH(¯xk ,¯uk ) (xk ), xk − PH(¯xk ,¯uk ) (xk )i ≥k˜ x − PH(¯xk ,¯uk ) (xk )k2 , 57

(3.23)

where the last inequality follows from Proposition 1.3.9(ii), applied with X = H(¯ xk , u¯k ), x = xk and z = x˜ ∈ H(¯ xk , u¯k ). Furthermore, we have k˜ x − PH(¯xk ,¯uk ) (xk )k2 = k˜ x − xk k2 − kxk − PH(¯xk ,¯uk ) (xk )k2 ≥ kxk+1 − xk k2 − kxk − PH(¯xk ,¯uk ) (xk )k2 = kxk+1 − PH(¯xk ,¯uk ) (xk )k2 ,

(3.24)

where the first equality follows by PH(¯xk ,¯uk ) (xk ) = P∂H(¯xk ,¯uk ) (xk ), x˜ ∈ ∂H(¯ xk , u¯k ) and Pythagoras’s Theorem, using the fact that x˜ ∈ X ∩ H(¯ xk , u¯k ) and xk+1 = PX∩H(¯xk ,¯uk ) (xk ) in the first inequality, and Pythagoras’s Theorem again in the last equality. Combining (3.23) and (3.24), we obtain ky − PH(¯xk ,¯uk ) (xk )k ≥ kxk+1 − PH(¯xk ,¯uk ) (xk )k, for all y ∈ X ∩ H(¯ xk , u¯k ). Hence, xk+1 = PX∩H(¯xk ,¯uk ) (PH(¯xk ,¯uk ) (xk )). (ii): Take x∗ ∈ X ∩ SI . By item (i), Lemma 3.3.1 and Proposition 1.3.9(i), we have kxk+1 − x∗ k2 = kPX∩H(¯xk ,¯uk ) (PH(¯xk ,¯uk ) (xk )) − PX∩H(¯xk ,¯uk ) (x∗ )k2 ≤ kPH(¯xk ,¯uk ) (xk ) − PH(¯xk ,¯uk ) (x∗ )k2 ≤ kxk − x∗ k2 − kPH(¯xk ,¯uk ) (xk ) − xk k2 . The proof is similar to the proof of Proposition 3.3.6(iii).  Finally we present the convergence result for Variant FB.2. Theorem 3.3.11 The sequence (xk )k∈N converges to some point belonging to SI ∩ X. Proof. Repeat the proof of Theorem 3.3.7.

3.3.3



Convergence analysis of Variant FB.3

In this subsection all results refers to Variant FB.3 xk+1 = FFB.3 (xk ) = PX∩H(¯xk ,¯uk )∩W (xk ) (x0 ) where  H(xk , uk ) = y ∈ Rn : hT (xk ) + uk , y − xk i ≤ 0 ,  W (xk ) = y ∈ Rn : hy − xk , x0 − xk i ≤ 0 with xk and uk are respectively given by (3.4). Proposition 3.3.12 If Variant FB.3 stops, then xk ∈ SI ∩ X. 58

Proof. If Stop Criteria 2 is satisfied then, xk+1 = PX∩H(¯xk ,¯uk )∩W (xk ) (x0 ) = xk . So, xk ∈ X ∩ H(¯ xk , u¯k ) ∩ W (xk ) ⊂ X ∩ H(¯ xk , u¯k ) and finally using Proposition 3.3.3, xk ∈ SI ∩ X.  From now on we assume that Variant FB.3 does not stop. Observe that, in virtue of their definitions, W (xk ) and H(¯ xk , u¯k ) are convex and closed halfspaces, for each k. Therefore X ∩ H(¯ xk , u¯k ) ∩ W (xk ) is a convex and closed set. So, if X ∩ H(¯ xk , u¯k ) ∩ W (xk ) is nonempty, then the next iterate, xk+1 , is well defined. The following lemma guarantees this fact. Lemma 3.3.13 SI ∩ X ⊂ H(¯ xk , u¯k ) ∩ W (xk ), for all k ∈ N. Proof. We proceed by induction. By definition, SI ∩ X = 6 ∅. By Lemma 3.3.1, SI ∩ X ⊂ k k n H(¯ x , u¯ ), for all k ∈ N. For k = 0, as W0 = R , SI ∩ X ⊂ H0 ∩ W0 . Assume that SI ∩ X ⊂ H` ∩ W` , for ` ≤ k. Henceforth, xk+1 = PX∩H(¯xk ,¯uk )∩W (xk ) (x0 ) is well-defined. Then, by Proposition 1.3.9(ii), we have hx∗ −xk+1 , x0 −xk+1 i = hx∗ −PX∩H(¯xk ,¯uk )∩W (xk ) (x0 ) , x0 −PX∩H(¯xk ,¯uk )∩W (xk ) (x0 )i ≤ 0, (3.25) for all x∗ ∈ SI ∩ X. The inequality follows by the induction hypothesis. Now, (3.25) implies that x∗ ∈ W (xk+1 ) and hence, SI ∩ X ⊂ H(¯ xk+1 , u¯k+1 ) ∩ W (xk+1 ).  The above lemma shows that the set X∩H(¯ xk , u¯k )∩W (xk ) is nonempty and in consequence the projection step, given in (3.8), is well-defined. Corollary 3.3.14 Variant FB.3 is well defined. Proof. By Lemma 3.3.13 , SI ∩ X ⊂ H(¯ xk , u¯k ) ∩ W (xk ), for all k ∈ N. Then, given x0 , the sequence (xk )k∈N is computable.  Before proving the convergence of the sequence, we study its boundedness. The next lemma shows that the sequence remains in a ball determined by the initial point. Lemma 3.3.15 The sequence (xk )k∈N is bounded. Furthermore, k

(x )k∈N



 1 0 1 ⊂ B (x + x¯), ρ ∩ X, 2 2

where x¯ = PSI ∩X (x0 ) and ρ = dist(x0 , SI ∩ X). Proof. Follow using Lemma 1.3.13, with S = SI ∩ X and x = xk Now, we focus on the properties of the accumulation points. Lemma 3.3.16 All accumulation points of (xk )k∈N belong to SI ∩ X. 59



Proof. Since xk+1 ∈ W (xk ), 0 ≥ 2hxk+1 − xk , x0 − xk i = kxk+1 − xk k2 − kxk+1 − x0 k2 + kxk − x0 k2 . Equivalently 0 ≤ kxk+1 − xk k2 ≤ kxk+1 − x0 k2 − kxk − x0 k2 , establishing that the sequence (kxk − x0 k)k∈N is monotone and nondecreasing. From Lemma 3.3.15, we get that (kxk − x0 k)k∈N is bounded, and thus, convergent. Therefore, lim kxk+1 − xk k = 0.

(3.26)

hA(¯ xk ) + u¯k , xk+1 − x¯k i ≤ 0,

(3.27)

k→∞

Since xk+1 ∈ H(¯ xk , u¯k ), we get

with u¯k and x¯k as (3.4). Using (3.4) and (3.27), we have hA(¯ xk )+ u¯k , xk+1 − xk i + αk A(¯ xk )+ u¯k , xk − J(xk , βk ) ≤ 0. Combining the above inequality with the stop criteria of the Linesearch D, we get hA(¯ xk ) + u¯k , xk+1 − xk i +

αk δ k kx − J(xk , βk )k2 ≤ 0. ˆ β

(3.28)

Choosing a subsequence (ik )k∈N such that the subsequences (xik )k∈N , (βik )k∈N and (¯ uik )k∈N converge to x˜, β˜ and u˜ respectively. This is possible by the boundedness of (¯ uk )k∈N , by hypothesis on B, and the boundedness of (xk )k∈N and (βk )k∈N . Taking limits in (3.28), we have (3.29) lim αik kxik − J(xik , βik )k2 = 0. k→∞

Now we consider two cases, limk→∞ αik = 0 or limk→∞ αik 6= 0 (taking a subsequence again if necessary). Case 1: limk→∞ αik 6= 0, i.e., xik ≥ α ˜ for all k ∈ N and some α ˜ > 0. By (3.29),limk→∞ kxik − ˜ and hence by Proposition 1.3.33, x, β) J(xik , βik )k2 = 0. By continuity of J, we have x˜ = J(˜ x˜ ∈ SI . α Case 2: limk→∞ αik = 0, then limk→∞ θik = 0. The result follows in the same way as in the proof of Theorem 2.5.5(b).  Finally, we are ready to prove the convergence of the sequence (xk )k∈N generated by Variant FB.3, to the solution closest to x0 . Theorem 3.3.17 Define x¯ = PSI ∩X (x0 ). Then, (xk )k∈N converges to x¯.   Proof. By Lemma 3.3.15, (xk )k∈N ⊂ B 12 (x0 + x¯), 12 ρ ∩ X, so it is bounded. Let (xik )k∈N be   a convergent subsequence of (xk )k∈N , and let xˆ be its limit. Evidently xˆ ∈ B 12 (x0 + x¯), 12 ρ ∩ X. Furthermore, by Lemma 3.3.16, xˆ ∈ SI ∩ X. Then,   1 0 1 x}, xˆ ∈ SI ∩ X ∩ B (x + x¯), ρ = {¯ 2 2 implying xˆ = x¯, i.e., x¯ is the unique limit point of (xk )k∈N . Hence, (xk )k∈N converges to x¯ ∈ SI ∩ X.  60

3.4 Final remarks

The aim of this section is to compare Variant FB.1 with the algorithm proposed in [53] when B = NC . In this instance, problem (1.2) becomes the well-studied variational inequality problem, and the proposed variants (FB.1, FB.2 and FB.3) are related to the algorithms in [18, 53, 80]; see Fact 1.3.22. In the following, we present an example showing that there is an advantage in taking, inside Linesearch D, a non-zero element of NC in the application of Variant FB.1.

Example 3.4.1 Consider A : R2 → R2 defined as A(x, y) = (−y, x) and B : R2 → 2R2 given by B = NC , where C is the ball centered at (0, 0) with radius 1, i.e.,

    NC (x, y) = {0}            if x2 + y 2 < 1,
    NC (x, y) = R+ (x, y)      if x2 + y 2 = 1.

Clearly, A and B are monotone and the unique solution of problem (1.2) is x∗ = (0, 0). Set βk = 1 for all k ∈ N, δ = 1/2 and X = C. We begin with Variant FB.1 taking x0 = (a, b) such that a2 + b2 = 1. Then,

    J(x0 ) = (√2/2) (a + b, b − a).

Beginning Linesearch D with α = 1, take u00 ∈ NC (J(x0 )), i.e.,

    u00 = (√2 r/2) (a + b, b − a),

where r ≥ 0. For all r ≤ √2, J(x0 ) and u00 satisfy the acceptance condition of Linesearch D at the first trial, so x̄0 = J(x0 ) and

    ū0 = (√2 r/2) (a + b, b − a).

Thus,

    x1 = PC ( (a, b) − ((1 + (1 − √2)r)/(2(r2 + 1))) (a − b + r(a + b), a + b + r(b − a)) )
       = (a, b) − ((1 + (1 − √2)r)/(2(r2 + 1))) (a − b + r(a + b), a + b + r(b − a)),

for all 0 ≤ r ≤ √2. Therefore,

    D(r) = dist2 (x∗ ; x1 ) = kx1 k2 = (3r2 − 2r + 1)/(2(r2 + 1)),

for all 0 ≤ r ≤ √2, which attains its unique minimum at r = √2 − 1. We conclude that it is better to take ū0 = ((√2 − 1)/√2) (a + b, b − a) in order to obtain the next point, x1 , closest to the unique solution, x∗ , of problem (1.2).
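The optimal choice r = √2 − 1 can be double-checked numerically by simply evaluating D(r) on a grid; the short verification snippet below is ours and is not part of the thesis.

```python
import numpy as np

r = np.linspace(0.0, np.sqrt(2.0), 100001)
D = (3.0 * r**2 - 2.0 * r + 1.0) / (2.0 * (r**2 + 1.0))
r_best = r[np.argmin(D)]
print(r_best, np.sqrt(2.0) - 1.0)   # both are approximately 0.4142
```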


Chapter 4
A direct splitting method for nonsmooth variational inequalities

4.1 Introduction

This chapter is based on reference [19]. The Hilbert space H is infinite dimensional. In this chapter, we introduce a special direct method for solving the nonsmooth variational inequality problem where the operator is a sum of two maximal monotone operators. This work is inspired by the incremental subgradient method for nondifferentiable optimization proposed in [70], and it uses an idea similar to the one presented in [14, 16]. For the case of one operator, it is known that a natural extension of the subgradient iteration (one step) fails for monotone operators; see [16, 17]. However, as we will show, an extra step makes it possible to prove the weak convergence of the sequence generated by the proposed algorithm.

4.2

A splitting direct method

In this section we present a direct splitting method for solving problem (1.1). The main advantage of our scheme is that it is not required to solve any nontrivial subproblem. The method proposed here is a natural extension of the incremental subgradient method for nondifferentiable optimization [70], using simple subgradient-like steps. Our algorithm requires an exogenous sequence (αk )k∈N ⊂ R++ satisfying ∞ X

∞ X

αk = ∞,

k=0

αk2 < ∞.

(4.1)

k=0

This selection rule has been considered several times in the literature (see, e.g., [2, 3, 14, 16, 63

17, 44]). The algorithm is defined as: Algorithm D Take (αk )k∈N as in (4.1). Step 0 (Initialization): Take x0 ∈ C. Define z 0 := x0 and σ0 := α0 , and set k ← 0. Step 1 (Iterative step):Given xk , z k and σk . Compute: y k = PC z k − αk w1k



 z k+1 = PC y k − αk v2k , where w1k ∈ T1 (z k ) , v2k ∈ T2 (y k ). Set:   αk+1 αk+1 k+1 k+1 x = 1− xk + z , σk+1 σk+1

(4.2) (4.3)

(4.4)

with σk+1 := σk + αk+1 . Step 2 (Stop Test): If z k+1 = y k = z k , then stop. Otherwise, k ← k1 and go to Step 1. We make the following boundedness assumption on the operators T1 and T2 . (H) There exists a positive scalar M such that kuk ≤ M,

∀u ∈ Ti (z k ) ∪ Ti (y k ),

i = 1, 2 ∀k ∈ N.

(4.5)

This assumption holds automatically in finite-dimensional spaces, when dom(T1 ) = dom(T2 ) = H or C ⊂ int(dom(T1 ) ∩ dom(T2 )). We also mention that Assumption (H) is required in the analysis of [70] for proving convergence of the incremental subgradient method and in others similar schemes; see [14, 17].

4.2.1

Convergence analysis of Algorithm D

We start with the good definition of the stoping criteria. Proposition 4.2.1 If Algorithm D stops in the step k, then z k ∈ SV I . Proof. If z k+1 = y k , then using Proposition 1.3.9(ii) in (4.3) we have, hy k −αk v2k −y k , x−y k i ≥ 0 for all x ∈ C, hence, hv2k , y k − xi ≤ 0. Moreover, if z k = y k then, using again Proposition 1.3.9(ii) in (4.2), we have that, hw1k , y k − xi ≤ 0, for all x ∈ C. Thus, hv k , x − y k i ≤ 0 for all x ∈ C and v k = w1k + v2k ∈ T (y k ), showing that y k ∈ SV I .  From now on, we assume that Algorithm D generates infinite sequences. We present an important algebraic property of the auxiliary sequence (z k )k∈N obtained by Algorithm D. 64

Proposition 4.2.2 Let (z k )k∈N be the auxiliary sequence generated by Algorithm D. Then, for each (x, u) ∈ Gph(T ), with x ∈ C, there exists a constant R > 0 such that, kz k+1 − xk2 ≤ kz k − xk2 + Rαk2 − 2αk hu, z k − xi, ∀k ∈ N.

(4.6)

Proof. For each x ∈ C, take u ∈ T (x), such that u = u1 +u2 , with u1 ∈ T1 (x) and u2 ∈ T2 (x). Choosing M like in Assumption (H), we have

2

2   kz k+1 − xk2 = PC y k − αk v2k − PC (x) ≤ y k − αk v2k − x ≤ky k − xk2 + M 2 αk2 − 2αk hv2k , y k − xi ≤ky k − xk2 + M 2 αk2 − 2αk hu2 , y k − xi  =kPC z k − αk w1k − PC (x)k2 + M 2 αk2 − 2αk hu2 , y k − xi  ≤kz k − xk2 + 2M 2 αk2 − 2αk hu2 , y k − xi + hu1 , z k − xi =kz k − xk2 + 2M 2 αk2 − 2αk hu, z k − xi − 2αk hu2 , y k − z k i ≤kz k − xk2 + 2M 2 αk2 − 2αk hu, z k − xi + 2αk ku2 kky k − z k k ≤kz k − xk2 + (2M 2 + M ku2 k)αk2 − 2αk hu, z k − xi, where we use Proposition 1.3.9(i) in the first inequality, the monotonicity of T2 in the third one, Proposition 1.3.9(i) and the monotonicity of T1 in the fourth one, and the last inequality comes from ky k − z k k = kPC (z k − αk w1k − PC (z k )k ≤ αk M, (4.7) using Assumption (H) and Proposition 1.3.9(i). Defining R = 2M 2 + 2M ku2 k, we get (4.6).  From now on we assume that SV I is nonempty. We prove the quasi-Fej´er property of the auxiliary sequence (z k )k∈N generated by Algorithm D. Proposition 4.2.3 The auxiliary sequence (z k )k∈N generated by Algorithm D is quasiFej´er convergent to SV I , and bounded. Proof. Take x¯ ∈ SV I . Thus, there exists u¯ ∈ T (¯ x) such that h¯ u, x − x¯i ≥ 0 ∀ x ∈ C. By k Proposition 4.2.2, with x = x¯ and u = u¯, and using that z ∈ C for all k ∈ N, we have kz k+1 − x¯k2 ≤ kz k − x¯k2 + Rαk2 − 2αk h¯ u, z k − x¯i ≤ kz k − x¯k2 + Lαk2 , establishing that (z k )k∈N is quasi-Fej´er convergent to SV I . The boundedness of (z k )k∈N follows from Proposition 1.3.5(i).  Corollary 4.2.4 Let (xk )k∈N be the sequence generated by Algorithm D. Then, 65

k 1 X αi z i , for all k ∈ N; (i) x = σk i=0 k

(ii) (xk )k∈N is bounded. Proof. (i): We proceed by induction on k. For k = 0, we have x0 = z 0 by definition. By inductive hypothesis, assume that k 1 X αi z k . x = σk i=0 k

(4.8)

Since σk+1 = σk + αk+1 , we get xk+1 =

σk k αk+1 k+1 x + z . σk+1 σk+1

By (4.8) and the above equation, we have x

k+1

=

k 1 X

σk+1

i=1

k+1 1 X αk+1 k+1 z = αi z + αi z i , σk+1 σk+1 i=0 i

proving the assertion. (ii): Using Proposition 4.2.3 and Proposition 1.3.5(i), we get the boundedness of (z k )k∈N . We may assume that there exists M > 0 such that kz k k ≤ R, for all k ∈ N. By the previous item, k 1 X k αi kz i k ≤ M, kx k ≤ σk i=0 for all k ∈ N.



Now we prove that the accumulation points of the sequence generated by Algorithm D belong to the solution set. Theorem 4.2.1 All weak accumulation points of (xk )k∈N belong to SV I . Proof. Take any x ∈ C and u ∈ T (x). Rewriting (4.6) in Proposition 4.2.2, we get, kz i+1 − xk2 − kz i − xk2 − Rαi2 ≤ 2αi hu, x − z i i, for all i. Now summing (4.9), from i = 0 to i = k, and dividing by σk , we have * + k k  X 1 1 X  i+1 kz − xk2 − kz i − xk2 − Rαi2 ≤ 2 u, αi (x − z i ) . σk i=0 σk i=0 66

(4.9)

Using Corollary 4.2.4(i) and defining K :=

P∞

i=0

αi2 , we get

kz k+1 − xk2 − kz 0 − xk2 − RK ≤ 2hu, x − xk i, ∀k ∈ N. σk

(4.10)

Let x¯ be any weak accumulation of (xk )k∈N , that exists by Corollary 4.2.4(ii). Since (z k )k∈N is bounded and limk→∞ σk = ∞, then taking limits in (4.10), over any weak convergent subsequence to x¯, we have hu, x − x¯i ≥ 0 for all x ∈ C and u ∈ T (x). By Lemma 1.3.31, x¯ belongs to the solution set. Hence, all accumulation points of (xk )k∈N belong to SV I .  Finally, we prove our main result. Theorem 4.2.2 Define x∗ := limk→∞ PSV I (z k ). Then (xk )k∈N converges weakly to x∗ . Proof. Define pk := PSV I (z k ). Note that pk , the orthogonal projection of z k onto SV I , exists since the solution set SV I is nonempty by assumption, and closed and convex by Lemma 1.3.30. By Proposition 1.3.5, (z k )k∈N is quasi-Fej´er convergent to SV I . Therefore, it follows  from Lemma 1.3.6 that PSV I (z k ) k∈N is strongly convergent. Set x∗ := lim PSV I (z k ) = lim pk . k→∞

k→∞

(4.11)

By Corollary 4.2.4(ii), (xk )k∈N is bounded and by Theorem 4.2.1 each of its weak accumulation points belong to SV I . Let (xik )k∈N be any weakly convergent subsequence of (xk )k∈N , and let x¯ ∈ SV I be its weak limit. It suffices to show that x¯ = x∗ for establishing the weak convergence of (xk )k∈N . By Lemma 1.3.9(ii) we have that h¯ x − pj , z j − pj i ≤ 0 for all j. Let ξ = sup0≤j≤∞ kz j − pj k. Since (z k )k∈N is bounded by Proposition 4.2.3, we get that ξ < ∞. Using Cauchy-Schwarz inequality, h¯ x − x∗ , z j − pj i ≤ hpj − x∗ , z j − pj i ≤ ξ kpj − x∗ k, (4.12) αj and summing from j = 0 to k, we get from Corollary for all j. Multiplying (4.12) by σk 4.2.4(i), *

k 1 X k x¯ − x∗ , x − α j pj σk j=0

Define ζk,j :=

αj σk

+

(k ≥ 0, 67

k ξ X ≤ αj kpj − x∗ k. σk j=0

0 ≤ j ≤ k).

(4.13)

P It follows from the definition of σk , that limk→∞ ζk,j = 0 for all j and kj=0 ζk,j = 1 for all P P k ∈ N. Using (4.11) and Proposition 1.3.7 with pk = kj=0 ζk,j pj = σ1k kj=0 αj pj , we have k 1 X x∗ = lim p = lim α j pj , k→∞ k→∞ σk j=0

(4.14)

k 1 X αj kpj − x∗ k = 0. lim k→∞ σk j=0

(4.15)

k

and

Taking limits in (4.13) over the subsequence (ik )k∈N , and using (4.14) and (4.15), we get h¯ x − x∗ , x¯ − x∗ i ≤ 0, implying that x¯ = x∗ . 

4.3

Final remarks

In this chapter we have been proposed a direct splitting method for nonsmooth variational inequality, and analyzed it convergence properties, where the operator is a sum of two monotone operators. In the proposed scheme the resolvent of any individual operator is evaluated, which represents an important advantage in the computational sense. The proposed scheme also is interesting for solving the problem of the minimization of the sum of two nonsmooth convex functions, i.e. min f1 (x) + f2 (x) s.t. x ∈ C or equivalently, for solving problem (1.1) where T = ∂f1 + ∂f2 . There are few results for the nonsmooth case in the literature; see [12, 70]. Here we are improving ones the previous results by extending them for solving variational inequality problems.

68

Chapter 5 A relaxed-projection splitting algorithm for variational inequalities in Hilbert spaces 5.1

Introduction

This chapter is based on reference [21]. We present a relaxed-projection splitting algorithm for solving the variational inequality problem for T and C, with T as a sum of m nonsmooth maximal monotone operators, i.e, T = T1 +T2 +· · ·+Tm where Ti : H → 2H , (i = 1, 2, . . . , m) and C is of the following form: C := {x ∈ H : c(x) ≤ 0} where c : H → R is a continuous and convex function, possibly nondifferentiable. It is clear that, if Ti , (i = 1, . . . , m) are monotone, then T = T1 + T2 + · · · + Tm is also monotone. But if Ti , (i = 1, 2, . . . , m) are maximal, it does not necessarily follow that T is maximal even when dom(T ) is nonempty. Some additional condition is needed, since for example the graph of T can be even empty (as happens when dom(T ) = dom(T1 )∩dom(T2 )∩ · · · ∩ dom(Tm ) = ∅). The problem of determining conditions under which the sum is maximal, turns out to be of fundamental importance in the theory of monotone operators. Results in this directions were proved in [74]. It is clear that in our case (dom(Ti ) = H, i = 1, 2, . . . , m) all these sufficient conditions for establishing the maximality of T , are satisfied. Our method was inspired by the incremental subgradient method for nondifferentiable optimization, proposed in [70], and it uses similar ideasto the methods in [14, 16, 19]. In the case of only one operator, it is known that a natural extension of the subgradient iteration (one step), the convergence fails in general for monotone operators; see [16, 17]. Here we 69

introduce an extra step in order to prove the weak convergence of the sequence generated by our algorithm.

5.2

A relaxed-projection splitting method

In this section, we introduce an algorithm for solving the variational inequality problem when T = T1 + T2 + . . . + Tm and C is of the form C = {x ∈ H : c(x) ≤ 0},

(5.1)

where c : H → R is a continuous convex function. Differentiability of c is not assumed and therefore the representation (5.1) is rather general, since any system of inequalities cj (x) ≤ 0 with j ∈ J, where each cj is convex, may be represented as in (5.1) with c(x) = sup{cj (x) : j ∈ J}.

5.2.1

Linesearch to relax the projection process

The following search is inspired on the method proposed by Bello Cruz and Iusem in [15].

Linesearch R (Linesearch Relax the projection) Input: (z, C, θ, α). Where z ∈ / C, C = {x ∈ H : c(x) ≤ 0} and c : H → R is continuous 0 and convex, θ, α > 0 Define y := z. While dist(y j+1 , C) > θα where y j+1 := PCj ∩Wj (y 0 ),

(5.2)

Cj := {x ∈ H : c(y j ) + hg j , x − y j i ≤ 0},

(5.3)

Wj := {x ∈ H : hx − y j , y 0 − y j i ≤ 0},

(5.4)

with g j ∈ ∂c(y j ). do j ← j + 1. End While ˜j :=j + 1 C˜ :=C˜. j

˜

˜ Output: (y j , C). Now we will prove the good definition of Linesearch R. 70

Proposition 5.2.1 Let C, Cj , Wj for all j ∈ N, C˜ be defined by Linesearch R and take z∈ / C. Then, ˜ (i) C ⊆ Cj ∩ Wj , C ⊆ C. (ii) Linesearch R stops after finitely many steps. Proof. (i): It follows from (5.3) and the definition of the subdifferential that C ⊆ Cj for all j. Using Proposition 4 and Corollary 1 of [15], with C = H, f (x) = c+ (x) := max{0, c(x)} implying that f∗ = 0, since our C 6= ∅, we get C ⊆ Wj for all j. Note that for all y j ∈ / C, we have j j ˜ ˜ ∂f (y ) = ∂c(y ). Thus, C ⊆ C by definition of C. (ii): Regarding the projection step in (5.2) , item (a) shows that the projections onto Cj ∩ Wj are well defined. Using Theorem 2 of [15], with C = H, x0 = y 0 , f (x) = c+ (x) := max{0, c(x)}, we have that (y j )j∈N converges strongly to PC (y 0 ) ∈ C. Thus, Linesearch R is well defined, stopping in a finite number of steps. 

5.2.2

The Algorithm R

Consider an exogenous sequence (αk )k∈N in R++ . The algorithm is defined as follows. Algorithm R ( Relaxed-Projection Algorithm) Step 0 (Initialization): Take x0 ∈ H and θ > 0. Define z 0 := x0 and σ0 := α0 . Step 1 (Iterative step 1): Given z k . If c(z k ) ≤ 0, then take z0k := z k and Ck = {x ∈ H : hg k , x − z0k i ≤ 0} where g k ∈ ∂c+ (z0k ) with c+ (x) = max{0, c(x)} and go to Step 3. Else, Step 2 (Linesearch): (z0k , Ck ) := Linesearch R(z k , C, θ, αk )

(5.5)

Step 3 (Iterative Step 2): Compute the cycle, from i = 1, 2, . . . , m, as follows  k zik = PCk zi−1 − αk uki ,

(5.6)

k k where uki ∈ Ti (zi−1 ). Define z k+1 := zm .

σk := σk−1 + αk , and x

k+1

 :=

αk 1− σk 71



xk +

(5.7) αk k+1 z . σk

(5.8)

Before the formal analysis of the convergence properties of Algorithm R, we make some comments about the assumptions. First, unlike other projection methods, Algorithm R generates a sequence (xk )k∈N which is not necessarily contained in the set C. As will be shown in the next subsection, the generated sequence is asymptotically feasible and, in fact, converges to some point in the solution set. We observe that Linesearch R starts with the point z k and ends with a point z0k close to C, in fact dist(z0k , C) ≤ θαk . This is possible since Linesearch R, in the step k, is a direct application of Algorithm A in [15], with C = H, x0 = y k,0 , f (x) = c+ (x) := max{0, c(x)} and f∗ := inf x∈H c+ (x) = 0. Recently has been proposed in [1,24] a restricted memory level bundle method improving the convergence result of [15], which can be applied to Linesearch R, accelerating its convergence. It might seem that this Linesearch R can be replaced by any finite procedure leading to an approximation of PC (z k ), say a point z0k such that kz0k − PC (z k )k is sufficiently small. This is not the case: In the first place, depending on the location of the intermediate hyperplanes Cj and Wj , the sequence (y k,j )j∈N may approach points in C far from PC (z k ); in fact the computational cost of our Linesearch R is lower than the computation of an inexact orthogonal projection of z k onto C. On the other hand, it is not the case that any point z close enough to PC (z k ) will do the job. The crucial relation for convergence of our method is kz0k − xk ≤ kz k − xk for all x ∈ C, which may fail if we replace z0k by points z arbitrarily closed to PC (z k ). Algorithm R is easily implemented, since PWj ∩Cj and PCk , given in (5.2) and (5.5) respectively, have easy formulae by Proposition 1.3.10. Hence, by Proposition 1.3.10 the projections onto Cj ∩ Wj in Linesearch R and Ck in (5.6), can be calculated explicitly. Therefore, Algorithm R may be considered as an explicit method, since it does not solve a nontrivial subproblem. We need the following boundedness assumptions on ∂c. (H1) ∂c is bounded on bounded sets. In finite-dimensional spaces, this assumption is always satisfied in view of Theorem 4.6.1(ii) in [30], due to the maximality of ∂c. The maximality has been proved in [74]. For some equivalences with condition (H1) see e.g. Proposition 16.17 in [11]. Moreover, in the literature, (H1) has been required in the convergence analysis of various methods solving optimization problems in infinite-dimensional spaces (see e.g., [3, 15, 72]). We only use this assumption for establishing the good definition of Linesearch R. (H2) Define

ηk := max {1, kuki k}, 1≤i≤m

72

(5.9)

k with uki ∈ Ti (zi−1 ). We assume that the stepsize sequence, (αk )k∈N , satisfies: ∞ X

αk = +∞,

(5.10)

(ηk αk )2 < +∞.

(5.11)

k=0 ∞ X k=0

We mention that in the analysis of [70], a stronger condition than (H2) is required for proving convergence of the incremental subgradient method. Recently, similar assumptions have been used in the convergence analysis in [4, 27]. The condition (5.10) (divergent-series) on the stepsizes has been used widely for the convergence of classical projected subgradient methods; see [3, 72]. The condition (5.11) is used for establishing Proposition 5.2.3, which implies boundedness of the sequence (z k )k∈N . When (αk )k∈N is in `2 (N), the condition (5.11) holds, assuming that the image of Ti , (i = 1, 2, . . . , m) are bounded. Furthermore, it is possible to assume a weaker sufficient condition for (5.11) as for example: if (αk )k∈N = (1/k)k∈N , then the sequence (ηk )k∈N , defined in (5.9), may be unlimited like the sequence (k s )k∈N for any s ∈ (0, 1/2).

5.2.3

Convergence analysis of Algorithm R

Before establishing convergence of Algorithm R, we need to ascertain the validity of the stopping criterion as well as the fact that Algorithm R is well defined. A useful proposition for the convergence of algorithm is: Proposition 5.2.2 Let (z k )k∈N and (zik )k∈N , with i = 0, 1, . . . , m be sequences generated by Algorithm R. Then, (i) kzjk − zik k ≤ (j − i)ηk αk , for all k ∈ N and 0 ≤ i ≤ j ≤ m. P (ii) For any x ∈ C and u ∈ T (x) such that u = m i=1 ui with ui ∈ Ti (x), (i = 1, 2, . . . , m). Then, kz k+1 − xk2 ≤ kz k − xk2 + m [(ηk αk )2 + (m − 1)ηηk αk2 ] − 2αk hu, z0k − xi, where η := max1≤i≤m kui k. k Proof. (i): Since zik ∈ Ck for all 0 ≤ i ≤ m and all k, taking any ukj ∈ Tj (zj−1 ) and using the Cauchy-Schwarz inequality, we have

  k kzjk − zik k = PCk zj−1 − αk ukj − PCk zik k k ≤kzj−1 − zik − αk uj,k k ≤ kzj−1 − zik k + kukj kαk ≤ · · · ≤ (j − i) ηk αk .

73

Pm (ii): Take (x, u) ∈ Gph(T ) with u = i=1 ui and ui ∈ Ti (x), (i = 1, 2, . . . , m). Using Proposition 1.3.9(i) in the first inequality, and the monotonicity of each component operator Ti in the latter, we obtain,

2

2 k   k − αk uki − x kzik − xk2 = PCk zi−1 − αk uki − PCk (x) ≤ zi−1 k k = kzi−1 − xk2 + (kuki kαk )2 − 2αk huki , zi−1 − xi k k ≤ kzi−1 − xk2 + (kuki kαk )2 − 2αk hui , zi−1 − xi,

for all k ∈ N and i = 1, 2, . . . , m. By summing the above inequalities over i = 1, 2, . . . , m and using (5.9), we get kz

k+1

2

− xk ≤

kz0k

2

2

− xk + m(ηk αk ) − 2 αk

m X

k hui , zi−1 − xi

i=1

= kz0k − xk2 + m(ηk αk )2 − 2αk

m X

k hui , z0k − xi + hui , zi−1 − z0k i



i=1

=

kz0k

2

2

− xk + m(ηk αk ) −

2αk hu, z0k

m X k − xi − 2αk hui , zi−1 − z0k i. i=1

Using the Cauchy-Schwarz inequality and item (a) for j = m and i = 0, we have kz

k+1

2

− xk ≤

kz0k

2

2

− xk + m(ηk αk ) −

2αk hu, z0k

− xi + 2αk

m X

k kui kkzi−1 − z0k k

i=1

≤ kz0k − xk2 + m(ηk αk )2 + 2αk2

m X

(i − 1)kui kηk − 2αk hu, z k − xi

i=1 k

2

2

≤ kz − xk + m(ηk αk ) + m(m − 1)ηηk αk2 − 2αk hu, z0k − xi, where the last inequality is a direct consequence of the fact that z0k is obtained by Linesearch R and defining η = max1≤i≤m kui k thus proving the proposition.  We continue by proving the quasi-Fej´er properties of the sequences (z k )k∈N generated by Algorithm R. From now on, we assume that the solution set, SV I , of problem (1.1) is nonempty. Proposition 5.2.3 The sequence (z k )k∈N is quasi-Fej´er convergent to SV I . Proof. Take x¯ ∈ SV I . Then, there exists u¯ ∈ T (¯ x) such that h¯ u, x − x¯i ≥ 0 ∀ x ∈ C, 74

(5.12)

Pm where u¯ = ¯i , with u¯i ∈ Ti (¯ x), (i = 1, 2, . . . , m). Using now Proposition 5.2.2(b), i=1 u taking η¯ = max1≤i≤m k¯ ui k, we get   u, z0k − xi kz k+1 − x¯k2 ≤ kz k − x¯k2 + m (ηk αk )2 + (m − 1)¯ η ηk αk2 − 2αk h¯   u, PC (z0k ) − x¯i) u, z0k − PC (z0k )i + h¯ = kz k − x¯k2 + m (ηk αk )2 + (m − 1)¯ η ηk αk2 − 2αk (h¯   ≤ kz k − x¯k2 + m (ηk αk )2 + (m − 1)¯ η ηk αk2 + 2αk k¯ ukdist(z0k , C)   ≤ kz k − x¯k2 + m (ηk αk )2 + (m − 1)¯ η ηk αk2 + 2θk¯ ukαk2 , (5.13) where we use (5.12) and the Cauchy-Schwarz inequality in the second inequality and the last inequality is a consequence of the fact that z0k is obtained by Linesearch R. It follows from (5.13), (5.9) and (H2) that (z k )k∈N is quasi-Fej´er convergent to SV I .  Next we establish some convergence properties of Algorithm R. Proposition 5.2.4 Let (z k )k∈N and (xk )k∈N be the sequences generated by Algorithm R. Then, (i) x

k+1

k 1 X αj z j+1 , for all k ∈ N; = σk j=0

(ii) (xk )k∈N are bounded; (iii) limk→∞ dist(xk , C) = 0; (iv) all weak accumulation points of (xk )k∈N belong to C. Proof. (i): We proceed by induction on k. For k = 0, using (5.8) and that σ0 = α0 , we have that x1 = z 1 . By inductive hypothesis, assume that

xk =

k−1 1 X

σk−1

αj z j+1 .

j=0

Using (5.7) and (5.8), we obtain xk+1 =

σk−1 k αk k+1 x + z . σk σk

By (5.14) and the above equation, we get xk+1 =

k−1 k αk 1 X 1 X αj z j+1 + z k+1 = αj z j+1 , σk j=0 σk σk j=0

75

(5.14)

proving the assertion. (ii): Using Proposition 5.2.3 and Proposition 1.3.5(a), we have the boundedness of (z k )k∈N . We assume that there exists R > 0 such that kz k k ≤ R, for all k ∈ N. By the previous item, k−1 1 X

k

kx k ≤

σk−1

αj kz j+1 k ≤ R,

j=0

for all k ∈ N. (iii): It follows from definition of z0k that dist(z0k , C) ≤ θαk .

(5.15)

Define x˜

k+1

k 1 X αj PC (z0j ). := σk j=0

(5.16)

k 1 X Since αj = 1 by (5.7), we get from the convexity of C, that x˜k+1 ∈ C. Thus, σk j=0

k k

1 X

 1 X

j k+1 k+1 k+1 j+1 dist(x , C) ≤ kx − x˜ k = αj z − PC (z0 ) ≤ αj z j+1 − PC (z0j )

σk

σk j=0 j=0 ≤

k k  1 X 1 X αj kz j+1 − z0j k + kz0j − PC (z0j )k ≤ αj (m ηj αj + dist(z0j , C)) σk j=0 σk j=0

k 1 X ≤ (m ηj αj2 + θαj2 ), σk j=0

(5.17)

using the fact that x˜k+1 belongs to C in the first inequality, (ii) and (5.16) in the equality, convexity of k · k in the second inequality, Proposition 5.2.2(i), with j = m and i = 0, in the fourth inequality and (5.15) in the last one. Taking limits in (5.17) and using (5.7) and (5.10), we get limk→∞ dist(xk+1 , C) = 0, establishing (iii). (iv): Follows directly from (iii).



Next we prove optimality of the accumulation points of (xk )k∈N . Theorem 5.2.5 All weak accumulation points of the sequence (xk )k∈N generated by Algorithm R solve problem (1.1). 76

Proof. Using Proposition 5.2.2(ii), we get, for any x ∈ C, u ∈ T (x) and for all j > 0,   kz j+1 − xk2 ≤ kz j − xk2 + m (ηj αj )2 + (m − 1)ηηj αj2 − 2αj hu, z0j − xi,   = kz j − xk2 + m (ηj αj )2 + (m − 1)ηηj αj2 − 2αj hu, z0j − z j+1 i − 2αj hu, z j+1 − xi   ≤ kz j − xk2 + m (ηj αj )2 + (m − 1)ηηj αj2 + 2αj kukkz0j − z j+1 k − 2αj hu, z j+1 − xi   ≤ kz j − xk2 + m (ηj αj )2 + (m − 1)ηηj αj2 + 2mkukηj αj2 − 2αj hu, z j+1 − xi, (5.18) using the Cauchy-Schwarz inequality in the second inequality and Proposition 5.2.2(i), with j = m and i = 0, in the last one. Rewriting and summing (5.18) from j = 0 to j = k and dividing by σk , we obtain from Proposition 5.2.4(i) that k   1 X kz j+1 − xk2 − kz j − xk2 − m (ηj αj )2 − (m − 1)ηηj αj2 − 2kukηj αj2 ≤ 2hu, x − xk+1 i. σk j=0

After rearrangements, we have ∞ X   1 k+1 2 0 2 (kz − xk − kz − xk − m (ηj αj )2 + (m − 1)ηηj αj2 + 2kukηj αj2 ) ≤ 2hu, x − xk+1 i. σk j=0

(5.19) Let xˆ be a weak accumulation point of (x )k∈N . Existence of xˆ is guaranteed by Proposition 5.2.4(ii). Note that xˆ ∈ C by Proposition 5.2.4(iv). Taking limits in (5.19), using (5.11), boundedness of (z k )k∈N by Proposition 5.2.3 and (5.10), we obtain that hu, x − xˆi ≥ 0 for all x ∈ C and u ∈ T (x). Using Lemma 1.3.31, we get that xˆ ∈ SV I . Hence, all weak accumulation points of (xk )k∈N solve problem (1.1).  k

Finally, we state and prove the weak convergence of the main sequence generated by Algorithm R. Theorem 5.2.6 Define x∗ = limk→∞ PSV I (z k ). Then (xk )k∈N converges weakly to x∗ . Proof. Define pk := PSV I (z k ) the orthogonal projection of z k onto SV I . Note that pk exists, since the solution set SV I is nonempty, closed and convex by Lemma 1.3.30. By Proposition 1.3.5, (z k )k∈N is quasi-Fej´er convergent to SV I . Therefore, it follows from Lemma 1.3.6 that (PSV I (z k ))k∈N is strongly convergent. Set x∗ := lim PSV I (z k ) = lim pk . k→∞

k→∞

(5.20)

By Proposition 5.2.4(ii), (xk )k∈N is bounded and by Theorem 5.2.5 each of its weak accumulation points belong to SV I . Let (xik )k∈N be any weakly convergent subsequence of (xk )k∈N , 77

and let x¯ ∈ SV I be its weak limit. In order to establish the weak convergence of (xk )k∈N , it suffices to show that x¯ = x∗ . By Proposition 1.3.9(ii) we have that h¯ x − pj , z j − pj i ≤ 0 for all j. Let ξ = sup0≤j≤∞ kz j − pj k. Since (z k )k∈N is bounded by Proposition 1.3.5(a), we get that ξ < ∞. Using the CauchySchwarz inequality, h¯ x − x∗ , z j − pj i ≤ hpj − x∗ , z j − pj i ≤ ξ kpj − x∗ k, (5.21) αj−1 for all j. Multiplying (5.21) by and summing from j = 1 to k − 1, we get from σk−1 Proposition 5.2.4(i), * x¯ − x∗ , xk −

k−1 1 X

σk−1

Define ζk,j :=

+ αj−1 pj

j=1

αj σk

(k ≥ 0,



ξ

k−1 X

σk−1

j=1

αj−1 kpj − x∗ k.

(5.22)

0 ≤ j ≤ k).

It follows from the definition of σk , that limk→∞ ζk,j = 0 for all j and k X k k ∈ N. Using (5.20) and Proposition 1.3.7 with w = ζk−1,j−1 pj = j=1

Pk

1 for all

1

αj pj+1 , we

j=0 ζk,j = k−1 X

σk−1

j=0

have k 1 X x∗ = lim p = lim αj pj+1 , k→∞ k→∞ σk j=0

(5.23)

k 1 X αj kpj+1 − x∗ k = 0. k→∞ σk j=0

(5.24)

k

and

lim

Taking limits in (5.22) over the subsequence (ik )k∈N , and using (5.23) and (5.24), we get h¯ x − x∗ , x¯ − x∗ i ≤ 0, implying that x¯ = x∗ .

5.3



Final remarks

In this section we discuss the assumptions of our scheme, showing examples as well as some alternatives for changing these assumptions. One problem in establishing the good definition 78

of the sequence generated by Algorithm R may be the difficulty of choosing stepsizes satisfying Assumption (H2); see (5.10)-(5.11). This is easy in the important case where the operators have bounded range, i.e., the sequence (ηk )k∈N , defined in (5.9), is bounded. Hence, any sequence (αk )k∈N in `2 (N) \ `1 (N) may be used satisfying (H2). Now we present some examples showing that (H2) is verified for some different instances. Example 5.3.1 Consider the variational inequality problem in a Hilbert space H, for T and S = argminx∈H f (x), where T : H → 2H is a maximal monotone operator and f : H → R is a continuous convex function which satisfies (H1). This problem is equivalent to problem (1.1), with m = 1, T1 = T , c = f − f∗ , where f∗ = minx∈H f (x) and Algorithm R may be rewritten as follows:

Algorithm R.1 . Step 0 (Initialization): Take x0 ∈ H and θ > 0. Define z 0 := x0 and σ0 := β0 . Step 1 (Iterative Step 1):Given z k . If f (z k ) ≤ f∗ , then take z0k := z k and Sk = {x ∈ H : hg k , x − z0k i ≤ 0} where g k ∈ ∂f (z0k ). Step 2 (Inner Loop): (z0k , Sk ) := LinesearchR(z k , S, θ, βk ) Step 3 (Iterative Step 2): Compute   βk k k+1 k z = PSk z0 − u , ηk where uk ∈ T (z0k ) and ηk = max{1, kuk k}. σk := σk−1 + x

k+1

 :=

βk 1− σk



βk , ηk

xk +

βk k+1 z . σk

Algorithm R.1 is the point-to-set version of the algorithm proposed in [14]. Assuming that the problem has solutions and that (βk )k∈N in `2 (N) \ `1 (N), the analysis of the convergence follows directly from the analysis in [14]. (Linesearch R is slightly different, however the convergence proof remains essentially unchanged.) Example 5.3.2 We consider the optimization problem of the form min x∈X

φ1 (L(x)) + φ2 (x),

(5.25)

where L : X → Y is a continuous linear operators, with closed range, X and Y are two Hilbert spaces and φ1 : Y → R, φ2 : X → R are convex and continuous functions. 79

This is a classical problem which appears in many applications in mechanics and economics (see, e.g., [46]). Denote K = {(x, y) ∈ X × Y : L(x) − y = 0}, and







A=

∂φ1 0  0 0

B=

 0

0  . 0 ∂φ2

A and B are maximal monotone and (5.25) is equivalent to problem (1.1), with m = 2, T1 = A, T2 = B and C = K in H = X × Y . In this case our algorithm does not require Linesearch R, since the feasible set K is a linear and closed subspace. Thus, the projection onto K is easy to compute; in effect, the set K can be rewritten as

1 K = {(x, y) ∈ X × Y : c(x, y) = kL(x) − yk2 ≤ 0}, 2

 and ∇c(x, y) = 





L (L(x) − y)  and hence, y − L(x)

1 ∇c(x, y) . PK (x, y) = (x, y) − kL(x) − yk2 2 k∇c(x, y)k2

Algorithm R may be rewritten as follows:

80

Algorithm R.2 . Step 0 (Initialization): Take x0 := (x01 , x02 ) ∈ K and θ > 0. Define z 0 := x0 and σ0 := β0 . Step 1 (Iterative Step 1):Given z k = (z1k , z2k ). Compute   βk k k k k+1 k+1 (z1,1 , z1,2 ) = PK z1 − u1 , z2 , ηk   βk k k+1 k+1 k+1 k+1 (z2,1 , z2,2 ) = PK z1,1 , z1,2 − u2 , ηk k+1 ) and ηk = max{1, kuk1 k kuk2 k}. where uk1 ∈ ∂φ1 (z1k ), uk2 ∈ ∂φ2 (z1,2

σk := σk−1 + xk+1 1 xk+1 2

 :=  :=

βk 1− σk



αk 1− σk



βk , ηk

xk1 +

βk k+1 z . σk 2,1

xk2 +

βk k+1 z . σk 2,2

Set  k+1 xk+1 = xk+1 , x . 1 2 Example 5.3.3 Consider the minimax problem: min max{φ1 (x1 ) − φ2 (x2 ) + hx2 , L(x1 )i},

x1 ∈X x2 ∈X

(5.26)

where L : X → X is a self adjoint and continuous linear operator, X is a Hilbert space, φ1 : X → R, φ2 : X → R are convex and continuous functions and φ2 is Gˆateaux differentiable. This problem was presented in [74] and under a suitable constraint qualification, it is equivalent to problem (1.1), with m = 2, H = C = X × X, and T1 (x1 , x2 ) = A(x1 , x2 ) = (∂φ1 (x1 ), 0) and T2 (x1 , x2 ) = B(x1 , x2 ) = (L(x2 ), ∇φ2 (x2 ) − L(x1 )), which are maximal monotone operators. Algorithm R can be rewritten as follows:

81

Algorithm R.3 . Step 0 (Initialization): Take x0 := (x01 , x02 ) ∈ X × X and θ > 0. Define z 0 := x0 and σ0 := β0 . Step 1 (Iterative Step 1): Given z k = (z1k , z2k ). Compute k+1 z1,1 = z1k − k+1 k+1 − = z1,1 z1,2

βk k u ηk 1

βk k+1 ) L(z2,1 ηk

k+1 z2,1 = z2k , k k+1 − = z2,1 z2,2

 βk k+1 k+1 ) , ) − L(z1,1 ∇φ2 (z2,1 ηk

 k+1 k+1 k+1 where uk1 ∈ ∂φ1 (z1k ) and ηk = max 1, kuk1 k, kL(z2,1 )k, k∇φ2 (z2,1 ) − L(z1,1 )k . k+1 k+1 z k+1 := (z1k+1 , z2k+1 ) = (z1,2 , z2,2 ). Step 2 (Actualization): βk σk := σk−1 + , ηk   βk βk xk+1 := 1 − xk1 + z1k+1 . 1 σk σk   βk βk xk+1 := 1 − xk2 + z2k+1 . 2 σk σk

Set

Set k+1 xk+1 = (xk+1 1 , x2 ).

βk , for ηk all k ∈ N and hence ηk αk = βk and (5.11) is equivalent to (βk )k∈N in `2 (N), which by Proposition 5.2.3 implies the boundedness of the sequence (z k )k∈N . Moreover as a consequence of k )k∈N , (i = 1, 2, . . . , m) are bounded. Now, condition Proposition 5.2.2(a), the sequences (zi−1 (5.10) becomes in ∞ X βk = ∞, (5.27) ηk k=0 In Algorithms R.2 and R.3, presented in the above examples, the stepsize is αk =

k where ηk = max1≤i≤m {1, kuki k}, with uki ∈ Ti (zi−1 ). Thus, a sufficient condition for (5.27) is that the image of Ti , (i = 1, 2, . . . , m) is bounded on bounded sets (since the sequences k (zi−1 )k∈N , (i = 1, 2, . . . , m) are bounded). Moreover, (ηk )k∈N may be an unlimited sequence, as for example (k s )k∈N for any s ∈ (0, 1/2).

Therefore Assumption (H2) turns into “(βk )k∈N lies in `2 (N) \ `1 (N)”, which is a requirement widely used in the literature. The convergence analysis of Algorithms C.2 and C.3 follows from the convergence analysis of Algorithm R. 82

Another important point on Algorithm R, is that Linesearch R uses the distance function. It is clear that this is weakest than computing the exact projection for almost all instances. Furthermore, inside Linesearch R, we may only check the condition related with the distance on a selected index. We may include the following assumption: (H3) Assume that a Slater point is available, i.e. there exists a point w ∈ H such that c(w) < 0. If Assumption (H3) holds, by Lemma 1.3.11 we can obtain an explicit algorithm for a quite general convex set C, replacing the inequality dist(y k,j+1 , C) ≤ θαk in Linesearch R of Algorithm R by c˜(y k,j+1 ) ≤ θαk , where  kx − wkc(x)   if x ∈ /C    c(x) − c(w) c˜(x) =      0 if x ∈ C. All our convergence results are preserved. (H3) is a hard assumption in Hilbert spaces and the point w is almost always unavailable. Hence, such assumptions can be replaced by a rather weaker one, namely: (H3∗ ) There exists an easily computable and continuous c˜ : H → R, such that dist(x, C) ≤ c˜(x) for all x ∈ H, and c˜(x) = 0 if and only if c(x) = 0. There are examples of sets C for which no Slater point is available, while (H3∗ ) holds, including instances in which C has an empty interior. An exhaustive discussion about weak constraint qualifications for getting error-bounds can be found in [64, 77].

83

Bibliography [1] Ackooij, W.V., Bello Cruz, J.Y., de Oliveira, W. A strongly convergent proximal bundle method for convex minimization in Hilbert spaces. Optimization. doi: 10.1080/02331934.2015.1004549 (2015). [2] Alber, Ya.I.: Recurrence relations and variational inequalities. Soviet Mathematics Doklady 27 (1983) 511–517. [3] Alber, Ya.I., Iusem, A.N., Solodov, M.V.: On the projected subgradient method for nonsmooth convex optimization in a Hilbert space. Mathematical Programming 81 (1998) 23–37. [4] Attouch, H., Czarnecki, M.-O., Peypouquet, J.: Coupling forward-backward with penalty schemes and parallel splitting for constrained variational inequalities. SIAM Journal on Optimization 21 (2011) 1251–1274. [5] Aubin, J.E.: L’Analyse non lin´eaire et ses motivations ´economiques. Masson, Paris (1984) [6] Auslender, A., Teboulle, M.: Interior projection-like methods for monotone variational inequalities. Mathematical Programming 104 (2005) 39–68. [7] Baiocchi, C., Capelo, A.: Variational and Quasivariational Inequalities. Applications to Free Boundary Problems. Wiley, New York (1988). [8] Bao, T.Q., Khanh, P.Q.: A projection-type algorithm forp seudomonotone nonlipschitzian multivalued variational inequalities. Nonconvex Optimization and Its Applications 77 (2005) 113–129. [9] Bauschke, H.H., Borwein, J.M.: On projection algorithms for solving convex feasibility problems. SIAM Review 38 (1996) 367–426. [10] Bauschke, H.H., Burke, J.V., Deutsch, F.R., Hundal, H.S., Vanderwerff, J.D.: A new proximal point iteration that converges weakly but not in norm. Proceedings of the American Mathematical Society 133 (2005) 1829–1835. [11] Bauschke, H.H., Combettes, P. L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York (2011).

84

[12] Bello Cruz, J.Y.: On proximal subgradient splitting method for minimizing the sum of two nonsmooth convex functions. http://arxiv.org/pdf/1410.5477.pdf (2014). [13] Bello Cruz, J.Y.: Direct Methods for Monotone Variational Inequalities. Doctoral dissertation (2009). [14] Bello Cruz, J.Y., Iusem, A.N.: An explicit algorithm for monotone variational inequalities.Optimization 61 (2012) 855–871. [15] Bello Cruz, J.Y., Iusem, A.N.: A strongly convergent method for nonsmooth convex minimization in Hilbert spaces. Numerical Functional Analysis and Optimization 32 (2011) 1009– 1018. [16] Bello Cruz, J.Y., Iusem, A.N.: Convergence of direct methods for paramonotone variational inequalities. Computation Optimization and Applications 46 (2010) 247–263. [17] Bello Cruz, J.Y., Iusem, A.N.: Full convergence of an approximate projection method for nonsmooth variational inequalities. Mathematics and Computers in Simulation, doi: 10.1016/j.matcom.2010.05.026 (2010). [18] Bello Cruz, J.Y., Iusem, A.N.: A strongly convergent direct method for monotone variational inequalities in Hilbert spaces. Numerical Functional Analysis and Optimization 30 (2009) 23–36. [19] Bello Cruz, J.Y., D´ıaz Mill´ an, R.: A direct splitting method for nonsmooth variational inequalities. Journal Optimization Theory and Application 161 (2014) 728–737. [20] Bello Cruz, J.Y., D´ıaz Mill´ an, R.: A variant of forward-backward splitting method for the sum of two monotone operators with a new search strategy. Optimization, doi: 10.1080/02331934.2014.883510 (2014). [21] Bello Cruz, J.Y., D´ıaz Mill´ an, R.: A relaxed-projection splitting algorithm for variational inequalities in Hilbert spaces. http://arxiv.org/pdf/1312.3921.pdf (2014). [22] Bello Cruz, J.Y., D´ıaz Mill´ an, R., Phan, M. Hung.: Conditional extragradient algorithms for variational inequalities. http://arxiv.org/pdf/1411.4338.pdf (2014). [23] Bello Cruz, J.Y., Nghia, T.T.A. On the convergence of the proximal forward-backward splitting method with linesearches. http://arxiv.org/pdf/1501.02501.pdf (2015). [24] Bello Cruz, J.Y., de Oliveira. W.: Level bundle-like algorithms for convex optimization. Journal of Global Optimization 59 (2014) 787–809. [25] Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2000).

85

[26] Borwein, J.M.: Maximality of sums of two maximal monotone operators in Banach space. Proceedings of the American Mathematical Society 12 (2007) 3917–3924. [27] Bot¸, R.I., Hendrich, C.: A Douglas-Rachford type primal-dual method for solving inclusions with mixtures of composite and parallel-sum type monotone operators. SIAM Journal on Optimization 23 (2013) 2541–2565. [28] Brezis, Haim.: Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer New York Dordrecht Heidelberg London (2010). [29] Browder, F.E.: Convergence theorems for sequences of nonlinear operators in Banach spaces. Mathematische Zeitschrift 100 (1967) 201–225. [30] Burachik, R.S., Iusem, A.N.: Set-Valued Mappings and Enlargements of Monotone Operators. Springer, Berlin (2008). [31] Burachik, R.S., Lopes, J.O., Svaiter, B.F.: An outer approximation method for the variational inequality problem. SIAM Journal on Control and Optimization 43 (2005) 2071–2088. [32] Censor, Y., Gibali, A., Reich, S.: The subgradient extragradient method for solving variational inequalities in Hilbert space. Journal of Optimization Theory and Applications 148 (2011) 318–335. [33] Combettes, P.L., Pesquet, J.-C.: Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued and Variational Analysis 20 (2012) 307–330. [34] Combettes, P.L., Pesquet, J.C.: Proximal Splitting Methods in Signal Processing. FixedPoint Algorithms for Inverse Problems in Science and Engineering Springer Optimization and Its Applications 10 (2011) 185–212. [35] d’Antonio, G., Frangioni, A.: Convergence analysis of deflected conditional approximate subgradient methods. SIAM Journal on Optimization 20 (2009) 357–386. [36] Douglas, J., Rachford, H.H.: On the numerical solution of heat conduction problems in two or three space variables. Transactions of the American Mathematical Society 82 (1956) 421–439. [37] Eckstein, J.: Splitting Methods for Monotone Operators, with Applications to Parallel Optimization. PhD thesis, Massachusetts Institute of Techonology, Cambridge, MA, 1989. Report LIDS-TH-1877, Laboratory for Information and Decision Systems, M.I.T. [38] Eckstein, J., Svaiter, B.F.: General projective splitting methods for sums of maximal monotone operators. SIAM Journal on Control and Optimization 48 (2009) 787–811.

86

[39] Eckstein, J., Svaiter, B.F.: A family of projective splitting methods for the sum of two maximal monotone operators. Mathematical Programming 111 (2008) 173–199. [40] Ermoliev, Yu.M.: On the method of generalized stochastic gradients and quasi-Fej´er sequences. Cybernetics and Systems Analysis 5 (1969) 208–220. [41] Facchinei, F., Pang, J.S.: Finite-dimensional Variational Inequalities and Complementarity Problems. Springer, Berlin (2003). [42] Fang, S.C., Petersen, E.L.: Generalized variational inequalities. Journal of Optimization Theory and Applications 38 (1982) 363–383. [43] Ferris, M.C., Pang, J.S.: Engineering and economic applications of complementarity problems. SIAM Review 39 (1997) 669–713. [44] Fukushima, M.: A Relaxed projection for variational inequalities. Mathematical Programming 35 (1986) 58–70. [45] D. Gabay,: Applications of the method of multipliers to variational inequalities, Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary Value Problems, M. Fortin and R. Glowinski, eds., North-Holland, Amsterdam (1983) 299-331. [46] Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Computers & Mathematics with Applications 2 (1976) 17– 40. [47] Harker, P.T., Pang, J.S.: Finite dimensional variational inequalities and nonlinear complementarity problems: a survey of theory, algorithms and applications. Mathematical Programming 48 (1990) 161–220. [48] Hartman, P., Stampacchia, G.: On some non-linear elliptic differential-functional equations. Acta Mathematica 115 (1966) 271–310. [49] He, B.S.: A new method for a class of variational inequalities. Mathematical Programming 66 (1994) 137–144. [50] Iusem, A.N.: On some properties of paramonotone operators. Journal of Convex Analysis 5 (1998) 269–278. [51] Iusem, A.N.: An iterative algorithm for the variational inequality problem. Computational and Applied Mathematics 13 (1994) 103–114. [52] Iusem, A.N., Lucambio P´erez, L.R.: An extragradient-type method for non-smooth variational inequalities. Optimization 48 (2000) 309–332.

87

[53] Iusem, A.N., Svaiter, B.F.: A variant of Korpelevich’s method for variational inequalities with a new search strategy. Optimization 42 (1997) 309–321. [54] Iusem, A.N., Svaiter, B.F., Teboulle, M.: Entropy-like proximal methods in convex programming. Mathematics of Operations Research 19 (1994) 790–814. [55] Kinderlehrer, D., Stampacchia, G.: An Introduction to Variational Inequalities and Their Applications. Academic Press, New York (1980). [56] Khobotov, E.N.: Modifications of the extragradient method for solving variational inequalities and certain optimization problems. USSR Computational Mathematics and Mathematical Physics 27 (1987) 120–127. [57] Konnov, I.V.: Combined Relaxation Methods for Variational Inequalities. Lecture Notes in Economics and Mathematical Systems. 495 Springer-Velarg, Berlin (2001). [58] Konnov, I.V.: A combined relaxation method for variational inequalities with nonlinear constraints. Mathematical Programming 80 (1998) 239–252. [59] Konnov, I.V.: A class of combined iterative methods for solving variational inequalities. Journal of Optimization Theory and Applications 94 (1997) 677–693. [60] Konnov, I.V.: Splitting-type method for systems of variational inequalities. Computers & Operations Research 33 (2006) 520–534. [61] Korpelevich, G.M.: The extragradient method for finding saddle points and other problems. Ekonomika i Matematcheskie Metody 12 (1976) 747–756. [62] Lassonde, M., Nagesseur, L.: Extended forward-backward agorithm. Journal of Mathematical Analysis and Applications 403 (2013) 167–172. [63] Larson, T., Patriksson, M., Stromberg, A-B.: Conditional subgradient optimization - Theory and application. European Journal of Operational Research 88 (1996) 382–403. [64] Lewis, A. S., Pang, J.-S.: Error bounds for convex inequality systems. Generalized convexity, generalized monotonicity: recent results: Nonconvex Optimization and Its Applications 27 75–110 Kluwer Academic Publishers, Dordrecht (1998). [65] Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM Journal of Numererical Analysis 16 (1979) 964–979. [66] B. Mercier,: In´equations Variationnelles de la M´ecanique, Publ. Math. Orsay, 80.01 Universit´e de Paris-Sud, Orsay, 1980. [67] Minty, G.: Monotone (nonlinear) operators in Hilbert Space. Duke Mathetematical Journal 29 (1962) 341–346.

88

[68] Minty, G.: On the maximal domain of a “monotone” function. Michigan Mathematical Journal 8 (1961) 135–137. [69] Moudafi, A.: On the convergence of splitting proximal methods for equilibrium problems in Hilbert spaces. Journal of Mathematical Analysis and Applications 359 (2009) 508–513. [70] Nedic, A., Bertsekas, D.: Incremental subgradient methods for nondifferentiable optimization. SIAM Journal on Optimization 12 (2001) 109–138. [71] Passty, G.B.: Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. Journal of Mathematical Analysis and Applications 72 (1979) 383–390. [72] Polyak, B.T.: Minimization of unsmooth functionals. USSR Computational Mathematics and Mathematical Physics 9 (1969) 14–29. [73] Rockafellar, R.T.: Convex Analysis. Princeton, New York (1970). [74] Rockafellar, R.T.: On the maximality of sums of nonlinear monotone operators. Transactions of the American Mathematical Society 149 (1970) 75–88. [75] Rockafellar, R.T., Wets, R.J-B.: Variational Analysis. Springer, Berlin (1998). [76] Shih, M.H., Tan, K.K.: Browder-Hartmann-Stampacchia variational inequalities for multivalued monotone operators. Journal of Mathematical Analysis and Applications 134 (1988) 431–440. [77] Sien, D.: Computable error bounds for convex inequality systems in reflexive Banach spaces. SIAM Journal on Optimization 1 (1997) 274–279. [78] Solodov, M.V., Svaiter, B.F.: Forcing strong convergence of proximal point iterations in a Hilbert space. Mathematical Programming 87 (2000) 189–202. [79] Solodov, M.V., Svaiter, B.F.: A hybrid approximate Extragradient-Proximal Point Algorithm using the enlargement of a maximal monotone operator. Set-Valued Analysis 7 (1999) 323– 345. [80] Solodov, M.V., Svaiter, B.F.: A new projection method for monotone variational inequality problems. SIAM Journal on Control and Optimization 37 (1999) 765–776. [81] Solodov, M.V., Tseng, P.: Modified projection-type methods for monotone variational inequalities. SIAM Journal on Control and Optimization 34 (1996) 1814–1830. [82] Todd, M.J.: The Computations of Fixed Points and Applications. Springer, Berlin (1976). [83] Tseng, P.: A modified forward-backward splitting method for maximal monotone mappings. SIAM on Journal Control Optimization 38 (2000) 431–446.

89

[84] Zaraytonelo, E.H.: Projections on Convex Sets in Hilbert Space and Spectral Theory. in Contributions to Nonlinear Functional Analysis, E. Zaraytonelo, Academic Press, New York (1971) 237–424. [85] Zhang, H., Cheng, L.: Projective splitting methods for sums of maximal monotone operators with applications. Journal of Mathematical Analysis and Applications 406 (2013) 323–334.

90

Suggest Documents