Second subdifferentials of $C^{1,1}$ functions, optimality conditions and well posedness ∗
Pando Gr. Georgiev, Nadia P. Zlateva
University of Sofia, Department of Mathematics and Informatics, 5 James Bourchier Blvd., 1126 Sofia, BULGARIA
J.B. Hiriart-Urruty, J.J. Strodiot and V.H. Nguyen [6] considered the properties of the generalized Hessian matrices of $C^{1,1}$ functions in $\mathbb{R}^n$ and proved necessary conditions for minimization problems with $C^{1,1}$ data. In this note we extend their results to the infinite dimensional case. We also obtain necessary conditions and sufficient conditions for minimization problems which cannot be proved by the methods of [6]. Our approach to optimality conditions is based on the method of V.M. Alexeev, V.M. Tykhomirov and S.V. Fomin [1].

Let $E$ be a separable Banach space with separable dual $E^*$. Consider the class $C^{1,1}(G)$ of all functions $f : G \to \mathbb{R}$, $G \subset E$ open, whose first derivatives are locally Lipschitz (then $f$ is strictly differentiable and hence Fréchet differentiable, see [2], p. 32). By a theorem of D. Yost ([5], Theorem 22), for every $f \in C^{1,1}(G)$ the derivative $f'$ is Gateaux differentiable on a dense subset $G(f)$ of $G$ (we shall say that $f$ is twice Gateaux differentiable on $G(f)$). For simplicity we shall write $C^{1,1}$ instead of $C^{1,1}(G)$.

∗ The research was partially supported by the Bulgarian Ministry of Education and Science under contract number MM-213/1992.
Denote by $L(E \times E)$ the Banach space of all continuous bilinear functionals on $E \times E$ with the norm
\[ \|L\| = \sup_{\|h_1\|=1,\ \|h_2\|=1} |L[h_1,h_2]|, \qquad L \in L(E \times E). \]
Let $L(E, E^*)$ be the Banach space of all continuous linear mappings $L : E \to E^*$ with the norm
\[ \|L\| = \sup_{\|x\|=1} \|L(x)\|_*, \qquad L \in L(E, E^*), \]
where $\|\cdot\|_*$ is the norm in $E^*$. It is well known that $L(E \times E)$ and $L(E, E^*)$ are linearly isomorphic (see [1], Section 2.2.5), so in the sequel we identify them. $L(E, E^*)$ is a conjugate space (see Holmes [4], p. 215); its $w^*$-topology is called the weak$^*$-operator topology. A sequence $\{L_n\} \subset L(E \times E)$ converges in the $w^*$-topology to some $L_0 \in L(E \times E)$ iff $L_n[h_1,h_2] \to L_0[h_1,h_2]$ for every $h_1, h_2 \in E$. The predual of $L(E, E^*)$ is the linear hull $V$ of all functionals $l_{h_1,h_2} \in (L(E, E^*))^*$, $h_1, h_2 \in E$, of the form $l_{h_1,h_2}(L) := L[h_1,h_2]$, with the norm of $(L(E, E^*))^*$. Since $E$ is separable, $V$ is a separable normed space, so every $w^*$-compact subset of $L(E, E^*)$ is metrizable (see Holmes [4], Section 12F).

There are two natural ways to define second subdifferentials of $f \in C^{1,1}$ (by analogy with Clarke's first-order subdifferential, see [2], p. 27 and Theorem 2.8.6). The first one is
\[ \partial_C^2 f(x) := \{ L \in L(E \times E) : L[h_1,h_2] \le f''(x; h_1, h_2) \ \ \forall h_1, h_2 \in E \}, \]
where
\[ f''(x; h_1, h_2) := \limsup_{\substack{y \to x \\ t \downarrow 0}} \frac{f'(y + t h_1)[h_2] - f'(y)[h_2]}{t}. \]
The second definition is
\[ \partial^2 f(x) := \mathrm{co}^* \{ L \in L(E, E^*) : L = w^*\text{-}\lim_{G(f) \ni z \to x} f''(z) \}, \]
where $f''(z)$ denotes the Gateaux derivative of $f'$ at $z$. In the case $E = \mathbb{R}^n$ this definition was considered and used by J.B. Hiriart-Urruty, J.J. Strodiot and V.H. Nguyen [6] (there $\partial^2 f(x)$ was called the generalized Hessian matrix). They used Rademacher's theorem (instead of Yost's), by which $f$ is twice Fréchet differentiable almost everywhere. Therefore $f''(z)$ (when it exists) is a symmetric matrix (see [1], Section 2.2.5), and hence $\partial^2 f(x)$ consists of symmetric matrices. In our infinite dimensional case we cannot say that $\partial^2 f(x)$ consists of symmetric bilinear functionals. In $\mathbb{R}^n$, $\partial_C^2 f(x)$ is in fact the plenary hull of $\partial^2 f(x)$ in the terminology of [7].

We shall prove that $\partial^2 f(x)$ is nonempty for every $x$. Let $x_n \in G(f)$ and $x_n \to x$. Since the set $D := \{ f''(x_n) : n > \nu \}$ is norm bounded in $L(E \times E)$ for some $\nu$ (by the Lipschitz constant of $f'$ near $x$), we can apply the Alaoglu-Bourbaki theorem, which together with the metrizability of $w^*$-compact sets yields a $w^*$-convergent subsequence of $D$; its limit belongs to $\partial^2 f(x)$.

It is easy to prove that $f''(\cdot\,; h_1, h_2)$ is upper semicontinuous for every $h_1, h_2 \in E$. By this fact we obtain that $\partial^2 f(x) \subset \partial_C^2 f(x)$ for every $x \in G$. If we define
\[ d^2 f(x; h_1, h_2) = \limsup_{G(f) \ni z \to x} f''(z)[h_1, h_2], \]
then we obtain that $d^2 f(\cdot\,; h_1, h_2)$ is upper semicontinuous for every $h_1, h_2 \in E$ and
\[ d^2 f(x; h_1, h_2) = \sup \{ L[h_1, h_2] : L \in \partial^2 f(x) \}. \]
The equality $f''(x; h_1, h_2) = d^2 f(x; h_1, h_2)$ was proved in $\mathbb{R}^n$ by J.B. Hiriart-Urruty ([7], Theorem 2.1; see also [6], Theorem 2.1). In our case we also prove this equality, but the proof relies essentially on the properties of null sets from the paper of D. Yost [5] (Theorems 19 and 22).

The following properties of $\partial^2 f$, considered by J.B. Hiriart-Urruty, J.J. Strodiot and V.H. Nguyen [6] in $\mathbb{R}^n$, are also valid in this case:
1) $\partial^2 f(x)$ is a nonempty $w^*$-compact convex set for every $x \in G$.
2) the map $x \to \partial^2 f(x)$ has $w^*$-closed graph and is locally bounded.
3) the map $x \to \partial^2 f(x)$ is $w^*$-u.s.c.
Properties 1), 2) and 3) remain valid if we replace $\partial^2 f$ by $\partial_C^2 f$.

An open question is when the above two types of subdifferentials coincide. We obtain that if $\partial^2 f$ is single-valued, then $\partial_C^2 f$ is also single-valued, and that $\partial^2 f$ is single-valued if and only if $f$ is second-order strictly Gateaux differentiable (i.e. there exists $f''(x) \in L(E \times E)$ such that
\[ \lim_{\substack{y \to x \\ t \downarrow 0}} \frac{f'(y + t h_1)[h_2] - f'(y)[h_2]}{t} = f''(x)[h_1, h_2] \]
for every $h_1, h_2 \in E$).
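As an illustration of the two definitions, both subdifferentials can be computed explicitly in one dimension. The sketch below is not taken from the paper; the function $f(x) = \tfrac12 x|x|$ on $E = \mathbb{R}$ is an assumption chosen only as an example, and bilinear functionals on $\mathbb{R} \times \mathbb{R}$ are identified with real numbers $L$.

```latex
\begin{align*}
f(x) &= \tfrac12\, x|x|, \qquad f'(x) = |x|, \qquad
        f''(z) = \operatorname{sign}(z) \quad (z \neq 0),\\
\partial^2 f(0) &= \mathrm{co}\{-1,\, 1\} = [-1, 1],\\
f''(0; h_1, h_2) &= \limsup_{\substack{y \to 0 \\ t \downarrow 0}}
        \frac{\bigl(|y + t h_1| - |y|\bigr)\, h_2}{t} = |h_1 h_2|,\\
\partial_C^2 f(0) &= \{ L \in \mathbb{R} : L\, h_1 h_2 \le |h_1 h_2|
        \ \ \forall h_1, h_2 \in \mathbb{R} \} = [-1, 1],\\
d^2 f(0; h_1, h_2) &= \limsup_{z \to 0,\ z \neq 0}
        \operatorname{sign}(z)\, h_1 h_2 = |h_1 h_2|
        = \sup\{ L\, h_1 h_2 : L \in \partial^2 f(0) \}.
\end{align*}
```

In this example the two sets coincide and neither is single-valued at $0$, in agreement with the fact that $f'$ is not strictly differentiable there; at every $x \neq 0$ both reduce to $\{\operatorname{sign}(x)\}$.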
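The second subdifferential $\partial^2 f$ defined above makes it possible to state and prove second order optimality conditions, of a type similar to those in [1], for the problem
\[ P(E): \qquad f_0(x) \to \min \quad \text{subject to} \quad x \in E, \quad f_i(x) \le 0,\ 1 \le i \le m, \quad g_j(x) = 0,\ 1 \le j \le k, \]
where $f_0$, $f_i$, $1 \le i \le m$, and $g_j$, $1 \le j \le k$, are $C^{1,1}$ functions. Denote
\[ F(x) = (g_1(x), g_2(x), \ldots, g_k(x))^T, \qquad \lambda = (\lambda_0, \ldots, \lambda_m), \qquad \mu = (\mu_1, \ldots, \mu_k), \]
\[ L(x; \lambda, \mu) = \sum_{i=0}^{m} \lambda_i f_i(x) + \sum_{j=1}^{k} \mu_j g_j(x), \]
\[ \Lambda(x) = \Bigl\{ (\lambda, \mu) \in \mathbb{R}^{m+1} \times \mathbb{R}^k : \lambda_i \ge 0,\ 0 \le i \le m,\ \sum_{i=0}^{m} \lambda_i = 1,\ \sum_{i=0}^{m} \lambda_i f_i'(x) + \sum_{j=1}^{k} \mu_j g_j'(x) = 0 \Bigr\}, \]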
\[ K(x) = \{ h \in E : f_i'(x)[h] \le 0,\ 0 \le i \le m,\ g_j'(x)[h] = 0,\ 1 \le j \le k \}. \]

Theorem 1 Let $\operatorname{Im} F'(x_0) = \mathbb{R}^k$. If $x_0$ is a local minimum of $P(E)$, then for every $h \in K(x_0)$ there exist bilinear functionals $L_i \in \partial^2 f_i(x_0)$, $0 \le i \le m$, and $M_j \in \partial^2 g_j(x_0)$, $1 \le j \le k$, such that
\[ \max_{(\lambda, \mu) \in \Lambda(x_0)} \Bigl( \sum_{i=0}^{m} \lambda_i L_i + \sum_{j=1}^{k} \mu_j M_j \Bigr)[h, h] \ge 0. \]
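The following sketch checks the condition of Theorem 1 on a simple problem in $E = \mathbb{R}^2$ with $m = 0$ and $k = 1$. It is an illustration constructed here, not taken from the paper; the data $f_0(x, y) = \tfrac12 x|x|$ and $g_1(x, y) = y$ are assumptions made for the example. It suggests that the condition is necessary but not sufficient, since it holds at a point that is not a local minimum.

```latex
\begin{align*}
& f_0(x, y) = \tfrac12\, x|x|, \qquad g_1(x, y) = y, \qquad x_0 = (0, 0),\\
& f_0'(x_0) = (0, 0), \qquad F'(x_0) = g_1'(x_0) = (0, 1), \qquad
  \operatorname{Im} F'(x_0) = \mathbb{R},\\
& \Lambda(x_0) = \{ (\lambda_0, \mu_1) = (1, 0) \}, \qquad
  K(x_0) = \{ (h_1, 0) : h_1 \in \mathbb{R} \},\\
& \partial^2 f_0(x_0) = \Bigl\{ \begin{pmatrix} s & 0 \\ 0 & 0 \end{pmatrix}
  : s \in [-1, 1] \Bigr\}, \qquad \partial^2 g_1(x_0) = \{ 0 \},\\
& \text{for } h = (h_1, 0): \quad
  (\lambda_0 L_0 + \mu_1 M_1)[h, h] = s\, h_1^2 \ge 0
  \quad \text{with the choice } s = 1.
\end{align*}
```

The inequality of Theorem 1 can therefore be satisfied for every $h \in K(x_0)$, although $f_0(x, 0) = -\tfrac12 x^2 < 0$ for $x < 0$, so $x_0$ is not a local minimum of this problem. The choice $s = -1$ gives $-h_1^2 < 0$ for $h_1 \neq 0$, which indicates why a sufficient condition has to control every element of the second subdifferentials.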
Theorem 2 Let in $P(E)$ $f_i(x_0) = 0$, $0 \le i \le m$, $\operatorname{Im} F'(x_0) = \mathbb{R}^k$, $\Lambda(x_0) \neq \emptyset$, and let there exist a constant $\alpha > 0$ such that for every $L_i \in \partial^2 f_i(x_0)$, $0 \le i \le m$, and $M_j \in \partial^2 g_j(x_0)$, $1 \le j \le k$, we have
\[ \max_{(\lambda, \mu) \in \Lambda(x_0)} \Bigl( \sum_{i=0}^{m} \lambda_i L_i[h, h] + \sum_{j=1}^{k} \mu_j M_j[h, h] \Bigr) \ge 2\alpha \|h\|^2 \qquad \forall h \in K(x_0). \]
Then $x_0$ is a strict local minimum of the problem $P(X)$ for every finite dimensional subspace $X \ni x_0$ of $E$ (here $P(X)$ denotes the problem $P(E)$ with $x \in X$ in place of $x \in E$). If, in addition, the functions $f_i$, $0 \le i \le m$, are convex and the functions $g_j$, $1 \le j \le k$, are affine, then the problem $P(X)$ has the unique solution $x_0$ and hence is Tykhonov well posed. Also in this case the problem $P(E)$ has the unique solution $x_0$.

The following sufficient condition is a modification of the one in [1, Section 3.4.3]. The regularity condition $\operatorname{Im} F'(x_0) = \mathbb{R}^k$ is removed, but the conditions $\lambda_0 = 1$, $\lambda_i > 0$, $1 \le i \le m$, are imposed.

Theorem 3 Let in $P(E)$ $f_i(x_0) = 0$, $1 \le i \le m$, and let there exist a number $\alpha > 0$ and Lagrange multipliers $(\lambda, \mu) \in \mathbb{R}^{m+1} \times \mathbb{R}^k$ such that $\lambda_0 = 1$, $\lambda_i > 0$, $1 \le i \le m$,
\[ L'_x(x_0; \lambda, \mu) = f_0'(x_0) + \sum_{i=1}^{m} \lambda_i f_i'(x_0) + \sum_{j=1}^{k} \mu_j g_j'(x_0) = 0 \]
and for all bilinear functionals $B \in \partial^2 L(x_0; \lambda, \mu)$
\[ B[h, h] \ge 2\alpha \|h\|^2 \qquad \forall h \in C(x_0), \]
where $C(x_0) = \{ h \in E : f_i'(x_0)[h] = 0,\ 1 \le i \le m,\ F'(x_0)[h] = 0 \}$. Then $x_0$ is a local minimum of the problem $P(X)$ for every finite dimensional subspace $X \ni x_0$.

Theorem 4 Let in $P(E)$ the functions $f_i$, $0 \le i \le m$, be convex and twice Fréchet differentiable at $x_0$, $f_i(x_0) = 0$, $1 \le i \le m$, let $g_j$ be affine functions, $g_j(x) = l_j(x) + b_j$ with $l_j \in E^*$, $1 \le j \le k$, linearly independent, and for some $\alpha > 0$ let there exist $(\lambda, \mu) \in \Lambda(x_0)$ such that
\[ L''_{xx}(x_0; \lambda, \mu)[h, h] \ge \alpha \|h\|^2 \qquad \forall h \in K_1(x_0), \]
where $K_1(x_0) = \{ h \in E : f_i'(x_0)[h] \le 0,\ 1 \le i \le m,\ l_j[h] = 0,\ 1 \le j \le k \}$. Then the problem $P(E)$ is Tykhonov well-posed with solution $x_0$.

Proof. Since $K_1(x_0) \supset K(x_0)$ and $\operatorname{Im} F'(x_0) = \mathbb{R}^k$ (by the linear independence of the $l_j$), by [3, Section 10.1.1] we obtain that $x_0$ is a local minimum of the problem $P(E)$. Let $\{x_n\}$ be a minimizing sequence of $P(E)$, i.e. the $x_n$ are admissible points and $f_0(x_n) \to f_0(x_0)$. Since $l_j(x_n) = l_j(x_0)$ for $1 \le j \le k$, $f_i(x_0) = 0$ and $0 \ge f_i(x_n) - f_i(x_0) \ge f_i'(x_0)[x_n - x_0]$ for $1 \le i \le m$, we have that $x_n - x_0 \in K_1(x_0)$ for every $n$.
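By the Taylor expansion of $L$, having in mind that $L'_x(x_0; \lambda, \mu) = 0$ (by the condition $(\lambda, \mu) \in \Lambda(x_0)$), for every $z \in E$ we have
\[ L(z; \lambda, \mu) - L(x_0; \lambda, \mu) = \tfrac12 L''_{xx}(x_0; \lambda, \mu)[z - x_0, z - x_0] + o(\|z - x_0\|^2). \]
There exists $\delta > 0$ such that if $\|z - x_0\| < \delta$ then $|o(\|z - x_0\|^2)| / \|z - x_0\|^2 < \alpha/4$. Let $0 < \delta' < \delta$. If we suppose that $\|x_n - x_0\| > \delta'$ for infinitely many $n$, then for such $n$ there exists $\eta_n \in (0, 1)$ such that $\eta_n \|x_n - x_0\| = \delta'$, and since $L(\cdot\,; \lambda, \mu)$ is convex,
\[ \begin{aligned}
\lambda_0 (f_0(x_n) - f_0(x_0)) &\ge L(x_n; \lambda, \mu) - L(x_0; \lambda, \mu) \\
&\ge \frac{1}{\eta_n} \bigl( L(x_0 + \eta_n (x_n - x_0); \lambda, \mu) - L(x_0; \lambda, \mu) \bigr) \\
&= \frac{1}{\eta_n} \Bigl( \tfrac12 L''_{xx}(x_0; \lambda, \mu)[\eta_n (x_n - x_0), \eta_n (x_n - x_0)] + o(\eta_n^2 \|x_n - x_0\|^2) \Bigr) \\
&\ge \eta_n \|x_n - x_0\|^2 \, \frac{\alpha}{4} = \frac{\delta' \alpha}{4} \|x_n - x_0\| > \frac{\delta'^2 \alpha}{4}.
\end{aligned} \]
Hence, for large $n$ we obtain a contradiction with $f_0(x_n) \to f_0(x_0)$; therefore $\|x_n - x_0\| > \delta'$ only for finitely many $n$. Since $\delta' \in (0, \delta)$ is arbitrary, $x_n \to x_0$, which completes the proof.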
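To illustrate the sufficient condition of Theorem 2 on data that are $C^{1,1}$ but not twice differentiable, the following sketch may help. It is constructed here as an illustration and is not taken from the paper; the data $f_0(x, y) = x^2 + \tfrac12 x|x|$ and $g_1(x, y) = y$ on $E = \mathbb{R}^2$ (with the Euclidean norm, $m = 0$, $k = 1$, $x_0 = (0, 0)$) are assumptions made for the example.

```latex
\begin{align*}
& f_0(x_0) = 0, \qquad f_0'(x, y) = (2x + |x|,\, 0), \qquad
  f_0'(x_0) = (0, 0), \qquad \operatorname{Im} F'(x_0) = \mathbb{R},\\
& \Lambda(x_0) = \{ (\lambda_0, \mu_1) = (1, 0) \}, \qquad
  K(x_0) = \{ (h_1, 0) : h_1 \in \mathbb{R} \},\\
& \partial^2 f_0(x_0) = \Bigl\{ \begin{pmatrix} s & 0 \\ 0 & 0 \end{pmatrix}
  : s \in [1, 3] \Bigr\}, \qquad \partial^2 g_1(x_0) = \{ 0 \},\\
& \text{for } h = (h_1, 0) \text{ and every } s \in [1, 3]: \quad
  (\lambda_0 L_0 + \mu_1 M_1)[h, h] = s\, h_1^2 \ge h_1^2
  = 2\alpha \|h\|^2 \quad \text{with } \alpha = \tfrac12.
\end{align*}
```

On the feasible set $\{ y = 0 \}$ one has $f_0(x, 0) = x^2 + \tfrac12 x|x| \ge \tfrac12 x^2$, so $x_0$ is indeed a strict minimum and every feasible sequence with $f_0(x_n, 0) \to 0$ converges to $x_0$. Since $f_0$ is convex and $g_1$ is affine, this matches the well-posedness conclusion of Theorem 2, while Theorem 4 does not apply here because $f_0$ is not twice differentiable at $x_0$.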
References
[1] Alexeev V.M., V.M. Tykhomirov, S.V. Fomin, Optimal Control, Nauka, 1979 [Russian].
[2] Clarke F.H., Optimization and Nonsmooth Analysis, John Wiley and Sons, 1983.
[3] Galeev E.M., V.M. Tykhomirov, A brief course of theory of extremal problems, 1989.
[4] Holmes R.B., Geometric Functional Analysis and its Applications, Graduate Texts in Mathematics 24, Springer-Verlag, 1975.
[5] Yost D., "If every doughnut is a teacup, then every Banach space is a Hilbert space", Universidad de Murcia, Notas de matematica, vol. 1, Seminar on functional analysis, 1987.
[6] Hiriart-Urruty J.B., J.J. Strodiot, V.H. Nguyen, "Generalized Hessian Matrix and Second-Order Optimality Conditions for Problems with $C^{1,1}$ Data", Appl. Math. Optim. 11:43-56 (1984).
[7] Hiriart-Urruty J.B., Characterizations of the plenary hull of the generalized Jacobian matrix, Math. Programming Study, 17 (1982), 1-12.