ON CONSTRAINT QUALIFICATION FOR AN

0 downloads 0 Views 250KB Size Report
c 2005 Society for Industrial and Applied Mathematics. Vol. ... math.cuhk.edu.hk). This author was supported by a Direct Grant (CUHK) and an Earmarked Grant.
c 2005 Society for Industrial and Applied Mathematics 

SIAM J. OPTIM. Vol. 15, No. 2, pp. 488–512

ON CONSTRAINT QUALIFICATION FOR AN INFINITE SYSTEM OF CONVEX INEQUALITIES IN A BANACH SPACE∗ CHONG LI† AND K. F. NG‡ Abstract. For a general infinite system of convex inequalities in a Banach space, we study the basic constraint qualification and its relationship with other fundamental concepts, including various versions of conditions of Slater type, the Mangasarian–Fromovitz constraint qualification, as well as the Pshenichnyi–Levin–Valadier property introduced by Li, Nahak, and Singer. Applications are given in the restricted range approximation problem, constrained optimization problems, as well as in the approximation problem with constraints by conditionally positive semidefinite functions. Key words. system of convex inequalities, basic constraint qualification, Slater condition, MFCQ, PLV property, optimality condition, best approximation AMS subject classifications. Primary, 90C34; Secondary, 90C46, 41A29 DOI. 10.1137/S1052623403434693

1. Introduction. Let X be a Banach space over the real field R, C be a closed convex subset of X, and let {gi : i ∈ I} be a family of continuous convex functions on X, where I is an arbitrary (but nonempty) index-set. We consider the following system of convex inequalities: gi (x) ≤ 0,

(1.1)

i ∈ I,

and the associated system for the sup-function G, G(x) ≤ 0,

(1.2) where G is defined by (1.3)

G(x) := sup gi (x)

for all x ∈ X.

i∈I

In this paper, we always assume that the sup-function G(.) is continuous on X. It is clear that if X is of finite dimension and G(x) is finite for each x ∈ X, or if {gi : i ∈ I} is locally uniformly bounded, then the condition that G(.) is continuous on X is automatically satisfied. Throughout we use S to denote the solution set of (1.1) (equivalently, of (1.2)): (1.4)

S := {x ∈ X : gi (x) ≤ 0

for all i ∈ I} = {x ∈ X : G(x) ≤ 0}.

Let (1.5)

K := C ∩ S;

∗ Received by the editors September 17, 2003; accepted for publication (in revised form) June 28, 2004; published electronically February 19, 2005. http://www.siam.org/journals/siopt/15-2/43469.html † Department of Mathematics, Zhejiang University, Hangzhou 310027, P. R. China ([email protected]. cn). This author was supported in part by National Natural Science Foundation of China grant 10271025. ‡ Department of Mathematics, Chinese University of Hong Kong, Hong Kong, P. R. China (kfng@ math.cuhk.edu.hk). This author was supported by a Direct Grant (CUHK) and an Earmarked Grant from the Research Grant Council of Hong Kong.

488

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

489

that is, K is the solution set of (1.1) with the constraint (1.6)

x ∈ C.

Many important problems in mathematical programming are based on relationships between the normal cone NK (x) in relation to NC (x) and the subdifferentials of G, gi with i ∈ I(x). Here x ∈ X, and I(x) is the set of “active indices” at x defined by (1.7)

I(x) := {i ∈ I : gi (x) = G(x)}.

In the classical case (that is, when I is finite, X is a Euclidean space, and each gi is differentiable), many fundamental notions of so-called constraint qualifications for (1.1) have been introduced, such as the basic constraint qualification (BCQ), the Slater condition (SC), and the Mangasarian–Fromovitz constraint qualification (MFCQ); their interrelationships are well known and play a very important part not only in optimization problems but also in many other areas, such as best approximation theory. A big step forward from the classical theory was recently achieved by Li, Nahak, and Singer in [16], where detailed studies on the BCQ and the SC have been made for the case when the index-set I is not restricted to be finite. They introduced and studied the Pshenichnyi–Levin–Valadier (PLV) property and the weak PLV property; in particular, they made use of these properties in extending the classical theory to the case when I is infinite. Another direction of extension was done in [12, 13], where I was assumed to be finite but X was allowed to be an infinite dimensional Hilbert space; moreover, the constraint (1.6) was considered. In this paper, we propose to study the general case: X is a Banach space, C is a closed convex subset, and I is an arbitrary set. In this general case we introduce and study various notions of constraint qualifications. In particular, in the stated general setting, the notions of the BCQ and the PLV are introduced in section 2. We show in Theorem 2.4 that, in the presence of the PLV, (1.1) satisfies the BCQ if and only if (1.2) satisfies the BCQ. Our main results are presented in section 3, where we establish some sufficient conditions to ensure the BCQ. Along with the extended notions of the SC and the MFCQ, the weak versions (weak SC and weak MFCQ) are also introduced. Some of their relationships are established. We show in particular that, under the weak SC, the PLV implies the BCQ for (1.1). We remark that many of the results obtained in this paper, for example, Theorems 3.8 and 3.10, are new even for the case when C = X is a finite dimensional space. In particular, Theorem 3.10 provides a set of conditions ensuring that the following implication holds: Each finite subsystem of (1.1) satisfies the BCQ=⇒the system (1.1) does. Meanwhile, Examples 3.2, 3.3, and 3.4 show that the above implication is no longer true if any of the conditions in the theorem are dropped. The last three sections of the paper are devoted to applications of the main results. In section 4, we study the problem of minimizing a real-valued function f (x) subjected to x ∈ C ∩S and we show that, for an optimal point x0 , the BCQ of (1.1) relative to C at x0 is closely related to the KKT property. In section 6, we address an approximation problem studied by H. Strauss. We give a counterexample to show that [22, Theorem 6.2] is not true, and we provide a new version which is established under the added assumption of the BCQ. One of the reasons why we study the proposed general case instead of the more restrictive case is that there are many approximation and optimization problems which

490

CHONG LI AND K. F. NG

are in the general Banach space setting with infinitely many convex constraints. For example, it was shown in [14] that the restricted Chebyshev approximation problem (see [11, 21] and references therein) in the Banach space of complex-valued continuous functions on a compact Hausdorff space can be reformulated as a constrained approximation problem with a system of convex inequalities. Section 5 is devoted to another application dealing with the best restricted range approximation problem in Lp space studied by Levasseur and Levis: a main result from [10] is extended and improved under much weaker conditions. 2. Preliminaries, the BCQ, and the PLV property. For a set Z in X (or in Rn ), ext Z denotes the set of all extreme points of Z and the interior (resp., relative interior, closure, convex hull, convex cone hull, linear hull, negative polar, boundary) of Z is defined by int Z (resp., ri Z, Z, conv Z, cone Z, span Z, Z  , bd Z). We use |Z|(≤ ∞) to denote the number of elements of Z and use dim Z to denote the dimension of span (Z − z), where z is an element of Z. For x ∈ X, in the case when Z is closed and convex, the normal cone of Z at x is denoted by NZ (x) and defined by NZ (x) = (Z − x) . R− denotes the subset of R consisting of all nonpositive real numbers. For a proper convex function on X and x ∈ X, the subdifferential of f at x is defined by ∂f (x) := {z ∗ ∈ X ∗ : f (x) + z ∗ , y − x ≤ f (y)

for all y ∈ X}.

In particular, NZ (z) = ∂IZ (z) for each z ∈ Z. Here and throughout, IZ denotes the / Z. indicator function of Z: IZ (x) = 0 if x ∈ Z and IZ (x) = +∞ if x ∈ Recall that the directional derivative f  (x, d) of the function f at x in the direction d is defined by (2.1)

f  (x, d) = lim

t→0+

f (x + td) − f (x) . t

Then (cf. [6, Proposition 2.2.7]) if f is convex and if x is a continuity point of f , (2.2)

∂f (x) = {z ∗ ∈ X ∗ : z ∗ , d ≤ f  (x, d)

for all d ∈ X}

and (2.3)

f  (x, d) = max{z ∗ , d : z ∗ ∈ ∂f (x)}.

Remark 2.1. For a continuous convex function f on X, it is easy to show that cone ∂f (x) ⊆ Nf −1 (R− ) (x) if f (x) = 0 and that the equality holds in both of the following cases: (a) f is affine, (b) x is not a minimizer of f . See [6, Corollary 1, p. 56]. Definition 2.1. Let x ∈ X, and let I(x) be defined by (1.7). Assuming I(x) = ∅, define D (x) and N  (x) by ⎛ ⎞  D (x) := conv ⎝ (2.4) ∂gi (x)⎠ , i∈I(x)

⎛ (2.5)

N  (x) := cone ⎝

 i∈I(x)

⎞ ∂gi (x)⎠ ;

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

491

also we adopt the convention that N  (x) = {0} if I(x) = ∅.

(2.6)

Part (ii) of the following proposition is well known while the other parts can be verified easily by definitions. Proposition 2.2. The following assertions hold: (i) If i ∈ I(x), then ∂gi (x) ⊆ ∂G(x). Hence (2.7)

N  (x) ⊆ cone ∂G(x)

for all x ∈ X

and (2.8)

D (x) ⊆ ∂G(x)

for all x ∈ X.

D (x) = ∂G(x)

for all x ∈ X.

(ii) If I is finite, then (2.9)

(iii) If G(x) = 0 (e.g., if x ∈ bd S), then (2.10)

∂G(x) ⊆ NK (x)

and (2.11)

NC (x) + N  (x) ⊆ NC (x) + cone ∂G(x) ⊆ NK (x).

Below we define the conversed relations of the inclusions given in the preceding proposition. Definition 2.3. Let x ∈ X. The system (1.1) is said to satisfy (a) the BCQ at x relative to C if (2.12)

NC (x) + N  (x) ⊇ NK (x);

(b) the PLV at x relative to C if (2.13)

NC (x) + D (x) ⊇ NC (x) + ∂G(x).

The system (1.1) is said to satisfy the BCQ relative to C if it satisfies the BCQ at each point of K. In addition, the wording “relative to C” need not be mentioned if C = X. The following theorem (the proof of which is routine and hence omitted) extends a result of Li, Nahak, and Singer for the case when C = X is finite dimensional. Theorem 2.4. Let x ∈ C ∩ bd S. Suppose that the system (1.1) satisfies the PLV at x relative to C. Then the system (1.1) satisfies the BCQ at x relative to C if and only if (1.2) satisfies the BCQ at x relative to C. Part (ii) of the following theorem is well known; see, for example, [9, Theorem 4.4.2] and [16, Theorem 3.1]. The proof given in [16] is valid for any compact space I satisfying the first axiom of countability (in the sense that there exists a countable local base at each point of I). For wider scope of applications (such as that in section 5), we do not assume that I is metrizable. Theorem 2.5. Suppose that I is a compact space satisfying the first axiom of countability and that the function: i → gi (x) is upper semicontinuous for each x ∈ X. Then I(x) = ∅ for each x ∈ X and the following assertions hold:

492

CHONG LI AND K. F. NG w∗

(i) ∂G(x) = D (x) . (ii) ∂G(x) = D (x), provided that I(x) is finite or X is finite dimensional. (iii) If x ∈ span C and if span C is finite dimensional, then the system (1.1) satisfies the PLV relative to C at x. Furthermore, if in addition {gi : i ∈ I} is equicontinuous on X, then the assumption on I about the first axiom of countability can be dropped. Proof. Let x, d ∈ X. By assumption, it is easy to verify that I(x) = ∅. For each t > 0, define   gi (x + td) − G(x)  (2.14) ≥ G (x, d) . It := i ∈ I : t By convexity, I(x + td) ⊆ It and it follows that It is nonempty. Moreover, {It : t > 0} has the “finite intersection property” (and hence ∩t>0 It = ∅) because the function (2.15)

t →

gi (x + td) − gi (x) gi (x) − G(x) gi (x + td) − G(x) = + t t t

is nondecreasing on (0, +∞) (the functions represented in the two terms of the righthand side of (2.15) are nondecreasing by the convexity of gi and the fact that gi (x) − G(x) ≤ 0). Pick i0 ∈ ∩t>0 It . Then passing to the limits as t → 0 in gi0 (x + td) − G(x) ≥ tG (x, d),

(2.16)

we see that gi0 (x) = G(x) (that is, i0 ∈ I(x)) and so (2.16) entails gi0 (x, d) ≥ G (x, d). Consequently, it follows from (2.3) that   ∗ max (2.17) G (x, d) = max {gi (x, d)} = max x , d = ∗max x∗ , d.  ∗ i∈I(x)

i∈I(x)

x ∈∂gi (x)

x ∈D (x)

Thus (i) must hold by [6, Proposition 2.1.4]. Note that we have not used the assump˜ for span C; let G ˜ and g˜i , tion on the first axiom of countability. For (iii), write X ˜ Clearly, G ˜ is the sup-function respectively, denote the restrictions of G and gi to X. ˜ of {˜ gi i ∈ I} and it follows from (ii) that, for x ∈ X, (2.18)

˜ ∂ G(x) = conv {∂˜ gi (x) : i ∈ I(x)}.

To prove (iii), we need only show that (2.19)

∂G(x) ⊆ NC (x) + D (x).

˜ Then To do this, let y ∗ ∈ ∂G(x), and let y˜∗ denote the of y ∗ to X. restriction m ∗ ∗ ∗ ˜ y˜ ∈ ∂ G(x), and so it follows from (2.18) that y˜ = j=1 λj y˜j for some {λj } ⊆ [0, 1] ˜ ∗ with m λj = 1, y˜∗ ∈ ∂˜ and {˜ yj∗ } ⊆ X gij (x), and i1 , i2 , . . . , im ∈ I(x). Noting j j=1 (2.20)

˜ yj∗ , z ≤ g˜ij (x, z) = gij (x, z)

˜ for all z ∈ X,

one can apply the Hahn–Banach theorem to obtain a continuous linear extension yj∗ of y˜j∗ such that yj∗ , · ≤ gij (x, ·)

on X.

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

493

Thus yj∗ ∈ ∂gij (x) and ∗

y , · =

m

λj yj∗ , ·

˜ on X.

j=1

˜ this implies that y ∗ − m λj y ∗ ∈ NC (x). Therefore y ∗ ∈ NC (x) + Since C − x ⊆ X, j j=1 N  (x) and (2.19) is proved. Now we drop the assumption that I satisfies the first axiom of countability but assume that {gi : i ∈ I} is equicontinuous on X. By the above proof, (iii) will be true if (ii) holds. Hence, to complete the proof it suffices to show that ∂G(x) = D (x) when X is finite dimensional. For this purpose, we need to introduce a new topology on the set I. For each i0 ∈ I, x ∈ X, and each r ∈ R, r > 0, set Ox (i0 , r) = {i ∈ I : gi (x) − gi0 (x) < r}. Since X is finite dimensional, there exists a countable subset A of X such that the closure A is equal to X. Let τN denote the topology generated by {Ox (i0 , r) : x ∈ A, i0 ∈ I, r ∈ R, r > 0}. Then, under the topology τN , the following assertions hold: (1) I is compact; (2) I satisfies the first axiom of countability; (3) for each x ∈ X, the function i → gi (x) is upper semicontinuous. In fact, (2) is trivial, and (1) holds because I is compact under the original topology, which is stronger than the new topology τN , as each function i → gi (x) is upper semicontinuous. (3) follows from the density of A in X and the equicontinuity of {gi : i ∈ I} on X. In fact, let x ∈ X and  > 0. By the equicontinuity, there exists δ > 0 such that  (2.21) for each i ∈ I and each y with x − y < δ. |gi (x) − gi (y)| < 3 Pick x ∈ A such that x − x < δ. Let i0 ∈ I. Noting that, by (2.21), (2.22)

gi (x) − gi0 (x)
0, λj ≥ 0, λj ≤ 1 . ⎭ j j=1 j=1

Clearly, D (x) is not closed. 3. The SC and the MFCQ. We continue to consider the system (1.1) relative to a closed convex set C in X. Notation is as in the introduction. A main aim of this section is to provide sufficient conditions ensuring that (1.1) satisfies the BCQ relative to C. These conditions are of two types (the SC and the MFCQ): one is expressed in terms of the functions gi , while the other is expressed in terms of the directional derivatives of gi . Let us first extend several known concepts to our present general case from the semi-infinite case (cf. [2, 12, 15] for the former type and [12, 17, 23] for the latter type). Recall that S and K are defined by (1.4) and (1.5). Definition 3.1. Let x ∈ C ∩ bd S. An element d ∈ X is called (a) a linearized feasible direction of (1.1) at x if (3.1)

z ∗ , d ≤ 0

for all z ∗ ∈ ext ∂gi (x) and for all i ∈ I(x);

(b) a sequentially feasible direction of K at x if there exist a sequence (dk ) with (dk ) → d and a sequence (δk ) of positive real numbers with (δk ) → 0 such that {x + δk dk } ⊆ K. Let LFD(x) (resp., SFD(x)) denote the set of all d satisfying (a) (resp., (b)). Note that LFD(x) and SFD(x) are both closed convex cones. Definition 3.2. Let KS (x), KL (x) be, respectively, defined by  KS (x) = (x + SFD(x)) C (3.2) and (3.3)

KL (x) = (x + LFD(x))



C.

Note that these two sets are closed convex sets. Proposition 3.3. Let x ∈ C ∩ bd S. Then SFD(x) ⊆ LFD(x), and hence (3.4)

K ⊆ KS (x) ⊆ KL (x).

Proof. Since K ⊆ KS (x), (3.4) follows from the first assertion and the fact that LFD(x) is closed convex. To prove the first assertion, let d ∈ SFD(x), and let {dk }, {δk } be as in Definition 3.1(b). In particular, for each i ∈ I(x), one has gi (x + δk dk ) ≤ 0 and hence, for each z ∗ ∈ ∂gi (x), we have (3.5)

z ∗ , δk dk  ≤ gi (x + δk dk ) − gi (x) ≤ 0

thanks to the fact that G(x) = 0 as x ∈ bd S. This implies that z ∗ , dk  ≤ 0. Letting k → +∞ we obtain z ∗ , d ≤ 0. Since i ∈ I(x) and z ∗ ∈ ∂gi (x) are arbitrary, d ∈ LFD(x), and the proof is complete. Definition 3.4. Let x ∈ C ∩ bd S. We say that the system (1.1) satisfies

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

495

(a) the M F CQ relative to C at x if the following conditions are satisfied: (1) (x + LFD(x)) ∩ ri C = ∅; (2) {d ∈ X : gi (x, d) < 0 for all i ∈ I(x)} ∩ span (C − x) = ∅; (b) the weak MFCQ relative to C at x if there exists a finite subset I0 of I(x) such that (1) (x + LFD(x)) ∩ ri C = ∅; (2) {d ∈ X : gi (x, d) < 0 for all i ∈ I(x) \ I0 } ∩ span (C − x) = ∅; (3) gi is affine for each i ∈ I0 . We shall also say that x satisfies the MFCQ (resp., the weak MFCQ) for (1.1) relative to C when (a) (resp., (b)) occurs. For the following definitions, we need additional notation. Let (3.6)

˜ I(x) = {i ∈ I(x) : gi (x) = 0}.

˜ Clearly, I(x) = I(x) in the case when x ∈ bd S. Definition 3.5. The system (1.1) is said to satisfy (a) the SC relative to C if there exists x ¯ ∈ ri C such that G(¯ x) < 0; (b) the weak SC relative to C if there exists x ¯ ∈ ri C ∩ S such that ˜ x) is finite and gi is affine for each i ∈ I(¯ ˜ x); (1) I(¯ x) < 0, where G0 denotes the sup-function of (2) G0 is continuous and G0 (¯ ˜ x)}: {gi : i ∈ I \ I(¯ (3.7)

G0 (z) =

sup gi (z), ˜ x) i∈I\I(¯

z ∈ X.

A point x ¯ with the property in (a) (resp., (b)) is called a Slater (resp., weak Slater) point for the system (1.1) relative to C. As the nomenclature suggests, it is clear that (a)=⇒(b) in Definitions 3.4 and 3.5. Theorem 3.6. Suppose there exists a weak Slater point x ¯ for the system (1.1) relative to C, and let x ∈ C ∩ bd S. Then (1.1) satisfies the BCQ relative to C at x if (1.1) satisfies the PLV relative to C at x. Proof. Let G0 be defined as in Definition 3.5(b). Set ˜ x)} S0 = {y ∈ X : gi (y) ≤ 0, i ∈ I \ I(¯ and ˜ x)}. S1 = {y ∈ X : gi (y) ≤ 0, i ∈ I(¯ ˜ x) is finite, and gi is affine for each i ∈ I(¯ ˜ x). Hence x) < 0, I(¯ Then G0 (¯  ˜ x) = ∅, if I(x) ∩ I(¯ ˜ x) cone ∂gi (x) i∈I(x)∩I(¯ NS1 (x) = ˜ x) = ∅ 0 if I(x) ∩ I(¯ and

 NS0 (x) =

cone ∂G0 (x) 0

if G0 (x) = 0, if G0 (x) < 0.

Note in particular that NS1 (x) and NS0 (x) are contained in cone ∂G(x). Moreover, by [8, Proposition 2.3 and Theorem 5.1] (the proofs given there are valid for Banach

496

CHONG LI AND K. F. NG

spaces, though the results were stated in a Hilbert space setting), {C, S1 } and {C ∩ S1 , S0 } have the strong conical hull intersection property (the strong CHIP). Hence NK (x) = NC∩S1 (x) + NS0 (x) = NC (x) + NS1 (x) + NS0 (x) ⊆ NC (x) + cone ∂G(x) ⊆ NC (x) + N  (x), where the last inclusion holds because of the PLV property. For convenience of reference, we note the following immediate consequence of Theorem 2.5(iii) and Theorem 3.6. Corollary 3.7. Suppose that I is a compact space satisfying the first axiom of countability, and that the function: i → gi (x) is upper semicontinuous for each x ∈ X. Let x ∈ K. Suppose that dim C < ∞ or I(x) is finite and that there exists a weak Slater point for the system (1.1) relative to C. Then (1.1) satisfies the BCQ relative to C at x. Furthermore, if in addition {gi : i ∈ I} is equicontinuous on X, then the assumption on I about the first axiom of countability can be dropped. Theorem 3.8. For the system (1.1) and relative to C, the following assertions hold: (i) If the weak SC is satisfied, then at each point in C ∩ bd S the weak MFCQ is satisfied. If I is finite and if x ∈ C ∩ bd S satisfies the weak MFCQ, then there exist weak Slater points arbitrarily near x. (ii) If the SC is satisfied then, at each point in C ∩ bd S, the MFCQ is satisfied. Conversely, if x ∈ C ∩ bd S satisfies the MFCQ and if in addition x satisfies the PLV, then there exist Slater points arbitrarily near x (in particular, (1.1) satisfies the BCQ relative to C). Proof. (i) Let x ¯ ∈ ri C ∩ S be a weak Slater point for the system (1.1) relative to ˜ x). Then I0 is finite (as C, and let x ∈ C ∩ bd S (hence G(x) = 0). Set I0 = I(x) ∩ I(¯ ˜ I(¯ x)), and x with this I0 satisfies (1)–(3) of Definition 3.4(b). Indeed, (3) is evident. ˜ x). Then (2) of Definition 3.4(b) holds trivially. (1) also holds as Suppose I(x) ⊆ I(¯ ˜ x), one has x ¯ ∈ (x + LFD(x)) ∩ ri C because, for each i ∈ I0 = I(x) ⊆ I(¯ gi (x) = 0 and gi (¯ x) = G(¯ x). It follows from the convexity that gi (x, x ¯ − x) ≤ 0; that is, x ¯ − x ∈ LFD(x). We ˜ x). Consequently G0 (x) = G(x) = 0, can therefore assume henceforth that I(x) ⊆ I(¯ x) < 0, we have for each where G0 is the function on X defined by (3.7). Noting G0 (¯ i ∈ I(x) \ I0 that gi (x, x ¯ − x) ≤ gi (¯ x) − gi (x) ≤ G0 (¯ x) − G0 (x) < 0. This implies that the intersection in (2) of Definition 3.4(b) is a nonempty set (containing x ¯ − x) and that x ¯ − x ∈ LFD(x) as gi is affine for each i ∈ I0 (because ˜ x) and x I0 ⊆ I(¯ ¯ is a weak Slater point). Therefore x is a weak MFCQ point relative to C. Conversely, we suppose that x satisfies the weak MFCQ relative to C and, in addition, that I is finite. Let I0 ⊆ I(x) satisfy (2) and (3) of Definition 3.4(b). Then by definition, x satisfies the MFCQ relative to C for the subsystem defined by (3.8)

gi (·) ≤ 0

for all i ∈ I \ I0 .

Note that x satisfies the PLV for the subsystem (3.8) as I is finite. Hence, assuming the results in (ii), we can apply (ii) to conclude that there exists x ¯ ∈ ri C arbitrarily near x such that sup{gi (¯ x) : i ∈ I \ I0 } < 0. Clearly then x ¯ is a weak Slater point for (1.1) relative to C.

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

497

(ii) Let x ¯, x be as in the first part of (i) but assume the stronger assumption that ˜ x) must be empty and so must I0 . In this x ¯ is a Slater point relative to C. Then I(¯ case, (2) of Definition 3.4(b) coincides with (2) of Definition 3.4(a). Therefore x is an MFCQ point relative to C by (i). Conversely, suppose that x satisfies the MFCQ for the system (1.1) relative to C and, in addition, that x also satisfies the PLV. Then, by the former, there exist d1 , d2 such that x + d1 ∈ (x + LFD(x)) ∩ ri C, d2 ∈ span(C − x), and gi (x, d2 ) < 0

(3.9)

for all i ∈ I(x).

Also, by the PLV property, ∂G(x) ⊆ NC (x) + conv {∂gi (x) : i ∈ I(x)} .

(3.10)

Take t0 > 0 small enough such that x+(1−t0 )d1 +t0 d2 ∈ ri C. Set d := (1−t0 )d1 +t0 d2 . Then, for any i ∈ I(x), (3.11)

gi (x, d) ≤ (1 − t0 )gi (x, d1 ) + t0 gi (x, d2 ) ≤ t0 gi (x, d2 ) < 0.

It follows that (3.12)

G (x, d) < 0.

G (x, d) = z ∗ , x. By (3.10), Indeed, by (2.3) there exists z ∗ ∈ ∂G(x) such that  m ∗ ∗ ∗ there exist t1 , . . . , tm > 0 and z1 , . . . , zm ∈ X with j=1 tj = 1, and zj∗ ∈ ∂gij (x)  m where i1 , . . . , im ∈ I(x) such that z ∗ − j=1 tj zj∗ ∈ NC (x); in particular, z ∗ −  m ∗ j=1 tj zj , d ≤ 0. It follows from (3.11) and (2.2) that z ∗ , d ≤

m

tj gij (x, d) < 0;

j=1

this proves (3.12). Noting G(x) = 0 (as x ∈ bd S), (3.12) implies that G(x + t¯d) < 0 for all sufficiently small t¯ ∈ (0, 1). Clearly x + t¯d ∈ ri C (provided t¯ < t0 ) and so x + t¯d is a Slater point for the system (1.1) relative to C. Remark 3.1. Example 3.1(a) below will show that the finiteness assumption cannot be dropped in the second assertion of Theorem 3.8(i), and the condition of the PLV in the second assertion of Theorem 3.8(ii) cannot be dropped. Example 3.1(b) provides an example of (1.1) which satisfies the PLV and the weak MFCQ relative to C at some point x0 ∈ C ∩ bd S but fails for the weak SC and the BCQ relative to C at x0 . (In particular, the finiteness assumption in the second assertion of Theorem 3.8(i) cannot be replaced by the PLV.) Example 3.1. (a) Let X = C = R, I = [−1, 0) ∪ (0, +1], and let {gi : i ∈ I} be defined by  ix, i ∈ [−1, 0) gi (x) = for all x ∈ R. 2ix − i2 , i ∈ (0, +1] Then the corresponding sup-function is given by ⎧ x≥1 ⎨ 2x − 1, 0≤x≤1 G(x) = x2 , (3.13) ⎩ −x, x≤0

for all x ∈ R.

498

CHONG LI AND K. F. NG

Take x0 = 0. Then, S = {0}, I(x0 ) = [−1, 0), and ∂gi (x0 ) = {i} for all i ∈ [−1, 0).

(3.14) Moreover, (3.15)

gi (x0 , 1) = i < 0

for all i ∈ [−1, 0)

and (3.16)

LFD(x0 ) = [0, +∞).

For the system (1.1) relative to C at the point x0 , the following assertions hold: (1) The MFCQ is satisfied. (2) The PLV is not satisfied. (3) The weak SC is not satisfied. (4) The BCQ at x0 is not satisfied. Indeed, (1) is evident by (3.15) and (3.16). Next, by (3.14), (3.17)

D (x0 ) = [−1, 0),

N  (x0 ) = (−∞, 0],

so that ∂G(x0 ) = [−1, 0] = D (x0 ) ⊆ (−∞, 0] = N  (x0 ),

∂G(x0 ) = D (x0 ).

Hence, (2) holds. (3) is also evident because G0 (¯ x) = 0 whenever x ¯ ∈ S and G0 is the sup-function of {gi : i ∈ I \ I0 } for some finite subset I0 of I. Finally, (4) holds because NK (x0 ) = NS (x0 ) = R,

N  (x0 ) = (−∞, 0].

(b) Let g0 be defined by g0 (x) = 0 for all x ∈ X. We add this function g0 to the system in (a) above. The new system satisfies the PLV relative to C at x0 and the weak MFCQ relative to C at x0 . However, the weak SC on C does not hold; hence the BCQ relative to C at x0 is not satisfied. It is well known (see, e.g., [16]) that the following implication is not true in general: (∗)

Each finite subsystem of (1.1) satisfies the BCQ=⇒the system (1.1) does.

Therefore the remainder of this section is devoted to studying the following natural question: When does this implication hold? The following theorem (which will be useful for our discussion in section 5) provides a set of conditions ensuring the above implication (∗). To facilitate the proof of this theorem, we recall the following proposition which was established independently by Deutsch [7] and Rubenstein [20] (see also [4]). For a closed convex subset Z of X, let PZ denote the projection operator defined by PZ (x) = {y ∈ Z : x − y = dZ (x)}, where dZ (x) denotes the distance from x to Z. Recall that the duality map J from ∗ X to 2X is defined by J(x) := {x∗ ∈ X ∗ : x∗ , x = x2 , x∗  = x}.

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

499

In fact, J(x) = ∂φ(x), where φ(x) := 12 x2 . Thus a Banach space X is smooth if and only if for each x ∈ X the duality map is single-valued. Proposition 3.9. Let Z be a closed convex set in X. Then for any x ∈ X, z0 ∈ PZ (x) if and only if z0 ∈ Z and there exists x∗ ∈ J(x − z0 ) such that x∗ , z − z0  ≤ 0 for any z ∈ Z, that is, J(x − z0 ) ∩ NZ (z0 ) = ∅. In particular, when X is smooth, z0 ∈ PZ (x) if and only if z0 ∈ Z and J(x − z0 ) ∈ NZ (z0 ). Theorem 3.10. Let I be a compact metric space and let C be a closed convex subset of X such that dim C := l < +∞. Suppose that (i) for each x ∈ X, the function i → gi (x) is continuous; (ii) for any finite subset J of I, the subsystem gj (x) ≤ 0,

(3.18)

j ∈ J,

satisfies the BCQ relative to C; (iii) for any finite subset J of I with |J| ≤ l, the subsystem (3.18) satisfies the SC relative to C. Then (1.1) satisfies the BCQ relative to C. Proof. Since I is compact and metrizable, there exists a sequence {Ik } of subsets of I such that (a) each Ik is finite; (b) Ik ⊆ Ik+1 for k = 1, 2, . . .; (c) ∪∞ k=1 Ik = I. Set Sk = ∩i∈Ik {x ∈ X : gi (x) ≤ 0} for each k. Then K = C ∩ S ⊆ C ∩ Sk .

(3.19)

Let x0 ∈ K and x∗ ∈ NK (x0 ). We have to show that x∗ ∈ NC (x0 ) + N  (x0 ). We will first show that there exist{xk } ⊆ C with xk → x0 and x∗k ∈ NC∩Sk (xk ) such that {x∗k } is bounded and (3.20)

lim x∗k , y = x∗ , y for each y ∈ span (C − x0 ).

k→∞

In fact, since Z := span (C −x0 ) is of finite dimension, we may assume, without loss of generality, that the norm restricted on Z is both strictly convex and smooth. Clearly, we may assume that x∗ |Z = 0. Take z0 ∈ Z such that x∗ , z0  = x∗ |Z ·z0  = z0 2 . Write x = x0 + z0 . Then, x∗ |Z = J(x − x0 )|Z and, by Proposition 3.9, x0 = PK (x). Let xk = PC∩Sk (x). Then xk  ≤ xk − x + x ≤ x − x0  + x; hence {xk } ⊂ C is bounded. Without loss of generality, assume that xk → x ¯. Let i ∈ I and  > 0. By (b) and (c), there exists {ik } ⊆ I with ik ∈ Ik for each k such that ik → i. By assumption (i), there exists an integer k0 such that (3.21)

gi (¯ x) − gik0 (¯ x) < .

By (b), for each k > k0 , ik0 ∈ Ik0 ⊆ Ik ; hence, by the fact that xk ∈ Sk ⊆ Sk0 , (3.22)

gik0 (¯ x) = gik0 (¯ x) − gik0 (xk ) + gik0 (xk ) ≤ gik0 (¯ x) − gik0 (xk ).

Consequently, by (3.21) and (3.22), we have that, for each k > k0 , (3.23)

gi (¯ x) <  + gik0 (¯ x) − gik0 (xk ).

500

CHONG LI AND K. F. NG

Taking the limit as k → ∞ we get gi (¯ x) ≤ ; hence x ¯ ∈ K as  > 0 and i ∈ I are arbitrary. Because x − x ¯ = lim x − xk  ≤ x − y

for each y ∈ K,

k→∞

x ¯ = PK (x) = x0 . Let x∗k ∈ X ∗ with x∗k |Z = J(x − xk )|Z and x∗k  = x − xk . Then, by Proposition 3.9, x∗k ∈ NC∩Sk (xk ). By the smoothness, the mapping x → J(x)|Z is continuous and so (3.20) holds. Now let Gk denote the sup-function corresponding to the subsystem (3.18) with J = Ik ; that is, Gk (x) := sup gi (x)

for each x ∈ X.

i∈Ik

If there exists a subsequence {kj } of {k} such that Gkj (x0 ) < 0 for each j, then NC∩Skj (xkj ) = NC (xkj ), and hence x∗kj ∈ NC (xkj ). This implies that x∗ ∈ NC (xkj ) by (3.20) (and so x∗ ∈ NC (x0 ) + N  (x0 )). Therefore we may assume that, for each k, x∗k ∈ / NC (xk ), and that Gk (xk ) = 0. Then Ik (xk ) := {i ∈ Ik : gi (xk ) = Gk (xk )} = ∅. By assumption (ii), x∗k ∈ NC∩Sk (xk ) = NC (xk ) + cone {∂gi (xk ) : i ∈ Ik (xk )}, and it follows from [19, Corollary 17.2.2] that there exist zk∗ ∈ NC (xk ), ikj ∈ Ik (xk ), yi∗k ∈ ∂gikj (xk ), and λkj ≥ 0 (j = 1, 2, . . . , l) such that j

x∗k = zk∗ +

(3.24)

l

λkj yi∗k

on Z,

k = 1, 2, . . . .

j

j=1

Let λk =

l j=1

λkj . Then {λk } is bounded. Indeed, if not, by considering a subx∗

sequence if necessary, we have that limk→∞ λk = +∞. Thus λkk → 0 as k → ∞. Furthermore, without loss of generality, we may assume that, as k → ∞, ikj → ij

(3.25)

Note in particular that

and

l j=1

λkj → λj , λk

j = 1, 2, . . . , l.

λj = 1. Since yi∗k ∈ ∂gikj (xk ) ⊆ ∂G(xk ) and since j

z∗

xk → x0 , {yi∗k } is bounded. Consequently, by (3.24), { λkk |Z } is bounded too. Thus, j

we may also assume that there exist z˜0∗ , y˜i∗j ∈ Z ∗ such that zk∗ → z˜0∗ ; λk

(3.26)

yi∗k → y˜i∗j

on Z

j

as k → ∞. Then (3.27)

˜ z0∗ , z − x0  ≤ 0

for each z ∈ C

and, by assumption (i), (3.25), and (3.26), for each j, (3.28)

˜ yi∗j , z − x0  ≤ gij (z) ≤ gij (z) − gij (x0 )

for each z ∈ Z + x0 .

Hence by the Hahn–Banach theorem, z˜0∗ and y˜i∗j can be extended to z0∗ ∈ NC (x0 ) and yi∗j ∈ ∂gij (x0 ). By (3.24), (3.25), and (3.26), (3.29)

0 = z0∗ +

l

j=1

λj yi∗j

on Z.

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

501

y ) < 0 for each j = 1, 2, . . . , l. ConseBy assumption (iii), take y¯ ∈ C such that gij (¯ quently, yi∗j , y¯ − x0  < 0 for each j = 1, 2, . . . , l, and hence, z0∗ +

l

λj yi∗j , y¯ − x0

< 0,

j=1

which contradicts (3.29). Hence {λk } is bounded. Thus, taking the limits on the two sides of (3.24) and using the similar arguments as above (if necessary, use subsequences), we get that x∗ = z0∗ +

(3.30)

l

λj yi∗j

on Z

j=1

for some λj ≥ 0, z0∗ ∈ NC (x0 ), and yi∗j ∈ ∂gij (x0 ) (j = 1, 2, . . . , l). Note that, by (3.28), yi∗j ∈ NC∩g−1 (R− ) (x0 ) for each j = 1, 2, . . . , l. Since yi∗j , y¯ − x0  < 0, it follows ij

that yi∗j |Z = 0, and hence ij ∈ I(x0 ) for each j = 1, 2, . . . , l. Let y ∗ = x∗ − z0∗ − l ∗ ∗ ∗  j=1 λj yij . Then y ∈ NC (x0 ) by (3.30), and so x ∈ NC (x0 ) + N (x0 ). The proof is complete. Remark 3.2. Example 3.2 below will show that the condition dim C < ∞ cannot be dropped, while Examples 3.3 and 3.4 will show that conditions (i) and (iii) also cannot be dropped. In each of these examples, I is a compact subset of R: I = {0, 1, 12 , . . . , 1i , . . .}. Example 3.2. Let C = X = {x = (x1 , x2 , . . . , xk , . . .) : limk xk exists} with the norm defined by x = sup |xk |,

x = (xk ) ∈ X.

k

Define  gi (x) =

limk xk , x 1i ,

i = 0, i ∈ I \ {0},

x = (xk ) ∈ X.

Since |gi (x) − gi (y)| ≤ x − y for each i ∈ I, the corresponding sup-function G is continuous on X. Note that G(¯ x) = −1 < 0 for x ¯ = (−1) ∈ X. Hence conditions (ii) and (iii) in Theorem 3.10 hold. clearly, condition (i) in Theorem 3.10 is satisfied. Let x0 = 0. Then x0 ∈ C ∩ S = S and I(x0 ) = I. It is easy to see that N  (x0 ) is not closed; hence this system does not satisfy the BCQ at x0 . Example 3.3. Let C = X = R2 . Define  i = 0, x1 + x2 , gi (x) = x = (x1 , x2 ) ∈ X. x1 + ix2 , i ∈ I \ {0}, Note that |gi (x) − gi (y)| ≤ 2x − y for each i ∈ I; thus the corresponding supfunction G is continuous on X. Let x0 = 0. Then K := C ∩ S = S = {(x1 , x2 ) : x1 ≤ 0, x1 + x2 ≤ 0}, x0 ∈ bdS, and I(x0 ) = I. Since G((−1, 12 )) = − 12 < 0, conditions (ii) and (iii) in Theorem 3.10 hold. However, NK (x0 ) = NS (x0 ) = {(t1 , t2 ) : 0 ≤ t2 ≤ t1 } and N  (x0 ) = {(t1 , t2 ) : 0 < t2 ≤ t1 } ∪ {(0, 0)}. Therefore this system does not satisfy the BCQ at x0 . Note that condition (i) is not satisfied.

502

CHONG LI AND K. F. NG

Example 3.4. Let C = X = R2 and define  i = 1, x2 , gi (x) = ix1 − x2 − i2 , i ∈ I \ {1},

x = (x1 , x2 ) ∈ X.

It is easy to verify that |gi (x)−gi (y)| ≤ 2x−y for each i ∈ I; thus the corresponding sup-function G is continuous on X. Let x0 = 0. Then I(x0 ) = {0, 1} and K = S = {(x1 , 0) ∈ R2 : x1 ≤ 0}. Hence NK (x0 ) = {(t1 , t2 ) ∈ R2 : t1 ≥ 0} and N  (x0 ) = cone {(0, −1), (0, 1)} = {(t1 , t2 ) ∈ R2 : t1 = 0}}. Consequently, this system does not satisfy the BCQ at x0 . Note that conditions (i) and (ii) in Theorem 3.10 are satisfied but condition (iii) is not. 4. Optimality conditions for minimization problems. The main purpose of this section is to apply our results of the earlier sections to provide optimality conditions for optimization problems of the following type. For a real-valued function f on X, a typical problem is to (4.1)

minimize subject to

f (x) x ∈ C ∩ S,

where C and S are as at the beginning of section 1. We will consider several kinds of functions on X. Let A(X) (resp., C(X)) denote the class of all continuous real-valued affine (resp., convex) functions on X. Moreover, for x0 ∈ X, let Gx0 (X) denote the class of real-valued functions f on X such that f is Gateaux differentiable (with the Gateaux derivative Df (x0 )) at x0 and the restriction of f to C is convex. Recall that K := C ∩ S. Theorem 4.1. Let x0 ∈ K. Then the following statements are equivalent: (i) The system (1.1) satisfies the BCQ relative to C at x0 . (ii) For any f ∈ Gx0 (X), x0 is an optimal solution of (4.1) if and only if

(4.2) λi x∗i = 0 on X Df (x0 ) + x∗ + i∈I0

for some finite subset I0 of I(x0 ), x∗ ∈ NC (x0 ), λi ≥ 0, and x∗i ∈ ∂gi (x0 ) (f or all i ∈ I0 ). (ii∗ ) Same as (ii) but with (4.2) replaced by

(4.3) λi x∗i = 0 on span C. Df (x0 ) + x∗ + i∈I0

(iii) For any f ∈ C(X), x0 is an optimal solution of (4.1) if and only if

λi x∗i = 0 on X (4.4) y ∗ + x∗ + i∈I0

for some finite subset I0 of I(x0 ), y ∗ ∈ ∂f (x0 ), x∗ ∈ NC (x0 ), λi ≥ 0, and x∗i ∈ ∂gi (x0 ) (f or all i ∈ I0 ).

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

(iii∗ ) Same as (ii∗ ) but with (4.4) is replaced by

y ∗ + x∗ + (4.5) λi x∗i = 0 on

503

span C.

i∈I0

(iv) Same as (iii) but with f ∈ A(X). (iv∗ ) Same as (iii∗ ) but with f ∈ A(X). Proof. Since each of the statements in the above list is true (with I0 = ∅) when x0 ∈ C ∩ int S, we assume henceforth that x0 ∈ C ∩ bd S. (i)=⇒ (ii) and (ii∗ ). Let f ∈ Gx0 (X) define F (x) = Df (x0 ), x − x0  + IK (x),

x ∈ X.

Then, by [6, Corollary 1, p. 105] ∂F (x0 ) = Df (x0 ) + NK (x0 ).

(4.6) It follows from (i) that (4.7)

∂F (x0 ) = Df (x0 ) + NC (x0 ) + N  (x0 ).

If x0 is an optimal solution of (4.1), x0 is a minimal point of F on X. In fact, it is clear that F (x) ≥ F (x0 ) for any x ∈ X \ K. Moreover, if x ∈ K, then (4.8) F (x) = Df (x0 ), x − x0  = lim

t→0+

f (x0 + t(x − x0 )) − f (x0 ) ≥ 0 = F (x0 ) t

because x0 + t(x − x0 ) ∈ K for each 0 < t < 1 (and x0 is an optimal solution of (4.1)). Hence, by [6, Proposition 2.4.11], 0 ∈ ∂F (x0 ). Consequently, (4.7) entails (4.2) (and (4.3)) for appropriate x∗ , λi , x∗i as stated in (ii) and (ii∗ ). Conversely, suppose that  (4.3) holds for some I0 , x∗ , λi , x∗i as stated in (ii∗ ). Writing z ∗ for x∗ + i∈I0 λi x∗i , one has from (2.11) that z ∗ ∈ NK (x0 ) and from (4.3) that Df (x0 ), · = −z ∗ , · on span C. Therefore, by convexity, we have for each x ∈ K that f (x) − f (x0 ) ≥ Df (x0 ), x − x0  = −z ∗ , x − x0  ≥ 0. Thus the implications (i)=⇒ (ii) and (ii∗ ) are proved. The implication (i)=⇒ (iii) is a direct consequences of [6, Corollary, p. 52]. The necessity part of (iii∗ ) follows from (iii) while the sufficiency part of (iii∗ ) is proved similarly to the proof of the corresponding part of (ii∗ ). Further, since the implications (ii)=⇒ (iv), (ii∗ )=⇒ (iv∗ ), (iii)=⇒ (iv), and (iii∗ ) =⇒ (iv∗ ) are trivial, it remains to show that (iv)=⇒ (i) and (iv∗ )=⇒ (i). To show these, let y ∗ ∈ NK (x0 ). Then −y ∗ ∈ A(X) and x0 is an optimal solution of (4.1) with f := −y ∗ . Thus ∂f (x0 ) = −y ∗ and, assuming (iv) or (iv∗ ), there exist a finite subset I0 of I(x0 ), x∗ ∈ NC (x0 ), λi > 0, and x∗i ∈ ∂gi (x0 ) (i ∈ I0 ) such that

−y ∗ + x∗ + λi x∗i = 0 on span C, i∈I0

that is, y ∗ , · = z ∗ , · on span C,

504

CHONG LI AND K. F. NG

where z ∗ := x∗ + Consequently,

 i∈I0

λi x∗i . Since x0 ∈ C, it follows that y ∗ − z ∗ ∈ NC (x0 ).

y ∗ ∈ NC (x0 ) + x∗ +



λi x∗i ⊆ NC (x0 ) + N  (x0 ),

i∈I0

and (i) holds. This completes the proof. 5. Best restricted range approximation problem in Lp spaces. Let Ω := T ∪A, where A is a compact Hausdorff space and T is a measure space with a measure µ. As usual, let CR (A) denote the Banach space of all real-valued continuous functions on A equipped with the uniform norm denoted by  · A . Let X be the space consisting of all real functions x on Ω such that the restrictions x|T ∈ Lp (T, µ) and x|A ∈ CR (A), where p ∈ [1, +∞). Then X is a Banach space under the norm  ·  defined by x = x|T p + x|A A

for each x ∈ X.

For the so-called restricted range approximation problem in Lp spaces (studied by Levasseur and Levis in [10]), one is given the following data: Let l and u be a pair of real functions which are, respectively, upper and lower semicontinuous on A such that l(t) ≤ u(t) for each t ∈ A. Let L be a finite dimensional vector subspace of X generated by h1 , h2 , . . . , hn . Let z ∈ X be given. Then the problem to be considered is  (5.1) |z(t) − x(t)|p dµ minimize T

subject to (5.2)

x ∈ L,

l(t) ≤ x(t) ≤ u(t)

for all t ∈ A

(one can of course replace µ by µ of the form dµ = wdµ for some positive w ∈ L∞ (T, µ)). The aim of this section is to apply results of the preceding section to establish the following theorem, which was proved for the case when p = 2 (or p is an even integer) in [10] using a very different method and under assumptions H1 –H6 given in the following: H1 : A is a compact subset of R and T is a finite union of compact intervals of R. H2 : z is a real continuous function on Ω. H3 : l and u are continuous on A with l(t) < u(t) for all t ∈ A. H4 : {h1 , h2 , . . . , hn } is linearly independent on T . H5 : There exists h ∈ X such that (5.2) is satisfied with x = h. H6 : For each k = 1, 2, . . . , n, the set {h1 , h2 , . . . , hk } is a Chebyshev system of order k on A; that is, if t1 , t2 , . . . , tk are distinct points in A, then the determinant det(hj (ti )) = 0. Instead of assumptions H1 –H6 , we consider the following two conditions: D1 : There exists an element h0 ∈ L satisfying l(t) < h0 (t) < u(t) for all t ∈ A. D2 : A is metrizable (in addition to being compact Hausdorff), l and u are continuous on A, and, for any k ≤ n and any distinct points t1 , t2 , . . . , tk in A, there exists ¯ ∈ L satisfying l(ti ) < h(t ¯ i ) < u(ti ) for all i = 1, 2, . . . , k. an element h We remark that H1 + H6 =⇒ D2 .

505

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

We use Z(x) to denote the set of all points in T such that x(t) = 0. Note that if x=x ˜ a.e. on T , then the “symmetric difference” (Z(x) \ Z(˜ x)) ∪ (Z(˜ x) \ Z(x)) is of µ-measure zero; hence Z(x) h(t)dµ = Z(˜x) h(t)dµ for each integrable function h on T . We shall need the following well-known results; see, for example, [3, Example 2.19] for a proof of Lemma 5.1. The other two lemmas follow immediately from Lemma 5.1 and the Riesz representation theorem. Lemma 5.1. Let W be a normed vector space. Then, for any w ∈ W \ {0}, the subdifferential ∂ · (w) of the norm function at w equals the set {w∗ ∈ W ∗ : w∗ , w = w}. Lemma 5.2. Let W = L1 (T, µ), where here and throughout, T denotes a measure space with measure µ. Let w ∈ W \ {0}. Then, as a subset of L∞ (T ), ∂ · 1 (w) consists of all β ∈ L∞ (T ) satisfying the properties that  |β(t)| ≤ 1 a.e. on T, β(t) = sgn[w(t)] a.e. on T \ Z(w). Lemma 5.3. Let W = Lp (T, µ) with p ∈ (1, +∞). Let w ∈ W \ {0}. Then, the norm function is Fr´echet differentiable at w with the derivative  · p (w), which is represented by ξ ∈ Lq (T, µ) ( p1 + 1q = 1) with · |w(t)|p−1 · sgn[w(t)] ξ(t) = w1−p p

a.e. on T.

In particular, ∂ · p (w) = {ξ}. Our main result in this section is now stated as follows. Theorem 5.4. Suppose that D1 or D2 holds. Let x0 ∈ X satisfy (5.2) and let z ∈ X. Then x0 is an optimal solution of the minimization problem (5.1) subject to (5.2) if and only if there exist a finite subset {t1 , t2 , . . . , tk } of A and a finite subset {λt1 , λt2 , . . . , λtk } of R (0 ≤ k ≤ n) with the following properties: (a) x0 (ti ) = u(ti ) or x0 (ti ) = l(ti ) for each i = 1, 2, . . . , k; (b) for each i = 1, 2, . . . , k,  1 if x0 (ti ) = u(ti ), sgnλti = −1 if x0 (ti ) = l(ti ); (c) for each j = 1, 2, . . . , n,  |z(t) − x0 (t)|p−1 sgn[z(t) − x0 (t)] hj (t)dµ =

(5.3) T

 T

λti hj (ti ),

p > 1;

i=1

 sgn[z(t) − x0 (t)] hj (t)dµ +

(5.4)

k

β(t) hj (t)dµ = Z(z−x0 )

k

λti hj (ti ),

p = 1,

i=1

for some β ∈ L∞ with |β(t)| ≤ 1 a.e. on Z(z − x0 ). (Note: we adopt the convention that, if k = 0, the sums of the right-hand sides of (5.3) and (5.4) are read as zero.) In order to prove this theorem, we first do some preparation. We make two homeomorphic copies of A: one is still denoted by A while the other is denoted by ˇ Thus I can be Aˇ = {ˇ a : a ∈ A}, where a → a ˇ is a bijection. Let I = A ∪ A.

506

CHONG LI AND K. F. NG

topologized such that A, Aˇ are two disjoint compact subsets of I. For each t ∈ A, define gt : X → R by gt (x) = et (x) − u(t),

(5.5)

x ∈ X,

where et : X → R is the “unit point functional” at t defined by x ∈ X.

et (x) = x(t), ˇ define gtˇ : X → R by Similarly, for each tˇ ∈ A,

gtˇ(x) = l(t) − et (x),

(5.6)

x ∈ X.

Thus {gi : i ∈ I} is a family of affine functions on X such that, for each i ∈ I, (5.7)

|gi (x) − gi (y)| ≤ x − y,

x, y ∈ X.

In particular, {gi : i ∈ I} is equicontinuous on X, and the corresponding sup-function x → G(x) := sup{gi (x) : i ∈ I} is continuous. Note furthermore that the function i → gi (x) is upper semicontinuous for each x ∈ X. Clearly, x ∈ X satisfies (5.2) if and only if x ∈ L and gi (x) ≤ 0 for each i ∈ I. Assuming D1 , it is easy to verify that h0 ∈ L given in D1 satisfies the strict inequality G(h0 ) < 0 (thanks to the compactness of A); thus h0 is a Slater point for the system gi (·) ≤ 0, i ∈ I. Thus by Corollary 3.7, this system satisfies the BCQ relative to L at each point x in L ∩ S, where S := {x ∈ X : gi (x) ≤ 0, i ∈ I} (the assertion being trivial if x ∈ L ∩ int S). Similarly, if D2 holds, then it follows from Theorem 3.10 that this system satisfies the BCQ relative to L at each point x in L ∩ S. Note that if x ∈ bd S, then G(x) = 0, and I(x) defined in (1.7) is equal to the set ˜ I(x) = {i ∈ I : gi (x) = G(x) = 0} defined in (3.6); hence (5.8)

I(x) = {t ∈ A : x(t) = u(t)} ∪ {tˇ ∈ Aˇ : x(t) = l(t)}

whenever x ∈ bd S. Note that, since l(·) < u(·) on A, t ∈ I(x) implies tˇ ∈ / I(x); similarly, tˇ ∈ I(x) implies t ∈ / I(x). For easy reference, we record the following trivial fact:  i = t ∈ A, et , gi (x) = (5.9) ˇ −et , i = tˇ ∈ A. Now define f : X → R by  (5.10)

 p1 |z(t) − x(t)| dµ p

f (x) =

for each x ∈ X.

T

Then f is a continuous convex function on X, and x0 is a solution of the minimization problem (5.1) subject to (5.2) if and only if it is a solution of the problem of minimizing f subject to (5.11)

x ∈ L,

gi (x) ≤ 0

for each i ∈ I,

where {gi : i ∈ I} is defined by (5.5) and (5.6). Since f (x) defined in (5.10) depends only on the restriction of x to T , the following result follows from Lemmas 5.2 and 5.3 together with the chain rule.

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

507

Proposition 5.5. Suppose that (z − x0 )|T is not the zero-element of Lp (T, µ). Then the following assertions hold: (a) The case when p = 1. A functional y ∗ ∈ ∂f (x0 ) if and only if there exists β ∈ L∞ (T, µ) such that |β(t)| ≤ 1

(5.12) (5.13)

β(t) = sgn[z(t) − x0 (t)]

and (5.14)

y ∗ , x = −

a.e. on T, a.e. on T \ Z(z − x),

 x(t)β(t)dµ

for each x ∈ X.

T

(b) The case when p ∈ (1, +∞). A functional y ∗ ∈ ∂f (x0 ) if and only if  (5.15) y ∗ , x = −c x(t)|z(t) − x0 (t)|p−1 sgn[z(t) − x0 (t)]dµ for each x ∈ X, T

where c := (z − x0 )|T 1−p . p Proof of Theorem 5.4. Let f, gi , I, and S be the same as those introduced after the statement of Theorem 5.4. Since the proof for the case when x0 ∈ L ∩ int S is easier (in this case one can take k = 0, and the right-hand sides of (5.3) and (5.4) are replaced by zero), it suffices to consider the case when x0 ∈ L ∩ bd S. As already noted, by assumption D1 or D2 , the system of inequalities gi (·) ≤ 0, i ∈ I, satisfies the BCQ at x0 relative to L. Noting that x∗ = 0 on L whenever x∗ ∈ NL (x0 ), it follows from Theorem 4.1 that the following statements are equivalent: (i) x0 is a minimizer of f subject to (5.11). (ii) There exist a finite subset I0 of I(x0 ), y ∗ ∈ ∂f (x0 ), µi > 0, and x∗i ∈ ∂gi (x0 ) (f or all i ∈ I0 ) such that

y∗ + (5.16) µi x∗i = 0 on L. i∈I0

(ii∗ ) Same as (ii) but with (5.16) replaced by

cy ∗ + µi x∗i = 0 on L, (5.17) i∈I0

where c is a positive constant. In view of (5.8) and (5.9), we note that, in (ii) and (ii∗ ), each x∗i can be expressed as either et or −et :  if i = t ∈ I(x0 ) ∩ A (i.e., if x0 (t) = u(t)), et x∗i = −et if i = tˇ ∈ I(x0 ) ∩ Aˇ (i.e., if x0 (t) = l(t)). Thus µi x∗i can be expressed in the form λti eti for some λti ∈ R and some ti ∈ A satisfying x0 (ti ) = u(ti ) or x0 (ti ) = l(ti ). Explicitly, λti is defined by  if x0 (ti ) = u(ti ) and ti ∈ I0 , µti λti = −µti if x0 (ti ) = l(ti ) and tˇi ∈ I0 . Moreover, when (ii) (or (ii∗ )) holds, we can suppose I0 satisfies an additional property that |I0 |, the number of elements of I0 , is not greater than dim L, the dimension of

508

CHONG LI AND K. F. NG

L. To see this, we suppose, without loss of generality, that h1 , h2 , . . . , hn are linearly independent, so dim L = n. Let V denote the subset of Rn defined by {(x∗i , h1 , x∗i , h2 , . . . , x∗i , hn ) : i ∈ I0 } and U := V ∪ {(y ∗ , h1 , y ∗ , h2 , . . . , y ∗ , hn )}. Since L is spanned by h1 , h2 , . . . , hn , (ii) implies that 0 ∈ co U. Note that (5.18)

x∗i , h0 − x0  < 0

for each i ∈ I0 .

Indeed, if i = tˇ ∈ I0 with some t ∈ A, then x0 (t) = l(t) and x∗i , h0  = −et , h0  = −h0 (t) < −l(t) = −et , x0  = x∗i , x0 ; thus (5.18) is true, as the case when i = t ∈ I0 with some t ∈ A can be considered similarly. Since h0 − x0 ∈ L and L is spanned by h1 , h2 , . . . , hn , it follows from (5.18) that 0 ∈ / co V; thus 0 ∈ co U \ co V. ¯ij ≥ 0 (j = 1, 2, . . . , n) By the Carath´eodory theorem (cf. [5]), there exist µ ¯0 > 0, µ n with µ ¯0 + j=1 µ ¯ij = 1 and i1 , i2 , . . . , in ∈ I0 such that µ ¯0 y ∗ +

n

µij x∗ij = 0

at h1 , h2 , . . . , hn .

j=1

Therefore, I0 in (ii) can be replaced by its subset with at most n elements. Making use of Proposition 5.5 and noting that L is spanned by h1 , h2 , . . . , hn , we now see that (ii) (or (ii∗ )) holds if and only if (a), (b), and (c) of Theorem 5.4 hold for some finite subsets {t1 , t2 , . . . , tk } ⊆ A and {λt1 , λt2 , . . . , λtk } ⊆ R, where k ≤ n. Finally, as we already noted, (i) holds if and only if x0 is an optimal solution of the minimization problem (5.1) subject to (5.2). This completes the proof of Theorem 5.4. 6. Approximation problem with constraints by conditionally positive semidefinite functions. This problem is studied in a recent paper [22] by Strauss. Following his approach, let Q be a nonempty subset of a Euclidean space Rd and let F (Q) denote the class of all real-valued functions on Q. Let H be a vector subspace of F (Q) and suppose that H is equipped with a seminorm induced by a semi-inner product ·, · on H ×H. Let U = span {u1 , u2 , . . . , um } be an m-dimensional subspace of H, and let K be a reproducing kernel with respect to U (for undefined terms, see [1, 18]). Let A = {z1 , z2 , . . . , zn } ⊆ Q and let CA denote the (m×n)-matrix defined by CA = (uj (zi ))m,n j=1,i=1 . Let Wn , Wm , respectively, denote an (n×n)-matrix and an (m×m)-matrix; we assume that Wn and Wm are positive semidefinite. Define a seminorm  ·  on Rn × Rm by (a, b)2W = aT Wn a + bT Wm b for all a ∈ Rn , b ∈ Rm . Suppose that we are given the following data: h ∈ H, {ξi : i ∈ I} ⊆ R, {pi : i ∈ I} ⊆ Rn , and {qi : i ∈ I} ⊆ Rm , where I is an index set. Let   ZI = (a, b) : a ∈ Rn , b ∈ Rm , CA a = 0, pTi a + qTi b ≤ ξi , i ∈ I

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

509

and let dˆ : Rn × Rm → R be defined by ⎛ ⎞ 2   n m



  2 ⎟ ˆ b) = 1 ⎜ h− ai K(·, zi ) − bj u j  d(a, ⎝   + (a, b)W ⎠ , 2   i=1 j=1 where a = (a1 , a2 , . . . , an ) ∈ Rn and b = (b1 , b2 , . . . , bm ) ∈ Rm . Under the obvious modifications, ZI and dˆ can be defined when m = 0, and our discussions below are valid for this case as well as for the case when m = 0. The problem to be considered here is (6.1)

ˆ b) minimize d(a,

subject to

(a, b) ∈ ZI .

Define (6.2)

UA⊥ = {(a, b) ∈ Rn × Rm : CA a = 0},

(6.3)

KA = (K(zi , zj ))ni,j=1 ,

(6.4)

hA = (h(z1 ), . . . , h(zn ))T ,

and (6.5)

eˆ(a, b) =

 1 T 1 a (KA + Wn )a + bT Wm b − hTA a + h, h. 2 2

By Proposition 2.3 and the proof of Proposition 4.1 in [22], we have (6.6)

ˆ b) = eˆ(a, b) d(a,

for all (a, b) ∈ UA⊥

and (6.7)

KA is positive semidefinite with respect to UAT

in the sense that aT KA a ≥ 0 whenever (a, b) ∈ UAT . This implies of course that KA + Wn is also positive semidefinite, and so the quadratic function eˆ is convex on UAT . We always assume that (6.1) has a solution: let (a0 , b0 ) denote a solution of (6.1). Let r(CA ) denote the rank of the matrix CA . Finally, |J| denotes the number of elements of a finite set J. The following result was asserted as a theorem ([22, Theorem 6.2]): Suppose that (a0 , b0 ) is a solution of (6.1), r(CA ) = m, and that there exists ¯ ∈ Rn × Rm such that (¯ a, b) (6.8)

¯ = 0, CA a

¯ < ξi ¯ + qTi b pTi a

for all i ∈ I.

Then there exists a finite subset I0 of I with |I0 | ≤ n such that (a0 , b0 ) also minimizes ˆ b) subject to (a, b) ∈ ZI . d(a, 0 While we will give a counterexample below to show this result is false, we put forward the following corrected version.

510

CHONG LI AND K. F. NG

Theorem 6.1. Let (a0 , b0 ) be a solution of (6.1). Assume that the system of linear inequalities pTi a + qTi b ≤ ξi ,

(6.9)

i ∈ I,

satisfies the BCQ relative to UA⊥ at (a0 , b0 ). Then there exists a finite subset I0 of ˆ b) subject to I with |I0 | ≤ m + n − r(CA ) such that (a0 , b0 ) also minimizes d(a, (a, b) ∈ ZI0 . Proof. By assumption and (6.6), (a0 , b0 ) is a solution of the following minimization problem: subject to (a, b) ∈ UA⊥ and (6.9).

Minimize eˆ(a, b)

We shall use Dˆ e(a0 , b0 )|U ⊥ to denote the restriction to UA⊥ of the Gateaux (equivA alently, Fr´echet) derivative of the function eˆ at (a0 , b0 ). Since, by assumption, the system (6.9) in X := Rn × Rm satisfies the BCQ relative to UA⊥ at (a0 , b0 ), one can apply the implication (i)=⇒ (ii∗ ) of Theorem 4.1 to conclude that there exist a finite subset I0 of I(a0 , b0 ) and a collection {λi : i ∈ I0 } of nonnegative real numbers such that

Dˆ e(a0 , b0 )|U ⊥ + (6.10) λi (pTi , qTi )|U ⊥ = 0, A

A

i∈I0

because x∗ |U ⊥ = 0 whenever x∗ ∈ NU ⊥ (a0 , b0 ). We write µ for the sum A

A



i∈I0

λi and

ˆ on U ⊥ by assume that µ = 0 (otherwise, (a0 , b0 ) must be a minimal point of eˆ(= d) A ⊥ convexity of eˆ on UA ). Thus −Dˆ e(a0 , b0 )|U ⊥ ∈ cone{(pTi , qTi )|U ⊥ : i ∈ I0 }.

(6.11)

A

A

Since dim UA⊥ = n + m − r(CA ), it follows from [19, Corollary 17.1.2] that I0 in (6.11) can be replaced by a subset I0 with |I0 | ≤ n + m − r(CA ). Note that (6.10) continues to hold when I0 is replaced by I0 (replace the family {λi } by a suitable new one if necessary). Consequently, by (ii∗ ) of Theorem 4.1, (a0 , b0 ) minimizes eˆ(a, b) on UA⊥ subject to the system of linear inequalities pTi a + qTi b ≤ ξi , i ∈ I0 . This completes the proof as before. Example 6.1. Let A = Q consist of two distinct elements z1 , z2 . We identify F (Q) with R2 in the obvious manner. Let H = F (Q) with the usual inner product. Take h = (1, 0)T ∈ H, m = 0, Wm = 0, U = {0}, CA = 0, and UA⊥ = R2 . Define Wn 1 by Wn = ( 04 01 ) and fix the reproducing kernel K : Q → R defined by 4

 K(x, y) =

1 0

if x = y, if x = y.

Take I = {1, 2, . . .}, ξi = 0, and pi = (1, 1i ) for i ∈ I. Then, (6.8) is satisfied for ¯. Note appropriate a ! 1 1 2 2 ˆ ˆ h − a + a d(a) := d(a, b) = 2 4

CONSTRAINT QUALIFICATION FOR INFINITE SYSTEMS

and

511

 ZI =

 1 a ∈ R2 : a1 + a2 ≤ 0 for all i ∈ I . i

Clearly, ZI is the convex set of R2 containing the 4th quadrant and bounded by the two half-lines, respectively, contained in a1 + a2 = 0 and a1 = 0. It follows easily that the origin 0 ∈ R2 is the nearest point to h from ZI . On the other hand, if I0 is a finite subset of I, then ZI0 is also bounded by the two half-lines, respectively, contained in a1 + i1− a2 = 0 and a1 + i1+ a2 = 0, where i− := min I0 , i+ := max I0 . 1 T Let aI0 denote the nearest point to h from ZI0 . Then aI0 = 1+i and 2 (1, −i+ ) +

h2 = aI0 2 + h − aI0 2 . This implies that ! 1 1 1 2 2 2 ˆ I ), ˆ aI0  + h − aI0  = d(a d(0) = h > 0 2 2 4 while ˆ = 1 h − 02 ≤ 1 h − a2 ≤ d(a) ˆ d(0) 2 2

for all a ∈ ZI .

Therefore [22, Theorem 6.2] mentioned earlier is not true. Remark 6.1. Let li be the function on Rn × Rm defined by li (a, b) = pTi ai + qTi b − ξi ,

(a, b) ∈ Rn × Rm .

Suppose that I can be topologized in such a way that it is a compact space satisfying the first axiom of countability and that i → li (a, b) is upper semicontinuous on I for each (a, b) ∈ UA⊥ . Then, by Corollary 3.7, the linear inequality system (6.9) satisfies the BCQ relative to UA⊥ at (a0 , b0 ) if (6.8) is satisfied. In this particular case, the conclusion of [22, Theorem 6.2] mentioned earlier does hold. Similarly, if I can be topologized in such a way that it is a compact metric space and that i → li (a, b) is continuous on i ∈ I for each (a, b) ∈ UA⊥ , then, by Theorem 3.10, the aforesaid result is also valid even if (6.8) is replaced by a weaker condition given in the following: for ¯ ∈ Rn × Rm such that each subset J of I with |J| ≤ m + n − r(CA ), there exists (¯ a, b) (6.12)

¯ = 0, CA a

¯ < ξi ¯ + qTi b pTi a

for all i ∈ J.

Acknowledgments. We wish to thank two referees for their valuable comments and suggestions. REFERENCES [1] M. Atteia, Hilbertian Kernels and Spline Functions, North-Holland, Amsterdam, 1992. [2] H. Bauschke, J. Borwein, and W. Li, Strong conical hull intersection property, bounded linear regularity, Jameson’s property (G), and error bounds in convex optimization, Math. Program. Ser. A, 86 (1999), pp. 135–160. [3] J. F. Bonnans and A. Shapiro, Perturbation Analysis of Optimization Problems, SpringerVerlag New York, 2000. [4] D. Braess, Nonlinear Approximation Theory, Springer-Verlag, New York, 1986. [5] E. W. Cheney, Introduction to Approximation Theory, McGraw-Hill, New York, 1966. [6] F. Clarke, Optimization and Nonsmooth Analysis, John Wiley, New York, 1983. [7] F. Deutsch, Some Applications of Functional Analysis to Approximation Theory, Doctoral dissertation, Brown University, Providence, RI, 1965.

512

CHONG LI AND K. F. NG

[8] F. Deutsch, W. Li, and J. D. Ward, Best approximation from the intersection of a closed convex set and a polyhedron in Hilbert space, weak Slater conditions, and the strong conical hull intersection property, SIAM J. Optim., 10 (1999), pp. 252–268. ´chal, Convex Analysis and Minimization Algorithms. [9] J.-B. Hiriart-Urruty and C. Lemare I. Fundamentals Grundlehren der Mathematischen Wissenschaften 305, Springer-Verlag, Beriln, 1993. [10] K. M. Levasseur and J. T. Levis, Least-squares approximation with restricted range, J. Approx. Theory, 36 (1982), pp. 304–316. [11] C. Li, On best uniform restricted range approximation in complex-valued continuous function spaces, J. Approx. Theory, 120 (2003), pp. 71–84. [12] C. Li and X.-Q. Jin, Nonlinearly constrained best approximation in Hilbert spaces: The strong CHIP and the basic constraint qualification, SIAM J. Optim., 13 (2002), pp. 228–239. [13] C. Li and K. F. Ng, On best approximation by nonconvex sets and perturbation of nonconvex inequality systems in Hilbert spaces, SIAM J. Optim., 13 (2002), pp. 726–744. [14] C. Li and K. F. Ng, Constraint qualification, the strong CHIP, and best approximation with convex constraints in Banach spaces, SIAM J. Optim., 14 (2003), pp. 584–607. [15] W. Li, Abadie’s constraint qualification, metric regularity, and error bounds for differentiable convex inequalities, SIAM J. Optim., 7 (1997), pp. 966–978. [16] W. Li, C. Nahak, and I. Singer, Constraint qualification for semi-infinite systems of convex inequalities, SIAM J. Optim., 11 (2000), pp. 31–52. [17] O. Mangasarian, Nonlinear Programming, McGraw-Hill, New York, 1969. [18] W. R. Madych and S. A. Nelson, Multivariate interpolation and conditionally positive definite functions, Approx. Theory Appl., 4 (1988), pp. 77–89. [19] R. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, 1970. [20] G. Sh. Rubenstein, On an extremal problem in a linear normed space, Sibirsk. Mat. Z., 6 (1965), pp. 711–714. (in Russian). [21] G. S. Smirnov and R. G. Smirnov, Best uniform restricted ranges approximation II, C. R. Acad. Sci. Paris Ser. I Math., 330 (2000), pp. 1059–1064. [22] H. Strauss, Approximation with constraints by conditionally positive definite functions, Numer. Algorithms, 30 (2002), pp. 185–198. [23] Y. Yuan and W. Sun, Optimization Theory and Methods, Science Press, Beijing, 1997 (in Chinese).