
OPTIMIZATION, 2017 http://dx.doi.org/10.1080/02331934.2017.1318133

A new linear convergence result for the iterative soft thresholding algorithm

Lufang Zhang (a), Yaohua Hu (b), Chong Li (a) and Jen-Chih Yao (c,d)

(a) School of Mathematical Sciences, Zhejiang University, Hangzhou, P.R. China; (b) College of Mathematics and Statistics, Shenzhen University, Shenzhen, P.R. China; (c) Center for General Education, China Medical University, Taichung, Taiwan; (d) Department of Mathematics, King Abdulaziz University, Jeddah, Saudi Arabia

ABSTRACT

The iterative soft thresholding algorithm (ISTA) is one of the most popular optimization algorithms for solving the $\ell^1$ regularized least squares problem, and its linear convergence has been investigated under the assumption of the finite basis injectivity property or of the strict sparsity pattern. In this paper, we consider the $\ell^1$ regularized least squares problem in a finite- or infinite-dimensional Hilbert space, introduce a weaker notion of orthogonal sparsity pattern (OSP) and establish the Q-linear convergence of ISTA under the assumption of OSP. Examples are provided to illustrate cases where the linear convergence of ISTA can be established only by our result and cannot be ensured by any existing result in the literature.

ARTICLE HISTORY

Received 21 January 2017; Accepted 30 March 2017

KEYWORDS

Iterative soft thresholding algorithm; linear convergence analysis; linear inverse problems; sparsity pattern

1. Introduction

Let $H$ be a Hilbert space, and let $\ell^2$ denote the Hilbert space consisting of all square-summable sequences. Let $N \in \mathbb{N} \cup \{+\infty\}$ be fixed, and write

$$\ell^2_N := \begin{cases} \mathbb{R}^N, & \text{if } N \in \mathbb{N},\\ \ell^2, & \text{otherwise}, \end{cases} \qquad\text{and}\qquad I_N := \begin{cases} \{1, \dots, N\}, & \text{if } N \in \mathbb{N},\\ \mathbb{N}, & \text{otherwise}. \end{cases}$$

In this paper, we consider the following $\ell^1$ regularized least squares problem

$$\min_{u \in \ell^2_N}\ \frac{1}{2}\|Ku - h\|^2 + \sum_{k=1}^{N} \omega_k |u_k|, \tag{1}$$

where $K : \ell^2_N \to H$ is a bounded linear operator, and $\omega := (\omega_k)$ is a sequence of weights satisfying

$$\omega_k \ge \underline{\omega} > 0 \quad \text{for any } k \in I_N. \tag{2}$$

In the last decade, problem (1) has been widely studied to approach a sparse approximate solution of the linear inverse problem and has gained successful applications in a wide range of fields, such as compressive sensing [1–3], image science [4–6], systems biology [7–9] and machine learning [10–12] in finite-dimensional spaces, and Fourier analysis [13,14] and harmonic analysis [15,16] in infinite-dimensional spaces.



Motivated by the successful applications of the $\ell^1$ regularization problem (1), many practical and efficient optimization algorithms have been proposed to solve problem (1); see [2,11,12,17–23] and references therein. In particular, the iterative soft thresholding algorithm (ISTA) is one of the most widely studied first-order iterative algorithms for solving problem (1). The ISTA was originally proposed to solve image deconvolution problems in Euclidean spaces: it was introduced independently by Figueiredo and Nowak [24], to approach a penalized maximum likelihood estimator under the name of the EM algorithm, and by Starck et al. [25], to minimize a total-variation regularized least squares problem; it was first investigated for Hilbert spaces in [26]. The ISTA is formally described as follows.

Algorithm 1: Let an initial point $u^0 \in \ell^2_N$ be given. Having $u^n$, we choose a step size $s_n > 0$ and determine $u^{n+1}$ by

$$u^{n+1} := \mathbb{S}_{s_n\omega}\big(u^n - s_nK^*(Ku^n - h)\big),$$

where $\mathbb{S}_{s_n\omega} : \ell^2_N \to \ell^2_N$ is a soft thresholding operator, defined by

$$\mathbb{S}_{s_n\omega}(v) := \big(\operatorname{sign}(v_k)\cdot(|v_k| - s_n\omega_k)_+\big)_{k\in I_N} \quad \text{for each } v := (v_k) \in \ell^2_N. \tag{3}$$
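For readers who want to experiment with the iteration, the following minimal sketch (in Python with NumPy, restricted to the finite-dimensional case $N \in \mathbb{N}$; the function names, the constant step size and the iteration count are our own illustrative choices, not part of the original algorithm statement) implements Algorithm 1 with the soft thresholding operator (3):

import numpy as np

def soft_threshold(v, t):
    # Componentwise soft thresholding: sign(v_k) * (|v_k| - t_k)_+, cf. (3).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(K, h, omega, s, num_iter=500):
    # A sketch of Algorithm 1 with a constant step size s_n = s.
    # K: (m, N) array, h: (m,) array, omega: (N,) array of positive weights.
    u = np.zeros(K.shape[1])
    for _ in range(num_iter):
        grad = K.T @ (K @ u - h)                 # K^*(K u^n - h)
        u = soft_threshold(u - s * grad, s * omega)
    return u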

Under the assumption that the step sizes $\{s_n\}$ satisfy $0 < \underline{s} \le s_n \le \bar{s}$ …


By Theorem 1.2 and Remark 1, we directly obtain the following corollary, which was proved in [28] for the case when $N \in \mathbb{N}$ and in [14] for the case when $N = \infty$.

Corollary 1.3: Let $\{u^n\}$ be a sequence generated by Algorithm 1 satisfying (4) and (5). Then $\{u^n\}$ converges to a solution $u^*$ of problem (1). Moreover, $\{u^n\}$ converges linearly to $u^*$ provided that either of the following assumptions holds:

(a) the J-BI is satisfied;
(b) the SSP is satisfied at $u^*$.

The proof of Theorem 1.2 is presented in the next section. Examples are provided in Section 3 to illustrate cases where our result applies but neither the result of [28] nor that of [14] does.

2. Proof of Theorem 1.2

Let $I \subseteq I_N$ and $C \subseteq \ell^2_N$. As usual, we use $C^\perp$ and $I^c$ to denote the orthogonal complement of $C$ and the complement of $I$, respectively. As presented in the preceding section, let $K : \ell^2_N \to H$ be a bounded linear operator. The kernel and image of $K$ are, respectively, defined by $\ker K = \{u \in \ell^2_N : Ku = 0\}$ and $\operatorname{im} K = \{Ku : u \in \ell^2_N\}$. The restriction of $K$ to $C$ is denoted by $K|_C : C \to H$ and defined by

$$K|_C(u) := Ku \quad \text{for each } u \in C.$$

Note by (6) that $K|_I = K|_{E_I}$ for each index set $I \subseteq I_N$.

To accomplish the proof of Theorem 1.2, we first present some basic properties of the projection operator in the following lemmas, in which Lemma 2.1(a) (resp. (b), (c), (d)) is taken from Theorem 3.14 (resp. Corollary 3.22(iii), (vi), Proposition 3.19) of [31], and Lemma 2.2 is a direct consequence of [31, Fact 2.18] and Lemma 2.1(c).

Lemma 2.1: Let $C$ be a closed linear subspace of $\ell^2_N$ and $x \in \ell^2_N$. Then the following assertions hold:

(a) $z = P_C(x)$ if and only if $z \in C$ and $\langle x - z, y\rangle = 0$ for any $y \in C$;
(b) $P_C$ is a linear and continuous operator with $\|P_C\| \le 1$;
(c) $P_C^* = P_C$;
(d) $P_C$ is idempotent, i.e. $P_C^2 = P_C$.

Lemma 2.2: Let $I \subseteq I_N$ be a finite index set. Then the following assertion holds:

$$\left(\ker(K P_{E_I})\right)^{\perp} = \operatorname{im}(P_{E_I} K^*).$$
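In the coordinate setting of this paper, $P_{E_I}$ simply zeroes out the coordinates outside $I$, so Lemma 2.1(c)-(d) and Lemma 2.2 can be checked numerically in small dimensions. The following sketch (our own finite-dimensional illustration with NumPy/SciPy; it is not part of the proofs) does so for a random matrix:

import numpy as np
from scipy.linalg import null_space, orth

rng = np.random.default_rng(0)
N, m = 6, 3
K = rng.standard_normal((m, N))
I = [1, 4]                                  # an arbitrary finite index set

P = np.zeros((N, N))
P[I, I] = 1.0                               # P_{E_I}: keep the coordinates in I, zero the rest
assert np.allclose(P, P.T) and np.allclose(P @ P, P)   # Lemma 2.1(c) and (d)

U_perp = orth(P @ K.T)                      # orthonormal basis of im(P_{E_I} K^*)
Z = null_space(K @ P)                       # orthonormal basis of ker(K P_{E_I})
assert np.allclose(U_perp.T @ Z, 0.0)       # the two subspaces are orthogonal ...
assert U_perp.shape[1] + Z.shape[1] == N    # ... and their dimensions add up to N (Lemma 2.2)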

Lemma 2.3: Let $x, y \in \ell^2_N$, and let $I_1, I_2 \subseteq I_N$ be such that $I_1 \cap I_2 = \emptyset$ and

$$\langle K P_{E_{I_1}}(u), K P_{E_{I_2}}(u)\rangle = 0 \quad \text{for any } u \in \ell^2_N. \tag{10}$$

Then the following assertion holds: $\langle K P_{E_{I_1}}(x), K P_{E_{I_2}}(y)\rangle = 0$.

Proof: Let $u := P_{E_{I_1}}(x) + P_{E_{I_2}}(y) \in \ell^2_N$. Then it follows that

$$\langle K P_{E_{I_1}}(x), K P_{E_{I_2}}(y)\rangle = \langle K P_{E_{I_1}}(u), K P_{E_{I_2}}(u)\rangle = 0$$


(due to (10) and Lemma 2.1). The proof is complete.

Associated to problem (1), one can directly check, using the optimality condition of convex optimization [32], that

$$u^* \in S \iff \big({-}K^*(Ku^* - h)\big)_k \begin{cases} = \omega_k, & \text{if } u^*_k > 0,\\ \in [-\omega_k, \omega_k], & \text{if } u^*_k = 0,\\ = -\omega_k, & \text{if } u^*_k < 0, \end{cases} \quad \text{for any } k \in I_N. \tag{11}$$

For the remainder of this section, we always assume that

(A1) $\{u^n\}$ is generated by Algorithm 1 satisfying (4) and (5);
(A2) $u^* := \lim_{n\to\infty} u^n \in S$.

For simplicity, we further write

$$v^* := -K^*(Ku^* - h) \quad\text{and}\quad v^n := -K^*(Ku^n - h) \ \text{ for each } n \in \mathbb{N}. \tag{12}$$

Then we have that

$$v^* = \lim_{n\to\infty} v^n. \tag{13}$$
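As a computational counterpart of (11), the following helper (our own illustrative sketch for the finite-dimensional case; the tolerance tol is not part of the paper) checks the three cases of (11) componentwise for a candidate $u^*$:

import numpy as np

def satisfies_condition_11(K, h, omega, u_star, tol=1e-8):
    # v* = -K^*(K u* - h); check the three cases of (11) for every component k.
    v = -K.T @ (K @ u_star - h)
    pos, neg = u_star > tol, u_star < -tol
    zero = ~pos & ~neg
    return (np.all(np.abs(v[pos] - omega[pos]) <= tol)
            and np.all(np.abs(v[neg] + omega[neg]) <= tol)
            and np.all(np.abs(v[zero]) <= omega[zero] + tol))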

Next, we provide the following three lemmas for the iterative procedure of Algorithm 1. Recall that $J$ is defined by (7).

Lemma 2.4: Let $I \subseteq J$. Then there exists $M \in \mathbb{N}$ such that

$$P_{E_{I^c}}(u^n - u^*) = P_{E_{J\setminus I}}(u^n - u^*) \quad \text{for any } n > M. \tag{14}$$

Proof: By (7), (11) and (12), one has that

$$J^c = \{k \in I_N : |v^*_k| < \omega_k\} \subseteq \{k \in I_N : u^*_k = 0\}. \tag{15}$$

Set $v^*_k = 0$ and $\omega_k = \underline{\omega}$ for each $k > N$ in the case when $N < +\infty$. Then $v^* \in \ell^2$ for each $N \in \mathbb{N}\cup\{+\infty\}$. Hence it follows that $\lim_{k\to\infty}|v^*_k| = 0$, and so it follows from (2) that $\lim_{k\to\infty}\frac{|v^*_k|}{\omega_k} = 0$. Fix $\tau_0 \in (0, 1)$. Then there exists $M \in \mathbb{N}$ such that

$$\frac{|v^*_k|}{\omega_k} \le \tau_0 \quad \text{for any } k \ge M.$$

Let $\tau := \max\left\{\tau_0,\ \max\left\{\frac{|v^*_k|}{\omega_k} : k \in J^c,\ k \le M\right\}\right\}$. Hence we have that $\tau \in (0, 1)$ and that

$$v^*_k \in [-\tau\omega_k, \tau\omega_k] \quad \text{for any } k \in J^c. \tag{16}$$

By assumption (A2) and (13), there exists $M \in \mathbb{N}$ such that

$$\|u^n - u^*\| \le \frac{1-\tau}{2}\,\underline{s}\,\underline{\omega} \quad\text{and}\quad \|v^n - v^*\| \le \frac{1-\tau}{2}\,\underline{\omega} \quad \text{for any } n \ge M. \tag{17}$$

Fix $i \in J^c$ and $n \ge M$. Note by (15) that $u^*_i = 0$ and by (2) that $\omega_i \ge \underline{\omega} > 0$. Then it follows from (17) and (4) that

$$|u^n_i| = |u^n_i - u^*_i| \le \frac{1-\tau}{2}\,\underline{s}\,\underline{\omega} \le \frac{1-\tau}{2}\,s_n\omega_i,$$


and from (16) and (17) that

$$|v^n_i| \le |v^n_i - v^*_i| + |v^*_i| \le \frac{1-\tau}{2}\,\underline{\omega} + \tau\omega_i \le \frac{1-\tau}{2}\,\omega_i + \tau\omega_i = \frac{1+\tau}{2}\,\omega_i.$$

Combining the above two inequalities, we obtain that $|u^n_i + s_nv^n_i| \le |u^n_i| + s_n|v^n_i| \le s_n\omega_i$. Hence, in view of Algorithm 1 and by (12), one has that

$$u^{n+1}_i = \operatorname{sign}(u^n_i + s_nv^n_i)\cdot\big(|u^n_i + s_nv^n_i| - s_n\omega_i\big)_+ = 0.$$

Since $i \in J^c$ is arbitrary, we have that

$$P_{E_{J^c}}(u^{n+1}) = 0. \tag{18}$$

Note that $E_{J\setminus I} \perp E_{J^c}$ and $E_{I^c} = E_{J\setminus I} + E_{J^c}$ (since $I \subseteq J$). Then it follows that

$$P_{E_{I^c}}(u^{n+1}) = P_{E_{J^c}}(u^{n+1}) + P_{E_{J\setminus I}}(u^{n+1}) = P_{E_{J\setminus I}}(u^{n+1}) \tag{19}$$

(due to (18)). By (15), we obtain that $P_{E_{I^c}}(u^*) = P_{E_{J^c}}(u^*) + P_{E_{J\setminus I}}(u^*) = P_{E_{J\setminus I}}(u^*)$. This, together with (19) and Lemma 2.1(b), implies (14), and the proof is complete.

Lemma 2.5: Let $I \subseteq J$ be such that the I-BI and (9) are satisfied. Then there exist $\lambda \in (0, 1)$ and $M \in \mathbb{N}$ such that

$$\|P_{E_I}(u^{n+1} - u^*)\| \le \lambda\|P_{E_I}(u^n - u^*)\| \quad \text{for any } n > M. \tag{20}$$

Proof: By assumption, Lemma 2.4 is applicable to concluding that there exists $M \in \mathbb{N}$ such that (14) holds. One checks by definition (cf. (3)) that $\mathbb{S}_{s_n\omega}$ is nonexpansive, that is,

$$\|\mathbb{S}_{s_n\omega}(u) - \mathbb{S}_{s_n\omega}(v)\| \le \|u - v\| \quad \text{for any } u, v \in \ell^2_N. \tag{21}$$

Fix $n > M$. In view of Algorithm 1, by (21) and Lemma 2.1(b), one has that

$$\begin{aligned} \|P_{E_I}(u^{n+1} - u^*)\| &= \big\|P_{E_I}\big(\mathbb{S}_{s_n\omega}(u^n - s_nK^*(Ku^n - h)) - \mathbb{S}_{s_n\omega}(u^* - s_nK^*(Ku^* - h))\big)\big\| \\ &= \big\|\mathbb{S}_{s_n\omega}P_{E_I}(u^n - s_nK^*(Ku^n - h)) - \mathbb{S}_{s_n\omega}P_{E_I}(u^* - s_nK^*(Ku^* - h))\big\| \\ &\le \big\|P_{E_I}\big(u^n - s_nK^*(Ku^n - h) - (u^* - s_nK^*(Ku^* - h))\big)\big\| \\ &= \big\|P_{E_I}(I - s_nK^*K)(u^n - u^*)\big\|. \end{aligned}$$

Noting that $u = P_{E_{I^c}}(u) + P_{E_I}(u)$ for any $u \in \ell^2_N$, we obtain by Lemma 2.1 that $P_{E_I}P_{E_{I^c}} = 0$ and that $P_{E_I}$ is linear and idempotent, and then it follows from the above that

$$\begin{aligned} \|P_{E_I}(u^{n+1} - u^*)\| &\le \big\|P_{E_I}(I - s_nK^*K)P_{E_I}(u^n - u^*)\big\| + \big\|P_{E_I}(I - s_nK^*K)P_{E_{I^c}}(u^n - u^*)\big\| \\ &= \big\|(P_{E_I} - s_nP_{E_I}K^*KP_{E_I})P_{E_I}(u^n - u^*)\big\| + s_n\big\|P_{E_I}K^*KP_{E_{I^c}}(u^n - u^*)\big\|. \end{aligned} \tag{22}$$

Then we claim that

$$P_{E_I}K^*KP_{E_{I^c}}(u^n - u^*) = 0. \tag{23}$$


Indeed, by (14) and Lemma 2.1(c) and (d), we obtain that

$$\begin{aligned} \|P_{E_I}K^*KP_{E_{I^c}}(u^n - u^*)\|^2 &= \|P_{E_I}K^*KP_{E_{J\setminus I}}(u^n - u^*)\|^2 \\ &= \langle P_{E_I}K^*KP_{E_{J\setminus I}}(u^n - u^*),\ P_{E_I}K^*KP_{E_{J\setminus I}}(u^n - u^*)\rangle \\ &= \langle KP_{E_{J\setminus I}}(u^n - u^*),\ KP_{E_I}K^*KP_{E_{J\setminus I}}(u^n - u^*)\rangle. \end{aligned}$$

By the assumption that (9) is satisfied, Lemma 2.3 is applicable (with $I$, $J\setminus I$ in place of $I_1$, $I_2$); hence (23) is proved. Together with (23), (22) reduces to

$$\|P_{E_I}(u^{n+1} - u^*)\| \le \|(P_{E_I} - s_nP_{E_I}K^*KP_{E_I})P_{E_I}(u^n - u^*)\| \le \|P_{E_I} - s_nP_{E_I}K^*KP_{E_I}\|\,\|P_{E_I}(u^n - u^*)\|. \tag{24}$$

We end this proof by estimating $\|P_{E_I} - s_nP_{E_I}K^*KP_{E_I}\|$. One has by definition that

$$\begin{aligned} \|P_{E_I} - s_nP_{E_I}K^*KP_{E_I}\|^2 &= \sup_{\|u\|=1}\ \langle (P_{E_I} - s_nP_{E_I}K^*KP_{E_I})(u),\ (P_{E_I} - s_nP_{E_I}K^*KP_{E_I})(u)\rangle \\ &= \sup_{\|u\|=1}\ \big(\|P_{E_I}(u)\|^2 - 2s_n\langle P_{E_I}(u), P_{E_I}K^*KP_{E_I}(u)\rangle + s_n^2\|P_{E_I}K^*KP_{E_I}(u)\|^2\big). \end{aligned} \tag{25}$$

By Lemma 2.1, one has that

$$\langle P_{E_I}(u), P_{E_I}K^*KP_{E_I}(u)\rangle = \langle KP_{E_I}(u), KP_{E_I}(u)\rangle = \|KP_{E_I}(u)\|^2, \tag{26}$$

and that

$$\|P_{E_I}K^*KP_{E_I}(u)\|^2 \le \|KP_{E_I}\|^2\,\|KP_{E_I}(u)\|^2 \le \|K\|^2\,\|KP_{E_I}(u)\|^2. \tag{27}$$

Together with (26) and (27), (25) implies that

$$\|P_{E_I} - s_nP_{E_I}K^*KP_{E_I}\|^2 \le \sup_{\|u\|=1}\left(\|P_{E_I}(u)\|^2 - 2s_n\Big(1 - \frac{s_n}{2}\|K\|^2\Big)\|KP_{E_I}(u)\|^2\right). \tag{28}$$

Note by the assumptions that $E_I$ is finite-dimensional and that $K|_I$ is injective. Hence there exists $\alpha \in (0, \|K\|]$ such that

$$\|KP_{E_I}(u)\| \ge \alpha\|P_{E_I}(u)\| \quad \text{for any } u \in \ell^2_N. \tag{29}$$

Also note by (4) that

$$s_n\Big(1 - \frac{s_n}{2}\|K\|^2\Big) \ge \underline{s}\Big(1 - \frac{\bar{s}}{2}\|K\|^2\Big) > 0. \tag{30}$$

Together with (29) and (30), (28) yields that

$$\|P_{E_I} - s_nP_{E_I}K^*KP_{E_I}\|^2 \le \sup_{\|u\|=1}\left(1 - 2\underline{s}\Big(1 - \frac{\bar{s}}{2}\|K\|^2\Big)\alpha^2\right)\|P_{E_I}(u)\|^2. \tag{31}$$

Let $\lambda := \sqrt{1 - 2\underline{s}\big(1 - \frac{\bar{s}}{2}\|K\|^2\big)\alpha^2} \in (0, 1)$ (by (4)). Noting by Lemma 2.1(b) that $\|P_{E_I}(u)\| \le \|u\|$, we have by (31) that

$$\|P_{E_I} - s_nP_{E_I}K^*KP_{E_I}\| \le \lambda. \tag{32}$$

This, together with (24), implies (20), and the proof is complete.


Lemma 2.6: Let $I \subseteq I_N$ be such that (8) and (9) are satisfied. Then there exist $\lambda \in (0, 1)$ and $M \in \mathbb{N}$ such that

$$\|P_{E_{I^c}}(u^{n+1} - u^*)\| \le \lambda\|P_{E_{I^c}}(u^n - u^*)\| \quad \text{for any } n > M. \tag{33}$$

Proof: By assumption (A2) and (13), there exists $M \in \mathbb{N}$ such that

$$|u^n_i - u^*_i| \le \frac{\tau}{2} \quad\text{and}\quad |v^n_i - v^*_i| \le \frac{\tau}{2\bar{s}} \quad\text{for any } i \in I_N \text{ and any } n > M, \tag{34}$$

where $\tau > 0$ is the constant specified below. Fix $n > M$. We first show that

$$P_{E_{J\setminus I}}(u^{n+1}) = P_{E_{J\setminus I}}(I - s_nK^*K)(u^n - u^*) + P_{E_{J\setminus I}}(u^*). \tag{35}$$

To this end, we define

$$T := \{k \in I_N : u^*_k \ne 0\}. \tag{36}$$

It follows from (7) and (11) that $T \subseteq J$, which is a finite set (see Remark 1). Let $\tau := \min\{\underline{\omega}, \min\{|u^*_k| : k \in T\}\} > 0$ and fix $i \in T$. Then it follows that $|u^*_i| \ge \tau > 0$. Without loss of generality, we assume that

$$u^*_i \ge \tau > 0; \tag{37}$$

so we obtain by (11) that

$$v^*_i = \omega_i. \tag{38}$$

Note that

$$u^n_i + s_nv^n_i = u^*_i + u^n_i - u^*_i + s_n(v^n_i - v^*_i) + s_nv^*_i \ge u^*_i - |u^n_i - u^*_i| - s_n|v^n_i - v^*_i| + s_nv^*_i.$$

This, together with (37), (34) and (38), yields that

$$u^n_i + s_nv^n_i \ge \tau - \frac{\tau}{2} - \frac{s_n\tau}{2\bar{s}} + s_n\omega_i \ge s_n\omega_i > 0$$

(due to (4)). This says that $\operatorname{sign}(u^n_i + s_nv^n_i) > 0$ and that $|u^n_i + s_nv^n_i| - s_n\omega_i = u^n_i + s_nv^n_i - s_n\omega_i \ge 0$. Therefore, in view of Algorithm 1, one has by (38) that

$$u^{n+1}_i = u^n_i + s_nv^n_i - s_n\omega_i = u^n_i - u^*_i + s_n(v^n_i - v^*_i) + u^*_i = \big((I - s_nK^*K)(u^n - u^*)\big)_i + u^*_i$$

(due to (12)). Noting by (8) and (36) that $J \setminus I \subseteq T$ and recalling that $i \in T$ is arbitrary, we obtain (35).

Let $U := \ker(KP_{E_{J\setminus I}})$. By Lemma 2.2, one has that

$$U^\perp = \operatorname{im}(P_{E_{J\setminus I}}K^*). \tag{39}$$

Next, we show that

$$\|P_{E_{I^c}}(u^{n+1} - u^*)\| \le \|P_{U^\perp} - s_nP_{U^\perp}K^*KP_{U^\perp}\|\,\|P_{E_{I^c}}(u^n - u^*)\|. \tag{40}$$

To show this, employing $P_U$ on both sides of (35), we obtain by Lemma 2.1(b) that

$$\begin{aligned} P_UP_{E_{J\setminus I}}(u^{n+1}) &= P_UP_{E_{J\setminus I}}(u^n - u^*) - s_nP_UP_{E_{J\setminus I}}K^*K(u^n - u^*) + P_UP_{E_{J\setminus I}}(u^*) \\ &= P_UP_{E_{J\setminus I}}(u^n) - s_nP_UP_{E_{J\setminus I}}K^*K(u^n - u^*). \end{aligned} \tag{41}$$


Noting by (39) that $P_{E_{J\setminus I}}K^*K(u^n - u^*) \in U^\perp$, it is easy to see from Lemma 2.1(a) that $P_UP_{E_{J\setminus I}}K^*K(u^n - u^*) = 0$. This, together with (41), implies that $P_UP_{E_{J\setminus I}}(u^{n+1}) = P_UP_{E_{J\setminus I}}(u^n)$. Noting by assumption (A2) that $\lim_{n\to\infty}u^n = u^*$ and that $n > M$ is arbitrary, we obtain by Lemma 2.1(b) that

$$P_UP_{E_{J\setminus I}}(u^n) = P_UP_{E_{J\setminus I}}(u^*). \tag{42}$$

Then it follows from (42) and Lemma 2.1(b) that

$$P_{E_{J\setminus I}}(u^n - u^*) = P_UP_{E_{J\setminus I}}(u^n - u^*) + P_{U^\perp}P_{E_{J\setminus I}}(u^n - u^*) = P_{U^\perp}P_{E_{J\setminus I}}(u^n - u^*). \tag{43}$$

Employing $P_{U^\perp}$ on both sides of (35), we have by Lemma 2.1(b) that

$$\begin{aligned} P_{U^\perp}P_{E_{J\setminus I}}(u^{n+1} - u^*) &= P_{U^\perp}P_{E_{J\setminus I}}(I - s_nK^*K)(u^n - u^*) \\ &= P_{U^\perp}P_{E_{J\setminus I}}(I - s_nK^*K)\big(P_{E_{I^c}}(u^n - u^*) + P_{E_I}(u^n - u^*)\big) \\ &= P_{U^\perp}P_{E_{J\setminus I}}(I - s_nK^*K)P_{E_{I^c}}(u^n - u^*) - s_nP_{U^\perp}P_{E_{J\setminus I}}K^*KP_{E_I}(u^n - u^*). \end{aligned} \tag{44}$$

By assumption, Lemma 2.4 is applicable to ensuring (14). Then it follows from Lemma 2.1(b) and (d) that

$$\begin{aligned} P_{U^\perp}P_{E_{J\setminus I}}(I - s_nK^*K)P_{E_{I^c}}(u^n - u^*) &= P_{U^\perp}P_{E_{J\setminus I}}(I - s_nK^*K)P_{E_{J\setminus I}}(u^n - u^*) \\ &= (P_{U^\perp} - s_nP_{U^\perp}P_{E_{J\setminus I}}K^*KP_{U^\perp})P_{E_{J\setminus I}}(u^n - u^*) \end{aligned} \tag{45}$$

(due to (43)). By the definition of $U^\perp$ (cf. (39)), one has that $P_{U^\perp}P_{E_{J\setminus I}}K^*v = P_{U^\perp}K^*v$ for any $v \in H$. This, together with (45), implies that

$$P_{U^\perp}P_{E_{J\setminus I}}(I - s_nK^*K)P_{E_{I^c}}(u^n - u^*) = (P_{U^\perp} - s_nP_{U^\perp}K^*KP_{U^\perp})P_{E_{J\setminus I}}(u^n - u^*). \tag{46}$$

On the other hand, by (39) and Lemma 2.1(c)-(d), we have that

$$\begin{aligned} \|P_{U^\perp}P_{E_{J\setminus I}}K^*KP_{E_I}(u^n - u^*)\|^2 &= \langle P_{U^\perp}P_{E_{J\setminus I}}K^*KP_{E_I}(u^n - u^*),\ P_{U^\perp}P_{E_{J\setminus I}}K^*KP_{E_I}(u^n - u^*)\rangle \\ &= \langle KP_{E_I}(u^n - u^*),\ KP_{E_{J\setminus I}}P_{U^\perp}P_{E_{J\setminus I}}K^*KP_{E_I}(u^n - u^*)\rangle. \end{aligned}$$

Note by (9) that Lemma 2.3 is applicable to concluding that

$$P_{U^\perp}P_{E_{J\setminus I}}K^*KP_{E_I}(u^n - u^*) = 0. \tag{47}$$

Together with (46) and (47), (44) is reduced to

$$P_{U^\perp}P_{E_{J\setminus I}}(u^{n+1} - u^*) = (P_{U^\perp} - s_nP_{U^\perp}K^*KP_{U^\perp})P_{E_{J\setminus I}}(u^n - u^*). \tag{48}$$

Note by (14) and (43) that $P_{E_{I^c}}(u^n - u^*) = P_{E_{J\setminus I}}(u^n - u^*) = P_{U^\perp}P_{E_{J\setminus I}}(u^n - u^*)$. This, together with (48), yields (40).


Recall that $J$ is finite (see Remark 1), and so $U^\perp$ is finite-dimensional (by (39)). We also claim that $K|_{U^\perp}$ is injective. Indeed, let $u \in U^\perp$ be such that

$$Ku = 0. \tag{49}$$

By (39), there exists $v \in H$ such that $u = P_{E_{J\setminus I}}K^*v$; hence (49) says that $KP_{E_{J\setminus I}}K^*v = 0$. Then it follows from Lemma 2.1(c) and (d) that

$$\|u\|^2 = \|P_{E_{J\setminus I}}K^*v\|^2 = \langle P_{E_{J\setminus I}}K^*v,\ P_{E_{J\setminus I}}K^*v\rangle = \langle KP_{E_{J\setminus I}}K^*v,\ v\rangle = 0.$$

Thus we have proved that $K|_{U^\perp}$ is injective, as desired. Hence, by the same arguments as those for (32) (and by (39)), there exists $\lambda \in (0, 1)$ such that

$$\|P_{U^\perp} - s_nP_{U^\perp}K^*KP_{U^\perp}\| \le \lambda.$$

This, together with (40), yields (33), and the proof is complete.

Now we are ready to provide the proof of Theorem 1.2.

Proof of Theorem 1.2: As mentioned in the preceding section, $\{u^n\}$ strongly converges to $u^* \in S$. By the assumptions of Theorem 1.2, the blanket assumptions of this section and the assumptions of Lemmas 2.5 and 2.6 are satisfied. Then the conclusion follows. □

3. Examples

In this section, we provide two examples to illustrate cases where our result applies but neither the result of [28] nor that of [14] does. The first example is set in Euclidean space, and the second one in an infinite-dimensional space.

Example 3.1: Consider problem (1) with

$$K = \begin{pmatrix} -1 & 1 & -1 & 2\\ 1 & 1 & 1 & 2 \end{pmatrix}, \qquad h = (-3, 5)^T \qquad\text{and}\qquad \omega = (4, 2, 4, 8)^T.$$

By (11), we have that $u^* \in S$ if and only if

$$\begin{cases} 0 \in 2u^*_1 + 2u^*_3 - 8 + 4\operatorname{Sign}(u^*_1),\\ 0 \in 2u^*_2 + 4u^*_4 - 2 + 2\operatorname{Sign}(u^*_2),\\ 0 \in 2u^*_1 + 2u^*_3 - 8 + 4\operatorname{Sign}(u^*_3),\\ 0 \in 4u^*_2 + 8u^*_4 - 4 + 8\operatorname{Sign}(u^*_4), \end{cases} \tag{50}$$

where

$$\operatorname{Sign}(t) := \begin{cases} \{1\}, & t > 0,\\ [-1, 1], & t = 0,\\ \{-1\}, & t < 0, \end{cases} \quad\text{for any } t \in \mathbb{R}.$$

Clearly, the second and fourth inclusions of (50) are equivalent to $u^*_4 = 0$ and $u^*_2 = 0$, while the first and third inclusions of (50) are equivalent to $u^*_1 + u^*_3 = 2$. Therefore, we have that

$$S = \left\{(a, 0, 2 - a, 0)^T : 0 \le a \le 2\right\}. \tag{51}$$

Write $x^* := (2, 0, 0, 0)^T$ and $y^* := (0, 0, 2, 0)^T$. Then we have that

(i) neither the J-BI nor the SSP is satisfied at any $u^* \in S$;
(ii) the OSP is satisfied at each $u^* \in S \setminus \{x^*, y^*\}$.


Indeed, by (51), it is clear that

$$\operatorname{supp}(u^*) = \begin{cases} \{1\}, & \text{if } u^* = x^*,\\ \{3\}, & \text{if } u^* = y^*,\\ \{1, 3\}, & \text{if } u^* \in S \setminus \{x^*, y^*\}. \end{cases} \tag{52}$$

For each $u^* \in S$, one checks that

$$|K^*(Ku^* - h)| = (4, 2, 4, 4)^T, \qquad J = \{1, 2, 3\}, \qquad K|_J = \begin{pmatrix} -1 & 1 & -1\\ 1 & 1 & 1 \end{pmatrix}; \tag{53}$$

hence we observe from (52) and (53) that $K|_J$ is not injective and that $\operatorname{supp}(u^*) \ne J$. Thus assertion (i) is verified. Fix $u^* \in S \setminus \{x^*, y^*\}$. Note by (52) that $\operatorname{supp}(u^*) = \{1, 3\}$ and note by (53) that $J = \{1, 2, 3\}$. Let $I = \{2\}$. Then one checks by (53) that (8), (9) and the I-BI are satisfied; consequently, assertion (ii) is proved.

Let $u^0 = 0$ and $s_n \equiv s \in (0, \frac{1}{5})$ ($\|K\| = \sqrt{10}$), and let $\{u^n\}$ be a sequence generated by Algorithm 1. Below, we show that $\{u^n\}$ converges linearly to a solution of problem (1). In view of Algorithm 1, one has that

$$u^n = \big(1 - (1 - 4s)^n,\ 0,\ 1 - (1 - 4s)^n,\ 0\big)^T \quad \text{for any } n \in \mathbb{N}.$$

Since $s \in (0, \frac{1}{5})$, it follows that

$$\lim_{n\to\infty} u^n = (1, 0, 1, 0)^T \in S$$

(due to (51)). Noting by assertion (ii) that the OSP is satisfied at $(1, 0, 1, 0)^T$, we obtain by Theorem 1.2 the linear convergence of $\{u^n\}$ to $(1, 0, 1, 0)^T$. However, according to assertion (i), neither the linear convergence result in [28] nor the one in [14] is applicable to this example.
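The claimed rate can also be observed numerically. The following sketch (our own illustration; the concrete value $s = 0.1$ is an arbitrary choice in $(0, \frac{1}{5})$) runs Algorithm 1 on this instance and prints successive error ratios, which equal $1 - 4s = 0.6$, in agreement with the closed form of $u^n$ above:

import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

K = np.array([[-1., 1., -1., 2.],
              [ 1., 1.,  1., 2.]])
h = np.array([-3., 5.])
omega = np.array([4., 2., 4., 8.])
s = 0.1                                   # constant step size in (0, 1/5)
u_star = np.array([1., 0., 1., 0.])       # the limit identified above

u, prev_err = np.zeros(4), None
for n in range(1, 31):
    u = soft_threshold(u - s * (K.T @ (K @ u - h)), s * omega)
    err = np.linalg.norm(u - u_star)
    if prev_err is not None:
        print(n, err, err / prev_err)     # the ratio is (1 - 4s) = 0.6
    prev_err = err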

Example 3.2: Consider problem (1) with $K : \ell^2 \to \ell^2$ being defined by

$$Ku := (u_1 + u_2, u_3, u_4, u_5, u_6, \cdots)^T \quad \text{for each } u := (u_k) \in \ell^2,$$

$h := (2, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \frac{1}{6}, \cdots)^T$ and $\omega := (1, 1, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}, \cdots)^T$. Then problem (1) is equivalent to

$$\min_{u_1, u_2}\left\{\frac{1}{2}(u_1 + u_2 - 2)^2 + |u_1| + |u_2|\right\} + \sum_{k=3}^{\infty}\min_{u_k}\left\{\frac{1}{2}\Big(u_k - \frac{1}{k}\Big)^2 + \frac{1}{5}|u_k|\right\}. \tag{54}$$

Let $u^* \in S$. Clearly, the first minimization in (54) is equivalent to $u^*_1 + u^*_2 = 1$, while the others are equivalent to $u^*_3 = \frac{2}{15}$, $u^*_4 = \frac{1}{20}$ and $u^*_k = 0$ for each $k \ge 5$. Therefore, we have that

$$S = \left\{\Big(a, 1 - a, \frac{2}{15}, \frac{1}{20}, 0, 0, \cdots\Big)^T : 0 \le a \le 1\right\}. \tag{55}$$

Write $x^* := (1, 0, \frac{2}{15}, \frac{1}{20}, 0, 0, \cdots)^T$ and $y^* := (0, 1, \frac{2}{15}, \frac{1}{20}, 0, 0, \cdots)^T$. Then assertions (i) and (ii) of Example 3.1 hold for this example as well. Indeed, by (55), it is clear that

$$\operatorname{supp}(u^*) = \begin{cases} \{1, 3, 4\}, & \text{if } u^* = x^*,\\ \{2, 3, 4\}, & \text{if } u^* = y^*,\\ \{1, 2, 3, 4\}, & \text{if } u^* \in S \setminus \{x^*, y^*\}. \end{cases} \tag{56}$$

For each $u^* \in S$, one checks that

$$|K^*(Ku^* - h)| = \Big(1, 1, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}, \frac{1}{6}, \frac{1}{7}, \cdots\Big)^T,$$


and so

$$J = \{1, 2, 3, 4, 5\} \qquad\text{and}\qquad K|_J = \begin{pmatrix} 1 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0\\ \vdots & \vdots & \vdots & \vdots & \vdots \end{pmatrix}; \tag{57}$$

hence we observe from (56) and (57) that $K|_J$ is not injective and that $\operatorname{supp}(u^*) \ne J$. That is, assertion (i) is verified. Fix $u^* \in S \setminus \{x^*, y^*\}$. Note by (56) that $\operatorname{supp}(u^*) = \{1, 2, 3, 4\}$ and note by (57) that $J = \{1, 2, 3, 4, 5\}$. Let $I = \{5\}$. Then one checks by (57) that (8), (9) and the I-BI are satisfied; hence assertion (ii) is proved.

Let $u^0 = 0$ and $s_n \equiv s \in (0, 1)$ ($\|K\| = \sqrt{2}$), and let $\{u^n\}$ be a sequence generated by Algorithm 1. Below, we show that $\{u^n\}$ converges linearly to a solution of problem (1). In view of Algorithm 1, one has that

$$u^n = \begin{pmatrix} \frac{1}{2} - \frac{1}{2}(1 - 2s)^n\\ \frac{1}{2} - \frac{1}{2}(1 - 2s)^n\\ \frac{2}{15} - \frac{2}{15}(1 - s)^n\\ \frac{1}{20} - \frac{1}{20}(1 - s)^n\\ 0\\ 0\\ \vdots \end{pmatrix} \quad \text{for any } n \in \mathbb{N}.$$

Since $s \in (0, 1)$, it follows that

$$\lim_{n\to\infty} u^n = \Big(\frac{1}{2}, \frac{1}{2}, \frac{2}{15}, \frac{1}{20}, 0, 0, \cdots\Big)^T \in S$$

(due to (55)). Noting by assertion (ii) that the OSP is satisfied at $(\frac{1}{2}, \frac{1}{2}, \frac{2}{15}, \frac{1}{20}, 0, 0, \cdots)^T$, Theorem 1.2 is applicable to concluding that $\{u^n\}$ converges linearly to $(\frac{1}{2}, \frac{1}{2}, \frac{2}{15}, \frac{1}{20}, 0, 0, \cdots)^T$. However, according to assertion (i), the linear convergence result in [14] is not applicable to this example.
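Although this instance lives in $\ell^2$, the iterates above have nonzero entries only in their first four coordinates, so their behaviour can be reproduced with a finite truncation. The following sketch (our own illustration; the truncation to 8 coordinates and the step size $s = 0.5$ are arbitrary choices) runs Algorithm 1 on the truncated instance and prints error ratios, which settle at $1 - s = 0.5$, in agreement with the closed form of $u^n$ above:

import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

n_coords = 8                              # truncation level; coordinates k >= 5 stay zero anyway
K = np.zeros((n_coords - 1, n_coords))
K[0, 0] = K[0, 1] = 1.0                   # (Ku)_1 = u_1 + u_2
for j in range(1, n_coords - 1):
    K[j, j + 1] = 1.0                     # (Ku)_j = u_{j+1} for j >= 2
h = np.array([2.0] + [1.0 / k for k in range(3, n_coords + 1)])
omega = np.array([1.0, 1.0] + [0.2] * (n_coords - 2))
u_star = np.array([0.5, 0.5, 2.0 / 15.0, 1.0 / 20.0] + [0.0] * (n_coords - 4))

s = 0.5                                   # constant step size in (0, 1)
u, prev_err = np.zeros(n_coords), None
for n in range(1, 31):
    u = soft_threshold(u - s * (K.T @ (K @ u - h)), s * omega)
    err = np.linalg.norm(u - u_star)
    if prev_err is not None:
        print(n, err, err / prev_err)     # ratios settle at 1 - s = 0.5
    prev_err = err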

Acknowledgements

The authors are grateful to two anonymous reviewers for their valuable suggestions and remarks which helped to improve the quality of the paper.

Disclosure statement

No potential conflict of interest was reported by the authors.

Funding

Yaohua Hu was partially supported by the National Natural Science Foundation of China [grant number 11601343]; Natural Science Foundation of Guangdong [grant number 2016A030310038]; Foundation for Distinguished Young Talents in Higher Education of Guangdong [grant number 2015KQNCX145]. Chong Li was supported in part by the National Natural Science Foundation of China [grant number 11571308]. Jen-Chih Yao was partially supported by the MOST [grant number 105-2115-M-039-002-MY3].


References

[1] Candès E, Tao T. Decoding by linear programming. IEEE Trans Inform Theory. 2005;51:4203–4215.
[2] Combettes PL, Wajs VR. Signal recovery by proximal forward-backward splitting. Multiscale Model Sim. 2005;4:1168–1200.
[3] Donoho DL. Compressed sensing. IEEE Trans Inform Theory. 2006;52:1289–1306.
[4] Beck A, Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci. 2009;2:183–202.
[5] Bredies K, Lorenz DA. Iterated hard shrinkage for minimization problems with sparsity constraints. SIAM J Sci Comput. 2008;30:657–683.
[6] Elad M. Sparse and redundant representations. New York (NY): Springer; 2010.
[7] Qin J, Hu Y, Xu F, et al. Inferring gene regulatory networks by integrating ChIP-seq/chip and transcriptome data via LASSO-type regularization methods. Methods. 2014;67:294–303.
[8] Simon N, Friedman J, Hastie T, et al. A sparse-group Lasso. J Comput Graph Stat. 2013;22:231–245.
[9] Wang J, Hu Y, Li C, et al. Linear convergence of CQ algorithms and applications in gene regulatory network inference. Inverse Probl. 2017;33:055017.
[10] Bach F, Jenatton R, Mairal J, et al. Structured sparsity through convex optimization. Stat Sci. 2012;27:450–468.
[11] Hu Y, Li C, Meng K, et al. Group sparse optimization via $\ell_{p,q}$ regularization. J Mach Learn Res. 2017;18:1–52.
[12] Mairal J. Incremental majorization-minimization optimization with application to large-scale machine learning. SIAM J Optim. 2015;25:829–855.
[13] Bazavan EG, Li F, Sminchisescu C. Fourier kernel learning. In: Fitzgibbon A, Lazebnik S, Perona P, et al., editors. Proceedings of the 12th European Conference on Computer Vision. Berlin: Springer; 2012. p. 459–473.
[14] Bredies K, Lorenz DA. Linear convergence of iterative soft-thresholding. J Fourier Anal Appl. 2008;14:813–837.
[15] Donoho DL. Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition. Appl Comput Harmon Anal. 1995;2:101–126.
[16] Gribonval R, Nielsen M. Highly sparse representations from dictionaries are unique and independent of the sparseness measure. Appl Comput Harmon Anal. 2007;22:335–355.
[17] Figueiredo MAT, Nowak RD, Wright SJ. Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems. IEEE J Sel Top Signa. 2007;1:586–597.
[18] He B, Yuan X. On the O(1/n) convergence rate of the Douglas-Rachford alternating direction method. SIAM J Numer Anal. 2012;50:700–709.
[19] Hu Y, Li C, Yang X. On convergence rates of linearized proximal algorithms for convex composite optimization with applications. SIAM J Optim. 2016;26:1207–1235.
[20] Lu Z, Xiao L. On the complexity analysis of randomized block-coordinate descent methods. Math Program. 2015;152:615–642.
[21] Nesterov Y. Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J Optim. 2012;22:341–362.
[22] Xiao L, Zhang T. A proximal-gradient homotopy method for the sparse least-squares problem. SIAM J Optim. 2013;23:1062–1091.
[23] Yang J, Zhang Y. Alternating direction algorithms for $\ell^1$-problems in compressive sensing. SIAM J Sci Comput. 2011;33:250–278.
[24] Figueiredo MAT, Nowak RD. An EM algorithm for wavelet-based image restoration. IEEE Trans Image Process. 2003;12:906–916.
[25] Starck JL, Nguyen MK, Murtagh F. Wavelets and curvelets for image deconvolution: a combined approach. Signal Process. 2003;83:2279–2283.
[26] Daubechies I, Defrise M, De Mol C. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun Pure Appl Math. 2004;57:1413–1457.
[27] Bect J, Blanc-Féraud L, Aubert G, et al. A $\ell^1$ unified variational framework for image restoration. In: Pajdla T, Matas J, editors. Proceedings of the Eighth European Conference on Computer Vision. Berlin: Springer; 2004. p. 1–13.
[28] Hale ET, Yin W, Zhang Y. Fixed-point continuation for $\ell^1$-minimization: methodology and convergence. SIAM J Optim. 2009;19:1107–1130.
[29] Tao S, Boley D, Zhang S. Local linear convergence of ISTA and FISTA on the LASSO problem. SIAM J Optim. 2016;26:313–336.
[30] Nesterov Y. Gradient methods for minimizing composite functions. Math Program. 2013;140:125–161.
[31] Bauschke HH, Combettes PL. Convex analysis and monotone operator theory in Hilbert spaces. New York (NY): Springer; 2011.
[32] Rockafellar RT. Convex analysis. Princeton: Princeton University Press; 1970.
Commun Pure Appl Math. 2004;57:1413–1457. [27] Bect J, Blanc-Fèraud L, Aubert G, et al. A l 1 unified variational framework for image restoration. In: Pajdla T, Matas J, editors. Proceedings of Eighth European Conference on Computer Vision. Berlin: Springer; 2004. p. 1–13. [28] Hale ET, Yin W, Zhang Y. Fixed-point continuation for 1 -minimization: methodology and convergence. SIAM J Optim. 2009;19:1107–1130. [29] Tao S, Boley D, Zhang S. Local linear convergence of ISTA and FISTA on the LASSO problem. SIAM J Optim. 2016;26:313–336. [30] Nesterov Y. Gradient methods for minimizing composite functions. Math Program. 2013;140:125–161. [31] Bauschke HH, Combettes PL. Convex analysis and monotone operator theory in Hilbert spaces. New York (NY): Springer; 2011. [32] Rockafellar RT. Convex analysis. Princeton: Princeton University Press; 1970.