
ON LOW RANK APPROXIMATION OF LINEAR OPERATORS IN p-NORMS AND SOME ALGORITHMS

YANG LIU

Abstract. In this article, we study the optimal or best approximation of a linear operator by low-rank linear operators, especially of a linear operator on the $\ell_p$-space, $p \in [1, \infty)$, under the $\ell_p$-norm, or in the Minkowski distance. Considering generalized singular values and using techniques from differential geometry, we extend the classical Schmidt--Mirsky theorem in the direction of the $\ell_p$-norm of linear operators for some values of $p$. We also develop and provide algorithms for finding the solution to the low-rank approximation problem in some non-trivial scenarios. The results can be applied, in particular, to matrix completion and sparse matrix recovery.

2000 Mathematics Subject Classification: 15-04; 46B09; 15A60.
Key words and phrases: matrices, normed spaces, singular values, operator theory.

1. Introduction

Let $A$ be a linear operator on a finite-dimensional space; then $A$ can be represented by a matrix, which we also denote by $A$. The matrix $p$-norm, defined by
\[
s_1^{(p)}(A) = \sup_{x \in \mathbb{R}^n,\ \|x\|_p = 1} \|Ax\|_p,
\]
is the largest $p$-singular value of the matrix $A$. In the last decade, much research has concentrated on evaluating its value for matrices; see for instance [8] and [9]. The recent results in [8], [21], and [10] show that matrix $p$-norms are NP-hard to approximate if $p \neq 1, 2, \infty$. Extensions of low-rank approximation in several directions have been studied. For example, [5] gives an explicit solution to the rank-constrained matrix approximation problem in the Frobenius norm, a generalization of the classical approximation of an $m \times n$ matrix $A$ by a matrix of rank at most $k$. This paper, by contrast, generalizes the rank-constrained matrix approximation from the classical $\ell_2$-norm, in which the singular value decomposition plays a substantial role, to the matrix $p$-norms. By the singular value decomposition, $A$ can be written as $A = U\Lambda V$, where $U$ and $V$ are orthogonal matrices and $\Lambda$ is a diagonal matrix with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{\min(m,n)} \ge 0$ on its diagonal; then
\[
\min_{\operatorname{rank}(B) \le k} \|A - B\|_2 = \sigma_{k+1}(A), \tag{1.1}
\]
where $\sigma_{k+1}(A)$ is the $(k+1)$-th singular value. In this scenario one can choose $B = U\Lambda_k V$, where $\Lambda_k$ is the diagonal matrix whose $\min(m,n)$ diagonal entries are $\sigma_1, \sigma_2, \cdots, \sigma_k, 0, \cdots, 0$; this $B$ is a solution matrix to the problem. This is the classical Schmidt--Mirsky theorem. Recently, efficient and reliable algorithms have been developed for the low-rank approximation of matrices in the $\ell_2$-norm; see for instance [16] and [22].
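As a quick numerical illustration of (1.1) and the Schmidt--Mirsky choice $B = U\Lambda_k V$ (an aside added here, not part of the original exposition; the test matrix and the rank $k$ are arbitrary), a truncated SVD attains the spectral-norm error $\sigma_{k+1}$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))   # arbitrary example matrix
k = 2                             # target rank (arbitrary)

# Truncated SVD: keep only the k largest singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
B = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Schmidt-Mirsky / Eckart-Young: the spectral-norm error equals sigma_{k+1}.
print(np.linalg.norm(A - B, ord=2), s[k])   # the two numbers agree up to rounding
```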


For $p \le 1$, the computation of the $p$-singular values is relatively easier, because we have a good property of the $p$-singular values in that range; see [14]. In this paper, we want to find the solution of the following minimization problem: given a matrix $A$ of size $m$ by $n$, for any $k \le \min(m,n)$ and $p \neq 2$, find a matrix $B$ with $\operatorname{rank}(B) \le k$ that achieves the minimum in
\[
\min_{\operatorname{rank}(B) \le k} \|A - B\|_p. \tag{1.2}
\]

The motivation for studying this problem lies in compressed sensing, matrix completion, and phase retrieval in signal processing. Compressed sensing is a technique for recovering sparse signals, and different approaches have been developed for signal recovery within it. Among them is the $\ell_p$-approach, which is concerned with the generalized restricted isometry property of a measurement matrix that one would like to construct in a simple way, in the sense of low rank. The problem is also related to computational problems and approaches such as semidefinite programming and the MAXCUT algorithm (see for instance [6]), and convex sparse coding and subspace learning (see for instance [23]), so its solution has interesting applications to them. Furthermore, as compressed sensing (see for instance [17] and [13]) and matrix completion and sparse matrix recovery (see for instance [3] and [2]) have become central topics in computational and applied mathematics, applying the solution of this problem to them provides new approaches and perspectives, both theoretically and experimentally.

For $k = 0$, the solution matrix $B$ is simply the zero matrix of size $m$ by $n$, and in this case $\min_{\operatorname{rank}(B) \le k} \|A - B\|_p = \|A\|_p$; for $k = \min(m,n)$, the solution matrix $B$ is simply $A$, and in this case $\min_{\operatorname{rank}(B) \le k} \|A - B\|_p = 0$. These two cases are therefore trivial.

In this paper, we study the best or optimal approximation of linear operators by low-rank linear operators, especially linear operators on the $\ell_p$-space, $p \in [1, \infty)$. Considering the generalized singular values, we extend the classical Schmidt--Mirsky theorem in the direction of the matrix $\ell_p$-norm for some values of $p$, using a bit of differential geometry. We also develop and provide algorithms for finding the solution to the low-rank approximation problem in some of the cases. Simply speaking, we show that
\[
\min_{\operatorname{rank}(B) \le k} \|A - B\|_p = s^{(p)}_{\min(m,n)}(A), \tag{1.3}
\]

which gives the minimum of (1.2), and we also present an explicit way to find the minimizer of (1.2) for some values of $p$. These results, as mentioned, can be applied to matrix completion and sparse matrix recovery.

2. Solution to the Problem for $k = \min(m,n) - 1$ and its Algorithm

The smallest $p$-singular value is defined as
\[
s^{(p)}_{\min(m,n)}(A) = \inf_{V \subseteq \mathbb{R}^n,\ \dim(V) = n - \min(m,n) + 1}\ \sup_{x \in V,\ \|x\|_p = 1} \|Ax\|_p. \tag{2.1}
\]


For simplicity, assume first that $A$ is a matrix of size $m$ by $n$ with $m \ge n$; then
\[
s^{(p)}_{\min(m,n)}(A) = s^{(p)}_{n}(A) = \inf_{x \in \mathbb{R}^n,\ \|x\|_p = 1} \|Ax\|_p. \tag{2.2}
\]

Since the spaces we are considering are finite dimensional, there exists some $x_0 \in \mathbb{R}^n$ with $\|x_0\|_p = 1$ such that
\[
\|Ax_0\|_p = \inf_{x \in \mathbb{R}^n,\ \|x\|_p = 1} \|Ax\|_p. \tag{2.3}
\]
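As an orienting aside (added here, not in the original), for $p = 2$ the quantity in (2.2)--(2.3) is the smallest singular value, and a crude random search over the unit sphere approaches the infimum from above:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))    # arbitrary example with m >= n
p = 2                              # for p = 2 the infimum is the smallest singular value

# Crude upper estimate of inf over the unit p-sphere of ||Ax||_p by random sampling.
X = rng.standard_normal((4, 200_000))
X /= np.linalg.norm(X, ord=p, axis=0)
estimate = np.min(np.linalg.norm(A @ X, ord=p, axis=0))

print(estimate, np.linalg.svd(A, compute_uv=False)[-1])   # estimate >= sigma_min, and close
```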

Now consider the gradient of the function $f(x) := \|x\|_p$, $x \in \mathbb{R}^n$, at the point $x = x_0$ for $p > 1$, and the subgradient of the function for $p = 1$ and $\infty$. Denote the gradient or subgradient by $\nabla_{x_0}\|x\|_p$. Then there is a subspace of $\mathbb{R}^n$ of codimension 1 orthogonal to the vector $\nabla_{x_0}\|x\|_p$; let this subspace be $V_0$. Since $\mathbb{R}^n$ can be decomposed into the direct sum of the linear space spanned by $x_0$, $\{x \in \mathbb{R}^n : x = cx_0,\ c \in \mathbb{R}\}$, and $V_0$, we can find a matrix $B_0$ of size $m$ by $n$ with $\operatorname{rank}(B_0) \le \min(m,n) - 1$ such that
\[
B_0(cx_0 + v) = Av \tag{2.4}
\]
for any $c \in \mathbb{R}$ and $v \in V_0$.

Theorem 2.1. If $m \ge n$, the solution matrix to Problem (1.2) for $k = \min(m,n) - 1$ is $B_0$, and the minimum is $s^{(p)}_{\min(m,n)}(A)$.

Proof. For any $x \in \mathbb{R}^n$, there exist $c \in \mathbb{R}$ and $v \in V_0$ such that $x = cx_0 + v$. Therefore,
\[
\|A - B_0\|_p = \sup_{x \in \mathbb{R}^n,\ \|x\|_p = 1} \|(A - B_0)x\|_p = \sup_{c \in \mathbb{R},\ v \in V_0,\ \|cx_0 + v\|_p = 1} \|A(cx_0)\|_p. \tag{2.5}
\]

Since $V_0$ is orthogonal to the gradient vector $\nabla_{x_0}\|x\|_p$, $V_0$ is the tangent space to the sphere $\|x\|_p = 1$ at $x_0$. Because any vector that starts at the origin and ends at a point outside the convex sphere $\|x\|_p = 1$ has a greater $\ell_p$-norm, and $x_0 + v$ ends at a point that falls outside of the convex sphere, we have $\|x_0 + v\|_p \ge \|x_0\|_p$ for any $v \in V_0$, which is equivalent to saying $\|cx_0 + v\|_p \ge \|cx_0\|_p$ for any $v \in V_0$ and $c \in \mathbb{R}$. Thus $\|cx_0 + v\|_p = \lambda(v)\|cx_0\|_p$ for some real number $\lambda(v) \ge 1$, and then
\[
\sup_{c \in \mathbb{R},\ v \in V_0,\ \|cx_0 + v\|_p = 1} \|A(cx_0)\|_p
= \sup_{c \in \mathbb{R},\ v \in V_0,\ \lambda(v)\|cx_0\|_p = 1} \|A(cx_0)\|_p
= \sup_{v \in V_0} \frac{\|Ax_0\|_p}{\lambda(v)}. \tag{2.6}
\]
Because $\lambda(v) \ge 1$ for all $v \in V_0$ and $\lambda(v) = 1$ when $v = 0$, the supremum in (2.6) is $\|Ax_0\|_p$, which is $s^{(p)}_{\min(m,n)}(A)$ by (2.2) and (2.3). In view of (2.5), we obtain that $\|A - B_0\|_p = s^{(p)}_{\min(m,n)}(A)$.

On the other hand, for any matrix $B$ of size $m$ by $n$ with $\operatorname{rank}(B) \le \min(m,n) - 1$, the dimension of $\ker(B)$ is not less than 1. Considering the restriction to $\ker(B)$, we have
\[
\|A - B\|_p = \sup_{x \in \mathbb{R}^n,\ \|x\|_p = 1} \|(A - B)x\|_p \ge \sup_{x \in \ker(B),\ \|x\|_p = 1} \|Ax\|_p. \tag{2.7}
\]

We also know from the definition that
\[
s^{(p)}_{\min(m,n)}(A) = \inf_{V \subseteq \mathbb{R}^n,\ \dim(V) \ge 1}\ \sup_{x \in V,\ \|x\|_p = 1} \|Ax\|_p \le \sup_{x \in \ker(B),\ \|x\|_p = 1} \|Ax\|_p \tag{2.8}
\]


since $\ker(B)$ is a particular subspace with $\dim(\ker(B)) \ge 1$. Combining (2.7) and (2.8) finally yields that $\|A - B\|_p \ge s^{(p)}_{\min(m,n)}(A)$ for any matrix $B$ of size $m$ by $n$ with $\operatorname{rank}(B) \le \min(m,n) - 1$. That completes the proof. $\square$

For computational purposes, we give an algorithm to find the best low-rank approximation for $k = \min(m,n) - 1$. Given a matrix $A =: (a_{ij})_{m \times n} \in \mathbb{R}^{m \times n}$ and $1 < p < \infty$:

(1) Find the minimum of the function
\[
f(x) = \frac{\|Ax\|_p^p}{\|x\|_p^p} = \frac{\sum_{i=1}^m \big|\sum_{j=1}^n a_{ij}x_j\big|^p}{\sum_{j=1}^n |x_j|^p} \tag{2.9}
\]
and the vector at which $f$ achieves its minimum, or an approximation of the minimum, for which there is an efficient algorithm (see for instance [1]). Normalize the vector, and set the normalized vector to be $x^*$.

(2) Writing $x^* = (x_1^*, x_2^*, \cdots, x_n^*) \in \mathbb{R}^n$, obtain $\nabla_{x^*}\|x\|_p$ by using the formula
\[
\nabla_{x^*}\|x\|_p = \big( \operatorname{sign}(x_1^*)|x_1^*|^{p-1},\ \operatorname{sign}(x_2^*)|x_2^*|^{p-1},\ \cdots,\ \operatorname{sign}(x_n^*)|x_n^*|^{p-1} \big). \tag{2.10}
\]

(3) Find a basis for the $(n-1)$-dimensional subspace defined by
\[
\operatorname{sign}(x_1^*)|x_1^*|^{p-1}x_1 + \operatorname{sign}(x_2^*)|x_2^*|^{p-1}x_2 + \cdots + \operatorname{sign}(x_n^*)|x_n^*|^{p-1}x_n = 0 \tag{2.11}
\]
and denote the basis by $\{b_1, b_2, \cdots, b_{n-1}\}$.

(4) Find the matrix that, as in (2.4), maps $b_i$ to $Ab_i$ for $i = 1, 2, \cdots, n-1$ and maps $x^*$ to the zero vector, where the $b_i$ are the basis vectors found in step (3). Set this matrix to be $B_0$.
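The following is a minimal sketch of the four steps above, not taken from the paper: it assumes $1 < p < \infty$, substitutes a generic local optimizer for the method of [1] in step (1), and the helper name `best_rank_deficient_approx` is hypothetical. It returns $B_0 = AP$, where $P$ is the projection onto $V_0$ along $\operatorname{span}\{x^*\}$, which realizes (2.4).

```python
import numpy as np
from scipy.optimize import minimize


def best_rank_deficient_approx(A, p, n_restarts=20, seed=0):
    """Sketch of steps (1)-(4): B0 with rank(B0) <= min(m, n) - 1, for m >= n, 1 < p < inf."""
    m, n = A.shape
    rng = np.random.default_rng(seed)

    # Step (1): minimize f(x) = ||Ax||_p^p / ||x||_p^p (formula (2.9)) and normalize.
    def f(x):
        return np.sum(np.abs(A @ x) ** p) / np.sum(np.abs(x) ** p)

    best = None
    for _ in range(n_restarts):
        res = minimize(f, rng.standard_normal(n), method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    x_star = best.x / np.linalg.norm(best.x, ord=p)

    # Step (2): gradient of ||x||_p at x_star (formula (2.10); valid since ||x_star||_p = 1).
    g = np.sign(x_star) * np.abs(x_star) ** (p - 1)

    # Step (3): a basis b_1, ..., b_{n-1} of the hyperplane {x : <g, x> = 0} (equation (2.11)).
    _, _, Vt = np.linalg.svd(g.reshape(1, -1))
    basis = Vt[1:, :].T                     # columns span the orthogonal complement of g

    # Step (4): B0 maps b_i to A b_i and x_star to 0, i.e. B0 = A P with P the
    # projection onto V0 along span{x_star}.
    T = np.column_stack([basis, x_star])    # change of basis (b_1, ..., b_{n-1}, x_star)
    coords = np.linalg.solve(T, np.eye(n))  # coordinates of each e_j in that basis
    P = basis @ coords[:-1, :]              # drop the x_star component
    return A @ P
```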

Remark 2.2. Regarding the case $p = \infty$, we will show the algorithm in Section 4. For $p = 1$ the problem is essentially a linear programming problem.

On the application side, we have the following remark:

Remark 2.3. This theorem can be applied to developing algorithms for matrix recovery or matrix completion in the $\ell_p$-norm. Furthermore, in practice, one may be able to use it to make predictions about the demand for a new product in a market by the $\ell_p$-approach of compressed sensing, since the matrices involved tend to have low rank as a result of the preferences of customers.

3. Solution to the Problem for $p = \infty$

For $\min(m,n) < 3$, we know that the matrix approximation problem (1.2) is trivial. For $k = 1$ and $\min(m,n) \ge 3$, let us first consider the case $\ell_\infty$, for which the unit sphere in $\mathbb{R}^n$ is the boundary of the cube $[-1,1]^n$, an $(n-1)$-dimensional set. Again, there exists some $x_0 \in \mathbb{R}^n$ with $\|x_0\|_\infty = 1$ such that
\[
\|Ax_0\|_\infty = \sup_{x \in \mathbb{R}^n,\ \|x\|_\infty = 1} \|Ax\|_\infty. \tag{3.1}
\]

About where the supremum is achieved, we have the following lemma.

Lemma 3.1. The supremum in (3.1) is achieved at one of the corners of the cube $Q = \{x \in \mathbb{R}^n : \|x\|_\infty = 1\}$ for any $m$ by $n$ matrix $A$.




Proof. Let $A =: (a_{ij})_{m \times n}$. For any $x = (x_1, x_2, \cdots, x_n)^{T} \in \mathbb{R}^n$ with
\[
\|x\|_\infty = \max(|x_1|, |x_2|, \cdots, |x_n|) = 1, \tag{3.2}
\]
we can write
\[
\|Ax\|_\infty = \max\left( \Big|\sum_{j=1}^n a_{1j}x_j\Big|,\ \Big|\sum_{j=1}^n a_{2j}x_j\Big|,\ \cdots,\ \Big|\sum_{j=1}^n a_{mj}x_j\Big| \right). \tag{3.3}
\]
For each $i$, $i = 1, 2, \cdots, m$, we have $\big|\sum_{j=1}^n a_{ij}x_j\big| \le \sum_{j=1}^n |a_{ij}|$ because of (3.2), and equality holds for some $x \in \{-1, 1\}^n$ (namely $x_j = \operatorname{sign}(a_{ij})$), in other words, at a corner of the cube $Q$. It follows that
\[
\|Ax\|_\infty \le \max\left( \sum_{j=1}^n |a_{1j}|,\ \sum_{j=1}^n |a_{2j}|,\ \cdots,\ \sum_{j=1}^n |a_{mj}| \right) \tag{3.4}
\]
and equality is attained at a corner of the cube $Q$. $\square$
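To see Lemma 3.1 in action (an added check, not from the original text), one can verify numerically that the supremum of $\|Ax\|_\infty$ over the corners of $Q$ equals the largest absolute row sum of $A$, which is the induced $\infty$-norm:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))           # arbitrary example matrix

# Brute force over the corners of the cube Q = {x : ||x||_inf = 1}.
corner_max = max(np.max(np.abs(A @ np.array(x)))
                 for x in product([-1.0, 1.0], repeat=A.shape[1]))

# Lemma 3.1 / (3.4): the supremum equals the largest absolute row sum.
print(corner_max, np.max(np.sum(np.abs(A), axis=1)))   # the two values coincide
```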



The second $\infty$-singular value is defined as
\[
s_2^{(\infty)}(A) = \inf_{V \subseteq \mathbb{R}^n,\ \dim(V) = n-1}\ \sup_{x \in V,\ \|x\|_\infty = 1} \|Ax\|_\infty. \tag{3.5}
\]

Since the Grassmannian, which is the space of linear subspaces of $\mathbb{R}^n$ of dimension $n-1$, is compact (see, for instance, [18]), the infimum in (3.5) can be achieved at some $V_0 \subseteq \mathbb{R}^n$ with $\dim(V_0) = n-1$, because of the generalization of the extreme value theorem to compact topological spaces (see, for instance, [15]). Thus
\[
s_2^{(\infty)}(A) = \sup_{x \in V_0,\ \|x\|_\infty = 1} \|Ax\|_\infty, \tag{3.6}
\]

and denote the supremum by $\sigma$.

Theorem 3.2. Let $A$ be an $m$ by $n$ matrix; then there exists an $m$ by $n$ matrix $B$ of rank at most 1 such that
\[
\|A - B\|_\infty \le \inf_{V \subseteq \mathbb{R}^n,\ \dim(V) = n-1} \|A|_V\|_\infty, \tag{3.7}
\]
where $A|_V$ is the restriction of $A$ to the subspace $V$ of $\mathbb{R}^n$.

Proof. For any $x \in V_0$, by (3.6) we know that $\|Ax\|_\infty \le \sigma\|x\|_\infty$. Let $\pi_i : \mathbb{R}^m \to \mathbb{R}$ be the orthogonal projection that maps a vector in $\mathbb{R}^m$ to its $i$-th component for $i = 1, 2, \cdots, m$. Then $\pi_i \circ A : \mathbb{R}^n \to \mathbb{R}$ is a linear functional, and $x \mapsto \sigma\|x\|_\infty$ is a norm on $\mathbb{R}^n$. Moreover, since $\|Ax\|_\infty \le \sigma\|x\|_\infty$ for all $x \in V_0$, we have $|\pi_i \circ A(x)| \le \sigma\|x\|_\infty$ for all $x \in V_0$. By the Hahn--Banach theorem, there is a linear extension $C_i$ of $\pi_i \circ A$, a linear functional on $\mathbb{R}^n$, such that $C_i(x) = \pi_i \circ A(x)$ for all $x \in V_0$ and $|C_i(x)| \le \sigma\|x\|_\infty$ for all $x \in \mathbb{R}^n$. Now let $C : \mathbb{R}^n \to \mathbb{R}^m$ be such that $\pi_i \circ C(x) = C_i(x)$ for $i = 1, 2, \cdots, m$; then $C(x) = A(x)$ for all $x \in V_0$ and $\|C(x)\|_\infty \le \sigma\|x\|_\infty$ for all $x \in \mathbb{R}^n$. Now take $B_0 = A - C$; then it follows that the kernel of $A - C$ contains $V_0$, and
\[
\|A - B_0\|_\infty = \|C\|_\infty \le \sigma. \tag{3.8}
\]
$\square$

We can generalize this theorem to any $k$, $0 \le k \le \min(m,n)$.

Theorem 3.3. Let $A$ be an $m$ by $n$ matrix; then there exists an $m$ by $n$ matrix $B$ of rank at most $k$ such that
\[
\|A - B\|_\infty \le \inf_{V \subseteq \mathbb{R}^n,\ \dim(V) = n-k} \|A|_V\|_\infty \tag{3.9}
\]

for $0 \le k \le \min(m,n)$.

Proof. The proof is the same as the proof of Theorem 3.2, except that we need to extend $\pi_i \circ A$ on an $(n-k)$-dimensional subspace of $\mathbb{R}^n$ to $C_i$ on $\mathbb{R}^n$, to which we can still apply the Hahn--Banach theorem. $\square$

Next, we obtain a matrix $B_0$ for $0 \le k \le \min(m,n)$ by following the proof of Theorem 3.2 and extending $\pi_i \circ A$ on an $(n-k)$-dimensional subspace of $\mathbb{R}^n$ to $C_i$ on $\mathbb{R}^n$ for $i = 1, 2, \cdots, m$. This $B_0$ is actually a solution to the problem for $p = \infty$.

Theorem 3.4. The solution matrix to Problem (1.2) for $p = \infty$ is $B_0$, and the minimum is $s^{(\infty)}_{k+1}(A)$.

Proof. Using the idea in the proof of Theorem 2.1, we can show that
\[
\|A - B\|_\infty \ge \inf_{V \subseteq \mathbb{R}^n,\ \dim(V) = n-k} \|A|_V\|_\infty = s^{(\infty)}_{k+1}(A) \tag{3.10}
\]

for all matrices $B$ of size $m$ by $n$ with $\operatorname{rank}(B) \le k$. By Theorem 3.3, we know that $B_0$ is the solution matrix to the problem for $p = \infty$, and the minimum is $s^{(\infty)}_{k+1}(A)$. $\square$

4. Algorithm for Solving the Problem for $p = \infty$

In numerical analysis, one needs an algorithm to find the solution to a problem. In this section, we present an algorithm for the matrix low-rank approximation problem; given any matrix $A$, by following the steps in the algorithm, one obtains the solution matrix $B_0$.

For $p = \infty$ and $k = 1$, given a matrix $A =: (a_{ij})_{m \times n} \in \mathbb{R}^{m \times n}$, we want to find the matrix $B_0$ that solves the approximation problem. From the proof of Theorem 3.2, $B_0 = A - C$, so we just need to find $C$. But $C$ consists of the components $C_1, \cdots, C_m$ on $\mathbb{R}^n$, which are the extensions of $\pi_1 \circ A, \cdots, \pi_m \circ A$ from $V_0$. Since one can find a vector $w_0 \in \mathbb{R}^n \setminus V_0$ with $\mathbb{R}^n = V_0 \oplus \operatorname{span}\{w_0\}$, in order to construct $C_i$ we just need to define the value of $C_i(w_0)$. By (3.6), $\|Ax\|_\infty \le \sigma\|x\|_\infty$ for all $x \in V_0$. Particularly, $\|A(v + w_0)\|_\infty \le \sigma\|v + w_0\|_\infty$ for all $v \in V_0$, and furthermore, $|\pi_i \circ A(v + w_0)| \le \sigma\|v + w_0\|_\infty$ for all $v \in V_0$. Now let
\[
p_i(v) := \sigma\|v + w_0\|_\infty - \pi_i \circ A(v) \tag{4.1}
\]
for $v \in V_0$; then
\[
p_i(v) \ge \sigma\|v\|_\infty - \sigma\|w_0\|_\infty - \pi_i \circ A(v) \ge -\sigma\|w_0\|_\infty, \tag{4.2}
\]
which implies that $p_i(v)$ has a lower bound. One can break the minimization up into a number of linear programming problems; by linear programming, one can find the minimizer of $p_i(v)$ for $v \in V_0$, a finite-dimensional space. Let us denote the minimum of $p_i(v)$ on $V_0$ by $a_i$. Define
\[
f_i(v + tw_0) := \pi_i \circ A(v) + ta_i \tag{4.3}
\]

LOW RANK APPROXIMATION OF LINEAR OPERATORS IN p-NORMS

7

for any $v \in V_0$ and $t \in \mathbb{R}$, which is obviously linear. Next, we show that $f_i(v + tw_0)$ yields the component $C_i$ as wanted.

Lemma 4.1. Let $C_i(x) = f_i(x)$; then $C_i(x) = \pi_i \circ A(x)$ for all $x \in V_0$ and $|C_i(x)| \le \sigma\|x\|_\infty$ for all $x \in \mathbb{R}^n$.

Proof. For any $v \in V_0$, we know
\[
\sigma\|v + w_0\|_\infty + \sigma\|v' + w_0\|_\infty \ge \sigma\|v - v'\|_\infty \ge \pi_i \circ A(v - v') = \pi_i \circ A(v) - \pi_i \circ A(v') \tag{4.4}
\]
for all $v' \in V_0$; then we have
\[
\sigma\|v + w_0\|_\infty - \pi_i \circ A(v) \ge -\pi_i \circ A(v') - \sigma\|v' + w_0\|_\infty \tag{4.5}
\]
for all $v' \in V_0$. It follows that
\[
-\pi_i \circ A(v) - \sigma\|v + w_0\|_\infty \le a_i \tag{4.6}
\]
for all $v \in V_0$. Combined with $p_i(v) = \sigma\|v + w_0\|_\infty - \pi_i \circ A(v) \ge a_i$, this implies
\[
|f_i(v + w_0)| = |\pi_i \circ A(v) + a_i| \le \sigma\|v + w_0\|_\infty. \tag{4.7}
\]

By homogeneity, $|C_i(x)| = |f_i(x)| \le \sigma\|x\|_\infty$ for all $x \in \mathbb{R}^n$, and obviously, $C_i(x) = \pi_i \circ A(x)$ for all $x \in V_0$. $\square$

For general $k$, $0 \le k \le \min(m,n)$, in the case $p = \infty$, one can do an induction on the difference of the dimensions of $\mathbb{R}^n$ and of $V_0 \subseteq \mathbb{R}^n$ with dimension $n-k$ to construct $C_1, \cdots, C_m$. The constructed $C_1, \cdots, C_m$ make up $C$, and furthermore give us the solution matrix $B_0$.

In summary, the detailed steps for finding the solution are the following:

(1) Find the minimum of
\[
F(V) := \sup_{x \in V,\ \|x\|_\infty = 1} \|Ax\|_\infty \tag{4.8}
\]
over $V \subseteq \mathbb{R}^n$ with $\dim(V) = n-1$, which gives the value of $\sigma$ and the $(n-1)$-dimensional subspace $V_0$ of $\mathbb{R}^n$ at which $F$ achieves its minimum. This is essentially a linear programming problem, because the $\ell_\infty$-norm is dual to the $\ell_1$-norm; see [4].

(2) Find the minimum of $p_i$ defined in (4.1) over $V_0$, which gives the value of $a_i$ for $i = 1, 2, \cdots, m$, and furthermore gives the explicit expression of $f_i$ defined in (4.3) for $i = 1, 2, \cdots, m$.

(3) Write down the matrix
\[
C = (f_i(e_j))_{m \times n}, \tag{4.9}
\]
where $e_j$, $j = 1, 2, \cdots, n$, are the standard basis vectors of $\mathbb{R}^n$. As a linear map, $C$ maps a vector $x \in \mathbb{R}^n$ to the vector $(f_1(x), \cdots, f_m(x)) \in \mathbb{R}^m$.

(4) Set the difference matrix $A - C$ to be $B_0$; that is the solution to the low-rank approximation problem in the case of $k = 1$ and $p = \infty$.

For general $k$, $0 \le k \le \min(m,n)$, in the case $p = \infty$, one can find $w_1, \cdots, w_k$ in $\mathbb{R}^n \setminus V_0$, where $V_0 \subseteq \mathbb{R}^n$ and $\dim(V_0) = n-k$, such that $\mathbb{R}^n = V_0 \oplus \operatorname{span}\{w_1, \cdots, w_k\}$. One can find the matrix $C$ inductively, by repeating Steps 2 and 3.
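Below is a minimal sketch, not from the paper, of steps (2)--(4) for $k = 1$. It assumes step (1) has already produced a basis of $V_0$ (the columns of `V0_basis`), a complementary vector $w_0$, and the value $\sigma$; the LP encoding of $\min_{v \in V_0} p_i(v)$ and the function name are my own choices.

```python
import numpy as np
from scipy.optimize import linprog


def rank_one_defect_approx_inf(A, V0_basis, w0, sigma):
    """Steps (2)-(4) for k = 1: build C with C = A on V0 and |C_i(x)| <= sigma*||x||_inf,
    then return B0 = A - C, whose kernel contains V0 (so rank(B0) <= 1).

    V0_basis : (n, n-1) array whose columns span V0 (from step (1))
    w0       : vector with R^n = V0 + span{w0}
    sigma    : value of F at V0 from step (1), i.e. sup of ||Ax||_inf over the unit sphere of V0
    """
    m, n = A.shape
    d = V0_basis.shape[1]                   # = n - 1 here
    AV = A @ V0_basis                       # (A v)_i for v = V0_basis @ y is (AV @ y)[i]

    # Step (2): a_i = min over v in V0 of sigma*||v + w0||_inf - (A v)_i, written as an LP
    # in (y, t): minimize sigma*t - AV[i].y  subject to  -t <= (V0_basis y + w0)_j <= t.
    G = np.vstack([np.hstack([V0_basis, -np.ones((n, 1))]),
                   np.hstack([-V0_basis, -np.ones((n, 1))])])
    h = np.concatenate([-w0, w0])
    a = np.zeros(m)
    for i in range(m):
        c = np.concatenate([-AV[i], [sigma]])
        res = linprog(c, A_ub=G, b_ub=h, bounds=[(None, None)] * d + [(0, None)])
        a[i] = res.fun                      # bounded below by (4.2) when sigma is exact

    # Step (3): f_i(v + t*w0) = (A v)_i + t*a_i; evaluate f_i at each standard basis
    # vector e_j by solving e_j = V0_basis @ y_j + t_j * w0.
    T = np.column_stack([V0_basis, w0])
    coords = np.linalg.solve(T, np.eye(n))  # first d rows: y_j, last row: t_j
    C = AV @ coords[:d, :] + np.outer(a, coords[d, :])

    # Step (4): since C = A on V0, the kernel of B0 = A - C contains V0.
    return A - C
```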

LOW RANK APPROXIMATION OF LINEAR OPERATORS IN p-NORMS

8

5. Remark on Other Cases

If $p \neq 1, 2, \infty$, the matrix $p$-norms, as mentioned, are NP-hard to approximate; see [8] and [21]. Apart from $p = 2$ and $p = \infty$, the case $p = 1$ is the most computable one, owing to its piecewise linearity. Assume $p = 1$; then the matrix approximation problem becomes finding the solution to
\[
\min_{\operatorname{rank}(B) \le k} \|A - B\|_1. \tag{5.1}
\]
In this case, we still have
\[
\|A - B\|_1 \ge \inf_{V \subseteq \mathbb{R}^n,\ \dim(V) = n-k} \|A|_V\|_1 = s^{(1)}_{k+1}(A), \tag{5.2}
\]
by the same argument as in the proof of Theorem 2.1. However, the Hahn--Banach theorem cannot be applied to obtain the other direction of the inequality, mainly because $\ell_1$ does not have the metric extension property; see [7], [19], [12], [11], and [20].

6. Computable Examples on Intermediate $p$-Singular Values with $k = 1$

As mentioned in a previous section, there has not been an algorithm for solving the problem for the intermediate $p$-singular values in general. However, in this section, we present some examples and geometric intuitions on computing the intermediate generalized singular values.

Example 6.1. Let
\[
A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix},
\]
with $m = 3$, $n = 3$, and $p = 1$. Let $x = (x_1, x_2, x_3)^{T}$; then $Ax = (x_1 + x_2,\ x_2 + x_3,\ x_3)^{T}$. Therefore,
\[
s_2^{(1)}(A) = \inf_{V \subseteq \mathbb{R}^3,\ \dim(V) = 2}\ \sup_{x \in V,\ \|x\|_1 = 1} \|Ax\|_1 = \inf_{V \subseteq \mathbb{R}^3,\ \dim(V) = 2}\ \sup_{x \in V \setminus \{0\}} \frac{|x_1 + x_2| + |x_2 + x_3| + |x_3|}{|x_1| + |x_2| + |x_3|}. \tag{6.1}
\]

For any $V \subseteq \mathbb{R}^3$ with $\dim(V) = 2$, we have
\[
\|A|_V\|_1 = \sup_{x \in V,\ \|x\|_1 = 1} \big( |x_1 + x_2| + |x_2 + x_3| + |x_3| \big) \ge 1, \tag{6.2}
\]
because there exists some nonzero vector $x'$ in the intersection of $V$ and the plane $Q_1 := \{x \in \mathbb{R}^3 : x_1 = 0\}$, and
\[
|x_1 + x_2| + |x_2 + x_3| + |x_3| = |x_2| + |x_2 + x_3| + |x_3| \ge |x_2| + |x_3| = 1 \tag{6.3}
\]
for all $x \in Q_1$ with $\|x\|_1 = 1$. By choosing $V = Q := \{x \in \mathbb{R}^3 : x_2 + x_3 = 0\}$, we have
\[
\|A|_Q\|_1 = \sup_{x \in Q,\ \|x\|_1 = 1} \big( |x_1 + x_2| + |x_3| \big) \le |x_1| + |x_2| + |x_3| = 1. \tag{6.4}
\]
Thus it follows that $s_2^{(1)}(A) = 1$. See Figure 6.1.



Figure 6.1. The plane $x_2 + x_3 = 0$ cuts the octahedron $|x_1| + |x_2| + |x_3| = 1$.

Now let
\[
B = \begin{pmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 0 & 1 & 1 \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix};
\]
then we can see that $\operatorname{rank}(B) = 1$ and
\[
\|A - B\|_1 = \left\| \begin{pmatrix} 1 & \tfrac{1}{2} & -\tfrac{1}{2} \\ 0 & 0 & 0 \\ 0 & -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix} \right\|_1 = \sup_{x \in \mathbb{R}^3 \setminus \{0\}} \frac{\big| x_1 + \tfrac{x_2}{2} - \tfrac{x_3}{2} \big| + \big| -\tfrac{x_2}{2} + \tfrac{x_3}{2} \big|}{|x_1| + |x_2| + |x_3|} = 1 \tag{6.5}
\]
since $\big| x_1 + \tfrac{x_2}{2} - \tfrac{x_3}{2} \big| + \big| -\tfrac{x_2}{2} + \tfrac{x_3}{2} \big| \le |x_1| + |x_2| + |x_3|$. On the other hand, for any $B'$ with $\operatorname{rank}(B') \le 1$, the dimension of $\ker(B')$ is not less than 2. Considering the restriction to $\ker(B')$, we have

\[
\|A - B'\|_1 = \sup_{x \in \mathbb{R}^3,\ \|x\|_1 = 1} \|(A - B')x\|_1 \ge \sup_{x \in \ker(B'),\ \|x\|_1 = 1} \|Ax\|_1. \tag{6.6}
\]

By the definition of the second 1-singular value of $A$,
\[
s_2^{(1)}(A) = \inf_{V \subseteq \mathbb{R}^3,\ \dim(V) \ge 2}\ \sup_{x \in V,\ \|x\|_1 = 1} \|Ax\|_1 \le \sup_{x \in \ker(B'),\ \|x\|_1 = 1} \|Ax\|_1. \tag{6.7}
\]

Thus it follows from the above two inequalities that $\|A - B'\|_1 \ge s_2^{(1)}(A)$. Finally, we conclude that
\[
B = \begin{pmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 0 & 1 & 1 \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}
\]
is a solution to the minimization problem (1.2) with $k = 1$, for which the minimum is $s_2^{(1)}(A)$.
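As a quick numerical check of Example 6.1 (added here, not part of the original text), one can confirm that $\operatorname{rank}(B) = 1$ and that the induced 1-norm of $A - B$, the maximum absolute column sum, equals 1:

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
B = np.array([[0.0, 0.5, 0.5],
              [0.0, 1.0, 1.0],
              [0.0, 0.5, 0.5]])

print(np.linalg.matrix_rank(B))        # 1
# Induced (operator) 1-norm = maximum absolute column sum.
print(np.linalg.norm(A - B, ord=1))    # 1.0, which is s_2^{(1)}(A)
```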

7. Acknowledgment

The author would like to thank Prof. M. J. Lai for suggesting the research problem and Prof. B. Johnson for a reference. The author is partially supported by the Air Force Office of Scientific Research under grant AFOSR 9550-12-1-0455. The author would also like to thank the referees for the helpful comments.

References

[1] A. Bhaskara and A. Vijayaraghavan. Approximating matrix p-norms. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, pages 497-511. SIAM, 2011.
[2] J. F. Cai, E. J. Candès, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4):1956-1982, 2010.
[3] E. J. Candès and B. Recht. Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9(6):717-772, 2009.
[4] E. J. Candès and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12):4203-4215, 2005.
[5] S. Friedland and A. Torokhti. Generalized rank-constrained matrix approximations. SIAM Journal on Matrix Analysis and Applications, 29:656-659, 2007.
[6] B. Gärtner and J. Matoušek. Approximation Algorithms and Semidefinite Programming. Springer, 2012.
[7] D. B. Goodner. Projections in normed linear spaces. Transactions of the American Mathematical Society, 69:89-108, 1950.
[8] J. M. Hendrickx and A. Olshevsky. Matrix p-norms are NP-hard to approximate if p ≠ 1, 2, ∞. SIAM Journal on Matrix Analysis and Applications, 31:2802-2812, 2010.
[9] N. J. Higham. Estimating the matrix p-norm. Numerische Mathematik, 62(1):539-555, 1992.
[10] C. J. Hillar and L. H. Lim. Most tensor problems are NP-hard. Journal of the ACM, 60(6):45, 2013.
[11] W. B. Johnson and J. Lindenstrauss. Handbook of the Geometry of Banach Spaces, volume 2. North-Holland, 2003.
[12] J. L. Kelley. Banach spaces with the extension property. Transactions of the American Mathematical Society, 72:323-326, 1952.
[13] M. J. Lai and Y. Liu. The null space property for sparse recovery from multiple measurement vectors. Applied and Computational Harmonic Analysis, 30(3):402-406, 2011.
[14] M. J. Lai and Y. Liu. The probabilistic estimates on the largest and smallest q-singular values of random matrices. Mathematics of Computation, 84(294):1775-1794, 2015.
[15] J. Lee. Introduction to Topological Manifolds, volume 940. Springer Science & Business Media, 2010.
[16] E. Liberty, F. Woolfe, P. G. Martinsson, V. Rokhlin, and M. Tygert. Randomized algorithms for the low-rank approximation of matrices. Proceedings of the National Academy of Sciences, 104(51):20167, 2007.
[17] Y. Liu, T. Mi, and S. Li. Compressed sensing with general frames via optimal-dual-based $\ell_1$-analysis. IEEE Transactions on Information Theory, 58(7):4201-4214, 2012.
[18] J. W. Milnor and J. D. Stasheff. Characteristic Classes. Number 76. Princeton University Press, 1974.
[19] L. Nachbin. A theorem of the Hahn-Banach type for linear transformations. Transactions of the American Mathematical Society, 68(1):28-46, 1950.
[20] A. Pietsch. History of Banach Spaces and Linear Operators. Birkhäuser, 2007.
[21] D. Steinberg. Computation of matrix norms with applications to robust optimization. Master's thesis, Technion-Israel Institute of Technology, 2005.
[22] F. Woolfe, E. Liberty, V. Rokhlin, and M. Tygert. A fast randomized algorithm for the approximation of matrices. Applied and Computational Harmonic Analysis, 25(3):335-366, 2008.
[23] X. Zhang, Y. Yu, M. White, R. Huang, and D. Schuurmans. Convex sparse coding, subspace learning, and semi-supervised extensions. In AAAI, 2011.

Department of Mathematics, Michigan State University, East Lansing, MI 48824
E-mail address: [email protected]
