Subspace Detection from Structured Union of Subspaces via Linear Sampling
Thakshila Wimalajeewa*, Member, IEEE, Yonina C. Eldar†, Fellow, IEEE, and Pramod K. Varshney*, Fellow, IEEE
arXiv:1304.6281v1 [cs.IT] 23 Apr 2013
Abstract Lower dimensional signal representation schemes frequently assume that the signal of interest lies in a single vector space. In the context of the recently developed theory of compressive sensing (CS), it is often assumed that the signal of interest is sparse in an orthonormal basis. However, in many practical applications, this requirement may be too restrictive. A generalization of the standard sparsity assumption is that the signal lies in a union of subspaces. Recovery of such signals from a small number of samples has been studied recently in several works. Here, we consider the problem of subspace detection in which our goal is to identify the subspace (from the union) in which the signal lies from a small number of samples, in the presence of noise. More specifically, we derive performance bounds and conditions under which reliable subspace detection is guaranteed using maximum likelihood (ML) detection. We begin by treating general unions and then specify the results to the special case in which the subspaces have structure leading to block sparsity. In our analysis, we treat both general sampling operators and random sampling matrices. With general unions, we show that under certain conditions, the number of measurements required for reliable subspace detection in the presence of noise via ML is less than that implied using the restricted isometry property which guarantees signal recovery. In the special case of block sparse signals, we quantify the gain achievable over standard sparsity in subspace detection. Our results also strengthen existing results on sparsity pattern recovery in the presence of noise under the standard sparsity model.
Index terms— Maximum likelihood detection, union of linear subspaces, linear sampling, compressive sensing, random projections, block sparsity pattern recovery

* Dept. of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY 13244
† Dept. of Electrical Engineering, Technion-Israel Institute of Technology, Technion City, Haifa 32000, Israel
Email: [email protected], [email protected], [email protected]

I. INTRODUCTION

The compressive sensing (CS) framework has established that a small number of measurements acquired via random projections are sufficient for signal recovery when the signal of interest is sparse in a certain basis. Consider a length-N signal x which can be represented in a basis V such that x = Vc. The signal x is said to be k-sparse in the basis V if c has only k nonzero coefficients, where k is much smaller than N. It has been shown in [1]–[3] that O(k log(N/k)) compressive measurements are sufficient to recover x when the measurements are random.
Signal recovery can be performed via optimization or greedy approaches. A detailed overview of CS can be found in [4]–[13].

There are a variety of applications in which complete signal recovery is not necessary. The problem of sparsity pattern recovery (i.e., identifying the locations of the nonzero coefficients of the sparse signal) arises in a wide variety of areas including source localization [14], [15], sparse approximation [16], subset selection in linear regression [17], [18], estimation of frequency band locations in cognitive radio networks [19]–[21], and signal denoising [22]. In these applications, finding the sparsity pattern of the signal is often more important than approximating the signal itself. Further, in the CS framework, once the locations of the nonzero coefficients are identified, the signal can be estimated using standard techniques. Performance limits on reliable recovery of the sparsity pattern of sparse signals from noisy compressive measurements have been derived by several authors in recent work exploiting information theoretic tools [23]–[30]. Most of these works focus on deriving necessary and sufficient conditions for reliable sparsity pattern recovery, and quantify the gap between the performance limits of existing computationally tractable algorithms and what can be achieved by the optimal Maximum Likelihood (ML) detector when computational constraints are removed.

Although initial work on CS focused on signals which are sparse in a certain basis, there are practical scenarios where structural properties of the signal are available. Reduced dimensional signal processing with several signal models that go beyond simple sparsity has been discussed in the recent literature [31]–[36]. One general model that can describe many structured problems is that of a union of subspaces. In this setting, the signal is known to lie in one out of a possible set of subspaces, but the specific subspace is unknown. Examples include wideband spectrum sensing [20] and signals with finite rate of innovation [37], [38]. With this additional information, more efficient algorithms can be developed than in the traditional CS framework. Conditions under which stable sampling and recovery are possible in a general union of subspaces model are derived in [32], [34]. Signal recovery with a structured union of subspaces was considered in [31], [33]. However, the problem of subspace detection has not been treated in this more general setting.

In this paper, our goal is to investigate subspace detection in the union of subspaces model with a given sampling operator. We consider subspace recovery based on the optimal ML decoding scheme in the presence of noise and derive performance in terms of the probability of error when sampling is performed via an arbitrary linear sampling operator. Based on an upper bound on the probability of error, we derive the minimum number of samples required for asymptotic reliable detection of subspaces in terms of a signal-to-noise ratio (SNR) measure, the dimension of each subspace in the union, and a term which quantifies the dependence among the subspaces. In the special case where sampling is performed via random projections and the subspaces in the union have a specific structure such that each subspace is a sum of some other k0 subspaces, we obtain a more explicit expression for the minimum number of measurements. This number depends on the number of underlying subspaces, the dimension of each subspace, and the minimum nonzero block SNR (defined in Section IV.B). We note that the conventional sparsity model is a special case of this structure.
The asymptotic probability of error of the ML detector in the presence of noise for the standard sparsity model was first investigated in [23], followed by several other authors [24], [25], [30]. In [23], sufficient conditions were derived on the number of noisy compressive measurements needed to achieve a vanishing probability of error asymptotically for sparsity pattern recovery, while necessary conditions were considered in [25]. The analyses in both papers are based on the assumption that the sampling operator is random. Here, we follow a similar direction while deriving performance metrics for subspace detection with the union of subspaces model. However, there are key differences between our derivations and those in [23]. First, we treat arbitrary (not necessarily random) sampling operators and assume a general union of subspaces model as opposed to standard sparsity. Further, the results in [23] were derived based on weak bounds on the probability of error, so there is a gap between those results and the number of measurements required for the exact probability of error to vanish asymptotically at finite SNR. Here, we consider tighter bounds on the probability of error; thus, even for the conventional CS framework, we present tighter results. In particular, while the results of [23] require $k + (c_1+2048)\max\left\{k\log\frac{N-k}{k},\ \frac{\log(N-k)}{CSNR_{\min}}\right\}$ measurements, we show that only $k + \frac{c_2}{CSNR_{\min}}\log(N-k)$ measurements are needed for reliable asymptotic sparsity pattern recovery with the standard sparsity model, where N, k, and $CSNR_{\min}$ are the signal dimension, the sparsity index and the minimum component SNR of the signal, respectively, and $c_1$ and $c_2$ are constants. We further illustrate numerically the performance gap, in terms of the probability of error bounds for the ML detector and the probability of error for computationally tractable algorithms (e.g., OMP), used for subspace detection with the union of subspaces model. Numerical results show that the derived bound on the probability of error is fairly close to the exact probability of error of the ML detector obtained via simulations, especially in the high SNR region.

The rest of the paper is organized as follows. In Section II, the problem of subspace detection from the union of subspaces model is introduced. In Section III, performance limits with ML detection in terms of the probability of error, and conditions under which asymptotic reliable subspace detection in the presence of noise is guaranteed, are derived with a given linear sampling operator considering a general union of subspaces model. The results are extended in Section IV to the setting where structural properties of the subspaces in the union are available. Sufficient conditions for subspace recovery from the union of subspaces model when sampling is performed via random projections are also derived in Section IV. In Section V, we discuss some practical algorithms to detect subspaces in the union of subspaces model and present numerical results to validate the theoretical claims.

Throughout the paper, we use the following notation. Arbitrary vectors in a Hilbert space H are denoted by lower case letters, e.g., x. Calligraphic letters, e.g., S, are used to represent subspaces in H. Vectors in $\mathbb{R}^N$ are written in boldface lower case letters, e.g., x. Scalars (in $\mathbb{R}$) are also denoted by lower case letters, e.g., x, when there is no confusion. Matrices are written in boldface upper case letters, e.g., A. Linear operators and sets of basis vectors for a given subspace S are denoted by upper case letters, e.g., A. $x \sim \mathcal{N}(\mu, \Sigma)$ denotes that the random vector x is distributed as multivariate Gaussian with mean µ and covariance matrix Σ. $x \sim \mathcal{X}_m^2(\lambda)$ denotes that the random variable x is distributed as Chi squared with m degrees of freedom and non-centrality parameter λ (the central Chi squared distribution is denoted by $\mathcal{X}_m^2$). 0 is a vector of appropriate dimension in which all elements are zeros. The conjugate transpose of a matrix A is denoted by $A^*$. $I_k$ denotes the identity matrix of size k. $\|\cdot\|_2$ denotes the $l_2$ norm and $|\cdot|$ is used to denote both the cardinality (of a set) and the absolute value (of a scalar). The special functions used in the paper are the Gaussian Q-function $Q(x) = \frac{1}{\sqrt{2\pi}}\int_x^{\infty} e^{-t^2/2}\,dt$, the Gamma function $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt$, and the modified Bessel function with real argument $K_{\nu}(x) = \int_0^{\infty} e^{-x\cosh t}\cosh(\nu t)\,dt$.
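These special functions are all available in standard numerical libraries; the short Python sketch below (using numpy/scipy, an assumption of this illustration and not part of the paper) shows the correspondence, and the same calls are reused in the later sketches.

    import numpy as np
    from scipy.special import erfc, gamma, kv  # gamma = Gamma function, kv = modified Bessel K_nu

    def q_func(x):
        # Gaussian Q-function: Q(x) = 0.5 * erfc(x / sqrt(2))
        return 0.5 * erfc(x / np.sqrt(2.0))

    print(q_func(1.0))    # ~0.1587
    print(gamma(2.5))     # ~1.3293
    print(kv(0.5, 1.0))   # K_{1/2}(1) = sqrt(pi/2) * exp(-1) ~ 0.4611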
II. PROBLEM FORMULATION AND MATHEMATICAL MODEL

As discussed in [31]–[34], there are many practical scenarios where the signals of interest lie in a union of subspaces instead of a single subspace.

II.A Union of subspaces

Definition II.1 (Union of subspaces). A signal x ∈ H lies in a union of subspaces if x ∈ X, where X is defined as
$$\mathcal{X} = \bigcup_i \mathcal{S}_i \qquad (1)$$
and the $\mathcal{S}_i$'s are subspaces of H.

A signal x ∈ X if and only if there exists $i_0$ such that $x \in \mathcal{S}_{i_0}$. Let T < ∞ denote the number of subspaces in the union X. Let $V_i = \{v_{im}\}_{m=1}^{k_i}$ be a basis for the finite dimensional subspace $\mathcal{S}_i$, where $k_i$ is the dimension of $\mathcal{S}_i$. Then each $x \in \mathcal{S}_i$ can be expressed in terms of a basis expansion
$$x = \sum_{m=0}^{k_i-1} c_i(m)\, v_{im}$$
where the $c_i(m)$'s for $m = 0, 1, \cdots, k_i-1$ are the coefficients corresponding to the basis $V_i$. We assume that the subspaces are disjoint (i.e., there are no subspaces such that $\mathcal{S}_i \subseteq \mathcal{S}_j$ for $i \neq j$ in the union (1)) and each subspace $\mathcal{S}_i$ is uniquely determined by the basis $V_i$.

II.B Observation model: Linear sampling

Consider a sampling operator given by a bounded linear mapping of a signal x that lives in an ambient Hilbert space H. Let the linear sampling operator A be specified by a set of unique sampling vectors $\{a_m\}_{m=1}^{M}$. With this
notation, the noisy samples are given by
$$y = Ax + w \qquad (2)$$
where y is the M × 1 measurement vector and the m-th element of the vector Ax is given by $(Ax)_m = \langle x, a_m\rangle$ for $m = 0, 1, \cdots, M-1$, where $\langle\cdot,\cdot\rangle$ denotes the inner product. The noise vector w is assumed to be Gaussian with mean 0 and covariance matrix $\sigma_w^2 I_M$.

When $x \in \mathcal{S}_i$ for some i in the model (1), the vector of samples can be equivalently represented in the form of a matrix-vector multiplication,
$$y = B_i c_i + w \qquad (3)$$
where
$$B_i = AV_i = \begin{bmatrix} \langle a_0, v_{i0}\rangle & \langle a_0, v_{i1}\rangle & \cdots & \langle a_0, v_{i(k_i-1)}\rangle \\ \langle a_1, v_{i0}\rangle & \langle a_1, v_{i1}\rangle & \cdots & \langle a_1, v_{i(k_i-1)}\rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle a_{M-1}, v_{i0}\rangle & \langle a_{M-1}, v_{i1}\rangle & \cdots & \langle a_{M-1}, v_{i(k_i-1)}\rangle \end{bmatrix}$$
and $c_i = [c_i(0)\ c_i(1)\ \cdots\ c_i(k_i-1)]^T$ is the coefficient vector with respect to the basis $V_i$. Further, let $b_{im}$ denote the m-th column vector of the matrix $B_i$ for $m = 0, 1, \cdots, k_i-1$ and $i = 0, 1, \cdots, T-1$. We assume that the linear sampling operator A is a one-to-one mapping between X and AX. Since $\{v_{i0}, \cdots, v_{i(k_i-1)}\}$ is a set of linearly independent basis vectors, the vectors $\{b_{i0}, \cdots, b_{i(k_i-1)}\}$ are also linearly independent for each $i = 0, 1, \cdots, T-1$.
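As a concrete illustration of this observation model, the following Python sketch (all dimensions, the Gaussian choice of sampling vectors and the random basis are illustrative assumptions, not values from the paper) forms B_i = A V_i for one subspace and draws noisy samples according to (2)-(3).

    import numpy as np

    rng = np.random.default_rng(0)
    N, M, k_i = 50, 20, 4            # ambient dimension, number of samples, subspace dimension
    sigma_w = 0.1

    # Sampling operator: rows are the sampling vectors a_m (Gaussian here, one possible choice)
    A = rng.standard_normal((M, N))

    # Orthonormal basis V_i for one k_i-dimensional subspace S_i (columns are the v_im)
    V_i = np.linalg.qr(rng.standard_normal((N, k_i)))[0]

    # Equivalent measurement matrix B_i = A V_i, whose (m, n) entry is <a_m, v_in>
    B_i = A @ V_i

    # Signal x = V_i c_i in S_i and noisy samples y = B_i c_i + w
    c_i = rng.standard_normal(k_i)
    w = sigma_w * rng.standard_normal(M)
    y = B_i @ c_i + w
    print(y.shape)                   # (20,)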
The sparsity model used in the compressive sensing (CS) literature is a special case of this general union of subspaces model. In the standard CS framework, it is assumed that a length-N signal of interest x is k-sparse in an orthonormal basis V, so that x can be represented as x = Vc with c having only k ≪ N significant coefficients. This corresponds to assuming that x lies in a union of subspaces where each subspace is k-dimensional and has as a basis k vectors from the orthonormal basis V. In this case, there are $T = \binom{N}{k}$ such subspaces in the union.

II.C Subspace detection from the union of subspaces model

As discussed in the Introduction, there are applications where it is sufficient to detect the subspace in which the signal of interest lies in the union of subspaces model (1), instead of performing complete signal recovery. Moreover, if there is a procedure to correctly identify the subspace with vanishing probability of error, then the signal x can be reconstructed with small $l_2$ norm error using standard techniques. The converse is not always true: if an algorithm developed for complete signal recovery is used for subspace detection, it may not give an equivalent performance guarantee. This is because, even if the estimate of the signal is close to the true signal with respect to the considered performance metric (e.g., $l_2$ norm error), the subspace in which the estimated signal lies may be different from the true subspace. This can happen especially when the SNR is not sufficiently high. Thus, investigating the problem of subspace detection is important, and it is the main focus of this paper.

For the problem of subspace detection, the performance metrics used to evaluate the quality of the estimate are different from those used for exact signal recovery. In this paper, we consider subspace detection via the ML detector, and performance is evaluated via the probability of error, which is defined in the next section. More specifically, our goal is to address the following issues.
• Performance of the optimal subspace detection scheme (ML detection) in terms of the probability of error in detecting the subspaces in the union of subspaces model (1) in the presence of noise. We are also interested in conditions under which reliable subspace recovery in the union is guaranteed with a given sampling operator.
• How much gain, in terms of the number of samples required for subspace detection, can be achieved if further structural information is available about the subspaces in (1), compared to the case when no additional structural information is available (i.e., compared to the standard sparsity model used in CS).
• Illustration of the performance gap between the ML detector and computationally tractable algorithms for subspace detection in the union of subspaces model at finite SNR.
II.D Main results

The main results of the paper can be summarized as follows. With the general union of subspaces model as defined in (1), the minimum number of samples required for asymptotic reliable detection of subspaces in the presence of noise is
$$M > k + \frac{c_1}{f(SNR)}\log(\bar{T}_0) \qquad (4)$$
where k is the dimension of each subspace, f(SNR) is a monotonically increasing function of SNR, and $\bar{T}_0$ is a measure of the number of subspaces in the union with maximum dependence, where $\bar{T}_0 \leq T$ (formal definitions of all these terms are given in Section III). In the special case where each subspace in the union (1) can be expressed as a sum of $k_0$ subspaces out of L, where each such subspace is d-dimensional so that $k = k_0 d$, the problem of subspace detection reduces to the problem of block sparsity pattern recovery. In that case, when the sampling operator is represented by random projections, the number of samples required for asymptotic reliable block sparsity pattern recovery is given by
$$M > k + \frac{c_2}{BSNR_{\min}}\log(L-k_0) \qquad (5)$$
where $BSNR_{\min}$ is the minimum nonzero block SNR and $c_2$ is a constant.

For the general union of subspaces model, the authors in [34] derived lower bounds on the minimum number of measurements required for the sampling matrix to satisfy the restricted isometry property (RIP). Similar results are derived in [33] when the subspaces in the union have a specific structure leading to block sparsity, as considered in Section IV. Based on their results, the dominant factor of the minimum number of samples required for the sampling matrix to satisfy the RIP scales as
$$M > \eta_1 k + \eta_2 \log(T) \qquad (6)$$
with the general union of subspaces model [34], and
$$M > \eta_3 k + \eta_4 k_0 \log(L/k_0) \qquad (7)$$
with the block sparse model [33], for some constants $\eta_i$, i = 1, 2, 3, 4. Thus, with these numbers of measurements, signals which lie in the union of subspaces can be recovered using practical algorithms with high probability when there is no noise. In contrast, our results provide sufficient conditions for subspace detection (but not complete signal recovery) with the optimal ML detector. Our results show that, when the subspaces in the union are such that $\bar{T}_0 \ll T$ and we are operating in a finite SNR region, the minimum number of samples required for asymptotic subspace detection is much less than that predicted in [34] for signal recovery.
In the case of the standard sparsity model, our results show that
$$M > k + \frac{\hat{c}_2}{CSNR_{\min}}\log(N-k) \qquad (8)$$
measurements are required for reliable sparsity pattern recovery, where N = Ld and $CSNR_{\min}\ (\leq BSNR_{\min}/d)$ is the minimum component SNR. Thus, from (5) and (8), we observe that the number of measurements required for asymptotic subspace recovery beyond the sparsity index (i.e., M − k) is reduced by approximately a factor of d with the block sparsity model compared to that required with standard sparsity pattern recovery. A detailed comparison between our results and existing results in the literature is presented in Sections III and IV. From the numerical results, we will see that the derived upper bound on the probability of error of the ML detector is a tight bound on the exact probability of error obtained via simulations. Further, it is observed that existing computationally tractable algorithms for subspace detection show a considerable performance gap compared to what can be achieved via computationally expensive ML detection.

III. SUBSPACE DETECTION WITH GENERAL UNIONS

We assume that knowledge of the sampling operator A and the bases for each subspace is available at the detector. The problem of finding the true subspace from the observation model (3) via the ML detector becomes finding the index $\hat{i}$ such that
$$\hat{i} = \arg\max_{i=0,\cdots,T-1} p(y|B_i).$$
Given that $x \in \mathcal{S}_i$ for some i, and using the observation model (3), we have $p(y|B_i) = \mathcal{N}(B_i c_i, \sigma_w^2 I_M)$. Since the coefficient vector $c_i$ is not known, the ML detector estimates $c_i$ such that $p(y|B_i)$ is maximized. The ML detector chooses the subspace $\mathcal{S}_i$ over $\mathcal{S}_j$ for $i \neq j$ if
$$\max_{c_i} p(y|B_i) > \max_{c_j} p(y|B_j).$$
The ML estimate of $c_i$ is $\hat{c}_i = (B_i^* B_i)^{-1} B_i^* y$, resulting in
$$\log\left(\max_{c_i} p(y|B_i)\right) = \log\frac{1}{(2\pi\sigma_w^2)^{M/2}} - \frac{1}{2\sigma_w^2}\|y - P_i y\|_2^2 = \log\frac{1}{(2\pi\sigma_w^2)^{M/2}} - \frac{1}{2\sigma_w^2}\|P_i^{\perp} y\|_2^2$$
where $P_i = B_i(B_i^* B_i)^{-1}B_i^*$ is the orthogonal projector onto the span of $\{b_{im}\}_{m=1}^{k_i}$, $P_i^{\perp} = I - P_i$, and $B^*$ denotes the Hermitian transpose of the matrix B. Thus, the ML detector selects the subspace $\mathcal{S}_i$ over $\mathcal{S}_j$ for $i \neq j$ if
$$\|P_i^{\perp} y\|_2^2 - \|P_j^{\perp} y\|_2^2 < 0.$$
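In practice, this decision rule amounts to computing the least-squares residual energy $\|P_i^{\perp}y\|_2^2$ of y against each candidate $B_i$ and choosing the smallest. A minimal Python sketch (assuming the candidate matrices are available as a list, e.g., built as in Section II.B) is given below; for the block sparse structure of Section IV, the list contains one $B_i$ per candidate block pattern, which is what makes exhaustive ML expensive.

    import numpy as np

    def ml_detect(y, B_list):
        # Return the index of the candidate subspace minimizing ||P_i^perp y||_2^2.
        residuals = []
        for B in B_list:
            c_hat, *_ = np.linalg.lstsq(B, y, rcond=None)   # c_hat = (B* B)^{-1} B* y
            residuals.append(np.sum((y - B @ c_hat) ** 2))  # residual energy ||P_i^perp y||_2^2
        return int(np.argmin(residuals))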
The performance of the ML detector is characterized by the probability of error, given by
$$P_e = \Pr(B_{\mathrm{estimated}} \neq B_{\mathrm{true}}) = \sum_i \Pr(\hat{i} \neq i\,|\,B_i)\Pr(B_i) \leq \sum_i \sum_j \Pr(\hat{i} = i\,|\,B = B_j)\Pr(B = B_j) \qquad (9)$$
where
$$\Pr(\hat{i} = i\,|\,B = B_j) = \Pr\left(\|P_i^{\perp} y\|_2^2 - \|P_j^{\perp} y\|_2^2 < 0\right).$$
Let $\Delta(y) = \|P_i^{\perp} y\|_2^2 - \|P_j^{\perp} y\|_2^2$. When the true subspace is $\mathcal{S}_j$, we have $\|P_j^{\perp} y\|_2^2 = \|P_j^{\perp} w\|_2^2$ and
$$P_i^{\perp} y = P_i^{\perp} B_j c_j + P_i^{\perp} w = P_i^{\perp}(B_{j\backslash i} c_{j\backslash i} + w) \qquad (10)$$
where $B_{j\backslash i} c_{j\backslash i} = \sum_{b_{jm} \notin \mathcal{R}(B_i)} b_{jm} c_j(m)$ and $\mathcal{R}(A)$ denotes the range space of the matrix A. More specifically, the $M \times l$ matrix $B_{j\backslash i}$ contains the columns of $B_j$ which are not in the range space of the matrix $B_i$, where l is the number of such columns in $B_j$. The $l \times 1$ vector $c_{j\backslash i}$ contains the elements of $c_j$ corresponding to the column vectors in $B_{j\backslash i}$. We conclude that $\Delta(y) = \|P_i^{\perp}(B_{j\backslash i} c_{j\backslash i} + w)\|_2^2 - \|P_j^{\perp} w\|_2^2$ and $\Pr(\Delta(y) < 0) = \Pr\left(\frac{\|P_i^{\perp}(B_{j\backslash i} c_{j\backslash i} + w)\|_2^2}{\|P_j^{\perp} w\|_2^2} < 1\right)$.
Let the eigendecomposition of $P_i^{\perp}$ be $P_i^{\perp} = Q_i \Lambda_i Q_i^{T}$, where $Q_i$ is a unitary matrix consisting of eigenvectors of $P_i^{\perp}$ and $\Lambda_i$ is a diagonal matrix whose diagonal elements are the eigenvalues of $P_i^{\perp}$, which are M − k ones and k zeros. Then, for given l,
$$\lambda_{j\backslash i} = \frac{1}{\sigma_w^2}\|P_i^{\perp} B_{j\backslash i} c_{j\backslash i}\|_2^2 = \sum_{m\in\mathcal{Q}_i} \alpha_{m,i}^2(l) \geq (M-k)\,\alpha_{\min,l}^2$$
where $\alpha_{m,i}(l) = \frac{1}{\sigma_w}\langle q_{m,i}, B_{j\backslash i} c_{j\backslash i}\rangle$ for given l, $q_{m,i}$ is the m-th eigenvector of $P_i^{\perp}$, $\mathcal{Q}_i$ is the set containing the indices corresponding to the nonzero eigenvalues with $|\mathcal{Q}_i| = M-k$, and $\alpha_{\min,l} = \min_{i;\,i\neq j} |\alpha_{m,i}(l)|$.

Note that $(M-k)\alpha_{\min,l}^2$ measures the minimum SNR of the sampled signal Ax projected onto the null space of any subspace $\mathcal{S}_i$ for $i \neq j$, $i = 0, 1, \cdots, T-1$, such that $|\mathcal{W}_{j\backslash i}| = l$, given that the true subspace in which the signal lies is $\mathcal{S}_j$. For a given subspace $\mathcal{S}_j$, define $T_j(l)$ to be the number of subspaces $\mathcal{S}_i$ such that $|\mathcal{W}_{j\backslash i}| = l$. With these notations, the probability of error in (12) can be further upper bounded by
$$P_e \leq \frac{1}{T}\sum_{j=0}^{T-1}\sum_{l=1}^{k} T_j(l)\left[Q\!\left(\frac{1}{2}(1-2c_0)\sqrt{(M-k)\alpha_{\min,l}^2}\right) + \Psi\!\left(l,\ (M-k)\alpha_{\min,l}^2\right)\right] \qquad (13)$$
where $\Psi\!\left(l,\ (M-k)\alpha_{\min,l}^2\right) = \frac{\sqrt{2}}{2^{l}\Gamma(l/2)}\left(c_0(M-k)\alpha_{\min,l}^2\right)^{l/2-1/2} K_{l/2-1/2}\!\left(c_0(M-k)\alpha_{\min,l}^2/2\right)$. To obtain (13) we used the facts that Q(x) is monotonically non-increasing in x and Ψ(s, x) is monotonically non-increasing in x for given s when x > 0. The quantity $T_j(l)$ is a measure of the dependency among the subspaces in the union. To compute $T_j(l)$ explicitly, the specific structures of the subspaces should be known. For example, in the standard sparsity model used in CS, in which the union in (1) consists of $T = \binom{N}{k}$ subspaces from an orthonormal basis V of dimension N, there are $\binom{k}{l}\binom{N-k}{l}$ sets such that $|\mathcal{W}_{j\backslash i}| = l$; thus $T_j(l) = \binom{k}{l}\binom{N-k}{l}$. In that particular case, $T_j(l)$ is the same for all $j = 0, 1, \cdots, T-1$. To further upper bound the probability of error of the ML detector in (13) with an arbitrary union of subspaces model, we let $T_0(l) = \max_{j=0,1,\cdots,T-1} T_j(l)$. Then (13) is upper bounded by
$$P_e \leq \sum_{l=1}^{k} T_0(l)\left[Q\!\left(\frac{1}{2}(1-2c_0)\sqrt{(M-k)\alpha_{\min,l}^2}\right) + \Psi\!\left(l,\ (M-k)\alpha_{\min,l}^2\right)\right]. \qquad (14)$$
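As a concrete illustration of (14), the sketch below evaluates the bound for the standard sparsity example just mentioned, with $T_0(l) = \binom{k}{l}\binom{N-k}{l}$. The parameter values, and the choice $\alpha_{\min,l}^2 = l\cdot\mathrm{CSNR}$ as a stand-in for the minimum projected SNR, are illustrative assumptions only.

    import numpy as np
    from scipy.special import erfc, gammaln, kv, comb

    def q_func(x):
        return 0.5 * erfc(x / np.sqrt(2.0))

    def psi(l, x, c0=0.25):
        # Psi(l, x) = sqrt(2)/(2^l Gamma(l/2)) * (c0*x)^{(l-1)/2} * K_{(l-1)/2}(c0*x/2)
        coef = np.sqrt(2.0) * np.exp(-l * np.log(2.0) - gammaln(l / 2.0))
        return coef * (c0 * x) ** ((l - 1) / 2.0) * kv((l - 1) / 2.0, c0 * x / 2.0)

    def pe_bound(M, N, k, csnr, c0=0.25):
        # Bound (14) with T_0(l) = C(k,l) C(N-k,l) and alpha^2_{min,l} = l*csnr (illustrative)
        total = 0.0
        for l in range(1, k + 1):
            x = (M - k) * l * csnr
            T0 = comb(k, l) * comb(N - k, l)
            total += T0 * (q_func(0.5 * (1 - 2 * c0) * np.sqrt(x)) + psi(l, x))
        return total

    for M in (15, 20, 30, 40):
        print(M, pe_bound(M, N=50, k=10, csnr=2.0))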
Theorem 2. Let $T_0(l)$ and $\alpha_{\min,l}^2$ be as defined in Subsection III.B. Suppose that the sampling is performed via any linear sampling operator A. Then $\lim_{(M-k)\to\infty} P_e = 0$ if the following condition is satisfied:
$$M > k + \max\{M_1, M_2\}$$
where
$$M_1 = \max_{l=1,\cdots,k}\left\{ f_1(l) = \frac{8}{(1-2c_0)^2\,\alpha_{\min,l}^2}\left(\log(T_0(l)) + \log(1/2)\right)\right\},$$
$$M_2 = \max_{l=1,\cdots,k}\left\{ f_2(l) = \frac{2(k/2+r_0-1)}{r_0\, c_0\, \alpha_{\min,l}^2}\left(\log(T_0(l)) + \log\frac{2b_0}{\sqrt{\pi}}\right)\right\},$$
with $0 < c_0 < 1/2$, $b_0 = \frac{\sqrt{2\pi}}{4}$, and $r_0 > 0$.

Proof: See Appendix B.
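The condition in Theorem 2 is straightforward to evaluate numerically once $T_0(l)$ and $\alpha_{\min,l}^2$ are known; the sketch below does so for arbitrary placeholder values (none of the numbers come from the paper).

    import numpy as np

    def theorem2_sample_requirement(k, T0, alpha2_min, c0=0.25, r0=1.0):
        # T0[l-1] = T_0(l) and alpha2_min[l-1] = alpha_{min,l}^2 for l = 1, ..., k
        b0 = np.sqrt(2 * np.pi) / 4.0
        f1 = 8.0 / ((1 - 2 * c0) ** 2 * alpha2_min) * (np.log(T0) + np.log(0.5))
        f2 = (2.0 * (k / 2.0 + r0 - 1) / (r0 * c0 * alpha2_min)
              * (np.log(T0) + np.log(2 * b0 / np.sqrt(np.pi))))
        return k + max(f1.max(), f2.max())   # M must exceed this value

    # Placeholder example: k = 4, T_0(l) growing with l, unit minimum SNR terms
    print(theorem2_sample_requirement(4, np.array([10.0, 45.0, 120.0, 210.0]), np.ones(4)))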
Let $l_i \in \{1, \cdots, k\}$ be the value of l which maximizes $f_i(l)$ as defined in Theorem 2, for i = 1, 2. For $M_2$, it can be verified that we can find constants $c_0$ and $r_0$ in the defined regimes such that $\frac{2(k/2+r_0-1)}{r_0 c_0} > \frac{8}{(1-2c_0)^2}$ if k is fairly small. Then the dominant factor of $M_1$ and $M_2$ can be written in the form $\frac{c_1}{\bar{\alpha}_{\min}^2}\log(\bar{T}_0)$, where $\bar{\alpha}_{\min}^2$ and $\bar{T}_0$ are the corresponding values of $\alpha_{\min,l}^2$ and $T_0(l)$ when $l = l_0$ for $l_0 \in \{l_1, l_2\}$, and $c_1$ is an appropriate constant. Since most scenarios of interest are those where k is sufficiently small, the minimum number of samples required for reliable subspace detection is on the order of $k + \frac{c_1}{\bar{\alpha}_{\min}^2}\log(\bar{T}_0)$. It is further noted that $T_0(l) \leq T$ for all l, and thus $\bar{T}_0 \leq T$, where T is the total number of subspaces in the union (1).

In the following, we compare the results in Theorem 2 with some existing results for the general union of subspaces model (1). It has been shown in [34] that the following condition on the sampling matrix guarantees reliable recovery of the sparse signal based on the observation model (3):

Theorem 3 ([34]). For any given t > 0, if
$$M > \frac{2}{c\delta}\left(\log(2T) + k\log\frac{12}{\delta} + t\right) \qquad (15)$$
then the matrix B satisfies the restricted isometry property (RIP) with restricted isometry constant δ (for a formal definition of the RIP, readers may refer to [34]).

From the right hand side of (15), it can be seen that the dominant factor is on the order of $\eta_1 k + \eta_2 \log(T)$ for some constants $\eta_1$ and $\eta_2$, and thus, with that many samples, the signal x can be recovered using a practical algorithm.
However, in our work, which specifically focuses on subspace recovery (not signal recovery) in the presence of noise, we showed that the minimum number of measurements required for reliable subspace recovery with the ML detector scales as $k + \frac{c_1}{f(SNR)}\log(\bar{T}_0)$, where $\bar{T}_0 \leq T$ and f(SNR) is a monotonically increasing function of SNR. If the subspaces in the union are such that $\bar{T}_0 \ll T$ and we are operating in a finite SNR region, the minimum number of samples required for asymptotic subspace detection is much less than that predicted in [34] for signal recovery.

When the sampling is performed via random projections and the subspaces have the block structure considered in Section IV, Lemma IV.2 states that asymptotic reliable block sparsity pattern recovery is guaranteed if
$$M > k + \max\{\bar{M}_1, \bar{M}_2\}$$
where
$$\bar{M}_1 = \frac{16}{(1-2c_0)^2\, BSNR_{\min}}\left(\log(L-k_0) + \log\frac{\sqrt{e}}{2}\right) \qquad (22)$$
and
$$\bar{M}_2 = \frac{4(k_0/2+r_0-1)}{c_0\, r_0\, BSNR_{\min}}\left(\log(L-k_0) + \log\frac{2b_0 e^2}{\sqrt{\pi}}\right) \qquad (23)$$
with $0 < c_0 < \frac{1}{2}$, $r_0 > 0$, and $b_0 = \frac{\sqrt{2\pi}}{4}$ constants.

Proof: The result follows from Theorem 2, using the relations $\binom{k_0}{l} \leq \binom{L-k_0}{l}$ for $k_0 \leq L/2$, and $\log\binom{L-k_0}{l} \leq l\log\frac{e(L-k_0)}{l}$.

From Lemma IV.2, the minimum number of random samples required for reliable block sparsity pattern recovery asymptotically is of the form $O\!\left(k + \frac{\hat{c}_1}{BSNR_{\min}}\log(L-k_0)\right)$ for some constant $\hat{c}_1$, in the case where $k_0$ is sufficiently small.
Remark 1. When $BSNR_{\min} \to \infty$, M > k measurements are sufficient for asymptotic reliable block sparsity pattern detection with the ML detector.
IV.C Discussion

IV.C.1 Revisiting the standard sparsity model: In the standard sparsity model considered widely in the CS literature, the subspaces in the union (1) are assumed to be k-dimensional subspaces of an orthonormal basis. With the notation used in Sections IV.A and IV.B, we can represent the standard sparsity model as having $k = k_0 d$ nonzeros in the sparse signal c, with V the N × N basis matrix and N = Ld in (19). To enable a fair comparison of the performance of the ML detector in the presence of noise under the standard sparsity model and the block sparsity model, we introduce further notation. Define the minimum component SNR,
$$CSNR_{\min} = \min_{m\in\mathcal{U},\ i=0,\cdots,d-1} \frac{\|c_m(i)\|_2^2}{\sigma_w^2}$$
so that $BSNR_{\min} \geq d\, CSNR_{\min}$. Then, when the sampling is performed via random projections, the probability of error of the ML detector with the standard sparsity model is upper bounded by
$$P_e \leq \sum_{l=1}^{k} \binom{k}{l}\binom{N-k}{l}\left[Q\!\left(\frac{1}{2}(1-2c_0)\sqrt{(M-k)\,l\, CSNR_{\min}}\right) + \Psi\left(l,\ CSNR_{\min}\right)\right] \qquad (24)$$
where N = Ld, $k = k_0 d$, and $\Psi(l, CSNR_{\min})$ is as defined in Corollary IV.1. With these notations, the probability of error of the ML detector with the block sparsity model (21) can be rewritten as
$$P_e \leq \sum_{l=1}^{k_0} \binom{k_0}{l}\binom{L-k_0}{l}\left[Q\!\left(\frac{1}{2}(1-2c_0)\sqrt{(M-k)\,l\, d\, CSNR_{\min}}\right) + \Psi\left(l,\ d\, CSNR_{\min}\right)\right]. \qquad (25)$$
Based on (24) and (25), it can be shown that the dominant part of the required number of random samples for reliable subspace detection asymptotically in the presence of noise can be expressed in the form $O\!\left(k + \frac{1}{d}\frac{\hat{c}_1}{CSNR_{\min}}\log(L-k_0)\right)$ with the block sparsity model and $O\!\left(k + \frac{\hat{c}_2}{CSNR_{\min}}\log(N-k)\right)$ with the standard sparsity model, where $\hat{c}_1$ and $\hat{c}_2$ are positive constants. Thus, the required number of random samples (in terms of M − k) for reliable subspace detection with random projections is reduced by approximately a factor of d with the block sparsity model compared to the standard sparsity model, where k is the total number of nonzero coefficients of the block/standard sparse signal. Further, it is noted that the above analysis is for the worst case, i.e., the upper bounds on the probability of error are obtained considering the minimum block/component SNR. The actual number of measurements required for reliable subspace detection can be less than that predicted in Lemma IV.2.

IV.C.2 Existing results with the standard sparsity model: In the most closely related existing work on deriving sufficient conditions for the ML detector to succeed in the presence of noise with the standard sparsity model ([23]), the results are derived based on the following bound on the probability of error:
$$P_e \leq 4\sum_{l=1}^{k}\binom{k}{l}\binom{N-k}{l}\exp\!\left(-\frac{(M-k)\,l\, CSNR_{\min}}{64(l\, CSNR_{\min} + 8)}\right). \qquad (26)$$
When $CSNR_{\min} \to \infty$, it can easily be seen that this upper bound is bounded away from zero (i.e., it is bounded by $4 e^{-(M-k)/64}\left(\binom{Ld}{k}-1\right) > 0$). Based on the upper bound (26), it was shown in [23] that
$$M > k + (c_1 + 2048)\max\left\{\tilde{M}_1 = k\log\frac{N-k}{k},\ \tilde{M}_2 = \frac{\log(N-k)}{CSNR_{\min}}\right\} \qquad (27)$$
measurements are required for asymptotic reliable sparsity pattern recovery, where $c_1$ is a constant (different from the one used earlier in the paper). With this, when the minimum component SNR $CSNR_{\min} \to \infty$, the ML detector requires $k + (c_1+2048)k\log((N-k)/k)$ measurements for asymptotic reliable recovery, which is much larger than k. However, as shown in [25], [42], when the measurement noise power is negligible (or in the noiseless case), the exhaustive search detector is capable of recovering the sparsity pattern with M = k + 1 measurements with high probability. Thus, the limits predicted by the existing results in the literature for sparsity pattern recovery in terms of the minimum number of measurements show a gap between those limits and what is actually required. On the other hand, our results show that when $CSNR_{\min} \to \infty$, the upper bound on the probability of error in (24) vanishes with the standard sparsity model when M > k. More specifically, when $CSNR_{\min} \to \infty$, our results show that O(k) measurements are sufficient for reliable asymptotic sparsity pattern recovery with the ML detector, which is intuitive. Further, at finite $CSNR_{\min}$, when $\tilde{M}_2$ dominates $\tilde{M}_1$ in (27), the lower bound on the minimum number of samples required for asymptotic reliable sparsity pattern recovery obtained in [23] has the same scaling with respect to L, k, d, and $CSNR_{\min}$ as that obtained in this paper with the standard sparsity model.

IV.C.3 Comparison with existing results for block sparsity pattern recovery: The problem of stable recovery of block sparse signals is discussed in [33], [35], [39]. When the samples are acquired via random projections (the elements of A are Gaussian), with the notation used in Section IV.B, the minimum number of samples required for the sampling matrix to satisfy the block RIP with high probability is given by (from Theorem 3 and [35])
$$M \geq \frac{36}{7\delta}\left(\log\!\left(2\binom{L}{k_0}\right) + k\log\frac{12}{\delta} + t\right) \qquad (28)$$
for some t > 0, where 0 < δ < 1 is the restricted isometry constant. This is roughly on the order of $\eta_3 k + \eta_4 k_0\log(L/k_0)$ for some positive constants $\eta_3$ and $\eta_4$. Thus, block sparse signals can be reliably recovered using computationally tractable algorithms (e.g., extensions of BP, i.e., mixed $l_2/l_1$ norm recovery algorithms) with $\eta_3 k + \eta_4 k_0\log(L/k_0)$ measurements when there is no measurement noise. In the presence of noise, the BP-based algorithm developed in [33] is shown to be robust: it tolerates noise in a way that ensures that the norm of the recovery error is bounded by the noise level. With the analysis presented in Section IV.B for block sparsity pattern recovery with the ML detector in the presence of measurement noise, roughly on the order of $k + (\hat{c}_1/BSNR_{\min})\log(L-k_0)$ measurements are required (when $k_0$ is fairly small) for reliable block sparsity pattern recovery, where $\hat{c}_1$ is a positive constant. Here, the second term is significant at finite $BSNR_{\min}$, while it vanishes when $BSNR_{\min} \to \infty$. At finite $BSNR_{\min}$, when $k_0$ is sublinear in L, it can be shown that $k_0\log(L/k_0) \gg \log(L-k_0)$. Thus, in that regime of $k_0$, the scaling in (28) is larger than what is required by the optimal ML detector derived in this paper at finite $BSNR_{\min}$. The exact difference between them depends on the value of $BSNR_{\min}$ and the relevant constants.
V. NUMERICAL RESULTS

Several computationally tractable algorithms for sparsity pattern recovery with standard sparsity have been derived and discussed quite extensively in the literature. Extensions of such algorithms to model based or structured CS have also been considered in several recent works. For example, extensions of CoSaMP and iterative hard thresholding algorithms for model based CS were considered in [31]. Extensions of the OMP algorithm for block sparsity pattern recovery (B-OMP) were considered in [35], [43], while [33], [44], [45] considered the Group Lasso algorithm for block sparse signal recovery. Our goal in this section is to validate the tightness of the derived upper bounds on the probability of error of the ML detector for subspace recovery with the union of subspaces model, and to provide numerical results illustrating the performance gap when employing practical algorithms for subspace recovery. It is noted that simulating the ML algorithm is difficult due to its high computational complexity in high dimensions.

For the structured union of subspaces model considered in Section IV.A, the problem reduces to detecting the block sparsity pattern of a block sparse signal. The performance of the ML algorithm is compared to Block-OMP as proposed in [35], which is given in Algorithm 1, where the set $\hat{\mathcal{U}}$ contains the estimated indices of the nonzero blocks of the block sparse signal. The results in both Figures 1 and 2 are based on the special structure considered in (16) for subspaces leading to block sparsity, and the sampling operator is assumed to be a random matrix whose elements are drawn from a Gaussian ensemble with mean zero and variance 1. Further, we let the N × N matrix V be the standard canonical basis. In Fig. 1, the exact probability of error of the ML detector (obtained via simulation) and the upper bound on the probability of error derived in (21) are shown vs. M/N. In the block sparsity model, we let N = 50, d = 2, L = 25, $BSNR_{\min}$ = 13 dB, and the three different plots correspond to $k_0$ = 3, 4, 5. The exact probability of error of the ML detector is obtained via Monte Carlo simulations with $10^5$ runs. In the upper bound (21), we let $c_0 = 1/4$.
Algorithm 1 Block-OMP (B-OMP) for sparsity pattern detection
1) Initialize t = 1, $\hat{\mathcal{U}}(0) = \emptyset$, residual vector $r_0 = y$
2) Find the index λ(t) such that $\lambda(t) = \arg\max_{i=1,\cdots,N} \|B_i^{T} r_{t-1}\|_2$
3) Set $\hat{\mathcal{U}}(t) = \hat{\mathcal{U}}(t-1) \cup \{\lambda(t)\}$
4) Compute the projection operator $P(t) = B(\hat{\mathcal{U}}(t))\left(B(\hat{\mathcal{U}}(t))^{T} B(\hat{\mathcal{U}}(t))\right)^{-1} B(\hat{\mathcal{U}}(t))^{T}$ and update the residual vector $r_t = (I - P(t))\,y$ (note: $B(\hat{\mathcal{U}}(t))$ denotes the submatrix of B whose columns are taken from B corresponding to the indices in $\hat{\mathcal{U}}(t)$)
5) Increment t = t + 1 and go to step 2 if $t \leq k$; otherwise, stop
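A compact Python version of this procedure is sketched below. It follows the steps above, assuming that B is partitioned into L contiguous blocks of d columns each and that the number of iterations equals the number of nonzero blocks; both are assumptions of the illustration.

    import numpy as np

    def block_omp(B, y, L, d, n_blocks):
        # B: M x (L*d) matrix with columns grouped into L blocks of width d.
        # Returns the indices of the n_blocks selected (estimated nonzero) blocks.
        support = []
        r = y.copy()
        for _ in range(n_blocks):
            # Step 2: pick the block whose columns best match the current residual
            scores = [np.linalg.norm(B[:, i * d:(i + 1) * d].T @ r) for i in range(L)]
            support.append(int(np.argmax(scores)))                      # Step 3
            cols = np.concatenate([np.arange(i * d, (i + 1) * d) for i in support])
            B_U = B[:, cols]
            P = B_U @ np.linalg.pinv(B_U)                                # Step 4: projector onto span(B_U)
            r = y - P @ y                                                # residual update
        return sorted(support)

With B = AV as in Section II and n_blocks = k_0, comparing the returned support against the true block pattern over many noise realizations yields an empirical probability of error of the kind reported in Figures 1 and 2.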
Fig. 1. Exact probability of error and the derived upper bound on the probability of error of the ML detector for block sparsity pattern recovery; N = 50, L = 25, d = 2, and thus k = 10, $BSNR_{\min}$ = 13 dB. (Plot: probability of error of the ML detector vs. M/N; marked lines show the exact probability of error and plain lines the upper bound, for $k_0$ = 3, 4, 5.)
It can be seen from Fig. 1 that the derived upper bound on the probability of error is a tight bound on the exact probability of error, especially as M/N increases. In Figure 2, the performance of block sparsity pattern recovery with the ML and B-OMP algorithms is shown as $BSNR_{\min}$ varies. In Figure 2, we let $k_0$ = 5, L = 25, d = 2 and N = 50. For B-OMP, $10^4$ runs are performed for a given projection matrix and the results are averaged over 100 such matrices. In Fig. 2, the ratio between the minimum and maximum block SNR in both cases considered is set at 1.825. As observed in Fig. 1, from Fig. 2 it can be seen that the derived upper bound on the probability of error of the ML detector is fairly close to the exact probability of error obtained via Monte Carlo simulations, especially as $BSNR_{\min}$ increases. Further, for a given finite $BSNR_{\min}$, there is a considerable performance gap between B-OMP and the ML detector. This gap is the price paid for the computational efficiency of B-OMP relative to the computationally expensive ML detector.
Fig. 2. Performance of the ML detector and the B-OMP algorithm for block sparsity pattern recovery; L = 25, $k_0$ = 5, d = 2, and thus k = 10, N = 50. (Plot: probability of error vs. M/N for the exact ML probability of error, the ML upper bound, and Block-OMP, at $BSNR_{\min}$ = 11 dB and 13 dB.)
VI. CONCLUSION

In this paper, we investigated the problem of subspace detection based on reduced dimensional samples when the signal of interest is assumed to lie in a general union of subspaces model. With a given sampling operator, we derived the performance of the optimal ML detector for subspace detection in the presence of noise in terms of the probability of error. We further obtained conditions on the number of samples that guarantee asymptotic reliable subspace detection. We extended the analysis to a special case of the union of subspaces model which reduces to block sparsity. When the samples are obtained via random projections, sufficient conditions for reliable block sparsity pattern recovery with the ML detector were derived. The performance gain, in terms of the minimum number of samples required for asymptotic subspace detection, of the block sparse model over the standard sparsity model was quantified. Our results further strengthen existing results for reliable sparsity pattern recovery with the standard sparsity model used in the CS framework with random projections in the presence of noise. More specifically, our sufficient conditions for reliable asymptotic subspace detection are derived based on a tighter bound on the probability of error of the ML detector than the existing results in the literature for the standard sparsity model. We further discussed and illustrated numerically the performance gap between the ML detector and computationally tractable algorithms (e.g., B-OMP) used for subspace detection with the structured union of subspaces model. As future work, we will investigate the problem of subspace detection with more structured models (in addition to the one considered in this paper) for the subspaces in the union. Further, we will extend the analysis from a single node system to multiple node systems in distributed networks.
APPENDIX A
Proof of Lemma III.1

To prove Lemma III.1, we use an argument similar to that in [23], with certain differences as noted in the following. As shown in [23], we may write
$$\Delta(y) = \|P_i^{\perp} y\|_2^2 - \|P_i^{\perp} w\|_2^2 + \|P_i^{\perp} w\|_2^2 - \|P_j^{\perp} y\|_2^2.$$
For any given δ > 0, define the events
$$h_1(\delta) = \left\{ \left|\frac{\|P_j^{\perp} y\|_2^2 - \|P_i^{\perp} w\|_2^2}{\sigma_w^2}\right| \geq \delta \right\} \qquad (29)$$
and
$$h_2(\delta) = \left\{ \frac{\|P_i^{\perp} y\|_2^2 - \|P_i^{\perp} w\|_2^2}{\sigma_w^2} \leq 2\delta \right\}. \qquad (30)$$
Then ∆(y) < 0 implies that at least one of the events in (29) and (30) is true. Based on the union bound, we can write
$$\Pr(\Delta(y) < 0) \leq \Pr(h_1(\delta)) + \Pr(h_2(\delta)).$$
With the standard sparsity model and assuming that the sampling is performed via random projections, upper bounds on the probabilities $\Pr(h_1(\delta))$ and $\Pr(h_2(\delta))$ are derived in [23]. In contrast, in the following, we derive the exact value of $\Pr(h_2(\delta))$ and a tighter bound on $\Pr(h_1(\delta))$, assuming that the sampling operator A is known. Thus, even for the standard sparsity model, the results presented in this paper tighten those derived in [23].

We first evaluate $\Pr(h_1(\delta))$. Let $\Delta_1(y) = \frac{1}{\sigma_w^2}(\|P_j^{\perp} y\|_2^2 - \|P_i^{\perp} w\|_2^2)$. Assuming the true subspace is $\mathcal{S}_j$, $\Delta_1(y)$ reduces to $\Delta_1(y) = \frac{1}{\sigma_w^2}(\|P_j^{\perp} w\|_2^2 - \|P_i^{\perp} w\|_2^2)$. As shown in [23], the random variable $\Delta_1(y)$ can be represented as $\Delta_1(y) = x_1 - x_2$, where $x_1$ and $x_2$ are independent and $x_1, x_2 \sim \mathcal{X}_l^2$, with l the cardinality of the set $\mathcal{W}_{j\backslash i}$ as defined before. With these notations, we can write $\Pr(h_1(\delta)) = \Pr(|x_1 - x_2| \geq \delta) = \Pr((x_1 - x_2) \geq \delta) + \Pr((x_1 - x_2) < -\delta)$. The pdf of the random variable $w = x_1 - x_2$ is symmetric around zero and thus $\Pr(h_1(\delta)) = 2\Pr((x_1 - x_2) \geq \delta)$.
Proposition VI.3. When $x_1 \sim \mathcal{X}_l^2$ and $x_2 \sim \mathcal{X}_l^2$, the random variable $w = x_1 - x_2$ has the following pdf:
$$f_w(w) = \begin{cases} f_w^{+}(w) = \frac{w^{l/2-1/2}}{\sqrt{\pi}\,2^{l}\,\Gamma(l/2)}\, K_{1/2-l/2}\!\left(\frac{w}{2}\right); & \text{if } w \geq 0 \\ f_w^{-}(w) = \frac{(-w)^{l/2-1/2}}{\sqrt{\pi}\,2^{l}\,\Gamma(l/2)}\, K_{1/2-l/2}\!\left(\frac{-w}{2}\right); & \text{if } w < 0 \end{cases} \qquad (31)$$
where $K_{\nu}(x)$ is the modified Bessel function.
Proof: Since $x_1$ and $x_2$ are independent, the pdf of $w = x_1 - x_2$ is given by [46]
$$f_w(w) = \begin{cases} \int_0^{\infty} f_{x_1}(w + x_2) f_{x_2}(x_2)\, dx_2; & \text{if } w \geq 0 \\ \int_{-w}^{\infty} f_{x_1}(w + x_2) f_{x_2}(x_2)\, dx_2; & \text{if } w < 0. \end{cases}$$
First consider the case where w > 0. Then
$$f_w^{+}(w) = \int_0^{\infty} \frac{(w+x_2)^{l/2-1} e^{-(w+x_2)/2}}{2^{l/2}\Gamma(l/2)}\,\frac{x_2^{l/2-1} e^{-x_2/2}}{2^{l/2}\Gamma(l/2)}\, dx_2 = \frac{e^{-w/2}}{2^{l}(\Gamma(l/2))^2}\int_0^{\infty} x_2^{l/2-1}(w+x_2)^{l/2-1} e^{-x_2}\, dx_2$$
$$= \frac{e^{-w/2}}{2^{l}(\Gamma(l/2))^2}\,\frac{1}{\sqrt{\pi}}\, w^{l/2-1/2}\, e^{w/2}\,\Gamma(l/2)\, K_{1/2-l/2}(w/2) = \frac{w^{l/2-1/2}\, K_{1/2-l/2}(w/2)}{\sqrt{\pi}\, 2^{l}\,\Gamma(l/2)}$$
where $K_{\nu}(x)$ is the modified Bessel function and the third equality is obtained using the integral result $\int_0^{\infty} x^{\nu-1}(x+\beta)^{\nu-1} e^{-\mu x}\, dx = \frac{1}{\sqrt{\pi}}\left(\frac{\beta}{\mu}\right)^{\nu-1/2} e^{\beta\mu/2}\,\Gamma(\nu)\, K_{1/2-\nu}\!\left(\frac{\beta\mu}{2}\right)$ for µ, ν > 0 in [47, p. 348].

When w < 0, we have
$$f_w^{-}(w) = \frac{e^{-w/2}}{2^{l}(\Gamma(l/2))^2}\int_{-w}^{\infty} x_2^{l/2-1}(w+x_2)^{l/2-1} e^{-x_2}\, dx_2. \qquad (32)$$
Letting z = −w, where z > 0, (32) can be rewritten as
$$f_w^{-}(w) = \frac{e^{z/2}}{2^{l}(\Gamma(l/2))^2}\int_{z}^{\infty} x_2^{l/2-1}(x_2 - z)^{l/2-1} e^{-x_2}\, dx_2. \qquad (33)$$
Using the integral result $\int_{u}^{\infty} x^{\nu-1}(x-u)^{\nu-1} e^{-\mu x}\, dx = \frac{1}{\sqrt{\pi}}\left(\frac{u}{\mu}\right)^{\nu-1/2} e^{-\mu u/2}\,\Gamma(\nu)\, K_{\nu-1/2}\!\left(\frac{\mu u}{2}\right)$ in [47, p. 347] and the relation $K_{\nu}(x) = K_{-\nu}(x)$, we get $f_w^{-}(w)$ as in (31), completing the proof.
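The closed form (31) is easy to check numerically; the sketch below compares it against a Monte Carlo estimate of the density of the difference of two independent chi-squared variables (scipy-based, for illustration only).

    import numpy as np
    from scipy.special import gamma, kv

    def f_w(w, l):
        # pdf of w = x1 - x2 with x1, x2 ~ chi^2_l, per Eq. (31); note K_nu(x) = K_{-nu}(x)
        a = np.abs(w)
        return a ** (l / 2.0 - 0.5) * kv(0.5 - l / 2.0, a / 2.0) / (np.sqrt(np.pi) * 2.0 ** l * gamma(l / 2.0))

    l = 4
    rng = np.random.default_rng(1)
    samples = rng.chisquare(l, 200000) - rng.chisquare(l, 200000)
    grid = np.array([-6.0, -2.0, 0.5, 2.0, 6.0])
    empirical = np.array([np.mean(np.abs(samples - g) < 0.1) / 0.2 for g in grid])
    print(np.round(f_w(grid, l), 4))
    print(np.round(empirical, 4))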
Proposition VI.4. For δ > 0, the probability Pr(w > δ) is upper bounded by
$$\Pr(w > \delta) \leq \frac{\sqrt{2}}{2^{l+1}\Gamma(l/2)}\,\delta^{l/2-1/2}\, K_{l/2-1/2}(\delta/2)$$
where $K_{\nu}(x)$ is the modified Bessel function and Γ(·) is the Gamma function.

Proof: Based on (31), we have
$$\Pr(w > \delta) = \int_{\delta}^{\infty} f_w^{+}(w)\, dw = \int_{\delta}^{\infty} \frac{w^{l/2-1/2}\, K_{1/2-l/2}(w/2)}{\sqrt{\pi}\, 2^{l}\,\Gamma(l/2)}\, dw. \qquad (34)$$
Using the equivalent integral representation $K_{\nu}(az) = \frac{z^{\nu}}{2}\int_0^{\infty} e^{-\frac{a}{2}\left(t + \frac{z^2}{t}\right)}\, t^{-\nu-1}\, dt$ [47, p. 917], we can write the integral in (34) as
$$\Pr(w > \delta) = \frac{1}{\sqrt{\pi}\, 2^{l+1}\,\Gamma(l/2)}\int_{\delta}^{\infty}\int_0^{\infty} e^{-\frac{1}{4}\left(t + \frac{w^2}{t}\right)}\, t^{l/2-3/2}\, dt\, dw. \qquad (35)$$
Since $\int_{\delta}^{\infty} e^{-\frac{w^2}{4t}}\, dw = \sqrt{2\pi}\, Q\!\left(\frac{\delta}{\sqrt{2t}}\right)$, (35) reduces to
$$\Pr(w > \delta) = \frac{\sqrt{2}}{2^{l+1}\Gamma(l/2)}\int_0^{\infty} e^{-t/4}\, t^{l/2-3/2}\, Q\!\left(\frac{\delta}{\sqrt{2t}}\right) dt$$
$$\leq \frac{\sqrt{2}}{2^{l+2}\Gamma(l/2)}\int_0^{\infty} t^{l/2-3/2}\, e^{-t/4 - \frac{\delta^2}{4t}}\, dt \qquad (36)$$
$$= \frac{\sqrt{2}}{2^{l+1}\Gamma(l/2)}\,\delta^{l/2-1/2}\, K_{l/2-1/2}(\delta/2) \qquad (37)$$
where we used the inequality $Q(x) \leq \frac{1}{2}e^{-x^2/2}$ for x > 0, and the relation $\int_0^{\infty} x^{\nu-1} e^{-\beta/x - \gamma x}\, dx = 2\left(\frac{\beta}{\gamma}\right)^{\nu/2} K_{\nu}(2\sqrt{\beta\gamma})$ for β > 0 and γ > 0 [47, p. 368], while obtaining (36) and (37), respectively, which completes the proof.

Then, we have
$$\Pr(h_1(\delta)) = \frac{\sqrt{2}}{2^{l}\Gamma(l/2)}\,\delta^{l/2-1/2}\, K_{l/2-1/2}(\delta/2). \qquad (38)$$
1 ⊥ 2 2 (||Pi y||2 σw
(38)
2 − ||P⊥ i w||2 ). Then we have,
1 2 T ⊥ (||P⊥ i Bj\i cj\i ||2 + 2w Pi Bj\i cj\i ). 2 σw
2 I ), ∆ (y) is a Gaussian random variable with pdf, Since w ∼ N (0, σw M 2 1 ⊥ 2 4 ⊥ 2 ∆2 (y) ∼ N ||Pi Bj\i cj\i ||2 , 2 ||Pi Bj\i cj\i ||2 . 2 σw σw
Thus, P r(h2 (δ)) = P r (∆2 (y) ≤ 2δ) ! Bj\i cj\i ||22 2δ − σ12 ||P⊥ i w = 1−Q 2 ⊥ B c || ||P j\i j\i 2 i σw ! 2δ − λj\i p . = 1−Q 2 λj\i
Since it is desired to control δ such that P r(h2 (δ)) ≤ 1/2, we select δ∗ = c0 λj\i where c0 < 12 . With this choice P r(h2 (δ)) reduces to, P r(h2 (δ)) = Q
q 1 λj\i (1 − 2c0 ) 2
where we used the relation 1 − Q(−x) = Q(x) for x > 0, while P r(h1 (δ)) reduces to, √ 2 P r(h1 (δ)) = l (c0 λj\i )l/2−1/2 Kl/2−1/2 (c0 λj\i /2). 2 Γ(l/2)
(39)
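The expression for $\Pr(h_2(\delta^*))$ can be verified directly by simulation, since $\Delta_2(y)$ is Gaussian with the mean and variance given above; the values of $\lambda_{j\backslash i}$ and $c_0$ below are illustrative only.

    import numpy as np
    from scipy.special import erfc

    def q_func(x):
        return 0.5 * erfc(x / np.sqrt(2.0))

    lam, c0 = 9.0, 0.25
    rng = np.random.default_rng(2)
    # Delta_2(y) ~ N(lambda, 4*lambda); event h_2: Delta_2(y) <= 2*delta* with delta* = c0*lambda
    delta2 = lam + 2.0 * np.sqrt(lam) * rng.standard_normal(1_000_000)
    print(np.mean(delta2 <= 2 * c0 * lam))             # empirical probability, ~0.2266
    print(q_func(0.5 * (1 - 2 * c0) * np.sqrt(lam)))   # Q((1-2c0)*sqrt(lambda)/2) = Q(0.75)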
APPENDIX B
Proof of Theorem 2

To obtain conditions under which the probability of error bound in (14) asymptotically vanishes, we rely on the following corollary.

Corollary VI.2. Let $T_0(l)$ and $\alpha_{\min,l}^2$ be as defined in Subsection III.B. The probability of error of the ML detector in (14) is further upper bounded by
$$P_e \leq \sum_{l=1}^{k} T_0(l)\left[\frac{1}{2} e^{-\frac{1}{8}(1-2c_0)^2 (M-k)\alpha_{\min,l}^2} + \phi_l\right] \qquad (40)$$
where
$$\phi_l = \frac{\sqrt{2\pi}}{4\Gamma(l/2)}\left(\frac{1}{4} c_0 (M-k)\alpha_{\min,l}^2\right)^{l/2-1} e^{-\frac{1}{2} c_0 (M-k)\alpha_{\min,l}^2} \qquad (41)$$
when $(M-k)\alpha_{\min,l}^2 \gg (l/2 - 1/2)$ for all $l = 1, 2, \cdots, k$ and $0 < c_0 < 1/2$.

Proof: Using the Chernoff bound for the Q function, $Q(x) \leq \frac{1}{2}e^{-x^2/2}$, we can upper bound the term $Q\!\left(\frac{1}{2}(1-2c_0)\sqrt{(M-k)\alpha_{\min,l}^2}\right)$ as
$$Q\!\left(\frac{1}{2}(1-2c_0)\sqrt{(M-k)\alpha_{\min,l}^2}\right) \leq \frac{1}{2}\, e^{-\frac{1}{8}(1-2c_0)^2(M-k)\alpha_{\min,l}^2}$$
for $c_0 < \frac{1}{2}$. To obtain (41) we used the relation $K_{\nu}(z) \approx \sqrt{\frac{\pi}{2z}}\, e^{-z}$ when ν ≪ z.

The first term in (40) vanishes asymptotically as $(M-k)\to\infty$ when $M > k + M_1$, where $M_1 = \max_{l=1,\cdots,k}\left\{\frac{8}{(1-2c_0)^2\alpha_{\min,l}^2}\left(\log(T_0(l)) + \log(1/2)\right)\right\}$. Considering the second term in (40), let
$$\Pi_1 = \log T_0(l) + \log\frac{b_0}{\Gamma(l/2)} + (l/2-1)\log\!\left(\frac{1}{4}c_0(M-k)\alpha_{\min,l}^2\right) - \frac{1}{2}c_0(M-k)\alpha_{\min,l}^2 \qquad (42)$$
where $b_0 = \frac{\sqrt{2\pi}}{4}$. When $\frac{1}{4}c_0(M-k)\alpha_{\min,l}^2$ is sufficiently large, we can find $0 < q_0 < \frac{1}{k/2-1}$ such that $(k/2-1)\log\!\left(\frac{1}{4}c_0(M-k)\alpha_{\min,l}^2\right) < q_0\,\frac{1}{2}c_0(M-k)\alpha_{\min,l}^2$. Then (42) is upper bounded by
$$\Pi_1 \leq \max_{l=1,\cdots,k}\left\{\log(T_0(l)) + \log\frac{b_0}{\Gamma(3/2)}\right\} - \frac{1}{2}c_0(M-k)\alpha_{\min,l}^2\left(1 - q_0(k/2-1)\right) = \Pi_2 \qquad (43)$$
where $q_0 = \frac{1}{k/2+r_0-1}$ for some $r_0 > 0$. Thus, (43) can be rewritten as
$$\Pi_2 = \max_{l=1,\cdots,k}\left\{\log(T_0(l)) + \log\frac{2b_0}{\sqrt{\pi}}\right\} - \frac{1}{2}c_0(M-k)\alpha_{\min,l}^2\,\frac{r_0}{r_0+k/2-1} \to -\infty$$
as $(M-k)\to\infty$ when $M > k + M_2$, where $M_2 = \max_{l=1,\cdots,k}\left\{\frac{2(k/2+r_0-1)}{r_0 c_0 \alpha_{\min,l}^2}\left(\log(T_0(l)) + \log\frac{2b_0}{\sqrt{\pi}}\right)\right\}$, $0 < c_0 < 1/2$, $b_0 = \frac{\sqrt{2\pi}}{4}$, and $r_0 > 0$.
REFERENCES
[1] E. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inform. Theory, vol. 52, no. 2, pp. 489–509, Feb. 2006. [2] D. Donoho, “Compressed sensing,” IEEE Trans. Inform. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
[3] E. Candès and T. Tao, “Near-optimal signal recovery from random projections: Universal encoding strategies?” IEEE Trans. Inform. Theory, vol. 52, no. 12, pp. 5406–5425, Dec. 2006. [4] Y. C. Eldar and G. Kutyniok, Compressed Sensing: Theory and Applications.
Cambridge University Press, 2012.
[5] E. Candès and J. Romberg, “Practical signal recovery from random projections,” Proc. SPIE, vol. 5674, pp. 76–86, Jan. 2005. [6] E. Candès, J. Romberg, and T. Tao, “Stable signal recovery from incomplete and inaccurate measurements,” Communications on Pure and Applied Mathematics, vol. 59, no. 8, pp. 1207–1223, Aug. 2006. [7] J. K. Romberg, “Sparse signal recovery via l1 minimization,” in 40th Annual Conf. on Information Sciences and Systems (CISS), Princeton, NJ, Mar. 2006, pp. 213–215. [8] J. N. Laska, M. A. Davenport, and R. G. Baraniuk, “Exact signal recovery from sparsely corrupted measurements through the pursuit of justice,” in Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, California, Nov. 2009, pp. 1556–1560. [9] J. Tropp and A. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans. Inform. Theory, vol. 53, no. 12, pp. 4655–4666, Dec. 2007. [10] M. A. T. Figueiredo, R. D. Nowak, and S. J. Wright, “Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems,” IEEE Journal of Selected Topics in Signal Processing: Special Issue on Convex Optimization Methods for Signal Processing, vol. 1, no. 4, 2007. [11] T. Blumensath and M. E. Davies, “Iterative thresholding for sparse approximations,” J. Fourier Anal. Appl., vol. 14, pp. 629–654, 2008. [12] ——, “Gradient pursuits,” IEEE Trans. Signal Processing, vol. 56, no. 6, pp. 2370–2382, June 2008. [13] D. Needell and R. Vershynin, “Uniform uncertainty principle and signal recovery via regularized orthogonal matching pursuit,” Preprint, 2007. [14] D. Malioutov, M. Cetin, and A. Willsky, “A sparse signal reconstruction perspective for source localization with sensor arrays,” IEEE Trans. Signal Processing, vol. 53, no. 8, pp. 3010–3022, Aug. 2005. [15] V. Cevher, P. Indyk, C. Hegde, and R. G. Baraniuk, “Recovery of clustered sparse signals from compressive measurements,” in Int. Conf. Sampling Theory and Applications (SAMPTA 2009), Marseille, France, May 2009, pp. 18–22. [16] B. K. Natarajan, “Sparse approximate solutions to linear systems,” SIAM J. Computing, vol. 24, no. 2, pp. 227–234, 1995. [17] A. J. Miller, Subset Selection in Regression.
New York, NY: Chapman-Hall, 1990.
[18] E. G. Larsson and Y. Seln, “Linear regression with a sparse parameter vector,” IEEE Trans. Signal Processing, vol. 55, no. 2, pp. 451–460, Feb. 2007. [19] Z. Tian and G. Giannakis, “Compressed sensing for wideband cognitive radios,” in Proc. Acoust., Speech, Signal Processing (ICASSP), Honolulu, HI, Apr. 2007, pp. IV–1357–IV–1360. [20] M. Mishali and Y. C. Eldar, “Wideband spectrum sensing at sub-nyquist rates,” IEEE Signal Processing Magazine, vol. 28, no. 4, pp. 102–135, July 2011. [21] M. Mishali, Y. C. Eldar, O. Dounaevsky, and E. Shoshan, “Xampling: Analog to digital at sub-nyquist rates,” IET Circuits, Devices and Systems, vol. 5, no. 1, pp. 8–20, Jan. 2011. [22] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM J. Sci. Computing, vol. 20, no. 1, pp. 33–61, 1998. [23] M. J. Wainwright, “Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting,” IEEE Trans. Inform. Theory, vol. 55, no. 12, pp. 5728–5741, Dec. 2009. [24] W. Wang, M. J. Wainwright, and K. Ramachandran, “Information-theoretic limits on sparse signal recovery: Dense versus sparse measurement matrices,” IEEE Trans. Inform. Theory, vol. 56, no. 6, pp. 2967–2979, Jun. 2010. [25] A. K. Fletcher, S. Rangan, and V. K. Goyal, “Necessary and sufficient conditions for sparsity pattern recovery,” IEEE Trans. Inform. Theory, vol. 55, no. 12, pp. 5758–5772, Dec. 2009. [26] M. M. Akcakaya and V. Tarokh, “Shannon-theoretic limits on noisy compressive sampling,” IEEE Trans. Inform. Theory, vol. 56, no. 1, pp. 492–504, Jan. 2010.
[27] G. Reeves and M. Gastpar, “Sampling bounds for sparse support recovery in the presence of noise,” in IEEE Int. Symp. on Information Theory (ISIT), Toronto, ON, Jul. 2008, pp. 2187–2191. [28] G. Tang and A. Nehorai, “Performance analysis for sparse support recovery,” IEEE Trans. Inform. Theory, vol. 56, no. 3, pp. 1383–1399, March 2010. [29] V. K. Goyal, A. K. Fletcher, and S. Rangan, “Compressive sampling and lossy compression,” IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 48–56, Mar. 2008. [30] T. Wimalajeewa and P. K. Varshney, “Performance bounds for sparsity pattern recovery with quantized noisy random projections,” IEEE Journal of Selected Topics in Signal Processing, Special Issue on Robust Measures and Tests Using Sparse Data for Detection and Estimation, vol. 6, no. 1, pp. 43 – 57, Feb. 2012. [31] R. G. Baraniuk, V. Cevher, M. Duarte, and C.Hegde, “Model based compressed sensing,” IEEE Trans. Information Theory, vol. 56, no. 4, pp. 1982–2001, Apr. 2010. [32] Y. M. Lu and M. N. Do, “A theory for sampling signals from a union of subspaces,” IEEE Trans. Signal Processing, vol. 56, no. 6, pp. 2334–2345, June 2008. [33] Y. C. Eldar and M. Mishali, “Robust recovery of signals from a structured union of subspaces,” IEEE Trans. Information Theory, vol. 55, no. 11, pp. 5302–5316, Nov. 2009. [34] T. Blumensath and M. E. Davies, “Sampling theorems for signals from the union of finite-dimensional linear subspaces,” IEEE Trans. Inform. Theory, vol. 55, no. 4, pp. 1872–1882, Apr. 2009. [35] Y. C. Eldar, P. Kuppinger, and H. Bolcskei, “Block-sparse signals: Uncertainty relations and efficient recovery,” IEEE Trans. Signal Processing, vol. 58, no. 6, pp. 3042–3054, June 2010. [36] M. Duarte and Y. C. Eldar, “Structured compressed sensing: From theory to applications,” IEEE Trans. Signal Processing, vol. 59, no. 9, pp. 4053–4085, Sep. 2011. [37] Z. Ben-Haim, T. Michaeli, and Y. C. Eldar, “Performance bounds and design criteria for estimating finite rate of innovation signals,” IEEE Trans. Information Theory, vol. 58, no. 8, pp. 4993–5015, Aug. 2012. [38] P. Dragotti, M. Vetterli, and T. Blu, “Sampling moments and reconstructing signals of finite rate of innovation: Shannon meets strang-fix,” IEEE Trans. Signal Processing, vol. 55, no. 5, pp. 1741–1757, May 2007. [39] Z. Ben-Haim and Y. C. Eldar, “Near-oracle performance of greedy block-sparse estimation techniques from noisy measurements,” IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 5, pp. 1032–1047, Sept. 2011. [40] M. Mishali and Y. Eldar, “Blind multi-band signal reconstruction: Compressed sensing for analog signals,” IEEE Trans. Signal Processing, vol. 57, no. 3, pp. 993–1009, Mar. 2009. [41] F. Parvaresh, H. Vikalo, S. Misra, and B. Hassibi, “Recovering sparse signals using sparse measurement matrices in compressed dna microarrays,” IEEE Journal of Selected Topics in Signal Processing, vol. 2, no. 3, pp. 275–285, June 2008. [42] D. Baron, M. B. Wakin, M. F. Duarte, S. Sarvotham, and R. G. Baraniuk, “Distributed compressive sensing,” Rice Univ. Dept. Elect. Comput. Eng. Houston, TX, Tech. Rep. TREE0612, Nov 2006. [43] J. Fang and H. Li, “Block-sparsity pattern recovery from noisy observations,” in Proc. Acoust., Speech, Signal Processing (ICASSP), Mar. 2012, pp. 3321–3324. [44] X. Lv, G. Bi, and C. Wan, “The group lasso for stable recovery of block-sparse signal representations,” IEEE Trans. Signal Processing, vol. 59, no. 4, pp. 1371–1382, Apr. 2011. [45] J. Friedman, T. Hastie, and R. 
Tibshirani, “A note on the group lasso and a sparse group lasso,” [Online]. Available: http://arxiv.org/pdf/1001.0736, preprint, 2010. [46] A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic Processes. McGraw Hill, 4th Edition, 2002. [47] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series and Products. Elsevier Academic Press, 2007.