Joint Sparsity Pattern Recovery with 1-bit Compressive Sensing in Distributed Sensor Networks Swatantra Kafle, Student Member, IEEE, Vipul Gupta, Student Member, IEEE, Bhavya Kailkhura, Member, IEEE, Thakshila Wimalajeewa, Member, IEEE, and Pramod K. Varshney, Fellow, IEEE
Abstract—In this paper, we study the problem of joint sparse support recovery with 1-bit quantized compressive measurements in a distributed sensor network. Multiple nodes in the network are assumed to observe sparse signals having the same but unknown sparse support. Each node quantizes its measurement vector element-wise to 1-bit. First, we consider that all the quantized measurements are available at a central fusion center. We derive performance bounds for sparsity pattern recovery using 1-bit quantized measurements from multiple sensors when the maximum likelihood decoder is employed. We further develop two computationally tractable algorithms for joint sparse support recovery in the centralized setting. One algorithm minimizes a cost function defined as the sum of the likelihood function and the ℓ1,∞ quasi-norm, while the other algorithm extends the binary iterative hard thresholding algorithm to the multiple measurement vector case. Second, we consider a decentralized setting where each node transmits 1-bit measurements to its one-hop neighbors. The basic idea behind the algorithms developed in the decentralized setting is to embed collaboration among nodes and fusion strategies. We show that even with noisy 1-bit compressed measurements, joint support recovery can be carried out accurately in both centralized and decentralized settings. We further show that the performance of the proposed 1-bit CS based algorithms is very close to that of their real valued counterparts except when the signal-to-noise ratio is very small.
Index Terms—Compressive sensing, distributed sensor networks, quantization, common sparsity pattern recovery, binary iterative hard thresholding.

This work was supported in part by ARO grant no. W911NF-14-1-0339. This work was also performed in part under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. S. Kafle, T. Wimalajeewa and P. K. Varshney are with the Department of EECS, Syracuse University, Syracuse, NY, 13244 USA (Email: {skafle, twwewelw, varshney}@syr.edu). V. Gupta is with the Department of EECS, University of California Berkeley, USA (Email: vipul [email protected]). B. Kailkhura is with the Lawrence Livermore National Laboratory, 7000 East Av, Livermore, CA 94550 USA (Email: [email protected]).

I. INTRODUCTION

The problem of support recovery of a sparse signal deals with finding the locations of non-zero elements of the signal [1]–[7]. This problem has been addressed by many authors in the last decade in different contexts. A good amount of work has already been done on support recovery with real valued measurements [8]–[12]. It is, however, important to consider the problem with quantized compressive measurements since, in practice, measurements are quantized before transmission or storage. Further, coarse quantization of measurements is desirable and/or even necessary in resource constrained communication networks. There are some recent works that have
addressed the problem of recovering sparse signals/sparsity patterns based on quantized compressive measurements in different contexts, e.g., calculating performance bounds [13], [14], dictionary learning [15], devising recovery algorithms [16]–[23], and studying the tradeoff between the number of measurements and the number of bits per measurement when the total number of measurement bits is fixed [24]. Use of 1-bit compressive sensing (CS) is particularly attractive because of the low communication bandwidth requirement for data transfer. Most of the work on 1-bit CS to date has focused only on recovery with a single sensor [25]–[28]. The authors in [26] were the first to derive a lower bound on the number of 1-bit compressed measurements required for sparse signal recovery with a single measurement vector. Reliable recovery of a sparse signal based on 1-bit CS is very difficult with a single measurement vector, especially when the signal-to-noise ratio (SNR) is small. On the other hand, simultaneous recovery of multiple sparse signals arises naturally in a number of applications including distributed sensor and cognitive radio networks. Multiple measurement vectors (MMVs) in distributed networks can be exploited to get better recovery results using a centralized fusion center (FC). However, in many scenarios an FC may not be available. Even when available, the FC can become an information bottleneck that may result in degradation of system performance and may even lead to system failure. Thus, it is desirable to consider a distributed network without a centralized FC where each node collaborates with its one-hop neighbors to estimate the joint sparse support set. Some works in the literature [29]–[32] have addressed the problem of recovering the sparsity pattern in a decentralized setting based on real valued compressive measurements. Recently, the authors in [33] have proposed a framework for data gathering in a wireless sensor network using 1-bit CS measurements for real datasets. Some recent works [33]–[36] have looked into the sparse signal reconstruction problem when multiple 1-bit compressed measurement vectors are available at a centralized FC. The work in [36] extends the BIHT algorithm for signal estimation with MMVs. The centralized BIHT algorithm proposed in [37] is similar to [36] but solves the joint sparsity pattern recovery problem. In [38], the authors have proposed a distributed 1-bit CS algorithm in which there is no centralized FC. This algorithm follows an online paradigm and requires a large number of 1-bit compressed measurements to converge. In this paper, we assume that multiple nodes in a distributed network observe signals with the same support but possibly
with different amplitudes. These signals are compressed using low dimensional random projections and quantized to 1-bit element-wise. First, we consider a centralized setting where the FC has access to 1-bit compressive measurement vectors from all the nodes and develop the maximum likelihood (ML) decoder for sparsity pattern recovery. We determine the minimum number of 1-bit compressive measurements that should be obtained per node to perform sparsity pattern recovery with vanishing probability of error via the ML approach. Though the ML approach provides optimal performance, its implementation becomes intractable as the signal dimension and the number of sensors increase. Second, we propose two tractable algorithms for sparsity pattern recovery in a centralized setting. In one algorithm, we formulate an optimization problem which minimizes the ML function and the ℓ1,∞ quasi-norm of a matrix and use the iterative shrinkage-thresholding (IST) algorithm. The other algorithm extends the binary iterative hard thresholding (BIHT) algorithm with MMVs available at the FC. Third, we develop the decentralized variants of the proposed centralized algorithms. Note that, in a decentralized setting, all the nodes in the network collaborate with each other to make the global decision on the joint support set without the use of a centralized FC. All the nodes in the network send their 1-bit quantized measurement vectors to their one-hop neighbors. Both proposed decentralized algorithms strategically embed fusion of measurement information and decisions among nodes; these collaborations are structured into two distinct stages, namely, an information fusion stage and an index fusion stage. We note that all of the proposed algorithms assume prior knowledge of the sparsity index of the signal. If the sparsity index of the signals is not known in advance, it can be estimated using methods proposed in the literature such as [39]–[42]. With the estimated sparsity index, we can implement the proposed algorithms to estimate the joint support of the signal. The performance of these decentralized algorithms is compared with that of the centralized ones. Finally, we compare the performance of the proposed 1-bit CS algorithms with their real valued counterparts. Even with the significant loss of information due to coarse quantization, the performance of the proposed 1-bit CS algorithms is almost as good as that of their real valued counterparts except in the very low SNR regime. The saving in the number of bits required to represent compressed signals while recovering the sparsity pattern is much larger for the proposed algorithms than for the real valued CS algorithms. Thus, the proposed 1-bit CS based algorithms are highly attractive for resource constrained sensor networks. We also compare the proposed algorithms with the algorithm developed in [38], which does not assume the presence of a centralized FC and is the closest match to our work. The algorithm developed in [38] considers 1-bit measurements at each time instant, evaluates the gradient with the new 1-bit measurements and then takes a gradient descent step. The proposed algorithms in this paper consider all the 1-bit measurements while computing the gradient and then take a gradient descent step. In the following, we summarize the main contributions of the paper. •
We derive a lower bound on the minimum number
of compressed measurements required per node for the joint support recovery with 1-bit quantized CS when the MMVs are available at a centralized FC. For the special case where P = 1, the derived lower bound is more general than the one derived in [13]. • We develop two computationally tractable algorithms for sparsity pattern recovery with 1-bit CS in a centralized setting. We extend the developed centralized algorithms to a decentralized setting in which nodes communicate only with their one-hop neighbors. • We perform simulations to illustrate the effectiveness of the proposed decentralized algorithms when compared to the respective centralized algorithms, state-of-the-art 1-bit CS algorithms, and the real valued counterparts of the proposed algorithms. • Through numerical simulations, we show that the proposed centralized and decentralized 1-bit algorithms perform better than the SMV-based algorithm when these algorithms are constrained to use the same number of information bits. We also show that the MMV-based algorithm with 1-bit measurements is better than the SMV-based algorithm with multiple bits (real-valued measurements).

Parts of this work were reported in [43] and [37]. In [43], a centralized algorithm based on IST that minimizes the ML function and the ℓ1,∞ quasi-norm of a matrix with 1-bit CS was proposed. In [37], a decentralized algorithm based on BIHT was proposed with a slightly different observation model than is considered in this paper. Beyond [43] and [37], we have extended the work significantly by (i) providing theoretical analysis on the number of measurements per node with 1-bit CS in the MMV setting, (ii) developing the decentralized counterpart of the centralized algorithm in [43], (iii) providing extensive simulations to provide new insights on the benefits of the use of 1-bit CS algorithms in resource constrained sensor networks, and (iv) illustrating the effectiveness of the theoretical lower bounds on the number of measurements per node derived in Section III by comparing with the empirical number of measurements required by the centralized algorithm for joint support recovery with small probability of error.

Organization: The paper is organized as follows. In Section II, the observation model is defined. The ML decoder for sparse support recovery with 1-bit CS is developed in the centralized setting and performance guarantees are established in Section III. Two different centralized algorithms for joint support recovery with 1-bit CS are developed in Section IV. These centralized algorithms are extended to decentralized settings in Section V. In Section VI, numerical results are presented to show performance comparisons of different proposed algorithms. Section VII concludes the paper.

Notation: In this paper, we use the following notations. Scalars are denoted by lower case letters and symbols, e.g., x and τ. Vectors are written in lower case boldface letters, e.g., x. Upper case letters denote constants, e.g., P. Boldface upper
case letters and symbols are used to represent matrices, e.g., Q, Φ. Calligraphic letters are used to represent sets, e.g., U. We denote the identity matrix of size M as I_M. The transpose of a matrix A is denoted by A^T. Both the absolute value (of a scalar) and the cardinality of a set are denoted by |·|. $\binom{N}{K}$ denotes the number of combinations of K elements chosen from N. ||·||_p is used to denote the ℓ_p norm of a vector, defined by $\|x\|_p = \left(\sum_i |x_i|^p\right)^{1/p}$. ||·||_{p,q} is used to denote the ℓ_{p,q} norm of a matrix X, defined as $\|X\|_{p,q} = \left(\sum_j \left(\sum_i |x_{ij}|^p\right)^{q/p}\right)^{1/q}$. ||X||_F denotes the Frobenius norm of the matrix X, which is the ℓ_{p,q} matrix norm with p = q = 2. Finally, f(n) = Ω(g(n)) and f(n) = O(g(n)) denote f(n) ≥ C_1 g(n) and |f(n)| ≤ C_2 |g(n)| for some positive constants C_1 and C_2 as n → ∞, respectively.

II. OBSERVATION MODEL

We consider a distributed network with multiple nodes that observe sparse signals. Let the number of sensors be P. At a given node, consider the following M × 1 real valued observation vector collected via random projections:

y_p = Φ_p s_p + v_p    (1)

where Φ_p is the M × N (M < N) measurement matrix at the p-th node for p = 1, ..., P. For each p, the entries of Φ_p are assumed to be drawn from a Gaussian ensemble with mean zero. The sparse signal vector of interest, s_p for p = 1, ..., P, has only K (<< N) non-zero elements.

If P >> N, then M < 1. It means that only single measurements from some nodes of the network are sufficient to recover the joint sparsity pattern reliably. Thus, M → 0 as P → ∞. The lower bound in (10) explicitly shows the minimum number of measurements required to recover the sparsity pattern of the sparse signal with 1-bit CS using the ML decoder with multiple measurement vectors. Note that the value of ā_pt(γ, K) also depends on the sparsity index K and the measurement SNR γ. Further, using the definition of λ_pik, and assuming finite σ_v^2 and σ_s^2, it can be shown that 0 < ā_pt < 1. The work can easily be extended to the scenario having different measurement matrices.
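Before turning to the recovery algorithms, the following short Python sketch makes the observation model in (1) and the element-wise 1-bit quantization concrete. It is a minimal illustration only: the quantizer convention (non-negative measurements mapped to 1 and negative ones to 0) is an assumption consistent with the likelihood used in Section IV, and all parameter values are arbitrary.

import numpy as np

def generate_1bit_mmv(N=100, M=60, K=5, P=10, sigma_v=0.1, seed=0):
    """Generate P jointly sparse signals and their 1-bit quantized compressive measurements."""
    rng = np.random.default_rng(seed)
    support = rng.choice(N, size=K, replace=False)        # common (unknown) support
    S = np.zeros((N, P))
    S[support, :] = rng.standard_normal((K, P))           # different amplitudes per node
    Z = np.zeros((M, P), dtype=int)
    Phis = []
    for p in range(P):
        Phi_p = rng.standard_normal((M, N))               # zero-mean Gaussian ensemble (unit variance, as in Section VI)
        y_p = Phi_p @ S[:, p] + sigma_v * rng.standard_normal(M)   # real valued measurements, model (1)
        Z[:, p] = (y_p >= 0).astype(int)                  # element-wise 1-bit quantization
        Phis.append(Phi_p)
    return Phis, S, Z, support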
IV. CENTRALIZED ALGORITHMS WITH 1-BIT CS

In this section, we develop two computationally tractable algorithms for joint sparsity pattern recovery in a centralized setting. In the first algorithm, we use the regularized row-ℓ1 norm minimization approach with the likelihood function as the cost function. In the second algorithm, we extend the BIHT algorithm to the MMV case.

A. Centralized ℓ1,∞ Regularized ML Based Algorithm

In this algorithm, the ML function is used as the cost function instead of the commonly used least squares function [19], [44]. Sparsity is imposed by using ℓ1,∞ regularization on the signal matrix S to obtain a sparse signal matrix estimate Ŝ or the estimate of the support of S. For the sake of tractability, we assume that the measurement matrix Φ_p = Φ is the same for all p = 1, 2, ..., P. Then, from (1), we have

y_{ip} = Φ_i^T s_p + v_{ip},    (12)
for i = 1, 2, ..., M and p = 1, 2, ..., P. In the rest of the paper, Φ_i denotes the i-th row of Φ. Let the matrix Z be obtained by element-wise quantization in (2), where z_{ip} is the (i, p)-th element of Z. Next, we calculate the probabilities Pr(z_{ip} = 1) and Pr(z_{ip} = 0), which will later be used to write the expression of the likelihood of Z given S. We have

$\Pr(y_{ip} \ge 0) = \Pr(\Phi_i^T s_p + v_{ip} \ge 0) = \phi(\Phi_i^T s_p / \sigma_v).$

Similarly,

$\Pr(y_{ip} < 0) = \Pr(\Phi_i^T s_p + v_{ip} < 0) = \phi(-\Phi_i^T s_p / \sigma_v),$

where $\phi(x) = (1/\sqrt{2\pi}) \int_{-\infty}^{x} e^{-t^2/2}\, dt$. The conditional probability of Z given S, Pr(Z|S), is given by

$\Pr(Z|S) = \prod_{p=1}^{P} \prod_{i=1}^{M} \Pr(z_{ip}|S) = \prod_{p=1}^{P} \prod_{i=1}^{M} \left(\phi\!\left(\frac{\Phi_i^T s_p}{\sigma_v}\right)\right)^{z_{ip}} \left(\phi\!\left(-\frac{\Phi_i^T s_p}{\sigma_v}\right)\right)^{(1-z_{ip})}.$

The negative log-likelihood of Z given S, f_ml(ΦS), is given by

$f_{ml}(\Phi S) = -\log(\Pr(Z|S)) = -\sum_{p=1}^{P} \sum_{i=1}^{M} \left[ z_{ip} \log \phi\!\left(\frac{\Phi_i^T s_p}{\sigma_v}\right) + (1-z_{ip}) \log \phi\!\left(-\frac{\Phi_i^T s_p}{\sigma_v}\right) \right] \qquad (13)$
which can be rewritten as

$f_{ml}(X) = -\sum_{p=1}^{P} \sum_{i=1}^{M} \left[ z_{ip} \log \phi\!\left(\frac{x_{ip}}{\sigma_v}\right) + (1-z_{ip}) \log \phi\!\left(\frac{-x_{ip}}{\sigma_v}\right) \right], \qquad (14)$

where X = ΦS. In the following, we use X and ΦS interchangeably. We aim to minimize f_ml(ΦS) as well as incorporate the sparsity condition of the signal matrix S to obtain an estimated signal matrix Ŝ or the support of S. As the signals observed at all the nodes share the same support, the following matrix norm (as defined in [44] for real valued measurements) is appropriate. We define the row support of the coefficient matrix S as [44]

rowsupp(S) = {w ∈ [1, N] : s_{wk} ≠ 0 for some k},

and the row-ℓ0 quasi-norm of S is given by ||S||_{row-0} = |rowsupp(S)|, which is also known as the ℓ_{0,∞} norm, where |·| denotes the cardinality. To compute S, we aim to solve the following optimization problem:

$\arg\min_{S} \; \{ f_{ml}(\Phi S) + \lambda \|S\|_{0,\infty} \} \qquad (15)$

where λ is the penalty parameter. However, the problem (15) is not tractable in its current form. The problem can be modified as

$\arg\min_{S} \; \{ f_{ml}(\Phi S) + \lambda \|S\|_{1,\infty} \} \qquad (16)$

where $\|S\|_{1,\infty} = \sum_{i=1}^{N} \max_{1 \le j \le P} |s_{ij}|$, i.e., ||S||_{1,∞} is the sum over rows of the maximum absolute value in each row, also known as the ℓ1,∞ quasi-norm of a matrix. It is noted that f_ml(ΦS) and ||S||_{1,∞} are convex functions. The former is the ML function while the latter is a convex relaxation of the row-ℓ0 quasi-norm [44]; therefore, the expression in (16) is a convex optimization problem.

1) Algorithm for Solving the Optimization Problem in (16): The goal is to solve the problem of the form

$\arg\min_{S} \; \{ f(\Phi S) + \lambda g(S) \} \qquad (17)$
where f(ΦS) = f_ml(ΦS) and g(S) is the ℓ1,∞ norm of S. The class of iterative shrinkage-thresholding algorithms (ISTA) provides one of the most popular methods for solving the problem as defined in (17). In ISTA, each iteration involves the solution of a simplified optimization problem, which in most of the cases can be easily solved using the proximal gradient method, followed by a shrinkage/soft-threshold step; see, e.g., [45]–[47]. It should be noted that in these papers, ISTA is applied to find an optimum vector which minimizes a given objective function and, therefore, cannot be applied here directly. We extend the idea to find an optimal matrix which is a minimizer of the expression f(ΦS) + λg(S). From [47], at the k-th iteration we have

$S^k = P_{L_f}(S^{k-1}) \qquad (18)$

where

$P_{L_f}(T) = \arg\min_{S} \; \left\{ \lambda g(S) + \frac{L_f}{2} \left\| S - \left( T - \frac{1}{L_f} \nabla f(T) \right) \right\|_F^2 \right\}.$

Inputs to the algorithm are L_f (the Lipschitz constant of ∇f) and S^0, the initialization for the iterative method, which could be assumed to be a null matrix or Φ†Z, where Φ† is the pseudoinverse of Φ. For our case, the gradient of f_ml(X) w.r.t. the matrix S can be easily calculated as Φ^T ∇f_ml(X), where X = ΦS. Notice that ∇f_ml(X) is the gradient of f_ml(X) w.r.t. X and is given element-wise by

$\nabla f_{ml}(x_{ip}) = \frac{(1 - z_{ip}) \exp(-\tilde{x}_{ip}^2/2)}{\sqrt{2\pi}\,\sigma_v\, \phi(-\tilde{x}_{ip})} - \frac{z_{ip} \exp(-\tilde{x}_{ip}^2/2)}{\sqrt{2\pi}\,\sigma_v\, \phi(\tilde{x}_{ip})}, \qquad (19)$

where x̃_{ip} = x_{ip}/σ_v. The problem defined in (18) is row separable for each iteration. Therefore, to solve for S^k, i.e., to find P_{L_f}(S^{k-1}), we divide the problem into N subproblems, where N is the number of rows in S. Next, we solve the following subproblem for each row of S^k:

$\arg\min_{s^i} \; \left\{ \lambda g(s^i) + \frac{L_f}{2} \left\| s^i - \left( t^i - \frac{1}{L_f} \nabla f(t^i) \right) \right\|_2^2 \right\}, \qquad (20)$

where s^i, t^i and ∇f(t^i) are the i-th rows of S, S^{k-1} and ∇f(S^{k-1}), respectively. Equation (20) is of the form

$\arg\min_{s} \; \left\{ \lambda g(s) + \frac{L_f}{2} \| s - u \|_2^2 \right\}, \qquad (21)$

where g(s) = ||s^i||_∞, i.e., the ℓ∞ norm of the i-th row of S, and the constant vector u is given by u = t^i − (1/L_f) ∇f(t^i) (we avoid using the superscript i on g(s) and u for brevity). Introducing an auxiliary variable t = g(s), the problem in (21) can be rewritten as

$\arg\min_{s} \; \left\{ t + \frac{1}{2\bar{\lambda}} \| s - u \|_2^2 \right\}, \quad \text{s.t. } 0 \le w_p s_p \le t, \qquad (22)$

where λ̄ = λ/L_f, w_p = sign(u_p), and u_p and s_p are the p-th elements of u and s, respectively, for all p = 1, 2, ..., P. The problem in (22) can be solved using Lagrangian based methods. The Lagrangian for (22) is defined as

$L(s, \alpha, \beta) = \frac{1}{2} \| s - u \|_2^2 + \bar{\lambda} t - \sum_i \alpha_i w_i s_i + \sum_i \beta_i (w_i s_i - t),$

where α and β are the Lagrangian multipliers. Optimality
conditions for 1 ≤ p ≤ P for the above equation are

$(s_p - u_p) - \alpha_p w_p + \beta_p w_p = 0, \qquad \bar{\lambda} - \sum_p \beta_p = 0, \qquad \alpha_p (w_p s_p) = 0, \qquad \beta_p (w_p s_p - t) = 0, \qquad (23)$

for α_p, β_p ≥ 0. Notice that if the optimal t = t* were known, we could use the above conditions to compute the optimum s_p* via

$s_p^*(t^*) = \begin{cases} w_p t^* & \text{if } |u_p| \ge t^*; \\ u_p & \text{otherwise.} \end{cases} \qquad (24)$

The proof of the above statement follows from the following lemma [48].

Lemma 1. The optimality conditions in (23) are satisfied by the solution in (24).

Now, the problem reduces to finding the optimal t*. To find the optimal t*, we define the function

$h(t) = \bar{\lambda} - \sum_p \beta_p = \bar{\lambda} + \sum_{p: |u_p| \ge t} w_p (s_p(t) - u_p), \qquad (25)$

where s_p(t) is obtained by (24) with t instead of t*. The optimal t* can be found by solving the following equation:

$h(t^*) = 0. \qquad (26)$

This can be easily solved for t* by applying a bisection based method using the initial interval [0, ||u||_∞]. If there does not exist a solution in this interval, i.e., h(0) × h(||u||_∞) ≥ 0, the trivial solution is given by t* = 0. Once we find t*, the solution to (22) is given by (24). Each subproblem given by (20) can be solved in a similar way, and the solutions can be used to find S^k through (18). The summary of all the steps is provided in Algorithm 1, where ||·||_F denotes the Frobenius norm. Algorithm 1 produces the matrix S^k and the locations of non-zero elements in S^k yield the estimated support of the original signal matrix S.

Algorithm 1 Centralized ℓ1,∞ Regularized ML Based Algorithm (MLA)
1) Given tolerance ε > 0, parameters λ̃ > λ, 0 < α < 1
2) Initialize S^0, k = 0
3) While λ̃ > λ
4)   Set λ̃ = α λ̃
5)   While ||S^k − S^{k-1}||_F > ε ||S^{k-1}||_F
6)     k = k + 1
7)     Define matrix U = S^{k-1} − (1/L_f) ∇f(S^{k-1}) where ∇f(S^{k-1}) is computed as in (19)
8)     For each row of S^k
9)       Update the p-th row element using (26) and (24) for p = 1, 2, ..., P
10)    End For
11)  End While
12) End While
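For readers who want to experiment with the update in steps 7–9, the following Python sketch implements one proximal-gradient iteration of Algorithm 1: the gradient of the negative log-likelihood in (14) and the row-wise ℓ∞ proximal step solved by bisection as in (24)–(26). It is a minimal illustration under assumed parameter values, not the authors' reference implementation, and the function names are ours.

import numpy as np
from scipy.stats import norm

def grad_fml(X, Z, sigma_v):
    """Gradient of the negative log-likelihood (14) with respect to X = Phi @ S."""
    Xt = X / sigma_v
    pdf = norm.pdf(Xt) / sigma_v
    return (1 - Z) * pdf / norm.cdf(-Xt) - Z * pdf / norm.cdf(Xt)

def prox_linf_row(u, lam_bar, tol=1e-10):
    """Solve (21) for one row: min_s lam_bar*||s||_inf + 0.5*||s - u||_2^2 via (24)-(26)."""
    if lam_bar >= np.sum(np.abs(u)):              # h(t) has no root in (0, ||u||_inf]
        return np.zeros_like(u)                   # trivial solution t* = 0
    lo, hi = 0.0, np.max(np.abs(u))
    while hi - lo > tol:                          # bisection for h(t*) = 0, with h(t) = lam_bar - sum_p max(|u_p| - t, 0)
        t = 0.5 * (lo + hi)
        h = lam_bar - np.sum(np.maximum(np.abs(u) - t, 0.0))
        if h < 0:
            lo = t
        else:
            hi = t
    t_star = 0.5 * (lo + hi)
    return np.where(np.abs(u) >= t_star, np.sign(u) * t_star, u)   # solution (24)

def mla_iteration(S, Phi, Z, lam, L_f, sigma_v):
    """One iteration of Algorithm 1: gradient step (step 7) followed by the row-wise prox (steps 8-10)."""
    U = S - (1.0 / L_f) * Phi.T @ grad_fml(Phi @ S, Z, sigma_v)
    return np.apply_along_axis(prox_linf_row, 1, U, lam / L_f)

A full run of Algorithm 1 would wrap mla_iteration in the two while loops of the pseudo code, decreasing λ̃ geometrically in the outer loop.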
B. Centralized BIHT

In this section, we extend the BIHT [49] algorithm to the multiple sensor case and refer to it as the Centralized BIHT (C-BIHT) algorithm. For the sake of completeness, we first introduce the BIHT algorithm.

BIHT [49]: The BIHT algorithm is an iterative method that reconstructs a K-sparse signal s from binary information of compressed measurements from a single sensor. The signal estimate at the k-th iteration with quantized measurements z is given as

$s^k = \Theta_K\!\left( s^{k-1} - \tau \Phi^T (\mathrm{sign}(\Phi s^{k-1}) - z) \right),$

where Θ_K is a K-ball projection operator which forces all the elements but the K with largest magnitudes to zero and τ is the step size. This is an iterative method where at each iteration an estimate for the support is computed. This estimate is improved in successive iterations.

Algorithm 2 Centralized BIHT algorithm (C-BIHT)
Inputs: Φ, K, Z, τ
1) Initialize S^0
2) For iteration j until the stopping criteria
3)   S^j = S^{j-1} + τ Φ^T (Z − sign(Φ S^{j-1}))
4)   U^j = DetectSupport(S^j, K)
5)   S^j = Threshold(S^j, U^j)
6) End For
7) Ŝ = S^{j*} and Û = U^{j*} when stopping at iteration j*

C-BIHT: In a centralized setting, Z is available at the fusion center. Merging BIHT [49] and the idea of the Centralized IHT Algorithm [10], we propose the Centralized BIHT (C-BIHT) algorithm to estimate the joint sparsity pattern using multiple sensors. In Algorithm 2, we provide the pseudo code for C-BIHT. We first initialize the signal estimate to S^0. Next, during the j-th iteration the gradient of the cost function is evaluated using S^{j-1} and a step proportional to the gradient is taken in the negative direction. This step ensures that S^j moves in the direction where the cost function is minimized. In Step 4, DetectSupport(S^j, K) is a function which computes the support of the K-sparse signal matrix. A simple implementation of this function is to select the K rows with the largest ℓ2 norm as the support set. Next, in Step 5, the function Threshold(S^j, U^j) forces all the rows of the matrix S^j to zero except for the indices in the support U^j. In other words, this step is a hard thresholding operation which forces the matrix S^j to be K-row sparse. These iterations are continued until the stopping criterion is satisfied (such as minimum squared error). The support estimate of the final iteration is the estimated support.

Since the algorithms developed in the centralized setting take into account all the measurements from multiple sensors, they are expected to have better support recovery performance compared to algorithms that take measurements from a single sensor. However, a centralized system is not always feasible when the network is large or resource constrained. In the next section, we propose algorithms to solve the sparsity recovery problem in a decentralized manner.
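As an illustration of Algorithm 2, a minimal Python sketch of C-BIHT is given below. It assumes the quantized measurements are represented as ±1 (as in the BIHT update), uses the K rows with largest ℓ2 norm for DetectSupport, and runs a fixed number of iterations instead of a data-dependent stopping criterion; none of these choices are prescribed by the paper beyond what Algorithm 2 states.

import numpy as np

def c_biht(Phi, Z_pm1, K, tau=1.0, num_iter=100):
    """Centralized BIHT (Algorithm 2). Z_pm1 is the M x P matrix of +/-1 quantized measurements."""
    M, N = Phi.shape
    P = Z_pm1.shape[1]
    S = np.zeros((N, P))                           # initialize S^0
    support = np.arange(K)
    for _ in range(num_iter):
        # step towards consistency with the 1-bit measurements (Algorithm 2, step 3)
        S = S + tau * Phi.T @ (Z_pm1 - np.sign(Phi @ S))
        # DetectSupport: K rows with the largest l2 norm (step 4)
        row_norms = np.linalg.norm(S, axis=1)
        support = np.argsort(row_norms)[-K:]
        # Threshold: keep only the rows indexed by the support estimate (step 5)
        mask = np.zeros(N, dtype=bool)
        mask[support] = True
        S[~mask, :] = 0.0
    return np.sort(support), S

For example, with data generated as in the earlier sketch and a common measurement matrix Φ at all nodes, est_support, _ = c_biht(Phi, 2*Z - 1, K=5) returns the joint support estimate once M and P are large enough.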
V. DECENTRALIZED ALGORITHMS WITH 1-BIT CS
In this section, the algorithms developed for the centralized case are extended to the decentralized setting. A decentralized network is modeled as an undirected graph G = (V, E), where V denotes the set of nodes and E denotes the set of edges.
Algorithm 3 Decentralized Row-ℓ1 Regularized ML Based Algorithm (DMLA)
1) Given tolerance ε > 0, parameters λ̃ > λ, 0 < α < 1

B. Decentralized-BIHT

In this subsection, we propose a decentralized algorithm based on the BIHT algorithm for sparsity pattern recovery. In Algorithm 4, we provide the pseudo code for D-BIHT. This algorithm also has information and index fusion stages as in DMLA.

Algorithm 4 Decentralized BIHT (D-BIHT)
Inputs: Φ, K, τ, neigh(p) for all p ∈ V
1) Initialize S_p^0 for all p ∈ V
2) Local Communication at node p, for all p ∈ V:
   Transmit z_p to its one-hop neighborhood
   Receive z_r where r ∈ neigh(p) and form Z_p
3) For iteration j until the stopping criteria
4)   S_p^j = S_p^{j-1} + τ Φ^T (Z_p − sign(Φ S_p^{j-1}))
5)   U_p^j = DetectSupport(S_p^j, K)
6)   S_p^j = Threshold(S_p^j, U_p^j)
7) End For
8) Ŝ_p = S_p^{j*} and Û_p = U_p^{j*} when stopping at iteration j*
9) Global Communication:
   a) For all p ∈ V, transmit Û_p to V
   b) Receive Û_i for all i ≠ p
10) Û = Majority(Û_1, Û_2, ..., Û_P)

1) Information Fusion Stage: In this stage, the p-th node collects quantized compressed measurements from its one-hop neighbors and forms Z_p. The p-th node then uses Z_p to estimate the support, Û_p^j, and the signal matrix, S_p^j, in steps 3 through 7.

2) Index Fusion Stage: In this stage the p-th node receives estimates of the support set from all the other nodes in the network. Each node decides on the sparsity pattern using the majority fusion rule implemented in the Majority function (see the sketch following this subsection). This stage is the same as the index fusion stage of DMLA. The difference between the final performance of DMLA and D-BIHT is due to the difference in performance of their information fusion stages in estimating Û_p.

Decentralized BIHT modified (D-BIHTm): For a resource constrained network that has very severe restrictions on bandwidth usage and/or computation capacity (power constraint), we simplify Algorithm 4. In particular, Algorithm 4 is modified by omitting the Information Fusion stage. Each node obtains an estimate of the sparsity pattern, Û_p, p = 1, ..., P, based on only its own information via the BIHT algorithm. This stage is referred to as the Self Decision Stage. Thus, the communication overhead/bandwidth of the network and the computational cost at each node are reduced. The next stage is the index fusion stage where the final estimate is obtained by global fusion as in Algorithm 4. This special case of D-BIHT is referred to as the Decentralized BIHT modified algorithm.
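The following Python sketch illustrates the two stages of D-BIHT for a single node, reusing the hypothetical c_biht routine from the previous sketch for the local estimation step. The network representation, the encoding of support sets as index arrays, and the tie handling in the majority rule are our own illustrative assumptions.

import numpy as np

def information_fusion(Phi, Z_pm1, neighbors, p, K, tau=1.0, num_iter=100):
    """Node p stacks its own and its one-hop neighbors' 1-bit measurement vectors (forming Z_p)
    and runs the BIHT-style update of Algorithm 4, steps 3-7."""
    cols = [p] + list(neighbors[p])
    Z_p = Z_pm1[:, cols]                           # local MMV matrix built from neigh(p)
    support_p, _ = c_biht(Phi, Z_p, K, tau=tau, num_iter=num_iter)
    return support_p                               # local support estimate U_hat_p

def index_fusion(local_supports, N, K):
    """Majority fusion rule (Algorithm 4, step 10): keep the K indices that appear
    in the largest number of local support estimates."""
    votes = np.zeros(N, dtype=int)
    for supp in local_supports:
        votes[supp] += 1
    return np.sort(np.argsort(votes)[-K:])         # global support estimate U_hat

# D-BIHTm corresponds to passing neighbors[p] = [] above, i.e., each node uses only its own
# measurement vector in the self decision stage before index fusion.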
(a) MLA    (b) C-BIHT
Fig. 1: Percentage of Sparsity Pattern Recovery (PSPR) for MLA and C-BIHT algorithms when η = -3.01 dB, N = 100 and K = 5 in a network of 10 nodes.
VI. NUMERICAL RESULTS

In this section, we evaluate the performances of the proposed centralized and decentralized algorithms through numerical simulations. We consider the percentage of sparsity pattern recovered (PSPR) correctly and the probability of exact sparsity pattern recovery (PESPR) as the performance metrics. They are defined as

PSPR = (number of support elements out of K that are recovered correctly / K) × 100,

PESPR = (number of Monte Carlo runs in which the true support is recovered) / (number of Monte Carlo runs),

respectively.
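A direct way to compute these two metrics from a set of Monte Carlo trials is sketched below (hypothetical helper, not from the paper).

import numpy as np

def pspr_pespr(true_support, estimated_supports):
    """PSPR (%) and PESPR computed over a list of estimated supports from Monte Carlo runs."""
    true_set = set(int(i) for i in true_support)
    K = len(true_set)
    pspr_runs, exact_runs = [], 0
    for est in estimated_supports:
        hits = len(true_set.intersection(int(i) for i in est))
        pspr_runs.append(100.0 * hits / K)
        exact_runs += int(hits == K)
    return np.mean(pspr_runs), exact_runs / len(estimated_supports)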
We obtain the performances of the proposed algorithms for different values of compression ratios (M/N) and noise variances (σ_v^2). We generate a signal vector of length N = 100 with sparsity index K = 5. For each M, we generate the elements of the M × N measurement matrix Φ from a normal distribution with mean zero and unit variance. The sparse support set for the signal is selected from [1, N] uniformly. The amplitudes of the signals in the support set are generated from an i.i.d. Gaussian distribution with zero mean and unit variance. The signals are assumed to remain constant over all Monte Carlo runs. The measurement noise at each node is i.i.d. Gaussian with zero mean. The variance of the noise vector, v, is set such that E(||v||_2^2) is a constant. We define the total noise power as η = 10 log_10(E(||v||_2^2)) dB. We compare our proposed algorithms with the algorithms proposed in [38] and [28] and refer to them as ImpNoise and 1bitGAMP, respectively. We ran the C-BIHT, MLA, D-BIHT, DMLA, D-BIHTm, ImpNoise, 1bitGAMP, and BIHT algorithms for 1000 Monte Carlo runs. In the simulations, we use step size τ = 1 for the C-BIHT, D-BIHT and D-BIHTm algorithms. We ran all the proposed decentralized algorithms over a network with 10 nodes, each of which is assumed to be of degree 3. We first compare the performances of the two proposed centralized algorithms, MLA and C-BIHT. In Figure 1, we present the PSPR contours, i.e., the level curves of PSPR as a function of M/N and P for η = -3.01 dB. Our results for MLA and C-BIHT are shown in Figures 1(a) and 1(b), respectively.
We can see that, with an increase in M/N for a constant P, the PSPR of both algorithms improves. We can also see a similar improvement in PSPR with an increase in P for a constant M/N. Further, C-BIHT achieves 100% PSPR with smaller values of M/N and P than MLA. For example, when M/N is 0.8, C-BIHT achieves 100% PSPR with 10 sensors, while MLA requires 15 sensors to achieve the same performance for the same M/N. Figure 2 shows the results of two experiments where we compare the PSPR and PESPR of all the proposed algorithms when η = -3.01 dB and η = 6.99 dB, respectively. In Figures 2(a) and 2(b), we present our results when η = -3.01 dB, where the values of PSPR and PESPR are shown as a function of M/N. The results show that, when η = -3.01 dB, all of the proposed centralized and decentralized algorithms perform well with near 100% sparsity pattern recovery. All the curves corresponding to the proposed algorithms in Figures 2(a) and 2(b) overlap and are not distinguishable. As expected, the SMV based algorithms, BIHT and 1bitGAMP, perform quite poorly. It should be noted that ImpNoise, which is a distributed 1-bit CS algorithm, also performs quite poorly. For the case when η = 6.99 dB, we present our results in Figures 2(c) and 2(d). It can be seen that with increased noise power, there is degradation in the performance of all of the algorithms. When M/N increases, the performance of all of these algorithms improves. It is seen that C-BIHT and D-BIHT have better performance in sparsity pattern recovery, i.e., almost 100% for all the measurements. The performance of MLA and DMLA is 90% or above only when M/N approaches 1. The performances of both decentralized algorithms are comparable to those of their corresponding centralized algorithms. However, the performance of D-BIHT is better than that of DMLA. All of these algorithms have much better performance compared to the BIHT and 1bitGAMP algorithms, i.e., the algorithms using single measurement vectors. The performance of the ImpNoise algorithm does not change much when compared to Figures 2(a) and 2(b) but is still not as good as that of the proposed algorithms. It should be noted that D-BIHTm does not lose much in performance when compared with D-BIHT even though D-BIHTm uses a smaller amount of information in estimating the joint sparse support set.
(a) PSPR when η = -3.01 dB    (b) PESPR when η = -3.01 dB    (c) PSPR when η = 6.99 dB    (d) PESPR when η = 6.99 dB
Fig. 2: Percentage of Sparsity Pattern Recovery (PSPR) and Probability of Exact Sparsity Pattern Recovery (PESPR) for D-BIHT, DMLA, D-BIHTm, C-BIHT, MLA, and BIHT (SMV) algorithms in a network of 10 nodes each of which has degree 3, when η = -3.01 dB and η = 6.99 dB respectively, for N = 100 and K = 5.

TABLE I: Comparison of run times of C-BIHT and MLA in seconds to obtain the sparsity pattern when N = 100 and N = 500

C-BIHT when N = 100 and K = 5
M →      60       70       80       90       100
P = 3    0.0285   0.030    0.0307   0.034    0.035
P = 5    0.067    0.069    0.069    0.072    0.118
P = 10   0.090    0.092    0.095    0.111    0.160

MLA when N = 100 and K = 5
M →      60       70       80       90       100
P = 3    2.2676   2.5566   2.9729   3.3396   4.5327
P = 5    4.5370   5.1878   6.0173   6.6772   7.5497
P = 10   9.3963   10.9925  12.6525  14.0072  15.7242

C-BIHT when N = 500 and K = 25
M →      300      350      400      450      500
P = 3    0.1757   0.1757   0.1695   0.1942   0.1857
P = 5    0.1764   0.1832   0.1890   0.1970   0.2127
P = 10   0.2170   0.2398   0.2594   0.2694   0.3382

MLA when N = 500 and K = 25
M →      300      350      400      450      500
P = 3    11.8385  13.7213  15.8544  17.6784  19.6593
P = 5    19.7641  23.1694  26.4415  29.6953  32.8875
P = 10   39.7087  46.1982  52.7405  59.1511  65.5580
Next, we compare the computational complexities of the proposed algorithms in a centralized setting. Addition and
multiplication are considered as the basic operations in evaluating complexity. We assume that the number of operations in the evaluation of a gradient and of a function is equal to the dimensions of the gradient and of the variable returned by the function, respectively. Based on these assumptions, the total number of operations carried out by the MLA can be shown to be of order $O\!\left(\lceil \log_{\alpha}(\lambda/\tilde{\lambda}) \rceil \left(1 + T'(3NP + NP\,N_E + 1)\right)\right)$, where T' is the number of times the inner while loop executes and N_E denotes the total number of operations required to update the elements of S^k using (25) and (27). Similarly, the total number of operations required by C-BIHT is of order O(T(4NP + KP)). As the exact analysis of the computational complexity of MLA depends on the algorithm parameters λ̃, λ, and T', it is difficult to provide a fair comparison with the computational complexity of C-BIHT. Therefore, we employ the run times of the centralized algorithms as a measure of their computational complexity. The analysis of time complexities of the decentralized algorithms is similar to that of their respective centralized counterparts. This is because the time complexity at each node in the decentralized algorithms is dominated by the time complexity of the smaller instance of the corresponding centralized algorithm. Table I gives a summary of the run times required by the C-BIHT and MLA algorithms to estimate the sparsity pattern from the given 1-bit compressive MMVs for different values of N, P and M. The experiment is carried out in Matlab 2015b using an Intel Xeon(R) processor. The values
(a) PSPR    (b) PESPR
Fig. 3: Percentage of Sparsity Pattern Recovery (PSPR) and Probability of Exact Sparsity Pattern Recovery (PESPR) for D-BIHT, D-BIHTm, DMLA, C-BIHT, MLA, and BIHT (SMV) as a function of η when N = 100 and K = 5 in a network of 10 nodes each of which has degree 3.
(a) η = 6.99 dB    (b) η = 16.99 dB    (c) PSPR when M = 60
Fig. 4: Comparison of SIHT with C-BIHT, D-BIHT, and D-BIHTm when N = 100, P = 10 and K = 5.
in the table show the average times in seconds required by the centralized algorithms to obtain the sparsity pattern, obtained by averaging the total time required over 20 runs. For both algorithms, the time required increases when one or more of N, P and M increase. The run times of C-BIHT and MLA have roughly a 5-fold increase when the problem size N has a 5-fold increase, i.e., from 100 to 500. For the same values of N, P and M, C-BIHT is around 100 times faster than MLA and hence is a clear winner in terms of time complexity. Next, we compare the performances of the proposed 1-bit CS algorithms for joint support recovery with those of their real valued CS counterparts. Here we choose to compare C-BIHT, D-BIHT, and D-BIHTm with the simultaneous iterative hard thresholding (SIHT) algorithm [10]. Figure 4 shows the PSPR values of these algorithms for different values of η. Figure 4(a) shows the PSPR of C-BIHT, D-BIHT and SIHT when η = 6.99 dB as a function of M/N. It shows that there is almost no loss in the PSPR of the proposed algorithms when compared to SIHT. For the case when η = 16.99 dB, we present our results in Figure 4(b). It is quite interesting to see that, even with an increase in η, the PSPR of C-BIHT and D-BIHT is almost 100% for all the values of M/N. Next, we analyze the sensitivity of all of these algorithms with respect to η. In this experiment, we choose M/N = 0.6 and vary η. Figure 4(c) shows the PSPR values of the proposed algorithms and SIHT as a function of η. It is seen that C-BIHT and D-BIHT perform similar to SIHT when η is 16 dB or less. The result is quite promising even when M/N = 0.6. However, when η > 16 dB, the rate of degradation
in the performances of C-BIHT and D-BIHT is higher than SIHT. Degradation in the performance of 1-bit CS algorithms with an increase in η is expected. It is noted that 1-bit CS yields huge saving in the number of bits required to store and/or transmit compressed measurements. For each signal vector with 1-bit CS measurements, zp , requires only M bits. However, approximation of a real valued CS with L level of quantization requires M log2 (L) bits. Thus, 1-bit CS saves M log2 (L) − M bits. The saving increases by a factor of P in a sensor network with P sensors. The performances of these 1-bit CS algorithms, except in very low SNR regimes, are comparable to their real valued counterparts. Thus, the proposed algorithms provide a promising alternatives to real valued CS based algorithms, especially in resource constrained networks except when the total noise power is large. We have studied the advantages of using MMVs in centralized and decentralized settings over SMV. Next, we study the sparsity pattern recovery performance of the SMV-based algorithm with MMV-based algorithms when we put a constraint on the total number of bits that can be used by algorithms to estimate support of sparse signal(s). Let NB be the total number of bits that can be used in total to estimate the support of the sparse signals. 1-bit SMV-based CS algorithm makes all NB measurements to estimate the support of the sparse signal, whereas each sensor in MMV based CS algorithms makes NB /P 1-bit CS measurements. In this setting, i.e., when the total number of bits is constant, we compare the performance of MMV based algorithms (both centralized and decentralized) with SMV-based algorithms. Here, we chose
to compare the performance of C-BIHT, D-BIHT, and D-BIHTm with BIHT. We should note that in this setting BIHT uses an overcomplete dictionary whereas C-BIHT, D-BIHT, and D-BIHTm use an under-complete dictionary to obtain measurements. Hence, we compare the support recovery performance when an SMV is available from the over-determined dictionary with the joint support recovery with MMVs from the under-determined dictionary. Figure 5 shows the PSPR values of these algorithms for different values of M/N. It should be noted that M/N refers to the compression ratio of the measurement vectors of each sensor in the centralized and decentralized settings. BIHT is performed when the compression ratio is equal to P M/N. Finally, we compare the performance of all of these algorithms with that of the IHT algorithm [50], the SMV based real valued CS algorithm, which acts as an upper bound on the performance of the 1-bit CS algorithm for the same number of measurements, here P M. It is seen that, initially, the performance of BIHT is comparable to that of its centralized and decentralized counterparts. As the compression ratio increases, both C-BIHT and D-BIHT have a greater rate of improvement compared to BIHT, and hence the MMV based algorithms perform better when M is increased. It can be observed that the PSPR of IHT is greater than that of C-BIHT and D-BIHT until M/N for MMV is 0.7 and 0.8, respectively. Based on the performance of IHT, it can be inferred that MMV based algorithms perform better than the SMV based algorithm even if we let the precision of the SMV based algorithm be infinite.
Fig. 5: Percentage of Sparsity Pattern Recovery for D-BIHT, D-BIHTm, C-BIHT, BIHT, and IHT as a function of M/N when N = 100 and K = 5 in a network of 10 nodes each of which has degree 3.

Next, we numerically estimate the total number of 1-bit compressed measurements required by the C-BIHT algorithm on average for joint support recovery with minimum error. Here, for each value of the signal SNR, $\gamma = \frac{K\sigma_s^2 + \mu^T\mu}{K\sigma_v^2}$, we run the C-BIHT algorithm for different numbers of measurements, say M′, starting from 1. We estimate the probability of joint support recovery from 100 runs of the algorithm for each M′. If the probability of joint support recovery is less than 0.95, we increase the value of M′ by one and repeat the experiment. The first M′ with a probability of joint support recovery of 0.95 or more is considered as the lower bound, M_min. For each value of γ, we repeat the experiment for 20 instances. The average of all M_min values for a given γ is considered the desired
lower bound for C-BIHT, M_CBIHT, for that γ. We compare this M_CBIHT with the bound obtained in (11), which we label as M_ML. Figure 6 shows the comparison of M_CBIHT with M_ML for K = 20 and 40. It can be seen that M_CBIHT is lower-bounded by M_ML. When γ is small, the difference between M_CBIHT and M_ML is high. When γ increases, the difference decreases. However, any further increase in the value of γ does not change M_CBIHT by much and M_CBIHT tends to stabilize.
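The search described above can be written compactly as in the sketch below; it reuses the hypothetical c_biht helper from the earlier sketch and is only meant to illustrate the experimental procedure, with illustrative parameter values (a common measurement matrix at all nodes and μ_j = 1 on the support, as in Figure 6).

import numpy as np

def empirical_m_min(N, K, P, sigma_v, target=0.95, runs=100, m_max=500, seed=0):
    """Smallest M' for which C-BIHT recovers the joint support with probability >= target."""
    rng = np.random.default_rng(seed)
    for m_prime in range(1, m_max + 1):
        successes = 0
        for _ in range(runs):
            support = rng.choice(N, size=K, replace=False)
            S = np.zeros((N, P))
            S[support, :] = 1.0                                      # mu_j = 1 on the support
            Phi = rng.standard_normal((m_prime, N))                  # common measurement matrix
            Y = Phi @ S + sigma_v * rng.standard_normal((m_prime, P))
            Z_pm1 = np.sign(Y)                                       # +/-1 quantized measurements
            est, _ = c_biht(Phi, Z_pm1, K)
            successes += int(set(est) == set(support))
        if successes / runs >= target:
            return m_prime                                           # M_min at this SNR
    return None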
(Curves of M_CBIHT and M_ML versus γ (dB) for K = 20 and K = 40.)
Fig. 6: Minimum number of measurements for support recovery vs. signal SNR γ, when N = 1000, K = 20, σ_v^2 = 0.01, μ_j = 1 for j ∈ U, and P = 5.

Finally, we evaluate the performance of the C-BIHT and MLA algorithms when we relax our assumptions on the measurement matrices. In the first case, we consider the measurement matrices Φ_p, p = 1, ..., P, to be different across sensors and compare the performance with the case when Φ_p = Φ for all p. Figure 7(a) shows the PSPR values of the C-BIHT algorithm and the MLA algorithm as a function of η. MLA diff and C-BIHT diff refer to results when different measurement matrices are used at different sensors. The performance of the C-BIHT algorithm using different Φ_p is comparable to that obtained when using Φ at all nodes. However, Figure 7(a) shows that the MLA algorithm has improved performance when different Φ_p are used at different sensors, especially when the total noise power is high. The improvement in the performance is due to the diversity of the measurement matrices. Previous works [51], [52] have also shown improvement in signal recovery performance, theoretically and through numerical experiments, when different measurement matrices are used instead of a single measurement matrix. These works, however, assumed compressed measurements to be real-valued. The numerical results in Figure 7(a) show similar behavior when we have multiple 1-bit compressed measurements from different measurement matrices. We can also see that the performance of the MLA algorithm improves and becomes comparable to that of the C-BIHT algorithm by using different Φ_p. We should note that the FC is required to know the measurement matrices used by each sensor in the network. Similarly, in a decentralized system, each sensor is required to know the measurement matrices of all of its one-hop neighbors. Hence, in both of these cases, prior communication between sensors and the FC, or among sensors, is required for sharing measurement matrices. Further, the space complexity of the FC in a centralized setting and that of each of the nodes in a decentralized
setting increases with the increase in the size of the network and the increase in connectivity among nodes, respectively. In the second case, the PSPR values of the C-BIHT and MLA algorithms are evaluated when a random partial DCT matrix, which is obtained by picking M rows uniformly at random from the N rows of an N × N DCT matrix, is used as the measurement matrix, and are compared to the case when a random Gaussian measurement matrix is used. The numerical results of the experiment are shown in Figure 7(b). The curves represented by the legends MLA dct and C-BIHT dct refer to the results when partial DCT matrices are used instead of random Gaussian matrices. We can see a similar performance of the C-BIHT algorithm and an improvement of the MLA algorithm with the random partial DCT matrix when compared to a random Gaussian matrix. Hence, the numerical results show that the proposed algorithms have good sparsity pattern recovery performance even after these initial assumptions on Φ_p are relaxed.
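A random partial DCT measurement matrix of the kind used in this experiment can be formed as in the short sketch below; the use of SciPy's DCT routine and the orthonormal normalization are our own illustrative choices.

import numpy as np
from scipy.fft import dct

def random_partial_dct(M, N, seed=0):
    """Pick M rows uniformly at random from an N x N (orthonormal) DCT matrix."""
    rng = np.random.default_rng(seed)
    D = dct(np.eye(N), norm='ortho', axis=0)      # N x N DCT matrix
    rows = rng.choice(N, size=M, replace=False)   # M rows chosen uniformly at random
    return D[rows, :]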
(a) Different Φ_p    (b) Random partial DCT matrix

Fig. 7: PSPR values of C-BIHT and MLA algorithms as a function of η when (a) different random Gaussian Φ_p are used and (b) a random partial DCT matrix is used, when N = 100, M = 60, K = 5, and P = 10.
VII. CONCLUSION

In this work, we have considered the problem of joint sparsity pattern recovery with 1-bit quantized compressive measurements. We have determined the performance bounds for joint support recovery of sparse signals when 1-bit quantized measurements from distributed sensors are available at the FC. We have shown that the number of compressive measurements required to recover the joint sparsity pattern with vanishing probability of error has an inverse relation with the number of sensors in the sensor network. We also developed two computationally tractable centralized algorithms, namely MLA and C-BIHT, for sparsity pattern recovery with 1-bit quantized measurements. Further, we have extended the proposed centralized algorithms to decentralized settings. We have shown that the performance of these decentralized algorithms is comparable to that of the centralized algorithms. Through numerical simulations, we showed that 1-bit CS algorithms have comparable performance to real valued CS algorithms except in cases when the total noise power is large. The proposed 1-bit CS algorithms are promising for resource constrained networks as they provide a significant saving in the number of bits required to store and/or transmit, with performance comparable to their real valued CS counterparts. In future work, we plan to generalize the upper bound on the probability of error with 1-bit CS measurements when the distribution of the support set is not uniform. In a similar setting, we plan to investigate the problem of optimum quantizer design.

ACKNOWLEDGEMENT

The authors would like to thank H. Zayyani et al. for sharing their code for ImpNoise so that we could carry out performance comparisons.

APPENDIX A
PROOF OF THEOREM 2

First, we compute E{ã_pt^0} and E{ã_pt^1} and use them to derive the desired lower bound. Let $u_{pk} = \sum_{i=1}^{K} \tilde{\Phi}^p_{\mathcal{U}_k, 1i}\, \mu_i$. When the entries of the projection matrix Φ are i.i.d. Gaussian with mean zero and variance 1/N, it can be shown that u_pk is a Gaussian random variable with mean zero and variance (1/N)||μ||_2^2. Then we have

$E\{\tilde{a}_{pt}^0\} = E\{(1-\lambda_{pj})^{1/2} (1-\lambda_{pk})^{1/2} \,\big|\, |\mathcal{U}_j \cap \mathcal{U}_k| = t\} = \iint (1-\lambda_{pj}(u_{pj}))^{1/2} (1-\lambda_{pk}(u_{pk}))^{1/2} f_{U_k U_j}(u_{pk}, u_{pj})\, du_{pk}\, du_{pj},$

which can be found by a two-fold integration, where we write $\lambda_{pj}(u_{pj}) = Q\!\left( \frac{-u_{pj}}{\sqrt{\sigma_v^2 + \frac{K}{N}\sigma_s^2}} \right)$. In the high dimensional setting, given that |U_j ∩ U_k| = t, the joint pdf of (u_pk, u_pj), f_{U_k U_j}(u_pk, u_pj), tends to a bivariate Gaussian with mean 0 and covariance matrix Σ_t [13] given by

$\Sigma_t = \frac{\mu^T \mu}{N} \begin{pmatrix} 1 & \rho_t \\ \rho_t & 1 \end{pmatrix}$
where ρ_0 = 0, $\rho_t = \frac{\sum_{i=1}^{t} \tilde{\mu}_i}{\mu^T \mu}$ for t = 1, ..., K−1, $\tilde{\mu}_i = \{\mu_m \mu_n \; ; \; \text{if } \tilde{\Phi}^p_{\mathcal{U}_k,1m} = \tilde{\Phi}^p_{\mathcal{U}_j,1n} \text{ for } m, n = 1, 2, \cdots, K\}$ for |U_j ∩ U_k| = t, and $\tilde{\boldsymbol{\mu}}_t = [\tilde{\mu}_1, \cdots, \tilde{\mu}_t]^T$.

In the following, we evaluate E{ã_pt^0} and E{ã_pt^1}, and upper bound P_err. In contrast to [13], where s̃ was assumed to be first order Gaussian, we do not make any assumptions on s̃. Let u_pk = x, u_pj = y, $\sigma_p^2 = \sigma_v^2 + \frac{K}{N}\sigma_s^2$ and $\sigma^2 = \frac{1}{N}\|\mu\|_2^2$. From Equations (2) and (14) in [53], we have the following approximation

$Q(x) \approx \frac{1}{4}\left[ \frac{1}{3}\exp\!\left(-\frac{x^2}{2\sigma_p^2}\right) + \exp\!\left(-\frac{2x^2}{3\sigma_p^2}\right) \right] \qquad (27)$

for small x, where $Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{z^2}{2}\right) dz$. In [53], the authors have claimed that the right hand side (R.H.S.) of (27) also acts as a tight upper bound for x > 0.5. Accordingly, we also have

$\lambda_{pj}(x) \approx \begin{cases} \frac{1}{4}\left[ \frac{1}{3}\exp\!\left(-\frac{x^2}{2\sigma_p^2}\right) + \exp\!\left(-\frac{2x^2}{3\sigma_p^2}\right) \right] & \text{if } x < 0 \\ 1 - \frac{1}{4}\left[ \frac{1}{3}\exp\!\left(-\frac{x^2}{2\sigma_p^2}\right) + \exp\!\left(-\frac{2x^2}{3\sigma_p^2}\right) \right] & \text{if } x > 0 \end{cases}$

and similar is the case with λ_pk(y). Then, we have

$E\{\tilde{a}_{pt}^0\} = E\{(1-\lambda_{pj})^{1/2}(1-\lambda_{pk})^{1/2} \,\big|\, |\mathcal{U}_j \cap \mathcal{U}_k| = t\}$
$= (1-\lambda_{pj}(x<0))(1-\lambda_{pk}(y<0)) \Pr(x<0 \,\&\, y<0 \,\big|\, |\mathcal{U}_j \cap \mathcal{U}_k| = t)$
$+ (1-\lambda_{pj}(x<0))(1-\lambda_{pk}(y>0)) \Pr(x<0 \,\&\, y>0 \,\big|\, |\mathcal{U}_j \cap \mathcal{U}_k| = t)$
$+ (1-\lambda_{pj}(x>0))(1-\lambda_{pk}(y<0)) \Pr(x>0 \,\&\, y<0 \,\big|\, |\mathcal{U}_j \cap \mathcal{U}_k| = t)$
$+ (1-\lambda_{pj}(x>0))(1-\lambda_{pk}(y>0)) \Pr(x>0 \,\&\, y>0 \,\big|\, |\mathcal{U}_j \cap \mathcal{U}_k| = t).$
We need to find the following expression to calculate the upper bound on P_err:

$E\{\tilde{a}_{pt}^0\} = E\{\tilde{a}_{pt}^1\} = 2(I_{p1} + I_{p2} - I_{p3} - I_{p4}) + \frac{1}{4} + \frac{1}{2\pi}\sin^{-1}\rho_t \qquad (28)$

where

$I_{p1} = \int_{-\infty}^{0}\!\!\int_{-\infty}^{0} \frac{1}{4}\left[\frac{1}{3}e^{-\frac{x^2}{2\sigma_p^2}} + e^{-\frac{2x^2}{3\sigma_p^2}}\right] \times \frac{1}{4}\left[\frac{1}{3}e^{-\frac{y^2}{2\sigma_p^2}} + e^{-\frac{2y^2}{3\sigma_p^2}}\right] f(x,y)\, dx\, dy,$

$I_{p2} = \int_{0}^{\infty}\!\!\int_{-\infty}^{0} \frac{1}{4}\left[\frac{1}{3}e^{-\frac{y^2}{2\sigma_p^2}} + e^{-\frac{2y^2}{3\sigma_p^2}}\right] f(x,y)\, dx\, dy,$

$I_{p3} = \int_{0}^{\infty}\!\!\int_{-\infty}^{0} \frac{1}{4}\left[\frac{1}{3}e^{-\frac{x^2}{2\sigma_p^2}} + e^{-\frac{2x^2}{3\sigma_p^2}}\right] \times \frac{1}{4}\left[\frac{1}{3}e^{-\frac{y^2}{2\sigma_p^2}} + e^{-\frac{2y^2}{3\sigma_p^2}}\right] f(x,y)\, dx\, dy,$

$I_{p4} = \int_{0}^{\infty}\!\!\int_{0}^{\infty} \frac{1}{4}\left[\frac{1}{3}e^{-\frac{x^2}{2\sigma_p^2}} + e^{-\frac{2x^2}{3\sigma_p^2}}\right] f(x,y)\, dx\, dy,$

and f_{U_kU_j}(x, y) is bivariate Gaussian with mean zero and covariance matrix Σ_t:

$f(x,y) = \frac{1}{2\pi\sigma^2\sqrt{1-\rho_t^2}} \exp\!\left(-\frac{x^2 + y^2 - 2\rho_t x y}{2\sigma^2(1-\rho_t^2)}\right).$

Also, $\sigma^2 = \frac{1}{N}\|\mu\|_2^2$. The solution to the above integrals can easily be found using a change of variables to simplify them to the forms

$\int_{-\infty}^{0}\!\!\int_{-\infty}^{0} f_{U_kU_j}(u_k, u_j)\, du_k\, du_j = \frac{1}{4} + \frac{\arcsin(\rho_t)}{2\pi}$

and

$\int_{0}^{\infty}\!\!\int_{-\infty}^{0} f_{U_kU_j}(u_k, u_j)\, du_k\, du_j = \frac{1}{4} - \frac{\arcsin(\rho_t)}{2\pi}.$

The solution to (28) is

$E\{\tilde{a}_{pt}^0\} = E\{\tilde{a}_{pt}^1\} = 2(I_{p1} + I_{p2} - I_{p3} - I_{p4}) + \frac{1}{4} + \frac{1}{2\pi}\sin^{-1}\rho_t$
$= \frac{\sigma_p^2\sqrt{1-\rho_t^2}}{2\pi}\left[ \frac{1}{4}\left( \frac{\sin^{-1}\rho_{t1}}{9\sqrt{1-\rho_{t1}^2}} + \frac{2\sin^{-1}\rho_{t2}}{\sqrt{1-\rho_{t2}^2}} + \frac{3\sin^{-1}\rho_{t3}}{\sqrt{1-\rho_{t3}^2}} \right) - \left( \frac{\sin^{-1}\rho_{t4}}{3\sqrt{1-\rho_{t4}^2}} + \frac{\sin^{-1}\rho_{t5}}{\sqrt{1-\rho_{t5}^2}} \right) \right] + \frac{1}{4} + \frac{1}{2\pi}\sin^{-1}\rho_t \qquad (29)$

Here,

$\rho_{pt1} = \frac{\rho_t \sigma_p^2}{\sigma^2(1-\rho_t^2) + \sigma_p^2}; \quad \rho_{pt2} = \frac{\rho_t \sigma_p^2}{\sqrt{\left(\sigma^2(1-\rho_t^2) + \sigma_p^2\right)\left(\frac{4}{3}\sigma^2(1-\rho_t^2) + \sigma_p^2\right)}}; \quad \rho_{pt3} = \frac{\rho_t \sigma_p^2}{\frac{4}{3}\sigma^2(1-\rho_t^2) + \sigma_p^2};$

$\rho_{pt4} = \frac{\rho_t \sigma_p}{\sqrt{\sigma^2(1-\rho_t^2) + \sigma_p^2}}; \quad \text{and} \quad \rho_{pt5} = \frac{\rho_t \sigma_p}{\sqrt{\frac{4}{3}\sigma^2(1-\rho_t^2) + \sigma_p^2}}.$
If we assume the noise variances to be independent of p, we have the above solution for ā_{pt,2}(γ, K) = E{ã_pt^0} + E{ã_pt^1} independent of p. With ā_{pt,2}(γ, K), we upper bound P_err in (6). Let $\max_{0 \le t \le K-1} \bar{a}_{pt,2}(\gamma, K) = a_K(\gamma)$, where 0 < a_K < 1.
Then P_err with 1-bit quantization can be upper bounded as

$P_{err} \le \frac{1}{2} \sum_{t=0}^{K-1} \binom{K}{t} \binom{N-K}{K-t} \prod_{p=1}^{P} \left( \max_{1 \le t \le K} \bar{a}_{pt,2}(\gamma, K) \right)^{M} \le \frac{1}{2}\left[ \sum_{t=0}^{K} \binom{K}{t} \binom{N-K}{K-t} - 1 \right] (a_K(\gamma))^{MP} = \frac{1}{2}\left[ \binom{N}{K} - 1 \right] (a_K(\gamma))^{MP} < \frac{1}{2} \binom{N}{K} (a_K(\gamma))^{MP}. \qquad (30)$

Thus, to have a vanishing probability of error it is required that $M P \ge C_K K \log\frac{N}{K}$, where $C_K = \frac{1}{\log\frac{1}{a_K(\gamma)}}$ only depends on K.
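As a quick numerical illustration of this requirement, the sketch below evaluates the per-node measurement bound M ≥ C_K K log(N/K) / P for a few values of P; the value of a_K(γ) is an arbitrary assumed constant here, since computing it exactly requires evaluating (29).

import numpy as np

def measurements_per_node(N, K, P, a_K):
    """Per-node lower bound implied by M*P >= C_K * K * log(N/K), with C_K = 1/log(1/a_K)."""
    C_K = 1.0 / np.log(1.0 / a_K)
    return C_K * K * np.log(N / K) / P

for P in (1, 5, 10, 50):
    # illustrative values only: N = 1000, K = 20, assumed a_K(gamma) = 0.6
    print(P, measurements_per_node(N=1000, K=20, P=P, a_K=0.6))

The printed values decrease with P, illustrating that the required number of measurements per node vanishes as the number of sensors grows.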
REFERENCES
[1] D. Malioutov, M. Cetin, and A.Willsky, “A Sparse Signal Reconstruction Perspective for Source Localization with Sensor Arrays,” IEEE Trans. Signal Process., vol. 53, no. 8, pp. 3010–3022, Aug. 2005. [2] V. Cevher, P. Indyk, C. Hegde, and R. G. Baraniuk, “Recovery of Clustered Sparse Signals from Compressive Measurements,” in Int. Conf. Sampling Theory and Applications (SAMPTA 2009), Marseille, France, May. 2009, pp. 18–22. [3] B. K. Natarajan, “Sparse Approximate Solutions to Linear Systems,” SIAM J. Computing, vol. 24, no. 2, pp. 227–234, 1995. [4] A. J. Miller, Subset Selection in Regression. New York, NY: ChapmanHall, 1990. [5] E. G. Larsson and Y. Selen, “Linear Regression With a Sparse Parameter Vector,” IEEE Trans. Signal Process., vol. 55, no. 2, pp. 451–460, Feb.. 2007. [6] Z. Tian and G. Giannakis, “Compressed Sensing for Wideband Cognitive Radios,” in Proc. Acoust., Speech, Signal Process. (ICASSP), Honolulu, HI, Apr. 2007, pp. IV–1357–IV–1360. [7] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic Decomposition by Basis Pursuit,” SIAM J. Sci. Computing, vol. 20, no. 1, pp. 33–61, 1998. [8] A. K. Fletcher, S. Rangan, and V. K. Goyal, “Necessary and sufficient conditions for sparsity pattern recovery,” IEEE Trans. Inform. Theory, vol. 55, no. 12, pp. 5758–5772, Dec. 2009. [9] G. Reeves and M. Gastpar, “Sampling bounds for sparse support recovery in the presence of noise,” in IEEE Int. Symp. on Inf. Theory (ISIT), Toronto, ON, Jul. 2008, pp. 2187–2191. [10] J. D. Blanchard, M. Cermak, D. Hanle, and Y. Jing, “Greedy algorithms for joint sparse recovery,” IEEE Trans. Signal Process., vol. 62, no. 7, pp. 1694–1704, 2014. [11] K. S. Kim and S.-Y. Chung, “Greedy Subspace Pursuit for Joint Sparse Recovery,” ArXiv e-prints, Jan. 2016. [12] S. Park, N. Y. Yu, and H.-N. Lee, “An Information Theoretic Study for Noisy Compressed Sensing With Joint Sparsity Model-2,” ArXiv eprints, Apr. 2016. [13] T. Wimalajeewa and P. Varshney, “Performance Bounds for Sparsity Pattern Recovery With Quantized Noisy Random Projections,” IEEE J. Sel. Topics Signal Process., vol. 6, no. 1, pp. 43–57, Feb 2012. [14] G. Reeves and M. Gastpar, “A Note on Optimal Support Recovery in Compressed Sensing,” in 43rd Asilomar Conf. on Signals, Systems and Computers, Nov 2009, pp. 1576–1580. [15] H. Zayyani, M. Korki, and F. Marvasti, “Dictionary learning for blind one bit compressed sensing,” IEEE Signal Process. Letters, vol. 23, no. 2, pp. 187–191, 2016. [16] S. Gopi, P. Netrapalli, P. Jain, and A. V. Nori, “One-bit Compressed Sensing: Provable Support and Vector Recovery,” in Int. Conf. on Machine Learning (ICML). Journal of Machine Learning Research, June 2013. [17] M. Yan, Y. Yang, and S. Osher, “Robust 1-bit compressive sensing using adaptive outlier pursuit,” IEEE Trans. on Signal Process., vol. 60, no. 7, pp. 3868–3875, 2012. [18] H. Wang and Q. Wan, “One Bit Support Recovery,” in 6th Int. Conf. on Wireless Commun. Networking and Mobile Computing (WiCOM), 2010, Sept 2010, pp. 1–4.
14
[19] A. Zymnis, S. Boyd, and E. Candes, "Compressed Sensing With Quantized Measurements," IEEE Signal Process. Lett., vol. 17, no. 2, pp. 149–152, Feb. 2010.
[20] Y. Plan and R. Vershynin, "One-Bit Compressed Sensing by Linear Programming," Commun. on Pure and Applied Mathematics, vol. 66, no. 8, pp. 1275–1297, 2013. [Online]. Available: http://dx.doi.org/10.1002/cpa.21442
[21] P. Boufounos and R. Baraniuk, "1-Bit compressive sensing," in 42nd Annual Conf. on Inf. Sciences and Systems (CISS), March 2008, pp. 16–21.
[22] S. Bahmani, P. T. Boufounos, and B. Raj, "Robust 1-bit compressive sensing via gradient support pursuit," arXiv preprint arXiv:1304.6627, 2013.
[23] P. T. Boufounos, "Greedy sparse signal reconstruction from sign measurements," in 43rd Asilomar Conf. on Signals, Systems and Computers, 2009, pp. 1305–1309.
[24] J. N. Laska and R. G. Baraniuk, "Regime change: Bit-depth versus measurement-rate in compressive sensing," IEEE Trans. on Signal Process., vol. 60, no. 7, pp. 3496–3505, 2012.
[25] J. Fang, Y. Shen, L. Yang, and H. Li, "Adaptive one-bit quantization for compressed sensing," Signal Process., vol. 125, pp. 145–155, 2016.
[26] Y. Plan and R. Vershynin, "Robust 1-bit compressed sensing and sparse logistic regression: A convex programming approach," IEEE Trans. on Inf. Theory, vol. 59, no. 1, pp. 482–494, 2013.
[27] F. Li, J. Fang, H. Li, and L. Huang, "Robust one-bit Bayesian compressed sensing with sign-flip errors," IEEE Signal Process. Letters, vol. 22, no. 7, pp. 857–861, 2015.
[28] U. S. Kamilov, A. Bourquard, A. Amini, and M. Unser, "One-bit measurements with adaptive thresholds," IEEE Signal Process. Letters, vol. 19, no. 10, pp. 607–610, 2012.
[29] Q. Ling and Z. Tian, "Decentralized support detection of multiple measurement vectors with joint sparsity," in 2011 IEEE International Conference on Acoustics, Speech and Signal Process. (ICASSP), 2011, pp. 2996–2999.
[30] T. Wimalajeewa and P. Varshney, "OMP based joint sparsity pattern recovery under communication constraints," IEEE Trans. Signal Process., vol. 62, no. 19, pp. 5059–5072, Oct. 2014.
[31] S. Patterson, Y. Eldar, and I. Keidar, "Distributed compressed sensing for static and time-varying networks," IEEE Trans. on Signal Process., vol. 62, no. 9, pp. 4931–4946, Oct. 2014.
[32] F. Zeng, C. Li, and Z. Tian, "Distributed compressive spectrum sensing in cooperative multihop cognitive networks," IEEE J. Sel. Topics Signal Process., vol. 5, no. 1, pp. 37–48, Feb. 2011.
[33] J. Xiong and Q. Tang, "1-bit compressive data gathering for wireless sensor networks," Journal of Sensors, vol. 2014, 2014.
[34] Y. Tian, W. Xu, C. Zhang, Y. Wang, and H. Yang, "Joint reconstruction algorithms for one-bit distributed compressed sensing," in 22nd International Conference on Telecommunications (ICT), 2015, pp. 338–342.
[35] Y. Tian, W. Xu, Y. Wang, and H. Yang, "A distributed compressed sensing scheme based on one-bit quantization," in IEEE 79th Vehicular Technology Conference (VTC Spring), 2014, pp. 1–6.
[36] L. Pan, S. Xiao, B. Li, and X. Yuan, "Continuous-time signal recovery from 1-bit multiple measurement vectors," AEU-International J. of Electron. and Commun., vol. 76, pp. 132–136, 2017.
[37] S. Kafle, B. Kailkhura, T. Wimalajeewa, and P. K. Varshney, "Decentralized joint sparsity pattern recovery using 1-bit compressive sensing," in 2016 IEEE Global Conference on Signal and Information Process. (GlobalSIP), 2016, pp. 1354–1358.
[38] H. Zayyani, M. Korki, and F. Marvasti, "A distributed 1-bit compressed sensing algorithm robust to impulsive noise," IEEE Commun. Letters, vol. 20, no. 6, pp. 1132–1135, 2016.
[39] M. Lopes, "Estimating unknown sparsity in compressed sensing," in International Conference on Machine Learning, 2013, pp. 217–225.
[40] A. Lavrenko, F. Römer, G. Del Galdo, and R. S. Thomä, "Sparsity order estimation for sub-Nyquist sampling and recovery of sparse multiband signals," in IEEE International Conference on Communications (ICC), 2015, pp. 4907–4912.
[41] Y. Wang, Z. Tian, and C. Feng, "Sparsity order estimation and its application in compressive spectrum sensing for cognitive radios," IEEE Trans. on Wireless Commun., vol. 11, no. 6, pp. 2116–2125, 2012.
[42] Y. Wang, Z. Tian, and C. Feng, "A two-step compressed spectrum sensing scheme for wideband cognitive radios," in IEEE Global Telecommunications Conference (GLOBECOM 2010), 2010, pp. 1–5.
[43] V. Gupta, B. Kailkhura, T. Wimalajeewa, S. Liu, and P. K. Varshney, "Joint sparsity pattern recovery with 1-bit compressive sensing in sensor networks," in 2015 49th Asilomar Conference on Signals, Systems and Computers, 2015, pp. 1472–1476.
[44] J. A. Tropp, "Algorithms for simultaneous sparse approximation. Part II: Convex relaxation," Signal Process. (Special Issue on Sparse Approximations in Signal and Image Processing), vol. 86, no. 3, pp. 589–602, 2006.
[45] A. Chambolle, R. De Vore, N.-Y. Lee, and B. Lucier, "Nonlinear Wavelet Image Processing: Variational Problems, Compression, and Noise Removal through Wavelet Shrinkage," IEEE Trans. Image Process., vol. 7, no. 3, pp. 319–335, March 1998.
[46] M. Figueiredo and R. Nowak, "An EM Algorithm for Wavelet-based Image Restoration," IEEE Trans. Image Process., vol. 12, no. 8, pp. 906–916, Aug. 2003.
[47] A. Beck and M. Teboulle, "A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems," SIAM J. Imaging Sci., vol. 2, no. 1, pp. 183–202, 2009. [Online]. Available: http://dx.doi.org/10.1137/080716542
[48] S. Sra, "Generalized Proximity and Projection with Norms and Mixed-norms," Technical Report 192, Max Planck Institute for Biological Cybernetics, May 2010.
[49] L. Jacques, J. N. Laska, P. T. Boufounos, and R. G. Baraniuk, "Robust 1-bit compressive sensing via binary stable embeddings of sparse vectors," IEEE Trans. Inf. Theory, vol. 59, no. 4, pp. 2082–2102, 2013.
[50] T. Blumensath and M. E. Davies, "Iterative hard thresholding for compressed sensing," Applied and Computational Harmonic Analysis, vol. 27, no. 3, pp. 265–274, 2009.
[51] R. Heckel and H. Bolcskei, "Joint sparsity with different measurement matrices," in 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2012, pp. 698–702.
[52] L. Li, X. Huang, and J. A. Suykens, "Signal recovery for jointly sparse vectors with different sensing matrices," Signal Processing, vol. 108, pp. 451–458, 2015.
[53] M. Chiani, D. Dardari, and M. K. Simon, "New exponential bounds and approximations for the computation of error probability in fading channels," IEEE Trans. Wireless Commun., vol. 2, no. 4, pp. 840–845, July 2003.
Swatantra Kafle received his B.E. in Electronics and Communications from I.O.E., Pulchowk Campus, Tribhuvan University, Nepal, in 2011. He has been pursuing the Ph.D. degree in the Department of Electrical Engineering and Computer Science at Syracuse University since 2014. His research interests include statistical signal processing, compressed sensing, machine learning, and optimization.
Vipul Gupta received his B.Tech and M.Tech degrees from the Indian Institute of Technology Kanpur, India, in 2015 and 2016, respectively. Since then, he has been working toward the Ph.D. degree in the Department of Electrical Engineering and Computer Science at the University of California, Berkeley. His current research interests include designing robust algorithms for distributed computation using ideas from information and coding theory.
Bhavya Kailkhura is a research staff member at Lawrence Livermore National Laboratory, Livermore, CA. His research interests include high dimensional data analytics, robust statistics and control, and machine learning. He was the runner-up for the Best Student Paper Award at the IEEE Asilomar Conference on Signals, Systems and Computers, 2014, and a recipient of an SPS travel grant award. He received the 2017 All University Doctoral Prize from Syracuse University for superior achievement in completed dissertations.
Thakshila Wimalajeewa (S'07-M'10) received the B.Sc. (First Class Hons.) degree in electronic and telecommunication engineering from the University of Moratuwa, Moratuwa, Sri Lanka, in 2004, and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of New Mexico, Albuquerque, NM, USA, in 2007 and 2009, respectively. From 2010 to 2012, she was a Postdoctoral Research Associate in the Department of Electrical Engineering and Computer Science, Syracuse University (SU), Syracuse, NY, USA. She currently holds a research faculty position at SU. Her research interests include the broad areas of communication theory, signal processing, and information theory. Her current research focuses on data fusion, compressive sensing, low dimensional signal processing for communication systems, and resource optimization and decentralized processing in sensor networks.
Pramod K. Varshney (S'72-M'77-SM'82-F'97) was born in Allahabad, India, on July 1, 1952. He received the B.S. degree in electrical engineering and computer science (with highest honors), and the M.S. and Ph.D. degrees in electrical engineering from the University of Illinois at Urbana-Champaign, USA, in 1972, 1974, and 1976, respectively. From 1972 to 1976, he held teaching and research assistantships with the University of Illinois. Since 1976, he has been with Syracuse University, Syracuse, NY, where he is currently a Distinguished Professor of Electrical Engineering and Computer Science and the Director of CASE: Center for Advanced Systems and Engineering. He served as the associate chair of the department from 1993 to 1996. He is also an Adjunct Professor of Radiology at Upstate Medical University, Syracuse. His current research interests are in distributed sensor networks and data fusion, detection and estimation theory, wireless communications, image processing, radar signal processing, and remote sensing. He has published extensively. He is the author of Distributed Detection and Data Fusion (New York: Springer-Verlag, 1997). He has served as a consultant to several major companies. Dr. Varshney was a James Scholar, a Bronze Tablet Senior, and a Fellow while at the University of Illinois. He is a member of Tau Beta Pi and is the recipient of the 1981 ASEE Dow Outstanding Young Faculty Award. He was elected to the grade of Fellow of the IEEE in 1997 for his contributions in the area of distributed detection and data fusion. He was the Guest Editor of the Special Issue on Data Fusion of the Proceedings of the IEEE, January 1997. In 2000, he received the Third Millennium Medal from the IEEE and the Chancellor's Citation for exceptional academic achievement at Syracuse University. He is the recipient of the IEEE 2012 Judith A. Resnik Award, the degree of Doctor of Engineering honoris causa from Drexel University in 2014, and the ECE Distinguished Alumni Award from UIUC in 2015. He is on the Editorial Board of the Journal of Advances in Information Fusion. He was the President of the International Society of Information Fusion in 2001.