BAYESIAN COMPRESSED SENSING USING GENERALIZED CAUCHY PRIORS

Rafael E. Carrillo, Tuncer C. Aysal and Kenneth E. Barner
Department of Electrical and Computer Engineering, University of Delaware
ABSTRACT

Compressed sensing shows that a sparse or compressible signal can be reconstructed from a few incoherent measurements. Noting that sparse signals can be well modeled by algebraic-tailed impulsive distributions, in this paper we formulate the sparse recovery problem in a Bayesian framework using algebraic-tailed priors from the generalized Cauchy distribution (GCD) family for the signal coefficients. We develop an iterative reconstruction algorithm from this Bayesian formulation. Simulation results show that the proposed method requires fewer samples than most existing reconstruction methods to recover sparse signals, thereby validating the use of GCD priors for the sparse reconstruction problem.

Index Terms: Compressed sensing, Bayesian methods, signal reconstruction, nonlinear estimation, impulse noise

1. INTRODUCTION

Compressed sensing (CS) is a novel framework that departs from the traditional data acquisition paradigm. CS demonstrates that a sparse, or compressible, signal can be acquired using a low-rate acquisition process that projects the signal onto a small set of vectors incoherent with the sparsity basis [1]. Let $x \in \mathbb{R}^n$ be a sparse signal, and let $y = \Phi x$ be a set of measurements, where $\Phi$ is an $m \times n$ sensing matrix ($m < n$). The optimal algorithm to recover $x$ from the measurements is

$$\min_x \|x\|_0 \quad \text{subject to} \quad \Phi x = y \qquad (1)$$
(optimal in the sense that it finds the sparsest vector $x$ that is consistent with the measurements). Since noise is always present in real data acquisition systems, the acquisition process can be modeled as

$$y = \Phi x + n \qquad (2)$$
where $n$ represents the sampling noise. The problem in (1) is combinatorial and NP-hard; however, a range of algorithms have been developed that enable approximate reconstruction of sparse signals from noisy compressive measurements [1, 2, 3]. The most common approach is Basis Pursuit Denoising (BPD) [1], which uses the unconstrained convex program

$$\min_x \|y - \Phi x\|_2^2 + \lambda \|x\|_1 \qquad (3)$$
to estimate a solution of the problem. A family of iterative greedy algorithms ([2] and references therein) are shown to enjoy a similar approximate reconstruction property, generally with lower computational complexity. However, these algorithms require more measurements for exact reconstruction than the $\ell_1$ minimization approach.

Recent works show that nonconvex optimization problems can recover a sparse signal with fewer measurements than current geometric methods, while preserving the same reconstruction quality [4, 5]. In [4], the authors replace the $\ell_1$ norm in BPD with the $\ell_p$ norms, for $0 < p < 1$, to approximate the $\ell_0$ norm and encourage sparsity in the solution. Candès et al. use a re-weighted $\ell_1$ minimization approach to find a sparse solution in [5]. The idea is that giving a large weight to small components encourages sparse solutions. The CS reconstruction problem can also be formulated in a Bayesian framework (see [6] and references therein), where the coefficients of $x$ are modeled with Laplacian priors and a solution is iteratively constructed.

The basic premise in CS is that a small set of coefficients in the signal have larger values than the rest of the coefficients (ideally zero), yielding a very impulsive characterization. Algebraic-tailed distributions put more mass on very high amplitude values and also on "zero-like" small values, and are therefore more suitable models for the sparse coefficients of compressible signals. In this paper, we formulate the CS recovery problem in a Bayesian framework using algebraic-tailed priors from the generalized Cauchy distribution (GCD) family for the signal coefficients. An iterative reconstruction algorithm is developed from this Bayesian formulation. Simulation results show that GCD priors are a good model for sparse representations. Numerical results also show that the proposed method requires fewer samples than most existing recovery strategies to perform the reconstruction.

2. BAYESIAN MODELING AND GENERALIZED CAUCHY DISTRIBUTION

In Bayesian modeling, all unknowns are treated as stochastic quantities with assigned probability distributions. Consider
the observation model in (2). The unknown signal $x$ is modeled by a prior distribution $p(x)$, which represents the a priori knowledge about the signal. The observation $y$ is modeled by the likelihood function $p(y|x)$. Modeling the sampling noise as white Gaussian noise and using a Laplacian prior for $x$, the maximum a posteriori (MAP) estimate of $x$ is equivalent to finding the solution of (3) [6].

The generalized Cauchy distribution (GCD) family has algebraic tails, which makes it suitable to model many impulsive processes in real life (see [7, 8] and references therein). The PDF of the GCD is given by

$$f(z) = a\delta\left(\delta^p + |z|^p\right)^{-2/p} \qquad (4)$$

with $a = p\,\Gamma(2/p)/\left(2(\Gamma(1/p))^2\right)$. In this representation, $\delta$ is the scale parameter and $p$ is the tail constant. The GCD family contains the meridian [7] and Cauchy distributions as special cases with $p = 1$ and $p = 2$, respectively. For $p < 2$, the tail of the PDF decays slower than in the Cauchy distribution, resulting in a heavier-tailed PDF. Similar to the $\ell_p$ norms derived from the generalized Gaussian density (GGD) family, a family of robust metrics is derived from the GCD family [3].
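As a quick numerical illustration of (4), the density can be evaluated directly from its closed form; the following is a minimal Python sketch (the function name is ours):

```python
import numpy as np
from math import gamma

def gcd_pdf(z, delta, p):
    """Generalized Cauchy PDF of (4): f(z) = a*delta*(delta^p + |z|^p)^(-2/p),
    with normalizing constant a = p*Gamma(2/p) / (2*Gamma(1/p)^2)."""
    a = p * gamma(2.0 / p) / (2.0 * gamma(1.0 / p) ** 2)
    return a * delta * (delta ** p + np.abs(z) ** p) ** (-2.0 / p)

# p = 1 recovers the meridian density and p = 2 the Cauchy density.
```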
Definition 1. For $u \in \mathbb{R}^n$, the $LL_p$ norm of $u$ is defined as

$$\|u\|_{LL_p,\delta} = \sum_{i=1}^{n} \log\left\{1 + \delta^{-p}|u_i|^p\right\}, \quad \delta > 0. \qquad (5)$$

The $LL_p$ norm (quasi-norm) does not over-penalize large deviations, and is therefore a robust metric appropriate for impulsive environments [3].
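Definition (5) transcribes directly into numpy; this illustrative sketch (the function name is ours) uses log1p for accuracy when the argument is small:

```python
import numpy as np

def ll_p_norm(u, delta, p=1.0):
    """LL_p quasi-norm of Definition 1, eq. (5):
    sum_i log(1 + delta^(-p) * |u_i|^p), with delta > 0."""
    u = np.asarray(u, dtype=float)
    return np.sum(np.log1p(np.abs(u) ** p / delta ** p))
```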
3. BAYESIAN COMPRESSED SENSING WITH MERIDIAN PRIORS

Of interest here is the development of a sparse reconstruction strategy using a Bayesian framework. To encourage sparsity in the solution, we propose the use of meridian priors for the signal model. The meridian distribution possesses heavier tails than the Laplacian distribution, thus yielding more impulsive (sparser) signal models and intuitively lowering the number of samples needed to perform the reconstruction.

We model the sampling noise as independent, zero-mean, Gaussian distributed samples with variance $\sigma^2$. Using the observation model in (2), the likelihood function becomes

$$p(y|x;\sigma) = \mathcal{N}(\Phi x, \Sigma), \quad \Sigma = \sigma^2 I. \qquad (6)$$

Assuming the signal $x$ (or its coefficients in a sparse basis) are independent meridian distributed samples yields the following prior

$$p(x|\delta) = \prod_{i=1}^{n} \frac{1}{2}\,\frac{\delta}{(\delta + |x_i|)^2}. \qquad (7)$$
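For reference, a small sketch of the meridian density (7) together with an inverse-CDF sampler; the sampler is our own derivation from (7), not part of the paper, and is the kind of generator one could use to produce the meridian distributed test signals of Section 5:

```python
import numpy as np

def meridian_pdf(x, delta):
    """Meridian density of (7): p(x) = (delta/2) / (delta + |x|)^2."""
    return 0.5 * delta / (delta + np.abs(x)) ** 2

def sample_meridian(n, delta, rng=np.random.default_rng()):
    """Draw meridian samples by inverting the CDF implied by (7):
    F(x) = delta / (2*(delta + |x|)) for x < 0 and 1 - delta / (2*(delta + x)) for x >= 0,
    which inverts to x = delta * (2u - 1) / (2 * min(u, 1 - u)), u ~ Uniform(0, 1)."""
    u = rng.uniform(size=n)
    return delta * (2.0 * u - 1.0) / (2.0 * np.minimum(u, 1.0 - u))
```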
Since $p(x|y;\sigma,\delta) \propto p(y|x;\sigma)\,p(x|\delta)$, the MAP estimate, assuming $\sigma$ and $\delta$ known, is

$$\hat{x} = \arg\min_x \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|x\|_{LL_1,\delta}, \qquad (8)$$

where $\lambda = 2\sigma^2$. One remark to make is that the $LL_1$ norm has previously been used to approximate the $\ell_0$ norm, but without making a statistical connection to the signal model. The re-weighted $\ell_1$ approach proposed in [5] is equivalent to finding a solution of the first-order approximation of the problem in (8) using a decreasing sequence for $\delta$.
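The objective in (8) is easy to state explicitly in code; a minimal sketch reusing the ll_p_norm helper above (both names are ours) and the relation $\lambda = 2\sigma^2$:

```python
import numpy as np

def map_objective(x, y, Phi, sigma2, delta):
    """MAP objective of (8): 0.5*||y - Phi x||_2^2 + lambda*||x||_{LL1,delta},
    with lambda = 2*sigma^2 (Gaussian likelihood, meridian prior)."""
    lam = 2.0 * sigma2
    residual = y - Phi @ x
    return 0.5 * residual @ residual + lam * ll_p_norm(x, delta, p=1.0)
```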
4. FIXED POINT ALGORITHM

In this paper, instead of directly minimizing (8), we develop a fixed point search to find a sparse solution. The fixed point algorithm is based on first-order optimality conditions and is inspired by the robust statistics literature [9]. Let $x^*$ be a stationary point of (8); then the first-order optimality condition is

$$\Phi^T\Phi x^* - \Phi^T y + \lambda\nabla\|x^*\|_{LL_1,\delta} = 0. \qquad (9)$$

Noting that the gradient $\nabla\|x^*\|_{LL_1,\delta}$ can be expressed as

$$\nabla\|x^*\|_{LL_1,\delta} = W(x^*)\,x^*, \qquad (10)$$

where $W(x)$ is a diagonal matrix with diagonal elements given by

$$W_{ii}(x) = \left[(\delta + |x_i|)\,|x_i|\right]^{-1}, \qquad (11)$$

the first-order optimality condition (9) is equivalent to

$$\Phi^T\Phi x^* - \Phi^T y + \lambda W(x^*)\,x^* = 0. \qquad (12)$$

Solving for $x^*$ we find the fixed point function

$$x^* = \left[\Phi^T\Phi + \lambda W(x^*)\right]^{-1}\Phi^T y = W^{-1}(x^*)\,\Phi^T\left[\Phi W^{-1}(x^*)\Phi^T + \lambda I\right]^{-1} y. \qquad (13)$$

The fixed point search uses the solution at the previous iteration as input to update the solution. The estimate at iteration $t+1$ is given by

$$\hat{x}_{t+1} = W^{-1}(\hat{x}_t)\,\Phi^T\left[\Phi W^{-1}(\hat{x}_t)\Phi^T + \lambda I\right]^{-1} y. \qquad (14)$$
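A minimal numpy sketch of the update (14), forming $W^{-1}$ directly from (11) (the small floor on $|x_i|$ is our own numerical safeguard, not part of the paper):

```python
import numpy as np

def mbcs_step(x_t, y, Phi, lam, delta):
    """One fixed-point update of (14); W is the diagonal matrix of (11)."""
    w_inv = (delta + np.abs(x_t)) * np.maximum(np.abs(x_t), 1e-12)  # diag of W^{-1}
    A = (Phi * w_inv) @ Phi.T + lam * np.eye(Phi.shape[0])          # Phi W^{-1} Phi^T + lam*I
    return w_inv * (Phi.T @ np.linalg.solve(A, y))                  # W^{-1} Phi^T A^{-1} y
```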
The fixed point algorithm turns out to be a reweighted least squares recursion, which iteratively finds a solution and updates the weight matrix using (11). As in other robust regression problems, the estimate in (8) is scale dependent ($\delta$ in the meridian prior formulation). In fact, $\delta$ controls the sparsity of the solution, and in the limiting case $\delta \rightarrow 0$ the solution of (8) is equivalent to the $\ell_0$ norm solution [5]. To address this problem we propose to jointly estimate $\delta$ and $x$ at each iteration, similar to joint scale-location estimates [8, 9].
A fast way to estimate $\delta$ from $x$ is using order statistics (although more elaborate estimates can be used, as in [8]). Let $X$ be a meridian distributed random variable with zero location and scale parameter $\delta$, and denote the $q$-th quartile of $X$ as $X_{(q)}$. The interquartile distance is $X_{(3)} - X_{(1)} = 2\delta$; thus, a fast estimate of $\delta$ is half the interquartile distance of the samples $x$. Let $Q_{t(q)}$ denote the $q$-th quartile of the estimate $\hat{x}_t$ at time $t$; then the estimate of $\delta$ at iteration $t$ is given by

$$\hat{\delta}_t = 0.5\left(Q_{t(3)} - Q_{t(1)}\right). \qquad (15)$$
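In code, (15) is essentially one line with numpy's percentile routine (a sketch; the floor delta_min anticipates the safeguard used in Algorithm 1 below):

```python
import numpy as np

def estimate_delta(x_t, delta_min=1e-8):
    """Scale estimate (15): half the interquartile distance of the current iterate,
    floored at delta_min to prevent numerical instabilities."""
    q1, q3 = np.percentile(x_t, [25, 75])
    return max(0.5 * (q3 - q1), delta_min)
```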
To summarize, the final algorithm is depicted in Algorithm 1, where $J$ is the maximum number of iterations and $\gamma$ is a tolerance parameter for the error between subsequent solutions. To prevent numerical instabilities we pre-define a minimum value for $\delta$, denoted $\delta_{min}$. We start the recursion with the LS solution ($W = I$) and we also assume a known noise variance $\sigma^2$ (recall $\lambda = 2\sigma^2$). The resulting algorithm is coined meridian Bayesian compressed sensing (MBCS).

Algorithm 1 MBCS
Require: $\lambda$, $\delta_{min}$, $\gamma$ and $J$.
1: Initialize $t = 0$ and $\hat{x}_0 = \Phi^T(\Phi\Phi^T + \lambda I)^{-1} y$.
2: while $\|\hat{x}_t - \hat{x}_{t-1}\|_2 > \gamma$ and $t < J$ do
3:   Update $\hat{\delta}_t$ and $W$.
4:   Compute $\hat{x}_{t+1}$ as in equation (14).
5:   $t \leftarrow t + 1$
6: end while
7: return $\hat{x}$

As mentioned in the last section, the reweighted $\ell_1$ approach of [5] and MBCS minimize the same objective. Moreover, reweighted $\ell_1$ may require fewer iterations to converge, but the computational cost of one iteration of MBCS is substantially lower than that of an iteration of reweighted $\ell_1$, thereby resulting in a faster algorithm.
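Putting the pieces together, Algorithm 1 might be sketched as follows; this is an illustrative, non-authoritative Python implementation that reuses the estimate_delta and mbcs_step helpers above (all names and default values except delta_min are ours):

```python
import numpy as np

def mbcs(y, Phi, sigma2, delta_min=1e-8, gamma=1e-6, J=100):
    """Sketch of Algorithm 1 (MBCS), with lambda = 2*sigma^2 as in (8)."""
    m, n = Phi.shape
    lam = 2.0 * sigma2
    # Step 1: LS initialization (W = I).
    x = Phi.T @ np.linalg.solve(Phi @ Phi.T + lam * np.eye(m), y)
    for _ in range(J):
        delta = estimate_delta(x, delta_min)        # scale update, eq. (15)
        x_new = mbcs_step(x, y, Phi, lam, delta)    # fixed-point update, eq. (14)
        done = np.linalg.norm(x_new - x) <= gamma   # tolerance check of Algorithm 1
        x = x_new
        if done:
            break
    return x
```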
5. EXPERIMENTAL RESULTS

In this section we present numerical experiments that illustrate the effectiveness of MBCS for sparse signal reconstruction. For all experiments we use random Gaussian measurement matrices with normalized columns and $\delta_{min} = 10^{-8}$ in the algorithm.

The first experiment shows the validity of the joint estimation approach of MBCS. Meridian distributed signals with length $n = 1000$ and $\delta \in \{10^{-3}, 10^{-2}, 10^{-1}\}$ are used. The signals are sampled taking $m = 200$ measurements, and zero-mean Gaussian distributed sampling noise with variance $\sigma^2 = 10^{-2}$ is added. Table 1 shows the average reconstruction SNR (R-SNR) over 200 repetitions. The performance loss is approximately 6 dB in the worst case, but fully automated MBCS still yields a good reconstruction.

Table 1. Comparison of reconstruction quality between known-$\delta$ and estimated-$\delta$ MBCS. Meridian distributed signals, $n = 1000$, $m = 200$. R-SNR (dB).

                        $\delta = 10^{-3}$   $\delta = 10^{-2}$   $\delta = 10^{-1}$
    Known $\delta$            9.91                 21.5                 30.69
    Estimated $\delta$        8.16                 17.58                24.98

The next set of experiments compares MBCS with current reconstruction strategies for noiseless and noisy samples. The algorithms used for comparison are $\ell_1$ minimization [1], re-weighted $\ell_1$ minimization [5], RWLS approaching $\ell_p$ [4], and CoSaMP [2]. We use $k$-sparse signals ($k$ nonzero coefficients) of length $n = 1000$, in which the amplitudes of the nonzero coefficients are Gaussian distributed with zero mean and standard deviation $\sigma_x = 10$. Each experiment is averaged over 200 repetitions.

The first of these experiments evaluates MBCS in a noiseless setting for different sparsity levels, fixing $m = 200$. We use the probability of exact reconstruction as a measure of performance, where a reconstruction is considered exact if $\|\hat{x} - x\|_\infty \leq 10^{-4}$. The results are shown in Fig. 1 (Left). MBCS outperforms CoSaMP and $\ell_1$ minimization (giving a larger probability of success for larger values of $k$) and yields slightly better performance than $\ell_p$ minimization. It is worth noting that MBCS has similar performance to reweighted $\ell_1$, since they minimize the same objective, but with a different approach.

The second experiment evaluates MBCS in the noisy case, varying the number of samples $m$ and fixing $k = 10$. The sampling noise is Gaussian distributed with variance $\sigma^2 = 10^{-2}$. The R-SNR is used as the performance metric. Results are presented in Fig. 1 (Middle). In the noisy case MBCS outperforms all other reconstruction strategies, yielding a larger R-SNR for fewer samples, with a good approximation for 60 samples and above. Moreover, the R-SNR of MBCS is better than that of reweighted $\ell_1$ minimization. An explanation for this is that $\ell_1$ minimization methods suffer from bias problems and need a de-biasing step after the solution is found (see [3] and references therein); therefore, to achieve similar performance an additional step is needed for reweighted $\ell_1$.

As a practical experiment, we present an example utilizing a $256 \times 256$ image. We use Daubechies db8 wavelets as the sparse basis, and the number of measurements, $m$, is set to $256 \times 256/4$ (25% of the number of pixels of the original image). Fig. 1 (Right) shows a zoom of the normalized histogram of the coefficients along with plots of the meridian and Laplacian distributions. It can be noticed that the meridian is a better fit for the tails of the coefficient distribution. Fig. 2 (Left) shows the original image, Fig. 2 (Middle) the reconstruction by $\ell_1$ minimization, and Fig. 2 (Right) the reconstruction by MBCS. The reconstruction SNR is 15.2 dB and 19.3 dB for $\ell_1$ minimization and MBCS, respectively. This example shows the effectiveness of MBCS in modeling and recovering sparse representations of real signals.
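For completeness, a hedged sketch of the $k$-sparse noisy setup described above, reusing the mbcs sketch from Section 4 (the random seed and the exact R-SNR expression are our assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k, sigma_x, sigma2 = 1000, 200, 10, 10.0, 1e-2   # values quoted in Section 5

# k-sparse signal with Gaussian-distributed nonzero amplitudes.
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = sigma_x * rng.standard_normal(k)

# Random Gaussian measurement matrix with normalized columns.
Phi = rng.standard_normal((m, n))
Phi /= np.linalg.norm(Phi, axis=0)

y = Phi @ x + np.sqrt(sigma2) * rng.standard_normal(m)  # noisy measurements, model (2)

x_hat = mbcs(y, Phi, sigma2)                            # MBCS sketch from Section 4
r_snr = 10 * np.log10(np.sum(x**2) / np.sum((x - x_hat)**2))  # reconstruction SNR in dB
```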
[Fig. 1 plot data omitted. Left/Middle panels: probability of exact reconstruction vs. number of nonzero elements $k$, and reconstruction SNR (dB) vs. number of samples $m$, for $\ell_1$ minimization, reweighted $\ell_1$, RWLS-$\ell_p$, CoSaMP, and MBCS. Right panel: normalized histogram of wavelet coefficient values with meridian and Laplacian fits.]
Fig. 1. MBCS experiments. L: $k$-sparse (noiseless case), M: $k$-sparse (noisy measurements), R: Wavelet coefficient histogram.
Fig. 2. Image reconstruction example. L: Original image, M: $\ell_1$ reconstruction, R: MBCS reconstruction.

6. CONCLUSIONS

In this paper, we formulate the CS recovery problem in a Bayesian framework using algebraic-tailed priors from the GCD family for the signal coefficients. An iterative reconstruction algorithm, referred to as MBCS, is developed from this Bayesian formulation. Simulation results show that the proposed method requires fewer samples than most existing reconstruction algorithms for compressed sensing, thereby validating the use of GCD priors for sparse reconstruction problems. Methods to estimate the sampling noise variance remain an open problem. A future research direction is to explore the use of GCD priors with $p$ different from 1 to give more flexibility in the sparsity model.
7. REFERENCES

[1] E. J. Candès and M. B. Wakin, "An introduction to compressive sampling," IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 21-30, Mar. 2008.

[2] D. Needell and J. A. Tropp, "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples," Applied and Computational Harmonic Analysis, vol. 26, no. 3, pp. 301-321, Apr. 2008.

[3] R. E. Carrillo, K. E. Barner, and T. C. Aysal, "Robust sampling and reconstruction methods for sparse signals in the presence of impulsive noise," IEEE Journal of Selected Topics in Signal Processing, accepted for publication.

[4] R. Chartrand and V. Staneva, "Restricted isometry properties and nonconvex compressive sensing," Inverse Problems, vol. 24, no. 3, 2008.

[5] E. J. Candès, M. Wakin, and S. Boyd, "Enhancing sparsity by reweighted ℓ1 minimization," Journal of Fourier Analysis and Applications, vol. 14, no. 5, pp. 877-905, Dec. 2008, special issue on sparsity.

[6] S. D. Babacan, R. Molina, and A. K. Katsaggelos, "Bayesian compressive sensing using Laplace priors," IEEE Transactions on Image Processing, 2009, accepted for publication.

[7] T. C. Aysal and K. E. Barner, "Meridian filtering for robust signal processing," IEEE Transactions on Signal Processing, vol. 55, no. 8, pp. 3949-3962, Aug. 2007.

[8] R. E. Carrillo, T. C. Aysal, and K. E. Barner, "Generalized Cauchy distribution based robust estimation," in Proceedings, IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Apr. 2008.

[9] P. J. Huber, Robust Statistics, John Wiley & Sons, Inc., 1981.