Ordered Orthogonal Matching Pursuit

Deepak Baby and Sibi Raj B Pillai
Department of Electrical Engineering, Indian Institute of Technology Bombay, 400076, India
Email: deepakbaby,[email protected]

Abstract—Compressed Sensing deals with recovering sparse signals from a relatively small number of linear measurements. Several algorithms exist for data recovery from the compressed measurements; particularly appealing among these is the greedy approach known as Orthogonal Matching Pursuit (OMP). In this paper, we propose a modified OMP based algorithm called Ordered Orthogonal Matching Pursuit (Ordered OMP). Ordered OMP is conceptually simple and provides improved performance when compared to OMP.

I. INTRODUCTION

Compressed Sensing (CS) provides a compact approach for dealing with high-dimensional sparse data. CS exploits the fact that most real-life signals have a sparse representation in an intelligently chosen domain [1], for example, the wavelet domain for images. Provided that the signal has a sparse representation, there are several approaches for recovering it from a relatively small number of random linear measurements.

The basic setting in a CS problem is as follows. We have to recover a sparse signal x of dimension d, i.e., a vector x ∈ R^d, from a set of linear measurements y obtained as

y = Ax + e,    (1)

where y ∈ R^n, n ≪ d, A ∈ R^{n×d} is the measurement matrix and e is the additive noise. The signal x is called S-sparse if its ℓ0 norm satisfies ‖x‖_0 ≤ S, where ‖x‖_0 denotes the number of non-zero entries of x.

We can recast the recovery as an optimization problem which can then be solved using linear programming. The theory of CS guarantees that we can successfully recover x by solving an ℓ1-optimization problem if the measurement matrix A satisfies the Restricted Isometry Property (RIP) [2]. A matrix A satisfies the RIP of order S with constant δ_S if, for every S-sparse x,

(1 − δ_S)‖x‖_2^2 ≤ ‖Ax‖_2^2 ≤ (1 + δ_S)‖x‖_2^2.    (2)

It is known that the measurement matrix A satisfies the RIP with high probability if its entries are drawn from a Gaussian ensemble. The recovery problem then becomes min ‖x‖_1 subject to Ax = y. But as the dimension d increases, the cost of solving such a minimization problem also increases. An alternate approach is to use greedy algorithms like OMP [3] and CoSaMP [4], which work in an iterative fashion.
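To illustrate the ℓ1 route concretely (an addition of ours, not part of the original paper), the problem min ‖x‖_1 subject to Ax = y can be posed as a linear program by splitting x into non-negative parts. A minimal SciPy sketch, with illustrative names:

import numpy as np
from scipy.optimize import linprog

def l1_recover(A, y):
    # Solve min ||x||_1 s.t. Ax = y by writing x = xp - xn with xp, xn >= 0.
    n, d = A.shape
    c = np.ones(2 * d)                   # objective: sum(xp) + sum(xn) = ||x||_1
    A_eq = np.hstack([A, -A])            # A (xp - xn) = y
    res = linprog(c, A_eq=A_eq, b_eq=y)  # default variable bounds are (0, None)
    return res.x[:d] - res.x[d:]

# Toy usage: a 3-sparse signal recovered from 40 Gaussian measurements.
rng = np.random.default_rng(0)
d, n, S = 64, 40, 3
A = rng.normal(size=(n, d)) / np.sqrt(n)
x = np.zeros(d); x[rng.choice(d, S, replace=False)] = 1.0
x_hat = l1_recover(A, A @ x)
print(np.allclose(x, x_hat, atol=1e-4))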


The most popular of these greedy algorithms is OMP. Let us denote the columns of A as a_1, a_2, ..., a_d, where a_i is the i-th column of A. Let Λ_t be the identified support set and r_t the residue vector at iteration t, with r_1 = y. Let A_{Λ_t} be the sub-matrix formed by the columns selected up to iteration t and A_{Λ_t^c} contain the remaining columns of A, so that A_{Λ_t} ∪ A_{Λ_t^c} = A. In every iteration, we pick the column in A_{Λ_t^c} that has maximum correlation with r_t and add it to the support set Λ_t. Using the columns indexed by Λ_t, we find the best fit for x by solving a least squares problem. The residue r_t is then updated by subtracting the contribution of the columns in the support set. This makes the residue orthogonal to the columns in the present support set, so that a new column is selected in the next iteration. Thus, at the end of S iterations, the support set contains S elements, which are supposed to form the correct support. The algorithm is detailed below [3].

Input
• The n × d measurement matrix A
• The n-dimensional measurement vector y
• Sparsity level S of the actual signal x

Procedure
1) Initialize: the residue vector r_1 = y, the support set Λ_0 = ∅ and the sub-matrix A_0 = [ ] (empty)
2) For t = 1 to S do
3) Find the support index λ_t = arg max_{j=1,...,d} |⟨r_t, a_j⟩|
4) Expand the support set Λ_t = Λ_{t−1} ∪ {λ_t} and the sub-matrix A_t = [A_{t−1}  a_{λ_t}]
5) Find x_t by solving x_t = arg min_u ‖A_t u − y‖_2
6) Update the residue vector r_{t+1} = y − A_t x_t
7) End for

Output
• An estimate x̂ ∈ R^d of the original signal x, where the non-zero values of x̂ are indexed by the elements of the support set Λ_S and the value at index λ_j is the j-th entry of x_S.

Notice that Step 5 of the algorithm essentially finds the projection of y onto the subspace spanned by the columns in A_t. The projection coefficients are obtained as

x_t = (A_t^T A_t)^{-1} A_t^T y.    (3)

In Step 6, we update the residue by subtracting the projection A_t x_t from y, thus making the residual orthogonal to the columns already selected in every iteration, hence the name Orthogonal Matching Pursuit. A simulation result evaluating the performance of OMP is shown in Figure 1. Here, the percentage success is plotted against the number of measurements n for binary-valued data of various sparsity levels and d = 256. Success in this paper means correct recovery of the support.
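Before turning to the simulation results, the listed steps can be made concrete with a minimal NumPy sketch (our own illustration; the paper's experiments were in MATLAB, and the names below are ours):

import numpy as np

def omp(A, y, S):
    # Greedy OMP: pick S columns of A, solving a least-squares fit each iteration.
    n, d = A.shape
    support = []                  # Lambda_t
    r = y.copy()                  # residue, r_1 = y
    for _ in range(S):
        # Step 3: column most correlated with the current residue
        j = int(np.argmax(np.abs(A.T @ r)))
        support.append(j)
        # Step 5: least-squares fit on the selected columns (projection of Eqn. (3))
        At = A[:, support]
        xt, *_ = np.linalg.lstsq(At, y, rcond=None)
        # Step 6: residue orthogonal to the selected columns
        r = y - At @ xt
    x_hat = np.zeros(d)
    x_hat[support] = xt
    return x_hat, support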

Fig. 1. The percentage recovery for OMP as a function of the number of measurements for various sparsity levels S and d = 256.

Theoretical guarantees for the convergence of the OMP algorithm using the RIP are given by Davenport and Wakin in [5]. Alternate approaches for sparse recovery also exist in the literature; one such approach is the gradient-based method proposed by Figueiredo et al. [6]. However, our main emphasis in this paper is on the greedy approach.

In this paper, we present a new algorithm called Ordered OMP that can achieve exact recovery with a smaller number of measurements n when compared to OMP. As the name implies, it is essentially a modification of OMP. We employ an update mechanism to detect a falsely selected column and remove it from the support set. Our simulation results show that this new algorithm has superior performance compared to OMP. The entire algorithm is detailed in the subsequent sections.

The organization of the rest of the paper is as follows. In the next section, we describe Ordered OMP in detail. In Section III, we show the simulation results and comparisons with existing algorithms. Section IV concludes and suggests some future work.

II. ORDERED OMP

In this section, we explain the Ordered OMP algorithm in detail. Just like OMP, Ordered OMP starts with a set of linear measurements y and estimates the support set iteratively. The key idea in Ordered OMP is that we use the projection coefficients in Eqn. (3) as a metric for checking whether a column in the present support set has been falsely detected.

In order to facilitate this check, we introduce two parameters:
• Tolerance, η < 1.
• Reduction factor, α < 1.

In every iteration, Ordered OMP compares the projection coefficients with those obtained in the previous iteration. A column in the present support set is declared wrongly detected if its present projection coefficient falls below η times its value in the previous iteration. If any column fails this test, i.e., if its coefficient becomes too small, that column is removed from the support set. Furthermore, the corresponding column is scaled down by α so that it is given less weight in the coming iterations. The Ordered OMP algorithm is listed below.

Input
• The n × d measurement matrix A
• The n-dimensional measurement vector y
• Sparsity level S of the actual signal x
• Tolerance η and reduction factor α

Procedure
1) Find λ_1 = arg max_{j=1,...,d} |⟨y, a_j⟩|
2) Initialize Λ_1 = {λ_1}, B = [a_{λ_1}], x_1 = (B^T B)^{-1} B^T y and r_2 = y − B x_1
3) Set t = 2, k = 1 and Ã = A. Let Ã = [ã_1, ã_2, ..., ã_d]
4) While k < S do
5) Find λ_t = arg max_{j=1,...,d} |⟨r_t, ã_j⟩|
6) Expand the support set Λ_t = Λ_{t−1} ∪ {λ_t}. Let Λ_t = {Λ_t(1), Λ_t(2), ..., Λ_t(k+1)}. Then form the sub-matrix B = [b_1, b_2, ..., b_{k+1}] where b_i = a_{Λ_t(i)}
7) Find x_t by solving x_t = arg min_u ‖B u − y‖_2    (4)
8) Identification step: for j = 1, ..., k, if |x_t(j)| ≤ η |x_{t−1}(j)| for some j = l, then
   a) Modification step: modify Λ_t = Λ_t − {Λ_t(l)} and B = [b_1, b_2, ..., b_k] where b_i = a_{Λ_t(i)}; also recompute x_t using Eqn. (4)
   b) Scale ã_{Λ_t(l)} = α ã_{Λ_t(l)}
   else, k = k + 1
9) Update the residue vector r_{t+1} = y − B x_t
10) t = t + 1
11) End while

Output
• The support set Λ_t.

We will show that this algorithm outperforms the conventional OMP algorithm. In effect, we obtain a better approximation by monitoring the projection coefficients: in each iteration, the algorithm tries to find the best projection of the measurement vector y, rather than relying on the correlation values alone. The parameters η and α play a key role in this algorithm.
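Before discussing the role of these parameters qualitatively, here is a minimal NumPy sketch of the procedure above (our own translation with illustrative names; the paper's implementation was in MATLAB):

import numpy as np

def ordered_omp(A, y, S, eta=0.7, alpha=0.9, max_iter=200):
    # Ordered OMP sketch: drop a support column whose projection coefficient
    # shrinks below eta times its previous value, and scale that dictionary
    # column by alpha so it is less likely to re-enter the support.
    n, d = A.shape
    A_tilde = A.copy()                                   # scaled copy used only for correlations
    support = [int(np.argmax(np.abs(A.T @ y)))]          # Steps 1-2
    B = A[:, support]
    x_prev, *_ = np.linalg.lstsq(B, y, rcond=None)
    r = y - B @ x_prev
    k = 1
    for _ in range(max_iter):                            # guard against infinite loops
        if k >= S:
            break
        lam = int(np.argmax(np.abs(A_tilde.T @ r)))      # Step 5
        support.append(lam)                              # Step 6
        B = A[:, support]
        x_t, *_ = np.linalg.lstsq(B, y, rcond=None)      # Step 7, Eqn. (4)
        # Step 8: identification over the previously accepted coefficients
        bad = [j for j in range(k) if abs(x_t[j]) <= eta * abs(x_prev[j])]
        if bad:
            l = bad[0]
            A_tilde[:, support[l]] *= alpha              # Scale step
            support.pop(l)                               # Modification step
            B = A[:, support]
            x_t, *_ = np.linalg.lstsq(B, y, rcond=None)
        else:
            k += 1
        r = y - B @ x_t                                  # Step 9
        x_prev = x_t
    return support

The max_iter guard in this sketch mirrors the infinite-loop concern that the reduction factor α is meant to address.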

A qualitative view of how these parameters improve the performance is given below.
• Tolerance (η): In OMP, in every iteration we add a new support element which has the maximum correlation with the residue. In practice, there is a chance that a wrong column gets added to the support set early on, and its projection coefficient may fall as new support elements get added. The parameter η is introduced to take care of such a situation: in every iteration, we compare the present projection coefficient with the one obtained in the previous iteration, and if the reduction is more than a fraction 1 − η, the column is dropped from the support set.
• Reduction factor (α): This parameter keeps track of the wrongly declared columns by scaling them down by α. In the remaining iterations, their correlation with the residue is also scaled, making it difficult for them to re-enter the support set. The more often a wrong column appears in the support set, the more it is scaled, so repeated wrong entries become progressively less likely. This also helps in bypassing an infinite-loop situation that may arise while running the Ordered OMP algorithm.

Thus, for obtaining the best performance, fine tuning of these parameters is needed. The optimum values of η and α are to be obtained for various n, d and S; this requires a stronger analysis which is left as future work. However, we provide extensive simulation results which give rough guidelines for choosing these parameters.

With the Ordered OMP algorithm, we can achieve better performance than OMP without losing its simplicity. Other algorithms like CoSaMP [4] need to invert matrices of size 2S × 2S, while our algorithm inverts a matrix of size at most S × S. We can also conveniently make use of the QR factorization while implementing this algorithm to increase its speed, as sketched below. Ordered OMP hence has lower computational cost and is easy to implement. In the next section, we provide simulation results to support our claim.
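As a small aside (our addition, not the paper's), the least-squares fit of Eqn. (4) can be solved through a QR factorization of the current sub-matrix B instead of forming (B^T B)^{-1}; a minimal sketch:

import numpy as np
from scipy.linalg import solve_triangular

def ls_via_qr(B, y):
    # Solve min_u ||B u - y||_2 via the thin QR factorization B = Q R.
    Q, R = np.linalg.qr(B)               # Q: n x k with orthonormal columns, R: k x k upper triangular
    return solve_triangular(R, Q.T @ y)  # back-substitution instead of a normal-equations inverse

A full implementation would update Q and R incrementally as columns enter and leave the support rather than refactorizing B in every iteration; the sketch above shows only the solve itself.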

III. EXPERIMENTAL RESULTS

In this section, we present the simulation results obtained by implementing the algorithm in MATLAB. A random matrix with entries drawn from a Gaussian ensemble was used as the measurement matrix A; the variance of the Gaussian ensemble is set to 1/n. The dimension of the signal is fixed as d = 256. Various results were obtained by varying the values of S, η and α.
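A success-rate experiment along these lines can be set up as follows (a sketch of our own with illustrative names, using the omp and ordered_omp sketches above; the original experiments were run in MATLAB):

import numpy as np

def success_rate(recover_support, n, d=256, S=10, trials=200, seed=0):
    # Percentage of trials in which recover_support(A, y, S) returns the exact support.
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        A = rng.normal(scale=1.0 / np.sqrt(n), size=(n, d))   # Gaussian ensemble, variance 1/n
        true_support = set(rng.choice(d, size=S, replace=False).tolist())
        x = np.zeros(d)
        x[list(true_support)] = 1.0                           # binary-valued S-sparse signal
        y = A @ x
        hits += set(recover_support(A, y, S)) == true_support
    return 100.0 * hits / trials

# Example: compare OMP and Ordered OMP at n = 100 measurements, S = 10.
# print(success_rate(lambda A, y, S: omp(A, y, S)[1], n=100))
# print(success_rate(lambda A, y, S: ordered_omp(A, y, S), n=100))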

The comparison between OMP and Ordered OMP is studied first. Figure 2 shows the percentage success as a function of the number of measurements. The solid lines correspond to Ordered OMP for η = .7 and α = .9 and the dotted lines represent OMP; the results are plotted for S = 5, S = 10 and S = 15. It can be seen that the performance gain of Ordered OMP over OMP grows as the sparsity level increases. For example, when S = 10, the OMP algorithm has a percentage success of 64% at 100 measurements, whereas our algorithm improves it to 90% for the same number of measurements.

Fig. 2. The percentage recovery for OMP and Ordered OMP (for η = .7 and α = .9) as a function of the number of measurements for various sparsity levels S and d = 256.

Having attained a better performance, the next step is to study the effect of η and α on the performance of our algorithm. For this, we fixed S = 10 and d = 256 and ran the algorithm for various values of η and α, as explained in the next two sections.

A. Effect of η

The effect of the tolerance measure η is studied in this section. The simulation results for d = 256, S = 10 and α = .9 are shown in Figure 3. The blue, green and red curves correspond to η = .75, .7 and .65 respectively.

Fig. 3. The percentage recovery for Ordered OMP as a function of the number of measurements for various values of η, with S = 10, α = .9 and d = 256.

It can be seen that η = .7 gives the best performance when α = .9. The observations for α = .95 are shown in Figure 4; η = .7 gives the best performance in this case as well. Another observation is that in both cases, η = .75 yields unstable performance. As we increase η, we make the check tighter and hence the number of columns that fail it also increases; beyond a limit, even correct columns may be declared as wrongly detected. This is why η = .75 yields unstable performance. From our experimental results, the optimum value of η is somewhere near .7; a deeper analysis is required to find the exact limit.

Fig. 4. The percentage recovery for Ordered OMP as a function of the number of measurements for various values of η, with S = 10, α = .95 and d = 256.


B. Effect of α

This section studies the behavior of Ordered OMP for various values of the reduction factor α. The tolerance is fixed at η = .7. The simulation results are shown in Figure 5.

Fig. 5. The percentage recovery for Ordered OMP as a function of the number of measurements for various values of α, with S = 10, η = .7 and d = 256.

We introduced the factor α in order to reduce the risk of an infinite-loop situation. The simulation results show that the value of α does not play any role in improving the recovery performance of Ordered OMP. Next, we study whether α affects the average number of iterations required per recovery. For this, we chose a value of n at which the percentage recovery is 90% and ran the algorithm for varying α. The results are shown in Figure 6. It can be seen that the average number of iterations required for recovery is almost constant, and the percentage success also remains the same at 90% even as α is varied from .99 to .5. So, we conclude that the value of α does not have any effect on the recovery performance of the algorithm.

Fig. 6. Simulation results (average number of iterations and percentage success) obtained by varying α, at a percentage recovery of 90%.

C. CoSaMP vs. Ordered OMP

In this section, we compare Ordered OMP with CoSaMP [4]. Figure 7 shows the results obtained for S = 5, S = 10 and S = 15 with d = 256. It can be seen that CoSaMP has better recovery performance, but Ordered OMP requires fewer computational resources: CoSaMP needs to invert matrices of size up to 2S × 2S, whereas Ordered OMP requires matrix inversions of size at most S × S. Also, a QR updating technique can be incorporated into Ordered OMP for faster least-squares solves and to minimize the computational burden.




Fig. 7. Comparison between Ordered OMP and CoSaMP for various sparsity levels.

IV. CONCLUSION AND FUTURE WORK

The simulation results show that our algorithm outperforms OMP. For the best performance, the optimum values of the tolerance and the reduction factor have to be used. The algorithm is simple and computationally efficient, as it can incorporate the QR factorization technique to reduce the burden of matrix inversion; in this respect, our algorithm uses fewer resources than CoSaMP. Future work is to develop an analytical framework for Ordered OMP; such an analysis should yield the optimum values of the parameters η and α that can guarantee reliable performance.

REFERENCES

[1] E. J. Candes and M. B. Wakin, "An Introduction to Compressive Sampling", IEEE Signal Processing Magazine, pp. 21-30, March 2008.
[2] E. J. Candes and T. Tao, "Decoding by Linear Programming", IEEE Trans. on Information Theory, 51(12), pp. 4203-4215, December 2005.
[3] J. Tropp and A. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit", IEEE Trans. on Information Theory, 53(12), pp. 4655-4666, December 2007.
[4] D. Needell and J. A. Tropp, "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples", Applied and Computational Harmonic Analysis, 26(3), pp. 301-321, 2009.
[5] M. Davenport and M. Wakin, "Analysis of orthogonal matching pursuit using the restricted isometry property", IEEE Trans. on Information Theory, 56(9), pp. 4395-4401, September 2010.
[6] M. A. T. Figueiredo, R. D. Nowak and S. J. Wright, "Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems", IEEE Journal of Selected Topics in Signal Processing, 1(4), pp. 586-598, 2007.