Compressed Imaging with a Separable Sensing Operator

Yair Rivenson and Adrian Stern
Abstract— Compressive imaging (CI) is a natural branch of compressed sensing (CS). Although a number of CI implementations have started to appear, the design of an efficient CI system still remains a challenging problem. One of the main difficulties in implementing CI is that it involves huge amounts of data, which has far-reaching implications for the complexity of the optical design, calibration, data storage, and computational burden. In this paper, we solve these problems by using a two-dimensional separable sensing operator. By doing so, we reduce the complexity by a factor of about 10^6 for megapixel images. We show that applying this method requires only a reasonable number of additional samples.

Index Terms— Compressed Sensing, Compressive Imaging, Separable Operator, Kronecker Product, Mutual Coherence

I. INTRODUCTION

The recently introduced theory of compressed sensing (CS) [1]-[4] has attracted the interest of theoreticians and practitioners alike and has initiated a fast emerging research field. CS suggests a new framework for simultaneous sampling and compression of signals. In contrast to the common framework of first collecting as much data as possible and then discarding the redundant data by digital compression techniques, CS seeks to minimize the collection of redundant data in the acquisition step.

A natural branch of CS is compressive imaging (CI). A block diagram for CI with random projections is shown in Fig. 1. The object f, consisting of N pixels, is imaged by taking a set, g, of M random projections. One can also think of M as the number of detector pixels. We are interested in the case of M << N. The image is recovered by solving the convex optimization problem

\min_{\alpha} \|\alpha\|_1 \quad \text{subject to} \quad g = \Phi_M \Psi \alpha, \qquad (1)

where Φ_M ∈ ℝ^{M×N} is the imaging (sensing) operator, Ψ is a sparsifying basis, and α is the K-sparse representation of f in Ψ. The number of samples required for exact recovery with overwhelming probability obeys [2],[10]

M \ge C\,\mu^2(\Phi, \Psi)\,K \log N, \qquad (2)

where K is the number of nonzero entries of α, C is a small constant, and µ(Φ,Ψ) is the mutual coherence,

\mu(\Phi, \Psi) = \sqrt{N}\,\max_{1 \le i,j \le N} \bigl|\langle \varphi_i, \psi_j \rangle\bigr|, \qquad (3)
where φ_i, ψ_j ∈ ℝ^N denote the i-th and j-th columns of Φ and Ψ, respectively, and N is the length of the column vectors. The mutual coherence is bounded by 1 ≤ µ ≤ √N [10], where the lower bound corresponds to the completely incoherent case (e.g., the Fourier and spike bases pair [2],[10]) and the upper bound to completely coherent bases.

The size of the imaging operator Φ_M creates several challenging implementation issues:
1) Computational – For large M and N (the typical size of an image is N = 10^6 pixels), storing Φ_M and solving (1) are hardly possible (see the short sketch at the end of this section).
2) Optical implementation – Realization of a random Φ_M requires the design of an imaging system with a space-bandwidth product (SBP) [11] larger than N × M. In other words, the imaging system must have N × M almost independent modes, or degrees of freedom. In comparison with common linear shift-invariant imaging systems, which have an SBP of O(N), CI requires an M times larger SBP. For example, the large SBP requirement was met in [8] by taking a large number of exposures, thus demanding a long acquisition time; in [5] it leads to fine phase-correlation requirements.
3) Calibration – The high complexity of the imaging system entails difficult calibration requirements [7]. For large M and N, the calibration process is both exhaustive and time consuming: to calibrate Φ_M, N point spread functions have to be measured.

In this paper, we show that these difficulties can be remedied by using an imaging operator Φ_M that is separable in two dimensions (e.g., Cartesian x-y coordinates). Separable imaging operators arise naturally in many optical implementations [12],[13], and the design and analysis of optical imaging systems are often done in separable x-y coordinates. A separable CI design significantly reduces the complexity involved in the implementation, storage, and computational usage of the imaging operator. We show both theoretically and experimentally that the price (compromise) – in the form of additional samples – of implementing a separable design is not large.
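To put challenge 1) in concrete terms, the short Python sketch below (our illustration, not part of the original paper; the sizes m and n are assumptions chosen in the spirit of the megapixel example) compares the storage required for an unconstrained Φ_M with the 2mn entries of the separable operator developed in Section II.

```python
# Illustrative back-of-the-envelope comparison (not from the paper's code):
# storage of an unconstrained M x N sensing matrix versus a separable one.
n = 1024          # image side, so N = n*n is about one megapixel
m = 430           # measurement side, so M = m*m (value borrowed from Fig. 2)
N, M = n * n, m * m

dense_entries = M * N            # unconstrained Phi_M
separable_entries = 2 * m * n    # Phi_x (m x n) plus Phi_y (m x n)

bytes_per_entry = 8              # double precision
print(f"unconstrained: {dense_entries:.3e} entries "
      f"(~{dense_entries * bytes_per_entry / 1e12:.1f} TB)")
print(f"separable:     {separable_entries:.3e} entries "
      f"(~{separable_entries * bytes_per_entry / 1e6:.1f} MB)")
print(f"reduction factor: {dense_entries / separable_entries:.2e}")
```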
II. COMPRESSED IMAGING USING SEPARABLE MEASUREMENTS

Consider the CI imaging system in Fig. 1. In lexicographic notation, f ∈ ℝ^N is the object's image, Φ_M ∈ ℝ^{M×N} is the imaging operator, Ψ is an orthonormal sparsifying basis, α ∈ ℂ^N is the K-sparse representation of the image f projected on Ψ^T, and g ∈ ℝ^M is the acquired image. Let us denote by the capital letters F and G the 2-D representations (on a Cartesian grid) of the images f and g, i.e., F ∈ ℝ^{n×n} and G ∈ ℝ^{m×m}, where N = n × n, M = m × m, f = vec(F), g = vec(G), and vec(A) denotes the row-ordered vectorization of A into a single column vector [12].

A. Separable Imaging Operator

Considering an imaging operator Φ_M that is x-y separable, we can rewrite the image acquisition block in Fig. 1,

g = \Phi_{M,S}\, f, \qquad (4)

as a 2-D separable transform:

G = \Phi_x^m\, F\, \bigl(\Phi_y^m\bigr)^T, \qquad (5)

where the S in Φ_{M,S} indicates that Φ is separable, Φ_x^m ∈ ℝ^{m×n} operates on the columns of the 2-D image F, and (Φ_y^m)^T ∈ ℝ^{n×m} operates on its rows. The separable transform (5) is related to the matrix-vector notation (4) by g = {Φ_x^m ⊗ Φ_y^m} f, i.e., the m²×n² imaging matrix Φ_{M,S} is given by the Kronecker product of the m×n matrices Φ_x^m and Φ_y^m, Φ_{M,S} = Φ_x^m ⊗ Φ_y^m. Note that Φ_{M,S} is determined by as little as 2mn entries, compared with a random non-separable Φ_M, which is determined by m²×n² entries.

B. Compressed Imaging with 2-D Separable Projections

Let us consider a sparsifying basis that is 2-D separable [12]. (This property holds for the most popular sparsifying bases, such as wavelets, DCT, and Fourier.) Thus, we can write

\Psi_S = \Psi_x \otimes \Psi_y, \qquad (6)

where Ψ_S ∈ ℝ^{n²×n²} and Ψ_x = Ψ_y = Ψ ∈ ℝ^{n×n}. Using (6), we can write the l1-minimization convex optimization program (1) as

\min_{A} \|A\|_1 \quad \text{subject to} \quad G = \Phi_x^m \Psi A \Psi^T \bigl(\Phi_y^m\bigr)^T, \qquad (7)

where A ∈ ℂ^{n×n} is the matrix representation of α ∈ ℂ^N, such that α = vec(A). Applying (5) and (6) to the constraint in (7), we obtain

g = \mathrm{vec}(G)
  = \mathrm{vec}\bigl\{\Phi_x^m \Psi A \Psi^T (\Phi_y^m)^T\bigr\}
  = \mathrm{vec}\bigl\{(\Phi_x^m \Psi)\, A\, (\Phi_y^m \Psi)^T\bigr\}
  = \bigl[(\Phi_x^m \Psi) \otimes (\Phi_y^m \Psi)\bigr]\,\mathrm{vec}(A)
  = (\Phi_x^m \otimes \Phi_y^m)(\Psi \otimes \Psi)\,\alpha,

where the last equality is a standard property of the Kronecker product [12]. Using (7), we can therefore rewrite (1) as

\min_{\alpha} \|\mathrm{vec}(A)\|_1 \quad \text{subject to} \quad g = (\Phi_x^m \otimes \Phi_y^m)(\Psi \otimes \Psi)\,\alpha. \qquad (8)
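As a quick numerical sanity check of the equivalence between (4) and (5), the following numpy sketch (our own illustration; the sizes are arbitrary toy values) applies a separable operator both as two small matrix products and as the explicit Kronecker-product matrix acting on the row-ordered vec(F), and confirms that the two agree.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 16, 8                         # image side n, measurement side m (toy sizes)

F = rng.standard_normal((n, n))      # 2-D object
Phi_x = rng.standard_normal((m, n))  # acts on the columns of F
Phi_y = rng.standard_normal((m, n))  # acts on the rows of F

# Separable form (5): two small matrix products.
G = Phi_x @ F @ Phi_y.T

# Equivalent matrix-vector form (4): g = (Phi_x kron Phi_y) vec(F),
# with vec() taken in row order as in the paper.
Phi_MS = np.kron(Phi_x, Phi_y)       # m^2 x n^2, formed here only for the check
g = Phi_MS @ F.reshape(-1)           # row-ordered vectorization

print("separable and Kronecker forms agree:", np.allclose(g, G.reshape(-1)))
print("entries stored: separable", 2 * m * n, "vs full", (m * n) ** 2)
```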
Equation (8) states the standard l1 problem for a 2-D separable projection operator (and sparsifying basis). Note that it is simply a special case of the widely used 1-D vector l1-minimization program. We have verified the proposed separable CI technique on a large set of images; some of the results are presented in Sec. III. As an example, the 4096×4096 Shepp-Logan phantom image (i.e., N ≈ 16·10^6 pixels) was perfectly reconstructed from only M = 865×865 ≈ 0.7·10^6 samples taken with a separable i.i.d. Gaussian random imaging operator. This implies that the complexity of O(N × M) involved in the implementation, storage, computation, and calibration when using a conventional Φ_M in Fig. 1 is reduced to O(√N × √M) = O(n × m).
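In practice the m²×n² matrix in (8) is never formed: a large-scale l1 solver only needs products with (Φ_x^m ⊗ Φ_y^m)(Ψ ⊗ Ψ) and its adjoint, and each of these reduces to a pair of small matrix products. The sketch below is a minimal illustration under our own assumptions (a generic orthonormal Ψ obtained by QR as a stand-in for a wavelet or DCT basis, and scipy's LinearOperator as the solver-facing wrapper); it is not the authors' SPGL1 setup.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

rng = np.random.default_rng(0)
n, m = 64, 32                                        # toy sizes: N = n*n, M = m*m

Psi, _ = np.linalg.qr(rng.standard_normal((n, n)))   # stand-in orthonormal sparsifying basis
Phi_x = rng.standard_normal((m, n)) / np.sqrt(m)     # separable Gaussian projections
Phi_y = rng.standard_normal((m, n)) / np.sqrt(m)

Ax, Ay = Phi_x @ Psi, Phi_y @ Psi                    # the two m x n factors appearing in (8)

def matvec(alpha):
    """(Ax kron Ay) vec(A): two small matrix products instead of an M x N matrix."""
    A = alpha.reshape(n, n)
    return (Ax @ A @ Ay.T).reshape(-1)

def rmatvec(g):
    """Adjoint: (Ax kron Ay)^T vec(G) = vec(Ax^T G Ay)."""
    G = g.reshape(m, m)
    return (Ax.T @ G @ Ay).reshape(-1)

Theta = LinearOperator((m * m, n * n), matvec=matvec, rmatvec=rmatvec, dtype=np.float64)

# Dot test that the adjoint is consistent: <Theta x, y> == <x, Theta^T y>.
x, y = rng.standard_normal(n * n), rng.standard_normal(m * m)
print(np.allclose(np.dot(Theta.matvec(x), y), np.dot(x, Theta.rmatvec(y))))
# An operator like Theta is what one would hand to an SPGL1-style solver.
```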
Although the Shepp-Logan example and the other reconstruction experiments presented in Section III demonstrate that CI is indeed possible with a separable imaging operator, one may still ask: what is the price of using a separable imaging operator compared with the regular CS paradigm? We already know from (2) that the number of samples M is proportional to µ². Therefore, in the next subsection we estimate the increase in M by deriving µ for a separable imaging operator.

C. An Estimate of the Compression Ratio Due to the Use of a Separable Imaging Operator
Theorem 1: Let Φ_S, Ψ_S denote a pair of separable orthogonal bases on ℝ^N, such that Φ_S = Φ_x ⊗ Φ_y and Ψ_S = Ψ_x ⊗ Ψ_y, where Φ_x, Φ_y, Ψ_x, Ψ_y ∈ ℝ^{n×n}. Then the mutual coherence µ(Φ_S, Ψ_S) obeys

\mu(\Phi_S, \Psi_S) = \mu(\Phi_x, \Psi_x)\,\mu(\Phi_y, \Psi_y). \qquad (9)

Proof: Using the definition of µ in (3), we obtain

\mu(\Phi_S, \Psi_S)
  = \sqrt{N}\,\max_{i,k}\bigl|\langle \varphi_i^S, \psi_k^S\rangle\bigr|
  = \sqrt{n^2}\,\max_{i,j,k,l}\bigl|\langle \varphi_i^x \otimes \varphi_j^y,\; \psi_k^x \otimes \psi_l^y\rangle\bigr|
  = \sqrt{n^2}\,\max_{i,j,k,l}\bigl|\langle \varphi_i^x, \psi_k^x\rangle\bigr|\cdot\bigl|\langle \varphi_j^y, \psi_l^y\rangle\bigr|
  = \sqrt{n}\,\max_{i,k}\bigl|\langle \varphi_i^x, \psi_k^x\rangle\bigr| \cdot \sqrt{n}\,\max_{j,l}\bigl|\langle \varphi_j^y, \psi_l^y\rangle\bigr|
  = \mu(\Phi_x, \Psi_x)\,\mu(\Phi_y, \Psi_y),

where the equality between the third and fourth terms is elaborated in Appendix A. ∎

We also note that

\mu(\Phi_x \otimes \Phi_y, \Psi_x \otimes \Psi_y) = \mu(\Phi_x, \Psi_x)\,\mu(\Phi_y, \Psi_y) \ge \min_{i = x \text{ or } y}\bigl(\mu^2(\Phi_i, \Psi_i)\bigr), \qquad (10)

which implies that if we find a basis Φ that is less coherent with the sparsifying basis Ψ, we would rather use it for projection in both the x and y directions to assure a minimal µ. Therefore, in the remainder of the paper we choose Φ_x = Φ_y = Φ_{1D}.
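Theorem 1 is easy to verify numerically. The following sketch (our own check with arbitrary small sizes, not from the paper) draws random orthogonal bases, evaluates µ according to (3), and confirms that the coherence of the Kronecker-product pair equals the product of the 1-D coherences, as (9) states.

```python
import numpy as np

rng = np.random.default_rng(1)

def rand_orthobasis(n):
    """Random n x n orthogonal matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def mutual_coherence(Phi, Psi):
    """mu(Phi, Psi) = sqrt(N) * max_ij |<phi_i, psi_j>| as in (3)."""
    N = Phi.shape[0]
    return np.sqrt(N) * np.max(np.abs(Phi.T @ Psi))

n = 12
Phi_x, Phi_y = rand_orthobasis(n), rand_orthobasis(n)
Psi_x, Psi_y = rand_orthobasis(n), rand_orthobasis(n)

mu_sep = mutual_coherence(np.kron(Phi_x, Phi_y), np.kron(Psi_x, Psi_y))
mu_prod = mutual_coherence(Phi_x, Psi_x) * mutual_coherence(Phi_y, Psi_y)
print(mu_sep, mu_prod)                 # equal up to rounding, as (9) predicts
assert np.isclose(mu_sep, mu_prod)
```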
D. CI with a Separable Imaging Operator Versus CI with an Unconstrained Imaging Operator

In this subsection, we use (9) to show that the mutual coherence between a separable sparsifying basis Ψ and a 2-D separable Gaussian random projection Φ is larger by a factor of only about √(log(n)) than the mutual coherence obtained with an unconstrained Gaussian random projection. The Gaussian random projection is a rather generic projection and is widely used in practical CS applications; the results can easily be extended to any pair of bases.

Let Φ ∈ ℝ^{n×n} denote a random orthogonal matrix, uniformly distributed on the unit sphere. It is shown in [14] that the mutual coherence µ behaves like the largest entry of the random matrix, Φ_ij, which obeys (with high probability)

\mu(\Phi, \Psi) \approx \sqrt{n}\,\max_{i,j}\bigl|\Phi_{ij}\bigr| \approx \sqrt{n}\,\sqrt{\frac{4\ln(n)}{n}} \approx \sqrt{2\log_{10}(n)}. \qquad (11)

By combining (9), (10), and (11), we obtain

\mu(\Phi^S, \Psi^S) = \mu^2(\Phi^{1D}, \Psi^{1D}) \approx \sqrt{N}\,\frac{4\ln(n)}{n} \approx 2\log_{10}(n), \qquad (12)

where Φ^S = Φ^{1D} ⊗ Φ^{1D} ∈ ℝ^{N×N} is the separable random imaging operator applied to images written in lexicographic form. For purposes of comparison, let us consider the conventional CS scheme using an unconstrained random projection. Let us denote by Φ ∈ ℝ^{N×N} = ℝ^{n²×n²} the unconstrained random orthogonal matrix, uniformly distributed on the unit sphere, describing the measurement operator. By applying (11) to Φ and combining it with (12), we obtain

\frac{\mu(\Phi^{1D} \otimes \Phi^{1D}, \Psi_S)}{\mu(\Phi, \Psi_S)} \approx \frac{2\log_{10}(n)}{\sqrt{2\log_{10}(N)}} = \sqrt{\log_{10}(n)}. \qquad (13)

According to (2), the square of the ratio between the mutual coherences in (13) determines the "oversampling" factor that has to be paid when going from the regular CS paradigm to a 2-D separable sensing-matrix CS paradigm. It is thus evident that, in order to exactly reconstruct the signal (image) with overwhelming probability, one has to sample with an oversampling factor of about log_10(n) = 3 for megapixel images. Note that these results are weak upper bounds [4]; in practice, as presented in Section III, M is smaller than predicted by (2).
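The scaling behind (11)-(13) can also be probed empirically. The sketch below (our own experiment; the sizes are toy values, and at such small n the absolute constants differ from the asymptotic expressions above) compares the coherence of a separable random pair with that of an unconstrained random projection of the same total size; the ratio grows only slowly with n, which is the point of (13).

```python
import numpy as np

rng = np.random.default_rng(2)

def rand_orthobasis(n):
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def mu(Phi, Psi):
    """Mutual coherence (3): sqrt(N) * max_ij |<phi_i, psi_j>|."""
    return np.sqrt(Phi.shape[0]) * np.max(np.abs(Phi.T @ Psi))

for n in (8, 16, 32):
    N = n * n
    Psi_1d = rand_orthobasis(n)                    # stand-in 1-D sparsifying basis
    Psi_S = np.kron(Psi_1d, Psi_1d)                # separable basis on R^N
    Phi_1d = rand_orthobasis(n)
    mu_sep = mu(np.kron(Phi_1d, Phi_1d), Psi_S)    # separable sensing, cf. (12)
    mu_full = mu(rand_orthobasis(N), Psi_S)        # unconstrained N x N sensing
    print(f"n={n:2d}: mu_sep={mu_sep:5.2f}  mu_full={mu_full:5.2f}  "
          f"ratio={mu_sep / mu_full:4.2f}")
```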
III. SIMULATION RESULTS

In our simulations, we used the Matlab implementation of the SPGL1 algorithm [15], which we modified slightly so that it would match the separable sensing scheme. The simulations were run on an AMD 3800+ 64-bit dual-core desktop PC with 4 GB RAM.

Fig. 2 shows the recovery of a one-megapixel image, after thresholding the image to its largest 25,000 Haar wavelet coefficients (see [2],[10]) and projecting it with a separable random basis. In Fig. 2(b) the reconstruction PSNR is greater than 240 dB ("perfect reconstruction") from M = 430² pixels, i.e., 17% of the original image size.

Fig. 3(a) shows a comparison between the number of samples required for perfect recovery (MSE smaller than 10^{-10}) using conventional CS, M, and the number of samples required for perfect recovery using separable CS, M_S, as a function of N. The graphs present the average results of Monte Carlo experiments involving 160 random trials applied to a set of 10 synthetically generated sparse images with K/N = 1/10. It can be seen that, as expected, M_S > M, and that in practice M_S is lower than M multiplied by the oversampling factor derived in (13). Fig. 3(b) depicts representative graphs of the normalized MSE for compressible images. The MSE as a function of the compression ratio M/N is presented for two real 1024×1024 images (Man and Airport from [16]), which have a sparse representation in the Haar wavelet basis, and for a synthetically generated image whose coefficients decay at a rate of i^{-1/p} (where i ∈ [1, 1024²]), with p = 3/2, corresponding to a compression ratio of about 1:20. The coefficients were randomly permuted as in [17].
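The following sketch reproduces the flavor of these experiments at toy scale (our own illustration: a stand-in orthonormal basis from a QR factorization instead of Haar wavelets, cvxpy's generic convex solver instead of the SPGL1 code used here, and arbitrary sizes n, m, K). At this size the M x N matrix is formed explicitly for convenience; at realistic sizes one passes the separable forward/adjoint products to the solver instead.

```python
import numpy as np
import cvxpy as cp   # stand-in convex solver; the paper's experiments used SPGL1 in Matlab

rng = np.random.default_rng(3)
n, m, K = 16, 12, 10                                 # toy sizes: N = 256 pixels, M = 144 samples

Psi, _ = np.linalg.qr(rng.standard_normal((n, n)))   # stand-in orthonormal sparsifying basis

# K-sparse coefficient matrix A and the corresponding image F = Psi A Psi^T.
A = np.zeros((n, n))
idx = rng.choice(n * n, K, replace=False)
A.flat[idx] = rng.standard_normal(K)
F = Psi @ A @ Psi.T

# Separable i.i.d. Gaussian measurements, G = Phi_x F Phi_y^T  (eq. (5)).
Phi_x = rng.standard_normal((m, n)) / np.sqrt(m)
Phi_y = rng.standard_normal((m, n)) / np.sqrt(m)
G = Phi_x @ F @ Phi_y.T

# l1 recovery (8): here the M x N matrix (Phi_x Psi) kron (Phi_y Psi) is small
# enough to build explicitly for an off-the-shelf solver.
Theta = np.kron(Phi_x @ Psi, Phi_y @ Psi)
alpha = cp.Variable(n * n)
cp.Problem(cp.Minimize(cp.norm1(alpha)),
           [Theta @ alpha == G.reshape(-1)]).solve()

A_hat = alpha.value.reshape(n, n)
print("relative recovery error:", np.linalg.norm(A_hat - A) / np.linalg.norm(A))
```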
IV. CONCLUSIONS

In this paper we presented a new framework for CS that uses 2-D separable linear transforms. The use of a 2-D separable imaging operator for CI has significant advantages in terms of computation, storage, calibration, and implementation. However, a compromise has to be made in the compression ratio, i.e., more samples have to be taken for a given object. For example, by using separable random projections, the complexity is reduced from approximately O(N²) to O(N), where N is the number of object pixels, and the compromise requires as little as a factor of log_10(√N) more samples than predicted for a conventional CS scheme. Practical recovery simulations suggest that the actual oversampling factor is smaller, and that separable CS is indeed feasible for compressible signals.

REFERENCES
[1] D. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, April 2006.
[2] E. Candes and J. Romberg, "Sparsity and incoherence in compressive sampling," Inverse Problems, vol. 23, no. 3, pp. 969-985, 2007.
[3] D. Donoho and Y. Tsaig, "Extensions of compressed sensing," Signal Processing, vol. 86, no. 3, pp. 533-548, March 2006.
[4] M. Elad, "Optimized projections for compressed sensing," IEEE Transactions on Signal Processing, vol. 55, no. 12, pp. 5695-5702, Dec. 2007.
[5] A. Stern and B. Javidi, "Random projections imaging with extended space-bandwidth product," IEEE/OSA Journal of Display Technology, vol. 3, no. 3, pp. 315-320, Sept. 2007.
[6] A. Stern, Y. Rivenson, and B. Javidi, "Optically compressed image sensing using random aperture coding," keynote at Enabling Photonics Technologies for Defense, Security, and Aerospace Applications IV, SPIE, Orlando, March 2008.
[7] R. Fergus, A. Torralba, and W. T. Freeman, "Random lens imaging," MIT Computer Science and Artificial Intelligence Laboratory Technical Report MIT-CSAIL-TR-2006-058, September 2, 2006.
[8] M. Duarte et al., "Single-pixel imaging via compressive sampling," IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 83-91, March 2008.
[9] M. Wakin et al., "Compressive imaging for video representation and coding," in Proc. Picture Coding Symposium (PCS), Beijing, China, April 2006.
[10] E. Candes and M. Wakin, "An introduction to compressive sampling," IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 21-30, March 2008.
[11] J. Goodman, Introduction to Fourier Optics, 2nd ed., McGraw-Hill, 1996.
[12] A. Jain, Fundamentals of Digital Image Processing, Chapter 2, Prentice Hall, 1989.
[13] R. Robucci et al., "Compressive sensing on a CMOS separable transform image sensor," in Proc. IEEE ICASSP, pp. 5125-5128, Mar. 31-Apr. 4, 2008.
[14] D. Donoho and X. Huo, "Uncertainty principles and ideal atomic decomposition," IEEE Transactions on Information Theory, vol. 47, no. 7, pp. 2845-2862, Nov. 2001.
[15] E. Berg and M. P. Friedlander, SPGL1: A solver for large-scale sparse reconstruction, June 2007. Available: http://www.cs.ubc.ca/labs/scl/spgl1/
[16] USC-SIPI image database. Available: http://sipi.usc.edu/database/database.cgi?volume=misc
[17] E. Candes, M. Wakin, and S. Boyd, "Enhancing sparsity by reweighted l1 minimization," Journal of Fourier Analysis and Applications, 2007. Available: http://www.citebase.org/abstract?id=oai:arXiv.org:0711.1612

APPENDIX
The relation

\langle a_i \otimes b_j,\; c_k \otimes d_l \rangle = \langle a_i, c_k \rangle \cdot \langle b_j, d_l \rangle

holds for any vectors a, b, c, d ∈ ℂ^n.

Proof:

\langle a_i \otimes b_j,\; c_k \otimes d_l \rangle
 = \left\langle \begin{bmatrix} a_{1i} b_j \\ a_{2i} b_j \\ \vdots \\ a_{ni} b_j \end{bmatrix},\;
   \begin{bmatrix} c_{1k} d_l \\ c_{2k} d_l \\ \vdots \\ c_{nk} d_l \end{bmatrix} \right\rangle
 = a_{1i} c_{1k} \langle b_j, d_l \rangle + a_{2i} c_{2k} \langle b_j, d_l \rangle + \cdots + a_{ni} c_{nk} \langle b_j, d_l \rangle
 = \langle a_i, c_k \rangle \cdot \langle b_j, d_l \rangle. ∎
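A short numerical check of this identity (our own illustration, real-valued case):

```python
import numpy as np

rng = np.random.default_rng(4)
a, b, c, d = (rng.standard_normal(5) for _ in range(4))

lhs = np.dot(np.kron(a, b), np.kron(c, d))   # <a kron b, c kron d>
rhs = np.dot(a, c) * np.dot(b, d)            # <a, c> * <b, d>
print(np.isclose(lhs, rhs))                  # True
```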
g
f
Sparse Signal Recovery
ΦM
Ψ
Object representation
αˆ
fˆ
Ψ
Digital image reconstruction
Imaging acquisition
Fig. 1. Imaging scheme of compressed sensing
Fig. 2. Reconstruction of a 1024×1024 image of a man. (a) Original image approximated with 25,000 Haar wavelet coefficients; (b) perfect reconstruction (PSNR = 244 dB) from M = 430×430 samples; (c) reconstruction from M = 350×350 samples (PSNR = 27.08 dB).
Fig. 3. (a) Average number of samples required for perfect reconstruction of a set of sparse images using the separable CI scheme, M_S (dashed curve), compared with that required using the conventional CI scheme, M (solid curve), as a function of N. The dotted curve denotes the theoretical bound for M_S predicted by (13), M·log_10(√N). (b) Reconstruction MSE as a function of M/N for compressible images: Man (dashed curve), Airport (dotted curve), and a synthetically generated image (solid curve).