Wavelet-Based Multiple Description Coding of Images with Iterative Convex Optimization Techniques

Teodora Petrişor, Béatrice Pesquet-Popescu
ENST, Signal and Image Processing Department, 46, rue Barrault, 75634 Paris Cédex 13, FRANCE. Email: {petrisor, pesquet}@tsi.enst.fr

Jean-Christophe Pesquet
IGM and UMR CNRS 8049, Université de Marne la Vallée, 5, bd. Descartes, 77454 Marne la Vallée Cédex 2, FRANCE. Email: [email protected]

Abstract— We consider the problem of image transmission on error-prone networks with little or no error protection. To this end, we build a multiple description scheme based on classical biorthogonal filter banks, that achieves good reconstruction even at low bitrates. The novelty of the proposed approach mainly consists in building a 2D wavelet frame representation with low redundancy. Another contribution of this work is the use of a convex optimization approach at the decoder end in order to best take advantage of all the received information. The quantization constraints define convex sets, which allow us to apply fast iterative projection techniques to find a feasible solution of the decoding problem.

I. INTRODUCTION

In the class of joint source-channel coding paradigms, Multiple Description Coding (MDC) schemes are gaining popularity with the need for data transmission over best-effort packet-based networks, such as the Internet or wireless networks, and for applications where retransmission of corrupted packets is not always an option [1]. In turn, these networks often provide path diversity that can be exploited when sending data. In particular, one can build and transmit correlated representations of the same source, referred to as descriptions. For transmission on channels with similar characteristics, the need for balanced (equally important) descriptions is inherent. Several works propose methods for building such descriptions [2], [3].

In the context of this paper, each description is considered either received error-free or unrecoverable [4]. The corresponding decoder must take advantage of all the information received so as to minimize the end-to-end distortion. When all the descriptions have been received, the so-called central decoder enhances the reconstruction quality by exploiting the introduced redundancy. When some descriptions are lost during the transmission, side decoders have to be used.

In image coding, wavelet techniques have proven to provide high-quality solutions. They also offer scalability features which may be useful for transmission over heterogeneous networks. Among the different filter banks which can be envisaged, extensive studies [5] have shown that biorthogonal filters such as the 9-7 pair offer significant advantages both in terms of compression abilities and efficient implementation through the lifting scheme. In this work, we propose a wavelet frame construction which uses the same kind of filters and thus inherits their good properties. Two shifted versions of the same wavelet decomposition are considered for this purpose, which are further combined with a quincunx subsampling so as to obtain two descriptions. Unlike other existing schemes [6], the redundancy in terms of number of coefficients is limited to the size of an approximation subband at the coarsest resolution.

Due to the redundancy introduced in the coding scheme (when both descriptions are received) or to the absence of some information (when one description is missing at the receiver), the decoding problem is not trivial. In this work, we formulate this problem as the optimization of a quadratic objective function subject to convex constraints. These constraints model the uniform quantization process which is currently used in several state-of-the-art encoding strategies such as EZBC [7]. In [8], the authors present a consistent reconstruction of a quantized signal with a POCS (Projection Onto Convex Sets) algorithm when overcomplete expansions are sent. POCS methods serve to find an estimate of the signal belonging to the intersection of all the constraint sets. The algorithm we propose in this paper relies on recent advances in iterative convex constrained optimization techniques [9]. Compared with the POCS algorithm, the considered method offers several advantages. In particular, it converges much faster and it can be implemented on a parallel architecture.

In the next section we present the method used to obtain low redundancy descriptions and we discuss the problem of perfect reconstruction. Section III describes the decoding algorithm and in Section IV we give some experimental results. The last section concludes the paper.

II. ENCODER DESIGN

We consider a separable two-dimensional dyadic filter bank framework which allows us to apply the 2D Fast Wavelet Transform (2D-FWT). In the following, we consider two decompositions onto wavelet bases yielding a complete frame expansion and followed by a selective quincunx subsampling
that reduces the redundancy. To this end we refer to frame theory [10], [11], concretized by the use of an oversampled filter bank structure.

A. First step

Let $(h[n])_{n \in \mathbb{Z}}$ (resp. $(g[n])_{n \in \mathbb{Z}}$) be the impulse response of the analysis low-pass (resp. high-pass) filter applied along one of the dimensions. Starting from a classical wavelet decomposition, we denote the approximation subband coefficients at resolution $j \in \{1, \dots, J\}$ by $a_j$ and the three corresponding detail subband coefficients by $d^h_j$, $d^v_j$ and $d^d_j$ according to their orientation: horizontal, vertical or diagonal, respectively. These sequences are computed by the following equations:

$$
\begin{aligned}
a_j[n, m] &= \sum_{k,l} a_{j-1}[k, l]\, h[2n - k]\, h[2m - l] \\
d^h_j[n, m] &= \sum_{k,l} a_{j-1}[k, l]\, h[2n - k]\, g[2m - l] \\
d^v_j[n, m] &= \sum_{k,l} a_{j-1}[k, l]\, g[2n - k]\, h[2m - l] \\
d^d_j[n, m] &= \sum_{k,l} a_{j-1}[k, l]\, g[2n - k]\, g[2m - l]
\end{aligned}
\tag{1}
$$
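As an illustration of Eq. (1), the following minimal Python sketch computes one decomposition stage by filtering and keeping even-indexed outputs along each dimension. It is only a sketch under stated assumptions: NumPy is used, the filters are taken as causal arrays indexed from 0, the input is treated as zero outside its support, and a Haar pair stands in for the 9-7 filters. The function names `analyze_axis` and `analysis_level` are hypothetical.

```python
import numpy as np

def analyze_axis(x, f, axis):
    """Filter x with f along `axis` (full convolution), keep even-indexed outputs."""
    conv = np.apply_along_axis(lambda v: np.convolve(v, f), axis, x)
    return np.take(conv, np.arange(0, conv.shape[axis], 2), axis=axis)

def analysis_level(a_prev, h, g):
    """One stage of Eq. (1): returns (a, dh, dv, dd) computed from a_{j-1}."""
    low_n = analyze_axis(a_prev, h, axis=0)    # sum_k a_{j-1}[k, l] h[2n - k]
    high_n = analyze_axis(a_prev, g, axis=0)   # sum_k a_{j-1}[k, l] g[2n - k]
    a = analyze_axis(low_n, h, axis=1)         # h[2n - k] h[2m - l]
    dh = analyze_axis(low_n, g, axis=1)        # h[2n - k] g[2m - l]
    dv = analyze_axis(high_n, h, axis=1)       # g[2n - k] h[2m - l]
    dd = analyze_axis(high_n, g, axis=1)       # g[2n - k] g[2m - l]
    return a, dh, dv, dd

# Haar pair, used here for illustration only (assumption, not the 9-7 filters).
s = 1.0 / np.sqrt(2.0)
h = np.array([s, s])
g = np.array([s, -s])

image = np.arange(24, dtype=float).reshape(4, 6)
a, dh, dv, dd = analysis_level(image, h, g)
```

Because the transform is separable, the sketch applies the first-index filter before the second-index filter; swapping the order gives the same subbands.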

In the proposed MDC scheme, for fine resolutions, i.e. $j \neq J$, the detail coefficients are quincunx subsampled, resulting in two polyphase components that are distributed between the two descriptions. Recall that, for a quincunx decimation of a 2D field $(x[n, m])_{n,m}$, the two polyphase components are defined by

$$x^{(q)}[n, m] = x[n + m + q,\; n - m] \tag{2}$$

where $q \in \{0, 1\}$. In Description I, we choose to store the coefficients $d^{h,(0)}_j$, $d^{v,(0)}_j$, $d^{d,(0)}_j$, for $j \in \{1, \dots, J-1\}$, whereas Description II is built from the detail coefficients $d^{h,(1)}_j$, $d^{v,(1)}_j$, $d^{d,(1)}_j$. Note that this operation adds no redundancy; redundancy is introduced only by a more elaborate processing at the coarsest resolution $J$.

B. Building the Multiple Description Scheme

We now concentrate on the last decomposition stage ($j = J$). In order to simplify the notations, we drop the subscript $J$ in this section. Also, the approximation coefficients obtained at the resolution level $J - 1$ are denoted by $(x[n, m])_{n,m}$. In order to create the desired correlated descriptions, shifted variants of the last level filter bank decomposition are considered. More precisely, we can compute

$$
\begin{aligned}
a_{(s,s')}[n, m] &= \sum_{k,l} x[k, l]\, h[2n + s - k]\, h[2m + s' - l] \\
d^h_{(s,s')}[n, m] &= \sum_{k,l} x[k, l]\, h[2n + s - k]\, g[2m + s' - l] \\
d^v_{(s,s')}[n, m] &= \sum_{k,l} x[k, l]\, g[2n + s - k]\, h[2m + s' - l] \\
d^d_{(s,s')}[n, m] &= \sum_{k,l} x[k, l]\, g[2n + s - k]\, g[2m + s' - l]
\end{aligned}
$$

where the shift parameters $s$ and $s'$ are either equal to 0 or 1. The two shift values may be interpreted as the possibility of decimating at odd or even positions in a dyadic critically subsampled filter bank. In our approach, two shifted decompositions will be employed, the first one corresponding to shift parameters $(s_1, s'_1)$ and the second one to $(s_2, s'_2) \neq (s_1, s'_1)$. By keeping all the so-obtained coefficients, we would introduce a redundancy factor of 2 at the coarsest resolution level. As shown below, it is however possible to reduce this redundancy by discarding some of the subband coefficients. To this end, we perform a quincunx subsampling of the four subbands of each of the two retained shifted decompositions. This yields 16 polyphase components: $a^{(q)}_{(s_i,s'_i)}$, $d^{h,(q)}_{(s_i,s'_i)}$, $d^{v,(q)}_{(s_i,s'_i)}$, $d^{d,(q)}_{(s_i,s'_i)}$, where $i \in \{1, 2\}$ and $q \in \{0, 1\}$. Both the approximation sequences resulting from the two shifted decompositions will be preserved so as to guarantee a high enough reconstruction quality for the side decoders. Meanwhile, no redundancy will be introduced on the high-pass subbands, since only 6 of the 12 polyphase components of the detail coefficients are transmitted. In this way, 4 approximation components and 6 detail components are sent, against the 8 components of a critically sampled decomposition, and the redundancy factor at the coarsest resolution level is limited to 10/8 = 1.25.

Different representations can be built in this way depending on the choice of $(s_1, s'_1)$, $(s_2, s'_2)$ and the 6 selected quincunx polyphase components. However, not every choice leads to a frame decomposition, i.e. one from which the image can be perfectly reconstructed when no quantization or errors occur [12]. Perfect reconstruction can be checked by studying the invertibility of the polyphase transfer function matrix of the corresponding oversampled filter bank. From this study, two schemes have been selected.

In the first one, Description I is made of $a_{(0,0)}$, $d^{h,(0)}_{(0,0)}$, $d^{v,(0)}_{(0,0)}$, $d^{d,(0)}_{(0,0)}$, whereas Description II contains $a_{(1,1)}$, $d^{h,(1)}_{(0,0)}$, $d^{v,(1)}_{(0,0)}$, $d^{d,(1)}_{(0,0)}$. Note that this scheme includes all the coefficients resulting from a decomposition onto the "regular" wavelet basis and the completeness of the redundant representation is therefore obviously guaranteed.

A second scheme is given by the following sequences for Description I: $a_{(0,0)}$, $d^{h,(0)}_{(1,0)}$, $d^{v,(0)}_{(0,0)}$, $d^{d,(1)}_{(1,0)}$, and for Description II: $a_{(1,0)}$, $d^{h,(1)}_{(1,0)}$, $d^{v,(1)}_{(0,0)}$, $d^{d,(0)}_{(1,0)}$. Note that symmetric choices obtained by some permutations of the indices $s_1$, $s'_1$, $s_2$, $s'_2$ and $q$ lead to equivalent structures.

III. DECODER OPTIMIZATION

For simplicity, we now adopt more concise notations:

• $\alpha_i = (\alpha_i[j])_{1 \le j \le N} = F_i x$, $i \in \{1, 2\}$, is the vector of coefficients resulting from the decomposition of an image $x$ onto one of two biorthogonal wavelet bases; the image $x$ is viewed as a column vector in $\mathbb{R}^N$, where $N$ is the number of pixels of $x$.
• $\hat{\alpha}_i = Q_\delta(\alpha_i)$ denotes the vector of quantized coefficients using a uniform quantizer $Q_\delta$ with quantization step $\delta \in \mathbb{R}^*_+$ (the extension to non-uniform/distinct quantizers is immediate).

From the received description(s), a subset
$(\hat{\alpha}_i[j])_{j \in J_i}$ of quantized coefficients is known, where $J_i \subset \{1, \dots, N\}$. Note that, when $J_1 = \emptyset$ or $J_2 = \emptyset$, the reconstruction of $x$ is achieved by directly inverting $F_2$ or $F_1$. This situation however never arises for the central decoder and it happens only in specific cases for the side decoders. Subsequently, we address the general case when both $J_1$ and $J_2$ are non-empty.

Fig. 1. RD evaluation of the two schemes: central (top graph) and side (bottom graph) decoders. (Both graphs plot PSNR [dB] against bitrate [bpp]. Top graph curves: Init. Scheme 1, Opt. Scheme 1, Init. Scheme 2, Opt. Scheme 2. Bottom graph curves: Init. and Opt. for Schemes 1 and 2, each for Side Dec. 1 and Side Dec. 2.)

A. Problem formulation

The a priori information about our problem is modeled by the quantization constraints:

$$\forall i \in \{1, 2\},\ \forall j \in J_i, \quad |\alpha_i[j] - \hat{\alpha}_i[j]| \le \frac{\delta}{2}. \tag{3}$$

The constraints (3) define the closed hyperslabs

$$S_{i,j} = \Big\{ x \;\Big|\; -\frac{\delta}{2} \le f_i[j]^T x - \hat{\alpha}_i[j] \le \frac{\delta}{2} \Big\} \tag{4}$$

where $f_i[j]$ is the $j$-th basis function of the $i$-th representation (so that $f_i[j]^T$ is the $j$-th row of the matrix $F_i$ and $\alpha_i[j] = f_i[j]^T x$). The decoded image should therefore belong to the closed convex set $S = \bigcap_{i \in \{1,2\}} \bigcap_{j \in J_i} S_{i,j}$. Let $x_0$ be a reference image we expect the decoded image to be close to. Such a reference image may correspond to an initial estimate of the original image. The decoding problem can be cast as

$$\text{Find } \hat{x} = \arg\min_{x \in S} \|x - x_0\| \tag{5}$$

where $\|\cdot\|$ denotes the Euclidean norm of $\mathbb{R}^N$. This means that $\hat{x}$ is the projection of $x_0$ onto $S$. Usually, there exists no closed form expression of $\hat{x}$ and an iterative optimization algorithm needs to be used to compute it. Before presenting such an algorithm, we note that the projection onto each set $S_{i,j}$ is easily expressed as

$$P_{S_{i,j}}(x_0) = x_0 - \gamma_i[j]\, \frac{f_i[j]}{\|f_i[j]\|} \tag{6}$$

where $\gamma_i = T_i(\alpha_{i,0})$ with $\alpha_{i,0} = F_i x_0$. Each operator $T_i$ is such that, for all $j \notin J_i$, $\gamma_i[j] = 0$ and, for all $j \in J_i$,

$$
\gamma_i[j] =
\begin{cases}
\dfrac{\alpha_{i,0}[j] - \frac{\delta}{2} - \hat{\alpha}_i[j]}{\|f_i[j]\|}, & \text{if } \alpha_{i,0}[j] > \frac{\delta}{2} + \hat{\alpha}_i[j] \\[2ex]
0, & \text{if } |\alpha_{i,0}[j] - \hat{\alpha}_i[j]| \le \frac{\delta}{2} \\[1ex]
\dfrac{\alpha_{i,0}[j] + \frac{\delta}{2} - \hat{\alpha}_i[j]}{\|f_i[j]\|}, & \text{if } \alpha_{i,0}[j] < -\frac{\delta}{2} + \hat{\alpha}_i[j].
\end{cases}
\tag{7}
$$

As proposed in [8], a POCS algorithm can be used in this context but it suffers from two main drawbacks. First, it does not converge to the best approximation of $x_0$ in $S$ but to an arbitrary element of $S$. Secondly, its convergence is slow.

B. Proposed iterative method

The proposed method is derived from the general algorithm for minimizing a quadratic convex function under convex constraints, which was developed in [9]:

Algorithm 1:
➀ Fix $\epsilon \in (0, 1/2)$ and set $k = 0$.
➁ Calculate $\alpha_{i,k} = F_i x_k$, $i \in \{1, 2\}$.
➂ Set $\gamma_{i,k} = T_i(\alpha_{i,k})$ and $\lambda_{i,k} = \|\gamma_{i,k}\|^2$, for $i \in \{1, 2\}$. Calculate $\tilde{\gamma}_{i,k} = (\gamma_{i,k}[j] / \|f_i[j]\|)_j$.
➃ Set $a_{i,k} = -F_i^T \tilde{\gamma}_{i,k}$, $i \in \{1, 2\}$.
➄ Choose $\omega_{1,k} \in (\epsilon, 1 - \epsilon)$ and set $\omega_{2,k} = 1 - \omega_{1,k}$. Set $L_k = \omega_{1,k} \lambda_{1,k} + \omega_{2,k} \lambda_{2,k}$.
➅ If $L_k = 0$, exit iteration. Otherwise, set $v_k = \omega_{1,k} a_{1,k} + \omega_{2,k} a_{2,k}$ and $d_k = \dfrac{L_k}{\|v_k\|^2}\, v_k$.
➆ Set $b_k = x_0 - x_k$, $\pi_k = -b_k^T d_k$, $\mu_k = \|b_k\|^2$, $\nu_k = \|d_k\|^2$, and $\rho_k = \mu_k \nu_k - \pi_k^2$.
➇ Set
$$
x_{k+1} =
\begin{cases}
x_k + d_k & \text{if } \rho_k = 0 \text{ and } \pi_k \ge 0, \\[1ex]
x_0 + \Big(1 + \dfrac{\pi_k}{\nu_k}\Big) d_k & \text{if } \rho_k > 0 \text{ and } \pi_k \nu_k \ge \rho_k, \\[2ex]
x_k + \dfrac{\nu_k}{\rho_k} \big(\pi_k b_k + \mu_k d_k\big) & \text{if } \rho_k > 0 \text{ and } \pi_k \nu_k < \rho_k.
\end{cases}
$$
➈ Set $k \leftarrow k + 1$ and go to ➁.
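The steps of Algorithm 1, together with the operator $T_i$ of Eq. (7), can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not in the paper: the operators $F_1$ and $F_2$ are small dense matrices (random orthogonal ones in the example) rather than wavelet filter banks, the weight $\omega_{1,k}$ is fixed to 1/2, exact zero tests are replaced by small tolerances, and the reference $x_0$ is simply the zero vector. The names `slab_residual` and `decode` are hypothetical, and the constraint sets are assumed to have a non-empty intersection.

```python
import numpy as np

def slab_residual(alpha, alpha_hat, delta, norms, known):
    """Operator T_i of Eq. (7); entries outside J_i (known == False) are zero."""
    r = alpha - alpha_hat
    excess = np.sign(r) * np.maximum(np.abs(r) - delta / 2, 0.0)
    return np.where(known, excess / norms, 0.0)

def decode(x0, F, alpha_hat, known, delta, n_iter=100, omega1=0.5, tol=1e-12):
    """Sketch of Algorithm 1 for two analysis matrices F = (F1, F2)."""
    norms = [np.linalg.norm(Fi, axis=1) for Fi in F]   # ||f_i[j]||, rows of F_i
    w = (omega1, 1.0 - omega1)                          # step 5 weights
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        # Steps 2-4: residuals, their energies, back-projected directions.
        gam = [slab_residual(Fi @ x, ah, delta, nr, kn)
               for Fi, ah, nr, kn in zip(F, alpha_hat, norms, known)]
        lam = [g @ g for g in gam]
        a = [-(Fi.T @ (g / nr)) for Fi, g, nr in zip(F, gam, norms)]
        # Steps 5-6: weighted combination and extrapolated step d_k.
        L = w[0] * lam[0] + w[1] * lam[1]
        if L <= tol:                                    # all constraints met
            break
        v = w[0] * a[0] + w[1] * a[1]
        d = (L / (v @ v)) * v
        # Steps 7-8: update that keeps the iterate anchored to x0.
        b = x0 - x
        pi, mu, nu = -(b @ d), b @ b, d @ d
        rho = mu * nu - pi ** 2
        if rho <= 1e-12 * mu * nu:                      # numerically rho = 0
            x = x + d
        elif pi * nu >= rho:
            x = x0 + (1.0 + pi / nu) * d
        else:
            x = x + (nu / rho) * (pi * b + mu * d)
    return x

# Toy usage: two random orthogonal "bases" and a uniform quantizer of step delta.
rng = np.random.default_rng(0)
N, delta = 8, 0.5
F1, _ = np.linalg.qr(rng.standard_normal((N, N)))
F2, _ = np.linalg.qr(rng.standard_normal((N, N)))
x_true = rng.standard_normal(N)
alpha_hat = [delta * np.round(Fi @ x_true / delta) for Fi in (F1, F2)]
known = [np.ones(N, dtype=bool), np.ones(N, dtype=bool)]
x0 = np.zeros(N)                                        # hypothetical reference
x_hat = decode(x0, (F1, F2), alpha_hat, known, delta)
```

In this toy setting the two matrix-vector products per iteration play the role of the analysis and adjoint filter banks; the two list comprehensions over `F` are the parts that could run in parallel, one per representation.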

It can be noticed that the adjoint operators $F_i^T$ involved in step ➃ can be implemented by using filter bank structures. Another interesting characteristic of the algorithm is that the computations in ➁, ➂ and ➃ can be parallelized on a dual-processor unit.

IV. EXPERIMENTAL RESULTS

We have performed experiments in which the image is decomposed with 9/7 biorthogonal filters on three levels and the wavelet coefficients of the different descriptions are encoded with the EZBC algorithm [7]. We present here the results on the 512×512 grayscale "Lena" image.

In the first set of simulations we consider the two schemes proposed at the end of Section II, evaluated when the transmitted descriptions are either both perfectly received or when one of them is lost. In Figs. 1 and 2 these schemes are referred to as "Scheme 1" and "Scheme 2". The initialization of the iterative algorithm (denoted by "Init") for the central or side decoders corresponds here to a weighted sum of the reconstructed wavelet coefficients from both shifted representations which are used in a scheme. One can remark from Fig. 1 that after optimization, the two schemes demonstrate close performance (less than 0.5 dB of difference) when both descriptions have been received, even though the first scheme was about 1 dB lower without optimization. From the bottom graph, we also remark that the second scheme presents perfectly balanced descriptions, while for the structure including one basis representation, the two descriptions show a 4 dB difference in coding performance at almost all bitrates. Moreover, even the side decoders benefit from up to 2 dB of improvement by using our iterative algorithm.

Fig. 2 gives an idea of the convergence speed of the iterative algorithm for the two central decoders (top graph) and the four side decoders (bottom graph) corresponding to the two considered schemes. In the bottom graph, the fact that the second scheme is balanced can be easily noticed. Also, one should observe that since in Scheme 1 the first side decoder contains only coefficients from one of the bases, no optimization needs to be performed. The characteristics of the convergence curves are similar for all bitrates, the quickest convergence being obtained at low bitrates (almost 90% of the gain in 2 iterations) and the slowest at high bitrates (Fig. 2).

Fig. 2. Convergence speed for the proposed algorithm: central (top graph) at 1.7 bpp and side (bottom graph) at 0.8 bpp decoders. (Both graphs plot PSNR [dB] against the number of iterations. Top graph curves: Scheme 1 and Scheme 2, central decoder. Bottom graph curves: Schemes 1 and 2, side decoders 1 and 2.)

In the second set of experiments, we have tested the robustness of the proposed schemes. For illustration purposes, we present in Fig. 3 the reconstructed image before (21.22 dB) and after (32.51 dB) applying the iterative reconstruction algorithm. In this case, the image was compressed at 0.8 bpp and the two descriptions were affected by 5% of random losses. The quality of the initial image could obviously be improved by using spatial interpolation methods. We preferred not to use such empirical techniques in order to allow the reader to better localize the errors.

Fig. 3. Reconstruction: before (left image) and after (right image) convex optimization.

V. CONCLUSION

We have presented several multiple description schemes based on biorthogonal filter banks with the additional feature of low redundancy. This was obtained by performing an additional quincunx subsampling on the detail subbands. We have discussed some criteria to form two descriptions from a 2D wavelet frame. This work has also presented a convex optimization approach at the decoder end that enables good reconstruction even at low bitrates.

REFERENCES

[1] S. Servetto, K. Ramchandran, V. Vaishampayan, and K. Nahrstedt, "Multiple description wavelet based image coding," IEEE Trans. Image Processing, vol. 9, no. 5, pp. 813–826, 2000.
[2] I. V. Bajic and J. W. Woods, "Domain-based multiple description coding of images and video," IEEE Transactions on Image Processing, vol. 12(18), pp. 1211–1225, October 2003.
[3] W. Jiang and A. Ortega, "Multiple description coding via polyphase transform and selective quantization," in Proc. SPIE: Visual Communications and Image Processing, January 1999.
[4] V. K. Goyal, "Multiple description coding: Compression meets the network," IEEE Signal Processing Magazine, pp. 74–93, September 2001.
[5] ISO/IEC 15444-1, "Information technology – JPEG 2000 image coding system," 2000.
[6] R. Motwani and C. Guillemot, "Tree-structured oversampled filterbanks as joint source-channel codes: application to image transmission over erasure channels," IEEE Transactions on Signal Processing, vol. 52(9), pp. 2584–2599, Sept. 2004.
[7] S. Hsiang and J. Woods, "Embedded image coding using zeroblocks of subband/wavelet coefficients and context modeling," in Int. Symp. Circ. and Syst., Geneva, May 2000, pp. 589–593.
[8] P. A. Chou, S. Mehrotra, and A. Wang, "Multiple description decoding of overcomplete expansions using projections onto convex sets," in Data Compression Conference, 1999, pp. 72–81.
[9] P. L. Combettes, "A block-iterative surrogate constraint splitting method for quadratic signal recovery," IEEE Trans. Signal Processing, vol. 51, pp. 1771–1782, July 2003.
[10] V. K. Goyal, J. Kovačević, and J. A. Kelner, "Quantized frame expansions with erasures," J. of Appl. and Comput. Harmonic Analysis, vol. 10, pp. 203–233, May 2001.
[11] H. Bolcskei, F. Hlawatsch, and H. G. Feichtinger, "Frame-theoretic analysis of oversampled filter banks," IEEE Transactions on Signal Processing, vol. 46(12), pp. 3256–3268, December 1998.
[12] T. Petrisor, B. Pesquet-Popescu, and J.-C. Pesquet, "Perfect reconstruction in reduced redundancy wavelet-based multiple description coding of images," to appear in Proc. of EUSIPCO, 2005.
