On Optimal Frame Expansions for Multiple Description Quantization Sanjeev Mehrotra and Philip A. Chou

Microsoft Corporation, One Microsoft Way, Redmond, WA 98052-6399 sanjeevm,[email protected]

September 15, 1999

Abstract

We study the problem of finding the optimal overcomplete (frame) expansion and bit allocation for multiple description quantization of a Gaussian signal at high rates over a lossy channel. We provide a general analysis for the problem and solve it for the case of a 3 × 2 frame expansion.

1 Introduction

Recently, the use of multiple description quantization over lossy (erasure) channels has been studied extensively. Among the linear transform based approaches to multiple description coding are the critically sampled correlating transforms approach [1, 2, 3, 4, 5] and the overcomplete frame expansion approach [6, 4, 7, 8]. In [3], Goyal et al. present an analysis for optimizing the correlating transform in the critically sampled case. In this paper, we perform a similar analysis on an overcomplete expansion.

In multiple description quantization using overcomplete (frame) expansions, an input signal x ∈ R^K is represented by a vector y = Fx ∈ R^N, N > K, as shown in Figure 1. F is a so-called frame operator, any K of whose rows span R^K. The coefficients of y are scalar quantized to obtain ŷ, and are then independently entropy coded using on average a total of R bits allocated among the different coefficients. The coefficient encodings are grouped into P ≤ N descriptions and transmitted over the channel, which randomly erases some of these descriptions. The decoder receives P_r ≤ P descriptions, corresponding to receiving N_r ≤ N coefficients after potential erasures, and reconstructs the signal x̂ from the received descriptions. We wish to find the frame operator F and the bit allocation for the transform coefficients that minimize the expected squared error D = E[‖x − x̂‖²] subject to a constraint on the average rate R, for asymptotically large R. We assume that the distribution of the source is

[Block diagram: x → F → y → Quantizer → ŷ → Channel → ŷ_r → Decoder → x̂]

Figure 1: Block Diagram of System

Gaussian with known parameters, that the distribution of the channel (i.e., the probability of different erasure patterns) is known, and that the decoder uses linear reconstruction.
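The encode/erase/decode chain described above can be sketched numerically. This is a minimal illustration with an assumed unit-norm 3 × 2 frame and an assumed stepsize; none of the specific values come from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

K, N = 2, 3
# An assumed unit-norm 3x2 frame: three unit vectors at equally spaced angles.
angles = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
F = np.column_stack([np.cos(angles), np.sin(angles)])
assert np.allclose(F.T @ F, (N / K) * np.eye(K))  # this frame happens to be tight

x = rng.normal(size=K)                  # Gaussian source vector
y = F @ x                               # overcomplete expansion, y in R^3
Delta = 0.01                            # assumed common quantizer stepsize
y_hat = Delta * np.round(y / Delta)     # uniform scalar quantization

received = [0, 2]                       # suppose the channel erases description 1
Fr = F[received]                        # rows of F for the received coefficients
x_hat = np.linalg.pinv(Fr) @ y_hat[received]   # linear (pseudo-inverse) reconstruction

# With Nr >= K, the reconstruction error is on the order of the stepsize.
assert np.linalg.norm(x - x_hat) < 10 * Delta
```

Any two of the three rows of this frame still span R^2, so the pseudo-inverse recovers x up to quantization error even after one erasure.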

2 Analysis

Without loss of generality, we assume that x is distributed with zero mean and diagonal covariance matrix R_x = diag(σ_0², ..., σ_{K−1}²). If the source has some other distribution, with mean μ and full covariance matrix R̃_x, then there always exists an orthonormal transform A, the KLT, such that the random vector A(x − μ) has mean zero and diagonal covariance R_x. Further, since A is an orthonormal transform, E[‖A(x − μ) − A(x̂ − μ)‖²] = E[‖x − x̂‖²]. Hence if F is the optimal transform for the source with diagonal covariance, then the optimal transform for the source with full covariance is simply FA.

Since the channel is lossy, x̂ is a random variable that is a function of the random channel in addition to being a function of the random variable x and the quantizer. The expected distortion D can be written as an expectation of conditional distortions D_s conditioned on the channel state S = s. The number of channel states is 2^P, since each description can be either received or lost. Thus

    D = Σ_{s=0}^{2^P−1} p_s D_s,                                                  (1)

where p_s is the probability of the channel being in state s. We proceed to obtain expressions for D_s. Let y_r denote the N_r-dimensional vector of received coefficients and let y_nr denote the N_nr = N − N_r dimensional vector of erased coefficients, so that

    Πy = [y_r; y_nr] = [F_r; F_nr] x = ΠFx,

where Π is an N × N permutation matrix, F_r is an N_r × K matrix, and F_nr is an N_nr × K matrix. N_r and F_r are functions of the channel state S. Also let I_s(j) ∈ {0, ..., N−1}, j = 0, ..., N_r(s)−1, be the index of the jth received coefficient when the channel is in state s, so that the jth component of y_r is the I_s(j)th component of y. Thus F_r is simply F with the rows

of the erased components deleted.

To obtain an expression for D_s, there are two cases to consider: N_r ≥ K and N_r < K. When N_r ≥ K, the decoder has enough information to localize the input vector to a finite cell. We assume that the decoder uses the optimal linear reconstruction, the pseudo-inverse (even though the pseudo-inverse may not lead to a consistent reconstruction [9]). That is, x̂ = F_r^+ ŷ_r, where F_r^+ is the pseudo-inverse of F_r. Since x = F_r^+ y_r, the conditional distortion can be written as

    D_s = E[‖x − x̂‖² | S = s] = E[‖F_r^+ (y_r − ŷ_r)‖²] = tr((F_r^+)^T F_r^+ E[(y_r − ŷ_r)(y_r − ŷ_r)^T]),

where F_r^+ is a function of the channel state s. Since each component of y is scalar quantized at high rate, the quantization errors are approximately independent of each other, and

y_r − ŷ_r is approximately uniformly distributed with zero mean and diagonal covariance matrix diag(Δ_{I_s(0)}²/12, ..., Δ_{I_s(N_r−1)}²/12), where Δ_j is the quantization stepsize for the jth component of y (so Δ_{I_s(j)} is the quantization stepsize for the jth component of y_r). Therefore,

    D_s = Σ_{j=0}^{N_r−1} ((F_r^+)^T F_r^+)_{jj} Δ_{I_s(j)}²/12.                  (2)

Since the rows of F_r form a frame, the columns of F_r^+ form the dual frame. Thus ((F_r^+)^T F_r^+)_{jj} is simply the squared magnitude of the jth vector in the dual frame.

When N_r < K, there is not enough information to localize x to a finite cell. In particular, x is bounded in N_r dimensions and unbounded in K − N_r dimensions. Thus,

    x = F_r^+ y_r + η,                                                            (3)

for some η in the nullspace of F_r. Since η is in the nullspace of F_r, it can be written as a linear combination of vectors which form an orthonormal basis for the subspace orthogonal to the span of the rows of F_r. Let F_r^⊥ be a (K − N_r) × K matrix whose rows form such a basis. Then, using (3),

h

x = Fr

+

" # i yr ?T ? (Fr? )T y? = Fr yr + (Fr ) yr = Fr yr + ; +

(4)

+

r

where yr? is a (K ? Nr ) dimensional vector. Since the rows of Fr? form an orthonormal basis for the subspace, Fr?(Fr?)T = I; (5) and since they form a basis for the subspace orthogonal to the span of the rows of Fr ,

Fr?FrT = 0:

(6)

Since the span of the rows of Fr is the same as the span of the columns of Fr+ ,

Fr?Fr = 0:

(7)

The decoder now reconstructs using

    x̂ = E[x | y_r = ŷ_r]                                                         (8)
       = E[F_r^+ y_r + η | y_r = ŷ_r]                                             (9)
       = F_r^+ ŷ_r + E[(F_r^⊥)^T y_r^⊥ | y_r = ŷ_r],                             (10)

which using equations (4), (5), (7), and (10) gives a distortion of

    D_s = E[‖x − x̂‖² | S = s]
        = E[‖(F_r^+ y_r + (F_r^⊥)^T y_r^⊥) − (F_r^+ ŷ_r + E[(F_r^⊥)^T y_r^⊥ | y_r = ŷ_r])‖²]
        = E[‖F_r^+ (y_r − ŷ_r)‖²] + E[‖(F_r^⊥)^T (y_r^⊥ − E[y_r^⊥ | y_r = ŷ_r])‖²]
        = E[‖F_r^+ (y_r − ŷ_r)‖²] + E[‖y_r^⊥ − E[y_r^⊥ | y_r = ŷ_r]‖²].
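The orthogonal decomposition used in the last two steps (the cross term vanishing by (7), and the norm preservation by (5)) can be verified for any particular pair of error vectors; the random matrices and vectors below are illustrative stand-ins:

```python
import numpy as np

rng = np.random.default_rng(2)
K, Nr = 3, 2
Fr = rng.normal(size=(Nr, K))
_, _, Vt = np.linalg.svd(Fr)
Fr_perp = Vt[Nr:]                      # orthonormal basis of the orthogonal subspace
Fr_pinv = np.linalg.pinv(Fr)

e = rng.normal(size=Nr)                # stand-in for y_r - y_r_hat
v = rng.normal(size=K - Nr)            # stand-in for y_r_perp minus its conditional mean

total = np.sum((Fr_pinv @ e + Fr_perp.T @ v) ** 2)
# Cross term vanishes since Fr_perp @ Fr_pinv = 0; orthonormal rows give
# ||Fr_perp.T @ v|| = ||v||.
assert np.isclose(total, np.sum((Fr_pinv @ e) ** 2) + np.sum(v ** 2))
```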

The second part of the distortion is simply the trace of the covariance matrix of y_r^⊥ | y_r. Since x is Gaussian,

    [y_r; y_r^⊥] = [F_r^+ (F_r^⊥)^T]^{−1} x = [F_r; F_r^⊥] x                    (11)

is also Gaussian. Thus y_r^⊥ | y_r = ŷ_r is also Gaussian, with mean F_r^⊥ R_x F_r^T (F_r R_x F_r^T)^{−1} ŷ_r and covariance matrix F_r^⊥ R_x (F_r^⊥)^T − F_r^⊥ R_x F_r^T (F_r R_x F_r^T)^{−1} F_r R_x (F_r^⊥)^T. Therefore

    D_s = (Σ_{j=0}^{N_r−1} ((F_r^+)^T F_r^+)_{jj} Δ_{I_s(j)}²/12)
          + tr(F_r^⊥ R_x (F_r^⊥)^T − F_r^⊥ R_x F_r^T (F_r R_x F_r^T)^{−1} F_r R_x (F_r^⊥)^T).   (12)

Using (2) and (12) in equation (1) gives the expected distortion that is to be minimized over the transform F and the bit allocation. First, consider the problem of optimal bit allocation. Bit allocation determines the stepsize Δ_j, j = 0, ..., N−1, for each of the transform coefficients. Using (1), (2), and (12), the portion of the expected distortion that can be minimized by bit allocation can be written as

    D̄_b = Σ_{j=0}^{N−1} c_j Δ_j²,                                               (13)

where

    c_j = Σ_{s∈S(j)} p_s ((F_r^+(s))^T F_r^+(s))_{I_s^{−1}(j) I_s^{−1}(j)} / 12,

where S(j) is the set of states in which coefficient j is received and I_s^{−1}(j) gives the position of the jth component of y in y_r. Thus c_j is a constant that depends only on the channel state probabilities and the transform. Using a high rate approximation, if the coefficients of y are scalar quantized and entropy coded,

    Δ_j² = 2πe σ_{y_j}² 2^{−2R_j},                                               (14)

where R_j is the rate allocated to the jth component of y and σ_{y_j}² is the variance of the jth component of y. So the problem is to minimize

    D̄_b = Σ_{j=0}^{N−1} 2πe σ_{y_j}² c_j 2^{−2R_j},   subject to   Σ_{j=0}^{N−1} R_j ≤ R.

Using a Lagrangian formulation, the constrained problem is turned into the unconstrained problem of minimizing

    J = Σ_{j=0}^{N−1} (2πe σ_{y_j}² c_j 2^{−2R_j} + λ R_j),

where λ is chosen to meet the rate constraint. Equating the partial derivatives ∂J/∂R_j to 0 and solving for R_j yields

    R_j = R/N + (log₂(σ_{y_j}² c_j / k))/2,   where   k = (∏_{l=0}^{N−1} c_l σ_{y_l}²)^{1/N}.

k is the geometric mean of the c_l σ_{y_l}². This solution holds only if R_j ≥ 0 for all j; for high rate, R will be large and this will always be the case. Using this rate allocation and equation (14) yields

    Δ_j² = 2πe (k/c_j) 2^{−2R/N},

which using (13) gives

    D̄_b = 2πe N k 2^{−2R/N}.
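The optimal allocation and the resulting D̄_b can be checked numerically. The weights c_j, variances, and total rate below are arbitrary assumed values, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 3
c = rng.uniform(0.5, 2.0, size=N)        # assumed weights c_j
var_y = rng.uniform(0.5, 4.0, size=N)    # assumed coefficient variances
R = 12.0                                 # assumed total rate (bits), high enough that R_j >= 0

k = np.prod(c * var_y) ** (1.0 / N)      # geometric mean of c_l * sigma_{y_l}^2
Rj = R / N + 0.5 * np.log2(var_y * c / k)
assert np.all(Rj >= 0) and np.isclose(Rj.sum(), R)

# Distortion under the optimal allocation ...
D_opt = np.sum(2 * np.pi * np.e * var_y * c * 2.0 ** (-2 * Rj))
# ... matches the closed form 2*pi*e*N*k*2^(-2R/N) ...
assert np.isclose(D_opt, 2 * np.pi * np.e * N * k * 2.0 ** (-2 * R / N))

# ... and beats a rate-preserving perturbation (the objective is convex in R_j).
Rp = Rj + np.array([0.1, -0.1, 0.0])
D_pert = np.sum(2 * np.pi * np.e * var_y * c * 2.0 ** (-2 * Rp))
assert D_opt <= D_pert
```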

Using the above formulation, we consider the optimization of the transform when K = 2, N = 3, and P = 3. In this case the channel can be in one of eight possible states, and D_s is the expected distortion when the channel is in state s. The binary representation of s gives a bit vector (b_0 b_1 b_2) indicating which coefficients were received. For example, channel

state 5 = (101)₂ corresponds to receiving the first and third components and losing the second component; D_5 is the corresponding distortion and p_5 is the corresponding probability of the channel being in this state. The transform can be written as

    [y_0; y_1; y_2] = [a b; c d; e f] [x_0; x_1].

We assume that each of the transform vectors has unit norm, so a² + b² = 1, c² + d² = 1, and e² + f² = 1. This is reasonable since the optimal bit allocation will compensate for arbitrary scaling. D_0 is the distortion when nothing is received; clearly, D_0 = σ_0² + σ_1². D_1 is the distortion with two erasures, when only the last component is received. Here, since N_r < K, we find F_r^⊥ and, using (11),

    [y_2; y_2^⊥] = [e f; −f e] [x_0; x_1].

So,

    [y_2; y_2^⊥] ~ N(0, [e²σ_0² + f²σ_1², ef(σ_0² − σ_1²); ef(σ_0² − σ_1²), f²σ_0² + e²σ_1²]).

E[‖y_2^⊥‖² | y_2] = f²σ_0² + e²σ_1² − e²f²(σ_0² − σ_1²)²/(e²σ_0² + f²σ_1²), and using (12),

    D_1 = Δ_2²/12 + f²σ_0² + e²σ_1² − e²f²(σ_0² − σ_1²)²/(e²σ_0² + f²σ_1²)
        = Δ_2²/12 + σ_0²σ_1²/(e²σ_0² + f²σ_1²).

Similarly,

    D_2 = Δ_1²/12 + σ_0²σ_1²/(c²σ_0² + d²σ_1²)   and   D_4 = Δ_0²/12 + σ_0²σ_1²/(a²σ_0² + b²σ_1²).
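The simplification of D_1 relies on an algebraic identity (using e² + f² = 1), which can be checked together with the Gaussian conditional variance it came from; the variances and angles below are arbitrary test values:

```python
import numpy as np

s0, s1 = 4.0, 1.0                        # assumed variances sigma_0^2, sigma_1^2
for theta in np.linspace(0.1, np.pi / 2 - 0.1, 7):
    e, f = np.cos(theta), np.sin(theta)  # unit-norm row [e f]
    lhs = f**2 * s0 + e**2 * s1 - e**2 * f**2 * (s0 - s1) ** 2 / (e**2 * s0 + f**2 * s1)
    rhs = s0 * s1 / (e**2 * s0 + f**2 * s1)
    assert np.isclose(lhs, rhs)

    # Same value via Gaussian conditioning on the covariance of [y_2; y_2_perp].
    cov = np.array([[e**2 * s0 + f**2 * s1, e * f * (s0 - s1)],
                    [e * f * (s0 - s1), f**2 * s0 + e**2 * s1]])
    cond_var = cov[1, 1] - cov[0, 1] ** 2 / cov[0, 0]
    assert np.isclose(cond_var, rhs)
```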

D_3 is the distortion when the second and third components are received and the first component is lost. Here N_r ≥ K, and using x = F_r^+ y_r gives

    [x_0; x_1] = [c d; e f]^{−1} [y_1; y_2] = (1/(cf − de)) [f −d; −e c] [y_1; y_2].

So, using (2),

    D_3 = (Δ_1²(f² + e²) + Δ_2²(d² + c²)) / (12(cf − de)²) = (Δ_1² + Δ_2²) / (12(cf − de)²).

Similarly,

    D_5 = (Δ_0² + Δ_2²) / (12(af − be)²)   and   D_6 = (Δ_0² + Δ_1²) / (12(ad − bc)²).
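The closed form for D_3 can be checked against the generic formula (2) with an explicit pseudo-inverse; the row angles and stepsizes are arbitrary assumed values:

```python
import numpy as np

# Unit-norm rows [c d] and [e f] of the received 2x2 sub-frame (assumed angles).
tc, te = 0.4, 1.3
c, d = np.cos(tc), np.sin(tc)
e, f = np.cos(te), np.sin(te)
Fr = np.array([[c, d], [e, f]])
Fr_pinv = np.linalg.pinv(Fr)             # equals the matrix inverse here

Delta1, Delta2 = 0.1, 0.07               # assumed stepsizes for y_1 and y_2
G = Fr_pinv.T @ Fr_pinv
D3_generic = G[0, 0] * Delta1**2 / 12 + G[1, 1] * Delta2**2 / 12   # formula (2)
D3_closed = (Delta1**2 + Delta2**2) / (12 * (c * f - d * e) ** 2)  # closed form
assert np.isclose(D3_generic, D3_closed)
```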

To calculate D_7, when all of the components are received, we assume the frame is tight. When all of the coefficients are received, F_r = F. If the frame is tight, F^T F = r I_{K×K}, where r = N/K is the redundancy of the frame. Thus F^+ = (1/r) F^T, and (F^+)^T F^+ = (1/r²) FF^T. Since the vectors in the frame are assumed to have unit length, (FF^T)_{ii} = 1 for i = 0, ..., N−1, and so ((F^+)^T F^+)_{ii} = 1/r² for all i. For a 3 × 2 expansion, ((F^+)^T F^+)_{ii} = (2/3)². So, using (2),

    D_7 = (4/9) (Δ_0²/12 + Δ_1²/12 + Δ_2²/12).

Let p_{ei} be the probability of each channel state in which i coefficients are received. So p_0 = p_{e0}, p_1 = p_2 = p_4 = p_{e1}, p_3 = p_5 = p_6 = p_{e2}, and p_7 = p_{e3}. Using the above analysis in equation (13), the portion of the distortion that can be minimized by the bit allocation is

    D̄_b = (p_{e1}/12 + p_{e2}/(12(af − be)²) + p_{e2}/(12(ad − bc)²) + p_{e3}/27) Δ_0²
        + (p_{e1}/12 + p_{e2}/(12(cf − de)²) + p_{e2}/(12(ad − bc)²) + p_{e3}/27) Δ_1²
        + (p_{e1}/12 + p_{e2}/(12(af − be)²) + p_{e2}/(12(cf − de)²) + p_{e3}/27) Δ_2²
        = c_0 Δ_0² + c_1 Δ_1² + c_2 Δ_2².

Also, σ_{y_0}² = a²σ_0² + b²σ_1², σ_{y_1}² = c²σ_0² + d²σ_1², and σ_{y_2}² = e²σ_0² + f²σ_1². Using this, the optimal bit allocation gives

    k = [c_0 c_1 c_2 (a²σ_0² + b²σ_1²)(c²σ_0² + d²σ_1²)(e²σ_0² + f²σ_1²)]^{1/3}.

Thus, with the optimal bit allocation,

    D = p_{e1} (σ_0²σ_1²/(a²σ_0² + b²σ_1²) + σ_0²σ_1²/(c²σ_0² + d²σ_1²) + σ_0²σ_1²/(e²σ_0² + f²σ_1²)) + 2πe N k 2^{−2R/N},

neglecting the constant term p_{e0}(σ_0² + σ_1²), which depends on neither the transform nor the bit allocation.

The first part of D is the part that cannot be optimized by varying the bit allocation; it is the second portion of the distortion given in (12). Although it is hard to find a closed form solution for the minimum, we can say a few things about the endpoints. Assume σ_0 ≥ σ_1. When the rate is very high, or when p_{e2} and p_{e3} are very small, the optimal solution only has to minimize the first part of D. Since σ_0²σ_1²/(a²σ_0² + b²σ_1²) is monotonically decreasing from σ_0² to σ_1² as a goes from 0 to 1, the optimal value for a is close to 1. For this case the optimal solution is simply to repeat the component with the largest variance in all three coefficients. When p_{e1} is small, or when σ_0 = σ_1, i.e. the source is white, then the optimal

solution has to minimize only the second component of D, which is to minimize k. The first component of k (the product c_0 c_1 c_2), by symmetry, is minimized when the three vectors are equally spaced, so a = cos(0), b = sin(0), c = cos(2π/3), d = sin(2π/3), e = cos(4π/3), f = sin(4π/3). However, the second portion (the geometric mean of the σ_{y_l}²) is minimized when b, d, and f are close to 1.

Another case we consider is a transform of the form

    [1 0; 0 1; e f].

We use the same distortions as in the general 3 × 2 case, but since the frame is not tight, we calculate D_7 using (2):

    D_7 = ((1 + f²)² + e²f²)/4 · Δ_0²/12 + ((1 + e²)² + e²f²)/4 · Δ_1²/12 + Δ_2²/(12 · 4).

Then we find k using the optimal bit allocation as

    c_0 = p_{e1}/12 + p_{e2}/(12f²) + p_{e2}/12 + p_{e3}((1 + f²)² + e²f²)/(12 · 4),
    c_1 = p_{e1}/12 + p_{e2}/(12e²) + p_{e2}/12 + p_{e3}((1 + e²)² + e²f²)/(12 · 4),
    c_2 = p_{e1}/12 + p_{e2}/(12f²) + p_{e2}/(12e²) + p_{e3}/(12 · 4),
    k = [c_0 c_1 c_2 (σ_0²)(σ_1²)(e²σ_0² + f²σ_1²)]^{1/3}.

And then the expected distortion is

    D = p_{e1} (σ_1² + σ_0² + σ_0²σ_1²/(e²σ_0² + f²σ_1²)) + 2πe N k 2^{−2R/N},

where f² = 1 − e². So the only thing to optimize is the value of e, or equivalently the value of θ = tan⁻¹(f/e). Assume σ_0 ≥ σ_1. The first part of D is minimized by letting e = 1 − ε; the solution holds as ε → 0 but ε ≠ 0. This corresponds to θ close to 0. Note that if θ = 0, then the second portion of D would be infinite, since k would be infinite. This holds when σ_1 = 0 or when p_{e2} and p_{e3} are small (the loss rate is high), and corresponds to repeating the component with the higher variance. The second part of D is minimized when k is minimized. The product c_0 c_1 c_2 in k is minimized when e = f, or [e f] = [1/√2 1/√2]; this corresponds to taking the average of x_0 and x_1. However, the geometric mean of the variances in k is minimized when e = 0. So, k is minimized when θ ∈ [π/4, π/2]. Minimizing k is valid when σ_0 = σ_1 or when p_{e1} is small (the loss rate is low).

A closed form solution is hard, so we show some numerical results here. Let σ_0² = 4 and σ_1² = 1. The probability of losing a single coefficient is varied from 0 to 1. The results for the optimal θ are shown in Figure 2(a) for R = 5 and R = 10 bits. As expected, when the probability of loss is small, θ ∈ [π/4, π/2], and when the loss probability is large, θ goes to 0. Also, θ goes to 0 faster in the higher rate case. Also shown, in Figure 2(b), is the optimal bit allocation when R = 5 bits.
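A grid search reproduces this qualitative behavior. The sketch below assumes coefficients are lost independently with probability p, so a state receiving exactly i coefficients has probability p^(3−i)(1−p)^i; this independence model is an assumption for illustration, since the analysis only requires the state probabilities to be known:

```python
import numpy as np

s0, s1 = 4.0, 1.0                  # sigma_0^2 = 4, sigma_1^2 = 1, as in the text
N = 3

def expected_D(theta, p, R):
    # p: probability a single coefficient is lost; losses assumed independent.
    e, f = np.cos(theta), np.sin(theta)
    pe1 = p**2 * (1 - p)           # probability of a state with exactly 1 received
    pe2 = p * (1 - p)**2           # ... exactly 2 received
    pe3 = (1 - p)**3               # ... all 3 received
    c0 = pe1/12 + pe2/(12*f**2) + pe2/12 + pe3*((1 + f**2)**2 + e**2*f**2)/48
    c1 = pe1/12 + pe2/(12*e**2) + pe2/12 + pe3*((1 + e**2)**2 + e**2*f**2)/48
    c2 = pe1/12 + pe2/(12*f**2) + pe2/(12*e**2) + pe3/48
    k = (c0 * c1 * c2 * s0 * s1 * (e**2*s0 + f**2*s1)) ** (1/3)
    first = pe1 * (s1 + s0 + s0*s1/(e**2*s0 + f**2*s1))
    return first + 2*np.pi*np.e*N*k*2.0**(-2*R/N)

thetas = np.linspace(0.01, np.pi/2 - 0.01, 400)
best = lambda p, R: thetas[np.argmin([expected_D(t, p, R) for t in thetas])]

low_loss = best(0.05, 5.0)         # low loss: near-orthogonal, theta in [pi/4, pi/2]
high_loss = best(0.95, 10.0)       # high loss, high rate: theta pushed toward 0
assert np.pi/4 <= low_loss <= np.pi/2
assert high_loss < np.pi/4
```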

[Plots: (a) optimal θ (from 0 to π/2) versus loss probability, for R = 5 bits and R = 10 bits; (b) rates allocated to y_0, y_1, y_2 (bits) versus loss probability, for R = 5 bits.]

Figure 2: Results for optimal (a) value for θ = tan⁻¹(f/e) and (b) rate allocation.

3 Conclusion

We have presented an analysis for optimizing the bit allocation and transform in an overcomplete expansion for use over a lossy channel. Results were presented for a 3 × 2 expansion, and numerical results were presented for a special case of a 3 × 2 expansion. At low loss rates, the transform tries to get as close to an orthogonal transform as possible. At high loss rates, the transform tries to repeat the coefficient with the highest variance. With this intuition, we can try to find optimal transforms in higher dimensions.

References

[1] Y. Wang, M. T. Orchard, and A. R. Reibman. Multiple description image coding for noisy channels by pairing transform coefficients. In Proc. Workshop on Multimedia Signal Processing, pages 419-424, Princeton, NJ, June 1997. IEEE.
[2] M. T. Orchard, Y. Wang, V. A. Vaishampayan, and A. R. Reibman. Redundancy rate-distortion analysis of multiple description coding. In Proc. Int'l Conf. Image Processing, Santa Barbara, CA, October 1997. IEEE.
[3] V. K. Goyal and J. Kovacevic. Optimal multiple description transform coding of Gaussian vectors. In Proc. Data Compression Conference, pages 388-397, Snowbird, UT, March 1998. IEEE Computer Society.
[4] V. K. Goyal, J. Kovacevic, R. Arean, and M. Vetterli. Multiple description transform coding of images. In Proc. Int'l Conf. Image Processing, Chicago, IL, October 1998. IEEE.
[5] Y. Wang, M. T. Orchard, and A. R. Reibman. Optimal pairwise correlating transforms for multiple description coding. In Proc. Int'l Conf. Image Processing, Chicago, IL, October 1998. IEEE.
[6] V. K. Goyal, J. Kovacevic, and M. Vetterli. Multiple description transform coding: robustness to erasures using tight frame expansions. In Proc. Int'l Symp. Information Theory, page 408, Cambridge, MA, August 1998. IEEE.
[7] V. K. Goyal, J. Kovacevic, and M. Vetterli. Quantized frame expansions as source-channel codes for erasure channels. In Proc. Data Compression Conference, pages 326-335, Snowbird, UT, March 1999. IEEE Computer Society.
[8] P. A. Chou, S. Mehrotra, and A. Wang. Multiple description coding of overcomplete expansions using projection onto convex sets. In Proc. Data Compression Conference, pages 72-81, Snowbird, UT, March 1999. IEEE Computer Society.
[9] V. K. Goyal, M. Vetterli, and N. T. Thao. Quantized overcomplete expansions in R^n: Analysis, synthesis, and algorithms. IEEE Trans. Information Theory, 44(1):16-31, January 1998.
