Adaptive Source-Channel Subband Video Coding for Wireless Channels

Murari Srinivasan and Rama Chellappa
Department of Electrical Engineering and Center for Automation Research
University of Maryland, College Park, MD 20742
Email: {murari, [email protected]
Tel: (301) 405-3656, Fax: (301) 314-9115

August 14, 1998

Abstract

This paper presents a general framework for combined source-channel coding within the context of subband coding. The unequal importance of subbands in reconstruction of the source is exploited by an appropriate allocation of source and channel coding rates for the coding and transmission of subbands over a noisy channel. For each subband, the source coding rate as well as the level of protection (quantified by the channel coding rate) are jointly chosen to minimize the total end-to-end mean-squared distortion suffered by the source. This allocation of source and channel coding rates is posed as a constrained optimization problem, and solved using a generalized bit allocation algorithm. The optimal choice of source and channel coding rates depends on the state of the physical channel. These results are extended to transmission over fading channels using a finite state model, where every state corresponds to an AWGN channel. A coding strategy is also developed that minimizes the average distortion when the channel state is unavailable at the transmitter. Experimental results are provided that demonstrate application of these combined source-channel coding strategies on video sequences.

This work was supported in part by the Office of Naval Research under MURI grant N00014-95-1-0521 (ARPA order C635).

List of Tables

1  Rate Allocation as a function of channel state
2  Discretization of the Channel Fade Distribution
3  Channel-averaged distortion performance of the coding schemes

List of Figures

1   The Equivalent Channel as seen by the Source Coder
2   The distribution of instantaneous SNR for lognormal fading, \gamma_s = 4.5 dB
3   The joint source-channel coding system
4   The spatio-temporal subband decomposition used
5   Comparison of Computed and Estimated Distortions
6   Performance of the Adaptive Source-Channel coder
7   Partitioning of the pdf of the SNR into a finite number of states
8   Performance of the Coding Scheme that minimizes Average Distortion
9   Frame 301 of the sequence received at a channel SNR of 1 dB, PSNR = 27.17 dB
10  Frame 301 of the sequence received at a channel SNR of 4 dB, PSNR = 29.37 dB
11  Frame 301 of the sequence received at a channel SNR of 8 dB, PSNR = 32.94 dB

1 Introduction

One of the important components of multimedia communications is reliable transmission of digital video over wireless channels. Wireless channels exhibit fading effects caused by shadowing and multipath phenomena. Video coding schemes based on current standards such as MPEG or H.261/H.263 perform very poorly over such channels, especially under low SNR conditions. A video compression scheme designed for these channels must degrade gracefully in performance when channel fading occurs. This requirement favors a coding method that adapts itself to the channel condition.

In this paper, we consider a three-dimensional subband coding scheme, motivated by two reasons. Firstly, the propagation of channel-induced errors is limited due to the small temporal extent of the spatio-temporal decomposition. Secondly, the inherent multiresolution character of the coding scheme naturally lends itself to unequal error protection (UEP). UEP enables an efficient utilization of available bandwidth by protecting different layers of information according to their importance.

Shannon's separation principle [1] establishes the optimality of separate design of source and channel coders, and states that total distortion is essentially limited to the source coding distortion as long as the rate of the source coder is less than channel capacity. This, however, is an asymptotic result, and is achieved in the limit by using extremely long block codes. Real-world systems benefit through joint design of the source and channel coders, given knowledge of the channel. The aim of a combined source-channel coding approach is to allocate bits between the source and channel coders in an optimal manner, subject to a constraint on the overall coding rate.

Several constructive approaches for combined source-channel coding have been proposed in the literature. One of the earliest papers in this area is by Modestino et al., who illustrate the advantages of source-channel coding using the DCT in [2]. Vaishampayan and Farvardin [3] present a source-channel DCT coding approach using quantizers optimized for noisy channels. They also provide theoretical predictions for performance and rate-distortion theoretic bounds based on an image model. There has been a large body of work on robust vector quantization for transmission over noisy channels. These include the work on index assignment [4] and channel-optimized VQ (COVQ) [5, 6]. Some approaches have used bit-sensitivity calculations to design UEP schemes using variable rate channel codes. Ruf and Modestino [7] use bit-sensitivity analysis to optimally allocate source and channel coding rates for image transmission over noisy channels and also provide information-theoretic bounds on performance. Cheung and Zakhor [8] consider three-dimensional subband coding using multi-rate quantization over noisy channels, also using a bit-sensitivity approach. Another class of approaches is based on joint source-channel decoding. Burlina and Alajaji [9] use a joint source-channel decoding approach in the form of MAP decoding for exploiting residual redundancy in images transmitted over channels with memory. Xu, Hagenauer and Hollmann [10] demonstrate the advantages of a joint source-channel decoding approach using residual redundancy in compressed images.

We approach this problem along the lines of [11] and [12], in which we have developed a combined source-channel subband coding strategy based on UEP for the

subbands, as opposed to UEP based on bit-sensitivities. In typical source coding systems, bits are allocated to different transform or subband coefficients in a rate-distortion (source coding rate vs. coding distortion) framework. Our combined source-channel coding approach is analogous to the rate-distortion based source coding approach in that it minimizes the end-to-end mean square distortion by suitably choosing source and channel coding rates for each of the subbands. In our formulation, we derive analytical expressions for the end-to-end distortion in the source as a function of source and channel coding rates for the subbands. The goal is to minimize the distortion subject to a constraint on the overall coding rate. This is posed as a constrained minimization problem where the minimization is carried out over all choices of source and channel coding rates for the subbands. We then use the notion of a "composite" rate-distortion curve (which reflects the optimal choice of source and channel coding rates for a given overall coding rate) to reformulate the minimization problem in terms of an allocation of total coding rates to the subbands. This minimization problem is then solved using a dynamic programming approach that was developed by Shoham and Gersho in [13].

We apply these general techniques to a specific subband video coding scheme. In this scheme, a spatio-temporal subband decomposition followed by vector quantization (VQ) of the subband coefficients forms the source coding approach. The VQ indices of each coded subband are interleaved and protected using rate-compatible punctured convolutional (RCPC) codes [14]. Interleaving the indices distributes error bursts (caused by the Viterbi decoder making a wrong decision which leads to deviation from the correct path in the trellis) among the codewords. In a VQ designed by "splitting" [15], codewords that differ by a larger Hamming distance usually represent centroids that are farther away in Euclidean distance. Interleaving helps in this situation because codeword indices that are received erroneously are only a small Hamming distance away from the correct indices. Interleaving would be even more advantageous if optimal index assignment strategies (along the lines of [4]) are used. Furthermore, interleaving aids in the analytical computation of the channel-induced distortion by making the equivalent channel (the channel as seen by the source encoder and decoder) memoryless.

For this system, we derive operational rate-distortion curves for each subband which reflect source coding distortion as well as channel-induced distortion. These "composite" rate-distortion curves are computed along the lines of the approach in [16], except that the interleaver in the system considered here renders the equivalent channel memoryless. Joint selection of source and channel coding rates now amounts to performing a bit allocation using the composite rate-distortion curves. This is analogous to the bit allocation problem in source coding, applied however in a framework that takes into account distortions at the source coder as well as those induced by the channel.

This paper is organized as follows. The problem statement and end-to-end distortion analysis are developed in Section 2.1. In Section 2.2, we present the methodology for obtaining the optimal rate allocation. In Section 3, we extend these results to obtain a coding strategy that minimizes the average distortion over fading channels. The details of the particular combined source-channel coding scheme are described in

Section 4.1. Theoretically predicted performance results as well as results obtained through simulation are presented in Section 4.2. Summary and conclusions are in Section 5.

2 Distortion Analysis

In the first part of this section, we present general expressions for the end-to-end distortion experienced by a subband coded source transmitted over a noisy channel. The optimal allocation of source and channel coding rates to the subbands is posed as a constrained minimization problem. We reduce this problem to a constrained allocation of total coding rate to the subbands using the notion of composite rate-distortion curves, which capture both source and channel coding distortions. In the latter part of this section, we derive analytical expressions for the end-to-end distortion under more specific assumptions about the components of the coding and transmission system. The methodology adopted to obtain the optimal rate allocation is then described.

2.1 Problem Formulation

Consider a source that is decomposed into M subbands. The joint source-channel coding approach considered here is to choose R_s = \{R_{s,1}, \ldots, R_{s,M}\} (the set of source coding rates) and R_c = \{R_{c,1}, \ldots, R_{c,M}\} (the set of channel coding rates) in order to minimize the overall distortion in the source, subject to an overall rate constraint. If the subband filters are orthogonal (in this case, the subband coefficients are obtained by a unitary transform of the source samples), the total distortion per sample of the source is D = \sum_{i=1}^{M} f_i D_i, where D_i is the total distortion per coefficient of subband i and f_i is the fraction of coefficients in subband i. Since the coded subbands are transmitted over a noisy channel, the total distortion per sample of subband i, D_i, depends on the source and channel coding rates for that subband (R_{s,i} and R_{c,i} respectively) and is represented by D_i(R_{s,i}, R_{c,i}). Therefore, the overall distortion is a function of the source and channel coding rates used and may be expressed as

    D(R_s, R_c) = \sum_{i=1}^{M} f_i D_i(R_{s,i}, R_{c,i})    (1)

The problem under consideration may be stated in its most general form as

    \min_{\{(R_{s,i}, R_{c,i}) \in \mathcal{R}_{s,i} \times \mathcal{R}_{c,i},\ i = 1, \ldots, M\}} \ \sum_{i=1}^{M} f_i D_i(R_{s,i}, R_{c,i})
    subject to \ \sum_{i=1}^{M} f_i \, R_{s,i}/R_{c,i} \le R_{tgt}    (2)

Let D_i(R_i) represent the minimum achievable distortion per sample of subband i, when the total coding rate (per sample) is R_i. This function (referred to as a composite rate-distortion function) determines the optimal partition of a total rate R_i into a source coding rate R_{s,i} and a channel coding rate R_{c,i}, i.e., a partition between source and channel coding rates for subband i that minimizes the overall distortion D_i(R_{s,i}, R_{c,i}) subject to R_{s,i}/R_{c,i} \le R_i. Formally,

    D_i(R_i) = \min_{(R_{s,i}, R_{c,i}) \in \mathcal{R}_{s,i} \times \mathcal{R}_{c,i} :\ R_{s,i}/R_{c,i} \le R_i} D_i(R_{s,i}, R_{c,i})    (3)

In practice, D_i(R_i) is computed as the lower hull of all the values of D_i(R_{s,i}, R_{c,i}) obtained by sweeping over (R_{s,i}, R_{c,i}). The rate allocation problem then reduces to

    \min_{\{R_i \in \mathcal{R}_i,\ i = 1, \ldots, M\}} \ \sum_{i=1}^{M} f_i D_i(R_i)
    subject to \ \sum_{i=1}^{M} f_i R_i \le R_{tgt}    (4)

where \mathcal{R}_i is the support set of D_i(R_i) as determined from (3).

It is not necessary to compute the functions \{D_i(R_{s,i}, R_{c,i}),\ i = 1, \ldots, M\} directly. Under the two following assumptions,

- the source coder satisfies the centroid condition, and
- channel errors are independent of the source codewords,

the overall distortion decomposes into the sum of source-coding and channel-induced distortions [4, 16]:

    D_i(R_{s,i}, R_{c,i}) = D_{s,i}(R_{s,i}) + D_{c,i}(R_{s,i}, R_{c,i})    (5)

where D_{s,i}(R_{s,i}) is the operational source rate-distortion curve for subband i, and D_{c,i}(R_{c,i}, R_{s,i}) represents an operational rate-distortion curve reflecting channel distortion in subband i as a function of the channel coding rate and the source codebook (and hence of R_{s,i}). Once the functions \{D_i(R_i),\ i = 1, \ldots, M\} are computed, the objective then is to choose \{R_i,\ i = 1, \ldots, M\} to solve (4).
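To make the composite curve of (3)-(5) concrete, the following sketch builds D_i(R_i) for a single subband from a source rate-distortion table and a channel-distortion table. The tables, the candidate rates and the function names are illustrative assumptions, not values from the paper.

def composite_curve(d_src, d_ch, total_rates):
    """Composite rate-distortion curve D_i(R_i) of Eq. (3), assuming the
    decomposition D_i = D_s,i(R_s) + D_c,i(R_s, R_c) of Eq. (5).

    d_src maps a source rate R_s to the source-coding distortion.
    d_ch maps (R_s, R_c) to the channel-induced distortion.
    For each total rate R, keep the best split with R_s / R_c <= R.
    """
    curve = {}
    for r_total in total_rates:
        best = float("inf")
        for (r_s, r_c), dc in d_ch.items():
            if r_s / r_c <= r_total:
                best = min(best, d_src[r_s] + dc)
        if best < float("inf"):
            curve[r_total] = best
    return curve

# Toy tables for one subband (distortion values are made-up numbers).
d_src = {1.0: 4.0, 2.0: 1.0, 3.0: 0.3}                       # source coding only
d_ch = {(1.0, 0.33): 0.2, (1.0, 0.80): 2.0,                  # channel-induced part
        (2.0, 0.33): 0.4, (2.0, 0.80): 3.0,
        (3.0, 0.33): 0.6, (3.0, 0.80): 4.5}
print(composite_curve(d_src, d_ch, total_rates=[1.5, 3.0, 4.5, 7.0, 9.5]))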

2.2 Solution Methodology

The formulation in the previous section is a very general one, and we need to make more specific assumptions in order to outline a solution method. We specifically consider a source coder which generates fixed-length codewords (such as a vector quantizer) and a family of convolutional channel codes. Further, we make the assumption that the codewords generated by the source coder are interleaved before being passed on to the channel coder. If the codewords are binary (as they would be if quantizer indices were being transmitted), the equivalent channel (the channel as seen by the source coder, Figure 1) can be modeled by a binary symmetric channel. The crossover probability of this channel, \epsilon, will depend on the signal-to-noise ratio in the physical channel as well as the channel coding rate used (\epsilon = fn(SNR, R_{c,i})).

[Figure 1: The Equivalent Channel as seen by the Source Coder. The equivalent channel comprises the interleaver, convolutional encoder, physical channel, convolutional (Viterbi) decoder and deinterleaver.]

It can be seen from (5) that the rate-distortion curves D_{s,i}(R_{s,i}) and D_{c,i}(R_{s,i}, R_{c,i}) (representing source and channel distortions respectively) are used in the computation of the composite rate-distortion curve D_i(R_i). The operational source rate-distortion curve for subband i, D_{s,i}(R_{s,i}), is computed during optimization of the source quantizers using training data. Under the assumptions outlined above, the channel-induced distortion in the ith subband can be expressed as

    D_{c,i}(R_{c,i}, R_{s,i}) = \sum_{u} \sum_{v} p(u)\, p(v|u)\, d(u, v), \qquad u, v \in \{1, 2, \ldots, C\}    (6)

where C is the cardinality of the codebook and d(u, v) is the per-sample distortion between the source samples corresponding to code vectors with indices u and v. The probability p(u) may be computed from the source statistics (and is in fact computed when the source codebook is generated). The transition probability p(v|u) is a function of the channel state, the channel coding rate R_{c,i} and the convolutional code itself, and can be computed as

    p(v|u) = \epsilon^{d_H(u,v)} (1 - \epsilon)^{n - d_H(u,v)}    (7)

where d_H(u, v) is the Hamming distance between the n-bit indices of codewords u and v, and \epsilon is the BER of the equivalent channel. Using D_{s,i}(R_{s,i}) and D_{c,i}(R_{c,i}, R_{s,i}), we derive an (operational) composite rate-distortion curve for the ith subband that takes both source and channel distortions into consideration. We denote this curve by D_i(R_i). Jafarkhani et al. [16] have developed a framework for computing the composite rate-distortion function by explicitly modeling the interaction of the convolutional encoder and the Viterbi decoder. In this work, we have considered a simpler model based on a memoryless equivalent channel.

In order to optimally allocate the source and channel coding rates for the subbands, we consider the ensemble of composite source/channel (operational) rate-distortion curves of all the subbands, and choose an operating point \{R_1, R_2, \ldots, R_M\} that results in minimum \sum_{i=1}^{M} f_i D_i(R_i). This is a conventional rate allocation problem, performed however on the composite rate-distortion curves. This optimal solution has to be derived from the operational rate-distortion curves, which may not necessarily be convex or even monotone decreasing. In this situation, the optimal allocation does not result from the "equal slope" allocation policy which is commonly used. We use the rate allocation algorithm presented in [13] to find the optimal solution and offer a brief exposition of the solution procedure here.

Corresponding to the constrained minimization problem stated in (4), an unconstrained minimization problem may be stated as

    \min_{R \in \prod_{i=1}^{M} \mathcal{R}_i} \ \sum_{i=1}^{M} f_i \{ D_i(R_i) + \lambda R_i \}    (8)

Let R^*(\lambda) be the vector of (total) coding rates that solves (8). We may then define the effective coding rate to be R_{eff}(\lambda) = \sum_{i=1}^{M} f_i R_i^*(\lambda). The solution to the unconstrained problem is the solution to the original constrained problem (stated in (4)) if

    R_{eff}(\lambda^*) = R_{tgt}    (9)

The solution procedure involves sweeping \lambda from 0 to \infty and solving for the corresponding R^*(\lambda) (and therefore R_{eff}(\lambda)) until (9) is satisfied. Since the minimization problem in (8) is separable, the optimal solution is obtained by minimizing each term separately. The ith component, R_i^*(\lambda), is obtained as

    R_i^*(\lambda) = \arg\min_{R_i \in \mathcal{R}_i} \{ D_i(R_i) + \lambda R_i \}    (10)

In [13], Shoham and Gersho exploit the discrete nature of the problem (\prod_{i=1}^{M} \mathcal{R}_i is a finite set for most problems of practical interest) to develop an efficient search procedure over the \lambda space, which we use in our experimental results.

The optimal solution naturally depends on the SNR of the physical channel and has to be computed for every possible operating SNR. In practice, the SNR is estimated by the receiver and is relayed to the transmitter. The transmitter only has to choose the appropriate (pre-computed) allocation based on this information and encode the source. This strategy results in an adaptive source-channel coding scheme where the source and channel coding rates are chosen optimally given the current state of the channel.
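For concreteness, a direct implementation of (6) and (7) is sketched below for a toy codebook. The two-dimensional codebook, the index probabilities and the BER value are assumptions made for the example only; they are not parameters taken from the paper.

import itertools
import numpy as np

def hamming(u, v, n_bits):
    """Hamming distance between the n-bit binary indices u and v."""
    return bin((u ^ v) & ((1 << n_bits) - 1)).count("1")

def transition_prob(u, v, eps, n_bits):
    """p(v|u) over the memoryless equivalent channel, Eq. (7)."""
    d = hamming(u, v, n_bits)
    return (eps ** d) * ((1.0 - eps) ** (n_bits - d))

def channel_distortion(codebook, p_u, eps):
    """Channel-induced distortion of Eq. (6): sum_u sum_v p(u) p(v|u) d(u,v),
    with d(u,v) the per-sample squared error between centroids u and v."""
    n_bits = int(np.log2(len(codebook)))
    dim = codebook.shape[1]
    total = 0.0
    for u, v in itertools.product(range(len(codebook)), repeat=2):
        d_uv = np.sum((codebook[u] - codebook[v]) ** 2) / dim
        total += p_u[u] * transition_prob(u, v, eps, n_bits) * d_uv
    return total

# Toy 4-entry, 2-dimensional codebook (|C| = 4, so indices are n = 2 bits).
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 2.0], [4.0, 4.0]])
p_u = np.array([0.4, 0.3, 0.2, 0.1])          # codeword usage probabilities
print(channel_distortion(codebook, p_u, eps=0.01))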

3 Extension to fading channels

In a fading channel (rendered memoryless through interleaving), the received signal may be represented as

    y_n = \sqrt{G_n}\, x_n + w_n    (11)

where x_n is the transmitted symbol, \sqrt{G_n} is the amplitude of the fade value and \{w_n\} is a stationary white Gaussian process with zero mean and variance \sigma^2. If \gamma_n denotes the instantaneous SNR at time n,

    \gamma_n = \gamma_s G_n    (12)

where \gamma_s = P\, E(G_n) / \sigma^2 is the SNR (for a transmit power P) when fading is absent.

In order to extend the source-channel coding approach to fading channels, we approximate a fading channel by a finite-state model along the lines of [17]. Each state in this model is approximated by an AWGN channel with a suitable noise variance. This approach is similar to [18], where a fading channel is approximated by a finite-state channel model with each state being modeled by a BSC of suitable BER. Such a model (where every state is represented by a BSC) is suitable for systems that utilize hard-decision decoding at the receiver. We make use of soft-decision decoding [19] and therefore the performance of the system is a function of the SNR of the received signal. More specifically, the performance of the Viterbi decoder depends only on the SNR at the receiver as long as the CSI is not used explicitly in the decoder metric. Therefore, the SNR dictates performance, and not the actual fade value. This implies that a finite-state model, where each state is represented by an AWGN channel, is an appropriate model for this system.

[Figure 2: The distribution of instantaneous SNR for lognormal fading, \gamma_s = 4.5 dB.]

The discretization of the pdf of the SNR is necessary for an adaptive encoding system because the optimization of the encoder can only be performed for a finite number of states. Each state in a finite-state model approximates a range of SNRs, and therefore its probability may be obtained by integrating the pdf of the fade process over the appropriate range of SNRs. The distribution of the received SNR depends on the physical nature of the underlying fading process. Knowledge of the channel statistics may be used to analyze the average performance of the adaptive source-channel coding scheme. Assume that the channel fade distribution has been discretized into K states with the adaptive source-channel coding scheme optimized for the resulting set of channel SNRs. Let p(k) be the probability of the channel being in the kth state. Let D^{(k)}(R_s, R_c) be the distortion that results from using the source coding rates R_s and the channel coding rates R_c when the channel is in the kth state. The average distortion is then

    D^{(av)}(R_s, R_c) = \sum_{k=1}^{K} p(k)\, D^{(k)}(R_s, R_c)    (13)

In this paper, we consider lognormal fading, which is a suitable model for channels which suffer from slow fading due to shadowing effects. Under lognormal fading, \ln(G) is Gaussian, \ln(G) \sim N(0, \sigma_G^2). Therefore, \ln(\gamma) \sim N(\ln(\gamma_s), \sigma_G^2). This implies that the density of the instantaneous SNR is

    f(x) = \frac{1}{\sqrt{2\pi}\, \sigma_G\, x} \exp\!\left( -\frac{(\ln(x) - \ln(\gamma_s))^2}{2 \sigma_G^2} \right)    (14)

Figure 2 illustrates this density function for \gamma_s = 4.5 dB.
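As an illustration of this discretization, the sketch below integrates the lognormal SNR density of (14) over a partition of the SNR axis to obtain the state probabilities p(k) used in (13). The partition boundaries are the ones quoted later in Section 4.2, but the value of \sigma_G is a placeholder, so the numbers produced are only indicative and will not reproduce Table 2 exactly.

import numpy as np
from scipy import integrate

def lognormal_snr_pdf(x, gamma_s, sigma_g):
    """Density of the instantaneous SNR (linear scale) under lognormal fading, Eq. (14)."""
    return np.exp(-(np.log(x) - np.log(gamma_s)) ** 2 / (2.0 * sigma_g ** 2)) \
           / (np.sqrt(2.0 * np.pi) * sigma_g * x)

def state_probabilities(boundaries, gamma_s, sigma_g):
    """p(k) for each finite-state-model state: the integral of the SNR pdf
    over the k-th SNR interval of the partition."""
    probs = []
    for lo, hi in zip(boundaries[:-1], boundaries[1:]):
        p, _ = integrate.quad(lognormal_snr_pdf, lo, hi, args=(gamma_s, sigma_g))
        probs.append(p)
    return probs

if __name__ == "__main__":
    gamma_s = 10 ** (4.5 / 10)        # 4.5 dB average SNR, converted to linear scale
    sigma_g = 0.25                    # assumed spread of the lognormal fade (placeholder)
    # Partition of the (linear) SNR axis; 1e-3 replaces 0.0 to avoid the log singularity.
    boundaries = [1e-3, 1.30, 2.35, 3.40, 4.50, 5.575, 6.67, 7.675, 9.0]
    for k, p in enumerate(state_probabilities(boundaries, gamma_s, sigma_g), start=1):
        print(f"state {k}: probability {p:.4f}")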

3.1 Unavailability of Exact Channel State at the Transmitter

In this subsection, we present a coding strategy that may be used when exact knowledge of the channel state is unavailable at the transmitter. One approach is a minimax approach, where the channel is assumed to be in the worst possible state, resulting in the lowest SNR under which the video coder can operate. The minimax strategy attempts to minimize the distortion in this channel state. It may be easily seen that this is just a special case of the adaptive coding approach outlined in the previous subsection.

The aim of the approach here is to choose a set of source and channel coding rates that will minimize the distortion averaged over the channel statistics (13). The aim is to choose (R_s, R_c) to minimize

    D^{(av)}(R_s, R_c) = \sum_{k=1}^{K} p(k)\, D^{(k)}(R_s, R_c)
    subject to \ \sum_{i=1}^{M} f_i \, R_{s,i}/R_{c,i} \le R    (15)

where D^{(k)}(R_s, R_c) is the total (source coding as well as channel-induced) per-pixel distortion in channel state k when the source and channel coding rate vectors are R_s and R_c respectively. Equation (15) may be expanded as

    D^{(av)}(R_s, R_c) = \sum_{k=1}^{K} p(k) \sum_{i=1}^{M} f_i D_i^{(k)}(R_{s,i}, R_{c,i})    (16)

where D_i^{(k)}(R_{s,i}, R_{c,i}) is the per-pixel distortion in subband i. Interchanging summations (facilitated by the orthogonal subband filters),

    D^{(av)}(R_s, R_c) = \sum_{i=1}^{M} f_i \sum_{k=1}^{K} p(k)\, D_i^{(k)}(R_{s,i}, R_{c,i})    (17)

which may be compactly represented by

    D^{(av)}(R_s, R_c) = \sum_{i=1}^{M} f_i D_i^{(av)}(R_{s,i}, R_{c,i})    (18)

where D_i^{(av)}(R_{s,i}, R_{c,i}) is a rate-distortion function for subband i that has been averaged over the channel statistics. The rate allocation that satisfies (15) is obtained in two steps. First, the channel-statistics-averaged rate-distortion function D_i^{(av)}(R_{s,i}, R_{c,i}) may be replaced by a composite rate-distortion curve D_i^{(av)}(R_i), in which the source and channel coding rates R_{s,i} and R_{c,i} are implicitly represented by the total rate R_i (i.e., as elaborated upon earlier in this section, the total rate R_i is decomposed into that choice of source and channel coding rates R_{s,i} and R_{c,i} which minimizes D_i^{(av)}(R_{s,i}, R_{c,i})). This is done by calculating D_i^{(av)}(R_{s,i}, R_{c,i}) for all possible choices of (R_{s,i}, R_{c,i}). The lower hull of these points is then the composite rate-distortion curve of interest, D_i^{(av)}(R_i).

The second step is simply a rate allocation using the channel-averaged rate-distortion curves \{D_i^{(av)}(R_i),\ i = 1, \ldots, M\}, i.e., choose \{R_i,\ i = 1, 2, \ldots, M\} to minimize

    D^{(av)}(R) = \sum_{i=1}^{M} f_i D_i^{(av)}(R_i)
    subject to \ \sum_{i=1}^{M} f_i R_i \le R    (19)

The solution procedure reduces to a simple rate allocation on the channel-averaged composite rate-distortion curves using the same algorithm that was used in the previous section.
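The rate allocation referred to above, whether performed on the per-state composite curves of Section 2.2 or on the channel-averaged curves of (19), can be sketched as a Lagrangian sweep over discrete operating points. The code below is an illustrative reimplementation: the bisection on \lambda stands in for the more efficient search of Shoham and Gersho [13], and the synthetic curves and all names are assumptions.

import numpy as np

def allocate_rates(curves, fractions, r_target, tol=1e-6, max_iter=60):
    """Lagrangian rate allocation over discrete composite R-D curves.

    curves[i] is a list of (rate, distortion) pairs forming D_i(R_i);
    fractions[i] is f_i, the fraction of coefficients in subband i.
    Returns the chosen (rate, distortion) point per subband.
    """
    def best_points(lam):
        # Solve (10) independently for each subband: minimize D_i + lam * R_i.
        picks, r_eff = [], 0.0
        for pts, f in zip(curves, fractions):
            rate, dist = min(pts, key=lambda p: p[1] + lam * p[0])
            picks.append((rate, dist))
            r_eff += f * rate           # effective rate R_eff(lam), Eq. (9)
        return picks, r_eff

    lo, hi = 0.0, 1e6                   # lambda = 0 gives the largest R_eff
    picks, r_eff = best_points(lo)
    if r_eff <= r_target:               # the rate constraint is not binding
        return picks
    for _ in range(max_iter):           # bisection: R_eff(lam) is non-increasing in lam
        lam = 0.5 * (lo + hi)
        picks, r_eff = best_points(lam)
        if abs(r_eff - r_target) < tol:
            break
        if r_eff > r_target:
            lo = lam
        else:
            hi = lam
    return picks

if __name__ == "__main__":
    # Two toy subbands with synthetic operational curves (rate in bits per sample).
    rng = np.random.default_rng(0)
    curves = []
    for var in (16.0, 4.0):             # the higher-variance subband matters more
        rates = np.arange(0.0, 8.5, 0.5)
        dists = var * 2.0 ** (-2.0 * rates) + 0.01 * rng.random(rates.size)
        curves.append(list(zip(rates, dists)))
    fractions = [0.25, 0.75]
    for i, (r, d) in enumerate(allocate_rates(curves, fractions, r_target=1.0)):
        print(f"subband {i}: total rate {r:.2f}, distortion {d:.4f}")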

4 Experimental Results

4.1 Coding scheme

The joint source-channel coding scheme we consider is illustrated in Figure 3. The source coder is based on a three-dimensional subband coding scheme outlined in [20] and illustrated in Figure 4. The spatio-temporal subband decomposition used here localizes channel-induced distortions, and makes it possible to analyze and compute them in an explicit manner. This is not the case with motion-compensated coding schemes such as MPEG or H.263 (which utilize variable length codes for efficient compression), where the encoder and decoder are synchronized only as long as there are no errors induced by the channel. In these coding schemes, channel-induced distortions cause errors to propagate until intra-coded blocks (or a complete intra-coded frame) are introduced by the encoder to "refresh" the prediction loop.

[Figure 3: The joint source-channel coding system. The transmitter consists of a VQ encoder, an interleaver and an RCPC (convolutional) encoder driven by an adaptation unit; the receiver consists of the RCPC (Viterbi) decoder, deinterleaver and VQ decoder.]

In [20], the authors use a combination of unbalanced tree-structured vector quantizers and geometric VQ to code the subband coefficients. We however use full-search VQs for all the subbands except for subband zero, which uses scalar quantizers for reasons of complexity. While our approach sacrifices compression efficiency, it makes the computation of the source-coding and channel-induced distortion tractable. Our goal in this exercise is to develop an analytical framework for source-channel coding of subbands and not to develop an image compression and transmission system that operates at the lowest possible rate.

[Figure 4: The spatio-temporal subband decomposition used. The low-temporal band W_{L_t} is split into a two-level logarithmic decomposition and the high-temporal band W_{H_t} into four uniform subbands, giving eleven spatio-temporal subbands numbered 0 through 10.]

The temporal subband decomposition is carried out using the Haar basis functions, which results in W_{L_t} and W_{H_t} being the sum and difference of neighbouring frames respectively. Separable orthogonal subband filters (described in [21]) are used to decompose W_{L_t} and W_{H_t} in the spatial dimensions, producing the decomposition of Figure 4. The low-temporal subband W_{L_t} is decomposed into a two-level logarithmic decomposition, while the high-temporal subband W_{H_t} is uniformly decomposed into four equal-sized subbands. The subband coefficients are quantized using full-search VQs, which are trained using the LBG algorithm [15]. The lowest subband is quantized using Lloyd-Max scalar quantizers because of the complexity of using high-rate VQs for this high-energy subband. Apart from the complexity of training high-rate VQs for this subband, the computation of the channel-induced distortion (6) is very intensive. The complexity of this computation is O(|C|^2), where |C| = 2^{kr} is the cardinality of the codebook (k is the dimension of the VQ, and r is the coding rate per source sample). For a reasonable quality of reconstruction, subband zero must be coded at rates of 4.0 bits per source sample or higher (assuming a dimension of 4), which implies values of 2^{16} or higher for |C|.

Table 1: Rate Allocation as a function of channel state (total coding rate = 1.0 bpp).

              Ch. SNR 1 dB        Ch. SNR 4 dB        Ch. SNR 8 dB
  subband   src rate  ch rate   src rate  ch rate   src rate  ch rate
     0        8.00      0.33      8.00      0.33      8.00      0.57
     1        1.50      0.33      2.25      0.57      3.00      0.80
     2        1.50      0.33      2.25      0.57      3.00      0.80
     3        0.00       --       0.00       --       2.25      0.80
     4        0.00       --       0.00       --       2.00      0.80
     5        0.00       --       0.00       --       0.00       --
     6        0.00       --       0.00       --       0.00       --
     7        0.00       --       0.00       --       0.00       --
     8        0.00       --       0.00       --       0.00       --
     9        0.00       --       0.00       --       0.00       --
    10        0.00       --       0.00       --       0.00       --

The quantizer indices are protected for transmission over the channel by using different channel codes for different subbands. Channel coding is carried out using RCPC codes [14]. These codes make it practical to achieve a wide range of channel coding rates using a single encoder and a single Viterbi decoder at the receiver by simply puncturing the coded bitstream according to the level of protection desired. The Viterbi decoder only needs knowledge of the mother convolutional code and the puncturing mode. The quantizer indices are interleaved and then protected using convolutional codes for transmission over the noisy channel. A class of RCPC codes described in [14] with a basic coding rate of 1/4 is used. Higher coding rates are obtained by puncturing the coded bitstream.
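As a small illustration of the puncturing operation described above, the sketch below drops bits from the output of a rate-1/4 mother encoder according to a keep/drop pattern. The pattern shown is an arbitrary example chosen to give an effective rate of about 1/3; it is not one of the actual RCPC puncturing tables of [14], and the function name is ours.

def puncture(coded_bits, pattern):
    """Keep only the coded bits marked True in the repeating puncturing pattern.

    For a rate-1/4 mother code, coded_bits holds 4 output bits per input bit, so
    keeping k out of every len(pattern) coded bits gives an effective code rate
    of (len(pattern) / 4) / k per puncturing period.
    """
    kept = [b for i, b in enumerate(coded_bits) if pattern[i % len(pattern)]]
    return kept, len(coded_bits) // 4 / len(kept)   # (punctured stream, effective rate)

# 8 input bits -> 32 coded bits from a hypothetical rate-1/4 mother encoder.
coded = [0, 1] * 16
# Example pattern over one period of 8 coded bits (2 input bits): keep 6 of 8 -> rate 1/3.
pattern = [True, True, True, False, True, True, True, False]
punctured, rate = puncture(coded, pattern)
print(len(punctured), "bits transmitted, effective rate =", round(rate, 3))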

[Figure 5: Comparison of Computed and Estimated Distortions: total mean squared distortion versus total rate R_s/R_c for coefficients from subband 1 at a channel SNR of 1 dB, comparing the distortion computed from training data with the estimated distortion.]

We use a set of three codes with rates {0.33, 0.571, 0.80} as well as the uncoded case (coding rate = 1.0). In this context, source-channel coding now implies the choice of a source coding rate and a channel coding rate (puncturing mode) for each subband. The modulation scheme used here is uncoded BPSK and soft-decision decoding is assumed. At the receiver, the sequence of bits estimated by the Viterbi decoder is de-interleaved and the received VQ indices are mapped into the corresponding VQ centroids. The subband synthesis bank then reconstructs the pair of image frames.

The receiver obtains the channel state information (CSI) based on signal-to-noise ratio (SNR) measurements. It then relays the CSI back to the adaptation unit of the encoder. This unit then chooses the optimal joint source-channel coding strategy for this channel state. This feedback information requires only a low-rate channel and enables the encoder to adapt to the changing channel conditions. Since we only consider slow-fading channels, we assume that the fade level is relatively constant for a two-frame duration. The range of channel SNRs considered is 1 dB through 8 dB.

[Figure 6: Performance of the Adaptive Source-Channel coder, compared with fixed schemes optimized for channel SNRs of 1 dB and 8 dB.]

4.2 Simulation Results

Figure 5 compares the theoretically predicted values for the end-to-end distortion with those obtained through simulation. This figure presents results for a single subband at a particular SNR. The results for other subbands and SNRs follow a similar pattern, justifying the memoryless assumption for the equivalent channel.

Table 1 lists the source and channel coding rates allocated using the composite rate-distortion curves and the algorithm described in Section 2.2 for three different states of the channel. It may be observed that the more important subbands are protected using channel codes with lower rate (i.e., those that offer more protection). Also, as the channel SNR decreases, the algorithm offers more protection for the more important subbands at the cost of the higher frequency subbands.

[Figure 7: Partitioning of the pdf of the SNR into a finite number of states.]

Figure 6 plots received image quality as a function of the channel state. The received image PSNR is averaged over 2 frames of the video sequence. This is done because the three-dimensional subband coder operates on units of two consecutive frames. The adaptive scheme is compared with two fixed schemes, one optimized for a poor channel (1 dB channel SNR) and the other optimized for a higher channel SNR (8 dB). In all three schemes, the coding rate is 1.0 bpp (source and channel coding combined). A scheme optimized for poor channels has a large proportion of the rate allocated to channel coding, and hence has too little rate for source coding when the channel SNR improves. A scheme optimized for channels with high SNR, on the other hand, performs poorly in low SNR channels since the channel coding rate is insufficient to protect against channel errors. Since the SNR in a wireless channel can vary over a large range, an adaptive scheme has a natural advantage.

Table 2: Discretization of the Channel Fade Distribution.

  State No.   Avg. SNR   State Probability
      1         1.00          0.0022
      2         2.01          0.0258
      3         3.00          0.1372
      4         4.00          0.3328
      5         5.01          0.3308
      6         6.00          0.1432
      7         7.02          0.0246
      8         8.00          0.0023

We now discuss simulation experiments on fading channels. Table 2 illustrates the discretization of the density function of the lognormal fading channel. Our encoder is optimized for the set of channel SNRs {1, 2, 3, 4, 5, 6, 7, 8} (all in dB). The probabilities for these states are computed assuming an average SNR when no fading is present (\gamma_s in the notation of Section 3) of 4.5 dB. The partition of the fade distribution which results in the states illustrated in Table 2 is {0.0, 1.30, 2.35, 3.40, 4.50, 5.575, 6.67, 7.675, 9.0} and is illustrated in Figure 7.

[Figure 8: Performance of the Coding Scheme that minimizes Average Distortion, compared with the adaptive and minimax coding schemes.]

The performance of the coding approach which minimizes the average (over the channel statistics) distortion in the absence of exact state information at the transmitter is illustrated in Figure 8. The performance of this approach is compared with the adaptive scheme (best possible performance) as well as the minimax (most pessimistic) approach. Table 3 lists the average distortions experienced by the video coder operating under the three coding schemes illustrated in Figure 8.

Table 3: Channel-averaged distortion performance of the coding schemes.

  Adaptive Coder   Coder minimizing avg. distortion   Minimax (Pessimistic) Coder
     31.61 dB                 30.20 dB                          28.99 dB

Finally, Figures 9 through 11 illustrate the performance of the adaptive source-channel coding scheme on one frame of the "Salesman" sequence under different channel conditions, when the total overall coding rate is 1 bpp. The results on video sequences may be viewed at http://www.cfar.umd.edu/~murari/src ch/experiments.html

[Figure 9: Frame 301 of the sequence received at a channel SNR of 1 dB, PSNR = 27.17 dB.]

[Figure 10: Frame 301 of the sequence received at a channel SNR of 4 dB, PSNR = 29.37 dB.]
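For reference, the PSNR values quoted above follow from the mean squared error of the reconstructed frames; a minimal helper, assuming 8-bit frames and the two-frame averaging mentioned earlier in this section, might look as follows.

import numpy as np

def psnr_over_pair(original, reconstructed):
    """PSNR in dB for a two-frame unit of 8-bit frames (shape (2, H, W))."""
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)

frames = np.random.default_rng(1).integers(0, 256, size=(2, 144, 176), dtype=np.uint8)
noisy = np.clip(frames.astype(int) + np.random.default_rng(2).integers(-5, 6, frames.shape), 0, 255)
print(round(psnr_over_pair(frames, noisy), 2), "dB")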

5 Conclusions

This paper presents an adaptive source-channel subband coding approach and applies it to robust video transmission over noisy channels. Over AWGN channels, it is shown that acceptable video quality can be obtained even at very low channel SNR. The advantages of adaptation are most significant in fading channels, where the encoder can be optimized for various channel states. This provides a graceful degradation of performance with decreasing channel SNR. An approach that minimizes the average (over the channel statistics) distortion is outlined that will be useful in situations where the exact channel state is unavailable at the transmitter. While this scheme is not the most efficient in terms of bit-rate, it enables a systematic study of the relative importance of subbands in coding and transmission through analysis and simulation.

[Figure 11: Frame 301 of the sequence received at a channel SNR of 8 dB, PSNR = 32.94 dB.]

References

[1] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423, 1948.

[2] J. W. Modestino, D. G. Daut, and A. L. Vickers, "Combined source-channel coding of images using the block cosine transform," IEEE Transactions on Communications, vol. 29, no. 9, pp. 1261-1274, September 1981.

[3] V. A. Vaishampayan and N. Farvardin, "Optimal block cosine transform image coding for noisy channels," IEEE Transactions on Communications, vol. 38, no. 3, pp. 327-336, March 1990.

[4] N. Farvardin, "A study of vector quantization for noisy channels," IEEE Transactions on Information Theory, vol. 36, no. 4, pp. 799-809, July 1990.

[5] N. Farvardin and V. A. Vaishampayan, "On the performance and complexity of channel-optimized vector quantizers," IEEE Transactions on Information Theory, vol. 37, pp. 155-160, January 1991.

[6] K. A. Zeger and A. Gersho, "Vector quantizer design for memoryless noisy channels," in IEEE International Conference on Communications, Philadelphia, PA, USA, 1988, vol. 3, pp. 1593-1597.

[7] M. Ruf and J. W. Modestino, "Rate-distortion performance for joint source and channel coding of images," Tech. Rep., LIDS, Massachusetts Institute of Technology, June 1996.

[8] G. Cheung and A. Zakhor, "Joint source/channel coding of scalable video over noisy channels," in IEEE International Conf. on Image Processing, Lausanne, Switzerland, September 1996, vol. III, pp. 767-770.

[9] P. Burlina and F. Alajaji, "An error resilient scheme for image transmission over noisy channels with memory," IEEE Transactions on Image Processing, vol. 7, no. 4, pp. 593-600, April 1998.

[10] W. Xu, J. Hagenauer, and J. Hollman, "Joint source channel decoding using the residual redundancy in compressed images," in IEEE International Conference on Communications, Dallas, TX, June 1996, vol. 1, pp. 142-148.

[11] M. Srinivasan and R. Chellappa, "Joint source-channel coding of images," in IEEE International Conf. on Acoustics, Speech, and Signal Processing, Munich, Germany, April 1997, vol. 4, pp. 2925-2928.

[12] M. Srinivasan, R. Chellappa, and P. Burlina, "Adaptive source-channel subband video coding for wireless channels," in First IEEE Workshop on Multimedia Signal Processing, Princeton, NJ, June 1997, pp. 407-412.

[13] Y. Shoham and A. Gersho, "Efficient bit allocation for an arbitrary set of quantizers," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 9, pp. 1445-1453, September 1988.

[14] J. Hagenauer, "Rate-compatible punctured convolutional codes (RCPC codes) and their applications," IEEE Transactions on Communications, vol. 36, no. 4, pp. 389-400, April 1988.

[15] Y. Linde, A. Buzo, and R. M. Gray, "An algorithm for vector quantizer design," IEEE Transactions on Communications, vol. 28, no. 1, pp. 702-710, January 1980.

[16] H. Jafarkhani, P. Ligdas, and N. Farvardin, "Adaptive rate allocation in a joint source/channel coding framework for wireless channels," in IEEE Vehicular Technology Conference, Atlanta, GA, 1996, vol. 1, pp. 492-496.

[17] I. Kozintsev and K. Ramchandran, "Joint source-channel coding using multiresolution constellations for power-constrained time-varying channels," in IEEE International Conf. on Acoustics, Speech, and Signal Processing, Atlanta, GA, May 1996, vol. 4, pp. 2343-2346.

[18] H. S. Wang and N. Moayeri, "Finite-state Markov channel - a useful model for radio communication channels," IEEE Transactions on Vehicular Technology, vol. 44, no. 1, pp. 163-171, February 1995.

[19] J. G. Proakis, Digital Communications, McGraw-Hill, third edition, 1995.

[20] C. Podilchuk, N. Farvardin, and N. S. Jayant, "3-D subband coding," IEEE Transactions on Image Processing, vol. 4, no. 2, pp. 125-139, February 1995.

[21] I. Daubechies, "Orthonormal bases of compactly supported wavelets," Communications on Pure and Applied Mathematics, vol. 41, pp. 909-996, 1988.
