Source Fidelity over Fading Channels: Erasure Codes versus Scalable Codes

Konstantinos E. Zachariadis, Michael L. Honig, and Aggelos K. Katsaggelos
Electrical and Computer Engineering Department, Northwestern University, Evanston, IL 60208-3118, U.S.A.
e-mail: {kez, mh, aggk}@ece.northwestern.edu

This work was supported by ARO under grant DAAD19-99-1-0288 and NSF under grant CCR-0310809.

Abstract— We consider the transmission of a Gaussian source through a block fading channel. Assuming each block is decoded independently, the received distortion depends on the tradeoff between quantization accuracy and probability of outage. Namely, higher quantization accuracy requires a higher channel code rate, which increases the probability of outage. Here we evaluate the received mean distortion with erasure coding across blocks as a function of the code length. We also evaluate the performance of scalable, or multi-resolution coding, in which coded layers are superimposed and the layers are sequentially decoded. In addition to analyzing a finite number of layers, we evaluate the mean distortion at high Signal-to-Noise Ratios as the number of layers becomes infinite. As the block length of the erasure code increases to infinity, the received distortion converges to a deterministic limit, which is less than the mean distortion with an infinite-layer scalable coding scheme. However, for the same standard deviation in received distortion, infinite-layer scalable coding performs slightly better than erasure coding.

I. INTRODUCTION

Transmission of a continuous source, such as an image or video, through a fading channel must account for distortion due to both quantization and channel-induced errors. If the channel codeword is sufficiently long, i.e., extends over multiple fading cycles, then the ergodic capacity can be used to determine the achievable information rate, which in turn determines the (fixed) source rate. However, a slowly-varying fading channel and/or short channel codewords, relative to the channel variations, can lead to outages, which substantially increase the received distortion. To gain insight into the effect of different coding schemes on performance, we consider a model in which a continuous Gaussian source is transmitted through a block Rayleigh fading channel. Each code word spans a single block, so that the achievable rate can be characterized in terms of the outage capacity [1]. Namely, an outage occurs when the channel code rate exceeds the capacity conditioned on the current channel gain. Optimization of the source and channel code rates to minimize the mean distortion for this model was considered in [2]. In particular, increasing the source rate decreases the distortion due to quantization, but increases the probability of outage. Here we evaluate the mean distortion when erasure or scalable coding is used, in addition to the coding scheme that achieves a given outage capacity.

We first consider adding an outer erasure code over multiple blocks. That is, we view the channel with the inner channel code as a symbol erasure channel, where the erasure probability is the outage probability for each block. As the length of the erasure code increases, the capacity and distortion converge to deterministic values. We optimize the inner channel code rate to maximize the capacity of this erasure channel, and compute the standard deviation of the distortion as a function of the length of the erasure code. We then consider the performance of scalable source coding, where the source is partitioned into layers representing successive refinements of the quantizer. Each layer, corresponding to a particular quantizer, can have a different rate and power, and all layers are transmitted simultaneously as a superposition code [3].¹ The layers are sequentially decoded, and the loss of a particular layer implies the loss of all succeeding layers. The received distortion for a given channel state then depends on how many layers are decodable. Here the Gaussian source provides additional flexibility since it is inherently refinable [4]. We first evaluate the mean distortion with two layers before considering an arbitrary finite number of layers. We then evaluate the mean distortion with an infinite number of layers in the high-SNR regime. In that case, we obtain closed-form expressions for the power/rate allocation across layers along with the minimum mean distortion, and the standard deviation. Numerical results show that an infinite-length erasure code gives less distortion than infinite-layer scalable coding, and that the gap increases with Signal-to-Noise Ratio (SNR). However, comparing the scalable code with a finite-length erasure code, keeping the standard deviation of the distortion the same in both cases, shows that scalable coding gives a small performance improvement relative to erasure codes.

¹ This type of scalable coding also applies to the broadcast channel, where the different channel states are associated with different users [3].

The tradeoff between quantization distortion and channel-induced distortion for fading channels is also considered in [5], [6], [7]. Other related work on optimization of source and channel parameters for minimizing received distortion in different contexts is presented in [8], [9]. Neither erasure nor scalable source codes are considered in that work. Scalable, or progressive, source-channel coding is considered
in [10], although the model is different from that considered here. Enhancement of scalable video coding to provide fine granular scalability is discussed in [11]. That work serves as practical motivation for considering an infinite-layer scalable code.

II. SYSTEM MODEL

We assume an i.i.d. source sequence {X_i}_{i=1}^K, where X_i is a real Gaussian random variable with zero mean and variance σ_x^2 = 1. The rate-distortion function is [3]²

    R = (1/2) log(1/D),  or equivalently  D = e^{-2R},    (1)

where R is the source information rate in nats per source sample and D is the distortion. Assuming a block Rayleigh fading channel, the capacity conditioned on the channel gain h is

    C = (1/2) log(1 + hP/N),    (2)

in nats per second per Hz (per dimension), where P is the transmitted power and N the noise power. Since the channel gain h is exponentially distributed (with mean H), the capacity is a random variable with cumulative distribution function (cdf) [1]

    Pr[C < R_t] = 1 - exp[-γ^{-1}(e^{2R_t} - 1)],    (3)

where γ = HP/N is the (average) SNR.

During each (fixed) coherence time T, the source emits a sequence of K source symbols, represented by the vector X, and the channel encoder maps the output of the source encoder to a sequence of L channel symbols Y = {Y_1, ..., Y_L}. Since we use the channel L times to represent KR source nats, the channel code rate is R_t = KR/L = αR, where the "processing gain" α = K/L [5], [6]. We will assume optimal channel coding, in the sense that R_t achieves the given outage capacity.

First consider transmitting the source through an additive white Gaussian noise (AWGN) channel. In that case the capacity and distortion are deterministic, i.e., C(γ) = (1/2) log(1 + γ), R*_det = C(γ)/α = (1/(2α)) log(1 + γ), and the corresponding distortion for α = 1 is D*_det = 1/(1 + γ). For the block fading channel the expected distortion at the receiver is given by

    E[D] = D(R) · Pr[C ≥ αR] + 1 · Pr[C < αR],    (4)

where the second term on the right corresponds to an outage. Minimization of E[D] over the source rate R is considered in [2]. Here we omit the corresponding expressions for the optimal rate and minimum mean distortion.

² This result is asymptotic as the sequence length K → ∞. For finite K this relation gives a lower bound on the achievable distortion.

III. ERASURE CODES

We now view the channel as a symbol erasure channel, where each symbol refers to a channel codeword Y, and the erasure probability is the outage probability due to fading. If we assume, for the time being, that the erasure code extends over an infinite number of blocks, then the achievable rate is given by the capacity [3]

    C_er = log(|C|)(1 - p_er)/L = αR e^{-γ^{-1}(e^{2αR} - 1)},    (5)

where |C| is the cardinality of the symbol set C, i.e., e^{KR} in our case, and p_er = p = 1 - e^{-γ^{-1}(e^{2αR} - 1)} is the outage probability. Since the capacity is deterministic in this case, the distortion is D = e^{-2C_er/α}. The source rate that maximizes C_er satisfies 2αR e^{2αR} = γ, and the unique real solution is given by

    R* = (1/(2α)) LW(γ),    (6)

where LW(·) is Lambert's W function (also called the Omega function), the inverse of f(x) = x e^x. The corresponding received distortion is (for α = 1)

    D = exp[-LW(γ) e^{γ^{-1}(1 - e^{LW(γ)})}].    (7)

We now evaluate the mean distortion when a finite-length erasure code is used. Specifically, the erasure code maps U code words generated by the inner code, {Y_1, Y_2, ..., Y_U}, to V code words {Y_1, Y_2, ..., Y_V}. Hence the information rate is RU/V. The source bits are assumed to be interleaved across the V code words, so that the distortion statistics do not change from block to block. In what follows, we assume a linear (V, U, d) erasure channel code [12] with minimum distance d, i.e., the U information codewords can be recovered from (V - d + 1) codewords. Furthermore, we assume that d = V - U + 1, corresponding to a Maximum Distance Separable (MDS) code. In that case, the U information codewords can be recovered from any subset of U correctly decoded codewords. The expected distortion corresponding to source rate R and erasure code rate U/V is given by

    E[D] = D(RU/V) Pr[C] + (1 - Pr[C]),    (8)

where C is the event that at least U code words are correctly decoded, hence

    Pr[C] = Σ_{i=U}^{V} (V choose i) (1-p)^i p^{V-i} = Σ_{i=0}^{V-U} (V choose i) p^i (1-p)^{V-i},    (9)

where p is the outage probability for each block.

We want to minimize the expected distortion at the receiver over the source rate R. For α = 1 and γ = 30 dB, the optimal rate from (6) is R* = 2.62 nats/source sample, and the optimal erasure code rate can be approximated as U/V = 10/12. Fig. 1 shows the minimum E[D] and the corresponding standard deviation versus V for fixed U/V. We see that E[D] decreases with V (and goes to a finite limit, designated by the dotted straight red line in the figure), and, as expected, the standard deviation tends to zero. If U = V, corresponding to no erasure code, then the optimal rate is R* = 1.73 nats per source sample and E[D]* = -12.16 dB, which is significantly larger than that achievable with an erasure code.
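As a quick numerical check of (5)-(7), the optimal rate and limiting distortion can be evaluated directly. The sketch below is illustrative only (it is not part of the paper) and uses a small Newton iteration in place of a library Lambert-W routine; at γ = 30 dB it reproduces the rate R* = 2.62 nats/source sample quoted above.

```python
import math

def lambert_w(y, tol=1e-12):
    """Principal branch of Lambert's W for y > 0: solve x*exp(x) = y by Newton."""
    x = math.log(1.0 + y)  # reasonable starting point for y > 0
    for _ in range(100):
        step = (x * math.exp(x) - y) / ((x + 1.0) * math.exp(x))
        x -= step
        if abs(step) < tol:
            break
    return x

def erasure_code_limit(gamma_db, alpha=1.0):
    """Optimal source rate (6) and limiting distortion for an infinite-length
    erasure code over a block Rayleigh fading channel, from (5)-(7)."""
    gamma = 10.0 ** (gamma_db / 10.0)              # average SNR, linear scale
    r_star = lambert_w(gamma) / (2.0 * alpha)      # eq. (6)
    p_out = 1.0 - math.exp(-(math.exp(2.0 * alpha * r_star) - 1.0) / gamma)
    c_er = alpha * r_star * (1.0 - p_out)          # eq. (5)
    return r_star, math.exp(-2.0 * c_er / alpha)   # distortion D = e^{-2 C_er/alpha}

r_star, d = erasure_code_limit(30.0)
print(f"R* = {r_star:.2f} nats/sample, D = {10 * math.log10(d):.1f} dB")
```

A library routine such as `scipy.special.lambertw` could replace the hand-rolled solver; the pure-`math` version is used here only to keep the sketch self-contained.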
Fig. 1. (a) Minimum expected distortion and (b) the corresponding standard deviation vs. V, with γ = 30 dB and U/V ≈ 10/12.

Fig. 2. Scalable source code with two layers. (The source encoder produces a coarse index in C_c = {1, ..., e^{KR_1}}, transmitted with power β_1 P, and a refinement index in C_r = {1, ..., e^{K(R - R_1)}}, transmitted with power (1 - β_1)P; the layer rates are R_1 and R - R_1.)
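The finite-length behavior described above, i.e., the expected distortion from (8)-(9) decreasing toward its limit as V grows at fixed U/V, can be illustrated with a short sketch. This is not from the paper: the rate and SNR below are illustration values only, with the rate deliberately chosen small enough that U/V < 1 - p, so that longer codes succeed with probability approaching one.

```python
import math

def outage_prob(rate, gamma, alpha=1.0):
    """Per-block outage probability from (3) with R_t = alpha * R."""
    return 1.0 - math.exp(-(math.exp(2.0 * alpha * rate) - 1.0) / gamma)

def mds_expected_distortion(rate, u, v, gamma, alpha=1.0):
    """Expected distortion (8) for a (V, U) MDS outer erasure code:
    decoding succeeds iff at least U of the V blocks avoid outage, eq. (9)."""
    p = outage_prob(rate, gamma, alpha)
    p_success = sum(
        math.comb(v, i) * (1.0 - p) ** i * p ** (v - i) for i in range(u, v + 1)
    )
    d_success = math.exp(-2.0 * rate * u / v)  # D(RU/V) for the Gaussian source
    return d_success * p_success + (1.0 - p_success)

# Longer codes at the same rate U/V = 10/12 drive E[D] toward its limit.
for u, v in [(10, 12), (100, 120), (1000, 1200)]:
    ed = mds_expected_distortion(2.5, u, v, 1000.0)  # R = 2.5, gamma = 30 dB
    print(u, v, f"{10 * math.log10(ed):.2f} dB")
```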
IV. SCALABLE CODING

We first consider a scalable code with two layers. Namely, the source vector X is mapped to two indices; the first index i_1 is chosen from the set C_c = {1, ..., e^{KR_1}} and the second index i_2 is taken from the set C_r = {1, ..., e^{K(R - R_1)}}. Correct decoding of i_1 ∈ C_c (the coarse layer) allows reconstruction of the source with asymptotic distortion D_1 = D(R_1) = e^{-2R_1}. Correct decoding of only index i_2 ∈ C_r (the refinement layer) results in a much greater distortion than the rate-distortion bound, i.e., D_2 > e^{-2(R - R_1)}. In what follows, we assume that the index i_2 alone provides no information about the source. A Gaussian source is "successively refinable" [4], meaning that reception of both indices i_1 and i_2 gives asymptotic distortion D_0 = D(R) = e^{-2R}, which achieves the rate-distortion bound.

The two-layer scalable coding scheme is illustrated in Fig. 2. Namely, the source is coded into coarse and refinement indices. Each index is the input to an optimal channel coder with processing gain α, as in the previous section. Let Y_1, Y_2 denote the channel codewords corresponding to indices i_1 and i_2, respectively. We assume that Y_1 has power β_1 P, 0 ≤ β_1 ≤ 1, and Y_2 has power (1 - β_1)P. The sum of the codewords, Y = Y_1 + Y_2, which has power P, is transmitted through the channel. This is an example of a superposition code, which achieves the capacity of a broadcast channel [3]. In this scheme the coarse layer has rate R_1 and the refinement layer has rate R - R_1. If the channel realization does not allow reliable transmission at rate R_1, then the received distortion is the source variance. If the channel does support rate R_1, then the coarse index is decoded first, treating the refinement codeword as noise. The codeword Y_1 is then subtracted from the received codeword to obtain the codeword Y_2. If the channel can support the additional rate R - R_1, then Y_2 can be correctly decoded to give overall distortion D(R). Otherwise, successful decoding of Y_1 alone gives distortion D(R_1).
Let A denote the event that Y_1 is successfully decoded, and let B denote the event that Y_2 is successfully decoded given that Y_1 is successfully decoded. Then

    Pr[A] = Pr[ (1/2) log(1 + hβ_1 P/(h(1 - β_1)P + N)) ≥ αR_1 ] = Pr[h ≥ h_thr^1],

where

    h_thr^1 = (e^{2αR_1} - 1) / ( (P/N)[1 - (1 - β_1)e^{2αR_1}] ),

and N is the noise power as in Section II. Note that h_thr^1 > 0 requires 1 - (1 - β_1)e^{2αR_1} > 0, i.e.,

    R_1 < (1/(2α)) log( 1/(1 - β_1) ).

We also have

    Pr[B] = Pr[ (1/2) log(1 + h(1 - β_1)P/N) ≥ α(R - R_1) | A ] = Pr[h ≥ h_thr^2 | h ≥ h_thr^1],

where

    h_thr^2 = (e^{2α(R - R_1)} - 1) / ( (1 - β_1)P/N ).

The mean distortion is therefore

    E[D] = Pr[h < h_thr^1] + Pr[h_thr^1 ≤ h < h_thr^2] D(R_1) + Pr[h ≥ h_thr^2] D(R).    (10)
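The two-layer mean distortion (10) is straightforward to evaluate numerically. The following sketch is not from the paper: it works with the normalized gain h/H, which is unit-mean exponential, and the particular rates and power split are arbitrary illustration values satisfying the constraint R_1 < (1/(2α)) log(1/(1 - β_1)).

```python
import math

def two_layer_distortion(gamma, R, R1, beta1, alpha=1.0):
    """Mean distortion (10) for the two-layer superposition scheme.
    gamma = HP/N is the average SNR; h/H is unit-mean exponential."""
    e1 = math.exp(2.0 * alpha * R1)
    denom = 1.0 - (1.0 - beta1) * e1
    if denom <= 0.0:   # R1 violates its constraint: coarse layer never decodable
        return 1.0
    h1 = (e1 - 1.0) / (gamma * denom)                              # h_thr^1 / H
    h2 = (math.exp(2.0 * alpha * (R - R1)) - 1.0) / (gamma * (1.0 - beta1))
    h2 = max(h1, h2)   # layer 2 is only decoded after layer 1
    p1, p2 = math.exp(-h1), math.exp(-h2)                          # Pr[h >= h_thr]
    return (1.0 - p1) + (p1 - p2) * math.exp(-2.0 * R1) + p2 * math.exp(-2.0 * R)

# Example: gamma = 30 dB, total rate 3 nats/sample, coarse layer at 1 nat/sample
# with 90% of the power (so R1 = 1 < (1/2) log(1/0.1) ~ 1.15).
print(two_layer_distortion(1000.0, 3.0, 1.0, 0.9))
```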
Now consider Q ≥ 2 layers, where the first i layers, 1 ≤ i ≤ Q, are assigned total power β_i P and give total rate R_i if successfully decoded (i.e., R_i ≥ R_{i-1} and β_i ≥ β_{i-1}). Let ΔR_i = R_i - R_{i-1} denote the incremental rate associated with layer i. The probability that the received distortion is D(R_i) is then the probability that we can decode the ith layer, but not the (i+1)st layer. Since the ith layer has power Δβ_i P = (β_i - β_{i-1})P, and receives interference power (1 - β_i)P from layers i+1 through Q, the probability of successful decoding is π_i - π_{i+1}, where

    π_i = Pr[ (1/2) log(1 + hΔβ_i P/(h(1 - β_i)P + N)) ≥ αΔR_i ]
        = exp( -(1/γ) (e^{2αΔR_i} - 1) / ( Δβ_i - (1 - β_i)[e^{2αΔR_i} - 1] ) ).

Our problem is to

    min_{β_1,...,β_Q, R_1,...,R_Q}  E[D] = Σ_{i=0}^{Q} D(R_i)(π_i - π_{i+1}),

subject to the boundary conditions

    π_0 = 1,  π_{Q+1} = 0,  β_Q = 1,  β_0 = 0,  R_0 = 0.

The objective function can be written as

    E[D] = Σ_{i=0}^{Q} D(R_i)(π_i - π_{i+1}) = 1 + Σ_{i=1}^{Q} L(β_i, R_i, Δβ_i, ΔR_i),    (11)

where

    L(β_i, R_i, Δβ_i, ΔR_i) = D(R_i)[1 - e^{2ΔR_i}] π_i
        = e^{-2R_i}(1 - e^{2ΔR_i}) exp( -(1/γ) (e^{2αΔR_i} - 1) / ( Δβ_i - (1 - β_i)[e^{2αΔR_i} - 1] ) ).    (12)

The preceding nonlinear optimization problem is difficult to solve analytically. To obtain additional insight, we now consider the limiting performance in which the number of layers tends to infinity. Let t = i/Q denote the normalized layer index. As Q → ∞, t becomes a continuous variable in (0, 1), and we assume that the set of fractional power allocations {β_i} converges to the continuous function ξ(t), and the set of rates {R_i} converges to the continuous function r(t). The expected distortion then converges to

    E[D] = 1 - ∫_0^1 L[ξ(t), r(t), ξ'(t), r'(t)] dt,    (13)

where

    L[ξ(t), r(t), ξ'(t), r'(t)] = 2r'(t) exp[-2r(t)] exp( -(1/γ) · 1/( ξ'(t)/(2αr'(t)) - [1 - ξ(t)] ) ),    (14)

and prime denotes the derivative. We seek to minimize the preceding expression with respect to ξ(t), r(t), ξ'(t), and r'(t). The functional L(ξ, r, ξ', r') does not depend explicitly on the layer index t, so that our problem is autonomous. Consequently, the Euler-Lagrange necessary conditions for optimality are [13]

    L_ξ = (d/dt) L_{ξ'},
    L_r = (d/dt) L_{r'},

where L_x = ∂L/∂x, and we have the following transversality condition, since r(1) is free:

    L_{r'}(ξ(1), r(1), ξ'(1), r'(1)) = 0.

Solving these equations appears to be difficult. However, we can obtain a closed-form solution in the high-SNR regime. Namely, in that case the necessary conditions can be simplified, and we obtain the approximate solution

    ξ(t) = 1 + d exp[-3(at + b)^{2/3}/a],
    r(t) = [3(at + b)^{2/3} + c]/(2a),

where the constants a, b, c, and d depend only on γ through LW(γ); we omit the expressions here.

Substituting for ξ(t) and r(t) in (13) with α = 1 gives the minimum expected distortion

    E[D]* = exp(-LW(γ/2)) (1 + (1/2) LW(γ/2)).    (15)

Similarly, it can be shown that the second moment of the distortion with the optimal power/rate allocations is given by

    E[D^2]* = exp(-LW(γ/2)).    (16)

Also of interest is the maximum achievable rate, assuming all layers are successfully decoded, given by

    r(1) = R̄ = (1/2) LW(γ/2).    (17)

Fig. 3 shows a plot of the power and rate allocations ξ(t) and r(t) for γ = 30 dB and α = 1. Also shown are the corresponding power and rate allocations with 10 layers, obtained by solving the discrete optimization numerically. Those results are nearly the same as the infinite-layer results. The expected distortion for this example is approximately -15 dB (with both 10 and infinite layers).

Fig. 3. (a) Power and (b) rate allocation functions for infinite-layer scalable coding. Results with 10 layers are also shown (γ = 30 dB).
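The closed forms (15)-(17) are easy to evaluate numerically. The sketch below (illustrative, not part of the paper) reproduces the expected distortion of approximately -15 dB quoted above for γ = 30 dB.

```python
import math

def lambert_w(y, tol=1e-12):
    """Principal branch of Lambert's W for y > 0, via Newton's method."""
    x = math.log(1.0 + y)
    for _ in range(100):
        step = (x * math.exp(x) - y) / ((x + 1.0) * math.exp(x))
        x -= step
        if abs(step) < tol:
            break
    return x

def infinite_layer_stats(gamma_db):
    """High-SNR closed forms (15)-(17) for infinite-layer scalable coding."""
    w = lambert_w(10.0 ** (gamma_db / 10.0) / 2.0)   # LW(gamma / 2)
    mean_d = math.exp(-w) * (1.0 + 0.5 * w)          # eq. (15)
    second_moment = math.exp(-w)                     # eq. (16)
    std_d = math.sqrt(second_moment - mean_d ** 2)   # standard deviation
    max_rate = 0.5 * w                               # eq. (17)
    return mean_d, std_d, max_rate

mean_d, std_d, max_rate = infinite_layer_stats(30.0)
print(f"E[D]* = {10 * math.log10(mean_d):.1f} dB, std = {std_d:.3f}, "
      f"max rate = {max_rate:.2f} nats/sample")
```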
V. NUMERICAL COMPARISON

Fig. 4 shows the minimum (expected) distortion versus SNR γ for infinite-length erasure codes and scalable codes. For γ > 30 dB, the analytical results for an infinite-layer scalable code are shown, and for γ < 30 dB, numerical results for 10 layers are shown. Also shown for comparison are the analogous curves for an AWGN channel, and for a single-layer code, i.e., with no refinements, as described in Section II.

Fig. 4. Comparison of distortion versus SNR for four different schemes (single layer, infinite-layer scalable, erasure codes, and AWGN).

These results show that for high SNRs, the (deterministic) distortion for erasure coding is approximately 7 dB less than the expected distortion for scalable coding. Of course, the erasure codes also incur an infinite decoding delay. The minimum distortion for the AWGN channel is substantially lower than that for the block fading channel with erasure codes, and the gap (in dB) increases with SNR. Similarly, the gap between single-layer coding and infinite-layer scalable coding increases substantially at high SNRs.

Instead of comparing the performance of the scalable coding scheme with an infinite-length erasure code, as in Fig. 4, Fig. 5 shows distortion versus SNR for finite-length erasure codes. Namely, the length of the erasure code is chosen so that the standard deviation of the distortion is the same as that for the scalable code (i.e., from (16)). (The curve for the scalable code is the same as that shown in Fig. 4.) In this case, the erasure code performs slightly worse than the scalable code.

Fig. 5. Expected distortion versus SNR for erasure codes and infinite-layer scalable codes with fixed standard deviation.

VI. CONCLUSIONS

We have considered the use of erasure codes and scalable codes for the transmission of a continuous source through a block fading channel. In both cases we are able to compute the mean and standard deviation of the received distortion. A numerical comparison shows that both coding schemes offer a substantial performance improvement relative to a basic single-layer coding scheme. For a fixed distortion variance, both coding schemes give similar minimum expected distortion. Although scalable coding appears to be more complex than erasure coding, it incurs a delay of only a single block. Finally, we remark that we have assumed a flat Rayleigh fading channel here. The relative performance of these coding schemes with additional modes of diversity remains an open issue.

VII. ACKNOWLEDGEMENTS
The authors thank J. Nicholas Laneman for suggesting the use of scalable coding in fading channels, and Yiftach Eisenberg for helpful discussions concerning this work.

REFERENCES

[1] L. Ozarow, S. Shamai, and A. D. Wyner, "Information theoretic considerations for cellular mobile radio," IEEE Trans. Vehicular Technology, vol. 43, no. 2, pp. 359-378, May 1994.
[2] K. E. Zachariadis, M. L. Honig, and A. K. Katsaggelos, "Source fidelity over a two-hop fading channel," in Proc. MILCOM, International Conference on Military Communications, Monterey, CA, Nov. 2004.
[3] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley Series in Telecommunications, New York: John Wiley and Sons, 1991.
[4] W. H. R. Equitz and T. M. Cover, "Successive refinement of information," IEEE Trans. Information Theory, vol. 37, no. 2, pp. 269-275, Mar. 1991.
[5] H. Coward, R. Knopp, and S. Servetto, "On the performance of a natural class of joint source/channel codes based upon multiple descriptions," IEEE Trans. Information Theory, submitted for publication, 2002.
[6] J. N. Laneman, E. Martinian, G. W. Wornell, and J. G. Apostolopoulos, "Source-channel diversity approaches for multimedia communication," IEEE Trans. Information Theory, accepted for publication, 2004.
[7] T. Holliday and A. Goldsmith, "Joint source and channel coding for MIMO systems," in Proc. Allerton Conference on Communications, Control, and Computing, Oct. 2004.
[8] Q. Zhao, P. Cosman, and L. B. Milstein, "Optimal allocation of bandwidth for source coding, channel coding, and spreading in CDMA systems," IEEE Trans. Communications, vol. 52, no. 10, pp. 1797-1808, Oct. 2004.
[9] Y. Eisenberg, C. E. Luna, T. N. Pappas, R. Berry, and A. K. Katsaggelos, "Joint source coding and transmission power management for energy efficient wireless video communications," IEEE Trans. Circuits and Systems for Video Technology, vol. 12, no. 6, pp. 411-424, June 2002.
[10] A. Nosratinia, J. Lu, and B. Aazhang, "Source-channel rate allocation for progressive transmission of images," IEEE Trans. Communications, accepted for publication, 2004.
[11] W. Li, "Overview of fine granularity scalability in MPEG-4 video standard," IEEE Trans. Circuits and Systems for Video Technology, vol. 11, no. 3, pp. 301-317, Mar. 2001.
[12] S. Lin and D. J. Costello, Error Control Coding: Fundamentals and Applications, Prentice-Hall, 1982.
[13] M. I. Kamien and N. L. Schwartz, Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management, North-Holland (Advanced Textbooks in Economics), 1991.