IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 10, NO. 6, SEPTEMBER 2000
Robust Joint Source-Channel Coding for Image Transmission Over Wireless Channels Jianfei Cai and Chang Wen Chen
Abstract—In this letter, we present a fixed-length robust joint source-channel coding (RJSCC) scheme for transmitting images over wireless channels. The system integrates a joint source-channel coding (JSCC) scheme with all-pass filtering source shaping to enable robust image transmission. In particular, we are able to incorporate both transition probability and bit error rate of a bursty channel model into an end-to-end rate-distortion (R-D) function to achieve an optimum tradeoff between source coding accuracy and channel error protection under a fixed transmission rate. Experimental results show that the proposed scheme can achieve not only high peak signal-to-noise ratio performance, but also excellent perceptual quality, especially when the channel mismatch occurs. Index Terms—Bit allocation, image communication, joint source-channel coding, source reshaping, wireless communications.
I. INTRODUCTION
Recently, Ruf and Modestino [1] proposed a fixed-length joint source-channel coding (JSCC) scheme for image transmission over additive white Gaussian noise (AWGN) channels, in which different channel codes are applied to different bits according to their respective importance to the reconstructed image. To extend this scheme to wireless transmission, it is necessary to consider the bursty characteristics of wireless channels. It is also desirable to improve the performance of the fixed-length coding scheme so that it becomes competitive with algorithms that employ variable-length coding.

In this paper, we propose a fixed-length robust joint source-channel coding (RJSCC) scheme. This scheme is based on an optimal joint source and channel coding (OJSCC) scheme developed for generalized Gaussian sources [2] and on all-pass filtering source reshaping. We derive general R-D functions for three channel models: binary symmetric channels (BSC) and AWGN channels as memoryless channels, and Gilbert–Elliott channels (GEC) [3] as bursty channels.

The idea of source reshaping is motivated by the robust quantization (RQ) scheme presented in [4]. However, there are several fundamental differences. First, channel coding and optimal bit allocation were not considered in [4]. Second, the bursty characteristics of the channel were not addressed. Third, we show that source reshaping is applicable to cases beyond the Gaussian distribution, including the limiting case of the uniform distribution.

The contribution of this paper is twofold. First, we derive an explicit R-D function based on the fixed-length RJSCC scheme for general wireless channels modeled by finite-state Markov processes. Second, the integration of all-pass filtering with the OJSCC scheme enables wireless image transmission to achieve not only better PSNR performance but also better perceptual quality. Compared with state-of-the-art JSCC schemes, the proposed scheme outperforms many of them, especially when channel mismatch occurs.

Manuscript received June 28, 1999; revised March 16, 2000. This research is supported by the University of Missouri Research Board under Grant URB-98-142. This paper was recommended by Associate Editor F. Pereira. The authors are with the Department of Electrical Engineering, University of Missouri–Columbia, Columbia, MO 65211 USA. Publisher Item Identifier S 1051-8215(00)07563-7.

Fig. 1. SNR performance for 3 bits/sample encoding of memoryless sources.

II. AN OJSCC FOR MEMORYLESS AND BURSTY CHANNELS

We have recently developed an OJSCC scheme for generalized Gaussian distribution (GGD) sources over memoryless and bursty channels [2]. The scheme is similar to the approach proposed by Ruf and Modestino [1]. However, we consider a full range of GGD shape factors as well as the bursty characteristics of a channel. This scheme enables us to conduct an extensive study of the behavior of OJSCC under different source distributions and channel conditions. The study facilitates the integration of all-pass filtering with OJSCC to develop the RJSCC.

Let $\alpha$ denote the exponential decay rate, or shape factor, of a GGD source. The source becomes Gaussian when $\alpha = 2$ and Laplacian when $\alpha = 1$. A GGD with a small value of $\alpha$ provides a useful model for broad-tailed processes. Notice that for very large values of $\alpha$, the distribution tends to a uniform distribution [5]. In the case of wavelet image coding, the transformed coefficients are shown to be distributed as generalized Gaussian with shape factors usually less than two [6].

Fig. 1 shows a summary of the study reported in [2], giving the SNR performance of the OJSCC system for coding memoryless uniform, Gaussian, Laplacian, and GGD-0.5 sources at 3 bits/sample over a BSC, where GGD-0.5 denotes GGD data with $\alpha = 0.5$. An interesting observation is that the larger the
1051–8215/00$10.00 © 2000 IEEE
Fig. 2. System structure.
shape factor of the input data, the better the SNR performance of the OJSCC scheme. Other coding rates and other channel models produce similar observations [2]. Therefore, if a source with a smaller shape factor can be reshaped into a source with a larger shape factor, improved transmission performance can be obtained.

It is this observation that motivates us to develop the RJSCC scheme, an integration of the OJSCC scheme with all-pass filtering source reshaping. All-pass filtering is able to shape a wide range of input sources into Gaussian distributed sources based on, intuitively, the central limit theorem. Notice that if we could design a filter that shapes input sources into uniformly distributed sources, we could achieve even better performance. However, it is nontrivial to design such a filter, since it requires a nonlinear mapping. Therefore, the all-pass filtering method is adopted for its implementation advantage.

The study reported in [2] also shows that, for Gaussian sources, the performance of uniform quantization and that of optimal quantization are nearly the same for all three channel models. Therefore, the complexity of quantization is greatly reduced in the proposed RJSCC scheme: after all-pass filtering, all input sources become approximately Gaussian distributed, and uniform quantization can be employed.
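As an illustration of the uniform quantization mentioned above, the following minimal sketch searches for the step size of a fixed-length uniform quantizer that minimizes the mean square error for a zero-mean, unit-variance Gaussian source. It is not the authors' implementation: the midrise quantizer layout, the golden-section search, and all function names (`cell_mse`, `uniform_quantizer_mse`, `optimal_step`) are assumptions made for this example, and channel errors are ignored.

```python
import math


def _pdf(x):
    """Standard normal density; defined as 0.0 at +/- infinity."""
    if math.isinf(x):
        return 0.0
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def _x_pdf(x):
    """x * pdf(x), defined as 0.0 at +/- infinity to avoid inf * 0."""
    return 0.0 if math.isinf(x) else x * _pdf(x)


def cell_mse(a, b, c):
    """Integral of (x - c)^2 * pdf(x) over [a, b] for a unit Gaussian."""
    p0 = _cdf(b) - _cdf(a)            # integral of pdf
    p1 = _pdf(a) - _pdf(b)            # integral of x * pdf
    p2 = p0 + _x_pdf(a) - _x_pdf(b)   # integral of x^2 * pdf
    return p2 - 2.0 * c * p1 + c * c * p0


def uniform_quantizer_mse(step, bits):
    """MSE of a symmetric midrise uniform quantizer with 2**bits levels.

    Assumed layout: reconstruction points at +/-(i + 0.5) * step, with the
    outermost cells extended to infinity.
    """
    half = 2 ** (bits - 1)
    total = 0.0
    for i in range(half):
        lower = i * step
        upper = math.inf if i == half - 1 else (i + 1) * step
        total += cell_mse(lower, upper, (i + 0.5) * step)
    return 2.0 * total  # the negative half mirrors the positive half


def optimal_step(bits, lo=1e-3, hi=4.0, tol=1e-8):
    """Golden-section search for the step size that minimizes the MSE."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - ratio * (hi - lo)
    x2 = lo + ratio * (hi - lo)
    f1 = uniform_quantizer_mse(x1, bits)
    f2 = uniform_quantizer_mse(x2, bits)
    while hi - lo > tol:
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = hi - ratio * (hi - lo)
            f1 = uniform_quantizer_mse(x1, bits)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = lo + ratio * (hi - lo)
            f2 = uniform_quantizer_mse(x2, bits)
    return 0.5 * (lo + hi)


for bits in (1, 2, 3):
    step = optimal_step(bits)
    print(bits, step, uniform_quantizer_mse(step, bits))
```

For a subsource with variance $\sigma^2$, the step found for the unit-variance case would simply be scaled by $\sigma$, which is one reason a single Gaussian-optimized design can serve every reshaped subsource.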
III. SYSTEM DESCRIPTION

The proposed RJSCC system is shown in Fig. 2. First, an input image is decorrelated using the discrete wavelet transform (DWT). The transform also decomposes the original image source into many subsources so that high compression efficiency can be achieved. Then, all-pass filtering is applied to reshape each subsource into an approximately Gaussian distribution. Each sample of these reshaped subsources is then mapped into an index by an $m$-bit uniform quantizer designed for memoryless Gaussian sources. The output of the quantizer is fed into the channel encoder, where unequal error protection (UEP) is applied. The receiver essentially performs the inverse process.

A. Image-Coding Structures

Three image-coding systems are considered in this research:

1) System A: In this system, an input image is decomposed into 13 hierarchical subbands using DWT with the Daubechies' 9/7 biorthogonal filterbanks. Each subband is treated as a subsource, so there are 13 subsources in total in System A.

2) System B: In this system, an input image is decomposed into 16 hierarchical subbands using DWT. There are 16 subsources in total in System B.

3) System C: This system uses the same hierarchical decomposition as System B. However, each subband is further partitioned into subsources with size equal to that of the low-frequency subband (LFS). Therefore, there are 1024 subsources in total in System C.

System A is the same as the A-RQ system in [4]. System B and System C are the same as the 16-UT and 1024-GG systems in [1], respectively. We adopt these systems for comparison purposes.

B. All-Pass Filtering

The principles of all-pass filtering have been addressed in detail in [7]. Our extensive study [2] also shows that, for the same variance, the performance of OJSCC for a Gaussian source is better than for any other GGD source with a shape factor less than two. Therefore, for these sources, we can improve performance by applying all-pass filtering source reshaping. For GGD sources with $\alpha > 2$, all-pass filtering may increase the mean square error (MSE). However, all-pass filtering still improves the perceptual quality of the reconstructed images [4], because it spreads the error energy over many transformed coefficients. The total energy of the distortion caused by transmission errors is unchanged, but the noise energy is now distributed over many coefficients, hence the perceptual advantage is dramatic [4].

There are many ways to implement all-pass filtering. We adopt the binary phase spectra of pseudonoise signals [4] because pseudonoise signals, such as the $m$-sequence or the quasi-$m$-array, can be easily generated and appear random to human perception. Furthermore, numerous such sequences and arrays are available. For image coding, the phase spectra of 2-D quasi-$m$-arrays are employed. As these arrays can be generated in advance and quantized to two values, $+1$ and $-1$, prefiltering and postfiltering are straightforward and can be implemented via the sequential operations of FFT, spectrum multiplication, and inverse FFT.
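The FFT, spectrum multiplication, and inverse FFT sequence can be sketched in one dimension as follows. This is an illustrative sketch only: it draws the $\pm 1$ phase values from a seeded pseudo-random generator as a stand-in for the $m$-sequence or quasi-$m$-array phase spectra used in the paper, it works on a 1-D Laplacian test signal rather than a 2-D subband, and every name in it (`binary_phase_spectrum`, `allpass_filter`, and so on) is hypothetical. The small radix-2 FFT is included only to keep the example free of external libraries.

```python
import cmath
import random


def _fft(x, sign):
    """Recursive radix-2 transform; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = _fft(x[0::2], sign)
    odd = _fft(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft(x):
    return _fft(x, -1.0)


def ifft(spectrum):
    n = len(spectrum)
    return [v / n for v in _fft(spectrum, 1.0)]


def binary_phase_spectrum(n, seed):
    """Pseudo-random +/-1 value per frequency bin (assumed stand-in for an
    m-sequence phase spectrum), mirrored so the filtered signal stays real."""
    rng = random.Random(seed)
    phase = [1.0] * n
    for k in range(n // 2 + 1):
        s = rng.choice((-1.0, 1.0))
        phase[k] = s
        phase[(n - k) % n] = s
    return phase


def allpass_filter(x, phase):
    """FFT, multiply each bin by its +/-1 phase value, inverse FFT."""
    shaped = [s * p for s, p in zip(fft(x), phase)]
    return [v.real for v in ifft(shaped)]


def kurtosis(x):
    """Sample kurtosis (3 for a Gaussian, 6 for a Laplacian)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2)


def laplacian_samples(n, seed):
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) for _ in range(n)]


x = laplacian_samples(1024, seed=1)
phase = binary_phase_spectrum(1024, seed=2)
y = allpass_filter(x, phase)      # prefilter at the transmitter
x_rec = allpass_filter(y, phase)  # postfilter: the same +/-1 values undo it
print(kurtosis(x), kurtosis(y))
```

Because every bin is multiplied by $\pm 1$, the filter preserves signal energy and is its own inverse, so the postfilter can reuse the prefilter unchanged. The printed kurtosis values give a rough indication of how far the filtered samples have moved toward a Gaussian shape.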
TABLE I PSNR RESULTS FOR 512 × 512 LENA OVER BSC WITH 0.5 BPP OF THREE IMAGE-CODING STRUCTURES

C. R-D Function

Following [1], we can define the R-D function for memoryless channels as

$$D = \sum_{k=1}^{K} \frac{n_k}{N} \left[ D_{q,k}(m_k) + \sum_{j=0}^{m_k-1} S_{k,j}\, p_{k,j} \right] \qquad (1)$$

where
$K$: number of subsources;
$N$: total number of pixels in the original image;
$n_k$: number of pixels in the $k$th subsource;
$D_{q,k}(m_k)$: distortion caused by $m_k$-bit quantization for the $k$th subsource;
$S_{k,j}$, $p_{k,j}$: bit-error sensitivity and the equivalent channel bit-error rate after channel coding for the $j$th bit of the $k$th subsource.

When a finite-state Markov model [3] is adopted to model the wireless channel, the general R-D function for wireless channels can be derived as

$$D = \sum_{k=1}^{K} \frac{n_k}{N} \left[ D_{q,k}(m_k) + \sum_{j=0}^{m_k-1} S_{k,j} \sum_{s=1}^{L} \pi_s\, p_{k,j}(e_s) \right] \qquad (2)$$

where
$L$: number of channel states;
$\pi_s$: probability of the channel staying at the $s$th state;
$e_s$: bit-error rate for the $s$th state;
$p_{k,j}(e_s)$: equivalent bit-error rate of the $j$th bit of the $k$th subsource for the $s$th state after channel coding.

For BSC or AWGN models, there is only one state, so (2) becomes exactly the same as (1). For the GEC model, there are two states: Good and Bad. Let $P_{GB}$ and $P_{BG}$ represent the transition probabilities from one state to the other; let $e_G$ and $e_B$ denote the BERs at the Good state and the Bad state, respectively; and let $\pi_G$ and $\pi_B$ denote the probabilities of staying at the Good state and the Bad state, respectively. By substituting these two-state quantities for $\pi_s$ and $e_s$ in (2), we obtain the R-D function for the GEC model.

The overall rate (bits/sample) can be written as

$$R = \sum_{k=1}^{K} \frac{n_k}{N} \sum_{j=0}^{m_k-1} \frac{1}{r_{k,j}} \qquad (3)$$

with $r_{k,j}$ the channel code rate assigned to the $j$th bit of the $k$th subsource.

Suppose the channel condition ($\pi_s$, $e_s$) is known. Then $p_{k,j}(e_s)$ is determined by the channel codes, while $D_{q,k}$ and $S_{k,j}$ are determined by the quantization scheme. As indicated in Section II, uniform quantization can be employed in the proposed RJSCC system. More precisely, it can be called Gaussian-optimized uniform quantization, as all subsources have now been reshaped into Gaussian sources. For an $m$-bit quantizer, let $\Delta$ denote the quantization step and $D_q$ denote the quantization error. Such uniform quantization is clearly a one-dimensional optimization problem: find the optimal $\Delta$ so that $D_q$ is minimized. For each quantization index, let the $(m-1)$th bit denote the sign bit, and let the other bits, from the 0th to the $(m-2)$th, denote magnitude bits. The bit-error sensitivity of every bit can then be derived in closed form in terms of the complementary error function erfc; this expression is referred to as (4).
The details of the derivation of (2) and (4) can be found in [2]. Therefore, the remaining unknown parameters are the number of quantization bits $m_k$ of each subsource $k$ and the channel code rate $r_{k,j}$ of each bit $j$ of that subsource. The optimal values of $m_k$ and $r_{k,j}$ can be obtained using the well-known bit-allocation algorithm of Westerink et al. [8] to minimize the total distortion.

We would like to point out that the RJSCC system shown in Fig. 2 is a flexible system. Various scalar quantization schemes and channel codes can be employed in the system without changing the analytical form of the R-D function.

IV. EXPERIMENTAL RESULTS

The monochrome Lena and Goldhill images with 8 bits/pixel are used as the test images. We use rate-compatible punctured convolutional (RCPC) codes [9] to provide UEP for coded image transmission, because RCPC codes can easily change coding rates without changing the basic codec structure. The available channel code rates of the RCPC codes (with a fixed memory and puncture period) are {1/1, 8/9, 4/5, 2/3, 4/7, 1/2, 4/9, 4/10, 4/11, 1/3, 4/13, 4/15, 1/4}. Other channel-coding techniques, including combinations of several channel coding schemes, can also be applied.

A. Results with BSC

Table I shows the PSNR results of 20 trials of the three different image-coding structures with RJSCC for the Lena image at 0.5 bpp. Notice that both average and worst PSNRs are shown to demonstrate the robustness of the proposed scheme over different trials. System C produces the best performance. This is because the System C coding structure introduces more subsources, so that bits can be better allocated between source coding and channel coding. The comparisons using System A among A-RJSCC, A-OJSCC, and A-RQ [4] are tabulated in Tables II and III.
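Returning to the R-D formulation of Section III-C, the sketch below shows one way such an end-to-end distortion and overall rate could be evaluated for a candidate bit allocation. It rests on stated assumptions rather than on the paper's exact expressions: the distortion is taken to be a pixel-weighted sum over subsources of the quantization distortion plus, for each bit, the bit-error sensitivity times the state-averaged post-decoding BER, and the rate is taken to be the pixel-weighted sum of inverse channel code rates. The `Subsource` container, the function names, and all numeric values are hypothetical. The stationary state probabilities of the two-state Gilbert–Elliott channel follow from the standard two-state Markov chain result.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subsource:
    n_pixels: int                 # number of samples in this subsource
    quant_distortion: float       # distortion of the quantizer alone
    sensitivity: List[float]      # per-bit distortion caused by an error in that bit
    code_rate: List[float]        # per-bit channel code rate, in (0, 1]
    coded_ber: List[List[float]]  # coded_ber[s][j]: post-decoding BER of bit j in state s


def gec_stationary(p_good_to_bad, p_bad_to_good):
    """Stationary (Good, Bad) probabilities of a two-state Markov chain."""
    total = p_good_to_bad + p_bad_to_good
    return p_bad_to_good / total, p_good_to_bad / total


def average_ber(state_probs, state_bers):
    """Raw channel BER averaged over the channel states."""
    return sum(p * e for p, e in zip(state_probs, state_bers))


def end_to_end_distortion(subsources, state_probs):
    """Quantization distortion plus state-averaged channel-error distortion."""
    n_total = sum(s.n_pixels for s in subsources)
    total = 0.0
    for s in subsources:
        channel_term = 0.0
        for j, weight in enumerate(s.sensitivity):
            mean_ber = sum(p * s.coded_ber[k][j] for k, p in enumerate(state_probs))
            channel_term += weight * mean_ber
        total += s.n_pixels / n_total * (s.quant_distortion + channel_term)
    return total


def overall_rate(subsources):
    """Channel bits per source sample, including channel-code redundancy."""
    n_total = sum(s.n_pixels for s in subsources)
    return sum(s.n_pixels / n_total * sum(1.0 / r for r in s.code_rate)
               for s in subsources)


# Hypothetical two-state channel and a single two-bit subsource.
pi_good, pi_bad = gec_stationary(p_good_to_bad=0.01, p_bad_to_good=0.09)
source = Subsource(
    n_pixels=4,
    quant_distortion=0.5,
    sensitivity=[1.0, 4.0],
    code_rate=[1.0, 0.5],
    coded_ber=[[0.0, 0.0], [0.1, 0.01]],
)
print(end_to_end_distortion([source], (pi_good, pi_bad)), overall_rate([source]))
```

With a single channel state of probability one, the same function reduces to the memoryless case. A bit-allocation routine would call these two functions repeatedly while varying the number of bits and the code rates under a fixed total rate.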
TABLE II PSNR (IN DECIBELS) OF TRANSMITTING 512 × 512 LENA OVER BSC

TABLE III PSNR (IN DECIBELS) OF TRANSMITTING 512 × 512 GOLDHILL OVER BSC

Fig. 3. Reconstructed images using A-RJSCC coded at 0.5 bpp. (a), (b) Lena. (c), (d) Goldhill. Each panel corresponds to a channel BER given as a power of ten.
A-RJSCC outperforms A-OJSCC in all cases. Comparing A-RJSCC with A-RQ, A-RJSCC performs slightly worse than A-RQ in the noise-free cases. However, A-RJSCC outperforms A-RQ by up to 5.86 dB for moderate and high BER channel conditions. The reconstructed images of Lena and Goldhill at 0.5 bpp with various channel BERs are shown in Fig. 3. These images show that the perceptual quality is still quite good for highly corrupted channels with BER = 0.1.

Table IV summarizes the comparison of the proposed scheme with several of the best systems, reported in [4] (A-RQ), [6] [S/C-SUB(D)], and [10] (SPIHT/RCPC/CRC), respectively. The proposed C-RJSCC is better than S/C-SUB(D) and A-RQ, but worse than SPIHT/RCPC/CRC. We also show the results of RJSCC/CRC, where RCPC/CRC channel codes are adopted instead of RCPC codes.

Notice that the SPIHT/RCPC/CRC scheme [10] is based on known and fixed BER channels. If there is a sufficient mismatch between the design BER and the actual BER, that system will perform poorly. The fixed-length coding in the RJSCC system, by contrast, will not produce catastrophic error propagation. Table V shows an example of moderate channel mismatch, in which the actual BER ranges from 0.01 to 0.05 while all schemes are designed for a BER of 0.01. These results demonstrate that the C-RJSCC scheme degrades gracefully as the channel BER increases and is therefore resilient to channel mismatch. In such a case, C-RJSCC outperforms both schemes proposed in [10] and [11].

TABLE IV COMPARISON OF DIFFERENT SYSTEMS FOR 512 × 512 LENA OVER BSC

TABLE V PSNR FOR LENA AT 0.5 BPP WITH DESIGNED BER = 0.01

B. Results with AWGN

For AWGN channels, we use BPSK as the modulation scheme and 8-bit soft decision for the Viterbi algorithm. We have compared C-RJSCC with Ruf and Modestino's 1024-GG reported in [1], both with the same image-coding structure of System C. The results are shown in Table VI for coding rates of 1 and 0.5 bpp and for a range of channel SNRs. The experiments show that C-RJSCC outperforms 1024-GG by 0.42 to 0.73 dB. We also compare B-RJSCC with Ruf and Modestino's 16-UT reported in [1], both with the same image-coding structure of System B. The experiments show that B-RJSCC outperforms 16-UT by 2 to 3 dB.
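To put these PSNR gains in perspective, the short sketch below converts a gain in decibels into the corresponding reduction in mean square error, using the usual PSNR definition for 8-bit images with peak value 255. The function names are chosen for this example only. Under this definition, a gain of 0.42 dB corresponds to roughly 9% lower MSE, and a gain of 3 dB to roughly half the MSE.

```python
import math


def psnr_db(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given mean square error."""
    return 10.0 * math.log10(peak * peak / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks when the PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


for gain in (0.42, 0.73, 2.0, 3.0):
    reduction = 100.0 * (1.0 - mse_ratio_for_gain(gain))
    print(f"{gain:.2f} dB gain -> about {reduction:.0f}% lower MSE")
```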
TABLE VI PSNR RESULTS FOR LENA OVER AWGN CHANNELS
C. Results with GEC

A particular GEC model is adopted to generate bursty errors. The parameters of the channel (transition probabilities, state BERs, and state probabilities) give an average BER of 0.0248. For this channel, we compare three optimal bit allocation strategies.

1) Strategy A: The optimization is based on our proposed R-D function shown in (2).

2) Strategy B: The channel is treated as a BSC with the average BER of the GEC model, and the optimization is based on the R-D function shown in (1).

3) Strategy C: The channel is treated as a BSC with the BER of the Bad state, and the optimization is based on the R-D function shown in (1).

Table VII shows the PSNR results of Lena coded at 0.5 bpp by A-RJSCC and C-RJSCC for the different strategies. The results show that the optimal design based on the full R-D function, Strategy A, outperforms the other two strategies based on the simple R-D function, not only on average but also in the worst case among 20 trials. This demonstrates that our proposed R-D function better characterizes wireless channels, because the channel transition probability is incorporated.

TABLE VII PSNR RESULTS FOR LENA AT 0.5 BPP OVER THE BURSTY CHANNEL

D. Side Information

Certain side information must be reliably transmitted, including the mean of the LFS and the variance of each subsource. Suppose we use 16 bits each to quantize the mean and the variance of the LFS and 8 bits to quantize the variance of each other subsource. For System A, with 13 subsources, this gives 2 × 16 + 12 × 8 = 128 bits of overhead, or 128/512² ≈ 0.00049 bpp, which is less than 0.0005 bpp. For System C, with 1024 subsources, the overhead is about 0.0313 bpp. In this research, we have assumed that the overhead can be reliably transmitted over the noisy channel with appropriate channel coding.

Compared with Ruf and Modestino's OJSCC system [1], the side information has been reduced. This is because the shape factor of each subsource needs to be sent as overhead in their scheme. In the case of RJSCC, however, all subsources become Gaussian distributed after all-pass filtering, so there is no need to send shape factors.

V. CONCLUSION

In this paper, a robust joint source-channel coding scheme is proposed for transmitting images over wireless channels. Experimental results show that RJSCC is able to achieve very good PSNR performance and perceptual quality with modest complexity. We also show that the proposed scheme is more robust to channel mismatches. This is particularly useful because wireless channels often fluctuate over a modest range of channel conditions due to fading and multipath.

REFERENCES

[1] M. J. Ruf and J. W. Modestino, “Operational rate-distortion performance for joint source and channel coding of images,” IEEE Trans. Image Processing, vol. 8, pp. 305–320, Mar. 1999.
[2] J. Cai and C. W. Chen, “Operational rate-distortion design for joint source-channel coding over noisy channels,” in Proc. IEEE WCNC’99, New Orleans, LA, Sept. 1999.
[3] H. S. Wang and N. Moayeri, “Finite-state Markov channel - A useful model for radio communication channels,” IEEE Trans. Veh. Technol., vol. 44, pp. 163–171, Feb. 1995.
[4] Q. Chen and T. R. Fischer, “Image coding using robust quantization for noisy digital transmission,” IEEE Trans. Image Processing, vol. 7, pp. 496–505, Apr. 1998.
[5] N. Farvardin and V. Vaishampayan, “Optimal quantizer design for noisy channels: An approach to combined source-channel coding,” IEEE Trans. Inform. Theory, vol. 33, pp. 827–838, Nov. 1987.
[6] N. Tanabe and N. Farvardin, “Subband image coding using entropy-coded quantization over noisy channels,” IEEE J. Select. Areas Commun., vol. 10, pp. 926–943, June 1992.
[7] A. C. Popat and K. Zeger, “Robust quantization of memoryless sources using dispersive FIR filters,” IEEE Trans. Commun., vol. 40, pp. 1670–1674, Nov. 1992.
[8] R. H. Westerink, J. Biemond, and D. E. Boekee, “An optimal bit allocation algorithm for subband coding,” in Proc. ICASSP’88, 1988, pp. 757–760.
[9] J. Hagenauer, “Rate-compatible punctured convolutional codes and their applications,” IEEE Trans. Commun., vol. 36, pp. 389–400, Apr. 1988.
[10] P. G. Sherwood and K. Zeger, “Progressive image coding for noisy channels,” IEEE Signal Processing Lett., vol. 4, pp. 189–191, July 1997.
[11] H. Li and C. W. Chen, “Bi-directional synchronization and hierarchical error correction for robust image transmission,” in Proc. SPIE Visual Communication and Image Processing’99, San Jose, CA, Jan. 1999, pp. 63–72.