COMBINED IMAGE SUBBAND CODING AND MULTILEVEL MODULATION FOR COMMUNICATION OVER POWER- AND BANDWIDTH-LIMITED CHANNELS
John M. Lervik, Arild Fuldseth, Tor A. Ramstad
Department of Telecommunications, The Norwegian Institute of Technology, O.S. Bragstads Plass 2B, N-7034 Trondheim, Norway
e-mail:
[email protected]
ABSTRACT
This paper presents a bandwidth-efficient, power-limited still image transmission system based on subband coding and 64-QAM. A new optimization procedure, based on the simulated annealing algorithm, is used to optimize the mappings between quantized subband samples and the modulation signal constellation. The optimization criterion is the mean square error between the transmitted and the reconstructed quantized subband samples. Assuming an additive white Gaussian noise (AWGN) channel, it is shown that the proposed image transmission system gives a significant improvement in image quality compared to a system with random mappings.
1. INTRODUCTION
Traditionally, the source encoder/decoder, the channel encoder/decoder, and the modulator/demodulator pairs in a digital communication system have been considered as separate entities. However, improved performance can be achieved by joint optimization of the entire communication system. Consequently, there has been a lot of work on combined source and channel coding in recent years. Most of this work has focused on the transmission of samples from a Gaussian source over a noisy channel without a power constraint [1, 2]. In [1], simulated annealing is used to find a mapping between the output of a vector quantizer (VQ) and the codewords of a binary symmetric channel so as to minimize the effects of transmission errors in the reconstructed signal. The result is improved robustness to channel errors without sacrificing performance for a noise-free channel.
A practical scheme for transmission of subband coded images over narrow-band channels using traditional transmission system design is considered by Stedman et al. [3]. An embedded video subband coder, a BCH error-correction coder, and a 16-level quadrature amplitude modulator (16-QAM) are investigated for transmission over micro-cellular mobile channels.
This paper extends the work in [1] in two directions. First, it is shown how the optimization procedure based on simulated annealing can be modified to take into account a power constraint with multilevel signaling. Second, we consider a bandwidth-efficient still image transmission system based on subband coding and 64-QAM, where a real image source is used instead of Gaussian data.
In contrast to [3], we have optimized the image transmission system in such a way that there is no explicit need for error-correction coding overhead, except for a small amount protecting some necessary side information and possibly the lowpass-lowpass band. In this paper, the mappings between the quantized source samples and the modulation signal constellation are optimized so as to minimize the effect of transmission errors at the receiver side. Thus, the system degrades gracefully, and almost no degradation is found under error-free channel conditions compared to an unprotected system. In addition, a larger signal constellation is used (64-QAM instead of 16-QAM), offering a greater spectral efficiency.
The paper is organized as follows. Section 2 discusses the mechanisms for noise generation in communication systems. Section 3 describes the structure of the combined subband coding and QAM system, while Section 4 formulates the mapping optimization problem and provides an efficient, practical algorithm for optimization. Section 5 compares the image transmission results for the optimized system against a random mapping reference system, and Section 6 is the Conclusion.
2. NOISE GENERATION IN TRANSMISSION SYSTEMS
Traditionally, the goal has been to make as good a source coder as possible as measured by a bit rate versus image quality criterion, i.e. to remove as much redundancy as possible, and then to use explicit error protection, i.e. to introduce redundancy, to prevent channel errors from degrading the reconstructed image quality. For example, image and video compression standards such as JPEG [4], MPEG-1 [5], and MPEG-2 [6] are designed for operation under error-free transmission conditions, and quantization is assumed to be the only noise source in the reconstructed signal. For noisy channels, e.g. terrestrial communication, these standards may need an excessive amount of error protection to avoid total breakdown of the reconstructed signal under poor reception conditions.
For noisy channel environments, it seems more reasonable to allow both the quantization and the channel errors to contribute to the noise in the reconstructed image. Thus, we may make explicit error protection superfluous by ensuring that the system itself provides enough implicit error protection. This is done by the proposed combined subband coding and multilevel modulation system, where the subband coded samples are mapped to the modulation signal constellation in such a way that the effect of channel errors is minimized in the reconstructed image.
In the proposed system the aggregate effect of quantization noise and channel noise constitutes the signal degradation. It is, however, not obvious what the optimal ratio between these two noise sources is [7]. A given channel bandwidth can, for a fixed channel SNR, be obtained either by a high rate source coder combined with a highly bandwidth-efficient modulation method, or by a low rate source coder combined with a less bandwidth-efficient modulation method. In the first system the noise due to channel errors will dominate, while in the second system the quantization will be the predominant noise source. The superior system is the one which gives the best fidelity of the reconstructed image under realistic operating conditions [7].
Ideally, the noise in the reconstructed image due to channel errors should have the same characteristics and statistics as the quantization noise. This would give a simple additive noise model which can easily be analyzed. This is, however, not necessarily the case for mappings between multidimensional spaces of unequal dimensionality.
In Proc. Workshop on Visual Signal Processing and Comm., (New Brunswick, NJ, USA), pp. 173-178, IEEE, Sep. 1994
[Figure 1, block diagram: the image source $X$ feeds an $8 \times 8$ analysis filter bank. The lowpass-lowpass band $X_0$ is coded by a fixed rate DPCM coder followed by explicit error protection. The bands $X_1, X_2, \ldots, X_{63}$ pass through blockwise bit allocation into classes $Y_1, \ldots, Y_6$, which are quantized by 1-bit to 6-bit quantizers into $\hat{Y}_1, \ldots, \hat{Y}_6$; the side information is explicitly error protected. The quantized samples are grouped into vectors $\hat{Y}^{(1)}, \ldots, \hat{Y}^{(11)}$ and mapped by $m^{(1)}(\cdot), \ldots, m^{(11)}(\cdot)$ to 64-QAM symbols $s(m^{(i)}(k))$.]
Figure 1. Block diagram of the transmitter side of the image transmission system.
3. SYSTEM DESCRIPTION
Subband coding has attracted considerable interest as an image compression technique during the last ten years [8, 9]. However, no standards for image or video transmission have yet used image subband coders, although they may give improved performance over the transform coders that are part of all existing standards. A subband coder consists of an analysis and a synthesis filter bank, and a quantization part. The analysis filter bank decomposes the source signal into subbands such that efficient coding of the decomposed signal is possible, i.e. ideally, it removes all of the interband and, to a certain degree, the intraband correlation [10]. The quantization strategies are usually based on distributing bits among the subbands to minimize the distortion.
In this section we describe the image transmission system shown in Figure 1, with emphasis on the employed quantization strategy and the bandwidth-reducing ability of the system.
3.1. Image transmission system
The subband coder is optimized for noise-free channel conditions [11], as we want no degradation of the system performance compared to an unprotected coder while the transmission is error free. The analysis filter bank splits the input image $X$ into $8 \times 8$ subbands, $X_i$, $i = 0, 1, \ldots, 63$. The lowpass-lowpass band, $X_0$, is quantized with a 5 bits/sample DPCM coder with a fixed predictor configuration and is explicitly error protected to avoid transmission errors. The subband samples from the higher bands, $X_i$, $i = 1, 2, \ldots, 63$, are allocated bits according to the blockwise root mean square (rms) values as described in Section 3.2, and then divided into seven classes, $Y_r$, $r = 0, 1, \ldots, 6$, where $r$ is the number of bits in the quantizer. The individual subband samples from each of the six classes allocated a nonzero number of bits are then encoded with scalar quantizers optimized for a Laplacian pdf [9]. The quantized samples, $\hat{Y}_r = Q(Y_r)$, $r = 1, 2, \ldots, 6$, are grouped into one of 11 vector structures, $\hat{Y}^{(i)}$, $i = 1, 2, \ldots, 11$, and mapped to symbols, $s(m^{(i)}(\cdot))$, in a uniform 64-QAM signal constellation as explained in Section 3.3. We have assumed an additive white Gaussian noise (AWGN) channel. At the receiver side a maximum likelihood (ML) demodulator is used, and the image signal, $\hat{X}$, is reconstructed through the inverse of the encoder operations.
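The receiver step described above can be illustrated with a minimal sketch. It assumes a uniform 8 x 8 grid with unit spacing and uses the fact that, on an AWGN channel, the ML decision is the constellation point nearest to the received value. The function names, the point ordering, and the Monte Carlo set-up are illustrative choices, not taken from the paper.

```python
import random

def qam64_points(delta=1.0):
    """Return the 64 points of a uniform 8 x 8 QAM grid with spacing delta."""
    levels = [(2 * a - 7) * delta / 2 for a in range(8)]
    return [complex(re, im) for im in levels for re in levels]

def ml_demodulate(received, points):
    """ML decision on an AWGN channel: index of the nearest constellation point."""
    return min(range(len(points)), key=lambda k: abs(received - points[k]))

def symbol_error_rate(snr_db, n_symbols=20000, seed=0):
    """Monte Carlo symbol error rate for equiprobable symbols (an assumption)."""
    rng = random.Random(seed)
    points = qam64_points()
    avg_power = sum(abs(p) ** 2 for p in points) / len(points)
    noise_var = avg_power / (10 ** (snr_db / 10))  # total complex noise variance
    sigma = (noise_var / 2) ** 0.5                 # per real dimension
    errors = 0
    for _ in range(n_symbols):
        k = rng.randrange(64)
        r = points[k] + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        errors += ml_demodulate(r, points) != k
    return errors / n_symbols

if __name__ == "__main__":
    for snr in (10, 18, 26):
        print(snr, "dB ->", symbol_error_rate(snr, n_symbols=5000))
```

The exhaustive nearest-point search is chosen for clarity; for a square grid the same decision can be made per dimension by rounding.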
3.2. Explicit bit allocation
We have used a fixed rate, explicit bit allocation strategy originally presented for speech coding applications by Ramstad in [12] and further developed for image subband coders by Husøy [9].
The bit allocation procedure can be described as follows. Each subband except the lowpass-lowpass band, $X_i$, $i = 1, 2, \ldots, 63$, is divided into blocks of size $4 \times 4$ (see footnote 1), and the rms value is computed for each block. Each block is allocated bits according to its rms value. Thus, additional local adaptivity is gained compared to a bit allocation procedure that allocates bits to whole subbands, though at the cost of extra overhead for informing the decoder about which quantizer is used for each block. The block with the largest estimated standard deviation is allocated one bit per pixel. Then the estimated standard deviation of that block is divided by a factor of 2, after which the procedure is repeated until all available bits are allocated to the different blocks. It can be shown that this explicit bit allocation method, in the limit of high bit rates, renders maximum coding gain for a fixed rate [12].
A maximum allowable number of bits per pixel is preset. This limits the side information: a larger number of possible classes increases the side information needed to tell the decoder which class each block belongs to. In this paper the maximum number of bits per pixel is set to 6. The blocks allocated the same number of bits per pixel are said to belong to the same class. All blocks belonging to class $r$, $Y_r$, $r = 1, 2, \ldots, 6$, are quantized with the same $r$-bit quantizer. To prevent variance mismatch in the quantization process, samples belonging to blocks within one class are normalized to unit variance prior to being quantized. Each of the 6 class variances is passed to the decoder as side information. Also, as mentioned above, for each block we must indicate to which class it belongs.
An important feature of this bit allocation strategy is that the total bit rate can be kept constant. Thus, the buffering associated with variable length coding is avoided.
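The greedy procedure described above can be sketched as follows. This is a minimal illustration, assuming that all blocks have the same size so that the budget can be counted in units of "one bit per pixel for one block"; the function name, the tie-breaking by block index, and the example rms values are hypothetical.

```python
import heapq

def allocate_bits(block_rms, total_bits, max_bits=6):
    """Greedy allocation: repeatedly give one bit per pixel to the block with
    the largest remaining standard deviation estimate, then halve that estimate.

    block_rms  -- rms value of each block
    total_bits -- budget, counted as bits per pixel summed over the blocks
    max_bits   -- preset maximum number of bits per pixel for one block
    """
    bits = [0] * len(block_rms)
    # heapq is a min-heap, so store negated values; ties go to the lower index.
    heap = [(-rms, idx) for idx, rms in enumerate(block_rms) if rms > 0]
    heapq.heapify(heap)
    remaining = total_bits
    while remaining > 0 and heap:
        neg_sigma, idx = heapq.heappop(heap)
        bits[idx] += 1
        remaining -= 1
        if bits[idx] < max_bits:
            # Divide the estimated standard deviation by 2 and re-insert.
            heapq.heappush(heap, (neg_sigma / 2, idx))
    return bits

if __name__ == "__main__":
    print(allocate_bits([8.0, 2.0, 1.0, 0.5], total_bits=5))  # → [4, 1, 0, 0]
```

Blocks that end with the same number of bits form one class; blocks left at 0 bits are not transmitted.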
3.3. How bandwidth reduction is obtained
Following [7], we define the efficiency of the total transmission system as the ratio between the source bandwidth and the channel bandwidth for a fixed fidelity of the reconstructed signal and a fixed channel signal-to-noise ratio (SNR). A reduced channel bandwidth is basically obtained by reducing the channel symbol rate relative to the source symbol rate. In the proposed system this is achieved by first removing signal redundancies and irrelevancies in the subband source coder (filter bank and quantizers), after which several source coder symbols are mapped intelligently into one symbol in the modulation signal constellation.
To achieve a reasonable bandwidth reduction through symbol combination we have chosen 64-QAM as the modulation method. QAM is bandwidth-efficient and well suited for e.g. terrestrial broadcasting [13]. Given 64-QAM, $\log_2 64 = 6$ bits can be transmitted per QAM symbol (see footnote 2). The channel symbol rate is reduced by grouping a number of quantized subband samples, altogether represented by 6 bits, and mapping the grouped symbols into one 64-QAM symbol. The quantized subband samples, $\hat{Y}_r$, $r = 1, 2, \ldots, 6$, are grouped into one of 11 possible vector structures, $\hat{Y}^{(i)}$, $i = 1, 2, \ldots, 11$, according to Table 1. Each of the vector structures is constructed such that the sum of the number of bits used for the individual components is equal to 6. The grouping is unambiguously given by the bit allocation table, which is transmitted without errors at the cost of a small amount of error protection overhead. Finally, the vectors, $\hat{Y}^{(i)}$, $i = 1, 2, \ldots, 11$, representing the quantized subband samples are mapped to a uniform 64-QAM signal constellation (see Figure 2).
$\hat{Y}^{(1)} = (\hat{Y}_1, \hat{Y}_1, \hat{Y}_1, \hat{Y}_1, \hat{Y}_1, \hat{Y}_1)$
$\hat{Y}^{(2)} = (\hat{Y}_2, \hat{Y}_1, \hat{Y}_1, \hat{Y}_1, \hat{Y}_1)$
$\hat{Y}^{(3)} = (\hat{Y}_2, \hat{Y}_2, \hat{Y}_1, \hat{Y}_1)$
$\hat{Y}^{(4)} = (\hat{Y}_3, \hat{Y}_1, \hat{Y}_1, \hat{Y}_1)$
$\hat{Y}^{(5)} = (\hat{Y}_2, \hat{Y}_2, \hat{Y}_2)$
$\hat{Y}^{(6)} = (\hat{Y}_3, \hat{Y}_2, \hat{Y}_1)$
$\hat{Y}^{(7)} = (\hat{Y}_4, \hat{Y}_1, \hat{Y}_1)$
$\hat{Y}^{(8)} = (\hat{Y}_3, \hat{Y}_3)$
$\hat{Y}^{(9)} = (\hat{Y}_4, \hat{Y}_2)$
$\hat{Y}^{(10)} = (\hat{Y}_5, \hat{Y}_1)$
$\hat{Y}^{(11)} = (\hat{Y}_6)$
Table 1. Structure of the 11 grouping vectors.
In addition to the combination of symbols, the symbol rate is also reduced by discarding subband samples which have been allocated 0 bits. A larger modulation constellation could have been chosen, thus saving channel bandwidth, though at the cost of an increased channel symbol error rate for a fixed channel SNR. A smaller modulation constellation would correspondingly, for a fixed channel SNR, have required a larger channel bandwidth, but with a lower symbol error rate.
Footnote 1: The block size is a compromise between local adaptivity and bits spent on side information.
Footnote 2: One 64-QAM symbol can be represented with two 8-level (i.e. 3 bits) real channel symbols (i.e. two 8-PAM symbols).
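The count of 11 grouping structures follows from the constraint that the component bit counts sum to 6: up to ordering, these are exactly the integer partitions of 6. The short sketch below enumerates them; the function name is illustrative, and the enumeration order differs from the order of Table 1.

```python
def grouping_structures(total_bits=6, max_part=None):
    """All non-increasing tuples of positive quantizer sizes summing to total_bits."""
    if max_part is None:
        max_part = total_bits
    if total_bits == 0:
        return [()]
    result = []
    for first in range(min(total_bits, max_part), 0, -1):
        # The remaining components may not exceed the first one.
        for rest in grouping_structures(total_bits - first, first):
            result.append((first,) + rest)
    return result

if __name__ == "__main__":
    structures = grouping_structures(6)
    print(len(structures))  # → 11
    for s in structures:
        # len(s) source samples are carried by one 64-QAM symbol.
        print(s, "->", len(s), "samples per channel symbol")
```

The number of components in a structure is the number of source samples carried by one channel symbol, which is where the symbol rate reduction comes from.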
4. OPTIMIZATION OF THE MAPPINGS
In this section we discuss the problem of finding good mappings between quantized subband samples and the 64-QAM signal constellation, so as to optimize the overall system performance. Ideally, one should select the mapping that minimizes the mean square error (mse) between the original image, $X$, and the reconstructed image, $\hat{X}$. However, to simplify the problem, we minimize the mse between the quantized subband samples, $\hat{Y}_r$, $r = 1, 2, \ldots, 6$, at the encoder, and the corresponding reconstructed subband samples at the receiver side. Obviously, the desired mappings should i) minimize the effects of transmission errors in the reconstructed signal and ii) maximize the minimum distance of the signal constellation for a fixed transmitted power. The first criterion means that symbols that are close in the source signal space should on average be mapped to neighboring symbols in the modulation signal constellation. This criterion is also necessary when considering discrete channels without a power constraint [1]. The second criterion means mapping the most probable symbols to the smallest amplitudes in the channel space, and vice versa. For a given transmitted
power, this will reduce the channel symbol error rate significantly. These may be opposing requirements. Mapping two very probable source symbols close to the center of the modulation signal constellation may pay off, even if the two symbols represent quite different source values, as it allows a larger minimum distance in the modulation signal constellation, thus reducing the overall symbol error rate. Notice that in cases with equiprobable channel symbols the second criterion is redundant. The method described in this paper seeks the best overall trade-off between these two criteria.
[Figure 2: 8 x 8 grid of constellation points in the complex plane, with axes Re and Im and the spacing between neighboring points marked.]
Figure 2. Uniform 64-QAM signal constellation.
Referring to the notation in Figure 1 and Figure 2, the error criterion to be minimized for each mapping, $m^{(i)}(\cdot)$, is given by:
$$E^{(i)} = \sum_{k=1}^{64} P^{(i)}(\hat{y}_k^{(i)}) \sum_{l=1}^{64} P_{\Delta(n)}\left[ s(m^{(i)}(l)) \mid s(m^{(i)}(k)) \right] d(\hat{y}_k^{(i)}, \hat{y}_l^{(i)}), \quad i = 1, 2, \ldots, 11, \tag{1}$$
where $P^{(i)}(\hat{y}_k^{(i)})$, $i = 1, 2, \ldots, 11$, $k = 1, 2, \ldots, 64$, is the probability that the $k$th vector of grouping no. $i$, $\hat{y}_k^{(i)}$, was transmitted, and $m^{(i)}(k)$ defines the mapping between the vector $\hat{y}_k^{(i)}$ and the QAM symbol $s(m^{(i)}(k))$. Furthermore, $P_{\Delta(n)}[\cdot \mid \cdot]$ are the channel transition probabilities, and the subscript $\Delta(n)$ indicates that the transition probabilities are a function of the minimum distance in the modulation signal constellation at iteration no. $n$. Finally, $d(\hat{y}_k^{(i)}, \hat{y}_l^{(i)})$ is the squared Euclidean distance between the quantized subband sample vectors $\hat{y}_k^{(i)}$ and $\hat{y}_l^{(i)}$, where $d(\hat{y}_k^{(i)}, \hat{y}_k^{(i)}) = 0$.
By using the simulated annealing algorithm, the desired mapping, $m^{(i)}(\cdot)$, is found by iteratively perturbing the mapping and evaluating the resulting error criterion at each iteration. For a fixed minimum distance, $\Delta(n)$, and unequal symbol probabilities, the average transmitted power, $S^{(i)} = S$, will depend on the particular mapping, $m^{(i)}(\cdot)$, between the source and channel symbols according to
$$S^{(i)} = \sum_{k=1}^{64} P^{(i)}(\hat{y}_k^{(i)}) \left| s(m^{(i)}(k)) \right|^2, \quad i = 1, 2, \ldots, 11. \tag{2}$$
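As a small numerical illustration of Equation 2, the sketch below evaluates the average transmitted power of a mapping on a uniform 8 x 8 grid and computes the grid spacing (minimum distance) that yields a prescribed power. The point ordering, the function names, and the geometric example probabilities are assumptions made for the illustration only.

```python
def qam64_points(delta=1.0):
    """Uniform 8 x 8 QAM grid with spacing (minimum distance) delta."""
    levels = [(2 * a - 7) * delta / 2 for a in range(8)]
    return [complex(re, im) for im in levels for re in levels]

def average_power(probabilities, mapping, delta=1.0):
    """Equation 2: sum over k of P(y_k) * |s(m(k))|^2 for one grouping."""
    points = qam64_points(delta)
    return sum(p * abs(points[mapping[k]]) ** 2
               for k, p in enumerate(probabilities))

def spacing_for_power(probabilities, mapping, target_power):
    """Spacing that gives target_power; power scales with delta squared."""
    return (target_power / average_power(probabilities, mapping, 1.0)) ** 0.5

def energy_sorted_mapping(probabilities):
    """Assign the most probable source vectors to the lowest-energy points."""
    points = qam64_points()
    by_energy = sorted(range(64), key=lambda j: abs(points[j]) ** 2)
    by_prob = sorted(range(64), key=lambda k: -probabilities[k])
    mapping = [0] * 64
    for rank, k in enumerate(by_prob):
        mapping[k] = by_energy[rank]
    return mapping

if __name__ == "__main__":
    weights = [0.9 ** k for k in range(64)]          # hypothetical probabilities
    probs = [w / sum(weights) for w in weights]
    identity = list(range(64))
    compact = energy_sorted_mapping(probs)
    for name, m in (("identity", identity), ("energy-sorted", compact)):
        print(name, average_power(probs, m), spacing_for_power(probs, m, 1.0))
```

With unequal probabilities, the energy-sorted mapping needs less power at a given spacing, so it can use a larger spacing at a given power. This is the second criterion of Section 4 taken in isolation, ignoring the first.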
Thus, in order to maintain a fixed average transmitted power, $S$, the minimum distance, $\Delta(n)$, is scaled correspondingly at each stage of the iterative procedure. As a consequence, the transition probabilities need to be updated at each iteration as well. Ideally, one would like to compute the true transition probabilities at each iteration in the simulated annealing procedure. However, in order to reduce the computational complexity, we propose an approximate but computationally efficient solution to the problem. The approximate solution is obtained by scaling all the transition probabilities with a common scale factor such that the approximate transition probabilities can be expressed as
$$P_{\Delta(n)}\left[ s(m^{(i)}(l)) \mid s(m^{(i)}(k)) \right] \approx \alpha(\Delta(n)) \, P_0\left[ s(m^{(i)}(l)) \mid s(m^{(i)}(k)) \right] d(y_k^{(i)}, y_l^{(i)}), \quad k, l = 1, 2, \ldots, 64, \tag{3}$$
where $\alpha(\Delta(n))$ is a scaling factor depending on the minimum distance, $\Delta(n)$, and $P_0[\cdot \mid \cdot]$ is a fixed set of channel transition probabilities obtained by assuming equiprobable symbols and a certain design channel SNR. The scaling factor, $\alpha(\Delta(n))$, is computed for each iteration and is defined so as to compensate for variations in the channel symbol probabilities due to changes in the mapping, $m^{(i)}(\cdot)$. The simplified error criterion, $E^{(i)}_{\mathrm{approx}}$, is obtained by substituting the approximation in Equation 3 into Equation 1.
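The optimization loop of this section can be sketched as below. This is a toy illustration under stated assumptions, not the authors' implementation: it evaluates Equation 1 with exact transition probabilities recomputed at every step (instead of the common-scale-factor approximation of Equation 3), it uses a hypothetical source of two 3-bit samples with uniform reconstruction levels and Laplacian-shaped probabilities, and the cooling schedule, the swap perturbation, and all names are illustrative.

```python
import math
import random

AMPL = [2 * a - 7 for a in range(8)]  # 8-PAM amplitudes in units of delta / 2

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pam_transitions(delta, sigma):
    """table[l][k] = P(decide level l | level k sent), nearest-level decisions."""
    table = [[0.0] * 8 for _ in range(8)]
    for k in range(8):
        for l in range(8):
            lower = 1.0 if l == 0 else q_func((l - k - 0.5) * delta / sigma)
            upper = 0.0 if l == 7 else q_func((l - k + 0.5) * delta / sigma)
            table[l][k] = lower - upper
    return table

def criterion(mapping, probs, dist, target_power, noise_var):
    """Equation 1, with the spacing rescaled so Equation 2 equals target_power."""
    unit_power = sum(p * (AMPL[m % 8] ** 2 + AMPL[m // 8] ** 2) / 4.0
                     for p, m in zip(probs, mapping))
    delta = math.sqrt(target_power / unit_power)
    t = pam_transitions(delta, math.sqrt(noise_var / 2.0))
    total = 0.0
    for k in range(64):
        re_k, im_k = mapping[k] % 8, mapping[k] // 8
        inner = 0.0
        for l in range(64):
            ml = mapping[l]
            # A square grid decides the two dimensions independently.
            inner += t[ml % 8][re_k] * t[ml // 8][im_k] * dist[k][l]
        total += probs[k] * inner
    return total

def toy_source():
    """Hypothetical grouping of two 3-bit samples (64 vectors)."""
    levels = [0.5 * a for a in AMPL]
    w = [math.exp(-math.sqrt(2.0) * abs(x)) for x in levels]
    p1 = [x / sum(w) for x in w]
    vectors = [(a, b) for a in levels for b in levels]
    probs = [pa * pb for pa in p1 for pb in p1]
    dist = [[(u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2 for v in vectors]
            for u in vectors]
    return probs, dist

def anneal(probs, dist, target_power=1.0, snr_db=18.0, iters=300,
           cooling=0.98, seed=0):
    """Swap-based simulated annealing; returns (best, best_cost, initial_cost)."""
    rng = random.Random(seed)
    noise_var = target_power / 10 ** (snr_db / 10)
    mapping = list(range(64))
    rng.shuffle(mapping)                      # random starting mapping
    current = initial = criterion(mapping, probs, dist, target_power, noise_var)
    best, best_cost = mapping[:], current
    temp = 0.1 * initial                      # hypothetical starting temperature
    for _ in range(iters):
        a, b = rng.sample(range(64), 2)
        mapping[a], mapping[b] = mapping[b], mapping[a]
        cand = criterion(mapping, probs, dist, target_power, noise_var)
        if cand <= current or rng.random() < math.exp((current - cand) / temp):
            current = cand
            if current < best_cost:
                best, best_cost = mapping[:], current
        else:
            mapping[a], mapping[b] = mapping[b], mapping[a]  # undo the swap
        temp *= cooling
    return best, best_cost, initial

if __name__ == "__main__":
    probs, dist = toy_source()
    best, best_cost, initial = anneal(probs, dist)
    print("random mapping:", initial, "after annealing:", best_cost)
```

Recomputing the exact probabilities is affordable here because a square grid factors into two 8-PAM decisions; the approximation in Equation 3 addresses the same cost in a different way.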
5. RESULTS
The simulated annealing algorithm is used to optimize the mappings for the 11 bit combinations, assuming Laplacian distributed subband samples [9]. Assuming a fixed noise variance, the mappings are optimized for a fixed channel SNR. The method is applied to the $512 \times 512$ image "Lenna" (see footnote 3) and evaluated by comparing the resulting peak signal-to-noise ratio (PSNR) of the reconstructed image for various values of channel SNR. As a reference, the PSNR values for a random mapping between the source symbols and the signal constellation are included. The results are illustrated in Fig. 3. More graceful degradation is obtained for the optimized system than for the random mapping system. The image quality for a part of the picture "Lenna", coded at a total bit rate, including side information, of 0.65 bit per pixel and transmitted at a channel SNR of 18 dB, is shown in Fig. 4 for the random mapping and the optimized mapping. It is important to notice that the quality improvement in "Lenna" for the optimized mapping case is not because channel symbol errors do not affect the reconstructed image quality, but because the channel symbol errors mainly lead to small errors in the reconstructed signal due to the
optimized mappings. Thus, in contrast to conventional systems, we allow channel errors to affect the reconstructed signal quality. The channel symbol errors should, however, have as small an effect as possible.
Footnote 3: We have used the green (G) component of the RGB version of "Lenna".
[Figure 3: plot of PSNR [dB] (axis range 20 to 36) versus CSNR [dB] (axis range 10 to 26).]
Figure 3. Performance results; random mapping and mapping optimized for a fixed channel SNR (solid line). Total bit rate: 0.65 bit per pixel. Image: Lenna.
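For reference, the PSNR figure of merit used in this section can be computed as in the following minimal sketch. It assumes 8-bit pixel values (peak value 255) and images given as flat sequences of samples; both are assumptions, as is the function name.

```python
import math

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel values; peak=255 assumes 8-bit data.
    """
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

if __name__ == "__main__":
    reference = [10, 20, 30, 40]
    decoded = [11, 19, 31, 39]       # every pixel off by one, so mse = 1
    print(round(psnr(reference, decoded), 2))  # → 48.13
```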
6. CONCLUSION
A robust, bandwidth-efficient system for combined subband coding of still images and QAM has been presented. The mapping between the quantized subband coded samples and a multilevel modulation signal set is optimized by simulated annealing for a power-limited AWGN channel. It is shown that, for a given image fidelity (PSNR), an improvement of the channel signal-to-noise ratio on the order of 2 to 4 dB is achieved for noisy channels, compared to an average mapping system. It is emphasized that this is gained without reducing the performance under error-free conditions.
The authors believe that the principles presented for combined subband coding and multilevel systems may be applied to future image communication where the available channel bandwidth and power are limited. Further improvement in overall performance is expected through increased integration of the source coder and the modulator.
REFERENCES
[1] N. Farvardin, "A study of vector quantization for noisy channels," IEEE Trans. Inform. Theory, vol. 36, pp. 799-809, July 1990.
[2] P. Knagenhjelm, "A recursive design method for robust vector quantization," in Proc. Int. Conf. on Signal Processing Applications and Technology, (Boston), pp. 948-954, Nov. 1992.
[3] R. Stedman et al., "Transmission of subband-coded images via mobile channels," IEEE Trans. Circuits, Syst. for Video Tech., vol. 3, pp. 15-26, Feb. 1993.
Figure 4. Top: A part of the image "Lenna" for the system employing the random mappings. Bottom: The same part of "Lenna" for the system employing the optimized mappings. Channel SNR: 18 dB. Total bit rate: 0.65 bit per pixel.
[4] Joint Photographic Experts Group ISO/IEC, JTC/ SC/WG8, CCITT SGVIII, "JPEG technical specifications, revision 5," Report JPEG-8-R5, Jan. 1990.
[5] ISO/IEC JTC1/SC2/WG11, MPEG-1, Committee Draft 11172 Video, Nov. 1991. Coding of Moving Pictures and Associated Audio.
[6] ISO-IEC/JICI/SC29/WG11, MPEG-2, Committee Draft, May 1992.
[7] John M. Lervik and Tor A. Ramstad, "An analog interpretation of compression for digital communication systems," in Proc. IEEE Conference on Acoustics, Speech & Signal Processing (ICASSP-94), vol. 5, (Adelaide, South Australia), pp. V-281 to V-284, IEEE, Apr. 1994.
[8] P. H. Westerink, Subband Coding of Images. PhD thesis, Technische Universiteit Delft, Oct. 1989.
[9] J. H. Husøy, Subband Coding of Still Images and Video. PhD thesis, The Norwegian Institute of Technology, Mar. 1991.
[10] S. O. Aase and T. A. Ramstad, "On the optimality of nonunitary filter banks in subband coders," IEEE Trans. Image Processing. To be published.
[11] S. O. Aase, Image Subband Coding Artifacts: Analysis and Remedies. PhD thesis, The Norwegian Institute of Technology, Mar. 1993.
[12] T. A. Ramstad, "Considerations on quantization and dynamic bit-allocation in subband coders," in Proc. ICASSP, pp. 841-844, 1986.
[13] E. Stare, "Development of a prototype system for digital terrestrial HDTV," Tele English Edition, no. 1, 1992.