JOINT UTILIZATION OF FIXED AND VARIABLE-LENGTH CODES FOR IMPROVING SYNCHRONIZATION IMMUNITY FOR IMAGE TRANSMISSION
A. Aydın Alatan and John W. Woods
Center for Image Processing Research
Rensselaer Polytechnic Institute
Troy, NY 12180-3590
e-mail: falatan@cipr,
[email protected]
ABSTRACT
Robust transmission of images is achieved by using fixed and variable-length coding together without much loss in compression efficiency. The probability distribution function of a DCT coefficient can be divided into two regions using a threshold, so that one portion contains roughly equiprobable transform coefficients. While fixed-length coding, which is a powerful solution to the synchronization problem, is used in this inner equiprobable region without sacrificing compression, the outer (saturating) region is reserved for variable-length codes. The proposed image coder first encodes the bit allocated DCT coefficients using a fixed-rate quantizer bank, then the saturated values for these coefficients are encoded using an entropy constrained scalar quantizer, followed by an arithmetic encoder. Our simulations show that the proposed encoder is appropriate for applications in which an acceptable quality must always be maintained in any channel condition.
1. INTRODUCTION
Robust transmission of information is strictly necessary over wireless channels because of the damage caused by the inevitable noise in such media. Most current standards for image/video transmission do not take channel errors into account during the source coding step, on the assumption that these problems can be eliminated simply by adding error control codes over the compressed bitstream. Although Shannon showed that this approach is optimal asymptotically, the practical requirements on complexity, delay and time-varying channels move us to re-examine the source coding stage more carefully.

Experience shows that better compression generally yields worse noise immunity, i.e. more fragile bitstreams. This is mainly due to the use of variable-length codes, for which even a single bit error can be catastrophic, although these codes are quite successful in approaching the entropy limit of a source. Fixed-length coded data, on the other hand, is immune to error propagation, but its compression performance is usually insufficient.

Our aim is to eliminate the synchronization problem caused by variable-length coded data without losing compression efficiency. It would then be possible to receive subjectively acceptable images even under severe and time-varying channel conditions. We propose a novel method which utilizes fixed and variable-length codes together. The next section is a brief summary of previous work on synchronization in image/video transmission. Then the fundamentals of our approach are presented and a novel robust image codec is proposed, followed by some simulations on image data.

This work was partially supported by the U.S. Army Research Office under grant # DAAH04-96-1-0380.
2. BACKGROUND ON SYNCHRONIZATION PROBLEM
The ultimate solution to the synchronization problem is to use fixed-length coded data. However, such an approach yields either low compression for scalar quantized data or high complexity for vector quantized data [1]. On the other hand, variable-length codes can be made less susceptible to errors by a number of different methods. In many standards, synchronization words are added to stop error propagation, at the cost of some overhead on the bit-rate. Moreover, errors on these sync words might still be catastrophic [2]. Some sophisticated methods, like EREC [3], decrease the effect of erroneously decoded sync words by elegantly reorganizing blocks into slots, which leads to automatic detection of the beginning of each block. However, with an increasing number of slots, decoding complexity increases as well. Forward error correction codes can also be used in low error rate channels, but their performance degrades quickly when the bit error rate increases. Conversion from variable- to fixed-length codes is another well known solution, but with this method spurious source symbols can be inserted, which is unacceptable for images, whose data must stay in strict locations [4].

There are also some hybrid methods which use both variable and fixed-length codes. In one of these approaches, fixed- or variable-length codes are chosen according to their compression performance on each block [5]. However, the synchronization problem still exists within the blocks that are encoded with variable-length codes. In another approach, a number of fixed-length quantizers are designed and used appropriately in different parts of the image; the information regarding this quantizer assignment is variable-length coded and sent with some protection [6]. Nevertheless, the saturating values for these fixed-length quantizers are expected to create some distortion. In the next section, a novel method is proposed which eliminates some drawbacks of these current approaches.
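The contrast between the two code families can be illustrated with a toy example. The Python sketch below uses a hypothetical four-symbol prefix code and a 2-bit fixed-length code (both tables and the message are illustrative assumptions, not taken from any of the cited works) and flips the same bit position in each encoded stream.

```python
# Hypothetical code tables for a 4-symbol source (illustrative only).
VLC = {"a": "0", "b": "10", "c": "110", "d": "111"}   # variable-length prefix code
FLC = {"a": "00", "b": "01", "c": "10", "d": "11"}    # fixed-length code

def encode(symbols, table):
    """Concatenate the codewords of all symbols into one bit string."""
    return "".join(table[s] for s in symbols)

def decode_prefix(bits, table):
    """Greedy prefix decoding; an incomplete trailing word is dropped."""
    inverse = {code: sym for sym, code in table.items()}
    out, word = [], ""
    for bit in bits:
        word += bit
        if word in inverse:
            out.append(inverse[word])
            word = ""
    return out

def flip(bits, index):
    """Return the bit string with the bit at position `index` inverted."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]

message = list("dabac")
vlc_rx = decode_prefix(flip(encode(message, VLC), 3), VLC)
flc_rx = decode_prefix(flip(encode(message, FLC), 3), FLC)
print("sent:          ", message)
print("variable-length:", vlc_rx)   # symbol count changes, later symbols shift
print("fixed-length:   ", flc_rx)   # exactly one symbol is wrong, positions kept
```

In this toy case the fixed-length stream suffers one wrong symbol in place, whereas the variable-length stream decodes to a different number of symbols, so every later symbol lands in the wrong position. For image data, where each symbol belongs to a fixed location, this shift is the damaging part.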
3. JOINT FIXED AND VARIABLE-LENGTH CODING
All the current image/video compression algorithms use transform coding in their decorrelation step. A typical probability distribution function (pdf) of a transform coefficient is seen in Figure 1. This symmetric pdf can be divided into two regions, saturating and non-saturating, by means of a threshold. The main idea behind this division is to obtain a rather flat region around the origin, so that the source symbols are almost equiprobable inside this non-saturating portion. Since these symbols are (almost) equiprobable, the use of fixed-length codes is not expected to affect the compression efficiency significantly. Although the non-saturating region usually is not flat, careful design of non-uniform quantizers reduces the resulting loss in rate-distortion performance. The saturating regions in Figure 1 are variable-length coded.

A simple encoder which takes the above ideas into account is shown in Figure 2 (a). Each source symbol is quantized, encoded and transmitted with a fixed-length code. When a symbol is found to be saturated after fixed-rate quantization, its fixed-rate quantized version is subtracted from the original symbol and the switch between the fixed and variable-rate quantizers is closed to pass this difference to the next stage. As the next step, the saturation offset is quantized and encoded using a variable-length code. The receiver checks whether a symbol is saturated or not using the received fixed-length coded data and, if so, it decodes the saturation offset and adds it on top of the fixed-length part.

Every fixed-length coded source symbol is expected to have strong immunity to error propagation as well as good compression efficiency, whereas the variable-length coded part handles the saturation effects. However, the proposed approach has a small overhead compared to conventional schemes, because every saturated coefficient is encoded by two symbols. By taking these ideas into account, a novel image codec is proposed in the next section.
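The basic scheme of Figure 2 (a) can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions that are not part of the paper: both quantizers are uniform (the paper uses Lloyd-Max and entropy constrained designs), the two outermost fixed-rate cells are taken as the saturation flag, and the offset indices are left as integers rather than being entropy coded. All function and parameter names are hypothetical.

```python
def fixed_quantize(x, threshold, bits):
    """Index of a uniform fixed-rate quantizer on [-threshold, threshold]."""
    levels = 2 ** bits
    step = 2.0 * threshold / levels
    index = int((x + threshold) // step)
    return min(max(index, 0), levels - 1)     # values outside the range saturate

def dequantize(index, threshold, bits):
    """Mid-point reconstruction value of a fixed-rate cell."""
    step = 2.0 * threshold / (2 ** bits)
    return -threshold + (index + 0.5) * step

def is_saturated(index, bits):
    """Assumption: the two outermost cells mark a saturated symbol."""
    return index == 0 or index == 2 ** bits - 1

def encode(values, threshold, bits, offset_step):
    fixed_stream, variable_stream = [], []
    for x in values:
        index = fixed_quantize(x, threshold, bits)
        fixed_stream.append(index)            # always sent, fixed length
        if is_saturated(index, bits):
            offset = x - dequantize(index, threshold, bits)
            variable_stream.append(round(offset / offset_step))
    return fixed_stream, variable_stream

def decode(fixed_stream, variable_stream, threshold, bits, offset_step,
           use_variable=True):
    offsets = iter(variable_stream)
    out = []
    for index in fixed_stream:
        value = dequantize(index, threshold, bits)
        if use_variable and is_saturated(index, bits):
            value += next(offsets) * offset_step   # add the saturation offset
        out.append(value)
    return out

values = [0.3, -1.2, 9.0, -3.5, 2.5]
fixed, variable = encode(values, threshold=4.0, bits=2, offset_step=0.5)
full = decode(fixed, variable, 4.0, 2, 0.5)
fixed_only = decode(fixed, variable, 4.0, 2, 0.5, use_variable=False)
print(fixed, variable, full, fixed_only)
```

Decoding with `use_variable=False` corresponds to reconstructing from the fixed-length data alone: every symbol still lands in its correct position, and only the saturated ones lose accuracy.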
4. PROPOSED IMAGE CODER
The proposed image encoder is shown in Figure 2 (b). In our initial system, zonal coding is chosen due to its simplicity. In zonal coding, only a predetermined region is encoded in each DCT block; out-of-region coefficients are considered to be zero, i.e. they are never transmitted. Although such an approach has lower rate-distortion performance than state-of-the-art methods, it gives an overall idea of the merits of this hybrid (fixed+variable) approach. The main components of this system are a block-based DCT, a bit allocation unit, a fixed-rate quantizer bank designed using the Lloyd-Max algorithm, a decision unit which represents the switch in Figure 2 (a), an entropy constrained scalar quantizer (ECSQ) for variable-length encoding, an arithmetic encoder to compress the indices of the scalar quantizer, and a MUX/DEMUX pair to distribute/merge symbols to/from the quantizer bank according to their bit allocations.

In Figure 2 (b), there are two important design parameters for this system: the S parameter, which is a multiplicative factor defining the widths of the non-saturating regions for each DCT coefficient (after normalization by their standard deviations), and R_fixed, which gives the number of bits to be used for the fixed-length coded part, while the total bit-rate R_total is assumed to be given beforehand. These parameters determine the performance of the algorithm against errors, and they can be selected using the operational rate-distortion characteristics of the overall system.

In the proposed system, the input image is transformed in 8x8 blocks by the DCT, and the resulting DCT coefficients are quantized using the corresponding fixed-rate quantizer bank according to the number of bits allocated to each coefficient. Once the bit allocation map is set using the training data and the bit-rate, the same map is used for every block of the image. For every coefficient with a non-zero bit allocation, a fixed-length codeword is transmitted. If the coefficient is found to be saturated, the saturation offset is quantized using the ECSQ. The ECSQ indices are encoded by an arithmetic coder and the resulting variable-length bit-stream is transmitted to the channel.

An important property of the proposed system is that it avoids predictive coding (such as that used for the DC coefficient in JPEG) in order to prevent any kind of error propagation. Moreover, the indices of the fixed-rate quantizer are encoded using a scalar pseudo-Gray code [7] for better noise performance. The variable-length coded stream is also made more robust against errors by combining a number of blocks into slices and adding end-of-slice markers and control words to the final bit-stream.
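As an illustration of how one member of the fixed-rate quantizer bank might be designed, the following Python sketch runs the Lloyd-Max iteration on training samples clipped to the non-saturating region. The Gaussian training data, the value of S, the number of levels and all function names are assumptions made for this example only; the paper trains on DCT coefficients of real images.

```python
import random

def uniform_points(lo, hi, levels):
    """Uniformly spaced starting reconstruction points on [lo, hi]."""
    return [lo + (k + 0.5) * (hi - lo) / levels for k in range(levels)]

def mse(samples, points):
    """Mean squared error when each sample maps to its nearest point."""
    return sum(min((x - p) ** 2 for p in points) for x in samples) / len(samples)

def lloyd_max(samples, points, iterations=30):
    """Alternate nearest-neighbour partitioning and centroid updates."""
    points = list(points)
    levels = len(points)
    for _ in range(iterations):
        cells = [[] for _ in range(levels)]
        for x in samples:
            nearest = min(range(levels), key=lambda k: (x - points[k]) ** 2)
            cells[nearest].append(x)
        # Centroid condition: move each point to the mean of its cell;
        # an empty cell keeps its previous point.
        points = [sum(c) / len(c) if c else p for c, p in zip(cells, points)]
    return sorted(points)

rng = random.Random(0)
S = 3.0   # hypothetical width factor of the non-saturating region
# Unit-variance samples clipped to [-S, S], standing in for a normalized coefficient.
samples = [max(-S, min(S, rng.gauss(0.0, 1.0))) for _ in range(1000)]
start = uniform_points(min(samples), max(samples), 4)
designed = lloyd_max(samples, start)
print("uniform  MSE:", mse(samples, start))
print("designed MSE:", mse(samples, designed))
```

Each iteration can only keep or lower the training distortion, so the designed 2-bit quantizer is never worse on the training set than the uniform starting point.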
During transmission, if it is not possible to send the fixed and variable-length streams in different channels, one possible solution is to multiplex these streams together, so that all the fixed-length coded stream is sent first, followed by the variable-length coded data.
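A small Python sketch of this single-channel arrangement follows. It assumes, as an inference from the fixed bit allocation map rather than a statement in the paper, that the receiver can compute the length of the fixed-length part from the map and the number of blocks, so no separator is needed. The allocation values and function names are hypothetical.

```python
def fixed_part_length(bit_allocation, num_blocks):
    """Bits in the fixed-length part when one allocation map serves every block."""
    return num_blocks * sum(bit_allocation)

def multiplex(fixed_bits, variable_bits):
    """Send the whole fixed-length stream first, then the variable-length data."""
    return fixed_bits + variable_bits

def demultiplex(stream, bit_allocation, num_blocks):
    """Split the received stream at the known end of the fixed-length part."""
    split = fixed_part_length(bit_allocation, num_blocks)
    return stream[:split], stream[split:]

allocation = [3, 2, 1, 0]          # hypothetical bits per coefficient in a block
fixed_bits = "101010101010"        # 2 blocks x 6 bits
variable_bits = "0110"
stream = multiplex(fixed_bits, variable_bits)
rx_fixed, rx_variable = demultiplex(stream, allocation, num_blocks=2)
print(rx_fixed, rx_variable)
```

Because the split position does not depend on the received bits, a bit error in either part cannot move the boundary between the two streams.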
5. SIMULATIONS
In all the simulations, a total bit-rate of R_total = 0.5 b/p is targeted for the Lena image (512x512), while the Goldhill, Baboon and Lena images are used as the training data for the Lloyd-Max and ECSQ designs. After bit allocation for R_total = 0.5 b/p, the PSNR of the Lena image reconstructed from the unquantized coefficients with non-zero bit allocation is found to be 32.77 dB, which is the PSNR upper limit for the system. The effects of the S and R_fixed parameters are compared for both noise-free channels (Table 1) and binary symmetric channels (BSC) with different bit-error rates (BER) (Tables 2, 3 and 4). Table 5 shows the performance of variable-only coding with the same scheme (i.e. S = 0.0 and R_fixed = 0.0) for different slice lengths and BER values.

The results show that variable-only coding is best in noise-free environments. However, as soon as the BER rises to 10^-4, the best results are obtained by the hybrid system with parameters S = 5 and R_fixed = 0.1. For further increases in BER, the best systems are obtained by increasing the share of the fixed-length part within the total bit-rate while keeping S = 5. In noisy channels, the simulations show that the PSNR performance of the proposed hybrid scheme surpasses that of variable-only coding even for the best slice length. Some typical reconstructions for fixed-only and fixed+variable-length coded bit-streams are shown in Figure 3.
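For readers who wish to reproduce this kind of experiment, the two measurement tools involved can be sketched in Python as follows. The PSNR definition with a peak value of 255 (8-bit images) is the conventional one and is an assumption here, since the paper does not state its formula; the function names and the seed are hypothetical.

```python
import math
import random

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def bsc(bits, ber, rng):
    """Binary symmetric channel: flip each bit independently with probability ber."""
    return [b ^ 1 if rng.random() < ber else b for b in bits]

rng = random.Random(1)
sent = [0] * 20000
received = bsc(sent, 1e-2, rng)
print("bit errors:", sum(received))
print("PSNR example:", psnr([100, 100], [110, 90]), "dB")
```

In a full simulation the coded bit-stream would be passed through `bsc` before decoding, and `psnr` would compare the decoded image with the original.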
6. DISCUSSION
The proposed approach targets applications in which image data has to be observable at the receiver side at every time instant, i.e. no severe picture losses due to synchronization problems can be tolerated. The challenging part of the problem is to obtain such results using a low-complexity, adjustable and efficient algorithm. The proposed scheme has quite low computational demand once the training stage is completed. Using its two parameters, the algorithm can also be easily adjusted to different channel characteristics. Moreover, a channel mismatch, which is a typical drawback for error protected variable-length data, does not severely affect this method. For noisy channels, the proposed hybrid method performs better than variable-only encoding of the same coefficients.

The obtained image quality can be improved by using threshold coding rather than zonal coding. In threshold coding, the coefficients with high magnitudes are encoded, with their location information transmitted as overhead. Combining our hybrid encoding with such an approach, with better protected location information (e.g. run lengths, zerotrees), remains promising future work.
7. REFERENCES
[1] R. Laroia and N. Farvardin, "A Structured Fixed-Rate Vector Quantizer derived from a Variable-Length Scalar Quantizer: Part I - Memoryless Sources," IEEE Trans. on Information Theory, vol. 39, pp. 851-867, May 1993.
[2] M. Y. Cheung and J. Vaisey, "A Comparison of Scalar Quantization Strategies for Noisy Channel Data Transmission," IEEE Trans. on Communications, vol. 43, pp. 738-742, Feb/Mar/Apr 1995.
[3] D. W. Redmill and N. G. Kingsbury, "The EREC: An Error-Resilient Technique for Coding Variable-Length Blocks of Data," IEEE Trans. on Image Processing, vol. 5, pp. 565-574, April 1996.
[4] T. J. Ferguson and J. H. Rabinowitz, "Self-Synchronizing Huffman Codes," IEEE Trans. on Information Theory, vol. 30, pp. 687-693, July 1984.
[5] P. Nasiopoulos and R. K. Ward, "A Hybrid Coding Method for Digital HDTV," IEEE Trans. on Consumer Electronics, vol. 41, pp. 1080-1087, 1995.
[6] S. L. Regunathan, K. Rose and S. Gadkari, "Multimode Image Coding for Noisy Channels," in Proc. of Data Compression Conference, Snow Bird, UT, March 1997.
[7] K. Zeger and A. Gersho, "Pseudo-Gray Coding," IEEE Trans. on Communications, vol. 38, pp. 2147-2158, December 1990.
Table 1: PSNR values (dB) for error-free reconstruction of the Lena image for R_total = 0.5 b/p and 8 blocks/slice (values in parentheses are obtained by reconstructing only the fixed-length coded data).

R_fixed [b/p]   S = 1         S = 3         S = 5         S = 7         S = 9
0.10            29.3 (16.6)   30.1 (20.6)   30.5 (24.3)   29.9 (24.2)   28.0 (23.6)
0.25            27.4 (14.8)   27.8 (21.3)   29.5 (26.3)   28.7 (25.9)   27.1 (25.0)
0.40            28.2 (14.9)   29.0 (21.9)   29.6 (29.0)   29.7 (28.8)   29.2 (28.5)
0.50            -    (14.9)   -    (22.1)   -    (29.9)   -    (30.1)   -    (29.9)
Table 2: PSNR values (dB) of the reconstructed Lena image for BER = 10^-4 of a BSC, with R_total = 0.5 b/p and 8 blocks/slice (values in parentheses are obtained by reconstructing only the fixed-length coded data).

R_fixed [b/p]   S = 1         S = 3         S = 5         S = 7         S = 9
0.10            28.2 (14.6)   29.6 (20.6)   30.1 (24.2)   29.5 (24.1)   27.8 (23.5)
0.25            26.7 (14.7)   27.6 (21.3)   29.4 (26.2)   28.6 (25.9)   26.9 (24.9)
0.40            27.4 (14.9)   28.8 (21.9)   29.5 (29.0)   29.5 (28.7)   29.1 (28.4)
0.50            -    (14.9)   -    (22.0)   -    (29.8)   -    (30.0)   -    (29.7)
Table 3: PSNR values (dB) of the reconstructed Lena image for BER = 10^-3 of a BSC, with R_total = 0.5 b/p and 8 blocks/slice (values in parentheses are obtained by reconstructing only the fixed-length coded data).

R_fixed [b/p]   S = 1         S = 3         S = 5         S = 7         S = 9
0.10            16.3 (14.6)   23.4 (20.5)   25.1 (24.0)   25.3 (23.9)   24.5 (23.2)
0.25            15.9 (14.7)   24.0 (21.2)   28.4 (25.9)   26.9 (25.5)   25.3 (24.5)
0.40            16.2 (14.8)   24.8 (21.7)   28.4 (28.3)   27.9 (27.9)   27.2 (27.2)
0.50            -    (14.8)   -    (21.8)   -    (28.9)   -    (28.9)   -    (28.2)
Table 4: PSNR values (dB) of the reconstructed Lena image for BER = 10^-2 of a BSC, with R_total = 0.5 b/p and 8 blocks/slice (values in parentheses are obtained by reconstructing only the fixed-length coded data).

R_fixed [b/p]   S = 1         S = 3         S = 5         S = 7         S = 9
0.10            13.0 (14.0)   19.1 (19.7)   22.0 (22.7)   21.5 (22.1)   20.2 (20.3)
0.25            13.1 (14.2)   19.9 (20.2)   23.5 (23.7)   22.8 (22.8)   20.2 (20.4)
0.40            13.1 (14.1)   19.8 (20.1)   24.7 (24.7)   23.7 (23.7)   22.0 (22.0)
0.50            -    (13.9)   -    (20.2)   -    (24.6)   -    (24.0)   -    (22.3)
Table 5: PSNR values (dB) of the reconstructed Lena image for different BER values of a BSC and different slice lengths, with R_total = 0.5 b/p and R_fixed = 0.0 b/p (variable-only).

                Number of blocks in each slice
BER             4       8       16      32      64
No error        29.8    31.6    32.0    32.1    32.2
10^-4           23.6    25.9    26.4    24.7    23.2
10^-3           13.2    14.7    14.9    14.7    14.9
10^-2            9.9    10.3    10.9    11.0    11.0
Figure 1: Typical pdf of a transform coefficient (axes: x and pdf(x); labels: S, SAT. REG. on both sides).

Figure 2: Proposed (a) basic and (b) complete system. Blocks in (a): fixed-rate quantizer, saturation switch, variable-rate quantizer, channel. Blocks in (b): DCT, bit allocation and Lloyd-Max quantizer design from training data, S-parameter, MUX, quantizer bank, DEMUX, saturation decision, normalization, ECSQ design from training data at rate (R_total - R_fixed), scalar quantizer, arithmetic coding, fixed-length and variable-length bit-streams, channel.
Figure 3: A typical Lena result for (a) variable-only decoding; (c) fixed part of (b) fixed+variable decoding for S = 5, R_fixed = 0.25, BER = 10^-3 (R_total = 0.5 b/p).