On Joint Iterative Decoding of Variable-length Source Codes and Channel Codes

Ahmadreza Hedayat and Aria Nosratinia∗
Multimedia Communications Laboratory, The University of Texas at Dallas,
2601 N. Floyd Road, EC-33, Richardson, TX 75080
{hedayat,aria}@utdallas.edu
Abstract

The residual redundancy that intentionally or unintentionally remains in source-encoded streams can be exploited by joint source-channel coding. This principle has been successfully applied to variable-length encoded sequences via iterative detection, and it has been shown that the resulting source-channel decoding outperforms its separable counterpart. However, the computational complexity of the two systems is not comparable. We pose the question: with equivalent computational complexity, is it beneficial to invest a part of the available transmission rate in a redundant entropy code? To answer this question, we compare two systems with equivalent overall rate: joint source-channel coding with iterative decoding, versus a separable system with a concatenated channel code. We show that joint source-channel coding with reversible variable-length codes, as previously reported, is inferior to the separable system under equitable conditions. Joint source-channel coding is superior only with a careful design of redundant entropy codes according to the design rules of serial concatenated codes.
1 Introduction
Joint source-channel coding greatly improves the performance of communication systems under complexity and/or delay constraints. One variation of joint source-channel coding (JSCC) utilizes the source-induced dependency that, intentionally or unintentionally, remains in the compressed bit-stream. This residual redundancy, and the techniques that exploit it, have been investigated in a variety of guises [1, 2]. In particular, it is possible to perform iterative (turbo) decoding between the redundancy of the channel code and the residual redundancy of the source code [3, 4].

∗ This work was supported in part by NSF grant CCR-9985171 and THECB grant 009741-0120-1999.
This versatile technique can be used, for example, when the stream is entropy encoded with a variable-length code (VLC).¹ Often there is residual redundancy left in the VLC, for example in the case of the reversible variable-length codes (RVLC) [5] utilized in the video coding standard H.263+ and its descendants. Bauer and Hagenauer [3] proposed a novel iterative (turbo) decoding scheme between a channel code and the residual redundancy of an RVLC. They reported a significant coding gain compared to a system with equivalent transmission rate that employs a Huffman code for the same source and a slightly more powerful convolutional code.

The efficiency of the serial concatenation of a VLC and a channel code depends on the amount of redundancy that exists in both codes. In maximum likelihood (ML) or iterative decoding of the serial chain, the intentional redundancy of the VLC plays a role as important as the redundancy of the channel code. Huffman codes and other efficient entropy codes retain very little redundancy, so it is difficult to achieve improvements through iterative decoding. A question then arises as to when there is merit in leaving intentional redundancy in an entropy code in order to construct a good serial concatenated code.

In this paper we show that leaving redundancy in an entropy code, with the hope of iterative decoding between the entropy and channel decoders, should be done according to the design principles of serial concatenated codes [6]. As mentioned in [3], the redundancy in a VLC such as an RVLC amounts to a "weak binary channel code," and such a code may not be a good constituent code for the serial chain. Although iterative detection makes the best use of whatever redundancy remains, we observe that it can be preferable to use an efficient entropy code and to place the redundancy in the channel code instead. More specifically, we show experimentally that removing the intentional residual redundancy from the RVLC in [3], i.e., using a Huffman code, and forming a serial concatenated convolutional code outperforms the setup in [3] at the same overall transmission rate.

This paper is organized as follows. We describe our system in Section 2. In Section 3, the iterative decoding between a VLC and a channel code, and the necessary soft-input soft-output blocks, are explained. In Section 4, we elaborate on the question posed above and present our experimental results in Section 4.1. We conclude in Section 5.

¹ Variable-length codes are particularly prone to the effects of channel errors, since an individual bit error can propagate and cause many symbol errors.
2 System Description
We consider a basic source-channel coding scenario common in many compression schemes. It consists of a serial concatenation of a non-binary source, an entropy coder, and a channel coder (see Figure 1). The non-binary source in Figure 1 represents a typical source in a video, image, speech, or text compression system. We employ VLC's as entropy codes in our system because of their high efficiency. The channel code protects the entropy-encoded sequence in Figure 1, so that an acceptable reconstruction of the data is feasible at the receiver. Convolutional codes are used extensively in communication systems; therefore we consider convolutional codes and their concatenated variations [7, 6] in our system. We use binary phase-shift keying (BPSK) modulation. We investigate both AWGN and slow flat Rayleigh fading channels.

Figure 1: System block diagram (non-binary source of alphabet size q → entropy encoder, e.g. VLC → interleaver → channel encoder, e.g. convolutional encoder; the symbol, information-bit, and coded-bit sequences are denoted s, u, and c)
3 Iterative VLC and Channel Decoding
In the conventional decoder of the system in Figure 1, the received noisy sequence is first channel decoded, and then entropy decoding is applied. Under hostile channel conditions, the channel decoder may not be able to deliver an error-free sequence to the entropy decoder. In such a case, even a few bits in error may make several output symbols of the entropy decoder erroneous. The described decoder is not an ML decoder, hence the overall performance can be improved either by ML decoding or by its approximate version, iterative decoding.

Iterative detection (decoding) is possible when two or more concurrent sets of likelihood expressions exist for a data sequence. These expressions represent different constraints on the sequence, and all constraints have to be satisfied in the detection process. Iterative decoding consists of enforcing each constraint separately and repeating the process. The use of an interleaver and deinterleaver (see Figure 1) makes the iterative detection more accurate [7, 8]. In iterative decoding, each decoder processes the noisy received sequence and produces a type of information, called extrinsic information, to be used by the other decoders [7, 8]. Extrinsic information represents the additional information obtained by applying the constraint of a constituent decoder. A soft-input soft-output (SISO) module is the heart of an iterative decoder: an SISO module takes as input a noisy sequence and outputs the extrinsic information [9]. We discuss this block for a channel code and a VLC in the following sections.
3.1 Soft-input soft-output channel decoder

A soft-output algorithm for channel decoding was introduced by Bahl et al. in 1974 [10]. A slightly different version of this algorithm, called the SISO module, was introduced in [9]. We give a system-level description of this block in the following.

An SISO module (shown in Figure 2) works on the trellis of the channel code. It accepts two probability streams, P(c; I) and P(u; I), as inputs; the former describes the coded sequence c and the latter the information sequence u. Applying the constraint provided by the channel code, additional information is obtained for both sequences, P(c; O) and P(u; O), which are called the extrinsic information [9]. This newly generated information can be used by the other decoder. Each decoder repeats this process by using the extrinsic information that has been fed back as its new input.

Figure 2: The SISO module, with inputs P(c; I), P(u; I) and outputs P(c; O), P(u; O)
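As an illustration, the following is a minimal Python sketch of the SISO interface just described; the names (SisoModule, SisoOutput) and the linear-domain probability representation are our own assumptions, not taken from [9].

```python
from dataclasses import dataclass
from typing import List, Tuple

Prob = Tuple[float, float]  # (P(bit = 0), P(bit = 1)) in the linear domain

@dataclass
class SisoOutput:
    p_c_out: List[Prob]  # extrinsic information P(c; O) on the coded bits
    p_u_out: List[Prob]  # extrinsic information P(u; O) on the information bits

class SisoModule:
    """Applies the constraint of one constituent code.

    Inputs are the probability streams P(c; I) and P(u; I); outputs are the
    extrinsic probabilities P(c; O) and P(u; O) obtained from the code trellis.
    """
    def run(self, p_c_in: List[Prob], p_u_in: List[Prob]) -> SisoOutput:
        raise NotImplementedError  # realized per code, e.g. BCJR on its trellis
```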
3.2 Bit-level soft-input soft-output VLC decoder

Many efficient channel decoding algorithms are trellis-based; in particular, the Viterbi algorithm (VA) and the SISO algorithm [10, 9] are both trellis-based. By building a trellis for a VLC, one may employ these algorithms in the decoding of VLC's. The trellis in [3] is obtained simply by assigning the states of the trellis to the nodes of the VLC tree. The root node and all terminal nodes are assumed to represent the same state, since they all mark the start of a new symbol (a new sequence of bits). The other nodes, the so-called internal nodes, are assigned one-by-one to the remaining states of the trellis. The number of states of the trellis is therefore equal to the number of internal nodes plus one for the root node. As an example, Figure 3 shows the trellis corresponding to the Huffman code C = {00, 11, 10, 010, 011}.

There is a difference between the implementation of trellis-based algorithms on the trellis of a VLC and on that of a convolutional code. In the trellis of a VLC (for example, in Figure 3), only one symbol (bit) is assigned to each branch. This is not the case in the trellis of a convolutional code, which has two symbols assigned to each branch (the information symbol and the coded symbol). This difference simplifies the trellis-based algorithms.²

Figure 3: The tree and bit-level trellis of the VLC C = {00, 11, 10, 010, 011} (states: the root R and the internal nodes I1, I2, I3)

² Also, in the Viterbi algorithm, the compare-select process is done, and a surviving path selected, only in the root node R; in the other nodes only the metric is calculated.
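To make the construction concrete, the following Python sketch (our own, following the description above; the function name and edge representation are ours) builds the bit-level trellis of a VLC from its codebook. States are the root plus the internal nodes of the code tree, and every terminal node is merged back into the root state.

```python
from typing import Dict, List, Tuple

def build_bit_trellis(codebook: List[str]) -> List[Tuple[int, int, int]]:
    """Return trellis edges (start_state, bit, end_state); state 0 is the root."""
    states: Dict[str, int] = {"": 0}         # tree-node prefix -> state index
    edges: List[Tuple[int, int, int]] = []
    for word in codebook:
        for i, bit in enumerate(word):
            prefix, ext = word[:i], word[: i + 1]
            is_codeword = ext in codebook    # terminal node: merge into the root
            if not is_codeword and ext not in states:
                states[ext] = len(states)    # new internal node gets a new state
            end = 0 if is_codeword else states[ext]
            edge = (states[prefix], int(bit), end)
            if edge not in edges:            # branches shared by several codewords
                edges.append(edge)
    return edges

# The Huffman code of Figure 3: four states (root R plus internal nodes I1-I3).
print(build_bit_trellis(["00", "11", "10", "010", "011"]))
```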
Based on the above trellis representation of a VLC, we derive an SISO algorithm for VLC's. Following the notation of [9], the extrinsic information is calculated as follows. At time k, the output probability distribution is evaluated as

    \tilde{P}_k(u; O) = \tilde{h} \sum_{e:\, u(e)=u} A_{k-1}\big(s^S(e)\big)\, B_k\big(s^E(e)\big)\, P_k(u; I)

where e represents a branch of the trellis; u(e), s^S(e), and s^E(e) are, respectively, the branch value, the starting state, and the ending state of the branch e; and \tilde{h} is a normalizing factor that ensures \tilde{P}_k(0; O) + \tilde{P}_k(1; O) = 1. The quantities A_k(\cdot) and B_k(\cdot) are calculated through forward and backward recursions, respectively:

    A_k(s) = \sum_{e:\, s^E(e)=s} A_{k-1}\big(s^S(e)\big)\, P_k(u(e); I)

    B_k(s) = \sum_{e:\, s^S(e)=s} B_{k+1}\big(s^E(e)\big)\, P_{k+1}(u(e); I)

with initial values A_0(s) = B_N(s) = 0 for all states except the root state, for which A_0(0) = B_N(0) = 1, since the trellis always starts and ends at the root state. In order to exclude the input probability P_k(u; I) from the output probability, and thereby obtain the so-called extrinsic information, both sides of the first equation are divided by P_k(u; I):

    P_k(u; O) = \frac{\tilde{P}_k(u; O)}{P_k(u; I)} = h \sum_{e:\, u(e)=u} A_{k-1}\big(s^S(e)\big)\, B_k\big(s^E(e)\big)

where h is again a normalization factor. P_k(u; I) (the input probability) and P_k(u; O) (the extrinsic information) together form the a posteriori probability (APP) of the input sequence.
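The recursions above translate directly into code. The following is a minimal sketch in Python, in the linear probability domain and without the numerical safeguards (scaling, log-domain arithmetic) a practical decoder would need; `edges` is the (start, bit, end) list produced by the build_bit_trellis sketch above, and the function name is ours.

```python
from typing import List, Tuple

def siso_vlc(edges: List[Tuple[int, int, int]], n_states: int,
             p_in: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the extrinsic probabilities P_k(u; O), k = 1..N."""
    N = len(p_in)
    # Forward/backward state metrics; the trellis starts and ends at the root (state 0).
    A = [[0.0] * n_states for _ in range(N + 1)]
    B = [[0.0] * n_states for _ in range(N + 1)]
    A[0][0] = 1.0
    B[N][0] = 1.0
    for k in range(1, N + 1):          # A_k(s): sum over edges e with s^E(e) = s
        for start, bit, end in edges:
            A[k][end] += A[k - 1][start] * p_in[k - 1][bit]
    for k in range(N - 1, -1, -1):     # B_k(s): sum over edges e with s^S(e) = s
        for start, bit, end in edges:
            B[k][start] += B[k + 1][end] * p_in[k][bit]
    p_out = []
    for k in range(1, N + 1):          # extrinsic term: P_k(u; I) divided out
        num = [0.0, 0.0]
        for start, bit, end in edges:
            num[bit] += A[k - 1][start] * B[k][end]
        h = num[0] + num[1]            # normalization factor
        p_out.append((num[0] / h, num[1] / h))
    return p_out
```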
3.3 Iterative decoding

An iterative decoding scheme for the system in Figure 1 is shown in Figure 4, using the SISO blocks already introduced. The blocks denoted π and π⁻¹ are the interleaver and deinterleaver, respectively. As mentioned earlier, they make the iterative detection more accurate by introducing weak correlation between the extrinsic information components of the SISO modules, so that each subsequent SISO processes a less correlated sequence. In each iteration, only the extrinsic information generated by each SISO, P_CC(u; O) and P_VLC(u; O), is exchanged between the soft-output decoders. After the last iteration, the final soft output, P_VLC(u; I) + P_VLC(u; O), is decoded at symbol level by a Viterbi decoder over the same bit-level trellis.

Figure 4: Iterative VLC and channel decoding. P_CC(c; I) arrives from the demodulator at the convolutional-code SISO (P_CC(c; O) is not used); its extrinsic output P_CC(u; O) is deinterleaved (π⁻¹) into P_VLC(u; I) for the VLC SISO, whose extrinsic output P_VLC(u; O) is interleaved (π) and fed back; a symbol-level Viterbi decoder produces the output.
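The loop of Figure 4 can be sketched as follows; this is one possible realization under our own naming conventions (the siso_cc, interleave, and deinterleave arguments are assumed callables, not code from the paper).

```python
def iterative_decode(siso_cc, siso_vlc, interleave, deinterleave,
                     p_channel, n_iter=4):
    """One realization of the decoding loop in Figure 4."""
    p_apriori = None                   # no a priori information on the first pass
    p_vlc_in = p_vlc_ext = None
    for _ in range(n_iter):
        # Channel-code SISO: enforce the convolutional-code constraint.
        p_cc_ext = siso_cc(p_c_in=p_channel, p_u_in=p_apriori)
        # Deinterleaved extrinsic information becomes the VLC SISO input.
        p_vlc_in = deinterleave(p_cc_ext)
        p_vlc_ext = siso_vlc(p_vlc_in)
        # Interleaved VLC extrinsic information feeds back to the channel SISO.
        p_apriori = interleave(p_vlc_ext)
    # Final soft output P_VLC(u; I) + P_VLC(u; O) (a log-domain sum, i.e. a
    # product in the probability domain); a symbol-level Viterbi pass on the
    # bit-level trellis then yields the decoded symbols.
    return [(a0 * b0, a1 * b1)
            for (a0, a1), (b0, b1) in zip(p_vlc_in, p_vlc_ext)]
```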
4 Redundancy in Entropy Codes
In parallel or serial concatenated codes [7, 6], the knowledge of the redundancy in each constituent decoder purifies the noisy sequence and adds more information (i.e., extrinsic information) to what is already known, possibly approaching the ML solution. The same applies to the concatenation of a VLC and a channel code. Hence, one would expect the improvement of an iterative decoding scheme between a Huffman code and a channel code to be negligible, since the outer code has almost no redundancy. Compared to Huffman codes, RVLC's have more redundancy, which supports a more powerful concatenated code when followed by a channel code.

Good serial concatenated codes are characterized by design rules for both the outer and inner codes [6]. Among these, the rules corresponding to the outer code should be adopted for a VLC whose output is to be channel encoded. For example, given the total rate and the complexity of the constituent codes, the free distance of the outer code determines the so-called interleaver gain [6]. Results on the iterative decoding of an RVLC and a convolutional code are reported in [3]. RVLC's may have free distance greater than one; the free distance of VLC's is defined in [11, 3].³ The serial concatenation of an RVLC and a convolutional code is compared in [3] to a conventional chain of a Huffman code and a convolutional code with the same overall rate, and it is shown that iterative decoding of the former significantly outperforms conventional decoding of the latter. The comparison, however, is not altogether fair, because the two methods have vastly different computational complexity. A more equitable test would use an iterative channel code as the baseline for comparison, with the two systems having equal overall rate.

In the serial chain of RVLC's and channel codes, the outer code plays a weak role because of its low free distance. According to [6], the overall performance of the chain can be improved by assigning more redundancy to the outer code. This can be done in different ways. One may employ a more redundant entropy code, e.g., variable-length error-correcting codes [11]. One may alternatively invest the intentional redundancy of the RVLC in the channel code and leave the entropy code with no redundancy, e.g., by using a Huffman code. The former approach is the subject of our ongoing research; we verify the latter solution in the next section.

³ Unlike RVLC's, Huffman codes have free distance of one, because the two longest codewords have the same length and differ only in the last bit.
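For small codebooks, the free distance of a VLC in the sense of [11, 3] — the minimum Hamming distance between two distinct, equally long sequences of concatenated codewords — can be estimated by brute force. The following Python sketch is our own bounded search (the function name and the max_words bound are assumptions), not an algorithm from [11].

```python
from itertools import product

def vlc_free_distance(codebook, max_words=3):
    """Bounded-search estimate of the VLC free distance of [11, 3]."""
    seqs = set()
    for n in range(1, max_words + 1):            # all concatenations of <= max_words codewords
        for words in product(codebook, repeat=n):
            seqs.add("".join(words))
    best = None
    for a in seqs:                               # compare distinct, equal-length sequences
        for b in seqs:
            if a != b and len(a) == len(b):
                d = sum(x != y for x, y in zip(a, b))
                best = d if best is None else min(best, d)
    return best

# Huffman code of Figure 3: free distance 1 ('10' vs '11' differ in one bit).
print(vlc_free_distance(["00", "11", "10", "010", "011"]))
# RVLC of Section 4.1: the bounded search gives a larger value.
print(vlc_free_distance(["00", "11", "010", "101", "0110"]))
```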
Figure 5: Comparison between the two systems: redundancy in VLC versus channel code. RVLC+CC: RVLC C1 (r_RVLC = 2.46) → π → convolutional code C^i (r_CC = 1/2); the redundancy in the RVLC, relative to the Huffman code C2 (r_HC = 2.19), corresponds to a rate of r = 2.19/2.46 ≈ 8/9. HC+SCCC: Huffman code C2 (r_HC = 2.19) → outer convolutional code C^o (r_CC = 1/2) → π → punctured inner convolutional code (r_CC = 8/9).
4.1 Experimental results

Our experimental setup is shown in Figure 5. Two entropy codes designed for a 5-ary source with probabilities {0.33, 0.30, 0.18, 0.10, 0.09} are considered. The RVLC has the codebook C1 = {00, 11, 010, 101, 0110} with average length 2.46 [5, 3]. The Huffman code designed for the same source, with average length 2.19, is C2 = {10, 11, 00, 010, 011} (see Figure 3). In the first system, an eight-state recursive convolutional code C^i with rate 1/2 and generator polynomial G^i(D) = (1, (1 + D + D^3)/(1 + D)) is used. For the second system a serial concatenated convolutional code (SCCC) [6] is used: the outer code C^o is a convolutional code with rate 1/2 and generator polynomial G^o(D) = (1 + D, 1 + D + D^3), and the inner code is a punctured version⁴ of C^i with rate 8/9. In both systems, a packet of 2000 symbols is entropy encoded, interleaved, and channel encoded. In the first system, the iterative VLC and channel decoding is employed; in the second system, the regular iterative decoding of SCCC's is used [6]. Simulations are conducted over AWGN and Rayleigh fading channels with a coherent receiver.

Note that the redundancy present in C1 (the RVLC) is removed in the second system, where the Huffman code C2 of the given source is employed. The rate released by dropping C1, almost 8/9, is invested in the inner punctured convolutional code. By doing so, the SCCC has an outer code of rate 1/2 and d_f = 5, and one would expect it to perform better. It is worth mentioning that the rates of the constituent codes of the SCCC in the second system could be selected differently while keeping the same overall transmission rate; for example, both the outer and inner codes may have rate 2/3, which yields the same overall rate as the setups in Figure 5. Alternatively, a different concatenated channel code, such as a parallel concatenated code [7], can be used.

The results of the comparison between the two systems are shown in Figure 6 for the AWGN channel and in Figure 7 for the fast Rayleigh fading channel. Levenshtein distance [12, 3] is used in reporting the SER in Figure 6(a) and Figure 7(a); it is defined as the minimum number of insertions, deletions, or substitutions that transform one sequence into another. The SER at the output of the VLC decoder is better represented by Levenshtein distance than by Hamming distance because of the self-synchronizing property of VLC's. Figure 6(b) and Figure 7(b) compare the BER at the output of the channel decoder in the two systems. The second system outperforms the first system after the first few iterations. The first system shows negligible improvement after the first few iterations, suggesting that the ML performance is approached within a few further iterations; in fact, our results (not shown here for clarity of the figures) show very little improvement, for up to 9 iterations in the same Eb/N0 range, over what is achieved by the first five iterations. On the other hand, the second system improves significantly with each iteration.

⁴ The SCCC used in the second system is taken from [6], Section V.B.3, but no optimization is applied in puncturing the inner code.
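As an aside, puncturing a rate-1/2 mother code to rate 8/9 amounts to keeping 9 out of every 16 coded bits (16 coded bits per 8 information bits). The following Python sketch is purely illustrative: the mask shown is our own assumption (keeping all systematic bits plus one parity bit per period, assuming an s0 p0 s1 p1 ... bit ordering), not the pattern of [6]; as footnote 4 notes, no optimized puncturing was used in our experiments either.

```python
def puncture(coded_bits, keep_mask):
    """Drop coded bits where the periodic keep_mask is 0."""
    period = len(keep_mask)
    return [b for i, b in enumerate(coded_bits) if keep_mask[i % period]]

# Illustrative mask over one period of 16 coded bits (8 information bits):
# 9 ones -> overall rate 8/16 * 16/9 = 8/9.
KEEP_8_9 = [1, 1] + [1, 0] * 7
```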
Figure 6: Comparison between the two systems in Figure 5 for 0, 2, 3, and 4 iterations, AWGN channel. (a) SER at the output of the VLC decoder; (b) BER at the output of the channel decoder (both versus Eb/N0 in dB).

Figure 7: Comparison between the two systems in Figure 5 for 0, 2, 3, and 4 iterations, fast Rayleigh fading channel. (a) SER at the output of the VLC decoder; (b) BER at the output of the channel decoder (both versus Eb/N0 in dB).
5 Conclusion
Motivated by the use of inefficient entropy codes, we question whether assigning a part of the transmission rate to a redundant entropy code is beneficial. Reversible VLC's, employed in state-of-the-art compression standards, are an example of introducing redundancy into entropy codes. The concatenation of entropy codes and channel codes resembles a serial concatenated code; therefore, we advocate applying the design rules of serial concatenated codes to the investigated system. In the serial concatenation of an RVLC and a convolutional code, the design rule regarding the outer code is not satisfied. To show the inferior performance of such a concatenated code, we propose an equivalent system, consisting of an efficient entropy code and a concatenated channel code, which outperforms the investigated system. It is also possible to employ entropy codes with higher redundancy (while keeping the overall rate fixed) that comply with the design rules of serial concatenated codes; the latter approach is the subject of our ongoing research.
References

[1] J. Hagenauer, “Source-controlled channel decoding,” IEEE Transactions on Communications, vol. 43, pp. 2449–2457, September 1995.
[2] A. Murad and T. Fuja, “Robust transmission of variable-length encoded sources,” in Proc. IEEE Wireless Communications and Networking Conference, September 1999.

[3] R. Bauer and J. Hagenauer, “On variable length codes for iterative source/channel decoding,” in Proc. Data Compression Conference, April 2001, pp. 273–282.

[4] R. Perker, M. Kaindl, and T. Hindelang, “Iterative source and channel decoding for GSM,” in Proc. IEEE ICASSP, May 2001, vol. 4, pp. 2649–2652.

[5] Y. Takishima, M. Wada, and H. Murakami, “Reversible variable length codes,” IEEE Transactions on Communications, vol. 43, pp. 158–162, February/March/April 1995.

[6] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara, “Serial concatenation of interleaved codes: performance analysis, design, and iterative decoding,” IEEE Transactions on Information Theory, vol. 44, no. 3, pp. 909–926, May 1998.

[7] C. Berrou and A. Glavieux, “Near optimum error correcting coding and decoding: Turbo codes,” IEEE Transactions on Communications, vol. 44, pp. 1261–1271, October 1996.

[8] J. Hagenauer, E. Offer, and L. Papke, “Iterative decoding of binary block and convolutional codes,” IEEE Transactions on Information Theory, vol. 42, pp. 429–445, March 1996.

[9] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara, “A soft-input soft-output APP module for iterative decoding of concatenated codes,” IEEE Communications Letters, vol. 1, no. 1, pp. 22–24, January 1997.

[10] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Transactions on Information Theory, vol. 20, pp. 284–287, March 1974.
[11] V. Buttigieg, “Variable-length error-correcting codes,” Ph.D. thesis, Department of Electrical Engineering, University of Manchester, 1995.

[12] T. Okuda, E. Tanaka, and T. Kasai, “A method for the correction of garbled words based on the Levenshtein metric,” IEEE Transactions on Computers, vol. C-25, no. 2, pp. 172–176, February 1976.